Linear random walks on the torus

arXiv preprint [math.DS]

WEIKUN HE AND NICOLAS DE SAXCÉ

Abstract.

We prove a quantitative equidistribution result for linear random walks on the torus, similar to a theorem of Bourgain, Furman, Lindenstrauss and Mozes, but without any proximality assumption. An application is given to expansion in simple groups, modulo arbitrary integers.

Mathematics Subject Classification. Primary.

Key words and phrases. Sum-product, Random walk, Toral automorphism.

W.H. is supported by ERC grant ErgComNum 682150.

1. Introduction

The goal of the present paper is to study the equidistribution of linear random walks on the torus. We are given a probability measure $\mu$ on the group $\mathrm{SL}_d(\mathbb{Z})$ of integer matrices with determinant one, and consider the associated random walk $(x_n)_{n \ge 0}$ on the torus $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$, starting from a point $x_0$ in $\mathbb{T}^d$, and moving at step $n$ following a random element $g_n$ with law $\mu$:

$$x_n = g_n x_{n-1} = g_n \cdots g_1 x_0.$$

We say that the measure $\mu$ on $\mathrm{SL}_d(\mathbb{Z})$ has some finite exponential moment if there exists $\varepsilon > 0$ such that

$$\int \|g\|^{\varepsilon} \, d\mu(g) < \infty,$$

where $\|\cdot\|$ denotes an arbitrary norm on $M_d(\mathbb{R})$, the space of $d \times d$ matrices with real coefficients. Our goal is to prove the following theorem.

Theorem 1.1 (Equidistribution on the torus). Let $d \ge 2$. Let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{Z})$. Denote by $\Gamma$ the subsemigroup generated by $\mu$, and by $\mathbf{G} < \mathrm{SL}_d$ the Zariski closure of $\Gamma$. Assume that:
(a) The measure $\mu$ has a finite exponential moment;
(b) The only subspaces of $\mathbb{R}^d$ preserved by $\Gamma$ are $\{0\}$ and $\mathbb{R}^d$;
(c) The algebraic group $\mathbf{G}$ is Zariski connected.
Then, for every irrational point $x$ in $\mathbb{T}^d$, the sequence of measures $(\mu^{*n} * \delta_x)_{n \ge 1}$ converges to the Haar measure in the weak-$*$ topology.

With an additional proximality assumption, this theorem was proved a decade ago by Bourgain, Furman, Lindenstrauss and Mozes [11], and we follow their approach to this problem, via a study of the Fourier coefficients of the law at time $n$ of the random walk on $\mathbb{T}^d$. One advantage of this method, besides being the only one available at present, is that it yields a quantitative statement, giving a speed of convergence of the random walk in terms of the diophantine properties of the starting point $x$; see [11, Theorem A].

Theorem 1.2 (Quantitative equidistribution on the torus). Under the assumptions of Theorem 1.1, let $\lambda_1$ denote the top Lyapunov exponent associated to $\mu$. Given $\lambda \in (0, \lambda_1)$, there exists a constant $C = C(\mu, \lambda) > 0$ such that for every $x \in \mathbb{T}^d$ and every $t \in (0, 1/2)$, if for some $a \in \mathbb{Z}^d \setminus \{0\}$,

$$|\widehat{\mu^{*n} * \delta_x}(a)| \ge t \quad \text{and} \quad n \ge C \log \frac{\|a\|}{t},$$

then there exists $q \in \mathbb{Z}_{>0}$ and $x' \in (\tfrac{1}{q}\mathbb{Z}^d)/\mathbb{Z}^d$ such that

$$q \le \Big(\frac{\|a\|}{t}\Big)^{C} \quad \text{and} \quad d(x, x') \le e^{-\lambda n}.$$

One of our motivations for removing any proximality assumption from this theorem was to generalize a theorem of Bourgain and Varjú on expansion in $\mathrm{SL}_d(\mathbb{Z}/q\mathbb{Z})$, where $q$ is an arbitrary integer, to more general simple $\mathbb{Q}$-groups. We briefly describe this application at the end of the paper, in §6.1.

In [11], the proximality assumption is used at several important places, especially in the study of the large scale structure of Fourier coefficients [11, Phase I]. Let us mention the main ingredients we had to bring into our proof in order to overcome this issue.

One important tool in the proof of Bourgain, Furman, Lindenstrauss and Mozes [11] is a discretized projection theorem, due to Bourgain [10, Theorem 5], giving information on the size of the projections of a set $A \subset \mathbb{R}^d$ to lines. But, when the random walk is not proximal, one should no longer project the set to lines, but to subspaces whose dimension equals the proximality dimension of $\Gamma$. One approach, of course, would be to generalize Bourgain's theorem to higher dimensions, and this has been worked out by the first author [29]. But it turns out that the natural generalization of Bourgain's theorem, used with the general strategy of [11], only allows one to deal with some special cases [31]. Here we take a different route. Instead of working in the space $\mathbb{Z}^d$ of Fourier coefficients, we place ourselves in the simple algebra $E \subset M_d(\mathbb{R})$ generated by the random walk. This allows us to use the results of the first author on the discretized sum-product phenomenon in simple algebras [28].
Thus, instead of a projection theorem, we use a result on the Fourier decay of multiplicative convolutions in simple algebras, derived in Section 2, and generalizing a theorem of Bourgain for the field of real numbers [10, Theorem 6].

Then, in order to be able to apply this Fourier estimate to the law at time $n$ of the random walk, we have to check some non-concentration conditions. For that, we use a result of Salehi Golsefidy and Varjú [43] on expansion modulo prime numbers in semisimple groups, combined with a rescaling argument, proved with the theory of random walks on reductive groups. In the end, we obtain a Fourier decay theorem for the law at time $n$ of a random walk on $\mathrm{SL}_d(\mathbb{Z})$, Theorem 3.19, which, we believe, is of independent interest and, we hope, will have other applications.

The rest of the proof, corresponding to [11, Phase II], follows more closely the strategy of [11]. But since at several points we had to find an alternative proof to avoid the use of the proximality assumption, we chose to include the whole argument, rather than refer the reader to [11]. We hope that this will make the proof easier to follow.

2. Sum-product, $L^2$-flattening and Fourier decay

The main objective of this section is to prove that in a simple real algebra, multiplicative convolutions of non-concentrated measures admit a polynomial Fourier decay. The precise statement is given in Theorem 2.1 below.

From now on, $E$ will denote a finite-dimensional real associative simple algebra, endowed with a norm $\|\cdot\|$. Given a finite Borel measure $\mu$ on $E$ and an integer $s \ge 1$, we write

$$\mu^{*s} = \underbrace{\mu * \cdots * \mu}_{s \text{ times}}$$

for the $s$-fold multiplicative convolution of $\mu$ with itself.
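The multiplicative convolution just defined can be made concrete for finitely supported measures. The following is a minimal sketch, assuming $E = M_2(\mathbb{R})$ and a measure stored as a dictionary from matrices to masses; the names `mat_mul`, `mult_conv` and `conv_power`, and the two generators `A` and `B`, are illustrative choices and do not come from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def mat_mul(x, y):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mult_conv(mu, nu):
    """Multiplicative convolution: push-forward of mu x nu under (x, y) -> xy."""
    out = defaultdict(Fraction)
    for x, p in mu.items():
        for y, q in nu.items():
            out[mat_mul(x, y)] += p * q
    return dict(out)


def conv_power(mu, s):
    """s-fold multiplicative convolution of mu with itself, for an integer s >= 1."""
    result = mu
    for _ in range(s - 1):
        result = mult_conv(result, mu)
    return result


# Hypothetical example: equal mass on two unipotent matrices of SL_2(Z).
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
mu = {A: Fraction(1, 2), B: Fraction(1, 2)}
mu2 = conv_power(mu, 2)  # law of a product of two independent factors
```

Exact rational masses are used so that the total mass can be compared with 1 without rounding. In this example the four products of length two are distinct, so each carries mass 1/4.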
In order to ensure the Fourier decay of some multiplicative convolution of the measure $\mu$, we need two assumptions: first, $\mu$ should not be concentrated around a linear subspace of $E$, and second, $\mu$ should not give mass to elements of $E$ that are too singular.

To make these requirements more precise, let us set up some notation. For $\rho > 0$ and $x \in E$, let $B_E(x, \rho)$ denote the closed ball in $E$ of radius $\rho$ centered at $x$. For a subset $W \subset E$, let $W^{(\rho)}$ denote the $\rho$-neighborhood of $W$,

$$W^{(\rho)} = W + B_E(0, \rho).$$

For $a \in E$, define $\det_E(a)$ to be the determinant of the endomorphism $E \to E$, $x \mapsto ax$. Note that since $E$ is simple this quantity is equal to the determinant of $E \to E$, $x \mapsto xa$. For $\rho > 0$, define $S_E(\rho)$, the set of badly invertible elements of $E$, as

$$S_E(\rho) = \{ x \in E \mid |\det\nolimits_E(x)| \le \rho \}.$$

Theorem 2.1 (Fourier decay of multiplicative convolutions). Let $E$ be a normed simple algebra over $\mathbb{R}$ of finite dimension. Given $\kappa > 0$, there exists $s = s(E, \kappa) \in \mathbb{N}$ and $\varepsilon = \varepsilon(E, \kappa) > 0$ such that for any parameter $\tau \in (0, \varepsilon\kappa)$ the following holds for any scale $\delta > 0$ sufficiently small. Let $\mu$ be a Borel probability measure on $E$. Assume that
(i) $\mu\big(E \setminus B_E(0, \delta^{-\varepsilon})\big) \le \delta^{\tau}$;
(ii) for every $x \in E$, $\mu(x + S_E(\delta^{\varepsilon})) \le \delta^{\tau}$;
(iii) for every $\rho \ge \delta$ and every proper affine subspace $W \subset E$, $\mu(W^{(\rho)}) \le \delta^{-\varepsilon} \rho^{\kappa}$.
Then for all $\xi \in E^*$ with $\|\xi\| = \delta^{-1}$,

$$|\widehat{\mu^{*s}}(\xi)| \le \delta^{\varepsilon\tau}.$$

For $E = \mathbb{R}$, this is due to Bourgain [10, Lemma 8.43]. Li proved in [37] a similar statement for the semisimple algebra $\mathbb{R} \oplus \cdots \oplus \mathbb{R}$. While a more general statement should hold for any semisimple algebra, we do not pursue this direction and focus in the present paper only on simple algebras.

2.1. $L^2$-flattening.

The aim of this subsection is to prove a sum-product $L^2$-flattening lemma for simple algebras.

We shall consider both additive and multiplicative convolutions between measures or functions on $E$. To avoid confusion, we shall use the usual symbol $*$ to denote multiplicative convolution and the symbol $\boxplus$ to denote additive convolution. In the same fashion, for finite Borel measures $\mu$ and $\nu$ on $E$, we define $\mu \boxminus \nu$ to be the push-forward measure of $\mu \otimes \nu$ by the map $(x, y) \mapsto x - y$.

For a Borel set $A \subset E$, denote by $|A|$ the Lebesgue measure of $A$. For $\delta > 0$, define

$$P_\delta = |B_E(0, \delta)|^{-1} \mathbf{1}_{B_E(0, \delta)}.$$

For absolutely continuous measures such as $\mu \boxplus P_\delta$, by abuse of notation, we write $\mu \boxplus P_\delta$ to denote both the measure and the Radon-Nikodym derivative. For $x \in E$, we write $D_x$ to denote the Dirac measure at the point $x$. For $K > 0$, define the set of well invertible elements of $E$ as

$$G_E(K) = \{ x \in E^{\times} \mid \|x\|, \|x^{-1}\| \le K \}.$$

Note that if $x \in G_E(K)$, the left, or right, multiplication by $x$ as a map from $E$ to itself is $O(K)$-bi-Lipschitz.

Proposition 2.2 ($L^2$-flattening). Let $E$ be a normed finite-dimensional simple algebra over $\mathbb{R}$ of dimension $d \ge 2$. Given $\kappa > 0$, there exists $\varepsilon = \varepsilon(E, \kappa) > 0$ such that the following holds for $\delta > 0$ sufficiently small. Let $\mu$ be a Borel probability measure on $E$. Assume that
(i) $\mu$ is supported on $G_E(\delta^{-\varepsilon})$;
(ii) $\delta^{-\kappa} \le \|\mu \boxplus P_\delta\|_2^2 \le \delta^{-d+\kappa}$;
(iii) for every proper linear subspace $W < E$, $\forall \rho \ge \delta$, $\mu(W^{(\rho)}) \le \delta^{-\varepsilon} \rho^{\kappa}$.
Then,

$$\|(\mu * \mu \boxminus \mu * \mu) \boxplus P_\delta\|_2 \le \delta^{\varepsilon} \|\mu \boxplus P_\delta\|_2.$$

Remark 1. If $E = \mathbb{R}$ the same holds if condition (iii) is replaced by

$$\forall \rho \ge \delta, \ \forall x \in E, \quad \mu(B_E(x, \rho)) \le \delta^{-\varepsilon} \rho^{\kappa}. \tag{2.1}$$

Note also that when $\dim(E) \ge 2$, property (2.1) is implied by condition (iii).

Remark 2. We shall apply this proposition to measures that are not probability measures. It is clear that by making $\varepsilon$ slightly smaller, the same statement holds for measures $\mu$ whose total mass $\mu(E)$ is bounded below by a positive constant, or even just $\mu(E) \ge \delta^{\varepsilon}$.

Proof.

In this proof, the implied constants in the Landau or Vinogradov notation depend on the algebra structure of $E$ as well as the choice of norm on it. We use the following rough comparison notation: for positive quantities $f$ and $g$, we write $f \lesssim g$ if $f \le \delta^{-O(\varepsilon)} g$, and $f \sim g$ for $f \lesssim g$ and $g \lesssim f$. For instance, if $a \in G_E(\delta^{-\varepsilon})$ then $|\det_E(a)| \sim 1$.

To simplify notation, we shall also use the shorthand $\mu_\delta = \mu \boxplus P_\delta$. Now assume for a contradiction that the conclusion of the proposition does not hold, namely

$$\|(\mu * \mu \boxminus \mu * \mu) \boxplus P_\delta\|_2 \ge \delta^{\varepsilon} \|\mu_\delta\|_2. \tag{2.2}$$

Step 0: Compare the $L^2$-norms of $(\mu * \mu \boxminus \mu * \mu) \boxplus P_\delta$ and $\mu * \mu_\delta \boxminus \mu_\delta * \mu$.

For $x \in E$, write

$$\begin{aligned}
(\mu * \mu \boxminus \mu * \mu \boxplus P_\delta)(x)
&= |B(0, \delta)|^{-1} \, \mu^{\otimes 4} \{ (a, b, c, d) \mid ab - cd \in B_E(x, \delta) \} \\
&\le |B(0, \delta)|^{-1} \, (\mu^{\otimes 4} \otimes P_\delta^{\otimes 2}) \{ (a, b, c, d, y, z) \mid a(b + y) - (c + z)d \in B_E(x, \delta^{1 - 2\varepsilon}) \} \\
&\lesssim |B(0, \delta^{1 - 2\varepsilon})|^{-1} \, (\mu^{\otimes 4} \otimes P_\delta^{\otimes 2}) \{ (a, b, c, d, y, z) \mid a(b + y) - (c + z)d \in B_E(x, \delta^{1 - 2\varepsilon}) \} \\
&= (\mu * \mu_\delta \boxminus \mu_\delta * \mu \boxplus P_{\delta^{1 - 2\varepsilon}})(x).
\end{aligned}$$

Above, at the sign $\le$, we used the assumption that $\mathrm{Supp}(\mu) \subset B_E(0, \delta^{-\varepsilon})$. Therefore, by Young's inequality,

$$\|\mu * \mu \boxminus \mu * \mu \boxplus P_\delta\|_2 \lesssim \|\mu * \mu_\delta \boxminus \mu_\delta * \mu \boxplus P_{\delta^{1 - 2\varepsilon}}\|_2 \le \|\mu * \mu_\delta \boxminus \mu_\delta * \mu\|_2.$$

To conclude Step 0, we deduce from the above and (2.2) that

$$\|\mu * \mu_\delta \boxminus \mu_\delta * \mu\|_2 \gtrsim \|\mu_\delta\|_2. \tag{2.3}$$

Step 1: Discretize the measure $\mu$ using dyadic level sets.

For a subset $A \subset E$, we denote by $N(A, \delta)$ the least number of balls of radius $\delta$ in $E$ that are needed to cover $A$. By a $\delta$-discretized set we mean a union of balls of radius $\delta$. Note that if $A$ is a $\delta$-discretized set then $N(A, \delta) \asymp_E \delta^{-d} |A|$. It is easy to check that there exist $\delta$-discretized sets $A_i \subset B_E(0, \delta^{-\varepsilon})$, $i \ge 0$, such that $A_i$ is empty for $i \gg |\log \delta|$, and

$$\mu_\delta \ll \sum_{i \ge 0} 2^i \mathbf{1}_{A_i} \ll \mu_\delta + 1. \tag{2.4}$$
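The dyadic decomposition behind (2.4) can be sketched on a finite grid. This is a simplified illustration, assuming the density is given as a dictionary of values at finitely many points; it ignores the requirement that the sets $A_i$ be $\delta$-discretized, and the names `dyadic_level_sets`, `step_function` and `density` are hypothetical.

```python
def dyadic_level_sets(f):
    """Group the points where f > 0 into levels.

    Level 0 is {0 < f <= 1}; level i >= 1 is {2^(i-1) < f <= 2^i}.
    """
    levels = {}
    for x, v in f.items():
        if v <= 0:
            continue
        i = 0
        while 2 ** i < v:  # smallest i with v <= 2^i
            i += 1
        levels.setdefault(i, set()).add(x)
    return levels


def step_function(levels, x):
    """Value at x of the step function sum_i 2^i * 1_{A_i}."""
    return sum(2 ** i for i, pts in levels.items() if x in pts)


# Hypothetical density values at five grid points.
density = {0: 0.3, 1: 1.0, 2: 3.0, 3: 8.0, 4: 0.0}
levels = dyadic_level_sets(density)
```

With this choice of levels, the step function $g = \sum_i 2^i \mathbf{1}_{A_i}$ satisfies $f \le g \le 2f + 1$ pointwise, which is the two-sided comparison in (2.4) with explicit constants.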

Step 2: Pick a popular level in order to transform (2.3) into a lower bound on the additive energy between two $\delta$-discretized sets.

We have

$$\mu * \mu_\delta \boxminus \mu_\delta * \mu = \iint_{E \times E} (D_a * \mu_\delta) \boxminus (\mu_\delta * D_b) \, d\mu(a) \, d\mu(b).$$

From the left inequality in (2.4),

$$\mu * \mu_\delta \boxminus \mu_\delta * \mu \ll \sum_{i, j \ge 0} 2^{i+j} \iint (D_a * \mathbf{1}_{A_i}) \boxminus (\mathbf{1}_{A_j} * D_b) \, d\mu(a) \, d\mu(b).$$

Observe that $D_a * \mathbf{1}_{A_i} = |\det_E(a)|^{-1} \mathbf{1}_{aA_i}$ and $\mathbf{1}_{A_j} * D_b = |\det_E(b)|^{-1} \mathbf{1}_{A_j b}$. Hence

$$\mu * \mu_\delta \boxminus \mu_\delta * \mu \ll \sum_{i, j \ge 0} 2^{i+j} \iint \frac{\mathbf{1}_{aA_i} \boxminus \mathbf{1}_{A_j b}}{|\det_E(a) \det_E(b)|} \, d\mu(a) \, d\mu(b).$$

By (2.3), the triangle inequality and the assumption that $\mu$ is supported on $G_E(\delta^{-\varepsilon})$,

$$\sum_{i, j \ge 0} 2^{i+j} \iint \|\mathbf{1}_{aA_i} \boxminus \mathbf{1}_{A_j b}\|_2 \, d\mu(a) \, d\mu(b) \gtrsim \|\mu_\delta\|_2.$$

There are at most $O(|\log \delta|)^2 \lesssim 1$ terms in this sum. Hence by the pigeonhole principle, there exist $i \ge 0$ and $j \ge 0$ such that

$$2^{i+j} \iint \|\mathbf{1}_{aA_i} \boxminus \mathbf{1}_{A_j b}\|_2 \, d\mu(a) \, d\mu(b) \gtrsim \|\mu_\delta\|_2. \tag{2.5}$$

From now on we fix such $i$ and $j$. By the right-hand inequality in (2.4), we find

$$2^i |A_i|^{1/2} = \|2^i \mathbf{1}_{A_i}\|_2 \ll \|\mu_\delta\|_2 + 1 \ll \|\mu_\delta\|_2,$$

so that for all $a, b \in G_E(\delta^{-O(\varepsilon)})$,

$$\|2^i \mathbf{1}_{aA_i}\|_2 \lesssim \|\mu_\delta\|_2 \quad \text{and} \quad \|2^j \mathbf{1}_{A_j b}\|_1 \lesssim 1. \tag{2.6}$$

Hence by Young's inequality,

$$\forall a, b \in G_E(\delta^{-O(\varepsilon)}), \quad 2^{i+j} \|\mathbf{1}_{aA_i} \boxminus \mathbf{1}_{A_j b}\|_2 \lesssim \|\mu_\delta\|_2.$$

This combined with (2.5) implies that the set

$$B_0 = \{ (a, b) \in G_E(\delta^{-O(\varepsilon)})^{\times 2} \mid 2^{i+j} \|\mathbf{1}_{aA_i} \boxminus \mathbf{1}_{A_j b}\|_2 \ge \delta^{O(\varepsilon)} \|\mu_\delta\|_2 \}$$

has measure $\mu \otimes \mu(B_0) \gtrsim 1$. For $c = (a, b) \in B_0$, using (2.6), we find

$$\|\mathbf{1}_{aA_i} \boxminus \mathbf{1}_{A_j b}\|_2 \gtrsim |aA_i|^{1/2} |A_j b|,$$

and switching the roles of $aA_i$ and $A_j b$,

$$\|\mathbf{1}_{aA_i} \boxminus \mathbf{1}_{A_j b}\|_2 \gtrsim |aA_i| \, |A_j b|^{1/2}.$$

Hence,

$$\|\mathbf{1}_{aA_i} \boxminus \mathbf{1}_{A_j b}\|_2^2 \gtrsim |aA_i|^{3/2} |A_j b|^{3/2}.$$

Note that $aA_i$ and $A_j b$ are $\delta^{1 - O(\varepsilon)}$-discretized sets. Hence the last inequality translates to

$$E_\delta(aA_i, -A_j b) \gtrsim N(aA_i, \delta)^{3/2} N(A_j b, \delta)^{3/2}, \tag{2.7}$$

where $E_\delta$ denotes the additive energy at scale $\delta$, as defined in [44, Section 6] or [17, Appendix A.1].

Step 3: Apply an argument of Bourgain [9, Proof of Theorem C] sometimes known as the additive-multiplicative Balog-Szemerédi-Gowers theorem.

We are going to use Ruzsa calculus. For subsets $A, A' \subset E$ we write $A \approx A'$ if

$$N(A - A', \delta) \lesssim N(A, \delta)^{1/2} N(A', \delta)^{1/2}.$$

Ruzsa's triangle inequality and the Plünnecke-Ruzsa inequality [45, Chapters 2 & 6] can be summarized as: the relation $\approx$ is transitive, i.e. $A \approx A'$ and $A' \approx A''$ implies $A \approx A''$. (Strictly speaking, $\approx$ is not a relation, because it involves an implicit constant in the $\lesssim$ notation.)

By Tao's non-commutative version of the Balog-Szemerédi-Gowers lemma [44, Theorem 6.10] applied to (2.7), for every $c \in B_0$, there exist $A_c \subset A_i$ and $A'_c \subset A_j$ such that $N(A_c, \delta) \gtrsim N(A_i, \delta)$ and $N(A'_c, \delta) \gtrsim N(A_j, \delta)$ and

$$aA_c \approx A'_c b. \tag{2.8}$$

By taking $\delta$-neighborhoods if necessary, we may assume that $A_c$ and $A'_c$ are $\delta$-discretized sets. Write $X = A_i \times A_j \subset \mathbb{R}^{2d}$ and $X_c = A_c \times A'_c \subset X$. From the Cauchy-Schwarz inequality applied to the function $x \mapsto \int_{B_0} \mathbf{1}_{X_c}(x) \, d\mu^{\otimes 2}(c)$, we infer that

$$\iint_{B_0 \times B_0} |X_c \cap X_d| \, d\mu^{\otimes 2}(c) \, d\mu^{\otimes 2}(d) \gtrsim |X|.$$

By the pigeonhole principle, there exist $c_\star \in B_0$ and $B_1 \subset B_0$ such that $\mu^{\otimes 2}(B_1) \gtrsim \mu^{\otimes 2}(B_0) \gtrsim 1$ and for all $c \in B_1$, $|X_{c_\star} \cap X_c| \gtrsim |X|$. Abbreviate $A_{c_\star}$ as $A_\star$ and $A'_{c_\star}$ as $A'_\star$. We then have, for every $c \in B_1$,

$$N(A_\star \cap A_c, \delta) \gtrsim N(A_i, \delta) \quad \text{and} \quad N(A'_\star \cap A'_c, \delta) \gtrsim N(A_j, \delta). \tag{2.9}$$

For $c = (a, b) \in B_1$, by the Ruzsa calculus and (2.8),

$$aA_c \approx A'_c b \approx aA_c.$$

Since $a \in G_E(\delta^{-\varepsilon})$, this implies $A_c \approx A_c$. Using (2.9) and the definition of the symbol $\approx$, we get $A_\star \cap A_c \approx A_c$ and for the same reason $A_\star \cap A_c \approx A_\star$. Hence

$$aA_\star \approx a(A_\star \cap A_c) \approx aA_c \approx A'_c b \approx (A'_\star \cap A'_c) b \approx A'_\star b.$$

On the other hand, writing $c_\star = (a_\star, b_\star)$, we have $a_\star A_\star \approx A'_\star b_\star$. Hence $a_\star A_\star b_\star^{-1} b \approx A'_\star b$ and then $a_\star A_\star b_\star^{-1} b \approx aA_\star$ and finally

$$N(a_\star^{-1} a A_\star - A_\star b_\star^{-1} b, \delta) \lesssim N(A_\star, \delta), \quad \forall (a, b) \in B_1 \cup \{ (a_\star, b_\star) \}. \tag{2.10}$$

Step 4: Apply the sum-product theorem stated below as Proposition 2.3.

We claim that the assumptions of Proposition 2.3 are satisfied by the set $A_\star$, the set $B_1$ and the measure $\mu$ for the parameters $\kappa/2$ in the place of $\kappa$ and $O(\varepsilon)$ in the place of $\varepsilon$. Indeed, using Young's inequality and remembering (2.6), we obtain

$$\|\mu_\delta\|_2 \lesssim 2^{i+j} \|\mathbf{1}_{a_\star A_i} \boxminus \mathbf{1}_{A_j b_\star}\|_2 \le 2^i |a_\star A_i|^{1/2} \, 2^j |A_j b_\star| \lesssim \|\mu_\delta\|_2.$$

Hence $2^i |A_i|^{1/2} \sim \|\mu_\delta\|_2$ and $2^j |A_j| \sim 1$. Exchanging the roles of $A_i$ and $A_j$, we also get $2^i |A_i| \sim 1$. Thus,

$$|A_i| \sim \|\mu_\delta\|_2^{-2} \quad \text{and} \quad 2^i \sim \|\mu_\delta\|_2^{2}. \tag{2.11}$$

Hence

$$|A_\star| \lesssim |A_i| \lesssim \|\mu_\delta\|_2^{-2} \le \delta^{\kappa},$$

which implies, as $A_\star$ is $\delta$-discretized,

$$N(A_\star, \delta) \lesssim \delta^{-d+\kappa}.$$

Moreover, let $\rho \ge \delta$ and $x \in E$ and let $B = B(x, \rho)$. Since $\mu_\delta = \mu \boxplus P_\delta$, inequality (2.1) implies $\mu_\delta(B) \lesssim \rho^{\kappa}$. By (2.11) and (2.4),

$$\frac{|A_i \cap B|}{|A_i|} \lesssim 2^i |A_i \cap B| \ll \mu_\delta(B).$$

But $A_\star \subset A_i$ and $|A_\star| \gtrsim |A_i|$, hence

$$\frac{|A_\star \cap B|}{|A_\star|} \lesssim \frac{|A_i \cap B|}{|A_i|} \lesssim \rho^{\kappa}.$$

It follows that for all $\rho \ge \delta$,

$$N(A_\star, \rho) \gtrsim \rho^{-\kappa}.$$

The verification of the other assumptions in Proposition 2.3 is straightforward, so we can apply Proposition 2.3, which leads to a contradiction to (2.10) when $\varepsilon > 0$ is chosen small enough. □

2.2. A sum-product theorem.

In the proof of $L^2$-flattening, we used the following result.

Proposition 2.3 (Sum-product estimate in simple algebras). Let $E$ be a normed finite-dimensional simple algebra over $\mathbb{R}$ of dimension $d \ge 2$. Given $\kappa > 0$, there exists $\varepsilon = \varepsilon(E, \kappa) > 0$ such that the following holds for every $\delta > 0$ sufficiently small. Let $A$ be a subset of $E$, $\mu$ a probability measure on $E$, and $B$ a subset of $E \times E$. Assume
(i) $A \subset B_E(0, \delta^{-\varepsilon})$;
(ii) $\forall \rho \ge \delta$, $N(A, \rho) \ge \delta^{\varepsilon} \rho^{-\kappa}$;
(iii) $N(A, \delta) \le \delta^{-(d - \kappa)}$;
(iv) $\mu$ is supported on $G_E(\delta^{-\varepsilon})$;
(v) for every proper linear subspace $W < E$, $\forall \rho \ge \delta$, $\mu(W^{(\rho)}) \le \delta^{-\varepsilon} \rho^{\kappa}$;
(vi) $\mu \otimes \mu(B) \ge \delta^{\varepsilon}$.
Then for every $a_\star, b_\star \in G_E(\delta^{-\varepsilon})$, there exists $(a, b) \in B \cup \{ (a_\star, b_\star) \}$ such that

$$N(a_\star^{-1} a A - A b_\star^{-1} b, \delta) \ge \delta^{-\varepsilon} N(A, \delta).$$

The idea of the proof is to consider the action of $E \times E$ on $E$ by left and right multiplication and to apply a sum-product theorem [28, Theorem 3] for irreducible linear actions due to the first author. For the reader's convenience, let us recall the statement of the latter.

Theorem 2.4 (Sum-product theorem for irreducible linear actions). Given a positive integer $d$ and a real number $\kappa > 0$ there exists $\varepsilon = \varepsilon(d, \kappa) > 0$ such that the following holds for $\delta > 0$ sufficiently small. Let $X$ be a subset of the Euclidean space $\mathbb{R}^d$ and $\Phi \subset \mathrm{End}(\mathbb{R}^d)$ a subset of linear endomorphisms. Assume
(i) $X \subset B_{\mathbb{R}^d}(0, \delta^{-\varepsilon})$;
(ii) for all $\rho \ge \delta$, $N(X, \rho) \ge \delta^{\varepsilon} \rho^{-\kappa}$;
(iii) $N(X, \delta) \le \delta^{-(d - \kappa)}$;
(iv) $\Phi \subset B_{\mathrm{End}(\mathbb{R}^d)}(0, \delta^{-\varepsilon})$;
(v) for all $\rho \ge \delta$, $N(\Phi, \rho) \ge \delta^{\varepsilon} \rho^{-\kappa}$;
(vi) for every proper linear subspace $W \subset \mathbb{R}^d$, there is $\varphi \in \Phi$ and $w \in B_W(0, 1)$ such that $d(\varphi w, W) \ge \delta^{\varepsilon}$.
Then

$$N(X + X, \delta) + \max_{\varphi \in \Phi} N(X + \varphi X, \delta) \ge \delta^{-\varepsilon} N(X, \delta).$$

Here, of course, $\varphi X$ denotes the set $\{ \varphi x \mid x \in X \}$.
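To get a feeling for the quantities $N(X + X, \delta)$ and $N(X + \varphi X, \delta)$, here is a toy one-dimensional sketch. It assumes $E = \mathbb{R}$, a single dilation in place of the family $\Phi$, and a grid-cell count as a proxy for the covering number (the two agree up to a constant factor); it is not meant to satisfy the hypotheses of Theorem 2.4, and the names `grid_count`, `sumset`, `X` and `phiX` are hypothetical.

```python
import math


def grid_count(points, delta=1.0):
    """Number of cells [k*delta, (k+1)*delta) met by a finite subset of R."""
    return len({math.floor(x / delta) for x in points})


def sumset(xs, ys):
    """Minkowski sum {x + y : x in xs, y in ys}."""
    return {x + y for x in xs for y in ys}


n = 32
X = set(range(n))           # arithmetic progression: additively structured
phiX = {n * x for x in X}   # image of X under the dilation x -> n*x

small = grid_count(sumset(X, X))     # X + X stays of size about 2n
large = grid_count(sumset(X, phiX))  # X + phi(X) fills n^2 cells
```

The progression `X` has small doubling, so `X + X` gains only a factor of about 2, whereas `X + phiX` has size `n * n`. This is the kind of growth in one of the two terms that the conclusion of the theorem guarantees under its non-concentration hypotheses.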

Proof of Proposition 2.3.

In this proof the implied constants in the Vinogradov or Landau notation may depend on $E$. We may assume without loss of generality that $B \subset \mathrm{Supp}(\mu) \times \mathrm{Supp}(\mu)$. This implies that for all $(a, b) \in B$,

$$\|a\|, \|a^{-1}\|, \|b\|, \|b^{-1}\| \le \delta^{-\varepsilon},$$

and consequently the multiplication on the left or right by $a$ or $b$ is a $\delta^{-O(\varepsilon)}$-bi-Lipschitz endomorphism of $E$.

For $(a, b)$ in $B$, define $\varphi(a, b) \in \mathrm{End}(E)$ by

$$\forall x \in E, \quad \varphi(a, b) x = -a^{-1} a_\star x b_\star^{-1} b.$$

We would like to apply the previous theorem to $X = A$ and $\Phi = \{ \varphi(a, b) \in \mathrm{End}(E) \mid (a, b) \in B \}$. We claim that the assumptions of Theorem 2.4 hold with $O(\varepsilon)$ in the place of $\varepsilon$. Hence there is $\varepsilon_1 > 0$ such that when $\varepsilon > 0$ is small enough, we have either

$$N(A + A, \delta) \ge \delta^{-\varepsilon_1} N(A, \delta),$$

in which case we are done, or there exists $(a, b) \in B$ such that

$$N(A + \varphi(a, b) A, \delta) \ge \delta^{-\varepsilon_1} N(A, \delta).$$

In the latter case we conclude by multiplying the set above by $a_\star^{-1} a$ on the left,

$$N(a_\star^{-1} a A - A b_\star^{-1} b, \delta) \ge \delta^{O(\varepsilon)} N(A + \varphi(a, b) A, \delta) \ge \delta^{-\varepsilon_1 + O(\varepsilon)} N(A, \delta).$$

It remains to check the assumptions in Theorem 2.4. Items (i)–(iv) are immediate. To check the remaining assumptions, write, for $b \in E$,

$$B(b) = \{ a \in E \mid (a, b) \in B \}.$$

By assumption (vi) of the proposition we are trying to prove, we can pick $b \in E$ such that $\mu(B(b)) \ge \delta^{\varepsilon}$. From the inequalities

$$\|a - a'\| \ll \|a\| \|a'\| \|a'^{-1} - a^{-1}\|$$

and

$$\|a'^{-1} - a^{-1}\| = \|\big(\varphi(a, b) - \varphi(a', b)\big)(a_\star^{-1} b^{-1} b_\star)\| \le \|\varphi(a, b) - \varphi(a', b)\| \, \|a_\star^{-1} b^{-1} b_\star\|,$$

we see that the map $a \mapsto \varphi(a, b)$ is $\delta^{-O(\varepsilon)}$-bi-Lipschitz on $B(b)$. Thus item (v) follows from assumption (v) of Proposition 2.3.

Finally, assume for contradiction that item (vi) fails with $\delta^{C\varepsilon}$ in the place of $\delta^{\varepsilon}$. Namely, there is a linear subspace $W_0 \subset \mathbb{R}^d$ of intermediate dimension $0 < k < d$ such that $\forall (a, b) \in B$, $d(W_0, \varphi(a, b) W_0) \le \delta^{C\varepsilon}$, where $d$ denotes the distance on the Grassmannian $\mathrm{Grass}(k, d)$ of $k$-planes in $\mathbb{R}^d$ defined by

$$d(W, W') = \max_{w \in B_W(0, 1)} d(w, W') = \max_{w' \in B_{W'}(0, 1)} d(w', W).$$

In particular, for $a, a' \in B(b)$, we have

$$d(W_0, \varphi(a, b) W_0) \le \delta^{C\varepsilon} \quad \text{and} \quad d(W_0, \varphi(a', b) W_0) \le \delta^{C\varepsilon}.$$

Multiplying the second inequality on the left by $a^{-1} a'$, we obtain

$$d(a^{-1} a' W_0, \varphi(a, b) W_0) \le \delta^{(C - O(1))\varepsilon}.$$

By the triangle inequality, $d(W_0, a^{-1} a' W_0) \le \delta^{(C - O(1))\varepsilon}$, which means

$$\forall g \in a^{-1} B(b), \ \forall w \in W_0, \quad d(gw, W_0) \le \|gw\| \, d(g W_0, W_0) \le \delta^{(C - O(1))\varepsilon} \|w\|. \tag{2.12}$$

Observe that assumption (v) of Proposition 2.3 implies that the subset $B(b) \subset E$ is $\delta^{O(\varepsilon)}$-away from linear subspaces. Hence so is the subset $a^{-1} B(b)$. Using [28, Lemma 8], we obtain from (2.12),

$$\forall x \in B_E(0, 1), \ \forall w \in B_{W_0}(0, 1), \quad d(xw, W_0) \le \delta^{(C - O(1))\varepsilon}.$$

We can run the same argument for the right multiplication. Thus, similarly,

$$\forall x \in B_E(0, 1), \ \forall w \in B_{W_0}(0, 1), \quad d(wx, W_0) \le \delta^{(C - O(1))\varepsilon}.$$

Consider the map $f : \mathrm{Grass}(k, d) \to \mathbb{R}$ defined by

$$f(W) = \iint_{B_E(0, 1) \times B_W(0, 1)} \big( d(xw, W) + d(wx, W) \big) \, dx \, dw.$$

On the one hand, from the above, $f(W_0) \le \delta^{(C - O(1))\varepsilon}$. On the other hand, $f$ is continuous and defined on a compact set. It never vanishes, because a zero of $f$ must be a two-sided ideal of $E$, contradicting the simplicity of $E$. Hence $f$ has a positive minimum on $\mathrm{Grass}(k, d)$. We obtain a contradiction if $C$ is chosen large enough, proving our claim regarding item (vi). □

2.3. Fourier decay for multiplicative convolutions.

The goal here is to prove Theorem 2.1 by iterating the $L^2$-flattening lemma proved above.

Let $E$ be any finite-dimensional real algebra. The Fourier transform of a finite Borel measure $\mu$ on $E$ is the function on the dual $E^*$ given by

$$\forall \xi \in E^*, \quad \hat{\mu}(\xi) = \int_E e(\xi x) \, d\mu(x),$$

where $e(t) = e^{2\pi i t}$ for $t \in \mathbb{R}$, and we simply write $E^* \times E \to \mathbb{R}$; $(\xi, x) \mapsto \xi x$ for the duality pairing. The product on $E$ yields a natural right action of $E$ on $E^*$ given by

$$\forall \xi \in E^*, \ x \in E, \ y \in E, \quad (\xi y)(x) = \xi(yx),$$

and for finite Borel measures $\mu$ and $\nu$ on $E$, the Fourier transform of their multiplicative convolution is given by

$$\widehat{\mu * \nu}(\xi) = \int \hat{\nu}(\xi y) \, d\mu(y). \tag{2.13}$$

The idea of the proof of Theorem 2.1 is to iterate Proposition 2.2 to get a measure with small $L^2$-norm, and then to get the desired Fourier decay by convolving one more time. Two technical issues arise. First, after each iteration, the measure we obtain does not necessarily satisfy the non-concentration property required by Proposition 2.2. To settle this, at each step, we truncate the measure to restrict its support to well-invertible elements. Second, the measure we obtain at the end of the iteration is not an additive convolution of a multiplicative convolution of $\mu$ but some measure obtained from $\mu$ through successive multiplicative and additive convolutions. To conclude we need to clarify the relation between the Fourier transforms of these measures. This is settled in Lemma 2.7.

Lemma 2.5.

Let $E$ be a finite-dimensional normed algebra over $\mathbb{R}$ and $\mu$ a Borel probability measure on $E$ such that for some $\tau, \varepsilon > 0$,
(i) $\mu\big(E \setminus B_E(0, \delta^{-\varepsilon})\big) \le \delta^{\tau}$;
(ii) for every $x \in E$, $\mu(x + S_E(\delta^{\varepsilon})) \le \delta^{\tau}$;
(iii) for every $\rho \ge \delta$ and every proper affine subspace $W \subset E$, $\mu(W^{(\rho)}) \le \delta^{-\varepsilon} \rho^{\kappa}$.
Set $\mu_1 = \mu|_{B_E(0, \delta^{-\varepsilon})}$ and define recursively for integer $k \ge 1$,

$$\eta_k = \mu_k|_{E \setminus S_E(\delta^{2^k \varepsilon})} \quad \text{and} \quad \mu_{k+1} = \eta_k * \eta_k \boxminus \eta_k * \eta_k.$$

Then we have for $k \ge 1$,

$$\mu_k(E) \ge 1 - O_k(\delta^{\tau}) \tag{2.14}$$

$$\mathrm{Supp}(\mu_k) \subset B_E(0, \delta^{-O_k(\varepsilon)}) \tag{2.15}$$

$$\forall x \in E, \quad \mu_k(x + S_E(\delta^{2^k \varepsilon})) \le \delta^{\tau} \tag{2.16}$$

$$\forall \rho \ge \delta, \ \forall W \subset E \text{ proper affine subspace}, \quad \mu_k(W^{(\rho)}) \le \delta^{-O_k(\varepsilon)} \rho^{\kappa}. \tag{2.17}$$

As a consequence, the same holds for $\eta_k$ in the place of $\mu_k$.

Proof. The proof goes by induction on $k$.

The result is clear for $k = 1$, by assumption on $\mu$. Assume (2.14)–(2.17) true for some $k \ge 1$, so that the same holds for $\eta_k$. Then (2.14) and (2.15) for $k + 1$ follow immediately.

Let us prove (2.16) for $k + 1$. Let $x \in E$. Since $\mu_{k+1} = \eta_k * \eta_k \boxminus \eta_k * \eta_k$,

$$\mu_{k+1}\big(x + S_E(\delta^{2^{k+1} \varepsilon})\big) = \iint \eta_k \{ y \in E \mid |\det\nolimits_E(yz - w - x)| \le \delta^{2^{k+1} \varepsilon} \} \, d\eta_k(z) \, d(\eta_k * \eta_k)(w).$$

Note that for $z \in \mathrm{Supp}(\eta_k)$, by definition, $|\det_E(z)| \ge \delta^{2^k \varepsilon}$. Hence $|\det_E(yz - w - x)| \le \delta^{2^{k+1} \varepsilon}$ implies $y - (w + x) z^{-1} \in S_E(\delta^{2^k \varepsilon})$. Therefore, by induction hypothesis (2.16),

$$\mu_{k+1}\big(x + S_E(\delta^{2^{k+1} \varepsilon})\big) \le \max_{z \in \mathrm{Supp}(\eta_k), \, w \in E} \eta_k\big((w + x) z^{-1} + S_E(\delta^{2^k \varepsilon})\big) \le \delta^{\tau}.$$

Finally, let us prove (2.17) for $k + 1$. Let $\rho \ge \delta$ and let $W$ be a proper affine subspace of $E$. We have as above

$$\mu_{k+1}\big(W^{(\rho)}\big) \le \max_{z \in \mathrm{Supp}(\eta_k), \, w \in E} \eta_k\big((w + W^{(\rho)}) z^{-1}\big).$$

For all $z \in \mathrm{Supp}(\eta_k)$, we have $|\det_E(z)| \ge \delta^{O_k(\varepsilon)}$ and by the induction hypothesis $\|z\| \le \delta^{-O_k(\varepsilon)}$. Hence $\|z^{-1}\| \le \delta^{-O_k(\varepsilon)}$. Thus

$$(w + W^{(\rho)}) z^{-1} = w z^{-1} + W z^{-1} + B_E(0, \rho) z^{-1} \subset w z^{-1} + W z^{-1} + B_E(0, \delta^{-O_k(\varepsilon)} \rho),$$

which is nothing but the $(\delta^{-O_k(\varepsilon)} \rho)$-neighborhood of another proper affine subspace. Hence by induction hypothesis (2.17),

$$\mu_{k+1}\big(W^{(\rho)}\big) \le \delta^{-O_k(\varepsilon)} \rho^{\kappa}.$$

This finishes the proof of the induction step and that of the lemma. □

Lemma 2.6.

Let $E$ be a normed algebra over $\mathbb{R}$ of dimension $d$. Let $\mu$ and $\nu$ be Borel probability measures on $E$. Assume
(i) $\|\mu \boxplus P_\delta\|_2^2 \le \delta^{-\kappa}$,
(ii) for every $\rho \ge \delta$ and every proper affine subspace $W \subset E$, $\nu(W^{(\rho)}) \le \delta^{-\varepsilon} \rho^{\kappa}$.
Then for $\xi \in E^*$ with $\delta^{-1+\varepsilon} \le \|\xi\| \le \delta^{-1-\varepsilon}$,

$$|\widehat{\mu * \nu}(\xi)| \le \delta^{\frac{\kappa}{d+3} - O(\varepsilon)}.$$

Proof.

This is a slightly more general form of [10, Theorem 7]. The proof is essentially the same. A detailed proof is implicitly contained in [30, Lemma 2.11]. □

For a finite Borel measure $\nu$ on $E$ and an integer $\ell \ge 1$, we denote by

$$\nu^{\boxplus \ell} = \underbrace{\nu \boxplus \cdots \boxplus \nu}_{\ell \text{ times}}$$

the $\ell$-fold additive convolution of $\nu$.
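Since the Fourier transform turns additive convolution into a product, the symmetrized measure $\nu^{\boxplus \ell} \boxminus \nu^{\boxplus \ell}$ has Fourier transform $|\hat{\nu}|^{2\ell}$, which is real and nonnegative. The sketch below checks this numerically, assuming $E = \mathbb{R}$ and a finitely supported $\nu$; the helper names (`fourier`, `add_conv`, `add_power`) and the sample values are illustrative and not taken from the paper.

```python
import cmath
from itertools import product


def e(t):
    """The character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def fourier(measure, xi):
    """Fourier transform at xi of a finitely supported measure {point: mass} on R."""
    return sum(w * e(xi * x) for x, w in measure.items())


def add_conv(mu, nu, sign=1):
    """Push-forward of mu x nu under (x, y) -> x + sign*y."""
    out = {}
    for (x, p), (y, q) in product(mu.items(), nu.items()):
        z = x + sign * y
        out[z] = out.get(z, 0.0) + p * q
    return out


def add_power(nu, ell):
    """ell-fold additive convolution of nu with itself, for ell >= 1."""
    result = nu
    for _ in range(ell - 1):
        result = add_conv(result, nu)
    return result


# Hypothetical probability measure with three atoms.
nu = {0.0: 0.5, 1.0: 0.3, 2.5: 0.2}
ell = 2
nu_ell = add_power(nu, ell)
mu = add_conv(nu_ell, nu_ell, sign=-1)  # nu^{+ell} minus nu^{+ell}

xi = 0.37
lhs = fourier(mu, xi)
rhs = abs(fourier(nu, xi)) ** (2 * ell)
```

Up to floating-point error, `lhs` is real and equals `rhs`. This is the identity used as the base case $m = 1$ in the proof of Lemma 2.7.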

Lemma 2.7.

Let $E$ be a finite-dimensional real algebra. Let $\ell \ge 1$ be an integer, $\nu$ a Borel probability measure on $E$, and set

$$\mu = \nu^{\boxplus \ell} \boxminus \nu^{\boxplus \ell}.$$

Then for every integer $m \ge 1$, for every $\xi \in E^*$, the Fourier coefficient $\widehat{\mu^{*m}}(\xi)$ is real and

$$|\widehat{\nu^{*m}}(\xi)|^{(2\ell)^m} \le \widehat{\mu^{*m}}(\xi). \tag{2.18}$$

The same inequality also holds for a finite Borel measure with total mass $\nu(E) \le 1$.

Proof. We proceed by induction on $m$. For $m = 1$,

$$\hat{\mu}(\xi) = |\hat{\nu}(\xi)|^{2\ell}.$$

Assume then that (2.18) is true for some $m \ge 1$. By (2.13), the Hölder inequality and the induction hypothesis,

$$\begin{aligned}
|\widehat{\nu^{*(m+1)}}(\xi)|^{(2\ell)^m}
&= \Big| \int \widehat{\nu^{*m}}(\xi y) \, d\nu(y) \Big|^{(2\ell)^m} \\
&\le \int |\widehat{\nu^{*m}}(\xi y)|^{(2\ell)^m} \, d\nu(y) \\
&\le \int \widehat{\mu^{*m}}(\xi y) \, d\nu(y) \\
&= \iint e(\xi y x) \, d\mu^{*m}(x) \, d\nu(y).
\end{aligned}$$

Taking the $2\ell$-th power and using again the Hölder inequality and (2.13), we obtain

$$\begin{aligned}
|\widehat{\nu^{*(m+1)}}(\xi)|^{(2\ell)^{m+1}}
&\le \Big| \iint e(\xi y x) \, d\nu(y) \, d\mu^{*m}(x) \Big|^{2\ell} \\
&\le \int \Big| \int e(\xi y x) \, d\nu(y) \Big|^{2\ell} \, d\mu^{*m}(x) \\
&= \int \cdots \int e\big(\xi (y_1 + \cdots + y_\ell - y_{\ell+1} - \cdots - y_{2\ell}) x\big) \, d\nu(y_1) \cdots d\nu(y_{2\ell}) \, d\mu^{*m}(x) \\
&= \iint e(\xi z x) \, d\mu(z) \, d\mu^{*m}(x) \\
&= \widehat{\mu^{*(m+1)}}(\xi).
\end{aligned}$$

This proves the induction step and finishes the proof of (2.18). If $\nu$ is a finite Borel measure with $\nu(E) \le 1$, we may apply (2.18) to the probability measure $\nu(E)^{-1} \nu$, which yields

$$\Big( \frac{|\widehat{\nu^{*m}}(\xi)|}{\nu(E)^m} \Big)^{(2\ell)^m} \le \frac{\widehat{\mu^{*m}}(\xi)}{\nu(E)^{2\ell m}}.$$

□

Proof of Theorem 2.1.

Let $d = \dim(E)$. Let $\varepsilon_1 > 0$ be the constant given by Proposition 2.2 applied to the parameter $\kappa/2$. Define $s_{\max} = \lceil d/\varepsilon_1 \rceil$. First, remark that by the non-concentration assumption, $\|\mu \boxplus P_\delta\|_\infty \le \delta^{-d + \kappa - \varepsilon}$. Hence

$$\|\mu \boxplus P_\delta\|_2^2 \le \|\mu \boxplus P_\delta\|_1 \|\mu \boxplus P_\delta\|_\infty \le \delta^{-d + \kappa/2}$$

if we choose $\varepsilon < \kappa/2$. For $k = 1, \dots, s_{\max}$, let $\mu_k$ and $\eta_k$ be defined as in Lemma 2.5. Since $k \le s_{\max}$ is bounded, the implied constants in the $O_k(\varepsilon)$ notations in Lemma 2.5 can be chosen uniformly over $k$. Thus, when $\varepsilon > 0$ is sufficiently small, Lemma 2.5 allows us to apply Proposition 2.2 and Remark 2 after it to the measure $\eta_k$ for each $k = 1, \dots, s_{\max}$. Thus, either

$$\|\eta_k \boxplus P_\delta\|_2^2 \le \delta^{-\kappa/2}$$

or

$$\|\eta_{k+1} \boxplus P_\delta\|_2 \le \|\mu_{k+1} \boxplus P_\delta\|_2 \le \delta^{\varepsilon_1} \|\eta_k \boxplus P_\delta\|_2.$$

We deduce that there exists $s \in \{1, \dots, s_{\max}\}$ such that

$$\|\eta_s \boxplus P_\delta\|_2^2 \le \delta^{-\kappa/2}.$$

Remembering (2.17), we apply Lemma 2.6 to obtain, for all $\xi$ with $\|\xi\| = \delta^{-1}$,

$$|\widehat{\eta_s * \eta_s}(\xi)| \le \delta^{\kappa/(2d + 6)} \le \delta^{\tau}.$$

Here we assumed $\varepsilon$ sufficiently small compared to $\kappa/d$.

For $k = 1, \dots, s$, apply Lemma 2.7 with $\ell = 1$ and $m = 2^k$ to $\mu_{s-k+1} = \eta_{s-k}^{*2} \boxminus \eta_{s-k}^{*2}$. We obtain

$$\big| \widehat{\eta_{s-k}^{*2^{k+1}}}(\xi) \big|^{2^{2^k}} \le \big| \widehat{\mu_{s-k+1}^{*2^k}}(\xi) \big|.$$

Moreover, by (2.16), $\mu_{s-k+1}$ differs from $\eta_{s-k+1}$ by a measure of total mass at most $\delta^{\tau}$. Hence

$$\big| \widehat{\mu_{s-k+1}^{*2^k}}(\xi) \big| \le \big| \widehat{\eta_{s-k+1}^{*2^k}}(\xi) \big| + O_k(\delta^{\tau}).$$

From the above, we deduce using a simple recurrence that for all $k = 1, \dots, s$,

$$\big| \widehat{\mu_{s-k+1}^{*2^k}}(\xi) \big| \ll_k \delta^{\tau / O_k(1)}.$$

In particular,

$$\big| \widehat{\mu_1^{*2^s}}(\xi) \big| \ll_s \delta^{\varepsilon\tau},$$

which allows us to conclude, since $\mu$ differs from $\mu_1$ by a measure of total mass at most $\delta^{\tau}$. □

3. Non-concentration in subvarieties

Our goal here is to prove that the law of a large random matrix product satisfies some regularity conditions.

Throughout this section, unless otherwise stated, $\mu$ denotes a probability measure on $\mathrm{SL}_d(\mathbb{Z})$. As in the introduction, $\Gamma$ denotes the subsemigroup generated by $\mathrm{Supp}(\mu)$, $\mathbf{G} < \mathrm{SL}_d$ is the Zariski closure of $\Gamma$, and $G = \mathbf{G}(\mathbb{R})$ its group of $\mathbb{R}$-rational points. We also let $E$ denote the subalgebra of $M_d(\mathbb{R})$ generated by $G$, and fix a norm on the space of all polynomial functions on $E$. We shall prove two non-concentration statements for the distribution of a random matrix product. The first one shows that the law $\mu^{*n}$ at time $n$ of the random matrix product is not concentrated near affine subspaces of the algebra $E$.

Proposition 3.1 (Non-concentration on affine subspaces). Let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{Z})$, for some $d \ge 2$. Denote by $\Gamma$ the subsemigroup generated by $\mu$, and by $\mathbf{G}$ the Zariski closure of $\Gamma$ in $\mathrm{SL}_d$. Assume that:
(a) The measure $\mu$ has a finite exponential moment;
(b) The action of $\Gamma$ on $\mathbb{R}^d$ is irreducible;
(c) The algebraic group $\mathbf{G}$ is Zariski connected.
There exists $\kappa > 0$ such that for every proper affine subspace $W \subset E$,

$$\forall n \ge 1, \ \forall \rho \ge e^{-n}, \quad \mu^{*n}(\{ g \in \Gamma \mid d(g, W) \le \rho \|g\| \}) \ll_\mu \rho^{\kappa}.$$

The second result concerns general subvarieties of the algebra $E$, with the caveat that we have to replace $\mu^{*n}$ by an additive convolution power of itself, to avoid some obstructions. It is also worth noting that the quantification of the non-concentration is slightly weaker than in the case of affine subspaces.

Proposition 3.2 (Non-concentration on subvarieties). Let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{Z})$, for some $d \ge 2$. Let $\Gamma$ denote the subsemigroup generated by $\mu$, $\mathbf{G}$ the Zariski closure of $\Gamma$ in $\mathrm{SL}_d$, and $\lambda_1$ the top Lyapunov exponent of $\mu$. Assume that:
(a) The measure $\mu$ has a finite exponential moment;
(b) The action of $\Gamma$ on $\mathbb{R}^d$ is irreducible;
(c) The algebraic group $\mathbf{G}$ is Zariski connected.
Given an integer $D \ge 1$ and given $\omega > 0$, there exists $c > 0$ and $n_0 \in \mathbb{N}$ such that the following holds. Let $f : E \to \mathbb{R}$ be a polynomial function of degree $D$. Writing $f_D$ for its degree $D$ homogeneous part, we have

$$\forall k \ge \dim(E), \ \forall n \ge n_0, \quad (\mu^{*n})^{\boxplus k}\big( \{ x \in E \mid |f(x)| \le e^{(D\lambda_1 - \omega) n} \|f_D\| \} \big) \le e^{-cn}.$$

These two propositions will allow us to apply the results of the previous section to the law $\mu^{*n}$ at time $n$ of an irreducible random walk on $\mathrm{SL}_d(\mathbb{Z})$. This will yield Theorem 3.19 below.

Non-concentration estimates for subvarieties can sometimes be obtained by linearization techniques, as is done in [1]. But this approach does not seem to yield a uniform statement for subvarieties of bounded degree, which is crucial for our application.

The argument developed in this section relies on the spectral gap property modulo primes for finitely generated subgroups of $\mathbf{H}(\mathbb{Z})$, where $\mathbf{H}$ is a semisimple $\mathbb{Q}$-subgroup of $\mathrm{SL}_n$, a still rather recent result obtained by Salehi Golsefidy and Varjú [43] after several important works in this direction, starting with Helfgott [33], followed by Bourgain and Gamburd [12], Breuillard, Green and Tao [20], and Pyber-Szabó [40].

3.1. Prelude: Expansion in semisimple groups.

Since elements in $\Gamma$ have integer coefficients, $\mathbf{G}$ is defined over $\mathbb{Q}$, so we may choose a set of defining polynomials with coefficients in $\mathbb{Z}$. Given a prime number $p$, this allows us to consider the variety $\mathbf{G}_p$ defined over $\mathbb{F}_p$ by the reduction modulo $p$ of the polynomials defining $\mathbf{G}$. On the space $\ell^2(\mathbf{G}_p(\mathbb{F}_p))$ of square-integrable functions on $\mathbf{G}_p(\mathbb{F}_p)$, we shall consider the convolution operator
\[ T_\mu : \ell^2(\mathbf{G}_p(\mathbb{F}_p)) \to \ell^2(\mathbf{G}_p(\mathbb{F}_p)), \qquad f \mapsto \mu * f. \]
Let $\ell^2_0(\mathbf{G}_p(\mathbb{F}_p)) \subset \ell^2(\mathbf{G}_p(\mathbb{F}_p))$ denote the subspace of functions on $\mathbf{G}_p(\mathbb{F}_p)$ having zero mean.

The theorem we shall need is the following; up to some minor modifications, it appears in Salehi Golsefidy and Varjú [43, Theorem 1].

Theorem 3.3 (Spectral gap theorem). Let $d \ge 2$ and let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{Z})$ such that the Zariski closed subgroup $\mathbf{G}$ generated by $\mu$ is semisimple. Then there exist a constant $c > 0$ and an integer $k$ such that for every prime number $p$,
\[ \|T_\mu^k\|_{\ell^2_0(\mathbf{G}_p(\mathbb{F}_p))} \le 1 - c. \]

Remark 3.

Another way to state the above theorem is to say that the spectral radius of the operator $T_\mu$ restricted to $\ell^2_0(\mathbf{G}_p(\mathbb{F}_p))$ is bounded above by $1 - c$, for every prime number $p$.

For completeness, we now explain how to derive the above theorem from [43, Theorem 1]. The argument uses the following two lemmata.

Lemma 3.4.

Let $\Gamma$ be a subsemigroup of $\mathrm{SL}_d(\mathbb{R})$ whose Zariski closure $\mathbf{G}$ is semisimple. There exists a finite subset $S \subset \Gamma$ such that the semigroup generated by $S$ is Zariski dense in $\mathbf{G}$.

Proof. For any finite subset $S \subset \Gamma$, denote by $Z_e(S)$ the identity component of the Zariski closure of the semigroup generated by $S$. Let $S_0 \subset \Gamma$ be a finite subset such that $\mathbf{H} := Z_e(S_0)$ has maximal dimension among these subgroups. Then $\mathbf{H}$ is also maximal for the order of inclusion, because for every finite $S \subset \Gamma$ the subgroups $Z_e(S_0) \subset Z_e(S_0 \cup S)$ are both irreducible subvarieties and have the same dimension.

In particular, for any $\gamma \in \Gamma$, $\gamma \mathbf{H} \gamma^{-1} = Z_e(\gamma S_0 \gamma^{-1}) \subset \mathbf{H}$. Hence $\mathbf{H}$ is a normal subgroup in $\mathbf{G}$, since $\Gamma$ is Zariski dense. By [7, Theorem 6.8 and Corollary 14.11], the quotient $\mathbf{G}/\mathbf{H}$ is a semisimple linear algebraic group such that the projection $\pi : \mathbf{G} \to \mathbf{G}/\mathbf{H}$ is a morphism of algebraic groups.

Moreover, for any $\gamma \in \Gamma$, there is $k \ge 1$ such that $\gamma^k \in Z_e(\{\gamma\}) \subset \mathbf{H}$. Hence the image $\pi(\Gamma)$ of $\Gamma$ in $\mathbf{G}/\mathbf{H}$ is a torsion group. By the Jordan-Schur theorem [41, Theorem 8.31], $\pi(\Gamma)$ is virtually abelian. But $\pi(\Gamma)$ is Zariski dense in $\mathbf{G}/\mathbf{H}$. Hence the Zariski closure of any of its subgroups of finite index contains the identity component of $\mathbf{G}/\mathbf{H}$. Therefore, the identity component of $\mathbf{G}/\mathbf{H}$ is both semisimple and abelian, hence trivial. It follows that $\mathbf{G}/\mathbf{H}$ is finite. Thus, by adding a finite number of elements to $S_0$, we can make sure that $S_0$ generates a Zariski dense subsemigroup in $\mathbf{G}$. □

Lemma 3.5.

Let $\mathbf{G}$ be a connected semisimple algebraic group defined over $\mathbb{Q}$. Let $S \subset \mathbf{G}(\mathbb{Q})$ be a finite subset which generates a Zariski dense subgroup. Then there exists $k \ge 1$ such that the symmetric set $S^{-k} S^{k}$ also generates a Zariski dense subgroup in $\mathbf{G}$.

Proof. By a result of Nori [39, Theorem 5.2], the group $\Gamma$ generated by $S$ is dense in some open subgroup $\Omega$ of $\mathbf{G}(\mathbb{Q}_p)$, for some prime number $p$. In other words, the union $\bigcup_{k \ge 1} S^k$ is dense in $\Omega$. Since the Lie algebra $\mathfrak{g}$ of $\Omega$ satisfies $[\mathfrak{g}, \mathfrak{g}] = \mathfrak{g}$, the derived subgroup $[\Omega, \Omega]$ contains an open subgroup $\Omega'$; then, the increasing sequence of subsets $(S^k S^{-k})_{k \ge 1}$ gets arbitrarily dense in $\Omega'$.

On the other hand, there exist a neighborhood $U$ of the identity in $\mathbf{G}(\mathbb{Q}_p)$ and $\delta > 0$ such that if $\mathbf{H}$ is a proper algebraic subgroup of $\mathbf{G}$, then $\mathbf{H}(\mathbb{Q}_p)$ is not $\delta$-dense in $U$. Indeed, the Lie algebra $\mathfrak{h}$ of $\mathbf{H}(\mathbb{Q}_p)$ satisfies $\dim_{\mathbb{Q}_p} \mathfrak{h} < \dim_{\mathbb{Q}_p} \mathfrak{g}$, and we can distinguish two cases. If the normalizer $N(\mathfrak{h})$ of $\mathfrak{h}$ in $\mathbf{G}(\mathbb{Q}_p)$ does not contain an open subgroup, then we conclude using [22, Lemma 2.2], which is still valid over $\mathbb{Q}_p$. Otherwise, $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, so that $\mathbf{H}$ is a sum of simple factors of $\mathbf{G}$; there are only finitely many such groups. □

Proof of Theorem 3.3.

Let $\check\mu$ denote the image measure of $\mu$ by the map $g \mapsto g^{-1}$. By Lemmata 3.4 and 3.5 above, there is $k \ge 1$ such that the support of $\check\mu^{*k} * \mu^{*k}$ contains a finite symmetric subset $S$ which generates a Zariski dense subgroup in $\mathbf{G}$. By [43, Theorem 1] applied to $S$, there is $c > 0$ such that for any prime number $p$ sufficiently large,
\[ \|T_{\mu_S}\|_{\ell^2_0(\mathbf{G}_p(\mathbb{F}_p))} \le 1 - c, \]
where $\mu_S$ denotes the normalized counting measure on $S$. Then, we can write
\[ \check\mu^{*k} * \mu^{*k} = \alpha \mu_S + (1 - \alpha) \mu', \]
where $\alpha > 0$ and $\mu'$ is some probability measure on $\mathrm{SL}_d(\mathbb{Z})$. Thus, for any prime number $p$ sufficiently large, using the fact that $T_\mu^* = T_{\check\mu}$,
\[ \|T_\mu^k\|^2_{\ell^2_0(\mathbf{G}_p(\mathbb{F}_p))} \le \|(T_\mu^*)^k T_\mu^k\|_{\ell^2_0(\mathbf{G}_p(\mathbb{F}_p))} = \|T_{\check\mu^{*k} * \mu^{*k}}\|_{\ell^2_0(\mathbf{G}_p(\mathbb{F}_p))} \le \alpha \|T_{\mu_S}\|_{\ell^2_0(\mathbf{G}_p(\mathbb{F}_p))} + (1 - \alpha) \|T_{\mu'}\|_{\ell^2_0(\mathbf{G}_p(\mathbb{F}_p))} \le 1 - \alpha c. \]
□

One can also interpret Theorem 3.3 as a statement on the speed of equidistribution of the random walks associated to $\mu$ on the Cayley graphs $\mathbf{G}_p(\mathbb{F}_p)$. This is explained for instance in [34, §3.1] for the case of simple random walks on a family of expander graphs. In our setting, we obtain the following corollary, whose proof is left to the reader. For a prime number $p$, we denote by $\pi_p : \mathbb{Z} \to \mathbb{F}_p$ the reduction modulo $p$. By abuse of notation we extend the domain of definition of $\pi_p$ to any free $\mathbb{Z}$-module.

Corollary 3.6.

Let $d \ge 2$ and let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{Z})$ such that the Zariski closed subgroup $\mathbf{G}$ generated by $\mu$ is semisimple. There exists $C \ge 0$ such that for every prime number $p$ sufficiently large, and $n \ge C \log p$, for all $a \in M_d(\mathbb{F}_p)$,
\[ \mu^{*n}\big(\{ g \in M_d(\mathbb{Z}) \mid \pi_p(g) = a \}\big) \le \frac{2}{|\mathbf{G}_p(\mathbb{F}_p)|}. \]

We conclude this paragraph with a lemma that will allow us to use the spectral gap theorem in our setting of random walks on the torus.

Lemma 3.7 (Strong irreducibility and semisimplicity). Let $G$ be a Zariski closed subgroup of $\mathrm{SL}_d(\mathbb{R})$ generated by elements of $\mathrm{SL}_d(\mathbb{Z})$ and acting strongly irreducibly on $\mathbb{R}^d$. Then $G$ is semisimple.

Proof. Since $G$ acts irreducibly on a finite-dimensional vector space, it is a reductive group, and can be written as an almost product
\[ G = Z \cdot S, \]
where $Z$ is a torus, central in $G$, and $S$ is semisimple, with $Z \cap S$ finite. The group $Z$ is equal to the intersection of $G$ with the center of the algebra $E$ generated by $G$. Since $E$ is a simple algebra over $\mathbb{R}$, its center can be identified with $\mathbb{R}$ or $\mathbb{C}$. Note that the restriction of the determinant on $M_d(\mathbb{R})$ to the center of $E$ is simply a power of the usual norm on $\mathbb{C}$, and since $G \subset \mathrm{SL}_d(\mathbb{R})$, the group $Z$ must be included in the group of complex numbers of norm $1$. In particular, $G/S \simeq Z/(Z \cap S)$ is compact.

Now since $G$ is defined over $\mathbb{Q}$, the projection map $G \to G/S$ is given by some polynomial map with rational coefficients. In particular, the image $F$ of $G \cap \mathrm{SL}_d(\mathbb{Z})$ inside $Z/(Z \cap S)$ is made of matrices whose entries are rational with bounded denominators, and is therefore finite. But $G$ equals the Zariski closure of $G \cap \mathrm{SL}_d(\mathbb{Z})$, so the image of $G$ itself under the projection $G \to G/S$ is finite, i.e. $G$ is semisimple. □

3.2. Escaping from subvarieties: a consequence of the spectral gap.

The aim of this subsection is to establish the following proposition, using the results of the previous paragraph.

Proposition 3.8.

Let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{Z})$ such that the Zariski closed subgroup $\mathbf{G}$ generated by $\operatorname{Supp} \mu$ is semisimple and connected. There exists a constant $c > 0$ depending on $\mu$ such that for all polynomials $f \in \mathbb{Q}[M_d]$ of degree at most $D$ and not vanishing on $\mathbf{G}$, we have
\[ \forall n \ge 1, \quad \mu^{*n}\big(\{ g \in \Gamma \mid f(g) = 0 \}\big) \ll_{\mu, D} e^{-cn}. \]

This is a more general version of [13, Corollary 1.1], which is stated for the group $\mathrm{SL}_d$ and for the simple random walk on the Cayley graph. The proof is essentially the same. But since it will be important for us to know that the upper bound depends only on the degree of $f$, we provide a detailed argument.

We need the following lemma, which is a consequence of the Lang-Weil inequality. It will be important for us to have an estimate which is uniform for subvarieties of bounded complexity.

Lemma 3.9.

Given a geometrically irreducible subvariety $V \subset \mathbb{A}^d$ defined over $\mathbb{Q}$ and an integer $D \ge 1$, there exists $p_0 = p_0(V, D)$ such that the following holds. Let $f \in \mathbb{Q}[X_1, \dots, X_d]$ be a polynomial of degree at most $D$. Assume that $f$ does not vanish on $V$. Then for every prime number $p \ge p_0$,
\[ \big|\{ \pi_p(x) \in \mathbb{F}_p^d \mid x \in V \cap \mathbb{Z}^d,\ f(x) = 0 \}\big| \ll_{V, D} p^{-1} |V_p(\mathbb{F}_p)|, \]
where $V_p$ denotes the reduction modulo $p$ of $V$.

Note that to speak about reductions modulo $p$ of a variety over $\mathbb{Q}$, we need to choose a model over $\mathbb{Z}$. But since $V$ is a subvariety of an affine space, among such choices there is a canonical one. See the first paragraph in the proof below.

Proof.

We abbreviate $\mathbb{Q}[X_1, \dots, X_d]$ as $\mathbb{Q}[X]$ and $\mathbb{Z}[X_1, \dots, X_d]$ as $\mathbb{Z}[X]$. Let $I \subset \mathbb{Q}[X]$ be the ideal of all polynomials with coefficients in $\mathbb{Q}$ and vanishing on $V$. Let $I_{\mathbb{Z}} = I \cap \mathbb{Z}[X]$. For any prime number $p$, let $V_p$ be the variety over $\mathbb{F}_p$ defined by the ideal $\pi_p(I_{\mathbb{Z}}) \subset \mathbb{F}_p[X]$. By the Bertini-Noether theorem [23, Proposition 10.4.2], $V_p$ is geometrically irreducible for all $p$ larger than a constant depending only on the embedding $V \subset \mathbb{A}^d$.

On the one hand, applying the Lang-Weil inequality [35, Theorem 1] to the irreducible variety $V_p$, we obtain
\[ |V_p(\mathbb{F}_p)| \ge \tfrac{1}{2}\, p^{\dim(V_p)}. \tag{3.1} \]
On the other hand, one can prove, using Gröbner bases [21, Chapter 2], that given an integer $D \ge 1$ there is $p_0 = p_0(V, D)$ such that for all $p \ge p_0$ and all $f \in \mathbb{Q}[X] \setminus I$ of degree at most $D$ there is $h \in (\mathbb{Q} f \oplus I) \cap \mathbb{Z}[X]$ of degree at most $O_{V,D}(1)$ and such that $\pi_p(h) \notin \pi_p(I_{\mathbb{Z}})$. For such $h$, we have
\[ \{ x \in V \cap \mathbb{Z}^d \mid f(x) = 0 \} \subset \{ x \in V \cap \mathbb{Z}^d \mid h(x) = 0 \} \]
and hence
\[ \big|\{ \pi_p(x) \in \mathbb{F}_p^d \mid x \in V \cap \mathbb{Z}^d,\ f(x) = 0 \}\big| \le \big|\{ x \in V_p(\mathbb{F}_p) \mid \pi_p(h)(x) = 0 \}\big|. \]
The right-hand side is the number of $\mathbb{F}_p$-points in the subvariety $V_p \cap \{ \pi_p(h) = 0 \}$. This subvariety has dimension at most $\dim(V_p) - 1$ since $V_p$ is irreducible and $\pi_p(h) \notin \pi_p(I_{\mathbb{Z}})$. Thus, applying a version of the Schwartz-Zippel estimate, like [35, Lemma 1], and using the fact that the complexity controls the degree, we get
\[ \big|(V_p \cap \{ \pi_p(h) = 0 \})(\mathbb{F}_p)\big| \ll_{V, D} p^{\dim(V_p) - 1}. \tag{3.2} \]
Together with (3.1), this proves the desired inequality. □
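As an illustration of the kind of count that Lemma 3.9 controls, one can enumerate points by brute force in a very small case. The following is only a sketch under assumptions that are not in the text: the variety is taken to be $V = \mathrm{SL}_2 \subset \mathbb{A}^4$, the test polynomial $f = \operatorname{tr} - 2$ and all function names are hypothetical choices made for this example. In this case $|V_p(\mathbb{F}_p)| = p(p^2 - 1)$ and the zero set of $f$ on $V_p(\mathbb{F}_p)$ has $p^2$ points, so the proportion of zeros is of order $p^{-1}$.

```python
from itertools import product


def count_sl2_points(p, f=None):
    """Count matrices [[a, b], [c, d]] in SL_2(F_p).

    If a polynomial f is given (as a Python function of a, b, c, d),
    only the points where f vanishes modulo p are counted.
    """
    total = 0
    for a, b, c, d in product(range(p), repeat=4):
        if (a * d - b * c) % p != 1:
            continue  # not in SL_2(F_p)
        if f is None or f(a, b, c, d) % p == 0:
            total += 1
    return total


def trace_minus_two(a, b, c, d):
    # Hypothetical test polynomial of degree 1 (an assumption of this sketch);
    # it does not vanish identically on SL_2.
    return a + d - 2


if __name__ == "__main__":
    for p in (3, 5, 7, 11):
        size = count_sl2_points(p)
        zeros = count_sl2_points(p, trace_minus_two)
        # The ratio zeros / size is p / (p^2 - 1), hence of order 1/p.
        print(p, size, zeros, zeros / size)
```

The brute-force loop has $p^4$ iterations, so it is only meant for very small primes.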

Now we are ready to prove Proposition 3.8.


Proof of Proposition 3.8.

By Lemma 3.9,
\[ \big|\{ \pi_p(g) \in M_d(\mathbb{F}_p) \mid g \in \Gamma,\ f(g) = 0 \}\big| \ll_{\mathbf{G}, D} p^{-1} |\mathbf{G}_p(\mathbb{F}_p)| \]
for every prime number $p \ge p_0(\mathbf{G}, D)$. Combined with Corollary 3.6, this yields
\[ \mu^{*n}\big(\{ g \in \Gamma \mid f(g) = 0 \}\big) \ll_{\mathbf{G}, D} p^{-1} \]
for all $n \ge C \log p$, where $C$ is a constant depending only on $\mu$. We conclude by choosing $p$ to be a prime number such that $p \asymp e^{n/C}$ and $p \ge p_0$. □

3.3. Interlude: large deviation estimates for random matrix products.

In this subsection, $\mu$ is a Borel probability measure on $\mathrm{SL}_d(\mathbb{R})$, not necessarily supported on matrices with integer coefficients. By $\Gamma$ we denote the closure of the subsemigroup generated by $\operatorname{Supp}(\mu)$ and by $G$ the group of $\mathbb{R}$-points of the Zariski closure of $\Gamma$.

Let us first recall the large deviation estimates for random matrix products. This result is originally due to Lepage [36], and the version below is taken from Bougerol [8, Theorem V.6.2]. For $g \in \mathrm{GL}_d(\mathbb{R})$, denote by $\sigma_1(g) \ge \dots \ge \sigma_d(g) > 0$ the singular values of $g$ ordered decreasingly.

Theorem 3.10 (Large deviation estimates). Let $\mu$ be a Borel probability measure on $\mathrm{GL}_d(\mathbb{R})$ having a finite exponential moment. Let $\lambda_k$ denote the $k$-th Lyapunov exponent associated to $\mu$. Assume that $G$ acts strongly irreducibly on $\mathbb{R}^d$. For any $\omega > 0$, there are $c > 0$ and $n_0 > 0$ such that the following holds.
(i) For all $n \ge n_0$,
\[ \mu^{*n}\big(\{ g \in \Gamma \mid |\tfrac{1}{n} \log \|g\| - \lambda_1| \ge \omega \}\big) \le e^{-cn}. \]
(ii) For all $k = 1, \dots, d$ and all $n \ge n_0$,
\[ \mu^{*n}\big(\{ g \in \Gamma \mid |\tfrac{1}{n} \log \sigma_k(g) - \lambda_k| \ge \omega \}\big) \le e^{-cn}. \]
(iii) For all $n \ge n_0$ and all $v \in \mathbb{R}^d \setminus \{0\}$,
\[ \mu^{*n}\Big(\Big\{ g \in \Gamma \;\Big|\; \Big|\tfrac{1}{n} \log \tfrac{\|gv\|}{\|v\|} - \lambda_1\Big| \ge \omega \Big\}\Big) \le e^{-cn}. \]

Item (i) is, of course, a special case of Item (ii) since $\sigma_1(g) = \|g\|$. One consequence of the above theorem that will be useful to us is the following proposition.

Proposition 3.11.

Let $\mu$ be a Borel probability measure on $\mathrm{GL}_d(\mathbb{R})$ having a finite exponential moment. Assume that $G$ acts strongly irreducibly on $\mathbb{R}^d$. Then there exists $\kappa > 0$ such that for all $n \ge 1$ and all $\rho \ge e^{-n}$, for every $v \in \mathbb{R}^d \setminus \{0\}$,
\[ \mu^{*n}\big(\{ g \in \Gamma \mid \|gv\| \le \rho \|g\| \|v\| \}\big) \ll_\mu \rho^{\kappa}. \]

Proof.

Let $r$ denote the proximal dimension of $G$, i.e.
\[ r = \min\{ \operatorname{rank} g \;;\; g \in \overline{\mathbb{R} G} \setminus \{0\} \}. \]
For $g \in G$, write its Cartan decomposition $g = k \operatorname{diag}(\sigma_1(g), \dots, \sigma_d(g)) \ell$, where $k$ and $\ell$ are orthogonal matrices and $\sigma_1(g) \ge \dots \ge \sigma_d(g)$ are the singular values of $g$. Define also $V^+_g = k \operatorname{Span}(e_1, \dots, e_r)$ and $W^-_g = \ell^{-1} \operatorname{Span}(e_{r+1}, \dots, e_d)$, where $(e_1, \dots, e_d)$ is the standard basis of $\mathbb{R}^d$.

We first prove the proposition in the case where $G$ is proximal, i.e. $r = 1$. Under this condition, we have [11, Lemma 4.1(2)]:
\[ \forall g \in \mathrm{GL}_d(\mathbb{R}),\ \forall v \in \mathbb{R}^d \setminus \{0\}, \quad \|gv\| \ge d_\measuredangle(\mathbb{R} v, W^-_g)\, \|g\| \|v\|, \tag{3.3} \]
where $d_\measuredangle$ is defined, for any two subspaces $V, W \subset \mathbb{R}^d$ with respective orthonormal bases $(v_1, \dots, v_s)$ and $(w_1, \dots, w_t)$, by the formula
\[ d_\measuredangle(V, W) = \|v_1 \wedge \dots \wedge v_s \wedge w_1 \wedge \dots \wedge w_t\|. \]
Then, [11, Lemma 4.5] applied to the transposed random walk, which is also proximal, shows that there exists $\kappa > 0$ such that
\[ \forall \rho \ge e^{-n}, \quad \mu^{*n}\big(\{ g \in \Gamma \mid d_\measuredangle(V^+_{{}^t g}, v^\perp) \le \rho \}\big) \ll_\mu \rho^{\kappa}. \tag{3.4} \]
Noting that $(W^-_g)^\perp = V^+_{{}^t g}$ and that $d_\measuredangle(V, W) = d_\measuredangle(V^\perp, W^\perp)$ if $\dim(V) + \dim(W) = d$, we see that the result follows from (3.3).

We now use Bougerol's trick [8, Proof of Theorem V.6.2] to reduce the proposition to the proximal case. Namely, by [18, Lemma 3.2] or [6, Lemma 4.36], we have a decomposition of $\wedge^r \mathbb{R}^d$ into $G$-invariant subspaces
\[ \wedge^r \mathbb{R}^d = \Lambda^+ \oplus \Lambda^0 \]
such that the action of $G$ on $\Lambda^+$ is strongly irreducible and proximal and moreover,
\[ \forall g \in G, \quad \|(\wedge^r g)|_{\Lambda^+}\| \gg_G \|g\|^r \ge \|\wedge^r g\|. \tag{3.5} \]
Denote by $\pi^+ : \wedge^r \mathbb{R}^d \to \Lambda^+$ the projection with respect to this decomposition. By [18, Lemma 3.3], for any $v \in \mathbb{R}^d$, we can find a subspace $P \subset \mathbb{R}^d$ of dimension $r$ and containing $v$ such that
\[ \|\pi^+(v_P)\| \gg_G \|v_P\|, \tag{3.6} \]
where $v_P \in \wedge^r \mathbb{R}^d$ is the wedge product of the elements of a basis of $P$.

Now, observe that (3.3) still holds without the proximality assumption, and we have, for every $g$ in $\mathrm{GL}_d(\mathbb{R})$, $d_\measuredangle(\mathbb{R} v, W^-_g) \ge d_\measuredangle(P, W^-_g)$. Hence, it suffices to prove, for some $\kappa > 0$ and $n_0 \ge 1$,
\[ \forall n \ge n_0,\ \forall \rho \ge e^{-n}, \quad \mu^{*n}\big(\{ g \in \Gamma \mid d_\measuredangle(P, W^-_g) \le \rho \}\big) \ll_\mu \rho^{\kappa}. \tag{3.7} \]
By [18, Lemma 4.2],
\[ \forall g \in \mathrm{GL}_d(\mathbb{R}), \quad \frac{\|(\wedge^r g) v_P\|}{\|\wedge^r g\| \|v_P\|} \le d_\measuredangle(P, W^-_g) + \frac{\sigma_{r+1}(g)}{\sigma_r(g)}. \]
Combined with (3.5) and (3.6), this yields
\[ \forall g \in G, \quad \frac{\|(\wedge^r g)|_{\Lambda^+} \pi^+(v_P)\|}{\|(\wedge^r g)|_{\Lambda^+}\| \|\pi^+(v_P)\|} \ll_G d_\measuredangle(P, W^-_g) + \frac{\sigma_{r+1}(g)}{\sigma_r(g)}. \]
By a result of Guivarc'h-Raugi [26], $\lambda_r > \lambda_{r+1}$. Applying Theorem 3.10(ii) to $k = r$ and $k = r + 1$ with $\omega = (\lambda_r - \lambda_{r+1})/3 > 0$, we get $c > 0$ and $n_0 \ge 1$ such that
\[ \forall n \ge n_0, \quad \mu^{*n}\Big(\Big\{ g \in \Gamma \;\Big|\; \frac{\sigma_{r+1}(g)}{\sigma_r(g)} \le e^{-\omega n} \Big\}\Big) \ge 1 - e^{-cn}. \]
Note that $e^{-cn} \le \rho^{c}$, since $\rho \ge e^{-n}$. The desired estimate (3.7) then follows by the proximal case applied to the induced random walk on $\Lambda^+$ and to the vector $\pi^+(v_P) \in \Lambda^+$. □

3.4. Escaping a small neighborhood of a subvariety.

For an integer $D \ge 1$ and a regular function $f \in \mathbb{R}[\mathbf{G}]$ on $\mathbf{G}$, we say that $f$ has degree at most $D$ if it can be represented by a polynomial on $M_d$ of degree at most $D$. Denote by $\mathbb{R}[\mathbf{G}]_{\le D}$ the finite-dimensional subspace consisting of regular functions of degree at most $D$. We fix a norm on $\mathbb{R}[\mathbf{G}]_{\le D}$.

Lemma 3.12.

Let $\mu$ be a Borel probability measure on $\mathrm{SL}_d(\mathbb{Z})$ having a finite exponential moment. Assume that the Zariski closed subgroup $\mathbf{G}$ generated by $\operatorname{Supp} \mu$ is semisimple and connected.

Given an integer $D \ge 1$, there exist constants $C > 0$, $c > 0$ and $n_0 \ge 1$ depending on $\mu$ and $D$ such that
\[ \forall f \in \mathbb{R}[\mathbf{G}]_{\le D},\ \forall n \ge n_0, \quad \mu^{*n}\big(\{ g \in \Gamma \mid |f(g)| < e^{-Cn} \|f\| \}\big) \le e^{-cn}. \]

Proof.

By Theorem 3.10(i), there is $c > 0$ such that for $n$ large enough
\[ \mu^{*n}\big(\{ g \in \Gamma \mid \|g\| \ge e^{2\lambda_1 n} \}\big) \le e^{-cn}, \]
where $\lambda_1$ is the top Lyapunov exponent associated to the random walk defined by $\mu$ on $\mathbb{R}^d$. Thus, we are left to bound from above the $\mu^{*n}$-measure of the set
\[ A_C = \big\{ g \in \Gamma \mid |f(g)| \le e^{-Cn} \|f\| \text{ and } \|g\| \le e^{2\lambda_1 n} \big\}. \]
Let us abbreviate $V = \mathbb{R}[\mathbf{G}]_{\le D}$. Let $V_{\mathbb{Q}}$ denote the set of functions $f \in V$ which are $\mathbb{Q}$-rational, i.e. represented by polynomials on $M_d$ with coefficients in $\mathbb{Q}$. Note that $V_{\mathbb{Q}}$ defines a $\mathbb{Q}$-structure on $V$.

We claim that if $C$ is chosen large enough then $A_C$ must be contained in some subvariety $\{ f_0 = 0 \}$ where $f_0 \in V_{\mathbb{Q}} \setminus \{0\}$. Then Proposition 3.8 applied to $f_0$ allows us to conclude.

For every $g \in \Gamma$ let $\mathrm{ev}_g : V \to \mathbb{R}$ be the evaluation map
\[ \forall v \in V, \quad \mathrm{ev}_g(v) = v(g). \]
Since the matrices $g \in A_C$ have integer coefficients, the intersection
\[ W = \bigcap_{g \in A_C} \ker(\mathrm{ev}_g) \]
is a subspace of $V$ defined over $\mathbb{Q}$, i.e. $W = \mathbb{R} \otimes_{\mathbb{Q}} (W \cap V_{\mathbb{Q}})$. We want to show that $W \cap V_{\mathbb{Q}}$ contains a nonzero element. Assume for a contradiction that $W = \{0\}$. Write $s = \dim(V)$. We can choose $g_1, \dots, g_s \in A_C$ such that
\[ \{0\} = \bigcap_{i=1}^{s} \ker(\mathrm{ev}_{g_i}). \]
Fix a basis $(v_1, \dots, v_s)$ of $V$ in which each element is represented by a polynomial on $M_d$ with coefficients in $\mathbb{Z}$. Thus, the map $\Phi : V \to \mathbb{R}^s$ defined by
\[ \forall v \in V, \quad \Phi(v) = \big(\mathrm{ev}_{g_i}(v)\big)_{1 \le i \le s} \]
is invertible and has integer coefficients when expressed in the basis $(v_1, \dots, v_s)$ and the standard basis of $\mathbb{R}^s$. Thus, in these bases, the determinant of $\Phi$ satisfies $|\det(\Phi)| \ge 1$. Moreover,
\[ \|\Phi\| \ll \max_{1 \le i, j \le s} |\mathrm{ev}_{g_i}(v_j)| \ll_{\mathbf{G}, D} \max_{1 \le i \le s} \|g_i\|^{D} \le e^{2D\lambda_1 n}. \]
It follows that
\[ \|\Phi^{-1}\| \ll \frac{\|\Phi\|^{s-1}}{|\det(\Phi)|} \le e^{2Ds\lambda_1 n}. \]
For $f \in V$, by definition of $A_C$, we have
\[ \|\Phi(f)\| = \max_{1 \le i \le s} |f(g_i)| \le e^{-Cn} \|f\|. \]
Thus,
\[ \|f\| \le \|\Phi^{-1}\| \|\Phi(f)\| \ll_{\mathbf{G}, D} e^{(2Ds\lambda_1 - C)n} \|f\|. \]
We get a contradiction if $C$ is chosen to be larger than $2Ds\lambda_1 + O_{\mathbf{G}, D}(1)$. □

The following is a variant and an easy consequence of the previous lemma.

Lemma 3.13.

Let $\mu$ be a Borel probability measure on $\mathrm{SL}_d(\mathbb{Z})$ having a finite exponential moment. Assume that the Zariski closed subgroup $\mathbf{G}$ generated by $\operatorname{Supp} \mu$ is semisimple and connected.

Given an integer $D$, there exist constants $C > 0$, $c > 0$ and $n_0 \ge 1$ depending on $\mu$ and $D$ such that
\[ \forall f \in \mathbb{R}[\mathbf{G}]_{\le D},\ \forall n \ge n_0, \quad \mu^{*n}\big(\{ g \in \Gamma \mid |f(g)| < e^{-Cn} \|g\| \|f\| \}\big) \le e^{-cn}. \]

Proof.

For any $n \ge 1$ and any $C > 0$, we have
\[ \mu^{*n}\big(\{ g \in \Gamma \mid |f(g)| < e^{-(C + 2\lambda_1)n} \|g\| \|f\| \}\big) \le \mu^{*n}\big(\{ g \in \Gamma \mid |f(g)| < e^{-Cn} \|f\| \}\big) + \mu^{*n}\big(\{ g \in \Gamma \mid \|g\| \ge e^{2\lambda_1 n} \}\big). \]
We conclude by using Lemma 3.12 for the first term and Theorem 3.10(i) for the second. □
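The large deviation estimate of Theorem 3.10(i), which enters both of the preceding proofs, can be observed experimentally. The sketch below is an illustration only and rests on assumptions that are not in the text: the measure is taken to be uniform on the two unipotent matrices `A` and `B` of $\mathrm{SL}_2(\mathbb{Z})$ (which generate a Zariski dense subsemigroup of $\mathrm{SL}_2$), the norm is the largest absolute value of an entry, and all names are hypothetical. It samples products $g_n \cdots g_1$ in exact integer arithmetic and records $\frac{1}{n} \log \|g_n \cdots g_1\|$; for large $n$ the sampled values cluster around a common value, as the theorem predicts.

```python
import math
import random

# Hypothetical generators in SL_2(Z); the uniform measure on {A, B} is an
# assumption of this sketch, not a measure considered in the text.
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))


def mat_mul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def random_product(n, rng):
    """Return g_n ... g_1 with each g_i drawn uniformly from {A, B}."""
    g = ((1, 0), (0, 1))
    for _ in range(n):
        g = mat_mul(rng.choice((A, B)), g)
    return g


def growth_rate(g, n):
    """(1/n) log ||g|| for the max-entry norm (exact integers, no overflow)."""
    return math.log(max(abs(e) for row in g for e in row)) / n


def sample_rates(n, trials, seed=0):
    """Sample the growth rate of `trials` independent products of length n."""
    rng = random.Random(seed)
    return [growth_rate(random_product(n, rng), n) for _ in range(trials)]


if __name__ == "__main__":
    for n in (50, 200, 800):
        rates = sample_rates(n, trials=100)
        print(n, min(rates), sum(rates) / len(rates), max(rates))
```

Since all entries are non-negative integers and each step at most doubles the largest entry, every sampled rate lies between $0$ and $\log 2$; the interest of the experiment is in how the spread between the minimum and the maximum behaves as $n$ grows.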

3.5. Non-concentration near affine subspaces.

We now want to prove Proposition 3.1. Of course, if we are to show that the random walk does not concentrate near any proper affine subspace in the algebra $E$, we should first check that the group $G$ is not trapped in any proper affine subspace.

Lemma 3.14.

Let $G$ be a subgroup of $\mathrm{GL}_d(\mathbb{R})$ acting irreducibly on $\mathbb{R}^d$, and $E$ the associative subalgebra generated by $G$ in $M_d(\mathbb{R})$. Then $G$ is not contained in any proper affine subspace of $E$.

Proof. Equivalently, we have to show that the linear span $W = \operatorname{Span}(G - 1)$ of $G - 1$ is $E$. For this, it suffices to prove that $1 \in W$. Firstly, $W$ is closed under multiplication. Indeed, any product between two elements of $W$ is a linear combination of elements of the form $(g - 1)(h - 1)$ with $g, h \in G$ and we have
\[ (g - 1)(h - 1) = (gh - 1) - (g - 1) - (h - 1) \in W. \]
Secondly, any subspace of $\mathbb{R}^d$ preserved by $W$ is preserved by $G$, hence the only subspaces preserved by $W$ are $\mathbb{R}^d$ and $\{0\}$. We conclude by using the following algebraic lemma. □

Momentarily, in the next lemma and its proof, algebras are not assumed to be unital. Thus, a subalgebra is a linear subspace that is closed under multiplication. Accordingly, a left (resp. right) ideal is a subspace preserved under multiplication on the left (resp. right) by all elements of the algebra.

Lemma 3.15.

If a nonzero subalgebra of $M_d(\mathbb{R})$ does not preserve any proper nontrivial subspace of $\mathbb{R}^d$, then it contains the multiplicative identity of $M_d(\mathbb{R})$.

Proof. Let $W \subset M_d(\mathbb{R})$ be such a subalgebra. We first show that the only nilpotent right ideal of $W$ is the zero ideal. Indeed, let $I$ be a nonzero nilpotent right ideal of $W$. Let $k$ be the maximal number such that $I^k \ne 0$. Let $f \in I^k$. Since $I^{k+1} = 0$, we have
\[ f(\mathbb{R}^d) \subset \bigcap_{f' \in I} \ker(f'). \]
The intersection on the right-hand side is preserved by $W$ and not equal to $\mathbb{R}^d$, because $I$ is a nonzero right ideal. Then $f$ must be zero, which contradicts $I^k \ne 0$.

Thus, $W$ is an algebra without radical [46, Chapter XVI, §116]. By [46, Chapter XVI, §117] (more precisely, we apply the theorem to algebras, or rings with operator domain $\mathbb{R}$, using the terminology of Van der Waerden, see [46, Chapter XVI, §115]), $W$ has a multiplicative identity $1_W$. Its image $1_W(\mathbb{R}^d)$ is preserved by $W$ and nonzero since $W$ is nonzero. Hence $1_W(\mathbb{R}^d) = \mathbb{R}^d$ and $1_W \in \mathrm{GL}_d(\mathbb{R})$. Then $1_W^2 = 1_W$ forces $1_W$ to be the identity of $M_d(\mathbb{R})$. □
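Lemmas 3.14 and 3.15 can be checked by hand in the smallest case. The sketch below is an illustration under assumptions that are not in the text: $G$ is taken to be the group generated by the two unipotent matrices `A` and `B` in $\mathrm{SL}_2(\mathbb{Z})$, so that $E = M_2(\mathbb{R})$, and the helper names are hypothetical. It computes the rank of the family $\{g - 1\}$ for four elements $g$ of $G$, using exact rational arithmetic, and finds that these four differences already span the four-dimensional algebra $E$.

```python
from fractions import Fraction

# Hypothetical generators in SL_2(Z), chosen only for this illustration.
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))


def mat_mul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def minus_identity(g):
    """Flatten the matrix g - 1 into a vector of length 4."""
    return [g[0][0] - 1, g[0][1], g[1][0], g[1][1] - 1]


def rank(vectors):
    """Rank over Q of a list of vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


if __name__ == "__main__":
    elements = [A, B, mat_mul(A, B), mat_mul(B, A)]
    print(rank([minus_identity(g) for g in elements]))
```

In this example the identity is an explicit combination of the differences: $1 = (AB - 1) + (BA - 1) - 2(A - 1) - 2(B - 1)$, in line with the claim $1 \in W$ in the proof of Lemma 3.14.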

Now we are ready to prove Proposition 3.1.

Proof of Proposition 3.1.

By Lemma 3.14, $\mathbb{R}[\mathbf{G}]_{\le 1}$ is isomorphic to $\mathbb{R} \oplus E^*$, the space of affine mappings from $E$ to $\mathbb{R}$. On $\mathbb{R} \oplus E^*$, let $G$ act by
\[ \forall g \in G,\ \forall f \in \mathbb{R} \oplus E^*,\ \forall x \in E, \quad (g \cdot f)(x) = f(xg). \]
By Lemma 3.13, there exist $C_0 \ge 1$ and $c_0 > 0$ such that
\[ \forall f \in \mathbb{R} \oplus E^*,\ \forall m \ge 1, \quad \mu^{*m}\big(\{ g \in \Gamma \mid |f(g)| < e^{-C_0 m} \|g\| \|f\| \}\big) \ll_\mu e^{-c_0 m}. \tag{3.8} \]
Given a proper affine subspace $W \subset E$, there exists $f \in \mathbb{R} \oplus E^*$ such that its linear part $f_1 \in E^*$ has norm $\|f_1\| = 1$ and
\[ \forall g \in E, \quad d(g, W) = |f(g)|. \]
Let $\rho \ge e^{-n}$. Pick $m$ such that $e^{-C_0 m} \asymp \rho^{1/2}$. Using the relation $\mu^{*n} = \mu^{*m} * \mu^{*(n-m)}$, we have
\[ \mu^{*n}\big(\{ g \in \Gamma \mid |f(g)| \le \rho \|g\| \}\big) = \int_\Gamma \mu^{*m}\big(\{ g \in \Gamma \mid |(h \cdot f)(g)| \le \rho \|gh\| \}\big) \, d\mu^{*(n-m)}(h). \]
We distinguish two cases according to whether $\rho \|h\| \le e^{-C_0 m} \|h \cdot f\|$. If this is the case, then $\rho \|gh\| \le e^{-C_0 m} \|g\| \|h \cdot f\|$ and then by (3.8),
\[ \mu^{*m}\big(\{ g \in \Gamma \mid |(h \cdot f)(g)| \le \rho \|gh\| \}\big) \ll_\mu e^{-c_0 m} \ll \rho^{c_0 / (2 C_0)}. \]
Otherwise, $\|h \cdot f\| \le \rho e^{C_0 m} \|h\| \le \rho^{1/2} \|h\|$ by the choice of $m$. Thus,
\[ \mu^{*n}\big(\{ g \in \Gamma \mid |f(g)| \le \rho \|g\| \}\big) \ll_\mu \rho^{c_0 / (2 C_0)} + \mu^{*(n-m)}\big(\{ h \in \Gamma \mid \|h \cdot f\| \le \rho^{1/2} \|h\| \}\big). \]
Now observe that $E^*$ is a submodule of the semisimple $G$-module $M_d(\mathbb{R})^*$, which is isomorphic to the sum of $d$ copies of the simple $G$-module $\mathbb{R}^d$. It follows that $E^*$ is isomorphic to the sum of $\dim(E)/d$ copies of $\mathbb{R}^d$. For each $i = 1, \dots, \dim(E)/d$, let $\pi_i : \mathbb{R} \oplus E^* \to \mathbb{R}^d$ denote the projection to the $i$-th $\mathbb{R}^d$-factor. Remembering $\|f_1\| = 1$, we obtain $\|\pi_i(f)\| \gg_G 1$ for some $i$. Then $\|h \cdot f\| \le \rho^{1/2} \|h\|$ implies $\|h \pi_i(f)\| \ll_G \rho^{1/2} \|h\| \|\pi_i(f)\|$. Hence
\[ \mu^{*(n-m)}\big(\{ h \in \Gamma \mid \|h \cdot f\| \le \rho^{1/2} \|h\| \}\big) \le \mu^{*(n-m)}\big(\{ h \in \Gamma \mid \|h \pi_i(f)\| \le C_1 \rho^{1/2} \|h\| \|\pi_i(f)\| \}\big), \]
where $C_1$ is a constant depending only on $G$.

By our choice of $m$, we have $C_1 \rho^{1/2} \ge e^{-(n-m)}$. Hence, by Proposition 3.11,
\[ \mu^{*(n-m)}\big(\{ h \in \Gamma \mid \|h \pi_i(f)\| \le C_1 \rho^{1/2} \|h\| \|\pi_i(f)\| \}\big) \ll_\mu \rho^{\kappa_1 / 2}, \]
where $\kappa_1 > 0$ is the constant given by Proposition 3.11, which depends only on $\mu$. This proves the desired estimate with $\kappa = \min\{ \frac{c_0}{2 C_0}, \frac{\kappa_1}{2} \}$. □

3.6. Escaping a larger neighborhood of a subvariety.

The rest of this section is devoted to the proof of Proposition 3.2. The idea is to generalize what we did above for affine subspaces. This time, the variety that we want to avoid is defined by a general polynomial map $f$ on the algebra $E$, so that we shall have to consider the representation $\rho : G \to \mathrm{GL}(\mathbb{R}[\mathbf{G}])$ defined by
\[ \forall g \in G,\ \forall f \in \mathbb{R}[\mathbf{G}],\ \forall x \in \mathbf{G}, \quad (\rho(g) f)(x) = f(xg). \]
We refer to finite-dimensional subrepresentations of this representation as $G$-modules. For a $G$-module $M$, we denote by $\lambda_1(\mu, M)$ the top Lyapunov exponent associated to the random walk on $M$ defined by $\mu$:
\[ \lambda_1(\mu, M) = \lim_{n \to +\infty} \frac{1}{n} \int_G \log \|\rho(g)|_M\| \, d\mu^{*n}(g), \]
where $\|\cdot\|$ denotes some operator norm.

For a real number $\lambda \ge 0$, define $M_\lambda$ to be the sum of submodules $M$ of $\mathbb{R}[\mathbf{G}]_{\le D}$ such that $\lambda_1(\mu, M) \ge \lambda$. Let $p_\lambda : \mathbb{R}[\mathbf{G}]_{\le D} \to M_\lambda$ be an epimorphism of $G$-modules onto $M_\lambda$. Remark that $M_\lambda$ is a sum of isotypical components in $\mathbb{R}[\mathbf{G}]_{\le D}$, so that $p_\lambda$ is uniquely defined.

Proposition 3.16.

Let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{Z})$, for some $d \ge 2$. Let $\Gamma$ denote the subsemigroup generated by $\mu$, $\mathbf{G}$ the Zariski closure of $\Gamma$ in $\mathrm{SL}_d$, and $\lambda_1$ the top Lyapunov exponent of $\mu$. Assume that:
(a) The measure $\mu$ has a finite exponential moment;
(b) The action of $\Gamma$ on $\mathbb{R}^d$ is irreducible;
(c) The algebraic group $\mathbf{G}$ is Zariski connected,
and let the notation be as above.

Given $D$, $\lambda \ge 0$ and $\omega > 0$, there are $c > 0$ and $n_0 \ge 1$ (depending also on $\mu$) such that
\[ \forall f \in \mathbb{R}[\mathbf{G}]_{\le D},\ \forall n \ge n_0, \quad \mu^{*n}\big(\{ g \in \Gamma \mid |f(g)| \le e^{(\lambda - \omega)n} \|p_\lambda(f)\| \}\big) \le e^{-cn}. \]

Proof.

Note that, for any $0 < m < n$, we have
\[ \mu^{*n}\big\{ g \in \Gamma \mid |f(g)| \le e^{(\lambda - \omega)n} \|p_\lambda(f)\| \big\} = \int_\Gamma \mu^{*m}\big\{ g \in \Gamma \mid |(\rho(h) f)(g)| \le e^{(\lambda - \omega)n} \|p_\lambda(f)\| \big\} \, d\mu^{*(n-m)}(h). \tag{3.9} \]
For any $f \in \mathbb{R}[\mathbf{G}]_{\le D}$, there is a simple $G$-module $M$ contained in $M_\lambda$ such that
\[ \|p_M(f)\| \gg_{\mathbf{G}, D, \lambda} \|p_\lambda(f)\|, \]
where $p_M : M_\lambda \to M$ is a projection of $G$-modules. By definition of $M_\lambda$, the top Lyapunov exponent in $M$ satisfies $\lambda_1(\mu, M) \ge \lambda$. Note that since $\mathbf{G}$ is Zariski connected, the $G$-action on the simple module $M$ is strongly irreducible. Hence, by the large deviation estimate Theorem 3.10(iii), there are $c > 0$ and $n_1 \ge 1$ such that for all $m > 0$ satisfying $n - m \ge n_1$,
\[ \mu^{*(n-m)}\big\{ h \in \Gamma \mid \|\rho(h) p_M(f)\| \le e^{(\lambda - \frac{\omega}{2})(n-m)} \|p_M(f)\| \big\} \le e^{-c(n-m)}, \]
and hence
\[ \mu^{*(n-m)}\big\{ h \in \Gamma \mid \|\rho(h) f\| \le e^{(\lambda - \frac{\omega}{2})(n-m)} \|p_\lambda(f)\| \big\} \le e^{-c(n-m)}. \tag{3.10} \]
Applying Lemma 3.12 to the function $\rho(h) f$, for $h \in \Gamma$, we obtain
\[ \forall m \ge m_0, \quad \mu^{*m}\big\{ g \in \Gamma \mid |(\rho(h) f)(g)| < e^{-Cm} \|\rho(h) f\| \big\} \le e^{-cm}, \tag{3.11} \]
for some $C > 0$, $c > 0$ and $m_0 \ge 1$. Setting $m = \big\lfloor \frac{\omega n}{2(C + \lambda)} \big\rfloor$, so that
\[ (\lambda - \omega) n - \big(\lambda - \tfrac{\omega}{2}\big)(n - m) \le -Cm, \]
the desired inequality follows from (3.9), (3.10) and (3.11). □
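The choice of $m$ at the end of the proof is a piece of exponent bookkeeping that can be checked numerically. The sketch below assumes that the large deviation estimate is applied with $\omega/2$ in place of $\omega$, so that the relevant inequality reads $(\lambda - \omega) n - (\lambda - \frac{\omega}{2})(n - m) \le -Cm$ with $m = \lfloor \omega n / (2(C + \lambda)) \rfloor$; the function names and the sample parameter values are hypothetical. Expanding the left-hand side gives $-\frac{\omega}{2} n + (\lambda - \frac{\omega}{2}) m$, so the inequality amounts to $(C + \lambda - \frac{\omega}{2}) m \le \frac{\omega}{2} n$, which holds because $(C + \lambda) m \le \frac{\omega}{2} n$.

```python
import math


def split_time(n, omega, C, lam):
    """Return m = floor(omega * n / (2 * (C + lam)))."""
    return math.floor(omega * n / (2 * (C + lam)))


def exponent_gap(n, omega, C, lam):
    """Return (lam - omega) n - (lam - omega/2)(n - m) + C m.

    With m = split_time(n, omega, C, lam), this quantity is expected to be
    non-positive whenever omega > 0, C > 0 and lam >= 0.
    """
    m = split_time(n, omega, C, lam)
    return (lam - omega) * n - (lam - omega / 2) * (n - m) + C * m


if __name__ == "__main__":
    # Hypothetical sample values of omega, C and lambda.
    for n in (10, 100, 1000):
        m = split_time(n, 0.1, 5.0, 1.0)
        print(n, m, exponent_gap(n, 0.1, 5.0, 1.0))
```

The gap is at most $-\frac{\omega}{2} m$, so some room remains in the exponent; this is consistent with the statement of Proposition 3.16 only claiming some rate $c > 0$.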

3.7. Criterion to have nonzero component in modules of maximal Lyapunov exponent.

In order to use Proposition 3.16, we need to be able to say when a regular function has a nonzero component in a simple submodule of large Lyapunov exponent.

Lemma 3.17.

Let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{R})$, $d \ge 2$, with some finite exponential moment. Assume that the group $\Gamma$ generated by $\mu$ is non-compact and acts irreducibly on $\mathbb{R}^d$, and that its Zariski closure $\mathbf{G}$ is connected.

Let $f \in \mathbb{R}[M_d]$ be a polynomial of degree $D \ge 1$ whose degree $D$ homogeneous part does not vanish on the algebra $E$ generated by $G$. The following holds for every integer $k \ge \dim(E)$. Consider the polynomial $\bar F \in \mathbb{R}[M_d^k]$ defined by
\[ \bar F(x_1, \dots, x_k) = f(x_1 + \dots + x_k) \]
and let $F \in \mathbb{R}[\mathbf{G}^k]$ be the restriction of $\bar F$ to $\mathbf{G}^k$. Then
\[ p(F) \ne 0, \]
where $p : \mathbb{R}[\mathbf{G}^k]_{\le D} \to \mathbb{R}[\mathbf{G}^k]_{\le D}$ is the projection to the sum of all simple $G^k$-submodules $M$ of $\mathbb{R}[\mathbf{G}^k]_{\le D}$ having $\lambda_1(\mu^{\otimes k}, M) \ge D \lambda_1(\mu, \mathbb{R}^d)$.

We shall use the theory of the highest weight as well as the theory of random walks on semisimple groups. So let us fix some notation and recall briefly the needed results. Let $\mathfrak{g}$ denote the Lie algebra of $G$. Let $K$ be a maximal compact subgroup of $G$. Inside the orthogonal complement, with respect to the Killing form, of the Lie algebra of $K$, we choose a Cartan subspace $\mathfrak{a}$ of $\mathfrak{g}$. Every algebraic representation of $G$ is diagonalizable for $\mathfrak{a}$. That is, for every $G$-module $M$, we have
\[ M = \bigoplus_{\chi \in \mathfrak{a}^*} M_\chi, \]
where for each $\chi \in \mathfrak{a}^*$, $M_\chi$ is the associated weight space
\[ M_\chi = \{ v \in M \mid \forall a \in \mathfrak{a},\ \exp(a) \cdot v = e^{\chi(a)} v \}. \]
The linear forms $\chi \in \mathfrak{a}^*$ for which $M_\chi \ne \{0\}$ are called the weights of $M$. Denote by $\Sigma(M)$ the set of weights of $M$.

The set of nontrivial weights of the adjoint representation of $G$ is the set of restricted roots. We denote it by $\Sigma$. It forms a root system. We fix a set $\Sigma^+$ of positive roots and denote by $\mathfrak{a}^+$ the associated Weyl chamber:
\[ \mathfrak{a}^+ = \{ a \in \mathfrak{a} \mid \forall \alpha \in \Sigma^+,\ \alpha(a) \ge 0 \}. \]
We also write $\mathfrak{a}^{++}$ to denote the interior of the Weyl chamber:
\[ \mathfrak{a}^{++} = \{ a \in \mathfrak{a} \mid \forall \alpha \in \Sigma^+,\ \alpha(a) > 0 \}. \]
Let $g \in G$. The Cartan projection $\kappa(g)$ of $g$ is the $\mathfrak{a}^+$-part in its Cartan decomposition, that is, the unique element in $\mathfrak{a}^+$ such that $g \in K \exp(\kappa(g)) K$. The law of large numbers for a semisimple group, [6, Theorem 10.9], says that there is an element $\vec\lambda(\mu)$ in $\mathfrak{a}^{++}$, called the Lyapunov vector associated to $\mu$, such that
\[ \vec\lambda(\mu) = \lim_{n \to +\infty} \frac{1}{n} \int_G \kappa(g) \, d\mu^{*n}(g). \]
If $M$ is a simple $G$-module, then $M$ has a highest weight, denoted by $\chi_M \in \Sigma(M)$, so that for any weight $\chi \in \Sigma(M)$, $\chi_M - \chi$ is a sum of positive roots. By [6, Corollary 10.12], we have
\[ \lambda_1(\mu, M) = \chi_M(\vec\lambda(\mu)). \]

Now let us recall the definition of the limit set of the group $G$ in $M_d(\mathbb{R})$. We write $\mathbb{R} G$ for the set of all elements of $M_d(\mathbb{R})$ of the form $\lambda g$ with $\lambda \in \mathbb{R}$ and $g \in G$. Let $\overline{\mathbb{R} G}$ denote the closure of $\mathbb{R} G$ in $M_d(\mathbb{R})$ for the norm topology. Let $r_G$ denote the proximal dimension of $G$, defined by
\[ r_G = \min\big\{ \operatorname{rank}(\pi) \mid \pi \in \overline{\mathbb{R} G} \setminus \{0\} \big\}. \tag{3.12} \]
The limit set of $G$ in $M_d(\mathbb{R})$ is defined to be
\[ \Pi_G = \big\{ \pi \in \overline{\mathbb{R} G} \mid \operatorname{rank}(\pi) = r_G \big\}. \]

Lemma 3.18.

Let $\mathbf{G} < \mathrm{SL}_d$ be a connected semisimple $\mathbb{R}$-group. Assume that $G = \mathbf{G}(\mathbb{R})$ acts irreducibly on $\mathbb{R}^d$. Let $\pi_0 \in M_d(\mathbb{R})$ be the spectral projector to the weight space associated to the highest weight. Then
\[ \Pi_G = \mathbb{R}^* K \pi_0 K = \mathbb{R}^* G \pi_0 G. \]
If moreover $G$ is not compact then, writing $E = \operatorname{Span}_{\mathbb{R}}(G)$, the sum-set
\[ \underbrace{G \pi_0 G + \dots + G \pi_0 G}_{\dim E \text{ times}} \]
contains an open subset of $E$.

Proof. Let $\chi_0 = \chi_{\mathbb{R}^d} \in \Sigma(\mathbb{R}^d)$ denote the highest weight of the simple $G$-module $\mathbb{R}^d$. For a weight $\chi \in \Sigma(\mathbb{R}^d)$, let $\pi_\chi$ be the spectral projector to the associated weight space. Let $a \in \mathfrak{a}^{++}$ be any element. We have
\[ \|\exp(na)\|^{-1} \exp(na) = \sum_{\chi \in \Sigma(\mathbb{R}^d)} e^{n(\chi(a) - \chi_0(a))} \pi_\chi. \]
Now by definition of the highest weight, $\chi(a) - \chi_0(a) < 0$ for $\chi \ne \chi_0$. It follows that
\[ \pi_0 = \lim_{n \to +\infty} \|\exp(na)\|^{-1} \exp(na) \in \overline{\mathbb{R} G}. \]
Let $\pi \in \overline{\mathbb{R} G}$ be another nonzero element. There exist sequences $(\lambda_n) \in \mathbb{R}^{\mathbb{N}}$ and $(g_n) \in G^{\mathbb{N}}$ such that
\[ \pi = \lim_{n \to +\infty} \lambda_n g_n. \]
Let $g_n = k_n \exp(a_n) \ell_n \in K \exp(\mathfrak{a}^+) K$ be the Cartan decomposition of $g_n$. By compactness of $K$, replacing $(g_n)$ by a subsequence if necessary, we may assume that $k_n$ converges to $k \in K$ and $\ell_n$ converges to $\ell$. Then
\[ k^{-1} \pi \ell^{-1} = \lim_{n \to +\infty} \lambda_n \exp(a_n). \]
Observe that
\[ \exp(a_n) = \sum_{\chi \in \Sigma(\mathbb{R}^d)} e^{\chi(a_n)} \pi_\chi. \]
Hence
\[ \lambda_n \exp(a_n) = \lambda_n e^{\chi_0(a_n)} \Big( \pi_0 + \sum_{\chi \in \Sigma(\mathbb{R}^d) \setminus \{\chi_0\}} e^{\chi(a_n) - \chi_0(a_n)} \pi_\chi \Big). \]
Note that $e^{\chi(a_n) - \chi_0(a_n)} \le 1$ for all $\chi \in \Sigma(\mathbb{R}^d)$ and $n \ge 0$. We deduce that $\lambda_n e^{\chi_0(a_n)}$ converges to some $\lambda \ne 0$, for otherwise $\pi$ would be zero. Moreover,
\[ \operatorname{rank}(\pi) = \operatorname{rank}(k^{-1} \pi \ell^{-1}) \ge \operatorname{rank}(\pi_0). \]
Equality holds if and only if $\lim_{n \to +\infty} e^{\chi(a_n) - \chi_0(a_n)} = 0$ for all $\chi \in \Sigma(\mathbb{R}^d) \setminus \{\chi_0\}$, which in turn is equivalent to $k^{-1} \pi \ell^{-1} = \lambda \pi_0$. Therefore, $r_G = \operatorname{rank}(\pi_0)$ and $\Pi_G \subset \mathbb{R}^* K \pi_0 K$. We conclude by noticing that $\Pi_G$ is invariant under multiplication by $G$ on both sides: $\mathbb{R}^* K \pi_0 K \subset \mathbb{R}^* G \pi_0 G \subset \Pi_G$.

For the last assertion, assume that $G$ is not compact. Then $\chi_0 \ne 0$ and therefore $\chi_0(\mathfrak{a}) = \mathbb{R}$. For $a \in \mathfrak{a}$, $\exp(a) \pi_0 = e^{\chi_0(a)} \pi_0$, so that $\mathbb{R}^*_+ \pi_0 \subset G \pi_0$ and hence
\[ \mathbb{R}^*_+ G \pi_0 G \subset G \pi_0 G. \]
Since the action of $G$ on $\mathbb{R}^d$ is irreducible, $E$ is a simple algebra over $\mathbb{R}$, by a version of Wedderburn's theorem [46, 2. page 194]. Observe that $\operatorname{Span}_{\mathbb{R}}(G \pi_0 G)$ is a nontrivial two-sided ideal of $E$, hence $\operatorname{Span}_{\mathbb{R}}(G \pi_0 G) = E$. Therefore we can pick $\dim(E)$ elements $(\pi_1, \dots, \pi_{\dim(E)})$ from $G \pi_0 G$ making a basis of $E$. We conclude that
\[ \underbrace{G \pi_0 G + \dots + G \pi_0 G}_{\dim E \text{ times}} \supset \mathbb{R}^*_+ \pi_1 + \dots + \mathbb{R}^*_+ \pi_{\dim(E)} \]
contains an open subset of $E$. □

Proof of Lemma 3.17.

The Lie algebra of G k is g ⊕ · · · ⊕ g , in which we choose b = a ⊕ · · · ⊕ a to be the Cartan subspace. Then the associated restricted rootsystem is the direct sum Σ ⊔ · · · ⊔ Σ ⊂ b ∗ . We choose Σ + ⊔ · · · ⊔ Σ + as the set ofpositive roots so that b + = a + × · · · × a + is the corresponding Weyl chamber and ~λ ( µ ⊗ k ) = ( ~λ ( µ ) , . . . , ~λ ( µ )) ∈ b + is the Lyapunov vector associated to the random walk deﬁned by µ ⊗ k .For any algebraic representation π of G k , we denote by Σ( G k , π ) the set ofweights of π with respect to b .Let σ : G → GL( R d ) denote the standard representation of G and, for i =1 , . . . , k , let σ i : G k → G → GL( R d ) denote the representation of G k obtained bycomposing the projection G k → G to the i -th factor with σ . Note that for each i ,there is a natural bijection between Σ( G k , σ i ) → Σ( σ ) , χ ˜ χ such that the weight χ is the composition of the i -th projection with ˜ χ ∈ Σ( σ ) .Let G k act on R [ M kd ] ≤ D by right translation. Let ρ : G k → GL( R [ M kd ] ≤ D ) denote the corresponding representation. Then, ρ is equivalent to D M j =0 Sym j (cid:0) σ ⊕ · · · ⊕ σ | {z } d times ⊕ · · · ⊕ σ k ⊕ · · · ⊕ σ k | {z } d times (cid:1) . It follows that any weight χ ∈ Σ( G k , ρ ) in ρ is the sum of at most D elements from S ki =1 Σ( G k , σ i ) . In particular, λ ( µ ⊗ k , R [ M kd ] ≤ D ) = max χ ∈ Σ( G k ,ρ ) χ ( ~λ ( µ ⊗ k )) ≤ D max i max χ ∈ Σ( G k ,σ i ) χ ( ~λ ( µ ⊗ k ))= D max i max χ ∈ Σ( G k ,σ i ) ˜ χ ( ~λ ( µ )) ≤ Dλ . Since G is not compact, λ is positive, by a result of Furstenberg [25], and it followsthat λ ( µ ⊗ k , R [ M kd ]

Dnχ σ ( a ) ≤ nχ M j ( b ) + O (1) and necessarily Dχ σ ( a ) ≤ χ M j ( b ) . We have seen that χ M j is the sum of D elements from S ki =1 Σ( G k , σ i ) : χ M j = χ + · · · + χ D . Then χ M j ( b ) = χ ( b ) + · · · + χ D ( b ) = ˜ χ ( a ) + · · · + ˜ χ D ( a ) . Thus we have simultaneously Dχ σ ( a ) ≤ ˜ χ ( a ) + · · · + ˜ χ D ( a ) and ˜ χ + · · · + ˜ χ D ≤ Dχ σ for the order over the set of weights. Since a ∈ a ++ , , this forces ˜ χ = · · · = ˜ χ D = χ σ . Therefore λ ( µ ⊗ k , M j ) = χ M j ( ~λ ( µ ⊗ k )) = ( ˜ χ + · · · + ˜ χ D )( ~λ ( µ )) = Dχ σ ( ~λ ( µ )) = Dλ . We conclude that p ( ρ ( h ) F ) = 0 and hence p ( F ) = ρ ( h ) − p ( ρ ( h ) F ) = 0 . (cid:3) We can now easily deduce Proposition 3.2 from Proposition 3.16 and Lemma 3.17.


Proof of Proposition 3.2.

Note that under our assumptions, G cannot be compact.Let f : E → R be a polynomial map of degree D , and denote by f D its degree D homogeneous part. Deﬁne F ∈ R [ G k ] by ∀ g , . . . , g k ∈ G, F ( g , . . . , g k ) = f ( g + . . . + g k ) . Let p : R [ G k ] ≤ D → R [ G k ] ≤ D be the projection to the sum of all simple submod-ules M such that λ ( µ ⊗ k , M ) = Dλ ( µ, R d ) . By Lemma 3.17, k p ( F ) k = 0 implies k f D k = 0 , and since these two expressions deﬁne seminorms on the space of poly-nomial maps on E , it follows that(3.14) k f D k ≪ k p ( F ) k . By Proposition 3.16 applied to the random walk on G k associated to the measure µ ⊗ k , with λ = Dλ ( µ, R d ) we get that for every ω > , there exists c > and n ∈ N such that ∀ n ≥ n , µ ∗ n (cid:0)(cid:8) g ∈ G k | | F ( g ) | ≤ e ( λ − ω ) n k p ( F ) k (cid:9)(cid:1) ≤ e − cn . Together with (3.14), this proves what we want. (cid:3)

Fourier decay for random walks.

The relevant object here is the measure ˜ µ n , obtained from µ ∗ n after rescaling by a factor e − λ n . This rescaling shrinks µ ∗ n to a ball of subexponential size around . An important consequence of the resultsof this section and the previous one is the following theorem. Theorem 3.19 (Fourier decay for ˜ µ n ) . Let µ be a probability measure on SL d ( Z ) , d ≥ . Let Γ denote the subsemigroup generated by µ , G the Zariski closure of Γ in SL d and E the subalgebra of M d ( R ) generated by G ( R ) . Denote ˜ µ n = e − λ n ∗ µ ∗ n where λ is the top Lyapunov exponent of µ . Assume that:(a) The measure µ has a ﬁnite exponential moment;(b) The action of Γ on R d is irreducible;(c) The algebraic group G is Zariski connected.Then there exists α > such that for every < α < α , there exists c > and n ≥ such that for every n ≥ n , ∀ ξ ∈ E ∗ with e αn ≤ k ξ k ≤ e α n , | c ˜ µ n ( ξ ) | ≤ e − c n . Proof.

We want to apply Theorem 2.1 to the measure ˜ µ n , and for that, we shouldcheck that it is not concentrated near any aﬃne subspace, nor near any translateof the set of non-invertible elements of E . This will follow from Propositions 3.1and 3.2. Recall that given ρ > , we write S E ( ρ ) for the set S E ( ρ ) = { x ∈ E | | det E ( x ) | ≤ ρ } . Let D = dim( E ) . Under the assumptions of the theorem, we claim that there exists κ > depending only µ such that for every ω > , there exists c = c ( µ, ω ) > suchthat for every n ≥ , we can decompose the convolution ˜ µ ⊞ Dn ⊟ ˜ µ ⊞ Dn = η + θ into positive Borel measures satisfying the following properties.(i) θ ( E ) ≪ µ e − cn ,(ii) η ( E \ B E (0 , e ωn )) ≪ µ e − cn ,(iii) ∀ x ∈ E, η ( x + S E ( e − ωn )) ≪ µ e − cn ,(iv) ∀ ρ ≥ e − n , ∀ W < E aﬃne subspace , η ( W ( ρ ) ) ≪ µ e ωn ρ κ .To justify this claim, let η be the restriction of ˜ µ n to E \ B E (0 , e − ωn ) and put η ′ = η ⊞ ˜ µ ⊞ ( D − n and η = η ′ ⊟ ˜ µ ⊞ Dn . By Theorem 3.10(i), there is c = c ( µ, ω ) > such that for every n ≥ , ˜ µ n ( B E (0 , e − ωn )) ≪ µ e − cn and ˜ µ n ( E \ B E (0 , e ωn )) ≪ µ e − cn . It follows that θ ( E ) ≪ µ e − cn and η (cid:0) E \ B E (0 , De ωn ) (cid:1) ≪ µ e − cn . For x ∈ E , apply Proposition 3.2 to the polynomial function y det E ( y − e λ m x ) .Note that these polynomials all have degree D = dim( E ) and all have the samedegree D homogeneous part, namely det E . We obtain c > such that η ′ ( x + S E ( e − ωn )) ≤ µ ⊞ Dn (cid:0)(cid:8) y ∈ E | | det E ( e − λ n y − x ) | ≤ e − ωn (cid:9)(cid:1) ≤ µ ⊞ Dn (cid:0)(cid:8) y ∈ E | | det E ( y − e λ n x ) | ≤ e ( Dλ − ω ) m (cid:9)(cid:1) ≪ µ e − cn . Since this property is preserved under additive convolution, the same holds for η . Now let W be a proper aﬃne subspace of E . 
Using the deﬁnition of η andProposition 3.1, we ﬁnd for every ρ ≥ e − n , η ( { g ∈ E | d ( g, W ) ≤ ρ } ) ≤ ˜ µ n (cid:0)(cid:8) g ∈ E | d ( g, W ) ≤ ρe ωn k g k (cid:9)(cid:1) ≤ µ n (cid:0)(cid:8) g ∈ E | d ( g, W ) ≤ ρe ωn k g k (cid:9)(cid:1) ≪ µ e ωκn ρ κ . Again this property is preserved under additive convolution, so that η satisﬁes therequired conditions.Let ε = ε ( µ, κ ) > and s = s ( µ, κ ) ≥ be the constant given by Theorem 2.1applied with the parameter κ . Set ω = αε/ . Recall that we chose c = c ( µ, ω ) .Finally set τ = min { c/ , εκ } . With the choice of these parameters, for every n large enough, for every R ∈ [ e αn , e n ] ,(i) θ ( E ) ≤ R − τ ,(ii) η ( E \ B E (0 , R ε )) ≤ R − τ ,(iii) for all x ∈ E , η ( x + S E ( R − ε )) ≤ R − τ ,(iv) for all ρ ≥ R − and every proper aﬃne subspace W ⊂ E , η ( W ( ρ ) ) ≤ R ε ρ κ .In other words, the assumptions of Theorem 2.1 are satisﬁed for the measure η atthe scale R . Therefore, for all ξ ∈ E ∗ in the range e αn ≤ k ξ k ≤ e n , we have | c η ∗ s ( ξ ) | ≤ k ξ k − ετ . It follows that | \ (˜ µ ⊞ Dn ⊟ ˜ µ ⊞ Dn ) ∗ s ( ξ ) | ≤ k ξ k − ετ + O s ( k ξ k − τ ) . Applying Lemma 2.7 to ˜ µ ⊞ Dn ⊟ ˜ µ ⊞ Dn , we get | d ˜ µ sn ( ξ ) | ≤ k ξ k − c , where c = ετ D ) s and, again, assuming that n ≥ n ( µ, α ) . This shows the desiredupper bound for c ˜ µ n ( ξ ) provided that n is a multiple of s .To prove the estimate for general n , write n = sq + r with ≤ r < s . On theone hand, by (2.13), c ˜ µ n ( ξ ) = Z c ˜ µ sq ( ξx ) d˜ µ r ( x ) , INEAR RANDOM WALKS ON THE TORUS 29

On the other hand, it follows from the Markov inequality and the fact that $\tilde\mu_r$ has bounded exponential moment that, for $x$ outside of a set of exponentially small $\tilde\mu_r$-measure, e − αn k ξ k ≤ k ξx k ≤ e n s k ξ k . This proves the theorem with α = s . □

The set of large Fourier coefficients

Starting from Theorem 3.19, we now use some Fourier analysis to show a first intermediate statement towards Theorem 1.2.

Proposition 4.1 (First step: concentration and separation) . Let µ be a probabilitymeasure on SL d ( Z ) , d ≥ . Denote by Γ the subsemigroup generated by µ , and by G the Zariski closure of Γ in SL d . Assume that:(a) The measure µ has a ﬁnite exponential moment;(b) The action of Γ on R d is irreducible;(c) The algebraic group G is Zariski connected.There exist constants C ≥ and σ > τ > such that the following holds.Let ν be a Borel probability measure on T d . Let t ∈ (0 , / . Assume that for some a ∈ Z d \ { } , | \ µ ∗ n ∗ ν ( a ) | ≥ t and n ≥ C | log t | . Then, writing N = e σn k a k and M = e − τn N , there exists a M -separated set X ⊂ T d such that ν [ x ∈ X B ( x, N ) ! ≥ t C . The proof of this statement goes by two steps. First, following [11], one appliesa Fourier analytic lemma [11, Proposition 7.5] to translate the concentration of ν to a statement about its Fourier coeﬃcients. Then, one uses the Fourier decay of ˜ µ n to study the set of large Fourier coeﬃcients(4.1) A ( t ) = { a ∈ Z d | | ˆ ν ( a ) | ≥ t } , and prove the desired statement.4.1. Detecting concentration from the Fourier coeﬃcients.

Because it is so elementary, and yet beautiful, we include the Fourier analytic lemma needed for our argument. The reader is referred to [11, Proposition 7.5] for its ingenious proof.

Lemma 4.2.

Given d ∈ N , there exists c > such that if a measure ν on T d satisﬁes N ( A ( t ) ∩ B (0 , N ) , M ) ≥ s (cid:18) NM (cid:19) d for some numbers s, t > and some M, N ≥ such that M < cN , then there existsa M -separated subset X ⊂ T d such that ν [ x ∈ X B ( x, N ) ! ≥ c ( st ) . Going back to the statement of Proposition 4.1 above we see that it is enoughto show that, under the same assumptions, there exist C ≥ and σ > τ > suchthat, for N = e σn k a k and M = e − τn N ,(4.2) N ( A ( t C ) ∩ B (0 , N ) , M ) ≥ t C (cid:16) NM (cid:17) d . This is the goal of the next paragraph.
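The covering number appearing in (4.2) can be made concrete on a toy example. The sketch below assumes that N(A, M) denotes the least number of balls of radius M needed to cover A, and uses the cardinality of a greedily chosen maximal M-separated subset as a stand-in, which is comparable to N(A, M) up to constants depending only on the dimension. The function names and the sample set of frequencies are hypothetical and serve only as an illustration.

```python
import math
from itertools import product

def greedy_separated_subset(points, sep):
    """Greedily extract a maximal subset with pairwise distances >= sep."""
    chosen = []
    for p in points:
        if all(math.dist(p, q) >= sep for q in chosen):
            chosen.append(p)
    return chosen

def covering_proxy(points, radius, sep):
    """Size of a maximal sep-separated subset of the points lying in B(0, radius)."""
    inside = [p for p in points if math.hypot(*p) <= radius]
    return len(greedy_separated_subset(inside, sep))

# Toy stand-in for the set of large Fourier coefficients: every integer
# frequency in a square of Z^2 (so d = 2). This is an assumption made for
# the illustration, not data from the paper.
N, M, d = 16, 4, 2
frequencies = list(product(range(-N, N + 1), repeat=d))
count = covering_proxy(frequencies, N, M)

# Density s such that count = s * (N / M)^d, to be compared with (4.2).
ratio = count / (N / M) ** d
```

When every frequency in the ball is "large", as in this toy set, the ratio stays bounded away from zero; a sparse set of large coefficients would give a much smaller value.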

Fourier decay and large coeﬃcients.

For $a \in \mathbb{Z}^d$ and $x \in \mathbb{T}^d$, we denote by $(a, x) \mapsto \langle a, x \rangle \in \mathbb{T}$ the natural pairing. Vectors in $\mathbb{Z}^d = \widehat{\mathbb{T}^d}$ indexing Fourier coefficients are naturally understood as row vectors, so that for any $g \in \mathrm{SL}_d(\mathbb{Z})$, we have
$$\langle a, gx \rangle = \langle ag, x \rangle.$$
Before we start the proof of (4.2), we record an elementary lemma, not much more than the Cauchy-Schwarz inequality, which shows that the set of large Fourier coefficients of a measure has some additive structure. It will later be combined with the multiplicative properties of $\mu^{*n}$, allowing us to exploit the sum-product phenomenon for the study of the set of large Fourier coefficients. This approach to Fourier coefficients of multiplicative convolutions of measures goes back to the work of Bourgain and Konyagin [15] on exponential sums in finite fields.

We use the symbols $\boxplus$ and $\boxminus$ introduced in Section 2.

Lemma 4.3 (Additive structure of Fourier coefficients). Let $\mu$ be a Borel probability measure on $\mathrm{SL}_d(\mathbb{Z})$ and $\nu$ a Borel probability measure on $\mathbb{T}^d$. If
$$|\widehat{\mu * \nu}(a_0)| \ge t > 0,$$
then for any integer $k \ge 1$, the set
$$A = \bigl\{ g \in M_d(\mathbb{Z}) \bigm| |\hat\nu(a_0 g)| \ge t^{2k}/2 \bigr\}$$
satisfies
$$\bigl( \mu^{\boxplus k} \boxminus \mu^{\boxplus k} \bigr)(A) \ge t^{2k}/2.$$

Proof.

Observe that
$$\widehat{\mu * \nu}(a_0) = \int_{\mathbb{T}^d} \int_{\Gamma} e(\langle a_0, gx \rangle) \, d\mu(g) \, d\nu(x) = \int_{\mathbb{T}^d} \int_{\Gamma} e(\langle a_0 g, x \rangle) \, d\mu(g) \, d\nu(x).$$
By Hölder's inequality,
$$\begin{aligned}
t^{2k} \le |\widehat{\mu * \nu}(a_0)|^{2k}
&\le \int_{\mathbb{T}^d} \Bigl| \int_{\Gamma} e(\langle a_0 g, x \rangle) \, d\mu(g) \Bigr|^{2k} d\nu(x) \\
&\le \int_{\Gamma^{2k}} \hat\nu\bigl( a_0 (g_1 + \dots + g_k - g_{k+1} - \dots - g_{2k}) \bigr) \, d\mu^{\otimes 2k}(g_1, \dots, g_{2k}) \\
&\le \int_{E} |\hat\nu(a_0 g)| \, d\bigl( \mu^{\boxplus k} \boxminus \mu^{\boxplus k} \bigr)(g) \\
&\le \bigl( \mu^{\boxplus k} \boxminus \mu^{\boxplus k} \bigr)(A) + \frac{t^{2k}}{2} \bigl( \mu^{\boxplus k} \boxminus \mu^{\boxplus k} \bigr)(E \setminus A),
\end{aligned}$$
which finishes the proof of the lemma. □

Combining the above observation and Theorem 3.19, we can derive (4.2).
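The duality between the action on the torus and the right action on frequencies, which opens the proof of Lemma 4.3, can be checked numerically. The following is a minimal sketch, assuming finitely supported measures: `mu` is a hypothetical measure on two matrices of SL_2(Z) and `nu` a hypothetical atomic measure on rational points of the 2-torus. It compares the Fourier coefficient of the convolution at a frequency `a0` with the average of the coefficients of `nu` at the frequencies `a0 g`.

```python
import cmath
from fractions import Fraction

def e(theta):
    """The character e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * float(theta))

def row_times_matrix(a, g):
    """Right action of g on the row vector a."""
    d = len(a)
    return tuple(sum(a[i] * g[i][j] for i in range(d)) for j in range(d))

def matrix_times_point(g, x):
    """Action of g on a point x of the torus, coordinates reduced mod 1."""
    d = len(x)
    return tuple(sum(g[i][j] * x[j] for j in range(d)) % 1 for i in range(d))

def fourier_coefficient(nu, a):
    """Integral of e(<a, x>) against the atomic measure nu."""
    return sum(w * e(sum(ai * xi for ai, xi in zip(a, x))) for x, w in nu.items())

def convolve(mu, nu):
    """Law of g x, where g has law mu and x has law nu (independent)."""
    out = {}
    for g, p in mu.items():
        for x, w in nu.items():
            y = matrix_times_point(g, x)
            out[y] = out.get(y, 0.0) + p * w
    return out

# Hypothetical data, chosen only for the illustration.
mu = {((2, 1), (1, 1)): 0.5, ((1, 1), (0, 1)): 0.5}
nu = {(Fraction(1, 5), Fraction(2, 7)): 0.25,
      (Fraction(3, 11), Fraction(1, 2)): 0.75}
a0 = (3, -2)

lhs = fourier_coefficient(convolve(mu, nu), a0)
rhs = sum(p * fourier_coefficient(nu, row_times_matrix(a0, g)) for g, p in mu.items())
```

Up to floating-point error, `lhs` and `rhs` agree, which is the identity behind the first display of the proof.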

Proof of (4.2) . As before, for n ≥ , we let ˜ µ n = ( e − λ n ) ∗ µ n denote the rescalingof µ n = µ ∗ n . Denote by E the subalgebra of M d ( R ) generated by G ( R ) and write D = dim( E ) . By Theorem 3.19 there exists constants α > and c > such that ∀ ξ ∈ E ∗ with e α n D +4 ≤ k ξ k ≤ e α n , | c ˜ µ n ( ξ ) | ≤ k ξ k − c , Now ﬁx δ = e − α n and write α = 1 / (4 D + 2) . Let C = C ( D, α ) from Lemma 4.4below and set k = ⌈ C /c ⌉ so that the above implies ∀ ξ ∈ E ∗ with δ − α ≤ k ξ k ≤ δ − , | \ ˜ µ ⊞ kn ⊟ ˜ µ ⊞ kn ( ξ ) | ≤ k ξ k − C , This says that the measure ˜ µ ⊞ kn ⊟ ˜ µ ⊞ kn is regular at all scales between δ and δ α . INEAR RANDOM WALKS ON THE TORUS 31

On the other hand, since | \ µ n ∗ ν ( a ) | ≥ t , it follows from Lemma 4.3 that theset A = (cid:8) g ∈ E ∩ M d ( Z ) | | ˆ ν ( a g ) | ≥ t := t k / (cid:9) satisﬁes (cid:0) µ ⊞ kn ⊟ µ ⊞ kn (cid:1) ( A ) ≫ t k . Letting ˜ A = e − λ n · A be the rescaling of A , we ﬁnd (cid:0) ˜ µ ⊞ kn ⊟ ˜ µ ⊞ kn (cid:1) ( ˜ A ) ≫ t k . From the large deviation estimate Theorem 3.10(i), we also have (cid:0) ˜ µ ⊞ kn ⊟ ˜ µ ⊞ kn (cid:1) ( E \ B E (0 , δ − α )) ≪ µ e − c n for some c = c ( µ ) > . Assuming n ≥ kc | log t | , this implies (cid:0) ˜ µ ⊞ kn ⊟ ˜ µ ⊞ kn (cid:1) ( ˜ A ∩ B E (0 , δ − α )) ≫ t k . So we can apply Lemma 4.4 to the restriction of ˜ µ ⊞ kn ⊟ ˜ µ ⊞ kn to B E (0 , δ − α ) .Letting t = t k , we obtain x ∈ B E (0 , δ − α ) such that N ( ˜ A ∩ B E ( x, δ / ) , δ ) ≫ D t D +11 δ − D/ . Rescaling back, we ﬁnd(4.3) N (cid:0) A ∩ B E ( e λ n x, N ) , M (cid:1) ≫ D t (cid:16) N M (cid:17) D , where t = t D +11 , N = e σn and M = e − τn N with σ = λ − α and τ = α . Inaccordance with the statement of the proposition, we put N = N k a k and M = M k a k . Consider the map ϕ : E → R d , g a g . Letting A ′ = A ∩ B E ( e λ n x, N ) , wehave(4.4) N ( A ′ , M ) ≤ N (cid:0) ϕ ( A ′ ) , M (cid:1) max b ∈ ϕ ( A ′ ) N (cid:0) A ′ ∩ ϕ − ( B ( b, M )) , M (cid:1) . We claim that(4.5) max b ∈ ϕ ( A ′ ) N (cid:0) A ′ ∩ ϕ − ( B ( b, M )) , M (cid:1) ≪ E (cid:16) N M (cid:17) D − d Evidently k ϕ k ≍ E k a k . Let W = ker ϕ . Since G acts irreducibly on R d , ϕ is surjective and hence dim( W ) = D − d . The restriction ϕ | W ⊥ : W ⊥ → R d isbijective. Moreover, by a compactness argument, k ϕ − | W ⊥ k ≍ E k a k − . Consequently, for any y ∈ E , ϕ − ( B ( ϕ ( y ) , M )) ⊂ y + W ( O E ( M ))0 . Hence N (cid:0) A ′ ∩ ϕ − ( B ( ϕ ( y ) , M )) , M (cid:1) ≤ N (cid:0) B E (0 , N ) ∩ W ( O E ( M ))0 , M (cid:1) ≪ E (cid:16) N M (cid:17) D − d , which proves the claim (4.5). From (4.3), (4.4) and (4.5), we get N (cid:0) ϕ ( A ′ ) , M (cid:1) ≫ E t (cid:16) NM (cid:17) d . 
By deﬁnition of A ′ , we have ϕ ( A ′ ) ⊂ A ( t ) ∩ B ( b, k ϕ k N ) , where b = e λ n a x and k ϕ k N ≪ E N .This is almost what we want, except that the ball B ( b, N ) is not centered at theorigin. To recenter that ball, we make use once more of the additive properties of the set of large Fourier coeﬃcients. Choosing the densest ball of radius N/ inside B ( b, N ) , we get some b ′ ∈ R d such that N (cid:0) A ( t ) ∩ B ( b ′ , N , M (cid:1) ≫ E t (cid:16) NM (cid:17) d . Choose an M -separated subset A ⊂ A ( t ) ∩ B ( b ′ , N/ of cardinality | A | ≫ N (cid:0) A ( t ) ∩ B ( b ′ , N , M (cid:1) and such that all Fourier coeﬃcients ˆ ν ( a ) , for a ∈ A , fall into the same quadrantof C . Then | A | t ≤ (cid:12)(cid:12) X a ∈ A ˆ ν ( a ) (cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)Z T d X a ∈ A e ( h a, x i ) d ν ( x ) (cid:12)(cid:12)(cid:12) By the Cauchy-Schwarz inequality, | A | t ≤ Z T d (cid:12)(cid:12) X a ∈ A e ( h a, x i ) (cid:12)(cid:12) d ν ( x ) = X a ,a ∈ A | ˆ ν ( a − a ) | Thus, there exists a ∈ A such that | A | t ≤ X a ∈ A | ˆ ν ( a − a ) | . Set A = ( A − a ) ∩ A (cid:0) t (cid:1) , we have | A | ≥ t | A | and A ⊂ B (0 , N ) . It followsthat N (cid:16) A (cid:16) t (cid:17) ∩ B (0 , N ) , M (cid:17) ≫ E t t (cid:16) NM (cid:17) d . This concludes our proof. (cid:3)
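The recentering step above rests on expanding the square of a trigonometric sum, which turns an integral against ν into a sum of Fourier coefficients at differences of frequencies. Here is a minimal numerical sketch of that expansion, assuming a hypothetical atomic measure `nu` on the 2-torus and a hypothetical finite set `freqs` of frequencies; neither is taken from the paper.

```python
import cmath
from fractions import Fraction

def e(theta):
    """The character e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * float(theta))

def pairing(a, x):
    """Natural pairing <a, x> between a frequency and a point of the torus."""
    return sum(ai * xi for ai, xi in zip(a, x))

def fourier_coefficient(nu, a):
    """Integral of e(<a, x>) against the atomic measure nu."""
    return sum(w * e(pairing(a, x)) for x, w in nu.items())

def difference(a1, a2):
    return tuple(u - v for u, v in zip(a1, a2))

# Hypothetical data, chosen only for the illustration.
nu = {(Fraction(1, 5), Fraction(2, 7)): 0.25,
      (Fraction(3, 11), Fraction(1, 2)): 0.75}
freqs = [(5, 1), (6, 1), (5, 2), (7, 3)]

# Integral of |sum_a e(<a, x>)|^2 against nu ...
lhs = sum(w * abs(sum(e(pairing(a, x)) for a in freqs)) ** 2 for x, w in nu.items())
# ... equals the double sum of Fourier coefficients at differences.
rhs = sum(fourier_coefficient(nu, difference(a1, a2)) for a1 in freqs for a2 in freqs)

# Pigeonhole: some base point a2 carries at least the average of the double sum.
best = max(sum(abs(fourier_coefficient(nu, difference(a1, a2))) for a1 in freqs)
           for a2 in freqs)
```

The last quantity illustrates why one can translate the set of frequencies by a single well-chosen element and still keep many large Fourier coefficients near the origin.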

The next lemma is the regularity statement we need for measures on the Eu-clidean space that have a strong Fourier decay. It essentially states that if a setin R D carries a large proportion of a measure with small Fourier coeﬃcients at allfrequencies between δ − α and δ − − α , then we can ﬁnd a ball of radius δ O ( α ) in theset on which the measure is comparable to the Lebesgue measure at scale δ . Lemma 4.4 (Regularity from Fourier decay) . Given D ≥ and α > , there existconstants c = c ( D, α ) > and C = C ( D, α ) > such that the following holds forall < δ < ct . Let µ be a Borel measure on R D , of total mass µ ( R D ) ≤ . Let A be a subset of R D . Assume(i) Supp( µ ) ⊂ B (0 , δ − α ) ,(ii) ∀ ξ ∈ R D , with δ − α ≤ k ξ k ≤ δ − − α , | ˆ µ ( ξ ) | ≤ k ξ k − C ,(iii) µ ( A ) ≥ t .Then there exists x ∈ R D such that N ( A ∩ B ( x, δ β ) , δ ) ≥ ct D +1 (cid:16) δ β δ (cid:17) D , where β = (2 D + 1) α .Proof. Let ϕ : R D → R be a nonnegative smooth function supported on B (0 , such that R R D ϕ = 1 . Set ϕ δ ( x ) = δ − D ϕ ( δ − x ) , ∀ x ∈ R D . Note that ∀ ξ ∈ R D , c ϕ δ ( ξ ) = ˆ ϕ ( δξ ) . Since ϕ is smooth, for any C > , we have(4.6) ∀ ξ ∈ R D , | ˆ ϕ ( ξ ) | ≪ C (1 + k ξ k ) − C . Deﬁne µ δ = µ ⊞ ϕ δ , viewed either as a measure or as a smooth function on R D .Clearly, µ δ ( A ( δ ) ) ≥ µ ( A ) ≥ t. INEAR RANDOM WALKS ON THE TORUS 33

Let c > be a small constant depending on D and α to be determined later.Assume δ < ct and set ρ = cδ β t . Let ( B i ) ≤ i ≤ i max be an essentially disjoint coveringof B (0 , δ − α ) by closed balls of radius ρ . In other words, the intersection multiplicityof the covering is at most C d = O d (1) , so that in particular the number of balls isat most i max = O d ( δ Dα ρ − D ) . Consider I = n ≤ i ≤ i max | µ δ ( A ( δ ) ∩ B i ) µ δ ( B i ) ≥ t C d o . Using the ﬁnite multiplicity of the covering, we infer that P i ∈ I µ δ ( B i ) ≥ t/ . Hencethere exists i ∈ I such that µ δ ( B i ) ≫ ti max ≫ d tδ Dα ρ − D . We ﬁx this i from now on. Deﬁne M = max x ∈ B i µ δ ( x ) and m = min x ∈ B i µ δ ( x ) . We have µ δ ( B i ) ≤ M | B i | and hence(4.7) M ≫ d tδ Dα . Let x ∈ B i such that µ δ ( x ) = M . By the Plancherel theorem, for any x ∈ B i , µ δ ( x ) = µ ⊞ ϕ δ ( x ) = Z R D e ( −h ξ, x i )ˆ µ ( ξ ) c ϕ δ ( ξ ) d ξ Thus, | µ δ ( x ) − µ δ ( x ) | ≤ Z R D | − e ( h ξ, x − x i ) || ˆ µ ( ξ ) || c ϕ δ ( ξ ) | d ξ ≤ T + 2 T + 2 T , where T = Z k ξ k≤ δ − α | − e ( h ξ, x − x i ) | d ξ ≪ d Z k ξ k≤ δ − α k ξ kk x − x k d ξ ≪ d δ − ( D +1) α ρ,T = Z δ − α ≤k ξ k≤ δ − − α | ˆ µ ( ξ ) | d ξ ≪ d Z δ − α ≤k ξ k≤ δ − − α k ξ k − C d ξ ≪ d δ ( C − D ) α C − D ,T = Z k ξ k≥ δ − − α | ˆ ϕ ( δξ ) | d ξ ≪ D,C Z k ξ k≥ δ − − α δ − C k ξ k − C d ξ ≪ D,C δ ( C − D ) α − D C − D .

In the last line, we used (4.6). Picking $\beta = (2D + 1)\alpha$, $C_1 = \frac{D\alpha + 1}{\alpha} + D$ and $C_2 = \frac{D\alpha + D + 1}{\alpha} + D$ and putting these inequalities together, we obtain, remembering (4.7) and $\delta < ct$,
$$M - m \ll_{D,\alpha} c t \delta^{D\alpha} + \delta^{D\alpha + 1} \ll_{D,\alpha} cM.$$
This implies $M/m \le 2$ provided that $c$ is chosen small enough according to $D$ and $\alpha$. Remembering $i \in I$, we have
$$t \ll_d \frac{\mu_\delta(A^{(\delta)} \cap B_i)}{\mu_\delta(B_i)} \le \frac{M}{m} \, \frac{|A^{(\delta)} \cap B_i|}{|B_i|}.$$
Hence
$$N(A \cap B_i, \delta) \gg_d \delta^{-D} |A^{(\delta)} \cap B_i| \gg_d t \delta^{-D} |B_i| \gg_d t \delta^{-D} \rho^D = c^D t^{D+1} \delta^{D\beta - D}.$$
Let $x \in \mathbb{R}^D$ be the center of $B_i$. Then $A \cap B_i \subset A \cap B(x, \delta^\beta)$ and hence
$$N(A \cap B(x, \delta^\beta), \delta) \gg_{D,\alpha} t^{D+1} \delta^{D\beta - D}. \qquad \square$$

Concentration near rational points

In this section, we finish the proof of Theorem 1.2 from the introduction. The argument follows closely the one given in Section 7 of [11], but some modifications are required since we cannot make use of the proximality assumption.

Throughout this section, unless stated otherwise, $\mu$ denotes a probability measure on $\mathrm{SL}_d(\mathbb{Z})$, $d \ge 2$. The Lyapunov exponents of $\mu$ are denoted by $\lambda_1 \ge \dots \ge \lambda_d$. The subsemigroup generated by $\mu$ is denoted by $\Gamma$, its Zariski closure by $\mathbf{G}$, and we write $G = \mathbf{G}(\mathbb{R})$ for the set of real points. We assume that:
(a) The measure $\mu$ has a finite exponential moment;
(b) The action of $\Gamma$ on $\mathbb{R}^d$ is irreducible;
(c) The algebraic group $\mathbf{G}$ is Zariski connected.
We also let $E$ be the subalgebra of $M_d(\mathbb{R})$ generated by $G$. For $n \in \mathbb{N}$, we write $\mu_n = \mu^{*n}$ for the law of the random walk in $G$ at time $n$. Finally, $\nu$ denotes a Borel probability measure on $\mathbb{T}^d$, understood as the starting distribution of a random walk on $\mathbb{T}^d$. We write $\nu_n = \mu_n * \nu$ for the law of the random walk at time $n$.

We shall divide the proof of Theorem 1.2 into three parts. First, one observes that given a Borel probability measure $\nu$ on $\mathbb{T}^d$, the sequence of measures $\nu_n$ satisfies a diophantine property: if it gives much weight to a ball of small radius, then the ball must contain a rational point with small denominator. Second, starting from the separated set $X$ around which, by Proposition 4.1, $\nu_n$ is concentrated, one goes backwards along the random walk in order to increase the concentration of the measure around the set $X$, until one can apply the diophantine property to conclude that $\nu_{n-m}$ is concentrated near some rational points with bounded denominator. The last part, concluding the proof, again goes backwards along the random walk, to show that if $\nu_n$ concentrates near the set of rational points of bounded height, then $\nu$ is even more concentrated near that set.

5.1. An almost diophantine property.

The key to obtain the concentrationnear rational points is the following almost diophantine property of the sequence ofmeasures ν n = µ n ∗ ν , n ∈ N . Given Q ≥ and ρ > , we denote by W Q the set ofrational points in T d with denominator at most Q , and by W ( ρ ) Q its ρ -neighborhood. Proposition 5.1 (Almost diophantine property) . Let µ be a probability measure on SL d ( Z ) , d ≥ , with some ﬁnite exponential moment. Assume that µ acts stronglyirreducibly on R d .There exist constants C ≥ and η > depending only on µ , such that forevery Borel probability measure ν on T d , for every x ∈ T d , every ρ > , and every n ≥ C | log ρ | , ν n ( B ( x, ρ )) ≥ ρ η = ⇒ x ∈ W ( ρ / ) ρ − / . Proof.

By Lemma 3.14, the manifold G × G is not included in any proper aﬃnesubspace of E × E and therefore the set { g + · · · + g d − h − · · · − h d | g i , h i ∈ G } INEAR RANDOM WALKS ON THE TORUS 35 contains a non-empty open set in E . This implies in particular that the map ( g i , h i ) det( g + · · · + g d − h − · · · − h d ) is not identically zero on G d × G d . ByProposition 3.8, we infer that there exists c > such that for every m large enough, µ ⊗ dm ( { ( g i , h i ) ≤ i ≤ d | det( X g i − X h i ) = 0 } ) ≤ e − cm . Set η = c dλ , ρ ≍ e dλ m , and for B = B ( x, ρ ) , write e − cm ≍ ρ η ≤ ν n ( B ) d = (cid:16)Z T d X g ∈ Γ µ m ( g ) B ( gx ) d ν n − m ( x ) (cid:17) d ≤ Z T d (cid:0)X g ∈ Γ µ m ( g ) g − B ( x ) (cid:1) d d ν n − m ( x ) (by Jensen’s inequality) = X g i ,h i µ m ( g ) · · · µ m ( g d ) µ m ( h ) · · · µ m ( h d ) ν n − m ( g − B ∩ · · · ∩ h − d B ) . This shows that the µ ⊗ dm -measure of d -tuples of elements g , . . . , h d such that ν n − m ( g − B ∩ · · · ∩ h − d B ) ≫ e − cm is at least e − cm . In particular, using the large deviation estimate Theorem 3.10(i)and the observation above on the determinant, we may ﬁnd elements g , . . . , h d inthe support of µ m satisfying  max( k g i k , k h i k ) ≤ e λ m , det( g + · · · + g d − h − · · · − h d ) = 0 ,g − B ∩ · · · ∩ g − d B ∩ h − B ∩ · · · ∩ h − d B = ∅ . If y ∈ R d represents a point in that intersection, then, writing M = g + · · · + g d − h − · · · − h d , there exists v ∈ Z d such that M y = v + O ( ρ ) , whence(5.1) y = M − v + O ( k M k − ρ ) . Now, the matrix M has integer entries, and its determinant is bounded above by e dλ m , so that the entries of M − are rational numbers with denominator boundedabove by e dλ m ≤ ρ − . Moreover, k M − k ≤ k M k d − det( M ) − ≤ e d − λ m . Equality (5.1) above shows that x = g y mod Z d is at distance at most ρ froma rational point with denominator at most ρ − . This ﬁnishes the proof. (cid:3) Bootstrapping concentration.
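The arithmetic fact closing the proof of Proposition 5.1, namely that the inverse of an invertible integer matrix has rational entries whose denominators divide its determinant, can be checked on a small example. The sketch below is only an illustration under stated assumptions: the four matrices of SL_2(Z) are hypothetical, the signed sum has just two terms of each sign, and the helper names are not from the paper. Exact rational arithmetic is used throughout.

```python
from fractions import Fraction

def signed_sum(plus, minus):
    """Entrywise sum of the matrices in `plus` minus those in `minus`."""
    d = len(plus[0])
    return [[sum(g[i][j] for g in plus) - sum(h[i][j] for h in minus)
             for j in range(d)] for i in range(d)]

def inverse_and_det(m):
    """Gauss-Jordan elimination over the rationals: returns (inverse, det)."""
    d = len(m)
    # Augment m with the identity matrix.
    a = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(d)]
         for i, row in enumerate(m)]
    det = Fraction(1)
    for col in range(d):
        pivot = next((r for r in range(col, d) if a[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        piv = a[col][col]
        det *= piv
        a[col] = [v / piv for v in a[col]]
        for r in range(d):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [row[d:] for row in a], det

# Hypothetical elements of SL_2(Z), each of determinant one.
g1, g2 = [[2, 1], [1, 1]], [[1, 2], [1, 3]]
h1, h2 = [[1, 1], [0, 1]], [[0, -1], [1, 0]]

M = signed_sum([g1, g2], [h1, h2])
M_inv, det_M = inverse_and_det(M)

# det(M) * M^{-1} is the adjugate of M, hence has integer entries.
scaled = [[det_M * v for v in row] for row in M_inv]

# Solving M y = v for an integer vector v gives a rational point y
# whose denominators divide det(M).
v = (1, 2)
y = [sum(M_inv[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
```

In the proof the same observation is applied to a matrix whose determinant is controlled by the norms of the summands, which is what bounds the denominator of the rational point approximating y.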

We now wish to combine the diophantineproperty of ν n = µ n ∗ ν with the concentration statement given by Proposition 4.1to obtain some concentration near rational points. To help the reader follow ourprogress towards Proposition 5.7, we formulate another intermediate step, which isthe goal of this paragraph.Given a subset X ⊂ T d and a small parameter ρ > , we shall write X ( ρ ) for the ρ -neighborhood of X . Thus, for instance, we write W ( ρ ) Q for the set of points in T d that lie at distance at most ρ from a rational point with denominator at most Q . Proposition 5.2 (Second step: concentration around rational points) . Under theassumptions recalled at the beginning of this section, there exists a constant C de-pending only on µ such that the following holds.Let t ∈ (0 , / . Assume that for some a ∈ Z d \ { } , | c ν n ( a ) | ≥ t and n ≥ C log k a k t . Then, for every integer m such that m ≥ C log k a k t and n − m ≥ Cm , ν n − m (cid:0) W ( Q − ) Q (cid:1) ≥ t C , for some Q ∈ [ e mC , e Cm ] . The concentration statement given by Proposition 4.1 is not strong enough for adirect application of the diophantine property. We ﬁrst need Lemma 5.3 below tobootstrap concentration. It is exactly the same statement as [11, Proposition 7.2],and the proof is also the same, with some minor modiﬁcations to avoid the use ofthe proximality assumption; we include it nonetheless, for readability.

Lemma 5.3.

Given ε > , there exist c > and m ∈ N so that for m ≥ m , thefollowing holds for every Borel probability measure ν on T d . Given scales r, ρ > such that e dλ m ρ < r , there are scales r = e − ( λ + ε ) m r and ρ = e − ( λ − ε ) m ρ ,so that for every r -separated set X ⊂ T d , one can construct an r -separated set X ⊂ T d with ν ( X ( ρ )1 ) ≥ ν m ( X ( ρ ) ) d − e − cm . Proof.

First, by Jensen’s inequality used in the same way as in the proof of Propo-sition 5.1, ν m ( X ( ρ ) ) d ≤ X g ,...,g d ∈ Γ µ m ( g ) . . . µ m ( g d ) ν ( g − X ( ρ ) ∩ · · · ∩ g − d X ( ρ ) ) . This implies that the set of d -tuples ( g i ) ≤ i ≤ d such that(5.2) ν ( g − X ( ρ ) ∩ · · · ∩ g − d X ( ρ ) ) ≥ ν m ( X ( ρ ) ) d − e − cm has µ ⊗ dm -measure at least e − cm . By Theorem 3.10(ii) and Lemma 5.4 below, if c is chosen small enough, there must exist ( g , . . . , g d ) satisfying this inequality, andmoreover(5.3) ∀ i = 1 , . . . , d, k g i k ≤ e ( λ + ε ) m and k g − i k ≤ e ( − λ d + ε ) m and(5.4) ∀ v ∈ R d \ { } , max i k g i v kk v k ≥ e ( λ − ε ) m . We ﬁx such elements g , . . . , g d for the rest of the proof.Without loss of generality, we may assume that ε > is so small that λ − λ d + 3 ε < dλ . We claim then that the set g − X ( ρ ) ∩ · · · ∩ g − d X ( ρ ) is included in a union of atmost | X | balls of radius ρ = e − ( λ − ε ) m ρ : Indeed, from (5.3) one ﬁnds – drawinga picture of X ( ρ ) and g − i X ( ρ ) – that given x ∈ X and i ≥ , the set g − B ( x, ρ ) meets at most one component g − i B ( y, ρ ) , y ∈ X . Therefore, there are at most | X | non-empty intersections g − B ( x , ρ ) ∩ . . . g − d B ( x d , ρ ) , for x , . . . , x d ∈ X .If x, y lie inside such an intersection, then, for each i , k g i ( x − y ) k ≤ ρ , and (5.4)implies that k x − y k ≤ e − ( λ − ε ) m ρ = ρ . Thus, each intersection g − B ( x , ρ ) ∩ · · · ∩ g − d B ( x d , ρ ) is included in a ball of radius ρ .Finally, using (5.3) again, we see that these intersections are separeated by atleast r = e − ( λ + ε ) m r , and the proposition follows. (cid:3) INEAR RANDOM WALKS ON THE TORUS 37

We now state and prove the large deviation estimate used in the above argument.

Lemma 5.4.

Let µ be a Borel probability measure on SL d ( R ) with some ﬁniteexponential moment, and assume that the semigroup Γ generated by µ acts stronglyirreducibly on R d . Then, for every ε > , there exists c > such that for everylarge enough m ∈ N , µ ⊗ dm n ( g , . . . , g d ) | ∀ v ∈ R d \ { } , max i k g i v kk v k ≥ e ( λ − ε ) m o ≥ − e − cm . Remark 4.

If one assumes that µ is supported on SL d ( Z ) , and replaces d by d ,then this lemma follows directly from Proposition 3.2. This particular case wouldbe suﬃcient for our purposes. Proof.

In this proof, c denotes a small positive constant, depending on ε , and whosevalue may vary from one line to the other. Let r denote the proximality dimensionof Γ . We shall use the notation introduced in the proof of Proposition 3.11. Recallthat for g ∈ Γ , we consider its Cartan decomposition g = k diag( σ ( g ) , . . . , σ d ( g )) ℓ ,where k and ℓ are orthogonal matrices and σ ( g ) ≥ · · · ≥ σ d ( g ) are the singularvalues of g . We deﬁned W − g = ℓ − Span( e r +1 , . . . , e d ) where ( e , . . . , e d ) is the standard basis of R d , so that for every non-zero v ∈ R d ,(5.5) k gv kk v k ≥ d ∡ ( R v, W − g ) k g k , By the large deviation estimate Theorem 3.10(i) if g , . . . , g d are independent ran-dom variables with law µ m , then with probability at least − e − cm , ∀ i ∈ { , . . . , d } , k g i k ≥ e ( λ − ε ) m . For a subspace W ≤ R d , we let Nbd(

W, ρ ) denote the ρ -neighborhood of W in R d . It follows from the above that the lemma will be proved – with dε instead of ε – if we can show that with probability at least − e − cm , the intersection d \ i =1 Nbd( W − g i , e − dεm ) reduces to a ball of radius .For that, we construct inductively for k = 1 , . . . , d − r + 1 a linear subspace W k of dimension d − r + 1 − k , depending on g , . . . , g k , such that k \ i =1 Nbd( W − g i , e − dεm ) ⊂ Nbd( W k , e − d +1 − k ) εm ) . At each step W k +1 is constructed in terms of W k and g k +1 and the constructionis possible with probability − e − cm . For k = 1 , one may simply take W = W − g . Then, suppose W k has been constructed, and let R w ⊂ W k be any line. ByTheorem 3.10(iii), with probability at least − e − cm k g k +1 w kk w k ≥ e ( λ − ε ) m . By Theorem 3.10(ii), with probability − e − cm ,(5.6) ∀ j ∈ { , . . . , d } , | m log σ j ( g k +1 ) − λ j | ≤ ε, and by a straightforward generalization of [11, Lemma 4.1(2)], k gw kk w k ≤ k g k d ( R w, W − g k +1 ) + σ r +1 ( g k +1 ) . Since by a theorem of Guivarc’h and Raugi [26], λ r > λ r +1 , we deduce from theabove that, provided ε > is small enough, d ( R w, W − g k +1 ) ≥ e − εm . This implies that there exists a proper subspace W k +1 < W k such that Nbd( W k , e − d − k +1) εm ) ∩ Nbd( W − g k +1 , e − dεm ) ⊂ Nbd( W k +1 , e − d − k ) εm ) . This proves what we want. (cid:3)

To prove Proposition 5.2, we proceed as follows. Applying ﬁrst Proposition 4.1,we shall obtain m ∈ N and scales ρ and r , together with an r -separated set X ⊂ T d such that ν n − m ( X ( ρ )0 ) ≥ t C . The idea is then to reduce the radius of the balls, by an iterated application ofLemma 5.3. We shall thus obtain an increasing sequence of integers m k and adecreasing sequence of scales ρ k and r k , k = 1 , , . . . together with an r k -separatedset X k ⊂ T d such that | X k | ≤ | X | and ν n − m k ( X ( ρ k ) k ) ≥ t C k . Once we arrive at a scale ρ k such that | X k | ≤ | X | ≤ ρ − η k , we shall be able to use the diophantine property of the random walk to conclude.Now let us turn to the detailed proof. Proof of Proposition 5.2.

Let C and σ > τ > be the constants given by Propo-sition 4.1. and C ′ and η > the ones given by Proposition 5.1. Then write m = m + km + , where m ≥ C | log t | , τ m dλ ≤ m + ≤ τ m dλ and k = (cid:6) d σητ (cid:7) = O µ (1) . This is feasible provided m ≥ C | log t | , where C ≥ depends on µ via the constants C , τ , etc. Note that within constants depending only on µ , m ≍ m ≍ m + . By Proposition 4.1 applied to ν n = µ m ∗ ν n − m , there exist scales ρ = e − σm k a k − and r = e τm ρ together with an r -separated subset X ⊂ T d such that ν n − m ( X ( ρ )0 ) ≥ t C . Note that, since X is r -separated, | X | ≪ d r − d ≤ e d ( σ − τ ) m k a k d . Thus if C was chosen large enough, we have | X | ≤ e dσm . Choose ε > such that kε < dλ so that kεm + < τ m INEAR RANDOM WALKS ON THE TORUS 39 and apply Lemma 5.3 to ν n − m = µ m + ∗ ν n − m − m + . This is allowed since by our choice of parameters e dλ m + ρ ≤ e τm ρ < r . This yields scales ρ = e − ( λ − ε ) m + ρ and r = e − ( λ + ε ) m + r together with an r -separated subset X such that | X | ≤ | X | and ν n − m − m + ( X ( ρ )1 ) ≥ t C , provided m is large enough to ensure that e − cm + < t C . We may repeat this proce-dure at least k times, and therefore obtain a sequence of scales deﬁned inductivelyby ρ i +1 = e − ( λ − ε ) m + ρ i and r i +1 = e − ( λ + ε ) m + r i . Indeed, our choice of ε ensures that for every i ≤ k , e dλ m + ρ i ≤ e dλ m + +2 iεm + − τm r i < r i . In the end, we obtain scales ρ k and r k , and a set X k with | X k | ≤ | X | ≤ e dσm such that(5.7) ν n − m ( X ( ρ k ) k ) ≥ t C k . Moreover, ρ k = e − k ( λ − ε ) m + ρ ≤ e − kλ m +2 ≤ e − dσm η so that | X k | ≤ ρ − η k . Therefore, adjusting slightly the values of the constants, we may restrict X k to thepoints satisfying ν n − m ( B ( x, ρ k )) ≥ ρ ηk , while preserving (5.7).Note that we also have ρ − k ≤ e ( σ + λ ) m k a k . 
Thus, if C was chosen large enough, then n − m ≥ Cm ≥ C ′ | log ρ k | , and we may conclude by Proposition 5.1 that X ( ρ k ) k ⊂ W ( ρ / k ) ρ − / k . This proves the proposition with Q = ρ − / k in the desired range. □

End of the proof of Proposition 5.7: near rational points.

The end of the proof of Proposition 5.7 is based on an argument similar in spirit to the one used in Lemma 5.3, to bootstrap concentration. The proposition we shall need is again taken from [11], where it appears as [11, Proposition 7.4]. The proof we present follows closely the one given in [11], but the key Lemma 5.6 below, analogous to [11, Lemma 7.10], is proved using a new argument, which avoids using a regularity property of the µ-stationary measure on the projective space, only available with a proximality assumption.

Proposition 5.5.

Let µ be a probability measure on SL d ( Z ) with some finite exponential moment and acting strongly irreducibly on R d . Given ε > , there exist m ∗ and ω > such that if ρ > , Q ≥ and m ≥ m ∗ satisfy e dλ m ρ < Q − , and ν is any Borel probability measure on T d , then ( µ m ∗ ν )( W ( e − ( λ − ε ) m ρ ) Q ) ≥ ν ( W ( ρ ) Q ) − e − ωm .

The proof of this proposition is based on the following lemma.
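The smallness condition on ρ relative to Q is used, in the proof of Proposition 5.5, through the separation of distinct rational points with bounded denominator. The following is a minimal sketch of that elementary fact, assuming that a point of $W_Q$ is written with one common denominator for all its coordinates and that distances on $\mathbb{T}^d$ are measured in the sup norm; the symbols $p, p', q, q', k$ are introduced here only for illustration.

```latex
% Separation of distinct rational points with denominator at most Q (sketch).
% Assumptions: x = p/q and x' = p'/q' with p, p' in Z^d and 1 <= q, q' <= Q;
% the distance on the torus is the one induced by the sup norm on R^d.
Let $x = p/q$ and $x' = p'/q'$ be distinct points of $W_Q$.
For every $k \in \mathbb{Z}^d$,
\[
  x - x' - k \;=\; \frac{q'p - qp' - qq'k}{qq'},
\]
and the numerator is a nonzero vector of $\mathbb{Z}^d$,
because $x \neq x'$ in $\mathbb{T}^d$. Hence
\[
  d_{\mathbb{T}^d}(x, x')
  \;=\; \min_{k \in \mathbb{Z}^d} \| x - x' - k \|_\infty
  \;\ge\; \frac{1}{qq'}
  \;\ge\; Q^{-2}.
\]
```

In particular, a ball of radius smaller than $Q^{-2}/2$ contains at most one point of $W_Q$, which is the kind of uniqueness the proof relies on.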

Lemma 5.6.

Let $\mu$ be a Borel probability measure on $\mathrm{SL}_d(\mathbb{R})$ with some finite exponential moment and whose support generates a subsemigroup acting strongly irreducibly on $\mathbb{R}^d$. Given $\varepsilon > 0$, there exists $\theta > 0$ such that the following holds for every integer $m$ sufficiently large.

Let $A$ be a subset of $\mathrm{SL}_d(\mathbb{R})$ such that $\mu_m(A) \ge e^{-\theta m}$. There exists a subset $\mathcal{G} = \{g_i\}_{1 \le i \le k}$ of cardinality $k \ge e^{\theta m}$ in $A$ such that for every subset $\{g_{i_1}, \dots, g_{i_d}\}$ of $d$ elements of $\mathcal{G}$, for every $v$ in $\mathbb{R}^d$,
$$\max_{1 \le j \le d} \|g_{i_j} v\| \ge e^{(\lambda_1 - \varepsilon)m} \|v\|.$$

Proof.

Having fixed $\varepsilon > 0$, let
$$T = \Bigl\{ (h_1, \dots, h_d) \;\Big|\; \exists\, v \in \mathbb{R}^d \setminus \{0\},\ \max_i \frac{\|h_i v\|}{\|v\|} < e^{(\lambda_1 - \varepsilon)m} \Bigr\}.$$
By Lemma 5.4, there exists $c > 0$ such that for every large enough $m$, $\mu_m^{\otimes d}(T) \le e^{-cm}$. We shall prove that the lemma holds with $\theta = \frac{c}{d\,2^{d+1}}$. Let
$$A_1 = \bigl\{ g \in A \;\big|\; \mu_m^{\otimes (d-1)}(\{(h_2, \dots, h_d) \mid (g, h_2, \dots, h_d) \in T\}) \ge e^{-cm/2} \bigr\}.$$
Then $e^{-cm} \ge \mu_m^{\otimes d}(T) \ge e^{-cm/2} \mu_m(A_1)$, and therefore $\mu_m(A_1) \le e^{-cm/2}$. To construct $\mathcal{G}$, we first choose $g_1 \in A \setminus A_1$; this is possible because $\theta < c/2$. Let
$$A_2(g_1) = \bigl\{ g \in A \;\big|\; \mu_m^{\otimes (d-2)}(\{(h_3, \dots, h_d) \mid (g_1, g, h_3, \dots, h_d) \in T\}) \ge e^{-cm/4} \bigr\}.$$
Since $g_1 \notin A_1$, we have
$$e^{-cm/2} \ge \mu_m^{\otimes (d-1)}(\{(h_2, \dots, h_d) \mid (g_1, h_2, \dots, h_d) \in T\}) \ge e^{-cm/4} \mu_m(A_2(g_1)),$$
whence $\mu_m(A_2(g_1)) \le e^{-cm/4}$. We may therefore pick an element $g_2 \in A$ such that $g_2 \notin A_1 \cup A_2(g_1)$. Then set
$$A_3(g_1, g_2) = \bigl\{ g \;\big|\; \mu_m^{\otimes (d-3)}(\{(h_4, \dots, h_d) \mid (g_1, g_2, g, h_4, \dots, h_d) \in T\}) \ge e^{-cm/8} \bigr\},$$
for which it is readily checked, using the fact $g_2 \notin A_2(g_1)$, that $\mu_m(A_3(g_1, g_2)) \le e^{-cm/8}$. This allows us to pick $g_3 \in A$ such that $g_3 \notin A_1 \cup A_2(g_1) \cup A_2(g_2) \cup A_3(g_1, g_2)$.

Following this procedure, the elements $g_1, g_2, g_3, \dots$ of $\mathcal{G}$ are constructed inductively. Once $g_1, \dots, g_k$ have been chosen, one picks $g_{k+1} \in A$ outside the union of all subsets
$$A_r(g_{i_1}, \dots, g_{i_{r-1}}) = \bigl\{ g \;\big|\; \mu_m^{\otimes (d-r)}(\{(h_{r+1}, \dots, h_d) \mid (g_{i_1}, \dots, g_{i_{r-1}}, g, h_{r+1}, \dots, h_d) \in T\}) \ge e^{-cm/2^r} \bigr\},$$
where $(g_{i_1}, \dots, g_{i_{r-1}})$ can be any subset of $(g_1, \dots, g_k)$ with at most $d$ elements. By convention, for $r = d$, write
$$A_d(g_{i_1}, \dots, g_{i_{d-1}}) = \{ g \mid (g_{i_1}, \dots, g_{i_{d-1}}, g) \in T \}.$$
Just as above, one checks by induction, using $g_{i_{r-1}} \notin A_{r-1}(g_{i_1}, \dots, g_{i_{r-2}})$, that
$$\mu_m(A_r(g_{i_1}, \dots, g_{i_{r-1}})) \le e^{-cm/2^r} \le e^{-cm/2^d}.$$
Thus, at step $k$, the union of all subsets $A_r(g_{i_1}, \dots, g_{i_{r-1}})$ to be avoided has measure at most
$$\Bigl[ 1 + \binom{k}{1} + \dots + \binom{k}{d} \Bigr] e^{-cm/2^d} \ll k^d e^{-cm/2^d}.$$
So the procedure can go on as long as $k^d e^{-cm/2^d} < \mu_m(A)$. Since $\mu_m(A) \ge e^{-\theta m} \ge e^{-cm/2^{d+1}}$, one can at least reach some $k \ge e^{\frac{cm}{d\,2^{d+1}}}$, which proves the lemma. □

The rest of the proof of Proposition 5.5 is exactly as in [11, §7.D.]; we include it for completeness.
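Returning briefly to the last step of the proof of Lemma 5.6, the counting can be spelled out as follows. This is only a sketch: the name $C_d$ for the implied constant in the bound on the excluded set is introduced here and does not appear in the text, and $m$ is assumed large.

```latex
% Final counting step of Lemma 5.6 (sketch).
% Assumption: C_d >= 1 denotes the implied constant, depending only on d,
% in the bound on the measure of the set to be avoided at step k.
\begin{align*}
  \text{measure of the excluded set at step } k
    &\le C_d\, k^{d}\, e^{-cm/2^{d}}, \\
  \mu_m(A) \;\ge\; e^{-\theta m}
    &\ge e^{-cm/2^{d+1}}
    \qquad \text{since } \theta = \frac{c}{d\,2^{d+1}} \le \frac{c}{2^{d+1}}.
\end{align*}
% A new element can be chosen as long as the excluded set does not cover A,
% which is guaranteed whenever
\[
  C_d\, k^{d}\, e^{-cm/2^{d}} < e^{-cm/2^{d+1}}
  \quad\Longleftrightarrow\quad
  k < C_d^{-1/d}\, e^{\frac{cm}{d\,2^{d+1}}} .
\]
```

So the construction produces a set whose cardinality is comparable to $e^{\frac{cm}{d\,2^{d+1}}}$; the factor $C_d^{-1/d}$ can be absorbed, for $m$ large, by taking $\theta$ slightly smaller than $\frac{c}{d\,2^{d+1}}$.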

Proof of Proposition 5.5.

Once more, write ν m ( W ( ρ ) Q ) = ∑ g µ m ( g ) ν ( g − W ( ρ ) Q ) , to observe that A = { g | ν ( g − W ( ρ ) Q ) ≥ ( µ m ∗ ν )( B ) − e − θm } satisfies µ m ( A ) ≥ e − θm . Using the large deviation estimate for ‖ g − ‖ , we may reduce A without any significant loss of µ m -measure so that for every g in A , ‖ g − ‖ ≤ e ( λ d + ε ) m ≤ e dλ m . By Lemma 5.6, there exists a subset G ⊂ A of cardinality at least e θm such that for any distinct elements g , . . . , g d in G , for every v ∈ R d , max ≤ i ≤ d ‖ g i v ‖ ≥ e ( λ − ε ) m ‖ v ‖ . For such elements g , . . . , g d , g − W ( ρ ) Q ∩ . . . g − d W ( ρ ) Q = ⋃ x ,...,x d ∈ W Q g − B ( x , ρ ) ∩ · · · ∩ g − d B ( x d , ρ ) . Now, if u ∈ R d represents an element of g − B ( x , ρ ) ∩ · · · ∩ g − d B ( x d , ρ ) , then, for some vectors v i ∈ Z d , (5.8) g i u = x i + v i + O ( ρ ) , i = 1 , . . . , d i.e. u = g − i ( x i + v i ) + O ( e dλ m ρ ) . But the points g − i ( x i + v i ) are rational with denominator at most Q , so that they are at least Q − away from one another. Since e dλ m ρ < Q − , this shows that there exists u ∈ R d , rational with denominator at most Q such that for each i , g i u = x i + v i . Coming back to (5.8) above, we find ‖ g i ( u − u ) ‖ ≤ ρ, i = 1 , . . . , d and by definition of the subset G , ‖ u − u ‖ ≤ e − m ( λ − ε ) ρ. This shows that g − W ( ρ ) Q ∩ . . . g − d W ( ρ ) Q ⊂ W ( e − ( λ − ε ) m ρ ) Q . In other words, the family of subsets { g − W ( ρ ) Q \ W ( e − m ( λ − ε ) ρ ) Q ; g ∈ G } has intersection multiplicity less than d . Therefore, ∑ g ∈ G ν m ( g − W ( ρ ) Q \ W ( e − m ( λ − ε ) ρ ) Q ) ≤ d, and as | G | ≥ e θm , there must exist g in G such that ν m ( g − W ( ρ ) Q \ W ( e − m ( λ − ε ) ρ ) Q ) ≤ de − θm . Then, ν m ( W ( e − m ( λ − ε ) ρ ) Q ) ≥ ν m ( g − W ( ρ ) Q ) − de − θm ≥ ν ( W ( ρ ) Q ) − ( d + 1) e − θm ≥ ν ( W ( ρ ) Q ) − e − ωm . □

We are finally ready to prove the main theorem of this article, Theorem 1.2, announced in the introduction. We shall in fact prove a slightly more general statement, given as Proposition 5.7 below. Recall that for parameters Q ≥ and ρ > , we write W Q for the set of rational points on T d with denominator at most Q , and W ( ρ ) Q for its ρ -neighborhood.

Proposition 5.7.

Let $d \ge 2$. Let $\mu$ be a probability measure on $\mathrm{SL}_d(\mathbb{Z})$. Denote by $\Gamma$ the subsemigroup generated by $\mu$, and by $G < \mathrm{SL}_d$ the Zariski closure of $\Gamma$. Assume:
(a) The measure $\mu$ has a finite exponential moment;
(b) The only subspaces of $\mathbb{R}^d$ preserved by $\Gamma$ are $\{0\}$ and $\mathbb{R}^d$;
(c) The algebraic group $G$ is Zariski connected.
Let $\lambda_1$ denote the top Lyapunov exponent associated to $\mu$. Given $\lambda \in (0, \lambda_1)$, there exists a constant $C = C(\mu, \lambda) > 0$ such that for every Borel probability measure $\nu$ on $\mathbb{T}^d$ and every $t \in (0, 1/2)$, if for some $a \in \mathbb{Z}^d \setminus \{0\}$,
$$|\widehat{\mu_n * \nu}(a)| \ge t \quad\text{and}\quad n \ge C \log \frac{\|a\|}{t},$$
then
$$\nu\bigl(W_Q^{(e^{-\lambda n})}\bigr) \ge t^C \quad\text{for some } Q \le \Bigl(\frac{\|a\|}{t}\Bigr)^C.$$

Proof.

Recall the shorthand ν n = µ n ∗ ν , n ∈ N . By Proposition 5.2, there is a constant C > depending only on µ such that for m = C log ‖ a ‖ t with C ≥ C , there exists Q ∈ [ e m /C , e C m ] such that ν n − m ( W ( Q − ) Q ) ≥ t C . Set ρ = Q − , choose m maximal so that e dλ m ρ < Q − . Then e dλ m ≍ µ Q − ρ − = Q and hence (5.9) m ≥ dλ log Q − O µ (1) ≥ C dλ C log ‖ a ‖ t − O µ (1) .

Thus by picking C sufficiently large, we can make m ≥ m ∗ where m ∗ is the constant given by Proposition 5.5 applied to ε := ( λ − λ ) / . It is easy to see that if C = C ( µ, λ ) is chosen large enough, every integer n ≥ C log ‖ a ‖ t can be written as n = m + m + · · · + m k for some k ≥ and some integers m , . . . , m k satisfying (5.10) ∀ j = 1 , . . . , k − , m j < m j +1 < ( λ − εdλ ) m j . Define recursively for j = 1 , . . . , k , ρ j = e − ( λ − ε ) m j ρ j − . Then (5.10) implies, by a simple induction, that ∀ j = 1 , . . . , k, e dλ m j ρ j − < Q − . Therefore we can apply repeatedly Proposition 5.5 to get ν ( W ( ρ k ) Q ) ≥ t C − k ∑ j =1 e − ωm j where ω > is a constant depending only on µ and λ . Observe that, first, ρ k = e − ( λ + ε )( n − m ) ρ ≤ e − λn provided that C ≥ λ + εε C . Secondly, by (5.10), k ∑ j =1 e − ωm j ≤ e − ωm ∑ i ≥ e − ωi ≪ ω e − ωm is smaller than t C / provided that C /C is chosen large enough (recall (5.9)). This finishes the proof. □

Conclusion

To conclude this paper, we mention one application of our main theorem, and then give some possible further directions of research, some of which we hope to address in publications to come.

6.1. Expansion in simple groups modulo arbitrary integers.

In Section 3, we made use of the result of Salehi Golsefidy and Varjú [43] about expansion in semisimple groups modulo prime – or square-free – numbers. In a reverse direction, it was observed by Bourgain and Varjú [16] that the quantitative equidistribution of linear random walks on the torus of Bourgain, Furman, Lindenstrauss and Mozes [11] could be used to derive some expansion results in $\mathrm{SL}_d(\mathbb{Z}/q\mathbb{Z})$, where $q$ runs over all natural numbers. Because of the proximality assumption required by [11], their argument could only apply to $\mathbb{R}$-split simple $\mathbb{Q}$-groups, such as $\mathrm{SL}_d$. With Theorem 1.2 at hand, we can now generalize their result to any simple $\mathbb{Q}$-group.

Theorem 6.1 (Expansion in simple groups modulo arbitrary integers). Let $S$ be a finite subset of $\mathrm{GL}_d(\mathbb{Z})$, and $\Gamma$ the subgroup generated by $S$. If the Zariski closure of $\Gamma$ is a simple algebraic group, then the family of Cayley graphs $\mathcal{G}(\pi_q(\Gamma), \pi_q(S))_{q \in \mathbb{N}}$ is a family of expanders.

Since all the ideas of the proof are contained in [16], we will not include the proof here but rather put it in an explanatory note [32]. For more background, we refer the readers to the survey [19] and to [14, Appendix], [42], [24] for relevant recent progress.

As observed by Salehi Golsefidy and Varjú [43, Question 2], one should expect the theorem to hold with the weaker assumption that the Zariski closure of $\Gamma$ is perfect. To prove such a result, if one wants to exploit some equidistribution result on the torus similar to Theorem 1.2, one should relax the irreducibility assumption, which leads us to the second point of this conclusion.

6.2. Without irreducibility.

The only obvious obstruction to equidistribution is when the random walk is trapped in a rational coset of a subtorus that is obtained as the image of a $\Gamma$-invariant rational subspace via the projection $\mathbb{R}^d \to \mathbb{T}^d$. Thus, in order to prove equidistribution of a linear random walk on the torus, it may be more natural only to assume the action of $\Gamma$ to be irreducible on $\mathbb{Q}^d$, rather than $\mathbb{R}^d$.

Indeed, for example, assume that the group generated by the random walk is semisimple and acts strongly irreducibly on $\mathbb{Q}^d$. Then Guivarc’h and Starkov [27] and independently Muchnik [38] showed that every proper closed invariant subset is a finite set of rational points. Moreover, under the same assumption, Benoist and Quint [4, Corollary 1.4] showed that the only non-atomic stationary measure on $\mathbb{T}^d$ is the Haar measure. See also Benoist-Quint [5] for a result on equidistribution of trajectories.

Similarly, Theorem 1.2 should remain valid if one only assumes the irreducibility of the action of $\Gamma$ on $\mathbb{Q}^d$, as long as the Zariski closure of $\Gamma$ is semisimple.

The general approach used here should work in this setting, but there is one important difference: the algebra $E$ generated by $\Gamma$ will no longer be simple, but only semisimple. In particular, the rescaled measure $\tilde{\mu}_n$ studied in Section 3 may very well be concentrated on a proper ideal of $E$. One therefore needs to modify several of our arguments to adapt the proof to this more general setting.

Furthermore, the question of equidistribution is still interesting even without any irreducibility assumption. For example, the above-mentioned work of Benoist and Quint gives a classification of orbit closures and stationary measures under the assumption that the Zariski closure of the group is semisimple. In another direction, Bekka and Guivarc’h [2] showed that the measure preserving action of a subgroup $\Gamma < \mathrm{SL}_d(\mathbb{Z})$ on $\mathbb{T}^d$ has a spectral gap if and only if there is no nontrivial $\Gamma$-invariant torus factor on which $\Gamma$ acts as a virtually abelian group.

6.3. The two other assumptions.

First, we believe that Theorem 1.2 is still valid even if one does not require the group $G$ to be Zariski connected. In fact, many arguments in our proof still work without this assumption, but, as is the case without the irreducibility assumption, the rescaled measures $\tilde{\mu}_n$ may concentrate near a proper subspace of $E$: the algebra generated by the connected component of $G$. This leads to several technical difficulties when trying to prove a flattening statement.

Second, it would be an interesting problem to determine what moment conditions are really necessary in order to have the convergence statement of Theorems 1.1 and 1.2. It seems plausible for example that Theorem 1.1 holds with the weaker assumption of a moment of order 1: $\int \log \|g\| \,\mathrm{d}\mu(g) < \infty$. Even a counter-example to Theorem 1.1 without any moment condition would be interesting.

6.4. Spaces of lattices.

Given the results of Benoist and Quint [4] classifying stationary measures on the space of lattices $\mathrm{SL}_d(\mathbb{R})/\mathrm{SL}_d(\mathbb{Z})$, it is very natural to ask whether one can obtain an analog of Theorem 1.2 in this setting. Even the following qualitative equidistribution problem is still open [3, §5.4. Question 3].

Let $\mu$ be a measure on $\mathrm{SL}_d(\mathbb{R})$ generating a Zariski dense subgroup $\Gamma$, and $x$ a point in $\mathrm{SL}_d(\mathbb{R})/\mathrm{SL}_d(\mathbb{Z})$ with infinite $\Gamma$-orbit. Show that the sequence of measures $(\mu_n * \delta_x)_{n \ge 1}$ converges to the Haar measure as $n$ goes to infinity.

Acknowledgements.

We are indebted to Emmanuel Breuillard for several ideas used in §§3.2 and 3.3 and for sharing his unpublished note [18] on non-concentration estimates for random matrix products. It is a pleasure to thank him, as well as Richard Aoun, Yves Benoist, Elon Lindenstrauss, Jean-François Quint and Péter Varjú, for useful and motivating discussions.

References

[1] R. Aoun. Transience of algebraic varieties in linear groups—applications to generic Zariski density. Ann. Inst. Fourier (Grenoble), 63(5):2049–2080, 2013.
[2] B. Bekka and Y. Guivarc’h. On the spectral theory of groups of affine transformations of compact nilmanifolds. Ann. Sci. Éc. Norm. Supér. (4), 48(3):607–645, 2015.
[3] Y. Benoist and J.-F. Quint. Introduction to random walks on homogeneous spaces. Jpn. J. Math., 7(2):135–166, 2012.
[4] Y. Benoist and J.-F. Quint. Stationary measures and invariant subsets of homogeneous spaces (II). J. Amer. Math. Soc., 26(3):659–734, 2013.
[5] Y. Benoist and J.-F. Quint. Stationary measures and invariant subsets of homogeneous spaces (III). Ann. of Math. (2), 178(3):1017–1059, 2013.
[6] Y. Benoist and J.-F. Quint. Random walks on reductive groups, volume 62 of Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics. Springer, Cham, 2016.
[7] A. Borel. Linear algebraic groups, volume 126 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1991.
[8] P. Bougerol and J. Lacroix. Products of random matrices with applications to Schrödinger operators, volume 8 of Progress in Probability and Statistics. Birkhäuser Boston, Inc., Boston, MA, 1985.
[9] J. Bourgain. Multilinear exponential sums in prime fields under optimal entropy condition on the sources. Geom. Funct. Anal., 18(5):1477–1502, 2009.
[10] J. Bourgain. The discretized sum-product and projection theorems. J. Anal. Math., 112:193–236, 2010.
[11] J. Bourgain, A. Furman, E. Lindenstrauss, and S. Mozes. Stationary measures and equidistribution for orbits of nonabelian semigroups on the torus. J. Amer. Math. Soc., 24(1):231–280, 2011.
[12] J. Bourgain and A. Gamburd. Uniform expansion bounds for Cayley graphs of $\mathrm{SL}_2(\mathbb{F}_p)$. Ann. of Math. (2), 167(2):625–642, 2008.
[13] J. Bourgain and A. Gamburd. Expansion and random walks in $\mathrm{SL}_d(\mathbb{Z}/p^n\mathbb{Z})$. II. J. Eur. Math. Soc. (JEMS), 11(5):1057–1103, 2009. With an appendix by Bourgain.
[14] J. Bourgain and A. Kontorovich. On the local-global conjecture for integral Apollonian gaskets. Invent. Math., 196(3):589–650, 2014. With an appendix by Péter P. Varjú.
[15] J. Bourgain and S. V. Konyagin. Estimates for the number of sums and products and for exponential sums over subgroups in fields of prime order. C. R. Math. Acad. Sci. Paris, 337(2):75–80, 2003.
[16] J. Bourgain and P. P. Varjú. Expansion in $\mathrm{SL}_d(\mathbb{Z}/q\mathbb{Z})$, $q$ arbitrary. Invent. Math., 188(1):151–173, 2012.
[17] R. Boutonnet, A. Ioana, and A. Salehi Golsefidy. Local spectral gap in simple Lie groups and applications. Invent. Math.
Groups St Andrews 2013, volume 422 of London Math. Soc. Lecture Note Ser., pages 1–50. Cambridge Univ. Press, Cambridge, 2015.
[20] E. Breuillard, B. Green, and T. Tao. Approximate subgroups of linear groups. Geom. Funct. Anal., 21(4):774–819, 2011.
[21] D. A. Cox, J. Little, and D. O’Shea. Ideals, varieties, and algorithms. Undergraduate Texts in Mathematics. Springer, Cham, fourth edition, 2015. An introduction to computational algebraic geometry and commutative algebra.
[22] N. de Saxcé. A product theorem in simple Lie groups. Geom. Funct. Anal., 25(3):915–941, 2015.
[23] M. D. Fried and M. Jarden. Field arithmetic, volume 11 of Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics]. Springer-Verlag, Berlin, second edition, 2005.
[24] E. Fuchs, K. E. Stange, and X. Zhang. Local-global principles in circle packings. Compos. Math., 155(6):1118–1170, 2019.
[25] H. Furstenberg. Noncommuting random products. Trans. Amer. Math. Soc., 108:377–428, 1963.
[26] Y. Guivarc’h and A. Raugi. Frontière de Furstenberg, propriétés de contraction et théorèmes de convergence. Z. Wahrsch. Verw. Gebiete, 69(2):187–242, 1985.
[27] Y. Guivarc’h and A. N. Starkov. Orbits of linear group actions, random walks on homogeneous spaces and toral automorphisms. Ergodic Theory Dynam. Systems, 24(3):767–802, 2004.
[28] W. He. Discretized sum-product estimates in matrix algebras. ArXiv e-prints 1611.09639, Nov. 2016. To appear in Journal d’Analyse Mathématique.
[29] W. He. Orthogonal projections of discretized sets. ArXiv e-prints, arXiv:1710.00795, Oct. 2017. To appear in Journal of Fractal Geometry.
[30] W. He. Sums, products and projections of discretized sets. PhD thesis, Université Paris-Saclay, Sept. 2017.
[31] W. He. Random walks on linear groups satisfying a Schubert condition. arXiv e-prints, arXiv:1905.05695, May 2019. To appear in Israel Journal of Mathematics.
[32] W. He and N. de Saxcé. Expansion in simple groups modulo arbitrary integers. To be available soon.
[33] H. A. Helfgott. Growth and generation in $\mathrm{SL}_2(\mathbb{Z}/p\mathbb{Z})$. Ann. of Math. (2), 167(2):601–623, 2008.
[34] S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bull. Amer. Math. Soc. (N.S.), 43(4):439–561, 2006.
[35] S. Lang and A. Weil. Number of points of varieties in finite fields. Amer. J. Math., 76:819–827, 1954.
[36] E. Le Page. Théorèmes limites pour les produits de matrices aléatoires. In Probability measures on groups (Oberwolfach, 1981), volume 928 of Lecture Notes in Math., pages 258–303. Springer, Berlin-New York, 1982.
[37] J. Li. Discretized Sum-product and Fourier decay in $\mathbb{R}^n$. arXiv e-prints, arXiv:1811.06852, Nov. 2018. To appear in Journal d’Analyse Mathématique.
[38] R. Muchnik. Semigroup actions on $\mathbb{T}^n$. Geom. Dedicata, 110:1–47, 2005.
[39] M. V. Nori. On subgroups of $\mathrm{GL}_n(\mathbb{F}_p)$. Invent. Math., 88(2):257–275, 1987.
[40] L. Pyber and E. Szabó. Growth in finite simple groups of Lie type. J. Amer. Math. Soc., 29(1):95–146, 2016.
[41] M. S. Raghunathan. Discrete subgroups of Lie groups. Springer-Verlag, New York-Heidelberg, 1972. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 68.
[42] A. Salehi Golsefidy. Super-approximation, II: the $p$-adic case and the case of bounded powers of square-free integers. J. Eur. Math. Soc. (JEMS), 21(7):2163–2232, 2019.
[43] A. Salehi Golsefidy and P. P. Varjú. Expansion in perfect groups. Geom. Funct. Anal., 22(6):1832–1891, 2012.
[44] T. Tao. Product set estimates for non-commutative groups. Combinatorica, 28(5):547–594, 2008.
[45] T. Tao and V. H. Vu. Additive combinatorics, volume 105 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2010.
[46] B. L. Van der Waerden. Modern Algebra. Volume II. Based in part on lectures by E. Artin and E. Noether. Frederick Ungar Publishing Co., New York, transl. from the 3rd German edition, 1950.

Einstein Institute of Mathematics, The Hebrew University of Jerusalem, Jerusalem 91904, Israel.
E-mail address: [email protected]

CNRS – Université Paris 13, LAGA, 93430 Villetaneuse, France.