arXiv [math.DS]

A simultaneous version of Host's equidistribution Theorem
Amir Algom
Abstract
Let µ be a probability measure on R/Z that is ergodic under the ×p map, with positive entropy. In 1995, Host [14] showed that if gcd(m, p) = 1 then µ almost every point is normal in base m. In 2001, Lindenstrauss [19] showed that the conclusion holds under the weaker assumption that p does not divide any power of m. In 2015, Hochman and Shmerkin [13] showed that this holds in the "correct" generality, i.e. if m and p are independent. We prove a simultaneous version of this result: for µ typical x, if m and p are independent, we show that the orbit of (x, x) under (×m, ×p) equidistributes for the product of the Lebesgue measure with µ. We also show that if m > n > 1 and n is independent of p as well, then the orbit of (x, x) under (×m, ×n) equidistributes for the Lebesgue measure.

Let p be an integer greater or equal to 2. Let T_p be the p-fold map of the unit interval, T_p(x) = p · x mod 1. We say that m and p are independent if log p / log m ∉ Q; henceforth, we will write m ⊥ p to indicate that m and p are independent. In 1967 Furstenberg [7] famously proved that if a closed subset of T := R/Z is jointly invariant under T_p and T_m with m ⊥ p, then it is either finite or the entire space T. A well known Conjecture of Furstenberg, about a measure theoretic analogue of this result, is that the only continuous probability measure jointly invariant under T_p and T_m, and ergodic under the Z_+² action generated by these maps, is the Lebesgue measure. The best results towards this Conjecture, due to Rudolph [26] for p, m such that gcd(p, m) = 1 and later to Johnson [17] for p ⊥ m, are that it holds if in addition the measure has positive entropy with respect to the Z_+ action generated by T_p (see also the earlier results of Lyons [21]).

In 1995 Host proved the following pointwise strengthening of Rudolph's Theorem. Recall that a number x ∈ [0,
1] is said to be normal in base p if the sequence {T_p^k x}_{k ∈ Z_+} equidistributes for the Lebesgue measure on [0, 1]; equivalently, the base p expansion of x has the same limiting statistics as an IID sequence of digits with uniform marginals.

Theorem. (Host, [14]) Let p, m ≥ 2 be integers such that gcd(p, m) = 1. Let µ be a T_p invariant ergodic measure with positive entropy. Then µ almost every x is normal in base m.

Host's theorem can be shown to imply Rudolph's Theorem, but is more constructive in the sense that it proves that a large collection of measures satisfy a certain regularity property. Host's Theorem is also closely related to classical results of Cassels [3] and Schmidt [27] from around 1960,
Supported by ERC grant 306494 and ISF grant 1702/17. 2010 Mathematics Subject Classification 11K16, 11A63, 28A80, 28D05.

concerning the normality of typical points of Cantor-Lebesgue type measures in base p (though applying to a less general class of measures, the latter results nonetheless hold for any independent integers p, m ≥ 2). The assumption that gcd(p, m) = 1 in Host's Theorem, however, is stronger than it "should" be. Namely, it is stronger than assuming that p ⊥ m. In 2001, Lindenstrauss [19] showed that the conclusion of Host's Theorem holds under the weaker assumption that p does not divide any power of m. Finally, in 2015, Hochman and Shmerkin [13] proved that Host's Theorem holds in the "correct" generality, i.e. when p ⊥ m.

Now, let µ be a measure as in Host's Theorem, with p ⊥ m. Then, on the one hand, by the results of Hochman and Shmerkin, for µ almost every x, its orbit under T_m equidistributes for the Lebesgue measure. On the other hand, for µ almost every x, its orbit under T_p equidistributes for µ (this is just the ergodic Theorem). The main result of this paper is that this holds simultaneously.

Theorem 1.1.
Let µ be a T_p invariant ergodic measure with dim µ > 0. Let m > n > 1 be integers such that m ⊥ p.

1. If n = p then

(1/N) Σ_{i=0}^{N−1} δ_{(T_m^i(x), T_n^i(x))} → λ × µ, for µ almost every x,

where the convergence is in the weak-* topology, and λ is the Lebesgue measure on [0, 1].

2. If n ⊥ p then

(1/N) Σ_{i=0}^{N−1} δ_{(T_m^i(x), T_n^i(x))} → λ × λ, for µ almost every x.

Several remarks are in order. First, the assumption that µ has positive dimension and the assumption that it has positive entropy are equivalent, so there is no discrepancy between the assumptions in Host's Theorem and those in Theorem 1.1 (see Section 2.1 for a discussion of the dimension theory of measures). Secondly, in the second part of the Theorem we do not need that m and n are independent, only that m > n. In addition, we can prove a version of Theorem 1.1 where the initial point (x, x) is replaced with (f(x), g(x)) for f, g that are non singular affine maps of R that satisfy some extra mild conditions. This is explained in Section 6.

Theorem 1.1 can also be considered as part of the following general framework. Let S, T ∈ End(T²), and let ν be an S invariant probability measure. The idea is to study the orbits {T^k x}_{k ∈ Z_+} for ν typical x. In our situation,

S = diag(p, p), T = diag(m, n),

and the measure ν = ∆µ, where ∆ : T → T² is the map ∆(x) = (x, x).

Problems around this framework were studied by several authors. Notable related examples are the works of Meiri and Peres [24], and the subsequent work of Host [15].
Meiri and Peres prove a Theorem similar to ours, with the following differences:

• They work with two general diagonal endomorphisms S and T, but they require that the corresponding diagonal entries S_{i,i} and T_{i,i} be larger than 1 and co-prime.

• They allow for more general measures than the one dimensional measures that we work with (in Theorem 1.1 we work with measures on the diagonal of T²).

Host in turn has some requirements on S and the measure that are more general than ours, but also requires that det(S) and det(T) be co-prime, and that for every k the characteristic polynomial of T^k is irreducible over Q (clearly this is not the case here). The results of both Host, and Meiri and Peres, extend to any d dimensional torus.

Our proof of Theorem 1.1 is inspired by the work of Hochman and Shmerkin [13]. In particular, the scenery of the measure µ at typical points plays a pivotal role in our work. We devote the next Section to defining this scenery and some related notions, and to formulating the main technical tool used to prove Theorem 1.1.

We first recall some notions that were defined in ([13], Section 1.2). However, we remark that these notions and ideas have a long history, going back variously to Furstenberg ([8], [9]), Zähle [31], Bedford and Fisher [1], Mörters and Preiss [25], and Gavish [10]. See ([13], Section 1.2) and [11] for some further discussions and comparisons.

For a compact metric space X let P(X) denote the space of probability measures on X. Let

M□ = {µ ∈ P([−1, 1]) : 0 ∈ supp(µ)}.

For µ ∈ M□ and t ∈ R we define the scaled measure S_t µ ∈ M□ by

S_t µ(E) = c · µ(e^{−t} E ∩ [−1, 1]), where c is a normalizing constant.

For x ∈ supp(µ) we similarly define the translated measure by

µ^x(E) = c′ · µ((E + x) ∩ [−1, 1]), where c′ is a normalizing constant.

The scaling flow is the Borel R_+ flow S = (S_t)_{t ≥ 0} acting on M□.
The scenery of µ at x ∈ supp(µ) is the orbit of µ^x under S, that is, the one parameter family of measures µ_{x,t} := S_t(µ^x) for t ≥ 0. Thus, the scenery of µ at x is what one sees as one "zooms" into the measure.

Notice that P(M□) ⊆ P(P([−1, 1])). We refer to elements of P(P([−1, 1])) as distributions, to distinguish them from elements of P(R), which we refer to as measures. A measure µ ∈ P(R) generates a distribution P ∈ P(P([−1, 1])) at x ∈ supp(µ) if the scenery at x equidistributes for P in P(P([−1, 1])), that is, if

lim_{T→∞} (1/T) ∫_0^T f(µ_{x,t}) dt = ∫ f(ν) dP(ν), for all f ∈ C(P([−1, 1])),

and µ generates P if it generates P at µ almost every x.

If µ generates P, then P is supported on M□ and is S-invariant ([11], Theorem 1.7). We say that P is trivial if it is the distribution supported on δ_0 ∈ M□ - a fixed point of S. To an S-invariant distribution P we associate its pure point spectrum Σ(P, S). This set consists of all the α ∈ R for which there exists a non-zero measurable function φ : M□ → C such that φ ∘ S_t = exp(2πiαt) φ for every t ≥
0, on a set of full P measure. The existence of such an eigenfunction indicates that some non-trivial feature of the measures of P repeats periodically under magnification by e^α.

Finally, we say that a measure µ ∈ P([0, 1]) is pointwise generic under T_n for a measure ρ ∈ P([0, 1]) if µ almost every x equidistributes for ρ under T_n, that is,

(1/N) Σ_{k=0}^{N−1} f(T_n^k x) → ∫ f(x) dρ(x), ∀ f ∈ C([0, 1]).
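As a purely numerical aside (our own illustration, not part of the paper's argument), pointwise genericity under T_n for λ can be observed directly: a Lebesgue-typical x has IID uniform base-n digits, T_n acts on these digits as the shift, and so the visit frequency of the orbit to an interval should approach the interval's Lebesgue measure. All names below are ours.

```python
import random

def orbit_point(digits, k, base, prec=30):
    """Approximate T_base^k(x) from the base-`base` digits of x:
    T_base is the shift on digit sequences, so we resum digits k+1, k+2, ..."""
    return sum(d / base ** (j + 1) for j, d in enumerate(digits[k:k + prec]))

random.seed(1)
base, N = 3, 20_000
# Model a Lebesgue-typical x by IID uniform base-3 digits (Borel's theorem).
digits = [random.randrange(base) for _ in range(N + 40)]
orbit = [orbit_point(digits, k, base) for k in range(N)]

# Pointwise genericity for λ: the density of visit times to [0, 1/2)
# should approach λ([0, 1/2)) = 1/2.
visits = sum(1 for y in orbit if y < 0.5)
assert abs(visits / N - 0.5) < 0.03
```

The same digit-shift model also illustrates the visit-time densities used in Section 2.3 below: the set of times k with T_n^k(x) in a fixed set has asymptotic density equal to the measure of that set.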
We are now ready to state our second main result, which is the technical tool that shall be employed to prove Theorem 1.1.
Theorem 1.2.
Let µ ∈ P([0, 1]) and let m > n > 1 be integers, such that:

1. The measure µ generates a non-trivial S-ergodic distribution P ∈ P(P([−1, 1])).

2. The pure point spectrum Σ(P, S) does not contain a non-zero integer multiple of 1/log m.

3. The measure µ is pointwise generic under T_n for an ergodic and continuous measure ρ.

Then

(1/N) Σ_{i=0}^{N−1} δ_{(T_m^i(x), T_n^i(x))} → λ × ρ, for µ almost every x. (1)

Notice that under assumption (3) of Theorem 1.2, the measure ρ is T_n invariant, so its ergodicity is with respect to this map. Theorem 1.2 together with the machinery developed by Hochman and Shmerkin in ([13], Section 8) imply Theorem 1.1: this is explained in Section 5.

We end this introduction with a brief overview of the proof of Theorem 1.2. First, we note that if we only assume (1) and (2) in Theorem 1.2, then

(1/N) Σ_{i=0}^{N−1} δ_{T_m^i(x)} → λ, for µ almost every x, (2)

according to the main result of Hochman and Shmerkin [13]. This is proved by roughly following three steps: first, using the spectral condition, they show that any accumulation point ν of measures as in (2) can be represented as an integral over measures that are closely related to those drawn according to P. They proceed to use this representation to show that:

There exists some ε > 0 such that for every τ ∈ P([0, 1]) with dim τ ≥ 1 − ε, we have dim τ ∗ ν = 1.

They conclude by showing that the only T_m invariant measure ν satisfying the latter property is the Lebesgue measure.

Our strategy is to first show that a T_m × T_n invariant measure ν, that projects to λ and to ρ in the first and second coordinate respectively, must be λ × ρ if it satisfies the following condition:

There exists some ε > 0 such that for every τ ∈ P([0, 1]) with dim τ ≥ 1 − ε, we have dim τ ∗ P_1 ν_y = 1,

for ρ almost every y, where ν_y is the conditional measure of ν on the fiber {(x, z) : z = y}, and P_1(x, y) = x. This is Claim 3.4 in Section 3. Afterwards, we show that this property holds for all the accumulation points of the measures from (1).
This is done via a corresponding integral representation, see Claim 4.1 in Section 4.

Notation
We shall use the letter λ to indicate both the Lebesgue measure on [0 ,
1] and the Lebesgue measure on T. Which is meant will be clear from context. Also, whenever we have a finite product space, we denote by P_i the projection to the i-th coordinate.

Acknowledgements
I would like to thank Mike Hochman for suggesting the problem studied in this paper to me, and for his various insightful comments on earlier versions of this manuscript. I would also like to thank Shai Evra for some helpful conversations related to this paper.
Preliminaries
For a Borel set A in some metric space X, we denote by dim A its Hausdorff dimension, and by dim_P A its packing dimension (see Falconer's book [5] for an exposition on these concepts). Now, let µ ∈ P(X). The (lower) Hausdorff dimension of the measure µ is defined as

dim µ := inf{dim A : µ(A) > 0, A is Borel},

and the upper Hausdorff dimension of the measure µ is defined as

dim* µ := inf{dim A : µ(A) = 1, A is Borel}.

The (upper) packing dimension of the measure µ is defined as

dim_P µ := inf{dim_P A : µ(A) = 1, A is Borel}.

An alternative characterization of the dimension of µ that we shall often use is given in terms of its local dimensions: for every x ∈ supp(µ) we define the local (pointwise) dimension of µ at x as

dim(µ, x) = lim inf_{r→0} log µ(B(x, r)) / log r,

where B(x, r) denotes the closed ball of radius r about x. The Hausdorff dimension of µ is equal to

dim µ = ess-inf_{x∼µ} dim(µ, x), (3)

and the upper Hausdorff dimension of µ is equal to

dim* µ = ess-sup_{x∼µ} dim(µ, x), (4)

see e.g. [4]. If dim(µ, x) exists as a limit at almost every point, and is constant almost surely, we shall say that the measure µ is exact dimensional. In this case, most metric definitions of the dimension of µ coincide (e.g. lower and upper Hausdorff dimension and packing dimension).

Let us now collect some known facts regarding the dimension theory of measures:

Proposition 2.1.
1. Let µ ∈ P(R^d) and suppose that there is a distribution Q ∈ P(P(R^d)) such that µ = ∫ ν dQ(ν). Then

dim µ ≥ ess-inf_{ν∼Q} dim ν.

If µ = p_1 µ_1 + p_2 µ_2 where µ_1, µ_2 ∈ P(R^d) and (p_1, p_2) is any probability vector, then dim µ = min{dim µ_1, dim µ_2}.
2. Let f : X → Y be a Lipschitz map between complete metric spaces. Then for any µ ∈ P(X), dim fµ ≤ dim µ, with an equality if f is locally bi-Lipschitz.

3. Let µ ∈ P(T) be exact dimensional, and ν ∈ P(T) be a measure supported on finitely many atoms. Then dim µ ∗ ν = dim µ, and moreover, µ ∗ ν is exact dimensional.

The next Lemma is essentially Lemma 3.5 in [13], with a minor modification which follows e.g. from Lemma 6.13 in [11].
Lemma 2.2. ([13], Lemma 3.5) Let µ ∈ P(R²).

1. Suppose that for P_2 µ almost every y, dim µ_y ≥ α (where µ_y is the conditional measure on the fiber {(x, z) : z = y}). Then dim µ ≥ dim P_2 µ + α.

2. For an upper bound, we have dim µ ≤ dim_P P_2 µ + ess-sup_{y ∼ P_2 µ} dim_P µ_y.

We end this section with a brief discussion of the Fourier coefficients of measures on T^d. These are defined as follows. First, given µ ∈ P(T^d) we define for any k ∈ Z^d the corresponding Fourier coefficient by

µ̂(k) := ∫ e^{2πi k·x} dµ(x).

The following relations are easily verified for two measures µ, ν ∈ P(T):

(µ ∗ ν)^(k) = µ̂(k) · ν̂(k), k ∈ Z. (5)

(µ × ν)^(k, j) = µ̂(k) · ν̂(j), (k, j) ∈ Z². (6)

The following Lemma is standard:

Lemma 2.3. ([22], Section 3.10) Let µ, ν ∈ P(T). If µ̂(k) = ν̂(k) for all k ∈ Z then µ = ν.

Finally, let m ≥ 2 and let µ be the Cantor-Lebesgue measure corresponding to the non-degenerate probability vector (p_0, ..., p_{m−1}). That is, µ is the distribution of the random sum Σ_{k=1}^∞ X_k / m^k, where the X_k are IID random variables with P(X_k = i) = p_i. It is a well known fact that for every k ∈ Z,

µ̂(k) = Π_{j=1}^∞ ( Σ_{u=0}^{m−1} p_u exp(2πi u k / m^j) ). (7)

In this paper, a dynamical system is a quadruple (X, B, T, µ), where X is a compact metric space, B is the Borel sigma algebra, and T : X → X is a measure preserving map, i.e. T is Borel measurable and Tµ = µ. Since we always work with the Borel sigma-algebra, we shall usually just write (X, T, µ). For example one may consider X = T, the Borel map T_p for some p ≥
2, and some Cantor-Lebesgue measure with respect to base p.

A dynamical system is ergodic if and only if the only invariant sets are trivial. That is, if B ∈ B satisfies T^{−1}(B) = B then µ(B) = 0 or µ(B) = 1. A dynamical system is called weakly mixing if for any ergodic dynamical system (Y, S, ν), the product system (X × Y, T × S, µ × ν) is also ergodic. In particular, weakly mixing systems are ergodic. Moreover, if both (X, T, µ) and (
Y, S, ν) are weakly mixing, then their product system is also weakly mixing. A class of examples of weakly mixing systems is given by (T, T_p, µ) where µ is a Cantor-Lebesgue measure with respect to base p.

We will have occasion to use the ergodic decomposition Theorem: Let (X, T, µ) be a dynamical system. Then there is a map X → P(X), denoted by x ↦ µ_x, such that:

1. The map x ↦ µ_x is measurable with respect to the sub-sigma algebra I of T invariant sets.

2. µ = ∫ µ_x dµ(x).

3. For µ almost every x, µ_x is T invariant and ergodic. The measure µ_x is called the ergodic component of x.

Finally, we shall say that a point x ∈ X is generic with respect to µ if

(1/N) Σ_{i=0}^{N−1} δ_{T^i x} → µ, where δ_y is the Dirac measure on y ∈ X,

in the weak-* topology. By the ergodic Theorem, if µ is ergodic then µ a.e. x is generic for µ.

Recall that in general, if µ ∈ P(X) is a T invariant measure, we may define its entropy with respect to T, a quantity that we shall denote by h(µ, T). As there is an abundance of excellent texts on entropy theory (e.g. [30]), we omit a discussion on entropy here. We now restrict our attention to dynamical systems of the form (T, µ, T_p) or (T², µ, T_m × T_n), where we always assume that m > n >
1. In the one dimensional case, the dimension of µ may be computed via the entropies of its ergodic components:

Theorem 2.4. ([20], Theorem 9.1) Let µ ∈ P(T) be a T_p invariant and ergodic measure. Then µ is exact dimensional and

dim µ = h(µ, T_p) / log p.

In general, if µ ∈ P(T) is a T_p invariant measure with ergodic decomposition µ = ∫ µ_x dµ(x), then

dim µ = ess-inf_{x∼µ} dim µ_x (8)

and

dim* µ = ess-sup_{x∼µ} dim µ_x. (9)

The situation for dynamical systems of the form (T², µ, T_m × T_n) is more complicated. This may be attributed to the fact that the map T_m × T_n is not conformal. There is, however, a way to compute the dimension of µ in this situation via entropy theory, using a suitable version of the Ledrappier-Young formula. This was first done by Kenyon and Peres in [18] for ergodic measures. The general case may be treated using similar methods, as observed by Meiri and Peres in ([24], Lemma 3.1).

Theorem 2.5. [24] Let µ ∈ P(T²) be a T_m × T_n invariant measure. Then for µ almost every x the local dimension dim(µ, x) exists as a limit and

dim(µ, x) = ( h(µ_x, T_m × T_n) − h((P_2 µ)_{P_2 x}, T_n) ) / log m + h((P_2 µ)_{P_2 x}, T_n) / log n,

where µ_x and (P_2 µ)_{P_2 x} denote the corresponding ergodic components of µ, and of P_2 µ, respectively.

Finally, we will require the following result of Meiri, Lindenstrauss and Peres from [20]:

Theorem 2.6. [20] Let µ ∈ P(T) be a T_p invariant weakly mixing measure, such that dim µ > 0. Let µ^{∗k} denote the convolution of µ with itself k times. Then dim(µ^{∗k}) → 1, monotonically as k → ∞.

We remark that we have only cited a special case of this result. Indeed, Meiri, Lindenstrauss and Peres deal with the growth of the entropy of more general convolutions of T_p ergodic measures. We refer the reader to [20] for the full statement.
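As a quick numerical aside (our own sketch, not part of the paper), the product formula (7) from Section 2.1 can be checked by truncating the infinite product; the helper name below is ours:

```python
import cmath

def cl_fourier(k, m, p, levels=40):
    """Truncated version of the product formula (7): the Fourier coefficient
    of the Cantor-Lebesgue measure with digit weights p = (p_0, ..., p_{m-1})."""
    val = complex(1.0)
    for level in range(1, levels + 1):
        # One factor of the product: the characteristic function of a single
        # digit X at scale m^level, namely sum_u p_u * e^{2*pi*i*u*k/m^level}.
        val *= sum(p[u] * cmath.exp(2j * cmath.pi * u * k / m ** level)
                   for u in range(m))
    return val

m = 3
# Uniform digit weights give Lebesgue measure: every non-zero coefficient
# vanishes (already the level-1 factor is 0 when 3 does not divide k).
assert abs(cl_fourier(5, m, [1/3, 1/3, 1/3])) < 1e-9
assert abs(cl_fourier(0, m, [1/3, 1/3, 1/3]) - 1) < 1e-9
# The middle-thirds Cantor measure (weights 1/2, 0, 1/2) does not.
assert abs(cl_fourier(1, m, [0.5, 0.0, 0.5])) > 0.1
```

Truncating at 40 levels is harmless here since the factors tend to 1 very quickly: the level-j factor differs from 1 by O(k/m^j).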
Let X be a compact metric space, T : X → X a Borel measurable map, and let µ, ν ∈ P(X). Following Hochman and Shmerkin [13], we shall say that µ is pointwise generic for ν under T if µ almost every x equidistributes for ν under T, that is,

(1/N) Σ_{k=0}^{N−1} f(T^k x) → ∫ f dν, ∀ f ∈ C(X).

This notion is closely related to the main results of this paper. Indeed, let X = T², T = T_m × T_p for m > p > 1 with m ⊥ p, and let α be the pushforward of a T_p invariant ergodic positive dimensional measure µ ∈ P(T) to the diagonal of T². Then Theorem 1.1 part (1) for example may be stated as "α is pointwise generic for λ × µ under T".

In [13], the authors obtain a criterion for this to occur, one that shall play a central role in this paper as well. We now recall its formulation. Let A be a finite partition of X, and for every i ∈ N ∪ {0} let T^i A = {T^{−i} A : A ∈ A}. Let A^k = ⋁_{i=0}^{k−1} T^i A denote the coarsest common refinement of A, T^1 A, ..., T^{k−1} A. Now, if the smallest sigma algebra that contains A^k for all k is the Borel sigma algebra, we say that A is a generator for T. We say that A is a topological generator if

sup{diam A : A ∈ A^k} → 0 as k → ∞.

A topological generator is clearly a generator.

Let us give two examples of topological generators that shall be used in this paper: for every p ∈ N let D_p be the p-adic partition of T (and of R), that is,

D_p = {[z/p, (z+1)/p) : z ∈ Z}.

Then, under the map T_p, we see that

D_p^k = D_{p^k} = {[z/p^k, (z+1)/p^k) : z ∈ Z}.

It is thus easy to see that D_p is a generator for T_p. Similarly, if m > n then the partition D_m × D_n of T² is a generator under T_m × T_n.

Finally, in general, for every k ≥ 1 and x ∈ X, let A^k(x) denote the unique element of A^k that contains x. Given µ ∈ P(X) and x ∈ X such that µ(A^k(x)) >
0, let

µ_{A^k(x)} = c · T^k(µ|_{A^k(x)}), where c = µ(A^k(x))^{−1},

which is well defined almost surely.

Theorem 2.7. ([13], Theorem 2.1) Let T : X → X be a Borel measurable map of a compact metric space, µ ∈ P(X) and A a generating partition. Then for µ almost every x, if x equidistributes for ν ∈ P(X) along some N_j → ∞, and if ν(∂A) = 0 for all A ∈ A^k, k ∈ N, then

ν = lim_{j→∞} (1/N_j) Σ_{k=0}^{N_j−1} µ_{A^k(x)}, weak-* in P(X).

Fix integers m > n, and define for every k ∈ N

A_k = {x ∈ R : D_{m^k}(x)
⊄ D_{n^k}(x)}.

Also, recall that the density of a sequence S ⊆ N (if it exists) is the limit of the sequence |S ∩ [1, N]| / N as N → ∞. If the limit does not exist, the corresponding lim sup is called the upper density of S.

Claim 2.8.
Suppose that µ ∈ P([0, 1]) is a measure that is pointwise generic under T_n for a continuous measure ρ. Then for µ almost every x, if x ∈ lim sup A_k and {n_k} represents the times when x ∈ A_{n_k}, then the density of {n_k} is zero.

Proof. Choose x ∼ µ, and if x ∈ lim sup A_k let {n_k} be the sequence as in the statement of the Claim. Let ε >
0. We will show that the upper density of {n_k} is at most ε. First, since ρ is a continuous measure, there exists some δ > 0 such that ρ(B(0, δ)) < ε, where B(0, δ) is the ball about 0 in T. By our assumption that µ is pointwise generic under T_n for ρ, and since ρ is a continuous measure, V_δ = {i : T_n^i(x) ∈ B(0, δ)} has density ρ(B(0, δ)) < ε.

Now, let us decompose our sequence

{n_k} = ({n_k} ∩ V_δ) ∪ ({n_k} ∩ (N \ V_δ)).

Then the upper density of {n_k} ∩ V_δ is at most ε. We now show that the density of the sequence {ℓ_k} := {n_k} ∩ (N \ V_δ) is 0. In fact, we will show that this is a finite sequence.

Indeed, let K > log δ / log(n/m). We claim that {ℓ_k} ⊆ [0, K]. Assume towards a contradiction that there exists some q > K such that ℓ_k = q for some k. Then there is a unique n^q-adic number a (an endpoint of a D_{n^q} cell) such that a ∈ D_{m^q}(x). Write a = s/n^q for some integer s. Then we have

|x − s/n^q| ≤ 1/m^q,

which implies that T_n^q(x) ∈ B(0, (n/m)^q) ⊂ B(0, δ), by the choice of K. Thus, q ∈ V_δ, contradicting the choice of the sequence {ℓ_k}. Thus, {ℓ_k} ⊆ [0, K], which is sufficient for us.

We will also require the following Lemma.

Lemma 2.9.
Let x ∈ [0, 1] be such that it equidistributes for a continuous measure ρ under T_n. Let D ⊂ [0, 1] be some interval. Let {n_k} be the sequence of times when T_n^{n_k} x ∉ D but d(T_n^{n_k}(x), ∂D) ≤ (n/m)^{n_k}. Then the density of the sequence {n_k} is 0.

Proof. Let ε >
0. Since ρ is continuous, there exists some δ > 0 such that

ρ({y : d(y, ∂D) ≤ δ}) < ε.

Let V_δ = {k : T_n^k x ∈ {y : d(y, ∂D) ≤ δ}}. Then by our assumption on x, the density of V_δ is at most ε. However, the sequence {n_k} ⊆ V_δ, apart from maybe finitely many indices. It follows that the upper density of {n_k} is at most the density of V_δ, and therefore is at most ε. This proves the Lemma.

Ergodic fractal distributions

Recall the definitions introduced in Section 1.2. In this Section we discuss some other related results of [13] that we shall require. First, we cite a result on what the absence of certain elements from the spectrum Σ(P, S) implies.

Proposition 2.10. ([13], Section 4) Suppose that µ generates an S-ergodic distribution P and that no non-zero integer multiple of 1/t, for some t > 0, is in Σ(P, S). Then P is t-generated by µ at almost every x, i.e. the sequence {µ_{x,kt}}_{k=0}^∞ equidistributes for P.

The next result says that distributions P ∈ P(P([0, 1])) that are generated by measures µ have some additional invariance properties:

Theorem 2.11. ([13], Theorem 4.7) Suppose that µ generates an S-invariant distribution P. Then P is supported on M□ and satisfies the S-quasi-Palm property: for every Borel set B ⊆ M□, P(B) = 1 if and only if for every t > 0, P almost every measure η satisfies that η_{x,t} ∈ B for η almost every x such that [x − e^{−t}, x + e^{−t}] ⊆ [−1, 1].

We shall refer henceforth to S-ergodic distributions P supported on M□ that satisfy the conclusion of Theorem 2.11 as EFD's (Ergodic Fractal Distributions), a term coined by Hochman in [11]. The next Proposition says that typical measures with respect to a non-trivial EFD have positive dimension (recall the definition of non-triviality in this situation from Section 1.2):

Proposition 2.12. ([13], Proposition 4.12) Let P be an EFD. Then there exists some δ ≥ 0 such that P almost every ν has dim ν = δ. If P is non-trivial then δ > 0.
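To make the spectral notions concrete, here is a small self-contained toy illustration of ours (not from [13]): for the middle-thirds Cantor measure the scenery at x = 0 repeats exactly under magnification by 3, a log 3-periodic feature of the generated distribution; under the eigenfunction convention of Section 1.2 this contributes non-zero integer multiples of 1/log 3 to the pure point spectrum. The exact self-repetition is visible in cylinder masses:

```python
from fractions import Fraction

# Base-3 digit weights of the middle-thirds Cantor measure.
W = [Fraction(1, 2), Fraction(0), Fraction(1, 2)]

def cylinder_mass(digits):
    """Mass of the base-3 cylinder determined by the given digit string."""
    mass = Fraction(1)
    for d in digits:
        mass *= W[d]
    return mass

def zoom_profile(k):
    """Normalized masses of the three children of the window [0, 3^-k):
    a discrete snapshot of the scenery at x = 0 at time t = k * log(3)."""
    window = cylinder_mass([0] * k)
    return [cylinder_mass([0] * k + [d]) / window for d in range(3)]

# The snapshot is identical at every depth: the scenery at 0 is log(3)-periodic.
assert zoom_profile(0) == zoom_profile(1) == zoom_profile(7)
assert zoom_profile(7) == [Fraction(1, 2), Fraction(0), Fraction(1, 2)]
```

Exact rational arithmetic (`Fraction`) is used so that the periodicity is an identity rather than a numerical coincidence.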
We will also need to know that P-typical measures are not "one sided at small scales":

Proposition 2.13. ([13], Proposition 4.13) Let P be an EFD. For every ρ > 0, for P almost every ν, we have inf_I ν(I) > 0, where I ⊆ [−1, 1] ranges over closed intervals of length ρ containing 0.

The next Proposition follows from the S-invariance of EFD's, and from a Theorem of Hunt and Kaloshin [16]:

Proposition 2.14. ([13], Lemma 5.8) Let P be a non trivial EFD such that P typical measures have dimension δ > 0. Let τ ∈ P(R) be such that dim τ ≥ 1 − δ. Then dim τ ∗ ν = 1 for P almost every ν.

Finally, the next Proposition shows that ergodic T_p invariant measures of positive dimension generate non-trivial EFD's:

Theorem 2.15. [12] Let µ ∈ P([0, 1]) be a T_p invariant ergodic measure with dim µ > 0. Then µ generates a non-trivial S ergodic distribution P (which is an EFD by Theorem 2.11).

Let m ⊥ p. We remark that while non-degenerate Cantor-Lebesgue measures with respect to base p do generate EFD's P such that k/log m ∉ Σ(P, S) for every non zero integer k, this is not true in general. Thus, in order to deduce Theorem 1.1 from Theorem 1.2, we shall require some additional machinery developed by Hochman and Shmerkin in [13] for a similar purpose. This is discussed in Section 5.

Some properties of (times m, times n) invariant measures
Throughout this section we fix integers m > n >
1. We begin with an elementary Lemma from entropy theory. Recall that we denote the coordinate projections by P_1, P_2.

Lemma 3.1.
Let α ∈ P(T²) be a T_m × T_n invariant measure such that P_2 α = ρ. If

h(T_m × T_n, α) = log m + h(T_n, ρ)

then α = λ × ρ.

Proof. Let E be the invariant sigma algebra that corresponds to the second coordinate of T². Then, by the Abramov-Rokhlin Lemma (see [2] for the non-invertible case),

h(T_m × T_n, α) = h(T_m × T_n, α|E) + h(T_n, ρ).

Combining this with our condition, we see that h(T_m × T_n, α|E) = log m. Recall that the partition A = D_m × D_n is a generating partition of T² (see Section 2.3). Then it follows from Fekete's Lemma and the Kolmogorov-Sinai Theorem that

inf_k (1/k) H_α(⋁_{i=0}^{k−1} (T_m × T_n)^i A | E) = h(T_m × T_n, A, α|E) = h(T_m × T_n, α|E) = log m.

As log m is also an upper bound for the sequence {(1/k) H_α(⋁_{i=0}^{k−1} (T_m × T_n)^i A | E)}, we find that for every k ∈ N,

(1/k) H_α(⋁_{i=0}^{k−1} (T_m × T_n)^i A | E) = log m.

So, by the formula for conditional entropy as an average over the conditional measures {α_x^E},

log m^k = H_α(⋁_{i=0}^{k−1} (T_m × T_n)^i A | E) = ∫ H_{α_x^E}(⋁_{i=0}^{k−1} (T_m × T_n)^i A) dρ(x) = ∫ H_{α_x^E}(⋁_{i=0}^{k−1} T_m^i D_m) dρ(x),

where the partition in the last term on the RHS should be understood as the corresponding partition on the fiber [0, 1] × {P_2(x)}. We also have H_{α_x^E}(⋁_{i=0}^{k−1} T_m^i D_m) ≤ log m^k almost surely, since ⋁_{i=0}^{k−1} T_m^i D_m has m^k atoms. Therefore,

H_{α_x^E}(⋁_{i=0}^{k−1} T_m^i D_m) = log m^k almost surely.

Such an equality is possible only if α_x^E is the uniform measure on D_{m^k}. It follows that almost surely the measure α_x^E is the uniform measure on D_{m^k} for every k. By the Kolmogorov consistency Theorem, α_x^E = λ almost surely. Since α = ∫ α_x^E dρ(x), this proves the result.

Claim 3.2.
Let θ ∈ P(T²) be a T_m × T_n invariant measure such that P_2 θ = ρ is exact dimensional. If dim θ = 1 + dim ρ then θ = λ × ρ.

Proof. By equation (3), and by Theorem 2.5,

1 + dim ρ = dim θ = ess-inf_{x∼θ} dim(θ, x)
= ess-inf_{x∼θ} [ ( h(θ_x, T_m × T_n) − h((P_2 θ)_{P_2 x}, T_n) ) / log m + h((P_2 θ)_{P_2 x}, T_n) / log n ]
= ess-inf_{x∼θ} [ ( h(θ_x, T_m × T_n) − h(ρ_{P_2 x}, T_n) ) / log m + h(ρ_{P_2 x}, T_n) / log n ] (10)

(recall that θ_x and ρ_{P_2 x} denote the corresponding ergodic components of θ and of ρ, respectively). Now, by Theorem 2.4, and since ρ is exact dimensional (so its upper and lower Hausdorff dimensions coincide),

ess-sup_{x∼θ} dim(ρ_{P_2 x}) = ess-sup_{y∼ρ} dim(ρ_y) = dim* ρ = dim ρ = ess-inf_{x∼θ} dim(ρ_{P_2 x}).

So, for θ almost every x we have

h(ρ_{P_2 x}, T_n) / log n = dim ρ_{P_2 x} = dim ρ. (11)

Combining (11) with (10), we find that

1 = ess-inf_{x∼θ} ( h(θ_x, T_m × T_n) − h(ρ_{P_2 x}, T_n) ) / log m. (12)

Therefore, by (12), the formula for entropy as an average over ergodic components, the Abramov-Rokhlin Lemma, and the formula for entropy as the average of conditional measures (as in Lemma 3.1), we have

log m ≤ ∫ ( h(θ_x, T_m × T_n) − h(ρ_{P_2 x}, T_n) ) dθ(x)
= ∫ h(θ_x, T_m × T_n) dθ(x) − ∫ h(ρ_{P_2 x}, T_n) dθ(x)
= h(θ, T_m × T_n) − ∫ h(ρ_y, T_n) dρ(y)
= h(θ, T_m × T_n) − h(ρ, T_n)
= h(θ, T_m × T_n | E) ≤ log m,

where E is the invariant sigma algebra that corresponds to the second coordinate of T². Thus, we have that θ almost surely,

( h(θ_x, T_m × T_n) − h(ρ_{P_2 x}, T_n) ) / log m = 1, and h(θ, T_m × T_n | E) = log m. (13)

Now, (13) and the Abramov-Rokhlin Lemma imply that θ almost surely

log m + h(ρ_{P_2 x}, T_n) = h(θ_x, T_m × T_n) = h(θ_x, T_m × T_n | E) + h(P_2 θ_x, T_n). (14)

By (13) and the formula for entropy and convex combinations,

log m = h(θ, T_m × T_n | E) = ∫ h(θ_x, T_m × T_n | E) dθ(x).

Since 0 ≤ h(θ_x, T_m × T_n | E) ≤ log m almost surely, we must have h(θ_x, T_m × T_n | E) = log m almost surely.
By this equality and (14) we see that for θ almost every x,

h(P_2 θ_x, T_n) = h(ρ_{P_2 x}, T_n). (15)

Finally, by (15) and (14),

h(θ_x, T_m × T_n) = log m + h(P_2 θ_x, T_n) for almost every x.

By Lemma 3.1, almost every ergodic component θ_x equals λ × P_2 θ_x. Thus,

θ = ∫ θ_x dθ(x) = ∫ λ × P_2 θ_x dθ(x) = λ × ( ∫ P_2 θ_x dθ(x) ) = λ × P_2( ∫ θ_x dθ(x) ) = λ × ρ.

Next, we make a short digression to discuss the relation between the conditional measures of a convolution of measures, and the conditional measures of the individual measures convolved, in some special cases. In the following, the convolution of two measures on the unit square [0, 1]² is understood to take place in R². For a measure ν ∈ P(R²), let ν = ∫ ν_y dP_2 ν(y) be the disintegration of ν with respect to the projection P_2.

Claim 3.3.
Let θ, ν ∈ P([0, 1]²) be two measures such that θ = τ × α, where the measure α is a convex combination of finitely many atomic measures. Then for P_2(ν ∗ θ) almost every z, the conditional measure (ν ∗ θ)_z with respect to the projection P_2 is a finite convex combination of measures of the form ν_{z−z_i} ∗ (τ × δ_{z_i}), where z_i is an atom of α and ν_{z−z_i} is a conditional measure of ν with respect to the projection P_2.

Proof. If α = δ_y for some y then the result is straightforward. For the general case, notice that if θ = τ × α and α = Σ_i p_i δ_{z_i} then by the linearity of both convolution and of taking product measures,

ν ∗ θ = Σ_i ν ∗ (τ × (p_i · δ_{z_i})) = Σ_i p_i · ν ∗ (τ × δ_{z_i}).

In general, if µ = p_1 · µ_1 + p_2 · µ_2 is a convex combination of probability measures and C is some sigma algebra, then the following holds almost surely for every f ∈ L^1:

E_µ(f | C) = p_1 · E_{µ_1}(f | C) · dµ_1/dµ + p_2 · E_{µ_2}(f | C) · dµ_2/dµ.

We remark that in the above equation, the Radon-Nikodym derivatives dµ_i/dµ in fact stand for the Radon-Nikodym derivatives when the measures are restricted to the sigma-algebra C, i.e. dµ_i|_C / dµ|_C. However, we suppress this in our notation. So, for B the Borel sigma algebra on the y-axis, for every f ∈ L^1 and for almost every z,

∫ f d(ν ∗ θ)_z = E_{ν∗θ}(f | P_2^{−1} B)(z)
= Σ_i p_i · E_{ν∗(τ×δ_{z_i})}(f | P_2^{−1} B)(z) · d(ν ∗ (τ × δ_{z_i}))/d(ν ∗ θ)(z)
= Σ_i p_i · ∫ f d(ν ∗ (τ × δ_{z_i}))_z · d(ν ∗ (τ × δ_{z_i}))/d(ν ∗ θ)(z)
= Σ_i p_i · ∫ f d(ν_{z−z_i} ∗ (τ × δ_{z_i})) · d(ν ∗ (τ × δ_{z_i}))/d(ν ∗ θ)(z)
= ∫ f d( Σ_i (ν_{z−z_i} ∗ (τ × δ_{z_i})) · p_i · d(ν ∗ (τ × δ_{z_i}))/d(ν ∗ θ)(z) ).
It follows that almost surely,

(ν ∗ θ)_z = Σ_i ν_{z−z_i} ∗ (τ × δ_{z_i}) · p_i · d(ν∗(τ×δ_{z_i}))/d(ν∗θ)(z).

The following Claim, which forms the main result of this section, is also the key to our argument.

Claim 3.4.
Let ν ∈ P([0,1]²) be a T_m × T_n invariant measure such that:

1. We have P₂ν = ρ, where ρ is a continuous ergodic measure, and P₁ν = λ.

2. There exists some δ > 0 such that: for every probability measure τ ∈ P([0,1]) with dim τ ≥ 1 − δ, for ρ almost every y, we have dim τ ∗ P₁ν_y = 1.

Then ν = λ × ρ.

Proof. Suppose towards a contradiction that ν ≠ λ × ρ. Let us first identify ν with the corresponding measure on T² (i.e. we project ν to T², but we keep the notation ν), which cannot be λ × ρ either. Then, by Lemma 2.3, there exists (i, j) ∈ Z² ∖ {(0,0)} such that

ν̂(i, j) ≠ (λ × ρ)̂(i, j).

Now, as P₂ν = ρ and P₁ν = λ, we must have i, j ≠ 0: if e.g. i = 0 then, using (6),

ν̂(0, j) = (P₂ν)̂(j) = ρ̂(j) = 1 · ρ̂(j) = λ̂(0) ρ̂(j) = (λ × ρ)̂(0, j),

a contradiction. Thus, we may assume both i, j ≠ 0, and since λ̂(i) = 0 we have ν̂(i, j) ≠ 0 by (6). Now, let k ∈ N be such that 2|j| < n^k + 1. We construct two measures τ, α ∈ P(T) such that:

1. The measure α is the uniform measure on a finite (periodic) T_{n^k} orbit such that α̂(j) ≠ 0. To find such a measure, we take the T_{n^k} periodic orbit {x₀, x₁} where x₀ = 1/(n^{2k} − 1) and x₁ = n^k/(n^{2k} − 1). Define the measure α = ½ δ_{x₀} + ½ δ_{x₁} on this orbit. Then

α̂(x) = ½ · e^{2πi x/(n^{2k}−1)} · (1 + e^{2πi x (n^k−1)/(n^{2k}−1)}).

Now, if α̂(j) = 0 then e^{2πi j (n^k−1)/(n^{2k}−1)} = −1, which can only happen if 2j(n^k−1)/(n^{2k}−1) is an odd integer. However, 2j(n^k−1)/(n^{2k}−1) = 2j/(n^k+1), and |2j/(n^k+1)| < 1 by our choice of k. Thus, it is impossible that α̂(j) = 0.

2. The measure τ is T_{m^k} invariant, dim τ ≥ 1 − δ and τ̂(i) ≠ 0. To find such a measure, let β be the Cantor–Lebesgue measure with respect to base m and the non-degenerate probability vector (1/3, 2/3, 0, ..., 0) (see the end of Section 2.1). Then β is a weakly mixing T_m invariant measure (a Bernoulli measure). By (7),

β̂(x) = ∏_{j=1}^{∞} ( 1/3 + (2/3) exp(2πix/m^j) ) = ∏_{j=1}^{∞} ( 1 + (2/3)·(exp(2πix/m^j) − 1) ).

By looking at the corresponding power series expansion, we see that for every j,

|exp(2πix/m^j) − 1| = | Σ_{k=1}^{∞} (2πix/m^j)^k / k! | ≤ Σ_{k=1}^{∞} (2π|x|/m^j)^k / k! ≤ (2π|x|/m^j) · exp(2π|x|).

By Proposition 3.1 in Chapter 5 of [29], we conclude that β̂(i) = 0 if and only if one of its factors has a zero at i. Since 0 ≠ i ∈ Z ⊂ R, this clearly does not happen, and we conclude that β̂(i) ≠ 0.

Also, notice that dim β = H(1/3, 2/3)/log m > 0, where H(p₁, p₂) is the Shannon entropy of the probability vector (p₁, p₂). Finally, by Theorem 2.6, there exists some q ∈ N such that dim β^{∗q} > 1 − δ, where by β^{∗q} we mean the convolution of β with itself q times. Recalling (5), we see that (β^{∗q})̂(i) = (β̂(i))^q ≠ 0. Thus, we may take τ = β^{∗q}.

Notice that τ is T_m invariant, so it is also T_{m^k} invariant. Thus, by (5) and (6), the Fourier coefficients of the measure ν ∗ (τ × α) ∈ P(T²) satisfy

(ν ∗ (τ × α))̂(i, j) = ν̂(i, j) · (τ × α)̂(i, j) = ν̂(i, j) · τ̂(i) · α̂(j) ≠ 0.

Therefore, as i ≠ 0, we have by Lemma 2.3

ν ∗ (τ × α) ≠ λ × (ρ ∗ α),   (16)

since (λ × (ρ ∗ α))̂(i, j) = λ̂(i) · (ρ ∗ α)̂(j) = 0.

On the other hand, let us now lift all our measures to corresponding measures on [0,1] and [0,1]². Since ν is already defined on the unit square, we take this representative for our lift. Since τ cannot be atomic, we can take our lift as the corresponding measure on [0,1); for α we can take essentially the same measure. By Claim 3.3, the conditional measures of ν ∗ (τ × α) with respect to the projection P₂ are almost surely finite convex combinations of measures of the form ν_{y−x_i} ∗ (τ × δ_{x_i}), where x₀, x₁ are the atoms of α, with weights p_i(y) for i = 0, 1. So, for P₂(ν ∗ (τ × α)) almost every y,

dim (ν ∗ (τ × α))_y = dim ( ν_{y−x₀} ∗ (τ × δ_{x₀}) · p₀(y) + ν_{y−x₁} ∗ (τ × δ_{x₁}) · p₁(y) )
= min { dim ν_{y−x₀} ∗ (τ × δ_{x₀}), dim ν_{y−x₁} ∗ (τ × δ_{x₁}) }
≥ min { dim P₁(ν_{y−x₀} ∗ (τ × δ_{x₀})), dim P₁(ν_{y−x₁} ∗ (τ × δ_{x₁})) }
≥ min { dim (P₁ν_{y−x₀}) ∗ τ, dim (P₁ν_{y−x₁}) ∗ τ } = 1,

where we have used condition (2) in the statement of the Claim, the lower bound on dim τ, and Proposition 2.1. Since the opposite inequality is always true, we conclude that

dim (ν ∗ (τ × α))_y = 1, for P₂(ν ∗ (τ × α)) almost every y.   (17)

Since P₂(ν ∗ (τ × α)) = ρ ∗ α, and by (17), we see via Lemma 2.2 part (1) that dim ν ∗ (τ × α) ≥ 1 + dim ρ ∗ α.

On the other hand, by part (2) of Lemma 2.2, and since P₁ν = λ,

dim ν ∗ (τ × α) ≤ dim λ ∗ τ + dim ρ ∗ α = 1 + dim ρ ∗ α.

We conclude that dim ν ∗ (τ × α) = 1 + dim ρ ∗ α.

Finally, we project ν ∗ (τ × α) to T². Since this projection is a local diffeomorphism, it preserves dimension. Thus, the convolved measure ν ∗ (τ × α), with the ambient group being T², has dimension 1 + dim ρ ∗ α. Moreover, by Theorem 2.4, since ρ is ergodic it is exact dimensional. Since α is a discrete measure (supported on two atoms), the convolution ρ ∗ α remains exact dimensional (Proposition 2.1). Therefore, we may apply Claim 3.2 to the measure ν ∗ (τ × α), since this is a T_{m^k} × T_{n^k} invariant measure (as the convolution of such measures), and the assumptions on the dimension of ν ∗ (τ × α) and on P₂(ν ∗ (τ × α)) = ρ ∗ α are met by the previous paragraph. Thus, we may conclude that ν ∗ (τ × α) = λ × (ρ ∗ α). Via (16), this yields our desired contradiction.

Let µ be as in Theorem 1.2, and let ν be some accumulation point of the sequence of measures as in (1) (where we pick a typical x according to µ), along a subsequence N_k. Our goal is to show that ν = λ × ρ, and we shall do this by showing that ν meets the conditions of Claim 3.4. By our assumptions and Theorem 1.1 in [13] it follows that P₁ν = λ and P₂ν = ρ. Thus, ν satisfies condition (1) in Claim 3.4. Notice that this implies that ν gives zero mass to the points of discontinuity of T_m × T_n; so ν is T_m × T_n invariant. For the second condition of Claim 3.4, we require the following analogue of Theorem 5.1 in [13]. Recall that P is the EFD generated by µ (see Section 2.4).

Claim 4.1. (Conditional integral representation) For P₂ν = ρ almost every y there is a probability space (Ω, F, Q(y)) and measurable functions c : Ω → (0, ∞), x : Ω → [−1, 1], η : Ω → P([−1, 1]) such that:

1. P₁ν_y = ∫ c(ω) · (δ_{x(ω)} ∗ η(ω))|_{[0,1]} dQ(y)(ω)
2. Let P_y denote the distribution of the random variable η as above. Then P = ∫ P_y dρ(y).

Proof. We dedicate the first part of the proof to finding a disintegration of P according to the measure ρ. To this end, consider the following sequence of distributions R_{N_k} ∈ P( P([−1,1]) × [0,1] ):

R_{N_k} = (1/N_k) Σ_{i=0}^{N_k−1} δ_{(µ_{x, i log m}, T_n^i x)}.

Let R be some accumulation point of this sequence. Without loss of generality, let us assume the limit already exists along the sequence N_k. Then we may assume that P₁R = P and P₂R = ρ, since we are considering a µ typical point x, making use of the fact that µ is pointwise generic under T_n for ρ, and of the spectral condition on P via Proposition 2.10.

Next, we disintegrate the distribution R via the projection P₂: R = ∫ R_y dρ(y). Applying the map P₁ to this disintegration, we see that

P = P₁R = ∫ P₁R_y dρ(y).

Thus, the family of measures {P₁R_y} forms our desired disintegration. Let us study this family of distributions a little further: it is well known (see e.g. [9] or [28]) that for ρ almost every y, we may write

R_y = lim_{p→∞} R( · ∩ P₂⁻¹(D_p(y)) ) / ρ(D_p(y)).

Therefore,

P₁R_y = lim_{p→∞} P₁R( · ∩ P₂⁻¹(D_p(y)) ) / ρ(D_p(y)) = lim_{p→∞} lim_{k→∞} 1/(ρ(D_p(y)) · N_k) Σ_{0 ≤ i ≤ N_k−1 : T_n^i x ∈ D_p(y)} δ_{µ_{x, i log m}}.

Finally, we note that for ρ almost every y, for every p,

R( · ∩ P₂⁻¹(D_p(y)) ) / ρ(D_p(y)) ≪ R.

Therefore, for ρ almost every y, for every p,

P₁( R( · ∩ P₂⁻¹(D_p(y)) ) / ρ(D_p(y)) ) ≪ P₁R = P.   (18)

We now turn our attention to the main assertions of the Claim. First, we embed µ (the measure from Theorem 1.2) on the diagonal of the unit square by pushing it forward via the map x ↦ (x, x). We call this new measure µ̃. For k ∈ N let A_k denote the partition of [0,1]² given by

D_{m^k} × D_{n^k} = ⋁_{i=0}^{k−1} (T_m × T_n)^{−i} (D_m × D_n).

Given a point z ∈ [0,1]² such that µ̃(A_k(z)) > 0, define

µ̃^{A_k(z)} := c · (T_m × T_n)^k (µ̃|_{A_k(z)}),

where c is a normalizing constant.
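As an aside (not part of the argument), the identity defining A_k can be sanity-checked numerically: a point's level-k cell in D_{m^k} × D_{n^k} is determined by its first k base-m and base-n digits, and vice versa. A minimal sketch in Python; the parameters m = 3, n = 2, k = 5 and the sampled points are arbitrary choices:

```python
from fractions import Fraction
import random

def digits(x, base, k):
    """First k digits of x in the given base, computed with exact arithmetic."""
    ds = []
    for _ in range(k):
        x *= base
        d = int(x)      # next digit: floor(base * x)
        ds.append(d)
        x -= d          # x <- T_base(x), the fractional part
    return ds

m, n, k = 3, 2, 5
random.seed(0)
for _ in range(100):
    x = Fraction(random.randrange(10**9), 10**9)
    y = Fraction(random.randrange(10**9), 10**9)
    # Cell of (x, y) in D_{m^k} x D_{n^k}: the pair of level-k cell indices.
    cell = (int(x * m**k), int(y * n**k))
    # Cell in the join of (T_m x T_n)^{-i}(D_m x D_n), i < k: the digit strings.
    dx, dy = digits(x, m, k), digits(y, n, k)
    # Each labeling determines the other, so the two partitions coincide.
    assert cell == (sum(d * m**(k - 1 - i) for i, d in enumerate(dx)),
                    sum(d * n**(k - 1 - i) for i, d in enumerate(dy)))
```

Exact rational arithmetic is used so that digit extraction near cell boundaries is not corrupted by floating-point rounding.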
By applying Claim 2.8, we see that there is a set S ⊆ N of density 1 (possibly depending on the x we chose according to µ), such that for every k ∈ S the measure µ̃|_{A_k(x,x)} is an affine image of the measure µ|_{D_{m^k}(x)}. Since we are only interested in the limiting behaviour of these measures, we may assume S = N. Also, ν(∂A) = 0 for all A ∈ A_k, k ∈ N, since P₁ν and P₂ν are both continuous measures. Thus, by Theorem 2.7,

ν = lim_{k→∞} (1/N_k) Σ_{i=0}^{N_k−1} µ̃^{A_i(x,x)}.   (19)

Now, for P₂ν = ρ almost every y the conditional measure ν_y can be obtained as the weak-* limit lim_{p→∞} ν_{P₂⁻¹D_p(y)}, where for every Borel set A ⊆ [0,1]² and p ∈ N,

ν_{P₂⁻¹D_p(y)}(A) := ν(A ∩ P₂⁻¹D_p(y)) / ν(P₂⁻¹D_p(y)) = ν(A ∩ P₂⁻¹D_p(y)) / ρ(D_p(y)).

Fix p ∈ N. By (19), and since µ̃|_{A_k(x,x)} is an affine image of the measure µ|_{D_{m^k}(x)} for every k, the projection of ν_{P₂⁻¹D_p(y)} to the x-axis (i.e. via P₁) equals

lim_{k→∞} 1/(ρ(D_p(y)) · N_k) Σ_{i : 0 ≤ i ≤ N_k−1, T_n^i(x) ∈ D_p(y)} P₁ ∘ L_{n^i/m^i, (T_m^i(x), T_n^i(x))} ( c_k (τ_{x_k} ∗ µ_{x, i log m})|_{[0,1]} )
= lim_{k→∞} 1/(ρ(D_p(y)) · N_k) Σ_{i : 0 ≤ i ≤ N_k−1, T_n^i(x) ∈ D_p(y)} c_k (τ_{x_k} ∗ µ_{x, i log m})|_{[0,1]},   (20)

where L_{α,z} is the unique affine map taking the x-axis to the line with slope α through the point z. Notice that in the first equation above we only take note of the indices such that T_n^i(x) ∈ D_p(y), and this is justified by Lemma 2.9.

We thus see, as in Theorem 5.1 in [13] and its preceding discussion, that there is a distribution

Q_{D_p(y)} ∈ P( R × [−1,1] × P([−1,1]) )

such that we have an integral representation (that depends on both p and y)

P₁ν_{P₂⁻¹D_p(y)} = ∫ g(ω) dQ_{D_p(y)}(ω),

where g : R × [−1,1] × P([−1,1]) → P([0,1]) is given by g(c, x, η) = c · (δ_x ∗ η)|_{[0,1]}. Moreover, the distribution of Q_{D_p(y)} on the measure component P([−1,1]) is

lim_{k→∞} 1/(ρ(D_p(y)) · N_k) Σ_{0 ≤ i ≤ N_k−1 : T_n^i x ∈ D_p(y)} δ_{µ_{x, i log m}},

and by equation (18) and its preceding discussion, this distribution is absolutely continuous with respect to P.

Notice that for Q_{D_p(y)} almost every (c, x, η), c is the normalizing constant making (δ_x ∗ η)|_{[0,1]} a probability measure. Also, the map g is continuous almost surely. Moreover, by moving to a subsequence, we may assume the weak-* limit lim_{p→∞} Q_{D_p(y)} exists; call it Q_y. For these assertions we argue, as in ([13], Theorem 5.1), that the distribution {P₁(Q_{D_p(y)})}_p of the normalizing constants is tight. Indeed, for measures drawn according to P this follows from Proposition 2.13, and in our case the distribution of Q_{D_p(y)} on the measure component is absolutely continuous with respect to P. Finally,

P₁ν_y = P₁ lim_{p→∞} ν_{P₂⁻¹D_p(y)} = lim_{p→∞} P₁ν_{P₂⁻¹D_p(y)} = lim_{p→∞} ∫ g(ω) dQ_{D_p(y)}(ω) = ∫ g(ω) dQ_y(ω).

Recall that by equation (8) in [13] Section 5.2,

µ^{D_{m^i}(x)} := c · T_m^i (µ|_{D_{m^i}(x)}) = c_i (τ_{x_i} ∗ µ_{x, i log m})|_{[0,1]}

for corresponding parameters. For ρ almost every y, the distribution of Q_y on the measure component P([−1,1]) is P₁R_y, as in the first part of the proof, by the discussion preceding (18).

Proof of Theorem 1.2
We are now in position to show that ν satisfies all conditions in Claim 3.4. We have already established that condition (1) holds. As for condition (2), we choose δ = dim P > 0 (recall that P is non-trivial, and Proposition 2.12). Let τ ∈ P([0,1]) with dim τ ≥ 1 − δ. We now show that dim τ ∗ P₁ν_y = 1 for P₂ν = ρ almost every y.

First, by Lemma 2.14 and Claim 4.1 part (2),

1 = ∫ dim(τ ∗ η) dP(η) = ∫∫ dim(τ ∗ η) dP_y(η) dρ(y).   (21)

Therefore, for ρ almost every y, for P_y almost every η, dim τ ∗ η = 1 (since the integrand is always ≤ 1). So, for ρ almost every y,

dim τ ∗ P₁ν_y = dim τ ∗ ∫ c(ω) · (δ_{x(ω)} ∗ η(ω))|_{[0,1]} dQ(y)(ω)
= dim ∫ c(ω) · τ ∗ (δ_{x(ω)} ∗ η(ω))|_{[0,1]} dQ(y)(ω)
≥ ess-inf_{ω∼Q(y)} dim τ ∗ (δ_{x(ω)} ∗ η(ω))|_{[0,1]}
≥ ess-inf_{ω∼Q(y)} dim τ ∗ δ_{x(ω)} ∗ η(ω)
= ess-inf_{η∼P_y} dim τ ∗ η = 1.

Since dim τ ∗ P₁ν_y ≤ 1 always, dim τ ∗ P₁ν_y = 1 for ρ almost every y. We conclude that ν satisfies the conditions of Claim 3.4. Therefore, ν = λ × ρ, as desired.

Let µ be a T_p-invariant and ergodic measure with positive dimension. Then µ generates an EFD P with dim P > 0. Let m > n > 1. The pure point spectrum Σ(P, S) can contain non-zero integer multiples of 1/log m only if either m ∼ p (in Theorem 1.1 we assume this is not the case), or if log p/log m ∈ Σ(T, µ), see [12]. We shall prove Theorem 1.1 by using Theorem 1.2, and following the analysis of Hochman and Shmerkin from ([13], Section 8) in order to relax the spectral condition (i.e. deal with the latter case). We begin by treating the case n = p.

Suppose first that Σ(P, S) does not contain a non-zero integer multiple of 1/log m. By the ergodic Theorem, µ is pointwise T_n generic for µ. Also, since µ generates an EFD such that k/log m ∉ Σ(P, S) for every k ∈ Z ∖ {0}, we may apply Theorem 1.2 and obtain

(1/N) Σ_{i=0}^{N−1} δ_{(T_m^i(x), T_n^i(x))} → λ × µ, for µ almost every x.

Suppose now that there exists some k ∈ Z ∖ {0} such that k/log m ∈ Σ(P, S), so P is not S_{log m} ergodic by Proposition 4.1 in [13]. By the results discussed in ([13], Sections 8.2 and 8.3) there is a probability space (Ω, F, Q) and a measurable family of measures {µ_ω}_{ω∈Ω} such that:

1. The measures {µ_ω}_{ω∈Ω} form a disintegration of µ, that is, µ = ∫ µ_ω dQ(ω).

2. For Q almost every ω, µ_ω generates P.

3. For Q almost every ω, µ_ω log m-generates an S_{log m} ergodic distribution P_x at almost every point x (see Proposition 2.10 for the definition of log m-generation).

Let δ = dim P > 0. Then the following holds:

Lemma 5.1. ([13], Lemma 8.3) Let τ ∈ P(R) be such that dim τ ≥ 1 − δ. Then dim τ ∗ η = 1 for Q almost every ω, µ_ω almost every x, and P_x almost every η.

Now, we may finish the proof in a similar fashion to the proof of Theorem 1.2. Namely, for Q almost every ω and for µ_ω almost every x, let ν be such that the orbit of (x, x) equidistributes for it sub-sequentially under T_m × T_n. Then we may assume P₂ν = µ by the ergodic Theorem, and that µ_ω log m-generates P_x, where P_x is typical with respect to Lemma 5.1. Then we have a conditional integral representation as in Claim 4.1, but now we can only disintegrate P_x = ∫ (P_x)_y dµ(y). Since we have Lemma 5.1 at our disposal (so that an analogue of (21) holds for P_x instead of P), we still have that for every τ ∈ P(R) with dim τ ≥ 1 − δ, for µ almost every y, dim τ ∗ P₁ν_y = 1, as the calculation carried out during the last stage of the proof of Theorem 1.2 goes through in this case as well. It follows that ν = λ × µ. Finally, since this is true for Q almost every µ_ω and for µ_ω almost every x, this is also true for µ almost every x (recall that µ = ∫ µ_ω dQ(ω)).

The case when n is independent of p follows by a similar argument; only here, for Q almost every ω, µ_ω is pointwise n-normal, since this is true for µ by Theorem 1.10 in [13].

In this Section we prove the following generalization of Theorem 1.1:
Theorem 6.1.
Let µ be a T_p invariant ergodic measure with dim µ > 0. Let m > n > 1 be integers such that m is independent of p, and let f, g ∈ Aff(R) be such that f([0,1]), g([0,1]) ⊆ [0,1].

1. If n = p then

(1/N) Σ_{i=0}^{N−1} δ_{(T_m^i f(x), T_n^i x)} → λ × µ, for µ almost every x.

2. If n is independent of p then

(1/N) Σ_{i=0}^{N−1} δ_{(T_m^i f(x), T_n^i g(x))} → λ × λ, for µ almost every x.

The proof is similar to the proof of Theorem 1.1. In particular, it relies on the following generalization of Theorem 1.2:
Theorem 6.2.
Let µ ∈ P([0,1]) be a probability measure, f, g ∈ Aff(R), and m > n > 1 be integers, such that:

1. The measure µ generates a non-trivial S-ergodic distribution P ∈ P(P([−1,1])). (Footnote: here, we use the fact that the commutative phase measure from Theorem 8.2 in [13] has dimension 1, as proven in Section 8.3.)

2. The pure point spectrum Σ(P, S) does not contain a non-zero integer multiple of 1/log m.

3. The measure gµ is pointwise generic under T_n for an ergodic and continuous measure ρ, and f([0,1]), g([0,1]) ⊆ [0,1].

Then

(1/N) Σ_{i=0}^{N−1} δ_{(T_m^i f(x), T_n^i g(x))} → λ × ρ, for µ almost every x.   (22)

For this to work, we need the following version of Claim 2.8. Let f, g ∈ Aff(R). For every k ∈ N, define

A_k = { x ∈ R : f⁻¹ D_{m^k}(f(x)) ⊄ g⁻¹ D_{n^k}(g(x)) }.

Claim 6.3. Suppose that µ ∈ P([0,1]) is a measure such that gµ is pointwise generic under T_n for a continuous measure ρ. Then for µ almost every x, if x ∈ lim sup A_k and {n_k} denotes the times at which x ∈ A_{n_k}, then the density of {n_k} is zero.

The proof is analogous to that of Claim 2.8.
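As a numerical illustration of the sets A_k (reading the relation in their definition as non-containment), one can estimate, for random points x, how often the pulled-back base-m cell around x fails to sit inside the pulled-back base-n cell. The affine maps f, g and all parameters below are arbitrary choices made for this experiment; since m > n, the failure probability at level k decays geometrically, so the exceptional levels should be sparse:

```python
from fractions import Fraction
import random

# Arbitrary affine maps with f([0,1]), g([0,1]) contained in [0,1].
def f(x): return Fraction(1, 2) * x + Fraction(1, 4)   # f^{-1}(t) = 2t - 1/2
def g(x): return Fraction(1, 3) * x + Fraction(1, 3)   # g^{-1}(t) = 3t - 1

def cell(z, base, k):
    """Endpoints of the level-k cell [j/base^k, (j+1)/base^k) containing z."""
    j = int(z * base**k)
    return Fraction(j, base**k), Fraction(j + 1, base**k)

m, n, K = 3, 2, 40
random.seed(1)
fractions_bad = []
for _ in range(50):
    x = Fraction(random.randrange(10**12), 10**12)
    bad = 0
    for k in range(1, K + 1):
        a0, a1 = cell(f(x), m, k)
        b0, b1 = cell(g(x), n, k)
        I = (2 * a0 - Fraction(1, 2), 2 * a1 - Fraction(1, 2))  # f^{-1} of the m-cell
        J = (3 * b0 - 1, 3 * b1 - 1)                            # g^{-1} of the n-cell
        if not (J[0] <= I[0] and I[1] <= J[1]):                 # x lies in A_k
            bad += 1
    fractions_bad.append(bad / K)
# For typical x, the levels k with x in A_k should be rare (density zero as K grows).
assert sum(fractions_bad) / len(fractions_bad) < 0.25
```

This is only a plausibility check of the density-zero statement, not a substitute for the proof via Claim 2.8.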
Proof of Theorem 6.2
The proof follows essentially the same steps as the proof of Theorem 1.2. Let ν be some accumulation point of the orbit under T_m × T_n of δ_{(f(x), g(x))}, where x is drawn according to µ.

• By ([13], Theorem 1.1) we have P₁ν = λ. By our assumption on gµ, P₂ν = ρ.

• A complete analogue of Claim 4.1 holds in this case as well. First, we disintegrate P according to ρ, in a similar manner to the first part of the proof of Claim 4.1. Here, we make use of the fact that fµ generates and log m-generates P, which follows by ([13], Lemma 4.16). Secondly, we embed µ on a line in T² by pushing it forward via the map x ↦ (f(x), g(x)) (recall that we are assuming that both f and g map [0,1] to [0,1]). Calling this new measure µ̃, and using the same notation as in Claim 4.1, we have

ν = lim_{k→∞} (1/N_k) Σ_{i=0}^{N_k−1} µ̃^{A_i(f(x), g(x))}

by an application of Theorem 2.7. Also, by applying Claim 6.3, we see that there is a set S ⊆ N of density 1 (possibly depending on the x we chose according to µ), such that for every i ∈ S the measure µ̃|_{A_i(f(x), g(x))} is an affine image of the measure µ|_{f⁻¹D_{m^i}(f(x))}. Thus, we obtain an analogue of (20). From here, we complete the proof as in the proof of Claim 4.1.

• We finish the proof of the Theorem by showing that ν meets the conditions of Claim 3.4. The proof is essentially the same as in the case of Theorem 1.2.

Proof of Theorem 6.1
Since we have Theorem 6.2 at our disposal, the proof is now essentially the same as the proof of Theorem 1.1. We remark that an analogue of Lemma 5.1 remains true in this case as well, which may be deduced from the results of ([13], Section 8.4).
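The following is a numerical sanity check in the spirit of Theorem 1.1 (an illustration only; the choice of measure, m = 4, p = 3, truncation depth, orbit length and seed are all made up for the experiment). Take µ to be the standard middle-thirds Cantor measure: it is T_3-invariant, ergodic, with entropy log 2 > 0, and 4 is independent of 3. The theorem then predicts that for µ-typical x the first coordinate of the (×4, ×3)-orbit of (x, x) equidistributes for Lebesgue measure; the second coordinate merely shifts the base-3 digits, whose statistics are those of µ by construction.

```python
import random

# A point that is typical for the middle-thirds Cantor measure:
# i.i.d. base-3 digits, each 0 or 2 with probability 1/2.
random.seed(2)
J = 4000                                    # base-3 digits kept (truncation depth)
den = 3**J
num = sum(random.choice((0, 2)) * 3**(J - 1 - j) for j in range(J))  # x = num/3^J

# Orbit of x under T_4: the numerator evolves as num -> 4*num mod 3^J.
# After N = 2000 steps the truncation error is at most 4^N / 3^J, which is tiny.
N = 2000
vals = []
for _ in range(N):
    vals.append(num / den)                  # float value of T_4^i(x)
    num = (4 * num) % den

mean = sum(vals) / N                        # ~1/2 if the orbit equidistributes
freq = sum(1 for v in vals if v < 0.5) / N  # ~1/2 of the orbit in [0, 1/2)
assert abs(mean - 0.5) < 0.1 and abs(freq - 0.5) < 0.1
```

Repeating this with other seeds, or with larger N and J, gives the same picture; it is a plausibility check only, not evidence of the almost-sure statement.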
References

[1] Tim Bedford and Albert M. Fisher. Ratio geometry, rigidity and the scenery process for hyperbolic Cantor sets. Ergodic Theory Dynam. Systems, 17(3):531–564, 1997.
[2] Thomas Bogenschütz and Hans Crauel. The Abramov-Rokhlin formula. In Ergodic theory and related topics, III (Güstrow, 1990), volume 1514 of Lecture Notes in Math., pages 32–35. Springer, Berlin, 1992.
[3] J. W. S. Cassels. On a problem of Steinhaus about normal numbers. Colloq. Math., 7:95–101, 1959.
[4] Kenneth Falconer. Techniques in fractal geometry. John Wiley & Sons, Ltd., Chichester, 1997.
[5] Kenneth J. Falconer. The geometry of fractal sets, volume 85. Cambridge University Press, 1986.
[6] J. Feldman and M. Smorodinsky. Normal numbers from independent processes. Ergodic Theory Dynam. Systems, 12(4):707–712, 1992.
[7] Harry Furstenberg. Disjointness in ergodic theory, minimal sets, and a problem in Diophantine approximation. Theory of Computing Systems, 1(1):1–49, 1967.
[8] Harry Furstenberg. Intersections of Cantor sets and transversality of semigroups. In Problems in analysis (Sympos. Salomon Bochner, Princeton Univ., Princeton, N.J., 1969), pages 41–59. Princeton Univ. Press, Princeton, N.J., 1970.
[9] Hillel Furstenberg. Ergodic fractal measures and dimension conservation. Ergodic Theory Dynam. Systems, 28(2):405–422, 2008.
[10] Matan Gavish. Measures with uniform scaling scenery. Ergodic Theory Dynam. Systems, 31(1):33–48, 2011.
[11] Michael Hochman. Dynamics on fractals and fractal distributions. arXiv preprint arXiv:1008.3731, 2010.
[12] Michael Hochman. Geometric rigidity of ×m invariant measures. J. Eur. Math. Soc. (JEMS), 14(5):1539–1563, 2012.
[13] Michael Hochman and Pablo Shmerkin. Equidistribution from fractal measures. Inventiones mathematicae, 202(1):427–479, 2015.
[14] Bernard Host. Nombres normaux, entropie, translations. Israel J. Math., 91(1-3):419–428, 1995.
[15] Bernard Host. Some results of uniform distribution in the multidimensional torus. Ergodic Theory Dynam. Systems, 20(2):439–452, 2000.
[16] Brian R. Hunt and Vadim Yu. Kaloshin. How projections affect the dimension spectrum of fractal measures. Nonlinearity, 10(5):1031–1046, 1997.
[17] Aimee S. A. Johnson. Measures on the circle invariant under multiplication by a nonlacunary subsemigroup of the integers. Israel J. Math., 77(1-2):211–240, 1992.
[18] R. Kenyon and Y. Peres. Measures of full dimension on affine-invariant sets. Ergodic Theory Dynam. Systems, 16(2):307–323, 1996.
[19] Elon Lindenstrauss. p-adic foliation and equidistribution. Israel J. Math., 122:29–42, 2001.
[20] Elon Lindenstrauss, David Meiri, and Yuval Peres. Entropy of convolutions on the circle. Ann. of Math. (2), 149(3):871–904, 1999.
[21] Russell Lyons. On measures simultaneously 2- and 3-invariant. Israel J. Math., 61(2):219–224, 1988.
[22] Pertti Mattila. Fourier analysis and Hausdorff dimension, volume 150 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2015.
[23] David Meiri. Entropy and uniform distribution of orbits in T^d. Israel J. Math., 105:155–183, 1998.
[24] David Meiri and Yuval Peres. Bi-invariant sets and measures have integer Hausdorff dimension. Ergodic Theory Dynam. Systems, 19(2):523–534, 1999.
[25] Peter Mörters and David Preiss. Tangent measure distributions of fractal measures. Math. Ann., 312(1):53–93, 1998.
[26] Daniel J. Rudolph. ×2 and ×3 invariant measures and entropy. Ergodic Theory Dynam. Systems, 10(2):395–406, 1990.
[27] Wolfgang M. Schmidt. On normal numbers. Pacific J. Math., 10:661–672, 1960.
[28] David Simmons. Conditional measures and conditional expectation; Rohlin's disintegration theorem. Discrete Contin. Dyn. Syst., 32(7):2565–2582, 2012.
[29] Elias M. Stein and Rami Shakarchi. Complex analysis, volume 2 of Princeton Lectures in Analysis. Princeton University Press, Princeton, NJ, 2003.
[30] Peter Walters. An introduction to ergodic theory, volume 79 of Graduate Texts in Mathematics. Springer-Verlag, New York-Berlin, 1982.
[31] U. Zähle. Self-similar random measures. I. Notion, carrying Hausdorff dimension, and hyperbolic distribution. Probab. Theory Related Fields, 80(1):79–100, 1988.
Einstein Institute of Mathematics, Edmond J. Safra Campus, The Hebrew University of Jerusalem, Givat Ram, Jerusalem, 9190401, Israel.
E-mail address [email protected]@mail.huji.ac.il