The distribution of rational numbers on Cantor's middle thirds set
ALEXANDER D. RAHM, NOAM SOLOMON, TARA TRAUTHWEIN, AND BARAK WEISS
Abstract.
We give a heuristic argument predicting that the number $N^*(T)$ of rationals $p/q$ on Cantor's middle thirds set $\mathcal{C}$ such that $\gcd(p,q) = 1$ and $q \le T$ has asymptotic growth $O(T^{d+\varepsilon})$, for $d = \dim \mathcal{C}$. We also describe extensive numerical computations supporting this heuristic. Our heuristic predicts a similar asymptotic if $\mathcal{C}$ is replaced with any similar fractal with a description in terms of missing digits in a base expansion. Interest in the growth of $N^*(T)$ is motivated by a problem of Mahler on intrinsic Diophantine approximation on $\mathcal{C}$.

1. Introduction
Let $\mathcal{C}$ denote Cantor's middle thirds set, i.e. all numbers represented as $x = \sum_{i=1}^{\infty} a_i 3^{-i}$ with $a_i = a_i(x) \in \{0, 2\}$ for all $i$. Let $N^*(T)$ denote the number of rationals of the form $p/q$, with $p$ and $q$ coprime, which belong to $\mathcal{C}$ and for which $0 < q \le T$. Motivated by questions in Diophantine approximation, our goal will be to understand the asymptotic growth rate of $N^*(T)$.

Everything we will say in the sequel will apply with minor modifications to a more general situation in which $\mathcal{C}$ is the set of numbers defined by a restriction in a digital expansion, i.e. for some integer $b \ge 3$ and some subset $F$ of $\{0, \dots, b-1\}$ we will let $\mathcal{C}$ denote the set of numbers $x = \sum_{i=1}^{\infty} a_i b^{-i}$ with all $a_i \in F$. To simplify notation we will stick throughout to the standard ternary set. When writing a rational as $p/q$ we always assume that $p$ and $q$ are coprime.

Fix $c \in (0,1)$, let $I_T$ denote the interval $[(1-c)T, T]$ and let
\[ N(T) \stackrel{\mathrm{def}}{=} \#\left\{ \tfrac{p}{q} \in \mathcal{C} : q \in I_T \right\}, \]
\[ \tilde N(T) \stackrel{\mathrm{def}}{=} \#\left\{ \tfrac{p}{q} \in \mathcal{C} \text{ purely periodic} : q \in I_T \right\}, \]
\[ \tilde N^*(T) \stackrel{\mathrm{def}}{=} \#\left\{ \tfrac{p}{q} \in \mathcal{C} : 0 < q \le T,\ \tfrac{p}{q} \text{ is purely periodic} \right\}. \]
Note that these quantities depend on $c$ but this will be suppressed from the notation. The notations $A(T) = O(B(T))$ and $A(T) \ll B(T)$ mean that $A(T)/B(T)$ is bounded above by a positive constant, and $A(T) \asymp B(T)$ means that $A(T) \ll B(T) \ll A(T)$.

Conjecture 1.
Let $d$ be the Hausdorff dimension of $\mathcal{C}$, i.e. $d = \log 2/\log 3$, and in the general case, $d = \log |F| / \log b$. For each $\varepsilon > 0$ we have $\tilde N(T) = O(T^{d+\varepsilon})$.

Date: September 4, 2019.
This conjecture was also made by Broderick, Fishman and Reich in [BFR]. An upper bound $N(T) = O(T^{2d})$ was obtained by Schleischitz in [Sch, Thm. 4.1]. Our heuristic actually predicts a more precise upper bound for $\tilde N(T)$, see Remark 4.1. The exponent $d$ is optimal in view of Proposition 3.3.

Since numbers in $\mathcal{C}$ are explicitly given in terms of their base 3 expansion, it is possible to count them as a function of the complexity of their base 3 expansions. But this says nothing about the denominator $q$ in reduced form; it may happen that a rational with a complicated base 3 expansion corresponds to a reduced fraction $p/q$ with $q$ small. The basic heuristic principle behind Conjecture 1 is that the two events of having a small denominator relative to the complexity of the base 3 expansion, and of belonging to $\mathcal{C}$, are probabilistically independent. We will make this heuristic more precise below.

Some computational evidence for Conjecture 1 is given in [BFR]. Our goal in this paper is to present more evidence supporting it. We will prove that the conjectured asymptotics are lower bounds for $N^*(T)$ and $\tilde N^*(T)$; we will describe extensive computations consistent with this conjecture; and we will discuss the heuristic motivating Conjecture 1, exhibiting some numerical results which lend some support to this heuristic.

Organization of the paper. In §2 we give motivation and historical background. In §3 we set up notation and prove lower bounds for $N^*(T)$ and $\tilde N^*(T)$. We also explain that the main quantity of interest is $\tilde N(T)$. In §4 we present the heuristic leading to the conjectured bound on $\tilde N(T)$. Some oversimplifications in the probabilistic models lead to incorrect predictions, and we modify the model slightly in §5, without changing the prediction for $\tilde N(T)$. We discuss fluctuations and the relation of expectations to asymptotic behavior, in §6.

Acknowledgements.
A. Rahm would like to thank Gabor Wiese and the University of Luxembourg for funding his research.
Noam’s funding: To be updated.
The research of B. Weiss was supported by ISF grant 2095/15 and BSF grant 2016256.

2. Motivation and historical background
The classical problem in Diophantine approximation may be formulated as follows. Given a decreasing function $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ and a real number $x$, are there infinitely many rationals $p/q$ such that $|x - p/q| < \varphi(q)$? In case this holds one says that $x$ is $\varphi$-approximable. For some choices of $x$ and $\varphi$, determining whether $x$ is $\varphi$-approximable is considered hopelessly difficult (e.g. $\varphi(q)$ a small constant multiple of $1/q^2$, with $x = 2^{1/3}$ or $\pi$); a fruitful line of research is to fix $\varphi$ and ask about the measure of $\varphi$-approximable numbers, with respect to some measure. Some classical results in Diophantine approximation are:

(Dirichlet) Every $x$ is $1/q^2$-approximable.

(Khinchin) With respect to Lebesgue measure, if $\sum q\varphi(q)$ converges then almost no $x$ is $\varphi$-approximable, and if $\sum q\varphi(q)$ diverges then almost every $x$ is $\varphi$-approximable.

(Jarník) The set
\[ \mathrm{BA} \stackrel{\mathrm{def}}{=} \left\{ x : \exists c > 0 \text{ s.t. } x \text{ is not } c/q^2\text{-approximable} \right\} \]
has Hausdorff dimension 1, but Lebesgue measure zero.

One measure to consider in place of Lebesgue measure in such statements is the coin tossing measure (assigning equal probability 1/2 to the digits 0, 2 in base 3 expansion) on Cantor's ternary set $\mathcal{C}$. We give a brief list of activity concerning this type of question.

In 1984, Mahler [M] asked how well numbers in $\mathcal{C}$ can be approximated
(i) by rationals in $\mathbb{R}$;
(ii) by rationals in $\mathcal{C}$.

Question (i) can be formalized in various ways, e.g. for which functions $\varphi$ does $\mathcal{C}$ contain $\varphi$-approximable numbers? For which $\varphi$ is almost every number in $\mathcal{C}$ (with respect to the natural coin-tossing measure) $\varphi$-approximable? For which $\varphi$ is the set of numbers in $\mathcal{C}$ which are $\varphi$-approximable of the same Hausdorff dimension as that of $\mathcal{C}$? There has been a lot of recent activity concerning these and similar questions, see [W, F, LSV, Bu, SW] and the references therein.

Question (ii), which is referred to as an intrinsic approximation problem, has not been nearly as well-studied.
Broderick, Fishman and Reich [BFR] proved an analogue of Dirichlet's theorem for Cantor sets and other missing digit sets. Fishman and Simmons [FS] extended the main result of [BFR] to a more general class of fractal subsets of $\mathbb{R}$. A major difficulty in intrinsic approximation problems is that there is no reasonable understanding of the growth of the functions $N(T)$, $\tilde N(T)$ described above; bounds on these functions will yield some progress on Mahler's question (ii). In particular, Conjecture 1 implies (see [BFR] for the derivation):

Conjecture 2. For almost every $x \in \mathcal{C}$, with respect to the coin-tossing measure, for any $\varepsilon > 0$, there are only finitely many rationals $p/q \in \mathcal{C}$ such that
\[ \left| x - \frac{p}{q} \right| < \frac{1}{q^{1+\varepsilon}}. \tag{2.1} \]

It was shown in [BFR] that for each $x \in \mathcal{C}$, there are infinitely many $p/q \in \mathcal{C}$ for which $|x - p/q| < q^{-1} (\log q)^{-1/d}$. Thus the exponent in (2.1) cannot be improved.

3. Notation, basic observations, and a lower bound
The number $x = \sum_{i=1}^{\infty} a_i(x) 3^{-i}$ is rational if and only if the sequence $(a_i(x))_{i \ge 1}$ is eventually periodic, i.e. there are integers $i_0 = i_0(x) \ge 0$ and $\ell = \ell(x) > 0$, called respectively the length of initial block and period, such that
\[ a_i(x) = a_{i+\ell}(x) \quad \text{for all } i > i_0, \tag{3.1} \]
and (3.1) does not hold for any smaller $i_0$ or $\ell$. We say that $x$ is purely periodic if $i_0 = 0$. It is elementary to verify the following (see also [BFR, Lemma 2.3]):
Proposition 3.1. Suppose $x$ is a rational in $\mathcal{C}$, with $(a_i)$, $i_0$ and $\ell$ as above. Then we may write $x = P/Q$ where
\[ P = \sum_{j=1}^{i_0} a_j 3^{i_0+\ell-j} - \sum_{j=1}^{i_0} a_j 3^{i_0-j} + \sum_{j=1}^{\ell} a_{i_0+j} 3^{\ell-j}, \quad \text{and} \quad Q = 3^{i_0}(3^{\ell} - 1) \]
(this fraction need not be reduced). In particular:
• If $x$ is a rational in $\mathcal{C}$ with period $\ell$ and initial block of length $i_0$, then there is an integer $N$ such that $3^{i_0} x - N$ is a purely periodic rational in $\mathcal{C}$ with period $\ell$.
• If $x = p/q$ is purely periodic where $\gcd(p,q) = 1$, then $q$ is a divisor of $3^{\ell} - 1$ and $\ell$ is the order of 3 in the multiplicative group $(\mathbb{Z}/q\mathbb{Z})^{\times}$.

As mentioned above, throughout this paper, the notation $x = p/q$ will mean that $x$ is a reduced rational in $\mathcal{C}$, i.e. $\gcd(p,q) = 1$. The notation $x = P/Q$ will mean that $x$ is a rational in $\mathcal{C}$, not necessarily reduced.

The following proposition follows from standard calculations and is left to the reader.

Proposition 3.2.
Fix $c, c' \in (0,1)$ and define $\tilde N(T)$ and $\tilde N'(T)$ using $c$ and $c'$ respectively. Fix $\varepsilon > 0$. If $\tilde N(T) \ll T^{d+\varepsilon}$ then the same holds for $\tilde N'(T)$, $N(T)$, $\tilde N^*(T)$, and $N^*(T)$.

Proposition 3.3.
There is $c > 0$ such that for all $T > 3$ we have $\tilde N^*(T) \ge T^d/2$ and $N^*(T) \ge c \log(T)\, T^d$.

Proof. Let $\ell = \lfloor \log_3 T \rfloor \ge 1$, i.e. $T \in [3^{\ell}, 3^{\ell+1}]$. There are $2^{\ell}$ purely periodic Cantor rationals of the form $P/Q$ with $Q = 3^{\ell} - 1$. Bringing them to reduced form, they are of the form $p/q$ with $q \le T$. In particular
\[ \tilde N^*(T) \ge 2^{\ell} = \left(3^{\ell+1}\right)^d / 2 \ge T^d/2. \]
Similarly any rational of the form $P/Q$ where $Q = 3^{i_0}(3^{\ell - i_0} - 1)$ will contribute to $N^*(T)$. For each such $Q$, there are $2^{\ell - i_0}$ possibilities for the digits in the periodic part of $P/Q$, and $2^{i_0}$ for the digits in the initial block. An exercise involving the inclusion/exclusion principle (which we omit) implies that the repetition in this counting is negligible, i.e., up to a constant, the number of distinct rationals $P/Q$ written in this form is at least $\ell\, 2^{\ell}$. This proves the claim. □

4. The heuristic
In this section we justify an upper bound of the form $\tilde N(T) = O(T^{d+\varepsilon})$. Our approach is to assign to each reduced rational $p/q$ a probability that it belongs to $\mathcal{C}$, and bound the expectation of the random variable $\tilde N(T)$ with respect to this probability.

Let $Q = 3^{\ell} - 1$ and consider the rationals $P/Q$ in the interval $[0,1]$. There are $3^{\ell}$ such rationals, and of these, $2^{\ell}$ belong to $\mathcal{C}$. By Proposition 3.1, they are precisely the purely periodic Cantor rationals with period dividing $\ell$. That is, fixing $Q$, the proportion of rationals $P/Q \in [0,1]$ which belong to $\mathcal{C}$ is $(2/3)^{\ell}$.

Motivated by this we define our probabilistic model. By Proposition 3.1, $p/q \in \mathcal{C}$ is purely periodic if and only if 3 does not divide $q$. For each rational $p/q \in [0,1]$ with $q$ not divisible by 3, our model stipulates:

(*) The probability that $p/q \in \mathcal{C}$ is $(2/3)^{\ell}$, where $\ell = \ell(q)$ is the smallest number for which $Q = 3^{\ell} - 1$ is divisible by $q$; the events $p/q \in \mathcal{C}$ are completely independent.

Note that $\ell$ is the order of 3 in the multiplicative group $C_q \stackrel{\mathrm{def}}{=} (\mathbb{Z}/q\mathbb{Z})^{\times}$. Let $\phi(q) = \# C_q$ be the Euler number of $q$. We may take representatives of elements of $C_q$ to be the integers $p$ between 0 and $q - 1$ coprime to $q$, so we find that the expected number of $p/q$ in $\mathcal{C}$ with fixed denominator $q$ is $\phi(q) (2/3)^{\ell(q)}$. Thus:
\[ \mathbb{E}\left(\tilde N(T)\right) = \sum_{q \in I_T} \phi(q) \left(\frac{2}{3}\right)^{\ell(q)} \le \sum_{q \in I_T} T \left(\frac{2}{3}\right)^{\ell(q)} = T \sum_{\ell \ge \log_3 T + c'} L(\ell, T) \left(\frac{2}{3}\right)^{\ell}, \tag{4.1} \]
where $L(\ell, T) = \#\{ q \in I_T : \ell(q) = \ell \}$ and $c' = \log_3(1-c)$. The lower limit of summation reflects the fact that $q \ge (1-c)T$ divides $3^{\ell(q)} - 1$, so that $3^{\ell(q)} > (1-c)T$.

We now need to bound the terms $L(\ell, T)$. First we choose $\lambda = \frac{2-d}{1-d}$. For $\ell \ge \lambda \log_3 T$ we can use the trivial bound $L(\ell, T) \le T$, since
\[ T^2 \sum_{\ell \ge \lambda \log_3 T} \left(\frac{2}{3}\right)^{\ell} \asymp T^{2 - \lambda + \lambda d} = T^d. \]
So it only remains to show
\[ T \sum_{\ell = \log_3 T + c'}^{\lambda \log_3 T} L(\ell, T) \left(\frac{2}{3}\right)^{\ell} = O(T^{d+\varepsilon}). \tag{4.2} \]
For $\ell \in [\log_3 T + c', \lambda \log_3 T]$, we use the obvious inequality $L(\ell, T) \le \tau(3^{\ell} - 1)$, where $\tau(n)$ denotes the number of divisors of $n$. It is well-known that
\[ \tau(n) \le 2^{(1 + o(1)) \log n / \log\log n}. \tag{4.3} \]
In our situation we have $3^{\ell} - 1 \le T^{\lambda}$, so
\[ \tau\left(3^{\ell} - 1\right) \le 2^{2\lambda \log T / \log\log T} \le T^{2\lambda / \log\log T}, \]
implying
\[ T \sum_{\ell = \log_3 T + c'}^{\lambda \log_3 T} L(\ell, T) \left(\frac{2}{3}\right)^{\ell} \le T \cdot \lambda \log_3 T \cdot T^{2\lambda/\log\log T} \left(\frac{2}{3}\right)^{\log_3 T + c'} \ll \log T \cdot T^{d + 2\lambda/\log\log T}, \]
from which (4.2) follows.
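The expectation on the left of (4.1) can be evaluated directly for small $T$. The following Python sketch is only an illustration and is not the authors' code: the function names `euler_phi`, `order_of_3` and `expected_count`, and the default value `c = 0.5`, are assumptions made here. The sum is restricted to $q$ not divisible by 3, as in the statement of model (*).

```python
import math


def euler_phi(n: int) -> int:
    """Euler's totient function, computed by trial division."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result


def order_of_3(q: int) -> int:
    """Smallest l >= 1 such that q divides 3**l - 1 (requires 3 not dividing q)."""
    if q % 3 == 0:
        raise ValueError("q must not be divisible by 3")
    power, l = 3 % q, 1
    while power != 1 % q:  # 1 % q handles the degenerate case q == 1
        power = (3 * power) % q
        l += 1
    return l


def expected_count(T: float, c: float = 0.5) -> float:
    """Sum of phi(q) * (2/3)**l(q) over q in [(1-c)T, T] with 3 not dividing q."""
    lo = max(1, math.ceil((1 - c) * T))
    hi = math.floor(T)
    return sum(
        euler_phi(q) * (2.0 / 3.0) ** order_of_3(q)
        for q in range(lo, hi + 1)
        if q % 3 != 0
    )


if __name__ == "__main__":
    d = math.log(2) / math.log(3)  # Hausdorff dimension of the Cantor set
    for T in (100, 1000):
        print(T, expected_count(T), T ** d)
```

Printing the expectation next to $T^d$ for a few values of $T$ gives a rough feeling for the size of the sum; the naive order computation limits this sketch to modest $T$.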
Remark 4.1.
1. In (4.1) we used the inequality $\phi(q) \le q \le T$. But in fact it is well-known that on average $\phi(q) \asymp q$, so we actually expect
\[ \tilde N(T) \asymp T \sum_{\ell = \log_3 T + c'}^{\lambda \log_3 T} L(\ell, T) \left(\frac{2}{3}\right)^{\ell}. \tag{4.4} \]
2. Our arguments show that the right hand side of (4.4) behaves like $O\left(\log T \cdot T^{d + 2\lambda/\log\log T}\right)$. In estimating the cardinality of $L(\ell, T)$ we used the bound (4.3), which is optimal for a general $n$. However it may be that for numbers of the form $n = 3^{\ell} - 1$ a better bound exists, see [E] for related results. If so then our heuristic would predict a better bound for $\tilde N(T)$.

5. A revised model
The heuristic above relied on the basic statement (*). However this assumption leads to some clearly incorrect predictions, namely:

(i) (Primitive words) In deriving (*) we calculated the frequency of purely periodic rationals with period dividing $\ell$, belonging to $\mathcal{C}$. It would have been more precise to count the purely periodic rationals with period exactly $\ell$, belonging to $\mathcal{C}$. By Proposition 3.1, rationals with period exactly $\ell$ correspond to primitive words $w$ in the alphabet $\{0, 1, 2\}$ of length $\ell$, i.e. those $w$ for which there is no proper divisor $k$ of $\ell$ such that $w$ is a concatenation of identical words of length $k$. A standard application of the inclusion/exclusion principle gives that the number of primitive words of length $\ell$ from an alphabet of size $a$ is
\[ m(\ell, a) \stackrel{\mathrm{def}}{=} \sum_{d \mid \ell} \mu\left(\frac{\ell}{d}\right) a^d, \tag{5.1} \]
where $\mu$ is the Möbius function.

(ii) (Multiples of $\ell$) Fix $q$ and let
\[ N_q = \#\{ p : p/q \in \mathcal{C} \}, \tag{5.2} \]
and let $\ell = \ell(q)$. Since $\mathcal{C}$ is invariant under multiplication by 3 mod 1, whenever $p/q \in \mathcal{C}$ we also have $p'/q \in \mathcal{C}$, where $p' = 3p \bmod q$. This means that the set $\{ p : p/q \in \mathcal{C} \}$ consists of orbits for the action of 3 on $C_q$, and in particular, $\ell$ divides $N_q$.

(iii) (Divisibility by 2) Let $Q = 3^{\ell} - 1$. Our model predicts that there are $\phi(Q)(2/3)^{\ell}$ rationals in $\mathcal{C}$ with denominator $Q$, coming from $P \in \{1, \dots, Q-1\}$ such that $P/Q$ belongs to $\mathcal{C}$ and $\gcd(P, Q) = 1$. However $Q$ is even, and if $P/Q$ is in $\mathcal{C}$ then $P$ is even as well, since it may be written in base 3 using the letters 0 and 2 only. That is, the actual number is zero. A similar observation holds for any $q$ which divides $Q = 3^{\ell} - 1$ but not $Q/2$.

Given $q$, let $H$ be the group generated by 3 in $C_q$. By observation (ii), for each coset $X \in C_q/H$, all numbers of the form $p/q$, $p \in X$, simultaneously belong or do not belong to $\mathcal{C}$; if they all do, we will write $X \in \mathcal{C}$. With this notation, our revised model stipulates that:

(**) Suppose $q$ is not divisible by 3 and divides $(3^{\ell} - 1)/2$, where $\ell = \ell(q)$. For each $X \in C_q/H$, the probability that $X \in \mathcal{C}$ is $\frac{m(\ell, 2)}{\bar m(\ell, 3)}$, where $m(\ell, a)$ is defined by (5.1) and $\bar m(\ell, a)$ is the number of primitive words of length $\ell$ in an alphabet of size $a$ (here the symbols $\{0, 1, 2\}$) defining even numbers.

Note that our choice of probability takes into account (i) and (iii). It is not hard to show that
\[ \frac{m(\ell, 2) / \bar m(\ell, 3)}{2\,(2/3)^{\ell}} \to_{\ell \to \infty} 1, \]
and using this, that the arguments given in §4 apply to the revised model as well. When $\ell$ is prime, it is easy to check using (5.1) and the definition of $\bar m$ that the difference between $2 (2/3)^{\ell}$ and $\frac{m(\ell, 2)}{\bar m(\ell, 3)}$ is negligible. Nevertheless, when testing our heuristic, there will be a difference between models (*) and (**). For sufficiently small values of $q$ we have computed the actual values of $N_q$ as defined in (5.2), and one may compare them to the number
\[ \mathrm{MLO}(q) \stackrel{\mathrm{def}}{=} \mathrm{round}\left( \phi(q) \cdot \frac{m(\ell, 2)}{\bar m(\ell, 3)} \right). \tag{5.3} \]
See Figures 1 and 2.

The notation $\mathrm{round}(x)$ stands for the closest integer to $x$, and the letters MLO stand for most likely outcome, since there is no other number more likely to occur as the value of $N_q$ under probabilistic model (**).

Using inclusion/exclusion and Möbius inversion, one can show (for more details see [T]) that the number of even (as numbers in base 3) primitive words of length $\ell$ with symbols in the alphabet $\{0, 1, \dots, a-1\}$ is
\[ \sum_{d \mid \ell,\ \ell/d \text{ even}} \mu\left(\frac{\ell}{d}\right) a^d + \sum_{d \mid \ell,\ \ell/d \text{ odd}} \mu\left(\frac{\ell}{d}\right) \left\lceil \frac{a^d}{2} \right\rceil. \]
As a consequence one obtains a simple formula for $\bar m(\ell, 3)$, which makes it possible to compute $\mathrm{MLO}(q)$ and hence to plot Figures 1 and 2. As can be seen in the Figures, within the range of our database of Cantor rationals, both models (*) and (**) give good approximations for the number of purely periodic Cantor rationals.
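The two counting formulas of this section, (5.1) and the formula for even primitive words, are easy to check by machine for small $\ell$. The Python sketch below is illustrative only: the function names are assumptions made here, and the brute-force routine is included purely as a sanity check for the alphabet $\{0, 1, 2\}$.

```python
from itertools import product


def mobius(n: int) -> int:
    """Moebius function mu(n), by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def divisors(n: int) -> list:
    return [d for d in range(1, n + 1) if n % d == 0]


def primitive_words(l: int, a: int) -> int:
    """m(l, a) from (5.1): primitive words of length l over a letters."""
    return sum(mobius(l // d) * a ** d for d in divisors(l))


def even_primitive_words(l: int, a: int) -> int:
    """Even primitive words of length l, via the two-sum formula in the text."""
    total = 0
    for d in divisors(l):
        if (l // d) % 2 == 0:
            total += mobius(l // d) * a ** d
        else:
            total += mobius(l // d) * ((a ** d + 1) // 2)  # ceil(a**d / 2)
    return total


def brute_force_even_primitive(l: int) -> int:
    """Direct enumeration over {0,1,2}: primitive words whose base-3 value is even."""
    count = 0
    for w in product(range(3), repeat=l):
        value = 0
        for digit in w:
            value = 3 * value + digit
        primitive = all(
            w != w[:k] * (l // k) for k in range(1, l) if l % k == 0
        )
        if primitive and value % 2 == 0:
            count += 1
    return count


def model_probability(l: int) -> float:
    """Probability m(l, 2) / m_bar(l, 3) used in model (**)."""
    return primitive_words(l, 2) / even_primitive_words(l, 3)
```

For large $\ell$ the value of `model_probability(l)` can be compared with $2(2/3)^{\ell}$ to see the convergence claimed above.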
The fit is not perfect though, and the plots reveal other interesting features. We try to explain some of these below.

6. Remarks on fluctuations, Bourgain's theorem, and symmetries
6.1. Deviations from the mean. An obvious objection to the line of reasoning presented above is that our prediction for $\tilde N(T)$ is based on bounds on its expectation. That is, we have shown that our heuristic implies $\mathbb{E}(\tilde N(T)) = O(T^{d+\varepsilon})$, but in order to justify $\tilde N(T) = O(T^{d+\varepsilon})$ one needs additional arguments, which we now briefly indicate. If for some $\varepsilon > 0$ there are infinitely many $T$ for which $\tilde N(T) \ge T^{d+\varepsilon}$, then (possibly modifying the constants $\varepsilon$ and $c$) we can take this to be a subsequence of the numbers of the form $T_k = (1+c)^k$. For each $k$ we let $X_k$ denote the random variable, in model (*), counting the number of $p/q \in \mathcal{C}$ with $q \in I_{T_k}$. We will show that the probability that $X_k$ exceeds $T_k^{d+\varepsilon}$ is $O(T_k^{-\varepsilon})$, and hence is summable; from this it follows by Borel-Cantelli that the probability that for infinitely many $k$ we have $X_k \ge T_k^{d+\varepsilon}$ is zero.

[Figure 1: log-log plot of $\tilde N(T)$, $F(T)$ and $M(T)$ against $T$, for $T$ between 57 and 58368.]
Figure 1. The summed number of purely periodic Cantor rationals $\tilde N(T)$, its approximation $F(T) := \sum_{q \in I_T,\ 3 \nmid q} \mathrm{round}\left( (2/3)^{\ell(q)} \cdot 2 \cdot \phi(q) \right)$ from model (*), and its approximation $M(T) := \sum_{q \in I_T,\ 3 \nmid q,\ q \mid (3^{\ell(q)} - 1)/2} \mathrm{MLO}(q)$ from model (**), where $I_T := [(1-c)T, T]$ for a fixed choice of $c$. More data points shown in Figure 3.

[Table 1: columns $q_n$ and $\ell(q_n)/\log q_n$; the entries are not recoverable from the source.]
Table 1. Denominators $q_n$ such that for all $q < q_{n+1}$ admitting Cantor rationals of denominator $q$, $\ell(q)/\log q \le \ell(q_n)/\log q_n$. For all $q$ in the range of our computations admitting Cantor rationals of denominator $q$, the ratio $\ell(q)/\log q$ is bounded by the last tabulated value.
[Figure 2: plot of $M(T)/\tilde N(T)$ and $F(T)/\tilde N(T)$ against $T$, for $T$ between 57 and 58368.]
Figure 2. Ratios $M(T)/\tilde N(T)$ and $F(T)/\tilde N(T)$ for a fixed choice of $c$. Our heuristic predicts that this graph tends to 1 at infinity.

[Table 2: columns $r$, $q = 3^r + 1$, $N_q$, $\mathrm{MLO}(q)$, $N_q/\mathrm{MLO}(q)$, and the same ratio corrected by the factor $(3/2)^r$. The body is not recoverable from the source, except for the row $r = 9$: $q = 19684$, $N_q = 414$, $\mathrm{MLO}(q) = 11$, $N_q/\mathrm{MLO}(q) = 37.636$.]
Table 2. The numbers $q = 3^r + 1$, $r = 4, \dots, 13$, where our heuristic gives poor predictions. When revising the prediction by a factor of $(3/2)^r$, which is the factor taking into account a symmetry $\omega \mapsto \omega\bar\omega$, we obtain a much better prediction.

[Figure 3: five log-log panels (a) to (e) plotting $M(T)$ and $\tilde N(T)$ against $T$, one for each value of $c$; the values of $c$ in panels (a) to (d) are not legible in the source, and panel (e) has $c = 0$.]
Figure 3. For different values of $c$ (which determine the intervals $I_T := [(1-c)T, T]$), we plot the summed number of purely periodic Cantor rationals $\tilde N(T)$ and its approximation $M(T)$ from model (**). The behavior depends on the choice of $c$, as discussed in §6.3.

[Table 3: columns $q$, $\ell(q)$, $N_q$, $\mathrm{MLO}(q)$, $N_q/\mathrm{MLO}(q)$; the entries are not recoverable from the source.]
Table 3. All values of $q$ with $\ell(q) = 24$ for which our heuristic makes a prediction which is incorrect by a factor of 4 or more. Note that in all of these examples, $N_q > \mathrm{MLO}(q)$. At least three, and probably all, of the entries in the table are related to the symmetries discussed in §6.4.

[Table 4: columns $q$, $\ell(q)$, $N_q$, $\mathrm{MLO}(q)$, $N_q/\mathrm{MLO}(q)$. The legible rows are $(23, 11, 0, 0, -)$, $(47, 23, 0, 0, -)$ and $(683, 31, 0, 0, -)$; the remaining rows are not recoverable from the source.]
Table 4. Some numbers $q < 3^{\ell(q)} - 1$ for which $\ell(q)$ is a prime, including all such $q$ with $11 \le \ell(q) \le 23$. In this case, symmetries are impossible and our heuristic works well for each individual $q$.

We continue to denote by $c', \lambda$ the constants as in §4, and write $T = T_k$ to simplify notation. Let $X_k^{(1)}$ (respectively, $X_k^{(2)}$) be the number of $p/q$ contributing to $X_k$ with $\ell(q) > \lambda \log_3 T$ (respectively, $\log_3 T + c' \le \ell(q) \le \lambda \log_3 T$). Let $\ell_0 = \lambda \log_3 T$, which is a lower bound for $\ell(q)$ when $p/q$ contributes to $X_k^{(1)}$. Since there are fewer than $T^2$ rationals $p/q$ with $q \in I_T$, the probability that $X_k^{(1)} \ge T^{d+\varepsilon}$ is smaller than the probability that a binomial random variable with probability
\[ p = \left(\frac{2}{3}\right)^{\ell_0} = T^{(d-1)\lambda} \]
and $T^2$ trials will have $T^{d+\varepsilon}$ successes. By the Markov inequality, this probability is bounded above by $T^{2 + (d-1)\lambda - d - \varepsilon} = T^{-\varepsilon}$. The proof for $X_k^{(2)}$ is similar, again using the Markov inequality and the bounds used in the proof of (4.2).

6.2. Large $\ell$ and Bourgain's theorem.
To highlight the sensitivity of $\tilde N(T)$ to fluctuations, consider the expression
\[ \hat\ell(q) = \begin{cases} \ell(q) & N_q \ne 0 \\ 0 & \text{otherwise} \end{cases} \]
(with $N_q$ as in (5.2)); that is, $\hat\ell(q)$ is the order of 3 in $(\mathbb{Z}/q\mathbb{Z})^{\times}$ when there are rationals with denominator $q$ in $\mathcal{C}$, and zero otherwise. Clearly the nonzero values of $\hat\ell(q)$ range between $\log_3 q$ and $q$. If one could prove that $\hat\ell(q) \ll \log q$ one would obtain a simple proof of Conjecture 1. Note that the heuristic behind Artin's conjecture (see [Mo]) predicts that there are infinitely many $q$ for which $\ell(q) \gg q$, so that this may appear at first sight to be wildly optimistic. However our restriction $N_q \ne 0$ is a stringent one. In fact, for all $q \ge 3$ in the range of our computations, $\hat\ell(q)$ is bounded by a small constant multiple of $\log q$ (see Table 1).

On the other hand, by observation (ii), a large value of $\hat\ell(q)$ would make a large contribution to $\tilde N(T)$ when $q \in I_T$. For example if there were infinitely many $q$ for which $\hat\ell(q) > q^{d+\varepsilon}$, then their contribution alone would yield a contradiction to Conjecture 1. However, a difficult result of Bourgain [B] implies that for any $\delta > 0$, $\hat\ell(q) \ll q^{\delta}$. Bourgain's theorem is much stronger inasmuch as it implies that the cosets of the subgroup $H$ equidistribute in the interval $[0,1]$ when $\ell(q) > q^{\delta}$, while to obtain the upper bound above, one only needs to know that if $\ell(q) > q^{\delta}$, then any coset for $H$ contains at least one point in the interval $(1/3, 2/3)$. It would be interesting to obtain better bounds on $\hat\ell(q)$ than those implied by Bourgain's theorem.

6.3. Additional sources of fluctuations.
It is easy to show that (4.4) predicts a lower bound $\tilde N(T) \gg T^d$. However we do not expect a precise asymptotic of the form $\tilde N(T) \sim cT^d$, that is, we do not expect the limit of $\tilde N(T)/T^d$ to exist. There are two reasons for fluctuations in this expression. First consider the numbers of the form $q = (3^{\ell} - 1)/2$, for which $\ell(q) = \ell$. If $c < 2/3$, depending on the choice of $T$, the range $I_T$ may or may not contain one such number. In case it does, this contributes a term of order $T (2/3)^{\ell} \asymp T^d$ to the sum, which would contribute to the main term. Thus we have fluctuations according as the window $I_T$ does or does not contain such $q$, or for general $c \in (0,1)$, according to the number of such $q$ in the interval $I_T$. See Figure 3.

Although these fluctuations would contradict a precise asymptotic $\tilde N(T) \sim cT^d$, they do not preclude the weaker statement $\tilde N(T) \asymp T^d$. A potentially more serious source of fluctuations in (4.4) is the number $L(\ell, T)$, which could fluctuate considerably due to fluctuations in the numbers $\tau(3^{\ell} - 1)$.

6.4. Symmetries.
Heuristics (*) and (**) can also be used to make predictions for the number $N_q$ of Cantor rationals with a fixed denominator $q$. However in this regime, our computations reveal many values of $q$ for which the heuristic gives inaccurate predictions. Some of these are shown in Tables 2 and 3. The numbers in Table 2 are all of the form $3^r + 1$, and in Table 3 we show all numbers $q$ for which $\ell(q) = 24$ and the prediction is inaccurate by a factor of 4 or more. We will consider a possible explanation for these inaccuracies by introducing a (non-rigorous) notion of 'symmetries' in base 3 expansion.

The identity $\frac{3^{2r} - 1}{2} = \frac{(3^r - 1)(3^r + 1)}{2}$ easily implies the following (we leave details to the reader): suppose a purely periodic rational in base 3 expansion has repeating block $\omega \in \{0, 2\}^r$, where $r$ is the length of $\omega$, and $\bar\omega$ is the block obtained from $\omega$ by replacing occurrences of 0 with 2 and 2 with 0. Then the word $\omega\bar\omega$ of length $2r$ obtained by concatenating $\omega, \bar\omega$ defines (via an infinite base 3 expansion $0.\omega\bar\omega\omega\bar\omega\cdots$) a number in $\mathcal{C}$ whose denominator divides $3^r + 1$. This implies that any $\frac{p}{3^r - 1} \in \mathcal{C}$ gives rise to some $\frac{p'}{3^r + 1} \in \mathcal{C}$ (and in fact, by observation (ii) in §5, to its whole orbit under multiplication by 3, which has up to $2r$ elements). It can be deduced that heuristic (**) underestimates numbers $p'/q'$ with $q'$ dividing $3^r + 1$, arising in this way, by a factor of approximately $(3/2)^r$. The revised heuristic is borne out by Table 2, where the last column corrects heuristic (**) by this factor, giving a good fit with the data.

The mapping $\omega \mapsto \omega\bar\omega$ used above is for us an example of a symmetry in base 3. Here is another example. Suppose $\omega, \bar\omega \in \{0, 2\}^r$ are as in the previous paragraph, and suppose $\mathbf{0}$ and $\mathbf{2}$ denote strings of length $r$ consisting only of the digit 0 (respectively 2). Then one may check, this time using the identity $3^{3r} - 1 = (3^r - 1)(3^{2r} + 3^r + 1)$, that repeating blocks $\omega\bar\omega\mathbf{0}$ and $\omega\bar\omega\mathbf{2}$ give numbers in $\mathcal{C}$ whose denominator divides $3^{2r} + 3^r + 1$. For example, taking $r = 7$, we have $q = 3^{14} + 3^{7} + 1 = 4785157$, our heuristic (**) gives $\mathrm{MLO}(q) = 1771$, and our computer program finds $N_q = 4158$, which is a poor fit. The number of strings of the form $\omega\bar\omega\mathbf{0}$ and $\omega\bar\omega\mathbf{2}$, along with all their cyclic permutations (taking into account observation (ii) in §5), is 2562. Some of these give a subset of the ones already considered in heuristic (**), so taking this symmetry into account we should expect $2562 \cdot \frac{\phi(q)}{q} = 2365 \le N_q$. This indeed gives a better (albeit still not very precise) prediction. We suspect that there are more symmetries contributing to the numbers $N_q$ and hope to return to this issue in future work. In Table 5 we have tabulated the numbers $q_r$ for $r = 1, \dots, 10$, along with the numbers of strings of the above form multiplied by $\phi(q)/q$, and compared this prediction with the actual number of strings of this form which are reduced rationals with denominator $q_r$.

  r    q_r          N_{q_r}   X_r     Y_r     Z_r     Y_r + MLO(q_r)
  1    13           6         6       6       6       13
  2    91           12        18      14      12      27
  3    757          54        54      54      54      93
  4    6643         120       156     122     120     202
  5    59293        450       420     388     390     638
  6    532171       1368      1062    978     1008    1641
  7    4785157      4158      2562    2365    2436    4136
  8    43053283     9744      5976    4663    4560    8654
  9    387440173    38988     13608   13450   13500   26931
  10   3486843451   91440     30450   23224   23520   50961

Table 5. The numbers $q_r = 3^{2r} + 3^r + 1$ with the contribution of the symmetries of the form $\omega \mapsto \omega\bar\omega\mathbf{0}$ and $\omega \mapsto \omega\bar\omega\mathbf{2}$. The number $X_r$ counts all strings of length $3r$ of the specified form, $Y_r = \left\lfloor X_r \cdot \frac{\phi(q_r)}{q_r} \right\rfloor$, and $Z_r$ is the actual number of Cantor rationals with denominator $q_r$ of this special form.

When $\ell = kr$ for $k, r \in \mathbb{N}$, $k \ge 2$, we can often make a similar construction of a repeating block of length $\ell$ which is composed of $k$ sub-blocks of size $r$ (in the preceding two paragraphs we gave examples with $k = 2, 3$). In such cases, for
\[ q = 3^{(k-1)r} + 3^{(k-2)r} + \cdots + 3^r + 1, \]
$N_q$ will be significantly larger than predicted by our heuristic. The same will be true for large divisors $q'$ of such $q$. Thus if $\ell$ has many divisors, there will be many values of $q$ for which our predictions will be poor. In all of them we expect our heuristic to give a number which is smaller than the correct value, and we do not expect such very poor predictions to occur when $\ell$ is prime. These two expectations are borne out in Tables 3 and 4. We invite the reader to try to find explanations for the numbers appearing in Table 3; note that we have explained the appearance of 531442 using a symmetry $\omega \mapsto \omega\bar\omega$, and that 589771 and 84253 are large divisors of $3^{16} + 3^{8} + 1$ and can thus be explained using the symmetries $\omega \mapsto \omega\bar\omega\mathbf{0}$, $\omega \mapsto \omega\bar\omega\mathbf{2}$.

Appendix A. Computing the Cantor rationals of given denominator
In this appendix, we give an algorithm to compute the set of rational numbers in the Cantor set of given denominator $q$, namely the Cantor rationals of reduced form $\frac{p}{q}$. It is stated in Algorithm 1 below, and has been implemented by the authors in Pari/GP. We denote by $\ell(q)$ the order of the element 3 in the group of multiplicative units in the ring $\mathbb{Z}/q\mathbb{Z}$ with $q$ elements.

Proposition A.1. The set computed by Algorithm 1 contains all the Cantor rationals with denominator $q$ in reduced form. The algorithm terminates in finite time.

Proof.
• The period length of $\frac{p}{q}$ in the ternary system is given by $\ell(q')$. Hence, the finite sequence $a$ of ternary digits is precisely the periodic sequence in $\frac{p}{q}$. Furthermore,
\[ \frac{s\,(3^{\ell(q')} - 1) + a}{(3^{\ell(q')} - 1)\, 3^t} = \frac{p}{q}. \]
So, the sequence $s$ is precisely the sequence of ternary digits preceding the periodic part in the ternary expansion of $\frac{p}{q}$. By the elementary ternary digits property of the Cantor set, Algorithm 1 decides if $\frac{p}{q}$ is a Cantor rational. The mask $M$ allows it to check all suitable fractions $\frac{p}{q}$. Here, and for establishing the passlist, we make use of the well-known symmetry of the Cantor set: if $x$ is an element of the Cantor set, then the same holds for $1 - x$, $\frac{x}{3}$, and, provided that it is in the unit interval, $3x$.
• The loop in Algorithm 1 consists of $q - 1$ repetitions, which contain a finite number of finite-time steps. □
Remark A.2.
• The mask $M$ can be omitted and a coprimality check for $(p, q)$ inserted, to obtain a simpler algorithm which is mathematically equivalent to Algorithm 1. The difference lies in the efficiency: the mask $M$ is a powerful tool to reduce the time needed to carry out the algorithm, minimizing the number of iterations of the most expensive steps, which grows fast with $q$.
• Even more important for the efficiency is the sub-algorithm testing whether the ternary digits belong to the set $\{0, 2\}$, because the numbers to be tested are extremely large integers.
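The simpler variant mentioned in the first point of the remark (no mask, a plain coprimality check) can be sketched in a few lines of Python. This is not the authors' Pari/GP implementation: the names `in_cantor` and `cantor_numerators` are hypothetical, and membership is decided here by ternary long division on remainders rather than by the big-integer digit test of Algorithm 1. A digit 1 is accepted only when the expansion terminates immediately after it, since $0.a_1 \dots a_k 1 = 0.a_1 \dots a_k 0222\cdots$ in base 3.

```python
from math import gcd


def in_cantor(p: int, q: int) -> bool:
    """Return True if p/q (with 0 <= p <= q) lies in the middle thirds Cantor set."""
    if not 0 <= p <= q:
        return False
    if p == q:
        return True  # 1 = 0.222... in base 3
    remainder = p
    seen = set()
    # Long division in base 3; the remainders eventually repeat.
    while remainder not in seen:
        seen.add(remainder)
        digit, remainder = divmod(3 * remainder, q)
        if digit == 1:
            # Allowed only if the expansion stops here (then ...1 = ...0222...).
            return remainder == 0
    return True


def cantor_numerators(q: int) -> list:
    """Numerators p in [1, q-1], coprime to q, with p/q in the Cantor set."""
    return [p for p in range(1, q) if gcd(p, q) == 1 and in_cantor(p, q)]


if __name__ == "__main__":
    for q in (4, 10, 13, 91):
        print(q, cantor_numerators(q))
```

The running time of this sketch grows like $q \cdot \ell(q)$, so it is only suitable for small denominators; the mask and passlist of Algorithm 1 exist precisely to avoid this cost.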
Algorithm 1 Computation of the Cantor rationals of denominator $q$
Input: A natural number $q$.
Output: The set of Cantor rationals of reduced form $\frac{p}{q}$.
Carry out the prime decomposition of $q$.
Create a mask $M$ as the set of multiples of the primes in $q$ satisfying that the multiples are strictly smaller than $q$.
Denote by $t$ the multiplicity of 3 in the prime decomposition of $q$.
Let $q' := q / 3^t$.
Compute $\ell(q') :=$ order of 3 in the multiplicative group of the ring $\mathbb{Z}/q'\mathbb{Z}$.
Initialize the passlist as an empty list.
for $p$ running from 1 through $q - 1$ do
  if $p$ is not an element of the mask $M$ or the passlist, then
    Let $T := \frac{p}{q} (3^{\ell(q')} - 1)\, 3^t$.
    Let $A := T \bmod (3^{\ell(q')} - 1)$.
    if $A \ne 0 \bmod (3^{\ell(q')} - 1)$ then
      Let $a$ be the lift of $A$ to $\{0, \dots, 3^{\ell(q')} - 2\}$, written in ternary.
      if the digits of $a$ are in $\{0, 2\}$, then
        Let $s := \left( \frac{T - a}{3^{\ell(q')} - 1} \right)$, written in ternary.
        if the digits of $s$ are in $\{0, 2\}$, then
          The fraction $\frac{p}{q}$ is a Cantor rational. Record it into the set of Cantor rationals of denominator $q$.
          Add 3-power multiples (if $q \ne 0 \bmod 3$) of $p$ and their reflections to the passlist.
        else
          No 3-power multiples of $\frac{p}{q}$ are Cantor rationals. Add 3-power multiples of $p$ and their reflections to the mask $M$.
        end if
      end if
    else
      if the digits of $\left( \frac{T}{3^{\ell(q')} - 1} \right)$ or $\left( \frac{T}{3^{\ell(q')} - 1} - 1 \right)$, written in ternary, are in $\{0, 2\}$, then
        The fraction $\frac{p}{q}$ is a Cantor rational. Record it into the set of Cantor rationals of denominator $q$.
        Add 3-power multiples (if $3 \nmid q$) of $p$ and their reflections to the passlist.
      else
        No 3-power multiples of $\frac{p}{q}$ are Cantor rationals. Add 3-power multiples of $p$ and their reflections to the mask $M$.
      end if
    end if
  end if
end for
Output the rationals $\frac{p}{q}$ for $p$ in the passlist.

References
[B] J. Bourgain,
Estimates on polynomial exponential sums, Isr. J. Math. (2010) 221–240.
[BFR] R. Broderick, L. Fishman and A. Reich, Intrinsic Approximation on Cantor-like Sets, a Problem of Mahler, Mosc. J. Comb. Number Th. (2011) 3–12.
[Bu] Y. Bugeaud, Diophantine approximation and Cantor sets, Math. Ann. (2008), no. 3, 677–684.
[E] P. Erdős, On the sum $\sum_{d \mid 2^n - 1} d^{-1}$, Isr. J. Math. (1971) 43–48.
[F] L. Fishman, Schmidt's game on fractals, Israel J. Math. (2009), 77–92.
[FS] L. Fishman and D. Simmons, Intrinsic approximation for fractals defined by rational iterated function systems: Mahler's research suggestion, Proc. Lond. Math. Soc. (3) (2014), no. 1, 189–212.
[LSV] J. Levesley, C. Salp and S. Velani, On a problem of K. Mahler: Diophantine approximation and Cantor sets, Math. Ann. (2007), no. 1, 97–118.
[M] K. Mahler, Some suggestions for further research, Bull. Austral. Math. Soc. (1984), no. 1, 101–108.
[Mo] P. Moree, Artin's primitive root conjecture - a survey, Integers Vol. 5 (2012) 1305–1416.
[Sch] J. Schleischitz, On intrinsic and extrinsic rational approximation to Cantor sets, preprint (2019) https://arxiv.org/pdf/1812.10689.pdf
[SW] D. Simmons and B. Weiss, Random walks on homogeneous spaces and Diophantine approximation on fractals, Inv. Math. (2019) 337–394.
[T] T. Trauthwein, Approximation of Cantor Rational Cardinalities by Primitive Words, Master 1 project report, Experimental Mathematics Lab, University of Luxembourg.
[W] B. Weiss, Almost no points on a Cantor set are very well approximable, Proc. R. Soc. Lond. (2001), 949–952.
University of Luxembourg
Massachusetts Institute of Technology
University of Luxembourg