Digits of pi: limits to the seeming randomness
Karlis Podnieks
University of Latvia, Raina bulvaris 19, Riga, LV-1586, Latvia
Abstract.
The decimal digits of $\pi$ are widely believed to behave like statistically independent random variables taking the values $0, 1, 2, 3, 4, 5, 6, 7, 8, 9$ with equal probabilities $1/10$. In this article, another similar conjecture is explored: the seemingly almost random behaviour of the digits in the base 3 representations of the powers $2^n$. This conjecture seems to be confirmed well: it passes even the tests inspired by the Central Limit Theorem and the Law of the Iterated Logarithm. After this, similar testing is applied to the sequences of digits in the decimal representations of the numbers $\pi$, $e$ and $\sqrt{2}$. Here, unlike for the base 3 digits of $2^n$, instead of oscillations with the amplitudes required by the Law of the Iterated Logarithm, convergence to zero is observed. If, for such “analytically” defined irrational numbers, the observed behaviour remains intact ad infinitum, then the seeming randomness of their digits is only a limited one.

Keywords: digits of pi, random digits
1 Introduction

The decimal digits of $\pi$ are widely believed to behave like statistically independent random variables taking the values $0, 1, 2, 3, 4, 5, 6, 7, 8, 9$ with equal probabilities $1/10$ (for an overview, see [4]).

In Section 2 (it reproduces in part Section 2.1 of [3]) another similar conjecture is explored: the seemingly almost random behaviour of the digits in the base 3 representations of the powers $2^n$. This conjecture seems to be confirmed well: it passes even the tests inspired by the Central Limit Theorem and the Law of the Iterated Logarithm. Especially remarkable is Fig. 2 below, showing oscillations with amplitudes almost as required by the Law of the Iterated Logarithm.

In Section 3, similar pictures are obtained for the sequences of digits in the decimal representations of the numbers $\pi$, $e$ and $\sqrt{2}$. Unlike Fig. 2, they show convergence to zero! If, for such “analytically” defined irrational numbers, the observed behaviour remains intact ad infinitum, then the seeming randomness of their digits is only a limited one.

2 Base 3 representations of powers of 2
Throughout this section, it is assumed that $p, q$ are positive integers such that $\frac{\log p}{\log q}$ is irrational, i.e., $p^a \neq q^b$ for any integers $a, b > 0$.

Definition 1. Let us denote by $D_q(n, i)$ the $i$-th digit in the canonical base $q$ representation of the number $n$, and by $S_q(n)$ the sum of digits in this representation.

Let us consider base $q$ representations of powers $p^n$. Imagine, for a moment, that, for fixed $p, q, n$, most of the digits $D_q(p^n, i)$ behave like statistically independent random variables taking the values $0, 1, \ldots, q-1$ with equal probabilities $\frac{1}{q}$. Then, the (pseudo) mean value and (pseudo) variance of $D_q(p^n, i)$ should be

$$E = \frac{q-1}{2}; \qquad V = \sum_{i=0}^{q-1} \frac{1}{q}\left(i - \frac{q-1}{2}\right)^2 = \frac{q^2-1}{12}.$$

The total number of digits in the base $q$ representation of $p^n$ is $k_n \approx n \log_q p$, hence, the (pseudo) mean value of the sum of digits $S_q(p^n) = \sum_{i=1}^{k_n} D_q(p^n, i)$ should be $E_n \approx n \frac{q-1}{2} \log_q p$ and, because of the assumed (pseudo) independence of digits, its (pseudo) variance should be $V_n \approx n \frac{q^2-1}{12} \log_q p$. As the final consequence, the corresponding centered and normed variable

$$\frac{S_q(p^n) - E_n}{\sqrt{V_n}}$$

should tend to behave as a standard normally distributed random variable with probability density $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.

One can try to verify this conclusion experimentally. For example, let us compute $S_3(2^n)$ for $n$ up to 100000, and let us draw the histogram of the corresponding centered and normed variable

$$s_3(2^n) = \frac{S_3(2^n) - n \log_3 2}{\sqrt{\frac{2}{3}\, n \log_3 2}}$$

(see Fig. 1). The histogram suggests that $S_q(p^n)$, as a function of $n$, behaves almost as $n \frac{q-1}{2} \log_q p$, i.e., almost linearly in $n$.

Fig. 1. Histogram of the centered and normed variable $s_3(2^n)$

An even more advanced idea for testing randomness of sequences of digits was proposed in [2]: let us use the Law of the Iterated Logarithm. Namely, let us try to estimate the amplitude of the possible deviations of $S_q(p^n)$ from $n \frac{q-1}{2} \log_q p$ by “applying” the Law of the Iterated Logarithm.

Let us consider the following centered and normed (pseudo) random variables:

$$d_q(p^n, i) = \frac{D_q(p^n, i) - \frac{q-1}{2}}{\sqrt{\frac{q^2-1}{12}}}.$$

By summing up these variables for $i$ from 1 to $k_n$, we obtain a sequence of (pseudo) random variables:

$$\kappa_q(p, n) = \frac{S_q(p^n) - \frac{q-1}{2} k_n}{\sqrt{\frac{q^2-1}{12}}},$$

that “must obey” the Law of the Iterated Logarithm. Namely, if the sequence $S_q(p^n)$ behaves, indeed, as a “typical” sum of equally distributed random variables, then $\liminf_{n \to \infty}$ and $\limsup_{n \to \infty}$ of the fraction

$$\frac{\kappa_q(p, n)}{\sqrt{2 k_n \log\log k_n}}$$

must be $-1$ and $+1$, respectively. Replacing $k_n$ by $n \log_q p$ and denoting

$$\delta_q(p, n) = \frac{S_q(p^n) - \left(\frac{q-1}{2} \log_q p\right) n}{\sqrt{\left(\frac{q^2-1}{6} \log_q p\right) n \log\log n}},$$

we obtain

$$\limsup_{n \to \infty} \delta_q(p, n) = 1; \qquad \liminf_{n \to \infty} \delta_q(p, n) = -1.$$
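The quantities defined above can be computed directly for small $n$. The following is a minimal sketch, not the code used to produce the figures: the function names `digit_sum`, `clt_normed` and `lil_delta` are hypothetical, and the approximation $k_n \approx n \log_q p$ is used exactly as in the formulas above.

```python
import math


def digit_sum(value: int, base: int) -> int:
    """Sum of the digits of a non-negative integer in the given base."""
    total = 0
    while value > 0:
        value, digit = divmod(value, base)
        total += digit
    return total


def clt_normed(p: int, q: int, n: int) -> float:
    """Centered and normed digit sum (S_q(p^n) - E_n) / sqrt(V_n)."""
    log_q_p = math.log(p) / math.log(q)
    mean = n * (q - 1) / 2 * log_q_p            # E_n
    variance = n * (q * q - 1) / 12 * log_q_p   # V_n
    return (digit_sum(p ** n, q) - mean) / math.sqrt(variance)


def lil_delta(p: int, q: int, n: int) -> float:
    """delta_q(p, n); requires n >= 3 so that log log n is positive."""
    log_q_p = math.log(p) / math.log(q)
    mean = n * (q - 1) / 2 * log_q_p
    scale = math.sqrt((q * q - 1) / 6 * log_q_p * n * math.log(math.log(n)))
    return (digit_sum(p ** n, q) - mean) / scale


if __name__ == "__main__":
    # Illustrative values of n only; the histogram in the text uses n up to 100000.
    for n in (10, 100, 1000):
        print(n, digit_sum(2 ** n, 3), clt_normed(2, 3, n), lil_delta(2, 3, n))
```

By construction, `lil_delta` equals `clt_normed` divided by $\sqrt{2 \log\log n}$, so the two tests look at the same centered sum under two different normalizations. Recomputing `p ** n` for every `n` is wasteful; for long runs one would presumably update the base $q$ digits incrementally.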
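In particular, this would mean that

$$S_q(p^n) = \left(\frac{q-1}{2} \log_q p\right) n + O\left(\sqrt{n \log\log n}\right).$$

And, for $p = 2$, $q = 3$ this would mean (note that $\log_3 2 \approx 0.6309$):

$$S_3(2^n) = n \cdot \log_3 2 + O\left(\sqrt{n \log\log n}\right);$$

$$\delta_3(2, n) = \frac{S_3(2^n) - n \log_3 2}{\sqrt{\left(\frac{4}{3} \log_3 2\right) n \log\log n}} \approx \frac{S_3(2^n) - 0.6309\, n}{\sqrt{0.8412\, n \log\log n}},$$

$$\limsup_{n \to \infty} \delta_3(2, n) = 1; \qquad \liminf_{n \to \infty} \delta_3(2, n) = -1.$$

Fig. 2.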
Oscillating behaviour of the expression $\delta_3(2, n)$

However, the real behaviour of the expression $\delta_3(2, n)$ until $n = 10^{\ldots}$ does not show convergence of the oscillations to the segment $[-1, +1]$ (see Fig. 2, obtained by Juris Čerņenoks). Although $\delta_3(2, n)$ oscillates almost as required by the Law of the Iterated Logarithm, very many of its values lie outside the segment $[-1, +1]$.

What can be proved about $S_q(p^n)$? To my knowledge, the best result on this problem is due to C. L. Stewart. It follows from his Theorem 2 in [6] (put $\alpha = 0$) that

$$S_q(p^n) > \frac{\log n}{\log\log n + C} - 1,$$

where the constant $C > 0$ depends on $q, p$. Since then, no lower bounds of $S_q(p^n)$ better than $\frac{\log n}{\log\log n}$ have been proved.

3 $\pi$, $e$ and $\sqrt{2}$

In Section 2 above, the Central Limit Theorem (Fig. 1) and the Law of the Iterated Logarithm (Fig. 2) were used to test the conjecture that the sum of digits of the base 3 representation of $2^n$ behaves closely to the expected behaviour of the sum of the first $k_n$ members of a sequence of independent random variables taking the values $0, 1, 2$ with equal probabilities $\frac{1}{3}$.

Let us try, as proposed in [2], to apply this method to the sequences of digits in the decimal representations of the numbers $\pi$, $e$ and $\sqrt{2}$. Imagine that these digits behave like statistically independent random variables taking the values $0, 1, 2, 3, 4, 5, 6, 7, 8, 9$ with equal probabilities $\frac{1}{10}$. Then, the (pseudo) mean value and (pseudo) variance of the $n$-th digit would be (see the formulas above) $\frac{10-1}{2} = 4.5$ and $\frac{10^2-1}{12} = 8.25$, respectively. The (pseudo) mean value of the sum of the first $n$ digits $S(n)$ would be $4.5n$, and, because of the assumed (pseudo) independence of digits, its (pseudo) variance would be $8.25n$.

Let us try to estimate the amplitude of the possible deviations of $S(n)$ from the expected mean $4.5n$ by “applying” the Law of the Iterated Logarithm. Let us introduce the necessary centered and normed (pseudo) random variables:

$$\frac{d(i) - 4.5}{\sqrt{8.25}}$$

($d(i)$ denotes the $i$-th digit). By summing up these variables for $i$ from 1 to $n$, we obtain a sequence of (pseudo) random variables:

$$\frac{S(n) - 4.5n}{\sqrt{8.25}},$$

that “must obey” the Law of the Iterated Logarithm. Namely, if the sequence $S(n)$ behaves, indeed, as a “typical” sum of equally distributed random variables, then $\liminf_{n \to \infty}$ and $\limsup_{n \to \infty}$ of the fraction

$$\delta(n) = \frac{S(n) - 4.5n}{\sqrt{2 \cdot 8.25\, n \log\log n}}$$

must be $-1$ and $+1$, respectively. In particular, this would mean that

$$S(n) = 4.5n + O\left(\sqrt{n \log\log n}\right),$$

and that the values of $\delta(n)$ must oscillate across the entire segment $[-1, +1]$, as in Fig. 2.

However, Fig. 3, Fig. 4, and Fig. 5, obtained by Juris Čerņenoks for the first $10^{\ldots}$ digits of the numbers $\pi$, $e$ and $\sqrt{2}$, show that the values of $\delta(n)$ do not oscillate across the entire segment $[-1, +1]$; instead, they seem to converge to 0. Thus, the pictures seem to support the following somewhat stronger conjecture for $\pi$, $e$ and $\sqrt{2}$:

$$S(n) = 4.5n + o\left(\sqrt{n \log\log n}\right).$$

An even more specific behaviour is shown (see Fig. 6) by the famous Million Random Digits of the RAND Corporation, published in 1955 [5].

Fig. 3.
The number $\pi$: behaviour of the expression $\delta(n)$

Fig. 4. The number $e$: behaviour of the expression $\delta(n)$

Fig. 5. The number $\sqrt{2}$: behaviour of the expression $\delta(n)$

The pictures obtained for $\pi$, $e$, $\sqrt{2}$ resemble the picture obtained for $\alpha_{2,3}$ by the authors of [1]. They conclude:

“For $\alpha_{2,3}$, the corresponding computation of the first $10^{\ldots}$ values of $\frac{m_1(n) - n/2}{\sqrt{2n \log\log n}}$ leads to the plot in Figure 12(b) and leads us to conjecture that it is 2-strongly normal.”

However, when comparing these pictures with the above Fig. 2, the following conjecture seems more plausible:

The seeming randomness of the digits of $\pi$, $e$, $\sqrt{2}$ is only a limited one.

Fig. 6. Million random digits from RAND Corp.: behaviour of the expression $\delta(n)$

References
1. Aragón Artacho F. J., Bailey D. H., Borwein J. M., Borwein P. B.: Walking on real numbers. The Mathematical Intelligencer 35(1), 42–60 (2013)
2. Belshaw A., Borwein P.: Champernowne’s Number, Strong Normality, and the X Chromosome. Computational and Analytical Mathematics. Springer Proceedings in Mathematics and Statistics 50, 29–44 (2013)
3. Cernenoks J., Iraids J., Opmanis M., Opmanis R., Podnieks K.: Integer complexity: experimental and analytical results II. arXiv:1409.0446 (September 2014) [Last accessed: 13 November 2014]
4. Marsaglia G.: On the randomness of pi and other decimal expansions. InterStat (October 2005) [Last accessed: 13 November 2014]
5. Sloane N. J. A.: The On-Line Encyclopedia of Integer Sequences. A002205, The RAND Corporation list of a million random digits. [Last accessed: 13 November 2014]
6. Stewart C. L.: On the representation of an integer in two different bases. Journal für die reine und angewandte Mathematik 319, 63–72 (January 1980)
7. Wolfram Mathematica.