On the Design of LIL Tests for (Pseudo) Random Generators and Some Experimental Results
Yongge Wang
Dept. SIS, UNC Charlotte, Charlotte, NC 28223, USA. Email: [email protected]
Abstract—Random numbers have been one of the most useful objects in statistics, computer science, cryptography, modeling, simulation, and other applications, though it is very difficult to construct true randomness. Many solutions (e.g., cryptographic pseudorandom generators) have been proposed to harness or simulate randomness, and many statistical testing techniques have been proposed to determine whether a pseudorandom generator produces high quality randomness. NIST SP800-22 (2010) proposes the state-of-the-art testing suite for (pseudo) random generators to detect deviations of a binary sequence from randomness. On the one hand, as a counterexample to the NIST SP800-22 test suite, it is easy to construct functions that are considered GOOD pseudorandom generators by the NIST SP800-22 test suite though the outputs of these functions are easily distinguishable from the uniform distribution; thus these functions are not pseudorandom generators by definition. On the other hand, NIST SP800-22 does not cover some of the important laws of randomness. Two fundamental limit theorems about random binary strings are the central limit theorem and the law of the iterated logarithm (LIL). Several frequency related tests in NIST SP800-22 cover the central limit theorem, while no NIST SP800-22 test covers LIL. This paper proposes techniques to address the above challenges that the NIST SP800-22 testing suite faces. Firstly, we propose statistical distance based testing techniques for (pseudo) random generators to reduce the above mentioned Type II errors in the NIST SP800-22 test suite. Secondly, we propose LIL based statistical testing techniques, calculate the probabilities, and carry out experimental tests on widely used pseudorandom generators by generating around 30TB of pseudorandom sequences.
The experimental results show that, for a sample size of 1000 sequences (2TB), the statistical distance between the generated sequences and the uniform distribution is around 0.07 (with 0 for statistically indistinguishable and 1 for completely distinguishable), and the root-mean-square deviation is around 0.005. Though the statistical distance 0.07 and RMSD 0.005 are acceptable for some applications, for a cryptographic "random oracle" the preferred statistical distance should be smaller than 0.03 and the RMSD smaller than 0.001 at sample size 1000. These results justify the importance of the LIL testing techniques designed in this paper. The experimental results in this paper are reproducible and the raw experimental data are available at the author's website.

I. Introduction
Secure cryptographic hash functions such as SHA1, SHA2, and SHA3 and symmetric key block ciphers (e.g., AES and TDES) have been commonly used to design pseudorandom generators in counter modes (e.g., in the Java Crypto Library and in the NIST SP800-90A standards). Though the security of hash functions such as SHA1, SHA2, and SHA3 has been extensively studied from the one-wayness and collision resistance aspects, there has been limited research on the quality of long pseudorandom sequences generated by cryptographic hash functions. Even if a hash function (e.g., SHA1) performs like a random function based on existing statistical tests (e.g., NIST SP800-22 Revision 1A [17]), when it is called many times for long sequence generation, the resulting long sequence may not satisfy the properties of pseudorandomness and could be distinguished from a uniformly chosen sequence. For example, recent reports from the New York Times [16] and The Guardian [1] show that the NSA has included back doors in the NIST SP800-90A pseudorandom bit generators (on which our experiments are based) to gain online cryptanalytic capabilities.

Statistical tests are commonly used as a first step in determining whether or not a generator produces high quality random bits. For example, NIST SP800-22 Revision 1A [17] proposed state-of-the-art statistical testing techniques for determining whether a random or pseudorandom generator is suitable for a particular cryptographic application. NIST SP800-22 includes 15 tests: frequency (monobit), number of 1-runs and 0-runs, longest-1-runs, binary matrix rank, discrete Fourier transform, template matching, Maurer's "universal statistical" test, linear complexity, serial test, approximate entropy, cumulative sums (cusums), random excursions, and random excursions variants. In a statistical test of [17], a significance level α ∈ [0.001, 0.01] is chosen for each test. For each input sequence, a P-value is calculated and the input string is accepted as pseudorandom if P-value ≥ α. A pseudorandom generator is considered good if, with probability α, the sequences produced by the generator fail the test. For an in-depth analysis, NIST SP800-22 recommends additional statistical procedures such as the examination of P-value distributions (e.g., using a χ²-test).

The NIST SP800-22 test suite has inherent limitations with straightforward Type II errors. For example, for a function F that mainly outputs "random strings" but, with probability α, outputs biased strings (e.g., strings consisting mainly of 0's), F will be considered a "good" pseudorandom generator by the NIST SP800-22 test though the output of F can be distinguished from the uniform distribution (thus F is not a pseudorandom generator by definition). In the following, we use two examples to illustrate this kind of Type II error. Let RAND_{c,n} be the set of Kolmogorov c-random binary strings of length n, where c ≥
1. That is, for a universal Turing machine M, let

  RAND_{c,n} = { x ∈ {0,1}^n : if M(y) = x then |y| ≥ |x| − c }.    (1)

Let α be a given significance level of the NIST SP800-22 test and let R_n = R_0 ∪ R_1, where R_0 is a size 2^n(1−α) subset of RAND_{2,n} and R_1 is a size 2^n·α subset of { 0^m x : x ∈ {0,1}^{n−m} } (strings that begin with a run of m zeros, for an appropriate m). Furthermore, let f_n : {0,1}^n → R_n be an ensemble of random functions (not necessarily computable) such that f_n(x) is chosen uniformly at random from R_n. Then for each n-bit string x, with probability 1−α, f_n(x) is Kolmogorov 2-random, and with probability α, f_n(x) ∈ R_1. Since all Kolmogorov 2-random strings are guaranteed to pass the NIST SP800-22 test at significance level α (otherwise, they are not Kolmogorov 2-random by definition) and all strings in R_1 fail the NIST SP800-22 test at significance level α for large enough n, the function ensemble {f_n}_{n∈N} is considered a "good" pseudorandom generator by the NIST SP800-22 test suite. On the other hand, Theorem 3.2 in Wang [24] shows that RAND_{2,n} (and R_0) can be efficiently distinguished from the uniform distribution with a non-negligible probability. A similar argument can be used to show that R_n can be efficiently distinguished from the uniform distribution with a non-negligible probability. In other words, {f_n}_{n∈N} is not a cryptographically secure pseudorandom generator.

As another example, let {f'_n}_{n∈N} be a pseudorandom generator with f'_n : {0,1}^n → {0,1}^{l(n)}, where l(n) > n. Assume that {f'_n}_{n∈N} is a good pseudorandom generator by the NIST SP800-22 in-depth statistical analysis of the P-value distributions (e.g., using a χ²-test).
Define a new pseudorandom generator {f_n}_{n∈N} as follows:

  f_n(x) = f'_n(x)             if f'_n(x) contains more 0's than 1's,
  f_n(x) = f'_n(x) ⊕ 1^{l(n)}  otherwise.    (2)

Then it is easy to show that {f_n}_{n∈N} is also a good pseudorandom generator by the NIST SP800-22 in-depth statistical analysis of the P-value distributions (e.g., using a χ²-test). However, the output of {f_n}_{n∈N} is trivially distinguishable from the uniform distribution, since it never contains more 1's than 0's.

The above two examples show the limitations of the testing approaches specified in NIST SP800-22. The limitation is mainly due to the fact that NIST SP800-22 does not fully account for the differences between the two common approaches to pseudorandomness definitions, as observed and analyzed in Wang [24]. In other words, the definition of pseudorandom generators is based on indistinguishability concepts, while the techniques in NIST SP800-22 mainly concentrate on the performance of individual strings. In this paper, we propose testing techniques that are based on statistical distances such as the root-mean-square deviation or the Hellinger distance. The statistical distance based approach is more accurate in deviation detection and avoids the above Type II errors of NIST SP800-22. Our approach is illustrated using the LIL test design.

Feller [6] mentioned that the two fundamental limit theorems for random binary strings are the central limit theorem and the law of the iterated logarithm. Feller [6] also called attention to the study of the behavior of the maximum of the absolute values of the partial sums,

  ¯S_n = max_{0≤k≤n} |2·S(ξ↾k) − k| / √n,

and Erdős and Kac [5] obtained the limit distribution of ¯S_n. The NIST SP800-22 test suite includes several frequency related tests that cover the first central limit theorem, and the cumulative sums (cusums) test covers the limit distribution of ¯S_n. However, it does not include any test for the important law of the iterated logarithm.
Thus it is important to design LIL based statistical tests. The law of the iterated logarithm (LIL) says that, for a random sequence ξ, the value S_lil(ξ[0..n−1]) (see equation (4) in Section III) oscillates between −1 and 1 and reaches both ends infinitely often as n increases. It is known [21], [22], [23] that polynomial time pseudorandom sequences follow the LIL. It is also known [7] that the LIL holds for the uniform distribution. Thus the LIL should hold both for Kolmogorov complexity based randomness and for "behavioristic" approach based randomness.

This paper designs LIL based weak, strong, and snapshot statistical tests and obtains formulae for calculating the probabilities that a random sequence passes the LIL based tests. We have carried out experiments to test the outcomes of several commonly used pseudorandom generators. In particular, we generated 30TB of sequences using several NIST recommended pseudorandom generators. Our results show that at sample size 1000 (or 2TB of data), sequences produced by several commonly used pseudorandom generators have a LIL based statistical distance of 0.07 from true random sources. On the other hand, at sample size 10000 (20TB of data), sequences produced by NIST-SHA256 based pseudorandom generators have a LIL based statistical distance of 0.02 from true random sources. These distances are larger than expected for cryptographic applications.

The paper is organized as follows. Section II introduces notations. Section III discusses the law of the iterated logarithm (LIL). Section IV reviews the normal approximation to binomial distributions. Sections V, VI, and VII propose weak and strong LIL tests. Section VIII describes the steps to evaluate a pseudorandom generator. Section IX introduces snapshot LIL tests. Section X reports experimental results, and we conclude with Section XI.

II. Notations and pseudorandom generators

In this paper, N and R+ denote the set of natural numbers (starting from 0) and the set of non-negative real numbers, respectively. Σ = {0,1} is the binary alphabet, Σ* is the set of (finite) binary strings, Σ^n is the set of binary strings of length n, and Σ^∞ is the set of infinite binary sequences. The length of a string x is denoted by |x|.
λ is the empty string. For strings x, y ∈ Σ*, xy is the concatenation of x and y, and x ⊑ y denotes that x is an initial segment of y. For a sequence x ∈ Σ* ∪ Σ^∞ and a natural number n ≥ 0, x↾n = x[0..n−1] denotes the initial segment of length n of x (with x↾n = x if |x| ≤ n), while x[n] denotes the nth bit of x; that is, x[0..n−1] = x[0] ... x[n−1]. For a set C of infinite sequences, Prob[C] denotes the probability that ξ ∈ C when ξ is chosen by a uniform random experiment. Martingales are used to describe betting strategies in probability theory.

Definition 2.1: (Ville [19]) A martingale is a function F : Σ* → R+ such that, for all x ∈ Σ*,

  F(x) = ( F(x0) + F(x1) ) / 2.

We say that a martingale
F succeeds on a sequence ξ ∈ Σ^∞ if lim sup_{n→∞} F(ξ[0..n−1]) = ∞.

The concept of "effective similarity" by Goldwasser and Micali [10] and Yao [25] is defined as follows. Let X = {X_n}_{n∈N} and Y = {Y_n}_{n∈N} be two probability ensembles such that each of X_n and Y_n is a distribution over Σ^n. We say that X and Y are computationally (or statistically) indistinguishable if for every feasible algorithm A (or every algorithm A), the total variation difference between X_n and Y_n is a negligible function in n.

Definition 2.2: Let {X_n}_{n∈N} and {Y_n}_{n∈N} be two probability ensembles. {X_n}_{n∈N} and {Y_n}_{n∈N} are computationally (respectively, statistically) indistinguishable if for any polynomial time computable set D ⊆ Σ* (respectively, any set D ⊆ Σ*) and any polynomial p, the inequality (3) holds for almost all n:

  | Prob[X_n ∈ D] − Prob[Y_n ∈ D] | ≤ 1/p(n).    (3)

Let l : N → N with l(n) ≥ n for all n ∈ N, and let G be a polynomial-time computable algorithm such that |G(x)| = l(|x|) for all x ∈ Σ*. Then the pseudorandom generator concept [3], [25] is defined as follows.

Definition 2.3:
Let l : N → N with l(n) > n for all n ∈ N, and let {U_n}_{n∈N} be the uniform distribution. A pseudorandom generator is a polynomial-time algorithm G with the following properties:
1) |G(x)| = l(|x|) for all x ∈ Σ*.
2) The ensembles {G(U_n)}_{n∈N} and {U_{l(n)}}_{n∈N} are computationally indistinguishable.

Let RAND_c = ∪_{n∈N} RAND_{c,n}, where RAND_{c,n} is the set of Kolmogorov c-random strings defined in equation (1). Then we have

Theorem 2.4: ([24, Theorem 3.2]) The ensemble R_c = {R_{c,n}}_{n∈N}, where R_{c,n} is the uniform distribution on RAND_{c,n}, is not pseudorandom.

Theorem 2.4 shows the importance for a good pseudorandom generator of failing each statistical test with a certain given probability.

III. Stochastic Properties of Long Pseudorandom Sequences

Classical infinite random sequences were first introduced as a type of disordered sequences, called "Kollektivs", by von Mises [20] as a foundation for probability theory. The two features characterizing a Kollektiv are: the existence of limiting relative frequencies within the sequence, and the invariance of these limits under the operation of an "admissible place selection". Here an admissible place selection is a procedure for selecting a subsequence of a given sequence ξ in such a way that the decision to select a term ξ[n] does not depend on the value of ξ[n]. Ville [19] showed that von Mises' approach is not satisfactory by proving that, for each countable set of "admissible place selection" rules, there exists a "Kollektiv" which does not satisfy the law of the iterated logarithm (LIL). Later, Martin-Löf [14] developed the notion of random sequences based on the notion of typicalness: a sequence is typical if it is not in any constructive null set. Schnorr [18] introduced p-randomness concepts by defining the constructive null sets as polynomial time computable measure 0 sets.
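The fairness condition of Definition 2.1 can be illustrated directly in code. The sketch below (illustrative code, not from the paper) implements a simple betting strategy — always wagering half of the current capital on the event that the next bit repeats the previous one — and checks the martingale condition F(x) = (F(x0) + F(x1))/2; on a highly regular sequence such as 000··· the capital grows without bound, i.e., the martingale succeeds.

```python
# A simple martingale (Definition 2.1): start with capital 1 and, after the
# first bit, always bet half of the current capital that the next bit equals
# the previous bit.  Illustrative code, not from the paper.
def martingale(x: str) -> float:
    capital = 1.0
    for i in range(1, len(x)):
        stake = capital / 2          # amount wagered on "bit repeats"
        if x[i] == x[i - 1]:
            capital += stake         # bet won
        else:
            capital -= stake         # bet lost
    return capital

# Fairness condition of Definition 2.1: F(x) = (F(x0) + F(x1)) / 2.
for x in ["", "0", "01", "1101"]:
    lhs = martingale(x)
    rhs = (martingale(x + "0") + martingale(x + "1")) / 2
    assert abs(lhs - rhs) < 1e-12

# The martingale succeeds on the constant sequence 000... :
print(martingale("0" * 20))  # capital grows as (3/2)**19
```

Since each bet is fair, no such strategy can gain in expectation against the uniform distribution; unbounded gain on a fixed sequence is exactly what witnesses its non-randomness.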
The law of the iterated logarithm (LIL) plays a central role in the study of the Wiener process, and Wang [23] showed that the LIL holds for p-random sequences. Computational complexity based pseudorandom sequences have been studied extensively in the literature. For example, p-random sequences are defined by taking each polynomial time computable martingale as a statistical test.

Definition 3.1: (Schnorr [18]) An infinite sequence ξ ∈ Σ^∞ is p-random (polynomial time random) if no polynomial time computable martingale F succeeds on ξ.

A sequence ξ ∈ Σ^∞ is Turing machine computable if there exists a Turing machine M that calculates the bits ξ[0], ξ[1], ···. In the following, we prove a theorem which says that, for each Turing machine computable non-p-random sequence ξ, there exists a martingale F such that the process of F succeeding on ξ can be efficiently observed in time O(n). The theorem is useful in the characterization of p-random sequences and in the characterization of the LIL-test waiting period.

Theorem 3.2: ([23]) For a sequence ξ ∈ Σ^∞ and a polynomial time computable martingale F, F succeeds on ξ if and only if there exist a martingale F' and a non-decreasing O(n)-time computable (with respect to the unary representation of numbers) function h : N → N such that F'(ξ[0..n−1]) ≥ h(n) for all n.

It is shown in [23] that p-random sequences are stochastic in the sense of von Mises and satisfy common statistical laws such as the law of the iterated logarithm. It is not difficult to show that all p-random sequences pass the NIST SP800-22 [17] tests for significance level 0.01, since each test in [17] can be converted to a polynomial time computable martingale which succeeds on all sequences that do not pass that test. However, none of the sequences generated by pseudorandom generators is p-random, since from the generator algorithm itself a martingale can be constructed that succeeds on the sequences it generates.

Since there is no efficient mechanism to generate p-random sequences, pseudorandom generators are commonly used to produce long sequences for cryptographic applications. While the required uniformity property (see NIST SP800-22 [17]) for pseudorandom sequences is equivalent to the law of large numbers, the scalability property (see [17]) is equivalent to the invariance property under the operation of "admissible place selection" rules. Since p-random sequences satisfy common statistical laws, it is reasonable to expect that pseudorandom sequences produced by pseudorandom generators satisfy these laws also (see, e.g., [17]).

The law of the iterated logarithm (LIL) describes the fluctuation scales of a random walk. For a nonempty string x ∈ Σ*, let

  S(x) = Σ_{i=0}^{|x|−1} x[i]   and   S*(x) = (2·S(x) − |x|) / √|x|,

where S(x) denotes the number of 1's in x and S*(x) denotes the reduced number of 1's in x. S*(x) amounts to measuring the deviations of S(x) from |x|/2 in units of √|x|/2.

The law of large numbers says that, for a pseudorandom sequence ξ, the limit of S(ξ[0..n−1])/n is 1/2, which corresponds to the frequency (monobit) test in NIST SP800-22 [17]. But it says nothing about the reduced deviation S*(ξ[0..n−1]), which will sometimes be large and sometimes be small. The law of the iterated logarithm gives an optimal bound √(2 ln ln n) for the fluctuations of S*(ξ[0..n−1]), and this holds for p-random sequences also.

Theorem 3.3: (LIL for p-random sequences [23]) For a sequence ξ ∈ Σ^∞, let

  S_lil(ξ↾n) = ( 2·Σ_{i=0}^{n−1} ξ[i] − n ) / √(2n ln ln n).    (4)

Then for each p-random sequence ξ ∈ Σ^∞ we have both

  lim sup_{n→∞} S_lil(ξ↾n) = 1   and   lim inf_{n→∞} S_lil(ξ↾n) = −1.

IV.
Normal Approximations to S_lil

In this section, we provide several results on normal approximations to the function S_lil(·) that will be used in later sections. The DeMoivre-Laplace theorem is a normal approximation to the binomial distribution; it says that the number of "successes" in n independent coin flips with head probability 1/2 is approximately normally distributed with mean n/2 and standard deviation √n/2. We first review a few classical results on the normal approximation to the binomial distribution.
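Before the formal statements, the quality of this approximation is easy to check numerically; the sketch below (illustrative code with arbitrarily chosen n and k, not part of the paper's experiments) compares the exact binomial tail with the standard normal distribution function Φ evaluated at the reduced deviation S*.

```python
# Compare the exact binomial probability Prob[S(x) <= k] for n fair coin
# flips with its normal approximation Phi((2k - n)/sqrt(n)).
# Illustrative sketch; n and k are arbitrary small choices.
from math import comb, erf, sqrt

def Phi(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def binom_cdf(n: int, k: int) -> float:
    """Exact Prob[at most k heads in n fair flips]."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

n, k = 1000, 520
exact = binom_cdf(n, k)
approx = Phi((2 * k - n) / sqrt(n))   # S*(x) = (2*S(x) - n)/sqrt(n)
print(exact, approx)                  # the two values agree to about two decimals
```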
Definition 4.1:
The normal density function with mean µ and variance σ² is defined as

  f(x) = ( 1 / (σ√(2π)) ) e^{−(x−µ)² / (2σ²)}.    (5)

For µ = 0 and σ = 1, we have the standard normal density function

  φ(x) = ( 1 / √(2π) ) e^{−x²/2},    (6)

and its integral

  Φ(x) = ∫_{−∞}^{x} φ(y) dy    (7)

is the standard normal distribution function.

Lemma 4.2: ([7, Chapter VII.1, p175]) For every x > 0, we have

  ( x^{−1} − x^{−3} ) φ(x) < 1 − Φ(x) < x^{−1} φ(x).    (8)

The following DeMoivre-Laplace limit theorem is derived from the approximation theorem on page 181 of [7].

Theorem 4.3: For fixed x_1, x_2, we have

  lim_{n→∞} Prob[ x_1 ≤ S*(ξ↾n) ≤ x_2 ] = Φ(x_2) − Φ(x_1).    (9)

The speed of convergence of the above approximation is bounded by max{ k³/n², k/√n }, where k = S(ξ↾n) − n/2.

The following lemma is useful for converting S* based approximation results into S_lil based approximations. It is obtained by noting the fact that S*(ξ↾n) = S_lil(ξ↾n) · √(2 ln ln n).

Lemma 4.4: For any x_1, x_2, we have

  Prob[ x_1 < S_lil(ξ↾n) < x_2 ] = Prob[ x_1 √(2 ln ln n) < S*(ξ↾n) < x_2 √(2 ln ln n) ].

In this paper, we only consider tests for n ≥ 2^26 and |S*(ξ↾n)| ≤ √(2 ln ln n). Thus k = S(ξ↾n) − n/2 ≃ √n · S*(ξ↾n) / 2 ≤ √(n ln ln n / 2), and hence the approximation errors max{ k³/n², k/√n } in the probability calculations of this paper are negligible. Unless stated otherwise, we will not mention the approximation errors in the remainder of this paper.

V. Weak-LIL test and design

Theorem 3.3 shows that pseudorandom sequences should satisfy the law of the iterated logarithm (LIL). Thus we propose the following weak LIL test for random sequences.
Weak LIL Test: Let α ∈ (0, 0.25] and let ℵ ⊂ N be a subset of natural numbers. We say that a sequence ξ does not pass the weak (α,ℵ)-LIL test if −1+α < S_lil(ξ↾n) < 1−α for all n ∈ ℵ. Furthermore, P(α,ℵ) denotes the probability that a random sequence passes the weak (α,ℵ)-LIL test, and E(α,ℵ) is the set of sequences that pass the weak (α,ℵ)-LIL test.

By definition, a sequence ξ passes the weak (α,ℵ)-LIL test if S_lil reaches either 1−α or −1+α at some point in ℵ. In practice, it is important to choose an appropriate set ℵ of test points and to calculate the probability for a random sequence ξ to pass the weak (α,ℵ)-LIL test. In this section we calculate the probability for a sequence to pass the weak (α,ℵ)-LIL test with the following choices of ℵ: ℵ_0 = {n_0}, ···, ℵ_t = {2^t n_0}, and unions ∪ ℵ_i, for given n_0 and t. Specifically, we will consider the cases for t ≤ 8 and n_0 = 2^26.

Theorem 5.1:
Let x_1, ···, x_t ∈ {0,1}^n. Then we have

  S_lil(x_1) + ··· + S_lil(x_t) = S_lil(x_1 ··· x_t) · √( t ln ln(tn) / ln ln n ).    (10)

Proof.
By (4), we have

  S_lil(x_1) + ··· + S_lil(x_t)
    = ( 2 Σ_{i=1}^{t} S(x_i) − tn ) / √(2n ln ln n)
    = ( 2·S(x_1 ··· x_t) − tn ) / √(2n ln ln n)
    = ( ( 2·S(x_1 ··· x_t) − tn ) / √(2tn ln ln(tn)) ) · √( t ln ln(tn) / ln ln n )
    = S_lil(x_1 ··· x_t) · √( t ln ln(tn) / ln ln n ).    (11)

□

Theorem 5.1 can be generalized as follows.
Theorem 5.2:
Let x_1 ∈ {0,1}^{sn} and x_2 ∈ {0,1}^{tn}. Then we have

  S_lil(x_1) √(s ln ln(sn)) + S_lil(x_2) √(t ln ln(tn)) = S_lil(x_1 x_2) √( (s+t) ln ln((s+t)n) ).    (12)

Proof.
We first note that

  S_lil(x_1) √(s ln ln(sn)) = ( 2·S(x_1) − sn ) / √(2n)    (13)
  S_lil(x_2) √(t ln ln(tn)) = ( 2·S(x_2) − tn ) / √(2n)    (14)

and, similarly, S_lil(x_1 x_2) √((s+t) ln ln((s+t)n)) = ( 2·S(x_1 x_2) − (s+t)n ) / √(2n). By adding equations (13) and (14) together, we get (12). The theorem is proved. □

Corollary 5.3:
Let 0 < θ < 1 and 1 ≤ s < t. For a given ξ↾sn with S_lil(ξ↾sn) = ε and a randomly chosen ξ[sn..tn−1], we have

  Prob[ S_lil(ξ↾tn) ≥ θ ] = Prob[ S*(ξ[sn..tn−1]) ≥ √(2/(t−s)) ( θ √(t ln ln tn) − ε √(s ln ln sn) ) ].    (15)

Proof.
By Theorem 5.2, we have

  S_lil(ξ[0..tn−1]) √(t ln ln tn) = S_lil(ξ[sn..tn−1]) √( (t−s) ln ln((t−s)n) ) + ε √(s ln ln sn).    (16)

Thus S_lil(ξ[0..tn−1]) ≥ θ if, and only if,

  S_lil(ξ[sn..tn−1]) ≥ ( θ √(t ln ln tn) − ε √(s ln ln sn) ) / √( (t−s) ln ln((t−s)n) ).    (17)

By Lemma 4.4, (17) is equivalent to (18):

  S*(ξ[sn..tn−1]) ≥ √(2/(t−s)) ( θ √(t ln ln tn) − ε √(s ln ln sn) ).    (18)

In other words, (15) holds. □

After these preliminary results, we can calculate the probability for a random sequence to pass the weak (α,ℵ)-LIL test.

Example 5.4:
For α = 0.1 (respectively, α = 0.05) and ℵ_i = {2^{i+26}} with 0 ≤ i ≤ 8, the entry at (ℵ_i, ℵ_i) in Table I lists the probability P(α, ℵ_i) that a random sequence passes the weak (α, ℵ_i)-LIL test.

Proof. Let θ = 1 − α. By Theorem 4.3 and Lemma 4.4,

  Prob[ |S_lil(ξ↾n)| ≥ θ ] ≃ 2 ( 1 − Φ( θ √(2 ln ln n) ) ).    (19)

By substituting θ = 0.95 (respectively, 0.9) and n = 2^26, ···, 2^34 into (19), we obtain the value P(0.05, ℵ_i) (respectively, P(0.1, ℵ_i)) at the entry (ℵ_i, ℵ_i) in Table I. This completes the proof. □

Now we consider the probability for a random sequence to pass the weak (α,ℵ)-LIL test with ℵ the union of two ℵ_i. First we present the following union theorem.

Theorem 5.5:
For fixed 0 < α < 1 and t ≥ 2, let θ = 1 − α, ℵ = {n, tn}, ℵ_a = {n}, and ℵ_b = {tn}. We have

  P(α,ℵ) ≃ P(α,ℵ_a) + (1/π) ∫_{−θ√(2 ln ln n)}^{θ√(2 ln ln n)} ∫_{√(2/(t−1)) (θ√(t ln ln tn) − y/√2)}^{∞} e^{−(x²+y²)/2} dx dy.    (20)

Alternatively, we have

  P(α,ℵ) ≃ P(α,ℵ_a) + P(α,ℵ_b) − (1/π) ∫_{θ√(2 ln ln n)}^{∞} ∫_{√(2/(t−1)) (θ√(t ln ln tn) − y/√2)}^{∞} e^{−(x²+y²)/2} dx dy.    (21)

Proof. Since E(α,ℵ) = E(α,ℵ_a) ∪ E(α,ℵ_b), we have

  P(α,ℵ) = ( P(α,ℵ_a) + P(α,ℵ_b) ) − Prob[ E(α,ℵ_a) ∩ E(α,ℵ_b) ].

By symmetry, it suffices to show that

  Prob[ S_lil(ξ↾tn) ≥ θ ∧ |S_lil(ξ↾n)| < θ ] ≃ (1/2π) ∫_{−θ√(2 ln ln n)}^{θ√(2 ln ln n)} ∫_{√(2/(t−1)) (θ√(t ln ln tn) − y/√2)}^{∞} e^{−(x²+y²)/2} dx dy.    (22)

Let Δ = √(2 ln ln n) · Δz. By Corollary 5.3 (with s = 1 and ε = z), the probability that S_lil(ξ↾n) ∈ [z, z+Δz] and S_lil(ξ↾tn) > θ is approximately

  ∫_{z√(2 ln ln n)}^{z√(2 ln ln n)+Δ} φ(x) dx · ∫_{√(2/(t−1)) (θ√(t ln ln tn) − z√(ln ln n))}^{∞} φ(x) dx
    ≃ Δ · φ( z√(2 ln ln n) ) · ∫_{√(2/(t−1)) (θ√(t ln ln tn) − z√(ln ln n))}^{∞} φ(x) dx.    (23)

By substituting y = z√(2 ln ln n) (so that z√(ln ln n) = y/√2) and integrating equation (23) over the interval y ∈ [−θ√(2 ln ln n), θ√(2 ln ln n)], we get equation (22).

Equation (21) can be proved similarly by the following observation: a sequence passes the weak (α,ℵ)-LIL test if it passes the weak LIL test at point n or at point tn. Thus the total probability is the sum of these two probabilities minus the probability that the sequence passes the weak LIL test at both points. The theorem is then proved. □

Example 5.6:
For α = 0.1 (respectively, α = 0.05) and ℵ_i = {2^{i+26}} with 0 ≤ i < j ≤ 8, the entry at (ℵ_i, ℵ_j) in Table I is the probability that a random sequence passes the weak (0.1, ℵ_i ∪ ℵ_j)-LIL test (respectively, the (0.05, ℵ_i ∪ ℵ_j)-LIL test).

Proof. The probability can be calculated using either equation (20) or (21) in Theorem 5.5 with θ = 1 − α. Our analysis shows that the results from (20) and (21) differ by a negligible amount. □

TABLE I. Weak (0.05, ℵ)-LIL and (0.1, ℵ)-LIL test probabilities P(α, ℵ_i ∪ ℵ_j) for 0 ≤ i ≤ j ≤ 8. [The numeric entries of the table are not recoverable from this copy.]

VI. Weak-LIL test design
II

In this section, we consider the design of the weak (α,ℵ)-LIL test with ℵ consisting of at least three points. To be consistent with Section V, we use the following notations: ℵ_0 = {n_0}, ···, and ℵ_t = {2^t n_0} for given n_0 and t. In particular, we will consider the cases for n_0 = 2^26.

Theorem 6.1:
For fixed 0 < α < 1 and t_1, t_2 ≥ 2, let θ = 1 − α, ℵ = {n, t_1 n, t_1 t_2 n}, and ℵ_a = {n, t_1 n}. Then we have

  P(α,ℵ) ≃ P(α,ℵ_a) + ( 1 / (π √(2π(t_1−1))) ) ∫_{C_1} ∫_{C_2} ∫_{C_3} e^{−(x²+y²)/2 − (z−y)²/(2(t_1−1))} dx dy dz,    (24)

where

  C_1 = [ −θ √(2 t_1 ln ln(t_1 n)), θ √(2 t_1 ln ln(t_1 n)) ]  (the range of z),
  C_2 = [ −θ √(2 ln ln n), θ √(2 ln ln n) ]  (the range of y),
  C_3 = [ √(2/(t_2−1)) ( θ √(t_2 ln ln(t_1 t_2 n)) − z/√(2 t_1) ), ∞ )  (the range of x).

Proof. By symmetry, it suffices to show that

  Prob[ S_lil(ξ↾t_1 t_2 n) ≥ θ ∧ ξ ∉ E(α,ℵ_a) ] ≃ ( 1 / (2π √(2π(t_1−1))) ) ∫_{C_1} ∫_{C_2} ∫_{C_3} e^{−(x²+y²)/2 − (z−y)²/(2(t_1−1))} dx dy dz.    (25)

By Corollary 5.3 (with s = t_1 and t = t_1 t_2), the probability that S_lil(ξ↾t_1 n) ∈ [z_0, z_0+Δz] and S_lil(ξ↾t_1 t_2 n) > θ is approximately

  P(z_0, Δz, t_1 n) · ∫_{√(2/(t_2−1)) ( θ√(t_2 ln ln(t_1 t_2 n)) − z_0 √(ln ln(t_1 n)) )}^{∞} φ(x) dx,    (26)

where P(z_0, Δz, t_1 n) is the probability that S_lil(ξ↾t_1 n) ∈ [z_0, z_0+Δz]. Let Δ = √(2 t_1 ln ln(t_1 n)) · Δz. By the argument used for equation (22) in the proof of Theorem 5.5, under the conditional event "|S_lil(ξ↾n)| < θ" we have approximately

  P(z_0, Δz, t_1 n) ≃ ( Δ / √(t_1 − 1) ) ∫_{C_2} φ(y) · φ( ( z_0 √(2 t_1 ln ln(t_1 n)) − y ) / √(t_1 − 1) ) dy.    (27)

By substituting (27) into (26), replacing z_0 √(2 t_1 ln ln(t_1 n)) with z, and integrating the resulting equation over z ∈ C_1, equation (25) is obtained. The theorem is then proved. □

Example 6.2:
Let n_0 = 2^26. By equation (24) in Theorem 6.1, we can calculate the probabilities P(α, ℵ_{i_1} ∪ ℵ_{i_2} ∪ ℵ_{i_3}) for triples of test points. By checking the different combinations, it can be shown that for any ℵ = ℵ_{i_1} ∪ ℵ_{i_2} ∪ ℵ_{i_3} with distinct 0 ≤ i_1, i_2, i_3 ≤ 8, we have P(0.05, ℵ) ≤ 0.08, with P(0.1, ℵ) lying in a similarly narrow range.

By recursively applying Corollary 5.3 as in the proof of Theorem 6.1, we can obtain algorithms for calculating the probability P(α,ℵ) when ℵ contains more than three points. The process is straightforward though tedious, and the details are omitted here. In the following, we give an alternative approach to approximating the probability P(α,ℵ) with |ℵ| > 3. Let α = 0.05 and ℵ = ℵ_0 ∪ ℵ_1 ∪ ℵ_2 ∪ ℵ_3. First we note that

  P(α,ℵ) = P(α, ℵ_0∪ℵ_1∪ℵ_2) + P(α,ℵ_3) − Prob[ E(α,ℵ_3) ∩ E(α, ℵ_0∪ℵ_1∪ℵ_2) ].    (28)

Since

  E(α,ℵ_3) ∩ E(α, ℵ_0∪ℵ_1∪ℵ_2) = ( E(α,ℵ_3) ∩ E(α,ℵ_0) ) ∪ ( E(α,ℵ_3) ∩ E(α,ℵ_1) ) ∪ ( E(α,ℵ_3) ∩ E(α,ℵ_2) ),

we have, by inclusion-exclusion,

  Prob[ E(α,ℵ_3) ∩ E(α, ℵ_0∪ℵ_1∪ℵ_2) ]
    = Prob[E(α,ℵ_3) ∩ E(α,ℵ_0)] + Prob[E(α,ℵ_3) ∩ E(α,ℵ_1)] + Prob[E(α,ℵ_3) ∩ E(α,ℵ_2)]
    − Prob[E(α,ℵ_3) ∩ E(α,ℵ_0) ∩ E(α,ℵ_1)] − Prob[E(α,ℵ_3) ∩ E(α,ℵ_0) ∩ E(α,ℵ_2)] − Prob[E(α,ℵ_3) ∩ E(α,ℵ_1) ∩ E(α,ℵ_2)]
    + Prob[E(α,ℵ_0) ∩ E(α,ℵ_1) ∩ E(α,ℵ_2) ∩ E(α,ℵ_3)].    (29)

Let ε = Prob[E(α,ℵ_0) ∩ E(α,ℵ_1) ∩ E(α,ℵ_2) ∩ E(α,ℵ_3)]. By substituting (29) into (28) and simplifying (every intersection probability can be rewritten in terms of the P(α, ·) values computed above), we get

  P(α,ℵ) = Σ_{i ∈ {0,1,2,3}} P(α,ℵ_i) + Σ_{i_1 < i_2 < i_3} P(α, ℵ_{i_1}∪ℵ_{i_2}∪ℵ_{i_3}) − Σ_{i_1 < i_2} P(α, ℵ_{i_1}∪ℵ_{i_2}) − ε.    (30)

On the other hand, we have

  2ε < 2 · Prob[E(α,ℵ_1) ∩ E(α,ℵ_2) ∩ E(α,ℵ_3)] = P(α, ℵ_1∪ℵ_2∪ℵ_3) + Σ_{i ∈ {1,2,3}} P(α,ℵ_i) − Σ_{i_1 < i_2 ∈ {1,2,3}} P(α, ℵ_{i_1}∪ℵ_{i_2}),

which bounds ε. Combining this bound with (30), a random sequence passes the weak (0.05, ℵ_0∪ℵ_1∪ℵ_2∪ℵ_3)-LIL test with approximately 9.65% probability.

VII. Strong
LIL test design
This section considers the following strong LIL tests.
Strong LIL Test: Let α ∈ (0, 0.25] and let ℵ_a, ℵ_b, ℵ_c ⊂ N be subsets of natural numbers. We say that a sequence ξ passes the strong (α; ℵ_a, ℵ_b)-LIL test if there exist n_1 ∈ ℵ_a and n_2 ∈ ℵ_b such that

  |S_lil(ξ↾n_i)| > 1 − α for i = 1, 2,   and   S_lil(ξ↾n_1) · S_lil(ξ↾n_2) < 0.    (31)

Alternatively, we say that a sequence ξ passes the strong (α; ℵ_c)-LIL test if there exist n_1, n_2 ∈ ℵ_c such that (31) holds. Furthermore, SP(α; ℵ_a, ℵ_b) and SP(α; ℵ_c) denote the probabilities that a random sequence passes the strong (α; ℵ_a, ℵ_b)-LIL and (α; ℵ_c)-LIL tests, respectively.

Theorem 7.1:
For fixed 0 < α < 1 and t ≥ 2, let θ = 1 − α, ℵ_a = {n}, and ℵ_b = {tn}. We have

  SP(α; ℵ_a, ℵ_b) ≃ (1/π) ∫_{θ√(2 ln ln n)}^{∞} ∫_{−∞}^{−√(2/(t−1)) (θ√(t ln ln tn) + y/√2)} e^{−(x²+y²)/2} dx dy.    (32)

Proof.
The theorem can be proved in the same way as Theorem 5.5. □
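To make the strong test concrete, the following Monte Carlo sketch (illustrative code; the parameters n_1 = 4096, n_2 = 8192, α = 0.25, and 500 trials are arbitrary small choices made for speed, whereas the paper's tests use n ≥ 2^26) estimates how rarely a truly random sequence crosses from one LIL boundary to the other:

```python
# Monte Carlo estimate of the probability that a random sequence passes the
# strong (alpha; {n1}, {n2})-LIL test: |S_lil| exceeds 1 - alpha at both
# test points, with opposite signs.  Small illustrative parameters only.
import math, random

def s_lil_at(points, bits):
    """S_lil(xi|n) for each n in points, from a stream of 0/1 bits."""
    s, out = 0, {}
    pts = sorted(points)
    it = iter(bits)
    for i in range(pts[-1]):
        s += 2 * next(it) - 1                 # partial sum of +/-1 steps
        if i + 1 in pts:
            n = i + 1
            out[n] = s / math.sqrt(2 * n * math.log(math.log(n)))
    return out

random.seed(7)
alpha, n1, n2, trials = 0.25, 4096, 8192, 500
passes = 0
for _ in range(trials):
    v = s_lil_at({n1, n2}, (random.getrandbits(1) for _ in range(n2)))
    if abs(v[n1]) > 1 - alpha and abs(v[n2]) > 1 - alpha and v[n1] * v[n2] < 0:
        passes += 1
print(passes / trials)   # a very small fraction (typically 0.0 in this run)
```

The estimate illustrates why the exact formulas above are needed: the strong-test event is far too rare to measure reliably by naive simulation at realistic sequence lengths.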
Example 7.2:
Fix α and testing points n_1 < n_2. The pass probability can be estimated as follows. Let f(k; r, 1/2) denote the probability that the r-th one appears at the position r + k. It is well known that for this (negative binomial) distribution we have mean µ = 2r and variance σ² = 2r. Thus the probability that the r-th one appears before the n-th position is approximated by the following probability:

(1/(2√(rπ))) ∫_{−∞}^{n} e^{−(x−2r)²/(4r)} dx  (33)

For the testing points n_1 and n_2, assume that S_lil(ξ↾n_1) ≤ −y for a given y ≥ θ. Then we have

S(ξ↾n_1) ≤ (n_1 − y√(2 n_1 ln ln n_1))/2.  (34)

For S_lil(ξ↾n_2) ≥ θ, we need to have

r(y) = S(ξ[n_1..n_2 − 1]) ≥ (n_2 + θ√(2 n_2 ln ln n_2) − n_1 + y√(2 n_1 ln ln n_1))/2.  (35)

Let α = 1 − θ, ℵ_a = {n_1}, and ℵ_b = {n_2}. Using the same argument as in the proof of Theorem 5.5 (in particular, the arguments for integrating equation (23)) and the negative binomial distribution equation (33), the probability that a sequence passes the strong (α; ℵ_a, ℵ_b)-LIL test can be calculated with the following equation:

(1/(2π)) ∫_{−∞}^{−θ√(ln ln n_1)} ∫_{−∞}^{n_2 − n_1} (1/√(r(y))) e^{−y² − (x − 2r(y))²/(4r(y))} dx dy  (36)

By substituting the values of θ, n_1, and n_2, (36) evaluates to 0.0002335. In other words, a random sequence passes this strong LIL test with probability approximately 0.0002335. The pass probability can be increased by running the strong (α; ℵ_a, ℵ_b)-LIL test with multiple points in ℵ_b.

VIII. Evaluating Pseudorandom Generators

In order to evaluate the quality of a pseudorandom generator G, we first choose a fixed sequence length n, a value 0 < α ≤ 1, and mutually distinct subsets ℵ_1, …, ℵ_t of {1, …, n}. It is preferred that the S_lil values on these subsets be as independent as possible (though they cannot be fully independent). For example, we may choose the ℵ_i as in Section VI. Then we carry out the following steps.

1) Set the reference probabilities P+(α, ℵ) = P−(α, ℵ) = P(α, ℵ)/2 for all ℵ.
2) Use G to construct a set of m ≥ 100 binary sequences of length n.
3) For each ℵ_i, calculate the empirical probability P̂+(α, ℵ_i) that these sequences pass the weak (α, ℵ_i)-LIL test via S_lil ≥ 1 − α (respectively, P̂−(α, ℵ_i) for S_lil ≤ −1 + α).
4) Calculate the average absolute probability distance

∆_wlil = (1/(2t)) Σ_{i=1}^{t} (|P̂+(α, ℵ_i) − P+(α, ℵ_i)| + |P̂−(α, ℵ_i) − P−(α, ℵ_i)|)

and the root-mean-square deviation

RMSD_wlil = √( Σ_{1≤i≤j≤t} ((p⁺_{i,j})² + (p⁻_{i,j})²) / (t² + t) )

where p⁺_{i,j} = P̂+(α, ℵ_i ∪ ℵ_j) − P+(α, ℵ_i ∪ ℵ_j) and p⁻_{i,j} = P̂−(α, ℵ_i ∪ ℵ_j) − P−(α, ℵ_i ∪ ℵ_j).
5) Decision criteria: the smaller ∆_wlil and RMSD_wlil are, the better the generator G.

IX. Snapshot LIL Tests and Random Generator Evaluation
We have considered statistical tests based on the limit theorem of the law of the iterated logarithm. These tests do not take full advantage of the distribution of S_lil, which defines a probability measure on the real line R. Let R ⊆ Σ^n be a set of m sequences with the standard probability definition on it; that is, for each x_0 ∈ R, let Prob[x = x_0] = 1/m. Then each set R ⊆ Σ^n induces a probability measure µ^R_n on R by letting

µ^R_n(I) = Prob[S_lil(x) ∈ I, x ∈ R]

for each Lebesgue measurable set I on R. For U = Σ^n, we use µ^U_n to denote the corresponding probability measure induced by the uniform distribution. By Definition 2.2, if R_n is the collection of all length-n sequences generated by a pseudorandom generator, then the difference between µ^U_n and µ^{R_n}_n is negligible.

By Theorem 4.3 and Lemma 4.4, for a uniformly chosen ξ, the distribution of S*(ξ↾n) can be approximated by a normal distribution with mean 0 and variance 1, with an error that vanishes as n grows (see [7]). In other words, the measure µ^U_n can be calculated as

µ^U_n((−∞, x]) ≃ Φ(x√(2 ln ln n)) = √(2 ln ln n) ∫_{−∞}^{x} φ(y√(2 ln ln n)) dy.  (37)

Table V in the Appendix lists the values µ^U_n(I) for the width-0.05 intervals I with n = 2^26, …, 2^34.

In order to evaluate a pseudorandom generator G, first choose a sequence of testing points n_1, …, n_t (e.g., n_i = 2^{25+i}). Secondly, use G to generate a set R ⊆ Σ^{n_t} of m sequences. Lastly, compare the distances between the two probability measures µ^R_n and µ^U_n for n = n_1, …, n_t. A generator G is considered "good" if, for sufficiently large m, the distances between µ^R_n and µ^U_n are negligible (or smaller than a given threshold). There are various definitions of statistical distances for probability measures.
In our analysis, we will consider the total variation distance [4]

d(µ^R_n, µ^U_n) = sup_{A⊆B} |µ^R_n(A) − µ^U_n(A)|,  (38)

the Hellinger distance [11]

H(µ^R_n ‖ µ^U_n) = (1/√2) √( Σ_{A∈B} (√(µ^R_n(A)) − √(µ^U_n(A)))² ),  (39)

and the root-mean-square deviation

RMSD(µ^R_n, µ^U_n) = √( Σ_{A∈B} (µ^R_n(A) − µ^U_n(A))² / |B| ),  (40)

where B is the partition of the real line R defined as

B = {(−∞, −1), [1, ∞)} ∪ {[0.05x − 1, 0.05x − 0.95) : 0 ≤ x ≤ 39}.

In Section X, we will present some examples of using these distances to evaluate several pseudorandom generators.

X. Experimental Results
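Before turning to the experiments, the snapshot comparison of Section IX can be sketched in code. The partition and the µ^U_n approximation follow equations (37) through (40); as an assumption, Python's random module stands in for the generator under test, and the supremum in (38) over unions of partition cells is computed as half the L1 distance between the cell probabilities:

```python
import math
import random

def phi(z):  # standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def s_lil(ones, n):
    return (2 * ones - n) / math.sqrt(2 * n * math.log(math.log(n)))

def partition_cells():
    """B = {(-inf,-1), [1,inf)} plus the forty width-0.05 cells [0.05x-1, 0.05x-0.95)."""
    cells = [(-math.inf, -1.0)]
    cells += [(0.05 * x - 1.0, 0.05 * x - 0.95) for x in range(40)]
    cells.append((1.0, math.inf))
    return cells

def mu_uniform(n, cells):
    """Cell probabilities of mu_n^U via mu_n^U((-inf, x]) ~ Phi(x*sqrt(2 ln ln n))."""
    c = math.sqrt(2.0 * math.log(math.log(n)))
    def cdf(x):
        return 0.0 if x == -math.inf else 1.0 if x == math.inf else phi(x * c)
    return [cdf(b) - cdf(a) for a, b in cells]

def mu_sample(values, cells):
    """Empirical cell probabilities of mu_n^R from sample S_lil values."""
    m = len(values)
    return [sum(1 for v in values if a <= v < b) / m for a, b in cells]

def distances(p, q):
    """Total variation, Hellinger, and RMSD over the finite partition B."""
    tv = 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))
    hel = math.sqrt(sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q)) / 2)
    rmsd = math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p))
    return tv, hel, rmsd

n = 4096
cells = partition_cells()
rng = random.Random(3)
vals = [s_lil(sum(rng.getrandbits(1) for _ in range(n)), n) for _ in range(300)]
tv, hel, rmsd = distances(mu_sample(vals, cells), mu_uniform(n, cells))
```

With a genuinely uniform stand-in and a small sample, the distances reflect only sampling noise; the experiments below use far longer sequences and larger samples.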
As an example to illustrate the importance of LIL tests, we carried out weak LIL test experiments on the pseudorandom generators SHA1PRNG (Java) and NIST DRBG [17] with parameters α = 0.1 (and 0.05) and the testing-point sets ℵ_1, …, ℵ_9 listed in Table II (note that 2^34 bits = 2^31 bytes = 2GB, the limit of the int data type for integer variables). The initial 145MB of each sequence that we generated passes the NIST tests with P-values larger than 0.01, except for the "longest run of ones in a block" test, which failed for several sequences.
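The weak-LIL evaluation procedure of Section VIII, which these experiments apply, can be sketched as follows. As an assumption, the reference probabilities are taken from the normal approximation P±(α, {n}) ≈ 1 − Φ((1 − α)√(2 ln ln n)) for singleton sets ℵ = {n}, and Python's random module stands in for the generator G under test; only ∆_wlil is computed, RMSD_wlil being analogous over the pairwise unions ℵ_i ∪ ℵ_j:

```python
import math
import random

def s_lil(ones, n):
    return (2 * ones - n) / math.sqrt(2 * n * math.log(math.log(n)))

def phi(z):  # standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def delta_wlil(sequences, alpha, points):
    """Average absolute probability distance Delta_wlil for singleton sets {n}."""
    theta = 1 - alpha
    total = 0.0
    for n in points:
        # normal-approximation reference; by symmetry the same value serves P+ and P-
        ref = 1.0 - phi(theta * math.sqrt(2 * math.log(math.log(n))))
        stats = [s_lil(sum(seq[:n]), n) for seq in sequences]
        p_plus = sum(1 for s in stats if s >= theta) / len(sequences)
        p_minus = sum(1 for s in stats if s <= -theta) / len(sequences)
        total += abs(p_plus - ref) + abs(p_minus - ref)
    return total / (2 * len(points))

rng = random.Random(11)
seqs = [[rng.getrandbits(1) for _ in range(4096)] for _ in range(200)]
d = delta_wlil(seqs, alpha=0.1, points=[1024, 2048, 4096])
```

For a good generator and a large sample, d should be close to zero; the lengths and sample size here are kept tiny compared with the 2GB sequences used in the actual experiments.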
A. Java SHA1PRNG API based sequences
The pseudorandom generator SHA1PRNG API in Java generates the sequence SHA1′(s, 0) SHA1′(s, 1) ⋯, where s is an optional seeding string of arbitrary length, the counter i is 64 bits long, and SHA1′(s, i) is the first 64 bits of SHA1(s, i). In the experiment, we generated one thousand sequences with four-byte seeds of the integers 0, 1, 2, …, 999 respectively. For each sequence generation, the "random.nextBytes()" method of the SecureRandom class is called 2^26 times and a 32-byte output is requested for each call. This produces sequences of 2^34 bits. The LIL test is then run on these sequences, and the first picture in Figure 1 shows the LIL-test result curves for the first 100 sequences. To reduce the size of the figure, we use the scale 10000n for the x-axis. In other words, Figure 1 shows the values S_lil(ξ[0..n − 1]) over the tested range of n, and the testing points ℵ_1, …, ℵ_9 are mapped to 82, 116, 164, 232, 328, 463, 655, 927, and 1310.

Fig. 1. LIL test results for sequences generated by Java SHA1PRNG, NIST SP800 90A SHA1-DRBG, and NIST SP800 90A SHA2-DRBG

On the partition B, the probability distributions µ^{JavaSHA}_n with n = 2^26, …, 2^34 are presented in Appendix Table VI. Figure 2 compares these distributions.

Fig. 2. The distributions µ^U_n and µ^{JavaSHA}_n with n = 2^26, …, 2^34

TABLE II
Number of sequences that pass the LIL values 0.95 and −0.95

          ℵ_1  ℵ_2  ℵ_3  ℵ_4  ℵ_5  ℵ_6  ℵ_7  ℵ_8  ℵ_9
n         82   116  164  232  328  463  655  927  1310
Java SHA1PRNG
 0.95     14   12   13   18   12   10   15   7    8
 −0.95    13   13   14   9    10   7    9    8    6
NIST SP800 90A SHA1-DRBG at sample size 10000
 0.95     10   9    12   10   5    5    11   6    6
 −0.95    11   12   8    13   8    10   10   7    12
NIST SP800 90A SHA256-DRBG at sample size 10000
 0.95     9    10   12   15   9    10   16   14   3
 −0.95    13   9    8    4    8    6    9    12   9
NIST SP800 90A SHA256-DRBG at sample size 100000
 0.95     120  107  127  110  89   93   93   84   70
 −0.95    107  106  92   99   91   93   95   84   78
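The construction just described, with SHA1′(s, i) being the first 64 bits of SHA1(s, i) and a 64-bit counter, can be sketched as follows. This is an illustrative reading of the description above, not the actual OpenJDK SHA1PRNG code; hashlib's SHA-1 stands in for the SHA1(s, i) invocation, assumed here to hash s concatenated with the big-endian counter:

```python
import hashlib

def sha1_prime(seed, i):
    """First 64 bits of SHA1(s, i), with the counter i encoded as 64 bits (an assumption)."""
    return hashlib.sha1(seed + i.to_bytes(8, "big")).digest()[:8]

def prng_stream(seed, num_blocks):
    """Concatenate SHA1'(s, 0) SHA1'(s, 1) ... as in the paper's description."""
    return b"".join(sha1_prime(seed, i) for i in range(num_blocks))
```

For the experiments above, the seeds are the four-byte encodings of 0, …, 999, and enough counter values are consumed to reach 2^34 output bits per sequence.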
B. NIST SP800 90A DRBG pseudorandom generators
NIST SP800.90A [2] specifies three types of DRBG generators: hash function based, block cipher based, and ECC based. For DRBG generators, the maximum number of calls between reseeding is 2^48 for hash function and AES based generators (the number is 2^32 for T-DES and ECC-DRBG generators). In our experiment, we used the hash function based DRBG, where a hash function G is used to generate the sequence G(V) G(V + 1) G(V + 2) ⋯, with V being a seedlen-bit counter that is derived from the secret seeds. The seedlen is 440 for SHA1 and SHA-256, and the value of V is revised after at most 2^19 bits are output. We generated 10000 sequences nistSHADRBG0, …, nistSHADRBG9999. For each sequence nistSHADRBGi, the seed "ith secret seed for NIST DRBG" is used to derive the initial DRBG state V_0 and C_0. Each sequence has the format G(V_0) G(V_0 + 1) ⋯ G(V_1) G(V_1 + 1) ⋯, where V_{i+1} is derived from the values of V_i and C_i. In other words, each V_i is used to produce at most 2^19 bits before it is revised. The second picture (respectively, the third picture) in Figure 1 shows the LIL-test result curves for the first 100 sequences when SHA1 (respectively, SHA256) is used as the hash function, and Table II shows the number of sequences that reach the values 0.9, −0.9, 0.95, and −0.95 at the corresponding testing points. The probability distributions µ^{nistDRBGsha1}_n, µ^{nistDRBGsha256}_n, and µ^{nistDRBGsha256,10K}_n on the partition B are presented in Appendix Tables VII, VIII, and IX respectively. Figure 3 compares the distributions µ^{nistDRBGsha1}_n at the sample size 1000. The comparisons for µ^{nistDRBGsha256}_n are presented in Appendix Figures 5 and 6.

C. Comparison and Discussion
Based on Table II, the average absolute probability distance ∆_wlil and the root-mean-square deviation RMSD_wlil at the sample size 1000 (for DRBG-SHA256, we also include results for sample size 10000) are calculated and shown in Table III.

TABLE III
The probability distances ∆_wlil and RMSD_wlil

            Java SHA1PRNG   DRBG-SHA1   DRBG-SHA256 (1000)   DRBG-SHA256 (10000)
∆_wlil
RMSD_wlil

Fig. 3. The distributions µ^U_n and µ^{nistDRBGsha1}_n with n = 2^26, …, 2^34

These values are quite large. Based on snapshot LIL tests at the points 2^26, …, 2^34, the corresponding total variation distance d(µ^R_n, µ^U_n), Hellinger distance H(µ^R_n ‖ µ^U_n), and root-mean-square deviation RMSD(µ^R_n, µ^U_n) at sample size 1000 (also DRBG-SHA256 at sample size 10,000) are calculated and shown in Table IV, where the subscripts 1, 2, 3, 4 refer to the four generators in the order of Table III. The total variation distance between µ^R_n and µ^U_n is larger than 0.06, and the root-mean-square deviation is around 0.005.

TABLE IV
Total variation and Hellinger distances

n       2^26  2^27  2^28  2^29  2^30  2^31  2^32  2^33  2^34
d_1     .074  .704  .064  .085  .067  .085  .074  .069  .071
H_1     .062  .067  .063  .089  .066  .078  .077  .061  .068
RMSD_1  .005  .005  .004  .005  .004  .006  .005  .005  .005
d_2     .066  .072  .079  .067  .084  .073  .065  .078  .083
H_2     .060  .070  .073  .062  .077  .066  .067  .070  .087
RMSD_2  .004  .005  .005  .004  .005  .004  .004  .005  .005
d_3     .076  .069  .072  .093  .071  .067  .078  .081  .066
H_3     .082  .064  .068  .088  .079  .073  .076  .074  .080
RMSD_3  .005  .004  .004  .006  .004  .004  .005  .005  .005
d_4     .021  .022  .026  .024  .022  .024  .026  .024  .021
H_4     .019  .021  .024  .024  .022  .023  .025  .022  .021
RMSD_4  .001  .001  .002  .001  .001  .002  .002  .002  .001

Though the statistical distances in Tables III and IV may be acceptable for various applications, a cryptographic random source at the sample size "1000 of 2GB-long sequences" is expected to have a statistical distance smaller than 0.03 and an RMSD smaller than 0.001 from the standard normal distribution µ^U_n (see, e.g., [8]). At the sample size of 10000 2GB sequences, the statistical distance is reduced to around 0.02, which is still larger than acceptable for cryptographic applications.

One can also visually analyze the pictures in Figure 1. For example, from the three pictures in Figure 1, we may get the following impression: sequences generated by Java SHA1PRNG perform well at staying within the interval [−1, 1], though there is a big gap between 400 and 900 in the bottom area close to the line y = −1. Among the three pictures, sequences generated by SHA1-DRBG have a better performance that looks closer to a true random source. For sequences generated by SHA2-DRBG, too many sequences reach or go beyond y = 1 and y = −1.

XI. Conclusion

This paper proposed statistical distance based LIL testing techniques and showed that, at sample size 1000, the collection of sequences generated by several commonly used pseudorandom generators has a statistical distance of around 0.06 and a root-mean-square deviation of around 0.005 from a true random source. These values are larger than expected for various cryptographic applications. This paper also calculated the probability for weak LIL tests on sequences of less than 2GB. For longer sequences, the corresponding probabilities decrease significantly; thus large sample sizes are needed for better LIL testing. Alternatively, one may split longer sequences into independent sub-sequences of 2GB each and then use the probabilities calculated in this paper to carry out LIL testing on them. For strong LIL tests, this paper obtained a preliminary result with a very small probability for a random sequence to pass. It would be interesting to calculate the exact probability SP(α; ℵ) for a continuous interval ℵ (e.g., ℵ = [2^26, 2^34]). We believe that SP(α; ℵ) is large enough for a reasonable interval ℵ such as ℵ = [2^26, 2^34]. When the probability SP(α; ℵ) becomes larger, the required sample size for strong LIL testing will be smaller. It would also be very important to find new techniques for designing pseudorandom generators with smaller statistical distance and smaller root-mean-square deviation from a true random source.

References

[1] J. Ball, J. Borger, and G. Greenwald. Revealed: how US and UK spy agencies defeat internet privacy and security. http: // / world / / sep / / nsa-gchq-encryption-codes-security, Sept. 13, 2013.
[2] E. Barker and J. Kelsey.
NIST SP 800-90A: Recommendation for Ran-dom Number Generation Using Deterministic Random Bit Generators .NIST, 2012.[3] M. Blum and S. Micali. How to generate cryptographically strongsequences of pseudorandom bits.
SIAM J. Comput. , 13:850–864, 1984.[4] J.A. Clarkson and C.R. Adams. On definitions of bounded variation forfunctions of two variables.
Tran. AMS , 35(4):824–854, 1933.[5] P. Erdös and M. Kac. On certain limit theorems of the theory ofprobability.
Bulletin of AMS , 52(4):292–302, 1946. [6] W. Feller. The fundamental limit theorems in probability. Bulletin ofAMS , 51(11):800–832, 1945.[7] W. Feller.
Introduction to Probability Theory and Its Applications, volume I. John Wiley & Sons, Inc., New York, 1968. [8] D. Freedman, R. Pisani, and R. Purves.
Statistics . Norton & Company,2007.[9] O. Goldreich.
Foundations of cryptography: a primer . Now PublishersInc, 2005.[10] S. Goldwasser and S. Micali. Probabilistic encryption.
J. Comput. Sys.Sci. , 28(2):270–299, 1984.[11] E. Hellinger. Neue begründung der theorie quadratischer formenvon unendlichvielen veränderlichen.
J. für die reine und angewandteMathematik , 136:210–271, 1909.[12] A. Khintchine. Über einen satz der wahrscheinlichkeitsrechnung.
Fund.Math , 6:9–20, 1924.[13] A. N. Kolmogorov. Three approaches to the definition of the concept“quantity of information".
Problems Inform. Transmission , 1:3–7, 1965.[14] P. Martin-Löf. The definition of random sequences.
Inform. and Control ,9:602–619, 1966.[15] NIST. Test suite, http: // csrc.nist.gov / groups / ST / toolkit / rng / , 2010.[16] N. Perlroth, J. Larson, and S. Shane. NSA able to foil basic safeguards ofprivacy on web. http: // / / / / us / nsa-foils-much-internet-encryption.html, Sep. 5, 2013.[17] A. Rukhin, J. Soto, J. Nechvatal, M. Smid, E. Barker, S. Leigh,M. Levenson, M. Vangel, D. Banks, A. Heckert, J. Dray, and S. Vo. AStatistical Test Suite for Random and Pseudorandom Number Generatorsfor Cryptographic Applications . NIST SP 800-22, 2010.[18] C. P. Schnorr.
Zufälligkeit und Wahrscheinlichkeit . Lecture Notes inMath. 218. Springer Verlag, 1971.[19] J. Ville.
Étude Critique de la Notion de Collectif . Gauthiers-Villars,Paris, 1939.[20] R. von Mises. Grundlagen der wahrscheinlichkeitsrechung.
Math. Z. ,5:52–89, 1919.[21] Yongge Wang. The law of the iterated logarithm for p-random se-quences. In
IEEE Conf. Comput. Complexity , pages 180–189, 1996.[22] Yongge Wang. Randomness and complexity.
PhD Thesis, University ofHeidelberg , 1996.[23] Yongge Wang. Resource bounded randomness and computationalcomplexity.
Theoret. Comput. Sci. , 237:33–55, 2000.[24] Yongge Wang. A comparison of two approaches to pseudorandomness.
Theoretical computer science , 276(1):449–459, 2002.[25] A. C. Yao. Theory and applications of trapdoor functions. In
Proc. 23rdIEEE FOCS , pages 80–91, 1982.
XII. Appendix

Figure 4 shows the distributions of µ^U_n for n = 2^26, …, 2^34, and Table V lists the values µ^U_n(I) on B with n = 2^26, …, 2^34.

Fig. 4. Density functions for the distributions µ^U_n with n = 2^26, …, 2^34

Since µ^U_n(I) is symmetric, it is sufficient to list the distribution on the positive side of the real line. Table VI lists the values µ^{JavaSHA}_n(I) on B with n = 2^26, …, 2^34. Table VII lists the values µ^{nistDRBGsha1}_n(I) on B with n = 2^26, …, 2^34. Table VIII lists the values µ^{nistDRBGsha256}_n(I) on B with n = 2^26, …, 2^34; Figure 5 compares the distributions µ^{nistDRBGsha256}_n.

Fig. 5. The distributions µ^U_n and µ^{nistDRBGsha256}_n with n = 2^26, …, 2^34

Table IX lists the values µ^{nistDRBGsha256,10K}_n(I) on B with n = 2^26, …, 2^34; Figure 6 compares the distributions µ^{nistDRBGsha256,10K}_n.

Fig. 6. The distributions µ^U_n and µ^{nistDRBGsha256,10K}_n with n = 2^26, …, 2^34

TABLE V
The distribution µ^U_n induced by S_lil for n = 2^26, …, 2^34 (due to symmetry, only the distribution on the positive part of the real line R is given)

              2^26     2^27     2^28     2^29     2^30     2^31     2^32     2^33     2^34
[0.00, 0.05)  .047854  .048164  .048460  .048745  .049018  .049281  .049534  .049778  .050013
[0.05, 0.10)  .047168  .047464  .047748  .048020  .048281  .048532  .048773  .049006  .049230
[0.10, 0.15)  .045825  .046096  .046354  .046600  .046839  .047067  .047287  .047498  .047701
[0.15, 0.20)  .043882  .044116  .044340  .044553  .044758  .044953  .045141  .045322  .045496
[0.20, 0.25)  .041419  .041609  .041789  .041961  .042125  .042282  .042432  .042575  .042713
[0.25, 0.30)  .038534  .038674  .038807  .038932  .039051  .039164  .039272  .039375  .039473
[0.30, 0.35)  .035336  .035424  .035507  .035584  .035657  .035725  .035790  .035850  .035907
[0.35, 0.40)  .031939  .031976  .032010  .032041  .032068  .032093  .032115  .032135  .032153
[0.40, 0.45)  .028454  .028445  .028434  .028421  .028407  .028392  .028375  .028358  .028340
[0.45, 0.50)  .024986  .024936  .024886  .024835  .024785  .024735  .024686  .024637  .024588
[0.50, 0.55)  .021627  .021542  .021460  .021379  .021300  .021222  .021146  .021072  .020999
[0.55, 0.60)  .018450  .018340  .018234  .018130  .018029  .017931  .017836  .017743  .017653
[0.60, 0.65)  .015515  .015388  .015265  .015146  .015032  .014921  .014813  .014709  .014608
[0.65, 0.70)  .012859  .012723  .012591  .012465  .012344  .012227  .012114  .012004  .011899
[0.70, 0.75)  .010506  .010367  .010234  .010106  .009984  .009867  .009754  .009645  .009541
[0.75, 0.80)  .008460  .008324  .008195  .008072  .007954  .007841  .007733  .007629  .007530
[0.80, 0.85)  .006714  .006587  .006466  .006351  .006241  .006137  .006037  .005941  .005850
[0.85, 0.90)  .005253  .005137  .005027  .004923  .004824  .004730  .004640  .004555  .004474
[0.90, 0.95)  .004050  .003948  .003851  .003759  .003672  .003590  .003512  .003438  .003368
[0.95, 1.00)  .003079  .002990  .002906  .002828  .002754  .002684  .002617  .002555  .002495
[1.00, ∞)     .008090  .007750  .007437  .007147  .006877  .006627  .006393  .006175  .005970

TABLE VI
The distribution µ^{JavaSHA}_n induced by S_lil for n = 2^26, …, 2^34

                2^26  2^27  2^28  2^29  2^30  2^31  2^32  2^33  2^34
(−∞, −1)        .011  .008  .012  .007  .006  .006  .008  .006  .004
[−1.00, −0.95)  .002  .005  .002  .002  .004  .001  .001  .002  .002
[−0.95, −0.90)  .005  .007  .004  .008  .004  .004  .003  .003  .003
[−0.90, −0.85)  .008  .005  .006  .003  .008  .005  .001  .003  .007
[−0.85, −0.80)  .007  .011  .006  .005  .007  .006  .003  .004  .006
[−0.80, −0.75)  .010  .006  .010  .011  .010  .005  .003  .008  .006
[−0.75, −0.70)  .015  .010  .013  .010  .002  .004  .013  .011  .012
[−0.70, −0.65)  .013  .017  .010  .007  .010  .006  .011  .009  .009
[−0.65, −0.60)  .019  .017  .013  .013  .011  .017  .011  .013  .007
[−0.60, −0.55)  .014  .021  .015  .022  .019  .018  .017  .022  .017
[−0.55, −0.50)  .020  .032  .024  .019  .022  .022  .021  .021  .020
[−0.50, −0.45)  .030  .030  .027  .028  .024  .022  .027  .025  .022
[−0.45, −0.40)  .034  .035  .037  .021  .025  .020  .031  .033  .037
[−0.40, −0.35)  .036  .035  .037  .038  .033  .037  .032  .039  .032
[−0.35, −0.30)  .042  .037  .044  .031  .034  .035  .035  .033  .042
[−0.30, −0.25)  .043  .033  .042  .039  .032  .043  .046  .040  .041
[−0.25, −0.20)  .042  .039  .040  .053  .048  .039  .047  .039  .048
[−0.20, −0.15)  .053  .047  .042  .049  .052  .042  .039  .038  .029
[−0.15, −0.10)  .055  .045  .049  .056  .053  .038  .048  .052  .043
[−0.10, −0.05)  .047  .046  .051  .049  .046  .054  .041  .049  .053
[−0.05, 0)      .040  .037  .048  .047  .045  .055  .053  .059  .048
[0, 0.05)       .042  .046  .050  .053  .041  .041  .041  .045  .044
[0.05, 0.10)    .039  .053  .048  .048  .043  .050  .049  .038  .049
[0.10, 0.15)    .040  .054  .039  .049  .058  .064  .039  .050  .054
[0.15, 0.20)    .042  .047  .039  .047  .051  .058  .064  .041  .038
[0.20, 0.25)    .034  .030  .029  .031  .040  .053  .050  .049  .040
[0.25, 0.30)    .027  .036  .040  .032  .041  .033  .039  .040  .044
[0.30, 0.35)    .034  .027  .034  .033  .043  .022  .033  .040  .040
[0.35, 0.40)    .026  .033  .030  .043  .030  .030  .030  .022  .038
[0.40, 0.45)    .030  .030  .016  .024  .030  .026  .034  .022  .031
[0.45, 0.50)    .020  .021  .023  .028  .019  .033  .028  .022  .021
[0.50, 0.55)    .020  .018  .018  .008  .025  .024  .013  .026  .018
[0.55, 0.60)    .019  .012  .020  .020  .017  .020  .022  .015  .023
[0.60, 0.65)    .015  .015  .014  .009  .015  .015  .015  .017  .019
[0.65, 0.70)    .011  .013  .014  .008  .010  .008  .009  .015  .013
[0.70, 0.75)    .009  .005  .011  .013  .008  .009  .009  .015  .012
[0.75, 0.80)    .011  .009  .007  .004  .006  .009  .009  .006  .003
[0.80, 0.85)    .007  .008  .009  .004  .008  .009  .002  .009  .007
[0.85, 0.90)    .008  .004  .007  .008  .004  .003  .006  .008  .007
[0.90, 0.95)    .006  .004  .007  .002  .004  .004  .002  .004  .003
[0.95, 1.00)    .003  .004  .002  .010  .002  .004  .004  .002  .002
[1.00, ∞)       .011  .008  .011  .008  .010  .006  .011  .005  .006

TABLE VII
The distribution µ^{nistDRBGsha1}_n induced by S_lil for n = 2^26, …, 2^34

                2^26  2^27  2^28  2^29  2^30  2^31  2^32  2^33  2^34
(−∞, −1)        .009  .008  .007  .008  .006  .007  .007  .006  .007
[−1.00, −0.95)  .002  .004  .001  .005  .002  .003  .003  .001  .005
[−0.95, −0.90)  .004  .007  .004  .005  .002  .006  .004  .002  .000
[−0.90, −0.85)  .009  .006  .011  .008  .005  .003  .006  .006  .009
[−0.85, −0.80)  .005  .010  .004  .010  .008  .003  .004  .010  .003
[−0.80, −0.75)  .007  .004  .010  .011  .006  .008  .011  .005  .002
[−0.75, −0.70)  .009  .005  .014  .008  .011  .017  .007  .013  .011
[−0.70, −0.65)  .019  .014  .014  .011  .026  .015  .012  .013  .009
[−0.65, −0.60)  .013  .020  .010  .012  .018  .011  .014  .012  .011
[−0.60, −0.55)  .016  .021  .019  .014  .019  .022  .021  .018  .017
[−0.55, −0.50)  .022  .018  .022  .027  .028  .022  .023  .023  .023
[−0.50, −0.45)  .027  .025  .020  .033  .021  .029  .025  .026  .034
[−0.45, −0.40)  .028  .030  .024  .027  .025  .033  .034  .028  .035
[−0.40, −0.35)  .030  .036  .031  .026  .027  .026  .037  .041  .036
[−0.35, −0.30)  .041  .032  .037  .035  .032  .026  .040  .039  .038
[−0.30, −0.25)  .034  .043  .052  .038  .039  .032  .034  .032  .048
[−0.25, −0.20)  .045  .031  .048  .038  .038  .046  .036  .030  .044
[−0.20, −0.15)  .055  .044  .048  .039  .039  .042  .046  .051  .050
[−0.15, −0.10)  .056  .058  .046  .046  .041  .050  .046  .050  .042
[−0.10, −0.05)  .046  .048  .048  .044  .044  .051  .046  .059  .039
[−0.05, 0)      .045  .050  .035  .051  .040  .053  .048  .059  .048
[0, 0.05)       .045  .040  .051  .052  .047  .041  .033  .044  .042
[0.05, 0.10)    .058  .038  .060  .047  .056  .044  .044  .056  .051
[0.10, 0.15)    .042  .044  .035  .041  .057  .047  .050  .040  .048
[0.15, 0.20)    .037  .040  .040  .051  .039  .049  .045  .038  .033
[0.20, 0.25)    .034  .050  .037  .056  .045  .039  .046  .039  .033
[0.25, 0.30)    .042  .041  .034  .046  .042  .032  .037  .039  .035
[0.30, 0.35)    .036  .036  .040  .035  .036  .031  .043  .037  .040
[0.35, 0.40)    .022  .038  .028  .033  .045  .029  .043  .032  .038
[0.40, 0.45)    .029  .020  .026  .023  .037  .036  .031  .018  .034
[0.45, 0.50)    .025  .026  .028  .023  .019  .029  .020  .019  .026
[0.50, 0.55)    .024  .025  .034  .019  .012  .031  .024  .023  .031
[0.55, 0.60)    .020  .012  .016  .015  .023  .020  .019  .022  .014
[0.60, 0.65)    .010  .016  .011  .014  .013  .019  .011  .011  .015
[0.65, 0.70)    .012  .013  .011  .008  .015  .012  .010  .013  .013
[0.70, 0.75)    .006  .012  .011  .008  .012  .011  .011  .014  .006
[0.75, 0.80)    .010  .011  .005  .012  .009  .006  .009  .006  .011
[0.80, 0.85)    .006  .005  .006  .005  .006  .005  .002  .008  .006
[0.85, 0.90)    .005  .003  .006  .003  .002  .005  .001  .007  .005
[0.90, 0.95)    .005  .007  .003  .002  .003  .004  .006  .004  .002
[0.95, 1.00)    .002  .004  .003  .004  .001  .001  .003  .001  .001
[1.00, ∞)       .008  .005  .010  .007  .004  .004  .008  .005  .005

TABLE VIII
The distribution µ^{nistDRBGsha256}_n induced by S_lil for n = 2^26, …, 2^34

                2^26  2^27  2^28  2^29  2^30  2^31  2^32  2^33  2^34
(−∞, −1)        .007  .005  .005  .002  .004  .003  .003  .009  .006
[−1.00, −0.95)  .006  .004  .003  .002  .004  .003  .006  .003  .003
[−0.95, −0.90)  .003  .004  .006  .001  .005  .003  .002  .001  .001
[−0.90, −0.85)  .004  .006  .003  .005  .004  .005  .002  .005  .003
[−0.85, −0.80)  .007  .006  .002  .013  .005  .007  .011  .005  .004
[−0.80, −0.75)  .008  .010  .007  .006  .004  .008  .013  .007  .004
[−0.75, −0.70)  .007  .010  .010  .013  .005  .004  .009  .010  .006
[−0.70, −0.65)  .021  .013  .012  .015  .006  .018  .011  .010  .008
[−0.65, −0.60)  .009  .008  .012  .015  .021  .009  .014  .019  .022
[−0.60, −0.55)  .016  .019  .019  .018  .016  .008  .020  .012  .015
[−0.55, −0.50)  .025  .013  .021  .016  .017  .023  .021  .013  .020
[−0.50, −0.45)  .014  .033  .026  .023  .018  .015  .025  .034  .025
[−0.45, −0.40)  .028  .024  .033  .023  .034  .034  .030  .026  .022
[−0.40, −0.35)  .021  .025  .031  .034  .029  .036  .032  .033  .022
[−0.35, −0.30)  .034  .031  .039  .043  .037  .040  .024  .031  .037
[−0.30, −0.25)  .042  .041  .036  .027  .033  .031  .036  .041  .036
[−0.25, −0.20)  .043  .046  .035  .030  .045  .039  .039  .037  .042
[−0.20, −0.15)  .040  .042  .051  .047  .042  .044  .036  .042  .046
[−0.15, −0.10)  .039  .042  .038  .050  .055  .044  .053  .043  .046
[−0.10, −0.05)  .048  .046  .042  .055  .045  .050  .045  .042  .049
[−0.05, 0)      .049  .045  .044  .043  .045  .049  .040  .063  .055
[0, 0.05)       .055  .059  .050  .062  .049  .054  .056  .040  .043
[0.05, 0.10)    .043  .041  .049  .044  .049  .045  .059  .060  .047
[0.10, 0.15)    .046  .045  .036  .038  .045  .045  .042  .052  .052
[0.15, 0.20)    .049  .046  .052  .040  .045  .049  .048  .047  .050
[0.20, 0.25)    .054  .043  .033  .046  .046  .047  .033  .037  .043
[0.25, 0.30)    .044  .050  .046  .041  .052  .039  .038  .040  .047
[0.30, 0.35)    .037  .030  .032  .033  .035  .037  .034  .036  .054
[0.35, 0.40)    .033  .028  .030  .040  .039  .033  .036  .049  .032
[0.40, 0.45)    .025  .030  .036  .027  .024  .026  .029  .025  .033
[0.45, 0.50)    .022  .031  .025  .043  .025  .032  .027  .028  .022
[0.50, 0.55)    .023  .026  .021  .016  .027  .023  .018  .019  .020
[0.55, 0.60)    .017  .017  .020  .012  .019  .017  .028  .020  .019
[0.60, 0.65)    .024  .016  .018  .014  .025  .022  .018  .011  .015
[0.65, 0.70)    .008  .016  .017  .009  .013  .017  .014  .007  .012
[0.70, 0.75)    .013  .007  .016  .014  .006  .007  .014  .008  .016
[0.75, 0.80)    .002  .009  .011  .010  .009  .011  .004  .008  .004
[0.80, 0.85)    .011  .011  .012  .007  .001  .004  .005  .007  .007
[0.85, 0.90)    .010  .006  .007  .003  .004  .004  .004  .004  .003
[0.90, 0.95)    .004  .006  .002  .005  .004  .005  .005  .002  .006
[0.95, 1.00)    .002  .003  .002  .007  .001  .002  .005  .003  .000
[1.00, ∞)       .007  .007  .010  .008  .008  .008  .011  .011  .003

TABLE IX
The distribution µ^{nistDRBGsha256,10K}_n induced by S_lil for n = 2^26, …, 2^34

                2^26   2^27   2^28   2^29   2^30   2^31   2^32   2^33   2^34
(−∞, −1)        .0071  .0070  .0062  .0067  .0061  .0066  .0069  .0053  .0055
[−1.00, −0.95)  .0036  .0036  .0030  .0032  .0030  .0027  .0026  .0031  .0023
[−0.95, −0.90)  .0047  .0036  .0050  .0031  .0032  .0035  .0028  .0036  .0029
[−0.90, −0.85)  .0044  .0057  .0060  .0035  .0039  .0047  .0038  .0043  .0035
[−0.85, −0.80)  .0063  .0068  .0058  .0085  .0057  .0062  .0066  .0062  .0050
[−0.80, −0.75)  .0089  .0078  .0090  .0082  .0071  .0057  .0083  .0071  .0070
[−0.75, −0.70)  .0112  .0102  .0103  .0094  .0096  .0097  .0108  .0081  .0099
[−0.70, −0.65)  .0126  .0128  .0118  .0118  .0118  .0113  .0104  .0123  .0120
[−0.65, −0.60)  .0149  .0147  .0166  .0166  .0151  .0147  .0185  .0144  .0147
[−0.60, −0.55)  .0180  .0217  .0179  .0181  .0191  .0180  .0165  .0169  .0199
[−0.55, −0.50)  .0216  .0197  .0215  .0217  .0201  .0247  .0243  .0186  .0188
[−0.50, −0.45)  .0228  .0275  .0245  .0228  .0226  .0220  .0250  .0246  .0255
[−0.45, −0.40)  .0274  .0303  .0310  .0309  .0292  .0283  .0319  .0302  .0287
[−0.40, −0.35)  .0302  .0298  .0322  .0331  .0315  .0326  .0323  .0354  .0336
[−0.35, −0.30)  .0353  .0346  .0344  .0341  .0361  .0385  .0331  .0361  .0329
[−0.30, −0.25)  .0394  .0385  .0365  .0379  .0391  .0408  .0381  .0375  .0387
[−0.25, −0.20)  .0435  .0405  .0391  .0425  .0462  .0375  .0454  .0442  .0446
[−0.20, −0.15)  .0419  .0436  .0430  .0430  .0450  .0488  .0431  .0429  .0453
[−0.15, −0.10)  .0439  .0475  .0446  .0475  .0506  .0450  .0464  .0466  .0491
[−0.10, −0.05)  .0474  .0426  .0516  .0484  .0480  .0499  .0474  .0511  .0501
[−0.05, 0)      .0488  .0489  .0473  .0447  .0474  .0471  .0465  .0501  .0481
[0, 0.05)       .0497  .0478  .0499  .0460  .0499  .0505  .0495  .0507  .0485
[0.05, 0.10)    .0466  .0460  .0470  .0493  .0512  .0465  .0474  .0476  .0469
[0.10, 0.15)    .0436  .0478  .0479  .0455  .0475  .0481  .0466  .0468  .0494
[0.15, 0.20)    .0450  .0455  .0467  .0438  .0436  .0459  .0487  .0472  .0469
[0.20, 0.25)    .0435  .0411  .0389  .0440  .0418  .0466  .0407  .0460  .0431
[0.25, 0.30)    .0393  .0395  .0392  .0406  .0414  .0390  .0407  .0381  .0405
[0.30, 0.35)    .0370  .0351  .0325  .0377  .0334  .0341  .0357  .0348  .0352
[0.35, 0.40)    .0319  .0304  .0323  .0321  .0289  .0300  .0290  .0363  .0347
[0.40, 0.45)    .0308  .0286  .0295  .0309  .0264  .0274  .0271  .0300  .0293
[0.45, 0.50)    .0239  .0235  .0249  .0252  .0251  .0243  .0243  .0241  .0257
[0.50, 0.55)    .0203  .0229  .0184  .0219  .0213  .0226  .0219  .0201  .0202
[0.55, 0.60)    .0166  .0177  .0166  .0154  .0192  .0168  .0189  .0158  .0178
[0.60, 0.65)    .0162  .0150  .0160  .0163  .0167  .0154  .0138  .0127  .0144
[0.65, 0.70)    .0137  .0143  .0145  .0119  .0120  .0122  .0123  .0123  .0111
[0.70, 0.75)    .0102  .0103  .0111  .0092  .0109  .0103  .0104  .0088  .0091
[0.75, 0.80)    .0074  .0087  .0089  .0074  .0082  .0079  .0084  .0080  .0070
[0.80, 0.85)    .0081  .0070  .0075  .0068  .0060  .0063  .0069  .0050  .0067
[0.85, 0.90)    .0059  .0057  .0047  .0058  .0033  .0050  .0037  .0050  .0040
[0.90, 0.95)    .0044  .0050  .0035  .0035  .0039  .0035  .0040  .0037  .0044
[0.95, 1.00)    .0032  .0037  .0033  .0024  .0021  .0027  .0026  .0023  .0015
[1.00, ∞)
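The entries of Table V can be reproduced directly from the normal approximation in equation (37); a minimal sketch, where the interval endpoints are the cells of the partition B:

```python
import math

def phi(z):  # standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mu_u(n, a, b):
    """mu_n^U([a, b)) ~ Phi(b*c) - Phi(a*c) with c = sqrt(2 ln ln n), per equation (37)."""
    c = math.sqrt(2.0 * math.log(math.log(n)))
    lo = 0.0 if a == -math.inf else phi(a * c)
    hi = 1.0 if b == math.inf else phi(b * c)
    return hi - lo

# first and last cells of Table V for n = 2^26, matching .047854 and .008090 above
print(round(mu_u(2**26, 0.0, 0.05), 6))
print(round(mu_u(2**26, 1.0, math.inf), 6))
```

The same function reproduces every column: for example, the n = 2^34 entry of [0.00, 0.05) comes out as .050013, in agreement with the table.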