Benford's law: a theoretical explanation for base 2
arXiv [stat.OT]
H. M. Bharath, IIT Kanpur
August 6, 2018
Abstract
In this paper, we present a possible theoretical explanation for Benford's law. We develop a recursive relation between the probabilities, using simple intuitive ideas. We first solve this recursion numerically and verify that the solutions converge to Benford's law. Finally, we solve the recursion analytically to yield Benford's law for base 2.
The leading significant digit of a random integer is one of $1, \cdots, 9$. Intuitively, it is equally likely to be any of these nine figures. However, empirical observations and Benford's law indicate the contrary. According to the law, the probability that a random integer, expressed in base 10, starts with the digit $d$ is [1]

$$P_d = \log_{10}\left(1 + \frac{1}{d}\right), \qquad d = 1, \cdots, 9. \tag{1}$$

This law was first proposed by Newcomb in 1881 [2]. It means that a random integer is most likely to start with 1, with a probability of 0.301, and least likely to start with 9, with a probability of 0.046. Note that the random integer is unscaled; i.e., it can be arbitrarily large. This is the suspected reason behind the nonuniform probabilities. On the other hand, if the random number is scaled, i.e., chosen from a bounded set, the corresponding probabilities are obtained through a direct calculation. For instance, if the scale is a power of 10, i.e., the number is chosen from a set of the form $[0, 10^n)$, the probabilities are indeed uniform. However, if the scale were twice as large, they would be nonuniform, with $d = 1$ acquiring a very large probability (more than $\frac{1}{2}$). In this paper, we use these scaled probabilities to arrive at Benford's values of the unscaled probabilities. Before we proceed, we state the well-known generalizations of Benford's law.

The law is generalized to the first two digits. The probability that a random integer starts with the digits $d_1 d_2$ is given by

$$P_{d_1 d_2} = \log_{10}\left(1 + \frac{1}{d_2 + 10 d_1}\right) = \log_{10}\left(1 + \frac{1}{d_1 d_2}\right) \tag{2}$$

It is further generalized to an arbitrary number of significant digits, and expressed in an arbitrary base $b$, as

$$P_{d_1 \cdots d_k} = \log_b\left(1 + \frac{1}{d_k + b\, d_{k-1} + \cdots + b^{k-1} d_1}\right) = \log_b\left(1 + \frac{1}{d_1 \cdots d_k}\right) \tag{3}$$

where $d_1 \cdots d_k$ is the number expressed in base $b$ [4]. We shall consider the simple case of base 2. In the next section, we present the basic idea behind the proof, supported with examples and numerical calculations. The analytical proof is provided in section 3. We end with a brief discussion in section 4.
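As a quick numerical check of equations (1) and (3), the following minimal sketch evaluates the logarithmic formula for a given block of leading digits. The function name `benford_probability` is an illustrative assumption and does not come from the paper; the leading digits are passed as the integer they spell in the chosen base.

```python
import math


def benford_probability(leading_digits: int, base: int = 10) -> float:
    """Benford probability that a number starts with `leading_digits`.

    `leading_digits` is the integer spelled by the leading digits in the
    given base (for example 0b10 for the binary prefix "10").
    Hypothetical helper written for illustration only.
    """
    if leading_digits < 1:
        raise ValueError("leading digits must spell a positive integer")
    return math.log(1 + 1 / leading_digits, base)


if __name__ == "__main__":
    # Base 10, first digit only: equation (1).
    for d in range(1, 10):
        print(d, round(benford_probability(d), 3))
    # Base 2, first two digits "10" and "11": equation (3) with k = 2.
    print("10:", round(benford_probability(0b10, base=2), 3))
    print("11:", round(benford_probability(0b11, base=2), 3))
```

The nine base-10 probabilities sum to 1 because the product of the factors $(1 + 1/d)$ telescopes to 10; the same telescoping holds for the two base-2 prefixes.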
Expressed in base 2, every number starts with 1. Hence we consider the first two significant digits, which are either 10 or 11. Let $P_0$ and $P_1$ be the corresponding probabilities. According to Benford's law, $P_0 = \log_2(1 + \frac{1}{2}) = 0.585$ and $P_1 = \log_2(1 + \frac{1}{3}) = 0.415$.

These are the unscaled probabilities. Unlike them, the scaled probabilities are easily evaluated. For instance, consider a scale that is a power of 2; i.e., the random integer is chosen from a set of the form $[0, 10\cdots)$, where the dots stand for zeros. Since, in this set, numbers starting with 10 and 11 are equally populated, the corresponding probabilities are $\frac{1}{2}$ each. This is true of any scale of the form $10\cdots$. Accordingly, let us denote them by $P^0_0 = P^0_1 = \frac{1}{2}$. The superscript indicates that the scale is of the form $10\cdots$. Now consider a scale of the form $11\cdots$. It can be verified that the probabilities are now $\frac{2}{3}$ and $\frac{1}{3}$. Let us denote them by $P^1_0 = \frac{2}{3}$ and $P^1_1 = \frac{1}{3}$.

Thus, the unscaled probability $P_0$ lies between $P^0_0$ and $P^1_0$, and $P_1$ lies between $P^0_1$ and $P^1_1$:

$$P_0 = P^0_0\, w + P^1_0\, (1 - w) \tag{4}$$
$$P_1 = P^0_1\, w + P^1_1\, (1 - w) \tag{5}$$

where $w$ is the weight associated with the scale being of the form $10\cdots$. To first order, it can be approximated by the probability that a randomly chosen scale starts with 10, which is $P_0$. Thus,

$$P_0 = P^0_0 P_0 + P^1_0 P_1 \tag{6}$$
$$P_1 = P^0_1 P_0 + P^1_1 P_1 \tag{7}$$

This gives $P_0 = \frac{4}{7} = 0.571$ and $P_1 = \frac{3}{7} = 0.429$. These are the first-order approximations. The approximation lies in the assumption $w = P_0$: not every integer starting with 10 is of the form $10\cdots$.

To sharpen the approximation, consider the first three significant digits. Using a similar notation, we denote the unscaled probabilities by $P_{xy}$, where $x, y = 0, 1$, and the scaled probabilities by $P^{\alpha\beta}_{xy}$, where $x, y, \alpha, \beta = 0, 1$. $P^{\alpha\beta}_{xy}$ is the probability that a random integer starts with $1xy$ when the scale is of the form $1\alpha\beta\cdots$. The equations, to the second approximation, are

$$P_{xy} = \sum_{\alpha\beta} P^{\alpha\beta}_{xy} P_{\alpha\beta} \tag{8}$$

This is a set of four equations in four variables. Once we solve for $P_{\alpha\beta}$, we can evaluate $P_0$ using $P_0 = P_{00} + P_{01}$.
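The scaled probabilities and the first-order estimate can be checked by direct counting. The sketch below is illustrative rather than the author's own calculation: the function names are hypothetical, and it makes the assumption of leaving out the integers 0 and 1, which have fewer than two significant binary digits and whose share vanishes as the scale grows.

```python
from fractions import Fraction


def scaled_probabilities(scale: int) -> tuple:
    """Return (P_0, P_1) for integers drawn uniformly from [2, scale).

    P_0 is the fraction whose binary form starts with "10",
    P_1 the fraction whose binary form starts with "11".
    """
    starts_10 = sum(1 for n in range(2, scale) if format(n, "b")[:2] == "10")
    total = scale - 2
    return Fraction(starts_10, total), Fraction(total - starts_10, total)


def first_order_estimate(p0_scale_10, p0_scale_11):
    """Solve P_0 = p0_scale_10 * P_0 + p0_scale_11 * (1 - P_0) for P_0."""
    return p0_scale_11 / (1 - p0_scale_10 + p0_scale_11)


if __name__ == "__main__":
    # Scale 1000000000000 in binary (a power of 2).
    print(scaled_probabilities(1 << 12))
    # Scale 1100000000000 in binary; approaches (2/3, 1/3) for large scales.
    print(tuple(float(p) for p in scaled_probabilities(3 << 11)))
    # First-order estimate from the limiting values 1/2 and 2/3.
    print(first_order_estimate(Fraction(1, 2), Fraction(2, 3)))
```

Exact fractions are used so that the power-of-2 case comes out as exactly one half, while the scale of the form $11\cdots$ only approaches $\frac{2}{3}$ as the number of trailing zeros increases.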
To do this, we first evaluate $P^{\alpha\beta}_{xy}$, the population fraction of numbers starting with $1xy$ in the integer set $S = [0, 1\alpha\beta\cdots)$. This set can be broken into three chunks, $S = S_0 \cup S_1 \cup S_2$, where $S_0$, $S_1$ and $S_2$ are the integer sets

$$S_0 = [0, 1000\cdots), \qquad S_1 = [1000\cdots, 1\alpha 00\cdots), \qquad S_2 = [1\alpha 00\cdots, 1\alpha\beta 0\cdots)$$

Note that they are disjoint. $S_0$ is the largest; $S_1$ is an enhancement over $S_0$, and $S_2$ is an enhancement over $S_1$. If $p_0$, $p_1$ and $p_2$ are the population fractions of numbers starting with $1xy$ within the sets $S_0$, $S_1$ and $S_2$ respectively, we may write

$$P^{\alpha\beta}_{xy} = \frac{p_0 |S_0| + p_1 |S_1| + p_2 |S_2|}{|S_0| + |S_1| + |S_2|} \tag{9}$$

where $|S_j|$ is the number of elements in $S_j$. Clearly, $|S_1| = \frac{\alpha}{2} |S_0|$ and $|S_2| = \frac{\beta}{4} |S_0|$. In $S_0$, the second and third digits are equally distributed, i.e., 00, 01, 10 and 11 appear with equal populations. Hence $p_0 = \frac{1}{4}$. In $S_1$, all numbers have second digit 0, and the third digit is equally distributed between 0 and 1. So $p_1 = \frac{1}{2}\delta_{x0}$. In $S_2$, all numbers have second digit $\alpha$ and third digit 0. Therefore, $p_2 = \delta_{x\alpha}\delta_{y0}$. Thus,

$$P^{\alpha\beta}_{xy} = \frac{1 + \alpha\,\delta_{x0} + \beta\,\delta_{x\alpha}\delta_{y0}}{4 + 2\alpha + \beta} \tag{10}$$

The equation $P_{xy} = \sum_{\alpha\beta} P^{\alpha\beta}_{xy} P_{\alpha\beta}$ reads

$$\begin{pmatrix} 1/4 & 2/5 & 2/6 & 2/7 \\ 1/4 & 1/5 & 2/6 & 2/7 \\ 1/4 & 1/5 & 1/6 & 2/7 \\ 1/4 & 1/5 & 1/6 & 1/7 \end{pmatrix} \begin{pmatrix} P_{00} \\ P_{01} \\ P_{10} \\ P_{11} \end{pmatrix} = \begin{pmatrix} P_{00} \\ P_{01} \\ P_{10} \\ P_{11} \end{pmatrix} \tag{11}$$

The solution, after normalizing the sum to 1, is

$$P_{00} = 0.315, \qquad P_{01} = 0.263, \qquad P_{10} = 0.225, \qquad P_{11} = 0.197$$

Using $P_{00} + P_{01} = P_0$ and $P_{10} + P_{11} = P_1$, we obtain $P_0 = 0.578$, the second approximation. As expected, it is closer to Benford's value, 0.585, than the first approximation.

Higher-order approximations can be obtained by considering a larger number of digits. Considering $k$ digits after the first digit, the equation to be solved is a $2^k \times 2^k$ matrix equation

$$P_{x_1 \cdots x_k} = \sum_{\{\alpha_i\}} P^{\alpha_1 \cdots \alpha_k}_{x_1 \cdots x_k} P_{\alpha_1 \cdots \alpha_k} \tag{12}$$

where $P_{x_1 \cdots x_k}$ is the probability that an unscaled integer starts with $1x_1 \cdots x_k$, and the matrix element $P^{\alpha_1 \cdots \alpha_k}_{x_1 \cdots x_k}$ is the corresponding probability with a scale of $1\alpha_1 \cdots \alpha_k \cdots$. This can be evaluated easily. For values of $k$ up to 10, the equations were solved numerically using Python. Table 1 summarizes the results.
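The numerical solution of equation (12) can be sketched as follows. This is not the author's script, which is not reproduced in the paper; it is a minimal reconstruction under two stated assumptions. First, the matrix element is taken to be $(1 + [x < \alpha]) / (2^k + \alpha)$, with $x$ and $\alpha$ read as $k$-bit integers and $[\cdot]$ equal to 1 when the condition holds, a form that reproduces equation (10) for $k = 2$. Second, the fixed point is found by repeated application of the matrix (power iteration), which is a choice made here for simplicity. All function names are hypothetical.

```python
def apply_scaled_matrix(p):
    """Return M p for the assumed M[x][a] = (1 + [x < a]) / (2**k + a).

    The structure of M allows an O(n) evaluation: every row receives the
    full weighted sum, plus the weighted sum over columns a > x.
    """
    n = len(p)
    weighted = [p[a] / (n + a) for a in range(n)]
    total = sum(weighted)
    result = [0.0] * n
    tail = 0.0  # running sum of weighted[a] over a > x
    for x in range(n - 1, -1, -1):
        result[x] = total + tail
        tail += weighted[x]
    return result


def unscaled_probabilities(k, iterations=100):
    """Approximate the normalized solution of equation (12) for k digits."""
    n = 2 ** k
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        p = apply_scaled_matrix(p)
        norm = sum(p)
        p = [value / norm for value in p]
    return p


def estimate_p0(k, iterations=100):
    """Probability that the digit after the leading 1 is 0."""
    p = unscaled_probabilities(k, iterations)
    return sum(p[: 2 ** (k - 1)])


if __name__ == "__main__":
    for k in range(1, 11):
        print(k, estimate_p0(k))
```

For $k = 1$ the iteration settles on $\frac{4}{7}$, the first-order value, and for $k = 2$ on the second approximation; larger $k$ can be compared against Table 1.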
The values suggest a neat convergence to Benford's value. Interestingly, the relative error falls exponentially with $k$. In the next section, we shall prove the convergence analytically.

 k     P_0        Rel. err.
 1     0.571428   0.023
 2     0.577861   0.012
 3     0.581339   0.0062
 4     0.583135   0.0031
 5     0.584045   0.00156
 6     0.584503   0.00078
 7     0.584732   0.00039
 8     0.584847   0.00019
 9     0.584905   0.000097
10     0.584933   0.000049

Table 1: Estimates of $P_0$ up to $k = 10$. Value according to Benford's law: $P_0 = 0.584963$.

In this section, we show that Benford's law is an exact solution to equation (12). We are to solve the equation for $P_{x_1 \cdots x_k}$ in the limit $k \to \infty$. The matrix elements in this equation are evaluated in appendix A:

$$P^{\alpha_1 \cdots \alpha_k}_{x_1 \cdots x_k} = \frac{1 + Q_{\alpha x}}{2^k (1 + \alpha)} \tag{13}$$

We are to show that the solution is logarithmic, i.e., $P_{x_1 \cdots x_k} = \log_2\left[1 + \frac{1}{1x_1 \cdots x_k}\right]$. Observe that, in the large-$k$ limit, this function has a first approximation of $\frac{1}{1x_1 \cdots x_k}$, up to a constant factor. Hence, we shall first show that this is a solution in the limit of large $k$. That is, we are to show that

$$\frac{1}{1 + x} = \lim_{k \to \infty} \sum_{\alpha} \frac{1}{2^k} \frac{1 + Q_{\alpha x}}{(1 + \alpha)^2} \tag{14}$$

Here $x$ and $\alpha$ are numbers between 0 and 1 with $k$ binary places. In the limit $k \to \infty$, $x$ and $\alpha$ are any real numbers between 0 and 1, and the sum is replaced by an integral:

$$\frac{1}{1 + x} = \int_0^1 d\alpha\, \frac{1 + Q_{\alpha x}}{(1 + \alpha)^2} \tag{15}$$

We are to show the above relation. $Q_{\alpha x}$ is the sum of an infinite series. The integral is easily evaluated for each of these terms, and the results are then summed up. The details of this proof are given in appendix B.

For a finite value of $k$, to evaluate $P_{\beta_1 \cdots \beta_k}$, we write it as

$$P_{\beta_1 \cdots \beta_k} = \sum_{\{\alpha_i\}} P_{\beta_1 \cdots \beta_k \alpha_1 \cdots \alpha_l} \tag{16}$$

We have shown that, in the large-$l$ limit,

$$\lim_{l \to \infty} P_{\beta_1 \cdots \beta_k \alpha_1 \cdots \alpha_l} = \frac{1}{1\beta_1 \cdots \beta_k \alpha_1 \cdots \alpha_l} \tag{17}$$

Thus,

$$P_{\beta_1 \cdots \beta_k} = \lim_{l \to \infty} \sum_{\{\alpha_i\}} \frac{1}{1\beta_1 \cdots \beta_k \alpha_1 \cdots \alpha_l} = \lim_{l \to \infty} \sum_{n=0}^{2^l - 1} \frac{1}{2^l (1\beta_1 \cdots \beta_k) + n} = \ln\left(\frac{1\beta_1 \cdots \beta_k + 1}{1\beta_1 \cdots \beta_k}\right)$$

Normalizing, we obtain Benford's law:

$$P_{\beta_1 \cdots \beta_k} = \log_2\left(\frac{1\beta_1 \cdots \beta_k + 1}{1\beta_1 \cdots \beta_k}\right) \tag{18}$$

So far, little light has been shed on the counterintuitive nature of Benford's law.
We have not reconstructed our intuition so as to understand the law. The origin of the anomalous behaviour is still unclear. A strong reason why it is counterintuitive is that the sets of numbers starting with each digit have the same cardinality, and therefore we expect the probabilities to be the same as well. One step towards understanding it is to realise that the probabilities measure the occurrences and not the cardinalities.

To understand it better, let $\{a_i\}$ be a sequence and $\{b_i\}$ be a subsequence of $\{a_i\}$. For instance, let $a_i = i$ and $b_i = 2i$; $a_i$ is the sequence of positive integers and $b_i$ is the subsequence of even numbers. The probability that a randomly chosen element of $\{a_i\}$ is also an element of $\{b_i\}$ is $\frac{1}{2}$. Now, let $\{c_i\}$ be a subsequence of $\{b_i\}$, $c_i = 4i$, the sequence of multiples of four. The probability that a randomly chosen element of $\{a_i\}$ is also an element of $\{c_i\}$ is $\frac{1}{4}$. Even though $\{b_i\}$ and $\{c_i\}$ have the same cardinality, and can be mapped to each other, the probabilities are not equal. In fact, the sequence $\{a_i\}$ can be rearranged such that every alternate term is an element of $\{c_i\}$:

$$\{a'_i\}: 1, 4, 2, 8, 3, 12, 5, 16, \cdots$$

This sequence $\{a'_i\}$ is a rearrangement of $\{a_i\}$. The probability that a random element belongs to $\{c_i\}$ is now $\frac{1}{2}$. Hence, this probability is unrelated to the cardinality; instead, it is a measure of the frequency of occurrence of the elements of $\{c_i\}$ in the parent sequence $\{a_i\}$. Hence, it changes on rearranging the parent sequence.

In the above examples, all the occurrences were periodic. Thus, even though the sequences were infinite, the periodicity made the calculation of the probability as simple as it is for a finite set. However, in a Benford sequence, there is no such periodicity, and therefore the calculation is nontrivial. In this paper, we have outlined a possible analytical explanation for Benford's law for base 2.
It is very likely that a similar strategy can yield the law for any base. Therefore, further work in this direction is expected to be fruitful.

In this appendix, we evaluate the coefficients $P^{\alpha_1 \cdots \alpha_k}_{x_1 \cdots x_k}$. Each coefficient is the population fraction of numbers starting with $1x_1 \cdots x_k$ in the set $S = [0, 1\alpha_1 \cdots \alpha_k \cdots)$. We shall use the same strategy again: break this set into disjoint chunks,

$$S = [0, 100\cdots) \cup [100\cdots, 1\alpha_1 \cdots) \cup \cdots \cup [1\alpha_1 \cdots \alpha_{k-1} \cdots, 1\alpha_1 \cdots \alpha_k \cdots)$$

defining the sets, $S_0 = [0, \cdots$