The Newcomb--Benford law. Scale invariance and a simple Markov process based on it
The Newcomb–Benford law: Do physicists use more frequently the key 1 than the key 9?
Andrea Burgos a) and Andrés Santos b)
Departamento de Física, Universidad de Extremadura, 06006 Badajoz, Spain
(Dated: 29 January 2021)
The Newcomb–Benford law, also known as the first-digit law, gives the probability distribution associated with the first digit of a dataset, so that the first significant digit has a probability of 30.1% of being 1 and 4.58% of being 9. This law can be extended to the second and subsequent significant digits. In this article, we present an introduction to the discovery of the law, its derivation from the scale invariance property, and some applications and examples. Additionally, we propose a simple dynamic model simulating how an initial dataset changes when it is sequentially multiplied by a factor of 2. Within this model, it is proved that the first-digit distribution of the generated datasets irreversibly converges to the Newcomb–Benford law.
I. INTRODUCTION
Late 19th century. An astronomer and mathematician visits his institution's library and consults a table of logarithms to perform certain astronomical calculations. As on previous occasions, he is struck by the fact that the first pages (those corresponding to numbers that start with 1) are much more worn than the last ones (corresponding to numbers that start with 9). Intrigued, this time he decides not to overlook the matter. He closes his eyes to concentrate, sketches a few calculations on a piece of paper, and finally smiles. He has found the answer, and it turns out to be enormously simple and elegant. A little over half a century later, a physicist and electrical engineer, unaware of his predecessor's discovery, observes the same curious phenomenon on the pages of logarithm tables and arrives at the same conclusion. Both have understood that, in a long list of records $\{r_n\}$ obtained from nature, the fraction $p_d$ of records beginning with the significant digit $d = 1, 2, \ldots, 9$ is not $p_d = 1/9$, as one might naively expect, but is given by

$$ p_d = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, 2, \ldots, 9. \tag{1} $$

The numeric values of $p_d$ are shown in the second column of Table I. We see that the records that start with 1, 2, or 3 account for around 60% of the total, while the other 6 digits must settle for the remaining 40%.

Our 19th century character is Simon Newcomb (Fig. 1), who published his discovery in a modest two-page note. The second character is Frank Benford (Fig. 2), who wrote a 22-page article in which, in addition to mathematically justifying Eq. (1), he showed its validity in the analysis of more than 20,000 first digits taken from sources as varied as river areas, populations of American cities, physical constants, atomic and molecular weights, specific heats, numbers taken from newspapers or the Reader's Digest, postal addresses, ..., or the series $n^{-1}$, $\sqrt{n}$, $n^2$, or $n!$, among others, with $n = 1$–$100$.

a) Electronic mail: [email protected]
b) Electronic mail: [email protected]

TABLE I. Probabilities for the first, second, third, and fourth significant digits.

Digit   First     Second     Third      Fourth
d       p_d       p_d^(2)    p_d^(3)    p_d^(4)
0       -         0.11968    0.10178    0.10018
1       0.30103   0.11389    0.10138    0.10014
2       0.17609   0.10882    0.10097    0.10010
3       0.12494   0.10433    0.10057    0.10006
4       0.09691   0.10031    0.10018    0.10002
5       0.07918   0.09668    0.09979    0.09998
6       0.06695   0.09337    0.09940    0.09994
7       0.05799   0.09035    0.09902    0.09990
8       0.05115   0.08757    0.09864    0.09986
9       0.04576   0.08500    0.09827    0.09982

With such overwhelming evidence, it is not surprising that Eq. (1) is usually known as Benford's law (or first-digit law), even though it was found nearly sixty years earlier by Newcomb. This is but one more manifestation of the so-called Stigler's law, according to which no scientific discovery is named after its original discoverer. In fact, as Stigler himself points out, the law that bears his name was actually spelled out in a similar way twenty-three years earlier by the American sociologist Robert K. Merton. In order not to fall completely into Stigler's law, many authors refer to Eq. (1) as the Newcomb–Benford law, and that is the convention (by means of the acronym NBL) that we will follow in this article.
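Equation (1) and the notion of "fraction of records beginning with digit $d$" are easy to put into practice. The following is a minimal sketch, not part of the original article; the function names `benford_first_digit`, `first_digit`, and `first_digit_frequencies` are illustrative choices. It tabulates $p_d$ and counts leading digits in an arbitrary list of positive records.

```python
import math
from collections import Counter

def benford_first_digit(d: int) -> float:
    """Eq. (1): probability that the first significant digit equals d."""
    return math.log10(1 + 1 / d)

def first_digit(x: float) -> int:
    """First significant digit of a positive number.

    Uses scientific notation, e.g. 314.0 -> '3.14000000000000000e+02',
    whose first character is the leading digit.
    """
    return int(f"{float(x):.17e}"[0])

def first_digit_frequencies(records) -> dict:
    """Observed fraction of records starting with each digit 1..9."""
    counts = Counter(first_digit(r) for r in records)
    n = len(records)
    return {d: counts.get(d, 0) / n for d in range(1, 10)}

# NBL probabilities for d = 1..9 (first-digit column of Table I)
p = {d: benford_first_digit(d) for d in range(1, 10)}

# Share of records starting with 1, 2, or 3: log10(4), i.e. around 60 %
low_digits_share = p[1] + p[2] + p[3]
```

Calling `first_digit_frequencies` on any list of measured quantities gives the empirical distribution that can be compared with `p`.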
II. DERIVATION OF THE LAW
Oftentimes, when one first speaks to a friend, family member, or even a colleague about the NBL, their first reaction is one of skepticism. Why is the first digit not evenly distributed among the 9 possible values? A simple argument shows that, if a robust distribution law exists, it cannot be the uniform distribution. Imagine a long list of river lengths, mountain heights, and country surfaces, for example. It is possible that the lengths of the rivers are in km, the heights of the mountains in m, and the surfaces of countries in km$^2$, but they could also be expressed in miles, feet, or acres, respectively. Will the distribution $p_d$ depend on whether we use some units or others, or even on whether we mix them? It seems logical that it will not, that is, that the $p_d$ distribution is (statistically) independent of the units chosen; in other words, the $p_d$ distribution is expected to be invariant under change of scale.

FIG. 1. Simon Newcomb (1835–1909).

FIG. 2. Frank Benford (1883–1948).

The uniform distribution $p_d = 1/9$ obviously does not satisfy that invariance property. Suppose we start from a dataset in which all the values of the first digit are equally represented. If we multiply all the records in the dataset by 2, those records that previously started with 1 now start with 2 or 3, while all those that started with 5, 6, 7, 8, or 9 now start with 1. The following diagram shows all the possibilities:

$$ 1 \to (2, 3), \quad 2 \to (4, 5), \quad 3 \to (6, 7), \quad 4 \to (8, 9), \tag{2a} $$

$$ 5, 6, 7, 8, 9 \to 1. \tag{2b} $$

Therefore, if $p_d = 1/9$ initially, then $p_1 = 5/9$ and $p_2 + p_3 = p_4 + p_5 = p_6 + p_7 = p_8 + p_9 = 1/9$ after multiplying all the records by 2, thus destroying the initial uniformity. We can continue multiplying by 2 and the distribution will continue to evolve until a stationary distribution, invariant under that change of scale, is reached.

Thus, the most identifying hallmark of the NBL is that it applies to records that have units or, as Newcomb himself writes, "As natural numbers occur in nature, they are to be considered as the ratios of quantities." Let us then sketch the derivation of the law by imposing invariance under change of scale.
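The loss of uniformity under doubling can be checked exactly on a toy dataset. The sketch below is an illustration under an assumption that is not in the article: the dataset is taken to be the integers 100 to 999, which have a uniform first digit. Exact fractions are used so that the values $5/9$ and $1/9$ quoted above appear without rounding.

```python
from collections import Counter
from fractions import Fraction

def first_digit_of_int(n: int) -> int:
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])

def first_digit_distribution(records) -> dict:
    """Exact fraction of records starting with each digit 1..9."""
    counts = Counter(first_digit_of_int(r) for r in records)
    total = len(records)
    return {d: Fraction(counts.get(d, 0), total) for d in range(1, 10)}

# Assumed toy dataset: 100 records for each first digit, so p_d = 1/9
records = list(range(100, 1000))
p_before = first_digit_distribution(records)

# Multiply every record by 2, as in the diagram of Eqs. (2)
p_after = first_digit_distribution([2 * r for r in records])
```

In this particular dataset each pair splits evenly ($p_2 = p_3 = 1/18$, and so on); that even split is a property of the toy data, not a general result.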
Consider again a long list of records $\{r_n\}$ that, without loss of generality for the matter at hand, we will assume positive. Each record can be written in the form $r_n = x_n \times 10^{k_n}$, where $k_n$ is an integer and $x_n \in [1, 10)$ is the significand. Obviously, it is the distribution of the significand that is relevant for the NBL. The significand $x_n$ is directly related to the mantissa $\mu_n$ of the decimal logarithm of $r_n$, that is, $\log_{10} r_n = k_n + \mu_n$, where $\mu_n = \log_{10} x_n \in [0, 1)$. Let $P_x(x)\,dx$ be the probability that the significand lies between $x$ and $x + dx$, so that the normalization condition is $\int_1^{10} dx\, P_x(x) = 1$. The probability that the first significant digit of a record $r$ is $d$ is then given by the integral

$$ p_d = \int_d^{d+1} dx\, P_x(x). \tag{3} $$

If the distribution $P_x(x)$ is actually invariant under a change of scale, then $P_x(\lambda x) = f(\lambda) P_x(x)$ with arbitrary $\lambda$. Taking into account the normalization condition in the form $\int_{\lambda}^{10\lambda} d(\lambda x)\, P_x(\lambda x) = 1$, it follows that necessarily $f(\lambda) = \lambda^{-1}$, that is, $P_x(\lambda x) = \lambda^{-1} P_x(x)$. Differentiating both sides of this equation with respect to $\lambda$ and then taking $\lambda = 1$, we easily obtain $x P_x'(x) = -P_x(x)$, which, according to Euler's theorem, implies that $P_x(x)$ is a homogeneous function of degree $-1$, that is, $P_x(x) \propto x^{-1}$. The constant of proportionality is obtained from the normalization condition, finally yielding

$$ P_x(x) = \frac{x^{-1}}{\ln 10}, \quad 1 \le x < 10. \tag{4} $$

This is the unique distribution of significands that is invariant under a change of scale. From Eq. (4), and applying Eq. (3), it is straightforward to obtain the NBL (1), since $p_d = [\ln(d+1) - \ln d]/\ln 10 = \log_{10}(1 + 1/d)$.

It is interesting to note that the inverse law for the significand, Eq. (4), implies a uniform law for the mantissa (and vice versa). Let $P_\mu(\mu)\,d\mu$ be the probability that the mantissa lies between $\mu$ and $\mu + d\mu$. Since $P_\mu(\mu)\,d\mu = P_x(x)\,dx$ and $d\mu = (x^{-1}/\ln 10)\,dx$, Eq. (4) gives $P_\mu(\mu) = 1$. In the words of Newcomb, "The law of probability of the occurrence of numbers is such that all mantissæ of their logarithms are equally probable." An immediate consequence is that if $\mu$ is a random variable uniformly distributed between 0 and 1, then the random variable $x = 10^{\mu}$ fulfills the NBL. This is in fact an easy way to generate a list of random records meeting the NBL.

There are deterministic series that also satisfy the NBL. Consider the series $\{r_n = a|\alpha|^n + b|\beta|^n,\ n = 1, 2, \ldots\}$ with $a \neq 0$, $|\alpha| > |\beta|$, and $\log_{10}|\alpha|$ irrational. In that case, for large $n$, $\log_{10} r_n \to n \log_{10}|\alpha| + \log_{10}|a|$ has a uniformly distributed mantissa, so $\{r_n\}$ satisfies the NBL. That includes, for example, the series $\{2^n\}$, $\{3^n\}$, and $\{F_n\}$, where $F_n = \frac{1}{\sqrt{5}}\left[\varphi^n - (-\varphi^{-1})^n\right]$ are the Fibonacci numbers, $\varphi \equiv \frac{1}{2}\left(1 + \sqrt{5}\right)$ being the golden ratio. Similarly, the series $\{n!\}$ also satisfies the law.

Another important property is that if a list $\{r_n\}$ fulfills the NBL, so does the list $\{r_n^a\}$. Indeed, if $\log_{10} r_n = k_n + \mu_n$, the mantissa $\mu_n$ being uniformly distributed, then the mantissa of $\log_{10} r_n^a = a(k_n + \mu_n)$ is also uniformly distributed.
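The two recipes above (random records with a uniform mantissa, and deterministic series such as the powers of 2 or the Fibonacci numbers) can be tried out numerically. The following sketch is illustrative only: the sample sizes, the fixed seed, and the helper names `frequencies` and `max_deviation` are assumptions made here, not choices from the article.

```python
import math
import random
from collections import Counter

def benford(d: int) -> float:
    """Eq. (1)."""
    return math.log10(1 + 1 / d)

def frequencies(first_digits) -> dict:
    """Fraction of occurrences of each first digit 1..9."""
    counts = Counter(first_digits)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

def max_deviation(freq: dict) -> float:
    """Largest absolute difference between observed frequencies and the NBL."""
    return max(abs(freq[d] - benford(d)) for d in range(1, 10))

# (a) Random significands x = 10**mu with mu uniform in [0, 1)
rng = random.Random(2021)  # fixed seed, only for reproducibility
significands = [10 ** rng.random() for _ in range(100_000)]
# x lies in [1, 10), so its integer part is the first digit
freq_random = frequencies(min(int(x), 9) for x in significands)

# (b) Powers of 2, using exact integer arithmetic
freq_pow2 = frequencies(int(str(2 ** n)[0]) for n in range(1, 1001))

# (c) First 1000 Fibonacci numbers
fib, a, b = [], 1, 1
for _ in range(1000):
    fib.append(a)
    a, b = b, a + b
freq_fib = frequencies(int(str(f)[0]) for f in fib)
```

In each case `max_deviation` should be small compared with the probabilities themselves; how small depends on the length of the list.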
This is directly related to the fact that the NBL is invariant not only under scale change, but also under base change, as would be expected given the artificial character of the decimal base. To see it, let us assume a base $b$ and build the list $\{r_n^a\}$, with $a = \log_{10} b$, from a list $\{r_n\}$ that fulfills the NBL. In that case, $r_n^a = y_n \times b^{k_n}$, where $y_n = x_n^a \in [1, b)$ is the significand of $r_n^a$ in the base $b$. The probability distribution $P_y(y)$ is related to the distribution $P_x(x)$ through the equation $P_y(y)\,dy = P_x(x)\,dx$, so that Eq. (4) leads to

$$ P_y(y) = \frac{y^{-1}}{\ln b}, \quad 1 \le y < b. \tag{5} $$

Therefore, the NBL (1) in an arbitrary base $b$ takes the form

$$ p_d = \log_b\left(1 + \frac{1}{d}\right), \quad d = 1, 2, \ldots, b - 1. \tag{6} $$

Returning now to the decimal base, we can use the significand distribution (4) to generalize Eq. (1) beyond the first digit. Consider an ordered string $(d_1, d_2, \ldots, d_m)$ made up of the first $m$ digits, where $d_1 \in \{1, 2, \ldots, 9\}$ and $d_i \in \{0, 1, 2, \ldots, 9\}$ if $i \ge 2$. The records whose first $m$ digits match the string $(d_1, d_2, \ldots, d_m)$ are those whose significand $x$ is greater than or equal to $d_1 + d_2 \times 10^{-1} + \cdots + d_m \times 10^{-(m-1)}$ and less than $d_1 + d_2 \times 10^{-1} + \cdots + (d_m + 1) \times 10^{-(m-1)}$. Consequently, integrating $P_x(x)$ between these two limits, one obtains

$$ p_{d_1, d_2, \ldots, d_m} = \log_{10}\left[1 + \left(\sum_{i=1}^{m} d_i \times 10^{m-i}\right)^{-1}\right]. \tag{7} $$

As a consistency test, it is easy to check that

$$ p_{d_1, d_2, \ldots, d_{m-1}} = \sum_{d_m=0}^{9} p_{d_1, d_2, \ldots, d_m} = \log_{10}\left[1 + \left(\sum_{i=1}^{m-1} d_i \times 10^{m-1-i}\right)^{-1}\right]. \tag{8} $$

For example, the probability that the first three digits of a record form precisely the string $(3, 1, 4)$ is $p_{3,1,4} = \log_{10}(1 + 1/314) \simeq 0.00138$.

From the probability $p_{d_1, d_2, \ldots, d_m}$, we can calculate the probability $p_d^{(m)}$ that the $m$th digit is $d$, regardless of the values of the preceding $m - 1$ digits, by summing over all the possible values of those $m - 1$ digits:

$$ p_d^{(m)} = \sum_{d_1=1}^{9} \sum_{d_2=0}^{9} \cdots \sum_{d_{m-1}=0}^{9} p_{d_1, d_2, \ldots, d_{m-1}, d}. \tag{9} $$

In Table I, the law for the first digit, $p_d$, is accompanied by the laws for the second, third, and fourth digits, obtained from Eqs. (7) and (9). The deeper the digit position, the more uniform the distribution becomes.

In Eq. (2a) we saw that, when a dataset $\{r_n\}$ is multiplied by 2, part of the records that started with $d = 1, 2, 3, 4$, specifically those that start with $(d, 0)$, $(d, 1)$, $(d, 2)$, $(d, 3)$, or $(d, 4)$, will start with $2d$, while the rest will start with $2d + 1$. Let us call $\alpha_d$ the fraction of records that, initially starting with $d = 1, 2, 3, 4$, start with $2d$ after doubling all the records. Thus,

$$ \alpha_d = \frac{\sum_{d_2=0}^{4} p_{d, d_2}}{p_d}, \quad d = 1, 2, 3, 4. \tag{10} $$

If the dataset fulfills the NBL, then one has

$$ \alpha_1 = \frac{\log_{10}\frac{3}{2}}{\log_{10} 2} = 0.585, \tag{11a} $$

$$ \alpha_2 = \frac{\log_{10}\frac{5}{4}}{\log_{10}\frac{3}{2}} = 0.550, \tag{11b} $$

$$ \alpha_3 = \frac{\log_{10}\frac{7}{6}}{\log_{10}\frac{4}{3}} = 0.536, \tag{11c} $$

$$ \alpha_4 = \frac{\log_{10}\frac{9}{8}}{\log_{10}\frac{5}{4}} = 0.528. \tag{11d} $$

III. APPLICATIONS AND EXAMPLES
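As a first, purely numerical example, the multi-digit laws of Eqs. (7) and (9) and the fractions $\alpha_d$ of Eq. (10) can be evaluated directly. The sketch below is an illustration written for this purpose; the function names `p_string`, `p_mth_digit`, and `alpha` are hypothetical, and the brute-force sum over preceding digits is chosen for clarity rather than efficiency.

```python
import math
from itertools import product

def p_string(digits) -> float:
    """Eq. (7): probability that the first m digits form the given string."""
    value = int("".join(str(d) for d in digits))  # e.g. (3, 1, 4) -> 314
    return math.log10(1 + 1 / value)

def p_mth_digit(d: int, m: int) -> float:
    """Eq. (9): probability that the m-th digit is d.

    For m = 1 this reduces to Eq. (1), and d = 0 is impossible.
    """
    if m == 1:
        return p_string((d,)) if d >= 1 else 0.0
    total = 0.0
    for d1 in range(1, 10):                         # first digit: 1..9
        for rest in product(range(10), repeat=m - 2):  # digits 2..m-1: 0..9
            total += p_string((d1, *rest, d))
    return total

def alpha(d: int) -> float:
    """Eq. (10): fraction of records starting with d whose second digit is 0-4."""
    return sum(p_string((d, d2)) for d2 in range(5)) / p_string((d,))

# Rows of a table like Table I: digit d, then the laws for positions 1 to 4
table = {d: [p_mth_digit(d, m) for m in (1, 2, 3, 4)] for d in range(10)}

# NBL values of alpha_1, ..., alpha_4, cf. Eqs. (11)
alphas = [alpha(d) for d in (1, 2, 3, 4)]
```

Each column of `table` sums to 1, and the spread of each column shrinks as the digit position increases.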
The applications and verifications of the NBL are numerous and cover topics as varied and prosaic as the study of the genome, the half-life of unstable nuclei, particle physics, astronomy, quantum critical phenomena, toxic emissions, tax audits, electoral or scientific frauds, gross domestic product, stock market, inflation data, world wide web, religious activities, dates of birth, river flow rates, or even COVID-19. Other examples can be seen in the link of Ref. 27. In this section we will present some additional examples.

Let us start with one of the situations that Benford himself studied in his classic paper: that of city populations. Using data from the Spanish National Institute of Statistics, we have considered the population (in 2019) of the 165 municipalities in the province of Badajoz (plus the total population of the province of Badajoz), of the 223 municipalities of the province of Cáceres (plus the total population of the province of Cáceres), and of the 388 municipalities of the community of Extremadura (plus the total populations of the provinces of Badajoz and Cáceres). We have also considered the population (according to the 2016 census) of the 8,110 Spanish municipalities. With all these lists of records, we have analyzed the frequency of those starting with $d = 1, 2, \ldots, 9$; the comparison with the NBL is shown in Fig. 3.

FIG. 3. Comparison with the NBL of the distribution of the first digit in the populations of the municipalities of the provinces of Badajoz and Cáceres, the community of Extremadura, and Spain. (Axes: $p_d$ versus $d$.)

Figure 4 refers to two astronomical datasets. In the first case, the data are the distances to Earth of the 300 brightest stars, expressed both in light-years and in parsec. In the second case, the data considered correspond to the daily number of sunspots from 1818 to the present. As seen in Fig. 4, the distances between our planet and the stars follow the NBL generally well, despite the fact that the list is not excessively long (only 300 data) and that there are "local" deviations (for example, an inversion of the form $p_d < p_{d'}$ with $d < d'$ in the two choices of units). This was to be expected, since the distribution of digits in distances (which are expressed in units) is a clear example of invariance under scale change. However, in the case of the daily number of sunspots, quantitative (although not qualitative) differences with the NBL are observed, especially for $d = 1$, 3, 4, and 5. It should be noted that, although the series is rather long (more than 59,000 records, after excluding days without data or with 0 spots), each record only has one, two, or three digits (the maximum number of sunspots was 528 and corresponded to August 26, 1870).

FIG. 4. Comparison with the NBL of the distribution of the first digit in the distance to Earth (in light-years and in parsec) of the brightest 300 stars and in the daily number of sunspots. (Axes: $p_d$ versus $d$.)

Lastly, we have analyzed the prices of 1,016 items from a chain of fashion retailers and of 1,373 products from a chain of hypermarkets. The results are shown in Fig. 5. In this case, the discrepancies with the NBL are more pronounced. Although the highest frequencies occur for $d = 1$ and $d = 2$, the observed values of $p_d$ do not decrease monotonically with increasing $d$. In the case of the fashion retailers, the ordering is inverted for one pair of digits and along a chain of four digits; in the prices of the chain of hypermarkets, it is inverted along a chain of four digits. In principle, one might think that, since they can be expressed in euro, dollar, peso, yen, ..., prices should satisfy the property of invariance under change of scale inherent to the NBL. However, commercial and artificial pricing strategies are superimposed on this invariance, generating significant deviations from the NBL.

IV. THE NEWCOMB–BENFORD DISTRIBUTION AS AN ATTRACTOR UNDER SCALE CHANGE. A SIMPLE DYNAMIC MODEL
As already said, the NBL (1) is invariant under change of scale, that is, if we start from a dataset $\{r_n\}$ that fulfills the NBL and multiply all the records by a constant $\lambda$, the resulting dataset $\{\lambda r_n\}$ still fulfills the NBL. The interesting thing is that, in addition, the NBL is an attractor of this process: if we start from a dataset $\{r_n\}$ that does not fulfill the NBL and generate new sets by multiplying successively by $\lambda$ (other than a fractional power of 10), the generated sets converge towards the NBL. In this section we will analyze this property through a simple model that can be solved analytically.

FIG. 5. Comparison with the NBL of the distribution of the first digit in the prices of articles of a chain of fashion retailers and a chain of hypermarkets. (Axes: $p_d$ versus $d$.)

Suppose that at "time" $t$ we have a list of records $\{r_n(t)\}$ and let us denote by $p_d(t)$ the fraction of those records that have $d \in \{1, 2, \ldots, 9\}$ as the first significant digit. We then generate a new list at time $t + 1$ by multiplying by the factor 2, that is, $\{r_n(t+1)\} = \{2 r_n(t)\}$, $p_d(t+1)$ being the corresponding first-digit frequency distribution. According to Eqs. (2),

$$ p_1(t+1) = p_5(t) + p_6(t) + p_7(t) + p_8(t) + p_9(t), \tag{12a} $$

$$ p_2(t+1) = \alpha_1 p_1(t), \quad p_3(t+1) = (1 - \alpha_1) p_1(t), \tag{12b} $$

$$ p_4(t+1) = \alpha_2 p_2(t), \quad p_5(t+1) = (1 - \alpha_2) p_2(t), \tag{12c} $$

$$ p_6(t+1) = \alpha_3 p_3(t), \quad p_7(t+1) = (1 - \alpha_3) p_3(t), \tag{12d} $$

$$ p_8(t+1) = \alpha_4 p_4(t), \quad p_9(t+1) = (1 - \alpha_4) p_4(t), \tag{12e} $$

where the fractions $\alpha_d$ ($d = 1, 2, 3, 4$) are defined by Eq. (10).

Note that Eqs. (12) satisfy the normalization condition $\sum_{d=1}^{9} p_d(t+1) = \sum_{d=1}^{9} p_d(t) = 1$. Therefore, only 8 of the probabilities $\{p_d, d = 1, 2, \ldots, 9\}$ are linearly independent, so we can eliminate one of them. If we choose $p_9 = 1 - \sum_{d=1}^{8} p_d$, Eq. (12a) gives $p_1(t+1) = 1 - p_1(t) - p_2(t) - p_3(t) - p_4(t)$. Thus, Eqs. (12) can be written in matrix form as

$$ \mathbf{p}_{\mathrm{I}}(t+1) = \mathbf{q} + \mathsf{A} \cdot \mathbf{p}_{\mathrm{I}}(t), \quad \mathbf{p}_{\mathrm{II}}(t+1) = \mathsf{B} \cdot \mathbf{p}_{\mathrm{I}}(t), \tag{13} $$

where $\mathbf{p}_{\mathrm{I}}(t) = (p_1(t), p_2(t), p_3(t), p_4(t))^{\dagger}$, $\mathbf{p}_{\mathrm{II}}(t) = (p_5(t), p_6(t), p_7(t), p_8(t))^{\dagger}$, and $\mathbf{q} = (1, 0, 0, 0)^{\dagger}$ are column vectors. The square matrices $\mathsf{A}$ and $\mathsf{B}$ are given by

$$ \mathsf{A} = \begin{pmatrix} -1 & -1 & -1 & -1 \\ \alpha_1 & 0 & 0 & 0 \\ 1 - \alpha_1 & 0 & 0 & 0 \\ 0 & \alpha_2 & 0 & 0 \end{pmatrix}, \tag{14a} $$

$$ \mathsf{B} = \begin{pmatrix} 0 & 1 - \alpha_2 & 0 & 0 \\ 0 & 0 & \alpha_3 & 0 \\ 0 & 0 & 1 - \alpha_3 & 0 \\ 0 & 0 & 0 & \alpha_4 \end{pmatrix}. \tag{14b} $$

Both matrices are singular, that is, not invertible. This implies the irreversible character of the transition $\{p_d(t)\} \to \{p_d(t+1)\}$.

In general, the parameters $\alpha_d$ ($d = 1, 2, 3, 4$) (and, consequently, the matrices $\mathsf{A}$ and $\mathsf{B}$) will depend on the distributions of the first digit and of the first two digits of the dataset $\{r_n(t)\}$, so they will be functions of time. However, here we will consider a simplified model in which the four parameters $\alpha_d$ are fixed constants. In that case, the solution to the initial-value problem associated with Eq. (13) is

$$ \mathbf{p}_{\mathrm{I}}(t) = \sum_{n=0}^{t-1} \mathsf{A}^n \cdot \mathbf{q} + \mathsf{A}^t \cdot \mathbf{p}_{\mathrm{I}}(0) = (\mathsf{I} - \mathsf{A}^t) \cdot \mathbf{p}_{\mathrm{I}}^* + \mathsf{A}^t \cdot \mathbf{p}_{\mathrm{I}}(0), \tag{15a} $$

$$ \mathbf{p}_{\mathrm{II}}(t) = \mathsf{B} \cdot (\mathsf{I} - \mathsf{A}^{t-1}) \cdot \mathbf{p}_{\mathrm{I}}^* + \mathsf{B} \cdot \mathsf{A}^{t-1} \cdot \mathbf{p}_{\mathrm{I}}(0), \tag{15b} $$

where $\mathsf{I}$ is the identity matrix and

$$ \mathbf{p}_{\mathrm{I}}^* = (\mathsf{I} - \mathsf{A})^{-1} \cdot \mathbf{q} = \frac{1}{3 + \alpha_1 \alpha_2} \begin{pmatrix} 1 \\ \alpha_1 \\ 1 - \alpha_1 \\ \alpha_1 \alpha_2 \end{pmatrix}, \tag{16a} $$

$$ \mathbf{p}_{\mathrm{II}}^* = \mathsf{B} \cdot \mathbf{p}_{\mathrm{I}}^* = \frac{1}{3 + \alpha_1 \alpha_2} \begin{pmatrix} \alpha_1 (1 - \alpha_2) \\ (1 - \alpha_1) \alpha_3 \\ (1 - \alpha_1)(1 - \alpha_3) \\ \alpha_1 \alpha_2 \alpha_4 \end{pmatrix} \tag{16b} $$

is the stationary solution. Such a solution will be an attractor if $\lim_{t \to \infty} \mathbf{p}_{\mathrm{I}}(t) = \mathbf{p}_{\mathrm{I}}^*$ and $\lim_{t \to \infty} \mathbf{p}_{\mathrm{II}}(t) = \mathbf{p}_{\mathrm{II}}^*$ for any initial condition $\mathbf{p}_{\mathrm{I}}(0)$, that is, if $\lim_{t \to \infty} \mathsf{A}^t = 0$. Note that the initial values $\mathbf{p}_{\mathrm{II}}(0)$ do not influence the evolution of $\mathbf{p}_{\mathrm{II}}(t)$.

To check the above attractor condition, let us obtain the eigenvalues $\{a_i, i = 0, 1, 2, 3\}$ of $\mathsf{A}$. It is easy to see that the characteristic equation is $a(\alpha_1 \alpha_2 + a + a^2 + a^3) = 0$. Therefore, the eigenvalues are $a_0 = 0$ and

$$ a_1 = -\frac{1}{3}\left(1 + \frac{2}{\beta} - \beta\right), \tag{17a} $$

$$ a_{2,3} = -\frac{1}{3}\left(1 - \frac{1 \pm \imath\sqrt{3}}{\beta} + \frac{1 \mp \imath\sqrt{3}}{2}\,\beta\right), \tag{17b} $$

where $\imath$ is the imaginary unit and

$$ \beta \equiv \left[\frac{1}{2}\left(7 - 27\alpha_1\alpha_2 + 3\sqrt{9 - 42\alpha_1\alpha_2 + 81\alpha_1^2\alpha_2^2}\right)\right]^{1/3}. \tag{18} $$

Consequently,

$$ \mathsf{A}^t = \mathsf{U} \cdot \mathsf{D}^t \cdot \mathsf{U}^{-1}, \quad t = 1, 2, \ldots, \tag{19} $$

where

$$ \mathsf{U} = \begin{pmatrix} 0 & \dfrac{a_1^2}{\alpha_1\alpha_2} & \dfrac{a_2^2}{\alpha_1\alpha_2} & \dfrac{a_3^2}{\alpha_1\alpha_2} \\ 0 & \dfrac{a_1}{\alpha_2} & \dfrac{a_2}{\alpha_2} & \dfrac{a_3}{\alpha_2} \\ -1 & \dfrac{(1-\alpha_1) a_1}{\alpha_1\alpha_2} & \dfrac{(1-\alpha_1) a_2}{\alpha_1\alpha_2} & \dfrac{(1-\alpha_1) a_3}{\alpha_1\alpha_2} \\ 1 & 1 & 1 & 1 \end{pmatrix}, \tag{20} $$

$$ \mathsf{D}^t = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & a_1^t & 0 & 0 \\ 0 & 0 & a_2^t & 0 \\ 0 & 0 & 0 & a_3^t \end{pmatrix}, \quad t = 1, 2, \ldots. \tag{21} $$

From Eqs. (17) it can be verified that $|a_1| < |a_{2,3}| < 1$ if $0 < \alpha_1\alpha_2 < 1$, so that $\lim_{t \to \infty} \mathsf{D}^t = 0$. This proves the attractor character of the stationary distribution $\{\mathbf{p}_{\mathrm{I}}^*, \mathbf{p}_{\mathrm{II}}^*\}$.

If we choose $\alpha_d = \frac{1}{2}$ (that is, we assume that the second digits 0–4 are as likely as 5–9), then Eqs. (16) provide the stationary solution $p_1^* = \frac{4}{13} \simeq 0.308$, $p_2^* = p_3^* = \frac{2}{13} \simeq 0.154$, $p_4^* = p_5^* = p_6^* = p_7^* = \frac{1}{13} \simeq 0.077$, $p_8^* = p_9^* = \frac{1}{26} \simeq 0.038$.

To illustrate the evolution of $p_d(t)$ towards $p_d^*$, we are going to consider two different initial conditions. First, we start from a uniform distribution, that is, $p_d(0) = \frac{1}{9}$. The result is shown in Fig. 6, where we see that the evolution is oscillatory, as corresponds to the fact that both the real eigenvalue ($a_1$) and the real part of the complex eigenvalues ($a_{2,3}$) are negative. As a second example, we take an inverted initial distribution, that is, $p_d(0) = p_{10-d}^*$, so that the digit 9 is the most frequent and the digit 1 is the least frequent. In this latter case, as shown in Fig. 7, the initial oscillations are of greater amplitude but, as before, the stationary distribution is practically reached after a few iterations.

FIG. 6. Evolution of $p_d(t)$ (top panel) and of the ratio $p_d(t)/p_d^*$ (bottom panel), when starting from a uniform initial distribution $p_d(0) = \frac{1}{9}$.

It seems convenient to characterize the evolution of the set of probabilities $\{p_d(t)\}$ towards the attractor $\{p_d^*\}$ by means of a single parameter that, in addition, evolves monotonically, thus representing the irreversibility of the evolution. It is expected that these properties are satisfied by the Kullback–Leibler divergence, which in our case is defined as

$$ D_{\mathrm{KL}}(t) = \sum_{d=1}^{9} p_d(t) \ln \frac{p_d(t)}{p_d^*}. \tag{22} $$

This quantity represents the relative entropy of $p_d(t)$ with respect to $p_d^*$. Figure 8 shows the evolution of $D_{\mathrm{KL}}(t)$ for the same initial conditions as in Figs. 6 and 7.
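The simplified model with constant $\alpha_d$ is straightforward to iterate numerically. The sketch below is an illustration, not the authors' code: it applies Eqs. (12) directly, evaluates the stationary solution of Eqs. (16), and tracks the Kullback–Leibler divergence of Eq. (22). The closed form used for $\beta$ is the Cardano solution of the characteristic cubic $a^3 + a^2 + a + \alpha_1\alpha_2 = 0$, which is an assumption about the exact form of Eq. (18); the code checks it against the cubic itself. All function names are hypothetical.

```python
import math

def step(p: dict, alpha: dict) -> dict:
    """One iteration of Eqs. (12): first-digit distribution after doubling."""
    new = {1: sum(p[d] for d in range(5, 10))}
    for d in range(1, 5):
        new[2 * d] = alpha[d] * p[d]
        new[2 * d + 1] = (1 - alpha[d]) * p[d]
    return new

def stationary(alpha: dict) -> dict:
    """Stationary distribution, Eqs. (16); p_9 follows from normalization."""
    a1, a2, a3, a4 = (alpha[d] for d in range(1, 5))
    norm = 3 + a1 * a2
    p = {1: 1, 2: a1, 3: 1 - a1, 4: a1 * a2,
         5: a1 * (1 - a2), 6: (1 - a1) * a3,
         7: (1 - a1) * (1 - a3), 8: a1 * a2 * a4}
    p = {d: v / norm for d, v in p.items()}
    p[9] = 1 - sum(p.values())
    return p

def kl_divergence(p: dict, q: dict) -> float:
    """Eq. (22): relative entropy of p with respect to q."""
    return sum(p[d] * math.log(p[d] / q[d]) for d in p if p[d] > 0)

def nonzero_eigenvalues(alpha: dict):
    """Nonzero eigenvalues of A (assumed Cardano form of Eqs. (17)-(18))."""
    c = alpha[1] * alpha[2]
    beta = (0.5 * (7 - 27 * c + 3 * math.sqrt(9 - 42 * c + 81 * c * c))) ** (1 / 3)
    s = 1j * math.sqrt(3)
    a1 = -(1 + 2 / beta - beta) / 3
    a2 = -(1 - (1 + s) / beta + (1 - s) * beta / 2) / 3
    a3 = -(1 - (1 - s) / beta + (1 + s) * beta / 2) / 3
    return a1, a2, a3

alpha = {d: 0.5 for d in range(1, 5)}   # second digits 0-4 as likely as 5-9
p_star = stationary(alpha)
eigs = nonzero_eigenvalues(alpha)

p = {d: 1 / 9 for d in range(1, 10)}    # uniform initial distribution
history = [kl_divergence(p, p_star)]
for _ in range(100):
    p = step(p, alpha)
    history.append(kl_divergence(p, p_star))
```

Replacing the initial dictionary `p` by any other normalized distribution lets one explore how quickly the attractor is reached; the list `history` can be plotted on a logarithmic scale to visualize the decay.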
In both cases the monotonic evolution of $D_{\mathrm{KL}}(t)$ is confirmed. Also, the asymptotic decay to 0 is essentially exponential, with a rate independent of the initial state. To see this decay in more detail, let us consider times long enough that the deviations $\delta p_d(t) \equiv p_d(t) - p_d^*$ can be considered small. In this regime, we can expand Eq. (22) in a power series and retain the dominant term. The result is

$$ D_{\mathrm{KL}}(t) \approx \frac{1}{2} \sum_{d=1}^{9} \frac{[\delta p_d(t)]^2}{p_d^*}. \tag{23} $$

On the other hand, for long enough times, $|a_1|^t \ll |a_{2,3}|^t$ (note that $|a_1| \simeq 0.319$ and $|a_{2,3}| \simeq 0.885$ for $\alpha_d = \frac{1}{2}$), so that, according to Eqs. (15), $\delta p_d(t) \sim |a_{2,3}|^t$. Thus, $D_{\mathrm{KL}}(t) \sim |a_{2,3}|^{2t} = 10^{2t \log_{10}|a_{2,3}|}$. This asymptotic behavior is also represented in Fig. 8.

FIG. 7. Evolution of $p_d(t)$ (top panel) and of the ratio $p_d(t)/p_d^*$ (bottom panel), when starting from an inverted initial distribution $p_d(0) = p_{10-d}^*$.

FIG. 8. Evolution of the Kullback–Leibler divergence $D_{\mathrm{KL}}(t)$ (in logarithmic scale), starting from the uniform initial distribution $p_d(0) = \frac{1}{9}$ and from the inverted initial distribution $p_d(0) = p_{10-d}^*$. The dashed line is proportional to $|a_{2,3}|^{2t}$.

Let us prove that, indeed, $D_{\mathrm{KL}}(t+1) \le D_{\mathrm{KL}}(t)$. According to Eqs. (12) and (22),

$$ D_{\mathrm{KL}}(t+1) = \left[\sum_{d=5}^{9} p_d(t)\right] \ln \frac{\sum_{d=5}^{9} p_d(t)}{p_1^*} + \sum_{d=1}^{4} \left[\alpha_d p_d(t) \ln \frac{\alpha_d p_d(t)}{p_{2d}^*} + (1 - \alpha_d) p_d(t) \ln \frac{(1 - \alpha_d) p_d(t)}{p_{2d+1}^*}\right]. \tag{24} $$

Taking into account that $p_1^* = \sum_{d=5}^{9} p_d^*$ and $p_{2d}^* = \alpha_d p_d^*$, $p_{2d+1}^* = (1 - \alpha_d) p_d^*$ for $d = 1, 2, 3, 4$, Eq. (24) yields

$$ \Delta D_{\mathrm{KL}}(t) \equiv D_{\mathrm{KL}}(t+1) - D_{\mathrm{KL}}(t) = \left[\sum_{d=5}^{9} p_d(t)\right] \ln \frac{\sum_{d=5}^{9} p_d(t)}{\sum_{d=5}^{9} p_d^*} - \sum_{d=5}^{9} p_d(t) \ln \frac{p_d(t)}{p_d^*}. \tag{25} $$

The difference $\Delta D_{\mathrm{KL}}(t)$ is a function of the 10 parameters $\{p_d(t), p_d^*, d = 5, \ldots, 9\}$. To find the maximum value of $\Delta D_{\mathrm{KL}}(t)$, we take the derivatives

$$ \frac{\partial \Delta D_{\mathrm{KL}}(t)}{\partial p_d(t)} = \ln \frac{\sum_{d'=5}^{9} p_{d'}(t)}{\sum_{d'=5}^{9} p_{d'}^*} - \ln \frac{p_d(t)}{p_d^*}, \tag{26a} $$

$$ \frac{\partial \Delta D_{\mathrm{KL}}(t)}{\partial p_d^*} = -\frac{\sum_{d'=5}^{9} p_{d'}(t)}{\sum_{d'=5}^{9} p_{d'}^*} + \frac{p_d(t)}{p_d^*}. \tag{26b} $$

The solution to the extremum conditions $\partial \Delta D_{\mathrm{KL}}(t)/\partial p_d(t) = \partial \Delta D_{\mathrm{KL}}(t)/\partial p_d^* = 0$ is $p_d(t) = \gamma p_d^*$ ($d = 5, \ldots, 9$), where $0 < \gamma < 1/\sum_{d=5}^{9} p_d^*$ is arbitrary. In such a case, $\Delta D_{\mathrm{KL}}(t) = 0$. To see that this is actually a maximum value, suppose, for instance, that $p_d(t) = 0$ except if $d = d_0$ (with $d_0 = 5, \ldots, 9$). In that case, $\Delta D_{\mathrm{KL}}(t) = p_{d_0}(t) \ln\left(p_{d_0}^* / \sum_{d=5}^{9} p_d^*\right) < 0$. Therefore, we conclude that

$$ D_{\mathrm{KL}}(t+1) \le D_{\mathrm{KL}}(t), \tag{27} $$

the equality holding only if $p_d(t) = \gamma p_d^*$ ($d = 5, \ldots, 9$). One can then define an entropy-like quantity $S = -D_{\mathrm{KL}} + \text{const}$, so that $S$ increases irreversibly in the evolution towards equilibrium.

V. CONCLUDING REMARKS
We hope that this article has helped to show that, contrary to what might be initially thought, the first significant digit of a dataset extracted from nature or the real world is not evenly distributed among the nine possible values ($d = 1, 2, \ldots, 9$); rather, its frequency is highest for $d = 1$ and decreases as $d$ increases. The NBL (1) gives a mathematical expression to this empirical fact, although it need not always hold rigorously. It is to be expected that, except for unavoidable statistical fluctuations, the law is fulfilled in datasets accompanied by units (as generally happens in physics), so that the distribution of the first digit is independent of the units chosen (invariance under change of scale). More generally, the NBL is satisfied when the mantissa of the logarithms (in any base) of the considered data is uniformly distributed. This explains why lists with, in principle, so little relation to physical quantities, such as the Fibonacci numbers or the powers of 2, also satisfy the NBL. Moreover, if an initial list of data does not comply with the law, iteratively multiplying the data by an irrational power of 10 causes the distribution of the first digit in the resulting lists to converge towards the NBL. We have illustrated this property of the NBL as an attractor in Sec. IV through a simple dynamic model that mimics the evolution of a dataset when multiplied by the factor 2.

Until the 1970s (which is when scientific pocket calculators began to be used), physicists used tables of logarithms (or their application in slide rules) for small everyday scientific calculations, although if the calculations were more complicated, the big computers of the time could be used. Those calculations are nowadays performed on pocket calculators, cellular phones, or personal computers with a wide variety of existing mathematical programs. Since the data that are manipulated in physics are extracted from "real" situations, such as experiments, models, physical constants, ..., we can conclude, as a tribute to Newcomb and Benford, that the key 1 will be the one that presents the greatest wear and tear, while that of 9 will be the least used, thus answering affirmatively the question posed in the title of this article.

REFERENCES

1. S. Newcomb, "Note on the frequency of use of the different digits in natural numbers," Am. J. Math., 39–40 (1881).
2. F. Benford, "The law of anomalous numbers," Proc. Am. Philos. Soc., 551–572 (1938).
3. S. M. Stigler, "Stigler's law of eponymy," Trans. N. Y. Acad. Sci., 147–158 (1989).
4. E. W. Weisstein, "Benford's law," https://mathworld.wolfram.com/BenfordsLaw.html.
5. R. S. Pinkham, "On the distribution of first significant digits," Ann. Math. Statist., 1223–1230 (1961).
6. A. Berger and T. P. Hill, "A basic theory of Benford's law," Probab. Surv., 1–126 (2011).
7. D. C. Hoyle, M. Rattray, R. Jupp, and A. Brass, "Making sense of microarray data distributions," Bioinformatics, 576–584 (2002).
8. D. Ni and Z. Ren, "Benford's law and half-lives of unstable nuclei," Eur. Phys. J. A, 251–255 (2008).
9. A. Dantuluri and S. Desai, "Do τ lepton branching fractions obey Benford's law?" Physica A, 919–928 (2018).
10. T. Alexopoulos and S. Leontsinis, "Benford's law in astronomy," J. Astrophys. Astron., 639–648 (2014).
11. A. Bera, U. Mishra, S. S. Roy, A. Biswas, A. Sen(De), and U. Sen, "Benford analysis of quantum critical phenomena: First digit provides high finite-size scaling exponent while first two and further are not much better," Phys. Lett. A, 1639–1644 (2018).
12. S. de Marchi and J. Hamilton, "Assessing the accuracy of self-reported data: An evaluation of the toxics release inventory," J. Risk. Uncertain., 57–76 (2006).
13. M. J. Nigrini, Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection (Wiley, Hoboken, NJ, 2012).
14. W. K. T. Cho and B. J. Gaines, "Breaking the (Benford) law: statistical fraud detection in campaign finance," Am. Stat., 218–223 (2007).
15. D. Gamermann and F. L. Antunes, "Statistical analysis of Brazilian electoral campaigns via Benford's law," Physica A, 171–188 (2018).
16. A. Diekmann, "Not the first digit! Using Benford's law to detect fraudulent scientific data," J. Appl. Stat., 321–329 (2007).
17. M. Ausloos, A. Eskandary, P. Kaur, and G. Dhesi, "Evidence for gross domestic product growth time delay dependence over foreign direct investment. A time-lag dependent correlation study," Physica A, 121181 (2019).
18. T. P. Hill, "The first digit phenomenon: A century-old observation about an unexpected pattern in many numerical tables applies to the stock market, census statistics and accounting data," Am. Sci., 358–363 (1998).
19. L. Pietronero, E. Tosatti, V. Tosatti, and A. Vespignani, "Explaining the uneven distribution of numbers in nature: the laws of Benford and Zipf," Physica A, 297–304 (2001).
20. M. Miranda-Zanetti, F. Delbianco, and F. Tohmé, "Tampering with inflation data: A Benford law-based analysis of national statistics in Argentina," Physica A, 761–770 (2019).
21. S. N. Dorogovtsev, J. F. F. Mendes, and J. G. Oliveira, "Frequency of occurrence of numbers in the World Wide Web," Physica A, 548–556 (2006).
22. T. A. Mir, "The Benford law behavior of the religious activity data," Physica A, 1–9 (2014).
23. M. Ausloos, "Econophysics of a religious cult: The Antoinists in Belgium [1920-2000]," Physica A, 3190–3197 (2012).
24. M. Ausloos, C. Herteliu, and B. Ileanu, "Breakdown of Benford's law for birth data," Physica A, 736–745 (2015).
25. M. Ausloos, R. Cerqueti, and C. Lupi, "Long-range properties and data validity for hydrogeological time series: The case of the Paglia river," Physica A, 39–50 (2017).
26. K.-B. Lee, S. Han, and Y. Jeong, "Covid-19, flattening the curve, and Benford's law," Physica A, 125090 (2020).
27. "Testing Benford's law," https://testingbenfordslaw.com.
28. Instituto Nacional de Estadística.
29. "The brightest stars."
30. "Sunspot number," http://sidc.be/silso/datafiles.
31. https://cortefiel.com/es/es/mujer?srule=price-high-to-low&
32. [...]
33. S. Kullback and R. A. Leibler, "On information and sufficiency," Ann. Math. Statist. 22