A note on the closed-form solution for the longest head run problem of Abraham de Moivre
Yaakov Malinovsky ∗
Department of Mathematics and Statistics
University of Maryland, Baltimore County, Baltimore, MD 21250, USA
September 17, 2020
Abstract
The problem of the longest head run was introduced and solved by Abraham de Moivre in the second edition of his book Doctrine of Chances (de Moivre, 1738). The closed-form solution as a finite sum involving binomial coefficients was provided in Uspensky (1937). Since then, the problem and its variations and extensions have found broad interest and diverse applications. Surprisingly, a very simple closed form can be obtained, which we present in this note.
Keywords: longest run problem; generating functions; history of probability
In a series of $n$ independent trials, an event $E$ has a probability $p$ of occurrence for each trial. If, in these trials, event $E$ occurs at least $r$ times without interruption, then we have a run of size $r$. What is the probability $y_n$ of having a run of size $r$ in $n$ trials? This problem was formulated and solved by Abraham de Moivre in the second edition of his book The Doctrine of Chances: or, A Method of Calculating the Probabilities of Events in Play (de Moivre (1738), Problem LXXXVIII, p. 243). Although more than 280 years have passed since then, de Moivre's problem and its variations remain of great interest in probability and statistics; see for example Novak (2017) and references therein.

∗ email: [email protected]
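To make the definition of $y_n$ concrete, the following is a minimal brute-force sketch in Python that enumerates all $2^n$ outcome sequences and sums the probabilities of those containing a run of size $r$. The function names `has_run` and `run_probability_bruteforce` are illustrative and do not come from the source, $r \ge 1$ is assumed, and the approach is only practical for small $n$.

```python
from fractions import Fraction
from itertools import product


def has_run(outcomes, r):
    """Return True if `outcomes` contains at least r consecutive successes."""
    streak = 0
    for success in outcomes:
        streak = streak + 1 if success else 0
        if streak >= r:
            return True
    return False


def run_probability_bruteforce(n, r, p):
    """Probability of a run of size r (r >= 1) in n trials, by full enumeration.

    Illustrative helper, not from the source. Uses exact rational arithmetic,
    so p is best passed as a Fraction.
    """
    p = Fraction(p)
    q = 1 - p
    total = Fraction(0)
    for outcomes in product((True, False), repeat=n):
        if has_run(outcomes, r):
            k = sum(outcomes)              # number of occurrences of E
            total += p**k * q**(n - k)     # probability of this exact sequence
    return total


if __name__ == "__main__":
    # Ten trials, run of size 3, fair coin
    print(run_probability_bruteforce(10, 3, Fraction(1, 2)))  # 65/128
```

For $n = 10$, $r = 3$, and $p = 1/2$, exactly 520 of the 1024 equally likely sequences contain a run of size 3, which gives $65/128$.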
De Moivre did not provide a proof, but demonstrated a method of finding $y_n$. Reviewing that method, one can see that he used the method of generating functions. He demonstrated the method with an example of ten trials having $p = 1/2$, in which the probability of a run of size 3 equals $65/128$. […] $y_n$ can be obtained as a corollary from the difference equation given by Uspensky. This closed-form solution seems to have never been reported in the literature. In this note, we present it along with Uspensky's original derivations.

We present Uspensky's solution (Uspensky (1937), pages 77-79) while keeping his original notation. This solution demonstrates the power of ordinary linear difference equations used together with generating functions. Recall that we denoted by $y_n$ the probability of a run of size $r$ in $n$ independent trials. Let us consider $n+1$ trials with the corresponding probability $y_{n+1}$. A run of size $r$ in $n+1$ trials can happen in two mutually exclusive ways:

(W1): the run is obtained in the first $n$ trials, or
(W2): the run is obtained as of trial $n+1$.

(W2) means that among the first $n-r$ trials there is no run of size $r$; event $E^C$ occurred at trial $n-r+1$; and event $E$ occurred in the trials $n-r+2, \ldots, n+1$. Combining (W1) and (W2), we obtain a linear difference equation of order $r+1$,

$$y_{n+1} = y_n + (1 - y_{n-r})\, q p^r, \tag{1}$$

with initial conditions $y_0 = y_1 = \cdots = y_{r-1} = 0$, $y_r = p^r$, where $q = 1 - p$. Substituting $y_n = 1 - z_n$, we then have

$$z_{n+1} - z_n + q p^r z_{n-r} = 0, \tag{2}$$

with the corresponding initial conditions

$$z_0 = z_1 = \cdots = z_{r-1} = 1; \quad z_r = 1 - p^r. \tag{3}$$

The solution of (2) was obtained by the method of generating functions. The generating function of the sequence $z_0, z_1, z_2, \ldots$ is the power series in $t$ defined as

$$\varphi(t) = z_0 + z_1 t + z_2 t^2 + \cdots + z_n t^n + \cdots.$$

Using (2) and (3), one hopes to find a definite function $\varphi(t)$; then the coefficient of $t^n$ will be precisely $z_n$. In our case, this outcome is possible and can be obtained by multiplying $\varphi(t)$ by $1 - t + q p^r t^{r+1}$. In the product, the coefficient of $t^{n+1}$ vanishes for every $n \geq r$ by (2), and by (3) the remaining terms reduce to $1 - p^r t^r$, so that

$$\varphi(t) = \frac{1 - p^r t^r}{1 - t + q p^r t^{r+1}}. \tag{4}$$

Then, the generating function $\varphi(t)$ can be developed into a power series in $t$ with the coefficient $z_n$ of $t^n$ given by

$$z_n = \beta_{n,r} - p^r \beta_{n-r,r}, \qquad \beta_{n,r} = \sum_{l=0}^{\lfloor n/(r+1) \rfloor} (-1)^l \binom{n - lr}{l} (q p^r)^l. \tag{5}$$

Going back to de Moivre's original example, where $n = 10$, $r = 3$ and $p = 1/2$, and using (5), we obtain $z_n = 63/128$ and $y_n = 65/128$, which agrees with the value given in the Doctrine of Chances (de Moivre, 1738).

The case $r \geq n/2$

Surprisingly, a simple closed-form solution follows from formula (1) of Uspensky (1937), arrived at by substituting $n = r, r+1, r+2, \ldots$ and considering an appropriate range of $r$. Substituting $n = r$, we obtain

$$y_{r+1} = y_r + (1 - y_0)\, p^r q = \begin{cases} p^r + p^r q & \text{if } r \geq 1, \\ 1 & \text{if } r = 0. \end{cases} \tag{6}$$

Substituting $n = r+1$ and using (6), we obtain

$$y_{r+2} = y_{r+1} + (1 - y_1)\, p^r q = \begin{cases} p^r + 2 p^r q & \text{if } r \geq 2, \\ p + pq + (1 - p)\, pq = 1 - q^3 & \text{if } r = 1, \\ 1 & \text{if } r = 0. \end{cases} \tag{7}$$

Substituting $n = r+2$ and using (6) and (7), we obtain

$$y_{r+3} = y_{r+2} + (1 - y_2)\, p^r q = \begin{cases} p^r + 3 p^r q & \text{if } r \geq 3, \\ p^2 + 2 p^2 q + (1 - p^2)\, p^2 q & \text{if } r = 2, \\ 1 - q^3 + (1 - p - pq)\, pq = 1 - q^4 & \text{if } r = 1, \\ 1 & \text{if } r = 0. \end{cases} \tag{8}$$

By continuing in a similar manner, we obtain the following corollary:

Corollary 1. If $n/2 \leq r \leq n$, where $r$ is an integer, then

$$y_n = p^r + (n - r)\, p^r q. \tag{9}$$

Remark 1.
Corollary 1 can be obtained directly from the difference equation $y_n = y_{n-1} + (1 - y_{n-1-r})\, q p^r$, which has initial conditions $y_0 = \cdots = y_{r-1} = 0$, $y_r = p^r$. If $n - 1 - r \leq r - 1$ (i.e., $n/2 \leq r$), then $y_{n-1-r} = 0$, so each step from $y_r = p^r$ up to $y_n$ adds exactly $q p^r$, and Corollary 1 follows.

There are a number of interesting problems discussed in the Uspensky (1937) book, many of which have roots in the classics of probability, their origins tracing back to founders of modern probability such as Pascal, Fermat, Huygens, Bernoulli, de Moivre, Laplace, Markoff, Bernstein, and others. For example, Uspensky (1937) considered another problem of de Moivre's that was later discussed and extended by Diaconis and Zabell (1991). A large collection of classic problems in probability, with historical comments and original citations, is nicely presented in the book by Gorroochurn (2012).

There are many follow-ups on and extensions of de Moivre's longest head run problem. An interesting recursive solution of the problem in the case of the fair coin was given by Székely and Tusnády (1979). That problem was then extended to the Markov chain setting, where Uspensky's generating function (4) was generalized to the case of dependent observations (Novak, 1989). The problem has also found applications in numerous fields, among them reliability (Derman et al., 1982), computational biology (Schbath, 2000), and finance, where time-dependent sequences naturally occur (see Novak (2011) and references therein).
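As a numerical check on the difference equation (1), the finite sum (5), and the closed form (9), the following Python sketch evaluates all three with exact rational arithmetic. The function names are illustrative and not from the source; the code assumes $r \geq 1$, treats $\beta_{m,r}$ as 0 for $m < 0$, and takes the upper summation limit in (5) to be the integer part of $n/(r+1)$. It is a sketch under these assumptions, not a definitive implementation.

```python
from fractions import Fraction
from math import comb  # available in Python 3.8+


def run_prob_recurrence(n, r, p):
    """y_n from difference equation (1), with y_0 = ... = y_{r-1} = 0, y_r = p^r."""
    p = Fraction(p)
    q = 1 - p
    if n < r:
        return Fraction(0)
    y = [Fraction(0)] * r + [p**r]
    for m in range(r, n):
        # y_{m+1} = y_m + (1 - y_{m-r}) q p^r
        y.append(y[m] + (1 - y[m - r]) * q * p**r)
    return y[n]


def beta(m, r, a):
    """beta_{m,r} from (5) with a = q p^r; taken to be 0 for m < 0 (assumption)."""
    if m < 0:
        return Fraction(0)
    return sum(
        (-1) ** l * comb(m - l * r, l) * a**l
        for l in range(m // (r + 1) + 1)
    )


def run_prob_finite_sum(n, r, p):
    """y_n = 1 - z_n, with z_n given by the finite sum (5)."""
    p = Fraction(p)
    a = (1 - p) * p**r
    z = beta(n, r, a) - p**r * beta(n - r, r, a)
    return 1 - z


def run_prob_closed_form(n, r, p):
    """y_n from (9); only valid when n/2 <= r <= n."""
    if not (n <= 2 * r and r <= n):
        raise ValueError("closed form (9) requires n/2 <= r <= n")
    p = Fraction(p)
    return p**r + (n - r) * p**r * (1 - p)


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(run_prob_recurrence(10, 3, half))   # 65/128
    print(run_prob_finite_sum(10, 3, half))   # 65/128
    print(run_prob_closed_form(6, 3, half))   # 5/16
```

For de Moivre's example ($n = 10$, $r = 3$, $p = 1/2$), the recurrence and the finite sum both return $65/128$. The closed form does not apply there, since $r < n/2$, but for $n = 6$, $r = 3$ it gives $1/8 + 3/16 = 5/16$, matching the recurrence.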
Acknowledgement
I thank Serguei Novak for comments on the early version of this article.
References
de Moivre, A. (1738). The Doctrine of Chances. 2nd ed. H. Woodfall, London. [Publicly available on the website of the Linda Hall Library: http://lhldigital.lindahall.org/cdm/ref/collection/math/id/16415].
Derman, C., Lieberman, G. J., and Ross, S. M. (1982). On the consecutive-k-out-of-n:F systems. IEEE Trans. Reliab. 31, 57–63.
Diaconis, P., Zabell, S. (1991). Closed form summation for classical distributions: Variations on a theme of de Moivre. Statist. Sci. 6, 284–302.
Gorroochurn, P. (2012). Classic problems of probability. Wiley, Hoboken, NJ.
Novak, S.Y. (1989). Asymptotic expansions in the problem of the length of the longest head-run for Markov chain with two states. Trudy Inst. Math. (Novosibirsk) 13, 136–147. In Russian.
Novak, S.Y. (2011). Extreme value methods with application to Finance. Chapman & Hall/CRC Press, London.
Novak, S.Y. (2017). On the length of the longest head run. Stat Probab Lett. 130, 111–114.
Schbath, S. (2000). An overview on the distribution of word counts in Markov chains. J. Comput. Biology 7, 193–201.
Székely, G., Tusnády, G. (1979). Generalized Fibonacci numbers, and the number of "pure heads". Matematikai Lapok2