[PDF] Transition probabilities for general birth-death processes with applications in ecology, genetics, and evolution

Abstract

A birth-death process is a continuous-time Markov chain that counts the number of particles in a system over time. In the general process with n current particles, a new particle is born with instantaneous rate λ n and a particle dies with instantaneous rate μ n . Currently no robust and efficient method exists to evaluate the finite-time transition probabilities in a general birth-death process with arbitrary birth and death rates. In this paper, we first revisit the theory of continued fractions to obtain expressions for the Laplace transforms of these transition probabilities and make explicit an important derivation connecting transition probabilities and continued fractions. We then develop an efficient algorithm for computing these probabilities that analyzes the error associated with approximations in the method. We demonstrate that this error-controlled method agrees with known solutions and outperforms previous approaches to computing these probabilities. Finally, we apply our novel method to several important problems in ecology, evolution, and genetics.

Full PDF

NNoname manuscript No. (will be inserted by the editor)

Transition probabilities for general birth-death processeswith applications in ecology, genetics, and evolution

Forrest W. Crawford · Marc A. Suchard

Typeset on September 18, 2018

Abstract

A birth-death process is a continuous-time Markov chain that counts thenumber of particles in a system over time. In the general process with n current par-ticles, a new particle is born with instantaneous rate λ n and a particle dies withinstantaneous rate µ n . Currently no robust and eﬃcient method exists to evaluatethe ﬁnite-time transition probabilities in a general birth-death process with arbitrarybirth and death rates. In this paper, we ﬁrst revisit the theory of continued fractionsto obtain expressions for the Laplace transforms of these transition probabilities andmake explicit an important derivation connecting transition probabilities and contin-ued fractions. We then develop an eﬃcient algorithm for computing these probabilitiesthat analyzes the error associated with approximations in the method. We demonstratethat this error-controlled method agrees with known solutions and outperforms previ-ous approaches to computing these probabilities. Finally, we apply our novel methodto several important problems in ecology, evolution, and genetics. Keywords

General birth-death process · Continuous-time Markov chain · Transitionprobabilities · Population genetics · Ecology · Evolution

Forrest W. CrawfordDepartment of Biomathematics, University of California Los Angeles Los Angeles, CA 90095-1766 USA E-mail: [email protected] A. SuchardDepartments of Biomathematics, Biostatistics and Human Genetics, University of CaliforniaLos Angeles Los Angeles, CA 90095-1766 USA E-mail: [email protected] a r X i v : . [ q - b i o . P E ] N ov PACS

PACS code1 · PACS code2 · more Mathematics Subject Classiﬁcation (2000) · · · Birth-death processes (BDPs) have a rich history in probabilistic modeling, includ-ing applications in ecology, genetics, and evolution (Thorne et al 1991; Krone andNeuhauser 1997; Novozhilov et al 2006). Traditionally, BDPs have been used to modelthe number of organisms or particles in a system, each of which reproduce and die incontinuous time. A general BDP is a continuous-time Markov chain on the non-negativeintegers in which instantaneous transitions from state n ≥ n + 1 or n − n , jumpsto n + 1 occur with instantaneous rate λ n and jumps to n − µ n . The simplest BDP has linear rates λ n = nλ and µ n = nµ with no state-independent terms (Kendall 1948; Feller 1971). This model is the most widely-usedBDP since there exist closed-form expressions for its transition probabilities (Bailey1964; Novozhilov et al 2006). Many applications of BDPs require convenient methodsfor computing the probability P m,n ( t ) that the system moves from state m to state n in ﬁnite time t ≥

0. These probabilities exhibit their usefulness in many modeling ap-plications since the probabilities do not depend on the possibly unobserved path takenby the process from m to n and hence make possible analyses of discretely sampled orpartially observed processes. Despite the relative simplicity of specifying the rates of ageneral BDP, it can be remarkably diﬃcult to ﬁnd closed-form solutions for the tran-sition probabilities even for simple models (Renshaw 1993; Mederer 2003; Novozhilovet al 2006).In a pioneering series of papers, Karlin and McGregor develop a formal theoryof general BDPs that expresses their transition probabilities in terms of a sequence oforthogonal polynomials and a spectral measure (Karlin and McGregor 1957a,b, 1958b).While the work of Karlin and McGregor yields valuable theoretical insights regardingthe existence of unique solutions and properties of recurrence and transience for a given process, there remains no clear recipe for determining the orthogonal polynomials andmeasure corresponding to an arbitrary set of birth and death rates. Additionally, evenwhen the polynomials and measure are known, the transition probabilities may nothave an analytic representation or a convenient computational form.Possibly due to the diﬃculty of ﬁnding computationally useful formulas for transi-tion probabilities in general BDPs, many applied researchers resort to easier analysesusing moments, ﬁrst passage times, equilibrium probabilities, and other tractable quan-tities of interest. Referring to the system of Kolmogorov forward diﬀerential equationsfor transition probabilities that we give below, Novozhilov et al (2006, page 73) write,“The problem with exact solutions of system (1) is that, in many cases, theexpressions for the state probabilities, although explicit, are intractable foranalysis and include special polynomials. In such cases, it may be sensibleto solve more modest problems concerning the birth-and-death process underconsideration, without the knowledge of the time-dependent behavior of stateprobabilities p n ( t ).”Indeed, closed-form analytic expressions for transition probabilities of general BDPsare only known for a few types of processes. Some examples include constant birth anddeath rates (Bailey 1964), zero birth or death rates (pure-death and pure-birth) (Yule1925; Taylor and Karlin 1998), and certain linear rates (Karlin and McGregor 1958a).As a seemingly straightforward example, in the BDP with linear birth and death rates λ n = nλ + ν and µ n = nµ + γ including state-independent terms, Ismail et al (1988)oﬀer the orthogonal polynomials and associated measure, but still no closed form isavailable for the transition probabilities.Despite the diﬃculty in obtaining analytic expressions, several authors have madeprogress in approximate numerical methods for solution of transition probabilitiesin general BDPs. Murphy and O’Donohoe (1975) develop an appealing numericalmethod for the transition probabilities based on a continued fraction representationof Laplace-transformed transition probabilities. They invert these transformed prob-abilities by ﬁrst truncating the continued fraction. Several other authors give similar expressions derived from truncation of the state space (Grassmann 1977b,a; Rosenlund1978; Sharma and Dass 1988; Mohanty et al 1993). However, Klar et al (2010) ﬁndthat methods based on continued fraction truncation and then subsequent analyticaltransformation can suﬀer from instability. As an alternative, Parthasarathy and Sud-hesh (2006a) express the inﬁnite continued fraction representation given by Murphyand O’Donohoe as a power series. Unfortunately, the small radius of convergence ofthis series makes it less useful for numerical computation.We also note that for general BDPs that take values on a ﬁnite state space (usually n ∈ { , , . . . , N } ), it is possible to write a ﬁnite-dimensional stochastic transition ratematrix and solve for the matrix of transition probabilities. If the rate matrix is diagonal-izable, computation of transition probabilities in this manner can be computationallystraightforward. To illustrate, let Q be a ﬁnite-dimensional stochastic rate matrix with Q = UΛU − where U is an orthogonal matrix and Λ is diagonal. The matrix of tran-sition probabilities P satisﬁes the matrix diﬀerential equation P (cid:48) = P Q with initialcondition P (0) = I . The solution is P ( t ) = exp[ Qt ] = U diag( e z t , e z t , . . . , e z N t ) U − ,where z , . . . , z N are the eigenvalues of Q . However, it is possible to specify reason-able rate parameters in a general BDP that satisfy requirements for the existence ofa unique solution, but do not result in a diagonalizable rate matrix. Also, if the statespace over which the BDP takes values is large, numerical eigendecomposition of Q may be computationally expensive and could introduce serious roundoﬀ errors.To our knowledge, no robust computational method currently exists for ﬁnding theﬁnite-time transition probabilities of general BDPs with arbitrary rates. Such a tech-nique would allow rapid development of rich and sophisticated ecological, genetic, andevolutionary models. Additionally, in statistical applications, transition probabilitiescan serve as observed data likelihoods, and are thus often useful in estimating tran-sition rate parameters from partially observed BDPs. We believe more sophisticatedBDPs can be very useful for applied researchers. In spite of the numerical diﬃcultiespresented by approximant methods, we are surprised that continued fraction meth-ods like that of Murphy and O’Donohoe (1975) are not more widely explored. This may be due to omission of important details in their derivation of continued fractionexpressions for the Laplace transform of the transition probabilities.In this paper, we build on continued fraction expressions for the Laplace transformsof the transition probabilities of a general BDP using techniques similar to those in-troduced by Murphy and O’Donohoe, and we ﬁll in the missing details in the proof ofthis representation. We then apply the Laplace inversion formulae of Abate and Whitt(1992a,b) to obtain an eﬃcient and robust method for computation of transition prob-abilities in general BDPs. Our method relies on three observations: 1) it is possibleto ﬁnd exact expressions for Laplace transforms of the transition probabilities of ageneral BDP using continued fractions (Murphy and O’Donohoe 1975); 2) evaluationof continued fractions is typically very fast, requires far fewer evaluations than equiv-alent power series, and there exist robust algorithms for evaluating them eﬃciently(Bankier and Leighton 1942; Wall 1948; Blanch 1964; Lorentzen and Waadeland 1992;Craviotto et al 1993; Abate and Whitt 1999; Cuyt et al 2008); and 3) recovery ofprobability distributions by Laplace inversion using a Riemann sum approximation isoften more computationally stable than analytical methods of inversion (Abate andWhitt 1992a,b, 1995). Finally, we demonstrate the advantages of our error-controlledmethod through its application to several birth-death models in ecology, genetics, andevolution whose solution remains unavailable by other means. X = { X ( t ) , t ≥ } counting the number of arbitrarily deﬁned “particles” in existence at time t ≥

0, with X (0) = m ≥

0. To characterize the process, we deﬁne non-negative instantaneousbirth rates λ n and death rates µ n for n ≥

0, with µ = 0 and transition probabilities P m,n ( t ) = Pr( X ( t ) = n | X (0) = m ). While λ n and µ n are time-homogeneous con-stants, they may depend on n . We refer to the classical linear BDP in which λ n = nλ and µ n = nµ as the “simple birth-death process” (Kendall 1948; Feller 1971). Thegeneral BDP transition probabilities satisfy the inﬁnite system of ordinary diﬀerentialequationsd P m, ( t )d t = µ P m, ( t ) − λ P m, ( t ), andd P m,n ( t )d t = λ n − P m,n − ( t ) + µ n +1 P m,n +1 ( t ) − ( λ n + µ n ) P m,n ( t ) for n ≥

1, (1)with boundary conditions P m,m (0) = 1 and P m,n (0) = 0 for n (cid:54) = m (Feller 1971).Karlin and McGregor (1957b) show that for arbitrary starting state m , transitionprobabilities can be represented in the form P m,n ( t ) = π n (cid:90) ∞ e − xt Q m ( x ) Q n ( x ) ψ (d x ) , (2)where π = 1 and π n = ( λ · · · λ n − ) / ( µ · · · µ n ) for n ≥

1. Here, { Q n ( x ) } is a sequenceof polynomials satisfying the three-term recurrence relation λ Q ( x ) = λ + µ − x , and λ n Q n +1 ( x ) = ( λ n + µ n − x ) Q n ( x ) − µ n Q n − ( x ) , (3)and ψ is the spectral measure of X with respect to which the polynomials { Q n ( x ) } areorthogonal. The system (1) has a unique solution if and only if ∞ (cid:88) k =0 (cid:18) π k + 1 λ k π k (cid:19) = ∞ . (4)In what follows, we assume that the rate parameters { λ n } and { µ n } satisfy (4). Closed-form solutions to (1) are available for a surprisingly small number of choices of { λ n } and { µ n } . We therefore need another approach to ﬁnd useful formulae for computationof the transition probabilities. P m,n ( t ) for an arbitrary general BDP,a fruitful approach is often to Laplace transform each equation of the system (1) andform a recurrence relationship relating back to the Laplace transform of P m,n ( t ). Webase our presentation on that of Murphy and O’Donohoe (1975). Denote the Laplacetransform of P n,m ( t ) as f m,n ( s ) = L [ P m,n ( t )] ( s ) = (cid:90) ∞ e − st P m,n ( t ) d t. (5)Applying the Laplace transform to (1), with the starting state m = 0, we arrive at sf , ( s ) − P , (0) = µ f , ( s ) − λ f , ( s ), and sf ,n ( s ) − P ,n (0) = λ n − f ,n − ( s ) + µ n +1 f ,n +1 ( s ) − ( λ n + µ n ) f ,n ( s ) (6)for n ≥

1. Rearranging and recalling that P , (0) = 1 and P ,n (0) = 0 for n ≥

1, wesimplify (6) to f , ( s ) = 1 µ (cid:2) ( s + λ ) f , ( s ) − (cid:3) , and f ,n ( s ) = 1 µ n (cid:20) ( s + λ n − + µ n − ) f ,n − ( s ) − λ n − f ,n − ( s ) (cid:21) for n ≥ . (7)Some rearranging of (7) yields the forward system of recurrence relations f , ( s ) = 1 s + λ − µ (cid:16) f , ( s ) f , ( s ) (cid:17) , and f ,n ( s ) f ,n − ( s ) = λ n − s + µ n + λ n − µ n +1 (cid:16) f ,n +1 ( s ) f ,n ( s ) (cid:17) . (8)Then combining these expressions, we arrive at the generalized continued fraction f , ( s ) = 1 s + λ − λ µ s + λ + µ − λ µ s + λ + µ − · · · . (9) This is an exact expression for the Laplace transform of the transition probability P , ( t ). Let the partial numerators in (9) be a = 1 and a n = − λ n − µ n − , and thepartial denominators b = s + λ and b n = s + λ n − + µ n − for n ≥

2. Then (9)becomes f , ( s ) = a b + a b + a b + · · · . (10)To express (10) in more typographically economical notation, we write f , ( s ) = a b + a b + a b + · · · . (11)We denote the k th convergent (approximant) of f , ( s ) as f ( k )0 , ( s ) = a b + a b + · · · a k b k = A k ( s ) B k ( s ) . (12)There are deep connections between the orthogonal polynomial representation (3),Laplace transforms (7), and continued fractions of the form (9) that are beyond thescope of this paper (Karlin and McGregor 1957b; Bordes and Roehner 1983; Guilleminand Pinchon 1999). Interestingly, Flajolet and Guillemin (2000) demonstrate a closerelationship between the Laplace transforms of transition probabilities and state pathsof the underlying Markov chain.Before stating a theorem supporting this representation, we give two lemmas thatwill be useful in what follows. Lemma 1

Both the numerator A k and denominator B k of (12) satisfy the same re-currence, due to Wallis (1695): A k = b k A k − + a k A k − , and B k = b k B k − + a k B k − , (13) with A = 0 , A = a , B = 1 , and B = b . Lemma 2

By repeated application of Lemma 1, we arrive at the determinant formula A k B k − − A k − B k = ( b k A k − + a k A k − ) B k − − A k − ( b k B k − + a k B k − )= − a k ( A k − B k − − A k − B k − )= ( − k − k (cid:89) i =1 a i . (14)Now we state and prove a theorem giving expressions for the Laplace transform of P m,n ( t ). Although Murphy and O’Donohoe (1975) ﬁrst report this result, they do notprovide a detailed derivation in their paper. Theorem 1

The Laplace transform of the transition probability P m,n ( t ) is given by f m,n ( s ) =    m (cid:89) j = n +1 µ j  B n ( s ) B m +1 ( s )+ B m ( s ) a m +2 b m +2 + a m +3 b m +3 + · · · for n ≤ m,  n − (cid:89) j = m λ j  B m ( s ) B n +1 ( s )+ B n ( s ) a n +2 b n +2 + a n +3 b n +3 + · · · for m ≤ n , (15) where a n , b n , and B n are as deﬁned above.Proof To simplify notation, we sometimes omit the dependence of f k , A k , and B k onthe Laplace variable s . Suppose the process starts at X (0) = m . We can re-write theLaplace-transformed equations (6) with P m,m (0) = 1 and P m,n (0) = 0 for all n (cid:54) = m as sf m, ( s ) − δ m = µ f m, ( s ) − λ f m, ( s ) , (16a) sf m,n ( s ) − δ mn = λ n − f m,n − ( s ) + µ n +1 f m,n +1 ( s ) − ( λ n + µ n ) f m,n ( s ) , (16b)where δ mn = 1 if m = n and zero otherwise. We ﬁrst derive the expression for n ≤ m .If m = 0, f , ( s ) is given by (11), so we assume in what follows when n ≤ m , that m ≥

1. Rearranging (16a), we see that since B = 1 and s + λ = b = B , f m, = B B µ f m, . (17) Now, to show the general case by induction, assume that for n ≤ m , f m,n − = B n − B n µ n f m,n . (18)Substituting (18) into (16b) when n < m , we have b n +1 f m,n = λ n − B n − B n µ n f m,n + µ n +1 f m,n +1 (19) (cid:18) b n +1 + a n +1 B n − B n (cid:19) f m,n = µ n +1 f m,n +1 (20) f m,n = B n B n +1 µ n +1 f m,n +1 (21)and so (18) is true for any n < m . Letting n = m , we have by (18) and (16b), b m +1 f m,m = 1 + λ m − (cid:18) B m − B m µ m f m,m (cid:19) + µ m +1 f m,m +1 . (22)Recalling that s + λ m + µ m = b m +1 and using Lemma 1, µ m +1 f m,m +1 = 1 − B m +1 B m f m,m . (23)Rearranging the previous equation, we ﬁnd that f m,m = 1 B m +1 B m + µ m +1 f m,m +1 f m,m . (24)Likewise, we can write (16b) as a continued fraction recurrence: f m,n f m,n − = λ n − s + µ n + λ n + µ n +1 f m,n +1 f m,n . (25)Then plugging (25) into (24) and iterating, we obtain the continued fraction for f m,m : f m,m = 1 B m +1 B m + a m +2 b m +2 + a m +3 b m +3 + · · · = B m B m +1 + B m a m +2 b m +2 + a m +3 b m +3 + · · · . (26) This is an exact formula for the Laplace transform of P m,m ( t ), and proves the case m = n . For n ≤ m , we iterate (18) to get f m,n = B n B n +1 µ n +1 f m,n +1 = B n B n +1 B n +1 B n +2 µ n +1 µ n +2 f m,n +2 = B n B n +1 B n +1 B n +2 · · · B m − B m µ n +1 µ n +2 · · · µ m f m,m =  m (cid:89) j = n +1 µ j  B n B m f m,m . (27)Substituting (26) for f m,m completes the proof for n ≤ m .To ﬁnd the formula for f m,n when n > m , we adopt a similar approach. From (24)we arrive at B m +1 f m,m = B m − B m µ m +1 f m,m +1 . (28)We proceed inductively. Assume that for n > m , B n +1 f m,n =  n − (cid:89) j = m λ j  B m + µ n +1 B n f m,n +1 . (29)From (16b), we have b n +2 f m,n +1 = λ n f m,n + µ n +2 f m,n +2 . (30)Solving for f m,n in (29) and plugging this into the above equation, we have b n +2 f m,n +1 = λ n  n − (cid:89) j = m λ j  B m B n +1 + λ n µ n +1 B n B n +1 f m,n +1 + µ n +2 f m,n +2 . (31)Recalling that − λ n µ n +1 = a n +2 ,( b n +2 B n +1 + a n +2 B m ) f m,n +1 =  n (cid:89) j = m λ j  B n + µ n +2 B n +1 f m,n +2 , (32) and by Lemma 1, B n +2 f m,n +1 =  n (cid:89) j = m λ j  B m + µ m +2 B n +1 f m,m +2 . (33)This establishes the recurrence (29). Then for any n ≥ m , we can rearrange (29) toobtain f m,n =  n − (cid:89) j = m λ j  B m B n +1 − B n µ n +1 f m,n +1 f m,n . (34)This completes the proof. (cid:117)(cid:116) P m,n ( t ) as an unknown but computable function of the complex Laplace vari-able s . We base our presentation on that of Abate and Whitt (1992a). If (cid:15) is a positivereal number such that all singularities of f m,n ( s ) lie to the left of (cid:15) in the complexplane, the inverse Laplace transform of f m,n ( s ) is given by the Bromwich integral P m,n ( t ) = L − ( f m,n ( s )) = 12 πi (cid:90) (cid:15) + i ∞ (cid:15) − i ∞ e st f m,n ( s ) d s. (35) Letting s = (cid:15) + iu , P m,n ( t ) = 12 π (cid:90) ∞−∞ e ( (cid:15) + iu ) t f m,n ( (cid:15) + iu ) d u = e (cid:15)t π (cid:90) ∞−∞ (cid:2) cos( ut ) + i sin( ut ) (cid:3) f m,n ( (cid:15) + iu ) d u = e (cid:15)t π (cid:34) (cid:90) ∞−∞ (cid:104) Re (cid:0) f m,n ( (cid:15) + iu ) (cid:1) cos( ut ) − Im (cid:0) f m,n ( (cid:15) + iu ) (cid:1) sin( ut ) (cid:105) d u + i (cid:90) ∞−∞ (cid:104) Im (cid:0) f m,n ( (cid:15) + iu ) (cid:1) cos( ut ) + Re (cid:0) f m,n ( (cid:15) + iu ) (cid:1) sin( ut ) (cid:105) d u (cid:35) , (36)but P m,n ( t ) is real-valued, so the imaginary part of the last equality in (36) is zero.Then P m,n ( t ) = e (cid:15)t π (cid:90) ∞−∞ (cid:104) Re (cid:0) f m,n ( (cid:15) + iu ) (cid:1) cos( ut ) − Im (cid:0) f m,n ( (cid:15) + iu ) (cid:1) sin( ut ) (cid:105) d u. (37)But since P m,n ( t ) = 0 for t <

0, we also have that (cid:90) ∞−∞ (cid:104) Re (cid:0) f m,n ( (cid:15) + iu ) (cid:1) cos( ut ) + Im (cid:0) f m,n ( (cid:15) + iu ) (cid:1) sin( ut ) (cid:105) d u = 0 . (38)Then applying (38) to (37), we obtain P m,n ( t ) = e (cid:15)t π (cid:90) ∞−∞ Re (cid:0) f m,n ( (cid:15) + iu ) (cid:1) cos( ut ) d u. (39)Finally, we note that sinceRe (cid:0) f ( (cid:15) − iu ) (cid:1) = (cid:90) ∞ e − (cid:15)t cos( ut ) P m,n ( t ) d t = Re (cid:0) f ( (cid:15) + iu ) (cid:1) , (40)it must be the case that Re (cid:0) f m,n ( (cid:15) + iu ) (cid:1) is even in u for every (cid:15) . Therefore, P m,n ( t ) = 2 e (cid:15)t π (cid:90) ∞ Re (cid:0) f m,n ( (cid:15) + iu ) (cid:1) cos( ut ) d u. (41) Following Abate and Whitt (1992a), we approximate the integral above by a dis-crete Riemann sum via the trapezoidal rule with step size h : P m,n ( t ) ≈ he (cid:15)t π Re ( f m,n ( (cid:15) )) + 2 he (cid:15)t π ∞ (cid:88) k =1 Re ( f m,n ( (cid:15) + ikh )) cos( kht )= e A/ t Re (cid:18) f m,n (cid:18) A t (cid:19)(cid:19) + e A/ t ∞ (cid:88) k =1 ( − k Re (cid:18) f m,n (cid:18) A + 2 kπi t (cid:19)(cid:19) , (42)where the second line is obtained by setting h = π/ (2 t ) and (cid:15) = A/ (2 t ); this change ofvariables eliminates the cosine term.2.4 Numerical considerationsWhile (42) presents a method for numerical solution of the transition probabilities P m,n ( t ) for a BDP with arbitrary birth and death rates, it is not yet an algorithmfor reliable evaluation of these probabilities. In order to develop a reliable numericalmethod, we must: 1) characterize the error introduced by discretization of the integralin (41); 2) determine a suitable method to evaluate this nearly alternating sum whilecontrolling the error; and 3) accurately and rapidly evaluate the inﬁnite continuedfraction in (15).Abate and Whitt show that the discretization error that arises in (42) is e d = ∞ (cid:88) k =1 e − kA P m,n (cid:0) (2 k + 1) t (cid:1) , (43)and when P m,n ( t ) ≤ e d ≤ ∞ (cid:88) k =1 e − kA = e − A − e − A ≈ e − A , (44)when e − A is small. Then to obtain e d ≤ − γ , we set A = γ log(10). As Abate andWhitt point out, the terms of the series (42) alternate in sign whenRe (cid:18) f m,n (cid:18) A + 2 kπi t (cid:19)(cid:19) (45) has constant sign. This suggests that a series acceleration method may be helpful inkeeping the terms of the sum manageable and avoiding roundoﬀ error due to summandsof alternating sign. We opt to use the Levin transform for this purpose (Levin 1973;Press 2007; Numerical Recipes Software 2007).Evaluation of rational approximations to continued fractions by repeated applica-tion of Lemma 1 is appealing, but suﬀers from roundoﬀ error when denominators aresmall (Press 2007). To evaluate the inﬁnite continued fraction in the summand of (42),we use the modiﬁed Lentz method (Lentz 1976; Thompson and Barnett 1986; Press2007). To demonstrate, suppose we wish to approximate the value of f , ( s ), given by(9) by truncating at depth k . Then f ( k )0 , ( s ) = A k ( s ) B k ( s ) (46)is the k th rational approximant to the inﬁnite continued fraction f , ( s ). In the modi-ﬁed Lentz method, we stabilize the computation by ﬁnding the ratios C k = A k A k − and D k = B k − B k (47)so that f ( k )0 , can be found iteratively by f ( k )0 , = f ( k − , C k D k . (48)Using Lemma 1, we can iteratively compute C k and D k via the updates C k = b k + a k C k − and D k = 1 b k + a k D k − . (49)In practice, we must evaluate the continued fraction to only a ﬁnite depth, butwe must evaluate to a depth suﬃcient to control the error. Suppose we wish to eval-uate the inﬁnite continued fraction f , ( s ) given by (9) at some complex number s .Intuitively, we wish to terminate the Lentz algorithm when the diﬀerence between suc-cessive convergents is small. However, it is not immediately clear how the diﬀerence between convergents f ( k )0 , ( s ) − f ( k − , ( s ) is related to the absolute error f , ( s ) − f ( k )0 , .Craviotto et al (1993) make this relationship clear by furnishing an a posteriori trun-cation error bound for Jacobi fractions of the same form as (9) in this paper. Assumingthat f ( k )0 , ( s ) = A k ( s ) /B k ( s ) converges to f , ( s ) as k → ∞ , Craviotto et al (1993) givethe bound (cid:12)(cid:12)(cid:12) f , ( s ) − f ( k )0 , ( s ) (cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12) B k ( s ) B k − ( s ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Im (cid:16) B k ( s ) B k − ( s ) (cid:17)(cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) f ( k )0 , ( s ) − f ( k − , ( s ) (cid:12)(cid:12)(cid:12) , (50)that is valid when Im( s ) is nonzero. Note that B k ( s ) /B k − ( s ) = 1 /D k ( s ), so (50)is easy to evaluate during iteration under the Lentz algorithm. Therefore, we stop atdepth k in the Lentz algorithm when | /D k ( s ) || Im (1 /D k ( s )) | (cid:12)(cid:12)(cid:12) f ( k )0 , ( s ) − f ( k − , ( s ) (cid:12)(cid:12)(cid:12) (51)is small.2.5 Numerical resultsAlthough our error-controlled method is designed to be used when an analytic solutioncannot be found, we seek to validate our numerical results by comparison to availableanalytic and numerical solutions. For the simple BDP with λ n = nλ and µ n = nµ , ournumerical results agree with the values from the well-known closed-form solution givenexplicitly in Bailey (1964) as P m,n ( t ) = min( m,n ) (cid:88) j =0 (cid:32) mj (cid:33)(cid:32) m + n − j − m − (cid:33) α m − j β n − j (1 − α − β ) j P m, ( t ) = α m (52)where α = µ (cid:16) e ( λ − µ ) t − (cid:17) λe ( λ − µ ) t − µ and β = λ (cid:16) e ( λ − µ ) t − (cid:17) λe ( λ − µ ) t − µ . (53) ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● . . . . . . . . n P r obab ili t y Fig. 1

Comparison of transition probabilities P ,n ( t = 1) computed by our error-controlledmethod and that of Murphy and O’Donohoe (1975) for the immigration-death model with λ n = 0 . µ n = 0 . n . The open circles are the values given by our method. The solid linecorresponds with the approximant method of Murphy and O’Donohoe with k = 2 (solid line), k = 3 (dashed line), and k = 4 (dotted line). In our experience, the approximant method failswhenever n + m + k is greater than approximately 20. It is interesting to note that increasingthe depth of truncation k in the approximant method actually worsens the approximation. Murphy and O’Donohoe (1975) give numerical probabilities for four general birth-death models: a) immigration-death with λ n = 0 . µ n = 0 . n ; b) immigration-emigration with λ n = 0 . µ = 0, and µ n = 0 .

1; c) queue with λ n = 0 . µ = 0, µ = µ = 0 . µ = µ = 0 .

4, and µ n = 0 . n ≥

5; and d) λ n = 0 . µ n = 0 . √ n .Our results agree with those computed by Murphy and O’Donohoe for each of the fourmodels given in Tables 2 through 7 in their paper (Murphy and O’Donohoe 1975). Wenote that Murphy and O’Donohoe did not report probabilities for m > n > n + m + k is greater than approximately 20.As a demonstration of the instability of the approximant method, we contrast thenumerical results given by our error-controlled method with those obtained using theapproximant method, that we implemented as described in Murphy and O’Donohoe(1975), except for some rescaling of intermediate quantities to avoid obvious sourcesof roundoﬀ error. Figure 1 shows this comparison, using model (a) above, for three values of the truncation index k . Note that increasing the truncation depth k in theapproximant method does not improve the error. Drawing on the robustness and generality of our error-controlled method, we concludewith four models in ecology, genetics, and evolution whose analytic solutions remainelusive and where past numerical approaches have fallen short. Using our approach,computation of transition probabilities is straightforward, and the techniques outlinedabove may be used without modiﬁcation. Some of the examples are well-known models,and others are novel. In some cases, the orthogonal polynomials satisfying (3) areknown, and hence a solution could be numerically computed using (2), provided thereare good ways of evaluating the polynomials. Often, a severe drawback of using knownorthogonal polynomials to compute a solution based on (2) is that the polynomialsare model-speciﬁc. This makes experimentation and model selection diﬃcult, sincecomputation of transition probabilities depends on a priori analytic information aboutthe polynomials and measure associated with the BDP. Our method does not rely ona priori information about the process, other than the birth and death rates for eachstate.3.1 Immigration and emigrationConsider a population model for the number of organisms in an area, and suppose newimmigrants arrive at rate ν , and emigrants leave at rate γ . Organisms living in the areareproduce with per-capita birth rate λ and die with rate µ . Deﬁne the linear rates λ n = nλ + ν and µ n = nµ + γ. (54)For the case γ = 0, an analytic expression for the orthogonal polynomials is known(Karlin and McGregor 1958a). For nonzero γ , orthogonal polynomials are availablefrom which a solution of the form (2) may be computed (Karlin and McGregor 1958a; . . . . . . . n P r obab ili t y . . . . . Time P r obab ili t y Fig. 2

Transition probabilities for the immigration/emigration model with λ = 0 . ν = 0 . µ = 0 .

3, and γ = 0 .

1. The top panel shows P ,n ( t ) with t = 1 (solid line), t = 2 (dashedline), t = 3 (dotted line), and t = 4 (dash-dotted line) for n = 0 , . . . ,

50. The bottom panelshows P ,n ( t ) with n = 15 (solid line), n = 20 (dashed line), n = 25 (dotted line), and n = 30(dash-dotted line) for t ∈ (0 , Ismail et al 1988). However, using our error-controlled method, we can easily ﬁnd thetransition probabilities without additional analytic information. Figure 2 shows anexample of the time-evolution of P ,n ( t ) for various times t and states n , with theparameters λ = 0 . ν = 0 . µ = 0 .

3, and γ = 0 .

1. The approximant method methodof Murphy and O’Donohoe fails to produce useful probabilities for n >

10 (not shown). n R a t e Allee Effect Logistic growth Logistic decay

Time S t a t e Fig. 3

Behavior of logistic/Allee model. The upper panel shows a plot of birth (solid line)and death (dashed line) rates for states n = 0 , . . . ,

60, and parameters λ = 1, µ = 0 . M = 20, α = 0 .

2, and β = 0 .

3. The diﬀerent phases of growth are labeled in the shaded regions. Thelower panel shows stochastic realizations of the logistic/Allee model for various starting values.The shaded regions correspond with the shaded phases of growth in the upper panel. . . . . . . Time P r obab ili t y Fig. 4

Logistic/Allee model probabilities of extinction P m, ( t ) for initial population sizes m = 1 (solid line), m = 5 (dashed line), m = 10 (dotted line), and m = 15 (dash-dotted line).The full model parametrization is found in the text. by ecologists. Another density-dependent constraint is known as the Allee eﬀect, inwhich per-capita birth rate increases superlinearly with n once a small population hasbeen established, due to favorable consequences of density, such as cooperation andmutual protection from predators (Allee et al 1949). As a realistic example of a gen-eral BDP that has no obvious solution by orthogonal polynomials, we seek a modelthat both transiently supports growth above the carrying capacity, and reﬂects thesetwo density-dependent constraints, similar in spirit to models described by Tan andPiantadosi (1991) and Dennis (2002).Qualitatively, if the per-capita birth rate with no density eﬀects is λ , then the totalbirth rate should rise faster than nλ when n is small, slower than nλ for intermediate n near the carrying capacity, and should decay toward zero for n greater than the carryingcapacity. Tan and Piantadosi introduce a logistic birth rate λ n = nλ (cid:0) − nN (cid:1) for a ﬁnitestate space model that takes values { , , . . . , N } . However, to allow for temporarygrowth beyond the carrying capacity, we choose λ n ∝ λn e − αn for intermediate andlarge n . To achieve attenuated growth for small n as well, we scale this rate by a logisticfunction, yielding λ n = λn e − αn e β ( n − M ) and µ n = nµ, (55) where M is the population size with highest birth rate, and the death rate is assumed tobe proportional only to the number of existing individuals. Figure 3 shows the resultingrates for various states n , with the diﬀerent phases of population change shaded. Toillustrate that the model produces the desired behavior, several realizations of theprocess are given in the lower panel for various starting values. The shaded regionscorrespond with the three phases of growth. Note that most paths in the lower panel ofFigure 3 center near n = 27, where the birth rate and death rate are equal. The lowerpanel corresponds with Figure 1 in Dennis (2002). Figure 4 demonstrates the successof the error-controlled method in computing time-dependent extinction probabilities P m, for various starting values with λ = 1, α = 0 . β = 0 . M = 20, and µ = 0 . P m, ( t → ∞ ), or the probability of ﬁxation of a novel mutationin a population of constant size N , P ,N ( t → ∞ ). While these asymptotic probabil-ities do reveal important properties of the underlying models, the information theyprovide about the distribution of time to ﬁxation/extinction is incomplete. In practice,researchers may observe that m organisms in a sample exhibit a certain trait at acertain time. Then P m, ( t ), the probability of extinction of that trait at ﬁnite times t in the future should presumably be of great interest, since researchers cannot reliablyobserve the process for inﬁnitely long times. Additionally, the ﬁnite-time probabilityof ﬁxation/extinction may exhibit threshold eﬀects or unexpected dynamics that arenot revealed by the asymptotic probability of such an event.Moran (1958) introduces a model for the time-evolution of a biallelic locus when thepopulation size is constant through time. A biallelic locus is a location in an organism’sgenome in which two diﬀerent genetic variants or alleles exist in a population. We areinterested in how the number of individuals carrying each allele changes from generationto generation. Krone and Neuhauser (1997) exploit the Moran model to derive a BDP counting the number of individuals with a certain allele in the context of ancestralgenealogy reconstruction in which one allele oﬀers a selective advantage to individualsthat carry it. Selection greatly complicates the problem and remains an active area ofresearch. In a limiting case, this process corresponds to Kingman’s coalescent processwhen there is no mutation or selection (Kingman 1982a,b).To construct the Moran process with mutation and selection, suppose a ﬁnite pop-ulation of N haploid organisms has 2 alleles at a certain locus: A and A . Individualsthat carry A reproduce at rate α and A individuals reproduce at rate β . Supposefurther that individuals carrying the A allele have a selective advantage over individ-uals carrying A , so α > β . When an individual dies, it is replaced by the oﬀspring of arandom parent chosen from all N individuals, including the one that dies. This parentcontributes a gamete carrying its allele that is also subject to mutation. Mutation from A to A happens with probability u and in reverse with rate v . The new oﬀspringreceives the possibly mutated haplotype and the process continues.Let X ( t ) be a BDP counting the number of A individuals on the state space n ∈ { , . . . , N } . To construct the transition rates of the process, suppose there arecurrently n individuals of type A . We ﬁrst consider the addition of a new individualof type A , so that n → n + 1. For this to happen, the individual that dies must beof type A . If the parent of the replacement is one of the n of type A , the parentcontributes its allele without mutation, and this happens with probability 1 − u . If theparent of the replacement is one of the N − n of type A , the parent contributes itsallele, which then mutates with probability v . Therefore, the total rate of addition is λ n = N − nN (cid:20) α nN (1 − u ) + β N − nN v (cid:21) , (56)for n = 0 , . . . , N with λ n = 0 when n > N . Likewise, the removal of an individual oftype A can happen when one of the n individuals of type A is chosen for replacement.If the parent of the replacement is one of the N − n of type A , the parent contributes A without mutation, with probability 1 − v . If the parent is one of the n of type A , the allele must mutate to A with probability u . The total rate of removals becomes µ n = nN (cid:20) β N − nN (1 − v ) + α nN u (cid:21) , (57)for n = 1 , . . . , N with µ = λ N = 0 and µ n = 0 when n > N . Note that if v >

0, then λ > A allele cannot go extinct. Also, if u >

0, then µ N >

0, so the A allelecannot be ﬁxed in the population.Karlin and McGregor (1962) derive the relevant polynomials and measure for theMoran process described above, but without selection, so that α = β . Donnelly (1984)gives expressions for the transition probabilities in the case where α = β = 1, notingthat when selection is introduced (via diﬀering α and β ), his approach is no longerfruitful. Using our technique, computation of the transition probabilities under selectionis straightforward. The upper panel of Figure 5 shows the probability of ﬁxation bytime t . The lower panel shows the ﬁnite-time ﬁxation probability of A , P m, ( t ), with u = 0 so the state n = 100 is absorbing.Since the state space in the Moran model is ﬁnite, it is natural to consider thematrix exponentiation method discussed in the Introduction. We write the stochastictransition matrix as Q =  − λ λ µ − ( λ + µ ) λ µ − ( λ + µ ) λ . . . . . . . . . µ N − ( λ N + µ N ) λ N  (58)where λ n and µ n are deﬁned by (56) and (57), respectively. In our experience, thematrix exponentiation method often works well, and its computational cost is similarto that of our error-controlled method. However, it is highly sensitive to rate matrixconditioning. For example, Figure 6 shows a comparison of transition probabilitiesfrom the error-controlled method and the matrix exponentiation method for the Moranmodel with N = 100, α = 210, β = 20, u = 0 . v = 0. In evolutionary terms, . . . . . n P r obab ili t y . . . . . . Time P r obab ili t y Fig. 5

Transition probabilities for the Moran model with selection. The upper panel showsthe probability of n individuals having allele A at time t , P ,n ( t ) for the Moran model with N = 100, starting from m = 50 with u = 0 . v = 0 . α = 60, and β = 10. We show theprobabilities for t = 1 (solid line), t = 3 (dashed line), t = 5 (dotted line), t = 8 (dash-dottedline). Note that although the states 0 and 100 are not absorbing, the mutation rates u and v are small enough that probability accumulates signiﬁcantly in these end states. Note alsothe asymmetry in the distribution at longer times. The lower panel reports the probabilityof ﬁxation by time t , P m, ( t ), for the same model, but with u = 0 so the state n = 100 isabsorbing. The probabilities shown are for m = 70 (solid line), m = 50 (dashed line), m = 20(dotted line), and m = 1 (dash-dotted line). Note the starkly diﬀerent time-dynamics fordiﬀerent starting values. this means that mutation from A to A is impossible, and the A haplotype suﬀersfrom low ﬁtness. Computationally, this has the eﬀect of making µ n small for most n ,and hence the rate matrix grows ill-conditioned. ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● . . . . . . . n P r obab ili t y Fig. 6

Comparison of Moran model transition probabilities P ,n ( t = 0 .

2) computed bytwo methods with N = 100, α = 210, β = 20, u = 0 . v = 0. The open circlescorrespond with our error-controlled method, and the solid line corresponds with the matrixexponentiation method. This choice of parameters causes wild ﬂuctuations in probabilitiesreported by the matrix exponentiation method since the stochastic rate matrix becomes nearlysingular. Although the rate matrix in this example is nearly defective, this choice of parame-ter values is not unreasonably extreme. For example, researchers in population geneticsoften wish to test the hypothesis that selection occurs in a dataset. They ﬁt parame-ters for models with selection (full model) and without selection (restricted model) andperform a likelihood ratio test of this hypothesis. If the estimates of β and u in the fullmodel are small, they may be unable to reliably compute the probability (likelihood)of the data, given the estimated parameter values under the full model.3.4 A frameshift-aware indel modelThorne et al (1991) introduce a BDP modeling insertion and deletion of nucleotidesin DNA for applications in molecular evolution. The authors model the process ofsequence length evolution by assuming that a new nucleotide can be inserted adjacentto every existing nucleotide, and every existing nucleotide is subject to deletion, at aconstant per-nucleotide rate. This corresponds to the simple BDP with λ n = nλ and µ n = nµ . If a sequence has m nucleotides at time 0 and there are n nucleotides at time t later, the probability of this event is P m,n ( t ).However, an important aspect of biological sequence evolution is conservation ofthe structure and biophysical properties of proteins that result from transcription andtranslation of DNA sequences. After coding DNA is transcribed into RNA, ribosomestranslate 3-nucleotide chunks (codons) of the RNA into a single amino acid residue,that is then joined to the end of a growing protein polymer. Insertions or deletions(indels) in a DNA sequence that result in a shift in this triplet code are called “frame-shift” mutations. It is likely that a frame-shift indel occurring in a protein-coding DNAsequence results in a protein that is prematurely terminated or possesses structuraland chemical characteristics unlike the ancestral protein. Insertions or deletions whoselength is a multiple of three should be more common. We seek to model this behaviorin a novel way: suppose the indel process is a BDP similar in spirit to the one presentedby Thorne et al (1991), and the rate of insertion and deletion of nucleotides dependson the number of nucleotides already inserted, modulo (mod) 3: λ n =  nβ if n − nβ if n − nβ if n − µ n =  nγ if n − nγ if n − nγ if n − . (59)Here we assume that β > β , β , and γ > γ , γ so that transitions to state n such that n − n . However, using our error-controlled method,numerical results are readily available. Figure 7 shows P ,n ( t ) for n = 0 , . . . ,

50 atvarious times t . Note that the distribution of the number of inserted bases has peaksat the integers mod three. Finally, it is worth noting that the dearth of tractable BDPsfor indel events has been a major deterrent in statistical sequence alignment and weare actively exploring solutions to this problem using our error-controlled method. . . . . . . . n P r obab ili t y Fig. 7

Frameshift-aware indel model probability of observing n inserted DNA bases, givenstarting at m = 1. The transition probability P ,n ( t ) is shown for t = 5 (solid line), t = 7(dashed line), t = 9 (dotted line), and t = 11 (dash-dotted line), with parameters β = 0 . β = 1, β = 4, γ = 2, γ = 0 .

2, and γ = 0 . Traditionally the simple BDP with linear rates has dominated modeling applications,since its transition probabilities and other quantities of interest ﬁnd analytic expres-sions. However, increasingly sophisticated models in ecology, genetics, and evolution,among other ﬁelds, may necessitate more advanced computational methods to handleprocesses whose birth and death rates do not easily yield analytic solutions. We havedemonstrated a ﬂexible method for ﬁnding transition probabilities of general BDPsthat works for arbitrary sets of birth and death rates { λ n } and { µ n } , and does notrequire additional analytic information. This should prove useful for rapid develop-ment and testing of new models in applications. For simple models whose solution isavailable, we ﬁnd that our method agrees with known solutions and remains robustfor large starting and ending states and long times t . It is our hope that the methodpresented here will assist researchers in understanding the properties of increasinglyrich and realistic models. Acknowledgements

We are grateful to Ken Lange for helpful comments. This work wassupported by National Institutes of Health grants GM086887 and T32GM008185, and National9Science Foundation grant DMS0856099. A software implementation of all methods in this paperis available from FWC. k , we have f ( k ) m,n ( s ) =  m (cid:89) j = n +1 µ j  B n B m +1 + B m a m +2 b m +2 + a m +3 b m +3 + · · · a m + k b m + k for n ≤ m , and f ( k ) m,n ( s ) =  n − (cid:89) j = m λ j  B m B n +1 + B n a n +2 b n +2 + a n +3 b n +3 + · · · a n + k b n + k for n ≥ m . (60)For concreteness, suppose in what follows that n ≥ m . Note that the denominatorof the second equation is simply B n + k . Let A ( n ) k be the numerator of the continuedfraction in the second equation in (60), so f ( k ) m,n =  n − (cid:89) j = m λ j  A ( n ) k B n + k , (61)where A ( n ) k satisﬁes A (0) k = A k , A ( n )1 = (cid:81) n +1 j =1 a j , and A ( n ) k = a n + k A ( n ) k − + b n + k A ( n ) k − . (62) Note also that the diﬀerence between truncated estimates in the Laplace domain ( s ) is A n + k B n + k − A n B n = A n + k B n − A n B n + k B n + k B n = ( − n A ( n ) k B n + k B n . (63)This yields the generalized determinant formula A n + k B n − A n B n + k = ( − n A ( n ) k , (64)and at a root s i of B n + k ( s ), we have A ( n ) k ( s i ) = ( − n A n + k ( s i ) B n ( s i ) . (65)Now if s , s , . . . , s n are the roots of B n ( s ), we have, using the previous line and apartial fractions decomposition of (60), the formula for the Laplace transform of thetransition probability P m,n ( t ), truncated at k , f ( k ) m,n ( s ) =  n − (cid:89) j = m λ j  B m ( s ) A ( n ) k ( s ) B n + k ( s )=  n − (cid:89) j = m λ j  B m ( s ) A ( n ) k ( s ) (cid:81) n + ki =1 ( s − s i )=  n − (cid:89) j = m λ j  n + k (cid:88) i =1 B m ( s ) B n ( s i ) A n + k ( s i ) (cid:81) j (cid:54) = i ( s j − s i ) (cid:18) s − s i (cid:19) , (66)since we only require the values of A n + k ( s ) and B n ( s ) at the zeros of B n + k ( s ). Theninverse transforming, an approximate formula for the transition probability P m,n ( t ) is P ( k ) m,n ( t ) ≈  n − (cid:89) j = m λ j  n + k (cid:88) i =1 B m ( s i ) B n ( s i ) A n + k ( s i ) (cid:81) j (cid:54) = i ( s j − s i ) e − s i t . (67) The roots of B n ( s ), used in (66) and (67), are often found numerically as follows.Consider the characteristic polynomial det (cid:0) ˜ B n + sI (cid:1) of the matrix˜ B n =  λ λ µ λ + µ λ µ λ + µ

1. . . . . . . . . λ n − µ n − λ n − + µ n − λ n − µ n − λ n − + µ n −  . (68)It is clear that the n th partial denominator B n ( s ) = det( ˜ B n + sI ), and this quantityis zero when − s is an eigenvalue of the matrix ˜ B n . Therefore, the negatives of theeigenvalues of ˜ B n are the roots of B n ( s ). Furthermore, ˜ B n can be transformed intoa real symmetric matrix via a similarity transform and hence B n ( s ) has precisely n roots, all of which are simple, real, and negative. One usually ﬁnds these eigenvalues viathe QR algorithm or similar numerical techniques (Press 2007). However, the iterativeeigendecomposition of (68) generates small errors in the eigenvalues for large n + k .These errors are ampliﬁed in the product in the denominator of each summand in (67),resulting in a sum with both positive and negative terms that may be very large. Klaret al (2010) encounter similar instability in this algorithm. Their solution is to ﬁndthe roots of the terms in the numerator and compute each summand as a productof individual numerators and denominators in an attempt to keep roundoﬀ error inthe product from accumulating. So, if z , . . . , z n + k are the roots of A n + k then (67)becomes P ( k ) m,n ( t ) ≈  n − (cid:89) j = m λ j  n + k (cid:88) i =1 B m ( s i ) B n ( s i )( z i − s i ) (cid:89) j (cid:54) = i (cid:18) z j − s i s j − s i (cid:19) e − s i t . (69)This procedure does improve the numerical stability of the computation, but requirestwo eigendecompositions of possibly large matrices for every evaluation of P m,n ( t ),increasing the computational cost and, for large m and n , the roundoﬀ error. In our opinion, it is more advantageous to avoid truncation of the continued fraction (9) ata pre-speciﬁed index, and instead evaluate the continued fraction until convergenceduring numerical inversion. Figure 1 shows how approximant methods fail for large n .5.2 A power series methodParthasarathy and Sudhesh (2006a) present exact solutions by transforming continuedfractions such as (9) into an equivalent power series. Wall (1948) shows that Jacobifractions of this type can always be represented by an equivalent power series. However,the small radius of convergence of power series expressions for transition probabilitiescan limit their usefulness for long times or large birth or death rates. Parthasarathyand Sudhesh show that P ,n ( t ) has a power series representation given by P m,n ( t ) = (cid:32) n − (cid:89) k =0 a k (cid:33) ∞ (cid:88) m =0 ( − m A ( m, n ) t m + n ( m + n )! , (70)where A ( m, n ) = n (cid:88) i =0 a i i +1 (cid:88) i =0 a i i +1 (cid:88) i =0 a i · · · i m − +1 (cid:88) i m =0 a i m , (71)with A (0 , n ) = 1 (Parthasarathy and Sudhesh 2006a,b). Here, a n = λ n and a n +1 = µ n in the notation used in their papers. This approach is unique because it yields anexact analytic expression for the transition probabilities of a general BDP. However, theradius of convergence of the power series depends on the speciﬁed rates, and this radiusmay be quite small. To illustrate the pitfalls of this approach, consider a n = ( n + 1) λ ,corresponding to the BDP with λ n = (2 n + 1) λ and µ n = 2 nλ (Parthasarathy andSudhesh 2006a, Example 4.6). The power series for the transition probability in thisprocess becomes P ,n ( t ) = ∞ (cid:88) m =0 ( − m (2 n + 2 m )! m ! n ! ( λt/ n + m ( n + m )! . (72) Then the radius of convergence R of the power series is given by1 /R = lim m →∞ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (2 n + 2 m + 2)! (cid:0) λ (cid:1) n + m +1 ( m + 1)! n !( n + m + 1)! × m ! n !( n + m )!(2 n + 2 m )! (cid:0) λ (cid:1) n + m (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = lim m →∞ (2 m + 2 n + 1)(2 n + 2 m + 2)( m + 1)( n + m + 1) (cid:18) λ (cid:19) = lim m →∞ m + 2 n + 1 m + 1 λ = 2 λ. (73)And so the series diverges when 2 λt >

1. To illustrate the limitations of the powerseries approach, note that in this process, the transition intensity from 0 to 1 is λ , sothe expected ﬁrst-passage time from 0 to 1 is E ( T , ) = 1 /λ . Therefore, we cannotevaluate (72) when t is greater than E ( T , ) /

2. If n is much greater than 1, we may beunable to reliably evaluate P ,n ( t ) for times near E ( T ,n ). References

Abate J, Whitt W (1992a) The Fourier-series method for inverting transforms of probabilitydistributions. Queueing Syst 10:5–87Abate J, Whitt W (1992b) Numerical inversion of probability generating functions. Oper ResLett 12:245–251Abate J, Whitt W (1995) Numerical inversion of Laplace transforms of probability distribu-tions. ORS J Comput 7(1):36–43Abate J, Whitt W (1999) Computing Laplace transforms for numerical inversion via continuedfractions. INFORMS J Comput 11(4):394–405Allee WC, Emerson AE, Park O (1949) Principles of Animal Ecology. Saunders PhiladelphiaBailey NTJ (1964) The Elements of Stochastic Processes with Applications to the NaturalSciences. Wiley New YorkBankier JD, Leighton W (1942) Numerical continued fractions. Am J Math 64(1):653–668Blanch G (1964) Numerical evaluation of continued fractions. SIAM Rev 6(4):383–421Bordes G, Roehner B (1983) Application of stieltjes theory for s-fractions to birth and deathprocesses. Adv Appl Probab 15(3):507–530Craviotto C, Jones WB, Thron WJ (1993) A survey of truncation error analysis for Pad´e andcontinued fraction approximants. Acta Appl Math 33:211–2724Cuyt A, Petersen V, Verdonk B, Waadeland H, Jones W (2008) Handbook of ContinuedFractions for Special Functions. Springer Berlin / HeidelbergDennis B (2002) Allee eﬀects in stochastic populations. Oikos 96(3):389–401Donnelly P (1984) The transient behaviour of the Moran model in population genetics. MathProc Cambridge 95(02):349–358Feller W (1971) An Introduction to Probability Theory and its Applications. Wiley series inprobability and mathematical statistics, Wiley New YorkFlajolet P, Guillemin F (2000) The formal theory of birth-and-death processes, lattice pathcombinatorics and continued fractions. Adv Appl Probab 32(3):750–778Grassmann W (1977a) Transient solutions in Markovian queues : An algorithm for ﬁndingthem and determining their waiting-time distributions. Eur J Oper Res 1(6):396–402Grassmann WK (1977b) Transient solutions in Markovian queueing systems. Comput OperRes 4(1):47–53Guillemin F, Pinchon D (1999) Excursions of birth and death processes, orthogonal polyno-mials, and continued fractions. J Appl Probab 36(3):752–770Ismail MEH, Letessier J, Valent G (1988) Linear birth and death models and associated La-guerre and Meixner polynomials. J Approx Theory 55(3):337–348Karlin S, McGregor J (1957a) The classiﬁcation of birth and death processes. Trans Am MathSoc 86(2):366–400Karlin S, McGregor J (1957b) The diﬀerential equations of birth-and-death processes, and theStieltjes moment problem. Trans Am Math Soc 85(2):589–646Karlin S, McGregor J (1958a) Linear growth, birth and death processes. J Math Mech 7(4):643–662Karlin S, McGregor J (1958b) Many server queueing processes with Poisson input and expo-nential service times. Paciﬁc J Math 8(1):87–118Karlin S, McGregor J (1962) On a genetics model of Moran. Math Proc Cambridge 58(02):299–311Kendall DG (1948) On the generalized “birth-and-death” process. Ann Math Stat 19(1):1–15Kingman JFC (1982a) The coalescent. Stat Proc Appl 13(3):235–248Kingman JFC (1982b) On the genealogy of large populations. J Appl Probab 19:27–43Klar B, Parthasarathy PR, Henze N (2010) Zipf and Lerch limit of birth and death processes.Probab Eng Inform Sc 24(01):129–144Krone SM, Neuhauser C (1997) Ancestral processes with selection. Theor Popul Biol 51:210–237Lentz WJ (1976) Generating Bessel functions in Mie scattering calculations using continuedfractions. Appl Opt 15(3):668–6715Levin D (1973) Development of non-linear transformations for improving convergence of se-quences. Int J Comput Math 3(B):371–388Lorentzen L, Waadeland H (1992) Continued Fractions with Applications. North-Holland,AmsterdamMederer M (2003) Transient solutions of Markov processes and generalized continued fractions.IMA J Appl Math 68(1):99–118Mohanty S, Montazer-Haghighi A, Trueblood R (1993) On the transient behavior of a ﬁnitebirth-death process with an application. Comput Oper Res 20(3):239–248Moran PAP (1958) Random processes in genetics. Math Proc Cambridge 54(01):60–71Murphy JA, O’Donohoe MR (1975) Some properties of continued fractions with applicationsin Markov processes. IMA J Appl Math 16(1):57–71Novozhilov AS, Karev GP, Koonin EV (2006) Biological applications of the theory of birth-and-death processes. Brief Bioinform 7(1):70–85Numerical Recipes Software (2007) Derivation of the Levin transformation. Numerical recipeswebnote No 6 URL