Renewal theory in analysis of tries and strings
SVANTE JANSON
To my colleague and friend Allan Gut on the occasion of his retirement
Abstract. We give a survey of a number of simple applications of renewal theory to problems on random strings and tries: insertion depth, size, insertion mode and imbalance of tries; variations for b-tries and Patricia tries; Khodak and Tunstall codes.

1. Introduction
Although it has long been realized that renewal theory is a useful tool in the study of random strings and related structures, it has not always been used to its full potential. The purpose of the present paper is to give a survey presenting in a unified way some simple applications of renewal theory to a number of problems involving random strings, in particular several problems on tries, which are tree structures constructed from strings. (Other applications of renewal theory to problems on random trees are given in, e.g., [6] and [18].)

Since our purpose is to illustrate a method rather than to prove new results, we present a number of problems in a simple form without trying to be as general as possible. In particular, for simplicity we exclusively consider random strings in the alphabet {0, 1}, and assume that the "letters" (bits) ξ_i in the strings are i.i.d. Note, however, that the methods below are much more widely applicable and extend in a straightforward way to larger alphabets. The methods also, at least in principle, extend to, for example, Markov sources where ξ_i is a Markov chain. (See e.g. Szpankowski [32, Section 2.1] and Clément, Flajolet and Vallée [5] for various interesting probability models of random strings. Renewal theory for Markov chains is treated for example by Kesten [21] and Athreya, McDonald and Ney [2].) Indeed, one of the purposes of this paper is to make propaganda for the use of renewal theory to study e.g. Markov models, even if we do not do this in the present paper. (Some such results may appear elsewhere.)

The results below are (mostly) not new; they have earlier been proved by other methods, in particular Mellin transforms. (We try to give proper references for the theorems, but we do not attempt to cover the large literature on random tries and strings in any completeness.) Indeed, such methods often provide sharper results, with better error bounds or higher order terms, and these methods too certainly are important. Nevertheless, we believe that renewal theory often is a valuable method that yields the leading terms in a simple and intuitive way, and that it ought to be more widely used for this type of problems. Moreover, as said above, this method may be easier to extend to other situations. (Further, it gives one explanation for the oscillatory terms that often appear, as an instance of the arithmetic case in renewal theory. Note that oscillatory terms become much less common for larger alphabets, except when all letters are equiprobable, because it is more difficult to be arithmetic, see Appendix A.)

We treat a number of problems on random tries in Sections 3–5 and 8 (insertion depth, imbalance, size, insertion mode). We consider b-tries in Section 6 and Patricia tries in Section 7. Tunstall and Khodak codes are studied in Section 9. A random walk in a region bounded by two crossing lines is studied in Section 10. The standard results from renewal theory that we use are for convenience collected in Appendix A.

Date: December 11, 2009.

Notation.
We use →p and →d for convergence in probability and in distribution, respectively. If Z_n is a sequence of random variables and μ_n and σ_n are sequences of real numbers with σ_n > 0 (for large n, at least), then Z_n ∼ AsN(μ_n, σ_n²) means that (Z_n − μ_n)/σ_n →d N(0, 1). We denote the fractional part of a real number x by {x} := x − ⌊x⌋.

Acknowledgement.
I thank Allan Gut and Wojciech Szpankowski for inspiration and helpful discussions.

2. Preliminaries
Suppose that Ξ^(1), Ξ^(2), ... is an i.i.d. sequence of random infinite strings Ξ^(n) = ξ^(n)_1 ξ^(n)_2 ···, with letters ξ^(n)_i in an alphabet A. (When the superscript n does not matter we drop it; we thus write Ξ = ξ_1 ξ_2 ··· for a generic string in the sequence.) For simplicity, we consider only the case A = {0, 1}, and further assume that the individual letters ξ_i are i.i.d. with ξ_i ∼ Be(p) for some fixed p ∈ (0, 1); thus P(ξ_i = 1) = p and P(ξ_i = 0) = q := 1 − p.

Given a finite string α_1···α_n ∈ A^n, let P(α_1···α_n) be the probability that the random string Ξ begins with α_1···α_n. In particular, for a single letter, P(0) = q and P(1) = p, and in general

    P(α_1···α_n) = ∏_{i=1}^n P(α_i) = ∏_{i=1}^n p^{α_i} q^{1−α_i}.   (2.1)

Given a random string ξ_1 ξ_2 ···, we define

    X_i := −ln P(ξ_i) = −ln(p^{ξ_i} q^{1−ξ_i}) = { −ln q, ξ_i = 0;  −ln p, ξ_i = 1. }   (2.2)
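As a concrete illustration of (2.1) and (2.2), the prefix probability P(α) and the variables X_i can be computed directly. Here is a minimal Python sketch; the value p = 0.6 is an illustrative assumption, not taken from the paper:

```python
import math

p = 0.6              # illustrative choice of P(xi_i = 1); any p in (0, 1) works
q = 1 - p

def prob_prefix(alpha):
    """P(alpha_1 ... alpha_n): probability that the random string begins with alpha, eq. (2.1)."""
    return math.prod(p if a == 1 else q for a in alpha)

def X(bit):
    """X_i = -ln P(xi_i), eq. (2.2): -ln q if xi_i = 0, -ln p if xi_i = 1."""
    return -math.log(p) if bit == 1 else -math.log(q)

alpha = [1, 0, 1]
assert abs(prob_prefix(alpha) - p * q * p) < 1e-12
# e^{-S_n} with S_n = X_1 + ... + X_n recovers P(alpha), anticipating (2.7)
assert abs(math.exp(-sum(X(a) for a in alpha)) - prob_prefix(alpha)) < 1e-12
```

The second assertion checks the identity P(ξ_1···ξ_n) = e^{−S_n} used repeatedly below.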
Note that X_1, X_2, ... is an i.i.d. sequence of positive random variables with

    E X_i = H := −p ln p − q ln q,   (2.3)

the usual entropy of each letter ξ_i, and

    E X_i² = H_2 := p ln²p + q ln²q,   (2.4)
    Var X_i = H_2 − H² = pq(ln p − ln q)² = pq ln²(p/q).   (2.5)

Note the case p = q = 1/2, when X_i = ln 2 is deterministic and Var X_i = 0; for all other p ∈ (0, 1), 0 < Var X_i < ∞. By (2.2), X_i is supported on {ln(1/p), ln(1/q)}. It is well-known, both in renewal theory and in the analysis of tries, that one frequently has to distinguish between two cases: the arithmetic (or lattice) case, when the support is a subset of dZ for some d > 0, and the non-arithmetic (or non-lattice) case, when it is not; see further Appendix A. For X_i given by (2.2), this yields the following cases:

arithmetic: The ratio ln p / ln q is rational. More precisely, X_i then is d-arithmetic, where d equals gcd(ln p, ln q), the largest positive real number such that ln p and ln q both are integer multiples of d. If ln p / ln q = a/b, where a and b are relatively prime positive integers, then

    d = gcd(ln p, ln q) = |ln p|/a = |ln q|/b.   (2.6)

non-arithmetic: The ratio ln p / ln q is irrational.

We let S_n denote the partial sums of X_i: S_n := ∑_{i=1}^n X_i. Thus

    P(ξ_1···ξ_n) = ∏_{i=1}^n P(ξ_i) = ∏_{i=1}^n e^{−X_i} = e^{−S_n}.   (2.7)

(This is a random variable, since it depends on the random string ξ_1···ξ_n; it can be interpreted as the probability that another random string Ξ^(j) begins with the same n letters as observed.)

We introduce the standard renewal theory notations (see e.g. Gut [13, Chapter 2]), for t ≥ 0 and n ≥ 0:

    ν(t) := min{n : S_n > t},   (2.8)
    F_n(t) := P(S_n ≤ t) = P(ν(t) > n),   (2.9)
    U(t) := E ν(t) = ∑_{n=0}^∞ F_n(t).   (2.10)

Note that (2.10) means that, for any function g ≥ 0,

    ∫_0^∞ g(t) dU(t) = ∑_{n=0}^∞ ∫_0^∞ g(t) dF_n(t) = ∑_{n=0}^∞ E g(S_n).   (2.11)
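The counting variable ν(t) of (2.8) and the elementary renewal estimate U(t) = E ν(t) ≈ t/H are easy to check by simulation. A small Monte Carlo sketch; the parameters p = 0.6, t = 200 and the sample size are illustrative assumptions:

```python
import math
import random

p = 0.6                                  # illustrative choice; q = 1 - p
q = 1 - p
H = -p * math.log(p) - q * math.log(q)   # entropy, eq. (2.3)

def nu(t, rng):
    """nu(t) = min{n : S_n > t}, eq. (2.8), with S_n = X_1 + ... + X_n and X_i as in (2.2)."""
    s, n = 0.0, 0
    while s <= t:
        s += -math.log(p) if rng.random() < p else -math.log(q)
        n += 1
    return n

rng = random.Random(1)
t, reps = 200.0, 2000
est = sum(nu(t, rng) for _ in range(reps)) / reps   # Monte Carlo estimate of U(t) = E nu(t)
# elementary renewal theorem: U(t)/t -> 1/H as t -> infinity
assert abs(est / t - 1 / H) < 0.05
```

The same simulation scheme extends directly to ν̂(t) below, where an independent initial summand X_0 is added.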
We also allow the summation to start with an initial random variable X_0, which is independent of X_1, X_2, ..., but may have an arbitrary real-valued distribution. We then define

    Ŝ_n := ∑_{i=0}^n X_i = X_0 + ∑_{i=1}^n X_i,   (2.12)
    ν̂(t) := min{n : Ŝ_n > t}.   (2.13)

3. Insertion depth in a trie

A trie is a binary tree structure designed to store a set of strings. It is constructed from the strings by the following recursive procedure, see further e.g. Knuth [22, Section 6.3], Mahmoud [25, Chapter 5] or Szpankowski [32, Section 1.1]: If the set of strings is empty, then the trie is empty; if there is only one string, then the trie consists of a single node (the root), and the string is stored there; if there is more than one string, then the trie begins with a root, without any string stored, all strings that begin with 0 are passed to the left subtree of the root, and all strings that begin with 1 are passed to the right subtree. In the latter case, the subtrees are constructed recursively by the same procedure, with the only difference that at the kth level, the strings are partitioned according to the kth letter. We assume that the strings are distinct (in our random model, this holds with probability 1), and then the procedure terminates. Note that one string is stored in each leaf of the trie, and that no strings are stored in the remaining nodes. The leaves are also called external nodes and the remaining nodes are called internal nodes; note that every internal node has one or two children.

The trie is a finite subtree of the complete infinite binary tree T_∞, where the nodes can be labelled by finite strings α = α_1···α_k ∈ A* := ⋃_{k=0}^∞ A^k (the root is the empty string). It is easily seen that a node α_1···α_k in T_∞ is an internal node of the trie if and only if there are at least 2 strings (in the given set) that start with α_1···α_k, and (for k ≥ 1) that α_1···α_k is an external node if and only if there is exactly one such string, and there is at least one other string beginning with α_1···α_{k−1}.

Let D_n be the depth (= path length) of the node containing a given string, for example the first, in the trie constructed from n random strings Ξ^(1), ..., Ξ^(n). (By symmetry, any of the n strings will have a depth with the same distribution.) Denoting the chosen string by Ξ = ξ_1 ξ_2 ···, the depth D_n is thus at most k if and only if no other of the strings begins with ξ_1···ξ_k. Conditioning on the string Ξ, each of the other strings has this beginning with probability P(ξ_1···ξ_k), and thus by independence, recalling (2.7),

    P(D_n ≤ k | Ξ) = (1 − P(ξ_1···ξ_k))^{n−1} = (1 − e^{−S_k})^{n−1}.   (3.1)

Let X_0 = X_0^(n) be a random variable with the distribution

    P(X_0^(n) > x) = (1 − e^x/n)^{n−1} = (1 − e^{x−ln n})^{n−1},   x ∈ (−∞, ln n).   (3.2)

As n → ∞, this converges to exp(−e^x), and thus X_0^(n) →d X*, where −X* has the Gumbel distribution with P(−X* ≤ x) = exp(−exp(−x)).

Remark 3.1.
It is easily seen that X_0^(n) =d ln n − max{Z_1, ..., Z_{n−1}}, where Z_1, Z_2, ... are i.i.d. Exp(1) random variables. Cf. Leadbetter, Lindgren and Rootzén [23, Example 1.7.2].

Using (3.2), we can rewrite (3.1) as

    P(D_n ≤ k | Ξ) = P(X_0^(n) > ln n − S_k | Ξ)   (3.3)

and thus, recalling (2.12) and (2.13),

    P(D_n ≤ k) = P(X_0 > ln n − S_k) = P(Ŝ_k > ln n) = P(ν̂(ln n) ≤ k).   (3.4)

Since this holds for every k ≥ 0,

    D_n =d ν̂(ln n).   (3.5)

In the case p = 1/2, S_k = k ln 2 is non-random, and the only randomness in ν̂(ln n) comes from X_0; in fact, it is easy to see that P(D_n ≤ k) → P(−X* ≤ t) if k → ∞ and n → ∞ along sequences such that k ln 2 − ln n → t ∈ (−∞, ∞), see [14], [28], [25, Theorem 5.7], [24]. This result can also be expressed as d_TV(D_n, ⌈(ln n − X*)/ln 2⌉) → 0 as n → ∞, where d_TV denotes the total variation distance of the distributions, see [19, Example 4.5].

However, if p ≠ 1/2, then each X_k is truly random, which leads to larger dispersion of D_n. We can apply standard renewal theory theorems, see Theorems A.1–A.3 and Remark A.4 in the appendix, and immediately obtain the following. For other, earlier proofs see Knuth [22, Sections 6.3 and 5.2], Pittel [27, 28] and Mahmoud [25, Section 5.5]. The Markov case is treated by Jacquet and Szpankowski [17], ergodic strings by Pittel [27], and a class of general dynamical sources by Clément, Flajolet and Vallée [5].

Theorem 3.2.
For every p ∈ (0, 1),

    D_n / ln n →p 1/H,   (3.6)

with H the entropy given by (2.3). Moreover, the convergence holds in every L^r, r < ∞, too. Hence, all moments converge in (3.6), and

    E D_n^r ∼ H^{−r} (ln n)^r,   0 < r < ∞.   (3.7)

Theorem 3.3.
More precisely:

(i) If ln p / ln q is irrational, then, as n → ∞,

    E D_n = ln n / H + H_2/(2H²) + γ/H + o(1).   (3.8)

(ii) If ln p / ln q is rational, then, as n → ∞,

    E D_n = ln n / H + H_2/(2H²) + γ/H + ψ(ln n) + o(1),   (3.9)

where ψ(t) is a small continuous function, with period d = gcd(ln p, ln q) in t, given by

    ψ(t) := −(1/H) ∑_{k≠0} Γ(−2πik/d) e^{2πikt/d}.   (3.10)

Proof.
The non-arithmetic case (3.8) follows directly from (3.5) and (A.4); we can replace X_0^(n) by the limit X*, and since the Gumbel variable −X* has characteristic function E e^{−itX*} = Γ(1 − it), we have E X* = Γ′(1) = −γ. In the arithmetic case, we use (A.6), together with Lemma A.5, which yields

    E{(t − X*)/d} = 1/2 − ∑_{k≠0} [Γ(1 − 2πik/d)/(2πki)] e^{2πikt/d} = 1/2 + (1/d) ∑_{k≠0} Γ(−2πik/d) e^{2πikt/d}.  □

Theorem 3.4.
Suppose that p ∈ (0, 1). Then, as n → ∞,

    (D_n − H^{−1} ln n)/√(ln n) →d N(0, σ²/H³),

with σ² = H_2 − H² = pq(ln p − ln q)². If p ≠ 1/2, then σ² > 0 and this can be written as

    D_n ∼ AsN(H^{−1} ln n, H^{−3} σ² ln n).

Moreover,

    Var D_n = (σ²/H³) ln n + o(ln n).

In the argument above, X_0 depends on n. This is a nuisance, although no real problem (see Remark A.4). An alternative that avoids this problem is to Poissonize by considering a random number of strings. In this case it is simplest to consider 1 + Po(λ) strings, so that a selected string Ξ is compared to a Poisson number Po(λ) of other strings, for a parameter λ → ∞. Conditioned on Ξ, the number of other strings beginning with ξ_1···ξ_k then has the Poisson distribution Po(λP(ξ_1···ξ_k)). Thus we obtain instead of (3.1), now denoting the depth by D_λ,

    P(D_λ ≤ k | Ξ) = e^{−λP(ξ_1···ξ_k)} = e^{−λe^{−S_k}} = e^{−e^{−(S_k − ln λ)}} = P(−X* < S_k − ln λ) = P(S_k + X* > ln λ) = P(ν̂(ln λ) ≤ k),

where X_0 := X* now is independent of n, and consequently D_λ =d ν̂(ln λ). We obtain the same asymptotics as for D_n above, directly from Theorems A.1–A.3. It is in this case easy to depoissonize, by noting that D_n is stochastically monotone in n, and derive the results for D_n from the results for D_λ by choosing λ = n ± n^{2/3}; we omit the details.

4. Imbalance in tries
Mahmoud [26] studied the imbalance factor of a string in a trie, defined as the number of steps to the right minus the number of steps to the left in the path from the root to the leaf where the string is stored. We define

    Y_i := 2ξ_i − 1 = { −1, ξ_i = 0;  +1, ξ_i = 1 },

and denote the corresponding partial sums by V_k := ∑_{i=1}^k Y_i. Thus the imbalance factor ∆_n of the string Ξ in a random trie with n strings is V_{D_n}, with D_n as in Section 3 the depth of the string.

It follows immediately from (3.3) that (3.4) holds also conditioned on the sequence (Y_1, Y_2, ...). As a consequence, for any k and v,

    P(D_n = k | V_k = v) = P(ν̂(ln n) = k | V_k = v),

which shows that

    (D_n, ∆_n) = (D_n, V_{D_n}) =d (ν̂(ln n), V_{ν̂(ln n)}).

In particular, ∆_n =d V_{ν̂(ln n)}. We may apply Theorem A.8 (and Remark A.9). A simple calculation yields Var(μ_X Y − μ_Y X) = pq(ln p + ln q)² = pq ln²(pq), and we obtain the central limit theorem by Mahmoud [25]:

Theorem 4.1. As n → ∞,

    ∆_n ∼ AsN( ((p − q)/H) ln n, (pq ln²(pq)/H³) ln n ).

5. The expected size of a trie
A trie built of n strings as in Section 3 has n external nodes, since each external node contains exactly one string. However, the number of internal nodes, W_n say, is random. We will study its expectation. For simplicity we Poissonize directly and consider a trie constructed from Po(λ) strings; we let W̃_λ be the number of internal nodes. The results below have previously been found by other methods; in particular, more precise asymptotics have been found using Mellin transforms; see Knuth [22], Mahmoud [25], Fayolle, Flajolet, Hofri and Jacquet [10], and, in particular, Jacquet and Régnier [15, 16]. The Markov case is studied by Régnier [30] and dynamical sources by Clément, Flajolet and Vallée [5].

If α = α_1···α_k is a finite string, let I(α) be the indicator of the event that α is an internal node in the trie. We found above that this event occurs if and only if there are at least two strings beginning with α. In our Poisson model, the number of strings beginning with α has a Poisson distribution Po(λP(α)), and thus

    E W̃_λ = ∑_{α∈A*} E I(α) = ∑_{α∈A*} P(Po(λP(α)) ≥ 2) = ∑_{α∈A*} f(λP(α)),   (5.1)

where

    f(x) := P(Po(x) ≥ 2) = 1 − (1 + x)e^{−x}.   (5.2)

Sums of the type in (5.1) are often studied using Mellin transform inversion and residue calculus. Renewal theory presents an alternative. As said in the introduction, this opens the way to straightforward generalizations, e.g. to Markov sources.

Theorem 5.1.
Suppose that f is a non-negative function on (0, ∞), and that F(λ) = ∑_{α∈A*} f(λP(α)), with P(α) given by (2.1). Assume further that f is a.e. continuous and satisfies the estimates

    f(x) = O(x²), 0 < x < 1, and f(x) = O(1), 1 < x < ∞.   (5.3)

Let g(t) := e^t f(e^{−t}).

(i) If ln p / ln q is irrational, then, as λ → ∞,

    F(λ)/λ → (1/H) ∫_{−∞}^∞ g(t) dt = (1/H) ∫_0^∞ f(x) x^{−2} dx.   (5.4)

(ii) If ln p / ln q is rational, then, as λ → ∞,

    F(λ)/λ = (1/H) ψ(ln λ) + o(1),   (5.5)

where, with d := gcd(ln p, ln q) given by (2.6), ψ is a bounded d-periodic function having the Fourier series

    ψ(t) ∼ ∑_{m=−∞}^∞ ψ̂(m) e^{2πimt/d}   (5.6)

with

    ψ̂(m) = ĝ(−2πm/d) = ∫_{−∞}^∞ e^{2πimt/d} g(t) dt = ∫_0^∞ f(x) x^{−2−2πim/d} dx.   (5.7)

Furthermore,

    ψ(t) = d ∑_{k=−∞}^∞ g(kd − t).   (5.8)

If f is continuous, then ψ is too.
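Theorem 5.1(i) is easy to test numerically. The sketch below evaluates F(λ) = ∑_α f(λP(α)) for the trie function f of (5.2), grouping the C(a+b, a) strings with a ones and b zeros, which all have the same probability p^a q^b; the parameters (p = 0.6, λ = 1000, the level cutoff K) are illustrative assumptions:

```python
import math

p, q = 0.6, 0.4                        # ln p / ln q irrational: the non-arithmetic case
H = -p * math.log(p) - q * math.log(q)

def f(x):
    """f(x) = P(Po(x) >= 2) = 1 - (1 + x) e^{-x}, eq. (5.2)."""
    return -math.expm1(-x) - x * math.exp(-x)

def F(lam, K=200):
    """F(lam) = sum over all finite strings alpha of f(lam P(alpha)), summing levels
    k = |alpha| up to K; the tail of level k is at most (lam^2 / 2)(p^2 + q^2)^k."""
    total = 0.0
    for k in range(K + 1):
        for a in range(k + 1):         # a ones and k - a zeros: C(k, a) such strings
            total += math.comb(k, a) * f(lam * p**a * q**(k - a))
    return total

lam = 1000.0
# Theorem 5.1(i) with int_0^infty f(x) x^{-2} dx = 1 (cf. (5.14)): F(lam)/lam -> 1/H
assert abs(F(lam) / lam - 1 / H) < 0.05
```

For f as in (5.2) the limit is exactly 1/H, so this also previews Theorem 5.3 on the expected trie size.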
Proof. If f(α) is any non-negative function on A*, then, using (2.7), for each k ≥ 0,

    ∑_{α_1,...,α_k} f(α_1···α_k) = ∑_{α_1,...,α_k} [f(α_1···α_k)/P(α_1···α_k)] P(α_1···α_k) = E[f(ξ_1···ξ_k)/P(ξ_1···ξ_k)] = E(e^{S_k} f(ξ_1···ξ_k)),

and thus

    ∑_{α∈A*} f(α) = ∑_{k=0}^∞ E(e^{S_k} f(ξ_1···ξ_k)).   (5.9)

With f(α) = f(λP(α)), we have f(ξ_1···ξ_k) = f(λe^{−S_k}) and thus (5.9) yields, recalling (2.11),

    F(λ) = ∑_{α∈A*} f(λP(α)) = ∑_{k=0}^∞ E(e^{S_k} f(λe^{−S_k})) = ∫_0^∞ f(λe^{−x}) e^x dU(x).

Define further f_1(x) := f(x)/x; thus g(t) = f_1(e^{−t}). Then

    F(λ) = ∫_0^∞ λ f_1(λe^{−x}) dU(x) = λ ∫_0^∞ g(x − ln λ) dU(x).   (5.10)

We can now apply the key renewal theorem, Theorem A.7. The function g is a.e. continuous and it follows from (5.3) that g(t) ≤ Ce^{−|t|} for some C; hence g is directly Riemann integrable on (−∞, ∞) by Lemma A.6. In the non-arithmetic case (i) we obtain (5.4) from (5.10) and (A.10), since μ = E X_i = H by (2.3) and, with x = e^{−t},

    ∫_{−∞}^∞ g(t) dt = ∫_{−∞}^∞ e^t f(e^{−t}) dt = ∫_0^∞ f(x) x^{−2} dx.   (5.11)

Similarly, the arithmetic case (ii) follows from (A.12) and (A.14)–(A.16) together with the calculation, generalizing (5.11),

    ĝ(s) = ∫_{−∞}^∞ e^{−ist} g(t) dt = ∫_{−∞}^∞ e^{(1−is)t} f(e^{−t}) dt = ∫_0^∞ f(x) x^{is−2} dx.

(This equals the Mellin transform of f evaluated at is − 1.)  □

Remark 5.2.
The assumptions on f may be weakened (with the same proof); it suffices that f(x) = O(x^{1+δ}) and f(x) = O(x^{1−δ}) for x ∈ (0, ∞) and some δ > 0. If f is continuous, it is obviously sufficient that these estimates hold for small and large x, respectively.

Returning to W̃_λ, we obtain the following for the expected number of internal nodes in the Poisson trie.

Theorem 5.3. (i) If ln p / ln q is irrational, then, as λ → ∞,

    E W̃_λ / λ → 1/H.   (5.12)

(ii) If ln p / ln q is rational, then, as λ → ∞,

    E W̃_λ / λ = 1/H + (1/H) ψ(ln λ) + o(1),   (5.13)

where, with d = gcd(ln p, ln q), ψ is a continuous d-periodic function with average 0 and Fourier expansion

    ψ(t) = ∑_{k≠0} [Γ(1 − 2πik/d)/(1 + 2πik/d)] e^{2πikt/d} = ∑_{k≠0} (2πik/d) Γ(−1 − 2πik/d) e^{2πikt/d}.

Proof.
We apply Theorem 5.1 to (5.1). It follows from (5.2) that f′(x) = xe^{−x}. Thus, by an integration by parts, since f(x)/x → 0 as x → 0 and as x → ∞,

    ∫_0^∞ f(x) x^{−2} dx = ∫_0^∞ f′(x) x^{−1} dx = ∫_0^∞ e^{−x} dx = 1.   (5.14)

Consequently, (5.12) follows from (5.4). Similarly, (5.13) follows from (5.5) and the calculation, generalizing (5.14),

    ĝ(s) = ∫_0^∞ f(x) x^{is−2} dx = (1 − is)^{−1} ∫_0^∞ f′(x) x^{is−1} dx = Γ(1 + is)/(1 − is) = −is Γ(−1 + is).  □

The case of a fixed number n of strings is easily handled by comparison, and (5.12) and (5.13) imply the corresponding results for W_n:

Theorem 5.4. (i) If ln p / ln q is irrational, then, as n → ∞,

    E W_n / n → 1/H.

(ii) If ln p / ln q is rational, then, as n → ∞, with ψ as in Theorem 5.3,

    E W_n / n = 1/H + (1/H) ψ(ln n) + o(1).

Proof. E W_n is increasing in n. Thus, first, because P(Po(2n) ≥ n) ≥ 1/2, we have E W̃_{2n} ≥ (1/2) E W_n, and thus E W_n ≤ 2 E W̃_{2n} = O(n). Secondly, using this estimate, the standard Chernoff concentration bounds for the Poisson distribution easily imply, with λ_± = n ± n^{2/3}, say, E W̃_{λ_−} − o(n) ≤ E W_n ≤ E W̃_{λ_+} + o(n). The results then follow from Theorem 5.3.  □

Remark 5.5.
It is well-known that the periodic function ψ above, as in many similar results, fluctuates very little from its mean. In fact, the largest d is obtained for p = q = 1/2, when d = ln 2. Since |Γ(1 + is)| decreases rapidly as s → ±∞, the Fourier coefficients of ψ(t) are very small; the largest (in absolute value) are |ψ̂(±1)| = |Γ(1 + 2πi/ln 2)| / |1 + 2πi/ln 2| ≈ 5.4 · 10^{−7}, so |ψ(ln n)| is at most about 10^{−6}, and the oscillations ψ(ln n)/H of E W_n/n are bounded by about 1.6 · 10^{−6}. (See for example [25, pp. 23–28].) Other choices of p yield even smaller oscillations.

6. b-tries

As a variation, consider a b-trie, where each node can store b strings, for some fixed integer b ≥ 1; as before, the internal nodes do not contain any string. A finite string α now is an internal node if and only if at least b + 1 of the strings start with α. In the argument above we only have to replace (5.2) by

    f(x) := P(Po(x) ≥ b + 1);   (6.1)

thus f′(x) = P(Po(x) = b) = x^b e^{−x}/b! and (5.11) yields, with an integration by parts as in (5.14), ∫_{−∞}^∞ g(t) dt = 1/b. Hence, in the non-arithmetic case when ln p / ln q is irrational, the expected number of internal nodes is E W̃^(b)_λ ∼ λ/(Hb), as found by Jacquet and Régnier [15, 16]. In the arithmetic case, we obtain a periodic function ψ, now with Fourier coefficients (1 + 2πik/d)^{−1} Γ(b − 2πik/d)/b!.

We can also analyze the external nodes. Let Z_j be the number of nodes where exactly j strings are stored, j = 1, ..., b. A finite string α is one of these nodes if exactly j of the stored strings begin with α, and at least b − j + 1 other strings begin with α′, the sibling of α obtained by flipping the last letter. (We assume that there are at least b strings, so we can ignore the root.)

Consider again the Poisson model. In the case when α ends with 1, i.e., α = β1, the probability of this event is, with x = λP(β), by independence in the Poisson model,

    P(Po(px) = j) P(Po(qx) > b − j).

If α = β0, we similarly have the probability P(Po(qx) = j) P(Po(px) > b − j). Summing over β ∈ A*, we thus obtain a sum of the type in Theorem 5.1 with f replaced by

    f_j(x) = P(Po(px) = j) P(Po(qx) > b − j) + P(Po(qx) = j) P(Po(px) > b − j)
           = (p^j x^j / j!) e^{−px} (1 − ∑_{k=0}^{b−j} (q^k x^k / k!) e^{−qx}) + (q^j x^j / j!) e^{−qx} (1 − ∑_{k=0}^{b−j} (p^k x^k / k!) e^{−px})
           = (p^j x^j / j!) e^{−px} + (q^j x^j / j!) e^{−qx} − ∑_{k=0}^{b−j} (p^j q^k + q^j p^k) (x^{j+k} / (j! k!)) e^{−x}.

We argue as above, with g_j(t) := e^t f_j(e^{−t}). We have, similarly to (5.11), omitting some details,

    c_j := ∫_{−∞}^∞ g_j(t) dt = ∫_0^∞ f_j(x) x^{−2} dx
        = { p ln(1/p) + q ln(1/q) − ∑_{k=1}^{b−1} (1/k)(p q^k + q p^k),   j = 1;
            1/(j(j−1)) − ∑_{k=0}^{b−j} ((j+k−2)!/(j! k!)) (p^j q^k + q^j p^k),   2 ≤ j ≤ b. }   (6.2)

Alternatively, using

    f_j(x) = (p^j x^j / j!) e^{−px} ∑_{k=b−j+1}^∞ (q^k x^k / k!) e^{−qx} + (q^j x^j / j!) e^{−qx} ∑_{k=b−j+1}^∞ (p^k x^k / k!) e^{−px}
           = ∑_{k=b−j+1}^∞ (p^j q^k + q^j p^k) (x^{j+k} / (j! k!)) e^{−x},

we find

    c_j = ∑_{k=b−j+1}^∞ ((j+k−2)!/(j! k!)) (p^j q^k + q^j p^k),   1 ≤ j ≤ b.   (6.3)

More generally (except when (j, s) = (1, 0)),

    ĝ_j(s) = ∫_0^∞ f_j(x) x^{is−2} dx = (Γ(j − 1 + is)/j!) (p^{1−is} + q^{1−is}) − ∑_{k=0}^{b−j} (Γ(j + k − 1 + is)/(j! k!)) (p^j q^k + q^j p^k).   (6.4)

If we use the notation Z_{jn} for the trie with a fixed number n of strings and Z̃_{jλ} for the Poisson model with Po(λ) strings, we obtain as above the following result for the number of external nodes that store j strings.

Theorem 6.1. (i) If ln p / ln q is irrational, then, as n → ∞, for j = 1, ..., b,

    E Z_{jn} / n → π_j := c_j / H,

with c_j given by (6.2)–(6.3).

(ii) If ln p / ln q is rational, then, as n → ∞, for j = 1, ..., b,

    E Z_{jn} / n = ψ_{bj}(ln n) + o(1),

where ψ_{bj} is a continuous d-periodic function, with d as in Theorem 5.3; ψ_{bj} has average π_j and Fourier expansion

    ψ_{bj}(t) = H^{−1} ∑_{k=−∞}^∞ ĝ_j(−2πk/d) e^{2πikt/d} = π_j + H^{−1} ∑_{k≠0} ĝ_j(−2πk/d) e^{2πikt/d},

with ĝ_j given by (6.4). The same results (with n replaced by λ) hold for Z̃_{jλ} in the Poisson model.

Proof. As just said, the Poisson case follows from Theorem 5.1, and it remains only to depoissonize. To do this, choose λ = n, and let N ∼ Po(n) be the number of strings in the Poisson model. We couple the trie with n strings and the Poisson trie with N strings by starting with min(n, N) common strings. If we add a new string to the trie, it is either stored in an existing leaf or it converts a leaf to an internal node and adds two new leaves (and possibly a chain of further internal nodes). Thus at most 3 leaves are affected, and each Z_j changes by at most 3. Since we add max(n, N) − min(n, N) = |N − n| new strings, we have |Z̃_{jλ} − Z_{jn}| ≤ 3|N − n| for each j, and thus |E Z̃_{jλ} − E Z_{jn}| ≤ 3 E|N − n| = O(√n).  □

For example, for b = 2, 3, 4:

    b = 2:  π_1 = 1 − 2pq/H,                        π_2 = pq/H.
    b = 3:  π_1 = 1 − 5pq/(2H),                     π_2 = pq/(2H),             π_3 = pq/(2H).
    b = 4:  π_1 = 1 − 17pq/(6H) + 2(pq)²/(3H),      π_2 = pq/(2H) − (pq)²/H,   π_3 = pq/(6H) + 2(pq)²/(3H),   π_4 = pq/(3H) − (pq)²/(6H).

Note that ∑_{j=1}^b j π_j = 1, or equivalently ∑_{j=1}^b j c_j = H, since the total number of strings in the leaves is n; this can also be verified from (6.2).

7. Patricia tries
Another version of the trie is the Patricia trie, where the trie is compressed by eliminating all internal nodes with only one child. (We use the notations above with a superscript P for the Patricia case.) Since each internal node in the Patricia trie thus has exactly 2 children, the number of internal nodes is one less than the number of external nodes, i.e., W^P_n = n − 1 for n strings. As another illustration of Theorem 5.1, we note that this trivial result, to the first order at least, also can be derived as above. The condition for a finite string α to be an internal node of the Patricia trie is that there is at least one string beginning with α0 and at least one string beginning with α1. In the Poisson model, the numbers of strings with these beginnings are independent Poisson random variables with means λP(α0) = λqP(α) and λP(α1) = λpP(α), and we can argue as above with f(x) = (1 − e^{−px})(1 − e^{−qx}). In this case,

    ∫_{−∞}^∞ g(t) dt = ∫_0^∞ f(x) x^{−2} dx = −p ln p − q ln q = H,

which implies E W̃^P_λ ∼ λ and E W^P_n ∼ n in the non-arithmetic case. Moreover, we know that this holds in the arithmetic case too, without oscillations, which means that ψ̂(m) = 0 for m ≠ 0 in (5.6)–(5.7). Indeed, for example by integration by parts,

    ĝ(s) = ∫_0^∞ f(x) x^{is−2} dx = ∫_0^∞ x^{is−2} (1 − e^{−px} − e^{−qx} + e^{−x}) dx = (1 − p^{1−is} − q^{1−is}) Γ(is − 1),

and thus ψ̂(m) = ĝ(−2πm/d) = 0 for m ≠ 0.

We can also consider a Patricia b-trie, and obtain the asymptotics of the expected number of internal nodes in a similar way, but it is simpler to use the result in Theorem 6.1 and the fact that the number of internal nodes is ∑_{j=1}^b Z^P_{jn} − 1 = ∑_{j=1}^b Z_{jn} − 1; in the non-arithmetic case this yields the asymptotics (∑_{j=1}^b π_j) n.

The number of internal nodes in the Patricia trie is thus reduced to n − 1 from about n/H in the trie (see Theorem 5.4, and ignore the small oscillations in the arithmetic case); this is a reduction by a factor H, which is at most ln 2 ≈ 0.69. The depth decreases too: if we consider a trie with 1 + Po(λ) strings, with one selected string Ξ, then a string α is an internal node on the path in the trie from the root to Ξ such that α does not appear in the Patricia trie if and only if Ξ begins with α, and further, either Ξ begins with α0, there is at least one other such string, and there is no string beginning with α1, or, conversely, Ξ and at least one other string begins with α1 and there is no string beginning with α0. The probability of this is λ^{−1} f(x) with x = λP(α) and

    f(x) := xq(1 − e^{−qx}) e^{−px} + xp(1 − e^{−px}) e^{−qx}.

Hence, if ∆D_λ := D_λ − D^P_λ is the difference between the path lengths to Ξ in the trie and in the Patricia trie, then E ∆D_λ = λ^{−1} ∑_α f(λP(α)) and Theorem 5.1 yields

    E ∆D_λ → (1/H) ∫_0^∞ f(x) x^{−2} dx = (q/H) ∫_0^∞ (e^{−px} − e^{−x}) x^{−1} dx + (p/H) ∫_0^∞ (e^{−qx} − e^{−x}) x^{−1} dx = (−q ln p − p ln q)/H.

This holds also in the arithmetic case, since a simple calculation shows that the Fourier coefficients ψ̂(m) in (5.7) vanish for all m ≠ 0. (This is an interesting example of cancellation in an arithmetic case where we would expect oscillations.) Hence the expected saving is 1 for p = 1/2, and O(1) for any fixed p. (This is o(E D_λ) and thus asymptotically negligible.)

Again, we can depoissonize by considering λ = n ± n^{2/3}, and we obtain the same result for a fixed number n of strings. Together with Theorem 3.3, we obtain the following, earlier found by Szpankowski [31]; see also Knuth [22, Section 6.3] (p = 1/2) and Rais, Jacquet and Szpankowski [29]. (Dynamical sources are considered by Bourdon [3].)
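The structural facts used above, one string per external node and exactly n − 1 internal nodes in a Patricia trie, can be checked directly on simulated strings. A small Python sketch; the parameters and the finite truncation depth of the infinite strings are illustrative assumptions:

```python
import random

def trie_internal_external(strings, k=0):
    """Count (internal, external) nodes of the trie of distinct strings
    (long finite prefixes of infinite strings), partitioning on letter k."""
    if len(strings) <= 1:
        return (0, len(strings))
    left = [s for s in strings if s[k] == 0]
    right = [s for s in strings if s[k] == 1]
    i0, e0 = trie_internal_external(left, k + 1)
    i1, e1 = trie_internal_external(right, k + 1)
    return (1 + i0 + i1, e0 + e1)

def patricia_internal(strings, k=0):
    """Internal nodes of the Patricia trie: one-child nodes are compressed away."""
    if len(strings) <= 1:
        return 0
    left = [s for s in strings if s[k] == 0]
    right = [s for s in strings if s[k] == 1]
    own = 1 if left and right else 0    # only branching nodes survive compression
    return own + patricia_internal(left, k + 1) + patricia_internal(right, k + 1)

rng = random.Random(7)
p, n, depth = 0.6, 500, 80              # illustrative parameters
strings = [[1 if rng.random() < p else 0 for _ in range(depth)] for _ in range(n)]
internal, external = trie_internal_external(strings)
assert external == n                         # one string per leaf
assert patricia_internal(strings) == n - 1   # every Patricia internal node has 2 children
```

The difference internal − (n − 1) is the number of one-child nodes removed by compression, whose expectation per string is the quantity E ∆D_λ computed above.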
Theorem 7.1.
For the expected depth E D^P_n in a Patricia trie:

(i) If ln p / ln q is irrational, then, as n → ∞,

    E D^P_n = ln n / H + H_2/(2H²) + (γ + q ln p + p ln q)/H + o(1).

(ii) If ln p / ln q is rational, then, as n → ∞,

    E D^P_n = ln n / H + H_2/(2H²) + (γ + q ln p + p ln q)/H + ψ(ln n) + o(1),

where ψ(t) is a small continuous function, with period d in t, given by (3.10).

8. Insertion in a trie
When a new string is inserted in a trie, it becomes a new external node;it may also create one or several new internal nodes. Let N ≥ Theorem 8.1. As n → ∞ , P ( N = 0) = 1 − pqH − ψ (ln n ) + o (1) , P ( N = j ) = (cid:16) pqH + ψ (ln n ) (cid:17) pq (1 − pq ) j − + o (1) , j ≥ , where ψ = 0 in the non-arithmetic case, while in the d -arithmetic case ψ ( t ) = 2 pqH X k =0 Γ (cid:0) − π i kd (cid:1) e π i kt/d . Further, E N = 1 H + 12 pq ψ (ln n ) + o (1) . (8.1) The same results hold in the Poisson case (with n replaced by λ ).Proof. Consider first the Poisson case, with insertion of Ξ in a trie withPo( λ ) other strings.Let K be the length of the longest prefix of Ξ that is shared with atleast two strings already existing in the trie; this is the depth of the lastinternal node (in the existing trie) that the new string encounters whilebeing inserted.There is either no existing string with the same K + 1 first letters as Ξ,or exactly one such string. In the first case, Ξ is inserted at depth K + 1without creating any new internal nodes, so N = 0.In the second case, we have reached an external node, which is convertedinto an internal node, and the string that was stored there is displaced andinstead stored, together with the new string, at the end of a sequence of N ≥ N is the number of common letters, afterthe K first, in these two strings.Thus, conditioned on N ≥ N has a geometric distribution: P ( N = j ) = P ( N ≥ p + q ) j − · pq, j ≥ . (8.2)Since further P ( N = 0) = 1 − P ( N ≥ P ( N ≥ k , the event N ≥ K = k and, say, ξ K +1 = 1, happensif and only if ξ k +1 = 1 and there is exactly one existing string beginningwith ξ · · · ξ k ξ · · · ξ k
0. The conditionalprobability of this given α := ξ · · · ξ k is P ( ξ k +1 = 1) P (cid:0) Po( λP ( α ) q ) ≥ (cid:1) P (cid:0) Po( λP ( α ) p ) = 1 (cid:1) = f (cid:0) λP ( α ) (cid:1) , with f ( x ) = p (1 − e qx )( pxe − px ) = p xe − px − p xe − x . Thus, P ( N ≥ , K = k and ξ K +1 = 1) = E f (cid:0) λ P ( ξ · · · ξ k ) (cid:1) = E f (cid:0) λe − S k (cid:1) = E f (cid:0) e − ( S k − ln λ ) (cid:1) and, summing over k and using (2.11), P ( N ≥ ξ K +1 = 1) = ∞ X k =0 E f (cid:0) e − ( S k − ln λ ) (cid:1) = Z ∞ f (cid:0) e − ( x − ln λ ) (cid:1) d U ( x ) . The function g ( x ) := f ( e − x ) is directly Riemann integrable on ( −∞ , ∞ )by Lemma A.6 (because f ( x ) = O ( x ∧ x − )), and thus the key renewaltheorem Theorem A.7 yields P ( N ≥ ξ K +1 = 1) = 1 H Z ∞−∞ g ( x ) d x + ψ (ln λ ) + o (1) . (8.3)where ψ ( t ) = 0 in the non-arithmetic case and ψ ( t ) = 1 H X m =0 b g ( − πm/d ) e π i mt/d (8.4)in the arithmetic case.Routine integrations yield Z ∞−∞ g ( x ) d x = Z ∞ f ( y ) d yy = Z ∞ ( p e − py − p e − y ) d y = p − p = pq (8.5)and, more generally, b g ( s ) = Z ∞−∞ e − i sx g ( x ) d x = Z ∞ f ( y ) y i s − d y = ( p − i s − p )Γ(1 + i s );thus in the arithmetic case, since p π i m/d = 1 for integers m , b g ( − πm/d ) = pq Γ(1 − πm i /d ) . (8.6)By symmetry, (8.3) implies, for similarly defined g and ψ , P ( N ≥ ξ K +1 = 0) = 1 H Z ∞−∞ g ( x ) d x + ψ (ln λ ) + o (1) , (8.7)where, noting that (8.5) and (8.6) are symmetric in p and q , R ∞−∞ g ( x ) d x = pq and ψ = ψ .Consequently, summing (8.3) and (8.7), with ψ := ψ + ψ = 2 ψ , P ( N ≥
1) = 2pq/H + ψ(ln λ) + o(1). (8.8)
The result in the Poisson case now follows from (8.2), (8.4), (8.6) and (8.8). For the mean we have, by (8.2) and (8.8),
E N = Σ_{j=0}^∞ j P(N = j) = (1/(2pq)) P(N ≥
1) = 1 H + 12 pq ψ (ln λ ) + o (1) . To depoissonize, consider first adding Ξ to a trie with Po( n − n / ) strings,and then increase the family by adding Po( n / ) further strings; it is easilyseen that with probability 1 − O ( λ − / ) = 1 − o (1), this does not changethe place where Ξ is inserted, and thus not N . The same holds for allintermediate tries, in particular for the one with exactly n strings if there isone, which there is w.h.p. because P (cid:0) Po( n − n / ) ≤ n (cid:1) → P (cid:0) Po( n + n / ) ≥ n (cid:1) →
1. Hence the variable N is w.h.p. the same for n strings and for Po(n) strings. □

It is easily verified that, at least if we ignore the error terms, the expected number of new internal nodes added for each new string given by (8.1) coincides with the derivative of E W_λ = λ/H + (λ/H)ψ(ln λ) + o(λ) given by (5.13), as it should.

Remark 8.2.
Christophi and Mahmoud [4] studied random climbing in random tries, taking (in one version) steps left or right with probabilities p and q; this is like inserting a new node but without moving any old one. The length of the climb is thus D_n when N = 0 or 1, but D_n − (N −
1) when N ≥ 2.

9. Tunstall and Khodak codes
Tunstall and Khodak codes are variable-to-fixed length codes that are used in data compression. We give a brief description here. See [7], [8] and the survey [33] for more details and references, as well as for an analysis using Mellin transforms.

We recall first the general situation. The idea is that an infinite string can be parsed as a unique sequence of nonoverlapping phrases belonging to a certain (finite) dictionary D. Each phrase in the dictionary can then be represented by a binary number of fixed length ℓ; if there are M phrases in the dictionary we take ℓ := ⌈lg M⌉.

Note first that a set of phrases is a dictionary allowing a unique parsing in the way just described if and only if every infinite string has exactly one prefix in the dictionary. Equivalently, the phrases in the dictionary have to be the external nodes of a trie where every internal node has two children (so the Patricia trie is the same); this trie is the parsing tree.

By a random phrase we mean a phrase distributed as the unique initial phrase in a random infinite string Ξ. Thus a phrase α in the dictionary D is chosen with probability P(α). We let the random variable L be the length of a random phrase.

If we parse an infinite i.i.d. string Ξ, the successive phrases will be independent with this distribution. Hence, if K_N is the (random) number of phrases required to code the N first letters ξ_1 ··· ξ_N, then, see Appendix A and (2.8), K_N = ν(N −
1) for a renewal process where the increments X_i are independent copies of L. Consequently, as N → ∞, by Theorem A.1,
K_N/N → 1/E L a.s. and E K_N/N → 1/E L. (9.1)
We obtain also convergence of higher moments and, by Theorem A.3, a central limit theorem for K_N. The expected number of bits required to code a string of length N is thus ℓ E K_N ∼ ℓN/E L = (⌈lg M⌉/E L) N.
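The renewal law (9.1) can be illustrated by a small simulation; this is my own sketch, not code from the paper, and the dictionary D = {0, 10, 110, 111} and the bias p = 0.6 are assumptions chosen for the example. The dictionary is complete (every infinite string has exactly one prefix in it), so greedy parsing is the unique parsing, and K_N/N should approach 1/E L.

```python
import random

random.seed(1)
p = 0.6                         # assumed P(bit = 1); q = 1 - p
D = {"0", "10", "110", "111"}   # a complete dictionary (parsing-tree leaves)

def prob(alpha, p):
    """P(alpha) for an i.i.d. Bernoulli(p) source."""
    return p ** alpha.count("1") * (1 - p) ** alpha.count("0")

EL = sum(prob(a, p) * len(a) for a in D)   # E L = sum of P(alpha)|alpha|

N = 200_000
bits = "".join("1" if random.random() < p else "0" for _ in range(N))

K, i = 0, 0                     # K counts phrases covering the first N bits
while i < N:
    for j in (1, 2, 3):         # the phrase lengths occurring in D
        if bits[i:i + j] in D:
            i += j
            K += 1
            break
    else:                       # an incomplete phrase remains at the end
        K += 1
        break

print(K / N, 1 / EL)            # the two values should nearly agree
```

With these parameters E L = q + 2pq + 3p²q + 3p³ = 1.96, so both printed numbers are close to 0.51.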
For simplicity, we consider the ratio κ := lg M / E L, and call it the compression rate. (One objective of the code is to make this ratio small.)

In Khodak's construction of such a dictionary, we fix a threshold r ∈ (0, 1); the internal nodes of the parsing tree are the strings α = α_1 ··· α_k with P(α) ≥ r; the external nodes are thus the strings α such that P(α) < r but the parent, α′ say, has P(α′) ≥ r. The phrases in the Khodak code are the external nodes in this tree. For convenience, we let R = 1/r >
1. Let M = M(R) be the number of phrases in the Khodak code.

In Tunstall's construction, we are instead given a number M. We start with the empty phrase and then iteratively, M − 1 times, replace the phrase α having maximal P(α) by its two children α0 and α1. A Khodak code with threshold r is thus a Tunstall code with M = M(R). Conversely, a Tunstall code is almost a Khodak code, with r chosen as the smallest P(α) for a proper prefix α of a phrase; the difference is that Tunstall's construction handles ties more flexibly; there may be some phrases too with P(α) = r. Thus, Tunstall's construction may give any desired number M of phrases, while Khodak's does not. We will see that in the non-arithmetic case, this difference is asymptotically negligible, while it is important in the arithmetic case. (This is very obvious if p = q = 1/
2, when Khodak's code always gives a dictionary size M that is a power of 2.)

Let us first consider the number of phrases, M = M(R), in Khodak's construction with a threshold r = 1/R. This is a purely deterministic problem, but we may nevertheless apply our probabilistic renewal theory arguments. In fact, M, the number of leaves in the parsing tree, equals 1 + the number of internal nodes. Thus, M = 1 + Σ_α f(RP(α)) with f(x) := [x ≥ 1], summing over all finite strings α.
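The identity M = 1 + Σ_α f(RP(α)) can also be checked by brute force. The following sketch (mine, not from the paper; p = 0.6 is an assumed example with ln p/ln q irrational) expands exactly the strings α with P(α) ≥ 1/R, counts them, and compares M(R)/R with the renewal limit 1/H of Theorem 9.1(i) below.

```python
import math

p, q = 0.6, 0.4                 # assumed example source probabilities
H = -(p * math.log(p) + q * math.log(q))   # entropy of one letter

def khodak_size(R):
    """M(R): number of external nodes (phrases) of the tree whose
    internal nodes are exactly the strings alpha with P(alpha) >= 1/R."""
    r = 1.0 / R
    internal = 0
    stack = [1.0]               # P(empty string) = 1
    while stack:
        pr = stack.pop()
        if pr >= r:             # internal node: expand both children
            internal += 1
            stack.append(pr * p)
            stack.append(pr * q)
    return internal + 1         # M = 1 + number of internal nodes

print(khodak_size(2))           # tiny case: the leaves are {0, 10, 11}
for R in (10**2, 10**4, 10**6):
    print(R, khodak_size(R) / R)   # ratio approaches 1/H here
```

For R = 2 the code returns 3, matching the dictionary {0, 10, 11}; for large R the ratio M(R)/R settles near 1/H ≈ 1.486.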
Theorem 9.1.
Consider the Khodak code with threshold r = 1/R.
(i) If ln p/ln q is irrational, then, as R → ∞, M(R)/R → 1/H.
(ii) If ln p/ln q is rational, then, as R → ∞,
M(R)/R = (1/H) · (d/(1 − e^{−d})) e^{−d{(ln R)/d}} + o(1).

Proof.
The non-arithmetic case follows directly from Theorem 5.1(i), since ∫_0^∞ f(x) x^{−2} dx = ∫_1^∞ x^{−2} dx = 1. In the arithmetic case, we use (5.8). Since g(t) = e^t [t ≤ 0],
ψ(t) = d Σ_{kd ≤ t} e^{kd−t} = (d/(1 − e^{−d})) e^{d⌊t/d⌋−t} = (d/(1 − e^{−d})) e^{−d{t/d}}. □

Remark 9.2.
In the arithmetic case (ii), ln P(α) is a multiple of d for any string α. Hence M(R) jumps only when R ∈ {e^{kd} : k ≥ 0}, and it suffices to consider such R. For these R, the result can be written
M(R) ∼ (d/(H(1 − e^{−d}))) R, ln R ∈ dZ. (9.2)

Next, consider the length L of a random phrase. We will use the notation L^T_M for a Tunstall code with M phrases and L^K_R for a Khodak code with threshold r = 1/R.

Consider first the Khodak code. By construction, given a random string Ξ = ξ_1ξ_2 ···, the first phrase in it is ξ_1 ··· ξ_n where n is the smallest integer such that P(ξ_1 ··· ξ_n) = e^{−S_n} < r = e^{−ln R}. Hence, by (2.8),
L^K_R = ν(ln R). (9.3)
Hence, Theorems A.1–A.3 immediately yield the following (as well as convergence of higher moments).

Theorem 9.3.
For the Khodak code, the following holds as R → ∞, with σ² := H_2 − H² = pq ln²(p/q):
L^K_R / ln R → 1/H a.s., (9.4)
L^K_R ∼ AsN(ln R/H, σ² ln R/H³), (9.5)
Var L^K_R ∼ (σ²/H³) ln R. (9.6)
If ln p/ln q is irrational, then
E L^K_R = ln R/H + H_2/(2H²) + o(1). (9.7)
If ln p/ln q is rational, then, with d := gcd(ln p, ln q) given by (2.6),
E L^K_R = ln R/H + H_2/(2H²) + (d/H)(1/2 − {ln R/d}) + o(1). (9.8)

In the arithmetic case, as said in Remark 9.2, it suffices to consider thresholds such that −ln r = ln R is a multiple of d; in this case (9.8) becomes
E L^K_R = ln R/H + H_2/(2H²) + d/(2H) + o(1). (9.9)

We analyze the Tunstall code by comparing it to the Khodak code. Thus, suppose that M is given, and increase R (decrease r) until we find a Khodak code with M(R) ≥ M phrases. (By our definitions, M(R) is right-continuous, so a smallest such R exists.) Let M_+ := M(R) ≥ M and M_− := M(R−) < M. Thus, there are M_+ − 1 strings α with P(α) ≥ r = R^{−1}, and M_− − 1 with P(α) > r; consequently there are M_+ − M_− strings with P(α) = r. The strings with P(α) = r are not parsing phrases in the Khodak code (while all their children are), but we use some of them in the Tunstall code to achieve exactly M parsing phrases. Since each of these strings replaces two parsing phrases in the Khodak code, the total number of parsing phrases decreases by 1 for each used string with P(α) = r, and thus the Tunstall code uses M(R) − M = M_+ − M parsing phrases with P(α) = r. The length L^T_M of a random phrase, realized as the first phrase in Ξ, equals L^K_R unless Ξ begins with one of the phrases α in the Tunstall code with P(α) = r, in which case L^T_M = L^K_R −
1. The probability of the latter event is evidently P(α) = r for each such α, and is thus (M(R) − M)r. Consequently, with R as above,
L^T_M = L^K_R − ∆_M, (9.10)
where ∆_M ∈ {0, 1} and P(∆_M = 1) = (M(R) − M)/R. We can now find the results for L^T_M:

Theorem 9.4.
For the Tunstall code, the following holds as M → ∞, with σ² := H_2 − H² = pq ln²(p/q):
L^T_M / ln M → 1/H a.s., (9.11)
L^T_M ∼ AsN(ln M/H, σ² ln M/H³), (9.12)
Var L^T_M ∼ (σ²/H³) ln M. (9.13)
If ln p/ln q is irrational, then
E L^T_M = ln M/H + ln H/H + H_2/(2H²) + o(1). (9.14)
If ln p/ln q is rational, then, with d := gcd(ln p, ln q) given by (2.6),
E L^T_M = ln M/H + ln H/H + H_2/(2H²) + (1/H) ln(sinh(d/2)/(d/2)) + (d/H) ψ({(ln M + ln(H(1 − e^{−d})/d))/d}) + o(1), (9.15)
where
ψ(x) := (e^{dx} − 1)/(e^d − 1) − x. (9.16)

Note that ψ is continuous, with ψ(0) = ψ(1) = 0. ψ is convex and thus ψ ≤ 0. For example, p = q = 1/2 gives d = H = ln 2 and ψ(x) = 2^x − 1 − x, with minimum value ≈ −0.0861.

Proof.
Let as above R be the smallest number with M(R) ≥ M; thus M(R) ≥ M > M(R−). By Theorem 9.1, ln R = ln M + O(1), so (9.11)–(9.13) follow from (9.4)–(9.6) and the fact that |L^T_M − L^K_R| ≤
1, see (9.10).
If ln p/ln q is irrational, Theorem 9.1 yields M(R)/R → 1/H, and thus also M(R−)/R → 1/H. Since M(R) ≥ M > M(R−), also
M/R → 1/H, (9.17)
and further M(R)/M →
1. Consequently,
E ∆_M = (M(R) − M)/R = (M(R)/M − 1)(M/R) → 0,
and thus, by (9.10), E L^T_M = E L^K_R − E ∆_M = E L^K_R + o(1). Since also, by (9.17) again, ln R = ln M + ln H + o(1), (9.14) follows from (9.7).

In the case when ln p/ln q is rational, we argue similarly, but we have to be more careful. First, necessarily R = e^{Nd} for some integer N, see Remark 9.2. Further, (9.2) applies. Let, for convenience,
β := H(1 − e^{−d})/d = H (sinh(d/2)/(d/2)) e^{−d/2}; (9.18)
thus (9.2) can be written M(R) ∼ β^{−1}R as R → ∞. Let
x_0 := (1/d) ln(βM) − N + 1 = (1/d) ln(βM/R) + 1. (9.19)
Then, by these definitions and (9.2),
M = β^{−1} e^{d(N−1+x_0)}, (9.20)
M(R) = β^{−1} R (1 + o(1)) = β^{−1} e^{dN} (1 + o(1)), (9.21)
M(R−) = M(Re^{−d}) = β^{−1} (Re^{−d}) (1 + o(1)) = β^{−1} e^{d(N−1)} (1 + o(1)). (9.22)
Since M(R−) < M ≤ M(R), we see that o(1) ≤ x_0 ≤ 1 + o(1). We define also, using (9.20),
x := {ln(βM)/d} = {x_0}. (9.23)
Typically, 0 ≤ x_0 <
1, and then x = x_0, but it may happen that x_0 is slightly below 0 and x = x_0 + 1, or that x_0 is slightly above 1 and then x = x_0 − 1.

By (9.19), ln R = ln(βM) + d(1 − x_0), and thus (9.9) yields, using (9.18),
E L^K_R = ln(βM)/H + H_2/(2H²) + d/(2H) + (d/H)(1 − x_0) + o(1)
= ln M/H + ln H/H + (1/H) ln(sinh(d/2)/(d/2)) + H_2/(2H²) + (d/H)(1 − x_0) + o(1).
Furthermore, by R = e^{dN}, (9.20), (9.21) and (9.18),
E ∆_M = (M(R) − M)/R = β^{−1}(1 − e^{d(x_0−1)}) + o(1) = (d/H)(1 − e^{x_0 d − d})/(1 − e^{−d}) + o(1) = (d/H)(1 − (e^{x_0 d} − 1)/(e^d − 1)) + o(1).
Combining these, we find by (9.10) and (9.16),
E L^T_M = E L^K_R − E ∆_M = ln M/H + ln H/H + (1/H) ln(sinh(d/2)/(d/2)) + H_2/(2H²) + (d/H) ψ(x_0) + o(1).
This is almost (9.15), except that there ψ(x_0) is replaced by ψ(x) = ψ({ln(βM)/d}), see (9.23). However, as noted above, x ≠ x_0 can happen only when one of x and x_0 is o(1) and the other is 1 + o(1). Since the function ψ is continuous and ψ(0) = ψ(1), we see that in this case ψ(x) − ψ(x_0) = ±(ψ(1) − ψ(0)) + o(1) = o(1). Hence, ψ(x) = ψ(x_0) + o(1) in all cases, and (9.15) follows. □

Remark 9.5.
We have chosen to derive Theorem 9.4 from the corresponding result Theorem 9.3 for the Khodak code. An alternative is to note that in the Tunstall code, we obtain the random phrase length L^T_M by stopping Ξ at M_+ − M of the M_+ − M_− strings α with P(α) = r, and at all strings with smaller P(α). By symmetry, we obtain the same distribution of the length if we stop randomly with probability (M_+ − M)/(M_+ − M_−) whenever P(α) = e^{−S_n} = r; equivalently, we stop when e^{−S_n−X_0} < r, where X_0 is a random variable, independent of Ξ, with values 0 and ε, for some very small positive ε = ε(M), and P(X_0 = ε) = (M_+ − M)/(M_+ − M_−). Consequently, we have L^T_M =_d ν̂(ln R), with R and X_0 as above, and we can apply Theorems A.1–A.3 (and Remark A.4) directly.

Corollary 9.6.
The compression rate for the Tunstall code is
κ := lg M / E L^T_M = (H/ln 2)(1 − (ln H + H_2/(2H) + δ)/ln M + o((ln M)^{−1})),
where δ = 0 when ln p/ln q is irrational, while when ln p/ln q is rational,
δ := ln(sinh(d/2)/(d/2)) + d ψ({(ln M + ln(H(1 − e^{−d})/d))/d}),
with d given by (2.6) and ψ by (9.16).
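The leading term H/ln 2 of the compression rate can be checked numerically. The following sketch is my own (p = 0.6 and the thresholds are assumptions for the example): it builds the Khodak dictionary for threshold 1/R, computes M and E L exactly by traversing the parsing tree, and compares κ = lg M / E L with H/ln 2.

```python
import math

p, q = 0.6, 0.4                 # assumed example source probabilities
H = -(p * math.log(p) + q * math.log(q))

def khodak_stats(R):
    """Return (M, EL) for the Khodak dictionary with threshold 1/R:
    the number of phrases and the exact mean phrase length."""
    r = 1.0 / R
    M, EL = 0, 0.0
    stack = [(1.0, 0)]          # (P(alpha), |alpha|) for the empty string
    while stack:
        pr, depth = stack.pop()
        if pr >= r:             # internal node: expand both children
            stack.append((pr * p, depth + 1))
            stack.append((pr * q, depth + 1))
        else:                   # external node: a phrase of the code
            M += 1
            EL += pr * depth    # E L = sum of P(alpha) |alpha|
    return M, EL

for R in (10**2, 10**4, 10**6):
    M, EL = khodak_stats(R)
    print(R, math.log2(M) / EL, H / math.log(2))
```

As R grows, the computed ratio lg M / E L approaches H/ln 2 ≈ 0.971 for this source, with the slow O(1/ln M) correction visible at moderate R.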
For the Khodak code, the compression rate lg(M(R))/E L^K_R is asymptotically given by the same formula, with ln M replaced by ln R, except that the ψ term does not appear in δ.

The reason that the ψ term does not appear for the Khodak code is that L^K_R = L^T_{M(R)}, and in the arithmetic case, we may assume that R = e^{Nd}; then for L^T_{M(R)}, the argument x of ψ is {ln(βM(R))/d} = {ln R/d + o(1)} = {N + o(1)} and thus close to 0 or 1, where ψ vanishes.

10. A stopped random walk
Drmota and Szpankowski [9] consider (motivated by the study of Tunstall and Khodak codes) walks in a region in the first quadrant bounded by two crossing lines. Their first result, on the number of possible paths, seems to require a longer comment, and will not be considered here. Their second result is about a random walk in the plane taking only unit steps north or east, which is stopped when it exits the region; the probability of an east step is p each time. Coding steps east by 1 and north by 0, this is the same as taking our random string Ξ. Drmota and Szpankowski [9] study, in our notation, the exit time
D_{K,V} := min{n : n > K or S_n > V ln 2}
for given numbers K and V, with K an integer. We thus have
D_{K,V} = (K + 1) ∧ ν(V ln 2). (10.1)
We have here kept the notations K and V from [9], but for convenience we write Ṽ := V ln 2 in the sequel. We assume p ≠ q, since otherwise D_{K,V} = (K ∧ ⌊V⌋) + 1 is deterministic.

We need a little more notation. Let as usual φ(x) := (2π)^{−1/2} e^{−x²/2} and Φ(x) := ∫_{−∞}^x φ(y) dy be the density and distribution functions of the standard normal distribution. Further, let
Ψ(x) := ∫_{−∞}^x Φ(y) dy = xΦ(x) + φ(x). (10.2)
This definition is motivated by the following lemma.

Lemma 10.1. If Z ∼ N(0, 1), then for every real t, E(Z ∨ t) = Ψ(t) and E(Z ∧ t) = −Ψ(−t). Further, Ψ(t) − Ψ(−t) = t.

Proof. Since E Z = 0,
E(Z ∨ t) = E(Z ∨ t − Z) = ∫_0^∞ P(Z ∨ t − Z > x) dx = ∫_0^∞ Φ(t − x) dx = Ψ(t).
Further, since −Z =_d Z, −E(Z ∧ t) = E((−Z) ∨ (−t)) = E(Z ∨ (−t)) = Ψ(−t). Finally, Ψ(t) − Ψ(−t) = E((Z ∨ t) + (Z ∧ t)) = E(Z + t) = t. (This also follows from (10.2) and Φ(t) + Φ(−t) = 1, φ(−t) = φ(t).) □

We can now state our version of the result by Drmota and Szpankowski [9]. We do not obtain as sharp error estimates as they do (although our bounds easily can be improved when |K − Ṽ/H| is large enough).
On the other hand, our result is more general and includes the transition region when V ln 2 ≈ KH and both stopping conditions are important.

Theorem 10.2.
Suppose that p ≠ q and that V, K → ∞. Let Ṽ := V ln 2 and σ̃² := (H_2 − H²)/H³ > 0.
(i) If (K − Ṽ/H)/√Ṽ → +∞, then D_{K,V} is asymptotically normal:
D_{K,V} ∼ AsN(Ṽ/H, σ̃²Ṽ). (10.3)
Further, Var(D_{K,V}) ∼ σ̃²Ṽ.
(ii) If (K − Ṽ/H)/√Ṽ → −∞, then D_{K,V} is asymptotically degenerate:
P(D_{K,V} = K + 1) → 1. (10.4)
Further, Var D_{K,V} = o(Ṽ).
(iii) If (K − Ṽ/H)/√Ṽ → a ∈ (−∞, +∞), then D_{K,V} is asymptotically a truncated normal:
Ṽ^{−1/2}(D_{K,V} − Ṽ/H) →_d (σ̃Z) ∧ a = σ̃(Z ∧ (a/σ̃)), (10.5)
with Z ∼ N(0, 1). Further, Var(D_{K,V}) ∼ Ṽ Var(σ̃Z ∧ a) = Ṽσ̃² Var(Z ∧ (a/σ̃)).
(iv) In every case,
E D_{K,V} = Ṽ/H − σ̃√Ṽ Ψ((Ṽ/H − K)/(σ̃√Ṽ)) + o(√Ṽ) (10.6)
= K − σ̃√Ṽ Ψ((K − Ṽ/H)/(σ̃√Ṽ)) + o(√Ṽ). (10.7)
(v) If (K − Ṽ/H)/√Ṽ ≥ ln Ṽ, then
E D_{K,V} = Ṽ/H + H_2/(2H²) + ψ(Ṽ) + o(1), (10.8)
where ψ = 0 in the non-arithmetic case and ψ(t) = (d/H)(1/2 − {t/d}) in the d-arithmetic case.
(vi) If (K − Ṽ/H)/√Ṽ ≤ −ln Ṽ, then
E D_{K,V} = K + 1 + o(1). (10.9)

Proof.
Let
D̃ := (D_{K,V} − Ṽ/H)/√Ṽ, ν̃ := (ν(Ṽ) − Ṽ/H)/√Ṽ, K̃ := (K − Ṽ/H)/√Ṽ, K̃′ := (K + 1 − Ṽ/H)/√Ṽ = K̃ + o(1).
Thus, by (10.1), D̃ = ν̃ ∧ K̃′. By Theorem A.3,
ν̃ = (ν(Ṽ) − Ṽ/H)/√Ṽ →_d N(0, σ̃²). (10.10)
The results on convergence in distribution in (i)–(iii) follow immediately, noting that in (i), w.h.p. ν(Ṽ) < K + 1 and thus D_{K,V} = ν(Ṽ); in (iii) we use e.g. the continuous mapping theorem on ∧ : R² → R.

For (iv), note first that the two expressions in (10.6) and (10.7) are the same by Lemma 10.1. We may by considering subsequences assume that one of the cases (i)–(iii) occurs.

Next, (A.9) can be written E(ν̃²) → σ̃², which together with (10.10) implies that ν̃² is uniformly integrable. (See e.g. [12, Theorem 5.5.9].) In case (iii), when K̃ → a, this implies that D̃² = (ν̃ ∧ K̃′)² also is uniformly integrable, and thus the convergence in distribution already proved for (iii) implies E D̃ → E(σ̃(Z ∧ (a/σ̃))) = −σ̃Ψ(−a/σ̃), which yields (10.6) when K̃ → a ∈ R; further, the uniform square integrability of D̃ implies Var D̃ → Var(σ̃Z ∧ a) as asserted in (iii).

If instead K̃ → +∞, case (i), we may assume K̃′ > 0; then D̃² = (ν̃ ∧ K̃′)² ≤ ν̃², and thus D̃² is uniformly integrable in this case too. Hence (10.3) implies both Var(D̃) → σ̃², or equivalently Var D_{K,V} ∼ σ̃²Ṽ as asserted in (i), and E D̃ → 0, which yields (10.6) in this case because Ψ(−K̃/σ̃) → 0.

If K̃ → −∞, case (ii), we may assume that K̃′ < 0; then K̃′ − D̃ = (K̃′ − ν̃)⁺ ≤ |ν̃| is uniformly square integrable, and K̃′ − D̃ →_p 0. Hence K̃′ − E D̃ = E(K̃′ − D̃) → 0, and thus (10.7) holds, since Ψ(K̃/σ̃) → 0 and thus σ̃√Ṽ Ψ(K̃/σ̃) = o(√Ṽ). Further, Var D̃ = Var(K̃′ − D̃) → 0, which yields Var D_{K,V} = o(Ṽ).

This completes the proof of (iv).

For (v), we have D_{K,V} ≤ ν(Ṽ) and thus, by the Cauchy–Schwarz inequality and Theorem A.1,
E|D_{K,V} − ν(Ṽ)| ≤ E(ν(Ṽ)[D_{K,V} ≠ ν(Ṽ)]) ≤ (E ν(Ṽ)²)^{1/2} P(D_{K,V} ≠ ν(Ṽ))^{1/2} = O(Ṽ) P(D_{K,V} ≠ ν(Ṽ))^{1/2}. (10.11)
For K̃ ≥ ln Ṽ, Chernoff's bound [20, Theorem 2.1] implies, because S_{K+1} is a linear transformation of a binomial Bi(K + 1, p) random variable,
P(D_{K,V} ≠ ν(Ṽ)) = P(ν(Ṽ) > K + 1) = P(S_{K+1} ≤ Ṽ) = P(S_{K+1} − E S_{K+1} ≤ −H K̃′√Ṽ) ≤ exp(−c₁ K̃²Ṽ/(K + 1 + K̃√Ṽ)) ≤ exp(−c₂ ln²Ṽ)
for some c₁, c₂ > 0 (depending on p); the last inequality is perhaps most easily seen by considering the cases K + 1 ≤ 2Ṽ/H (when K + 1 ≍ Ṽ) and K + 1 > 2Ṽ/H (when K̃ ≍ K/√Ṽ) separately. Hence, the right-hand side of (10.11) tends to 0, and thus E D_{K,V} = E ν(Ṽ) + o(1). Consequently, (v) follows from the formulas (A.3) and (A.5) for E ν(Ṽ) provided by Theorem A.2.

The argument for (vi) is very similar. The Chernoff bound for S_K implies
P(D_{K,V} ≠ K + 1) = P(ν(Ṽ) ≤ K) = P(S_K > Ṽ) ≤ exp(−c₃ ln²Ṽ),
and the Cauchy–Schwarz inequality then implies E|K + 1 − D_{K,V}| = o(1), proving (vi). □

Appendix A. Some renewal theory
For the readers' (and our own) convenience, we collect here a few standard results from renewal theory, sometimes in less standard versions. See e.g. Asmussen [1], Feller [11] or Gut [13] for further details.

We suppose that X_1, X_2, ... is an i.i.d. sequence of non-negative random variables with finite mean μ := E X_1 > 0, and that S_n := Σ_{i=1}^n X_i. Moreover, we suppose that X_0 is independent of X_1, X_2, ... (but X_0 may have a different distribution, and is not necessarily positive) and define Ŝ_n := Σ_{i=0}^n X_i = S_n + X_0. We further define the first passage times ν(t) and ν̂(t) by (2.8) and (2.13) and the renewal function U by (2.10). (Recall that ν is a special case of ν̂ with X_0 = 0. Hence the results stated below for ν̂ hold for ν too.)

For some theorems, we have to distinguish between the arithmetic (lattice) and non-arithmetic (non-lattice) cases, in general defined as follows:

arithmetic (lattice): There is a positive real number d such that X_1/d always is an integer. We let d be the largest such number and say that X_1 is d-arithmetic. (This maximal d is called the span of the distribution.)

non-arithmetic (non-lattice): No such d exists. (Then X_1 is not supported on any proper closed subgroup of R.)

Theorem A.1. As t → ∞,
ν̂(t)/t → 1/μ a.s. (A.1)
If further 0 < r < ∞ and E|X_0|^r < ∞, then ν̂(t)/t → μ^{−1} in L^r, i.e., E|ν̂(t)/t − μ^{−1}|^r → 0, and thus
E(ν̂(t)/t)^r → μ^{−r}. (A.2)

Proof.
See e.g. Gut [13, Theorem 2.5.1] for the case X_0 = 0; the general case follows by essentially the same proof. □

Theorem A.2.
Suppose that E X_1² < ∞ and E|X_0| < ∞.
(i) If the distribution of X_1 is non-arithmetic, then, as t → ∞,
E ν(t) = t/μ + E X_1²/(2μ²) + o(1) (A.3)
and, more generally,
E ν̂(t) = t/μ + E X_1²/(2μ²) − E X_0/μ + o(1). (A.4)
(ii) If the distribution of X_1 is d-arithmetic, then, as t → ∞,
E ν(t) = t/μ + E X_1²/(2μ²) + (d/μ)(1/2 − {t/d}) + o(1) (A.5)
and, more generally,
E ν̂(t) = t/μ + E X_1²/(2μ²) + (d/μ)(1/2 − E{(t − X_0)/d}) − E X_0/μ + o(1). (A.6)

Proof.
See e.g. Gut [13, Theorem 2.5.2] for the case X_0 = 0; the general case follows easily by conditioning on X_0. In the arithmetic case, note that ν̂(t) = ν(t − X_0) = ν(⌊(t − X_0)/d⌋ d) and use E(⌊(t − X_0)/d⌋ d) = t − E X_0 − d E{(t − X_0)/d}. □

Theorem A.3.
Assume that σ² := Var X_1 < ∞. Then, as t → ∞,
(ν̂(t) − t/μ)/√t →_d N(0, σ²/μ³). (A.7)
If further σ² > 0, this can be written ν̂(t) ∼ AsN(μ^{−1}t, σ²μ^{−3}t). Moreover, if also E X_0² < ∞, then
Var(ν̂(t)) = (σ²/μ³) t + o(t) (A.8)
and
E(ν̂(t) − t/μ)² = (σ²/μ³) t + o(t). (A.9)

Proof.
See e.g. Gut [13, Theorem 2.5.2] for the case X_0 = 0, noting that (A.8) and (A.9) are equivalent because E ν̂(t) − t/μ = O(1) by Theorem A.2; again, the case with a general X_0 is similar, or follows by conditioning on X_0. The case σ² = 0 is trivial. □

Remark A.4.
We can allow X_0 = X_0^{(n)} to depend on n in Theorems A.1–A.3 provided a.s. convergence is weakened to convergence in probability in (A.1) and we add the following uniformity assumptions: X_0^{(n)} is tight; for L^r convergence and (A.2) we further assume that sup_n E|X_0^{(n)}|^r < ∞; for Theorem A.2 we assume that the X_0^{(n)} are uniformly integrable; for (A.8) and (A.9) we assume that sup_n E|X_0^{(n)}|² < ∞.

For the evaluation of (A.6) when X_0 is non-trivial, we note the following formula.

Lemma A.5.
Suppose that X has a continuous distribution with finite mean, and a characteristic function φ(t) := E e^{itX} that satisfies φ(t) = O(|t|^{−δ}) for some δ > 0. Then, for any real u,
E{X + u} = 1/2 − Σ_{n≠0} (φ(2πn)/(2πni)) e^{2πinu}.

Proof.
Let X_u := ⌊X + u⌋ − u + 1. Then {X + u} = X − X_u + 1, and the result follows from the formula for E X_u in [19, Theorem 2.3]. □

For the next theorem (known as the key renewal theorem), we say that a function f ≥ 0 on (−∞, ∞) is directly Riemann integrable if the upper and lower Riemann sums Σ_{k=−∞}^∞ h sup_{[(k−1)h, kh)} f and Σ_{k=−∞}^∞ h inf_{[(k−1)h, kh)} f are finite and converge to the same limit as h →
0. (See further Feller [11, Section XI.1]; Feller considers functions on [0, ∞), but this makes no difference.) For most purposes, the following sufficient condition suffices. (Usually, one can take F = f.)

Lemma A.6.
Suppose that f is a non-negative function on (−∞, ∞). If f is bounded and a.e. continuous, and there exists an integrable function F with 0 ≤ f ≤ F such that F is non-decreasing on (−∞, −A) and non-increasing on (A, ∞) for some A, then f is directly Riemann integrable.

Sketch of proof. It is well-known that boundedness and a.e. continuity imply Riemann integrability on any finite interval [−B, B]. Using the dominating function F, one sees that the tails of the Riemann sums coming from intervals [(k − 1)h, kh) outside [−B, B] can be made arbitrarily small, uniformly in h ∈ (0, 1), by choosing B large. □

Theorem A.7.
Let f be any non-negative directly Riemann integrable function on (−∞, ∞).
(i) If the distribution of X_1 is non-arithmetic, then, as t → ∞,
∫_0^∞ f(s − t) dU(s) → (1/μ) ∫_{−∞}^∞ f(s) ds, (A.10)
∫_0^∞ f(t − s) dU(s) → (1/μ) ∫_{−∞}^∞ f(s) ds. (A.11)
(ii) If the distribution of X_1 is d-arithmetic, then, as t → ∞,
∫_0^∞ f(s − t) dU(s) = (1/μ) ψ(t) + o(1), (A.12)
∫_0^∞ f(t − s) dU(s) = (1/μ) ψ(−t) + o(1), (A.13)
where ψ(t) is the bounded d-periodic function
ψ(t) := d Σ_{k=−∞}^∞ f(kd − t); (A.14)
ψ has the Fourier series
ψ(t) ∼ Σ_{m=−∞}^∞ ψ̂(m) e^{2πimt/d} (A.15)
with
ψ̂(m) = f̂(−2πm/d) = ∫_{−∞}^∞ e^{2πimt/d} f(t) dt. (A.16)
In particular, the average of ψ is ψ̂(0) = ∫_{−∞}^∞ f. The series (A.14) converges uniformly on [0, d]; thus ψ is continuous if f is. Further, if f is sufficiently smooth (an integrable second derivative is enough), then the Fourier series (A.15) converges uniformly.

Proof.
The two formulas (A.10) and (A.11) are equivalent by the substitution f(x) → f(−x). The theorem is usually stated in the form (A.11) for functions f supported on [0, ∞); then the integral is ∫_0^t f(t − s) dU(s). However, the proof in Feller [11, Section XI.1] applies to the more general form above as well. (The proof is based on approximations with step functions and the special case when f(x) is an indicator function of an interval; the latter case is known as Blackwell's renewal theorem.) In fact, a substantially more general version of (A.11), where also the increments X_k may take negative values, is given in [2, Theorem 4.2].

Part (ii) follows similarly (and more easily) from the fact that the measure dU is concentrated on {kd : k ≥ 0}, and thus ∫_0^∞ f(s − t) dU(s) − (1/μ)ψ(t) = Σ_{k=−∞}^∞ f(kd − t)(dU{kd} − d/μ), together with the renewal theorem dU{kd} − d/μ → 0 as k → ∞. The Fourier coefficient calculation in (A.16) is straightforward and standard. □

Finally, we consider a situation where we are given also another sequence Y_1, Y_2, ... of random variables such that the pairs (X_i, Y_i), i ≥
1, are i.i.d., while Y_i and X_i may be (and typically are) dependent on each other. (Y_i need not be positive.) We denote the means by μ_X := E X_1 and μ_Y := E Y_1; thus μ_X = μ in the earlier notation, and we assume as above that 0 < μ_X < ∞. We also suppose that X_0 is independent of all (X_i, Y_i), i ≥ 1, and define V_n := Σ_{i=1}^n Y_i.

Theorem A.8.
Suppose that σ_X² := Var X_1 < ∞ and σ_Y² := Var Y_1 < ∞, and let σ̂² := Var(μ_X Y_1 − μ_Y X_1). Then
(V_{ν̂(t)} − (μ_Y/μ_X) t)/√t →_d N(0, σ̂²/μ_X³).
If σ̂² > 0, this can also be written as
V_{ν̂(t)} ∼ AsN((μ_Y/μ_X) t, (σ̂²/μ_X³) t).
Note that the special case Y_i = 1 yields (A.7).

Proof.
For X_0 = 0, and thus ν̂(t) = ν(t), this is Gut [13, Theorem 4.2.3]. The general case follows by the same proof, or by conditioning on X_0. □

Remark A.9.
Again, we can allow X_0 = X_0^{(n)} to depend on n, as long as the X_0^{(n)} are tight.

References

[1] S. Asmussen,
Applied Probability and Queues. Wiley, Chichester, 1987.
[2] K. B. Athreya, D. McDonald & P. Ney, Limit theorems for semi-Markov processes and renewal theory for Markov chains.
Ann. Probab. (1978), no. 5, 788–797.
[3] J. Bourdon, Size and path length of Patricia tries: dynamical sources context. Random Structures Algorithms (2001), no. 3–4, 289–315.
[4] C. Christophi & H. Mahmoud, On climbing tries. Probab. Engrg. Inform. Sci. (2008), no. 1, 133–149.
[5] J. Clément, P. Flajolet & B. Vallée, Dynamical sources in information theory: a general analysis of trie structures. Algorithmica (2001), no. 1–2, 307–369.
[6] F. Dennert & R. Grübel, Renewals for exponentially increasing lifetimes, with an application to digital search trees. Ann. Appl. Probab. (2007), no. 2, 676–687.
[7] M. Drmota, Y. Reznik, S. Savari & W. Szpankowski, Precise asymptotic analysis of the Tunstall code. Proc. 2006 International Symposium on Information Theory (Seattle, 2006), 2334–2337.
[8] M. Drmota, Y. A. Reznik & W. Szpankowski, Tunstall code, Khodak variations, and random walks. Preprint, 2009.
[9] M. Drmota & W. Szpankowski, On the exit time of a random walk with positive drift.
Proceedings, 2007 Conference on Analysis of Algorithms, AofA 07 (Juan-les-Pins, 2007), Discrete Math. Theor. Comput. Sci. Proc. AH (2007), 291–302.
[10] G. Fayolle, P. Flajolet, M. Hofri & P. Jacquet, Analysis of a stack algorithm for random multiple-access communication. IEEE Trans. Inform. Theory (1985), no. 2, 244–254.
[11] W. Feller, An Introduction to Probability Theory and its Applications, Volume II. 2nd ed., Wiley, New York, 1971.
[12] A. Gut,
Probability: A Graduate Course. Springer, New York, 2005.
[13] A. Gut, Stopped Random Walks. 2nd ed., Springer, New York, 2009.
[14] P. Jacquet & M. Régnier, Trie partitioning process: limiting distributions.
CAAP '86 (Nice, 1986), 196–210, Lecture Notes in Comput. Sci., 214, Springer, Berlin, 1986.
[15] P. Jacquet & M. Régnier, Normal limiting distribution of the size of tries.
Performance '87 (Brussels, 1987), 209–223, North-Holland, Amsterdam, 1988.
[16] P. Jacquet & M. Régnier, New results on the size of tries.
IEEE Trans. Inform. Theory (1989), no. 1, 203–205.
[17] P. Jacquet & W. Szpankowski, Analysis of digital tries with Markovian dependency. IEEE Trans. Inform. Theory (1991), no. 5, 1470–1475.
[18] S. Janson, One-sided interval trees. J. Iranian Statistical Society (2004), no. 2, 149–164.
[19] S. Janson, Rounding of continuous random variables and oscillatory asymptotics. Ann. Probab. (2006), no. 5, 1807–1826.
[20] S. Janson, T. Łuczak & A. Ruciński, Random Graphs. Wiley, New York, 2000.
[21] H. Kesten, Renewal theory for functionals of a Markov chain with general state space.
Ann. Probab. (1974), 355–386.
[22] D. E. Knuth, The Art of Computer Programming. Vol. 3: Sorting and Searching. 2nd ed., Addison-Wesley, Reading, Mass., 1998.
[23] M. R. Leadbetter, G. Lindgren & H. Rootzén,
Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[24] G. Louchard & H. Prodinger, Asymptotics of the moments of extreme-value related distribution functions.
Algorithmica (2006), no. 3–4, 431–467.
[25] H. Mahmoud, Evolution of Random Search Trees. Wiley, New York, 1992.
[26] H. Mahmoud, Imbalance in random digital trees.
Methodol. Comput. Appl. Probab. (2009), no. 2, 231–247.
[27] B. Pittel, Asymptotical growth of a class of random trees. Ann. Probab. (1985), no. 2, 414–427.
[28] B. Pittel, Paths in a random digital tree: limiting distributions. Adv. in Appl. Probab. (1986), no. 1, 139–155.
[29] B. Rais, P. Jacquet & W. Szpankowski, Limiting distribution for the depth in PATRICIA tries. SIAM J. Discrete Math. (1993), no. 2, 197–213.
[30] M. Régnier, Trie hashing analysis. Proc. Fourth Int. Conf. Data Engineering (Los Angeles, 1988), IEEE, 1988, pp. 377–387.
[31] W. Szpankowski, Patricia tries again revisited.
J. Assoc. Comput. Mach. (1990), no. 4, 691–711.
[32] W. Szpankowski, Average Case Analysis of Algorithms on Sequences. Wiley, New York, 2001.
[33] W. Szpankowski, Average redundancy for known sources: ubiquitous trees in source coding.
Proceedings, Fifth Colloquium on Mathematics and Computer Science (Blaubeuren, 2008), Discrete Math. Theor. Comput. Sci. Proc. AI (2008), 19–58.

Department of Mathematics, Uppsala University, PO Box 480, SE-751 06 Uppsala, Sweden
E-mail address: [email protected]