Two-colour balanced affine urn models with multiple drawings I: central limit theorems
MARKUS KUBA AND HOSAM M. MAHMOUD

ABSTRACT. This is a research endeavor in two parts. We study a class of balanced urn schemes on balls of two colours (say white and black). At each drawing, a sample of size $m \ge 1$ is drawn from the urn, and ball addition rules are applied. We consider these multiple drawings under sampling with or without replacement. We further classify ball addition matrices according to the structure of the expected value into affine and nonaffine classes. We give a necessary and sufficient condition for a scheme to be in the affine subclass. For the affine subclass, we get explicit results for the expected value and second moment of the number of white balls after $n$ steps and an asymptotic expansion of the variance. Moreover, we uncover a martingale structure, amenable to a central limit theorem formulation. This unifies several earlier works focused on special cases of urn models with multiple drawings [5, 6, 17, 20, 21, 24]. The class is parametrized by $\Lambda$, specified by the ratio of the two eigenvalues of a "reduced" ball replacement matrix and the sample size. We categorize the class into small-index urns ($\Lambda < \frac12$), critical-index urns ($\Lambda = \frac12$), large-index urns ($\Lambda > \frac12$), and triangular urns. In the present paper (Part I), we obtain central limit theorems for small- and critical-index urns and prove almost-sure convergence for triangular and large-index urns. In a companion paper (Part II), we discuss the moment structure of large-index urns and triangular urns.
Date: April 1, 2015.
2000 Mathematics Subject Classification.
Key words and phrases. Urn model, random structure, martingale, central limit theorem.

1. INTRODUCTION

Urn schemes are simple, useful and versatile mathematical tools for modeling many evolutionary processes in diverse applications such as algorithmics, genetics, epidemiology, physics, engineering, economics, networks (social and other types), and many more. Modeling via urns is centuries old, but perhaps the earliest contributions in the flavor commonly called Pólya urns (the subject of the present paper) are [7, 8]. In the first of these two classics, urns were intended to model the diffusion of gases. In the second, urns were meant to model contagion. Many Pólya urn models useful for numerous applications were added later on. In fact, they are too many (literally hundreds) to be listed individually. The sources [13, 16] are classic surveys listing many of these applications; see also [19], where two chapters are devoted to applications in algorithmics and biosciences.

While the term "Pólya urn" refers to a vast variety of schemes, there is a common thread among most of them. Urns of the classic flavor on two colours (say white and black) evolve in the following way. At the beginning, time zero, the urn contains a certain number of white and black balls. Thereafter, evolution of the urn occurs in discrete time steps. At every step, a ball is chosen at random from the urn. The colour of the ball is inspected, then the ball is reinserted in the urn. According to the colour of the sampled ball, other balls are added/removed following certain rules: if we have chosen a white ball, we put in the urn $a$ white balls and $b$ black balls, but if we have chosen a black ball, we put in the urn $c$ white balls and $d$ black balls. The values $a, b, c, d \in \mathbb{Z}$ are fixed. The urn model is specified by the $2 \times 2$ ball replacement matrix
\[ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
One is usually interested in the number of white balls $W_n$ after $n$ draws, and the number of black balls $B_n$ after $n$ draws.

1.1. Pólya urn models with multiple drawings.
In the classic version of Pólya urns, one ball is sampled at each unit of (discrete) time. The present work is devoted to the study of a generalization of the Pólya urn model, where multiple balls are drawn at each discrete time step, their colours are inspected, then the sample is reinserted in the urn. Additions and deletions take place according to the drawn sample (multiset). Such urn models recently received attention in the literature, see for example [5, 6, 14, 17, 20, 21, 23, 24]. The addition/removal of balls depends on the combinations of colours of the drawn balls. We use the notation $\{W^k B^{m-k}\}$ to refer to a sample of size $m$ containing $k$ white balls and $m-k$ black balls. Specifically, we draw $m \ge 1$ balls and add/remove white and black balls according to the multiset $\{W^k B^{m-k}\}$ of observed colours: if we draw $k$ white and $m-k$ black balls, we add $a_{m-k}$ white and $b_{m-k}$ black balls, $0 \le k \le m$. The ball replacement matrix of this urn model with multiple drawings is a rectangular $(m+1) \times 2$ matrix:
\[ M = \begin{pmatrix} a_0 & b_0 \\ a_1 & b_1 \\ \vdots & \vdots \\ a_{m-1} & b_{m-1} \\ a_m & b_m \end{pmatrix}. \tag{1} \]
We assume throughout that the urn model is balanced, such that the overall number of added/removed balls is a constant $\sigma$, independent of the composition of the sample: $a_k + b_k = \sigma \ge 1$, $0 \le k \le m$. Moreover, we are only interested in so-called tenable urn models, where the process of drawing and replacing balls can be continued ad infinitum. Several of the aforementioned works on urn models with multiple drawings were only concerned with a specific urn model. This includes an urn model related to logic circuits [21, 24], the generalized Pólya-Eggenberger urn [5, 6], and the generalized Friedman urn [17]. In this work, we unify and generalize these earlier works. We do so by discussing a more general model encompassing all the previously mentioned specific urns.

1.2. Plan of the paper and notation.
The main ingredient for our analysis is to specify all $(m+1) \times 2$ ball replacement matrices for which the conditional expectation of the number of white balls $W_n$ after $n$ draws has an affine structure of the form
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = \alpha_n W_{n-1} + \beta_n, \qquad n \ge 1, \]
for certain deterministic sequences $\alpha_n$, $\beta_n$, where $\mathcal{F}_n$ denotes the $\sigma$-algebra generated by the first $n$ draws from the urn. So, we are considering a class of two-colour balanced tenable affine urns, grown under sampling multisets. Besides this characterization, we also present a central limit theorem for $W_n$ for urns in this class with small and critical index, a parameter that will be defined in the sequel. In a companion paper [18] we derive further families of limit laws for urns in this class, completing the analysis. In particular, we discuss urn models with a large index and triangular urns, using the so-called method of moments.

We denote by $x^{\underline{k}}$ the $k$th falling factorial, $x^{\underline{k}} = x(x-1)\cdots(x-k+1)$, $k \ge 1$, with $x^{\underline{0}} = 1$. We shall also use $\nabla$, the backward difference operator, defined by $\nabla h_n = h_n - h_{n-1}$, when acting on a function $h_n$.

2. PRELIMINARIES

2.1. Sampling schemes.
Assume that an urn contains $w$ white and $b$ black balls. We consider two different sampling schemes for drawing the $m$ balls at each step: model M and model R.

In model M we draw the $m$ balls without replacement. The $m$ balls are drawn at once and their colours are examined. After the sample is collected, we put the entire sample back in the urn and execute the replacement rules according to the counts of colours observed. The tenability assumption implies that for model M the coefficients $a_k$ of the ball replacement matrix (1) satisfy the condition $a_k \ge -(m-k)$, for $0 \le k \le m$.

The probability $\mathbb{P}(W^k B^{m-k})$ of drawing $k$ white and $m-k$ black balls is given by
\[ \mathbb{P}(W^k B^{m-k}) = \frac{1}{(b+w)^{\underline{m}}} \binom{m}{k} w^{\underline{k}}\, b^{\underline{m-k}} = \frac{\binom{w}{k}\binom{b}{m-k}}{\binom{b+w}{m}}, \qquad 0 \le k \le m. \]
Thus $X$, the number of white balls in the sample, follows a hypergeometric distribution with parameters $w+b$, $w$, and $m$, that is, one that counts the number of white balls in a sample of $m$ balls taken out of an urn containing $w$ white and $b$ black balls (a total of $\tau = w + b$ balls). The expected value and second moment are given by
\[ \mathbb{E}[X] = m\,\frac{w}{\tau}, \qquad \mathbb{E}[X^2] = \frac{w(w-1)\,m(m-1)}{\tau(\tau-1)} + \frac{wm}{\tau}. \]
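The two hypergeometric moment formulas for model M can be checked by direct enumeration. The following is a minimal Python sketch, not part of the original paper; the function names `hypergeom_pmf` and `moments` and the numerical values of `w`, `b`, `m` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(w, b, m):
    """Exact probabilities P(W^k B^{m-k}), k = 0..m, under model M."""
    total = comb(w + b, m)
    return [Fraction(comb(w, k) * comb(b, m - k), total) for k in range(m + 1)]

def moments(pmf):
    """First and second moment of the number of white balls in the sample."""
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second

# Illustrative values (assumed, not taken from the paper).
w, b, m = 5, 7, 3
tau = w + b
pmf = hypergeom_pmf(w, b, m)
mean, second = moments(pmf)

# Closed forms stated in the text for model M.
closed_mean = Fraction(m * w, tau)
closed_second = (Fraction(w * (w - 1) * m * (m - 1), tau * (tau - 1))
                 + Fraction(w * m, tau))
```

Exact rational arithmetic is used so that the comparison with the closed forms is an equality test rather than a floating-point approximation.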
In model R, we draw the $m$ balls with replacement. The $m$ balls are drawn one at a time. After a ball is drawn, its colour is observed and it is reinserted in the urn, and thus it might reappear in the sampling of one multiset. After $m$ balls are collected in this way (and they are all back in the urn), we execute the replacement rules according to the counts of colours observed. By the tenability assumption, $a_k \ge -1$ for $0 \le k \le m-1$ and $a_m \ge 0$ for model R. (These assumptions can be relaxed a little if the initial values $W_0$ and $B_0$ are adapted to the entries of the ball replacement matrix: for $m = 1$, for instance, an urn whose replacement matrix contains negative entries can still be tenable when $W_0$ and $B_0$ are suitable multiples of fixed integers.)

The probability $\mathbb{P}(W^k B^{m-k})$ of drawing $k$ white and $m-k$ black balls is given by
\[ \mathbb{P}(W^k B^{m-k}) = \frac{1}{(b+w)^m} \binom{m}{k} w^k b^{m-k}, \qquad 0 \le k \le m. \]
In other words, under model R, the number of white balls in the multiset of size $m$ follows a binomial distribution with parameters $m$ and $w/\tau$, one that counts the number of successes in $m$ independent identically distributed experiments, with probability of success $w/\tau$ per experiment. Let $Y$ denote such a binomially distributed random variable. The expected value and second moment are given by
\[ \mathbb{E}[Y] = m\,\frac{w}{\tau}, \qquad \mathbb{E}[Y^2] = m\,\frac{w}{\tau}\Bigl(1 - \frac{w}{\tau}\Bigr) + m^2\,\frac{w^2}{\tau^2}. \]

2.2. Stochastic recurrence.
We start with $W_0$ white and $B_0$ black balls, $W_0, B_0 \in \mathbb{N}$, assuming that $W_0 + B_0 \ge m$, to enable at least the first draw. Thereafter, tenability guarantees the perpetuation of drawing. We are interested in the distribution of the numbers $W_n$ and $B_n$ of white and black balls after $n$ draws, respectively. We denote by
\[ T_n = W_n + B_n, \qquad n \ge 0, \]
the total number of balls contained in the urn after $n$ draws. As we are considering a class of balanced urns, the total number of balls $T_n$ after $n$ draws is deterministic and linear in $n$:
\[ T_n = \sigma n + T_0, \qquad n \ge 0. \]
We restrict ourselves to the case where the total number of balls increases after each draw; in other words, we consider $\sigma \ge 1$.

In what follows, we use the notation $I_n(W^k B^{m-k})$ to stand for the indicator of the event that the multiset $\{W^k B^{m-k}\}$ is drawn in the $n$th sampling. Conditioning on the composition of the urn after $n-1$ draws, we obtain a stochastic recurrence for $W_n$. The number of white balls after $n$ draws is the number of white balls after $n-1$ draws, plus the contribution of white balls after the $n$th sample is obtained:
\[ W_n = W_{n-1} + \sum_{k=0}^{m} a_{m-k}\, I_n(W^k B^{m-k}), \qquad n \ge 1. \tag{2} \]
Let $\mathcal{F}_{n-1}$ denote the $\sigma$-field generated by the first $n-1$ draws. For $0 \le k \le m$ the indicator variables $I_n(W^k B^{m-k})$ satisfy
\[ \mathbb{P}\bigl(I_n(W^k B^{m-k}) = 1 \mid \mathcal{F}_{n-1}\bigr) = \frac{\binom{W_{n-1}}{k}\binom{B_{n-1}}{m-k}}{\binom{T_{n-1}}{m}} = \frac{\binom{W_{n-1}}{k}\binom{T_{n-1}-W_{n-1}}{m-k}}{\binom{T_{n-1}}{m}} \tag{3} \]
for model M, and
\[ \mathbb{P}\bigl(I_n(W^k B^{m-k}) = 1 \mid \mathcal{F}_{n-1}\bigr) = \binom{m}{k} \frac{W_{n-1}^k B_{n-1}^{m-k}}{T_{n-1}^m} = \binom{m}{k} \frac{W_{n-1}^k (T_{n-1}-W_{n-1})^{m-k}}{T_{n-1}^m} \tag{4} \]
for model R. We obtain for $W_n^s$, $s \ge 1$, a stochastic recurrence by taking the $s$th power of (2), and using the fact that the indicator variables are mutually exclusive:
\[ W_n^s = \sum_{\ell=0}^{s} \binom{s}{\ell} W_{n-1}^{s-\ell} \sum_{k=0}^{m} a_{m-k}^{\ell}\, I_n(W^k B^{m-k}), \qquad n \ge 1. \tag{5} \]
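The stochastic recurrence (2) translates directly into a simulation. The following Python sketch is an illustration only, not the authors' code; the function name `simulate_urn`, its parameters, and the seed handling are assumptions made for this example. The list `a` holds $a_0, \dots, a_m$, and the black additions are recovered from the balance condition $b_k = \sigma - a_k$.

```python
import random

def simulate_urn(a, sigma, w0, b0, steps, with_replacement=False, seed=0):
    """Simulate W_0, ..., W_steps for a balanced urn with multiple drawings.

    a[k] is the entry a_k of the replacement matrix (1): the number of white
    balls added when the sample contains m - k white balls.
    """
    rng = random.Random(seed)
    m = len(a) - 1
    w, b = w0, b0
    history = [w]
    for _ in range(steps):
        urn = [1] * w + [0] * b                 # 1 = white, 0 = black
        if with_replacement:                    # model R
            sample = [rng.choice(urn) for _ in range(m)]
        else:                                   # model M
            sample = rng.sample(urn, m)
        k = sum(sample)                         # white balls in the sample
        w += a[m - k]                           # recurrence (2)
        b += sigma - a[m - k]                   # balance: a_k + b_k = sigma
        history.append(w)
    return history

# Degenerate urn a_k = 1 for all k (sigma = 2): W_n = W_0 + n.
constant_path = simulate_urn([1, 1, 1], 2, 3, 3, 20)

# An urn with a_k = k and sigma = 2, sampled with replacement.
random_path = simulate_urn([0, 1, 2], 2, 2, 2, 50, with_replacement=True, seed=1)
```

Rebuilding the urn as a list at every step keeps the sketch short; for long runs one would sample the number of white balls directly from the hypergeometric or binomial law instead.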
3. AFFINE EXPECTATION
We classify ball replacement matrices according to the structure of the conditional expected value. Our motivation is that all previously treated specific urn models with multiple drawings [5, 6, 17, 20, 21, 24] had one feature in common, namely a simple recurrence relation for the conditional expectation of an affine form
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = \alpha_n W_{n-1} + \beta_n, \qquad n \ge 1, \]
where $\alpha_n$ and $\beta_n$ are certain deterministic sequences. We aim to unify all the earlier special cases into a single simple model, and to find a more general theory that serves as an umbrella for these special cases and for others that may be equally important in applications. In [20], a characterization of all ball replacement matrices giving rise to an affine linear conditional expected value was given for the case of drawing $m = 2$ balls, under sampling without replacement. We extend this analysis in the next subsection to arbitrary $m \ge 2$, for both sampling models, and characterize all ball replacement matrices leading to such a simple relation. (Note that our results stay valid for $m = 1$; here our model reduces to ordinary balanced urn models.) Subsequently, this allows us to obtain closed formulæ for the expected value and second moment, and to uncover an associated martingale structure. Later on, this is exploited to obtain limit theorems.

3.1. A necessary and sufficient condition for average affinity.
We obtain, for $0 \le k \le m$, a necessary and sufficient condition on the numbers $a_k$, $b_k$ for the conditional expectation to take an affine form, reducing the number of significant parameters to three: $a_{m-1}$, $a_m$ and the balance $\sigma$.

Proposition 1. Suppose we are given the numbers $a_{m-1}$ and $a_m$, and the balance factor $\sigma = a_k + b_k \ge 1$. For both sampling schemes, the random variable $W_n$ satisfies a linear affine relation of the form
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = \alpha_n W_{n-1} + \beta_n, \qquad n \ge 1, \]
if and only if, for $0 \le k \le m$, the numbers $a_k$ satisfy the condition
\[ a_k = (m-k)\,a_{m-1} - (m-k-1)\,a_m. \]
Equivalently, the coefficients $a_k$ themselves satisfy an affinity condition:
\[ a_k = a_0 + hk, \]
with $h = a_m - a_{m-1}$ an integer guaranteeing tenability. The sequences $\alpha_n$ and $\beta_n$ are given in terms of $a_{m-1}$, $a_m$ and $T_n$ by
\[ \alpha_n = \frac{T_{n-1} + m(a_{m-1} - a_m)}{T_{n-1}}, \qquad \beta_n = a_m, \qquad n \ge 1. \]

For technical reasons we assume from this point on that for balanced affine urn models the factors $\alpha_n$, as stated in Proposition 1, satisfy $\alpha_n > 0$ for $n \ge 1$. Equivalently, we make the assumption $T_0 + m(a_{m-1} - a_m) > 0$. In view of tenability and the steady increase of balls ($\sigma \ge 1$) this is a natural assumption and not really a restriction. If for a certain model $T_0 + m(a_{m-1} - a_m) \le 0$, after only a few draws (say $j \ge 1$), we will have $T_j + m(a_{m-1} - a_m) > 0$. We then restart the urn and take $j$ as the new beginning of time.
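Proposition 1 can be explored numerically on small urns. The sketch below is in Python and is not taken from the paper; `affine_coefficients`, `expected_increment`, and the chosen parameter values are hypothetical names and numbers introduced for illustration. It builds the $a_k$ from $a_{m-1}$ and $a_m$, and computes the exact conditional expected increment under (3) or (4), so that it can be compared with $m(a_{m-1} - a_m)\,w/t + a_m$.

```python
from fractions import Fraction
from math import comb

def affine_coefficients(m, a_m1, a_m):
    """a_k = (m-k) a_{m-1} - (m-k-1) a_m for k = 0..m (affinity condition)."""
    return [(m - k) * a_m1 - (m - k - 1) * a_m for k in range(m + 1)]

def expected_increment(a, w, t, model):
    """E[W_n - W_{n-1} | W_{n-1} = w, T_{n-1} = t], from (3) or (4)."""
    m = len(a) - 1
    b = t - w
    total = Fraction(0)
    for k in range(m + 1):
        if model == "M":      # sampling without replacement
            p = Fraction(comb(w, k) * comb(b, m - k), comb(t, m))
        else:                 # model R: sampling with replacement
            p = Fraction(comb(m, k) * w**k * b**(m - k), t**m)
        total += a[m - k] * p
    return total

# Illustrative parameters (assumed): m = 3, a_{m-1} = 2, a_m = 1, T_{n-1} = 10.
m, a_m1, a_m, t = 3, 2, 1, 10
a = affine_coefficients(m, a_m1, a_m)

def is_affine(coeffs, model):
    """Check the affine form of Proposition 1 for every w in 0..t."""
    return all(
        expected_increment(coeffs, w, t, model)
        == Fraction(m * (a_m1 - a_m) * w, t) + a_m
        for w in range(t + 1)
    )
```

Perturbing a single coefficient, for example replacing `a[0]` by `a[0] + 1`, breaks the affine relation, which illustrates the "only if" direction on a concrete instance without proving it.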
An immediate consequence of the affinity condition is the appearance of a martingale, and simple closed formulæ for the expected value and the variance. Moreover, by appropriate choices of the parameters $a_{m-1}$, $a_m$ and the balance factor $\sigma$, the affinity condition covers many of the previously treated specific urn models with multiple drawings.

Example 1. Let $a_m = a_{m-1} = c$. We obtain $a_k = c$ for $0 \le k \le m$, such that the random variable $W_n$ degenerates to a deterministic value: $W_n = W_0 + nc$.

Example 2. For $m = 2$, we obtain the condition $a_0 - 2a_1 + a_2 = 0$; this affinity condition is discussed in [20], which only considers model M.

Example 3. For $a_m = mc$, $a_{m-1} = (m-1)c$ and $\sigma = mc$, we obtain the generalized Friedman urn model with $a_k = kc$, as discussed in [17] under both sampling schemes.

Example 4. For $a_m = 0$, $a_{m-1} = c$ and $\sigma = mc$, we obtain the generalized Pólya urn model with $a_k = (m-k)c$, as discussed in [5, 6].

Example 5. For $a_m = 1$, $a_{m-1} = 0$ and $\sigma = 1$, we obtain $a_k = -(m-k) + 1$, an urn model for logic circuits treated in [21, 24].

In order to prove Proposition 1, we first determine the general structure of the conditional expectation.

Lemma 1.
For both sampling schemes, the conditional expected value of the random variable $W_n$ is a polynomial of degree $m$ (the sample size) in $W_{n-1}$:
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = \sum_{i=0}^{m} f_{n,i}\, W_{n-1}^i, \qquad n \ge 1. \]
The values $f_{n,i}$ are model dependent. For model R, we get
\[ f_{n,i} = \delta_{i,1} + \frac{(-1)^i}{T_{n-1}^i} \sum_{k=0}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^k. \]
For model M, we get
\[ f_{n,i} = \delta_{i,1} + \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{m} T_{n-1}^{\underline{j}}\, [x^i]\, p_{m,j}(x), \]
where the polynomials $p_{m,j}(x)$ are, for $0 \le j \le m$, given by
\[ p_{m,j}(x) = \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k} x^{\underline{k}} \binom{m-k}{j} (-x)^{\underline{m-k-j}}. \]

Proof. Our starting point is the relation
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + \sum_{k=0}^{m} a_{m-k}\, \mathbb{E}\bigl[I_n(W^k B^{m-k}) \mid \mathcal{F}_{n-1}\bigr]. \]
We discuss first the proof for model R, which is simpler. According to (4) we get
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + \sum_{k=0}^{m} a_{m-k} \binom{m}{k} \frac{W_{n-1}^k (T_{n-1} - W_{n-1})^{m-k}}{T_{n-1}^m}. \]
Expanding $(T_{n-1} - W_{n-1})^{m-k}$ by the binomial theorem, and changing the order of summation yields
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + \frac{1}{T_{n-1}^m} \sum_{i=0}^{m} T_{n-1}^{m-i}\, W_{n-1}^i\, (-1)^i \sum_{k=0}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^k. \]
Consequently, the conditional expectation satisfies the equation
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + \sum_{i=0}^{m} \frac{(-1)^i}{T_{n-1}^i}\, W_{n-1}^i \sum_{k=0}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^k, \]
which gives the claimed formula for $f_{n,i}$.

For model M, from (3) we have
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + \frac{1}{T_{n-1}^{\underline{m}}} \sum_{k=0}^{m} a_{m-k} \binom{m}{k} W_{n-1}^{\underline{k}}\, (T_{n-1} - W_{n-1})^{\underline{m-k}}. \]
Next, we use the binomial theorem for the falling factorials to obtain
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + \frac{1}{T_{n-1}^{\underline{m}}} \sum_{k=0}^{m} a_{m-k} \binom{m}{k} W_{n-1}^{\underline{k}} \sum_{j=0}^{m-k} \binom{m-k}{j} T_{n-1}^{\underline{j}}\, (-W_{n-1})^{\underline{m-k-j}}. \]
Changing the order of summation gives
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{m} T_{n-1}^{\underline{j}} \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k} W_{n-1}^{\underline{k}} \binom{m-k}{j} (-W_{n-1})^{\underline{m-k-j}}. \]
The inner sum on the right-hand side is exactly the polynomial $p_{m,j}(W_{n-1})$. The polynomials can be expanded into powers of $W_{n-1}$, leading to the stated result. $\square$

Proof of Proposition 1.
Given the numbers $a_{m-1}$ and $a_m$, we need to ensure that the conditional expected value of $W_n$ only involves $W_{n-1}$ and constants, but no higher powers of $W_{n-1}$. By Lemma 1, this is equivalent to the condition $f_{n,i} = 0$, $2 \le i \le m$. It remains to show that this condition is fulfilled if and only if the coefficients of a ball replacement matrix satisfy the stated condition $a_k = (m-k)\,a_{m-1} - (m-k-1)\,a_m$. Note that by collecting the coefficient of $k$ and expressing $a_{m-1}$ in terms of $a_0 = m(a_{m-1} - a_m) + a_m$, we have the equivalent condition $a_k = hk + a_0$, with arbitrary $a_0$ and $h$ satisfying tenability.

We start with model R. By Lemma 1 the condition $f_{n,i} = 0$, $2 \le i \le m$, implies the following linear equations for the numbers $a_k$, $0 \le k \le m-2$, independent of $T_{n-1}$ and thus independent of $n$, too:
\[ \sum_{k=2}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^k = m \binom{m-1}{m-i} a_{m-1} - \binom{m}{m-i} a_m, \qquad 2 \le i \le m. \]
This system of linear equations is triangular and has a unique solution. The solution can be obtained by Cramer's rule. However, in order to avoid more involved calculations, we can check that the stated solution $a_k = (m-k)\,a_{m-1} - (m-k-1)\,a_m$ satisfies the equations by simple algebraic manipulations, which are omitted here.

For model M, by contrast to the previous case, the $m-1$ equations $f_{n,i} = 0$, $2 \le i \le m$, are not independent of $n$, since they involve $T_{n-1}$:
\[ f_{n,i} = \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{m} T_{n-1}^{\underline{j}}\, [x^i]\, p_{m,j}(x), \qquad 2 \le i \le m. \]
In order to ensure that $f_{n,i} = 0$ for all $n$, with $2 \le i \le m$, the coefficients $[x^i]\, p_{m,j}(x)$ of the falling factorials $T_{n-1}^{\underline{j}}$ have to vanish. Assume conversely that there exists a largest $j = j_0$, $0 \le j_0 \le m$, such that $[x^i]\, p_{m,j_0}(x) \ne 0$. Then, for large $n$, we have
\[ f_{n,i} = \frac{1}{T_{n-1}^{\underline{m}}} \sum_{j=0}^{j_0} T_{n-1}^{\underline{j}}\, [x^i]\, p_{m,j}(x) = \frac{1}{T_{n-1}^{\underline{m}}} \Bigl( T_{n-1}^{\underline{j_0}}\, [x^i]\, p_{m,j_0}(x) + O\bigl(T_{n-1}^{j_0-1}\bigr) \Bigr), \]
such that
\[ f_{n,i} \sim \frac{T_{n-1}^{\underline{j_0}}}{T_{n-1}^{\underline{m}}}\, [x^i]\, p_{m,j_0}(x) \ne 0. \]
Thus, we obtain the system of equations
\[ [x^i]\, p_{m,j}(x) = [x^i] \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k} x^{\underline{k}} \binom{m-k}{j} (-x)^{\underline{m-k-j}} = 0, \]
for $2 \le i \le m$ and $0 \le j \le m$. This leads to an overdetermined system of linear equations for the coefficients $a_k$. Instead of writing the whole system, it is sufficient to derive an exactly solvable subsystem of equations involving all the coefficients $a_k$, $0 \le k \le m$. In order to do so, we concentrate on the equations arising from the coefficient of $x^{m-j}$. This is the highest power of $x$ in the polynomials $p_{m,j}(x)$. We get
\[ [x^{m-j}]\, p_{m,j}(x) = [x^{m-j}] \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k} x^{\underline{k}} \binom{m-k}{j} (-x)^{\underline{m-k-j}} = \sum_{k=0}^{m-j} a_{m-k} \binom{m}{k}\binom{m-k}{j} (-1)^{m-k-j}, \qquad 0 \le j \le m. \]
We allow $a_{m-1}$ and $a_m$ to be freely chosen. Setting $j = m - i$ leads to the system of equations for the numbers $a_k$ with $0 \le k \le m-2$:
\[ \sum_{k=0}^{i} a_{m-k} \binom{m}{k}\binom{m-k}{m-i} (-1)^{i-k} = 0, \qquad 2 \le i \le m. \]
Up to the overall factor $(-1)^i$, this system coincides with the system of equations previously derived for sampling with replacement. It has the stated unique solution. Hence, the overdetermined system of equations
\[ [x^i]\, p_{m,j}(x) = 0, \qquad 2 \le i \le m, \quad 0 \le j \le m, \]
has either exactly one solution or no solution at all. It remains to show that coefficients satisfying the affinity condition $a_k = (m-k)\,a_{m-1} - (m-k-1)\,a_m$ lead to a solution. Starting from (3) we get
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + \sum_{k=0}^{m} \bigl( k(a_{m-1} - a_m) + a_m \bigr) \frac{\binom{W_{n-1}}{k}\binom{T_{n-1} - W_{n-1}}{m-k}}{\binom{T_{n-1}}{m}}. \]
Next, we use Vandermonde's convolution formula
\[ \sum_{k=0}^{m} \binom{r}{k}\binom{s}{m-k} = \binom{r+s}{m}, \]
and obtain
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} + W_{n-1}\,(a_{m-1} - a_m)\, \frac{\binom{T_{n-1}-1}{m-1}}{\binom{T_{n-1}}{m}} + a_m. \]
Since $\binom{T_{n-1}-1}{m-1} \big/ \binom{T_{n-1}}{m} = m / T_{n-1}$, this is the affine form stated in Proposition 1. $\square$

3.2. Expected value and second moment.
Next, we generalize the result of Bagchi and Pal [1] for the expected value and the second moment, when drawing a single ball (the case $m = 1$), to balanced affine urn models with multiple drawings. In order to state our result we introduce the quantity $g_n$ given by
\[ g_n = \prod_{j=0}^{n-1} \frac{T_j}{T_j + m(a_{m-1} - a_m)} = \frac{\binom{n-1+\frac{T_0}{\sigma}}{n}}{\binom{n-1+\frac{T_0 + m(a_{m-1}-a_m)}{\sigma}}{n}} = \frac{\Gamma\bigl(n + \frac{T_0}{\sigma}\bigr)\, \Gamma\bigl(\frac{T_0 + m(a_{m-1}-a_m)}{\sigma}\bigr)}{\Gamma\bigl(\frac{T_0}{\sigma}\bigr)\, \Gamma\bigl(n + \frac{T_0 + m(a_{m-1}-a_m)}{\sigma}\bigr)}. \tag{6} \]

Proposition 2. The expected value of the random variable $W_n$, counting the number of white balls in a two-colour balanced affine urn model with multiple drawings, is for both sampling models M and R given by
\[ \mathbb{E}[W_n] = \frac{a_m}{g_n} \sum_{j=1}^{n} g_j + \frac{W_0}{g_n}, \]
with $g_n$ as given in (6). For $\frac{m(a_{m-1}-a_m)}{\sigma} < 1$, we have the closed form expression
\[ \mathbb{E}[W_n] = \frac{a_m \bigl(n + \frac{T_0}{\sigma}\bigr)}{1 - \frac{m(a_{m-1}-a_m)}{\sigma}} + \Biggl( W_0 - \frac{a_m \frac{T_0}{\sigma}}{1 - \frac{m(a_{m-1}-a_m)}{\sigma}} \Biggr) \frac{\binom{n-1+\frac{T_0 + m(a_{m-1}-a_m)}{\sigma}}{n}}{\binom{n-1+\frac{T_0}{\sigma}}{n}}, \]
as well as the asymptotic expansion
\[ \mathbb{E}[W_n] = \frac{a_m \sigma}{\sigma - m(a_{m-1}-a_m)}\, n + \Biggl( W_0 - \frac{a_m \frac{T_0}{\sigma}}{1 - \frac{m(a_{m-1}-a_m)}{\sigma}} \Biggr) \frac{\Gamma\bigl(\frac{T_0}{\sigma}\bigr)}{\Gamma\bigl(\frac{T_0 + m(a_{m-1}-a_m)}{\sigma}\bigr)}\, n^{\frac{m(a_{m-1}-a_m)}{\sigma}} + O(1). \]
Moreover, for $\frac{m(a_{m-1}-a_m)}{\sigma} = 1$ we obtain
\[ \mathbb{E}[W_n] = W_0\, \frac{n\sigma + T_0}{T_0}. \]

Proof of Proposition 2. From Proposition 1 we get
\[ \mathbb{E}[W_n] = \Bigl( \frac{T_{n-1} + m(a_{m-1}-a_m)}{T_{n-1}} \Bigr) \mathbb{E}[W_{n-1}] + a_m, \qquad n \ge 1. \tag{7} \]
Multiplication with $g_n$ as defined in (6) gives the recurrence relation
\[ g_n\, \mathbb{E}[W_n] = g_{n-1}\, \mathbb{E}[W_{n-1}] + g_n a_m, \]
such that
\[ \mathbb{E}[W_n] = \frac{a_m}{g_n} \sum_{j=1}^{n} g_j + W_0\, \frac{g_0}{g_n} = \frac{a_m}{g_n} \sum_{j=1}^{n} g_j + \frac{W_0}{g_n}. \tag{8} \]
We apply the summation formula
\[ \sum_{k=1}^{s} \frac{\binom{k+x}{k}}{\binom{k+y}{k}} = \frac{(s+1+y)\binom{s+1+x}{s+1}}{(x+1-y)\binom{s+1+y}{s+1}} - \frac{x+1}{x+1-y} \]
to the sum of the $g_j$, each of which has the form $\binom{j+x}{j} \big/ \binom{j+y}{j}$, with $x = \frac{T_0}{\sigma} - 1$ and $y = \frac{T_0}{\sigma} - 1 + \frac{m(a_{m-1}-a_m)}{\sigma}$, and obtain the result, valid for $\frac{m(a_{m-1}-a_m)}{\sigma} < 1$. For $\frac{m(a_{m-1}-a_m)}{\sigma} = 1$, we observe that by the tenability assumption on the urn, we obtain for both sampling models the conditions $a_m \ge 0$, and also $b_0 \ge 0$. Thus, we get from Proposition 1
\[ \sigma = a_0 + b_0 = m(a_{m-1}-a_m) + a_m + b_0 \ge m(a_{m-1}-a_m), \]
such that $a_m = b_0 = 0$, and the result follows directly from (8).

In order to obtain asymptotic expansions, we only need Stirling's formula for the Gamma function:
\[ \Gamma(z) = \Bigl(\frac{z}{e}\Bigr)^{z} \frac{\sqrt{2\pi}}{\sqrt{z}} \Bigl( 1 + \frac{1}{12z} + \frac{1}{288 z^2} + O\Bigl(\frac{1}{z^3}\Bigr) \Bigr), \qquad |z| \to \infty. \]
Hence, we obtain
\[ \frac{1}{g_n} = \frac{\Gamma\bigl(\frac{T_0}{\sigma}\bigr)\, \Gamma\bigl(n + \frac{T_0 + m(a_{m-1}-a_m)}{\sigma}\bigr)}{\Gamma\bigl(n + \frac{T_0}{\sigma}\bigr)\, \Gamma\bigl(\frac{T_0 + m(a_{m-1}-a_m)}{\sigma}\bigr)} = \frac{\Gamma\bigl(\frac{T_0}{\sigma}\bigr)}{\Gamma\bigl(\frac{T_0 + m(a_{m-1}-a_m)}{\sigma}\bigr)}\, n^{\frac{m(a_{m-1}-a_m)}{\sigma}} \Bigl( 1 + O\Bigl(\frac{1}{n}\Bigr) \Bigr), \]
yielding the stated result. $\square$

Proposition 3.
For balanced affine urn schemes, the second moment of $W_n$ is
\[ \mathbb{E}[W_n^2] = \frac{\binom{n-1+\lambda_1}{n}\binom{n-1+\lambda_2}{n}}{\binom{n-1+\frac{T_0}{\sigma}}{n}\binom{n-1+\frac{T_0-1}{\sigma}}{n}} \Biggl( W_0^2 + \sum_{j=1}^{n} \bigl( \beta_j\, \mathbb{E}[W_{j-1}] + a_m^2 \bigr) \frac{\binom{j-1+\frac{T_0}{\sigma}}{j}\binom{j-1+\frac{T_0-1}{\sigma}}{j}}{\binom{j-1+\lambda_1}{j}\binom{j-1+\lambda_2}{j}} \Biggr) \]
for model M, with
\[ \lambda_{1,2} = \frac{m(a_{m-1}-a_m) + T_0 - \frac12 \pm \frac12 \sqrt{1 + 4m(a_{m-1}-a_m)(a_{m-1}-a_m+1)}}{\sigma}, \]
and
\[ \beta_n = (a_{m-1}-a_m)^2 \Bigl( \frac{m}{T_{n-1}} - \frac{m^{\underline{2}}}{T_{n-1}^{\underline{2}}} \Bigr) + \frac{2m\,a_m (a_{m-1}-a_m)}{T_{n-1}} + 2a_m. \tag{9} \]
For model R, the second moment of $W_n$ is
\[ \mathbb{E}[W_n^2] = \frac{\binom{n-1+\mu_1}{n}\binom{n-1+\mu_2}{n}}{\binom{n-1+\frac{T_0}{\sigma}}{n}^2} \Biggl( W_0^2 + \sum_{j=1}^{n} \bigl( \beta_j\, \mathbb{E}[W_{j-1}] + a_m^2 \bigr) \frac{\binom{j-1+\frac{T_0}{\sigma}}{j}^2}{\binom{j-1+\mu_1}{j}\binom{j-1+\mu_2}{j}} \Biggr), \]
with
\[ \mu_{1,2} = \frac{m(a_{m-1}-a_m) + T_0 \pm (a_{m-1}-a_m)\sqrt{m}}{\sigma}, \]
and
\[ \beta_n = (a_{m-1}-a_m)^2\, \frac{m}{T_{n-1}} + \frac{2m\,a_m (a_{m-1}-a_m)}{T_{n-1}} + 2a_m. \tag{10} \]

Proof. We use the stochastic recurrence (5), with $s = 2$, and obtain for the conditional expectation the equation
\[ \mathbb{E}\bigl[W_n^2 \mid \mathcal{F}_{n-1}\bigr] = W_{n-1}^2 + \sum_{k=0}^{m} \bigl( 2 W_{n-1}\, a_{m-k} + a_{m-k}^2 \bigr)\, \mathbb{E}\bigl[I_n(W^k B^{m-k}) \mid \mathcal{F}_{n-1}\bigr]. \]
By Proposition 1 and the affinity condition, we further get
\[ \mathbb{E}\bigl[W_n^2 \mid \mathcal{F}_{n-1}\bigr] = W_{n-1}^2 + a_m (2W_{n-1} + a_m) + \sum_{k=0}^{m} \bigl( k^2 (a_{m-1}-a_m)^2 + 2k (a_{m-1}-a_m)(a_m + W_{n-1}) \bigr)\, \mathbb{E}\bigl[I_n(W^k B^{m-k}) \mid \mathcal{F}_{n-1}\bigr]. \tag{11} \]
The sums depend on the particular sampling model. According to (3), for model M, the number of drawn white balls in the sample of size $m$ is given by a hypergeometric distribution with parameters $T_{n-1}$, $W_{n-1}$ and $m$. Alternatively, for model R, the number of drawn white balls in the sample of size $m$ is given by a binomial distribution with parameters $m$ and $W_{n-1}/T_{n-1}$. We take expectations and use the results of Section 2 to simplify the sums. Consequently, we obtain for both models a linear recurrence relation of the form
\[ \mathbb{E}[W_n^2] = \alpha_n\, \mathbb{E}[W_{n-1}^2] + \beta_n\, \mathbb{E}[W_{n-1}] + \gamma_n, \qquad n \ge 1, \]
with $\mathbb{E}[W_n]$ as given in Proposition 2. For model M, the sequences $\alpha_n$ and $\gamma_n$ are given by
\[ \alpha_n = 1 + (a_{m-1}-a_m)^2\, \frac{m^{\underline{2}}}{T_{n-1}^{\underline{2}}} + 2(a_{m-1}-a_m)\, \frac{m}{T_{n-1}}, \qquad \gamma_n = a_m^2, \]
and $\beta_n$ as stated in (9). For model R, we have
\[ \alpha_n = 1 + (a_{m-1}-a_m)^2\, \frac{m^{\underline{2}}}{T_{n-1}^{2}} + 2(a_{m-1}-a_m)\, \frac{m}{T_{n-1}}, \qquad \gamma_n = a_m^2, \]
and $\beta_n$ as stated in (10). The recurrence relation for $\mathbb{E}[W_n^2]$ is readily solved in a manner similar to that we used to solve (7), and we obtain
\[ \mathbb{E}[W_n^2] = \Bigl( \prod_{\ell=1}^{n} \alpha_\ell \Bigr) \Biggl( W_0^2 + \sum_{j=1}^{n} \frac{\beta_j\, \mathbb{E}[W_{j-1}] + \gamma_j}{\prod_{\ell=1}^{j} \alpha_\ell} \Biggr), \]
with $\mathbb{E}[W_n]$ given by Proposition 2. Finally, we simplify the products $\prod_{\ell=1}^{n} \alpha_\ell$ by viewing $\alpha_n$ as a rational function in the variable $n$, and factorizing it into linear terms of the forms
\[ \alpha_n = \frac{(n-1+\lambda_1)(n-1+\lambda_2)}{\bigl(n-1+\frac{T_0}{\sigma}\bigr)\bigl(n-1+\frac{T_0-1}{\sigma}\bigr)}, \qquad \text{and} \qquad \alpha_n = \frac{(n-1+\mu_1)(n-1+\mu_2)}{\bigl(n-1+\frac{T_0}{\sigma}\bigr)^2}, \]
for models M and R, respectively. $\square$

3.3. Martingale structure.
Next, we deduce from the linear affine structure of the conditional expected value of $W_n$ and the previous result for the expected value the following result.

Proposition 4. For balanced affine urn schemes with $a_m \ne 0$, the centered random variable
\[ \mathcal{W}_n = g_n \bigl( W_n - \mathbb{E}[W_n] \bigr) = g_n W_n - a_m \sum_{j=1}^{n} g_j - W_0, \]
with $g_n$ as defined in (6), is a martingale with respect to the natural filtration: $\mathbb{E}[\mathcal{W}_n \mid \mathcal{F}_{n-1}] = \mathcal{W}_{n-1}$, $n \ge 1$, and $\mathcal{W}_0 = 0$. For balanced affine urn schemes with $a_m = 0$, the random variable $\mathcal{W}_n = g_n W_n$ is a non-negative martingale and converges almost surely to a limit $\mathcal{W}_\infty$.

Proof of Proposition 4. By Proposition 1 the conditional expectation is given by
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = W_{n-1} \Bigl( \frac{T_{n-1} + m(a_{m-1}-a_m)}{T_{n-1}} \Bigr) + a_m, \tag{12} \]
for $n \ge 1$. As in the proof of Proposition 2, we obtain
\[ \mathbb{E}\bigl[g_n W_n \mid \mathcal{F}_{n-1}\bigr] = g_{n-1} W_{n-1} + g_n a_m, \qquad n \ge 1. \]
By definition
\[ \mathcal{W}_n = g_n W_n - a_m \sum_{j=1}^{n} g_j - W_0, \]
and we get the representation
\[ g_n\, \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] - a_m \sum_{j=1}^{n} g_j - W_0 = g_{n-1} W_{n-1} - a_m \sum_{j=1}^{n-1} g_j - W_0, \]
such that $\mathbb{E}\bigl[\mathcal{W}_n \mid \mathcal{F}_{n-1}\bigr] = \mathcal{W}_{n-1}$. We also note that $\mathcal{W}_0 = g_0 (W_0 - \mathbb{E}[W_0]) = 0$, and so $\mathbb{E}[\mathcal{W}_n] = 0$. Moreover, $\mathbb{E}\bigl[|\mathcal{W}_n|\bigr] = g_n\, \mathbb{E}\bigl[|W_n - \mathbb{E}[W_n]|\bigr]$. For $a_m = 0$, we note that $W_n \ge 0$ and also $\mathcal{W}_n = g_n W_n \ge 0$. By martingale theory, $\mathcal{W}_n$ converges almost surely to a limit: $\mathcal{W}_n \xrightarrow{\text{a.s.}} \mathcal{W}_\infty$. $\square$
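The identity $g_n\,\mathbb{E}[W_n] = a_m \sum_{j=1}^{n} g_j + W_0$, which underlies both Proposition 2 and the centering in Proposition 4, can be verified exactly on a small urn. The Python sketch below is illustrative and not part of the paper; the function names `exact_mean` and `formula_mean` and the parameter values are assumptions. It computes the exact law of $W_n$ under model M by dynamic programming and compares the resulting mean with formula (8).

```python
from fractions import Fraction
from math import comb

def exact_mean(a, sigma, w0, b0, steps):
    """Exact E[W_n], n = 0..steps, under model M, via the full law of W_n."""
    m = len(a) - 1
    dist = {w0: Fraction(1)}          # P(W_n = w)
    t = w0 + b0                       # T_n, deterministic by balance
    means = [Fraction(w0)]
    for _ in range(steps):
        new = {}
        for w, p in dist.items():
            for k in range(m + 1):    # k white balls in the sample, see (3)
                q = Fraction(comb(w, k) * comb(t - w, m - k), comb(t, m))
                if q:
                    new[w + a[m - k]] = new.get(w + a[m - k], 0) + p * q
        dist, t = new, t + sigma
        means.append(sum(w * p for w, p in dist.items()))
    return means

def formula_mean(m, a_m1, a_m, sigma, w0, t0, n):
    """E[W_n] = (a_m * sum_{j=1}^n g_j + W_0) / g_n, with g_n from (6)."""
    delta = m * (a_m1 - a_m)
    g = [Fraction(1)]                 # g_0 = 1 (empty product)
    for j in range(n):
        tj = t0 + sigma * j
        g.append(g[-1] * Fraction(tj, tj + delta))
    return (a_m * sum(g[1:]) + w0) / g[n]

# Assumed example: m = 2, a_{m-1} = 2, a_m = 1, sigma = 4, so a = [3, 2, 1].
means = exact_mean([3, 2, 1], 4, 1, 2, 5)
predicted = [formula_mean(2, 2, 1, 4, 1, 3, n) for n in range(6)]
```

The state space grows with the number of steps, so this check is only practical for a handful of draws; it is meant as a consistency test of the formulas, not as a computational method.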
4. LIMIT THEOREMS
In this section, we discuss limit theorems for the number of white balls. Our limit theorems are valid for arbitrary $m \ge 1$, unifying the earlier observed phenomena for the case $m = 1$, and covering new such cases, as well as extending the result to larger sample size. For balanced urn models and a single ball in the sample, one considers the ball replacement matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, with balance factor $\sigma$, with eigenvalues $\Lambda_1 = \sigma$ and $\Lambda_2 = a - c$. For this classic case, there is a known trichotomy [2, 3, 11, 12, 15]: (1) triangular urn models with a nongaussian limit for $c = 0$ (or $b = 0$), (2) the so-called small urns with a Gaussian limit for $c > 0$ and $\Lambda_2/\Lambda_1 \le \frac12$, and (3) the so-called large urns with a nongaussian limit for $c > 0$ and $\Lambda_2/\Lambda_1 > \frac12$. Note that owing to the balance, the urn actually has only three parameters $a$, $c$ and $\sigma$. The terms "small urns" and "large urns" were used by other researchers [2]. We prefer to think of the ratio of eigenvalues as an index and refer to urns with small versus large index. It is the index that can be large or small, not the physical container (urn, box, etc.).

For urn models with multiple drawings and affine expectation, we obtain a similar characterization. By Proposition 1, our class of urns is determined by $a_{m-1}$, $a_m$ and the balance factor $\sigma$, satisfying the affinity condition $a_k = (m-k)(a_{m-1}-a_m) + a_m$, $0 \le k \le m$. We call
\[ A = \begin{pmatrix} a_{m-1} & b_{m-1} \\ a_m & b_m \end{pmatrix} \]
the reduced ball replacement matrix. For the affine subclass of balanced urn models, the eigenvalues of $A$ are $\Lambda_1 = \sigma$ and $\Lambda_2 = a_{m-1} - a_m$. It turns out that the behaviour of the urn critically depends on the urn index $\Lambda$ given by the ratio $\Lambda_2/\Lambda_1$ of the two eigenvalues of $A$ times the sample size $m$:
\[ \Lambda = \Lambda(m, \sigma) := \frac{m}{\sigma}\,(a_{m-1} - a_m). \]
This parameter governs the growth of the second largest term in the asymptotic expansion of the expected value. For instance, in terms of this index, the expectation in Proposition 2 is
\[ \mathbb{E}[W_n] = \frac{a_m}{1-\Lambda}\, n + O(n^{\Lambda}) + O(1). \]
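The urn index and the resulting classification are simple to compute from the three parameters $a_{m-1}$, $a_m$ and $\sigma$. The following Python sketch is an illustration under stated assumptions: the function names `urn_index` and `classify` are hypothetical, and the order in which the degenerate and triangular cases are tested before the index comparison is a choice made here for convenience.

```python
from fractions import Fraction

def urn_index(m, sigma, a_m1, a_m):
    """Urn index Lambda = (m / sigma) * (a_{m-1} - a_m), as an exact fraction."""
    return Fraction(m * (a_m1 - a_m), sigma)

def classify(m, sigma, a_m1, a_m):
    """Label a balanced affine urn using the terminology of the text."""
    a_0 = m * (a_m1 - a_m) + a_m      # affinity condition with k = 0
    b_0 = sigma - a_0                 # balance: a_0 + b_0 = sigma
    lam = urn_index(m, sigma, a_m1, a_m)
    if lam == 0:
        return "degenerate"           # a_k = a_m for all k: W_n = W_0 + a_m n
    if a_m == 0 or b_0 == 0:
        return "triangular"
    half = Fraction(1, 2)
    if lam < half:
        return "small-index"
    return "critical-index" if lam == half else "large-index"

# Parameter choices following Examples 3-5 with m = 3 and c = 2.
m, c = 3, 2
friedman = classify(m, m * c, (m - 1) * c, m * c)   # a_k = k c
polya = classify(m, m * c, c, 0)                    # a_k = (m - k) c
logic = classify(m, 1, 0, 1)                        # a_k = -(m - k) + 1
critical = classify(2, 4, 2, 1)                     # Lambda = 1/2
```

Using `Fraction` avoids floating-point issues when testing the boundary case $\Lambda = \frac12$.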
In the following we obtain a central limit theorem for urn models with "small index" $\Lambda < \frac12$ and "critical index" $\Lambda = \frac12$. Note that the case $\Lambda = 0$ is excluded from our considerations because it leads to $a_m = a_{m-1}$ and by the affinity condition to $a_k = a_m$, $0 \le k \le m$; thus we have deterministic development: $W_n = W_0 + a_m n$. We also obtain almost sure and $L^2$-convergence for "large index" urns $\Lambda > \frac12$. We call an urn model triangular if $a_m = 0$ or $b_0 = 0$ (or both). We already obtained for $a_m = 0$ almost sure convergence in Proposition 4. Since $B_n = T_n - W_n$, we can reduce the case $b_0 = 0$ and $a_m \ge 0$ by reversing the colours to $a_m = 0$ and $b_0 \ge 0$. A detailed study of the moment structure of large index urns and the triangular urns with $a_m = 0$ (and $b_0 \ge 0$) will appear in a companion work.

4.1. Asymptotic expansion of the variance.
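An asymptotic expansion of the variance of $W_n$ can be obtained from the explicit expressions for the expected value and the second moment. It is required later on to prove a central limit theorem for $\Lambda \le 1/2$ and almost-sure convergence of large-index urns.

Theorem 1. For balanced affine urn schemes, the variance satisfies the following expansions.

Small-index urns, the case $\Lambda < 1/2$:
\[ \mathbb{V}[W_n] = \frac{a_m b_0 \Lambda^2}{m(1-2\Lambda)(1-\Lambda)^2}\, n + o(n). \]
Critical-index urns, the case $\Lambda = 1/2$:
\[ \mathbb{V}[W_n] = \frac{a_m b_0}{m}\, n \log n + O(n). \]
Large-index urns, the case $\Lambda > 1/2$:
\[ \mathbb{V}[W_n] = C n^{2\Lambda} + O(n), \]
with the model-dependent constant $C$ given by an infinite sum:
\[
\begin{aligned}
C ={}& \frac{W_0^2}{\psi_0} + \sum_{j=1}^{\infty} \frac{\beta_j \mathbb{E}[W_{j-1}] + a_m^2 - 2\psi_j a_m j^{-\Lambda}\Bigl(\frac{a_m}{1-\Lambda}\, j^{1-\Lambda} + \bigl(W_0 - \frac{a_m T_0}{\sigma(1-\Lambda)}\bigr)\frac{\Gamma(T_0/\sigma)}{\Gamma(T_0/\sigma+\Lambda)}\Bigr)}{\psi_j} \\
&+ \frac{2a_m^2}{1-\Lambda}\,\zeta(2\Lambda-1) + 2a_m\Bigl(W_0 - \frac{a_m T_0}{\sigma(1-\Lambda)}\Bigr)\frac{\Gamma(T_0/\sigma)}{\Gamma(T_0/\sigma+\Lambda)}\,\zeta(\Lambda) - \Bigl(W_0 - \frac{a_m T_0}{\sigma(1-\Lambda)}\Bigr)^{2}\frac{\Gamma^2(T_0/\sigma)}{\Gamma^2(T_0/\sigma+\Lambda)},
\end{aligned}
\]
with $\zeta(z)$ denoting the Riemann zeta function and $\beta_j$, $\psi_j$, $\mathbb{E}[W_{j-1}]$ as given in (9), (10), (14), and Proposition 4.

Proof. Our starting point is the expression for $\mathbb{E}[W_n^2]$ in Proposition 3. In order to perform a unified analysis for the two models, we use the representation
\[ \mathbb{E}[W_n^2] = \frac{\psi_n}{\psi_0}\, W_0^2 + \psi_n \sum_{j=1}^{n} \frac{\beta_j \mathbb{E}[W_{j-1}] + a_m^2}{\psi_j}, \tag{13} \]
with
\[ \psi_n = \begin{cases} \dfrac{\Gamma(n+\lambda_1)\,\Gamma(n+\lambda_2)}{\Gamma\bigl(n+\frac{T_0}{\sigma}\bigr)\,\Gamma\bigl(n+\frac{T_0-1}{\sigma}\bigr)}, & \text{model } \mathcal{M}; \\[2ex] \dfrac{\Gamma(n+\mu_1)\,\Gamma(n+\mu_2)}{\Gamma^2\bigl(n+\frac{T_0}{\sigma}\bigr)}, & \text{model } \mathcal{R}. \end{cases} \tag{14} \]
We refine our previous result and observe that the expected value $\mathbb{E}[W_n]$ satisfies the asymptotic expansion
\[ \mathbb{E}[W_n] = \frac{a_m}{1-\Lambda}\, n + \Bigl(W_0 - \frac{a_m T_0}{\sigma(1-\Lambda)}\Bigr)\frac{\Gamma(T_0/\sigma)}{\Gamma(T_0/\sigma+\Lambda)}\, n^{\Lambda} + \frac{T_0 a_m}{\sigma(1-\Lambda)} + O(n^{\Lambda-1}). \tag{15} \]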
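Moreover, $g_n$ satisfies the asymptotic expansion
\[ g_n = \frac{\Gamma(T_0/\sigma+\Lambda)}{\Gamma(T_0/\sigma)}\, n^{-\Lambda}\Bigl(1 + \frac{\Lambda}{2n}\Bigl(1 - \frac{2T_0}{\sigma} - \Lambda\Bigr) + O\Bigl(\frac{1}{n^2}\Bigr)\Bigr). \tag{16} \]
Furthermore, $\beta_n$ satisfies, for both urn models, the asymptotic expansion
\[ \beta_n = 2a_m + \frac{\Lambda(a_{m-1} + a_m)}{n} + O\Bigl(\frac{1}{n^2}\Bigr). \tag{17} \]
We need the expansion
\[ \psi_n = n^{2\Lambda}\Bigl(1 + \frac{M}{n} + O\Bigl(\frac{1}{n^2}\Bigr)\Bigr), \]
with the constant $M$ given by
\[ M = \begin{cases} \frac12\Bigl(\lambda_1^2 - \lambda_1 + \lambda_2^2 - \lambda_2 - \frac{T_0^2}{\sigma^2} + \frac{T_0}{\sigma} - \frac{(T_0-1)^2}{\sigma^2} + \frac{T_0-1}{\sigma}\Bigr), & \text{model } \mathcal{M}; \\[1.5ex] \frac12\Bigl(\mu_1^2 - \mu_1 + \mu_2^2 - \mu_2 - \frac{2T_0^2}{\sigma^2} + \frac{2T_0}{\sigma}\Bigr), & \text{model } \mathcal{R}, \end{cases} \]
with $\lambda_i$, $\mu_i$ as given in Proposition 3. After simplifications it turns out that for both models the constant $M$ is given by
\[ M = \Lambda^2 - \Lambda + \frac{\Lambda^2}{m} + \frac{2\Lambda T_0}{\sigma}. \]
In order to keep track of the different expansions in a readable, transparent way, we introduce a shorthand notation:
\[ \mathbb{E}[W_n] = E_1 n + E_2 n^{\Lambda} + E_3 + O(n^{\Lambda-1}), \qquad \beta_n = B_1 + B_2 n^{-1} + O(n^{-2}), \tag{18} \]
with constants $E_i$, $B_i$ as given in (15) and (17). We note that
\[ \bigl(\mathbb{E}[W_n]\bigr)^2 = E_1^2 n^2 + 2E_1E_2\, n^{1+\Lambda} + 2E_1E_3\, n + E_2^2\, n^{2\Lambda} + o(n). \]
We shall prove that
\[ \mathbb{E}[W_n^2] = E_1^2 n^2 + 2E_1E_2\, n^{1+\Lambda} + \varphi_n, \]
with
\[ \varphi_n = \begin{cases} \Bigl(\dfrac{a_m(\sigma(1-\Lambda) - a_m)\Lambda^2}{m(1-2\Lambda)(1-\Lambda)^2} + 2E_1E_3\Bigr) n + o(n), & \text{for } \Lambda < \tfrac12; \\[2ex] \dfrac{a_m(\sigma - 2a_m)}{2m}\, n \log n + O(n), & \text{for } \Lambda = \tfrac12; \\[2ex] (C + E_2^2)\, n^{2\Lambda} + O(n), & \text{for } \Lambda > \tfrac12. \end{cases} \]
We start with the small-index urns satisfying $\Lambda < 1/2$. Assume first that $0 < \Lambda < 1/2$. We postpone the remaining case $\Lambda < 0$ to the end (note that $\Lambda = 0$ leads to a degenerate urn model). The expansion of $\mathbb{E}[W_n^2]$ is obtained as follows. First, let $j = j(n)$, with $j \to \infty$, and write
\[ \frac{\beta_j \mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} = B_1E_1\, j^{1-2\Lambda} + B_1E_2\, j^{-\Lambda} + \bigl(a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M\bigr) j^{-2\Lambda} + O(j^{-1-\Lambda}). \tag{19} \]
Replacing the summands by their asymptotic expansion leads to an error of magnitude $O(1)$. This is fully sufficient for our purpose.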
Consequently, we use the following identity, which can be obtained using the Euler-Maclaurin summation formula (see [9], pages 595–596):
\[ \sum_{j=1}^{n} j^{\alpha} = \frac{n^{\alpha+1}}{\alpha+1} + \frac{n^{\alpha}}{2} + \zeta(-\alpha) + O(n^{\alpha-1}), \tag{20} \]
for $\alpha \ne -1$, where $\zeta(z)$ denotes the Riemann zeta function. We obtain the expansion
\[ \sum_{j=1}^{n} \frac{\beta_j \mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} = B_1E_1\Bigl(\frac{n^{2-2\Lambda}}{2-2\Lambda} + \frac{n^{1-2\Lambda}}{2}\Bigr) + B_1E_2\, \frac{n^{1-\Lambda}}{1-\Lambda} + \bigl(a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M\bigr) \frac{n^{1-2\Lambda}}{1-2\Lambda} + O(1). \tag{21} \]
Since $\frac{B_1E_1}{2(1-\Lambda)} = E_1^2$ and $\frac{B_1E_2}{1-\Lambda} = 2E_1E_2$, we obtain, taking into account the expansion of $\psi_n$, the following:
\[ \psi_n \sum_{j=1}^{n} \frac{\beta_j \mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} = E_1^2 n^2 + 2E_1E_2\, n^{1+\Lambda} + \Bigl(\frac{B_1E_1}{2} + M E_1^2 + \frac{a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M}{1-2\Lambda}\Bigr) n + o(n). \tag{22} \]
Consequently, the first two terms in $\mathbb{E}[W_n^2] - (\mathbb{E}[W_n])^2$ cancel out. Only a leading linear term remains in the variance, and its coefficient is
\[ \Bigl(\frac{B_1E_1}{2} + M E_1^2 + \frac{a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M}{1-2\Lambda}\Bigr) - 2E_1E_3. \]
The stated result follows after simplification aided by a computer algebra system and using the fact that $b_0 = \sigma(1-\Lambda) - a_m$.

For $\Lambda < 0$ we can proceed in a similar way. The expansion (19) is still valid. The only difference is that the magnitude of the error is larger and of order $O(n^{-\Lambda})$ in (21). Nevertheless, the resulting expansion (22) is still valid due to the multiplication with $\psi_n \sim n^{2\Lambda}$.

For $\Lambda = 1/2$, we proceed in a similar way. We use the identity
\[ \sum_{j=1}^{n} j^{-1} = \ln n + \gamma + O\Bigl(\frac{1}{n}\Bigr), \tag{23} \]
where $\gamma$ denotes the Euler-Mascheroni constant. We have $B_1E_1 = 2a_m \frac{a_m}{1-\Lambda} = \frac{a_m^2}{(1-\Lambda)^2} = E_1^2$, and also $\frac{B_1E_2}{1-\Lambda} = 2E_1E_2$, such that
\[ \psi_n \sum_{j=1}^{n} \frac{\beta_j \mathbb{E}[W_{j-1}] + a_m^2}{\psi_j} = E_1^2 n^2 + 2E_1E_2\, n^{3/2} + \bigl(a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M\bigr)\, n \ln n + O(n). \]
Consequently, the first two terms in $\mathbb{E}[W_n^2] - (\mathbb{E}[W_n])^2$ cancel out again, and the important constant is given by
\[ a_m^2 + B_1E_3 + E_1B_2 - B_1E_1 - B_1E_1M. \]
The stated result is obtained after simplification.

For large-index urns, $\Lambda > 1/2$, we cannot neglect errors of magnitude $O(1)$ as in the case $0 < \Lambda < 1/2$. In order to deal with the cancellations, we adapt (13) and use a different exact representation
\[ \mathbb{E}[W_n^2] = \frac{\psi_n}{\psi_0}\, W_0^2 + \psi_n \sum_{j=1}^{n} \frac{\beta_j \mathbb{E}[W_{j-1}] + a_m^2 - \psi_j B_1 j^{-\Lambda}\bigl(E_1 j^{1-\Lambda} + E_2\bigr)}{\psi_j} + \psi_n B_1 \sum_{j=1}^{n} \bigl(E_1 j^{1-2\Lambda} + E_2 j^{-\Lambda}\bigr). \]
Owing to (19) we know that the first sum is convergent by the comparison test.
Application of (20) to the second sum gives
\[ \psi_n B_1 \sum_{j=1}^{n} \bigl(E_1 j^{1-2\Lambda} + E_2 j^{-\Lambda}\bigr) = E_1^2 n^2 + 2E_1E_2\, n^{1+\Lambda} + B_1\bigl(E_1 \zeta(2\Lambda-1) + E_2 \zeta(\Lambda)\bigr) n^{2\Lambda} + o(n^{2\Lambda}). \]
The first two terms in $\mathbb{E}[W_n^2] - (\mathbb{E}[W_n])^2$ cancel out, and the constant $C$ is given by
\[ \frac{W_0^2}{\psi_0} + \sum_{j=1}^{\infty} \frac{\beta_j \mathbb{E}[W_{j-1}] + a_m^2 - \psi_j B_1 j^{-\Lambda}\bigl(E_1 j^{1-\Lambda} + E_2\bigr)}{\psi_j} + B_1\bigl(E_1 \zeta(2\Lambda-1) + E_2 \zeta(\Lambda)\bigr) - E_2^2, \]
which proves the stated result. □

4.2. Almost-sure convergence of nontriangular urns.
For triangular urns with $a_m = 0$ (or $b_0 = 0$, or both) we have already obtained a limit theorem for $W_n$ via the martingale in Proposition 4. A first byproduct of the previous result concerning the first and second moments is a limit theorem for $W_n/T_n$ for $a_m \ne 0$ (and $b_0 \ne 0$).

Proposition 5. Let $W_n$ be the number of white balls in the urn after $n$ draws. For nontriangular balanced affine urn models with $\Lambda < 1$, the ratio of white balls $W_n$ over the total number $T_n = T_0 + n\sigma$ after $n$ draws converges almost surely:
\[ \frac{W_n}{T_n} \xrightarrow{\text{a.s.}} \frac{a_m}{\sigma(1-\Lambda)} = \frac{a_m}{a_m + b_0}. \]

Proof.
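Following [5] we use supermartingale theory to obtain the stated result. We only present the computation for model $\mathcal{R}$; the proof for model $\mathcal{M}$ is very similar. The following computations are somewhat lengthy and are preferably carried out with the help of a computer algebra system. Let
\[ Z_n = \frac{W_n}{T_n} - \frac{a_m}{\sigma(1-\Lambda)}. \]
Using Proposition 1 we obtain
\[ \mathbb{E}\bigl[Z_n \mid \mathcal{F}_{n-1}\bigr] = \Bigl(1 + \frac{\sigma\Lambda}{T_{n-1}}\Bigr)\Bigl(1 - \frac{\sigma}{T_n}\Bigr) Z_{n-1} = \frac{T_{n-1} + \sigma\Lambda}{T_n}\, Z_{n-1}. \]
Furthermore, in a manner similar to the proof of Proposition 3, we get
\[ \mathbb{E}\bigl[Z_n^2 \mid \mathcal{F}_{n-1}\bigr] = \frac{T_{n-1}^2}{T_n^2}\Bigl(1 + \frac{2\sigma\Lambda}{T_{n-1}} + \frac{\Lambda^2(m-1)\sigma^2}{m T_{n-1}^2}\Bigr) Z_{n-1}^2 + \frac{\Lambda^2\sigma\bigl(\sigma(1-\Lambda) - 2a_m\bigr)}{(1-\Lambda)\, m T_n^2}\, Z_{n-1} + \frac{a_m\Lambda^2\bigl(\sigma(1-\Lambda) - a_m\bigr)}{(1-\Lambda)^2\, m T_n^2}. \]
Hence,
\[ \mathbb{E}\bigl[Z_n^2 \mid \mathcal{F}_{n-1}\bigr] \le \frac{(T_{n-1} + \sigma\Lambda)^2}{T_n^2}\, Z_{n-1}^2 + \frac{\Lambda^2\sigma\bigl(\sigma(1-\Lambda) - 2a_m\bigr)}{(1-\Lambda)\, m T_n^2}\, Z_{n-1} + \frac{a_m\Lambda^2\bigl(\sigma(1-\Lambda) - a_m\bigr)}{(1-\Lambda)^2\, m T_n^2}. \]
Now we use the fact that $\sigma(1-\Lambda) - a_m = b_0 \ge 0$ and also $a_m \ge 0$. Moreover, we know that $0 \le W_n/T_n \le 1$, and consequently
\[ -\frac{a_m}{\sigma(1-\Lambda)} \le Z_n \le 1 - \frac{a_m}{\sigma(1-\Lambda)}. \]
Thus, there exists a constant $\kappa_1 = \kappa_1(m, a_{m-1}, a_m, \sigma)$, independent of $n$, such that
\[ \mathbb{E}\bigl[Z_n^2 \mid \mathcal{F}_{n-1}\bigr] \le \frac{(T_{n-1} + \sigma\Lambda)^2}{T_n^2}\, Z_{n-1}^2 + \frac{\Lambda^2\sigma\,\bigl|(b_0 - a_m) Z_{n-1}\bigr|}{(1-\Lambda)\, m T_n^2} + \frac{a_m b_0 \Lambda^2}{(1-\Lambda)^2\, m T_n^2} \le \frac{(T_{n-1} + \sigma\Lambda)^2}{T_n^2}\, Z_{n-1}^2 + \frac{\kappa_1}{T_n^2}. \]
There exists a constant $\kappa_2 > 0$ such that
\[ \frac{\kappa_1}{T_n^2} + \frac{\kappa_2}{T_n} \le \frac{\kappa_2}{T_{n-1}}. \]
Let $c_n = \frac{(T_{n-1} + \sigma\Lambda)^2}{T_n^2}$, with $0 < c_n < 1$, and $d_n = \frac{\kappa_1}{T_n^2} > 0$. We have
\[ \mathbb{E}\Bigl[Z_n^2 + \frac{\kappa_2}{T_n} \,\Big|\, \mathcal{F}_{n-1}\Bigr] \le c_n Z_{n-1}^2 + d_n + \frac{\kappa_2}{T_n} \le Z_{n-1}^2 + \frac{\kappa_1}{T_n^2} + \frac{\kappa_2}{T_n} \le Z_{n-1}^2 + \frac{\kappa_2}{T_{n-1}}. \]
Hence, $Z_n^2 + \kappa_2/T_n$ is a positive supermartingale, which converges almost surely. Thus, $Z_n^2$ converges almost surely. Let $\lim_{n\to\infty} Z_n^2 = Z^2$ almost surely. Following [5], we next prove that $\mathbb{E}[Z_n^2] \to 0$. By dominated convergence this is sufficient to obtain the stated result, since it implies that $\mathbb{E}[Z^2] = 0$ and so $Z^2 = 0$ almost surely, such that $Z_n$ converges to $0$ almost surely. We have
\[ \mathbb{E}[Z_n^2] \le c_n\, \mathbb{E}[Z_{n-1}^2] + d_n. \]
By the comparison theorem,
\[ \sum_{n=1}^{\infty} d_n = \sum_{n=1}^{\infty} \frac{\kappa_1}{(n\sigma + T_0)^2} < \infty. \]
Moreover,
\[ \prod_{k=1}^{n} \frac{(T_{k-1} + \sigma\Lambda)^2}{T_k^2} = \prod_{k=1}^{n} \frac{\bigl(k + \Lambda - 1 + \frac{T_0}{\sigma}\bigr)^2}{\bigl(k + \frac{T_0}{\sigma}\bigr)^2} = \frac{\Gamma^2\bigl(n + \Lambda + \frac{T_0}{\sigma}\bigr)\, \Gamma^2\bigl(1 + \frac{T_0}{\sigma}\bigr)}{\Gamma^2\bigl(\Lambda + \frac{T_0}{\sigma}\bigr)\, \Gamma^2\bigl(n + 1 + \frac{T_0}{\sigma}\bigr)}. \]
Thus, we can use the following lemma, also used in [5], to show that $\mathbb{E}[Z_n^2] \to 0$ and to finish our proof.

Lemma 2 ([4]). Suppose $\{x_n\}_{n\ge1}$, $\{c_n\}_{n\ge1}$ and $\{d_n\}_{n\ge1}$ are nonnegative real sequences satisfying $x_{n+1} \le c_n x_n + d_n$, where $0 < c_n < 1$ for $n \ge 1$. If $\prod_{i=1}^{n} c_i \to 0$ and $\sum_{n=1}^{\infty} d_n < \infty$, then $x_n \to 0$.

By Stirling’s formula the product $\prod_{k=1}^{n} \frac{(T_{k-1} + \sigma\Lambda)^2}{T_k^2}$ satisfies the asymptotic expansion
\[ \prod_{k=1}^{n} \frac{(T_{k-1} + \sigma\Lambda)^2}{T_k^2} \sim \kappa_3\, n^{2\Lambda - 2}, \]
for some constant $\kappa_3$. Since $\Lambda < 1$ the product tends to zero, and so does $\mathbb{E}[Z_n^2]$. □

4.3. Almost-sure convergence for urns with large index.

Theorem 2.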
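For nontriangular balanced affine urn models with a large index $1/2 < \Lambda < 1$, the random variable $\mathcal{W}_n = g_n\bigl(W_n - \mathbb{E}[W_n]\bigr)$ converges almost surely and in $L_2$ to a limit $\mathcal{W}_\infty$.

Proof. By Proposition 4, $\mathcal{W}_n$ is a martingale. Hence, by martingale theory it suffices to prove that
\[ \sum_{n=1}^{\infty} \mathbb{E}\bigl[(\nabla\mathcal{W}_n)^2\bigr] < \infty \]
in order to prove almost-sure and $L_2$ convergence (see Chapter 10 of [25]). We use a standard argument:
\[ \mathbb{E}\bigl[(\nabla\mathcal{W}_n)^2 \mid \mathcal{F}_{n-1}\bigr] = \mathbb{E}\bigl[(\mathcal{W}_n - \mathcal{W}_{n-1})^2 \mid \mathcal{F}_{n-1}\bigr] = \mathbb{E}\bigl[\mathcal{W}_n^2 - 2\mathcal{W}_n\mathcal{W}_{n-1} + \mathcal{W}_{n-1}^2 \mid \mathcal{F}_{n-1}\bigr]. \]
By the martingale property we get further
\[ \mathbb{E}\bigl[(\nabla\mathcal{W}_n)^2 \mid \mathcal{F}_{n-1}\bigr] = \mathbb{E}\bigl[\mathcal{W}_n^2 \mid \mathcal{F}_{n-1}\bigr] - 2\mathcal{W}_{n-1}\,\mathbb{E}\bigl[\mathcal{W}_n \mid \mathcal{F}_{n-1}\bigr] + \mathcal{W}_{n-1}^2 = \mathbb{E}\bigl[\mathcal{W}_n^2 \mid \mathcal{F}_{n-1}\bigr] - \mathcal{W}_{n-1}^2. \]
This implies that
\[ \mathbb{E}\bigl[(\nabla\mathcal{W}_n)^2\bigr] = \mathbb{E}\bigl[\mathcal{W}_n^2\bigr] - \mathbb{E}\bigl[\mathcal{W}_{n-1}^2\bigr], \qquad n \ge 1. \]
Using the fact that $\mathcal{W}_0 = 0$ we obtain
\[ \sum_{n=1}^{N} \mathbb{E}\bigl[(\nabla\mathcal{W}_n)^2\bigr] = \mathbb{E}\bigl[\mathcal{W}_N^2\bigr] = g_N^2\, \mathbb{V}[W_N]. \]
By the asymptotic expansions (16) of $g_n$ and of $\mathbb{V}[W_n]$ (Theorem 1) we observe that
\[ \lim_{N\to\infty} \sum_{n=1}^{N} \mathbb{E}\bigl[(\nabla\mathcal{W}_n)^2\bigr] = C\, \frac{\Gamma^2(T_0/\sigma + \Lambda)}{\Gamma^2(T_0/\sigma)} < \infty, \]
with $C$ as given in Theorem 1. □

Some corollaries of the relatively small variance for $\Lambda \le 1/2$ are helpful in deriving further distributional results.

Corollary 1.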
Let $W_n$ be the number of white balls in the urn after $n$ draws. Then, for $\Lambda \le 1/2$, we have
\[ W_n = \frac{a_m}{1-\Lambda}\, n + O_{L_1}\bigl(\sqrt{\mathbb{V}[W_n]}\bigr) = \frac{a_m}{1-\Lambda}\, n + \begin{cases} O_{L_1}(\sqrt{n}), & \Lambda < \tfrac12; \\ O_{L_1}(\sqrt{n \ln n}), & \Lambda = \tfrac12, \end{cases} \]
and
\[ W_n^2 = \Bigl(\frac{a_m}{1-\Lambda}\Bigr)^2 n^2 + O_{L_1}\bigl(n\sqrt{\mathbb{V}[W_n]}\bigr) = \Bigl(\frac{a_m}{1-\Lambda}\Bigr)^2 n^2 + \begin{cases} O_{L_1}(n^{3/2}), & \Lambda < \tfrac12; \\ O_{L_1}(n^{3/2}\sqrt{\ln n}), & \Lambda = \tfrac12. \end{cases} \]
Here, by saying that a sequence of random variables $Y_n$ is $O_{L_1}(g(n))$, we mean that there exist a positive constant $C$ and a positive integer $n_0$ such that $\mathbb{E}[|Y_n|] \le C|g(n)|$ for all $n \ge n_0$.

Proof. From the asymptotics of the mean and variance, as given in Proposition 2, (15) and Theorem 1, for large $n$ we have
\[ \mathbb{E}\Bigl[\Bigl(W_n - \frac{a_m}{1-\Lambda}\, n\Bigr)^2\Bigr] = \mathbb{E}\Bigl[\Bigl(\bigl(W_n - \mathbb{E}[W_n]\bigr) + \Bigl(\mathbb{E}[W_n] - \frac{a_m}{1-\Lambda}\, n\Bigr)\Bigr)^2\Bigr] = \mathbb{V}[W_n] + \Bigl(\mathbb{E}[W_n] - \frac{a_m}{1-\Lambda}\, n\Bigr)^2 = O\bigl(\mathbb{V}[W_n]\bigr). \tag{24} \]
So, by Jensen's inequality,
\[ \mathbb{E}\Bigl[\Bigl|W_n - \frac{a_m}{1-\Lambda}\, n\Bigr|\Bigr] \le \sqrt{\mathbb{E}\Bigl[\Bigl(W_n - \frac{a_m}{1-\Lambda}\, n\Bigr)^2\Bigr]} = O\bigl(\sqrt{\mathbb{V}[W_n]}\bigr), \]
and this implies
\[ W_n = \frac{a_m}{1-\Lambda}\, n + O_{L_1}\bigl(\sqrt{\mathbb{V}[W_n]}\bigr). \]
The second part follows by squaring. □
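Corollary 2. Let $W_n$ be the number of white balls in the urn after $n$ draws. Then, we have
\[ \frac{W_n}{n} \xrightarrow{L_1} \frac{a_m}{1-\Lambda}, \qquad \text{and} \qquad \frac{W_n^2}{n^2} \xrightarrow{L_1} \Bigl(\frac{a_m}{1-\Lambda}\Bigr)^2. \]
So, both convergences occur in probability, too.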
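The first convergence in Corollary 2 can be observed empirically with a minimal simulation sketch. The function name, the parameter names and the example urn are hypothetical. The sketch assumes that $a_k$ white balls are added when the sample contains $k$ black balls, which is how the conditional mean $\mathbb{E}[\omega_j \mid \mathcal{F}_{j-1}] = \sigma\Lambda W_{j-1}/T_{j-1} + a_m$ quoted in this section reads. Both sampling schemes are covered by drawing the $m$ balls one at a time.

```python
import random


def simulate_white(n, m, sigma, a_m_minus_1, a_m, white0, black0, rng, replace=True):
    """Run n draws of a balanced affine urn and return W_n (illustrative sketch).

    Assumption: a[k] white balls are added when the sample holds k black balls.
    """
    a = [(m - k) * (a_m_minus_1 - a_m) + a_m for k in range(m + 1)]
    white, total = white0, white0 + black0
    for _ in range(n):
        w, t, whites_drawn = white, total, 0
        for _ in range(m):
            if rng.random() * t < w:  # the picked ball is white
                whites_drawn += 1
                if not replace:
                    w -= 1  # a drawn white ball is not returned before the next pick
            if not replace:
                t -= 1
        white += a[m - whites_drawn]  # m - whites_drawn black balls in the sample
        total += sigma  # balance: sigma balls are added at every step
    return white
```

With $m = 2$, $\sigma = 2$ and $(a_0, a_1, a_2) = (0, 1, 2)$ we have $\Lambda = -1$, so $a_m/(1-\Lambda) = 1$. A call such as `simulate_white(20000, 2, 2, 1, 2, 1, 1, random.Random(7)) / 20000` should therefore return a value close to 1 for either sampling scheme.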
4.4. Martingale central limit theorem.

We follow the approach used in [17, 20] for special urns. We obtain a Gaussian law for $\mathcal{W}_n$ if a set of conditions for the martingale central limit theorem is satisfied. There is more than one such set (see [10]). One set of such conditions convenient in our work is the combination of the conditional Lindeberg condition and the conditional variance condition. The conditional Lindeberg condition requires that, for some positive increasing sequence $\xi_n$, and for all $\varepsilon > 0$,
\[ U_n := \sum_{j=1}^{n} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2\, \mathbb{I}\Bigl(\Bigl|\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr| > \varepsilon\Bigr) \,\Big|\, \mathcal{F}_{j-1}\Bigr] \xrightarrow{P} 0, \]
and the conditional variance condition requires that, for some square integrable random variable $Y \ne 0$, we have
\[ V_n := \sum_{j=1}^{n} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2 \,\Big|\, \mathcal{F}_{j-1}\Bigr] \xrightarrow{P} Y. \]
When these conditions are satisfied, we get
\[ \frac{\mathcal{W}_n}{\xi_n} = \frac{g_n\bigl(W_n - \mathbb{E}[W_n]\bigr)}{\xi_n} \xrightarrow{D} N(0, Y), \]
where the right-hand side is a mixture of normals, with $Y$ being the mixer. It will turn out that the correct scale factors are
\[ \xi_n = \begin{cases} n^{\frac12 - \Lambda}, & \text{for } \Lambda < \tfrac12, \\ \sqrt{\ln n}, & \text{for } \Lambda = \tfrac12. \end{cases} \tag{25} \]

Lemma 3.
The terms $|\nabla\mathcal{W}_j|$ satisfy $|\nabla\mathcal{W}_j| \le K j^{-\Lambda}$ for some positive constant $K$ and $1 \le j \le n$.

Proof. Suppose $\omega_j = W_j - W_{j-1}$ is the random number of white balls added right after the $j$th drawing. Starting from the definition of $\mathcal{W}_j$, we write the absolute difference as
\[
\begin{aligned}
|\nabla\mathcal{W}_j| = |\mathcal{W}_j - \mathcal{W}_{j-1}| &= \Bigl|\Bigl(g_j W_j - a_m \sum_{k=1}^{j} g_k - W_0\Bigr) - \Bigl(g_{j-1} W_{j-1} - a_m \sum_{k=1}^{j-1} g_k - W_0\Bigr)\Bigr| \\
&= \bigl|g_j (W_{j-1} + \omega_j) - a_m g_j - g_{j-1} W_{j-1}\bigr| \\
&= \bigl|W_{j-1} \nabla g_j + g_j \omega_j - a_m g_j\bigr| \\
&\le T_{j-1}\, |g_j - g_{j-1}| + 2 q g_j,
\end{aligned}
\]
with $q = \max_{0 \le k \le m} |a_k|$. By definition of $g_n$ and the asymptotic expansion (16), $g_j = O(j^{-\Lambda})$ and further $|g_j - g_{j-1}| = O(j^{-1-\Lambda})$. Consequently, there exists a constant $K > 0$ such that $|\nabla\mathcal{W}_j| \le K j^{-\Lambda}$. □

Lemma 4.
\[ U_n = \sum_{j=1}^{n} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2\, \mathbb{I}\Bigl(\Bigl|\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr| > \varepsilon\Bigr) \,\Big|\, \mathcal{F}_{j-1}\Bigr] \xrightarrow{P} 0. \]

Proof.
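Choose any $\varepsilon > 0$. Concerning $\Lambda < 1/2$ we distinguish between $\Lambda < 0$ and $0 < \Lambda < 1/2$. Lemma 3 asserts that for arbitrary $j$ with $1 \le j \le n$
\[ \Bigl|\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr| \le \begin{cases} \dfrac{K j^{-\Lambda}}{n^{\frac12-\Lambda}} \le \dfrac{K}{\sqrt{n}}, & \text{for } \Lambda < 0; \\[2ex] \dfrac{K}{j^{\Lambda}\, n^{\frac12-\Lambda}} \le \dfrac{K}{n^{\frac12-\Lambda}}, & \text{for } 0 < \Lambda < \tfrac12; \\[2ex] \dfrac{K}{\sqrt{j \ln n}} \le \dfrac{K}{\sqrt{\ln n}}, & \text{for } \Lambda = \tfrac12. \end{cases} \]
Hence, the sets $\{|\nabla\mathcal{W}_j| > \varepsilon\, \xi_n\}$ are all empty, regardless of whether $\Lambda < 0$, $0 < \Lambda < 1/2$ or $\Lambda = 1/2$, for all $n$ greater than some positive integer $n_0(\varepsilon)$. For large $n$ (namely, $n > n_0(\varepsilon)$), we can stop the sum at $n_0(\varepsilon)$. By Lemma 3, we get
\[ U_n = \sum_{j=1}^{n_0(\varepsilon)} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2\, \mathbb{I}\Bigl(\Bigl|\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr| > \varepsilon\Bigr) \,\Big|\, \mathcal{F}_{j-1}\Bigr] \le \frac{1}{\xi_n^2} \sum_{j=1}^{n_0(\varepsilon)} \mathbb{E}\bigl[K^2 \mid \mathcal{F}_{j-1}\bigr] \le \frac{K^2 n_0(\varepsilon)}{\xi_n^2} \to 0, \]
for $n \to \infty$. □

Lemma 5. For $\Lambda < 1/2$,
\[ V_n = \sum_{j=1}^{n} \mathbb{E}\Bigl[\Bigl(\frac{\nabla\mathcal{W}_j}{\xi_n}\Bigr)^2 \,\Big|\, \mathcal{F}_{j-1}\Bigr] \xrightarrow{P} \frac{a_m Q^2 \bigl(\sigma(1-\Lambda) - a_m\bigr)\Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)}, \]
and for $\Lambda = 1/2$ the limit is $\frac{a_m Q^2 (\sigma - 2a_m)}{2m}$, with $Q$ as defined in the proof.

Proof.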
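Let $Q := \frac{\Gamma(T_0/\sigma + \Lambda)}{\Gamma(T_0/\sigma)}$. By (16) we have
\[ g_n = Q n^{-\Lambda} + O(n^{-\Lambda-1}). \]
From this asymptotic relation, we also have
\[ \nabla g_n = -Q\Lambda\, n^{-\Lambda-1} + O(n^{-\Lambda-2}). \]
As in the proof of Lemma 3, we write
\[ \nabla\mathcal{W}_j = (\nabla g_j) W_{j-1} + g_j \omega_j - a_m g_j. \]
And so, to leading order, we can write
\[ \bigl(\nabla\mathcal{W}_j\bigr)^2 = \frac{Q^2}{j^{2\Lambda}} \Bigl(\Lambda^2 \frac{W_{j-1}^2}{j^2} - 2\Lambda\, \omega_j \frac{W_{j-1}}{j} + 2\Lambda a_m \frac{W_{j-1}}{j} + \omega_j^2 - 2a_m \omega_j + a_m^2\Bigr). \]
Using the $L_1$ approximation in Corollary 2, we write the conditional expectation
\[ \mathbb{E}\bigl[(\nabla\mathcal{W}_j)^2 \mid \mathcal{F}_{j-1}\bigr] = \frac{Q^2}{j^{2\Lambda}} \Bigl(\Lambda^2 \Bigl(\frac{a_m}{1-\Lambda}\Bigr)^2 - 2\Bigl(\Lambda \frac{a_m}{1-\Lambda} + a_m\Bigr) \mathbb{E}\bigl[\omega_j \mid \mathcal{F}_{j-1}\bigr] + 2\Lambda a_m \frac{a_m}{1-\Lambda} + \mathbb{E}\bigl[\omega_j^2 \mid \mathcal{F}_{j-1}\bigr] + a_m^2\Bigr) + O_{L_1}\Bigl(\frac{1}{j^{2\Lambda + 1/2}}\Bigr). \]
We already know exactly the conditional expectations of $\omega_j$ and $\omega_j^2$ from (12) and (11), respectively:
\[ \mathbb{E}\bigl[\omega_j \mid \mathcal{F}_{j-1}\bigr] = \frac{\sigma\Lambda}{T_{j-1}}\, W_{j-1} + a_m \]
and
\[ \mathbb{E}\bigl[\omega_j^2 \mid \mathcal{F}_{j-1}\bigr] = \mathbb{E}[W_j^2 \mid \mathcal{F}_{j-1}] - 2W_{j-1}\, \mathbb{E}[W_j \mid \mathcal{F}_{j-1}] + W_{j-1}^2 = \Bigl(\alpha_j - 2\Bigl(1 + \frac{\sigma\Lambda}{T_{j-1}}\Bigr) + 1\Bigr) W_{j-1}^2 + (\beta_j - 2a_m) W_{j-1} + \gamma_j, \]
with model-dependent sequences $\alpha_j$, $\gamma_j$ as given in (11) and $\beta_j$ in Proposition 3. Consequently, using asymptotic expansions of $\alpha_j$, $\beta_j$, $\gamma_j$, and Corollary 2, we obtain the following model-independent expansions:
\[ \mathbb{E}\bigl[\omega_j \mid \mathcal{F}_{j-1}\bigr] = \frac{a_m}{1-\Lambda} + O_{L_1}\Bigl(\frac{1}{\sqrt{j}}\Bigr), \]
and
\[ \mathbb{E}\bigl[\omega_j^2 \mid \mathcal{F}_{j-1}\bigr] = \frac{a_m\bigl(\Lambda^2(\sigma - \Lambda\sigma - a_m) + m a_m\bigr)}{m(1-\Lambda)^2} + O_{L_1}\Bigl(\frac{1}{\sqrt{j}}\Bigr). \]
Putting all the elements together and simplifying, we obtain
\[ \mathbb{E}\bigl[(\nabla\mathcal{W}_j)^2 \mid \mathcal{F}_{j-1}\bigr] = \frac{a_m Q^2 \bigl(\sigma(1-\Lambda) - a_m\bigr)\Lambda^2}{m\, j^{2\Lambda} (1-\Lambda)^2} + O_{L_1}\Bigl(\frac{1}{j^{2\Lambda + 1/2}}\Bigr). \]
Now we can sum using (20) and (23). We get, for $\Lambda < 1/2$,
\[ V_n = \frac{a_m Q^2 \bigl(\sigma(1-\Lambda) - a_m\bigr)\Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)} + o_{L_1}(1) \xrightarrow{L_1} \frac{a_m Q^2 \bigl(\sigma(1-\Lambda) - a_m\bigr)\Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)}, \]
and, for $\Lambda = 1/2$,
\[ V_n = \frac{a_m Q^2 (\sigma - 2a_m)}{2m} + o_{L_1}(1) \xrightarrow{L_1} \frac{a_m Q^2 (\sigma - 2a_m)}{2m}. \]
This implies the required convergence in probability. □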
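The relation $g_n = Q n^{-\Lambda} + O(n^{-\Lambda-1})$ used in the proof above can be checked numerically. The following Python sketch compares the exact product $g_n = \prod_{j=0}^{n-1} T_j/(T_j + \sigma\Lambda)$, where $\sigma\Lambda = m(a_{m-1} - a_m)$, with the leading term $Q n^{-\Lambda}$. The function names and the parameter values in the usage note are illustrative assumptions, and the sketch requires $T_0/\sigma + \Lambda > 0$ so that the Gamma function is finite.

```python
import math


def g_exact(n, t0, sigma, lam):
    """g_n = prod_{j=0}^{n-1} T_j / (T_j + sigma * Lambda), with T_j = t0 + j * sigma."""
    g = 1.0
    for j in range(n):
        t_j = t0 + j * sigma
        g *= t_j / (t_j + sigma * lam)
    return g


def g_leading(n, t0, sigma, lam):
    """Leading term Q * n^(-Lambda), Q = Gamma(T_0/sigma + Lambda) / Gamma(T_0/sigma)."""
    q = math.gamma(t0 / sigma + lam) / math.gamma(t0 / sigma)
    return q * n ** (-lam)
```

For instance, with $T_0 = 4$, $\sigma = 8$ and $\Lambda = 1/4$, the ratio `g_exact(n, 4, 8, 0.25) / g_leading(n, 4, 8, 0.25)` should approach 1 at rate $1/n$, in line with the relative error in (16).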
Having checked the two martingale conditions, a Gaussian law follows for the nondegenerate cases:
\[ \frac{\mathcal{W}_n}{n^{\frac12 - \Lambda}} \xrightarrow{D} N\Bigl(0,\ \frac{a_m \bigl(\sigma(1-\Lambda) - a_m\bigr) Q^2 \Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)}\Bigr) \]
for $\Lambda < 1/2$, and
\[ \frac{\mathcal{W}_n}{\sqrt{\log n}} \xrightarrow{D} N\Bigl(0,\ \frac{a_m (\sigma - 2a_m) Q^2}{2m}\Bigr) \]
for $\Lambda = 1/2$, where $\mathcal{W}_n = g_n\bigl(W_n - \mathbb{E}[W_n]\bigr)$ and $g_n \sim Q n^{-\Lambda}$. Translating this into a statement on the number of white balls and using $b_0 = \sigma(1-\Lambda) - a_m$, we get a main result of this investigation.

Theorem 3. Suppose we have a two-color tenable affine balanced urn that grows by sampling sets of size $m$ with or without replacement, with a small or critical index $\Lambda \le 1/2$, and that does not fall in any of the aforementioned degenerate cases. Let $W_n$ be the number of white balls after $n$ draws. Then we have the Gaussian laws
\[ \frac{W_n - \frac{a_m}{1-\Lambda}\, n}{\sqrt{n}} \xrightarrow{D} N\Bigl(0,\ \frac{a_m b_0 \Lambda^2}{m(1-\Lambda)^2(1-2\Lambda)}\Bigr), \qquad \text{for } \Lambda < \tfrac12, \]
and
\[ \frac{W_n - 2a_m n}{\sqrt{n \log n}} \xrightarrow{D} N\Bigl(0,\ \frac{a_m b_0}{m}\Bigr), \qquad \text{for } \Lambda = \tfrac12. \]
Recall that several degenerate cases are excluded from this study, namely, the triangular case where $a_m = 0$ or $b_0 = 0$, the zero-balance cases, and the case $T_0 + m(a_{m-1} - a_m) \le 0$.
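A rough Monte Carlo check of the small-index law in Theorem 3 is sketched below in Python. It estimates the variance of $W_n/\sqrt{n}$ over independent runs and compares it with $a_m b_0 \Lambda^2 / (m(1-\Lambda)^2(1-2\Lambda))$. The function names, the seed and the example urn are hypothetical. Only sampling with replacement is simulated, and $a_k$ is assumed to be the number of white balls added when the sample holds $k$ black balls.

```python
import random
import statistics


def final_white(n, m, sigma, a, white0, black0, rng):
    """W_n under sampling with replacement; a[k] = white balls added for k black in the sample."""
    white, total = white0, white0 + black0
    for _ in range(n):
        whites_drawn = sum(rng.random() * total < white for _ in range(m))
        white += a[m - whites_drawn]
        total += sigma
    return white


def scaled_variance(runs, n, m, sigma, a, white0, black0, seed=0):
    """Sample variance of W_n / sqrt(n) over independent runs."""
    rng = random.Random(seed)
    samples = [final_white(n, m, sigma, a, white0, black0, rng) / n ** 0.5
               for _ in range(runs)]
    return statistics.variance(samples)


def limit_variance(m, sigma, a):
    """Small-index limit variance a_m b_0 Lambda^2 / (m (1 - Lambda)^2 (1 - 2 Lambda))."""
    lam = m * (a[m - 1] - a[m]) / sigma
    b_0 = sigma - a[0]  # assumption: balance gives a_0 + b_0 = sigma
    return a[m] * b_0 * lam ** 2 / (m * (1 - lam) ** 2 * (1 - 2 * lam))
```

For the urn with $m = 2$, $\sigma = 2$ and $(a_0, a_1, a_2) = (0, 1, 2)$, `limit_variance(2, 2, [0, 1, 2])` evaluates the formula at $\Lambda = -1$ and gives $1/6$. An estimate such as `scaled_variance(400, 2000, 2, 2, [0, 1, 2], 1, 1)` should land near that value, up to Monte Carlo error.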
5. Conclusion and Outlook

5.1. Summary.

We studied, for two-color affine linear urn models with multiple drawings (sample size $m \ge 1$) under two sampling models, the distribution of the number of white balls $W_n$ after $n$ draws. In the following we summarize our findings according to the index $\Lambda$ and state the order of growth of the expectation and variance.

\begin{tabular}{l|c|c|c}
 & $\Lambda < \frac12$ & $\Lambda = \frac12$ & $\Lambda > \frac12$ \\ \hline
$\mathbb{E}[W_n]$ & $n$ & $n$ & $n$ \\
$\mathbb{V}[W_n]$ & $n$ & $n \log n$ & $n^{2\Lambda}$ \\
Limit law & $\frac{W_n - \mathbb{E}[W_n]}{\sqrt{\mathbb{V}[W_n]}} \to N(0,1)$ & $\frac{W_n - \mathbb{E}[W_n]}{\sqrt{\mathbb{V}[W_n]}} \to N(0,1)$ & $\mathcal{W}_n \xrightarrow{\text{a.s.}} \mathcal{W}_\infty$
\end{tabular}

Table 1. Overview of the results for nontriangular urns, $a_m b_0 \ne 0$.

Here $\mathcal{W}_n = g_n\bigl(W_n - \mathbb{E}[W_n]\bigr)$ with $g_n = \prod_{j=0}^{n-1} \frac{T_j}{T_j + m(a_{m-1} - a_m)} \sim C n^{-\Lambda}$. Note that for the non-normal limit law for large-index urns the distribution will depend on the sampling model; this will be discussed in a companion work, as well as the moment structure.

5.2. Quadratic expected value and beyond.

Using Lemma 1 it is possible to extend Proposition 1 to characterize all ball replacement matrices leading to a conditional expected value of quadratic type, cubic type, etc., and in general to a polynomial of degree $k \le m$. Beginning with the extension to quadratic types,
\[ \mathbb{E}\bigl[W_n \mid \mathcal{F}_{n-1}\bigr] = \alpha_{n,2} W_{n-1}^2 + \alpha_{n,1} W_{n-1} + \alpha_{n,0}, \qquad n \ge 1, \]
we obtain the same condition for the $a_k$'s for both models but different resulting coefficients $\alpha_{n,2}$, $\alpha_{n,1}$ and $\alpha_{n,0}$. It is possible to obtain an explicit expression for the expected value of $W_n$, but the arising formula is very complicated and does not seem to lead easily to precise asymptotic expansions.

5.3. Urns with a large index, triangular urns, and more colors.
In the companion work [18], we complete the study of balanced urn models with multiple drawings and affine conditional expectation. In particular, we provide a detailed analysis of the moments of $W_n$ for triangular urn models and also for urns with a large index $\Lambda > 1/2$. The analysis is based on the so-called method of moments applied to the centered moments of $W_n$ and the associated martingales $\mathcal{W}_n$.

In order to generalize the affinity condition of Proposition 1 to more than two colors it is beneficial to rewrite the $a_k$'s as an affine combination of $a_0$ and $a_m$: $a_k = \frac{m-k}{m}\, a_0 + \frac{k}{m}\, a_m$, $0 \le k \le m$. This idea can be readily generalized to $r \ge 2$ colors. One obtains a martingale structure of the form
\[ \mathbb{E}[X_n \mid \mathcal{F}_{n-1}] = \Bigl(I + \frac{1}{T_{n-1}}\, \tilde{A}^T\Bigr) X_{n-1}, \]
with $X_n = (X_n^{(1)}, \dots, X_n^{(r)})^T$, where $X_n^{(\ell)}$ denotes the random number of balls colored $\ell$ and $I$ the identity matrix. The matrix $\tilde{A}^T$ is a certain $r \times r$ matrix (somewhat similar to the “reduced” ball replacement matrix $A$ introduced before), being composed of $r$ vectors appearing in the general affinity condition. Compared to the two-color case, simple expressions for the (mixed) moments of $X_n$ do not seem to exist, but it may be possible to study the limiting distribution of $X_n$ using different methods.

References

[1] A. Bagchi and A. K. Pal (1985). Asymptotic normality in the generalized Pólya-Eggenberger urn model, with an application to computer data structures. SIAM J. Algebraic Discrete Math., 394–405.
[2] B. Chauvin, N. Pouyanne and R. Sahnoun (2011). Limit distributions for large Pólya urns. The Annals of Applied Probability, 1–32.
[3] B. Chauvin, N. Pouyanne, and C. Mailler (2014). Smoothing equations for large Pólya urns. Journal of Theoretical Probability (to appear).
[4] M.-R. Chen, S.-R. Hsiau and T.-H. Yaun (2014+). A New Two-Urn Model. Journal of Applied Probability (to appear).
[5] M.-R. Chen and M. Kuba (2013). On generalized Polya urn models. Journal of Applied Probability, Volume 50, Number 4, 1169–1186.
[6] M.-R. Chen and C.-Z. Wei (2005). A New Urn Model. Journal of Applied Probability, 964–976.
[7] F. Eggenberger and G. Pólya (1923). Über die Statistik verketteter Vorgänge. Z. Angewandte Math. Mech., 279–289.
[8] P. Ehrenfest and T. Ehrenfest (1907). Über zwei bekannte Einwände gegen das Boltzmannsche H-Theorem. Physikalische Zeitschrift, 311–314.
[9] R. L. Graham, D. E. Knuth, and O. Patashnik (1994). Concrete Mathematics. Addison-Wesley.
[10] P. Hall and C. Heyde (1980). Martingale Limit Theory and Its Applications. Academic Press, New York.
[11] S. Janson (2004). Functional limit theorems for multitype branching processes and generalized Pólya urns. Stochastic Processes and Applications, 177–245.
[12] S. Janson (2006). Limit theorems for triangular urn schemes. Probability Theory and Related Fields, 417–452.
[13] N. L. Johnson and S. Kotz (1977). Urn Models and Their Application. John Wiley, New York.
[14] N. L. Johnson, S. Kotz, and H. Mahmoud (2004). Pólya-type urn models with multiple drawings. Journal of the Iranian Statistical Society, 165–173.
[15] M. Knape and R. Neininger (2014+). Pólya urns via the contraction method. Combinatorics, Probability and Computing (Special issue dedicated to the memory of Philippe Flajolet) (to appear).
[16] S. Kotz and N. Balakrishnan (1997). Advances in urn models during the past two decades. Advances in Combinatorial Methods and Applications to Probability and Statistics, Birkhäuser, Boston, MA, pp. 203–257.
[17] M. Kuba, H. Mahmoud and A. Panholzer (2013). Analysis of a generalized Friedman's urn with multiple drawings. Discrete Applied Mathematics, Volume 161, Issue 18, 2968–2984.
[18] M. Kuba and H. Mahmoud (2014+). On urn models with multiple drawings II: triangular urns and urns with a large index. Preprint.
[19] H. Mahmoud (2008). Pólya Urn Models. Chapman-Hall, Orlando.
[20] H. Mahmoud (2013). Drawing multisets of balls from tenable balanced linear urns. Probability in the Engineering and Informational Sciences, 147–162.
[21] J. Moler, F. Plo and H. Urmeneta (2013). A generalized Pólya urn and limit laws for the number of outputs in a family of random circuits. TEST, 46–61.
[22] N. Pouyanne (2008). An algebraic approach to Pólya processes. Annales de l'Institut Henri Poincaré, Vol. 44, No. 2, 293–323.
[23] H. Renlund (2010). Generalized Pólya urns via stochastic approximation, available on the arXiv, http://arxiv.org/abs/1002.3716.
[24] T. Tsukiji and H. Mahmoud (2001). A limit law for outputs in random circuits. Algorithmica, 403–412.
[25] D. Williams (1991). Probability with Martingales. Cambridge University Press, Cambridge, UK.

Markus Kuba, Institute of Applied Mathematics and Natural Sciences, University of Applied Sciences - Technikum Wien, Höchstädtplatz 5, 1200 Wien
E-mail address: [email protected]

Hosam M. Mahmoud, Department of Statistics, The George Washington University, Washington, D.C. 20052, U.S.A.