A Ray-Knight representation of up-down Chinese restaurants
Dane Rogers∗ and Matthias Winkel∗
June 12, 2020
Abstract
We study composition-valued continuous-time Markov chains that appear naturally in the framework of Chinese Restaurant Processes (CRPs). As time evolves, new customers arrive (up-step) and existing customers leave (down-step) at suitable rates derived from the ordered CRP of Pitman and Winkel (2009). We relate such up-down CRPs to the splitting trees of Lambert (2010) inducing spectrally positive Lévy processes. Conversely, we develop theorems of Ray–Knight type to recover more general up-down CRPs from the heights of Lévy processes with jumps marked by integer-valued paths. We further establish limit theorems for the Lévy process and the integer-valued paths to connect to work by Forman et al. (2018+) on interval partition diffusions and hence to some long-standing conjectures.
Keywords:
Chinese Restaurant Process; composition; Ray–Knight theorem; scaling limit; squared Bessel process; stable process.
The purpose of this paper is to study a class of continuous-time Markov chains in the state space
\[ \mathcal{C} := \big\{ (n_1, \ldots, n_k) : k \ge 0,\ n_1, \ldots, n_k \ge 1 \big\} \]
of integer compositions, which includes, for $k = 0$, the empty vector that we also denote by $\emptyset$. Such Markov chains arise naturally in the framework of the Dubins–Pitman two-parameter Chinese Restaurant Process (CRP) [29], when considered with the additional order structure of Pitman and Winkel [30]. Specifically, we interpret $n_1, \ldots, n_k$ as the numbers of customers at an ordered list of $k$ tables in a restaurant, table sizes for short. In the most basic model, we allow only the following transitions from $(n_1, \ldots, n_k) \in \mathcal{C}$.
• At rate $n_i - \alpha$, a new customer joins the $i$-th table, leading to a transition into state $(n_1, \ldots, n_{i-1}, n_i + 1, n_{i+1}, \ldots, n_k)$, $1 \le i \le k$.
• At rate $\alpha$, a new customer opens a new table inserted directly to the right of the $i$-th table, leading to state $(n_1, \ldots, n_i, 1, n_{i+1}, \ldots, n_k)$, $1 \le i \le k$.
• At rate $\theta$, a new customer opens a new table inserted in the left-most position, leading to state $(1, n_1, \ldots, n_k)$.
• At rate 1, each existing customer at table $i$, $1 \le i \le k$, leaves, leading to either $(n_1, \ldots, n_{i-1}, n_i - 1, n_{i+1}, \ldots, n_k)$ if $n_i \ge 2$, or $(n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_k)$ if $n_i = 1$.
These transition rates give rise to a $\mathcal{C}$-valued continuous-time Markov chain if $0 \le \alpha \le 1$ and $\theta \ge 0$, which we call a continuous-time up-down ordered Chinese Restaurant Process with parameters $(\alpha, \theta)$, or, for the purposes of this paper, just up-down oCRP$(\alpha, \theta)$.

∗ University of Oxford, Department of Statistics, 24–29 St Giles', Oxford OX1 3LB, UK

The first three bullet points increase the number of customers and we call any one of them an up-step, while the last bullet point decreases their number, a down-step. Conditionally given an up-step, the (induced discrete-time) transition probabilities are those of an ordered CRP [30]. Without the order of tables, the up-step rates have been related to the usual Dubins–Pitman CRP [29, Section 3.4], where the middle two bullet points combine to a rate $k\alpha + \theta$ for a new table; see also [23]. See [22, 16, 31] for other discussions of order structures related to CRPs. It was shown in [30, Proposition 6] that starting from $(1) \in \mathcal{C}$, the distribution $p_{n+1}$ after $n$ consecutive up-steps is the same as in (the left-right reversal of) a Gnedin–Pitman [16] regenerative composition structure that is known to be weakly sampling consistent, i.e. the push-forward of $p_{n+1}$ under a down-step is $p_n$. Specifically, writing $N_i = n_i + \cdots + n_k$,
\[ p_n(n_k, \ldots, n_1) = \prod_{i=1}^{k} r(N_i, n_i), \quad \text{where} \quad r(n, m) = \binom{n}{m} \frac{(n-m)\alpha + m\theta}{n} \, \frac{(1-\alpha)^{(m-1)\uparrow}}{(n-m+\theta)^{m\uparrow}}, \]
for $1 \le m \le n$, with $x^{y\uparrow} := x(x+1)\cdots(x+y-1)$. Consider a discrete-time up-down oCRP$(\alpha, \theta)$ on $\mathcal{C}_n := \big\{ (n_1, \ldots, n_k) \in \mathcal{C} : n_1 + \cdots + n_k = n \big\}$, $n \ge 1$, in which each transition consists of an up-step followed by a down-step. Then $p_n$ is stationary. Similar up-down chains on related state spaces were studied in [6, 14], and for the corresponding discrete-time up-down CRP without the order of tables, Petrov [28] noted the stationary distribution, which in our setting is obtained as the push-forward of $p_n$ under the map that ranks a vector into decreasing order. Petrov's main result was the existence of a diffusive scaling limit of his up-down chain, when represented in the infinite-dimensional simplex of decreasing sequences with sum 1.

Conjecture 1.1. Let $(C^n_j)_{j \ge 0}$ be a $\mathcal{C}_n$-valued discrete-time up-down oCRP$(\alpha, \theta)$, for each $n \ge 1$. If $n^{-1} C^n_0$ converges, as $n \to \infty$, then $(n^{-1} C^n_{[n^2 y]})_{y \ge 0}$ has a diffusive scaling limit, when suitably represented on a space of interval partitions. For $0 < \theta = \alpha < 1$ and $\theta = 0 < \alpha < 1$, these are the $(\alpha, \alpha)$- and $(\alpha, 0)$-interval-partition evolutions of [13].

While Petrov [28] used analytic methods studying how generators act on a certain core of symmetric functions, we develop here probabilistic methods in order to study scaling limits associated with the (continuous-time) up-down oCRP$(\alpha, \theta)$. As demonstrated e.g. by Pal [27] in a finite-dimensional setting, asymptotic results for continuous-time Markov chains associated with discrete-time Markov chains via a method that can be referred to as "Poissonization" may sometimes be used to deduce asymptotic results for the discrete-time Markov chains via "de-Poissonization". See also Shiga [35]. Supporting the conjecture, it is confirmed in [34, Theorem 3.1.2] that the mixing times of $(C^n_j)_{j \ge 0}$, in the sense of controlling the maximal separation distance of Aldous and Diaconis [3], are of order $n^2$, extending results of Fulman [14] in the unordered case.

In the present paper, we do not explore further the passage between discrete and continuous time. We relate the (continuous-time) up-down oCRP$(\alpha, \theta)$ to genealogical trees [15] and their jumping chronological contour processes (JCCPs) [25] that lead to representations as spectrally positive Lévy processes, whose jumps we further mark by integer-valued paths. We show that the up-down oCRP$(\alpha, \theta)$, and natural generalisations, can be recovered from the heights of such a marked Lévy process.
This result is reminiscent of the recovery of a geometric Galton–Watson process from the occupation measure (upcrossing counts) of a suitably stopped simple symmetric random walk [24] or indeed, in the scaling limit, the recovery of a Feller diffusion (squared Bessel process of dimension 0) as local time process of a stopped Brownian motion.

Figure 1: The relationship between a genealogical tree $\mathcal{T}$ and its JCCP $X$.

In our framework, we establish scaling limits of the Lévy process and of the integer-valued paths marking its jumps, hence connecting to the framework of Forman et al. [11, 12, 13] and thereby to two long-standing conjectures. Specifically, Aldous [2] conjectured the existence of a scaling limit for a simple up-down Markov chain with uniform stationary distribution on certain discrete binary trees as a diffusive evolution of a Brownian Continuum Random Tree [1]. Feng and Sun [8] conjectured the existence of certain measure-valued diffusions whose stationary distributions are measures with Poisson–Dirichlet distributed atom sizes, also known as Pitman–Yor processes [36] or three-parameter Dirichlet processes [7] in the Bayesian non-parametrics literature.

Let us first explain how an up-down oCRP$(\alpha, 0)$, $(S^y)_{y \ge 0}$, induces a genealogy. The rates stated at the beginning of the introduction are such that every table evolves in size as an integer-valued Markov chain with up-rate $m - \alpha$ and down-rate $m$ when in state $m \ge 1$, and is removed when hitting state 0. Hence, tables have "death" times. Furthermore, while a table is "alive", new tables are inserted ("children born") directly to its right at rate $\alpha$. Note the recursive nature of the model and the independence of table size evolutions. This is the genealogy [15, 25] of a binary homogeneous Crump–Mode–Jagers (CMJ) branching process. Figure 1 captures this as a tree $\mathcal{T}$ with a vertical line for each table and horizontal arrows linking each parent table to its children at heights/levels corresponding to birth times.

Travelling around this genealogical tree $\mathcal{T}$, recording heights as in Figure 1, yields Lambert's [25] jumping chronological contour process (JCCP) $X$: each vertical line yields a jump from a birth level to a death level, and exploration is by sliding down at unit speed and recursively jumping up at the birth levels of children. As jump heights are IID table lifetimes and jumps occur at the rate $\alpha$ of table insertions, this process is a Lévy process starting from an initial table lifetime and stopped when reaching 0.

We further mark each jump, at time $U_i$, of the JCCP of height $\zeta_i = X_{U_i} - X_{U_i-}$ by the table size evolution $\big(Z_i(s)\big)_{0 \le s \le \zeta_i}$, as in Figure 2. In a general framework of CMJ processes, this is what Jagers [20, 21] studied: each individual has "characteristics" that vary during its lifetime. Key quantities of interest in a CMJ process are the characteristics $Z_i(y - X_{U_i-})$ at each time $y$, or summary statistics such as sums of these characteristics. Conversely, in a marked JCCP setting we can at each level $y$ extract a composition $S^y$ by listing from left to right, for each jump crossing level $y$, the size given by the mark for that level.
In this construction, we refer to $(S^y)_{y \ge 0}$ as the skewer process as we imagine piercing the marked JCCP at level $y$ and pushing together all sizes that we find at this level to form a sequence without gaps. See Figure 2. This is the discrete analogue of the skewer process of [12]. We provide a rigorous set-up in Section 2 and prove carefully the following result, formulated here in the setting of size evolutions on $\mathbb{N} := \{0, 1, 2, \ldots\}$, which we call $Q_\alpha$-Markov chains, with Q-matrix $Q_\alpha$ whose only non-zero off-diagonal entries are $q_\alpha(m, m+1) = m - \alpha$ and $q_\alpha(m, m-1) = m$ for $m \ge 1$.

Figure 2: The skewer at level $y$ of the marked JCCP when $\theta = 0$.

Theorem 1.2.
Let $0 \le \alpha \le 1$ and $n \ge 1$. Let $Z_i$ be independent $Q_\alpha$-Markov chains with $Z_0(0) = n$ and $Z_i(0) = 1$ for $i \ge 1$. Let $\zeta_i = \inf\{s \ge 0 : Z_i(s) = 0\}$ be the absorption times. Let $(J_t)_{t \ge 0}$ be an independent Poisson process of rate $\alpha$ and
\[ X_t = -t + \sum_{i=0}^{J_t} \zeta_i, \quad t \ge 0, \quad \text{and} \quad T = \inf\{t \ge 0 : X_t = 0\}. \tag{1.1} \]
Consider $(X, Z) := \big((X_t)_{0 \le t \le T}, (Z_i)_{0 \le i \le J_T}\big)$, with $Z_i$ as the mark of the $i$-th jump of $X$. Then the skewer process of $(X, Z)$ is an up-down oCRP$(\alpha, 0)$ starting from $(n) \in \mathcal{C}$.

This extraction of a Markov process from the level sets of another process is in the spirit of the Ray–Knight theorems that identify the local time process of suitably stopped Brownian motion as squared Bessel processes, see e.g. [33].

We also generalise in three directions: first, to start from any $(n_1, \ldots, n_k) \in \mathcal{C}$, we concatenate $k$ independent copies of $(X, Z)$, replacing $n$ by $n_1, \ldots, n_k$, respectively. Second, to obtain an up-down oCRP$(\alpha, \theta)$ for $\theta > 0$, we add a similar construction $\big((X_t)_{t < 0}, (Z_{-i})_{i \ge 1}\big)$ to provide left-most tables, as well as their "children" and further "descendants". See Figure 3. Third, if we replace $Q_\alpha$ by other Q-matrices on $\mathbb{N}$ subject to conditions that ensure appropriate absorption in 0, and if we appropriately relax the restriction that new tables always start from a single customer, we still obtain a Markovian skewer process, which we refer to as a generalised up-down oCRP$(\alpha, \theta)$. See Section 2.4 for details. We remark that laws of absorption times of continuous-time Markov chains are called "phase-type distributions". The associated JCCPs with phase-type jumps have been studied in other contexts [4]. A feature of phase-type distributions is that they are (weakly) dense in the space of all probability measures on $[0, \infty)$.

Returning to the setup of Theorem 1.2, we establish distributional scaling limits for the table size evolution and the total number of customers in an up-down oCRP$(\alpha, \theta)$, as the initial number of customers tends to infinity, and for the Lévy process $(X_t, t \ge 0)$. To state these, we recall squared Bessel processes starting from $a \in [0, \infty)$ with dimension parameter $\delta \in \mathbb{R}$. We follow [17, 33] and consider the unique strong solution of the stochastic differential equation (SDE)
\[ dY(s) = \delta \, ds + 2\sqrt{|Y(s)|} \, dW(s), \quad Y(0) = a, \tag{1.2} \]
driven by a standard Brownian motion $\big(W(s)\big)_{s \ge 0}$. For $\delta \ge 0$, the solution is non-negative for all $s \ge 0$, never reaching 0 when $\delta \ge 2$, reflecting from 0 when $\delta \in (0, 2)$ and absorbed in 0 when $\delta = 0$. We denote this distribution by $\mathrm{BESQ}_a(\delta)$. For $\delta < 0$, let $\zeta = \inf\{s \ge 0 : Y(s) = 0\}$ and denote by $\mathrm{BESQ}_a(\delta)$ the distribution of the process $\big(Y(s \wedge \zeta)\big)_{s \ge 0}$ that is absorbed at its first hitting time of 0.

Theorem 1.3.
For all $0 \le \alpha \le 1$, the table size evolution $Z_n$, i.e. a $Q_\alpha$-Markov chain, starting from $Z_n(0) = \lfloor nz \rfloor$, satisfies
\[ \left( \frac{Z_n(2ns)}{n} \right)_{s \ge 0} \xrightarrow{D} \mathrm{BESQ}_z(-2\alpha) \]
under the Skorokhod topology. Furthermore, the convergence holds jointly with the convergence of hitting times of 0.

We remark that the convergence of hitting times is not automatic since hitting times are not continuous functionals on Skorokhod space [19].
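To get a feel for the chain in Theorem 1.3, the following minimal sketch simulates a $Q_\alpha$-Markov chain by the standard jump-chain/holding-time method and evaluates the rescaled path $Z_n(2ns)/n$ at a given $s$. The function names `simulate_q_alpha` and `rescaled_value` and the truncation parameter `t_max` are illustrative assumptions and do not appear in the paper.

```python
import bisect
import math
import random


def simulate_q_alpha(start, alpha, rng, t_max=math.inf):
    """Simulate a Q_alpha-Markov chain (hypothetical helper).

    From state m >= 1 the chain jumps up at rate m - alpha and down at
    rate m; state 0 is absorbing.  Returns the jump times and the states,
    beginning with (0.0, start).  Jumps after t_max are not recorded.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    times, states = [0.0], [start]
    t, m = 0.0, start
    while m > 0:
        up, down = m - alpha, float(m)
        t += rng.expovariate(up + down)  # exponential holding time in state m
        if t > t_max:
            break
        # choose the up-step with probability up / (up + down)
        m = m + 1 if rng.random() * (up + down) < up else m - 1
        times.append(t)
        states.append(m)
    return times, states


def rescaled_value(times, states, n, s):
    """Evaluate Z_n(2 n s) / n along a path (valid for 2 n s <= t_max)."""
    idx = bisect.bisect_right(times, 2.0 * n * s) - 1
    return states[idx] / n
```

For instance, `simulate_q_alpha(int(n * z), alpha, random.Random(0), t_max=2 * n * s_max)` produces a path that can be evaluated with `rescaled_value` for any $s \le$ `s_max`; plotting several such paths for growing `n` gives an informal picture of the convergence, without being a substitute for the proof.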
Theorem 1.4.
For all $0 \le \alpha \le 1$ and $\theta \ge 0$, the evolution $M_n$ of the total number of customers in an up-down oCRP$(\alpha, \theta)$, i.e. a continuous-time Markov chain whose non-zero off-diagonal entries are $q_{m,m+1} = m + \theta$ and $q_{m,m-1} = m$, $m \ge 0$, starting from $M_n(0) = \lfloor na \rfloor$, satisfies
\[ \left( \frac{M_n(2ns)}{n} \right)_{s \ge 0} \xrightarrow{D} \mathrm{BESQ}_a(2\theta) \]
under the Skorokhod topology.

This appearance of squared Bessel processes further strengthens the connection of Theorem 1.2 to the Ray–Knight theorems.
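The rates $q_{m,m+1} = m + \theta$ and $q_{m,m-1} = m$ in Theorem 1.4 arise by summing the up-step and down-step rates of the up-down oCRP$(\alpha, \theta)$ over all tables: the joining rates contribute $m - k\alpha$, the $k$ insertions to the right contribute $k\alpha$, and the left-most insertion contributes $\theta$. The sketch below lists the transitions from a given composition and computes these totals. The helper names `updown_transitions` and `total_rates` are hypothetical and introduced only for illustration.

```python
def updown_transitions(state, alpha, theta):
    """List (rate, kind, new_state) for the up-down oCRP(alpha, theta).

    `state` is a tuple of positive table sizes (n_1, ..., n_k).
    Transitions of rate zero are omitted.
    """
    state = tuple(state)
    moves = []
    if theta > 0:
        # new left-most table
        moves.append((float(theta), "up", (1,) + state))
    for i, n_i in enumerate(state):
        left, right = state[:i], state[i + 1:]
        if n_i - alpha > 0:
            # a new customer joins table i
            moves.append((n_i - alpha, "up", left + (n_i + 1,) + right))
        if alpha > 0:
            # a new table is inserted directly to the right of table i
            moves.append((float(alpha), "up", left + (n_i, 1) + right))
        # one of the n_i customers leaves; an emptied table is removed
        shrunk = (n_i - 1,) if n_i >= 2 else ()
        moves.append((float(n_i), "down", left + shrunk + right))
    return moves


def total_rates(state, alpha, theta):
    """Return (total up-rate, total down-rate) from `state`."""
    moves = updown_transitions(state, alpha, theta)
    up = sum(rate for rate, kind, _ in moves if kind == "up")
    down = sum(rate for rate, kind, _ in moves if kind == "down")
    return up, down
```

For a composition with $m$ customers, `total_rates` returns $(m + \theta, m)$ whatever the arrangement of tables, which is why the total number of customers is itself a Markov chain.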
Theorem 1.5.
For all $0 < \alpha < 1$, the Lévy process $(X_t)_{t \ge 0}$ of (1.1) satisfies
\[ \left( \frac{X_{2 n^{1+\alpha} t}}{2n} \right)_{t \ge 0} \xrightarrow{D} \mathrm{Stable}(1+\alpha) \]
under the Skorokhod topology, where $\mathrm{Stable}(1+\alpha)$ is the distribution of a spectrally positive stable Lévy process with Laplace exponent $\psi(\lambda) = \lambda^{1+\alpha} / \big(2^\alpha \Gamma(1+\alpha)\big)$.

This connects to work by Forman et al. [11, 12, 13], where the starting point is a $\sigma$-finite $\mathrm{BESQ}(-2\alpha)$ excursion measure $\Theta$ due to Pitman and Yor [32]. Specifically, a $\mathrm{Stable}(1+\alpha)$ Lévy process $(X_t)_{t \ge 0}$ is constructed from $Z \sim \mathrm{BESQ}_z(-2\alpha)$ and a Poisson Random Measure $\mathbf{N} := \sum_{t > 0 :\, Z^t \not\equiv 0} \delta_{(t, Z^t)}$ with intensity measure $\mathrm{Leb} \times \Theta$ by using the excursion lengths $\zeta_t := \inf\{s > 0 : Z^t_s = 0\}$ as jump heights in a compensating limit
\[ X_t = \lim_{\varepsilon \downarrow 0} \left( \sum_{0 \le r \le t :\, \zeta_r > \varepsilon} \zeta_r - t \, \frac{1+\alpha}{2^\alpha \Gamma(1-\alpha) \Gamma(1+\alpha)} \, \varepsilon^{-\alpha} \right). \]
Furthermore, an interval partition diffusion is extracted from $(X, \mathbf{N})$ using a continuous analogue of the skewer process described above. This interval partition diffusion is called a type-1 evolution. Type-0 evolutions are obtained by specifying a semi-group that has further intervals added on the left-hand side in a way that achieves some left-right symmetry. Further generalisations will be explored in [10]. Further to the scaling limits above, we state the following conjecture.

Figure 3: The skewer at level $y$ of the additional marked JCCP when $\theta > 0$.

Conjecture 1.6.
Let $(S^y_n)_{y \ge 0}$ be a $\mathcal{C}$-valued continuous-time up-down oCRP$(\alpha, \theta)$, for each $n \ge 1$. If $n^{-1} S^0_n$ converges, as $n \to \infty$, then $(n^{-1} S^{[ny]}_n)_{y \ge 0}$ has a diffusive scaling limit, when suitably represented on a space of interval partitions. For $0 < \theta = \alpha < 1$ and $\theta = 0 < \alpha < 1$, these are the type-0 and type-1 evolutions of [13].

The structure of this paper is as follows. In Section 2 we make rigorous the definition of the skewer process and prove Theorem 1.2 and its generalisations. In Section 3 we establish the scaling limit results, Theorems 1.3, 1.4 and 1.5.
In Section 2.1 we establish the distribution of the hitting time of 0 of $Q_\alpha$-Markov chains, before making rigorous the skewer construction of the introduction in Section 2.2. In Section 2.3 we prove Theorem 1.2, and Section 2.4 discusses generalisations of Theorem 1.2 with general initial conditions in Corollary 2.3, to skewer constructions of up-down oCRP$(\alpha, \theta)$ with $\theta > 0$ in Theorem 2.5, and to generalised up-down oCRPs in Theorem 2.6.

2.1 The hitting time of 0 of a $Q_\alpha$-Markov chain

As a first step towards the proof of Theorem 1.2, let us show that the process $(X_t)_{t \ge 0}$ of (1.1) is well-defined with $T < \infty$ almost surely. Since $(X_t)_{t \ge 0}$ has independent and identically distributed jumps at rate $\alpha$ and unit downward drift, this will follow if we show that the mean of the jump height distribution is $1/\alpha$. Jump heights are hitting times of 0 by $Q_\alpha$-Markov chains starting from 1. Denote by $\mathbb{P}_r$ the distribution (on the Skorokhod space $D([0, \infty), \mathbb{R})$ of càdlàg functions $g : [0, \infty) \to \mathbb{R}$) of a $Q_\alpha$-Markov chain starting from $r \in \mathbb{N}$. To study scaling limits in Section 3, we actually need to identify the distribution of the hitting time $\zeta$ of 0 under $\mathbb{P}_1$.

We use a classical technique relating hitting times of birth-and-death processes and continued fractions. See e.g. Flajolet and Guillemin [9] for related developments.

Proposition 2.1. Under $\mathbb{P}_1$, the hitting time $\zeta$ of 0 has Laplace transform given by
\[ \mathbb{E}_1\big(e^{-\lambda \zeta}\big) = 1 - \frac{\lambda}{\alpha} + \frac{\lambda^{1+\alpha}}{\alpha e^{\lambda} \Gamma(1+\alpha, \lambda)} \quad \text{if } 0 < \alpha \le 1, \]
where the incomplete Gamma function $\Gamma(a, z)$ is defined as $\Gamma(a, z) := \int_z^\infty e^{-t} t^{a-1} \, dt$, for $z > 0$, $a \in \mathbb{R}$. In particular, $\zeta$ is exponential with rate 1 if $\alpha = 1$. Moreover, $\mathbb{E}_1(\zeta) = \alpha^{-1}$ if $0 < \alpha \le 1$.

Proof. Define $\sigma_r := \inf\{t \ge 0 : Z_t = r\}$ with the convention that $\inf \emptyset = \infty$. Suppose that $f_r$ is the density of $\sigma_r$ under $\mathbb{P}_{r+1}$ and $g_r(t) := \mathbb{P}_r\big(Z_t = r,\ Z_s \ge r \text{ for all } s \le t\big)$. By an application of the strong Markov property at the first jump,
\[ g_r(t) = e^{-(2r-\alpha)t} + \int_0^t (r-\alpha) e^{-(2r-\alpha)u} \int_0^{t-u} f_r(s) \, g_r(t-s-u) \, ds \, du. \tag{2.1} \]
For $\lambda > 0$, we define Laplace transforms
\[ F_r(\lambda) := \int_0^\infty f_r(t) e^{-\lambda t} \, dt, \quad G_r(\lambda) := \int_0^\infty g_r(t) e^{-\lambda t} \, dt, \quad \text{and} \quad H_r(\lambda) := \int_0^\infty e^{-(2r-\alpha)t} e^{-\lambda t} \, dt = (2r - \alpha + \lambda)^{-1}. \tag{2.2} \]
Now, by (2.1),
\[ (r-\alpha) F_r(\lambda) G_r(\lambda) H_r(\lambda) = (r-\alpha) \int_0^\infty \left[ \int_0^t \int_0^{t-s} e^{-(2r-\alpha)u} f_r(s) \, g_r(t-s-u) \, du \, ds \right] e^{-\lambda t} \, dt = \int_0^\infty \big( g_r(t) - e^{-(2r-\alpha)t} \big) e^{-\lambda t} \, dt = G_r(\lambda) - H_r(\lambda). \]
Rearranging this and evaluating $H_r(\lambda)$ as in (2.2), it follows that
\[ G_r(\lambda) = \frac{1}{2r - \alpha + \lambda - (r-\alpha) F_r(\lambda)}. \]
Now for all $\varepsilon > 0$,
\[ \int_t^{t+\varepsilon} f_r(s) \, ds = \mathbb{P}_{r+1}(t < \sigma_r \le t + \varepsilon) = \sum_{k=1}^\infty \mathbb{P}_{r+1}\big(Z_t = r+k,\ Z_s \ge r+1 \text{ for all } 0 \le s \le t\big) \, \mathbb{P}_{r+k}(\sigma_r \le \varepsilon) = g_{r+1}(t) \, \mathbb{P}_{r+1}(\sigma_r \le \varepsilon) + o(\varepsilon) \quad \text{as } \varepsilon \downarrow 0. \]
Upon dividing by $\varepsilon$ and letting $\varepsilon \downarrow 0$, we see that $f_r(t) = (r+1) g_{r+1}(t)$. Hence for all $r \ge 0$,
\[ F_r(\lambda) = (r+1) G_{r+1}(\lambda) = \frac{r+1}{2r + 2 - \alpha + \lambda - (r+1-\alpha) F_{r+1}(\lambda)}. \tag{2.3} \]
Now let $a > 0$, $z > 0$, and define $K_{-1}(\lambda) := e^z z^{-a} \Gamma(a, z)$ and $K_0(\lambda)$ implicitly by
\[ K_{-1}(\lambda) = \frac{1}{1 + z - a - (1-a) K_0(\lambda)}. \]
Solving for $K_0(\lambda)$, we obtain
\[ K_0(\lambda) = \frac{(1 + z - a) e^z z^{-a} \Gamma(a, z) - 1}{(1-a) e^z z^{-a} \Gamma(a, z)} = \frac{1 + z - a}{1 - a} - \frac{1}{(1-a) e^z z^{-a} \Gamma(a, z)}. \]
However, it is known by [26, p.278, 3.3.3] that for any $a > 0$, $z > 0$,
\[ K_{-1}(\lambda) = e^z z^{-a} \Gamma(a, z) = \cfrac{1}{1 + z - a - \cfrac{1(1-a)}{3 + z - a - \cfrac{2(2-a)}{5 + z - a - \cfrac{3(3-a)}{7 + z - a - \cdots}}}} = \frac{1}{1 + z - a - (1-a) K_0(\lambda)}. \]
Upon using the recursion given for $F_r(\lambda)$ in (2.3), we also see that
\[ F_0(\lambda) = \cfrac{1}{2 + \lambda - \alpha - \cfrac{2(1-\alpha)}{4 + \lambda - \alpha - \cfrac{3(2-\alpha)}{6 + \lambda - \alpha - \cfrac{4(3-\alpha)}{8 + \lambda - \alpha - \cdots}}}}. \]
Substituting the continued fraction expression for $K_{-1}(\lambda)$ into the expression for $K_0(\lambda)$, it follows that
\[ K_0(\lambda) = \cfrac{1}{3 + z - a - \cfrac{2(2-a)}{5 + z - a - \cfrac{3(3-a)}{7 + z - a - \cfrac{4(4-a)}{9 + z - a - \cdots}}}}. \]
As $\alpha \ne 0$, this can be identified with $F_0(\lambda)$ by setting $a = 1 + \alpha$, $z = \lambda$, from which it follows that
\[ F_0(\lambda) = K_0(\lambda) = \frac{\lambda - \alpha}{-\alpha} - \frac{1}{-\alpha e^{\lambda} \lambda^{-(1+\alpha)} \Gamma(1+\alpha, \lambda)} = 1 - \frac{\lambda}{\alpha} + \frac{\lambda^{1+\alpha}}{\alpha e^{\lambda} \Gamma(1+\alpha, \lambda)}. \]
Since $F_0(\lambda)$ is the Laplace transform of $\zeta$ under $\mathbb{P}_1$, by the definition of $f_0$, the proposition is proved upon observing that $\mathbb{E}_1(\zeta) = -F_0'(0)$, and that for $\alpha = 1$ the expression simplifies to the Laplace transform of the claimed exponential distribution.

The fact that we obtain the exponential distribution when $\alpha = 1$ is actually an elementary consequence of the observation that in this case $q_{1,0} = 1$ and $q_{1,2} = 0$.

2.2 Construction of skewer processes in the setting of Theorem 1.2

Consider the setting of Theorem 1.2, i.e. for some fixed $0 \le \alpha \le 1$, let $(J_t)_{t \ge 0}$ be a Poisson process of rate $\alpha$, and let $(Z_i, i \ge 0)$ be an independent family of independent $Q_\alpha$-Markov chains starting from $Z_0(0) = n \ge 1$ for $i = 0$ and from $Z_i(0) = 1$ for $i \ge 1$, with absorption times $\zeta_i = \inf\{s \ge 0 : Z_i(s) = 0\}$, $i \ge 0$. By construction, the process $X_t := -t + \sum_{0 \le i \le J_t} \zeta_i$, $t \ge 0$, is a compound Poisson process with negative unit drift, starting from $X_0 = \zeta_0 > 0$. We use the convention $X_{0-} := 0$. While we will eventually stop $(X_t)_{t \ge 0}$ at $T := \inf\{t \ge 0 : X_t = 0\}$, it will sometimes be useful to have access to the process with infinite time horizon. By Proposition 2.1, $(X_t)_{t \ge 0}$ is recurrent. As $(X_t)_{t \ge 0}$ has no negative jumps, $T < \infty$ almost surely, and so

(*) $(X, Z) := \big((X_t)_{0 \le t \le T}, (Z_i)_{0 \le i \le J_T}\big)$ is as follows. $X$ is càdlàg with finitely many jumps at times $U_i$, $0 \le i \le J_T$. For each jump, we have $\zeta_i := X_{U_i} - X_{U_i-} > 0$ and $Z_i = \big(Z_i(s)\big)_{0 \le s < \zeta_i}$ is a positive integer-valued càdlàg path, $0 \le i \le J_T$.

We will define the skewer process of $(X, Z)$ as a function of $(X, Z)$ for any pair $(X, Z)$ that satisfies (*). See Figure 2 for an illustration. In this setting, first introduce notation for
• the pre- and post-jump heights $B_i := X_{U_i-}$ and $D_i := X_{U_i}$ of the $i$-th jump, which we will interpret as birth and death levels, $0 \le i \le J_T$;
• the number $K^y := \#\{0 \le i \le J_T : B_i \le y < D_i\}$ of jumps that cross level $y$, $y \ge 0$;
• for each level $y$ with $K^y \ge 1$, the indices $I^y_1 := \inf\{i \ge 0 : B_i \le y < D_i\}$ and $I^y_l := \inf\{i \ge I^y_{l-1} + 1 : B_i \le y < D_i\}$, $2 \le l \le K^y$, of jumps that cross level $y$;
• and the values $N^y_l := Z_{I^y_l}(y - B_{I^y_l})$ of the integer-valued paths when crossing level $y$, i.e. evaluated $y - B_{I^y_l}$ beyond the birth level $B_{I^y_l}$, for each $1 \le l \le K^y$, $y \ge 0$.

Definition 2.2 (Skewer process). Using the notation above, define the skewer process $(S^y)_{y \ge 0}$ of $(X, Z)$ by $S^y := (N^y_1, N^y_2, \ldots, N^y_{K^y})$, $y \ge 0$.

In the continuum analogue of [12], $X$ may have a dense set of jump times, each marked by a non-negative real-valued path. Whether composition-valued or interval-partition-valued, the skewer process collects, in left to right order, the values of the paths at level $y$. This terminology stems from our visualisation in Figure 2, where we imagine a skewer piercing the graph of the marked $X$-process at level $y$. The values encountered at level $y$ are pushed together without leaving gaps, as if on a skewer.

Proof of Theorem 1.2.
Let $(X_t)_{t \ge 0}$ be the (unstopped) Lévy process constructed from $(Z_i)_{i \ge 0}$ and $(J_t)_{t \ge 0}$ as before via $X_t := -t + \sum_{i=0}^{J_t} \zeta_i$, where $\zeta_i = \inf\{s \ge 0 : Z_i(s) = 0\}$, noting that $Z_0 \sim \mathbb{P}_n$ and $Z_i \sim \mathbb{P}_1$ for $i \ge 1$, independently, and that $(J_t)_{t \ge 0}$ is an independent Poisson process of rate $\alpha$, a PP($\alpha$) for short. Denote the distribution of the marked process $(X, Z) := \big((X_t)_{0 \le t \le T}, (Z_i)_{0 \le i \le J_T}\big)$, stopped at $T = \inf\{t \ge 0 : X_t = 0\}$, by $\mathbf{P}_n$.

Figure 4: An excursion of the JCCP with notation used in the proof of Theorem 1.2.

It may be useful to view $\zeta_\emptyset := X_0$ as the "lifetime" of a table of initial size $n$. In a genealogical interpretation, where the individuals in the genealogy are tables and the children of each table are those inserted directly to the right at rate $\alpha$, this initial table would be the ancestor, hence the notation $\emptyset$. We will not require notation for a full genealogical representation of the marked Lévy process. Instead, we proceed to decompose $(X, Z)$ under $\mathbf{P}_n$ into the excursions above the minimum that can be viewed as recursively capturing the genealogies of the children of the ancestor.

The skewer process makes the size of this initial table, varying according to $Z_0$, stay the left-most entry at all levels $0 \le y < X_0$. Let us show that there is a PP($\alpha$) corresponding to the insertion of tables directly to the right of the initial table. By the definition of the skewer process, the $i$-th jump of $X$ corresponds to the second entry in the evolving composition (a table adjacent to the initial table) if its pre-jump level is a running minimum $X_{U_i-} = \inf_{0 \le t \le U_i} X_t \ge 0$. These pre-jump levels are represented in Figure 4 as a collection of points on the vertical axis. Let $\tilde{R}_0 := 0$ and define, also beyond $T$, the left and right endpoints of the excursions of $(X_t)_{t \ge 0}$ above the minimum
\[ \tilde{L}_j := \inf\{t \ge \tilde{R}_{j-1} : X_t \ne X_{t-}\}, \quad \tilde{R}_j := \inf\{t \ge \tilde{L}_j : X_t = X_{\tilde{L}_j-}\}, \quad \text{for } j \ge 1. \]
Write $K_\emptyset := \sup\{j \ge 0 : \tilde{R}_j < T\}$ for the number of such excursions before $T$ and reverse order so that, for $1 \le j \le K_\emptyset$, the interval $(L_j, R_j) := (\tilde{L}_{K_\emptyset - j + 1}, \tilde{R}_{K_\emptyset - j + 1})$ captures the $j$-th excursion in the order encountered by the skewer process as a process indexed by level $y \ge 0$. Write $\tilde{A}_j := X_0 - X_{\tilde{R}_j}$, for $j \ge 1$, and $A_j := X_{R_j}$ for any $1 \le j \le K_\emptyset$, where $\tilde{A}_0 := 0$, for the starting levels of these excursions, respectively below $X_0$ and above 0.

We claim that $(\tilde{A}_j)_{j \ge 1}$ are the points of an unstopped PP($\alpha$) independent of the IID sequence of marked excursions
\[ \Big( (X_{t + \tilde{L}_j} - X_{\tilde{R}_j})_{0 \le t \le \tilde{R}_j - \tilde{L}_j},\ (Z_{\tilde{I}_j + i})_{0 \le i \le \tilde{I}_{j+1} - \tilde{I}_j - 1} \Big), \]
where $\tilde{I}_j := J_{\tilde{L}_j}$ identifies the first jump in the $j$-th excursion, $j \ge 1$. To show this, observe that $\tilde{A}_1 = X_0 - X_{\tilde{R}_1} = X_0 - X_{\tilde{L}_1-} = \tilde{L}_1 = U_1$, which is exponentially distributed with rate $\alpha$. For $j \ge 1$, apply the strong Markov property of the marked Lévy process at times $\tilde{L}_j$ and $\tilde{R}_j$. Then the inter-arrival times $(\tilde{A}_j - \tilde{A}_{j-1})_{j \ge 1}$ are all IID Exp($\alpha$) random variables independent of the IID sequence of marked excursions, as claimed. In particular, the $(\tilde{A}_j)_{j \ge 1}$ are the points of a PP($\alpha$) as claimed.

Note that since $(X_0, Z_0)$ and $\big((X_t - X_0)_{t \ge 0}, (Z_i)_{i \ge 1}\big)$ are independent, it follows that $(X_0, Z_0)$ is independent of the $(\tilde{A}_j)_{j \ge 1}$ and of the marked excursions as these can be defined solely in terms of $\big((X_t - X_0)_{t \ge 0}, (Z_i)_{i \ge 1}\big)$. By time-reversal of the PP($\alpha$) at the independent time $\zeta_\emptyset = X_0$, we obtain the following.

(S) Under $\mathbf{P}_n$, we have $Z_0 \sim \mathbb{P}_n$ with absorption time $\zeta_\emptyset$. Given $Z_0$, the $(A_j)_{1 \le j \le K_\emptyset}$ are the jump times of a PP($\alpha$) restricted to $[0, \zeta_\emptyset]$, and given also $(A_j)_{1 \le j \le K_\emptyset}$, the $K_\emptyset$ marked excursions are conditionally IID, each with distribution $\mathbf{P}_1$.

We will now verify the jump-chain/holding-time structure of the skewer process by an induction on the steps of the jump chain. Note that the total number of steps $M$ of the jump chain is positive and finite almost surely. We denote by $Y_m$, $1 \le m \le M$, the levels of those $M$ steps of the skewer process. We also set $Y_m = \infty$ for $m \ge M + 1$.

Consider the inductive hypothesis that the first $m \wedge M$ steps satisfy the jump-chain/holding-time structure of an up-down oCRP$(\alpha, 0)$ and that, given $S^{Y_m} = (n_1, \ldots, n_k)$, the marked excursions of $(X, Z)$ above $Y_m$ are independent with distributions $\mathbf{P}_{n_i}$, $1 \le i \le k$.

Assuming the inductive hypothesis for some $m \ge 0$ and $(n_1, \ldots, n_k) \in \mathcal{C}$, we note that if $Y_m = \infty$ or if $Y_m < \infty$ with $(n_1, \ldots, n_k) = \emptyset$, the induction proceeds trivially, while if $Y_m < \infty$ with $(n_1, \ldots, n_k) \ne \emptyset$, we may assume that there are independent exponential clocks
(i) of rate $n_i - \alpha$, from the $Q_\alpha$-Markov chain $Z_0$ under $\mathbf{P}_{n_i}$ as in (S), triggering a transition into state $(n_1, \ldots, n_{i-1}, n_i + 1, n_{i+1}, \ldots, n_k)$, for each $1 \le i \le k$,
(ii) of rate $n_i$, from the $Q_\alpha$-Markov chain $Z_0$ under $\mathbf{P}_{n_i}$ as in (S), triggering a transition into state $(n_1, \ldots, n_{i-1}, n_i - 1, n_{i+1}, \ldots, n_k)$, or to $(n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_k)$ if $n_i = 1$, for each $1 \le i \le k$,
(iii) of rate $\alpha$, from the PP($\alpha$) under $\mathbf{P}_{n_i}$ as in (S), triggering a transition into state $(n_1, \ldots, n_i, 1, n_{i+1}, \ldots, n_k)$, for each $1 \le i \le k$.
Denoting by $E_{m+1}$ the minimum of these $3k$ exponential clocks, the lack of memory property of the other exponential clocks makes the system start afresh, with either $S^{Y_{m+1}} = \emptyset$ or with independent marked excursions above $Y_{m+1} = Y_m + E_{m+1}$. Specifically, in (iii), we note that the post-$E_{m+1}$ PP($\alpha$) is still PP($\alpha$), and is independent of the first, $\mathbf{P}_1$-distributed excursion, as required for the induction to proceed.

Ultimately, we will describe how to extend Theorem 1.2 to the $(\alpha, \theta)$-case with more general integer-valued Markov chains and started from a general state $(n_1, \ldots, n_k) \in \mathcal{C}$. We start by describing how to start from the general state $(n_1, \ldots, n_k) \in \mathcal{C}$ in the basic $(\alpha, 0)$-case. We define the concatenation $uv \in \mathcal{C}$ of $u, v \in \mathcal{C}$ with $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_m)$ by $uv := (u_1, \ldots, u_n, v_1, \ldots, v_m)$.

Corollary 2.3.
For $(n_1, \ldots, n_k) \in \mathcal{C}$, consider skewer processes $(S^{j,y})_{y \ge 0}$ arising from independent copies of the $(X, Z)$ of Theorem 1.2, with $n$ replaced by $n_j$, so that $S^{j,0} = (n_j) \in \mathcal{C}$, $1 \le j \le k$. Then the process $(S^y)_{y \ge 0}$ obtained by concatenation $S^y = S^{1,y} S^{2,y} \cdots S^{k,y}$ evolves as an up-down oCRP$(\alpha, 0)$ starting from $(n_1, \ldots, n_k) \in \mathcal{C}$.

Remark 2.4. It is also possible to derive the up-down oCRP$(\alpha, 0)$ from a single path of the Lévy process $(X_t)_{t \ge 0}$. Specifically, consider the Lévy process up to time $T_k$, defined to be the time of the $k$-th down-crossing over 0. Using the same notation as previously, but with $T$ replaced by $T_k$ in the definition of $K^y$, then conditional on the event that $N^0_1 = n_1, N^0_2 = n_2, \ldots, N^0_k = n_k$, the skewer $S^y$ of $\big((X_t)_{0 \le t \le T_k}, (Z_i)_{0 \le i \le J_{T_k}}\big)$ evolves as an up-down oCRP$(\alpha, 0)$ starting from state $(n_1, n_2, \ldots, n_k) \in \mathcal{C}$. A proof involves noting that the excursions above level 0 are independent and it is then a consequence of Theorem 1.2. We leave the details of this proof to the reader.

As was mentioned somewhat briefly in the introduction, it is possible to derive the up-down oCRP$(\alpha, \theta)$ with $\theta > 0$ by adding in excursions of $\big((X_t)_{t \ge 0}, (Z_i)_{i \ge 0}\big)$ corresponding to the arrivals at rate $\theta$ of the new customers opening a new left-most table; this is visualised in Figure 3.

More formally, suppose that we have the following objects, independent of each other and of the objects in the previous construction:
• the times $(A_{-j})_{j \ge 1}$ of a Poisson process with intensity $\theta > 0$;
• an IID sequence of $\mathbf{P}_1$-distributed marked excursions.
Denoting the excursions by $(X^{-j}_r)_{0 \le r \le \xi_{-j}}$, $j \ge 1$, we now define a process $(X_t)_{t < 0}$ by inserting into a process of unit negative drift the $j$-th excursion at level $A_{-j}$, $j \ge 1$. To this end, we let $L_0 := 0$ and denote by $L_{-j} := -A_{-j} - \sum_{1 \le i \le j} \xi_{-i}$ and $R_{-j} := L_{-j} + \xi_{-j}$ the left and right endpoints of the $j$-th of these excursions, $j \ge 1$, and then set for $t < 0$
\[ X_t := \begin{cases} A_{-j} + X^{-j}_{t - L_{-j}} & \text{if } t \in [L_{-j}, R_{-j}] \text{ for some } j \ge 1, \\ |t| - \sum_{1 \le i \le j-1} \xi_{-i} & \text{if } t \in (R_{-j}, L_{-j+1}) \text{ for some } j \ge 1. \end{cases} \]
Then the marks of the marked excursions naturally give rise to an infinite sequence $(Z_{-i})_{i \ge 1}$ marking the jumps of $(X_t)_{t < 0}$. Note that we can recover the decomposition into marked excursions from $\big((X_t)_{t < 0}, (Z_{-i})_{i \ge 1}\big)$ by decomposing along the running minimum process $(\inf_{r \le t} X_r)_{t < 0}$. Indeed, observe that this running minimum process has intervals $[L_{-j}, R_{-j}]$ of lengths $\xi_{-j}$ along which it is constant at heights $A_{-j}$, with $L_{-j} < R_{-j} < L_{-j+1}$, $j \ge 1$. We denote the distribution of $\big((X_t)_{t < 0}, (Z_{-i})_{i \ge 1}\big)$ by $\mathbf{P}^-$ and note that, by construction, we have a decomposition similar to (S):

(S$^-$) Under $\mathbf{P}^-$, the $(A_{-j})_{j \ge 1}$ are the jump times of a PP($\theta$), and the marked excursions are independent of $(A_{-j})_{j \ge 1}$ and IID, each with distribution $\mathbf{P}_1$.

Extending the notation of the bullet points in Section 2.2, we denote by $(U_{-i})_{i \ge 1}$ the sequence of jump times of $(X_t)_{t < 0}$, by $B_{-i} := X_{U_{-i}-}$ and $D_{-i} := X_{U_{-i}}$ their pre- and post-jump heights, by $K^y_- := \#\{i \ge 1 : B_{-i} \le y < D_{-i}\}$ the number of jumps across level $y$ and, for each level with $K^y_- \ge 1$, we let $I^y_{-1} := \inf\{i \ge 1 : B_{-i} \le y < D_{-i}\}$ and $I^y_{-l} := \inf\{i \ge I^y_{-(l-1)} + 1 : B_{-i} \le y < D_{-i}\}$, $2 \le l \le K^y_-$. Then we define the skewer process of $\big((X_t)_{t < 0}, (Z_{-i})_{i \ge 1}\big)$ at level $y$ to be
\[ S^y_- := \big(N^y_{-K^y_-}, \ldots, N^y_{-1}\big), \quad \text{where } N^y_{-l} := Z_{-I^y_{-l}}\big(y - B_{-I^y_{-l}}\big), \quad 1 \le l \le K^y_-. \tag{2.4} \]
This construction was illustrated in Figure 3.

Theorem 2.5. For any $0 \le \alpha \le 1$, $\theta > 0$ and $(n_1, \ldots, n_k) \in \mathcal{C}$, let $\big((X_t)_{t \ge 0}, (Z_i)_{i \ge 0}\big)$ be as in Theorem 1.2 and, independently, let $\big((X_t)_{t < 0}, (Z_{-i})_{i \ge 1}\big)$ be as above. Then, with $(S^y)_{y \ge 0}$ defined as in Corollary 2.3 and $(S^y_-)_{y \ge 0}$ defined as in (2.4), the process $(S^y_- S^y)_{y \ge 0}$ evolves as an up-down oCRP$(\alpha, \theta)$ starting from $(n_1, \ldots, n_k) \in \mathcal{C}$.

Proof. We extend the induction on the number of steps in the jump-chain/holding-time description of the skewer process in the proof of Theorem 1.2. Specifically, the induction hypothesis provides the $\mathbf{P}^-$-distributed marked process on the negative $t$-axis, and we encounter one more independent exponential clock in the induction step,
(iv) of rate $\theta$, from the PP($\theta$) under $\mathbf{P}^-$ as in (S$^-$), triggering a transition into state $(1, n_1, \ldots, n_k)$.
As in case (iii), we note that if case (iv) triggers the transition, the post-$E_{m+1}$ PP($\theta$) is still PP($\theta$), and independent of the first, $\mathbf{P}_1$-distributed marked excursion, as required.

Finally, we generalise the $Q_\alpha$-Markov chains and their initial distributions. This is based on the observation that much of the proof of Theorems 1.2 and 2.5 above does not rely on the specific transition rates of $Q_\alpha$-Markov chains nor on letting excursions start from 1. This establishes the following result.

Theorem 2.6.
Let $0 \le \alpha \le 1$ and $\theta \ge 0$. Suppose that $Q$ is a Q-matrix on $\mathbb{N}_0$ and $p^-$ and $p$ are distributions on $\mathbb{N}$, with the following properties:

• $Q$ is such that $q_{0,0} = 0$, i.e. 0 is an absorbing state.
• $Q$ is such that $\mathbb{P}_n(\zeta < \infty) = 1$ for all $n \ge 1$; i.e. absorption (at the hitting time $\zeta$ of 0) happens with probability 1 under any $\mathbb{P}_n$ (the law of a $Q$-Markov chain starting from $n \ge 1$). In particular, $(p^-, Q)$ is such that $\sum_{n \ge 1} p^-_n \mathbb{P}_n(\zeta < \infty) = 1$.
• $(p, Q)$ is such that $\sum_{n \ge 1} p_n \mathbb{E}_n(\zeta) \le 1/\alpha$.

Then replacing $Q_\alpha$ by $Q$ and taking $Z_i(0) \sim p$, $i \ge 1$, in the setting of Theorem 1.2 and in the definition of $\mathbf{P}_n$, $n \ge 1$, and replacing $\mathbf{P}_1$ by $\mathbf{P}_{p^-} := \sum_{n \ge 1} p^-_n \mathbf{P}_n$ in (S$^-$), the concatenated skewer process $(S^{y-} \star S^y)_{y \ge 0}$, constructed as in Theorem 2.5, is Markovian.

Note that for $\mu := \sum_{n=1}^{\infty} p_n \mathbb{E}_n(\zeta)$, the assumption that $\mu \le 1/\alpha$ is a condition of (sub)-criticality and corresponds to the condition required for the excursions of the Lévy process to be almost surely finite, which in particular is necessary to avoid $(X_t)_{t \ge 0}$ drifting to $\infty$; we previously had $\mu = 1/\alpha$ by Proposition 2.1. Specifically, we note that new tables inserted on the right arrive at rate $\alpha$ and have an average lifespan of $\mu$ independently of arrival times; we need $\mu\alpha$ to be less than or equal to the rate 1 of downward drift. If this does not hold, then it is not necessarily possible to apply the strong Markov property at time $\widetilde{R}$ and the proof of Theorem 2.5 breaks down. Indeed, informally, when $\widetilde{R} = \infty$, the rate-$\alpha$ insertion of a table to the right is absent in the skewer process, but whether $\widetilde{R} = \infty$ or not depends on the future of the skewer process.

Note that, on the other hand, no such restriction is needed for $p^-$ since $p^-$ only affects the first jump in an excursion of the Lévy process, while the remainder of the excursion has jumps according to the absorption time under $\sum_{n \ge 1} p_n \mathbb{P}_n$.

We can describe precisely the class of skewer processes in such constructions, of which the up-down oCRP$(\alpha, \theta)$ is a special case. We continue to interpret a composition $(n_1, \ldots, n_k) \in \mathcal{C}$ as $k$ tables in a row, with $n_1, \ldots, n_k$ customers, respectively.

Definition 2.7 (Generalised up-down oCRP$(\alpha, \theta)$). Let $0 \le \alpha \le 1$, $\theta \ge 0$, and suppose that $Q$ is a Q-matrix on $\mathbb{N}_0$ with 0 as an absorbing state, and that $p^-$ and $p$ are distributions on $\mathbb{N}$. Then the generalised up-down oCRP$(\alpha, \theta)$ with Q-matrix $Q$, table distribution $p$ and left-most table distribution $p^-$ is a continuous-time Markov process where from state $(n_1, \ldots, n_k) \in \mathcal{C}$ there are independent exponential clocks

• of rate $q_{n_i, l}$, for the arrival at table $i$ of $l - n_i$ new customers if $l > n_i$ or the departure from table $i$ of $n_i - l$ existing customers if $l < n_i$, and leading to a transition into state $(n_1, \ldots, n_{i-1}, l, n_{i+1}, \ldots, n_k)$ if $l \ge 1$, $l \ne n_i$, or into state $(n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_k)$ if $l = 0$, for each $l \ne n_i$, $1 \le i \le k$,
• of rate $\alpha p_l$ for the arrival of $l$ customers at a new table to the right of table $i$, and leading to state $(n_1, \ldots, n_i, l, n_{i+1}, \ldots, n_k)$, for each $l \ge 1$, $1 \le i \le k$,
• of rate $\theta p^-_l$ for the arrival of $l$ customers at a new table to the left of table 1, and leading to state $(l, n_1, \ldots, n_k)$, for each $l \ge 1$.

That is, at rate $\alpha$, respectively $\theta$, a group of random $p$-distributed size, respectively $p^-$-distributed size, arrives to open a new table to the right of each table $i$, $1 \le i \le k$, respectively, to the left of table 1. The first bullet point allows us to model the party size at each table by a fairly general $\mathbb{N}_0$-valued continuous-time Markov chain, with people of various group sizes coming and going until the party ends with the last people leaving the table.

As the reader may now expect, this process arises as the distribution of a skewer process in Theorem 2.6:

Theorem 2.8. The skewer process in Theorem 2.6 is a generalised up-down oCRP$(\alpha, \theta)$ with Q-matrix $Q$, table distribution $p$ and left-most table distribution $p^-$.

The proof follows by using essentially the same argument as for Theorems 1.2 and 2.5, and is therefore omitted.
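The jump-chain/holding-time description used in these proofs can be illustrated by simulating the basic up-down oCRP$(\alpha, \theta)$, i.e. the special case of Definition 2.7 with $p = p^- = \delta_1$ and $Q = Q_\alpha$, using the transition rates listed in the introduction. The Python sketch below is not taken from the paper: the names `ocrp_step` and `simulate` are hypothetical, and the sampling scheme (one exponential holding time with the total rate $2(n_1 + \cdots + n_k) + \theta$, followed by a transition chosen with probability proportional to its rate) is a standard Gillespie-type scheme assumed here for illustration.

```python
import random


def ocrp_step(state, alpha, theta, rng):
    """One jump of the basic up-down oCRP(alpha, theta) (illustrative sketch).

    `state` is a list of table sizes (n_1, ..., n_k), left to right.
    Returns (holding_time, new_state).
    """
    # Total rate: sum(n_i - alpha) + k*alpha + theta + sum(n_i) = 2*sum(n_i) + theta.
    total_rate = 2 * sum(state) + theta
    if total_rate == 0:
        # Empty restaurant with theta = 0: absorbing state.
        return float("inf"), list(state)

    holding = rng.expovariate(total_rate)
    u = rng.random() * total_rate
    new = list(state)

    # Rate theta: a new left-most table (the only event if the restaurant is empty).
    if not state or u < theta:
        new.insert(0, 1)
        return holding, new
    u -= theta

    for i, n in enumerate(state):
        if u < n - alpha:        # rate n_i - alpha: a customer joins table i
            new[i] += 1
            return holding, new
        u -= n - alpha
        if u < alpha:            # rate alpha: new table directly right of table i
            new.insert(i + 1, 1)
            return holding, new
        u -= alpha
        if u < n:                # rate n_i: one customer leaves table i
            if n == 1:
                del new[i]
            else:
                new[i] -= 1
            return holding, new
        u -= n

    # Only reachable through floating-point rounding: depart from the last table.
    if new[-1] == 1:
        new.pop()
    else:
        new[-1] -= 1
    return holding, new


def simulate(state, alpha, theta, t_max, seed=0):
    """Return the list of (jump time, composition) pairs up to time t_max."""
    rng = random.Random(seed)
    state = list(state)
    t = 0.0
    path = [(t, tuple(state))]
    while True:
        holding, new_state = ocrp_step(state, alpha, theta, rng)
        t += holding
        if t > t_max:
            break
        state = new_state
        path.append((t, tuple(state)))
    return path
```

For example, `simulate([3, 1, 2], alpha=0.5, theta=1.0, t_max=5.0)` returns a list of compositions in which consecutive entries differ by exactly one customer. Tracking only the total `sum(state)` gives a birth-death chain with up-rate $N + \theta$ and down-rate $N$, which is the chain whose scaling limit is treated in Theorem 1.4.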
In this section, we develop the scaling limit results stated as Theorems 1.3, 1.4 and 1.5 in the introduction. The first two results relate the chains governing table sizes and total numbers of customers in the up-down oCRP$(\alpha, \theta)$ to squared Bessel processes. The third result relates the Lévy process controlling the JCCP to a $(1+\alpha)$-stable Lévy process. Before proving these results in Section 3.2, we cover preliminary material in Section 3.1. We discuss some further consequences in Section 3.3.

To discuss convergence of stochastic processes, we work with the Skorokhod topology on the space of càdlàg paths. Specifically, let $(E, \rho)$ be a metric space and let $D([0,\infty), E)$ denote the space of càdlàg functions from $[0,\infty)$ to $E$. Also denote by $D([0,t], E)$ the space of càdlàg functions from $[0,t]$ to $E$.

Definition 3.1 (Skorokhod topology). The Skorokhod topology on $D([0,t], E)$ is the topology induced by the metric
\[
d_t(f, g) := \inf_{\lambda \in L_t} \big( \|\lambda\|^\circ_t \vee \|f - g \circ \lambda\|_t \big),
\]
where $L_t$ is the set of all strictly increasing bijections of $[0,t]$ onto itself, $\|\cdot\|^\circ_t$ is the function on $L_t$ with $\|\lambda\|^\circ_t := \sup_{r,s \,:\, 0 \le r \,\ldots}$
Lemma 3.2.
The following results hold.

(a) [5, Theorem 16.2] $d(f_n, f) \to 0$ in $D([0,\infty), E)$ if and only if $d_t(f_n, f) \to 0$ for every continuity point $t$ of $f$.
(b) [5, Lemma 16.1, Theorem 16.2] If $f_n \to f$ in $D([0,\infty), E)$ and $f$ is continuous, then $f_n \xrightarrow{\text{unif}} f$ on $[0,t]$ for any $t > 0$.

To show that convergence in distribution of hitting times holds, we establish the following lemma.
Lemma 3.3.
Consider the pair of functions $\tau, \tau_< \colon D([0,\infty), \mathbb{R}) \to [0,\infty]$ given by
\[
\tau(f) := \inf\{t \ge 0 \colon f(t) \le 0\} \quad \text{and} \quad \tau_<(f) := \inf\{t \ge 0 \colon f(t) < 0\},
\]
and the function spaces
\[
C([0,\infty), \mathbb{R}) := \{f \in D([0,\infty), \mathbb{R}) \colon f \text{ is continuous on } [0,\infty)\}
\]
and
\[
D^*([0,\infty), \mathbb{R}) := \big\{f \in D([0,\infty), \mathbb{R}) \colon \forall t \ge \tau(f),\ f(t) \le 0,\ \tau_<(f) = \tau(f) < \infty\big\}.
\]
If $f_n \to f$ in $D([0,\infty), \mathbb{R})$ and $f \in C([0,\infty), \mathbb{R}) \cap D^*([0,\infty), \mathbb{R})$, then $\tau(f_n) \to \tau(f)$.

Proof. Suppose that $f_n \in D([0,\infty), \mathbb{R})$ converges under the Skorokhod topology to some function $f \in D^*([0,\infty), \mathbb{R}) \cap C([0,\infty), \mathbb{R})$. For any $t \ge 0$, $d_t(f_n, f) \to 0$ and, since $f \in C([0,\infty), \mathbb{R})$, $f_n \xrightarrow{\text{unif}} f$ on $[0,t]$ by Lemma 3.2 (b).

Let $\varepsilon > 0$. Then it follows from uniform convergence on $[0, \tau(f) - \varepsilon]$, the definition of $\tau$ and the fact that $\inf_{0 \le t \le \tau(f) - \varepsilon} f(t) > 0$, that there exists $N_1 \in \mathbb{N}$ such that for any $n > N_1$, $f_n > 0$ on $[0, \tau(f) - \varepsilon]$. This yields $\tau(f_n) > \tau(f) - \varepsilon$ for every $n > N_1$.

To show that $\tau(f_n) < \tau(f) + \varepsilon$ for sufficiently large $n$, recall that as $f \in D^*([0,\infty), \mathbb{R})$, $\tau(f) = \tau_<(f)$. By assumption, there is some $\delta \in (0, \varepsilon)$ with $f(\tau_<(f) + \delta) < 0$. Now $f_n \xrightarrow{\text{unif}} f$ on $[0, \tau_<(f) + \delta]$, and so there exists $N_2 \in \mathbb{N}$ such that for any $n > N_2$, $f_n(\tau(f) + \delta) < 0$. It follows that $\tau(f_n) \le \tau(f) + \delta < \tau(f) + \varepsilon$. Setting $N := \max(N_1, N_2)$, we have shown that for all $n > N$, $|\tau(f) - \tau(f_n)| < \varepsilon$, as required.

We will also use the following well-known convergence criteria for Lévy processes.

Lemma 3.4. (e.g. [19, Corollary VII.3.6]) Suppose that $X^n$ and $X$ are càdlàg processes with stationary independent increments. Then $X^n \xrightarrow{D} X$ on $D([0,\infty), \mathbb{R})$ if and only if $X^n_1 \xrightarrow{D} X_1$. Moreover, whenever this holds and $F^{X^n}$ and $F^X$ are the Lévy measures of $X^n$ and $X$ respectively, then for every $g \in C_2([0,\infty), \mathbb{R})$, $F^{X^n}(g) \to F^X(g)$, where
\[
C_2([0,\infty), \mathbb{R}) := \big\{g \in C([0,\infty), \mathbb{R}) \colon g \text{ is bounded and there is } \varepsilon > 0 \text{ with } g = 0 \text{ on } [0, \varepsilon)\big\}.
\]

3.1.2 Squared Bessel processes

The scaling limit results of Theorems 1.3 and 1.4 involve $[0,\infty)$-valued squared Bessel processes of dimension $\delta \in \mathbb{R}$, which are absorbed at 0 if $\delta \le 0$. As in the introduction, we denote these by $\mathrm{BESQ}_a(\delta)$. We will establish the convergence of absorption times by considering the natural extension of squared Bessel processes with negative dimension to the negative half-line.

Definition 3.5 (Extended squared Bessel process). [17, 33] Let $\delta \in \mathbb{R}$, $a \in \mathbb{R}$, and suppose that $(W_s)_{s \ge 0}$ is a standard Brownian motion in $\mathbb{R}$. The unique strong solution of the SDE (1.2) is called an extended squared Bessel process with dimension $\delta$. Its distribution on $C([0,\infty), \mathbb{R})$ when starting from $a$ is denoted by $\mathrm{BESQ}^*_a(\delta)$.

The drift parameter $\delta$ in (1.2) is referred to as the dimension since it can be shown for any integer $n \in \mathbb{N}$ that, if $(B_t)_{t \ge 0}$ is an $n$-dimensional Brownian motion, the evolution of the squared norm $\|B_t\|^2$, $t \ge 0$, is a $\mathrm{BESQ}(n)$-process.

We state some properties of squared Bessel processes needed later.

Proposition 3.6 (Pathwise properties of squared Bessel processes). Let $\delta \in \mathbb{R}$, $a \ge 0$, and suppose that $Y^*$ is a $\mathrm{BESQ}^*_a(\delta)$ process.

(a) [17, p.330] The process $-Y^*$ is a $\mathrm{BESQ}^*_{-a}(-\delta)$ process.
(b) [17, Corollary 1] If $\delta \ge 2$ and $a > 0$, then $Y^*$ does not hit 0 almost surely, and if $\delta < 2$ then $Y^*$ hits 0 almost surely.
(c) [33, Proposition XI.1.5] If $\delta \ge 0$ then $Y^*_t \ge 0$ for all $t \ge 0$, with 0 an absorbing state if and only if $\delta = 0$. If $\delta \in (0, 2)$, then 0 is instantaneously reflecting.

In particular, we have $\mathrm{BESQ}_a(\delta) = \mathrm{BESQ}^*_a(\delta)$ for $\delta \ge 0$. For negative dimension $-\delta$, $\delta > 0$, a $\mathrm{BESQ}_a(-\delta)$ process is by definition obtained as $Y = (Y^*_{s \wedge \zeta})_{s \ge 0}$ from a $\mathrm{BESQ}^*_a(-\delta)$ process $Y^* = (Y^*_s)_{s \ge 0}$, by stopping $Y^*$ at the first hitting time $\zeta := \inf\{s \ge 0 \colon Y^*_s = 0\}$ of 0; by the strong Markov property of $\mathrm{BESQ}^*_a(-\delta)$ and by Proposition 3.6 (a), the process $\widetilde{Y} := (-Y^*_{\zeta + s})_{s \ge 0}$ is then a $\mathrm{BESQ}_0(\delta)$ process independent of $Y$. Vice versa, we can replace the absorbed part of a $\mathrm{BESQ}_a(-\delta)$ process $Y$ by the negative of an independent $\mathrm{BESQ}_0(\delta)$ process $\widetilde{Y}$ as $Y^*_s = Y_s$, $s \le \zeta$, and $Y^*_{\zeta + s} = -\widetilde{Y}_s$, $s \ge 0$, to construct a $\mathrm{BESQ}^*_a(-\delta)$ process $Y^*$. A slight refinement of such arguments yields the following corollary, which uses notation from Lemma 3.3.

Corollary 3.7.
When $\delta > 0$ and $a \ge 0$, almost all the paths of a $\mathrm{BESQ}^*_a(-\delta)$ process lie in $C([0,\infty), \mathbb{R}) \cap D^*([0,\infty), \mathbb{R})$.

Proof. Let $f$ be a path of $\mathrm{BESQ}^*_a(-\delta)$. By Proposition 3.6 (b), $f$ hits 0 almost surely. By Proposition 3.6 (a), $-f$ is a $\mathrm{BESQ}^*_{-a}(\delta)$ path. By Proposition 3.6 (c), applied to $-f$,

• $f \le 0$ on $[\tau(f), \infty)$ a.s. if $-\delta \in (-2, 0)$,
• $f < 0$ on $(\tau(f), \infty)$ a.s. if $-\delta \le -2$.

It follows from the continuity of $f$ that $f\big(\tau(f)\big) = 0$, since $f > 0$ on $[0, \tau(f))$. To show that $\tau_<(f) = \tau(f)$, suppose for a contradiction that this was not the case. Since $f \le 0$ on $[\tau(f), \infty)$, $\tau_<(f) > \tau(f)$ would imply that $f = 0$ on $[\tau(f), \tau_<(f))$. This contradicts part (c) of Proposition 3.6, and so $\tau_<(f) = \tau(f)$. Hence $f \in D^*([0,\infty), \mathbb{R})$ a.s.

By Proposition 3.6 (c), we have $\tau_<(f) = \infty$ for paths $f$ of a $\mathrm{BESQ}^*_a(\delta)$, $\delta \ge 0$. In particular the conclusion of Corollary 3.7 fails for such parameters.

3.1.3 Martingale problems for diffusions
To discuss some of the general theory needed for a proof later, we follow [19]. In what follows, $\big(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P}\big)$ is a filtered probability space and $\mathbf{F} = (\mathcal{F}_t)_{t \ge 0}$.

Definition 3.8 (Continuous semi-martingale). A continuous semi-martingale $X = (X_t)_{t \ge 0}$ is a real-valued stochastic process on $(\Omega, \mathcal{F}, \mathbb{P})$ of the form $X_t = X_0 + M_t + B_t$, where $X_0$ is $\mathcal{F}_0$-measurable, $M$ a continuous local $\mathbf{F}$-martingale and $B$ a continuous $\mathbf{F}$-adapted process of finite variation, with $M_0 = B_0 = 0$. The quadratic variation $C = \langle M \rangle$ of $X$ is the unique continuous increasing $\mathbf{F}$-adapted process such that $M_t^2 - \langle M \rangle_t$, $t \ge 0$, is a local $\mathbf{F}$-martingale. We refer to $(B, C) = (B_t, C_t)_{t \ge 0}$ as the characteristics of $X$.

If $\eta$ is any distribution on $\mathbb{R}$, we say the martingale problem $s\big(\sigma(X_0), X \mid \eta; B, C\big)$ has a solution $\mathbb{P}$ if $\mathbb{P}$ is a probability measure on $(\Omega, \mathcal{F})$ with $\mathbb{P}(X_0 \in \cdot) = \eta$, and $X$ is a continuous semi-martingale on $(\Omega, \mathcal{F}, \mathbf{F}, \mathbb{P})$ with characteristics $(B, C) = (B_t, C_t)_{t \ge 0}$.

Definition 3.9 (Homogeneous diffusion). A homogeneous diffusion is a continuous semi-martingale $X$ on $(\Omega, \mathcal{F}, \mathbf{F}, \mathbb{P})$ for which there exist Borel functions $b \colon \mathbb{R} \to \mathbb{R}$ and $c \colon \mathbb{R} \to [0,\infty)$, called the drift coefficient and diffusion coefficient, such that $B_t = \int_0^t b(X_s)\,ds$ and $C_t = \int_0^t c(X_s)\,ds$, $t \ge 0$.

A $\mathrm{BESQ}^*_a(\delta)$ process is a homogeneous diffusion with drift coefficient $b(x) = \delta$ and diffusion coefficient $c(x) = 4|x|$, $x \in \mathbb{R}$. This can be seen from the SDE characterisation in Definition 3.5 and (1.2).

We will need two results from [19].

Lemma 3.10. [19, Theorem III.2.26] Consider a homogeneous diffusion $X$ and let our notation be as above. Then the martingale problem $s\big(\sigma(X_0), X \mid \eta; B, C\big)$ has a unique solution if and only if the SDE $dY_t = b(Y_t)\,dt + \sqrt{c(Y_t)}\,dW_t$, $Y_0 = \xi$, has a unique weak solution, for $(W_t)_{t \ge 0}$ an $\mathbf{F}$-Brownian motion and $\xi$ an $\mathcal{F}_0$-measurable random variable with distribution $\eta$.

Lemma 3.11. [19, Theorem IX.4.21] Suppose that for any $a \in \mathbb{R}$, the martingale problem $s\big(\sigma(X_0), X \mid \delta_a; B, C\big)$ has a unique solution $\mathbb{P}_a$, and that $a \mapsto \mathbb{P}_a(A)$ is Borel measurable for all $A \in \mathcal{F}$. For each $n \ge 1$, let $X^n$ be a Markov process started from an initial distribution $\eta_n$, where $X^n$ has generator $\mathcal{L}_n$ of the form
\[
\mathcal{L}_n f(x) = \int_{y \in \mathbb{R}} \big(f(x+y) - f(x)\big) K_n(x, dy),
\]
for a finite kernel $K_n$ on $\mathbb{R}$. Suppose further that the functions $b_n$ and $c_n$ given by
\[
b_n(x) := \int_{y \in \mathbb{R}} y\, K_n(x, dy) \quad \text{and} \quad c_n(x) := \int_{y \in \mathbb{R}} y^2 K_n(x, dy), \quad x \in \mathbb{R},
\]
are well-defined and finite, and that the drift and diffusion coefficients $b$ and $c$ and the initial distribution $\eta$ of $X$ are such that, as $n \to \infty$,

(a) $b_n \to b$ and $c_n \to c$ locally uniformly,
(b) $\sup_{\{x \colon |x| \le r\}} \int_{y \in \mathbb{R}} |y|^2 \mathbf{1}_{\{|y| > \varepsilon\}} K_n(x, dy) \to 0$ for any $\varepsilon > 0$ and any $r > 0$,
(c) $\eta_n \to \eta$ weakly on $\mathbb{R}$.

Then $X^n$ converges in distribution to $X$ as $n \to \infty$ under the Skorokhod topology on $D([0,\infty), \mathbb{R})$.

3.2 Proofs of Theorems 1.3, 1.4 and 1.5

Recall that Theorem 1.3 claims that the distributional scaling limit of the table size evolution of an up-down oCRP$(\alpha, \theta)$ is $\mathrm{BESQ}(-2\alpha)$ and that this convergence holds jointly with the convergence of absorption times in 0.

Proof of Theorem 1.3.
Let $0 \le \alpha \le 1$ and $a \in \mathbb{R}$. Observe by Lemma 3.10 that the martingale problem associated to the SDE characterising the $\mathrm{BESQ}_a(\delta)$ diffusion has a unique solution $\mathbb{P}_a$. Showing that $a \mapsto \mathbb{P}_a(A)$ is Borel measurable is fairly elementary, and so we leave this to the reader. Consider an extension $Z^*_n$ of the table size Markov chain $Z_n$ to $\mathbb{Z}$ that further transitions from 0 into $-1$ at rate $\alpha$, and from state $-m$, $m \ge 1$, to $-m-1$ at rate $m$ and into $-m+1$ at rate $m - \alpha$. This chain is an instance of a Markov chain on $\mathbb{R}$ with generator $\mathcal{L}$ given by
\[
\mathcal{L} f(x) = \big((|x| \vee \alpha) - \alpha\big)\big(f(x+1) - f(x)\big) + \big(|x| \vee \alpha\big)\big(f(x-1) - f(x)\big), \quad x \in \mathbb{R}.
\]
Let $X^n := \big(n^{-1} Z^*_n(2ns),\ s \ge 0\big)$. Then $X^n$ has generator
\[
\mathcal{L}_n f(x) = \int_{y \in \mathbb{R}} \big(f(x+y) - f(x)\big) K_n(x, dy),
\]
where the increment kernel $K_n(x, dy)$ is given by
\[
K_n(x, dy) = n \Big( \big((2n|x| \vee 2\alpha) - 2\alpha\big)\, \delta_{1/n}(dy) + (2n|x| \vee 2\alpha)\, \delta_{-1/n}(dy) \Big), \quad x \in \mathbb{R}.
\]
Now
\[
\int_{y \in \mathbb{R}} y\, K_n(x, dy) = (2n|x| \vee 2\alpha) - 2\alpha - (2n|x| \vee 2\alpha) = -2\alpha,
\]
and
\[
\int_{y \in \mathbb{R}} y^2 K_n(x, dy) = \frac{(2n|x| \vee 2\alpha) - 2\alpha}{n} + \frac{2n|x| \vee 2\alpha}{n} = 4\Big(|x| \vee \frac{\alpha}{n}\Big) - \frac{2\alpha}{n} \to 4|x|.
\]
This convergence is uniform in $x \in \mathbb{R}$ under the supremum norm $\|\cdot\|_\infty$ since
\[
\Big\| 4\Big(|x| \vee \frac{\alpha}{n}\Big) - \frac{2\alpha}{n} - 4|x| \Big\|_\infty = \Big\| 4|x| + \frac{2\alpha}{n} - 4\Big(|x| \vee \frac{\alpha}{n}\Big) \Big\|_\infty = \frac{2\alpha}{n} \to 0.
\]
Moreover, for any $r \ge 0$ and $\varepsilon > 0$, and for $n$ sufficiently large (namely $n \ge 1/\varepsilon$, since all jumps of $X^n$ have size $1/n$),
\[
\sup_{\{x \colon |x| \le r\}} \int_{y \in \mathbb{R}} |y|^2 \mathbf{1}_{\{|y| > \varepsilon\}} K_n(x, dy) = 0.
\]
It is clear that as $n \to \infty$, $n^{-1} \lfloor nz \rfloor \to z$. This shows that all the assumptions of Lemma 3.11 are satisfied. It follows that the law of $X^n$ converges weakly in the Skorokhod topology on $D([0,\infty), \mathbb{R})$ to that of the diffusion process with drift coefficient $-2\alpha$ and diffusion coefficient $4|x|$, namely a $\mathrm{BESQ}^*_z(-2\alpha)$ process.

Let $0 < \alpha \le 1$. Then the paths of a $\mathrm{BESQ}^*_z(-2\alpha)$ process lie in $D^*([0,\infty), \mathbb{R}) \cap C([0,\infty), \mathbb{R})$ by Corollary 3.7 since $-2\alpha < 0$. By applying Skorokhod's representation theorem on the separable space $D([0,\infty), \mathbb{R})$, we may assume that the convergence $X^n \to X$ that we have just established holds almost surely. By Lemma 3.3, this entails the convergence $\tau(X^n) \to \tau(X)$ almost surely. Therefore $\big(X^n, \tau(X^n)\big) \to \big(X, \tau(X)\big)$ almost surely, and this entails the claimed joint distributional convergence since $\tau(X^n)$ and $\tau(X)$ are the first hitting times of 0 for the up-down chains $X^n$ and for the continuous $X$, none of which can reach the negative half-line without first visiting 0.

If instead $\alpha = 0$, then we cannot apply Corollary 3.7. Observe that the Markov chain $Z^*_n$ is a simple birth-death process. By a standard reference such as [18, Corollary 6.11.12], the hitting time $\tau(Z^*_n)$ of 0 by a simple birth-death process $Z^*_n$ started from height $m \in \mathbb{N}$ satisfies
\[
\mathbb{P}\big(\tau(Z^*_n) \le t\big) = \Big(\frac{t}{t+1}\Big)^m \quad \Longrightarrow \quad \mathbb{P}\big(\tau(X^n) \le t\big) = \Big(\frac{2nt}{2nt+1}\Big)^{\lfloor nz \rfloor}.
\]
We conclude that $\mathbb{P}\big(\tau(X^n) \le t\big) \to e^{-z/2t}$, as $n \to \infty$.

By [17, p.319], it is known that $\zeta \overset{D}{=} z/2G$ for the absorption time $\zeta$ of a $\mathrm{BESQ}_z(\delta)$ and $G \sim \mathrm{Gamma}\big((2-\delta)/2, 1\big)$ when $\delta < 2$. Hence $\mathbb{P}(\zeta \le t) = \mathbb{P}\big(\mathrm{Exp}(1) \ge z/2t\big) = e^{-z/2t}$ in the case $\delta = 0$. Thus we still have $\tau(X^n) \xrightarrow{D} \tau(X)$ in the case when $\alpha = 0$.

We need to strengthen this to joint convergence in distribution: since the distributions of pairs $\big(X^n, \tau(X^n)\big)$, $n \ge 1$, are tight, it suffices to show that any subsequential distributional limit for the pairs is such that the limiting time is the extinction time of the limiting process. Specifically, if $\big(X^{n_k}, \tau(X^{n_k})\big)$ converges, we may assume convergence holds almost surely, by Skorokhod's representation theorem, and the limit $(X, \tau)$ is such that $\tau \overset{D}{=} \tau(X)$, as the marginal distributions converge. Also, we can use deterministic arguments as in the proof of Lemma 3.3 to show that $\tau \ge \tau(X)$ almost surely, and with $\tau \overset{D}{=} \tau(X)$, this yields $\tau = \tau(X)$ almost surely, as required. Specifically, let $\varepsilon > 0$. Since $X^{n_k} \to X$ and $\tau(X) < \infty$ almost surely, we have $\inf_{0 \le t \le \tau(X) - \varepsilon} X^{n_k}(t) > 0$ for sufficiently large $k$, i.e. $\tau(X^{n_k}) \ge \tau(X) - \varepsilon$. Since $\tau(X^{n_k}) \to \tau$ almost surely, this yields $\tau \ge \tau(X) - \varepsilon$, and so $\tau \ge \tau(X)$ almost surely, as $\varepsilon > 0$ was arbitrary. Finally, recall that $Z_n(s) = Z^*_n\big(s \wedge \tau(Z^*_n)\big)$, $s \ge 0$.

Recall that Theorem 1.4 claims that the distributional scaling limit of the total number of customers in an up-down oCRP$(\alpha, \theta)$ is $\mathrm{BESQ}(2\theta)$.

Proof of Theorem 1.4.
The convergence of the scaled evolution of the total number of customers to the $\mathrm{BESQ}(2\theta)$ diffusion is proved in the same way as Theorem 1.3, where the extension of the integer-valued Markov chain $M_n$ to state space $\mathbb{R}$ can be chosen to have generator
\[
\mathcal{L} f(x) := \big(|x| + \theta\big)\big(f(x+1) - f(x)\big) + |x|\big(f(x-1) - f(x)\big)
\]
instead. Note that this Markov chain shares the relevant parts of the boundary behaviour of $\mathrm{BESQ}(2\theta)$ at 0 in that 0 is absorbing for $\theta = 0$, while upward transitions from 0 are possible when $\theta > 0$. The proof proceeds as above.

Let us turn to the proof of Theorem 1.5. Specifically, we show that the Lévy process $(X_t)_{t \ge 0}$ underlying the JCCP of the genealogy of the up-down oCRP$(\alpha, 0)$ as in (1.1) has a $(1+\alpha)$-stable Lévy process as its scaling limit, if $0 < \alpha < 1$.

Proof of Theorem 1.5. Recall (1.1). Since $X_0/2n = \zeta_0/2n \to 0$, we may assume that $\zeta_0 = 0$, so that $(X_t)_{t \ge 0}$ is a Lévy process starting from $X_0 = 0$. By conditioning on the number $J_t$ of jumps in the time interval $(0, t]$, applying the independence of the jump heights $\zeta_i$ and using Proposition 2.1 (with $\Gamma(\cdot,\cdot)$ the upper incomplete gamma function),
\[
\begin{aligned}
\mathbb{E}\big(e^{-\lambda X_t}\big) &= e^{\lambda t}\, \mathbb{E}\Big(\exp\Big(-\lambda \sum_{i=1}^{J_t} \zeta_i\Big)\Big) = e^{\lambda t} \sum_{k=0}^{\infty} \Big(\mathbb{E}\big(e^{-\lambda \zeta_1}\big)\Big)^k\, \mathbb{P}(J_t = k) \\
&= e^{\lambda t} \sum_{k=0}^{\infty} \Big(1 - \frac{\lambda}{\alpha} + \frac{\lambda^{1+\alpha}}{\alpha e^{\lambda}\, \Gamma(1+\alpha, \lambda)}\Big)^k e^{-\alpha t} \frac{(\alpha t)^k}{k!} \\
&= e^{(\lambda - \alpha) t} \sum_{k=0}^{\infty} \Big(\alpha t - \lambda t + \frac{\lambda^{1+\alpha} t}{e^{\lambda}\, \Gamma(1+\alpha, \lambda)}\Big)^k \frac{1}{k!} = \exp\Big(\frac{\lambda^{1+\alpha} t}{e^{\lambda}\, \Gamma(1+\alpha, \lambda)}\Big).
\end{aligned}
\]
Hence $(X_t)_{t \ge 0}$ has Laplace exponent given by
\[
\varphi(\lambda) = \frac{\ln \mathbb{E}\big(e^{-\lambda X_t}\big)}{t} = \frac{\lambda^{1+\alpha} e^{-\lambda}}{\Gamma(1+\alpha, \lambda)}.
\]
Now note that the Laplace exponent of $(2n)^{-1} X_{2n^{1+\alpha} t}$ satisfies
\[
\frac{\ln \mathbb{E}\big(\exp\big(-\lambda (2n)^{-1} X_{2n^{1+\alpha} t}\big)\big)}{t} = \frac{(\lambda/2n)^{1+\alpha} e^{-\lambda/2n}}{\Gamma(1+\alpha, \lambda/2n)}\, 2n^{1+\alpha} = \frac{\lambda^{1+\alpha} e^{-\lambda/2n}}{2^{\alpha}\, \Gamma(1+\alpha, \lambda/2n)} \to \frac{\lambda^{1+\alpha}}{2^{\alpha}\, \Gamma(1+\alpha)},
\]
as $n \to \infty$. Writing $\mathbf{X}$ for the limiting Stable$(1+\alpha)$ Lévy process with this Laplace exponent, the convergence theorem for Laplace transforms yields $(2n)^{-1} X_{2n^{1+\alpha}} \xrightarrow{D} \mathbf{X}_1$ as $n \to \infty$. By independence and stationarity of increments, we conclude by Lemma 3.4:
\[
\Big(\frac{X_{2n^{1+\alpha} t}}{2n}\Big)_{t \ge 0} \xrightarrow{D} \big(\mathbf{X}_t\big)_{t \ge 0}
\]
under the Skorokhod topology on $D([0,\infty), \mathbb{R})$, as $n \to \infty$.

In this section we explore further the connections between $\mathrm{BESQ}(-2\alpha)$ and Stable$(1+\alpha)$ processes.

Proposition 3.12.
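Let $\alpha \in (0, 1)$, and denote by $Q^{-2\alpha}_y$ the law of a $\mathrm{BESQ}_y(-2\alpha)$ process started from $y > 0$. If $\zeta$ denotes the absorption time of this process under $Q^{-2\alpha}_y$, then as $y \downarrow 0$, we have that $y^{-(1+\alpha)} Q^{-2\alpha}_y(\zeta > s)$ converges to $\widetilde{\Pi}(s, \infty)$, where
\[
\widetilde{\Pi}(ds) = \frac{1}{2^{1+\alpha}\, \Gamma(1+\alpha)}\, s^{-(2+\alpha)}\, ds
\]
is the Lévy measure of a spectrally positive Stable$(1+\alpha)$ Lévy process.

Proof. Recall from [17, p.319] that under $Q^{-2\alpha}_y$, $\zeta \overset{D}{=} y/2G$, where $G \sim \mathrm{Gamma}(1+\alpha, 1)$ has shape parameter $1+\alpha$ and rate parameter 1. Observe that
\[
Q^{-2\alpha}_y(\zeta > s) = \mathbb{P}\Big(G < \frac{y}{2s}\Big) = \int_0^{y/2s} \frac{1}{\Gamma(1+\alpha)}\, e^{-u} u^{\alpha}\, du.
\]
Substituting $x = 2su/y$, we see that
\[
y^{-(1+\alpha)} Q^{-2\alpha}_y(\zeta > s) = \int_0^1 \frac{1}{\Gamma(1+\alpha)\, 2^{1+\alpha}}\, e^{-yx/2s}\, x^{\alpha}\, s^{-(1+\alpha)}\, dx.
\]
By letting $y \downarrow 0$, this converges to
\[
\int_0^1 \frac{1}{\Gamma(1+\alpha)\, 2^{1+\alpha}}\, x^{\alpha}\, s^{-(1+\alpha)}\, dx = \frac{1}{2^{1+\alpha}\, \Gamma(2+\alpha)}\, s^{-(1+\alpha)}.
\]
Writing $\widetilde{\Pi}(s, \infty) = \int_s^{\infty} \widetilde{\Pi}(du)$, we observe that indeed $y^{-(1+\alpha)} Q^{-2\alpha}_y(\zeta > s) \to \widetilde{\Pi}(s, \infty)$ as $y \downarrow 0$. Note that $\widetilde{\Pi}$ is the Lévy measure of a stable process with Laplace exponent
\[
\frac{\Gamma(1-\alpha)}{\alpha\, 2^{1+\alpha}\, \Gamma(2+\alpha)}\, \lambda^{1+\alpha}.
\]

In particular, the Lévy measure of the Stable$(1+\alpha)$ process in Theorem 1.5 is $c\,\widetilde{\Pi}$ where $c = 2\alpha(1+\alpha)/\Gamma(1-\alpha)$. Indeed, multiplying the above Laplace exponent by $c$ gives $\lambda^{1+\alpha}/\big(2^{\alpha}\, \Gamma(1+\alpha)\big)$, the limit obtained in the proof of Theorem 1.5.

We can also approximate the Lévy measure directly from the convergence in Theorem 1.5, where pre-limiting jump sizes are governed by the integer-valued table size process, while limiting jump sizes are governed by the Lévy measure. As in Proposition 2.1, we will denote by $\mathbb{P}_1$ the distribution of the table size process starting from 1, and by $\zeta$ the hitting time of 0 under $\mathbb{P}_1$.

Corollary 3.13.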
For every $\varepsilon > 0$, it is the case that as $n \to \infty$,
\[
2\alpha n^{1+\alpha}\, \mathbb{P}_1\Big(\frac{\zeta}{2n} > \varepsilon\Big) \longrightarrow \Pi(\varepsilon, \infty) \quad \text{and} \quad \mathbb{P}_1\Big(\frac{\zeta}{2n} \in \cdot \,\Big|\, \frac{\zeta}{2n} > \varepsilon\Big) \longrightarrow \frac{\Pi\big(\cdot \cap (\varepsilon, \infty)\big)}{\Pi(\varepsilon, \infty)},
\]
where $\Pi = c\,\widetilde{\Pi}$ is as above and the second convergence is in the sense of weak convergence.

Proof. In the setting of the proof of Theorem 1.5, with $X_0 = 0$, set $X^n_t := (2n)^{-1} X_{2n^{1+\alpha} t}$, and write $F^{X^n}$ and $F^{\mathbf{X}}$ for the Lévy measures of $X^n$ and of the limiting Stable$(1+\alpha)$ process $\mathbf{X}$ of Theorem 1.5, respectively. Noting that $X^n_0 = 0$, a combination of Lemma 3.4 and Theorem 1.5 shows that $F^{X^n}(g) \to F^{\mathbf{X}}(g)$ for every $g \in C_2([0,\infty), \mathbb{R})$. Since $X$ is a compound Poisson process with drift, observe that it has Lévy measure $\alpha\, \mathbb{P}_1(\zeta \in dx)$, so that $F^{X^n}(dx) = 2\alpha n^{1+\alpha}\, \mathbb{P}_1\big((2n)^{-1} \zeta \in dx\big)$. On the other hand, $F^{\mathbf{X}}$ is the Lévy measure of the limiting Stable$(1+\alpha)$ Lévy process with Laplace exponent $\psi$. This Lévy measure is $F^{\mathbf{X}} = \Pi$, as identified just above the statement of this corollary.

Note that Lemma 3.4 yields vague convergence of the $\sigma$-finite measures $F^{X^n}$ to $F^{\mathbf{X}}$ on $(0, \infty)$. Since $F^{\mathbf{X}}$ is a measure with no atoms, it is easy to show that this is equivalent to the claim that
\[
F^{X^n}(\varepsilon, \infty) \longrightarrow F^{\mathbf{X}}(\varepsilon, \infty) \quad \text{for all } \varepsilon > 0.
\]
Now, for any $x > \varepsilon$, observe that as $n \to \infty$,
\[
\mathbb{P}_1\Big(\frac{\zeta}{2n} > x \,\Big|\, \frac{\zeta}{2n} > \varepsilon\Big) = \frac{2\alpha n^{1+\alpha}\, \mathbb{P}_1\big((2n)^{-1} \zeta > x\big)}{2\alpha n^{1+\alpha}\, \mathbb{P}_1\big((2n)^{-1} \zeta > \varepsilon\big)} \longrightarrow \frac{\Pi(x, \infty)}{\Pi(\varepsilon, \infty)},
\]
which entails the claimed weak convergence of probability measures on $(\varepsilon, \infty)$.

Remark 3.14. It can be shown that $2\alpha n^{1+\alpha}\, \mathbb{P}_1\big((n^{-1} Z_{2nt})_{t \ge 0} \in \cdot\big) \to \Theta$ vaguely on $D([0,\infty), \mathbb{R}) \setminus \{0\}$, where $\Theta$ is the $\mathrm{BESQ}(-2\alpha)$ excursion measure mentioned above Conjecture 1.6, with $\Theta(\zeta \in \cdot) = \Pi$. See [11, 12, 13] for the direct study of limiting structures consisting of Stable$(1+\alpha)$ processes with $\mathrm{BESQ}(-2\alpha)$ excursions in their jumps.

Acknowledgements
Dane Rogers was supported by EPSRC DPhil studentship award 1512540 and by a Merton doctoral completion bursary. We would like to thank Jim Pitman for pointing out some relevant references and Noah Forman, Soumik Pal and Douglas Rizzolo for allowing us to build on unpublished drafts from which several ideas here arose and that explored the special case $\alpha = 1/2$.

References

[1]
Aldous, D. (1991). The continuum random tree. I. Ann. Probab., 1–28.
[2] Aldous, D. (1999). Wright–Fisher diffusions with negative mutation rate!
[3] Aldous, D. and Diaconis, P. (1987). Strong uniform times and finite random walks. Adv. Appl. Math., 69–97.
[4] Asmussen, S., Avram, F. and Pistorius, M. (2004). Russian and American put options under exponential phase-type Lévy models. Stoch. Proc. Appl., 79–111.
[5] Billingsley, P. (1999). Convergence of Probability Measures, 2nd edn. John Wiley and Sons.
[6] Borodin, A. and Olshanski, G. (2009). Infinite-dimensional diffusions as limits of random walks on partitions. Probab. Theory Related Fields, 281–318.
[7] Carlton, M. A. (2002). A family of densities derived from the three-parameter Dirichlet process. J. Appl. Probab., 764–774.
[8] Feng, S. and Sun, W. (2010). Some diffusion processes associated with two parameter Poisson–Dirichlet distribution and Dirichlet process. Probab. Theory Related Fields, 501–525.
[9] Flajolet, P. and Guillemin, F. (2000). The formal theory of birth-and-death processes, lattice path combinatorics and continued fractions. Adv. Appl. Probab., 750–778.
[10] Forman, N., Pal, S., Rizzolo, D., Shi, Q. and Winkel, M. (2020). A two-parameter family of interval-partition-valued diffusions with Poisson–Dirichlet stationary distributions. Work in progress.
[11] Forman, N., Pal, S., Rizzolo, D. and Winkel, M. (2018). Uniform control of local times of spectrally positive stable processes. Ann. Appl. Probab. 28(4), 2592–2634.
[12] Forman, N., Pal, S., Rizzolo, D. and Winkel, M. (2019). Diffusions on a space of interval partitions: construction from marked Lévy processes. arXiv:1909.02584 [math.PR].
[13] Forman, N., Pal, S., Rizzolo, D. and Winkel, M. (2019). Diffusions on a space of interval partitions: Poisson–Dirichlet stationary distributions. arXiv:1910.07626 [math.PR].
[14] Fulman, J. (2009). Commutation relations and Markov chains. Probab. Theory Related Fields, 99–136.
[15] Geiger, J. and Kersting, G. (1997). Depth-first search of random trees and Poisson point processes. IMA Vol. Math. Appl., 111–126.
[16] Gnedin, A. and Pitman, J. (2005). Regenerative composition structures. Ann. Probab., 445–479.
[17] Göing-Jaeschke, A. and Yor, M. (2003). A survey and some generalisations of Bessel processes. Bernoulli, 313–349.
[18] Grimmett, G. and Stirzaker, D. (2001). Probability and Random Processes, 3rd edn. Oxford University Press.
[19] Jacod, J. and Shiryaev, A. (2002). Limit theorems for stochastic processes, 2nd edn. Springer.
[20] Jagers, P. (1969). A general stochastic model for population development. Skand. Aktuarietidskr., 84–103.
[21] Jagers, P. (1975). Branching processes with biological applications. Wiley and Sons.
[22] James, L. (2006). Poisson Calculus for spatial neutral to the right processes. Ann. Stat., 416–440.
[23] Joyce, P. and Tavaré, S. (1987). Cycles, permutations and the structure of the Yule process with immigration. Stoch. Proc. Appl., 309–314.
[24] Knight, F. B. (1963). Random walks and a sojourn density process of Brownian motion. Trans. Amer. Math. Soc., 56–86.
[25] Lambert, A. (2010). The contour of splitting trees is a Lévy process. Ann. Probab., 348–395.
[26] Lorentzen, L. and Waadeland, H. (2008). Continued fractions, 2nd edn. Atlantis Press.
[27] Pal, S. (2013). Wright–Fisher diffusion with negative mutation rates. Ann. Probab., 503–526.
[28] Petrov, L. (2009). Two-parameter family of diffusion processes in the Kingman simplex. Funct. Anal. Appl., 279–296.
[29] Pitman, J. (2006). Combinatorial Stochastic Processes. Springer.
[30] Pitman, J. and Winkel, M. (2009). Regenerative tree growth: binary self-similar continuum random trees and Poisson-Dirichlet compositions. Ann. Probab., 1999–2041.
[31] Pitman, J. and Yakubovich, Y. (2018). Ordered and size-biased frequencies in GEM and Gibbs' models for species sampling. Ann. Appl. Probab., 1793–1820.
[32] Pitman, J. and Yor, M. (1982). A decomposition of Bessel bridges. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 425–457.
[33] Revuz, D. and Yor, M. (1999). Continuous Martingales and Brownian Motion. Springer.
[34] Rogers, D. (2020). Up-down ordered Chinese Restaurant Processes: representation and asymptotics. DPhil thesis, University of Oxford.
[35] Shiga, T. (1990). A stochastic equation based on a Poisson system for a class of measure-valued diffusion processes. Journal of Mathematics of Kyoto University, 245–279.
[36] Teh, Y. W. (2006). A hierarchical Bayesian language model based on Pitman-Yor processes. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, pp. 985–992.