Quasi-Limiting Behavior of Drifted Brownian Motion
SangJoon Lee ∗ Iddo Ben-Ari † November 20, 2020
Abstract
A Quasi-Stationary Distribution (QSD) for a Markov process with an almost surely hit absorbing state is a time-invariant initial distribution for the process conditioned on not being absorbed by any given time. An initial distribution for the process is in the domain of attraction of a QSD ν if the distribution of the process at time t, conditioned not to be absorbed by time t, converges to ν as t tends to infinity. We study Brownian motion with constant drift on the half line [0, ∞) absorbed at 0. Previous work by Martínez et al. [13][12] identifies all QSDs and provides a nearly complete characterization of their domains of attraction. Specifically, it was shown that if the initial distribution has a well-defined exponential tail (including the case of a tail lighter than any exponential), then it is in the domain of attraction of a QSD determined by the exponent. In this work we expand the discussion of the dependence on the initial distribution by

1. obtaining a new approach to existing results, explaining the direct relation between a QSD and an initial distribution in its domain of attraction; and

2. considering a wide class of heavy-tailed initial distributions, for which non-trivial limits are obtained under appropriate scaling.

1 Introduction

Here we review the origin and some well-known results of the study of QSDs. In Section 1.1 we present the general definition of a QSD and related theorems. In Section 1.2 we introduce the specific model we work with in this paper, and present some previous results on the model.
∗ Department of Mathematics, University of Connecticut, Storrs, CT 06269-1009, USA; e-mail: [email protected]
† Department of Mathematics, University of Connecticut, Storrs, CT 06269-1009, USA; e-mail: [email protected]

1.1 Quasi-Stationary Distributions

Consider a Markov process X = (X_t : t ≥ 0) on ℝ_+ = [0, ∞) with 0 as a unique absorbing state. Let τ = inf{t ≥ 0 : X_t = 0}. We will work under the assumption
$$P_x(\tau < \infty) = 1, \quad \text{for all } x \in \mathbb{R}_+. \quad (1)$$
The notation P_x is shorthand for the distribution of X with initial distribution, the distribution of X_0, equal to the Dirac delta measure at x. If π is a stationary distribution for X, then (1) guarantees that π = δ_0 [7, Section 2.2]. While this result is not very interesting, the distribution of the process, and particularly of X_t conditioned on {τ > t}, is in general far from trivial. This naturally leads to the following "conditional" analog of a stationary distribution:

Definition 1.1.
The probability distribution π is a Quasi-Stationary Distribution (QSD) for X if P_π(X_t ∈ · | τ > t) = π for all t > 0. A seemingly more relaxed definition, in the spirit of ergodic theorems for Markov chains, is the following:
Definition 1.2.
A probability distribution π is a Quasi-Limiting Distribution (QLD) for X if for some µ,
$$\lim_{t\to\infty} P_\mu(X_t \in \cdot \mid \tau > t) = \pi, \quad \text{in distribution}, \quad (2)$$
where, as usual, P_π and P_µ are shorthand for the distribution of X with initial distribution equal to π or µ, respectively. QLDs corresponding to initial distributions which are Dirac deltas (or, more generally, compactly supported initial distributions) are known as Yaglom limits. Of course, a QSD is a QLD. A partial converse holds under a standard regularity assumption (the Feller property):

Proposition 1.3.
Suppose that for every t > 0 and every continuous and bounded function f on (0, ∞), the function x ↦ E_x[f(X_t), τ > t] is continuous. Then every QLD for X is a QSD for X.

For the sake of completeness, we provide a proof in Appendix A.1. We comment that QLDs are very often defined by requiring a pointwise limit rather than a limit in distribution. That is, (2) in Definition 1.2 is replaced by
$$\lim_{t\to\infty} P_\mu(X_t \in A \mid \tau > t) = \pi(A) \quad \text{for all measurable } A \subseteq (0,\infty). \quad (3)$$
With this definition the conclusion of Proposition 1.3 holds without the additional regularity condition we imposed; see [15, Definition 1 and Proposition 1]. As in the sequel we will only work with processes satisfying the condition in the proposition, we always consider QLDs as QSDs. In light of the above, when µ and π are as in Definition 1.2, we say that µ is in the domain of attraction of the QSD π. Of course, the domain of attraction of any QSD contains the QSD itself.

Figure 1: Illustration of the difference between the unconditioned process and the process conditioned not to be absorbed by a given time. (a) Sample paths of one-dimensional Brownian motion with constant negative drift −0.02 and fixed initial state X_0 = 2000. (b) Empirical density of the same process at t = 100000 (sample size 10000); as expected, X_t follows a Gaussian distribution. (c) Sample paths of the same process, conditioned not to be absorbed by t = 100000. (d) Empirical density of the conditioned process; unlike the one above, this distribution has an exponential tail, and the density near 0 drops significantly.

Figure 1 illustrates the difference between the unconditioned process, the process that is required to be positive only at the given time, and the process that is required never to hit 0 up to the given time. Unlike the uniqueness of the stationary distribution under irreducibility assumptions, QSDs are in general not unique, and typically a continuum of QSDs exists. Notable exceptions are Markov chains on finite state spaces (with a unique absorbing state) and certain diffusion processes on bounded domains absorbed at the boundary. One strategy for finding QSDs is to study the quasi-limiting behavior under different initial distributions. When the class of QSDs is known, it is natural to ask what the domain of attraction of each is.

The concept of QSD is fairly intuitive, and the idea was first introduced as early as 1931 by Wright [24]; the terms related to QSDs were crystallized in the 1950s by Bartlett [1][2]. Mathematically, Yaglom [25] first exhibited an explicit limiting conditional distribution, for the subcritical Bienaymé-Galton-Watson branching process. In the discrete setting there are detailed results for some specific models; for example, explicit descriptions of QSDs are known for certain birth-and-death processes [7, Theorem 5.4].
As for uniqueness, a necessary and sufficient condition for birth-and-death processes was obtained by van Doorn [21]; Martínez, San Martín and Villemonais later generalized the result to countable state processes [14]. For other discrete state space models, Buiculescu studied QSDs for multi-type Galton-Watson processes [4], and Ferrari and Marić discussed QSDs approximated by Fleming-Viot processes [8]. A survey by van Doorn and Pollett [22] gives a comprehensive view of the progress on discrete space models. Our work is on Brownian motion with constant drift, one of the few models where a lot is known explicitly, in part because it is a Gaussian process. Our main object of interest is the dependence on the initial distribution, and it is a continuation of the works of Martínez and San Martín, who identified all QSDs for the model [13] and later identified the domain of attraction of each QSD [12]. Rates of convergence to Yaglom limits were also studied by Polak and Rolski [18] and Oçafrain [16]. Another diffusion, notably also a Gaussian process, for which explicit results are known is the Ornstein-Uhlenbeck process: Lladser and San Martín [11] classified QSDs through their domains of attraction.
Ye [26] identified the Yaglom limit for fractional-dimensional radial Ornstein-Uhlenbeck processes. As for the general theory for diffusion processes, there has been much work and progress on conditions for existence and uniqueness of QSDs and on convergence to the Yaglom limit. Here is a partial list of references: Pinsky [17] (smooth bounded domains with absorption on the boundary); Cattiaux, Collet, Lambert, Martínez, Méléard and San Martín [5], Steinsaltz and Evans [20], Kolb and Steinsaltz [10], and Hening and Kolb [9] (uniqueness and convergence for one-dimensional diffusions); and Champagnat and Villemonais [6] (rates of convergence for one-dimensional diffusions). We close this section with well-known properties related to QSDs and QLDs.
Theorem 1.4. [7, Theorem 2.2] Suppose that π is a QSD. Then under P_π, τ is exponentially distributed with some parameter λ_π > 0.

Proposition 1.5.
Let the assumption of Proposition 1.3 hold, and let µ be in the domain of attraction of the QSD π. Then for every ε > 0,
$$P_\mu(\tau > t) = o\big(e^{-(\lambda_\pi - \varepsilon)t}\big),$$
where λ_π is as in Theorem 1.4.

We give an elementary proof in Appendix A.2. See also [15, Proposition 5] for a sharper result under slightly stronger assumptions.

1.2 Quasi-Stationarity for Drifted BM
In this section and the sequel we will work under the following:
Assumption 1.6. X is Brownian Motion (BM) with constant negative drift −α, α > 0, on ℝ_+, absorbed at 0.

Analytically, BM with constant drift −α on ℝ_+ absorbed at 0 is the sub-Markovian process generated by L_α, which, for each u satisfying u ∈ C²(ℝ_+) and u(0) = 0, acts as
$$L_\alpha u = \tfrac12 u'' - \alpha u'.$$
The works by Martínez, Picco and San Martín [13][12] studied QSDs for this class of models. The formal derivation of the densities of the QSDs, as presented in their main results, is given in Appendix A.3.
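Since the explicit densities are deferred to Appendix A.3, a numerical sketch may help fix ideas. The sketch below assumes the standard explicit form of the family for this model, π_γ(dy) = ((α²−γ²)/γ) e^{−αy} sinh(γy) dy for γ ∈ (0, α) — an assumption on our part; the paper's own derivation in Appendix A.3 is authoritative — and checks that each member is a probability density and that γ = 0 is a removable singularity with limit α² y e^{−αy}:

```python
import numpy as np

alpha = 1.0

def qsd_density(y, gamma, alpha=alpha):
    """Candidate QSD density pi_gamma (assumed explicit form; see Appendix A.3).
    As gamma -> 0 the family degenerates to alpha^2 * y * exp(-alpha*y)."""
    if gamma == 0.0:
        return alpha**2 * y * np.exp(-alpha * y)
    return (alpha**2 - gamma**2) / gamma * np.exp(-alpha * y) * np.sinh(gamma * y)

y = np.linspace(0.0, 250.0, 250001)
for gamma in (0.0, 0.25, 0.5, 0.9):
    mass = np.trapz(qsd_density(y, gamma), y)
    assert abs(mass - 1.0) < 1e-6        # each pi_gamma integrates to 1

# removable singularity at gamma = 0: pi_gamma -> alpha^2 * y * exp(-alpha*y)
diff = np.max(np.abs(qsd_density(y, 1e-6) - qsd_density(y, 0.0)))
assert diff < 1e-5
```

The normalization check works because ∫₀^∞ e^{−αy} sinh(γy) dy = γ/(α²−γ²) for γ < α; the grid is taken out to y = 250 so that even the slowest tail (γ = 0.9, decay rate α − γ = 0.1) is captured to the stated tolerance.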
Theorem 1.7. [13, Proposition 1] Every QSD for X is of the form π_γ for some γ ∈ [0, α).

Theorem 1.8. [12, Theorem 1.3] The probability measure µ is in the domain of attraction of π_0 if
$$\liminf_{x\to\infty} \frac{\ln \mu([x,\infty))}{x} \le -\alpha.$$

Theorem 1.9. [12, Theorem 1.1] Let ρ ∈ (0, α). The probability measure µ is in the domain of attraction of π_{α−ρ} if
$$\lim_{x\to\infty} \frac{\ln \mu([x,\infty))}{x} = -\rho.$$

We note the following:

1. Theorem 1.9 was proved under the assumption that µ has a smooth density.

2. The limit condition in Theorem 1.9 is not merely technical. The authors constructed an example [12, Theorem 1.4] of an initial distribution whose tail alternates between two exponential decay rates and which is not in the domain of attraction of any QSD. We comment that the method we develop in this paper provides a simpler construction of such an initial distribution.

We present our main results in Section 2, split according to the tail of the initial distribution, considering initial distributions in the domains of attraction of QSDs in Section 2.1 and heavy tails in Section 2.2. Our proofs are split across three sections: in Section 3 we present some general tools we will use; in Section 4 we prove the results from Section 2.1; in Section 5 we prove the results from Section 2.2, along with some concrete examples in Section 5.4.
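The classification in Theorems 1.8 and 1.9 is driven entirely by the exponential tail exponent ρ = lim_{x→∞} −ln µ([x,∞))/x of the initial distribution. As a small illustrative sketch (the log-tails below are textbook closed forms, chosen by us for illustration rather than taken from this paper's examples), one can estimate ρ numerically and read off which regime applies:

```python
import numpy as np

def tail_exponent(log_tail, x=np.array([50.0, 100.0, 200.0, 400.0])):
    """Estimate rho = lim_{x->inf} -ln mu([x, inf)) / x at a few large x."""
    return -log_tail(x) / x

alpha = 1.0

# Exponential(lam): ln mu([x, inf)) = -lam * x, so rho = lam.
lam = 0.5
rho_exp = tail_exponent(lambda x: -lam * x)
assert np.allclose(rho_exp, lam)     # rho < alpha: attracted to pi_{alpha - lam} (Theorem 1.9)

# Half-normal-type tail: ln mu([x, inf)) ~ -x**2 / 2, so rho = infinity.
rho_hn = tail_exponent(lambda x: -x**2 / 2)
assert rho_hn[-1] > alpha            # rho >= alpha: attracted to pi_0 (Theorem 1.8)

# Pareto-type tail: ln mu([x, inf)) = -kappa * ln(x), so rho = 0.
rho_par = tail_exponent(lambda x: -2.0 * np.log(x))
assert rho_par[-1] < 0.05            # too heavy: no QSD attracts it (Section 2.2)
```

The three branches correspond, in order, to Theorem 1.9, Theorem 1.8, and the heavy-tailed regime treated in Section 2.2.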
Main Results
In this section we state our main results, first proposing a principle which we think can help the reader envision the general classification of quasi-limiting behavior, and then providing the theorems based on that principle. We recall that we are working under Assumption 1.6. Our goals are twofold:

1. Develop a method that yields alternate proofs of Theorems 1.8 and 1.9, which can be generalized to other models and which leads to a complete characterization of the domain of attraction of every QSD. These results are presented in Section 2.1.

2. Characterize the asymptotic behavior when the initial distribution has tails heavier than exponential. It is not hard to show (see Lemma 2.5) that such initial distributions are not in the domain of attraction of any QSD. These results are presented in Section 2.2.
At its core, the concept of quasi-stationarity concerns conditional probabilities given events of diminishing probability, namely the events {τ > t}. It is therefore natural to study the rate at which their probabilities, P_µ(τ > t), tend to zero. One of the nice properties of our model is that, through the Girsanov theorem and the reflection principle (or formulas for Brownian bridges), a closed-form expression for these probabilities is readily available. We have:

Proposition 2.1.
$$P_\mu(X_t \in dy, \tau > t) = \frac{1}{\sqrt{2\pi t}}\int \exp\Big(\alpha x - \frac{\alpha^2 t}{2} - \alpha y\Big)\Big(e^{-\frac{(x-y)^2}{2t}} - e^{-\frac{(x+y)^2}{2t}}\Big)\, d\mu(x). \quad (4)$$

Our approach to the problem is to associate with each initial distribution µ a family of probability measures (ν_t : t ≥ 0) and to apply the following principle:

Principle 2.2.
$$\lim_{t\to\infty}\nu_t = \delta_\gamma \implies \lim_{t\to\infty} P_\mu(X_t \in \cdot \mid \tau > t) = \pi_\gamma. \quad (5)$$

The measure ν_t is defined through its cumulative distribution function F_{ν_t}:
$$F_{\nu_t}(z) = C_t \int_{[0, zt]} e^{-x^2/(2t)}\, e^{\alpha x}\, d\mu(x), \quad (6)$$
where C_t is a normalization constant. Table 1 summarizes our results; it shows the relation between µ, ν_t and the QLD of µ. The key idea of the method is to "decouple" the initial distribution from the asymptotic distribution, identifying the relevant QSD as a member of a one-parameter family, selected according to the value of γ. Indeed, in our model, observe that the mapping γ ↦ π_γ, γ ∈ [0, α), as given in (99), is an explicit function, the case γ = 0 being merely a removable singularity, defined as lim_{γ→0} π_γ.

Table 1: Domain of attraction classified by the parameter ρ = lim_{x→∞} −ln µ([x,∞))/x.

  ρ ≥ α:      lim ν_t = δ_0;       QLD (= QSD) is π_0 (Theorem 2.3).       Examples: half-normal distribution, delta distribution.
  0 < ρ < α:  lim ν_t = δ_{α−ρ};   QLD (= QSD) is π_{α−ρ} (Theorem 2.4).   Example: exponential distribution with rate λ < α.
  ρ = 0:      lim ν_t = δ_α;       the QLD does not exist: scaling is necessary; see Section 2.2 and Table 2.   Examples: Pareto distribution, half-Cauchy distribution.

We believe that this method has a number of advantages:
1. It is more intuitive, simpler and more elementary than the previous approach. It lets us understand how the initial distribution actually evolves over time and, at any specific time, which part of the initial distribution has evolved to constitute the absolute majority of the non-absorbed process.

2. The method allows for an expanded characterization of the domains of attraction of QSDs.

3. Our approach simplifies the analysis of the case of a distribution with alternating exponential tails, given in [12], and opens the possibility of studying general compound-tail distributions.

We expect this method to be applicable to other models, and we hope it can be adopted as a general framework for classifying domains of attraction of QSDs. Principle 2.2 will be employed in two ways. We first observe that
$$\lim_{t\to\infty}\nu_t = \begin{cases} \delta_0 & \iff\ \limsup_{x\to\infty} -\dfrac{\ln\mu([x,\infty))}{x} \ge \alpha,\\[4pt] \delta_{\alpha-\rho} & \iff\ \lim_{x\to\infty} -\dfrac{\ln\mu([x,\infty))}{x} = \rho < \alpha. \end{cases} \quad (7)$$
Through the application of the approach outlined above we obtain the following results:

Theorem 2.3. Suppose µ satisfies the following assumption:
$$\rho := \liminf_{x\to\infty} -\frac{\ln\mu([x,\infty))}{x} \ge \alpha. \quad (8)$$
Then P_µ(X_t ∈ · | τ > t) → π_0.

Theorem 2.4.
Suppose µ satisfies the following assumption:
$$\rho := \lim_{x\to\infty} -\frac{\ln\mu([x,\infty))}{x} \in (0, \alpha), \quad (9)$$
and let the family of measures (ν_t : t ≥ 0) be defined as in (6). Then
$$\lim_{t\to\infty}\nu_t = \delta_{\alpha-\rho}, \quad (10)$$
and moreover
$$\lim_{t\to\infty} P_\mu(X_t \in \cdot \mid \tau > t) = \pi_{\alpha-\rho}. \quad (11)$$

We will refer to µ satisfying (8) as possessing "critical and super-critical" tails (with critical corresponding to equality), and will prove Theorem 2.3 in Section 4.1. We will refer to µ satisfying (9) as possessing "sub-critical exponential" tails, and will prove Theorem 2.4 in Section 4.2.

A natural question arising from [12] is the following: what happens if the tail of the initial distribution is too heavy for µ to be in the domain of attraction of any QSD? A first step in this direction is to look for such initial distributions. In light of Theorems 1.8 and 1.9, the following is not surprising:
Lemma 2.5.
Suppose lim_{x→∞} ln µ([x,∞))/x = 0. Then P_µ(τ > t) does not decay exponentially. As a consequence, (P_µ(X_t ∈ · | τ > t) : t ≥ 0) is not tight.

Thus, in order to obtain a non-trivial limit, one has to scale X_t as t → ∞. As we will see, the scaling itself depends on µ. We comment that all of the cases covered in this section correspond to ν_t → δ_α in (5). The next step is to study the long-time behavior under such heavier-tailed distributions, and this is the main topic of this part of the project. In order to do so, we rely mainly on the theory of regularly varying functions [3].

Assumption 2.6.
Suppose µ is a probability measure satisfying the following:

1. µ([x,∞)) = e^{−F(x)}, with F smoothly varying [3, Section 1.8] with index β < 1/2.

2. There exists a positive function R(x, c) on ℝ_+ × ℝ_+, increasing in c, such that for all c > 0,
$$\lim_{x\to\infty} F(x + R(x, c)) - F(x) = c. \quad (12)$$

Some comments are in order:

1. Probability measures with regularly varying tails fall into the category β = 0. Some distinguished cases are the Weibull distribution with shape parameter 0 < k < 1, for which β = k, and the Pareto and Cauchy distributions, for which β = 0.

2. If F is smooth enough, then
$$R(x, c) = \frac{c}{F'(x)}. \quad (13)$$
So when β ≠ 0, R(x, c) is a regularly varying function of x with index φ = 1 − β.

3. When β = 0 it is more natural to replace the identity function on the right-hand side of (12) with a strictly increasing, continuous and nonnegative function H satisfying H(0) = 0.

The main principle we developed to obtain results under Assumption 2.6 is the following.

Principle 2.7.
Assumption 2.6 ⟹
$$\lim_{t\to\infty} P_\mu\big(X_t > R(\alpha t, c) \mid \tau > t\big) = e^{-c}. \quad (14)$$

We note that the assumption β < 1 is natural: β = 1 is the critical border at which the relation between the survival rate P_µ(τ > t) and the initial distribution µ changes. Also, although Lemma 2.5 applies whenever 0 ≤ β < 1, Principle 2.7 only applies to 0 < β < 1/2. The remaining range 1/2 ≤ β < 1 is left open.

Theorem 2.8.
Suppose µ([x,∞)) = exp(−F(x)), where F is a strictly increasing, smoothly varying function with index β ∈ (0, 1/2). Then
$$\lim_{t\to\infty} P_\mu\Big(X_t > \frac{c}{F'(\alpha t)}\;\Big|\;\tau > t\Big) = e^{-c}. \quad (15)$$

Table 2 summarizes our results by showing how β relates to some well-known distributions, and how they lead to the quasi-limiting behavior of such initial distributions. The table also lists a number of concrete cases, all presented in Section 5.4. Throughout the rest of the paper we use the following asymptotic notation: f(t) ∼ g(t) if lim_{t→∞} f(t)/g(t) ∈ (0, ∞), and f(t) ≪ g(t) if lim_{t→∞} f(t)/g(t) = 0.

Table 2: Distributions classified by the index parameter β of F(x) = −ln µ([x,∞)).

  β ≥ 1 (Weibull shape k ≥ 1):   with ρ = lim_{x→∞} −ln µ([x,∞))/x, Theorem 2.3 applies if ρ ≥ α and Theorem 2.4 if ρ < α.   Examples: exponential distribution, Erlang distribution.
  1/2 ≤ β < 1 (1/2 ≤ k < 1):     not covered by our results.
  0 < β < 1/2:                   Theorem 2.8.   Example: Weibull distribution with shape parameter k < 1/2 (Example 5.5).
  β = 0:                         µ([x,∞)) is regularly varying with index −κ: Corollary 5.6 if κ ≠ 0, Corollary 5.8 if κ = 0.   Examples: Pareto distribution, half-Cauchy distribution (Example 5.7), log-Cauchy distribution.

General Tools

In this section we prove Proposition 2.1, the master formula used throughout this paper. We also further explain the intuition behind the family of new measures ν_t. Finally, we introduce variations of Scheffé's lemma [19], one of the tools for Section 4.

When X is a drifted Brownian motion with drift −α (so that X_t + αt is a standard BM B_t),
$$P_x(X_t \in dy) = \exp\Big(\alpha x - \frac{\alpha^2 t}{2} - \alpha y\Big)\, P_x(B_t \in dy). \quad (16)$$
We also want to enforce the condition τ > t, where τ is the hitting time of 0. We can apply the reflection principle to compute P_x(X_t ∈ dy, τ > t):
$$P_x(X_t \in dy, \tau > t) = \exp\Big(\alpha x - \frac{\alpha^2 t}{2} - \alpha y\Big) P_x(B_t \in dy, \tau > t) = \underbrace{\exp\Big(\alpha x - \frac{\alpha^2 t}{2} - \alpha y\Big)\frac{1}{\sqrt{2\pi t}}\Big(e^{-\frac{(x-y)^2}{2t}} - e^{-\frac{(x+y)^2}{2t}}\Big)}_{=f(t,x,y)}\,dy. \quad (17)$$
Integrating f(t, x, y) with respect to µ gives (4). Furthermore, we can obtain the survival probability from the above formula as well:
$$P_\mu(\tau > t) = \int_0^\infty\!\!\int_0^\infty f(t, x, y)\, dy\, d\mu(x). \quad (18)$$
We wrap up this section with the principle behind finding the family of probability measures (ν_t : t ≥
0) appearing in Principle 2.2. From (18),
$$P_\mu(\tau > t) = \frac{e^{-\alpha^2 t/2}}{\sqrt{2\pi t}}\int_0^\infty \mu(x)\, e^{-\frac{x^2}{2t}} e^{\alpha x}\int_0^\infty e^{-\frac{y^2}{2t}} e^{-\alpha y}\big(e^{\frac{xy}{t}} - e^{-\frac{xy}{t}}\big)\, dy\, dx. \quad (19)$$
Substituting x = tz,
$$(19) = \frac{e^{-\alpha^2 t/2}\, t}{\sqrt{2\pi t}}\int_0^\infty \mu(tz)\, e^{-\frac{tz^2}{2}} e^{\alpha t z}\int_0^\infty e^{-\frac{y^2}{2t}} e^{-\alpha y}\big(e^{zy} - e^{-zy}\big)\, dy\, dz. \quad (20)$$
For convenience, we will write x instead of z in (20) in later parts. From the above equations, the natural construction of ν_t comes from the terms constituting the outer integral. Indeed, we will use ν_t(dx) ∝ µ(tx) e^{−tx²/2} e^{αtx} dx in Section 4.2; in Section 4.1, (19) will be used with some modification.

From (4) and (18), we can consider the conditional density
$$P_\mu(X_t \in dy \mid \tau > t) = \frac{P_\mu(X_t \in dy, \tau > t)}{P_\mu(\tau > t)} = \frac{\int_0^\infty \mu(x) f(t, x, y)\, dx}{\int_0^\infty\int_0^\infty \mu(x) f(t, x, y)\, dy\, dx}. \quad (21)$$
When t is fixed, this is clearly a probability density, which we will call µ_t(y). In order to prove convergence of the probability distributions, we will use the following version of Scheffé's lemma [23, p. 55]:

Lemma 3.1. Suppose that f_n, f are probability densities on ℝ_+ satisfying liminf f_n ≥ f a.e. Then ∫_A f_n(x) dx → ∫_A f(x) dx for every measurable A.

Proof. Let dm_n = f_n dx and dm_∞ = f dx. By Fatou's lemma, for every A,
$$\liminf m_n(A) \ge m_\infty(A). \quad (22)$$
Now 1 − limsup m_n(A) = liminf(1 − m_n(A)) = liminf m_n(A^c). Thus, by (22) applied to A^c,
$$1 - \limsup m_n(A) = \liminf m_n(A^c) \ge m_\infty(A^c) = 1 - m_\infty(A).$$
In other words, limsup m_n(A) ≤ m_∞(A), and the statement follows.
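The master formula (17) can be sanity-checked numerically. The sketch below (parameter values are illustrative, chosen by us) integrates f(t, x, y) over y for a point mass µ = δ_x and compares the result with the classical first-passage expression for BM with drift −α absorbed at 0, P_x(τ > t) = Φ((x − αt)/√t) − e^{2αx} Φ((−x − αt)/√t), where Φ is the standard normal CDF:

```python
import math
import numpy as np

def Phi(z):
    # standard normal CDF via erfc, which stays accurate far in the tails
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def f(t, x, y, alpha):
    """Sub-probability density of (X_t in dy, tau > t) started at x, as in (17)."""
    return (np.exp(alpha * x - alpha**2 * t / 2 - alpha * y)
            / np.sqrt(2 * np.pi * t)
            * (np.exp(-(x - y)**2 / (2 * t)) - np.exp(-(x + y)**2 / (2 * t))))

alpha, t, x = 0.5, 4.0, 3.0
y = np.linspace(0.0, 60.0, 400001)
survival_from_formula = np.trapz(f(t, x, y, alpha), y)

# classical first-passage formula (reflection principle + Girsanov)
survival_closed = Phi((x - alpha * t) / math.sqrt(t)) \
    - math.exp(2 * alpha * x) * Phi((-x - alpha * t) / math.sqrt(t))

assert abs(survival_from_formula - survival_closed) < 1e-7
```

The agreement confirms that (17), and hence (4) and (18), is consistent with the standard absorption probability for this process.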
Throughout this section we assume that (8) holds. Define
$$f(t, x, y) = y e^{-\alpha y} e^{-\frac{y^2}{2t}}\,\frac{\sinh(xy)}{xy}, \qquad h(t, x) = \int_0^\infty f(t, x, y)\, dy,$$
and let
$$h(x) = \lim_{t\to\infty} h(t, x) = \int_0^\infty y e^{-\alpha y}\,\frac{\sinh(xy)}{xy}\, dy.$$
Note that h(x) is increasing,
$$h(0) := \lim_{x\searrow 0} h(x) = \int_0^\infty y e^{-\alpha y}\, dy = \frac{1}{\alpha^2},$$
and h(x) = ∞ if and only if x ≥ α. For every t, we define two measures on [0, ∞):
$$d\gamma(x) = x e^{\alpha x}\, d\mu(x), \qquad d\nu_t(x) = e^{-\frac{x^2}{2t}}\, d\gamma(x). \quad (23)$$
By assumption, there exists a function δ(x) → 0 as x → ∞ such that γ([0, x]) ≤ e^{δ(x)x}; without loss of generality, we may also assume δ is decreasing. Observe that
$$P_\mu(X_t \in dy \mid \tau > t) = \frac{\int f(t, x/t, y)\, d\nu_t(x)}{\int h(t, x/t)\, d\nu_t(x)}. \quad (24)$$
We will now prove the theorem through an application of Lemma 3.1, where
$$f_t(y) = \frac{\int f(t, x/t, y)\, d\nu_t(x)}{\int h(t, x/t)\, d\nu_t(x)} \quad\text{and}\quad f(y) = \alpha^2 y e^{-\alpha y}.$$

Proof of Theorem 2.3.
Let ε ∈ (0, 1) and let η_t = εαt. We begin by analyzing the behavior of the denominator on the right-hand side of (24). Observe that for each fixed x < α, h(t, x) increases as t → ∞ to h(x), and the convergence is uniform on compact subsets of [0, α). From this it follows that
$$\limsup_{t\to\infty}\frac{\int_{[0,\eta_t]} h(t, x/t)\, d\nu_t(x)}{\nu_t([0,\eta_t])} \le h(\varepsilon\alpha). \quad (25)$$
We turn to the evaluation of the integral over [η_t, 0.9αt]. Since here x/t ≤ 0.9α < α, h(t, x/t) is uniformly bounded by a constant depending only on α. Below, C denotes a positive constant depending only on α and ε, whose value may change from line to line. Integrating by parts,
$$\int_{[\eta_t,\,0.9\alpha t]} h(t, x/t)\, d\nu_t(x) \le \frac{C}{t}\int_{[\eta_t,\,0.9\alpha t]} x\, e^{-\frac{x^2}{2t}}\,\gamma([\eta_t, x])\, dx.$$
Changing variables to z = x/√t, the last expression becomes
$$C\int_{\sqrt{t}\,\alpha[\varepsilon,\,0.9]} z\, e^{-\frac{z^2}{2}}\,\gamma([\eta_t, \sqrt{t}z])\, dz.$$
Now
$$\gamma([\eta_t, \sqrt{t}z]) \le \gamma([0, \sqrt{t}z]) \le \gamma([0,\eta_t])\, e^{\delta(\eta_t)\sqrt{t}z}.$$
Putting this back into the integral gives an upper bound of the form
$$C\,\gamma([0,\eta_t])\int_{\sqrt{t}\,\alpha[\varepsilon,\,0.9]} z\, e^{-\frac{z^2}{2}}\, e^{\delta(\eta_t)\sqrt{t}z}\, dz.$$
Since δ(η_t) → 0 as t → ∞, for all t large enough we have
$$\delta(\eta_t) \le \frac{\alpha\varepsilon}{4}. \quad (26)$$
To obtain an upper bound on the integral, observe that, as a function of z,
$$-\frac{z^2}{2} + \delta(\eta_t)\sqrt{t}\,z = -\frac{z}{2}\big(z - 2\delta(\eta_t)\sqrt{t}\big)$$
is decreasing on [2δ(η_t)√t, ∞), and by (26), if z ≥ η_t/√t = εα√t then z ≥ 2δ(η_t)√t. Therefore, on the range of integration,
$$-\frac{z^2}{2} + \delta(\eta_t)\sqrt{t}\,z \le -\frac{(\eta_t/\sqrt{t})^2}{2} + \delta(\eta_t)\sqrt{t}\,\frac{\eta_t}{\sqrt{t}} = -\frac{(\alpha\varepsilon)^2}{2}t + \delta(\eta_t)\varepsilon\alpha t \le -\frac{(\alpha\varepsilon)^2}{4}t.$$
Hence
$$\int_{[\eta_t,\,0.9\alpha t]} h(t, x/t)\, d\nu_t(x) \le C\, t\, e^{-\frac{(\alpha\varepsilon)^2}{4}t}\,\gamma([0,\eta_t]) \le C\, t\, e^{\big(-\frac{(\alpha\varepsilon)^2}{4} + \delta(\eta_t)\varepsilon\alpha\big)t} \to 0. \quad (28)$$
It remains to consider the integral over [0.9αt, ∞). Observe that
$$h(t, x) \le \frac{\sqrt{2\pi t}}{2x}\, E\big[e^{(x-\alpha)\sqrt{t}Z}\big],$$
where Z is a standard Gaussian, and therefore
$$h\Big(t, \frac{x}{t}\Big) \le \frac{t\sqrt{2\pi t}}{2x}\, e^{\frac{x^2}{2t}}\, e^{\frac{\alpha^2 t}{2}}\, e^{-\alpha x}.$$
Hence
$$\int_{[0.9\alpha t,\,\infty)} h(t, x/t)\, d\nu_t(x) \le \frac{t\sqrt{2\pi t}}{2}\, e^{\frac{\alpha^2 t}{2}}\int_{[0.9\alpha t,\,\infty)} d\mu(x).$$
But by (8), µ([0.9αt, ∞)) ≤ e^{−0.9α²t(1+o(1))}, and since 0.9α² > α²/2,
$$\int_{[0.9\alpha t,\,\infty)} h(t, x/t)\, d\nu_t(x) \to 0. \quad (29)$$
Since liminf_{t→∞} ν_t([0, η_t]) > 0, it follows from (25), (28) and (29) that
$$\limsup_{t\to\infty}\frac{\int h(t, x/t)\, d\nu_t(x)}{\nu_t([0,\eta_t])} \le h(\varepsilon\alpha). \quad (30)$$
Repeating the argument that gave (25), mutatis mutandis, we obtain
$$\liminf_{t\to\infty}\frac{\int_{[0,\eta_t]} f(t, x/t, y)\, d\nu_t(x)}{\nu_t([0,\eta_t])} \ge y e^{-\alpha y}\inf_{x\le\varepsilon\alpha}\frac{\sinh(xy)}{xy} = y e^{-\alpha y}. \quad (31)$$
It therefore follows from (30) and (31) that
$$\liminf_{t\to\infty}\frac{\int f(t, x/t, y)\, d\nu_t(x)}{\int h(t, x/t)\, d\nu_t(x)} \ge \frac{y e^{-\alpha y}}{h(\varepsilon\alpha)},$$
and this holds for every ε ∈ (0, 1). Since lim_{ε→0} h(εα) = ∫₀^∞ y e^{−αy} dy, we obtain
$$\liminf_{t\to\infty}\frac{\int f(t, x/t, y)\, d\nu_t(x)}{\int h(t, x/t)\, d\nu_t(x)} \ge \frac{y e^{-\alpha y}}{\int_0^\infty y e^{-\alpha y}\, dy} = \alpha^2 y e^{-\alpha y},$$
and the result follows from Lemma 3.1.

Throughout this section, we assume that (9) holds. We first split (19) into three parts:
$$P_\mu(\tau > t) = \frac{e^{-\alpha^2 t/2}}{\sqrt{2\pi t}}\Bigg(\underbrace{\int_0^M e^{-\frac{x^2}{2t}} e^{\alpha x}\, h\Big(t,\frac{x}{t}\Big) d\mu(x)}_{=J_1(t)} + \underbrace{\int_M^{st} e^{-\frac{x^2}{2t}} e^{\alpha x}\, h\Big(t,\frac{x}{t}\Big) d\mu(x)}_{=J_2(t)} + \underbrace{\int_{st}^\infty e^{-\frac{x^2}{2t}} e^{\alpha x}\, h\Big(t,\frac{x}{t}\Big) d\mu(x)}_{=J_3(t)}\Bigg), \quad (32)$$
where here
$$h(t, x) = \int_0^\infty e^{-\frac{y^2}{2t}}\, e^{-\alpha y}\sinh(xy)\, dy.$$
Here, M is chosen such that we have the inequality
$$\frac{e^{-(\rho+\varepsilon)x}}{\rho+\varepsilon} \le \frac{\mu([x,\infty))}{c} \le \frac{e^{-(\rho-\varepsilon)x}}{\rho-\varepsilon} \quad (33)$$
for each x > M and some arbitrary ε >
0 (c is the normalizing constant of µ). Also, we choose s = α − η with η = ρ/4 ∈ (0, α). Finally, since we are only interested in the limiting behavior with respect to t, we write M < st, which is always true for large enough t.

Proposition 4.1. Under assumption (9),
$$P_\mu(\tau > t) \sim \frac{e^{-\alpha^2 t/2}}{\sqrt{2\pi t}}\, J_2(t) \sim c\, e^{-\frac{(2\alpha\rho - \rho^2)t}{2}}\Big(\frac{1}{\rho} - \frac{1}{2\alpha-\rho}\Big), \quad (34)$$
where c is the constant in (33), which only depends on µ.

Proof. We first look at the region for J_2(t). On this interval we have
$$J_2(t) = \int_M^{st} e^{-\frac{x^2}{2t}} e^{\alpha x}\, h\Big(t,\frac{x}{t}\Big)\, d\mu(x) = t\int_{M/t}^{s} e^{-\frac{tx^2}{2}} e^{\alpha t x}\, h(t, x)\, d\mu(tx). \quad (35)$$
Some observations on h(t, x):

1. h(t, x) is bounded on ℝ_+ × [0, s], since s < α.

2. By the dominated convergence theorem, $h(x) = \lim_{t\to\infty} h(t, x) = \frac12\big(\frac{1}{\alpha-x} - \frac{1}{\alpha+x}\big)$. Moreover, h(x) is also bounded on [0, s].

We introduce a new pair of measures (ν_t^+, ν_t^−, t ≥ 0) defined by
$$d\nu_t^{\pm}(x) = e^{-\frac{tx^2}{2}}\, e^{\alpha t x}\, e^{-(\rho\mp\varepsilon)tx}\, dx = \sqrt{\frac{2\pi}{t}}\, e^{\frac{(\alpha-\rho\pm\varepsilon)^2 t}{2}}\cdot\sqrt{\frac{t}{2\pi}}\, e^{-\frac{t(x-(\alpha-\rho\pm\varepsilon))^2}{2}}\, dx. \quad (36)$$
In both cases, notice that the latter factor is a Gaussian density with mean α − ρ ± ε and variance 1/t; therefore we have the following convergence of measures:
$$\nu_t^{\pm} \rightharpoonup \sqrt{\frac{2\pi}{t}}\, e^{\frac{(\alpha-\rho\pm\varepsilon)^2 t}{2}}\,\delta_{\alpha-\rho\pm\varepsilon}. \quad (37)$$
Bounding dµ(tx) on [M/t, s] via (33) and using the boundedness and convergence of h, we obtain
$$\limsup_{t\to\infty}\frac{J_2(t)}{c\sqrt{2\pi t}\, e^{\frac{(\alpha-\rho+\varepsilon)^2 t}{2}}} \le \frac12\Big(\frac{1}{\rho-\varepsilon} - \frac{1}{2\alpha-\rho+\varepsilon}\Big), \quad (38)$$
$$\liminf_{t\to\infty}\frac{J_2(t)}{c\sqrt{2\pi t}\, e^{\frac{(\alpha-\rho-\varepsilon)^2 t}{2}}} \ge \frac12\Big(\frac{1}{\rho+\varepsilon} - \frac{1}{2\alpha-\rho-\varepsilon}\Big), \quad (39)$$
and since ε is arbitrary, we conclude that
$$J_2(t) \sim c\sqrt{2\pi t}\, e^{\frac{(\alpha-\rho)^2 t}{2}}\Big(\frac{1}{\rho} - \frac{1}{2\alpha-\rho}\Big). \quad (40)$$
For the interval x ∈ (st, ∞) we first derive a bound for h(t, x/t). We start from the obvious:
$$h\Big(t,\frac{x}{t}\Big) \le \int_0^\infty \exp\Big(-\frac{y^2}{2t} - \alpha y + \frac{xy}{t}\Big)\, dy. \quad (41)$$
We can rewrite the exponent as
$$-\frac{y}{2\sqrt{t}}\Big(\frac{y}{\sqrt{t}} + 2\alpha\sqrt{t} - \frac{2x}{\sqrt{t}}\Big) = -\frac{y}{2\sqrt{t}}\Big(\frac{y}{\sqrt{t}} + 2\varphi\Big) = -\frac12(w - \varphi)(w + \varphi), \quad (42)$$
where $\varphi = \alpha\sqrt{t} - \frac{x}{\sqrt{t}}$ and $w = \frac{y}{\sqrt{t}} + \varphi$. Therefore, after the change of variables y → w, we obtain
$$h\Big(t,\frac{x}{t}\Big) \le \sqrt{t}\, e^{\frac{\varphi^2}{2}}\int_\varphi^\infty e^{-\frac{w^2}{2}}\, dw = \sqrt{t}\, e^{\frac{\alpha^2 t}{2}}\, e^{\frac{x^2}{2t}}\, e^{-\alpha x}\, L\Big(\alpha\sqrt{t} - \frac{x}{\sqrt{t}}\Big), \quad (43)$$
where $L(z) = \int_z^\infty e^{-\frac{w^2}{2}}\, dw$. L has some nice properties:

1. L(z) is strictly decreasing and bounded above by √(2π).

2. When z is negative, L(z) < √(2π).

3. When z is positive,
$$L(z) \le \min\Big(\frac{e^{-z^2/2}}{z}, \sqrt{\frac{\pi}{2}}\Big). \quad (44)$$

4. More specifically, if z ≥ 1,
$$L(z) \le e^{-\frac{z^2}{2}}. \quad (45)$$

Using the bound above we get
$$J_3(t) \le \sqrt{t}\int_{st}^\infty e^{\frac{\alpha^2 t}{2}}\, L\Big(\alpha\sqrt{t} - \frac{x}{\sqrt{t}}\Big)\, d\mu(x) \le c\sqrt{2\pi t}\, e^{\frac{\alpha^2 t}{2}}\, e^{-\rho s t} = c\sqrt{2\pi t}\, e^{t\big(\frac{\alpha^2}{2} - \rho(\alpha-\eta)\big)}. \quad (46)$$
We want $J_3(t) = o(J_2(t)) = o\big(\sqrt{t}\, e^{\frac{(\alpha-\rho)^2 t}{2}}\big)$. Indeed, with η = ρ/4,
$$\frac{(\alpha-\rho)^2}{2} - \Big(\frac{\alpha^2}{2} - \rho(\alpha-\eta)\Big) = \frac{\rho^2}{2} - \rho\eta = \frac{\rho^2}{4} > 0. \quad (47)$$
For x ∈ [0, M], we use the fact that for any ε > 0 we can fix t_0 such that for each t > t_0, M/√t < ε. For such t,
$$J_1(t) = \int_0^M e^{-\frac{x^2}{2t}} e^{\alpha x}\int_0^\infty e^{-\frac{y^2}{2t}} e^{-\alpha y}\sinh\Big(\frac{xy}{t}\Big)\, dy\, d\mu(x) \le \sqrt{t}\, e^{\frac{\alpha^2 t}{2}}\int_0^M L\Big(\alpha\sqrt{t} - \frac{x}{\sqrt{t}}\Big)\, d\mu(x). \quad (48)$$
And since L is decreasing,
$$(48) \le \sqrt{t}\, e^{\frac{\alpha^2 t}{2}}\int_0^M L\big(\alpha\sqrt{t} - \varepsilon\big)\, d\mu(x). \quad (49)$$
Finally, using (45) and that µ is a probability measure,
$$(49) \le \sqrt{t}\, e^{\frac{\alpha^2 t}{2}}\, e^{-\frac{(\alpha\sqrt{t}-\varepsilon)^2}{2}} \le \sqrt{t}\, e^{\alpha\varepsilon\sqrt{t} - \frac{\varepsilon^2}{2}} = o\Big(\sqrt{t}\, e^{\frac{(\alpha-\rho)^2 t}{2}}\Big) = o(J_2(t)). \quad (50)$$

We now turn to computing the limiting density:
$$P_\mu(X_t \in dy, \tau > t) = \frac{e^{-\alpha^2 t/2}}{\sqrt{2\pi t}}\int_0^\infty \underbrace{e^{-\frac{x^2}{2t}} e^{\alpha x}\, e^{-\frac{y^2}{2t}} e^{-\alpha y}\sinh\Big(\frac{xy}{t}\Big)}_{=g(x,y,t)}\, d\mu(x) = \frac{e^{-\alpha^2 t/2}}{\sqrt{2\pi t}}\Bigg(\underbrace{\int_0^M g\, d\mu}_{=K_1(t,y)} + \underbrace{\int_M^{st} g\, d\mu}_{=K_2(t,y)} + \underbrace{\int_{st}^\infty g\, d\mu}_{=K_3(t,y)}\Bigg), \quad (51)$$
where M and s are the same as in (32).
Proposition 4.2.
Under assumption (9),
$$P_\mu(X_t \in dy, \tau > t) \sim \frac{e^{-\alpha^2 t/2}}{\sqrt{2\pi t}}\, K_2(t, y) \sim c\, e^{-\frac{(2\alpha\rho-\rho^2)t}{2}}\, e^{-\alpha y}\sinh((\alpha-\rho)y), \quad (52)$$
where c is the constant in (33), which only depends on µ.

Proof. Using the same estimates and the measures (ν_t^+, ν_t^−, t ≥ 0) as before, we see that for each y ∈ ℝ_+,
$$\limsup_{t\to\infty}\frac{K_2(t, y)}{c\sqrt{2\pi t}\, e^{\frac{(\alpha-\rho+\varepsilon)^2 t}{2}}} \le e^{-\alpha y}\sinh((\alpha-\rho+\varepsilon)y), \quad (53)$$
$$\liminf_{t\to\infty}\frac{K_2(t, y)}{c\sqrt{2\pi t}\, e^{\frac{(\alpha-\rho-\varepsilon)^2 t}{2}}} \ge e^{-\alpha y}\sinh((\alpha-\rho-\varepsilon)y), \quad (54)$$
and therefore
$$K_2(t, y) \sim c\sqrt{2\pi t}\, e^{\frac{(\alpha-\rho)^2 t}{2}}\, e^{-\alpha y}\sinh((\alpha-\rho)y). \quad (55)$$
For K_3(t, y) we use the upper bound in (33) to get the estimate
$$K_3(t, y) \le e^{-\frac{y^2}{2t}} e^{-\alpha y}\int_{st}^\infty e^{-\frac{x^2}{2t}}\, e^{(\alpha-\rho+\varepsilon)x}\, e^{\frac{xy}{t}}\, dx = \sqrt{t}\, e^{\frac{(\alpha-\rho+\varepsilon)^2 t}{2}}\, e^{(-\rho+\varepsilon)y}\, L\Big(s\sqrt{t} - \sqrt{t}(\alpha-\rho+\varepsilon) - \frac{y}{\sqrt{t}}\Big). \quad (56)$$
Since s − (α − ρ + ε) = 3ρ/4 − ε > 0, the argument of L above is strictly positive and increasing in t. Therefore, by (45),
$$(56) \le \sqrt{t}\, e^{\frac{(\alpha-\rho)^2 t}{2}}\exp\Big(-\frac{(s-(\alpha-\rho))^2}{2}t + C\varepsilon t\Big)\, e^{(s-\alpha+2\varepsilon)y}, \quad (57)$$
where C depends only on α and ρ. Again, s − (α − ρ) = 3ρ/4 > 0 and ε is arbitrarily small, so the middle factor is exponentially decaying. We conclude that
$$(57) = o\Big(\sqrt{t}\, e^{\frac{(\alpha-\rho)^2 t}{2}}\Big) = o(K_2(t, y)). \quad (58)$$
Finally, for K_1(t, y) we can directly apply the dominated convergence theorem: since sinh(xy/t) → sinh(0) = 0 uniformly for x ∈ [0, M],
$$K_1(t, y) \to 0 = o(K_2(t, y)). \quad (59)$$
We can now prove Theorem 2.4.

Proof of Theorem 2.4.
The fact that $\epsilon$ is arbitrarily small in (37) proves the first part of the theorem. The second part follows from Propositions 4.1 and 4.2 together with Lemma 3.1.
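Theorem 2.4 also lends itself to a direct numerical sanity check. The sketch below is not part of the paper's argument: it assumes illustrative parameters $\alpha = 1$, $\rho = 1/2$, $t = 200$ and the initial distribution $\mu(dx) = \rho e^{-\rho x}\,dx$, integrates the standard absorbed transition density of BM with drift $-\alpha$ against $\mu$, and checks that the law of $X_t$ conditioned on $\{\tau > t\}$ is proportional to $e^{-\alpha y}\sinh((\alpha - \rho)y)$, i.e., the QSD density with $\gamma = \alpha - \rho$:

```python
# Numerical sanity check (illustrative parameters, not from the paper).
# The absorbed kernel factorizes as 2 e^{-alpha^2 t/2}/sqrt(2 pi t)
#   * e^{alpha x - x^2/2t} * e^{-alpha y - y^2/2t} * sinh(xy/t),
# so up to a y-independent constant the conditioned density at time t is
#   f_t(y) ∝ e^{-alpha y - y^2/2t} * int_0^inf e^{-x^2/2t + (alpha-rho)x}
#                                      sinh(xy/t) dx.
import math

alpha, rho, t = 1.0, 0.5, 200.0
gamma = alpha - rho

def log_sinh(z):                        # log(sinh z) for z > 0, overflow-safe
    return z + math.log1p(-math.exp(-2.0 * z)) - math.log(2.0)

def log_density(y, dx=0.05, x_max=400.0):
    # log f_t(y) up to a constant, via a left-endpoint sum carried in log space
    n = int(x_max / dx)
    logs = [-(k * dx) ** 2 / (2 * t) + gamma * (k * dx) + log_sinh(k * dx * y / t)
            for k in range(1, n + 1)]
    m = max(logs)
    s = sum(math.exp(v - m) for v in logs)
    return -alpha * y - y * y / (2 * t) + m + math.log(s)

ys = [0.2 * k for k in range(1, 31)]
# deviation of log f_t(y) - log(e^{-alpha y} sinh(gamma y)) from a constant
diffs = [log_density(y) - (-alpha * y + log_sinh(gamma * y)) for y in ys]
spread = max(diffs) - min(diffs)
print(spread)                           # tiny: the two densities are proportional
```

The printed spread is the maximal deviation of the log-ratio from a constant over $y \in [0.2, 6]$; exact proportionality would make it zero, and under these assumptions it is at quadrature-error level.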
In Section 5.1 we will prove Lemma 2.5, showing that adequate scaling is necessary to obtain a non-trivial quasi-limiting behavior. In Section 5.2 we obtain the scaling by estimating the tail distribution of the surviving process, Proposition 5.2. In Section 5.3 we will use this to prove Theorem 2.8. Finally, in Section 5.4 we present a number of concrete applications of Theorem 2.8.

5.1 On the tail of the initial distribution and the tail of the survival time
Proof of Lemma 2.5.
Pick $b > 0$ large enough that $1 - e^{-2\alpha b} \ge \frac{1}{2}$. Then by Proposition 2.1 we have (using $e^{xy} - e^{-xy} \ge \frac{1}{2} e^{xy}$ for $x \ge \alpha$, $y \ge b$):
\[
\begin{aligned}
P_\mu(\tau > t) &\ge P_\mu(X_0 > \alpha t,\, X_t > b,\, \tau > t)\\
&= \frac{e^{-\alpha^2 t/2}\, t}{\sqrt{\pi t}} \int_\alpha^\infty \mu(tx)\, e^{-t x^2/2} e^{\alpha t x} \int_b^\infty e^{-y^2/2t} e^{-\alpha y} \big( e^{xy} - e^{-xy} \big)\, dy\, dx\\
&\ge \frac{e^{-\alpha^2 t/2}\, t}{2\sqrt{\pi t}} \int_\alpha^\infty \mu(tx)\, e^{-t x^2/2} e^{\alpha t x} \int_b^\infty e^{-y^2/2t} e^{-\alpha y} e^{xy}\, dy\, dx\\
&= \frac{t}{2\sqrt{\pi}} \int_\alpha^\infty \mu(tx)\, L\Big( \frac{b}{\sqrt{t}} + \sqrt{t}(\alpha - x) \Big)\, dx \;\ge\; \frac{L(b)}{2\sqrt{\pi}}\, \mu([t\alpha, \infty)) \qquad (t \ge 1),
\end{aligned}
\]
since $L$ is decreasing and its argument above is at most $b$ for $x \ge \alpha$ and $t \ge 1$. This implies that $P_\mu(\tau > t)$ is at least as heavy as the tail distribution of $\mu$. By Proposition 1.5, any initial distribution $\mu$ with a heavier-than-exponential tail distribution cannot converge to a QSD.

The method we develop here works for a large class of distributions $\mu$, yet both the scaling and the limiting distribution may depend on the choice of $\mu$. Recall that we work under Assumption 2.6. We can write the density of $\mu$ as follows. If $\beta > 0$,
\[
\mu(x) = F'(x) \exp(-F(x)) = F'(x)\, \mu([x, \infty)). \qquad (60)
\]
Note that by [3, Proposition 1.8.1], $F'(x)$ is smoothly varying with index $\beta - 1$; in particular $\mu$ has a continuous density, which we also denote by $\mu$. Now,
\[
\begin{aligned}
P_\mu(X_t > a_t, \tau > t) &= \frac{e^{-\alpha^2 t/2}}{\sqrt{\pi t}} \int_0^\infty e^{-x^2/2t} e^{\alpha x} \int_{a_t}^\infty e^{-y^2/2t} e^{-\alpha y} \big( e^{xy/t} - e^{-xy/t} \big)\, dy\, d\mu(x)\\
&= \frac{e^{-\alpha^2 t/2}\, t}{\sqrt{\pi t}} \int_0^\infty \mu(tx)\, e^{-t x^2/2} e^{\alpha t x} \int_{a_t}^\infty e^{-y^2/2t} e^{-\alpha y} \big( e^{xy} - e^{-xy} \big)\, dy\, dx\\
&= \frac{t}{\sqrt{\pi}} \Bigg( \underbrace{\int_0^\infty \mu(tx)\, L\Big( \frac{a_t}{\sqrt{t}} + \sqrt{t}\alpha - \sqrt{t}x \Big)\, dx}_{=\,J_1(t)} - \underbrace{\int_0^\infty \mu(tx)\, e^{2\alpha t x}\, L\Big( \frac{a_t}{\sqrt{t}} + \sqrt{t}\alpha + \sqrt{t}x \Big)\, dx}_{=\,J_2(t)} \Bigg). \qquad (61)
\end{aligned}
\]
We first notice that in the second term $J_2$,
\[
e^{2\alpha t x}\, L\Big( \frac{a_t}{\sqrt{t}} + \sqrt{t}\alpha + \sqrt{t}x \Big) \le \frac{e^{2\alpha t x}\, e^{-a_t^2/2t - t(\alpha + x)^2/2 - a_t(\alpha + x)}}{a_t/\sqrt{t} + \sqrt{t}\alpha + \sqrt{t}x} = \frac{e^{-t(\alpha - x)^2/2}\, e^{-a_t^2/(2t)}\, e^{-a_t(\alpha + x)}}{a_t/\sqrt{t} + \sqrt{t}\alpha + \sqrt{t}x}. \qquad (62)
\]
If $a_t \gg \epsilon\sqrt{t}$, then the factor $e^{-a_t^2/(2t)}$ makes $J_2$ decay faster (in the exponential sense) than $\mu(tx)$.
In fact, unless $a_t = o(\sqrt{t})$ and $x \in (\alpha - t^{-1/2}\epsilon,\ \alpha + t^{-1/2}\epsilon)$, $J_2$ decays exponentially faster than $\mu(tx)$. Furthermore, we define $J_{1,A}(t), J_{2,A}(t)$ to be the corresponding integrals over a sub-interval $A$ of $\mathbb{R}_+$ instead of the entire $\mathbb{R}_+$:
\[
J_{1,A}(t) = \int_A \underbrace{\mu(tx)\, L\Big( \frac{a_t}{\sqrt{t}} + \sqrt{t}\alpha - \sqrt{t}x \Big)}_{=\,f(t,x)}\, dx, \qquad J_{2,A}(t) = \int_A \mu(tx)\, e^{2\alpha t x}\, L\Big( \frac{a_t}{\sqrt{t}} + \sqrt{t}\alpha + \sqrt{t}x \Big)\, dx. \qquad (63)
\]
Since $P_\mu(X_0 \in tA,\, X_t > a_t,\, \tau > t) \ge 0$, we have $J_{2,A}(t) = O(J_{1,A}(t))$ on the same sub-interval $A \subset \mathbb{R}_+$. For the first term $J_1$, we split the integration:
\[
J_1(t) = \underbrace{\int_0^{\alpha + a_t/t - \eta_t} f(t,x)\, dx}_{=\,J_{1,1}(t)} + \underbrace{\int_{\alpha + a_t/t - \eta_t}^{\alpha + a_t/t + \epsilon_t} f(t,x)\, dx}_{=\,J_{1,2}(t)} + \underbrace{\int_{\alpha + a_t/t + \epsilon_t}^{\infty} f(t,x)\, dx}_{=\,J_{1,3}(t)}, \qquad (64)
\]
where $\eta_t, \epsilon_t$ are to be picked depending on $\mu$. The goal now is to get an accurate asymptotic on the survival rate.

Proposition 5.1.
Suppose $\mu$ satisfies Assumption 2.6. Then for any $\eta_t \gg t^{\beta - 1}$,
\[
\log J_{1,1}(t) \ll \log \mu([t\alpha + a_t, \infty)). \qquad (65)
\]
Proof.
Suppose first that $\eta_t \gg t^{(\beta - 1)/2}$. Then we have the estimate
\[
J_{1,1}(t) \le L(\sqrt{t}\,\eta_t) \int_0^{\alpha + a_t/t - \eta_t} \mu(tx)\, dx \le L(\sqrt{t}\,\eta_t) \le \exp\Big( -\frac{t\eta_t^2}{2} \Big) \ll \exp(-F(t\alpha + a_t)) = \mu([t\alpha + a_t, \infty)), \qquad (66)
\]
since $t\eta_t^2 \gg t^\beta$ while $F$ is regularly varying with index $\beta$.

Now suppose $t^{(\beta-1)/2} \gg \eta_t \gg t^{\beta - 1}$. Pick $\eta_t^1 = t^{r_1}$ with $t^{(\beta-1)/2} \ll \eta_t^1$, so that by (66),
\[
\underbrace{\int_0^{\alpha + a_t/t - \eta_t^1} \mu(tx)\, L\Big( \sqrt{t}\alpha + \frac{a_t}{\sqrt{t}} - \sqrt{t}x \Big)\, dx}_{=\,J_{1,1,1}(t)} \ll \mu([t\alpha + a_t, \infty)).
\]
Now we want to pick $\eta_t^2 = t^{r_2} \ll \eta_t^1$ such that
\[
\underbrace{\int_{\alpha + a_t/t - \eta_t^1}^{\alpha + a_t/t - \eta_t^2} \mu(tx)\, L\Big( \sqrt{t}\alpha + \frac{a_t}{\sqrt{t}} - \sqrt{t}x \Big)\, dx}_{=\,J_{1,1,2}(t)} \ll \mu([t\alpha + a_t, \infty)).
\]
Using integration by parts,
\[
\begin{aligned}
J_{1,1,2}(t) &= -\frac{1}{t}\, \mu([tx, \infty))\, L\Big( \sqrt{t}\alpha + \frac{a_t}{\sqrt{t}} - \sqrt{t}x \Big) \Bigg|_{\alpha + a_t/t - \eta_t^1}^{\alpha + a_t/t - \eta_t^2} + \frac{1}{\sqrt{t}} \int_{\alpha + a_t/t - \eta_t^1}^{\alpha + a_t/t - \eta_t^2} \exp\Big( -\frac{t(x - (\alpha + a_t/t))^2}{2} \Big)\, \mu([tx, \infty))\, dx\\
&\le \frac{1}{t}\, \mu([t\alpha + a_t - t\eta_t^1, \infty))\, L(\sqrt{t}\,\eta_t^1) + \frac{\eta_t^1}{\sqrt{t}}\, \mu([t\alpha + a_t - t\eta_t^1, \infty))\, e^{-t(\eta_t^2)^2/2}. \qquad (67)
\end{aligned}
\]
Since both $\mu([x, \infty))$ and $L(x)$ are decreasing functions, the driving term of (67) is the last one. And since $\mu([x, \infty)) = \exp(-F(x))$, where $F$ is an increasing regularly varying function with index $\beta$,
\[
\frac{\eta_t^1}{\sqrt{t}}\, \mu([t\alpha + a_t - t\eta_t^1, \infty))\, e^{-t(\eta_t^2)^2/2} \le \mu([t\alpha + a_t, \infty)) \exp\big( F(t\alpha + a_t) - F(t\alpha + a_t - t^{1 + r_1}) \big)\, e^{-t^{1 + 2r_2}/2} \sim \mu([t\alpha + a_t, \infty)) \exp\big( O(t^{\beta + r_1}) - t^{1 + 2r_2}/2 \big). \qquad (68)
\]
If $\beta + r_1 < 1 + 2r_2$ we get the desired asymptotic; that is, we need $r_2 > \frac{(\beta - 1) + r_1}{2}$. Starting from $r_1$ close to $\frac{\beta - 1}{2}$ we can pick $\eta_t^2 \gg t^{3(\beta - 1)/4}$ to get
\[
\int_0^{\alpha + a_t/t - \eta_t^2} \mu(tx)\, L\Big( \sqrt{t}\alpha + \frac{a_t}{\sqrt{t}} - \sqrt{t}x \Big)\, dx = J_{1,1,1}(t) + J_{1,1,2}(t) \ll \mu([t\alpha + a_t, \infty)). \qquad (69)
\]
Recursively, we can pick $\eta_t^n \gg t^{(\beta - 1)(1 - (1/2)^n)}$ such that
\[
J_{1,1,n}(t) = \int_{\alpha + a_t/t - \eta_t^{n-1}}^{\alpha + a_t/t - \eta_t^n} \mu(tx)\, L\Big( \sqrt{t}\alpha + \frac{a_t}{\sqrt{t}} - \sqrt{t}x \Big)\, dx \ll \mu([t\alpha + a_t, \infty)).
\]
Since $(\beta - 1)(1 - (1/2)^n) \downarrow \beta - 1$, for sufficiently large $n$ we have $\eta_t = \eta_t^n \gg t^{(\beta - 1)(1 - (1/2)^n)} \gg t^{\beta - 1}$ and
\[
J_{1,1}(t) = \sum_{i=1}^n J_{1,1,i}(t) \ll \mu([t\alpha + a_t, \infty)),
\]
which completes the proof.

Proposition 5.2.
Suppose $\mu$ satisfies Assumption 2.6 and $\beta > 0$. If $a_t \gg \sqrt{t}$, then
\[
P_\mu(X_t > a_t, \tau > t) \sim \frac{t}{\sqrt{\pi}}\, J_{1,3}(t) \sim \mu([t\alpha + a_t, \infty)). \qquad (70)
\]
Proof.
Pick $\eta_t$ and $\epsilon_t$ as follows:
\[
\eta_t \gg t^{(\beta - 1)/2}, \qquad \epsilon_t = t^{-b}, \quad \beta < b < \frac{1}{2}, \qquad (71)
\]
chosen so that
\[
\eta_t \to 0, \qquad \eta_t \ll a_t/t, \qquad \epsilon_t \ll a_t/t, \qquad \sqrt{t}\,\epsilon_t \to \infty, \qquad F'(t\alpha + a_t)\,\epsilon_t \ll 1/t. \qquad (72)
\]
For $J_{1,2}(t)$, we first observe that the interval $(\alpha + a_t/t - \eta_t,\ \alpha + a_t/t + \epsilon_t)$ closes in on $\alpha + a_t/t$. Moreover, while $L$ does vary between $0$ and $\sqrt{\pi}$ there, $\mu$ does not vary much from $\mu(t(\alpha + a_t/t))$ inside the interval, and therefore we can use the intermediate value theorem. Splitting the integration, we get the following bound for $J_{1,2}(t)$:
\[
\begin{aligned}
J_{1,2}(t) &\sim \mu(t(\alpha + a_t/t)) \times \Bigg( \int_{\alpha + a_t/t - \eta_t}^{\alpha + a_t/t} L\Big( \frac{a_t}{\sqrt{t}} + \sqrt{t}\alpha - \sqrt{t}x \Big)\, dx + \int_{\alpha + a_t/t}^{\alpha + a_t/t + \epsilon_t} L\Big( \frac{a_t}{\sqrt{t}} + \sqrt{t}\alpha - \sqrt{t}x \Big)\, dx \Bigg)\\
&\le \mu(t\alpha + a_t) \Bigg( \frac{1}{\sqrt{t}} \int_0^{\sqrt{t}\eta_t} L(y)\, dy + \int_0^{\epsilon_t} \sqrt{\pi}\, dx \Bigg) \le \mu(t\alpha + a_t) \Bigg( \frac{1}{\sqrt{t}} \int_0^\infty L(y)\, dy + \sqrt{\pi}\,\epsilon_t \Bigg) \sim \sqrt{\pi}\, \mu(t\alpha + a_t)\, \epsilon_t.
\end{aligned}
\]
Note that the first integral is essentially the expected value of a half-normal distribution, and the second integral is estimated using the fact that $L$ is bounded above.

To estimate $J_{1,3}(t)$: since $\sqrt{t}\,\epsilon_t \to \infty$, the argument of $L$ tends to $-\infty$ uniformly on the domain of integration, so there $L \to \sqrt{\pi}$ and we can use the intermediate value theorem to get the sharp estimate
\[
J_{1,3}(t) \sim \sqrt{\pi} \int_{\alpha + a_t/t + \epsilon_t}^\infty \mu(tx)\, dx \sim \frac{\sqrt{\pi}}{t}\, \mu([t\alpha + a_t, \infty)). \qquad (73)
\]
Proposition 5.1 shows that $J_{1,1}(t) = o(J_{1,3}(t))$. For $J_{1,2}(t)$, we combine (60) and (72) to get the asymptotic comparison
\[
J_{1,2}(t) \le \sqrt{\pi}\, \mu(t\alpha + a_t)\, \epsilon_t = \sqrt{\pi}\, F'(t\alpha + a_t)\, \mu([t\alpha + a_t, \infty))\, \epsilon_t \ll \frac{\sqrt{\pi}}{t}\, \mu([t\alpha + a_t, \infty)) \sim J_{1,3}(t). \qquad (74)
\]
Finally, from the choice of $\epsilon_t$ we have $b < \frac{1}{2}$, so $t\epsilon_t^2 \to \infty$. On the first two sub-intervals of (64), $J_{2,A}(t) = O(J_{1,A}(t))$, while on the last one, by (62),
\[
J_{2,(\alpha + a_t/t + \epsilon_t, \infty)}(t) \le \int_{\alpha + a_t/t + \epsilon_t}^\infty \mu(tx)\, e^{-t(\alpha - x)^2/2}\, e^{-a_t^2/(2t)}\, e^{-a_t(\alpha + x)}\, dx \le \frac{e^{-t\epsilon_t^2/2}}{t}\, \mu([t\alpha + a_t, \infty)) = o(J_{1,3}(t)). \qquad (75)
\]
We conclude that
\[
P_\mu(X_t > a_t, \tau > t) \sim (1 + o(1))\, \frac{t}{\sqrt{\pi}}\, J_{1,3}(t) \sim \mu([t\alpha + a_t, \infty)). \qquad (76)
\]
We can extend this proposition to the case where $F$ is slowly varying. In such cases, we expect the tail distribution $\mu([x, \infty))$ itself to be smoothly varying.

Corollary 5.3.
Suppose $\mu([x, \infty)) = G(x)$, where $G$ is a smoothly varying function with index $-\kappa \le 0$. Then
\[
P_\mu(X_t > a_t, \tau > t) \sim \frac{t}{\sqrt{\pi}}\, J_{1,3}(t) \sim \mu([t\alpha + a_t, \infty)).
\]
Proof. It suffices to show that $J_{1,2}(t) = o(J_{1,3}(t))$. The smooth variation condition yields the relation [3, 1.8.1']
\[
t\, \mu(t\alpha + a_t) = O\big( \mu([t\alpha + a_t, \infty)) \big). \qquad (77)
\]
Since $\epsilon_t \to 0$,
\[
J_{1,2}(t) \le \sqrt{\pi}\, \mu(t\alpha + a_t)\, \epsilon_t = o\Big( \frac{1}{t}\, \mu([t\alpha + a_t, \infty)) \Big) = o(J_{1,3}(t)), \qquad (78)
\]
so we have the desired asymptotic.

5.3 Proof of Theorem 2.8 (Heavy-tailed initial distributions)

Proposition 5.2 and Corollary 5.3 show why the second part of Assumption 2.6 is necessary. We need the right $a_t$ that will yield a nontrivial result for the limit
\[
\lim_{t\to\infty} P_\mu(X_t > a_t \mid \tau > t) = \lim_{t\to\infty} \frac{P_\mu(X_t > a_t, \tau > t)}{P_\mu(\tau > t)}. \qquad (79)
\]
Due to Proposition 5.2 this boils down to comparing $\mu([t\alpha, \infty))$ and $\mu([t\alpha + a_t, \infty))$.

Proof of Theorem 2.8. If $\mu$ satisfies Assumption 2.6, setting $a_t = R(t,c)$ gives the following:
\[
\mu([t\alpha + a_t, \infty)) = \exp(-F(t\alpha + R(t,c))) \sim \exp(-(F(t\alpha) + c)) = e^{-c}\, \mu([t\alpha, \infty)). \qquad (80)
\]
We make a few comments on the observation (13).
If $F$ is smooth enough, it has the Taylor expansion
\[
F(t\alpha + R(t,c)) = F(t\alpha) + F'(t\alpha)\, R(t,c) + o(F'(t)),
\]
and therefore, by choosing $R(t,c) = \frac{c}{F'(t\alpha)}$, we get $F(t\alpha + R(t,c)) - F(t\alpha) = c + o(F'(t))$. Since $F'$ has index $\beta - 1 < 0$, $F'(t) = o(1)$, so condition (12) is satisfied. We further observe that with the choice $R(t,c) = \frac{c}{F'(t\alpha)}$,
\[
F'(t\alpha + R(t,c)) = F'(t\alpha) + F''(t\alpha)\, R(t,c) + o(F''(t)) = F'(t\alpha) + \frac{c\, F''(t\alpha)}{F'(t\alpha)} + o(F''(t)) = F'(t\alpha) + o(F'(t\alpha)). \qquad (81)
\]
Therefore we get $F'(t\alpha + R(t,c)) \sim F'(t\alpha)$, and consequently,
\[
\mu(t\alpha + R(t,c)) = F'(t\alpha + R(t,c)) \exp(-F(t\alpha + R(t,c))) \sim F'(t\alpha) \exp(-(F(t\alpha) + c)) = e^{-c}\, \mu(t\alpha). \qquad (82)
\]
Putting together Proposition 5.2, Corollary 5.3, (80), and (82) completes the proof.

We present some concrete results here.
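As a numerical aside (a minimal sketch with illustrative parameters, not taken from the paper), both the general recipe $R(t,c) = c/F'(t\alpha)$ and its power-tail specialization can be checked directly:

```python
# Two quick checks of the scaling discussed above (all parameters illustrative).
# First: with R(t,c) = c / F'(t alpha), the increment
# F(t alpha + R(t,c)) - F(t alpha) tends to c, which is what drives
# mu([t alpha + R, oo)) ~ e^{-c} mu([t alpha, oo)) in (80).
import math

beta1, alpha1, c1 = 0.4, 2.0, 3.0          # illustrative index, drift, constant
F1 = lambda x: x ** beta1                  # mu([x, oo)) = exp(-F1(x))
Fp1 = lambda x: beta1 * x ** (beta1 - 1.0) # F1'
t = 1e10
R = c1 / Fp1(t * alpha1)
increment = F1(t * alpha1 + R) - F1(t * alpha1)
print(increment)                           # approaches c1 = 3 as t grows

# Second: the power-tail case F(x) = x^beta with a_t = c t^{1-beta} (the case
# treated in the corollaries that follow); the tail ratio tends to
# exp(-beta alpha^{beta-1} c).  Log space avoids underflow of the tails.
beta2, alpha2, c2 = 0.5, 1.0, 2.0
t2 = 1e8
a_t = c2 * t2 ** (1.0 - beta2)
log_ratio = -((t2 * alpha2 + a_t) ** beta2 - (t2 * alpha2) ** beta2)
target = math.exp(-beta2 * alpha2 ** (beta2 - 1.0) * c2)   # = e^{-1} here
print(math.exp(log_ratio), target)
```

Both printed quantities match their predicted limits to a few parts in ten thousand already at these (finite) values of $t$, consistent with the $o(1)$ error terms above.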
Corollary 5.4.
Suppose $\mu([x, \infty)) = e^{-x^\beta}$ with $\beta \in (0, \frac{1}{2})$. Then
\[
\lim_{t\to\infty} P_\mu\Big( \frac{X_t}{t^{1-\beta}} > c \,\Big|\, \tau > t \Big) = \exp(-\beta \alpha^{\beta - 1} c); \qquad (83)
\]
that is, the limiting distribution is exponential with parameter $\beta \alpha^{\beta - 1}$.

Proof. From Proposition 5.2 we get
\[
P_\mu(X_t > a_t, \tau > t) \sim \mu([t\alpha + a_t, \infty)). \qquad (84)
\]
Pick $a_t = c\, t^{1-\beta}$. Then by the generalized binomial theorem,
\[
(t\alpha + a_t)^\beta = (t\alpha)^\beta + c\beta\alpha^{\beta - 1} + o(1).
\]
Note that $F'(t\alpha) = \beta (t\alpha)^{\beta - 1}$. Substituting $\tilde{c} = c\, t^{1-\beta} F'(t\alpha) = c\beta\alpha^{\beta - 1}$, so that $R(t, \tilde{c}) = \tilde{c}/F'(t\alpha) = c\, t^{1-\beta}$, Theorem 2.8 gives us the desired result.

Example 5.5. If $\mu$ is a Weibull distribution with scale parameter $\lambda > 0$ and shape parameter $0 < \beta < \frac{1}{2}$, then the limiting distribution of $X_t/t^{1-\beta}$ given $\{\tau > t\}$ is exponential with rate $\frac{\beta}{\lambda} \big( \frac{\alpha}{\lambda} \big)^{\beta - 1}$.

Corollary 5.6.
Suppose $\mu([x, \infty)) = G(x)$, where $G$ is a smoothly varying function with index $-\kappa < 0$. Then
\[
\lim_{t\to\infty} P_\mu\Big( \frac{X_t}{t} > c \,\Big|\, \tau > t \Big) = \Big( \frac{\alpha + c}{\alpha} \Big)^{-\kappa}; \qquad (85)
\]
that is, the limiting distribution is the Lomax (shifted Pareto) distribution with shape parameter $\kappa$ and scale parameter $\alpha$.

Proof. Since $G(x) = \exp(\log G(x))$ and $-\log G(x)$ is a slowly varying function ($\beta = 0$), the natural choice for $R(t,c)$ is $a_t = R(t,c) = tc$. Indeed, by the uniform convergence theorem for regularly varying functions [3, Theorem 1.5.2],
\[
\lim_{t\to\infty} \frac{G(t\alpha + tc)}{G(t)} = (\alpha + c)^{-\kappa}. \qquad (86)
\]
Therefore we have
\[
\frac{P_\mu(X_t > tc, \tau > t)}{P_\mu(\tau > t)} \sim \frac{(\alpha + c)^{-\kappa}\, G(t)}{\alpha^{-\kappa}\, G(t)}, \qquad (87)
\]
which gives us the desired result.

Example 5.7. If $\mu$ is a half-Cauchy distribution (a Cauchy distribution supported on $\mathbb{R}_+$), the limiting distribution of $X_t/t$ given $\{\tau > t\}$ is the Lomax distribution with shape parameter $1$ and scale parameter $\alpha$.

Note that when $\beta = 0$, $\mu$ is a distribution with a regularly or slowly varying tail. In such cases it is often more convenient to work with the asymptotic result $P_\mu(X_t > R(t,c), \tau > t) \sim \mu([t\alpha + R(t,c), \infty))$ directly to find the right scaling factor $R$. We conclude this section by exhibiting the quasi-limiting behavior of a $\mu$ which itself has a slowly varying tail.

Corollary 5.8. Suppose $\mu([x, \infty)) \sim \frac{1}{\ln x}$ as $x \to \infty$. Then
\[
\lim_{t\to\infty} P_\mu\Big( \frac{\ln X_t}{\ln t} > c \,\Big|\, \tau > t \Big) = \begin{cases} 1 & c \le 1\\[2pt] \dfrac{1}{c} & c > 1; \end{cases} \qquad (88)
\]
that is, the limiting distribution is the Pareto distribution with shape parameter $1$ and scale parameter $1$.

Proof. $\mu([x, \infty)) \sim \exp(-\ln\ln x)$, so we can apply Corollary 5.3. Since we have $R(t,c) = t^c$,
\[
\frac{P_\mu(X_t > t^c, \tau > t)}{P_\mu(\tau > t)} \sim \frac{\ln(t\alpha)}{\ln(t\alpha + t^c)} \sim \begin{cases} \dfrac{\ln t + \ln\alpha}{\ln t + \ln\alpha} \to 1 & c < 1\\[4pt] \dfrac{\ln t + \ln\alpha}{\ln t + \ln(\alpha + 1)} \to 1 & c = 1\\[4pt] \dfrac{\ln t + \ln\alpha}{c \ln t} \to \dfrac{1}{c} & c > 1. \end{cases}
\]
Note that the limiting distribution does not depend on the drift $\alpha$ of the BM.

A Appendix
A.1 Proof of Proposition 1.3
Proof.
Suppose $\pi$ is a QLD for $\mu$. Then for every bounded and continuous function $f$ we have $E_\mu[f(X_t) \mid \tau > t] \to \int f\, d\pi$. That is, $E_\mu[f(X_t), \tau > t] \sim P_\mu(\tau > t) \int f\, d\pi$, provided $\int f\, d\pi \ne 0$. Fix such an $f$ and let $t_1, t_2 > 0$. Then by the Markov property,
\[
E_\mu[f(X_{t_1 + t_2}), \tau > t_1 + t_2] = E_\mu[h_{t_2}(X_{t_1}), \tau > t_1], \qquad (89)
\]
where $h_{t_2}(x) = E_x[f(X_{t_2}), \tau > t_2]$. By our assumption $h_{t_2}(\cdot)$ is continuous and bounded, and therefore $E_\mu[h_{t_2}(X_{t_1}) \mid \tau > t_1] \to \int h_{t_2}\, d\pi$. On rewriting (89) we have
\[
E_\mu[f(X_{t_1 + t_2}) \mid \tau > t_1 + t_2] = E_\mu[h_{t_2}(X_{t_1}) \mid \tau > t_1] \times \frac{P_\mu(\tau > t_1)}{P_\mu(\tau > t_1 + t_2)}.
\]
By our assumption, as $t_1 \to \infty$ the lefthand side converges to the positive limit $\int f\, d\pi$ and the first expression on the righthand side converges to $\int h_{t_2}\, d\pi$. Therefore the ratio on the righthand side converges to a nonzero limit we denote by $R(t_2)$. This limit is independent of the choice of $f$. Therefore
\[
\int f\, d\pi = R(t_2) \int h_{t_2}\, d\pi = R(t_2)\, E_\pi[f(X_{t_2}), \tau > t_2] = R(t_2)\, E_\pi[f(X_{t_2}) \mid \tau > t_2]\, P_\pi(\tau > t_2). \qquad (90)
\]
Taking $f \equiv 1$, we obtain $R(t_2) = \frac{1}{P_\pi(\tau > t_2)}$, and plugging this back into (90) proves the claim.

A.2 Proof of Proposition 1.5

Proof.
By the Markov property,
\[
P_\mu(\tau > s + t) = E_\mu\big[ P_{X_t}(\tau > s), \tau > t \big] = E_\mu\big[ P_{X_t}(\tau > s) \,\big|\, \tau > t \big]\, P_\mu(\tau > t). \qquad (91)
\]
Write $f(x) = P_x(\tau > s)$. Since $\pi$ is a QSD, the distribution of $\tau$ under $P_\pi$ is exponential with some parameter $\lambda_\pi > 0$. Since $\pi$ is the QLD of $\mu$, for arbitrary $\epsilon > 0$ there exists $t_0 = t_0(\epsilon, \mu, s)$ such that for each $t > t_0$,
\[
\Big| E_\mu\big[ P_{X_t}(\tau > s) \,\big|\, \tau > t \big] - E_\pi(f) \Big| < \epsilon\, e^{-\lambda_\pi s}. \qquad (92)
\]
Since $E_\pi(f) = P_\pi(\tau > s) = e^{-\lambda_\pi s}$, we have that
\[
E_\mu\big[ P_{X_t}(\tau > s) \,\big|\, \tau > t \big] \le (1 + \epsilon)\, e^{-\lambda_\pi s}, \qquad t > t_0. \qquad (93)
\]
Choose $s = 1$, and apply (93) repeatedly to obtain
\[
\begin{aligned}
P_\mu(\tau > t + 1) &\le (1 + \epsilon)\, e^{-\lambda_\pi}\, P_\mu(\tau > t)\\
P_\mu(\tau > t + 2) &\le (1 + \epsilon)\, e^{-\lambda_\pi}\, P_\mu(\tau > t + 1) \le (1 + \epsilon)^2\, e^{-2\lambda_\pi}\, P_\mu(\tau > t)\\
&\ \,\vdots\\
P_\mu(\tau > t + n) &\le (1 + \epsilon)^n\, e^{-n\lambda_\pi}\, P_\mu(\tau > t). \qquad (94)
\end{aligned}
\]
Since the choice of $\epsilon$ is arbitrary, the result follows.

A.3 QSDs for BM with constant drift
Here we provide a formal derivation of the densities of the QSDs under Assumption 1.6. Recall that a BM with constant drift $-\alpha$ on $\mathbb{R}_+$ absorbed at $0$ is the sub-Markovian process generated by $L_\alpha$, which, for each $u$ satisfying $u \in C^2(\mathbb{R}_+)$ and $u(0) = 0$, acts as
\[
L_\alpha u = \frac{1}{2} u'' - \alpha u'.
\]
The formal adjoint $L_\alpha^*$ of $L_\alpha$, with respect to integration by parts, is given by
\[
L_\alpha^* v = \frac{1}{2} v'' + \alpha v', \qquad v \in C^2(\mathbb{R}_+),\ v(0) = 0.
\]
Observe that for any $f$ in the domain of $L_\alpha$,
\[
\frac{d}{dt}\, E_x[f(X_t), \tau > t] = L_\alpha\, E_x[f(X_t), \tau > t] \ \Rightarrow\ E_x[f(X_t), \tau > t] = f(x) + \int_0^t L_\alpha\, E_x[f(X_s), \tau > s]\, ds. \qquad (95)
\]
Suppose a probability density function $\pi$ satisfies $L_\alpha^* \pi = -\lambda\pi$ for some $\lambda > 0$. Notice that every QSD must be smooth, since if $\pi$ is a QSD then by definition we have the density
\[
\pi(y) = P_\pi(X_s = y \mid \tau > s) = \frac{P_\pi(X_s = y, \tau > s)}{P_\pi(\tau > s)}, \qquad (96)
\]
where $P_\pi(X_s = y, \tau > s)$ denotes the (smooth) density of the absorbed process. Then with integration by parts we have the following:
\[
\begin{aligned}
E_\pi[f(X_t), \tau > t] &= \int f(x)\pi(x)\, dx + \int \int_0^t L_\alpha\big( E_x[f(X_s), \tau > s] \big)\, ds\, \pi(x)\, dx\\
&= \int f(x)\pi(x)\, dx + \int_0^t \int E_x[f(X_s), \tau > s]\, L_\alpha^* \pi(x)\, dx\, ds\\
&= \int f(x)\pi(x)\, dx - \lambda \int_0^t E_\pi[f(X_s), \tau > s]\, ds. \qquad (97)
\end{aligned}
\]
Setting $h(t) = E_\pi[f(X_t), \tau > t]$, (97) gives
\[
h'(t) = -\lambda h(t) \ \Rightarrow\ E_\pi[f(X_t), \tau > t] = e^{-\lambda t} \int f(x)\pi(x)\, dx. \qquad (98)
\]
Therefore, by monotone convergence,
\[
P_\pi(\tau > t) = e^{-\lambda t}, \qquad E_\pi[f(X_t) \mid \tau > t] = \int f(x)\pi(x)\, dx.
\]
That is, $\pi$ is a density of a QSD if and only if $L_\alpha^* \pi = -\lambda\pi$. We can see that a density for a QSD $\pi$ is a solution to a standard ODE and depends on the parameter $\lambda$. The range of $\lambda$ for which such a density exists is $\lambda \in (0, \alpha^2/2]$, and to each such $\lambda$ corresponds a unique density. Fix such a $\lambda$, set $\gamma = \sqrt{\alpha^2 - 2\lambda}$, and let $\pi_\gamma$ be the corresponding density. Then
\[
\pi_\gamma(x) = \begin{cases} \dfrac{\alpha^2 - \gamma^2}{\gamma}\, e^{-\alpha x} \sinh(\gamma x) & \gamma > 0\\[6pt] \alpha^2\, x\, e^{-\alpha x} & \gamma = 0. \end{cases} \qquad (99)
\]

Acknowledgement
The authors would like to express their deep gratitude to an anonymous referee, whose input was invaluable to the presentation of our results and helped correct errors and omissions.
References

[1] M. S. Bartlett, On theoretical models for competitive and predatory biological systems, Biometrika (1957), 27–42. MR 86727

[2] M. S. Bartlett, Stochastic population models in ecology and epidemiology, Methuen's Monographs on Applied Probability and Statistics, Methuen & Co., Ltd., London; John Wiley & Sons, Inc., New York, 1960. MR 0118550

[3] N. H. Bingham, C. M. Goldie, and J. L. Teugels, Regular variation, Cambridge Univ. Press, 2001.

[4] Mioara Buiculescu, On quasi-stationary distributions for multi-type Galton-Watson processes, J. Appl. Probability (1975), 60–68. MR 365734

[5] Patrick Cattiaux, Pierre Collet, Amaury Lambert, Servet Martínez, Sylvie Méléard, and Jaime San Martín, Quasi-stationary distributions and diffusion models in population dynamics, Ann. Probab. (2009), no. 5, 1926–1969. MR 2561437

[6] Nicolas Champagnat and Denis Villemonais, Exponential convergence to quasi-stationary distribution for absorbed one-dimensional diffusions with killing, ALEA Lat. Am. J. Probab. Math. Stat. (2017), no. 1, 177–199. MR 3622466

[7] Pierre Collet, Servet Martínez, and Jaime San Martín, Quasi-stationary distributions: Markov chains, diffusions and dynamical systems, Springer Science & Business Media, 2012.

[8] Pablo A. Ferrari and Nevena Marić, Quasi stationary distributions and Fleming-Viot processes in countable spaces, Electron. J. Probab. (2007), no. 24, 684–702. MR 2318407

[9] Alexandru Hening and Martin Kolb, Quasistationary distributions for one-dimensional diffusions with singular boundary points, Stochastic Process. Appl. (2019), no. 5, 1659–1696. MR 3944780

[10] Martin Kolb and David Steinsaltz, Quasilimiting behavior for one-dimensional diffusions with killing, Ann. Probab. (2012), no. 1, 162–212. MR 2917771

[11] Manuel Lladser and Jaime San Martín, Domain of attraction of the quasi-stationary distributions for the Ornstein-Uhlenbeck process, J. Appl. Probab. (2000), no. 2, 511–520. MR 1781008

[12] Servet Martínez, Pierre Picco, and Jaime San Martín, Domain of attraction of quasi-stationary distributions for the Brownian motion with drift, Adv. in Appl. Probab. (1998), no. 2, 385–408. MR 1642845

[13] Servet Martínez and Jaime San Martín, Quasi-stationary distributions for a Brownian motion with drift and associated limit laws, J. Appl. Probab. (1994), no. 4, 911–920. MR 1303922

[14] Servet Martínez, Jaime San Martín, and Denis Villemonais, Existence and uniqueness of a quasistationary distribution for Markov processes with fast return from infinity, J. Appl. Probab. (2014), no. 3, 756–768. MR 3256225

[15] Sylvie Méléard and Denis Villemonais, Quasi-stationary distributions and population processes, Probab. Surv. (2012), 340–410. MR 2994898

[16] William Oçafrain, Polynomial rate of convergence to the Yaglom limit for Brownian motion with drift, Electron. Commun. Probab. (2020), Paper No. 35, 12. MR 4095047

[17] Ross G. Pinsky, The lifetimes of conditioned diffusion processes, Ann. Inst. H. Poincaré Probab. Statist. (1990), no. 1, 87–99. MR 1075440

[18] M. Polak and T. Rolski, A note on speed of convergence to the quasi-stationary distribution, Demonstratio Math. (2012), no. 2, 385–397. MR 2963076

[19] Henry Scheffé, A useful convergence theorem for probability distributions, Ann. Math. Statistics (1947), 434–438. MR 21585

[20] David Steinsaltz and Steven N. Evans, Quasistationary distributions for one-dimensional diffusions with killing, Trans. Amer. Math. Soc. (2007), no. 3, 1285–1324. MR 2262851

[21] Erik A. van Doorn, Quasi-stationary distributions and convergence to quasi-stationarity of birth-death processes, Adv. in Appl. Probab. (1991), no. 4, 683–700. MR 1133722

[22] Erik A. van Doorn and Philip K. Pollett, Quasi-stationary distributions for discrete-state models, European J. Oper. Res. (2013), no. 1, 1–14. MR 3063313

[23] David Williams, Probability with martingales, Cambridge Mathematical Textbooks, Cambridge University Press, Cambridge, 1991. MR 1155402

[24] Sewall Wright, Evolution in Mendelian populations, Genetics (1931), no. 2, 97.

[25] A. M. Yaglom, Certain limit theorems of the theory of branching random processes, Doklady Akad. Nauk SSSR (N.S.) (1947), 795–798. MR 0022045

[26] Jun Ye, Quasi-stationary distributions for the radial Ornstein-Uhlenbeck processes, Acta Math. Sci. Ser. B (Engl. Ed.) 28