Affine Jump-Diffusions: Stochastic Stability and Limit Theorems
arXiv [q-fin.MF], Oct
Xiaowei Zhang †, Peter W. Glynn ∗

Abstract.
Affine jump-diffusions constitute a large class of continuous-time stochastic models that are particularly popular in finance and economics due to their analytical tractability. Methods for parameter estimation for such processes require ergodicity in order to establish consistency and asymptotic normality of the associated estimators. In this paper, we develop stochastic stability conditions for affine jump-diffusions, thereby providing the needed large-sample theoretical support for estimating such processes. We establish ergodicity for such models by imposing a “strong mean reversion” condition and a mild condition on the distribution of the jumps, i.e. the finiteness of a logarithmic moment. Exponential ergodicity holds if the jumps have a finite moment of a positive order. In addition, we prove strong laws of large numbers and functional central limit theorems for additive functionals for this class of models.
Key words. affine jump-diffusion; ergodicity; Lyapunov inequality; strong law of large numbers; functional central limit theorem
Affine jump-diffusion (AJD) processes constitute an important class of continuous-time stochastic models that are widely used in finance and econometrics. This class of models is flexible enough to capture various empirical attributes such as stochastic volatility and leverage effects; see, e.g., Barndorff-Nielsen and Shephard (2001). Furthermore, the affine structure permits efficient computation, as a consequence of the fact that the characteristic function of its transient distribution is of an exponential affine form. The transform can then be computed by solving a system of ordinary differential equations (ODEs) of generalized Riccati type; see Duffie et al. (2000). The ability to efficiently compute such characteristic functions then leads to significant tractability both for computing various expectations and probabilities, and for the use of “method of moments” for calibrating such models; see Singleton (2001), Bates (2006), and Filipović et al. (2013).

† Corresponding author. Department of Management Sciences, College of Business, City University of Hong Kong, Hong Kong. Email: [email protected]
∗ Department of Management Science and Engineering, Stanford University, CA 94305, U.S.
We will adopt the following notation throughout the paper.
• We write $\mathbb{R}^d_+ := \{v \in \mathbb{R}^d : v_i \ge 0,\ i = 1,\ldots,d\}$ and $\mathbb{R}^d_- := \{v \in \mathbb{R}^d : v_i \le 0,\ i = 1,\ldots,d\}$.
• A vector $v \in \mathbb{R}^d$ is treated as a column vector, $v^\intercal$ denotes its transpose, and $\|v\|$ denotes its Euclidean norm.
• For a matrix $A$, $A \succeq 0$ means that $A$ is symmetric positive semidefinite and $A \succ 0$ means that $A$ is symmetric positive definite.
• We write $v_I = (v_i : i \in I)$ and $A_{IJ} = (A_{ij} : i \in I, j \in J)$, where $v \in \mathbb{R}^d$ is a vector, $A \in \mathbb{R}^{d \times d}$ is a matrix, and $I, J \subseteq \{1,\ldots,d\}$ are two index sets.
• We use $0$ to denote a zero vector or a zero matrix, and $\mathrm{Id}(i)$ to denote a matrix whose entries are all zero except that the $i$-th diagonal entry is 1, regardless of dimension.
• For a set $K \subseteq \mathbb{R}^d$, $\mathbb{I}_K(x)$ denotes the indicator function associated with $K$, i.e. $\mathbb{I}_K(x) = 1$ if $x \in K$ and 0 otherwise.

Fix a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$ equipped with a filtration $\{\mathcal{F}_t : t \ge 0\}$ that satisfies the usual hypotheses (Protter 2003, p.3). Suppose that a stochastic process $X = (X(t) : t \ge 0)$ with state space $\mathcal{X} \subseteq \mathbb{R}^d$ satisfies the following stochastic differential equation (SDE)

$$\mathrm{d}X(t) = \mu(X(t))\,\mathrm{d}t + \sigma(X(t))\,\mathrm{d}W(t) + \int_{\mathbb{R}^d} z\, N(\mathrm{d}t, \mathrm{d}z), \qquad X(0) = x \in \mathcal{X}, \tag{1}$$

where $W = (W(t) : t \ge 0)$ is a $d$-dimensional Wiener process and $N(\mathrm{d}t, \mathrm{d}z)$ is a random counting measure on $[0, \infty) \times \mathbb{R}^d$ with compensator measure $\Lambda(X(t-))\,\mathrm{d}t\,\nu(\mathrm{d}z)$; moreover, $\mu : \mathbb{R}^d \to \mathbb{R}^d$, $\sigma : \mathbb{R}^d \to \mathbb{R}^{d \times d}$, and $\Lambda : \mathbb{R}^d \to \mathbb{R}$ are measurable functions, and $\nu$ is a Borel measure on $\mathbb{R}^d$. In the sequel, we will write $\mathbb{P}_x(\cdot) = \mathbb{P}(\cdot \mid X(0) = x)$ and $\mathbb{P}_\eta(\cdot) = \int_{\mathcal{X}} \mathbb{P}(\cdot \mid X(0) = x)\,\eta(\mathrm{d}x)$ for an initial distribution $\eta$; $\mathbb{E}_x$ and $\mathbb{E}_\eta$ denote the corresponding expectation operators.

We call $X$ an AJD if the drift $\mu(x)$, diffusion matrix $\sigma(x)\sigma(x)^\intercal$, and jump intensity $\Lambda(x)$ are all affine in $x$, namely,

$$\begin{aligned}
\mu(x) &= b + \beta x, && b \in \mathbb{R}^d,\ \beta \in \mathbb{R}^{d \times d}, \\
\sigma(x)\sigma(x)^\intercal &= a + \sum_{i=1}^d x_i \alpha_i, && a \in \mathbb{R}^{d \times d},\ \alpha_i \in \mathbb{R}^{d \times d},\ i = 1,\ldots,d, \\
\Lambda(x) &= \lambda + \kappa^\intercal x, && \lambda \in \mathbb{R},\ \kappa \in \mathbb{R}^d.
\end{aligned} \tag{2}$$

This paper is largely motivated by statistical calibration of AJDs. Most calibration procedures that have been applied to AJDs are based on some estimating equation as follows. Let $\Xi$ denote the collection of unknown parameters. For simplicity, we assume that the process $X$ is discretely sampled at time epochs $\{k\Delta : k = 0, 1, \ldots, n\}$ for some $\Delta > 0$. To estimate $\Xi$, one judiciously selects a tractable function $h(x, y; \Xi)$ for which $\mathbb{E}[h(X(0), X(\Delta); \Xi)] = 0$, and then solves the equation

$$\frac{1}{n} \sum_{k=1}^n h\bigl(X((k-1)\Delta), X(k\Delta); \hat{\Xi}_n\bigr) = 0$$

to compute the estimate $\hat{\Xi}_n$. In a situation where the dimension of $h$ is greater than the dimension of $\Xi$, one can use the generalized method of moments (Hansen 1982). Typical choices of $h$ include the marginal characteristic function of the conditional distribution of $X(k\Delta)$ given $X((k-1)\Delta)$, and $\mathcal{A}g(x)$ for some tractable function $g$ with enough smoothness, where $\mathcal{A}$ is the operator defined in (10), as in Hansen and Scheinkman (1995). See also Duffie and Glynn (2004) for a choice of $h$ that also utilizes the operator $\mathcal{A}$ but in a context where $X$ is sampled at random times rather than deterministic times.

In order to establish consistency and asymptotic normality of $\hat{\Xi}_n$, it is standard to assume positive Harris recurrence as well as certain moment conditions on the function $h$; see, e.g., Hansen (1982). The strong laws of large numbers (SLLNs) and functional central limit theorems (FCLTs) that we present as part of Theorem 1 and Theorem 2 provide large-sample theoretical support for establishing these asymptotic properties of the estimator. We refer interested readers to Aït-Sahalia (2007) for an extensive survey on various statistical calibration methods for general jump-diffusions and related assumptions for statistical validity.

The following three assumptions are universal throughout the paper.
Assumption 1.
Let $\mathcal{X} = \mathbb{R}^m_+ \times \mathbb{R}^{d-m}$. For each $x \in \mathcal{X}$, there exists a unique $\mathcal{X}$-valued strong solution to the SDE (1) with coefficients (2).

Assumption 2.
Let $I = \{1, \ldots, m\}$ and $J = \{m+1, \ldots, d\}$ for some $0 \le m \le d$.
(i) $a \succeq 0$ with $a_{II} = 0$;
(ii) $\alpha_i \succeq 0$ and $\alpha_{i,II} = \alpha_{i,ii} \cdot \mathrm{Id}(i)$ for $i \in I$; $\alpha_i = 0$ for $i \in J$;
(iii) $b \in \mathbb{R}^m_+ \times \mathbb{R}^{d-m}$;
(iv) $\beta_{IJ} = 0$ and $\beta_{II}$ has non-negative off-diagonal elements;
(v) $\lambda \in \mathbb{R}_+$, $\kappa_I \in \mathbb{R}^m_+$ and $\kappa_J = 0$;
(vi) $\nu$ is a probability distribution on $\mathcal{X}$.

Assumption 3. $a_{JJ} \succ 0$ and $2b_i > \alpha_{i,ii} > 0$ for $i = 1, \ldots, m$.

In this paper, we focus on AJDs with canonical state space (Assumption 1) and admissible parameters (Assumption 2). In the absence of jumps (i.e., $\lambda = 0$ and $\kappa = 0$), the existence and uniqueness of a strong solution to the SDE (1) with coefficients (2) is established in Filipović and Mayerhofer (2009). They first prove the existence of a weak solution, then prove pathwise uniqueness of the solution, and finally apply the Yamada–Watanabe theorem (Karatzas and Shreve 1991, Corollary 5.3.23). The same approach is followed in Dawson and Li (2006) to prove the case of AJDs in one or two dimensions.

Clearly, under Assumption 2 both the diffusion matrix and the jump intensity are independent of $x_J$, i.e. $\sigma(x)\sigma(x)^\intercal = a + \sum_{i=1}^m x_i \alpha_i$ and $\Lambda(x) = \lambda + \sum_{i=1}^m x_i \kappa_i$. In financial applications, the first $m$ components $(X_1, \ldots, X_m)$ are often used to model volatility processes and thus are referred to as volatility factors, whereas the other $(d - m)$ components are referred to as dependent factors.

The jumps of the AJDs we study here have finite activity, a consequence of the fact that $\nu$ is assumed to be a probability distribution rather than a $\sigma$-finite measure. Nevertheless, this restriction is imposed merely for mathematical simplicity; the main results could also be proved for the case of infinite activity at the cost of a more involved analysis. One may recognize that the SDE (1) with finite activity jumps is precisely the model proposed in Duffie et al.
(2000), which already covers a substantial number of financial and economic applications.

For a one-dimensional AJD such as the CIR model, the condition $2b_i > \alpha_{i,ii} > 0$ ensures that $X$ admits a positive transition density. Note that the existence of a transition density for AJDs is established in Filipović et al. (2013), but their proof requires $b_i > \alpha_{i,ii} > 0$, $i = 1, \ldots, m$, which is stronger than our Assumption 3.

Prior to presenting the main results of the paper, let us review several concepts regarding stochastic stability of a Markov process.

Definition 1. A Markov process $X$ with state space $\mathcal{X}$ is called Harris recurrent if there exists a non-trivial $\sigma$-finite measure $\varphi$ on $\mathcal{X}$ such that $\int_0^\infty \mathbb{I}_K(X(t))\,\mathrm{d}t = \infty$, $\mathbb{P}_x$-a.s., for all $x \in \mathcal{X}$ and any measurable set $K$ with $\varphi(K) > 0$.

Definition 2. A Harris recurrent Markov process $X$ is called positive Harris recurrent if it admits a finite invariant measure $\pi$, which can be normalized to a probability measure that is called the stationary distribution of $X$; the measure $\pi$ is then necessarily unique.

Definition 3. For a Markov process $X$ with state space $\mathcal{X}$, a set $K \subseteq \mathcal{X}$ is called uniformly transient if there exists $M < \infty$ such that $\mathbb{E}_x \int_0^\infty \mathbb{I}_K(X(t))\,\mathrm{d}t \le M$ for all $x \in \mathcal{X}$. Furthermore, $X$ is called transient if there is a countable cover of $\mathcal{X}$ with uniformly transient sets.

Definition 4. For any measurable function $f : \mathcal{X} \to [1, \infty)$ and any signed measure $\varphi$ on $\mathcal{X}$, define the $f$-norm of $\varphi$ by $\|\varphi\|_f := \sup_{|h| \le f} |\varphi(h)|$, where $\varphi(h) := \int_{\mathcal{X}} h(x)\,\varphi(\mathrm{d}x)$. When $f \equiv 1$, $\|\cdot\|_f$ is called the total variation norm and is denoted by $\|\cdot\|$.

The following concept is also needed to state our condition for stochastic stability for AJDs.

Definition 5. A square matrix is called stable if all its eigenvalues have negative real parts.

The following notation will facilitate the presentation in the sequel. Let $Z$ denote an $\mathbb{R}^d$-valued random variable with distribution $\nu$. For $q > 0$, set $f_q(x) := 1 + \|x\|^q$. For any measurable functions $f : \mathcal{X} \to [1, \infty)$ and $h : \mathcal{X} \to \mathbb{R}$, set $\|h\|_f := \sup_{x \in \mathcal{X}} \{|h(x)| / f(x)\}$. Let $D[0,1]$ denote the space of right continuous functions $x : [0,1] \to \mathbb{R}$ with left limits, endowed with the Skorokhod topology.

A distinctive feature of AJDs, besides the affine structure, relative to other jump-diffusion models is that their jump intensity is state-dependent. This property endows AJDs with greater flexibility in financial modeling but creates technical difficulties for analyzing the dynamics of the process. Indeed, differing theoretical treatments are needed, depending on whether the jump intensity is state-dependent, when we establish Lyapunov inequalities in Section 3. We therefore present our main results in two separate theorems. Theorem 1 covers only AJDs with state-independent jump intensities ($\kappa = 0$), whereas Theorem 2 allows state-dependent jump intensities.

Theorem 1.
If Assumptions 1–3 hold, $\kappa = 0$, $\beta$ is a stable matrix, and $\mathbb{E}\log(1 + \|Z\|) < \infty$, then:
(i) $X$ is positive Harris recurrent and

$$\lim_{t \to \infty} \|\mathbb{P}_x(X(t) \in \cdot) - \pi(\cdot)\| = 0, \qquad x \in \mathcal{X}, \tag{3}$$

where $\pi$ is the stationary distribution of $X$.
If, in addition, $\mathbb{E}\|Z\|^p < \infty$ for some $p > 0$, then:
(ii) For each $q \in (0, p]$, there exist positive finite constants $c_q$ and $\rho_q$ such that

$$\|\mathbb{P}_x(X(t) \in \cdot) - \pi(\cdot)\|_{f_q} \le c_q f_q(x) e^{-\rho_q t}, \qquad t \ge 0,\ x \in \mathcal{X}. \tag{4}$$

(iii) For any measurable function $h : \mathcal{X} \to \mathbb{R}$ with $\|h\|_{f_p} < \infty$,

$$\mathbb{P}_x\left(\lim_{t \to \infty} \frac{1}{t} \int_0^t h(X(s))\,\mathrm{d}s = \pi(h)\right) = 1, \qquad x \in \mathcal{X}, \tag{5}$$

and

$$\mathbb{P}_x\left(\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n h(X(i\Delta)) = \pi(h)\right) = 1, \qquad x \in \mathcal{X}. \tag{6}$$

(iv) For any measurable function $h : \mathcal{X} \to \mathbb{R}$ with $\|h^q\|_{f_p} < \infty$ for some $q > 2$, there exist non-negative finite constants $\sigma_h$ and $\gamma_h$ such that

$$n^{1/2}\left(\frac{1}{n} \int_0^{n\cdot} h(X(s))\,\mathrm{d}s - \pi(h)\right) \Rightarrow \sigma_h W(\cdot), \tag{7}$$

and

$$n^{1/2}\left(\frac{1}{n} \sum_{i=1}^{\lfloor n\cdot \rfloor} h(X(i\Delta)) - \pi(h)\right) \Rightarrow \gamma_h W(\cdot), \tag{8}$$

as $n \to \infty$ $\mathbb{P}_x$-weakly in $D[0,1]$ for all $x \in \mathcal{X}$, where $W$ is a one-dimensional Wiener process.

Theorem 2.
If Assumptions 1–3 hold, $\beta + \mathbb{E}(Z)\kappa^\intercal$ is a stable matrix, and $\mathbb{E}\|Z\| < \infty$, then:
(i) $X$ is positive Harris recurrent and (3) holds.
If, in addition, $\mathbb{E}\|Z\|^p < \infty$ for some $p \ge 1$, then:
(ii) For each $q \in [1, p]$, there exist positive finite constants $c_q$ and $\rho_q$ such that (4) holds.
(iii) For any measurable function $h : \mathcal{X} \to \mathbb{R}$ with $\|h\|_{f_p} < \infty$, (5) and (6) hold.
(iv) For any measurable function $h : \mathcal{X} \to \mathbb{R}$ with $\|h^q\|_{f_p} < \infty$ for some $q > 2$, there exist non-negative finite constants $\sigma_h$ and $\gamma_h$ such that (7) and (8) hold as $n \to \infty$ $\mathbb{P}_x$-weakly in $D[0,1]$ for all $x \in \mathcal{X}$.

We note that $X$ is called ergodic if it has a stationary distribution $\pi$ and the convergence (3) holds, whereas it is called $f$-exponentially ergodic if

$$\|\mathbb{P}_x(X(t) \in \cdot) - \pi(\cdot)\|_f \le c(x) e^{-\rho t}, \qquad t \ge 0,\ x \in \mathcal{X},$$

for some functions $f : \mathcal{X} \to [1, \infty)$, $c : \mathcal{X} \to \mathbb{R}_+$ and some positive finite constant $\rho$. Clearly, $X$ is $f_p$-exponentially ergodic under the assumptions of Theorem 1(ii) or Theorem 2(ii).

The key condition imposed here to establish positive Harris recurrence of AJDs is that $\beta + \mathbb{E}(Z)\kappa^\intercal$ is a stable matrix. If we adopt the convention that $0 \times \infty = 0$, then when $\kappa = 0$ this condition reduces to the requirement that $\beta$ is a stable matrix, regardless of the finiteness of $\mathbb{E}\|Z\|$. The condition that $\beta$ is a stable matrix is typically assumed in the literature, including Sato and Yamazato (1984), Glasserman and Kim (2010), and Jena et al. (2012), in order that the process be mean reverting and have a stationary distribution. However, the first of the three articles works on a special Lévy-driven SDE, whereas the other two study affine diffusions (ADs), so none of them allows state-dependent jump intensities as AJDs do. It can be shown that the stability of $\beta + \mathbb{E}(Z)\kappa^\intercal$ implies that of $\beta$; see Lemma 3 of Zhang et al.
(2015). Thus, our condition is stronger and we call it the strong mean reversion condition.

Note that $\mathbb{E}(Z)$ is the mean jump size and $\kappa$ largely determines the magnitude of the jump intensity when the AJD takes on big values. To some extent, $\mathbb{E}(Z)\kappa^\intercal$ captures the impact of the jumps. Thus, by imposing the stability of $\beta + \mathbb{E}(Z)\kappa^\intercal$, we essentially assume that mean reversion is a dominating factor, more significant than the jumps, in the dynamics of the process. On the other hand, this condition is technically mild. Indeed, we show in Section 3.3 that it cannot be relaxed in general if positive Harris recurrence of an AJD is desired.

In this section, we apply Lyapunov ideas to address the stochastic stability of $X$. A key step in this approach is to judiciously construct suitable Lyapunov functions; see Meyn and Tweedie (1993c) for an extensive treatment of this approach. Nevertheless, we do not directly use the results there because their theory uses a definition of domain that insists on functions inducing martingales, whereas we work with local martingales.

Consider a twice-differentiable function $g : \mathcal{X} \to \mathbb{R}$. By virtue of Itô's formula,

$$\begin{aligned}
g(X(t)) = g(X(0)) &+ \int_0^t \left[\nabla g(X(s-))^\intercal \mu(X(s-)) + \frac{1}{2} \sum_{i,j=1}^d \frac{\partial^2 g(X(s-))}{\partial x_i \partial x_j} \bigl(\sigma(X(s-))\sigma(X(s-))^\intercal\bigr)_{ij}\right] \mathrm{d}s \\
&+ \int_0^t \nabla g(X(s-))^\intercal \sigma(X(s-))\,\mathrm{d}W(s) + \int_0^t \int_{\mathcal{X}} \bigl(g(X(s-) + z) - g(X(s-))\bigr)\, N(\mathrm{d}s, \mathrm{d}z).
\end{aligned} \tag{9}$$

By defining operators $\mathcal{G}$, $\mathcal{L}$, and $\mathcal{A}$ on twice-differentiable appropriately integrable functions $g$ via

$$\begin{aligned}
\mathcal{G}g(x) &:= \nabla g(x) \cdot (b + \beta x) + \frac{1}{2} \sum_{i,j=1}^d \frac{\partial^2 g(x)}{\partial x_i \partial x_j} \left(a_{ij} + \sum_{k=1}^d \alpha_{k,ij} x_k\right), \\
\mathcal{L}g(x) &:= (\lambda + \kappa^\intercal x) \int_{\mathcal{X}} \bigl(g(x + z) - g(x)\bigr)\,\nu(\mathrm{d}z), \\
\mathcal{A}g(x) &:= \mathcal{G}g(x) + \mathcal{L}g(x),
\end{aligned} \tag{10}$$

we may rewrite (9) as

$$\begin{aligned}
g(X(t)) &= g(X(0)) + \int_0^t \mathcal{A}g(X(s-))\,\mathrm{d}s + S_1(t) + S_2(t), \\
S_1(t) &:= \int_0^t \nabla g(X(s-))^\intercal \sigma(X(s-))\,\mathrm{d}W(s), \\
S_2(t) &:= \int_0^t \int_{\mathcal{X}} \bigl(g(X(s-) + z) - g(X(s-))\bigr)\, \tilde{N}(\mathrm{d}s, \mathrm{d}z),
\end{aligned} \tag{11}$$

where $\tilde{N}(\mathrm{d}s, \mathrm{d}z) = N(\mathrm{d}s, \mathrm{d}z) - \Lambda(X(s-))\,\mathrm{d}s\,\nu(\mathrm{d}z)$ is the compensated random measure of $N(\mathrm{d}s, \mathrm{d}z)$.

We introduce some notation to facilitate the construction of the needed Lyapunov inequalities. First, for a $d \times d$ matrix $H \succ 0$, define $\|v\|_H := \sqrt{v^\intercal H v}$. Then, $\|\cdot\|_H$ is a vector norm on $\mathbb{R}^d$ and it is easy to show that

$$\underline{\delta}\, \|v\|^2 \le \|v\|_H^2 \le \bar{\delta}\, \|v\|^2, \qquad v \in \mathbb{R}^d, \tag{12}$$

where $(\delta_i : i = 1, \ldots, d)$ are the eigenvalues of $H$, $\underline{\delta} = \min\{\delta_i : i = 1, \ldots, d\}$ and $\bar{\delta} = \max\{\delta_i : i = 1, \ldots, d\}$. We can then define the following induced matrix norms (see Horn and Johnson (2012, p.340)). For a matrix $A \in \mathbb{R}^{d \times d}$, define

$$|\!|\!|A|\!|\!| := \sup\left\{\frac{\|Av\|}{\|v\|} : 0 \ne v \in \mathbb{R}^d\right\} \quad \text{and} \quad |\!|\!|A|\!|\!|_H := \sup\left\{\frac{\|Av\|_H}{\|v\|_H} : 0 \ne v \in \mathbb{R}^d\right\}.$$

For each $\Delta > 0$, let $X^\Delta := (X(n\Delta) : n = 0, 1, \ldots)$ denote the $\Delta$-skeleton of $X$.

Proposition 1.
Under Assumptions 1–3, $X^\Delta$ is $\varphi$-irreducible for any $\Delta > 0$, where $\varphi$ is the Lebesgue measure on $\mathcal{X}$.

The proof of Proposition 1 relies on the following result, which is of interest in its own right. It reduces irreducibility of a jump-diffusion process to that of the associated diffusion process.
Lemma 1.
Suppose that $X$ satisfies the SDE (1). (Here, we do not restrict its coefficients $\mu$, $\sigma$, $\Lambda$ to follow the affine form (2).) Let $\tilde{X} = (\tilde{X}(t) : t \ge 0)$ satisfy

$$\mathrm{d}\tilde{X}(t) = \mu(\tilde{X}(t))\,\mathrm{d}t + \sigma(\tilde{X}(t))\,\mathrm{d}W(t), \qquad \tilde{X}(0) = x \in \mathcal{X}, \tag{13}$$

where $W$ is the $d$-dimensional Wiener process in (1). If $\tilde{X}^\Delta$ (resp., $\tilde{X}$) is $\varphi$-irreducible, then $X^\Delta$ (resp., $X$) is $\varphi$-irreducible.

Proof. Consider a measurable $K \subseteq \mathcal{X}$ and let $\tau^*$ denote the first jump time of $X$. Then $\mathbb{P}_x(X(t) = \tilde{X}(t)) = 1$ for $t < \tau^*$. It follows that for any $t > 0$,

$$\begin{aligned}
\mathbb{P}_x(X(t) \in K, \tau^* > t) &= \mathbb{E}_x\Bigl[\mathbb{E}\bigl(\mathbb{I}(\tilde{X}(t) \in K, \tau^* > t) \,\big|\, \tilde{X}(s), 0 \le s \le t\bigr)\Bigr] \\
&= \mathbb{E}_x\Bigl[\mathbb{I}(\tilde{X}(t) \in K)\, \mathbb{P}\bigl(\tau^* > t \,\big|\, \tilde{X}(s), 0 \le s \le t\bigr)\Bigr] \\
&= \mathbb{E}_x\Bigl[\mathbb{I}(\tilde{X}(t) \in K)\, e^{-\int_0^t \Lambda(\tilde{X}(s))\,\mathrm{d}s}\Bigr].
\end{aligned}$$

Hence, $\mathbb{P}_x(X(t) \in K, \tau^* > t) = 0$ if and only if $\mathbb{P}_x(\tilde{X}(t) \in K) = 0$ for any $t > 0$. It is then clear that the $\varphi$-irreducibility of $\tilde{X}^\Delta$ (resp., $\tilde{X}$) implies that of $X^\Delta$ (resp., $X$).

Proof of Proposition 1.
The key in the proof is to convert the AJD by a linear transformation used in Filipović and Mayerhofer (2009) into a canonical representation in which the matrices involved are of special form. Specifically, note that if $X$ satisfies the SDE (1) with coefficients (2), then for any nonsingular matrix $A \in \mathbb{R}^{d \times d}$, the linear transformation $Y = AX$ satisfies

$$\mathrm{d}Y(t) = \bigl(Ab + A\beta A^{-1} Y(t)\bigr)\,\mathrm{d}t + A\sigma\bigl(A^{-1}Y(t)\bigr)\,\mathrm{d}W(t) + \int_{\mathbb{R}^d} Az\, N(\mathrm{d}t, \mathrm{d}z), \qquad Y(0) = Ax, \tag{14}$$

where $N(\mathrm{d}t, \mathrm{d}z)$ has the compensator measure $\Lambda(A^{-1}Y(t-))\,\mathrm{d}t\,\nu(\mathrm{d}z)$. So the drift, diffusion matrix, and intensity of SDE (14) are $Ab + A\beta A^{-1}y$, $A\sigma(A^{-1}y)\sigma(A^{-1}y)^\intercal A^\intercal$, and $\lambda + \kappa^\intercal A^{-1}y$, respectively, which are all affine in $y$. Consequently, the existence and uniqueness of a strong solution to (1) is invariant with respect to nonsingular linear transformations.

Since $\alpha_{i,ii} > 0$, $i = 1, \ldots, m$, it follows from Lemma 7.1 of Filipović and Mayerhofer (2009) that there exists a nonsingular matrix $A \in \mathbb{R}^{d \times d}$ that maps $\mathbb{R}^m_+ \times \mathbb{R}^{d-m}$ to itself and renders the transformed diffusion matrix in the following block-diagonal form

$$A\sigma\bigl(A^{-1}y\bigr)\sigma\bigl(A^{-1}y\bigr)^\intercal A^\intercal = \begin{pmatrix} \mathrm{diag}(\alpha_{1,11} y_1, \ldots, \alpha_{m,mm} y_m) & 0 \\ 0 & h + \sum_{i=1}^m y_i \eta_i \end{pmatrix}$$

for some $(d-m) \times (d-m)$ matrices $h \succeq 0$ and $\eta_i \succeq 0$, $i = 1, \ldots, m$. In particular, $A$ is of the form

$$A = \begin{pmatrix} I_m & 0 \\ D & I_{d-m} \end{pmatrix}$$

for some $(d-m) \times m$ matrix $D$, where $I_m$ and $I_{d-m}$ are identity matrices. Moreover, it is straightforward to verify that $Ab$, $A\beta A^{-1}$, and $\kappa^\intercal A^{-1}$ satisfy both Assumption 2 and Assumption 3 in lieu of $b$, $\beta$, and $\kappa$. Hence, we can assume without loss of generality that the diffusion matrix of (1) has the form

$$\sigma(x)\sigma(x)^\intercal = \begin{pmatrix} \mathrm{diag}(\alpha_{1,11} x_1, \ldots, \alpha_{m,mm} x_m) & 0 \\ 0 & a_{JJ} + \sum_{i=1}^m x_i \alpha_{i,JJ} \end{pmatrix}. \tag{15}$$

Hence, $\tilde{X}_I(t)$ satisfies

$$\mathrm{d}\tilde{X}_I(t) = \bigl(b_I + \beta_{II} \tilde{X}_I(t)\bigr)\,\mathrm{d}t + \mathrm{diag}\Bigl(\sqrt{\alpha_{1,11} \tilde{X}_1(t)}, \ldots, \sqrt{\alpha_{m,mm} \tilde{X}_m(t)}\Bigr)\,\mathrm{d}W_I(t), \qquad \tilde{X}_I(0) = x_I \in \mathbb{R}^m_+.$$

Since $2b_i > \alpha_{i,ii}$, $i = 1, \ldots, m$, we can directly verify the conditions of the theorem on p.388 of Duffie and Kan (1996) to conclude that $0 \in \mathbb{R}^m_+$ is not attainable in finite time, i.e. $\tilde{X}_i(t) > 0$ for all $t > 0$, $i = 1, \ldots, m$, if $\tilde{X}_i(0) > 0$, $i = 1, \ldots, m$.

We now consider a bijective transformation $\tilde{Y} := f(\tilde{X})$, where $f : \mathcal{X} \to \mathcal{X}$ is defined as follows: $f_i(x) = 2\sqrt{x_i}$ for $i = 1, \ldots, m$ and $f_i(x) = x_i$ for $i = m+1, \ldots, d$. Then,

$$\frac{\partial f_i(x)}{\partial x_j} = \begin{cases} x_i^{-1/2}, & \text{if } i = j,\ i = 1, \ldots, m, \\ 1, & \text{if } i = j,\ i = m+1, \ldots, d, \\ 0, & \text{otherwise,} \end{cases}
\qquad \text{and} \qquad
\frac{\partial^2 f_i(x)}{\partial x_k \partial x_l} = \begin{cases} -\frac{1}{2} x_i^{-3/2}, & \text{if } i = k = l,\ i = 1, \ldots, m, \\ 0, & \text{otherwise.} \end{cases}$$

It follows that, by Itô's formula,

$$\mathrm{d}f_i(\tilde{X}(t)) = \zeta_i(\tilde{X}(t))\,\mathrm{d}t + \nabla f_i(\tilde{X}(t))^\intercal \sigma(\tilde{X}(t))\,\mathrm{d}W(t)$$

for $i = 1, \ldots, d$, where

$$\zeta_i(x) = \frac{\partial f_i(x)}{\partial x_i} \mu_i(x) + \frac{1}{2} \frac{\partial^2 f_i(x)}{\partial x_i^2} \bigl(\sigma(x)\sigma(x)^\intercal\bigr)_{ii}.$$

Since we have shown that $\tilde{X}_i(t) > 0$ for $i = 1, \ldots, m$, the function $\zeta$ is well-defined along the path of $\tilde{X}$. Let $f^{-1}$ denote the inverse mapping of $f$, i.e. $f^{-1}_i(y) = y_i^2/4$ for $i = 1, \ldots, m$ and $f^{-1}_i(y) = y_i$ for $i = m+1, \ldots, d$. Then,

$$\mathrm{d}\tilde{Y}(t) = \zeta\bigl(f^{-1}(\tilde{Y}(t))\bigr)\,\mathrm{d}t + \nabla f\bigl(f^{-1}(\tilde{Y}(t))\bigr)\, \sigma\bigl(f^{-1}(\tilde{Y}(t))\bigr)\,\mathrm{d}W(t), \tag{16}$$

where $\nabla f := \bigl(\partial f_i / \partial x_j\bigr)_{1 \le i,j \le d}$ is the Jacobian matrix of $f$. A straightforward calculation reveals that the diffusion matrix of (16) is

$$\nabla f(f^{-1}(y))\, \sigma(f^{-1}(y))\, \sigma(f^{-1}(y))^\intercal\, \nabla f(f^{-1}(y))^\intercal = \begin{pmatrix} \mathrm{diag}(\alpha_{1,11}, \ldots, \alpha_{m,mm}) & 0 \\ 0 & a_{JJ} + \sum_{i=1}^m \frac{y_i^2}{4} \alpha_{i,JJ} \end{pmatrix}.$$

Hence, in light of the assumption that $\alpha_{i,ii} > 0$, $i = 1, \ldots, m$, and $a_{JJ} \succ 0$, the diffusion matrix of (16) is uniformly elliptic. It is well known that such diffusion processes admit a positive probability density; see, e.g., Theorem 3.3.4 of Davies (1989). Since the mapping $f$ is bijective, we conclude that $\tilde{X}$ also admits a positive transition density, so $\tilde{X}^\Delta$ is $\varphi$-irreducible. This completes the proof in light of Lemma 1.

Proposition 2.
Under Assumptions 1 and 2, $X$ is a stochastically continuous affine process.

Proof. For $T > 0$ and $u \in i\mathbb{R}^d$, define

$$M(t) := e^{\phi(T-t,u) + \psi(T-t,u)^\intercal X(t)},$$

where $\phi : \mathbb{R}_+ \times i\mathbb{R}^d \to \mathbb{C}$ and $\psi : \mathbb{R}_+ \times i\mathbb{R}^d \to \mathbb{C}^d$ are functions that are differentiable with respect to $t$. Applying Itô's formula, and writing $\psi_s := \psi(T-s,u)$ for brevity inside the integrals,

$$\begin{aligned}
M(t) = M(0) &+ \int_0^t M(s-)\, \psi_s^\intercal \sigma(X(s-))\,\mathrm{d}W(s) + \int_0^t M(s-) \int_{\mathcal{X}} \bigl(e^{\psi_s^\intercal z} - 1\bigr)\, N(\mathrm{d}s, \mathrm{d}z) \\
&+ \int_0^t M(s-) \bigl[-\partial_t \phi(T-s,u) - \partial_t \psi(T-s,u)^\intercal X(s-) + \psi_s^\intercal \mu(X(s-))\bigr]\,\mathrm{d}s \\
&+ \frac{1}{2} \int_0^t M(s-)\, \psi_s^\intercal \sigma(X(s-))\sigma(X(s-))^\intercal \psi_s\,\mathrm{d}s \\
= M(0) &+ \int_0^t M(s-)\, \psi_s^\intercal \sigma(X(s-))\,\mathrm{d}W(s) + \int_0^t M(s-) \int_{\mathcal{X}} \bigl(e^{\psi_s^\intercal z} - 1\bigr)\, \tilde{N}(\mathrm{d}s, \mathrm{d}z) \\
&+ \int_0^t M(s-) \left[-\partial_t \phi(T-s,u) + \psi_s^\intercal b + \frac{1}{2} \psi_s^\intercal a\, \psi_s\right] \mathrm{d}s \\
&+ \int_0^t M(s-) \left[-\partial_t \psi(T-s,u)^\intercal X(s-) + \psi_s^\intercal \beta X(s-) + \frac{1}{2} \sum_{i=1}^d \psi_s^\intercal \alpha_i \psi_s\, X_i(s-)\right] \mathrm{d}s \\
&+ \int_0^t M(s-) \bigl(\lambda + \kappa^\intercal X(s-)\bigr) \int_{\mathcal{X}} \bigl(e^{\psi_s^\intercal z} - 1\bigr)\,\nu(\mathrm{d}z)\,\mathrm{d}s.
\end{aligned}$$

Hence, if $\phi$ and $\psi$ satisfy the following generalized Riccati equations

$$\begin{aligned}
\partial_t \phi(t,u) &= \psi(t,u)^\intercal b + \frac{1}{2} \psi(t,u)^\intercal a\, \psi(t,u) + \lambda \int_{\mathcal{X}} \bigl(e^{\psi(t,u)^\intercal z} - 1\bigr)\,\nu(\mathrm{d}z), \\
\partial_t \psi_i(t,u) &= \psi(t,u)^\intercal \beta_i + \frac{1}{2} \psi(t,u)^\intercal \alpha_i\, \psi(t,u) + \kappa_i \int_{\mathcal{X}} \bigl(e^{\psi(t,u)^\intercal z} - 1\bigr)\,\nu(\mathrm{d}z), \qquad i = 1, \ldots, d,
\end{aligned}$$

with $\phi(0,u) = 0$ and $\psi(0,u) = u$, where $\beta_i$ is the $i$-th column of $\beta$, then

$$M(t) = M(0) + \int_0^t M(s-)\, \psi(T-s,u)^\intercal \sigma(X(s-))\,\mathrm{d}W(s) + \int_0^t M(s-) \int_{\mathcal{X}} \bigl(e^{\psi(T-s,u)^\intercal z} - 1\bigr)\, \tilde{N}(\mathrm{d}s, \mathrm{d}z). \tag{17}$$

It follows from Proposition 6.1 and Proposition 6.4 of Duffie et al. (2003) that under Assumption 2, the preceding generalized Riccati equations have a unique solution $(\phi(\cdot,u), \psi(\cdot,u)) : \mathbb{R}_+ \to \mathbb{C}_- \times \mathbb{C}^m_- \times i\mathbb{R}^{d-m}$ for all $u \in \mathbb{C}^m_- \times i\mathbb{R}^{d-m}$, where $\mathbb{C}^m_- = \{z \in \mathbb{C}^m \mid \mathrm{Re}(z) \in \mathbb{R}^m_-\}$. Hence,

$$\phi(t,u) + \psi(t,u)^\intercal x \in \mathbb{C}_-, \qquad x \in \mathcal{X}, \tag{18}$$

under Assumption 1.
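The generalized Riccati equations above can be integrated numerically. The following is a minimal sketch for the scalar case $d = m = 1$ (so $a = 0$), under the extra assumption, made here only for illustration, that the jump size $Z$ is exponentially distributed with mean `mean_jump`, so that $\mathbb{E}e^{\psi Z} = 1/(1 - \text{mean\_jump}\cdot\psi)$ for $\mathrm{Re}(\psi) \le 0$. The function names, the parameter names, and the use of a classical Runge-Kutta scheme are choices made for this sketch and do not come from the paper.

```python
import cmath


def jump_transform(psi, mean_jump):
    """E[exp(psi * Z)] - 1 for Z ~ Exponential(mean=mean_jump); needs Re(psi) <= 0."""
    if mean_jump == 0.0:
        return 0.0  # no jumps
    return 1.0 / (1.0 - mean_jump * psi) - 1.0


def riccati_rhs(psi, b, beta, alpha, lam, kappa, mean_jump):
    """Right-hand sides of the scalar Riccati equations (d = m = 1, a = 0)."""
    jump = jump_transform(psi, mean_jump)
    dphi = b * psi + lam * jump
    dpsi = beta * psi + 0.5 * alpha * psi * psi + kappa * jump
    return dphi, dpsi


def solve_riccati(u, T, b, beta, alpha, lam, kappa, mean_jump, steps=2000):
    """Integrate (phi, psi) on [0, T] with RK4, starting from phi(0)=0, psi(0)=u."""
    h = T / steps
    phi, psi = 0j, complex(u)
    args = (b, beta, alpha, lam, kappa, mean_jump)
    for _ in range(steps):
        k1p, k1s = riccati_rhs(psi, *args)
        k2p, k2s = riccati_rhs(psi + 0.5 * h * k1s, *args)
        k3p, k3s = riccati_rhs(psi + 0.5 * h * k2s, *args)
        k4p, k4s = riccati_rhs(psi + h * k3s, *args)
        phi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        psi += h * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
    return phi, psi


def char_fn(u, T, x, **params):
    """Exponential-affine transform exp(phi(T,u) + psi(T,u) * x)."""
    phi, psi = solve_riccati(u, T, **params)
    return cmath.exp(phi + psi * x)
```

With `alpha = lam = kappa = 0` the equations reduce to linear ODEs with the closed-form solution $\psi(t,u) = u e^{\beta t}$ and $\phi(t,u) = b u (e^{\beta t} - 1)/\beta$, which gives a simple way to sanity-check the integrator before switching the jumps on.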
Further, Proposition 7.4 of Duffie et al. (2003) asserts that

$$\begin{aligned}
\phi(t+s, u) &= \phi(t,u) + \phi(s, \psi(t,u)), \\
\psi(t+s, u) &= \psi(t, \psi(s,u)),
\end{aligned} \tag{19}$$

for all $t, s \in \mathbb{R}_+$ and $u \in \mathbb{C}^m_- \times i\mathbb{R}^{d-m}$.

In light of (17) and (18), $(M(t) : 0 \le t \le T)$ is a local martingale with $|M(t)| \le 1$ for all $t$, thereby a martingale. So

$$\mathbb{E}_x\bigl[e^{u^\intercal X(T)}\bigr] = \mathbb{E}_x[M(T)] = \mathbb{E}_x[M(0)] = e^{\phi(T,u) + \psi(T,u)^\intercal x}, \tag{20}$$

namely the characteristic function $\mathbb{E}_x[e^{u^\intercal X(t)}]$ is exponential-affine in $x$. In addition, it is easy to verify via (19) and (20) the Chapman–Kolmogorov equation

$$\mathbb{P}_x(X(t+s) \in \cdot) = \int_{\mathcal{X}} \mathbb{P}_x(X(t) \in \mathrm{d}y)\, \mathbb{P}_y(X(s) \in \cdot),$$

implying that $X$ is a time-homogeneous Markov process, thereby an affine process by (20).

Finally, $\mathbb{E}_x[e^{u^\intercal X(t)}]$ is clearly continuous in $t$ by (20), indicating that $X$ is stochastically continuous.

Proof of Theorem 1(i).
We first show that $X$ is Harris recurrent. Theorem 1.1 of Meyn and Tweedie (1993a) asserts that $X$ is Harris recurrent if (i) $X$ is a Borel right process (Getoor 1975, p.55), and (ii) there exists a petite set $K$ for $X$ such that $\mathbb{P}_x(\tau_K < \infty) = 1$ for all $x \in \mathcal{X}$, where $\tau_K = \inf\{t \ge 0 : X(t) \in K\}$.

For condition (i), we note that $X$ is a Feller process by Theorem 5.1 of Keller-Ressel et al. (2011), Proposition 8.2 of Duffie et al. (2003), and Proposition 2. The Feller property of $X$ trivially implies that $X$ is a Borel right process.

For condition (ii), fix an arbitrary $\Delta > 0$. Then $X^\Delta$ is a Feller chain since $X$ is a Feller process. By Theorem 3.4 of Meyn and Tweedie (1992), the Feller property of $X^\Delta$ and Proposition 1 immediately imply that all compact sets are petite for $X^\Delta$, thereby petite for $X$. In the sequel, we will show that there exists a compact set $K$ such that $\mathbb{P}_x(\tau_K < \infty) = 1$ for all $x \in \mathcal{X}$. To that end, we first establish the following Lyapunov inequality

$$\mathcal{A}g(x) \le -c_1 + c_2\, \mathbb{I}_K(x), \qquad x \in \mathcal{X}, \tag{21}$$

for some compact set $K$ and some positive finite constants $c_1$ and $c_2$, where $g(x) = \log(1 + \|x\|_H^2)$ for some $d \times d$ matrix $H \succ 0$.

Since $\beta$ is a stable matrix, there exists a $d \times d$ matrix $H \succ 0$ such that $-(H\beta + \beta^\intercal H) \succ 0$. It is straightforward to calculate the gradient and Hessian of $g(x)$ as follows

$$\nabla g(x) = \frac{2Hx}{1 + \|x\|_H^2} \qquad \text{and} \qquad \nabla^2 g(x) = \frac{2(1 + \|x\|_H^2) H - 4 H x x^\intercal H}{(1 + \|x\|_H^2)^2}.$$

Therefore,

$$\mathcal{G}g(x) = \frac{2}{1 + \|x\|_H^2}\, x^\intercal H (b + \beta x) + \frac{1}{2} \sum_{i,j=1}^d \left(a_{ij} + \sum_{k=1}^d \alpha_{k,ij} x_k\right) \left(\frac{2H}{1 + \|x\|_H^2} - \frac{4 H x x^\intercal H}{(1 + \|x\|_H^2)^2}\right)_{ij}. \tag{22}$$

We note that for any $i, j = 1, \ldots, d$,

$$|(H x x^\intercal H)_{ij}| = |(Hx)_i (Hx)_j| \le \|Hx\|^2 \le |\!|\!|H|\!|\!|^2 \|x\|^2 \le \underline{\delta}^{-1} |\!|\!|H|\!|\!|^2 \|x\|_H^2,$$

where the last inequality follows from (12). Hence,

$$\frac{|(H x x^\intercal H)_{ij}|}{1 + \|x\|_H^2} = O(1), \tag{23}$$

as $\|x\|_H \to \infty$. Therefore, we can rewrite (22) as

$$\mathcal{G}g(x) = \frac{2 x^\intercal H \beta x + O(\|x\|_H)}{1 + \|x\|_H^2} = \frac{2 x^\intercal H \beta x}{\|x\|_H^2 \bigl(1 + \|x\|_H^{-2}\bigr)} + o(1),$$

as $\|x\|_H \to \infty$.
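The matrix $H$ used to build the Lyapunov function can be computed explicitly once $\beta$ is known. The sketch below is one way to do this, assuming NumPy is available: it checks stability through the eigenvalues of $\beta$ and then solves the linear Lyapunov equation $H\beta + \beta^\intercal H = -I$, whose unique solution is symmetric positive definite when $\beta$ is stable. The normalization to $-I$ on the right-hand side and the function names are choices made for this illustration; the proof only needs some $H \succ 0$ with $-(H\beta + \beta^\intercal H) \succ 0$.

```python
import numpy as np


def is_stable(beta):
    """True if every eigenvalue of beta has a negative real part."""
    return bool(np.all(np.linalg.eigvals(beta).real < 0))


def lyapunov_matrix(beta):
    """Solve H beta + beta^T H = -I for H via a Kronecker-product linear system."""
    d = beta.shape[0]
    eye = np.eye(d)
    # With row-major vectorization, vec(A H B) = kron(A, B^T) vec(H), so
    # vec(beta^T H) = kron(beta^T, I) vec(H) and vec(H beta) = kron(I, beta^T) vec(H).
    lhs = np.kron(beta.T, eye) + np.kron(eye, beta.T)
    h = np.linalg.solve(lhs, -eye.reshape(-1))
    H = h.reshape(d, d)
    return 0.5 * (H + H.T)  # remove round-off asymmetry


if __name__ == "__main__":
    beta = np.array([[-1.0, 0.5], [0.0, -2.0]])  # hypothetical stable matrix
    H = lyapunov_matrix(beta)
    print(is_stable(beta), np.linalg.eigvalsh(H))
```

The same routine applies to the strong mean reversion condition of Theorem 2 by passing $\beta + \mathbb{E}(Z)\kappa^\intercal$ in place of $\beta$.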
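Moreover, by virtue of (12) and the fact that $-(H\beta + \beta^\intercal H) \succ 0$,

$$-2 x^\intercal H \beta x = -x^\intercal (H\beta + \beta^\intercal H) x \ge \underline{\gamma}\, \|x\|^2 \ge \underline{\gamma}\, \bar{\delta}^{-1} \|x\|_H^2,$$

where $\underline{\gamma} > 0$ is the smallest eigenvalue of $-(H\beta + \beta^\intercal H)$. Therefore,

$$\limsup_{\|x\|_H \to \infty} \mathcal{G}g(x) = \limsup_{\|x\|_H \to \infty} \frac{2 x^\intercal H \beta x}{\|x\|_H^2 \bigl(1 + \|x\|_H^{-2}\bigr)} \le -\underline{\gamma}\, \bar{\delta}^{-1}. \tag{24}$$

On the other hand, it is easy to see that $1 + (\|x\|_H + \|z\|_H)^2 \le 2(1 + \|x\|_H^2)(1 + \|z\|_H^2)$ for all $x, z \in \mathbb{R}^d$. Thus,

$$\log\left(\frac{1 + \|x + z\|_H^2}{1 + \|x\|_H^2}\right) \le \log\left(\frac{1 + (\|x\|_H + \|z\|_H)^2}{1 + \|x\|_H^2}\right) \le \log\bigl(2(1 + \|z\|_H^2)\bigr). \tag{25}$$

It is easy to see that $\log(2(1 + \|z\|_H^2))$ is integrable on $\mathcal{X}$ with respect to $\nu$, since $\mathbb{E}\log(1 + \|Z\|_H) < \infty$ if and only if $\mathbb{E}\log(1 + \|Z\|) < \infty$ in light of (12). Then, we move the left-hand side of (25) to the right-hand side and apply Fatou's lemma to obtain

$$\limsup_{\|x\|_H \to \infty} \int_{\mathcal{X}} \log\left(\frac{1 + \|x + z\|_H^2}{1 + \|x\|_H^2}\right) \nu(\mathrm{d}z) \le \int_{\mathcal{X}} \limsup_{\|x\|_H \to \infty} \log\left(\frac{1 + \|x + z\|_H^2}{1 + \|x\|_H^2}\right) \nu(\mathrm{d}z) = 0.$$

Since $\kappa = 0$,

$$\limsup_{\|x\|_H \to \infty} \mathcal{L}g(x) = \limsup_{\|x\|_H \to \infty} \lambda \int_{\mathcal{X}} \log\left(\frac{1 + \|x + z\|_H^2}{1 + \|x\|_H^2}\right) \nu(\mathrm{d}z) \le 0. \tag{26}$$

We then conclude from (24) and (26) that there exists $k > 0$ such that

$$\mathcal{A}g(x) = \mathcal{G}g(x) + \mathcal{L}g(x) \le -\frac{\underline{\gamma}\, \bar{\delta}^{-1}}{2}$$

for all $x \in \mathcal{X}$ with $\|x\|_H > k$. Then, it is easy to check that the inequality (21) holds by setting $K = \{x \in \mathcal{X} : \|x\|_H \le k\}$, $c_1 = \underline{\gamma}\, \bar{\delta}^{-1}/2$, and $c_2 = \max\{1, \sup_{x \in K}(\mathcal{A}g(x) + c_1)\}$.

We are now ready to show $\mathbb{P}_x(\tau_K < \infty) = 1$ for all $x \in \mathcal{X}$. Define $T_n = \inf\{t \ge 0 : |X(t)| > n\}$. It follows from (11) and (21) that

$$g(X(t \wedge T_n)) \le g(X(0)) + \int_0^{t \wedge T_n} \bigl[-c_1 + c_2\, \mathbb{I}_K(X(s-))\bigr]\,\mathrm{d}s + S_1(t \wedge T_n) + S_2(t \wedge T_n), \qquad n \ge 1. \tag{27}$$

Noting that $|X(t-)| \le n$ is bounded for $t \in [0, T_n)$, $(S_i(t \wedge T_n) : t \ge 0)$ is a martingale, $i = 1, 2$. It then follows from the optional sampling theorem that

$$\mathbb{E}_x[g(X(t \wedge \tau_K \wedge T_n))] \le g(x) - c_1 \mathbb{E}_x(t \wedge \tau_K \wedge T_n), \qquad x \in \mathcal{X} \setminus K,\ n \ge 1.$$

Therefore,

$$c_1 \mathbb{E}_x(t \wedge \tau_K \wedge T_n) \le g(x), \qquad x \in \mathcal{X} \setminus K,\ n \ge 1,$$

since $g(x) \ge 0$ for all $x \in \mathcal{X}$. Note that $X$ is non-explosive, so $T_n \to \infty$ as $n \to \infty$ $\mathbb{P}_x$-a.s. for all $x \in \mathcal{X}$. Therefore, by sending $n \to \infty$ and then sending $t \to \infty$, we conclude from the monotone convergence theorem that $c_1 \mathbb{E}_x(\tau_K) \le g(x)$ for $x \in \mathcal{X} \setminus K$. Hence, $\mathbb{P}_x(\tau_K < \infty) = 1$ for all $x \in \mathcal{X}$. Consequently, $X$ is Harris recurrent by Theorem 1.1 of Meyn and Tweedie (1993a).

Theorem 1.2 of Meyn and Tweedie (1993a) states that given the Harris recurrence, $X$ is positive Harris recurrent if $\sup_{x \in K} \mathbb{E}_x(\tau_K(\Delta)) < \infty$. We now show this is indeed the case. For any $\Delta > 0$, let $\tau_K(\Delta) := \Delta + \Theta_\Delta \circ \tau_K$ be the first hitting time on $K$ after $\Delta$, where $\Theta_\Delta$ is the shift operator; see Sharpe (1988, p.8). Then,

$$\mathbb{E}_x(\tau_K(\Delta) - \Delta) = \int_{\mathcal{X}} \mathbb{P}_x(X(\Delta) \in \mathrm{d}y)\, \mathbb{E}_y(\tau_K) \le \int_{\mathcal{X}} c_1^{-1} g(y)\, \mathbb{P}_x(X(\Delta) \in \mathrm{d}y) = c_1^{-1} \mathbb{E}_x g(X(\Delta)), \tag{28}$$

for all $x \in \mathcal{X}$. In addition, it follows from (27) that

$$\mathbb{E}_x g(X(\Delta \wedge T_n)) \le g(x) + (c_2 - c_1)\, \mathbb{E}_x(\Delta \wedge T_n), \qquad x \in \mathcal{X},\ n \ge 1.$$

By Fatou's lemma,

$$\mathbb{E}_x g(X(\Delta)) \le \liminf_{n \to \infty} \mathbb{E}_x g(X(\Delta \wedge T_n)) \le g(x) + (c_2 - c_1)\Delta, \qquad x \in \mathcal{X}. \tag{29}$$

Combining (28) and (29) yields that

$$\mathbb{E}_x(\tau_K(\Delta)) \le c_1^{-1}\bigl(g(x) + c_2 \Delta\bigr), \qquad x \in \mathcal{X}.$$

Hence, $\sup_{x \in K} \mathbb{E}_x(\tau_K(\Delta)) < \infty$, which implies that $X$ is positive Harris recurrent by Theorem 1.2 of Meyn and Tweedie (1993a).

Finally, Theorem 6.1 of Meyn and Tweedie (1993b) asserts that if $X^\Delta$ is $\varphi$-irreducible, which is true by Proposition 1, then a positive Harris recurrent process is ergodic, i.e. (3) holds.

Proof of Theorem 2(i).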
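Following the proof of Theorem 1(i), it suffices to show that the Lyapunov inequality (21) holds under the assumptions of Theorem 2(i). In fact, we prove the following stronger result

$$\mathcal{A}g(x) \le -c_1 g(x) + c_2\, \mathbb{I}_K(x), \qquad x \in \mathcal{X}, \tag{30}$$

for some compact set $K$ and some positive finite constants $c_1$ and $c_2$, where $g(x) = (1 + \|x\|_H^2)^{p/2}$ for some $d \times d$ matrix $H \succ 0$ and $p \ge 1$.

Since $\mathbb{E}\|Z\| < \infty$, there exists $p \ge 1$ such that $\mathbb{E}\|Z\|^p < \infty$. Since $\beta + \mathbb{E}(Z)\kappa^\intercal$ is stable, there exists a matrix $H \succ 0$ such that

$$-\bigl[H(\beta + \mathbb{E}(Z)\kappa^\intercal) + (\beta + \mathbb{E}(Z)\kappa^\intercal)^\intercal H\bigr] \succ 0. \tag{31}$$

It is straightforward to calculate the gradient and Hessian of $g(\cdot)$ as follows

$$\nabla g(x) = \frac{p\, g(x)}{1 + \|x\|_H^2}\, Hx \qquad \text{and} \qquad \nabla^2 g(x) = \frac{p\, g(x)}{1 + \|x\|_H^2} \left[H + \frac{(p-2)\, H x x^\intercal H}{1 + \|x\|_H^2}\right].$$

It then follows from (23) that as $\|x\|_H \to \infty$,

$$\begin{aligned}
\mathcal{G}g(x) &= \frac{p\, g(x)}{1 + \|x\|_H^2} \left[x^\intercal H (b + \beta x) + \frac{1}{2} \sum_{i,j=1}^d \left(a_{ij} + \sum_{k=1}^d \alpha_{k,ij} x_k\right) \left(H + \frac{(p-2)\, H x x^\intercal H}{1 + \|x\|_H^2}\right)_{ij}\right] \\
&= p\, g(x) \left(\frac{x^\intercal H \beta x}{\|x\|_H^2} + o(1)\right).
\end{aligned} \tag{32}$$

To analyze the asymptotic behavior of $\mathcal{L}g(x)$, we apply the mean value theorem, namely

$$g(x + z) - g(x) = \nabla g(\xi)^\intercal z = p\, (1 + \|\xi\|_H^2)^{p/2 - 1}\, \xi^\intercal H z,$$

where $\xi = x + uz$ for some $u \in (0, 1)$. Note that $\|\xi\|_H$ lies between $\|x\|_H$ and $\|x + z\|_H$, and $\xi^\intercal H z \kappa^\intercal x$ lies between $x^\intercal H z \kappa^\intercal x$ and $(x + z)^\intercal H z \kappa^\intercal x$. It then follows that

$$\frac{\kappa^\intercal x\, \bigl(g(x + z) - g(x)\bigr)}{g(x)} = p \cdot \frac{(1 + \|\xi\|_H^2)^{p/2 - 1}}{(1 + \|x\|_H^2)^{p/2}} \cdot \xi^\intercal H z \kappa^\intercal x \sim p \cdot \frac{x^\intercal H z \kappa^\intercal x}{\|x\|_H^2} \tag{33}$$

as $\|x\|_H \to \infty$ for all $z \in \mathbb{R}^d$. Moreover,

$$\begin{aligned}
|g(x + z) - g(x)| &= p\, (1 + \|\xi\|_H^2)^{p/2 - 1}\, |z^\intercal H \xi| \\
&\le p\, (1 + \|\xi\|_H^2)^{p/2 - 1}\, \|z\|\, \|H\xi\| \\
&\le p\, \underline{\delta}^{-1} (1 + \|\xi\|_H^2)^{p/2 - 1}\, \|z\|_H\, |\!|\!|H|\!|\!|_H\, \|\xi\|_H \\
&\le p\, \underline{\delta}^{-1} (1 + \|\xi\|_H^2)^{p/2 - 1/2}\, \|z\|_H\, |\!|\!|H|\!|\!|_H,
\end{aligned} \tag{34}$$

where the second inequality follows from (12).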
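For intuition about the matrix appearing in (31), one can apply the operator $\mathcal{A}$ from (10) to the coordinate functions $g_i(x) = x_i$. The short computation below is an illustrative aside rather than part of the original argument; it assumes $\mathbb{E}\|Z\| < \infty$, so that $\mathbb{E}(Z)$ is well defined, and it sets aside the integrability caveats attached to (10).

```latex
% Coordinate functions g_i(x) = x_i have gradient e_i and zero Hessian, so by (10):
\mathcal{G} g_i(x) = (b + \beta x)_i ,
\qquad
\mathcal{L} g_i(x) = (\lambda + \kappa^\intercal x)\, \mathbb{E}(Z_i) .
% Stacking i = 1, ..., d gives the mean drift of X with the jumps included:
\bigl(\mathcal{A} g_1(x), \ldots, \mathcal{A} g_d(x)\bigr)^\intercal
  = \bigl(b + \lambda\, \mathbb{E}(Z)\bigr)
  + \bigl(\beta + \mathbb{E}(Z)\kappa^\intercal\bigr) x .
% Scalar case d = m = 1 with H = 1: condition (31) reads
-2\bigl(\beta + \kappa\, \mathbb{E}(Z)\bigr) > 0 .
```

In words, $\beta + \mathbb{E}(Z)\kappa^\intercal$ is the linear part of the mean drift once the average effect of the jumps is included, which suggests why its stability, rather than that of $\beta$ alone, is the natural requirement when the jump intensity is state-dependent.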
So (cid:12)(cid:12)(cid:12)(cid:12) κ ⊺ x ( g ( x + z ) − g ( x )) g ( x ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ | κ || x | p ¯ δ − (1 + || ξ || H ) p/ − / || z || H ||| H ||| H (1 + || x || H ) p/ ≤ p ¯ δ − ||| H ||| H || κ || H || z || H (1 + || ξ || H ) p/ − / (1 + || x || H ) p/ − / , where the second inequality follows from (12). Note that1 + || ξ || H = 1 + || x + uz || H ≤ || x || H )(1 + || uz || H ) ≤ || x || H )(1 + || z || H ) , so Z X (cid:12)(cid:12)(cid:12)(cid:12) κ ⊺ x ( g ( x + z ) − g ( x )) g ( x ) (cid:12)(cid:12)(cid:12)(cid:12) ν (d z ) ≤ p/ − / p ¯ δ − ||| H ||| H || κ || H Z X (1 + || z || H ) p/ ν (d z ) < ∞ . (35)By (33) and (35), the dominated convergence theorem dictates that κ ⊺ x Z X ( g ( x + z ) − g ( x )) ν (d z ) ∼ pg ( x ) · Z x ⊺ Hzκ ⊺ x || x || H ν (d z ) = pg ( x ) · x ⊺ H E ( Z ) κ ⊺ x || x || H , and thus L g ( x ) = ( λ + κ ⊺ x ) Z X ( g ( x + z ) − g ( x )) ν (d z ) ∼ pg ( x ) x ⊺ H E ( Z ) κ ⊺ x || x || H , (36)as || x || H → ∞ . Combining (32) and (36), A g ( x ) = G g ( x ) + L g ( x ) = pg ( x ) x ⊺ H ( β + E ( Z ) κ ⊺ ) x || x || H + o (1) ! Here, we use the notation that f ( x ) ∼ g ( x ) if lim || x || H →∞ f ( x ) g ( x ) = 1 . || x || H → ∞ . By (31), the definition of the matrix H , − x ⊺ H ( β + E ( Z ) κ ⊺ ) x = − x ⊺ [ H ( β + E ( Z ) κ ⊺ ) + ( β + E ( Z ) κ ⊺ ) ⊺ H ] x ≥ γ || x || ≥ γ ¯ δ − || x || H , where ¯ γ > − [ H ( β + E ( Z ) κ ⊺ ) + ( β + E ( Z ) κ ⊺ ) ⊺ H ]. Hence, thereexists k > A g ( x ) ≤ − p ¯ γ ¯ δ − g ( x ) for all x ∈ X with || x || H > k . Therefore, (30) holdsby setting K = { x ∈ X : || x || H ≤ k } , c = p ¯ γ ¯ δ − /
4, and c = max { , sup x ∈ K ( A g ( x ) + c g ( x )) } . Proof of Theorem 1(ii).
Note that if $\mathbb{E}\|Z\|^p < \infty$ for some $p > 0$, then $\mathbb{E}\|Z\|^q < \infty$ for all $q \in (0, p]$. We assume that $p \in (0, 1)$, since the case $p \ge 1$ can be treated as in the proof of Theorem 2(ii). Since $\beta$ is stable, there exists a matrix $H \succ 0$ such that $-(H\beta + \beta^\intercal H) \succ 0$. We show that $g_q(x) = (1 + \|x\|_H^2)^{q/2}$ satisfies the inequality (30) for some compact set $K$ and some positive finite constants $c_1$, $c_2$. Note that
\[ g_q(x+z) - g_q(x) \le (1 + \|x\|_H^2 + \|z\|_H^2)^{q/2} - (1 + \|x\|_H^2)^{q/2} = \frac{q}{2}\, \xi^{q/2 - 1} \|z\|_H^q, \]
where the equality follows from the mean value theorem and $\xi \in (1 + \|x\|_H^2,\ 1 + \|x\|_H^2 + \|z\|_H^2)$. Since $\xi > 1$ and $p \in (0, 1)$, $g_q(x+z) - g_q(x) \le q \|z\|_H^q$. Likewise, it can be shown that $g_q(x) - g_q(x+z) \le q \|z\|_H^q$. Hence, $|g_q(x+z) - g_q(x)| \le q \|z\|_H^q$ and
\[ \left| \int_{\mathcal{X}} (g_q(x+z) - g_q(x))\, \nu(\mathrm{d}z) \right| \le \int_{\mathcal{X}} |g_q(x+z) - g_q(x)|\, \nu(\mathrm{d}z) \le q\, \mathbb{E}\|Z\|_H^q < \infty. \]
It follows that with $\kappa = 0$,
\[ \mathcal{L} g_q(x) = \lambda \int_{\mathcal{X}} (g_q(x+z) - g_q(x))\, \nu(\mathrm{d}z) = O(1), \]
as $\|x\|_H \to \infty$. Moreover, applying (32) to $g_q(x)$,
\[ \mathcal{A} g_q(x) = \mathcal{G} g_q(x) + \mathcal{L} g_q(x) = q\, g_q(x) \left( \frac{x^\intercal H \beta x}{\|x\|_H^2} + o(1) \right), \]
as $\|x\|_H \to \infty$. By the definition of the matrix $H$,
\[ -x^\intercal H \beta x = -\tfrac{1}{2} x^\intercal (H\beta + \beta^\intercal H) x \ge \tfrac{1}{2} \bar{\gamma} \|x\|^2 \ge \tfrac{1}{2} \bar{\gamma} \bar{\delta}^{-1} \|x\|_H^2, \]
where $\bar{\gamma} > 0$ is the smallest eigenvalue of $-(H\beta + \beta^\intercal H)$. Hence, there exists $k > 0$ such that $\mathcal{A} g_q(x) \le -\tfrac{1}{4} q \bar{\gamma} \bar{\delta}^{-1} g_q(x)$ for all $x \in \mathcal{X}$ with $\|x\|_H > k$. Therefore,
\[ \mathcal{A} g_q(x) \le -c_1 g_q(x) + c_2 \mathbb{I}_K(x), \quad x \in \mathcal{X}, \tag{37} \]
where $K = \{x \in \mathcal{X} : \|x\|_H \le k\}$, $c_1 = q \bar{\gamma} \bar{\delta}^{-1}/4$, and $c_2 = \max\{1, \sup_{x \in K} (\mathcal{A} g_q(x) + c_1 g_q(x))\}$.

We apply Itô's formula to $e^{c_1 t} g_q(X(t))$. In particular, by (11),
\[ \begin{aligned} e^{c_1 t} g_q(X(t)) = {} & g_q(X(0)) + \int_0^t e^{c_1 s} [c_1 g_q(X(s-)) + \mathcal{A} g_q(X(s-))]\, \mathrm{d}s + \int_0^t e^{c_1 s} \nabla g_q(X(s-))^\intercal \sigma(X(s))\, \mathrm{d}W(s) \\ & + \int_0^t e^{c_1 s} \int_{\mathcal{X}} (g_q(X(s-) + z) - g_q(X(s-)))\, \tilde{N}(\mathrm{d}s, \mathrm{d}z). \end{aligned} \]
Clearly, the two stochastic integrals above are both martingales up to time $T_n$, where $T_n = \inf\{t \ge 0 : \|X(t)\| > n\}$. It follows from (37) and the optional sampling theorem that
\[ e^{c_1 t}\, \mathbb{E}_x g_q(X(t \wedge T_n)) \le g_q(x) + \mathbb{E}_x \int_0^{t \wedge T_n} e^{c_1 s} \cdot c_2 \mathbb{I}_K(X(s))\, \mathrm{d}s \le g_q(x) + c_2 c_1^{-1}\, \mathbb{E}_x e^{c_1 (t \wedge T_n)}. \]
We now apply Fatou's lemma and the monotone convergence theorem to conclude that
\[ e^{c_1 t}\, \mathbb{E}_x g_q(X(t)) \le g_q(x) + c_2 c_1^{-1} \cdot \liminf_{n \to \infty} \mathbb{E}_x e^{c_1 (t \wedge T_n)} = g_q(x) + c_2 c_1^{-1} e^{c_1 t}. \tag{38} \]
Then we can adopt the argument used in the proof of Theorem 6.1 of Meyn and Tweedie (1993c) to conclude that, because of (38), there exist positive finite constants $d_q$ and $\rho_q$ such that
\[ \| \mathbb{P}_x(X(t) \in \cdot) - \pi(\cdot) \|_{g_q + 1} \le d_q (g_q(x) + 1) e^{-\rho_q t}, \quad t \ge 0,\ x \in \mathcal{X}. \]
By (12), there exist positive constants $d_1$ and $d_2$ such that $d_1 \le \left| \frac{f_q(x)}{g_q(x) + 1} \right| \le d_2$ for all $x \in \mathcal{X}$. Hence,
\[ \| \mathbb{P}_x(X(t) \in \cdot) - \pi(\cdot) \|_{f_q} \le c_q f_q(x) e^{-\rho_q t}, \quad t \ge 0,\ x \in \mathcal{X}, \]
where $c_q = d_q d_2 / d_1$.

Proof of Theorem 2(ii).
Following the proof of Theorem 1(ii), it suffices to show that (37) holds under the present assumptions. Note that $\mathbb{E}\|Z\|^q < \infty$ for all $q \in [1, p]$ since $\mathbb{E}\|Z\|^p < \infty$. Hence, we can apply the Lyapunov inequality (30) to $g_q(x)$, which results in (37).

The key condition that we impose to establish positive Harris recurrence of $X$ is the strong mean-reversion condition, i.e., $\beta + \mathbb{E}(Z)\kappa^\intercal$ is a stable matrix. Indeed, this condition cannot be relaxed in general, as illustrated by the following example.

Proposition 3.
Suppose that $d = 1$, $m = 1$, and Assumptions 1–3 hold. If $\mathbb{E}|Z| < \infty$ and $\beta + \mathbb{E}(Z)\kappa > 0$, then $X$ is transient.

Proof. The proof also relies on Lyapunov inequalities; see Theorem 3.3 of Stramer and Tweedie (1994). Specifically, transience follows if there exist a bounded function $g$ and a closed set $K$ such that
\[ \mathcal{A} g(x) \ge 0, \quad x \in \mathcal{X} \setminus K, \tag{39} \]
and
\[ \sup_{y \in K} g(y) < g(x), \quad x \in \mathcal{X} \setminus K. \tag{40} \]
Let $g(x) = 1 - e^{-\epsilon x}$ for some $\epsilon > 0$. Obviously, $g$ is bounded for $x \in \mathcal{X} = \mathbb{R}_+$. Then,
\[ \begin{aligned} \mathcal{A} g(x) &= (b + \beta x) g'(x) + \frac{1}{2} (a + \alpha x) g''(x) + (\lambda + \kappa x) \int_{\mathbb{R}_+} (g(x+z) - g(x))\, \nu(\mathrm{d}z) \\ &= e^{-\epsilon x} \left[ \epsilon (b + \beta x) - \frac{1}{2} \epsilon^2 (a + \alpha x) + (\lambda + \kappa x) \int_{\mathbb{R}_+} (1 - e^{-\epsilon z})\, \nu(\mathrm{d}z) \right] \\ &= e^{-\epsilon x} \left[ \left( \epsilon \beta - \frac{1}{2} \epsilon^2 \alpha + \kappa (1 - \mathbb{E} e^{-\epsilon Z}) \right) x + \epsilon b - \frac{1}{2} \epsilon^2 a + \lambda (1 - \mathbb{E} e^{-\epsilon Z}) \right]. \end{aligned} \]
Let $h(\epsilon)$ be the coefficient of $x$ in the brackets above, i.e., $h(\epsilon) := \epsilon \beta - \frac{1}{2} \epsilon^2 \alpha + \kappa (1 - \mathbb{E} e^{-\epsilon Z})$. Clearly, $h(0) = 0$ and $h'(0) = \beta + \kappa \mathbb{E}(Z) > 0$, yielding that $h(\epsilon) > 0$ for some $\epsilon > 0$. Fixing this $\epsilon$, we see that $\mathcal{A} g(x) \sim e^{-\epsilon x} h(\epsilon) x$ as $x \to \infty$. Hence, there exists $k > 0$ such that $\mathcal{A} g(x) > 0$ for all $x \in \mathcal{X} \setminus K$, where $K := [0, k]$, proving (39). Moreover, (40) is true since $g(x)$ is increasing in $x$.

The "boundary" case, i.e. $\beta + \mathbb{E}(Z)\kappa^\intercal = 0$, is more complicated as the behavior of the process may depend on other parameters. We leave its analysis for future research.

In this section, we prove SLLNs and FCLTs for additive functionals of $X$ of the form $\int_0^t h(X(s))\, \mathrm{d}s$ or $\sum_{i=1}^n h(X(i\Delta))$ for some function $h$. Limit theorems for both discrete-time and continuous-time Markov processes have been extensively studied in the past; see, e.g., Glynn and Meyn (1996), Kontoyiannis and Meyn (2003), Meyn and Tweedie (2009, chap.17), and references therein. In particular, positive Harris recurrence is "almost" sufficient for a LLN to hold. Conditions for FCLTs, on the other hand, often include exponential ergodicity, or Lyapunov inequalities of a form similar to (30).

Nevertheless, existing FCLTs for discrete-time Markov processes are not applicable to the skeleton chain $X^\Delta$ because they typically require one to establish a "discrete-time" version of the Lyapunov inequality of the form $\mathbb{E}_x[g(X(\Delta))] \le c\, g(x)$ for some constant $c < 1$, some function $g \ge 1$, and all $x$ off a compact set. This is awkward mathematically given the fact that the transition measure $\mathbb{P}_x(X(\Delta) \in \cdot)$ is not known explicitly. Our approach to establishing (8) is to first consider the scenario in which $X(0)$ follows the stationary distribution. We then apply an FCLT for stationary sequences, i.e., Theorem 3.1 of Ethier and Kurtz (1986, p.351), whose conditions can be verified as a consequence of exponential ergodicity (4). To generalize the FCLT to an arbitrary initial state, we follow an argument similar to one used in Glynn and Meyn (1996).

The asymptotic variances, $\sigma_h^2$ in (7) and $\gamma_h^2$ in (8), can be expressed in terms of the solution to a Poisson equation; see, e.g., Glynn and Meyn (1996). But this solution typically has no closed form in terms of the parameters $(a, \alpha_1, \ldots, \alpha_d, b, \beta, \lambda, \kappa, \nu)$ of the SDE (1). However, when $h$ is the (vector-valued) identity function, we are indeed able to analytically derive both the asymptotic mean and the asymptotic covariance matrix that appear in the corresponding FCLT (see Corollary 1), thanks to the tractable affine structure.

Proof of Theorem 1(iii) and Theorem 2(iii).
We have established positive Harris recurrence and ergodicity of $X$ in Section 3.1 under the assumptions of Theorem 1(iii) or Theorem 2(iii). So $\pi(|h|) < \infty$ for any measurable function $h : \mathcal{X} \mapsto \mathbb{R}$ with $\|h\|_{f_p} < \infty$. The SLLN (5) then follows from Theorem 2 of Sigman (1990).

For the skeleton chain $X^\Delta$, note that the stationary distribution $\pi$ of $X$ is necessarily invariant for $X^\Delta$. In addition, $X^\Delta$ is $\varphi$-irreducible by Proposition 1, so $X^\Delta$ is positive Harris recurrent. Hence, the SLLN (6) follows from Theorem 17.1.7 of Meyn and Tweedie (2009, p.427).

Proof of Theorem 1(iv) and Theorem 2(iv).
Fix $q > 2$ and $h : \mathcal{X} \mapsto \mathbb{R}$ with $\|h^q\|_{f_p} < \infty$. We have shown in Section 3.2 that there exist a matrix $H \succ 0$, a compact set $K$, and positive finite constants $c_1$, $c_2$ such that $\mathcal{A} g(x) \le -c_1 g(x) + c_2 \mathbb{I}_K(x)$ for all $x \in \mathcal{X}$, where $g(x) = (1 + \|x\|_H^2)^{p/2}$. Thanks to (12), $\|h\|_{f_p} < \infty$ if and only if $\|h\|_g < \infty$. Moreover, we have shown in Section 3.1 that $K$ is a petite set for $X$. It then follows immediately from Theorem 4.4 of Glynn and Meyn (1996) that (7) holds as $n \to \infty$ $\mathbb{P}_x$-weakly in $\mathcal{D}[0, 1]$ for all $x \in \mathcal{X}$.

We now show that (8) holds $\mathbb{P}_\pi$-weakly in $\mathcal{D}[0, 1]$, where $\pi$ is the stationary distribution of $X$. This can be done by applying an FCLT for stationary sequences to $\{\bar{h}(X(n\Delta)) : n = 0, 1, \ldots\}$, which is a mean-zero stationary sequence if $X(0) \sim \pi$, where $\bar{h}(x) := h(x) - \pi(h)$.

Specifically, let $\mathcal{F}_k$ and $\mathcal{F}^k$ denote the $\sigma$-algebras generated by $(X(n\Delta) : n \le k)$ and $(X(n\Delta) : n \ge k)$, respectively. Let $\varphi(l) := \sup_{\Gamma \in \mathcal{F}^{k+l}} \mathbb{E}_\pi |\mathbb{P}(\Gamma \mid \mathcal{F}_k) - \mathbb{P}(\Gamma)|$ denote the measure of mixing (Ethier and Kurtz 1986, p.346) of $\mathcal{F}_k$ and $\mathcal{F}^{k+l}$ associated with the $L^1$-norm. Then, by Theorem 3.1 and Remark 3.2(b) of Ethier and Kurtz (1986, p.351), it suffices to verify that for some $\epsilon > 0$,
\[ \mathbb{E}_\pi \left[ \left| \bar{h}(X(n\Delta)) \right|^{2+\epsilon} \right] < \infty \quad \text{and} \quad \sum_{l=0}^\infty [\varphi(l)]^{\epsilon/(2+\epsilon)} < \infty. \tag{41} \]
Let $\epsilon = q - 2 > 0$. Then,
\[ \mathbb{E}_\pi \left[ \left| \bar{h}(X(n\Delta)) \right|^{2+\epsilon} \right] = \pi(|\bar{h}|^q) \le \| \bar{h}^q \|_{f_p}\, \pi(f_p) < \infty, \]
verifying the first condition in (41). To verify the second, note that by the Markov property, for any $\Gamma \in \mathcal{F}^{k+l}$ there exists a function $w_\Gamma$ with $|w_\Gamma(\cdot)| \le 1$ such that $\mathbb{P}[\Gamma \mid \mathcal{F}_{k+l}] = w_\Gamma(X((k+l)\Delta))$. If $X(0) \sim \pi$, then for any $\Gamma \in \mathcal{F}^{k+l}$,
\[ \begin{aligned} |\mathbb{P}(\Gamma \mid \mathcal{F}_k) - \mathbb{P}(\Gamma)| &= |\mathbb{E}[w_\Gamma(X((k+l)\Delta)) \mid \mathcal{F}_k] - \mathbb{E}[w_\Gamma(X((k+l)\Delta))]| \\ &= \left| \int_{\mathcal{X}} w_\Gamma(y)\, \mathbb{P}_{X(k\Delta)}(X(l\Delta) \in \mathrm{d}y) - \int_{\mathcal{X}} \int_{\mathcal{X}} w_\Gamma(y)\, \mathbb{P}_x(X((k+l)\Delta) \in \mathrm{d}y)\, \pi(\mathrm{d}x) \right| \\ &\le \left\| \mathbb{P}_{X(k\Delta)}(X(l\Delta) \in \cdot) - \pi(\cdot) \right\|, \end{aligned} \]
where $\|\cdot\|$ is the total variation norm, and the inequality follows from Definition 4 and the fact that $|w_\Gamma(\cdot)| \le 1$. It follows that
\[ \varphi(l) \le \mathbb{E}_\pi \left\| \mathbb{P}_{X(k\Delta)}(X(l\Delta) \in \cdot) - \pi(\cdot) \right\| \le c_p e^{-\rho_p l \Delta}\, \mathbb{E}_\pi[f_p(X(k\Delta))] = c_p\, \pi(f_p)\, e^{-\rho_p l \Delta}, \]
where the second inequality holds because of Theorem 1(ii) and Theorem 2(ii). This immediately implies $\sum_{l=0}^\infty [\varphi(l)]^{\epsilon/(2+\epsilon)} < \infty$. Therefore, we conclude that (8) holds $\mathbb{P}_\pi$-weakly in $\mathcal{D}[0, 1]$.

It remains to show that (8) holds as $n \to \infty$ $\mathbb{P}_x$-weakly in $\mathcal{D}[0, 1]$ for all $x \in \mathcal{X}$. To that end, we first show that
\[ \mathbb{P}_x \left( \lim_{n \to \infty} \sup_{0 \le t \le 1} |Y_n(t) - Y_{n,l}(t)| = 0 \right) = 1, \quad x \in \mathcal{X}, \tag{42} \]
for any positive integer $l$, where $Y_n(t) := n^{-1/2} \sum_{i=1}^{\lfloor nt \rfloor} \bar{h}(X(i\Delta))$ and $Y_{n,l}(t) := n^{-1/2} \sum_{i=l+1}^{\lfloor nt \rfloor + l} \bar{h}(X(i\Delta))$. Note that for all sufficiently large $n$,
\[ \begin{aligned} \sup_{0 \le t \le 1} |Y_n(t) - Y_{n,l}(t)| &= \frac{1}{\sqrt{n}} \left| \sum_{i=1}^{n} \bar{h}(X(i\Delta)) - \sum_{i=l+1}^{n+l} \bar{h}(X(i\Delta)) \right| \\ &= \frac{1}{\sqrt{n}} \left| \sum_{i=1}^{l} \bar{h}(X(i\Delta)) - \sum_{i=n+1}^{n+l} \bar{h}(X(i\Delta)) \right| \\ &\le \frac{1}{\sqrt{n}} \sum_{i=1}^{l} \left| \bar{h}(X(i\Delta)) \right| + \frac{1}{\sqrt{n}} \sum_{i=n+1}^{n+l} \left| \bar{h}(X(i\Delta)) \right| \to 0, \quad \mathbb{P}_x\text{-a.s.}, \end{aligned} \]
as $n \to \infty$ for all $x \in \mathcal{X}$, because $\frac{1}{n} \sum_{i=1}^n \bar{h}(X(i\Delta)) \to \pi(\bar{h}) < \infty$, $\mathbb{P}_x$-a.s., as $n \to \infty$ for all $x \in \mathcal{X}$, thanks to Theorem 1(iii) and Theorem 2(iii). This completes the proof of (42).

Let $\phi$ be a bounded continuous functional on $\mathcal{D}[0, 1]$. It follows from (42) that for each positive integer $l$, $|\mathbb{E}_x[\phi(Y_n)] - \mathbb{E}_x[\phi(Y_{n,l})]| \to 0$ as $n \to \infty$ for all $x \in \mathcal{X}$. This limit can be rewritten as
\[ \lim_{n \to \infty} \left| \mathbb{E}_x[\phi(Y_n)] - \int_{\mathcal{X}} \mathbb{P}_x(X(l\Delta) \in \mathrm{d}y)\, \mathbb{E}_y[\phi(Y_{n,l})] \right| = 0, \quad x \in \mathcal{X}. \tag{43} \]
On the other hand, note that
\[ \left| \int_{\mathcal{X}} \mathbb{P}_x(X(l\Delta) \in \mathrm{d}y)\, \mathbb{E}_y[\phi(Y_{n,l})] - \mathbb{E}_\pi[\phi(Y_n)] \right| \le \| \mathbb{P}_x(X(l\Delta) \in \cdot) - \pi(\cdot) \| \cdot \sup_{g \in \mathcal{D}[0,1]} |\phi(g)|. \]
Since $\| \mathbb{P}_x(X(l\Delta) \in \cdot) - \pi(\cdot) \| \to 0$ as $l \to \infty$ by Theorem 1(i) and Theorem 2(i), for any $\delta > 0$ we can choose $l$ so large that
\[ \left| \int_{\mathcal{X}} \mathbb{P}_x(X(l\Delta) \in \mathrm{d}y)\, \mathbb{E}_y[\phi(Y_{n,l})] - \mathbb{E}_\pi[\phi(Y_n)] \right| \le \delta. \tag{44} \]
It then follows from (43) and (44) that $\limsup_{n \to \infty} |\mathbb{E}_x[\phi(Y_n)] - \mathbb{E}_\pi[\phi(Y_n)]| \le \delta$. Since (8) holds $\mathbb{P}_\pi$-weakly in $\mathcal{D}[0, 1]$, $\lim_{n \to \infty} |\mathbb{E}_\pi[\phi(Y_n)] - \mathbb{E}_\pi[\phi(W)]| = 0$, and thus
\[ \limsup_{n \to \infty} |\mathbb{E}_x[\phi(Y_n)] - \mathbb{E}_\pi[\phi(W)]| \le \delta. \]
Sending $\delta \to 0$ shows that (8) holds $\mathbb{P}_x$-weakly in $\mathcal{D}[0, 1]$ for all $x \in \mathcal{X}$.

Thanks to the affine structure, the asymptotic mean and the asymptotic variance can be derived analytically when $h$ is the identity function, i.e. $h(x) = x$. Note that with $h$ being $\mathbb{R}^d$-valued, the corresponding SLLN and FCLT are multivariate. The calculation follows closely the approach used in Zhang et al. (2015), so we omit the details.

Corollary 1.
If Assumptions 1–3 hold and $\mathbb{E}\|Z\| < \infty$, then
\[ \mathbb{P}_x \left( \lim_{t \to \infty} \frac{1}{t} \int_0^t h(X(s))\, \mathrm{d}s = v \right) = 1, \quad x \in \mathcal{X}, \]
where $v = -(\beta + \mathbb{E}(Z)\kappa^\intercal)^{-1} (b + \lambda \mathbb{E}(Z))$. Furthermore, if $\mathbb{E}\|Z\|^{2+\epsilon} < \infty$ for some $\epsilon > 0$, then
\[ n^{1/2} \left( \frac{1}{n} \int_0^{n \cdot} X(s)\, \mathrm{d}s - v \right) \Rightarrow \Sigma^{1/2} W(\cdot), \]
as $n \to \infty$ $\mathbb{P}_x$-weakly in $\mathcal{D}_{\mathbb{R}^d}[0, 1]$ for all $x \in \mathcal{X}$, where
\[ \Sigma = A (a + \lambda \mathbb{E}(ZZ^\intercal)) A^\intercal + \sum_{i=1}^m v_i A (\alpha_i + \kappa_i \mathbb{E}(ZZ^\intercal)) A^\intercal. \]

Acknowledgments

The first author was partially supported by the Hong Kong Research Grants Council under General Research Fund (ECS 624112). The second author gratefully acknowledges the support and the intellectual environment of the Institute for Advanced Study at the City University of Hong Kong, where this work was completed.
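As a numerical illustration of the asymptotic mean in Corollary 1, the following sketch evaluates $v = -(\beta + \mathbb{E}(Z)\kappa^\intercal)^{-1}(b + \lambda \mathbb{E}(Z))$ for a two-dimensional example. All parameter values are hypothetical and chosen only so that $\beta + \mathbb{E}(Z)\kappa^\intercal$ is stable; the admissibility conditions in Assumptions 1–3 are not checked, and the helper names are ours rather than the paper's. The final check relies on the observation that the formula for $v$ is equivalent to requiring the mean drift $b + \beta v + (\lambda + \kappa^\intercal v)\mathbb{E}(Z)$ to vanish.

```python
# Hypothetical two-dimensional parameters (illustrative only, not from the paper).
b = [0.5, 0.2]                    # constant part of the drift
beta = [[-2.0, 0.3],
        [0.1, -1.5]]              # linear part of the drift
lam = 1.0                         # constant part of the jump intensity
kappa = [0.4, 0.2]                # linear part of the jump intensity
EZ = [0.5, 0.3]                   # mean jump size E(Z)


def mean_reversion_matrix(beta, EZ, kappa):
    """Return M = beta + E(Z) kappa^T as a nested list."""
    d = len(EZ)
    return [[beta[i][j] + EZ[i] * kappa[j] for j in range(d)] for i in range(d)]


def is_stable_2x2(M):
    """A real 2x2 matrix has both eigenvalues in the open left half-plane
    if and only if its trace is negative and its determinant is positive."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return trace < 0 and det > 0


def solve_2x2(M, rhs):
    """Solve M y = rhs by Cramer's rule (assumes det(M) != 0)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    y0 = (rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det
    y1 = (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det
    return [y0, y1]


def asymptotic_mean(b, beta, lam, kappa, EZ):
    """v = -(beta + E(Z) kappa^T)^{-1} (b + lam E(Z)), as in Corollary 1."""
    M = mean_reversion_matrix(beta, EZ, kappa)
    if not is_stable_2x2(M):
        raise ValueError("beta + E(Z) kappa^T is not stable")
    rhs = [-(b[i] + lam * EZ[i]) for i in range(2)]
    return solve_2x2(M, rhs)


def mean_drift(x, b, beta, lam, kappa, EZ):
    """Expected instantaneous change b + beta x + (lam + kappa^T x) E(Z)."""
    intensity = lam + sum(kappa[j] * x[j] for j in range(2))
    return [b[i] + sum(beta[i][j] * x[j] for j in range(2)) + intensity * EZ[i]
            for i in range(2)]


v = asymptotic_mean(b, beta, lam, kappa, EZ)
print("v =", v)
print("mean drift at v =", mean_drift(v, b, beta, lam, kappa, EZ))
```

The trace and determinant test is specific to the 2x2 case; in higher dimensions one would instead inspect the eigenvalues of $\beta + \mathbb{E}(Z)\kappa^\intercal$ with a numerical linear algebra library.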
References
Aït-Sahalia, Y. (2007). Estimating continuous-time models using discretely sampled data. In R. Blundell, P. Torsten, and W. K. Newey (Eds.), Advances in Economics and Econometrics, Theory and Applications, Ninth World Congress, Chapter 9. Cambridge University Press.
Aït-Sahalia, Y., J. Cacho-Diaz, and R. J. A. Laeven (2015). Modeling financial contagion using mutually exciting jump processes. J. Financ. Econ. 117(3), 585–606.
Andersen, L. B. G. and V. V. Piterbarg (2007). Moment explosions in stochastic volatility models. Financ. Stoch. 11, 29–50.
Barczy, M., L. Döring, Z. Li, and G. Pap (2014). Stationarity and ergodicity for an affine two-factor model. Adv. Appl. Probab. 46(3), 878–898.
Barndorff-Nielsen, O. E. and N. Shephard (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics. J. R. Statist. Soc. B 63(2), 167–241.
Bates, D. S. (2006). Maximum likelihood estimation of latent affine processes. Rev. Financ. Stud. 19(3), 909–965.
Berman, A. and R. J. Plemmons (1994). Nonnegative Matrices in the Mathematical Sciences. SIAM, Philadelphia.
Cheridito, P., D. Filipović, and R. L. Kimmel (2007). Market price of risk specification for affine models: Theory and evidence. J. Financ. Econ. 83, 123–170.
Collin-Dufresne, P., R. S. Goldstein, and C. S. Jones (2008). Identification of maximal affine term structure models. J. Finance 63(2), 743–759.
Cox, J. C., J. E. Ingersoll, and S. A. Ross (1985). A theory of the term structure of interest rates. Econometrica 53(2), 385–407.
Dai, Q. and K. J. Singleton (2000). Specification analysis of affine term structure models. J. Finance 55, 1943–1978.
Davies, E. B. (1989). Heat Kernels and Spectral Theory, Volume 92 of Cambridge Tracts in Mathematics. Cambridge University Press.
Dawson, D. A. and Z. Li (2006). Skew convolution semigroups and affine Markov processes. Ann. Probab. 34(3), 1103–1142.
Duffie, D., D. Filipović, and W. Schachermayer (2003). Affine processes and applications in finance. Ann. Appl. Probab. 13(3), 984–1053.
Duffie, D. and P. W. Glynn (2004). Estimation of continuous-time Markov processes sampled at random time intervals. Econometrica 72(6), 1773–1808.
Duffie, D. and R. Kan (1996). A yield-factor model of interest rates. Math. Finance 6, 379–406.
Duffie, D., J. Pan, and K. J. Singleton (2000). Transform analysis and asset pricing for affine jump-diffusions. Econometrica 68(6), 1343–1376.
Errais, E., K. Giesecke, and L. R. Goldberg (2010). Affine point processes and portfolio credit risk. SIAM J. Finan. Math. 1, 642–665.
Ethier, S. N. and T. G. Kurtz (1986). Markov Processes: Characterization and Convergence. John Wiley & Sons, Inc.
Filipović, D. and E. Mayerhofer (2009). Affine diffusion processes: Theory and applications. In H. Albrecher, W. Runggaldier, and W. Schachermayer (Eds.), Radon Ser. Comput. Appl. Math., Volume 8, pp. 1–40.
Filipović, D., E. Mayerhofer, and P. Schneider (2013). Density approximations for multivariate affine jump-diffusion processes. J. Econometrics 176, 93–111.
Gao, X., X. Zhou, and L. Zhu (2018). Transform analysis for Hawkes processes with applications in dark pool trading. Quant. Finance 18(2), 265–282.
Getoor, R. K. (1975). Markov Processes: Ray Processes and Right Processes. Lecture Notes in Mathematics. Springer-Verlag Berlin Heidelberg.
Glasserman, P. and K.-K. Kim (2010). Moment explosions and stationary distributions in affine diffusion models. Math. Finance 20(1), 1–33.
Glynn, P. W. and S. P. Meyn (1996). A Liapounov bound for solutions of the Poisson equation. Ann. Probab. 24(2), 916–931.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica 50, 1029–1054.
Hansen, L. P. and J. A. Scheinkman (1995). Back to the future: generating moment implications for continuous-time Markov processes. Econometrica 63(4), 767–804.
Hawkes, A. G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika 58, 83–90.
Heston, S. L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev. Financ. Stud. 6, 327–343.
Horn, R. A. and C. R. Johnson (2012). Matrix Analysis (2nd ed.). Cambridge University Press.
Jena, R. P., K.-K. Kim, and H. Xing (2012). Long-term and blow-up behaviors of exponential moments in multi-dimensional affine diffusions. Stoch. Proc. Appl. 122, 2961–2993.
Jin, P., J. Kremer, and B. Rüdiger (2017). Exponential ergodicity of an affine two-factor model based on the α-root process. Adv. Appl. Probab. 49, 1144–1169.
Jin, P., B. Rüdiger, and C. Trabelsi (2016). Positive Harris recurrence and exponential ergodicity of the basic affine jump-diffusion. Stoch. Anal. Appl. 34(1), 75–95.
Karatzas, I. and S. E. Shreve (1991). Brownian Motion and Stochastic Calculus (2nd ed.). Springer.
Keller-Ressel, M. (2011). Moment explosions and long-term behavior of affine stochastic volatility model. Math. Finance 21(1), 73–98.
Keller-Ressel, M., W. Schachermayer, and J. Teichmann (2011). Affine processes are regular. Probab. Theor. Relat. Field. 151, 591–611.
Kontoyiannis, I. and S. P. Meyn (2003). Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab. 13(1), 304–362.
Lee, R. W. (2004). The moment formula for implied volatility at extreme strikes. Math. Finance 14(3), 469–480.
Masuda, H. (2004). On multidimensional Ornstein-Uhlenbeck processes driven by a general Lévy process. Bernoulli 10, 97–120.
Meyn, S. P. and R. L. Tweedie (1992). Stability of Markovian processes I: criteria for discrete-time chains. Adv. Appl. Probab. 24, 542–574.
Meyn, S. P. and R. L. Tweedie (1993a). Generalized resolvents and Harris recurrence of Markov processes. Contemporary Mathematics 149, 227–250.
Meyn, S. P. and R. L. Tweedie (1993b). Stability of Markovian processes II: continuous-time processes and sampled chains. Adv. Appl. Probab. 25, 487–517.
Meyn, S. P. and R. L. Tweedie (1993c). Stability of Markovian processes III: Foster-Lyapunov criteria for continuous-time processes. Adv. Appl. Probab. 25, 518–548.
Meyn, S. P. and R. L. Tweedie (2009). Markov Chains and Stochastic Stability (2nd ed.). Cambridge University Press.
Protter, P. E. (2003). Stochastic Integration and Differential Equations (2nd ed.). Springer.
Sato, K.-i. and M. Yamazato (1984). Operator-self-decomposable distributions as limit distributions of processes of Ornstein–Uhlenbeck type. Stoch. Proc. Appl. 17, 73–100.
Sharpe, M. (1988). General Theory of Markov Processes. Academic Press.
Sigman, K. (1990). One-dependent regenerative processes and queues in continuous time. Math. Oper. Res. 15(1), 175–189.
Singleton, K. J. (2001). Estimation of affine asset pricing models using the empirical characteristic function. J. Econometrics 102(1), 111–141.
Stramer, O. and R. L. Tweedie (1994). Stability and instability of continuous time Markov processes. In F. P. Kelly (Ed.), Probability, Statistics and Optimization: A Tribute to Peter Whittle, pp. 173–184. John Wiley & Sons, Inc.
Vasicek, O. (1977). An equilibrium characterization of the term structure. J. Financ. Econ. 5, 177–188.
Zhang, X., J. Blanchet, K. Giesecke, and P. W. Glynn (2015). Affine point processes: Approximation and efficient simulation. Math. Oper. Res. 40(4), 797–819.