A Bond Option Pricing Formula in the Extended CIR Model, with an Application to Stochastic Volatility
Zheng Liu∗, Qidi Peng† and Henry Schellhorn‡

October 22, 2014
Abstract
We provide a complete representation of the interest rate in the extended CIR model. Maghsoodi (1996) proved that the representation of the CIR process as a sum of squares of independent Ornstein-Uhlenbeck processes is possible only when the dimension of the interest rate process is an integer, so we use a slightly different representation, valid when the dimension is not an integer. Our representation uses an infinite sum of squares of basic processes, each of which can be described as an Ornstein-Uhlenbeck process with jumps at fixed times. In this case, the price of a bond option resembles the Black-Scholes formula, with the normal distribution replaced by a new distribution that generalizes the non-central chi-square distribution. For practical purposes, however, the bond price is better calculated by inverting a Laplace transform. We generalize the result to a model where volatility is stochastic.
1 Introduction

The Cox-Ingersoll-Ross term structure model, in short the CIR model, was introduced in Cox et al. (1985a), (1985b). In this model, the spot interest rate r(t) is assumed to follow a squared Bessel process:

dr(t) = (−b(t)r(t) + θ(t)) dt + σ(t)√(r(t)) dB(t),   (1.1)
r(0) = r₀ > 0,

where B is a standard Brownian motion. In the original specification of Cox et al. (1985a), (1985b), the speed of mean reversion b ≥
0, the volatility σ > 0 and the drift parameter θ > 0 are constants; when they are functions of time, the model is called the extended CIR model.

Several features of the CIR model are particularly attractive. Firstly, it can be justified by general equilibrium considerations, see Cox et al. (1985a). Secondly, the interest rate is always positive and stationary; Cox et al. found that its distribution follows a non-central chi-square distribution. Finally, there is a closed-form formula for the bond price. For practitioners, however, the main shortcoming of the constant-parameters version of the model is that it cannot reproduce the original term structure of interest rates. This fact was highlighted by several authors (Hull (1990), Keller-Ressel and Steiner (2008), Yang (2006) and all the references therein): yield curves can only be normal, inverse, or humped. The extended CIR model, however, has enough parameters to be fitted to the original yield curve.

Maghsoodi (1996), Jamshidian (1995), and Rogers (1995) propose a representation of the extended CIR model as a sum of squares of Ornstein-Uhlenbeck processes when the dimension d ≡ 4θ(t)/σ²(t) is constant and integer. As a consequence, the interest rate follows the generalized chi-square distribution. Maghsoodi (1996) and Shirakawa (2002) also propose a representation of the interest rate as a time-changed lognormal process. However, as the latter author states, "it is difficult to derive the probability distribution of the squared Bessel processes with time-varying dimensions" when d(t) is any positive continuous function.

There are alternate approaches to extending the CIR model. Brigo and Mercurio (2001) find that a deterministic shift of the CIR model is analytically tractable.

∗ Morgan Stanley, New York City. E-mail: [email protected]
† Institute of Mathematical Sciences, Claremont Graduate University. E-mail: [email protected]
‡ Corresponding author, Institute of Mathematical Sciences, Claremont Graduate University. E-mail: [email protected]
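Before proceeding, a minimal numerical sketch may help fix ideas: the dynamics (1.1) can be simulated with a full-truncation Euler-Maruyama scheme. This is our illustration only; the parameter functions below are assumptions, not taken from the paper.

```python
import math
import random

def simulate_ecir(r0, b, theta, sigma, T, n_steps, rng=random.Random(0)):
    """Full-truncation Euler-Maruyama sketch of the extended CIR dynamics
    dr = (-b(t) r + theta(t)) dt + sigma(t) sqrt(r) dB."""
    dt = T / n_steps
    r = r0
    path = [r0]
    for k in range(n_steps):
        t = k * dt
        dB = rng.gauss(0.0, math.sqrt(dt))
        # evaluate the square root at max(r, 0) so the scheme is well defined
        r = r + (-b(t) * r + theta(t)) * dt + sigma(t) * math.sqrt(max(r, 0.0)) * dB
        path.append(r)
    return path

# Illustrative time-dependent parameters (assumptions, not from the paper):
path = simulate_ecir(0.03,
                     b=lambda t: 0.5 + 0.1 * t,
                     theta=lambda t: 0.04,
                     sigma=lambda t: 0.1,
                     T=1.0, n_steps=500)
```

With σ ≡ 0 the scheme reduces to an Euler solver for the linear ODE r′ = −b(t)r + θ(t), which gives a deterministic sanity check.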
The obvious drawback of making the parameters of the CIR model functions of time is the problem of overparameterization: the parameters will not be robust in a change of regime. Several authors consider instead a CIR model of interest rates with constant parameters and stochastic volatility. For instance, Longstaff and Schwartz (1992), and Duffie and Kan (1996), consider a generalized two-factor CIR model, where one factor is the interest rate and the other one is its volatility. The volatility in that model is a variation of the volatility in the popular Heston (1993) model. Cotton, Fouque, Papanicolaou and Sircar (2004) calculate an asymptotic expansion of the bond price in such a model (with constant parameters) when the speed of mean reversion is fast. Fouque and Lorig (2011) generalize this model to a model with a volatility of volatility. Finally, we note that several authors have generalized the CIR model in a different way, using multiple factors. We refer the reader to the references contained in Chen, Filipovic, and Poor (2004) and Gourieroux and Monfort (2011).

We then incorporate stochastic volatility v(t) in our extended CIR model. If b and θ are deterministic, and v(t) is independent of r(t), the distribution of the interest rate, conditional on volatility, is the same as in the deterministic volatility case. As an illustration of our approach, we give a semi-closed formula for the price of a bond option when the price of the underlying bond at time t can be expressed as a function of r(t), v(t) and time t. We also consider a model where volatility is correlated with the interest rate.

In the same way that it is not too difficult to generalize the one-factor CIR model to multiple factors (see Duffie and Kan (1996)), we believe it is not difficult to generalize our results on the extended CIR model to multiple factors. However, we leave this for future research.

The structure of this paper is as follows.
In Section 2, we consider the case where volatility is deterministic, namely the extended CIR model per se. In Section 3, we provide an analytic formula (up to the solution of ODEs) for a bond option with deterministic volatility. In Section 4, we extend our results to stochastic volatility.

2 The Extended CIR Model

Definition 2.1
We let T > 0 be the maturity of the bond underlying the bond option.

Definition 2.2
We define the number of degrees of freedom (also called dimension in Maghsoodi):

d(t) = 4θ(t)/σ²(t).

Since θ is assumed to be strictly positive, and σ non-zero, d(t) is also strictly positive.

Our goal is to generalize the representation of the rate r(t) as a sum of a constant number of squares of Ornstein-Uhlenbeck processes into a sum of a time-varying number of such terms. Heuristically, it is easier to represent this sum as an integral. Before introducing the rigorous definitions, we explain the method informally. We introduce a collection {W(·, u)} of independent Brownian motions and write the Gaussian processes x(t, u) as:

dx(t, u) = ½(−b(t) + "plug") x(t, u) dt + (σ(t)/2) dW(t, u)

for some unknown functional "plug". In particular, suppose we take the representation:

r(t) = ∫₀^{d(t)} x²(t, u) du.   (2.1.1)

Then informally, taking derivatives results in:

dr(t) = ∫₀^{d(t)} ∂(x²(t, u))/∂t du + x²(t, d(t)) d′(t).

We see that:

dr(t) = (−b(t)r(t) + σ²(t)d(t)/4) dt + σ(t)√(r(t)) dB(t) + ∫₀^{d(t)} x²(t, u) "plug" du + x²(t, d(t)) d′(t).

If the last two terms cancel out, by Definition 2.2 we obtain the right equation for r. Intuitively, the last two terms will cancel out if the "plug" term includes a Dirac delta function:

"plug" = −δ(u − d(t)) d′(t),

provided d is invertible.

We want to represent ∫₀^{d(t)} x²(t, u) du in (2.1.1) as a sum. To this end we discretize the x(t, u) process in the u direction and replace ∫₀^{d(t)} x²(t, u) du by a sum of squares of basic processes {x̃_i(t)}_i. Since the "number" of terms d(t) is real-valued, an idea is to approximate d(t) by the ratio [nd(t)]/n. We then construct a sum of [nd(t)] terms, which we call Z_low^{(n)}(t):

Z_low^{(n)}(t) = Σ_{i=1}^{[nd(t)]} x̃_i²(t).

We should then "divide" Z_low^{(n)}(t) by n, and then make n go to infinity.
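For constant parameters and integer dimension d = 4θ/σ², the sum-of-squares heuristic can be sanity-checked on first moments: each coordinate is an Ornstein-Uhlenbeck process dx = −(b/2)x dt + (σ/2)dW, and d·(mean² + variance) reproduces the classical CIR mean. A sketch under these assumptions (function names are ours):

```python
import math

def ou_moments(x0, b, sigma, t):
    """Mean and variance at time t of the OU process dx = -(b/2) x dt + (sigma/2) dW."""
    m = x0 * math.exp(-b * t / 2.0)
    v = sigma ** 2 * (1.0 - math.exp(-b * t)) / (4.0 * b)
    return m, v

def cir_mean_from_squares(r0, b, theta, sigma, t):
    """E[r(t)] when r is the sum of d = 4*theta/sigma^2 squared i.i.d. OU
    processes started at sqrt(r0/d): d * (mean^2 + variance)."""
    d = 4.0 * theta / sigma ** 2
    m, v = ou_moments(math.sqrt(r0 / d), b, sigma, t)
    return d * (m * m + v)

def cir_mean(r0, b, theta, sigma, t):
    """Classical CIR mean: r0 e^{-bt} + (theta/b)(1 - e^{-bt})."""
    return r0 * math.exp(-b * t) + (theta / b) * (1.0 - math.exp(-b * t))
```

The two quantities agree for any parameter values, which is exactly the first-moment content of the sum-of-squares representation.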
The reason why we put quotes around "divide" is that the variance of Z_low^{(n)}(t)/n tends to zero, so Z_low^{(n)}(t)/n is not the right representation for the rate. Rather, if Z_low^{(n)}(t) had a non-central chi-squared distribution, we could represent Z_low^{(n)} as a sum of i.i.d. processes {r_{j,low}^{(n)}(t)}_j:

Z_low^{(n)}(t) = Σ_{j=1}^{n} r_{j,low}^{(n)}(t).

It turns out that Z_low^{(n)}(t) does not have the scaled non-central chi-squared distribution. However, it is possible to define a process r_{j,mid}^{(n,M)}(t) so that, conditionally on the information available at time (m−1)T/M, the random variable r_{j,mid}^{(n,M)}(mT/M) has approximately the scaled non-central chi-squared distribution. Since the latter is computable, we can compute the characteristic function of r_{j,mid}^{(n,M)}(mT/M) conditionally on r_{j,mid}^{(n,M)}((m−1)T/M). We then iterate this backward induction by calculating this characteristic function conditionally on r_{j,mid}^{(n,M)}((m−2)T/M), and so forth until time zero. We will then take a double limit of this characteristic function, by making both n and M go to infinity.

Two remarks on notation: rather than starting with the continuum of processes x(·, u), the formal proof starts with a countable collection of processes {x̃_i(·)}; see below. The subscript "low" indicates that we truncate the integral to a lower value.

2.2 Definitions

Let (Ω, F_t, P) be a probability space where the filtration F_t is generated by a countably infinite collection {W_i} of independent Brownian motions. The probability measure P is the risk-neutral measure.

Observation
Most subsequent definitions will not be needed to understand part (i) of Theorem 2.4, and the reader may skip this tedious part, which will be necessary only in the stochastic volatility section. We start by defining:

d^{(n)}(t) = nd(t);   M_d^{(n)} = max_{t∈[0,T]} d^{(n)}(t);   t_m^M = mT/M, 0 ≤ m ≤ M.   (2.1)

Definition (2.1) will be a standing definition, in the sense that, for each m, M, (2.1) will hold throughout the paper.

Assumption 1
We suppose that, over the interval [0, T], d(t) satisfies Dirichlet's condition, i.e., it is differentiable, the number Q of minima and maxima of d(t) is finite, and the set of critical points of d (i.e., the points where d′(t) = 0) has measure zero. As mentioned above, d(t) is strictly positive. Also, we assume that θ(t), b(t) and σ(t) are positive real-valued continuous functions on [0, T]. This ensures that (1.1) has a pathwise unique strong solution (see Maghsoodi (1996)).

The processes x̃_i^{(n)}(t)

We have to consider separately each decreasing and increasing branch of d^{(n)}(t). We now have 3 subcases: the processes x̃_i^{(n)}(t) can at any time t either
• not jump;
• jump up by a half;
• jump down by a half.

For i ≥ 1, let

J_down^{(n)}(i) = { t ∈ [0, T] : d^{(n)}(t) = i and (d/dt) d^{(n)}(t) > 0 },
J_up^{(n)}(i) = { t ∈ [0, T] : d^{(n)}(t) = i and (d/dt) d^{(n)}(t) < 0 }.

We do not consider the case where (d/dt) d^{(n)}(t) = 0 and d^{(n)}(t) = i. (It is fairly easy to see that such times play no role in the proof, provided that the set of points where d is flat has measure zero; from now on, we continue the development without taking care of that case, in order to save space.) Denote by N_up^{(n)}(i) (resp. N_down^{(n)}(i)) the cardinality of the set J_up^{(n)}(i) (resp. J_down^{(n)}(i)), and designate the elements of these sets as follows:

J_up^{(n)}(i) = { J_{k,up}^{(n)}(i) | 1 ≤ k ≤ N_up^{(n)}(i) },
J_down^{(n)}(i) = { J_{p,down}^{(n)}(i) | 1 ≤ p ≤ N_down^{(n)}(i) }.

For definiteness, we set J_{0,up}^{(n)}(i) = J_{0,down}^{(n)}(i) = −∞.
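Numerically, the jump-time sets are just the level crossings of d^{(n)} at level i, split by the sign of the slope at the crossing. A grid-and-bisection sketch (our illustration; the dimension function below is an assumption, not from the paper):

```python
import math

def crossing_times(f, level, T, n_grid=100000):
    """Approximate {t in [0, T] : f(t) = level}, split by the sign of the
    slope at the crossing, by scanning a fine grid and bisecting."""
    ups, downs = [], []  # slope > 0, slope < 0 at the crossing
    h = T / n_grid
    for k in range(n_grid):
        a, b = k * h, (k + 1) * h
        fa, fb = f(a) - level, f(b) - level
        if fa * fb >= 0.0:
            continue  # no transversal crossing inside this cell
        for _ in range(50):  # bisection
            mid = 0.5 * (a + b)
            if (f(a) - level) * (f(mid) - level) <= 0.0:
                b = mid
            else:
                a = mid
        t_cross = 0.5 * (a + b)
        (ups if fb > fa else downs).append(t_cross)
    return ups, downs

# Illustrative dimension function (an assumption, not from the paper):
n = 3
d_n = lambda t: n * (2.0 + math.sin(2.0 * math.pi * t))  # d^(n)(t) = n d(t)
increasing, decreasing = crossing_times(d_n, 7.0, 1.0)   # crossings of level i = 7
```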
Finally, we call the set of all jumps, together with the terminal time:

J^{(n)} = ⋃_i J_up^{(n)}(i) ∪ ⋃_i J_down^{(n)}(i) ∪ {T}.

We call Z_{k,up}^{(n)}(i) (resp. Z_{p,down}^{(n)}(i)) the first minimizer (resp. maximizer) of d^{(n)} after J_{k,up}^{(n)}(i) (resp. J_{p,down}^{(n)}(i)). We have then (again, barring the case (d/dt) d^{(n)}(t) = 0 and d^{(n)}(t) = i):

J_{k,up}^{(n)}(i) < Z_{k,up}^{(n)}(i) < J_{p(k),down}^{(n)}(i) < Z_{p(k),down}^{(n)}(i) < J_{k+1,up}^{(n)}(i),

where:

k(p) = p if d^{(n)} is decreasing at t = 0, and k(p) = p + 1 if d^{(n)} is increasing at t = 0;
p(k) = k if d^{(n)} is decreasing at t = 0, and p(k) = k − 1 if d^{(n)} is increasing at t = 0.

For definiteness, we set Z_{0,up}^{(n)}(i) = Z_{0,down}^{(n)}(i) = 0.

We now define x̃_i^{(n)}(t) recursively for all times t. In the course of the definition, we also define what we mean by "times before the first jump", "times between up and down jumps" and "times between down and up jumps" for a particular process x̃_i^{(n)}(t). We start with:

x̃_i^{(n)}(0) = √( r(0)/d^{(n)}(0) ).

In the subcase when t < min{ J_{1,down}^{(n)}(i), J_{1,up}^{(n)}(i) } (times before the first jump), we have:

x̃_i^{(n)}(t) = x̃_i^{(n)}(0) exp(−∫₀ᵗ (b(u)/2) du) + ∫₀ᵗ exp(−∫ᵤᵗ (b(s)/2) ds) (σ(u)/2) dW_i(u).

In the case where J_{p−1,down}^{(n)}(i) ≤ J_{k,up}^{(n)}(i) ≤ t < J_{p,down}^{(n)}(i) ≤ J_{k+1,up}^{(n)}(i) (times between up and down jumps), we define:

x̃_i^{(n)}(t) = x̃_i^{(n)}(J_{k,up}^{(n)}(i)) (1 + ½ (d d^{(n)}/dt)|_{t=J_{k,up}^{(n)}(i)}) exp(−∫_{J_{k,up}^{(n)}(i)}ᵗ (b(u)/2) du) + ∫_{J_{k,up}^{(n)}(i)}ᵗ exp(−∫ᵤᵗ (b(s)/2) ds) (σ(u)/2) dW_i(u).
In the case where J_{k−1,up}^{(n)}(i) ≤ J_{p,down}^{(n)}(i) ≤ t < J_{k,up}^{(n)}(i) ≤ J_{p+1,down}^{(n)}(i) (times between down and up jumps), we define:

x̃_i^{(n)}(t) = x̃_i^{(n)}(J_{p,down}^{(n)}(i)) (1 + ½ (d d^{(n)}/dt)|_{t=J_{p,down}^{(n)}(i)}) exp(−∫_{J_{p,down}^{(n)}(i)}ᵗ (b(u)/2) du) + ∫_{J_{p,down}^{(n)}(i)}ᵗ exp(−∫ᵤᵗ (b(s)/2) ds) (σ(u)/2) dW_i(u).

We summarize this into the following definition:

dx̃_i^{(n)}(t) = −( b(t)/2 + g^{(n)}(t, i)/2 ) x̃_i^{(n)}(t) dt + (σ(t)/2) dW_i(t),

with

g^{(n)}(t, i) = ( Σ_{k=1}^{N_up^{(n)}(i,t)} δ(t − J_{k,up}^{(n)}(i)) + Σ_{p=1}^{N_down^{(n)}(i,t)} δ(t − J_{p,down}^{(n)}(i)) ) (d d^{(n)}/dt)|_t.

Fact 1. There is no jump of x̃_i^{(n)}(t) in the interval [J_{p,down}^{(n)}(i), J_{k(p),up}^{(n)}(i)). Likewise, there is no jump of x̃_i^{(n)}(t) in the interval [J_{k,up}^{(n)}(i), J_{p(k),down}^{(n)}(i)).

We further define:

W^{(n)}(t, a, ω) = Σ_{i=1}^∞ 1{i−1 < d^{(n)}(a) ≤ i} W_i(t, ω),
x^{(n)}(t, u) = Σ_{i=1}^∞ x̃_i^{(n)}(t) 1{i−1 < u ≤ i},

where 1{i−1 < d^{(n)}(a) ≤ i} denotes an indicator function. We also introduce the following notations:

Z_mid^{(n,0,M)}(0) = ∫₀^{d^{(n)}(0)} (x^{(n)}(0, u))² du,

dx̃_i^{(n,0,M)}(t) = −( b(t)/2 + g^{(n)}(t, i)/2 ) x̃_i^{(n,0,M)}(t) dt + (σ(t)/2) dW_i(t) for 0 ≤ t ≤ t_1^M,

and for each 1 ≤ m < M:

x̃_i^{(n,m,M)}(t_m^M) = √( Z_mid^{(n,m,M)}(t_m^M) / d^{(n)}(t_m^M) ),
dx̃_i^{(n,m,M)}(t) = −( b(t)/2 + g^{(n)}(t, i)/2 ) x̃_i^{(n,m,M)}(t) dt + (σ(t)/2) dW_i(t) for t_m^M ≤ t ≤ t_{m+1}^M,
x_mid^{(n,m,M)}(t, u) = Σ_{i=1}^∞ x̃_i^{(n,m,M)}(t) 1{i−1 < u ≤ i},

in which we define Z_mid^{(n,m,M)}(t) for all 1 ≤ m < M as:

Z_mid^{(n,m,M)}(t) = ∫₀^{d^{(n)}(t)} (x_mid^{(n,m,M)}(t, u))² du for t_m^M ≤ t ≤ t_{m+1}^M.
We denote (for the k branches) the lower integration bounds:

L_up(m, M, k, i, n) = max( t_m^M, J_{k,up}^{(n)}(i) )

and the upper integration bounds:

U_up(m, M, k, i, n) = min( J_next^{(n)}(J_{k,up}^{(n)}(i)), t_{m+1}^M ),

where J_next^{(n)} is defined in (5.3). Similarly we define L_down(m, M, p, i, n) and U_down(m, M, p, i, n) for the p branches. We define a remainder by:

R^{(n,m,M)}(t) = Σ_{i=1}^{M_d^{(n)}} [ Σ_{k=1}^{N_up(i)} ∫_{L_up(m,M,k,i,n)}^{U_up(m,M,k,i,n)} ( (x̃_i^{(n,m,M)}(s))² (d d^{(n)}/ds) − (x̃_i^{(n,m,M)}(J_{k,up}^{(n)}(i)))² (d d^{(n)}/ds)|_{s=J_{k,up}^{(n)}(i)} ) ds
+ Σ_{p=1}^{N_down(i)} ∫_{L_down(m,M,p,i,n)}^{U_down(m,M,p,i,n)} ( (x̃_i^{(n,m,M)}(s))² (d d^{(n)}/ds) − (x̃_i^{(n,m,M)}(J_{p,down}^{(n)}(i)))² (d d^{(n)}/ds)|_{s=J_{p,down}^{(n)}(i)} ) ds ].
We now define r_{j,mid}^{(n,m,M)}(t), for t_m^M ≤ t < t_{m+1}^M, by:

r_{j,mid}^{(n,m,M)}(t_m^M) = Z_mid^{(n,m,M)}(t_m^M)/n,

and, as a consequence,

r_{j,mid}^{(n,m,M)}(t) − r_{j,mid}^{(n,m,M)}(t_m^M) = ∫_{t_m^M}^{t} ( −b(s) r_{j,mid}^{(n,m,M)}(s) + σ²(s) d^{(n)}(s)/(4n) ) ds + ∫_{t_m^M}^{t} σ(s) √( r_{j,mid}^{(n,m,M)}(s) ) dB_{j,mid}^{(n,m,M)}(s) + R^{(n,m,M)}(t)/n.

As usual, {B_{j,mid}^{(n,m,M)}(s)}_j are independent Brownian motions. We can thus define the process:

r_{j,mid}^{(n,M)}(t) = Σ_{m=0}^{M} r_{j,mid}^{(n,m,M)}(t) 1{ t_m^M ≤ t < t_{m+1}^M }.   (2.2)

For any λ₁ > 0, λ₂ ≥ 0, and c >
0, and for any x ∈ ℝ, we define:

g_{λ₁,λ₂,c}(x) = (1/c) Σ_{i=0}^{∞} e^{−λ₂/2} ((λ₂/2)^i / i!) f_{λ₁+2i}(x/c),

where

f_λ(x) = x^{λ/2−1} e^{−x/2} / ( 2^{λ/2} Γ(λ/2) ), x ≥ 0.

Then g_{λ₁,λ₂,c} is a probability density. We say that a random variable X ∼ χ²(λ₁, λ₂, c) if the density of X is g_{λ₁,λ₂,c}. In words, X is a scaled non-central chi-square with real-valued degrees of freedom λ₁ and non-centrality parameter λ₂. When c = 1 and λ₁ is an integer, we obtain the standard non-central chi-square distribution. When λ₁ is an integer, the random variable X is the sum of squares of λ₁ independent normal random variables X_i with mean μ_i and variance c:

X = Σ_{i=1}^{λ₁} X_i², where the parameter λ₂ = Σ_{i=1}^{λ₁} μ_i²/c.

As for the usual chi-square distribution, its generalization to real-valued degrees of freedom is infinitely divisible.
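The density g_{λ₁,λ₂,c} and the corresponding characteristic function (computed in the Appendix, (5.2)) are straightforward to evaluate by truncating the Poisson mixture. The sketch below (function names are ours) also illustrates the infinite divisibility just mentioned: raising the characteristic function of χ²(λ₁/n, λ₂/n, c) to the power n recovers that of χ²(λ₁, λ₂, c).

```python
import cmath
import math

def f_chi2(lam, x):
    """Central chi-square density with real-valued lam > 0 degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return x ** (lam / 2.0 - 1.0) * math.exp(-x / 2.0) / (2.0 ** (lam / 2.0) * math.gamma(lam / 2.0))

def g_snc(lam1, lam2, c, x, terms=60):
    """Density g_{lam1,lam2,c}: a Poisson(lam2/2) mixture of central
    chi-square densities with lam1 + 2i degrees of freedom, scaled by c."""
    w = math.exp(-lam2 / 2.0)  # Poisson weight e^{-lam2/2} (lam2/2)^i / i!
    total = 0.0
    for i in range(terms):
        total += w * f_chi2(lam1 + 2 * i, x / c)
        w *= (lam2 / 2.0) / (i + 1)
    return total / c

def cf_snc(lam1, lam2, c, omega):
    """Characteristic function of chi^2(lam1, lam2, c), cf. (5.2) scaled by c:
    exp(i lam2 c w / (1 - 2 i c w)) * (1 - 2 i c w)^(-lam1/2)."""
    z = 1.0 - 2.0j * c * omega
    return cmath.exp(1j * lam2 * c * omega / z) * z ** (-lam1 / 2.0)
```

In particular, cf_snc(lam1/n, lam2/n, c, w)**n equals cf_snc(lam1, lam2, c, w), the divisibility property used below.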
Theorem 2.3
For any X ∼ χ²(λ₁, λ₂, c) and any integer n ≥ 1, there exist n independent and identically distributed random variables X₁, ..., Xₙ such that X equals Σ_{k=1}^{n} X_k in distribution:

X =_d Σ_{k=1}^{n} X_k.

Moreover:

X₁ ∼ χ²(λ₁/n, λ₂/n, c).

2.3 Main Result

Theorem 2.4
Suppose Assumption 1 holds. Then (i)
Assume that d(·) is differentiable over [0, T], bounded away from zero, and has a finite number of optima. For each t ∈ [0, T], the characteristic function of r(t) is: for ω ∈ ℝ,

E[exp(iωr(t))] = exp( iω ( r(0) e^{−∫₀ᵗ b(u)du} / (1 − 2iω Σ(0,t)) + ∫₀ᵗ e^{−∫ₛᵗ b(u)du} θ(s) / (1 − 2iω Σ(s,t)) ds ) ),   (2.3)

where

Σ(s,t) := (1/4) ∫ₛᵗ exp(−∫ᵥᵗ b(u)du) σ²(v) dv.

(ii) For any t ∈ [0, T], the following convergence holds in distribution:

lim_{n→∞} r_{1,mid}^{(n,n^Q)}(t) = r(t),

where the definition of r_{1,mid}^{(n,n^Q)}(t) is given in (2.2).

Remarks:
1. The representation (5.5), together with (5.6) and (5.7), states that:

E[exp(iωr(t))] = exp( iω r(0) e^{−∫₀ᵗ b(u)du} / (1 − 2iω Σ(0,t)) − ½ ∫₀ᵗ d′(s) log( 1 − 2iω Σ(s,t) ) ds ) / (1 − 2iω Σ(0,t))^{d(0)/2},

where Σ(s,t) is as in Theorem 2.4. Equation (2.3) then results from the following integration by parts (using Σ(t,t) = 0 and ∂Σ(s,t)/∂s = −e^{−∫ₛᵗ b(u)du} σ²(s)/4):

½ ∫₀ᵗ d′(s) log( 1 − 2iω Σ(s,t) ) ds = −(d(0)/2) log( 1 − 2iω Σ(0,t) ) − iω ∫₀ᵗ ( d(s) σ²(s)/4 ) e^{−∫ₛᵗ b(u)du} / (1 − 2iω Σ(s,t)) ds,

and the last integral equals iω ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 − 2iω Σ(s,t)) ds, since θ(s) = d(s)σ²(s)/4.
2. When b, σ, and d are constant over [0, T], i.e., d′(s) = 0, we see from the representation in Remark 1 that r(t) has the scaled non-central (SNC) chi-squared distribution with d(0) degrees of freedom. Our characteristic function thus properly generalizes a well-known result.

3. Casual observation of Lemma 5.12 seems to suggest a much simpler proof: namely, approximate the time-dependent parameters by step functions, i.e., construct an Euler approximation of (1.1), calculate its characteristic function as in Lemma 5.12, and make the time-step go to zero. Unfortunately, we could not find any proof of strong convergence of the Euler approximation for the CIR process: in our proof we had to resort to the extra power coming from our representation, and to weak convergence. Even if there were a classical proof of strong convergence of the Euler approximation with deterministic volatility, the extension to stochastic volatility would be difficult, unlike what we obtain in the next section.

4. We have not found in the literature any distribution with characteristic function equal to (2.3). This seems to be a new, or at least independently rediscovered, distribution. We discuss in the stochastic volatility section a method to approximate this distribution by its moments.

5. It can be verified that (2.3) solves the equivalent of the Fokker-Planck equation for characteristic functions. We emphasize again that the main strength of our method of proof, compared to solving the Fokker-Planck equation directly, is that it can be extended to stochastic volatility. The Fokker-Planck equation for the density f(r, t) of r(t) given in (1.1) is:

∂f(r,t)/∂t = −(∂/∂r)[ (−b(t)r + θ(t)) f(r,t) ] + ½ (∂²/∂r²)[ σ²(t) r f(r,t) ]
= b(t) f(r,t) + ( σ²(t) + b(t)r − θ(t) ) ∂f(r,t)/∂r + (σ²(t)/2) r ∂²f(r,t)/∂r².
Let the Fourier transform with respect to r be defined as f̂(x,t) = (1/√(2π)) ∫ e^{−ixr} f(r,t) dr; then the Fourier transform of the Fokker-Planck equation is:

∂f̂(x,t)/∂t = −b(t) x ∂f̂(x,t)/∂x − iθ(t) x f̂(x,t) − (iσ²(t)/2) x² ∂f̂(x,t)/∂x = −iθ(t) x f̂(x,t) − ( b(t)x + (iσ²(t)/2) x² ) ∂f̂(x,t)/∂x.   (2.4)

Writing Φ(x,t) = E[e^{ixr(t)}], we have, by (2.3):

Φ(x,t) = exp( ix ( r(0) e^{−∫₀ᵗ b(u)du} / (1 − 2ix Σ(0,t)) + ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 − 2ix Σ(s,t)) ds ) ).

Hence the following relation holds: Φ(−x,t) = √(2π) f̂(x,t).

We first compute the left-hand side (LHS) of (2.4). Observe that

√(2π) ∂f̂(x,t)/∂t = Φ(−x,t) (−ix) (∂/∂t)[ r(0) e^{−∫₀ᵗ b(u)du} / (1 + 2ix Σ(0,t)) + ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 + 2ix Σ(s,t)) ds ].

Using ∂Σ(s,t)/∂t = σ²(t)/4 − b(t)Σ(s,t), we calculate

(∂/∂t)[ r(0) e^{−∫₀ᵗ b(u)du} / (1 + 2ix Σ(0,t)) + ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 + 2ix Σ(s,t)) ds ]
= θ(t) − ( b(t) + ix σ²(t)/2 ) ( r(0) e^{−∫₀ᵗ b(u)du} / (1 + 2ix Σ(0,t))² + ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 + 2ix Σ(s,t))² ds ).

Secondly, we compute the following item for the right-hand side (RHS) of (2.4):

√(2π) ∂f̂(x,t)/∂x = Φ(−x,t) (∂/∂x)[ −ix r(0) e^{−∫₀ᵗ b(u)du} / (1 + 2ix Σ(0,t)) − ix ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 + 2ix Σ(s,t)) ds ].

On the one hand,

(∂/∂x)[ −ix r(0) e^{−∫₀ᵗ b(u)du} / (1 + 2ix Σ(0,t)) ] = −i r(0) e^{−∫₀ᵗ b(u)du} / (1 + 2ix Σ(0,t))².

On the other hand,

(∂/∂x)[ −ix ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 + 2ix Σ(s,t)) ds ] = −i ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 + 2ix Σ(s,t))² ds.

Multiplying both sides of Equation (2.4) by √(2π)/Φ(−x,t), we get:

LHS = −ix θ(t) + ix ( b(t) + ix σ²(t)/2 ) ( r(0) e^{−∫₀ᵗ b(u)du} / (1 + 2ix Σ(0,t))² + ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 + 2ix Σ(s,t))² ds )

and

RHS = −ix θ(t) − ( b(t) x + (iσ²(t)/2) x² ) ( −i r(0) e^{−∫₀ᵗ b(u)du} / (1 + 2ix Σ(0,t))² − i ∫₀ᵗ θ(s) e^{−∫ₛᵗ b(u)du} / (1 + 2ix Σ(s,t))² ds ).

It is easy to see that LHS and RHS are algebraically equal. Thus the Fokker-Planck equation (2.4) has been verified.
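The characteristic function (2.3) can also be checked numerically: for constant parameters it must coincide with the classical scaled non-central chi-square characteristic function with d = 4θ/σ² degrees of freedom. A quadrature sketch (our code; all names and parameter values are illustrative assumptions):

```python
import cmath
import math

def cf_extended_cir(omega, r0, b, theta, sigma, t, n=4000):
    """Quadrature evaluation of the characteristic function (2.3), with
    Sigma(s,t) = (1/4) int_s^t exp(-int_v^t b(u) du) sigma(v)^2 dv."""
    h = t / n
    s = [k * h for k in range(n + 1)]
    B = [0.0] * (n + 1)  # B[k] = int_{s_k}^t b(u) du
    S = [0.0] * (n + 1)  # S[k] = Sigma(s_k, t)
    for k in range(n - 1, -1, -1):
        mid = 0.5 * (s[k] + s[k + 1])
        B[k] = B[k + 1] + b(mid) * h
        S[k] = S[k + 1] + 0.25 * math.exp(-0.5 * (B[k] + B[k + 1])) * sigma(mid) ** 2 * h
    iw = 1j * omega
    acc = r0 * math.exp(-B[0]) / (1.0 - 2.0 * iw * S[0])
    for k in range(n):
        acc += math.exp(-B[k]) * theta(s[k]) / (1.0 - 2.0 * iw * S[k]) * h
    return cmath.exp(iw * acc)

# Constant-parameter reference: r(t) is then a scaled non-central chi-square
# with d = 4 theta / sigma^2 degrees of freedom (values below are illustrative):
b0, th, sg, r0, t = 0.5, 0.04, 0.2, 0.03, 1.0
Sig = sg ** 2 * (1.0 - math.exp(-b0 * t)) / (4.0 * b0)
d = 4.0 * th / sg ** 2
w = 3.0
closed_form = (cmath.exp(1j * w * r0 * math.exp(-b0 * t) / (1.0 - 2.0j * w * Sig))
               * (1.0 - 2.0j * w * Sig) ** (-d / 2.0))
```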
3 Bond Option Pricing

We call P(t,T) the price at time t of a zero-coupon bond with maturity T, i.e.:

P(t,T) = E[ exp(−∫ₜᵀ r(s) ds) | F_t ].

A standard result (see, e.g., Shreve (2004)) is that:

P(t,T) = exp( −r(t) C(t,T) + A(t,T) ),

where C(t,T) satisfies the Riccati equation:

C_t(t,T) − b(t) C(t,T) − (σ²(t)/2) C²(t,T) + 1 = 0, C(T,T) = 0,   (3.1)

and:

A(t,T) = −∫ₜᵀ θ(u) C(u,T) du.   (3.2)

There is a well-known analytical solution in case the parameters b, σ and θ are constant.

3.1 Call Option Price

By moving to the forward measure, we can separate discounting and the evaluation of the expected value of the payoff. We can thus apply Theorem 2.4. Let now 0 ≤ t ≤ T. The price C(0) at initial time 0 of a European option (expiring at time t and with exercise price K) on the T-maturity bond is given by:

C(0) = E[ exp(−∫₀ᵗ r(u) du) max(P(t,T) − K, 0) ].

The t-forward probability measure is the probability measure Pᵗ associated with taking P(·,t) as numeraire. In this measure the process {Bᵗ(s)}_s is a Brownian motion, where:

dBᵗ(s) = dB(s) − σ_P(s,t) ds,

and the bond volatility σ_P(s,t) is given by:

σ_P(s,t) = −σ(s) √(r(s)) C(s,t) if 0 ≤ s ≤ t, and 0 else.

Thus:

dr(s) = ( −( b(s) + σ²(s) C(s,t) ) r(s) + θ(s) ) ds + σ(s) √(r(s)) dBᵗ(s).

We define:

bᵗ(s) = b(s) + σ²(s) C(s,t),
Σᵗ(s,t) = (1/4) ∫ₛᵗ exp(−∫ᵥᵗ bᵗ(u) du) σ²(v) dv,

as well as the Laplace transform of the density of r(t) in the t-forward measure:

F̂_{r(t)}(p) = exp( −p ( r(0) e^{−∫₀ᵗ bᵗ(u)du} / (1 + 2p Σᵗ(0,t)) + ∫₀ᵗ e^{−∫ₛᵗ bᵗ(u)du} θ(s) / (1 + 2p Σᵗ(s,t)) ds ) ).

Theorem 3.1
Let C(s,T) and A(s,T) solve (3.1) and (3.2). The price of a call option is given by Laplace inversion:

C(0) = ( P(0,t) / (2πi) ) lim_{β→∞} ∫_{p=a−iβ}^{a+iβ} Ĉᵗ(p) F̂_{r(t)}(p) dp,   (3.3)

where

Ĉᵗ(p) = ( p e^{A(t,T)} − (p + C(t,T)) K + C(t,T) K^{1+p/C(t,T)} e^{−A(t,T) p/C(t,T)} ) / ( p (p + C(t,T)) ).   (3.4)

Proof
Using the change of measure developed by Jamshidian (1989) and Geman (1989):

C(0) = P(0,t) E^{Pᵗ}[ max( exp(−r(t) C(t,T) + A(t,T)) − K, 0 ) ].

The Laplace transform in r of the payoff max(P(t,T) − K,
0) is (see, e.g., Lewis (2000)):

∫₀^∞ e^{−pr} max( e^{−C(t,T)r + A(t,T)} − K, 0 ) dr = Ĉᵗ(p),

which is equal to the right-hand side of Equation (3.4). It is well known (see, e.g., Lewis (2000) again) that the price of an option is given by the discounted inverse Laplace transform of the product of the transform of the payoff and the transform of the distribution. Therefore we obtain (3.3). □

4 Stochastic Volatility

We now assume that both σ and θ are continuous functions of infinite variation, but that the dimension d(t) satisfies Assumption 1. This assumption is not unrealistic, in the sense that only the moments of θ intervene in the calculation of the moments of r(t). Given a time series of r, it would actually be difficult to invalidate the hypothesis that θ is of infinite variation. Thus, given the limited practical use of generalizing our results to the case where d can be of infinite variation, we leave this task as an open conjecture for future research. There is an interesting parallel between the deterministic and the stochastic volatility cases. In the deterministic case, it has been deemed natural since Maghsoodi (1996) to make the simplifying assumption that d be integer-valued. In the stochastic case, we deem it natural to consider the case where d is of finite variation.

We now consider two cases. The easiest case to consider is when the volatility is independent of the interest rate. We then build a tractable model where volatility is correlated with the interest rate. We consider the model:

dr(t) = ( −b(t) r(t) + θ(t) ) dt + √( v(t) r(t) ) dB(t), r(0) = r₀ >
0;   (4.1)

dv(t) = μ(v(t), t) dt + ξ(v(t), t) dB_v(t), v(0) = v₀ > 0,   (4.2)

where the coefficients μ and ξ are such that 0 < v(t) < K̄ almost surely for some constant K̄, and we assume dB(t) dB_v(t) = 0.

Under Assumption 1, all the arguments in the proof of Theorem 2.4 hold, and, by conditioning on the path of volatility, we obtain the characteristic function of the rate:

E[exp(iωr(t))] = E[ exp( iω ( r(0) e^{−∫₀ᵗ b(u)du} / (1 − 2iω Σ(0,t)) + ∫₀ᵗ e^{−∫ₛᵗ b(u)du} θ(s) / (1 − 2iω Σ(s,t)) ds ) ) ],   (4.3)

where Σ is now computed with σ²(·) replaced by v(·).

When volatility is stochastic, it is often possible to write the price of a bond at time t as a function f of both the rate and volatility:

P(t, T, ω) = f( r(t,ω), v(t,ω), t; T ).

In this case, to calculate the price C(0) of a bond option, one is more interested in the conditional Laplace transform of the density of the rate, given the volatility v(t) at time t, which we call F̂_{r(t)|v(t)}. One can then develop the latter in a Taylor series:

F̂_{r(t)|v(t)}(p) = E[ exp( −p ( r(0) e^{−∫₀ᵗ bᵗ(u)du} / (1 + 2p Σᵗ(0,t)) + ∫₀ᵗ e^{−∫ₛᵗ bᵗ(u)du} θ(s) / (1 + 2p Σᵗ(s,t)) ds ) ) | v(t) ] = Σ_{k=0}^{∞} (p^k / k!) (d^k F̂_{r(t)|v(t)} / dp^k)|_{p=0}.

In particular,

(d F̂_{r(t)|v(t)}(p) / dp)|_{p=0} = −E[ r(0) e^{−∫₀ᵗ bᵗ(u)du} + ∫₀ᵗ exp(−∫ₛᵗ bᵗ(u)du) θ(s) ds | v(t) ].   (4.4)

It is well known that the derivative of the Laplace transform of the density is equal to the negative of the first moment. By Itô's lemma, we can verify that the (conditional) first moment of the rate is equal to the negative of the right-hand side of (4.4), which provides a pleasant confirmation of our result. Since θ(s) = d(s)σ²(s)/
4, it is enough to compute E[σ²(s) | v(t)] in order to compute (4.4) explicitly. We can then easily reconstruct the conditional distribution of r(t) given v(t) by inverse Laplace transformation, provided convergence conditions are met.

We now consider the following model of volatility, which is a special case of (4.2):

dw(t) = ( −b(t) w(t) + θ_w(t) ) dt + √( v(t) w(t) ) dB(t), w(0) = w₀ > 0,
dv(t) = ( −b(t) v(t) + θ_v(t) ) dt + ξ v(t) dB_v(t), v(0) = v₀ > 0,
r(t) = w(t) + v(t).   (4.5)

We let

d_w(t) = 4 θ_w(t) / v(t), d_v(t) = 4 θ_v(t) / v(t),

and suppose that both of them satisfy Assumption 1. Even if B and B_v are uncorrelated, r(t) is correlated with v(t). It is then clear that

E[exp(iωr(t)) | v(t)] = E[ exp( iω ( w(0) e^{−∫₀ᵗ b(u)du} / (1 − 2iω Σ(0,t)) + ∫₀ᵗ e^{−∫ₛᵗ b(u)du} θ_w(s) / (1 − 2iω Σ(s,t)) ds ) ) | v(t) ] × exp(iω v(t)).

The advantage of this model is that the conditional characteristic function of the rate is explicit. Is this model a stochastic CIR model? We would like to leave this exciting (and, as far as we know, unsolved) question as an open problem to the community.
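The inverse Laplace transformation mentioned above (and the inversion (3.3)) can be carried out numerically in several ways. As an illustration only, here is a Gaver-Stehfest sketch — a real-axis alternative to the Bromwich contour, our substitution and not the authors' method — checked on a transform with a known inverse:

```python
import math

def stehfest_invert(F, t, N=14):
    """Gaver-Stehfest numerical inversion of a Laplace transform F(p) at t > 0.
    N must be even; N = 12..16 is typical in double precision."""
    ln2 = math.log(2.0)
    half = N // 2
    total = 0.0
    for k in range(1, N + 1):
        v = 0.0  # Stehfest weight V_k
        for j in range((k + 1) // 2, min(k, half) + 1):
            v += (j ** half * math.factorial(2 * j)
                  / (math.factorial(half - j) * math.factorial(j)
                     * math.factorial(j - 1) * math.factorial(k - j)
                     * math.factorial(2 * j - k)))
        v *= (-1) ** (half + k)
        total += v * F(k * ln2 / t)
    return total * ln2 / t

# Sanity check on a transform with a known inverse: L{e^{-a s}}(p) = 1/(p + a).
a = 0.7
value = stehfest_invert(lambda p: 1.0 / (p + a), 1.0)  # close to e^{-0.7}
```

Whether this routine applies directly to a given bond-option transform depends on rewriting the price as a pointwise inverse transform of a smooth, real-axis-evaluable function; we only illustrate the inversion step itself.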
Open problem:
Is there a Brownian motion B_r(t) such that the distribution of r(t) defined in (4.5) agrees with the distribution at time t of the solution of the following SDE:

dx(t) = −b(t) x(t) dt + ( θ_w(t) + θ_v(t) ) dt + √( v(t) x(t) ) dB_r(t)?   (4.6)

Observation
One avenue of proof is to reuse our framework. Indeed, let:

dx̃_{i,w}^{(n)}(t) = −( b(t)/2 + g_w^{(n)}(t,i)/2 ) x̃_{i,w}^{(n)}(t) dt + ( √(v(t)) / 2 ) dW_{i,w}(t),
dx̃_{i,v}^{(n)}(t) = −( b(t)/2 + g_v^{(n)}(t,i)/2 ) x̃_{i,v}^{(n)}(t) dt + ( √(v(t)) / 2 ) dW_{i,v}(t),

where W_{i,w} and W_{i,v} are independent Brownian motions, and g_w^{(n)}(t,i) and g_v^{(n)}(t,i) are defined in a manner similar to (2.2). Then, conditionally on v, there exist independent copies w_{j,low}^{(n,M)} and v_{j,low}^{(n,M)} that satisfy the representation:

Σ_{j=1}^{n} w_{j,low}^{(n,M)}(t_m^M) = Σ_{i=1}^{[n d_w(t)]} x̃_{i,w}²(t_m^M),
Σ_{j=1}^{n} v_{j,low}^{(n,M)}(t_m^M) = Σ_{i=1}^{[n d_v(t)]} x̃_{i,v}²(t_m^M).

It then remains to prove that, for some appropriately defined processes w_{1,mid}^{(n,M)} and v_{1,mid}^{(n,M)},

r^{(n,M)} ≡ w_{1,mid}^{(n,M)} + v_{1,mid}^{(n,M)}

converges in some sense to a process r^{(∞,∞)} that satisfies (4.6), and for which r^{(∞,∞)}(t) agrees with r(t) in distribution.

Acknowledgements
We would like to thank the participants at the 2014 Claremont Symposium on Interest Rates, as well as the participants of the 2014 Bachelier conference. We also thank Professor John Angus for valuable communications. We received very helpful advice from two anonymous referees. All errors are ours.
The characteristic function of $Y\sim\chi^2(\lambda)$ when $\lambda>0$ is
\[
\Phi_Y(\omega)=(1-2i\omega)^{-\lambda/2}. \tag{5.1}
\]
Let $X\sim\chi^2(\lambda_1,\lambda_2,1)$. Then
\[
\Phi_X(\omega)=\int_{\mathbb{R}}e^{i\omega x}g_{\lambda_1,\lambda_2}(x)\,dx
=\sum_{k=0}^{+\infty}\frac{e^{-\lambda_1/2}(\lambda_1/2)^k}{k!}\int_0^{+\infty}e^{i\omega x}f_{\lambda_2+2k}(x)\,dx
=\sum_{k=0}^{+\infty}\frac{e^{-\lambda_1/2}(\lambda_1/2)^k}{k!}(1-2i\omega)^{-\lambda_2/2-k}
\]
\[
=(1-2i\omega)^{-\lambda_2/2}\sum_{k=0}^{+\infty}\frac{e^{-\lambda_1/2}\big((\lambda_1/2)(1-2i\omega)^{-1}\big)^k}{k!}
=\frac{\exp\big(\frac{i\lambda_1\omega}{1-2i\omega}\big)}{(1-2i\omega)^{\lambda_2/2}}. \tag{5.2}
\]
On the other hand, since $X_1,\ldots,X_n$ are i.i.d.,
\[
\Phi_{\sum_{k=1}^n X_k}(t)=\big(\Phi_{X_1}(t)\big)^n.
\]
By the fact that $X=\sum_{k=1}^n X_k$, we obtain
\[
\Phi_{X_1}(\omega)=\big(\Phi_X(\omega)\big)^{1/n}
=\frac{\exp\big(\frac{i(\lambda_1/n)\omega}{1-2i\omega}\big)\exp\big(\frac{2ki\pi}{n}\big)}{(1-2i\omega)^{(\lambda_2/n)/2}},\qquad k=0,\ldots,n-1.
\]
The fact that $\Phi_{X_1}(0)=1$ guarantees
\[
\Phi_{X_1}(\omega)=\frac{\exp\big(\frac{i(\lambda_1/n)\omega}{1-2i\omega}\big)}{(1-2i\omega)^{(\lambda_2/n)/2}},
\]
that is, $X_1\sim\chi^2\big(\frac{\lambda_1}{n},\frac{\lambda_2}{n},1\big)$. The case where $c\neq1$ follows trivially. $\Box$

The proofs of the lemmas needed for this proof are in the second section of the Appendix. For convenience, we define a function $J^{(n)}_{next}(t)$ which associates to a point in time $t$ the next jump after $t$:
\[
J^{(n)}_{next}(t)=\min\{J\in\mathcal{J}^{(n)}\mid J>t\}. \tag{5.3}
\]
The stochastic Fubini theorem (Ikeda and Watanabe 1981; Williams 1979, p. 44; Heath, Jarrow and Morton 1992, Lemma 0.1) states the following.

Lemma 5.1 [Heath-Jarrow-Morton (1992)]
Let $\{\Phi(t,a,\omega):(t,a)\in[0,\tau]\times[0,\tau]\}$ be a family of real random variables such that:
(i) $((t,\omega),a)\in\{([0,\tau]\times\Omega)\times[0,\tau]\}\to\Phi(t,a,\omega)$ is $\mathcal{L}\times\mathcal{B}[0,\tau]$-measurable, where $\mathcal{L}$ is the predictable $\sigma$-field;
(ii) $\int_0^t\Phi^2(s,a,\omega)\,ds<\infty$ a.e. for all $t\in[0,\tau]$ and $\omega\in\Omega$;
(iii) $\int_0^t\big(\int_0^\tau\Phi(s,a,\omega)\,da\big)^2\,ds<\infty$ a.e. for all $t\in[0,\tau]$ and $\omega\in\Omega$.
If $t\to\int_0^\tau\int_0^t\Phi(s,a,\omega)\,dW_s(\omega)\,da$ is continuous a.e. for all $t\in[0,\tau]$ and $\omega\in\Omega$, then:
\[
\int_0^t\int_0^\tau\Phi(s,a,\omega)\,da\,dW_s(\omega)=\int_0^\tau\int_0^t\Phi(s,a,\omega)\,dW_s(\omega)\,da
\]
a.e. for all $t\in[0,\tau]$ and $\omega\in\Omega$.

Definition 5.2
We define:
\[
\mu^{(n)}(t,u)=\sum_{i=1}^{\infty}\Big(-(b(t)+g^{(n)}(t,i))\big(\tilde{x}^{(n)}_i(t)\big)^2+\frac{\sigma^2(t)}{4}\Big)1_{\{i-1<u\le i\}};
\]
\[
\Phi^{(n)}(t,u)=\sigma(t)\,x^{(n)}(t,u);\qquad
y^{(n)}(t,u)=\big(x^{(n)}(t,u)\big)^2;\qquad
Z^{(n)}_{mid}(t)=\int_{u=0}^{d^{(n)}(t)}y^{(n)}(t,u)\,du.
\]
The SDE for $y^{(n)}(t,u)$ then reads:
\[
dy^{(n)}(t,u)=\mu^{(n)}(t,u)\,dt+\Phi^{(n)}(t,u)\,\partial_t W^{(n)}(t,u).
\]
Using the convention that the differential operator $d_s$ applies to the first parameter of $W^{(n)}(s,a)$, this definition results in:
\[
\int_0^\tau\int_0^t\Phi^{(n)}(s,a)\,\partial_s W^{(n)}(s,a)\,da
=\sum_{i=1}^{\infty}\int_0^\tau 1_{\{i-1<d^{(n)}(a)\le i\}}\int_0^t\sigma(s)\tilde{x}^{(n)}_i(s)\,dW_i(s)\,da.
\]
We then have the following lemma.

Lemma 5.3
For a.e. $t\in[0,\tau]$, the following equation holds almost surely:
\[
\int_{s=0}^t\int_{a=0}^\tau\Phi^{(n)}(s,a)\,da\,\partial_s W^{(n)}(s,a)
=\int_{a=0}^\tau\int_{s=0}^t\Phi^{(n)}(s,a)\,\partial_s W^{(n)}(s,a)\,da.
\]

Definition 5.4
We define the remainder:
\[
R^{(n)}=\sum_{i=1}^{M^{(n)}d}\Bigg[\sum_{k=1}^{N_{up}(i)}\int_{s=J^{(n)}_{k,up}(i)}^{J^{(n)}_{next}(J^{(n)}_{k,up}(i))}
\Big(\big(\tilde{x}^{(n)}_i(s)\big)^2\frac{dd^{(n)}}{ds}\Big|_s-\big(\tilde{x}^{(n)}_i(J^{(n)}_{k,up}(i))\big)^2\frac{dd^{(n)}}{ds}\Big|_{J^{(n)}_{k,up}(i)}\Big)\,ds
\]
\[
+\sum_{p=1}^{N_{down}(i)}\int_{s=J^{(n)}_{p,down}(i)}^{J^{(n)}_{next}(J^{(n)}_{p,down}(i))}
\Big(\big(\tilde{x}^{(n)}_i(s)\big)^2\frac{dd^{(n)}}{ds}\Big|_s-\big(\tilde{x}^{(n)}_i(J^{(n)}_{p,down}(i))\big)^2\frac{dd^{(n)}}{ds}\Big|_{J^{(n)}_{p,down}(i)}\Big)\,ds\Bigg].
\]

Lemma 5.5
There exists a collection of Brownian motions $B^{(n)}_{Z,mid}$ such that:
\[
Z^{(n)}_{mid}(t)-Z^{(n)}_{mid}(0)
=\int_{s=0}^t\Big(-b(s)Z^{(n)}_{mid}(s)+\frac{\sigma^2(s)d^{(n)}(s)}{4}\Big)\,ds
+\int_{s=0}^t\sigma(s)\sqrt{Z^{(n)}_{mid}(s)}\,dB^{(n)}_{Z,mid}(s)+R^{(n)}(t).
\]
Fix a time $t^M_m$. We cannot estimate the remainder $R^{(n)}(t)$ properly, since each term $\tilde{x}^{(n)}_i$ may have jumped over $[0,t^M_m]$, which makes the terms of different orders of magnitude. To this effect, we refine our representation by resetting it at each time $t^M_m$. This allows us to better control the remainder.

Lemma 5.6
There exists a collection of Brownian motions $B^{(n,m,M)}_{Z,mid}$ such that, for $t^M_m\le t\le t^M_{m+1}$:
\[
Z^{(n,m,M)}_{mid}(t)-Z^{(n,m,M)}_{mid}(t^M_m)
=\int_{s=t^M_m}^t\Big(-b(s)Z^{(n,m,M)}_{mid}(s)+\frac{\sigma^2(s)d^{(n)}(s)}{4}\Big)\,ds
+\int_{s=t^M_m}^t\sigma(s)\sqrt{Z^{(n,m,M)}_{mid}(s)}\,dB^{(n,m,M)}_{Z,mid}(s)+R^{(n,m,M)}(t).
\]
The proof of Lemma 5.5 carries over exactly to Lemma 5.6, so we omit it.
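For orientation, the representation underlying Lemmas 5.5 and 5.6 can be checked directly in the classical case of constant coefficients and integer dimension, where the squared-Bessel/CIR process is exactly a sum of $d$ squared Ornstein-Uhlenbeck processes. The sketch below (parameter values are our own illustrative choices) samples each OU factor from its exact Gaussian transition and compares the Monte Carlo mean of the sum of squares with the CIR mean $r_0e^{-bt}+(\theta/b)(1-e^{-bt})$, where $\theta=d\sigma^2/4$.

```python
import numpy as np

# If dx_i = -(b/2) x_i dt + (sigma/2) dW_i for i = 1, ..., d, then
# r = x_1^2 + ... + x_d^2 is a CIR process with theta = d*sigma^2/4.
rng = np.random.default_rng(2)
b, sigma, d, r0, t = 0.8, 0.2, 4, 0.05, 1.0
n_paths = 100000

# Exact transition of each OU factor: Gaussian with known mean and variance.
x0 = np.sqrt(r0 / d)                     # split r0 evenly over the d factors
mean = x0 * np.exp(-b * t / 2)
var = sigma ** 2 * (1 - np.exp(-b * t)) / (4 * b)
x = rng.normal(mean, np.sqrt(var), size=(n_paths, d))
r = (x ** 2).sum(axis=1)

theta = d * sigma ** 2 / 4
mean_exact = r0 * np.exp(-b * t) + theta / b * (1 - np.exp(-b * t))
print(r.mean(), mean_exact)  # the two agree up to Monte Carlo error
```

Because each factor is sampled from its exact transition, the only error in the comparison is Monte Carlo noise; there is no discretization bias.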
Lemma 5.7
Let $M(n)=n^Q$. The remainder $R^{(n,m,M(n))}(t)/n$ converges to zero in probability as $n\to\infty$.

We define
\[
\tilde{Z}^{(n,M)}_{mid}(t)=\sum_{m=0}^{M-1}Z^{(n,m,M)}_{mid}(t)\,1_{\{t^M_m\le t<t^M_{m+1}\}}.
\]
By the fact that $Z^{(n,m-1,M)}_{mid}(t^M_m)=Z^{(n,m,M)}_{mid}(t^M_m)$, the trajectory $\tilde{Z}^{(n,M)}_{mid}(\cdot,\omega)$ is continuous.

Lemma 5.8
The trajectory $r^{(n,m,M)}_{j,mid}(\cdot,\omega)$ is continuous almost surely. Also, the following relations hold:
\[
r^{(n,m-1,M)}_{j,mid}(t^M_m)=r^{(n,m,M)}_{j,mid}(t^M_m),\qquad
\tilde{Z}^{(n,M)}_{mid}(t^M_{m+1})=\sum_{j=1}^n r^{(n,m,M)}_{j,mid}(t^M_{m+1}).
\]

Lemma 5.9
Take $M(n)=n^Q$. The process $r^{(n,M(n))}_{j,mid}$ converges to $r$ in distribution as $n$ goes to $\infty$.

Definition 5.10
We define:
\[
d_{low}(s,u)=\min_{s\le t\le u}d^{(n)}(t).
\]

Lemma 5.11
There exist continuous functions $\tilde{b}$ and $\tilde{\sigma}$ such that:
\[
E[\exp(i\omega\tilde{Z}^{(n,M)}_{mid}(t)/n)]
=\frac{\exp\Big(\frac{i\omega\tilde{Z}^{(n)}_{mid}(0)e^{-\int_0^t\tilde{b}(u)\,du}}{1-i\omega\int_0^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv}
-\frac12\int_0^t d'(s)\log\Big(1-i\omega\int_s^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv\Big)\,ds\Big)}
{\Big(1-i\omega\int_0^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv\Big)^{d(0)/2}}
+o\Big(\frac{n^Q}{M}\Big).
\]

Lemma 5.12
Take $M(n)=n^Q$. There exist continuous functions $\tilde{b}$ and $\tilde{\sigma}$ such that
\[
\lim_{n\to\infty}E[\exp(i\omega r^{(n,M(n))}_{mid}(t))]
=\frac{\exp\Big(\frac{i\omega r(0)e^{-\int_0^t\tilde{b}(u)\,du}}{1-i\omega\int_0^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv}
-\frac12\int_0^t d'(s)\log\Big(1-i\omega\int_s^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv\Big)\,ds\Big)}
{\Big(1-i\omega\int_0^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv\Big)^{d(0)/2}}. \tag{5.5}
\]
Taking the sequence $M(n)=n^Q$, Lemma 5.9 and Lemma 5.12 show that $r(t)$ has the characteristic function given by (5.5). Formula (2.3) obtains by integration by parts. Since both forms of the characteristic function (before and after integration by parts) are interesting, we show this calculation in the main text. We calculate the first two moments of $r(t)$ by way of the characteristic function (5.5) and match them with the analytical results obtained from applying Itô's lemma to (1.1). Using (2.3), it is easy to see that the only valid choice (for all $t$) of functions $\tilde{b}$ and $\tilde{\sigma}$ is:
\[
\tilde{b}=b, \tag{5.6}
\]
\[
\tilde{\sigma}=\sigma. \tag{5.7}
\]
Note that it is also possible to see directly from our definitions that (5.6) and (5.7) hold, but this is more cumbersome.

5.3 Proofs of All Lemmas

Proof of Lemma 5.5
The function $t\to\int_0^\tau\int_0^t\Phi^{(n)}(s,a)\,dW_s\,da$ is continuous a.e. and a.s., even at a discontinuity point. We note however that the function
\[
t\to\int_{u=0}^{d^{(n)}(t)}\int_{s=0}^t\mu^{(n)}(s,a,\omega)\,ds\,da
\]
is discontinuous at such a point. The latter case is however covered by the regular Fubini theorem (see, e.g., Hunter and Nachtergaele 2001, p. 350). Applying both Fubini theorems, we have:
\[
Z^{(n)}_{mid}(t)-Z^{(n)}_{mid}(0)=(A)+(B)+(C), \tag{5.8}
\]
where:
\[
(A)=\int_{s=0}^t\int_{u=0}^{d^{(n)}(s)}\mu^{(n)}(s,u)\,du\,ds;\qquad
(B)=\int_{s=0}^t\int_{u=0}^{d^{(n)}(s)}\Phi^{(n)}(s,u)\,du\,\partial_s W(s,u);
\]
\[
(C)=\int_{s=0}^t\int_{u=d^{(n)}(s)^+}^{d^{(n)}(t)}\mu^{(n)}(s,u)\,du\,ds
+\int_{s=0}^t\int_{u=d^{(n)}(s)^+}^{d^{(n)}(t)}\Phi^{(n)}(s,u)\,du\,\partial_s W(s,u)
\]
\[
=\int_{s=0}^t\int_{a=s^+}^t\mu^{(n)}(s,d^{(n)}(a))\frac{dd^{(n)}}{da}\Big|_a\,da\,ds
+\int_{s=0}^t\int_{a=s^+}^t\Phi^{(n)}(s,d^{(n)}(a))\frac{dd^{(n)}}{da}\Big|_a\,da\,\partial_s W(s,d(a))
\]
\[
=\int_{a=0}^t\int_{s=0}^a\mu^{(n)}(s,d^{(n)}(a))\frac{dd^{(n)}}{da}\Big|_a\,ds\,da
+\int_{a=0}^t\int_{s=0}^a\Phi^{(n)}(s,d^{(n)}(a))\,\partial_s W(s,d(a))\frac{dd^{(n)}}{da}\Big|_a\,da
\]
\[
=\int_{a=0}^t\big(x^{(n)}(a,d^{(n)}(a))\big)^2\frac{dd^{(n)}}{da}\Big|_a\,da \tag{5.9}
\]
\[
=\sum_{i=1}^{M^{(n)}d}\Bigg[\sum_{k=1}^{N_{up}(i)}\int_{s=J^{(n)}_{k,up}(i)}^{J_{next}(J^{(n)}_{k,up}(i))}\big(\tilde{x}^{(n)}_i(s)\big)^2\frac{dd^{(n)}}{ds}\Big|_s\,ds
+\sum_{p=1}^{N_{down}(i)}\int_{s=J^{(n)}_{p,down}(i)}^{J_{next}(J^{(n)}_{p,down}(i))}\big(\tilde{x}^{(n)}_i(s)\big)^2\frac{dd^{(n)}}{ds}\Big|_s\,ds\Bigg]. \tag{5.10}
\]

Calculation of (A)
We split (A) into two parts: $(A)=(A1)+(A2)$. (The function $t\to\int_0^t\Phi^{(n)}(s,a)\,dW_s$ is not continuous at a discontinuity $t=d^{-1}(i/n)$, but the discontinuity is "integrated out" when this function is integrated with respect to $a$.)
\[
(A1)=\int_{s=0}^t\int_{u=0}^{d^{(n)}(s)}\Big(-b(s)\big(\tilde{x}^{(n)}_i(s)\big)^2+\frac{\sigma^2(s)}{4}\Big)\,du\,ds
=\int_{s=0}^t\Big(-b(s)Z^{(n,M)}_{mid}(s)+\frac{\sigma^2(s)d^{(n)}(s)}{4}\Big)\,ds.
\]
The calculation of (A2) is more complicated. It is easier to isolate the up branches and the down branches of $d^{(n)}$:
\[
(A2)=\sum_{i=1}^{M^{(n)}d}\Bigg[\sum_{k=1}^{N^{(n)}_{up}(i)}(A_{k,up,i,t})+\sum_{p=1}^{N^{(n)}_{down}(i)}(A_{p,down,i,t})\Bigg],
\]
where we define:
\[
(A_{k,up,i,t})=-\int_{s=J^{(n)}_{k,up}(i)}^{J^{(n)}_{next}(J^{(n)}_{k,up}(i))}\int_{u=0}^{d^{(n)}(s)}\delta(s-J^{(n)}_{k,up}(i))\big(\tilde{x}^{(n)}_i(s)\big)^2 1_{\{i-1<u\le i\}}\,du\,ds
=-\int_{s=J^{(n)}_{k,up}(i)}^{J^{(n)}_{next}(J^{(n)}_{k,up}(i))}\big(\tilde{x}^{(n)}_i(J^{(n)}_{k,up}(i))\big)^2\frac{dd^{(n)}}{ds}\Big|_s\,ds;
\]
\[
(A_{p,down,i,t})=-\int_{s=J^{(n)}_{p,down}(i)}^{J^{(n)}_{next}(J^{(n)}_{p,down}(i))}\int_{u=0}^{d^{(n)}(s)}\delta(t-J^{(n)}_{p,down}(i))\big(\tilde{x}^{(n)}_i(s)\big)^2\frac{dd^{(n)}}{ds}\Big|_s 1_{\{i-1<u\le i\}}\,du\,ds
=-\int_{s=J^{(n)}_{p,down}(i)}^{J^{(n)}_{next}(J^{(n)}_{p,down}(i))}\big(\tilde{x}^{(n)}_i(J^{(n)}_{p,down}(i))\big)^2\frac{dd^{(n)}}{ds}\Big|_s\,ds.
\]

Calculation of (B)
By Levy’s theorem there exists a collection of Brownian motions B ( n ) such that:( B ) = t Z s =0 d ( n ) ( s ) Z u =0 σ ( s )˜ x ( n ) i ( s, ω ) ∞ X i =1 { i − < u ≤ i } d u d W i ( s )= t Z s =0 [ d ( n ) ( s )] X i =1 σ ( s )˜ x ( n ) i ( s ) d W i ( s ) + ( d ( n ) ( s ) − [ d ( n ) ( s )]) σ ( s )˜ x ( n )[ d ( n ) ( s )]+1 ( s ) d W [ d ( n ) ( s )]+1 ( s )= t Z s =0 σ ( s ) q Z ( n ) mid ( s ) [ d ( n ) ( s )] P i =1 ˜ x ( n ) i ( s ) d W i ( s ) + ( d ( n ) ( s ) − [ d ( n ) ( s )])˜ x ( n )[ d ( n ) ( s )]+1 ( s ) d W [ d ( n ) ( s )]+1 ( s ) s [ d ( n ) ( s )] P i =1 ˜ x ( n ) i ( s ) + (cid:16) ( d ( n ) ( s ) − [ d ( n ) ( s )])˜ x ( n )[ d ( n ) ( s )]+1 ( s ) (cid:17) = Z ts =0 σ ( s ) q Z ( n ) mid ( s ) d B ( n ) ( s ) . A ) = ( A
1) + ( A B ) and ( C ). Proof of Lemma 5.7
We observe that:
\[
J^{(n)}_{next}(J^{(n)}_{k,up}(i))-J^{(n)}_{k,up}(i)=o(1/n),\qquad
J^{(n)}_{next}(J^{(n)}_{p,down}(i))-J^{(n)}_{p,down}(i)=o(1/n).
\]
Clearly $\frac{dd^{(n)}}{ds}\big|_s=o(n)$. There is no jump of $\tilde{x}^{(n,m,M)}_{i,mid}$ in the interval $[J^{(n)}_{k,up}(i),J^{(n)}_{p(k),down}(i))$ if $s\in[t^M_m,t^M_{m+1}]$; a fortiori there is no jump of $\tilde{x}^{(n,m,M)}_{i,mid}$ in the interval $[J^{(n)}_{k,up}(i),J^{(n)}_{next}(J^{(n)}_{k,up}(i)))$ if $s\in[t^M_m,t^M_{m+1}]$. Thus $E[|(\tilde{x}^{(n,m,M)}_{i,mid}(s))^2-(\tilde{x}^{(n)}_i(J^{(n)}_{k,up}(i)))^2|]$ is of order $1/M$ on $s\in[L^{(m,M,k,i,n)}_{up},U^{(m,M,k,i,n)}_{up}]$, and:
\[
E\bigg[\int_{L^{(m,M,k,i,n)}_{up}}^{U^{(m,M,k,i,n)}_{up}}\Big|(\tilde{x}^{(n,m,M)}_{i,mid}(s))^2-(\tilde{x}^{(n)}_{i,mid}(J^{(n,m,M)}_{k,up}(i)))^2\Big|\,ds\bigg]=o\Big(\frac1M\Big).
\]
Thus:
\[
E\bigg[\int_{L^{(m,M,k,i,n)}_{up}}^{U^{(m,M,k,i,n)}_{up}}\Big((\tilde{x}^{(n,m,M)}_{i,mid}(s))^2\frac{dd^{(n)}}{ds}\Big|_s-(\tilde{x}^{(n,m,M)}_{i,mid}(J^{(n)}_{k,up}(i)))^2\frac{dd^{(n)}}{ds}\Big|_{J^{(n)}_{k,up}(i)}\Big)\,ds\bigg]
\le
E\bigg[\int_{L^{(m,M,k,i,n)}_{up}}^{U^{(m,M,k,i,n)}_{up}}\Big|(\tilde{x}^{(n,m,M)}_{i,mid}(s))^2-(\tilde{x}^{(n,m,M)}_{i,mid}(J^{(n)}_{k,up}(i)))^2\Big|\,\bigg|\max_{s\in[J^{(n)}_{k,up}(i),J_{next}(J^{(n)}_{k,up}(i))]}\frac{dd^{(n)}}{ds}\Big|_s\bigg|\,ds\bigg]
=o\Big(\frac{n}{M}\Big).
\]
Since $Var[|(\tilde{x}^{(n,m,M)}_{i}(s))^2-(\tilde{x}^{(n,m,M)}_{i}(J^{(n)}_{k,up}(i)))^2|]$ is also of order $1/M$ on $s\in[L^{(m,M,k,i,n)}_{up},U^{(m,M,k,i,n)}_{up}]$,
\[
Var\bigg[\int_{L^{(m,M,k,i,n)}_{up}}^{U^{(m,M,k,i,n)}_{up}}\Big((\tilde{x}^{(n,m,M)}_{i,mid}(s))^2\frac{dd^{(n)}}{ds}\Big|_s-(\tilde{x}^{(n,m,M)}_{i,mid}(J^{(n)}_{k,up}(i)))^2\frac{dd^{(n)}}{ds}\Big|_{J^{(n)}_{k,up}(i)}\Big)\,ds\bigg]=o\Big(\frac{n^2}{M}\Big).
\]
The double sum $\sum_{i=1}^{M^{(n)}d}\big(\sum_{k=1}^{N_{up}(i)}+\sum_{k=1}^{N_{down}(i)}\big)$ contributes a number of jumps equal to $\mathrm{card}\{\mathcal{J}^{(n)}\}-1$, which is of order $n$. Thus:
\[
E\Big[\frac{R^{(n,m,M)}(t)}{n}\Big]=o\Big(\frac{n}{M}\Big). \tag{5.11}
\]
The variance of $R^{(n)}(t)/n$ is a bit more difficult to calculate. We focus only on the up branches; the computation for the down branches is similar.
\[
Var\bigg[\sum_{k=1}^{N^{(n)}_{up}(i)}\int_{L^{(m,M,k,i,n)}_{up}}^{U^{(m,M,k,i,n)}_{up}}\Big((\tilde{x}^{(n,m,M)}_{i,mid}(s))^2\frac{dd^{(n)}}{ds}\Big|_s-(\tilde{x}^{(n,m,M)}_{i,mid}(J^{(n)}_{k,up}(i)))^2\frac{dd^{(n)}}{ds}\Big|_{J^{(n)}_{k,up}(i)}\Big)\,ds\bigg]
\]
\[
=\sum_{k=1}^{N^{(n)}_{up}(i)}Var\bigg[\int_{L^{(m,M,k,i,n)}_{up}}^{U^{(m,M,k,i,n)}_{up}}\Big((\tilde{x}^{(n,m,M)}_{i,mid}(s))^2\frac{dd^{(n)}}{ds}\Big|_s-(\tilde{x}^{(n,m,M)}_{i,mid}(J^{(n)}_{k,up}(i)))^2\frac{dd^{(n)}}{ds}\Big|_{J^{(n)}_{k,up}(i)}\Big)\,ds\bigg]
\]
\[
+\sum_{\substack{k,k'=1\\k\neq k'}}^{N^{(n)}_{up}(i)}Cov\bigg(\int_{L^{(m,M,k,i,n)}_{up}}^{U^{(m,M,k,i,n)}_{up}}\Big((\tilde{x}^{(n,m,M)}_{i,mid}(s))^2\frac{dd^{(n)}}{ds}\Big|_s-(\tilde{x}^{(n,m,M)}_{i,mid}(J^{(n)}_{k,up}(i)))^2\frac{dd^{(n)}}{ds}\Big|_{J^{(n)}_{k,up}(i)}\Big)\,ds,
\int_{L^{(m,M,k',i,n)}_{up}}^{U^{(m,M,k',i,n)}_{up}}\Big((\tilde{x}^{(n,m,M)}_{i,mid}(s))^2\frac{dd^{(n)}}{ds}\Big|_s-(\tilde{x}^{(n,m,M)}_{i,mid}(J^{(n)}_{k',up}(i)))^2\frac{dd^{(n)}}{ds}\Big|_{J^{(n)}_{k',up}(i)}\Big)\,ds\bigg). \tag{5.12}
\]
However the number $N^{(n)}_{up}(i)$ of terms is of order 1, thus the number of terms in (5.12) is still of order 1, and:
\[
Var\Big[\frac{R^{(n,m,M)}(t)}{n}\Big]=o\Big(\frac{n^2}{M}\Big). \tag{5.13}
\]
Then by Chebyshev's inequality, for all $\epsilon>0$:
\[
P\Big(\Big|\frac{R^{(n,m,M)}(t)}{n}-E\Big[\frac{R^{(n,m,M)}(t)}{n}\Big]\Big|>\epsilon\Big)\le\frac{Var[R^{(n,m,M)}(t)/n]}{\epsilon^2}\xrightarrow[n\to+\infty]{}0.
\]
As a result, with $M(n)=n^Q$, $R^{(n,m,M(n))}(t)/n$ tends to $0$ in probability as $n\to\infty$.

Proof of Lemma 5.8
Fix any $\varepsilon,\delta>0$. We proceed by induction on $n$. The case $n=1$ is trivial. Suppose that the result is true for $n-1$. Let $A_{\varepsilon,\delta}$ be the event
\[
\{|r^{(n,M)}_{n,mid}(t^M_{m+1})-r^{(n,M)}_{n,mid}(t^M_{m+1}-\varepsilon)|>\delta\}.
\]
Set $B^{(n)}_{j,mid}=B^{(n-1)}_{j,mid}$ for $1\le j\le n-1$. By the induction hypothesis,
\[
\lim_{\varepsilon\to0}P\Big(\Big|\sum_{j=1}^{n-1}r^{(n,M)}_{j,mid}(t^M_{m+1})-r^{(n,M)}_{j,mid}(t^M_{m+1}-\varepsilon)\Big|>\delta\Big)=0.
\]
Suppose $P(A_{\varepsilon,\delta})>0$. We then have
\[
P\Big(\Big|\sum_{j=1}^{n}r^{(n,M)}_{j,mid}(t^M_{m+1})-r^{(n,M)}_{j,mid}(t^M_{m+1}-\varepsilon)\Big|>\delta\Big)>0.
\]
By Lévy's theorem, we arrive at the contradiction:
\[
P\big(|Z^{(n)}_{mid}(t^M_{m+1})-Z^{(n)}_{mid}(t^M_{m+1}-\varepsilon)|>\delta\big)>0.
\]

Proof of Lemma 5.9
Define ˜ R ( n,M ) ( t ) = M X m =1 R ( n,m,M ) ( t )1 { t Mm ≤ t < t Mm +1 } tep (i) : tightness of r ( n,M )1 ,mid Let h ( n,M )1 ( t ) = Z ts =0 − b ( s ) r ( n,M )1 ,mid ( s ) − σ ( s ) d ( s )4 d s + Z ts =0 σ ( s ) q r ( n,M )1 ,mid ( s ) d B ( n,M )1 ,mid ( s ) . (5.14)Thus: r ( n,M )1 ,mid ( t ) = h ( n,M )1 ( t ) + ˜ R ( n,M ) ( t ) n . . By Lemma 5.8, the process r ( n,M )1 ,mid is continuous. Let the modulus of continuity be: w ( r ( n,M )1 ,mid , δ ) = sup | s − t | <δ ≤ s,t ≤ T | r ( n,M )1 ,mid ( t ) − r ( n,M )1 ,mid ( s ) | . We bound its expected value: E [ w ( r ( n,M )1 ,mid , δ )] ≤ ( A ) + ( B ) + ( C ) , where: ( A ) = E [ sup | s − t | <δ Z tu = s − b ( u ) r ( n,M )1 ,mid ( u ) − σ ( u ) d ( n ) ( u )4 d u ] < K δ for some constant K >
0. With the assumption that σ is bounded:( B ) = E [ sup | s − t | <δ Z tu = s σ ( u ) q r ( n,M )1 ,mid ( u ) d B ( n,M )1 ,mid ( u )] ≤ E [ sup s ∈ [0 ,T ] | σ ( s ) q r ( n,M )1 ,mid ( s ) | sup | s − t | <δ | B ( n,M )1 ,mid ( t ) − B ( n,M )1 ,mid ( s ) | ] ≤ (cid:0) E [ sup s ∈ [0 ,T ] σ ( s ) q r ( n,M )1 ,mid ( s )] (cid:1) / (cid:0) E [ sup | s − t | <δ | B ( n,M )1 ,mid ( t ) − B ( n,M )1 ,mid ( s ) | ] (cid:1) / (H¨older inequality) ≤ K δ / , where K > B is equal to : E [sup s
0. Then for any ε >
0, usingMarkov’s inequality: P ( w ( r ( n,M ( n ))1 ,mid , δ ) ≥ ε ) ≤ E [ w ( r ( n,M ( n ))1 ,mid , δ )] ε . Clearly, for some constant K > E [ w ( r ( n,M ( n ))1 ,mid , δ )] ≤ sup | s − t | <δ ( E [( r ( n,M ( n ))1 ,mid ( t ) − r ( n,M ( n )) j,mid ( s )) ]) / ≤ K δ / . Thus: P ( w ( r ( n,M ( n ))1 ,mid , δ ) ≥ ε ) ≤ K δ / ε . We now invoke Theorem 8.2 in Billingsley (1968). Tightness occurs if, for each positive ε and η there must existof δ such that, for all n ≥ n P ( w ( r ( n,M ( n ))1 ,mid , δ ) ≥ ε ) ≤ η. (5.16)As a result we can select δ such that: K δ / ε ≤ η. We thus select δ = min( ( ηε ) K , . and, for all n ≥ n = Tδ , (5.16) occurs. Step (ii) : convergence of r ( n,M ( n ))1 ,mid to a solution of (1.1).By Lemma 5.7, for any ε > t :lim n →∞ P ( | r ( n,M ( n ))1 ,mid ( t ) − h ( n )1 ( t ) | > ε ) = 0Thus, r ( n,M ( n ))1 ,mid ( t ) − h ( n )1 ( t ) d → t < t :( r ( n,M ( n ))1 ,mid ( t ) − h ( n )1 ( t ) , r ( n,M ( n ))1 ,mid ( t ) − r ( n,M ( n ))1 ,mid ( t ) − h ( n )1 ( t ) − h ( n )1 ( t )) d → (0 , . By Corollary 1 to Theorem 5.1 in Billingsley (1968):( r ( n,M ( n ))1 ,mid ( t ) − h ( n )1 ( t ) , r ( n,M ( n ))1 ,mid ( t ) − h ( n )1 ( t )) d → (0 , . r ( n,M ( n ))1 ,mid − h ( n )1 to zero.In other terms: lim n →∞ P ( r ( n,M ( n ))1 ,mid = h ( n )1 ) = 1We can extract a subsequence { n k ( i ) } so that k ≥ k ( i ) implies that P ( | r ( n k ,M ( n k ))1 ,mid − h ( n k )1 | ≥ i − ) = 2 − i . By thefirst Borel-Cantelli lemma there is a probability 1 that | r ( n k )1 ,mid − h ( n k )1 | ≤ i − for all but finitely many i . Therefore:lim k → + ∞ r ( n k ,M ( n k ))1 ,mid = lim k → + ∞ h ( n k )1 a.s. P In other terms: P ( r ( ∞ , ∞ )1 ,mid = r ( ∞ , ∞ )1 ,mid (0) − Z .s =0 b ( s ) r ( ∞ , ∞ )1 ,mid ( s ) − σ d ( n ) ( s )4 n d s + Z .s =0 σ ( s ) q r ( ∞ , ∞ )1 ,mid ( s ) d B ( ∞ )1 ,mid ( s )) = 1 . Thus ( r ( ∞ , ∞ )1 ,mid , B ( ∞ )1 ,mid ) is a weak solution to the stochastic differential Equation (1.1). 
Maghsoodi (1996) proved that, under our assumptions, pathwise uniqueness holds for (1.1). $\Box$

Proof of Lemma 5.11
By differentiability of $d^{(n)}$, for any $t^M_m\le t<t^M_{m+1}$:
\[
d^{(n)}(t)-d_{low}(t^M_m,t^M_{m+1})=o\Big(\frac{n}{M}\Big).
\]
By definition of $Z^{(n,m,M)}_{\chi}$ and $\tilde{Z}^{(n,M)}_{mid}$:
\[
E[Z^{(n,m,M)}_{\chi}(t^M_{m+1})-Z^{(n,M)}_{mid}(t^M_{m+1})\mid\mathcal{F}_{t^M_m}]=o\Big(\frac{n^Q}{M};t^M_m\Big), \tag{5.17}
\]
\[
Var[Z^{(n,m,M)}_{\chi}(t^M_{m+1})-Z^{(n,M)}_{mid}(t^M_{m+1})\mid\mathcal{F}_{t^M_m}]=o\Big(\frac{n^Q}{M};t^M_m\Big). \tag{5.18}
\]
By differentiability of the characteristic function, the following Taylor series holds:
\[
E[\exp(i\omega Z^{(n,M)}_{mid}(t))\mid\mathcal{F}_{t^M_m}]
=E[\exp(i\omega Z^{(n,m,M)}_{\chi}(t))\mid\mathcal{F}_{t^M_m}]
+i\omega E[Z^{(n,M)}_{mid}(t)-Z^{(n,m,M)}_{\chi}(t)\mid\mathcal{F}_{t^M_m}]
-\frac{\omega^2}{2}E[(Z^{(n,M)}_{mid}(t)-Z^{(n,m,M)}_{\chi}(t))^2\mid\mathcal{F}_{t^M_m}]+\ldots
\]
Let $X$ be an $\mathcal{F}_{t^M_m}$-measurable random variable with bounded mean. Suppose that
\[
X\Big(\frac{K}{M}+i\frac{K}{M}\Big)\ge E[\exp(i\omega Z^{(n,M)}_{mid}(t))\mid\mathcal{F}_{t^M_m}]-E[\exp(i\omega Z^{(n,m,M)}_{\chi}(t))\mid\mathcal{F}_{t^M_m}]\ge X\Big(-\frac{K}{M}-i\frac{K}{M}\Big);
\]
then:
\[
\Big|\omega E[Z^{(n,M)}_{mid}(t)-Z^{(n,m,M)}_{\chi}(t)\mid\mathcal{F}_{t^M_m}]-\frac{\omega^2}{2}E[(Z^{(n,M)}_{mid}(t)-Z^{(n,m,M)}_{\chi}(t))^2\mid\mathcal{F}_{t^M_m}]+\ldots\Big|\ge X\frac{K}{M}, \tag{5.19}
\]
\[
\Big|-\frac{\omega^2}{2}E[(Z^{(n,M)}_{mid}(t)-Z^{(n,m,M)}_{\chi}(t))^2\mid\mathcal{F}_{t^M_m}]+\ldots\Big|\ge X\frac{K}{M}. \tag{5.20}
\]
The left-hand sides of (5.19) and (5.20) can be majorized, and we obtain
\[
X\frac{K}{M}\le\exp\big(\omega E[|Z^{(n,M)}_{mid}(t)-Z^{(n,m,M)}_{\chi}(t)|\mid\mathcal{F}_{t^M_m}]\big)-1\le X\frac{K}{M},
\]
\[
X\frac{K}{M}\le\exp\big(\omega^2 E[(Z^{(n,M)}_{mid}(t)-Z^{(n,m,M)}_{\chi}(t))^2\mid\mathcal{F}_{t^M_m}]\big)-1\le X\frac{K}{M},
\]
hence
\[
E[\exp(i\omega Z^{(n,M)}_{mid}(t))\mid\mathcal{F}_{t^M_m}]-E[\exp(i\omega Z^{(n,m,M)}_{\chi}(t))\mid\mathcal{F}_{t^M_m}]=o\Big(\frac{n^Q}{M};t^M_m\Big). \tag{5.21}
\]
From now on within this proof, we suppress the subscript $n$ from our variables. For the moment we fix $M$, and take it out of the subscripts of our variables. We let $\Delta=1/M$, and suppose that $K\Delta=t$. We find it more convenient to shorten the notation further by:
\begin{align*}
Z^{(m,M)}_{mid}(t^M_{m+1})&\to Z_{mid}((m+1)\Delta), &
d_{low}(m\Delta,(m+1)\Delta)&\to\tilde{d}((m+1)\Delta),\\
E[X\mid Z^{(n)}_{mid}(t^M_m)]&\to E_{m\Delta}[X], &
c^{(n,m,M)}&\to c(m\Delta).
\end{align*}
We denote the conditional non-centrality parameter and scale factor of $Z_{\chi}((m+1)\Delta)$ given $Z^{(n,M)}_{mid}(t^M_m)$ by:
\[
\lambda(m\Delta)=\frac{Z^{(n,M)}_{mid}(m\Delta)\,e^{-\tilde{b}(m\Delta)\Delta}}{c(m\Delta)}, \tag{5.22}
\]
\[
c(m\Delta)=\frac{(1-e^{-\tilde{b}(m\Delta)\Delta})\,\tilde{\sigma}^2(m\Delta)}{4\tilde{b}(m\Delta)}, \tag{5.23}
\]
where, so far, $\tilde{b}$ and $\tilde{\sigma}$ are unknown functions. By the characteristic function of the non-central chi-square distribution, the assumption that $M(n)=n^Q$ and (5.21), for $k=0,\ldots,K-1$:
\[
E_{k\Delta}[\exp(i\omega Z_{mid}((k+1)\Delta))]
=\frac{\exp\Big(\frac{i\omega Z_{mid}(k\Delta)e^{-\tilde{b}(k\Delta)\Delta}}{1-2i\omega c(k\Delta)}\Big)}{(1-2i\omega c(k\Delta))^{\tilde{d}((k+1)\Delta)/2}}+o(\Delta;k\Delta). \tag{5.24}
\]
As a way to find the general iteration formula, we study the case where $k=2$. Defining
\[
\omega_1=\frac{\omega e^{-\tilde{b}(2\Delta)\Delta}}{1-2i\omega c(2\Delta)},
\]
we calculate
\[
E_{\Delta}[\exp(i\omega_1 Z_{mid}(2\Delta))]
=\frac{\exp\Big(\frac{i\omega_1 Z_{mid}(\Delta)e^{-\tilde{b}(\Delta)\Delta}}{1-2i\omega_1 c(\Delta)}\Big)}{(1-2i\omega_1 c(\Delta))^{\tilde{d}(2\Delta)/2}}+o(\Delta;\Delta).
\]
Hence,
\[
E_{\Delta}[\exp(i\omega Z_{mid}(3\Delta))]
=\frac{\exp\Big(\frac{i\omega_1 Z_{mid}(\Delta)e^{-\tilde{b}(\Delta)\Delta}}{1-2i\omega_1 c(\Delta)}\Big)}{(1-2i\omega_1 c(\Delta))^{\tilde{d}(2\Delta)/2}(1-2i\omega c(2\Delta))^{\tilde{d}(3\Delta)/2}}+o(\Delta;\Delta)
\]
\[
=\frac{\exp\Big(\frac{i\omega Z_{mid}(\Delta)e^{(-\tilde{b}(\Delta)-\tilde{b}(2\Delta))\Delta}}{1-2i\omega(c(2\Delta)+e^{-\tilde{b}(2\Delta)\Delta}c(\Delta))}\Big)}{\big(1-2i\omega(c(2\Delta)+e^{-\tilde{b}(2\Delta)\Delta}c(\Delta))\big)^{\tilde{d}(2\Delta)/2}\big(1-2i\omega c(2\Delta)\big)^{(\tilde{d}(3\Delta)-\tilde{d}(2\Delta))/2}}+o(\Delta;\Delta).
\]
(We thank an anonymous referee for this judicious construction.) Iterating, we obtain:
\[
E[\exp(i\omega Z_{mid}(K\Delta))]
=\frac{\exp\Big(\frac{i\omega Z_{mid}(0)e^{-\sum_{k=0}^{K-1}\tilde{b}(k\Delta)\Delta}}{1-2i\omega\sum_{k=0}^{K-1}e^{-\Delta\sum_{l=k+1}^{K-1}\tilde{b}(l\Delta)}c(k\Delta)}\Big)}{\Big(1-2i\omega\sum_{k=0}^{K-1}e^{-\Delta\sum_{l=k+1}^{K-1}\tilde{b}(l\Delta)}c(k\Delta)\Big)^{\tilde{d}(\Delta)/2}A_K}+o(\Delta),
\]
where the denominator $A_K$ is given by:
\[
A_K=\prod_{j=0}^{K-1}\Big(1-2i\omega\sum_{k=j}^{K-1}e^{-\Delta\sum_{l=k+1}^{K-1}\tilde{b}(l\Delta)}c(k\Delta)\Big)^{(\tilde{d}((j+1)\Delta)-\tilde{d}(j\Delta))/2}.
\]
In order to show the limit of the above sequence, we take the logarithm and get:
\[
\log A_K=\frac12\sum_{j=0}^{K-1}\big(\tilde{d}((j+1)\Delta)-\tilde{d}(j\Delta)\big)\log\Big(1-2i\omega\sum_{k=j}^{K-1}e^{-\Delta\sum_{l=k+1}^{K-1}\tilde{b}(l\Delta)}c(k\Delta)\Big).
\]
We now return to placing subscripts $(M)$ on our expressions.
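The one-step iteration (5.24) can be sanity-checked numerically in the constant-coefficient case, where composing the scaled non-central chi-square transition over two steps of size $\Delta$ must reproduce the exact transition over $2\Delta$ (Chapman-Kolmogorov). The constants $b$, $\sigma$, $d$, $z_0$ below are illustrative choices of our own.

```python
import cmath

# Composing the one-step conditional characteristic function of the scaled
# non-central chi-square transition over two steps of size dt reproduces the
# exact transition over 2*dt (constant coefficients, illustrative values).
b, sig, d, z0 = 0.9, 0.3, 2.5, 1.3

def step_cf(w, z, h):
    """E[exp(i w Z(t+h)) | Z(t) = z] for the squared-Bessel/CIR transition."""
    c = (1 - cmath.exp(-b * h)) * sig ** 2 / (4 * b)
    return cmath.exp(1j * w * z * cmath.exp(-b * h) / (1 - 2j * w * c)) \
        / (1 - 2j * w * c) ** (d / 2)

w, dt = 0.7, 0.25
# Second step seen from time dt contributes the factor exp(i*w1*Z(dt)) times
# (1 - 2i*w*c)^(-d/2), with w1 defined as omega_1 in the proof.
c = (1 - cmath.exp(-b * dt)) * sig ** 2 / (4 * b)
w1 = w * cmath.exp(-b * dt) / (1 - 2j * w * c)
composed = step_cf(w1, z0, dt) / (1 - 2j * w * c) ** (d / 2)
direct = step_cf(w, z0, 2 * dt)
print(abs(composed - direct))  # zero up to floating-point rounding
```

The agreement follows from the algebraic identity $c(\Delta)(1+e^{-b\Delta})=c(2\Delta)$, which is exactly the telescoping used in the iteration above.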
Let
\[
\log A^{(M)}_{K/M}=\frac12\sum_{j=0}^{K-1}h^{(M)}\Big(\frac{j}{M}\Big)g^{(M)}\Big(\frac{j}{M}\Big),
\]
where:
\[
h^{(M)}\Big(\frac{j}{M}\Big)=\tilde{d}\Big(\frac{j+1}{M}\Big)-\tilde{d}\Big(\frac{j}{M}\Big)
=d_{low}\Big(\frac{j}{M},\frac{j+1}{M}\Big)-d_{low}\Big(\frac{j-1}{M},\frac{j}{M}\Big),
\]
\[
g^{(M)}\Big(\frac{j}{M}\Big)=\log\Big(1-2i\omega\sum_{k=j}^{K-1}e^{-\frac1M\sum_{l=k+1}^{K-1}\tilde{b}(l/M)}c^{(M)}\Big(\frac{k}{M}\Big)\Big).
\]
Let $\mathcal{R}(M)=\{j\mid h(j/M)\neq d'(j/M)/M+o(1/M)\}$. Since there is a finite number of minima of $d^{(n)}$, there is a sequence $M_K$ (take for instance a dyadic sequence) such that for all $K>K_0$ the set $\mathcal{R}(M_K)$ is finite and constant. We can thus split the calculation into
\[
\log A^{(M_K)}_K=\frac12\sum_{j=0}^{K-1}h^{(M_K)}\Big(\frac{j}{M_K}\Big)g^{(M_K)}\Big(\frac{j}{M_K}\Big)
=\sum_{j\in\{0,\ldots,K\}\setminus\mathcal{R}(M_K)}h^{(M_K)}\Big(\frac{j}{M_K}\Big)g^{(M_K)}\Big(\frac{j}{M_K}\Big) \tag{5.25}
\]
\[
+\sum_{j\in\mathcal{R}(M_K)}h^{(M_K)}\Big(\frac{j}{M_K}\Big)g^{(M_K)}\Big(\frac{j}{M_K}\Big). \tag{5.26}
\]
The sum in (5.26) has a finite number of terms, while the sum in (5.25) has a number of terms that tends to infinity as $K\to\infty$. Also, observe that, by using the mean value theorem,
\[
c(k\Delta)=\frac{(1-e^{-\tilde{b}(k\Delta)\Delta})\,\tilde{\sigma}^2(k\Delta)}{4\tilde{b}(k\Delta)}=\frac{\tilde{\sigma}^2(k\Delta)\Delta}{4}+o(\Delta).
\]
Thus we can write:
\[
\lim_{K\to\infty}\log A^{(M_K)}_K=\frac12\int_0^t d'(s)g(s)\,ds,
\quad\text{with}\quad
g(s)=\log\Big(1-i\omega\int_s^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv\Big).
\]
Equivalently:
\[
\lim_{K\to\infty,\;K/M_K=t}A^{(M_K)}_K=\exp\Big(\frac12\int_0^t d'(s)\log\Big(1-i\omega\int_s^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv\Big)\,ds\Big).
\]
We also observe that, by Riemann summation,
\[
\frac{\exp\Big(\frac{i\omega Z_{mid}(0)e^{-\sum_{k=0}^{K-1}\tilde{b}(k\Delta)\Delta}}{1-2i\omega\sum_{k=0}^{K-1}e^{-\Delta\sum_{l=k+1}^{K-1}\tilde{b}(l\Delta)}c(k\Delta)}\Big)}{\Big(1-2i\omega\sum_{k=0}^{K-1}e^{-\Delta\sum_{l=k+1}^{K-1}\tilde{b}(l\Delta)}c(k\Delta)\Big)^{\tilde{d}(\Delta)/2}}
\xrightarrow[K\to+\infty]{}
\frac{\exp\Big(\frac{i\omega Z_{mid}(0)e^{-\int_0^t\tilde{b}(u)\,du}}{1-i\omega\int_0^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv}\Big)}{\Big(1-i\omega\int_0^t e^{-\int_v^t\tilde{b}(u)\,du}\tilde{\sigma}^2(v)/2\,dv\Big)^{d(0)/2}}.
\]
Then Lemma 5.11 follows.
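In the constant-coefficient case ($\tilde{b}=b$, $\tilde{\sigma}=\sigma$, $d$ constant so $d'\equiv0$), the limit above collapses to the characteristic function of a scaled non-central chi-square variable, i.e. to the classical CIR transition. The Monte Carlo comparison below (parameter values are our own illustrative choices) samples the exact transition and compares the empirical characteristic function with the reduced formula.

```python
import numpy as np

# With b, sigma constant and d = 4*theta/sigma^2 constant, the limiting
# characteristic function reduces to
#   E[exp(i w r(t))] = exp(i w r0 e^{-bt} / (1 - i w S)) / (1 - i w S)^{d/2},
#   S = sigma^2 (1 - e^{-bt}) / (2 b),
# which matches the exact CIR transition r(t) = c * ncx2(d, lam) with
#   c = sigma^2 (1 - e^{-bt}) / (4 b),  lam = r0 e^{-bt} / c.
rng = np.random.default_rng(3)
b, sigma, theta, r0, t = 0.6, 0.25, 0.05, 0.04, 2.0
d = 4 * theta / sigma ** 2

c = sigma ** 2 * (1 - np.exp(-b * t)) / (4 * b)
lam = r0 * np.exp(-b * t) / c
r_t = c * rng.noncentral_chisquare(d, lam, size=400000)

w = 5.0
S = sigma ** 2 * (1 - np.exp(-b * t)) / (2 * b)
phi_55 = np.exp(1j * w * r0 * np.exp(-b * t) / (1 - 1j * w * S)) \
    / (1 - 1j * w * S) ** (d / 2)
phi_mc = np.exp(1j * w * r_t).mean()
print(abs(phi_mc - phi_55))  # Monte Carlo error, close to 0
```

Note that $S=2c$, which is how the factor $\tilde{\sigma}^2(v)/2$ in the integrand arises from the scale factor $c$ of the chi-square transition.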
Proof of Lemma 5.12
The proof is analogous to the proof of Lemma 5.11, and we show only the first inductive step. By (5.4),
\[
E[\exp(i\omega\tilde{Z}^{(n,M)}_{mid}(t^M_{m+1}))\mid\mathcal{F}_{t^M_m}]
=E\Big[\exp\Big(i\omega\sum_{j=1}^n r^{(n,M)}_{j,mid}(t^M_{m+1})\Big)\,\Big|\,\mathcal{F}_{t^M_m}\Big]
\]
\[
=\exp\Big(i\omega\sum_{j=1}^n r^{(n,M)}_{j,mid}(t^M_m)\Big)E\Big[\exp\Big(i\omega\Big(\sum_{j=1}^n h^{(n,M)}_{j,mid}(t^M_{m+1})+R^{(n,m,M)}\Big)\Big)\,\Big|\,\mathcal{F}_{t^M_m}\Big]
\]
\[
=\exp\big(i\omega\tilde{Z}^{(n,M)}_{mid}(t^M_m)\big)E\Big[\exp\Big(i\omega\Big(\sum_{j=1}^n h^{(n,M)}_{j,mid}(t^M_{m+1})+R^{(n,m,M)}\Big)\Big)\,\Big|\,\mathcal{F}_{t^M_m}\Big].
\]
We use (5.11) and (5.13), as well as the arguments presented at the beginning of the proof of Lemma 5.11, to obtain
\[
E[\exp(i\omega\tilde{Z}^{(n,M)}_{mid}(t^M_{m+1}))\mid\mathcal{F}_{t^M_m}]
=\exp\big(i\omega\tilde{Z}^{(n,M)}_{mid}(t^M_m)\big)E\Big[\exp\Big(i\omega\sum_{j=1}^n h^{(n,M)}_{j,mid}(t^M_{m+1})\Big)\,\Big|\,\mathcal{F}_{t^M_m}\Big]+o\Big(\frac{n^Q}{M};t^M_m\Big)
\]
\[
=E\Big[\exp\Big(i\omega\sum_{j=1}^n\big(r^{(n,M)}_{j,mid}(t^M_m)+h^{(n,M)}_{j,mid}(t^M_{m+1})\big)\Big)\,\Big|\,\mathcal{F}_{t^M_m}\Big]+o\Big(\frac{n^Q}{M};t^M_m\Big).
\]
However each copy of $r^{(n,M)}_{j,mid}(t^M_m)+h^{(n,M)}_{j,mid}(t^M_{m+1})$ is conditionally independent. Thus
\[
E[\exp(i\omega\tilde{Z}^{(n,M)}_{mid}(t^M_{m+1}))\mid\mathcal{F}_{t^M_m}]
=\Big(E\big[\exp\big(i\omega(r^{(n,M)}_{1,mid}(t^M_m)+h^{(n,M)}_{1,mid}(t^M_{m+1}))\big)\mid\mathcal{F}_{t^M_m}\big]\Big)^n+o\Big(\frac{n^Q}{M};t^M_m\Big).
\]
Moreover,
\[
E\big[\exp\big(i\omega(r^{(n,M)}_{1,mid}(t^M_m)+h^{(n,M)}_{1,mid}(t^M_{m+1}))\big)\mid\mathcal{F}_{t^M_m}\big]
-E\big[\exp\big(i\omega r^{(n,M)}_{1,mid}(t^M_{m+1})\big)\mid\mathcal{F}_{t^M_m}\big]=o\Big(\frac{n^Q}{M};t^M_m\Big).
\]
Thus:
\[
E[\exp(i\omega\tilde{Z}^{(n,M)}_{mid}(t^M_{m+1}))\mid\mathcal{F}_{t^M_m}]
=\Big(E\big[\exp\big(i\omega r^{(n,M)}_{1,mid}(t^M_{m+1})\big)\mid\mathcal{F}_{t^M_m}\big]\Big)^n+o\Big(\frac{n^Q}{M};t^M_m\Big).
\]
Then Lemma 5.12 holds by using the infinite divisibility of the SNC chi-square distribution and Lemma 5.11. $\Box$
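The two ingredients of this last step, the closed form (5.2) and the infinite divisibility of the chi-square characteristic function, can both be checked numerically. The sketch below (our own illustrative choice of $\lambda_1$, $\lambda_2$, with scale $c=1$) builds non-central chi-square samples as sums of squares of independent normals, compares the empirical characteristic function against (5.2), and verifies $\Phi_{\lambda_1,\lambda_2}(\omega)=\big(\Phi_{\lambda_1/n,\lambda_2/n}(\omega)\big)^n$.

```python
import numpy as np

# phi(w) = exp(i*l1*w/(1-2iw)) / (1-2iw)^(l2/2) is the CF in (5.2); it is
# infinitely divisible: phi_{l1,l2} = (phi_{l1/n, l2/n})^n.
rng = np.random.default_rng(1)

def phi(w, l1, l2):
    return np.exp(1j * l1 * w / (1 - 2j * w)) / (1 - 2j * w) ** (l2 / 2)

l2, l1, w = 3, 2.0, 0.4

# X is a sum of l2 squared independent normals with mean vector mu,
# sum(mu^2) = l1, i.e. exactly chi^2(l1, l2, 1).
mu = np.array([np.sqrt(l1), 0.0, 0.0])
X = (rng.normal(mu, 1.0, size=(200000, l2)) ** 2).sum(axis=1)
phi_mc = np.exp(1j * w * X).mean()

print(abs(phi_mc - phi(w, l1, l2)))                       # Monte Carlo error
print(abs(phi(w, l1, l2) - phi(w, l1 / 5, l2 / 5) ** 5))  # infinite divisibility
```

The divisibility identity holds exactly (up to floating-point rounding), while the first comparison carries only Monte Carlo noise of order $N^{-1/2}$.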
References

[1] Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.
[2] Brigo, D., and F. Mercurio (2001). A Deterministic-Shift Extension of Analytically-Tractable and Time-Homogeneous Short-Rate Models. Finance and Stochastics, Vol. 5, 369-387.
[3] Brigo, D., and F. Mercurio (2006). Interest Rate Models: Theory and Practice - with Smile, Inflation, and Credit. Springer.
[4] Chen, L., D. Filipovic, and H. Vincent Poor (2004). Quadratic Term Structure Models for Risk-Free and Defaultable Rates. Mathematical Finance, Vol. 14, No. 4, 515-536.
[5] Cotton, P., J.-P. Fouque, G. Papanicolaou, and R. Sircar (2004). Stochastic Volatility Corrections for Interest Rate Derivatives. Mathematical Finance, Vol. 14, No. 2, 173-200.
[6] Cox, J.C., J.E. Ingersoll, and S.A. Ross (1985a). An Intertemporal General Equilibrium Model of Asset Prices. Econometrica, Vol. 53, No. 2, 363-384.
[7] Cox, J.C., J.E. Ingersoll, and S.A. Ross (1985b). A Theory of the Term Structure of Interest Rates. Econometrica, 53, 385-407.
[8] Duffie, D., and R. Kan (1996). A Yield Factor Model of Interest Rates. Mathematical Finance, 6, 379-406.
[9] Fouque, J.-P., and M. Lorig (2011). A Fast Mean-Reverting Correction to Heston's Stochastic Volatility Model. SIAM Journal on Financial Mathematics, Vol. 2, 221-254.
[10] Geman, H. (1989). L'Importance de la Probabilité Forward Neutre dans une Approche Stochastique des Taux d'Intérêt. ESSEC Working Paper (Univ. Paris Panthéon-Sorbonne PhD dissertation).
[11] Gourieroux, C., and A. Monfort (2011). Bilinear Term Structure Models. Mathematical Finance, Vol. 21, No. 1, 1-19.
[12] Heston, S. (1993). A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options. The Review of Financial Studies, Vol. 6, No. 2, 327-343.
[13] Hull, J.C., and A. White (1990). Pricing Interest Rate Derivative Securities. The Review of Financial Studies, Vol. 3.
[14] Hunter, J., and B. Nachtergaele (2001). Applied Analysis. World Scientific.
[15] Jamshidian, F. (1989). An Exact Bond Option Pricing Formula. The Journal of Finance, 44, 205-209.
[16] Jamshidian, F. (1993b). A Simple Class of Square-Root Models. Fuji International Finance Working Paper.
[17] Karatzas, I., and S.E. Shreve (1991). Brownian Motion and Stochastic Calculus. Springer-Verlag.
[18] Keller-Ressel, M., and T. Steiner (2008). Yield Curve Shapes and the Asymptotic Short Rate Distribution in Affine One-Factor Models. Finance and Stochastics, Vol. 12, No. 2, 149-172.
[19] Lewis, A. (2000). Option Valuation under Stochastic Volatility. Finance Press.
[20] Longstaff, F., and E. Schwartz (1992). Interest Rate Volatility and the Term Structure: A Two-Factor General Equilibrium Model. Journal of Finance, Vol. 47, No. 4, 1259-1282.
[21] Maghsoodi, Y. (1996). Solutions of the Extended CIR Term Structure and Bond Option Valuation. Mathematical Finance, Vol. 6, No. 1, 89-109.
[22] Mannolini, A., C. Mari, and R. Reno (2008). Pricing Caps and Floors with the Extended CIR Model. International Journal of Finance and Economics, Vol. 13, 386-400.
[23] Revuz, D., and M. Yor (1991). Continuous Martingales and Brownian Motion. Springer.
[24] Rogers, L.C.G. (1995). Which Model for the Term Structure of Interest Rates Should One Use? In Mathematical Finance, ed. M. Davis, D. Duffie, W. Fleming, and S. Shreve. New York: Springer-Verlag, 93-116.
[25] Shirakawa, H. (2002). Squared Bessel Processes and Their Applications to the Square Root Interest Rate Model. Asia-Pacific Financial Markets, 9, 169-190.
[26] Shreve, S. (2004). Stochastic Calculus for Finance, Vol. 2. Springer.
[27] Yang, H. (2006). Calibration of the Extended CIR Model.