Volatility derivatives in market models with jumps
aa r X i v : . [ q -f i n . P R ] M a y VOLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS
HARRY LO AND ALEKSANDAR MIJATOVI ´CA
BSTRACT . It is well documented that a model for the underlying asset price process that seeks to capture thebehaviour of the market prices of vanilla options needs to exhibit both diffusion and jump features. In this paperwe assume that the asset price process S is Markov with c`adl`ag paths and propose a scheme for computing thelaw of the realized variance of the log returns accrued while the asset was trading in a prespecified corridor. Wethus obtain an algorithm for pricing and hedging volatility derivatives and derivatives on the corridor-realizedvariance in such a market. The class of models under consideration is large, as it encompasses jump-diffusionand L´evy processes. We prove the weak convergence of the scheme and describe in detail the implementation ofthe algorithm in the characteristic cases where S is a CEV process (continuous trajectories), a variance gammaprocess (jumps with independent increments) or an infinite activity jump-diffusion (discontinuous trajectorieswith dependent increments).
1. I
NTRODUCTION
Derivative securities on the realized variance of the log returns of an underlying asset price process tradeactively in the financial markets. Such derivatives play an important role in risk management and are alsoused for expressing a view on the future behaviour of volatility in the underlying market. Since the liquidcontracts have both linear (variance swaps) and non-linear (square-root = volatility swaps, hockey stick =variance options) payoffs, it is very important to have a robust algorithm for computing the entire law ofthe realized variance. Often such contingent claims have an additional feature, which makes them cheaperand hence more attractive to the investor, that stipulates that variance of log returns accrues only when thespot price is trading in a contract-defined corridor (see Subsection 2.1 for the precise definitions of suchderivatives).It is clear from these definitions that, in order to manage the risks that arise in the context of volatilityderivatives, one needs to apply the same modelling framework that is being used for pricing and hedgingvanilla options on the underlying asset. It has therefore been argued that the pricing and hedging of volatil-ity derivatives should be done using models with jumps and stochastic volatility (see for example [10],Chapter 11). In this paper we propose a scheme for computing the distribution of the realized variance andthe corridor-realized variance when the underlying process S = ( S ) t ≥ is a Markov process with possiblydiscontinuous trajectories, thus obtaining an algorithm for pricing and hedging all the payoffs mentionedabove. Our main assumption is that the Markov dimension of S is equal to one (i.e. we assume that thefuture and past of the process S are independent given its present value). We do not make any additionalassumptions on the structure of the increments or the distributional properties of the process S . This class ofprocesses is large as it encompasses one dimensional jump-diffusions and L´evy processes. We would like to thank Martijn Pistorius for many useful discussions.
The algorithm consists of two steps. In the first step the original Markov process S under a risk-neutralmeasure is approximated by a continuous-time finite state Markov chain X = ( X t ) t ≥ . This is achievedby approximating the generator of S by a generator matrix for X . The second step consists of pricing thecorresponding volatility derivative in the approximate model X . It should be stressed that the two steps areindependent of each other but both clearly contribute to the accuracy of the scheme. In other words thesecond step can be carried out for any approximate generator matrix of the chain X . In specific examplesin this paper we describe a natural way of defining the approximate generator matrix (see Section 4 fordiffusions and Section 5 for processes with jumps) which is by no means optimal (see monograph [9]for weak convergence of such approximations and [16] for possible improvements) but already makes theproposed scheme accurate enough (see the numerical results in Section 6).In the second step of the algorithm we approximate the dynamics of the corridor-realized variance of thelogarithm of the chain X (i.e. the variance that accrued while X was in the prespecified corridor) by a Poissonprocess with an intensity that is a function of the current state of the chain X . This approximation is obtainedby matching k ∈ N instantaneous conditional moments of the corridor-realized variance of the chain X . Thisis a generalisation of the method proposed in [1], which in the framework of this paper corresponds to k = k strictly larger than one improves considerably the quality of theapproximation to the distribution of the corridor-realized variance for S . In fact if S is a diffusion process,then our algorithm with k = S are discontinuous, then the scheme with k = k =
3. Furthermore in Section 3 we prove the weak convergence of our scheme as k tends toinfinity (see Theorem 3.1).The general approach of this paper is to view continuous-time Markov chains as a numerical tool that isbased on probabilistic principles and can therefore be applied in a very natural way to problems in pricingtheory. It is worth noting that there is no theoretical obstruction for extending our scheme to the case when S is just one component of a two dimensional Markov process (e.g. S is the asset price in a stochastic volatilitymodel) by using a Markov chain to approximate this two dimensional process. The reason why throughoutthis paper we assume that S itself is Markov lies in the feasibility of the associated numerical scheme. If S is Markov the dimension of the generator of X can be as small as 70, while in the case of the stochasticvolatility process we would need to find the spectra of matrices of dimension larger than 2000. This is byno means impossible but is not the focus of the present paper.The literature on the pricing and hedging of derivatives on the realized variance is vast. It is generallyagreed that either the assumption on the independence of increments or the continuity of trajectories of theunderlying process needs to be relaxed in order to obtain a realistic model for the realized variance. Inthe recent paper [3] model independent bounds for options on variance are obtained in a general continu-ous semimartingale market. The continuity assumption is relaxed in [7], where a class of one dimensionalMarkov processes with independent increments is considered and the law of the realized variance is ob-tained. A perfect replication for a corridor variance swap (i.e. the mean of the corridor-realized variance) in OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 3 the case of a continuous asset price process is given in [6]. For other contributions to the theory of volatilityderivatives see [1] and the references therein. The main aim of this paper is to provide a stochastic approxi-mation scheme for the pricing and hedging of derivatives on the realized (and corridor-realized) variance inmodels that violate both the above assumptions, thus making it virtually impossible to find the laws of therelevant random variables in semi-analytic form.The paper is organised as follows. Section 2 defines the approximating Markov chains and gives a generaldescription of the pricing algorithm. In Section 3 we state and prove the weak convergence of the proposedscheme. Section 5 (resp. 4) describes the implementation of the algorithm in the case where the process S isan infinite activity jump-diffusion (resp. has continuous trajectories). Section 6 contains numerical resultsand Section 7 concludes the paper.2. T HE k CONDITIONAL MOMENTS OF THE REALIZED VARIANCE
Let S = ( S t ) t ≥ be a strictly positive Markov process with c`adl`ag paths (i.e. each path is right-continuousas a function of time and has a left limit at every time t ) which serves as a model for the evolution of the riskysecurity under a risk-neutral measure. Note that we are also implicitly assuming that S is a semimartingale.2.1. The contracts. A volatility derivative in this market is any security that pays f ([ log ( S )] T ) at maturity T , where f : R + → R is a measurable payoff function and [ log ( S )] T is the is the quadratic variation of theprocess log ( S ) = ( log ( S t )) t ≥ at maturity T defined by [ log ( S )] T : = lim n → ¥ (cid:229) t ni ∈ P n , i ≥ log S t ni S t ni − ! , (1)where P n = { t n , t n , . . . , t nn } , n ∈ N , is a refining sequence of partitions of the interval [ , T ] . In other words t n = t nn = T , P n ⊂ P n + for all n ∈ N and lim n → ¥ max {| t ni − t ni − | : i = , . . . , n } =
0. It is well-knownthat this sequence converges in probability (see [13], Theorem 4.47). Many such derivative products tradeactively in financial markets across asset-classes (see [1] and the references therein).A corridor variance swap is a derivative security with a linear payoff function that depends on the accruedvariance of the asset price S while it is trading in an interval [ L , U ] that is specified in the contract, where0 ≤ L < U ≤ ¥ . More specifically if we define a process S t : = max { L , min { S t , U }} , ∀ t ∈ [ , ¥ ) , (2)then for a given partition P n = { t n , t n , . . . , t nn } of the time interval [ , T ] the corridor-realized variance isgiven by (cid:229) t ni ∈ P n , i ≥ h [ L , U ] ( S t ni − ) + [ L , U ] ( S t ni ) − [ L , U ] ( S t ni − ) [ L , U ] ( S t ni ) i log S t ni S t ni − ! , (3)where [ L , U ] denotes the indicator function of the interval [ L , U ] . In practice the increments t ni − t ni − ususallyequal one day. The square bracket in the sum in (3) ensures that the accrued variance is not increased whenthe asset price S jumps over the interval [ L , U ] .The one sided corridor-realized variance was defined in [4]. Definition (1.1) in [4] (resp. (1.2) in [4])corresponds to expression (3) above if we choose U = ¥ (resp. L = S into the corridor HARRY LO AND ALEKSANDAR MIJATOVI ´C differently from its exit from the corridor. This asymmetry is then exploited to obtain an approximatehedging strategy for linear payoffs on the corridor-realized variance. In this paper we opt for a symmetrictreatment of the entrance and exit of S into and from the corridor [ L , U ] , because this is in some sense morenatural. It is however important to note that all the theorems and the algorithm proposed here do NOTdepend in any significant way on this choice of definition. In other words for any reasonable modification ofthe definition in (3) (e.g. the one in [4]) the algorithm described in this section would still work. Note alsothat our algorithm will yield an approximate distribution of random variable (3) in the model S and thereforeallows us to price any non-linear payoff that depends on the corridor-realized variance.In the case the corridor-realized variance is monitored continuously (see [6]), we can express it usingquadratic variation as follows. Note first that since the map s max { L , min { s , U }} can be expressed asa difference of two convex functions, Theorem 66 in [18] implies that the process S = ( S t ) t ≥ is again asemimartingale and therefore the corridor-realized variance Q L , UT ( S ) , defined as the limit of the expressionin (3) as n tends to infinity, exists and equals Q L , UT ( S ) = [ log ( S )] T − (cid:18) log UL (cid:19) (cid:229) ≤ t ≤ T (cid:2) ( , L ) ( S t − ) ( U , ¥ ) ( S t ) + ( , L ) ( S t ) ( U , ¥ ) ( S t − ) (cid:3) (4)by Theorem 4.47a in [13]. Since we are assuming that the process S is c`adl`ag the limit S t − : = lim s ր t S s existsalmost surely for all t >
0. The sum in (4), which corresponds to jumps of the asset price S over the corridor [ L , U ] , is almost surely finite by Theorem 4.47c in [13]. Note also that if L = U = ¥ ) we find that Q , UT ( S ) (resp. Q L , ¥ T ( S ) ) equals the quadratic variation of the semimartingale log ( S ) = ( log ( S t )) t ≥ becausethe process S cannot in these cases jump over the corridor. Our main task it to find an approximate law ofthe random variable Q L , UT ( S ) which will allow us to price any derivative on the corridor-realized variancewith terminal value f ( Q L , UT ( S )) , where f is a possibly non-linear function.2.2. Markov chain X and its corridor-realized variance. Let us start by assuming that we are given agenerator matrix L of a continuous-time Markov chain X = ( X t ) t ≥ which approximates the generator ofthe Markov process S . The state-space of the Markov chain X is the set E : = { x , . . . , x N − } ⊂ R + with N ∈ N elements, such that x i < x j for any integers 0 ≤ i < j ≤ N − . In Sections 4 and 5 we discuss brieflyhow to construct such approximate generators for Markov processes that are widely used in finance (i.e.diffusion processes with jumps.) Throughout the paper we will use the notation L ( x , y ) = e ′ x L e y for theelemetns of the matrix L , where x , y ∈ E , vectors e x , e y denote the corresponding standard basis vectors of R N and ′ is transposition.The quantities of interest are the quadratic variation [ log ( X )] = ([ log ( X )] t ) t ≥ and the corridor-realizedvaiance Q L , U ( X ) = ( Q L , Ut ( X )) t ≥ processes which are for any maturity T defined by [ log ( X )] T : = lim n → ¥ (cid:229) t ni ∈ P n , i ≥ log X t ni X t ni − ! , (5) Q L , UT ( X ) : = [ log ( X )] T − (cid:18) log UL (cid:19) (cid:229) ≤ t ≤ T (cid:2) ( , L ) ( X t − ) ( U , ¥ ) ( X t ) + ( , L ) ( X t ) ( U , ¥ ) ( X t − ) (cid:3) , (6)where partitions P n , n ∈ N , of [ , T ] are as in (1), the process X = ( X t ) t ≥ is defined analogously with (2) by X t : = max { L , min { X t , U }} and X t − : = lim s ր t X s for any t >
0. Note that if we choose L < min { x : x ∈ E } OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 5 and U > max { x : x ∈ E } , then the random variables in (5) and (6) coincide. We can therefore without lossof generality only consider the corridor-realized variance Q L , UT ( X ) .Since the process X is a finite-state Markov chain, the jumps of X arrive with bounded intensity and it istherefore clear that the following must hold P " Q L , Ut + D t ( X ) − Q L , Ut ( X ) = (cid:18) log X t + D t X t (cid:19) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X t = x = o ( D t ) for all x ∈ [ L , U ] ∩ E . (7)An analogous equality holds if X t is ountside of the corridor [ L , U ] . Recall also that by definition a function f ( D t ) is of the order o ( D t ) (usually denoted by f ( D t ) = o ( D t ) ) if and only if lim D t ց f ( D t ) / D t =
0. Equal-ity (7) implies that the j -th instantaneous conditional moment of the corridor-realized variance Q L , U ( X ) isgiven by M j ( x ) : = lim D t → D t E (cid:20)(cid:16) Q L , Ut + D t ( X ) − Q L , Ut ( X ) (cid:17) j (cid:12)(cid:12)(cid:12) X t = x (cid:21) = (cid:229) y ∈ E L ( x , y ) "(cid:18) log yx (cid:19) j − (cid:18) log UL (cid:19) j A U , L ( x , y ) (8)where the set A U , L ⊂ R is defined as A U , L : = ([ , L ) × ( U , ¥ )) ∪ (( U , ¥ ) × [ , L )) and for any x ∈ E we have x : = max { L , min { x , U }} .2.3. The extension ( X , I ) . The basic idea of this paper is to extend the markov chain X to a continuous-time Markov chain ( X , I ) = ( X t , I t ) t ≥ where the dynamics of the process I approximates well the dynamicsof the corridor-realized variance Q L , U ( X ) . Conditional on the path of the chain X , the process I will be acompound Poisson process with jump-intensity that is a function of the current state of X . The generatorof ( X , I ) will be chosen in such a way that the first k ∈ N infinitesimal moments of I and Q L , U ( X ) coincide.The approximating chain I will start at 0 (as does the process Q L , U ( X ) ) and gradually jump up its uniformstate-space { , a , .., a C } , where a is a small positive constant and C is some fixed integer.The main computational tool in this paper is the well-known spectral decomposition for partial-circulantmatrices (see Appendices A.2-A.4 in [1] for the definition and the properties of the spectrum), which willbe applied to the generator of the Markov chain ( X , I ) . The geometry of the state-space { , a , .., a C } istherefore of fundamental importance because it allows the generator of ( X , I ) to be expressed as a partial-circulant matrix. As mentioned in the introduction, the main difference between the approach in the presentpaper and the algorithm in [1] is that here we take advantage of the full strength of the partial-circulantform of the generator of ( X , I ) . This allows us to define the process I as a compound Poisson processwith state-dependent intensity rather than just a Poisson process (which was the case in [1]), without addingcomputational complexity. As we shall see in Section 6 this enables us to approximate the entire distributionof the corridor-realized variance and hence obtian much more accurate numerical results.Assuming that the process I can jump at most n ∈ N states up from its current position in an infinitesimalamount of time, the dynamics of I are uniquely determined by the state-dependent intensities l i : E → R + , where i ∈ { , . . . , n } (9) HARRY LO AND ALEKSANDAR MIJATOVI ´C and E is the state-space of the chain X . The generator of I , conditional on the event X t = x , can therefore forany c , d ∈ { , , .., C } be expressed as L I ( x : c , d ) : = l j ( x ) if d = c + j mod ( C + ) for some j ∈ { , ..., n } ; − (cid:229) ni = l i ( x ) if d = c ;0 otherwise.(10)The dimension of the matrix L I ( x : · , · ) is 2 C + x ∈ E and the identity d = c + j mod ( C + ) meansthat the numbers d and c + j represent the same element in the additive group Z C + . A key observation hereis that the entries L I ( x : c , d ) in the conditional generator depend on c and d solely through the difference d − c and hence the afore mentioned group structure makes the conditional generator into a circulant matrix(see Appendix A for the definition of circulant matrices).This algebraic structure of the conditional generator L I ( x : · , · ) translates into a periodic boundary con-dition for the process I . This is very undesirable because the process Q L , U ( X ) we are trying to approximateclearly does not exhibit such features. We must therefore choose C large enough so that even if the chain I is allowed to jump n steps up at any time, the probability that it oversteps the boundary is negligible (i.e.below machine precision). We will see in Section 6 that in practice C ≈
100 and n ≈
30 is sufficient to avoidthe boundary. Since our aim is to match the first k instantaneous moments, it is necessary to take n largeror equal to k . In applications this does not pose additional restrictions because, as we shall see in Section 6, k = k = ( X , I ) on the state-space E × { , a , . . . , a C } as follows(11) G ( x , c ; y , d ) : = L ( x , y ) d c , d + L I ( x : c , d ) d x , y , where x , y ∈ E , c , d ∈ { , , . . . , C } and d · , · denotes the Kronecker delta function. The matrix G is of thesize N ( C + ) and has partial-circulant form. In other words we can express G in terms of N blocks whereeach block is a square matrix of the size 2 C + G are equal toa sum of a circulant matrix and a scalar multiple of the identity matrix. All other blocks are scalar matrices.For the precise definition of partial-circulant matrices see Appendix A.We can now compute, using (10) and (11), the j -th instantaneous conditional moment of the process I asfollows lim D t → D t E h ( I t + D t − I t ) j (cid:12)(cid:12)(cid:12) X t = x , I t = a c i = C (cid:229) d = ( a d − a c ) j L I ( x : c , d )= a j n (cid:229) d = d j l d ( x ) (12)for any x ∈ E and all integers c ∈ { , , . . . , C } that satisfy the inequality c < C − n , where n was introducedin (9). This inequality implies that the process I cannot jump to or above a C (i.e. it cannot complete afull circle) in a very short time interval D t . Note also that it is through this inequality only that identity (12)depends on the current level a c of the process I .Our main goal is to approximate the process ( X , Q L , U ( X )) , where corridor-realized variance Q L , U ( X ) isdefined in (6), by the continuous-time Markov chain ( X , I ) with generator given by (11). We now match the OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 7 first k instantaneous conditional moments of processes Q L , U ( X ) and I using identities (8) and (12):(13) a j n (cid:229) d = d j l d ( x ) = M j ( x ) for any x ∈ E and j = , . . . , k . In other words we must choose the intensity functions l i (see (9)) and the parameter a so that the system (13)is satisfied. The necessary requirement for the solution is that l i ( x ) ≥ x ∈ E and all i = , . . . , n .These inequalities can place non-trivial restrictions on the solution space and will be analysed in more detailin Sections 4 and 5.Another simple yet important observation that follows from (13) is that, in order to match the first k instantaneous conditional moments of the corridor-realized variance Q L , U ( X ) , the size of the support of thejump distribution of the of Poisson processes with state-dependant intensity (i.e. n ) must be at least k . Fromnow on we assume that n ≥ k .The pricing of volatility derivatives is done using the following theorem which yields a closed-formformula for the semingroup of the Markov chain ( X , I ) . Theorem 2.1.
Let G be the generator matrix of the Markov process ( X , I ) given by (11) . Then for any t ≥ ,x , y ∈ E and d ∈ { , . . . , C } the equality holds P (cid:0) X t = y , I t = a d (cid:12)(cid:12) X = x (cid:1) = exp ( t G )( x , y , d )= C + C (cid:229) j = e i p j d exp ( t L j )( x , y ) , (14) where i = √− , the scalars p j and the complex matrices L j , for j = , . . . , C, are given by L j ( x , y ) : = L ( x , y ) + d x , y n (cid:229) i = (cid:0) e − i p j i − (cid:1) l i ( x ) , (15) p j : = p C + j . Theorem 2.1 is the main computational tool used in this paper which allows us to find in a semi-analyticform the semigroup of the chain ( X , I ) (if C ≈
100 and N =
70, the matrix G contains more than 10 elements). For a straightforward implementation of the algorithm in Matlab see [14]. It is clear that The-orem 2.1 generalizes equation (6) in [1] and that this generalization involves exactly the same number ofmatrix operations as the algorithm in [1]. The only additional computations are the sums in (15).The proof of Theorem 2.1 relies on the partial-circulant structure of the matrix G given in (11). Theargument follows precisely the same lines as the one that proved Theorem 3.1 in [1] and will therefore notbe given here (see Appendix A.5 in [1] for more details).Since the dynamics of the process ( X , I ) are assumed to be under a risk-neutral measure, the current valueof any payoff that depends on the corridor-realized variance at fixed maturity can easily be obtained from theformulae in Theorem 2.1. Furthermore the same algorithm yields the risk sensitivities Delta and Gamma ofany derivative on the corridor-realized variance, without adding computational complexity. This is becausethe output of our scheme is a vector of values of the derivative in question conditional on the process X starting at each of the elements in its state-space. We should also note that forward-starting derivatives onthe corridor-realized variance can be dealt with using the same algorithm because conditioning on the state HARRY LO AND ALEKSANDAR MIJATOVI ´C of a Markov chain at a future time requires only a single additional matrix-vector multiplication. Explicitcalculations are obvious and are omitted (see [1] for more details).3. C
ONVERGENCE
In Section 2 we defined the Markov chain ( X , I k ) via its generator (11) that in some sense approximatesthe process ( X , Q L , U ( X )) , where Q L , U ( X ) is the corridor-realized variance of X defined in (6). Here I k denotes the process I from Section 2 which satisfies the instantaneous conditional moment restrictions,given by (13), up to order k .Notice that it follows directly from definition (6) that the process ( X , Q L , U ( X )) is adapted to the naturalfiltration generated by the chain X and that its components X and Q L , U ( X ) can only jump simultaneously.On the other hand note that the form of the generator of the chain ( X , I k ) , given by (11), implies that thecomponents X and I k cannot both jump at the same time. It is also clear that the process I k is not adapted tothe natural filtration of X . In this section our goal is to prove that, in spite of these differences, for any fixedtime T the sequence of random variables ( I kT ) k ∈ N converges in distribution to the random variable Q L , UT ( X ) .In fact we have the following theorem which states that, for any bounded European payoff, the price of thecorresponding derivative on the corridor-realized variance in the approximate model ( X , I k ) converges to theprice of the same derivative in ( X , Q L , U ( X )) as the number k of matched instantaneous conditional momentstends to infinity. Theorem 3.1.
Let X be a continuous-time Markov chain with generator L as given in Section 2. For eachk ∈ N define a real number a k : = k max (cid:26)(cid:16) log yx (cid:17) : x , y ∈ E \{ } (cid:27) , (16) assume that n in (9) equals k and that there exist functions l ki : E → R + , i ∈ { , . . . , k } , that solve thesystem of equations (13) . Let the continuous-time Markov chain ( X , I k ) be given by generator (11) wherethe integers C k in (10) , which determine the size of the state-space of the process I k , are chosen in such a waythat lim k → ¥ a k C k = ¥ . Then for any fixed time T > the sequence of random variables ( I kT ) k ∈ N convergesweakly to Q L , UT ( X ) . In other words for any bounded continuous function f : R → R we have lim k → ¥ E [ f ( I kT ) | X ] = E [ f ( Q L , UT ( X )) | X ] . Before proving Theorem 3.1 we note that the assumption on the existence of non-negative solutions ofthe system in (13) is not stringent and can be satisfies for any chain X by allowing n in (9) to take valueslarger than k . The restriction n = k in Theorem 3.1 is used because it simplifies the notation. Proof.
Throughout this proof we will use the notation S t : = Q L , Ut ( X ) for any t ∈ R + . By the L´evy continuitytheorem it is enough to prove that the equality holdslim k → ¥ E [ exp ( i uI kT )] = E [ exp ( i u S T )] for each u ∈ R . OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 9
Let D t > s -algebra generated by theprocess X up to and including time T − D t and using the Markov property, we obtain the following repre-sentation E [ exp ( i u S T )] = E (cid:2) exp ( i u S T − D t ) E (cid:2) exp ( i u ( S T − S T − D t )) (cid:12)(cid:12) X T − D t (cid:3)(cid:3) (17) = E " e i u S T − D t k (cid:229) j = ( i u ) j j ! E (cid:2) ( S T − S T − D t ) j (cid:12)(cid:12) X T − D t (cid:3) + ¥ (cid:229) j = k + ( i u ) j j ! E (cid:2) ( S T − S T − D t ) j (cid:12)(cid:12) X T − D t (cid:3)! = E " e i u S T − D t + D t k (cid:229) j = ( i u ) j j ! M j ( X T − D t )+ ¥ (cid:229) j = k + ( i u ) j j ! E (cid:2) ( S T − S T − D t ) j (cid:12)(cid:12) X T − D t (cid:3)! + o ( D t ) , where M j is defined in (8). By applying Markov property of ( X , I k ) , identity (12) and condition (13), whichholds by assumption for all j ∈ { , . . . , k } , we obtain E [ exp ( i uI kT )] = E " e i uI kT − D t + D t k (cid:229) j = ( i u ) j j ! M j ( X T − D t ) (18) + ¥ (cid:229) j = k + ( i u ) j j ! E (cid:20) ( I kT − I kT − D t ) j (cid:12)(cid:12)(cid:12)(cid:12) X T − D t , I kT − D t (cid:21)! + o ( D t ) . It follows from (8) that there exists a positive constant G such that max { M j ( x ) : x ∈ E } ≤ G j for all j ∈ N .Therefore we find that for a constant D : = exp ( uG ) the following inequality holds on the entire probabilityspace (cid:12)(cid:12)(cid:12)(cid:12) k (cid:229) j = ( i u ) j j ! M j ( X T − D t ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ D . (19)Note also that D is independent of k and D t .Definition (16) implies that k a k is a positive constant, say A , for each k ∈ N . If we introduce a positiveconstant L : = max {− L ( x , x ) : x ∈ E } , we obtain the following bound E (cid:2) ( S T − S T − D t ) j (cid:12)(cid:12) X T − D t (cid:3) ≤ A j L D t + o ( D t ) for each j ∈ N (20)on the entire probability space. In order to find a similar bound for the process I k we first note that if followsfrom the linear equation (13) (for j =
1) and definition (16) that the inequalities k (cid:229) d = d l kd ( x ) ≤ kL for all k ∈ N , x ∈ E must hold. Therefore (12) implies E (cid:20) ( I kT − I kT − D t ) j (cid:12)(cid:12)(cid:12)(cid:12) X T − D t , I kT − D t (cid:21) ≤ A j kL D t + o ( D t ) for any j ∈ N (21) and any small time-step D t . We can now combine the estimates in (17), (18), (19), (20) and (21) to obtainthe key bound (cid:12)(cid:12) E [ exp ( i u S T )] − E [ exp ( i uI kT )] (cid:12)(cid:12) ≤ (cid:12)(cid:12) E [ exp ( i u S T − D t )] − E [ exp ( i uI kT − D t )] (cid:12)(cid:12) ( + D tD ) (22) + L ( k + ) D t ¥ (cid:229) j = k + ( Au ) j j ! + o ( D t ) . The main idea of the proof of Theorem 3.1 is to iterate the bound in (22) T D t times. This procedure yieldsthe following estimates (cid:12)(cid:12) E [ exp ( i u S T )] − E [ exp ( i uI kT )] (cid:12)(cid:12) ≤ D D t ( + D tD ) ( T / D t ) − + L ( k + ) T ¥ (cid:229) j = k + ( Au ) j j ! + T o ( D t ) D t . Since the left-hand side of this inequality is independent of D t , the inequality must hold in the limit as D t ց
0. We therefore find (cid:12)(cid:12) E [ exp ( i u S T )] − E [ exp ( i uI kT )] (cid:12)(cid:12) ≤ L ( k + ) T ¥ (cid:229) j = k + ( Au ) j j ! . (23)The right-hand side of inequality (23) clearly converges to zero as k tends to infinity. This concludes theproof of the theorem. ✷ Theorem 3.1 implies that the prices of the volatility derivatives in the Markov chain model X can beapproximated arbitrarily well using the method defined in Section 2. Our initial problem of approximatingprices in the model based on a continuous-time Markov process S is by Theorem 3.1 reduced to the questionof the approximation of the law of S by the law of X . This can be achieved by a judicious choice for thegenerator matrix of the chain X . Since this is not the central topic of this paper we will not investigate thequestion further in this generality (see [9] for numerous results on weak convergence of Markov processes).However in Sections 4 and 5 we are going to propose specific Markov chain approximations for diffusionand jump-diffusion processes respectively and study numerically the behaviour of the approximations forvolatility derivatives in Section 6.4. T HE REALIZED VARIANCE OF A DIFFUSION PROCESS
Our task now is to apply the method described in Section 2 to approximate the dynamics of the corridor-realized variance of a diffusion processes. The first step is to approximate the diffusion process S whichsolves the stochastic differential equation (SDE) dS t S t = g dt + s (cid:18) S t S (cid:19) dW t , (24)with measurable volatility function s : R + → R + , using a continuous-time Markov chain X . A possible wayof achieving this is to use a generator for the chain X given by the following system of linear equations (cid:229) y ∈ E L ( x , y ) = , (cid:229) y ∈ E L ( x , y )( y − x ) = g x , (25) (cid:229) y ∈ E L ( x , y )( y − x ) = s (cid:18) xX (cid:19) x for each x ∈ E . In Appendix B we give an algorithm to define the state-space E of the chain X . In Section 6we provide a numerical comparison for vanilla option prices in the CEV model, i.e. in the case s ( s ) : = s s b − , and in the corresponding Markov chain model given by the approximation above. Note that aMarkov chain approximation X of the diffusion S is in the spirit of [2] and is by no means the only viablealternative. One could produce more accurate results by matching higher instantaneous moments of the twoprocesses (see [16] for rates of convergence in some special cases).If the solution of SDE (24) is used as a model for the risky security under a risk-neutral measure we haveto stipulate that g = r , where r is the prevailing risk-free rate in the economy. Therefore by the first twoequations in system (25) the vector in R N with cooridnates equal to the elements in the set E represents aneigenvector of the matrix L for the eigenvalue g . Hence we find E [ X t | X = x ] = e ′ x exp ( t L ) (cid:229) y ∈ E ye y = e t g x , ∀ x ∈ E , (26)where e x denotes the standard basis vector in R N that corresponds to the element x ∈ E in the natural orderingand the operation ′ denotes transposition. Therefore, under the condition g = r , the market driven by thechain X will also have a correct risk-neutral drift.Once we define the chain X , the next task is to specify the process I that approximates well the corridor-realized variance Q L , U ( X ) defined in (6). As we shall see in Section 6, matching the first two moments(i.e. the case k = n ≥
2, where n is the number of states the approximatevariance process I can jump up by at any given time (see (9)). To have flexibility we use n much largerthan 2, usually around 30. However in order to maintain the tractability of the solution of system (13) wemake an additional assumption that the intensities l i in (9), for i = , . . . , n , are all equal to a single intensityfunction l n : E → R + . To simplify the notation we introduce the symbol b n , mj : = m (cid:229) l = n + l j , where j , n , m ∈ N and m > n . (27)System (13) can in this case be solved explicitly as follows l ( x ) = a M ( x ) b , n − M ( x ) b , n a ( b , n − b , n ) , for any x ∈ E , (28) l n ( x ) = M ( x ) − a M ( x ) a ( b , n − b , n ) , for any x ∈ E , (29)where M j ( x ) is given in (8). Since the functions l , l n are intensities, all the values they take must benon-negative. The formulae above imply that this is satisfied if and only if the following inequalities hold a b , n b , n ≥ M ( x ) M ( x ) ≥ a for all x ∈ E . (30)It is clear that the function x M ( x ) / M ( x ) , x ∈ E , depends on the definition of the chain X both throughthe choice of the state-space E and the choice of the generator L . Figure 1b contains the plot of this functionin the special case of the CEV model. Inequalities (30) are used to help us choose a feasible value for theparameter a which determines the geometry of the state-space of the process I . Note also that (30) implies that the larger the value of n is, the less restricted we are in choosing a . In Section 6 we will make thesechoices explicit for the CEV model.The generator of the approximate corridor-realized variance I , conditional on the chain X being at thelevel x , is in general given by Formula (10). In this particular case the non-zero matrix elements L I ( x : c , d ) , c , d ∈ { , , . . . , C } , are given by L I ( x : c , d ) : = l ( x ) if d = ( c + ) mod ( C + ) ; l n ( x ) if d = ( c + i ) mod ( C + ) , i ∈ { , ..., n } ; − l ( x ) − ( n − ) l n ( x ) if d = c . This defines explicitly (via equations (28) and (29)) the dynamics of the chain ( X , I ) if the original assetprice process S is a diffusion. In Section 6 we will describe an implementation of this method when S follows a CEV process and study the behaviour of certain volatility derivatives in this model.5. T HE REALIZED VARIANCE OF A JUMP - DIFFUSION
In this section the task is to describe the algorithm for the pricing of volatility derivatives in jump-diffusionmodels. This will be achieved by an application of the algorithm from Section 2 with k =
3. In Section 6we will investigate numerically the quality of this approximation. We start by describing a construction ofthe Markov chain which is used to approximate a jump-diffusion.5.1.
Markov chain approximations for jump-diffusions.
We will consider a class of processes withjumps that is obtained by subordination of diffusions. The prototype for such processes is the well-knownvariance gamma model defined in [15], which can be expressed as a time-changed Brownian motion withdrift.A general way of building (possibly infinite-activity) jump-diffusion processes is by subordinating diffu-sions using a class of independent stochastic time changes. Such a time change is given by a non-decreasingstationary process ( T t ) t ≥ with independent increments, which starts at zero, and is known as a subordinator .The law of ( T t ) t ≥ is characterized by the Bernstein function f ( l ) , defined by the following identity E [ exp ( − l T t )] = exp ( − f ( l ) t ) for any t ≥ l ∈ D , (31)where D is an interval in R that contains the half-axis [ , ¥ ) . For example in the case of the variance gammaprocess, the Bernstein function is of the form f ( l ) = m n log (cid:18) + l nm (cid:19) . (32)In this case ( T t ) t ≥ is a gamma process with characteristic function equal to E [ exp ( i uT t )] = exp ( − f ( − i u ) t ) .Note that the set D in (31) is in this case equal to ( − m / n , ¥ ) (see [15], equation (2)). This subordinator isused to construct the jump-diffusions in Section 6.Let S be a diffusion defined by the SDE in (24). If we evaluate the process S at an independent subordi-nator ( T t ) t ≥ , we obtain a Markov process with jumps ( S T t ) t ≥ . It was shown in [17] that the semigroup of ( S T t ) t ≥ is generated by the unbounded differential operator G ′ : = − f ( − G ) , where G denotes the generator The parameter m is the mean rate, usually taken to be equal to one in order to ensure that E [ T t ] = t for all t ≥
0, and n is thevariance rate of ( T t ) t ≥ . OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 13 of the diffusion S . Similarly, if X is a continuous-time Markov chain with generator L defined in the firstparagraph of Section 4, the subordinated process ( X T t ) t ≥ is again a continuous-time Markov chain with thegenerator matrix L ′ : = − f ( − L ) . We should stress here that it is possible to define rigorously the operator G ′ using the spectral decomposition of G and the theorey of functional calculus (see [8], Chapter XIII, Sec-tion 5, Theorem 1). The matrix L ′ can be defined and calculated easily using the Jordan decomposition ofthe generator L . If the matrix L can be expressed in the diagonal form L = U L U − , which is the case inany practical application (the set of matrices that cannot be diagonalised is of codimention one in the spaceof all matrices and therefore has Lebesgue measure zero), we can compute L ′ using the following formula L ′ = − U f ( − L ) U − . (33)Here f ( − L ) denotes a diagonal matrix with diagonal elements of the form f ( − l ) , where l runs over thespectrum of the generator L .Before using the described procedure to define the jump-diffusion process, we have to make sure that ithas the correct drift under a risk-neutral measure. Recall that if the process S solves the SDE in (24), thenthe identity E [ S t | S ] = S exp ( t g ) holds, where g is the drift parameter in (24). Since the subordinator ( T t ) t ≥ is independent of S , by conditioning on the random variable T t , we find that under the pricing measure thefollowing identity must hold S exp ( rt ) = E [ S T t | S ] = S E [ exp ( g T t )] = S exp ( − f ( − g ) t ) , where f is the Bernstein function of the subordinator ( T t ) t ≥ and r is the prevailing risk-free rate which isassumed to be constant. This will be satisfied if and only if r = − f ( − g ) , which in case of the gamma sub-ordinator (i.e. when the function f is given by (32)) yields an explicit formula for the drift in equation (24) g = mn (cid:18) − exp (cid:18) − r nm (cid:19)(cid:19) . (34)Since formula (26) holds for the chain X , tower property and the identity r = − f ( − g ) imply E [ X T t | X ] = X E [ exp ( g T t )] = X exp ( rt ) . Therefore the subordinated Markov chain ( X T t ) t ≥ can also be used as a model for a risky asset under thepricing measure.The construction of jump-diffusions described above is convenient because we can use the generator L ,that was defined in Section 4, and apply the Bernstein function f from (32) to obtain the generator of theMarkov chain that approximates the process ( S T t ) t ≥ . This accomplishes the first step in the approximationscheme outlined in the introduction. In Subsection 5.2 we develop an algorithm for computing the law ofthe relized variance of the approximating chain generated by L ′ . It should be stressed that the algorithmin the next subsection does not depend on the procedure used to obtain the generator of the approximatingchain.5.2. The algorithm.
To simplify the notation let us assume that S is a jump-diffusion and that X is aconinuous-time Markov chain with generator L that is used to approximate the dynamics of the Markovprocess S . Since S has jumps it is no longer enough to use the algorithm from Section 2 with k = will become clear from the numerical results in Section 6). In this subsection we give an account of how toapply our algorithm in the case k = a and the constant C that uniquely determine the geometry of thestate-space of the process I (see the paragraph preceding equation (9) for the definition of the state-space).Set the maximum jump size of I to be m a for some m ∈ N . We now pick an integer n , such that 1 < n < m ,and set the intensities that correspond to the jumps of the process I of sizes between 2 a and n a to equal l n . Similarley we set the intensities for the jumps of sizes between ( n + ) a and m a to be equal to l m .This simplifying assumption makes it possible to describe the dynamics of I using only three functions l , l n , l m : E → R + that give state-dependent intensities for jumping up by i a where i = i ∈ { , . . . , n } , i ∈ { n + , . . . , m } respectively. In order to match k = Q L , U ( X ) , these functions must by (13) satisfy the following system of equations b , n b n , m b , n b n , m b , n b n , m l ( x ) l n ( x ) l m ( x ) = M ( x ) M ( x ) M ( x ) ∀ x ∈ E , where M j ( x ) : = M j ( x ) a j , the symbol b n , mj is defined in (27) and functions M j , j = , ,
3, are given in (8). Gaussian elimination yieldsthe explicit solution of the system l = ( M b n , m − M b n , m )( b , n b n , m − b , n b n , m ) − ( M b n , m − M b n , m )( b , n b n , m − b , n b n , m )( b n , m − b n , m )( b , n b n , m − b , n b n , m ) − ( b n , m − b n , m )( b , n b n , m − b , n b n , m ) , l n = ( M − M )( b n , m − b n , m ) − ( M − M )( b n , m − b n , m )( b n , m − b n , m )( b , n − b , n ) − ( b n , m − b n , m )( b , n − b , n ) , l m = ( M − M )( b , n − b , n ) − ( M − M )( b , n − b , n )( b n , m − b n , m )( b , n − b , n ) − ( b n , m − b n , m )( b , n − b , n ) , where all the identities are interpreted as functional equalites on the set E . It is clear from (27) that thedenominators in the above expressions satisfy the inequalities ( b n , m − b n , m )( b , n b n , m − b , n b n , m ) − ( b n , m − b n , m )( b , n b n , m − b , n b n , m ) < , ( b n , m − b n , m )( b , n − b , n ) − ( b n , m − b n , m )( b , n − b , n ) < , for suffciently large m (e.g. m ≥ b n , m dominates both expressions and has anegative coefficient in front of it. We therefore find that, if the functions l , l n , l m are to be positive, thefollowing inequalities must be satisfied0 < a M ( x ) + a M ( x ) b , n b n , m − b , n b n , m b n , m b , n − b , n b n , m − M ( x ) b , n b n , m − b , n b n , m b n , m b , n − b , n b n , m , (35) 0 > a M ( x ) − a M ( x ) b n , m − b n , m b n , m − b n , m + M ( x ) b n , m − b n , m b n , m − b n , m , (36) 0 < a M ( x ) − a M ( x ) b , n − b , n b , n − b , n + M ( x ) b , n − b , n b , n − b , n , (37)for every x ∈ E . These inequalities specify quadratic conditions on the spacing a (of the state-space of theprocess I ) which have to be satisfied on the entire set E . OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 15
Note that inequality (35) is always satisfied if the corresponding discriminant is negative. Alternativelyif the discriminant is non-negative, then the real zeros of the corresponding parabola, denoted by a ( x ) , a ( x ) and without loss of generality assumed to satisfy a ( x ) ≤ a ( x ) , exist and the conditions a < a ( x ) or a > a ( x ) ∀ x ∈ E must hold. Similar analysis can be applied to inequality (37). Inequality (36) will always be violated if thediscriminant is negative. This implies the following condition(38) ( b n , m − b n , m ) ( b n , m − b n , m )( b n , m − b n , m ) ≥ M ( x ) M ( x ) M ( x ) ∀ x ∈ E , which has to hold regardless of the choice of the spacing a . Even if condition (38) is satisfied we need toenforce the inequalities a ( x ) < a < a ( x ) ∀ x ∈ E . In Table 1 we summarise the conditions that need to hold for l ( x ) , l n ( x ) , l m ( x ) to be positive for any fixedelement x ∈ E . Discriminant ≥ al ( x ) > a < a ( x ) or a > a ( x ) false none l n ( x ) > a ( x ) < a < a ( x ) false l n ( x ) cannot be positive l m ( x ) > a < a ( x ) or a > a ( x ) false none Table 1: Conditions on the discriminant and the real roots a ( x ) , a ( x ) , assumed to satisfy the relation a ( x ) ≤ a ( x ) , in this tablerefer to the parabolas that arise in inequalities (35), (36) and (37). These inequalities are equivalent to the conditions l ( x ) > l n ( x ) > l m ( x ) > Having chosen the spacing a according to the conditions in Table 1, we can use the formulae above tocompute functions l , l n and l m . The conditional generator of the process I , defined in (10), now takes theform L I ( x : c , d ) : = l ( x ) if d = ( c + ) mod ( C + ) ; l n ( x ) if d = ( c + s ) mod ( C + ) , s ∈ { , ..., n } ; l m ( x ) if d = ( c + s ) mod ( C + ) , s ∈ { n + , ..., m } ;with diagonal elements given by − l ( x ) − ( n − ) l n ( x ) − ( m − n ) l m ( x ) and all other entries equal to zero.In Section 6 we are going to implement the algorithm described here for the variance gamma model and thesubordinated CEV process.
6. N
UMERICAL RESULTS
In this section we will perform a numerical study of the approximations given in the Sections 4 and 5.Subsection 6.1 gives an explicit construction of the approximating Markov chain X and compares the vanillaoption prices with the ones in the original model S . Subsections 6.2 and 6.3 compare the algorithm forvolatility derivatives described in this paper with a Monte Carlo simulation.6.1. Markov chain approximation.
Let S be a Markov process that satisfies SDE (24) with the volatilityfunction s : R + → R + given by s ( s ) : = s s b − and the drift g equal to the risk-free rate r (i.e. S is a CEVprocess). We generate the state-space E the algorithm in Appendix B and define find the generator matrix L by solving the linear system in (25).If the process S is a jump-diffusion of the form described in Subsection 5.1 (e.g. a variance gammamodel or a CEV model subordinated by a gamma process), we obtain the generator for the chain X byapplying formula (33) to the generator defined in the previous paragraph, where the function f (in (33))is given by (32). More preciselly if S is the subordinated CEV process, the drift g in (24) is given by theformula in (34). If S is a variance gamma model we subordinate the geometric Brownian motion whichsolves SDE (24) with the constant volatility function s ( s ) = s and the drift g = q + s / q is theparameter in the vairance gamma model (see [15], equation (1)). The implementation in Matlab of thisconstruction can be found in [14]. Note that the state-space of the Markov chain X in the diffusion and thejump-diffusion cases is of the same form (i.e. given by the algorithm in Appendix B).The numerical accuracy of these approximations is illustrated in Tables 2, 3 and 4 where the vanillaoption prices in the Markov chain model X are compared with the prices in the original model S for theCEV process, the variance gamma model and the subordinated CEV model respectively. Markov chain X CEV: closed-form K \ T . . Table 2: Implied volatility in the CEV model. The maturity T varies from half a year to two years and the corresponding strikesare of the form Ke rT , where K takes values between 80 and 120 and the risk-free rate equals r = S , with thecurrent spot value S = s equal to s ( s ) : = s s b − and the drift g = r ,where the volatility parameters are s = . b = .
3. The parameters for the non-uniform state-space of the chain X are N = l = , s = , u = , g l = , g u =
50 (see Appendix B for the definition of these parameters) and the generator of X isspecified by system (25). The pricing in the Markov chain model is done using (39) and in the CEV model using a closed-formformula in [12], pages 562-563. The total computation time for all the option price in the table under the Markov chain model X isless than one tenth of a second on a standard PC with 1.6GHz Pentium-M processor and 1GB RAM. It is clear from Tables 2, 3 and 4 that the continuous-time Markov chain X approximates reasonably wellthe Markov process S on the level of European option prices. The pricing in the Markov chain model is OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 17
Markov chain X VG: FFT K \ T . . Table 3: Implied volatility in the variance gamma model. The strikes and maturities are as in Table 2. The process S , with thecurrent spot value S = s ( s ) = s and thedrift equal to g = q + s /
2, where q is given in [15], equation (1). The Bernstein function of the gamma subordinator is givenin (32). The risk-free rate is assumed to be r = s = . , q = − . , and the jumpparameters in (32) equal m = , n = .
05. The parameters for the state-space of the chain X are N =
70 and l = , s = , u = , g l = , g u =
30 (see Appendix B for the definition of these parameters). The total computation time for all the option price in thetable under the Markov chain model is less than one tenth of a second. The Fourier inversion is performed using the algorithmin [5] and takes approximately the same amount of time. All computations are performed on the same hardware as in Table 2.
Markov chain X CEV with jumps: MC K \ T . . Table 4: Implied volatility in the CEV model subordinated by a gamma process. The strikes and maturities are as in Table 2. Theprocess S , with the current spot value S = s ( s ) = s s b − (where s = . , b = .
7) and the drift given by (34) (where the risk-free rate is r =
2% and the jump-parametersin (32) equal m = , n = . X are as in Table 3. The total computation time forall the option price in the table under the Markov chain model is less than one tenth of a second. The prices in the model S werecomputed using a Monte Carlo algorithm that first generates the paths of the gamma process ( T t ) t ≥ (using the algorithm in [11],page 144) and then, via an Euler scheme, generates paths of the process S . For the T = paths were generatedin 200 seconds. All computations are performed on the same hardware as in Table 2. done by matrix exponentiation. The transition semigroup of the chain X is of the form P ( X t = y | X = x ) = e ′ x exp ( t L ) e y , where x , y ∈ E , (39) e x , e y are the corresponding vectors of the standard basis of R N and ′ denotes transposition. For moredetails on this pricing algorithm see [2]. The implied volatilities in the CEV, the variance gamma andthe subordinated CEV model were obtained by a closed-form formula, a fast Fourier transform inversionalgorithm and a Monte Carlo algorithm respectively. As mentioned earlier the quality of this approximationcan be improved considerably, without increasing the size of the set E , by matching more than the first twoinstantaneous moments of the process S . Volatility derivatives – the continuous case.
The next task is to construct the process I defined inSection 2, obtain its law at a maturity T using Theorem 2.1 and compare it to the law of the random variable [ log ( S )] T defined in (1) by pricing non-linear contracts.Let S be the CEV processs with the parameter values as in the caption of Table 2 and let X be the corre-sponding Markov chain, which is also described uniquely in the same caption. As described in Section 4 inthis case we use k = I matches the first and the second instantaneous conditional momentsof the process [ log ( X )] T defined in (6)) and hence define the state-dependent intensities in the conditionalgenerator of L I by (28) and (29). We still need to determine the values of the spacing a , the size ( C + ) of the state-space of I and the largest possible jump-size a n of the process I at any given time.The necessary and sufficien condition on parameters a and n is given by (30). Figure 1b contains thegraph of the ratio in question x M ( x ) / M ( x ) , x ∈ E , for the CEV model. The minimum of the ratio is0 . a . The largest value of the ratio is approximately 0 . n =
50 satisfies the first inequality in (30).An important observation here is that Figure 1b only displays the values of the ratio M ( x ) / M ( x ) for x in E ∩ [ , ] . The choices of a and n made above therefore satisfy condition (30) only in this range(recall that in this case we have x = x = X to get below 20 or above 250 in2 years time is less than 10 − (see Figure 1a). This intuitive statement is supproted by the quality of theapproximation of the empirical distribution of [ log ( S )] T by the distribution of I T (see Figure 1c and Table 5).We now need to choose the size ( C + ) of the state-space for the process I . The integer C is determinedby the longest maturity that we are interested in, which in our case is 2 years. This is because we are usingTheorem 2.1 to find the joint law of the random variable ( X T , I T ) and must make sure that the process I does not complete the full circle during the time interval of length T (recall that the pricing algorithm basedon Theorem 2.1 makes the assumption that the process I is on a circle). In other words we have to choose C so that the chain X accumulates much less than 2 C a of realized variance. In the example consideredhere it is sufficient to take C = { , a , . . . , a C } , defined in the paragraphfollowing (8), a uniform lattice in the interval between 0 and 440 · . = . a does not change with maturity, all that is needed to obtain the joint probability distribution of ( X T , I T ) for allmaturities T ∈ { . , , } is to diagonalize numerically the complex matrices L j , j = , ..., C , in (14) onlyonce. The distribution of I T , obtained as a marginal of the random vector ( X T , I T ) , is plotted in Figure 1c.Note that the computational time required to obtain the law of I T is therefore independent of maturity T .We now perform a numerical comparisons between our method for pricing volatility derivatives and apricing algorithm based on a Monte Carlo simulation of the CEV model S . We generate 10 paths of theprocess S using an Euler scheme and compute the empirical probability distribution of the realized variance [ log ( S )] T based on that sample (see Figure 1c). We also compute the variance swap, the volatility swapand the call option prices E (cid:2) ( S T / T − ( q K ) ) + (cid:3) , for q ∈ { , , } , where K : = p E [ S T / T ] and S T denotes either I T or [ log ( S )] T . The prices and the computation times are documented in Table 5. Acursory inspection of the prices of non-linear payoff functions reveals that the method for k = k =
1, without adding computational complexity sinceboth algorithms require finding the spectrum of ( C + ) complex matrices in (15). We will soon see that OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 19
CEV k moments Spectral: I T MC: [ log ( S )] T derivative \ T . . p E [ S T / T ] E hp S T / T i q =
80% 2 1.46% 1.47% 1.52% (0.003%) (0.003%) (0.005%)call option 1 0.33% 0.33% 0.43% 0.39% 0.38% 0.45% q = q = Table 5: The prices of volatility derivatives in the CEV model S . The parameter values for the process S and the chain X are givenin the caption of Table 2. The parameters for the process I are a = . C =
220 for k ∈ { , } and n =
50 when k = n controls the jumps of I strictly larger than a , which are note present if k = S T denotes either I T or [ log ( S )] T and the call option price is E (cid:2) ( S T / T − ( q K ) ) + (cid:3) , for q ∈ { , , } , K : = p E [ S T / T ] .An Euler scheme with a time-increment of one day is used to generate 10 paths of the CEV process S and the sum in (1) is usedto obtain the empirical distribution of [ log ( S )] T (see Figure 1c) and to evaluate the contingent claims in this table. The numbers inbrackets are the standard errors in the Monte Carlo simulation. The computational time for the pricing of volatility derivativesusing our algorithm is independent of the maturity T . All computations are performed on the same hardware as in Table 2. the discrepancy between the algorithm in [1] and the one proposed in the current paper is amplified in thepresence of jumps. Note also that all three methods ( k = , Volatility derivatives – the discontinuous case.
In this subsection we will study numerically thebehaviour of the law of random variables [ log ( S )] T and Q L , UT ( S ) , defined in (1) and (4) respectively, where S is a Markov process with jumps. Let S be a variance gamma or a subordinated CEV process with parametervalues given in the captions of Tables 3 and 4 respectively.Since S has discontinuous trajectories we will have to match k = I in order to avoid large pricing errors for non-linear payoffs (see Tables 6 and 7for the size of the errors when k = X as described in Subsection 5.1 usingthe parameter values in the captions of Tables 3 and 4. All computations in this subsection are performedusing the implementation in [14] of our algorithm.Recall from Subsection 5.2 that in order to define the process I we need to set values for the integers1 < n < m and the spacing a so that the intensity functions l , l n , l m are positive (Table 1 states explicitnecessary and sufficient conditions for this to hold). Note that the inequality in (38) is necessary if l n isto be positive. Figure 2c contians the graph of the function x M ( x ) M ( x ) / M ( x ) for x ∈ E such that20 ≤ x ≤
250 in the case of the variance gamma model. If we choose n : = m : = , then the left-hand side of the inequality in (38) equals 13 .
48, which is an upper bound for the ratio inFigure 2c. If S equals the subordinated CEV process, the graph of the function x M ( x ) M ( x ) / M ( x ) takes a similar form and the same choice of n , m as above satisfies the inequality in (38).The distance a between the consecutive points in the state-space of the process I has to be chosen sothat the inequality a ( x ) < a < a ( x ) is satisfied for all x ∈ E (see Table 1). Figure 2d contains the graphsof the functions a , a over the state-space of X in the range 20 ≤ x ≤
250 for the variance gamma model.The corresponding graphs in case of the subordianted CEV process are very similar and are not reported. Itfollows that by choosing a : = . x ∈ E such that 20 ≤ x ≤ a that lies between the zeros a ( x ) and a ( x ) for all x ∈ E for our specific choice ofthe chain X and its state-space. However not matching the instantaneous conditional moments of I T and Q L , UT ( X ) outside of the interval [ , ] is in practice of little consequence because the probability that thechain X gets into this region (recall that the current spot level is assumed to be 100) before the maturity T = X in the case of variance gamma model).Once the parameters n , m and a have been determined, we use the explicit expressions for l , l n , l m onpage 14 to define the state dependent intensities of the process I for the states x ∈ E that satisfy 20 ≤ x ≤ l , l n , l m : E → R + to be constant. The choice of parameter C =
65 is, like in the previous subsection, determined by the longest maturity we are interested in (in ourcase this is T = [ log ( S )] T , for T ∈ { . , , } , in the variance gammaand the subordinated CEV model based on the approximation I T are given in Figures 2b and 3a respectively.The prices of various payoffs on the realized variance in these two models are given in Tables 6 and 7.Observe that the time required to compute the distribution of I T in the case of the continuous process S (see Table 5) is larger than the time required to perform the equivalent task for the process with jumps (seeTables 6 and 7). From the point of view of the algorithm this difference arises because in the continuouscase we have to use more points in the state-space of the process I since condition (30) forces the choiceof the smaller spacing a . In other words the quotient x M ( x ) / M ( x ) takes much smaller values if thereare no jumps in the model S than if there are. It is intuitively clear from definition (8) that this ratio for thevariance gamma (or the subordinated CEV) has a larger lower bound than the function in Figure 1b, becausein the the diffusion case the generator matrix is tridiagonal.Finally we apply our algorithm to computing the law of the corridor-realized variance Q L , UT ( S ) , where S is the subordinated CEV process and the corridor is given by L =
70 and U = I defined by matching k = Q L , UT ( S ) approximates best the entire distribution of the corridor-realized variance. However, ifone is interested only in the value of the corridor variance swap (i.e. a derivative with a payoff that is linearin Q L , UT ( S ) ), Table 8 shows that it suffices to take k = OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 21 VG k moments Spectral: I T MC: [ log ( S )] T derivative \ T . . p E [ S T / T ] E hp S T / T i q =
80% 2 1.56% 1.48% 1.45% (0.007%) (0.005%) (0.004%)3 1.66% 1.53% 1.47%call option 1 0.50% 0.36% 0.25% 0.85% 0.63% 0.45% q = q = Table 6: The prices of volatility derivatives in the variance gamma model S . The parameter values for the process S and the chain X are given in the caption of Table 3. The parameters for the process I are a = . C =
65 for k = , ,
3. We choose n = k = n = , m =
30 when k =
3. The variable S T and the payoffs are as in Table 5. The algorithm in [11], page 144, isused to generate 10 paths of the VG process S and the sum in (1) is used to obtain the empirical distribution of [ log ( S )] T (seeFigure 2b) and to evaluate the contingent claims in this table. The numbers in brackets are the standard errors in the Monte Carlosimulation. Note that the computational time for the pricing of volatility derivatives using the process I is independent of thematurity T . All computations are performed on the same hardware as in Table 2 (see [14] for the source code in Matlab).
7. C
ONCLUSION
We proposed an algorithm for pricing and hedging volatility derivatives and derivatives on the corridor-realized variance in markets driven by Markov processes of dimension one. The scheme is based on an order k approximation of the corridor-realized variance process by a continuous-time Markov chain. We provedthe weak convergence of our scheme as k tends to infinity and demonstrated with numerical examples thatin practice it is sufficient to use k = k = S is a component of a two dimensional Markovprocess. The implementation of the algorithm in this case is hampered by the dimension of the generator ofthe approximating Markov chain, which would in this case be approximately 2000 (as opposed to 70, as inthe examples of Section 6). It would be interesting to understand the precise structure of this large generatormatrix and perhaps exploit it to obtain an efficient algorithm for pricing volatility derivatives in the presenceof stochastic volatility. CEV + jumps k moments Spectral: I T MC: [ log ( S )] T derivative \ T . . p E [ S T / T ] E hp S T / T i q =
80% 2 1.56% 1.49% 1.46% (0.007%) (0.005%) (0.004%)3 1.66% 1.54% 1.49%call option 1 0.51% 0.37% 0.30% 0.86% 0.64% 0.49% q = q = Table 7: The prices of volatility derivatives in the subordinated CEV model S . The parameter values for the process S and thechain X are given in the caption of Table 4. The parameters for the process I , the random variable S T and the payoffs of thevolatility derivatives are as in Table 6. The algorithm described in the caption of Table 4 is used to generate 10 paths of theprocess S and the sum in (1) is used to obtain the empirical distribution of [ log ( S )] T (see Figure 3a) and to evaluate the contingentclaims in this table. The numbers in brackets are the standard errors in the Monte Carlo simulation. Note that the computationaltime for the pricing of volatility derivatives using the process I is independent of the maturity T . All computations are performedon the same hardware as in Table 2 (the code in [14] can easily be adapted to this model). CEV + jumps k moments Spectral: I T MC: Q L , UT ( S ) derivative \ T . . p E [ S T / T ] E hp S T / T i Table 8: Contingent claims on corridor-realized variance in the subordinated CEV model S . The corridor is defined by L =
70 and U = Q L , UT ( S ) and the law of I T for T ∈ { . , , } aregiven in Figure 3b. The Monte Carlo algorithm is as described in Table 4 and the numbers in brackets are the standard errors in thesimulation. OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 23 A PPENDIX
A. P
ARTIAL - CIRCULANT MATRICES
A matrix C ∈ R n × n is circulant if there exists a vector c ∈ R n such that C i j = c ( i − j ) mod n for all i , j ∈{ , . . . , n } . The matrix C can always be diagonalised analytically, when viewed as a linear operator on thecomplex vector space C n , as follows. For any r ∈ { , . . . , n − } we have an eigenvalue l r and a corre-sponding eigenvector y ( r ) (i.e. the equation Cy ( r ) = l r y ( r ) holds for all r and the family of vectors y ( r ) , r ∈ { , . . . , n − } , spans the whole of C n ) of the form l r = n − (cid:229) k = c k e − i p n rk and y ( r ) j = √ n e − i p n r j for j ∈ { , . . . , n − } . It is interesting to note that the eigenvectors y ( r ) , r ∈ { , . . . , n − } , are independent of the circulant matrix C . For the proof of these statements see Appendix A in [1].Let A be a linear operator represented by a matrix in R m × m and let B ( k ) , for k = , . . . , m −
1, be a familyof n -dimensional matrices with the following property: there exists an invertible matrix U ∈ C n × n such that U − B ( k ) U = L ( k ) , for all k ∈ { , . . . , m − } , where L ( k ) is a diagonal matrix in C n × n . In other words this condition stipulates that the family of matrices B ( k ) can be simultaneously diagonalized by the transformation U . Therefore the columns of matrix U areeigenvectors of B ( k ) for all k between 0 and m − e A , acting on a vector space of dimension mn , in the followingway. Clearly the matrix e A can be decomposed naturally into m blocks of size n × n . Let e A i , j denote an n × n matrix which represents the block in the i -th row and j -th column of this decomposition. We now define theoperator e A as e A ii : = B ( i ) + A ii I R n and(40) e A i j : = A i j I R n , for all i , j ∈ { , . . . , m } such that i = j . (41)The real numbers A i j are the entries of matrix A and I R n is the identity operator on R n . We may now stateour main definition. Definition.
A matrix is termed partial-circulant if it admits a structural decomposition as in (40) and (41)for any matrix A ∈ R m × m and a family of n -dimensional circulant matrices B ( k ) , for k = , . . . , m − PPENDIX
B. N ON - UNIFORM STATE - SPACE OF THE M ARKOV CHAIN X The task here is to construct a non-uniform state-space for the Markov chain X , which was used inSection 6 to approximate the Markov process S . Recall that the state-space is a set of non-negative realnumbers E = { x , x , . . . , x N − } for some even integer N ∈ N . Recall that the elements of the set E , whenviewed as a finite sequence, are strictly increasing. We first fix three real numbers l , s , u ∈ R , such that l < s < u , that specify the boundaries of the lattice x = l , x N − = u and the starting point of the chain x ⌈ N / ⌉ = s = S which coincides with the initial spot value in the model S . The function ⌈·⌉ : R → Z returnsthe smallest integer which is larger or equal than the argument. We next choose strictly positive parametervalues g l , g u which control the granularity of the spacings between l and s and between s and u respectively. In other words the larger g l (resp. g u ) is, the more uniformly spaced the lattice is in the interval [ l , s ] (resp. [ s , u ] ). The algorithm that constructs the lattice points is a slight modification of the algorithm in [19], page167, and can be described as follows.(1) Compute c = arcsinh (cid:16) l − sg l (cid:17) , c = arcsinh (cid:16) u − sg u (cid:17) , N l = ⌈ N / ⌉ and N u = N − ( N l + ) .(2) Define the lower part of the grid by the formula x k : = s + g l sinh ( c ( − k / N l )) for k ∈ { , . . . , N l } .Note that x = l , x N l = s .(3) Define the upper part of the grid using the formula x N l + k : = s + g u sinh ( c k / N u ) for k ∈ { , . . . , N u } .Note that x N − = u . R EFERENCES [1] C. Albanese, H. Lo, and A. Mijatovi´c. Spectral methods for volatility derivative. to apper in
Quantitative Finance .[2] C.Albanese and A. Mijatovi´c. A stochastic volatility model for risk-reversals in foreign exchange. to appear in
InternationalJournal of Theoretical and Applied Finance .[3] P. Carr and R. Lee. Hedging variance options on continuous semimartingales. to appear in
Finance and Stochastics .[4] P. Carr and K. Lewis. Corridor variance swaps.
Risk , 17(2), February 2004.[5] P. Carr and D. Madan. Option valuation using the fast fourier transform.
Journal of Computational Finance , pages 61–73,1998.[6] P. Carr and D. Madan. Towards a theory of volatility trading. In R. Jarrow, editor,
Volatility: New Estimation Techniques forPricing Derivatives , Risk publication, pages 417–427. Risk, 1998.[7] P. Carr, D. Madan, H. Geman, and M. Yor. Pricing options on realized variance.
Finance and Stochastics , 9(4):453–475, 2005.[8] N. Dunford and J.T. Schwartz.
Linear operators Part II: Spectral theory . John Wiley & Sons, 1963.[9] S.N. Ethier and T.G. Kurtz.
Markov processes: Characterization and convergence . John Wiley & Sons, 1986.[10] J. Gatheral.
The volatility surface: a practitoner’s guide . John Wiley & Sons, Inc., 2006.[11] P. Glasserman.
Monte Carlo Methods in Financial Engineering . Springer, 2004.[12] J. Hull.
Options, futures and other derivatives . Pearson Education, 6th edition, 2006.[13] J. Jacod and A.N. Shiryaev.
Limit theorems for stochastic processes , volume 288 of
A Series of Comprehensive Studies inMathematics . Springer-Verlag, 2nd edition, 2003.[14] H. Lo and A. Mijatovi´c. An implementation in Matlab of the algorithm given in the paper “Volatility derivatives in marketmodels with jumps”, 2009. see URL .[15] D. Madan, P. Carr, and E.C. Chang. The variance gamma process and option pricing.
European Finance Review , 2(1):79–105,1998.[16] A. Mijatovi´c. Spectral properties of trinomial trees.
Proc. R. Soc. A , 463:1681–1696, 2007.[17] R. S. Phillips. On the generation of semigroups of linear operators.
Pacific Journal of Mathematics , 2(3):343–369, 1952.[18] P. Protter.
Stochastic integration and differential equations . Springer, 2nd edition, 2005.[19] D. Tavella and C. Randall.
Pricing Financial Instruments: the finite difference method.
Wiley, 2000.I
MPERIAL C OLLEGE L ONDON AND S WISS R E E-mail address : [email protected] D EPARTMENT OF M ATHEMATICS , I
MPERIAL C OLLEGE L ONDON
E-mail address : [email protected] OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 25 P r obab ili t y den s i t y f un c t i on PDF of the spot price in the CEV model T=1/2T=1T=2 (a) The probability distribution function for the spot price X T ,with the maturity T equal to 0.5, 1 and 2 years, where X is theMarkov chain used to approximate the CEV process S . For aprecise description of the process X see Subsection 6.1. Allrelevant parameter values are given in the caption of Table 2.
40 60 80 100 120 140 160 180 200 220 24000.0020.0040.0060.0080.010.0120.0140.0160.0180.02 x M ( x ) / M ( x ) CEV model − the ratio M (x)/M (x) (b) The function x M ( x ) / M ( x ) , where x ∈ E , in the CEVmodel. The minimum of this function, which equals 0 . a by the second inequalityin (30). The maximum of the ratio, which is 0 . n by the first inequality in (30).All relevant parameter values for the CEV model and theaccompanying chain X are given in the caption of Table 2. P r obab ili t y den s i t y f un c t i on PDF of the realized variance in the CEV model Monte Carlo [log(S)] T (T=1/2)Spectral k=1 I T (T=1/2)Spectral k=2 I T (T=1/2)Monte Carlo [log(S)] T (T=1)Spectral k=1 I T (T=1)Spectral k=2 I T (T=1)Monte Carlo [log(S)] T (T=2)Spectral k=1 I T (T=2)Spectral k=2 I T (T=2) (c) The empirical probability distribution of the realized variance [ log ( S )] T ofthe CEV model S , based on the Monte Carlo simulation described inSubsection 6.2, and the distribution of the random variable I T , obtained fromTheorem 2.1, for T ∈ { . , , } . For details on the definition of I T seeSections 2 and 4. Note that the computational time required to obtain the lawof I T is independent of T (see caption of Table 5).Figure 1: CEV model P r obab ili t y den s i t y f un c t i on PDF of the spot price in the VG model T=1/2T=1T=2 (a) The probability distribution function for the spot price X T , with the maturity T equal to 0.5, 1 and 2 years, where X is the Markov chain used to approximate the variancegamma process. For a precise description of the process X see Subsection 6.1. All relevant parameter values are givenin the caption of Table 3. P r obab ili t y den s i t y f un c t i on PDF of the realized variance in the VG model Monte Carlo [log(S)] T (T=1/2)Spectral k=2 I T (T=1/2)Spectral k=3 I T (T=1/2)Monte Carlo [log(S)] T (T=1)Spectral k=2 I T (T=1)Spectral k=3 I T (T=1)Monte Carlo [log(S)] T (T=2)Spectral k=2 I T (T=2)Spectral k=3 I T (T=2) (b) The empirical probability distribution of the realized variance [ log ( S )] T in the VG model S , based on the Monte Carlosimulation described in Subsection 6.3, and the distribution of therandom variable I T for T ∈ { . , , } matching k ∈ { , } instantaneous moments. For details on I T see Sections 2 and 5.Note that the computational time required to obtain the law of I T is independent of T and that the quality of the approximation isgreater for k =
40 60 80 100 120 140 160 180 200 220 2400246810121416 X(x) M ( x ) M ( x ) / M ( x ) VG model − the ratio 4M (x)M (x)/M (x) (c) The function x M ( x ) M ( x ) M ( x ) , for x ∈ E such that20 ≤ x ≤ X are given in the caption of Table 3.
40 60 80 100 120 140 160 180 200 220 24000.0050.010.0150.020.0250.030.035 X(x)Solutions of the quadratic equation correspond to l n (x) in the VG model min of the two solutionsmax of the two solutions (d) The functions x a ( x ) and x a ( x ) , for x ∈ E such that20 ≤ x ≤ l n ( x ) is positive, we mustchoose the value of the constant a to lie between the twocurves for all x in the above range (see also Subsection 6.3).Figure 2: Variance gamma model OLATILITY DERIVATIVES IN MARKET MODELS WITH JUMPS 27 P r obab ili t y den s i t y f un c t i on PDF of the realized variance in the subordinated CEV model Monte Carlo [log(S)] T (T=1/2)Spectral k=2 I T (T=1/2)Spectral k=3 I T (T=1/2)Monte Carlo [log(S)] T (T=1)Spectral k=2 I T (T=1)Spectral k=3 I T (T=1)Monte Carlo [log(S)] T (T=2)Spectral k=2 I T (T=2)Spectral k=3 I T (T=2) (a) The distribution of the realized variance in the subordinated CEVmodel. P r obab ili t y den s i t y f un c t i on PDF of the corridor−realized variance in the subordinated CEV model Monte Carlo Q
L,UT (T=1/2)Spectral k=2 I T (T=1/2)Spectral k=3 I T (T=1/2)Monte Carlo Q L,UT (T=1)Spectral k=2 I T (T=1)Spectral k=3 I T (T=1)Monte Carlo Q L,UT (T=2)Spectral k=2 I T (T=2)Spectral k=3 I T (T=2) (b) The distribution of the corridor-realized variance in thesubordinated CEV model.Figure 3: Figure 3a (resp. 3b) contains the empirical probability distribution of the realized variance [ log ( S )] T (resp.corridor-realized variance Q L , UT ( S ) , where L =
70 and U = S , based on the Monte Carlosimulation described in the caption of Table 4 (see also Subsection 6.3). The distribution of the random variable I T for thematurity T ∈ { . , , } matching k ∈ { , } instantaneous moments is also plotted in both cases. For details on I T see Sections 2and 5. The computational time required to obtain the law of I T is independent of T and the quality of the approximation improvesdrastically for k ==