Recursive Marginal Quantization of Higher-Order Schemes
Thomas A. McWalter∗, Ralph Rudd, Jörg Kienitz and Eckhard Platen

Department of Actuarial Science and the African Collaboration for Quantitative Finance and Risk Research, University of Cape Town
Department of Finance and Investment Management, University of Johannesburg
Fachbereich Mathematik und Naturwissenschaften, Bergische Universität Wuppertal
Finance Discipline Group and School of Mathematical and Physical Sciences, University of Technology Sydney

September 10, 2018
Abstract
Quantization techniques have been applied in many challenging finance applications, including pricing claims with path dependence and early exercise features, stochastic optimal control, filtering problems and efficient calibration of large derivative books. Recursive marginal quantization of the Euler scheme has recently been proposed as an efficient numerical method for evaluating functionals of solutions of stochastic differential equations. This method involves recursively quantizing the conditional marginals of the discrete-time Euler approximation of the underlying process. By generalizing this approach, we show that it is possible to perform recursive marginal quantization for two higher-order schemes: the Milstein scheme and a simplified weak order 2.0 scheme. As part of this generalization a simple matrix formulation is presented, allowing efficient implementation. We further extend the applicability of recursive marginal quantization by showing how absorption and reflection at the zero boundary may be incorporated, when this is necessary. To illustrate the improved accuracy of the higher-order schemes, various computations are performed using geometric Brownian motion and its generalization, the constant elasticity of variance model. For both processes, we show numerical evidence of improved weak order convergence and we compare the marginal distributions implied by the three schemes to the known analytical distributions. By pricing European, Bermudan and barrier options, further evidence of the improved accuracy of the higher-order schemes is demonstrated.
1 Introduction

Quantization techniques have been shown to be effective in a number of quantitative finance applications. In particular, this includes the pricing of contingent claims with path dependency and early exercise [Pagès and Wilbertz, 2009; Sagna, 2011; Bormetti et al., 2016], stochastic control problems [Pagès et al., 2004] and nonlinear filtering [Pagès and Pham, 2005]. To improve numerical efficiency, Pagès and Sagna [2015] introduced a technique known as recursive marginal quantization. This technique has been shown to be effective for fast calibration of large derivative books by Callegaro et al. [2014, 2015a], another challenging application in finance. This paper aims to significantly improve the accuracy of recursive marginal quantization.

∗ Correspondence: [email protected]

Quantization is a lossy compression technique that produces a discrete representation of a signal using less information than the original. The technique originated in the field of signal compression, but has found application in fields as far-reaching as signal processing, pattern recognition, data mining, integration theory and, more recently, numerical probability. For general overviews of the mathematics and applications of quantization see Du et al. [1999] and Pagès [2014].

Vector quantization of probability distributions was formalized in the work of Graf and Luschgy [2000] and has been applied to the field of mathematical finance since its inception. It is a technique for optimally representing a continuous distribution by a discrete distribution, where a measure of the 'distance' between the two, called the distortion, is minimized. The distortion is most commonly measured using the squared Euclidean error.

The application of vector quantization to the solution of finance-related problems generally proceeds by discretising time and then quantizing the corresponding marginal distributions of the system of stochastic differential equations specific to the problem. The quantized grids and their associated weights are then used to compute the expectations required in pricing contingent claims (including claims with early exercise) or for performing the optimizations required in stochastic control problems [Pagès et al., 2004]. Due to the reliance on Lloyd's algorithm [Lloyd, 1982], or variants thereof, these approaches generally incur a heavy computational burden. A more efficient approach for the single-factor case has recently been proposed by Pagès and Sagna [2015].
Known as recursive marginal quantization (RMQ), it makes use of a Newton-Raphson iteration to quantize the Euler-Maruyama [Maruyama, 1955] updates of the underlying SDE. This technique has been used to provide fast calibration of a local volatility model by Callegaro et al. [2014, 2015a], and has been extended for use with two-factor SDEs and applied to stochastic volatility models [Callegaro et al., 2015b].

In the present work, the RMQ algorithm is generalized, allowing the implementation of schemes of higher order than the Euler-Maruyama scheme. We now provide an overview of the rest of the paper.

Section Two provides a review of vector quantization (VQ) as applied to probability distributions. In particular, we strive to simultaneously provide a precise, concise and intuitive description of the methodology. The resulting algorithm is presented using a matrix formulation, allowing for efficient implementation. The section concludes by showing examples of vector quantization applied to the Gaussian and noncentral chi-squared distributions.

The third section introduces recursive marginal quantization applied to stochastic differential equations. Our formulation of the problem is presented in more generality than the original formulation by Pagès and Sagna [2015]. A matrix formulation, which demonstrates the connection with Markov chains, is provided, allowing easy and efficient implementation.

In the section that follows we extend the RMQ algorithm to higher-order updates, specifically the Milstein [1975] scheme and a simplified weak order 2.0 Taylor scheme of Kloeden and Platen [1999].
Geometric Brownian motion (GBM) and the constant elasticity of variance (CEV) process serve as examples to illustrate the improved error in the quantized marginal distributions and the improved weak order convergence.

When performing a Monte Carlo simulation of a discrete-time approximation of a process, non-negativity of the solution is usually enforced by implementing absorption or reflection. Under certain circumstances this is also required when using RMQ. In Section Five we present the modifications of the RMQ algorithm necessary to ensure an absorbing or reflecting boundary at zero. These modifications allow the RMQ algorithm to be applied to the CEV process for parameter sets that would otherwise be problematic under the original formulation.

Section Six presents numerical results of the application of the three RMQ schemes to option pricing. European, barrier and Bermudan options are priced under the GBM and CEV models. Where possible, the results are compared to available closed-form solutions; otherwise they are compared to high-resolution finite difference or Monte Carlo implementations.

One of the goals of this paper has been to produce a more general and easily accessible introduction to the theory of VQ and RMQ. The primary contribution is, however, the more general formulation of RMQ that enables the systematic extension of the work of Pagès and Sagna [2015]. The paper concludes with a discussion of ongoing work.
2 Vector Quantization

Vector quantization is a lossy compression technique that provides a way to encode a vector space using a discrete subspace. While the technique is applicable more generally, we shall only consider the quantization of one-dimensional distributions. The vector quantization problem we aim to address in this section may be specified intuitively as follows:

Find the discrete distribution that "best" represents the continuous distribution function associated with a random variable $X$.

This is depicted in Figure 1, which shows the density function of a continuous random variable and its corresponding quantized version. Here, for ease of visualization, we have chosen to plot the probability density function (of the continuous random variable) and the probability mass function of the quantizer instead of the continuous and discrete distribution functions.

We now provide a more rigorous specification of the problem. Let $X$ be a continuous random variable, taking values in $\mathbb{R}$, and defined on a probability space $(\Omega, \mathcal{F}, P)$. The above question may be reposed as:

How does one optimally approximate $X$, in a least-squares sense, by a discrete random variable $\hat{X} : \Omega \to \Gamma$, where $\Gamma$ is a finite set of elements in $\mathbb{R}$?
Figure 1: The continuous probability density function on the left is quantized on the right, with probabilities represented on the vertical axis.

Figure 2: Representation of the region $R_i(\Gamma)$ associated with codeword $\gamma_i$, bounded by $r_{i-}$ and $r_{i+}$ and flanked by the neighbouring codewords $\gamma_{i-1}$ and $\gamma_{i+1}$.

The reason that quantization is useful is that it allows efficient approximation of expectations of functionals $H(X)$ of $X$, that is,
\[ E[H(X)] = \int_{\mathbb{R}} H(x)\,dP(X \le x) \approx \sum_{\gamma \in \Gamma} H(\gamma)\, P\big(\hat{X} = \gamma\big). \]
Here, for example, $H$ may be the discounted payoff of a financial claim and $P$ the risk-neutral probability measure.

Consider the approximation of $X$ given by $\hat{X}$, a discrete random variable defined as the nearest-neighbour projection of $X$ onto $\Gamma = \{\gamma_1, \gamma_2, \ldots, \gamma_N\}$, a set of distinct points in $\mathbb{R}$ with finite cardinality $N \in \mathbb{N}^+$. We shall refer to $\Gamma$ as the quantizer and its elements as codewords. The nearest-neighbour projection operator $\pi_\Gamma : \mathbb{R} \to \Gamma$ is defined as
\[ \pi_\Gamma(X) = \big\{ \gamma_i \in \Gamma \,\big|\, \|X - \gamma_i\| \le \|X - \gamma_j\| \text{ for all } j = 1, \ldots, N, \text{ where equality holds only for } i < j \big\}. \]
Here, $\|\cdot\|$ is the Euclidean norm. Associated with the quantizer, the regions $R_i(\Gamma) \subset \mathbb{R}$ are the subsets of values of $X$ that are mapped to each codeword $\gamma_i$:
\[ R_i(\Gamma) = \big\{ x \in \mathbb{R} \,\big|\, \pi_\Gamma(x) = \gamma_i \big\}. \]
These are also known as Voronoi regions. For the sake of brevity we shall use $R_i$ to refer to $R_i(\Gamma)$ when it is clear that the quantizer we are referring to is $\Gamma$. The set of regions $\{R_i\}_{i=1}^N$ is called a tessellation of $\mathbb{R}$, and has the following properties: $R_i \cap R_j = \emptyset$ for $i \ne j$ and $\cup_{i=1}^N R_i = \mathbb{R}$. Since we are working in one dimension, the regions $R_i$ may be defined directly as $R_i = \{x \,|\, r_{i-} < x \le r_{i+}\}$, with
\[ r_{i-} = \frac{\gamma_{i-1} + \gamma_i}{2} \quad \text{and} \quad r_{i+} = \frac{\gamma_i + \gamma_{i+1}}{2}, \]
for $1 \le i \le N$, where, by definition, $r_{1-} = -\infty$ and $r_{N+} = \infty$. If the distribution under consideration is not defined over the whole real line, then $r_{1-}$ and $r_{N+}$ are adjusted to reflect the interval of support. Figure 2 shows a simple graphical representation of these regions.

With these definitions in place, we are now in a position to precisely define the optimization problem. We wish to find the quantizer $\Gamma$ such that $\hat{X} = \pi_\Gamma(X)$ best approximates $X$. The sense in which the quantizer $\Gamma$ is "best" is specified by the distortion function
\[ D(\Gamma) = E\big[\|X - \hat{X}\|^2\big] = \int_{\mathbb{R}} \|x - \pi_\Gamma(x)\|^2\,dP(X \le x) = \sum_{i=1}^N \int_{R_i(\Gamma)} \|x - \gamma_i\|^2\,dP(X \le x). \tag{1} \]
We require the $\Gamma$ that minimizes $D(\Gamma)$. The probability weights then follow directly as a result of the nearest-neighbour projection operator, i.e., $P(\hat{X} = \gamma_i) = P(X \in R_i)$. A necessary condition on the optimal $\Gamma$ is that the gradient of the distortion function is zero, that is, $\nabla D(\Gamma) = \bar{0}$, where the elements of $\nabla D(\Gamma)$ are given by
\[ \frac{\partial D(\Gamma)}{\partial \gamma_i} = 2 \int_{R_i(\Gamma)} (\gamma_i - x)\,dP(X \le x), \]
for $1 \le i \le N$. Intuitively, this means that the first moment of $X$ conditioned on the outcomes of a region equals the respective codeword. Thus, one way to solve this system of equations is to set up a fixed-point iteration using the above expression for the gradient.
This is the basis for Lloyd's algorithm, which starts with an initial guess for the quantizer, $\Gamma^{(0)}$, and generates successive updates, $\Gamma^{(n+1)}$, with the new codewords, $\gamma_i^{(n+1)}$, computed as the centroids of the regions associated with the previous update $\Gamma^{(n)}$:
\[ \gamma_i^{(n+1)} = \frac{\int_{R_i(\Gamma^{(n)})} x \,dP(X \le x)}{\int_{R_i(\Gamma^{(n)})} dP(X \le x)}, \]
for $1 \le i \le N$ and $0 \le n < n_{\max}$.

Another approach is to represent the quantizer by the column vector $\Gamma$, derive the Hessian, $\nabla^2 D(\Gamma)$, and compute the updated estimates of the quantizer using the iterative Newton-Raphson method
\[ \Gamma^{(n+1)} = \Gamma^{(n)} - \big[\nabla^2 D\big(\Gamma^{(n)}\big)\big]^{-1} \nabla D\big(\Gamma^{(n)}\big), \]
for $0 \le n < n_{\max}$. We now develop this approach further.

2.1 Matrix Formulation

Suppose $f_X$ and $F_X$ are the PDF and CDF of $X$, respectively. We define the $p$-th lower partial expectation as
\[ M_X^p(x) = E\big[ X^p\, \mathbb{1}_{\{X \le x\}} \big], \]
writing $M_X$ for the case $p = 1$. The elements of the gradient may then be written as
\[ \frac{\partial D(\Gamma)}{\partial \gamma_i} = 2 \Big[ \gamma_i \big( F_X(r_{i+}) - F_X(r_{i-}) \big) - \big( M_X(r_{i+}) - M_X(r_{i-}) \big) \Big], \]
for $1 \le i \le N$. To express these quantities compactly, define the column vectors $\Delta\Gamma$, with $[\Delta\Gamma]_i = \gamma_{i+1} - \gamma_i$, and $\mathbf{f}$, with $[\mathbf{f}]_i = f_X(r_{i+})$, for $1 \le i \le N-1$, along with the row vector of probabilities $\mathbf{p}$, with $[\mathbf{p}]_i = F_X(r_{i+}) - F_X(r_{i-})$. The Hessian, $\nabla^2 D(\Gamma)$, is tridiagonal, with super- and sub-diagonal given by the $(N-1)$-element row vector
\[ \mathbf{h}_{\text{off}} = -\tfrac{1}{2} [\mathbf{f} \circ \Delta\Gamma]^\top, \]
with the main diagonal given by
\[ \mathbf{h}_{\text{main}} = 2\mathbf{p} + [\mathbf{h}_{\text{off}} \,|\, 0] + [0 \,|\, \mathbf{h}_{\text{off}}], \]
where the copies of the $\mathbf{h}_{\text{off}}$ vector are appended and prepended with a zero. It is now straightforward to set up the Newton-Raphson iteration in terms of these quantities.

2.2 Examples

In this section we apply the above methodology to the Gaussian distribution and the noncentral chi-squared distribution with one degree of freedom. The latter is important for the higher-order recursive marginal quantization schemes that we explore later in the paper.

When $X$ is a standard normal random variable we have
\[ f_X(x) = \phi(x), \qquad F_X(x) = \Phi(x), \qquad M_X(x) = -\tfrac{1}{\sqrt{2\pi}} e^{-x^2/2} = -\phi(x), \]
where $\phi(\cdot)$ and $\Phi(\cdot)$ are the standard normal PDF and CDF, respectively. Here, a good guess for the initial quantizer $\Gamma^{(0)}$ is
\[ \gamma_n = 5.1\,\frac{n}{N+1} - 2.55, \]
for $1 \le n \le N$. Figure 3 shows the quantizer of cardinality $N = 50$, using this initial guess, after $n_{\max} = 20$ Newton-Raphson iterations.

Figure 3: Vector quantization of the standard normal distribution.
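To make the iteration concrete, the following sketch implements the Newton-Raphson update for the standard normal case. It is a minimal illustration, assuming NumPy is available; the equally spaced initial grid and the helper names (`newton_vq`, `phi`, `Phi`, `M`) are choices made here, not taken from the text.

```python
import numpy as np
from math import erf, pi, sqrt

phi = lambda x: np.exp(-0.5 * np.square(x)) / sqrt(2.0 * pi)    # standard normal PDF
Phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0))))  # standard normal CDF
M   = lambda x: -phi(x)                                         # E[X 1_{X<=x}] = -phi(x)

def newton_vq(N=50, iters=20):
    """Vector-quantize N(0,1) with N codewords via Newton-Raphson."""
    g = np.linspace(-2.5, 2.5, N)              # simple increasing initial guess
    for _ in range(iters):
        r = np.concatenate(([-np.inf], 0.5 * (g[:-1] + g[1:]), [np.inf]))
        p = np.diff(Phi(r))                    # region probabilities F(r_{i+}) - F(r_{i-})
        grad = 2.0 * (g * p - np.diff(M(r)))   # gradient of the distortion
        h_off = -0.5 * phi(r[1:-1]) * np.diff(g)            # sub-/super-diagonal
        h_main = (2.0 * p + np.concatenate((h_off, [0.0]))
                          + np.concatenate(([0.0], h_off)))  # main diagonal
        H = np.diag(h_main) + np.diag(h_off, 1) + np.diag(h_off, -1)
        g = g - np.linalg.solve(H, grad)       # Newton-Raphson step
    r = np.concatenate(([-np.inf], 0.5 * (g[:-1] + g[1:]), [np.inf]))
    return g, np.diff(Phi(r))

gamma, p = newton_vq()
# quantized expectations: E[H(X)] is approximated by sum_i H(gamma_i) p_i
print(p.sum(), (gamma * p).sum(), (gamma**2 * p).sum())
```

The weights sum to one and the quantized mean is zero by symmetry; the quantized second moment falls slightly below one, the shortfall being exactly the distortion of the stationary quantizer.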
At first glance, it is tempting to think of the quantizer (represented by bars) as one might think of a histogram and to suspect that it does not adequately capture the features of the density because it has the "incorrect shape". This is, however, a misleading analogy, since a histogram represents the probability of realising a random variable in equally sized intervals on the $x$-axis. The quantizer, by contrast, accumulates the probability mass over the Voronoi region associated with each codeword and, as these regions are larger in the tails than in the body of the distribution, proportionally more probability mass is accumulated for codewords in the tails than for codewords in the body.

While, in general, the noncentral chi-squared distribution must be specified using Bessel functions, this is not the case when the number of degrees of freedom equals one. In particular, consider the random variable $X = (Z + \mu)^2$, where $Z \sim N(0,1)$. Then $X \sim \chi'^2(1, \lambda)$ is noncentral chi-squared distributed with one degree of freedom and noncentrality parameter $\lambda = \mu^2$. Moreover, on $x \in \mathbb{R}^+$, we have
\[ f_X(x) = \frac{1}{2\sqrt{x}} \big( \phi(x_+) + \phi(x_-) \big), \qquad F_X(x) = \Phi(x_+) - \Phi(x_-), \]
\[ M_X(x) = (1 + \lambda) \big( \Phi(x_+) - \Phi(x_-) \big) + \phi(x_+)\, x_- - \phi(x_-)\, x_+, \]
where $\phi(\cdot)$ and $\Phi(\cdot)$ are the standard normal PDF and CDF, respectively, and
\[ x_\pm = \pm\sqrt{x} - \sqrt{\lambda}. \]
This means that we may express the noncentral chi-squared distribution with one degree of freedom using the standard normal PDF and CDF, thus allowing efficient computation of a quantization scheme. This will be important for computational efficiency when we implement the higher-order RMQ schemes later in the paper.

When implementing VQ, care must be taken when evaluating these functions at the left and right limits. To ensure convergence we set $f_X(0) = F_X(0) = M_X(0) = 0$, $f_X(\infty) = 0$, $F_X(\infty) = 1$ and $M_X(\infty) = 1 + \lambda$.
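These closed-form expressions are straightforward to implement through $\phi$ and $\Phi$ alone. The sketch below is illustrative (NumPy assumed; the function name `ncx2_1` is ours); it can be checked against the defining integrals and the tail limits quoted above.

```python
import numpy as np
from math import erf, pi, sqrt

phi = lambda x: np.exp(-0.5 * np.square(x)) / sqrt(2.0 * pi)    # standard normal PDF
Phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0))))  # standard normal CDF

def ncx2_1(x, lam):
    """PDF, CDF and first lower partial expectation of chi'^2(1, lam) on x > 0."""
    x = np.asarray(x, dtype=float)
    xp = np.sqrt(x) - sqrt(lam)       # x_+
    xm = -np.sqrt(x) - sqrt(lam)      # x_-
    f = (phi(xp) + phi(xm)) / (2.0 * np.sqrt(x))
    F = Phi(xp) - Phi(xm)
    M = (1.0 + lam) * F + phi(xp) * xm - phi(xm) * xp
    return f, F, M

# tail limits from the text: F -> 1 and M -> 1 + lam as x -> infinity
print(ncx2_1(1e6, 5.0)[1:])
```

A useful sanity check is that the CDF and lower partial expectation differentiate back to $f_X(x)$ and $x f_X(x)$, respectively.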
Of course, all three functions are zero when $x$ is negative. A good initial guess for $\Gamma^{(0)}$ is given by
\[ \gamma_n = \begin{cases} \big( (3 + \sqrt{\lambda})\, \frac{n}{N} \big)^2 & \text{for } \sqrt{\lambda} < 2.55, \\ \big( 5.1\,\frac{n}{N+1} - 2.55 + \sqrt{\lambda} \big)^2 & \text{for } \sqrt{\lambda} \ge 2.55, \end{cases} \]
for $1 \le n \le N$. Figure 4 shows three examples of quantizers of cardinality $N = 50$ for the noncentral chi-squared distribution with one degree of freedom, for a range of noncentrality parameters. Note that for certain values of the noncentrality parameter (e.g. the central panel) the distribution is bimodal, while for larger values it resembles a Gaussian distribution.

Figure 4: Three examples of the noncentral chi-squared distribution with one degree of freedom, for noncentrality parameters $\lambda = 1$, $\lambda = 5$ and $\lambda = 50$.

3 Recursive Marginal Quantization

Consider the continuous-time diffusion specified by the stochastic differential equation
\[ dX_t = a(X_t)\,dt + b(X_t)\,dW_t, \qquad X_0 = x_0 \in \mathbb{R}, \tag{3} \]
defined on the filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in [0,T]}, P)$, with $a$ and $b$ sufficiently smooth and regular functions to ensure the existence of a weak solution. The question of interest is:

How does one optimally approximate $X_{t_k} : \Omega \to \mathbb{R}$, for some time discretisation point $t_k \in [0,T]$, when the distribution of $X_{t_k}$ is unknown?

Usually this is achieved by performing a Monte Carlo experiment using a discrete-time approximation scheme for the SDE, the simplest scheme being the Euler-Maruyama [Maruyama, 1955] update
\[ \tilde{X}_{k+1} = \tilde{X}_k + a(\tilde{X}_k)\,\Delta t + b(\tilde{X}_k)\sqrt{\Delta t}\,Z_{k+1} =: U(\tilde{X}_k, Z_{k+1}), \]
for $0 \le k < K$, where $\Delta t = T/K$ and the $Z_{k+1} \sim N(0,1)$ are independent, with $\tilde{X}_0 = x_0$. The innovation of Pagès and Sagna [2015] was to show that a recursive procedure based on quantizing these updates is possible. Since $\tilde{X}_1$ has a Gaussian distribution, it is possible to use vector quantization to obtain $\Gamma_1$, an optimal quantization grid for the first step of the above scheme.
One must, however, find a way to quantize the successive (marginal) distributions of $\tilde{X}_{k+1}$. Given knowledge of the distribution of $\tilde{X}_k$, the distortion of the quantizer $\Gamma_{k+1}$ may be written as
\[ \tilde{D}(\Gamma_{k+1}) = E\big[ \|\tilde{X}_{k+1} - \hat{X}_{k+1}\|^2 \big] = E\Big[ E\big[ \|\tilde{X}_{k+1} - \hat{X}_{k+1}\|^2 \,\big|\, \tilde{X}_k \big] \Big] = E\Big[ E\big[ \|U(\tilde{X}_k, Z_{k+1}) - \hat{X}_{k+1}\|^2 \,\big|\, \tilde{X}_k \big] \Big] = \int_{\mathbb{R}} E\big[ \|U(x, Z_{k+1}) - \hat{X}_{k+1}\|^2 \big]\,dP\big(\tilde{X}_k \le x\big). \tag{4} \]
Unfortunately, we do not know the exact distribution of $\tilde{X}_k$ for $k > 1$. The main result of Pagès and Sagna [2015] shows that if one uses the previously quantized distribution of $\hat{X}_k$, instead of the continuous distribution of $\tilde{X}_k$, the resultant procedure converges. Furthermore, the error associated with this procedure is bounded by a constant, which is dependent on the parameters used. As a result, the integral in (4) may be rewritten as a sum over the codewords in quantizer $\Gamma_k$ and their associated probabilities. An approximate value for the distortion may then be computed as
\[ \tilde{D}(\Gamma_{k+1}) \approx D(\Gamma_{k+1}) := \sum_{i=1}^{N_k} E\big[ \|U(\gamma_k^i, Z_{k+1}) - \hat{X}_{k+1}\|^2 \big]\, P\big(\hat{X}_k = \gamma_k^i\big). \]
Here, $N_k$ is the cardinality of the quantizer $\Gamma_k$ at time step $k$, which is allowed to vary. With this definition of $D(\Gamma_{k+1})$ we may now specify a Newton-Raphson iteration in order to compute the quantizer at time step $k+1$, which minimizes the distortion. Given the quantizer at time $t_k$, represented as a column vector $\Gamma_k$, and the associated probabilities, $P(\hat{X}_k = \gamma_k^i)$ for $1 \le i \le N_k$, the Newton-Raphson iteration for the quantizer $\Gamma_{k+1}$, at time $t_{k+1}$, is given by
\[ \Gamma_{k+1}^{(n+1)} = \Gamma_{k+1}^{(n)} - \big[\nabla^2 D\big(\Gamma_{k+1}^{(n)}\big)\big]^{-1} \nabla D\big(\Gamma_{k+1}^{(n)}\big), \tag{5} \]
for $0 \le n < n_{\max}$.

Before developing the mathematics further, we pause to provide an intuitive explanation of how the RMQ algorithm proceeds. Figure 5 depicts the process. The top panel shows the quantizer at time step $k$. Conditional on each codeword, a Gaussian Euler update is propagated (second panel). In panel three, these updates are weighted by the probability of the associated originating codeword and summed to produce the marginal density at time step $k+1$, as shown in the final panel. The distribution associated with this marginal density is the distribution that is quantized to produce the quantizer at time step $k+1$. This process is repeated until the quantizer at the final time is produced.

Figure 5: Illustration of the RMQ algorithm: the quantizer at the previous time step, the conditional densities, the weighted conditional densities and the resulting marginal density.

We now proceed to derive the quantities required for the Newton-Raphson iterations. To summarize notation, we write the update in affine form as
\[ U(\gamma_k^i, Z_{k+1}) =: U_{k+1}^i = m_k^i Z_{k+1}^i + c_k^i, \tag{6} \]
where
\[ m_k^i := b(\gamma_k^i)\sqrt{\Delta t} \quad \text{and} \quad c_k^i := \gamma_k^i + a(\gamma_k^i)\,\Delta t, \]
with $Z_{k+1}^i \sim N(0,1)$ identically distributed to $Z_{k+1}$. Here, we have introduced a new index $i$ for the random variable $Z_{k+1}^i$, anticipating that it may depend on $\gamma_k^i$.
This is redundant in the case of the Euler update, because $Z_{k+1}$ is a standard normal random variate irrespective of the starting point. The more general notation will become necessary when we analyse more general cases (see Section 4). We denote the corresponding density and distribution functions by $f_{Z_{k+1}^i}$ and $F_{Z_{k+1}^i}$, respectively.

With this notation in place, the approximate marginal distribution of $\tilde{X}_{k+1}$ is
\[ F_{\tilde{X}_{k+1}}(x) = \int_{\mathbb{R}} P\big(U(y, Z_{k+1}) \le x\big)\,dP\big(\tilde{X}_k \le y\big) \approx \sum_{i=1}^{N_k} P\big(U_{k+1}^i \le x\big)\, P\big(\hat{X}_k = \gamma_k^i\big) = \sum_{i=1}^{N_k} \bigg[ H(-m_k^i) + \operatorname{sgn}(m_k^i)\, F_{Z_{k+1}^i}\Big( \frac{x - c_k^i}{m_k^i} \Big) \bigg] P\big(\hat{X}_k = \gamma_k^i\big), \tag{7} \]
where $H(\cdot)$ is the Heaviside step function and $\operatorname{sgn}(\cdot)$ is the signum function. The last step follows because the left-hand probability on the penultimate line may be written as
\[ P\big(U_{k+1}^i \le x\big) = \begin{cases} P\Big( Z_{k+1}^i \le \frac{x - c_k^i}{m_k^i} \Big) & \text{for } m_k^i > 0, \\ 1 - P\Big( Z_{k+1}^i \le \frac{x - c_k^i}{m_k^i} \Big) & \text{for } m_k^i < 0. \end{cases} \]
In most cases $m_k^i$ is a proxy for the volatility of the SDE and its positivity is usually guaranteed, in which case (7) may be simplified. However, we persist with this formulation because in the general case $m_k^i$ may not be guaranteed to be positive.

The elements of the gradient of the distortion, $\nabla D(\Gamma_{k+1})$, may then be written as
\[ \frac{\partial D(\Gamma_{k+1})}{\partial \gamma_{k+1}^j} = 2 \sum_{i=1}^{N_k} E\Big[ \mathbb{1}_{\{U_{k+1}^i \in R_j(\Gamma_{k+1})\}} \big( \gamma_{k+1}^j - U_{k+1}^i \big) \Big] P\big(\hat{X}_k = \gamma_k^i\big) = 2 \sum_{i=1}^{N_k} \int_{U_{k+1}^i \in R_j(\Gamma_{k+1})} \big( \gamma_{k+1}^j - U_{k+1}^i \big)\,dP\big(Z_{k+1}^i \le x\big)\, P\big(\hat{X}_k = \gamma_k^i\big), \tag{8} \]
where $1 \le j \le N_{k+1}$ is the index tracking the elements of the $t_{k+1}$ quantizer, and the index associated with the $t_k$ quantizer is $i$.
The integration bounds in (8) must now be expressed in terms of the variable of integration. Here, as in Section 2, $U_{k+1}^i \in R_j(\Gamma_{k+1})$ is equivalent to the inequality $r_{k+1}^{j-} < U_{k+1}^i \le r_{k+1}^{j+}$, with
\[ r_{k+1}^{j\pm} = \frac{\gamma_{k+1}^{j \pm 1} + \gamma_{k+1}^j}{2}, \tag{9} \]
and $r_{k+1}^{1-} = -\infty$ and $r_{k+1}^{N_{k+1}+} = \infty$ by definition. Defining the conditionally normalized region boundaries,
\[ r_{k+1}^{i,j\pm} = \frac{r_{k+1}^{j\pm} - c_k^i}{m_k^i}, \]
allows the inequality to be written in terms of the random variable $Z_{k+1}^i$ as
\[ U_{k+1}^i \in R_j(\Gamma_{k+1}) \iff \begin{cases} r_{k+1}^{i,j-} < Z_{k+1}^i \le r_{k+1}^{i,j+} & \text{for } m_k^i > 0, \\ r_{k+1}^{i,j-} > Z_{k+1}^i \ge r_{k+1}^{i,j+} & \text{for } m_k^i < 0. \end{cases} \]
The gradient of the distortion at time $t_{k+1}$ is then given by
\[ \frac{\partial D(\Gamma_{k+1})}{\partial \gamma_{k+1}^j} = 2 \sum_{i=1}^{N_k} \Big[ \big( \gamma_{k+1}^j - c_k^i \big) \operatorname{sgn}(m_k^i) \Big( F_{Z_{k+1}^i}\big(r_{k+1}^{i,j+}\big) - F_{Z_{k+1}^i}\big(r_{k+1}^{i,j-}\big) \Big) - |m_k^i| \Big( M_{Z_{k+1}^i}\big(r_{k+1}^{i,j+}\big) - M_{Z_{k+1}^i}\big(r_{k+1}^{i,j-}\big) \Big) \Big] P\big(\hat{X}_k = \gamma_k^i\big). \]
The diagonal elements of the (tridiagonal) Hessian, $\nabla^2 D(\Gamma_{k+1})$, are given by
\[ \frac{\partial^2 D(\Gamma_{k+1})}{\partial \big(\gamma_{k+1}^j\big)^2} = \sum_{i=1}^{N_k} \bigg[ 2 \operatorname{sgn}(m_k^i) \Big( F_{Z_{k+1}^i}\big(r_{k+1}^{i,j+}\big) - F_{Z_{k+1}^i}\big(r_{k+1}^{i,j-}\big) \Big) + \frac{f_{Z_{k+1}^i}\big(r_{k+1}^{i,j+}\big)}{2|m_k^i|} \big( \gamma_{k+1}^j - \gamma_{k+1}^{j+1} \big) + \frac{f_{Z_{k+1}^i}\big(r_{k+1}^{i,j-}\big)}{2|m_k^i|} \big( \gamma_{k+1}^{j-1} - \gamma_{k+1}^j \big) \bigg] P\big(\hat{X}_k = \gamma_k^i\big), \]
with the super-diagonal and sub-diagonal elements given by
\[ \frac{\partial^2 D(\Gamma_{k+1})}{\partial \gamma_{k+1}^j \,\partial \gamma_{k+1}^{j+1}} = \sum_{i=1}^{N_k} \frac{f_{Z_{k+1}^i}\big(r_{k+1}^{i,j+}\big)}{2|m_k^i|} \big( \gamma_{k+1}^j - \gamma_{k+1}^{j+1} \big) P\big(\hat{X}_k = \gamma_k^i\big) \]
and
\[ \frac{\partial^2 D(\Gamma_{k+1})}{\partial \gamma_{k+1}^j \,\partial \gamma_{k+1}^{j-1}} = \sum_{i=1}^{N_k} \frac{f_{Z_{k+1}^i}\big(r_{k+1}^{i,j-}\big)}{2|m_k^i|} \big( \gamma_{k+1}^{j-1} - \gamma_{k+1}^j \big) P\big(\hat{X}_k = \gamma_k^i\big), \]
respectively. Although these expressions may appear complex, they are simply summations over the density function, cumulative distribution function and first lower partial expectation of the random variable $Z_{k+1}^i$, and are thus easy to compute when these functions are known. All the detail required to implement the Newton iteration (5) has now been provided, with the exception of the initial guess.
In all applications considered in this paper, we have assumed that $N_k = N$ for $1 \le k \le K$, and used the quantizer from the previous time step as the initial guess, i.e., $\Gamma_{k+1}^{(0)} = \Gamma_k$.

3.1 Matrix Formulation

As in Section 2.1, where we provided an efficient matrix formulation for the Newton-Raphson iteration required for VQ, RMQ is also amenable to a matrix specification. This aids simple and computationally efficient implementation. Aside from a guess for $\Gamma_{k+1}$, we require the following time-indexed column vectors:
\[ [\mathbf{m}_k]_i = m_k^i, \quad [\mathbf{c}_k]_i = c_k^i, \quad \text{for } 1 \le i \le N_k, \]
and
\[ [\Delta\Gamma_{k+1}]_i = \gamma_{k+1}^{i+1} - \gamma_{k+1}^i, \quad \text{for } 1 \le i \le N_{k+1} - 1. \]
The row vector of probabilities
\[ \mathbf{p}_k = \big[ P(\hat{X}_k = \gamma_k^1), \ldots, P(\hat{X}_k = \gamma_k^{N_k}) \big] \]
is retained, and a row vector of ones of length $d$ is denoted by $\mathbf{j}_d$. With the exception of $\Delta\Gamma_{k+1}$, which must be recomputed before each Newton-Raphson iteration, the other vectors are computed once per time step.

Before each Newton-Raphson iteration, three matrices must be computed in terms of the new estimate of $\Gamma_{k+1}$: an $N_k \times N_{k+1}$ matrix of transition probabilities,
\[ [\mathbf{P}_{k+1}]_{i,j} = P\big( \hat{X}_{k+1} = \gamma_{k+1}^j \,\big|\, \hat{X}_k = \gamma_k^i \big) = \operatorname{sgn}(m_k^i) \Big[ F_{Z_{k+1}^i}\big(r_{k+1}^{i,j+}\big) - F_{Z_{k+1}^i}\big(r_{k+1}^{i,j-}\big) \Big], \]
another matrix, of the same size, of lower partial moment values,
\[ [\mathbf{M}_{k+1}]_{i,j} = M_{Z_{k+1}^i}\big(r_{k+1}^{i,j+}\big) - M_{Z_{k+1}^i}\big(r_{k+1}^{i,j-}\big), \tag{10} \]
and an $N_k \times (N_{k+1} - 1)$ matrix of density values,
\[ [\mathbf{f}_{k+1}]_{i,j} = f_{Z_{k+1}^i}\big(r_{k+1}^{i,j+}\big). \]
The gradient of the distortion function at time step $k+1$ may then be written in terms of these vectors and matrices as
\[ \nabla D(\Gamma_{k+1})^\top = 2\, \mathbf{p}_k \Big( \big( (\Gamma_{k+1}\, \mathbf{j}_{N_k})^\top - \mathbf{c}_k\, \mathbf{j}_{N_{k+1}} \big) \circ \mathbf{P}_{k+1} - \big( |\mathbf{m}_k|\, \mathbf{j}_{N_{k+1}} \big) \circ \mathbf{M}_{k+1} \Big), \tag{11} \]
where $\circ$ is the Hadamard (or element-wise) product. The super- and sub-diagonal elements of the (tridiagonal) Hessian matrix, $\nabla^2 D(\Gamma_{k+1})$, are given by the vector
\[ \mathbf{h}_{\text{off}} = -\tfrac{1}{2}\, \mathbf{p}_k \Big( \big( |\mathbf{m}_k|^{\circ -1}\, \mathbf{j}_{N_{k+1}-1} \big) \circ \mathbf{f}_{k+1} \circ \big( \Delta\Gamma_{k+1}\, \mathbf{j}_{N_k} \big)^\top \Big), \tag{12} \]
while the main diagonal is given by
\[ \mathbf{h}_{\text{main}} = 2\, \mathbf{p}_k \mathbf{P}_{k+1} + \big[ \mathbf{h}_{\text{off}} \,|\, 0 \big] + \big[ 0 \,|\, \mathbf{h}_{\text{off}} \big], \tag{13} \]
where $\circ -1$ denotes the element-wise inverse. Once the iteration has converged, the probabilities associated with $\Gamma_{k+1}$ are computed using
\[ \mathbf{p}_{k+1} = \mathbf{p}_k \mathbf{P}_{k+1}, \]
where $\mathbf{P}_{k+1}$ must be recomputed in terms of the final $\Gamma_{k+1}$. Thus, the matrix formulation presented here allows RMQ to be interpreted as the propagation of an inhomogeneous, discrete-time Markov chain, where $\Gamma_k$ represents the Markov states at time step $k$, the probability of being in those states is $\mathbf{p}_k$, and the associated transition probability matrix is $\mathbf{P}_{k+1}$. Sometimes in the literature the transition probability matrix between time steps $k$ and $k+1$ is represented as $\mathbf{P}_{k,k+1}$; we have chosen to omit the first index.

3.2 Example: Geometric Brownian Motion

As a first example of the RMQ algorithm, Figure 6 shows the evolution of the quantizers through time for the canonical (risk-neutral) geometric Brownian motion process
\[ dS_t = r S_t\,dt + \sigma S_t\,dW_t, \tag{14} \]
using parameters $S_0 = 100$, $r = 5\%$ and $\sigma = 30\%$. The RMQ parameters used were $T = 1$, $K = 12$, $\Delta t = T/K$ and $N_k = 200$ for all $k$, with $n_{\max} = 50$ for the VQ algorithm and $n_{\max} = 5$ for the RMQ algorithm. Unless otherwise stated, these parameters are used wherever geometric Brownian motion appears in other calculations in the paper.

Figure 6: Time evolution of quantizers for the GBM process, shown as probability against quantizer level, in profile and in 3D.
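Assembled from the pieces above, the Euler-scheme recursion is only a few lines of linear algebra. The following is an illustrative sketch, not the implementation used for the paper's figures: NumPy is assumed, the cardinality $N = 50$ and the naive equally spaced first-step guess are simplifications made here, and `rmq_euler` is a name of our choosing.

```python
import numpy as np
from math import erf, pi, sqrt

Phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0))))  # standard normal CDF
phi = lambda z: np.exp(-0.5 * np.square(z)) / sqrt(2.0 * pi)    # standard normal PDF
Mlp = lambda z: -phi(z)                                         # E[Z 1_{Z<=z}] = -phi(z)

def rmq_euler(a, b, x0, T, K, N, newton_iters=5):
    """Euler-scheme RMQ: returns the final-time quantizer and its probabilities."""
    dt = T / K
    gam, p = np.array([x0]), np.array([1.0])     # degenerate quantizer at t = 0
    for k in range(K):
        m = b(gam) * sqrt(dt)                    # m_k^i  (positive here)
        c = gam + a(gam) * dt                    # c_k^i
        # initial guess: previous grid (first step: spread around the one-step mean)
        g = c[0] + m[0] * np.linspace(-2.5, 2.5, N) if k == 0 else gam.copy()
        for _ in range(newton_iters):
            rb = np.concatenate(([-np.inf], 0.5 * (g[:-1] + g[1:]), [np.inf]))
            z = (rb[None, :] - c[:, None]) / m[:, None]  # normalized boundaries
            P = np.diff(Phi(z), axis=1)                  # transition probabilities
            dM = np.diff(Mlp(z), axis=1)                 # lower partial moments
            grad = 2.0 * (p @ ((g[None, :] - c[:, None]) * P - m[:, None] * dM))
            h_off = -0.5 * (p @ (phi(z[:, 1:-1]) / m[:, None])) * np.diff(g)
            h_main = (2.0 * (p @ P) + np.concatenate((h_off, [0.0]))
                                    + np.concatenate(([0.0], h_off)))
            H = np.diag(h_main) + np.diag(h_off, 1) + np.diag(h_off, -1)
            g = g - np.linalg.solve(H, grad)             # Newton-Raphson step (5)
        rb = np.concatenate(([-np.inf], 0.5 * (g[:-1] + g[1:]), [np.inf]))
        P = np.diff(Phi((rb[None, :] - c[:, None]) / m[:, None]), axis=1)
        gam, p = g, p @ P                                # p_{k+1} = p_k P_{k+1}
    return gam, p

# GBM: dS = r S dt + sigma S dW
S0, r, sigma, T, K, N = 100.0, 0.05, 0.3, 1.0, 12, 50
gam, p = rmq_euler(lambda x: r * x, lambda x: sigma * x, S0, T, K, N)
print(p.sum(), (gam * p).sum())                          # total mass, first moment
```

At a stationary point of the distortion the mean propagates exactly through the chain, so the first moment of the terminal quantizer should reproduce $S_0(1 + r\Delta t)^K$, the mean of the Euler chain; discounted payoffs evaluated as $\sum_j H(\gamma_j) p_j$ then approximate prices under the quantized marginal.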
In the plots of Figure 6, the colour of the quantizer indicates the associated time step, with lines in blue closer to the initial time and lines in green closer to the final time. This convention is kept throughout.

4 Higher-Order Schemes

Given the general formulation of the RMQ algorithm in the previous section, we now explore two higher-order extensions: the Milstein scheme and a simplified weak order 2.0 scheme. Any numerical scheme for an SDE that can be written in the affine form (6) may be used with the RMQ algorithm, as long as the CDF, PDF and lower partial expectation of the random variable $Z_{k+1}^i$ can be computed for all $i$ and $k$.

4.1 The Milstein Scheme

The Milstein [1975] scheme for the SDE in (3) is given by
\[ \tilde{X}_{k+1} = \tilde{X}_k + a(\tilde{X}_k)\,\Delta t + b(\tilde{X}_k)\sqrt{\Delta t}\,Z_{k+1} + \tfrac{1}{2} b(\tilde{X}_k) b'(\tilde{X}_k)\,\Delta t \big( Z_{k+1}^2 - 1 \big), \]
for $0 \le k < K$, where $\Delta t = T/K$ and $Z_{k+1} \sim N(0,1)$, with initial value $\tilde{X}_0 = x_0$. By completion of the square, this may be written as
\[ \tilde{X}_{k+1} = \tilde{X}_k + \Big( a(\tilde{X}_k) - \tfrac{1}{2} b(\tilde{X}_k) b'(\tilde{X}_k) \Big) \Delta t - \frac{b(\tilde{X}_k)}{2 b'(\tilde{X}_k)} + \tfrac{1}{2} b(\tilde{X}_k) b'(\tilde{X}_k)\,\Delta t \Big( Z_{k+1} + \big( \sqrt{\Delta t}\, b'(\tilde{X}_k) \big)^{-1} \Big)^2. \]
The update is therefore of the required affine form, $U_{k+1}^i = m_k^i Z_{k+1}^i + c_k^i$, where
\[ m_k^i = \tfrac{1}{2} b(\gamma_k^i) b'(\gamma_k^i)\,\Delta t \quad \text{and} \quad c_k^i = \gamma_k^i + \Big( a(\gamma_k^i) - \tfrac{1}{2} b(\gamma_k^i) b'(\gamma_k^i) \Big) \Delta t - \frac{b(\gamma_k^i)}{2 b'(\gamma_k^i)}. \]
The random variable $Z_{k+1}^i$ is now noncentral chi-squared distributed with one degree of freedom and noncentrality parameter
\[ \lambda_{k+1}^i = \big( \sqrt{\Delta t}\, b'(\gamma_k^i) \big)^{-2}. \]
It is important to note that, unlike the Euler-Maruyama case, the distribution of the random variable $Z_{k+1}^i \sim \chi'^2(1, \lambda_{k+1}^i)$ now depends on the codeword $\gamma_k^i$.

Although the Milstein scheme possesses a strong order of convergence of 1, compared to the Euler scheme, which only has a strong order of convergence of $\tfrac{1}{2}$, both schemes have a weak order of convergence of 1.
Thus, while the Milstein scheme is more accurate in a strong sense than the Euler scheme, we require a different update for higher weak order convergence, which is needed when approximating expectations of financial payoffs. We now explore such a scheme.

4.2 The Simplified Weak Order 2.0 Taylor Scheme

While it is not possible to write the weak order 2.0 Taylor scheme in the affine form required, the simplified weak order 2.0 scheme of Kloeden and Platen [1999] is amenable. This scheme is given by
\[ \tilde{X}_{k+1} = \tilde{X}_k + a\,\Delta t + b\sqrt{\Delta t}\,Z_{k+1} + \tfrac{1}{2} b b'\,\Delta t \big( Z_{k+1}^2 - 1 \big) + \tfrac{1}{2} \Big( a' b + a b' + \tfrac{1}{2} b'' b^2 \Big) (\Delta t)^{3/2}\, Z_{k+1} + \tfrac{1}{2} \Big( a a' + \tfrac{1}{2} a'' b^2 \Big) (\Delta t)^2, \]
where $a$, $b$ and their derivatives are evaluated at $\tilde{X}_k$, for $0 \le k < K$, with $\Delta t = T/K$, $Z_{k+1} \sim N(0,1)$ and initial value $\tilde{X}_0 = x_0$. Again, completion of the square is used to write this update in the required affine form, $U_{k+1}^i = m_k^i Z_{k+1}^i + c_k^i$, where
\[ m_k^i = \tfrac{1}{2} b(\gamma_k^i) b'(\gamma_k^i)\,\Delta t \]

Figure 7: Convergence of the first moment for GBM (Euler β = 0.998, Milstein β = 0.997, Order 2.0 β = 1.856) and CEV (Euler β = 0.997, Milstein β = 0.997, Order 2.0 β = 1.885).

and
\[ c_k^i = \gamma_k^i + \Big( a - \tfrac{1}{2} b b' \Big) \Delta t + \tfrac{1}{2} \Big( a a' + \tfrac{1}{2} a'' b^2 \Big) (\Delta t)^2 - \frac{\Big( b + \tfrac{1}{2} \big( a' b + a b' + \tfrac{1}{2} b'' b^2 \big) \Delta t \Big)^2}{2 b b'}, \]
with all coefficient functions evaluated at $\gamma_k^i$. Here, $Z_{k+1}^i$ is again noncentral chi-squared distributed with one degree of freedom, with noncentrality parameter given by
\[ \lambda_{k+1}^i = \Bigg( \frac{ b(\gamma_k^i) + \tfrac{1}{2} \big( a'(\gamma_k^i) b(\gamma_k^i) + a(\gamma_k^i) b'(\gamma_k^i) + \tfrac{1}{2} b''(\gamma_k^i) b(\gamma_k^i)^2 \big) \Delta t }{ b(\gamma_k^i)\, b'(\gamma_k^i) \sqrt{\Delta t} } \Bigg)^2 \]
or, more succinctly, $Z_{k+1}^i \sim \chi'^2(1, \lambda_{k+1}^i)$.

4.3 Numerical Results

To illustrate the accuracy of the above schemes, the RMQ algorithm is applied to geometric Brownian motion, as previously described by (14), and to the constant elasticity of variance (CEV) process. The SDE for the CEV process is
\[ dS_t = r S_t\,dt + \sigma S_t^\alpha\,dW_t, \]
and, in the examples that follow, the process-specific parameters chosen were $S_0 = 100$, $r = 5\%$ and $\alpha = 0.5$, with $\sigma = \sigma_{\text{LN}} S_0^{1-\alpha}$ given in terms of the instantaneous log-normal volatility $\sigma_{\text{LN}} = 30\%$. In the case of GBM the parameters in Section 3.2 were used. For both GBM and CEV, the RMQ-specific parameters mentioned in that section were also used, unless stated otherwise.

In Figure 7 we show numerical evidence for weak order convergence. The absolute value of the difference between the first moment of the resulting terminal quantizer and the true moment is plotted for a range of time step sizes. In each case, the horizontal axis provides the base-2 logarithm of the step size and the vertical axis the base-2 logarithm of the error in the first moment. Thus, the slope of each graph reflects the power of the step size and hence the order of weak convergence for the error. In the figure legends, the regressed gradients of the graphs, denoted by β, indicate the weak orders of convergence. As is theoretically expected, see Kloeden and Platen [1999], both the Euler and Milstein schemes have approximately weak order one convergence, whereas the simplified weak order 2.0 scheme reaches a weak order close to two.
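The completion-of-square reductions in Sections 4.1 and 4.2 are mechanical and easy to get wrong; they can be checked by comparing the affine form against the raw update at arbitrary draws. The following sketch does this for CEV-style coefficients (NumPy assumed; the helper names and the specific parameter values are illustrative choices, not from the text):

```python
import numpy as np

# CEV-style coefficients: a(x) = r x, b(x) = sigma x^alpha, and their derivatives
r, sigma, alpha = 0.05, 3.0, 0.5
a, ap, app = (lambda x: r * x), (lambda x: r), (lambda x: 0.0)
b   = lambda x: sigma * x**alpha
bp  = lambda x: sigma * alpha * x**(alpha - 1.0)
bpp = lambda x: sigma * alpha * (alpha - 1.0) * x**(alpha - 2.0)

def milstein_affine(x, dt):
    """(m, c, lam) such that the Milstein update equals m*(Z + sqrt(lam))**2 + c."""
    m = 0.5 * b(x) * bp(x) * dt
    c = x + (a(x) - 0.5 * b(x) * bp(x)) * dt - b(x) / (2.0 * bp(x))
    lam = (np.sqrt(dt) * bp(x)) ** (-2)
    return m, c, lam

def weak2_affine(x, dt):
    """The same reduction for the simplified weak order 2.0 update."""
    A = ap(x) * b(x) + a(x) * bp(x) + 0.5 * bpp(x) * b(x)**2
    m = 0.5 * b(x) * bp(x) * dt
    c = (x + (a(x) - 0.5 * b(x) * bp(x)) * dt
           + 0.5 * (a(x) * ap(x) + 0.5 * app(x) * b(x)**2) * dt**2
           - (b(x) + 0.5 * A * dt)**2 / (2.0 * b(x) * bp(x)))
    lam = ((b(x) + 0.5 * A * dt) / (b(x) * bp(x) * np.sqrt(dt))) ** 2
    return m, c, lam

def milstein_update(x, dt, Z):
    return x + a(x) * dt + b(x) * np.sqrt(dt) * Z + 0.5 * b(x) * bp(x) * dt * (Z**2 - 1.0)

def weak2_update(x, dt, Z):
    A = ap(x) * b(x) + a(x) * bp(x) + 0.5 * bpp(x) * b(x)**2
    return (x + a(x) * dt + b(x) * np.sqrt(dt) * Z
              + 0.5 * b(x) * bp(x) * dt * (Z**2 - 1.0)
              + 0.5 * A * dt**1.5 * Z
              + 0.5 * (a(x) * ap(x) + 0.5 * app(x) * b(x)**2) * dt**2)

# both raw updates must coincide with m*(Z + sqrt(lam))**2 + c for arbitrary Z
x, dt, Z = 100.0, 1.0 / 12.0, np.linspace(-3.0, 3.0, 7)
for affine, update in ((milstein_affine, milstein_update), (weak2_affine, weak2_update)):
    m, c, lam = affine(x, dt)
    assert np.allclose(m * (Z + np.sqrt(lam))**2 + c, update(x, dt, Z))
```

Note that both schemes share the same coefficient $m_k^i = \tfrac{1}{2} b\, b'\, \Delta t$; only $c_k^i$ and the noncentrality parameter differ.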
Therefore, the latter scheme is more accurate by orders of magnitude and can be expected to produce results with substantially lower error than the Euler scheme when valuing contingent claims. For increased accuracy in these plots, the cardinality used was N_k = 1000.

Since the true conditional distributions for GBM and CEV are known in closed form, we can compare the approximate marginal distributions at time step k + 1, as implied by the quantizer at time step k and computed using (7), with the exact distributions at time step k + 1. In the case of the CEV process we used the analytical expression for the distribution given by Lindsay and Brecher [2012], adjusted for the drift component in the SDE.

For each of the three schemes, under the GBM and CEV test cases, the difference between the exact marginal distribution and the implied marginal distribution is plotted in Figure 8. Here we reverted to using a cardinality of N_k = 200. Note the scale of the y-axes of the graphs in the figure: from top to bottom, the magnitude of the error decreases by an order of magnitude in each successive row. This gives an indication of the improvement that can be expected when these higher order schemes are used to price contingent claims.

Sometimes discrete-time approximations of an SDE may exhibit behaviour that is inconsistent with the true solution. For example, an Euler-Maruyama approximation of geometric Brownian motion or the CEV process can, under certain circumstances, generate negative values, even though the SDE specification guarantees non-negativity in each case. As a result, discrete-time Monte Carlo simulations are often modified to generate reflecting or absorbing behaviour at zero; see, for example, Lord et al. [2010]. In this section, we describe how the RMQ algorithm may be modified in a similar manner.

To model an absorbing boundary, the domain of the approximate marginal distribution of X̃, see (7), must be left-truncated at zero.
Implementing the RMQ algorithm with a left limit of zero results in a quantizer whose probabilities at each time step do not sum to unity. The probability that is not accounted for as a result of the domain truncation is the mass accumulated at the absorbing zero boundary. To compensate for this, the quantizer at each time step can be augmented with an extra codeword, which has a value of zero and a probability equal to one minus the sum of the probabilities associated with all the other codewords at that time step. The transition probability matrix may also be augmented in a consistent manner by realising that once the process attains the zero state it must remain in that state indefinitely, i.e., the conditional probability of moving from the absorbing state to any other state is zero, and, correspondingly, the conditional probability of remaining in the absorbing state is one.

Modifying the algorithm is straightforward and incurs no additional computational burden. Given that the elements of the previous quantizer Γ_k are all positive, the affine form of the update (6) will be negative when

    Z^i_{k+1} < −c^i_k / m^i_k.

This implies that the domain of each Z^i_{k+1} must be left-truncated at −c^i_k / m^i_k to ensure only positive codewords at time step k + 1. This is achieved by setting

    r^{i,−}_{k+1} = −c^i_k / m^i_k,    (15)

for 1 ≤ i ≤ N_k, in the implementation described in Section 3.1. This is equivalent to assuming r^−_{k+1} = 0 in (9). The rest of the algorithm proceeds without modification.

[Figure 9: Illustration of the standard Gaussian density reflected around −0.5, showing the normal density, the boundary, the reflected component and the reflected density.]

Of course, this all depends on the fact that the quantizer at the first time step, Γ_1, also has positive elements. This is achieved using an analogous truncation in the vector quantization algorithm. The initial guess for the Newton iteration must also ensure positivity.
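Before turning to reflection, the absorbing-state bookkeeping described above can be sketched as follows. This is our own illustration with hypothetical names, not the authors' implementation: the zero codeword carries the missing mass, the missing row mass of each transition matrix is assigned to the zero state, and the zero state itself is made absorbing.

```python
import numpy as np

def augment_for_absorption(codewords, probs, trans):
    """Append an absorbing zero state to a truncated quantizer.

    codewords : (N,) positive codewords at some time step
    probs     : (N,) their probabilities (sum < 1 after truncation)
    trans     : (M, N) transition matrix from the previous, already
                augmented quantizer; row deficits are absorbed mass.
    """
    codewords_aug = np.append(codewords, 0.0)        # zero codeword
    probs_aug = np.append(probs, 1.0 - probs.sum())  # mass lost at zero

    # Each row's missing mass transitions to the new zero state ...
    trans_aug = np.column_stack([trans, 1.0 - trans.sum(axis=1)])
    # ... and the zero state is absorbing: it stays at zero w.p. one.
    absorbing_row = np.zeros(trans_aug.shape[1])
    absorbing_row[-1] = 1.0
    trans_aug = np.vstack([trans_aug, absorbing_row])
    return codewords_aug, probs_aug, trans_aug

# Toy example: two codewords, 5% of the mass already absorbed at zero.
gam, p = np.array([80.0, 120.0]), np.array([0.55, 0.40])
P = np.array([[0.6, 0.3], [0.2, 0.7]])
gam2, p2, P2 = augment_for_absorption(gam, p, P)
assert abs(p2.sum() - 1.0) < 1e-12
assert np.allclose(P2.sum(axis=1), 1.0)
```

After augmentation, both the marginal probabilities and every row of the transition matrix again sum to one, so downstream pricing code needs no special cases.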
Figure 9 shows a density function f(x), in this case a standard Gaussian density. The red line represents a reflecting boundary at x̄ = −0.5. The values of f to the left of the boundary are reflected and depicted by the dashed yellow line in the figure. These values are given by f(2x̄ − x) for x > x̄.

Thus, restricting the domain to [x̄, ∞), the reflected density, denoted f̄(x), is given as the sum

    f̄(x) = f(x) + f(2x̄ − x).

Direct integration of this expression over the integration limits from x̄ to x ∈ [x̄, ∞) gives the reflected distribution function

    F̄(x) = F(x) − F(2x̄ − x)

and the first lower partial expectation function

    M̄(x) = M(x) + M(2x̄ − x) − 2x̄ F(2x̄ − x) − 2M(x̄) + 2x̄ F(x̄),    (16)

where F(x) and M(x) are the un-reflected distribution and first lower partial expectation functions associated with f. This may be applied not only to the Gaussian case, but also to the noncentral chi-squared cases required for the higher order updates.

Modifying the RMQ algorithm to allow for a reflecting boundary at zero requires two changes to the implementation described in Section 3.1. Firstly, the lower bound for the integration, i.e., the domain of the Z^i_{k+1} random variable in each affine update, must be left-truncated by replacing the furthest-left region boundary as in (15) above. Secondly, the density, distribution and first lower partial expectation functions associated with each random variable must be replaced by their reflected counterparts

    f̄_{Z^i_{k+1}}(x) = f_{Z^i_{k+1}}(x) + f_{Z^i_{k+1}}(2x̄^i_k − x),
    F̄_{Z^i_{k+1}}(x) = F_{Z^i_{k+1}}(x) − F_{Z^i_{k+1}}(2x̄^i_k − x),
    M̄_{Z^i_{k+1}}(x) = M_{Z^i_{k+1}}(x) + M_{Z^i_{k+1}}(2x̄^i_k − x) − 2x̄^i_k F_{Z^i_{k+1}}(2x̄^i_k − x),    (17)

for x ∈ [x̄^i_k, ∞), where x̄^i_k = −c^i_k / m^i_k. The remainder of the algorithm proceeds as normal. The astute reader will have noticed that there are two terms missing in (17) when compared with (16).
The reason for this omission is that these terms are constants for each i and that the RMQ algorithm only ever requires differences of partial moment terms, as seen in (10). There is, therefore, a cancellation of the constant terms when this difference is taken, and thus we may use a definition that excludes them.

As in the case of the absorbing boundary, the analogous reflection must be applied in the vector quantization algorithm to ensure that Γ_1 is consistent.

It is well known that when 0 < α < 0.5, the CEV process may reach zero and that this state may be either absorbing or reflecting. Lindsay and Brecher [2012] give the corresponding marginal distributions for both these cases (it is easy to adjust their formulations to account for the drift term in the SDE for the CEV process). In Section 4.3 we considered the CEV process with α > 0.5, which only allows absorption at zero. Now consider the case where S_0 = 0. , α = 0.35 and σ_LN = 50%, with the rest of the parameters as before. Figure 10 shows the difference between the exact marginal distribution and the marginal distribution implied by RMQ for the three schemes, as modified to account for an absorbing boundary (left) and a reflecting boundary (right). As before, the scale of the graphs changes from top to bottom, indicating the improvement as a result of the choice of scheme.

Note that, under this choice of parameters for the CEV model, the standard RMQ formulation of Section 3 fails. Without implementing the modifications for either absorption or reflection proposed in this section, some codewords become negative at a certain point in the execution of the RMQ algorithm, leading to discrete-time updates with imaginary values.

In this section, contingent claims are priced using the RMQ algorithm and the three update schemes are compared for accuracy.
The claims priced include European, Bermudan and discretely-monitored barrier options under the dynamics of both geometric Brownian motion (GBM) and its generalization, the constant elasticity of variance (CEV) model.

The GBM model and its parameters are described in Section 3.2, whereas the specification for the CEV model can be found in Section 4.3. These parameters are used for pricing throughout this section. As is implied by these specifications, the continuously compounded interest rate is assumed constant with a value r = 5%. All option maturities are one year and the RMQ algorithm is executed using K = 12, i.e., using monthly steps, with constant cardinality of N_k = 200 for all k.

Once a terminal quantizer has been obtained using the RMQ algorithm, a European option with payoff function H(S, X) at maturity T = t_K, where S represents the asset process and X the strike, may be priced directly by using the expectation defined in (2). The price is given by

    H_0 = e^{−rT} E[H(S_T, X)] ≈ e^{−rT} p_K H(Γ_K, X),    (18)

where H_0 is the value of the claim at initial time t_0 = 0 and H(Γ_K, X) is the function H applied element-wise to Γ_K, which, as specified previously, is a column vector of length N_K.

[Figure 10: The difference between the true and approximate marginal distributions for the CEV process. The left column shows the case of absorption while the right shows reflection. Panels: Euler (×10^-2), Milstein (×10^-3) and Weak Order 2.0 (×10^-4), each with absorption and with reflection.]

Figure 11 shows the accuracy of put option prices for the GBM and CEV models for a wide range of strikes.
The GBM option prices are compared against the Black-Scholes option pricing formula, whereas the CEV prices are compared against the analytical solution originally due to Schroder [1989] and reformulated in terms of the noncentral chi-squared distribution by Hsu et al. [2008]. In the graphs, the x-axis represents fixed-spot inverse moneyness, which is determined as the variable strike value over the initial asset price, S_0.

Even though the Euler scheme is reasonably accurate to start with, the increased accuracy of the Milstein and the simplified weak order 2.0 schemes is evident. For certain strikes the error is reduced by an order of magnitude.

[Figure 11: Accuracy of GBM and CEV European put prices computed using RMQ, as compared to analytical solutions.]

Bermudan option prices are computed using the standard Backward Dynamic Programming Principle (BDPP), an important result from discrete-time optimal stopping theory. Pagès [2014] reviews the use of the BDPP as applied to grids that result from a quantization. Once quantization grids and corresponding transition probability matrices have been computed using the RMQ algorithm, the high-level algorithm for Bermudan option pricing may be specified as follows:

1. Initialize h_K = H(Γ_K, X).
2. For k = K − 1, …, 1: set h_k = max(H(Γ_k, X), e^{−rΔt} P_{k+1} h_{k+1}).
3. Set H_0 = e^{−rΔt} p_1 h_1.

Here the max function is applied element-wise, with its second argument being the continuation value, which is easily computed as a conditional expectation due to the availability of the transition probability matrix at each time step. The initial value of the Bermudan claim is given by H_0.

In Figure 12 the accuracy of a Bermudan put option with monthly exercise opportunities is shown for the GBM and CEV models.
The reference price is computed using a high-resolution Crank-Nicolson finite difference scheme with 600 time steps and 800 stock increments, equally spaced between zero and 4 × S_0.

All three RMQ algorithms result in low absolute errors, with the simplified weak order 2.0 scheme again producing errors that are an order of magnitude smaller.

[Figure 12: Accuracy of GBM and CEV Bermudan put option prices, as compared to a high-resolution Crank-Nicolson finite difference scheme.]

The pricing of barrier options has previously been explored in the context of quantization by Sagna [2011]. This work showed that the barrier-crossing approach described in Section 6.4 of Glasserman [2003] may be applied to marginal quantization using a so-called transition kernel formulation. Using our notation, we now combine this approach with our proposed more accurate schemes and apply it to discretely monitored barrier options.

Consider expression (18) for pricing European options, which may be re-written as

    H_0 ≈ e^{−rT} p_1 (∏_{k=1}^{K−1} P_{k+1}) H(Γ_K, X).

To price a knock-out barrier option, the transition probability matrix at each time step in this expression must be modified to take into account the possibility that the underlying process breaches the barrier. Thus, we rescale the transition probabilities by multiplying them by the probability of not having crossed the barrier.

Let g(x, y) be the probability of transitioning between states x and y without crossing the barrier. If we form an N_k by N_{k+1} matrix of values

    [G_{k+1}]_{i,j} = g(γ^i_k, γ^j_{k+1}),

then P_{k+1} ∘ G_{k+1} defines the transition kernel. The barrier option may then be priced using

    H_0 ≈ e^{−rT} (p_1 ∘ g_1) (∏_{k=1}^{K−1} (P_{k+1} ∘ G_{k+1}))
H(Γ_K, X), where g_1 = [g(S_0, γ^1_1), …, g(S_0, γ^{N_1}_1)] is a row vector.

In the case of discretely-monitored up-and-out barrier options with barrier level L, the function g is given simply as the indicator function

    g(x, y) = I{max(x, y) < L}.

[Figure 13: Accuracy of GBM and CEV discretely monitored up-and-out put option prices, as compared to a Monte Carlo simulation.]

References

G. Bormetti, G. Callegaro, G. Livieri, and A. Pallavicini. A backward Monte Carlo approach to exotic option pricing. http://ssrn.com/abstract=2686115, 2016.
G. Callegaro, L. Fiorin, and M. Grasselli. Pricing and calibration in local volatility models via fast quantization. Available at SSRN 2495829, 2014.
G. Callegaro, L. Fiorin, and M. Grasselli. Quantized calibration in local volatility. Risk Magazine, 28:62–67, 2015a.
G. Callegaro, L. Fiorin, and M. Grasselli. Pricing via quantization in stochastic volatility models. Available at SSRN 2669734, 2015b.
Q. Du, V. Faber, and M. Gunzburger. Centroidal Voronoi tessellations: Applications and algorithms. SIAM Review, 41(4):637–676, 1999.
P. Glasserman. Monte Carlo Methods in Financial Engineering. Springer, 2003.
S. Graf and H. Luschgy. Foundations of Quantization for Probability Distributions. Springer, 2000.
Y.-L. Hsu, T. Lin, and C. Lee. Constant elasticity of variance (CEV) option pricing model: Integration and detailed derivation. Mathematics and Computers in Simulation, 79(1):60–71, 2008.
P. Kloeden and E. Platen. Numerical Solution of Stochastic Differential Equations. Springer, 1999.
A. Lindsay and D. Brecher. Simulation of the CEV process and the local martingale property.
Mathematics and Computers in Simulation, 82(5):868–878, 2012.
S. P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.
R. Lord, R. Koekkoek, and D. van Dijk. A comparison of biased simulation schemes for stochastic volatility models. Quantitative Finance, 10(2):177–194, 2010.
G. Maruyama. Continuous Markov processes and stochastic equations. Rendiconti del Circolo Matematico di Palermo, 4(1):48–90, 1955.
G. Milstein. Approximate integration of stochastic differential equations. Theory of Probability and Its Applications, 19(3):557–562, 1975.
G. Pagès. Introduction to optimal vector quantization and its applications for numerics. Technical report, July 2014. URL https://hal.archives-ouvertes.fr/hal-01034196.
G. Pagès and H. Pham. Optimal quantization methods for nonlinear filtering with discrete-time observations. Bernoulli, 11(5):893–932, 2005.
G. Pagès and A. Sagna. Recursive marginal quantization of the Euler scheme of a diffusion process. Applied Mathematical Finance, 22(5):463–498, 2015.
G. Pagès and B. Wilbertz. Optimal Delaunay and Voronoi quantization methods for pricing American options. In R. Carmona, P. Hu, P. Del Moral, and N. Oudjane, editors, Numerical Methods in Finance, pages 171–217. Springer, 2009.
G. Pagès, H. Pham, and J. Printems. An optimal Markovian quantization algorithm for multidimensional stochastic control problems. Stochastics and Dynamics, 4(4):501–545, 2004.
G. Pagès, H. Pham, and J. Printems. Optimal quantization methods and applications to numerical problems in finance. In Handbook of Computational and Numerical Methods in Finance, pages 253–297. Springer, 2004.
A. Sagna. Pricing of barrier options by marginal functional quantization. Monte Carlo Methods and Applications, 17(4):371–398, 2011.
M. Schroder. Computing the constant elasticity of variance option pricing formula. Journal of Finance, 44(1):211–219, 1989.