Robust Product Markovian Quantization
Ralph Rudd ∗ Thomas A. McWalter ∗ †
Jörg Kienitz ∗ ‡
Eckhard Platen ∗ §
June 24, 2020
Abstract
Recursive marginal quantization (RMQ) allows the construction of optimal discrete grids for approximating solutions to stochastic differential equations in $d$ dimensions. Product Markovian quantization (PMQ) reduces this problem to $d$ one-dimensional quantization problems by recursively constructing product quantizers, as opposed to a truly optimal quantizer. However, the standard Newton-Raphson method used in the PMQ algorithm suffers from numerical instabilities, inhibiting widespread adoption, especially for use in calibration. By directly specifying the random variable to be quantized at each time step, we show that PMQ, and RMQ in one dimension, can be expressed as standard vector quantization. This reformulation allows the application of the accelerated Lloyd's algorithm in an adaptive and robust procedure. Furthermore, in the case of stochastic volatility models, we extend the PMQ algorithm by using higher-order updates for the volatility or variance process. We illustrate the technique for European options, using the Heston model, and more exotic products, using the SABR model.

Keywords: vector quantization, option pricing, stochastic volatility, calibration.
JEL:
C63, G12, G13
1 Introduction

Quantization is a compression technique used to approximate a given signal using less information than the original, by minimizing a measure of error called the distortion. In mathematical finance, it is used to approximate probability distributions and has been applied to the pricing of options with path dependence and early exercise (Pagès and Wilbertz, 2009; Sagna, 2011; Bormetti et al., 2018), stochastic control problems (Pagès et al., 2004), and non-linear filtering (Pagès and Pham, 2005).

Pagès and Sagna (2015) introduced a technique known as recursive marginal quantization (RMQ), which approximates the marginal distribution of a system of stochastic differential equations by recursively quantizing the Euler updates of the processes. In one dimension, the RMQ algorithm has been extended to higher-order schemes (McWalter et al., 2018), and has been used to calibrate a local volatility model (Callegaro et al., 2015).

∗ The African Institute of Financial Markets and Risk Management, University of Cape Town
† Department of Statistics, University of Johannesburg
‡ Fachbereich Mathematik und Naturwissenschaften, Bergische Universität Wuppertal
§ Finance Discipline Group and School of Mathematical and Physical Sciences, University of Technology Sydney

Applying RMQ to multidimensional SDEs requires the use of stochastic numerical methods, such as the randomized Lloyd's method or stochastic gradient descent methods, e.g., Competitive Learning Vector Quantization (see Pagès (2015) for an overview of these methods). The computational cost of these techniques can be prohibitive.

Callegaro et al. (2017) overcame the need for stochastic methods by using conditioning to derive a modified RMQ algorithm in the context of stochastic volatility models. Their approach was to perform a standard one-dimensional RMQ on the volatility process, and then to condition on the realizations of the resulting quantizer when quantizing the asset process.
The modified RMQ algorithm used for the asset process retained a Newton-Raphson iteration. In doing so, they relied on the approach proposed in a preprint of Fiorin et al. (2019). Later, in the two-dimensional case, Rudd et al. (2017) formulated a product quantization algorithm without the need for this conditioning, thereby increasing computational efficiency. Independently and at around the same time, an updated preprint of Fiorin et al. (2019) also removed the conditioning. This approach has been called product Markovian quantization (PMQ).

The contribution of the present work is two-fold. Firstly, by directly specifying the random variable to be quantized, we show how both the RMQ and PMQ algorithms can be formulated as standard vector quantization. This allows us to extend the work of Bormetti et al. (2018) and apply the accelerated Lloyd's algorithm to PMQ, should the more efficient Newton-Raphson method become unstable. Secondly, we show how to extend the higher-order quantization technique of McWalter et al. (2018) so that it can be applied to stochastic volatility models.

We now provide an overview of the paper. In section 2, the underlying mathematics of vector quantization is reviewed along with the two numerical methods central to the paper: Lloyd's algorithm and the Newton-Raphson method. For the one-dimensional case, these algorithms are specified in terms of the density function, distribution function and first lower partial expectation of the random variable being quantized. In section 3, we review the RMQ algorithm and show how it is amenable to the standard techniques of vector quantization. Section 4 follows along similar lines with regard to the PMQ algorithm, and section 5 shows how higher-order discretization schemes can be incorporated when the algorithm is applied to stochastic volatility models. Numerical results for the popular Heston and SABR models are presented in section 7, including exotic option pricing and a proof-of-concept calibration.
Section 8 concludes the paper.
2 Vector quantization

Let $X$ be a continuous random vector, taking values in $\mathbb{R}^d$ and defined on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We seek an approximation of this random vector, denoted $\widehat{X}$, taking values in a set of finite cardinality, $\Gamma$, with the minimum average squared Euclidean difference from the original. Constructing this approximation is known as quantization, with $\widehat{X}$ called the quantized version of $X$ and the set $\Gamma = \{x^1, \ldots, x^N\}$ known as the quantizer, with cardinality $N$. The elements of $\Gamma$ are called codewords or elementary quantizers. The probabilities associated with each codeword are denoted $\mathbb{P}(\widehat{X} = x^i)$, for $1 \le i \le N$, and are also known as weights. These weights allow the approximation of expectations of functions of $X$, e.g.,

$$\mathbb{E}[H(X)] = \int_{\mathbb{R}^d} H(x)\, d\mathbb{P}(X \le x) \approx \sum_{i=1}^{N} H(x^i)\, \mathbb{P}\big(\widehat{X} = x^i\big).$$

¹ The preprint characterized the Newton-Raphson iteration in terms of expectations of a conditioning variable, $\zeta$; see Proposition 3.5 in https://arxiv.org/pdf/1511.01758v2.
² See Remark 3.4 in https://arxiv.org/pdf/1511.01758v3 for the reformulated Newton-Raphson iteration.

We now briefly describe the mathematics of quantization. Consider the nearest-neighbor projection operator, $\pi_\Gamma : \mathbb{R}^d \mapsto \Gamma$, given by

$$\pi_\Gamma(X) := \big\{x^i \in \Gamma : |X - x^i| \le |X - x^j| \text{ for } j = 1, \ldots, N,\ j \ne i\big\}.$$

The quantized version of $X$ is defined in terms of this projection operator as $\widehat{X} := \pi_\Gamma(X)$. The region $R_i(\Gamma)$, for $1 \le i \le N$, is defined as

$$R_i(\Gamma) := \big\{x \in \mathbb{R}^d : \pi_\Gamma(x) = x^i\big\},$$

and is the subset of $\mathbb{R}^d$ mapped to codeword $x^i$ through the projection operator.
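The projection and the region weights can be illustrated by simulation. The following is a sketch (not the paper's code; the toy grid and sample size are arbitrary) that quantizes a standard normal with a fixed grid and uses the resulting weights to approximate an expectation:

```python
import numpy as np

def project(x, grid):
    """Nearest-neighbour projection pi_Gamma: index of the closest codeword,
    i.e. of the region R_i containing each sample."""
    d2 = ((x[:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def quantized_expectation(h, grid, weights):
    """Approximate E[H(X)] by the weighted sum over the codewords."""
    return sum(w * h(xi) for xi, w in zip(grid, weights))

rng = np.random.default_rng(0)
samples = rng.normal(size=(100_000, 1))          # X ~ N(0, 1), so d = 1
grid = np.array([[-1.5], [-0.5], [0.5], [1.5]])  # a toy (non-optimal) quantizer
idx = project(samples, grid)
weights = np.bincount(idx, minlength=len(grid)) / len(samples)
est = quantized_expectation(lambda x: float(x[0]) ** 2, grid, weights)
# est approximates E[X^2] = 1; the coarse four-point grid biases it low
```

In a genuine application the grid would be optimized first; this fragment only shows how the weights and the discrete expectation are assembled from the projection.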
This allows the probabilities associated with each codeword to be determined as $\mathbb{P}(\widehat{X} = x^i) = \mathbb{P}(X \in R_i(\Gamma))$.

To obtain the optimal quantizer, we must minimize the expected squared Euclidean error, known as the distortion, given by

$$D(\Gamma) = \mathbb{E}\big[|X - \widehat{X}|^2\big] = \int_{\mathbb{R}^d} |x - \pi_\Gamma(x)|^2\, d\mathbb{P}(X \le x) = \sum_{i=1}^{N} \int_{R_i(\Gamma)} |x - x^i|^2\, d\mathbb{P}(X \le x).$$

Here the symbol $x$ refers to the continuous domain of the distribution of the random vector $X$, whereas $x^i$ refers to the discrete codewords of the resulting quantizer, $\Gamma$, for $1 \le i \le N$.

A common fixed-point algorithm for obtaining an optimal quantization grid is known as
Lloyd's algorithm (Lloyd, 1982), and is based on recursively enforcing the self-consistency property of optimal quantizers. (The self-consistency property is often known as stationarity in the literature; this term is avoided to prevent potential confusion with stationary stochastic processes.) By setting the gradient of the distortion to zero, it can be shown that any quantizer that minimizes the distortion function must be self-consistent, i.e., $\widehat{X} = \mathbb{E}\big[X \,\big|\, \widehat{X}\big]$, or equivalently

$$x^i = \frac{\mathbb{E}\big[X\, \mathbb{I}_{\{X \in R_i(\Gamma)\}}\big]}{\mathbb{P}(X \in R_i(\Gamma))},$$

for $i = 1, \ldots, N$. For an optimal quantization grid, the self-consistency condition requires that each codeword is the probability mass centroid of its associated region. Lloyd's algorithm iteratively enforces this condition until a desired tolerance is achieved, or the maximum number of iterations is attained, using

$$x^i_{(l+1)} = \frac{\mathbb{E}\big[X\, \mathbb{I}_{\{X \in R_i(\Gamma_{(l)})\}}\big]}{\mathbb{E}\big[\mathbb{I}_{\{X \in R_i(\Gamma_{(l)})\}}\big]}, \qquad (1)$$

where $0 \le l < l^{\mathrm{LA}}_{\max}$ is the iteration index. In a multidimensional setting, Monte Carlo (or quasi-Monte Carlo) methods are used to compute the required expectations.

The special case when $X$ is a one-dimensional random variable with a well-defined density function is relevant for many applications, including the recursive marginal quantization and product Markovian quantization algorithms presented later. In one dimension, the regions associated with a quantizer may be defined directly as $R_i = \{x \in \mathbb{R} : x^{i-} < x \le x^{i+}\}$ with

$$x^{i-} := \frac{x^{i-1} + x^i}{2} \qquad \text{and} \qquad x^{i+} := \frac{x^i + x^{i+1}}{2}, \qquad (2)$$

for $1 \le i \le N$, where, by definition, $x^{1-} := -\infty$ and $x^{N+} := \infty$. If the distribution under consideration is not defined over the whole real line, then $x^{1-}$ and $x^{N+}$ are adjusted to reflect the support.

Suppose $f_X$ and $F_X$ are the density and distribution functions of $X$, respectively.
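In one dimension the centroid in (1) reduces to a ratio of increments of $F_X$ and of the first lower partial expectation $M_X(x) = \mathbb{E}[X\,\mathbb{I}_{\{X \le x\}}]$ over the region edges (2). A sketch of this form of Lloyd's algorithm for a standard normal target (an illustration only, not the paper's MATLAB implementation):

```python
import numpy as np
from scipy.stats import norm

def lloyd_1d(grid, F, M, n_iter=300):
    """One-dimensional Lloyd's algorithm: each codeword moves to the centroid
    of its region, expressed via the distribution function F and the first
    lower partial expectation M(x) = E[X 1_{X <= x}]."""
    x = np.sort(np.asarray(grid, dtype=float))
    for _ in range(n_iter):
        # region edges x^{i-}, x^{i+}: midpoints, with +/- infinity at the ends
        e = np.concatenate(([-np.inf], (x[:-1] + x[1:]) / 2, [np.inf]))
        x = (M(e[1:]) - M(e[:-1])) / (F(e[1:]) - F(e[:-1]))
    return x

# standard normal target: F = Phi and M(x) = -phi(x)
grid = lloyd_1d([-2.0, -1.0, 1.0, 2.0], norm.cdf, lambda x: -norm.pdf(x))
# grid approaches the optimal N = 4 Gaussian quantizer, roughly +/-0.45 and +/-1.51
```

The same routine applies to any distribution for which $F_X$ and $M_X$ are available in closed form, which is exactly the situation exploited in sections 3 and 4.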
Define the $p$-th lower partial expectation as

$$M^p_X(x) := \mathbb{E}\big[X^p\, \mathbb{I}_{\{X \le x\}}\big] = \int_{-\infty}^{x} t^p f_X(t)\,dt,$$

and write $M_X := M^1_X$. In one dimension, the regions (2) allow the Lloyd iteration (1) to be written directly in terms of these functions as

$$x^i_{(l+1)} = \frac{M_X\big(x^{i+}_{(l)}\big) - M_X\big(x^{i-}_{(l)}\big)}{F_X\big(x^{i+}_{(l)}\big) - F_X\big(x^{i-}_{(l)}\big)}, \qquad (4)$$

for $1 \le i \le N$. Alternatively, the gradient of the distortion has components

$$[\nabla D(\Gamma)]_i = 2\Big[x^i\big(F_X(x^{i+}) - F_X(x^{i-})\big) - \big(M_X(x^{i+}) - M_X(x^{i-})\big)\Big],$$

and the optimal quantizer may be sought as a zero of this gradient using the Newton-Raphson iteration

$$\Gamma \leftarrow \Gamma - \big[\nabla^2 D(\Gamma)\big]^{-1}\, \nabla D(\Gamma), \qquad (5)$$

where the Hessian $\nabla^2 D(\Gamma)$ is tridiagonal in one dimension.

3 Recursive marginal quantization

Consider the $d$-dimensional system of SDEs

$$dX_t = a(X_t)\,dt + B(X_t)\,dW_t, \qquad X_0 = x_0 \in \mathbb{R}^d, \qquad (8)$$

where $W$ is a $q$-dimensional Brownian motion. The Euler-Maruyama update for this system is

$$X_{k+1} = X_k + a(X_k)\Delta t + B(X_k)\sqrt{\Delta t}\, z_{k+1} =: U(X_k, z_{k+1}),$$

for $k = 0, \ldots, K-1$, where $\Delta t = T/K$ and $z_{k+1} \sim \mathcal{N}(0, I_q)$, with initial value $X_0 = x_0$. Since the distribution of $X_1$ is known explicitly, standard vector quantization can be used to obtain the quantization grid at the first time step, $\Gamma_1$, and its associated probabilities. The distortion for successive time steps is then given by

$$D(\Gamma_{k+1}) = \mathbb{E}\big[|X_{k+1} - \pi_{\Gamma_{k+1}}(X_{k+1})|^2\big],$$

for $k = 1, \ldots, K-1$. However, the exact distribution of $X_{k+1}$ is unknown for $k > 0$, so a further approximation is made: $X_{k+1}$ is replaced by

$$\widetilde{X}_{k+1} := U(\widehat{X}_k, z_{k+1}),$$

a random vector that results from applying the Euler update function to the previously quantized $\widehat{X}_k$. This gives rise to Algorithm 1.

Algorithm 1 Recursive Marginal Quantization
1: Set $\widehat{X}_0 := x_0$.
2: for $k = 0$ to $K - 1$ do
3:   Define $\widetilde{X}_{k+1} := U(\widehat{X}_k, z_{k+1})$.
4:   Obtain $\Gamma_{k+1}$ by minimizing $\mathbb{E}\big[|\widetilde{X}_{k+1} - \pi_{\Gamma_{k+1}}(\widetilde{X}_{k+1})|^2\big]$.
5:   Set $\widehat{X}_{k+1} = \pi_{\Gamma_{k+1}}(\widetilde{X}_{k+1})$.
6: end for

Note that Step 4 computes the optimal quantization grid and Step 5 computes the associated weights, by insisting that $\mathbb{P}\big(\widehat{X}_{k+1} = x^i_{k+1}\big) = \mathbb{P}\big(\widetilde{X}_{k+1} \in R_i(\Gamma_{k+1})\big)$ for $i = 1, \ldots, N_{k+1}$. This procedure is known as recursive marginal quantization and is due to Pagès and Sagna (2015). It is the repeated vector quantization of the random vector $\widetilde{X}_{k+1}$, for $k = 0, \ldots$
, K − 1. The distribution function of $\widetilde{X}_{k+1}$ is given by

$$F_{\widetilde{X}_{k+1}}(x) = \sum_{i=1}^{N_k} \Phi_d\big(x;\ x^i_k + a(x^i_k)\Delta t,\ B(x^i_k)B(x^i_k)^\top \Delta t\big)\, \mathbb{P}\big(\widehat{X}_k = x^i_k\big), \qquad (10)$$

where $\Phi_d(\cdot\,; \mu, \Sigma)$ is the $d$-dimensional Gaussian distribution function with mean $\mu$ and covariance $\Sigma$. In the special case where $X$ is a scalar-valued diffusion, this reduces to

$$F_{\widetilde{X}_{k+1}}(x) = \sum_{i=1}^{N_k} \Phi\left(\frac{x - c^i_k}{m^i_k}\right) p^i_k, \qquad (11)$$

$$f_{\widetilde{X}_{k+1}}(x) = \sum_{i=1}^{N_k} \frac{1}{m^i_k}\,\phi\left(\frac{x - c^i_k}{m^i_k}\right) p^i_k \qquad (12)$$

and

$$M_{\widetilde{X}_{k+1}}(x) = \sum_{i=1}^{N_k} \left[-m^i_k\, \phi\left(\frac{x - c^i_k}{m^i_k}\right) + c^i_k\, \Phi\left(\frac{x - c^i_k}{m^i_k}\right)\right] p^i_k, \qquad (13)$$

where $\Phi(\cdot)$ and $\phi(\cdot)$ are the standard normal distribution and density functions, respectively, and

$$c^i_k = x^i_k + a(x^i_k)\Delta t, \qquad m^i_k = b(x^i_k)\sqrt{\Delta t} \qquad \text{and} \qquad p^i_k = \mathbb{P}\big(\widehat{X}_k = x^i_k\big).$$

Here, the density and lower partial expectation of $\widetilde{X}_{k+1}$ are computed by differentiating and integrating $F_{\widetilde{X}_{k+1}}(x)$, respectively. Consequently, one may now use these expressions with the standard (one-dimensional) vector quantization approaches in section 2.1.

Note that when using the Newton-Raphson method, (5), this is equivalent to the original approach of Pagès and Sagna (2015), with no difference in convergence characteristics. However, explicitly computing the distribution of $\widetilde{X}_{k+1}$ has the advantage that any vector quantization optimization technique may now be applied. In particular, Lloyd's algorithm, as given by (4), may be used; this is especially useful in cases where the Newton-Raphson method fails due to numerical instability.

Furthermore, we are not limited to the Euler-Maruyama discretization, but may also use the Milstein or simplified weak-order 2.0 schemes (McWalter et al., 2018).
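Putting (11) to (13) together, one recursive step reduces to running a one-dimensional quantizer search against a Gaussian mixture. The following sketch (hypothetical parameter values; a plain, unaccelerated Lloyd iteration) performs one such step for a geometric-Brownian-motion-type Euler update:

```python
import numpy as np
from scipy.stats import norm

def rmq_step(x_prev, p_prev, a, b, dt, n_iter=100):
    """One step of 1-d recursive marginal quantization with an Euler update.
    The Euler update of the previous quantizer is a Gaussian mixture with
    means c_i = x_i + a(x_i) dt, std devs m_i = b(x_i) sqrt(dt) and weights
    p_i; Lloyd's algorithm is run through the mixture's CDF and first lower
    partial expectation."""
    c = x_prev + a(x_prev) * dt
    m = b(x_prev) * np.sqrt(dt)

    def F(x):  # mixture distribution function
        return (norm.cdf((x[:, None] - c) / m) * p_prev).sum(axis=1)

    def M(x):  # mixture lower partial expectation
        z = (x[:, None] - c) / m
        return ((-m * norm.pdf(z) + c * norm.cdf(z)) * p_prev).sum(axis=1)

    x = np.sort(c)  # warm start at the mixture means
    for _ in range(n_iter):
        e = np.concatenate(([-np.inf], (x[:-1] + x[1:]) / 2, [np.inf]))
        x = (M(e[1:]) - M(e[:-1])) / (F(e[1:]) - F(e[:-1]))
    e = np.concatenate(([-np.inf], (x[:-1] + x[1:]) / 2, [np.inf]))
    return x, F(e[1:]) - F(e[:-1])  # new codewords and their weights

# toy GBM step: a(x) = r x, b(x) = sigma x, from a 3-point previous quantizer
x_prev = np.array([90.0, 100.0, 110.0])
p_prev = np.array([0.25, 0.5, 0.25])
grid, weights = rmq_step(x_prev, p_prev, lambda x: 0.05 * x, lambda x: 0.2 * x, 1.0 / 12.0)
```

Iterating this step over $k = 0, \ldots, K-1$ reproduces the recursion of Algorithm 1 in one dimension; the warm start at the mixture means mirrors the common practice of initializing each grid from the previous time step.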
The necessary expressions for the latter appear in Appendix A.

4 Product Markovian quantization

To ease the exposition, we limit ourselves to the case when $q = d$, i.e., there is one underlying Brownian motion for each dimension. Then (8) can be written as

$$dX_t = a(X_t)\,dt + \mathrm{diag}\big(b(X_t)\big)\,dW_t, \qquad X_0 = x_0 \in \mathbb{R}^d, \qquad (14)$$

with $a(X_t) = [a^1(X_t), \ldots, a^d(X_t)]^\top$, $b(X_t) = [b^1(X_t), \ldots, b^d(X_t)]^\top$ and $W$ a vector of correlated Brownian motions. The correlation matrix for the Brownian motions is given by $LL^\top$, where $B$ and $b$ are related by $B(X_t) = \mathrm{diag}\big(b(X_t)\big)L$.

The marginal Euler update for each dimension is given by

$$X^n_{k+1} = X^n_k + a^n(X_k)\Delta t + b^n(X_k)\sqrt{\Delta t}\, z^n_{k+1} =: U^n(X_k, z^n_{k+1}), \qquad (15)$$

with $k = 0, \ldots, K-1$. Here $\Delta t = T/K$, $z^n_{k+1}$ is a correctly correlated Gaussian random variate, and $X^n_0 = [x_0]^n$. The central idea of product Markovian quantization (PMQ) is to quantize each of the $d$ dimensions separately, using the marginal Euler updates, and construct the required $d$-dimensional quantizer from their Cartesian product. This yields Algorithm 2.

Algorithm 2 Product Markovian Quantization
1: Set $\widehat{X}_0 := x_0$.
2: for $k = 0$ to $K - 1$ do
3:   Define $\widetilde{X}_{k+1} := U(\widehat{X}_k, z_{k+1})$.
4:   for $n = 1$ to $d$ do
5:     Define $\widetilde{X}^n_{k+1} := U^n(\widehat{X}_k, z^n_{k+1})$.
6:     Obtain $\Gamma^n_{k+1}$ by minimizing $\mathbb{E}\big[|\widetilde{X}^n_{k+1} - \pi_{\Gamma^n_{k+1}}(\widetilde{X}^n_{k+1})|^2\big]$.
7:   end for
8:   Set $\Gamma_{k+1} = \Gamma^1_{k+1} \times \cdots \times \Gamma^d_{k+1}$.
9:   Set $\widehat{X}_{k+1} = \pi_{\Gamma_{k+1}}(\widetilde{X}_{k+1})$.
10: end for

Note that Algorithms 1 and 2 quantize the same random variable at each step, but differ in how they compute the underlying quantization grid. The weights associated with the computed grid are determined the same way (Steps 5 and 9, respectively), using (10). Algorithm 2 solves $d$ one-dimensional vector quantization problems at each time step, instead of solving one $d$-dimensional quantization problem. Fiorin et al.
(2019) derive a theoretical error bound for the approach.

As opposed to deriving the gradient and Hessian required for the Newton-Raphson iteration directly (the approach taken by Fiorin et al. (2019)), we derive the distribution of each $\widetilde{X}^n_{k+1}$. We also explicitly show how to compute the joint probabilities associated with the product grid, $\Gamma_{k+1}$, for $k = 0, \ldots, K-1$.

Consider the product grid $\Gamma_k = \Gamma^1_k \times \cdots \times \Gamma^d_k$, with cardinality $N_k = \prod_{n=1}^{d} N^n_k$. Let the quantizer be specified as the enumerated set of unique (product) codewords $\Gamma_k = \{x^1_k, \ldots, x^{N_k}_k\}$, with

$$x^i_k = \big[x^{1,i_1}_k, \ldots, x^{d,i_d}_k\big]^\top,$$

where $\{i_1, \ldots, i_d\}$ is the set of sub-indices associated with index $i$. (The order of enumeration is irrelevant, as long as it is consistently applied.) These indices enumerate the constituents of $x^i_k$ in terms of the underlying one-dimensional quantizers, $x^{j,i_j}_k := x^{i_j}_k \in \Gamma^j_k$. Thus, each scalar constituent of the product codeword has two superscripts: the first is its associated dimension, and the second is the index of the codeword in its corresponding one-dimensional quantizer.

With this notation in place, consider the following shorthand for the marginal Euler update from (15),

$$U^{n,i}_{k+1} := m^{n,i}_k z^n_{k+1} + c^{n,i}_k, \qquad \text{where} \qquad m^{n,i}_k = b^n\big(x^i_k\big)\sqrt{\Delta t} \quad \text{and} \quad c^{n,i}_k = x^{n,i_n}_k + a^n\big(x^i_k\big)\Delta t.$$

Now, for each $n = 1, \ldots, d$,

$$F_{\widetilde{X}^n_{k+1}}(x) = \mathbb{P}\big(\widetilde{X}^n_{k+1} \le x\big) = \sum_{i=1}^{N_k} \mathbb{P}\big(U^n(\widehat{X}_k, z^n_{k+1}) \le x \,\big|\, \widehat{X}_k = x^i_k\big)\, p^i_k = \sum_{i=1}^{N_k} \mathbb{P}\big(U^{n,i}_{k+1} \le x\big)\, p^i_k = \sum_{i=1}^{N_k} \Phi\left(\frac{x - c^{n,i}_k}{m^{n,i}_k}\right) p^i_k, \qquad (16)$$

with $p^i_k := \mathbb{P}\big(\widehat{X}_k = x^i_k\big)$ the joint probability, computed at the previous time step, and $k = 0, \ldots, K-1$.
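To make the indexing above concrete, the next sketch (all names and values hypothetical) assembles the parameters $c^{n,i}_k$ and $m^{n,i}_k$ from a toy two-dimensional product grid and returns the marginal mixture distribution function of (16):

```python
import numpy as np
from scipy.stats import norm

def marginal_cdf_factory(n, grid, probs, a, b, dt):
    """Mixture CDF of the marginal Euler update for dimension n: a weighted
    sum of Gaussian CDFs over the product codewords x^i_k, as in (16).
    grid has one row per product codeword; a[n] and b[n] act on whole rows,
    since the marginal coefficients may depend on every coordinate."""
    c = grid[:, n] + a[n](grid) * dt          # c^{n,i}_k
    m = b[n](grid) * np.sqrt(dt)              # m^{n,i}_k
    return lambda x: float((norm.cdf((x - c) / m) * probs).sum())

# toy 2-d example: two GBM-like coordinates, a 2 x 2 = 4 codeword product grid
r, s1, s2, dt = 0.05, 0.1, 0.3, 1.0 / 12.0
a = [lambda g: r * g[:, 0], lambda g: r * g[:, 1]]      # drifts a^n
b = [lambda g: s1 * g[:, 0], lambda g: s2 * g[:, 1]]    # diffusions b^n
grid = np.array([[90.0, 100.0], [90.0, 110.0], [110.0, 100.0], [110.0, 110.0]])
probs = np.array([0.25, 0.25, 0.25, 0.25])
F1 = marginal_cdf_factory(0, grid, probs, a, b, dt)
# F1 is a genuine distribution function: increasing from 0 to 1
```

Each marginal quantization in Algorithm 2 then proceeds exactly as in the one-dimensional case, with this mixture CDF (and the analogous density and lower partial expectation) in place of (11) to (13).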
Simple differentiation and integration then yields

$$f_{\widetilde{X}^n_{k+1}}(x) = \sum_{i=1}^{N_k} \frac{1}{m^{n,i}_k}\,\phi\left(\frac{x - c^{n,i}_k}{m^{n,i}_k}\right) p^i_k \qquad (17)$$

and

$$M_{\widetilde{X}^n_{k+1}}(x) = \sum_{i=1}^{N_k} \left[-m^{n,i}_k\, \phi\left(\frac{x - c^{n,i}_k}{m^{n,i}_k}\right) + c^{n,i}_k\, \Phi\left(\frac{x - c^{n,i}_k}{m^{n,i}_k}\right)\right] p^i_k. \qquad (18)$$

Comparing the above three equations with (11) to (13), the only difference is the extra superscript index, $n$, indicating the dimension being quantized. As before, each marginal quantization in the PMQ algorithm relies only on weighted summations of the standard Gaussian density and distribution functions. Once again, using (16) to (18), the algorithm can utilize either the one-dimensional Newton-Raphson method or the one-dimensional Lloyd's algorithm from section 2.1.

The probabilities associated with the final product grid, $\Gamma_{k+1}$, are computed using $F_{\widetilde{X}_{k+1}}$, the distribution of $\widetilde{X}_{k+1}$, which in turn is the joint distribution of the marginal Euler updates, $\widetilde{X}^1_{k+1}, \ldots, \widetilde{X}^d_{k+1}$. The regions associated with the product quantizer can only be $d$-dimensional rectangles, as they result from the Cartesian product of one-dimensional grids. Thus, the joint probability of each region can always be computed as sums and differences of multivariate normal distribution functions. For example, in the two-dimensional case, we have

$$p^i_{k+1} = \mathbb{P}\big(\widehat{X}_{k+1} = x^i_{k+1}\big) = F_{\widetilde{X}_{k+1}}\big(\big[x^{1,i_1+}_{k+1}, x^{2,i_2+}_{k+1}\big]^\top\big) - F_{\widetilde{X}_{k+1}}\big(\big[x^{1,i_1+}_{k+1}, x^{2,i_2-}_{k+1}\big]^\top\big) - F_{\widetilde{X}_{k+1}}\big(\big[x^{1,i_1-}_{k+1}, x^{2,i_2+}_{k+1}\big]^\top\big) + F_{\widetilde{X}_{k+1}}\big(\big[x^{1,i_1-}_{k+1}, x^{2,i_2-}_{k+1}\big]^\top\big),$$

which is expressed in terms of (10), using the notation given in (2).

Example: Two correlated assets.
As a first example, we consider two negatively correlated assets, each driven by geometric Brownian motion. The SDEs for the assets may be specified in the notation of (14) as

$$a(X_t) = \big[rX^1_t, rX^2_t\big]^\top \quad \text{and} \quad b(X_t) = \big[\sigma^1 X^1_t, \sigma^2 X^2_t\big]^\top,$$

with $d\langle W^1, W^2\rangle_t = \rho\,dt$. The parameters chosen were $x_0 = [110, \ldots]^\top$, $\sigma^1 = 10\%$, $\sigma^2 = 30\%$, $\rho = -0.\ldots$ and $r = 5\%$ for the risk-free rate. A total of 200 codewords were used for the RMQ algorithm. For the PMQ algorithm, $N^1_k = 10$ codewords were used for the marginal $X^1$ process, denoted Asset 1, and $N^2_k = 20$ codewords were used for the marginal $X^2$ process, denoted Asset 2. Both algorithms used monthly time steps.

Figure 1 illustrates the joint distribution of the assets one year into the future using the RMQ and PMQ algorithms. It highlights the fundamental differences between the two approaches. The left panel shows the codewords that result from the RMQ algorithm and their corresponding regions. The underlying heat-map represents the actual bivariate lognormal density of the two assets. The regions are polygons, with the codewords clustered in the areas of high probability. In the right panel, the heat-map represents the probability associated with each rectangular region of the product grid that results from the Cartesian product of the two marginal quantizers. Although the shape of the underlying probability density is still well approximated, the PMQ algorithm produces a higher concentration of codewords in regions of low probability, e.g., compare the upper right corner of the panels.

It is worth noting that, because of the need for stochastic methods, the RMQ algorithm is significantly slower than the PMQ algorithm, requiring approximately 200 times longer to compute.
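The rectangular regions of the product grid make the weight computation elementary. For a single Gaussian component the rectangle probabilities follow by inclusion-exclusion of bivariate distribution function values; in PMQ the joint distribution is a $p^i_k$-weighted mixture of such components, so the full weights are the corresponding weighted sums. A sketch for one component (the edges and correlation are illustrative, with large finite values standing in for $\pm\infty$):

```python
import numpy as np
from scipy.stats import multivariate_normal

def product_grid_probs(edges_x, edges_y, mean, cov):
    """Weights of a 2-d product quantizer under one bivariate Gaussian: each
    rectangular region [x-, x+] x [y-, y+] receives
    F(x+, y+) - F(x+, y-) - F(x-, y+) + F(x-, y-)."""
    rv = multivariate_normal(mean, cov)
    F = np.array([[rv.cdf(np.array([ex, ey])) for ey in edges_y]
                  for ex in edges_x])
    return F[1:, 1:] - F[1:, :-1] - F[:-1, 1:] + F[:-1, :-1]

# region edges of a 3 x 3 product grid (finite stand-ins for +/- infinity)
ex = np.array([-8.0, -0.5, 0.5, 8.0])
ey = np.array([-8.0, -0.5, 0.5, 8.0])
p = product_grid_probs(ex, ey, mean=[0.0, 0.0], cov=[[1.0, -0.5], [-0.5, 1.0]])
# p[i, j] is the joint weight of codeword (i, j); the weights sum to ~1
```

Summing such arrays over the mixture components, weighted by $p^i_k$, gives the joint weights used in Step 9 of Algorithm 2.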
[Figure 1: Comparison of recursive marginal quantization and product Markovian quantization for two correlated assets.]

5 Higher-order schemes for stochastic volatility models

In the specific case of stochastic volatility models, the dependence between the asset price process and the volatility or variance process is usually less general than that allowed by (8). Consider a two-dimensional system of SDEs given by

$$a(X_t) = \big[a^1(X_t), a^2(X^2_t)\big]^\top \quad \text{and} \quad b(X_t) = \big[b^1(X_t), b^2(X^2_t)\big]^\top,$$

with $d\langle W^1, W^2\rangle_t = \rho\,dt$. Note that the asset price process, $X^1$, does not appear in the drift or diffusion coefficient of the volatility or variance process, $X^2$. It should be clear that in this case, the marginal quantization of $X^2$ can be completed for all $k = 0, \ldots, K$ without reference to the $X^1$ process. This allows higher-order updates to be used for $X^2$ in the same way as for the one-dimensional RMQ algorithm (McWalter et al., 2018). The PMQ algorithm remains unchanged, except that the joint probabilities must be computed using a new joint distribution. We illustrate the case when the simplified weak-order 2.0 scheme is used.

To derive the required joint distribution, we adopt the shorthand notation of Appendix A for the $X^2$ process, modified slightly to incorporate another index in the superscript to indicate the second dimension. Let $\Phi_2(x, y; \rho)$ be the bivariate Gaussian cumulative distribution function evaluated at $[x, y]^\top$ with correlation $\rho$.
The joint distribution becomes

$$F_{\widetilde{X}_{k+1}}\big([x, y]^\top\big) = \sum_{i=1}^{N_k} \mathbb{P}\left(z^1_{k+1} \le \frac{x - c^{1,i}_k}{m^{1,i}_k},\ \Big(z^2_{k+1} + \sqrt{\bar\lambda^{2,i}_k}\Big)^{2} \le \frac{y - \bar c^{2,i}_k}{\bar m^{2,i}_k} \,\middle|\, \widehat{X}_k = x^i_k\right) p^i_k$$

$$= \sum_{i=1}^{N_k} \left[\mathbb{P}\left(z^1_{k+1} \le \frac{x - c^{1,i}_k}{m^{1,i}_k},\ z^2_{k+1} \le -\sqrt{\bar\lambda^{2,i}_k} + \sqrt{\frac{y - \bar c^{2,i}_k}{\bar m^{2,i}_k}} \,\middle|\, \widehat{X}_k = x^i_k\right) - \mathbb{P}\left(z^1_{k+1} \le \frac{x - c^{1,i}_k}{m^{1,i}_k},\ z^2_{k+1} \le -\sqrt{\bar\lambda^{2,i}_k} - \sqrt{\frac{y - \bar c^{2,i}_k}{\bar m^{2,i}_k}} \,\middle|\, \widehat{X}_k = x^i_k\right)\right] p^i_k$$

$$= \sum_{i=1}^{N_k} \left[\Phi_2\left(\frac{x - c^{1,i}_k}{m^{1,i}_k},\ -\sqrt{\bar\lambda^{2,i}_k} + \sqrt{\frac{y - \bar c^{2,i}_k}{\bar m^{2,i}_k}};\ \rho\right) - \Phi_2\left(\frac{x - c^{1,i}_k}{m^{1,i}_k},\ -\sqrt{\bar\lambda^{2,i}_k} - \sqrt{\frac{y - \bar c^{2,i}_k}{\bar m^{2,i}_k}};\ \rho\right)\right] p^i_k,$$

which has the net effect of adding an additional evaluation of the bivariate normal distribution function to each term in the summation. Although currently no theoretical proof of convergence exists, the potential effectiveness of this technique is illustrated in section 7.

6 A robust algorithm

In the literature, the Newton-Raphson method is used extensively for the RMQ algorithm in one dimension (Pagès and Sagna, 2015; Callegaro et al., 2015, 2017). Fiorin et al. (2019) state that fast quantization is only available in one dimension because of deterministic procedures like the Newton-Raphson method, and this motivates their derivation of the product Markovian quantization technique.

However, the Newton-Raphson method has three flaws when used to solve the one-dimensional vector quantization problems that arise in RMQ and PMQ. Firstly, should negative codewords be generated when the process dynamics excludes crossing the zero boundary, the method will fail.
This can occur because we are approximating the true distribution of the process using Euler updates, which may admit negative values. McWalter et al. (2018) solved this problem by showing how to correctly model the zero boundary to ensure positive codewords. Secondly, the Newton-Raphson method is sensitive to the initial guess used, and may fail to converge. Thirdly, the Hessian matrix may become ill-conditioned, which may result in significant numerical error in the solution of the linear system. Bormetti et al. (2018) briefly explore both these conditions and derive an RMQ-specific Lloyd's algorithm to address them.

By directly specifying the random variable to be quantized, we have shown how one-dimensional RMQ and $d$-dimensional PMQ can both be solved using either the standard Newton-Raphson method or the one-dimensional Lloyd's algorithm, without modification. In Algorithm 3, we recommend a hybrid approach, where the Newton-Raphson method is applied until either the iteration limit, $l^{\mathrm{NR}}_{\max}$, is attained or the Hessian matrix becomes numerically unstable, at which point we switch to Lloyd's algorithm to complete the quantization, using a maximum number of iterations $l^{\mathrm{LA}}_{\max}$.

For line 8 in Algorithm 3, we define the function $g(\cdot)$ to apply (4) to each element of the $\Gamma$ vector, such that

$$[g(\Gamma)]_i = \frac{M_X\big(x^{i+}\big) - M_X\big(x^{i-}\big)}{F_X\big(x^{i+}\big) - F_X\big(x^{i-}\big)}.$$

In this way, Lloyd's algorithm is expressed as a general fixed-point algorithm and can immediately benefit from Anderson acceleration (Walker and Ni, 2011). This approach allows us to leverage the speed of the Newton-Raphson method, while falling back to the robust accelerated Lloyd's algorithm when required. This is essential for applications like calibration; see section 7.3.

The basic Anderson acceleration algorithm is described in Bormetti et al. (2018) and a complete MATLAB implementation is provided in Walker (2011).
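The control flow of the hybrid approach can be sketched as follows (a Python illustration; the paper's implementation is in MATLAB). The `newton_step` below is a stand-in that simply fails, since a faithful Newton-Raphson update would require the gradient and Hessian of the distortion; the point is the exception-driven switch to the Lloyd fixed point, shown here for a standard normal target and without Anderson acceleration:

```python
import numpy as np
from scipy.stats import norm

def lloyd_map(x):
    """Lloyd fixed-point map g for a standard normal: F = Phi, M(x) = -phi(x)."""
    x = np.sort(x)
    e = np.concatenate(([-np.inf], (x[:-1] + x[1:]) / 2, [np.inf]))
    return (norm.pdf(e[:-1]) - norm.pdf(e[1:])) / (norm.cdf(e[1:]) - norm.cdf(e[:-1]))

def newton_step(x):
    # stand-in for the Newton-Raphson update (5); here the Hessian solve is
    # assumed to be numerically singular, which exercises the fallback path
    raise np.linalg.LinAlgError("ill-conditioned Hessian")

def hybrid_quantize(grid0, l_nr_max=20, l_la_max=300):
    """Hybrid scheme in the spirit of Algorithm 3: run Newton-Raphson while it
    is stable, otherwise fall back to the Lloyd fixed point g."""
    grid = np.array(grid0, dtype=float)
    failed = False
    for _ in range(l_nr_max):
        try:
            grid = newton_step(grid)
        except np.linalg.LinAlgError:
            failed = True
            break                      # switch to Lloyd's algorithm
    if failed:
        for _ in range(l_la_max):
            grid = lloyd_map(grid)
    return grid

grid = hybrid_quantize([-2.0, -1.0, 1.0, 2.0])
# converges toward the optimal N = 4 Gaussian quantizer, roughly +/-0.45, +/-1.51
```

In a production version, `lloyd_map` would be wrapped in an Anderson-accelerated fixed-point solver, and `newton_step` would perform the actual update while monitoring the condition number of the Hessian.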
Furthermore, in MATLAB, monitoring the Hessian matrix can be done at no further computational cost. In our implementation we rely on the LU decomposition to solve the linear system in (5). By default, MATLAB will issue a warning when the matrix to be decomposed is close to singular. By escalating this warning to an error, we can use exception handling to switch between the Newton-Raphson method and the accelerated Lloyd's algorithm.

Algorithm 3 Calculating $\Gamma_k$
1: Set $\Gamma = \Gamma_{k-1}$ and $l = 1$
2: while $\mathrm{cond}\big(\nabla^2 D(\Gamma)\big) < \mathrm{tol}$ and $l \le l^{\mathrm{NR}}_{\max}$ do
3:   $\Gamma \leftarrow \Gamma - \big[\nabla^2 D(\Gamma)\big]^{-1}\, \nabla D(\Gamma)$
4:   $l \leftarrow l + 1$
5: end while
6: if $l < l^{\mathrm{NR}}_{\max}$ then
7:   for $l = 1$ to $l^{\mathrm{LA}}_{\max}$ do
8:     $\Gamma \leftarrow g(\Gamma)$
9:   end for
10: end if

7 Numerical results

In this section, we price options under the Heston (1993) and SABR (Hagan et al., 2002) models, and provide a proof-of-concept calibration for the SABR model. For European options, the Heston model is amenable to semi-analytical pricing using Fourier transform techniques, whereas an analytical approximation exists for both the Black and Bachelier implied volatilities under the SABR model. The Fourier pricing technique implemented uses the little trap formulation of the characteristic function for the Heston model (Albrecher et al., 2007), while the implied volatility approximation for the SABR model is the latest from Hagan et al. (2016).

The Heston example serves to highlight the effect of changing the discretization of the independent process. For the SABR model we price up-and-out barrier and Bermudan put options, and provide a proof-of-concept calibration to market data, illustrating the flexibility of the PMQ algorithm. All simulations were executed using MATLAB R2018a on a computer with a 2.20 GHz Intel i7 processor and 8 GB of RAM.

The SDEs for the Heston model may be specified as

$$a(X_t) = \big[rX^1_t,\ \kappa(\theta - X^2_t)\big]^\top \quad \text{and} \quad b(X_t) = \Big[\sqrt{X^2_t}\, X^1_t,\ \sigma\sqrt{X^2_t}\Big]^\top,$$

with $d\langle W^1, W^2\rangle_t = \rho\,dt$. The parameters chosen were $\kappa = 2$, $\theta = 0.09$,
$\sigma = 60\%$, $r = 5\%$, $\rho = -0.\ldots$, $x^1_0 = 100$ and $x^2_0 = 0.09$, which are based on the SV-I parameter set from Table 3 of Lord et al. (2010), with $\sigma$ adjusted from 1 to 0.6 so that $2\kappa\theta = \sigma^2$. For the PMQ algorithm, $N^1 = 30$ and $N^2 = 15$ codewords were used for the two processes, with the number of codewords held constant through time. The maturity was set at $T = 1$, and $K = 12$ time steps were used.

Figure 2 demonstrates pricing a one-year European put option using the PMQ algorithm with both the standard Euler discretization for the variance process and the simplified weak-order 2.0 discretization. We denote these the Euler-Euler and Euler-WO2 schemes, respectively. The left panel shows the absolute difference between the continuous marginal distribution of $\widetilde{X}^1_K$ (before quantization) and the true marginal distribution of $X^1_T$, which can be obtained by numerically integrating the characteristic function. As we are comparing two distribution functions, this error must be between zero and one.

[Figure 2: Comparison of the Euler-Euler PMQ algorithm and the Euler-WO2 PMQ algorithm for the Heston model. Left panel: distribution function error over the domain; right panel: European put error across strikes, with three-standard-deviation Monte Carlo bounds.]

Although both the Euler-Euler and Euler-WO2 schemes approximate the true distribution function well, the Euler-WO2 scheme has an average absolute error across the specified domain of $0.\ldots$

The SDEs for the SABR model may be specified as

$$a(X_t) = [0, 0]^\top \quad \text{and} \quad b(X_t) = \big[X^2_t (X^1_t)^\beta,\ \nu X^2_t\big]^\top,$$

with $d\langle W^1, W^2\rangle_t = \rho\,dt$. The parameters chosen for option pricing were $\beta = 0.\ldots$, $\nu = 0.\ldots$, $\rho = -0.\ldots$ and $x^2_0 = 0.4$.
For this example, we model the forward value of an asset with stochastic volatility under the assumption of a constant interest rate of $r = 10\%$, such that $x^1_0 = S_0 \exp(rT)$, with $S_0 = 100$ and $T$ set at one year. For the PMQ algorithm we chose $N^1 = 60$, $N^2 = 30$ and $K = 12$. The Monte Carlo simulations in this section utilize the fully-truncated Euler scheme, suggested as the least-biased scheme for stochastic volatility models by Lord et al. (2010), with 100 000 paths and 120 time steps.

[Figure 3: The PMQ algorithm for pricing up-and-out and Bermudan put options under the SABR model.]

In the left panel of Figure 3, we price discrete up-and-out put options with a maturity of one year, monthly barrier evaluations and a strike of 100, for a variety of barrier levels. The barrier levels are expressed as a percentage of strike. Two prices produced by the PMQ algorithm lie outside the three-standard-deviation bounds of the Monte Carlo simulation. However, when using the Monte Carlo prices as a benchmark, the resulting prices are very accurate, with an average relative error across the barrier levels of less than $0.\ldots$

An advantage of the PMQ algorithm, like traditional tree methods, is the ability to price multiple options without needing to re-generate the underlying grid. Once the optimal quantization grid has been generated out to the furthest required option maturity, the computational cost of pricing options is negligible. An immediate application is the ability to calibrate stochastic volatility models directly to non-vanilla products.

[Figure 4: Calibrating the SABR model to American put options on AMZN for January 22, 2018.]

RMQ has previously been used to calibrate the quadratic normal volatility model to vanilla options on the DAX index by Callegaro et al.
(2015) and PMQ calibration has been demonstrated using the Heston model by Callegaro et al. (2018).

The SABR calibration problem can be formulated as

$$\min_{\Theta \in \mathbb{R}^4} F(\Theta),$$

where $F$ is the objective or error function and $\Theta = \{y_0, \beta, \nu, \rho\}$ is the parameter set for the SABR model. In Escobar and Gschnaidtner (2016) the relative squared volatility error (RSVE) is recommended as the objective function for calibrating the Heston model, and it is adopted here. It is defined as

$$F(\Theta) = \sum_{l=1}^{L} \left(\frac{\sigma^{\mathrm{Model}}_l(\Theta) - \sigma^{\mathrm{Market}}_l}{\sigma^{\mathrm{Market}}_l}\right)^2,$$

where $L$ is the number of calibration instruments used, $\sigma^{\mathrm{Model}}_l(\Theta)$ is the Black-Scholes implied volatility that corresponds to pricing calibration instrument $l$ with the model parameters $\Theta$, and $\sigma^{\mathrm{Market}}_l$ is the implied volatility for that instrument observable in the market.

As a proof-of-concept example, we calibrate the SABR model directly to American put options on AMZN for January 22, 2018. We considered maturities from 3 days to 3 months and all strikes within 30% of at-the-money that had non-zero volume, for a total of 393 calibration instruments. The stock price was $x^1_0 = 1327.\ldots$ The calibrated parameters and objective value were:

$y_0$   $\beta$   $\nu$   $\rho$   $F(\Theta)$
0.87   0.86   0.…   −0.92   7.…

8 Conclusion

In this paper, we have formulated the one-dimensional RMQ and $d$-dimensional PMQ algorithms as standard vector quantization problems by deriving the density, distribution and lower partial expectation functions of the random variables to be quantized at each time step. As a consequence, this allows the straightforward application of Lloyd's algorithm in the cases where the traditional Newton-Raphson method becomes unstable. We proposed a hybrid algorithm that utilizes the speed of the Newton-Raphson method but may fall back to the less efficient accelerated Lloyd's algorithm when necessary. Furthermore, we extended the PMQ algorithm for stochastic volatility models by using a simplified weak-order 2.0 update for the volatility process.
The effectiveness of this technique was demonstrated by comparing the resulting marginal distributions of correlated geometric Brownian motion asset processes, and by pricing European options under the Heston model. Finally, we priced up-and-out barrier and Bermudan put options under the SABR model and provided a proof-of-concept calibration to an American put option implied volatility surface. The calibration, in particular, highlighted the need for our hybrid algorithm, as the inversion of the Hessian matrix became numerically unstable at various stages during the search through the parameter space.

Appendix A  The simplified weak-order 2.0 approximation

Consider the continuous-time scalar-valued diffusion specified by the SDE
\[
dX_t = a(X_t)\,dt + b(X_t)\,dW_t, \qquad X_0 = x_0 \in \mathbb{R}, \tag{19}
\]
defined on the filtered probability space (Ω, F, (F_t)_{t∈[0,T]}, P), where W is a standard one-dimensional Brownian motion. The simplified weak-order 2.0 approximation to this process can be written as
\[
X_{k+1} = \bar{m}(X_k)\left( z_{k+1} + \sqrt{\bar{\lambda}(X_k)} \right)^{2} + \bar{c}(X_k), \qquad X_0 = x_0,
\]
where the z_k are independent standard normal random variables, with
\[
\bar{m}(x) = \tfrac{1}{2}\, b(x) b'(x)\, \Delta t,
\]
\[
\bar{c}(x) = x + \left( a(x) - \tfrac{1}{2}\, b(x) b'(x) \right) \Delta t + \tfrac{1}{2} \left( a(x) a'(x) + \tfrac{1}{2}\, a''(x) b^{2}(x) \right) (\Delta t)^{2} - \frac{\left( b(x) + \tfrac{1}{2} \left( a'(x) b(x) + a(x) b'(x) + \tfrac{1}{2}\, b''(x) b^{2}(x) \right) \Delta t \right)^{2}}{2\, b(x) b'(x)},
\]
and
\[
\bar{\lambda}(x) = \left( \frac{ b(x) + \tfrac{1}{2} \left( a'(x) b(x) + a(x) b'(x) + \tfrac{1}{2}\, b''(x) b^{2}(x) \right) \Delta t }{ b(x) b'(x) \sqrt{\Delta t} } \right)^{2}.
\]
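To make the coefficients above concrete, the following sketch (not part of the paper; all parameter values are illustrative) evaluates m̄, c̄ and λ̄ for a geometric Brownian motion, a(x) = μx and b(x) = σx, and checks that the resulting scaled non-central chi-square update reproduces the weak-order 2.0 one-step mean:

```python
import numpy as np
from scipy.stats import ncx2

# Simplified weak-order 2.0 coefficients for dX = a(X) dt + b(X) dW,
# evaluated for geometric Brownian motion: a(x) = mu*x, b(x) = sigma*x,
# so a' = mu, a'' = 0, b' = sigma, b'' = 0. Parameter values are illustrative.
mu, sigma, dt, x = 0.05, 0.2, 0.01, 100.0
a, ap, app = mu * x, mu, 0.0
b, bp, bpp = sigma * x, sigma, 0.0

m_bar = 0.5 * b * bp * dt
B = b + 0.5 * (ap * b + a * bp + 0.5 * bpp * b**2) * dt  # numerator of sqrt(lambda_bar)
c_bar = (x + (a - 0.5 * b * bp) * dt
         + 0.5 * (a * ap + 0.5 * app * b**2) * dt**2
         - B**2 / (2.0 * b * bp))
lam_bar = (B / (b * bp * np.sqrt(dt)))**2

# The update X_{k+1} = m_bar * (z + sqrt(lam_bar))**2 + c_bar is a scaled
# non-central chi-square with 1 degree of freedom and non-centrality lam_bar,
# so its mean is m_bar * (1 + lam_bar) + c_bar ...
mean_ncx2 = m_bar * (1.0 + lam_bar) + c_bar

# ... which must agree with the weak-order 2.0 one-step mean of the scheme.
mean_taylor = x + a * dt + 0.5 * (a * ap + 0.5 * app * b**2) * dt**2

# Distribution function of the one-step update at the starting point x,
# evaluated via SciPy's non-central chi-square distribution.
p_le_x0 = ncx2.cdf((x - c_bar) / m_bar, df=1, nc=lam_bar)
```

Because completing the square leaves the mean unchanged, m̄(1 + λ̄) + c̄ equals the second-order Taylor drift exactly, which makes this a cheap consistency check on an implementation of the coefficient formulas.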
The required distribution, density and lower partial expectation functions become
\[
F_{\widetilde{X}_{k+1}}(x) = \sum_{i=1}^{N_k} F_{\chi}\!\left( \frac{x - \bar{c}_{ik}}{\bar{m}_{ik}};\, 1, \lambda_{ik} \right) p_{ik},
\]
\[
f_{\widetilde{X}_{k+1}}(x) = \sum_{i=1}^{N_k} \frac{1}{\bar{m}_{ik}}\, f_{\chi}\!\left( \frac{x - \bar{c}_{ik}}{\bar{m}_{ik}};\, 1, \lambda_{ik} \right) p_{ik},
\]
and
\[
M_{\widetilde{X}_{k+1}}(x) = \sum_{i=1}^{N_k} \left[ \bar{m}_{ik}\, M_{\chi}\!\left( \frac{x - \bar{c}_{ik}}{\bar{m}_{ik}};\, 1, \lambda_{ik} \right) + \bar{c}_{ik}\, F_{\chi}\!\left( \frac{x - \bar{c}_{ik}}{\bar{m}_{ik}};\, 1, \lambda_{ik} \right) \right] p_{ik},
\]
where F_χ(·; k, λ) and f_χ(·; k, λ) are the distribution and density functions of a non-central chi-square random variable with degrees of freedom k and non-centrality parameter λ, M_χ(·; k, λ) is the corresponding lower partial expectation function, and we use the shorthand notation m̄_ik := m̄(x_ik), c̄_ik := c̄(x_ik) and λ_ik := λ̄(x_ik).

References

Albrecher, H., Mayer, P. A., Schoutens, W., and Tistaert, J. (2007). The little Heston trap. Wilmott, Jan.:83–92.

Andersen, L. (2008). Simple and efficient simulation of the Heston stochastic volatility model. Journal of Computational Finance, 11(3):1–42.

Bormetti, G., Callegaro, G., Livieri, G., and Pallavicini, A. (2018). A backward Monte Carlo approach to exotic option pricing. European Journal of Applied Mathematics, 29(1):146–187.

Callegaro, G., Fiorin, L., and Grasselli, M. (2015). Quantized calibration in local volatility. Risk Magazine, 28:62–67.

Callegaro, G., Fiorin, L., and Grasselli, M. (2017). Pricing via recursive quantization in stochastic volatility models. Quantitative Finance, 17(6):855–872.

Callegaro, G., Fiorin, L., and Grasselli, M. (2018). American quantized calibration in stochastic volatility. Risk Magazine, pages 84–88.

Escobar, M. and Gschnaidtner, C. (2016). Parameters recovery via calibration in the Heston model: A comprehensive review. Wilmott, Nov.:60–81.

Fiorin, L., Pagès, G., and Sagna, A. (2019). Product Markovian quantization of a diffusion process with applications to finance. Methodology and Computing in Applied Probability, 21:1087–1118.

Hagan, P.
S., Kumar, D., Lesniewski, A. S., and Woodward, D. E. (2002). Managing smile risk. Wilmott, Sept.:84–108.

Hagan, P. S., Kumar, D., Lesniewski, A. S., and Woodward, D. E. (2016). Universal smiles. Wilmott, Jul.:40–55.

Heston, S. L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options. Review of Financial Studies, 6(2):327–343.

Lloyd, S. P. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137.

Lord, R., Koekkoek, R., and Van Dijk, D. (2010). A comparison of biased simulation schemes for stochastic volatility models. Quantitative Finance, 10(2):177–194.

McWalter, T. A., Rudd, R., Kienitz, J., and Platen, E. (2018). Recursive marginal quantization of higher-order schemes. Quantitative Finance, 18:693–706.

Pagès, G. (2015). Introduction to optimal vector quantization and its applications for numerics. ESAIM: Proceedings and Surveys, 48:29–79.

Pagès, G. and Pham, H. (2005). Optimal quantization methods for nonlinear filtering with discrete-time observations. Bernoulli, 11(5):893–932.

Pagès, G., Pham, H., and Printems, J. (2004). An optimal Markovian quantization algorithm for multi-dimensional stochastic control problems. Stochastics and Dynamics, 4(4):501–545.

Pagès, G. and Sagna, A. (2015). Recursive marginal quantization of the Euler scheme of a diffusion process. Applied Mathematical Finance, 22(5):463–498.

Pagès, G. and Wilbertz, B. (2009). Optimal Delaunay and Voronoi quantization methods for pricing American options. In Carmona, R., Hu, P., Del Moral, P., and Oudjane, N., editors, Numerical Methods in Finance, pages 171–213. Springer.

Rouah, F. D. (2013). The Heston Model and Its Extensions in Matlab and C#. John Wiley & Sons.

Rudd, R., McWalter, T. A., Kienitz, J., and Platen, E. (2017). Fast quantization of stochastic volatility models. Available at SSRN 2956168.

Sagna, A. (2011). Pricing of barrier options by marginal functional quantization.
Monte Carlo Methods and Applications, 17(4):371–398.

Walker, H. F. (2011). Anderson acceleration: Algorithms and implementations. WPI Math. Sciences Dept. Report MS-6-15-50.

Walker, H. F. and Ni, P. (2011). Anderson acceleration for fixed-point iterations. SIAM Journal on Numerical Analysis, 49(4):1715–1735.