A new approach for American option pricing: The Dynamic Chebyshev method
Kathrin Glau, Mirco Mahlstedt∗, Christian Pötz∗

Queen Mary University of London, UK · Technical University of Munich, Germany

June 15, 2018

∗ The authors thank the KPMG Center of Excellence in Risk Management for their support.
Abstract
We introduce a new method to price American options based on Chebyshev interpolation. In each step of a dynamic programming time-stepping we approximate the value function with Chebyshev polynomials. The key advantage of this approach is that it allows us to shift the model-dependent computations into an offline phase prior to the time-stepping. In the offline part a family of generalised (conditional) moments is computed by an appropriate numerical technique such as a Monte Carlo, PDE or Fourier transform based method. Thanks to this methodological flexibility the approach applies to a large variety of models. Online, the backward induction is solved on a discrete Chebyshev grid, and no (conditional) expectations need to be computed. For each time step the method delivers a closed-form approximation of the price function along with the option's delta and gamma. Moreover, the same family of (conditional) moments yields multiple outputs including the option prices for different strikes, maturities and different payoff profiles. We provide a theoretical error analysis and find conditions that imply explicit error bounds for a variety of stock price models. Numerical experiments confirm the fast convergence of prices and sensitivities. An empirical investigation of accuracy and runtime also shows an efficiency gain compared with the least-squares Monte Carlo method introduced by Longstaff and Schwartz (2001).
Keywords
American Option Pricing, Complexity Reduction, Dynamic Programming, Polynomial Interpolation
1 Introduction

A challenging task for financial institutions is the computation of prices and sensitivities for large portfolios of derivatives such as equity options. Typically, equity options have an early exercise feature and can either be exercised at any time until maturity (American type) or at a set of pre-defined exercise dates (Bermudan type). In the absence of explicit solutions, different numerical methods have been developed to tackle this problem.

In a dynamic programming formulation, let $X_t$ be the underlying Markov process; the value function $V_t$ is then given by

$$V_T(x) = g(x), \qquad V_t(x) = f\big(g(t,x),\, E[V_{t+1}(X_{t+1}) \mid X_t = x]\big),$$

with time steps $t < t+1 < \ldots < T$ and payoff function $g$. The computational challenge is to compute $E[V_{t+1}(X_{t+1}) \mid X_t = x]$ for all time steps $t$ and all states $x$, where $V_{t+1}$ depends on all previous time steps.

In order to tackle this problem, we approximate the value function in each time step by Chebyshev polynomial interpolation. We thus express the value function as a finite sum of Chebyshev polynomials $T_j$ and obtain

$$E[V_{t+1}(X_{t+1}) \mid X_t = x] \approx \sum_j c_j^{t+1}\, E[T_j(X_{t+1}) \mid X_t = x]. \qquad (1.1)$$

The choice of Chebyshev polynomials is motivated by the following promising properties of Chebyshev interpolation:

• The vector of coefficients $(c_j^{t+1})_{j=0,\ldots,N}$ is explicitly given as a linear combination of the values $V_{t+1}(x_k)$ at the Chebyshev grid points $x_k$. For this, equation (1.1) needs to be evaluated at the Chebyshev grid points $x = x_k$ only.
• The interpolation converges exponentially for analytic functions and polynomially for differentiable functions, depending on the order of differentiability.
• The interpolation can be implemented in a numerically stable way.

The computation of the continuation value at a single time step coincides with the pricing of a European option. Its interpolation with Chebyshev polynomials is proposed in Gaß et al. (2018), where the method proves to be highly promising and exponential convergence is established for a large set of models and option types. Moreover, the approximation of the value function with Chebyshev polynomials has already proven to be beneficial for optimal control problems in economics, see Judd (1998) and Cai and Judd (2013).

The key advantage of our approach for American option pricing is that it collects all model-dependent computations in the generalized conditional moments $\Gamma_{j,k} = E[T_j(X_{t+1}) \mid X_t = x_k]$. If there is no closed-form solution, their calculation can be shifted into an offline phase prior to the time-stepping. Depending on the underlying model, a suitable numerical technique such as a Monte Carlo, PDE or Fourier transform method can be chosen, which reveals the high flexibility of the approach. Once the generalized conditional moments $\Gamma_{j,k}$ are computed, the backward induction is solved on a discrete Chebyshev grid, which avoids any computation of conditional expectations during the time-stepping. For each time step the method delivers a closed-form approximation of the price function $x \mapsto \sum_j c_j^t T_j(x)$ along with the option's delta and gamma. Since the family of generalized conditional moments $\Gamma_{j,k}$ is independent of the value function, it can be used to generate multiple outputs including the option prices for different strikes, maturities and different payoff profiles. The structure of the method is also beneficial for the calculation of expected future exposure, which is the computational bottleneck in the computation of CVA, as investigated in Glau et al. (2018).
The offline-online decomposition separates model and payoff, yielding a modular design. We exploit this structure for a thorough error analysis and find conditions that imply explicit error bounds. They reflect the modularity by decomposing into a part stemming from the Chebyshev interpolation, one from the time-stepping and one from the offline computation. Under smoothness conditions the asymptotic convergence behaviour is deduced.

We perform numerical experiments using the Black-Scholes model, Merton's jump diffusion model and the Constant Elasticity of Variance (CEV) model as a representative of a local volatility model. For the computation of the generalized conditional moments we thus use different techniques, namely numerical integration based on Fourier transforms and Monte Carlo simulation. Numerical experiments confirm the fast convergence of option prices along with their deltas and gammas. A comprehensive comparison with the LSM reveals the potential efficiency gain of the new approach, particularly when several options on the same underlying are priced.

The rest of the article is organized as follows. We introduce the problem setting and the new method in Section 2 and provide the error analysis in Section 3. Section 4 discusses general traits of the implementation and Section 5 presents the numerical experiments. Section 6 concludes the article, followed by an appendix with the proof of the main result.

2 The Dynamic Chebyshev method

First, we present the Bellman-Wald equation as a specific form of dynamic programming. Second, we provide the necessary notation for the Chebyshev interpolation. Then we are in a position to introduce the new approach and its application to American option pricing.

2.1 Optimal stopping and Dynamic Programming
Let $X = (X_t)_{0\le t\le T}$ be a Markov process with state space $\mathbb{R}^d$ defined on the filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, P)$. Let $g: [0,T]\times\mathbb{R}^d \to \mathbb{R}$ be a continuous function with $E\big[\sup_{0\le t\le T} |g(t, X_t)|\big] < \infty$. Then the optimal stopping problem is given by

$$V(t,x) := \sup_{t\le\tau\le T} E[g(\tau, X_\tau) \mid X_t = x] \quad \text{for all } (t,x)\in[0,T]\times\mathbb{R}^d,$$

where the supremum is taken over all stopping times $\tau$, see Peskir and Shiryaev (2006). In discrete time the optimal stopping problem can be solved with dynamic programming. Namely, with time stepping $t = t_0 < \ldots < t_n = T$ the solution of the optimal stopping problem can be calculated via backward induction

$$V_{t_n}(x) = g(T, x),$$
$$V_{t_u}(x) = \max\big(g(t_u, x),\, E[V_{t_{u+1}}(X_{t_{u+1}}) \mid X_{t_u} = x]\big).$$

Note that $n$ refers to the number of time steps between $t$ and $T$. For notational convenience, we indicate the value function at each time step with subscript $t_u$ to directly refer to the time step $t_u$. For a detailed overview of optimal stopping problems in discrete time we refer to Peskir and Shiryaev (2006).

2.2 Chebyshev polynomial interpolation

The univariate Chebyshev polynomial interpolation, as described in detail in Trefethen (2013), has a tensor-based extension to the multivariate case, see e.g. Sauter and Schwab (2010). Usually, the Chebyshev interpolation is defined for a function on the domain $[-1,1]^D$. For an arbitrary hyperrectangle $\mathcal{X} = [\underline{x}_1, \overline{x}_1]\times\ldots\times[\underline{x}_D, \overline{x}_D]$, we introduce a linear transformation $\tau_{\mathcal{X}}: [-1,1]^D \to \mathcal{X}$, defined componentwise by

$$\tau_{\mathcal{X}}(z)_i = \underline{x}_i + 0.5\,(\overline{x}_i - \underline{x}_i)(1 - z_i). \qquad(2.1)$$

Let $N := (N_1, \ldots, N_D)$ with $N_i \in \mathbb{N}$ for $i = 1, \ldots, D$. We define the index set

$$\mathcal{J} := \{ j \in \mathbb{N}_0^D : 0 \le j_i \le N_i \text{ for } i = 1, \ldots, D \}.$$

The Chebyshev polynomials are defined for $z \in [-1,1]^D$ and $j \in \mathcal{J}$ by

$$T_j(z) = \prod_{i=1}^{D} T_{j_i}(z_i), \qquad T_{j_i}(z_i) = \cos\big(j_i \cdot \arccos(z_i)\big),$$

and the $j$-th Chebyshev polynomial on $\mathcal{X}$ as $p_j(x) = T_j(\tau_{\mathcal{X}}^{-1}(x))\,\mathbb{1}_{\mathcal{X}}(x)$. The Chebyshev points are given by

$$z^k = (z_{k_1}, \ldots, z_{k_D}), \qquad z_{k_i} = \cos\Big(\pi\frac{k_i}{N_i}\Big) \quad\text{for } k_i = 0, \ldots, N_i \text{ and } i = 1, \ldots, D,$$

and the transformed Chebyshev points by $x^k = \tau_{\mathcal{X}}(z^k)$. The Chebyshev interpolation of a function $f: \mathcal{X} \to \mathbb{R}$ with $\prod_{i=1}^{D}(N_i+1)$ summands can be written as a sum of Chebyshev polynomials

$$I_N(f)(x) = \sum_{j\in\mathcal{J}} c_j\, T_j(\tau_{\mathcal{X}}^{-1}(x)) = \sum_{j\in\mathcal{J}} c_j\, p_j(x) \quad\text{for } x\in\mathcal{X} \qquad(2.2)$$

with coefficients $c_j$ for $j\in\mathcal{J}$,

$$c_j = D_N(j) \sum_{k\in\mathcal{J}}{}^{''}\, f(x^k)\, T_j(z^k), \qquad D_N(j) := \prod_{i=1}^{D} \frac{2^{\mathbb{1}_{\{0<j_i<N_i\}}}}{N_i}, \qquad(2.3)$$

where the double prime indicates that the first and last summand of each sum are halved.
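To make the notation concrete, the following one-dimensional sketch (our illustration, not code from the paper) implements the transformation (2.1), the coefficient formula (2.3) and the evaluation of the interpolant (2.2) in plain NumPy:

```python
import numpy as np

def cheb_nodes(N):
    """Chebyshev points z_k = cos(pi k / N), k = 0, ..., N, on [-1, 1]."""
    return np.cos(np.pi * np.arange(N + 1) / N)

def to_domain(z, a, b):
    """Transformation tau_X from [-1, 1] to [a, b], cf. (2.1)."""
    return a + 0.5 * (b - a) * (1.0 - z)

def from_domain(x, a, b):
    """Inverse transformation tau_X^{-1} from [a, b] to [-1, 1]."""
    return 1.0 - 2.0 * (x - a) / (b - a)

def cheb_coefficients(values):
    """Coefficients c_j from the nodal values f(x_k) via (2.3); the double
    prime on the sum means the first and last summand are halved."""
    N = len(values) - 1
    z = cheb_nodes(N)
    w = np.ones(N + 1); w[0] = w[-1] = 0.5          # Sigma'' weights
    c = np.empty(N + 1)
    for j in range(N + 1):
        Tj = np.cos(j * np.arccos(z))                # T_j at the nodes
        c[j] = (2.0 / N) * np.sum(w * values * Tj)
    c[0] *= 0.5; c[-1] *= 0.5                        # prefactor D_N(j)
    return c

def cheb_eval(c, x, a, b):
    """Evaluate the interpolant (2.2) at points x in [a, b]."""
    z = np.clip(from_domain(np.asarray(x), a, b), -1.0, 1.0)
    j = np.arange(len(c))
    return np.cos(np.outer(np.arccos(z), j)) @ c

# Example: interpolate f(x) = exp(x) on [0, 2] with N = 20 nodes.
a, b, N = 0.0, 2.0, 20
xk = to_domain(cheb_nodes(N), a, b)
c = cheb_coefficients(np.exp(xk))
x = np.linspace(a, b, 101)
print(np.max(np.abs(cheb_eval(c, x, a, b) - np.exp(x))))  # ~ machine precision
```

For an analytic function such as the exponential, the printed maximum error is already at the level of machine precision for moderate $N$, in line with the exponential convergence discussed in the introduction.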
2.3 The Dynamic Chebyshev algorithm

Definition 2.1. We consider a Dynamic Programming Problem (DPP) with value function

$$V_T(x) = g(T, x), \qquad(2.4)$$
$$V_{t_u}(x) = f\big(g(t_u, x),\, E[V_{t_{u+1}}(X_{t_{u+1}}) \mid X_{t_u} = x]\big), \qquad(2.5)$$

where $t = t_0 < \ldots < t_n = T$ and $f: \mathbb{R}\times\mathbb{R} \to \mathbb{R}$ is Lipschitz continuous with constant $L_f$.

At the initial time $T = t_n$, we apply Chebyshev interpolation to the function $g(T, x)$, i.e. for $x\in\mathcal{X}$,

$$V_T(x) = g(T, x) \approx \sum_{j\in\mathcal{J}} c_j(T)\, p_j(x) =: \widehat{V}_T(x).$$

At the first time step $t_{n-1}$, the computation of $E[g(t_n, X_{t_n}) \mid X_{t_{n-1}} = x]$ is replaced by $E[\sum_j c_j(t_n) p_j(X_{t_n}) \mid X_{t_{n-1}} = x]$, yielding

$$V_{t_{n-1}}(x) = f\big(g(t_{n-1}, x),\, E[V_{t_n}(X_{t_n}) \mid X_{t_{n-1}} = x]\big) \approx f\Big(g(t_{n-1}, x),\, E\Big[\sum_{j\in\mathcal{J}} c_j(t_n) p_j(X_{t_n}) \,\Big|\, X_{t_{n-1}} = x\Big]\Big) = f\Big(g(t_{n-1}, x),\, \sum_{j\in\mathcal{J}} c_j(t_n)\, E\big[p_j(X_{t_n}) \mid X_{t_{n-1}} = x\big]\Big).$$

At time step $t_{n-1}$ the value function $V_{t_{n-1}}$ needs only to be evaluated at the specific Chebyshev nodes. Hence, denoting by $x^k = (x_{k_1}, \ldots, x_{k_D})$ the Chebyshev nodes, it suffices to evaluate

$$\widehat{V}_{t_{n-1}}(x^k) := f\Big(g(t_{n-1}, x^k),\, \sum_{j\in\mathcal{J}} c_j(t_n)\, E\big[p_j(X_{t_n}) \mid X_{t_{n-1}} = x^k\big]\Big). \qquad(2.6)$$

A linear transformation of $(\widehat{V}_{t_{n-1}}(x^k))_{k\in\mathcal{J}}$ yields the Chebyshev coefficients according to (2.3), which determines the Chebyshev interpolation $\widehat{V}_{t_{n-1}} = \sum_j c_j(t_{n-1}) p_j$. We apply this procedure iteratively as described in detail in Algorithm 1.

The stochastic part is gathered in the expectations of the Chebyshev polynomials conditioned on the Chebyshev nodes, i.e. $\Gamma_{j,k}(t_u) = E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k]$. Moreover, if an equidistant time stepping is applied the computation can be further simplified. If for the underlying stochastic process

$$\Gamma_{j,k}(t_u) = E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k] = E[p_j(X_{t_1}) \mid X_{t_0} = x^k] =: \Gamma_{j,k} \qquad(2.7)$$

for $u = 0, \ldots, n-1$, then the conditional expectations need to be computed for one time step only, see Algorithm 2. One can pre-compute these conditional expectations and thus, the method allows for an offline/online decomposition.

Algorithm 1 Dynamic Chebyshev algorithm
Require: $N \in \mathbb{N}^D$, $\mathcal{X} = [\underline{x}_1, \overline{x}_1]\times\ldots\times[\underline{x}_D, \overline{x}_D]$, $0 = t_0, \ldots, t_n = T$
1: Determine the index set $\mathcal{J}$ and the nodal points $x^k = (x_{k_1}, \ldots, x_{k_D})$
2: Pre-computation step: for all $j, k \in \mathcal{J}$ and all $t_u$, $u = 0, \ldots, n-1$:
3:   compute $\Gamma_{j,k}(t_u) = E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k]$
4: Time $T$: set $\widehat{V}_T(x^k) = g(T, x^k)$, $k\in\mathcal{J}$, and derive $c_j(T) = D_N(j) \sum''_{k\in\mathcal{J}} \widehat{V}_T(x^k)\, T_j(z^k)$
5: Obtain the Chebyshev interpolation $\widehat{V}_T(x) = \sum_{j\in\mathcal{J}} c_j(T)\, p_j(x)$ of $V_T(x)$
6: Iterative time stepping $t_{u+1} \to t_u$, $u = n-1, \ldots, 0$:
7:   given the Chebyshev interpolation $\widehat{V}_{t_{u+1}}(x) = \sum_{j\in\mathcal{J}} c_j(t_{u+1})\, p_j(x)$,
8:   compute $\widehat{V}_{t_u}(x^k) = f\big(g(t_u, x^k),\, \sum_{j\in\mathcal{J}} c_j(t_{u+1})\, \Gamma_{j,k}(t_u)\big)$ at the nodal points, $k\in\mathcal{J}$
9:   derive $c_j(t_u) = D_N(j) \sum''_{k\in\mathcal{J}} \widehat{V}_{t_u}(x^k)\, T_j(z^k)$
10:  obtain the Chebyshev interpolation $\widehat{V}_{t_u}(x) = \sum_{j\in\mathcal{J}} c_j(t_u)\, p_j(x)$ of $V_{t_u}(x)$
11: Deriving the solution at $t = 0$: $\widehat{V}_0(x) = \sum_{j\in\mathcal{J}} c_j(0)\, p_j(x)$

Algorithm 2 Simplified Dynamic Chebyshev algorithm
Require: time steps $0 = t_0, \ldots, t_n = T$ with $\Delta t := t_u - t_{u-1}$
Replace the pre-computation step (lines 2-3) of Algorithm 1 with:
  Pre-computation step: compute $\Gamma_{j,k} = E[p_j(X_{\Delta t}) \mid X_0 = x^k]$ for all $j, k \in \mathcal{J}$
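In code, the online phase of Algorithm 2 reduces to a few matrix-vector products once $\Gamma$ is available. The following one-dimensional sketch (ours; the inputs `payoff_at_nodes`, `gamma` and `df` are assumed to be precomputed) mirrors the time stepping with the optimal-stopping choice $f(x, y) = \max(x, \mathrm{df}\cdot y)$, as used for the American put in Section 4.2:

```python
import numpy as np

def cheb_coeff_matrix(N):
    """Matrix mapping nodal values to Chebyshev coefficients, cf. (2.3)."""
    k = np.arange(N + 1)
    z = np.cos(np.pi * k / N)
    C = np.cos(np.outer(k, np.arccos(z))) * (2.0 / N)  # C[j, k] = T_j(z_k)*2/N
    C[:, [0, -1]] *= 0.5        # Sigma'': halve the first and last summand
    C[[0, -1], :] *= 0.5        # prefactor D_N(j) for j = 0 and j = N
    return C

def dynamic_chebyshev(payoff_at_nodes, gamma, df, n_steps):
    """Online phase of Algorithm 2 with f(x, y) = max(x, df * y).

    payoff_at_nodes : g evaluated at the Chebyshev nodes x_k
    gamma           : matrix Gamma[j, k] = E[p_j(X_dt) | X_0 = x_k]
    df              : one-period discount factor
    Returns the Chebyshev coefficients of V_0.
    """
    C = cheb_coeff_matrix(len(payoff_at_nodes) - 1)
    v = payoff_at_nodes.copy()                  # V_T at the nodes
    for _ in range(n_steps):
        c = C @ v                               # coefficients c_j(t_{u+1})
        v = np.maximum(payoff_at_nodes, df * (c @ gamma))  # Bellman step
    return C @ v                                # coefficients c_j(0)
```

No conditional expectations are computed inside the loop; the model enters only through the precomputed matrix `gamma`.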
3 Error analysis

In this section we analyse the error of Algorithm 1, i.e.

$$\varepsilon_{t_u} := \max_{x\in\mathcal{X}} \big|V_{t_u}(x) - \widehat{V}_{t_u}(x)\big|. \qquad(3.1)$$

Two different error sources occur at $t_u$: the classical interpolation error of the Chebyshev interpolation and a distortion error at the nodal points. The latter comes from the fact that the values $\widehat{V}_{t_u}(x^k)$ are approximations of $V_{t_u}(x^k)$. The behaviour of the interpolation error depends on the regularity of the value function. Here, we assume analyticity of the value function. The concept can be extended to further cases such as assuming differentiability or piecewise analyticity. The latter is discussed in preliminary form in Mahlstedt (2017, Section 5.3) and is further investigated in a follow-up paper. Hence, we need a convergence result for the Chebyshev interpolation which incorporates a distortion error at the nodal points.

First, we introduce the required notation. A Bernstein ellipse $B([-1,1], \varrho)$ with $\varrho > 1$ is the open region in the complex plane bounded by the ellipse with foci $\pm 1$ whose semiminor and semimajor axis lengths sum up to $\varrho$. We define a generalized Bernstein ellipse $B(\mathcal{X}, \varrho)$ around the hyperrectangle $\mathcal{X}$ with parameter vector $\varrho\in(1,\infty)^D$ as

$$B(\mathcal{X}, \varrho) := B([\underline{x}_1, \overline{x}_1], \varrho_1)\times\ldots\times B([\underline{x}_D, \overline{x}_D], \varrho_D)$$

with $B([\underline{x}, \overline{x}], \varrho) := \tau_{[\underline{x},\overline{x}]}\circ B([-1,1], \varrho)$, where for $x\in\mathbb{C}$ the transform acts on real and imaginary parts as

$$\tau_{[\underline{x},\overline{x}]}\big(\Re(x)\big) := \underline{x} + \frac{\overline{x}-\underline{x}}{2}\big(1 - \Re(x)\big) \quad\text{and}\quad \tau_{[\underline{x},\overline{x}]}\big(\Im(x)\big) := \frac{\overline{x}-\underline{x}}{2}\,\Im(x),$$

where the sets $B([-1,1], \varrho_i)$ are Bernstein ellipses for $i = 1, \ldots, D$.

Proposition 3.1. Let $\mathcal{X}\ni x \mapsto f(x)$ be a real-valued function with an analytic extension to some generalized Bernstein ellipse $B(\mathcal{X}, \varrho)$ for $\varrho\in(1,\infty)^D$ with $\sup_{x\in B(\mathcal{X},\varrho)} |f(x)| \le B$. Assume distorted values $f^\varepsilon(x^k) = f(x^k) + \varepsilon(x^k)$ with $|\varepsilon(x^k)| \le \bar\varepsilon$ at all nodes $x^k$. Then

$$\max_{x\in\mathcal{X}} \big|f(x) - I_N(f^\varepsilon)(x)\big| \le \varepsilon_{\mathrm{int}}(\varrho, N, D, B) + \bar\varepsilon\,\Lambda_N$$

with

$$\varepsilon_{\mathrm{int}}(\varrho, N, D, B) := 2^{\frac{D}{2}+1}\cdot B\cdot\Big(\sum_{i=1}^{D}\varrho_i^{-2N_i}\prod_{j=1}^{D}\frac{1}{1-\varrho_j^{-2}}\Big)^{\frac{1}{2}} \qquad(3.2)$$

and Lebesgue constant $\Lambda_N \le \prod_{i=1}^{D}\big(\tfrac{2}{\pi}\log(N_i+1)+1\big)$.

Proof. Using the linearity of the interpolation operator we obtain for the Chebyshev interpolation of $f^\varepsilon$ with $f^\varepsilon(x^k) = f(x^k) + \varepsilon(x^k)$ that

$$I_N(f^\varepsilon)(x) = I_N(f)(x) + I_N(\varepsilon)(x).$$

The tensor-based multivariate Chebyshev interpolation $I_N(\varepsilon)$ can be written in Lagrange form

$$I_N(\varepsilon)(x) = \sum_{j\in\mathcal{J}} \varepsilon(x^j)\,\lambda_j(x) \quad\text{with}\quad \lambda_j(x) = \prod_{i=1}^{D} \ell_{j_i}\big(\tau^{-1}_{[\underline{x}_i,\overline{x}_i]}(x_i)\big),$$

where $\ell_{j_i}(z) = \prod_{k\ne j_i}\frac{z - z_k}{z_{j_i} - z_k}$ is the $j_i$-th Lagrange polynomial. This yields

$$\max_{x\in\mathcal{X}} |I_N(\varepsilon)(x)| = \max_{x\in\mathcal{X}}\Big|\sum_{j\in\mathcal{J}}\varepsilon(x^j)\lambda_j(x)\Big| \le \bar\varepsilon\,\max_{x\in\mathcal{X}}\sum_{j\in\mathcal{J}} |\lambda_j(x)| =: \bar\varepsilon\,\Lambda_N.$$

Here $\Lambda_N$ is the Lebesgue constant of the (multivariate) Chebyshev nodes, which is given by

$$\Lambda_N = \max_{x\in\mathcal{X}}\sum_{j\in\mathcal{J}}|\lambda_j(x)| = \max_{x\in\mathcal{X}}\sum_{j\in\mathcal{J}}\prod_{i=1}^{D}\big|\ell_{j_i}\big(\tau^{-1}_{[\underline{x}_i,\overline{x}_i]}(x_i)\big)\big| = \prod_{i=1}^{D}\max_{x_i\in[\underline{x}_i,\overline{x}_i]}\sum_{j_i=0}^{N_i}\big|\ell_{j_i}\big(\tau^{-1}_{[\underline{x}_i,\overline{x}_i]}(x_i)\big)\big|.$$

Since $\max_{x_i\in[\underline{x}_i,\overline{x}_i]}\sum_{j_i=0}^{N_i}|\ell_{j_i}(\tau^{-1}_{[\underline{x}_i,\overline{x}_i]}(x_i))| = \max_{z\in[-1,1]}\sum_{j_i=0}^{N_i}|\ell_{j_i}(z)| = \Lambda_{N_i}$, which is the Lebesgue constant of the univariate Chebyshev interpolation, we have $\Lambda_N = \prod_{i=1}^{D}\Lambda_{N_i}$. From Trefethen (2013, Theorem 15.2) we obtain for the univariate Chebyshev interpolation $\Lambda_N \le \frac{2}{\pi}\log(N+1)+1$ and hence

$$\Lambda_N \le \prod_{i=1}^{D}\Big(\frac{2}{\pi}\log(N_i+1)+1\Big). \qquad(3.3)$$

For the distorted Chebyshev interpolation it holds that

$$\big|f(x) - I_N(f^\varepsilon)(x)\big| \le \big|f(x) - I_N(f)(x)\big| + \big|I_N(\varepsilon)(x)\big|.$$

Therefore, the proposition follows directly from (3.3) and Sauter and Schwab (2010).
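The Lebesgue bound (3.3) is straightforward to check numerically. The sketch below (ours) approximates the univariate Lebesgue constant of the Chebyshev points by evaluating $\sum_j |\ell_j(z)|$ on a fine grid and compares it with $\frac{2}{\pi}\log(N+1)+1$:

```python
import numpy as np

def lebesgue_constant(N, n_grid=5000):
    """Numerically approximate the Lebesgue constant of the N+1
    Chebyshev points z_k = cos(pi k / N) on [-1, 1]."""
    zk = np.cos(np.pi * np.arange(N + 1) / N)
    z = np.linspace(-1.0, 1.0, n_grid)
    lam = np.zeros_like(z)
    for j in range(N + 1):
        lj = np.ones_like(z)
        for k in range(N + 1):
            if k != j:
                lj *= (z - zk[k]) / (zk[j] - zk[k])   # Lagrange basis l_j
        lam += np.abs(lj)
    return lam.max()

for N in [4, 8, 16, 32]:
    bound = 2 / np.pi * np.log(N + 1) + 1
    print(f"N = {N:2d}   Lambda_N ~ {lebesgue_constant(N):.3f}   bound {bound:.3f}")
```

The logarithmic growth of $\Lambda_N$ is what makes the accumulation of nodal distortions over many time steps benign in practice.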
We use this result to investigate the error of the Dynamic Chebyshev method. First, we introduce the following assumption.

Assumptions 3.2. We assume $\mathcal{X}\ni x \mapsto V_{t_u}(x)$ is a real-valued function that has an analytic extension to a generalized Bernstein ellipse $B(\mathcal{X}, \varrho_{t_u})$ with $\varrho_{t_u}\in(1,\infty)^D$ and $\sup_{x\in B(\mathcal{X},\varrho_{t_u})} |V_{t_u}(x)| \le B_{t_u}$ for $u = 1, \ldots, n$.

Proposition 3.5 provides conditions on the process $X$ and the functions $f$ and $g$ that guarantee Assumptions 3.2. Under these assumptions, we can apply Proposition 3.1 to obtain an error bound for the Dynamic Chebyshev method at each time step. This error bound has a recursive structure, since the values of $V_{t_u}$ depend on the conditional expectation of $V_{t_{u+1}}$. The interpolation error of the final time step is of the form (3.2). At any other time step $t_u$ an additional distortion error comes into play from approximating the function values at the nodal points by

$$V_{t_u}(x^k) \approx f\Big(g(t_u, x^k),\, \sum_{j\in\mathcal{J}} c_j(t_{u+1})\, E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k]\Big) = \widehat{V}_{t_u}(x^k).$$

Proposition 3.1 yields

$$\varepsilon_{t_u} := \max_{x\in\mathcal{X}}\big|V_{t_u}(x) - \widehat{V}_{t_u}(x)\big| \le \varepsilon_{\mathrm{int}}(\varrho_{t_u}, N, D, B_{t_u}) + \Lambda_N\, F_{t_u},$$

where $F_{t_u} := \max_{j\in\mathcal{J}}|V_{t_u}(x^j) - \widehat{V}_{t_u}(x^j)|$. The term $F_{t_u}$ depends on the function $f$ and the interpolation error at the previous time step $t_{u+1}$.

Moreover, two additional error sources can influence the error bound. If there is no closed-form solution for the generalized moments $E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k]$, a numerical technique, e.g. numerical quadrature or Monte Carlo methods, introduces an additional error. The former is typically deterministic and bounded whereas the latter is stochastic. In order to incorporate this error in the following error analysis we introduce some additional notation. The conditional expectation can be seen as a linear operator on the vector space of all continuous functions $C(\mathbb{R}^D)$ with finite $L^\infty$-norm,

$$\Gamma^k_{t_u}: C(\mathbb{R}^D)\to\mathbb{R} \quad\text{with}\quad \Gamma^k_{t_u}(f) := E[f(X_{t_{u+1}}) \mid X_{t_u} = x^k].$$

Define the subspace of all $D$-variate polynomials $\mathcal{P}_N(\mathcal{X}) := \mathrm{span}\{p_j,\, j\in\mathcal{J}\}$ equipped with the $L^\infty$-norm. We assume the operator $\Gamma^k_{t_u}$ is approximated by a linear operator $\widehat\Gamma^k_{t_u}: \mathcal{P}_N(\mathcal{X})\to\mathbb{R}$ on $\mathcal{P}_N(\mathcal{X})$ which fulfills one of the two following conditions. For all $u = 0, \ldots, n$ the approximation is either deterministic and the error is bounded by a constant $\delta$,

$$\big\|\Gamma^k_{t_u} - \widehat\Gamma^k_{t_u}\big\|_{\mathrm{op}} := \sup_{p\in\mathcal{P}_N,\, \|p\|_\infty = 1}\big|\Gamma^k_{t_u}(p) - \widehat\Gamma^k_{t_u}(p)\big| \le \delta \quad \forall k\in\mathcal{J}, \qquad\text{(GM)}$$

or the approximation is stochastic, uses $M$ samples of the underlying process, and the polynomials $p$ may have stochastic coefficients. In this case we assume the error bound

$$\big\|\Gamma^k_{t_u} - \widehat\Gamma^k_{t_u}\big\|_{\mathrm{op}} := \sup_{p\in\mathcal{P}_N,\, \|p\|^\star_\infty = 1} E\big[\big|\Gamma^k_{t_u}(p) - \widehat\Gamma^k_{t_u}(p)\big|\big] \le \delta^\star(M) \quad \forall k\in\mathcal{J} \qquad\text{(GM}^\ast\text{)}$$

with norm $\|p\|^\star_\infty = \max_{x\in\mathcal{X}} E[|p(x)|]$. In order to incorporate stochasticity of $\widehat V_{t_u}(x)$, we replace (3.1) by

$$\varepsilon_{t_u} := \max_{x\in\mathcal{X}} E\big[\big|V_{t_u}(x) - \widehat V_{t_u}(x)\big|\big]. \qquad(3.4)$$

Note that in the deterministic case (3.1) and (3.4) coincide. Additionally, a truncation error is introduced by restricting to a compact interpolation domain $\mathcal{X}$. We assume that the conditional expectation of the value function outside this set is bounded by a constant,

$$E\big[V_{t_{u+1}}(X_{t_{u+1}})\,\mathbb{1}_{\mathbb{R}^D\setminus\mathcal{X}}(X_{t_{u+1}}) \mid X_{t_u} = x^k\big] \le \varepsilon_{\mathrm{tr}}. \qquad\text{(TR)}$$

The following theorem provides an error bound for the Dynamic Chebyshev method.

Theorem 3.3. Let the DPP be given as in Definition 2.1. Assume the regularity Assumptions 3.2 hold as well as the boundedness of the truncation error (TR). Then we have

$$\varepsilon_{t_u} \le \sum_{j=u}^{n} C^{j-u}\,\varepsilon^j_{\mathrm{int}} + \Lambda_N L_f \sum_{j=u+1}^{n} C^{j-(u+1)}\,(\varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_j) \qquad(3.5)$$

with $\varepsilon_{\mathrm{gm}} = \delta$ if assumption (GM) holds and $\varepsilon_{\mathrm{gm}} = \delta^\star(M)$ if assumption (GM$^\ast$) holds, and

$$C = \Lambda_N L_f (1+\varepsilon_{\mathrm{gm}}), \qquad V_j = \max_{x\in\mathcal{X}}|V_{t_j}(x)|, \qquad \varepsilon^j_{\mathrm{int}} = \varepsilon_{\mathrm{int}}(\varrho_{t_j}, N, D, B_{t_j}).$$

Proof. The proof of the theorem can be found in the appendix.
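The role of the constant $C = \Lambda_N L_f(1+\varepsilon_{\mathrm{gm}})$ in (3.5) can be illustrated numerically. The following sketch (ours; the regularity constants $\varrho$, $B$, $V$ are illustrative assumptions) evaluates the simplified worst-case bound anticipated by Corollary 3.4 below for $D = 1$, $L_f = 1$ and $\varepsilon_{\mathrm{tr}} = 0$; its rapid growth in $n$ is what the coupling of $N$ and $n$ in Remark 3.6 compensates:

```python
import numpy as np

# Illustrative evaluation of the worst-case bound (3.5) in the simplified
# form of Corollary 3.4 (D = 1, L_f = 1, eps_tr = 0). All regularity
# constants below are assumptions chosen for illustration only.
rho, B, V, L_f = 1.5, 1.0, 1.0, 1.0

def bound(N, n, M):
    eps_gm = 1.0 / np.sqrt(M)                   # (GM*): delta*(M) ~ M^{-1/2}
    eps_int = 2**1.5 * B * rho**(-N) / np.sqrt(1 - rho**-2)   # (3.2), D = 1
    lam = 2.0 / np.pi * np.log(N + 1) + 1.0     # Lebesgue bound (3.3)
    C = lam * L_f * (1.0 + eps_gm)
    return (eps_int + eps_gm * V) * max(1.0, C) ** (n + 1)

for N, n, M in [(50, 5, 10**6), (100, 10, 10**6), (200, 20, 10**8)]:
    print(f"N={N:4d}  n={n:3d}  M={M:.0e}  bound ~ {bound(N, n, M):.2e}")
```

The bound is conservative: in the numerical experiments of Section 5 the observed errors are far smaller, since the nodal errors do not accumulate adversarially.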
The following corollary provides a simplified version of the error bound (3.5), presenting its decomposition into three different error sources (interpolation error $\varepsilon_{\mathrm{int}}$, truncation error $\varepsilon_{\mathrm{tr}}$ and the error from the numerical computation of the generalized moments $\varepsilon_{\mathrm{gm}}$).

Corollary 3.4. Let the setting be as in Theorem 3.3. Then the error is bounded by

$$\varepsilon_{t_u} \le \big(\varepsilon_{\mathrm{int}}(\varrho, N, D, B) + \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V\big)\,\widetilde{C}^{\,n+1-u} \qquad(3.6)$$

with $\widetilde{C} = \max\{1, C\}$, $\varrho = \min_{0\le u\le n}\varrho_{t_u}$, $B = \max_{0\le u\le n} B_{t_u}$, $V = \max_{u\le j\le n} V_j$. Moreover, if $\varepsilon_{\mathrm{tr}} = 0$, $L_f = 1$ and $N = N_i$, $i = 1, \ldots, D$, the error bound can be simplified further. Under (GM$^\ast$), $\delta^\star(M) \le c/\sqrt{M}$, $c > 0$, yields

$$\varepsilon_{t_u} \le \tilde{c}_1\,\varrho^{-N}\log(N)^{Dn} + \tilde{c}_2\,\log(N)^{Dn}\, M^{-0.5}$$

for some constants $\tilde{c}_1, \tilde{c}_2 > 0$. Under (GM) the term $M^{-0.5}$ is replaced by $\delta$.

Proof. Assuming $C > 1$,

$$\sum_{j=u}^{n} C^{j-u}\varepsilon^j_{\mathrm{int}} \le \varepsilon_{\mathrm{int}}\sum_{j=u}^{n} C^{j-u} = \varepsilon_{\mathrm{int}}\sum_{k=0}^{n-u} C^k = \varepsilon_{\mathrm{int}}\Big(\frac{1-C^{n+1-u}}{1-C}\Big) \le \varepsilon_{\mathrm{int}}\, C^{n+1-u},$$

where $\varepsilon_{\mathrm{int}} = \max_j \varepsilon^j_{\mathrm{int}} = \max_j \varepsilon_{\mathrm{int}}(\varrho_{t_j}, N, D, B_{t_j}) \le \varepsilon_{\mathrm{int}}(\varrho, N, D, B)$ for $\varrho = \min_{0\le u\le n}\varrho_{t_u}$ and $B = \max_{0\le u\le n} B_{t_u}$. For $C \le 1$ we obtain the bound $\varepsilon_{\mathrm{int}}\,(n+1-u)$. Similarly, we obtain for the second term in the error bound (3.5) with $\beta_j = \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_j$:

$$\Lambda_N L_f \sum_{j=u+1}^{n} C^{j-(u+1)}\beta_j \le \Lambda_N L_f\,\beta\sum_{k=0}^{n-(u+1)} C^k \le \Lambda_N L_f\,\beta\, C^{n-u} \le \beta\, C^{n+1-u},$$

where $\beta = \max_j \beta_j$. Moreover, we used $\Lambda_N L_f \le \Lambda_N L_f(1+\varepsilon_{\mathrm{gm}}) = C$ in the last step. Thus, we obtain from the error bound (3.5)

$$\varepsilon_{t_u} \le (\varepsilon_{\mathrm{int}} + \beta)\,\widetilde{C}^{\,n+1-u} = \big(\varepsilon_{\mathrm{int}}(\varrho, N, D, B) + \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V\big)\,\widetilde{C}^{\,n+1-u},$$

where $\widetilde{C} = \max\{1, C\}$ and $V = \max_j V_j$, which shows (3.6).

Furthermore, using the definition of the error bound (3.2) and $N = N_i$, $i = 1, \ldots, D$, we conclude that $\varepsilon_{\mathrm{int}}(\varrho, N, D, B) \le c_1\,\varrho^{-N}$ for a constant $c_1 > 0$. For the Lebesgue constant of the Chebyshev interpolation there exists a constant $c_2 > 0$ such that

$$\Lambda_N \le \prod_{i=1}^{D}\Big(\frac{2}{\pi}\log(N+1)+1\Big) \le \prod_{i=1}^{D}\Big(\frac{2}{\pi}+1\Big)\log(N) \le c_2\,\log(N)^D.$$

Under (GM$^\ast$), with $\delta^\star(M) \le c/\sqrt{M}$, $c > 0$, $\varepsilon_{\mathrm{tr}} = 0$ and $L_f = 1$, we obtain

$$\varepsilon_{t_u} \le \big(\varepsilon_{\mathrm{int}}(\varrho, N, D, B) + \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V\big)\big(\Lambda_N L_f(1+\varepsilon_{\mathrm{gm}})\big)^{n+1-u} \le \big(c_1\varrho^{-N} + cV M^{-0.5}\big)\big(c_2\log(N)^D(1+cM^{-0.5})\big)^{n} \le \tilde{c}_1\,\varrho^{-N}\log(N)^{Dn} + \tilde{c}_2\,\log(N)^{Dn} M^{-0.5},$$

and this converges towards zero for $N\to\infty$ if $\sqrt{M} > \log(N)^{Dn}$. If (GM) holds we have $\varepsilon_{\mathrm{gm}} = \delta$ and the term $M^{-0.5}$ is replaced by $\delta$.

The following proposition provides conditions under which the value function has an analytic extension to some generalized Bernstein ellipse and Assumptions 3.2 hold.

Proposition 3.5. Consider a DPP as defined in (2.4) and (2.5) with equidistant time-stepping and $g(t,x) = g(x)$. Let $X = (X_t)_{0\le t\le T}$ be a Markov process with stationary increments. Assume $e^{\langle\eta,\cdot\rangle} g(\cdot)\in L^1(\mathbb{R}^D)$ for some $\eta\in\mathbb{R}^D$ and that $g$ has an analytic extension to the generalized Bernstein ellipse $B(\mathcal{X}, \varrho_g)$. Furthermore, assume $f: \mathbb{R}\times\mathbb{R}\to\mathbb{R}$ has an analytic extension to $\mathbb{C}^2$. If

(i) the characteristic function $\varphi_x$ of $X_{\Delta t}$ with $X_0 = x$ is in $L^1(\mathbb{R}^D)$ for every $x\in\mathcal{X}$,
(ii) for every $z\in\mathbb{R}^D$ the mapping $x \mapsto \varphi_x(z - i\eta)$ has an analytic extension to $B(\mathcal{X}, \varrho_\varphi)$ and there are constants $\alpha\in(1,2]$ and $c_1, c_2 > 0$ such that $\sup_{x\in B(\mathcal{X},\varrho_\varphi)}|\varphi_x(z)| \le c_1 e^{-c_2|z|^\alpha}$ for all $z\in\mathbb{R}^D$,

then the value function $x \mapsto V_{t_u}(x)$ of the DPP has an analytic extension to $B(\mathcal{X}, \varrho)$ with $\varrho = \min\{\varrho_g, \varrho_\varphi\}$.

Proof. At $T$ the value function $x \mapsto V_T(x)$ is analytic since $V_T(x) = g(x)$ and $g$ has an analytic extension by assumption. Moreover, $e^{\langle\eta,\cdot\rangle} g(\cdot)\in L^1(\mathbb{R}^D)$ for some $\eta\in\mathbb{R}^D$. We assume $e^{\langle\eta,\cdot\rangle} V_{t_{u+1}}(\cdot)\in L^1(\mathbb{R}^D)$ and that $V_{t_{u+1}}$ has an analytic extension to $B(\mathcal{X}, \varrho)$.
Then the function

$$x \mapsto V_{t_u}(x) = f\big(g(x),\, E[V_{t_{u+1}}(X_{t_{u+1}}) \mid X_{t_u} = x]\big)$$

is analytic if $x \mapsto E[V_{t_{u+1}}(X_{t_{u+1}}) \mid X_{t_u} = x] = E[V_{t_{u+1}}(X^x_{\Delta t})]$ has an analytic extension. From Gaß et al. (2018, Conditions 3.1) we obtain conditions (A1)-(A4) under which a function of the form $(p_1, p_2) \mapsto E[f_{p_1}(X_{p_2})]$ is analytic. In our case we only have the parameter $p = x$ and so $X_p = X^x_{\Delta t}$. Condition (A1) is satisfied since $e^{\langle\eta,\cdot\rangle} V_{t_{u+1}}(\cdot)\in L^1(\mathbb{R}^D)$, and for (A2) we have to verify that $|\widehat{V}_{t_{u+1}}(-z - i\eta)| \le c_1 e^{c_2|z|}$ for constants $c_1, c_2 > 0$:

$$\big|\widehat{V}_{t_{u+1}}(-z-i\eta)\big| = \Big|\int_{\mathbb{R}^D} e^{i\langle y, -z-i\eta\rangle}\, V_{t_{u+1}}(y)\,\mathrm{d}y\Big| \le \int_{\mathbb{R}^D}\big|e^{-i\langle y, z\rangle}\big|\,\big|e^{\langle y,\eta\rangle} V_{t_{u+1}}(y)\big|\,\mathrm{d}y \le \big\|e^{\langle\eta,\cdot\rangle} V_{t_{u+1}}(\cdot)\big\|_{L^1},$$

and thus (A2) holds. The remaining conditions (A3)-(A4) are equivalent to our conditions (i)-(ii) and Gaß et al. (2018, Theorem 3.2) yields the analyticity of $x \mapsto E[V_{t_{u+1}}(X^x_{\Delta t})]$ on the Bernstein ellipse $B(\mathcal{X}, \varrho_\varphi)$. Hence, $x \mapsto V_{t_u}(x)$ is a composition of analytic functions and therefore analytic on the intersection of the domains of analyticity $B(\mathcal{X}, \varrho_\varphi)\cap B(\mathcal{X}, \varrho_g) = B(\mathcal{X}, \varrho)$ with $\varrho = \min\{\varrho_g, \varrho_\varphi\}$.

It remains to prove that $e^{\langle\eta,\cdot\rangle} V_{t_u}(\cdot)\in L^1(\mathbb{R}^D)$. Here the Lipschitz continuity of $f$ yields

$$\big\|e^{\langle\eta,\cdot\rangle} V_{t_u}(\cdot)\big\|_{L^1} \le L_f\Big(\big\|e^{\langle\eta,\cdot\rangle} g(\cdot)\big\|_{L^1} + \big\|e^{\langle\eta,\cdot\rangle} V_{t_{u+1}}(\cdot)\big\|_{L^1}\Big) < \infty,$$

which completes the induction over the time steps.

Remark 3.6. Assume the setup of Corollary 3.4. Moreover, assume that $\varepsilon_{\mathrm{tr}} = \varepsilon_{\mathrm{gm}} = 0$. If we let $N$ and $n$ go to infinity, we have to ensure that the error bound tends to zero. We use that $\varepsilon_{\mathrm{int}}(\varrho, N, D, B) \le C_1\varrho^{-N}$ for a constant $C_1 > 0$ and $N = \min_i N_i$. The following condition on the relation between $n$ and $N$ ensures convergence:

$$n < \frac{\log(\varrho)}{C_1\, D}\cdot\frac{N}{\log(\Lambda_N) + \log(L_f)} + 1.$$

4 Implementation

In this section we discuss several approaches to compute the generalized moments (2.7), which contain the model-dependent part. Moreover, preparing the numerical experiments, we tailor the Dynamic Chebyshev method to the pricing of American put options.

4.1 Computation of the generalized moments

Naturally, the question arises how the generalized moments (2.7) can be derived. Here, we present four different ways and illustrate all approaches in the one-dimensional case $\mathcal{X} = [\underline{x}, \overline{x}]$. Similar formulas can be obtained for a multidimensional domain.

Probability density function. For the derivation of $E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k]$, let the density function of the random variable $X_{t_{u+1}} \mid X_{t_u} = x^k$ be given as $f_{u,k}(x)$. Then the conditional expectation can be derived by solving an integral,

$$E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k] = \int_{\underline{x}}^{\overline{x}} T_j\big(\tau^{-1}_{[\underline{x},\overline{x}]}(y)\big)\, f_{u,k}(y)\,\mathrm{d}y,$$

using $p_j(y) = T_j(\tau^{-1}_{\mathcal{X}}(y))\,\mathbb{1}_{\mathcal{X}}(y)$. This approach is rather intuitive and easy to implement.
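As an illustration of the density approach (our sketch, not the paper's implementation), the following computes $\Gamma_{j,k}$ in the Black-Scholes model, where $X_{t_{u+1}}$ given $X_{t_u} = x^k$ is normal with mean $x^k + (r - \sigma^2/2)\Delta t$ and standard deviation $\sigma\sqrt{\Delta t}$; all parameter values are illustrative:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

r, sigma, dt = 0.03, 0.25, 1.0 / 252          # illustrative parameters
a, b, N = np.log(60.0), np.log(140.0), 30     # domain X = [a, b] and degree N

xk = a + 0.5 * (b - a) * (1.0 - np.cos(np.pi * np.arange(N + 1) / N))
mu = xk + (r - 0.5 * sigma**2) * dt           # conditional means
s = sigma * np.sqrt(dt)                       # conditional standard deviation

def p_j(y, j):
    """Chebyshev polynomial T_j(tau^{-1}(y)) on [a, b]."""
    z = np.clip(1.0 - 2.0 * (y - a) / (b - a), -1.0, 1.0)
    return np.cos(j * np.arccos(z))

gamma = np.empty((N + 1, N + 1))              # Gamma[j, k]
for k in range(N + 1):
    # Integrate only where the density lives; mass outside X is dropped,
    # which corresponds to the truncation error (TR).
    lo, hi = max(a, mu[k] - 10 * s), min(b, mu[k] + 10 * s)
    for j in range(N + 1):
        gamma[j, k] = quad(lambda y: p_j(y, j) * norm.pdf(y, mu[k], s),
                           lo, hi)[0]
```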
Fourier transform. Assume the process $X$ has stationary increments and the characteristic function $\varphi$ of $X_{\Delta t}$ is explicitly available. We apply Parseval's identity, see Rudin (1973), and use Fourier transforms,

$$E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k] = \int_{-\infty}^{\infty} p_j(x + x^k)\, F(\mathrm{d}x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \widehat{p}^{\,x^k}_j(\xi)\,\varphi(-\xi)\,\mathrm{d}\xi,$$

where $p^{x^k}_j(x) = p_j(x + x^k)$. Using the definition of $\tau_{[\underline{x},\overline{x}]}$, we can express the Fourier transform of $p^{x^k}_j$ with the help of the Chebyshev polynomial $T_j$. This yields

$$E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k] = \frac{1}{2\pi}\,\frac{\overline{x}-\underline{x}}{2}\int_{-\infty}^{\infty} e^{-i\xi x^k}\, e^{i\xi\frac{\underline{x}+\overline{x}}{2}}\; \widehat{T}_j\Big(\frac{\overline{x}-\underline{x}}{2}\,\xi\Big)\,\varphi(-\xi)\,\mathrm{d}\xi. \qquad(4.1)$$

The Fourier transforms $\widehat{T}_j$ of the Chebyshev polynomials are presented in Dominguez et al. (2011), and the authors also provide a Matlab implementation.

Truncated moments. In this approach, we use that each one-dimensional Chebyshev polynomial can be represented as a sum of monomials, i.e.

$$T_j(x) = \sum_{l=0}^{j} a_l\, x^l, \qquad j\in\mathbb{N}_0.$$

The coefficients $a_l$, $l = 0, \ldots, j$, can easily be derived using the chebfun function poly(), see Driscoll et al. (2014). Then,

$$E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k] = E\big[T_j(\tau^{-1}_{\mathcal{X}}(X_{t_{u+1}}))\,\mathbb{1}_{\mathcal{X}}(X_{t_{u+1}}) \mid X_{t_u} = x^k\big] = \sum_{l=0}^{j} a_l\, E\big[\big(\tau^{-1}_{\mathcal{X}}(X_{t_{u+1}})\big)^l\,\mathbb{1}_{\mathcal{X}}(X_{t_{u+1}}) \mid X_{t_u} = x^k\big].$$

As $\tau_{\mathcal{X}}$ is linear, the computation of the generalized moments has thus been reduced to deriving truncated moments.

Monte Carlo simulation. Lastly, especially in cases for which neither a probability density function nor a characteristic function of the underlying process is given, Monte Carlo simulation is a suitable choice. For every nodal point $x^k$ one simulates $N_{MC}$ paths $X^i_{t_{u+1}}$ of $X_{t_{u+1}}$ with starting value $X_{t_u} = x^k$. These simulations can then be used to approximate

$$\Gamma_{t_u,t_{u+1}}(p_j)(x^k) = E[p_j(X_{t_{u+1}}) \mid X_{t_u} = x^k] \approx \frac{1}{N_{MC}}\sum_{i=1}^{N_{MC}} p_j\big(X^i_{t_{u+1}}\big)$$

for every $j\in\mathcal{J}$. For an overview of Monte Carlo simulation from SDEs and variance reduction techniques we refer to Glasserman (2003).
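A minimal sketch of this Monte Carlo approach (ours; one Euler step of the CEV dynamics (5.1) used later in Section 5 serves as an example of a model without a closed-form density or characteristic function, and all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, N, n_mc = np.log(60.0), np.log(140.0), 30, 100_000
r, sigma, beta, dt = 0.03, 0.25, 1.5, 1.0 / 252   # illustrative CEV parameters

# Chebyshev nodes on [a, b], cf. (2.1)
xk = a + 0.5 * (b - a) * (1.0 - np.cos(np.pi * np.arange(N + 1) / N))

gamma = np.zeros((N + 1, N + 1))   # Gamma[j, k] = E[p_j(X_dt) | X_0 = x_k]
for k, x0 in enumerate(xk):
    # One Euler step of the CEV stock price started in S_0 = exp(x_k);
    # no density or characteristic function is needed for this.
    s0 = np.exp(x0)
    s1 = s0 + r * s0 * dt \
            + sigma * s0 ** (beta / 2) * np.sqrt(dt) * rng.standard_normal(n_mc)
    x1 = np.log(np.maximum(s1, 1e-12))
    z = 1.0 - 2.0 * (x1 - a) / (b - a)       # tau_X^{-1}(X_dt)
    th = np.arccos(np.clip(z, -1.0, 1.0))
    inside = np.abs(z) <= 1.0                # indicator of the domain X
    for j in range(N + 1):
        gamma[j, k] = np.mean(np.cos(j * th) * inside)
```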
4.2 Pricing American put options

In the numerical section we use the Dynamic Chebyshev method to price American put options. Assuming an asset model of the form $S_t = e^{X_t}$, the DPP becomes

$$V_T(x) = (K - e^x)^+,$$
$$V_{t_u}(x) = \max\big\{(K - e^x)^+,\; e^{-r(t_{u+1}-t_u)}\, E[V_{t_{u+1}}(X_{t_{u+1}}) \mid X_{t_u} = x]\big\}.$$

Typically, the support of the underlying process $X_t$ is $\mathbb{R}$ and restricting the domain to $\mathcal{X}$ introduces a truncation error. We reduce this error by exploiting the asymptotic behaviour of the payoff. If $X_{t_u}$ is below the exercise boundary the option is exercised at the value $K - e^{X_{t_u}}$, which we exploit for $X_{t_u} < \underline{x}$. The function $x \mapsto V_{t_u}(x)$ tends to zero from above for $x\to\infty$ and thus, for $\overline{x}$ large enough, the truncation to zero for $x > \overline{x}$ is justified. Hence, we introduce the following modification of the Dynamic Chebyshev method:

$$V_{t_{u+1}}(x) = V_{t_{u+1}}(x)\,\mathbb{1}_{\{\underline{x}\le x\le\overline{x}\}} + (K - e^x)\,\mathbb{1}_{\{x < \underline{x}\}}.$$

5 Numerical experiments

In this section, we use the Dynamic Chebyshev method to price American put options and we numerically investigate the convergence of the method. Moreover, we compare the method with the least-squares Monte Carlo (LSM) method of Longstaff and Schwartz (2001). For the convergence analysis we use three different stock price models.

The Black-Scholes model: In the classical model of Black and Scholes (1973) the stock price process is modelled by the SDE

$$\mathrm{d}S_t = r S_t\,\mathrm{d}t + \sigma S_t\,\mathrm{d}W_t,$$

where $r$ is the risk-free interest rate and $\sigma > 0$ the volatility. The log-returns $X_t = \log(S_t)$ are normally distributed and for the double truncated moments $E[X^m\,\mathbb{1}_{[a,b]}(X)]$ for $X\sim\mathcal{N}(\mu, \sigma^2)$ explicit formulas are available. Kan and Robotti (2017) present results for the (multivariate) truncated moments and provide a Matlab implementation.

The Merton jump diffusion model: The jump diffusion model introduced by Merton (1976) adds jumps to the classical Black-Scholes model. For $S_t = S_0 e^{X_t}$ the log-returns $X_t$ follow a jump diffusion with volatility $\sigma$ and added jumps arriving at rate $\lambda > 0$ with normally distributed jump sizes $\mathcal{N}(\alpha, \beta^2)$. The characteristic function of $X_t$ is given by

$$\varphi(z) = \exp\Big(t\Big(ibz - \frac{\sigma^2 z^2}{2} + \lambda\Big(e^{iz\alpha - \frac{\beta^2 z^2}{2}} - 1\Big)\Big)\Big)$$

with risk-neutral drift $b = r - \frac{\sigma^2}{2} - \lambda\big(e^{\alpha + \frac{\beta^2}{2}} - 1\big)$.

The Constant Elasticity of Variance model: The Constant Elasticity of Variance (CEV) model as stated in Schroder (1989) is a local volatility model based on the stochastic process

$$\mathrm{d}S_t = r S_t\,\mathrm{d}t + \sigma S_t^{\beta/2}\,\mathrm{d}W_t \quad\text{for } \beta > 0. \qquad(5.1)$$

Hence the stock volatility $\sigma S_t^{(\beta-2)/2}$ depends on the current level of the stock price. For the special case $\beta = 2$ the model coincides with the Black-Scholes model. However, from market data one typically observes a $\beta < 2$. The CEV model is one example of a model which has neither a probability density nor a characteristic function in closed form.

5.1 Convergence of the method

In this section we investigate the convergence of the Dynamic Chebyshev method. We price American put options along with the options' delta and gamma in the Black-Scholes and the Merton jump diffusion model, where we can use the COS method of Fang and Oosterlee (2009) as benchmark. The COS method is based on the Fourier-cosine expansion of the density function and provides fast and accurate results for the class of Lévy models.

For the experiments, we use $K = 100$ and $T = 1$ with fixed interest rate $r$ and volatility $\sigma$ in the Black-Scholes model, and $K = 100$ with fixed $r$, $\sigma$, jump parameters $\alpha$, $\beta$ and jump intensity $\lambda$ in the Merton jump diffusion model. We truncate the integration domain at $|\xi| \le 250$ and use Clenshaw-Curtis quadrature with 500 nodes for the numerical integration. For the Fourier transform of the Chebyshev polynomials the implementation of Dominguez et al. (2011) is used. We run the Dynamic Chebyshev method for an increasing number of Chebyshev nodes $N$ and evaluate the prices on a grid of initial stock prices $S_0$ equally distributed between 60% and 140% of the strike $K$. The resulting prices and Greeks are compared using the COS method as benchmark and the maximum error over the grid is calculated. Here we use the implementation provided in von Sydow et al. (2015).

Figure 5.1 shows the error decay for the Black-Scholes model (left hand side) and the Merton model (right hand side). We observe that the method converges quickly, and for $N = 300$ Chebyshev nodes the maximal error over the grid is already very small. The experiments confirm that the method can be used for an American put option.

Figure 5.1: Error decay of the Dynamic Chebyshev prices in the Black-Scholes model (left) and the Merton model (right). The conditional expectations of the Chebyshev polynomials are calculated with the Fourier transform approach.

5.2 Comparison with the Least-Squares Monte Carlo method

So far we have empirically investigated the error decay of the method for option prices and their derivatives. In this section we compare the Dynamic Chebyshev method with the least-squares Monte Carlo approach of Longstaff and Schwartz (2001) in terms of accuracy and runtime.

As a first benchmark, we use the Black-Scholes model with an interest rate of $r = 0.03$ and volatility $\sigma = 0.25$. Here, we look at a whole option price surface with varying maturities and strikes. We choose 12 different maturities between one month and four years and strikes equally distributed between 80% and 120% of the current stock price $S_0 = 100$ in steps of 5%. We fix $n = 504$ time steps (i.e. exercise rights) per year. We compare the Dynamic Chebyshev method to the least-squares Monte Carlo approach and run both methods for an increasing number of Monte Carlo paths $M$, up to $M = 80000$.
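For reference, a compact version of the Longstaff-Schwartz algorithm we compare against might look as follows (our sketch with a plain polynomial regression basis; it is not the exact configuration used in the experiments):

```python
import numpy as np

def lsm_american_put(S0, K, r, sigma, T, n_steps, n_paths, deg=4, seed=0):
    """Least-squares Monte Carlo (Longstaff-Schwartz) for an American put
    in the Black-Scholes model, with a polynomial regression basis."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Simulate GBM paths of the stock price.
    dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt + sigma * dW, axis=1))
    cash = np.maximum(K - S[:, -1], 0.0)            # exercise value at T
    for t in range(n_steps - 2, -1, -1):
        cash *= np.exp(-r * dt)                     # discount one step back
        exercise = np.maximum(K - S[:, t], 0.0)
        itm = exercise > 0                          # regress on ITM paths only
        if itm.sum() > deg + 1:
            x = S[itm, t] / K                       # rescale for conditioning
            coef = np.polyfit(x, cash[itm], deg)
            continuation = np.polyval(coef, x)
            ex_now = exercise[itm] > continuation   # optimal exercise decision
            cash[np.where(itm)[0][ex_now]] = exercise[itm][ex_now]
    return np.exp(-r * dt) * cash.mean()            # discount to time 0

price = lsm_american_put(S0=100, K=100, r=0.03, sigma=0.25,
                         T=1.0, n_steps=50, n_paths=100_000)
print(f"LSM American put price ~ {price:.3f}")
```

Note that the LSM has to re-simulate full paths and re-run the regression for every strike and maturity, whereas the Dynamic Chebyshev method reuses the precomputed conditional expectations.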
For the followingexperiments, we fix the number of Chebyshev nodes as N = √ √ M .Figure 5.2 shows the price surface and the error surface for N = 400 and M = 80000. The error was estimated by using the COS method as benchmark.We reach a maximal error below 4 ∗ − on the whole option surface.In Figure 5.3 the log -error is shown as a function of the log -runtime for bothmethods. The left figure compares the total runtimes and the right figure comparesthe offline runtime. For the Dynamic Chebyshev method the total runtime includesthe offline-phase and the online phase. The offline-phase consists of the simulation ofone time step of the underlying asset process X ∆ t for N + 1 starting values X = x k and of the computation of the conditional expectations E [ p j ( X ∆ t ) | X = x k ] for j, k = 0 , . . . , N . The online phase is the actual pricing of the American option for allstrikes and maturities. Similar, the total runtime of the Least-Square Monte-Carlomethod includes the simulation of the Monte-Carlo paths (offline-phase) and thepricing of the option via backward induction (online-phase).We observe that the Dynamic Chebyshev method reaches the same accuracy ina much lower runtime. For example, a maximum error of 0 . . s with the Dynamic Chebyshev method whereas the LSM approachneeds 98 s . This means the Dynamic Chebyshev method is nearly 200 times fasterfor the same accuracy. For the actual pricing in the online phase, the gain in effi-ciency is even higher. We observe that the Dynamic Chebyshev method outperformsthe Least-Square Monte-Carlo method in terms of the total runtime and the pureonline runtime. Moreover, we observe that the performance gain from splitting thecomputation into an offline and an online phase is much higher for the DynamicChebyshev method. For instance, in the example above the online runtime of theDynamic Chebyshev method is 0 . s whereas the LSM takes 95 s , a factor of 1900times more.The main advantage of the Dynamic Chebyshev method is that once the con-ditional expectations are calculated, they can be used to price the whole optionsurface. The pure pricing, i.e. the online phase, is highly efficient. Furthermore, oneonly needs to simulate one time step ∆ t of the underlying stochastic process insteadof the complete path. We investigate this efficiency gain by varying the number of17 Figure 5.2: Price surface and corresponding error of the Dynamic Chebyshev method in the Black-Scholes model. The conditional expectations are calculated with Monte-Carlo. -1 0 1 2 3-2.5-2-1.5-1-0.50 -1 0 1 2 3-2.5-2-1.5-1-0.50 Figure 5.3: Log-Log plot of the total/online runtime vs. accuracy. Comparison of the DynamicChebyshev method with the Least-Square Monte-Carlo algorithm. Figure 5.4: Total runtime of the DC and the LSM method for an increasing number of options (left)and an increasing number of timesteps (right). Next, we use the constant elasticity of variance (CEV) model for the underlyingstock price process. We perform the same experiments as in the last section. Theparameters in the CEV model, as in (5.1) are the following σ = 0 . , r = 0 . , β = 1 . . Similarly, we compare the Dynamic Chebyshev and the LSM method by comput-ing the prices of an option price surface. We use the same parameter specificationsfor K , T and n . We run both methods for an increasing number of Monte-Carlosimulations M and fix N = √ √ M . Figure 5.5 shows the price surface and theerror surface for N = 400 and M = 80000. 
Next, we use the constant elasticity of variance (CEV) model for the underlying stock price process. We perform the same experiments as in the last section. The parameters in the CEV model (5.1) are the same interest rate and volatility as before, together with an elasticity parameter $1 < \beta < 2$. Similarly, we compare the Dynamic Chebyshev and the LSM method by computing the prices of an option price surface. We use the same parameter specifications for $K$, $T$ and $n$. We run both methods for an increasing number of Monte Carlo simulations $M$ and fix $N = \sqrt{2M}$. Figure 5.5 shows the price surface and the error surface for $N = 400$ and $M = 80000$. The error is calculated using a binomial tree implementation of the CEV model based on Nelson and Ramaswamy (1990).

Figure 5.5: Price surface and corresponding error of the Dynamic Chebyshev method in the CEV model. The conditional expectations are calculated with Monte Carlo.

In Figure 5.6 the log-error is shown as a function of the log-runtime for both methods. The left panel compares the total runtimes and the right panel the online runtimes. Again, we observe that the Dynamic Chebyshev method is faster for the same accuracy and that it profits more from the offline-online decomposition. For example, the total runtime of the DC method to reach a given accuracy is only a fraction of the 136 s the LSM takes; for the online runtimes this outperformance is 1 s versus 122 s.

Figure 5.6: Log-log plot of the total/online runtime vs. accuracy. Comparison of the Dynamic Chebyshev method with the least-squares Monte Carlo algorithm.

Investigating this efficiency gain further, we look at the performance for different numbers of options and time steps (exercise rights). Similarly to the last section, Figure 5.7 compares the total runtime of the DC method with the total runtime of the LSM method for an increasing number of options and time steps. In both cases, the runtime of the DC method stays nearly constant whereas the runtime of the LSM method increases approximately linearly. This observation is consistent with the findings for the Black-Scholes model in the previous section.

Figure 5.7: Total runtime of the DC and the LSM method for an increasing number of options (left) and an increasing number of time steps (right).

6 Conclusion

We have introduced a new approach to price American options via backward induction by approximating the value function with Chebyshev polynomials. Thereby, the computation of the conditional expectation of the value function in each time step is reduced to the computation of conditional expectations of polynomials. The proposed method separates the pricing of an option into a part which is model-dependent (the computation of the conditional expectations) and the pure pricing of a given payoff, which becomes independent of the underlying model. The first step, the computation of the conditional expectations of the Chebyshev polynomials, is the so-called offline phase of the method. The design of the method admits several qualitative advantages:

• If the conditional expectations are set up once, we can use them for the pricing of many different options. Thus, the actual pricing in the online step becomes very simple and fast.
• In the pre-computation step one can combine the method with different techniques, such as Fourier approaches and Monte Carlo simulation. Hence the method can be applied in a variety of models.
• The proposed approach is very general and flexible and thus not restricted to the pricing of American options. It can be used to solve a large class of optimal stopping problems.
• We obtain a closed-form approximation of the option price as a function of the stock price at every time step. This approximation can be used to compute the option's sensitivities delta and gamma at no additional cost. This holds for all models and payoff profiles, even if Monte Carlo is required in the offline phase.
• The method is easy to implement and to maintain. The pre-computation step is well-suited for parallelization to speed up the method.
The pre-computation stepis well-suited for parallelization to speed-up the method.21e have investigated the theoretical error behaviour of the method and introducedexplicit error bounds. We put particular emphasis on the combination of the methodwith Monte-Carlo simulation. Numerical experiments confirm that the method per-forms well for the pricing of American options. A detailed comparison of the methodwith the Least-Square Monte-Carlo approach proposed by Longstaff and Schwartz(2001) confirmed a high efficiency gain. Especially, when a high number of options ispriced, for example a whole option price surface. In this case, the Dynamic Cheby-shev method highly profits from the offline-online decomposition. Once the condi-tional expectations are computed they can be used to price options with differentmaturities and strikes. Besides the efficiency gain, the closed-form approximationof the price function is a significant advantage because it allows us to calculate thesensitivities. Since Longstaff and Schwartz (2001) introduced their method differentmodifications have been introduced. Either to increase efficiency or to tackle thesensitivities. For example the simulation algorithm of Jain and Oosterlee (2015) iscomparable to LSM in terms of efficiency but is able to compute the Greeks at noadditional costs. Moreover dual approaches were developed to obtain upper boundsfor the option price, see Rogers (2002) and more recently Belomestny et al. (2018).The presented error analysis of the method under an analyticity assumption isthe starting point for further theoretical investigations in the case of piecewise an-alyticity and (piecewise) differentiability. The former allows to cover rigorously theAmerican option pricing problem and a preliminary version is presented in Mahlstedt(2017). The qualitative merits of the method can be exploited in a variety of ap-plications. Glau et al. (2018) take advantage of the closed-form approximation toefficiently compute the expected exposure of early-exercise options as a step in CVAcalculation. Moreover, the method can be used to price different options such asBarrier options, Swing options or multivariate American options. A Proof of Theorem 3.3 Proof. Consider a DPP as defined in Definition 2.1, i.e. we have a Lipschitz contin-uous function | f ( x , y ) − f ( x , y ) | ≤ L f ( | x − x | + | y − y | ) . Assume that the regularity Assumption 3.2 and the assumption on the truncationerror (TR) hold. Then we have to distinguish between the deterministic case (GM)and the stochastic case (GM ∗ ). In the first case, the expectation in the error boundcan simply be ignored. First, we apply Proposition 3.1. At time point T there is norandom part and no distortion error. Thus,max x ∈X E h | V T ( x ) − b V T ( x ) | i = max x ∈X | V T ( x ) − b V T ( x ) | ≤ ε int ( ̺ t n , N, D, M t n ) . For the ease of notation we will from know on write ε jint = ε int ( ̺ t j , N, D, M t j ). Weobtain for the error at t u max x ∈X E h | V t u ( x ) − b V t u ( x ) | i ≤ ε uint + Λ N F ( f, t u )(A.1) 22ith maximal distortion error F ( f, t u ) = max k ∈J E h | V t u ( x k ) − b V t u ( x k ) | i . Note thatwhether (GM) or (GM ∗ ) hold an approximation error of the conditional expectationof b V t u +1 is made, i.e. E [ b V t u +1 ( X t u +1 ) | X t u = x k ] = Γ kt u ( b V t u +1 ) ≈ b Γ kt u ( b V t u +1 ). 
The Lipschitz continuity of $f$ yields

$$\big|V_{t_u}(x^k) - \widehat V_{t_u}(x^k)\big| = \Big|f\big(g(t_u, x^k),\, \Gamma^k_{t_u}(V_{t_{u+1}})\big) - f\big(g(t_u, x^k),\, \widehat\Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big)\Big| \le L_f\Big(\big|g(t_u, x^k) - g(t_u, x^k)\big| + \big|\Gamma^k_{t_u}(V_{t_{u+1}}) - \widehat\Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big|\Big)$$
$$= L_f\,\big|\Gamma^k_{t_u}(V_{t_{u+1}}) - \widehat\Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big| \le L_f\Big(\big|\Gamma^k_{t_u}(V_{t_{u+1}}\mathbb{1}_{\mathcal{X}}) - \Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big| + \big|\Gamma^k_{t_u}(V_{t_{u+1}}\mathbb{1}_{\mathbb{R}^D\setminus\mathcal{X}})\big| + \big|\Gamma^k_{t_u}(\widehat V_{t_{u+1}}) - \widehat\Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big|\Big).$$

Next, we consider the expectation of each of the three error terms. For the first term we obtain

$$E\Big[\big|\Gamma^k_{t_u}(V_{t_{u+1}}\mathbb{1}_{\mathcal{X}}) - \Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big|\Big] = E\Big[\big|E[V_{t_{u+1}}(X_{t_{u+1}})\mathbb{1}_{\mathcal{X}}(X_{t_{u+1}}) - \widehat V_{t_{u+1}}(X_{t_{u+1}}) \mid X_{t_u} = x^k]\big|\Big] \le \max_{x\in\mathcal{X}} E\big[|V_{t_{u+1}}(x) - \widehat V_{t_{u+1}}(x)|\big] = \varepsilon_{t_{u+1}},$$

and for the second term we have

$$E\Big[\big|\Gamma^k_{t_u}(V_{t_{u+1}}\mathbb{1}_{\mathbb{R}^D\setminus\mathcal{X}})\big|\Big] \le E[\varepsilon_{\mathrm{tr}}] = \varepsilon_{\mathrm{tr}}.$$

For the last term we have to distinguish two cases. If we assume (GM) holds, the operator norm yields

$$\big|\Gamma^k_{t_u}(\widehat V_{t_{u+1}}) - \widehat\Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big| = \big|\big(\Gamma^k_{t_u} - \widehat\Gamma^k_{t_u}\big)\big(\widehat V_{t_{u+1}}\big)\big| \le \big\|\Gamma^k_{t_u} - \widehat\Gamma^k_{t_u}\big\|_{\mathrm{op}}\,\big\|\widehat V_{t_{u+1}}\big\|_\infty \le \delta\,\big\|\widehat V_{t_{u+1}}\big\|_\infty.$$

Next, we consider the second case and assume that (GM$^\ast$) holds. Then we have

$$E\Big[\big|\Gamma^k_{t_u}(\widehat V_{t_{u+1}}) - \widehat\Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big|\Big] \le \big\|\Gamma^k_{t_u} - \widehat\Gamma^k_{t_u}\big\|_{\mathrm{op}}\,\big\|\widehat V_{t_{u+1}}\big\|^\star_\infty \le \delta^\star(M)\,\big\|\widehat V_{t_{u+1}}\big\|^\star_\infty.$$

Hence in either case the following bound holds:

$$E\Big[\big|\Gamma^k_{t_u}(\widehat V_{t_{u+1}}) - \widehat\Gamma^k_{t_u}(\widehat V_{t_{u+1}})\big|\Big] \le \varepsilon_{\mathrm{gm}}\,\max_{x\in\mathcal{X}} E\big[|\widehat V_{t_{u+1}}(x)|\big]$$

with $\varepsilon_{\mathrm{gm}} = \delta$ if assumption (GM) holds and $\varepsilon_{\mathrm{gm}} = \delta^\star(M)$ if assumption (GM$^\ast$) holds. We need an upper bound for the maximum of the Chebyshev approximation:

$$\max_{x\in\mathcal{X}} E\big[|\widehat V_{t_{u+1}}(x)|\big] \le \max_{x\in\mathcal{X}} E\big[|\widehat V_{t_{u+1}}(x) - V_{t_{u+1}}(x)|\big] + \max_{x\in\mathcal{X}}|V_{t_{u+1}}(x)| \le \varepsilon_{t_{u+1}} + V_{u+1}$$

with $V_{u+1} := \max_{x\in\mathcal{X}}|V_{t_{u+1}}(x)|$. Hence, the error bound (A.1) becomes

$$\varepsilon_{t_u} \le \varepsilon^u_{\mathrm{int}} + \Lambda_N L_f\big((1+\varepsilon_{\mathrm{gm}})\,\varepsilon_{t_{u+1}} + \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_{u+1}\big).$$

By induction, we now show (3.5). For $u = n$ we have $\varepsilon_{t_n} \le \varepsilon^n_{\mathrm{int}}$ and therefore (3.5) holds for $u = n$. We assume that equation (3.5) holds for $n, \ldots, u+1$.
For the error $\varepsilon_{t_u}$ we obtain

$$\varepsilon_{t_u} \le \varepsilon^u_{\mathrm{int}} + \Lambda_N L_f\big((1+\varepsilon_{\mathrm{gm}})\varepsilon_{t_{u+1}} + \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_{u+1}\big)$$
$$\le \varepsilon^u_{\mathrm{int}} + \Lambda_N L_f\Big((1+\varepsilon_{\mathrm{gm}})\Big(\sum_{j=u+1}^{n} C^{j-(u+1)}\varepsilon^j_{\mathrm{int}} + \Lambda_N L_f\sum_{j=u+2}^{n} C^{j-(u+2)}(\varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_j)\Big) + \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_{u+1}\Big)$$
$$= \varepsilon^u_{\mathrm{int}} + C\sum_{j=u+1}^{n} C^{j-(u+1)}\varepsilon^j_{\mathrm{int}} + \Lambda_N L_f\Big(C\sum_{j=u+2}^{n} C^{j-(u+2)}(\varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_j) + \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_{u+1}\Big)$$
$$= \varepsilon^u_{\mathrm{int}} + \sum_{j=u+1}^{n} C^{j-u}\varepsilon^j_{\mathrm{int}} + \Lambda_N L_f\Big(\sum_{j=u+2}^{n} C^{j-(u+1)}(\varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_j) + \varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_{u+1}\Big)$$
$$= \sum_{j=u}^{n} C^{j-u}\varepsilon^j_{\mathrm{int}} + \Lambda_N L_f\sum_{j=u+1}^{n} C^{j-(u+1)}(\varepsilon_{\mathrm{tr}} + \varepsilon_{\mathrm{gm}} V_j),$$

which was our claim.

References

Belomestny, D., S. Häfner, and M. Urusov (2018). Regression-based complexity reduction of the nested Monte-Carlo methods. SIAM Journal on Financial Mathematics 9(2), 665-689.

Black, F. and M. Scholes (1973). The pricing of options and corporate liabilities. Journal of Political Economy 81(3).

Brennan, M. J. and E. S. Schwartz (1977). The valuation of American put options. The Journal of Finance 2(32), 449-462.

Cai, Y. and K. L. Judd (2013). Shape-preserving dynamic programming. Mathematical Methods of Operations Research 77(3), 407-421.

Dominguez, V., I. Graham, and V. Smyshlyaev (2011). Stability and error estimates for Filon-Clenshaw-Curtis rules for highly oscillatory integrals. IMA Journal of Numerical Analysis 31(4), 1253-1280.

Driscoll, T. A., N. Hale, and L. N. Trefethen (2014). Chebfun guide.

Fang, F. and C. W. Oosterlee (2009). Pricing early-exercise and discrete barrier options by Fourier-cosine series expansions. Numerische Mathematik 114(1), 27.

Gaß, M., K. Glau, M. Mahlstedt, and M. Mair (2018). Chebyshev interpolation for parametric option pricing. Finance and Stochastics. https://doi.org/10.1007/s00780-018-0361-y.

Glasserman, P. (2003). Monte Carlo Methods in Financial Engineering, Volume 53. Springer Science & Business Media.

Glau, K., R. Pachon, and C. Pötz (2018). Chebyshev interpolation for the fast calculation of Credit Valuation Adjustments. Working Paper.

Haasdonk, B., J. Salomon, and B. Wohlmuth (2013). A reduced basis method for the simulation of American options. In Numerical Mathematics and Advanced Applications 2011, pp. 821-829. Springer.

Haentjens, T. and K. J. in 't Hout (2015). ADI schemes for pricing American options under the Heston model. Applied Mathematical Finance 22(3), 207-237.

Hilber, N., N. Reich, C. Winter, and C. Schwab (2013). Computational Methods for Quantitative Finance. Springer.

Jain, S. and C. W. Oosterlee (2015). The stochastic grid bundling method: Efficient pricing of Bermudan options and their Greeks. Applied Mathematics and Computation 269, 412-431.

Judd, K. L. (1998). Numerical Methods in Economics. MIT Press.

Kan, R. and C. Robotti (2017). On moments of folded and truncated multivariate normal distributions. Journal of Computational and Graphical Statistics 26(4), 930-934.

Levendorskiĭ, S. (2004). Pricing of the American put under Lévy processes. International Journal of Theoretical and Applied Finance 7(3), 303-335.

Longstaff, F. A. and E. S. Schwartz (2001). Valuing American options by simulation: A simple least-squares approach. Review of Financial Studies 14(1), 113-147.

Lord, R., F. Fang, F. Bervoets, and C. W. Oosterlee (2008). A fast and accurate FFT-based method for pricing early-exercise options under Lévy processes. SIAM Journal on Scientific Computing 30(4), 1678-1705.

Mahlstedt, M. (2017). Complexity Reduction for Option Pricing. Ph.D. thesis, Technische Universität München.
Merton, R. C. (1976). Option pricing when underlying stock returns are discontinuous. Journal of Financial Economics 3, 125-144.

Nelson, D. B. and K. Ramaswamy (1990). Simple binomial processes as diffusion approximations in financial models. The Review of Financial Studies 3(3), 393-430.

Peskir, G. and A. Shiryaev (2006). Optimal Stopping and Free-Boundary Problems. Springer.

Rogers, L. C. (2002). Monte Carlo valuation of American options. Mathematical Finance 12(3), 271-286.

Rudin, W. (1973). Functional Analysis. McGraw-Hill Book Co.

Sauter, S. and C. Schwab (2010). Boundary Element Methods. Translated and expanded from the 2004 German original, Volume 39 of Springer Series in Computational Mathematics. Springer.

Schroder, M. (1989). Computing the constant elasticity of variance option pricing formula. The Journal of Finance 44(1), 211-219.

Trefethen, L. N. (2013). Approximation Theory and Approximation Practice. SIAM.

von Sydow, L., L. Josef Höök, E. Larsson, E. Lindström, S. Milovanović, J. Persson, V. Shcherbakov, Y. Shpolyanskiy, S. Sirén, J. Toivanen, et al. (2015). BENCHOP - The BENCHmarking project in Option Pricing.