Operator Methods, Abelian Processes and Dynamic Conditioning
CLAUDIO ALBANESE
Abstract.
A mathematical framework for Continuous Time Finance based on operator algebraic methods offers a new, direct and entirely constructive perspective on the field. It also leads to new numerical analysis techniques which can take advantage of the emerging massively parallel GPU architectures, which are uniquely suited to execute large matrix manipulations. This is partly a review paper, as it covers and expands on the mathematical framework underlying a series of more applied articles. In addition, this article also presents a few key new theorems that make the treatment self-contained. Stochastic processes with continuous time and continuous space variables are defined constructively by establishing new convergence estimates for Markov chains on simplicial sequences. We emphasize high precision computability by numerical linear algebra methods as opposed to the ability to arrive at analytically closed form expressions in terms of special functions. Path dependent processes adapted to a given Markov filtration are associated to an operator algebra. If this algebra is commutative, the corresponding process is named Abelian, a concept which provides a far reaching extension of the notion of stochastic integral. We recover the classic Cameron-Dyson-Feynman-Girsanov-Ito-Kac-Martin theorem as a particular case of a broadly general block-diagonalization algorithm. This technique has many applications, ranging from the problem of pricing cliquets to target redemption notes and volatility derivatives. Non-Abelian processes are also relevant and appear in several important applications, for instance to snowballs and soft calls. We show that in these cases one can effectively use block-factorization algorithms. Finally, we discuss the method of dynamic conditioning, which allows one to dynamically correlate possibly even hundreds of processes in a numerically noiseless framework while preserving marginal distributions.
Contents
1. Introduction
2. Measure Theory on Simplicial Sequences
3. Path Functionals
4. n-point Functions
5. Markov Processes
6. Martingales and Monotonic Processes
7. The Fundamental Theorem of Finance
8. Weak Convergence of Markov Generators
9. Fast Exponentiation and Spectral Methods
10. Construction of Brownian Motion
11. Kernel Convergence Estimates for Diffusion Processes
12. Convergence of Time Discretization Schemes
13. Hypergeometric Brownian Motion
14. Stochastic Integrals for Diffusion Processes
15. Markov Bridges
16. Abelian Processes in Continuous Time
Date: First version December 15th, 2006, last revision October 31, 2018.
I would like to thank my collaborators in the past 8 years, without whom this work would not have been possible. In particular Joseph Campolieti, Peter Carr, Oliver Chen, Alexei Kuznetsov, Sebastian Jaimungal, Paul Jones, Harry Lo, Stephan Lawi, Alex Lipton, Alex Mijatovic, Adel Osseiran, Dmitri Rubisov, Stathis Tompaidis, Manlio Trovato and Alicia Vidler. Special thanks go to Paul Jones and Adel Osseiran for reading previous versions of this manuscript and correcting errors. All remaining mistakes are the sole responsibility of the author.
17. Abelian Processes in Discrete Time and non-Resonance Conditions
18. Univariate Moment Expansions on Bridges
19. Bivariate Moment Expansions on Bridges
20. Block Factorizations
21. Dynamic Conditioning
22. Conclusion
References
1. Introduction
The goal of this paper is to attempt to consolidate and present a number of mathematical methods developed over several years by myself and collaborators while addressing concrete problems in derivative pricing theory. The results scattered across a number of papers which are collected here have been complemented with a rigorous ab initio treatment and a few key theorems which make the framework mathematically self-contained. This results in a quite comprehensive approach to the theory of Stochastic Processes and Mathematical Finance which is novel in that it is fully constructive and perhaps has applications beyond the realm of Financial Engineering.

There are several traditions of Constructive Mathematics. One attempts to re-derive classical results of real and functional analysis based on a restrictive constructivist logic according to which no mathematical object can be considered unless one specifies explicitly how to construct it, see (Bridges and Richman 1987) and (Bishop 1967). Along another tradition, Constructive Field Theory, see (Glimm and Jaffe 1987), aimed at establishing the existence of interacting quantum field theories by providing a constructive procedure for computing n-point functions and demonstrating that they satisfy a set of axioms. Measure theoretic probability and the related theory of stochastic processes, (Doob 1953), does not seem to be understandable constructively. The PDE approach in (Feller 1961) and the harmonic analysis approach in (Bochner 1955) are instead essentially constructive, but do not delve into the theory of stochastic integrals and path dependent processes and into lattice discretization schemes.

The main motivation that guided this research is the creation of an engineering framework for exotic financial derivatives. Efficient computability on current hardware has been and remains throughout this article our key motivating concern.
To this end, we work towards an algebraization of Probability Theory that reduces all calculations to matrix manipulations which can be performed efficiently, and in particular to matrix multiplications. Similarly to the standard framework of algebraic topology, (Spanier 1966), we consider processes taking values in separable topological spaces and approximate continuous domains by means of simplicial sequences. To establish convergence in the continuous limit, we directly estimate convergence rates for probability transition kernels in the continuous space limit, following an approach similar in spirit to Constructive Lattice Field Theory, see (Glimm and Jaffe 1987). Similarly to constructive field theory, sets of axioms on n-point functions are used to identify processes and renormalization group transformations are used to control the continuous limit. Following (Naimark 1959), the approach is grounded upon the algebraic theory of integration on locally compact Hausdorff separable topological spaces.

Calculations with stochastic processes are carried out using operator methods developed in Quantum Mechanics, (Landau and Lifshits 1977), and systematized in Mathematical Physics references such as (Reed and Simon 1980). In Finance, operator methods have been developed along two independent and non-overlapping streams of research: one by Ait-Sahalia, Hansen and Scheinkman, who focused on econometric estimations in a series of papers reviewed in (Ait-Sahalia et al.), and the other by the author and collaborators in (Albanese and Chen 2004a), (Albanese and Chen 2004b), (Albanese and Kusnetsov 2005), (Albanese and Lawi 2004), (Albanese et al.), (Albanese and Trovato 2006), (Albanese and Trovato 2005), (Albanese and Vidler 2007), (Albanese and Jones 2007) and (Albanese and Osseiran 2007).
In this paper, we attempt to systematize the mathematical framework of pricing theory in the operator formalism from our own viewpoint, reserving to future work the task of pursuing overlaps with the econometric literature.

The references quoted above are all relevant to our undertaking and provided motivation on many levels. However, in an effort to keep this writing self-contained, we are not going to assume any previous knowledge of the reader.

To ground the mathematical framework, we obtain sharp pointwise convergence estimates for probability kernels and their derivatives. More precisely, we show that probability kernels converge pointwise at rates of order $O(h^2)$, where $h$ is the lattice spacing. The result applies to a large class of diffusion processes and extensions thereof, including smooth space inhomogeneities, regime switching, finite activity jumps and some degree of time inhomogeneity. We also prove similar convergence results for the fast exponentiation method, our preferred numerical method for exponentiating Markov generators, showing that errors in this scheme are also of order $O(h^2)$, in the sense of pointwise convergence for the probability kernel. In the particular case of Brownian motion, we show that similar $O(h^2)$ pointwise error estimates apply also to derivatives of the probability kernels and that the power 2 in the $O(h^2)$ bounds is actually sharp.

The interest in convergence estimates was prompted by the desire to understand the mechanisms behind the empirically observed smoothness and robustness in the calculation of price sensitivities with the methods in our applied papers. We also observed empirically that the fast exponentiation algorithm is stable under single precision floating point arithmetic. We find that the key to a high precision numerical framework for sensitivities is handling the time coordinate either as continuous or very finely discretized.
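The flavor of these kernel estimates can be previewed with a small numerical experiment (an illustration of ours, not the paper's proof): exponentiating the discrete Laplacian on lattices of spacing $h$, the heat kernel at the origin approaches the Gaussian density with an error that shrinks by roughly a factor of four each time $h$ is halved, consistent with an $O(h^2)$ rate.

```python
import numpy as np

def lattice_kernel_at_origin(h, t=1.0):
    """Heat kernel density at (0, 0; t) for the generator (1/2)Δ_h on the
    truncated lattice hZ ∩ [-6, 6], via exact spectral exponentiation."""
    x = np.arange(-6.0, 6.0 + h / 2, h)
    n = len(x)
    lap = (np.eye(n, k=1) + np.eye(n, k=-1) - 2 * np.eye(n)) / h ** 2
    w, V = np.linalg.eigh(0.5 * lap)         # symmetric, so eigh is exact
    U = (V * np.exp(t * w)) @ V.T            # U = exp(t L)
    i0 = n // 2                              # the site x = 0
    return U[i0, i0] / h                     # kernel density at the origin

exact = 1.0 / np.sqrt(2.0 * np.pi)           # Gaussian density at 0 for t = 1
errors = [abs(lattice_kernel_at_origin(h) - exact) for h in (0.4, 0.2, 0.1)]
```

The Dirichlet truncation at $\pm 6$ is harmless for $t = 1$ since the Gaussian mass beyond it is negligible; the observed error ratios are close to 4, the signature of a second order rate.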
A sufficiently fine discretization is defined as one for which explicit differentiation schemes are stable, and typically corresponds to an hourly time scale in applications. Typically, weekly time steps would permit stable implicit differentiation schemes but would not allow for as much stability in the calculation of price sensitivities and of the probability kernels we require to evaluate and manipulate. This motivates us to avoid implicit differentiation schemes on coarse time intervals.

Measure changes and time changes are defined constructively and a version of the Fundamental Theorem of Finance is re-obtained. The Cameron-Martin-Girsanov theorem, see (Cameron and Martin 1949), and Ito's lemma, see (Ito 1949), are proved twice in different ways with operator methods. We also derive the Feynman-Kac formula, see (Feynman 1948) and (Kac 1948). One of the key results is an extension of the Feynman-Kac-Ito formula in three different directions. This formula concerns the characteristic function of a stochastic integral over a diffusion process. In our formalism, this formula becomes a block-diagonalization algorithm for large matrices associated to path-dependent processes. The extension we discuss (i) covers Markov processes more general than diffusions, (ii) allows for a class of path-dependent processes we name Abelian which extends the notion of stochastic integral, and (iii) generalizes the harmonic analysis framework to include extensions of trigonometric Fourier transforms. The theory of Abelian processes finds numerous practical applications to path dependent options and is applicable to the great majority of path-dependent payoffs, from volatility swaps to cliquets, range accruals, lookback options, target redemption notes and more. We also give a version of Dyson's formula to accelerate the pricing of path-dependent options given by Abelian processes by means of a moment expansion.
Non-Abelian processes are more difficult to handle, but we single out a class admitting block-factorizations (as opposed to a block-diagonalizing transformation) which is also amenable to numerical analysis by matrix algebra. Finally, we illustrate the method of dynamic conditioning, which allows one to correlate possibly numerous processes by means of kernel manipulations while preserving marginals and without incurring dimensional explosion.

The mathematical methods in this article are particularly efficient as they lend themselves to transparent hardware acceleration on the emerging multi-core GPU hardware platforms. These massively parallel architectures are based on low-cost technologies that have been developed
for the games and high definition markets and are uniquely suited to implement BLAS Level 3 routines such as matrix-matrix multiplications with high efficiency. See also (Goto and van de Geijn, to appear) for a state of the art account of matrix-matrix multiplication software on CPUs.

The paper is organized as follows. In Section 2, we introduce the notion of simplicial sequence, which is key to devising approximation schemes for continuous valued processes by means of a sequence of Markov chains. In Section 3, we consider a general definition of path functional and in Section 4 we give a general description of a stochastic process by means of an n-point function. Markov Processes are introduced in Section 5, martingales and monotonic processes in Section 6. In Section 7, we derive the Fundamental Theorem of Arbitrage Free Pricing Theory. The classical results on weak convergence of Markov generators by Bernstein, Bochner, Kyntchine and Levy are re-obtained in Section 8. Time homogeneous Markov Processes, fast exponentiation and spectral methods are described in Section 9. In Section 10, we carry out a constructive analysis of Brownian motion and prove convergence estimates. In Section 11, we study the spectrum of diffusion generators. Sharp pointwise kernel estimates are extended to general diffusion processes in Section 12. In Section 13, we give estimates for the convergence rate of time discretisation schemes of the type we advocate for applications, i.e. based on fast exponentiation. Section 14 reviews the derivation of hypergeometric Brownian motion and particular cases such as the CEV model. In Section 15, we study stochastic integrals and obtain Ito's formula for diffusion processes and Girsanov's theorem. Section 16 contains a derivation of the Feynman-Kac formula for bridges over general Markov processes. The general notion of Abelian process in continuous time is introduced in Section 17. In Section 18, we discuss the discrete time case.
In this section, we introduce also the notion of non-resonant block-diagonalisation scheme, which provides a numerically useful extension of Fourier analysis based on trigonometric functions. Dyson's formula and moment expansions are in Section 19, covering the univariate case, and Section 20, covering the multivariate case. These two sections include applications to exotic volatility derivatives. Block factorizations and applications to snowballs and soft calls are in Section 21. Dynamic conditioning and multi-factor correlation modeling is discussed in Section 22. Conclusions end the paper.

2. Measure Theory on Simplicial Sequences
Let $d \geq 1$, consider $\mathbb{R}^d$ and the sequence of lattices $h_m \mathbb{Z}^d$ where $h_m = 2^{-m}$.

Definition 1. (Lattices.) If $A \subset h_m \mathbb{Z}^d$, the convex hull of $A$ in $\mathbb{R}^d$ is denoted by $\mathrm{Hull}(A)$. The interior of $A \subset h_m \mathbb{Z}^d$ is denoted by $\mathrm{Int}(A)$ and is defined as the set of all sites $x \in A$ contained in $A$ along with each one of its neighbors in $h_m \mathbb{Z}^d$ at distance $h_m$.

Definition 2. (Simplicial Sequences.)
A bounded simplicial sequence is given by an integer $m_0 > 0$ and a sequence of subsets $A_m \subset h_m \mathbb{Z}^d$ defined for all $m > m_0$ such that
• $\mathrm{Hull}(A_m) \subset \mathrm{Hull}(A_{m'})$ whenever $m < m'$;
• for all $m > 0$ and all internal points $x \in \mathrm{Int}(A_m)$ there are $M > 0$ and $\epsilon > 0$ such that, for all $m' > M$ and for all $y \in h_{m'} \mathbb{Z}^d$ with $d(x, y) < \epsilon$, we have that $y \in A_{m'}$.

We follow a constructivist logic paradigm according to which, in order to identify a set or a sequence of sets, one has to explicitly state how to construct it, possibly with a recursive algorithm, and it must be possible to decide each step of the recursion in a finite number of logical steps.
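As a concrete illustration of Definitions 1 and 2 in dimension $d = 1$ (a sketch of ours; the helper names `lattice` and `is_refinement` are not from the paper, and the second check is a simplified dyadic stand-in for the second bullet), the nested lattices $A_m = h_m \mathbb{Z} \cap [-1, 1]$ can be verified numerically:

```python
import numpy as np

def lattice(m):
    """Sites of A_m = h_m Z intersected with [-1, 1], where h_m = 2**-m."""
    h = 2.0 ** -m
    return np.arange(-2 ** m, 2 ** m + 1) * h

def is_refinement(m, m_prime):
    """Check Hull(A_m) ⊂ Hull(A_m') and that every site of A_m survives
    as a site of the finer lattice A_m' (for m < m')."""
    coarse, fine = lattice(m), lattice(m_prime)
    hull_ok = fine.min() <= coarse.min() and coarse.max() <= fine.max()
    sites_ok = all(np.any(np.isclose(fine, x)) for x in coarse)
    return hull_ok and sites_ok
```

Since $h_{m'} = 2^{-m'}$ divides $h_m$ for $m' > m$, every coarse site is a fine site and the refinement property holds by construction.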
Definition 3. (Lattice Functions.)
Let $A = (A_m)$, $m \geq m_0$, be a bounded simplicial sequence. A real valued simplicial function, denoted by $f: A \to \mathbb{R}$, is defined as a sequence of functions $f_m: A_m \to \mathbb{R}$ such that $f_{m'}(x) = f_m(x)$ for all $m' > m \geq m_0$ and all $x \in \mathrm{Int}(A_m)$. The function $f: A \to \mathbb{R}$ is said to be uniformly bounded if there is a constant $c > 0$ such that $|f_m(x)| < c$ for all $x \in A_m$. The function $f: A \to \mathbb{R}$ is uniformly continuous if it is uniformly bounded and for all $\epsilon > 0$ there is a $\delta > 0$ such that if $x, y \in h_m \mathbb{Z}^d$ and $d(x, y) < \delta$ we have that $|f(x) - f(y)| < \epsilon$.
Definition 4. (Equivalence of Lattice Functions.)
Let $A = (A_m)$, $m \geq m_0$, be a bounded simplicial sequence and let $f = (f_m)$ and $g = (g_m)$ be two uniformly continuous real valued functions on $A$. We say that these sequences provide equivalent representations of the same function in case, for all $m > 0$ and all $x \in h_m \mathbb{Z}^d$, we have that $\lim_{m' \to \infty} |f_{m'}(x) - g_{m'}(x)| = 0$.

Let $C(A)$ be the set of all continuous functions on $A$ endowed with the natural structure of $C^*$ algebra given by the operations of sum, multiplication by a scalar and pointwise multiplication.

Definition 5. (Integrals.)
Let $A = (A_m)$, $m \geq m_0$, be a bounded simplicial sequence. An integral is given by a sequence $I = (I_m)$ where $I_m$ is a linear functional on the linear space of functions $f_m: A_m \to \mathbb{R}$ such that
• $I_m(f_m) \geq 0$ whenever $f_m(x) \geq 0$ for all $x \in A_m$;
• the following limit exists for all continuous functions $f = (f_m) \in C(A)$:
(2.1) $I(f) \equiv \lim_{m \to \infty} I_m(f_m)$;
• the functionals $I_m$ defined above satisfy the following bound:
(2.2) $I_m(f_m) \leq c \, ||f_m||_\infty$
for some constant $c > 0$ and all functions $f_m$.
An integral is said to correspond to a probability measure if $I_m(1) = 1$.

Definition 6. (Equivalent Integrals.)
Let $A = (A_m)$, $m \geq m_0$, be a bounded simplicial sequence and let $(I_m)$ and $(J_m)$ be two integrals. We say that these sequences provide equivalent representations of the same integral $I$ if $\lim_{m \to \infty} |I_m(f_m) - J_m(f_m)| = 0$ for all uniformly continuous functions $(f_m)$.

Given an integral $I$ on the simplicial sequence $A$, for all $p \geq$
1, one defines the following semi-norm on the space of continuous functions $C(A)$:
(2.3) $||f||_p = (I(|f|^p))^{1/p}$.
Let $L^p(A; I)$ be the linear space obtained by completing $C(A)$ with respect to the semi-norm $||f||_{I,p}$ and identifying equivalence classes of functions at zero distance. More precisely, $L^p(A; I)$ is the linear space of the Cauchy sequences $(f_m)$ in $C(A)$ with respect to the norm $||f||_{I,p}$, modulo the linear space of the Cauchy sequences converging to a limit of zero $L^p$-norm.

Definition 7. (Summable Functions.)
A function is called summable if it is in $L^1(A; I)$ and square summable if it is in $L^2(A; I)$.

Theorem 1. (Monotonic Sequences.)
Let $f_k$ be a non-decreasing sequence of summable functions and consider the function $f = \lim_{k \to \infty} f_k$ and the limit $\bar{I} = \lim_{k \to \infty} I(f_k)$. Then either $\bar{I} = \infty$ or $f$ is summable and $\bar{I} = I(f)$.

Proof. If
$\bar{I} < \infty$, then whenever $k > j$ we have that $||f_k - f_j||_1 = I(f_k - f_j)$. Since the sequence $I(f_k)$ is non-decreasing and bounded above by $\bar{I}$, it is Cauchy. Hence the sequence $f_k$ is Cauchy in $L^1(A; I)$. Its limit $f$ therefore belongs to $L^1(A; I)$ since this space is complete. $\square$

Definition 8. (Measurable Functions.)
A function is called measurable if it can be represented as the limit of a non-decreasing sequence of summable functions.
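At any fixed level $m$ these notions are elementary finite-dimensional ones: an integral $I_m$ is a positive linear functional on finitely many lattice sites. A toy sketch of Definition 5's requirements (the lattice $A_m = h_m\mathbb{Z} \cap [0, 1)$ and the Riemann-sum weights are our own illustration):

```python
import numpy as np

def I_m(f, m):
    """Riemann-sum functional on A_m = h_m Z ∩ [0, 1), h_m = 2**-m.
    It is positive, satisfies I_m(f) <= ||f||_inf so that (2.2) holds
    with c = 1, and has total mass I_m(1) = 2**m * h_m = 1, i.e. it
    corresponds to a probability measure."""
    h = 2.0 ** -m
    x = np.arange(0, 2 ** m) * h
    return float(np.sum(f(x)) * h)
```

The limit (2.1) recovers the ordinary integral: for $f(x) = x^2$ the values $I_m(f)$ approach $1/3$ as $m$ grows, at a rate set by the lattice spacing.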
Theorem 2. (Dominated Convergence.)
Let $f_k$ be a sequence of measurable functions, suppose there is a summable function $g$ such that $|f_k(x)| \leq |g(x)|$ and suppose also that the limit $f(x) = \lim_{k \to \infty} f_k(x)$ exists in a pointwise sense. Then $f$ is a summable function and $I(f) = \lim_{k \to \infty} I(f_k)$.

Proof. Consider the sequence $u_k$ constructed iteratively so that $u_0 = \min(f_0, f)$ and $u_k = \max(u_{k-1}, \min(f_k, f))$. The sequence $u_k$ is non-decreasing and converges to $f(x)$. Furthermore, $I(|u_k|) \leq I(|g|)$ for all $k$. Hence $f$ is summable and $I(f) = \lim_{k \to \infty} I(f_k)$. $\square$
Let $V$ be the linear space of all functions which can be expressed as the limit of a non-decreasing sequence of continuous functions $f(x) = \lim_{k \to \infty} f_k(x)$ such that the following norm is finite:
(2.4) $||f||_\infty = \lim_{k \to \infty} ||f_k||_\infty$.
$L^\infty(A; I)$ is defined as the completion of the linear space $V$ with respect to the uniform norm.

Definition 9. (Essentially Bounded Functions.)
A function is called essentially bounded if it is in $L^\infty(A; I)$.

Definition 10. (Absolute Continuity.)
Let $I$ and $J$ be two integrals on the simplicial sequence $A = (A_m)$. $I$ is said to be absolutely continuous with respect to $J$ if these two integrals admit simplicial representations $I = (I_m)$ and $J = (J_m)$ and there exists a $J$-summable function $g = (g_m)$ such that $I_m(f_m) = J_m(g_m f_m)$ for all $I$-summable functions $f = (f_m)$.

3. Path Functionals
To introduce processes we need to specify the notion of measure on a set of paths. One could possibly introduce simplicial sequences in space-time, but we refrain from doing so and consider instead the time coordinate as a continuous variable. In the time direction, we are only going to be using Riemann integrals for piecewise smooth functions, so that the underlying time discretisation one might possibly imagine is quite straightforward and keeping track of it would only lead to notational complexities.

Let us consider a bounded simplicial sequence $A = (A_m)$ and let $[T, T'] \subset \mathbb{R}$ be a fixed finite time interval. Let us consider the sequence $\Lambda = (\Lambda_m)$ where $\Lambda_m = A_m \times [T, T']$.

Given an integer $q \geq$
0, let $H^q_m$ denote the set of all functions $y_\cdot: [T, T'] \to A_m$ which are constant on a family of $q$ mutually disjoint sub-intervals of the form $[t_{k_i}, t_{k_i+1})$, $i = 0, \ldots, q$, with $t_{k_0} = T$ and $t_{k_{q+1}} = T'$, spanning the interval $[T, T']$. Notice that $H^q_m$ for $q$ fixed is itself a finite dimensional manifold with boundaries. In fact, a function $y \in H^q_m$ is characterized by the ordered sequence $t_0 = T \leq t_1 \leq \ldots \leq t_{q+1} = T'$ of time points between which $y_t$ is constant, and a set of values $y_0, \ldots, y_{q-1}$ such that $y_s = y_i$ for all $s \in [t_i, t_{i+1})$ and for all $i = 0, \ldots, q-1$. The spaces $H^q_m$ for $q = 0, 1, \ldots$ can be regarded as nested into each other: $H^0_m(T, T') \subset H^1_m(T, T') \subset \ldots$. In fact, a path $y_t$ in $H^q_m$ is also a path in $H^q_{m'}(T, T')$ if $m < m'$. Let $H(T, T') = \cup_{q, m} H^q_m$ be the union of all path spaces, containing paths with a finite number of jumps.

Definition 11. (Function Algebras.)
Let us denote by $C^q(\Lambda)$ the $C^*$ algebra of all uniformly continuous sequences of simplicial functions $F = (F_m: H^q_m \to \mathbb{R})$ endowed with the operations of sum and multiplication and with the uniform norm defined as follows:
(3.1) $||F||_\infty = \lim_{m \to \infty} \sup_{y_\cdot \in H^q_m} |F(y_\cdot)|$.

Definition 12. (Path Functionals.)
A path functional is a sequence $F = (F^q)$, $q = 0, 1, \ldots$, of continuous simplicial functions $F^q = (F^q_m) \in C^q(\Lambda)$ satisfying the following mutual compatibility condition:
(3.2) $F^{q'}_m(y_\cdot) = F^q_m(y_\cdot)$, for all $y_\cdot \in H^q_m$ and all $q' > q$.

Definition 13. (Non-anticipatory Path Functionals.)
Let $F(y_\cdot, t) = (F^q(y_\cdot, t))$, $t \in [T, T']$, be a one-parameter family of path functionals. One says that this is a non-anticipatory path functional if
(3.3) $F^q_m(y_\cdot, t) = F^q_m(y'_\cdot, t)$ whenever $y_s = y'_s$ for all $s \leq t$.

Intuitively, non-anticipatory path functionals are indifferent to information about the realization of the path $y_\cdot$ in the argument at future times $s > t$. An elementary example of non-anticipatory path functional is given by a function of two arguments
(3.4) $F^q_m(y_\cdot, t) = F(y_t, t)$
where $F: A_\infty \times [T, T'] \to \mathbb{R}$ is a piecewise smooth one-parameter family of functions. Applications typically call for functions $F(y, t)$ which are piecewise smooth in $t$ for each $y \in \Lambda$, with possibly a discrete set of jump discontinuities. We follow the usual convention according to which, if jumps occur, then the discontinuity is of cadlag type, i.e. right continuous and with a left limit.

A less elementary example of non-anticipatory path functional one often encounters is given by integrals of the form:
(3.5) $F^q_m(y_\cdot, t) \equiv \int_T^t ds_1 \ldots \int_{s_{k-1}}^t ds_k \, a(t; s_1, \ldots, s_k) F_1(y_{s_1}, s_1) \ldots F_k(y_{s_k}, s_k)$,
where the $F_j: A_\infty \times [T, T'] \to \mathbb{R}$, $j = 1, \ldots, k$, are one-parameter families of lattice functions and $a(t; s_1, \ldots, s_k)$ is a function of the time coordinates. Applications typically call for functions $F_j(y, t)$ which are piecewise smooth in $t$ for each $y \in \Lambda$, with possibly a discrete set of jump discontinuities. Also in this case, we follow the convention according to which at the points of jump discontinuity the function is cadlag. The same regularity assumptions will be postulated for the functions $a(t; s_1, \ldots, s_k)$ with respect to each of their arguments.
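A minimal numerical sketch (our own, with illustrative names) of a non-anticipatory functional of the form (3.5) with $k = 1$: a running integral $\int_T^t F_1(y_s, s)\,ds$ along a piecewise constant path. Two paths that agree up to time $t$ yield the same value, which is exactly condition (3.3).

```python
import numpy as np

def running_integral(times, path, F1, t):
    """Approximate ∫_T^t F1(y_s, s) ds for a cadlag path sampled on the
    grid `times` (left endpoints, matching right continuity)."""
    mask = times <= t
    dt = np.diff(times[mask], append=t)    # last cell truncated at t
    return float(np.sum(F1(path[mask], times[mask]) * dt))

times = np.linspace(0.0, 1.0, 101)
path_a = np.where(times < 0.5, 1.0, 2.0)   # two paths agreeing on [0, 0.5)
path_b = np.where(times < 0.5, 1.0, 3.0)
F1 = lambda y, s: y
val_a = running_integral(times, path_a, F1, 0.4)
val_b = running_integral(times, path_b, F1, 0.4)
```

Evaluating at $t = 0.4 < 0.5$ ignores the part of the paths where they differ, so `val_a` and `val_b` coincide; evaluating at any $t > 0.5$ would separate them.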
Although the regularity assumptions for these path functionals are sufficient for applications, they are not strictly needed and can be relaxed by taking limits.

4. n-point Functions

Definition 14. (Filtered Probability Spaces.)
Consider the $C^*$ algebra $\mathcal{P}$ generated by the functionals of the form (3.4) and (3.5) by taking linear combinations of finite products and completing the resulting normed space. A filtered probability space upon the lattice $\Lambda$ is defined as a bounded linear functional on the $C^*$-algebra $\mathcal{P}$.

A constructive definition of a stochastic process can be given with various degrees of generality. We don't aim here at the utmost generality but instead at pedagogical simplicity and at ultimately explaining how to frame the theory of Markov processes. As a step toward this goal, we introduce here a family of time ordered $n$-point functions
(4.1) $\Phi_m(y_1, t_1; \ldots; y_n, t_n | y_0)$
defined for $y_1, \ldots, y_n \in A_m$ and $t_1, \ldots, t_n \in [T, T']$, which is assumed to be piecewise differentiable in the time coordinates. In addition, we assume that the following properties hold for all
$T \leq t_1 \leq \ldots \leq t_n \leq T'$.

Definition 15. (Adapted Processes.) An adapted (stochastic) process is given by a non-anticipatory path functional $F \in \mathcal{P}$ and a measure on $\mathcal{P}$ defined by a family of $n$-point functions $(\Phi_n)_{n = 1, \ldots}$.

Given an adapted process of the form $F_t \equiv F(y_t, t) \in \mathcal{P}$ and given a $t > T$, the expectation subject to the initial condition $y_T = y_0$ of the value attained by the process $F_t$ at time $t$ is given by
(4.2) $E[F_t | y_T = y_0] = \sum_{y_1 \in \Lambda} \Phi(y_1, t | y_0) F(y_1, t)$.
For an adapted process of the form (3.5) instead, the conditional expectation is given by
(4.3) $E[F_t | y_T = y_0] = \sum_{y_1, \ldots, y_n} \int_T^t ds_1 \ldots \int_{s_{n-1}}^t ds_n \, \Phi(y_1, s_1; \ldots; y_n, s_n | y_0) \, a(t; s_1, \ldots, s_n) F_1(y_1, s_1) \ldots F_n(y_n, s_n)$.
Also higher moments can be computed. To keep expressions simple, consider a path functional of the form
(4.4) $F_t \equiv \int_T^t ds \, a(t; s) F(y_s, s)$.
The second moment of this process at a future point in time, conditional on the starting point at time $T$, is given by
(4.5) $E[F_t^2 | y_T = y_0] = 2 \sum_{y_1, y_2} \int_T^t ds_1 \int_{s_1}^t ds_2 \, \Phi(y_1, s_1; y_2, s_2 | y_0) \, a(t; s_1) a(t; s_2) F(y_1, s_1) F(y_2, s_2)$.
Notice that the factor 2 compensates for the time-ordering needed to recast the expression in such a way that we can apply to it a 2-point function. This expression and its generalizations are used extensively in the applications to path-dependent options discussed below.

More generally, consider two path functionals $F_t$ and $G_t \in L^1(H(T, T'), \mu)$, where $F_t$ is defined as in (3.5) and
(4.6) $G_t \equiv \int_T^t ds_1 \ldots \int_{s_{q-1}}^t ds_q \, b(t; s_1, \ldots, s_q) G_1(y_{s_1}, s_1) \ldots G_q(y_{s_q}, s_q)$.
Then we can compute a mixed moment as follows:
$E[F_t G_t | y_T = y_0] = \sum_{y_1, \ldots, y_{n+q}} \int_T^t ds_1 \ldots \int_{s_{n+q-1}}^t ds_{n+q} \, \Phi(y_1, s_1; \ldots; y_{n+q}, s_{n+q} | y_0) \sum_\pi a(t; s_{\pi(1)}, \ldots, s_{\pi(n)}) F_1(y_{\pi(1)}, s_{\pi(1)}) \ldots F_n(y_{\pi(n)}, s_{\pi(n)}) \, b(t; s_{\pi(n+1)}, \ldots, s_{\pi(n+q)}) G_1(y_{\pi(n+1)}, s_{\pi(n+1)}) \ldots G_q(y_{\pi(n+q)}, s_{\pi(n+q)})$,
where the sum ranges over all the permutations $\pi$ of the time ordered sequence $s_1 \leq \ldots \leq s_{n+q}$ such that $s_{\pi(1)} \leq \ldots \leq s_{\pi(n)}$ and $s_{\pi(n+1)} \leq \ldots \leq s_{\pi(n+q)}$. Higher conditional moments of any path functional of the form above can be evaluated in a similar way.

The $n$-point functions $\Phi(y_1, s_1; \ldots; y_n, s_n | y_0)$ are conditioned to the starting point $y_0$. It is straightforward to define variations of these $n$-point functions which reflect conditioning on an initial stretch of a path $y_s$ for $s \in [T, t)$, where $t > T$. Conditioning to a past history is equivalent to restricting the integral over each manifold $H(T, T')$ to a sub-manifold which is part of its boundary. In general, this results in rather clumsy expressions which are difficult to compute. In the next section we specialize to the Markovian case for the underlying lattice process; in this case there are no memory effects and conditioning is more straightforward.

Definition 16. (Radon-Nikodym Derivative.) Consider two sequences of $n$-point functions on the simplicial sequence $A_m$, $\Phi^1_m(y_1, t_1; \ldots; y_n, t_n | y_0)$ and $\Phi^2_m(y_1, t_1; \ldots; y_n, t_n | y_0)$, where $m = 1, 2, \ldots$. The Radon-Nikodym derivative of $\Phi^1$ with respect to $\Phi^2$ is given by the path functional defined as follows:
(4.7) $\rho(y_\cdot) = \lim_{n \to \infty} \frac{\Phi^1_m(y_{t_1}, t_1; \ldots; y_{t_n}, t_n | y_0)}{\Phi^2_m(y_{t_1}, t_1; \ldots; y_{t_n}, t_n | y_0)}$
where $t_i = T + \frac{i}{n}(T' - T)$. Notice that the Radon-Nikodym derivative might possibly be infinite on some paths. The measure in path space given by $(\Phi^1_m)$ is said to be absolutely continuous with respect to $(\Phi^2_m)$ if the Radon-Nikodym derivative is finite and summable.

Finally, a word on stopping times.
Definition 17. (Stopping Times.) A stopping time is an adapted process $\tau_t = \tau(y_\cdot, t)$ which can take only two values, by convention 0 and 1.

Stopping times are often used in conjunction with other adapted processes $F_t$ to construct stopped versions of them. If $F_t = F(y_\cdot, t)$ is an adapted process and $R_t = R(y_\cdot, t)$ is a second process, then a stopped version of $F_t$ corresponds to the adapted process given by the function $\bar{F}(y_\cdot, t)$, where $\bar{F}(y_\cdot, t) = F(y_\cdot, t)$ if $\tau_t = 0$ and $\bar{F}(y_\cdot, t) = R(y_\cdot, t)$ if $\tau_t = 1$.

5. Markov Processes

Definition 18. (Markov Propagator.) A Markov propagator on the simplicial sequence $A_m$, $m \geq m_0$, is defined as a sequence of functions $U_m(y_1, t_1; y_2, t_2)$, where $y_1, y_2 \in A_m$ and $T \leq t_1 \leq t_2 \leq T'$, satisfying the following Chapman-Kolmogorov axioms:
(CK1) $U_m(y_1, t_1; y_2, t_2) \geq 0 \quad \forall y_1, y_2 \in \Lambda, \; \forall t_1 \leq t_2 \in \mathbb{R}$,
(CK2) $U_m(y_1, t; y_2, t) = \delta_{y_1 y_2} \quad \forall y_1, y_2 \in \Lambda, \; \forall t \in \mathbb{R}$,
(CK3) $\sum_{y_2 \in \Lambda} U_m(y_1, t_1; y_2, t_2) \, U_m(y_2, t_2; y_3, t_3) = U_m(y_1, t_1; y_3, t_3) \quad \forall y_1, y_3 \in \Lambda, \; \forall t_1 \leq t_2 \leq t_3 \in \mathbb{R}$.

Given a Markov propagator $U_m(y_1, t_1; y_2, t_2)$, one can define a sequence of $n$-point functions having all the necessary properties by setting
(5.1) $\Phi_m(y_1, t_1; \ldots; y_n, t_n | y_0) = \prod_{j = 1, \ldots, n} U_m(y_{j-1}, t_{j-1}; y_j, t_j)$,
where $t_0 = T$.

Definition 19. (Markov Process.) A filtered probability space generated by a Markov propagator is called Markovian and the corresponding stochastic process is called a Markov process.

Definition 20. (Markov Generator.) If the matrix elements $U_m(y_1, t_1; y_2, t_2)$ are differentiable functions of the time parameter $t_2$ in a right neighborhood of $t_1$, then one defines the Markov generator at time $t_1$ as the following right derivative:
(5.2) $\mathcal{L}_m(y_1, y_2; t_1) = \lim_{t_2 \downarrow t_1} \frac{d}{dt_2} U_m(y_1, t_1; y_2, t_2)$

Proposition 1.
If $\mathcal{L}_m(y_1, y_2; t)$ is a Markov generator, then for all pairs $y_1, y_2 \in A_m$ the following two properties hold:
(MG1) $\mathcal{L}_m(y_1, y_2; t) \geq 0$ if $y_1 \neq y_2$, (5.3)
(MG2) $\mathcal{L}_m(y_1, y_1; t) = -\sum_{y_2 \neq y_1} \mathcal{L}_m(y_1, y_2; t)$. (5.4)
Vice versa, if $\mathcal{L}_m(y_1, y_2; t)$ is a differentiable one-parameter family of matrices satisfying conditions (MG1) and (MG2) above, then the differential equation (5.2) admits one and only one solution satisfying the initial condition $U_m(y_1, t; y_2, t) = \delta_{y_1 y_2}$.

The propagator $U_m(y_1, t_1; y_2, t_2)$ defined by the differential equation in (5.2) can be represented by means of a so-called path-exponential defined as follows. Let $N > 0$ and set
(5.5) $U^N_m(y_1, t_1; y_2, t_2) = \left(1 + \delta t_N \, \mathcal{L}_m(t_1)\right) \cdot \ldots \cdot \left(1 + \delta t_N \, \mathcal{L}_m(t_N)\right)$
where $\delta t_N = (t_2 - t_1)/N$. If $N$ is so large that
(5.6) $\delta t_N < \min_{y \in \Lambda, \, t \in [t_1, t_2]} |\mathcal{L}_m(y, y; t)|^{-1}$,
then the operator product $U^N_m(y_1, t_1; y_2, t_2)$ in equation (5.5) is a probability kernel. Passing to the limit $N \to \infty$, we find
(5.7) $\lim_{N \to \infty} U^N_m(y_1, t_1; y_2, t_2) = U_m(y_1, t_1; y_2, t_2)$.
By expanding the product in (5.5) and passing to the limit, we arrive at the following:

Theorem 3. (Dyson expansion.) The probability kernel can be represented as the following convergent series:
(5.8) $U_m(y_1, t_1; y_2, t_2) = \left(1 + \sum_{n=1}^\infty \int_{t_1}^{t_2} ds_1 \int_{s_1}^{t_2} ds_2 \ldots \int_{s_{n-1}}^{t_2} ds_n \, \mathcal{L}_m(s_1) \mathcal{L}_m(s_2) \ldots \mathcal{L}_m(s_n)\right)(y_1, y_2)$.
By differentiating with respect to the two time coordinates in the Dyson expansion, we also find the following two equations:

Theorem 4. (Forward and backward equations.) The probability kernel satisfies the backward equation
(5.9) $\frac{\partial}{\partial t_1} U_m(t_1; t_2) + \mathcal{L}_m(t_1) U_m(t_1; t_2) = 0$
as well as the forward equation
(5.10) $\frac{\partial}{\partial t_2} U_m(t_1; t_2) = U_m(t_1; t_2) \, \mathcal{L}_m(t_2)$.
A handy notation for this expansion is given by the following:

Definition 21. (Path-ordered Exponential.)
The equation (5.8) is written as a path-ordered exponential

(5.11) $U_m(y_1, t_1; y_2, t_2) = \mathrm{Pexp}\Big( \int_{t_1}^{t_2} \mathcal{L}_m(s)\, ds \Big)(y_1, y_2)$

where the operator $\mathrm{P}$ formally acts as follows:

(5.12) $\mathrm{P}\Big( \int_{t_1}^{t_2} \mathcal{L}_m(s)\, ds \Big)^n = n! \int_{t_1}^{t_2} ds_1 \int_{s_1}^{t_2} ds_2 \cdots \int_{s_{n-1}}^{t_2} ds_n\, \mathcal{L}_m(s_1)\, \mathcal{L}_m(s_2) \cdots \mathcal{L}_m(s_n).$

Using the Dyson expansion for the path-ordered exponential, one finds a path-integral representation of the probability kernel. Let us set the following definition:

Definition 22. (Symbolic Path.) A symbolic path $\gamma = \{\gamma_0, \gamma_1, \gamma_2, \dots\}$ is an infinite sequence of sites in $h_m \mathbb{Z}$ such that $\gamma_j \ne \gamma_{j-1}$ for all $j = 1, 2, \dots$. Let $\Gamma_m$ be the set of all symbolic paths in $h_m \mathbb{Z}$.

Theorem 5. (Path-Integral Representation.) The propagator admits the following representation:

(5.13) $U_m(x, T; y, T') = \sum_{q=1}^{\infty}\ \sum_{\gamma \in \Gamma_m:\ \gamma_0 = x,\ \gamma_q = y} \int_T^{T'} ds_1 \int_{s_1}^{T'} ds_2 \cdots \int_{s_{q-1}}^{T'} ds_q$

(5.14) $\qquad \exp\Big( \int_T^{s_1} \mathcal{L}_m(\gamma_0, \gamma_0; v_0)\, dv_0 \Big) \prod_{j=1}^{q} \mathcal{L}_m(\gamma_{j-1}, \gamma_j; s_j)\, \exp\Big( \int_{s_j}^{s_{j+1}} \mathcal{L}_m(\gamma_j, \gamma_j; v_j)\, dv_j \Big)$

where $s_{q+1} = T'$.

Notice that the total mass of the sector of path space $H^q_m(y_0, T; T')$, consisting of lattice paths originating from $y_0 \in \Lambda$ at time $T$ and attaining at most $q$ different values by time $T'$, is given by

$\mu\big( H^q_m(y_0, T; T') \big) = \sum_{\gamma \in \Gamma_m:\ \gamma_0 = y_0} \int_T^{T'} ds_1 \int_{s_1}^{T'} ds_2 \cdots \int_{s_{q-1}}^{T'} ds_q\, \exp\Big( \int_T^{s_1} \mathcal{L}_m(\gamma_0, \gamma_0; v_0)\, dv_0 \Big) \prod_{j=1}^{q} \mathcal{L}_m(\gamma_{j-1}, \gamma_j; s_j)\, \exp\Big( \int_{s_j}^{s_{j+1}} \mathcal{L}_m(\gamma_j, \gamma_j; v_j)\, dv_j \Big),$

where $s_{q+1} = T'$.
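The Chapman-Kolmogorov axioms and the path-exponential construction (5.5)-(5.7) can be checked numerically on a small time-homogeneous chain. The sketch below is illustrative only: the 2-state generator is a hypothetical example of ours, and SciPy's `expm` serves as the reference propagator.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 2-state time-homogeneous generator: rows sum to zero,
# off-diagonal entries non-negative (conditions MG1-MG2).
L = np.array([[-1.0, 1.0],
              [ 2.0, -2.0]])
t1, t2, N = 0.0, 1.0, 10000
dt = (t2 - t1) / N

# Condition (5.6): with dt below 1/max|L(y,y)|, each Euler factor
# 1 + dt*L is a stochastic matrix, hence a probability kernel.
assert dt < 1.0 / np.abs(np.diag(L)).max()
step = np.eye(2) + dt * L
assert np.all(step >= 0) and np.allclose(step.sum(axis=1), 1.0)

# Path exponential (5.5): product of N Euler factors; for a constant
# generator this converges to the matrix exponential, i.e. the limit (5.7).
U_N = np.linalg.matrix_power(step, N)
U = expm((t2 - t1) * L)
assert np.allclose(U_N, U, atol=1e-3)

# Chapman-Kolmogorov axiom (CK3): U(t1,s) U(s,t2) = U(t1,t2).
assert np.allclose(expm(0.4 * L) @ expm(0.6 * L), U)
```

For a time-dependent generator, each Euler factor would use $\mathcal{L}_m(t_k)$ in place of the constant matrix, exactly as in (5.5).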
If $\lambda_m = \max_{x \in A_m} |\mathcal{L}_m(x, x)|$ and $c = \sup_{t \in [T, T']} \|\mathcal{L}_m(t)\|$ is the operator norm of the matrix $\mathcal{L}_m(t)$, then we have that

(5.15) $\mu\big[ H^q_m(y_0, T; T') \big] \le \frac{c^q\, (T' - T)^q}{q!}\, e^{-\lambda_m (T' - T)}.$

This expression reaches a maximum as a function of $q$ at $q \approx c\,(T' - T)$ and then declines at a super-exponential rate. Hence, by discretizing the space coordinate, we ensure that the probability mass of the set of paths with $q$ jumps decreases faster than any exponential in the limit $q \to \infty$. This convergence is however not uniform as $m \to \infty$ and $h_m \to 0$.

Definition 23. (Inverse Lattice.) Let us consider the following lattice:

(5.16) $B_m = \Big\{ -\frac{2^{m-1}\pi}{L} + \frac{k\pi}{L},\ k = 0, \dots, 2^m - 1 \Big\}$,

also called the Brillouin zone or inverse lattice with respect to $A_m$.

Definition 24. (Pseudo-differential Symbols.) The symbol of a Markov generator is defined as follows:

(5.17) $\hat{\mathcal{L}}_m(x, p; t) = \sum_{y \in A_m} \mathcal{L}_m(x, y)\, e^{i p (y - x)}$

where $p \in B_m$.

Two particularly important special examples of Markov processes are given by monotonic processes and diffusions.

Definition 25. (Monotonic Process.) A Markov process of generator $\mathcal{L}_m(x, y)$ on the simplicial sequence $A_m$ is said to be monotonic non-decreasing if

(5.18) $\mathcal{L}_m(x, y) = 0$ whenever $y < x$,

and monotonic non-increasing if

(5.19) $\mathcal{L}_m(x, y) = 0$ whenever $y > x$.

Definition 26. (Diffusion Process.) A diffusion process is a Markov process with generator of the form

(5.20) $\mathcal{L}_m(x, y; t) = \mu_m(x; t)\, \nabla_{h_m}(x, y) + \tfrac{1}{2}\, \sigma_m(x; t)^2\, \Delta_{h_m}(x, y)$

where

(5.21) $\nabla_h(x, y) = \frac{\delta_{y, x+h} - \delta_{y, x-h}}{2h} \quad \text{and} \quad \Delta_h(x, y) = \frac{\delta_{y, x+h} + \delta_{y, x-h} - 2\,\delta_{y, x}}{h^2},$

and $\mu(x; t) = (\mu_m(x; t))$ and $\sigma(x; t) = (\sigma_m(x; t))$ are two simplicial functions which, for simplicity, we assume smooth in both arguments.

Proposition 2. (i) The symbol of a diffusion process is given by

(5.22) $\hat{\mathcal{L}}_m(x, p; t) = i\, \mu_m(x; t)\, \frac{\sin p h_m}{h_m} + \sigma_m(x; t)^2\, \frac{\cos p h_m - 1}{h_m^2}.$
(ii) The path-integral representation for a diffusion process has the form

(5.23) $U_m(x, T; y, T') = \sum_{q=1}^{\infty}\ \sum_{\substack{\gamma \in \Gamma_m:\ \gamma_0 = x,\ \gamma_q = y \\ |\gamma_j - \gamma_{j-1}| = h_m\ \forall j \ge 1}} W_m(\gamma, q; T, T')$

where

(5.24) $W_m(\gamma, q; T, T') = \int_T^{T'} ds_1 \int_{s_1}^{T'} ds_2 \cdots \int_{s_{q-1}}^{T'} ds_q\, e^{\int_{s_q}^{T'} \mathcal{L}_m(y, y; v_q)\, dv_q} \prod_{j=0}^{q-1} \Big( e^{\int_{s_j}^{s_{j+1}} \mathcal{L}_m(\gamma_j, \gamma_j; v_j)\, dv_j}\, \mathcal{L}_m(\gamma_j, \gamma_{j+1}; s_{j+1}) \Big)$

and $s_0 = T$.

6. Martingales and Monotonic Processes

In this section we introduce the notion of a piecewise smooth Markov process, which covers a large family of models useful for applications. In this context, we define martingales and monotonic Markov processes.

Definition 27. (Piecewise Smooth Markov Processes.) Consider the time interval $[T, T']$, the simplicial sequence $A_m$, $m \ge m_0$, and a finite number of time points $T = t_0 < t_1 < \dots < t_n = T'$. A piecewise smooth Markov process is given by a family of Markov generators $\mathcal{L}^i_m(y_1, y_2; t)$ defined on each half-open time interval $[t_i, t_{i+1})$, where $i = 0, 1, \dots, n-1$. In correspondence to each time point $t_i$, $i = 1, \dots, n$, one also defines a mapping operator $U^i_m(x, y)$ such that

(MA1) $U^i_m(y_1, y_2) \ge 0$, (6.1)

(MA2) $\sum_{y_2} U^i_m(y_1, y_2) = 1 \quad \forall y_1 \in \Lambda$. (6.2)

The Markov propagator for any pair of time points $t_i \le s < s' < t_{i+1}$ is defined as follows:

(6.3) $U_m(y_1, s; y_2, s') = \mathrm{Pexp}\Big( \int_s^{s'} \mathcal{L}^i_m(v)\, dv \Big)(y_1, y_2).$

Moreover, if $s' = t_{i+1}$, then

(6.4) $U_m(y_1, s; y_2, t_{i+1}) = \sum_{y_3 \in \Lambda} \mathrm{Pexp}\Big( \int_s^{t_{i+1}} \mathcal{L}^i_m(v)\, dv \Big)(y_1, y_3)\, U^i_m(y_3, y_2).$

More general Markov propagators are obtained by taking products of the ones above.

Definition 28. (Attainable Sets.) Let $U_m$ be a piecewise smooth Markov process, $m \ge m_0$, $y \in A_m$ and $t \in [T, T']$.
The attainable set $D_m(U, t, y) \subset A_m$ is defined as follows: if $t \in (t_i, t_{i+1})$ for some $i = 0, \dots, n-1$, then $D_m(U, t, y)$ is the set of $\bar y \in \Lambda$ such that $\mathcal{L}^i(y, \bar y; t) > 0$. If instead $t = t_i$ for some $i = 1, \dots, n-1$, then $D_m(U, t_i, y)$ is defined as the set of $\bar y \in \Lambda$ such that $U^i(y, \bar y; t) > 0$.

Definition 29. (Equivalent Markov Processes.) Two piecewise smooth Markov propagators $U$ and $U'$ are called equivalent if their attainable sets $D_m(U, t, y)$ and $D_m(U', t, y)$ are equal for all $t \in [T, T']$ and all $y \in \Lambda$. If $D_m(U, t, y)$ is a subset of $D_m(U', t, y)$ for all $t \in [T, T']$ and all $y \in \Lambda$, then one says that $U_m$ is absolutely continuous with respect to $U'_m$.

Definition 30. (Measure Changes.) Let $U_m$, $m \ge m_0$, be a family of piecewise smooth Markov propagators. A measure change is characterized by a family of non-negative functions $G^{yt}_m(y') \ge 0$, indexed by $y \in A_m$ and $t \in [T, T']$, which are strictly positive for all $y' \in D_m(U, y, t)$ and zero otherwise. A measure change function defines a transformation of a Markov generator into an equivalent one according to the following formula:

(6.5) $\mathcal{L}'(y, y'; t) = \frac{1}{G^{yt}_m(y)}\, \mathcal{L}(y, y'; t)\, G^{yt}_m(y') - \frac{1}{G^{yt}_m(y)}\, \big( \mathcal{L}(t)\, G^{yt}_m \big)(y)\, \delta_{y y'}.$

Notice that the specification of the function $G^{yt}_m(y')$ at the point $y' = y$ is immaterial, in the sense that it does not affect the measure change transformation.

Definition 31. (Time Changes.) A measure change is called a time change if there is a function $\phi(y, t)$ such that $G^{yt}_m(y') = \phi(y, t)$ for all $y \in A_m$, $y' \in D_m(U, y, t)$ and $t \in [T, T']$. The time change is called state independent if $\phi(y, t) \equiv \phi(t)$ is a function of the time coordinate only.

Theorem 6. (Deterministic Time Changes.)
If $\phi(t)$ defines a state independent time change corresponding to the measure change function $G^{yt}_m(y') = \phi(t)$, then

(6.6) $U'_m(y, t; y', t') = U_m(y, \lambda(t); y', \lambda(t'))$

where

(6.7) $\lambda(t) = \int_T^t \phi(s)\, ds.$

The following is a particularly interesting special case of measure change:

Definition 32. (Numeraire Changes.) Consider a smooth (as opposed to just piecewise smooth) Markov process of generator $\mathcal{L}_m(y; t)$. Let $g_m(y'; t)$ be a function satisfying the equation

(6.8) $\frac{\partial g_m(y; t)}{\partial t} + \big( \mathcal{L}_m\, g_m \big)(y; t) = 0.$

The measure change given by the function $G^{yt}_m(y') = g_m(y'; t)$ is called a numeraire change, and the Markov generator transforms as follows:

(6.9) $\mathcal{L}'_m(y, y'; t) = \frac{g_m(y'; t)}{g_m(y; t)}\, \mathcal{L}_m(y, y'; t) + \frac{1}{g_m(y; t)}\, \frac{\partial g_m(y; t)}{\partial t}\, \delta_{y y'}.$

Notice that if $g_m(t)$ denotes the multiplication operator of kernel $g_m(y; t)\, \delta_{y y'}$, then equation (6.9) can be written more compactly as follows:

(6.10) $\mathcal{L}'_m(t) = \frac{1}{g_m(t)}\, \mathcal{L}_m(t)\, g_m(t) + \frac{1}{g_m(t)}\, \frac{\partial g_m(t)}{\partial t}.$

Theorem 7. (Numeraire Changes.) If $g_m$ satisfies equation (6.8) and defines a numeraire change, then

(6.11) $U'_m(y, t; y', t') = \frac{g_m(y', t')}{g_m(y, t)}\, U_m(y, t; y', t').$

Let $F_t = F(y_\cdot, t)$ be an adapted process in the time interval $[T, T']$. Let us consider a fixed time $t \in [T, T')$. Recall that, since $F_t$ is non-anticipatory, we have that $F(y_\cdot, t) = F(y'_\cdot, t)$ for all pairs of paths such that $y_s = y'_s$ for all $s \le t$. If $\bar y \in D_m(U', t, y)$, let $\tilde y_\cdot = \mathrm{Ext}_t(y_\cdot, y_t)$ be the constant extension path such that $\tilde y_s = y_s$ for all $s < t$ and $\tilde y_s = y_t$ for all $s \ge t$.
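The numeraire-change formulas (6.8)-(6.11) can be verified numerically on a toy time-homogeneous chain. In the sketch below (all numerical values are hypothetical), $g$ is built so that it solves the backward equation (6.8); the transformed matrix (6.10) is then checked to be again a Markov generator, and the transformed propagator (6.11) to be stochastic.

```python
import numpy as np
from scipy.linalg import expm

# Toy random Markov generator: non-negative off-diagonals, zero row sums.
rng = np.random.default_rng(0)
n = 4
L = rng.random((n, n))
np.fill_diagonal(L, 0.0)
np.fill_diagonal(L, -L.sum(axis=1))

t, tp, Tp = 0.2, 0.5, 1.0                    # t < t' <= T'
gT = rng.random(n) + 0.5                     # positive terminal data (hypothetical)
g = lambda s: expm((Tp - s) * L) @ gT        # solves dg/ds + L g = 0

# Transformed generator (6.10): L' = g^{-1} L g + g^{-1} dg/dt, with dg/dt = -L g.
gs = g(t)
Lp = np.diag(1 / gs) @ L @ np.diag(gs) + np.diag(-(L @ gs) / gs)
assert np.allclose(Lp.sum(axis=1), 0.0)                  # zero row sums (MG2)
assert np.all(Lp - np.diag(np.diag(Lp)) >= -1e-12)       # off-diagonals >= 0 (MG1)

# Propagator transform (6.11): U'(t,t') = g(t)^{-1} U(t,t') g(t') is stochastic.
U = expm((tp - t) * L)
Up = np.diag(1 / g(t)) @ U @ np.diag(g(tp))
assert np.allclose(Up.sum(axis=1), 1.0)
```

The last assertion works because $U(t, t')\, g(t') = g(t)$, which is just the backward equation (6.8) in integrated form.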
With probability one, we have that

(6.12) $\lim_{\delta t \downarrow 0} F(y_\cdot, t + \delta t) = F(\mathrm{Ext}_t(y_\cdot, \bar y), t)$

for some $\bar y \in D_m(U, t, y)$.

Definition 33. (Monotonic Processes.) Let $U$ be a piecewise smooth Markov propagator and let $F_t$ be an adapted process given by the non-anticipatory path functional $F(y_\cdot, t)$. $F_t$ is said to be increasing at time $t$ if:

(i) For all $\bar y \in D_m(U, t, y)$ we have that

(6.13) $F(\mathrm{Ext}_t(y_\cdot, \bar y), t) - F(y_\cdot, t) \ge 0.$

(ii) We have that

(6.14) $\frac{\partial F(\mathrm{Ext}_s(y_\cdot, y_{t-}), s)}{\partial s}\Big|_{s=t} \ge 0$

where $y_{t-} = \lim_{\delta t \downarrow 0} y_{t - \delta t}$.

(iii) For all $y \in A_m$ and all $t \in [T, T']$, either the inequality in (i) holds in a strict sense for at least one $\bar y \in D_m(U, t, y)$, or the inequality in (ii) holds in a strict sense.

In case property (iii) fails but the other two still hold, the process is called non-decreasing.

Theorem 8. (Monotonic Processes.) Let $U_m$ and $U'_m$ be two piecewise smooth Markov propagators and let $F(y_\cdot, t)$ be a non-anticipatory path functional. If $U_m$ and $U'_m$ are equivalent and if $F(y_\cdot, t)$, regarded as an adapted process under the path measure generated by $U_m$, is increasing (non-decreasing), then $F(y_\cdot, t)$ regarded as an adapted process under $U'_m$ is also increasing (non-decreasing).

Proof. This theorem descends from the fact that the definition of monotonicity depends only on the attainable sets. □

Definition 34. (Martingale Processes.) A process $F_t$ is a martingale if for all times $t$ we have that

(6.15) $\frac{\partial}{\partial t}\, F(\mathrm{Ext}_t(y_\cdot, y_{t-}), t) + \sum_{\bar y \in D(U, t, y_t)} \mathcal{L}(y_t, \bar y; t)\, \big( F(\mathrm{Ext}_t(y_\cdot, \bar y), t) - F(\mathrm{Ext}_t(y_\cdot, y_{t-}), t) \big) = 0.$

The process $F_t$ is called an equivalent martingale if there is a second piecewise smooth Markov propagator $U'$ under which the non-anticipatory path functional $F(y_\cdot, t)$ is a martingale process.

Theorem 9. (Equivalent Martingales.)
(i) If $F_t$ is an increasing adapted process, then it is not an equivalent martingale. Otherwise stated, if $F_t$ is an equivalent martingale, then it is not increasing.

(ii) If $F_t$ is a non-decreasing adapted process and is also an equivalent martingale, then it is a constant process with $F(y_\cdot, t) = \mathrm{const}$ for all $t \in [T, T']$.

Proof. This theorem descends from the fact that monotonicity properties are preserved by measure changes. □

Martingales are particularly useful as they can be constructed by taking expectations. Let $\Phi(y_\cdot)$ be a continuous path functional. From the modeling viewpoint, such a functional can represent future cash flow streams. For instance, one choice could be

(6.16) $\Phi(y_\cdot) = \Phi_0(y_{t'})$

where $\Phi_0$ is a continuous univariate function and $t' \in [T, T']$ is fixed. In a more general example, one may consider a path functional of the form

(6.17) $\Phi(y_\cdot) \equiv \int_T^{T'} ds_1 \cdots \int_{s_{n-1}}^{T'} ds_n\, a(t; s_1, \dots, s_n)\, F_1(y_{s_1}, s_1) \cdots F_n(y_{s_n}, s_n)$

with $a(t; s_1, \dots, s_n) = 0$ for $t \in [T, t']$. The path-conditioned expectation of $\Phi(y_\cdot)$ is the non-anticipatory path functional $F_t = F(y_\cdot, t)$ such that

(6.18) $F(y_\cdot, t) = \int_{A(y_\cdot, t)} \Phi(z_\cdot)\, \mu[dz_\cdot]$

where the integral is restricted to the set $A(y_\cdot, t) = \{ z_\cdot \mid z_s = y_s\ \forall s \le t \}$. The intersection of the set $A(y_\cdot, t)$ with each of the spaces $H^q_{mn}$ is a compact, finite-dimensional submanifold with boundaries. To denote path conditioning, we also use the following notation:

(6.19) $F_t = F(y_\cdot, t) = E[\Phi(y_\cdot) \mid y_s,\ s \le t] = E_t[\Phi(y_\cdot)].$

Proposition 3. The process $F_t$ in (6.19) is a martingale for $t < t'$.

Proof. Equation (6.15) descends from the backward equation in (5.9).
□

7. The Fundamental Theorem of Finance

In this section, we derive the Fundamental Theorem of Finance in a context general enough to encompass most cases of practical relevance.

Definition 35. (Financial Model.) Let $A_m$, $m \ge m_0$, be a simplicial sequence. A financial model is given by a family $A^k_m(y_\cdot, t)$, $k = 1, \dots, n$, of adapted processes modeling asset prices, and a non-decreasing adapted process $B_m(y_\cdot, t)$ modeling the money-market account. For notational convenience, we set $A^0_m(y_\cdot, t) = B_m(y_\cdot, t)$. Let us also introduce the discounted asset price processes defined as follows:

(7.1) $\tilde A^k_m(y_\cdot, t) = \frac{A^k_m(y_\cdot, t)}{B_m(y_\cdot, t)}.$

Definition 36. (Trading Strategies.) Given a financial model with $n$ assets, a trading strategy is given by a family of adapted processes $\zeta^k_m(y_\cdot, t)$, $k = 0, \dots, n$. The value process of a strategy is the adapted process

(7.2) $\Pi_m(y_\cdot, t) = \sum_{k=0}^{n} \zeta^k_m(y_\cdot, t)\, A^k_m(y_\cdot, t).$

The discounted value process instead is given by

(7.3) $\tilde\Pi_m(y_\cdot, t) = \sum_{k=0}^{n} \zeta^k_m(y_\cdot, t)\, \tilde A^k_m(y_\cdot, t).$

Definition 37. (Self-Financing Condition.) An adapted trading strategy is called self-financing if the following two conditions hold:

(7.4) (SF1) $\sum_{k=0}^{n} \frac{\partial \zeta^k_m}{\partial t}(\mathrm{Ext}_t(y_\cdot, y_{t-}), t)\, \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t) = 0$

and if, for all $\bar y \in D_m(U, t, y_t)$, we have that

(7.5) (SF2) $\sum_{k=0}^{n} \big( \zeta^k_m(\mathrm{Ext}_t(y_\cdot, \bar y), t) - \zeta^k_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t) \big)\, \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, \bar y), t) = 0.$

Proposition 4.
If $\zeta^k_m(y_\cdot, t)$ is a self-financing trading strategy, then the corresponding discounted value process satisfies

(7.6) $\frac{\partial \tilde\Pi_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t)}{\partial t} = \sum_{k=0}^{n} \zeta^k_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t)\, \frac{\partial \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t)}{\partial t}$

and, for all $\bar y \in D_m(U, t, y_t)$, we have that

(7.7) $\tilde\Pi_m(\mathrm{Ext}_t(y_\cdot, \bar y), t) - \tilde\Pi_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t) = \sum_{k=0}^{n} \zeta^k_m(\mathrm{Ext}_t(y_\cdot, \bar y), t)\, \big( \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, \bar y), t) - \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t) \big).$

Definition 38. (Arbitrage Strategies.) A self-financing strategy is called an arbitrage at time $t$ if the corresponding discounted value process $\tilde\Pi_m(y_\cdot, t)$ is increasing at time $t$.

Theorem 10. (Fundamental Theorem of Finance.) If there is an equivalent measure with respect to which all discounted base asset price processes are martingales, then:

(i) The discounted value process of any self-financing trading strategy under the same equivalent measure is a martingale.

(ii) There is no arbitrage.

Conversely, if there is no arbitrage, then there exists a measure change under which all discounted asset price processes become martingales.

Proof. The first part of the theorem is a simple consequence of the definitions put forward. The converse instead requires a proof.

Let $V_m(U, t, y_t)$ be the vector space spanned by functions of the form $v(\bar y)$, where $\bar y \in D_m(U, t, y_t)$. Let $K$ be the cone in $V_m(U, t, y_t)$ made up of all the vectors with non-negative components. Also, let us introduce the following vectors $\xi^k_m \in V_m(U, t, y_t)$:

(7.9) $\xi^k_m(\bar y) = \delta_{\bar y, y_{t-}}\, \frac{\partial \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t)}{\partial t} + \big( 1 - \delta_{\bar y, y_{t-}} \big)\, \big( \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, \bar y), t) - \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t) \big).$

Notice that in case $k = 0$, so that the asset is the money-market account, then $\xi^0_m(\bar y) = 0$ for all $\bar y \in D_m(U, t, y_t)$.

Suppose there is no arbitrage and fix a time $t$.
Since there does not exist a trading strategy with a strictly increasing value process, for all coefficient vectors $(v_k)_{k=1,\dots,n}$ there are two elements $\bar y_+, \bar y_- \in D_m(U, t, y_t)$ such that

(7.10) $\sum_{k=1}^{n} v_k\, \xi^k_m(\bar y_+) > 0, \qquad \sum_{k=1}^{n} v_k\, \xi^k_m(\bar y_-) < 0.$

If in equation (7.10) we set $v_k = \delta_{k1}$, we conclude that the vector $\big( \xi^1_m(\bar y) \big)_{\bar y \in D_m(U, t, y_t)}$ has both positive and negative components. Hence the hyperplane $\Pi_1 \subset V_m(U, t, y_t)$ orthogonal to the vector $\xi^1_m \in V_m(U, t, y_t)$ intersects $K$.

Let $P_1$ be the orthogonal projection operator onto the hyperplane $\Pi_1$ and let us consider the vectors

(7.11) $(P_1\, \xi^i_m)(\bar y) = \xi^i_m(\bar y) - \frac{\sum_{\bar z \in D_m(U, t, y_t)} \xi^i_m(\bar z)\, \xi^1_m(\bar z)}{\sum_{\bar z \in D_m(U, t, y_t)} \xi^1_m(\bar z)\, \xi^1_m(\bar z)}\, \xi^1_m(\bar y)$

for $i = 2, \dots, n$. Due to the absence of arbitrage, there are two elements $\bar y_+, \bar y_- \in D_m(U, t, y_t)$ such that

(7.12) $(P_1\, \xi^2_m)(\bar y_+) > 0, \qquad (P_1\, \xi^2_m)(\bar y_-) < 0.$

Hence the vector $P_1 \xi^2_m$ is transversal to the octant $K$ of positive vectors. As a consequence, the hyperplane $\Pi_2 \subset \Pi_1$, which is orthogonal to both vectors $\xi^1_m$ and $\xi^2_m$, also intersects $K$.

The argument above can be iterated $n$ times, leading to the conclusion that there exists a strictly positive function $g^{yt}_m(\bar y) > 0$, $\bar y \in D_m(U, t, y_t)$, such that

(7.13) $\sum_{\bar y \in D_m(U, t, y_t)} \xi^k_m(\bar y)\, g^{yt}_m(\bar y) = 0$

for all $k = 0, \dots, n$. In particular, this implies that there exists a measure change function $G^{yt}_m(\bar y)$ such that

(7.14) $\sum_{\bar y} \big( \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, \bar y), t) - \tilde A^k_m(\mathrm{Ext}_t(y_\cdot, y_{t-}), t) \big)\, \mathcal{L}(y, \bar y; t)\, G^{yt}_m(\bar y) = 0.$ □

8. Weak Convergence of Markov Generators

Consider a one-dimensional Markov process defined on the simplicial sequence $A_m$, $m \ge m_0$, and the time interval $[T, T']$. Many different specifications of Markov generators on $A_m$ may correspond to the same limit.
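As a minimal illustration of this non-uniqueness, two quite different discretizations of the same Brownian generator have symbols (5.17) converging to the same limit $-\sigma^2 p^2/2$. The sketch below compares a nearest-neighbour stencil with a next-nearest-neighbour one (rates chosen by us so that the second moments match); the parameter values are hypothetical.

```python
import numpy as np

# Two generator specifications with the same weak limit:
#  - nearest-neighbour: jumps to x +/- h at rate sigma^2 / (2 h^2),
#  - next-nearest-neighbour: jumps to x +/- 2h at rate sigma^2 / (8 h^2).
# Both have variance rate sigma^2, and both symbols tend to -sigma^2 p^2 / 2.
sigma, p = 0.7, 1.3
limit = -sigma**2 * p**2 / 2

for h in (0.1, 0.01, 0.001):
    nearest = sigma**2 * (np.cos(p * h) - 1) / h**2
    next_nearest = sigma**2 * (np.cos(2 * p * h) - 1) / (4 * h**2)
    # Both discretizations agree with the limit up to an O(h^2) error.
    assert abs(nearest - limit) <= sigma**2 * p**4 * h**2
    assert abs(next_nearest - limit) <= sigma**2 * p**4 * h**2
```

The canonical sequence constructed below singles out one representative, the nearest-neighbour form, among all such equivalent specifications.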
In this section we identify a canonical sequence of generators under a few regularity hypotheses which imply the existence of a weak limit, in the sense of distributions, for the generator. This is a necessary first step to single out the general form of an admissible Markov generator. In the following sections, we then investigate convergence under the much finer criteria of pointwise convergence for probability kernels and their derivatives.

First consider the case when the limiting domain $A_\infty$ is bounded. Without restricting generality, let us suppose that $A_\infty = [-L, L]$.

Let $\mathcal{L}_m$ be a sequence of Markov generators. The first assumption we make is that the first two moments are finite; more specifically:

Hypothesis MG1. The sequences

(8.1) $\mu_m(x, t) \equiv \sum_{y \in A_m} \mathcal{L}_m(x, y; t)\, (y - x), \qquad \sigma_m(x, t) \equiv \sqrt{\sum_{y \in A_m} \mathcal{L}_m(x, y; t)\, (y - x)^2}$

are uniformly bounded in absolute value as $m \to \infty$ and converge to limits for all $x \in A_m$ and all $t \in [T, T']$, i.e. the following limits exist:

(8.2) $\mu(x, t) \equiv \lim_{m \to \infty} \mu_m(x, t), \qquad \sigma(x, t) \equiv \lim_{m \to \infty} \sigma_m(x, t).$

Notice that, due to the dominated convergence theorem, the functions $\mu_m(x, t)$ and $\sigma_m(x, t)$, regarded as piecewise constant functions on $A_\infty$, converge weakly to the corresponding limits.

Hypothesis MG2. The following limits exist and are finite for all $m \ge m_0$ and all pairs $x, y \in A_m$ such that $|x - y| \ge h_m$:

(8.3) $\lambda_m(x, y; t) = \lim_{m' \to \infty} \sum_{z \in A_{m'}:\ y - h_m/2 < z \le y + h_m/2} \mathcal{L}_{m'}(x, z; t).$

Moreover, for all $x \in A_m$ there exists a measure $\nu_{xt}(d\xi)$ on $[-L, L]$ (called the Levy measure) with the following two properties:

(i) (8.4) $\int_{-L}^{L} \nu_{xt}(d\xi)\, \xi^2 < \infty;$

(ii) for all $y \in A_m$ with $|x - y| \ge h_m$ the following representation is valid:

(8.5) $\lambda_m(x, y) = \lim_{\varepsilon \downarrow 0} \int_{[y - h_m/2,\, y + h_m/2] \setminus (-\varepsilon, \varepsilon)} \nu_{xt}(d\xi).$

Definition 39. (Finite Activity Jumps.)
A Markov process is said to have finite activity jumps if for all $m \ge m_0$ and all $x \in A_m$ we have that

(8.6) $\int_{-L}^{L} \nu_{xt}(d\xi) < \infty.$

Notice that, given MG1 and MG2, we have

(8.7) $\limsup_{m \to \infty}\ \sum_y \lambda_m(x, y; t)\, (y - x)^2 \le \limsup_{m \to \infty}\ \sum_y \mathcal{L}_m(x, y; t)\, (y - x)^2 < \infty.$

More generally, if $\varphi \in D(\mathbb{R})$ is a test function such that $\varphi(0) = 0$ and $\varphi'(0) = 0$, then we have that

(8.8) $\limsup_{m \to \infty}\ \sum_y \lambda_m(x, y; t)\, \varphi(y - x) < \infty.$

Although the above sequence is bounded, the limit may not exist in general. We thus need to stipulate this as a separate assumption:

Hypothesis MG3. For all test functions $\varphi \in D(\mathbb{R})$ such that $\varphi(0) = \varphi'(0) = 0$ and all $x \in A_m$, the sequence $\sum_y \lambda_{m'}(x, y; t)\, \varphi(y - x)$ admits a limit as $m' \to \infty$.

Let us introduce the following notations for the drift and variance of the jump part:

(8.9) $\mu^J_m(x; t) = \sum_y \lambda_m(x, y; t)\, (y - x), \qquad \sigma^J_m(x; t) = \sqrt{\sum_y \lambda_m(x, y; t)\, (y - x)^2}.$

Notice that the sequence $\sigma^J_m(x, t)$ is non-negative and is uniformly bounded as a function of $m$. More precisely,

(8.10) $0 \le \sigma^J_m(x, t) \le \sigma_m(x, t).$

On the other hand, we cannot conclude in general that $\mu^J_m(x, t)$ is necessarily uniformly bounded. In fact, the difference $\mu_m(x, t) - \mu^J_m(x, t)$ could diverge as $m \to \infty$ while being compensated by terms concentrated at $y = x \pm h_m$, in such a way that the total drift $\mu_m(x, t)$ stays uniformly bounded. An exception to this general situation is found in case the following condition is satisfied:

Hypothesis MG4. For all $m \ge m_0$ and all $x \in A_m$ we have that

(8.11) $\int_{-L}^{L} |\xi|\, \nu_{xt}(d\xi) < \infty.$

Two particularly important situations in which condition MG4 holds are covered by the following theorem:

Theorem 12. (MG4.) Assuming MG1, MG2, MG3, if the Markov process is either monotonic or has finite activity jumps, then MG4 holds.

Under the three assumptions MG1, MG2, MG3, the sequence $\mathcal{L}_m$ of Markov generators can be mapped into an equivalent canonical sequence of Markov generators $\mathcal{L}^C_m$.
More precisely, we set

(8.12) $\mathcal{L}^C_m(x, y; t) = \lambda_m(x, y; t)$

in case $|x - y| \ge 2 h_m$. Furthermore, we set

(8.13) $\mathcal{L}^C_m(x, x \pm h_m; t) = \tilde\mu_m(x; t)\, \nabla_{h_m}(x, x \pm h_m) + \tfrac{1}{2}\, \tilde\sigma_m(x; t)^2\, \Delta_{h_m}(x, x \pm h_m)$

where

(8.14) $\tilde\mu_m(x; t) = \mu_m(x; t) - \mu^J_m(x; t), \qquad \tilde\sigma_m(x; t)^2 = \sigma_m(x; t)^2 - \sigma^J_m(x; t)^2,$

and the operators $\nabla_h$ and $\Delta_h$ are defined as in equation (5.21). Finally,

(8.15) $\mathcal{L}^C_m(x, x; t) = -\sum_{y \in A_m,\, y \ne x} \mathcal{L}^C_m(x, y; t).$

Theorem 13. (Canonical Representations of Markov Generators.) For all smooth functions of compact support $\varphi \in D(A_\infty)$ and all $x \in A_m$ we have that

(8.16) $\lim_{m' \to \infty} \sum_{y \in A_{m'}} \mathcal{L}^C_{m'}(x, y)\, \varphi(y) = \lim_{m' \to \infty} \sum_{y \in A_{m'}} \mathcal{L}_{m'}(x, y)\, \varphi(y).$

A compact representation for a canonical generator summarizing these definitions is obtained by considering the symbol as specified in Definition 24.

Theorem 14. (Levy-Khintchine Representation.) Under the hypotheses MG1, MG2, MG3 above, and in case $A_\infty = [-L, L]$ is bounded, the symbol of a generator in canonical form can be expressed as follows:

(8.17) $\hat{\mathcal{L}}^C_m(x, p; t) = i\, \mu(x; t)\, \frac{\sin p h_m}{h_m} + \tilde\sigma(x; t)^2\, \frac{\cos p h_m - 1}{h_m^2} + \sum_{y \in A_m} \Big( e^{i p (y - x)} - 1 - i\, \frac{\sin p h_m}{h_m}\, (y - x) \Big)\, \lambda_m(x, y; t).$

The following limit converges in the weak topology:

(8.18) $\lim_{m \to \infty} \hat{\mathcal{L}}^C_m(x, p; t) = i\, \mu(x; t)\, p - \frac{\tilde\sigma(x; t)^2}{2}\, p^2 + \int_{-L}^{L} \big( e^{i p \xi} - 1 - i p \xi \big)\, \nu_{xt}(d\xi),$

where $\nu_{xt}(d\xi)$ is the Levy measure. Moreover, if also MG4 is satisfied, then the symbol of a generator in canonical form can be expressed as follows:

(8.19) $\hat{\mathcal{L}}^C_m(x, p; t) = i\, \tilde\mu(x; t)\, \frac{\sin p h_m}{h_m} + \tilde\sigma(x; t)^2\, \frac{\cos p h_m - 1}{h_m^2} + \sum_{y \in A_m} \big( e^{i p (y - x)} - 1 \big)\, \lambda_m(x, y; t).$

The following limit converges in the weak topology:

(8.20) $\lim_{m \to \infty} \hat{\mathcal{L}}^C_m(x, p; t) = i\, \tilde\mu(x; t)\, p - \frac{\tilde\sigma(x; t)^2}{2}\, p^2 + \int_{-L}^{L} \big( e^{i p \xi} - 1 \big)\, \nu_{xt}(d\xi).$
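The diffusion part of a canonical generator can be assembled directly from the stencils (5.21), and its symbol (5.17) checked against the closed form in (5.22). The sketch below does this for constant, hypothetical coefficients and no jump part; the positivity of the off-diagonal rates requires $\sigma^2/h > |\mu|$, the same condition that reappears as (10.2) below.

```python
import numpy as np

# Nearest-neighbour diffusion generator built from the stencils (5.21):
#   gen(x, x+h) = mu/(2h) + sigma^2/(2h^2),
#   gen(x, x-h) = -mu/(2h) + sigma^2/(2h^2),  gen(x, x) = -sigma^2/h^2.
h, n = 0.1, 50
mu, sigma = 0.3, 0.8
assert sigma**2 / h > abs(mu)                # off-diagonal rates non-negative
gen = np.zeros((n, n))
for x in range(1, n - 1):
    gen[x, x + 1] = mu / (2 * h) + sigma**2 / (2 * h**2)
    gen[x, x - 1] = -mu / (2 * h) + sigma**2 / (2 * h**2)
    gen[x, x] = -sigma**2 / h**2

# Probability conservation at interior points: zero row sums.
assert np.allclose(gen[1:-1].sum(axis=1), 0.0)

# Symbol (5.17) at an interior point x0 agrees with the closed form (5.22).
x0, p = n // 2, 1.7
sites = (np.arange(n) - x0) * h
symbol = np.sum(gen[x0] * np.exp(1j * p * sites))
expected = 1j * mu * np.sin(p * h) / h + sigma**2 * (np.cos(p * h) - 1) / h**2
assert np.allclose(symbol, expected)
```

Adding a compensated jump part $\lambda_m(x, y; t)$ on the sites with $|x - y| \ge 2 h_m$ and adjusting the diagonal via (8.15) yields the full canonical form.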
9. Fast Exponentiation and Spectral Methods

Numerical analysis of pricing models in the operator formalism depends on the ability to compute the propagator for a given generator $\mathcal{L}(t)$. Time-homogeneous Markov generators $\mathcal{L}_m(y_1, y_2)$ represent a privileged special case of particular importance. In this case, the associated propagator solving the differential equation in (5.2) is given by the matrix exponential

(9.1) $U_m(y_1, s; y_2, s') = \exp\big( (s' - s)\, \mathcal{L}_m \big)(y_1, y_2).$

Matrix exponentiation can be defined in several equivalent ways, such as by Taylor expansion

(9.2) $\exp(t \mathcal{L}_m) = \sum_{j=0}^{\infty} \frac{t^j}{j!}\, \mathcal{L}_m^j$

or by means of Neper's formula

(9.3) $\exp(t \mathcal{L}_m) = \lim_{N \to \infty} \Big( 1 + \frac{t}{N}\, \mathcal{L}_m \Big)^N.$

We find that the most efficient and robust method for exponentiating Markov generators is the so-called fast exponentiation algorithm. Let us fix a Markov generator $\mathcal{L}(x, y)$ and a time horizon $t$. Let $\delta t > 0$ be such that $\min_{y \in \Lambda} \big( 1 + \delta t\, \mathcal{L}(y, y) \big) \ge 1/2$ and $t/\delta t = 2^n$ for some $n \in \mathbb{N}$. To compute $e^{t\mathcal{L}}(x, y)$, we first define the elementary propagator

(9.4) $u_{\delta t}(x, y) = \delta_{xy} + \delta t\, \mathcal{L}(x, y)$

and then evaluate in sequence $u_{2\delta t} = u_{\delta t} \cdot u_{\delta t}$, $u_{4\delta t} = u_{2\delta t} \cdot u_{2\delta t}$, ..., $u_{2^n \delta t} = u_{2^{n-1} \delta t} \cdot u_{2^{n-1} \delta t}$.

As we show in the next few sections, this algorithm approximates probability kernels with errors, with respect to the continuous time kernel density, which in fairly general cases are of order $O(h_m |\log h_m|)$. This is the same order by which the continuous time kernel density differs from its continuum limit, again according to the estimates below. In the case of Brownian motion, a sharp convergence estimate of order $O(h_m)$ can also be proven.

Matrix multiplication is accomplished numerically by invoking the routine dgemm in Level-3 BLAS. Very efficient, processor-specific versions of dgemm are now available, along with implementations on massively parallel GPU chipsets.
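The fast exponentiation algorithm (9.4) amounts to one Euler step followed by $n$ repeated matrix squarings. A minimal sketch, with a hypothetical 3-state generator and SciPy's `expm` as the reference:

```python
import numpy as np
from scipy.linalg import expm

def fast_exp(gen, t, n):
    """Fast exponentiation: elementary propagator u_dt = I + dt*gen with
    t/dt = 2**n, then n repeated squarings u_{2dt} = u_dt @ u_dt, ..."""
    dt = t / 2**n
    # Stability condition from the text: min_y (1 + dt*gen(y,y)) >= 1/2,
    # so the elementary propagator is a (sub)stochastic matrix.
    assert (1 + dt * np.diag(gen)).min() >= 0.5
    u = np.eye(gen.shape[0]) + dt * gen
    for _ in range(n):
        u = u @ u
    return u

L = np.array([[-3.0, 2.0, 1.0],
              [ 0.5, -1.5, 1.0],
              [ 1.0, 1.0, -2.0]])
U = fast_exp(L, 1.0, 24)
assert np.allclose(U.sum(axis=1), 1.0)       # result is a stochastic matrix
assert np.allclose(U, expm(L), atol=1e-5)    # agrees with the exact exponential
```

The squaring stage costs only $n = \log_2(t/\delta t)$ dense multiplications, each of which maps onto a single dgemm call, which is what makes the method attractive on BLAS/GPU backends.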
It turns out that the standard measure of algorithmic complexity, namely the number of floating point operations required to accomplish a given task, is not simply proportional to execution time. Using blocking and cache optimizations and distributing the load across many cores, execution time for medium to large matrices appears to scale much better than the naive $n^3$ scaling one would obtain by triple looping (Goto and van de Geijn, to appear).

A more general method, which allows one to compute not only exponentials but also other functions of a Markov generator, is full diagonalization. Unfortunately, unless the Markov generator is symmetrizable, diagonalization algorithms can run into serious instabilities within double precision arithmetics because of the phenomenon of pseudo-spectrum (Trefethen and Embree 2006). This makes it impossible to diagonalize exactly within the limits of double-precision arithmetics and forces one to resort to expedients such as spectral truncations. Since the fast exponentiation method has much better scaling properties than full diagonalization and is entirely stable, we recommend that it be used in all situations where a matrix exponentiation is required. However, the necessity often arises to define and compute other functions of a given time-independent Markov generator as well, and for this purpose diagonalization can be usefully employed, especially when the target matrix is symmetrizable.

Not all Markov generators are diagonalizable, but most are, and it is safe to take diagonalizability for granted in numerical applications. To make this statement more precise, one sets out the following definitions:

Definition 40. (Generic Properties.) A dense $G_\delta$ of a topological space is a countable intersection of dense open subsets. If a property is valid on a dense $G_\delta$, one says that it is valid generically, or that it is generic.
Instead, if the same topological space is endowed with a measure and a property is valid on a full measure set of parameters, one says that it is valid almost surely with respect to that particular measure. We have that:

Proposition 5. Markov generators are diagonalizable both generically and almost surely.

Let $u_n(x)$ and $\lambda_n$, $n = 1, \dots, N$, be the eigenfunctions and eigenvalues of the Markov generator $\mathcal{L}$, i.e.

(9.5) $\mathcal{L}\, u_n = \lambda_n\, u_n.$

Let $U$ be the matrix whose columns are given by the vectors $u_n(x)$ and let $\Lambda$ be the diagonal matrix with the eigenvalues $\lambda_n$ on the diagonal. Hence

(9.6) $\mathcal{L}\, U = U\, \Lambda.$

We have that $\mathcal{L} = U \Lambda U^{-1}$ and $e^{t\mathcal{L}} = U e^{t\Lambda} U^{-1}$. This equation, expressed in components, reads as follows:

(9.7) $e^{t\mathcal{L}}(x, y) = \sum_{n=1}^{N} e^{\lambda_n t}\, u_n(x)\, v_n(y)$

where $v_n(y)$ is the $n$-th row vector of the inverse matrix $V = U^{-1}$.

One may extend the above definition to other functions of a Markov generator. If $\psi(\lambda)$ is a function, one may define $\psi(\mathcal{L})$ as the operator whose matrix is given by

(9.8) $\psi(\mathcal{L})(x, y) = \sum_{n=1}^{N} \psi(\lambda_n)\, u_n(x)\, v_n(y).$

An important example of functional calculus is found in expressing the Markov generator of processes obtained by stochastic time change, whereby the time-change process has independent, identically distributed increments. Processes in this class are called Bochner subordinators. Because of time and space homogeneity, Bochner subordinators can be constructed starting from a process on the simplicial sequence $h_m \mathbb{Z}$ and are characterized by a Markov generator of the form $\mathcal{L}_m(x, y) = \ell_{h_m}(y - x)$.

Theorem 15. (Bochner Subordinators.)
If the limit $\lim_{h \downarrow 0} \ell_h = \ell$ exists in the weak sense on $D'(\mathbb{R})$, then the limit kernel has the following form:

(9.9) $\int \ell(x)\, \varphi(x)\, dx = \mu\, \varphi'(0) + \int_0^\infty \big( \varphi(x) - \varphi(0) \big)\, \nu(dx)$

where $\nu(dx)$ is a positive measure supported on $\mathbb{R}_+$ and such that

(9.10) $\int_0^\infty x\, \nu(dx) < \infty.$

The characteristic function of the process, defined as the Fourier transform of the Markov generator, is thus

(9.11) $\epsilon(k) = \lim_{h \downarrow 0} \sum_x e^{ikx}\, \ell_h(x) = i\mu k + \int_0^\infty \big( e^{ikx} - 1 \big)\, \nu(dx).$

The Laplace transform instead is given by

(9.12) $\phi(\lambda) \equiv -\epsilon(i\lambda) = -\lim_{h \downarrow 0} \sum_x e^{-\lambda x}\, \ell_h(x) = \mu\lambda + \int_0^\infty \big( 1 - e^{-\lambda x} \big)\, \nu(dx).$

A function of this form is called a Bernstein function.

Bernstein functions may be used to express the Laplace transform of the transition probability kernel of time-homogeneous monotonic processes as follows:

(9.13) $\int_0^\infty e^{t\mathcal{L}}(0, x)\, e^{-\lambda x}\, dx = e^{-t\phi(\lambda)}.$

In turn, we may also express the transition probability kernel in terms of the characteristic function, i.e.

(9.14) $e^{t\mathcal{L}}(0, x) = \int_{-\infty}^{\infty} e^{t\epsilon(k) - ikx}\, \frac{dk}{2\pi}.$

Notice that since $\phi(-ik) = -\epsilon(k)$, the last formula may also be interpreted as the Fourier-Mellin inversion of the previous one.

Example 1. (Poisson Process.) The Poisson process corresponds to

(9.15) $\phi_P(\lambda; c) = c\, (1 - e^{-\lambda}) = c \int_0^\infty (1 - e^{-\lambda t})\, \delta(t - 1)\, dt.$

Example 2. (Stable Process.) The stable subordinator with index $\alpha \in (0, 1)$ is given by

(9.16) $\phi_S(\lambda; \alpha) = \lambda^\alpha = \frac{\alpha}{\Gamma(1 - \alpha)} \int_0^\infty (1 - e^{-\lambda t})\, t^{-1-\alpha}\, dt.$

Example 3. (Gamma Process.) The Gamma subordinator with variance rate $\nu > 0$ is given by

(9.17) $\phi_{VG}(\lambda; \nu) = \frac{1}{\nu}\, \log(1 + \nu\lambda) = \frac{1}{\nu} \int_0^\infty (1 - e^{-\lambda t})\, t^{-1}\, e^{-t/\nu}\, dt.$

To add jumps to a diffusion process one can use the method of independent stochastic subordination.
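The Bernstein-function identity (9.13) can be checked directly for Example 3, whose time-$t$ marginal is a Gamma distribution with shape $t/\nu$ and scale $\nu$ (a standard fact we use here as an input). The parameter values in the sketch are arbitrary.

```python
import numpy as np
from scipy.stats import gamma
from scipy.integrate import quad

# Identity (9.13) for the Gamma subordinator (Example 3):
# the Laplace transform of the time-t kernel equals exp(-t * phi_VG(lambda)).
t, nu, lam = 0.7, 0.4, 1.3
phi = np.log1p(nu * lam) / nu                # Bernstein function (9.17)

# Left-hand side of (9.13): numerical Laplace transform of the Gamma density.
lhs = quad(lambda x: np.exp(-lam * x) * gamma.pdf(x, a=t / nu, scale=nu),
           0, np.inf)[0]
assert np.isclose(lhs, np.exp(-t * phi), atol=1e-8)
```

The same check applies verbatim to the stable subordinator (9.16), replacing the Gamma density with the corresponding stable one-sided density.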
Namely, let $F_t$ be the diffusion process with drift function $\mu(F)$ and volatility function $\sigma(F)$, and let $\mathcal{L}_d(y_1, y_2; \mu, \sigma)$ be the corresponding Markov generator. Consider the generator

(9.18) $\mathcal{L}_{VG}(\mu, \sigma, \nu) = -\phi_{VG}\big( -\mathcal{L}_d(\mu, \sigma); \nu \big).$

The propagator for $\mathcal{L}_{VG}(\mu, \sigma, \nu)$ satisfies the following equation:

(9.19) $\exp\big( t\, \mathcal{L}_{VG}(\mu, \sigma, \nu) \big)(y_1, y_2) = E\Big[ \exp\big( T^\nu_t\, \mathcal{L}_d(\mu, \sigma) \big)(y_1, y_2) \Big] = \sum_{n=1}^{N} e^{-\phi_{VG}(-\lambda_n; \nu)\, t}\, u_n(y_1)\, v_n(y_2)$

where $T^\nu_t$ are the paths of the monotonic process in Example 3, the $u_n$ are the eigenfunctions of the diffusion generator $\mathcal{L}_d(\mu, \sigma)$, the $\lambda_n$ are the corresponding eigenvalues, and the functions $v_n$ are defined as in (9.7). Hence, the generator $\mathcal{L}_{VG}(\mu, \sigma, \nu)$ identifies the time-changed process with paths $F_{T^\nu_t}$.

To model asymmetric jumps one can follow several strategies. A simple one is to specify two different variance rates $\nu_+$ and $\nu_-$ for the up and down jumps and compute separately the two Markov generators

(9.20) $\mathcal{L}_{VG}(\mu, \sigma, \nu_\pm) = -\phi_{VG}\big( -\mathcal{L}_d(\mu, \sigma); \nu_\pm \big).$

The new generator for our process with asymmetric jumps is obtained by combining the two generators above:

(9.21) $\mathcal{L}_{VG}(y_1, y_2; \mu, \sigma, \nu_+, \nu_-) = \begin{cases} \mathcal{L}_{VG}(y_1, y_2; \mu, \sigma, \nu_-) & \text{if } F(y_2) < F(y_1), \\ \mathcal{L}_{VG}(y_1, y_2; \mu, \sigma, \nu_+) & \text{if } F(y_2) > F(y_1), \\ -\sum_{y' \ne y_1} \mathcal{L}_{VG}(y_1, y'; \mu, \sigma, \nu_+, \nu_-) & \text{if } y_2 = y_1. \end{cases}$

The construction can also be localized. If $\phi_x(\lambda)$ is a family of Bernstein functions indexed by the state variable $x \in \Lambda_m$, then one can consider the Markov generator of matrix

(9.22) $\tilde{\mathcal{L}}(x, y) = -\phi_x\big( -\mathcal{L}_d(\mu, \sigma) \big)(x, y).$

This construction allows one to model state-dependent jumps.

10. Construction of Brownian Motion

This section is based on work in collaboration with Alex Mijatovic, see (Albanese and Mijatovic 2006).

Let $A = (A_m)$, $m = m_0, m_0 + 1, \dots$,
be a simplicial sequence converging to an interval $\lim_{m\to\infty} A_m = [-L,L]\subset\mathbb R$, where $0 < L < \infty$. Let $\mu$ and $\sigma$ be two constants. The generator of a Brownian motion on $A_m$ has the form

(10.1) $\mathcal L_m = \mu\nabla_{h_m} + \frac12\sigma^2\Delta_{h_m}$

for all interior points $x\in \mathrm{Int}(A_m)$. We assume that $m_0$ is large enough so that

(10.2) $\dfrac{\sigma^2}{2h_m^2} > \dfrac{|\mu|}{2h_m}$

for all $m\ge m_0$. If $x$ is a boundary point, i.e. $x\in\partial(A_m)$, then several definitions of the operator $\mathcal L_m$ are possible, depending on the choice of boundary conditions. Absorbing boundary conditions correspond to the choice

(10.3) $\mathcal L_m(x,y) = 0 \quad \forall\, y\in A_m.$

Reflecting boundary conditions correspond to

(10.4) $\mathcal L_m(x,x) = \mu\nabla_{h_m}(x,x) + \frac12\sigma^2\Delta_{h_m}(x,x),$
(10.5) $\mathcal L_m(x,y) = -\mathcal L_m(x,x),$

where $y\in \mathrm{Int}(A_m)$ is the closest point to $x$ in the interior of $A_m$, while $\mathcal L_m(x,y) = 0$ for all other points $y$. Periodic boundary conditions are implemented by setting

(10.6) $\mathcal L_m(x,y) = \mu\nabla_{h_m}(x,y) + \frac12\sigma^2\Delta_{h_m}(x,y)$

if $y = x$ or if $y\in\mathrm{Int}(A_m)$ is the closest point to $x$ in the interior of $A_m$, and also

(10.7) $\mathcal L_m(x,x') = \mu\nabla_{h_m}(x,y) + \frac12\sigma^2\Delta_{h_m}(x,y),$

where $x'\in\partial(A_m)$ is the boundary point at the opposite extreme of the simplex $A_m$. Finally, mixed boundary conditions can also be defined by taking a convex linear combination of generators satisfying one of the three boundary conditions above.

Theorem 16. (Convergence Estimates for Brownian Motion.) Consider a Brownian motion defined as the process with generator (10.1), where $\mu$ and $\sigma$ are constants satisfying the bounds in (10.2) for all $m\ge m_0$, and assume periodic boundary conditions. Consider the kernels

(10.8) $U_m(x,y,T) \equiv e^{T\mathcal L_m}(x,y)$

and

(10.9) $U^{\delta t}_m(x,y,T) \equiv \big(1 + \delta t\,\mathcal L_m\big)^{\frac{T}{\delta t}}(x,y).$
If $\delta t$ is such that $0 < \delta t \le \dfrac{h_m^2}{\sigma^2 - h_m|\mu|}$ and $m$ is large enough, then there is a constant $c > 0$ such that for all $m' \ge m \ge m_0$ we have the following inequalities:

(i) (10.10) $\big|\,h_m^{-1}U_m(x,y,T) - h_{m'}^{-1}U_{m'}(x,y,T)\,\big| \le c\,h_m$

(ii) (10.11) $\big|\,h_m^{-1}\nabla_{h_m}U_m(x,y,T) - h_{m'}^{-1}\nabla_{h_{m'}}U_{m'}(x,y,T)\,\big| \le c\,h_m$

(iii) (10.12) $\big|\,h_m^{-1}\Delta_{h_m}U_m(x,y,T) - h_{m'}^{-1}\Delta_{h_{m'}}U_{m'}(x,y,T)\,\big| \le c\,h_m$

(iv) (10.13) $\big|\,h_m^{-1}U_m(x,y,T) - h_m^{-1}U^{\delta t}_m(x,y,T)\,\big| \le c\,h_m$

Proof. It suffices to show these inequalities for $m' = m+1$. Let us assume for simplicity that $L = 2^m h_m$ and that $A_m = \{x = i\,h_m,\ i\in\mathbb Z,\ |i|\le 2^m\}$ for all $m\ge m_0$. The argument extends to more general lattice geometries, but the consideration of these cases would obscure the simplicity of the proof with needless detail and is thus omitted.

Let $B_m$ be the Brillouin zone as defined in equation (5.16). Let $\mathcal F_m : \ell^2(A_m)\to\ell^2(B_m)$ be the Fourier transform operator defined so that

(10.14) $\hat f(p) \equiv \mathcal F_m(f)(p) = h_m \sum_{x\in A_m} f(x)\,e^{-ipx}$

for all $p\in B_m$. The inverse Fourier transform is given by

(10.15) $\mathcal F_m^{-1}(\hat f)(x) = \frac{1}{2L}\sum_{p\in B_m} \hat f(p)\,e^{ipx}.$

The Fourier transformed generator is diagonal and is given by the operator of multiplication by

(10.16) $\hat\ell_m(p) = \mathcal F_m \mathcal L_m \mathcal F_m^{-1}(p,p) = -\,i\mu\,\frac{\sin h_m p}{h_m} + \sigma^2\,\frac{\cos h_m p - 1}{h_m^2}.$

We have

(10.17) $h_m^{-1}U_m(x,y,T) = \frac{1}{2L}\sum_{p\in B_m} e^{T\hat\ell_m(p)}\,e^{ip(y-x)}.$

Using this Fourier series representation, we find

$\big|\,h_m^{-1}U_m(x,y,T) - h_{m+1}^{-1}U_{m+1}(x,y,T)\,\big| \le \dfrac{1}{2L}\Big|\sum_{p\in B_m}\big(e^{T\hat\ell_m(p)} - e^{T\hat\ell_{m+1}(p)}\big)e^{ip(y-x)}\Big| + \dfrac{1}{2L}\Big|\sum_{p\in B_{m+1}\setminus B_m} e^{T\hat\ell_{m+1}(p)}\,e^{ip(y-x)}\Big|.$
(10.18)

Let

(10.19) $K_m = 2\sqrt{\dfrac{|\log h_{m+1}|}{\sigma^2 T}}.$

If $h_m$ is small enough, i.e. if $m$ is sufficiently large, we have that

(10.20) $\frac{1}{2L}\Big|\sum_{p\in B_m,\,|p|\ge K_m} e^{T\hat\ell_m(p)}\,e^{ip(y-x)}\Big| \le \frac{1}{2L}\sum_{p\in B_m,\,|p|\ge K_m} e^{T\,\Re(\hat\ell_m(p))} \le c\,\exp\Big(T\sigma^2\,\frac{\cos h_m K_m - 1}{h_m^2}\Big) \le c\,h_m,$

where $\Re(a)$ denotes the real part of $a\in\mathbb C$ and $c$ denotes a generic constant. Similarly,

(10.21) $\frac{1}{2L}\Big|\sum_{p\in B_{m+1},\,|p|\ge K_m} e^{T\hat\ell_{m+1}(p)}\,e^{ip(y-x)}\Big| \le \frac{1}{2L}\sum_{p\in B_{m+1},\,|p|\ge K_m} e^{T\,\Re(\hat\ell_{m+1}(p))} \le c\,\exp\Big(T\sigma^2\,\frac{\cos h_{m+1} K_m - 1}{h_{m+1}^2}\Big) \le c\,h_{m+1}.$

Set $h = h_{m+1}$, so that $h_m = 2h$. Since

(10.22) $\frac12\,h^2p^3 - \frac18\,h^4p^5 \;\le\; \frac{\sin hp}{h} - \frac{\sin 2hp}{2h} \;\le\; \frac12\,h^2p^3$

and

(10.23) $-\frac18\,h^2p^4 \;\le\; \frac{\cos hp - 1}{h^2} - \frac{\cos 2hp - 1}{4h^2} \;\le\; -\frac18\,h^2p^4 + \frac{1}{48}\,h^4p^6,$

we find that if $|p| \le 1/\sqrt h$, then

(10.24) $\big|\hat\ell_m(p) - \hat\ell_{m+1}(p)\big| \le |\mu|\,h^2|p|^3 + \sigma^2 h^2 p^4.$

Moreover, since

(10.25) $-\frac{p^2}{2} \;\le\; \frac{\cos hp - 1}{h^2} \;\le\; -\frac{p^2}{2} + \frac{1}{24}\,h^2p^4,$

we conclude that in case $|p|\le h^{-1/2}$, the following inequality holds:

(10.26) $\frac{\cos hp - 1}{h^2} \le -\frac{p^2}{4}.$

Hence, if $m$ is large enough, we find

(10.27) $\frac{1}{2L}\Big|\sum_{p\in B_m,\,|p|\le K_m}\big(e^{T\hat\ell_m(p)} - e^{T\hat\ell_{m+1}(p)}\big)e^{ip(y-x)}\Big| \le \frac{1}{2L}\sum_{p\in B_m,\,|p|\le K_m} e^{-p^2T/4}\Big(e^{|\mu| T h_m^2|p|^3 + \sigma^2 T h_m^2 p^4} - 1\Big) \le \frac{c}{2L}\sum_{p\in B_m,\,|p|\le K_m} e^{-p^2T/4}\big(|\mu|\, T h_m^2 |p|^3 + \sigma^2 T h_m^2 p^4\big) \le c\,h_m$

for some constant $c > 0$ independent of $m$. This concludes the proof of the bound in (10.10).

To estimate the sensitivity in (10.11), notice that

(10.28) $h_m^{-1}\nabla_{h_m} U_m(x,y,T) = \frac{1}{2L}\sum_{p\in B_m} e^{T\hat\ell_m(p)}\,\frac{i\sin p h_m}{h_m}\,e^{ip(y-x)}$
and

(10.29) $\big|\,h_m^{-1}\nabla_{h_m}U_m(x,y,T) - h_{m+1}^{-1}\nabla_{h_{m+1}}U_{m+1}(x,y,T)\,\big| \le \frac{1}{2L}\Big|\sum_{p\in B_m}\Big(e^{T\hat\ell_m(p)}\frac{\sin ph_m}{h_m} - e^{T\hat\ell_{m+1}(p)}\frac{\sin ph_{m+1}}{h_{m+1}}\Big)e^{ip(y-x)}\Big| + \frac{1}{2L}\Big|\sum_{p\in B_{m+1}\setminus B_m} e^{T\hat\ell_{m+1}(p)}\,\frac{\sin ph_{m+1}}{h_{m+1}}\,e^{ip(y-x)}\Big|.$

Let

(10.30) $K_m = 2\sqrt{\dfrac{|\log h_{m+1}|}{\sigma^2 T}}.$

If $h_m$ is small enough, we have that

(10.31) $\frac{1}{2L}\Big|\sum_{p\in B_m,\,|p|\ge K_m} e^{T\hat\ell_m(p)}\,\frac{\sin ph_m}{h_m}\,e^{ip(y-x)}\Big| \le \frac{1}{2L}\sum_{p\in B_m,\,|p|\ge K_m} e^{T\Re(\hat\ell_m(p))}\,\Big|\frac{\sin ph_m}{h_m}\Big| \le c\,\Big|\frac{\sin K_m h_m}{h_m}\Big|\,\exp\Big(T\sigma^2\,\frac{\cos K_m h_m - 1}{h_m^2}\Big) \le c\,h_m,$

where $c$ denotes a generic constant. Similarly,

(10.32) $\frac{1}{2L}\Big|\sum_{p\in B_{m+1},\,|p|\ge K_m} e^{T\hat\ell_{m+1}(p)}\,\frac{\sin ph_{m+1}}{h_{m+1}}\,e^{ip(y-x)}\Big| \le c\,h_{m+1}.$

If $m$ is large enough, we also find

(10.33) $\frac{1}{2L}\Big|\sum_{p\in B_m,\,|p|\le K_m}\Big(\frac{\sin ph_m}{h_m}\,e^{T\hat\ell_m(p)} - \frac{\sin ph_{m+1}}{h_{m+1}}\,e^{T\hat\ell_{m+1}(p)}\Big)e^{ip(y-x)}\Big| \le \frac{1}{2L}\sum_{p\in B_m,\,|p|\le K_m}\bigg[\Big|\frac{\sin ph_m}{h_m}\Big|\,e^{-p^2T/4}\Big(e^{|\mu|Th_m^2|p|^3+\sigma^2Th_m^2p^4} - 1\Big) + e^{-p^2T/4}\,\Big|\frac{\sin ph_{m+1}}{h_{m+1}} - \frac{\sin ph_m}{h_m}\Big|\bigg] \le c\,h_m$

for some constant $c > 0$ independent of $m$.
This concludes the proof of the bound in (10.11). The bound in (10.12) can be derived in a similar way.

Finally, consider the following Fourier representation for the discretized kernel:

(10.34) $h_m^{-1}U^{\delta t}_m(x,y,T) = \frac{1}{2L}\sum_{p\in B_m}\big(1+\delta t\,\hat\ell_m(p)\big)^{\frac{T}{\delta t}}\,e^{ip(y-x)}.$

Consider the formula

(10.35) $\big(1+\delta t\,\hat\ell_m(p)\big)^{\frac{T}{\delta t}} = \exp\Big(\frac{T}{\delta t}\,\log\big(1+\delta t\,\hat\ell_m(p)\big)\Big)$

and let us represent the difference between the kernels in (10.13) as follows:

(10.36) $\big|h_m^{-1}U_m(x,y,T) - h_m^{-1}U^{\delta t}_m(x,y,T)\big| \le \frac{1}{2L}\Big|\sum_{p\in B_m}\Big(\exp\big(T\hat\ell_m(p)\big) - \exp\Big(\frac{T}{\delta t}\log\big(1+\delta t\,\hat\ell_m(p)\big)\Big)\Big)e^{ip(y-x)}\Big| \le \frac{1}{2L}\sum_{p\in B_m,\,|p|\le K_m} e^{-p^2T/4}\,\Big|\exp\Big(\frac{T}{\delta t}\log\big(1+\delta t\,\hat\ell_m(p)\big) - T\hat\ell_m(p)\Big) - 1\Big| + \frac{1}{2L}\sum_{p\in B_m,\,|p|\ge K_m}\Big|\exp\big(T\hat\ell_m(p)\big)\Big| + \frac{1}{2L}\sum_{p\in B_m,\,|p|\ge K_m}\Big|\exp\Big(\frac{T}{\delta t}\log\big(1+\delta t\,\hat\ell_m(p)\big)\Big)\Big|,$

where $K_m$ is chosen as in (10.19). The very same bounds as above lead to the conclusion that this difference is $\le c\,h_m$. $\square$

11. Kernel Convergence Estimates for Diffusion Processes

Diffusions are a particularly important class of Markov processes which generalize Brownian motion by allowing for space-dependent drift and volatility. In this section we find kernel convergence estimates for the one-dimensional case, following (Albanese 2007b).

Let $A = (A_m)$, $m = m_0, m_0+1,\ldots$, be a simplicial sequence converging to an interval $\lim_{m\to\infty}A_m = [-L,L]\subset\mathbb R$, where $0 < L <\infty$. Let $\mu(x)$ and $\sigma(x)$ be smooth functions defined in a neighborhood of $[-L,L]$.
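The kernel convergence estimates of this section can be probed numerically. The sketch below builds the lattice generator $\mu(x)\nabla_h + \frac12\sigma(x)^2\Delta_h$ at two resolutions, propagates it by fast exponentiation, and compares the kernel densities $h^{-1}U$ on the shared nodes. The drift and volatility functions, the lattice sizes and the rows-sum-to-zero (reflecting-type) boundary treatment are illustrative assumptions, not the exact scheme analyzed in the text:

```python
import numpy as np

def diffusion_generator(N, mu, sigma):
    """Markov generator mu(x) grad_h + (1/2) sigma(x)^2 Delta_h on [-1, 1]
    with N+1 nodes; rows sum to zero (a reflecting-type boundary choice)."""
    h = 2.0 / N
    x = -1.0 + h * np.arange(N + 1)
    L = np.zeros((N + 1, N + 1))
    for j in range(N + 1):
        v = sigma(x[j]) ** 2
        if j + 1 <= N:
            L[j, j + 1] = v / (2 * h * h) + mu(x[j]) / (2 * h)
        if j - 1 >= 0:
            L[j, j - 1] = v / (2 * h * h) - mu(x[j]) / (2 * h)
        L[j, j] = -L[j].sum()
    return h, L

def kernel(L, T, dt):
    """Fast-exponentiation propagator (1 + dt L)^(T/dt), as in (10.9)/(12.2)."""
    return np.linalg.matrix_power(np.eye(L.shape[0]) + dt * L, int(round(T / dt)))

mu = lambda x: 0.1
sigma = lambda x: 0.35 + 0.05 * x
T, dt = 0.5, 2.5e-4

errs = []
for N in (40, 80):
    hc, Lc = diffusion_generator(N, mu, sigma)
    hf, Lf = diffusion_generator(2 * N, mu, sigma)
    Uc, Uf = kernel(Lc, T, dt), kernel(Lf, T, dt)
    # kernel densities h^{-1} U compared at the shared lattice nodes
    errs.append(np.max(np.abs(Uc / hc - Uf[::2, ::2] / hf)))
print(errs)  # discrepancy shrinks as the mesh is refined
```

Since the rows of the generator sum to zero exactly, the propagator conserves probability to rounding accuracy, and the discrepancy between consecutive resolutions decreases with the mesh width, in line with the estimates below.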
The generator of a diffusion on $A_m$ has the form

(11.1) $\mathcal L_m = \mu(x)\nabla_{h_m} + \frac12\sigma(x)^2\Delta_{h_m}.$

We assume that $m_0$ is large enough so that

(11.2) $\dfrac{\sigma(x)^2}{2h_m^2} > \dfrac{|\mu(x)|}{2h_m}$

for all $m\ge m_0$ and all $x\in A_m$. The definition of the generator at boundary points can be extended by imposing one of the boundary conditions of the previous section, i.e. reflecting, absorbing, periodic or mixed.

Theorem 17. (Convergence Estimates for Diffusions.) Consider a diffusion process defined as in (11.1), where $\mu(y)$ and $\sigma(y)$ are smooth functions satisfying the bounds in (11.2) for all $m\ge m_0$. Assume that boundary conditions are either periodic or absorbing. Then there is a constant $c>0$ such that

(11.3) $\big|h_m^{-1}U_m(x,y,T) - h_{m'}^{-1}U_{m'}(x,y,T)\big| \le c\,h_m$

for all $m'\ge m$ and all $y\in A_m$.

Proof. It suffices to establish the above inequality in the case $m' = m+1$; given this particular case, the general statement can be derived with an iterative argument. Let $h = h_{m+1}$, so that $h_m = 2h$.

Recall that the path integral representation for a diffusion process has the form in equation (11.4). In the special case of a time-homogeneous process, this expansion reads as follows:

(11.4) $U_m(x,y,T) = \displaystyle\sum_{q=1}^\infty \;\sum_{\substack{\gamma\in\Gamma_m:\ \gamma_0=x,\ \gamma_q=y\\ |\gamma_j-\gamma_{j-1}|=1\ \forall j\ge1}} W_m(\gamma, q, T)$

Figure 1. Contour of integration for the integral in (11.9). $C_+$ is the contour joining the point $D$ to the points $E$, $A$, $B$. $C_-$ is the contour joining the point $B$ to $C$ to $D$.

where

(11.5) $W_m(\gamma,q,T) = \displaystyle\int_0^T ds_1 \int_{s_1}^T ds_2 \cdots \int_{s_{q-1}}^T ds_q\; e^{(T-s_q)\mathcal L_m(y,y)} \prod_{j=0}^{q-1}\Big(e^{(s_{j+1}-s_j)\mathcal L_m(\gamma_j,\gamma_j)}\,\mathcal L_m(\gamma_j,\gamma_{j+1})\Big),$

where $s_0 = 0$. Let us introduce the following two constants characterizing the volatility function:

(11.6) $\Sigma_0 = \inf_{x\in A_m}\sigma(x), \qquad \Sigma_1 = \sup_{x\in A_m}\sqrt{\sigma(x)^2 + h_m|\mu(x)|},$

and let

(11.7) $M = \sup_{x\in A_m}|\mu(x)|.$
Since our interval is bounded, we have that $\Sigma_0 > 0$ and $M < \infty$.

Let us introduce the following Green's function:

(11.8) $G_m(x,y;\omega) = \displaystyle\int_0^\infty U_m(x,y,t)\,e^{-i\omega t}\,dt = \frac{1}{-\mathcal L_m + i\omega}(x,y).$

The propagator can be expressed as the following contour integral:

(11.9) $U_m(x,y,T) = \displaystyle\int_{C_-}\frac{d\omega}{2\pi}\,G_m(x,y;\omega)\,e^{i\omega T} + \int_{C_+}\frac{d\omega}{2\pi}\,G_m(x,y;\omega)\,e^{i\omega T}.$

Here, $C_+$ is the contour joining the point $D$ to the points $E$, $A$, $B$ in Fig. 1, while $C_-$ is the contour joining the point $B$ to $C$ to $D$. By design, each point $\omega$ on the upper path $C_+$ is separated from the spectrum of $\mathcal L_m$.

Lemma 1. For $m$ sufficiently large, there is a constant $c>0$ such that

(11.10) $\Big|\displaystyle\int_{C_+}\frac{d\omega}{2\pi}\,G_m(x,y;\omega)\,e^{i\omega T}\Big| \le c\,h.$

Proof. The proof is based on the geometric series expansion

(11.11) $G_m(\omega) = \dfrac{1}{-\mathcal L_m + i\omega} = \displaystyle\sum_{j=0}^\infty \frac{1}{-\frac{\sigma^2}{2}\Delta_m + i\omega}\,\bigg[\mu\nabla_m\,\frac{1}{-\frac{\sigma^2}{2}\Delta_m + i\omega}\bigg]^j,$

whose convergence for $\omega\in C_+$ can be established by means of a Kato-Rellich relative bound, see (Kato 1976). More precisely, for any $\alpha>0$ one can find a $\beta>0$ such that $\nabla_m$ and $\Delta_m$ satisfy the following relative bound estimate:

(11.12) $\|\nabla_m f\| \le \alpha\,\|\Delta_m f\| + \beta\,\|f\|$

for all periodic functions $f$ and all $m\ge m_0$. This bound can be derived by observing that $\nabla_m$ and $\Delta_m$ can be diagonalized simultaneously by a Fourier transform, as done in the previous section, and by observing that for any $\alpha>0$ one can find a $\beta>0$ such that

(11.13) $\Big|\dfrac{\sin h_m p}{h_m}\Big| \le \alpha\,\Big|\dfrac{\cos h_m p - 1}{h_m^2}\Big| + \beta$

for all $m\ge m_0$ and all $p\in B_m$. Under the same conditions, we also have that

(11.14) $\big\|\mu\nabla_m f\big\| \le \dfrac{2M\alpha}{\Sigma_0^2}\,\Big\|\dfrac{\sigma^2}{2}\Delta_m f\Big\| + M\beta\,\|f\|.$
Hence

(11.15) $\Big\|\mu\nabla_m\,\dfrac{1}{-\frac{\sigma^2}{2}\Delta_m+i\omega}\,f\Big\| \le \dfrac{2M\alpha}{\Sigma_0^2}\,\Big\|\dfrac{\sigma^2}{2}\Delta_m\,\dfrac{1}{-\frac{\sigma^2}{2}\Delta_m+i\omega}\,f\Big\| + M\beta\,\Big\|\dfrac{1}{-\frac{\sigma^2}{2}\Delta_m+i\omega}\,f\Big\| < \|f\|$

for $\omega\in C_+$, if $\alpha$ is chosen sufficiently small and if $m$ is large enough. In this case, the geometric series expansion in (11.11) converges in $\ell^2$ operator norm. The uniform norm of the kernel $|G_m(x,y;\omega)|$ is pointwise bounded from above by $h_m^{-1}$. Since the points $B$ and $D$ have imaginary part at height $4|\log h_m|/T$, the integral over the contour $C_+$ also converges and is bounded from above by $c\,h_m$ in uniform norm. $\square$

Lemma 2. If $q \ge e^2\,\dfrac{\Sigma_1^2 T}{2h_m^2}$, we have that

(11.16) $W_m(\gamma,q;0,T) \le \sqrt{\dfrac{q}{2\pi}}\;\dfrac1T\,\exp\Big(-\dfrac{\Sigma_0^2 T}{2h_m^2} - q\Big).$

Proof. Let us define the function

(11.17) $\phi(t) = \dfrac{\Sigma_1^2}{2h_m^2}\,e^{-\Sigma_0^2 t/(2h_m^2)}\,\mathbf 1(t\ge0),$

where $\mathbf 1(t\ge0)$ is the characteristic function of $\mathbb R_+$. We have that

(11.18) $W_m(\gamma,q;0,T) \le \phi^{\star q}(T),$

where $\phi^{\star q}$ is the $q$-th convolution power, i.e. the $q$-fold convolution product of the function $\phi$ with itself. The Fourier transform of $\phi(t)$ is given by

(11.19) $\hat\phi(\omega) = \dfrac{\Sigma_1^2}{2h_m^2}\displaystyle\int_0^\infty e^{-i\omega t - \Sigma_0^2 t/(2h_m^2)}\,dt = \dfrac{\Sigma_1^2}{2i\omega h_m^2 + \Sigma_0^2}.$

The convolution power is given by the following inverse Fourier transform:

(11.20) $\phi^{\star q}(T) = \displaystyle\int_{-\infty}^\infty \hat\phi(\omega)^q\,e^{i\omega T}\,\frac{d\omega}{2\pi} = \Big(\frac{\Sigma_1^2}{\Sigma_0^2}\Big)^q \int_{-\infty}^\infty \Big(1 + \frac{2i\omega h_m^2}{\Sigma_0^2}\Big)^{-q}\,e^{i\omega T}\,\frac{d\omega}{2\pi}.$

Figure 2.
Contour of integration $C_R$ for the integral in (11.21).

Introducing the new variable $z = 1 + \dfrac{2i\omega h_m^2}{\Sigma_0^2}$, the integral can be recast as follows:

(11.21) $\phi^{\star q}(T) = \dfrac{\Sigma_1^{2q}\,\Sigma_0^{2-2q}}{4\pi i\,h_m^2}\,\displaystyle\lim_{R\to\infty}\int_{C_R} z^{-q}\,\exp\Big(\frac{\Sigma_0^2 T}{2h_m^2}\,(z-1)\Big)\,dz,$

where $C_R$ is the contour in Fig. 2. Using the residue theorem and noticing that the only pole of the integrand is at $z = 0$, we find

(11.22) $\phi^{\star q}(T) = \dfrac{1}{(q-1)!}\,\Big(\dfrac{\Sigma_1^2 T}{2h_m^2}\Big)^{q-1}\,\dfrac{\Sigma_1^2}{2h_m^2}\,\exp\Big(-\dfrac{\Sigma_0^2 T}{2h_m^2}\Big).$

Making use of Stirling's formula $q! \approx \sqrt{2\pi}\,q^{q+1/2}\,e^{-q}$, we find

(11.23) $\phi^{\star q}(T) \approx \sqrt{\dfrac{q}{2\pi}}\;\dfrac1T\,\exp\Big(-\dfrac{\Sigma_0^2 T}{2h_m^2} + q\log\dfrac{\Sigma_1^2 T}{2h_m^2} + q\,(1-\log q)\Big).$

If $\log q \ge \log\dfrac{\Sigma_1^2 T}{2h_m^2} + 2$, then we arrive at the bound in (11.16). $\square$

Definition 41. (Decorating Paths.) Let $m\ge m_0$ and let $\gamma = \{y_0, y_1, y_2, \ldots\}$ be a symbolic sequence in $\Gamma_m$. A decorating path around $\gamma$ is defined as a symbolic sequence $\gamma' = \{y_0, y'_1, y'_2,\ldots\}$ with $y'_i\in h_{m+1}\mathbb Z$, containing the sequence $\gamma$ as a subset and such that, if $y'_j = y_i$ and $y'_k = y_{i+1}$, then all elements $y'_n$ with $j < n < k$ satisfy $|y'_n - y'_j| \le h_{m+1}$. Let $D_{m+1}(\gamma)$ be the set of all decorating sequences around $\gamma$. The decorated weights are defined as follows:

(11.24) $\tilde W_m(\gamma,q;0,T) = \displaystyle\sum_{q'=q}^\infty\;\sum_{\substack{\gamma'\in D_{m+1}(\gamma)\\ \gamma'_{q'}=\gamma_q}} W_{m+1}(\gamma', q'; 0, T).$

Finally, let us introduce also the following Fourier transforms:

(11.25) $\hat W_m(\gamma,q;\omega) = \displaystyle\int_0^\infty W_m(\gamma,q;0,t)\,e^{i\omega t}\,dt, \qquad \hat{\tilde W}_m(\gamma,q;\omega) = \int_0^\infty \tilde W_m(\gamma,q;0,t)\,e^{i\omega t}\,dt.$

Definition 42. (Notations.) In the following, we set $h = h_{m+1}$, so that $h_m = 2h$. We also use the Landau notation $O(h^n)$ to indicate a function $f(h)$ such that $h^{-n}f(h)$ is bounded in a neighborhood of $0$.

Lemma 3. Let $x, y\in A_m$ and let $C_-$ be an integration contour as in Fig. 1.
Then

(11.26) $\Big|\displaystyle\int_{C_-}\Big(2\,G_{m+1}(x,y;\omega) - G_m(x,y;\omega)\Big)\,e^{i\omega T}\,\frac{d\omega}{2\pi}\Big| = O(h).$

Proof. We have that

(11.27) $2\,G_{m+1}(x,y;\omega) - G_m(x,y;\omega) = \displaystyle\sum_{q=1}^\infty\;\sum_{\substack{\gamma\in\Gamma_m:\ \gamma_0=x,\ \gamma_q=y\\ |\gamma_j-\gamma_{j-1}|=1\ \forall j\ge1}} \Big(\hat{\tilde W}_m(\gamma,q;\omega) - \hat W_m(\gamma,q;\omega)\Big).$

The number of paths over which the summation is extended is

(11.28) $N(\gamma,q;x,y) \equiv \#\big\{\gamma\in\Gamma_m :\ \gamma_0=x,\ \gamma_q=y,\ |\gamma_j-\gamma_{j-1}|=1\ \forall j\ge1\big\} = \dbinom{q}{\frac{q+k}{2}},$

where $k = |y-x|/h_m$. Applying Stirling's formula we find

(11.29) $N_\gamma \lesssim 2^q\,\sqrt{\dfrac{2}{\pi q}}.$

Hence

(11.30) $\Big|\displaystyle\int_{C_-}\Big(2G_{m+1}(x,y;\omega) - G_m(x,y;\omega)\Big)e^{i\omega T}\frac{d\omega}{2\pi}\Big| \le c\sum_{q=1}^\infty \sqrt{\frac1q}\;\max_{\substack{\gamma\in\Gamma_m:\ \gamma_0=x,\ \gamma_q=y\\ |\gamma_j-\gamma_{j-1}|=1\ \forall j\ge1}}\Big|\int_{C_-}\Big(\hat{\tilde W}_m(\gamma,q;\omega) - \hat W_m(\gamma,q;\omega)\Big)e^{i\omega T}\frac{d\omega}{2\pi}\Big|$

for some constant $c\approx\sqrt{2/\pi} > 0$. It suffices to extend the summation over $q$ only up to

(11.31) $q_{\max} \equiv e^2\,\dfrac{\Sigma_1^2 T}{2h^2};$

to resum beyond this threshold, one can use the previous lemma. More precisely, we have that

(11.32) $\Big|\displaystyle\int_{C_-}\Big(2G_{m+1}(x,y;\omega) - G_m(x,y;\omega)\Big)e^{i\omega T}\frac{d\omega}{2\pi}\Big| \le c\,\sqrt{q_{\max}}\;\max_{\substack{q\le q_{\max},\ \gamma\in\Gamma_m:\ \gamma_0=x,\ \gamma_q=y\\ |\gamma_j-\gamma_{j-1}|=1\ \forall j\ge1}}\Big|\int_{C_-}\Big(\hat{\tilde W}_m(\gamma,q;\omega) - \hat W_m(\gamma,q;\omega)\Big)e^{i\omega T}\frac{d\omega}{2\pi}\Big|.$

Let $v(x) = \sigma(x)^2$.
To evaluate the resummed weight function, let us form the matrix

(11.33) $\bar{\mathcal L}(x;h) = \begin{pmatrix} -\frac{v(x+h)}{h^2} & \frac{v(x+h)}{2h^2} - \frac{\mu(x+h)}{2h} & 0 \\ \frac{v(x)}{2h^2} + \frac{\mu(x)}{2h} & -\frac{v(x)}{h^2} & \frac{v(x)}{2h^2} - \frac{\mu(x)}{2h} \\ 0 & \frac{v(x-h)}{2h^2} + \frac{\mu(x-h)}{2h} & -\frac{v(x-h)}{h^2} \end{pmatrix}$

and decompose it as follows:

(11.34) $\bar{\mathcal L}(x;h) = \dfrac{1}{h^2}\,\bar{\mathcal L}_0(x) + \dfrac{1}{h}\,\bar{\mathcal L}_1(x) + \bar{\mathcal L}_2(x) + h\,\bar{\mathcal L}_3(x) + O(h^2),$

where

(11.35) $\bar{\mathcal L}_0(x) = \begin{pmatrix} -v(x) & \frac12 v(x) & 0\\ \frac12 v(x) & -v(x) & \frac12 v(x)\\ 0 & \frac12 v(x) & -v(x)\end{pmatrix},$

(11.36) $\bar{\mathcal L}_1(x) = \begin{pmatrix} -v'(x) & \frac12\big(v'(x)-\mu(x)\big) & 0\\ \frac12\mu(x) & 0 & -\frac12\mu(x)\\ 0 & \frac12\big(\mu(x)-v'(x)\big) & v'(x)\end{pmatrix},$

(11.37) $\bar{\mathcal L}_2(x) = \begin{pmatrix} -\frac12 v''(x) & \frac14 v''(x)-\frac12\mu'(x) & 0\\ 0&0&0\\ 0& \frac14 v''(x)-\frac12\mu'(x) & -\frac12 v''(x)\end{pmatrix}$

and

(11.38) $\bar{\mathcal L}_3(x) = \begin{pmatrix} -\frac16 v'''(x) & \frac1{12}v'''(x)-\frac14\mu''(x) & 0\\ 0&0&0\\ 0& -\frac1{12}v'''(x)+\frac14\mu''(x) & \frac16 v'''(x)\end{pmatrix}.$

Let us introduce the sign variable $\tau = \pm1$ and the functions

$\phi_1(t,x,\tau) \equiv \mathcal L_m(x, x+2\tau h)\,e^{t\mathcal L_m(x,x)}\,\mathbf 1(t\ge0), \qquad \phi_2(t,x,\tau) \equiv \mathcal L_{m+1}(x+\tau h, x+2\tau h)\,e^{t\bar{\mathcal L}(x;h)}(x, x+\tau h)\,\mathbf 1(t\ge0),$

whose Fourier transforms are

$\hat\phi_1(\omega,x,\tau) = \Big(\dfrac{v(x)}{8h^2} + \tau\,\dfrac{\mu(x)}{4h}\Big)\Big(\dfrac{v(x)}{4h^2} + i\omega\Big)^{-1}$

and

(11.41) $\hat\phi_2(\omega,x,\tau) = \Big(\dfrac{v(x)}{2h^2} + \tau\,\dfrac{\mu(x)+v'(x)}{2h} + \dfrac{v''(x)}{4} + \dfrac{\mu'(x)}{2} + \Big(\dfrac{v'''(x)}{12} + \dfrac{\mu''(x)}{4}\Big)\tau h + O(h^2)\Big)\,\big\langle x\big|\big(-\bar{\mathcal L}(x;h) + i\omega\big)^{-1}\big|x+\tau h\big\rangle,$

where

(11.42) $|x\rangle = \begin{pmatrix}0\\1\\0\end{pmatrix}, \qquad |x+\tau h\rangle = \begin{pmatrix}\delta_{\tau,1}\\0\\\delta_{\tau,-1}\end{pmatrix}.$

We also require the functions

(11.43) $\psi_1(t,x) \equiv e^{t\mathcal L_m(x,x)}\,\mathbf 1(t\ge0), \qquad \psi_2(t,x) \equiv e^{t\bar{\mathcal L}(x;h)}(x,x)\,\mathbf 1(t\ge0),$

whose Fourier transforms are

$\hat\psi_1(\omega,x) = \Big(\dfrac{v(x)}{4h^2} + i\omega\Big)^{-1}, \qquad \hat\psi_2(\omega,x) = \big\langle x\big|\big(-\bar{\mathcal L}(x;h)+i\omega\big)^{-1}\big|x\big\rangle.$
(11.44)

If $\gamma$ is a symbolic sequence, then

(11.45) $\hat W_m(\gamma,q;\omega) = \hat\psi_1(\omega,\gamma_q)\,\displaystyle\prod_{j=0}^{q-1}\hat\phi_1\big(\omega;\gamma_j,\mathrm{sgn}(\gamma_{j+1}-\gamma_j)\big),$

(11.46) $\hat{\tilde W}_m(\gamma,q;\omega) = \hat\psi_2(\omega,\gamma_q)\,\displaystyle\prod_{j=0}^{q-1}\hat\phi_2\big(\omega;\gamma_j,\mathrm{sgn}(\gamma_{j+1}-\gamma_j)\big).$

Let us estimate the difference between the functions $\hat\phi_1(\omega,x,\tau)$ and $\hat\phi_2(\omega,x,\tau)$, assuming that $\omega$ is on the contour $C_-$ in Fig. 1. Retaining terms up to order $O(h^3)$, we find

(11.47) $\hat\phi_1(\omega,x,\tau) = \dfrac12\Big(1 + \dfrac{2\mu(x)\tau h}{v(x)} - \dfrac{4i\omega h^2}{v(x)} - \dfrac{8\mu(x)\,i\omega\tau h^3}{v(x)^2} - \dfrac{16\,\omega^2 h^4}{v(x)^2} + O(h^3)\Big).$

A lengthy but straightforward calculation, which is best carried out using a symbolic manipulation program, gives

(11.48) $\hat\phi_2(\omega,x,\tau) = \dfrac12\Big(1 + \dfrac{2\mu(x)\tau h}{v(x)} - \dfrac{4i\omega h^2}{v(x)} - \big[\mu(x) - v'(x)\big]\,\dfrac{8\,i\omega\tau h^3}{v(x)^2} + r(x)\,h^3\tau + i\omega h^4\,p(x) - \dfrac{16\,\omega^2 h^4}{v(x)^2} + O(h^3)\Big),$

where

(11.49) $r(x) = \dfrac{1}{2v(x)^2}\Big[\mu''(x)\,v(x)^2 - \mu(x)^3 + 2v'(x)\mu(x)^2 - v'(x)v(x)\mu'(x) - \big(\mu(x)\mu'(x) + v''(x)v(x) - v'(x)^2\big)\,\mu(x)\Big],$
$p(x) = \dfrac{1}{v(x)^3}\Big[\mu(x)^2 - v'(x)\mu(x) + 4v(x)\mu'(x) + v''(x)v(x) - v'(x)^2\Big].$

We have that

(11.50) $\displaystyle\sum_{j=0}^{q-1}\Big(\log\hat\phi_2\big(\omega;\gamma_j,\mathrm{sgn}(\gamma_{j+1}-\gamma_j)\big) - \log\hat\phi_1\big(\omega;\gamma_j,\mathrm{sgn}(\gamma_{j+1}-\gamma_j)\big)\Big) = \sum_{j=0}^{q-1}\Big(i\omega\,\frac{v'(\gamma_j)}{v(\gamma_j)} + r(\gamma_j)\Big)\,h^3\,\mathrm{sgn}(\gamma_{j+1}-\gamma_j) + \big(|\omega|\,\|p\|_\infty + 2|\omega|^2\,\|v^{-1}\|_\infty^2\big)\,O(h^4 q) = \frac{i\omega h^2}{2}\,\log\frac{v(\gamma_q)}{v(\gamma_0)} + \frac{h^2}{2}\big(R(\gamma_q) - R(\gamma_0)\big) + \big(|\omega|\,\|p\|_\infty + 2|\omega|^2\,\|v^{-1}\|_\infty^2\big)\,O(h^4 q),$

where $R(x)$ is a primitive of $r(x)$, i.e.

(11.51) $R(x) = \displaystyle\int^x r(z)\,dz.$
We conclude that there is a constant $c>0$ such that

(11.52) $\Big|\displaystyle\int_{C_-}\Big(\prod_{j=0}^{q-1}\hat\phi_2\big(\omega;\gamma_j,\mathrm{sgn}(\gamma_{j+1}-\gamma_j)\big) - \prod_{j=0}^{q-1}\hat\phi_1\big(\omega;\gamma_j,\mathrm{sgn}(\gamma_{j+1}-\gamma_j)\big)\Big)\,e^{i\omega T}\,\frac{d\omega}{2\pi}\Big| \le c\,h^3$

for all $q\le q_{\max}$. Here we use the decay of $e^{i\omega T}$ in the upper half of the complex $\omega$-plane to offset the $\omega$ dependencies in the integrand. Similar calculations lead to the following expansions:

(11.53) $\hat\psi_1(\omega,y) = \dfrac{4h^2}{v(y)} + O(\omega h^4), \qquad \hat\psi_2(\omega,y) = \dfrac{2h^2}{v(y)} + O(\omega h^4) = \dfrac12\,\hat\psi_1(\omega,y) + O(\omega h^4).$

Since $q < c\,h^{-2}$ and $|\omega| \le 4|\log h|/T$ on $C_-$, we find

(11.54) $\Big|\displaystyle\int_{C_-}\Big(2\,G_{m+1}(x,y;\omega) - G_m(x,y;\omega)\Big)\,e^{i\omega T}\,\frac{d\omega}{2\pi}\Big| \le c\,q_{\max}\,h^3 \le c\,h.$

This completes the proof of the Lemma and of the Theorem. $\square$

12. Convergence of Time Discretization Schemes

In this section we analyze the convergence of the time-discretized kernel that is obtained by means of fast exponentiation.

Let $A = (A_m)$, $m = m_0, m_0+1,\ldots$, be a simplicial sequence converging to an interval $\lim_{m\to\infty}A_m = [0,L]\subset\mathbb R$, where $0 < L <\infty$. Let $\mu(x)$ and $\sigma(x)$ be functions on $[0,L]$ satisfying all the conditions in the previous section, and consider the generator of a diffusion on $A_m$ of the form

(12.1) $\mathcal L_m = \mu(x)\nabla_{h_m} + \frac12\sigma(x)^2\Delta_{h_m}.$

Theorem 18. (Convergence Estimates for Fast Exponentiation.) Let $\delta t > 0$ and consider the discretized kernel

(12.2) $U^{\delta t}_m(x,y,T) = \big(1+\delta t\,\mathcal L_m\big)^{\frac{T}{\delta t}}(x,y),$

where $\mathcal L_m$ is the operator in (12.1) and $\delta t$ is so small that

(12.3) $\displaystyle\min_{x\in A_m}\big(1 + \delta t\,\mathcal L_m(x,x)\big) > 0.$

Assume that boundary conditions are either periodic or absorbing and that the ratio $\frac{T}{\delta t} = N$ is an integer. Then there is a constant $c>0$ such that

(12.4) $\big|h_m^{-1}U_m(x,y,T) - h_m^{-1}U^{\delta t}_m(x,y,T)\big| \le c\,h_m$

for all $m\ge m_0$ and all $y\in A_m$.

Proof.
A Dyson expansion can also be obtained for the time-discretized kernel and has the form

(12.5) $U^{\delta t}_m(y_0,y,T) = \displaystyle\sum_{q=1}^\infty\;\sum_{\substack{\gamma\in\Gamma_m:\\ \gamma_0=y_0,\ \gamma_q=y}}\;\sum_{k_1=1}^N\,\sum_{k_2=k_1+1}^N\cdots\sum_{k_q=k_{q-1}+1}^N \big(1+\delta t\,\mathcal L_m(\gamma_0,\gamma_0)\big)^{k_1-1}\,(\delta t)^q\,\prod_{j=1}^q \mathcal L_m(\gamma_{j-1},\gamma_j)\,\big(1+\delta t\,\mathcal L_m(\gamma_j,\gamma_j)\big)^{k_{j+1}-k_j-1},$

where $t_{q+1} = T$ and $k_{q+1} = N$. In this case, the propagator can be expressed through a Fourier integral as follows:

(12.6) $U^{\delta t}_m(y_0,y,T) = \displaystyle\int_{-\pi/\delta t}^{\pi/\delta t} G^{\delta t}_m(y_0,y;\omega)\,e^{i\omega T}\,\frac{d\omega}{2\pi},$

where

(12.7) $G^{\delta t}_m(y_0,y;\omega) = \delta t\,\displaystyle\sum_{j=0}^{T/\delta t} U^{\delta t}_m(y_0,y,j\,\delta t)\,e^{-i\omega j\,\delta t}.$

The propagator can also be represented as the limit

(12.8) $U^{\delta t}_m(y_0,y,T) = \displaystyle\lim_{H\to\infty}\int_{C_H} G^{\delta t}_m(y_0,y;\omega)\,e^{i\omega T}\,\frac{d\omega}{2\pi},$

where $C_H$ is the contour in Fig. 3. This is due to the fact that the integrals along the segments $BC$ and $DA$ are the negatives of each other, while the integral over $CD$ tends to zero exponentially fast as $\Im(\omega)\to\infty$, where $\Im(\omega)$ is the imaginary part of $\omega$. Using Cauchy's theorem, the contour in Fig. 3 can be deformed into the contour in Fig. 1. To estimate the discrepancy between the time-discretized kernel and the continuous-time one, one can thus compare the Green's functions along such a contour. Again, the only arc that requires detailed attention is the arc $BCD$, as the integral over the rest of the contour of integration can be bounded from above as in the previous section.

Let $h = h_m$ and let us introduce the two functions

(12.9) $\phi(t,x,\tau) \equiv \mathcal L_m(x,x+\tau h)\,e^{t\mathcal L_m(x,x)}\,\mathbf 1(t\ge0),$
(12.10) $\phi^{\delta t}(j,x,\tau) \equiv \mathcal L_m(x,x+\tau h)\,\big(1+\delta t\,\mathcal L_m(x,x)\big)^{j-1}.$

Figure 3.
Contour of integration for the integral in (12.8).

and the corresponding Fourier transforms

(12.11) $\hat\phi(\omega,x,\tau) = \displaystyle\int_0^\infty \phi(t,x,\tau)\,e^{-i\omega t}\,dt = \Big(\frac{v(x)}{2h^2} + \tau\,\frac{\mu(x)}{2h}\Big)\Big(\frac{v(x)}{h^2} + i\omega\Big)^{-1},$

(12.12) $\hat\phi^{\delta t}(\omega,x,\tau) = \delta t\displaystyle\sum_{j=1}^{T/\delta t} \phi^{\delta t}(j,x,\tau)\,e^{-i\omega j\,\delta t} = \Big(\frac{v(x)}{2h^2} + \tau\,\frac{\mu(x)}{2h}\Big)\Big(\frac{e^{i\omega\,\delta t}-1}{\delta t} + \frac{v(x)}{h^2}\Big)^{-1}.$

We have that

(12.13) $\hat\phi^{\delta t}(\omega,x,\tau) = \Big(\dfrac{v(x)}{2h^2} + \tau\,\dfrac{\mu(x)}{2h}\Big)\Big(i\omega + \dfrac{v(x)}{h^2} - \dfrac{\omega^2\delta t}{2} + O(\delta t^2)\Big)^{-1} = \hat\phi(\omega,x,\tau) + \dfrac{\omega^2 h^2\,\delta t}{4\,v(x)} + O(h^2\delta t^2) = \hat\phi(\omega,x,\tau) + O(\omega^2 h^4),$

where the last step uses the fact that $\delta t = O(h^2)$.

Let us also introduce the functions

(12.14) $\psi(t,x) \equiv e^{t\mathcal L_m(x,x)}\,\mathbf 1(t\ge0), \qquad \psi^{\delta t}(j,x) \equiv \big(1+\delta t\,\mathcal L_m(x,x)\big)^{j}\,\mathbf 1(j\ge0),$

and the corresponding Fourier transforms

(12.15) $\hat\psi(\omega,x) = \Big(\dfrac{v(x)}{h^2} + i\omega\Big)^{-1}, \qquad \hat\psi^{\delta t}(\omega,x) = \Big(\dfrac{e^{i\omega\,\delta t}-1}{\delta t} + \dfrac{v(x)}{h^2}\Big)^{-1}.$

Again we find that

(12.16) $\hat\psi(\omega,x) = \hat\psi^{\delta t}(\omega,x) + O(\omega^2 h^4).$

If $\gamma$ is a symbolic sequence, then let us set

(12.17) $\hat W_m(\gamma,q;\omega) = \hat\psi(\omega,\gamma_q)\,\displaystyle\prod_{j=0}^{q-1}\hat\phi\big(\omega;\gamma_j,\mathrm{sgn}(\gamma_{j+1}-\gamma_j)\big),$

(12.18) $\hat W^{\delta t}_m(\gamma,q;\omega) = \hat\psi^{\delta t}(\omega,\gamma_q)\,\displaystyle\prod_{j=0}^{q-1}\hat\phi^{\delta t}\big(\omega;\gamma_j,\mathrm{sgn}(\gamma_{j+1}-\gamma_j)\big).$

We have that

(12.19) $G^{\delta t}_m(x,y;\omega) - G_m(x,y;\omega) = \displaystyle\sum_{q=1}^\infty\;\sum_{\substack{\gamma\in\Gamma_m:\ \gamma_0=x,\ \gamma_q=y\\ |\gamma_j-\gamma_{j-1}|=1\ \forall j\ge1}}\Big(\hat W^{\delta t}_m(\gamma,q;\omega) - \hat W_m(\gamma,q;\omega)\Big).$

The integration over the contour in Fig. 1 can again be split into an integration over the contour $C_-$ and an integration over $C_+$. The integral over $C_+$ can be bounded from above as in Lemma 1.
Furthermore, we have that

(12.20) $\Big|\displaystyle\int_{C_-}\Big(G^{\delta t}_m(x,y;\omega) - G_m(x,y;\omega)\Big)\,e^{i\omega T}\,\frac{d\omega}{2\pi}\Big| \le c\,\sqrt{q_{\max}}\;\max_{\substack{q\le q_{\max},\ \gamma\in\Gamma_m:\ \gamma_0=x,\ \gamma_q=y\\ |\gamma_j-\gamma_{j-1}|=1\ \forall j\ge1}}\Big|\int_{C_-}\Big(\hat W^{\delta t}_m(\gamma,q;\omega) - \hat W_m(\gamma,q;\omega)\Big)\,e^{i\omega T}\,\frac{d\omega}{2\pi}\Big| \le c\,h. \qquad\square$

13. Hypergeometric Brownian Motion

This section is based on work in collaboration with Joe Campolieti, Peter Carr and Alex Lipton, see (Albanese et al. 2001). Consider a Markov process on a simplicial sequence $A_m\subset\mathbb R^d$ with generator $\mathcal L_m(x,x';t)$. Let $\rho$ be a real-valued parameter and suppose $f^1_m(x)$ and $f^2_m(x)$ are two linearly independent solutions of the equation

(13.1) $\displaystyle\sum_{x'\in A_m}\mathcal L_m(x,x';t)\,f^i_m(x') = \rho\,f^i_m(x), \qquad i = 1,2,$

for all $x\in\mathrm{Int}(A_m)$. Notice that on the boundary $\partial(A_m)$ these equations may fail and, in actual applications, they will indeed as a rule fail. Consider the function

(13.2) $g_m(x;t) = e^{-\rho t}\big(c_1 f^1_m(x) + c_2 f^2_m(x)\big)$

for some choice of constants $c_1, c_2$ such that this function is strictly positive. This function satisfies the equation

(13.3) $\dfrac{\partial g_m(x;t)}{\partial t} + (\mathcal L_m\,g_m)(x;t) = 0.$

Hence $g_m(x;t)$ defines a measure change, and one can construct a new Markov generator by setting

(13.4) $\mathcal L^g_m(x,x';t) = \dfrac{g_m(x';t)}{g_m(x;t)}\,\mathcal L_m(x,x';t) + \dfrac{1}{g_m(x;t)}\,\dfrac{\partial g_m(x;t)}{\partial t}\,\delta_{xx'}.$

Consider the linear fractional transformation

(13.5) $Y_m(x) = \dfrac{c_3 f^1_m(x) + c_4 f^2_m(x)}{c_1 f^1_m(x) + c_2 f^2_m(x)}$

for some choice of constants $c_3, c_4$.

Theorem 19. (Linear Fractional Transformations.) The process $Y_m(x_t)$ satisfies the martingale condition

(13.6) $\displaystyle\lim_{s\downarrow0}\,E_t\bigg[\frac{Y_m(x_{t+s}) - Y_m(x_t)}{s}\bigg] = 0$

for all $x_t\in\mathrm{Int}(A_m)$.

Proof.
We have that

(13.7) $\displaystyle\lim_{s\downarrow0}E_t\bigg[\frac{Y_m(x_{t+s})-Y_m(x_t)}{s}\bigg] = \sum_{x'\in A_m}\mathcal L^g_m(x,x')\big(Y_m(x') - Y_m(x)\big) = \frac{1}{g_m(x,t)}\sum_{x'\in A_m}\mathcal L_m(x,x')\,e^{-\rho t}\big(c_3 f^1_m(x') + c_4 f^2_m(x')\big) + \frac{e^{-\rho t}\big(c_3 f^1_m(x) + c_4 f^2_m(x)\big)}{g_m(x,t)^2}\,\frac{\partial g_m(x;t)}{\partial t} = -\frac{\partial}{\partial t}\bigg[\frac{e^{-\rho t}\big(c_3 f^1_m(x) + c_4 f^2_m(x)\big)}{g_m(x,t)}\bigg] = 0,$

where we used that the rows of $\mathcal L^g_m$ sum to zero as a consequence of (13.3), and that the ratio in square brackets equals $Y_m(x)$ and is therefore independent of $t$. $\square$

This Theorem provides a general methodology for constructing a process which is nearly a martingale out of a Markov process. In the particular case of one-dimensional diffusion processes, the construction gives rise to families, with up to 7 adjustable parameters, of analytically solvable diffusions whose drift is zero within the interior of the domain of definition. What makes the case of one-dimensional diffusions special is the fact that the function $Y_m(x)$ is invertible in the limit as $m\to\infty$, i.e. either monotone increasing or monotone decreasing in this limit.

More specifically, consider the diffusion with Markov generator

(13.8) $\mathcal L_m = \delta(x)\nabla_{h_m} + \frac12\nu(x)^2\Delta_{h_m},$

where $x\in\mathbb R$. Two important cases are the CIR process (reducing to the Bessel equation in the continuous limit), for which

(13.9) $\delta(x) = \lambda_0 + \lambda_1 x, \qquad \nu(x) = \nu_0\sqrt x$

with $x\in\mathbb R_+$, and the Jacobi process (reducing to the Gauss hypergeometric equation with ${}_2F_1$ solutions in the continuous limit), for which

(13.10) $\delta(x) = \lambda_0 + \lambda_1 x, \qquad \nu(x) = \nu_0\sqrt{x(1-x)}$

and $x\in[0,1]$. Let $f^j_m(x)$, $j = 1,2$, be solutions of (13.1) and let $f^j(x)$, $j=1,2$, be their continuous limits, satisfying

(13.11) $\mathcal L f^j = \rho f^j$

within the interior of the domain $A = \lim_{m\to\infty}A_m$. Here

(13.12) $\mathcal L = \delta(x)\dfrac{\partial}{\partial x} + \dfrac12\,\nu(x)^2\dfrac{\partial^2}{\partial x^2}.$

Theorem 20. (Invertibility.) Let $f^j(x)$, $j=1,2$, be functions satisfying equation (13.11) in the interior of the corresponding domain of definition and let

(13.13) $Y(x) = \dfrac{c_3 f^1(x) + c_4 f^2(x)}{c_1 f^1(x) + c_2 f^2(x)}$
for some choice of constants $c_1, c_2, c_3, c_4$ such that the denominator in this equation has no zeros. Then we have that

(13.14) $Y(x) = \displaystyle\int^x \frac{W(y)}{g(y)^2}\,dy + \mathrm{const},$

where

(13.15) $W(x) = \pm\,\sigma_0\exp\Big(-\displaystyle\int^x \frac{2\,\delta(y)}{\nu(y)^2}\,dy\Big),$

where either sign is allowed and $\sigma_0$ is a positive constant. In particular, the function $Y(x)$ is invertible.

Proof. Let us introduce the Wronskian

(13.16) $W(x) = \dfrac{dh(x)}{dx}\,g(x) - \dfrac{dg(x)}{dx}\,h(x),$

where

(13.17) $g(x) = c_1 f^1(x) + c_2 f^2(x), \qquad h(x) = c_3 f^1(x) + c_4 f^2(x).$

A direct calculation shows that

(13.18) $\dfrac{d}{dx}Y(x) = \dfrac{W(x)}{g(x)^2}$

and

(13.19) $\dfrac{d}{dx}W(x) = -\dfrac{2\,\delta(x)}{\nu(x)^2}\,W(x).$

Hence, (13.15) gives the solution to the equation in (13.19). $\square$

Let $X(y)$ be the inverse of the function $Y(x)$. We have that

(13.20) $\dfrac{dy}{\sigma(y)} = \pm\,\dfrac{dX(y)}{\nu(X(y))},$

where

(13.21) $\sigma(y) = Y'(X(y))\,\nu(X(y)) = \sigma_0\,\dfrac{\nu(X(y))\,\exp\Big(-\int^{X(y)}\frac{2\,\delta(s)}{\nu(s)^2}\,ds\Big)}{g(X(y),\rho)^2}.$

Theorem 21. (Kernel mapping.) In the limit $m\to\infty$, the propagator density for the process $Y_t$, with zero drift and volatility as in (13.21), is given by

(13.22) $U(y_0,y,t) = \dfrac{\nu(X(y))}{\sigma(y)}\,\dfrac{g(X(y),\rho)}{g(X(y_0),\rho)}\,e^{-\rho t}\,u\big(X(y_0), X(y), t\big),$

where $u(x_0,x,t)$ is the propagator density for the process with generator (13.12).

Let us consider in detail the case of the CIR model in (13.9). If $\lambda_1 = 0$, then the pricing kernel for the state variable is expressed in terms of modified Bessel functions as follows:

(13.23) $u(x,t;x_0,0) = \dfrac{2}{\nu_0^2 t}\Big(\dfrac{x}{x_0}\Big)^{\frac{\lambda_0}{\nu_0^2}-\frac12}\,e^{-2(x+x_0)/(\nu_0^2 t)}\,I_{\frac{2\lambda_0}{\nu_0^2}-1}\Big(\dfrac{4\sqrt{x\,x_0}}{\nu_0^2\,t}\Big).$

The generating function is

(13.24) $\hat v(x,\rho) = x^{\frac12\big(1-\frac{2\lambda_0}{\nu_0^2}\big)}\bigg[q_1\,I_{\frac{2\lambda_0}{\nu_0^2}-1}\Big(\sqrt{\dfrac{8\rho x}{\nu_0^2}}\Big) + q_2\,K_{\frac{2\lambda_0}{\nu_0^2}-1}\Big(\sqrt{\dfrac{8\rho x}{\nu_0^2}}\Big)\bigg]$

with arbitrary constants $q_1, q_2$.
Here $I_\nu(z)$ is the modified Bessel function of order $\nu$ and $K_\nu(z)$ is the associated MacDonald function. In this case we obtain a dual family with 6 adjustable parameters.

In case $\lambda_1 < 0$, the propagator density for the state variable $x$ can still be expressed in terms of modified Bessel functions as follows:

(13.25)   u(x,t;x_0,0) = c_t \left(\frac{x\,e^{-\lambda_1 t}}{x_0}\right)^{\lambda_0/\nu_0^2 - 1/2} \exp\!\big[-c_t\big(x_0 e^{\lambda_1 t} + x\big)\big]\; I_{2\lambda_0/\nu_0^2 - 1}\!\left(2 c_t \sqrt{x x_0 e^{\lambda_1 t}}\right),

where $c_t \equiv 2\lambda_1/\big(\nu_0^2 (e^{\lambda_1 t} - 1)\big)$. The generating function is

(13.26)   \hat v(x,\rho) = x^{-\lambda_0/\nu_0^2}\, e^{-\lambda_1 x/\nu_0^2} \left[ q_1\, W_{k,m}\!\left(-\frac{2\lambda_1}{\nu_0^2}\,x\right) + q_2\, M_{k,m}\!\left(-\frac{2\lambda_1}{\nu_0^2}\,x\right) \right]

for arbitrary constants $q_1, q_2$. Here $W_{k,m}(\cdot)$ and $M_{k,m}(\cdot)$ are Whittaker functions, which can also be expressed in terms of confluent hypergeometric functions or in terms of Kummer functions (Abramowitz and Stegun 1972). This construction gives rise to a dual family with 7 free parameters, where

(13.27)   k = \frac{\lambda_0}{\nu_0^2} + \frac{\rho}{\lambda_1}, \qquad m = \frac{\lambda_0}{\nu_0^2} - \frac12.

The 7-parameter family which reduces to the CIR model has a local volatility function defined on either an interval or a half line; it behaves asymptotically as the CEV volatility on one end and as a quadratic model on the other. This hybrid shape allows for greater flexibility.

Next, we show that classic exact solutions in the literature, namely quadratic and CEV models, can all be rediscovered as particular cases of our general formula for the Bessel family, where we make use of the above solutions to the underlying $x$-space process with $\lambda_1 = 0$ and $\lambda_0 \equiv \lambda$. Without loss of generality, we can fix $\nu_0 = 2$. Let us specialize further to the case where

(13.28)   Y(x) = \bar y - a\,\frac{K_{\lambda/2-1}(\sqrt{2\rho x})}{I_{\lambda/2-1}(\sqrt{2\rho x})},

which leads to a transformed process $y_t = Y(x_t)$ with volatility

(13.29)   \sigma(y) = \frac{a}{\sqrt{X(y)}\,\big[I_{\lambda/2-1}\big(\sqrt{2\rho X(y)}\big)\big]^2},

where $x = X(y)$ is the inverse of the function in equation (13.28). In this family, $a$ and $\rho$ are positive, $\bar y$ is arbitrary and $\lambda > 2$.
The function $Y(x)$ maps the half line $x \in [0,\infty)$ into $y \in (-\infty, \bar y\,]$, where $Y(x)$ is a strictly monotonically increasing function with $dY(x)/dx = \sigma(Y(x))/\nu(x)$. This solution region can be inverted so that $y \in [\,\bar y, \infty)$. This is accomplished either by replacing $a$ by $-a$ in equation (13.28) or by applying the linear change of variables that maps $y$ into $2\bar y - y$. In this special case, we make use of the generating function in equation (13.24) with the choice $q_2 = 0$, and formula (13.22) reduces to

(13.30)   U(y,t;y_0,0) = \frac{e^{-\rho t - (X(y)+X(y_0))/(2t)}}{a\,t}\; X(y)\; \frac{\big[I_{\lambda/2-1}\big(\sqrt{2\rho X(y)}\big)\big]^3}{I_{\lambda/2-1}\big(\sqrt{2\rho X(y_0)}\big)}\; I_{\lambda/2-1}\!\left(\frac{\sqrt{X(y)X(y_0)}}{t}\right).

We note that this density integrates exactly to unity in $y$ space (i.e. no absorption).

Example 4. (Quadratic volatility models.) Pricing kernels for quadratic volatility models are readily obtained as a subset of the above general family with the special choice of parameter $\lambda = 3$. After making the substitution $y \to 2\bar y - y$ and setting $a = (\bar y - \bar{\bar y})/\pi$, the transformation function $Y(x)$ becomes

(13.31)   Y(x) = \bar y + \frac{\bar y - \bar{\bar y}}{\pi}\,\frac{K_{1/2}(\sigma_1\sqrt{x}/2)}{I_{1/2}(\sigma_1\sqrt{x}/2)} = \bar y + \frac{\bar y - \bar{\bar y}}{\exp(\sigma_1\sqrt{x}) - 1},

with $\sigma_1 > 0$. Here, we assume that $\bar y > \bar{\bar y}$. The inverse transformation $X(y)$ is given by

(13.32)   X(y) = \frac{1}{\sigma_1^2}\,\log^2\!\left[1 + \frac{\bar y - \bar{\bar y}}{y - \bar y}\right],

and the volatility function $\sigma(y)$ is obtained by insertion into equation (13.29), using the Bessel functions of order $1/2$:

(13.33)   \sigma(y) = \sigma_1\,\frac{(y - \bar y)(y - \bar{\bar y})}{\bar y - \bar{\bar y}}.

Inserting the expression (13.32) into equation (13.30), one obtains the pricing kernel

(13.34)   U(y,t;y_0,0) = \frac{\sqrt{2}\, e^{-\sigma_1^2 t/8}}{\sigma(y)\sqrt{\pi t}}\, \sqrt{\frac{(y_0 - \bar y)(y_0 - \bar{\bar y})}{(y - \bar y)(y - \bar{\bar y})}}\; e^{-(\phi(y)^2 + \phi(y_0)^2)/(2\sigma_1^2 t)}\, \sinh\!\left(\frac{\phi(y)\,\phi(y_0)}{\sigma_1^2 t}\right),

where $\phi(y) \equiv \log\big((y - \bar{\bar y})/(y - \bar y)\big)$.
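The algebra relating (13.31)–(13.33) can be confirmed numerically. The sketch below (arbitrary illustrative parameters) checks that $Y$ and $X$ are mutual inverses and that $|Y'(x)|\,\nu(x)$ reproduces the quadratic volatility (13.33), with $\nu(x) = 2\sqrt{x}$ since $\nu_0 = 2$:

```python
import numpy as np

# Quadratic-model consistency check: Y (13.31), its inverse X (13.32) and
# the local volatility sigma(y) (13.33).  Parameters are illustrative.
ybar, yybar, sig1 = 2.0, 0.5, 0.7
c = ybar - yybar

x = np.linspace(0.2, 4.0, 8001)
Y = ybar + c / (np.exp(sig1 * np.sqrt(x)) - 1.0)
X = (1.0 / sig1**2) * np.log(1.0 + c / (Y - ybar))**2
inv_err = np.max(np.abs(X - x))                  # X(Y(x)) should return x

dY = np.gradient(Y, x)                           # Y is decreasing on this branch
sigma_y = sig1 * (Y - ybar) * (Y - yybar) / c    # (13.33), positive here
vol_err = np.max(np.abs(np.abs(dY) * 2.0 * np.sqrt(x) - sigma_y)[5:-5])
```

The inversion error is at rounding level; the volatility identity holds up to the finite-difference error of `np.gradient`.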
In the special case of a volatility function with a double root, i.e.

(13.35)   \sigma(y) = \sigma_1 (y - \bar y)^2,

the pricing kernel is computed by taking the limit as $\bar{\bar y} \to \bar y$ (while rescaling the volatility constant accordingly), and one finds

(13.36)   U(y,t;y_0,0) = \frac{y_0 - \bar y}{\sqrt{2\pi t}\,\sigma_1 (y - \bar y)^3} \left[ e^{-\big((y-\bar y)^{-1} - (y_0-\bar y)^{-1}\big)^2/(2\sigma_1^2 t)} - e^{-\big((y-\bar y)^{-1} + (y_0-\bar y)^{-1}\big)^2/(2\sigma_1^2 t)} \right].

Example 5. (Lognormal models.) The pricing kernel for the log-normal Black–Scholes model with $\sigma(y) = \sigma_1 y$ is a particular case of the above formula for the quadratic model. The derivative with respect to $y$ of the quadratic volatility function in (13.33), evaluated at $y = \bar y$, is $\sigma_1$. Taking the limit $\bar{\bar y} \to -\infty$ (or $\bar{\bar y} \ll \bar y$), while holding the other parameters fixed, one obtains $\sigma(y) = \sigma_1(y - \bar y)$. The pricing kernel in (13.34) gives the kernel for the log-normal model in the limit $\bar{\bar y} \to -\infty$, i.e.

(13.37)   U(y,t;y_0,0) = \frac{1}{(y - \bar y)\,\sigma_1\sqrt{2\pi t}}\,\exp\!\left[-\left(\log\frac{y - \bar y}{y_0 - \bar y} + \frac{\sigma_1^2 t}{2}\right)^2 \Big/ \big(2\sigma_1^2 t\big)\right].

Example 6. (CEV model.) The constant-elasticity-of-variance (CEV) model is recovered in the limiting case as $\rho \to 0$. Assume $\lambda > 2$ and let $\theta$ be defined by $\lambda = \theta^{-1} + 2$. The transformation $y = Y(x)$,

(13.38)   Y(x) = \bar y + \big(\sigma_0^2\, x\big)^{-(2\theta)^{-1}},

has inverse $x = X(y)$ given by

(13.39)   X(y) = \sigma_0^{-2}\,(y - \bar y)^{-2\theta},

for any constant $\bar y$. The volatility function for this model is

(13.40)   \sigma(y) = \frac{\sigma_0}{|\theta|}\,(y - \bar y)^{1+\theta}.

In the limit $\rho \to 0$, the Laplace transform $\hat v(X(y), 0) = 1$, which implies that the numeraire change is trivial in this case. The pricing kernel can be evaluated by substitution into the general formula (13.22), and after collecting terms, it turns out to be

U(y,t;y_0,0) = \frac{|\theta|}{\sigma_0^2\, t}\, \frac{(y_0 - \bar y)^{1/2}}{(y - \bar y)^{3/2 + 2\theta}}\; e^{-\big((y-\bar y)^{-2\theta} + (y_0-\bar y)^{-2\theta}\big)/(2\sigma_0^2 t)}\; I_{(2\theta)^{-1}}\!\left(\frac{\big((y - \bar y)(y_0 - \bar y)\big)^{-\theta}}{\sigma_0^2\, t}\right).
(13.41)

This formula was derived in the case $\theta > 0$, for which the limiting value $y = \bar y$ is not attained and the density is easily shown to integrate to unity (i.e. no absorption occurs and the density also vanishes at the endpoint $y = \bar y$). We note that the same formula solves the propagator equation for $\theta < 0$, leading to the same Bessel equation of order $\pm(2\theta)^{-1}$. For $\theta \in (-\infty, -1)$ the density vanishes at $y = \bar y$ (i.e. paths do not attain the lower endpoint), hence no absorption occurs. In contrast, for $\theta \in (-1, -1/2)$ the density has an integrable singularity at $y = \bar y$, for which paths can also attain the lower endpoint, but are not absorbed. [Note that only in the range $\theta \in (-1/2, 0)$ is the above pricing kernel not useful, since it gives rise to a density that has a non-integrable singularity at $y = \bar y$. In this case, however, another solution that is integrable is obtained by replacing the order $(2\theta)^{-1}$ by $-(2\theta)^{-1}$ in the Bessel function. The latter solution for the density does not integrate to unity and hence gives rise to absorption, which can be of use to price options in a credit setting.] The special case $\theta = -1$ corresponds to a Brownian motion reflected at $\bar y$, defined on the half line $[\,\bar y, \infty)$, with

(13.42)   U(y,t;y_0,0) = \frac{1}{\sigma_0\sqrt{2\pi t}}\left( e^{-(y-y_0)^2/(2\sigma_0^2 t)} + e^{-(y + y_0 - 2\bar y)^2/(2\sigma_0^2 t)} \right).

14. Stochastic Integrals for Diffusion Processes

This section provides a new derivation of a theorem by Cameron, Feynman, Girsanov, Ito, Kac and Martin; see (Cameron and Martin 1949), (Ito 1949), (Feynman 1948) and (Kac 1948). This result is at the center of stochastic calculus and, in the following sections, we derive far-reaching extensions and applications.

Theorem 22. (Cameron–Feynman–Girsanov–Ito–Kac–Martin.) Consider a diffusion process on the simplicial sequence $A_m \subset \mathbb{R}$ and of generator

(14.1)   \mathcal{L}_m = \mu(x,t)\,\nabla_{h_m} + \frac{\sigma(x,t)^2}{2}\,\Delta_{h_m}.
Consider also the process given by the integral

(14.2)   I_t = \int_0^t a(x_s,s)\,dx_s + b(x_s,s)\,ds,

where $a(x,t)$ and $b(x,t)$ are smooth functions in both arguments. Let us introduce also the function $\phi(x,t) = \int^x a(y,t)\,dy$. We have that

(i) Ito's Lemma in integral form.

(14.3)   I_t = \phi(x_t,t) - \phi(x_0,0) + J_t + O(h),

where

(14.4)   J_t = \int_0^t \left( b(x_s,s) - \frac{\sigma(x_s,s)^2}{2}\,a'(x_s,s) - \dot\phi(x_s,s) \right) ds,

where $\dot\phi(x,t) = \frac{\partial}{\partial t}\phi(x,t)$ and $a'(x,t) = \frac{\partial}{\partial x}a(x,t)$.

(ii) Feynman–Kac formula. The characteristic function of $J_t$ on the bridge leading from $x$ to $y$ is given by

(14.5)   E\big[ e^{ipJ_t}\,\delta(x_t = y) \,\big|\, x_0 = x \big] = \mathcal{P}\exp\left( \int_0^t \left( \mathcal{L}(s) + ip\,b(s) - ip\,\frac{\sigma(s)^2}{2}a'(s) - ip\,\dot\phi(s) + O(h) \right) ds \right)(x,y).

(iii) Ito's Lemma in differential form. Let $\phi(x,t)$ be a smooth function in both arguments. Then we have that

(14.6)   \lim_{s\downarrow 0} E_t\big[ s^{-1}\big(\phi(x_{t+s},t+s) - \phi(x_t,t)\big) \,\big|\, x_t = x \big] = \frac{\partial\phi}{\partial t}(x,t) + \mu(x,t)\frac{\partial\phi}{\partial x}(x,t) + \frac{\sigma(x,t)^2}{2}\frac{\partial^2\phi}{\partial x^2}(x,t) + O(h_m)

and

(14.7)   \lim_{s\downarrow 0} E_t\big[ s^{-1}\big(\phi(x_{t+s},t+s) - \phi(x_t,t)\big)^2 \,\big|\, x_t = x \big] = \sigma(x,t)^2 \left(\frac{\partial\phi}{\partial x}(x,t)\right)^2 + O(h_m).

For all $n \ge 3$ instead we have that

(14.8)   \lim_{h\downarrow 0}\lim_{s\downarrow 0} E_t\big[ s^{-1}\big(\phi(x_{t+s},t+s) - \phi(x_t,t)\big)^n \,\big|\, x_t = x \big] = 0.

Proof. Consider a discretization of the process $I_t = (\Delta I)\,n_t$. This can be accomplished by introducing a simplicial complex of one more dimension.
Eventually in the proof, we shall take the limit as $\Delta I \to 0$ and $h \to 0$. The lifted generator is

(14.9)   \tilde{\mathcal{L}}(x,n;x',n';t) = \left(\frac{\sigma(x,t)^2}{2h^2} + \frac{\mu(x,t)}{2h}\right)\delta_{x',x+h}\;\delta\!\left(n' - n - \left[\frac{a(x,t)\,h}{\Delta I}\right]\right) + \left(\frac{\sigma(x,t)^2}{2h^2} - \frac{\mu(x,t)}{2h}\right)\delta_{x',x-h}\;\delta\!\left(n' - n + \left[\frac{a(x,t)\,h}{\Delta I}\right]\right) - \frac{\sigma(x,t)^2}{h^2}\,\delta_{x'x}\,\delta(n - n') + \frac{b(x,t)}{\Delta I}\big(\delta_{n',n+1} - \delta_{n'n}\big),

where $[a]$ stands for the nearest integer to $a$. The partial Fourier transform in the $n$ variable of this kernel is

(14.10)   \tilde{\mathcal{L}}_p(x,x';t) = \sum_n \tilde{\mathcal{L}}(x,x',n;t)\, e^{-i\,\Delta I\, p\, n} = \left(\frac{\sigma(x,t)^2}{2h^2} + \frac{\mu(x,t)}{2h}\right)\exp\!\left(-i\left[\frac{h\,a(x,t)}{\Delta I}\right]\Delta I\, p\right)\delta_{x',x+h} + \left(\frac{\sigma(x,t)^2}{2h^2} - \frac{\mu(x,t)}{2h}\right)\exp\!\left(i\left[\frac{h\,a(x,t)}{\Delta I}\right]\Delta I\, p\right)\delta_{x',x-h} - \frac{\sigma(x,t)^2}{h^2}\,\delta_{x'x} + \frac{b(x,t)}{\Delta I}\big(e^{-ip\,\Delta I} - 1\big).

In the limit as $\Delta I \to 0$ and $h \to 0$, we find

(14.11)   \tilde{\mathcal{L}}_p(x,x';t) = \frac{\sigma(x,t)^2}{2}\Delta_h + \big(\mu(x,t) - ip\,\sigma(x,t)^2 a(x,t)\big)\nabla_h - ip\,a(x,t)\,\mu(x,t) - \frac{p^2\sigma(x,t)^2 a(x,t)^2}{2} - ip\,b(x,t) + O(h).

Introducing the function $\phi(x,t)$ defined above, we find that

(14.12)   e^{ip\phi(x,t)}\,\mathcal{L}(x,x';t)\,e^{-ip\phi(x',t)} = \tilde{\mathcal{L}}_p(x,x';t) + ip\,b(x,t)\,\delta_{xx'} - ip\,\frac{\sigma(x,t)^2}{2}\,a'(x,t)\,\delta_{xx'} + O(h).

This equality can be rearranged as follows:

e^{-ip\phi(x,t)}\,\tilde{\mathcal{L}}_p(x,x';t)\,e^{ip\phi(x',t)} + ip\,\dot\phi(x,t)\,\delta_{xx'} = \mathcal{L}(x,x';t) - ip\,b(x,t)\,\delta_{xx'} + ip\,\frac{\sigma(x,t)^2}{2}\,a'(x,t)\,\delta_{xx'} + ip\,\dot\phi(x,t)\,\delta_{xx'} + O(h).
(14.13)

Hence

(14.14)   \mathcal{P}\exp\left(\int_0^t \tilde{\mathcal{L}}_p(s)\,ds\right)(x,x') = e^{ip(\phi(x,0) - \phi(x',t))}\; \mathcal{P}\exp\left(\int_0^t ds\,\Big( \mathcal{L}(s) - ip\big(b(s) - \tfrac{\sigma(s)^2}{2}a'(s) - \dot\phi(s)\big) + O(h) \Big)\right)(x,x').

The joint kernel is thus given by

(14.15)   \mathcal{P}\exp\left(\int_0^t ds\,\tilde{\mathcal{L}}(s)\right)(x,I;x',I') = \int \frac{dp}{2\pi}\, e^{ip[I' - I - \phi(x',t) + \phi(x,0)]}\; \mathcal{P}\exp\left(\int_0^t ds\,\Big( \mathcal{L}(s) - ip\big(b(s) - \tfrac{\sigma(s)^2}{2}a'(s) - \dot\phi(s)\big) + O(h) \Big)\right)(x,x').

This formula proves both statements in the Theorem. \square

When in Definition 30 we introduced the notion of measure change, the convergence and smoothness in the continuum limit of the measure-change function $G_y(y';t)$ were not emphasized. However, this concept is important, as stressed by Girsanov's theorem below, of which we give two independent proofs.

Theorem 23. (Girsanov.) Consider two lattice diffusions on the simplicial sequence $A_m$ with generators

(14.16)   \mathcal{L}_m = \mu(x,t)\,\nabla_h + \frac{\sigma(x,t)^2}{2}\,\Delta_h, \qquad \bar{\mathcal{L}}_m = \bar\mu(x,t)\,\nabla_h + \frac{\sigma(x,t)^2}{2}\,\Delta_h,

respectively. Here $\mu(x,t)$, $\bar\mu(x,t)$, $\sigma(x,t)$ are assumed to be smooth functions. Then the Markov generators $\mathcal{L}_m$ and $\bar{\mathcal{L}}_m$ are related by the smooth measure change with function

(14.17)   G^m_{xt}(x') = \exp\left( \frac{\bar\mu(x,t) - \mu(x,t)}{\sigma(x,t)^2}\,(x' - x) \right)

and the Radon–Nikodym derivative is given by

(14.18)   \rho(x_\cdot) = \exp\left\{ \int_T^{T'} \frac{\bar\mu(x_t,t) - \mu(x_t,t)}{\sigma(x_t,t)^2}\,dx_t - \frac{\bar\mu(x_t,t)^2 - \mu(x_t,t)^2}{2\,\sigma(x_t,t)^2}\,dt + O(h) \right\}.

First Proof. Let $v(x,t) = \bar\mu(x,t) - \mu(x,t)$ and let $a(x,t) = v(x,t)/\sigma(x,t)^2$. Consider the measure change with

(14.19)   G^m_{xt}(x') = e^{a(x,t)(x'-x)}.
The off-diagonal elements of the transformed Markov generator are

(14.20)   \bar{\mathcal{L}}(x,x\pm h;t) = \left(\frac{\sigma(x,t)^2}{2h^2} \pm \frac{\mu(x,t)}{2h}\right) e^{\pm v(x,t)\,h/\sigma(x,t)^2} = \frac{\sigma(x,t)^2}{2h^2} \pm \frac{\bar\mu(x,t)}{2h} + \frac{v(x,t)^2}{4\,\sigma(x,t)^2} + \frac{\mu(x,t)\,v(x,t)}{2\,\sigma(x,t)^2} + O(h).

The diagonal elements instead change as follows:

(14.21)   \bar{\mathcal{L}}(x,x;t) = \mathcal{L}(x,x;t) - \frac{v(x,t)^2}{2\,\sigma(x,t)^2} - \frac{\mu(x,t)\,v(x,t)}{\sigma(x,t)^2} + O(h). \qquad\square

Second Proof. Let us consider two generators differing by the drift term,

(14.22)   \mathcal{L}(t) = \frac{\sigma(x,t)^2}{2}\,\Delta_h + \mu(x,t)\,\nabla_h, \qquad \bar{\mathcal{L}}(t) = \frac{\sigma(x,t)^2}{2}\,\Delta_h + \bar\mu(x,t)\,\nabla_h,

and consider the formula

(14.23)   E\big[ e^{\int_0^t a(x_s,s)dx_s + b(x_s,s)ds}\,\delta(x_t = y) \,\big|\, x_0 = x \big] = e^{\phi(y,t) - \phi(x,0)}\; \mathcal{P}\exp\left( \int_0^t ds\left( \mathcal{L}(s) + b(s) - \frac{\sigma(s)^2}{2}a'(s) + O(h) \right)\right)(x,y).

To derive Girsanov's theorem, let us ask for what choice of the functions $a(x,t)$, $b(x,t)$ the right-hand side equals $e^{t\bar{\mathcal{L}}}(x,y;t)$ up to terms of order $O(h)$. Since

(14.24)   e^{\phi}\left( \mathcal{L} + b - \frac{\sigma^2}{2}a' \right) e^{-\phi} = \frac{\sigma(x)^2}{2}\Delta_h + \big(a(x)\sigma(x)^2 + \mu(x)\big)\nabla_h + \frac{\sigma(x)^2 a(x)^2}{2} + \mu(x)\,a(x) + b(x) + O(h),

we see that the right-hand side equals $\bar{\mathcal{L}}$ up to terms of order $O(h)$ if

(14.25)   a(x) = \frac{\bar\mu(x) - \mu(x)}{\sigma(x)^2}, \qquad b(x) = \frac{\mu(x)^2 - \bar\mu(x)^2}{2\,\sigma(x)^2}. \qquad\square

15. Markov Bridges

A first simple application of this general framework leads to an extension of the results in the previous section to the case of a general Markov generator. Consider a Markov process on the simplicial sequence $A_m = h_m\mathbb{Z} \cap [-L,L] \subset \mathbb{R}$. According to Theorem 14 and assuming that hypotheses MG1, MG2, MG3, MG4 hold, the symbol of a Markov generator in canonical form can be expressed as follows:

\hat{\mathcal{L}}_m(x,p;t) = i\,\tilde\mu(x;t)\,\frac{\sin ph}{h} + \tilde\sigma(x;t)^2\,\frac{\cos ph - 1}{h^2} + \sum_{y\in A_m} \left( e^{ip(y-x)} - 1 - i\,\frac{\sin ph}{h}\,(y - x) \right)\tilde\lambda(x,y;t).
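Referring back to Theorem 23, the drift produced by the lattice measure change (14.19) can be checked with a few lines of arithmetic. The sketch below (illustrative values) confirms that the transformed generator carries the drift $\bar\mu$ up to $O(h^2)$:

```python
import numpy as np

# Lattice Girsanov check: multiplying the off-diagonal generator elements by
# G(x,x') = exp(a*(x'-x)) with a = (mu_bar - mu)/sigma^2 reproduces the
# target drift mu_bar up to O(h^2).  All numbers are illustrative.
mu, mu_bar, sigma2 = 0.3, 1.1, 0.8
a = (mu_bar - mu) / sigma2

def implied_drift(h):
    up = (sigma2 / (2 * h**2) + mu / (2 * h)) * np.exp(a * h)    # L(x, x+h) transformed
    dn = (sigma2 / (2 * h**2) - mu / (2 * h)) * np.exp(-a * h)   # L(x, x-h) transformed
    return (up - dn) * h                                         # first moment per unit time

d1, d2 = implied_drift(1e-2), implied_drift(1e-3)
err1, err2 = abs(d1 - mu_bar), abs(d2 - mu_bar)
```

Shrinking $h$ by a factor of 10 shrinks the drift error by roughly a factor of 100, consistent with the $O(h^2)$ claim.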
Although not necessary for the validity of the calculation and of the convergence in the limit $h \to 0$, we assume for simplicity that the generator has the form

(15.1)   \mathcal{L}_m(x,y;t) = \mu(x;t)\,\nabla_h + \frac{\sigma(x;t)^2}{2}\,\Delta_h + h_m\,\lambda_m(x,y;t),

where the functions $\mu(x;t)$ and $\sigma(x;t)$ are smooth in both arguments, $\lambda_m(x,y;t) \ge 0$ for $x \ne y$, and we have that

(15.2)   \lambda_m(x,x;t) = -\sum_{y \ne x,\ y\in A_m} \lambda_m(x,y;t).

Consider a stochastic integral of the form

(15.3)   I_t = \int_0^t a(x_s,s)\,dx_s + b(x_s,s)\,ds,

where $a(x,t)$ and $b(x,t)$ are smooth functions in both arguments. Let $\phi(x,t) = \int^x a(y,t)\,dy$. The following result holds:

Theorem 24. (Markov Bridges.) The joint probability distribution function for $I_t$ and the underlying process on the bridge leading from $x$ to $x'$ is given by

(15.4)   \mathcal{P}\exp\left(\int_0^t \tilde{\mathcal{L}}_m(s)\,ds\right)(x,I;x',I') = \int \frac{dp}{2\pi}\, e^{ip[I' - I - \phi(x') + \phi(x)]}\; \mathcal{P}\exp\left(\int_0^t \Big( \mathcal{L}_m(s) - ip\big(b(s) - \tfrac{\sigma^2}{2}a'(s) - \dot\phi(s)\big) + \kappa_{m,p}(s) + O(h) \Big) ds \right)(x,x'),

where $\kappa_{m,p}(t)$ is the operator of matrix elements

(15.5)   \kappa_{m,p}(x,x';t) = h_m\,\lambda_m(x,x';t)\left( \exp\Big( -ip\big[ a(x,t)(x'-x) + \phi(x,t) - \phi(x',t) \big] \Big) - 1 \right).

Proof. Consider again a discretization of the process $I_t = (\Delta I)\,n_t$ and the lifted generator

(15.6)   \tilde{\mathcal{L}}_m(x,n;x',n') = \mathcal{L}_m(x;x')\,\delta\!\left( n' - n - \left[\frac{a(x)(x'-x)}{\Delta I}\right] \right) + \frac{b(x)}{\Delta I}\big( \delta_{n',n+1} - \delta_{n'n} \big),

where $[a]$ stands for the nearest integer to $a$.
The partial Fourier transform in the $n$ variable of this kernel is

(15.7)   \tilde{\mathcal{L}}_{m,p}(x,x') = \sum_n \tilde{\mathcal{L}}(x,x',n)\, e^{-i(\Delta I)pn} = \mathcal{L}_m(x;x')\,\exp\!\left( -i(\Delta I)p\left[\frac{a(x)(x'-x)}{\Delta I}\right] \right) + \frac{b(x)}{\Delta I}\big( e^{-ip(\Delta I)} - 1 \big).

In the limit as $(\Delta I) \to 0$ and $h \to 0$, we find

(15.8)   \tilde{\mathcal{L}}_{m,p}(x,x') = \frac{\sigma(x)^2}{2}\Delta_h + \big(\mu(x) - ip\,\sigma(x)^2 a(x)\big)\nabla_h - ip\,a(x)\mu(x) - \frac{p^2\sigma(x)^2 a(x)^2}{2} - ip\,b(x) + h_m\,\lambda_m(x,x';t)\,e^{-ip\,a(x)(x'-x)} + O(h).

Introducing the function $\phi(x,t)$ defined above, we find that

(15.9)   e^{ip\phi(x,t)}\,\mathcal{L}_m(x,x';t)\,e^{-ip\phi(x',t)} = \tilde{\mathcal{L}}_{m,p}(x,x';t) + ip\,b(x,t)\,\delta_{xx'} - ip\,\frac{\sigma(x,t)^2}{2}\,a'(x,t)\,\delta_{xx'} + h_m\,\lambda_m(x,x';t)\,\exp\Big( -ip\big[ a(x,t)(x'-x) + \phi(x,t) - \phi(x',t) \big] \Big) + O(h_m).

Hence

(15.10)   \exp\big( t\,\tilde{\mathcal{L}}_{m,p} \big)(x,x';t) = e^{ip(\phi(x,t) - \phi(x',t))}\; \mathcal{P}\exp\left( \int_0^t \Big( \mathcal{L}_m(s) - ip\big(b(s) - \tfrac{\sigma(s)^2}{2}a'(s) - \dot\phi(s)\big) + \kappa_{m,p}(s) + O(h_m) \Big) ds \right)(x,x').

The joint kernel is thus given by (15.4). \square

16. Abelian Processes in Continuous Time

This section is based on work in collaboration with Harry Lo and Alex Mijatovic; see (Albanese et al. a). Abelian processes are path-dependent processes which provide an extension of the notion of stochastic integral, for which one can extend the Feynman–Kac theorem and Ito's formula of Section 14.

Let $A_m$, $m \ge m_0$, be a simplicial sequence, consider a time interval $[T,T']$ and a stochastic process described by the Markov generator $\mathcal{L}_m(y_1;y_2;t)$. To numerically exponentiate, it is crucial in most application examples to first reduce dimensionality by means of block diagonalisations, or else the multiplication of the lifted generators would not be feasible.
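As a toy illustration of the dimensional reduction that makes such exponentiations feasible, the sketch below builds a lifted generator for a counting integral on a small periodic lattice and verifies that exponentiating $M$ small Fourier blocks reproduces the exponential of the full lifted matrix. Everything here — sizes, rates, the choice of $b$, the periodic counter — is an illustrative assumption, not the text's own example:

```python
import numpy as np

def expm(A, squarings=20, terms=14):
    """Matrix exponential by scaling-and-squaring with a Taylor series."""
    B = A / 2.0**squarings
    E = np.eye(A.shape[0], dtype=complex)
    T = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        T = T @ B / k
        E = E + T
    for _ in range(squarings):
        E = E @ E
    return E

N, M, t, dI = 15, 16, 0.7, 0.25
xs = np.arange(N)
b = 0.5 + 0.4 * np.cos(2 * np.pi * xs / N)       # integrand of I_t, kept >= 0

# generator of x_t: symmetric nearest-neighbour walk, periodic boundary
L = np.zeros((N, N))
for i in range(N):
    L[i, (i + 1) % N] += 0.5
    L[i, (i - 1) % N] += 0.5
    L[i, i] -= 1.0

# lifted generator on (x, n): counter jumps n -> n+1 (mod M) at rate b(x)/dI,
# so the lifting is a convolution in n and a partial DFT block-diagonalises it
Lt = np.zeros((N * M, N * M))
for i in range(N):
    for n in range(M):
        for j in range(N):
            Lt[i * M + n, j * M + n] += L[i, j]
        Lt[i * M + n, i * M + (n + 1) % M] += b[i] / dI
        Lt[i * M + n, i * M + n] -= b[i] / dI

full = expm(t * Lt.astype(complex))[0].reshape(N, M)   # start from (x=0, n=0)

# block route: one N x N block per Fourier mode p_k = 2*pi*k/M
recon = np.zeros((N, M), dtype=complex)
for k in range(M):
    pk = 2 * np.pi * k / M
    block = expm(t * (L + np.diag(b / dI * (np.exp(-1j * pk) - 1))))
    for n in range(M):
        recon[:, n] += np.exp(1j * pk * n) * block[0, :] / M

err_block = np.max(np.abs(full - recon))
mass = full.real.sum()
```

One $240\times 240$ exponential is replaced by sixteen $15\times 15$ ones; the two routes agree to rounding precision and the joint distribution sums to one.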
It turns out that this is possible for payoffs of the type above, as these fall in the general class outlined by the following definition:

Definition 43. (Markov Generator.) Let $\mathcal{L}(y_1;y_2;t)$ be a Markov generator and let us consider a lifting of the form

(16.1)   \bar{\mathcal{L}}(y_1,k_1;y_2,k_2;t) = \mathcal{L}(y_1;y_2;t)\,\delta_{k_1 k_2} + A(y_1,k_1;y_2,k_2;t),

where $k = 0, \dots, K$. This lifting is called Abelian if the operators $A(y_1;y_2;t)$, defined for each pair $y_1, y_2 \in \Lambda$ and all times $t \in [T,T']$ as the linear operators of matrix elements

(16.2)   A(y_1;y_2;t)_{k_1,k_2} \equiv A(y_1,k_1;y_2,k_2;t),

are mutually commuting in the sense that

(16.3)   A(y_1,y_2)\,A(y_3,y_4) = A(y_3,y_4)\,A(y_1,y_2)

for all $y_1, y_2, y_3, y_4 \in \Lambda$ and all $t \in [T,T']$.

Lemma 4. If all the matrices $A(y_1;y_2;t)_{k_1,k_2}$, $y_1, y_2 \in \Lambda$, $t \in [T,T']$, are mutually commuting, and if furthermore they are all diagonalizable, then they can be diagonalized simultaneously at any given point in time, i.e. there exists a time-dependent matrix $V(k,i;t)$, $i = 0, \dots, K$, such that $V(t)^{-1} A(y_1;y_2;t)\,V(t) = \Lambda(y_1;y_2;t)$, where $\Lambda(y_1;y_2;t)$ is a diagonal matrix of the form $\big(\lambda_{i_1}(y_1;y_2;t)\,\delta_{i_1,i_2}\big)$, for any $t \in [T,T']$.

Proof. Fix a $t \in [T,T']$. If $y_1, y_2, y_3, y_4 \in \Lambda$ and $A(y_1;y_2;t)\,\psi(t) = \lambda(y_1;y_2;t)\,\psi(t)$ for some vector $\psi(t)$, then

(16.4)   A(y_1;y_2;t)\,A(y_3;y_4;t)\,\psi(t) = A(y_3;y_4;t)\,A(y_1;y_2;t)\,\psi(t) = \lambda(y_1;y_2;t)\,A(y_3;y_4;t)\,\psi(t).

Hence, if $\psi(t)$ is an eigenvector of $A(y_1;y_2;t)$, then $A(y_3;y_4;t)\,\psi(t)$ is also an eigenvector of $A(y_1;y_2;t)$. If $\psi(t)$ is an eigenvector of multiplicity one, this shows that $\psi(t)$ is also an eigenvector of $A(y_3;y_4;t)$ for all $y_3, y_4$. Otherwise, we conclude that the eigenspace of $A(y_1;y_2;t)$ of eigenvalue $\lambda$ is invariant under $A(y_3,y_4)$ for any $y_3, y_4$.
Iterating the argument above on each eigenspace, one can constructively obtain a common set of invariant eigenspaces shared by all the operators $A(y_1;y_2;t)$ for $y_1, y_2 \in \Lambda$. Hence, for any given $t \in [T,T']$, these operators can be diagonalized simultaneously by the matrix whose column vectors span bases for all the common eigenspaces. \square

Let $V(k,i;t)$, $i = 0, \dots, K$, be a matrix which simultaneously diagonalizes all members of the family of operators $A(y_1;y_2;t)$, so that $V(t)^{-1} A(y_1;y_2;t)\,V(t) = \Lambda(y_1;y_2;t)$ is diagonal. Consider the lifted matrix $V(k,y_1;i,y_2;t) = V(k;i;t)\,\delta_{y_1 y_2}$ and the transformed lifted Markov generator

(16.5)   \big(V(t)^{-1}\,\bar{\mathcal{L}}\,V(t)\big)(y_1,i_1;y_2,i_2;t) = \mathcal{L}(y_1;y_2;t)\,\delta_{i_1 i_2} + \Lambda(y_1;t)_{i_1}\,\delta_{y_1,y_2}\,\delta_{i_1 i_2}.

Since this matrix is block-diagonal, the problem of exponentiating it reduces to the problem of exponentiating the blocks separately. This reduces the overall numerical complexity and makes it comparable with the complexity of evaluating propagators at $K$ different time points. As a further simplification, to exponentiate this block-diagonal matrix it is necessary to hold in memory only the blocks, not the entire matrix. Notice that the blocks are in general complex matrices. Hence, to exponentiate them quickly one needs to invoke the Level-3 BLAS matrix-matrix multiplication routine for real or complex matrices (sgemm or cgemm, respectively), depending on whether the block-diagonalizing transformation is real or complex valued.

Example 7. (Stochastic Integrals.) Stochastic integrals over diffusion processes provide examples of Abelian processes. This was already used in the previous section, but it is worthwhile here to stress the link between the computability of the characteristic function of the path-dependent process on a bridge and the block-diagonalizability under partial Fourier transforms of the lifted generator.
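A minimal concrete instance of the simultaneous diagonalisation in Lemma 4 (a stand-in chosen for illustration, not an example from the text) is the family of circulant matrices: all circulants commute and share the discrete Fourier transform matrix as a common eigenvector matrix, which is exactly the structure a block-diagonalisation exploits:

```python
import numpy as np

# Circulant matrices: a commuting (Abelian) family simultaneously
# diagonalised by the unitary DFT matrix.
rng = np.random.default_rng(0)
K = 8

def circulant(c):
    """K x K circulant matrix whose k-th row is c rolled by k."""
    return np.array([np.roll(c, k) for k in range(len(c))])

A1 = circulant(rng.standard_normal(K))
A2 = circulant(rng.standard_normal(K))
comm = np.max(np.abs(A1 @ A2 - A2 @ A1))     # commutator vanishes

idx = np.arange(K)
V = np.exp(2j * np.pi * np.outer(idx, idx) / K) / np.sqrt(K)   # unitary DFT
Vinv = V.conj().T

def offdiag(D):
    return np.max(np.abs(D - np.diag(np.diag(D))))

off1 = offdiag(Vinv @ A1 @ V)                # both become diagonal
off2 = offdiag(Vinv @ A2 @ V)
```

The same $V$ diagonalises every member of the family at once, so a lifting built from such operators splits into independent blocks under conjugation by the lifted $V$.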
Working directly with Markov generators, one can also generalize the results of the previous sections. Consider the integral

(16.6)   I_t = I(y_\cdot,t) \equiv \int_T^t \left( \phi(y_s;s) + \chi(y_{s-},y_s;s)\,\lim_{\Delta t\downarrow 0} \frac{\psi(y_s;s) - \psi(y_{s-\Delta t};s-\Delta t)}{\Delta t} \right) ds,

where $\phi(y;s)$, $\psi(y;s)$ and $\chi(y_1;y_2;s)$ are continuous functions. Consider the problem of finding its distribution on a bridge for the underlying Markov process $y_t$. Introducing the discretisation

(16.7)   I_t \approx (\Delta I)\,m_t,

where $m_t \in [0, \dots, N]$ and $(\Delta I) > 0$, the lifted generator for the pair of processes $(y_t, (\Delta I)m_t)$ is

(16.8)   \bar{\mathcal{L}}(y_1,m_1;y_2,m_2;t) = \mathcal{L}(y_1;y_2;t)\,\delta\big( m_2 - m_1 - \big[(\Delta I)^{-1}\chi(y_1;y_2;t)(\psi(y_2;t) - \psi(y_1;t))\big] \big) + \delta_{y_1,y_2}\,\frac{\phi(y_1;t)}{\Delta I}\,\big(\delta_{m_1+1,m_2} - \delta_{m_1,m_2}\big).

Theorem 25. (Abelian Bridges.) The joint kernel of $I_t$ and the underlying process on the bridge leading from $y_1$ to $y_2$ is given by

(16.9)   \mathcal{P}\,e^{\int_T^t \tilde{\mathcal{L}}_m(s)\,ds}(y_1,I_1;y_2,I_2) = \int \frac{dp}{2\pi}\, e^{ip(I_2 - I_1)}\; \mathcal{P}\exp\left( \int_T^t \big( \mathcal{L}_m(s) - ip\,\phi(s) + \kappa_{m,p}(s) + O(h) \big)\,ds \right)(y_1,y_2),

where $\kappa_{m,p}(t)$ is the operator with matrix elements

(16.10)   \kappa_{m,p}(y_1,y_2;t) = \mathcal{L}_m(y_1,y_2;t)\left( \exp\Big( -ip\,\chi(y_1,y_2,t)\,(\psi(y_2,t) - \psi(y_1,t)) \Big) - 1 \right).

Notice that this theorem extends the result of the previous section on Markov bridges, as it includes multifactor processes such as processes with stochastic volatility and, more generally, regime switching.

Example 8.
(Double Liftings.) There are situations that emerge in practice where one has to track two integrals over paths, one of the form (16.6) and a similar one of the form

(16.11)   I'_t = I'(y_\cdot,t) \equiv \int_T^t \left( \phi'(y_s,s) + \chi'(y_{s-},y_s,s)\,\lim_{\Delta t\downarrow 0} \frac{\psi'(y_s,s) - \psi'(y_{s-\Delta t},s-\Delta t)}{\Delta t} \right) ds.

In this case one can introduce a similar approximation,

(16.12)   I'_t \approx \bar I'_t = (\Delta I')\,n_t,

where $n_t \in [0, \dots, N']$, and a double lifting for the generator:

(16.13)   \bar{\mathcal{L}}(y_1,m_1,n_1;y_2,m_2,n_2;t) = \mathcal{L}(y_1;y_2;t)\; \delta\!\left( m_2 - m_1 - \left[\frac{\chi(y_1;y_2;t)(\psi(y_2;t) - \psi(y_1;t))}{\Delta I}\right] \right) \delta\!\left( n_2 - n_1 - \left[\frac{\chi'(y_1;y_2;t)(\psi'(y_2;t) - \psi'(y_1;t))}{\Delta I'}\right] \right) + \delta_{y_1,y_2}\,\frac{\phi(y_1;t)}{\Delta I}\,\delta_{m_1+1,m_2}\,\delta_{n_1,n_2} + \delta_{y_1,y_2}\,\frac{\phi'(y_1;t)}{\Delta I'}\,\delta_{n_1+1,n_2}\,\delta_{m_1,m_2}.

Theorem 26. (Multifactor Bridges.) The joint kernel of $I_t$, $I'_t$ and the underlying process on the bridge leading from $y_1$ to $y_2$ is given by the double Fourier transform

(16.14)   e^{t\tilde{\mathcal{L}}_m}(y_1,I_1;y_2,I_2) = \int \frac{dp}{2\pi}\, e^{ip(I_2-I_1)} \int \frac{dp'}{2\pi}\, e^{ip'(I'_2-I'_1)}\; \mathcal{P}\exp\left( \int_T^t \big( \mathcal{L}_m(s) - ip\,\phi(s) - ip'\,\phi'(s) + \kappa_{m,p,p'}(s) + O(h) \big)\,ds \right)(y_1,y_2),

where $\kappa_{m,p,p'}(t)$ is the operator with matrix elements

(16.15)   \kappa_{m,p,p'}(y_1;y_2;t) = \mathcal{L}_m(y_1;y_2;t)\left( e^{-ip\,\chi(y_1;y_2;t)(\psi(y_2;t)-\psi(y_1;t))}\, e^{-ip'\,\chi'(y_1;y_2;t)(\psi'(y_2;t)-\psi'(y_1;t))} - 1 \right).

Example 9.
(The Max Process.) Consider the path-dependent quantity given by the running maximum attained by a given function of a Markov process, i.e.

(16.16)   K_t = K(y_\cdot,t) \equiv \max_{s\in[T,t]} \chi(y_s;s).

Let us introduce the approximation

(16.17)   K_t \approx \bar K_t = \alpha\,k_t,

where $\alpha$ is a constant and $k_t$ is an integer-valued process whose paths take values in $k_t \in \{0, \dots, M\}$. The dynamics of the joint process $(y_t,k_t)$ is defined by the lifted generator $\bar{\mathcal{L}}$ on $\bar\Lambda$ such that

(16.18)   \bar{\mathcal{L}}(y_1,k_1;y_2,k_2;t) = \mathcal{L}(y_1;y_2;t)\,\delta_{k_1,k_2} + A(y_1)_{k_1,k_2}\,\delta_{y_1 y_2},

where

(16.19)   A(y)_{k_1,k_2} = \begin{cases} A_0 & \text{if } \chi(y) > \alpha k_1 \text{ and } k_2 = [\chi(y)/\alpha], \\ 0 & \text{otherwise}, \end{cases}

with the diagonal adjusted so that rows sum to zero. Here $[a]$ stands for the nearest integer to $a$, and $A_0 > 0$ is chosen to be a large number, as the approximation converges in the limit $A_0 \to \infty$ and $\alpha \to 0$. A direct calculation shows that the operators $A(y)_{k_1,k_2}$ commute, and hence the maximum of the underlying process is itself an Abelian path-dependent process.

17. Abelian Processes in Discrete Time and non-Resonance Conditions

This section is based on work in collaboration with Manlio Trovato (Albanese and Trovato 2005) and Paul Jones; see (Albanese and Jones 2007). An important class of path-dependent options requires computing the joint distribution of the underlying lattice process and of a discrete sum of the following form:

(17.1)   J_t = J(y_\cdot,t) \equiv \sum_{i=1}^n \psi(y_{t_{i-1}}, y_{t_i}; t_i),

where $n$ is an integer, $t_i = i\,\Delta T$ and $t = n\,\Delta T$. Suppose that the Markov generator $\mathcal{L}$ is time-homogeneous in the interval $[t_0, t_n]$ and consider the elementary propagator

(17.2)   U_i(y_1,y_2) = \mathcal{P}\exp\left( \int_{t_i}^{t_{i+1}} \mathcal{L}(s)\,ds \right)(y_1,y_2).

To find the joint transition probability, one can again discretize the variable $J_t$ so that

(17.3)   J_t \approx (\Delta J)\,k_t.
As opposed to lifting the generator, as in the continuous-time case discussed in the previous section, here we lift the elementary propagator itself and form the joint propagator

(17.4)   \tilde U_i(y_1,k_1;y_2,k_2) = U_i(y_1,y_2)\,\delta\big( k_2 - k_1 - \big[\psi_i(y_1,y_2)\,(\Delta J)^{-1}\big] \big).

This operator can be block-diagonalized by means of a partial Fourier transform (Albanese and Trovato 2005).

Theorem 27. Consider the Fourier-transformed operator $\tilde U_{i,p}$ of matrix elements

(17.5)   \tilde U_{i,p}(y_1;y_2) = U_i(y_1,y_2)\, e^{-ip\,\psi_i(y_1,y_2)}.

Then we have that

(17.6)   \left( \mathcal{P}\prod_{i=0}^{n-1} \tilde U_i \right)(y_1,J_1;y_2,J_2) = \int \frac{dp}{2\pi}\, e^{ip(J_2-J_1)} \left( \mathcal{P}\prod_{i=0}^{n-1} \tilde U_{i,p} \right)(y_1,y_2).

This case also falls under a more general class of Abelian liftings.

Definition 44. (Propagator Liftings.) Let $U_i(y_1,y_2)$ be a family of Markov propagators indexed by $i = 0, 1, \dots, n-1$, defined over the time intervals $[t_i, t_{i+1}]$. Consider a propagator lifting of the form

(17.7)   \bar U_i(y_1,k_1;y_2,k_2) = U_i(y_1,y_2)\,Q_i(y_1,y_2)_{k_1 k_2},

where $k_1, k_2 = 0, \dots, K$. This lifting is called Abelian if the operators $Q_i(y_1,y_2)$ satisfy the following commutation condition:

(17.8)   Q_i(y_1,y_2)\,Q_i(y'_1,y'_2) - Q_i(y'_1,y'_2)\,Q_i(y_1,y_2) = 0

for all $y_1, y_2, y'_1, y'_2 \in A_m$ and all time intervals $i = 0, \dots, n-1$.

Let $V(k,j)$, $j = 0, \dots, K$, be a matrix that simultaneously diagonalizes all operators of the family $Q(y_1,y_2)_{k_1 k_2}$, so that $V^{-1} Q(y_1,y_2)\,V = \Lambda(y_1,y_2)$ is diagonal. Consider the lifted matrix $V(k,y_1;j,y_2) = V(k,j)\,\delta_{y_1 y_2}$ and the transformed lifted propagator

(17.9)   \big( V^{-1}\,\bar U_i\,V \big)(y_1,j_1;y_2,j_2) = U_i(y_1;y_2)\,\Lambda_i(y_1,y_2)_{j_1}\,\delta_{j_1 j_2}.

Since this matrix is block-diagonal, the evaluation of the time-ordered product

(17.10)   \mathcal{P}\prod_{i=0}^{n-1} \bar U_i = V \left( \mathcal{P}\prod_{i=0}^{n-1} \big( V^{-1}\,\bar U_i\,V \big) \right) V^{-1}

involves only multiplying blocks whose dimension equals the size of the lattice $A_m$.
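Theorem 27 can be exercised directly on a toy chain. In the sketch below (arbitrary stochastic matrix and integer-valued $\psi$, both illustrative assumptions; the counter is taken periodic mod $M$ so the Fourier identity is exact), the brute-force joint propagation of $(y,k)$ agrees with the product of Fourier blocks:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, nsteps = 6, 16, 5

U = rng.random((N, N))
U /= U.sum(axis=1, keepdims=True)          # row-stochastic one-step propagator
psi = rng.integers(0, 4, size=(N, N))      # integer increments of the sum J

# brute force: propagate the joint chain (y, k) explicitly, k mod M
joint = np.zeros((N, M))
joint[0, 0] = 1.0
for _ in range(nsteps):
    new = np.zeros((N, M))
    for i in range(N):
        for k in range(M):
            if joint[i, k]:
                for j in range(N):
                    new[j, (k + psi[i, j]) % M] += joint[i, k] * U[i, j]
    joint = new

# Fourier route: one N x N block per mode p_m, then inverse DFT in k
recon = np.zeros((N, M), dtype=complex)
for m in range(M):
    p = 2 * np.pi * m / M
    Up = U * np.exp(-1j * p * psi)          # (17.5)
    prod = np.linalg.matrix_power(Up, nsteps)
    for k in range(M):
        recon[:, k] += np.exp(1j * p * k) * prod[0, :] / M

err_abel = np.max(np.abs(joint - recon.real))
imag_resid = np.max(np.abs(recon.imag))
```

The time-ordered product of lifted $(NM)\times(NM)$ propagators is replaced by $M$ products of $N\times N$ blocks, exactly the economy claimed for (17.10).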
The reader will recognize that the formula in (17.10) is yet another generalization and extension of the Cameron–Feynman–Girsanov–Ito–Kac–Martin formulas discussed above.

The non-singular linear transformation that accomplishes the block-diagonalization can be a Fourier transform or a more general transformation. Several possibilities are open, and the optimal choice depends on the objective. When using transforms more general than the Fourier transform, one has to keep in mind that the simultaneous diagonalization of the matrices $Q(y_1,y_2)_{k_1 k_2}$ above can possibly result in a numerically ill-conditioned problem. An example of this phenomenon arises when one attempts to use a non-homogeneous discretization of the path-process coordinate. To seize the benefit of inhomogeneous lattices, one needs to be careful when discretizing and avoid resonances.

Consider a lattice for the path-dependent process defined as follows: $\Omega = \{\omega_1, \omega_2, \dots, \omega_K\}$, where $\omega_1 < \omega_2 < \dots < \omega_K$. Consider the shift operator $R$ of matrix elements

(17.11)   R_{k_1 k_2} = (1 - p_{k_1})\,\delta_{k_1 k_2} + p_{k_1}\,\delta_{k_2,k_1+1}, \quad \text{if } k_1 < K,

and $R_{KK} = 0$. Here we assume that $0 < p_k < 1$ for $k = 1, \dots, K-1$. The eigenvalues of $R$ are given by the diagonal elements $\rho_k = 1 - p_k$. Diagonalizing this sort of matrix is potentially a seriously ill-conditioned problem, especially if the matrix is large and the eigenvalues are degenerate or nearly degenerate. For practical applications, one thus needs to achieve non-resonance conditions and keep the eigenvalues $\rho_k$ as far apart from each other as possible.

Given a non-resonant choice of transition probabilities such as the one above, one can then fix an $\omega_0$ and a $\Delta\omega > 0$ and require that

(17.12)   \sum_{k_2=1}^K R_{k_1 k_2}\,\omega_{k_2} = \omega_{k_1} + \Delta\omega \quad \text{for all } k_1 = 1, \dots, K-1.

A convenient choice is to space the levels $\omega_k$ geometrically, so that

(17.13)   \omega_k = \omega_0 \cdot Z^k,

where $\omega_0 > 0$ and $Z > 1$. For this discretization to be acceptable, the resulting transition probabilities

(17.14)   p_k = \frac{\Delta\omega}{\omega_0\, Z^k (Z-1)}

must satisfy the additional constraint

(17.15)   Z(Z-1) > \frac{\Delta\omega}{\omega_0}.
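The geometric-spacing recipe above amounts to a few lines (all numbers illustrative): one checks the feasibility constraint (17.15), the constant mean step (17.12), and the eigenvalue separation of $R$:

```python
import numpy as np

w0, Z, dw, K = 1.0, 1.25, 0.2, 12          # illustrative parameters
feasible = Z * (Z - 1) > dw / w0            # constraint (17.15)

ks = np.arange(1, K)
p = dw / (w0 * Z**ks * (Z - 1))             # transition probabilities (17.14)
ok_prob = bool(np.all((p > 0) & (p < 1)))

w = w0 * Z**np.arange(1, K + 1)             # geometric levels (17.13)
R = np.diag(np.append(1 - p, 0.0)) + np.diag(p, 1)
drift = R @ w - w                           # mean one-step increment
ok_drift = bool(np.allclose(drift[:-1], dw))

eigs = np.append(1 - p, 0.0)                # eigenvalues of the triangular R
min_gap = float(np.min(np.diff(np.sort(eigs))))
```

With these values the eigenvalues stay separated by a gap of a few percent, which keeps the diagonalization of $R$ well conditioned.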
We interpolate between the integer powers of the transition kernel $R$ to obtain the relevant kernels for all possible values of $\psi(y_1,y_2)$. We construct the operators $Q(y_1,y_2)$ as follows:

(17.16)   Q(y_1,y_2) = \big( [\psi(y_1,y_2)] + 1 - \psi(y_1,y_2) \big)\, R^{[\psi(y_1,y_2)]} + \big( \psi(y_1,y_2) - [\psi(y_1,y_2)] \big)\, R^{[\psi(y_1,y_2)]+1},

for all $y_1$ and $y_2$ in $A_m$, where $[\psi(y_1,y_2)]$ represents the integer part of the functional $\psi$ at $(y_1,y_2)$. In the representation in which the operator $R$ is diagonal, the operators $Q(y_1,y_2)$ are also diagonal and the lifted propagator is block-diagonal.

18. Univariate Moment Expansions on Bridges

This section is based on work in collaboration with Adel Osseiran; see (Albanese and Osseiran 2007). A second methodology that applies to most Abelian path-dependent options is based on obtaining only the first few moments of an Abelian process on any given bridge of the underlying Markov process, as opposed to reconstructing the entire conditional distribution.

Consider a time interval $[T,T']$ and a Markov generator $\mathcal{L}(y_1,y_2;t)$. Consider again the integral $I_t$ in (16.6). Let us introduce the following one-parameter family of deformed Markov operators, parameterized by $\epsilon \in \mathbb{R}$:

(18.1)   \mathcal{L}_\epsilon(y_1,y_2;t) = \mathcal{L}(y_1,y_2;t) + \epsilon\, V(y_1,y_2;t),

where

(18.2)   V(y_1,y_2;t) = \phi(y_1;t)\,\delta_{y_1,y_2} + \mathcal{L}(y_1,y_2;t)\,\chi(y_1,y_2;t)\,\big(\psi(y_2;t) - \psi(y_1;t)\big).

Theorem 28. (Dyson Formula.) We have that

(18.3)   \left( \frac{d}{d\epsilon} \right)^{\!n} \bigg|_{\epsilon=0}\, \mathcal{P}\exp\left( \int_T^t \mathcal{L}_\epsilon(s)\,ds \right)(y_1,y_2) = E_T\big[ I_t^n\,\delta(y_t - y_2) \,\big|\, y_T = y_1 \big].

Proof. Consider Neper's formula for the propagator,

(18.4)   \mathcal{P}\exp\left( \int_T^t \mathcal{L}_\epsilon(s)\,ds \right) = \lim_{N\to\infty} \mathcal{P}\prod_{i=1}^N \left( 1 + \frac{t-T}{N}\,\big( \mathcal{L}(t_i) + \epsilon\, V(t_i) \big) \right),

where $t_i = T + i\,(t-T)/N$.
By collecting similar powers in $\epsilon$, one finds Dyson's formula
(18.5)   $\mathcal{P}\exp\left(\int_T^t \mathcal{L}_\epsilon(s)\, ds\right) = \exp\big((t - T)\mathcal{L}\big)$
(18.6)   $\quad + \ \epsilon \int_T^t ds_1\ e^{(s_1 - T)\mathcal{L}}\, V(s_1)\, e^{(t - s_1)\mathcal{L}}$
(18.7)   $\quad + \ \sum_{n=2}^{\infty} \epsilon^n \int_T^t ds_1 \cdots \int_{s_{n-1}}^t ds_n\ e^{(s_1 - T)\mathcal{L}}\, V(s_1)\, e^{(s_2 - s_1)\mathcal{L}} \cdots V(s_n)\, e^{(t - s_n)\mathcal{L}}$.
The time-ordered integrals above are proportional to conditional moments, i.e.
(18.8)   $\mathcal{P}\exp\left(\int_T^t \mathcal{L}_\epsilon(s)\, ds\right)(y_1, y_2) = \sum_{n=0}^{\infty} \dfrac{\epsilon^n}{n!}\ E_T\big[\, I_t^n\, \delta(y_t - y_2) \,\big|\, y_T = y_1 \big]$.
Here, the factorials originate from the time ordering. □

A technique which is numerically stable in many situations of practical relevance is to differentiate the deformed propagators $\mathcal{P}\exp\left(\int_T^t \mathcal{L}_\epsilon(s)\, ds\right)(y_1, y_2)$ numerically with respect to $\epsilon$ and evaluate at $\epsilon = 0$. This technique can be used to obtain the first moments of $I_t$ on any given bridge for the underlying Markov process. In most applications, we find that two moments suffice to extrapolate the probability distribution function to sufficient accuracy. To do so, it is convenient to choose from among the probability distribution functions which are analytically tractable. For instance, starting from the first two moments only, one can use either the log-normal or the chi-square distribution.

The standard chi-square distribution is given by
$f(x) = \dfrac{1}{2\, \Gamma(a/2)} \left(\dfrac{x}{2}\right)^{a/2 - 1} e^{-x/2}$
where $a$ is the number of degrees of freedom.
The first and second (raw) moments of this distribution are
$E[x] = a, \qquad E[x^2] = a(a + 2)$.
To match the pre-assigned first and second moments $m_1$ and $m_2$, respectively, one can pass to the new variable $\xi = \frac{m_1}{a}\, x$ and choose
$a = \dfrac{2\, m_1^2}{m_2 - m_1^2}$.

Let $\xi$ be a log-normally distributed random variable with probability distribution function
$f(x; \mu, \sigma) = \dfrac{1}{x \sigma \sqrt{2\pi}}\ e^{-(\ln x - \mu)^2 / 2\sigma^2}$.
The first two moments are given by
$E[x] = e^{\mu + \sigma^2/2} \qquad \text{and} \qquad E[x^2] = e^{2\mu + 2\sigma^2}$.
The parameters $\mu$ and $\sigma$ can be reconstructed from two pre-assigned first and second moments, $m_1$ and $m_2$ respectively, by setting
$\mu = \log\left(\dfrac{m_1^2}{\sqrt{m_2}}\right) \qquad \text{and} \qquad \sigma^2 = \log\left(\dfrac{m_2}{m_1^2}\right)$.

The log-normal and the chi-square distributions allow one to match the first two moments. More general distributions available in closed form allow one to match higher moments also. The Pearson Type III distribution for instance has a probability distribution function given by
$f(x) = \dfrac{1}{b\, \Gamma(p)} \left(\dfrac{x - a}{b}\right)^{p - 1} e^{-(x - a)/b}$
and extends over the half line $[a, +\infty)$. The special case of this distribution when $a = 0$, $b = 2$ and $p$ is half of an integer gives the chi-square distribution. In general, the moments are given by
$E[x] = a + bp$
$E[x^2] = (a + bp)^2 + b^2 p$
$E[x^3] = (a + bp)^3 + 3\, b^2 p\, (a + bp) + 2\, b^3 p$
and matching these with our pre-assigned moments $m_1$, $m_2$ and $m_3$ and computing the values of $a$, $b$ and $p$ in terms of these moments, we get
$a = m_1 - \dfrac{2\, (m_2 - m_1^2)^2}{m_3 + 2 m_1^3 - 3 m_1 m_2}, \qquad b = \dfrac{m_3 + 2 m_1^3 - 3 m_1 m_2}{2\, (m_2 - m_1^2)}, \qquad p = \dfrac{4\, (m_2 - m_1^2)^3}{(m_3 + 2 m_1^3 - 3 m_1 m_2)^2}$.

Example 10. (Volatility contracts) As an example, consider variance and volatility swaps. Variance swaps of maturity $T'$ and time of issuance $T$ have a payoff given by
$\min\left( \int_T^{T'} v(y_{s^-}, y_s)\, ds,\ f \cdot SR^2 \right) - SR^2$
where $SR$ is the swap rate and $f$ is a cap factor, a typical value being $f = 6.25$.
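The three matching recipes above can be written down directly; a hedged sketch (the function and variable names are ours, not from the text), taking the raw moments $m_1$, $m_2$, $m_3$ as inputs:

```python
import math

# Sketch of the moment-matching formulas in the text, with m1, m2, m3 the
# pre-assigned raw moments of the bridged functional.

def chi2_params(m1, m2):
    """Degrees of freedom a and scale for xi = (m1/a) x, x ~ chi^2_a."""
    a = 2 * m1**2 / (m2 - m1**2)
    return a, m1 / a

def lognormal_params(m1, m2):
    """mu and sigma^2 of the matched log-normal."""
    mu = math.log(m1**2 / math.sqrt(m2))
    sigma2 = math.log(m2 / m1**2)
    return mu, sigma2

def pearson3_params(m1, m2, m3):
    """Location a, scale b and shape p of the matched Pearson Type III."""
    var = m2 - m1**2                           # variance
    skew_num = m3 + 2 * m1**3 - 3 * m1 * m2    # third central moment
    b = skew_num / (2 * var)
    p = 4 * var**3 / skew_num**2
    a = m1 - 2 * var**2 / skew_num
    return a, b, p
```

As a sanity check, feeding in the raw moments of a Gamma(shape $3$, scale $2$) variable, namely $m_1 = 6$, $m_2 = 48$, $m_3 = 480$, recovers $(a, b, p) = (0, 2, 3)$, i.e. the Pearson III family contains the gamma exactly.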
Here, $v(y_1, y_2; t)$ is the instantaneous variance defined as follows:
(18.9)   $v(y_1, y_2; t) = \mathcal{L}(y_1, y_2; t)\ \log^2\left(\dfrac{S(y_2)}{S(y_1)}\right)$
if $y_1$ is such that $x(y_1) \neq 0$. Otherwise, if $x(y_1) = 0$, then $v(y_1; t) = \infty$. The variance swap is said to be at equilibrium if its price is 0. The payoff of a volatility swap is
$\min\left( \sqrt{\int_T^{T'} v(y_{s^-}, y_s)\, ds},\ \sqrt{f} \cdot SR \right) - SR$.
Here, $SR$ is the volatility swap rate.

To price these contracts it suffices to evaluate the distribution of realized variance on a bridge, i.e. of the functional
(18.10)   $RV(y_2) = \delta(y_{T'} - y_2) \int_T^{T'} v(y_{s^-}, y_s)\, ds$.
By approximating the distribution of $RV(y_2)$ with the chi-square distribution, and using the CDF
$F(x; a) = \dfrac{\gamma\!\left(\frac{a}{2}, \frac{x}{2}\right)}{\Gamma\!\left(\frac{a}{2}\right)}$
where $\Gamma(z)$ and $\gamma(z, a)$ are the gamma and incomplete gamma functions respectively, i.e.
$\Gamma(z) = \int_0^\infty s^{z-1} e^{-s}\, ds, \qquad \gamma(z, a) = \int_0^a s^{z-1} e^{-s}\, ds$,
we find, writing $RV_{\max} = f \cdot SR^2$ for the cap on realized variance,
$E_t\big[\min(RV, RV_{\max})\, \delta(y_{T'} - y_2)\big] = \dfrac{m_1}{a} \Big[ K\, \big(1 - F(K; a)\big) + a\, F(K; a+2) \Big]$
and
$E_t\big[\min(\sqrt{RV}, \sqrt{RV_{\max}})\, \delta(y_{T'} - y_2)\big] = \sqrt{\dfrac{m_1}{a}} \left[ \sqrt{K}\, \big(1 - F(K; a)\big) + \sqrt{2}\ \dfrac{\gamma\!\left(\frac{a+1}{2}, \frac{K}{2}\right)}{\Gamma\!\left(\frac{a}{2}\right)} \right]$
where
$a = a(y_1, y_2) = \dfrac{2\, m_1(y_1, y_2)^2}{m_2(y_1, y_2) - m_1(y_1, y_2)^2}, \qquad K = K(y_1, y_2) = \dfrac{a(y_1, y_2)}{m_1(y_1, y_2)}\ RV_{\max}$.
Since the dependency on the swap rate in both cases is non-linear, an exact calculation requires using a root finder.

Using instead the log-normal distribution to extrapolate, we find
$E_t\big[\min(RV, RV_{\max})\, \delta(y_{T'} - y_2)\big] = e^{\mu + \sigma^2/2}\, N\!\left(\dfrac{\log(RV_{\max}) - \mu - \sigma^2}{\sigma}\right) + RV_{\max}\, N\!\left(-\dfrac{\log(RV_{\max}) - \mu}{\sigma}\right)$.
(18.11)
For volatility swaps instead, we find
(18.12)   $E_t\big[\min(\sqrt{RV}, \sqrt{RV_{\max}})\, \delta(y_{T'} - y_2)\big] = e^{(4\mu + \sigma^2)/8}\, N\!\left(\dfrac{\log(RV_{\max}) - \mu - \sigma^2/2}{\sigma}\right) + \sqrt{RV_{\max}}\, N\!\left(-\dfrac{\log(RV_{\max}) - \mu}{\sigma}\right)$
where $\mu$ and $\sigma$ are specified above in terms of $m_1$ and $m_2$. Finally, in case the Pearson Type III distribution is used, we can express these expectations in terms of the chi-square cumulative distribution function $F(x; a)$ as follows:
(18.13)   $E_t\big[\min(RV, RV_{\max})\, \delta(y_{T'} - y_2)\big] = RV_{\max} + (a + bp - RV_{\max})\, F\!\left(\dfrac{2 (RV_{\max} - a)}{b};\ 2p\right) - \dfrac{b}{\Gamma(p)} \left(\dfrac{RV_{\max} - a}{b}\right)^{p} e^{-(RV_{\max} - a)/b}$.

19. Bivariate Moment Expansions on Bridges

This section is based on work in collaboration with Adel Osseiran, see (Albanese and Osseiran 2007).

Consider a time interval $[T, T']$ and a Markov generator $\mathcal{L}(y_1, y_2; t)$. We are interested in the joint distribution on any given bridge for the underlying process of two stochastic integrals
(19.1)   $I_{1,t} = I_1(y_\cdot, t) \equiv \int_T^t \left( \phi_1(y_s; s) + \chi_1(y_{s^-}, y_s; s) \lim_{\delta t \downarrow 0} \dfrac{\psi_1(y_s; s) - \psi_1(y_{s - \delta t}; s - \delta t)}{\delta t} \right) ds$
and
(19.2)   $I_{2,t} = I_2(y_\cdot, t) \equiv \int_T^t \left( \phi_2(y_s; s) + \chi_2(y_{s^-}, y_s; s) \lim_{\delta t \downarrow 0} \dfrac{\psi_2(y_s; s) - \psi_2(y_{s - \delta t}; s - \delta t)}{\delta t} \right) ds$.
To handle this problem using the moment method, we introduce the following two-parameter family of deformed Markov operators parameterized by $\epsilon_1, \epsilon_2 \in \mathbb{R}$:
(19.3)   $\mathcal{L}_{\epsilon_1, \epsilon_2}(y_1, y_2; t) = \mathcal{L}(y_1, y_2; t) + \epsilon_1\, V_1(y_1, y_2; t) + \epsilon_2\, V_2(y_1, y_2; t)$
where
(19.4)   $V_1(y_1, y_2; t) = \phi_1(y_1; t)\, \delta_{y_1, y_2} + \mathcal{L}(y_1, y_2; t)\, \chi_1(y_1, y_2; t)\, \big(\psi_1(y_2; t) - \psi_1(y_1; t)\big)$
and
(19.5)   $V_2(y_1, y_2; t) = \phi_2(y_1; t)\, \delta_{y_1, y_2} + \mathcal{L}(y_1, y_2; t)\, \chi_2(y_1, y_2; t)\, \big(\psi_2(y_2; t) - \psi_2(y_1; t)\big)$.

Theorem 29. (Dyson Formula.)
We have that
(19.6)   $\left(\dfrac{\partial^{\,n_1 + n_2}}{\partial \epsilon_1^{n_1}\, \partial \epsilon_2^{n_2}}\right) \bigg|_{\epsilon_1 = 0,\, \epsilon_2 = 0}\ \mathcal{P}\exp\left(\int_T^t \mathcal{L}_{\epsilon_1, \epsilon_2}(s)\, ds\right)(y_1, y_2) = E_T\big[\, I_{1,t}^{n_1}\, I_{2,t}^{n_2}\, \delta(y_t - y_2) \,\big|\, y_T = y_1 \big]$.

Proof. The proof is a simple extension of the proof in the univariate case and will be left to the reader. □

Consider a bivariate log-normally distributed variable
$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} \log(X_1) \\ \log(X_2) \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},\ \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix} \right)$,
whereby $Y_1$ and $Y_2$ are correlated normally distributed variables, so that $X_1$ and $X_2$ have joint distribution
(19.7)   $f(x_1, x_2) = \dfrac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}\; x_1 x_2}\ \exp\left\{ -\dfrac{1}{2(1 - \rho^2)} \left[ \left(\dfrac{\log x_1 - \mu_1}{\sigma_1}\right)^2 - \dfrac{2\rho\, (\log x_1 - \mu_1)(\log x_2 - \mu_2)}{\sigma_1 \sigma_2} + \left(\dfrac{\log x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right\}$,
where $\rho$ is a correlation parameter. Both $X_1$ and $X_2$ are log-normally distributed with
$E[X_i] = e^{\mu_i + \sigma_i^2/2} \qquad \text{and} \qquad E[X_i^2] = e^{2\mu_i + 2\sigma_i^2}, \qquad i = 1, 2$.
Matching these with the pre-assigned moments, and solving for $\mu_i$ and $\sigma_i$, we find
$\mu_i = \log\left( E\big[I_t^{(i)}\big]^2 \Big/ \sqrt{E\big[(I_t^{(i)})^2\big]} \right) \qquad \text{and} \qquad \sigma_i^2 = \log\left( E\big[(I_t^{(i)})^2\big] \Big/ E\big[I_t^{(i)}\big]^2 \right)$.
Moreover, the mixed term is
$E[X_1 X_2] = E\big[e^{Y_1 + Y_2}\big] = e^{\mu_1 + \mu_2 + \frac{1}{2}(\sigma_1^2 + \sigma_2^2 + 2\rho \sigma_1 \sigma_2)} = E[X_1]\, E[X_2]\, e^{\rho \sigma_1 \sigma_2}$
which gives $\rho$ in terms of the pre-assigned moments:
(19.8)   $\rho = \dfrac{1}{\sigma_1 \sigma_2}\ \log \dfrac{E\big[I_t^{(1)} I_t^{(2)}\big]}{E\big[I_t^{(1)}\big]\, E\big[I_t^{(2)}\big]}$.

Example 11.
(Conditional Variance Swaps) The payoff of a conditional variance swap is given by the ratio
(19.9)   $\dfrac{\int_T^{T'} v(y_s, s)\, \mathbb{1}(L < S(y_s) < H)\, ds}{\int_T^{T'} \mathbb{1}(L < S(y_s) < H)\, ds} - SR$.
To apply the moment method to this case, the first thing to note is that we are essentially modelling the two integrals appearing in the payoff at the same time, so a bivariate distribution is needed. Firstly, let's write
$I_t^{(1)} = \int_T^{T'} v(y_s, s)\, \mathbb{1}(L < S(y_s) < H)\, ds, \qquad I_t^{(2)} = \int_T^{T'} \mathbb{1}(L < S(y_s) < H)\, ds$.
In order to compute the expectation
(19.10)   $E\left[ \dfrac{I_t^{(1)}}{I_t^{(2)}} \right]$
we'll need the following expectations:
$E\big[I_t^{(1)}\big], \quad E\big[I_t^{(2)}\big], \quad E\big[(I_t^{(1)})^2\big], \quad E\big[(I_t^{(2)})^2\big], \quad \text{and} \quad E\big[I_t^{(1)} I_t^{(2)}\big]$.
To tackle this problem consider the operator
(19.11)   $\mathcal{L}_{\epsilon_1, \epsilon_2}(y_1, y_2) = \mathcal{L}(y_1, y_2) + \epsilon_1\, \phi(y_1)\, \delta_{y_1 y_2} + \epsilon_2\, \psi(y_1)\, \delta_{y_1 y_2}$
where
(19.12)   $\phi(y; t) = \sum_{y'} \mathcal{L}(y, y'; t)\ \log^2\left(\dfrac{S(y')}{S(y)}\right) \mathbb{1}(L < S(y) < H)$
and
(19.13)   $\psi(y) = \mathbb{1}(L < S(y) < H)$.
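Numerically, the moments entering this construction come from finite differencing the deformed propagator in $\epsilon_1$, $\epsilon_2$ at zero. A toy sketch, with an illustrative generator, functionals and horizon of our own choosing: taking $\phi_2 \equiv 1$ makes the second integral equal to the elapsed time $t - T$ exactly, which yields a built-in consistency check $E[I^{(1)} I^{(2)}] = (t - T)\, E[I^{(1)}]$.

```python
import numpy as np
from scipy.linalg import expm

# Toy sketch (illustrative generator and functionals): unnormalized
# conditional bridge moments are read off finite differences of the deformed
# propagator exp((t - T)(L + eps1*V1 + eps2*V2)) at eps1 = eps2 = 0.

L = np.array([[-1.0,  0.7,  0.3],
              [ 0.5, -0.9,  0.4],
              [ 0.2,  0.8, -1.0]])    # toy Markov generator (rows sum to 0)
V1 = np.diag([0.0, 1.0, 2.0])         # deformation for I1 = int phi1(y_s) ds
V2 = np.eye(3)                        # phi2 = 1, so I2 = t - T exactly
t, eps = 1.5, 1e-4                    # horizon t - T and bump size

def prop(e1, e2):
    return expm(t * (L + e1 * V1 + e2 * V2))

# E_T[I1 delta(y_t - y2) | y_T = y1], entrywise in (y1, y2)
M1 = (prop(eps, 0.0) - prop(-eps, 0.0)) / (2 * eps)
# mixed moment E_T[I1 I2 delta(y_t - y2) | y_T = y1] via a cross difference
M12 = (prop(eps, eps) - prop(eps, -eps)
       - prop(-eps, eps) + prop(-eps, -eps)) / (4 * eps**2)
```

The first moment can also be checked against the first-order Dyson term $\int_0^{t} e^{s\mathcal{L}}\, V_1\, e^{(t - s)\mathcal{L}}\, ds$ computed by quadrature, which is how we validated this sketch.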
We have that
$\dfrac{\partial}{\partial \epsilon_1} \bigg|_{\epsilon = 0} e^{(t' - t)\, \mathcal{L}_{\epsilon_1, \epsilon_2}}(y_1, y_2) = E\big[\, I_t^{(1)} \,\big|\, y_t = y_1,\ y_{t'} = y_2 \big]$
and
$\dfrac{\partial^2}{\partial \epsilon_1^2} \bigg|_{\epsilon = 0} e^{(t' - t)\, \mathcal{L}_{\epsilon_1, \epsilon_2}}(y_1, y_2) = E\big[\, (I_t^{(1)})^2 \,\big|\, y_t = y_1,\ y_{t'} = y_2 \big]$
and similarly (but with respect to $\epsilon_2$) for $E\big[\, I_t^{(2)} \,\big|\, y_t = y_1,\ y_{t'} = y_2 \big]$ and $E\big[\, (I_t^{(2)})^2 \,\big|\, y_t = y_1,\ y_{t'} = y_2 \big]$. The joint expectation will involve the mixed derivative:
(19.14)   $E\big[\, I_t^{(1)} I_t^{(2)} \big] = \dfrac{\partial^2}{\partial \epsilon_1\, \partial \epsilon_2} \bigg|_{\epsilon_1, \epsilon_2 = 0} e^{(t' - t)\, \mathcal{L}_{\epsilon_1, \epsilon_2}}(y_1, y_2)$
and once these are computed, we make use of them to match a bivariate distribution. For simplicity of notation we leave out the conditioning in these expectations, noting that all the moments we obtain are conditional on the initial and final points.

To evaluate the expectation $E\big[ I_t^{(1)} \big/ I_t^{(2)} \big]$, let us notice that
$E\left[ \dfrac{X_1}{X_2} \right] = E\big[ e^{Y_1 - Y_2} \big] = E\big[ e^{Y_1 + Y_2'} \big]$
where $Y_2' = -Y_2$ is also normally distributed, $\sim N(-\mu_2, \sigma_2^2)$.
Hence
$E\left[ \dfrac{X_1}{X_2} \right] = e^{\mu_1 + \sigma_1^2/2}\ e^{-\mu_2 + \sigma_2^2/2}\ e^{-\rho \sigma_1 \sigma_2}$
and the expectation (19.10) is given in terms of the computed moments by
(19.15)   $E\left[ \dfrac{I_t^{(1)}}{I_t^{(2)}} \right] = \dfrac{E\big[I_t^{(1)}\big]^2\ E\big[(I_t^{(2)})^2\big]}{E\big[I_t^{(2)}\big]^2\ E\big[I_t^{(1)} I_t^{(2)}\big]}$.
Moreover,
$E\left[ \left(\dfrac{X_1}{X_2}\right)^2 \right] = e^{2\mu_1 + 2\sigma_1^2}\ e^{-2\mu_2 + 2\sigma_2^2}\ e^{-4\rho \sigma_1 \sigma_2}$
so
(19.16)   $E\left[ \left(\dfrac{I_t^{(1)}}{I_t^{(2)}}\right)^2 \right] = \dfrac{E\big[I_t^{(1)}\big]^4\ E\big[(I_t^{(1)})^2\big]\ E\big[(I_t^{(2)})^2\big]^3}{E\big[I_t^{(2)}\big]^4\ E\big[I_t^{(1)} I_t^{(2)}\big]^4}$.
To compute the payoff of the conditional variance swap we will need the expectation $E\big[ \min\big( X_1/X_2,\ CV_{\max} \big) \big]$:
(19.17)   $E\left[ \min\left( \dfrac{X_1}{X_2},\ CV_{\max} \right) \right] = \int_0^\infty \int_0^{x_2\, CV_{\max}} \left( \dfrac{x_1}{x_2} - CV_{\max} \right) f(x_1, x_2)\, dx_1\, dx_2 + CV_{\max}$.
This double integral needs to be evaluated numerically. As the inner integral has finite bounds and the outer one has an infinite upper bound, it makes sense to use two types of Gaussian quadratures: a Gauss-Legendre quadrature on the inner integral and a Gauss-Laguerre quadrature on the outer one.

20. Block Factorizations

Although most path dependent processes emerging in applications to Finance are Abelian, some aren't. If the Abelian property fails, the propagator cannot be block-diagonalized by the methods discussed above, and moment methods also generally fail. In several relevant situations one can still achieve a numerically viable framework by using block-factorizations instead of diagonalizations.

Let $U_i(y_1, y_2)$ be a family of Markov propagators indexed by $i = 0, 1, \dots, n-1$ and associated to the time intervals $[t_i, t_{i+1}]$.
Consider a propagator lifting of the form
(20.1)   $\bar{U}_i(y_1, k_1; y_2, k_2) = U_i(y_1, y_2)\ Q_i(y_1, y_2)_{k_1 k_2}$
where $k_1, k_2 = 0, \dots, K$.

Definition 45. (Block-Factorization.) We say that the propagators in (20.1) admit a block-factorization if they can be represented in the form
(20.2)   $\bar{U}_i = \Pi_i \cdot \big( U_i \otimes I \big)$.
Here $\Pi_i$, $i = 0, \dots, n-1$, is a family of permutation matrices, i.e. matrices with the property that for each pair $(y_1, k_1)$ we have that
(20.3)   $\Pi_i(y_1, k_1; y_2, k_2) = 0$
for all pairs $(y_2, k_2)$ except for one single pair $\big(Y(y_1, k_1), K(y_1, k_1)\big)$ for which we have instead
(20.4)   $\Pi_i\big(y_1, k_1;\ Y(y_1, k_1), K(y_1, k_1)\big) = 1$.
Furthermore, the tensor product notation in equation (20.2) stands for the operator with the following matrix elements:
(20.5)   $\big( U_i \otimes I \big)(y_1, k_1; y_2, k_2) = U_i(y_1, y_2)\ \delta_{k_1 k_2}$.

If a block-factorization exists, then an efficient backward induction algorithm can be set up. Namely, if $v_n$ is a vector, then we can evaluate the path ordered products
(20.6)   $v_i = \mathcal{P} \prod_{j=i}^{n-1} \bar{U}_j\, v_n = \mathcal{P} \prod_{j=i}^{n-1} \Pi_j \cdot \big( U_j \otimes I \big)\, v_n$
iteratively in $i$ from $i = n-1$ down to $i = 0$. At the first step one applies the operator $\big( U_{n-1} \otimes I \big)$ to $v_n$, followed by the permutation $\Pi_{n-1}$, to obtain $v_{n-1}$; one then iterates. Due to the tensor product structure of the first operator, it is convenient to apply it by regarding the vector $v_i$ as a matrix of elements $v_i(y, k)$. This representation makes it possible to leverage numerical efficiencies and to re-interpret a high dimensional BLAS level 2 matrix-vector multiplication as a lower dimensional BLAS level 3 matrix-matrix multiplication. Applying a permutation operator is then quite straightforward from the numerical viewpoint, and is an operation whose complexity scales linearly with the dimension of the vector $v_i$.

Example 12. (Snowballs) Here we follow (Albanese 2007a).
Consider the case of valuing a snowball, for which the structured coupon at time $T_i = T + (\Delta T)\, i$ has the following form:
(20.7)   $C_{T_i} = f\, C_{T_{i-1}} + \Phi_i(y_{T_{i-1}})$
where the factor $f$ is a fixed parameter and $\Phi_i(y_{T_{i-1}})$ is a given function. Since the coupon amount at a given time affects the coupon amount in the next period, the process is path dependent but not Abelian. However, block-factorizations are still possible.

Consider discretizing the coupon dimension in intervals $\Delta C$, so that a generic coupon can be approximated as follows:
(20.8)   $C_{T_i} = (\Delta C)\, k_{T_i}$
where $k_{T_i} = 0, 1, \dots, N-1$. One arranges the pricing function $v(y, k)$ at maturity in a matrix with $N$ columns, each one indexed by the state variable $y$ and representing the price conditional to the discretized value of the previous coupon paid. One can then iterate backward by applying the pricing kernel to this matrix of conditional pricing functions. After each step in the iteration, one needs to reshuffle the pricing functions in such a way that the conditioning relation on each column is satisfied. More precisely, we set
(20.9)   $v_{i-1}(y_1, k) = \sum_{y_2} U_i(y_1, y_2)\ v_i\big(y_2,\ \big[(f (\Delta C) \cdot k + \Phi_i(y_1)) / (\Delta C)\big]\big) = \Big[\, \Pi_i \cdot \big( U_i \otimes I \big)\, v_i \Big](y_1, k)$
where
(20.10)   $\Pi_i(y_1, k_1; y_2, k_2) = \delta(y_1 - y_2)\ \delta\big(k_2 - K(y_1, k_1)\big)$
and
(20.11)   $K(y, k) = \big[\, (f (\Delta C) \cdot k + \Phi_i(y)) / (\Delta C)\, \big]$.
Notice that this backward induction scheme can accommodate callability provisions.

Example 13. (Soft Calls) Soft calls provide another interesting example of a non-Abelian process. In this case, a call decision depends on the fraction of time the process spends above a given barrier level during a fixed time period prior to the decision itself.
By restricting the times at which one makes observations to the discrete sequence of time points $t_i = T + i\, \delta t$, $i = 0, \dots, M$, the pricing function takes the form
(20.12)   $v_i(y, \vec{\sigma})$
where
(20.13)   $\vec{\sigma} = (\sigma_1, \dots, \sigma_N) \in \{0, 1\}^N$.
Here the variable $\sigma_j$ equals 1 if $y_{t_{i-N+j}}$ satisfies the barrier condition and 0 otherwise. We model this dependency by means of a generic function notation
(20.14)   $\sigma_j = \Sigma_i(y_{t_{i-N+j}})$.
The sequences $\vec{\sigma}$ can be arranged in an ordered sequence of $2^N$ integers, so that the data structure $v_i(y, \vec{\sigma})$ appears as a matrix with $2^N$ columns. A backward induction scheme involves evaluating
(20.15)   $v_{i-1}(y_1, \vec{\sigma}) = \sum_{y_2} U_i(y_1, y_2)\ v_i\big(y_2,\ \Phi_i(\vec{\sigma}, y_1)\big)$
where
(20.16)   $\Phi_i(\vec{\sigma}, y_1) = \{\sigma_2, \dots, \sigma_N,\ \Sigma_i(y_1)\}$.
This form is block-factorized as
(20.17)   $v_{i-1}(y_1, \vec{\sigma}) = \Big[\, \Pi_i \cdot \big( U_i \otimes I \big)\, v_i \Big](y_1, \vec{\sigma})$
where $\Pi_i$ is the permutation operator of matrix elements
(20.18)   $\Pi_i(y_1, \vec{\sigma}; y_2, \vec{\sigma}') = \delta(y_1 - y_2)\ \delta\big( (\vec{\sigma}')_N - \Sigma_i(y_1) \big) \prod_{j=1}^{N-1} \delta\big( (\vec{\sigma}')_j - (\vec{\sigma})_{j+1} \big)$.

More general types of useful block-factorizations can be imagined, although they might be numerically less efficient. An example is set forth by the following definition.

Definition 46. (State Dependent Block-Factorization.) We say that the propagators in (20.1) admit a state dependent block-factorization if they can be represented in the form
(20.19)   $\bar{U}_i = \Pi_i \cdot \bigoplus_{k=1}^{K} U_{i,k}$.
Here $\Pi_i$, $i = 0, \dots, n-1$, is a family of permutation matrices as above. The direct sum notation in equation (20.19) stands for the operator with the following matrix elements:
(20.20)   $\left( \bigoplus_{k=1}^{K} U_{i,k} \right)(y_1, k_1; y_2, k_2) = U_{i, k_1}(y_1, y_2)\ \delta_{k_1 k_2}$.
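One backward induction step under a state dependent block-factorization can be sketched as follows (toy sizes, all data random placeholders of our own choosing): the direct sum applies a different propagator $U_{i,k}$ to each column $k$, i.e. one matrix-vector product per block, after which the permutation $\Pi_i$ is a plain index gather.

```python
import numpy as np

# Toy sketch of one step v_{i-1} = Pi_i . (U_{i,1} (+) ... (+) U_{i,K}) v_i:
# one matrix-vector product per block k, then an index gather for Pi_i.

rng = np.random.default_rng(5)
n_y, n_k = 6, 4
U_blocks = [rng.dirichlet(np.ones(n_y), size=n_y)   # stochastic blocks U_{i,k}
            for _ in range(n_k)]
perm = rng.permutation(n_y * n_k)                   # a generic permutation Pi_i
v = rng.standard_normal((n_y, n_k))                 # v_i(y, k)

w = np.column_stack([U_blocks[k] @ v[:, k]          # direct sum: K mat-vec products
                     for k in range(n_k)])
v_prev = w.reshape(-1)[perm].reshape(n_y, n_k)      # apply the permutation Pi_i
```

When all blocks coincide, the $K$ matrix-vector products collapse into the single BLAS level 3 product $U_i\, v_i$ of the plain factorization (20.6), which is why the state dependent variant is less efficient.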
A backward induction in this case is still a lower dimensional problem, but since the lower dimensional propagator is state dependent, it reduces to a sequence of matrix-vector multiplications as opposed to a single matrix-matrix multiplication. The scheme is thus numerically less efficient. Still, it would be useful in cases such as GARCH type models and their extensions, see (Engle 1982).

21. Dynamic Conditioning

This section is based on work in collaboration with Alicia Vidler, see (Albanese and Vidler 2007).

Modeling correlations between several processes has sometimes been considered a problem affected by the so-called curse of dimensionality, which causes the numerical complexity to explode exponentially, see (Bellman 1957). This motivates the recourse to Monte Carlo methods while calibrating using closed form solutions. Here we deviate substantially from this framework and aim at building models viable from the engineering deployment viewpoint and specified semi-parametrically or even non-parametrically, without any restriction imposed by analytical tractability. The method of fast exponentiation allows one to calibrate single name models without recourse to analytic solvability, as it provides smooth sensitivities. Although Monte Carlo schemes could in principle be employed in a model constructed non-parametrically and calibrated with operator methods, in many circumstances important for applications this can be avoided by using the technique of dynamic conditioning discussed in this section. This technique can be regarded as a dynamic copula whereby the single factor marginal distributions are preserved and the numerical complexity grows linearly with the number of factors, similarly to what happens with Monte Carlo methods.
But, unlike Monte Carlo methods, dynamic conditioning is numerically noiseless: there is no sampling, no variance reduction scheme is needed, and even features such as callability are numerically treatable.

Dynamic conditioning is based on constructing a hierarchy of conditioning relations. At the base of the hierarchy we have continuous time lattice models for single factor marginal distributions. Next, one introduces a binomial process for each risk factor to condition the corresponding Markov chain. Next, one finds a hierarchy of binomial processes to condition the former conditioning lattices. See Figure 4.

Figure 4. Dynamic conditioning scheme: a binomial process conditions each risk factor and is in turn conditioned by another binomial process. The latter is also conditioned by a hierarchy of binomial processes.

To describe the technique of dynamic conditioning in detail, consider a particular risk factor described by a lattice process whose filtered probability space is engendered by a Markov propagator $U(y_1, t_1; y_2, t_2)$. For each such single factor, we introduce a conditioning binomial process $h_t \in \mathbb{Z}$, which is constant over the time intervals $[T_i, T_{i+1})$ where $T_i = T + i\, \Delta T$, $i = 0, 1, 2, \dots, N$ and $\Delta T = (T' - T)/N$. At initial time we set $h_{T_0} = 0$, while for all $i > 0$, $h_{T_i} - h_{T_{i-1}} = \pm 1$. The elementary propagator for the process $h_t$ across neighboring time points $T_i$ is
(21.1)   $V(h_i, T_i; h_{i+1}, T_{i+1}) = q_+(h_i, i)\ \delta_{h_i + 1,\, h_{i+1}} + q_-(h_i, i)\ \delta_{h_i - 1,\, h_{i+1}}$
where $q_+(h_i, i),\, q_-(h_i, i) > 0$ and $q_+(h_i, i) + q_-(h_i, i) = 1$. On a general time interval $[T_i, T_j]$, we have that
(21.2)   $V(h_i, T_i; h_j, T_j) = \sum_{h_t :\, h_{T_i} = h_i,\ h_{T_j} = h_j}\ \prod_{k=i}^{j-1} V(h_{T_k}, T_k; h_{T_{k+1}}, T_{k+1})$.
Next we define a lifted conditional propagator $\bar{U}(y_i, h_i, T_i; y_j, h_j, T_j)$ in such a way as to preserve unconditional transition probabilities from the starting time $T_0$, i.e. so that
(21.3)   $\sum_{h_j} \bar{U}(y_0, 0, T_0; y_j, h_j, T_j) = U(y_0, T_0; y_j, T_j)$.
To do so, one strategy is to define two propagators for each node $(h, T_i)$, namely to choose a pair of operators $U_+(y_i, h, T_i; y_{i+1}, h+1, T_{i+1})$ and $U_-(y_i, h, T_i; y_{i+1}, h-1, T_{i+1})$ so that
(21.4)   $U(y_i, T_i; y_{i+1}, T_{i+1}) = q_+(h, i)\, U_+(y_i, h, T_i; y_{i+1}, h+1, T_{i+1}) + q_-(h, i)\, U_-(y_i, h, T_i; y_{i+1}, h-1, T_{i+1})$.
The operator $\bar{U}(y_0, 0, T_0; y_j, h_j, T_j)$ satisfying (21.3) is defined as the following sum:
$\bar{U}(y_0, h_0, T_0; y_j, h_j, T_j) = \sum_{h_t :\, h_{T_0} = 0,\ h_{T_j} = h_j}\ \sum_{y_1, \dots, y_{j-1}}\ \prod_{k=1}^{j} q_{h_{T_k} - h_{T_{k-1}}}(h_{T_{k-1}})\ U_{h_{T_k} - h_{T_{k-1}}}(y_{k-1}, h_{T_{k-1}}, T_{k-1}; y_k, h_{T_k}, T_k)$.
Since the operators $U_+$ and $U_-$ do not commute with each other, except in trivial situations, each path $h_t$ in the summation on the right hand side of this equation represents a different operator. In many situations one can however avoid the numerical complexities of a non-recombining tree by finding modified versions of such operators so that the kernels
$U^{h_\cdot}(y_0, y_k) = \sum_{y_1, \dots, y_{j-1}}\ \prod_{k=1}^{j} q_{h_{T_k} - h_{T_{k-1}}}(h_{T_{k-1}})\ \bar{U}(y_{k-1}, h_{T_{k-1}}, T_{k-1}; y_k, h_{T_k}, T_k)$
are all equal to each other, for all paths $h_t : h_{T_0} = 0,\ h_{T_j} = h_j$, and at least for one single fixed starting point $y_0 = \bar{y}_0$. This will be referred to as the conditional recombination property.

The reasons why we focus on the initial point $y_0 = \bar{y}_0$ only are manifold. Firstly, in applications one needs to condition marginal distributions only to spot values, as the price of options in the hypothetical case in which asset prices were different is not known.
Secondly, if we insisted on the same property being valid for all initial starting points, we would end up with a seriously ill-posed problem of difficult solution. For these reasons, we settle for the more modest objective of conditional recombination. Namely, on each node $(h_i, T_i)$ with $i > 0$ we set
$\bar{U}(y_{i-1}, h_{i-1}, T_{i-1}; y_i, h_i, T_i) = U_{h_i - h_{i-1}}(y_{i-1}, h_{i-1}, T_{i-1}; y_i, h_i, T_i)$
in case $h_i = \pm i$. Otherwise, we determine this operator so that
$\bar{U}(\bar{y}_0, h_0, T_0; y_i, h_i, T_i) = \sum_{y_{i-1} \in \Lambda} \bar{U}(\bar{y}_0, h_0, T_0; y_{i-1}, h_{i-1}, T_{i-1})\ \bar{U}(y_{i-1}, h_{i-1}, T_{i-1}; y_i, h_i, T_i)$
for all $y_i \in \Lambda$ and the fixed $\bar{y}_0$. This can be achieved in more than one way. As a guideline, it is advisable to deviate the least possible from the definition (21.4).

As a next step in the construction, consider a second binomial process $c_t$ which is piecewise constant across neighboring time points $T_i$. The dynamics of $c_t$ is by construction correlated to that of $h_t$. More precisely, the propagator for the joint process is
$W(h_i, c_i, T_i; h_{i+1}, c_{i+1}, T_{i+1}) = q_{++}(h_i, c_i, i)\, \delta_{h_i+1,\, h_{i+1}}\, \delta_{c_i+1,\, c_{i+1}} + q_{+-}(h_i, c_i, i)\, \delta_{h_i+1,\, h_{i+1}}\, \delta_{c_i-1,\, c_{i+1}} + q_{-+}(h_i, c_i, i)\, \delta_{h_i-1,\, h_{i+1}}\, \delta_{c_i+1,\, c_{i+1}} + q_{--}(h_i, c_i, i)\, \delta_{h_i-1,\, h_{i+1}}\, \delta_{c_i-1,\, c_{i+1}}$
where $q_{\pm\pm}(h_i, c_i, i) \geq 0$ and
$q_{++}(h_i, c_i, i) + q_{+-}(h_i, c_i, i) + q_{-+}(h_i, c_i, i) + q_{--}(h_i, c_i, i) = 1$.
Due to the conditional recombination property, if $y_0 = \bar{y}_0$, then the conditional propagators resum with a simple formula:
$\sum_{h_j}\ \sum_{h_t :\, h_{T_0} = 0,\ h_{T_j} = h_j}\ \sum_{y_1, \dots, y_{j-1}}\ \prod_{k=1}^{j} W(h_{T_{k-1}}, c_{T_{k-1}}, T_{k-1}; h_{T_k}, c_{T_k}, T_k)\ q_{h_{T_k} - h_{T_{k-1}}}(h_{T_{k-1}})\ \bar{U}_{h_{T_k} - h_{T_{k-1}}}(y_{k-1}, h_{T_{k-1}}, T_{k-1}; y_k, h_{T_k}, T_k) = \sum_{h_j} W(0, 0, T_0; h_j, c_j, T_j)\ \bar{U}(y_0, 0, T_0; y_j, h_j, T_j) \equiv \tilde{U}(y_0, 0, T_0; y_j, c_j, T_j)$.
As a next step, the construction above is repeated for a number of different risk factors associated to $N$ lattice processes $y_t^{(\alpha)}$, where $\alpha = 1, \dots, N$, all correlated to the process $c_t$, possibly in different ways. Then, conditioned to starting all processes at fixed lattice points $y_0^{(\alpha)} = \bar{y}_0^{(\alpha)}$ and conditioned to $c_{T_j} = c_j$, the multi-factor propagator factorizes into the product of conditional single factor propagators
(21.6)   $\prod_{\alpha = 1}^{N} \tilde{U}\big(y_0^{(\alpha)}, 0, T_0;\ y_j^{(\alpha)}, c_j, T_j\big)$.
This is the key formula which we use to correlate processes while ensuring that numerical complexity increases only linearly with the number of factors $N$. The construction can obviously be iterated, and we can have a whole hierarchy of binomial processes conditioning each other to model multi-factor correlations.

22. Conclusion

We reviewed a new framework for Mathematical Finance and the theory of stochastic processes based on operator methods. The framework grew over the years through applied research and solving specific concrete problems in derivative pricing. As wrong ideas were weeded out, a coherent framework emerged, and it is now summarized in this review paper. From the numerical viewpoint, the methods rely on the ability to multiply large matrices efficiently as the main computational engine. Technologies for matrix manipulation are currently being developed at a great pace. The emerging multi-core GPU chipsets will soon provide affordable teraflop performance for algorithms based on matrix-matrix manipulations such as ours. To both set the theory of stochastic processes on firm theoretical grounds and justify the empirically observed convergence rates, we derive pointwise kernel convergence estimates of a new type which explain the observed smoothing properties of the method.
We introduce the notion of Abelian process to generalize the concept of stochastic integral and extend the classic Cameron-Feynman-Girsanov-Ito-Kac-Martin theorem in multiple directions. This theorem is here reinterpreted as a block-diagonalization scheme for large matrices corresponding to Abelian processes. We outline solution methods for Abelian path dependent options, a class we identified and which comprises the great majority of the path-dependent options currently traded. Important non-Abelian processes are also considered and discussed by means of a weaker but still effective method of block-factorizations. Furthermore, we also discuss a method of dynamic conditioning which applies to correlation products such as baskets and hybrid derivatives.

References

Abramowitz, M. and I.A. Stegun (1972). Handbook of Mathematical Functions. Dover, New York.
Ait-Sahalia, Y. (1996). Nonparametric Pricing of Interest Rate Derivative Securities. Econometrica.
Ait-Sahalia, Y., L.P. Hansen and J.A. Scheinkman (2005). Operator Methods for Continuous Time Markov Processes. Journal of Economic Theory.
Albanese, C. (2007a). Callable Swaps, Snowballs and Videogames. Preprint.
Albanese, C. (2007b). Kernel Convergence Estimates for Diffusions with Continuous Coefficients. arXiv:0711.0132v1 [math.NA].
Albanese, C. and A. Kusnetsov (2005). Transformations of Markov Processes and Classification Scheme for Solvable Driftless Diffusions. Preprint.
Albanese, C. and A. Mijatovic (2006). Convergence Rates for Diffusions on Continuous-Time Lattices.
Albanese, C. and A. Osseiran (2007). Moment Methods for Exotic Volatility Derivatives. Preprint.
Albanese, C. and A. Vidler (2007). A Structural Credit-Equity Model for Bespoke CDOs. Wilmott Magazine.
Albanese, C. and M. Trovato (2005). A Stochastic Volatility Model for Callable CMS Swaps and Translation Invariant Path Dependent Derivatives. Preprint.
Albanese, C. and M. Trovato (2006).
An Interest Rate Process with Stochastic Monetary Policy, Volatility Cube Calibration and Callable CMS Spread Range Accruals. To be released.
Albanese, C. and P. Jones (2007). Non-Resonant Block Diagonalizations for Abelian Processes. Preprint.
Albanese, C. and S. Lawi (2004). Laplace Transforms for Integrals of Markov Processes. To appear in Markov Processes and Functional Analysis.
Albanese, C. and X.O. Chen (2004a). Credit Barrier Models on a Continuous Time Lattice. To appear in Quantitative Finance.
Albanese, C. and X.O. Chen (2004b). Pricing Equity Default Swaps. Preprint.
Albanese, C., G. Campolieti, P. Carr and A. Lipton (2001). Black-Scholes Goes Hypergeometric. Risk, 99–103.
Albanese, C., H. Lo and A. Mijatovic (2006a). Spectral Methods for Volatility Derivatives. Preprint.
Albanese, C., O. Chen, A. Dalessandro and A. Vidler (2005–2006b). Dynamic Credit Correlation Modelling. Preprint.
Bellman, R.E. (1957). Dynamic Programming. Princeton University Press.
Bishop, E. (1967). Foundations of Constructive Analysis. McGraw-Hill.
Bochner, S. (1955). Harmonic Analysis and the Theory of Probability. University of California Press.
Bridges, D. and F. Richman (1987). Varieties of Constructive Mathematics. Cambridge University Press.
Cameron, R.H. and W.T. Martin (1949). Transformations of Wiener Integrals by Non-Linear Transformations. Trans. Amer. Math. Soc.
Doob, J.L. (1953). Stochastic Processes. John Wiley and Sons.
Engle, R.F. (1982). Autoregressive Conditional Heteroscedasticity, with Estimates of the Variance of United Kingdom Inflation. Econometrica, 987–1007.
Feller, W. (1961). An Introduction to Probability Theory and its Applications. Wiley, New York.
Feynman, R.P. (1948). Space-Time Approach to Non-Relativistic Quantum Mechanics. Rev. Modern Phys.
Giorno, V., A.G. Nobile, L.M. Ricciardi and L. Sacerdote (1988). Some Remarks on the Rayleigh Process. Journal of Applied Probability.
Glimm, J. and A. Jaffe (1987). Quantum Physics.
A Functional Integral Point of View. Springer.
Goto, K. and R. van de Geijn (to appear). Anatomy of High-Performance Matrix Multiplication. ACM Trans. Math. Soft.
Hansen, L.P. and J. Scheinkman (1995). Back to the Future: Generating Moment Implications for Continuous-Time Markov Processes. Econometrica.
Hansen, L.P., J. Scheinkman and N. Touzi (1998). Spectral Methods for Identifying Scalar Diffusion Processes. Journal of Econometrics.
Ito, K. (1949). On a Formula Concerning Stochastic Differentials. Trans. Amer. Math. Soc.
Kac, M. (1948). On Some Connections Between Probability Theory and Differential Equations. Rev. Modern Phys.
Kato, T. (1976). Perturbation Theory for Linear Operators. Springer-Verlag, Berlin.
Landau, L.D. and E.M. Lifshits (1977). Quantum Mechanics: Non-Relativistic Theory (3rd ed.). Pergamon, London.
Naimark, M.A. (1959). Normed Rings. Noordhoff, Groningen.
Reed, M. and B. Simon (1980). Methods of Modern Mathematical Physics, Vol. I–IV. Academic Press.
Spanier, E.S. (1966). Algebraic Topology. McGraw-Hill.
Trefethen, Lloyd N. and Mark Embree (2006). Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators. Princeton University Press.