[PDF] A Term Structure Model for Dividends and Interest Rates

Abstract

Over the last decade, dividends have become a standalone asset class instead of a mere side product of an equity investment. We introduce a framework based on polynomial jump-diffusions to jointly price the term structures of dividends and interest rates. Prices for dividend futures, bonds, and the dividend paying stock are given in closed form. We present an efficient moment based approximation method for option pricing. In a calibration exercise we show that a parsimonious model specification has a good fit with Euribor interest rate swaps and swaptions, Euro Stoxx 50 index dividend futures and dividend options, and Euro Stoxx 50 index options.

Full PDF

AA Term Structure Model for Dividends and Interest Rates ∗ Damir Filipovi´c † Sander Willems ‡ March 15, 2019

Abstract

Over the last decade, dividends have become a standalone asset class instead of a mereside product of an equity investment. We introduce a framework based on polynomial jump-diﬀusions to jointly price the term structures of dividends and interest rates. Prices fordividend futures, bonds, and the dividend paying stock are given in closed form. We presentan eﬃcient moment based approximation method for option pricing. In a calibration exercisewe show that a parsimonious model speciﬁcation has a good ﬁt with Euribor interest rateswaps and swaptions, Euro Stoxx 50 index dividend futures and dividend options, and EuroStoxx 50 index options.

JEL Classiﬁcation : C32, G12, G13

MSC2010 Classiﬁcation : 91B70, 91G20, 91G30

Keywords : Dividend derivatives, interest rates, polynomial jump-diﬀusion, term structure,moment based option pricing

In recent years there has been an increasing interest in trading derivative contracts with a directexposure to dividends. Brennan (1998) argues that a market for dividend derivatives couldpromote rational pricing in stock markets. In the over-the-counter (OTC) market, dividendshave been traded since 2001 in the form of dividend swaps, where the ﬂoating leg pays thedividends realized over a predetermined period of time. The OTC market also accommodatesa wide variety of more exotic dividend related products such as knock-out dividend swaps,dividend yield swaps and swaptions. Dividend trading gained signiﬁcant traction in late 2008,when Eurex launched exchange traded futures contracts referencing the dividends paid out byconstituents of the Euro Stoxx 50. The creation of a futures market for other major indices (e.g.,the FTSE 100 and Nikkei 225) followed shortly after, as well as the introduction of exchange ∗ We thank participants at the Workshop on Dynamical Models in Finance in Lausanne, the 8th GeneralAdvanced Mathematical Methods in Finance conference in Amsterdam, the 2nd International Conference onComputational Finance in Lisbon, the 11th Actuarial and Financial Mathematics Conference in Brussels, the2018 Swiss Finance Institute Research Days in Gerzensee, the 10th Bachelier World Congress in Dublin, the 2018Young Researchers Workshop on Data-Driven Decision Making at Cornell University, and seminars at McMasterUniversity, New York University, Princeton University, and UC Berkely, as well as Peter Carr, J´erˆome Detemple(discussant), Alexey Ivashchenko (discussant), Martin Lettau, and Radu Tunaru for their comments. The researchleading to these results has received funding from the European Research Council under the European Union’sSeventh Framework Programme (FP/2007-2013) / ERC Grant Agreement n. 307465-POLYTE. † EPFL and Swiss Finance Institute. Email: damir.ﬁlipovic@epﬂ.ch ‡ EPFL and Swiss Finance Institute. Email: sander.willems@epﬂ.ch a r X i v : . [ q -f i n . M F ] M a r isted options on realized dividends with maturities of up to ten years. Besides the wide varietyof relatively new dividend instruments, there is another important dividend derivative that hasbeen around since the inception of ﬁnance: a simple dividend paying stock. Indeed, a share ofstock includes a claim to all the dividends paid over the stock’s lifetime. Any pricing model fordividend derivatives should therefore also be capable of eﬃciently pricing derivatives on the stockpaying the dividends. What’s more, the existence of interest rate-dividend hybrid products, therelatively long maturities of dividend options, and the long duration nature of the stock allmotivate the use of stochastic interest rates. Despite its apparent desirability, a tractable jointmodel for the term structures of interest rates and dividends, and the corresponding stock, hasbeen missing in the literature to date.We ﬁll this gap and develop an integrated framework to eﬃciently price derivatives on dividends,stocks, and interest rates. We ﬁrst specify dynamics for the dividends and discount factor, andin a second step we recover the stock price in closed form as the sum of the fundamental stockprice (present value of all future dividends) and possibly a residual bubble component. Theinstantaneous dividend rate is a linear function of a multivariate factor process. The interestrates are modeled by directly specifying the discount factor to be linear in the factors, similarlyas in Filipovi´c et al. (2017). The factor process itself is speciﬁed as a general polynomialjump-diﬀusion, as studied in Filipovi´c and Larsson (2017). Such a speciﬁcation makes themodel tractable because all the conditional moments of the factors are known in closed form.In particular, we have closed form expressions for the stock price and the term structures ofdividend futures and interest rate swaps. Any derivative whose discounted payoﬀ can be writtenas a function of a polynomial in the factors is priced through a moment matching method.Speciﬁcally, we ﬁnd the unique probability density function with maximal Boltzmann-Shannonentropy matching a ﬁnite number of moments of the polynomial, as in Mead and Papanicolaou(1984). We then obtain the price of the derivative by numerical integration. In particular, thisallows us to price swaptions, dividend options, and options on the dividend paying stock. Weshow that our polynomial framework also allows to incorporate seasonal behavior in the dividenddynamics.Within our polynomial framework, we introduce the linear jump-diﬀusion (LJD) model. Weshow that the LJD model allows for a ﬂexible dependence structure between the factors. Thisis useful to model a dependence between dividends and interest rates, but also to model thedependence within the term structure of interest rates or dividends. We calibrate a parsimoniousspeciﬁcation of the LJD model to market data on Euribor interest rate swaps and swaptions,Euro Stoxx 50 index dividend futures and dividend options, and Euro Stoxx 50 index options.Our model reconciles the relatively high implied volatility of the index options with the relativelylow implied volatility of dividend options and swaptions through a negative correlation betweendividends and interest rates. The successful calibration of the model to three diﬀerent classes ofderivatives (interest rates, dividends, and equity) illustrates the high degree of ﬂexibility oﬀeredby our framework.Our paper is related to various strands of literature. In the literature on stock option pricing,dividends are often assumed to be either deterministic (e.g., Bos and Vandermark (2002), Boset al. (2003), Vellekoop and Nieuwenhuis (2006)), a constant fraction of the stock price (e.g.,Merton (1973), Korn and Rogers (2005)), or a combination of the two (e.g., Kim (1995), Over-haus et al. (2007)). Geske (1978) and Lioui (2006) model dividends as a stochastic fractionof the stock price. They derive Black-Scholes type of equations for European option prices,however dividends are not guaranteed to be non-negative in both setups. Chance et al. (2002)2irectly specify log-normal dynamics for the T -forward price of the stock, with T the maturityof the option. Closed form option prices are obtained as in Black (1976), assuming that today’s T -forward price is observable. This approach is easy to use since it does not require any mod-eling assumptions on the distribution of the dividends. However, it does not produce consistentoption prices for diﬀerent maturities. Bernhart and Mai (2015) take a similar approach, butsuggest to ﬁx a time horizon T long enough to encompass all option maturities to be priced. The T -forward price is modeled with a non-negative martingale and the stock price is deﬁned as the T -forward price plus the present value of dividends from now until time T . As a consequence,prices of options with maturity smaller than T will depend on the joint distribution betweenfuture dividend payments and the T -forward price, which is not known in general. Bernhartand Mai (2015) resort to numerical tree approximation methods in order to price options. Thedependence of their model on a ﬁxed time horizon still leads to time inconsistency, since thehorizon will necessarily have to be extended at some point in time. We contribute to this liter-ature by building a stock option pricing model that guarantees non-negative dividends, is timeconsistent, and remains tractable.Another strand of literature studies stochastic models to jointly price stock and dividend deriva-tives. Buehler et al. (2010) assumes that the stock price jumps at known dividend payment datesand follows log-normal dynamics in between the payment dates. The relative size of the divi-dends (i.e., the relative jump amplitudes) are assumed to be i.i.d. distributed as 1 − Z , where Z is log-normally distributed. As a consequence, the stock price remains log-normally distributedand the model has closed form prices for European call options on the stock. The high volatilityin the stock price is reconciled with the low volatility in dividend payments by setting the cor-relation between the relative jump size and stock price extremely negative ( − Z is modeled with a beta distribution instead of a log-normal. This guarantees positive dividendpayments. However, the diﬀusive noise of the stock is assumed independent of Z in order to havetractable expressions for dividend futures prices. Smoothing the dividends through a negativecorrelation between stock price and relative dividend size, as in Buehler et al. (2010), is thereforenot possible. The model is able to produce a skew in dividend option prices, but Monte-Carlosimulations are required for the computation. In a second approach, Tunaru (2018) directlymodels the cumulative dividends with a diﬀusive logistic growth model. This modeling choiceis motivated by the ‘sigmoidal’ shape of the time-series of cumulative Euro Stoxx 50 dividends.3lthough the distribution of the cumulative dividends is in principal available in closed form,the author reports it to be too cumbersome for practical purposes and resorts once again toMonte-Carlo simulations for pricing. The process modeling the cumulative dividends, however,has no guarantee to be monotonically increasing, meaning that negative dividends can (anddo) occur frequently. Moreover, the model must be reset on an annual basis, which essentiallyremoves any dependency between dividends in subsequent years. We add to this literature by al-lowing for stochastic interest rates, which is important for the valuation of interest rate-dividendhybrid products or long-dated dividend derivatives (e.g., the dividend paying stock). Our modelproduces closed form prices for dividend futures and features eﬃcient approximations for optionprices which are signiﬁcantly faster than Monte-Carlo simulations. Seasonality can be incorpo-rated directly in the dividend dynamics, which in turn produces a sigmoidal cumulative dividendprocess by construction . The low volatility in dividends and interest rates is reconciled with thehigh volatility in the stock price through a negative correlation between dividends and interestrates.Our work also relates to literature on constructing an integrated framework for dividends andinterest rates. Previous approaches were mainly based on aﬃne processes, see e.g. Bekaert andGrenadier (1999), Mamaysky et al. (2002), d’Addona and Kind (2006), Lettau and Wachter(2007, 2011), and Lemke and Werner (2009). In more recent work, Kragt et al. (2018) extractinvestor information from dividend derivatives by estimating a two-state aﬃne state space modelon stock index dividend futures in four diﬀerent stock markets. Instead of modeling dividendsand interest rates separately, they choose to model dividend growth, a risk-free discount rate,and a risk premium in a single variable called the ‘discounted risk-adjusted dividend growthrate’. Yan (2014) uses zero-coupon bond prices and present value claims to dividend extractedfrom the put-call parity relation to estimate an aﬃne term structure model for interest rates anddividends. Suzuki (2014) uses a Nelson-Siegel approach to estimate the fundamental value ofthe Euro Stoxx 50 using dividend futures and Euribor swap rates. We add to this literature bybuilding an integrated framework for dividends and interest rates using the class of polynomialprocesses, which contains the traditional aﬃne processes as a special case.Finally, our work also relates to literature on moment based option pricing. Jarrow and Rudd(1982), Corrado and Su (1996b), and Collin-Dufresne and Goldstein (2002b) use Edgeworthexpansions to approximate the density function of the option payoﬀ from the available moments.Closely related are Gram-Charlier expansions, which are used for option pricing for example byCorrado and Su (1996a), Jondeau and Rockinger (2001), and Ackerer et al. (2018). Althoughthese series expansions allow to obtain a function that integrates to one and matches an arbitrarynumber of moments by construction, it has no guarantee to be positive. In this paper, we ﬁndthe unique probability density function with maximal Boltzmann-Shannon entropy. subject to aﬁnite number of moment constraints. Option prices are then obtained by numerical integration.A similar approach is taken by Fusai and Tagliani (2002) to price Asian options. The principle ofmaximal entropy has also been used to extract the risk-neutral distribution from option prices,see e.g. Buchen and Kelly (1996), Jackwerth and Rubinstein (1996), Avellaneda (1998), andRompolis (2010). There exist many alternatives to maximizing the entropy in order to ﬁnda density function satisfying a ﬁnite number of moment constraints. For example, one canmaximize the smoothness of the density function (see e.g., Jackwerth and Rubinstein (1996))or directly maximize (minimize) the option price itself to obtain an upper (lower) bound on theprice (see e.g., Lasserre et al. (2006)). A comparison of diﬀerent approaches is beyond the scopeof this paper. 4he remainder of the paper is structured as follows. Section 2 introduces the factor process anddiscusses the pricing of dividend futures, bonds, and the dividend paying stock. In Section 3 weexplain how to eﬃciently approximate option prices using maximum entropy moment matching.Section 4 describes the LJD model. In Section 5 we calibrate a parsimonious model speciﬁcationto real market data. Section 6 discusses some extensions of the framework. Section 7 concludes.All proofs and technical details can be found in the appendix. We consider a ﬁnancial market modeled on a ﬁltered probability space (Ω , F , F t , Q ) where Q is a risk-neutral pricing measure. Henceforth E t [ · ] denotes the F t -conditional expectation. Wemodel the uncertainty in the economy through a factor process X t taking values in some statespace E ⊆ R d . We assume that X t is a polynomial jump-diﬀusion (cfr. Filipovi´c and Larsson(2017)) with dynamics(1) d X t = κ ( θ − X t ) d t + d M t , for some parameters κ ∈ R d × d , θ ∈ R d , and some d -dimensional martingale M t such that thegenerator G of X t maps polynomials to polynomials of the same degree or less. One of themain features of polynomial jump-diﬀusions is the fact that they admit closed form conditionalmoments. For n ∈ N , denote by Pol n ( E ) the space of of polynomials on E of degree n or lessand denote its dimension by N n . Let h , . . . , h N n form a polynomial basis for Pol n ( E ) anddenote H n ( x ) = ( h ( x ) , . . . , h N n ( x )) (cid:62) . Since G leaves Pol n ( E ) invariant, there exists a uniquematrix G n ∈ R N n × N n representing the action of G on Pol n ( E ) with respect to the basis H n ( x ).Without loss of generality we assume to work with the monomial basis. Example 2.1. If n = 1 , then we have H ( x ) = (1 , x , . . . , x d ) (cid:62) and G becomes (2) G = (cid:18) κθ − κ (cid:19) . From the invariance property of G , one can derive the moment formula (Theorem 2.4 in Filipovi´cand Larsson (2017)) E t [ H n ( X T )] = e G n ( T − t ) H n ( X t ) , (3)for all t ≤ T . Many eﬃcient algorithms exist to numerically compute the matrix exponential(e.g., Al-Mohy and Higham (2011)). Consider a stock that pays a continuous dividend stream to its owner at an instantaneousrate D t , which varies stochastically over time. We model the cumulative dividend process C t = C + (cid:82) t D s d s as:(4) C t = e βt p (cid:62) H ( X t ) , We assume that E has non-empty interior. Since the interior of E is assumed to be non-empty, Pol n ( E ) can be identiﬁed with Pol n ( R d ) and therefore N n = (cid:0) n + dd (cid:1) . β ∈ R and p ∈ R d +1 such that C t is a positive, non-decreasing, andabsolutely continuous (i.e., drift only) process. This speciﬁcation for C t implicitly pins down D t , which is shown in the following proposition. Proposition 2.2.

The instantaneous dividend rate D t implied by (4) is given by (5) D t = e βt p (cid:62) ( β Id + G ) H ( X t ) , where Id denotes the identity matrix. Remark that both the instantaneous dividend rate and the cumulative dividends load linearly onthe factor process. The exponential scaling of C t with parameter β can be helpful to guaranteea non-negative instantaneous dividend rate. Indeed, if(6) λ = sup x ∈ E − p (cid:62) G H ( x ) p (cid:62) H ( x )is ﬁnite, then it follows from (5) that D t ≥ β ≥ λ . Moreover, when all eigenvaluesof κ have positive real parts, it follows from the moment formula (3) thatlim T →∞ T − t log (cid:18) E t [ D T ] D t (cid:19) = β. The parameter β therefore controls the asymptotic risk-neutral expected growth rate of thedividends.The time- t price of a continuously marked-to-market futures contract referencing the dividendsto be paid over a future time interval [ T , T ], t ≤ T ≤ T , is given by: D fut ( t, T , T ) = E t (cid:20)(cid:90) T T D s d s (cid:21) = E t [ C T − C T ]= p (cid:62) (cid:16) e βT e G ( T − t ) − e βT e G ( T − t ) (cid:17) H ( X t ) , (7)where we have used the moment formula (3) in the last equality. Hence, the dividend futuresprice is linear in the factor process. Note that the dividend futures term structure (i.e., thedividend futures prices for varying T and T ) does not depend on the speciﬁcation of themartingale part of X t . Denote the risk-neutral discount factor by ζ t . It is related to the short rate r t as follows ζ T = ζ t e − (cid:82) Tt r s d s , ≤ t ≤ T. We directly specify dynamics for the risk-neutral discount factor: ζ t = e − γt q (cid:62) H ( X t ) , (8) We calculate λ explicitly for the linear jump-diﬀusion model studied in Section 4. γ ∈ R and q ∈ R d +1 such that ζ t is a positive and absolutely continuousprocess. This is similar to the speciﬁcation (4) of C t but, in order to allow for negative interestrates, we do not require ζ t to be monotonic (non-increasing). Filipovi´c et al. (2017) follow asimilar approach and specify linear dynamics for the state price density with respect to thehistorical probability measure P . Their speciﬁcation pins down the market price of risk. It turnsout that the polynomial property of the factor process is not preserved under the change ofmeasure from P to Q in this case. However, as seen in (7), the polynomial property (in particularthe linear drift) under Q is important for pricing the dividend futures contracts.The time- t price of a zero-coupon bond paying one unit of currency at time T ≥ t is givenby: P ( t, T ) = 1 ζ t E t [ ζ T ] . Using the moment formula (3) we get a linear-rational expression for the zero-coupon bondprice(9) P ( t, T ) = e − γ ( T − t ) q (cid:62) e G ( T − t ) H ( X t ) q (cid:62) H ( X t ) . Remark that the term structure of zero-coupon bond prices depends only on the drift of X t .Similarly as in Filipovi´c et al. (2017), one can introduce exogenous factors feeding into themartingale part of X t to generate unspanned stochastic volatility (see e.g., Collin-Dufresne andGoldstein (2002a)), however we do not consider this in our paper.Using the relation r t = − ∂ T log P ( t, T ) | T = t , we obtain the following linear-rational expressionfor the short rate: r t = γ − q (cid:62) G H ( X t ) q (cid:62) H ( X t ) . When all eigenvalues of κ have positive real parts, it follows thatlim T →∞ − log( P ( t, T )) T − t = γ, so that γ can be interpreted as the yield on the zero-coupon bond with inﬁnite maturity.Ignoring diﬀerences in liquidity and credit characteristics between discount rates and IBORrates, we can value swap contracts as linear combinations of zero-coupon bond prices. Thetime- t value of a payer interest rate swap with ﬁrst reset date T ≥ t , ﬁxed leg payment dates T < · · · < T n , and ﬁxed rate K is given by:(10) π swapt = P ( t, T ) − P ( t, T n ) − K n (cid:88) k =1 δ k P ( t, T k ) , with δ k = T k − T k − , k = 1 . . . , n . The forward swap rate is deﬁned as the ﬁxed rate K which makes the right hand side of (10) equal to zero. Note that the discounted swap value ζ t π swapt becomes a linear function of X t , which will be important for the purpose of pricingswaptions. 7 .3 Dividend paying stock Denote by S ∗ t the fundamental price of the stock, which we deﬁne as the present value of allfuture dividends:(11) S ∗ t = 1 ζ t E t (cid:20)(cid:90) ∞ t ζ s D s d s (cid:21) . In order for S ∗ t to be ﬁnite in our model, we must impose parameter restrictions. The followingproposition provides suﬃcient conditions on the parameters, together with a closed form expres-sion for S ∗ t . The latter is derived using the fact that ζ t D t is quadratic in X t , hence we are ableto calculate its conditional expectation through the moment formula (3). Proposition 2.3.

If the real parts of the eigenvalues of G are bounded above by γ − β , then S ∗ t is ﬁnite and given by (12) S ∗ t = e βt w (cid:62) H ( X t ) q (cid:62) H ( X t ) , where w = (cid:2) ( γ − β ) Id − G (cid:62) (cid:3) − v and v ∈ R N is the unique coordinate vector satisfying v (cid:62) H ( x ) = p (cid:62) ( β Id + G ) H ( x ) q (cid:62) H ( x ) . Proposition 2.3 shows that the discounted fundamental stock price ζ t S ∗ t is quadratic in X t ,which means in particular that we have all moments of ζ t S ∗ t in closed form. Loosely speaking,the fundamental stock price will be ﬁnite as long as the dividends are discounted at a suﬃcientlyhigh rate (by choosing γ suﬃciently large). Henceforth we will assume that the assumption ofProposition 2.3 is satisﬁed.The following proposition shows how the price of the dividend paying stock, which we denoteby S t , is related to the fundamental stock price. Proposition 2.4.

The market is arbitrage free if and only if S t is of the form S t = S ∗ t + L t ζ t , (13) with L t a non-negative local martingale. The process L t can be interpreted as a bubble in the sense that it drives a wedge between thefundamental stock price and the observed stock price. Applying Itˆo’s lemma to (13), we obtainthe following risk-neutral stock price dynamics(14) d S t = ( r t S t − D t ) d t + e βt w (cid:62) J H ( X t ) q (cid:62) H ( X t ) d M t + 1 ζ t d L t , where J H ( x ) denotes the Jacobian of H ( x ). Remark that S t has the correct risk-neutraldrift, by construction. Given dynamics for r t and D t , an alternative approach to model S t forderivative pricing purposes would have been to directly specify its martingale part. With suchan approach, however, it is not straightforward to guarantee a positive stock price. Indeed, This relationship has been highlighted in particular by Buehler (2010, 2015) in the context of derivativepricing. Moreover, it is clear that by directly specifying the martingale part of the stockprice, we are implicitly modeling a bubble in the stock price. In contrast, our approach implies amartingale part (the second term in (14)) that guarantees a positive stock price. This martingalepart is completely determined by the given speciﬁcation for dividends and interest rates. In casethis is too restrictive for the stock price dynamics, one can always adjust accordingly throughthe speciﬁcation of the non-negative local martingale L t . For example, Buehler (2015) considersa local volatility model on top of the fundamental stock price that is separately calibrated toequity option prices. Remark 2.5.

Bubbles are usually associated with strict local martingales, see e.g. Cox andHobson (2005). In fact, for economies with a ﬁnite time horizon, a bubble is only possible ifthe deﬂated gains process is a strict local martingale, which corresponds to a bubble of Type3 according to the classiﬁcation of Jarrow et al. (2007). For economies with an inﬁnite timehorizon, which is the case in our setup, bubbles are possible even if the deﬂated gains process isa true martingale. Such bubbles are of Type 1 and 2 in the classiﬁcation Jarrow et al. (2007).Speciﬁcally, a (uniformly integrable) martingale L t corresponds to a bubble of Type 2 (Type 1). We ﬁnish this section with a result on the duration of the stock. We deﬁne the stock durationas(15)

Dur t = (cid:82) ∞ t ( s − t ) E t [ ζ s D s ] d sζ t S ∗ t . The stock duration represents a weighted average of the time an investor has to wait to receive hisdividends, where the weights are the relative contribution of the present value of the dividendsto the fundamental stock price. This deﬁnition is the continuous time version of the one used byDechow et al. (2004) and Weber (2018). The following proposition gives a closed form expressionfor stock duration in our framework.

Proposition 2.6.

The stock duration is given by (16)

Dur t = w (cid:62) [( γ − β ) Id − G ] − H ( X t ) w (cid:62) H ( X t ) . In this section we address the problem of pricing derivatives with discounted payoﬀ functionsthat are not polynomials in the factor process. The polynomial framework no longer allows toprice such derivatives in closed form. However, we can accurately approximate the prices usingthe available moments of the factor process.

In all examples encountered below, we consider a derivative maturing at time T whose discountedpayoﬀ is given by F ( g ( X T )), for some g ∈ Pol n ( E ), n ∈ N , and some function F : R → R . The Instead of starting from dynamics for D t , we could have speciﬁed dynamics for the dividend yield D t /S t .This would help to keep the stock price positive, but it does typically not produce a tractable distribution for D t .This is problematic since dividend derivatives reference notional dividend payments paid out over a certain timeperiod. t price π t of this derivative is given by(17) π t = E t (cid:2) F (cid:0) g ( X T (cid:1)(cid:3) . If the conditional distribution of the random variable g ( X T ) were available in closed form, wecould compute π t by integrating F over the real line. In general, however, we are only given all theconditional moments of the random variable g ( X T ). We thus aim to construct an approximativeprobability density function f matching a ﬁnite number of these moments. In a second step weapproximate the option price through numerically integrating F with respect to f . Given that afunction is an inﬁnite dimensional object, ﬁnding such a function f is clearly an underdeterminedproblem and we need to introduce additional criteria to pin down one particular function. Apopular choice in the engineering and physics literature is to choose the density function withmaximum entropy:(18) max f − (cid:90) R f ( x ) ln f ( x ) d x s . t . (cid:90) R x n f ( x ) d x = M n , n = 0 , . . . , N, where R ⊆ R denotes the support and M = 1 , M , . . . , M N denote the ﬁrst N + 1 moments of g ( X T ). Jaynes (1957) motivates such a choice by noting that maximizing entropy incorporatesthe least amount of prior information in the distribution, other than the imposed momentconstraints. In this sense it is maximally noncommittal with respect to unknown informationabout the distribution.Straightforward functional variation with respect to f gives the following solution to this opti-mization problem: f ( x ) = exp (cid:32) − N (cid:88) i =0 λ i x i (cid:33) , x ∈ R, where the Lagrange multipliers λ , . . . , λ N have to be solved from the moment constraints: (cid:90) R x n exp (cid:32) − N (cid:88) i =0 λ i x i (cid:33) d x = M n , n = 0 , . . . , N. (19)If N = 0 and R = [0 , N = 1 and R = (0 , ∞ )we obtain the exponential distribution, while for N = 2 and R = R we obtain the Gaussiandistribution. For N ≥

3, one needs to solve the system in (19) numerically, which involvesevaluating the integrals numerically. We refer to the existing literature for more details onthe implementation of maximum entropy densities, see e.g. Agmon et al. (1979), Mead andPapanicolaou (1984), Rockinger and Jondeau (2002), and Holly et al. (2011). Directly trying to ﬁnd the roots of this system might not lead to satisfactory results. A more stable numericalprocedure is obtained by introducing the following potential function: P ( λ , . . . , λ N ) = (cid:82) R exp( − (cid:80) Ni =0 λ i x i ) d x + (cid:80) Ni =0 λ i M i . This function can easily be shown to be everywhere convex (see e.g., Mead and Papanicolaou (1984))and its gradient corresponds to the vector of moment conditions in (19). In other words, the Lagrange multiplierscan be found by minimizing the potential function P ( λ , . . . , λ N ). This is an unconstrained convex optimizationproblem where we have closed form (up to numerical integration) expressions for the gradient and hessian, whichmakes it a prototype problem to be solved with Newton’s method. emark 3.1. By subsequently combining the law of iterated expectations and the moment for-mula (3) , we are also able to compute the conditional moments of the ﬁnite dimensional distri-butions of X t . In particular, the method described in this section can also be applied to pricepath-dependent derivatives whose discounted payoﬀ depends on the factor process at a ﬁnite num-ber of future time points. One example of such products are the dividend options, which will bediscussed below. The time- t price π swaptiont of a payer swaption with expiry date T , which gives the owner theright to enter into a (spot starting) payer swap at T , is given by: π swaptiont = 1 ζ t E t (cid:20) ζ T (cid:16) π swapT (cid:17) + (cid:21) = 1 ζ t E t (cid:34)(cid:32) ζ T − ζ T P ( T , T n ) − K n (cid:88) k =1 δ k ζ T P ( T , T k ) (cid:33) + (cid:35) = e − γ ( T − t ) q (cid:62) H ( X t ) E t (cid:34)(cid:32) q (cid:62) (cid:32) Id − e ( G − γ Id)( T n − T ) − K n (cid:88) k =1 δ k e ( G − γ Id)( T k − T ) (cid:33) H ( X T ) (cid:33) + (cid:35) , where we have used (9) in the last equality. Observe that the discounted payoﬀ of the swaptionis of the form in (17) with F ( · ) = max( · ,

0) and g is a polynomial of degree one in X T .The time- t price π stockt of a European call option on the dividend paying stock with strike K and expiry date T is given by π stockt = 1 ζ t E t (cid:2) ζ T ( S T − K ) + (cid:3) = 1 ζ t E t (cid:2) ( L T + ζ T S ∗ T − ζ T K ) + (cid:3) = e − γ ( T − t ) q (cid:62) H ( X t ) E t (cid:20)(cid:16) e γT L T + e βT w (cid:62) H ( X T ) − q (cid:62) H ( X T ) K (cid:17) + (cid:21) , (20)where we have used (12) in the last equality. If ( L t , X t ) is jointly a polynomial jump-diﬀusion,we can compute all moments of the random variable e γT L T + e βT w (cid:62) H ( X T ) − q (cid:62) H ( X T ) K andproceed as explained in Section 3.1. Remark 3.2.

If one assumes independence between the processes L t and X t , then the assump-tion that ( L t , X t ) must jointly be a polynomial jump-diﬀusion is not necessarily needed. Indeed,suppose L t is speciﬁed such that we can compute F ( k ) = e − γ ( T − t ) E t [(e γT L T − k ) + ] eﬃciently.By the law of iterated expectations we have π stockt = E t [ F ( g ( X T ))] q (cid:62) H ( X t ) , where we deﬁne g ( x ) = − e βT w (cid:62) H ( x ) + q (cid:62) H ( x ) K ∈ Pol ( E ) . The numerator in the aboveexpression is now of the form in (17) and we proceed as before. T , T ], expiry date T , andstrike price K . This type of options are actively traded on the Eurex exchange where the EuroStoxx 50 dividends serve as underlying. The time- t price π divt of this product is given by π divt = 1 ζ t E t (cid:34) ζ T (cid:18)(cid:90) T T D s d s − K (cid:19) + (cid:35) = 1 ζ t E t (cid:2) ( ζ T ( C T − C T − K )) + (cid:3) = e − γ ( T − t ) q (cid:62) H ( X t ) E t (cid:20)(cid:16) q (cid:62) H ( X T ) (cid:16) e βT p (cid:62) H ( X T ) − e βT p (cid:62) H ( X T ) − K (cid:17)(cid:17) + (cid:21) . We can compute in closed form all the moments of the scalar random variable q (cid:62) H ( X T ) (cid:16) e βT p (cid:62) H ( X T ) − e βT p (cid:62) H ( X T ) − K (cid:17) by subsequently applying the law of iterated expectations and the moment formula (3), seeRemark 3.1. We then proceed as before by ﬁnding the maximum entropy density correspondingto these moments and computing the option price by numerical integration. In this section we give a worked-out example of a factor process that ﬁts in the polynomialframework of Section 2. In the following, if x ∈ R d then diag( x ) denotes the diagonal matrixwith x , . . . , x d on its diagonal. If x ∈ R d × d , then we denote diag( x ) = ( x , . . . , x dd ) (cid:62) .The linear jump-diﬀusion (LJD) model assumes the following dynamics for the factor pro-cess d X t = κ ( θ − X t ) d t + diag( X t − ) (Σ d B t + d J t ) , (21)where B t is a standard d -dimensional Brownian motion, Σ ∈ R d × d is a lower triangular matrixwith non-negative entries on its main diagonal, J t is a compensated compound Poisson processwith arrival intensity ξ ≥ F (d z ) that admits moments of all orders. Both the jump amplitudes and the Poisson jumps are assumed to be independent from thediﬀusive noise. The purely diﬀusive LJD speciﬁcation (i.e., ξ = 0) has appeared in variousﬁnancial contexts such as stochastic volatility (Nelson (1990), Barone-Adesi et al. (2005)), energymarkets (Pilipovi´c (1997)), interest rates (Brennan and Schwartz (1979)), and Asian optionpricing (Linetsky (2004), Willems (2018)). The extension with jumps has not received muchattention yet.The following proposition veriﬁes that X t is indeed a polynomial jump-diﬀusion and also showshow to choose parameters such that X t has positive components. Proposition 4.1.

Assume that matrix κ has non-positive oﬀ-diagonal elements, ( κθ ) i ≥ , i = 1 , . . . , d , and F has support S ⊆ ( − , ∞ ) d . Then for every initial value X ∈ (0 , ∞ ) d thereexists a unique strong solution X t to (21) with values in (0 , ∞ ) d . Moreover, X t is a polynomialjump-diﬀusion. For simplicity we assume a compound Poisson process with a single jump intensity, however this can begeneralized (see Filipovi´c and Larsson (2017)).

12e will henceforth assume that the assumptions of Proposition 4.1 are satisﬁed, as it allowsto derive parameter restrictions to guarantee C t > ζ t >

0, and D t ≥

0. In order to have p (cid:62) H ( x ) > q (cid:62) H ( x ) > x ∈ (0 , ∞ ) d , the vectors p and q must have non-negative components with at least one component diﬀerent from zero. The following propositionintroduces a lower bound on β such that D t ≥ Proposition 4.2.

Let p = ( p , p , . . . , p d ) (cid:62) ∈ [0 , ∞ ) d and denote ˜ p = ( p , . . . , p d ) (cid:62) . Assumethat at least one of the p , . . . , p d is non-zero, so that dividends are not deterministic. Withoutloss of generality we assume p , . . . , p k > and p k +1 , . . . , p d = 0 , for some ≤ k ≤ d . If wedenote by κ j the j -th column of κ , then we have D t ≥ if and only if (22) β ≥  max (cid:26) ˜ p (cid:62) κ p , . . . , ˜ p (cid:62) κ k p k (cid:27) if p = 0 , max (cid:26) − ˜ p (cid:62) κθp , ˜ p (cid:62) κ p , . . . , ˜ p (cid:62) κ k p k (cid:27) if p > . The LJD model allows a ﬂexible instantaneous correlation structure between the factors throughthe matrix Σ. This is in contrast to non-negative aﬃne jump-diﬀusions, a popular choice in termstructure modeling when non-negative factors are required, see, e.g., Duﬃe et al. (2003). Indeed,as soon as one introduces a non-zero instantaneous correlation between the factors of a non-negative aﬃne jump-diﬀusion, the aﬃne (and polynomial) property is lost. Correlation betweenfactors can be used to incorporate a dependence between the term structures of interest ratesand dividends, but also to model a dependence within a single term structure. The LJD modelalso allows for state-dependent, positive and negative, jump sizes of the factors. This again isin contrast to non-negative aﬃne jump-diﬀusions.The following proposition provides the eigenvalues of the corresponding matrix G under theassumption of a triangular form for κ . Combined with Proposition 2.3, this gives suﬃcientconditions to guarantee a ﬁnite stock price in the LJD model. Proposition 4.3. If κ is a triangular matrix, then the eigenvalues of the matrix G are , − κ , . . . , − κ dd , − κ ii − κ jj + (ΣΣ (cid:62) ) ij + ξ (cid:90) S z i z j F (d z ) , ≤ i, j ≤ d. The eigenvalues of G coincide with the values on the ﬁrst line. In this section we calibrate a parsimonious LJD model speciﬁcation to market data on weeklyintervals (Wednesday to Wednesday) from May 2010 until December 2015. All the data isobtained from Bloomberg. On every day of the sample we minimize the squared diﬀerencebetween the model implied and market observed prices. The initial values of the factor processare included as free parameters, which brings the total number of parameters to be calibratedto 12. For the optimization we use the Nelder-Mead simplex algorithm. We use the outcome ofevery optimization as initial guess for the optimization on the next sample day.13 .1 Data description

The dividend paying stock in our calibration study is the Euro Stoxx 50, the leading blue-chipstock index in the Eurozone. The index is composed of ﬁfty stocks of sector leading companiesfrom twelve Eurozone countries. We choose to focus on the European market because thedividend futures contracts on the Euro Stoxx 50 are the most liquid in the world and have beenaround longer than in any other market. Kragt et al. (2018) report an average daily turnoverof more than EUR 150 million for all expiries combined, with the majority of the liquidity inthe ﬁrst ﬁve expiries. The Euro Stoxx 50 dividend futures contracts are traded on Eurex andreference the sum of the declared ordinary gross cash dividends (or cash-equivalent, e.g. stockdividends) on index constituents that go ex-dividend during a given calendar year, divided bythe index divisor on the ex-dividend day. Corporate actions that cause a change in the indexdivisor are excluded from the dividend calculations, e.g. special and extraordinary dividends,return of capital, stock splits, etc. One every day of the sample there are ten annual contractsavailable for trading with maturity dates on the third Friday of December. For example, onSeptember 1 2015, the k -th to expire contract, k = 1 , . . . ,

10, references the dividends paidbetween the third Friday of December 2014 + k − k .We interpolate adjacent dividend futures contracts using the approach of Kragt et al. (2018) toconstruct contracts with a constant time to maturity of 1 to 9 years. In the calibration we usethe contracts with maturities in 1, 2, 3, 5, 7, and 9 years, the remaining ones will be used foran out-of-sample exercise. Figure 1a plots the interpolated dividend futures prices with 1, 2, 5,and 9 years to maturity.Next to the Euro Stoxx 50 dividend futures contracts, there also exist exchanged traded optionson realized dividends. The maturity dates and the referenced dividends of the options coincidewith those of the corresponding futures contracts. At every calibration date, we consider theBlack (1976) implied volatility of an at-the-money (ATM) dividend option with 2 years tomaturity. Since dividend option contracts have ﬁxed maturity dates, we interpolate the impliedvolatility of the second and third to expire ATM option contract. Figure 1b plots the impliedvolatilities of the dividend options over time.The term structure of interest rates is calibrated to European spot-starting swap contractsreferencing the six month Euro Interbank Oﬀered Rate (Euribor) with tenors of 1, 2, 3, 5, 7,10, and 20 years. Figure 1c plots the par swap rates of swaps with tenors of 1, 5, 10, and20 years (other tenors have been left out of the plot for clarity). In addition, we also includeATM swaptions with time to maturity equal to 3 months and underlying swap with tenor 10years. These are among the most liquid ﬁxed-income instruments in the European market. Theswaptions are quoted in terms of normal implied volatility and are plotted in Figure 1d.We also consider Euro Stoxx 50 index options with ATM strike and a maturity of 3 months.Their prices are quoted in terms of Black-Scholes implied volatility and plotted in Figure 1btogether with the dividend options implied volatility. Figure 1e plots the Euro Stoxx 50 indexlevel over time. We could also calibrate the model without doing any interpolation of the data. However, in order to make theﬁtting errors of the sequential calibrations more comparable over time, we choose to interpolate all instrumentssuch that they have a constant time to maturity. We linearly interpolate the total implied variance σ Black τ , where σ Black denotes the implied volatility and τ the maturity of the option. .2 Model speciﬁcation We propose a parsimonious four-factor LJD speciﬁcation without jumps for X t = ( X I t , X I t , X D t , X D t ) (cid:62)  d X I t = κ I (cid:0) X I t − X I t (cid:1) d t d X I t = κ I ( θ I − X I t ) d t + σ I X I t d B t d X D t = κ D (cid:0) X D t − X D t (cid:1) d t d X D t = κ D ( θ D − X D t ) d t + σ D X D t (cid:16) ρ d B t + (cid:112) − ρ d B t (cid:17) , (23)with ρ ∈ [ − , κ I , κ D , κ I , κ D , θ I , θ D , σ I , σ D >

0, and X ∈ (0 , ∞ ) . By Proposition 4.1, X t takes values in (0 , ∞ ) . Since we only include options with ATM strike in the calibration, wechoose not to include any jumps in the dynamics in order to keep the number of parameterssmall. We deﬁne the cumulative dividend process as C t = e βt X D t , so that X D t and X D t are driving the term structure of dividends. The corresponding instanta-neous dividend rate becomes D t = e βt (cid:0)(cid:0) β − κ D (cid:1) X D t + κ D X D t (cid:1) . Using Proposition 4.2, we guarantee D t ≥ β ≥ κ D . In order to further reduce thenumber of parameters, we set β = κ , so that D t = e βt κ D X D t and X D t no longer enters in thedynamics of D t . We can thus normalize C = X D = 1.The discount factor process is deﬁned as ζ t = e − γt X I t , so that X I t and X I t are driving the term structure of interest rates. The corresponding shortrate becomes r t = ( γ + κ I ) − κ I X I t X I t , which is unbounded from below and bounded above by γ + κ I . This allows to capture thenegative interest rates that occur in the sample. Dividing ζ t by a positive constant does notaﬀect model prices, so for identiﬁcation purposes we normalize θ I = 1. The matrix κ is upper triangular and given by κ =  κ I − κ I κ I κ D − κ D κ D  . In the more general polynomial framework described in Section 2, it is possible to lower bound the short rate.For example, one can use compactly supported polynomial processes, similarly as in Ackerer and Filipovi´c (2018). For a constant k >

0, the dynamics of ( ˜ X I t , ˜ X I t ) := ( kX I t , kX I t ) is given by (cid:40) d ˜ X I t = κ I (cid:16) ˜ X I t − ˜ X I t (cid:17) d t d ˜ X I t = κ I (˜ θ I − ˜ X I t ) d t + σ I ˜ X I t d B t , with ˜ θ I := kθ I . The dynamics of ( ˜ X I t , ˜ X I t ) is therefore of the same form as that of ( X I t , X I t ). κ are all positive by assumption.We can therefore interpret γ as the asymptotic zero-coupon bond yield and β as the asymptoticrisk-neutral expected dividend growth rate. Using Propositions 2.3 and 4.3, we introduce thefollowing constraint on the model parameters in order to guarantee a ﬁnite stock price: γ − β > max (cid:8) , ( σ I ) − κ I , ( σ D ) − κ D , σ I σ D ρ − κ I − κ D (cid:9) . The parameter ρ ∈ [ − ,

1] controls the correlation between interest rates and dividends. Speciﬁ-cally, the instantaneous correlation between the dividend rate and the short rate is given by(24) d[

D, r ] t (cid:112) d[ D, D ] t (cid:112) d[ r, r ] t = − ρ, where [ · , · ] t denotes the quadratic covariation. The minus sign in front of ρ appears becausethe Brownian motion B t drives the discount factor, which is negatively related to the shortrate. Although the option pricing technique described in Seciton 3.1 works in theory for any ﬁnitenumber of moment constraints, there is a computational cost associated with computing themoments on the one hand, and solving the Lagrange multipliers on the other hand. In thecalibration, we use moments up to order four to price swaptions, dividend options, and stockoptions. The number of moments needed for an accurate option price depends on the speciﬁcform of the payoﬀ function and on the model parameters. As an example, Figure 2 shows pricesof a swaption, dividend option, and stock option for diﬀerent number of moments matchedand using the calibrated parameters from an arbitrary day in the sample. As a benchmark, weperform a Monte-Carlo simulation of the model. We discretize (23) at a weekly frequency with asimple Euler scheme and simulate 10 trajectories. For all three options, the maximum entropymethod based on the ﬁrst four moments produces an approximation within the 95%-conﬁdenceinterval of the Monte-Carlo simulation.Table 1 shows the absolute pricing error over the sample period. Considering the relativelysmall number of parameters, the ﬁt is surprisingly good. Dividend futures have a mean absoluterelative error between 1 and 4%. The mean absolute error of the swap rates is in the order ofbasis points for all tenors. The ﬁt with the dividend option, swaption, and stock option impliedvolatilities is close to perfect with a mean absolute pricing error of less than three basis points.The Eurostoxx 50 index level is matched with a mean relative error of less 0.1%. Figure 3 showsthe evolution of the errors over time. The largest errors for the dividend futures occur in the2011-2013 period, which corresponds to the peak of the European debt crisis.Figure 4a plots the calibrated γ , which is the yield on the zero-coupon bond with inﬁnitematurity. The plot shows a steady decline over time from approximately 6.5% to 1%. Thisreﬂects the drop in interest rates over the sample period as a consequence of quantitative easingby the European Central Bank. Figure 4b plots the calibrated β , which corresponds to theasymptotic risk-neutral expected growth rate of the dividends. It is always substantially lowerthan γ , which is required to keep the stock price ﬁnite. Figure 4c plots the calibrated ρ , which In addition, we also use the forward contract price as a control variate. This variance reduction techniquereduces the variance of the Monte-Carlo estimator approximately by a factor 4.

16n view of (24) determines the correlation between the term structure of interest rates anddividends. Remarkably, ρ is positive over almost all of the sample period, with an average ofaround 80%. In view of (24), this indicates a highly negative correlation between interest ratesand dividends. This negative correlation is a central ingredient in our model, since it increases thevolatility of the stock price relative to the dividends and interest rates. This allows to reconcilethe relatively high implied volatility of stock options with the relatively low implied volatilityof dividend options and swaptions. For example, the large drop in ρ at the beginning of 2015corresponds to the a period where the implied volatility of the stock option dropped sharply, butthe dividend option was unaﬀected. Figure 5 plots the instantaneous correlation between S t and r t . The stock price is aﬀected by interest rates through two channels: 1) through the discountingof future dividends and 2) through the correlation between dividends and interest rates. Figure5 shows a negative instantaneous correlation between S t and r t , except on a handful of dayswhere the second channel marginally oﬀsets the ﬁrst one.Figure 6a plots the scaled initial value X D , which corresponds to the spot dividend rate D .Not surprisingly, it closely resembles the dynamics of the 1 year dividend futures price in Figure1a. The term structure of dividend futures is downward sloping over almost the entire sample,which is reﬂected in the calibration by the fact that X D is always well above its long-term mean θ D . Figure 6b plots the initial values X I , X I of the interest rate factors, and Figure 6c plotsthe corresponding model implied short rate r . The increasing trend of the factor process overtime illustrates the increasingly exceptional low interest rate situation in the Eurozone. Figure6d plots the normal volatility σ I κ I X I X I of the short rate, which looks similar in shape to theswaption implied volatility in Figure 1d.As an out-of-sample exercise, we compute model implied prices of instruments not included inthe calibration. Speciﬁcally, we consider dividend futures and interest rate swaps with maturityin 6 and 8 years, a dividend option with maturity in 3 years, a swaption with maturity in 6months and underlying swap with tenor 10 years, and a stock option with maturity in 6 months.All options have ATM strike. The market and model implied prices are shown in Figure 7. Theerrors are reported in Table 2. The errors of the dividend futures and the interest rate swapsare comparable to their in-sample counterparts. There is however a clear deterioration in theﬁt with option prices out-of-sample, especially dividend options. This indicates that a richervolatility structure than the parsimonious one in (23) might be needed to ﬁt the term structureof option prices.Figure 8 plots the stock duration (15) implied by the calibrated model parameters. The durationis quite stable over the calibration period, with an average around 24 years. Dechow et al. (2004)and Weber (2018) construct a stock duration measure based on balance sheet data and ﬁnd anaverage duration of approximately 15 and 19 years, respectively, for a large cross-section ofstocks.Table 3 contains computation times for calculating option prices. The bulk of the computationtime is due to the computation of the moments of g ( X T ) in (17). The number of stochastic factorsthat drive a derivative’s payoﬀ and the degree of moments that have to be matched thereforestrongly aﬀect the computation time. We observe that all timings of the maximum entropymethod are well below the time it took to run the benchmark Monte-Carlo simulation. Thepricing of swaptions is much faster than the pricing of dividend and stock options, especiallyas the number of moments increases. This is because the discounted swaption payoﬀ onlydepends on on the 2-dimensional process ( X I t , X I t ) (cid:62) , while the discounted payoﬀ of the dividend17nd stock option depends on the entire 4-dimensional process X t = ( X I t , X I t , X D t , X D t ) (cid:62) . Inaddition, the discounted payoﬀ of the dividend and stock option is quadratic in the factors.Therefore, in order to compute moments up to degree N of the discounted payoﬀ, we need tocompute moments up to degree 2 N of the factors. The computation of the dividend optionis further complicated by the path-depedent nature of its payoﬀ. Indeed, the dividend optionpayoﬀ depends on the realization of the factors at T and T . In order to compute the momentsof ζ T ( C T − C T ), we have to apply the moment formula twice. Hence, it involves computing amatrix exponential twice, which causes an additional computation time compared to the stockoption. It is well known that some stock markets exhibit a strongly seasonal pattern in the payment ofdividends. For example, Figure 9 shows that the constituents of the Euro Stoxx 50 pay a largefraction of their dividends between April and June each year. The easiest way to incorporateseasonality in our framework is to introduce a deterministic function of time δ ( t ) and redeﬁnethe cumulative dividend process as: C t = (cid:90) t δ ( s ) d s + e βt p (cid:62) H ( X t ) . (25)The function δ ( t ) therefore adds a deterministic shift to the instantaneous dividend rate:(26) D t = δ ( t ) + e βt p (cid:62) ( β Id + G ) H ( X t ) . In addition to incorporating seasonality, δ ( t ) can also be chosen such that the observed dividendfutures prices are perfectly matched. In Appendix A we show how the bootstrapping method ofFilipovi´c and Willems (2018) can be used to ﬁnd such a function. We do not lose any tractabilitywith the speciﬁcation in (25), since the moments of C T − C T can still easily be computed.Alternatively, we could also introduce time dependence in the speciﬁcation of X t . Doing so ingeneral comes at the cost of losing tractability, because we leave the class of polynomial jump-diﬀusions. However, it is possible to introduce a speciﬁc type of time dependence such that we do stay in the class of polynomial jump-diﬀusions. Deﬁne Γ( t ) as a vector of sine and cosinefunctions whose frequencies are integer multiples of 2 π (so that they all have period one)Γ( t ) =  sin(2 πt )cos(2 πt )...sin(2 πKt )cos(2 πKt )  ∈ R K , K ∈ N , t ≥ . The superposition z + z (cid:62) Γ( t ) , ( z , z ) ∈ R K , See e.g. Marchioro (2016) for a study of dividend seasonality in other markets.

18s a ﬂexible function for modeling annually repeating cycles and is a standard choice for pricingcommodity derivatives (see e.g. Sørensen (2002)). In fact, from Fourier analysis we know thatany smooth periodic function can be expressed as a sum of sine and cosine waves. Remark nowthat Γ( t ) is the solution of the following linear ordinary diﬀerential equationdΓ( t ) = blkdiag (cid:18)(cid:18) π − π (cid:19) , . . . , (cid:18) πK − πK (cid:19)(cid:19) Γ( t )d t. The function Γ( t ) can therefore be seen as a (deterministic) process of the form in (1) and canbe added to the factor process. For example, the speciﬁcation for ( X D t , X D t ) in (23) could bereplaced by (cid:40) d X D t = κ D (cid:0) X D t − X D t (cid:1) d t d X D t = κ D ( z + z (cid:62) Γ( t ) − X D t ) d t + σ D X D t (cid:16) ρ d B t + (cid:112) − ρ d B t (cid:17) , where the ﬁrst factor mean-reverts around the second, and the second mean-reverts arounda time-dependent mean. The process X t does not belong to the class of polynomial jump-diﬀusions, however the augmented process (Γ( t ) , X t ) does.In the calibration exercise in Section 5, we did not include any seasonal behavior in the dividendsbecause the instruments used in the estimation are not directly aﬀected by seasonality. Indeed,all the dividend derivatives used in the calibration reference the total amount of dividends paidin a full calendar year. The timing of the dividend payments within the year does therefore notplay any role. In theory, the stock price should inherit the seasonal pattern from the dividendpayments, since it drops by exactly the amount of dividends paid out. In practice, however,these price drops are obscured by the volatility of the stock price since the dividend paymentstypically represent only a small fraction of the total stock price. Dividend seasonality only playsa role for pricing claims on dividends realized over a time period diﬀerent from an integer numberof calendar years. Dividend forwards, also known as dividend swaps, are the OTC equivalent of the exchange tradeddividend futures. The buyer of a dividend forward receives at a future date T the dividendsrealized over a certain time period [ T , T ] against a ﬁxed payment. Dividend forwards diﬀerfrom dividend futures because they are not marked to market on a daily basis. The dividendforward price D swap ( t, T , T ), t ≤ T ≤ T , is deﬁned as the ﬁxed payment that makes theforward have zero initial value D fwd ( t, T , T ) = 1 P ( t, T ) 1 ζ t E t [ ζ T ( C T − C T )]= D fut ( t, T , T ) + Cov t [ ζ T , C T − C T ] P ( t, T ) ζ t . If interest rates and dividends are independent, then we have D fwd ( t, T , T ) = D fut ( t, T , T ).However, if there is a positive (negative) dependence between interest rates and dividends, thenthere is a convexity adjustment and the dividend forward price will be smaller (larger) thanthe dividend futures price. The following proposition derives the dividend forward price in thepolynomial framework. 19 roposition 6.1. The dividend forward price is given by D fwd ( t, T , T ) = (cid:0) e βT w (cid:62) e G ( T − t ) − e βT w (cid:62) e G ( T − t ) (cid:1) H ( X t ) q (cid:62) e G ( T − t ) H ( X t ) , where w , w ∈ R N are the unique coordinate vectors satisfying w (cid:62) H ( x ) = p (cid:62) H ( x ) q (cid:62) e G ( T − T ) H ( x ) , w (cid:62) H ( x ) = p (cid:62) H ( x ) q (cid:62) H ( x ) . We have introduced an integrated framework designed to jointly price the term structures ofdividends and interest rates. The uncertainty in the economy is modeled with a multivariatepolynomial jump-diﬀusion. The model is tractable because we can calculate all conditional mo-ments of the factor process in closed form. In particular, we have derived closed form formulasfor prices of bonds, dividend futures, and the dividend paying stock. Option prices are ob-tained by integrating the discounted payoﬀ function with respect to a moment matched densityfunction that maximizes the Boltzmann-Shannon entropy. We have introduced the LJD model,characterized by a martingale part that loads linearly on the factors. The LJD model allowsfor a ﬂexible dependence structure between the factors. We have assumed that dividends arepaid out continuously and ignored the possibility of default. These assumptions are justiﬁedwhen considering derivatives on a stock index, but become questionable for derivatives on asingle stock. An interesting future research direction is therefore to extend our framework withdiscrete dividend payments and default risk. 20

Bootstrapping an additive seasonality function

In this section we explain how to bootstrap a smooth curve T (cid:55)→ f t ( T ) of (unobserved) futuresprices corresponding to the instantaneous dividend rate D T . The curve should perfectly repro-duce observed dividend futures prices and in addition incorporate a seasonality eﬀect. Once wehave this function, we deﬁne the function δ ( T ) as δ ( T ) = f t ( T ) − p (cid:62) ( β Id + G ) E t [ H ( X T )] , T ≥ t, so that the speciﬁcation in (25) perfectly reproduces observed futures contracts and incorporatesseasonality.Suppose for notational simplicity that today is time 0 and we observe the futures prices F i ofthe dividends realized over one calendar year [ i − , i ], i = 1 , . . . , I . Divide the calendar year in J ≥ w j ≥ w + · · · + w J = 1. Theseseasonal weights can for example be estimated from a time series of dividend payments. Wesearch for the twice continuously diﬀerentiable curve f that has maximal smoothness subjectto the pricing and seasonality constraints:min f ∈ C ( R ) f (0) + f (cid:48) (0) + (cid:90) I f (cid:48)(cid:48) ( u ) d u s . t . (cid:90) i − jJ i − j − J f ( u ) d u = w j F i , i = 1 , . . . , I, j = 1 , . . . , J. This can be cast in an appropriate Hilbert space as a convex variational optimization problemwith linear constraints. In particular, it has a unique solution that can be solved in closedform using similar techniques as presented in Filipovi´c and Willems (2018). By discretizing theoptimization problem, a non-negativity constraint on f can be added as well. B Proofs

This section contains all the proofs of the propositions in the paper.

B.1 Proof of Proposition 2.2

Using the moment formula (3) we have for t ≤ T E t [ C T ] = e βT p (cid:62) e G ( T − t ) H ( X t ) . Diﬀerentiating with respect to T givesd E t [ C T ]d T = β e βT p (cid:62) e G ( T − t ) H ( X t ) + e βT p (cid:62) G e G ( T − t ) H ( X t ) . The result now follows from D t = d E t [ C T ]d T (cid:12)(cid:12)(cid:12)(cid:12) T = t . .2 Proof of Proposition 2.3 Plugging in the speciﬁcations for ζ t and D t in (11) gives: S ∗ t = 1 ζ t (cid:90) ∞ t e − ( γ − β ) s E t (cid:104) p (cid:62) ( β Id + G ) H ( x ) H ( X s ) q (cid:62) H ( X s ) (cid:105) d s. Since X t is a polynomial process, we can ﬁnd a closed form expression for the expectation insidethe integral: E t (cid:104) p (cid:62) ( β Id + G ) H ( X s ) q (cid:62) H ( X s ) (cid:105) = v (cid:62) e G ( s − t ) H ( X t ) . The fundamental stock price therefore becomes: S ∗ t = e βt v (cid:62) q (cid:62) H ( X t ) (cid:90) ∞ t e − ( γ − β )( s − t ) e G ( s − t ) d s H ( X t )= e βt v (cid:62) q (cid:62) H ( X t ) ( G − ( γ − β ) Id) − exp { ( G − ( γ − β ) Id) ( s − t ) } (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) s = ∞ s = t H ( X t )= e βt v (cid:62) q (cid:62) H ( X t ) (( γ − β ) Id − G ) − H ( X t ) < ∞ , where we have used the fact that the eigenvalues of the matrix G − ( γ − β ) Id have negativereal parts. B.3 Proof of Proposition 2.4

The market is arbitrage free if and only if the deﬂated gains process(27) G t = ζ t S t + (cid:90) t ζ s D s d s is a non-negative local martingale.If S t is of the form in (13), then we have G t = E t (cid:20)(cid:90) ∞ ζ s D s d s (cid:21) + L t , which is clearly a non-negative local martingale and therefore the market is arbitrage free.Conversely, suppose that the market is arbitrage free and hence (27) holds. As a direct conse-quence, the process ζ t S t − ζ t S ∗ t = G t − E t (cid:20)(cid:90) ∞ ζ s D s (cid:21) T ≥ tζ t S t − ζ t S ∗ t ≥ E t (cid:20) G T − (cid:90) ∞ ζ s D s (cid:21) = E t (cid:20) ζ T S T − (cid:90) ∞ T ζ s D s d s (cid:21) ≥ E t (cid:20) − (cid:90) ∞ T ζ s D s d s (cid:21) T →∞ −−−−→ , where we have used the limited liability of the stock in the last inequality. B.4 Proof of Proposition 2.6

Similarly as in the proof of Proposition 2.3 we get (cid:90) ∞ t ( s − t ) E t [ ζ s D s ] d s = v (cid:62) (cid:90) ∞ t ( s − t ) e ( β − γ ) s e G ( s − t ) d sH ( X t )= e ( β − γ ) t v (cid:62) (cid:90) ∞ t ( s − t ) e [ G − ( γ − β )Id]( s − t ) d sH ( X t ) . Applying integration by parts gives (cid:90) ∞ t ( s − t ) E t [ ζ s D s ] d s = e ( β − γ ) t v (cid:62) [( γ − β )Id − G ] − (cid:90) ∞ t e [ G − ( γ − β )Id]( s − t ) d sH ( X t )= e ( β − γ ) t v (cid:62) [( γ − β )Id − G ] − H ( X t )= e ( β − γ ) t w (cid:62) [( γ − β )Id − G ] − H ( X t ) . The result now follows from (12) and (15).

B.5 Proof of Proposition 4.1

We start by showing that there exists a unique strong solution X t to (21) with values in (0 , ∞ ) d .Due to the global Lipschitz continuity of the coeﬃcients, the SDE in (21) has a unique strongsolution in R d for every X ∈ R d , see Theorem III.2.32 in Jacod and Shiryaev (2003). It remainsto show that X t is (0 , ∞ ) d -valued for all t ≥ X ∈ (0 , ∞ ) d . First, we prove the statementfor the diﬀusive case. Lemma B.1.

Consider the SDE d X t = κ ( θ − X t ) d t + diag( X t )Σ d W t , (28) for some d -dimensional Brownian motion W t and κ, θ, Σ as assumed in Proposition 4.1. If X ∈ (0 , ∞ ) d , then X t ∈ (0 , ∞ ) d for all t ≥ .Proof. Replace X t in the drift of (28) by X + t componentwise and consider the SDEd Y t = κ ( θ − Y + t ) d t + diag( Y t )Σ d W t , (29) 23ith Y = X ∈ (0 , ∞ ) d . The function y (cid:55)→ y + componentwise is still Lipschitz continuous, sothat there exists a unique solution Y t to (29). Now consider the SDEd Z t = − diag(diag( κ )) Z + t d t + diag( Z t )Σ d W t , (30)with Z = X ∈ (0 , ∞ ) d . Its unique solution is the (0 , ∞ ) d -valued process given by Z t = Z exp (cid:26)(cid:18) − diag( κ ) −

12 diag(ΣΣ (cid:62) ) (cid:19) t + Σ W t (cid:27) . By assumption, we have that the drift function of (29) is always greater than or equal to thedrift function of (30): κθ − κx + ≥ − diag(diag( κ )) x + , ∀ x ∈ R d . By the comparison theorem from (Geiß and Manthey, 1994, Theorem 1.2) we have almost surely Y t ≥ Z t , t ≥ . Hence, Y t ∈ (0 , ∞ ) d and therefore Y t also solves the SDE (28). By uniqueness we conclude that X t = Y t for all t , which proves the claim.Deﬁne τ i as the i th jump time of N t and τ = 0. We argue by induction and assume that X τ i > i = 0 , , . . . . Since the process X t is right-continuous, we have the following diﬀusivedynamics for the process X ( τ i ) t = X t + τ i on the interval [0 , τ i +1 − τ i )d X ( τ i ) t = (cid:18) κθ + (cid:18) − κ − ξ diag (cid:18)(cid:90) S z d F (d z ) (cid:19)(cid:19) X ( τ i ) t (cid:19) d t + diag( X ( τ i ) t )Σ d B ( τ i ) t , with X ( τ i )0 = X τ i and B ( τ i ) t = B τ i + t − B τ i . The stopping time τ i is a.s. ﬁnite and therefore theprocess B ( τ i ) t deﬁnes a d -dimensional Brownian motion with respect to its natural ﬁltration, seeTheorem 6.16 in Karatzas and Shreve (1991). By Lemma B.1 we have X ( τ i ) t ∈ (0 , ∞ ) d for all t ∈ [0 , τ i +1 − τ i ). As a consequence, we have X t ∈ (0 , ∞ ) d for all t ∈ [ τ i , τ i +1 ). The jump size X τ i +1 − X τ i +1 − at time τ i +1 satisﬁes X τ i +1 − X τ i +1 − = diag( X τ i +1 − ) Z i +1 > − X τ i +1 − , where the Z i +1 are i.i.d. random variables with distribution F ( d z ). Rearranging terms gives X τ i +1 ∈ (0 , ∞ ) d . By induction we conclude that X t ∈ (0 , ∞ ) d for t ∈ [0 , τ i ), i ∈ N . The claimnow follows because τ i → ∞ for i → ∞ a.s.Next, we prove that X t is a polynomial jump-diﬀusion. The action of the generator of X t on a C function f : R d → R is given by G f ( x ) = 12 tr (cid:16) diag( x )ΣΣ (cid:62) diag( x ) ∇ f ( x ) (cid:17) + ∇ f ( x ) (cid:62) κ ( θ − x )+ ξ (cid:18)(cid:90) S f ( x + diag( x ) z ) F (d z ) − f ( x ) − ∇ f ( x ) (cid:62) diag( x ) (cid:90) S z F (d z ) (cid:19) , (31)where S denotes the support of F and we assume that f is such that the integrals are ﬁnite.Now suppose that f ∈ Pol n ( R d ) and assume without loss of generality that f is a monomialwith f ( x ) = x α = x α · · · x α d d , | α | = n . We now apply the generator to this function. Itfollows immediately that the ﬁrst two terms in (31) are again a polynomial of degree n or less.24ndeed, the gradient (hessian) in the second (ﬁrst) term lowers the degree by one (two), whilethe remaining factors augment the degree by at most one (two). The third term in (31) becomes(we slightly abuse the notation α to represent both a multi-index and a vector): ξ  x α (cid:90) S d (cid:89) j =1 (1 + z j ) α j F (d z ) − x α − x α α (cid:62) (cid:90) S z F (d z )  = ξx α (cid:90) S (cid:16) e α (cid:62) log(1+ z ) − − α (cid:62) z (cid:17) F (d z ) , (32)where the logarithm is applied componentwise. Hence, we conclude that G maps polynomialsto polynomials of the same degree or less. B.6 Proof of Proposition 4.2

This proof is similar to the one of Theorem 5 in Filipovi´c et al. (2017). From (5) we have that D t ≥ β ≥ sup x ∈ (0 , ∞ ) d − p (cid:62) G H ( x ) p (cid:62) H ( x ) , provided it is ﬁnite. Using (2) we have(34) − p (cid:62) G H ( x ) p (cid:62) H ( x ) = − ˜ p (cid:62) κθ + (cid:80) dj =1 ˜ p (cid:62) κ j x j p + (cid:80) kj =1 p j x j . Using the assumption κ ij ≤ i (cid:54) = j (cfr., Proposition 4.1), we have for all j > k that(35) ˜ p (cid:62) κ j = d (cid:88) i =1 p i κ ij = k (cid:88) i =1 p i κ ij ≤ . Combining (34) with (35) gives(36) sup x ∈ (0 , ∞ ) d − ˜ p (cid:62) κθ + (cid:80) dj =1 ˜ p (cid:62) κ j x j p + (cid:80) kj =1 p j x j = sup x ∈ (0 , ∞ ) k − ˜ p (cid:62) κθ + (cid:80) kj =1 ˜ p (cid:62) κ j x j p + (cid:80) kj =1 p j x j . If p >

0, then the fraction on the right-hand side of (36) can be seen as a convex combinationof (cid:26) − ˜ p (cid:62) κθp , ˜ p (cid:62) κ p , . . . , ˜ p (cid:62) κ k p k (cid:27) , with coeﬃcients p , p x , . . . , p k x k . As a consequence, we have in this casesup x ∈ (0 , ∞ ) d − p (cid:62) G H ( x ) p (cid:62) H ( x ) = max (cid:26) − ˜ p (cid:62) κθp , ˜ p (cid:62) κ p , . . . , ˜ p (cid:62) κ k p k (cid:27) . p = 0, then using the assumption κθ ≥ x ∈ (0 , ∞ ) d − p (cid:62) G H ( x ) p (cid:62) H ( x ) = sup x ∈ (0 , ∞ ) k − ˜ p (cid:62) κθ + (cid:80) kj =1 ˜ p (cid:62) κ j x j (cid:80) kj =1 p j x j = sup x ∈ (0 , ∞ ) k (cid:80) kj =1 ˜ p (cid:62) κ j x j (cid:80) kj =1 p j x j = max (cid:26) ˜ p (cid:62) κ p , . . . , ˜ p (cid:62) κ k p k (cid:27) . B.7 Proof of Proposition 4.3

Suppose ﬁrst that κ is lower triangular. In order to get a speciﬁc idea what the matrix G lookslike, we start by ﬁxing a monomial basis for Pol ( R d ) using the graded lexicographic orderingof monomials: H ( x ) = (1 , x , . . . , x d , x , x x , . . . , x x d , x , x x , . . . , x d ) (cid:62) , x ∈ R d . (37)It follows by inspection of (31) and (32) that, thanks to the triangular structure of κ , thematrix G is lower triangular with respect to this basis. Indeed, the ﬁrst and third term in(31) only contribute to the diagonal elements of G , while the second term contributes to thelower triangular part (including the diagonal). The eigenvalues of G are therefore given by itsdiagonal elements.Each element in the monomial basis can be expressed as as f ( x ) = x α · · · x α d d , for some α ∈ N d with (cid:80) di =1 α i ≤

2. In order to ﬁnd the diagonal elements of G , we need to ﬁnd the coeﬃcientof the polynomial G f ( x ) associated with the basis element f ( x ). It follows from (31) and (32)that this coeﬃcient is given by − d (cid:88) i =1 κ ii α i + 12 (cid:88) i

Using the law of iterated expectations and the moment formula (3) we get: E t [ ζ T ( C T − C T )] = e − γT (cid:16) e βT E t [ q (cid:62) H ( X T ) p (cid:62) H ( X T )] − e βT E t [ p (cid:62) H ( X T ) E T [ q (cid:62) H ( X T )]] (cid:17) = e − γT (cid:16) e βT w (cid:62) e G ( T − t ) H ( X t ) − e βT E t [ p (cid:62) H ( X T ) q (cid:62) e G ( T − T ) H ( X T )] (cid:17) = e − γT (cid:16) e βT w (cid:62) e G ( T − t ) H ( X t ) − e βT w (cid:62) e G ( T − t ) H ( X t ) (cid:17) . Note that the vectors w and w are unique since the basis elements are linearly independentby deﬁnition. Finally, using the bond price formula (9) we get D fwd ( t, T , T ) = 1 ζ t P ( t, T ) E t [ ζ T ( C T − C T )]= e βT w (cid:62) e G ( T − t ) H ( X t ) − e βT w (cid:62) e G ( T − t ) H ( X t ) q (cid:62) e G ( T − t ) H ( X t ) . eferences Ackerer, D. and D. Filipovi´c (2018). Linear credit risk models.

Swiss Finance Institute ResearchPaper (16-34).Ackerer, D., D. Filipovi´c, and S. Pulido (2018). The Jacobi stochastic volatility model.

Financeand Stochastics 22 (3), 667–700.Agmon, N., Y. Alhassid, and R. D. Levine (1979). An algorithm for ﬁnding the distribution ofmaximal entropy.

Journal of Computational Physics 30 (2), 250–258.Al-Mohy, A. H. and N. J. Higham (2011). Computing the action of the matrix exponential,with an application to exponential integrators.

SIAM Journal on Scientiﬁc Computing 33 (2),488–511.Avellaneda, M. (1998). Minimum-relative-entropy calibration of asset-pricing models.

Interna-tional Journal of Theoretical and Applied Finance 1 (04), 447–472.Barone-Adesi, G., H. Rasmussen, and C. Ravanelli (2005). An option pricing formula for theGARCH diﬀusion model.

Computational Statistics & Data Analysis 49 (2), 287–310.Bekaert, G. and S. R. Grenadier (1999). Stock and bond pricing in an aﬃne economy. Technicalreport, National Bureau of Economic Research.Bernhart, G. and J.-F. Mai (2015). Consistent modeling of discrete cash dividends.

Journal ofDerivatives 22 (3), 9–19.Black, F. (1976). The pricing of commodity contracts.

Journal of Financial Economics 3 (1-2),167–179.Bos, M., A. Shepeleva, and A. Gairat (2003). Dealing with discrete dividends.

Risk 16 (9),109–112.Bos, M. and S. Vandermark (2002). Finessing ﬁxed dividends.

Risk 15 (1), 157–158.Brennan, M. J. (1998). Stripping the S&P 500 index.

Financial Analysts Journal 54 (1), 12–22.Brennan, M. J. and E. S. Schwartz (1979). A continuous time approach to the pricing of bonds.

Journal of Banking & Finance 3 (2), 133–155.Buchen, P. W. and M. Kelly (1996). The maximum entropy distribution of an asset inferredfrom option prices.

Journal of Financial and Quantitative Analysis 31 (1), 143–159.Buehler, H. (2010). Volatility and dividends–Volatility modelling with cash dividends and simplecredit risk.

Working Paper .Buehler, H. (2015). Volatility and dividends II–Consistent cash dividends.

Working Paper .Buehler, H., A. S. Dhouibi, and D. Sluys (2010). Stochastic proportional dividends.

WorkingPaper .Chance, D. M., R. Kumar, and D. R. Rich (2002). European option pricing with discretestochastic dividends.

Journal of Derivatives 9 (3), 39–45.28ollin-Dufresne, P. and R. S. Goldstein (2002a). Do bonds span the ﬁxed income markets?Theory and evidence for unspanned stochastic volatility.

Journal of Finance 57 (4), 1685–1730.Collin-Dufresne, P. and R. S. Goldstein (2002b). Pricing swaptions within an aﬃne framework.

Journal of Derivatives 10 (1), 9–26.Corrado, C. J. and T. Su (1996a). Skewness and kurtosis in S&P 500 index returns implied byoption prices.

Journal of Financial Research 19 (2), 175–192.Corrado, C. J. and T. Su (1996b). S&P 500 index option tests of Jarrow and Rudd’s approximateoption valuation formula.

Journal of Futures Markets 16 (6), 611–629.Cox, A. M. and D. G. Hobson (2005). Local martingales, bubbles and option prices.

Financeand Stochastics 9 (4), 477–492.d’Addona, S. and A. H. Kind (2006). International stock–bond correlations in a simple aﬃneasset pricing model.

Journal of Banking and Finance 30 (10), 2747–2765.Dechow, P. M., R. G. Sloan, and M. T. Soliman (2004). Implied equity duration: A new measureof equity risk.

Review of Accounting Studies 9 (2-3), 197–228.Duﬃe, D., D. Filipovi´c, and W. Schachermayer (2003). Aﬃne processes and applications inﬁnance.

Ann. Appl. Probab. 13 (3), 984–1053.Filipovi´c, D. and M. Larsson (2017). Polynomial jump-diﬀusion models.

Swiss Finance InstituteResearch Paper No. 17-60 .Filipovi´c, D., M. Larsson, and A. B. Trolle (2017). Linear-rational term structure models.

Journal of Finance 72 , 655–704.Filipovi´c, D. and S. Willems (2018). Exact smooth term structure estimation.

SIAM Journalon Financial Mathematics 9 (3), 907–929.Fusai, G. and A. Tagliani (2002). An accurate valuation of Asian options using moments.

International Journal of Theoretical and Applied Finance 5 (02), 147–169.Geiß, C. and R. Manthey (1994). Comparison theorems for stochastic diﬀerential equations inﬁnite and inﬁnite dimensions.

Stochastic Processes and their Applications 53 (1), 23–35.Geske, R. (1978). The pricing of options with stochastic dividend yield.

Journal of Fi-nance 33 (2), 617–625.Guennoun, H. and P. Henry-Labord`ere (2017). Equity modeling with stochastic dividends.

Working Paper .Holly, A., A. Monfort, and M. Rockinger (2011). Fourth order pseudo maximum likelihoodmethods.

Journal of Econometrics 162 (2), 278–293.Jackwerth, J. C. and M. Rubinstein (1996). Recovering probability distributions from optionprices.

Journal of Finance 51 (5), 1611–1631.Jacod, J. and A. Shiryaev (2003).

Limit Theorems for Stochastic Processes , Volume 2. Springer-Verlag. 29arrow, R. and A. Rudd (1982). Approximate option valuation for arbitrary stochastic processes.

Journal of Financial Economics 10 (3), 347–369.Jarrow, R. A., P. Protter, and K. Shimbo (2007). Asset price bubbles in complete markets.

Advances in Mathematical Finance , 97–121.Jaynes, E. T. (1957). Information theory and statistical mechanics.

Physical Review 106 (4),620.Jondeau, E. and M. Rockinger (2001). Gram–Charlier densities.

Journal of Economic Dynamicsand Control 25 (10), 1457–1483.Karatzas, I. and S. Shreve (1991).

Brownian Motion and Stochastic Calculus (2nd ed.). Springer-Verlag.Kim, I.-M. (1995). An alternative approach to dividend adjustments in option pricing models.

Journal of Financial Engineering 4 , 351–373.Korn, R. and L. G. Rogers (2005). Stocks paying discrete dividends: modeling and optionpricing.

Journal of Derivatives 13 (2), 44–48.Kragt, J., F. De Jong, and J. Driessen (2018). The dividend term structure.

Journal of Financialand Quantitative Analysis , Forthcoming.Lasserre, J.-B., T. Prieto-Rumeau, and M. Zervos (2006). Pricing a class of exotic options viamoments and SDP relaxations.

Mathematical Finance 16 (3), 469–494.Lemke, W. and T. Werner (2009). The term structure of equity premia in an aﬃne arbitragefree model of bond and stock market dynamics. Technical report, ECB Working Paper.Lettau, M. and J. A. Wachter (2007). Why is long-horizon equity less risky? A duration-basedexplanation of the value premium.

Journal of Finance 62 (1), 55–92.Lettau, M. and J. A. Wachter (2011). The term structures of equity and interest rates.

Journalof Financial Economics 101 (1), 90–113.Linetsky, V. (2004). Spectral expansions for Asian (average price) options.

Operations Re-search 52 (6), 856–867.Lioui, A. (2006). Black-Scholes-Merton revisited under stochastic dividend yields.

Journal ofFutures Markets 26 (7), 703–732.Mamaysky, H. et al. (2002). On the joint pricing of stocks and bonds: Theory and evidence.Technical report, Yale School of Management.Marchioro, M. (2016). Seasonality of dividend point indexes.

Statpro Quantitative ResearchSeries .Mead, L. R. and N. Papanicolaou (1984). Maximum entropy in the problem of moments.

Journalof Mathematical Physics 25 (8), 2404–2417.Merton, R. C. (1973). Theory of rational option pricing.

Bell Journal of Economics 4 (1),141–183. 30elson, D. B. (1990). ARCH models as diﬀusion approximations.

Journal of Econometrics 45 (1),7–38.Overhaus, M., A. Berm´udez, H. Buehler, A. Ferraris, C. Jordinson, and A. Lamnouar (2007).

Equity Hybrid Derivatives . John Wiley & Sons.Pilipovi´c, D. (1997).

Energy Risk: Valuing and Managing Energy Derivatives . McGraw-Hill.Rockinger, M. and E. Jondeau (2002). Entropy densities with an application to autoregressiveconditional skewness and kurtosis.

Journal of Econometrics 106 (1), 119–142.Rompolis, L. S. (2010). Retrieving risk neutral densities from european option prices based onthe principle of maximum entropy.

Journal of Empirical Finance 17 (5), 918–937.Sørensen, C. (2002). Modeling seasonality in agricultural commodity futures.

Journal of FuturesMarkets 22 (5), 393–426.Suzuki, M. (2014). Measuring the fundamental value of a stock index through dividend futureprices.

Working Paper .Tunaru, R. S. (2018). Dividend derivatives.

Quantitative Finance 18 (1), 63–81.Vellekoop, M. H. and J. W. Nieuwenhuis (2006). Eﬃcient pricing of derivatives on assets withdiscrete dividends.

Applied Mathematical Finance 13 (3), 265–284.Weber, M. (2018). Cash ﬂow duration and the term structure of equity returns.

Journal ofFinancial Economics 128 (3), 486–503.Willems, S. (2018). Asian option pricing with orthogonal polynomials.

Quantitative Finance ,Forthcoming.Yan, W. (2014). Estimating a uniﬁed framework of co-pricing stocks and bonds.

Working Paper .31ean Median Std MaxDividend futures (ARE in %)1y 3.90 3.01 3.02 12.982y 1.12 0.93 1.00 5.423y 2.26 1.25 2.29 10.534y 2.59 1.54 2.41 10.805y 2.25 2.15 1.77 7.887y 1.18 0.76 1.10 5.349y 3.13 2.40 2.42 12.06Interest rate swaps (AE in %)1y 0.12 0.11 0.08 0.412y 0.08 0.06 0.07 0.403y 0.09 0.09 0.05 0.384y 0.08 0.09 0.04 0.355y 0.06 0.05 0.05 0.287y 0.10 0.09 0.06 0.2710y 0.16 0.15 0.09 0.3820y 0.16 0.13 0.13 0.61Dividend option (AE in %) 0.04 0.01 0.10 0.78Swaption (AE in bps) 0.01 0.01 0.02 0.10Stock option (AE in %) 0.04 0.01 0.08 0.60Index level (ARE in %) 0.10 0.07 0.12 1.09Table 1: Statistics on Absolute Error (AE) and Absolute Relative Error (ARE) of instrumentsincluded in the calibration. The dividend option has a maturity of 2 years, the swaption has amaturity of 3 months and underlying swap of 10 years, and the stock option has a maturity of3 months. All options have ATM strike. 32ean Median Std MaxDividend futures (ARE in %)6y 1.58 1.42 1.08 4.528y 1.98 1.45 1.66 9.60Interest rate swaps (AE in %)6y 0.07 0.06 0.05 0.258y 0.12 0.12 0.07 0.32Dividend option (AE in %) 2.40 2.42 0.85 5.28Swaption (AE in bps) 3.85 3.71 2.17 8.84Stock option (AE in %) 1.81 1.51 1.65 15.83Table 2: Statistics on Absolute Error (AE) or Absolute Relative Error (ARE) of instrumentsnot included in the calibration. The dividend option has maturity 3 years, the swaption hasmaturity 6 months and underlying swap of 10 years, and the stock option has a maturity of 6months. All options have ATM strike. N = 2 N = 3 N = 4 N = 5 N = 6 MCSwaption 0.02 0.02 0.02 0.02 0.02 1.0114Dividend option 0.03 0.06 0.14 0.41 1.24 25.88Stock option 0.02 0.03 0.06 0.07 0.11 3.49Table 3: Computation times (in seconds) needed to price swaptions, dividend options, andstock options using a) the maximum entropy method matching N moments and b) Monte-Carlosimulation with 10 sample paths and weekly discretization. The swaption has a maturity of 3months and underlying swap of 10 yeas, the dividend option has a maturity of 2 years, and thestock option has a maturity of 3 months. All options have ATM strike.33

010 2011 2012 2013 2014 2015 2016708090100110120130 I nde x po i n t s (a) Dividend futures I m p li ed v o l a t ili t y ( % ) Index optionDividend option (b) Stock option and dividend option S w ap r a t e ( % ) (c) Interest rate swaps N o r m a l i m p li ed v o l a t ili t y ( bp s ) (d) Swaption I nde x po i n t s (e) Index level Figure 1: Data used in the calibration exercise. Dates range from May 2010 until December2015 at a weekly frequency. Figure 1a shows the interpolated Euro Stoxx 50 dividend futuresprices with a constant time to maturity of 1, 2, 5, and 9 years. The contracts with time tomaturity of 3, 4, and 7 years are not plotted for clarity. Figure 1c shows the par swap rate ofEuribor spot starting swaps with tenors 1, 5, 10, and 20 years. The swap rates with tenors 2, 3,4, and 7 years are not plotted for clarity. Figure 1b shows the Black-Scholes and Black impliedvolatility, respectively, of ATM Euro Stoxx 50 index and dividend options. The stock option hasa time to maturity of 3 months and the dividend option 2 years. Figure 1d shows the normalimplied volatility of swaptions with time to maturity 3 months and the underlying swap has atenor of 10 years. Figure 1e shows the level of the Euro Stoxx 50.34

Number of moments matched N o r m a l i m p li ed v o l a t ili t y ( bp s ) Maximal entropyMCMC 95% CI (a) Swaption

Number of moments matched I m p li ed v o l a t ili t y ( % ) Maximal entropyMCMC 95% CI (b) Dividend option

Number of moments matched I m p li ed v o l a t ili t y ( % ) Maximal entropyMCMC 95% CI (c) Stock option

Figure 2: Maximum entropy option prices for diﬀerent number of moments matched. Theswaption has maturity 3 months and underlying swap with tenor ten years, the dividend optionhas maturity 2 years, and the stock option has maturity 3 months. All options have ATM strike.35

010 2011 2012 2013 2014 2015 20160123456789 A R E ( % ) (a) Dividend futures (ARE) AE ( % ) (b) Interest rate swaps (AE) AE ( % ) (c) Stock option (AE) AE ( % ) (d) Dividend option (AE) AE ( bp s ) (e) Swaption (AE) A R E ( % ) (f) Index level (ARE) Figure 3: Absolute Error (AE) or Absolute Relative Error (ARE) of instruments included inthe calibration. The errors of the dividend futures and interest rate swaps are averaged acrossall maturities. The dividend option has maturity 2 years, the swaption has maturity 3 monthsand underlying swap of 10 years, and the stock option has maturity 3 months. All options haveATM strike. 36

010 2011 2012 2013 2014 2015 201601234567 ( % ) (a) γ ( % ) (b) β (c) ρ DD (d) κ D θ D I0I1D1 (e) κ I , κ I , κ D DI (f) σ D , σ I Figure 4: Calibrated model parameters.37

010 2011 2012 2013 2014 2015 2016-1-0.8-0.6-0.4-0.200.2 S , r Figure 5: Instantaneous correlation between S t and r t implied by calibrated parameters. D X D (a) Dividend rate κ D X D X X (b) X I and X I r ( % ) (c) Short rate r = ( γ + κ I ) − κ I X I X I S ho r t r a t e no r m a l v o l a t ili t y ( % ) (d) Normal volatility short rate σ I κ I X I X I . Figure 6: Calibrated interest rate factors X I and X I , scaled dividend factor κ D X D , the shortrate r = ( γ + κ I ) − κ I X I X I , and the normal volatility of the short rate σ I κ I X I X I over time.38

010 2011 2012 2013 2014 2015 2016707580859095100105110115120 I nde x po i n t s MarketModel (a) Dividend futures 6y S w ap r a t e ( % ) MarketModel (b) Interest rate swap 6y I m p li ed v o l a t ili t y ( % ) MarketModel (c) Dividend option 3y N o r m a l i m p li ed v o l a t ili t y ( bp s ) MarketModel (d) Swaption 6m × I m p li ed v o l a t ili t y ( % ) MarketModel (e) Stock option 6m

Figure 7: Market and model implied prices of instruments not included in the calibration.39

010 2011 2012 2013 2014 2015 20162021222324252627282930 D u r a t i on Figure 8: Stock duration implied by calibrated parameters.

Time D VP I nde x po i n t ss