Are multi-factor Gaussian term structure models still useful? An empirical analysis on Italian BTPs
AAre multi-factor Gaussian term structure modelsstill useful? An empirical analysis on ItalianBTPs
Michele Leonardo Bianchi a,1a
Regulation and Macroprudential Analysis Directorate, Banca d’Italia, Rome, Italy
This version: May 28, 2018
Abstract.
In this paper, we empirically study models for pricing Italian sovereignbonds under a reduced form framework, by assuming different dynamics for theshort-rate process. We analyze classical Cox-Ingersoll-Ross and Vasicek multi-factor models, with a focus on optimization algorithms applied in the calibrationexercise. The Kalman filter algorithm together with a maximum likelihood estima-tion method are considered to fit the Italian term-structure over a 12-year horizon,including the global financial crisis and the euro area sovereign debt crisis. An-alytic formulas for the gradient vector and the Hessian matrix of the likelihoodfunction are provided.
Key words:
Cox-Ingersoll-Ross processes, Gaussian Ornstein-Uhlenbeck pro-cesses, bond pricing, Kalman filter, maximum likelihood estimation, non-linearoptimization. This publication should not be reported as representing the views of the Banca d’Italia. Theviews expressed are those of the author and do not necessarily reflect those of the Banca d’Italia. a r X i v : . [ q -f i n . P M ] M a y Introduction
This paper provides a detailed comparative assessment of a number of affine interestrate models applied to the term structure of Italian government securities.Having reliable estimates of these models allows to conduct scenario analysesthat can be used by banks for risk management purposes, e.g. to perform a risk-return analysis or to compute the value-at-risk of a bond portfolio. Furthermore, asindicated by Abdymomunov and Gerlach (2014), instead of assuming deterministicscenarios, supervisory authorities may consider hypothetical stochastic scenariosto stress test bank balance sheets.The present paper contributes to the relevant literature from three differentperspectives: (i) systematically assessing different specifications of affine models,applied to data from one of the largest and most liquid European government bondmarkets over a 12-year horizon which includes structural breaks; (ii) providingthe reader with a detailed tutorial of the technicalities related to the practicalimplementation of such models (especially in the Appendix); (iii) providing newinsight into the modeling of the Italian term structure of interest rates, relying ona well-established and simple mathematical framework.Multi-factor models have been widely studied in the literature (see A¨ıt-Sahaliaand Kimmel (2010) and Duffee and Stanton (2012)) for a complete overview). Duanand Simonato (1999) analyze monthly yield series for the U.S. Treasury debt secu-rities with maturities 3, 6, 12 and 60 months spanning the period form April 1964to December 1997 and conclude for a rejection of exponential affine models. Geyerand Pichler (1999) have calibrated and tested multi-factor versions of the Cox-Ingersoll-Ross (CIR) model on U.S. interest rates observed monthly from January1964 to December 1993. Although the specification of multi-factor CIR models issufficiently flexible for the shape of the term structure, they find strong evidenceagainst the adequacy of the CIR model. They have shown that in the two-factormodel the first factor corresponds to the general level of interest rates and thesecond to the spread between long and short-rates. Therefore, they conclude thatthe assumption that all factors follow a square-root process hinders the explana-tory power of additional factors (as it restricts admissible values for these factorsto be nonnegative). This assumption also affects the decision about the numberof factors to be included. As a consequence, they suggest to relax the square-rootassumption for all factors, in particular in the case of multi-factor models. De Jong(2000) shows that a three-factor affine model is able to provide an adequate fit ofthe cross-section and the dynamics of the term structure. The three factors can begiven the usual interpretation of level, steepness, and curvature. Almeida (2005)explains the relation between principal components obtained assuming no dynamicrestrictions and the dynamic factors estimated using multi-factor Gaussian termstructure models. The author finds that the linear structure embedded in dynamicaffine term structure models directly translates into an approximation of the non-negligible principal components by linear transformations of the state vector. Dateand Wang (2009) have studied the out-of-sample forecasting ability of linear Gaus-sian interest rate models with unobservable underlying factor by considering both The Matlab code on which this paper is based is available upon request. single-plus .Fitting sovereign bond yields without modeling risk premia nor regime shiftsis particularly challenging while analyzing Italian government bond yields overa decade which includes the euro area sovereign debt crisis. In this sense, thecomparative analysis of affine models presented in this paper can be viewed asa “stress test” of widely used models, aiming to highlight their strengths as well3s their limitations. At the same time, the analysis contributes to the literatureon Italian government securities with a thorough assessment of different interestmodels, indicating some viable approaches along with possible pitfalls of well-known models. In the relevant literature, the Italian government bond markethas been studied under different perspectives. Barone et al. (1989) tested the CIRmodel using the prices of Italian government bonds in the secondary market in orderto obtain useful indications regarding the efficiency of the secondary market andthe consistency between the primary and the secondary markets. The authors haveconducted a static calibration and have assessed the parameter stability. D’Ecclesiaand Zenios (1994) developed a multi-factor model for the yields of Italian bondsand, to control the sensitivity of portfolio returns to movements in the risk factors,implemented appropriate factor immunization models. Based on the observationthat short-term bonds interest-rates are characterized by a trend basically drivenby the European Central Bank (ECB) key interest rates, Maggi and Infortuna(2008) have proposed to filter out the effect of the ECB key interest rates from theterm structure of Italian government bonds first, and then, for each maturity, toestimate a CIR model on the transformed data. Musti and D’Ecclesia (2008) haveinvestigated the informational content of the yield curve in the European marketusing data on the Italian term structures and tested the expectation hypothesistheory. More recently, Girardi and Impenna (2013) have conducted and analysis ontwo regulated markets supervised by the Bank of Italy in cooperation with Consob(Commissione Nazionale per la Societ`a e la Borsa – Companies and Stock ExchangeCommission). They analyzed the price discovery process and the informational roleof trading in the interdealer business-to-business (B2B) trading venue (MTS cash),and the business-to-customer (B2C) BondVision market.Finally, the present paper contributes to the literature with a view at the im-plementation of the models analyzed, providing a detailed description of the math-ematical details that practitioners have to work out to efficiently implement thosemodels. The dynamic models we study can be casted in a state-space form, wherethe state is given by the unobservable factors and the observations are given by theterm structure. A state and parameter estimation can then be obtained using theKalman filter together with a maximum likelihood estimation method. However,finding an optimal solution to this type of problems is non trivial and optimizationpackages should be handled with care. In particular, in order to have reliable opti-mization algorithms and to compute the standard errors of the estimates exactly,we compute the analytic expression for the gradient vector and the Hessian matrixof the likelihood function considered in the factors and parameter estimation. Fulldetails are provided in the Appendix.The remainder of the paper is organized as follows. Section 2 reviews theVasicek and CIR multi-factor models considered in the empirical study, includingthe derivation of bond prices and yields in these models. The estimation algorithmstogether with the main empirical results are discussed in Section 3. Section 4concludes. The Appendix A provides analytic formulas for the gradient and theHessian of the likelihood function. 4
The model
We consider the short-rate approach to model the dynamics of the Italian interestrate term structure. There is a general consensus in assuming a stochastic short-rate instead of a deterministic short-rate to model uncertainty about the futuredynamics of the interest rate and credit risk of a given bond. The building blockof short-rate models is the integrated process R t , i.e. R t = (cid:90) t r s ds where r t is the instantaneous short-rate process, that is generally assumed to be astationary affine process. It follows that the price at time 0 of a zero coupon bondmaturing at t B (0 , t ) = E [exp( − R t )] = φ R t ( i ) , (2.1)where the expectation is taken under the risk-neutral measure, φ R t ( u ), with u ∈ C ,is the characteristic function of the random variable R t and i is the imaginaryunit. The previous equality easily follows from the definition of the characteristicfunction of a random variable Xφ X ( u ) = E [exp( iuX )] , where u ∈ C . Knowing the characteristic function of the process R t , it is straight-forward to compute the expectation in equation (2.1), and hence the yield tomaturity at time 0 of a zero-coupon bond maturing at ty (0 , t ) = − t log B (0 , t ) = − t log ( E [exp( − R t )])) = − t log ( φ R t ( i )) . As a consequence, we can compute the entire yield curve by assuming a model forthe short rate process r t . In the following, we explore several options, assumingthat r t is a linear combination of stochastic processes that follow either the Vasicekor the CIR model. As proposed by Vasicek (1977), a Gaussian Ornstein-Uhlenbeck (OU) process canbe used to model the dynamics of the instantaneous short-rate, that is dr t = κ ( η − r t ) dt + ϑdW t , (2.2)with r > κ and ϑ positive parameters, η ∈ R , and W t is a Brownian motion.Usually, the parameter η is chosen to be positive. In order to reduce the numberof parameters, in our empirical study we will assume that some Vasicek factorswill have η = 0. We recall that, even if η is positive, the process can be negative5ith positive probability (see Brigo and Mercurio (2006)). The conditional density p ( r t +1 | r t ) is normal with E [ r t +1 | r t ] = η (1 − e − κ ∆ t ) + e − κ ∆ t r t V ar [ r t +1 | r t ] = ϑ κ (1 − e − κ ∆ t ) . Under the Vasicek model there exists a closed-form expression for the character-istic function of the integrated process (see Proposition 2.6.2.1 in Jeanblanc et al.(2009)) and the integral in equation (2.1) can be computed as follows B (0 , t ) = e A ( t )+ B ( t ) r , (2.3)where A ( t ) = − ηt + η − e − κt κ − ϑ κ (cid:0) − e − κt (cid:1) + ϑ κ (cid:18) t − − e − κt κ (cid:19) ,B ( t ) = − − e − κt κ . For a description of the possible shape of the yield curve under the Vasicek modelsee Zeytun and Gupta (2007) and Keller-Ressel and Steiner (2008).
A well-known way to model the instantaneous short-rate process is to assume theCox-Ingersoll-Ross (CIR) dynamics (see Cox et al. (1985)), by considering thefollowing mean-reverting process dr t = κ ( η − r t ) dt + ϑ √ r t dW t , (2.4)under the initial condition r >
0, where W t is a standard Brownian motion and κ , η and ϑ are positive parameters satisfying the additional condition 2 κη > ϑ inorder to ensure that the origin is inaccessible, i.e. that r t remains positive for all t .The conditional density p ( r t +1 | r t ) of this process has the following form (see e.g.Brigo and Mercurio (2006)) p ( r t +1 | r t ) | r t +1 = z = cf χ ( v,l ) ( cz )where f χ ( v,l ) is the probability density function of a noncentral χ random variablewith parameters v and l , and c = 4 κϑ (1 − e − κ ∆ t ) , v = 4 κηϑ , l = cr t e − κ ∆ t . Strictly speaking, the CIR process is not Gaussian, since its conditional distributionis not normal. However, with a slight abuse of terminology, we will refer to it asa Gaussian process, taking into account that its dynamic is driven by a Brownianmotion. It can be proven that the conditional mean and variance are E [ r t +1 | r t ] = η (1 − e − κ ∆ t ) + e − κ ∆ t r t V ar [ r t +1 | r t ] = ηϑ κ (1 − e − κ ∆ t ) + r t ϑ κ ( e − κ ∆ t − e − κ ∆ t ) . B (0 , t ) = e A ( t )+ B ( t ) r , (2.5)where A ( t ) = 2 κηϑ (cid:18) log(2) + log (cid:18) ( b/ ( κ − b )) e ( κ + b ) t/ ae bt − (cid:19)(cid:19) ,B ( t ) = − κ − b (cid:18) e bt − ae bt − (cid:19) ,a = κ + bκ − b and b = √ κ + 2 ϑ . For a description of the possible shape of the yield curve under the CIR model seeZeytun and Gupta (2007) and Keller-Ressel and Steiner (2008).
The extension of equation (2.1) to the sum of two or more short-rate factors r t , . . . , r dt is straightforward when pairwise independence is assumed between factors.In this case the formula in equation (2.1) becomes B (0 , t ) = φ R t ( i ) · · · φ R dt ( i ) . (2.6)However, if one considers dependent factors, the decomposition in equation (2.6)no longer holds. Normal-based models may still have a closed-form solution, but ingeneral, if one assumes a richer dependence structure (for example, a multivariatemodel or a copula allowing for tail dependence), Monte Carlo simulations or nu-merical methods are needed to evaluate bond prices. Feldh¨utter and Lando (2008)proposed a model with six independent factors to calibrate Treasury bonds, corpo-rate bonds, and swap rates using both cross-sectional and time-series properties ofthe observed yields. This indepedence assumption may be restrictive, although theadvantage is that pricing formulas have explicit solutions, and the model is moreparsimonious with fewer parameters to estimate. Following a similar approach, inour empirical analysis on the Italian yield curve we will consider different multi-factor models with independent CIR or Gaussian OU (Vasicek) factors. This as-sumption allows us to find a balance between computational tractability and modelflexibility. We assume that the short-rate r t at time t is defined as r t = (cid:88) i r it , (2.7)where r it may be a CIR or a Vasicek factor. As already observed, when morethan one factor is considered, we assume that the Brownian motions driving thesefactors are independent. We analyze two 1-factor models (1-CIR and 1-Vasicek),7hree 2-factor models (2-CIR, 2-Vasicek, and 1-CIR&1-Vasicek), and three 3-factormodels (3-CIR, 3-Vasicek, and 2-CIR&1-Vasicek).Also in this case, the price of a (defaultable) sovereign zero-coupon bond canbe written as B (0 , t ) = E [ e − (cid:82) t r s ds ] . (2.8)and the model can be calibrated to the observed yield curve, that is, y (0 , t ) = − t log( B (0 , t )) . (2.9)By equations (2.5) and (2.3) it follows that the yield is a linear function of r , . . . , r d , that is the actual level of the short-rate. As we will see in the following, thisproperty allows one to estimate the model with a Kalman filter. A Buono del Tesoro Poliennale (BTP) is an Italian government bond with semi-annual coupon interest payments, principal repaid on maturity and a minimumdenomination of EUR 1,000. The main institutional investor secondary market forthese sovereign bonds is managed by MTS S.p.a. and the only Italian regulatedsecondary market for retail investors dedicated to the trading of Italian governmentsecurities is managed by Borsa Italiana (MOT). Note that it has been estimatedthat trading on the electronic platforms for institutional investors, that is outsidethe MTS market, in 2013 accounted for about 60 per cent of total trading in Italiangovernment securities (see Banca d’Italia (2013)).We obtain data on Italian government securities from Bloomberg. The zerocoupon yield can be obtained by considering the F905 curve series. We consideredItalian sovereign bond data from May, 31 2003 to May, 31 2015. The time periodin this study includes the high volatility period after the Lehman Brothers fillingfor Chapter 11 bankruptcy protection (September 15, 2008) and the euro areasovereign debt crisis, during which, in November 2011, the spread between the 10-year Italian BTP and the German Bund with the same maturity was higher than500 basis point.
The stochastic risk-free component r t is calibrated by fitting the interest rate termstructure to the term structure of Italian government bonds. That is, at eachtrading day we consider the yield defined in equation (2.9). There are two possiblemethodologies to estimate a short-rate model: (1) one can fit the model to theyields observed daily in the market and check both the model flexibility and theparameter stability; or (2) one can extract the unobservable short-rate process (orprocesses) by using a filter as described and empirically tested for instance by Chenand Scott (2003) or O’Sullivan (2008). While the former is a static estimation, thelatter is a dynamic approach which captures the behavior of the yield curve overtime, taking into account intertemporal consistency of parameter estimates. In8his paper we will follow a dynamic approach. In all the cases we are interested in,the model can be written in the following form x t = f ( x t − , Θ , v t − ) z t = h ( x t , Θ , ε t ) (3.1)where t is the day counter, x t is the d -dimensional state variable (also referred toas the latent or unobservable factor; r t in the notation of section 2), v t − is therandomness from the state variable with covariance matrix Q t , and Θ are the pa-rameters. The state variable follows the dynamics described by f . The variable z t represents the set of n observations (in our case the Italian bond term structuresobserved in the market; y t in the notation of Section 2), while h is the so-calledmeasurement function, which in our case is given by the yield pricing formula(2.9), and it depends on the state variable, model parameters and measurementnoise ε t . It is standard to assume that this measurement noise is normally dis-tributed: since we consider more than one yield observations each day, we havea multivariate normally distributed error. Although in general the measurementerror covariance matrix R can be a non- diagonal matrix, for the sake of simplicityit is assumed to be diagonal in this study. Therefore the covariance structure ofshort-rates is represented only by the model itself and not by the measurementerror covariance matrix. Note that, if some component of x t follows a CIR dynam-ics, the model proposed in equation (3.1) is non-Gaussian with respect to the statevariable. Nonetheless, since it is linear with respect to the measurement function,the classical Kalman filter can be used (see Chen and Scott (2003)). More detailson the maximum likelihood estimation (MLE) method implemented to calibratemarket data can be found in O’Sullivan (2008). Using the notation of O’Sullivan(2008), the model proposed in equation (3.1) can be written as x t = Φ e + Φ x t − + v t − z t = H e + H x t + ε t (3.2)where e is a d -dimensional column vector with all components equal to 1, H and H are n × d rectangular matrices, Φ and Φ are d -dimensional diagonal matrices,and for each maturity T j , with j = 1 , . . . , n , and for each factor i , with i = 1 , . . . , d ,Φ i,i = (1 − e − κ i ∆ t ) η, Φ i,i = e − κ i ∆ t , (3.3) H j,i = − T j A i ( T j ) , H j,i = − T j B i ( T j ) and Σ = diag( σ ε ) (3.4)where A i ( t ) and B i ( t ) are defined in equations (2.3) and (2.5), respectively. Then,we have the diagonal element i of the matrix Q t equal to Q it = ϑ i κ i (1 − e − κ i ∆ t ) , in the Vasicek case, and Q it = ϑ i η i κ i (1 − e − κ i ∆ t ) + r it − ϑ i κ i ( e − κ i ∆ t − e − κ i ∆ t ) ,
9n the CIR case. With a notation similar to O’Sullivan (2008), the Kalman filteralgorithm to calibrate the model defined in equation (3.2) reads as follows. Theso-called prediction step is given by x − t = Φ e + Φ x t − P − t = Φ P t − (Φ ) (cid:48) + Q t . The predicted measurement, the prediction error, and its covariance are z − t = H e + H x − t u t = z t − z − t P zz,t = H P − t ( H ) (cid:48) + Σ . The filtered updates are given by K t = P − t ( H ) (cid:48) ( P zz,t ) − x t = x − t + K t u t P t = ( I − K t H ) P − t . Finally, the log-likelihood function for the maximum likelihood estimation of themodel proposed in equation (3.2) islog L (Θ) = − nd π − n (cid:88) t =0 log | P zz,t | − n (cid:88) t =0 u t P − zz,t u t (3.5)where Θ is the set of model parameters.As observed by Chen and Scott (2003), if one considers a CIR factor, equation(3.2) is an approximation of the the true dynamics, because the innovations in theCIR model have a noncentral χ distributions, in contrast to the normal distribu-tion that is assumed for maximum likelihood estimation of the Kalman filter. Forthis reason, the model (3.1) should in principle be calibrated taking into accountthe non-Gaussian nature of the CIR process. For example, the expectation maxi-mization (EM) algorithm proposed by Sch¨on et al. (2011) could be applied. Thisapproach is not pursued in this paper. It is well-known that finding an optimal solution that maximizes the likelihoodfunction in equation (3.5) is not a simple task. Most of the theoretical literatureon this subject does not report explicitly the algorithm applied to solve the op-timization problem above nor the computational time needed for the calibration.Note that in optimization packages first and second derivatives are usually ap-proximated with finite differences, if the analytic gradient vector and the Hessianmatrix are not provided. In order to speed up the optimization algorithm and tocompute the standard error of the estimates in an exact way, we use the analyticexpression for the gradient vector and the Hessian matrix (see Appendix A). As In a maximum likelihood estimation standard errors can be computed by taking the squareroot of the diagonal elements of the inverse Hessian matrix at the optimal point. κ , η , ϑ ) of eachfactor in the region between (1e-4, 1e-4, 1e-4) and (5, 0.1, 0.5) for the CIR factors,or (5, 0.1, 0.1) for the Vasicek factors. The parameter σ ε ranges between 1e-4 and0.5. The optimization algorithm applied in this study is the sequential quadraticprogramming method implemented in the fmincon Matlab function in which theoption sqp is selected. The analytic gradient vector of the likelihood is providedin order to speed up the algorithm. As an alternative the interior point algorithmthat considers the analytic Hessian matrix can be used. As the computation of theanalytic Hessian is time-consuming, we only use it to compute the standard errorsof the parameters and not in the optimization algorithm. From a practical stand-point, multi-factor models based only on Vasicek factors are simpler to implement,particularly when one wants to use the analytic Hessian matrix.As a starting point of the algorithm we select a random point in the parameterregion. This approach seems to work for all models except for those involving morethan two CIR factors. Given the presence of low (near-zero) rates, the model doesnot seem to perform well. One should add a negative constant to the short-rateprocess in equation (2.7) in order to raise the mean of the factors and give greaterflexibility to the model. Alternatively, one can add a Vasicek factor. In this paper,we follow the latter approach.
In Tables 1 and 2 we report, for each maturity, parameter estimates and calibrationerrors for different models of the Italian government bond term structure over thetime horizon considered (May 31, 2003 to May 31, 2014). We present results foreight different multi-factor short-rate models with CIR and (or) Vasicek factors. Inaddition, Table 2 reports the value of the log-likelihood function and its gradient,as well as the first order optimality condition at the optimal point. As expected, the analysis shows that the models with more than one factorhave a better performance in terms of calibration error. On the one hand, thoseincluding a Vasicek factor have a smaller calibration error, due to a higher degreeof flexibility. However, the use of the Vasicek factor does not ensure that the short-rate remains positive, even if by construction its mean value is strictly positive. Onthe other hand, while multi-factor CIR models generate only positive factors, theyare less flexible, particularly when interest rates approach the zero lower bound. Inthis case it may happen that for some multi-factor models the optimal parametershit the boundaries and this results in a value of the gradient that is far from zero.In addition, for CIR factors some numerical errors may appear in the evaluation ofthe Hessian and this results in negative values (although small in absolute value)for the square of the standard errors.To assess the performance of different models, we apply the Akaike information For the definition of the first order optimality condition we refer to the Matlab documenta-tion.
AIC = 2 np − L where np is the number of parameters and log L is the model log-likelihood. Ac-cording to the AIC reported in Table 2, the 3-factor Vasicek model performs bet-ter than the other models considered in our analysis, i.e. it has the smallest AICvalue. The Bayesian information criterion (BIC) gives the same ranking of modelsis therefore omitted.Figure 1 shows the dynamics of the factors for the eight models analyzed in thispaper. The value of sum of the factors (in black in Figure 1) captures the level ofthe shortest maturity rates while the other parameters capture the average slopeand curvature of the term structure throughout the sample period. It appearsclear that if one considers more than one CIR factor, there are factors that hitthe zero lower bound. This makes the calibration of these models computationallymore challenging, as confirmed by the fact that the optimal parameters depend onthe starting point in the optimization. Conversely, if one considers models withat least one Vasicek factor, this becomes negative, even if the sum of the factors(the black line) remains positive. In these cases the optimization algorithm is morerobust and the optimal solution does not depend on the initial point.For comparative purposes, the models performance across maturities and dif-ferent observation dates is evaluated with the average percentage error (APE) AP E = 1¯ y market (cid:88) t (cid:88) T i | y markett,T i − y modelt,T i ( θ ) | number of observations , (3.6)and the root mean square error (RMSE) RM SE = (cid:118)(cid:117)(cid:117)(cid:116)(cid:88) t (cid:88) T i ( y markett,T i − y modelt,T i ( θ )) number of observations , (3.7)where y t,T i denotes the yield to maturity T i observed at time t , and ¯ y market is theaverage yield across time and maturities.Based on the APE and the RMSE evaluated over the entire sample on successivecross-sections of bond yields multi-factor models perform better than one-factormodels. The APE and the RMSE are larger for both CIR and Vasicek 1-factormodels and smaller for three-factor models. While the overall APE ranges between9.88 (1-CIR) and 2.61 per cent (3-Vasicek), the overall RMSE ranges between48.98 (1-Vasicek) and 18.79 basis points (3-Vasicek). Both APE and RMSE valuesreported in Table 1 show that the calibration error depends on the maturity. Forall models, the error is usually larger for the shortest maturities. Beside the CIRmodel and the Vasicek 1-factor show a similar dynamic for the unobservable factorand a similar calibration error, the estimated parameter differs (see Table 2).In order to have a visual assessment of the fitting exercise, in Figure 2 theestimated rates of the best performing model (i.e. the model with three Vasicekfactors) are compared with the market ones. Furthermore, in Figure 3 we show thecorresponding calibration errors. On the last trimester of 2011 the model reaches12he largest calibration error. We are aware of the fact that Gaussian models arenot able to explain sharp movements in the market, even if the performance overthe observation period is satisfactory.In addition to estimating the eight models, we empirically study their forecast-ing performance over time, by considering 10,000 Monte Carlo simulations. Basedon the estimates in the period from May, 31 2003 to May, 31 2014, we comparethe simulated yield with the data observed in the market until May, 31 2015. InFigure 4 we show the 5 (10, 20, 60, 120 and 260) trading days ahead forecasts.For each model, we report the mean, the 5th and the 95th percentile based on10,000 simulated scenarios. Figure 4 shows that the forecasting performance of the1-factor model is poor. The same is true for the models with only CIR factors.The addition of at least one Vasicek factor in multi-factor models largely improvethe forecasting performance, even if negative values for short-term maturities areallowed. As expected, the forecasting performance decreases and the volatility ofthe forecast increases after 60 trading days. At least for the data considered in thisstudy, some models show a satisfactory 3-month ahead forecasts. The 6- and 12-month ahead forecast are far from the yield curve observed in the market. We areaware of the fact that this empirical study can be affected by the expanded assetpurchase program announced by the European Central Bank in January 2015. The objective of this paper is twofold. First, we provide a maximum likelihood op-timization algorithm based on the Kalman filter in which both the gradient vectorand the Hessian matrix can be computed in closed-form. Second, we explore thecalibration performance of multi-factor models driven by independent univariateCIR and Vasicek processes under the risk-neutral measure in fitting the Italianterm structure. As already observed in other empirical studies, at least three fac-tors are necessary for a satisfactory representation of the behavior of yield curves.This seems a good compromise between a satisfactory performance in terms ofcalibration error and parameter parsimony. We show that at least a Vasicek factoris needed to properly calibrate the dynamics of the yield curve at least for thedata we consider in this paper. If even the Vasicek factor may become negative,the short-rate process, defined as the sum of the factors, remains positive and itsexpected mean is strictly positive. The calibration of multi-factor CIR models isaffected by the presence of low level (near zero) rates, and these observed patternscomplicate the convergence properties of the optimization algorithm. Finally, themulti-factor models with at least a Vasicek factor seem to show a better forecastingperformance. 13
Appendix
A.1 Gradient of the likelihood function
Here we provide the formulas to compute the gradient of the likelihood functionto use in the optimization algorithm. For a generic parameter ξ ∈ Θ, we have that ∂ log L (Θ) ∂ξ = − n (cid:88) t =0 (cid:26) ∂u t ∂ξ P − zz,t u (cid:48) t + tr (cid:18) P − zz,t ∂P zz,t ∂ξ (cid:19) − u t P − zz,t ∂P zz,t ∂ξ P − zz,t u (cid:48) t (cid:27) . (A.1)The first gradient in equation (A.1) is given by ∂u t ∂ξ = − ∂H ∂ξ e − ∂H ∂ξ x − t − H ∂x − t ∂ξ∂x − t ∂ξ = ∂ Φ ∂ξ e + ∂ Φ ∂ξ x t − + Φ ∂x t − ∂ξ . In both the Vasicek and the CIR case we have ∂ Φ ∂ξ = (cid:0) η ∆ te − κ ∆ t − e − κ ∆ t (cid:1) (cid:48) and ∂ Φ ∂ξ = (cid:0) − ∆ te − κ ∆ t (cid:1) (cid:48) . The partial derivatives of H and H are given in Section A.3. More challenging isthe computation of the partial derivatives of P zz,t . These involves both the factorsparameters and the measurement error parameters. ∂P zz,t ∂ξ = ∂H ∂ξ P − t ( H ) (cid:48) + H ∂P − t ∂ξ ( H ) (cid:48) + H P − t ( ∂H ∂ξ ) (cid:48) + ∂ Σ ∂ξ , where the partial derivatives of H are given in Section A.3, ∂P − t ∂ξ = ∂ Φ ∂ξ P t − (Φ ) (cid:48) + Φ ∂P t − ∂ξ (Φ ) (cid:48) + Φ P t − ( ∂ Φ ∂ξ ) (cid:48) + ∂Q t ∂ξ , and the derivatives of Σ can be computed by considering that the matrix is diagonalwith all element equal to σ ε . Furthermore, the filtered updates are given by ∂x t ∂ξ = ∂x − t ∂ξ + ∂K t ∂ξ u t + K t ∂u t ∂ξ∂P t ∂ξ = ∂P − t ∂ξ − ∂K t ∂ξ H P − t − K t ∂H ∂ξ P − t − K t H ∂P − t ∂ξ , with ∂K t ∂ξ = ∂P − t ∂ξ ( H ) (cid:48) ( P zz,t ) − + P − t ∂ ( H ) (cid:48) ∂ξ ( P zz,t ) − − P − t ( H ) (cid:48) ( P zz,t ) − ∂P zz,t ∂ξ ( P zz,t ) − . Q t for a Vasicek factor r is ∂Q t ∂ξ = ∂∂ξ (cid:26) ϑ κ (1 − e − κ ∆ t ) (cid:27) = ∂q v ∂ξ , where ∂q v ∂κ = − ϑ κ (1 − e − κ ∆ t ) + ϑ ∆ tκ e − κ ∆ t ∂q v ∂η = 0 ∂q v ∂ϑ = ϑκ (1 − e − κ ∆ t )The partial derivatives of Q t for a CIR factor r is ∂Q t ∂ξ = ∂∂ξ (cid:26) ϑ η κ (1 − e − κ ∆ t ) + r t − ϑ κ ( e − κ ∆ t − e − κ ∆ t ) (cid:27) = ∂∂ξ { q c + r t − q c } = ∂q c ∂ξ + ∂r t − ∂ξ q c + r t − ∂q c ∂ξ where ∂q c ∂κ = − ϑ η κ (1 − e − κ ∆ t ) + ϑ η ∆ tκ ( e − κ ∆ t − e − κ ∆ t ) ∂q c ∂η = ϑ κ (1 − e − κ ∆ t ) ∂q c ∂ϑ = ϑηκ (1 − e − κ ∆ t ) and ∂q c ∂κ = − ϑ κ ( e − κ ∆ t − e − κ ∆ t ) − ϑ κ (∆ te − κ ∆ t − te − κ ∆ t ) ∂q c ∂η = 0 ∂q c ∂ϑ = 2 ϑκ ( e − κ ∆ t − e − κ ∆ t ) . By considering that the initial point of the KF algorithm are x = η and P = ϑ κ in the Vasicek case, and x = η and P = ηϑ κ in the CIR case, the derivatives with respect to a generic parameter ξ are simpleto compute, that is in both the Vasicek and the CIR case we have ∂x ∂ξ = (cid:0) (cid:1) ∂P ∂ξ = (cid:0) − ϑ κ ϑκ (cid:1) (cid:48) in the Vasicek case, and ∂P ∂ξ = (cid:16) − ϑ η κ ϑ κ ϑηκ (cid:17) (cid:48) in the CIR case. These derivatives are needed as initial point of the algorithm thatcompute the gradient on the likelihood function to maximize. A.2 Hessian of the likelihood function
Here we provide the formulas to compute the Hessian of the likelihood function touse in the optimization algorithm. For generic parameters ξ , ξ ∈ Θ, we have that ∂ log L (Θ) ∂ξ ∂ξ = − n (cid:88) t =0 (cid:26) ∂u t ∂ξ ∂ξ P − zz,t u (cid:48) t − ∂u t ∂ξ P − zz,t ∂P zz,t ∂ξ P − zz,t u (cid:48) t + 2 ∂u t ∂ξ P − zz,t ∂u t ∂ξ (cid:48) − tr (cid:18) P − zz,t ∂P zz,t ∂ξ P − zz,t ∂P zz,t ∂ξ (cid:19) + tr (cid:18) P − zz,t ∂P zz,t ∂ξ ∂ξ (cid:19) − ∂u t ∂ξ P − zz,t ∂P zz,t ∂ξ P − zz,t u (cid:48) t − u t ∂∂ξ (cid:16) P − zz,t ∂P zz,t ∂ξ P − zz,t (cid:17) u (cid:48) t (cid:27) , (A.2)where ∂∂ξ (cid:16) P − zz,t ∂P zz,t ∂ξ P − zz,t (cid:17) = − P − zz,t ∂P zz,t ∂ξ P − zz,t ∂P zz,t ∂ξ P − zz,t − P − zz,t ∂P zz,t ∂ξ P − zz,t ∂P zz,t ∂ξ P − zz,t + P − zz,t ∂P zz,t ∂ξ ∂ξ P − zz,t . Then, we need to compute only the following second derivatives ∂u t ∂ξ ∂ξ and ∂P zz,t ∂ξ ∂ξ ,since all other derivatives have been already reported in Section A.1. By SectionA.1 we have that ∂u t ∂ξ ∂ξ = − ∂H ∂ξ ∂ξ e − ∂H ∂ξ ∂ξ x − t − ∂H ∂ξ ∂x − t ∂ξ − ∂H ∂ξ ∂x − t ∂ξ − H ∂x − t ∂ξ ∂ξ ∂x − t ∂ξ ∂ξ = ∂ Φ ∂ξ ∂ξ e + ∂ Φ ∂ξ ∂ξ x t − + ∂ Φ ∂ξ ∂x t − ∂ξ + ∂ Φ ∂ξ ∂x t − ∂ξ + Φ ∂x t − ∂ξ ∂ξ In both the Vasicek and the CIR case we have ∂ Φ ∂ξ ∂ξ = − η ∆ t e − κ ∆ t te − κ ∆ t te − κ ∆ t and ∂ Φ ∂ξ ∂ξ = ∆ t e − κ ∆ t . H and H are given in Section A.3. More challenging isthe computation of the Hessian of P zz,t . These involves both the factors parametersand the measurement error parameters. ∂P zz,t ∂ξ ∂ξ = ∂H ∂ξ ∂ξ P − t ( H ) (cid:48) + ∂H ∂ξ ∂P − t ∂ξ ( H ) (cid:48) + ∂H ∂ξ P − t ( ∂H ∂ξ ) (cid:48) + ∂H ∂ξ ∂P − t ∂ξ ( H ) (cid:48) + H ∂P − t ∂ξ ∂ξ ( H ) (cid:48) + H ∂P − t ∂ξ ( ∂H ∂ξ ) (cid:48) + ∂H ∂ξ P − t ( ∂H ∂ξ ) (cid:48) + H ∂P − t ∂ξ ( ∂H ∂ξ ) (cid:48) + H P − t ( ∂H ∂ξ ∂ξ ) (cid:48) + ∂ Σ ∂ξ ∂ξ where, as already observed, the second derivatives of H are given in Section A.3,and the derivatives of Σ can be computed by considering that the matrix is diagonalwith all element equal to σ ε . Then we have ∂P − t ∂ξ ∂ξ = ∂ Φ ∂ξ ∂ξ P t − (Φ ) (cid:48) + ∂ Φ ∂ξ ∂P t − ∂ξ (Φ ) (cid:48) + ∂ Φ ∂ξ P t − ( ∂ Φ ∂ξ ) (cid:48) + ∂ Φ ∂ξ ∂P t − ∂ξ (Φ ) (cid:48) + Φ ∂P t − ∂ξ ∂ξ (Φ ) (cid:48) + Φ ∂P t − ∂ξ ( ∂ Φ ∂ξ ) (cid:48) + ∂ Φ ∂ξ P t − ( ∂ Φ ∂ξ ) (cid:48) + Φ ∂P t − ∂ξ ( ∂ Φ ∂ξ ) (cid:48) + Φ P t − ( ∂ Φ ∂ξ ∂ξ ) (cid:48) + ∂Q t ∂ξ ∂ξ and . Furthermore, the filtered updates are given by ∂x t ∂ξ ∂ξ = ∂x − t ∂ξ ∂ξ + ∂K t ∂ξ ∂ξ u t + ∂K t ∂ξ ∂u t ∂ξ + ∂K t ∂ξ ∂u t ∂ξ + K t ∂u t ∂ξ ∂ξ ∂P t ∂ξ ∂ξ = ∂P − t ∂ξ ∂ξ − ∂K t ∂ξ ∂ξ H P − t − ∂K t ∂ξ ∂H ∂ξ P − t − ∂K t ∂ξ H ∂P − t ∂ξ − ∂K t ∂ξ ∂H ∂ξ P − t − K t ∂H ∂ξ ∂ξ P − t − K t ∂H ∂ξ ∂P − t ∂ξ − ∂K t ∂ξ H ∂P − t ∂ξ − K t ∂H ∂ξ ∂P − t ∂ξ − K t H ∂P − t ∂ξ ∂ξ with ∂K t ∂ξ ∂ξ = ∂P − t ∂ξ ∂ξ ( H ) (cid:48) ( P zz,t ) − + ∂P − t ∂ξ ∂ ( H ) (cid:48) ∂ξ ( P zz,t ) − + ∂P − t ∂ξ ( H ) (cid:48) ∂ ( P zz,t ) − ∂ξ + ∂P − t ∂ξ ∂ ( H ) (cid:48) ∂ξ ( P zz,t ) − + P − t ∂ ( H ) (cid:48) ∂ξ ∂ξ ( P zz,t ) − + P − t ∂ ( H ) (cid:48) ∂ξ ∂ ( P zz,t ) − ∂ξ + ∂P − t ∂ξ ( H ) (cid:48) ∂ ( P zz,t ) − ∂ξ + P − t ∂ ( H ) (cid:48) ∂ξ ∂ ( P zz,t ) − ∂ξ − P − t ( H ) (cid:48) ∂∂ξ (cid:16) P − zz,t ∂P zz,t ∂ξ P − zz,t (cid:17) The second derivatives of q v for a Vasicek factor r are ∂q v ∂κ = ϑ κ (1 − e − κ ∆ t ) − ϑ ∆ tκ e − κ ∆ t − ϑ ∆ t κ e − κ ∆ t ∂q v ∂κ∂ϑ = ∂q v ∂ϑ∂κ = − ϑκ (1 − e − κ ∆ t ) + 2 ϑ ∆ tκ e − κ ∆ t ∂q v ∂ϑ = 1 κ (1 − e − κ ∆ t ) 17nd all other second derivatives are zero. The second derivatives of Q t for a CIRfactor r are ∂Q t ∂ξ ∂ξ = ∂q c ∂ξ ∂ξ + ∂r t − ∂ξ ∂ξ q c + ∂r t − ∂ξ ∂q c ∂ξ + ∂r t − ∂ξ ∂q c ∂ξ + r t − ∂q c ∂ξ ∂ξ where ∂q c ∂κ = ϑ ηκ (1 − e − κ ∆ t ) − ϑ η ∆ tκ (1 − e − κ ∆ t ) e − κ ∆ t + ϑ η ∆ tκ (∆ te − κ ∆ t − te − κ ∆ t ) ∂q c ∂κ∂η = ∂q c ∂η∂κ = − ϑ κ (1 − e − κ ∆ t ) + ϑ ∆ tκ ( e − κ ∆ t − e − κ ∆ t ) ∂q c ∂κ∂ϑ = ∂q c ∂ϑ∂κ = − ϑηκ (1 − e − κ ∆ t ) + 2 ϑη ∆ tκ ( e − κ ∆ t − e − κ ∆ t ) ∂q c ∂η = 0 ∂q c ∂η∂ϑ = ∂q c ∂ϑ∂η = ϑκ (1 − e − κ ∆ t ) ∂q c ∂ϑ = ηκ (1 − e − κ ∆ t ) and ∂q c ∂κ = 2 ϑ κ ( e − κ ∆ t − e − κ ∆ t ) + 2 ϑ κ (∆ te − κ ∆ t − te − κ ∆ t ) + ϑ κ (∆ t e − κ ∆ t − t e − κ ∆ t ) ∂q c ∂κ∂η = ∂q c ∂η∂κ = ∂q c ∂ϑ∂η = ∂q c ∂η∂ξ = 0 ∂q c ∂κ∂ϑ = ∂q c ∂ϑ∂κ = − ϑκ ( e − κ ∆ t − e − κ ∆ t ) − ϑκ (∆ te − κ ∆ t − te − κ ∆ t ) ∂q c ∂ϑ = 2 κ ( e − κ ∆ t − e − κ ∆ t ) . Furthermore, in the Vasicek case we have ∂P ∂ξ ∂ξ = ϑ κ − ϑκ − ϑκ κ and in the CIR case we have ∂P ∂ξ ∂ξ = ϑ ηκ − ϑ κ − ϑηκ − ϑ κ ϑκ − ϑηκ ϑκ ηκ . These derivatives are needed as initial point of the algorithm that compute theHessian on the likelihood function to maximize.
A.3 Gradient and Hessian of H0 and H1
As observed in equation (3.4) we have that H and H are defined as follows H j,i = − T j A i ( T j ) , H j,i = − T j B i ( T j )18herefore, to compute the gradient and the Hessian with respect to the modelparameters it is enough to compute the the gradient and the Hessian of A ( t ) and B ( t ). A.3.1 Vasicek model
In the Vasicek case, we have z model = − t ( A ( t ) + B ( t ) r )where A ( t ) = − ηt − ηg ( t ) − ϑ κ ( g ( t )) + ϑ κ ( t + g ( t )) ,B ( t ) = g ( t ) = − − e − κt κ . It follows that the first derivatives to compute the gradient are ∂g ( t ) ∂κ = 1 − e − κt − tκe − κt κ ,∂g ( t ) ∂η = ∂g ( t ) ∂ϑ = 0 ,∂A ( t ) ∂κ = − η ∂g ( t ) ∂κ − ϑ κ g ( t ) ∂g ( t ) ∂κ + ϑ κ ( g ( t )) + ϑ κ ∂g ( t ) ∂κ − ϑ κ ( t + g ( t )) ,∂A ( t ) ∂η = − t − g ( t ) ,∂A ( t ) ∂ϑ = − ϑ κ ( g ( t )) + ϑκ ( t + g ( t ))The second derivatives to compute the Hessian are ∂g ( t ) ∂κ = t κ e − κt − − e − κt ) + 3 tκe − κt κ , all other second derivatives of g ( t ) are zero, and ∂A ( t ) ∂κ = − η ∂g ( t ) ∂κ + ϑ κ g ( t ) ∂g ( t ) ∂κ − ϑ κ (cid:18) ∂g ( t ) ∂κ (cid:19) − ϑ κ g ( t ) ∂g ( t ) ∂κ − ϑ κ ( g ( t )) + ϑ κ g ( t ) ∂g ( t ) ∂κ − ϑ κ ∂g ( t ) ∂κ + ϑ κ ∂g ( t ) ∂κ + 3 ϑ κ ( t + g ( t )) − ϑ κ ∂g ( t ) ∂κ∂A ( t ) ∂κ∂η = ∂A ( t ) ∂η∂κ = − ∂g ( t ) ∂κ ,∂A ( t ) ∂κ∂ϑ = ∂A ( t ) ∂ϑ∂κ = − ϑκ g ( t ) ∂g ( t ) ∂κ + ϑ κ ( g ( t )) + ϑκ ∂g ( t ) ∂κ − ϑκ ( t + g ( t )) ,∂A ( t ) ∂η = ∂A ( t ) ∂η∂ϑ = ∂A ( t ) ∂ϑ∂η = 0 ∂A ( t ) ∂ϑ = − κ ( g ( t )) + 1 κ ( t + g ( t )) . .3.2 CIR model In the CIR case, we have z model = − t ( A ( t ) + B ( t ) r )where A ( t ) = 2 κηϑ (cid:0) log(2) + log( b ) − log( κ − b ) + ( κ + b ) t/ − log( ae bt − (cid:1) ,B ( t ) = − κ − b (cid:18) e bt − ae bt − (cid:19) ,a = κ + bκ − b and b = √ κ + 2 ϑ . It follows that ∂b∂κ = κ √ κ + 2 ϑ ∂b∂η = 0 ∂b∂ϑ = 2 ϑ √ κ + 2 ϑ and ∂a∂κ = ( κ − b )(1 + ∂b∂κ ) − ( κ + b )(1 − ∂b∂κ )( κ − b ) = 2( κ ∂b∂κ − b )( κ − b ) ∂a∂η = 0 ∂a∂ϑ = ( κ − b ) ∂b∂ϑ + ( κ + b ) ∂b∂ϑ ( κ − b ) = 2 κ ∂b∂ϑ ( κ − b ) Thus, by setting f = log(2) + log( b ) − log( κ − b ) + ( κ + b ) t/ − ww = log( ae bt − , the equalities ∂f∂κ = 1 b ∂b∂κ − k − b (cid:18) − ∂b∂κ (cid:19) + (cid:18) ∂b∂κ (cid:19) t − ∂w∂κ∂f∂η = 0 ∂f∂ϑ = 1 b ∂b∂ϑ + 1 k − b ∂b∂ϑ + t ∂b∂ϑ − ∂w∂ϑ∂w∂ξ = ∂a∂ξ e bt + ate bt ∂b∂ξ ae bt − A ( t ) the following partial derivatives ∂A ( t ) ∂κ = 2 ηϑ f + 2 κηϑ ∂f∂κ∂A ( t ) ∂η = 2 κϑ f∂A ( t ) ∂ϑ = − κηϑ f + 2 κηϑ ∂f∂ϑ , B ( t ) ∂B ( t ) ∂κ = 2 h ( κ − b ) (cid:18) − ∂b∂κ (cid:19) − κ − b ∂h∂κ∂B ( t ) ∂η = 0 ∂B ( t ) ∂ϑ = − h ( κ − b ) ∂b∂ϑ − κ − b ∂h∂ϑ where h = e bt − ae bt − ξ ∈ { κ, ϑ } ∂h∂ξ = e bt (cid:16) − t ∂b∂ξ − e bt ∂a∂ξ + ∂a∂ξ + at ∂b∂ξ (cid:17) ( ae bt − . A little bit more tedious are the computation of the second derivatives. More indetails, we have ∂b∂κ = 2 ϑ ( κ + 2 ϑ ) ∂b∂κ∂ϑ = ∂b∂ϑ∂κ = − κϑ ( κ + 2 ϑ ) ∂b∂ϑ = 2 κ ( κ + 2 ϑ ) with all other second derivatives equal to zero, and ∂a∂κ = 2 k ∂b∂κ ( κ − b ) − − ∂b∂κ )( k ∂b∂κ − b )( κ − b ) ∂a∂κ∂ϑ = ∂a∂ϑ∂κ = 2( κ ∂b∂κ∂ϑ )( κ − b ) + 2 ∂b∂ϑ (2 κ ∂b∂κ − b − k )( κ − b ) ∂a∂ϑ = 2 κ ∂b∂ϑ ( k − b ) + 4 κ ( ∂b∂ϑ ) ( κ − b ) with all other second derivatives equal to zero. Then, by setting ∂f∂κ = − b (cid:18) ∂b∂κ (cid:19) + 1 b ∂b∂κ + 1( k − b ) (cid:18) − ∂b∂κ (cid:19) + 1 k − b ∂b∂κ + t ∂b∂κ − ∂w∂κ ∂f∂κ∂ϑ = ∂f∂ϑ∂κ = − b ∂b∂κ ∂b∂ϑ + 1 b ∂b∂κ∂ϑ − k − b ) (cid:18) − ∂b∂κ (cid:19) ∂b∂ϑ + 1 k − b ∂b∂κ∂ϑ + t ∂b∂κ∂ϑ − ∂w∂κ∂ϑ∂f∂ϑ = − b (cid:18) ∂b∂ϑ (cid:19) + 1 b ∂b∂ϑ + 1( k − b ) (cid:18) ∂b∂ϑ (cid:19) + 1 k − b ∂b∂ϑ + t ∂b∂ϑ − ∂w∂ϑ ∂w∂κ = ∂a∂κ e bt + 2 ∂a∂κ te bt ∂b∂κ + at e bt (cid:0) ∂b∂κ (cid:1) + ate bt ∂b∂κ ( ae bt − − (cid:0) ∂a∂κ e bt + ate bt ∂b∂κ (cid:1) ( ae bt − ∂w∂κ∂ϑ = ∂w∂ϑ∂κ = ∂a∂κ∂ϑ e bt + ∂a∂κ te bt ∂b∂ϑ + ∂a∂ϑ te bt ∂b∂κ + at e bt ∂b∂κ ∂b∂ϑ + ate bt ∂b∂κ ∂b∂ϑ ( ae bt − − (cid:0) ∂a∂κ e bt + ate bt ∂b∂κ (cid:1) (cid:0) ∂a∂ϑ e bt + ate bt ∂b∂ϑ (cid:1) ( ae bt − ∂w∂ϑ = ∂a∂ϑ e bt + 2 ∂a∂ϑ te bt ∂b∂ϑ + at e bt (cid:0) ∂b∂ϑ (cid:1) + ate bt ∂b∂ϑ ( ae bt − − (cid:0) ∂a∂ϑ e bt + ate bt ∂b∂ϑ (cid:1) ( ae bt − we obtain for A ( t ) the following partial derivatives ∂A ( t ) ∂κ = 4 ηϑ ∂f∂κ + 2 κηϑ ∂f∂κ ∂A ( t ) ∂κ∂ϑ = ∂A ( t ) ∂ϑ∂κ = − ηϑ f + 2 ηϑ ∂f∂ϑ − κηϑ ∂f∂κ + 2 κηϑ ∂f∂κ∂ϑ∂A ( t ) ∂ϑ = 12 κηϑ f − κηϑ ∂f∂ϑ + 2 κηϑ ∂f∂ϑ and for B ( t ) ∂B ( t ) ∂κ = ∂h∂κ κ − b ) (cid:18) − ∂b∂κ (cid:19) − h ( κ − b ) (cid:18) − ∂b∂κ (cid:19) − h ( κ − b ) ∂b∂κ + 2( κ − b ) ∂h∂κ (cid:18) − ∂b∂κ (cid:19) − κ − b ∂h∂κ ∂B ( t ) ∂κ∂ϑ = ∂B ( t ) ∂ϑ∂κ = ∂h∂ϑ κ − b ) (cid:18) − ∂b∂κ (cid:19) + 4 h ( κ − b ) (cid:18) − ∂b∂κ (cid:19) ∂b∂ϑ − h ( κ − b ) ∂b∂κ∂ϑ − κ − b ) ∂h∂κ ∂b∂ϑ − κ − b ∂h∂κ∂ϑ∂B ( t ) ∂ϑ = − ∂h∂ϑ κ − b ) ∂b∂ϑ − h ( κ − b ) (cid:18) ∂b∂ϑ (cid:19) − h ( κ − b ) ∂b∂ϑ − κ − b ) ∂h∂ϑ ∂b∂ϑ − κ − b ∂h∂ϑ ∂h∂κ = t ∂b∂κ e bt (cid:0) − t ∂b∂κ − e bt ∂a∂κ + ∂a∂κ + at ∂b∂κ (cid:1) ( ae bt − + e bt (cid:0) − t ∂b∂κ − t ∂b∂κ e bt ∂a∂κ − e bt ∂a∂κ + ∂a∂κ + ∂a∂κ t ∂b∂κ + at ∂b∂κ (cid:1) ( ae bt − − e bt (cid:0) − t ∂b∂κ − e bt ∂a∂κ + ∂a∂κ + at ∂b∂κ (cid:1) (cid:0) ∂a∂κ e bt + ate bt ∂b∂κ (cid:1) ( ae bt − ∂h∂κ∂ϑ = t ∂b∂ϑ e bt (cid:0) − t ∂b∂κ − e bt ∂a∂κ + ∂a∂κ + at ∂b∂κ (cid:1) ( ae bt − + e bt (cid:0) − t ∂b∂κ∂ϑ − t ∂b∂ϑ e bt ∂a∂κ − e bt ∂a∂κ∂ϑ + ∂a∂κ∂ϑ + ∂a∂ϑ t ∂b∂κ + at ∂b∂κ∂ϑ (cid:1) ( ae bt − − e bt (cid:0) − t ∂b∂κ − e bt ∂a∂κ + ∂a∂κ + at ∂b∂κ (cid:1) (cid:0) ∂a∂ϑ e bt + ate bt ∂b∂ϑ (cid:1) ( ae bt − ∂h∂ϑ = t ∂b∂ϑ e bt (cid:0) − t ∂b∂ϑ − e bt ∂a∂ϑ + ∂a∂ϑ + at ∂b∂ϑ (cid:1) ( ae bt − + e bt (cid:0) − t ∂b∂ϑ − t ∂b∂ϑ e bt ∂a∂ϑ − e bt ∂a∂ϑ + ∂a∂ϑ + ∂a∂ϑ t ∂b∂ϑ + at ∂b∂ϑ (cid:1) ( ae bt − − e bt (cid:0) − t ∂b∂ϑ − e bt ∂a∂ϑ + ∂a∂ϑ + at ∂b∂ϑ (cid:1) (cid:0) ∂a∂ϑ e bt + ate bt ∂b∂ϑ (cid:1) ( ae bt − eferences A. Abdymomunov and J. Gerlach. Stress testing interest rate risk exposure.
Jour-nal of Banking & Finance , 49:287–301, 2014.Y. A¨ıt-Sahalia and R.L. Kimmel. Estimating affine multifactor term structuremodels using closed-form likelihood expansions.
Journal of Financial Economics ,98(1):113–144, 2010.C.I.R. Almeida. A note on the relation between principal components and dynamicfactors in affine term structure models.
Brazilian Review of Econometrics , 25(1):89–114, 2005.A. Ang and F.A. Longstaff. Systemic sovereign credit risk: Lessons from the USand Europe.
Journal of Monetary Economics , 60(5):493–510, 2013.D. Bams and P.C. Schotman. Direct estimation of the risk neutral factor dynamicsof Gaussian term structure models.
Journal of Econometrics , 117(1):179 – 206,2003.Banca d’Italia. Financial stability report no. 6, November, 2013.E. Barone, D. Cuoco, and E. Zautzik. The term structure of interest rates: A testof the Cox, Ingersoll and Ross model on Italian treasury bonds.
Banca d’Italia,Working paper, no. 128 , 1989.M.L. Bianchi and G.L. Tassinari. Forward-looking portfolio selection with mul-tivariate non-Gaussian models and the Esscher transform.
Preprint , 2018, https://arxiv.org/abs/1805.05584 .D. Brigo and F. Mercurio.
Interest rate models: theory and practice: with smile,inflation, and credit . Springer, 2006.R.R. Chen and L. Scott. Multi-factor Cox-Ingersoll-Ross models of the term struc-ture: estimates and tests from a Kalman filter model.
The Journal of Real EstateFinance and Economics , 27(2):143–172, 2003.J.C. Cox, J.E. Ingersoll, and S.A. Ross. A theory of the term structure of interestrates.
Econometrica , pages 385–407, 1985.P. Date and C. Wang. Linear Gaussian affine term structure models with unobserv-able factors: Calibration and yield forecasting.
European Journal of OperationalResearch , 195(1):156–166, 2009.F. De Jong. Time series and cross-section information in affine term-structuremodels.
Journal of Business & Economic Statistics , 18(3):300–314, 2000.R.L. D’Ecclesia and S.A. Zenios. Risk factor analysis and portfolio immunizationin the italian bond market.
The Journal of Fixed Income , 4(2):51–58, 1994.24.A.H. Dempster, J. Evans, and E. Medova. Developing a practical yield curvemodel: An odyssey. In J.S. Chadha, A.C.J. Durr´e, M.A.S. Joyce, and L. Sarno,editors,
Developments in macro-finance yield curve modelling , pages 252–292.Cambridge University Press, 2014.J.C. Duan and J.G. Simonato. Estimating and testing exponential-affine termstructure models by kalman filter.
Review of Quantitative Finance and Account-ing , 13(2):111–135, 1999.G.R. Duffee and R.H. Stanton. Estimation of dynamic term structure models.
TheQuarterly Journal of Finance , 2(2), 2012.P. Feldh¨utter and D. Lando. Decomposing swap spreads.
Journal of FinancialEconomics , 88(2):375–405, 2008.A.L.J. Geyer and S. Pichler. A state-space approach to estimate and test mul-tifactor Cox-Ingersoll-Ross models of the term structure.
Journal of FinancialResearch , 22(1):107–130, 1999.A. Girardi and C. Impenna. Price discovery in the Italian sovereign bonds market:the role of order flow.
Banca d’Italia, Working paper, no. 906 , 2013.M. Jeanblanc, M. Yor, and M. Chesney.
Mathematical methods for financial mar-kets . Springer, 2009.M. Keller-Ressel and T. Steiner. Yield curve shapes and the asymptotic short ratedistribution in affine one-factor models.
Finance and Stochastics , 12(2):149–172,2008.B. Maggi and F. Infortuna. Assessing Italian government bonds’ term structurewith CIR model in the aftermath of EMU.
Applied Financial Economics Letters ,4(3):163–170, 2008.S. Musti and R.L. D’Ecclesia. Term structure of interest rates and the expectationhypothesis: The euro area.
European Journal of Operational Research , 185(3):1596–1606, 2008.S.K. Nawalkha and R. Rebonato. What interest rate models to use? Buy sideversus sell side.
Journal of Investment Management , 9(3), 2011.S.K. Nawalkha, N.A. Beliaeva, and G. Soto. A new taxonomy of the dynamic termstructure models.
Journal of Investment Management , 8(4), 2010.C. O’Sullivan. Parameter uncertainty in Kalman-Filter estimation of the CIR termstructure model. In J. A. D. Appleby, D.C. Edelman, and J.H. Miller, editors,
Numerical methods for finance . Chapman & Hall, 2008.R. Rebonato, I. Saroka, and V. Putyatin. A principal-component-based affine termstructure model.
Working paper , 2014.T.B Sch¨on, A. Wills, and B. Ninness. System identification of nonlinear state-spacemodels.
Automatica , 47(1):39 – 49, 2011.25. Schoutens and J. Cariboni.
L´evy processes in credit risk . Wiley, 2009.G.L. Tassinari and M.L. Bianchi. Calibrating the smile with multivariate time-changed Brownian motion and the Esscher transform.
International Journal ofTheoretical and Applied Finance , 17(4), 2014.L. Teng, M. Ehrhardt, and M. G¨unther. Numerical evaluation of complex log-arithms in the Cox-Ingersoll-Ross model.
International Journal of ComputerMathematics , 90(5):1083–1095, 2013.O. Vasicek. An equilibrium characterization of the term structure.
Journal ofFinancial Economics , 5(2):177–188, 1977.S. Zeytun and A. Gupta. A comparative study of the Vasicek and the CIR modelof the short rate.
Working paper , 2007.26 e a r s t o m a t u r i t y . . t o t a l - C I R A PE . . . . . . . . . . . . . . . . R M S E . . . . . . . . . . . . . . . .
96 1 - V a s i ce k A PE . . . . . . . . . . . . . . . . R M S E . . . . . . . . . . . . . . . .
98 2 - C I R A PE . . . . . . . . . . . . . . . . R M S E . . . . . . . . . . . . . . . .
51 2 - V a s i ce k A PE . . . . . . . . . . . . . . . . R M S E . . . . . . . . . . . . . . . .
06 1 - C I R + - V a s i ce k A PE . . . . . . . . . . . . . . . . R M S E . . . . . . . . . . . . . . . .
17 3 - C I R A PE . . . . . . . . . . . . . . . . R M S E . . . . . . . . . . . . . . . .
38 3 - V a s i ce k A PE . . . . . . . . . . . . . . . . R M S E . . . . . . . . . . . . . . . .
79 2 - C I R + - V a s i ce k A PE . . . . . . . . . . . . . . . . R M S E . . . . . . . . . . . . . . . . T a b l e : C a li b r a t i o n e rr o r o f t h e m u l t i - f a c t o r G a u ss i a n m o d e l s . W e r e p o r t , f o r e a c h m o d e l a nd f o r e a c h m a t u r i t y , t h e b o nd t e r m s t r u c t u r ec a li b r a t i o n e rr o r i n t h e p e r i o d f r o m M a y , t o M a y , . T h e a v e r ag e p e r ce n t ag ee rr o r ( A PE ) i np e r ce n t ag e p o i n t s a nd t h e r oo t m e a n s q u a r ee rr o r ( R M S E ) i n b a s i s p o i n t s a r e r e p o r t e d . η ϑ κ η ϑ κ η ϑ σ ε - C I R . . . . LL . e + - . e - - . e - . e - . e - o p t . c o n . . e - - . e - . e - - . e - . e - A I C - . e + - V a s i ce k . . . . LL . e + . e - . e - - . e - - . e - o p t . c o n . . e - . e - . e - . e - . e - A I C - . e + - C I R . . . . . . . LL . e + - . e + - . e + . e + . e + . e + - . e + - . e + o p t . c o n . . e + - . e - - . e - - . e - - . e - - . e - - . e - . e - A I C - . e + - V a s i ce k . . . . . . LL . e + . e - - . e + - . e - . e - - . e - - . e - o p t . c o n . . e - . e - . e - . e - . e - . e - . e - A I C - . e + - C I R + - V a s i ce k . . . . . . LL . e + . e + - . e + - . e + - . e - - . e + . e - o p t . c o n . . e + . e - . e - . e - . e - . e - . e - A I C - . e + - C I R . . . . . . . . . . LL . e + - . e + - . e + . e + - . e + . e + - . e + - . e + - . e + . e + . e + o p t . c o n . . e + - . e - - . e - - . e - - . e - - . e - - . e - - . e - - . e - . e - . e - A I C - . e + - V a s i ce k . . . . . . . . LL . e + . e - - . e + . e - - . e - . e + - . e - - . e - . e - o p t . c o n . . e + . e - . e - . e - . e - . e - . e - . e - . e - A I C - . e + - C I R + - V a s i ce k . . . . . . . . . LL . e + - . e + . e + . e + . e + . e + - . e + - . e + - . e + . e + o p t . c o n . . e + - . e - - . e - . e - . e - . e - . e - . e - . e - . e - A I C - . e + T a b l e : E s t i m a t e dp a r a m e t e r s o f t h e m u l t i - f a c t o r G a u ss i a n m o d e l s i n t h e p e r i o d f r o m M a y , t o M a y , . W e r e p o r t , f o r e a c h m o d e l t h e e s t i m a t e dp a r a m e t e r s , t h ec o rr e s p o nd i n g v a l u e o f t h e g r a d i e n t a nd t h e s q u a r e o f t h e s t a nd a r d e rr o r . T h e v a l u e o f t h e li k e li h oo d ( LL ) , o f t h e fi r s t o r d e r o p t i m a li t y c o nd i t i o n a tt h e o p t i m a l p o i n t( o p t . c o n ) , a nd o f t h e A I C a r e s h o w n i n t h e l a s t c o l u m n . Estimated factors and short-rate process r t on the Italy bond yields from May, 312003 to May, 31 2014. Recall that the black line is the sum of the unobservable factors, that isthe estimated short rate r t . We report the estimated factors for all models investigated. TheKalman filter is considered to extract the unobservable short-rate process.
04 05 06 07 08 09 10 11 12 13 14-0.1-0.0500.050.1
04 05 06 07 08 09 10 11 12 13 14-0.1-0.0500.050.1
04 05 06 07 08 09 10 11 12 13 14-0.1-0.0500.050.1
04 05 06 07 08 09 10 11 12 13 14-0.1-0.0500.050.1
04 05 06 07 08 09 10 11 12 13 14-0.1-0.0500.050.1
04 05 06 07 08 09 10 11 12 13 14-0.1-0.0500.050.1
04 05 06 07 08 09 10 11 12 13 14-0.1-0.0500.050.1
04 05 06 07 08 09 10 11 12 13 14-0.1-0.0500.050.1 i g u r e : C a li b r a t i o n o f t h e b e s t p e r f o r m i n g s h o r t - r a t e m o d e l ( - V a s i ce k ) o n t h e I t a l y b o nd y i e l d s f r o m M a y , t o M a y , . W e r e p o r tt h e o b s e r v e db o nd r a t e s a nd t h ee s t i m a t e d o n e s f o r a ll m a t u r i t i e s i n v e s t i ga t e d . T h e K a l m a nfi l t e r i s c o n s i d e r e d t o e x t r a c tt h e un o b s e r v a b l e s h o r t - r a t e p r o ce ss . . - yea r . - yea r - yea r - yea r - yea r - yea r - yea r - yea r - yea r - yea r - yea r - yea r - yea r - yea r - yea r i g u r e : C a li b r a t i o n e rr o r s o f t h e b e s t p e r f o r m i n g s h o r t - r a t e m o d e l ( - V a s i ce k ) o n t h e I t a l y b o nd y i e l d s f r o m M a y , t o M a y , . W e r e p o r t t h ec a li b r a t i o n e rr o r s f o r a ll m a t u r i t i e s i n v e s t i ga t e d . T h e K a l m a nfi l t e r i s c o n s i d e r e d t o e x t r a c tt h e un o b s e r v a b l e s h o r t - r a t e p r o ce ss . - . - yea r - . - yea r - - yea r - - yea r - - yea r - - yea r - - yea r - - yea r - - yea r - - yea r - - yea r - - yea r - - yea r - - yea r - - yea r i g u r e : S i m u l a t e d a nd o b s e r v e d I t a l y b o nd y i e l d s . W e r e p o r tt h e ( , , , nd )t r a d i n g d a y s a h e a d f o r ec a s t i n t h e p e r i o d f r o m f r o m M a y , t o M a y , . F o r e a c h m o d e l, w ec o n s i d e r t h e m e a n , t h e t h a nd t h e t hp e r ce n t il e b a s e d o n , s i m u l a t e d s ce n a r i o s . - / / ( d ays ) - / / ( d ays ) - / / ( d ays ) - / / ( d ays ) - / / ( d ays ) - / / ( d ays ) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -2000200400600