VAR and ES/CVAR Dependence on Data Cleaning and Data Models: Analysis and Resolution

Chris Kenyon and Andrew Green

May 30, 2014. Version 1.01

The views expressed are those of the author(s) only; no other representation should be attributed. Contact: [email protected], [email protected]
Abstract
Historical (Stressed-) Value-at-Risk ((S)VAR) and Expected Shortfall (ES) are widely used risk measures in regulatory capital and Initial Margin, i.e. funding, computations. However, whilst the definitions of VAR and ES are unambiguous, they depend on input distributions that are data-cleaning- and Data-Model-dependent. We quantify the scale of these effects from USD CDS (2004–2014), and from USD interest rates (1989–2014; single-curve setup before 2004, multi-curve setup after 2004), and make two standardisation proposals: for data; and for Data Models. VAR and ES are required for lifetime portfolio calculations, i.e. collateral calls, which cover a wide range of market states. Hence we need standard, i.e. clean, complete, and common (i.e. identical for all banks), market data also covering this wide range of market states. This data is historically incomplete and not clean, hence data standardization is required. Stressed VAR and ES require moving market movements from a past (usually not recent) window to current, and future, market states. All choices (e.g. absolute difference, relative difference, relative difference scaled by some function of market states) implicitly define a Data Model for transformation of extreme market moves (recall that 99th percentiles are typical, and the behaviour of the rest is irrelevant). Hence we propose standard Data Models. These are necessary because different banks have different stress windows. Where there is no data, or a requirement for simplicity, we propose standard lookup tables (one per window, etc.). Without this standardization of data and Data Models we demonstrate that VAR and ES are complex derivatives of subjective choices.
Historical Value-at-Risk (VAR) and Expected Shortfall (ES, aka Conditional Value-at-Risk or CVAR) are required under current conditions, and under stressed market conditions, for capital and Initial Margin, i.e. funding, calculations (BCBS-189 2011; BCBS-261 2013; BCBS-265 2013). They are also required under future conditions for lifetime cost of funding, and lifetime cost of capital, pricing (Green and Kenyon 2014; Green, Kenyon, and Dennis 2014). Although the mathematical formulae for VAR and ES are unambiguous given an input distribution, we demonstrate that this input distribution is highly data-cleaning and Data-Model dependent. Hence we make concrete standardisation proposals.

We quantify the scale of data cleaning and Data Model effects covering: daily USD senior unsecured CDS (2004–2014); daily USD interest rates using swaps (OIS and Libor) out to 30-year maturity (1989–2014); and daily Effective Fed Fund rates (1972–2014). We use a single-curve approach before 2004 and a multi-curve (discounting and spread) approach after 2004 for interpreting the OIS and Libor swaps data (Kenyon and Stamm 2012; Morini and Bianchetti 2013). The Effective Fed Fund rate data enables us to probe higher interest rate regimes (to 20%) rather than only the last twenty years of lower rates. Swaps data is unavailable much prior to 1989, and OIS swaps are unavailable much prior to 2004. Understanding a wide range of rates is relevant for stressed VAR and stressed ES because these involve moving market movements from a past (usually not recent) window to current and future market states. A wide range of states is also important for lifetime capital and funding calculations (Green and Kenyon 2014; Green, Kenyon, and Dennis 2014) where simulation paths can disperse widely. It is also relevant for understanding possible future market states before they occur.

We also quantify the relative sensitivity of VAR and ES to data cleaning. Roughly speaking, once data is clean we find no significant difference w.r.t. Data Models between 10-day VAR(99%) and ES(97.5%) — our cleaning removed outliers. Both data cleaning and Data Models have significant, and separate, effects.

Since VAR and ES are highly data-cleaning and Data-Model dependent we propose standard data and standard Data Models. Standard data is clean, complete and common, i.e. identical for all banks. Where there is insufficient data, or a requirement for simplicity, we propose a standard look-up table approach. This comprises sets of lookup tables that are appropriate for banks with different stress windows. To avoid model risk we follow the prospective Prudential Valuation (EBA 2013) in proposing using a set of Data Models. We also propose that standard Data Models and standard lookup tables are consistent with the historical record. This may seem obvious but we will demonstrate that two common choices (absolute differences and relative differences) are not consistent with the historical record for USD interest rates 1989–2014. Without standardisation of the input distribution construction, both VAR and ES are complex derivatives of subjective data cleaning choices and subjective Data Model choices.

The two key issues here for VAR and ES are data cleaning and Data Model, which we define as follows.

• Data Cleaning: the process data goes through before it reaches any analysis system.
The two key issues are missing data, and identification of false data (usually outliers), which can then be treated as missing data (assuming that the false data carries no relevant information). An example of false data would be data entered incorrectly, e.g. 31 instead of 13. Data received by an institution may have already been cleaned by a separate institution, e.g. a commercial data provider. Furthermore, data cleaning procedures may also change periodically, making the issue more complex.

• Standard Data: data that is clean, complete, and common. That is, it is identical for all banks. This can be effected in the same manner that banks are subject to common capital regulations.

• Data Model: how historical time series data is turned into an input distribution for VAR and ES after data cleaning. It includes how data is moved from a past historical window to current and future market states that may be very different. A typical example would be the choice of whether to use relative differences, or absolute differences, or relative differences scaled by some function of market states, in 10-day VAR(99%) for bilateral IM (BCBS-261 2013). In practice there are many choices, and some simple choices are not consistent with the historical record. This is the second key issue investigated here.

• Standard Data Model: a data model that is identical for all banks. This can be effected in the same manner that banks are subject to common capital regulations.

For completeness we provide standard definitions of VAR and ES here. VAR(α) is the lower bound on the loss that is expected to occur α-percent of the time. ES(α) is the expectation of the loss, given that it is at least as large as VAR(α). Standard definitions are:

\[ \mathrm{VAR}(\alpha) := \min\{x \;\text{s.t.}\; \mathrm{CDF}(x) \geq \alpha\} \]

\[ \mathrm{ES}(\beta) := \frac{1}{1-\beta}\int_\beta^1 \mathrm{VAR}(\alpha)\,d\alpha = \frac{1}{1-\beta}\int_{\mathrm{VAR}(\beta)}^\infty x\,\mathrm{PDF}(x)\,dx \]

where CDF is the Cumulative Distribution Function of the losses (this exists for all distributions), and PDF is the Probability Density Function of the losses.
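To make the definitions concrete, the following minimal Python sketch (ours, not from the paper; the function name and interface are illustrative) computes empirical VAR(α) as the smallest order statistic whose empirical CDF reaches α, and ES(β) as the mean of the losses at least as large as VAR(β):

```python
import numpy as np

def var_es(losses, alpha=0.99, beta=0.975):
    """Empirical VAR(alpha) and ES(beta) from a sample of losses.

    Illustrative sketch: VAR(alpha) is the smallest loss x with empirical
    CDF(x) >= alpha; ES(beta) averages losses at least as large as VAR(beta).
    """
    x = np.sort(np.asarray(losses, dtype=float))
    n = len(x)
    var_a = x[int(np.ceil(alpha * n)) - 1]   # smallest x with CDF(x) >= alpha
    var_b = x[int(np.ceil(beta * n)) - 1]
    es_b = x[x >= var_b].mean()              # E[loss | loss >= VAR(beta)]
    return var_a, es_b

# Example: one year (~260 observations) of daily losses
rng = np.random.default_rng(0)
print(var_es(rng.standard_normal(260)))
```

Note that with 260 observations VAR(99%) is set by the two or three largest losses and ES(97.5%) by roughly the largest six or seven, which is why, as argued throughout, only the outliers count.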
Sensitivity of ES to outliers and estimation of the input distribution has been noted (Cont, Deguest, and Scandolo 2008; BCBS-265 2013), but not the dependence on data cleaning and Data Model (e.g. relative or absolute differences). Portfolio optimization under VAR and ES is known to be sensitive to noise (Kondor, Pafka, and Nagy 2007; Lim, Shanthikumar, and Vahn 2011). Parameter uncertainty has a long history. However, data cleaning and Data Model issues are complementary but orthogonal to usual statistical questions such as estimation error (Kondor 2014).

Data cleaning differences between banks, and differences in Data Models, may have contributed to the range of outcomes observed in comparative risk weighted assets for market risk (BCBS-240 2013), as that document also remarked.

There are many studies of implied volatility under risk-neutral and historical measures (Rebonato 2004; Hull and White 2014; DeGuillaume, Rebonato, and Pogudin 2013). Two of the most recent find a three-part behaviour of volatility versus level (also considering different tenors): increasing Normal volatility (of 1-day differences) with level up to around 2%, then constant Normal volatility up to about 6%, then increasing Normal volatility again up to around 10%. These used data from 2004–2010 in one instance, and starting much further back in the other. The longer study had to use bond data to go back, and so mixed funded (bond) and unfunded (swap) data. Both studies finished before the current low rates regime (roughly 2010–onwards) had lasted very long. We show roughly similar results when considering Normal volatility but find a cleaner signal for log-Normal volatility. We also demonstrate issues with mixing funded and unfunded data. Our results also cover discounting and spread (i.e. multi-curve) behaviour after 2004 (Kenyon and Stamm 2012; Morini and Bianchetti 2013). We show that spread curves (difference between projection and discount curves) behave differently to discount curves. Our results are robust with respect to tenor, so this represents an addition to earlier work with the benefit of additional data and methods.

The contributions of this paper are: firstly, defining the data cleaning (missing and false data) and Data Model issues; secondly, quantifying their (different) effects for both VAR and ES, and identifying and quantifying the issues for stressed VAR and ES to do with applying past window data to current (and future) market conditions; thirdly, standardisation proposals for standard market data (clean, complete, and common) and standard Data Models. We also contribute a lookup table alternative where there is either insufficient data or a requirement for simplicity.
Here, in Table 1, we detail places that VAR and ES must be used now, or are proposed, for regulatory purposes. In BCBS-265 (FRTB) VAR(99%) is replaced for requirements by ES(97.5%), but VAR(99%) is retained for backtesting. For IMM cases, CVA VAR capital and Market Risk (MR) capital use the sum of the amounts derived from VAR and SVAR. (CCR uses the maximum, but is not derived from VAR or SVAR.) Using the sum of VAR and SVAR derived capital for MR and CVA is obviously pro-cyclical because VAR is pro-cyclical. Central Counterparties (CCPs) may also use VAR or ES type methodologies. For example LCH uses an ES-type methodology, but details are proprietary and subject to change.

Art.11(15) in Table 1 refers to draft regulatory technical standards (EBA, EIOPA, and ESMA 2014). We also remark on the very small number of points upon which historical VAR or ES depend, roughly from two or three to ten points.

Note that tail risk metrics do not depend on non-tail data, therefore the behaviour of most of the data is irrelevant — only the outliers count. We shall see later how this influences our analysis. To anticipate, this makes us more interested in methods that are sensitive to outliers and less interested in statistically robust methods. We have the opposite requirement from usual since we want to be tail-sensitive.

Metric  Use  Source      holding period  window length  stress?       obs   %-ile  # in tail
VAR     MR   BCBS-128    5/10/20†        recent 1Y      NA            260   99     2.6
SVAR    MR   BCBS-158    5/10/20†        ≥1Y            Y             ...   99     ...
VAR     IM   BCBS-261    10              ...            ≥25% by AC    780–  99     7.8–
VAR     IM   Art.11(15)  10              ...            ...           ...   99     ...
ES      All  BCBS-265    5/10/20/1Y      1Y etc.        varies        260–  97.5   6.5–

Table 1: Regulatory uses of VAR, and ES equivalent (last row). Percentiles are one-sided, and VAR calibrations must be updated at least quarterly (or for Art.11(15) every six months). Y = Yes; AC = Asset Class; NA = Not Applicable. † Depends on use case (repo / usual / secured lending), and can be increased if there are collateral disputes. Generally the holding period given is the minimum, which assumes daily remargining where appropriate. Holding period can be increased because of collateral disputes in some cases.
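As a check on the last column of Table 1, the expected number of observations in the tail is simply the number of observations times one minus the one-sided percentile:

\[ n_{\text{tail}} = n_{\text{obs}}(1-p): \quad 260 \times (1-0.99) = 2.6, \quad 780 \times (1-0.99) = 7.8, \quad 260 \times (1-0.975) = 6.5. \]

Hence historical VAR and ES rest on a handful of extreme points.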
Data cleaning covers false data identification and missing data construction. In this section we quantify the scope of missing data in one major asset class and market (USD senior unsecured CDS, i.e. USD-SU-CDS) where the extent of missing data is perhaps the most significant for regulatory capital (i.e. CVA VAR). We look at both bulk USD-SU-CDS and GIIPS Euro-Sovereign CDS because there was both a financial crisis in 2008-9 and a Euro-Sovereign crisis in 2011-13. The objective of this paper is not to propose any particular false data identification method or missing data construction algorithm. Anything we proposed, however good, would only have a partial take-up amongst banks. Instead we make the case for standardized data and standardized Data Models — for regulatory purposes — by quantifying the scale of the problem open to subjective solutions. The next section on Data Models complements this section, where we quantify missing data effects, by quantifying the effects of data cleaning on VAR and ES and the effects of different Data Models as well.

The quantitative effects of false data identification and missing data construction have previously been neglected for regulatory VAR and ES, especially with respect to stress windows. This is despite the fact that they are highly significant for tail metrics and that stressed VAR is usually by far the larger of VAR-derived capital requirements. We quantify the effects of false data identification and missing data construction for USD interest rate data in the next section on Data Models (we require a Data Model before we can calculate an effect).

The effect of noise for VAR- and ES-based portfolio optimization has been observed (Kondor, Pafka, and Nagy 2007; Lim, Shanthikumar, and Vahn 2011). Since these effects are so significant we may ask whether we should prefer a model-based approach over historical data.
Figure 1: Daily Fed Funds Effective Rate from January 1972 to present, from a major data provider. Note that this data has not been altered at all after download. There appears to be a change in quality (different fuzziness) around the turn of the millennium.

The extreme values reached during the 2008–onwards crisis have pushed attention towards historical approaches (e.g. in BCBS-261 2013), because model-based approaches may have been perceived to have failed. Indeed the model governing tail events may not be the same as the model governing non-tail events, so a model-based approach is not an automatic solution. Assuming that we do use an historical approach we must be aware of its limitations, and deal with them: this is the subject of the current paper.
Obviously there should only be one source of clean data in a bank, and the cleaning algorithms should be governed in the same way as pricing algorithms. For capital and funding, VAR and ES are part of derivative manufacturing costs specified in regulations and so feed directly into pricing.

If data is not clean, or cleaning is not subject to governance, the odds are high that VAR and ES results have major artefacts subject to arbitrary revision. As a complicating factor, commercial data providers update their data cleaning methods from time to time. Consider Figure 1, showing the daily Effective Fed Funds Rate. Using single high or low points as an indicator of clean data, it is apparent that there was a change around the year 2000. Given that this was the turn of the millennium, it could be that all the data participants improved their systems, or that the data provider updated their cleaning algorithms. We will also use this dataset in the next section on Data Models.
In this section we quantify the extent of the missing data issue for CVA VAR for one important section of the CDS universe, USD senior unsecured (SU) CDS. We consider both bulk CDS (all names) and sovereign (GIIPS) CDS. We pick credit because the missing data issue is most significant for this asset class amongst the regulatory asset classes (Rates, FX, Credit, and Commodities) in a major market (USD). We used CDS data from a major CDS supplier and a report from that supplier where the most liquid CDS for each entity was given. Hence there are different document clauses in the data used, but this is not part of the analysis.

One motivation for the introduction of CVA VAR capital was that 2/3 of losses over the crisis were credit losses that were not defaults (BCBS-189 2011). The data behind this statement has not been made available but our analysis will shed some light on the accuracy limits of the statement.

To identify appropriate stress periods, Figure 2 shows median and 90th percentile CDS spreads over the USD-SU universe (top row), and GIIPS (Greece, Ireland, Italy, Portugal, Spain) sovereign CDS data. We call the first stress period "Financial" and the second stress period "Euro-Sovereign". The higher red lines show the whole universe, whilst the lower black lines show the values for the CDS that were available on Jan 1st 2014 (i.e. currently). Roughly 20% of the CDS names currently available are not available at all over the Financial stress period. During the height of the stress roughly 30% of CDS names currently available have gaps that cover the entirety of regulatory VAR or ES tail requirements (three points). CDS names here correspond to legally identifiable entities. During the Euro-Sovereign stress period only Greek CDS were missing, but these were absent for 42% of the period.

Figure 2: TOP Left shows the median (lower lines) and 90th percentile (upper lines) CDS spread of the USD-SU universe for ten years after 2004. Red lines cover the universe, black lines the CDS that were available on Jan 1st 2014. Peak market credit stress is from roughly Sept 2008 to Aug 2009. TOP Right shows the percentage of CDS that were available on Jan 1st 2014 that have at least three missing data points in a 10 business day window over the peak stress period. BOTTOM Left shows GIIPS sovereign CDS, each daily observation is one dot. Note that the vertical axis is limited to 20% for ease of comparison — Greek spreads went far higher. BOTTOM Right shows the percent of data missing 2004–2014:

Sovereign   % Missing
Greece      13
Ireland     11
Italy       0
Portugal    0
Spain       0

During the crisis period 2011–2013 only Greece had missing data (42%).

A US bank may have as their single stress period 2008-9, whereas a Euro-focussed bank could well have their single-year stress period within 2011-13. Bonds are not a substitute for CDS over the stress period, because the stresses combined liquidity (of different sorts, inter-bank, and Euro-Sovereign) as well as credit. Credit and liquidity effects on bonds are difficult to separate. Thus we have no suitable substitute. Even if we did have a substitute for CDS spreads, the regulations (BCBS-189 2011) explicitly require the use of CDS spreads for CVA VAR. Given that CDS provide explicit relief from CVA VAR capital (where not exempt, as in Euro-Sovereigns under CRD IV) it is natural to ask how this may have distorted their interpretation as market-implied default probabilities. (Kenyon and Green 2013a) finds that up to 50% of the CDS spread may be a payment for capital relief rather than default protection.

This lack of data has been recognised qualitatively and the regulations permit the use of proxies based on Sector, Rating and Region. As also previously pointed out (Chourdakis, Epperlein, Jeannin, and McEwen 2013), the CDS universe is insufficient to provide coverage using this approach. Technical suggestions for alternate proxy construction are available, e.g. (Chourdakis, Epperlein, Jeannin, and McEwen 2013). However, in all these cases we are essentially constructing a mapping which will lack the idiosyncratic risk and create mapping risk.
Given that 30% of the CDS universe is missing for the stress period, it is possible that significant systematic risks are also missing.

Legally Unique CDS Names                              Number Available
quoted at least once 2004–2014                        2958
available 1st Jan 2014                                1527
at some point in stress period Sept 2008–Aug 2009     78%
throughout stress period                              68%

Table 2: Availability of USD Senior Unsecured CDS for Stressed VAR or Stressed ES. The table specializes as it goes down, so the percentage rows refer to the 1527 names currently available, not to all the names that have ever been observed.

Given the extent of the missing data, validation of reconstructed data may be problematic. We term these risks mapping risk and idiosyncratic reconstruction risk. The scale of the problem is much larger than we have indicated because most banks have tens of thousands of counterparties, so most will not be covered by currently available, or historically available, CDS.

We conclude that the extent of the missing data in the CDS universe, summarized in Table 2, is such that there will be wide variation between institutions in their reconstruction of the missing data. Next we turn to the quantitative effects of data cleaning, so we move to Data Models.
Data Model
By Data Model we mean how time-series data is transformed into an input distribution D for use in VAR and ES. For pricing lifetime capital and lifetime funding we need the input distributions given simulated future market states as well as the current market state. Three typical Data Models are described below as examples: absolute, relative, and level-relative (a code sketch of these three transformations is given after the worked example below). Again we do not imagine that this encompasses the creativity of data modelers; it is simply sufficient to quantify the output range of a set of common choices.

We suppose that the originally observed daily time series is {x_i : i = 1, ..., n}. n will typically be around 260 for a year of observations. We further suppose that the calculation interval is m days, which is typically 10 days (although it can be as short as five or as long as 20 or longer in some cases, e.g. if there are disputes on collateral calls). The elements Δ_i of the input distribution, D, for VAR and ES are then generated as follows.

Absolute
\[ \Delta_i = x_i - x_{i-m}, \quad i = m+1, \ldots, n \]

Relative
\[ \Delta_i = \frac{x_i - x_{i-m}}{x_{i-m}}, \quad i = m+1, \ldots, n \]

Level-Relative
\[ \Delta_i = f\!\left(l_i,\; l_{\text{now}},\; \frac{x_i - x_{i-m}}{x_{i-m}}\right), \quad i = m+1, \ldots, n \]

where l_i describes some state of the market calculated on data up to t_i (which will be some date relevant to the observed daily time series), and l_now describes some state of the market calculated on data up to now (i.e. the date of the VAR or ES calculation). f() is the transformation of the relative difference according to the two market states.

For Level-Relative, l will be a metric giving the level relevant to the relative difference. In the case of a linear level function we have:
\[ f_{\text{linear level-relative}}(l_i, l_{\text{now}}, y_i) = y_i \times \frac{b\,l_{\text{now}} + a}{b\,l_i + a} \]
Alternatively, for a quadratic level function we have:
\[ f_{\text{quadratic level-relative}}(l_i, l_{\text{now}}, y_i) = y_i \times \frac{c\,l_{\text{now}}^2 + b\,l_{\text{now}} + a}{c\,l_i^2 + b\,l_i + a} \]
where a, b, c are the coefficients of the linear and quadratic level functions, and l_i is the level relevant for the observation y_i at t_i. See below for an example, e.g. Figure 3.

In addition to normal use, without a Data Model we cannot quantify the effect of missing or false data. We know that data is altered in the original time-series but we would not know what the effect would be on the input distribution D, and hence on VAR and ES.

Intention and Expression

The Data Model combines two distinct aspects, intention and expression.

• Intention of the transformation. For example an input distribution D created for SVAR would aim to preserve the stress in the original observations.

• Requirements to express the intention given market characteristics. For example, if there is a natural finite scale for value changes. In this case, as the market level decreases, relative volatility will increase for constant market stress.
Considering the SVAR example, the aim would be to express the same amount of stress but conditioned on current market state. Hence the Data Model can be embodied by a transformation function T:

\[ T : (\mathbb{R}^{n_o}, \mathbb{R}^{n_c}) \times \mathbb{R}^{n_u} \mapsto D(n_d) \]

That is:

T : (observations(t_o), market state(t_c)) × market state(t_u) ↦ distribution(t_u)

where:
n_o — number of observations;
n_c — numbers characterizing the market state relevant for the observations;
n_u — numbers characterizing the market state relevant for use of the observations;
n_d — number of points in the empirical (historical) distribution D;
t_o — observation start;
t_c — start time for market state relevant for observations;
t_u — start time for market state relevant for creation of the shock distribution, i.e. use time.

All transformations implicitly identify a set of intentions and expressions, including assumptions on how a given market behaves. What we do here is make these intentions and expressions explicit. For example, using absolute differences implicitly assumes that the market is level-independent. Using relative differences assumes that the market has a linear relationship with level that has no offset (i.e. the linear relationship goes through the origin). We note that many authors have studied physical-measure market dynamics (we described several in Section 1.2), but as far as we are aware they have not applied their results to tail metrics.

Typical intentions include:

• preserve the market state of the observations;

• preserve the current market state.

However, without knowing how a market behaves given constant state, e.g. not-stressed, it is not possible to specify the transformation T that will express these intentions effectively. Notice also that for tail metrics, e.g. VAR(99%), we are only interested in the transformation of one specific point (the 99th percentile). The behaviour of the market in general, i.e. the majority of the data, is irrelevant.

Stressed markets typically depict a mixture of location and scale changes. Consider the Effective Fed Funds rate in Figure 1. The Oil Shock of 1972 was mostly depicted by a level increase, whereas the financial shock of 2008 involved a level decrease. The inflation period of the mid 1980s involved both a level increase and a scale increase. We will go into detail below.

We would like to be able to provide statistical validation of Data Models. However, given the lack of data, e.g. no USD swaps market data prior to 1989, or the existence of only one period of low rates after that date, this is unrealistic. We can provide assumptions which can help, e.g. that different countries' behaviours are the same, but these assumptions themselves are difficult to validate. This highlights again the subjective nature of Data Models.

As mentioned above, without a Data Model we cannot quantify the effect of missing or false data on VAR and ES. Here we do a set of examples using the daily Effective Fed Funds data shown in Figure 1. In the next section we combine this with a detailed quantitative analysis of daily USD interest rates bootstrapped from OIS swaps and Libor swaps. The aim in this section is to illustrate, qualitatively, the interaction of Intention, Expression, and Data Cleaning. We assume that we are interested in building the input distribution D for the current date (i.e. at the end of the illustrated data).

• Step 1: characterize the market. We choose to characterize it in terms of level, and state-and-scale (of movements).
– Level: we observe that the data ranges between roughly 0% and 25%, thus we may decide to reject any model which produces negative values in the shock distribution D when applied to the current market state. (Alternatively we might decide on a fixed lower bound, e.g. −1%, from some additional knowledge of the market characteristics, e.g. announcements from the Fed about their willingness to implement negative rates.)

– State-and-Scale. We assume that the market has stressed and non-stressed periods (e.g. regime-switching). After August 2009 we know from the USD CDS universe that there is relatively little systematic credit stress on the US market. Thus at low (< ...) levels we observe a non-stressed market.

• Step 2: Define Intention. Let us assume that we want to create an input distribution D for SVAR. That is, we want to preserve the stress in any chosen observations.

• Step 3: Characterize Expression. We must ask how stress is expressed at the current (low) market level. Given that we have no stressed data at the current (low) market level, we can instead ask the opposite question: how is a non-stressed market expressed?
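The three example Data Models can be sketched directly in Python (our illustrative code; the function names, the `f` interface, and the placeholder coefficients are assumptions, with the quadratic form following the level functions defined above):

```python
import numpy as np

def f_quadratic(l_obs, l_now, y, a=0.0, b=1.0, c=0.0):
    """Quadratic level function: rescale a relative difference from the
    level at its observation date to the current level. The coefficients
    are placeholders; in practice they come from fits such as Figure 3."""
    return y * (c * l_now**2 + b * l_now + a) / (c * l_obs**2 + b * l_obs + a)

def input_distribution(x, m=10, model="relative", f=f_quadratic, l_now=None):
    """Build the input distribution D from a daily series x under one of
    the three example Data Models (absolute, relative, level-relative)."""
    x = np.asarray(x, dtype=float)
    diffs = x[m:] - x[:-m]                  # m-day differences
    if model == "absolute":
        return diffs
    rel = diffs / x[:-m]                    # m-day relative differences
    if model == "relative":
        return rel
    # level-relative: here the level l_i is proxied by the starting rate
    # of each difference, itself another modelling choice
    return f(x[:-m], l_now, rel)
```

For SVAR, x would be the stress-window series and l_now the current market level; the point of the analysis below is that each of these choices changes the tail of D, and hence VAR and ES, materially.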
We focus here on USD interest rate data from 1989 (start of extensive swaps data) through 2014, specifically OIS swaps, 3M deposits, and Libor swaps out to 30-year maturity. For 1989–2004 we assume that the market priced using a single-curve approach and for 2004–2014 we assume a multi-curve approach (discounting and spread), so that we can get a picture of both discount and spread behaviour. The potential mixture of pricing approaches from 2004–2007, i.e. different banks were probably using different approaches, is not an issue because the observed spreads (OIS to Libor) were very low. This data is not complete nor is it clean. Appendix 1 describes the completion and cleaning procedure. We include cleaning effects explicitly via an analysis of Effective Fed Funds data, for simplicity (because this involves a single time-series, not a curve). Although the Effective Fed Funds data might be expected to be high quality (i.e. very clean), it is not obviously so.
IRS Tenor-Point Volatility vs Level
We analyse 10-day changes of tenor points on the discount and spread curves. Discounting and projection curves were bootstrapped from OIS, 3M deposits and Libor swaps out to 30 years. Futures were not included for simplicity and because these do not have fixed tenors (they are relative to calendar points). Figure 3 shows standard deviations of relative differences and absolute differences versus level. The interest rate range was divided into 25 basis point (bp) sections and the standard deviation of the differences at each level calculated. The level was determined as the median of the levels that started in each 25bp bracket.

We use standard deviation as our scale metric because it is sensitive to outliers, and we are interested in the behaviour of the tails because we are studying VAR and ES. A robust scale metric, e.g. Median Deviation about the Median, would therefore be inappropriate. From Figure 3 we make two tentative conclusions.

• Spread 2004-2014 (Projection minus Discounting) versus level is consistent with a log-Normal model because the curves in the top-left panel (relative differences) are roughly flat relative to the range. This determination is consistent with the top-right panel (absolute differences), which shows roughly linear curves with a possible intercept around zero (single linear fit: gradient 95% confidence interval includes zero).

• Discount 1989–2014 versus level is consistent with a quadratic-level-dependent log-Normal model because each tenor line on the bottom-left panel is curved and they are relatively close together (single quadratic fit: p-values for both linear and quadratic terms better than 1e-30). The bottom-right panel (absolute differences) supports this determination because it shows no clear pattern.
Figure 3: Standard deviation of relative and absolute 10-day differences for spread and discount tenor points on USD interest rate curves. Tenors used are {...}-years. Panels: Spread, Relative Diffs (top left); Spread, Absolute Diffs (top right); Discount, Relative Diffs (bottom left); Discount, Absolute Diffs (bottom right). Top panels spread (i.e. projection minus discount), lower panels discount (from OIS). Black data is for 1989-2004 (Libor discounting, no spread) and blue data is for 2004-14 (OIS discounting, and separate spread).

Note that the 1989-2004 data (black lines) and 2004-2014 data (blue lines) do not show gaps where they overlap. This is consistent with market discounting behaviour being as we have assumed (i.e. single-curve up to 2004 and multi-curve after 2007, with mixed, but insignificant, behaviour 2004-2007).
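The Figure 3 construction can be sketched as follows (our illustrative code; the function name and variable names are assumptions). Standard deviation is used deliberately, as in the text, because it is outlier-sensitive:

```python
import numpy as np

def sd_vs_level(series, m=10, bucket_bp=25):
    """SD of m-day relative differences, bucketed by 25bp starting level.

    Returns (median starting level, SD) pairs, one per populated bucket,
    mirroring the construction behind Figure 3.
    """
    x = np.asarray(series, dtype=float)
    start = x[:-m]
    rel = (x[m:] - x[:-m]) / x[:-m]
    width = bucket_bp / 1e4                    # 25bp in absolute rate units
    buckets = np.floor(start / width)
    out = []
    for b in np.unique(buckets):
        sel = buckets == b
        if sel.sum() > 1:                      # need >= 2 points for an SD
            out.append((np.median(start[sel]), rel[sel].std(ddof=1)))
    return np.array(out)                       # (level, SD) pairs
```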
Stressed VAR(99%) and stressed ES(97.5%)
Table 3 shows SVAR(99%) for swaps of maturity 1-year to 30-year from two 1-year windows (2008-9 and 2011-12), both calibrated for use at the start of 2014. The 2008-9 window is for a bank whose single stress period is the Financial crisis. The 2011-12 window is for a bank whose single stress window is the Euro-Sovereign crisis. There are differences because the USD swap market was not stressed during the latter window. However this is secondary, from the point of view of a single institution, to the major differences due to the Data Models.
                   relative             level-relative/relative   absolute/relative
                   (% of notional)      (ratio)                   (ratio)
Window   Maturity  SVAR(1%)  SVAR(99%)  SVAR(1%)  SVAR(99%)       SVAR(1%)  SVAR(99%)
2008-9   1         -0.12     0.08       63.       115.            636.      592.
2008-9   2         -0.36     0.21       121.      143.            267.      526.
2008-9   5         -2.16     1.67       102.      165.            144.      190.
2008-9   10        -5.61     5.68       93.       152.            107.      157.
2008-9   20        -10.19    13.08      86.       133.            91.       123.
2008-9   30        -14.24    18.99      81.       134.            84.       119.
2011-12  1         -0.09     0.04       106.      224.            105.      161.
2011-12  2         -0.47     0.42       90.       109.            101.      88.
2011-12  5         -1.76     2.27       95.       101.            105.      96.
2011-12  10        -3.59     4.61       99.       114.            86.       113.
2011-12  20        -5.79     8.14       102.      118.            81.       115.
2011-12  30        -7.73     11.17      107.      121.            77.       113.

Table 3: Data-Model dependence of 10-day VAR for USD interest rate swaps for the Financial Crisis window (2008-9) and the Euro-Sovereign Crisis window. Data Models for differences are: relative; level-relative (i.e. standard deviation scales inversely with level, see text and Figure 3); and absolute.

Financial crisis window

• With a Data Model of relative differences we can observe that 10-day VAR on a 1-year IRS is roughly 10bps of notional upfront (depending on swap direction), whereas this is 14% or 19% of notional upfront for a 30-year swap.

• A level-relative Data Model has VAR up to 60% higher (for a 5-year swap) or 40% lower (1-year swap) than a Data Model that uses relative differences.

• A Data Model using absolute differences shows much larger differences at the short end (> ±...).

• Although we do not show ES(97.5%) for reasons of space, it is very close to VAR(99%) using this cleaned interest rate data. The range of differences is roughly ±...; for cleaned data the ES(97.5%) calibration is appropriate with respect to VAR(99%). However, see below for data cleaning effects — practically speaking this similarity may be purely because of the data cleaning.

Euro-Sovereign crisis window

• Apart from the 1-year maturity, all the Data Models agree to within 25% on SVAR(99%). This is purely because the stress window is at the same rates level as the current market. This is the best case for agreement between the Data Models and only applies to spot SVAR and ES. Lifetime SVAR and ES will have significantly different results because of scaling differences between the Data Models for higher rates levels. Agreement on spot SVAR and ES is no predictor of agreement on lifetime values.
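Putting the pieces together, a stressed VAR under a chosen Data Model composes the two sketches above. The following fragment is illustrative only: it uses a synthetic stress window, a crude fixed-sensitivity P&L map, and assumes `var_es` and `input_distribution` from the earlier sketches:

```python
import numpy as np

rng = np.random.default_rng(1)
# synthetic, always-positive 1Y series standing in for a 2008-9 stress window
stress_window = 0.04 * np.exp(0.02 * np.cumsum(rng.standard_normal(260)))
l_now = 0.005                                  # current (low) rate level

D = input_distribution(stress_window, m=10, model="level-relative", l_now=l_now)
pnl_losses = 5.0 * l_now * D                   # toy fixed-sensitivity swap loss
svar99, es975 = var_es(pnl_losses)             # SVAR(99%) and ES(97.5%)
print(svar99, es975)
```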
Data Cleaning and Extended Rates Levels
We now analyse the daily Effective Fed Funds rates 1972–2014 to get an idea of a wider rates range (up to 20%) and compare with interest rates 1989–2014 from swaps data (discount rates).

Figure 4: LHS Standard deviation of relative 10-day differences for discount tenor points on USD interest rate curves (black = swaps/Libor data 1989–2004, blue = swaps/OIS data 2004–2014) compared with standard deviation of relative 10-day differences for cleaned Effective Fed Funds rates (red, data from 1972–2014). Solid red curve is quadratic fit to not-cleaned Effective Fed Funds (EFF) data from 0% to 15%. Dashed red curve is similar quadratic fit to cleaned Effective Fed Funds data. Solid green curve is quadratic fit to cleaned swaps/Libor and swaps/OIS data together. Dashed green curves are two possible extrapolations: quadratic and flat. RHS purple: ratio of not-cleaned EFF Standard Deviation vs level to cleaned EFF Standard Deviation vs level. RHS green: ratio of not-cleaned EFF Standard Deviation vs level to cleaned swaps/Libor and swaps/OIS Standard Deviation vs level.

Figure 4 (LHS) shows 10-day relative differences for Effective Fed Funds (EFF) rates and USD discount rates together with interpolating quadratic curves and extrapolating curves (quadratic and flat). The RHS plot shows the relative standard deviations between not-cleaned and cleaned EFF rates, and between not-cleaned EFF rates and discount rates from swaps, using ratios of the quadratic fits.

• A quadratic curve is a good fit (p-values for linear and quadratic terms better than 1e-10) for all three data sets from level of 0% to level of 15%: not-cleaned EFF (not shown), fit is solid red curve; cleaned EFF (shown as red line), fit is dashed red curve; and cleaned swaps discount data (black and blue data), fit is dashed green curve.

• The quadratic fits to the data (SD vs level) are significantly different: maximum differences more than 40% to 250% around levels of 6% to 8%.

• Extrapolating flat from the discount data from swaps gives vastly different results from a quadratic extrapolation.

• Beyond the 15% level the EFF standard deviation goes down. It does not fit a quadratic curve nor does it fit a flat extrapolation. This could be because of lack of data or a genuine change in behaviour.

Cleaning Fed Funds data consists of removing isolated points, i.e. a point that jumps up and then down on the next day (to within 10% of the previous value). Degrees of cleaning are possible by considering a jump that goes up-then-down over more than one day. The data was cleaned for up to five-day up-then-down moves. There are an infinity of possible cleaning procedures; we use this one as a bound on the effects of cleaning (a sketch is given below).

The difference between the EFF quadratic fits and the discount fits could be because of tenor, although the discount data covers 3M to 30Y consistently (although the embedded credit and liquidity risk is always 3M). Alternatively it could be because EFF simply represents a different market. It could also be because there is some systematically different effect of the EFF cleaning versus the cleaning of the swaps data. These two are different because with swaps there is curve information as well as temporal information. Undoubtedly readers will be able to add to this list. In any case it suggests caution about combining data sources for quantitative use, even from the same currency.

Figure 5: Relative effect of data cleaning on VAR(99%) and ES(97.5%) for daily Effective Fed Funds rates; the plotted quantity is the ratio (ES: dirty − clean)/(VAR: dirty − clean). Running one-year window. When there is a significant difference between VAR for clean and not-cleaned data we look at the relative change of ES (to avoid dividing by zero).
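The isolated-point cleaning rule described above (up-then-down moves of up to five days, returning to within 10% of the starting value) can be sketched as follows. The thresholds and the interpolation repair are our illustrative assumptions; the text uses such a rule only to bound the effects of cleaning:

```python
import numpy as np

def clean_spikes(x, max_width=5, tol=0.10, min_jump=0.25):
    """Remove isolated spikes: a run of up to max_width points that jumps
    away from the previous value (by more than min_jump, relative) and
    returns to within tol (10%) of it. The run is replaced by linear
    interpolation between its neighbours."""
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    i = 1
    while i < n - 1:
        repaired = False
        for w in range(1, max_width + 1):
            j = i + w                                  # first point after the run
            if j >= n:
                break
            jumped = abs(x[i] - x[i - 1]) > min_jump * abs(x[i - 1])
            returns = abs(x[j] - x[i - 1]) <= tol * abs(x[i - 1])
            if jumped and returns:
                x[i:j] = np.linspace(x[i - 1], x[j], w + 2)[1:-1]
                i = j
                repaired = True
                break
        if not repaired:
            i += 1
    return x
```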
Data Cleaning versus VAR(99%) and ES(97.5%)
Figure 5 shows the relative effects of data cleaning on VAR(99%) and ES(97.5%) for daily Effective Fed Funds rates. It uses a one-year rolling window for the VAR and ES computation. Roughly 70% of the time the ratio is less than one, showing where ES is less affected by data cleaning than VAR. However, on 30% of the dates ES is much more affected by the data cleaning than VAR. In fact on those dates it is common to see ES twice as sensitive to the cleaning as VAR.

For rates levels above 10% we may only have stressed data, e.g. from the Oil Shock or 1980s Inflation shocks. Any model we consider must make some assumption for the discount rates Data Model above 10%. We could argue for flat extrapolation from the levels up to around 10%. Alternatively we could argue for quadratic extrapolation as suggested by EFF data. This may seem immaterial today; however, for lifetime capital costs it is quite possible to get scenarios with high rates levels. These scenarios affect today's pricing for collateralized interest rate swaps where there is initial margin (Green and Kenyon 2014), i.e. when traded through Central Counterparties, or for lifetime capital hedging (Green, Kenyon, and Dennis 2014). These prices in turn affect collateral calls — including those by the Central Counterparty (Kenyon and Green 2013b). Alternatively we could argue that whenever future rates levels are above 10% in simulations then we are in a stressed situation, so we do not need historical data.

The number of alternatives, and their materiality, prompts the following Proposals.
The authors in their personal capacity (i.e. with no other representation) make the following proposals for improvement of VAR and ES tail risk metrics.

1. Standard data: clean, complete, and common. That is, all banks (or other users) use identical data. This can be regulated in the same way that regulations like Basel III are administered. In fact, given the range of results we have demonstrated with typical choices, we would argue that standard data should always be part of capital and funding regulations.

2. Common Data Models. That is, all banks (or other users) use identical Data Models. The required coverage should be limited to where there is reasonable data, and a lookup table approach adopted otherwise. The prospective Prudent Valuation can be included by specifying a set of Data Models, as it requires (EBA 2013).

3. Lookup tables for VAR and ES by tenor based on underlying level for USD rates. For other cases the procedure shown here should be used to identify the key relationships. The prospective Prudent Valuation can be included by specifying a set of lookup tables (depending on use). Several tables are required because different banks are sensitive (as in stressed VAR) to different periods.

Given that we are looking at tail events over a short period, generally 1-year, and used for a much shorter period (generally 10 days), we are reluctant to propose any principal component based analysis because the percentage of movements described by a fixed number of factors usually decreases with window length. Additionally, by definition the stressed 1-year window is not representative of non-stressed conditions. Put simply, we do not see the necessity of another level of modeling, or any added value.

These standardisation proposals may seem to create model risk. This is not the case because we follow the prospective Prudential Valuation in proposing using more than one model. We would also propose that all models be supported by evidence and derived transparently. This external standardization also avoids any potential conflict of interest between capital derived from Risk Departments and their business impact within a bank.

These standardization proposals offer significant efficiency in terms of complexity because all banks would be required to follow the same set of Data Models and lookup tables.

These proposals would require significantly more effort, and potentially new quantitative skills, from regulators. However, we have demonstrated that the alternatives are quantitatively highly diverse.

In short, the problem with tail metrics is not the choice of metric, but the data cleaning and the Data Models, before the data gets to the metric.

We add a final proposal to remove the pro-cyclicality in current risk calculations using sums of unstressed and stressed results (see Appendix 2):

• Replace the sum of unstressed and stressed results by twice the maximum of the two. This preserves the quantitative level at the height of the next crisis and removes the procyclicality that is otherwise driven by the cyclicality of the non-stressed results.
Contrary to usual perception, we have demonstrated that historical VAR and ES, and their stressed counterparts, are not objective measures of tail risk because of their critical dependence on data cleaning and Data Models. Data cleaning and Data Models are subjective choices. These tail metrics are not even immutable: any change of data cleaning, or — especially for CDS — coverage, by data providers makes comparisons problematic.

To provide consistency of historical VAR and ES we propose standardization of data (i.e. clean, complete, and common) and standardisation of Data Models. These must be included in capital and funding regulations because they have such quantitatively significant effects. Where there is insufficient data or a requirement for simplicity we propose sets of look-up tables adapted to the different windows banks must use according to the regulations described in Section 2. By common data and Data Models we mean identical for all users. To avoid model risk we follow the prospective Prudential Valuation (EBA 2013) in proposing using a set of models, derived transparently from the standard data.

Model-based VAR and ES provide no advantage with respect to data cleaning and Data Models because these create the input distribution that model-based approaches must calibrate to.

There are three main determinants of risk tail metrics: data cleaning; Data Model; and choice of VAR or ES. All three have highly significant effects (50% is not uncommon out to 10 years) for USD interest rate swaps from 1-year maturity to 30-year maturity. We do not expect the results to be qualitatively different for other instruments or currencies because data cleaning and Data Models are initial steps that cannot be avoided in any analysis. Our data cleaning removed outliers and so with clean data we saw no difference between VAR(99%) and ES(97.5%). VAR(99%) is already the robust (i.e. median) estimator of losses above VAR(98%), so the added information from the loss estimation property of ES(97.5%) is unclear. It is also unclear whether ES adds anything that is not data-cleaning dependent. With standard data this might be an area to revisit. We would also query the need to move to ES for regulatory purposes given that netting sets are legally defined — it is not clear that the coherence property of ES is relevant here. Given that ES cannot be used for backtesting, its additional value is also questionable.

The numerical results can be queried with respect to details of our data cleaning procedures and choice of Data Models: this is precisely the point. We have demonstrated that the details of data cleaning and choice of Data Model are critical — and subjective — for risk tail metrics. We have presented what we consider to be reasonable choices, but our choices can certainly be debated. Hence, a significant quantity of capital is riding on subjective choices.

It can be argued that it is easier to make results look bad, i.e. getting a wide range of quantitative outcomes here, than it is to make results look good, i.e. getting a small range of quantitative outcomes. However, we have not used models that are outside the literature. We would also point to the comparative RWA study that found a similarly wide range of outcomes in practice for identical portfolios (BCBS-240 2013).

Robust statistics is a well-developed field (Huber and Ronchetti 2009) and may appear to provide another route forward. For example median deviation could be used instead of standard deviation.
However, the basic problem for historical VAR/ES is that robust statistics have limited effect without additional assumptions on the Data Model. A further issue is that this would add another level of subjective model choices: why this robust statistic and not another one? Additionally, we are interested in the tails, which are, by definition, not very robust. For the tail metrics themselves, as a robust proposal we could advocate using the median loss greater than a threshold calibrated to be equal to the current VAR. This, of course, would be the median loss above 10-day 98% VAR, i.e. 99% VAR. Thus the current VAR specifications can — already — be viewed as robust estimators of lower percentile tail losses.

The tension between tail measures and model specification (not-cleaned data is an example of a mixture model) was previously investigated by (Dell'Aquila and Embrechts 2006) amongst others. For financial and regulatory purposes their approach of using sophisticated modeling is not appropriate because it loses simplicity and transparency. In addition the validation problem remains, given that we are not only dealing with extremes but with very limited data. Standardisation is robust, modeling is diverse.

In conclusion, we have analysed the data cleaning and Data Model dependencies of VAR and ES, and proposed methods for their resolution. We proposed standardized data (clean, complete, and common) and standard Data Models; or using lookup tables derived from transparent analysis of market behaviour based on standard data. We also proposed that this standard data and the Data Models be part of capital and funding regulations, just as other numerical items are. The problem with tail risk metrics calculation is not the metrics — it is everything before the data gets to the metrics.
Appendix 1: Data Cleaning for USD Interest Rates
USD interest rate input data is OIS swaps, 3M deposits, and Libor swaps. The data is cleaned before bootstrapping into discount and projection curves. There are two types of cleaning: by date and by instrument. Since we have a curve of data per date, we do each date first, filling in missing data, and then look for false data by instrument.

Data cleaning is ad hoc almost by definition. We provide this mechanism as one example with no claim to optimality.

On each date:

• Interpolate missing instrument values using a monotonic cubic spline (Hyman 1983).

• Extrapolation:

– If there is a gap of one or two days then linearly interpolate across time. If this is done then re-apply interpolation as there may now be an interpolation opportunity.

– Otherwise extrapolate flat for Libor swaps. For OIS swaps take the last spread to Libor and extrapolate assuming a constant spread.

For instrument data we apply the following cleaning. This aims to remove isolated points that are bad, i.e. a jump away from previous values followed quickly by a reverse jump.

• Check for evidence of bad data. For each instrument:

– Calculate the standard deviation (SD) of the differences, SD(all).

– Remove the biggest r% of the absolute differences, re-calculate the standard deviation, SD(some), and record the ratio SD(all)/SD(some) = ratio(observed).

– Calculate the ratio for a Standard Normal (SN) distribution with the same number of data points, ratio(SN), 256 times.

– If ratio(observed) is greater than Mean(ratio(SN)) + 5 SD(ratio(SN)), then assume there is bad data.

• If bad data is detected then:

– Record the r% quantile of the absolute differences, q(r).

– Replace any point where it differs from both the point before and the point after by more than q(r), and where the differences either side have different signs. Replace with the average of the points either side.

– Also replace points that jump, stay for one observation, then jump back.

We use an r% such that we would expect it to remove one good point, if there were no bad data points, per couple of thousand data points. r% = 3% provides roughly this level of security. Since we are (mostly) interested in single years (i.e. 260 points) this is expected to preserve VAR. Of course changing even one outlier may alter ES; this is a fragile tail metric.
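A sketch of the instrument-level bad-data test above, in Python (ours; names are illustrative, with k = 5 and 256 simulations as in the text):

```python
import numpy as np

def has_bad_data(series, r=0.03, n_mc=256, k=5.0, seed=0):
    """Flag evidence of bad data in one instrument's time series.

    Compares SD(all diffs)/SD(diffs with the biggest r% of absolute
    differences removed) against the same ratio for Standard Normal
    samples of equal size, simulated n_mc times; flags bad data when the
    observed ratio exceeds mean + k * SD of the simulated ratios.
    """
    rng = np.random.default_rng(seed)
    d = np.diff(np.asarray(series, dtype=float))

    def trimmed_ratio(v):
        keep = np.abs(v) <= np.quantile(np.abs(v), 1.0 - r)
        return v.std(ddof=1) / v[keep].std(ddof=1)

    sims = np.array([trimmed_ratio(rng.standard_normal(len(d)))
                     for _ in range(n_mc)])
    return trimmed_ratio(d) > sims.mean() + k * sims.std(ddof=1)
```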
Appendix 2: Pro-Cyclicality of MR and CVA VAR Capital

MR and CVA VAR capital use the sum of VAR and SVAR. Thus at the peak of the next crisis, if it is similar to the last one in intensity, both of these capital terms will see 2 × SVAR. Now if we assume that the current SVAR capital is significant, then as the next crisis unfolds banks will need significant extra capital; in fact they will need exactly as much as they were considered to be missing in the last crisis. Thus, although the banks are safer from a default point of view, their capital problem is unchanged. This does not seem to be a desired objective of effective capital regulation.

Now suppose that the next crisis is sufficiently far away that banks have fully implemented their counter-cyclical capital buffers (CCB). If we assume that the CCB are the same size as SVAR-VAR effects, then the combination of VAR+SVAR and CCB is cycle-neutral. However, the same effect could be achieved by using the maximum of VAR and SVAR with no CCB. In any case VAR+SVAR reduces the effectiveness of the CCB, so we would question the utility of having both.

The simplest solution is to have the maximum of SVAR and VAR, with the CCB adjusted to take into account the reduction in opposition to counter-cyclicality that a cycle-neutral MR and CVA VAR produce.
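To make the pro-cyclicality concrete with illustrative numbers (ours, not from the text): suppose in calm markets VAR = 1 and SVAR = 3 in some capital unit, so the sum rule requires 1 + 3 = 4. At the peak of a crisis similar to the stress window, VAR rises to 3 and the requirement becomes 3 + 3 = 6, a 50% increase exactly when capital is hardest to raise. The proposed rule of twice the maximum gives 2 × max(1, 3) = 6 in calm markets and 2 × max(3, 3) = 6 at the peak: the crisis-peak level is preserved and the cyclicality is removed.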
References
BCBS-189 (2011). Basel III: A global regulatory framework for more resilient banks and banking systems. Basel Committee for Bank Supervision.

BCBS-240 (2013). Regulatory consistency assessment programme (RCAP): Analysis of risk-weighted assets for market risk. Basel Committee for Bank Supervision.

BCBS-261 (2013). Margin requirements for non-centrally cleared derivatives. Basel Committee for Bank Supervision.

BCBS-265 (2013). Fundamental review of the trading book - second consultative document. Basel Committee for Bank Supervision.

Chourdakis, K., E. Epperlein, M. Jeannin, and J. McEwen (2013, February). A cross-section across CVA. London: Nomura.

Cont, R., R. Deguest, and G. Scandolo (2008). Robustness and Sensitivity Analysis of Risk Measurement Procedures. Columbia University Center for Financial Engineering, Financial Engineering Report No. 2007-06, 1–33. Available at SSRN: http://ssrn.com/abstract=1086698.

DeGuillaume, N., R. Rebonato, and A. Pogudin (2013). The Nature of the Dependence of the Magnitude of Rate Moves on the Level of Rates: A Universal Relationship. Quantitative Finance 13(3), 351–367.

Dell'Aquila, R. and P. Embrechts (2006). Extremes and Robustness: A Contradiction? Financial Markets and Portfolio Management 20(1), 103–118.

EBA (2013). On prudent valuation under Article 105(14) of Regulation (EU) 575/2013. Technical report, EBA. EBA-CP-2013-28.

EBA, EIOPA, and ESMA (2014, April). Consultation Paper. Draft regulatory technical standards on risk-mitigation techniques for OTC-derivative contracts not cleared by a CCP under Article 11(15) of Regulation (EU) No 648/2012. EBA, EIOPA, and ESMA.

Green, A. and C. Kenyon (2014). Calculating the Funding Valuation Adjustment (FVA) of Value-at-Risk (VAR) based Initial Margin. Available at SSRN: http://ssrn.com/abstract=2432281.

Green, A., C. Kenyon, and C. R. Dennis (2014). KVA: Capital Valuation Adjustment. SSRN. Available at SSRN: http://ssrn.com/abstract=2400324.

Huber, P. and E. Ronchetti (2009). Robust Statistics (2nd Edition). Hoboken, New Jersey: Wiley.

Hull, J. and A. White (2014). A Generalized Procedure for Building Trees for the Short Rate and its Application to Determining Market Implied Volatility Functions.

Hyman, J. (1983). Accurate monotonicity preserving cubic interpolation. SIAM J. Sci. Stat. Comput. 4, 645–654.

Kenyon, C. and A. Green (2013a). Pricing CDSs' capital relief. Risk 26(10).

Kenyon, C. and A. Green (2013b). Why CCPs are the new rating agencies and pose the same risks. Risk 26(9).

Kenyon, C. and R. Stamm (2012). Discounting, Libor, CVA and Funding: Interest Rate and Credit Pricing. Palgrave Macmillan.

Kondor, I. (2014). Estimation Error of Expected Shortfall. arXiv:1402.5534.

Kondor, I., S. Pafka, and G. Nagy (2007). Noise sensitivity of portfolio selection under various risk measures. Journal of Banking and Finance 31, 1545–1573.

Lim, A. E. B., J. G. Shanthikumar, and G.-Y. Vahn (2011). Conditional value-at-risk in portfolio optimization: Coherent but fragile. Operations Research Letters 39, 163–171.

Morini, M. and M. Bianchetti (Eds.) (2013). Interest Rate Modelling after the Financial Crisis. London: Risk Books.

Rebonato, R. (2004). Volatility and Correlation: The Perfect Hedger and the Fox (2nd ed.). Chichester: Wiley.