Fixed income portfolio optimisation: Interest rates, credit, and the efficient frontier
arXiv preprint [q-fin.MF]
Richard J. Martin ∗ April 7, 2020
Abstract
Fixed income has received far less attention than equity portfolio optimisation since Markowitz’s original work of 1952, partly as a result of the need to model rates and credit risk. We argue that the shape of the efficient frontier is mainly controlled by linear constraints, with the standard deviation relatively unimportant, and propose a two-factor model for its time evolution.
Despite the passage of some 65 years since the emergence of Markowitz’s portfolio theory [12], almost all of the discussion on portfolio optimisation has been confined to equity portfolios. Fixed income is more difficult because there are two main drivers (rates and credit) and, unlike equities, both have term structures associated with them and are driven by a concept of yield or spread, which influence the construction of expected return. Credit has the additional feature that the upside is typically far smaller than the downside and so we cannot construct risk on the premise of Normal distributions. Another differentiator is the use of credit ratings, which in turn are connected to regulatory capital: this too has no counterpart in equities. Our interest lies both in top-down fixed income portfolio selection from a set of broad asset classes, which we refer to as sleeves (for example, US BB rated corporates), and also bottom-up security selection. Here we mainly consider the first of these problems, but what we say applies equally to both.

The main thrust of recent literature [9, 20, 3] has been to use yield curve models as a way of understanding risk and return (rather than trying to deal with bond prices directly). We agree with this. (In a recent paper [4] the authors do not use yield curve models, as “this reduces misspecification risk” [p.8], but that is a specious argument, as price volatility determines yield volatility and vice versa.) Remarkably, though, two fundamental issues are entirely absent except in [4]: duration constraints, and credit risks.
We will consider these in depth, but also have another concern: we do not agree that mean-variance is the best framework for understanding fixed income. This is not just because stdev does not capture tail risks (if this were the objection we would simply recommend CVaR as a risk measure) but because fixed income portfolios are subject to many different sources of risk, and it is likely to be more effective to explicitly constrain all of these rather than aggregate them into a single quantity. Put differently, simply knowing the stdev of an instrument or sleeve, and its ‘beta’ to an index, is much less helpful than understanding how it behaves in a variety of interest rate and credit scenarios, and optimisation must address this directly.

∗ Department of Mathematics, Imperial College London, Exhibition Road, London SW7 2AZ, UK. Email: [email protected]
Footnote (stdev): Standard deviation.
Footnote (CVaR): Conditional Value at Risk, also known as Expected Shortfall. See for example [10] in its application to equity portfolio optimisation.

[…] define these scenarios (it is not necessary to estimate their probabilities), whereas to calculate the stdev, CVaR or whatever, their probabilities will need to be known. Another benefit is simplicity, as we do not need to estimate or deal with the covariance matrix of the assets: we simply need the expected value or loss of each asset in each scenario.
Further, for a long-only fixed income portfolio, we scarcely need to constrain stdev, because if the interest rate and credit duration are tightly constrained, the stdev will necessarily be low.

Another matter of interest is the shape of the efficient frontier and how it has evolved over time. Many discussions about fixed income asset allocation go no further than drawing a chart of return vs risk, marking the various asset classes upon it; this is unhelpful to the portfolio manager who must have no more than (say) 15% in emerging markets, 10% in European periphery, an average credit quality no worse than BB+, a maximum duration of 6, etc. Such constraints reduce the set of permissible portfolios and hence the maximum return for a given level of risk: therefore they alter the efficient frontier just as much as do market conditions. For a given set of sleeves and constraints, we use a simple new model to directly parametrise the efficient frontier, and show that two easily-interpretable factors capture almost all its variation.

The paucity of research on fixed income portfolio optimisation requires us to spend some time laying the foundations, which we do in the next sections. After that we introduce our model for the efficient frontier and go on to present ‘live’ results.
The natural framework for expressing views on duration and curve risks is that of Heath–Jarrow–Morton [5], which expresses the motion of the discount-factor curve $B_t(T)$ through the instantaneous forward rate $f_t(T) = -(\partial/\partial T) \ln B_t(T)$, and in particular through its volatility $\sigma$. We refer the reader to [7, § …] […] $f_t(T)$ evolves as

$$ df_t(T) = \mu(t, T, \Omega_t)\,dt + \sum_i \sigma_i(t, T, \Omega_t)\,dW_{i,t} $$

in which $W_{i,t}$ are Brownian motions that are not necessarily independent: $\mathbb{E}[dW_{i,t}\,dW_{j,t}] = \rho_{ij}\,dt$; the symbol $\Omega_t$ denotes all previous history of interest rates that is necessary for determining the forward rate volatility at time $t$. We comment on $\mu$ later.

In general, the HJM framework does not give rise to Markovian dynamics for the short rate, and while this is not essential it is useful for interpretability and analytical tractability in certain problems. However, a special case does, which is:

$$ \sigma_i(t, T) = \sigma_i e^{-\kappa_i (T - t)}. $$

Footnote: The form can be more general than this but for reasons of space we omit the details.

By performing principal component analysis (diagonalising $\rho$) we decompose the motion of $f_t(T)$ into a set of probabilistically independent components $\sigma^\sharp_i(T - t)\,W^\sharp_{i,t}$, in which each $\sigma^\sharp_i$ is a finite sum of exponential functions. This gives the familiar picture of the first, most important, factor giving a roughly parallel shift, the second causing steepening/flattening, and so on. Specific interest rate scenarios $\omega_i$ can then be obtained by assigning realisations to each $W^\sharp$, so that $\omega_1$ is a shift higher, $\omega_2$ a shift lower, $\omega_3$ a steepening, $\omega_4$ a flattening, and so on. For strongly nonlinear products such as those with embedded options (e.g. callable bonds) we will also need different sizes of rate curve move: for example the sensitivity to a 2% move will not be twice the sensitivity to a 1% move.

The simplest form of this model is when we have only one factor, and for long-only portfolios this will do most of what is needed. By solving for the forward rate and noting that $f_t(t) = r_t$, and then differentiating, the short rate dynamics emerge as

$$ dr_t = \big(\theta(t) - \kappa r_t\big)\,dt + \sigma_r\,dW_t, \qquad (1) $$

which we recognise as the Hull–White model [6]; the function $\theta$ is obtainable from today’s term structure. If we make the simplification $\kappa T \ll$ […] $\sigma_r$ is constant, rather than depending on $r$ through a function of the form $\sigma\sqrt{r}$ or $\sigma r$, as seen in the Cox–Ingersoll–Ross and Black–Derman–Toy models respectively, causes no difficulty when rates go negative. The second is that it aligns fairly well with historical experience, as can be seen from the following observation: in the last thirty years the (absolute) volatility of bond yields has not depended strongly on the level of rates. Figure 1 shows this for the typical ten-day move in the 10y Treasury yield. Further, Table 1 shows that a Gaussian assumption with a constant yield volatility $\sigma_y = 0.90\%$ is remarkably good, though there is the usual caveat that it should not be relied upon for estimating very high levels of confidence, particularly over short time scales.

Some simplifications are possible. Rather than using the sensitivity to the instantaneous forward curve (or the zero curve: a parallel shift in one is the same as a parallel shift in the other), simply bump all the bond yields and use the yield-price relationship, which for a bullet bond of maturity $T$ and coupon $c$ paid $m$ times a year is, as a fraction of par,

$$ P(T, c; y) = (1 + y/m)^{-mT} + c\,\frac{1 - (1 + y/m)^{-mT}}{y}. \qquad (2) $$

This can be made even simpler if the yield bump is small, by using the duration and convexity. While bumping the yield is not the same as bumping the forward rate (because the yield is not a linear function of it), it has the advantage that it is not necessary to build the instantaneous forward curve or the zero curve. This treatment is very common in practice.

For the purposes of this discussion we are most interested in the first interest rate factor and are concerned with the effect of a moderate to large increase in rates, on a 1y horizon. If we regard this as commensurate with a move of a little over two stdevs, and the annual volatility of the USD curve is ∼ […]

[…] $\mu$ that makes the expected discounted value of a zero-coupon bond a martingale under the risk-neutral measure P. Under a subjective measure P the drift can be different and can be modelled by regressing the Brownian motions $W_{i,t}$ on an assumed set of technical and econometric factors. Common ideas include: short-term momentum, long-term mean reversion, and aversion in risky asset classes (flight to quality). More simply, a common assumption, which we use in the simulation later, is carry-and-rolldown, which is that $f_t(T)$ stays fixed: this makes higher-maturity bonds attractive in a steep yield curve environment.

Footnote: Similarly for portfolios with derivatives.
Seasoned campaigners will remember how much Orange County suffered in 1994 as a result of nonlinear products, after the Fed hiked rates 2% in short order.
Footnote: Another way of expressing this is to make $f_t(T)$ a P-martingale and ignore any convexity that the traded instruments may have.

                        ⟨|Δx|⟩    ⟨(Δx)²⟩    ⟨(Δx)⁴⟩    95%             99%
Actual                  0.137     0.0303     0.00360    [−…, +0.…]      [−…, +0.…]
Normal, σ_y = 0.9%      0.141     0.0345     0.00289    [−…, +0.…]      [−…, +0.…]

Table 1: Statistics of 10-day changes (Δx) in the 10y Treasury yield, cf. Normal distribution. ⟨···⟩ denotes (time) average.

                        ⟨|Δx/x|⟩  ⟨(Δx/x)²⟩  ⟨(Δx/x)⁴⟩  95%             99%
Actual                  0.0667    0.00891    0.000784   [−…, +0.…]      [−…, +0.…]
Lognormal, σ̂ = 40%      0.0626    0.00620    0.000120   [−…, +0.…]      [−…, +0.…]

Table 2: Statistics of 10-day relative changes (Δx/x) in the CDX.IG 5y spread, cf. lognormal distribution. ⟨···⟩ denotes (time) average.
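The yield-bump treatment built on the yield-price relationship (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's implementation: the function names, the semiannual default `freq=2`, and the handling of the zero-yield limit are assumptions made here.

```python
def bullet_bond_price(maturity, coupon, yld, freq=2):
    """Price of a bullet bond as a fraction of par, following eq. (2).

    maturity: years to maturity (T); coupon: annual coupon rate (c);
    yld: annual yield (y); freq: coupons per year (m, assumed 2 by default).
    """
    if maturity < 0 or freq <= 0:
        raise ValueError("maturity must be >= 0 and freq must be > 0")
    if abs(yld) < 1e-12:
        # Limit of eq. (2) as y -> 0: principal plus undiscounted coupons.
        return 1.0 + coupon * maturity
    discount = (1.0 + yld / freq) ** (-freq * maturity)
    return discount + coupon * (1.0 - discount) / yld


def bumped_return(maturity, coupon, yld, bump, freq=2):
    """Relative price change when the bond yield is bumped by `bump`."""
    before = bullet_bond_price(maturity, coupon, yld, freq)
    after = bullet_bond_price(maturity, coupon, yld + bump, freq)
    return after / before - 1.0


if __name__ == "__main__":
    # Hypothetical 10y par bond with a 3% coupon, bumped by +1% and +2%.
    for bump in (0.01, 0.02):
        print(bump, bumped_return(10, 0.03, 0.03, bump))
```

Because the full repricing is used instead of duration alone, the loss for the +2% bump is less than twice the loss for the +1% bump, which is the convexity effect that the small-bump approximation captures only at second order.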
Credit risk depends on the fortunes of all the issuers involved in the portfolio, which is potentially a very high-dimensional problem. A full treatment is not possible in a paper of this size, but we can make some salient observations, and demarcate some problems that are easier than others:
(A) A portfolio that is long-only and with no significant issuer concentrations;
(B) A market-neutral portfolio;
(C) A derivatives trading book with possibly large issuer concentrations and long and short positions across the curves of some or all of the issuers.
It is obvious that (A) is the easiest to deal with, and the only one that can be tackled reasonably well using ‘credit duration’. To this end we should bear in mind that credit spreads typically move proportionally to the spread level, suggesting

$$ ds_t = (\ldots)\,dt + \hat\sigma s_t\,dZ_t \qquad (3) $$

with $Z_t$ a process of unit quadratic variation ($\mathbb{E}[dZ_t^2] = dt$). In the simplest setup, $Z_t$ is a Brownian motion and we have a lognormal model.

For indices, or more generally well-diversified credit pools, a lognormal assumption captures the majority of market moves but does not capture large systemic shocks and is therefore incorrect at high levels of confidence: see Table 2 and Figure 2, which use $\hat\sigma = 40\%$. Note that credit index options are typically priced with essentially this model [16], and a volatility that is often around 35%. From this comes the well-known idea that we should assess credit spread risk by bumping spreads proportionally rather than applying a 1bp increase to each credit spread (the so-called CS01), as the higher-yielding ones will typically move more. This gives rise to duration × spread as a simple measure of risk, and so a large, but not extreme, credit spread increase (by which we understand a little more than 2 stdev) is roughly a doubling of the credit spread: this is a convenient definition which we call CSx2.
For extreme moves (credit stress loss, CSL) we propose not relative changes but instead spread levels and define the stress loss of an instrument or asset class to be the effect of moving the yield from its current value $y$ to a stress level $y^*$. Using (2) this is

$$ \mathrm{CSL} = 1 - P(T, y; y^*) = (1 - y/y^*)\big(1 - (1 + y^*/m)^{-mT}\big) \qquad (4) $$

and $y^*$ is obtained by adding a specified spread level $z^*$ (Table 3, which is commensurate with the losses sustained in 2008-09) to the riskfree yield of the relevant maturity. This kind of constraint operates differently from CSx2 because it is countercyclical, in the sense that as spreads widen, CSx2 rises but CSL falls, as losses have already been sustained. Constraining both these measures has the attractive consequence of buying risk when spreads are wide (when CSx2 is the binding constraint), increasing the position as markets rally (as CSx2 falls), but reducing it when they become too tight (as now CSL binds instead).

Of the three cases demarcated earlier, (A) is the one that concerns us most here, but it would be remiss not to consider (B) and (C), in which the main difficulty is caused by single names or sectors. It is instructive to consider the following problem, set in CDS rather than cash terminology for convenience. Suppose I sell USD 50M of 2y protection and buy USD 20M of 5y protection on an investment-grade issuer. As the duration ratio is a little under 5:2, this trade is CS01 positive, which naturally gives the impression of being slightly short risk. The impression is made even stronger if we bump spreads proportionally: as the 5y spread is almost certainly higher than the 2y spread, the trade is predicted to perform well if spreads rise as the 5y spread should move more. But this is only true for small moves. If the issuer becomes distressed, the 2y spread will start to rapidly increase so that the curve inverts, and it is obvious that in the event of default there will be a large loss.
The relation between the two spreads is complex and a bivariate lognormal distribution does not give the full picture. The right way to think about this problem is to forget about spreads and yields and return to basics, in the form of the structural (Merton) model, to which an introduction is provided in [13, 14]. As a function of the firm value, for small moves the trade has a slightly negative delta and positive gamma, whereas for large reductions in firm value the delta becomes positive and the gamma very negative: see Figure 3. This motivates the idea that to assess idiosyncratic risk we should reduce the firm value of each issuer by a small amount (say 10%) and also a much larger amount (say > […]

We cannot simply use credit spread as a proxy for expected return, because it is not guaranteed. For a portfolio of individual credits, we in principle require an expected spread change, and default probability, on each name. This is not practicable because each bond has to be updated as the market moves. On the other hand, it is practical to attach a credit rating to each issuer, as this only needs to be changed when important news comes out, or there is a change in view on the credit. For these purposes the credit rating does not have to correspond to the public rating: it encompasses our view of the credit quality. With this in mind, there are two key components in the modelling. The first component is a set of credit curves (of spread vs maturity) for each rating, for the sector in question. This can be done using an appropriate numerical fitting procedure such as that discussed in [18]. The second component is a Markov chain model of transition rates for all ratings from AAA, AA+, …, down to default, see e.g. [8, 21].

If the bond is of maturity $T$ and we are looking at the ER at time horizon $t < T$ then we reprice the cashflows of a $T - t$ maturity bond in each rating state $j$ (say) and weight the results by the probability of transition from $i$ to $j$. Symbolically, if $p_{i \to j}(t)$ denotes the transition probability over time $t$ and $P_j(T, c)$ denotes the price of a bond of rating $j$, maturity $T$ and coupon $c$, then the expected value of the bond at time $t$ is

$$ \sum_{j=1}^{D} p_{i \to j}(t)\, P_j(T - t, c) $$

($D$ denotes the state of default). The total return is obtained by discounting this back to today, adding in the coupon accrued over time $[0, t]$, and subtracting the current price $P$. By judicious algebra this gives

$$ \mathrm{TR} = ct + B(t) P_i(T - t, c) - P_i(T, c) $$
$$ \qquad +\, B(t) \sum_{j=1}^{D} p_{i \to j}(t) \big( P_j(T - t, c) - P_i(T - t, c) \big) $$
$$ \qquad +\, P_i(T, c) - P, \qquad (5) $$

understood as follows. The first row of terms is the carry and rolldown of a bond rated $i$, assuming the rating and associated spread curve remain fixed. The second adjusts for rating migration and default, and is negative. The third is the cheapness of the bond relative to the curve for rating $i$, so that it is assumed that convergence will occur towards that rating curve over time $t$, as the bond need not currently price in accordance with its rating. In the interests of simplicity, some approximations have been made: the accrued $ct$ will only be received if the issuer does not default in the period $[0, t]$, so the first term needs to be slightly reduced; the credit spread of the bond may not converge to its assumed rating curve by time $t$; we may have the view that the curves themselves will move, independently of any rating transition by the credit(s). These issues can be corrected, at the expense of a little extra complexity.

If we ignore the third term then (5) reduces to carry and rolldown minus a hurdle representing the expected loss from rating downgrade and default (and also upgrade but this is not enough to counterbalance the other two). For high yield credits this hurdle is significant.

Footnote: Or LIBOR for floating-rate instruments.
Footnote: Using option parlance, because we explicitly treat credit as a deep out-of-the-money option on the firm.
Footnote: Also taking subordination into account where necessary.
As an example, assume that a B rated credit has a 1y default probability of ca. 4–5%. If the bond is trading at par and recovery is assumed to be 30%, then a spread of at least 300bp will be needed to justify buying the bond. If the riskfree rate is 2% then this translates to a yield of around 5%. Taking into account transitions to B−/CCC+/CCC, we conclude that unless the bond yield exceeds ca. 5.75%, it generates no ER. Not taking credit losses into account causes the portfolio to become barbelled, as the ER of high-yield assets is greatly overestimated: the optimiser buys these in some proportion and allocates the remainder to cash so as to satisfy whatever average rating constraint is given.

A full discussion of structured credit modelling is not possible but one issue is paramount. A CDO/CLO tranche rated AA pays a much higher spread than a AA rated corporate (the latter trade flat to LIBOR). Anecdotally, the investment world still has trouble explaining this phenomenon, and commonly-proffered explanations are illiquidity and opacity. While there is some truth in these, they do not address a basic fact: at the senior end of the capital structure, CLOs concentrate market risk, and according to the Capital Asset Pricing Model which is the risk for which an […]

Footnote: Essentially, shuffling terms and noting that $\sum_{j=1}^{D} p_{i \to j}(t) = 1$ for any $i, t$.
Footnote: If the hazard rate is $\lambda$ then it should be $(1 - e^{-\lambda t}) \lambda^{-1} c < ct$.
Footnote: The CAPM in its usual form is not ideally suited to credit, but it is possible to reformulate it using CVaR rather than stdev as a risk measure, and on so doing the conclusion is essentially the same. Except in distressed debt it is hard to find truly idiosyncratic risks in the credit market.
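Two of the calculations in the credit discussion lend themselves to a short numerical sketch: the stress loss (4) and the expected-loss hurdle in the B rated example. The Python below is illustrative only. The function names are hypothetical, and the hurdle uses the first-order approximation "default probability × (1 − recovery)", which ignores rating migration and the lost accrued coupon and is an assumption made here, not a formula quoted from the text.

```python
def credit_stress_loss(maturity, current_yield, stress_yield, freq=2):
    """Stress loss of eq. (4): a par bond (coupon = current yield)
    repriced at the stress yield, expressed as a fraction of par."""
    if stress_yield <= 0:
        raise ValueError("stress_yield must be positive")
    discount = (1.0 + stress_yield / freq) ** (-freq * maturity)
    return (1.0 - current_yield / stress_yield) * (1.0 - discount)


def breakeven_spread(default_prob, recovery):
    """First-order annual spread needed to cover expected default loss.

    Assumption: loss given default is (1 - recovery) on a bond at par,
    with no allowance for downgrades short of default.
    """
    return default_prob * (1.0 - recovery)


if __name__ == "__main__":
    # Mid-point of the 4-5% default probability range, 30% recovery.
    print(breakeven_spread(0.045, 0.30))
    # CSL falls as the current yield approaches the stress level.
    for y in (0.05, 0.08, 0.11):
        print(y, credit_stress_loss(5, y, 0.12))
```

With a default probability in the 4–5% range and 30% recovery, the first function gives a hurdle of roughly 300bp, in line with the figure in the text. The second loop shows the countercyclical behaviour of CSL: the closer the current yield already is to the stress level, the smaller the remaining stress loss.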
We define the excess expected return (ER) as the expected return less the riskfree rate, and risk as the value of a risk function which will be defined presently. Most problems reduce to:

$$ \text{Maximise ER s.t. } \begin{cases} \text{Risk} \le \text{limit} \\ \text{Other constraints.} \end{cases} \qquad (6) $$

A misconception is that we should attempt to maximise the Sharpe ratio. The table below, with three hypothetical portfolios, shows why this is wrong (ER and risk are annualised, and in context risk means stdev):

Portfolio   ER     Risk    SR
1           1%     0.5%    2
2           5%     4%      1.25
3           20%    25%     0.8

In context, Portfolio 1 offers the highest SR but too little ER to be attractive; to make it more so it would need to be levered, which may be expensive or even impossible. Portfolio 3 offers the highest ER, exhibiting the sort of risk that might reasonably come from a distressed debt portfolio, but in that case one would want a much higher ER. Portfolio 2 offers similar characteristics to a credit portfolio of borderline IG/HY credit quality in a steep yield curve environment, where rolldown and a benign view of credit performance might offer such characteristics. This sort of risk-return characteristic might be observed ex post, in a year in which markets ‘went up and up’, but as an ex ante statement about expected return and volatility it looks too optimistic, especially in the current market environment. It may be fairly said that Portfolio 2 is the most attractive (partly because it is virtually impossible to obtain), despite having neither the highest ER nor the highest SR.

In the context of interest rate risk alone, the following are standard problems which have been considered as mean-variance optimisations (e.g. in [20, 3]) but are easier to formulate using LP. First, an outright optimisation:

$$ \text{Maximise } \sum_{j=1}^{n} c_j u_j \text{ s.t. } \begin{cases} \sum_{j=1}^{n} u_j \Delta X_j(\omega_i) \ge -\varepsilon_i \\ 0 \le u_j \le \bar{u}_j \end{cases} $$

where $c_j$ is the ER of the $j$th instrument (we have to take a view on rates, because under P the ER is always zero), $\Delta X_j(\omega_i)$ is the return of the $j$th bond in the $i$th scenario, $-\varepsilon_i$ is the worst acceptable loss in the $i$th scenario, $u_j$ are the weights to be determined, and $\bar{u}_j$ is the $j$th allocation limit. (Transaction costs can easily be incorporated.)
Secondly, index tracking, where risk pertains to the difference between the index and the tracker. For this we can decide whether or not to take a view on rates. If we do then the problem is a variant of the first:

$$ \text{Maximise } \sum_{j=1}^{n} c_j u_j \text{ s.t. } \begin{cases} \sum_{j=1}^{n} u_j \Delta X_j(\omega_i) - \Delta Y(\omega_i) \ge -\varepsilon_i \\ 0 \le u_j \le \bar{u}_j \end{cases} $$

where $\Delta Y(\omega_i)$ is the return of the index in the $i$th scenario. If we do not then we no longer need to know about ER and can minimise the tracking error in the worst scenario:

$$ \text{Maximise } \min_i \Big\{ \sum_{j=1}^{n} u_j \Delta X_j(\omega_i) - \Delta Y(\omega_i) + \varepsilon_i \Big\} \text{ s.t. } 0 \le u_j \le \bar{u}_j $$

This minimax problem can be rewritten by introducing a ‘slack variable’ $u_0$ to obtain:

$$ \text{Minimise } u_0 \text{ w.r.t. } (u_j)_{j=1}^{n} \text{ s.t. } \begin{cases} u_0 + \sum_{j=1}^{n} u_j \Delta X_j(\omega_i) - \Delta Y(\omega_i) + \varepsilon_i \ge 0 \quad \forall i \\ 0 \le u_j \le \bar{u}_j \end{cases} $$

which is again a standard LP problem. If, on solution, $u_0 >$ […] $(u_j)$ to be nonnegative, as in each case we are trying to replicate a long-only index. Puhle [20] failed to do this, on the grounds that in the context of a mean-variance problem it leads to solutions that cannot be written in closed form (an entirely spurious objection, as LP problems are so quick to solve) and then ended up with impractical portfolios consisting of delicately balancing long and short positions, a consequence that should have been obvious at the outset.

Typical constraints encountered in fixed income can be grouped into these categories:
• Allocation limits, which for bottom-up models will be primarily sector, country and issuer limits, and for top-down models, sleeve limits;
• Rating constraints, e.g. simply the proportion of sub-IG assets, but also an average rating on a linear scale, or Moody’s weighted average rating factor (WARF) which is nonlinear and penalises high-yield credits much more;
• Regulatory capital, e.g. NAIC in the US and Solvency Capital Ratio (SCR), which takes duration into account, in the EU;
• Scenario-based risk measures such as IR01, CS01, CSx2, CSL;
• Nonlinear risk measures such as stdev, CVaR.
We discuss these in detail next.
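The difference between an average rating on a linear scale and WARF, mentioned in the list of constraints above, can be illustrated with a small Python sketch. The WARF factors are the ones quoted in the accompanying footnote; the linear scale is given there only up to AA−=4, so its continuation by one point per notch down to CCC is an assumption made here, as are the function name and the two hypothetical portfolios.

```python
# Linear scale: first four values from the footnote, the rest assumed
# to continue by one point per rating notch.
LINEAR_SCALE = {
    "AAA": 1, "AA+": 2, "AA": 3, "AA-": 4, "A+": 5, "A": 6, "A-": 7,
    "BBB+": 8, "BBB": 9, "BBB-": 10, "BB+": 11, "BB": 12, "BB-": 13,
    "B+": 14, "B": 15, "B-": 16, "CCC+": 17, "CCC": 18,
}

# Rating factors as quoted in the footnote.
RATING_FACTOR = {"AA": 20, "A": 120, "BBB": 360, "BB": 1350, "B": 2720, "CCC": 6500}


def weighted_average(weights, scale):
    """Weighted average of `scale` values; `weights` maps rating -> weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * scale[rating] for rating, w in weights.items()) / total


if __name__ == "__main__":
    barbell = {"AA": 0.5, "B": 0.5}   # hypothetical barbelled portfolio
    bullet = {"BBB": 1.0}             # hypothetical all-BBB portfolio
    for name, pf in (("barbell", barbell), ("bullet", bullet)):
        print(name,
              weighted_average(pf, LINEAR_SCALE),
              weighted_average(pf, RATING_FACTOR))
```

Both portfolios average 9 (BBB) on the linear scale, but the barbell has a WARF of 1370, slightly worse than BB, against 360 for the all-BBB portfolio. Both averages are linear in the weights, so either can be imposed as an LP constraint; they differ only in how heavily the high-yield holdings are penalised.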
Footnote: Aside from this is his disproportionate emphasis on (hypothetical) portfolios of zero-coupon bonds, which represent a tiny proportion of the bond universe; in the context of corporate and sovereign bonds, we estimate < […]
Footnote (linear rating scale): AAA=1, AA+=2, AA=3, AA−=4, …
Footnote (WARF): AA=20, A=120, BBB=360, BB=1350, B=2720, CCC=6500.

Allocation limits and regularisation

The purpose of allocation limits is threefold. First, it may be undesirable for reasons of liquidity to buy too much of a certain asset (e.g. more than 5% of a corporate bond issue). Secondly, any asset or sleeve may suffer an unexpected idiosyncratic accident, which such limits mitigate. As an example, suppose we are concerned that Turkish assets might drop 10% (in price terms), and we do not want to lose more than 0.[…] × NAV in this eventuality: we constrain Turkey ≤ […]

In past years the matter of RAROC (ER ÷ regulatory capital) has been the subject of more attention than is necessary. We explain why it is the wrong thing to optimise. Obviously when a particular asset is awarded zero risk weight, optimising RAROC generates nonsense, but even if this pitfall is avoided, it also causes allocation to be skewed towards high-yielding assets whose regulatory capital treatment is (too?) lenient, and it also causes overconcentration. It is also commonly thought that because an asset has a RAROC less than the desired hurdle, it should not be bought. This is also wrong: it is only necessary for the portfolio as a whole to have a high enough RAROC, with the lower-yielding assets providing diversification. The correct way to understand regulatory capital is simply as an extra constraint. Unfavourable regulatory capital treatment is a valid reason not to invest in an asset or asset class; favourable regulatory capital treatment is not a good reason to invest in it. Ultimately, economic risk is the risk that is being taken, and so it cannot be ignored.
The last twenty years have seen much attention devoted to an axiomatic theory of risk measures, one of the earliest papers being [1]. However, the volume of research is out of all proportion to its practical significance. It is almost impossible to find a financial disaster attributable to a ‘poor choice of risk measure’; invariably, the culprit is an incorrect assessment of the probabilities of bad market moves, which is fundamental to any calculation. The most egregious example in recent times has been the ‘London Whale’ incident, in which, three years after the Global Financial Crisis, Normal distribution assumptions were still being used to assess the VaR of structured credit products.

In fact, the necessary axioms for portfolio optimisation can be expressed succinctly: (i) positivity; (ii) 1-homogeneity; (iii) subadditivity, i.e. if $X_1, X_2$ are portfolios then for any $\lambda \in [0, 1]$,

$$ R\big(\lambda X_1 + (1 - \lambda) X_2\big) \le \lambda R(X_1) + (1 - \lambda) R(X_2). $$

With these assumed, we can conclude immediately that in the presence of convex constraints, the efficient frontier must be concave. The brevity of this discussion might seem disturbing, but it is […].

Footnote: Which are traditionally obviated by Tykhonov regularisation, i.e. imposing a small penalty $\propto \|u\|^2$, which is like adding an idiosyncratic risk to each asset.
Footnote: See [11, p.286]: correspondence of P. Hagan on 07-Feb-12.
Footnote: If we scale our positions by some factor, the risk is scaled by the same factor.
Footnote: If two portfolios $X_1, X_2$ lie on the efficient frontier, then any admixture $\lambda X_1 + (1 - \lambda) X_2$ is feasible (obeys the constraints); its ER is the weighted sum of the ER’s of $X_1, X_2$; and its risk is ≤ the weighted sum of their risks. Therefore the efficient frontier lies to the left of the chord joining $X_1$ and $X_2$.

As is well known, stdev and CVaR obey (i)–(iii), but VaR does not obey (iii). The expected loss in a given scenario (e.g. CSx2) obeys (ii),(iii).
Finally, so too does the maximum of risk measures obeying (ii),(iii), allowing us to combine several different risk measures $R_k$ into a ‘total’ one $R$ by

$$ R = \bigvee_k \alpha_k R_k, \qquad \alpha_k > 0, $$

where $\bigvee$ denotes ‘maximum’ and $(\alpha_k)$ are positive constants. As in context at least one of the $R_k$ will be positive, $R$ also obeys (i). Notice that constraining $R(X)$ to be less than or equal to some specified value $\bar{R}$ is identical to constraining $R_k(X) \le \alpha_k^{-1} \bar{R}$ for each $k$.

If all the constraints are linear then we have an LP problem, for which standard methods are established and fast [19], even for large-scale problems. Nonlinear optimisation problems, provided they are convex, can be solved iteratively using cutting-plane methods [2], which owe much to Newton-Raphson. A convex function $R$ necessarily lies above its tangent at any point, so

$$ R(u) \ge R(u^*) + (u - u^*) \cdot \nabla R(u^*) \qquad (7) $$

where $\nabla R$ is the gradient of $R$, and $u^*$ denotes the allocations at the present stage in the optimisation. So, if the upper limit on $R$ is $\bar{R}$, it is necessary for

$$ u \cdot \nabla R(u^*) \le \bar{R} - R(u^*) + u^* \cdot \nabla R(u^*), \qquad (8) $$

a linear constraint which is added to the constraint set. The optimisation is rerun, ideally using the current point $u^*$ as its starting-point. Then $u^*$ will move, and we add a new constraint of the form (8) and keep rerunning until $R(u^*)$ exceeds $\bar{R}$ by no more than an acceptably small amount. Note that it is important to add a new constraint each time, rather than moving (8) around each time $u^*$ changes. The only requirement is that $R$ be convex and that we can easily evaluate $\nabla R$.

The efficient frontier demarcates the maximum level of ER ($r$) for each level of risk ($R$). It is an upward-sloping, concave function and, assuming that a portfolio with 100% cash is feasible, it must pass through the origin. Given that risk and return have different units the functional form $\psi(r/a, R/b) = 0$ is necessary, so it is then a question of picking a sensible function $\psi$. With this in mind, and considering the required shape, an obvious idea is

$$ r = a_t \big(1 - e^{-R/b_t}\big), \qquad a_t, b_t > 0; \qquad (9) $$

Footnote: For example we shall not be detained by the fact that standard deviation does not satisfy the so-called monotonicity axiom, see e.g. [1], which is often deemed necessary. If it is, then the logical conclusion is to dump all of Markowitz’s portfolio theory.
Footnote: As it will typically not get below $\bar{R}$ in finitely many iterations. Alternatively, in (8) replace $\bar{R}$ by a tighter constraint $\bar{R} - \varepsilon$ and stop when $R(u^*) < \bar{R}$.
Footnote: In effect the so-called risk contribution, see e.g. [15, § …].
Footnote: One might enquire why, if all the constraints are linear, the efficient frontier is not simply a straight line. The answer is that at different points on the frontier, different constraints are binding; therefore, the frontier is piecewise linear.
Footnote: If we change the time horizons on which risk or return are calculated, the axes will be stretched accordingly.

We write $t$ because over time the parameters are expected to change. The interpretation of $a_t, b_t$ is that $a_t$ is the ER for a high-risk portfolio, as it is the $R \to \infty$ asymptote, and $a_t/b_t$ is the ER per unit risk for a low-risk portfolio, as it is the gradient of the efficient frontier at the origin.

In different market scenarios and with different constraints binding, $a_t, b_t$ will behave differently:
(a) The riskfree curve steepens, with the IR01 constraint binding. (Note that a parallel shift will have no effect, if a static interest-rate volatility is assumed, because by ER we always mean relative to the front end of the curve.) Here $a$ will increase but not $b$, and the frontier is simply stretched upwards.
(b) Credit spreads increase, CS01 constraint binding. This is the same as (a).
(c) Credit spreads increase, with the CSx2 constraint binding. Here the return and risk increase by the same factor, and the efficient frontier is stretched upwards and to the right so that $a$ and $b$ both increase.
(d) Credit spreads increase, CSL binding. Here the ER increases but risk decreases, so that $a$ moves up and $b$ down.
(e) ‘Risk off’/‘Flight-to-quality’. If yields compress in low-risk portfolios but decompress in high-risk ones, the main effect is that $b$ increases more than $a$, so that $a/b$ (the gradient of the frontier at the origin) decreases.
In view of this we can expect that over time $a_t, b_t$ will not be perfectly correlated, so that we will need both to explain the full evolution of the efficient frontier. Plausibly, they are mean-reverting over a long enough time scale, and so a reasonable econometric model is that $(\ln a_t, \ln b_t)$ follows a bivariate Ornstein–Uhlenbeck process.
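The parametrisation (9) and the interpretation of its two parameters can be checked numerically with the short Python sketch below. The function names are hypothetical. The two-point identification of $(a, b)$ is a simplification introduced here for illustration: it needs the frontier return at a risk level and at twice that level, and it is not the fitting procedure used for the results in the paper.

```python
import math


def frontier_return(risk, a, b):
    """ER on the frontier of eq. (9): r = a * (1 - exp(-R / b))."""
    return a * (1.0 - math.exp(-risk / b))


def initial_slope(a, b):
    """Gradient of the frontier at the origin: ER per unit of risk."""
    return a / b


def fit_from_two_points(risk, r_at_risk, r_at_double_risk):
    """Recover (a, b) from frontier returns at R and 2R.

    With x = exp(-R/b): r1 = a(1 - x) and r2 = a(1 - x^2), so
    r2 / r1 = 1 + x. Valid only for increasing, concave data.
    """
    if not (0.0 < r_at_risk < r_at_double_risk < 2.0 * r_at_risk):
        raise ValueError("points are not consistent with eq. (9)")
    x = r_at_double_risk / r_at_risk - 1.0
    a = r_at_risk / (1.0 - x)
    b = -risk / math.log(x)
    return a, b


if __name__ == "__main__":
    a, b = 0.06, 0.05  # hypothetical parameter values
    r1 = frontier_return(0.03, a, b)
    r2 = frontier_return(0.06, a, b)
    print(fit_from_two_points(0.03, r1, r2), initial_slope(a, b))
```

The round trip recovers the parameters up to floating-point error, and `initial_slope` returns the ratio $a/b$ that the text identifies as the ER per unit risk for a low-risk portfolio.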
We construct a risk function defined as

$$R = R_{\mathrm{IR}} \vee R_{\mathrm{CSx2}} \vee R_{\mathrm{CSL}} \vee R_{\sigma} \qquad (10)$$

where the risk measures are defined as:

$R_{\mathrm{IR}}$: Loss from riskfree rate +2% (for floating-rate instruments this will be zero)
$R_{\mathrm{CSx2}}$: Loss from credit spreads doubling (CSx2)
$R_{\mathrm{CSL}}$: Stress losses (CSL) as eq.(4)
$R_{\sigma}$: Annual stdev

and the stdev is constructed as follows. Interest and credit risks are assumed uncorrelated, so that the variances simply add. As discussed earlier, interest rate variance uses a yield volatility of $\sigma_y = 0.9\%$ across all maturities, while the relative credit spread volatility is $\hat{\sigma}_c$ = 0. [...] $\rho$ = 0. [...] The variance over a time horizon $\Delta t$ (taken as 1y) is therefore

$$\Sigma = \Big\{ \Big(\sum_j u_j D^{\mathrm{ir}}_j\Big)^2 \sigma_y^2 + \rho \Big(\sum_j u_j D^{\mathrm{cr}}_j s_j\Big)^2 \hat{\sigma}_c^2 + (1-\rho) \sum_j u_j^2 \big(D^{\mathrm{cr}}_j\big)^2 s_j^2 \hat{\sigma}_c^2 \Big\} \Delta t \qquad (11)$$

where $D^{\mathrm{ir}}_j$ and $D^{\mathrm{cr}}_j$ are the interest rate and credit duration of the $j$th sleeve.

We then vary the total risk limit $R$ and plot the ER that the optimisation finds, and also fit (9). This is done on different dates going back to 2000 in 3m steps.

The asset classes are described in Table 3, which also shows the Bloomberg tickers for the yield and the allocation limits imposed on each sleeve. They are: US Treasuries of various maturities, US investment grade and high yield corporate bonds ranging from AA to CCC; US and European subordinated financials (the latter being swapped into USD); emerging market corporate/sovereign IG/HY; residential mortgage-backed securities; CLOs from AAA to BBB. The allocation constraints on the various sectors are: high-yield (of all types) ≤ ≤ ≤ ≤

[...] $a_t$ and $b_t$, which are seen to be positively correlated. Independent factors $a^*_t$, $b^*_t$ are obtained by taking $a^*_t = a_t$ as the first and $b^*_t = b_t / a_t^p$, for suitably-chosen $p$, as the second. We found $p \approx .66$ by regression of $\ln b_t$ against $\ln a_t$, and this is shown in Figure 2(c). Their interpretation is that $a^*_t$ represents overall risk appetite, so that as the market goes risk-on, $a^*_t$ declines; whereas $b^*_t$ indicates flight-to-quality having controlled for $a_t$, so that as the market moves into safer portfolios $b^*_t$ increases. Comparison between 2006 and 2018 is interesting because Figure 4(a) shows the efficient frontiers to be of different shape, and indeed the factor $b^*_t$ is higher now than then.

We then run the test without the stdev constraint, so that the stdev term in (10) is removed. The results are not identical, but can scarcely be distinguished from Figure 4 and so we have not plotted them. In view of what we have said before about long-only portfolios, the stdev constraint is largely redundant once the interest rate and credit duration are constrained.

We have established a framework for doing fixed income portfolio optimisation. The efficient frontier is mostly a product of linear constraints, and for long-only portfolios the standard deviation constraint is largely unnecessary. By directly modelling the efficient frontier we can make statements about risk and reward that pertain directly to the portfolio with full regard to the constraints. The factors $a_t$, $b^*_t$ almost completely describe how the efficient frontier of a particular portfolio has evolved over time, and give a manager an indication of how risk is being rewarded within the operating constraints.

Acknowledgements
I thank Yang Zhou and Yao Ma for their part in the early development of this work, and Michael Story, Erik Vynckier and John Zito for helpful discussions.
Historical spread and total return information is available for these sleeves, but for reasons of space is not shown.

References

[1] P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath. Coherent measures of risk. Mathematical Finance, 9:203–228, 1999.
[2] D. Bertsekas. Nonlinear Programming. Athena, 2016.
[3] J. F. Caldeira, G. V. Moura, and A. A. P. Santos. Bond portfolio optimization: A dynamic heteroskedastic factor model approach. In Proc. XXVII Jornadas Anuales de Economía (Montevideo, November 2012), 2012.
[4] [...] J. Fixed Income, 28(1):6–26, 2018.
[5] D. Heath, R. Jarrow, and A. Morton. Bond pricing and the term structure of interest rates: A new approach. Econometrica, 60(1):77–105, 1992.
[6] J. Hull and A. White. Pricing interest-rate derivative securities. Rev. Fin. Stud., 3(4):573–592, 1990.
[7] J. C. Hull. Options, Futures, and Other Derivatives. Prentice Hall, 1997. 3rd ed.
[8] R. A. Jarrow, D. Lando, and S. M. Turnbull. A Markov model for the term structure of credit risk spreads. Rev. Fin. Stud., 10(2):481–523, 1997.
[9] O. Korn and C. Koziol. Bond portfolio optimization: A risk-return approach. J. Fixed Income, 15(4):48–60, 2006.
[10] P. Krokhmal, J. Palmquist, and S. Uryasev. Portfolio optimization with conditional value-at-risk objective and constraints, 2001.
[11] C. Levin. Exhibits: Hearing on JPMorgan Chase Whale Trade: A Case History of Derivatives Risks & Abuses. Technical report, US Senate Permanent Subcommittee on Investigations, 2013.
[12] H. Markowitz. Portfolio selection. J. Finance, 7(1):77–91, 1952.
[13] R. J. Martin. CUSP 2007: An overview of our new structural model. Technical report, Credit Suisse, 2007.
[14] R. J. Martin. Smiling Jumps. RISK, 23(9):108–113, 2010.
[15] R. J. Martin. Saddlepoint methods in portfolio theory. In A. Lipton and A. Rennie, editors, The Oxford Handbook of Credit Derivatives, chapter 15. Oxford University Press, 2011. arXiv:1201.0106.
[16] R. J. Martin. A CDS Option Miscellany. arXiv:1201.0111, 2019. Vsn 3.
[17] R. J. Martin, H. Haworth, and F. Koch. Struck Off. RISK, 21(11):87–91, 2008.
[18] R. J. Martin, T. Uzuner, and Y. Ma. Emerging market corporate bonds as first-to-default baskets. RISK, 31(7), 2018. arXiv:1804.09056.
[19] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. Numerical Recipes in C++. Cambridge University Press, 2002.
[20] M. Puhle. Bond Portfolio Optimization. PhD thesis, U. of Passau, 2007. Springer Lecture Notes in Economics and Mathematical Systems No.605.
[21] Standard & Poors. Default, Transition, and Recovery: 2018 Annual Global Corporate Default and Rating Transition Study. Technical report, Standard & Poor’s Ratings Services, 2019.

[Plot: Change (d) vs US 10y Tsy /%]
Figure 1: The magnitude of short-term changes in US 10y Tsy yield has not typically depended on the spot level. Bars are at symmetrical 95% confidence. Data range 1990–2017. Source: Bloomberg.

[Plot: Change (d) vs CDX.IG 5y /bp]
Figure 2: Moves in credit spread (CDX.IG 5y shown here) are typically proportional to the spot level. Bars are at symmetrical 95% confidence. Data range 2004–2017. Source: Bloomberg.

[Plot: PL vs Firm value]
Figure 3: Sketch of PL vs firm value for credit steepener trade discussed in §2. A naive argument based on parallel spread curve movements gives information only about the behaviour in the encircled part of the graph.

Name                    Ticker         Max (%)  Maturity   z* (bp)  Float
Cash                                   100
Tsy (5y, 10y, 30y)      CMAT05Y etc.   100      5, 10, 30  -
US Corporate AA         JULIAAY        10       10         100
US Corporate A          JULIAY         15       10         200
US Corporate BBB        JULIBBBY       20       10         350
US Corp. IG 5y          IBOXUMAE       20       5          250      *
CLO AAA                 JCLOAAAY       5        10         350      *
CLO AA                  JCLOAAYL       5        10         550      *
CLO A                   JCLOAYLD       5        10         900      *
CLO BBB                 JCLOBBBY       5        10         1700     *
US Corporate BB         JPDKBB         5        7          900
US Corporate B          JPDKB          5        7          1500
US Corporate CCC        JPDKCCC        5        7          3000
US HY 2–4y              JPDK2/4        5        3          1500
EM Corporate IG         JBDYIGYW       10       8          450
EM Corporate HY         JBDYHYYW       5        7          1100
EM Sovereign IG         JPBYIGYW       10       12         350
EM Sovereign HY         JPBYHYYW       5        10         1200
Agency real estate      LUMSYW         10       5          1200
US IG Financials        JULIFINY       10       12         500
US HY Financials        JPDKFINL       5        8          1000
EU Insurance (Tier 2)   JPSUIBEY       10       10         700
Table 3: Universe used in simulation. ‘Float’ denotes LIBOR-based instruments with no interest rate risk. Ticker refers to Bloomberg. IBOXUMAE is the ticker for the on-the-run CDX.IG 5y CDS index.

[Plot: (a) ER, r vs Risk, R; (b) $a_t$, $b_t$ and $b^*_t$ over time]
Figure 4: Results for test portfolio. (a) Efficient frontiers (dotted lines) and model fit (solid lines) at beginning of April on dates shown. Units are fractions of portfolio NAV, on both axes. (b) Time evolution of the factors $a_t$, $b_t$ and second independent factor $b^*_t = b_t / a_t^p$.
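
The second factor shown in Figure 4(b) comes from regressing $\ln b_t$ on $\ln a_t$ and dividing out the fitted power of $a_t$. The following is a minimal sketch of that step, assuming an ordinary least-squares fit with an intercept. The function name `orthogonalise_factors` and the synthetic input series are hypothetical and stand in for the fitted frontier parameters, which are not reproduced here.

```python
import numpy as np


def orthogonalise_factors(a, b):
    """Return (p, a_star, b_star) with a_star = a and b_star = b / a**p.

    p is the OLS slope of ln b on ln a (intercept included), so that
    ln b_star has zero in-sample correlation with ln a.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    p, _intercept = np.polyfit(np.log(a), np.log(b), 1)  # [slope, intercept]
    return p, a, b / a**p


# Synthetic, purely illustrative series (not the paper's data).
rng = np.random.default_rng(1)
ln_a = rng.normal(-3.0, 0.3, size=200)
ln_b = -1.0 + 0.66 * ln_a + rng.normal(0.0, 0.1, size=200)
p_hat, a_star, b_star = orthogonalise_factors(np.exp(ln_a), np.exp(ln_b))
residual_corr = np.corrcoef(np.log(a_star), np.log(b_star))[0, 1]
```

Note that this construction only guarantees that the log factors are uncorrelated in-sample, which is a weaker property than statistical independence.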