No arbitrage SVI
Claude Martini ∗ and Arianna Mingone †
Zeliade Systems, 56 rue Jean-Jacques Rousseau, Paris, France
Università degli Studi di Udine, Udine, Italy
May 8, 2020
Abstract
We fully characterize the absence of Butterfly arbitrage in the SVI formula for implied total variance proposed by Gatheral in 2004. The main ingredient is an intermediary characterization of the necessary condition for no arbitrage obtained for any model by Fukasawa in 2012: the inverse functions of the $-d_1$ and $-d_2$ of the Black-Scholes formula, viewed as functions of the log-forward moneyness, should be increasing. A natural rescaling of the SVI parameters and a meticulous analysis of the Durrleman condition then yield simple range conditions on the parameters. This leads to a straightforward implementation of a least-squares calibration algorithm on the no arbitrage domain, which yields an excellent fit on the market data we used for our tests, with the guarantee to yield smiles with no Butterfly arbitrage.

Jim Gatheral proposed in 2004 the following
Stochastic Volatility Inspired (SVI) formula for the implied total variance (meaning: the square of the implied volatility times the time to maturity):

$$SVI(k) = a + b\left(\rho(k-m) + \sqrt{(k-m)^2 + \sigma^2}\right)$$

where $k$ is the log-forward moneyness, and $(a, b, \rho, m, \sigma)$ are parameters.

This formula quickly became the benchmark, at least on Equity markets, due to its ability to produce very good fits. Fabien Le Floch (head of research at Calypso) has a blog article on a situation where SVI does not fit, which is a good indicator of how rare such a situation is in practice. The practitioner literature on SVI and its variants is plentiful ([3], [1], [17], [15], [8]), and SVI is now part of every reference textbook on volatility models ([21], [18]).

In 2009, the whitepaper on the Quasi-explicit calibration of Gatheral's SVI ([6], also part of Stefano De Marco's PhD thesis) proposed a simple trick to disambiguate the calibration of SVI, and became itself a reference calibration algorithm.

SVI has been extended to surfaces by Gatheral and Jacquier in the seminal paper [9] (the SSVI model), which provides the first explicit family of implied volatility surfaces with explicit and tractable no arbitrage conditions, both for the Butterfly and Calendar Spread arbitrages. SSVI has been extended further in [14] to other smile shapes, and in [4] to correlation parameters that are functions of the time-to-maturity. A quick and robust calibration algorithm for the latter is provided in [19].

SSVI smiles (at a fixed time to maturity) are a subset of SVI smiles with 3 parameters instead of 5, and so, for them, an explicit sufficient condition for no (Butterfly) arbitrage is available.

A remarkable fact is that, despite the simplicity of the formula, the no Butterfly arbitrage conditions for an SVI smile have remained too intricate up to now. So, for instance, in the algorithm of [6] there is no guarantee that the calibrated parameters will be arbitrage-free.
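As a small illustration of the SVI formula recalled at the start of this introduction, here is a minimal Python sketch. The function name `svi_total_variance` and the sample parameter values are illustrative assumptions and are not taken from the paper.

```python
import math

def svi_total_variance(k, a, b, rho, m, sigma):
    """Raw SVI total variance: a + b*(rho*(k - m) + sqrt((k - m)^2 + sigma^2))."""
    x = k - m
    return a + b * (rho * x + math.sqrt(x * x + sigma * sigma))

# Hypothetical parameters, chosen only for illustration.
params = dict(a=0.04, b=0.4, rho=-0.3, m=0.1, sigma=0.2)
for k in (-0.5, 0.0, 0.1, 0.5):
    w = svi_total_variance(k, **params)
    print(f"k={k:+.2f}  total variance={w:.6f}")
```

At $k = m$ the formula reduces to $a + b\sigma$, which gives a quick sanity check of any implementation.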
An interesting practical approach is provided in [20], where the no arbitrage constraints are expressed as a discretized set of Durrleman conditions and encoded as non-linear constraints in the optimizer; stricto sensu, there is no guarantee there either that the calibrated parameters will be arbitrage-free. In this paper, we solve this long-standing issue.

We start in section 2 with a precise discussion of the meaning of no arbitrage for a volatility smile, which is based on [12] for the no arbitrage properties of Call price functions, and on [16] and [10] for the corresponding statements in terms of volatility.

The first ingredient (section 3) is then to use a naturally rescaled version of SVI, which lends itself better to calculations: we work with the rescaled parameters $\alpha, \mu$ where $a = \sigma\alpha$ and $m = \sigma\mu$, and the dummy variable $l = \frac{k-m}{\sigma}$ instead of $k$. The second ingredient (section 4), which is the key one, is to work out the conditions obtained by Fukasawa in [10], that the inverse functions of the $-d_1$ and $-d_2$ coefficients of the Black-Scholes formula should be increasing. Those conditions are necessary for no Butterfly arbitrage, and are almost universal. It turns out that the Fukasawa conditions for SVI do not involve $\sigma$. An interesting property of the Fukasawa conditions is that they provide the positivity of the 1st term of the Durrleman condition; based on the fact that in our case the complementing 2nd term reads $\frac{1}{2\sigma} G_2(l)$ where $G_2$ does not depend on $\sigma$, ensuring the Durrleman condition yields a simple condition on $\sigma$ (section 5).

It should be noted that we do not impose the condition $a \ge 0$, as is often done without justification; we work out the necessary and sufficient conditions in the full domain of the SVI parameters.

At this stage, we have made fully explicit the domain of the SVI parameters for which no Butterfly arbitrage holds. It is straightforward to code, resorting to numerical root-finding routines (like the Brent algorithm) for the computation of the thresholds we characterized in our computations. There are then 2 byproducts of this parametrization of the domain of high practical interest:

• a quick check routine that a given SVI parameter lies in the domain or not, which distinguishes between 4 possible situations of arbitrage;
• a calibration algorithm, using any least-squares type objective function and a minimizer able to handle bounds.

We provide in the last section (section 6) numerical tests performed on data on index options purchased from the CBOE.

SVI models a volatility smile, not a volatility surface, so without ambiguity when we use the no arbitrage wording for SVI, we mean the absence of Butterfly arbitrage.

We thank Antoine Jacquier and Stefano De Marco for useful discussions and comments.

2.1 Domain of SVI parameters

The SVI model
$$SVI(k) = a + b\left(\rho(k-m) + \sqrt{(k-m)^2 + \sigma^2}\right)$$

is defined when $a, m \in \mathbb{R}$, $b \ge 0$, $-1 \le \rho \le 1$ and $\sigma \ge 0$. It is a convex function with minimum value $a + b\sigma\sqrt{1-\rho^2}$ (possibly attained at infinity if $\rho = \pm 1$) and which goes to infinity as $k$ goes to $\pm\infty$ (for $\rho \ne \pm 1$). A non-negative total variance requires $a + b\sigma\sqrt{1-\rho^2} \ge 0$. If $\rho = -1$ the SVI smile decreases from $\infty$ to $a$, and if $\rho = +1$ the SVI smile increases from $a$ to $\infty$.

Let $S$ denote the underlying value of standard Call options with a fixed maturity $t > 0$. Without loss of generality we assume that there are no interest rates nor dividend rates. In the case of deterministic interest rates $r$ and dividend rates $\delta$, all the statements in this section still hold once $S$ is replaced by the Forward corresponding to the option maturity, $F_t = S \exp\left(\int_0^t (r_s - \delta_s)\,ds\right)$, and working with the numéraire of the option maturity.

The condition of no Butterfly arbitrage is achieved when the Call price function with respect to the strike is (we follow the very careful treatment in [12]):

1. convex;
2. non-increasing;
3. with value in the range $[(S-K)^+, S]$.

These properties assume only that there is a perfect market for the underlying and for the Call options, with short-selling allowed, and that there is no static buy-sell strategy involving the underlying and a finite set of Call options with a Profit and Loss which is strictly positive.

We recall in particular that the large moneyness behaviour stating that the Call price function should go to zero at $+\infty$ is an additional assumption, and does not strictly follow from the no-arbitrage axiom.

In the case of a Call price function specified through an implied volatility, $C(K) = BS(k, \sqrt{w(k)})$, where $w$ is the implied total variance $\sigma^2 t$ and $BS(k, a)$ is the Black-Scholes formula expressed as a function of the log-forward moneyness $k = \log\frac{K}{S}$ and the implied total volatility $a$, the 3rd property is automatically granted since the BS function is increasing with respect to its 2nd argument and since the range bounds correspond to the limits when $a$ goes to 0 and $\infty$.

Observe now that if the 3rd property is satisfied, then the 1st one implies the 2nd one, since an increasing convex function cannot be bounded.

Can a volatility smile reach 0 at some (finite) point? Assume that it is the case, so $w(k_m) = 0$ at the log-forward moneyness $k_m$ corresponding to some strike $K_m$. Then the Call price with this strike is equal to its intrinsic value $(S-K_m)^+$.
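The pricing map $C(K) = BS(k, \sqrt{w(k)})$ used above, together with its range bounds and its zero-variance limit, can be illustrated numerically. The following is a minimal Python sketch under the zero-rate convention of this section, with the underlying normalized to $S = 1$ by default; the names `norm_cdf` and `bs_call` are illustrative assumptions.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(k, total_vol, spot=1.0):
    """Black-Scholes Call price with zero rates and strike K = spot * exp(k).

    total_vol is the implied total volatility sqrt(w). When total_vol is 0
    the price is the intrinsic value (S - K)^+.
    """
    strike = spot * math.exp(k)
    if total_vol <= 0.0:
        return max(spot - strike, 0.0)
    d1 = -k / total_vol + 0.5 * total_vol
    d2 = d1 - total_vol
    return spot * norm_cdf(d1) - strike * norm_cdf(d2)

# The price increases with the total volatility and stays in [(S - K)^+, S].
k = -0.1
prices = [bs_call(k, v) for v in (0.0, 0.1, 0.5, 2.0, 10.0)]
print(prices)
```

The first entry of `prices` is the intrinsic value $1 - e^{-0.1}$, which is the situation $w(k_m) = 0$ discussed in the text.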
If $K_m$ lies on the right of $S$, the price is therefore 0, and by property 2 above all the Call prices with $K > K_m$ will also be 0. If $K_m$ lies on the left of $S$, the option price is $S - K_m$; since the option price with a strike 0 is equal to $S = S - 0$, the convexity property implies that all the Call prices with $K < K_m$ are smaller than $S - K$, which is the value of the chord between the points 0 and $K_m$. Since this value $S - K$ is also a lower bound for the Call prices, they are necessarily equal to this value. So, in the implied volatility space, this means that $w = 0$ for $K \ge K_m$ in the 1st case, and $w = 0$ for $K \le K_m$ in the second case.

This means that no arbitrage implies that smiles reaching 0 above (respectively below) the At-The-Money (forward) point will vanish above (respectively below) this point. In the case of SVI, smiles reach zero at most at a single strike, and only if $a + b\sigma\sqrt{1-\rho^2} = 0$ and $\rho \notin \{-1, 1\}$, in which case they are strictly positive for other strike values, and there is a Butterfly arbitrage. So we can discard this case and assume $a + b\sigma\sqrt{1-\rho^2} > 0$ when $\rho \notin \{-1, 1\}$.

At this stage we know that SVI smiles with no Butterfly arbitrage are positive, and that the 3rd property above is automatically satisfied. So there is no Butterfly arbitrage if and only if the 1st property holds. Now for positive smiles, as recalled in [9] after Lemma 2.2, with $w(k) = SVI(k)$:

$$p(K) := \frac{\partial^2 C}{\partial K^2}\Big|_{K = S e^k} = \frac{\partial^2 BS(k, \sqrt{w(k)})}{\partial K^2}\Big|_{K = S e^k} = \frac{g(k)}{S e^k \sqrt{2\pi w(k)}} \exp\left(-\frac{d_2(k)^2}{2}\right)$$

where $d_2$ is the standard coefficient of the Black-Scholes formula:

$$d_{1,2}(k) = -\frac{k}{\sqrt{w(k)}} \pm \frac{\sqrt{w(k)}}{2}.$$

So convexity is equivalent to requiring the function $g(k)$ ([9], equation 2.1)

$$g(k) := \left(1 - \frac{k\,SVI'(k)}{2\,SVI(k)}\right)^2 - \frac{SVI'(k)^2}{4}\left(\frac{1}{SVI(k)} + \frac{1}{4}\right) + \frac{SVI''(k)}{2}$$

to be non-negative, which is usually called the Durrleman condition (cf. Theorem 2.9, condition (IV
3) of [16]).

Note that the first derivative of the Call function with respect to the strike necessarily goes to zero as $K$ goes to $\infty$, and to a finite limit between $-1$ and 0 as $K$ goes to 0, which means that the total mass of $p$ is at most one, but not necessarily one, meaning that there could be a non-zero mass at zero. It will sum to one if and only if the limit is $-1$; in this case, $p$ can be interpreted as a probability measure; the expectation of the underlying under this measure will be strictly less than the underlying value, unless the additional property that the Call price vanishes at infinity holds, in which case it will be exactly the underlying value (cf. Theorem 2.1.2 of [12]).

The above discussion can be translated into properties of the smile: we know from Theorem 2.9 in [16] that the large moneyness behaviour is one-to-one with the fact that $d_1(k)$ goes to $-\infty$ at infinity:

$$\lim_{k \to \infty} d_1(k) = -\infty$$

The fact that there is no mass at zero, or, equivalently, that the derivative of the Call price with respect to the strike goes to $-1$ as the strike goes to 0, is equivalent to:

$$\lim_{k \to -\infty} d_2(k) = +\infty$$

In the case of SVI, the 1st condition translates to $b(\rho+1) < 2$ and the second one to $b(\rho-1) > -2$.

Proposition 2.1 (No Butterfly arbitrage criterion for SVI). A necessary condition for no Butterfly arbitrage to hold in SVI is that
$SVI(k) > 0$ for all $k$. Under this condition, let

$$g(k) := \left(1 - \frac{k\,SVI'(k)}{2\,SVI(k)}\right)^2 - \frac{SVI'(k)^2}{4}\left(\frac{1}{SVI(k)} + \frac{1}{4}\right) + \frac{SVI''(k)}{2}.$$

Then there is no Butterfly arbitrage in SVI if and only if $g$ is non-negative. In this case, the formula

$$p(K) := \frac{g(k)}{S e^k \sqrt{2\pi\,SVI(k)}} \exp\left(-\frac{d_2(k)^2}{2}\right)$$

where $K = S e^k$, $S$ being the underlying value, and $d_2(k) = -\frac{k}{\sqrt{SVI(k)}} - \frac{\sqrt{SVI(k)}}{2}$, defines a positive measure on $\mathbb{R}_+$ such that $p(\mathbb{R}_+) \le 1$.

Moreover, the Call prices in SVI go to zero when the strike goes to infinity if and only if $b(\rho+1) < 2$, and the derivative of the Call price (expressed in the numéraire of the maturity) with respect to the strike goes to $-1$ as the strike goes to 0 if and only if $b(\rho-1) > -2$. In the first case $\int x\,p(x)\,dx = S$ and in the second case $p(\mathbb{R}_+) = 1$.

Note that the two conditions $b(\rho+1) = 2$ and $b(\rho-1) = -2$ hold simultaneously if and only if $b = 2$ and $\rho = 0$.

We rescale SVI in the following way:
$$SVI(k) = \alpha\sigma + b\sigma\left(\rho\,\frac{k-m}{\sigma} + \sqrt{\left(\frac{k-m}{\sigma}\right)^2 + 1}\right) = \sigma N\left(\frac{k-m}{\sigma}\right)$$

with $\alpha := a/\sigma$ and $N(l) := \alpha + b\left(\rho l + \sqrt{l^2+1}\right)$. With this rewriting, the derivatives of the SVI model become

$$SVI'(k) = N'\left(\frac{k-m}{\sigma}\right), \qquad SVI''(k) = \frac{1}{\sigma} N''\left(\frac{k-m}{\sigma}\right).$$

Observe that the second derivative $N''$ is positive, so $N$ is strictly convex. Its only critical point is a minimum that we call $l^* = -\frac{\rho}{\sqrt{1-\rho^2}}$. We gather the important properties of $N$ in the following:

Lemma 3.1 (Normalized SVI). Let $N(l) := \alpha + b\left(\rho l + \sqrt{l^2+1}\right)$ where $a = \alpha\sigma$. Then $N$ is strictly convex with a minimum at $l^* = -\frac{\rho}{\sqrt{1-\rho^2}}$, where $N(l^*) = \alpha + b\sqrt{1-\rho^2}$. Also:

$$N'(l) = b\left(\rho + \frac{l}{\sqrt{l^2+1}}\right), \qquad N''(l) = \frac{b}{(l^2+1)^{3/2}}.$$

In particular as $l \to \pm\infty$:

$$N(l) \sim \alpha + b(\rho \pm 1)\,l, \qquad N'(l) \to b(\rho \pm 1), \qquad N''(l) \to 0,$$

and $\forall k$, $SVI(k) = \sigma N\left(\frac{k-m}{\sigma}\right)$.

In the sequel we will also put $m = \mu\sigma$, so that $k = \sigma(l + \mu)$ and

$$SVI_{a,b,\rho,m,\sigma}(k) = \sigma N_{\alpha,b,\rho}\left(\frac{k}{\sigma} - \mu\right)$$

where the parameters have the following constraints:

$$b \ge 0, \qquad |\rho| < 1, \qquad \mu \in \mathbb{R}, \qquad \sigma \ge 0, \qquad \alpha + b\sqrt{1-\rho^2} > 0.$$

In the sequel, to avoid singularities in our computations, we will:

• assume $b$ positive since the case $b = 0$ is the Black-Scholes case, which is a trivial case of no arbitrage;
• exclude the boundary cases $\rho = \pm 1$. We revisit those boundary cases in subsection 5.3.

With our notations, we have

$$g(k) = \left(1 - \frac{k\,N'\left(\frac{k}{\sigma}-\mu\right)}{2\sigma N\left(\frac{k}{\sigma}-\mu\right)}\right)^2 - \frac{N'\left(\frac{k}{\sigma}-\mu\right)^2}{4}\left(\frac{1}{\sigma N\left(\frac{k}{\sigma}-\mu\right)} + \frac{1}{4}\right) + \frac{N''\left(\frac{k}{\sigma}-\mu\right)}{2\sigma}$$

and writing $G(l) := g(\sigma(l+\mu))$ we find

$$G(l) = \left(1 - \frac{(l+\mu)N'(l)}{2N(l)}\right)^2 - \frac{N'(l)^2}{4}\left(\frac{1}{\sigma N(l)} + \frac{1}{4}\right) + \frac{N''(l)}{2\sigma}.$$

We can rewrite $G$ as

$$G(l) = \left(1 - N'(l)\left(\frac{l+\mu}{2N(l)} + \frac{1}{4}\right)\right)\left(1 - N'(l)\left(\frac{l+\mu}{2N(l)} - \frac{1}{4}\right)\right) + \frac{1}{2\sigma}\left(N''(l) - \frac{N'(l)^2}{2N(l)}\right) = G_1(l) + \frac{1}{2\sigma} G_2(l)$$

where

$$G_1(l) := \left(1 - N'(l)\left(\frac{l+\mu}{2N(l)} + \frac{1}{4}\right)\right)\left(1 - N'(l)\left(\frac{l+\mu}{2N(l)} - \frac{1}{4}\right)\right), \qquad G_2(l) := N''(l) - \frac{N'(l)^2}{2N(l)}.$$

It will be instrumental in the sequel to observe that:

• $G_1$ depends only on $\alpha, b, \rho, \mu$,
• $G_2$ depends only on $\alpha, b, \rho$,

so that the dependency of $G$ in $\sigma$ is particularly simple; this is the main benefit of our rescaling of SVI.

$G_1$ and the Fukasawa necessary condition for no Butterfly arbitrage

Fukasawa proved in [10] the beautiful result that if a total variance smile $w$, expressed as a function of the log-forward moneyness, has no Butterfly arbitrage, then the two functions $f_1$ and $f_2$ given by the opposite of the $d_1$ and $d_2$ of the Black-Scholes formula:

$$f_{1,2}(k) = \frac{k}{\sqrt{w(k)}} \mp \frac{\sqrt{w(k)}}{2}$$

are necessarily increasing, so that $f'_{1,2} \ge 0$.

How does this relate to $G_1$ and $G_2$? There is a nice expression of $g$ involving the functions $f_{1,2}$; indeed as shown e.g. in [11] (Eq. 55 p.
25):

$$\frac{\partial^2 C}{\partial K^2}\Big|_{K = S e^k} = \frac{\phi(f_2(k))\left(f_1'(k) f_2'(k) \sqrt{w(k)} + (\sqrt{w})''(k)\right)}{S e^k}$$

where $\phi$ is the standard Gaussian density. By identification this yields

$$g(k) = \left(f_1'(k) f_2'(k) \sqrt{w(k)} + (\sqrt{w})''(k)\right)\sqrt{w(k)}$$

to be compared with $g(k) = G(l) = G_1(l) + \frac{1}{2\sigma} G_2(l)$.

With our notations we have for SVI $w(k) = \sigma N\left(\frac{k}{\sigma} - \mu\right)$, which yields

$$f'_{1,2}(k) = \frac{1}{\sqrt{\sigma N\left(\frac{k}{\sigma}-\mu\right)}}\left(1 - N'\left(\frac{k}{\sigma}-\mu\right)\left(\frac{k}{2\sigma N\left(\frac{k}{\sigma}-\mu\right)} \pm \frac{1}{4}\right)\right).$$

Recall that

$$G_1(l) = \left(1 - N'(l)\left(\frac{l+\mu}{2N(l)} + \frac{1}{4}\right)\right)\left(1 - N'(l)\left(\frac{l+\mu}{2N(l)} - \frac{1}{4}\right)\right),$$

which depends on $(\alpha, b, \rho)$, and on $\mu$. Call $G_+$ the first factor of $G_1$ and $G_-$ the second one. Then $f'_{1,2}(\sigma(l+\mu)) = \frac{G_\pm(l)}{\sqrt{\sigma N(l)}}$ and the Fukasawa conditions correspond to $G_\pm \ge 0$, which entails that $G_1 \ge 0$. Moreover, $G_1\left(\frac{k}{\sigma}-\mu\right) = f_1'(k) f_2'(k)\, w(k)$ and $\frac{1}{2\sigma} G_2\left(\frac{k}{\sigma}-\mu\right) = (\sqrt{w})''(k)\sqrt{w(k)}$.

We have the following:

Lemma 4.1 (Limits of $G_1$).

$$\lim_{l \to \pm\infty} G_1(l) = \left(\frac{1}{2} - \frac{b(\rho \pm 1)}{4}\right)\left(\frac{1}{2} + \frac{b(\rho \pm 1)}{4}\right).$$

In particular, $G_1(\infty) \ge 0$ and $G_1(-\infty) \ge 0$ iff simultaneously

$$b(\rho+1) \le 2, \qquad b(\rho-1) \ge -2.$$

These conditions are conditions on the asymptotic slopes of the total variance smile, and are therefore related to the Roger Lee Moment formula [5]; this is a general fact for the Fukasawa conditions: [10] contains several asymptotic statements on $f_1$ and $f_2$ which are directly related to the asymptotic behavior of $\frac{w(k)}{k}$.

Let us investigate the corresponding Fukasawa conditions of positivity of $G_+$ and $G_-$ in terms of SVI parameters. We start with the following

Lemma 4.2.
Let L ± ( l ; α, b, ρ ) := 2 N ( l ) (cid:16) N (cid:48) ( l ) ∓ (cid:17) − l (1) defined on ] l ∗ , + ∞ [ and ] − ∞ , l ∗ [ respectively.Then G ± > if and only if sup l
0, we need sup l
In order to alleviate the notations, we will often supress the list of parameters in L ± , or whenneed it just denote the dependency in α , ( b, ρ ) being fixed. What are the basic properties of L − and L + ?Note that L − ( l ∗− ) = −∞ and under b ( ρ − > −
2, then L − ( −∞ ) = −∞ . It follows that l − such that L − ( l − ) = sup l
1) = − L − ( −∞ ) = − α whilewhen b ( ρ + 1) = 2 then L + (+ ∞ ) = α . Indeed at infinity L − behaves as 2 α (cid:16) b ( ρ − + (cid:17) + b ( ρ − l while L + as 2 α (cid:16) b ( ρ +1) − (cid:17) + − b ( ρ +1)2 l . In these cases the supremum of L − (or the infimum of L + ), could bereached at −∞ (or + ∞ ).Experiments show that not every choice of ( α, b, ρ ) leads to L − ( l ) < − (cid:15) < l < l ∗ and L + ( l ) > (cid:15) > l > l ∗ , so the interval for µ could be empty: for example, for α = − . , b = 1 and ρ = 0 .
5, we have L − ( l − ) > L + ( l + ). This suggests that the situation is intricate; we show below that when α ≥
0, the interval is non-empty.

4.1.3 The case $\alpha \ge 0$

When $\alpha \ge 0$, we can indeed demonstrate that the interval for $\mu$ is non-empty, with the following easy argument: $L_-$ is negative for $l < l^*$ iff $\frac{N}{2N'}(4 + N') - l$ is negative. In this domain $N'$ is negative, so the previous condition is equivalent to asking $N(4 + N') - 2lN' > 0$, or equivalently $2(N - lN') + N(2 + N') > 0$. Let us consider the first term. We have $N - lN' = \alpha + b\sqrt{l^2+1} - \frac{bl^2}{\sqrt{l^2+1}}$, which is greater than 0 iff, multiplying by $\sqrt{l^2+1}$, also $\alpha\sqrt{l^2+1} + b > 0$, or $\alpha > -\frac{b}{\sqrt{l^2+1}}$. This holds for $\alpha \ge 0$. The right-hand side is negative everywhere except at $-\infty$ where it equals 0, so this proof cannot handle the case $\alpha < 0$. The second term is positive iff $2 + N' > 0$. Since $N' > b(\rho-1)$, we have $2 + N' > 2 + b(\rho-1) \ge 0$. So $L_-$ is always strictly negative for $l < l^*$ and $\alpha \ge 0$.

Similarly, $L_+$ is positive for $l > l^*$ iff $2(N - lN') + N(2 - N') > 0$. With the same arguments as before we obtain that $L_+$ is strictly positive for $l > l^*$ and $\alpha \ge 0$.

When $b(\rho-1) > -2$ and $b(\rho+1) < 2$, $L_-(-\infty) = -\infty$ and $L_+(\infty) = \infty$, and it follows that the interval $I$ is non-empty.

When b ( ρ −
1) = − b ( ρ + 1) = 2 this result is still valid. Since in such case L − ( −∞ ) = − α and L + (+ ∞ ) = α respectively, then L − is negative in [ −∞ , l ∗ [ while L + is positive in ] l ∗ , + ∞ ] for α > α = 0, sup l
2; we deal with the case $b(1-\rho) = 2$ or $b(1+\rho) = 2$ in the dedicated subsection 4.2.3.

We consider the function $L_-$ for $l < l^*$ and $L_+$ for $l > l^*$. We have $L'_\pm(l) = 1 \mp \frac{N'}{2} - \frac{2NN''}{N'^2}$ and it follows that $L'_-(l_-) = L'_+(l_+) = 0$.

The corresponding equations in $l$ are:

$$1 \mp \frac{b}{2}\left(\rho + \frac{l}{\sqrt{l^2+1}}\right) - \frac{2\left(\alpha + b\left(\rho l + \sqrt{l^2+1}\right)\right)}{b\sqrt{l^2+1}\left(\rho\sqrt{l^2+1} + l\right)^2} = 0.$$

Actually, we do not need to solve these equations in $l$: they are linear in $\alpha$, so they can be solved for $\alpha$ instead. Accordingly, we set:

$$g^\pm_{(b,\rho)}(l) = \frac{\left(\rho\sqrt{l^2+1} + l\right)^2}{2}\left(\sqrt{l^2+1}\left(1 \mp \frac{b\rho}{2}\right) \mp \frac{bl}{2}\right) - \left(\rho l + \sqrt{l^2+1}\right) \qquad (2)$$

defined on $[l^*, \infty[$ and $]-\infty, l^*]$ respectively. Then $L'_\pm(l) = 0$ iff $g^\pm_{(b,\rho)}(l) = \frac{\alpha}{b}$.

The following technical result turns out to be a key one:

Proposition 4.5.
Assume b (1 ± ρ ) < , and let g ± ( b,ρ ) defined by (2) and L ± defined by (1). Then: g ± ( b,ρ ) ( l ∗ ) = − (cid:112) − ρ , g ± ( b,ρ ) ( ±∞ ) = ∞ , and g ± ( b,ρ ) is either monotonous or with a single minimum m ± (cid:54) = l ∗ ; • in the latter case, let r ± (cid:54) = l ∗ such that g ± ( b,ρ ) ( r ± ) = − (cid:112) − ρ . Let s ± = r ± if g ± ( b,ρ ) has a minimum m ± , and s ± = l ∗ otherwise.Then: • L − ( l − ; bg − ( b,ρ ) ( l − )) = sup l
Let $b, \rho$ be fixed, such that $b(1 \pm \rho) < 2$. Let $\alpha > -b\sqrt{1-\rho^2}$. There is a unique $(l_-, l_+)$ such that $l_- < s_-$, $l_+ > s_+$ and $\alpha = b\,g^-_{(b,\rho)}(l_-) = b\,g^+_{(b,\rho)}(l_+)$, and the interval $I_{\alpha,b,\rho}$ for $\mu$ is non-empty iff $L_-\left(l_-; b\,g^-_{(b,\rho)}(l_-)\right) < L_+\left(l_+; b\,g^+_{(b,\rho)}(l_+)\right)$.

Corollary 4.7 (SVI Fukasawa threshold). Let $b, \rho$ be fixed, such that $b(1 \pm \rho) < 2$. The distance between $L_+(l_+; \alpha)$ and $L_-(l_-; \alpha)$, where $\alpha = b\,g^+_{(b,\rho)}(l_+) = b\,g^-_{(b,\rho)}(l_-)$, increases with $\alpha$. Let $F(b, \rho)$ denote the unique value of $\alpha$ such that $L_+(l_+; \alpha) = L_-(l_-; \alpha)$ if there exists such a value for $\alpha > -b\sqrt{1-\rho^2}$, otherwise let $F(b, \rho) = -b\sqrt{1-\rho^2}$. Then $L_+(l_+; \alpha) > L_-(l_-; \alpha)$ if and only if $\alpha > F(b, \rho)$. We name $F$ the Fukasawa threshold of SVI.

Proof. This follows directly from the previous analysis: increasing $\alpha$, the functions $g^\pm_{(b,\rho)}$ increase, so the corresponding $l_- < s_-$ decreases while $l_+ > s_+$ increases. In turn, the function $L_+\left(l_+; b\,g^+_{(b,\rho)}(l_+)\right)$ increases and the function $L_-\left(l_-; b\,g^-_{(b,\rho)}(l_-)\right)$ decreases. Note that $l_- < s_-$ and $l_+ > s_+$ because $\alpha > -b\sqrt{1-\rho^2}$ from the no arbitrage condition on the parameters.

Alternatively, we can demonstrate it by noting that

$$\frac{d}{d\alpha}\left(L_+(l_+; \alpha) - L_-(l_-; \alpha)\right) = L'_+(l_+)\frac{dl_+}{d\alpha} - L'_-(l_-)\frac{dl_-}{d\alpha} + \partial_\alpha L_+(l_+; \alpha) - \partial_\alpha L_-(l_-; \alpha) = \partial_\alpha L_+(l_+; \alpha) - \partial_\alpha L_-(l_-; \alpha)$$

where $l_+$ and $l_-$ are functions of $\alpha$ given by $\alpha = b\,g^+_{(b,\rho)}(l_+) = b\,g^-_{(b,\rho)}(l_-)$.
Now, the RHS is equal to2 (cid:16) N (cid:48) ( l + ) − N (cid:48) ( l − ) − (cid:17) and since N (cid:48) ( l + ) > and − N (cid:48) ( l − ) > , the previous quantity is greater than 1.The following graph displays: • in blue the function l − → L − ( l − ; bg − ( b,ρ ) ( l − )) with l − < r − where r − is such that g − ( b,ρ ) ( r − ) = − (cid:112) − ρ ; • in green the corresponding value of l − → L + ( l + ( bg − ( b,ρ ) ( l − )); bg − ( b,ρ ) ( l − )).We set b = and ρ = .The following corollary gives an easy criterion of an existence of a Butterfly arbitrage: Corollary 4.8. If α ≤ F ( b, ρ ) then for every choice of µ and σ , the SVI model does not satisfy the Fukasawaconditions. Since L + ( l + ; bg +( b,ρ ) ( l + )) − L − ( l − ; bg − ( b,ρ ) ( l − )) goes to infinity when increasing α = bg +( b,ρ ) ( l + ) = bg − ( b,ρ ) ( l − ) to infinity, then there exists ¯ α such that the interval for µ in non-empty and from the pre-vious corollaries for each α > ¯ α this still holds. Decreasing α , we could bump into two situations: • α reaches the value F ( b, ρ ) > − b (cid:112) − ρ for which L + ( l + ; F ( b, ρ )) = L − ( l − ; F ( b, ρ )); • α reaches the value F ( b, ρ ) = − b (cid:112) − ρ . In such case l ± = s ± .Our simulations suggest that the first scenario always occurs.Could we prove this? In this respect we can observe the following: it is equivalent to prove that L + ( s + ; − b (cid:112) − ρ ) < L − ( s − ; − b (cid:112) − ρ ). 11f s + = l ∗ then L + ( s + ; − b (cid:112) − ρ ) = − l ∗ and the function L + ( l + ; bg +( b,ρ ) ( l + )) is increasing. It followsthat the function L − ( l − ; bg − ( b,ρ ) ( l − )) cannot be increasing and s − = r − < l ∗ . We should show that L − ( s − ; − b (cid:112) − ρ ) > − l ∗ .When s − = l ∗ then L − ( s − ; − b (cid:112) − ρ ) = − l ∗ and the function L − ( l − ; bg − ( b,ρ ) ( l − )) is increasing. Again,the function L + ( l + ; bg +( b,ρ ) ( l + )) cannot be increasing so s + = r + > l ∗ . 
In this case we should prove that $L_+\left(s_+; -b\sqrt{1-\rho^2}\right) < -l^*$.

In the final case when $s_\pm = r_\pm$ it is enough to prove $L_-\left(s_-; -b\sqrt{1-\rho^2}\right) > -l^*$ and $L_+\left(s_+; -b\sqrt{1-\rho^2}\right) < -l^*$.

So to sum up, it would remain to prove that $L_-\left(r_-; -b\sqrt{1-\rho^2}\right) > -l^*$ and $L_+\left(r_+; -b\sqrt{1-\rho^2}\right) < -l^*$ to obtain the result in each case. We did not manage to conclude along those lines though.

Remark 4.9.
We don’t know whether F ( b, ρ ) > − b (cid:112) − ρ but we conjecture it. Indeed we prove in Annex8 that there is a closed formula for F ( b, which satisfies F ( b, > − b ; the statement F ( b, ρ ) > − b (cid:112) − ρ can be also assessed numerically. So in the sequel we will assume that it is indeed the case. We can exploit the symmetry property of N with respect to ρ in order to restrict the required computationto the function L − only.Indeed N ( l ; α, b, ρ ) = N ( − l ; α, b, − ρ ) , N (cid:48) ( l ; b, ρ ) = − N (cid:48) ( − l ; b, − ρ ) and N (cid:48)(cid:48) ( l ; b ) = N (cid:48)(cid:48) ( − l ; b ). This bringsto the consideration that L − ( l ; α, b, ρ ) = − L + ( − l ; α, b, − ρ ) , L + ( l ; α, b, ρ ) = − L − ( − l ; α, b, − ρ ) . Theninf l>l ∗ ( ρ ) L + ( l ; α, b, ρ ) = − sup l>l ∗ ( ρ ) L − ( − l ; α, b, − ρ ) = − sup l< − l ∗ ( ρ ) L − ( l ; α, b, − ρ ) = − sup l
With the previous notations, • L + ( l + ( α, b, ρ ); α, b, ρ ) = − L − ( l − ( α, b, − ρ ); α, b, − ρ ) ; • l + ( α, b, ρ ) = − l − ( α, b, − ρ ) ; • I α,b,ρ = (cid:3) L − ( l − ( α, b, ρ ); α, b, ρ ) , − L − ( l − ( α, b, − ρ ); α, b, − ρ ) (cid:2) . From the above equations we also have g +( b,ρ ) ( l ) = g − ( b, − ρ ) ( − l ) so with easy arguments s + ( b, ρ ) = − s − ( b, − ρ ). b (1 − ρ ) = 2 or b (1 + ρ ) = 2Assume now that b (1 − ρ ) = 2. Using the same definitions and following the proof in the previous section,we obtain that g − ( b,ρ ) ( l ) is increasing. Now since g − ( b,ρ ) is increasing on ] − ∞ , l ∗ ] and since g − ( b,ρ ) ( l ∗ ) =12 (cid:112) − ρ , it follows that there is no solution to the equation g − ( b,ρ ) ( l − ) = αb . In this case so, the supremumof L − is attained at −∞ and it is − α .If ρ = 0, then also b (1 + ρ ) = 2 so I α, , = (cid:3) − α , α (cid:2) is non-empty if and only if α > F (2 ,
0) = 0.If ρ (cid:54) = 0, then L + reaches its infimum in ] l ∗ , + ∞ [ and the Fukasawa threshold, if it exists, is the value of α such that L − ( l − ( α, b, − ρ ); α, b, − ρ ) = α where l − ( α, b, − ρ ) is such that g − ( b, − ρ ) ( l − ( α, b, − ρ )) = αb . From4.4, in this case F ( b, ρ ) < We can now state the full characterization of the Fukasawa necessary no arbitrage conditions for SVI:
Theorem 4.11 (SVI parameters ( α, b, ρ, µ, σ ) fulfilling Fukasawa necessary no arbitrage conditions) . Let L − ( l ; α, b, ρ ) as in (1) and g − ( b,ρ ) as in (2). • If b (1 ± ρ ) < : ◦ there exist a unique l − ( α, b, ρ ) < l ∗ and a unique l − ( α, b, − ρ ) < l ∗ such that g − ( b,ρ ) ( l − ( α, b, ρ )) = αb and g − ( b, − ρ ) ( l − ( α, b, − ρ )) = αb ; ◦ let F ( b, ρ ) denote the unique value for α > − b (cid:112) − ρ such that − L − ( l − ( α, b, − ρ ); α, b, − ρ ) = L − ( l − ( α, b, ρ ); α, b, ρ ) if there exists such a value, otherwise let F ( b, ρ ) = − b (cid:112) − ρ ; ◦ then F ( b, ρ ) < . The interval I α,b,ρ = (cid:3) L − ( l − ( α, b, ρ ); α, b, ρ ) , − L − ( l − ( α, b, − ρ ); α, b, − ρ ) (cid:2) isnon-empty iff α > F ( b, ρ ) . • If b (1 − ρ ) = 2 (or b (1 + ρ ) = 2 ) and ρ (cid:54) = 0 : ◦ there exists a unique l − ( α, b, − ρ ) < l ∗ (resp. l − ( α, b, ρ ) ) such that g − ( b, − ρ ) ( l − ) = αb (resp. g − ( b,ρ ) ( l − ) = αb ); ◦ let F ( b, ρ ) denote the unique value for α > − b (cid:112) − ρ such that L − ( l − ( α, b, − ρ ); α, b, − ρ ) = α (resp. L − ( l − ( α, b, ρ ); α, b, ρ ) = α ) if there exists such a value, otherwise let F ( b, ρ ) = − b (cid:112) − ρ ; ◦ then F ( b, ρ ) < . The interval I α,b,ρ = (cid:3) − α , − L − ( l − ( α, b, − ρ ); α, b, − ρ ) (cid:2) (resp. I α,b,ρ = (cid:3) L − ( l − ( α, b, ρ ); α, b, ρ ) , α (cid:2) ) is non-empty iff α > F ( b, ρ ) . • If b = 2 and ρ = 0 , then the interval I α, , = (cid:3) − α , α (cid:2) is non-empty iff α > F (2 ,
0) = 0 .In every case, the Fukasawa conditions are satisfied iff µ ∈ I α,b,ρ . Except for F (2 , F ( b, ρ ) negative holds even in the case F ( b, ρ ) > − b (cid:112) − ρ because wehave proven that for α ≥ µ is always non-empty.The previous theorem stated for the common SVI parameters would require aσ > F ( b, ρ ) and mσ ∈ I aσ ,b,ρ .Is the existence of the Fukasawa threshold surprising? We would say no: indeed the values of α too closeto the lower bound − b (cid:112) − ρ correspond to values of the smile too close to zero, and this will lead to anarbitrage as discussed in subsection 2.2, so that one even expects that F ( b, ρ ) > − b (cid:112) − ρ .The explanation of the range constraint for µ is less intuitive to us; we would say that it results fromthe geometrical constraint that the Fukasawa conditions impose on the shape of SVI, as follows from ourcomputations. 13 .2.5 Numerics F ( b, ρ ) at a fixed b . We plot below the Fukasawa threshold at fixed b = as a function of ρ .The graph is symmetric with respect to ρ because F ( b, ρ ) is the value of α suchthat L + ( l + ( α, b, ρ ); α, b, ρ ) − L − ( l − ( α, b, ρ ); α, b, ρ ) = 0, where bg ± ( b,ρ ) ( l ± ( α, b, ρ )) = α . But L + ( l + ( α, b, ρ ); α, b, ρ ) = − L − ( l − ( α, b, − ρ ); α, b, − ρ ) so we look for α such that L − ( l − ( α, b, − ρ ); α, b, − ρ ) + L − ( l − ( α, b, ρ ); α, b, ρ ) = 0and this is symmetric with respect to ρ .The red line is the level α = − b (cid:112) − ρ and it again confirms our hypothesis that F ( b, ρ ) > − b (cid:112) − ρ .From the previous graph, it seems that F ( b, ρ ) has monotonicity of the same sign as ρ . F ( b, ρ ) at fixed ρ as a function of b . Here we plot the Fukasawa threshold at fixed ρ = as a functionof b . 14 − ( l − ( F ( b, ρ ) , b, ρ ); F ( b, ρ ) , b, ρ ) and L − ( l − ( F ( b, ρ ) , b, − ρ ); F ( b, ρ ) , b, − ρ ) as functions of ρ . 
The following graph shows, at a fixed value of $b$, in blue the function $L_-(l_-(F(b,\rho),b,\rho);F(b,\rho),b,\rho)$ (denoted for brevity as $L_-(F(b,\rho),\rho)$) with respect to $\rho$, and in green the function $L_-(l_-(F(b,\rho),b,-\rho);F(b,\rho),b,-\rho)$ (or $L_-(F(b,\rho),-\rho)$) with respect to $\rho$.

This graph also shows in blue the value of the two bounds for $\mu$ when they shrink to one point. Note that for $\rho = 0$ this is 0 for every $b$, while it depends on $b$ for the other values of $\rho$.

The function $\rho \mapsto L_-(l_-(F(b,\rho),b,\rho);F(b,\rho),b,\rho)$ is odd due to the symmetry of $\rho \mapsto F(b,\rho)$. Furthermore, from the graph it seems that $\rho$ and $L_-(l_-(F(b,\rho),b,\rho);F(b,\rho),b,\rho)$ have the same sign.

$L_-(l_-(F(b,\rho),b,\rho);F(b,\rho),b,\rho)$ as a function of $b$. The following graph shows the function $L_-(l_-(F(b,\rho),b,\rho);F(b,\rho),b,\rho)$ (denoted as $L_-(F(b,\rho),\rho)$) with respect to $b$, at a fixed value of $\rho$.

We can parameterize the normalized SVI parameters satisfying the Fukasawa conditions as follows:
1. Choose $\rho \in ]-1,1[$ and $b$ positive such that $-2 \le b(\rho-1)$ and $b(\rho+1) \le 2$;
2. Compute numerically $F(b,\rho)$, and parametrize $\alpha$ by setting $\alpha = F(b,\rho) + u$ for positive $u$;
3. Compute numerically the values $(L_-, L_+)$, for this value of $u$, and parameterize $\mu$ by setting $\mu = \frac{1+q}{2} L_+ + \frac{1-q}{2} L_-$ for $q \in ]-1,1[$.

Note that the function $L_+$ is not needed to compute the bounds or $F(b,\rho)$: indeed it is sufficient to evaluate $L_-(l_-(\alpha,b,\rho);\alpha,b,\rho)$ and $-L_-(l_-(\alpha,b,-\rho);\alpha,b,-\rho)$.

If we are only interested in testing whether a given parameter satisfies the Fukasawa conditions, we have the corresponding waterfall of failure possibilities that we define as follows:
1. $-2 > b(\rho-1)$ or $b(\rho+1) > 2$: failure of type 1; otherwise:
2. $\alpha \le F(b,\rho)$: failure of type 2; otherwise:
3. $\mu$ not in the range corresponding to $\alpha$: failure of type 3.

The so-called Axel Vogt example (cf [9]) became the archetypal example of a smile with arbitrage. The SVI parameters are $(a, b, \rho, m, \sigma) = (-0.041, 0.1331, 0.306, 0.3586, 0.4153)$, so that $\mu = m/\sigma \approx 0.8635$ and $\alpha = a/\sigma \approx -0.0987$. The value of $\mu$ lies outside the admissible interval $I_{\alpha,b,\rho}$. However $\alpha > F(b,\rho)$, so the interval for $\mu$ is non-empty. The problem here is due to $\mu$, which is too large: we face a failure of type 3.

Recall that the function $G_2$ is defined as

$$G_2(l) := N''(l) - \frac{N'(l)^2}{2N(l)} \quad (3)$$

and that it depends only on $(\alpha, b, \rho)$. $G_2$ is positively proportional to the second derivative of the volatility smile, meaning of $\sqrt{SVI(k)}$. Since the variance smile is convex and asymptotically linear on both sides, it is expected that $G_2$ will be asymptotically negative, while it is positive around the minimum of the smile. In particular it is expected that it will have zeros, on both sides of the minimum of the smile.

In this section we prove the following:
Lemma 5.1 (Zeros of $G_2$). Let $G_2(l) := N''(l) - \frac{N'(l)^2}{2N(l)}$. Then $G_2$ has exactly two zeros $l_1, l_2$ which satisfy $l_1 < \min\{0, l^*\}$ and $l_2 > \max\{0, l^*\}$, such that $G_2(l) > 0 \iff l \in ]l_1, l_2[$. Furthermore, $G_2(l) \to 0^-$ for $l \to \pm\infty$.

Proof. For $l \to \pm\infty$ the first addend behaves as $b|l|^{-3}$ while the second behaves as $-\frac{b(\rho \pm 1)}{2} l^{-1}$, so $G_2$ behaves as $-\frac{b(\rho \pm 1)}{2} l^{-1}$. This means that $G_2$ goes to $0^-$ as $l \to \pm\infty$. Since $G_2(l^*) = N''(l^*) > 0$ and $G_2$ is continuous, there exists an interval $(l_1, l_2)$ containing $l^*$ such that for every $l$ in this interval, $G_2$ is positive. It follows that $G_2$ has at least two zeros. Deriving, we find the following interesting relationship between $G_2'$ and $G_2$:

$$G_2'(l) = N'''(l) - \frac{N'(l)}{N(l)} G_2(l).$$

We will prove now that this relationship entails that the first zero of $G_2$ is negative. Indeed if $l_1 > 0$ is the first zero of $G_2$, since

$$N'''(l) = -\frac{3bl}{(l^2+1)^{5/2}} \quad (4)$$

we have $G_2'(l_1) < 0$, which is not possible because $G_2(l)$ is negative for every $l < l_1$. If $l_1 = 0$, then $G_2'(0) = 0$, but 0 cannot be a point of local maximum for $G_2$, otherwise there would be a following zero $l_2 > 0$. In such a case, $G_2'(l_2) < 0$, whereas $G_2$, being negative up to $l_2$, should be increasing at $l_2$. Then 0 could at most be an inflection point. However,

$$G_2''(l) = N^{iv}(l) - \frac{N'(l) N'''(l)}{N(l)} + \left(\frac{2N'(l)^2}{N(l)^2} - \frac{N''(l)}{N(l)}\right) G_2(l)$$

so $G_2''(0) = N^{iv}(0) = -3b$, which is negative since $b > 0$. Therefore, the first zero $l_1$ of $G_2$ is necessarily negative. With similar arguments we obtain that the next zero $l_2$ must be non-negative. Suppose $l_2 = 0$. Then, as before, $G_2'(0) = 0$ and $G_2''(0) = -3b < 0$, so it would be a point of local maximum, which is not possible. Then $l_2$ must be positive.

Moreover, there cannot be other zeros for $G_2$. Indeed, suppose $l_3$ is the first zero after $l_2$. Then $l_3 > 0$ and $G_2'(l_3) < 0$, while $G_2$ is negative in the left neighborhood of $l_3$, a contradiction.

This leads to the conclusion that $G_2$ has exactly two zeros, one positive and the other one negative.

As a consequence, $G_2(0) = b\left(1 - \frac{b\rho^2}{2(\alpha+b)}\right) > 0$. This could have been obtained also from the fact that $\alpha + b\sqrt{1-\rho^2} \ge 0$, i.e. from the positivity of the minimum of $N$.

Then, we find that $G_2 > 0$ on $[l^*, 0]$ when $\rho \ge 0$ and on $[0, l^*]$ when $\rho < 0$.

Substituting the expressions of $N$, $N'$ and $N''$ in (3), we obtain

$$G_2(l) = \frac{b}{(l^2+1)^{3/2}} - \frac{b^2\left(\rho\sqrt{l^2+1} + l\right)^2}{2(l^2+1)\left(\alpha + b\left(\rho l + \sqrt{l^2+1}\right)\right)}$$

which leads to the remark that $\frac{G_2(l)}{b} = \tilde{G}_{2,\frac{\alpha}{b},\rho}(l)$ where

$$\tilde{G}_{2,x,\rho}(l) := \frac{1}{(l^2+1)^{3/2}} - \frac{\left(\rho\sqrt{l^2+1} + l\right)^2}{2(l^2+1)\left(x + \rho l + \sqrt{l^2+1}\right)},$$

which reduces in general the study of $G_2$ to a study of a 2-parameter function.

In order to find the zeros of $G_2$ we should solve $\frac{2\alpha}{b} + (2 - l^2)\sqrt{l^2+1} - \rho^2(l^2+1)^{3/2} - 2\rho l^3 = 0$, or equivalently $\frac{2\alpha}{b} - 2\rho l^3 = \left((\rho^2+1) l^2 + \rho^2 - 2\right)\sqrt{l^2+1}$.

Note that when $\rho = 0$ this equation is explicitly solvable.

$G_2$ function. We plot below the function $G_2$ for a fixed set of parameters $(\alpha, b, \rho)$ with $\rho < 0$.

5.2 The final condition on $\sigma$

We recall that the positivity of the Durrleman condition in the case of SVI amounts to the positivity of the function

$$G(l) = G_1(l) + \frac{1}{2\sigma} G_2(l) \quad (5)$$

where $G_1$ and $G_2$ do not depend on $\sigma$.

We have proven that:
1. for every $(\alpha, b, \rho)$ with $b(1 \pm \rho) \le 2$ and $\alpha > F(b,\rho)$, where $F(b,\rho) \le 0$, there exists an interval for $\mu$ such that $G_1$ is positive on $\mathbb{R}$ (in fact each factor of $G_1$ is positive on this interval). Moreover it is necessary that the conditions on $(\alpha, b, \rho)$ hold and that $\mu$ lies in this interval under no-arbitrage.
2. for every $(\alpha, b, \rho)$ with $b(1 \pm \rho) \le 2$, there exists an interval $]l_1, l_2[$ containing 0 and $l^*$ such that $G_2(l) > 0$ iff $l \in ]l_1, l_2[$.

We insist here again on the key property brought by the Fukasawa condition that it is necessary that $G_1$ be positive. This structures the picture a lot; before Fukasawa's observation, people investigating the positivity of $G$ could not assume this. Another consequence is that under the Fukasawa conditions of section 4, $G$ is guaranteed to be positive on $[l_1, l_2]$.

The last step is to exploit the fact that thanks to our re-parametrization, the dependency of $G$ on $\sigma$ is very simple. Fix a set of parameters $(\alpha, b, \rho, \mu)$ fulfilling the Fukasawa conditions. Then, given that $G_1 > 0$ everywhere and that $G_2(l) < 0$ only outside $[l_1, l_2]$, it follows that if $G$ is non-negative everywhere for $(\alpha, b, \rho, \mu, \sigma)$, then $G$ is also non-negative everywhere for every $(\alpha, b, \rho, \mu, \tau)$ with $\tau > \sigma$. It follows that there exists a function $(\alpha, b, \rho, \mu) \mapsto \sigma^*(\alpha, b, \rho, \mu)$ such that $G$ is non-negative everywhere for $(\alpha, b, \rho, \mu, \tau)$ iff $\tau \ge \sigma^*(\alpha, b, \rho, \mu)$.

The value of $\sigma^*$ can be obtained by asking the RHS of (5) to be positive, which holds for $\sigma > \sup_l -\frac{G_2(l)}{2G_1(l)}$. Then

$$\sigma^*(\alpha, b, \rho, \mu) := \sup_{l < l_1 \,\vee\, l > l_2} -\frac{G_2(l)}{2G_1(l)}.$$
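As a rough numerical illustration of the threshold $\sigma^*$, the following Python sketch evaluates it on a grid. It assumes the decomposition $G = G_1 + \frac{1}{2\sigma}G_2$ with $G_1(l) = \left(1 - \frac{(l+\mu)N'(l)}{2N(l)}\right)^2 - \frac{N'(l)^2}{16}$ and $G_2(l) = N''(l) - \frac{N'(l)^2}{2N(l)}$, which is what one obtains by substituting $SVI(k) = \sigma N\left(\frac{k-m}{\sigma}\right)$ into the Durrleman function. The function names, the grid and the sample parameters are illustrative assumptions, and a maximum over a grid is only a proxy for the supremum.

```python
import numpy as np

def n0(l, alpha, b, rho):
    """Normalized SVI: N(l) = alpha + b * (rho * l + sqrt(l^2 + 1))."""
    return alpha + b * (rho * l + np.sqrt(l**2 + 1.0))

def n1(l, b, rho):
    """First derivative N'(l)."""
    return b * (rho + l / np.sqrt(l**2 + 1.0))

def n2(l, b):
    """Second derivative N''(l)."""
    return b / (l**2 + 1.0) ** 1.5

def g1(l, alpha, b, rho, mu):
    """Sigma-independent part of the Durrleman function (assumed form)."""
    n, d = n0(l, alpha, b, rho), n1(l, b, rho)
    return (1.0 - (l + mu) * d / (2.0 * n)) ** 2 - d**2 / 16.0

def g2(l, alpha, b, rho):
    """Part multiplied by 1 / (2 sigma): N'' - N'^2 / (2 N)."""
    return n2(l, b) - n1(l, b, rho) ** 2 / (2.0 * n0(l, alpha, b, rho))

def sigma_star_on_grid(alpha, b, rho, mu, l_grid):
    """Grid proxy for the sup of -G2 / (2 G1) over the region where G2 < 0.

    Assumes G1 > 0 on the grid, i.e. that the Fukasawa conditions hold.
    """
    a1 = g1(l_grid, alpha, b, rho, mu)
    a2 = g2(l_grid, alpha, b, rho)
    if np.any(a1 <= 0.0):
        raise ValueError("G1 must be positive on the grid")
    negative = a2 < 0.0
    return float(np.max(-a2[negative] / (2.0 * a1[negative])))

# Hypothetical sample parameters (rho = 0, mu = 0), chosen so that G1 > 0.
l_grid = np.linspace(-60.0, 60.0, 24001)
alpha, b, rho, mu = 1.0, 1.0, 0.0, 0.0
s_star = sigma_star_on_grid(alpha, b, rho, mu, l_grid)
print(f"grid estimate of sigma*: {s_star:.6f}")
```

In practice one would presumably replace the grid maximum by a bounded one-dimensional optimizer on each side of $[l_1, l_2]$; the grid version is only meant to make the definition concrete.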
5.3 What about $\rho = \pm 1$?

We discuss below the case $\rho = -1$; the case $\rho = 1$ follows by symmetry.

In this case the SVI smile is (convex) decreasing, and reaches its minimum $\alpha$ at infinity, so the domain of $\alpha$ is now $\alpha \ge 0$. Note that the boundary value 0 is allowed, unlike in the regular case, because the implied volatility does not vanish at any finite strike. The negative slope condition requires $b \le 1$, and the positive (rightmost) one is automatically fulfilled.

Regarding the Fukasawa conditions, the proofs in section 4 still hold with the convention that $l^* = +\infty$, so that $N$ is decreasing. The interval for $\mu$ becomes $I_{\alpha,b,-1} = ]L_-(l_-(\alpha,b,-1);\alpha,b,-1), +\infty[$, so exactly equal to $I_{\alpha,b,\rho}$ with the convention $L_-(l_-(\alpha,b,1);\alpha,b,1) = -\infty$. For $\alpha \ge 0$, the bound $L_-(l_-(\alpha,b,-1);\alpha,b,-1)$ is finite, so the interval is non-empty; this implies that by definition $F(b,-1) = 0$ and the interval for $\mu$ is non-degenerate even when $\alpha = F(b,-1) = 0$.

The function $G_2$ has only one negative zero $l_1$, above which it is always positive, with $G_2(+\infty) = 0^+$ while $G_2(-\infty) = 0^-$. So $\sigma^* = \sup_{l < l_1} -\frac{G_2(l)}{2G_1(l)}$.

Consider now the boundary case $\alpha = 0$. Squaring the equation that defines $r_-$ gives two roots of opposite signs when $b < 1$. The positive one does not solve the initial equation, so with the notations used in section 4, we finally find $r_- = -\frac{b+2}{2\sqrt{3(1-b)}}$. If $b = 1$, then $r_- = -\infty$. Note that $r_-$ corresponds to $l_-$ when $\alpha = 0$, and we get that $L_-(l_-(0,b,-1);0,b,-1) = -\sqrt{3(1-b)}$.

So for $\alpha = 0$:
• the Fukasawa conditions are satisfied if and only if $\mu > -\sqrt{3(1-b)}$;
• the unique zero of $G_2$ does not depend on $b$ and is given by $l_1 = -\frac{1}{\sqrt{3}}$,
and the parameters with no arbitrage are eventually given by $b \le 1$, $\mu > -\sqrt{3(1-b)}$, $\sigma > \sigma^*(0,b,-1,\mu) := \sup_{l < -1/\sqrt{3}} -\frac{G_2(l)}{2G_1(l)}$.

5.4 Algorithm

We can now complete the algorithms stated for the Fukasawa conditions. For the parameterization of the no arbitrage domain, we just need to add the final step which specifies the range of $\sigma$:
1. Choose $\rho \in ]-1,1[$ and $b$ positive such that $-2 \le b(\rho-1)$ and $b(\rho+1) \le 2$; this can be ensured by choosing $b' \in ]0,1]$ and setting $b = \frac{2b'}{1+|\rho|}$;
2. Compute numerically $F(b,\rho)$, and parametrize $\alpha$ by setting $\alpha = F(b,\rho) + u$ for positive $u$;
3. Compute numerically the values $(L_-, L_+)$, for this value of $u$, and parameterize $\mu$ by setting $\mu = \frac{1+q}{2} L_+ + \frac{1-q}{2} L_-$ for $q \in ]-1,1[$;
4. Compute numerically $\sigma^*(\alpha,b,\rho,\mu)$, and parameterize $\sigma$ by setting $\sigma = \sigma^* + v$ where $v > 0$.

The no arbitrage domain is thus parametrized by $(\rho, b', u, q, v) \in ]-1,1[ \times ]0,1] \times ]0,\infty[ \times ]-1,1[ \times ]0,\infty[$, and this is perfectly suitable to feed optimization algorithms working with bounds, like the standard ones in the scipy.optimize scientific library.

A drawback to keep in mind is that sampling this product sub-space in a uniform way corresponds to a distorted sampling in the initial space.

There again, we can specify an algorithm which decides whether or not a SVI parameter lies in the no-arbitrage domain:
1. $-2 > b(\rho-1)$ or $b(\rho+1) > 2$: failure of type 1; otherwise:
2. $\alpha \le F(b,\rho)$: failure of type 2; otherwise:
3. $\mu$ not in the range corresponding to $\alpha$: failure of type 3; otherwise:
4. $\sigma \le \sigma^*$: failure of type 4.

Now that we have parameterized the no arbitrage domain, the design of a calibration algorithm is straightforward:
1. Choose an objective function;
2. Choose a starting point policy;
3. For the chosen starting points (possibly several of them), run a minimization algorithm of the objective function over the no arbitrage domain;
4. Pick the optimal parameter.

As objective function, we choose the classical least squares criterion, which takes as input the differences of the data and model total variances on the available set of log-forward moneynesses. This gives the same weight to far-from-the-money points, where the precise value of the implied volatility (and so the accuracy of the calibration) matters less, as to close-to-the-money ones, which is not a desirable feature. It can be easily patched by adding weights given by the Vegas (computed once for all with the data points), so that the errors are more in line with losses, unit-wise. This would moreover stabilize the calibration from one day to the next, especially on illiquid markets, as discussed in detail in [17].

Now the big question for us is rather whether or not the no arbitrage constraint will deteriorate the quality of the fit, and we will work on model generated data or on index options data, which are liquid, whence our choice of a standard non-weighted objective function.

Regarding the starting point policy, we are not big fans of smart guess strategies which try to compute the best starting point from the data. Such strategies can work brilliantly in many favorable situations, yet they might fail heavily on data with low quality (e.g. due to a dubious treatment by an internal department), or when faced with new market behavior and configurations.
There is a clear risk of overengineering here also. We would be more confident using a set (with small cardinality) of starting points, possibly produced by a machine learning algorithm duly trained on the markets in scope. We implement a very basic version of this idea, which picks uniformly generated points within the hyperrectangle of the no arbitrage domain, irrespective of the data.

The scipy function used here is ‘least_squares’, which lies in the optimize library. The method used is ‘dogbox’, which handles bounds. The tolerances regarding the change of the cost function (‘ftol’), the change of the independent variables (‘xtol’) and the norm of the gradient (‘gtol’) are all set at the python numpy machine epsilon. The maximum number of function evaluations (‘max_nfev’) is set at 1000.

Even though the arbitrage region does not impose an upper bound for $\alpha$ and $\sigma$, we choose arbitrary ones. In particular, we ask that $\sigma$ be no larger than the maximum of $\frac{|k_1|}{r}$, $\frac{|k_N|}{r}$ and a fixed multiple of $\sigma^*$, with $r$ a parameter to be chosen by the user. The rationale is that if $\frac{|k_i|}{\sigma}$ is below the threshold $r$ for every $i$, then the smile is almost flat and this causes uncertainty on the parameters to be chosen.

The upper bound for $\alpha$ is left to be chosen by the user, both for the index option data and for the model generated data. For the latter we set in every case $\alpha < 3$, since we know a priori that all the data are generated with $\alpha$ lower than 3.

We provide below our calibration results on model generated data and then on market data.

To check the robustness of the algorithm we first run it on data generated by arbitrary SVI parameters with no arbitrage, and on the Axel Vogt parameter. We take a vector of 13 log-forward strikes taken from Table 3.2 of [20].

The parameters chosen for each of the previous graphs are arbitrage-free. The red and the blue lines, which represent the total variances generated from the arbitrary parameters and the total variances obtained from the calibrated parameters respectively, overlap.
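The optimizer settings described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the calibration over the no arbitrage domain: it fits the raw parameters $(a, b, \rho, m, \sigma)$ under plain box bounds, and the strike grid, the bounds and the starting point are hypothetical. Only the ‘least_squares’ call with ‘dogbox’, the machine-epsilon tolerances and ‘max_nfev’ mirror the text; the target smile uses the first parameter row of Table 1.

```python
import numpy as np
from scipy.optimize import least_squares

def svi_total_variance(k, a, b, rho, m, sigma):
    """Raw SVI total variance at log-forward moneyness k."""
    return a + b * (rho * (k - m) + np.sqrt((k - m) ** 2 + sigma**2))

def residuals(params, k, w_target):
    """Unweighted differences between model and target total variances."""
    return svi_total_variance(k, *params) - w_target

# Hypothetical grid of 13 log-forward moneynesses (not the grid of [20]).
k = np.linspace(-1.0, 1.0, 13)
true_params = np.array([0.10, 1.0, -0.306, 0.10, 0.30])
w_target = svi_total_variance(k, *true_params)

# Illustrative box bounds on (a, b, rho, m, sigma); these do NOT encode
# the no arbitrage domain, which needs F, L-, L+ and sigma*.
lower = np.array([-1.0, 1e-6, -0.999, -2.0, 1e-4])
upper = np.array([3.0, 2.0, 0.999, 2.0, 5.0])
x0 = np.array([0.05, 0.5, 0.0, 0.0, 0.5])  # arbitrary starting point

eps = np.finfo(float).eps
fit = least_squares(
    residuals, x0, args=(k, w_target), bounds=(lower, upper),
    method="dogbox", ftol=eps, xtol=eps, gtol=eps, max_nfev=1000,
)

# Relative error in Euclidean (Frobenius) norm on the total variances.
w_fit = svi_total_variance(k, *fit.x)
rel_err = np.linalg.norm(w_fit - w_target) / np.linalg.norm(w_target)
print("fitted parameters:", fit.x)
print("relative error:", rel_err)
```

To move from this sketch to the constrained calibration, one would optimize over $(\rho, b', u, q, v)$ instead and map each point back to $(a, b, \rho, m, \sigma)$ inside the residual function.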
The fact that the fit is excellent can be seen from the Frobenius relative errors:

     a      b     ρ        m       σ      Relative error (×10^-…)
0    0.10   1.0   -0.306   0.10    0.30   2.76
1   -0.10   1.1    0.200   0.00    0.60   1.31
2    0.01   0.1   -0.600  -0.05    0.10   1.79
3    0.80   0.2    0.800   1.00    0.90   0.82
4    1.40   1.9    0.000  -0.10    0.50   1.63
5    0.90   1.2    0.500   0.20    0.85   6.01

Table 1: Frobenius relative errors for the total variances

Furthermore, the Frobenius relative error on the parameters is also low. This means that the algorithm is robust and recovers the original data.

     a      b     ρ        m       σ      Relative error (×10^-…)
0    0.10   1.0   -0.306   0.10    0.30   0.10
1   -0.10   1.1    0.200   0.00    0.60   0.40
2    0.01   0.1   -0.600  -0.05    0.10   0.04
3    0.80   0.2    0.800   1.00    0.90   20.00
4    1.40   1.9    0.000  -0.10    0.50   0.40
5    0.90   1.2    0.500   0.20    0.85   3.00

Table 2: Frobenius relative errors for the parameters

For the sake of completeness we run our algorithm on the notorious Axel Vogt parameters, which lead to an SVI smile with arbitrage. The original and the calibrated parameters are reported in the following table:

             a            b          ρ          m          σ
Original     -0.041       0.1331     0.306      0.3586     0.4153
Calibrated   -0.0198444   0.102745   0.180754   0.266125   0.310459

Table 3: Axel Vogt parameters vs best fitting no arbitrage

Of course, the calibration is not as perfect as in the previous case, and the Frobenius relative error between the Axel Vogt total variances and the no arbitrage SVI ones is 2.2%.

We plot below the function $g$ defined in Proposition 2.1 with the original Axel Vogt parameters and the same function with the new arbitrage-free parameters.

From the plot it can be seen that the function $g$ with the new arbitrage-free parameters can be very close to zero, but it is always positive.

In the following study, we compare the results obtained with the new arbitrage-free parameters and the ones with the parameters described in Example 5.1 of [9], which are also arbitrage free.
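The comparison of the function $g$ for the two parameter sets of Table 3 can be reproduced with a short Python sketch. It assumes that $g$ is the usual Durrleman function $g(k) = \left(1 - \frac{k w'(k)}{2 w(k)}\right)^2 - \frac{w'(k)^2}{4}\left(\frac{1}{w(k)} + \frac{1}{4}\right) + \frac{w''(k)}{2}$ with $w = SVI$; the grid of log-forward moneynesses and the function names are illustrative choices.

```python
import numpy as np

def svi_total_variance(k, a, b, rho, m, sigma):
    """Raw SVI total variance w(k)."""
    return a + b * (rho * (k - m) + np.sqrt((k - m) ** 2 + sigma**2))

def durrleman_g(k, a, b, rho, m, sigma):
    """Durrleman function of an SVI slice (assumed to be the g of the text)."""
    x = k - m
    root = np.sqrt(x**2 + sigma**2)
    w = a + b * (rho * x + root)
    w1 = b * (rho + x / root)      # first derivative w'(k)
    w2 = b * sigma**2 / root**3    # second derivative w''(k)
    return (1.0 - k * w1 / (2.0 * w)) ** 2 - w1**2 / 4.0 * (1.0 / w + 0.25) + w2 / 2.0

# Parameter sets (a, b, rho, m, sigma) taken from Table 3.
axel_vogt = (-0.041, 0.1331, 0.306, 0.3586, 0.4153)
calibrated = (-0.0198444, 0.102745, 0.180754, 0.266125, 0.310459)

# Illustrative grid of log-forward moneynesses.
k = np.linspace(-1.5, 1.5, 601)
g_axel_vogt = durrleman_g(k, *axel_vogt)
g_calibrated = durrleman_g(k, *calibrated)

print("min of g, Axel Vogt parameters: ", g_axel_vogt.min())
print("min of g, calibrated parameters:", g_calibrated.min())
```

With the Axel Vogt parameters the minimum on this grid is negative, which is the Butterfly arbitrage; for the calibrated parameters the printed minimum should be read with the rounding of Table 3 in mind, since $g$ comes very close to zero.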
The following plot shows that the fit of our new parameters is better than the one of Gatheral and Jacquier.

In the following table we compare the relative errors on the total variances for the two sets of arbitrage-free parameters.

                    a            b          ρ          m          σ          Relative error
Arbitrage-Free      -0.0198444   0.102745   0.180754   0.266125   0.310459   0.022
Gatheral-Jacquier   -0.0305199   0.102717   0.100718   0.272344   0.412398   0.133

Table 4: Frobenius relative errors for the total variances

We now turn to market data. We work with market data of good quality bought from the CBOE datastore by Zeliade. They cover daily files for the DJX, SPX500 and NDX equity indices, with bid and ask prices.

To obtain implied total variances from the prices, we apply the classical treatment of inferring the discount factor and forward values at each option maturity by performing a linear regression of the (mid) Call minus Put prices with respect to the strike. Since the markets under study are very liquid, the fit is excellent and the residual error extremely small.

Then, given the discount factor and forward values for each maturity, we are able (after working out the exact maturity of each contract from its code, if not provided explicitly) to compute the implied volatilities for the Bid and Ask prices.

We feed the objective function with the implied volatility corresponding to the mid price, and plot below the implied volatilities for the calibrated model and the bid and ask market data.

6.2.1 DJX

6.2.2 SPX500

6.2.3 NDX

From our experiments we draw several positive conclusions:
• The quality of the fit is excellent, and there is no big loss resulting from the no arbitrage constraint;
• The implementation we have designed seems sufficiently robust in practice; of course such a statement should be re-assessed continuously;
• The payload of the root finding algorithms used to compute the Fukasawa threshold and the bounds for $\mu$ and $\sigma$ is not an issue, the calibration is still reasonably fast on a basic chip; the average for each maturity for the DJX data is 51.598 seconds, for the SPX data 36.490 seconds and for the NDX data 44.900 seconds.

Of course, there is room for improvement, at least at the level of the starting point strategy. One could also think of pre-computing the numerical functions computed on the fly, or of designing once for all explicit proxies for them, which would massively speed up the execution of the algorithm.

Fukasawa's remark that the inverses of the $d_1$ and $d_2$ functions of the Black-Scholes formula have to be non-increasing under no Butterfly arbitrage, paired with the natural rescaling of the SVI parameters which consists in scaling $a$ and $m$ by $\sigma$, allows us to fully describe the domain of no Butterfly arbitrage for SVI.

The no Butterfly arbitrage domain can be parametrized as a hyperrectangle, with 2 downstream algorithms of practical importance: one for checking whether or not a SVI parameter lies in the no arbitrage domain, and the other one to effectively perform a calibration. Three functions have to be computed numerically by resorting to root-finding type algorithms; because our careful mathematical analysis provides safe bracketing intervals for those functions, this can be achieved very quickly.

We provide calibration results on model and market data, the latter showing that there is no loss of fit quality due to imposing the no arbitrage constraint.

This analysis settles one important issue in the SVI saga. Other ones are still pending, like the study of sub-SVI parametrizations with 4 parameters instead of 5, in the spirit of SSVI (which has 3 parameters slice-wise), with more parameter stability than SVI and a better fit quality than SSVI, and also the question of the characterization of no Calendar Spread arbitrage for two SVI slices corresponding to different maturities.

Annex

Proof. Observe that at the point $l^*$, $\rho\sqrt{l^2+1} + l = 0$ and also, after computations, $\rho l + \sqrt{l^2+1} = \sqrt{1-\rho^2}$, so we have $g_{\pm(b,\rho)}(l^*) = -\sqrt{1-\rho^2}$.
Substituting $N(l) = \alpha + lN'(l) + N''(l)(l^2+1)$ in the expression of $L_\pm$ we get

$$L_\pm(l) = 2\alpha\left(\frac{1}{N'(l)} \mp \frac{1}{4}\right) + 2\left(lN'(l) + N''(l)(l^2+1)\right)\left(\frac{1}{N'(l)} \mp \frac{1}{4}\right) - l. \quad (6)$$

We have $\frac{d}{d\alpha} L_\pm(l_\pm) = L_\pm'(l_\pm)\frac{d}{d\alpha} l_\pm + \partial_\alpha L_\pm(l_\pm) = \partial_\alpha L_\pm(l_\pm)$. Deriving (6) with respect to $\alpha$, we find $\partial_\alpha L_\pm(l_\pm) = 2\left(\frac{1}{N'(l_\pm)} \mp \frac{1}{4}\right)$.

Since $N'(l) > 0$ iff $l > l^*$ and $4 \mp N' > 0$, then $\partial_\alpha L_-(l_-) < 0$ and $\partial_\alpha L_+(l_+) > 0$. So the function $\alpha \mapsto L_-(l_-, \alpha)$ is decreasing while $\alpha \mapsto L_+(l_+, \alpha)$ is increasing. It means that the bounds for $\mu$ are an increasing family of sets (possibly empty) parametrized by $\alpha$.

Consider the lower bound, so $l < l^*$. We can write the expression for $g_{-(b,\rho)}$ in another way. We have

$$L_-'(l) = 1 + \frac{N'(l)}{2} - \frac{2N''(l)}{N'(l)^2}\left(\alpha + lN'(l) + N''(l)(l^2+1)\right).$$

Evaluating this in $l_-$, the LHS becomes 0 and we can isolate $\alpha$, obtaining

$$g_{-(b,\rho)}(l) = \frac{1}{b}\left(\frac{N'(l)^2}{2N''(l)}\left(1 + \frac{N'(l)}{2}\right) - lN'(l) - N''(l)(l^2+1)\right). \quad (7)$$

From this expression, we get the derivative of $g_{-(b,\rho)}$ as $g_{-(b,\rho)}'(l) = \frac{N'(l)^2}{4b}\left(3 - \frac{N'''(l)}{N''(l)^2}\left(N'(l) + 2\right)\right)$, which is positive iff the second factor is positive. Substituting with the explicit expressions, we find that this holds iff $\frac{3}{b}\sqrt{l^2+1}\left(l(2+b\rho) + b\sqrt{l^2+1}\right) > 0$, i.e. $-l(2+b\rho) < b\sqrt{l^2+1}$.

Note that since $b(\rho-1) \ge -2$, then $2 + b\rho > 0$. If $\rho < 0$, the inequality holds for every $l \ge 0$. For negative $l$ we can square, obtaining that it holds iff $\left(b^2(\rho^2-1) + 4b\rho + 4\right) l^2 < b^2$. For $b(\rho-1) > -2$, the coefficient of $l^2$ is positive, so the inequality holds iff $l > -\frac{b}{\sqrt{b^2(\rho^2-1) + 4b\rho + 4}} := m_-$. Since $\rho$ is negative, $m_- < l^*$. So in this case $g_{-(b,\rho)}(l)$ is increasing iff $l > m_-$.

If $\rho \ge 0$, we proceed in a similar way taking the square and obtaining that, if $b(\rho-1) > -2$, the inequality holds iff $l > -\frac{b}{\sqrt{b^2(\rho^2-1) + 4b\rho + 4}} := m_-$. If $b \le \frac{2\rho}{1-\rho^2}$, then $m_- \ge l^*$ and $g_{-(b,\rho)}$ is always decreasing. Otherwise if $b > \frac{2\rho}{1-\rho^2}$, then $m_- < l^*$ and $g_{-(b,\rho)}$ is increasing iff $l > m_-$.

We can write $\alpha$ as a function of $l_-$, indeed $\alpha = b\,g_{-(b,\rho)}(l_-)$. This function has the same monotonicity as $g_{-(b,\rho)}$. We obtain from the previous analysis that the function $l_- \mapsto L_-(l_-; b\,g_{-(b,\rho)}(l_-))$ is:
- increasing iff $l_- < m_-$ when $b > \frac{2\rho}{1-\rho^2}$;
- increasing for every $l_- < l^*$ when $b \le \frac{2\rho}{1-\rho^2}$.

Using (6) and substituting $\alpha$ with $b\,g_{-(b,\rho)}(l_-)$ written explicitly as in (7), we obtain

$$L_-(l_-; b\,g_{-(b,\rho)}(l_-)) = \frac{N'(l_-)^2}{N''(l_-)}\left(1 + \frac{N'(l_-)}{2}\right)\left(\frac{1}{N'(l_-)} + \frac{1}{4}\right) - l_-. \quad (8)$$

From here it can be seen that $L_-(l_-; b\,g_{-(b,\rho)}(l_-))$ goes to $-l^*$ when $l_-$ goes to $l^*$ from the left.

Similarly, we can do all the equivalent computations for $L_+$. First, the function $g_{+(b,\rho)}$ can be re-written as

$$g_{+(b,\rho)}(l) = \frac{1}{b}\left(\frac{N'(l)^2}{2N''(l)}\left(1 - \frac{N'(l)}{2}\right) - lN'(l) - N''(l)(l^2+1)\right).$$

Then

$$L_+(l_+; b\,g_{+(b,\rho)}(l_+)) = \frac{N'(l_+)^2}{N''(l_+)}\left(1 - \frac{N'(l_+)}{2}\right)\left(\frac{1}{N'(l_+)} - \frac{1}{4}\right) - l_+$$

and even in this case $L_+(l_+; b\,g_{+(b,\rho)}(l_+))$ goes to $-l^*$ when $l_+$ goes to $l^*$ from the right.

We can study the monotonicity of $g_{+(b,\rho)}$, obtaining $g_{+(b,\rho)}'(l) = \frac{N'(l)^2}{4b}\left(-3 + \frac{N'''(l)}{N''(l)^2}\left(N'(l) - 2\right)\right)$.

Considering the second factor and substituting with the explicit expressions, the latter quantity is positive iff $-\frac{3}{b}\sqrt{l^2+1}\left(-l(2-b\rho) + b\sqrt{l^2+1}\right) > 0$, i.e. $l(2-b\rho) > b\sqrt{l^2+1}$.

Here, since $b(\rho+1) \le 2$, then $2 - b\rho > 0$. If $\rho > 0$, the inequality does not hold for $l \le 0$. For positive $l$ we can square, obtaining that it holds iff $\left(b^2(\rho^2-1) - 4b\rho + 4\right) l^2 > b^2$. For $b(\rho+1) < 2$, the coefficient of $l^2$ is positive, so the inequality holds iff $l > \frac{b}{\sqrt{b^2(\rho^2-1) - 4b\rho + 4}} := m_+$. Since $\rho$ is positive, $m_+ > l^*$. So in this case $g_{+(b,\rho)}(l)$ is increasing iff $l > m_+$.

If $\rho \le 0$, we proceed in a similar way taking the square and obtaining that, if $b(\rho+1) < 2$, the inequality holds iff $l > \frac{b}{\sqrt{b^2(\rho^2-1) - 4b\rho + 4}} := m_+$. If $b \le -\frac{2\rho}{1-\rho^2}$, then $m_+ \le l^*$ and $g_{+(b,\rho)}$ is always increasing. Otherwise if $b > -\frac{2\rho}{1-\rho^2}$, then $m_+ > l^*$ and $g_{+(b,\rho)}$ is increasing iff $l > m_+$. Remember that the function $\alpha \mapsto L_+(l_+, \alpha)$ is increasing. To recap, the function $l_+ \mapsto L_+(l_+; b\,g_{+(b,\rho)}(l_+))$ is:
- increasing iff $l_+ > m_+$ when $b > -\frac{2\rho}{1-\rho^2}$;
- increasing for every $l_+ > l^*$ when $b \le -\frac{2\rho}{1-\rho^2}$.

If $b \le -\frac{2\rho}{1-\rho^2}$ then $\rho < 0$ and $b > \frac{2\rho}{1-\rho^2}$, while if $b \le \frac{2\rho}{1-\rho^2}$ then $\rho > 0$ and $b > -\frac{2\rho}{1-\rho^2}$. This means that $L_+(l_+; b\,g_{+(b,\rho)}(l_+))$ and $L_-(l_-; b\,g_{-(b,\rho)}(l_-))$ cannot both be monotonic.

The last statement of the proposition is a direct consequence of the fact that $\frac{d}{dl_\pm} L_\pm(l_\pm; b\,g_{\pm(b,\rho)}(l_\pm)) = \partial_\alpha L_\pm(l_\pm; b\,g_{\pm(b,\rho)}(l_\pm))\, b\,g_{\pm(b,\rho)}'(l_\pm)$, where $\partial_\alpha L_-(l_-) < 0$ and $\partial_\alpha L_+(l_+) > 0$.

$F(b,0)$

In this subsection we compute $F(b,0)$ and prove that $F(b,0) > -b$.

With $\rho = 0$ we have $l^* = 0$ and

$$N = \alpha + b\sqrt{l^2+1}, \qquad N' = \frac{bl}{\sqrt{l^2+1}}, \qquad N'' = \frac{b}{(l^2+1)^{3/2}}.$$

Consider the particular case $b = 2$.
Then we have already shown F (2 , 0) = 0, which is greater than − b (cid:54) = 2. Since b > ρ − ρ = 0, then the function l − → L − ( l − ; bg − ( b, ( l − )) isincreasing iff l − < m − where m − = − b √ − b . Furthermore the interval for µ is I α,b, = (cid:3) L − ( l − ( α, b, α, b, , − L − ( l − ( α, b, α, b, (cid:2) so it is symmetrical with respect to 0. The Fukasawa thresh-old F ( b, 0) is then the solution to L − ( l − ( F ( b, , b, F ( b, , b, 0) = 0.From equation (8) we obtain L − ( l − ; bg − ( b, ( l − )) = b l − (cid:16) (cid:113) l − + 1 + bl − (cid:17)(cid:32) (cid:113) l − + 1 bl − + 14 (cid:33) − l − . l − < 0, this expression is equal to 0 iff (8+ b ) l = − b √ l + 1 and so iff l − equals l ∗− := − b √ b − b +64 .Then F ( b, 0) = bg − ( b, (cid:18) − b √ b − b + 64 (cid:19) where g − ( b, ( l ) = l (2 √ l + 1 + bl ) − √ l + 1.We now need to prove g − ( b, ( l ∗− ) > − l ∗− < r − . From the expression of g − ( b, , weimmediately find that r − satisfies 2( l − √ l + 1 = − bl − 4, so we look for a negative root such that − bl − l − > 0. This happens iff l lies outside the interval (cid:104)(cid:16) − b (cid:17) , −√ (cid:105) if b ≤ √ (cid:104) −√ , (cid:16) − b (cid:17) (cid:105) if b > √ 2. Squaring the previous equation and simplifying by l we find (4 − b ) l − l − b =0. Call P b ( l ) the LHS.At 0, this polynomial and its derivative are negative. Its local maximum is at − √ − b and its value atthis point is √ − b − b which is always positive. So the polynomial has two negative roots and a positiveone.We can observe that P √ ( −√ 2) = P √ (cid:16)(cid:16) − b (cid:17) (cid:17) = 0 with • P b ( −√ 2) = 2 √ b − √ > • P b (cid:16)(cid:16) − b (cid:17) (cid:17) = − b (cid:16) b − b ) + 4 (cid:17) < b < √ r − is the second negative root of the polynomial while if b ≥ √ l ∗− is − b (( b − − √ b − b + 64)( b − which is positive iff b < ˜ b where (cid:113) < ˜ b < √