Extreme-Strike Comparisons and Structural Bounds for SPX and VIX Options

Abstract

This article explores the relationship between the SPX and VIX options markets. High-strike VIX call options are used to hedge tail risk in the SPX, which means that SPX options are a reflection of the extreme-strike asymptotics of VIX options, and vice versa. This relationship can be quantified using moment formulas in a model-free way. Comparisons are made between VIX and SPX implied volatilities along with various examples of stochastic volatility models.


arXiv [q-fin.PR]

A. PAPANICOLAOU ∗

1. Introduction.

The S&P500 (SPX) index and its volatility have been shown to have strong negative correlation. For this reason there is a great deal of interest in the Chicago Board Options Exchange (CBOE) volatility index (VIX), the options-based volatility index, and its derivatives for hedging tail risk. In particular, the CBOE has designed the VIX Tail Hedge (VXTH) index based on an SPX and VIX trading strategy, which has been backtested on data and shown to have performed better than the SPX over time periods when there has been a crisis event. Part of the VXTH strategy is to buy high-strike European call options on VIX to insure against losses in SPX, as a large rise in the VIX usually coincides with a drop in the SPX. Prior to VIX derivatives, a similar insurance strategy might have been to buy low-strike European put options on SPX. This similarity means that there is information on the risk-neutral distribution for VIX that is implied by low-strike SPX put options. The markets for SPX and VIX options are very liquid, and so it is useful to have structural bounds that quantify the relationship between the two. In particular, this paper's Lemma 3.2 will show that for discounted SPX price $S_t e^{-rt}$ being a local martingale (where $r \geq 0$ is the risk-free rate), the moment generating function (MGF) of $\mathrm{VIX}_T^2$ with $\xi \in \mathbb{R}$ satisfies

$$E_t e^{\xi \mathrm{VIX}_T^2} \leq \frac{1}{q} E_t S_{T+\tau}^{-2\xi q/\tau} + \frac{1}{p} E_t \left(e^{r\tau} S_T\right)^{2\xi p/\tau} \qquad \text{for times } 0 \leq t \leq T, \tag{1.1}$$

where $E_t$ denotes risk-neutral expectation conditional on the market at time $t$, $\tau = 30$ days, and both $p \geq 1$ and $q \geq 1$ with $\frac{1}{p} + \frac{1}{q} = 1$. If there exists $\xi > 0$ such that the right-hand side of (1.1) is finite, then the distribution of $\mathrm{VIX}_T^2$ is not heavy tailed (i.e. the MGF of $\mathrm{VIX}_T^2$ exists for some $\xi > 0$). Conversely, if the MGF of $\mathrm{VIX}_T^2$ is infinite for all $\xi > 0$, then the right-hand side of (1.1) is always infinite and there exist no negative moments for $S_{T+\tau}$.

Stochastic volatility and Lévy jump processes (or a combination of these two) have been used in option pricing since the 1990's, and at that time it may have seemed as if volatility derivatives could be priced and hedged from a well-calibrated model. Indeed, the availability of VIX options data has been an innovation in the study of volatility because it gives new information in addition to the data from SPX options, which means more data to use when calibrating a model. However, data from various days throughout the 2000's exhibit VIX option implied volatilities that do not have a good fit to standards such as the square root (Heston) model. From the perspective of someone searching for the "right model", model specification remains an important issue because it is a nontrivial task to fit a single model to both the SPX and VIX implied-volatility surfaces. Hence, it would be quite useful if there were a general theory to explain the relationship between markets for VIX and markets for SPX options, and vice versa. This paper presents the beginnings of such a theory in a model-free context.

∗ Department of Finance and Risk Engineering, NYU Tandon School of Engineering, 6 MetroTech Center, Brooklyn, NY 11201 [email protected] . Part of this research was performed while the author was visiting the Institute for Pure and Applied Mathematics (IPAM), which is supported by the National Science Foundation. A special 'thank you' to the associate editor for your consideration and input in the preparation of this article.

Of further interest is the understanding of the implied volatility from VIX options. It is certainly true that every asset class requires tailored expert analysis of implied volatility, but VIX option implied volatility is special because it is really the implied volatility-of-volatility for the SPX, and hence it is saying something about SPX options. In particular, implied volatilities from VIX options are sometimes very high (i.e. in the range of 80% for high-strike VIX call options), but there is not yet a standard for making comparisons to implied volatilities observed from SPX options. It would be a significant contribution if implied volatilities from VIX call options could be used to make definitive statements about the no-arbitrage range for SPX implied volatilities. Using the moment formula of [Lee04], this paper identifies a relationship using extreme-strike asymptotics.
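Inequality (1.1) can be sanity-checked in the simplest possible setting. The sketch below is an illustration and not part of the paper's argument: it assumes constant volatility (Black-Scholes dynamics), so that $\mathrm{VIX}_T^2 = \sigma^2$ and all SPX moments are available in closed form. The function names, the 30/365 day count, and the parameter values are hypothetical choices made only for this example.

```python
import math

TAU = 30.0 / 365.0  # 30-day VIX horizon in years (assumed day count)


def gbm_moment(s0, a, r, sigma, horizon):
    """E[S_h^a] when S is a geometric Brownian motion with drift r and volatility sigma."""
    return s0 ** a * math.exp(a * r * horizon + 0.5 * a * (a - 1.0) * sigma ** 2 * horizon)


def inequality_sides(xi, p, s0, r, sigma, T, tau=TAU):
    """Return (lhs, rhs) of inequality (1.1) at t = 0 under constant volatility."""
    q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
    # With constant volatility the squared VIX is deterministic and equal to sigma^2.
    lhs = math.exp(xi * sigma ** 2)
    # Negative moment of S_{T+tau} with exponent -2 xi q / tau.
    neg_moment = gbm_moment(s0, -2.0 * xi * q / tau, r, sigma, T + tau)
    # Positive moment of e^{r tau} S_T with exponent 2 xi p / tau.
    a = 2.0 * xi * p / tau
    pos_moment = math.exp(a * r * tau) * gbm_moment(s0, a, r, sigma, T)
    return lhs, neg_moment / q + pos_moment / p


if __name__ == "__main__":
    lhs, rhs = inequality_sides(xi=0.05, p=2.0, s0=1.0, r=0.01, sigma=0.2, T=0.25)
    print(f"lhs = {lhs:.6f}, rhs = {rhs:.6f}, bound holds: {lhs <= rhs}")
```

In this degenerate case the bound is far from tight, and the right-hand side depends on the price level through Young's inequality, which is not scale invariant.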

The VIX formula (as it has been calculated since 2003) is described in [DDKZ99]. A general description of how volatility derivatives are designed and traded is provided in [CL09] with particular attention paid to volatility markets in the era post-2008 crisis. Stochastic and local volatility models are described in [Gat06], and pricing of volatility derivatives based on these models is a standard application of partial differential equation methods. Pricing of VIX options using square-root volatility and jumps is covered in [Sep08]. An alternative to model-based pricing/hedging are the model-free results found in [CL08, CL10, FG05]; VIX options are priced using non-parametric approximations of the pricing kernel in [SX16].

The issues in fitting the Heston model to VIX options are explained in [Gat08]. There has been some success in fitting VIX options to market models of the variance swap term structure (see [CS07, CK13]), and the added explanatory power from the inclusion of jumps has been demonstrated in [MY11]. Some studies have shown improved fits to large-strike VIX option implied volatilities using heavy-tailed processes (see [BB14, BGK13, Dri12]), but heavy tails are not necessarily required, as shown in [PS14] using a Markov-chain modulation of the Heston model. In [MHL15] the links between SPX options and VIX options are studied in a constrained hedging problem, and a link between SPX and VIX markets is established in [Pap16] using a time-spread portfolio.

The main result in this paper is the identification of a link between VIX options and negative moments in the SPX price. In particular, the existence of negative moments in the SPX's risk-neutral distribution is an indication that the VIX's risk-neutral distribution is not heavy tailed. Conversely, if the risk-neutral distribution of $\mathrm{VIX}_T^2$ is heavy tailed, then the price for SPX has no negative moments. The results can be considered model free, as the main assumptions are (i) absence of arbitrage and (ii) $e^{-rt} S_t$ is a continuous local (super)martingale. The majority of the calculations are made under the assumption that prices are given by risk-neutral expectations, with the exception of Section 3.3 where mispriced options are shown to be arbitrageable if they can be used to construct replicating portfolios that violate the inequality of (1.1). The paper provides a detailed application of the theory to specific models that are frequently used in SPX and VIX options pricing, with an assessment of their relative usefulness based on historically observed market behavior of these options.

The rest of the paper is organized as follows: Section 2 introduces the probabilistic framework and describes how to price options on SPX and VIX; Section 3 presents the main results (including Lemma 3.2) and other ideas relevant to the problem; Section 3.3 gives static arbitrage portfolios that can be implemented if the market data is mispriced in such a way that the inequality of Lemma 3.2 is reversed; Section 4 presents various stochastic volatility models and discusses how each relates to the paper's results. Section 5 concludes.

2. Probabilistic Framework for Pricing.

Let $S_t$ denote the price of the SPX at some time $t \geq 0$. The model considered throughout this paper has an asset whose log returns are given by a stochastic volatility model,

$$d\log(S_t) = \left(r - \tfrac{1}{2}\sigma_t^2\right) dt + \sigma_t\, dW_t \tag{2.1}$$

where $r \geq 0$ is the risk-free rate, $W$ is a risk-neutral Brownian motion, and $\sigma_t$ is a volatility process that is right continuous with left-hand limits and non-anticipative of $W$.

Condition 2.1 (Finite Second Moment of Stochastic Integral). For $T < \infty$, the second moment of the stochastic integral $\int_0^T \sigma_u\, dW_u$ is finite,

$$E\left(\int_0^T \sigma_u\, dW_u\right)^2 = E\int_0^T \sigma_u^2\, du < \infty.$$

Condition 2.1 implies that $\int_0^t \sigma_u\, dW_u$ is a true, square integrable martingale on the finite time-interval $[0, T]$, but $S_t e^{-rt}$ may still be a strict local martingale. The Novikov condition ensures for $T < \infty$ that the process $S_t e^{-rt}$ is a true martingale on $[0, T]$ if $E e^{\frac{1}{2}\int_0^T \sigma_u^2\, du} < \infty$. The Novikov condition is very strong and often doesn't hold for stochastic volatility models. Other conditions for exponential martingales are discussed in [KL12]. This paper will rely on Condition 2.1, will not assume Novikov, and will show the martingale property on a case-by-case basis.

Another important condition is the existence of $S_T$'s negative moments:

Condition 2.2 (Negative Moments). For $T < \infty$, there exists a constant $q > 0$ such that $E S_T^{-q} < \infty$.

For $S_t$ a supermartingale, Condition 2.2 implies existence of the MGF of $\log(S_T)$ over a finite interval containing zero,

$$E e^{\xi \log(S_T)} < \infty \qquad \forall\, \xi \in [-q, 1].$$

Condition 2.2 is used in [Lee04] to obtain small-strike bounds on implied volatility. In particular, the supremum over all $\{q > 0 \mid E S_T^{-q} < \infty\}$ is identified with the asymptotic rate at which implied volatility grows as strike-price goes to zero for options on $S_T$; this is part 2 of the moment formula [Lee04] that will be reviewed in Section 3.1.

Consider European call and put options on $S_T$ for some fixed time $T \in (0, \infty)$ and some fixed strike $K \in [0, \infty)$, both of which are processes

$$C(t, K, T) \triangleq B_{t,T} E_t (S_T - K)^+, \qquad P(t, K, T) \triangleq B_{t,T} E_t (K - S_T)^+$$

for some $0 \leq t \leq T$, where $B_{t,T} = e^{-r(T-t)}$ is the discount factor and the expectation operator is defined as $E_t \triangleq E[\,\cdot \mid \mathcal{F}_t]$ with $\mathcal{F}_t$ denoting a $\sigma$-algebra with respect to which $W$ is Brownian motion, $\sigma_t$ is $\mathcal{F}_t$ adapted, and $S_0$ is $\mathcal{F}_0$ adapted. Throughout the paper, an expectation without a subscript is conditional at time $t = 0$, that is, $E = E_0$.

Definition 2.3 (SPX Implied Volatility $\hat\sigma$). Implied volatility for SPX options is denoted with $\hat\sigma(t, K, T)$ and is the unique volatility input to the Black-Scholes prices such that

$$P(t, K, T) = B_{t,T}\left(\Phi(-d_-) K - \Phi(-d_+) E_t S_T\right)$$

where $d_\pm = \frac{\log(E_t S_T / K)}{\hat\sigma(t,K,T)\sqrt{T-t}} \pm \frac{\hat\sigma(t,K,T)\sqrt{T-t}}{2}$ and $\Phi$ is the standard normal cumulative distribution function. The quantity $\hat\sigma$ could be equivalently redefined using call options via put-call parity.

A variance swap for the time period $[t, T]$ with $t < T$ has a floating leg of $\frac{1}{T-t}\int_t^T \sigma_u^2\, du$ (equal to the quadratic variation of $\log(S_t)$ divided by time) and a fixed leg that is chosen such that the contract has zero entry cost at time $t$. This fixed leg is the variance-swap rate:

$$\text{variance-swap rate} = E_t\left[\frac{1}{T-t}\int_t^T \sigma_u^2\, du\right].$$

When trading in variance swaps, an important instrument is the log contract with time-$T$ payout of $\log(S_T / E_t S_T)$. As shown in [DDKZ99], the negative log contract is replicated by a portfolio of European call and put options by taking the expectation of the identity

$$-\log(S_T / s^*) = -\frac{S_T - s^*}{s^*} + \int_0^{s^*} \frac{(K - S_T)^+}{K^2}\, dK + \int_{s^*}^\infty \frac{(S_T - K)^+}{K^2}\, dK,$$

which holds for any reference point $s^* > 0$. Taking $s^* = E_t S_{t+\tau}$ yields the VIX formula:

$$\mathrm{VIX}_t = \sqrt{\frac{2}{\tau B_{t,t+\tau}}\left(\int_0^{E_t S_{t+\tau}} P(t, K, t+\tau)\, \frac{dK}{K^2} + \int_{E_t S_{t+\tau}}^\infty C(t, K, t+\tau)\, \frac{dK}{K^2}\right)}, \tag{2.2}$$

where $\tau = 30$ days. By definition, equation (2.2) is the square root of the log contract's price, $\mathrm{VIX}_t = \sqrt{-\frac{2}{\tau} E_t \log\left(S_{t+\tau} / E_t S_{t+\tau}\right)}$. By assuming Condition 2.1 for the continuous model in (2.1), the risk-neutral price of the log contract is equal to the variance-swap rate, and hence the VIX index is the square root of the variance-swap rate for the coming 30 days,

$$\text{(Condition 2.1)} \;\Rightarrow\; \mathrm{VIX}_t = \sqrt{\text{variance-swap rate}}. \tag{2.3}$$

Define the future contract on $\mathrm{VIX}_T$ at time $t \leq T$ as

$$X_{t,T} = E_t \mathrm{VIX}_T = E_t \sqrt{-\frac{2}{\tau} E_T \log\left(S_{T+\tau} / E_T S_{T+\tau}\right)}. \tag{2.4}$$

The price $X_{t,T}$ is important in the VIX market because (unlike the VIX index) it is a tradeable asset. European call and put options on the VIX are the expectation of functions of $\mathrm{VIX}_T$, but should be thought of as options on $X_{T,T}$,

$$C^{vix}(t, K, T) \triangleq B_{t,T} E_t (\mathrm{VIX}_T - K)^+ = B_{t,T} E_t (X_{T,T} - K)^+,$$
$$P^{vix}(t, K, T) \triangleq B_{t,T} E_t (K - \mathrm{VIX}_T)^+ = B_{t,T} E_t (K - X_{T,T})^+.$$

Considering these options as payoffs on $X_{T,T}$ makes clearer the convention for $\Delta$-hedging VIX options with the future $X_{t,T}$. It is also the convention for VIX options to quote implied volatility by inverting the Black-Scholes formula on the VIX future, as is also done in [PS14].

Definition 2.4 (VIX Implied Volatility $\hat\nu$). Implied volatility for VIX options is denoted with $\hat\nu(t, K, T)$ and is the unique volatility input to the Black-Scholes prices such that

$$C^{vix}(t, K, T) = B_{t,T}\left(\Phi(d_+) X_{t,T} - \Phi(d_-) K\right),$$

where $d_\pm = \frac{\log(X_{t,T} / K)}{\hat\nu(t,K,T)\sqrt{T-t}} \pm \frac{\hat\nu(t,K,T)\sqrt{T-t}}{2}$ and $\Phi$ is the standard normal cumulative distribution function. The quantity $\hat\nu$ could be equivalently redefined using VIX put options via put-call parity.

Figures 2.1 and 2.2 show implied volatility for SPX and VIX options for September 9th of 2010, a day during the European debt crisis when options were trading with high implied volatility. Notice the right-hand skew of $\hat\nu$ in Figure 2.2, which corresponds to volatility tail risk and is a stylized feature of VIX options that should be captured by a stochastic volatility model that aims to price VIX options in periods of higher volatility (see [Dri12, BGK13, PS14]).

[Figure 2.1 plot: "SPX Option Implied Volatility 09-Sep-2010, maturity 16-Oct-2010"; horizontal axis $\log(K / E_t S_T)$, vertical axis $\hat\sigma$.]

Fig. 2.1: Implied volatility for SPX put options on September 9th 2010 and maturity on October 16th 2010. The SPX future price is $E_t S_T = 1101.97$ and the risk-free rate is approximately r = .
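The VIX formula (2.2) can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the paper: option prices are generated from the Black formula with one flat volatility, the strip of strikes is truncated to a finite log-moneyness window, and the integral is approximated with a trapezoid rule. Under those assumptions the log contract has price $-\frac{1}{2}\sigma^2\tau$, so the formula should return the input volatility. All function names and parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def black_otm_undiscounted(forward, strike, sigma, tau):
    """Undiscounted Black price of the out-of-the-money option (put below the forward, call above)."""
    sd = sigma * math.sqrt(tau)
    d_plus = math.log(forward / strike) / sd + 0.5 * sd
    d_minus = d_plus - sd
    if strike < forward:
        return strike * norm_cdf(-d_minus) - forward * norm_cdf(-d_plus)
    return forward * norm_cdf(d_plus) - strike * norm_cdf(d_minus)


def vix_from_strip(forward, sigma, tau, half_width=1.5, steps=6000):
    """Approximate (2.2) with a trapezoid rule in log-strike x = log(K / forward).

    Prices are undiscounted, which absorbs the factor 1 / B_{t,t+tau} in (2.2).
    """
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        strike = forward * math.exp(x)
        # dK / K^2 becomes dx / K after the change of variable K = forward * exp(x)
        term = black_otm_undiscounted(forward, strike, sigma, tau) / strike
        total += term * (0.5 if i in (0, steps) else 1.0)
    return math.sqrt(2.0 / tau * total * dx)


if __name__ == "__main__":
    print(vix_from_strip(forward=1101.97, sigma=0.25, tau=30.0 / 365.0))
```

With market data the strip is discrete and truncated at the lowest and highest listed strikes, so the same sum is only an approximation of the log contract's price.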

3. Extreme-Strike Asymptotics.

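Results that are considered model free will usually require some assumptions, such as the VIX being finite almost surely, a condition that is ensured by Condition 2.1.

[Figure 2.2 plot: "VIX Option Implied Volatility 09-Sep-2010, maturity 20-Oct-2010"; horizontal axis $\log(K / E_t \mathrm{VIX}_T)$, vertical axis $\hat\nu$.]

Fig. 2.2: Implied volatility for VIX call options on September 9th 2010, maturity on October 20th 2010. The VIX future price is $X_{t,T} = E_t \mathrm{VIX}_T =$ 27 .

Proposition 3.1. Let $S_t B_{0,t}$ be a supermartingale on $[0, T]$ with $\int_0^T \sigma_t^2\, dt < \infty$ a.s. and satisfying Condition 2.2. Then Condition 2.1 holds.

Proof. The stochastic integral $\int_0^t \sigma_u\, dW_u$ is a local martingale, and so there exists a family of increasing stopping times $(\mathcal{T}_n^1)_{n=1,2,3,\dots}$ such that $\int_0^{t \wedge \mathcal{T}_n^1} \sigma_u\, dW_u$ is a martingale for all $n < \infty$ and $\mathcal{T}_n^1 \wedge t \nearrow t$ a.s. as $n \to \infty$. There also exists an increasing family $(\mathcal{T}_n^2)_{n=1,2,3,\dots}$ such that $\int_0^{t \wedge \mathcal{T}_n^2} \sigma_u S_u\, dW_u$ is a martingale for all $n < \infty$, and $\mathcal{T}_n^2 \wedge t \nearrow t$ a.s. as $n \to \infty$, and hence $E S_{t \wedge \mathcal{T}_n^2} = S_0 E\left[1 / B_{0, t \wedge \mathcal{T}_n^2}\right]$. Then defining $\mathcal{T}_n = \mathcal{T}_n^1 \wedge \mathcal{T}_n^2$, the expected variance of the stopped process can be bounded from above,

$$\begin{aligned}
\frac{1}{2} E\int_0^{T \wedge \mathcal{T}_n} \sigma_t^2\, dt
&= E\int_0^{T \wedge \mathcal{T}_n} \frac{dS_t}{S_t} - \log\left(E S_{T \wedge \mathcal{T}_n} / S_0\right) - E\log\left(S_{T \wedge \mathcal{T}_n} / E S_{T \wedge \mathcal{T}_n}\right) \\
&= r E\int_0^{T \wedge \mathcal{T}_n} dt - \log\left(E e^{r(T \wedge \mathcal{T}_n)}\right) + \int_0^{E S_{T \wedge \mathcal{T}_n}} \frac{E(K - S_{T \wedge \mathcal{T}_n})^+}{K^2}\, dK + \int_{E S_{T \wedge \mathcal{T}_n}}^\infty \frac{E(S_{T \wedge \mathcal{T}_n} - K)^+}{K^2}\, dK \\
&\leq \underbrace{r E\int_0^{T \wedge \mathcal{T}_n} dt - r E(T \wedge \mathcal{T}_n)}_{=0} + \int_0^{E S_{T \wedge \mathcal{T}_n}} \frac{E(K - S_{T \wedge \mathcal{T}_n})^+}{K^2}\, dK + \int_{E S_{T \wedge \mathcal{T}_n}}^\infty \frac{E(S_{T \wedge \mathcal{T}_n} - K)^+}{K^2}\, dK,
\end{aligned}$$

where the last line comes from Jensen's inequality. Using the bound $E(S_{T \wedge \mathcal{T}_n} - K)^+ \leq E S_{T \wedge \mathcal{T}_n}$, the above quantity can be further estimated,

$$\frac{1}{2} E\int_0^{T \wedge \mathcal{T}_n} \sigma_t^2\, dt \leq \int_0^{E S_{T \wedge \mathcal{T}_n}} \frac{E(K - S_{T \wedge \mathcal{T}_n})^+}{K^2}\, dK + E S_{T \wedge \mathcal{T}_n} \int_{E S_{T \wedge \mathcal{T}_n}}^\infty \frac{1}{K^2}\, dK = \int_0^{E S_{T \wedge \mathcal{T}_n}} \frac{E(K - S_{T \wedge \mathcal{T}_n})^+}{K^2}\, dK + 1.$$

On the left-hand side $\lim_{n \to \infty} E\int_0^{T \wedge \mathcal{T}_n} \sigma_t^2\, dt = E\int_0^T \sigma_t^2\, dt$ by the monotone convergence theorem.

On the right-hand side a bound comes from the estimate $E(K - S_{T \wedge \mathcal{T}_n})^+ \leq \frac{E S_{T \wedge \mathcal{T}_n}^{-q}}{q+1}\left(\frac{q}{q+1}\right)^q K^{q+1}$ for all $K \in [0, \infty)$ and $q > 0$. By convexity of $x \mapsto x^{-q}$,

$$\left(B_{T \wedge \mathcal{T}_n, T} S_T\right)^{-q} \geq S_{T \wedge \mathcal{T}_n}^{-q} - q\, \frac{B_{T \wedge \mathcal{T}_n, T} S_T - S_{T \wedge \mathcal{T}_n}}{S_{T \wedge \mathcal{T}_n}^{q+1}},$$

with the expectation being bounded,

$$\begin{aligned}
E S_{T \wedge \mathcal{T}_n}^{-q}
&\leq E\left[\left(B_{T \wedge \mathcal{T}_n, T} S_T\right)^{-q} + q\left(B_{T \wedge \mathcal{T}_n, T} S_T - S_{T \wedge \mathcal{T}_n}\right) S_{T \wedge \mathcal{T}_n}^{-(q+1)}\right] \\
&= E\Big[\left(B_{T \wedge \mathcal{T}_n, T} S_T\right)^{-q} + q \underbrace{E\left[\left(B_{T \wedge \mathcal{T}_n, T} S_T - S_{T \wedge \mathcal{T}_n}\right) \,\middle|\, \mathcal{F}_{T \wedge \mathcal{T}_n}\right]}_{\leq 0 \text{ (supermartingale)}} S_{T \wedge \mathcal{T}_n}^{-(q+1)}\Big] \\
&\leq B_{0,T}^{-q} E S_T^{-q},
\end{aligned}$$

and hence there is the bound,

$$\int_0^{E S_{T \wedge \mathcal{T}_n}} \frac{E(K - S_{T \wedge \mathcal{T}_n})^+}{K^2}\, dK \leq \frac{B_{0,T}^{-q} E S_T^{-q}}{q+1}\left(\frac{q}{q+1}\right)^q \int_0^{E S_{T \wedge \mathcal{T}_n}} K^{q-1}\, dK \leq \frac{B_{0,T}^{-q} E S_T^{-q}}{q(q+1)}\left(\frac{q}{q+1}\right)^q \left(S_0 / B_{0,T}\right)^q,$$

which is finite for some $q > 0$ by Condition 2.2. Taking $n \to \infty$,

$$\frac{1}{2} E\int_0^T \sigma_t^2\, dt \leq \frac{E S_T^{-q}}{q(q+1)}\left(\frac{q}{q+1}\right)^q \left(S_0 / B_{0,T}^2\right)^q + 1,$$

and if $E S_T^{-q} < \infty$ then Condition 2.1 holds.

Another instance where finite SPX moments are important is in determining the existence of the MGF of $\mathrm{VIX}_T^2$. The following lemma will be used throughout the rest of the paper:

Lemma 3.2.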

Let $S_t B_{0,t}$ be a supermartingale on $[0, T + \tau]$ satisfying Condition 2.1. For any $\xi \in \mathbb{R}$, the MGF $E_t e^{\xi \mathrm{VIX}_T^2}$ satisfies the inequality

$$E_t e^{\xi \mathrm{VIX}_T^2} \leq \frac{1}{q} E_t (S_{T+\tau})^{-2\xi q/\tau} + \frac{1}{p} E_t \left(S_T / B_{T,T+\tau}\right)^{2\xi p/\tau} \qquad \forall\, t \leq T, \tag{3.1}$$

where $p > 1$ and $q > 1$ with $\frac{1}{p} + \frac{1}{q} = 1$ (i.e. $p$ and $q$ are conjugate exponents). This is a strict inequality if $E_t\left(\frac{S_{T+\tau}}{E_T S_{T+\tau}} - 1\right)^2 > 0$.

Proof. From Jensen's and Young's inequality,

$$\begin{aligned}
E_t e^{\xi \mathrm{VIX}_T^2}
&= E_t e^{-\frac{2\xi}{\tau} E_T \log\left(S_{T+\tau} / E_T S_{T+\tau}\right)} \\
&\leq E_t \exp\left(\log\left(\frac{S_{T+\tau}}{E_T S_{T+\tau}}\right)^{-2\xi/\tau}\right) && \text{(Jensen's inequality)} \\
&= E_t \left(\frac{S_{T+\tau}}{E_T S_{T+\tau}}\right)^{-2\xi/\tau} \\
&\leq \frac{1}{q} E_t (S_{T+\tau})^{-2\xi q/\tau} + \frac{1}{p} E_t \left(\frac{1}{E_T S_{T+\tau}}\right)^{-2\xi p/\tau} && \text{(Young's inequality)} \\
&\leq \frac{1}{q} E_t (S_{T+\tau})^{-2\xi q/\tau} + \frac{1}{p} E_t \left(S_T / B_{T,T+\tau}\right)^{2\xi p/\tau}.
\end{aligned}$$

In this case, Jensen's inequality is an equality iff the random variable has zero variance (i.e. if $E_t\left(\frac{S_{T+\tau}}{E_T S_{T+\tau}} - 1\right)^2 = 0$), and hence the inequality is strict in non-degenerate cases.

Lemma 3.2 is a useful tool when evaluating the market for VIX options, primarily because it shows how existence of a negative moment $E_t S_{T+\tau}^{-q}$ for some $q > 0$ implies finiteness of the MGF of $\mathrm{VIX}_T^2$ for some $\xi > 0$. Conversely, if $E_t e^{\xi \mathrm{VIX}_T^2} = \infty$ for all $\xi > 0$, then $E_t S_{T+\tau}^{-q} = \infty$ for all $q > 0$. In both cases, for $\frac{2\xi p}{\tau} \leq 1$ it holds that $E_t (S_T / B_{T,T+\tau})^{2\xi p/\tau} < \infty$, so that $E_t (S_{T+\tau})^{-2\xi q/\tau} < \infty$ implies $E_t e^{\xi \mathrm{VIX}_T^2} < \infty$, and $E_t e^{\xi \mathrm{VIX}_T^2} = \infty$ implies $E_t (S_{T+\tau})^{-2\xi q/\tau} = \infty$.

The Moment Formula from [Lee04] consists of part 1 and part 2 describing the right and left tail, respectively, of the implied volatility smile. Let the price process $S_t$ be a martingale and define $\tilde p \triangleq \sup\{p \geq 0 \mid E_t S_T^{1+p} < \infty\}$. Part 1 states that

$$\limsup_{K \nearrow \infty} \frac{\hat\sigma^2(t, K, T)}{|\log(K / E_t S_T)| / (T - t)} = \beta_R \in [0, 2], \tag{3.2}$$

where $\tilde p = \frac{1}{2\beta_R} + \frac{\beta_R}{8} - \frac{1}{2}$. Equivalently, $\beta_R = 2 - 4\left(\sqrt{\tilde p^2 + \tilde p} - \tilde p\right)$ with $\beta_R = 0$ when $\tilde p = \infty$. Next define $\tilde q \triangleq \sup\{q \geq 0 \mid E_t S_T^{-q} < \infty\}$. Part 2 states that

$$\limsup_{K \searrow 0} \frac{\hat\sigma^2(t, K, T)}{|\log(K / E_t S_T)| / (T - t)} = \beta_L \in [0, 2] \tag{3.3}$$

where $\tilde q = \frac{1}{2\beta_L} + \frac{\beta_L}{8} - \frac{1}{2}$. Equivalently, $\beta_L = 2 - 4\left(\sqrt{\tilde q^2 + \tilde q} - \tilde q\right)$ with $\beta_L = 0$ when $\tilde q = \infty$. Parts 1 and 2 both take $1/0 \triangleq \infty$. For many models the limit supremum in (3.2) and (3.3) can be replaced with a proper limit (see [BF08]). Figure 3.1 shows how the moment formulas apply to the data with $\beta_L$ and $\beta_R$ estimated from the most extreme strikes in the September 2010 options data seen in Figures 2.1 and 2.2.

[Figure 3.1 plots. Left panel: SPX data $\hat\sigma$ against $\log(K / E_t S_T)$ with a fitted curve proportional to $\sqrt{|\log(K / E_t S_T)| / (T - t)}$. Right panel: VIX data $\hat\nu$ against $\log(K / E_t \mathrm{VIX}_T)$ with a fitted curve proportional to $\sqrt{\log(K / E_t \mathrm{VIX}_T) / (T - t)}$.]

Fig. 3.1: Left: From the lowest-strike SPX put option there is the estimate $\beta_L \geq$ . Right: From the highest-strike VIX option there is the estimate $\sqrt{\beta_R^{vix}} \geq$ .

Define $\tilde p^{vix} = \sup\left\{p > 0 \,\middle|\, E_t \mathrm{VIX}_T^p < \infty\right\}$. If $\tilde p^{vix} < \infty$ then the MGF $E_t e^{\xi \mathrm{VIX}_T^2} = \infty$ for all $\xi > 0$, and by Lemma 3.2 $E_t S_{T+\tau}^{-q} = \infty$ for all $q > 0$. Hence by part 2 of the moment formula, as stated in equation (3.3), it follows that $\beta_L = 2$ for SPX options with exercise at time $T + \tau$, and the implied volatility limit is at its maximum,

$$\limsup_{K \searrow 0} \frac{\hat\sigma^2(t, K, T + \tau)}{|\log(K / E_t S_{T+\tau})| / (T + \tau - t)} = 2.$$

Similarly, moments of SPX's distribution can say something about implied volatility of VIX options. Define $\beta_R^{vix} = \limsup_{K \nearrow \infty} \frac{\hat\nu^2(t, K, T)}{\log(K / X_{t,T}) / (T - t)}$, and suppose $\tilde q = \sup\left\{q > 0 \,\middle|\, E_t S_{T+\tau}^{-q} < \infty\right\} > 0$. Then from Lemma 3.2 it follows that $E_t e^{\xi \mathrm{VIX}_T^2} < \infty$ for some $\xi > 0$. Moreover, if $\mathrm{VIX}_T^2$ has finite MGF for positive $\xi$ then $E_t \mathrm{VIX}_T^n < \infty$ for all $n > 0$, and then by equation (3.2) it follows that $\beta_R^{vix} = 0$, giving the extreme-strike asymptotic

$$\limsup_{K \nearrow \infty} \frac{\hat\nu^2(t, K, T)}{\log(K / X_{t,T}) / (T - t)} = 0.$$

Moreover, finite MGF for $\mathrm{VIX}_T^2$ for positive $\xi$ is equivalent to saying that the VIX-squared does not have a heavy-tailed distribution, and hence $\tilde q > 0$ implies that $\mathrm{VIX}_T^2$ does not have heavy tails.

More generally, the moment formula and Lemma 3.2 are used to show how implied volatility from VIX options gives a lower bound on the implied volatilities of SPX options.

Proposition 3.3.

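Assume Condition 2.1 and let $\tilde\xi = \sup\left\{\xi \geq 0 \,\middle|\, E_t e^{\xi \mathrm{VIX}_T^2} < \infty\right\}$.

1. If $\tilde\xi < \infty$ and $E_t S_T^{4\tilde\xi/\tau} < \infty$, then $\tilde q = \sup\left\{q \geq 0 \,\middle|\, E_t S_{T+\tau}^{-q} < \infty\right\} \leq \frac{4\tilde\xi}{\tau}$, and

$$\limsup_{K \searrow 0} \frac{\hat\sigma^2(t, K, T + \tau)}{|\log(K / E_t S_{T+\tau})| / (T + \tau - t)} = \beta_L \geq 2 - 4\left(\sqrt{\left(\frac{4\tilde\xi}{\tau}\right)^2 + \frac{4\tilde\xi}{\tau}} - \frac{4\tilde\xi}{\tau}\right) \geq 0.$$

2. If $\tilde\xi < \infty$ and $E_t S_{T+\tau}^{-4\tilde\xi/\tau} < \infty$, then $\frac{4\tilde\xi}{\tau} \geq 1$ and $\tilde p = \sup\left\{p \geq 0 \,\middle|\, E_t S_T^{1+p} < \infty\right\} \leq \frac{4\tilde\xi}{\tau} - 1$, and

$$\limsup_{K \nearrow \infty} \frac{\hat\sigma^2(t, K, T)}{|\log(K / E_t S_T)| / (T - t)} = \beta_R \geq 2 - 4\left(\sqrt{\left(\frac{4\tilde\xi}{\tau} - 1\right)^2 + \frac{4\tilde\xi}{\tau} - 1} - \left(\frac{4\tilde\xi}{\tau} - 1\right)\right) = -2 - 4\left(\sqrt{\left(\frac{4\tilde\xi}{\tau}\right)^2 - \frac{4\tilde\xi}{\tau}} - \frac{4\tilde\xi}{\tau}\right) \geq 0.$$

Proof. Part 1: If $\tilde\xi < \infty$ then for any $\epsilon \in (0, 1)$ it is the case that $E_t e^{\tilde\xi(1+\epsilon)\mathrm{VIX}_T^2} = \infty$. Using Lemma 3.2 with $p = 2/(1+\epsilon)$ and $q = 2/(1-\epsilon)$, if $E_t S_T^{4\tilde\xi/\tau} < \infty$ it then follows that $E_t S_{T+\tau}^{-\frac{4\tilde\xi(1+\epsilon)}{\tau(1-\epsilon)}} = \infty$ and therefore $\tilde q = \sup\{q \geq 0 \mid E_t S_{T+\tau}^{-q} < \infty\} < \frac{4\tilde\xi(1+\epsilon)}{\tau(1-\epsilon)}$ for all $\epsilon \in (0, 1)$, so that $\tilde q \leq \frac{4\tilde\xi}{\tau}$. Now define the function $f(q) = 2 - 4\left(\sqrt{q^2 + q} - q\right)$ and notice that $f'(q) < 0$ for all $q > 0$, so that in equation (3.3) the limit supremum is $\beta_L = f(\tilde q) \geq f\left(\frac{4\tilde\xi}{\tau}\right)$.

Part 2: If $\tilde\xi < \infty$ then for any $\epsilon \in (0, 1)$ it is the case that $E_t e^{\tilde\xi(1+\epsilon)\mathrm{VIX}_T^2} = \infty$. Using Lemma 3.2 with $p = 2/(1-\epsilon)$ and $q = 2/(1+\epsilon)$, if $E_t S_{T+\tau}^{-4\tilde\xi/\tau} < \infty$ then it follows that $E_t S_T^{\frac{4\tilde\xi(1+\epsilon)}{\tau(1-\epsilon)}} = \infty$, therefore $1 + \tilde p < \frac{4\tilde\xi(1+\epsilon)}{\tau(1-\epsilon)}$ for all $\epsilon \in (0, 1)$, so that $\tilde p \leq \frac{4\tilde\xi}{\tau} - 1$. The remainder of the proof is similar to the argument used in Part 1, except with an application of equation (3.2).

3.2. Replication of VIX MGF with VIX Options.

From a market of VIX options with a continuum of strikes comes a tremendous amount of information about the SPX. In particular, via the Breeden and Litzenberger formula [BL78] one obtains a risk-neutral distribution on the portfolio of calls and puts given in equation (2.2). Via integration-by-parts it is possible to replicate risk-neutral expectations using European call and put options (see [Bic82, CM01]). This section uses such techniques to show how the MGF of $\mathrm{VIX}_T^2$ can be replicated, and if $E_t e^{\xi \mathrm{VIX}_T^2} < \infty$ for some $\xi > 0$ then this gives a rate at which VIX call prices must decay as $K$ grows.

Moments on VIX can be replicated with VIX options: for positive $n > 1$,

$$B_{t,T} E_t \mathrm{VIX}_T^n = n(n-1)\int_0^\infty K^{n-2}\, C^{vix}(t, K, T)\, dK.$$

Furthermore, the MGF of $\mathrm{VIX}_T^2$ can be replicated, and if finite will give a rate at which call options must decay for large strikes.

Proposition 3.4.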

Suppose Condition 2.1. For any $\xi \in \mathbb{R}$ there is a replication of $e^{\xi \mathrm{VIX}_T^2}$ that gives the MGF of $\mathrm{VIX}_T^2$ in terms of VIX options,

$$E_t e^{\xi \mathrm{VIX}_T^2} = 1 + \frac{1}{B_{t,T}}\int_0^\infty \left(2\xi + 4\xi^2 K^2\right) e^{\xi K^2}\, C^{vix}(t, K, T)\, dK.$$

Moreover, if the MGF $E_t e^{\xi \mathrm{VIX}_T^2} < \infty$ for some $\xi > 0$, then

$$\lim_{K \to \infty} K^2 e^{\xi K^2}\, C^{vix}(t, K + \epsilon, T) = 0 \qquad \text{for all } \epsilon > 0 \text{ and for all } t \leq T. \tag{3.4}$$

Proof. It follows from Condition 2.1 that $\mathrm{VIX}_T < \infty$ almost surely, and then integration by parts is used to check that

$$e^{\xi \mathrm{VIX}_T^2} = 1 + \int_0^\infty \left(2\xi + 4\xi^2 K^2\right) e^{\xi K^2} (\mathrm{VIX}_T - K)^+\, dK.$$

Then, for $t < T$ the monotone convergence theorem is used to obtain

$$\begin{aligned}
\int_0^\infty \left(2\xi + 4\xi^2 K^2\right) e^{\xi K^2}\, C^{vix}(t, K, T)\, dK
&= B_{t,T} \lim_{N \to \infty} E_t \int_0^N \left(2\xi + 4\xi^2 K^2\right) e^{\xi K^2} (\mathrm{VIX}_T - K)^+\, dK \\
&= B_{t,T}\, E_t \lim_{N \to \infty} \int_0^N \left(2\xi + 4\xi^2 K^2\right) e^{\xi K^2} (\mathrm{VIX}_T - K)^+\, dK \\
&= B_{t,T}\left(E_t e^{\xi \mathrm{VIX}_T^2} - 1\right).
\end{aligned}$$

Moreover, $E_t e^{\xi \mathrm{VIX}_T^2} < \infty$ if and only if $\int_0^\infty \left(2\xi + 4\xi^2 K^2\right) e^{\xi K^2}\, C^{vix}(t, K, T)\, dK < \infty$, in which case for $a$ large and finite and any $\epsilon > 0$,

$$\int_0^{a+\epsilon} K^2 e^{\xi K^2}\, C^{vix}(t, K, T)\, dK \geq \int_0^a K^2 e^{\xi K^2}\, C^{vix}(t, K, T)\, dK + \epsilon a^2 e^{\xi a^2}\, C^{vix}(t, a + \epsilon, T),$$

where the inequality uses the monotonically-decreasing property of the call option in $K$. Taking the limit as $a$ tends toward infinity yields

$$0 \geq \lim_{a \to \infty} a^2 e^{\xi a^2}\, C^{vix}(t, a + \epsilon, T),$$

and since the quantity is non-negative it follows that the limit is zero.

Section 3.3 will show how the replication in Proposition 3.4 is useful, as it can be part of an argument to show that reversal of the inequality in Lemma 3.2 can result in static arbitrage. Moreover, this static replication is informative because it shows exponential decay for large-$K$ VIX call options if there is finiteness of $E_t e^{\xi \mathrm{VIX}_T^2}$ for some $\xi > 0$. However, this large-$K$ asymptotic cannot be differentiated to get a tail distribution, but Section 3.4 will explore how to obtain the VIX's tail distribution.

Suppose there are mispriced options such that there is a reversal of the inequality in (3.1). Lemma 3.2 derives (3.1) under the assumption that $E$ is a risk-neutral expectation, and so there will be an arbitrage if such a mispricing occurs. In general, mispricings may not have an easily-identified arbitrage portfolio, but there are clearly defined, static portfolios that replicate all quantities considered in inequality (3.1), and this section will show how mispricing can be exploited with a static portfolio of tradeable assets; the tradeable assets are futures and options on the underlying $S$ and VIX, as well as shares in $S$.

Assume $S_t B_{0,t}$ is a martingale so that the future on $S_T$ is $F_{t,T} = S_t / B_{t,T}$ for all $0 \leq t < T$. For some $\xi$, $p$ and $q$ such that $\frac{2\xi p}{\tau} > 1$, let $\tilde\xi = \frac{2\xi}{\tau}$ and consider the following strategy at time $t < T$:

- Buy portfolio $\Pi_t^1$ composed of European put options $P(t, \cdot, T + \tau)$,

$$\Pi_t^1 = \frac{\tilde\xi(\tilde\xi q + 1)}{B_{T,T+\tau}}\int_0^\infty \frac{P(t, K, T + \tau)}{K^{\tilde\xi q + 2}}\, dK,$$

- Buy portfolio $\Pi_t^2$ composed of European call options $C(t, \cdot, T)$,

$$\Pi_t^2 = \frac{\tilde\xi(\tilde\xi p - 1)}{B_{T,T+\tau}^{\tilde\xi p}}\int_0^\infty C(t, K, T)\, \frac{dK}{K^{2 - \tilde\xi p}},$$

- Sell portfolio $\Pi_t^3$ composed of VIX call options $C^{vix}(t, \cdot, T)$,

$$\Pi_t^3 = B_{t,T} + \int_0^\infty \left(2\xi + 4\xi^2 K^2\right) e^{\xi K^2}\, C^{vix}(t, K, T)\, dK.$$

It can be verified using integration by parts (see [Bic82, CL08, CM01, Lee04]) that these are replicating portfolios such that

$$\Pi_t^1 = \text{time-}t \text{ value of a claim on } \frac{1}{q B_{T,T+\tau}} S_{T+\tau}^{-\tilde\xi q}, \quad \text{for all } t \leq T + \tau,$$
$$\Pi_t^2 = \text{time-}t \text{ value of a claim on } \frac{1}{p}\left(F_{T,T+\tau}\right)^{\tilde\xi p}, \quad \text{for all } t \leq T,$$

and as Proposition 3.4 shows

$$\Pi_t^3 = \text{time-}t \text{ value of a claim on } e^{\xi \mathrm{VIX}_T^2}, \quad \text{for all } t \leq T.$$

In terms of these portfolios, inequality (3.1) is equivalent to

$$\Pi_t^3 < \Pi_t^1 + \Pi_t^2 \qquad \forall\, t \leq T. \tag{3.5}$$

If inequality (3.5) is reversed then there will be a static arbitrage consisting of listed option prices, as shown in Propositions 3.5 and 3.6.
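The three replicating portfolios rest on payoff-level identities that can be checked numerically before any pricing is involved. The sketch below is an illustration under stated assumptions, not the paper's procedure: it uses a simple midpoint rule, hypothetical function names, and arbitrary test values, and it verifies that a strip of calls weighted by $(2\xi + 4\xi^2 K^2)e^{\xi K^2}$ reproduces $e^{\xi x^2}$, that a call strip weighted by $a(a-1)K^{a-2}$ reproduces $s^a$ for $a > 1$, and that a put strip weighted by $a(a+1)K^{-a-2}$ reproduces $s^{-a}$ for $a > 0$.

```python
import math


def midpoint(f, a, b, n=20000):
    """Midpoint rule; it never evaluates f at the endpoints a and b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def exp_square_from_calls(x, xi):
    """1 + int_0^inf (2 xi + 4 xi^2 K^2) e^{xi K^2} (x - K)^+ dK, the payoff behind Pi^3."""
    weight = lambda k: (2.0 * xi + 4.0 * xi ** 2 * k ** 2) * math.exp(xi * k ** 2)
    # The call payoff (x - K)^+ vanishes for K >= x, so the integral stops at x.
    return 1.0 + midpoint(lambda k: weight(k) * (x - k), 0.0, x)


def positive_power_from_calls(s, a):
    """a (a - 1) int_0^inf K^(a-2) (s - K)^+ dK for a > 1, the call strip behind Pi^2."""
    return a * (a - 1.0) * midpoint(lambda k: k ** (a - 2.0) * (s - k), 0.0, s)


def negative_power_from_puts(s, a):
    """a (a + 1) int_0^inf K^(-a-2) (K - s)^+ dK for a > 0, the put strip behind Pi^1.

    The substitution K = s / u maps the strike range (s, inf) onto u in (0, 1).
    """
    integrand = lambda u: u ** (a - 1.0) * (1.0 - u)
    return a * (a + 1.0) * s ** (-a) * midpoint(integrand, 0.0, 1.0)


if __name__ == "__main__":
    print(exp_square_from_calls(0.27, 2.0), math.exp(2.0 * 0.27 ** 2))
    print(positive_power_from_calls(1.2, 3.0), 1.2 ** 3.0)
    print(negative_power_from_puts(1.2, 2.0), 1.2 ** -2.0)
```

Taking discounted risk-neutral expectations of these identities turns the payoffs $(x - K)^+$ and $(K - s)^+$ into listed call and put prices, which is how the option strips in the portfolios arise.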

Proposition 3.5.

There is arbitrage at time $t = T$ if inequality (3.5) is reversed.

Proof. Let $V_T$ denote the future (non-discounted) valuation of a claim. $V_T$ is obtained from replication with listed prices for the underlying and options $P(T, K, T + \tau)$ and/or $C(T, K, T + \tau)$ (i.e. the integration-by-parts/replication method in [Bic82, CL08, CM01, Lee04]). For example, $V_T[S_{T+\tau}] = F_{T,T+\tau} = C(T, 0, T + \tau) / B_{T,T+\tau}$ and $-\frac{2}{\tau} V_T\left[\log(S_{T+\tau} / F_{T,T+\tau})\right] = \mathrm{VIX}_T^2$. There is arbitrage at time $T$ unless the following strict inequality holds:

$$\Pi_T^3 < V_T\left[e^{-\frac{2\xi}{\tau}\log(S_{T+\tau} / F_{T,T+\tau})}\right]. \tag{3.6}$$

Using the fact that $\Pi_T^3 = e^{\xi \mathrm{VIX}_T^2}$, if inequality (3.6) is reversed there is the following arbitrage portfolio:

- Short $e^{\xi \mathrm{VIX}_T^2}$ claims on $-\frac{2\xi}{\tau}\log(S_{T+\tau} / F_{T,T+\tau})$,
- Long 1 claim on $(S_{T+\tau} / F_{T,T+\tau})^{-2\xi/\tau}$ (value given by the right-hand side of (3.6)),
- Hold $(\xi \mathrm{VIX}_T^2 - 1)\, e^{\xi \mathrm{VIX}_T^2}$ contracts in the bond with price $B_{T,T+\tau}$,

which has non-positive entry cost at time $T$, and at time $T + \tau$ has positive payoff due to the inequality $e^{x} - e^{x_0} - e^{x_0}(x - x_0) > 0$ for $x \neq x_0$, with $x = -\frac{2\xi}{\tau}\log(S_{T+\tau} / F_{T,T+\tau})$ and $x_0 = \xi \mathrm{VIX}_T^2$.

Also at time $T$, there is an arbitrage unless the following strict inequality holds:

$$V_T\left[e^{-\frac{2\xi}{\tau}\log(S_{T+\tau} / F_{T,T+\tau})}\right] < \Pi_T^1 + \Pi_T^2. \tag{3.7}$$

Now using the fact that $\Pi_T^1$ is a claim on $\frac{1}{q B_{T,T+\tau}} S_{T+\tau}^{-2\xi q/\tau}$ and $\Pi_T^2$ is a settled claim equal to $\frac{1}{p}(F_{T,T+\tau})^{2\xi p/\tau}$, if inequality (3.7) is reversed there is the following arbitrage portfolio:

- Short 1 claim on $e^{-\frac{2\xi}{\tau}\log(S_{T+\tau} / F_{T,T+\tau})}$,
- Long $B_{T,T+\tau}$-many of portfolio $\Pi_T^1$,
- Hold $\Pi_T^2$-many contracts in the $B_{T,T+\tau}$ bond,

which has non-positive entry cost at time $T$. At time $T + \tau$ this portfolio has non-negative value and non-zero probability of positive payoff due to Young's inequality $e^{x_1 + x_2} \leq \frac{1}{q} e^{x_1 q} + \frac{1}{p} e^{x_2 p}$ with $x_1 = -\frac{2\xi}{\tau}\log(S_{T+\tau})$ and $x_2 = \frac{2\xi}{\tau}\log(F_{T,T+\tau})$, and where there is equality iff $x_1 = \frac{p}{q} x_2$. Hence, from equations (3.6) and (3.7) there is the strict inequality

$$\Pi_T^3 < \Pi_T^1 + \Pi_T^2.$$

Proposition 3.6.

There is arbitrage at time $t < T$ if inequality (3.5) is reversed.

Proof. If inequality (3.5) is reversed, then there is a clear arbitrage by longing $\Pi_t^1$ and $\Pi_t^2$ and shorting $\Pi_t^3$ for a non-zero net cash flow at time $t$, which will then receive a positive cash flow at time $T$ because it was shown in Proposition 3.5 that $\Pi_T^1 + \Pi_T^2 - \Pi_T^3 > 0$.

3.4. Extreme-Value Theory for VIX's Distribution.

The moment-formula limits $\beta_R$ and $\beta_L$ have a relationship to the rates in the extreme-value distributions of the underlying asset. In particular, the stochastic volatility inspired (SVI) parameterization of the volatility surface leads to an extreme-value distribution that is parameterized by $\beta_R$ for the right tail and $\beta_L$ for the left. The SVI parameterization is related to extreme strikes because its construction is consistent with the moment formulas in (3.2) and (3.3), namely that the square of large-strike implied volatility is proportional to log-moneyness divided by time-to-maturity (see [Gat06, BGK13]). The result of this section can be applied as a risk-neutral systemic risk indicator, which is related to those proposed in [Mal13].

This section's main technical hurdle is that the limiting behavior for VIX options given in Proposition 3.4 cannot be differentiated in $K$. In other words, the limit in (3.4) shows a rate of convergence for $C^{vix}$, but cannot be differentiated to find an asymptotic for the tail distribution $\frac{\partial}{\partial K} C^{vix}$. However, by assuming differentiability in $K$ and assuming that the SVI is not misspecified, the asymptotic tail distribution is obtained.
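The correspondence between a moment-formula slope and a critical moment in equations (3.2) and (3.3) is a pair of explicit maps, which can be sketched as follows. This is an illustrative helper and not code from the paper; the function names are hypothetical, and the argument `m` stands for either $\tilde p$ or $\tilde q$.

```python
import math


def slope_from_critical_moment(m):
    """beta = 2 - 4 (sqrt(m^2 + m) - m), with beta = 0 when m is infinite."""
    if math.isinf(m):
        return 0.0
    if m == 0.0:
        return 2.0
    # sqrt(m^2 + m) - m = m / (sqrt(m^2 + m) + m) avoids cancellation for large m
    return 2.0 - 4.0 * m / (math.sqrt(m * m + m) + m)


def critical_moment_from_slope(beta):
    """m = 1/(2 beta) + beta/8 - 1/2, with m infinite when beta = 0."""
    if beta == 0.0:
        return math.inf
    return 1.0 / (2.0 * beta) + beta / 8.0 - 0.5


if __name__ == "__main__":
    for m in (0.0, 0.5, 1.0, 10.0, math.inf):
        beta = slope_from_critical_moment(m)
        print(m, beta, critical_moment_from_slope(beta))
```

The slope decreases from 2 to 0 as the critical moment grows, so a steeper extreme-strike smile corresponds to fewer finite moments and a heavier tail.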
In particular, the limiting right slope for VIX options $\beta_R^{vix} = \limsup_{K \nearrow \infty} \frac{\hat\nu^2(t, K, T)}{\log(K / X_{t,T}) / (T - t)}$ gives the rate of polynomial decay in the peaks-over-threshold (POT) distribution of $\mathrm{VIX}_T$ if $\beta_R^{vix} \in (0, 2]$. The SVI parameterization is

$$\omega(k) = a + b\left(\rho(k - m) + \sqrt{(k - m)^2 + \sigma^2}\right) \tag{3.8}$$

where $k = \log(K / X_{t,T})$, $X_{t,T} = E_t \mathrm{VIX}_T$, and with parameters $(a, b, \rho, m, \sigma)$ fitted so that

$$\omega(k) = \hat\nu^2(t, K, T)(T - t),$$

where $\hat\nu$ is VIX-option implied volatility given in Definition 2.4. The VIX call price is then

$$C^{vix}(t, K, T) = B_{t,T} X_{t,T}\left(\Phi(d_+(k)) - \Phi(d_-(k))\, e^k\right) \tag{3.9}$$

where $d_\pm(k) = -\frac{k}{\sqrt{\omega(k)}} \pm \frac{\sqrt{\omega(k)}}{2}$.

The generalized Pareto distribution (GPD) with parameter $\alpha > 0$ is defined as

$$G_\alpha(y) \triangleq \begin{cases} 1 - (1 + y/\alpha)^{-\alpha} & \text{for } y \geq 0,\ \alpha < \infty \\ 1 - e^{-y} & \text{for } y \geq 0,\ \alpha = \infty. \end{cases}$$

The Pickands-Balkema-de Haan theorem states that the following are equivalent:

(i) The VIX's distribution function is in the maximum domain of attraction of $H_\alpha$ defined as $H_\alpha(y) = \exp(G_\alpha(y) - 1)$ (see Appendix C).

(ii) There exists a positive measurable function $a(\cdot)$ such that the peaks-over-threshold (POT) distribution converges to the GPD in the following way

$$\lim_{x \nearrow \infty} P_t\left(\frac{\mathrm{VIX}_T - x}{a(x)} \geq y \,\middle|\, \mathrm{VIX}_T \geq x\right) = 1 - G_\alpha(y).$$

For details on this theory and more general information about extreme-value theory, the reader is directed to [Deg06, Res87]. Assuming twice-differentiability in $K$, the Breeden-Litzenberger formula yields the VIX's distribution function, $P_t(\mathrm{VIX}_T \geq K) = -\frac{1}{B_{t,T}}\frac{\partial}{\partial K} C^{vix}(t, K, T)$. Then for two large strikes $K_2 > K_1$ the POT distribution can be written as

$$P_t\left(\mathrm{VIX}_T \geq K_2 \,\middle|\, \mathrm{VIX}_T \geq K_1\right) = \frac{P_t(\mathrm{VIX}_T \geq K_2)}{P_t(\mathrm{VIX}_T \geq K_1)} = \frac{\int_{K_2}^\infty \frac{\partial^2}{\partial K^2} C^{vix}(t, K, T)\, dK}{\int_{K_1}^\infty \frac{\partial^2}{\partial K^2} C^{vix}(t, K, T)\, dK}. \tag{3.10}$$

If the VIX options imply a value $\beta_R^{vix} \in (0, 2]$ then there exists $p \in (0, \infty)$ such that $E_t \mathrm{VIX}_T^p = \infty$, and the VIX's distribution is heavy tailed. However, the asymptotics from the moment formulas do not apply to the derivative $\frac{\partial^2}{\partial K^2} C^{vix}(t, K, T)$, but $\beta_R^{vix}$ combined with the SVI is used to obtain a GPD for the underlying's tail distribution.

Proposition 3.7.

Suppose the large-strike limit from equation (3.2) is applied to VIX call options to obtain
\[ \frac{\hat\nu^2(t,K,T)}{\log(K/X_{t,T})/(T-t)} \to \beta^{vix}_R \in (0,2) \quad \text{as } K \to \infty. \]
Suppose further that the implied volatility surface is fit by the SVI parameterization, where the fit does not admit static arbitrage (i.e. no butterfly or calendar-spread arbitrage). Then the asymptotic behavior for the POT distribution with scaling function $a(x) = \frac{x}{\alpha}$ is
\[ \lim_{x \nearrow \infty} \mathbb{P}_t\left(\frac{\mathrm{VIX}_T - x}{x/\alpha} \ge y \,\Big|\, \mathrm{VIX}_T \ge x\right) = \left(1 + \frac{y}{\alpha}\right)^{-\alpha}, \]
for $y \ge 0$ and $x$ tending toward infinity, where $\alpha = \frac{1}{2}\left(\frac{1}{\sqrt{\beta^{vix}_R}} + \frac{\sqrt{\beta^{vix}_R}}{2}\right)^2 > 1$ for all $\beta^{vix}_R < 2$.

Proof. No butterfly arbitrage in the SVI fit is enough for there to be an explicit formula for the VIX distribution's density function at time $T$, namely
\[ \frac{\partial}{\partial k}\mathbb{P}_t\left(\log(\mathrm{VIX}_T/X_{t,T}) \le k\right) = \frac{g(k)}{\sqrt{2\pi\omega(k)}}\exp\left(-\frac{d_-(k)^2}{2}\right), \]
where $k$, $d_\pm(k)$, $\omega(k)$ are given in equations (3.8) and (3.9), and
\[ g(k) = \left(1 - \frac{k\omega'(k)}{2\omega(k)}\right)^2 - \frac{(\omega'(k))^2}{4}\left(\frac{1}{\omega(k)} + \frac{1}{4}\right) + \frac{\omega''(k)}{2}. \]
The SVI fit to this slice of the implied-volatility surface is free from butterfly arbitrage if $g(k) \ge 0$ for all $k \in \mathbb{R}$ and $d_+(k) \to -\infty$ as $k \to \infty$ (see [GJ14]).

The moment formula in equation (3.2) says that $\omega(k) \sim k\beta^{vix}_R$ for $k$ large (i.e. as $K \nearrow \infty$), which applied to the density function yields
\[ g(k) \sim \frac{1}{4} - \frac{(\beta^{vix}_R)^2}{16} \quad \text{for } k \text{ tending toward } \infty, \]
\[ d_-(k) \sim -\sqrt{\frac{k}{\beta^{vix}_R}} - \frac{\sqrt{k\beta^{vix}_R}}{2} \quad \text{for } k \text{ tending toward } \infty. \]
Placing these asymptotic approximations into the density yields
\[ \frac{\partial}{\partial k}\mathbb{P}_t\left(\log(\mathrm{VIX}_T/X_{t,T}) \le k\right) \sim \frac{4 - (\beta^{vix}_R)^2}{16\sqrt{2\pi k\beta^{vix}_R}}\,e^{-\alpha k} \quad \text{for } k \text{ tending toward } \infty, \]
where $\alpha = \frac{1}{2}\left(\frac{1}{\sqrt{\beta^{vix}_R}} + \frac{\sqrt{\beta^{vix}_R}}{2}\right)^2 > 1$ for $\beta^{vix}_R < 2$. Placing this asymptotic approximation of the tail density into equation (3.10), and then applying L'Hôpital's rule, the result is found:
\[ \frac{\mathbb{P}_t\left(\mathrm{VIX}_T \ge x\left(1 + \frac{y}{\alpha}\right)\right)}{\mathbb{P}_t(\mathrm{VIX}_T \ge x)} \sim \frac{\int_{x(1+y/\alpha)}^\infty \frac{\partial^2}{\partial K^2}C^{vix}(t,K,T)\,dK}{\int_x^\infty \frac{\partial^2}{\partial K^2}C^{vix}(t,K,T)\,dK} \sim \frac{\int_{\ell(x,y)}^\infty \frac{1}{\sqrt{k}}e^{-\alpha k}\,dk}{\int_{\ell(x,0)}^\infty \frac{1}{\sqrt{k}}e^{-\alpha k}\,dk} \sim \left(1 + \frac{y}{\alpha}\right)^{-\alpha} \]
for $x$ large, where $\ell(x,y) = \log(x(1 + y/\alpha)/X_{t,T})$.

Proposition 3.7 shows that for $\beta^{vix}_R \in (0,2)$ with an SVI fit, the POT distribution with scaling $a(x) = \frac{x}{\alpha}$ converges to a GPD with parameter $\alpha$. Hence, the Pickands-Balkema-de Haan theorem says this POT distribution is in the maximum domain of attraction of a generalized extreme-value distribution (see Appendix C).

Remark 3.8.
Large-strike asymptotics for SVI are presented in [FGGS11]. In particular, there is a large-$k$ series expansion in powers of $k^{-1/2}$.

Remark 3.9. If $\beta^{vix}_R = 2$ then SVI may admit arbitrage as $d_+(k) \to 0$ as $k \to \infty$, a limit which can be seen by using the large-$k$ SVI expansion in [FGGS11]. If $\beta^{vix}_R = 0$ then the distribution is not heavy tailed and the extreme-value theory needs to be reworked to find an appropriate scaling function $a(\cdot)$ for the POT distribution.

Remark 3.10. The results of this section can also be applied to the SPX, namely $\beta_R$ and the SVI fit from SPX options can be used to obtain the GPD for the tail of the SPX index.
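The limit in Proposition 3.7 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it takes the asymptotic tail density proportional to $k^{-1/2}e^{-\alpha k}$ at face value, uses the closed form $\int_L^\infty k^{-1/2}e^{-\alpha k}dk = \sqrt{\pi/\alpha}\,\mathrm{erfc}(\sqrt{\alpha L})$, and compares the resulting POT ratio with the GPD survival function $(1 + y/\alpha)^{-\alpha}$. The function names and the normalization `forward=1.0` (standing in for $X_{t,T}$) are illustrative assumptions.

```python
import math


def alpha_from_beta(beta_vix_r):
    """Tail index alpha = (1/2) * (1/sqrt(beta) + sqrt(beta)/2)**2 for beta in (0, 2)."""
    if not 0.0 < beta_vix_r < 2.0:
        raise ValueError("beta_vix_r must lie in (0, 2)")
    root = math.sqrt(beta_vix_r)
    return 0.5 * (1.0 / root + root / 2.0) ** 2


def tail_mass(lower, alpha):
    """Closed form of the integral of k**-0.5 * exp(-alpha*k) over [lower, inf)."""
    return math.sqrt(math.pi / alpha) * math.erfc(math.sqrt(alpha * lower))


def pot_survival(x, y, alpha, forward=1.0):
    """Ratio of asymptotic tail masses at x*(1 + y/alpha) and at x (requires x > forward)."""
    upper_level = math.log(x * (1.0 + y / alpha) / forward)
    base_level = math.log(x / forward)
    return tail_mass(upper_level, alpha) / tail_mass(base_level, alpha)


def gpd_survival(y, alpha):
    """Survival function 1 - G_alpha(y) of the generalized Pareto distribution."""
    return (1.0 + y / alpha) ** (-alpha)


if __name__ == "__main__":
    alpha = alpha_from_beta(1.0)  # illustrative slope, not a calibrated value
    for x in (1e2, 1e10, 1e40):
        print(x, pot_survival(x, 1.0, alpha), gpd_survival(1.0, alpha))
```

The convergence is slow because the two tail masses differ by a factor of order $\sqrt{\ell(x,0)/\ell(x,y)}$, which tends to one only logarithmically in $x$.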

4. Examples.

This section explores some widely-used stochastic volatility models. In particular, the MGF of $\mathrm{VIX}^2_T$ and negative moments of $S_T$ are computed for several models; these calculations give insight on how the results from Section 3 apply. Models such as the 3/2 and SABR have heavy-tailed volatility processes and both have $\mathrm{VIX}^2_T$ MGFs that become infinite on the positive real line, but each has different moment-formula asymptotics for VIX implied volatility. Other models in this section include the constant elasticity of volatility (CEV), the Heston, and the exponential Ornstein-Uhlenbeck (OU). From the September 9, 2010 data it is seen approximately that VIX call options have right-hand extreme strikes with $\sqrt{\beta_R}$ close to the value identified in Figure 3.1.

4.1. CEV Model. The following example shows how special care is required in using models where the asset price does not satisfy Condition 2.2. Take $r = 0$, $T = 1$, and the process
\[ dS_t = \sqrt{S_t}\,dW_t = \sigma_t S_t\,dW_t, \qquad t \in [0,1], \]
with $S_0 = 1$ and $\sigma_t = \sigma(S_t) = 1/\sqrt{S_t}$. This example is informative because it is a non-trivial case where a barrier (i.e. via a stopping time) is used to ensure $\mathrm{VIX}_t < \infty$ a.s. Without the stopping time there would be positive probability of infinite VIX, which would lead to infinite variance-swap rates.

The process $S_t$ is a true martingale because $\int_\epsilon^\infty \frac{1}{s\sigma^2(s)}ds = \infty$ for some $\epsilon > 0$. However, $\mathbb{P}(\min_{t \le 1} S_t = 0) > 0$, and so $\mathrm{VIX}^2_0 = \mathbb{E}\int_0^1 \frac{1}{S_u}du = \infty$ and $\mathbb{E}S_t^{-q} = \infty$ for all $t \in (0,1]$ and all $q > 0$. However, in a manner similar to the real-life variance-swap barriers discussed in [CL09], caps on payoffs based on realized variance can be modeled by using the stopping time
\[ \mathcal{T} = \min\left\{t \ge 0 \,\Big|\, \int_0^t \frac{1}{S_u}du = M\right\} \]
and then by considering the stopped process $\tilde S_t = S_{t \wedge \mathcal{T}}$. The VIX on $\tilde S$ is then bounded,
\[ \mathrm{VIX}_0 = \sqrt{\mathbb{E}\int_0^{\mathcal{T} \wedge 1}\frac{1}{S_u}du} \le \sqrt{M}, \]
which is sufficient to have the Novikov condition for the martingales
\[ Z^\pm_t = \exp\left(-\frac{1}{2}\int_0^{\mathcal{T} \wedge t}\frac{1}{S_u}du \pm \int_0^{\mathcal{T} \wedge t}\frac{1}{\sqrt{S_u}}dW_u\right). \]
Hence, $\tilde S_t = Z^+_t$ and for any $q \in (0,1]$ the negative moments are
\[ \mathbb{E}\tilde S_t^{-q} = \mathbb{E}\left[(Z^+_t)^{-q}\right] = \mathbb{E}\exp\left(\frac{q}{2}\int_0^{\mathcal{T} \wedge t}\frac{1}{S_u}du - q\int_0^{\mathcal{T} \wedge t}\frac{1}{\sqrt{S_u}}dW_u\right) = \mathbb{E}\left[(Z^-_t)^q\exp\left(q\int_0^{\mathcal{T} \wedge t}\frac{1}{S_u}du\right)\right] \]
\[ \le e^{qM}\,\mathbb{E}\left[(Z^-_t)^q\right] \le e^{qM}\left(\mathbb{E}\left[Z^-_t\right]\right)^q = e^{qM} < \infty, \]
and so Condition 2.2 holds for $\tilde S_t$.

4.2. SABR Model. This is another example of the contraposition of Proposition 3.1 because Condition 2.1 holds yet Condition 2.2 does not because there are no negative moments on $S_t$. For $t \in [0, T+\tau]$, take $r = 0$, $S_0 = 1$ and $Y_0 >$
0, and consider the SABR stochastic volatility model
\[ d\log(S_t) = -\frac{1}{2}Y_t^2\,dt + Y_t\left(\sqrt{1-\rho^2}\,dW_t + \rho\,dB_t\right), \qquad dY_t = \alpha Y_t\,dB_t, \]
where $\alpha > 0$, $W \perp B$, and $-1 \le \rho \le 1$. It is well known that $S_t$ is a true martingale if and only if $\rho \le 0$, and for $\rho < 0$, $\mathbb{E}S^p_t < \infty$ if and only if $p \le \frac{1}{1-\rho^2}$ (see [Jou04, LM07]).

The process $Y_t$ is log-normal and almost-surely positive, and clearly $\mathbb{E}Y_t^2 < \infty$ implying that $\mathbb{E}\left(\int_0^t Y_u\,dB_u\right)^2 < \infty$, which implies Condition 2.1 and therefore the log-contract satisfies
\[ -\mathbb{E}\log(S_t) = \frac{1}{2}\mathbb{E}\int_0^t Y_u^2\,du < \infty. \]
Moreover, the VIX has all its moments
\[ \mathbb{E}\,\mathrm{VIX}^{2n}_T = \mathbb{E}\left(\mathbb{E}_T\frac{1}{\tau}\int_T^{T+\tau}Y_u^2\,du\right)^n \le \frac{1}{\tau}\mathbb{E}\int_T^{T+\tau}Y_u^{2n}\,du = \frac{Y_0^{2n}}{\tau}\int_T^{T+\tau}e^{(2n^2-n)\alpha^2 u}\,du < \infty, \]
yet has infinite MGF,
\[ \mathbb{E}\exp(\xi\,\mathrm{VIX}_T) = \mathbb{E}\exp\left(\xi\sqrt{\frac{1}{\tau}\mathbb{E}_T\int_T^{T+\tau}Y_u^2\,du}\right) = \mathbb{E}\exp(\xi C_{\tau,\alpha}Y_T) = \infty \quad \forall \xi > 0, \]
where $C_{\tau,\alpha} = \sqrt{\frac{1}{\tau Y_T^2}\mathbb{E}_T\int_T^{T+\tau}Y_u^2\,du} = \sqrt{\frac{1}{\tau\alpha^2}\left(e^{\alpha^2\tau} - 1\right)} > 0$. It follows that $\mathbb{E}e^{\xi\mathrm{VIX}^2_T} = \infty$ for all $\xi > 0$. For $\rho < 0$, $S_t$ is a martingale, there exists $\tilde p = \sup\{p \ge 0 \mid \mathbb{E}S^p_{T+\tau} < \infty\} > 0$, yet $\mathbb{E}e^{\xi\mathrm{VIX}^2_T} = \infty$ for all $\xi > 0$. Hence from equation (3.1) it is deduced that $\tilde q = \sup\{q \ge 0 \mid \mathbb{E}S^{-q}_{T+\tau} < \infty\} = 0$.

The SABR model is good for fitting implied volatilities from SPX options data. However, SABR may have trouble fitting both SPX and VIX options because volatility modeled as geometric Brownian motion will produce a flat smile that does not have the upward skew observed from VIX options, i.e. $\mathrm{VIX}_T = C_{\tau,\alpha}Y_T$ and $C_{\tau,\alpha}\mathbb{E}_t(Y_T - K/C_{\tau,\alpha})^+$ leads to a flat implied-volatility smile because $Y_T$ is log-normally distributed.

This example has demonstrated the VIX-SPX relationship identified in Lemma 3.2, namely that $\mathbb{E}_t e^{\xi\mathrm{VIX}^2_T} = \infty$ for all $\xi > 0$ and $\mathbb{E}_t S^{-q}_{T+\tau} = \infty$ for all $q > 0$.

4.3. CEV Volatility Process. This is an example of a model that is slightly better than SABR for pricing VIX options because it has implied-volatility function $\hat\nu(t,K,T)$ that is increasing in $K$. For $t \in [0, T+\tau]$, take $r = 0$, $S_0 = 1$, and consider the stochastic volatility model
\[ d\log(S_t) = -\frac{1}{2}Y_t^2\,dt + Y_t\left(\sqrt{1-\rho^2}\,dW_t + \rho\,dB_t\right), \qquad dY_t = cY_t^2\,dB_t, \]
where $c > 0$ and $W \perp B$. The process $Y_t$ is almost-surely positive, a fact that can be deduced from its transition density (see the transition density in Appendix B).

Also from the transition density, whose right tail decays like $z^{-4}$, it is seen that $\mathbb{E}Y^p_t < \infty$ for $0 \le p < 3$ and $\mathbb{E}Y^p_t = \infty$ for $p \ge 3$. Since $\mathbb{E}Y_t^2 < \infty$ it follows that $\mathbb{E}\left(\int_0^{T+\tau}Y_u\,dB_u\right)^2 < \infty$ and Condition 2.1 holds, which means the log-contract satisfies
\[ -\mathbb{E}\log(S_t) = \frac{1}{2}\mathbb{E}\int_0^t Y_u^2\,du < \infty. \]
If $\rho = 0$ or if $\rho <$
0, then $S_t$ is a martingale (shown in Appendix B); in terms of notation from Proposition 3.3 this model also has finite, positive $\tilde\xi$ and $\tilde q$ (also shown in Appendix B).

[Figure 4.1. Left panel, titled "Moment Formula for VIX-Option Implied Volatility": horizontal axis $\log(K)$, one curve per maturity $T$, with the level $(\beta^{vix}_R)^{1/2} = 0.44949$ marked. Right panel, titled "VIX Option Implied Volatility 09-Sep-2010, maturity 20-Oct-2010": horizontal axis $\log(K/\mathbb{E}_t\mathrm{VIX}_T)$, vertical axis $\hat\nu$, with series for the data $\hat\nu$, the CEV fit, and a multiple of $\sqrt{\log(K/\mathbb{E}_t\mathrm{VIX}_T)/(T-t)}$.]

Fig. 4.1: Left: Implied volatility for the CEV volatility process in Section 4.3. The plot is of the scaled VIX implied volatility of September 9th, 2010, $\frac{\hat\nu(t,K,T)\sqrt{T-t}}{\sqrt{\log(K_{max}/\mathbb{E}_t\mathrm{VIX}_T)}} \le (\beta^{vix}_R)^{1/2}$, for $Y_0 = 0.25$ and $K \le K_{max} = 93.5$. The lines for the three maturities approach $\sqrt{\beta^{vix}_R} \approx 0.44949$ as $K_{max}$ increases. Right: The stochastic volatility model proposed in Section 4.3 fit to the VIX options data from September 9th, 2010. The model captures some of the upward skew in VIX implied volatility (certainly more than SABR, which produces a flat implied volatility smile), yet cannot produce a steep enough slope to have a good fit to the data. The coefficient multiplying $\sqrt{\log(K/\mathbb{E}_t\mathrm{VIX}_T)/(T-t)}$ in the legend is the $\sqrt{\beta_R}$ that was identified in Figure 3.1.

4.4. Heston Model. The Heston model is interesting because there is some interplay between the quantities $\tilde q$ and $\tilde\xi$ from Proposition 3.3. Furthermore, one of the model's most interesting features is the role played by time in determining negative moments, namely, for any $q > 0$ there may exist a time $t^* \in (-\infty, T)$ such that $\mathbb{E}_t S_T^{-q} = \infty$ for $t \le t^*$. However, this section will show that the Heston model's time dependence in tail risk is (for the most part) absent in VIX options.

The Heston model (with $r = 0$) has squared volatility given by a Cox-Ingersoll-Ross (CIR) process, so that
\[ d\log(S_t) = -\frac{1}{2}Y_t\,dt + \sqrt{Y_t}\,dW_t, \qquad dY_t = \kappa(\bar Y - Y_t)\,dt + \gamma\sqrt{Y_t}\,dB_t, \]
where $dW_t\,dB_t = \rho\,dt$ for $\rho \in (-1,1)$, and $\gamma^2 \le 2\bar Y\kappa$.

As shown in [AP07, FGGS11], if $(q\gamma\rho + \kappa)^2 < \gamma^2 q(1+q)$ then the latest time $t^* < T$ such that $\mathbb{E}_{t^*}S_T^{-q} = \infty$ for some $q > 0$ is given by
\[ T - t^* = \frac{2}{\sqrt{\gamma^2 q(1+q) - (q\gamma\rho + \kappa)^2}}\left(\pi\mathbf{1}_{\{q\gamma\rho + \kappa > 0\}} + \tan^{-1}\left(-\frac{\sqrt{\gamma^2 q(1+q) - (q\gamma\rho + \kappa)^2}}{q\gamma\rho + \kappa}\right)\right). \]
For the VIX, the MGF $\mathbb{E}_t e^{\xi\mathrm{VIX}^2_T} < \infty$ for some $\xi >$
0, but there is also a point $\tilde\xi$ of explosion that can be calculated explicitly. Using the SDE for $Y$, the squared VIX is linear in $Y_t$,
\[ \mathrm{VIX}^2_t = \frac{1}{\tau}\mathbb{E}_t\int_t^{t+\tau}Y_u\,du = \bar Y + \frac{Y_t - \bar Y}{\kappa\tau}\left(1 - e^{-\tau\kappa}\right) = a + bY_t, \]
with $b = \frac{1 - e^{-\tau\kappa}}{\kappa\tau}$ and $a = \bar Y(1 - b)$. The explicit transition density for $Y_t$ (as given in [AS99]) is
\[ \frac{\partial}{\partial y}\mathbb{P}(Y_T \le y \mid Y_t = y_0) = c\,e^{-u-v}(v/u)^{\alpha/2}I_\alpha(2\sqrt{uv}), \tag{4.1} \]
where $c = 2\kappa/(\gamma^2(1 - e^{-\kappa(T-t)}))$, $u = cy_0e^{-\kappa(T-t)}$, $v = cy$, $\alpha = 2\bar Y\kappa/\gamma^2 - 1 \ge 0$, and $I_\alpha$ is the modified Bessel function of the first kind of order $\alpha$. The function $e^{\xi by}$ is integrable against this density for any $\xi b < c$, hence, $\mathbb{E}_t e^{\xi\mathrm{VIX}^2_T} < \infty$ if and only if $\xi < \tilde\xi$ where
\[ \tilde\xi = \frac{c}{b} = \frac{2\kappa^2\tau}{\gamma^2\left(1 - e^{-\kappa(T-t)}\right)\left(1 - e^{-\tau\kappa}\right)}. \tag{4.2} \]
Note that for $T - t$ large
\[ \frac{\partial}{\partial y}\mathbb{P}(Y_T \le y \mid Y_t = y_0) = c\,e^{-u-v}(v/u)^{\alpha/2}I_\alpha(2\sqrt{uv}) \sim \frac{y^\alpha e^{-\frac{2\kappa}{\gamma^2}y}}{\left(\frac{\gamma^2}{2\kappa}\right)^{\alpha+1}\Gamma(\alpha+1)} \]
(in fact, $Y_t$ converges weakly to a gamma-distributed random variable). Hence for large time the VIX's MGF is
\[ \mathbb{E}_t e^{\xi bY_T} \sim \left(1 - \frac{\gamma^2\xi b}{2\kappa}\right)^{-(\alpha+1)} \quad \text{for } T - t \gg 1/\kappa, \]
which exists for $\xi < \frac{2\kappa}{\gamma^2 b} = \frac{2\kappa^2\tau}{\gamma^2(1 - e^{-\tau\kappa})}$, in agreement with (4.2).

The analysis above is interesting because it shows how the Heston model has a time dynamic in SPX tail risk, but which is not as present in the VIX's distribution. In other words, there is an increase in tail risk because $\mathbb{E}_t S_T^{-q} = \infty$ for some finite $T > t$, yet the VIX's MGF exists for a segment of the positive real line for all $T > t$. This is striking because it means the Heston model gives VIX prices that do not capture the same long-term risk that is in the underlying.

Figure 4.2 shows how the Heston model can fit the SPX implied volatility, but has some difficulty in fitting the VIX implied volatility. In particular, the CIR process of the Heston model leads to a downward slope in VIX option implied volatility, which is the stylized feature pointed out in [Dri12, Gat08, PS14] and shows the CIR process is not a good fit to the VIX data.

The Heston model has $\tilde\xi > 0$ and $\beta_L < 2$, yet part 1 of Proposition 3.3 tends to be uninformative because $\sqrt{(\xi\tau)^2 + \xi\tau} - \xi\tau$ is a small number for Heston calibrations having $\bar Y$ within the realm of what is (historically) observed in the data. In general, Proposition 3.3 is model free, which means there is an underlying tradeoff between sharpness and specification of the model. Sharper estimates based on Proposition 3.3 can be obtained for Heston and other specific volatility models.

4.5. Exp-OU Model. The Exp-OU model is
\[ d\log(S_t) = -\frac{1}{2}e^{2Y_t}\,dt + e^{Y_t}\,dW_t, \qquad dY_t = \kappa\left(\bar Y - Y_t\right)dt + \gamma\,dB_t, \]
where $\bar Y \in \mathbb{R}$, $\kappa, \gamma >$
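0, and $dW_t\,dB_t = \rho\,dt$ for $\rho \in (-1,1)$. The process $Y_t$ is Gaussian with $\mathbb{E}Y_t = Y_0e^{-\kappa t} + \bar Y(1 - e^{-\kappa t})$ and $\mathrm{Var}(Y_t) = \frac{\gamma^2}{2\kappa}\left(1 - e^{-2\kappa t}\right)$, so Condition 2.1 is satisfied,
\[ \mathbb{E}\int_0^T e^{2Y_t}\,dt = \int_0^T e^{2\mathbb{E}Y_t + 2\mathrm{Var}(Y_t)}\,dt < \infty. \]

[Figure 4.2. Left panel, titled "SPX Option Implied Volatility 09-Sep-2010, maturity 16-Oct-2010": horizontal axis $\log(K/\mathbb{E}_t S_T)$, vertical axis $\hat\sigma$, with series for the data $\hat\sigma$ and the Heston fit. Right panel, titled "VIX Option Implied Volatility 09-Sep-2010, maturity 20-Oct-2010": horizontal axis $\log(K/\mathbb{E}_t\mathrm{VIX}_T)$, vertical axis $\hat\nu$, with series for the data $\hat\nu$, the CIR fit, and a multiple of $\sqrt{\log(K/\mathbb{E}_t\mathrm{VIX}_T)/(T-t)}$.]

Fig. 4.2: Left: A fit of the Heston model to the implied volatility smile from SPX options on September 9th, 2010. Right: A fit of the Heston model (or simply a CIR process) to the implied volatility from VIX options. The downward slope in the right-hand skew illustrates the CIR process' inability to fit VIX options on September 9th, 2010.

Moreover, all the moments of $\mathrm{VIX}^2_T$ exist, as
\[ \mathbb{E}\,\mathrm{VIX}^{2n}_T = \mathbb{E}\left(\mathbb{E}_T\frac{1}{\tau}\int_T^{T+\tau}e^{2Y_t}\,dt\right)^n \le \frac{1}{\tau}\mathbb{E}\int_T^{T+\tau}e^{2nY_t}\,dt < \infty, \]
for all $n \ge 1$. However, the MGF of $\mathrm{VIX}^2_T$ does not exist for $\xi > 0$,
\[ \mathbb{E}\exp\left(\frac{\xi}{\tau}\mathbb{E}_T\int_T^{T+\tau}e^{2Y_t}\,dt\right) \ge \mathbb{E}\exp\left(\frac{\xi}{\tau}\int_T^{T+\tau}e^{2\mathbb{E}_T Y_t}\,dt\right) = \mathbb{E}\exp\left(\frac{\xi}{\tau}\int_T^{T+\tau}e^{2Y_Te^{-\kappa(t-T)} + 2\bar Y(1 - e^{-\kappa(t-T)})}\,dt\right) \]
\[ \ge \mathbb{E}\left[\exp\left(\frac{\xi}{\tau}\int_T^{T+\tau}e^{2Y_Te^{-\kappa(t-T)} + 2\bar Y(1 - e^{-\kappa(t-T)})}\,dt\right)\mathbf{1}_{\{Y_T > 0\}}\right] \ge \mathbb{E}\left[\exp\left(e^{2Y_Te^{-\kappa\tau}}\frac{\xi}{\tau}\int_T^{T+\tau}e^{2\bar Y(1 - e^{-\kappa(t-T)})}\,dt\right)\mathbf{1}_{\{Y_T > 0\}}\right] = \infty, \]
because the MGF of the log-normal random variable $e^{2Y_Te^{-\kappa\tau}}$ does not exist on the positive real line.

4.6. 3/2 Model. The 3/2 model is
\[ d\log(S_t) = -\frac{1}{2}Z_t\,dt + \sqrt{Z_t}\,dW_t, \qquad dZ_t = Z_t\left(\kappa - (\kappa\bar Y - \gamma^2)Z_t\right)dt - \gamma Z_t^{3/2}\,dB_t, \]
where $dW_t\,dB_t = \rho\,dt$ for $\rho \in (-1,1)$, $\kappa > 0$, $2\bar Y\kappa > \gamma^2$, and $\kappa\bar Y - \rho\gamma \ge \frac{\gamma^2}{2}$ so that the price process is a martingale (see [BCM17, Dri12]), and hence Condition 2.1 holds up to time $T + \tau$. This model can be equivalently written as $d\log(S_t) = -\frac{1}{2Y_t}dt + \frac{1}{\sqrt{Y_t}}dW_t$ where $Y$ is the square-root (CIR) process from Section 4.4 and $Z_t = 1/Y_t$. This is a popular choice for pricing VIX options because the volatility process has heavy tails (see [BB14, Dri12]). A fit of the 3/2 model to VIX option implied volatility is shown in Figure 4.3. The upper bound $\tilde q$ such that $\mathbb{E}S_T^{-q} = \infty$ if $q > \tilde q$ is calculated in Appendix A.2 to be $\tilde q = \frac{\sqrt{1 + \gamma^2\alpha^2} - 1}{2}$. The following proposition calculates $\tilde\xi$ that is the maximum value for which there is existence of the MGF for $\mathrm{VIX}^2_T$:

Proposition 4.1.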
Let $Y_t$ denote the CIR process from Section 4.4 having parameters $\kappa$, $\bar Y$, $\gamma$ with $\alpha = \frac{2\bar Y\kappa}{\gamma^2} - 1$ (note that $\alpha$ is positive because it was assumed above that $2\bar Y\kappa > \gamma^2$). For $\xi > 0$,
\[ \mathbb{E}e^{\xi\mathrm{VIX}^2_T} < \infty \]
if and only if $\xi < \tilde\xi = \frac{\gamma^2\tau\alpha(\alpha+1)}{2}$ (recall the notation $\tilde\xi$ from Proposition 3.3).

Proof. See Appendix A.1.

In general, the SABR model of Section 4.2, the CEV volatility model of Section 4.3, the Exp-OU model of Section 4.5, and the 3/2 model are candidates for an improved fit to the VIX data because the volatility process has a heavier tail, whereas the Heston model's CIR process does not capture the right-hand skew in VIX implied volatility. However, the Exp-OU and the 3/2 model appear to be the best for fitting to VIX-option implied volatility, as the SABR model has a flat smile for VIX options, and the CEV volatility model has relatively little skew for VIX options (see right-hand plot in Figure 4.1). Figure 4.3 shows evidence that the 3/2 model can capture some of the right-hand skew in the VIX data. The moment relationships from the examples of Sections 4.1 to 4.5 are summarized in Table 4.1.

4.7. Independent Jumps. This final example will show how the theory presented in earlier sections can be generalized to include an independent jump process. Specifically, an independent jump term is added to the log-returns model presented in equation (2.1). Addition of these jumps will cause some slight changes to statements of Sections 2 and 3, but the main results of the paper remain intact. The main differences are that variance swaps will no longer be equal to the squared VIX (i.e. the relationship in equation (2.3) will change), and Proposition 3.1 will need to be modified.

Consider two more stochastic processes: a Poisson arrival process $N_t$ with intensity $\lambda \in (0,\infty)$, and an i.i.d. jump process $Y_i \in \mathbb{R}$ with $\mathbb{E}e^{Y_i} < \infty$ for $i = 1, 2, 3, \dots$, where $N$ and $Y$ are independent of each other and jointly independent of $(W, \sigma)$. Equation (2.1) is modified to include jumps as follows:
\[ \log(S_t/S_0) = \int_0^t\left(r - \mu - \frac{1}{2}\sigma_u^2\right)du + \int_0^t\sigma_u\,dW_u + \sum_{i=1}^{N_t}Y_i, \]
where $\mu$ is a compensator so that $S_te^{-rt}$ is a local martingale. The quadratic variation of $\log(S_t)$ for the continuous-time model was $\int_0^t\sigma_u^2\,du$, but for this model is $\int_0^t\sigma_u^2\,du + \sum_{i=1}^{N_t}Y_i^2$. This model is part of the general class of Lévy jump diffusions where variance swaps and the log contract differ by a jump premium. For the example presented above it was shown in [CW09] that the premium is an additive term,
\[ \text{variance-swap rate} = \frac{1}{\tau}\mathbb{E}_t\int_t^{t+\tau}\sigma_u^2\,du + \lambda\mathbb{E}Y_i^2 = \mathrm{VIX}^2_t - 2\lambda\mathbb{E}\left[e^{Y_i} - 1 - Y_i - \frac{1}{2}Y_i^2\right]. \]

[Figure 4.3, titled "VIX Option Implied Volatility 09-Sep-2010, maturity 20-Oct-2010": horizontal axis $\log(K/\mathbb{E}_t\mathrm{VIX}_T)$, vertical axis $\hat\nu$, with series for the data $\hat\nu$, the fitted $\hat\nu$, and a multiple of $\sqrt{\log(K/\mathbb{E}_t\mathrm{VIX}_T)/(T-t)}$.]

Fig. 4.3: The 3/2 model fit to implied volatility from VIX options. The heavy-tailed 3/2 process for volatility captures the increasing right-hand skew. The right-hand skew is the stylized feature that is seen through the VIX data, and is important to capture when selecting a stochastic volatility model for pricing VIX options. See [Dri12, Gat08, BGK13, PS14] for more insight on models that fit the right-hand skew. The fit uses the approximated VIX formula: $\mathrm{VIX}^2_T \approx \frac{1}{\tau}\mathbb{E}_T\int_0^\tau\left[\frac{1}{Y_T} - \frac{Y_{T+u} - Y_T}{Y_T^2} + \frac{(Y_{T+u} - Y_T)^2}{Y_T^3}\right]du$.

Alternatively, it is shown in [CLW11] that if log-returns can be written as a continuous time change of a Lévy process $(X_t)_{0 \le t \le 1}$, then the variance-swap rate is a multiplier of the log contract,
\[ \text{variance-swap rate} = -\frac{Q_x}{\tau}\mathbb{E}_t\log(S_{t+\tau}/S_t) = \frac{Q_x}{2}\mathrm{VIX}^2_t, \]
where $Q_x = \frac{\mathrm{Var}(X_1)}{\log\mathbb{E}e^{X_1} - \mathbb{E}X_1}$ is the multiplier. To generalize, Proposition 3.1 is not too difficult to modify provided that $\log(S_t)$'s quadratic variation has been restated for jumps; independent jumps don't affect Lemma 3.2, the moment formulas, Proposition 3.3, Proposition 3.4, or Proposition 3.7.
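As a worked illustration of the additive jump premium, the sketch below evaluates both sides of the [CW09] identity under the assumption (not made in the text) that jump sizes are Gaussian, $Y_i \sim N(m, s^2)$, so that $\mathbb{E}e^{Y_i} = e^{m + s^2/2}$, $\mathbb{E}Y_i = m$ and $\mathbb{E}Y_i^2 = m^2 + s^2$. The expression used for $\mathrm{VIX}^2_t$, namely the continuous part plus $2\lambda\mathbb{E}[e^{Y_i} - 1 - Y_i]$, is the one implied algebraically by the identity above; the function names and parameter values are hypothetical.

```python
import math


def gaussian_jump_moments(mean, std):
    """Return (E[e^Y], E[Y], E[Y^2]) for an assumed jump size Y ~ N(mean, std^2)."""
    return math.exp(mean + 0.5 * std ** 2), mean, mean ** 2 + std ** 2


def vix_squared(avg_variance, lam, mean, std):
    """Log-contract rate: continuous part plus 2*lam*E[e^Y - 1 - Y]."""
    e_exp, e_y, _ = gaussian_jump_moments(mean, std)
    return avg_variance + 2.0 * lam * (e_exp - 1.0 - e_y)


def jump_premium(lam, mean, std):
    """Additive term 2*lam*E[e^Y - 1 - Y - Y^2/2] that is subtracted from VIX^2."""
    e_exp, e_y, e_y2 = gaussian_jump_moments(mean, std)
    return 2.0 * lam * (e_exp - 1.0 - e_y - 0.5 * e_y2)


def variance_swap_rate(avg_variance, lam, mean, std):
    """Quadratic-variation rate: continuous part plus lam*E[Y^2]."""
    _, _, e_y2 = gaussian_jump_moments(mean, std)
    return avg_variance + lam * e_y2


if __name__ == "__main__":
    # avg_variance stands for (1/tau) * E_t int sigma_u^2 du; all values are illustrative.
    args = (0.04, 1.0, -0.1, 0.1)
    print(variance_swap_rate(*args), vix_squared(*args) - jump_premium(*args[1:]))
```

With downward jumps on average ($m < 0$) the premium is negative in this example, so the variance-swap rate sits above the squared VIX.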

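For the Heston example of Section 4.4, the two critical quantities can be evaluated directly. This is a minimal sketch under stated assumptions: `heston_vix_mgf_threshold` implements $\tilde\xi = c/b$ from equation (4.2), and `heston_explosion_horizon` implements the explosion horizon $T - t^*$ in the form $\frac{2}{\sqrt{D}}\left(\pi\mathbf{1}_{\{q\gamma\rho+\kappa>0\}} + \tan^{-1}(-\sqrt{D}/(q\gamma\rho+\kappa))\right)$ with $D = \gamma^2q(1+q) - (q\gamma\rho+\kappa)^2$, which is the blow-up time of the Riccati equation $\psi' = \frac{1}{2}\gamma^2\psi^2 - (q\gamma\rho+\kappa)\psi + \frac{1}{2}q(1+q)$. The function names, the `math.inf` convention when $D \le 0$, and the parameter values are illustrative choices, not calibrated to the data in the paper.

```python
import math


def heston_vix_mgf_threshold(kappa, gamma, tau, horizon):
    """Critical exponent xi_tilde = c / b from (4.2), with horizon = T - t."""
    c = 2.0 * kappa / (gamma ** 2 * (1.0 - math.exp(-kappa * horizon)))
    b = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    return c / b


def heston_explosion_horizon(q, kappa, gamma, rho):
    """Horizon T - t* beyond which E_t S_T^{-q} is infinite (math.inf if it never is)."""
    drift = q * gamma * rho + kappa
    disc = gamma ** 2 * q * (1.0 + q) - drift ** 2
    if disc <= 0.0:
        return math.inf  # the condition (q*gamma*rho + kappa)^2 < gamma^2 q (1+q) fails
    root = math.sqrt(disc)
    if drift == 0.0:
        angle = 0.5 * math.pi  # common limit of the formula as drift -> 0 from either side
    else:
        angle = (math.pi if drift > 0.0 else 0.0) + math.atan(-root / drift)
    return 2.0 * angle / root


if __name__ == "__main__":
    kappa, gamma, rho, tau = 2.0, 0.5, -0.7, 30.0 / 365.0
    for horizon in (0.1, 1.0, 10.0):
        print(horizon, heston_vix_mgf_threshold(kappa, gamma, tau, horizon))
    for q in (1.0, 10.0):
        print(q, heston_explosion_horizon(q, kappa, gamma, rho))
```

In this sketch the VIX threshold decreases toward the positive limit $2\kappa^2\tau/(\gamma^2(1 - e^{-\tau\kappa}))$ as the horizon grows, whereas the negative SPX moment either never explodes or explodes at a finite horizon, which is the contrast described in Section 4.4.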
5. Conclusions.

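This paper has explored some basic relationships between the markets for SPX and VIX options. The main idea is based on the notion that high-strike VIX call options can be used to hedge tail risk in the SPX. The moment formula was applied to relate the extreme-strike options on the SPX and VIX, and some formulas for comparison were introduced. The primary focus was the relationship between negative moments in SPX and the interval of the positive real line where the MGF of VIX-squared is finite. Negative moments and the MGF were computed for various stochastic volatility models, giving a sense of what can be accomplished with different models.

\begin{tabular}{llll}
Model & $\mathbb{E}\,\mathrm{VIX}^p_T = \infty$ & $\mathbb{E}\,e^{\xi\mathrm{VIX}^2_T} = \infty$ & $\mathbb{E}\,S^{-q}_{T+\tau} = \infty$ \\
\hline
CEV model, $dS = S^a\,dW$ with $0 \le a < 1$ & $\forall p > 0$ & $\forall \xi > 0$ & $\forall q > 0$ \\
SABR model with $\rho \le 0$ & $p = \infty$ & $\forall \xi > 0$ & $\forall q > 0$ \\
$dS = SY\,dW$ and $dY = cY^2\,dB$, with $dW\,dB = \rho\,dt$ and $\rho \le 0$ & $\forall p >$ & $\forall \xi \ge \frac{3\tau c^2}{2}$ & $\forall q > \frac{\sqrt{1 + c^2} - 1}{2}$ \\
Heston model & $p = \infty$ & $\forall \xi \ge \frac{2\kappa^2\tau}{\gamma^2(1 - e^{-\kappa T})(1 - e^{-\kappa\tau})}$ & $\forall q > 0$ s.t. $(q\gamma\rho + \kappa)^2 < \gamma^2q(1+q)$ and $T + \tau \ge T^*(q)$ \\
3/2 model & $p = \infty$ & $\forall \xi \ge \frac{\gamma^2\tau\alpha(\alpha+1)}{2}$ & $\forall q > \frac{\sqrt{1 + \gamma^2\alpha^2} - 1}{2}$ \\
Exp-OU model & $p = \infty$ & $\forall \xi > 0$ & $\forall q > 0$ \\
\end{tabular}

Table 4.1: Here $T^*(q) = \frac{2}{\sqrt{\gamma^2q(1+q) - (q\gamma\rho + \kappa)^2}}\left(\pi\mathbf{1}_{\{q\gamma\rho + \kappa > 0\}} + \tan^{-1}\left(-\frac{\sqrt{\gamma^2q(1+q) - (q\gamma\rho + \kappa)^2}}{q\gamma\rho + \kappa}\right)\right)$; when $(q\gamma\rho + \kappa)^2 < \gamma^2q(1+q)$, $T^*$ is the moment of explosion computed in [AP07, FGGS11]. For the 3/2 model, $\alpha = \frac{2\bar Y\kappa}{\gamma^2} - 1 > 0$.

Appendix A. 3/2 Model.

A.1. Proof of Proposition 4.1.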
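It is shown in [AG99] that
\[ \mathbb{E}[1/Y_t] = \frac{1}{\alpha}\zeta_te^{-Y_0\nu_t}\,{}_1F_1(\alpha, \alpha+1, Y_0\nu_t), \]
where $\alpha = \frac{2\bar Y\kappa}{\gamma^2} - 1 > 0$, $\zeta_t = \frac{2\kappa}{\gamma^2(1 - e^{-\kappa t})}$, $\nu_t = \zeta_te^{-\kappa t}$, and where ${}_1F_1(\alpha, \alpha+1, \nu)$ is the confluent hypergeometric function,
\[ {}_1F_1(\alpha, \alpha+1, \nu) = \frac{\Gamma(1+\alpha)}{\Gamma(\alpha)}\int_0^1e^{\nu r}r^{\alpha-1}\,dr = \alpha\int_0^1e^{\nu r}r^{\alpha-1}\,dr. \]
The moment generating function is
\[ \mathbb{E}_te^{\xi\mathrm{VIX}^2_T} = \mathbb{E}_t\exp\left(\frac{\xi}{\tau}\int_T^{T+\tau}\mathbb{E}_T\frac{1}{Y_u}\,du\right) = \mathbb{E}_t\exp\left(\frac{\xi}{\alpha\tau}\int_T^{T+\tau}\zeta_{u-T}e^{-Y_T\nu_{u-T}}\,{}_1F_1(\alpha, \alpha+1, Y_T\nu_{u-T})\,du\right) \]
\[ = \mathbb{E}_t\exp\left(\frac{\xi}{\tau}\int_0^1r^{\alpha-1}\int_T^{T+\tau}\zeta_{u-T}e^{-(1-r)Y_T\nu_{u-T}}\,du\,dr\right) = \mathbb{E}_t\exp\left(-\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\,\mathrm{Ei}\left(-\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa(u-T)} - 1)}\right)\Bigg|_{u=T}^{T+\tau}dr\right), \]
where $\mathrm{Ei}(x) = -\int_{-x}^\infty\frac{1}{r}e^{-r}\,dr$, i.e. the exponential integral. Hence, letting $C$ be a constant that contains terms not involving $Y_T$,
\[ \mathbb{E}_te^{\xi\mathrm{VIX}^2_T} = \mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\int_{\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}}^\infty\frac{1}{u}e^{-u}\,du\,dr\right) \le C\,\mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\left(\int_{1\wedge\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}}^1\frac{1}{u}\,du\right)dr\right) \]
\[ = C\,\mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\left(-\log\left(1\wedge\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}\right)\right)dr\right) \tag{A.1} \]
\[ = C\,\mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_{0\vee\left(1-\frac{\gamma^2(e^{\kappa\tau}-1)}{2\kappa Y_T}\right)}^1r^{\alpha-1}\left(-\log\left(\frac{2\kappa(1-r)}{\gamma^2(e^{\kappa\tau}-1)}\right) - \log(Y_T)\right)dr\right) \]
\[ = C\,\mathbb{E}_t\left[(Y_T)^{-\frac{2\xi}{\gamma^2\tau\alpha}\left(1 - \left(0\vee\left(1-\frac{\gamma^2(e^{\kappa\tau}-1)}{2\kappa Y_T}\right)\right)^\alpha\right)}\times\exp\left(\frac{2\xi}{\gamma^2\tau}\int_{0\vee\left(1-\frac{\gamma^2(e^{\kappa\tau}-1)}{2\kappa Y_T}\right)}^1r^{\alpha-1}\left(-\log\left(\frac{2\kappa(1-r)}{\gamma^2(e^{\kappa\tau}-1)}\right)\right)dr\right)\right]. \]
Since it is assumed that $\alpha > 0$,
\[ \exp\left(\frac{2\xi}{\gamma^2\tau}\int_{0\vee\left(1-\frac{\gamma^2(e^{\kappa\tau}-1)}{2\kappa Y_T}\right)}^1r^{\alpha-1}\left(-\log\left(\frac{2\kappa(1-r)}{\gamma^2(e^{\kappa\tau}-1)}\right)\right)dr\right) < \infty \quad \text{a.s.} \]
In addition, for $\xi > 0$,
\[ \mathbb{E}_t(Y_T)^{-\frac{2\xi}{\gamma^2\tau\alpha}\left(1 - \left(0\vee\left(1-\frac{\gamma^2(e^{\kappa\tau}-1)}{2\kappa Y_T}\right)\right)^\alpha\right)} < \infty \quad \text{iff} \quad \mathbb{E}_t(Y_T)^{-\frac{2\xi}{\gamma^2\tau\alpha}} < \infty. \]
Hence, the MGF of $\mathrm{VIX}^2_T$ is finite if $\mathbb{E}_t(Y_T)^{-\frac{2\xi}{\gamma^2\tau\alpha}} < \infty$. Moreover, it can be checked using the density of equation (4.1) that moments of $1/Y_t$ are infinite iff $\frac{2\xi}{\gamma^2\tau\alpha} \ge \alpha + 1$ (i.e. iff $\xi \ge \frac{\gamma^2\tau\alpha(\alpha+1)}{2}$). Hence, the MGF of $\mathrm{VIX}^2_T$ is finite if $\frac{2\xi}{\gamma^2\tau} < \alpha(\alpha+1)$, which is a sufficient condition and shows one direction of the 'iff' statement of the proposition.

In fact, $\frac{2\xi}{\gamma^2\tau} < \alpha(\alpha+1)$ is a necessary condition for finiteness of the MGF: there is a constant $C$ independent of $Y_T$ such that
\[ \mathbb{E}_te^{\xi\mathrm{VIX}^2_T} = \mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\int_{\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}}^\infty\frac{1}{u}e^{-u}\,du\,dr\right) = C\,\mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\left(\int_{1\wedge\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}}^1\frac{1}{u}e^{-u}\,du\right)dr\right) \]
\[ \ge C\,\mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\left(\int_{1\wedge\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}}^1\frac{1-u}{u}\,du\right)dr\right) \qquad (e^{-u} \ge 1 - u,\ \forall u \ge 0) \]
\[ = C\,\mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\left(-\log\left(1\wedge\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}\right) - \left(1 - 1\wedge\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}\right)\right)dr\right) \]
\[ \ge C\,\mathbb{E}_t\exp\left(\frac{2\xi}{\gamma^2\tau}\int_0^1r^{\alpha-1}\left(-\log\left(1\wedge\frac{2\kappa(1-r)Y_T}{\gamma^2(e^{\kappa\tau}-1)}\right) - 1\right)dr\right), \]
which is, up to the term $-1$, the expression in (A.1), and is therefore infinite if $\frac{2\xi}{\gamma^2\tau} \ge \alpha(\alpha+1)$.

A.2. Negative Moments.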
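It can also be shown that there are some negative moments of $S_T$. Following [CS07], if the MGF of $\int_0^TY_t\,dt$ exists then it is given by the formula
\[ \mathbb{E}\exp\left(\xi\int_0^TY_t\,dt\right) = \frac{\Gamma(b-a)}{\Gamma(b)}\left(\frac{2}{\gamma^2\phi}\right)^a{}_1F_1\left(a, b, -\frac{2}{\gamma^2\phi}\right), \tag{A.2} \]
where
\[ \phi = \frac{Z_0(e^{\kappa T} - 1)}{\kappa}, \qquad a = -\frac{\alpha}{2} + \sqrt{\frac{\alpha^2}{4} - \frac{2\xi}{\gamma^2}}, \qquad b = 2\left(\frac{1}{2} + a + \frac{\alpha}{2}\right), \qquad \alpha = \frac{2\bar Y\kappa}{\gamma^2} - 1. \]
This formula is real and positive if and only if $\xi \le \frac{\gamma^2\alpha^2}{8}$, and if the formula is complex then the MGF does not exist. Indeed, direct differentiation (verified with Mathematica) of equation (A.2) leads to the following asymptotic for the derivative of the MGF:
\[ \frac{\partial}{\partial\xi}\mathbb{E}\exp\left(\xi\int_0^TY_t\,dt\right) \sim \frac{C}{\sqrt{\gamma^2\alpha^2 - 8\xi}} \quad \text{as } \xi \nearrow \frac{\gamma^2\alpha^2}{8}, \]
where $C$ is a constant. Hence, the MGF ceases to exist for $\xi > \frac{\gamma^2\alpha^2}{8}$ and the singularity occurs in the first derivative.

Next, for any $q > 0$ apply Itô's lemma to $S_t^{-q}$,
\[ dS_t^{-q} = -qS_t^{-q}\sqrt{Y_t}\,dW_t + \frac{q(q+1)}{2}S_t^{-q}Y_t\,dt, \]
and define the stopping times
\[ \mathcal{T}_M = \inf\left\{t > 0 \,\Big|\, \int_0^tY_u\,du \ge M\right\}, \]
which allows for the stopped process's stochastic integral to be a martingale on the finite interval $[0,T]$, and hence,
\[ \mathbb{E}S^{-q}_{t\wedge\mathcal{T}_M} = S_0^{-q}\,\mathbb{E}\exp\left(\frac{q(q+1)}{2}\int_0^{t\wedge\mathcal{T}_M}Y_u\,du\right) \quad \forall t \le T. \]
Hence,
\[ \lim_{M\to\infty}\mathbb{E}S^{-q}_{T\wedge\mathcal{T}_M} = \lim_{M\to\infty}S_0^{-q}\,\mathbb{E}\exp\left(\frac{q(q+1)}{2}\int_0^{T\wedge\mathcal{T}_M}Y_u\,du\right) = S_0^{-q}\,\mathbb{E}\lim_{M\to\infty}\exp\left(\frac{q(q+1)}{2}\int_0^{T\wedge\mathcal{T}_M}Y_u\,du\right) \quad \text{(monotone convergence)} \]
\[ = S_0^{-q}\,\mathbb{E}\exp\left(\frac{q(q+1)}{2}\int_0^TY_u\,du\right), \]
where the last term is the MGF of realized variance and was shown to be infinite for some $q > 0$. By convexity of $x \mapsto x^{-q}$,
\[ \frac{1}{S_T^q} \ge \frac{1}{S^q_{T\wedge\mathcal{T}_M}} - q\,\frac{S_T - S_{T\wedge\mathcal{T}_M}}{S^{q+1}_{T\wedge\mathcal{T}_M}}, \]
which yields
\[ \mathbb{E}S_T^{-q} \ge \mathbb{E}\left[S^{-q}_{T\wedge\mathcal{T}_M} - q(S_T - S_{T\wedge\mathcal{T}_M})S^{-(q+1)}_{T\wedge\mathcal{T}_M}\right] = \mathbb{E}\Big[S^{-q}_{T\wedge\mathcal{T}_M} - q\underbrace{\mathbb{E}\left[(S_T - S_{T\wedge\mathcal{T}_M}) \mid Y_{t\wedge\mathcal{T}_M}\right]}_{=0}S^{-(q+1)}_{T\wedge\mathcal{T}_M}\Big] \]
\[ = \mathbb{E}S^{-q}_{T\wedge\mathcal{T}_M} \to S_0^{-q}\,\mathbb{E}\exp\left(\frac{q(q+1)}{2}\int_0^TY_u\,du\right) \quad \text{as } M \to \infty. \]
On the other hand, from Fatou's lemma,
\[ \mathbb{E}S_T^{-q} = \mathbb{E}\liminf_{M\to\infty}S^{-q}_{T\wedge\mathcal{T}_M} \le \liminf_{M\to\infty}\mathbb{E}S^{-q}_{T\wedge\mathcal{T}_M} = S_0^{-q}\,\mathbb{E}\exp\left(\frac{q(q+1)}{2}\int_0^TY_u\,du\right). \]
Hence, $\mathbb{E}S_T^{-q} = S_0^{-q}\,\mathbb{E}\exp\left(\frac{q(q+1)}{2}\int_0^TY_u\,du\right)$ and $\mathbb{E}S_T^{-q} < \infty$ if and only if $\frac{q(q+1)}{2} \le \frac{\gamma^2\alpha^2}{8}$, or in terms of the notation from Proposition 3.3,
\[ \tilde q = \frac{\sqrt{1 + \gamma^2\alpha^2} - 1}{2}. \]

Appendix B. The CEV Model.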
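Consider the CEV model with quadratic variance,
\[ dY_t = \sigma Y_t^2\,dW_t. \]
This process is a strict local martingale and is discussed in [CH05]. In particular, the process $X_t = 1/Y_t$ is among the class of SDEs considered in [Fel51], and has natural boundaries at zero and infinity (i.e. both $Y$ and $S$ have zero probability of touching zero). Furthermore, the transition density of $Y$ is
\[ \mathbb{P}(Y_T \in dz \mid Y_t = y) = \frac{y}{z^3}\frac{dz}{\sqrt{2\pi(T-t)\sigma^2}}\times\left[\exp\left(-\frac{\left(\frac{1}{z} - \frac{1}{y}\right)^2}{2(T-t)\sigma^2}\right) - \exp\left(-\frac{\left(\frac{1}{z} + \frac{1}{y}\right)^2}{2(T-t)\sigma^2}\right)\right]. \]
Table B.1 shows the expectations of some important functions of $Y_T$ for $\sigma = 1$.

\begin{tabular}{ll}
$g(Y_T)$ & $\mathbb{E}\{g(Y_T) \mid Y_0 = y\}$ \\
\hline
$Y_T$ & $y\left(1 - 2\Phi\left(-\frac{1}{y\sqrt{T}}\right)\right)$ \\
$Y_T^2$ & $\sqrt{\frac{2y^2}{T}}\,D_+\left(\frac{1}{y\sqrt{2T}}\right)$ \\
$\log(Y_T)$ & $\frac{1}{2}\left(\gamma_e - 2 + \log(2/T) + \frac{\partial}{\partial a}{}_1F_1\left(0, \frac{3}{2}, -\frac{1}{2Ty^2}\right)\right)$ \\
$(Y_T - K)^+$ & $y\left(\Phi(\kappa - \delta) - \Phi(-\delta) + \Phi(\delta) - \Phi(\delta + \kappa)\right) - K\left(\Phi(\kappa + \delta) - \Phi(\delta - \kappa) + \delta^{-1}\left(\Phi'(\kappa + \delta) - \Phi'(\kappa - \delta)\right)\right)$, with $\delta = \frac{1}{y\sqrt{T}}$, $\kappa = \frac{1}{K\sqrt{T}}$ \\
\end{tabular}

Table B.1: The moments for the CEV process $dY = Y^2\,dW$. The special functions are the normal Gaussian CDF $\Phi$, the Dawson integral $D_+(x) = e^{-x^2}\int_0^xe^{u^2}\,du$, the confluent hypergeometric function of the first kind ${}_1F_1(a, b, x) = \frac{\Gamma(b)}{\Gamma(b-a)\Gamma(a)}\int_0^1e^{ux}u^{a-1}(1-u)^{b-a-1}\,du$, and the Euler Gamma $\gamma_e \approx 0.5772$. For general $\sigma > 0$, the relationship to the case $\sigma = 1$ is the scaling of time given by $\mathbb{E}^\sigma[g(Y_T) \mid Y_0 = y] = \mathbb{E}^1[g(Y_{T\sigma^2}) \mid Y_0 = y]$.

B.1. CEV Volatility Process (Section 4.3). To determine whether or not $S_t$ is a true martingale it suffices to consider the case $S_0 = 1$ and the expectation
\[ \mathbb{E}S_T = \mathbb{E}\exp\left(-\frac{1}{2}\int_0^TY_u^2\,du + \sqrt{1-\rho^2}\int_0^TY_u\,dW_u + \rho\int_0^TY_u\,dB_u\right) = \mathbb{E}\exp\left(-\frac{\rho^2}{2}\int_0^TY_u^2\,du + \rho\int_0^TY_u\,dB_u\right) \]
\[ = \mathbb{E}\left[\left(\frac{Y_T}{Y_0}\right)^{\rho/c}\exp\left(\frac{\rho(c-\rho)}{2}\int_0^TY_u^2\,du\right)\right], \]
which is certainly a martingale if $\rho = 0$. For $\rho < 0$, define $Z_t = \left(\frac{Y_t}{Y_0}\right)^{\rho/c}\exp\left(\frac{\rho(c-\rho)}{2}\int_0^tY_u^2\,du\right)$ and apply Itô's lemma to get
\[ \mathbb{E}S_T = \mathbb{E}Z_T = 1 + \mathbb{E}\left[\rho\int_0^TY_uZ_u\,dB_u\right] = 1, \]
where the stochastic integral is a martingale because $\mathbb{E}\int_0^T(Y_uZ_u)^2\,du < \infty$ for $\rho < 0$ and $c > 0$: the exponent $\frac{\rho(c-\rho)}{2}$ is negative, so $Z_u \le (Y_u/Y_0)^{\rho/c}$, and the bound follows from $\mathbb{E}Y_t^p < \infty$ for $0 \le p < 3$ and $\mathbb{E}Y_t^{-k} < \infty$ for all $k > 0$. Hence, $S_t$ is a true martingale.

B.1.1. MGF of $\mathrm{VIX}^2_T$. Using the CEV process's 2nd moment from Table B.1, and taking constants $M > 0$ and $\xi > 0$, it is seen that the MGF of $\mathrm{VIX}^2_T$ can be broken into two terms, only one of which has the possibility of being infinite:
\[ \mathbb{E}e^{\xi\mathrm{VIX}^2_T} = \mathbb{E}\exp\left(\frac{\xi}{\tau}\mathbb{E}_T\int_T^{T+\tau}Y_u^2\,du\right) = \mathbb{E}\exp\left(\frac{\xi Y_T}{\tau}\int_T^{T+\tau}\sqrt{\frac{2}{c^2(u-T)}}\,D_+\left(\frac{1}{Y_T\sqrt{2c^2(u-T)}}\right)du\right) \]
\[ = \mathbb{E}\exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{Y_T\sqrt{2c^2\tau}}}^\infty\frac{1}{x^2}D_+(x)\,dx\right) \]
\[ = C_{M,\xi}\,\mathbb{E}\left[\exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{Y_T\sqrt{2c^2\tau}}}^{\frac{1}{M\sqrt{2c^2\tau}}}\frac{1}{x^2}D_+(x)\,dx\right)\mathbf{1}_{\{Y_T \ge M\}}\right] + \mathbb{E}\left[\exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{Y_T\sqrt{2c^2\tau}}}^\infty\frac{1}{x^2}D_+(x)\,dx\right)\mathbf{1}_{\{Y_T < M\}}\right], \]
where $C_{M,\xi} = \exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{M\sqrt{2c^2\tau}}}^\infty\frac{1}{x^2}D_+(x)\,dx\right) < \infty$ for all $M > 0$, and where the change of variable is
\[ x = \frac{1}{Y_T\sqrt{2c^2(u-T)}}, \qquad dx = -\frac{1}{2Y_T\sqrt{2c^2}\,(u-T)^{3/2}}\,du = -\frac{Y_T\sqrt{2c^2}\,x^2}{2\sqrt{u-T}}\,du. \]
Taking $M \ge 1/\sqrt{2c^2\tau}$ and examining the possibly infinite term,
\[ \mathbb{E}\left[\exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{Y_T\sqrt{2c^2\tau}}}^{\frac{1}{M\sqrt{2c^2\tau}}}\frac{1}{x^2}D_+(x)\,dx\right)\mathbf{1}_{\{Y_T \ge M\}}\right] \ge \mathbb{E}\left[\exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{Y_T\sqrt{2c^2\tau}}}^{\frac{1}{M\sqrt{2c^2\tau}}}\left(\frac{1}{x} - \frac{2x}{3}\right)dx\right)\mathbf{1}_{\{Y_T \ge M\}}\right] \]
\[ \ge e^{-\frac{\xi}{3M^2\tau^2c^4}}\,\mathbb{E}\left[\exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{Y_T\sqrt{2c^2\tau}}}^{\frac{1}{M\sqrt{2c^2\tau}}}\frac{1}{x}\,dx\right)\mathbf{1}_{\{Y_T \ge M\}}\right] = e^{-\frac{\xi}{3M^2\tau^2c^4}}\,\mathbb{E}\left[e^{\frac{2\xi}{\tau c^2}\log(Y_T/M)}\mathbf{1}_{\{Y_T \ge M\}}\right] \]
\[ = e^{-\frac{\xi}{3M^2\tau^2c^4}}\,\mathbb{E}\left[\left(\frac{Y_T}{M}\right)^{\frac{2\xi}{\tau c^2}}\mathbf{1}_{\{Y_T \ge M\}}\right], \tag{B.1} \]
where the inequality comes by using the Dawson-integral's Maclaurin series
\[ D_+(x) = \sum_{n=0}^\infty\frac{(-1)^n2^n}{(2n+1)!!}x^{2n+1} \ge x - \frac{2}{3}x^3 \quad \text{for } 0 \le x < 1. \]
Because the density of $Y_T$ decays like $z^{-4}$, the quantity in (B.1) is infinite if and only if $\frac{2\xi}{\tau c^2} \ge 3$, and so a sufficient condition for the MGF of $\mathrm{VIX}^2_T$ to be infinite is $\xi \ge \frac{3\tau c^2}{2}$. This condition is also necessary, as the steps of (B.1) can be modified with the upper bound $\frac{1}{x^2}D_+(x) < 1/x$ for $0 < x < 1$,
\[ \mathbb{E}\left[\exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{Y_T\sqrt{2c^2\tau}}}^{\frac{1}{M\sqrt{2c^2\tau}}}\frac{1}{x^2}D_+(x)\,dx\right)\mathbf{1}_{\{Y_T \ge M\}}\right] \le \mathbb{E}\left[\exp\left(\frac{2\xi}{\tau c^2}\int_{\frac{1}{Y_T\sqrt{2c^2\tau}}}^{\frac{1}{M\sqrt{2c^2\tau}}}\frac{1}{x}\,dx\right)\mathbf{1}_{\{Y_T \ge M\}}\right] = \mathbb{E}\left[\left(\frac{Y_T}{M}\right)^{\frac{2\xi}{\tau c^2}}\mathbf{1}_{\{Y_T \ge M\}}\right] < \infty \quad \text{if } \xi < \frac{3\tau c^2}{2}. \]
Hence $\tilde\xi$ from Proposition 3.3 is $\tilde\xi = \frac{3\tau c^2}{2}$.

B.1.2. Negative Moments.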
For negative moments (for $q > 0$) similar steps as those in the beginning of Appendix A.2 lead to
\[ \mathbb{E}S_T^{-q} = \mathbb{E}\exp\left(\frac{q^2 + q}{2}\int_0^TY_t^2\,dt\right). \]
Next consider the process $X_t = Y_t^2$ and apply Itô's lemma,
\[ dX_t = c^2X_t^2\,dt + 2cX_t^{3/2}\,dB_t, \]
which is a 3/2 process, and if the moment exists then it is given by the formula of [CS07],
\[ \mathbb{E}S_T^{-q} = \mathbb{E}\exp\left(\frac{q^2 + q}{2}\int_0^TX_t\,dt\right) = \frac{\Gamma(\gamma - \alpha)}{\Gamma(\gamma)}\left(\frac{1}{2c^2TX_0}\right)^\alpha{}_1F_1\left(\alpha, \gamma, -\frac{1}{2c^2TX_0}\right), \]
where ${}_1F_1$ is the confluent hypergeometric function and
\[ \alpha = -\frac{1}{4} + \sqrt{\frac{1}{16} - \frac{q^2 + q}{4c^2}}, \qquad \gamma = 2\left(\alpha + \frac{3}{4}\right). \]
This is an analytic formula that is real and positive if $q^2 + q \le \frac{c^2}{4}$, and so $\mathbb{E}S_T^{-q}$ is finite if $q \le \frac{-1 + \sqrt{1 + c^2}}{2}$. Sharpness of this finiteness inequality can be shown by following the same argument that was used to show the sharpness of the negative moments condition in Section 4.6 and Appendix A.2. Hence, $\tilde q$ from Proposition 3.3 is $\tilde q = \frac{-1 + \sqrt{1 + c^2}}{2}$.

Appendix C. Maximum Domain of Attraction.
Let $F(y) = \mathbb{P}_t(\mathrm{VIX}_T \le y)$. Consider samples $(Y_\ell)_{\ell=1}^n$ where $Y_\ell \sim$ iid $F$ for each $\ell$, and let $M_n = \max(Y_1, Y_2, \dots, Y_n)$. For $\alpha > 0$ define
\[ H_\alpha(y) \triangleq \begin{cases} \exp\left(-(1 + y/\alpha)^{-\alpha}\right) & \alpha < \infty, \\ \exp(-\exp(-y)) & \alpha = \infty. \end{cases} \]
Distribution function $F$ is said to be in the maximum domain of attraction (MDA) of $H_\alpha$ for $\alpha < \infty$ if and only if
\[ a_n^{-1}M_n \Rightarrow H_\alpha \quad \text{as } n \to \infty, \]
where $a_n = F^{-1}(1 - 1/n)$ (see [Deg06] or [Res87] pages 54-57). This is written as $F \in \mathrm{MDA}(H_\alpha)$. For $\alpha = \infty$, $F \in \mathrm{MDA}(H_\alpha)$ if and only if
\[ 1 - F(y) \sim \exp(-\Psi(y)) \quad \text{as } y \to \infty \]
for some function $\Psi \in C^2(\mathbb{R}^+)$ with (i) $\Psi(y) \to \infty$ as $y \to \infty$, (ii) $\Psi'(y) > 0$, and (iii) $(1/\Psi'(y))' \to 0$ as $y \to \infty$ (see [Deg06]).

REFERENCES

[AG99] D. Ahn and B. Gao. A Parametric Nonlinear Model of Term Structure Dynamics. Review of Financial Studies, 12(4):721–762, 1999.
[AP07] L. Andersen and V. Piterbarg. Moment explosions in stochastic volatility models. Finance and Stochastics, 11(1):29–50, 2007.
[AS99] Y. Aït-Sahalia. Transition densities for interest rate and other nonlinear diffusions. The Journal of Finance, 54(4):1361–1395, August 1999.
[BB14] A. Badran and J. Baldeaux. Consistent modeling of VIX and equity derivatives using a 3/2 plus jumps model. Applied Mathematical Finance, 21(4):299–312, 2014.
[BCM17] C. Bernard, Z. Cui, and D. McLeish. On the martingale property in stochastic volatility models based on time-homogeneous diffusions. Mathematical Finance, 27(1):194–223, 2017.
[BF08] S. Benaim and P. Friz. Smile asymptotics II: Models with known moment generating functions. Journal of Applied Probability, 45(1):16–32, 2008.
[BGK13] C. Bayer, J. Gatheral, and M. Karlsmark. Fast Ninomiya-Victoir calibration of the double-mean-reverting model. Quantitative Finance, 13(11):1813–1829, 2013.
[Bic82] A. Bick. Comments on the valuation of derivative assets. Journal of Financial Economics, 10(3):331–345, 1982.
[BL78] D. Breeden and R. Litzenberger. Prices of state-contingent claims implicit in options prices. Journal of Business, 51:621–651, October 1978.
[CH05] A. Cox and D. Hobson. Local martingales, bubbles and option prices. Finance and Stochastics, 9:477–492, 2005.
[CK13] R. Cont and T. Kokholm. A consistent pricing model for index options and volatility derivatives. Mathematical Finance, 23(2):248–274, 2013.
[CL08] P. Carr and R. Lee. Robust replication of volatility derivatives. In MFA Annual Meeting, 2008.
[CL09] P. Carr and R. Lee. Volatility derivatives. Annual Review of Financial Economics, 1:319–339, 2009.
[CL10] P. Carr and R. Lee. Hedging variance options on continuous semimartingales. Finance and Stochastics, 14(2):179–207, April 2010.
[CLW11] P. Carr, R. Lee, and L. Wu. Variance swaps on time-changed Lévy processes. Finance and Stochastics, 2011.
[CM01] P. Carr and D. Madan. Optimal positioning in derivative securities. Quantitative Finance, 1(1):19–37, 2001.
[CS07] P. Carr and J. Sun. A new approach for option pricing under stochastic volatility. Review of Derivatives Research, 10:87–150, May 2007.
[CW09] P. Carr and L. Wu. Variance risk premiums. The Review of Financial Studies, 22:1311–1341, March 2009.
[DDKZ99] K. Demeterfi, E. Derman, M. Kamal, and J. Zou. More than you ever wanted to know about volatility swaps. The Journal of Derivatives, 6(4):9–32, 1999.
[Deg06] M. Degan. On Multivariate Generalised Pareto Distributions and High Risk Scenarios. PhD thesis, ETH Zurich, 2006.
[Dri12] G. Drimus. Options on realized variance by transform methods: a non-affine stochastic volatility model.

Quantitative Finance , 12(11):1679–1694, 2012.[Fel51] W. Feller. Two singular diﬀusion problems.

Annals of Mathematics , 54(1), July 1951.[FG05] P. Friz and J. Gatheral. Valuation of volatility derivatives as an inverse problem.

Quantitative Finance ,5(6):531–542, December 2005.[FGGS11] P. Friz, S. Gerhold, A. Gulisashvili, and S. Sturm. On reﬁned volatility smile expansion in the Hestonmodel.

Quantitative Finance , 11(8):1151–1164, 2011.[Gat06] J. Gatheral.

The Volatility Surface: A Practitioner’s Guide . Wiley, 2006.[Gat08] J. Gatheral. Consistent modeling of SPX and VIX options. Presented at The FifthWorld Congress of the Bachelier Finance Society in London, 2008. Available at: .[GJ14] J. Gatheral and A. Jaquier. Arbitrage-free SVI volatility surfaces.

Quantitative Finance , 14(1):59–71,2014.[Jou04] B. Jourdain. Loss of martingality in asset price models with lognormal stochastic volatility. Tech-nical report, 2004. Preprint CERMICS 2004267. http://cermics.enpc.fr/reports/CERMICS-2004/CERMICS-2004-267. pdf.[KL12] F. Klebaner and R. Lipster. When a stochastic exponential is a true martingale. Extension of the Beneˇsmethod.

Theory Probab. Appl. , 58(1):38–62, 2012.[Lee04] R. Lee. The moment formula for implied volatility at extreme strikes.

Mathematical Finance , 14(3):469–480, July 2004.[LM07] P.-L. Lions and M. Musiela. Correlations and bounds for stochastic volatility models.

Annales del’Institut Henri Poincare (C) Non Linear Analysis , 24(1):1–16, 2007.[Mal13] A. Malz. Risk-neutral systemic risk indicators. FRB of New York Staﬀ Report 607, New York FederalReserve, 2013. Available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2241567 .[MHL15] S. De Marco and P. Henry-Labord´ere. Linking vanillas and VIX options: A constrained martingaleoptimal transport problem.

SIAM Journal on Financial Mathematics , 6(1):1171–1194, 2015.[MM12] A. Mijatovic and M.Urusov. On the martingale property of certain local martingales.

Probability Theoryand Related Fields , 152:1–30, 2012.[MY11] D. Madan and M. Yor. The S&P 500 index as a Sato process travelling at the speed of the VIX.

AppliedMathematical Finance , 18(3):227–244, 2011.[Pap16] A. Papanicolaou. Analysis of VIX markets with a time-spread portfolio.

Applied Mathematical Finance ,23(5):374–408, 2016.[Pro13] Philip Protter. A mathematical theory of ﬁnancial bubbles. In

Paris-Princeton Lectures on MathematicalFinance 2013 , volume 2081 of

Lecture Notes in Mathematics , pages 1–108. Springer InternationalPublishing, 2013.[PS14] A. Papanicolaou and R. Sircar. A regime-switching Heston model for VIX and S&P 500 implied volatil-ities.

Quantitative Finance , 14(10):1811–1827, 2014.[Res87] S. Resnick.

Extreme Values, Regular Variation and Point Processes . Springer, New York NY, 1987.[Sep08] A. Sepp. Pricing options on realized variance in the Heston model with jumps in returns and volatility.

Journal of Computational Finance , 11(4):33–70, 2008.[SX16] Z. Song and D. Xiu. A tale of two option markets: Pricing kernels and volatility risk.