Model independent hedging strategies for variance swaps

Abstract

A variance swap is a derivative with a path-dependent payoff which allows investors to take positions on the future variability of an asset. In the idealised setting of a continuously monitored variance swap written on an asset with continuous paths it is well known that the variance swap payoff can be replicated exactly using a portfolio of puts and calls and a dynamic position in the asset. This fact forms the basis of the VIX contract. But what if we are in the more realistic setting where the contract is based on discrete monitoring, and the underlying asset may have jumps? We show that it is possible to derive model-independent, no-arbitrage bounds on the price of the variance swap, and corresponding sub- and super-replicating strategies. Further, we characterise the optimal bounds. The form of the hedges depends crucially on the kernel used to define the variance swap.



David Hobson and Martin Klimmek
Department of Statistics, University of Warwick
October 8, 2018


arXiv [q-fin.PR]

The purpose of this article is to construct hedging strategies which super-replicate the payoff of a variance swap for any price path of the underlying asset, including price paths with jumps. The idea is that at initiation time 0, an agent purchases a portfolio of puts and calls which she holds until time $T$. In addition, she follows a simple, dynamic investment strategy in the underlying over $[0, T]$. Then, for every possible path of the underlying, the sum of the payoff from the vanilla portfolio plus the gains from trade from the dynamic strategy is (more than) sufficient to cover the obligation from the variance swap. Implicit in this set-up is the idea that the super-hedge does not rely on any modelling assumptions. Instead, the super-hedge is robust even in the presence of jumps.

The problem of finding the cheapest super-hedging strategy can be seen as the dual of a primal problem, which is to bound the prices for variance swaps over the class of all models for the asset price process which are consistent with the traded prices of puts and calls. If the variance swap is sold for the price upper bound and hedged with the corresponding super-replicating strategy then the seller will not lose money under any scenario.

The model-independent approach should be contrasted with the standard methodology, which begins with a stochastic model for asset prices and then infers the price of the variation swap by calculating the expected payoff. However, in markets where vanilla instruments are liquidly traded, the prices of puts and calls contain information about the market's expectations of the future behaviour of asset prices. The existence of this information removes the need to model the future, and this fact forms the basis of the model-independent approach.

In addition to super-hedges and upper bounds on the price of the variance swap we also give sub-hedges and lower bounds. Moreover, our analysis is not restricted to any particular definition of the variance swap, nor is it based on a mathematical idealisation of a continuous time limit of the swap contract, but rather on a discrete set of observations. We define variance swaps through their kernels: bivariate functions with regularity properties making them suitable to measure variance properties of the price path. Examples of kernels include squared simple returns, squared log returns and squared price differences. Furthermore, the sub- and super-replicating hedges work for discretely sampled variance swaps and continue to work in the continuous time limit. As long as the price path has a quadratic variation, these limits exist by Föllmer's path-wise Itô formula [17]. The standard approach to variance swap pricing is to assume a stochastic model and that the underlying paths are generated from a semi-martingale process with respect to this model. In this article, a model is specified only when it is necessary to show that the cheapest super-replicating hedge is tight.

Under some minimal restrictions on the form of the variance swap kernel we find a family of super-hedging strategies. This family is parameterised by a set of monotone functions. Then, given that the prices of call options for the expiry date of the variance swap are known (or equivalently the marginal law of the underlying price process at maturity is known) we show that there exists a cheapest super-replicating hedge from the given family. This hedge is associated with a monotone function, and we use this function to describe a stochastic model for the forward price of the asset in which the price process is continuous, except perhaps for a single jump, after which the process remains constant. In the continuous time limit, the super-hedge replicates the payoff of the variance swap if the asset price follows this one-jump model.
This shows that the bounds we produce are best possible and justifies the restriction of our search to hedging strategies within the given family.

This article shares the model-independent ethos for the pricing of variance swaps implicit in Neuberger [25] and Dupire [16] in the setting of continuous price processes. In those articles, it was shown that if we assume that the asset price process is a continuous forward price, then the continuously monitored variance swap based on either squared log returns or squared simple returns is perfectly replicated by the following strategy: synthesise […]

Variance Swap Kernels and Model-Independent Hedging

We begin by defining the payoff of a variance swap on a path-wise basis. The payoff will depend on a kernel, on the times at which the kernel is evaluated and on the asset price at these times.

Definition 2.1. A variation swap kernel is a continuously differentiable bivariate function $H : (0,\infty) \times (0,\infty) \to [0,\infty)$ such that for all $x \in (0,\infty)$, $H(x,x) = 0 = H_y(x,x)$. We say that the swap kernel is regular if it is twice continuously differentiable. A variance swap kernel is a regular variation swap kernel $H$ such that $H_{yy}(x,x) = 2x^{-2}$.

Our main focus in this article is on variance swap kernels but we will discuss the variation swap kernels $H^S(x,y) = (y-x)^3$ and $H^Q(x,y) = (y-x)^2$ briefly, see Remark 3.1 and Example 6.10. (Strictly speaking $H^S$ is not a variation swap kernel since it is not non-negative, but most of our analysis still applies in this case.) A regular variation swap kernel is a variance swap kernel if $H(x, x(1+\delta)) = \delta^2 + o(\delta^2)$ for $\delta$ small. Examples of variance swap kernels include
\[ H^R(x,y) = \left(\frac{y-x}{x}\right)^2, \qquad H^L(x,y) = (\log(y) - \log(x))^2 \qquad \text{and} \qquad H^B(x,y) = -2\left(\log(y/x) - \left(\frac{y-x}{x}\right)\right). \]

Definition 2.2. A partition $P$ on $[0,T]$ is a set of times $0 = t_0 < t_1 < \ldots < t_N = T$. A partition is uniform if $t_k = kT/N$, $k = 0, 1, \ldots, N$. A sequence of partitions $P = (P^{(n)})_{n \ge 1} = (\{t^{(n)}_k ; 0 \le k \le N^{(n)}\})_{n \ge 1}$ is dense if $\lim_{n \uparrow \infty} \sup_{k \in \{0, \ldots, N^{(n)}-1\}} |t^{(n)}_{k+1} - t^{(n)}_k| = 0$.

Definition 2.3. A price realisation $f = (f(t))_{0 \le t \le T}$ is a càdlàg function $f : [0,T] \to (0,\infty)$.

Definition 2.4.

The payoff of a variation swap with kernel $H$ for a partition $P$ and a price realisation $f$ is
\[ V_H(f, P) = \sum_{k=0}^{N-1} H(f(t_k), f(t_{k+1})). \tag{2.1} \]

Remark.
(i) The price realisations $f$ should be interpreted as realisations of the forward price of the asset with maturity $T$. Later we will extend the analysis to cover undiscounted price processes, rather than forward prices.
(ii) Large parts of the subsequent analysis can be extended to allow for price processes which can take the value zero, provided we also define $H(0,0) = 0$, or equivalently truncate the sum in (2.1) at the first time in the partition that $f$ hits 0. In this case we must have that zero is absorbing, so that if $f(s) = 0$, then $f(t) = 0$ for all $s \le t \le T$.
(iii) In practice the variance swap contract is an exchange of the quantity $V = V_H(f, P)$ for a fixed amount $K$. However, since there is no optionality to the contract, and since the contract paying $K$ can trivially be priced and hedged, we concentrate solely on the floating leg.
(iv) In many of the earliest academic papers, and in particular in Demeterfi et al. [13, 14], but also in some very recent papers, e.g. Zhu and Lian [30], the variance swap is defined in terms of the kernel $H^R$. However, it has become market practice to trade variance swaps based on the kernel $H^L$. Nonetheless these contracts are traded over-the-counter and in principle it is possible to agree any reasonable definition for the kernel. Variance swaps defined using the variance kernel $H^B$ were introduced by Bondarenko [3], see also Neuberger [26]. As we shall see, the contract based on this kernel has various desirable features. For continuous paths, in the limit of a dense partition the contract does not depend on the chosen kernel, see Example 6.10 and Lemma 6.9, but this is not the case in general.
(v) The labels $\{S, Q, R, L, B\}$ on the variation swap kernels denote {Skew, Quadratic, Returns, Logarithmic returns, Bondarenko} respectively.

Let $P = (P^{(n)})_{n \ge 1}$ be a dense sequence of partitions. If $\lim_{n \uparrow \infty} V_H(f, P^{(n)})$ exists then the limit is denoted $V_H(f, P_\infty)$ and is called the continuous time limit of $V_H(f, P^{(n)})$ on $P$.

An important concept will be the quadratic variation of a path. For a dense sequence of partitions $P$, the quadratic variation $[f]$ of $f$ on $P$ is defined to be
\[ [f]_t = \lim_{n \uparrow \infty} \sum_{t^{(n)}_k \le t} \left( f(t^{(n)}_{k+1}) - f(t^{(n)}_k) \right)^2, \]
provided the limit exists.
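As a small numerical illustration of the payoff (2.1), the sketch below evaluates $V_H(f, P)$ for the kernels $H^R$, $H^L$, $H^B$ and $H^Q$ on a short list of observed prices. The function names and the sample observations are hypothetical and chosen only for illustration; the sum for $H^Q$ is the sum of squared increments that appears inside the quadratic variation limit.

```python
import math


def h_r(x, y):
    """H^R: squared simple return."""
    return ((y - x) / x) ** 2


def h_l(x, y):
    """H^L: squared log return."""
    return (math.log(y) - math.log(x)) ** 2


def h_b(x, y):
    """H^B: the Bondarenko kernel."""
    return -2.0 * (math.log(y / x) - (y - x) / x)


def h_q(x, y):
    """H^Q: squared price difference (a variation swap kernel)."""
    return (y - x) ** 2


def swap_payoff(kernel, observations):
    """V_H(f, P): sum of H(f(t_k), f(t_{k+1})) over consecutive observations."""
    return sum(kernel(x, y) for x, y in zip(observations, observations[1:]))


# Hypothetical forward prices on a partition; the move 99.5 -> 80 is a large drop.
observations = [100.0, 101.0, 99.5, 80.0, 82.0]
for name, kernel in [("R", h_r), ("L", h_l), ("B", h_b), ("Q", h_q)]:
    print(name, swap_payoff(kernel, observations))
```

For small moves the three variance swap kernels nearly coincide, since each satisfies $H(x, x(1+\delta)) = \delta^2 + o(\delta^2)$. For a single large move they differ: in this sketch a downward move gives $H^L > H^B > H^R$ and an upward move reverses the ordering.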
We split the quadratic variation into its continuous and discontinuous parts,
\[ [f]_t = [f]^c_t + \sum_{u \le t} (\Delta f(u))^2. \]
Later we will relate this definition to that introduced by Föllmer [17], which is used to develop a path-wise version of Itô calculus.

Our goal is to discuss how to price the variance swap contract, or more generally any path-dependent claim, under an assumption that European call and put (vanilla) options with maturity $T$ are traded and can be used for hedging, but without any assumption that a proposed model is a true reflection of the real dynamics. In this sense the strategies and prices we derive are model independent and robust.

Let call prices be given by $C(K)$, expressed in units of cash at time $T$. We assume that a continuum of calls is traded, and to preclude arbitrage we assume that $C$ is a decreasing convex function such that $C(0) = f(0)$, $C(K) \ge (f(0) - K)^+$ and $\lim_{K \uparrow \infty} C(K) = 0$, see e.g. Davis and Hobson [12]. We exclude the case where $C(f(0)) = 0$, for then $C(K) = (f(0) - K)^+$ and the situation is degenerate: the forward price must remain constant and upper and lower bounds on the price of the variance swap are zero. Although we assume that calls are traded today (time 0), we do not make any assumption on how call prices will behave over time, except that they will respect no-arbitrage conditions and that on expiry they will be worth the intrinsic value.

Definition 2.6. A synthesisable payoff is a function $\psi : (0,\infty) \to \mathbb{R}$ which can be represented as the difference of two convex functions (so that $\psi''(x)$ exists as a measure).

Let $\Psi$ be the set of synthesisable payoffs $\psi : (0,\infty) \to \mathbb{R}$. Then for $\psi \in \Psi$ we have
\[ \psi(f) = \psi(f(0)) + \psi'_+(f(0))(f - f(0)) + \int_{(0, f(0)]} (x - f)^+ \psi''(x)\,dx + \int_{(f(0), \infty)} (f - x)^+ \psi''(x)\,dx, \tag{2.2} \]
where $\psi'_+$ denotes the right-derivative.
Thus we can represent the payoff of any sufficiently regular European contingent claim as a constant plus the gains from trade from holding a fixed quantity of forwards, plus the payoff of a static portfolio of vanilla calls and puts.

Let $D[0,t]$ denote the space of càdlàg functions on $[0,t]$.

Definition 2.7. A dynamic strategy for a fixed partition $P$ is a collection of functions $\Delta = (\delta_{t_0}, \ldots, \delta_{t_{N-1}})$, where $\delta_{t_j} : D[0,t_j] \to \mathbb{R}$. The payoff of a dynamic strategy along a price realisation $f$ is
\[ \sum_{k=0}^{N-1} \delta_{t_k}\big((f(t))_{0 \le t \le t_k}\big)\,(f(t_{k+1}) - f(t_k)). \tag{2.3} \]
Let $\bar\Delta(P)$ be the set of dynamic strategies.

Definition 2.8. $\Delta \in \bar\Delta(P)$ is a Markov dynamic strategy if $\delta_{t_j}((f(t))_{0 \le t \le t_j}) = \delta_{t_j}(f(t_j))$ for all $j$. A Markov dynamic strategy is a time homogeneous Markov dynamic strategy (THMD-strategy) if $\delta_{t_j}(f(t_j)) = \delta(f(t_j))$ for all $j$.

In the sequel we will concentrate mainly on THMD-strategies. The quantity $\delta_{t_j}$ represents the quantity of forwards to be held over the interval $(t_j, t_{j+1}]$. In principle this quantity may depend on the current time and on the price history $(f(t))_{0 \le t \le t_j}$. However, as we shall see, for our purposes it is sufficient to work with a much simpler set of strategies where the quantity does not explicitly depend on time, nor on the price history except through the current value. We call this the Markov property, but note there are no probabilities involved here yet.

Definition 2.9. A semi-static hedging strategy $(\psi, \Delta)$ is a function $\psi \in \Psi$ and a dynamic strategy $\Delta \in \bar\Delta(P)$. The terminal payoff of a semi-static hedging strategy for a price realisation $f$ is
\[ \psi(f(T)) + \sum_{k=0}^{N-1} \delta_{t_k}\big((f(t))_{0 \le t \le t_k}\big)\,(f(t_{k+1}) - f(t_k)). \tag{2.4} \]
Without loss of generality we may assume that $\psi'(f(0)) = 0$. If not then we simply adjust each $\delta_{t_k}$ by the quantity $\psi'(f(0))$ and the payoff in (2.4) is unchanged.
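The normalisation $\psi'(f(0)) = 0$ can be checked numerically. The sketch below is an illustration under stated assumptions: the helper `semi_static_payoff`, the sample path, and the choice of a log contract with $2/x$ forwards are hypothetical. It evaluates (2.4) for a THMD-strategy before and after subtracting the linear term $\psi'(f(0))(x - f(0))$ from $\psi$ and adding $\psi'(f(0))$ to $\delta$.

```python
import math


def semi_static_payoff(psi, delta, observations):
    """Terminal payoff (2.4) when delta depends only on the current price."""
    gains = sum(delta(x) * (y - x) for x, y in zip(observations, observations[1:]))
    return psi(observations[-1]) + gains


f0 = 100.0
observations = [f0, 103.0, 97.0, 85.0, 90.0]  # hypothetical price realisation


def psi(x):
    """Illustrative static payoff: a log contract, with slope -2/f0 at f0."""
    return -2.0 * math.log(x / f0)


def delta(x):
    """Illustrative dynamic position: hold 2/x forwards."""
    return 2.0 / x


slope = -2.0 / f0  # psi'(f(0)) for the log contract above


def psi_norm(x):
    """Static payoff with the linear term removed, so that psi_norm'(f0) = 0."""
    return psi(x) - slope * (x - f0)


def delta_norm(x):
    """Dynamic position adjusted by psi'(f(0))."""
    return delta(x) + slope


original = semi_static_payoff(psi, delta, observations)
normalised = semi_static_payoff(psi_norm, delta_norm, observations)
print(original, normalised)
```

The two payoffs agree because the forward position telescopes: holding a constant quantity $c$ of forwards over every interval earns $c\,(f(T) - f(0))$, which is exactly the linear term removed from $\psi$.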
In the sequel, we will concentrate on the case when $\Delta$ is a THMD strategy. Then we identify $\Delta \in \bar\Delta(P)$ with $\delta : (0,\infty) \to \mathbb{R}$ and write $(\psi, \delta)$ instead of $(\psi, \Delta)$.

Given that investments in the forward market may be assumed to be costless, the dynamic strategy has zero price. Thus, in order to define the price of a semi-static hedging strategy it is sufficient to focus on the price associated with the payoff function $\psi$. The last two terms in (2.2) are expressed in terms of the payoffs of calls and puts. Thus we can identify the price of $\psi(f(T))$ with the price of a corresponding portfolio of vanilla objects. We also use put-call parity to express the cost of the penultimate term in (2.2) in terms of call prices. Let $\Psi_0 = \{\psi \in \Psi : \psi'_+(f(0)) = 0\}$.

Definition 2.10.

The price of a semi-static hedging strategy $(\psi, \Delta)$ with $\psi \in \Psi_0$ (that is, $\psi \in \Psi$ with $\psi'_+(f(0)) = 0$) and $\Delta \in \bar\Delta(P)$ is
\[ \psi(f(0)) + \int_{(0, f(0)]} \psi''(x)\,\big(C(x) + x - f(0)\big)\,dx + \int_{(f(0), \infty)} \psi''(x)\,C(x)\,dx. \]
Here $C(x) + x - f(0)$ is the price of a put with strike $x$, by put-call parity for the forward.

The idea we wish to capture is that the agent holds a static position in calls together with a dynamic position in the underlying such that in combination they provide sub- and super-hedges for the claim.
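To make the pricing formula concrete, the following sketch prices the static payoff $\psi(x) = -2\log(x/f(0)) + 2(x - f(0))/f(0)$, for which $\psi(f(0)) = 0$, $\psi'(f(0)) = 0$ and $\psi''(x) = 2/x^2$, as a strip of out-of-the-money puts and calls. The option prices come from a Black model with constant volatility, which is an assumption made only to supply illustrative prices and is not part of the model-independent framework. The function names, the midpoint quadrature and the strike truncation are likewise hypothetical choices.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_call(f0, strike, vol, expiry):
    """Undiscounted Black call on the forward (illustrative model only)."""
    s = vol * math.sqrt(expiry)
    d1 = (math.log(f0 / strike) + 0.5 * s * s) / s
    return f0 * norm_cdf(d1) - strike * norm_cdf(d1 - s)


def black_put(f0, strike, vol, expiry):
    """Undiscounted Black put on the forward (illustrative model only)."""
    s = vol * math.sqrt(expiry)
    d1 = (math.log(f0 / strike) + 0.5 * s * s) / s
    return strike * norm_cdf(s - d1) - f0 * norm_cdf(-d1)


def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def static_price(psi_at_f0, psi_second, put, call, f0, lo, hi, n=4000):
    """psi(f0) plus puts below f0 and calls above f0, weighted by psi''."""
    puts = midpoint(lambda x: psi_second(x) * put(x), lo, f0, n)
    calls = midpoint(lambda x: psi_second(x) * call(x), f0, hi, n)
    return psi_at_f0 + puts + calls


f0, vol, expiry = 100.0, 0.2, 1.0
price = static_price(
    psi_at_f0=0.0,
    psi_second=lambda x: 2.0 / x**2,
    put=lambda k: black_put(f0, k, vol, expiry),
    call=lambda k: black_call(f0, k, vol, expiry),
    f0=f0,
    lo=0.01 * f0,   # strikes below this carry negligible weight here
    hi=10.0 * f0,   # strikes above this carry negligible weight here
)
print(price, vol**2 * expiry)
```

Under the assumed Black model the forward is a continuous martingale with $\mathbb{E}[-2\log(X_T/f(0))] = \sigma^2 T$, so the two printed numbers should agree up to quadrature and truncation error.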

Definition 2.11.

Let $G = G((f(t_k))_{k=0,\ldots,N})$ be the payoff of a path-dependent option. Suppose that there exists a semi-static hedging strategy $(\psi, \Delta)$ such that on the partition $P$,
\[ G \le (\text{respectively} \ge)\ \psi(f(T)) + \sum_{k=0}^{N-1} \delta_{t_k}\big((f(t))_{0 \le t \le t_k}\big)\,(f(t_{k+1}) - f(t_k)). \]
Then $(\psi, \Delta)$ is called a semi-static super-hedge (respectively semi-static sub-hedge) for $G$.

Given a semi-static sub-hedge (respectively super-hedge) we say that the price of the sub-hedge (respectively super-hedge) is a model independent lower (respectively upper) bound on the price of the path-dependent claim $G$.

Footnote (put-call parity). This means that we do not need to introduce a notation for the put price, which is convenient since $P$ is already in use for the partition. Put-call parity for the forward says that the price of a put with strike $x$ is the price of a call with the same strike plus $x - f(0)$.

Consistent models

The aim of the agent is to construct a hedge which works path-wise, and does not depend on an underlying model. Nonetheless, sometimes it is convenient to introduce a probabilistic model and a stochastic process, and to interpret $f(t)$ as a realisation of that stochastic process. In that case we work with a probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ supporting the stochastic process $X = (X_t)_{0 \le t \le T}$.

Definition 2.12.

A model $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ and associated stochastic process $X = (X_t)_{0 \le t \le T}$ is consistent with the call prices $(C(K))_{K \ge 0}$ if $(X_t)_{t \ge 0}$ is a non-negative $(\mathbb{F}, \mathbb{P})$-martingale and if $\mathbb{E}[(X_T - K)^+] = C(K)$ for all $K > 0$.

$V_H(X, P) : \Omega \to \mathbb{R}_+$ is a random variable, and for $\omega \in \Omega$, $V_H(X(\omega), P)$ is a realised value of a variance swap. From a pricing perspective we are interested in getting upper and lower bounds on $\mathbb{E}[V_H(X(\omega), P)]$ as we range over consistent models. Knowledge of call prices is equivalent to knowledge of the marginal law of $X_T$ under a consistent model (Breeden and Litzenberger [4]). If we write $\mu$ for the law of $X_T$ and if $C_\mu(K) = \mathbb{E}[(Z_\mu - K)^+]$ where $Z_\mu$ is a random variable with law $\mu$, then $X$ is consistent for the call prices $C$ if $C_\mu(K) = C(K)$. We write $m = \int_0^\infty x\,\mu(dx)$ and we assume, using the martingale property, that $f(0) = m$. Then the problem of characterising consistent models is equivalent to the problem of characterising all martingales with a given distribution at time $T$.

In the situation where both the monitoring and the price realisations are continuous, the theory for the pricing of variance swaps is complete and elegant. We will use this setting to develop intuition for the jump case.

Suppose that the price realisation $f$ is continuous, and possesses a quadratic variation $[f] : [0,T] \to \mathbb{R}_+$ on a dense sequence of partitions $P$. Dupire [16] and Neuberger [25] independently made the observation that the continuity assumption implies that a variance swap with payoff $\int_0^T f(t)^{-2}\,d[f]_t$ can be replicated perfectly by holding a static portfolio of log contracts and trading dynamically in the underlying asset. Both Dupire and Neuberger assume $f \equiv X$ is a realisation of a semi-martingale, but in our setting, the observation follows from a path-wise application of Itô's formula in the sense of Föllmer [17], see Section 6.
Applying Itô's formula to $-2\log(f(t))$ we have
\[ -2\log(f(T)) + 2\log(f(0)) = -\int_0^T \frac{2}{f(t)}\,df(t) + \int_0^T \frac{1}{f(t)^2}\,d[f]_t. \tag{3.1} \]
Then, as we show in Section 6 below, down a dense sequence of partitions
\[ V_H(f, P_\infty) = \int_0^T \frac{1}{f(t)^2}\,d[f]_t = -2\log(f(T)) + 2\log(f(0)) + \int_0^T \frac{2}{f(t)}\,df(t). \tag{3.2} \]
Provided it is possible to trade continuously and without transaction costs, the right-hand side of this identity has a clear interpretation as the sum of a European contingent claim with maturity $T$ and payoff $-2\log(f(T)/f(0))$ and the gains from trade from a dynamic investment of $2/f(t)$ in the underlying. Alternatively, the right-hand side of (3.2) can be viewed as the payoff of a semi-static hedging strategy in the continuous time limit for the choice $\psi(x) = -2\log(x/f(0)) + 2(x - f(0))/f(0)$ and $\Delta = (\delta_t)_{0 \le t \le T}$ where $\delta_t((f(u))_{0 \le u \le t}) = (2/f(t)) - (2/f(0))$. Note that there is equality in (3.2) so that $(\psi, \delta)$ is both a sub- and super-hedge for $V_H(f, P_\infty)$. In particular, under a price continuity assumption, the variance swap has a model-independent price and an associated riskless hedge.

Even if the continuity assumption cannot be justified, the associated replication strategy is nevertheless a reasonable candidate for a hedging strategy in the general case. Let us focus on the discrepancy between the payoff of the variance swap and the gains from trade resulting from using the hedge derived in the continuous case. The path-by-path Itô formula continues to apply in the case with jumps, see [17] and Section 6 below. Hence
\[ -2\log(f(T)) + 2\log(f(0)) = -\int_0^T \frac{2}{f(t-)}\,df(t) + \int_0^T \frac{1}{f(t-)^2}\,d[f]^c_t + \sum_{0 \le t \le T} \left\{ 2\left(\frac{\Delta f(t)}{f(t-)}\right) - 2\log\left(\frac{f(t)}{f(t-)}\right) \right\}. \]
Note that $d[\log(f)]_t = d[f]^c_t / f(t-)^2 + (\Delta \log(f(t)))^2$.
By adding and subtracting the discontinuous part of the quadratic variation of $\log(f)$ on the right-hand side of the above expression, we find
\[ -2\log(f(T)) + 2\log f(0) = -\int_0^T \frac{2}{f(t-)}\,df(t) + [\log(f)]_T - \sum_{0 \le t \le T} J_L\big(\Delta f(t)/f(t-)\big), \tag{3.3} \]
where $J_L(\eta) = -2\eta + 2\log(1+\eta) + (\log(1+\eta))^2$.

It is intuitively clear, but see also Corollary 6.5, that $V_{H^L}(f, P_\infty) \equiv [\log(f)]_T$. Then it follows by re-arrangement of equation (3.3) that the discrepancy between the realised value of the variance swap $V_{H^L}(f, P_\infty)$ and the return generated by the classical continuous hedging strategy can be represented as the sum of the jump contributions:
\[ V_{H^L}(f, P_\infty) - \left( -2\log(f(T)) + 2\log f(0) + 2\int_0^T \frac{1}{f(t-)}\,df(t) \right) = \sum_{0 \le t \le T} J_L\left(\frac{\Delta f(t)}{f(t-)}\right). \]
We call this the hedging error, with the convention that if the hedge sub-replicates the variance swap then the hedging error is positive.

Now consider the kernel $H^R$ and define $V_{H^R}(f, P_\infty) = \int_0^T d[f]_t / f(t-)^2$; again, see Corollary 6.5 for justification. By a similar analysis, but adding and subtracting $\left(\frac{\Delta f(t)}{f(t-)}\right)^2$ instead of the discontinuous part of the quadratic variation of $\log(f)$, we have
\[ V_{H^R}(f, P_\infty) - \left( -2\log(f(T)) + 2\log(f(0)) + 2\int_0^T \frac{1}{f(t-)}\,df(t) \right) = \sum_{0 \le t \le T} J_R\left(\frac{\Delta f(t)}{f(t-)}\right), \]
where $J_R(\eta) = -2\eta + 2\log(1+\eta) + \eta^2$.
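The jump decomposition has an exact discrete counterpart that is easy to check numerically. For a discretely observed path, every observed return $\eta_k = (f(t_{k+1}) - f(t_k))/f(t_k)$ plays the role of a jump, and by telescoping the difference between the realised payoff and the classical hedge (the log contract plus $2/f(t_k)$ forwards held over each interval) equals $\sum_k J_L(\eta_k)$ for the kernel $H^L$ and $\sum_k J_R(\eta_k)$ for $H^R$. The sketch below illustrates this rearrangement; the function names and the sample path are hypothetical.

```python
import math


def j_l(eta):
    """J_L(eta) = -2 eta + 2 log(1 + eta) + (log(1 + eta))^2."""
    return -2.0 * eta + 2.0 * math.log1p(eta) + math.log1p(eta) ** 2


def j_r(eta):
    """J_R(eta) = -2 eta + 2 log(1 + eta) + eta^2."""
    return -2.0 * eta + 2.0 * math.log1p(eta) + eta ** 2


def h_l(x, y):
    """H^L: squared log return."""
    return math.log(y / x) ** 2


def h_r(x, y):
    """H^R: squared simple return."""
    return ((y - x) / x) ** 2


def classical_hedge(observations):
    """-2 log(f(T)/f(0)) plus gains from holding 2/f(t_k) forwards per interval."""
    gains = sum(2.0 / x * (y - x) for x, y in zip(observations, observations[1:]))
    return -2.0 * math.log(observations[-1] / observations[0]) + gains


def hedging_error(kernel, observations):
    """Realised swap payoff minus the payoff of the classical hedge."""
    realised = sum(kernel(x, y) for x, y in zip(observations, observations[1:]))
    return realised - classical_hedge(observations)


observations = [100.0, 80.0, 84.0, 63.0]  # hypothetical path with two large drops
returns = [(y - x) / x for x, y in zip(observations, observations[1:])]

error_l = hedging_error(h_l, observations)
error_r = hedging_error(h_r, observations)
jumps_l = sum(j_l(eta) for eta in returns)
jumps_r = sum(j_r(eta) for eta in returns)
print(error_l, jumps_l)
print(error_r, jumps_r)
```

Evaluating `j_l` and `j_r` at a negative and a positive return shows the opposite signs of the two corrections: a downward move contributes positively to the hedging error for $H^L$ and negatively for $H^R$, and an upward move does the reverse.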

In the continuous case, under some mild regularity conditions on $f$ and $P$, the variance swap value is independent of the chosen kernel. In contrast, the value of a variance swap in the general case is highly dependent on the chosen kernel.

To see that this is the case, and to examine the impact of jumps on the hedging error for the kernels $H^L$ and $H^R$, we consider the shapes of the functions $J_R$ and $J_L$, see Figure 1. For the kernel $H^L$, a downward jump results in a positive contribution to the hedging error. Thus, if all jumps are downwards, then the classical continuous hedging strategy sub-replicates $V_{H^L}(f, P_\infty)$. Conversely, upward jumps result in a negative contribution to the hedging error. The story is reversed for the kernel $H^R$.

Figure 1: $J_L$ (dashed line) is convex decreasing for $x \le 0$ and concave decreasing for $x \ge 0$. In contrast $J_R$ (solid line) is first concave increasing and then convex increasing. The different shapes of these two curves explain the different nature of the dependence of the payoff of the variance swap on upward and downward jumps for different kernels.

It follows from the argument in the previous paragraph that for the kernel $H^L$ the hedging error will be maximised under scenarios for which the price realisation has downward jumps, but no upward jumps. Paths with this feature might arise as realisations of $-N$ where $N = (N_t)_{t \ge 0}$ is a compensated Poisson process. Moreover, from the convexity of $J_L$ on $(-1, 0)$ […] $H^R$.

In summary, we find that, under a continuity assumption on $f$, and for a dense sequence of partitions, the value of a variance swap is independent of the kernel and can be replicated with a static hedge in a forward contract and a dynamic hedging strategy. In the presence of jumps, however, the value of the variance swap depends on the kernel. An agent who holds a variance swap and hedges under the assumption of continuity may super-replicate or sub-replicate the payoff depending on the form of the jumps. For example, for the kernel $H^L$ an agent who acts as if the price realisation can be assumed to be continuous will sub-replicate the variance swap if there are downward jumps and no upward jumps. Such an agent will underprice the swap.

We will use the analysis of this section to give us intuition about the extremal models which will lead to the price bounds on variance swaps derived in Section 4. The bounds will depend crucially on the kernel. Models under which the variance swap with kernel $H^L$ has highest price (assuming consistency with a given set of call prices) will be characterised by a single downward jump and no upward jumps.

Remark. We will see later that the model which minimises the price for variance swaps with kernel $H^R$ also minimises the price for variation swaps with kernel $H^S$.
If $f$ has a quadratic variation, then in the continuous limit $V_{H^S}(f, P_\infty) = \sum$ […] $\dfrac{P_\nu(z) - C_\nu(y)}{y - z}$. (3.7)

Let $B$ be Brownian motion started at $m$, with maximum process $S$ and minimum process $I$. Suppose $\nu$ has no atom at $m$. Then $\tau^P_\nu := \inf\{u > 0 : B_u < \alpha_\nu(S_u) \text{ or } B_u > \beta_\nu(I_u)\}$ solves the Skorokhod embedding problem for $\nu$ in the sense that $B_{\tau^P_\nu} \sim \nu$ and $(B_{t \wedge \tau^P_\nu})_{t \ge 0}$ is uniformly integrable.

If $\nu$ has an atom at $m$ then we assume $\mathcal{F}_0$ is sufficiently rich as to support a uniform random variable $\tilde{Z}_U$, which is independent of $B$. Then
\[ \tau^P_\nu := \begin{cases} 0 & \tilde{Z}_U \le \nu(\{m\}) \\ \inf\{u > 0 : B_u < \alpha_\nu(S_u) \text{ or } B_u > \beta_\nu(I_u)\} & \tilde{Z}_U > \nu(\{m\}) \end{cases} \]
solves the Skorokhod embedding problem for $\nu$.

The Perkins embedding has a minimality property in that for increasing functions $F$ it minimises $\mathbb{E}[F(S_\tau)]$ over embeddings $\tau$ of $\nu$. Moreover, as shown in [20] it also minimises the expected value of functionals of the joint law of the running maximum and terminal value $F(B_\tau, S_\tau)$ over stopping times $\tau$ in $UI(\nu)$, provided $F$ satisfies some consistency conditions. The salient characteristic of the Perkins embedding which results in optimality is that either $B_{\tau^P_\nu} = S_{\tau^P_\nu}$ or $B_{\tau^P_\nu} = \alpha_\nu(S_{\tau^P_\nu})$.

Now consider the problem of finding the consistent model for which $V_{H^R}(X, P_\infty)$ has lowest possible price, and recall that knowledge of call prices is equivalent to knowledge of the marginal law $\mu$ of $X_T$. To obtain the lowest possible price we might expect equality in each of (3.4)-(3.5), and thus that just before a jump, the process is at its current maximum. Moreover, the model should be related to the Perkins embedding.

Lemma 3.3.

Let $B$ be Brownian motion started at $m$. Let $H_b = \inf\{u \ge 0 : B_u = b\}$ be the first hitting time of level $b$ by Brownian motion. Let $\Lambda(t)$ be a strictly increasing, continuous function such that $\Lambda(0) = m$ and $\lim_{t \uparrow T} \Lambda(t)$ is infinite.

Define the process $\tilde{Q}^\mu = (\tilde{Q}^\mu_t)_{0 \le t \le T}$ by
\[ \tilde{Q}^\mu_t = B_{H_{\Lambda(t)} \wedge \tau^P_\mu}, \tag{3.8} \]
and let $Q^\mu$ be the right-continuous modification of $\tilde{Q}^\mu$.

Then $Q^\mu$ is a martingale such that $Q^\mu_T \sim \mu$. Moreover, the paths of $Q^\mu$ are continuous and increasing, except possibly at a single jump time. Finally, either $Q^\mu_T \equiv B_{\tau^P_\mu} = S_{\tau^P_\mu}$ or $Q^\mu_T \equiv B_{\tau^P_\mu} = \alpha_\mu(S_{\tau^P_\mu})$.

Proof. Since $\tau^P_\mu$ is finite almost surely we have that $Q^\mu_T \equiv B_{\tau^P_\mu} \sim \mu$. Moreover, for $H_{\Lambda(t)} < \tau^P_\mu$, $Q^\mu_t = \Lambda(t) = B_{H_{\Lambda(t)}} = S_{H_{\Lambda(t)}}$.

The martingale $Q^\mu$ will be used in Section 6 to show that in the continuous-time limit, the bounds we obtain are tight. The martingale $Q^\mu$ is related to the Perkins embedding in the same way that the Dubins-Gilat [15] martingale is related to the Azéma-Yor [1] embedding.

We can also consider a reflected version of the martingale $Q^\mu$ based on the infimum process rather than the maximum process.

Lemma 3.4. Let $\lambda(t)$ be a strictly decreasing, continuous function such that $\lambda(0) = m$ and $\lim_{t \uparrow T} \lambda(t)$ is zero.

Define the process $\tilde{R}^\mu = (\tilde{R}^\mu_t)_{0 \le t \le T}$ by
\[ \tilde{R}^\mu_t = B_{H_{\lambda(t)} \wedge \tau^P_\mu}, \tag{3.9} \]
and let $R^\mu$ be the right-continuous modification of $\tilde{R}^\mu$.

Then $R^\mu$ is a martingale such that $R^\mu_T \sim \mu$. Moreover, the paths of $R^\mu$ are continuous and decreasing, except possibly at a single jump time. Finally, either $R^\mu_T \equiv B_{\tau^P_\mu} = I_{\tau^P_\mu}$ or $R^\mu_T \equiv B_{\tau^P_\mu} = \beta_\mu(I_{\tau^P_\mu})$.

Remark. In this section we have exploited a connection between the problem of finding bounds on the prices of variance swaps and the Skorokhod embedding problem.
This link is one of the recurring themes of the literature on model-independent bounds, see Hobson [19]. We exhibit this link for the kernel $H^R$, and in this sense at least, it seems that variance swaps defined via $H^R$ are the more natural mathematical object. Nonetheless, the intuition developed via $H^R$ and the Skorokhod embedding problem is valid more widely.

Previous sections have defined notation and developed intuition for the problem. Now we begin the construction of path-wise hedging strategies. We do this by defining a class of synthesisable payoffs with a useful extra property which can be exploited to give sub-hedges. Then, motivated by the results of Section 3.3, we define a further class of payoffs which are based on decreasing functions. Finally we show that for the kernel $H^R$, members of this new class belong to the former class also, and thus yield sub-hedges.

To construct a sub-hedge for a variation swap with kernel $H$ for any price realisation $f$, suppose that there exists a pair of functions $(\psi, \delta)$ such that for $x, y \in \mathbb{R}$
\[ H(x,y) \ge \psi(y) - \psi(x) + \delta(x)(y - x). \tag{4.1} \]
Then we may interpret $(\psi, \delta)$ as a semi-static hedging strategy (for a Markov and time-homogeneous dynamic strategy). Summing (4.1) over the increments of the partition, for any price realisation $f$ and partition $P$,
\[ V_H(f, P) \ge \psi(f(T)) - \psi(f(0)) + \sum_k \delta(f(t_k))\,(f(t_{k+1}) - f(t_k)). \]
By Definition 2.11 we have constructed a sub-hedge for the variation swap with kernel $H$.

Suppose now that $H$ is a variance swap kernel, and that $\psi$ is differentiable. Recall that $H_y(x,x) = 0$. Dividing both sides of (4.1) by $y - x$ and letting $y \downarrow x$, we find that $\delta(x) \le -\psi'(x)$. Similarly letting $y \uparrow x$, $\delta(x) \ge -\psi'(x)$. Thus if (4.1) is to hold we must have that $\delta \equiv -\psi'$, and our search for pairs of functions satisfying (4.1) is reduced to finding differentiable functions $\psi$ satisfying
\[ H(x,y) \ge \psi(y) - \psi(x) - \psi'(x)(y - x), \tag{4.2} \]
or equivalently, $\psi(y) \le H(x,y) + \psi(x) + \psi'(x)(y - x)$. Note that there is equality in this last expression at $y = x$.

Definition 4.1. $\psi \in \Psi$ is a candidate sub-hedge payoff if for all $y \in (0,\infty)$,
\[ \psi(y) = \inf_x \big\{ H(x,y) + \psi'(x)(y - x) + \psi(x) \big\}. \tag{4.3} \]

Given a candidate sub-hedge payoff $\psi$ we can generate a candidate semi-static hedge $(\psi, \delta)$ by taking $\delta = -\psi'$. We will say that $\psi$ is the root of the semi-static sub-hedge $(\psi, -\psi')$.

It remains to show how to choose candidate sub-hedge payoffs and especially those which have good properties. Using the intuition developed in the previous section for the kernel $H^R$ we expect optimal sub-hedging strategies to be associated with the martingale $Q$ defined in (3.8). For realisations of $Q$, either the path has no jump, or there is a single jump, and if the jump occurs when the process is at $x$ then the jump is to $\alpha(x)$.

With this in mind let $\mathcal{K} = \mathcal{K}(f(0))$ be the set of monotone decreasing right-continuous functions $\kappa : [f(0), \infty) \to (0, f(0)]$, with $\kappa(f(0)) = f(0)$. Let $k$ denote the inverse of $\kappa$. For $y < f(0)$ we want the infimum in (4.3) to be attained at $x = k(y)$. Then $\psi$ must satisfy
\[ \psi(y) = H(k(y), y) + \psi(k(y)) + \psi'(k(y))(y - k(y)). \tag{4.4} \]
Moreover, if $\psi'$ is differentiable, then for $x = k(y)$ to be the argument of the infimum in (4.3) we must have that $k$ satisfies $H_x(k(y), y) + \psi''(k(y))(y - k(y)) = 0$, or equivalently
\[ H_x(x, \kappa(x)) = \psi''(x)(x - \kappa(x)). \tag{4.5} \]
This suggests that we can define candidate sub-hedge payoffs $\psi$ via (4.5) on $(f(0), \infty)$ and via (4.4) on $(0, f(0))$.

If $\psi$ satisfies (4.2) then so does $y \mapsto \psi(y) + a + by$ for any constants $a$, $b$, since adding an affine function leaves the right-hand side of (4.2) unchanged. Earlier we argued that without loss of generality for a semi-static hedging strategy we could assume $\psi'(f(0)) = 0$.
Now we may restrict attention further to $\psi$ with $\psi(f(0)) = 0$.

Define $\Phi(u,y) = H_x(u,y)/(u-y)$. Write $\Phi^R(u,y) = H^R_x(u,y)/(u-y)$, and similarly for other kernels.

Definition 4.2. For $\kappa \in \mathcal{K}$ with inverse $k$, define $\psi_{\kappa,H} \equiv \psi_\kappa : (0,\infty) \to \mathbb{R}_+$ by $\psi_\kappa(f(0)) = 0$ and

$$\psi_\kappa(x) = \int_{f(0)}^{x} (x-u)\,\Phi(u,\kappa(u))\,du, \qquad x > f(0),$$

$$\psi_\kappa(z) = \psi_\kappa(k(z)) + \psi_\kappa'(k(z))(z - k(z)) + H(k(z), z), \qquad z < f(0).$$

We call such a function a candidate payoff of Class $\mathcal{K}$.

By convention we use the variable $x$ on $(f(0), \infty)$ and $z$ on $(0, f(0))$, to reflect the fact that $\psi$ is defined explicitly on the former set, but only implicitly on the latter.

For the present we fix $\kappa$ and write simply $\psi$ for $\psi_\kappa$. Note that the value of $\psi(x)$ does not depend on the right-continuity assumption for $\kappa$. Further, observe that if $\kappa$ is not injective and there is an interval $A_z \equiv \{x : \kappa(x) = z\} \subseteq (m, \infty)$ over which $\kappa$ takes the value $z$, then $k$ has a jump at $z$. Nonetheless, the value of $\psi(z)$ does not depend on the choice of $k(z)$. To see this, for $x \in A_z$ consider $\Psi(x) := \psi(x) + \psi'(x)(z-x) + H(x,z)$. Then, on $A_z$, $d\Psi/dx = \psi''(x)(z-x) + H_x(x,z) \equiv 0$, using (4.5).

Motivated by the results of Section 3.3 we have defined $\psi$ relative to the set of decreasing functions $\mathcal{K}$ with the aim of constructing a sub-hedge. However, there are analogous definitions based on constructing super-hedges, or using the martingale $R$, or both.

Definition 4.3. $\psi : (0,\infty) \to (0,\infty)$ is a candidate super-hedge payoff if for all $y \in (0,\infty)$,

$$\psi(y) = \sup_x \left\{ H(x,y) + \psi'(x)(y-x) + \psi(x) \right\}. \tag{4.6}$$

Let $\mathcal{L} = \mathcal{L}(f(0))$ be the set of monotone increasing functions $\ell : (0, f(0)) \to (f(0), \infty)$ with $\ell(f(0)) = f(0)$. Let $l$ be inverse to $\ell$.

Definition 4.4. For $\ell \in \mathcal{L}$ with inverse $l$, define $\psi_\ell : (0,\infty) \to \mathbb{R}_+$, the candidate payoff of Class $\mathcal{L}$, by $\psi_\ell(f(0)) = 0$ and

$$\psi_\ell(x) = \int_{x}^{f(0)} (u-x)\,\Phi(u,\ell(u))\,du, \qquad x < f(0),$$

$$\psi_\ell(z) = \psi_\ell(l(z)) + \psi_\ell'(l(z))(z - l(z)) + H(l(z), z), \qquad z > f(0).$$

Our next aim is to give conditions which guarantee that the semi-static strategy $(\psi, -\psi')$ satisfies equation (4.1).

Definition 4.5.

A variation swap kernel $H$ is an increasing (a decreasing) kernel if it is a regular variation swap kernel and

(i) $\Phi(u,y)$ is monotone increasing (decreasing) in $y$,

(ii) $H(a,b) + H_y(a,b)(c-b) \ge (\le)\ H(a,c) - H(b,c)$ for all $a > b$.

The second condition in Definition 4.5 is equivalent to the fact that $H_{yy}(x,y)$ is increasing (decreasing) in its first argument.

Example 4.6. $H^R$ and $H^S$ are increasing kernels and $H^L$ is a decreasing kernel. The kernels $H^B$ and $H^Q$ are simultaneously both increasing and decreasing, since $\Phi^B(u,y) = 2u^{-2}$ and $\Phi^Q(u,y) = 2$ do not depend on $y$ and Condition (ii) in Definition 4.5 is satisfied with equality in both cases.

Example 4.7.

Consider the kernels $H^{G-}(u,y) = u H^R(u,y)$ and $H^{G+}(u,y) = y H^R(u,y)$. In the first case, variance is weighted by the pre-jump value of the price realisation and in the second case the variance is weighted by the post-jump value. Swaps of this type are known as Gamma swaps, see, for example, Carr and Lee [9]. Both $H^{G-}$ and $H^{G+}$ are increasing kernels.

Theorem 4.8.
(i) (a) If $H$ is an increasing kernel then every candidate payoff of Class $\mathcal{K}$ is the root of a semi-static sub-hedge for the kernel $H$.
(b) If $H$ is an increasing kernel then every candidate payoff of Class $\mathcal{L}$ is the root of a semi-static super-hedge for the kernel $H$.
(ii) (a) If $H$ is a decreasing kernel then every candidate payoff of Class $\mathcal{L}$ is the root of a semi-static sub-hedge for the kernel $H$.
(b) If $H$ is a decreasing kernel then every candidate payoff of Class $\mathcal{K}$ is the root of a semi-static super-hedge for the kernel $H$.

Proof. We will prove the theorem in the case (i)(a). The proofs in the other cases are similar.

Fix $\kappa \in \mathcal{K}$ and let $L_\kappa(x,y) = \psi_\kappa(x) + \psi_\kappa'(x)(y-x) + H(x,y) - \psi_\kappa(y)$. The result will follow if we can show that $L_\kappa(x,y) \ge 0$ for all $(x,y) \in (0,\infty)^2$. Since $\kappa$ is fixed we drop the subscript $\kappa$ in what follows.

Suppose that $x, z > f(0)$ and $y \in (0,\infty)$. Since

$$\psi(x) + \psi'(x)(y-x) = \int_{f(0)}^{x} (y-u)\,\Phi(u,\kappa(u))\,du$$

we have that

$$\begin{aligned}
L(x,y) - L(z,y) &= \psi(x) + \psi'(x)(y-x) + H(x,y) - \psi(z) - \psi'(z)(y-z) - H(z,y) \\
&= \int_z^x \left\{ (y-u)\,\Phi(u,\kappa(u)) + H_x(u,y) \right\} du \\
&= \int_z^x \left\{ \Phi(u,y) - \Phi(u,\kappa(u)) \right\} (u-y)\,du .
\end{aligned}$$

If $y \ge f(0)$, then set $z = y$ and use $L(y,y) = 0$ to find that

$$L(x,y) = \int_y^x \left\{ \Phi(u,y) - \Phi(u,\kappa(u)) \right\} (u-y)\,du .$$

Since $y \ge f(0) \ge \kappa(u)$, $\Phi(u,y) \ge \Phi(u,\kappa(u))$ for all $u$. Hence $L(x,y) \ge 0$, with equality at $y = x$.

If $y < f(0)$ and $k$ is continuous at $y$, set $z = k(y)$. Otherwise, for definiteness set $z = k(y+)$. Then $L(k(y+), y) = 0$ and

$$L(x,y) = \int_{k(y+)}^{x} \left\{ \Phi(u,y) - \Phi(u,\kappa(u)) \right\} (u-y)\,du .$$

If $k(y+) \le x$ then $y \ge \hat{x}$ for all $\hat{x} \in [\kappa(x+), \kappa(x-)]$. Then for $u \in (k(y+), x)$, $\kappa(u) \le y$, and since $\Phi(u,z)$ is increasing in $z$ the integrand is non-negative. If $x < k(y+)$, then $y < \hat{x}$ for all $\hat{x} \in [\kappa(x+), \kappa(x-)]$. Then for $u \in (x, k(y+))$ we have $\kappa(u) > y$. Then again $L(x,y) \ge 0$.

It remains to show that $L(x,y) \ge 0$ for $x < f(0)$. Note that since, by what we have shown above, $L(k(x), y) \ge 0$, it is sufficient to show that $L(x,y) \ge L(k(x), y)$. But,

$$\begin{aligned}
L(x,y) - L(k(x),y) &= \psi(x) + \psi'(x)(y-x) + H(x,y) - \psi(k(x)) - \psi'(k(x))(y-k(x)) - H(k(x),y) \\
&= \psi(k(x)) + \psi'(k(x))(x-k(x)) + H(k(x),x) + \psi'(k(x))(y-x) + H_y(k(x),x)(y-x) + H(x,y) \\
&\qquad - \psi(k(x)) - \psi'(k(x))(y-k(x)) - H(k(x),y) \\
&= H(k(x),x) + H(x,y) + H_y(k(x),x)(y-x) - H(k(x),y) \ \ge\ 0,
\end{aligned}$$

where the last inequality follows from condition (ii) of Definition 4.5.

In the next three sections we concentrate on lower bounds and increasing variance kernels, but there are equivalent results for upper bounds and/or decreasing variance kernels.

In this section we fix the call prices and attempt to identify the most expensive sub-hedge from the set of sub-hedges generated by candidate payoffs of Class $\mathcal{K}$. The price of this sub-hedge provides a highest model-independent lower bound on the price of the variance swap in a sense which we will explain in the section on continuous limits.

Associated with the set of call prices $C(k)$ (and put prices $C(k) + f(0) - k$ given by put-call parity) there is a measure $\mu$ on $\mathbb{R}_+$ with mean $m$. Since $f$ is a forward price we must have $f(0) = m$. Write $C = C_\mu$ to emphasise the connection between these quantities. Then $C(k) = C_\mu(k) = \int_k^\infty (x-k)\,\mu(dx)$.
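
As a rough numerical illustration of the relation $C_\mu(k) = \int (x-k)^+ \mu(dx)$, the following sketch prices calls under a small discrete measure and recovers the atoms of the measure from second differences of the call prices. The measure (atoms at 0.5, 1.0 and 1.5), the helper names `call_price` and `atom_mass`, and the grid step are all assumptions chosen for illustration; none of them come from the paper.

```python
# Hypothetical discrete law mu: atoms and weights chosen only for illustration.
ATOMS = [0.5, 1.0, 1.5]
WEIGHTS = [0.25, 0.5, 0.25]


def call_price(k):
    """C_mu(k) = sum_i p_i * max(x_i - k, 0) for the discrete law above."""
    return sum(p * max(x - k, 0.0) for x, p in zip(ATOMS, WEIGHTS))


def atom_mass(k, h=0.5):
    """Second difference of C_mu at k, divided by h.

    When all atoms lie on a grid of step h this equals mu({k}),
    a discrete analogue of reading mu off the second derivative of C_mu.
    """
    return (call_price(k - h) - 2.0 * call_price(k) + call_price(k + h)) / h


mean = call_price(0.0)                  # C_mu(0) is the mean m = f(0)
masses = [atom_mass(x) for x in ATOMS]  # recovered weights of the atoms
```

With a continuum of strikes the same idea is applied to the quoted call prices themselves; the discrete setting is used here only because every quantity can be checked by hand.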
Recall that $C_\mu$ is convex so that $\mu(dx) = C_\mu''(x)\,dx$, with the right-hand side to be interpreted in a distributional sense as necessary. We wish to calculate the cost of the European claim which forms part of the semi-static sub-hedge. By construction this is equal to

$$\int_{\mathbb{R}_+} \psi(x)\,\mu(dx) = \int_0^m \psi''(z)\,(C_\mu(z) + m - z)\,dz + \int_m^\infty \psi''(x)\,C_\mu(x)\,dx .$$

Proposition 5.1.

For $H$ a variance swap kernel and $\kappa \in \mathcal{K}(m)$,

$$\int_0^\infty \psi_\kappa(x)\,\mu(dx) = \int_0^m \mu(dz)\,H(m,z) + \int_m^\infty du\,\Sigma^{(u)}_\mu(\kappa(u)) \tag{5.1}$$

where, for $v < m < u$,

$$\Sigma^{(u)}_\mu(v) = \Phi(u,v)\,C_\mu(u) + \int_{(0,v]} \mu(dz)\,(u-z)\left\{ \Phi(u,z) - \Phi(u,v) \right\} .$$

Proof. Let $\psi = \psi_\kappa$. Note that by definition $\psi(m) = 0$, so there is no contribution from mass at $m$ and we can divide the integral on the left of (5.1) into intervals $(0,m)$ and $(m,\infty)$. For the latter,

$$\begin{aligned}
\int_m^\infty \psi(x)\,\mu(dx) &= \int_m^\infty \mu(dx) \int_m^x (x-u)\,\Phi(u,\kappa(u))\,du \\
&= \int_{u=m}^\infty du\,\Phi(u,\kappa(u)) \int_u^\infty (x-u)\,\mu(dx) \\
&= \int_{u=m}^\infty du\,\Phi(u,\kappa(u))\,C_\mu(u) =: I_1 .
\end{aligned}$$

Now consider $\int_0^m \psi(z)\,\mu(dz)$. For this, using $H(k,z) = H(m,z) + \int_m^k H_x(u,z)\,du$ and $\psi(x) + \psi'(x)(z-x) = \int_m^x du\,(z-u)\,\Phi(u,\kappa(u))$, we have

$$\int_0^m \psi(z)\,\mu(dz) = \int_0^m \mu(dz)\,H(m,z) + \int_0^m \mu(dz) \int_m^{k(z)} du\,(u-z)\left\{ \Phi(u,z) - \Phi(u,\kappa(u)) \right\} =: I_2 + I_3 .$$

Note that $I_2$ depends on $H$ but not on $\kappa$. Moreover, $I_3$ does not depend on the particular values chosen for the inverse taken over intervals of constancy of $\kappa$. (If $x < \tilde{x}$ are a pair of possible values for $k(z)$ then $\int_x^{\tilde{x}} du\,(u-z)\{\Phi(u,z) - \Phi(u,\kappa(u))\} = 0$, since over this range $\kappa(u) = z$.) Changing the order of integration we have

$$I_3 = \int_m^\infty du \int_{(0,\kappa(u)]} \mu(dz)\,(u-z)\left\{ \Phi(u,z) - \Phi(u,\kappa(u)) \right\},$$

and then $I_1 + I_3 = \int_m^\infty du\,\Sigma^{(u)}_\mu(\kappa(u))$.

Our goal is to maximise the expression (5.1) over decreasing functions $\kappa \in \mathcal{K}$. As noted above, $I_2$ is independent of $\kappa$, and to maximise $\int_m^\infty du\,\Sigma^{(u)}_\mu(\kappa(u))$ we can maximise $\Sigma^{(u)}_\mu(\kappa)$ separately for each $u > m$, and then check that the maximiser is a decreasing function of $u$.

Proposition 5.2.

Suppose $H$ is an increasing variance swap kernel. Then $\int_0^\infty \psi_\kappa(x)\,\mu(dx)$ is maximised over $\kappa \in \mathcal{K}$ by $\kappa = \alpha$, where $\alpha$ is the quantity which arises in (3.7) in the definition of the Perkins solution to the Skorokhod embedding problem.

Proof. For $u > m$ consider

$$\Theta^{(u)}_\mu(v) := C_\mu(u) - \int_{(0,v]} \mu(dz)\,(u-z)$$

defined for $v \in (0,u)$. (The first term is the call price at $u$; this is the form required by the identity for $\Sigma^{(u)}_\mu(v) - \Sigma^{(u)}_\mu(\kappa)$ below.) Then for each $u$, $\Theta^{(u)}_\mu$ is a strictly decreasing right-continuous function taking both positive and negative values on $(0,m)$. Let $\kappa = \kappa(u) = \sup\{ v : \Theta^{(u)}_\mu(v) \ge 0 \}$. We have $\Theta^{(u)}_\mu(\kappa-) \ge 0 \ge \Theta^{(u)}_\mu(\kappa+)$.

Suppose $H$ is an increasing variance swap kernel so that $\Phi(u,y)$ is increasing in $y$. We want to show that $\Sigma^{(u)}_\mu(v)$ is maximised by $v = \kappa(u)$.

Suppose $m > v > \kappa(u)$. We aim to show that for all $\kappa \in (\kappa(u), v)$ we have $\Sigma^{(u)}_\mu(v) \le \Sigma^{(u)}_\mu(\kappa)$. We have

$$\begin{aligned}
\Sigma^{(u)}_\mu(v) - \Sigma^{(u)}_\mu(\kappa) &= \Phi(u,v)\,C_\mu(u) + \int_0^v \mu(dz)\,(u-z)\left\{ \Phi(u,z) - \Phi(u,v) \right\} \\
&\qquad - \Phi(u,\kappa)\,C_\mu(u) - \int_0^\kappa \mu(dz)\,(u-z)\left\{ \Phi(u,z) - \Phi(u,\kappa) \right\} \\
&= \int_\kappa^v \mu(dz)\,(u-z)\left\{ \Phi(u,z) - \Phi(u,v) \right\} + \left[ \Phi(u,v) - \Phi(u,\kappa) \right] \Theta^{(u)}_\mu(\kappa) .
\end{aligned}$$

Since $H$ is an increasing variance kernel, for $z \in (\kappa, v)$, $\Phi(u,z) \le \Phi(u,v)$, and the first integral is non-positive. Furthermore, $\Phi(u,v) \ge \Phi(u,\kappa)$ and $\Theta^{(u)}_\mu(\kappa) < 0$. Hence we conclude that $\Sigma^{(u)}_\mu(v) \le \Sigma^{(u)}_\mu(\kappa)$.

Similar arguments show that if $v < \kappa(u)$ then $\Sigma^{(u)}_\mu(v) \le \Sigma^{(u)}_\mu(\kappa)$ for any $\kappa \in (v, \kappa(u))$, and it follows that $\kappa = \kappa(u)$ is a maximiser of $\Sigma^{(u)}_\mu(v)$.

Note that $\kappa(u)$ is precisely the quantity $\alpha$ which arises in the Perkins construction. Hence $\kappa$ is a decreasing function. Moreover, the definition $\kappa(u) = \sup\{ v : \Theta^{(u)}_\mu(v) \ge 0 \}$ ensures that $\kappa$ is right continuous.

Corollary 5.3.

Suppose $\kappa_n(x)$ is a sequence of elements of $\mathcal{K}$ with $\kappa_n(x) \downarrow \kappa(x)$. Then $\int_{[0,\infty)} \psi_{\kappa_n}(x)\,\mu(dx)$ converges monotonically to $\int_{[0,\infty)} \psi_\kappa(x)\,\mu(dx)$.

Proof. Recall that

$$\int_{[0,\infty)} \psi_\kappa(x)\,\mu(dx) = \int_0^1 \mu(dz)\,H(1,z) + \int_1^\infty du\,\Sigma^{(u)}_\mu(\kappa(u)) .$$

By the above arguments we have that $\Sigma^{(u)}_\mu(z)$ is increasing in $z$ for $z > \kappa(u)$. Hence the result follows by monotone convergence.

Example 5.4.

Let $H = H^R$, an increasing variance kernel. Let $\mu = U[0,2]$ and let $\kappa : [1,2] \to [0,1]$ be given by $\kappa(x) = \alpha_\mu(x) = x - 2\sqrt{x-1}$. Similarly we define $\ell(x) = \beta_\mu(x) = x + 2\sqrt{1-x}$. Then $(\psi_\kappa, -\psi_\kappa')$ is the most expensive sub-hedge of Class $\mathcal{K}$ and $(\psi_\ell, -\psi_\ell')$ is the cheapest super-hedge of Class $\mathcal{L}$. Although we cannot calculate the functions $\psi_\kappa$, $\psi_\ell$ explicitly, they can be evaluated numerically, see the left hand side of Figure 2. Now suppose $H = H^L$. The roles of $\psi_\kappa$ and $\psi_\ell$ are reversed (see the right hand side of Figure 2), and $(\psi_\kappa, -\psi_\kappa')$ is the root of a semi-static super-hedge and $(\psi_\ell, -\psi_\ell')$ is the root of a semi-static sub-hedge.

The bounds we have constructed based on the functions $\psi_\kappa$ hold simultaneously across all paths and all partitions. The purpose of this section is to consider the limit as the partition becomes finer. It will turn out that in the continuous limit there is a stochastic model which is consistent with the observed call prices and for which there is equality in the inequality (4.1) from which we derive the lower bound. In this sense the model-free bound is optimal, and can be attained.

The analysis of this section justifies restricting attention to candidate payoffs of Classes $\mathcal{K}$ and $\mathcal{L}$. Hedges of this type either sub-replicate or super-replicate the payoff of the variance swap depending on the form of the kernel, but there could be other sub- and super-replicating strategies which do not take this form. In principle, for a given partition one of these other sub-hedges could give a tighter model-independent bound than we can derive from our analysis. (As an extreme example, suppose the partition is trivial ($0 = t_0 < t_1 = T$). Then $V_H(f,P) = H(f(0), f(T))$, which can be replicated exactly using call options.) However, in the continuous limit our bound is best possible, so that when the partition is finite but the mesh size is small we expect our hedge to be close to best possible and relatively simple to implement.

Figure 2: For the two kernels $\psi_\kappa$ is shown as a dashed line and $\psi_\ell$ is shown as a solid line. For the kernel $H^R$ (left-hand side), $\psi_\kappa$ is associated with a lower bound on the price of the variance swap. For the kernel $H^L$ (right-hand side), $\psi_\kappa$ is associated with an upper bound.

For a finite partition $P^{(n)}$ in the dense sequence $P = (P^{(n)})_{n \ge 1}$ we have

$$V_H(f, P^{(n)}) = \sum_{k=0}^{N^{(n)}-1} H(f(t_k), f(t_{k+1})) \ge \psi(f(T)) - \psi(f(0)) - \sum_{k=0}^{N^{(n)}-1} \psi'(f(t_k))\,(f(t_{k+1}) - f(t_k)) . \tag{6.1}$$

We want to conclude that the limits $V_H(f, P_\infty) = \lim_n V_H(f, P^{(n)})$ and

$$\lim_n \sum_{k=0}^{N^{(n)}-1} \psi'(f(t_k))\,(f(t_{k+1}) - f(t_k)) = \int_0^T \psi'(f(t-))\,df(t) \tag{6.2}$$

exist for each path under consideration. Our analysis follows the development of a path-wise Itô's formula in Föllmer [17]. Let $\epsilon_t$ denote a point mass at $t$.

Definition 6.1.

A path realisation $f$ has a quadratic variation on a dense sequence of partitions $P = (P^{(n)})_{n \ge 1}$ if, when we define the measure

$$\zeta_n = \sum_{k=0,\ t_k \in P^{(n)}}^{N^{(n)}-1} (f(t_{k+1}) - f(t_k))^2\,\epsilon_{t_k},$$

the sequence $\zeta_n$ converges weakly to a Radon measure $\zeta$ on $[0,T]$. Then $([f]_t)_{t \ge 0}$ is given by $[f]_t = \zeta([0,t])$.

The atomic part of $\zeta$ is given by the squared jumps of $f$. Moreover the quadratic variation $([f]_t)_{t \ge 0}$ is simply the cumulative mass function of $\zeta$.

Theorem 6.2. (Föllmer [17]) Suppose the price realisation $f$ has a quadratic variation along $P = (P^{(n)})_{n \ge 1}$ and $G$ is a twice continuously differentiable function from $\mathbb{R}_+$ to $\mathbb{R}$. Then

$$\int_0^T G'(f(t-))\,df(t) = \lim_{n \uparrow \infty} \sum_{k=0}^{N^{(n)}-1} G'(f(t_k))\,(f(t_{k+1}) - f(t_k))$$

exists and

$$G(f(T)) - G(f(0)) = \int_0^T G'(f(s-))\,df(s) + \frac{1}{2} \int_{(0,T]} G''(f(s))\,d[f]^c_s + \sum_{s \le T} \left[ G(f(s)) - G(f(s-)) - G'(f(s-))\,\Delta f(s) \right],$$

and the series of jump terms is absolutely convergent.

Hence, provided $\psi$ is twice continuously differentiable on the support of $f$ and $f$ has a quadratic variation along $P$, it follows immediately that the limit in (6.2) exists. In our setting $\psi_\kappa''(u) = \Phi(u, \kappa(u))$ for $u > 1$, so that a sufficient condition for $\psi_\kappa''(u)$ to be continuous on $(1,\infty)$ is that $\kappa$ is continuous. Further, on $u < 1$, provided $k \equiv \kappa^{-1}$ is differentiable and $H_y$ exists, we have $\psi'(z) = \psi'(k(z)) + H_y(k(z), z)$. Hence, sufficient conditions for $\psi$ to be twice continuously differentiable on $(0,1)$ are that $k$ is continuously differentiable, $\kappa$ is continuous, and $H_{xy}$ and $H_{yy}$ are continuous. Let $\mathcal{K}_c$ be the class of decreasing functions $\kappa : (f(0), \infty) \to (0, f(0))$ which are continuous and have an inverse $k$ which is continuously differentiable.

Corollary 6.3.

Suppose that $H$ is an increasing variance kernel, and that $f$ has a quadratic variation. Suppose $\kappa \in \mathcal{K}_c$ and $\psi = \psi_\kappa$. Then the limit in (6.2) exists.

Now we want to consider $V_H(f, P_\infty) = \lim_n V_H(f, P^{(n)})$.

Lemma 6.4.

Suppose H is a variance swap kernel. If P = ( P ( n ) ) n ≥ is a dense sequence ofpartitions, and f has a quadratic variation along P , then lim n ↑∞ V H ( f, P ( n ) ) exists and satisﬁes V H ( f, P ∞ ) = (cid:90) (0 ,T ] f ( t − ) d [ f ] t + (cid:88)

Our proof follows Föllmer [17]. Fix $\epsilon > 0$. Partition $[0,T]$ into two classes: a finite class $C_1 = C_1(\epsilon)$ of jump times and a class $C_2 = C_2(\epsilon)$ such that

$$\sum_{s \in [0,T],\ s \in C_2(\epsilon)} (\Delta f(s))^2 \le \epsilon^2 . \tag{6.4}$$

Then $\sum_{k=0}^{N^{(n)}-1} H(f(t_k), f(t_{k+1})) = \sum_1 H(f(t_k), f(t_{k+1})) + \sum_2 H(f(t_k), f(t_{k+1}))$, where $\sum_1$ indicates a sum over those $0 \le k \le N^{(n)}-1$ for which $(t_k, t_{k+1}]$ contains a jump of class $C_1$, and $\sum_2$ a sum over the remaining $k$. It follows that

$$\lim_{n \uparrow \infty} \sum\nolimits_1 H(f(t_k), f(t_{k+1})) = \sum_{t \in C_1(\epsilon)} H(f(t-), f(t)) . \tag{6.5}$$

On the other hand, using the properties $H(x,x) = 0$, $H_y(x,x) = 0$ we have from Taylor's formula that $H(x,y) = \frac{1}{2} H_{yy}(x,x)(y-x)^2 + r(x,y)$. Using the fact that $(f(t))_{0 \le t \le T}$ is a compact subset of $(0,\infty)$ we may assume that the remainder term satisfies $|r(x,y)| \le R(|y-x|)(y-x)^2$, where $R$ is an increasing function on $[0,\infty)$ such that $R(c) \to 0$ as $c \to 0$. Then

$$\begin{aligned}
\sum\nolimits_2 H(f(t_k), f(t_{k+1})) &= \frac{1}{2} \sum\nolimits_2 H_{yy}(f(t_k), f(t_k))\,(f(t_{k+1}) - f(t_k))^2 + \sum\nolimits_2 r(f(t_k), f(t_{k+1})) \\
&= \frac{1}{2} \sum H_{yy}(f(t_k), f(t_k))\,(f(t_{k+1}) - f(t_k))^2 - \frac{1}{2} \sum\nolimits_1 H_{yy}(f(t_k), f(t_k))\,(f(t_{k+1}) - f(t_k))^2 \\
&\qquad + \sum\nolimits_2 r(f(t_k), f(t_{k+1})) .
\end{aligned} \tag{6.6}$$

Since $H_{yy}(f,f) = 2/f^2$ is uniformly continuous over the bounded set of values $(f(t))_{0 \le t \le T}$, by (9) in Föllmer [17] the first term in (6.6) converges to $\int_{(0,T]} \frac{1}{f(t-)^2}\,d[f]_t$ and the second term converges to $-\sum_{t \in C_1} \frac{1}{f(t-)^2} (\Delta f(t))^2$. Using (6.4) and the fact that the remainder term satisfies $|r(x,y)| \le R(|y-x|)(y-x)^2$ we have that the last term is bounded by $R(\epsilon)[f]_T$. Finally, letting $\epsilon \downarrow 0$ we conclude that $V_H(f, P_\infty) = \lim_n V_H(f, P^{(n)})$ exists and (6.3) follows.

Corollary 6.5. $V_{H^R}(f, P_\infty) = \int_{(0,T]} f(t-)^{-2}\,d[f]_t$ and $V_{H^L}(f, P_\infty) = [\log f]_T$.

Combining (6.1) with Theorem 6.2 and Lemma 6.4 it follows that for a path of finite quadratic variation and $\psi$ a twice continuously differentiable function with $\psi(f(0)) = 0$,

$$V_H(f, P_\infty) \ge \psi(f(T)) - \int_0^T \psi'(f(t-))\,df(t) . \tag{6.7}$$

The left hand side is the payoff of the variance swap in the continuous limit. The expression on the right can be interpreted as the payoff of a semi-static hedging strategy $(\psi, -\psi')$ under continuous trading. From Definition 2.10, for each of the partitions in the sequence we have that the price of the semi-static hedge is

$$\int_0^\infty \psi(x)\,\mu(dx) = \int_{f(0)}^\infty \psi''(x)\,C_\mu(x)\,dx + \int_0^{f(0)} \psi''(z)\,(C_\mu(z) + f(0) - z)\,dz . \tag{6.8}$$

Since this value does not depend on the partition, in the continuous-time setting we define the price of the sub-hedge $(\psi, -\psi')$ to also be the expression given in (6.8).

Corollary 6.6.

Suppose $H$ is an increasing variance swap kernel. A model-independent lower bound on the price of the continuous time limit of the variance swap with payoff $V_H(f)$ is

$$\sup_\kappa \int_0^\infty \psi_\kappa(x)\,\mu(dx) = \int_0^\infty \psi_{\alpha_\mu}(x)\,\mu(dx) \tag{6.9}$$

where $\alpha_\mu$ is the quantity which arises in the Perkins embedding (Theorem 3.2).

Proof. For any decreasing function $\kappa \in \mathcal{K}_c$ we can construct $\psi_\kappa$ such that $\int_0^\infty \psi_\kappa(x)\,\mu(dx)$ is the price of a sub-hedge for $V_H$ for any partition, and this continues to hold in the continuous-time limit. Moreover, by optimising over $\kappa$ we obtain a bound $\int_0^\infty \psi_{\alpha_\mu}(x)\,\mu(dx)$ which is the best bound of this form by Proposition 5.2. Note that even if $\alpha_\mu$ is not in class $\mathcal{K}_c$, by Corollary 5.3 we can approximate it from above by a sequence of elements of class $\mathcal{K}_c$ such that in the limit we obtain the price $\int_0^\infty \psi_{\alpha_\mu}(x)\,\mu(dx)$ as a bound.

Our goal now is to show that this is a best bound in general, and not just an optimal bound based on inequalities such as (6.1) for $\psi \equiv \psi_\kappa$ and $\kappa$ a decreasing function. We do this by showing that there is a consistent model for which the price of the continuously monitored variance swap is equal to $\int_0^\infty \psi_{\alpha_\mu}(x)\,\mu(dx)$.

Theorem 6.7.

There exists a consistent model such that

$$V_H((X_t)_{0 \le t \le T}, P_\infty) = \psi_{\alpha_\mu}(X_T) - \int_0^T \psi_{\alpha_\mu}'(X_{s-})\,dX_s . \tag{6.10}$$

Proof. Recall Definition 2.12 and note that we are given a set of call prices and that in constructing a consistent model we are free to design an appropriate probability space $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ as well as a stochastic process $(X_t)_{t \ge 0}$.

Suppose we are given call prices $C(x) = C_\mu(x)$ for some $\mu$. Let $(\Omega, \mathcal{G}, \mathbb{G} = (\mathcal{G}_t)_{0 \le t \le T}, \mathbb{P})$ support a Brownian motion $(W_u)_{u \ge 0}$ with initial value $W_0 = f(0) = \int_{\mathbb{R}_+} x\,\mu(dx)$, and suppose $\mathcal{G}_0$ contains a $U[0,1]$ random variable which is independent of $W$. (This last condition is necessary purely to ensure that the Perkins embedding of $\mu$ can be defined when $\mu$ has an atom at $f(0)$. If $\mu$ has no atom at $f(0)$ then we may take $\mathcal{G}_0$ to be trivial.)

Let $\tau^P_\mu$ be the Perkins embedding of $\mu$ in $W$. Write $S$ for the maximum process of $W$ so that $S_u = \max_{v \le u} W_v$. Write $H_x$ for the first hitting time by $W$ of $x$. Let $(\Lambda(t))_{0 \le t \le T}$ be a strictly increasing continuous function with $\Lambda(0) = f(0)$ and $\lim_{t \uparrow T} \Lambda(t) = \infty$. Now define the left-continuous process $\tilde{X} = (\tilde{X}_t)_{0 \le t \le T}$ via

$$\tilde{X}_t = \begin{cases} \Lambda(t) & H_{\Lambda(t)} \le \tau^P_\mu, \\ W_{\tau^P_\mu} & \tau^P_\mu < H_{\Lambda(t)} . \end{cases}$$

Note that the condition $H_{\Lambda(t)} \le \tau^P_\mu$ can be re-written as $\Lambda(t) \le S_{\tau^P_\mu}$, or equivalently $t \le \Lambda^{-1}(S_{\tau^P_\mu})$. Define also $\tilde{\mathcal{F}}_t = \mathcal{G}_{\bar{H}_{\Lambda(t)}}$. Then $\tilde{X}$ is adapted to the filtration $\tilde{\mathbb{F}} = (\tilde{\mathcal{F}}_t)_{0 \le t \le T}$ and $\tilde{X}$ is a $\tilde{\mathbb{F}}$-martingale for which $\tilde{X}_T = W_{\tau^P_\mu} \sim \mu$.

In order to construct a right-continuous martingale with the same properties, for $t < T$ we set $\mathcal{F}_t = \cap_{u > t} \tilde{\mathcal{F}}_u$ and $X_t = \lim_{u \downarrow t} \tilde{X}_u$, and for $t = T$ we set $\mathcal{F}_T = \tilde{\mathcal{F}}_T$ and $X_T = \tilde{X}_T$. Then $X$ is a right-continuous $\mathbb{F}$-martingale such that $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ is a consistent model.

Now we want to show that for this model (6.10) holds path-wise. Writing $\psi$ for $\psi_{\alpha_\mu}$, and $X_t$ as shorthand for each $X_t(\omega)$, we have for each $\omega$

$$\begin{aligned}
\psi(X_T) - \int_0^T \psi'(X_{t-})\,dX_t &= \psi(W_{\tau^P_\mu}) - \int_{t=0}^{\Lambda^{-1}(S_{\tau^P_\mu})} \psi'(\Lambda(t))\,d\Lambda(t) - \psi'(S_{\tau^P_\mu})\,(W_{\tau^P_\mu} - S_{\tau^P_\mu}) \\
&= \psi(W_{\tau^P_\mu}) - \int_{f(0)}^{S_{\tau^P_\mu}} \psi'(u)\,du - \psi'(S_{\tau^P_\mu})\,(W_{\tau^P_\mu} - S_{\tau^P_\mu}) \\
&= \psi(W_{\tau^P_\mu}) - \psi(S_{\tau^P_\mu}) - \psi'(S_{\tau^P_\mu})\,(W_{\tau^P_\mu} - S_{\tau^P_\mu}) .
\end{aligned}$$

Either $W_{\tau^P_\mu} = S_{\tau^P_\mu}$, in which case this expression is equal to $0$, or $W_{\tau^P_\mu} = \alpha_\mu(S_{\tau^P_\mu})$ and then the expression becomes $\psi(\alpha_\mu(s)) - \psi(s) - \psi'(s)(\alpha_\mu(s) - s) \equiv H(s, \alpha_\mu(s))$ at $s = S_{\tau^P_\mu}$, using Definition 4.2. In either case the right hand side of (6.10) is $H(S_{\tau^P_\mu}, W_{\tau^P_\mu})$.

For the left hand side of (6.10), $[X]^c_T = 0$ and $(\Delta X_u)^2 = (S_{\tau^P_\mu} - W_{\tau^P_\mu})^2\,\mathbf{1}_{\{u = \Lambda^{-1}(S_{\tau^P_\mu})\}}\,\mathbf{1}_{\{W_{\tau^P_\mu} \ne S_{\tau^P_\mu}\}}$, so that from (6.3), $V_H(f, P_\infty) = H(S_{\tau^P_\mu}, W_{\tau^P_\mu})$. Hence (6.10) holds path-wise.

Corollary 6.8.

Suppose $H$ is an increasing variance swap kernel. Then the highest model-independent lower bound on the price of a variance swap is given by the expression in (6.9).

Corollary 6.9. If $\Phi(u,y)$ does not depend on $y$ then the corresponding variance swap is perfectly replicable by $(\psi, -\psi')$. For all consistent models the variation swap has price $\int_{\mathbb{R}_+} \psi(x)\,\mu(dx)$.

Example 6.10.

Recall the definitions of the kernels $H^B$ and $H^Q$ and Example 4.6. $\Phi^B(u,y) = 2u^{-2}$ and so $\psi'(u) = -2/u$ and $\psi(u) = -2\log(u)$. Thus $H^B(x,y) = \psi(y) - \psi(x) - \psi'(x)(y-x)$ and the strategy $(\psi, -\psi')$ replicates the payoff perfectly for any price realisation. The observation that $H^B$ has one model-independent price was first made by Bondarenko in [3]. Similarly, $H^Q(x,y) = \psi(y) - \psi(x) - \psi'(x)(y-x)$, where $\psi(x) = x^2$. An alternative analysis of these two payoffs is due to Neuberger [26]. Neuberger introduces the aggregation property. Translated into the notation of our setting, a kernel enjoys the aggregation property if $E[V_H(X, P^{(n)})] = E[H(X_T - X_0)]$. Both Bondarenko [3] and Neuberger [26] advocate the use of $H^B$ due to the fact that its price is not sensitive to the price path, but only to the value of $X_T$.

To date we have worked with forward prices. This has the implication that the dynamic part of a hedging strategy has zero cost. In this section we outline how our analysis can be extended to non-zero, but deterministic, interest rates.

Suppose that interest rates are deterministic. Let $D_t = D_t(T)$ be the discount factor over $[t,T]$, so that the asset price realisation $s = (s_t)_{0 \le t \le T}$ and the forward price realisation are related by $s(t) = D_t f(t)$. In the case of constant interest rates $D_t(T) = e^{-r(T-t)}$, so that $s(t) = e^{-r(T-t)} f(t)$.

Let $P$ be a partition of $[0,T]$. For $k \in \{0, 1, \ldots, N-1\}$ write $s_k = s(t_k)$, $f_k = f(t_k)$ and $D_k = D_{t_k}(T)$. Set $D_{k,k+1} = D_{k+1}/D_k$. Note that if interest rates are non-negative then $D_{k,k+1} \ge 1$.

Let $G$ be the kernel of a variation swap and write $G_k(x,y) = G(D_k x, D_k y)$. Then the payoff of the variance swap is given by

$$V_G(s, P) = \sum_{k=0}^{N-1} G(D_k f_k, D_{k+1} f_{k+1}) = \sum_{k=0}^{N-1} G_k(f_k, D_{k,k+1} f_{k+1}) .$$

Proposition 7.1.

Suppose that there exists a variation swap kernel $H$, functions $\eta$, $\epsilon$, $B$ and a constant $A \in \mathbb{R}$ such that for all $D > 0$,

$$G_k(x, yD) \ge A H(x,y) + \eta(y) - \eta(x) + \epsilon(x,k,D)(y-x) + B(k,D) . \tag{7.1}$$

Without loss of generality we may take $\eta(f(0)) = 0$. Suppose that there exists a semi-static sub-hedging strategy $(\psi, \Delta)$ for the variation swap with kernel $H$. Then

$$V_G(s, P) \ge (A\psi + \eta)(f(T)) + \sum_k \left[ \epsilon(f_k, k, D_{k,k+1}) + A\,\delta_{t_k}((f(t))_{t \le t_k}) \right] (f_{k+1} - f_k) + \sum_k B(k, D_{k,k+1}),$$

and there is a model-independent sub-hedge and price lower bound for $V_G$.

Proof. We have

$$\begin{aligned}
V_G(s, P) &= \sum_{k=0}^{N-1} G_k(f_k, D_{k,k+1} f_{k+1}) \\
&\ge \sum_k \left[ A H(f_k, f_{k+1}) + \eta(f_{k+1}) - \eta(f_k) + \epsilon(f_k, k, D_{k,k+1})(f_{k+1} - f_k) + B(k, D_{k,k+1}) \right] \\
&\ge A \left[ \psi(f(T)) + \sum_k \delta_{t_k}((f(t))_{t \le t_k})(f_{k+1} - f_k) \right] + \eta(f(T)) \\
&\qquad + \sum_k \epsilon(f_k, k, D_{k,k+1})(f_{k+1} - f_k) + \sum_k B(k, D_{k,k+1}) .
\end{aligned}$$

Remark 7.2. If we are content to assume that interest rates are non-negative then we only need (7.1) to hold for $D \ge 1$.

Remark 7.3. The price for the floating leg associated with the hedge is the price of the static vanilla portfolio with payoff $(A\psi + \eta)(f(T))$ plus the constant $\sum_{k=0}^{N-1} B(k, D_{k,k+1})$.

Corollary 7.4.

Suppose $H$ is an increasing variance kernel, and $\psi$ is of Class $\mathcal{K}$. If (7.1) holds then we have a path-wise sub-hedge and a model independent bound on the price of $V_G$.

In the setting of increasing or decreasing variance kernels the bound in (7.2) will be tight provided $(\psi, -\psi')$ is a tight semi-static hedge for $V_H(f, P)$ and there is equality in Equation (7.1).

Example 7.5.

Suppose $G(x,y) = H^R(x,y) = \frac{(y-x)^2}{x^2}$. Then $G_k(x,y) = G(x,y)$, so that $\epsilon(x,k,D)$ and $B(k,D)$ will not depend on $k$. Moreover,

$$G(x, yD) = \frac{1}{x^2} (Dy - Dx + Dx - x)^2 = D^2 \left( \frac{y-x}{x} \right)^2 + \frac{2D(D-1)}{x}(y-x) + (D-1)^2 .$$

Suppose that interest rates are non-negative so that $D_{k,k+1} \ge 1$. Then (7.1) holds for $A = 1$, $\eta = 0$, $\epsilon(x,D) = 2D(D-1)/x$ and $B(D) = (D-1)^2$.

Note that there is an inequality in (7.1) for $A = 1$. If $D_{k,k+1}$ is independent of $k$ (the natural example is to assume that interest rates are constant and the partition is uniform, in which case $d = \log D_{k,k+1} = rT/N$) then we can have equality by taking $A = D_{k,k+1}^2 = e^{2rT/N}$. In that case we have an improved bound, but the improvement becomes negligible in the limit $N \uparrow \infty$.

Example 7.6.

Suppose $G(x,y) = H^L(x,y) = (\log(y) - \log(x))^2$. Then $G_k(x,y) = G(x,y)$ and

$$G(x, yD) = (\log D + \log y - \log x)^2 = H^L(x,y) + 2\log D\,(\log y - \log x) + (\log D)^2 .$$

Suppose now that the partition is such that $D_{k,k+1}$ is independent of $k$, and set $d = \log D_{k,k+1}$. Then Equation (7.1) holds with equality for $A = 1$, $\eta(y) = 2d \log y$, $\epsilon = 0$ and $B(D) = d^2$.

Example 7.7. Suppose $G(x,y) = H^B(x,y) = -2\left\{ (\log y - \log x) - (y/x - 1) \right\}$. Then $G_k(x,y) = G(x,y)$ and

$$G(x, yD) = -2(\log y - \log x + \log D) + \frac{2D}{x}(y-x) + 2(D-1) = H^B(x,y) + 2(D-1)(y/x - 1) + H^B(1, D) .$$

Then Equation (7.1) holds with equality for $A = 1$, $\eta(y) = 0$, $\epsilon(x,D) = 2(D-1)/x$, $B(D) = H^B(1, D)$.

We can consider the limit as the partition becomes dense, in which case the bounds for the variance swap become tight. For definiteness we will assume that we have a sequence of uniform partitions with mesh size tending to zero, and that interest rates are constant, though this can be weakened for the squared return and Bondarenko kernels.

Then, for each of the three examples above we have that $\sum_{k=0}^{N-1} B(k, D_{k,k+1}) = N B(e^{rT/N}) \to 0$. Further, in each case $\eta(y) \to 0$, and $A = 1$. Then in the limit the lower bound on the price of the variance swap based on the price realisation $s$ is the same as the upper and lower bounds for the variance swap defined relative to the forward price $f$. Thus, for variance swaps based on frequent monitoring, the bounds we have calculated in earlier sections based on the forward price may also be used for undiscounted price processes.

Corollary 7.8.

Suppose there exist $H$, $\eta$, $\epsilon$, $B$, and $A$ such that

$$G_k(x, yD) \le A H(x,y) + \eta(y) - \eta(x) + \epsilon(x,k,D)(y-x) + B(k,D), \tag{7.2}$$

and suppose that there exists a semi-static super-hedging strategy $(\psi, \Delta)$ for the variation swap with kernel $H$. Then there is a corresponding model-independent super-hedge and price upper bound for $V_G$.

The analysis of the kernels $H^R$, $H^L$, $H^B$ and upper bounds is similar to that in Examples 7.5 to 7.7 above. For the kernel $H^B$, the choices listed in Example 7.7 give equality in (7.2) and can be used equally for upper bounds. Provided that we have an upper bound for $D_{k,k+1}$, so that $D_{k,k+1} \le \bar{D}$ uniformly in $k$, for the kernel $H^R$ we may take $A = \bar{D}^2$, $\eta = 0$, $\epsilon(x,D) = 2D(D-1)/x$ and $B(D) = (D-1)^2$. Finally, for $H^L$, provided interest rates are non-negative, we can write

$$G(x, yD) = H^L(x,y) + 2\log D\,(\log y - \log x) + (\log D)^2 \le H^L(x,y) + \frac{2\log D}{x}(y-x) + (\log D)^2,$$

so that (7.2) holds for $A = 1$, $\eta = 0$, $\epsilon(x,D) = 2(\log D)/x$ and $B(D) = (\log D)^2$. Note that, unlike for the lower bound in Example 7.6, for the upper bound we do not need to assume that $D_{k,k+1}$ is independent of $k$.

Remark 7.9. In his analysis of lower bounds for the kernel $H^L$, Kahalé [23] does not need to assume the partition is uniform and that interest rates are constant (or more generally that $D_{k,k+1}$ is constant), and can allow for arbitrary finite partitions and deterministic interest rates. Our results complement his results nicely. Although we need the assumption that $D_{k,k+1}$ is constant to recover Kahalé's result in the setting of lower bounds and the kernel $H^L$, in all other cases of study (upper bounds for $V_{H^L}$ and upper and lower bounds for $V_{H^R}$ and $V_{H^B}$) our methods also allow for arbitrary partitions and non-constant but deterministic interest rates.

Numerical Results

Given a continuum of call prices, it is possible to calculate the model independent bounds for the prices of variance swaps. When the implied terminal distribution of the asset price is simple it is sometimes possible to calculate the monotone functions associated with the Perkins embedding explicitly (see Example 5.4) and to obtain a closed form integral expression for the model independent upper and lower bounds. For more realistic and complex target laws, the monotone functions and bounds can still be calculated numerically. The case when the terminal law is lognormally distributed is of particular practical interest.

A standard time frame for a volatility swap is 30 days or one month ($T = 1/12$). Figure 3 shows the bounds for the kernels $H^R$ and $H^L$ relative to the cost of $-2\log$ contracts, as functions of volatility:

$$\sigma \mapsto \frac{E[\psi_{\kappa,H}(X_{\sigma/\sqrt{12}})]}{E[-2 \log X_{\sigma/\sqrt{12}}]} \quad \text{and} \quad \sigma \mapsto \frac{E[\psi_{\ell,H}(X_{\sigma/\sqrt{12}})]}{E[-2 \log X_{\sigma/\sqrt{12}}]},$$

where $X_\sigma \equiv e^{\sigma N - \sigma^2/2}$ is the lognormal random variable with volatility parameter $\sigma$ and $H = H^R$ or $H^L$. Here, $\psi_{K,H}$ is the function given in Definition 4.2 and $\kappa$ is chosen according to Proposition 5.2 (with $\ell$ chosen similarly). Thus the upper bound for the kernel $H^L$ and the lower bound for the kernel $H^R$ correspond to the decreasing function $\kappa$ associated with the Perkins embedding, while the other two bounds are constructed with the increasing function $\ell$ associated with the reversed Perkins embedding.

Note that the price of a variance swap in the Black-Scholes model (as given by $E[-2 \log X_{\sigma\sqrt{T}}] = \sigma^2 T$) is an increasing function of volatility. The upper and lower bounds are also increasing functions of volatility and, as can be seen in the figure, they also become wider as volatility increases, when expressed as a ratio against the no-jump case. For reasonable values of volatility, and for both kernels, the impact of jumps is to affect the price by a factor of less than two, and for the kernel $H^L$ the bounds are even tighter.
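The Black-Scholes benchmark used to normalise the bounds can be verified directly. The sketch below assumes the benchmark is the cost of $-2\log$ contracts, $E[-2 \log X_{\sigma\sqrt{T}}]$, with $X_s = e^{sN - s^2/2}$ and $N$ standard normal, and evaluates the expectation by a simple midpoint rule. It covers only the denominator of the ratios; the numerators require the monotone functions of the Perkins embedding, which are not reproduced here. All function names are illustrative.

```python
import math


def lognormal_unit_mean(s, n):
    """X_s = exp(s*n - s^2/2) for a standard normal value n, so E[X_s] = 1."""
    return math.exp(s * n - 0.5 * s * s)


def expect_normal(f, lo=-10.0, hi=10.0, steps=20000):
    """Approximate E[f(N)], N ~ N(0, 1), by the midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        n = lo + (i + 0.5) * h
        total += f(n) * math.exp(-0.5 * n * n)
    return total * h / math.sqrt(2.0 * math.pi)


def log_contract_cost(sigma, T=1.0 / 12.0):
    """Cost of the -2 log contract under the lognormal law with variance sigma^2 * T."""
    s = sigma * math.sqrt(T)
    return expect_normal(lambda n: -2.0 * math.log(lognormal_unit_mean(s, n)))


if __name__ == "__main__":
    for sigma in (0.1, 0.2, 0.4):
        # The two printed values should agree: numerical cost versus sigma^2 * T.
        print(sigma, log_contract_cost(sigma), sigma ** 2 / 12.0)
```

Since $-2 \log X_s = s^2 - 2sN$, the expectation equals $s^2 = \sigma^2 T$ exactly, which is why the benchmark is increasing in volatility.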
The observation that the bounds for the kernel $H^R$ are wider than those for the kernel $H^L$ is partly explained by considering the leading term in the expansion of the hedging error (see Section 3.2). We have $J^R(x) \approx 2x^3/3$ and $J^L(x) \approx -x^3/3$, so that the magnitude of the leading error term for $H^R$ is twice that of the leading error term for $H^L$. Note that for the optimal martingales the jumps are not local, so this approximation becomes less relevant as $\sigma$ increases.

Figure 3: Model independent upper and lower bounds for the prices of variance swaps based on the kernels $H^L$ (solid lines) and on $H^R$ (dashed lines) relative to the price of $-2\log$ contracts. Here $T = 1/12$ and we work with variance swaps on forward prices.
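The factor of two between the leading error terms for $H^R$ and $H^L$ can be illustrated numerically. The sketch below rests on assumptions that are not restated in this section: the kernels are taken to be $H^R(x,y) = ((y-x)/x)^2$, $H^L(x,y) = (\log y - \log x)^2$ and $H^B(x,y) = -2(\log(y/x) - (y/x - 1))$, and the hedging error of a kernel is measured as its difference from $H^B$ for a small relative jump $u = y/x - 1$. The function names are hypothetical.

```python
import math


def h_r(x, y):
    """Assumed kernel H^R: squared simple return."""
    return ((y - x) / x) ** 2


def h_l(x, y):
    """Assumed kernel H^L: squared log return."""
    return math.log(y / x) ** 2


def h_b(x, y):
    """Assumed kernel H^B, taken here as the exactly replicable reference."""
    return -2.0 * (math.log(y / x) - (y / x - 1.0))


def cubic_coefficients(u):
    """Return ((H^R - H^B) / u^3, (H^L - H^B) / u^3) for a relative jump u."""
    x, y = 1.0, 1.0 + u
    return ((h_r(x, y) - h_b(x, y)) / u ** 3,
            (h_l(x, y) - h_b(x, y)) / u ** 3)


if __name__ == "__main__":
    for u in (0.1, 0.01, 0.001):
        # As u shrinks the pair should approach (2/3, -1/3).
        print(u, cubic_coefficients(u))
```

Under these assumptions the two coefficients tend to $2/3$ and $-1/3$ as $u \to 0$, so the errors have opposite signs and the $H^R$ error is roughly twice as large in magnitude. For larger jumps the higher-order terms matter, in line with the remark that the approximation becomes less relevant as $\sigma$ increases.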

This article developed from an attempt to express the results of Kahalé [23] on no-arbitrage lower bounds for the prices of variance swaps in the framework of model-independent hedging, in which extremal models and prices are associated with extremal solutions of the Skorokhod embedding problem. Beginning with Hobson [18], the focus in this literature is on hedging, and on finding pathwise inequalities relating the payoff of the exotic, path-dependent derivative and the payoff of a static vanilla call portfolio combined with the gains from trade from an investment in the underlying security. In the context of variance swaps we find that the lower bound is associated with a martingale price process which can be expressed as a time-change of the Perkins solution of the Skorokhod embedding problem. This embedding has appeared previously in finance in the construction of model-independent bounds for the prices of barrier options (Brown et al. [6]).

We approach the problem of finding hedging strategies in a more general setting than Kahalé [23] in that we consider a variety of kernels in the definition of the variance swap. The ability to consider general kernels allows us to emphasise the dependence of the payoff on the presence and character of the jumps, and to show that the nature of this dependence is strongly influenced by the form of the kernel. Bondarenko [3] and Neuberger [26] argue that the finance industry should consider defining variance swaps using the kernel $H^B$, as then they can be replicated perfectly, even in the presence of jumps (recall Example 6.10). The counterargument is that variance swaps provide value precisely because they are not redundant in this way. Sophisticated investors want to be able to take positions on the likely presence and direction of jumps. This is possible if the variance swap is defined using the kernel $H^R$ or $H^L$, but not using $H^B$.

Kahalé [23] only considers the kernel $H^L$, and lower bounds and sub-replicating strategies. On the other hand he works directly with the undiscounted asset price, and does not give special attention to contracts written on the forward price. He introduces the class of $V$-convex functions which have the property that each such function gives a lower bound on the price of the variance swap, and an associated sub-hedge. He then proceeds to show that functions $\psi$ of Class L (in our notation) are $V$-convex. In this way he can deduce a lower bound on the price of a variance swap. Further, for a particular choice of decreasing function he can show that this lower bound can be attained in the continuous time limit under a well-chosen stochastic model; hence the bound he attains must be a best bound.

In contrast, initially we consider contracts based on the forward price. This simplifies the analysis significantly and reduces the search for candidate sub-hedge payoffs to a search for functions satisfying (4.3). The condition (4.3) is considerably simpler than the corresponding condition for $V$-convexity in Kahalé [23, Equation (3.1)]. The fact that we have a more transparent representation of the key property allows us to find candidate super-hedge payoffs quite easily and allows us to extend the analysis to general variation swap kernels provided they have a monotonicity property. Moreover, we can easily develop upper bounds to complement the lower bounds. Only later do we introduce interest rates and variance swaps written on the undiscounted asset price, at which point we find simple inequalities which extend our bounds to the general case. In the limit of a dense sequence of partitions the same bounds are optimal in both the undiscounted and forward price settings.
We believe that the two-stage approach brings insight, not least because in the forward case there is a direct link to martingales and solutions of the Skorokhod embedding problem, and because inequalities such as (7.1) allow us to quantify the price difference between contracts written on the undiscounted and forward prices for discrete monitoring.

A further contribution of this article is to provide a derivation of bounds on the prices of variance swaps without any recourse to probability. This involves construction of a class of hedges parameterised by monotone functions, and the choice of an optimal element in this class for a given set of call prices, together with Föllmer's non-probabilistic Itô calculus. Price trajectories for which the bound is path-wise tight have at most one jump, after which the trajectory is constant. Probability is only required to show that these trajectories correspond to a stochastic model for the price process. The relationship between the optimality of the cheapest hedge, derived in a purely non-probabilistic fashion, and the optimality of the Perkins embedding provides a pleasing completeness to the story.

References

[1] J. Azéma and M. Yor. Une solution simple au problème de Skorokhod. In Séminaire de Probabilités, XIII (Univ. Strasbourg, Strasbourg, 1977/78), volume 721 of Lecture Notes in Math., pages 90–115. Springer, Berlin, 1979.
[2] A. Bick and W. Willinger. Dynamic spanning without probabilities. Stochastic Processes and their Applications, 50:349–374, 1994.
[3] O. Bondarenko. Variance trading and market price of variance risk. Working paper, 2007.
[4] D.T. Breeden and R.H. Litzenberger. Prices of state-contingent claims implicit in option prices. J. Business, 51:621–651, 1978.
[5] M. Broadie and A. Jain. The effect of jumps and discrete sampling on volatility and variance swaps. Int. J. of Th. and App. Finance, 11(8):761–791, 2008.
[6] H. Brown, D.G. Hobson, and L.C.G. Rogers. Robust hedging of barrier options. Math. Finance, 11(3):285–314, 2001.
[7] P. Carr and A. Corso. Covariance contracting for commodities. Energy and Power Risk Management, April:42–45, 2001.
[8] P. Carr and R. Lee. Volatility derivatives. Annual Rev. Financ. Econ., 1:313–339, 2009.
[9] P. Carr and R. Lee. Variation and share-weighted variation swaps on time-changed Lévy processes. Preprint, 2010.
[10] P. Carr, R. Lee, and L. Wu. Variance swaps on time-changed Lévy processes. Preprint, 2010.
[11] A. Cox and J. Wang. Root's barrier: construction, optimality and applications to variance options. Preprint, 2011.
[12] M.H.A. Davis and D.G. Hobson. The range of traded option prices. Mathematical Finance, 17(1):1–14, 2007.
[13] K. Demeterfi, E. Derman, M. Kamal, and J. Zou. A guide to volatility and variance swaps. The Journal of Derivatives, 6(4):9–32, 1999.
[14] K. Demeterfi, E. Derman, M. Kamal, and J. Zou. More than you ever wanted to know about volatility swaps. Goldman-Sachs Quantitative Strategies Research Notes, 1999.
[15] L.D. Dubins and D. Gilat. On the distribution of maxima of martingales. Proceedings of the American Mathematical Society, 68:337–338, 1978.
[16] B. Dupire. Arbitrage pricing with stochastic volatility. Société Générale, Options Division, Paris, 1992.
[17] H. Föllmer. Calcul d'Itô sans probabilités. In Séminaire de Probabilités, XV (Univ. Strasbourg, Strasbourg, 1981), volume 15 of Lecture Notes in Math., pages 143–150. Springer, Berlin, 1981.
[18] D.G. Hobson. Robust hedging of the lookback option. Finance and Stochastics, 2:329–347, 1998.
[19] D.G. Hobson. The Skorokhod embedding problem and model independent bounds for option prices. In Paris-Princeton Lecture Notes on Mathematical Finance. Springer, 2010.
[20] D.G. Hobson and M. Klimmek. Maximising functionals of the joint law of the maximum and terminal value in the Skorokhod embedding problem. Preprint, 2010.
[21] D.G. Hobson and J.L. Pedersen. The minimum maximum of a continuous martingale with given initial and terminal laws. Ann. Probab., 30(2):978–999, 2002.
[22] R. Jarrow, Y. Kchia, M. Larsson, and P. Protter. Discretely sampled variance and volatility swaps versus their continuous approximations. Preprint, 2011.
[23] N. Kahalé. Model-independent lower bound on variance swaps. Preprint, 2011.
[24] I. Martin. Simple variance swaps. Preprint, 2011.
[25] A. Neuberger. The Log Contract. Journal of Portfolio Management, 20(2):74–80, 1994.
[26] A. Neuberger. Realized skewness. Working paper, 2010.
[27] E. Perkins. The Cereteli-Davis solution to the $H^1$-embedding problem and an optimal embedding in Brownian motion. In Seminar on Stochastic Processes, 1985 (Gainesville, Fla., 1985), pages 172–223. Birkhäuser Boston, Boston, MA, 1986.
[28] E. Platen and L. Chang. A cautious note on the design of volatility derivatives. ArXiv, http://arxiv.org/abs/1007.2968v1, July 2010.
[29] A.V. Skorokhod. Studies in the theory of random processes. Translated from the Russian by Scripta Technica, Inc. Addison-Wesley Publishing Co., Inc., Reading, Mass., 1965.
[30] S. Zhu and G-H. Lian. A closed-form exact solution for pricing variance swaps with stochastic volatility.