Cover's Rebalancing Option With Discrete Hindsight Optimization
Alex Garivaltis ∗ March 5, 2019
Abstract
We study T. Cover’s rebalancing option (Ordentlich and Cover 1998) under discrete hindsight optimization in continuous time. The payoff in question is equal to the final wealth that would have accrued to a $1 deposit into the best of some finite set of (perhaps levered) rebalancing rules determined in hindsight. A rebalancing rule (or fixed-fraction betting scheme) amounts to fixing an asset allocation (e.g. 200% stocks and -100% bonds) and then continuously executing rebalancing trades to counteract allocation drift.
Restricting the hindsight optimization to a small number of rebalancing rules (e.g. 2) has some advantages over the pioneering approach taken by Cover & Company in their brilliant theory of universal portfolios (1986, 1991, 1996, 1998), where one’s on-line trading performance is benchmarked relative to the final wealth of the best unlevered rebalancing rule of any kind in hindsight. Our approach lets practitioners express an a priori view that one of the favored asset allocations (“bets”) b ∈ {b_1, ..., b_n} will turn out to have performed spectacularly well in hindsight. In limiting our robustness to some discrete set of asset allocations (rather than all possible asset allocations) we reduce the price of the rebalancing option and guarantee to achieve a correspondingly higher percentage of the hindsight-optimized wealth at the end of the planning period.
A practitioner who lives to delta-hedge this variant of Cover’s rebalancing option through several decades is guaranteed to see the day that his realized compound-annual capital growth rate is very close to that of the best b_i in hindsight. Hence the point of the rock-bottom option price.
Keywords:
Continuously-Rebalanced Portfolios, Adaptive Asset Allocation, Kelly Criterion, Universal Portfolios, Lookback Options, Exchange Options, Rainbow Options, On-Line Portfolio Selection, Robust Procedures
∗ Assistant Professor of Economics, Northern Illinois University, 514 Zulauf Hall, DeKalb IL 60115. E-mail: [email protected]. ORCID: 0000-0003-0944-8517.
arXiv: [q-fin.PM]
JEL Classification:
C44, D80, D81, G11
Introduction
The main alternative to the Markowitz (1952) mean-variance theory of portfolio selection was popularized by Kelly (1956), who sought to optimize a gambler’s asymptotic continuously-compounded capital growth rate in repeated bets on horse races in the presence of partial inside information. His reasoning is in fact applicable to all gambling, insurance, and investment problems. Rather than optimize the static reward per unit of risk, the Kelly Criterion (Poundstone 2010) is equivalent to the prescription that one should act each round so as to maximize the expected log of his capital. Breiman (1961) showed that the Kelly Criterion constitutes asymptotically dominant behavior: a Kelly gambler will almost surely beat any other gambler in the long run by an exponential factor, and he has the shortest expected hitting time for a distant wealth goal. With probability approaching 1 as time goes on, the Kelly gambler’s bankroll will (amusingly) overtake that of a mean-variance investor, who has a smooth ride but ultimately cannot “eat his Sharpe ratio.” The books by Cover and Thomas (2006) and Luenberger (1998) are excellent primers of the theory of asymptotic capital growth in discrete and continuous time, respectively. Thorp (cf. his 2017 biography) demonstrated the practical effectiveness of the Kelly Criterion when he used it to size his Blackjack bets in certain favorable situations that are identifiable via his trademark (1966) theory of card counting. In this connection, the correct behavior is to bet the fraction b* := p − q of your net worth on a given hand for which p is the chance of winning and q is the chance of losing.
For growth opportunities in the stock market, the analog of Kelly’s fixed-fraction betting scheme is a certain constant-rebalanced portfolio b* that trades continuously so as to maintain a target growth-optimal fraction of wealth in each risk asset.
For instance, rather than bet b := 2% of wealth on a (favorable) hand of Blackjack, one could bet 2% of wealth (or even b := 200% of wealth) on the S&P 500 index. In theory, if stock market returns are iid across (discrete) time then one can calculate the corresponding log-optimal portfolio directly from the return distribution. But in practice, equity investors must get along without complete knowledge of the return distribution. Thus, a real-world investor cannot measure the exact regret of his portfolio relative to the Kelly bet for the simple reason that he does not know the Kelly bet.
The way out of this conundrum was discovered by information theorist Thomas Cover (1938-2012), who formulated the individual sequence approach to investment. For a given observed sequence of asset prices, one can look back and determine which constant-rebalanced asset allocation would have yielded the greatest final wealth for that particular sequence. By definition, a Kelly gambler (who knows the distribution of returns but not the individual sequence that will occur in the future) will achieve a final wealth that is no greater than that of the best constant-rebalanced portfolio determined in hindsight for the actual sequence of returns. Thus began Cover’s important universal portfolio theory that formulated various on-line investment schemes (1986, 1991, 1996, 1998) that guarantee to achieve a high percentage of the final wealth of the best constant unlevered rebalancing rule (of any kind) in hindsight. Of course, any such scheme would then also guarantee to achieve a high percentage of the Kelly final wealth in iid stock markets.
One can consider Cover’s performance benchmark to be a financial derivative (“Cover’s rebalancing option”) whose final payoff is equal to the wealth that would have accrued to a $1 deposit into the best rebalancing rule (or fixed-fraction betting scheme) determined in hindsight. Ordentlich and Cover (1998) began the work of pricing this option in the Black-Scholes (1973) market at time-0 for unlevered hindsight optimization over a single underlying risk asset. Garivaltis (2018) priced and replicated the rebalancing option at any time t for levered hindsight optimization over an arbitrary number of correlated stocks in geometric Brownian motion. That paper obtained the elegant result that for completely relaxed (levered) hindsight optimization, the corresponding delta-hedging strategy simply looks back over the observed price history [0, t], computes the best rebalancing rule in hindsight b(S_t, t), and bets that fraction of wealth over the next differential time step [t, t + dt].
The present paper studies Cover’s rebalancing option with hindsight optimization over a discrete set B := {b_1, ..., b_n} of rebalancing rules. Apart from the scientific obligation to extend Ordentlich and Cover’s incisive (1998) chain of reasoning, our approach has some interesting advantages relative to hindsight optimization over all possible rebalancing rules. In our world, the (delta-hedging) practitioner is now free to express any of his institutional constraints or beliefs about future returns through a judicious choice of the set B. Our newly austere mode of hindsight optimization yields a rock-bottom option price and correspondingly better guarantees of relative performance at the end of the planning period, whose shortened length is now well within a human life span. Say, for robust betting on the S&P 500 index, the author himself is inclined to use B := {0, 0.5, 1, 1.5, 2}, which amounts to the following five (continuously-rebalanced) asset allocations:
(1) 0% stocks, 100% cash
(2) 50% stocks, 50% cash
(3) 100% stocks
(4) 150% stocks, −50% cash (margin loans)
(5) 200% stocks, −100% cash
[...] Cost of Achieving the Best [Rebalancing Rule] in Hindsight that would correspond (Garivaltis 2018) to B := R or even B := [0, 1]. [...] b > c of rebalancing rules and a single underlying risk asset. We price and replicate both the horizon-T and perpetual versions of the rebalancing option, and give performance simulations that illustrate the general behavior of the replicating strategy. Section 4 extends the methodology to general discrete sets of asset allocations. We show how the rebalancing option can be interpreted as a certain portfolio of Margrabe-Fischer (1978) exchange options, and derive the general replicating strategy, which is a time- and state-varying convex combination of the b_i. We close the paper by proving that American-style rebalancing options (with general exercise price K) are always “worth more alive than dead” in equilibrium.
We start in the Black-Scholes (1973) market with a single underlying stock whose price S_t follows the geometric Brownian motion
\[ \frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t, \tag{1} \]
where µ is the drift, σ is the volatility, and W_t is a standard Brownian motion. There is a risk-free bond whose price B_t := e^{rt} follows
\[ \frac{dB_t}{B_t} = r\,dt. \tag{2} \]
A constant rebalancing rule b ∈ (−∞, +∞) is a fixed-fraction betting scheme that continuously maintains the fraction b of wealth in the stock and the fraction 1 − b of wealth in bonds. We let V_t(b) denote the wealth at t that accrues to a $1 deposit into the rebalancing rule b. Thus, the trader holds ∆ := bV_t(b)/S_t shares of the stock at time t, and his remaining (1 − b)V_t(b) dollars are invested in bonds. Maintenance of the target asset allocation generally requires continuous trading. If 0 < b < 1, then when the risk asset outperforms cash over [t, t + dt] (i.e. when dS_t/S_t ≥ r dt) the trader must sell some shares to restore the target allocation. Similarly, when the risk asset underperforms cash over [t, t + dt] (i.e. when dS_t/S_t ≤ r dt) the trader must buy additional shares to restore the balance.
This amounts to a volatility harvesting scheme (cf. Luenberger 1998) that “lives off the fluctuations” of the underlying.
For b = 1 the trader just buys the stock and holds it; for b > 1 the trader borrows (b − 1)V_t(b) dollars at time t. A levered rebalancing rule b > 1 maintains a constant loan-to-value ratio of 1 − 1/b. Thus, when the stock rises (and debt is now a smaller percentage of assets) the trader will borrow against his new wealth to buy additional shares. Similarly, when the stock falls he must sell some shares to reduce the loan-to-value ratio. This “buy high, sell low” strategy is only appropriate for stocks with relatively high drift and low volatility. Finally, for low quality underlyings one can hold all cash (b = 0) or a continuously-rebalanced short position b < 0.
Consider a practitioner with two rebalancing rules b > c, who wants to perform well relative to the best of B := {b, c} in hindsight. Accordingly, we create for him the financial derivative whose final payoff at T is
\[ V_T^* := \max\{V_T(b),\, V_T(c)\}. \tag{3} \]
Ordentlich and Cover (1998) investigated the best unlevered rebalancing rule in hindsight, with payoff V*_T := max_{0≤b≤1} V_T(b). They found the time-0 price of this contingent claim to be
\[ C_0 = 1 + \sigma\sqrt{\frac{T}{2\pi}}. \tag{4} \]
The owner of this rebalancing option (cf. Garivaltis 2018) will compound his money at the same asymptotic rate as the best unlevered rebalancing rule in hindsight. Indeed, the final excess continuously-compounded growth rate of the best rebalancing rule in hindsight over that of the replicating strategy is log{1 + σ√(T/(2π))}/T, which tends to 0 as T → ∞. This growth rate spread obtains deterministically, regardless of the realized price path (S_t)_{0≤t≤T}.
Garivaltis (2018) extended the Ordentlich-Cover (1998) analysis by computing the general time-t price C(S, t) of Cover’s rebalancing option for both levered and unlevered hindsight optimization.
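The Ordentlich-Cover price (4) and the associated growth-rate spread are simple enough to tabulate. The following is a minimal sketch, not code from the paper; the function names and the volatility value σ = 0.2 are assumptions chosen purely for illustration.

```python
import math

def ordentlich_cover_price(sigma, T):
    """Time-0 price (4) of the best unlevered rebalancing rule in hindsight."""
    return 1.0 + sigma * math.sqrt(T / (2.0 * math.pi))

def excess_growth_rate(sigma, T):
    """Growth-rate spread log(C_0)/T between the hindsight-optimal rule
    and the replicating strategy (continuously compounded, per year)."""
    return math.log(ordentlich_cover_price(sigma, T)) / T

if __name__ == "__main__":
    sigma = 0.2  # hypothetical volatility, not a value taken from the paper
    for T in (1, 10, 25, 100):
        print(T, ordentlich_cover_price(sigma, T), excess_growth_rate(sigma, T))
```

The spread shrinks as the horizon grows because the price grows only like √T while the divisor grows like T.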
For levered hindsight optimization (with payoff V*_T := max_{b∈R} V_T(b)), Garivaltis (2018) found the general pricing formula
\[ C(S,t) = \sqrt{\frac{T}{t}}\,\exp\{rt + z_t^2/2\}, \tag{5} \]
where
\[ z_t := \frac{\log(S_t/S_0) - (r - \sigma^2/2)\,t}{\sigma\sqrt{t}} \tag{6} \]
is an auxiliary variable that is distributed unit normal with respect to the equivalent martingale measure Q. More generally, for a Black-Scholes market with d correlated stocks in geometric Brownian motion, Garivaltis (2018) found that
\[ C(S,t) = \left(\frac{T}{t}\right)^{d/2} \exp\{rt + z_t' R^{-1} z_t/2\}, \tag{7} \]
where R := [ρ_ij]_{d×d} is the correlation matrix of instantaneous returns,
\[ z_{it} := \frac{\log(S_{it}/S_{i0}) - (r - \sigma_i^2/2)\,t}{\sigma_i\sqrt{t}} \tag{8} \]
are auxiliary variables, and σ_i is the volatility of stock i. When we relax the hindsight optimization to include all levered rebalancing rules b ∈ R^d, replication becomes especially simple. At time t, one just looks back at the observed price history [0, t], finds the best (d-dimensional) rebalancing rule b(S, t) in hindsight, and bets the fraction b_i(S, t) of wealth on stock i over [t, t + dt]. The relation C(S, t; T) ∝ T^{d/2} matches the model-independent O(T^{d/2}) super-replicating price calculated by Cover & Company.
In what follows, we work toward reducing the option price √(T/t) · exp{rt + z_t²/2} by replacing B = R with B := {b, c}. In order to get the payoff max{V_T(b), V_T(c)} into a more practical form, we note that V_t(b) is a geometric Brownian motion, since
\[ \frac{dV_t(b)}{V_t(b)} = b\,\frac{dS_t}{S_t} + (1-b)\,\frac{dB_t}{B_t} = [r + (\mu - r)b]\,dt + b\sigma\,dW_t. \tag{9} \]
Solving this stochastic differential equation, we obtain (cf. Wilmott 1998, 2001)
\[ V_t(b) = \exp\{[r + (\mu - r)b - \sigma^2 b^2/2]\,t + b\sigma W_t\}. \tag{10} \]
In order to get the payoff in terms of the observable variable S_t (rather than the Wiener process W_t), we start with the equation
\[ S_t = S_0 \exp\{(\mu - \sigma^2/2)\,t + \sigma W_t\}, \tag{11} \]
and solve for σW_t in terms of S_t. Substituting the resulting expression into (10), we get
\[ V_t(b) = \exp\{(r - \sigma^2 b^2/2)\,t + b[\log(S_t/S_0) - (r - \sigma^2/2)\,t]\}. \tag{12} \]
We thus have
\[ V_t(b) = \exp\{(r - \sigma^2 b^2/2)\,t + b\sigma\sqrt{t}\cdot z_t\}, \tag{13} \]
where
\[ z_t := \frac{\log(S_t/S_0) - (r - \sigma^2/2)\,t}{\sigma\sqrt{t}} \tag{14} \]
is distributed unit normal with respect to the equivalent martingale measure Q. Note that the drift µ (which is difficult to estimate) does not appear in this formula. The final wealth of the rebalancing rule b is now expressed solely in terms of z_t, the risk-free rate r, the time t, and the volatility σ, which is easily estimated from high-frequency price data.
Before we can price the rebalancing option with payoff max{V_T(b), V_T(c)}, we must characterize the random outcomes under which b will turn out to outperform c over the interval [0, t]. Accordingly, we compare the exponents of V_t(b) and V_t(c) to obtain
Lemma 1.
For two given rebalancing rules b > c, b outperforms c over [0, t] if and only if
\[ z_t \ge \frac{b + c}{2}\,\sigma\sqrt{t}. \tag{15} \]
Proposition 1.
The best rebalancing rule (of any kind) in hindsight over [0, t], denoted b(S, t), is
\[ b(S,t) := \arg\max_{b \in \mathbb{R}} V_t(b) = \frac{z(S,t)}{\sigma\sqrt{t}}. \tag{16} \]
Given any closed set B of rebalancing rules, the best performer in hindsight is the b ∈ B that is nearest to b(S, t) = z(S, t)/(σ√t).
Proof. We compute the abscissa of the vertex of the parabola b ↦ log V_t(b). This yields
\[ b(S,t) = \arg\max_{b \in \mathbb{R}} \log V_t(b) = \frac{-\sigma\sqrt{t}\cdot z_t}{2(-\sigma^2 t/2)} = \frac{z_t}{\sigma\sqrt{t}}. \tag{17} \]
Because the graph of a parabola is symmetric about its vertex, the b ∈ B that maximizes the height of this parabola is whichever element of B is nearest to the vertex b(S, t).
We proceed to compute the cost of achieving the best of two rebalancing rules in hindsight, by finding the expected present value of max{V_T(b), V_T(c)} at time-0 with respect to the equivalent martingale measure Q. This cost is the sum of two integrals I_1 + I_2, where
\[ I_1 := \frac{\exp(-\sigma^2 b^2 T/2)}{\sqrt{2\pi}} \int_{\frac{b+c}{2}\sigma\sqrt{T}}^{\infty} \exp(-z^2/2 + b\sigma\sqrt{T}\cdot z)\,dz \tag{18} \]
and
\[ I_2 := \frac{\exp(-\sigma^2 c^2 T/2)}{\sqrt{2\pi}} \int_{-\infty}^{\frac{b+c}{2}\sigma\sqrt{T}} \exp(-z^2/2 + c\sigma\sqrt{T}\cdot z)\,dz. \tag{19} \]
In the sequel, we will often use the following general formula (see the appendix to Reiner and Rubinstein 1992):
\[ \int_A^B e^{-\alpha y^2 + \beta y}\,dy = \sqrt{\frac{\pi}{\alpha}}\,\exp\left(\frac{\beta^2}{4\alpha}\right)\left[ N\left(B\sqrt{2\alpha} - \frac{\beta}{\sqrt{2\alpha}}\right) - N\left(A\sqrt{2\alpha} - \frac{\beta}{\sqrt{2\alpha}}\right) \right], \tag{20} \]
where α > 0 and N(•) is the cumulative normal distribution function. Simplifying the two integrals, we get
\[ I_1 = I_2 = N\left(\frac{b - c}{2}\,\sigma\sqrt{T}\right). \tag{21} \]
Theorem 1.
The time-0 cost of achieving the best of two rebalancing rules {b, c} in hindsight is
\[ C(\delta, \sigma, T) = 2N\left(\frac{\delta\sigma\sqrt{T}}{2}\right), \tag{22} \]
where δ := |b − c| is the distance between the two rebalancing rules.
Corollary 1.
The equilibrium price at t = 0 of a perpetual option (T := ∞) on the best of two rebalancing rules {b, c} in hindsight is C(δ, σ, ∞) = $2.
Note that the horizon-T price is independent of the interest rate r, and it is translation invariant in the sense that it depends only on the distance δ = |b − c| between the two rebalancing rules. We always have 1 ≤ C(δ, σ, T) ≤ 2; besides the perpetual version of the option, the maximum $2 price also obtains if σ = ∞ or δ = ∞. The minimum $1 price obtains if any of the parameters δ, σ, T tends to 0. Since the increasing function N(•) is concave over [0, ∞), we see that the option price is increasing and concave separately in each of the parameters δ, σ, T.
Theorem 2.
Given two rebalancing rules b > c with distance δ = |b − c|, an initial $1 deposit into the horizon-T replicating strategy achieves at T a compound growth-rate that is exactly
\[ \frac{100}{T}\,\log\left\{ 2N\left(\frac{\delta\sigma\sqrt{T}}{2}\right) \right\} \tag{23} \]
percent lower than that of the best of {b, c} in hindsight. A $1 deposit into the corresponding horizon-free strategy (that replicates the perpetual version of the option) achieves a compound-growth rate at T that is at most 100 log(2)/T percent lower than that of the best of {b, c} in hindsight.
Proof. The trader’s initial ($1) deposit into the replicating strategy buys him 1/C units of the option at t = 0. For the horizon-T option, his wealth at expiration will be max{V_T(b), V_T(c)}/C, and hence the excess continuously-compounded growth rate will be
\[ \frac{1}{T}\log[\max\{V_T(b), V_T(c)\}] - \frac{1}{T}\log[\max\{V_T(b), V_T(c)\}/C] = \frac{\log C(\delta, \sigma, T)}{T}. \tag{24} \]
For the horizon-free option, the trader’s initial dollar buys him half a unit of the option at t = 0. Thus, his wealth at T will be at least half the exercise value of the option, which is max{V_T(b), V_T(c)}. Hence, the excess continuously-compounded growth rate of the hindsight-optimized rule at T is at most
\[ \frac{1}{T}\log[\max\{V_T(b), V_T(c)\}] - \frac{1}{T}\log[\max\{V_T(b), V_T(c)\}/2] = \frac{\log 2}{T}. \tag{25} \]
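The time-0 price of Theorem 1 and the growth-rate gaps of Theorem 2 can be evaluated directly. The sketch below is an illustration under stated assumptions: the function names are ours, the normal distribution function is built from the standard library's `math.erf`, and the inputs δ = 1, σ = 0.2, T = 25 in the usage lines are hypothetical values rather than figures from the paper.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def two_rule_price(delta, sigma, T):
    """Theorem 1: time-0 cost C = 2 N(delta * sigma * sqrt(T) / 2)."""
    return 2.0 * norm_cdf(delta * sigma * math.sqrt(T) / 2.0)

def growth_rate_gap_percent(delta, sigma, T):
    """Theorem 2: exact gap (in percent per year) for the horizon-T strategy."""
    return 100.0 * math.log(two_rule_price(delta, sigma, T)) / T

def perpetual_gap_bound_percent(T):
    """Theorem 2: upper bound (in percent per year) for the horizon-free strategy."""
    return 100.0 * math.log(2.0) / T

if __name__ == "__main__":
    delta, sigma, T = 1.0, 0.2, 25.0  # hypothetical inputs for illustration
    print(two_rule_price(delta, sigma, T))
    print(growth_rate_gap_percent(delta, sigma, T))
    print(perpetual_gap_bound_percent(T))
```

Because the price never exceeds $2, the horizon-T gap is always bounded by the horizon-free gap at the same T.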
Example 1.
Consider the following robust scheme for T := 25 years of leveraged bets on the S&P 500 index. We put b := 2 and c := 1 (i.e. buy-and-hold), with σ := 0.[...]. We get C = $1.[...] and log(C)/T = 1%, so the replicating strategy is guaranteed to achieve a final compound-growth rate that is 1% lower than the best of {b, c} in hindsight. If b = 2 happens to outperform the index by more than 1% per year, then the trader will beat the market over t ∈ [0, 25]. If b = 2 underperforms the index (or outperforms by less than 1% a year), then the trader’s compound-growth rate will have lagged the market by at most 1% a year.
Note that the corresponding horizon-free strategy (that replicates the perpetual version of the option) can only guarantee to get within log(2)/T = 2.77% of the hindsight-optimized growth rate at T = 25.
Example 2.
We construct a robust T := 25 year scheme for long-run stock market investment that guarantees preservation of capital. We put b := 1 (100% stocks) and c := 0 (all cash). Assuming that σ := 0.[...], the practitioner can rest easy, safe in the knowledge that his foray into risk assets will ultimately not cause him to lag the risk-free rate by more than [...] a year. If r > [...], then he is guaranteed not to lose money if he sticks to the Plan for T = 25 years. At the same time, if stocks go through the roof, his strategy will earn the long-run market growth rate minus a [...] “universality cost.”
Would-be practitioners who enjoyed these examples can use Figure 1 to inform the choice of horizon: it plots the excess continuously-compounded growth rate for different volatilities and maturities with δ := 1.
Figure 1: The excess percent growth rate of the best of two rebalancing rules over the replicating strategy, for different horizons and volatilities, with δ := 1.
Before we can put our on-line schemes for robust asset allocation into actual practice, we must derive general time-t formulas for pricing and replication of the rebalancing option under discrete hindsight optimization. Thus, we proceed to extend the above integration technique to the general situation. To simplify the notation, we let τ := T − t denote the remaining life of the option at time t. Inspired by Garivaltis (2018), we start with the decomposition
\[ z_T = \sqrt{\frac{t}{T}}\cdot z_t + \sqrt{\frac{\tau}{T}}\cdot y, \tag{26} \]
where
\[ y := \frac{\log(S_T/S_t) - (r - \sigma^2/2)\,\tau}{\sigma\sqrt{\tau}} \tag{27} \]
is distributed unit normal with respect to the equivalent martingale measure and the information available at t. Conditional on the values of time-t variables, b outperforms c at T if and only if
\[ y \ge \frac{\frac{b+c}{2}\,\sigma T - \sqrt{t}\cdot z_t}{\sqrt{\tau}}. \tag{28} \]
Thus, the general price C(S, t) is equal to the sum of two integrals I_1 + I_2, where
\[ I_1 := \frac{\exp(rt - \sigma^2 b^2 T/2 + b\sigma\sqrt{t}\cdot z_t)}{\sqrt{2\pi}} \int_{[(b+c)\sigma T/2 - \sqrt{t}\cdot z_t]/\sqrt{\tau}}^{\infty} \exp(-y^2/2 + b\sigma\sqrt{\tau}\cdot y)\,dy \tag{29} \]
and
\[ I_2 := \frac{\exp(rt - \sigma^2 c^2 T/2 + c\sigma\sqrt{t}\cdot z_t)}{\sqrt{2\pi}} \int_{-\infty}^{[(b+c)\sigma T/2 - \sqrt{t}\cdot z_t]/\sqrt{\tau}} \exp(-y^2/2 + c\sigma\sqrt{\tau}\cdot y)\,dy. \tag{30} \]
These integrals simplify to
\[ I_1 = N\left(\frac{[b-c]\,\sigma T/2 + \sqrt{t}\cdot z_t - b\sigma t}{\sqrt{\tau}}\right) V_t(b) \tag{31} \]
and
\[ I_2 = N\left(\frac{[b-c]\,\sigma T/2 - \sqrt{t}\cdot z_t + c\sigma t}{\sqrt{\tau}}\right) V_t(c). \tag{32} \]
Theorem 3.
The general cost C(S, t) of achieving the best of two rebalancing rules b > c in hindsight is
\[ C = N(d_1)\,V_t(b) + N(d_2)\,V_t(c), \tag{33} \]
where
\[ d_1 := \frac{(b-c)\,\sigma T/2 + \sqrt{t}\cdot z_t - b\sigma t}{\sqrt{\tau}} \tag{34} \]
and
\[ d_2 := (b-c)\,\sigma\sqrt{\tau} - d_1 = \frac{(b-c)\,\sigma T/2 - \sqrt{t}\cdot z_t + c\sigma t}{\sqrt{\tau}}. \tag{35} \]
Theorem 4.
A perpetual option (T := ∞) on the best of two rebalancing rules b > c costs C(S, t) = V_t(b) + V_t(c) in state (S_t, t). To delta-hedge the perpetual option, one holds
\[ \Delta = \frac{bV_t(b) + cV_t(c)}{S_t} \tag{36} \]
shares of the underlying in state (S_t, t), and therefore bets the fraction
\[ \hat{b}(S_t, t) := \frac{\Delta S}{C} = \frac{bV_t(b) + cV_t(c)}{V_t(b) + V_t(c)} \tag{37} \]
of wealth on the underlying at t.
Proof. As T → ∞, we see that d_1, d_2 → +∞ and the option price converges to V_t(b) + V_t(c). Next, one can verify by direct calculation from (13) and (14) that
\[ \frac{\partial V(b)}{\partial S} = \frac{\partial V(b)}{\partial z}\,\frac{\partial z}{\partial S} = \frac{bV_t(b)}{S_t}. \tag{38} \]
Alternately, one can observe that the rebalancing rule b keeps (by definition) bV_t(b) dollars in the stock at time t, which amounts to bV_t(b)/S_t shares. Thus, to replicate the sum V_t(b) + V_t(c) we must own a total of ∆ = bV_t(b)/S_t + cV_t(c)/S_t shares of the underlying.
We should note that our general pricing formulas could have been obtained differently, by applying the theory of “exchange options” that was bequeathed to us in simultaneous papers by Margrabe (1978) and Fischer (1978). Rather than the single underlying S_t, one could view the (perfectly correlated) geometric Brownian motions U_1(t) := V_t(b) and U_2(t) := V_t(c) as underlyings of a multi-asset option with payoff
\[ \max\{U_1, U_2\} = \max\{U_1 - U_2,\, 0\} + U_2. \tag{39} \]
This amounts to a $1 deposit into the rebalancing rule c, plus the option to exchange the final wealth of c for the final wealth of b at T. Substituting the aggregate volatility σ_a := (b − c)σ into Margrabe’s Formula (cf. Zhang 1998) yields the same result
\[ C(U_1, U_2, t) = N(d_1)\,U_1 + N(\sigma_a\sqrt{\tau} - d_1)\,U_2, \tag{40} \]
where
\[ d_1 := \frac{\log(U_1/U_2)}{\sigma_a\sqrt{\tau}} + \frac{\sigma_a\sqrt{\tau}}{2} \tag{41} \]
is in agreement with (34).
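The agreement between the direct formula (33)-(35) and the exchange-option form (40)-(41) can be checked numerically. The following is a minimal sketch under assumptions: the function names are ours, V_t(b) is computed from (13), and the state (z_t = 0.3, t = 5, T = 10, r = 0.02, σ = 0.2, b = 2, c = 1) is a hypothetical example rather than a parameter set from the paper.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def wealth(b, z, t, r, sigma):
    """V_t(b) from (13), in terms of the auxiliary variable z_t."""
    return math.exp((r - sigma**2 * b**2 / 2.0) * t + b * sigma * math.sqrt(t) * z)

def price_direct(b, c, z, t, T, r, sigma):
    """Time-t price from (33)-(35); requires b > c and t < T."""
    tau = T - t
    d1 = ((b - c) * sigma * T / 2.0 + math.sqrt(t) * z - b * sigma * t) / math.sqrt(tau)
    d2 = (b - c) * sigma * math.sqrt(tau) - d1
    return norm_cdf(d1) * wealth(b, z, t, r, sigma) + norm_cdf(d2) * wealth(c, z, t, r, sigma)

def price_exchange(b, c, z, t, T, r, sigma):
    """Same price via the exchange-option form (40)-(41)."""
    u1, u2 = wealth(b, z, t, r, sigma), wealth(c, z, t, r, sigma)
    s = (b - c) * sigma * math.sqrt(T - t)  # sigma_a * sqrt(tau)
    d1 = math.log(u1 / u2) / s + s / 2.0
    return norm_cdf(d1) * u1 + norm_cdf(s - d1) * u2

if __name__ == "__main__":
    state = (2.0, 1.0, 0.3, 5.0, 10.0, 0.02, 0.2)  # hypothetical (b, c, z, t, T, r, sigma)
    print(price_direct(*state), price_exchange(*state))
```

At t = 0 both functions reduce to the time-0 cost 2N(δσ√T/2), since V_0(b) = V_0(c) = 1.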
Figure 3 plots the price and intrinsic value of the rebalancing option for different values of S under the parameters r := 0.[...], T := 10, S_0 := 100, t := 5, σ := 0.[...], b := 1.[...], and c := 0.[...].
Theorem 5.
The horizon-T replicating strategy for the best of two rebalancing rules b > c in hindsight holds
\[ \Delta = \frac{N(d_1)\,bV_t(b) + N(d_2)\,cV_t(c)}{S_t} \tag{42} \]
shares of the stock in state (S_t, t), which amounts to betting the fraction
\[ \hat{b}(S_t, t) = \frac{\Delta S}{C} = \frac{N(d_1)\,bV_t(b) + N(d_2)\,cV_t(c)}{N(d_1)\,V_t(b) + N(d_2)\,V_t(c)} \tag{43} \]
of wealth on the stock at t. Thus, the on-line fraction of wealth bet on the stock is a time- and state-varying convex combination of b and c.
Proof. First, we note the standard relations ∂C/∂U_1 = N(d_1) and ∂C/∂U_2 = N(d_2), which follow by direct calculation from (40), (41), and the fact that U_1 φ(d_1) = U_2 φ(d_2), where φ(•) is the standard normal density function. Differentiating the option price, we get
\[ \frac{\partial C}{\partial S} = \frac{\partial C}{\partial U_1}\frac{\partial U_1}{\partial S} + \frac{\partial C}{\partial U_2}\frac{\partial U_2}{\partial S} = N(d_1)\,\frac{bU_1}{S} + N(d_2)\,\frac{cU_2}{S}, \tag{44} \]
which is the desired result.
Thus, even if the best rebalancing rule (of any kind) in hindsight b(S, t) = z(S, t)/(σ√t) happens to lie between b and c, the replicating strategy will not generally bet the hindsight-optimized fraction arg max_{b∈R} V_t(b) of wealth on the stock. This phenomenon is illustrated in Figure 2. Instead, the relative weighting of b and c (which is initially 50/50 at time-0) evolves with the observed performances V_t(b), V_t(c) and the remaining life τ := T − t of the option. For a fixed time t, if U_1 → ∞ or U_2 → ∞ then the on-line portfolio weight will converge to b or c accordingly. As τ → 0, d_1 tends to ±∞ and d_2 tends to ∓∞ according as to whether b or c has outperformed over the known price history. Thus, small differences in the observed performances V_t(b) and V_t(c) get amplified in the on-line portfolio weight as τ → 0. Figures 4 and 5 simulate the performance of the replicating strategy for different parameter values over a T := 30 year horizon.
We carry on with the general discrete set B := {b_1, ..., b_n} ⊂ R of asset allocations, where the b_i are arranged in increasing order: b_1 < b_2 < ··· < b_n. Thus, we now deal with the payoff V*_t := max_{1≤i≤n} V_t(b_i). For notational convenience, we will also write b_0 := −∞ and b_{n+1} := +∞. We let ∆b_i := b_{i+1} − b_i for 0 ≤ i ≤ n, and thus ∆b_0 = ∆b_n = +∞. For a given rebalancing rule b_i (1 ≤ i ≤ n), the final payoff of the option is equal to V_T(b_i) if and only if
\[ \frac{b_{i-1} + b_i}{2}\,\sigma\sqrt{T} \le z_T \le \frac{b_i + b_{i+1}}{2}\,\sigma\sqrt{T}. \tag{45} \]
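Condition (45) says that b_i wins exactly when z_T/(σ√T) falls between the midpoints on either side of b_i, which is the nearest-to-the-vertex criterion of Proposition 1. The sketch below compares that criterion with a brute-force maximization of log V_t(b) from (13). The function names and the numerical inputs (the rule set, σ = 0.2, t = 5, r = 0.02) are assumptions for illustration only.

```python
import math

def log_wealth(b, z, t, r, sigma):
    """log V_t(b) from (13): a downward-opening parabola in b."""
    return (r - sigma**2 * b**2 / 2.0) * t + b * sigma * math.sqrt(t) * z

def best_rule_nearest(rules, z, t, sigma):
    """Proposition 1: the best rule is the one nearest to z_t / (sigma * sqrt(t))."""
    vertex = z / (sigma * math.sqrt(t))
    return min(rules, key=lambda b: abs(b - vertex))

def best_rule_direct(rules, z, t, r, sigma):
    """Brute force: maximize log V_t(b) over the discrete set."""
    return max(rules, key=lambda b: log_wealth(b, z, t, r, sigma))

if __name__ == "__main__":
    rules = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical discrete set B
    for z in (-1.0, 0.3, 1.0):         # hypothetical realizations of z_t
        print(z, best_rule_nearest(rules, z, 5.0, 0.2),
              best_rule_direct(rules, z, 5.0, 0.02, 0.2))
```

Ties at an exact midpoint are resolved arbitrarily by `min` and `max`, so the two functions are only guaranteed to agree away from those boundary values of z.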
Figure 2: The fraction of wealth bet by the replicating strategy for different stock prices, r := 0.[...], T := 10, S_0 := 100, t := 5, σ := 0.[...], b := 1.[...], c := 0.[...].
Thus, conditional on the values of time-t variables, b_i will turn out to be the best performer over [0, T] if and only if
\[ \frac{(b_{i-1} + b_i)\,\sigma T/2 - \sqrt{t}\cdot z_t}{\sqrt{\tau}} \le y \le \frac{(b_i + b_{i+1})\,\sigma T/2 - \sqrt{t}\cdot z_t}{\sqrt{\tau}}. \tag{46} \]
For simplicity, we will write this interval as y ∈ [A_{i−1}, A_i]. Thus A_0 = −∞ and A_n = +∞. The expected present value of the final payoff with respect to Q and the information available at t is equal to a sum of integrals I_1 + ··· + I_n, where
\[ I_i := \frac{\exp(rt - \sigma^2 b_i^2 T/2 + b_i\sigma\sqrt{t}\cdot z_t)}{\sqrt{2\pi}} \int_{A_{i-1}}^{A_i} \exp(-y^2/2 + b_i\sigma\sqrt{\tau}\cdot y)\,dy. \tag{47} \]
Figure 3: Option price and intrinsic value for different stock prices, r := 0.[...], T := 10, S_0 := 100, t := 5, σ := 0.[...], b := 1.[...], c := 0.[...].
Evaluating these integrals, we obtain the general pricing formula
\[ C(S,t) = \sum_{i=1}^{n} \{N(A_i - \beta_i) - N(A_{i-1} - \beta_i)\}\,V_t(b_i), \tag{48} \]
where
\[ A_i := \frac{(b_i + b_{i+1})\,\sigma T/2 - \sqrt{t}\cdot z_t}{\sqrt{\tau}} \tag{49} \]
and
\[ \beta_i := b_i\sigma\sqrt{\tau}. \tag{50} \]
Figure 4: Performance simulation over T := 30 years for the parameters S_0 := 1, b := 2, c := 0.[...], r := 0.[...], σ := 0.[...], ν := 0.[...], µ = ν + σ²/2.
Bearing in mind that A_0 = −∞ and A_n = +∞, we can also write
\[ C(S,t) = N(A_1 - \beta_1)\,V_t(b_1) + \sum_{i=2}^{n-1} \{N(A_i - \beta_i) - N(A_{i-1} - \beta_i)\}\,V_t(b_i) + N(\beta_n - A_{n-1})\,V_t(b_n). \tag{51} \]
The general option price could again have been obtained differently, by an interesting application of Margrabe’s theory of exchange options. Indeed, we could consider the wealth processes (V_t(b_i))_{i=1}^n as separate underlyings U_i := V_t(b_i) of a multi-asset option whose final payoff is equal to max{U_1, U_2, ..., U_n}.
Figure 5: Performance simulation over T := 30 years for the parameters S_0 := 1, b := 2, c := 0.[...], r := 0.[...], σ := 0.[...], ν := 0.[...], µ = ν + σ²/2.
First of all, we remark that at any given time the ordered sequence of numbers U_1(t), ..., U_n(t) is unimodal, or single-peaked. This happens because the (log U_i)_{i=1}^n trace out a sequence of heights on the parabola b ↦ log V_t(b) as we move from left to right over the abscissae b [...]
Figure 6: Option price and intrinsic value for different stock prices, r := 0.[...], T := 10, S_0 := 100, t := 5, σ := 0.[...], B := {[...]}.
[...] implied volatilities σ that could rationalize an observed price of the rebalancing option. Figure 7 plots the option price for different volatilities under the parameters r := 0.[...], T := 10, S_0 := 100, t := 5, S_t := 200, and B := {[...]}.
In specializing the pricing formula for t := 0 and simplifying (remembering that V_0(b_i) := 1), we get
Theorem 6.
For hindsight optimization over n discrete rebalancing rules b_1 < ··· [...]
Figure 7: Option prices for different volatilities, r := 0.[...], T := 10, S_0 := 100, t := 5, S_t := 200, B := {[...]}.
Corollary 2.
A perpetual option on the best of any n distinct rebalancing rules in hindsight is worth C = n dollars at time-0.
Thus, we see that the time-0 price of the general horizon-T rebalancing option is independent of the interest rate, and it is increasing and concave separately in the parameters ∆b_i, σ, T. We again observe that horizontal translations of the point set {b_1, ..., b_n} do not alter the option price. We always have the relation 1 ≤ C ≤ n; the maximum n dollar price obtains if any of the parameters tends to infinity and the minimum $1 price obtains if any of the parameters tends to zero.
Example 3.
For a T := 25 year planning horizon, we cherry pick five favored asset allocations B := {0, 0.5, 1, 1.5, 2}. Assuming stock market volatility of σ := 0.[...] going forward, we get C = $1.[...], and the excess growth rate of the hindsight-optimized asset allocation will be exactly log(C)/T = 1.[...]. Assuming that the risk-free rate is greater than [...], the replicating strategy is guaranteed not to lose money if the practitioner sticks to the Plan for the next T = 25 years.
Theorem 7.
The horizon-$T$ replicating strategy for the best of the rebalancing rules $b_1 < b_2 < \cdots < b_n$ in hindsight holds

$$\Delta = N(-d_{2,1})\,\frac{b_1 V_t(b_1)}{S_t} + \sum_{i=2}^{n-1}\left[N(d_{1,i-1}) - N(d_{2,i})\right]\frac{b_i V_t(b_i)}{S_t} + N(d_{1,n-1})\,\frac{b_n V_t(b_n)}{S_t} \tag{59}$$

shares of the stock in state $(S_t, t)$, thereby betting the fraction $\hat{b}(S, t) = \Delta S / C$ of its bankroll on the stock. This amounts to a time- and state-varying convex combination of the $b_i$. As $\tau \to 0$, the option price converges to $U_{i^*} := \max_{1 \le i \le n} U_i$ and the fraction of wealth bet by the replicating strategy converges to $\arg\max_{b \in B} V_t(b)$ if this set is a singleton; if the maximum is attained at two distinct points, so that $U_{i^*} = U_{i^*+1}$, then $\hat{b}$ converges to the midpoint $(b_{i^*} + b_{i^*+1})/2$ as $\tau \to 0$. The horizon-free replicating strategy (corresponding to the perpetual version of the option) bets the performance-weighted average

$$\hat{b}(S_t, t) = \frac{\sum_{i=1}^{n} b_i V_t(b_i)}{\sum_{i=1}^{n} V_t(b_i)} \tag{60}$$

of the rebalancing rules $b_i$, which converges almost surely to

$$\arg\max_{b \in B}\left\{(\mu - r)\,b - \frac{\sigma^2}{2}\,b^2\right\} = \arg\min_{b \in B}\left|\,b - \frac{\mu - r}{\sigma^2}\right|, \tag{61}$$

i.e. it converges to whichever element of $B$ is closest to the continuous time Kelly rule (cf. Luenberger 1998).

Proof. Note that the pricing formula (56) is a linearly homogeneous function of the underlyings $(U_1, ..., U_n)$. By Euler's theorem for homogeneous functions, we therefore have the relation

$$C = \sum_{i=1}^{n} \frac{\partial C}{\partial U_i}\,U_i. \tag{62}$$

Accordingly, by direct calculation on (56) one can (carefully) verify the partial derivatives

$$\frac{\partial C}{\partial U_1} = N(-d_{2,1}), \tag{63}$$

$$\frac{\partial C}{\partial U_i} = N(d_{1,i-1}) - N(d_{2,i}) \quad \text{for } 2 \le i \le n-1, \tag{64}$$

$$\frac{\partial C}{\partial U_n} = N(d_{1,n-1}). \tag{65}$$

To verify these partials easily, one needs the identity

$$\frac{\phi(d_{2,i})}{\phi(d_{1,i})} = \frac{U_{i+1}}{U_i} \quad \text{for } 1 \le i \le n-1, \tag{66}$$

where $\phi(\cdot)$ is the standard normal density function. Observe that $U_i$ generally appears in the terms of (56) that correspond to the indices $i-1$, $i$, and $i+1$. $U_1$ appears in the first two terms and $U_n$ appears in the last two terms.
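The identity (66) can be checked in a few lines. The sketch below assumes Margrabe-type definitions of $d_{1,i}$ and $d_{2,i}$, written out in its first line; they are not restated in this part of the text, so they should be read as an assumption, with $v_i$ a shorthand introduced here for the volatility of the spread between adjacent rules.

$$
\begin{aligned}
&\text{Assume } v_i := \Delta b_i\,\sigma\sqrt{\tau}, \qquad d_{1,i} := \frac{\log(U_{i+1}/U_i) + v_i^2/2}{v_i}, \qquad d_{2,i} := d_{1,i} - v_i. \\
&\frac{\phi(d_{2,i})}{\phi(d_{1,i})} = \exp\left\{\tfrac{1}{2}\left(d_{1,i}^2 - d_{2,i}^2\right)\right\} = \exp\left\{\tfrac{1}{2}\,(d_{1,i} - d_{2,i})(d_{1,i} + d_{2,i})\right\} \\
&\qquad\qquad\;\, = \exp\left\{\tfrac{1}{2}\,v_i\,(2 d_{1,i} - v_i)\right\} = \exp\left\{v_i d_{1,i} - \tfrac{1}{2} v_i^2\right\} = \exp\left\{\log(U_{i+1}/U_i)\right\} = \frac{U_{i+1}}{U_i}.
\end{aligned}
$$

Under these assumed definitions, $\phi(d_{1,i})\,U_{i+1} = \phi(d_{2,i})\,U_i$, and $d_{1,i}$ and $d_{2,i}$ differ only by the constant $v_i$, so they have the same derivative with respect to each $U_j$. The density terms produced by differentiating the normal distribution functions therefore cancel in pairs, as in the usual Black-Scholes delta calculation, which is why no $\phi$ appears in (63)-(65).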
This being done, the delta-hedging strategy now obtains from the chain rule

$$\frac{\partial C}{\partial S} = \sum_{i=1}^{n} \frac{\partial C}{\partial U_i}\,\frac{\partial U_i}{\partial S} \tag{67}$$

in conjunction with the fact that $\partial U_i/\partial S = b_i V_t(b_i)/S$. To get the horizon-free result, we just observe that $d_{1,i} \to +\infty$ and $d_{2,i} \to -\infty$ as $T \to \infty$. Finally, consider what happens when $\tau \to 0$. The numbers $d_{1,i}, d_{2,i}$ will converge to the same limit $\pm\infty$ according as $U_{i+1}$ is greater or less than $U_i$. In the event that $U_{i+1} = U_i$ then $d_{1,i}$ and $d_{2,i}$ both converge to zero. The numbers $(U_i)_{i=1}^{n}$ will typically have a unique mode $U_{i^*}$, e.g. $U_1 < \cdots < U_{i^*-1} < U_{i^*} > U_{i^*+1} > \cdots > U_n$. In this case, all coefficients in the linear combination (56) converge to zero except the one corresponding to $i = i^*$, which converges to 1. If there are two modes $U_{i^*} = U_{i^*+1}$, then the corresponding coefficients in (56) both converge to $1/2$, and the result follows.

Note that for $n > 2$, the initial weighting of the $b_i$ at time-0 is not uniform, even if the $b_i$ themselves are equally spaced. The endpoints $b_1$ and $b_n$ have initial weights $N(\Delta b_1 \sigma\sqrt{T}/2)/C$ and $N(\Delta b_{n-1} \sigma\sqrt{T}/2)/C$, respectively, and the rest of the $b_i$ have initial weights $[N(\Delta b_{i-1} \sigma\sqrt{T}/2) - N(-\Delta b_i \sigma\sqrt{T}/2)]/C$ for $2 \le i \le n-1$. If the $b_i$ are equally spaced with common gap $\delta$, then each of the intermediate points ($2 \le i \le n-1$) gets initial weight $[2N(\delta\sigma\sqrt{T}/2) - 1]/C$, but the endpoints $b_1, b_n$ get the higher initial weight $N(\delta\sigma\sqrt{T}/2)/C$.

Theorem 8.
For any closed set $B \subseteq \mathbb{R}$ of rebalancing rules (finite or infinite), the American-style version of Cover's rebalancing option (with exercise price $K$ and payoff $\max\{\max_{b \in B} V_t(b) - K, 0\}$) will never be exercised early in equilibrium, and its price $C_a(S_t, t)$ equals the price $C_e(S_t, t)$ of the corresponding European-style option.

Proof. For simplicity, let $V_t^* := \max_{b \in B} V_t(b)$ denote the hindsight-optimized wealth over $[0, t]$, and let $b_t^* := \arg\max_{b \in B} V_t(b)$ denote the best (allowable) rebalancing rule in hindsight over $[0, t]$. Consider, from the standpoint of time $t$, the following two trading strategies:

Strategy 1: Invest $Ke^{-r\tau}$ dollars in the risk-free bond and buy 1 unit of Cover's (European-style) rebalancing option at a price of $C_e(S_t, t)$.

Strategy 2: Invest $V_t^*$ dollars into the rebalancing rule $b_t^*$. That is, take the best rebalancing rule in hindsight over $[0, t]$, and adhere to that same (constant) continuously-rebalanced portfolio over $[t, T]$.

Observe that Strategy 1 has a final payoff of $\max\{V_T^*, K\}$ and Strategy 2 has a final payoff of $V_T(b_t^*)$. Since the payoff at $T$ of Strategy 1 is guaranteed to be at least as great as that of Strategy 2, the initial investment of $Ke^{-r\tau} + C_e(S_t, t)$ dollars into Strategy 1 must be greater or equal to the investment $V_t^*$ that is required for Strategy 2. Thus, we have the inequalities

$$C_a(S_t, t) \ge C_e(S_t, t) \ge V_t^* - Ke^{-r\tau} \ge V_t^* - K. \tag{68}$$

Hence, since the price of an American rebalancing option always exceeds the exercise value, the option "is worth more alive than dead" and will never be exercised in equilibrium. On account of the fact that early exercise rights are worthless anyhow, we must therefore have $C_a(S_t, t) = C_e(S_t, t)$.

We remark that this is a general model-independent result that applies equally well to rebalancing rules $b \in B \subseteq \mathbb{R}^d$ over arbitrary $d$-dimensional stock markets. The dominance argument only requires the market (and the set $B$) to admit a well-defined hindsight-optimized rebalancing rule $b_t^* := \arg\max_{b \in B} V_t(b)$. For $B := \{1\}$ the best rebalancing rule in hindsight is just $b_t^* = 1$ and we get $V_t^* = S_t$; this recovers the proof given by Merton (1973, 1990) of the no-exercise theorem for vanilla call options. The special cases $B := \mathbb{R}^d$ and $B := [0, 1]$ were observed by Garivaltis (2018) for a continuous-time Black-Scholes market with $K := 0$.

This paper studied Cover's rebalancing option with discrete hindsight optimization. In the context of a single risk asset, a constant (perhaps levered) rebalancing rule is a simple trading strategy that continuously maintains some fixed fraction of wealth in the underlying asset. Cover's discrete-time universal portfolio theory derives robust on-line trading strategies that are guaranteed to achieve an acceptable percentage of the final wealth of the best rebalancing rule (of any kind) in hindsight.

Working in continuous time, we formulated the less aggressive benchmark of the best rebalancing rule in hindsight that hails from some finite set $B := \{b_1, ..., b_n\}$. This approach allows the (delta-hedging) practitioner to cherry pick a small number of favored rebalancing rules that could embody institutional leverage constraints or the trader's own speculative beliefs as to the future pattern of returns in the stock market.

Accordingly, we priced and replicated the financial option whose final payoff is equal to the wealth $V_T^* := \max_{1 \le i \le n} V_T(b_i)$ that would have accrued to a one-dollar deposit into the best $b_i$ in hindsight. We found that a perpetual option (with zero exercise price) on the best of $n$ distinct rebalancing rules costs $n$ dollars at $t = 0$. The corresponding (horizon-free) replicating strategy amounts to depositing a dollar into each $b_i$ and "letting it ride."

If the option expires at some fixed date $T$ the price is lower; it is concavely increasing in $T$ and in the volatility $\sigma$ of the underlying risk asset. From the standpoint of $t = 0$, the cost $C$ of achieving the best of the $b_i$ is translation invariant: it increases monotonically with the distances $\Delta b_i$ between adjacent rebalancing rules, but it does not otherwise depend on their precise location. In this connection, the replicating strategy amounts to a time- and state-varying convex combination of the $b_i$ that dynamically considers both the observed performances $V_t(b_i)$ and the remaining life $\tau := T - t$ of the option. No-arbitrage considerations dictate that American-style rebalancing options (for general exercise price $K$) will never be exercised early in equilibrium. This model-independent result holds for arbitrary closed sets $B \subseteq \mathbb{R}^d$ of rebalancing rules over any $d$-dimensional stock market that admits a well-defined best rebalancing rule in hindsight.
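The convex-combination structure of the replicating strategy can be illustrated numerically. The following is a minimal sketch, not the paper's own implementation: the helper names, the Margrabe-type form of the $d$ terms, the closed form used for $V_t(b)$ under geometric Brownian motion, and every numerical input (rules, interest rate, volatility, dates, price ratio) are assumptions chosen for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rule_wealth(b, price_ratio, t, r, sigma):
    """Time-t wealth of one dollar held in rebalancing rule b since time 0.

    Assumption: Black-Scholes market with continuous rebalancing, where
    price_ratio = S_t / S_0.
    """
    return math.exp((1.0 - b) * r * t
                    + b * math.log(price_ratio)
                    + 0.5 * sigma ** 2 * b * (1.0 - b) * t)


def replicating_weights(rules, wealths, sigma, tau):
    """Return (price, weights) for sorted rules b_1 < ... < b_n (n >= 2, tau > 0).

    The weights sum to one; the fraction bet on the stock is sum_i w_i * b_i.
    """
    n = len(rules)
    d1, d2 = [], []
    for i in range(n - 1):
        # Assumed Margrabe-type terms for the adjacent pair (i, i + 1).
        v = (rules[i + 1] - rules[i]) * sigma * math.sqrt(tau)
        d = (math.log(wealths[i + 1] / wealths[i]) + 0.5 * v * v) / v
        d1.append(d)
        d2.append(d - v)
    coefs = []
    for i in range(n):
        # The first rule has no lower neighbour, the last has no upper one.
        upper = norm_cdf(d1[i - 1]) if i > 0 else 1.0
        lower = norm_cdf(d2[i]) if i < n - 1 else 0.0
        coefs.append(upper - lower)
    price = sum(c * u for c, u in zip(coefs, wealths))
    weights = [c * u / price for c, u in zip(coefs, wealths)]
    return price, weights


def bet_fraction(rules, weights):
    """Fraction of wealth placed in the stock by the replicating strategy."""
    return sum(w * b for w, b in zip(weights, rules))


# Hypothetical inputs: five rules, the stock has doubled after 5 of 10 years.
RULES = [0.0, 0.5, 1.0, 1.5, 2.0]
R, SIGMA, T_NOW, HORIZON = 0.02, 0.15, 5.0, 10.0
WEALTHS = [rule_wealth(b, 2.0, T_NOW, R, SIGMA) for b in RULES]
PRICE, WEIGHTS = replicating_weights(RULES, WEALTHS, SIGMA, HORIZON - T_NOW)

print("option price:", PRICE)
print("weights on the rules:", WEIGHTS)
print("fraction bet on the stock:", bet_fraction(RULES, WEIGHTS))
```

Calling `replicating_weights` with a very small `tau` concentrates the weight on the rule with the largest observed wealth, while a very large `tau` gives the performance-weighted average of the horizon-free strategy.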
Toward the end of the investment horizon (as it becomes more and more obvious which $b_i$ is likely to be the best in hindsight), even small differences in observed performance will cause the replicating strategy to dramatically over- or under-weight the various $b_i$.

Any practitioner of the horizon-$T$ delta-hedging strategy is guaranteed to achieve at $T$ the deterministic fraction $1/C$ of the final wealth of the best $b_i$ in hindsight. The excess compound-growth rate at $T$ of the best $b_i$ (over and above the trader) is $\log(C)/T$, which tends to 0 as $T \to \infty$. The replicating strategy will asymptotically beat the underlying (e.g. an S&P 500 ETF) if any of the $b_i$ turns out to achieve a compound-growth rate that is higher than that of $b = 1$. If there is no such $b_i \in B$, but the trader had the good sense to put $1 \in B$, then the trader's compound-annual growth rate will lag the underlying risk asset by at most $100\log(C)/T$ percent at $T$. If we have $0 \in B$, then the trader also guarantees that he will ultimately not lose money over $[0, T]$ if the condition $\log(C)/T < r$ is satisfied. Hence, our trading strategy is in a sense the most conservative attempt at detecting on-the-fly whether any of the rebalancing rules in some finite set is capable of beating the underlying over a given investment horizon.

We have therefore obtained a universal continuous-time asset allocation scheme that is computationally pleasant as well as feasible for the human life span. The on-line behavior is Markovian in the sense that the relevant state vector is just $(S_t, t, (V_t(b_i))_{i=1}^{n})$. The algorithm requires no prior knowledge of the (hard-to-estimate) drift parameter $\mu$ of the stock market. Apart from the finite-dimensional state vector, the trader's behavior depends only on the known parameters $r, \sigma, T,$ and $B$. In just 25 years, say, our method guarantees to achieve within 1.87% of the compound-annual growth rate of whichever turns out to be the most profitable asset allocation among B := { , . , , . , }.
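The time-0 price and the resulting growth-rate gap $\log(C)/T$ are easy to compute. The sketch below is illustrative only: it takes the time-0 price to be the sum of the numerators of the initial weights (which must sum to one), and the allocation set and volatility are assumptions rather than values taken from the text. With these assumed inputs the gap comes out at roughly 1.87% per year, in line with the figure quoted above.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def initial_weight_numerators(rules, sigma, horizon):
    """Numerators of the time-0 weights on the sorted rules b_1 < ... < b_n."""
    b = sorted(rules)
    n = len(b)
    if n == 1:
        return [1.0]
    k = sigma * math.sqrt(horizon) / 2.0
    gaps = [b[i + 1] - b[i] for i in range(n - 1)]
    nums = [norm_cdf(gaps[0] * k)]                      # left endpoint b_1
    for i in range(1, n - 1):                           # interior points
        nums.append(norm_cdf(gaps[i - 1] * k) - norm_cdf(-gaps[i] * k))
    nums.append(norm_cdf(gaps[-1] * k))                 # right endpoint b_n
    return nums


def time0_price(rules, sigma, horizon):
    """Time-0 price C: the weights sum to one, so C is the sum of numerators."""
    return sum(initial_weight_numerators(rules, sigma, horizon))


def initial_weights(rules, sigma, horizon):
    """Time-0 weights of the replicating strategy on the sorted rules."""
    nums = initial_weight_numerators(rules, sigma, horizon)
    total = sum(nums)
    return [x / total for x in nums]


# Assumed inputs: five equally spaced rules, 15% volatility, 25-year horizon.
RULES = [0.0, 0.5, 1.0, 1.5, 2.0]
SIGMA, HORIZON = 0.15, 25.0

PRICE_0 = time0_price(RULES, SIGMA, HORIZON)
GAP = math.log(PRICE_0) / HORIZON   # excess growth rate of the best rule

print("time-0 price C:", PRICE_0)
print("guaranteed fraction 1/C of the best final wealth:", 1.0 / PRICE_0)
print("growth-rate gap log(C)/T:", GAP)
print("initial weights:", initial_weights(RULES, SIGMA, HORIZON))
```

If $0 \in B$, comparing the printed gap with the risk-free rate $r$ checks the no-loss condition $\log(C)/T < r$. Shifting every rule by the same constant leaves the price unchanged, and a very long horizon pushes it toward $n$.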
References

[1] Black, F. and Scholes, M., 1973. The Pricing of Options and Corporate Liabilities. Journal of Political Economy, 81(3), pp.637-654.

[2] Breiman, L., 1961. Optimal Gambling Systems for Favorable Games. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. The Regents of the University of California.

[3] Cover, T.M., 1991. Universal Portfolios. Mathematical Finance, 1(1), pp.1-29.

[4] Cover, T.M. and Gluss, D.H., 1986. Empirical Bayes Stock Market Portfolios. Advances in Applied Mathematics, 7(2), pp.170-181.

[5] Cover, T.M. and Ordentlich, E., 1996. Universal Portfolios with Side Information. IEEE Transactions on Information Theory, 42(2), pp.348-363.

[6] Cover, T.M. and Thomas, J.A., 2006. Elements of Information Theory. John Wiley & Sons.

[7] Fischer, S., 1978. Call Option Pricing When the Exercise Price is Uncertain, and the Valuation of Index Bonds. The Journal of Finance, 33(1), pp.169-176.

[8] Garivaltis, A., 2018. Exact Replication of the Best Rebalancing Rule in Hindsight. Working Paper.

[9] Kelly, J.L., 1956. A New Interpretation of Information Rate. Bell System Technical Journal.

[10] Luenberger, D.G., 1998. Investment Science. Oxford University Press.

[11] MacLean, L.C., Thorp, E.O. and Ziemba, W.T., 2011. The Kelly Capital Growth Investment Criterion: Theory and Practice. World Scientific.

[12] Margrabe, W., 1978. The Value of an Option to Exchange one Asset for Another. The Journal of Finance, 33(1), pp.177-186.

[13] Markowitz, H., 1952. Portfolio Selection. The Journal of Finance, 7(1), pp.77-91.

[14] Merton, R.C., 1973. Theory of Rational Option Pricing. The Bell Journal of Economics and Management Science, pp.141-183.

[15] Merton, R.C., 1990. Continuous-Time Finance. Blackwell.

[16] Ordentlich, E. and Cover, T.M., 1998. The Cost of Achieving the Best Portfolio in Hindsight. Mathematics of Operations Research, 23(4), pp.960-982.

[17] Poundstone, W., 2010. Fortune's Formula: The Untold Story of the Scientific Betting System That Beat the Casinos and Wall Street. Hill and Wang.

[18] Reiner, E. and Rubinstein, M., 1992. Exotic Options. Working Paper.

[19] Thorp, E.O., 1966. Beat the Dealer: a Winning Strategy for the Game of Twenty One. Random House.

[20] Thorp, E.O., 2017. A Man for All Markets. Random House.

[21] Wilmott, P., 1998. Derivatives: the Theory and Practice of Financial Engineering. John Wiley & Sons.

[22] Wilmott, P., 2001. Paul Wilmott Introduces Quantitative Finance. John Wiley & Sons.

[23]