From optimal martingales to randomized dual optimal stopping
Denis Belomestny · John Schoenmakers

Abstract
In this article we study and classify optimal martingales in the dual formulation of optimal stopping problems. In this respect we distinguish between weakly optimal and surely optimal martingales. It is shown that the family of weakly optimal and surely optimal martingales may be quite large. On the other hand it is shown that the Doob martingale, that is, the martingale part of the Snell envelope, is in a certain sense the most robust surely optimal martingale under random perturbations. This new insight leads to a novel randomized dual martingale minimization algorithm that does not require nested simulation. As a main feature, in a possibly large family of optimal martingales the algorithm efficiently selects a martingale that is as close as possible to the Doob martingale. As a result, one obtains the dual upper bound for the optimal stopping problem with low variance.
Keywords
Optimal stopping problem, Doob-martingale, Randomization.
Mathematics Subject Classification (2010) · JEL Classification
G10 · G12 · G13
The last decades have seen a huge development of numerical methods for solving optimal stopping problems. Such problems became very prominent in the financial industry in the form of American derivatives. For such derivatives one needs to evaluate the right of exercising (stopping) a certain cash-flow (reward) process $Z$ at some (stopping) time $\tau$, up to some time horizon $T$. From a mathematical point of view this evaluation comes down to solving an optimal stopping problem
$$Y^* = \sup_{\text{stopping time } \tau \le T} E[\underbrace{Z_\tau}_{\text{reward at stopping}}].$$

Denis Belomestny, Faculty of Mathematics, University of Duisburg-Essen, Thea Leymann Str. 9, 45127 Essen, Germany. E-mail: [email protected]
John Schoenmakers, WIAS Berlin, Germany. E-mail: [email protected]
Typically the cash-flow $Z$ depends on various underlying assets and/or interest rates and as such is part of a high dimensional Markovian framework. Particularly for high dimensional stopping problems, virtually all generic numerical solutions are Monte Carlo based. Most of the first numerical solution approaches were of primal nature, in the sense that the goal was to construct a "good" exercise policy and to simulate a lower biased estimate of $Y^*$. In this respect we mention, for example, the well-known regression methods by Longstaff & Schwartz [11] and Tsitsiklis & Van Roy [14], the stochastic mesh approach by Broadie & Glasserman [5], and the stochastic policy improvement method by Kolodko & Schoenmakers [10]. For further references we refer to the literature, for example [8] and the references therein.

In this paper we focus on the dual approach developed by Rogers [12] and Haugh & Kogan [9], initiated earlier by Davis & Karatzas [6]. In the dual method the stopping problem is solved by minimizing over a set of martingales, rather than a set of stopping times,
$$Y^* = \inf_{M \text{ martingale},\, M_0 = 0} E\Big[\max_{0 \le s \le T} (Z_s - M_s)\Big]. \qquad (1.1)$$
A canonical minimizer of this dual problem is the martingale part $M^*$ of the Doob(-Meyer) decomposition of the Snell envelope
$$Y^*_t = \sup_{t \le \text{stopping time } \tau \le T} E^{\mathcal{F}_t}[Z_\tau],$$
which moreover has the nice property that
$$Y^* = \max_{0 \le s \le T} (Z_s - M^*_s) \quad \text{almost surely}. \qquad (1.2)$$
That is, if one succeeded in finding $M^*$, the value of $Y^*$ could be obtained from one trajectory of $Z - M^*$ only.

Shortly after the development of the duality method in [12] and [9], various numerical approaches for computing dual upper bounds for American options based on it appeared.
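Once a candidate martingale is available, the dual bound (1.1) is straightforward to estimate by plain Monte Carlo. The following sketch (in Python with NumPy; all names and the toy data are our own illustrative choices, not from the paper) averages the pathwise maximum of $Z - M$ over simulated trajectories:

```python
import numpy as np

def dual_upper_bound(Z, M):
    """Monte Carlo estimate of the dual bound (1.1).

    Z, M: arrays of shape (n_paths, J + 1), where M is a simulated
    candidate martingale with M[:, 0] == 0.  Returns the sample mean
    and the standard error of max_{0<=s<=J}(Z_s - M_s) over the paths.
    """
    pathwise_max = np.max(Z - M, axis=1)
    n = pathwise_max.size
    return pathwise_max.mean(), pathwise_max.std(ddof=1) / np.sqrt(n)

# Illustration with the trivial martingale M = 0: the "bound" is then
# just E[max_s Z_s], here for three i.i.d. uniform rewards (= 3/4).
rng = np.random.default_rng(0)
Z = rng.uniform(size=(100_000, 3))
mean, se = dual_upper_bound(Z, np.zeros_like(Z))
```

By (2.2) below, the estimate is biased high for any martingale started at zero; the quality of the candidate martingale shows up in the variance of `pathwise_max`.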
Maybe the most popular method is the nested simulation approach by Andersen & Broadie [1], who essentially construct an approximation to the Doob martingale of the Snell envelope via stopping times obtained by the Longstaff & Schwartz method [11]. A few years later, a linear Monte Carlo method for dual upper bounds was proposed in [3]. In fact, as a common feature, both [1] and [3] aimed at constructing (an approximation of) the Doob martingale of the Snell envelope via some approximative knowledge of continuation functions, obtained by the method of Longstaff & Schwartz or in another way. Instead of relying on such information, the common goal in later studies [7], [13], [2], [4] was to minimize the expectation functional in the dual representation (1.1) over a linear space of generic "elementary" martingales. Indeed, by parameterizing the martingale family in a linear way and replacing the expectation in (1.1) by the sample mean over a large set of trajectories, the resulting minimization comes down to solving a linear program. However, it was pointed out in [13] that in general there may exist martingales that are "weakly" optimal in the sense that they minimize (1.1), but fail to have the "almost sure property" (1.2). As a consequence, the estimator for the dual upper bound due to such martingales may have high variance. Moreover, an example in [13] illustrates that a straightforward minimization of the sample mean corresponding to (1.1) may end up with a martingale that is asymptotically optimal in the sense of (1.1), but not surely optimal in the sense of (1.2), when the sample size tends to infinity. As a remedy to this problem, variance penalization is proposed in [2], whereas in [4] the sample mean is replaced by the maximum over all trajectories.

In this paper we first extend the study of surely optimal martingales in [13] to the larger class of weakly optimal martingales.
As a principal contribution, we give a complete characterization of weakly and surely optimal martingales, and moreover consider the notion of randomized dual martingales. In particular, it is shown that in general there may be a fullness of martingales that are optimal but not surely optimal. In fact, straightforward minimization procedures based on the sample mean in (1.1) may typically return martingales of this kind, even if the Doob martingale of the Snell envelope is contained in the martingale family (as illustrated already in [13], though at a somewhat pathological example with partially deterministic cash-flows). As another main contribution we will show that the Doob martingale plays a distinguished role within the family of all optimal martingales. Namely, it will be shown that, by randomizing the arguments in the path-wise maximum for each trajectory in a particular way, any non-Doob optimal martingale can be turned into a suboptimal one. More specifically, we will prove that there exists a particular "optimal randomization" such that the Doob martingale, perturbed or randomized with it, remains guaranteed (surely) optimal, while any other surely or weakly optimal martingale turns into a suboptimal one. Of course, as a rule this "optimal randomization" is not directly known or available in practical applications. But it turns out that, by just incorporating some simple randomization due to uniform random variables, sample mean minimization may return a martingale that is closer to the Doob martingale than one obtained without randomization. We thus end up with a martingale with low variance, which in turn guarantees that the corresponding upper bound based on (1.1) is tight (see [2] and [13]). Compared to [4] and [2], the benefit of this new randomized dual approach is its computational efficiency: from the experiments we conclude that it may be sufficient to add, on each trajectory, simple i.i.d. uniform random variables to (some of) the arguments of the maximum.
An extensive numerical analysis of the here presented randomized dual martingale approach will certainly be an interesting subsequent study, but is considered beyond the scope of this article.

The structure of the paper is as follows. Section 2 carries out a systematic theoretical analysis of optimal martingales. In Section 3 we deal with randomized optimal martingales and the effect of randomizing the Doob martingale. More technical proofs are given in Section 4, and some first numerical examples are presented in Section 5.

Since practically any numerical approach to optimal stopping is based on a discrete exercise grid, we will work within a discrete time setup. That is, it is assumed that exercise (or stopping) is restricted to a discrete set of exercise times $t_0 = 0, \ldots, t_J = T$, for some time horizon $T$ and some $J \in \mathbb{N}_+$. For notational convenience we will further identify the exercise times $t_j$ with their index $j$, and thus monitor the reward process $Z_j$ at the "times" $j = 0, \ldots, J$. Let $(\Omega, \mathcal{F}, P)$ be a filtered probability space with discrete filtration $\mathbb{F} = (\mathcal{F}_j)_{j \ge 0}$. An optimal stopping problem is a problem of stopping the reward process $(Z_j)_{j \ge 0}$ in such a way that the expected reward is maximized. The value of the optimal stopping problem with horizon $J$ at time $j \in \{0, \ldots, J\}$ is given by
$$Y^*_j = \operatorname{ess\,sup}_{\tau \in \mathcal{T}[j, \ldots, J]} E^{\mathcal{F}_j}[Z_\tau], \qquad (2.1)$$
provided that $Z$ was not stopped before $j$. In (2.1), $\mathcal{T}[j, \ldots, J]$ is the set of $\mathbb{F}$-stopping times taking values in $\{j, \ldots, J\}$, and the process $(Y^*_j)_{j \ge 0}$ is called the Snell envelope. It is well known that $Y^*$ is a supermartingale satisfying the backward dynamic programming equation (Bellman principle):
$$Y^*_j = \max\big(Z_j, E^{\mathcal{F}_j}[Y^*_{j+1}]\big), \quad 0 \le j < J, \qquad Y^*_J = Z_J.$$
Along with a primal approach based on the representation (2.1), a dual method was proposed in [12] and [9].
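The backward dynamic programming equation can be made concrete on any finite tree. The following sketch (our own illustrative setup, not from the paper) computes $Y^*_0$ for a put-type reward $Z_j = (K - S_j)^+$ on a recombining binomial tree via the recursion $Y_J = Z_J$, $Y_j = \max(Z_j, E^{\mathcal{F}_j}[Y_{j+1}])$:

```python
import numpy as np

def snell_envelope_root(S0, K, u, d, p, J):
    """Value Y*_0 of stopping the put reward Z_j = (K - S_j)_+ on a
    recombining binomial tree S_{j+1} = S_j * u or S_j * d (prob. p and
    1 - p), via the backward dynamic programming (Bellman) recursion."""
    # terminal layer: Y_J = Z_J
    S = S0 * u ** np.arange(J, -1, -1) * d ** np.arange(0, J + 1)
    Y = np.maximum(K - S, 0.0)
    for j in range(J - 1, -1, -1):
        S = S0 * u ** np.arange(j, -1, -1) * d ** np.arange(0, j + 1)
        cont = p * Y[:-1] + (1 - p) * Y[1:]           # E[Y_{j+1} | F_j]
        Y = np.maximum(np.maximum(K - S, 0.0), cont)  # Bellman step
    return Y[0]
```

For instance, with a single period, `snell_envelope_root(100, 100, 1.1, 0.9, 0.5, 1)` evaluates the recursion by hand: $Y_0 = \max\big((100-100)^+,\ 0.5\cdot 0 + 0.5\cdot 10\big) = 5$.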
Below we give a short self-contained recap, while including the notions of weak and sure optimality.

Let $\mathcal{M}$ be the set of martingales $M$ adapted to $\mathbb{F}$ with $M_0 = 0$. By Doob's optional sampling theorem one observes that
$$Y^*_j \le E^{\mathcal{F}_j}\Big[\max_{j \le r \le J} (Z_r - M_r + M_j)\Big], \quad j = 0, \ldots, J, \qquad (2.2)$$
for any $M \in \mathcal{M}$. We will say that a martingale $M$ is weakly optimal, or just optimal, at $j$, for some $j = 0, \ldots, J$, if
$$Y^*_j = E^{\mathcal{F}_j}\Big[\max_{j \le r \le J} (Z_r - M_r + M_j)\Big]. \qquad (2.3)$$
The set of all martingales (weakly) optimal at $j$ will be denoted by $\mathcal{M}^{\circ,j}$. The set of martingales optimal at $j$ for all $j = 0, \ldots, J$ is denoted by $\mathcal{M}^\circ$. We say that a martingale $M$ is surely optimal at $j$, for some $j = 0, \ldots, J$, if
$$Y^*_j = \max_{j \le r \le J} (Z_r - M_r + M_j) \quad \text{almost surely}. \qquad (2.4)$$
The set of all martingales surely optimal at $j$ will be denoted by $\mathcal{M}^{\circ\circ,j}$. The set of martingales surely optimal at $j$ for all $j = 0, \ldots, J$ is denoted by $\mathcal{M}^{\circ\circ}$. Note that, obviously, $\mathcal{M}^{\circ\circ} \subset \mathcal{M}^\circ \subset \mathcal{M}$.

Now there always exists at least one surely optimal martingale, the so-called Doob martingale coming from the Doob decomposition of the Snell envelope $(Y^*_j)_{j \ge 0}$. Indeed, consider the Doob decomposition of $Y^*$, that is,
$$Y^*_j = Y^*_0 + M^*_j - A^*_j, \qquad (2.5)$$
where $M^*$ is a martingale with $M^*_0 = 0$, and $A^*$ is predictable with $A^*_0 = 0$. It follows immediately that
$$M^*_j = \sum_{l=1}^{j} \big(Y^*_l - E^{\mathcal{F}_{l-1}}[Y^*_l]\big), \qquad A^*_j = \sum_{l=1}^{j} \big(Y^*_{l-1} - E^{\mathcal{F}_{l-1}}[Y^*_l]\big), \qquad (2.6)$$
and so $A^*$ is non-decreasing due to the fact that $Y^*$ is a supermartingale.
One thushas by (2.5) on the one hand max j ≤ r ≤ J ( Z r − M (cid:63)r + M (cid:63)j ) = Y (cid:63)j + max j ≤ r ≤ J ( Z r − Y (cid:63)r + A (cid:63)j − A (cid:63)r ) ≤ Y (cid:63)j rom optimal martingales to randomized dual optimal stopping 5 and due to (2.2) on the other hand E F j (cid:20) max j ≤ r ≤ J ( Z r − M (cid:63)r + M (cid:63)j ) (cid:21) ≥ Y (cid:63)j . Thus, it follows that (2.4) holds for arbitrary j, hence M (cid:63) ∈ M ◦◦ . Furthermore wehave the following properties of the sets ( M ◦ ,j ) and ( M ◦◦ ,j ) . Proposition 2.1
The sets $\mathcal{M}^{\circ,j}$ and $\mathcal{M}^{\circ\circ,j}$ for $j = 0, \ldots, J$, as well as $\mathcal{M}^\circ$ and $\mathcal{M}^{\circ\circ}$, are convex.

As an immediate consequence of Proposition 2.1, if there exists more than one weakly (respectively surely) optimal martingale, then there exist infinitely many weakly (respectively surely) optimal martingales.
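Before proceeding, the Doob decomposition (2.5)-(2.6) and the sure optimality (1.2) of $M^*$ can be checked numerically in a minimal two-period example of our own: rewards $Z = (0, 1, U)$ with $U$ uniform on $[0, 2]$ and already observable at time $1$ (a sketch; the grid and sample sizes are arbitrary choices):

```python
import numpy as np

# Two-period check of (2.5)-(2.6): rewards Z = (0, 1, U), U ~ Unif[0, 2]
# observable at time 1 (illustrative example, our own construction).
rng = np.random.default_rng(1)
n = 200_000
U = rng.uniform(0.0, 2.0, size=n)

# Snell envelope: Y_2 = U, Y_1 = max(1, U), Y_0 = E[max(1, U)] = 5/4.
Y = np.column_stack([np.full(n, 1.25), np.maximum(1.0, U), U])
Z = np.column_stack([np.zeros(n), np.ones(n), U])

# Doob martingale part per (2.6): increments Y_l - E[Y_l | F_{l-1}].
M = np.zeros_like(Y)
M[:, 1] = Y[:, 1] - 1.25        # Y_1 - E[Y_1 | F_0]
M[:, 2] = M[:, 1]               # Y_2 - E[Y_2 | F_1] = U - U = 0
A = Y[:, [0]] + M - Y           # predictable part solved from (2.5)
```

One then verifies that the increment of $M$ is centered, that $A$ is non-decreasing with $A_0 = 0$, and, in line with (1.2), that $\max_j (Z_j - M^*_j)$ equals $Y^*_0 = 5/4$ on every single path.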
Proposition 2.2
It holds that $M \in \mathcal{M}^{\circ,j}$ for some $0 \le j \le J$ if and only if for any optimal stopping time $\tau^*_j \ge j$ satisfying
$$Y^*_j = \sup_{\tau \ge j} E^{\mathcal{F}_j}[Z_\tau] = E^{\mathcal{F}_j}\big[Z_{\tau^*_j}\big],$$
one has that
$$\max_{j \le r \le J} (Z_r - M_r) = Z_{\tau^*_j} - M_{\tau^*_j} \quad \text{almost surely}.$$

Proof
Let $\tau^*_j \ge j$ be an optimal stopping time. Suppose that $M \in \mathcal{M}^{\circ,j}$. On the one hand, one trivially has
$$\max_{j \le r \le J} (Z_r - M_r) - \big(Z_{\tau^*_j} - M_{\tau^*_j}\big) \ge 0,$$
and on the other, since $M \in \mathcal{M}^{\circ,j}$ (see (2.3)),
$$E^{\mathcal{F}_j}\Big[\max_{j \le r \le J} (Z_r - M_r) - \big(Z_{\tau^*_j} - M_{\tau^*_j}\big)\Big] = Y^*_j - M_j - \big(Y^*_j - M_j\big) = 0,$$
hence
$$\max_{j \le r \le J} (Z_r - M_r) = Z_{\tau^*_j} - M_{\tau^*_j} \quad \text{almost surely}. \qquad (2.7)$$
The converse follows from (2.7) by taking conditional $\mathcal{F}_j$-expectations. $\square$

It will be shown below that the class of optimal martingales $\mathcal{M}^\circ$ may be considerably large. In fact, any such martingale can be seen as a perturbation of the Doob martingale $(M^*_j)$. For this, let us introduce some further notation and define $\tau_0 := 0^-$ with $0^- < 0$ by convention, and let, for $l \ge 1$, $\tau_l$ be the first optimal stopping time strictly after $\tau_{l-1}$. That is, if $\tau_{l-1} < J$, we define recursively
$$\tau_l = \inf\big\{\tau_{l-1} < i \le J : Z_i \ge E^{\mathcal{F}_i}[Y^*_{i+1}]\big\},$$
where $Y^*_{J+1} := 0$. There will thus be a last number, say $l_J$, with $\tau_{l_J} = J$. Further, the family $(\tau^*_i)_{i \ge 0}$ defined by
$$\tau^*_i = \tau_l \quad \text{for } \tau_{l-1} < i \le \tau_l, \; l \ge 1, \qquad (2.8)$$
is a consistent optimal stopping family in the sense that $Y^*_j = E^{\mathcal{F}_j}[Z_{\tau^*_j}]$ and that $\tau^*_i > i$ implies $\tau^*_i = \tau^*_{i+1}$.

The next lemma provides a cornerstone for an explicit structural characterization of (weakly) optimal martingales.
Lemma 2.3 $M \in \mathcal{M}^\circ$ if and only if $M$ is an adapted martingale with $M_0 = 0$ such that, almost surely, for all $l \ge 1$ with $\tau_l \le J$, the identities
(i) $\max_{\tau_{l-1} < i \le \tau_l} (Z_i - M_i) = Z_{\tau_l} - M_{\tau_l}$, and
(ii) $Z_{\tau_l} - M_{\tau_l} \ge Z_{\tau_{l+1}} - M_{\tau_{l+1}}$ whenever $\tau_l < J$,
hold.
Lemma 2.4 Let $(S_i)_{0 \le i \le J}$ be an adapted sequence with $S_0 = 0$ and consider the "shifted" Doob martingale $M_i = M^*_i - S_i$, $0 \le i \le J$. Let $l_i \ge 1$ be the unique number such that $\tau_{l_i - 1} < i \le \tau_{l_i}$, for any $0 \le i \le J$. If $S$ satisfies, for all $0 \le i \le J$,
$$\max_{\tau_{l_i - 1} < r \le \tau_{l_i}} \big(Z_r - Y^*_r + S_r\big) = S_{\tau_{l_i}} \qquad (2.9)$$
and
$$S_{\tau_{l_i + 1}} - S_{\tau_{l_i}} \le A^*_{\tau_{l_i + 1}} - A^*_{\tau_{l_i}} \quad \text{on } \{\tau_{l_i} < J\}, \qquad (2.10)$$
then $M = M^* - S \in \mathcal{M}^\circ$.
Let us represent an (arbitrary) adapted $S$ with $S_0 = 0$ by
$$S_{i+1} = S_i + \zeta_{i+1}, \quad 0 \le i < J, \qquad (2.11)$$
where each $\zeta_{i+1}$ is an $\mathcal{F}_{i+1}$-measurable random variable. Then the conditions (2.9) and (2.10) are equivalent to the following ones.
(i) On the $\mathcal{F}_i$-measurable event $\{\tau_{l_i - 1} < i < \tau_{l_i}\}$ it holds that
$$\zeta_{i+1} \ge \max_{\tau_{l_i - 1} < r \le i} \big(Z_r - Y^*_r + S_r\big) - S_i \qquad (2.12)$$
and
$$\zeta_{i+1} \le S_{\tau_{l_i - 1}} + A^*_{i+1} - A^*_{\tau_{l_i - 1}} + \big(Y^*_{i+1} - Z_{i+1}\big) - S_i. \qquad (2.13)$$
(ii) On the event $\{i = \tau_{l_i}\}$ it holds that
$$\zeta_{i+1} \le A^*_{i+1} - A^*_i + Y^*_{i+1} - Z_{i+1}. \qquad (2.14)$$
Indeed, take $j$ such that $\tau_{l_j - 1} < j \le \tau_{l_j}$, $l_j \ge 1$. If $j - 1 > \tau_{l_j - 1}$, then $l_{j-1} = l_j$, and (2.12) and (2.13), applied with $i = j - 1$ via (2.11), yield precisely the restrictions (2.9) and (2.10) up to time $j$; treating the case $j - 1 = \tau_{l_j - 1}$ with (2.14) in the same way and iterating over $j$ gives the asserted equivalence.

Corollary 2.5 Dropping the nonnegative terms $Y^*_{i+1} - Z_{i+1}$ in the right-hand sides of (2.13) and (2.14) yields tractable sufficient conditions on the increments $(\zeta_{i+1})$ under which $M = M^* - S \in \mathcal{M}^\circ$; in particular, increments satisfying these conditions always exist.
Corollary 2.6
By Corollary 2.5 there always exists an adapted process $S$ satisfying (2.12), (2.13), (2.14) with $E^{\mathcal{F}_i}[\zeta_{i+1}] = 0$ for $0 \le i < J$, and hence, due to the equivalence with (2.9) and (2.10), there exist martingales $S$ that satisfy Lemma 2.4. By Lemma 2.3, for any such martingale $S$, $M = M^* - S \in \mathcal{M}^\circ$, that is, $M$ is an optimal martingale.

Interestingly, the converse to Corollary 2.6 is also true, and we thus have the following characterization theorem.
Theorem 2.7
It holds that $M \in \mathcal{M}^\circ$ if and only if $M = M^* - S$, where $S$ is a martingale with $S_0 = 0$ that satisfies (2.9) and (2.10) in Lemma 2.4.

The proofs of Lemmas 2.3-2.4 and Theorem 2.7 are given in Section 4. In fact, Theorem 2.7 reveals that, besides the Doob martingale, there generally exists a large set of optimal martingales $M \in \mathcal{M}^\circ$. From Theorem 2.7 we also obtain a characterization of the surely optimal martingales, which is essentially the older result of [13], Thm. 6 (see Section 4 for the proof).
Corollary 2.8
It holds that $M \in \mathcal{M}^{\circ\circ}$ if and only if $M = M^* - S$, with $S$ represented by (2.11), with all $E^{\mathcal{F}_i}[\zeta_{i+1}] = 0$, $\zeta_{i+1}$ satisfying (2.14) for $i = \tau_{l_i}$, and $\zeta_{i+1} = 0$ for $\tau_{l_i - 1} < i < \tau_{l_i}$, $l_i \ge 1$.

In applications of dual optimal stopping, hence dual martingale minimization, it is usually enough to find martingales $M$ that are "close to" surely optimal ones merely at some specific point in time $i$, that is, $M \in \mathcal{M}^{\circ\circ,i}$. Naturally, since $\mathcal{M}^{\circ,i} \supset \mathcal{M}^\circ$, we may expect that in general the family of undesirable (not surely) optimal martingales at a specific time may be even much larger than the family $\mathcal{M}^\circ$ characterized by Theorem 2.7. A characterization of $\mathcal{M}^{\circ,i}$ and $\mathcal{M}^{\circ\circ,i}$ is given by the next theorem, where we take $i = 0$ without loss of generality. The proof is given in Section 4.

Theorem 2.9
The following statements hold.
(i) $M = M^* - S \in \mathcal{M}^{\circ,0}$, for some martingale $S$ represented by (2.11), if and only if
$$\max_{0 \le r \le J} \big(S_r - (Y^*_r - Z_r + A^*_r)\big) = S_{\tau^*_0} \quad \text{almost surely}, \qquad (2.15)$$
with an equivalent formulation (2.16) in terms of the increments $(\zeta_j)$ of (2.11), analogous to (2.12)-(2.14), whose right-hand sides involve the nonnegative terms $Y^*_j - Z_j$.
(ii) $M = M^* - S \in \mathcal{M}^{\circ\circ,0}$, for some martingale $S$ represented by (2.11), if and only if moreover $S_{\tau^*_0} = 0$ almost surely, that is,
$$\max_{0 \le r \le J} \big(S_r - (Y^*_r - Z_r + A^*_r)\big) = 0 \quad \text{almost surely}, \qquad (2.17)$$
again with an increment-wise formulation (2.18).

After dropping the nonnegative term $Y^*_j - Z_j$ in the right-hand sides of (2.16) and (2.18), we may obtain tractable sufficient conditions for a martingale to be optimal or surely optimal at a single date, respectively. In the spirit of Corollary 2.5 they may be formulated in the following way.

Corollary 2.10 Let $M = M^* - S$ for some martingale $S$ represented by (2.11). Then
(i) $M \in \mathcal{M}^{\circ,0}$ if $\zeta_j \ge \max_{0 \le r < j} \big(S_r - (Y^*_r - Z_r + A^*_r)\big) - S_{j-1}$ for $j \le \tau^*_0$, and $\zeta_j \le S_{\tau^*_0} - S_{j-1} + A^*_j - A^*_{j-1}$ for $j > \tau^*_0$;
(ii) $M \in \mathcal{M}^{\circ\circ,0}$ if, in addition, $S_{\tau^*_0} = 0$.

Let us now turn to randomized optimal martingales. Consider an enlarged probability space $(\widetilde{\Omega}, \widetilde{\mathcal{F}}, \widetilde{P})$, with expectation operator $\widetilde{E}$, carrying, besides the original filtration $\mathbb{F}$, a family of integrable random perturbations $(\eta_j)_{0 \le j \le J}$ with $\widetilde{E}[\eta_j \mid \mathcal{F}_J] = 0$, and consider randomized (pseudo) martingales of the form
$$\widetilde{M}_j = M^*_j - S_j - \eta_j, \quad j = 0, \ldots, J, \qquad (3.1)$$
for some adapted $S$ with $S_0 = 0$ such that $M = M^* - S$ is a martingale.

Proposition 3.1 For any $\widetilde{M}$ of the form (3.1) one has the upper estimate
$$\widetilde{E}\Big[\max_{0 \le j \le J} (Z_j - \widetilde{M}_j)\Big] \ge Y^*_0. \qquad (3.2)$$
If $S = 0$, that is,
$$\widetilde{M}_j = M^*_j - \eta_j, \qquad (3.3)$$
and the random perturbations $(\eta_j)$ satisfy in addition
$$\eta_j \le Y^*_j - Z_j + A^*_j, \quad \widetilde{P}\text{-a.s.}, \quad j = 0, \ldots, J, \qquad (3.4)$$
with $(A^*_j)$ defined in (2.5), then one has the almost sure identity
$$Y^*_0 = \max_{0 \le j \le J} (Z_j - \widetilde{M}_j) \quad \widetilde{P}\text{-a.s.} \qquad (3.5)$$
Moreover, for the first optimal stopping time $\tau^* := \tau^*_0$ (see (2.8)) one must then have that $\eta_{\tau^*} = 0$ a.s., and if $\tau^*$ is strict in the sense that $Y^*_{\tau^*} - E^{\mathcal{F}_{\tau^*}}\big[Y^*_{\tau^*+1}\big] > 0$, then $j = \tau^*$ is the only time $j$ where $\eta_j = 0$.

Due to the following theorem, any (weakly or surely) optimal non-Doob martingale turns into a non-optimal one, in the sense that
$$\widetilde{E}\Big[\max_{0 \le j \le J} (Z_j - \widetilde{M}_j)\Big] > Y^*_0, \qquad (3.6)$$
after a particular "optimal" randomization.

Theorem 3.2 Suppose that $M \in \mathcal{M}^{\circ,0}$, and let $(\eta_j)$ be a sequence of random variables as in Proposition 3.1, given by
$$\eta_j = \xi_j \big(Y^*_j - Z_j + A^*_j\big), \quad 0 \le j \le J, \qquad (3.7)$$
where the $(\xi_j)$ are assumed to be i.i.d., distributed on $(-\infty, 1]$, independent of $\mathcal{F}$, with $\widetilde{E}[\xi_j] = 0$. It is further assumed that the r.v.
$(\xi_j)$ have a continuous density $p$ supported on $(-\infty, 1]$ with $p(1) > 0$. As such the randomizers (3.7) satisfy (3.4), and Proposition 3.1 thus provides an upper bound (3.2) due to the pseudo martingale $\widetilde{M} = M - \eta$. Now, for the randomized martingale $\widetilde{M}$ one has (3.6) if $M \ne M^*$ with positive probability.

The following corollary states that an optimally randomized non-Doob martingale in $\mathcal{M}^{\circ,0}$, which is thus suboptimal in the sense of (3.6) due to the previous theorem, cannot have zero variance. The proof relies on Theorem 3.2.

Corollary 3.3 Let $M \in \mathcal{M}^{\circ,0}$, $(\eta_j)$ as in Theorem 3.2, and $\widetilde{M} = M - \eta$. Then $\mathrm{Var}\big(\max_{0 \le j \le J} (Z_j - \widetilde{M}_j)\big) = 0$ if and only if $M = M^*$.

Discussion Proposition 3.1 provides us with a remarkable freedom of perturbing the Doob martingale randomly while (3.5) remains true. The bottom line of Theorem 3.2 is that randomization under condition (3.4) of an optimal, or even surely optimal, but non-Doob martingale results in a non-optimal (pseudo) martingale, while any randomization of the Doob martingale under (3.4) remains a surely optimal pseudo martingale. This is an important feature, since in this way martingale candidates that are optimal but not equal to the (surely optimal) Doob martingale can be sorted out by randomization.

Proof of Proposition 2.1 It is sufficient to prove convexity of $\mathcal{M}^{\circ,j}$ and $\mathcal{M}^{\circ\circ,j}$ for any $j$.
For any $M, M' \in \mathcal{M}^{\circ,j}$ and $\theta \in (0,1)$ one has
$$E^{\mathcal{F}_j}\Big[\max_{j \le r \le J} \big(Z_r - (\theta M_r + (1-\theta) M'_r) + \theta M_j + (1-\theta) M'_j\big)\Big]$$
$$= E^{\mathcal{F}_j}\Big[\max_{j \le r \le J} \big(\theta (Z_r - M_r + M_j) + (1-\theta)(Z_r - M'_r + M'_j)\big)\Big]$$
$$\le \theta\, E^{\mathcal{F}_j}\Big[\max_{j \le r \le J} (Z_r - M_r + M_j)\Big] + (1-\theta)\, E^{\mathcal{F}_j}\Big[\max_{j \le r \le J} (Z_r - M'_r + M'_j)\Big] = Y^*_j,$$
while by (2.2),
$$E^{\mathcal{F}_j}\Big[\max_{j \le r \le J} \big(Z_r - (\theta M_r + (1-\theta) M'_r) + \theta M_j + (1-\theta) M'_j\big)\Big] \ge Y^*_j.$$
Similarly, for any $M, M' \in \mathcal{M}^{\circ\circ,j}$ and $\theta \in (0,1)$ we have
$$\max_{j \le r \le J} \big(Z_r - (\theta M_r + (1-\theta) M'_r) + \theta M_j + (1-\theta) M'_j\big) \le \theta \max_{j \le r \le J} (Z_r - M_r + M_j) + (1-\theta) \max_{j \le r \le J} (Z_r - M'_r + M'_j) = Y^*_j,$$
while, again by (2.2), the corresponding conditional expectation is bounded from below by $Y^*_j$. In both cases the sandwich property completes the proof. $\square$

Proof of Lemma 2.3 (sufficiency of (i) and (ii)) Suppose that $M$ is a martingale with $M_0 = 0$ such that Lemma 2.3-(i) and (ii) hold. Then (ii) implies for $q \ge 1$ that
$$Z_{\tau_1} - M_{\tau_1} \ge Z_{\tau_2} - M_{\tau_2} \ge \cdots \ge Z_{\tau_q} - M_{\tau_q}. \qquad (4.1)$$
Now take $0 \le i \le J$ arbitrarily, and let $q_i \ge 1$ be such that $\tau_{q_i - 1} < i \le \tau_{q_i}$ (note that $q_i$ is unique and $\mathcal{F}_i$-measurable). Then, due to Lemma 2.3-(i) and (4.1),
$$\max_{i \le r \le J} (Z_r - M_r) = \max\Big(\max_{i \le r \le \tau_{q_i}} (Z_r - M_r),\ \max_{q > q_i}\, \max_{\tau_{q-1} < r \le \tau_q} (Z_r - M_r)\Big) = Z_{\tau_{q_i}} - M_{\tau_{q_i}} = Z_{\tau^*_i} - M_{\tau^*_i},$$
so that optimality at every $i$ follows with Proposition 2.2.

Proof of Corollary 3.3 Suppose that $\mathrm{Var}\big(\max_{0 \le j \le J}(Z_j - \widetilde{M}_j)\big) = 0$ while $M \ne M^*$ with positive probability. Using Theorem 3.2, one then derives that, for some constant $c > 0$,
$$c \le \max_{0 \le j \le J} (S_j) \quad \text{almost surely}. \qquad (4.23)$$
Consider the stopping time $\sigma := \inf\{j \ge 0 : S_j \ge c\}$. Then, using $S_0 = 0$ and (4.23), we must have that $0 < \sigma \le J$ almost surely.
Since $S$ is a martingale, Doob's optional sampling theorem then implies $S_0 = E[S_\sigma] \ge c$, hence a contradiction. That is, the assumption $\mathrm{Var}\big(\max_{0 \le j \le J} (Z_j - \widetilde{M}_j)\big) = 0$ was false. $\square$

Let $J = 2$, $Z_0 = 0$, $Z_1 = 1$, and $Z_2 = U$, where $U$ is a random variable uniformly distributed on the interval $[0, 2]$, which we assume to be observable at time $1$ already. The optimal stopping time $\tau^*$ is thus given by
$$\tau^* = \begin{cases} 2, & U \ge 1, \\ 1, & U < 1, \end{cases}$$
and the optimal value is $Y^*_0 = E[\max(U, 1)] = 5/4$. Furthermore, it is easy to see that the Doob martingale is given by $M^*_0 = 0$, $M^*_1 = M^*_2 = \max\{U, 1\} - 5/4$.

As an illustration of the theory developed in Sections 2-3, let us consider the linear span $M(\alpha) = \alpha M^*$ as a pool of candidate martingales and randomize it according to (3.7). We thus consider the objective function
$$O_\theta(\alpha) := \widetilde{E}\Big[\max_{0 \le j \le 2} \big(Z_j - \alpha M^*_j + \theta \xi_j \big(Y^*_j - Z_j + A^*_j\big)\big)\Big], \qquad (5.1)$$
for some fixed $\theta \ge 0$, where $(\xi_j)$ are i.i.d. random variables with uniform distribution on $[-1, 1]$. Note that for this example $Y^*_1 = \max(U, 1)$, $Y^*_2 = U$, and $A^*_0 = A^*_1 = 0$, $A^*_2 = \max\{U, 1\} - U$ is the non-decreasing predictable process from the Doob decomposition. Moreover, it is possible to compute (5.1) in closed form (though we omit the detailed expressions, which can be conveniently obtained by Mathematica, for instance).

Figure 1 Left panel: objective functions $O_0(\alpha)$ (no randomization), $O_1(\alpha)$ (optimal randomization), and the objective due to a "naive" randomization; right panel: relative standard deviations of the estimators $Z_0(\alpha)$ (without randomization), $Z_1(\alpha)$ (optimal randomization), and $Z^{\text{naive}}(\alpha)$ ("naive" randomization).
In Figure 1 (left panel) we have plotted (5.1) for $\theta = 0$ and $\theta = 1$, together with the objective function
$$O^{\text{naive}}(\alpha) := \widetilde{E}\Big[\max_{0 \le j \le 2} \big(Z_j - \alpha M^*_j + \xi_j\big)\Big],$$
due to a "naive" randomization not based on knowledge of the factor $Y^*_j - Z_j + A^*_j$. Also, in Figure 1 (right panel), the relative standard deviations $\sqrt{\mathrm{Var}(\cdot)}/Y^*_0$ of the corresponding random variables
$$Z_\theta(\alpha) := \max_{0 \le j \le 2} \big(Z_j - \alpha M^*_j + \theta \xi_j \big(Y^*_j - Z_j + A^*_j\big)\big), \quad \theta = 0, 1,$$
and
$$Z^{\text{naive}}(\alpha) := \max_{0 \le j \le 2} \big(Z_j - \alpha M^*_j + \xi_j\big),$$
are depicted as functions of $\alpha$. From [13, Section 8] we know, and from the plot of $O_0(\alpha)$ in Figure 1 (left panel) we see, that $M(\alpha) \in \mathcal{M}^\circ$ for all $\alpha$ in a whole interval around $\alpha = 1$. On the other hand, the right panel plot shows that $\mathrm{Var}(Z_0(\alpha))$ may be relatively large for $\alpha \ne 1$, and that the Doob martingale (i.e. $\alpha = 1$) is the only surely optimal one in our parametric family. Moreover, the objective function due to the optimal randomization attains its unique minimum at the Doob martingale, i.e. for $\alpha = 1$. Further, the variance of the corresponding optimally randomized estimator attains its unique minimum, zero, also at $\alpha = 1$. Let us note that these observations are anticipated by Theorem 3.2 and Corollary 3.3: the catch is that for each $\alpha \ne 1$ the randomized $M(\alpha)$ fails to be optimal, in the sense of (3.6). We also see that both the optimal and the "naive" randomization render the minimization problem strictly convex. Moreover, while the minimum due to the "naive" randomization lies significantly above the true solution, the argument where the minimum is attained, say $\alpha^\circ$, identifies nonetheless a martingale that virtually coincides with the Doob optimal one. That is, $\alpha^\circ \approx 1$ and $M(\alpha^\circ)$ is optimal, corresponding to variance $\mathrm{Var}(Z_0(\alpha^\circ)) \approx 0$, which can be seen in the right panel.
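The separating effect of the optimal randomization in this example can also be seen by a small simulation (a sketch of our own, using the reconstruction $Z = (0, 1, U)$ with $U$ uniform on $[0, 2]$ observable at time $1$, and $M(\alpha) = \alpha M^*$):

```python
import numpy as np

# Monte Carlo version of the objective (5.1) for the two-period example.
rng = np.random.default_rng(2)
n = 400_000
U = rng.uniform(0.0, 2.0, size=n)
xi = rng.uniform(-1.0, 1.0, size=(n, 3))     # i.i.d. uniform on [-1, 1]

Mstar = np.maximum(1.0, U) - 1.25            # M*_1 = M*_2
coeff = np.column_stack([np.full(n, 1.25),            # Y_0 - Z_0 + A_0
                         np.maximum(1.0, U) - 1.0,    # Y_1 - Z_1 + A_1
                         np.maximum(1.0 - U, 0.0)])   # Y_2 - Z_2 + A_2

def O_theta(alpha, theta):
    """Pathwise values of (5.1); their mean is the objective."""
    Z_minus_M = np.column_stack([np.zeros(n),
                                 1.0 - alpha * Mstar,
                                 U - alpha * Mstar])
    return np.max(Z_minus_M + theta * xi * coeff, axis=1)
```

Here `O_theta(1.0, 1.0)` is identically $5/4$ path by path (the randomized Doob martingale stays surely optimal), whereas `O_theta(0.0, 1.0).mean()` exceeds $5/4$, in line with (3.6), although `O_theta(0.0, 0.0).mean()` is close to $5/4$, i.e. $M = 0$ is weakly optimal at $0$ in this example.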
Let again $J = 2$, and specify the (discounted) cash-flows $Z_j$ as functions of the (discounted) stock prices $S_j$ by
$$Z_0 = 0, \quad Z_1 = (S_1 - \kappa_1)^+, \quad Z_2 = (S_2 - \kappa_2)^+. \qquad (5.2)$$
For $S$ we take the Black-Scholes model
$$S_j = S_0 \exp\Big(-\frac{\sigma^2 j}{2} + \sigma W_j\Big), \quad j = 0, 1, 2, \qquad (5.3)$$
where $W_1 \sim N(0, 1)$ and $W_{1,2} := W_2 - W_1 \sim N(0, 1)$, independent of $W_1$. As such we have a stylized example of a Bermudan call option under a Black-Scholes model with two (non-trivial) exercise dates if $\kappa_2 > \kappa_1 \ge 0$. Note that usually a Bermudan call is considered for a fixed strike and a dividend paying stock, yielding a non-trivial optimal stopping time. Though increasing strikes here look somewhat unusual, this setting is simple for presentation while, mathematically, the effect is the same as for a dividend paying stock and a fixed strike. For the continuation function at $j = 1$ we thus have
$$C(W_1) = E^{W_1}\Big[\big(S_0 \exp(-\sigma^2 + \sigma W_2) - \kappa_2\big)^+\Big] = \int \big(S_0 \exp(-\sigma^2 + \sigma W_1 + \sigma z) - \kappa_2\big)^+ \varphi(z)\, dz, \qquad (5.4)$$
where $\varphi(z) = (2\pi)^{-1/2} \exp(-z^2/2)$ is the standard normal density. While abusing notation a bit, we will denote the cash-flows by $Z_1(W_1)$ and $Z_2(W_2) = Z_2(W_1, W_{1,2})$, respectively. For the (discounted) option value at $j = 0$ one thus has
$$Y^*_0 = E\big[\max\big(Z_1(W_1), C(W_1)\big)\big] = \int \max\Big(\big(S_0 \exp(-\sigma^2/2 + \sigma z) - \kappa_1\big)^+,\ C(z)\Big) \varphi(z)\, dz.$$
Further we obviously have $Y^*_1(W_1) = \max\big(Z_1(W_1), C(W_1)\big)$ and $Y^*_2(W_2) = Z_2(W_2) = Z_2(W_1, W_{1,2})$. The Doob martingale for this example is thus given by
$$M^*_0 = 0, \quad M^*_1 = Y^*_1(W_1) - Y^*_0, \quad M^*_2 - M^*_1 = Z_2(W_1, W_{1,2}) - C(W_1),$$
and the non-decreasing predictable component $A^*$ is given by $A^*_0 = A^*_1 = 0$, $A^*_2 = Y^*_1(W_1) - C(W_1)$.
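The continuation value (5.4) admits a Black-type closed form (used below). As a quick numerical sanity check, the closed form can be compared against direct integration; the parameter values in the sketch are our own illustrative choices, not the paper's (Pa1)/(Pa2):

```python
import math
import numpy as np

S0, sigma, kappa2 = 2.0, 0.5, 2.5   # illustrative values only

def C_integral(w1):
    """C(w1) = int (S0*exp(-sigma^2 + sigma*(w1 + z)) - kappa2)_+ phi(z) dz,
    computed by brute-force quadrature on a fine grid, cf. (5.4)."""
    z = np.linspace(-10.0, 10.0, 200_001)
    payoff = np.maximum(S0 * np.exp(-sigma**2 + sigma * (w1 + z)) - kappa2, 0.0)
    phi = np.exp(-z**2 / 2) / math.sqrt(2 * math.pi)
    return float(np.sum(payoff * phi) * (z[1] - z[0]))

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def C_black(w1):
    """Black-type closed form: C(w1) = S1*N(d1) - kappa2*N(d1 - sigma),
    with S1 = S0*exp(-sigma^2/2 + sigma*w1), d1 = w1 + ln(S0/kappa2)/sigma."""
    S1 = S0 * math.exp(-sigma**2 / 2 + sigma * w1)
    d1 = w1 + math.log(S0 / kappa2) / sigma
    return S1 * norm_cdf(d1) - kappa2 * norm_cdf(d1 - sigma)
```

Both evaluations agree to high accuracy, which confirms the identity $d_1 = W_1 + \sigma^{-1}\ln(S_0/\kappa_2)$ appearing in the formula below.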
For demonstration purposes we will quasi-analytically compute the optimal randomization coefficient in (3.7),
$$Y^*_j - Z_j + A^*_j = \begin{cases} Y^*_0, & j = 0, \\ \big(C(W_1) - Z_1(W_1)\big)^+, & j = 1, \\ \big(Z_1(W_1) - C(W_1)\big)^+, & j = 2, \end{cases}$$
by using a Black(-Scholes) type formula
$$C(W_1) = S_0 \exp\Big(-\frac{\sigma^2}{2} + \sigma W_1\Big)\, N\Big(W_1 + \frac{1}{\sigma} \ln \frac{S_0}{\kappa_2}\Big) - \kappa_2\, N\Big(W_1 + \frac{1}{\sigma} \ln \frac{S_0}{\kappa_2} - \sigma\Big),$$
and a numerical integration for obtaining the target value $Y^*_0$. We now consider two martingale families.

(M-Sty) For any $\alpha = (\alpha_1, \alpha_2, \alpha_3, \alpha_4)$ we set
$$M^{\text{sty}}_1(\alpha, W_1) := \alpha_1 \big(Y^*_1(W_1) - Y^*_0 - W_1\big) + \alpha_2 W_1, \qquad (5.5)$$
$$M^{\text{sty}}_2(\alpha, W_2) := M^{\text{sty}}_1(\alpha, W_1) + \alpha_3 \big(Z_2(W_1, W_{1,2}) - C(W_1) - W_{1,2}\big) + \alpha_4 W_{1,2}.$$
Note that $M^{\text{sty}}((1,1,1,1), W) = M^*(W)$.

(M-Hermite) Using that the (probabilistic) Hermite polynomials, given by
$$\mathrm{He}_k(x) = (-1)^k e^{x^2/2} \Big(\frac{d}{dx}\Big)^k e^{-x^2/2}, \quad k = 0, 1, 2, \ldots,$$
are orthogonal with respect to the standard Gaussian density, we consider the martingale family
$$M^H_1(\alpha, W_1) = \sum_{k=1}^{K} \alpha_{1,k}\, \mathrm{He}_k(W_1), \qquad (5.6)$$
$$M^H_2(\alpha, W_2) = M^H_1(\alpha, W_1) + \sum_{k=0}^{K} \sum_{l=1}^{L} \alpha_{2,k,l}\, \mathrm{He}_k(W_1)\, \mathrm{He}_l(W_{1,2}),$$
with the obvious definition of $\alpha \in \mathbb{R}^K \oplus \mathbb{R}^{(K+1) \times L}$ (note that $\mathrm{He}_0 \equiv 1$). Since our mere goal is to exhibit the effect of randomization, for the examples below we restrict ourselves to the choice $K = L = 3$.

The parameters in (5.2) and (5.3) are taken such that optimal exercise takes place at $j = 1$ with a medial probability. In particular, we consider two cases, specified by the parameter sets
(Pa1): $S_0 = 2$, $\sigma = 1/3$, $\kappa_1 = 2$, $\kappa_2 = 3$,
(Pa2): $S_0 = 2$, $\sigma = 1/25$, $\kappa_1 = 2$, $\kappa_2 = 5/2$,
with the corresponding target values $Y^*_0$ obtained by numerical integration. From Figure 2 we see that the probability of optimal exercise at $j = 1$ is almost 50% for (Pa1) and almost 30% for (Pa2). Let us visualize, on the basis of martingale family (M-Sty) and parameters (Pa1), the effects of randomization.
Consider the objective function
$$O_\theta(\alpha) := \widetilde{E}\Big[\max_{0 \le j \le 2} \Big(Z_j - M^{\text{sty}}_j(\alpha) + \theta \xi_j \big(Y^*_j - Z_j + A^*_j\big)\Big)\Big], \qquad (5.7)$$
where $\theta$ scales the randomization due to i.i.d. random variables $(\xi_j)$, uniformly distributed on $[-1, 1]$; i.e., for $\theta = 0$ there is no randomization and $\theta = 1$ gives the optimal randomization. Now restrict (5.7) to the subdomain where $\alpha_1 = \alpha_3 =: \alpha^{(1)}$ and $\alpha_2 = \alpha_4 =: \alpha^{(2)}$, and write (while slightly abusing notation) $\alpha = (\alpha^{(1)}, \alpha^{(2)})$. The function $O_0(\alpha^{(1)}, \alpha^{(2)})$, i.e. (5.7) without randomization, is visualized in Figure 3, where expectations are computed quasi-analytically with Mathematica. From this plot we see that the true value $Y^*_0$ is attained on the line $(\alpha^{(1)}, 1)$ for various $\alpha^{(1)}$, i.e. not only at $(1, 1)$. On the other hand, $O_1(\alpha^{(1)}, \alpha^{(2)})$, i.e. (5.7) with optimal randomization, has a clear strict global minimum at $(1, 1)$, see Figure 4. Let us have a closer look at the map $\alpha^{(1)} \to O_\theta(\alpha^{(1)}, 1)$ for $\theta = 0$ and $\theta = 1$, respectively, and also at the corresponding "naive" randomization
$$O^{\text{naive}}_{\theta_0}(\alpha^{(1)}, 1) := \widetilde{E}\Big[\max_{0 \le j \le 2} \Big(Z_j - M^{\text{sty}}_j(\alpha^{(1)}, 1) + \theta_0\, \xi_j\Big)\Big],$$
where the scale parameter $\theta_0$ is taken to be roughly the option value. (It turns out that the choice of this scale factor is not critical for the location of the minimum.) In fact, the results, plotted in Figure 5, tell their own tale. The second panel depicts the relative deviation of
$$Z_0(\alpha^{(1)}, 1) := \max_{0 \le j \le 2} \Big(Z_j - M^{\text{sty}}_j(\alpha^{(1)}, 1)\Big).$$
In fact, similar comments as for the example in Section 5.1 apply. The "naive" randomization attains its minimum at an argument $\alpha^{(1)}$ slightly below $1$, which we read off from the tables that generated this figure. We have thus found a martingale $M^{\text{sty}}(\alpha^{(1)}, 1)$ which may be virtually considered surely optimal, as can be seen from the variance plot (second panel).
Analogous visualizations for the parameter set (Pa2), with analogous conclusions, could be given, though they are omitted due to space restrictions.

Let us now pass on to a Monte Carlo setting, where we mimic the approach in real practice more closely. Based on $N$ simulated samples of the underlying asset model, i.e. $S^{(n)}$, $n = 1, \ldots, N$, we consider the minimization
$$\widehat{\alpha}_\theta := \arg\min_\alpha \frac{1}{N} \sum_{n=1}^{N} \Big[\max_{0 \le j \le 2} \Big(Z^{(n)}_j - M^{(n)}_j(\alpha) + \theta \xi^{(n)}_j \big(Y^{*(n)}_j - Z^{(n)}_j + A^{*(n)}_j\big)\Big)\Big] \qquad (5.8)$$
for $\theta = 0$ (no randomization) and $\theta = 1$ (optimal randomization), along with the minimization
$$\widehat{\alpha}_{\theta^{\text{naive}}} := \arg\min_\alpha \frac{1}{N} \sum_{n=1}^{N} \Big[\max_{0 \le j \le 2} \Big(Z^{(n)}_j - M^{(n)}_j(\alpha) + \theta^{\text{naive}}_j \xi^{(n)}_j\Big)\Big] \qquad (5.9)$$
based on a "naive" randomization, where the coefficients $\theta^{\text{naive}}_j$, $j = 0, 1, 2$, are pragmatically chosen. In (5.8) and (5.9), $M$ stands for a generic linearly structured martingale family, such as (5.5) and (5.6) for example. The minimization problems (5.8) and (5.9) may be solved by linear programming (LP). They may be transformed into a suitable form such that the (free) LP package in R can be applied. This transformation procedure is straightforward and spelled out in [7], for example. In the latter paper it is argued that the required computation time scales with $N$ due to the sparse structure of the coefficient matrix involved in the LP setup. However, taking advantage of this sparsity requires a special treatment of the implementation of the linear program in connection with more advanced LP solvers (as done in [7]).
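The LP transformation is the usual epigraph trick: with variables $(\alpha, u_1, \ldots, u_N)$, minimize $N^{-1}\sum_n u_n$ subject to $u_n \ge Z^{(n)}_j - M^{(n)}_j(\alpha)$ for all $n, j$. The paper uses an LP package in R; the following SciPy-based sketch is our own illustration on the two-period example of Section 5.1 with the one-parameter family $M(\alpha) = \alpha M^*$ (no randomization, i.e. $\theta = 0$):

```python
import numpy as np
from scipy.optimize import linprog

# Epigraph/LP form of the sample-mean minimization (5.8) with theta = 0,
# on the example Z = (0, 1, U), U ~ Unif[0, 2], M(alpha) = alpha * M*.
rng = np.random.default_rng(3)
N = 500
U = rng.uniform(0.0, 2.0, size=N)
Z = np.column_stack([np.zeros(N), np.ones(N), U])
Mstar = np.column_stack([np.zeros(N),
                         np.maximum(1.0, U) - 1.25,
                         np.maximum(1.0, U) - 1.25])

# One constraint row per (n, j):  -M_j^(n)*alpha - u_n <= -Z_j^(n).
rows, rhs = [], []
for n in range(N):
    for j in range(3):
        row = np.zeros(N + 1)
        row[0] = -Mstar[n, j]       # coefficient of alpha
        row[1 + n] = -1.0           # coefficient of u_n
        rows.append(row)
        rhs.append(-Z[n, j])

c = np.concatenate([[0.0], np.full(N, 1.0 / N)])   # objective (1/N) sum u_n
res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
              bounds=[(None, None)] * (N + 1), method="highs")
```

Here `res.fun` is the in-sample dual value; it can be at most the value attained by the Doob choice $\alpha = 1$, which is exactly $5/4$ pathwise in this example. The randomized versions (5.8)-(5.9) are obtained by simply adding the respective perturbation terms inside the constraint right-hand sides.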
Since this paper is essentially about the theoretical justification of the randomized duality problem (along with the classification of optimal martingales), we consider an in-depth numerical analysis beyond the scope of this paper. For both parameter sets (Pa1) and (Pa2), and both martingale families (5.5) and (5.6) with $K = L = 3$, we have carried out the LP optimization algorithm sketched above. We have taken $N = 2000$ and, for the "naive" randomization, $\theta^{\mathrm{naive}}_0 \approx 1$ for (Pa1), $\theta^{\mathrm{naive}}_0 \approx 4$ for (Pa2), and simply $\theta^{\mathrm{naive}}_j = 0$ for $j \ge 1$. In Table 1, for (Pa1), and Table 2, for (Pa2), we present for the minimizers $\widehat\alpha_0$, $\widehat\alpha_1$, $\widehat\alpha_{\theta^{\mathrm{naive}}}$ the in-sample expectation $\widehat m$, the in-sample standard error $\widehat\sigma/\sqrt{N}$, and the standard deviation $\widehat\sigma$ of the pathwise maximum due to a single trajectory, followed by the corresponding "true" values $m_{\mathrm{test}}$, $\sigma_{\mathrm{test}}/\sqrt{N_{\mathrm{test}}}$, $\sigma_{\mathrm{test}}$, based on a large independent "test" simulation of $N_{\mathrm{test}}$ samples.

Table 1 LP minimization results due to $M^{sty}$ and $M^{H}$ for (Pa1)

Table 2 LP minimization results due to $M^{sty}$ and $M^{H}$ for (Pa2)

The results in Tables 1–2 show that even a simple (naive) randomization at $j = 0$ leads to a substantial variance reduction, not only on the training samples but also on the test ones. We think that for more structured examples and more complex families of martingales an even more pronounced variance reduction effect may be expected. For example, in general it might be better to take Wiener integrals, i.e.
objects of the form $\int \alpha(t,X_t)\,dW_t$, where $\alpha$ runs through some linear space of basis functions, as building blocks for the martingale family. Also other types of randomization can be used; for example, one may take different distributions for the random variable $\xi$. All these issues will be analyzed in a subsequent study.

Figure 2 Cash-flow $Z$ versus continuation value $C$ as a function of $W$ for (Pa1) (left) and (Pa2) (right)

Figure 3 Objective function for BS-Call (Pa1) without randomization as a function of $(\alpha,\widetilde\alpha)$

Figure 4 Objective function for BS-Call (Pa1) with optimal randomization as a function of $(\alpha,\widetilde\alpha)$

Figure 5 Left panel: objective functions of $\alpha$, with $\widetilde\alpha = 1$ fixed, for BS-Call (Pa1) without, with optimal, and with "naive" randomization; right panel: relative deviation of $\mathsf{Z}(\alpha,1)$ (i.e. without randomization)

References

1. Leif Andersen and Mark Broadie. A primal-dual simulation algorithm for pricing multi-dimensional American options. Management Science, 50(9):1222–1234, 2004.
2. Denis Belomestny. Solving optimal stopping problems via empirical dual optimization. The Annals of Applied Probability, 23(5):1988–2019, 2013.
3. Denis Belomestny, Christian Bender, and John Schoenmakers. True upper bounds for Bermudan products via non-nested Monte Carlo. Math. Finance, 19(1):53–71, 2009.
4. Denis Belomestny, Roland Hildebrand, and John Schoenmakers. Optimal stopping via pathwise dual empirical maximisation. Appl. Math. Optim., 79(3):715–741, 2019.
5. M. Broadie and P. Glasserman. A stochastic mesh method for pricing high-dimensional American options. Journal of Computational Finance, 7(4):35–72, 2004.
6. M.H.A. Davis and I. Karatzas. A deterministic approach to optimal stopping. In Kelly, F.P.
(ed.), Probability, Statistics and Optimisation: A Tribute to Peter Whittle, Wiley Series in Probability and Mathematical Statistics, pages 455–466. Wiley, Chichester, 1994.
7. V.V. Desai, V.F. Farias, and C.C. Moallemi. Pathwise optimization for optimal stopping problems. Management Science, 58(12):2292–2308, 2012.
8. Paul Glasserman. Monte Carlo Methods in Financial Engineering, volume 53. Springer Science & Business Media, 2003.
9. Martin Haugh and Leonid Kogan. Pricing American options: a duality approach. Oper. Res., 52(2):258–270, 2004.
10. Anastasia Kolodko and John Schoenmakers. Iterative construction of the optimal Bermudan stopping time. Finance Stoch., 10(1):27–49, 2006.
11. Francis A. Longstaff and Eduardo S. Schwartz. Valuing American options by simulation: a simple least-squares approach. Review of Financial Studies, 14(1):113–147, 2001.
12. Leonard C.G. Rogers. Monte Carlo valuation of American options. Mathematical Finance, 12(3):271–286, 2002.
13. John Schoenmakers, Jianing Zhang, and Junbo Huang. Optimal dual martingales, their analysis, and application to new algorithms for Bermudan products. SIAM J. Financial Math., 4(1):86–116, 2013.
14. J. Tsitsiklis and B. Van Roy. Regression methods for pricing complex American-style options. IEEE Transactions on Neural Networks, 12(4):694–703, 2001.