Reward Design in Risk-Taking Contests
Marcel Nutz ∗ Yuchong Zhang † February 9, 2021
Abstract
Following the risk-taking model of Seel and Strack, n players decide when to stop privately observed Brownian motions with drift and absorption at zero. They are then ranked according to their level of stopping and paid a rank-dependent reward. We study the problem of a principal who aims to induce a desirable equilibrium performance of the players by choosing how much reward is attributed to each rank. Specifically, we determine optimal reward schemes for principals interested in the average performance and the performance at a given rank. While the former can be related to reward inequality in the Lorenz sense, the latter can have a surprising shape.

Keywords
Stochastic contest; Stackelberg game; optimal stopping
AMS 2020 Subject Classification
We consider the Seel–Strack model [16] of risk-taking under private information and relative performance pay: n players decide when to stop privately observed, i.i.d. Brownian motions with drift. As the processes are absorbed at the origin, players risk bankruptcy by gambling longer, and this risk represents a cost for stopping later. Once all players have stopped, they are rewarded according to their relative ranks. Seel and Strack focus on a winner-takes-all game, meaning that only the top-ranked player receives a reward and the players' problem boils down to maximizing the probability of winning. Here, we consider arbitrary reward schemes where subsequent ranks may also receive payments. For instance, a hedge fund may compensate managers according to their rank, giving smaller bonuses also to the second and third-best performers, or even to all managers. Or, a firm may decide on promotions and terminations based on relative performance. The game admits a unique Nash equilibrium for any reward scheme.

A main result of Seel and Strack was that their contest is an inappropriate compensation scheme for firms because even a small negative drift can lead to large losses in the performance of an average manager: as the players care only about their relative ranking and not the absolute level of stopping, the winner-takes-all design induces risk-seeking behavior, and the associated extended gambling implies that the drift takes a significant toll on the average performance. This observation is a motivation for our investigation: how should a principal allocate a given reward budget over the ranks in order to incentivize a desirable performance (stopping level) by the agents in equilibrium? This Stackelberg game is studied for several objective functions. Mathematically, reward inequality in the sense of Lorenz order leads to a single-crossing property of the stopping distributions which drives several of our results.
∗ Departments of Statistics and Mathematics, Columbia University, New York, USA, [email protected]. Research supported by an Alfred P. Sloan Fellowship and NSF Grant DMS-1812661.
† Department of Statistical Sciences, University of Toronto, Canada, [email protected]. Research supported by NSERC Discovery Grant RGPIN-2020-06290.

First, we show that a principal deriving utility from the performance of the average player can use the reward design to align agents' risk preferences with her own, under suitable market conditions. Under negative drift, a risk-averse principal benefits from a more equal compensation scheme. Indeed, this alleviates the issue raised in [16]: as players are less incentivized to gamble and stop sooner, their performance suffers less from the declining market. While, as in [16], the largest losses still occur for small negative values of the drift, their magnitude is greatly reduced. Under positive drift, there is a trade-off between risk aversion and benefit from mean return, which results in an ambiguous comparison.

Second, we study a principal maximizing the expected performance of the first-ranked player. For instance, a firm launching a competition for a novel product design or architecture project may be interested in the winning submission (that will be realized later on) rather than the average. The performance of the first-ranked player is shown to be monotone in Lorenz order for any market condition, and as a result, the winner-takes-all scheme is always optimal. Intuitively, this principal reaps outsized benefits from higher variance in the performance distribution which outweigh possible losses from a negative drift over time.

Third, we consider a principal maximizing the expected performance at the k-th rank, where 1 < k ≤ n − 1. As an example, consider a platform linking buyers and sellers in sealed-bid, second-price auctions (as common e.g. in online advertising).
If the platform receives a percentage of the price paid (i.e., the second-highest bid) and develops a reward program for bidders based on ranks, how should a given budget be distributed? A first guess may be to give equal rewards to the first two ranks. More generally, we may consider the cut-off scheme at rank j, which allocates equal rewards to the first j ranks and nothing to the rest; for instance, a company distinguishing franchises with a top-ten award or promoting its five best-performing employees (or terminating the worst-performing employees, as relevant to the fund industry [10]). The performance at the k-th rank turns out to be more subtle than the first rank. Indeed, the benefits from variance decline as k increases, and other effects come into play. Under zero drift, a cut-off at rank 2 is optimal for the second-rank performance, but this result does not extend to larger k: while a cut-off is still optimal, it can be preferable to attribute rewards beyond the k-th rank. For example, when n = 10, the performance of the median player (k = 5) is optimized by paying equal rewards to the first 7 ranks. For positive drift, cut-off schemes are again optimal, whereas for negative drift, the optimal scheme may also pay an intermediate amount.

The winner-takes-all contest of [16] has been extended in several directions, including more general diffusion processes [6], random initial laws [7], heterogeneous loss constraints [15] and a behavioral model [8] where losers may be penalized if they (a posteriori) missed an opportunity to win. A different model [17] has no bankruptcy condition but instead postulates a flow cost that is charged until stopping. Rank-order prize allocations have been studied extensively for static games; see [18, Chapter 3] for an introduction and related literature. In the game of [4], players independently choose any distribution on R_+ subject to an upper bound on the mean and receive rank-based rewards according to their realization.
The authors establish existence and uniqueness of an equilibrium and show, among other comparative statics, that reward inequality leads to greater dispersion of the equilibrium distribution in the sense of convex order. In a different but related model with convex effort costs [5], reward inequality is shown to decrease efforts. The authors discuss the implications of this "discouragement effect" in numerous areas such as managerial compensation, employee promotion, grading and admissions in higher education. Many of their conclusions are also relevant to the present paper.

Via Skorokhod's embedding theorem, the game of [4] is equivalent to the present timing game in the case of driftless Brownian motion. When the drift is nonzero, a monotone transformation can be used to identify equilibria with the driftless case. As rewards only depend on ranks and ranks are preserved by the transformation, this immediately implies the existence and uniqueness of an equilibrium. On the other hand, comparative statics that are not invariant under monotone transformations may differ; for instance, the aforementioned result on dispersion does not hold for positive drift (Example 3.4). The main difference with the present study, however, is our focus on a principal designing the reward. To the best of our knowledge, performance at a given rank has not been studied in these games.

In a dynamic Poissonian game where players control the jump intensity and are ranked according to their jump times, [12] shows that the expected jump time of the k-th ranked player is minimized by a reward scheme which pays nothing to the ranks below k. The amounts paid to ranks 1, …, k are positive and strictly concave; in particular, they are not equal. In the mean field game limit with an infinite number of competing players, the effect of reward inequality and several contest design problems are analyzed in [3] and [2].
Here players exert effort to maximize rewards based on the ranking of their terminal position and completion time of drifted Brownian motions, respectively, but analytical results are not available for the associated finite-player games.

Following this Introduction, Section 2 details the model and the equilibrium for a given reward scheme, whereas Section 3 studies the optimal reward design.

We fix the number n ≥ 2 of players. For 1 ≤ i ≤ n, consider a diffusion

X^i_t = x_0 + µt + σW^i_t

with absorption at x = 0. The parameters x_0, σ ∈ (0, ∞) and µ ∈ R are common among all players, whereas the standard Brownian motions W^i are independent. Each player i observes only her own diffusion and chooses a possibly randomized stopping time τ_i < ∞. The players are then ranked according to the level X^i_{τ_i} at which they stopped, with ties split uniformly at random. The player with rank k is given a reward R_k. These prizes are deterministic and ordered, R_1 ≥ R_2 ≥ · · · ≥ R_n ≥ 0, with R_1 > R_n to exclude the constant case where any profile of stopping times is an equilibrium. We denote the total reward by R_tot := Σ_{k=1}^n R_k and the average reward by R̄ := R_tot/n.

A given stopping time τ_i leads to a distribution F = Law(X^i_{τ_i}) for the position at stopping. The set 𝓕 of distributions that are feasible in this sense is readily characterized through Skorokhod's embedding theorem, as observed in [16].

Lemma 2.1.
The set 𝓕 consists of all distributions F supported on [0, ∞) satisfying ∫ h(x) F(dx) = 1 if µ > 0 and ∫ h(x) F(dx) ≤ 1 if µ ≤ 0, respectively, where h is the normalized scale function

h(x) = [exp(−2µx/σ²) − 1] / [exp(−2µx_0/σ²) − 1],  µ ≠ 0;
h(x) = x/x_0,  µ = 0.  (2.1)

This result goes back to [9]; see [14, Section 9] for a systematic derivation and background. (The extension to the present case with absorbing boundary is immediate.) We say that F ∈ 𝓕 is an equilibrium distribution if, for i.i.d. stopping levels X^i_{τ_i} ∼ F, no player is incentivized to choose a different stopping time (or equivalently, a different distribution in 𝓕).

The equilibrium can be motivated through an ansatz as follows. We guess that there is an equilibrium F with no atoms and support [0, x̄] for some 0 < x̄ < ∞. For 0 ≤ x ≤ x̄, let u(x) be the expected reward of player 1 (say) for stopping at level x, given that all other players stop according to F. As F is atomless, x = x̄ leads to the first rank with probability one, hence u(x̄) = R_1. Similarly, u(0) = R_n, and symmetry suggests that u(x_0) = R̄. More generally, we guess that in equilibrium, player 1 is indifferent between all stopping times τ ≤ τ̄, where τ̄ is the first exit time from [0, x̄]. This translates to the condition that u(X) is a martingale as long as X stays within (0, x̄). If u is smooth on (0, x̄), it follows via Itô's formula that µu′(x) + (σ²/2)u″(x) = 0 on that interval. For µ ≠ 0, the unique function satisfying all these conditions is

u(x) = (R̄ − R_n) [exp(−2µx/σ²) − 1] / [exp(−2µx_0/σ²) − 1] + R_n,  0 ≤ x ≤ x̄,  (2.2)

where x̄ is determined via u(x̄) = R_1 to be

x̄ = −(σ²/(2µ)) log{ (R_1 − R_n)/(R̄ − R_n) · [exp(−2µx_0/σ²) − 1] + 1 }.  (2.3)

More precisely, this expression is finite (and strictly positive) when µ < µ̄, where µ̄ > 0 is defined by setting the argument of the above logarithm to zero,

µ̄ = (σ²/(2x_0)) log( (R_1 − R_n)/(R_1 − R̄) ).
(2.4)

The restriction µ < µ̄ is a standing assumption. It ensures that players stop in finite time; in particular, the ranking is well-defined. In the driftless case µ = 0, the above simplifies to

u(x) = (R̄ − R_n) x/x_0 + R_n,  x̄ = (R_1 − R_n)/(R̄ − R_n) · x_0.  (2.5)

Let p_k(x) be the probability that stopping at level x leads to rank k in equilibrium; in other words, the probability that, of the n − 1 other i.i.d. variables X^i_{τ_i} ∼ F, exactly k − 1 are (strictly) larger than x, or

p_k(x) = C(n−1, k−1) F(x)^{n−k} (1 − F(x))^{k−1},

where C(n−1, k−1) denotes the binomial coefficient. In equilibrium we must have u(x) = Σ_{k=1}^n R_k p_k(x). This right-hand side is of the form g(F(x)), and the following allows us to define F by inverting g.

Lemma 2.2.
The function g : [0, 1] → [R_n, R_1],

g(y) = Σ_{k=1}^n R_k C(n−1, k−1) y^{n−k} (1 − y)^{k−1},

is strictly increasing, hence invertible on [R_n, R_1] = [u(0), u(x̄)]. Define

F(x) = g^{-1}(u(x)),  0 ≤ x ≤ x̄,  (2.6)

as well as F(x) = 0 for x < 0 and F(x) = 1 for x > x̄. Then F is the cdf of an atomless distribution with support [0, x̄] whose density f is strictly positive on (0, x̄). Moreover, F ∈ 𝓕.

Here and below, we use the same symbol to denote the measure and its cdf. The stated properties of g follow from the observation that g(y) is the expected reward for stopping at y in the game where the other n − 1 players stop according to a uniform distribution on [0, 1]. A direct computation shows ∫ h dF = 1, so that F ∈ 𝓕 is guaranteed by Lemma 2.1.

The construction implies that F is indeed an equilibrium: If players 2, …, n have stopping distribution F, then u is the value function for player 1; in particular, player 1 can attain an expected reward of R̄ by choosing F as well. If τ is any stopping time (possibly randomized), Itô's formula and the fact that X := X¹ is absorbed at 0 imply that u(X_t) is a nonnegative supermartingale, and in particular E[u(X_τ)] ≤ u(x_0) = R̄. Hence, player 1 has no incentive to deviate from F, showing that F is an equilibrium.

Proposition 2.3.
Let u, x̄, µ̄, F be defined as in (2.2)–(2.6) and µ < µ̄. There exists a unique equilibrium, given by the distribution F, and u is the corresponding equilibrium value function.

Proof. In the case µ = 0, Lemma 2.1 shows that the game is equivalent to the static, capacity-constrained game of [4], where players choose among all distributions F on R_+ with ∫ x dF ≤ x_0. Existence and uniqueness is established in [4, Theorem 1]. If µ ≠ 0, using the fact that the reward is based solely on the rank as well as µ < µ̄, we see that F is an equilibrium if and only if F̃ := F ∘ h^{-1} is an equilibrium of the game with µ = 0, and the proposition follows.

Remark 2.4. (a) The value function u depends on the minimal, maximal, and average reward, but not on the further details of the reward vector R. By contrast, the equilibrium distribution depends on all rewards R_k. More precisely, there are n − 2 degrees of freedom in R that can affect F. Indeed, we could have assumed R_n = 0 without loss of generality: subtracting a constant c from all the R_k will change u into u − c and g into g − c, whereas the equilibrium distribution F is unchanged. Moreover, one can normalize the average (or the total) reward: replacing R by λR for λ > 0 changes u into λu but leaves F invariant.

(b) We have assumed that agents are risk-neutral wrt. the reward. This entails no loss of generality: if agents optimize a utility function U of the reward, we can treat R̃_k := U(R_k) as an auxiliary reward and agents as risk-neutral wrt. R̃.

(c) As F is atomless, ties and bankruptcies almost surely do not occur.

We now study how the reward scheme influences the equilibrium stopping distribution and thus the players' levels of stopping, also called their performance in what follows. While players only care about their rank, a principal interested in the performance of one or more players may optimize the reward scheme such as to induce a desirable performance. Throughout, we normalize R_n = 0 and vary R_1, …, R_{n−1} while keeping the total reward Σ_{i=1}^n R_i = 1 constant; cf. Remark 2.4(a). The standing assumption µ < µ̄, cf. (2.4), is in force for all reward schemes under discussion. This assumption is most stringent for the winner-takes-all scheme (R_i = 0 for i > 1), where it reads

µ < (σ²/(2x_0)) log( n/(n−1) ).  (3.1)

We identify two notions that are crucial for this discussion. First, the Lorenz order, which is a well-known measure of inequality in economics [1]. Given two reward vectors R and R̃ with the same total reward, R̃ exhibits less inequality than R in Lorenz order, or R̃ ≤_L R, if

Σ_{i=1}^k R̃_i ≤ Σ_{i=1}^k R_i  for k = 1, …, n.

Among all normalized reward vectors, the winner-takes-all scheme is the largest in Lorenz order whereas the uniform reward (R_1 = · · · = R_{n−1}) is the smallest. The upper bound x̄_R of the support of the equilibrium distribution F corresponding to R, cf. (2.3), is increasing in R_1. Hence, R̃ ≤_L R implies x̄_R ≥ x̄_{R̃}, so that F and F̃ (corresponding to R̃) are both concentrated on (0, x̄_R).

The second notion refers to the equilibrium distribution. Given two cdfs F and F̃, we say that F̃ is strictly single crossing wrt. F if there are a < x* < b with F(a) = F̃(a) = 0 and F(b) = F̃(b) = 1 as well as F̃ < F on (a, x*) and F̃ > F on (x*, b). Where it is useful to be more explicit, we say that the functions are strictly single crossing on (a, b) with crossing point x*. In words, F̃ − F crosses the horizontal axis exactly once, in an increasing fashion, in an interval supporting both distributions. It means that as F is transformed into F̃, a nontrivial part of the mass below x* is transported above x*, thus reflecting an upward mobility (in terms of level of stopping) inside the population of players. In addition to this direct interpretation, the following theorem is also a tool for proving several of the results below.

Theorem 3.1.
Let R, R̃ be distinct reward vectors and F, F̃ the corresponding equilibrium distributions. If R̃ ≤_L R, then F̃ is strictly single crossing wrt. F.
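Theorem 3.1 lends itself to a quick numerical check. In the driftless case, F = g^{-1}(u) with u strictly increasing, so F̃ − F changes sign exactly where g − g̃ does; counting sign changes of g̃ − g on a grid therefore verifies the single crossing. Below is a minimal sketch; the helper names are ours, not from the paper.

```python
import math

def lorenz_leq(R_tilde, R, tol=1e-12):
    # R_tilde <=_L R: equal totals, and every top-k partial sum of R_tilde
    # is at most the corresponding partial sum of R.
    assert abs(sum(R_tilde) - sum(R)) < tol, "totals must agree"
    s_t = s = 0.0
    for a, b in zip(R_tilde, R):
        s_t, s = s_t + a, s + b
        if s_t > s + tol:
            return False
    return True

def sign_changes(R_tilde, R, grid=2000):
    # Counts sign changes of (g_tilde - g) on (0, 1), with g as in Lemma 2.2.
    n = len(R)

    def g(y, Rv):
        # Expected reward for stopping at quantile y against uniform opponents.
        return sum(Rv[k] * math.comb(n - 1, k) * y**(n - 1 - k) * (1 - y)**k
                   for k in range(n))

    changes, last = 0, 0.0
    for i in range(1, grid):
        y = i / grid
        d = g(y, R_tilde) - g(y, R)
        if d != 0.0:
            if last != 0.0 and (d > 0) != (last > 0):
                changes += 1
            last = d
    return changes
```

For instance, with R = (1, 0, 0, 0) and R̃ = (1/2, 1/2, 0, 0) one finds R̃ ≤_L R and exactly one sign change, consistent with the theorem.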
Figure 1: Single crossing property of the equilibrium cdfs F^(i) corresponding to rewards R^(1) ≥_L · · · ≥_L R^(5), where R^(1) = (1, 0, 0, 0), R^(2) = (2/3, 1/3, 0, 0), R^(3) = (1/2, 1/2, 0, 0), R^(4) = (1/2, 1/3, 1/6, 0), R^(5) = (1/2, 1/4, 1/4, 0). Here the drift µ is negative, σ = 1 and x_0 = 100. For i = 1, 2, 3, the schemes R^(i) only differ in the first two ranks and then the F^(i) intersect at a common point. Similarly for i = 3, 4, 5, where the schemes differ in the second and third ranks. The distributions for i = 3, 4, 5 have the same support; cf. (2.3).

Proof.
Following Hardy, Littlewood and Pólya (see [11]), the first step is to observe the result in the special case when the rewards differ only at two ranks: Fix 1 ≤ i < j ≤ n and consider reward vectors R, R^δ where R^δ_j = R_j + δ and R^δ_i = R_i − δ and R^δ_k = R_k for k ≠ i, j. Let F, F^δ be the corresponding equilibrium distributions. Then for δ > 0, F^δ is strictly single crossing with respect to F on (0, x̄_R). Indeed, δ ↦ F^δ(x) is strictly decreasing for x ∈ (0, x*) and strictly increasing for x ∈ (x*, x̄_R), for a suitable x*. This can be shown by direct arguments, or one may combine the result of [4, Lemma 9] for capacity-constrained games with the transformation mentioned in the proof of Proposition 2.3.

Second, we observe that the change from R to R̃ can be decomposed into a finite sequence R^(0), …, R^(N) of such two-rank transformations, where R^(0) = R and R^(N) = R̃. This is easily seen by induction (see [11, Lemma B.1, p. 32] for a detailed proof). If the single crossing property were transitive, Theorem 3.1 would be a direct consequence. It is not transitive, of course, but a careful argument is nevertheless successful.

Let F_k be the equilibrium distribution induced by R^(k). By the above, F_k is strictly single crossing with respect to F_{k−1}. Let x_k denote the crossing point, x_min := min_{1≤k≤N} x_k and x_max := max_{1≤k≤N} x_k; then 0 < x_min ≤ x_max < x̄_R. For x ∈ (0, x_min), the pairwise strict single crossing property implies F_k(x) < F_{k−1}(x) for all k, hence F̃(x) < F(x). A similar argument shows that F̃(x) > F(x) for x ∈ (x_max, x̄_R). Thus, by continuity, F̃ − F must cross zero from below at least once in (x_min, x_max) ⊂ (0, x̄_R).

It remains to show that the zero of F̃ − F in (0, x̄_R) is unique. To this end, let x* ∈ (0, x̄_R) be a zero of F̃ − F and y* = F(x*) = F̃(x*). As F has a positive density on (0, x̄_R), it suffices to show the uniqueness of y*.
Note that F̃(x*) = F(x*) < F(x̄_R) = 1 implies x* < x̄_{R̃}. Since R and R̃ have the same average and the same minimal reward, we see that g(F(x)) = u(x) = ũ(x) = g̃(F̃(x)) on [0, x̄_{R̃}]. Setting x = x* yields (g̃ − g)(y*) = 0; that is, y* must be a zero of g̃ − g in (0, 1).

Write R̃ − R = Σ_{(i,j)} δ_{i,j} (e_j − e_i), where e_i is the i-th basis vector and each term in the finite sum represents an inequality-reducing transformation changing the reward at two ranks: the amount δ_{i,j} > 0 is moved from the i-th place to the j-th place, where i < j. Let P_k(y) be the probability of winning rank k at location y ∈ [0, 1] if the n − 1 other random variables are i.i.d. and uniform on [0, 1]. Then

(g̃ − g)(y) = Σ_{k=1}^n (R̃_k − R_k) P_k(y) = Σ_{(i,j)} δ_{i,j} (P_j(y) − P_i(y))
  = Σ_{(i,j)} δ_{i,j} [ C(n−1, j−1) y^{n−j} (1−y)^{j−1} − C(n−1, i−1) y^{n−i} (1−y)^{i−1} ]
  = Σ_{(i,j)} δ_{i,j} y^{n−j} (1−y)^{i−1} [ C(n−1, j−1) (1−y)^{j−i} − C(n−1, i−1) y^{j−i} ].

Writing G_{i,j}(y) for the expression in square brackets, (g̃ − g)(y*) = 0 and 0 < y* < 1 imply, after dividing by (y*)^{n−1},

Σ_{(i,j)} δ_{i,j} [ (1−y*)^{i−1} / (y*)^{j−1} ] G_{i,j}(y*) = 0.

Both G_{i,j}(y) and (1−y)^{i−1}/y^{j−1} are strictly decreasing on (0, 1). Together with δ_{i,j} > 0, we conclude that y* is unique.

Suppose a principal derives utility from the individual agent performance X_τ according to a utility function φ; then the expected utility in equilibrium is

E[φ(X_τ)] = ∫_0^∞ φ(x) dF(x).

We recall the scale function h defined in (2.1), a smooth function with h′ > 0 that is concave for µ ≥ 0 and convex for µ ≤ 0.

Theorem 3.2.
Let R, R̃ be distinct reward vectors with R̃ ≤_L R and F, F̃ the corresponding equilibrium distributions. Let φ : R_+ → R be an increasing, absolutely continuous function.

(i) If φ′/h′ is increasing on (0, x̄_R), then ∫_0^∞ φ(x) dF̃(x) ≤ ∫_0^∞ φ(x) dF(x).
(ii) If φ′/h′ is decreasing on (0, x̄_R), then ∫_0^∞ φ(x) dF̃(x) ≥ ∫_0^∞ φ(x) dF(x).

The inequalities are strict unless φ = ah + b for some constants a, b.

Proof. (i) Integration by parts yields

∫_0^∞ φ(x) d(F̃ − F)(x) = − ∫_0^{x̄_R} (F̃ − F)(x) φ′(x) dx.

By Theorem 3.1, F̃ is strictly single crossing wrt. F with some crossing point x* ∈ (0, x̄_R). As φ′/h′ is increasing and h′ > 0,

∫_0^{x̄_R} (F̃ − F)(x) h′(x) [φ′(x)/h′(x)] dx ≥ [φ′(x*)/h′(x*)] ∫_0^{x̄_R} (F̃ − F)(x) h′(x) dx.

Another integration by parts gives

∫_0^{x̄_R} (F̃ − F)(x) h′(x) dx = (F̃ − F)(x) h(x) |_{x=0}^{x̄_R} − ∫_0^{x̄_R} h(x) d(F̃ − F)(x) = 0,

where the last equality holds by Lemma 2.2. Combining the above displays, we have ∫_0^∞ φ(x) d(F̃ − F)(x) ≤ 0, and the inequality is strict unless φ′/h′ ≡ φ′(x*)/h′(x*) a.e. The proof of (ii) is analogous.

Specializing to risk-averse and risk-seeking utility functions, we obtain the following.

Corollary 3.3.
Let R, R̃, F, F̃, φ be as in Theorem 3.2.

(i) If φ is convex and µ ≥ 0, then ∫_0^∞ φ(x) dF̃(x) ≤ ∫_0^∞ φ(x) dF(x).
(ii) If φ is concave and µ ≤ 0, then ∫_0^∞ φ(x) dF̃(x) ≥ ∫_0^∞ φ(x) dF(x).

If µ ≠ 0 and φ is not constant, the asserted inequality is strict.

Proof. This follows from the concavity/convexity of h and Theorem 3.2.

Intuitively, the reward allocation induces a "risk preference" in agents. This comparison can be motivated via Remark 2.4(b): Starting from a reward R, consider a concave increasing function U and R̃ := U(R). By an affine normalization of U we may assume that R̃ is again a normalized reward. It is easy to see that R̃ ≤_L R; cf. [11, Proposition B.2, p. 188]. That is, risk-neutral players (as we have assumed) with reward allocation R̃ are equivalent to risk-averse players with allocation R. Conversely, the more unequal the reward, the more risk-seeking agents become, staying longer in the game to gamble for a high performance (see also Corollary 3.6 below).

Corollary 3.3 shows that the principal should align agents' risk preferences with her own, provided that the market condition µ is not too strong a counterforce. A negative drift reinforces a risk-averse principal's preference for agents to stop early, to reduce both variance and expected losses due to the drift, whereas a positive drift reinforces the preference to gamble and profit from the drift. If the principal's preferences and the market condition are opposed, the trade-off results in an ambiguous comparison, as shown by the following example.

Example 3.4 (Risk-averse principal in a bull market). Let µ > 0 and φ(x) = −(1/γ) e^{−γx} where γ > 0. Then

φ′(x)/h′(x) = (σ²/(2µ)) (1 − exp(−2µx_0/σ²)) exp( (2µ/σ² − γ) x );

thus φ′/h′ is strictly increasing if 2µ/σ² > γ, strictly decreasing if 2µ/σ² < γ, and constant if 2µ/σ² = γ.
As a result, reward inequality is preferred for small values of the risk aversion γ, whereas equality is preferred for large values.

Clearly, Corollary 3.3 can be used to analyze the optimal reward scheme for the principal. We only state the result for linear utility.

Corollary 3.5.
The expected performance E[X_τ] is strictly increasing wrt. the Lorenz order of the reward scheme when µ > 0, and strictly decreasing when µ < 0. In particular, E[X_τ] is maximized by the winner-takes-all scheme when µ > 0 and by the uniform reward when µ < 0. For µ = 0, the expected performance is independent of the reward.

Proof. The result follows from Corollary 3.3 with φ(x) = x after noting that uniform and winner-takes-all are, respectively, the unique minimum and maximum elements wrt. Lorenz order among all normalized reward schemes.

See Figure 2 for numerical examples illustrating Corollary 3.5. The figure also shows that, similarly as in [16], the largest losses occur for an intermediate value of negative drift µ. While the corresponding µ varies only slightly with the reward scheme, the losses for winner-takes-all are substantially larger than for the schemes with lower inequality.

As alluded to above, we can show that higher reward inequality implies that players gamble longer, in line with the interpretation given below Corollary 3.3. As players only care about their relative ranking and not about the absolute performance, it is natural that the sign of the drift does not appear in this result.

Corollary 3.6.
The expected duration E[τ] of play is monotone increasing wrt. the Lorenz order of the reward scheme. In particular, it is maximized by the winner-takes-all and minimized by the uniform scheme.
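Corollaries 3.5 and 3.6 can be illustrated numerically through the quantile representation E[X_τ] = ∫_0^1 q(y) dy with q = u^{-1} ∘ g, together with E[X_τ] = x_0 + µE[τ] from the optional sampling argument in the proof below. The following sketch assumes the normalization of Section 3 (total reward 1, R_n = 0); function names and parameter defaults are ours.

```python
import math

def expected_performance(R, x0=100.0, sigma=1.0, mu=-0.05, m=20000):
    # Midpoint-rule evaluation of E[X_tau] = \int_0^1 q(y) dy, where
    # q(y) = u^{-1}(g(y)) is the equilibrium quantile function.
    n = len(R)

    def g(y):
        return sum(R[k] * math.comb(n - 1, k) * y**(n - 1 - k) * (1 - y)**k
                   for k in range(n))

    if mu == 0:
        return x0  # independent of the scheme (Corollary 3.5)
    A = -2 * mu / sigma**2
    B = math.exp(A * x0) - 1
    total = 0.0
    for i in range(m):
        y = (i + 0.5) / m
        total += math.log(n * B * g(y) + 1) / A  # q(y)
    return total / m

def expected_duration(R, x0=100.0, sigma=1.0, mu=-0.05):
    # E[tau] = (E[X_tau] - x0) / mu for mu != 0, by optional sampling.
    return (expected_performance(R, x0, sigma, mu) - x0) / mu
```

For negative drift the winner-takes-all scheme yields a strictly lower expected performance and a strictly longer expected duration than the uniform scheme, in line with the two corollaries.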
Figure 2: Average performance E[X_τ] as a function of the drift µ for three reward schemes R″ ≤_L R′ ≤_L R, where R = (1, 0, 0), R′ = (2/3, 1/3, 0), R″ = (1/2, 1/2, 0). Here x_0 = 100 and σ = 1.

Proof.
As the equilibrium distribution does not have an atom at 0, an optional sampling argument yields E[X_τ] = x_0 + µE[τ]. Now, we again use Corollary 3.3 with φ(x) = x.

Remark 3.7. If µ ≤ 0, then F̃ dominates F in second stochastic order; i.e., ∫_0^y (F̃(x) − F(x)) dx ≤ 0 for all y ≥ 0. Indeed, this order is alternately characterized through integrals of increasing concave functions, so that the claim is a reformulation of Corollary 3.3(ii). The interpretation is as above: a more equitable reward makes players prefer less variance and stop earlier, hence suffer less from the negative drift and achieve a higher performance in equilibrium.

For µ > 0, Example 3.4 shows that F̃ and F cannot be ordered in this sense, as that would imply that the principal's preference is the same for all positive risk aversion parameters.

For µ = 0, the game is equivalent to the capacity-constrained game of [4] and the second stochastic dominance is shown in [4, Proposition 5]. For µ > 0, the order is not preserved by the transformation mentioned in the proof of Proposition 2.3, as evidenced by the aforementioned example.

Next, we study the problem of a principal aiming to maximize the expected equilibrium performance of the first-ranked player,

E[ max_{i=1,…,n} X^i_{τ_i} ] = n ∫_0^{x̄} x F(x)^{n−1} dF(x).

In contrast to the preceding subsection, this constitutes a nonlinear functional of the equilibrium distribution, and we obtain a result that is independent of the drift (even though the proofs differ depending on the sign). The first rank naturally incorporates an upward bias relative to the average performance, and the difference increases with the volatility.
For positive drift, this strongly suggests that the principal will profit from gambling and thus should encourage a long duration of the game. More surprisingly, the profit from volatility turns out to be more important than any losses that may occur due to a negative drift, so that reward inequality is preferred in any market condition.
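This claim can be checked numerically: substituting the quantile function q = u^{-1} ∘ g into the display above gives E[max_i X^i_{τ_i}] = n ∫_0^1 q(y) y^{n−1} dy, which a simple quadrature can evaluate for any scheme. A sketch under the normalization of Section 3 (total reward 1, R_n = 0); the function name and defaults are ours.

```python
import math

def first_rank_performance(R, x0=100.0, sigma=1.0, mu=-0.05, m=20000):
    # Midpoint-rule evaluation of E[max_i X^i] = n \int_0^1 q(y) y^{n-1} dy,
    # with q(y) = u^{-1}(g(y)) the equilibrium quantile function.
    n = len(R)

    def g(y):
        return sum(R[k] * math.comb(n - 1, k) * y**(n - 1 - k) * (1 - y)**k
                   for k in range(n))

    if mu != 0:
        A = -2 * mu / sigma**2
        B = math.exp(A * x0) - 1
    total = 0.0
    for i in range(m):
        y = (i + 0.5) / m
        q = n * x0 * g(y) if mu == 0 else math.log(n * B * g(y) + 1) / A
        total += n * q * y**(n - 1)
    return total / m
```

Even at the strongly negative drift µ = −0.05 the ordering by Lorenz inequality persists: more unequal schemes give a strictly better first-rank performance.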
Theorem 3.8.
The expected performance E[max_i X^i_{τ_i}] of the first-ranked player is strictly increasing wrt. the Lorenz order of the reward scheme. In particular, the winner-takes-all scheme is the unique maximizer.

The following lemma is required for the proof. For later use, we state it for the k-th rank rather than just the first rank.

Lemma 3.9.
Let R be a reward scheme and F the associated equilibrium distribution. Let (Y_i)_{1≤i≤n} be i.i.d. with distribution F and denote by Y^{(k)} the k-th reverse order statistic (the k-th largest value), where 1 ≤ k ≤ n − 1.

(i) If µ = 0, then

E[Y^{(k)}] = ( n x_0 n! / (2n−1)! ) C(n−1, k−1) Σ_{l=1}^n R_l φ(k, l),  where  (3.2)
φ(k, l) := (2n−k−l)! (k+l−2)! / [ (n−l)! (l−1)! ].  (3.3)

(ii) If µ ≠ 0, then setting A = −2µ/σ² and B = exp(Ax_0) − 1,

E[Y^{(k)}] = n C(n−1, k−1) A^{-1} ∫_0^1 log[ nBg(y) + 1 ] y^{n−k} (1−y)^{k−1} dy.

In particular, E[Y^{(k)}] is strictly concave with respect to R for µ < 0, strictly convex for µ > 0, and linear for µ = 0.

Proof. Recall that F is strictly increasing on [0, x̄], hence admits an inverse q := F^{-1}. Clearly

E[Y^{(k)}] = n C(n−1, k−1) ∫_0^{x̄} x F(x)^{n−k} (1 − F(x))^{k−1} dF(x)
  = n C(n−1, k−1) ∫_0^1 q(y) y^{n−k} (1−y)^{k−1} dy.  (3.4)

In view of u(x) = g(F(x)), we have q(y) = u^{-1}(g(y)) for 0 ≤ y ≤ 1.

(i) Let µ = 0. As R is normalized with R̄ = 1/n, we obtain u(x) = x/(n x_0) and u^{-1}(y) = n x_0 y. As a result, q(y) = n x_0 g(y), and then by (3.4),

E[Y^{(k)}] = n² x_0 C(n−1, k−1) ∫_0^1 g(y) y^{n−k} (1−y)^{k−1} dy
  = n² x_0 C(n−1, k−1) Σ_{l=1}^n R_l C(n−1, l−1) ∫_0^1 y^{2n−k−l} (1−y)^{k+l−2} dy.

To compute this expression, we note that

∫_0^1 y^{2n−k−l} (1−y)^{k+l−2} dy = Beta(2n−k−l+1, k+l−1) = (2n−k−l)! (k+l−2)! / (2n−1)!,

where we have used that the Beta function Beta(x, y) = ∫_0^1 t^{x−1} (1−t)^{y−1} dt satisfies the relation Beta(x, y) = Γ(x)Γ(y)/Γ(x+y) with the Gamma function. As a result,

E[Y^{(k)}] = n² x_0 C(n−1, k−1) Σ_{l=1}^n R_l C(n−1, l−1) (2n−k−l)! (k+l−2)! / (2n−1)!
  = ( n x_0 n! / (2n−1)! ) C(n−1, k−1) Σ_{l=1}^n R_l (2n−k−l)! (k+l−2)! / [ (n−l)! (l−1)! ].

(ii) Let µ ≠ 0. Note h(x) = (exp(Ax) − 1)/B, hence h^{-1}(z) = A^{-1} log(Bz + 1).
As u(x) = (1/n) h(x) for x ≤ x̄, we have u^{-1}(z) = h^{-1}(nz); i.e., q(y) = u^{-1}(g(y)) = A^{-1} log(nBg(y) + 1). This expression is well defined due to (3.1). In view of (3.4), the claim follows.
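The driftless formula (3.2) reduces rank-performance questions to finite arithmetic. The sketch below (helper names are ours) implements (3.2)-(3.3) and scans the cut-off schemes of Definition 3.10 below; it reproduces the example from the Introduction where, for n = 10, the median rank k = 5 is best served by a cut-off at rank 7.

```python
import math

def EYk_mu0(R, k, x0=100.0):
    # Closed form (3.2)-(3.3) for E[Y^{(k)}] when mu = 0 and the total
    # reward is normalized to 1.
    n = len(R)

    def phi(l):
        # (3.3); both factorial ratios are exact integers.
        return (math.factorial(2 * n - k - l) * math.factorial(k + l - 2)
                // (math.factorial(n - l) * math.factorial(l - 1)))

    pref = n * x0 * math.factorial(n) / math.factorial(2 * n - 1)
    pref *= math.comb(n - 1, k - 1)
    return pref * sum(R[l - 1] * phi(l) for l in range(1, n + 1))

def best_cutoff(n, k, x0=100.0):
    # The cut-off at j pays 1/j to the first j ranks and 0 to the rest;
    # we maximize the k-th rank performance over j = 1, ..., n-1.
    def value(j):
        R = [1.0 / j] * j + [0.0] * (n - j)
        return EYk_mu0(R, k, x0)
    return max(range(1, n), key=value)
```

For instance, best_cutoff(10, 1) returns 1 (winner-takes-all, as in Theorem 3.8) while best_cutoff(10, 5) returns 7, matching the claim in the Introduction.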
Proof of Theorem 3.8.
Let $\tilde R\le_L R$ be two reward schemes and $\tilde F, F$ the corresponding equilibria. By Theorem 3.1, $\tilde F$ is strictly single crossing with respect to $F$.

(i) Case $\mu\ge 0$. We also have that $F^{-1}$ is strictly single crossing with respect to $\tilde F^{-1}$ on $(0,1)$. Let $y_0$ be the crossing point; then
$$ \int_0^1 \big(F^{-1}(y)-\tilde F^{-1}(y)\big)\,y^{n-1}\,dy > y_0^{n-1}\int_0^1 \big(F^{-1}(y)-\tilde F^{-1}(y)\big)\,dy \ge 0, $$
where the last inequality is due to Corollary 3.5 and $\mu\ge 0$; in view of (3.4), it follows that $E[Y^{(1)}]$ is strictly larger under $F$.

(ii) Case $\mu<0$. For $\lambda\in[0,1]$, we define (cf. Lemma 2.2)
$$ \varphi(\lambda) := E\big[Y^{(1)}_\lambda\big] = nA^{-1}\int_0^1 \log\big[nB\big(\lambda\tilde g(y)+(1-\lambda)g(y)\big)+1\big]\,y^{n-1}\,dy $$
and show that $\varphi$ attains its unique maximum at $\lambda=0$. As $\varphi$ is strictly concave (Lemma 3.9), it suffices to show that the right derivative satisfies $\varphi'(0+)\le 0$. Indeed,
$$ \varphi'(0+) = nA^{-1}\int_0^1 \frac{nB(\tilde g-g)(y)}{nBg(y)+1}\,y^{n-1}\,dy. $$
As $\mu<0$, we have $B>0$ and one checks that the factor
$$ \frac{y^{n-1}}{nBg(y)+1} = \left[\sum_{\ell=1}^n (nBR_\ell+1)\binom{n-1}{\ell-1}\left(\frac{1-y}{y}\right)^{\ell-1}\right]^{-1} $$
is increasing in $y$. In view of $\tilde R\le_L R$, $g$ is strictly single-crossing with respect to $\tilde g$ on $(0,1)$. Finally, $\int_0^1\tilde g(y)\,dy=\bar R=\int_0^1 g(y)\,dy$. Together, these three facts imply that $\varphi'(0+)\le 0$.

k-th Rank

We consider a principal maximizing the expected performance of the $k$-th ranked player, where $1\le k\le n-1$. This problem is more involved than the first rank: if $k$ is close to 1 (relative to $n/2$), we may expect to see similar effects as for the first rank, but clearly the profits from volatility are weaker. A first guess may be that the principal should maximize the reward at the $k$-th rank in order to maximize $k$-th rank performance. While this is not always true, the following reward schemes nevertheless play a special role.

Definition 3.10.
For $1\le j\le n-1$, the reward scheme $R^j=(R^j_1,\dots,R^j_n)$ with
$$ R^j_i = 1/j \text{ for } i\le j \qquad\text{and}\qquad R^j_i = 0 \text{ for } i>j $$
is called the cut-off at $j$. In words, $R^j$ distributes the total reward uniformly over the first $j$ ranks. This scheme maximizes the reward at the $j$-th rank. The winner-takes-all scheme $R^1$ and the uniform scheme $R^{n-1}$ are special cases.

We first focus on the case of zero drift, which allows for the most detailed analysis. When $k=1$, we have seen in Theorem 3.8 that the winner-takes-all reward is optimal. The next result shows that the guess also holds for the second rank: it is optimal to reward the first two ranks equally, and give zero reward to the subsequent ranks. However, this does not extend to higher target ranks $k\ge 3$. While a cut-off reward is still optimal, it can be beneficial to extend the cut-off point beyond $k$. The analytic description uses the function $\varphi$ of (3.3).

Proposition 3.11.
Let $\mu=0$. Then the unique normalized reward scheme maximizing the expected performance $E[Y^{(k)}]$ of the $k$-th rank is the cut-off at $k^*$, where
$$ k^* = \max\left\{ j\ge k :\; \varphi(k,j) \ge \frac{1}{j-1}\sum_{l=1}^{j-1}\varphi(k,l) \right\}. \qquad (3.5) $$
In particular, the winner-takes-all scheme is optimal for $k=1$ and the cut-off at 2 is optimal for $k=2$. For $k\ge 3$, it may happen that $k^*>k$. For instance, for $n=5$ and $k=3$, the cut-off at $k^*=4$ is optimal; for $n=10$ and $k=5$, the cut-off at $k^*=7$ is optimal (cf. Figure 3).

[Figure 3: Illustration of Proposition 3.11. The left panel shows the $k$-th rank performance $E[Y^{(k)}]/x_0$ against the cut-off rank when $n=10$ and $k=5$; the best performance is attained at $k^*=7$. The right panel shows the optimal cut-off ratio $k^*(n)/n$ against $\log n$ when the target rank $k$ varies with $n$, chosen such that $k/n=\alpha$ is constant. The behavior for finite $n$ is rather complex but suggests a simplification in the limit $n\to\infty$, which has motivated the study of the limiting mean field game in a companion paper [13].]

Proof.
We have
$$ \frac{\varphi(k,l+1)}{\varphi(k,l)} = \frac{(k+l-1)(n-l)}{(2n-k-l)\,l}. $$
Noting that $(k+l-1)(n-l)-(2n-k-l)l = n(k-l)+l-n$ is $<0$ if $l\ge k$ and $>0$ if $l<k$, we see that $\varphi(k,l+1)/\varphi(k,l)<1$ if $l\ge k$ and $\varphi(k,l+1)/\varphi(k,l)>1$ if $l<k$. That is, we have
$$ \varphi(k,1)<\varphi(k,2)<\dots<\varphi(k,k-1)<\varphi(k,k)>\varphi(k,k+1)>\dots>\varphi(k,n), $$
and in particular $\varphi(k,k)$ is a maximum. In view of (3.2), maximizing $E[Y^{(k)}]$ amounts to maximizing the linear functional $R\mapsto\sum_l R_l\,\varphi(k,l)$, whose value at the cut-off at $j$ is $\frac1j\sum_{l=1}^j\varphi(k,l)$; by (3.5), this running average is maximal precisely at $j=k^*$. We conclude that an optimal reward scheme must pay equal rewards to ranks $j=1,\dots,k^*$. For $k=1$ it follows directly that $k^*=1$. For $k=2$ we note that $\varphi(2,1)\ge\varphi(2,3)$ holds for all $n$, which of course implies that $\frac12[\varphi(2,1)+\varphi(2,2)]>\varphi(2,3)$, and hence $k^*=2$. The further examples are verified by direct calculation.

We now turn to the case of non-zero drift, where our result is less detailed. The number $k^*$ is defined in (3.5).

Proposition 3.12. If $\mu>0$, the expected performance $E[Y^{(k)}]$ is maximized by a cut-off at $j$ for some $j\le k^*$. If $\mu<0$, the optimal reward scheme pays equal amounts to ranks 1 through $k^*$.

Remark 3.13. (a) For $\mu<0$, the optimizer need not be a cut-off scheme. That is, in addition to the equal amounts mentioned in the proposition, smaller amounts may be paid to lower ranks. For instance, let $\mu<0$, $\sigma=1$, $x_0=1$ and $(n,k)=(5,2)$. Then $k^*=2$ and numerical experiments show that the optimal reward scheme pays equal amounts to the first two ranks and a smaller positive amount to the third, which is not a cut-off scheme.

(b) For $\mu>0$, we conjecture that the optimal $j$ satisfies $k\le j\le k^*$. Both inequalities may be strict. As an example, let $\mu>0$ be small, $\sigma=1$, $x_0=1$ and $(n,k)=(10,5)$. In this case, $\mu<\bar\mu$ is satisfied for all rewards. We have $k^*=7$ and numerical experiments show that the cut-off at $j=6$ is optimal.

Proof of Proposition 3.12.
The cut-off schemes $(R^i)_{i=1,\dots,n-1}$ are the extreme points of the compact, convex set of normalized reward schemes. Any normalized reward $R$ can be uniquely expressed as a convex combination $R=\sum_{i=1}^{n-1}\lambda_i R^i$ where $\lambda=(\lambda_1,\dots,\lambda_{n-1})$ is an element of the unit simplex $\Delta\subset\mathbb{R}^{n-1}$. Introducing the function $g_i$ associated with $R^i$ as in Lemma 2.2,
$$ g_i(y) := \sum_{l=1}^n R^i_l\binom{n-1}{l-1}y^{n-l}(1-y)^{l-1} = \sum_{l=1}^i \frac1i\binom{n-1}{l-1}y^{n-l}(1-y)^{l-1}, $$
we can rewrite the optimization over normalized reward schemes as
$$ \sup_{\lambda\in\Delta}\; n\binom{n-1}{k-1}A^{-1}\int_0^1 \log\Big[nB\sum_{i=1}^{n-1}\lambda_i g_i(y)+1\Big]\,y^{n-k}(1-y)^{k-1}\,dy. $$
Dropping a positive factor for brevity, we thus seek to maximize
$$ J(\lambda) := A^{-1}\int_0^1 \log\Big[nB\sum_{i=1}^{n-1}\lambda_i g_i(y)+1\Big]\,y^{n-k}(1-y)^{k-1}\,dy \qquad (3.6) $$
over $\Delta$. This is a strictly convex, continuous function for $\mu>0$, showing that any optimizer must be an extreme point. Whereas for $\mu<0$, $J$ is strictly concave, showing that the optimizer is unique (and explaining why the solution may well be an interior point rather than a cut-off scheme).

(i) Let $\mu>0$, so that $A,B<0$. Fix $k^*<j<n$; then $g_{k^*}$ is strictly single-crossing with respect to $g_j$ with some crossing point $y_0\in(0,1)$. Writing $e_i$ for the $i$-th basis vector in $\mathbb{R}^{n-1}$, and using also that $x\le e^{x-1}$, with equality only for $x=1$, the crossing property implies
$$ J(e_{k^*})-J(e_j) = A^{-1}\int_0^1 \log\left(\frac{nBg_{k^*}(y)+1}{nBg_j(y)+1}\right)y^{n-k}(1-y)^{k-1}\,dy > A^{-1}\int_0^1 \left(\frac{nBg_{k^*}(y)+1}{nBg_j(y)+1}-1\right)y^{n-k}(1-y)^{k-1}\,dy $$
$$ = nBA^{-1}\int_0^1 \frac{g_{k^*}(y)-g_j(y)}{nBg_j(y)+1}\,y^{n-k}(1-y)^{k-1}\,dy \;\ge\; \frac{nBA^{-1}}{nBg_j(y_0)+1}\int_0^1 \big(g_{k^*}(y)-g_j(y)\big)\,y^{n-k}(1-y)^{k-1}\,dy.
$$
Moreover,
$$ \int_0^1 \big(g_{k^*}(y)-g_j(y)\big)\,y^{n-k}(1-y)^{k-1}\,dy = \sum_{l=1}^n \left(\frac{1_{l\le k^*}}{k^*}-\frac{1_{l\le j}}{j}\right)\binom{n-1}{l-1}\int_0^1 y^{2n-k-l}(1-y)^{k+l-2}\,dy $$
$$ = \sum_{l=1}^n \left(\frac{1_{l\le k^*}}{k^*}-\frac{1_{l\le j}}{j}\right)\binom{n-1}{l-1}\frac{(2n-k-l)!\,(k+l-2)!}{(2n-1)!} = \frac{(n-1)!}{(2n-1)!}\sum_{l=1}^n \left(\frac{1_{l\le k^*}}{k^*}-\frac{1_{l\le j}}{j}\right)\varphi(k,l) $$
$$ = \frac{(n-1)!}{(2n-1)!}\left(\frac{1}{k^*}\sum_{l=1}^{k^*}\varphi(k,l)-\frac1j\sum_{l=1}^{j}\varphi(k,l)\right). $$
The last expression is nonnegative by the definition of $k^*$. Putting everything together, we have shown that $e_j$ is strictly suboptimal and the claim follows.

(ii) Let $\mu<0$, so that $A,B>0$. Let $\lambda\in\Delta$ be such that $\lambda_i>0$ for some $i<k^*$ and define $\lambda' := \lambda+\lambda_i(e_{k^*}-e_i)$. To show that $\lambda'$ is strictly better than $\lambda$, it suffices by concavity to show $\frac{\partial}{\partial\theta}J(\lambda_\theta)\big|_{\theta=0}<0$, where $\lambda_\theta := \theta\lambda+(1-\theta)\lambda'$. Indeed, we have
$$ \frac{\partial}{\partial\theta}J(\lambda_\theta)\Big|_{\theta=0} = nBA^{-1}\int_0^1 \frac{\sum_{m=1}^{n-1}(\lambda_m-\lambda'_m)g_m(y)}{nB\sum_{m=1}^{n-1}\lambda'_m g_m(y)+1}\,y^{n-k}(1-y)^{k-1}\,dy. $$
Using the single-crossing property of $g_i$ with respect to $g_{k^*}$, the strict monotonicity of $y\mapsto nB\sum_{m=1}^{n-1}\lambda'_m g_m(y)+1$, and the definition of $k^*$, we deduce that
$$ \frac{\partial}{\partial\theta}J(\lambda_\theta)\Big|_{\theta=0} = nBA^{-1}\int_0^1 \frac{\lambda_i\big(g_i(y)-g_{k^*}(y)\big)}{nB\sum_{m=1}^{n-1}\lambda'_m g_m(y)+1}\,y^{n-k}(1-y)^{k-1}\,dy < C\int_0^1 \lambda_i\big(g_i(y)-g_{k^*}(y)\big)\,y^{n-k}(1-y)^{k-1}\,dy $$
$$ = C\lambda_i\sum_{l=1}^n\left(\frac{1_{l\le i}}{i}-\frac{1_{l\le k^*}}{k^*}\right)\binom{n-1}{l-1}\int_0^1 y^{2n-k-l}(1-y)^{k+l-2}\,dy = C\lambda_i\sum_{l=1}^n\left(\frac{1_{l\le i}}{i}-\frac{1_{l\le k^*}}{k^*}\right)\binom{n-1}{l-1}\frac{(2n-k-l)!\,(k+l-2)!}{(2n-1)!} $$
$$ = C\left(\frac1i\sum_{l=1}^{i}\varphi(k,l)-\frac{1}{k^*}\sum_{l=1}^{k^*}\varphi(k,l)\right) \le 0, $$
where $C$ is a positive constant that may vary from line to line. This shows that $J(\lambda')>J(\lambda)$. As a consequence, the optimal reward scheme must be a convex combination of $R^{k^*},\dots,R^{n-1}$.

References

[1] B. C. Arnold and J. M. Sarabia.
Majorization and the Lorenz order with applications in applied mathematics and economics. Statistics for Social and Behavioral Sciences. Springer, Cham, 2018.

[2] E. Bayraktar, J. Cvitanić, and Y. Zhang. Large tournament games. Ann. Appl. Probab., 29(6):3695–3744, 2019.

[3] E. Bayraktar and Y. Zhang. Terminal ranking games. To appear in Math. Oper. Res., 2019.

[4] D. Fang and T. Noe. Skewing the odds: Taking risks for rank-based rewards. Preprint SSRN:2747496, 2016.

[5] D. Fang, T. Noe, and P. Strack. Turning up the heat: The discouraging effect of competition in contests. J. Political Econ., 128(5):1940–1975, 2020.

[6] H. Feng and D. Hobson. Gambling in contests modelled with diffusions. Decis. Econ. Finance, 38(1):21–37, 2015.

[7] H. Feng and D. Hobson. Gambling in contests with random initial law. Ann. Appl. Probab., 26(1):186–215, 2016.

[8] H. Feng and D. Hobson. Gambling in contests with regret. Math. Finance, 26(3):674–695, 2016.

[9] W. J. Hall. Embedding submartingales in Wiener processes with drift, with applications to sequential analysis. J. Appl. Probability, 6:612–632, 1969.

[10] A. Kempf, S. Ruenzi, and T. Thiele. Employment risk, compensation incentives, and managerial risk taking: Evidence from the mutual fund industry. J. Financ. Econ., 92(1):92–108, 2009.

[11] A. W. Marshall, I. Olkin, and B. C. Arnold. Inequalities: theory of majorization and its applications. Springer Series in Statistics. Springer, New York, second edition, 2011.

[12] M. Nutz and Y. Zhang. A mean field competition. Math. Oper. Res., 44(4):1245–1263, 2019.

[13] M. Nutz and Y. Zhang. Gambling in a mean field contest. Preprint, 2021.

[14] J. Obłój. The Skorokhod embedding problem and its offspring. Probab. Surv., 1:321–390, 2004.

[15] C. Seel. Gambling in contests with heterogeneous loss constraints. Economics Letters, 136:154–157, 2015.

[16] C. Seel and P. Strack. Gambling in contests. J. Econ. Theory, 148(5):2033–2048, 2013.

[17] C. Seel and P. Strack. Continuous time contests with private information. Math. Oper. Res., 41(3):1093–1107, 2016.

[18] M. Vojnović. Contest Theory: Incentive Mechanisms and Ranking Methods. Cambridge University Press, 2016.