Is there a Golden Parachute in Sannikov's principal-agent problem?

Abstract

This paper provides a complete review of the continuous-time optimal contracting problem introduced by Sannikov, in the extended context allowing for possibly different discount rates of both parties. The agent's problem is to seek for optimal effort, given the compensation scheme proposed by the principal over a random horizon. Then, given the optimal agent's response, the principal determines the best compensation scheme in terms of running payment, retirement, and lump-sum payment at retirement. A Golden Parachute is a situation where the agent ceases any effort at some positive stopping time, and receives a payment afterwards, possibly under the form of a lump-sum payment, or of a continuous stream of payments. We show that a Golden Parachute only exists in certain specific circumstances. This is in contrast with the results claimed by Sannikov, where the only requirement is a positive agent's marginal cost of effort at zero. Namely, we show that there is no Golden Parachute if this parameter is too large. Similarly, in the context of a concave marginal utility, there is no Golden Parachute if the agent's utility function has a too negative curvature at zero. In the general case, we provide a rigorous analysis of this problem, and we prove that an agent with positive reservation utility is either never retired by the principal, or retired above some given threshold (as in Sannikov's solution). In particular, different discount factors induce naturally a face-lifted utility function, which allows to reduce the whole analysis to a setting similar to the equal-discount rates one. Finally, we also confirm that an agent with small reservation utility does have an informational rent, meaning that the principal optimally offers him a contract with strictly higher utility value.


Is there a Golden Parachute in Sannikov's principal–agent problem? ∗

Dylan Possamaï † Nizar Touzi ‡

July 14, 2020


Key words: continuous–time principal–agent, optimal control and stopping, face–lifting.

∗ We are grateful to Yuliy Sannikov for his insightful comments on the first version of this paper. This work benefited from support of the ANR project PACMAN ANR–16–CE05–0027.

† Columbia University, IEOR department, USA.

‡ CMAP, École Polytechnique, 91128 Palaiseau Cedex, France. This author is also grateful for the financial support from the Chaires FiME-FDD and Financial Risks of the Louis Bachelier Institute.

Principal–agent problems naturally stem from questions of optimal contracting between two parties, a principal ('she') and an agent ('he'), when the agent's effort cannot be observed or contracted upon. Mathematically, they are formulated as Stackelberg non–zero sum games, and can also be identified with bi–level optimisation problems in the operations research literature. The number of articles related to this topic is staggering, mainly due to the wide spectrum of concrete problems where this theory is able to provide relevant results, for instance for moral hazard problems in microeconomics, with applications to corporate governance, portfolio management, and many other areas of economics and finance.

The first, and seminal, paper on principal–agent problems in continuous–time is by Holmström and Milgrom [34], who show that the optimal contract is linear in the output process, in a finite horizon setting with CARA utility functions for both parties, and when the agent's effort impacts solely the drift of the output process. This paper is the first to highlight that optimal contracting problems tend to be easier to address in continuous–time, an observation which has been confirmed by the large continuous–time literature in this area. Holmström and Milgrom's work was extended by Schättler and Sung [56], Sung [66; 67], Müller [43; 44], and Hellwig and Schmidt [32; 31].
While the aforementioned papers use continuous–time extensions of the celebrated first–order approach from the contract theory literature in static cases, see for instance Rogerson [52], the papers by Williams [70; 71; 72] and Cvitanić, Wan, and Zhang [15; 16; 17] use the stochastic maximum principle and forward–backward stochastic differential equations to characterise the optimal compensation for more general utility functions, see also the excellent monograph by Cvitanić and Zhang [14].

The seminal work of Sannikov [54], see also Sannikov [55], represents a genuine breakthrough in this vast literature from various perspectives. First, from the methodological viewpoint, Sannikov introduced the idea to focus on the dynamic continuation value of the agent as a state variable for the principal's problem. Although this idea was already acknowledged throughout the discrete–time literature on this problem, an illuminating example being Spear and Srivastava [63], its systematic implementation in continuous–time offers an elegant solution approach by means of a representation result of the dynamic value function. Second, the infinite horizon setting considered by Sannikov revealed remarkable economic implications. Indeed, Sannikov's main conclusions are that the principal optimally retires the agent, offering him a Golden Parachute, that is to say a lifetime constant continuous stream of consumption, when his continuation utility reaches a sufficiently high level, and that an agent with small reservation utility possesses an informational rent, in the sense that he is offered a contract with strictly higher value.

The main objective of our paper is twofold. First, we revisit Sannikov's seminal work, but putting a stronger weight on technical rigour, which is unfortunately lacking in some key parts of [54]. We would like to emphasise that this should not be seen in any case as a reason to underestimate the importance of this paper, given the groundbreaking novelties recalled above.
In contrast, our first aim is to try and contribute even more to the success of [54] by making it more accessible to a wider community of mathematicians and economists, whose overall understanding of the model may be hindered by the technical gaps in [54]. Notice that we are not the first to try and obtain rigorously the results claimed in [54]. For instance, Strulovici and Szydlowski [65, Section 4.3] offer a more rigorous take on the existence of optimal contracts in the model. However, the authors take for granted the fact that [54] proves that the HJB equation for the principal's problem has a smooth solution, while we will argue that the proof has important gaps. Similarly, the unpublished PhD thesis of Choi [12] aims at putting the problem on rigorous foundations. Nonetheless, existence of optimal contracts is not addressed there, and the results rely on the assumption that it is never optimal to retire the agent temporarily, while our approach actually proves that this is the case. We also would like to refer to the recent work of Décamps and Villeneuve [20], where the authors study a related, but different, contracting problem, and where again the heart of the analysis is technical clarity: this should be an additional illustration that rigorously proving results in this literature is a challenging task.

Our second goal is to prove that our analysis extends beyond the case where the principal and the agent have the same discount rates.
It is an important feature, as most models in the discrete– or continuous–time literature either allow for risk–averse agents who are as patient as the principal, as in Sannikov [54], Fong [26], Myerson [45], and Hajjej, Hillairet, Mnif, and Pontier [28], or for more impatient, but risk–neutral agents, as in DeMarzo and Sannikov [21], Biais, Mariotti, Plantin, and Rochet [4], Biais, Mariotti, Rochet, and Villeneuve [5], Biais, Mariotti, and Rochet [6], He [30], Piskorski and Tchistyi [50], Piskorski and Westerfield [51], DeMarzo, Fishman, He, and Wang [22], Pagès and Possamaï [48], or Williams [72]. Even more surprisingly, our analysis can also accommodate the case where the principal is actually strictly more impatient than the agent. More precisely, when the principal is more impatient, but not too much (the actual bound depends on the level of risk–aversion of the agent), the solution exhibits no fundamental difference compared to the case where the principal is more patient. However, when the discount rate of the principal exceeds a critical threshold, the problem degenerates, optimal contracts cease to exist, and the principal can achieve her first–best value with appropriately defined sequences of incentive–compatible contracts. As far as we know, our paper is the first offering such a comprehensive analysis.

Footnote: Other early continuous–time contract theory models were proposed by Adrian and Westerfield [1], Biais, Mariotti, Plantin, and Rochet [4], Biais, Mariotti, Rochet, and Villeneuve [5], Biais, Mariotti, and Rochet [6], Capponi and Frei [9], DeMarzo and Sannikov [21], DeMarzo, Fishman, He, and Wang [22], Fong [26], He [30], Hoffmann and Pfeil [33], Ju and Wan [35], Keiber [37], Leung [39], Mirrlees and Raimondo [42], Myerson [45], Ou-Yang [46], Pagès [47], Pagès and Possamaï [48], Piskorski and Tchistyi [50], Piskorski and Westerfield [51], Sannikov [53], Schroder, Sinha, and Levental [58], Van Long and Sorger [68], Westerfield [69], Zhang [73], Zhou [74], or Zhu [75].

Footnote: Exceptions are the recent work by Hajjej, Hillairet, and Mnif [29], where the agent is risk–averse and more impatient than the principal, although they do not obtain clear results showing that the hypotheses of their verification result [29, Theorem 4.3] can be verified in practice, and the work of Lin, Ren, Touzi, and Yang [41], where the emphasis is more on obtaining general methods to attack infinite horizon moral hazard problems.

Our main findings are the following. First, in contrast with the overall message from [54], we show that a Golden Parachute only exists in some specific situations. It never happens if the agent's marginal cost of effort at zero is zero, or is sufficiently large. And it never happens if the agent's marginal utility is also concave, and his utility function has sufficiently large negative curvature at zero, with a level depending on the marginal cost of effort at zero. We also confirm rigorously the existence of informational rent for an agent with small reservation utility. Under our set of assumptions, our Theorem 3.6 provides a necessary and sufficient condition for this important economic effect to occur. The condition combines the curvature of the agent's utility function at zero, and his marginal cost of effort at zero. We emphasise that our rigorous presentation involves advanced tools from stochastic control theory and partial differential equations.
In particular, the justification of the solution claimed by Sannikov in [54] requires the use of Perron's existence approach, combined with the theory of viscosity solutions, and it is unclear to us how the proof could be significantly simplified.

Finally, from the methodological and theoretical point of view, we have highlighted a novel phenomenon in (properly renormalised) moral hazard problems with risk–aversion and different discount rates, where the principal's problem has an optimal stopping component, in the sense that she can terminate the contract. Indeed, we proved that the problem could actually be treated as in the case with similar discount rates, provided that in the principal's optimisation, the certainty equivalent of the agent's continuation utility paid upon retirement is not computed using the inverse utility function of the agent, but using instead a 'face–lifted', or 'shadow utility', function, obtained as the solution to a specific deterministic control problem. In words, this deterministic control problem assesses whether, upon termination of the contract, it could actually be profitable for the principal to instead retire the agent by providing him, for a certain amount of time, a deterministic rent while he exerts no effort. Though we present this method in the specific context of the model in [54], it applies to generic moral hazard problems with early retirement possibilities. To the best of our knowledge, such a phenomenon has not been observed before, neither in the contract theory literature, nor in the optimal stopping literature.

The paper is organised as follows. Section 2 provides a rigorous formulation of the continuous–time contracting problem, with a clear description of the set of contracts, and introduces the face–lifted utility $\bar F$. Our main results are given in Section 3.
Thus, Section 3.1 provides some conditions under which no Golden Parachute can exist, which can all be recovered by the more abstract sufficient condition $I(\bar F',\bar F'')>0$, where $\delta$ is the ratio of the discount rates of the agent and the principal, and $I$ is defined in (3.2). Next, Section 3.2 identifies the value function of the principal and describes optimal contracts, while Section 3.3 presents our numerical illustrations, and Section 3.4 discusses the gaps in [54]. Subsequently, for completeness, we provide a review of the first–best version of [54]'s model in Section 4, thus highlighting the very different nature of the first–best optimal contract, for which a Golden Parachute never exists. Then, Section 5 uses the result of Lin, Ren, Touzi, and Yang [41], itself an extension of earlier results by Cvitanić, Possamaï, and Touzi [18; 19], which rigorously justifies Sannikov's [54] remarkable reduction of the principal's Stackelberg game problem into a standard control–and–stopping problem. Such a reduction opens the door for the use of standard tools of stochastic control theory. In particular, we treat the case of a very impatient principal in Section 6, which can be addressed directly by exhibiting a sequence of contracts inducing a degenerate situation where both parties achieve as large a payment as possible. The alternative case of a reasonably impatient principal is analysed by means of the corresponding dynamic programming equation introduced in Section 7, where we also provide a verification result following the standard theory. In Section 9, we provide a rigorous analysis of the dynamic programming equation, and we isolate a set of conditions which guarantee that the solution is of the form claimed in [54]. Finally, Section 10 complements our results and examines the possibility of existence of a Golden Parachute in the context of the finite horizon model of Holmström and Milgrom [34], where both parties are now allowed to be risk–averse.

Footnote: The 'face–lifting' phenomenon corresponds to the so–called boundary layer effect in singular optimal control problems, and appeared naturally in various pricing problems in finance, either with hedging constraints or market frictions, see for instance Broadie, Cvitanić, and Soner [8], Bouchard and Touzi [7], Chassagneux, Élie, and Kharroubi [10], Guasoni, Rásonyi, and Schachermayer [27], Soner and Touzi [59; 60; 61; 62], Cheridito, Soner, and Touzi [11], and Schmock, Shreve, and Wystup [57], or for utility maximisation problems, see Larsen, Soner, and Žitković [38]. However, all these references consider either 'pure' optimal control or stochastic target problems, while in our context, the face–lifting phenomenon occurs because of an optimal stopping problem, and is therefore of a different nature.

This section reports our understanding of the continuous time contracting model in Sannikov [54]. Let $(\Omega,\mathcal F,\mathbb P)$ be a probability space carrying a one–dimensional $\mathbb P$–Brownian motion $W$. For fixed parameters $\sigma>0$ and $X_0\in\mathbb R$, the output process is defined by
$$X_t := X_0+\sigma W_t,\quad t\ge 0.$$
We denote by $\mathbb F$ the $\mathbb P$–augmentation of the natural filtration of $X$ (or equivalently, of $W$), which is well–known to satisfy the usual conditions.
We next introduce distributions $\mathbb P^\alpha$ of the output process under effort $\alpha$, so as to induce the dynamics $\mathrm dX_t=\alpha_t\,\mathrm dt+\sigma\,\mathrm dW^\alpha_t$, for some $\mathbb P^\alpha$–Brownian motion $W^\alpha$. This is naturally accomplished by means of the following argument based on the Girsanov transformation.

Let $\mathcal A$ be the collection of all $\mathbb F$–predictable processes $\alpha$ with values in a compact subset $A$ of $[0,\infty)$, containing $0$. For all $\alpha\in\mathcal A$, we may introduce an equivalent probability measure $\mathbb P^\alpha$ so that the process $W^\alpha := W-\int_0^\cdot \frac{\alpha_s}{\sigma}\,\mathrm ds$ is a $\mathbb P^\alpha$–Brownian motion, and the process $X$ can be written in terms of $W^\alpha$ as
$$X_t = X_0+\int_0^t\alpha_s\,\mathrm ds+\sigma W^\alpha_t,\quad t\ge 0.$$
Any $\alpha\in\mathcal A$ is called an effort process, and is interpreted as an action exerted in order to affect the distribution of the output process from $\mathbb P$ to $\mathbb P^\alpha$.

A contract is a triple $\mathbf C:=(\tau,\pi,\xi)$, where $\tau\in\mathcal T$, the set of all $\mathbb F$–stopping times, $\xi$ is a non–negative $\mathcal F_\tau$–measurable random variable, and $\pi\in\Pi$, the set of $\mathbb F$–predictable non–negative processes. Here, $\tau$ represents a retirement time, $\pi$ is a process of continuous payment rate until retirement, and $\xi$ is a lump–sum payment at retirement, which may be interpreted as a Golden Parachute in the terminology of Sannikov [54], see Definition 2.4 below.

We shall introduce later in Section 2.5 the collection $\mathfrak C$ of admissible contracts, by imposing some integrability requirements. These contracts allow us to formulate the contracting problem which sets the terms of the delegation by the principal (she) of the output process to the agent (he). Namely, the principal seeks to design the optimal contract so as to guarantee that the agent best serves her objectives, while optimising his own interest.
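As an informal illustration of the controlled output dynamics $\mathrm dX_t=\alpha_t\,\mathrm dt+\sigma\,\mathrm dW^\alpha_t$ introduced above, the following minimal sketch simulates $X$ under $\mathbb P^\alpha$ with an Euler scheme. It assumes, for simplicity, that the effort is given in feedback form $\alpha_t=a(t,X_t)$, which is only a special case of the predictable processes in $\mathcal A$. The function names `simulate_output` and `mean_terminal_output` and all parameter values are hypothetical and are not part of the paper.

```python
import math
import random


def simulate_output(x0, sigma, effort, horizon, n_steps, rng):
    """Euler scheme for dX_t = effort(t, X_t) dt + sigma dW_t on [0, horizon].

    `effort` is an illustrative feedback rule (t, x) -> a, assumed to take
    values in the compact effort set A of the model.
    """
    dt = horizon / n_steps
    x = x0
    path = [x0]
    for k in range(n_steps):
        a = effort(k * dt, x)
        # Brownian increment over a step of length dt
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += a * dt + sigma * dw
        path.append(x)
    return path


def mean_terminal_output(x0, sigma, effort, horizon, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of the expected terminal output E[X_T] under effort."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += simulate_output(x0, sigma, effort, horizon, n_steps, rng)[-1]
    return total / n_paths
```

For a constant effort $a$, the estimate should be close to $X_0+aT$, since effort only shifts the drift of the output while the volatility $\sigma$ is unaffected.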
The agent's preferences are defined by

• a utility function $u:[0,\infty)\longrightarrow[0,\infty)$ which is increasing, strictly concave, twice continuously differentiable on $(0,\infty)$, satisfies $u(0)=0$ together with the (one–sided) Inada condition $\lim_{x\to\infty}u'(x)=0$ and the growth condition
$$c_0\big(-1+\pi^{\frac1\gamma}\big)\le u(\pi)\le c_1\big(1+\pi^{\frac1\gamma}\big),\quad \pi\ge 0,\ \text{for some } (c_0,c_1)\in(0,\infty)^2,\ \text{and some } \gamma>1,\qquad (2.1)$$
which implies that $u(\infty)=\infty$, and $u^{-1}(y)\le C\big(1+y^\gamma\big)$, for any $(y,\pi)\in[0,\infty)^2$, and for some $C>0$;

• a cost function $h:[0,\infty)\longrightarrow[0,\infty)$ assumed to be increasing, strictly convex, continuously differentiable, with $h(0)=0$;

• a fixed discount rate $r>0$.

For any contract $\mathbf C:=(\tau,\pi,\xi)\in\mathfrak C$ and $\alpha\in\mathcal A$, the utility obtained by the agent is defined by the problem
$$V^{\rm A}(\mathbf C) := \sup_{\alpha\in\mathcal A}J^{\rm A}(\mathbf C,\alpha),\ \text{where}\ J^{\rm A}(\mathbf C,\alpha) := \mathbb E^{\mathbb P^\alpha}\bigg[\mathrm e^{-r\tau}u(\xi)+\int_0^\tau r\mathrm e^{-rs}\big(u(\pi_s)-h(\alpha_s)\big)\mathrm ds\bigg].\qquad (2.2)$$
As $u\ge 0$, and $A$ is bounded, notice that $J^{\rm A}(\mathbf C,\alpha)\in\mathbb R\cup\{\infty\}$ is well–defined. Moreover, as the agent is allowed to choose zero effort, inducing $J^{\rm A}(\mathbf C,0)\ge 0$, it follows that $V^{\rm A}(\mathbf C)\ge 0$ for all $\mathbf C\in\mathfrak C$. We denote by
$$\mathcal A^\star(\mathbf C) := \big\{\alpha\in\mathcal A: V^{\rm A}(\mathbf C)=J^{\rm A}(\mathbf C,\alpha)\big\},$$
the (possibly empty) set of all optimal responses of the agent.

In addition, the agent only accepts contracts which provide him with a utility above some fixed threshold $u(R)$, where $R\ge 0$, called participation level. Thus he is only willing to consider contracts in the subset
$$\mathfrak C_R := \big\{\mathbf C\in\mathfrak C: V^{\rm A}(\mathbf C)\ge u(R)\big\}.$$
Observe that the discounted final lump–sum utility for the agent can be written as $\mathrm e^{-r\tau}u(\xi)=\int_\tau^\infty r\mathrm e^{-rt}u(\xi)\,\mathrm dt$, so that it can be equivalently implemented by the payment of the lifetime consumption $\xi$ after retirement at time $\tau$. We shall comment further on this normalisation in Remark 2.3 below.

The principal is risk–neutral with the objective of maximising her overall revenue induced by the agent's effort and the promised compensation
$$J^{\rm P}(\mathbf C,\alpha) := \mathbb E^{\mathbb P^\alpha}\bigg[-\mathrm e^{-\rho\tau}\xi+\int_0^\tau\rho\mathrm e^{-\rho s}\big(\mathrm dX_s-\pi_s\,\mathrm ds\big)\bigg],$$
where we consider here an extension of Sannikov [54], allowing the principal to have a possibly different discount rate $\rho>0$. For any $\alpha\in\mathcal A$, we have $\mathbb E^{\mathbb P^\alpha}\big[\int_0^\tau\mathrm e^{-\rho s}\,\mathrm ds\big]\le\int_0^\infty\mathrm e^{-\rho s}\,\mathrm ds=\frac1\rho<\infty$. Then, by standard Itô integration theory, we have $\mathbb E^{\mathbb P^\alpha}\big[\int_0^\tau\mathrm e^{-\rho s}\sigma\,\mathrm dW^\alpha_s\big]=0$ for all stopping times $\tau\in\mathcal T$, implying that
$$J^{\rm P}(\mathbf C,\alpha) = \mathbb E^{\mathbb P^\alpha}\bigg[-\mathrm e^{-\rho\tau}\xi+\int_0^\tau\rho\mathrm e^{-\rho s}(\alpha_s-\pi_s)\,\mathrm ds\bigg],$$
which is well–defined in $\{-\infty\}\cup\mathbb R$, due to the boundedness of $A$ and the non–negativity of $\xi$ and $\pi$.

We also notice again that the discounted lump–sum payment $\xi$ at time $\tau$ can be rewritten as $\mathrm e^{-\rho\tau}\xi=\int_\tau^\infty\rho\mathrm e^{-\rho t}\xi\,\mathrm dt$, and so it can be implemented by the lifetime payment at rate $\xi$ after $\tau$, in agreement with the corresponding interpretation in the agent's problem.

The principal's problem is defined as follows: anticipating the agent's optimal response, she chooses the contract which best serves her objective under the participation constraint
$$V^{\rm P} := \sup_{\mathbf C\in\mathfrak C_R}\ \sup_{\alpha\in\mathcal A^\star(\mathbf C)}J^{\rm P}(\mathbf C,\alpha).\qquad (2.3)$$
We next re–write our contracting problem equivalently by using the opposite of the inverse of the agent's utility
$$F := -u^{-1}\mathbf 1_{[0,u(\infty))}-\infty\mathbf 1_{\mathbb R_+\setminus[0,u(\infty))}.$$
Then, denoting $\zeta:=u(\xi)$ and $\eta:=u(\pi)$, the criterion of the agent becomes
$$J^{\rm A}(\mathbf C,\alpha) = \mathbb E^{\mathbb P^\alpha}\bigg[\mathrm e^{-r\tau}\zeta+\int_0^\tau r\mathrm e^{-rt}\big(\eta_t-h(\alpha_t)\big)\mathrm dt\bigg],\quad(\mathbf C,\alpha)\in\mathfrak C\times\mathcal A,\qquad (2.4)$$
where we abuse notations and indifferently call contract the triplet $(\tau,\pi,\xi)$, or the triplet $(\tau,\eta,\zeta)$. We will use this identification implicitly throughout the paper. As for the principal, we have
$$J^{\rm P}(\mathbf C,\alpha) = \mathbb E^{\mathbb P^\alpha}\bigg[\mathrm e^{-\rho\tau}F(\zeta)+\int_0^\tau\rho\mathrm e^{-\rho t}\big(\alpha_t+F(\eta_t)\big)\mathrm dt\bigg],\quad(\mathbf C,\alpha)\in\mathfrak C_R\times\mathcal A.$$
Here, the (negative) reward of the principal by stopping at $\tau$ is $F(\zeta)$.

Our first result shows that in general, the principal may be able to improve her reward by not ending the contract at some time $\tau$ with a lump–sum payment to the agent, but by instead discouraging the agent from exerting any effort (which can be understood as an alternative way of ending the contract), and offering him a continuous consumption. The improved (or face–lifted, hereafter) reward is naturally defined by means of the following deterministic control problem
$$\bar F(y_0) := \sup_{p\in\mathcal B_{\mathbb R_+}}\ \sup_{T\in[0,T^{y_0,p}]}\bigg\{\mathrm e^{-\rho T}F\big(y^{y_0,p}(T)\big)+\int_0^T\rho\mathrm e^{-\rho t}F\big(p(t)\big)\mathrm dt\bigg\},\quad y_0\ge 0,\qquad (2.5)$$
where $\mathcal B_{\mathbb R_+}$ is the set of Borel measurable maps from $\mathbb R_+$ to $\mathbb R_+$, and for all $(y_0,p)\in\mathbb R_+\times\mathcal B_{\mathbb R_+}$,
$$T^{y_0,p} := \inf\big\{t\ge 0: y^{y_0,p}(t)\le 0\big\}\in[0,\infty],$$
and the state process $y^{y_0,p}$ is defined by the controlled first–order ODE
$$y^{y_0,p}(0)=y_0,\quad \dot y^{y_0,p}(t)=r\big(y^{y_0,p}(t)-p(t)\big),\quad t>0.\qquad (2.6)$$
To better understand the expression (2.5) for the improved payment, notice that for any $p\in\mathcal B_{\mathbb R_+}$, direct integration of this linear ODE leads to
$$y_0 = \mathrm e^{-rT}y^{y_0,p}(T)+\int_0^T r\mathrm e^{-rt}p(t)\,\mathrm dt,\quad\text{for all } y_0\ge 0,\ \text{and } T\le T^{y_0,p},$$
meaning that for a given state of the world $\omega$, the agent is indifferent between a lump–sum payment $\xi(\omega)$ at some retirement time $\tau(\omega)$, and a continuous payment $p(t)$ on the time interval $[\tau(\omega),\tau(\omega)+T]$, with zero effort on this time interval, and a retirement deferred to $\tau(\omega)+T$, where the lump–sum payment is now $\xi'(\omega):=u^{-1}\big(y^{\zeta(\omega),p}(T)\big)$. The restriction $T\le T^{y_0,p}$ on such deferral policies is induced by the fact that the agent is protected by limited liability, and therefore can only receive non–negative payments. The idea is that while the agent is indifferent between these two alternatives, the discrepancy between the discount rates may be such that the principal can actually benefit from postponing retirement.

An immediate consequence of this is that the value function of the principal can be expressed in its relaxed formulation as
$$\bar V^{\rm P} := \sup_{\mathbf C\in\mathfrak C^0_R}\ \sup_{\alpha\in\mathcal A^\star(\mathbf C)}\bar J^{\rm P}(\mathbf C,\alpha),\ \text{where}\ \bar J^{\rm P}(\mathbf C,\alpha) := \mathbb E^{\mathbb P^\alpha}\bigg[\mathrm e^{-\rho\tau}\bar F(\zeta)+\int_0^\tau\rho\mathrm e^{-\rho t}\big(\alpha_t+F(\eta_t)\big)\mathrm dt\bigg],\qquad (2.7)$$
where $\mathfrak C^0_R:=\{\mathbf C\in\mathfrak C^0: V^{\rm A}(\mathbf C)\ge u(R)\}$, for some subset $\mathfrak C^0\subset\mathfrak C$ defined in Section 2.5 below.

The following result states the equivalence of our original contracting problem with $\bar V^{\rm P}$, and characterises the face–lifted reward $\bar F$ in closed form in terms of the concave conjugate functions
$$F^\star(p) := \inf_{y\ge 0}\big\{yp-F(y)\big\},\ \text{and}\ \bar F^\star(p) := \inf_{y\ge 0}\big\{yp-\bar F(y)\big\},\quad p\in\mathbb R.$$
Notice that $F^\star=0$ on $\mathbb R_+$, and that our condition (2.1) on the agent's utility function is immediately converted for $F^\star$ into
$$-c^\star_0\big(1+|p|^{\gamma^\star}\big)\le F^\star(p)\le c^\star_1\big(1-|p|^{\gamma^\star}\big),\quad p\le 0,\ \text{with}\ \frac1\gamma+\frac1{\gamma^\star}=1,\ \text{for some } (c^\star_0,c^\star_1)\in(0,\infty)^2.\qquad (2.8)$$

Proposition 2.1.

We have $V^{\rm P}=\bar V^{\rm P}$, and the face–lifted reward function satisfies

(i) $\bar F=0$, if $\rho\ge\gamma r$;

(ii) $\bar F=F$, if $\rho=r$;

(iii) if either $\rho\in(r,\gamma r)$, or $\rho\in(0,r)$ and $\lim_{y\to\infty}\frac{F(y)}{yF'(y)}$ exists, we have $\bar F=(\bar F^\star)^\star$ where
$$\bar F^\star(p) = \frac{1}{1-\delta}\big(\delta|p|\big)^{\frac{1}{1-\delta}}\int_b^{\delta p}|x|^{-1-\frac{1}{1-\delta}}F^\star(x)\,\mathrm dx,\ \text{with}\ \delta:=\frac r\rho,\ b:=-\infty\mathbf 1_{\{r<\rho\}}+F'(0)\mathbf 1_{\{r>\rho\}}.\qquad (2.9)$$
In particular $\bar F$ is decreasing, strictly concave, $r\bar F'(0)=\rho F'(0)\mathbf 1_{\{r\ge\rho\}}$, and $\bar F^\star$ satisfies similar bounds to (2.8), with appropriate positive constants $\bar c^\star_0$ and $\bar c^\star_1$, which translate into bounds on $\bar F$ similar to those in (2.1), with appropriate positive constants $\bar c_0$ and $\bar c_1$. Moreover, the supremum over $T$ in (2.5) is attained at $T^{y_0,p}$, meaning that
$$\bar F(y_0) = \sup_{p\in\mathcal B_{\mathbb R_+}}\bigg\{\int_0^{T^{y_0,p}}\rho\mathrm e^{-\rho t}F\big(p(t)\big)\mathrm dt\bigg\},\quad y_0\ge 0.\qquad (2.10)$$

Footnote: As observed by Yuliy Sannikov in private communication with us, the principal problem may actually be directly defined by the relaxed formulation (2.7).

The equality $V^{\rm P}=\bar V^{\rm P}$ in Proposition 2.1 is a direct consequence of our definition of admissible contracts in Section 2.5 below. The remaining claims are proved in Appendix A, and provide the following significant results. In the case $\rho=r$ considered by Sannikov [54], the principal never gains by postponing retirement and allowing the agent to produce zero effort for a while. On the other hand, when $\rho\neq r$, and $\rho$ is not too large, it is always optimal to postpone retirement and $\bar F$ is a non–trivial majorant of $F$. Finally, when the principal becomes a lot more impatient than the agent, we actually have $\bar F=0$, meaning that she can bring back the cost of permanently retiring the agent to $0$.

Example 2.2.

Let $u(\pi):=\pi^{1/\gamma}$, and $\rho\neq r$ with $\rho<\gamma r$, then $F(y)=-y^\gamma$, and we compute directly
$$F^\star(p) = -(\gamma-1)\bigg(\frac{|p|}{\gamma}\bigg)^{\frac{\gamma}{\gamma-1}},\quad \bar F^\star(p) = -\frac{\rho(\gamma-1)^2}{r\gamma-\rho}\bigg(\frac{r|p|}{\rho\gamma}\bigg)^{\frac{\gamma}{\gamma-1}},\quad p\le 0,$$
and it follows from Proposition 2.1 that
$$\bar F(y) = -\bigg(\frac{r\gamma-\rho}{\rho(\gamma-1)}\bigg)^{\gamma-1}\bigg(\frac{\rho y}{r}\bigg)^{\gamma},\quad y\ge 0.$$

Remark 2.3.

The normalisation of the running rewards of the principal and the agent by their corresponding discount rates in Equation (2.2) and Equation (2.3) is not fundamental, per se. However, the face–lifted principal's benefit function plays a crucial role in relating equivalent formulations of the problem. Consider for instance the agent's criterion
$$J^{\rm A}_0(\mathbf C,\alpha) := \mathbb E^{\mathbb P^\alpha}\bigg[\mathrm e^{-r\tau}u(\xi)+\int_0^\tau\mathrm e^{-rs}\big(u(\pi_s)-h(\alpha_s)\big)\mathrm ds\bigg],$$
which differs from $J^{\rm A}$ in (2.2) by the form of the discount factor, $\mathrm e^{-rt}$ instead of $r\mathrm e^{-rt}$. Similarly, change the principal's criterion to
$$J^{\rm P}_0(\mathbf C,\alpha) := \mathbb E^{\mathbb P^\alpha}\bigg[-\mathrm e^{-\rho\tau}\xi+\int_0^\tau\mathrm e^{-\rho t}\big(\alpha_t-\pi_t\big)\mathrm dt\bigg].$$
Then, following the same argument, the corresponding face–lifted utility function is
$$\bar F_0(y_0) := \sup_{p\in\mathcal B_{\mathbb R_+}}\ \sup_{T\in[0,T^{y_0,p}]}\bigg\{\mathrm e^{-\rho T}F\big(y^{y_0,p}(T)\big)+\int_0^T\mathrm e^{-\rho t}F\big(p(t)\big)\mathrm dt\bigg\},\quad y_0\ge 0,$$
with controlled state satisfying, for any $p\in\mathcal B_{\mathbb R_+}$, $y^{y_0,p}(0)=y_0$, and $\dot y^{y_0,p}(t)=ry^{y_0,p}(t)-p(t)$, $t>0$. The corresponding Hamilton–Jacobi equation is
$$\min\Big\{\bar F_0-F,\ \rho\bar F_0-ry\bar F_0'+F^\star\big(\bar F_0'\big)\Big\}=0.$$
In particular, in the case $\rho=r$ of equal discount rates, we see immediately that $\bar F_0(y):=\frac1rF(ry)$, $y\ge 0$, is a solution of this equation. Consequently the decision of retiring the agent should be discussed by comparing the principal's value function to $\bar F_0$ instead of $F$ in this case, see Definition 2.4 below. In this sense, the setting of [54] is the only parametrisation of the problem with $\rho=r$ for which the face–lifted retirement reward function $\bar F$ coincides with $F$.

2.5 Admissible contracts and Golden Parachute

For technical reasons, we introduce further integrability conditions which guarantee that both criteria of the agent and the principal are finite, and more importantly, allow us to apply the reduction result of Lin, Ren, Touzi, and Yang [41].
We denote by C the collection of all contracts C := ( τ, π, ξ ), satisfying in addition thefollowing integrability conditionlim n →∞ sup α ∈A P α [ τ ≥ n ] = 0 , and sup α ∈A E P α (cid:20)(cid:0) e − r τ | ξ | (cid:1) γ + Z τ (cid:0) e − r s | π s | (cid:1) γ d s (cid:21) < ∞ , (2.11)for some r ∈ (0 , r ∧ ργ ).In order to guarantee that the equality V P = V P of Proposition 2.1 holds, we deﬁne the set C as thecollection of all triples ( τ , π , ξ ) such that τ = τ + T, π = π [0 ,τ ) + p [ τ,τ ) , and u ( ξ ) = y u ( ξ ) ,p ( T ) , for some ( τ, π, ξ ) ∈ C , and F τ − measurable p with values in B R + , and T with values in (cid:2) , T u ( ξ ) ,p (cid:3) .We can now introduce the notion of Golden Parachute which may have two diﬀerent meanings in ourrelaxed formulation (2.7)( i ) in Sannikov’s formulation, the retirement time τ is not explicitly involved in the model formulation.Instead, a Golden Parachute is deﬁned as a stopping time τ such that the agent exerts no eﬀort whilereceiving a constant consumption on [ τ, ∞ );( ii ) our deﬁnition of contracts includes a retirement time τ , and we may naturally deﬁne a situation ofGolden Parachute by τ > ξ > P –a.s. Deﬁnition 2.4.

We say that the contracting model exhibits a Golden Parachute, if there exists an optimal contract $(\tau^\star,\pi^\star,\xi^\star)\in\mathfrak C^0_R$ for the relaxed formulation of the principal problem (2.7) such that $\tau^\star>0$, and $\mathbb P[\xi^\star>0]>0$.

In other words, a Golden Parachute corresponds to a situation where there is a high–retirement point for the agent, with either lump–sum payment at retirement or continuous payment after retirement, where retirement means that the agent ceases to exert any effort forever.
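The equivalence, from the agent's viewpoint, between a lump sum at retirement and a continuous payment afterwards rests on the integrated form of the ODE (2.6), namely $y_0=\mathrm e^{-rT}y^{y_0,p}(T)+\int_0^Tr\mathrm e^{-rt}p(t)\,\mathrm dt$. The sketch below checks this identity numerically for an arbitrary utility stream. It is only an illustration: the function name `deferral_identity_terms`, the Runge–Kutta and trapezoid discretisations, and the test stream are assumptions made here, not constructions from the paper.

```python
import math


def deferral_identity_terms(y0, p, r, T, n_steps=4000):
    """Integrate y' = r * (y - p(t)), y(0) = y0, with a classical RK4 scheme,
    and approximate int_0^T r * exp(-r t) * p(t) dt with the trapezoid rule.

    Returns the pair (y(T), discounted_stream).
    """
    dt = T / n_steps
    y = y0
    stream = 0.0

    def rhs(t, state):
        return r * (state - p(t))

    for k in range(n_steps):
        t = k * dt
        # one RK4 step for the continuation utility
        k1 = rhs(t, y)
        k2 = rhs(t + 0.5 * dt, y + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, y + 0.5 * dt * k2)
        k4 = rhs(t + dt, y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        # trapezoid rule for the discounted utility stream
        g0 = r * math.exp(-r * t) * p(t)
        g1 = r * math.exp(-r * (t + dt)) * p(t + dt)
        stream += 0.5 * dt * (g0 + g1)
    return y, stream
```

With the returned pair `(y_T, stream)`, the quantity `math.exp(-r * T) * y_T + stream` should reproduce `y0` up to discretisation error, whatever the stream `p`, as long as the state stays non–negative on $[0,T]$.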

Remark 2.5. We shall provide in Section 4 a complete characterisation of the first–best version of our contracting problem
$$V^{\rm P,FB} := \sup\Big\{J^{\rm P}(\mathbf C,\alpha): \mathbf C\in\mathfrak C^{\rm FB},\ \alpha\in\mathcal A,\ \text{and}\ J^{\rm A}(\mathbf C,\alpha)\ge u(R)\Big\},$$
where $\mathfrak C^{\rm FB}$ is an appropriate extension of our $\mathfrak C$. In particular, Theorem 4.1 shows that the first–best optimal contract exhibits no Golden Parachute.
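Returning to the power utility case of Example 2.2, the closed form $\bar F(y)=-\big(\frac{r\gamma-\rho}{\rho(\gamma-1)}\big)^{\gamma-1}\big(\frac{\rho y}{r}\big)^\gamma$ can be sanity–checked numerically against the control problem (2.10). The sketch below evaluates the principal's discounted cost of one candidate stream, $p(t)=c\,\mathrm e^{\kappa t}$ with $\kappa=\frac{\rho-r}{\gamma-1}$ and $c=y_0\frac{r\gamma-\rho}{r(\gamma-1)}$. This exponential candidate comes from a formal first–order condition worked out for this illustration and is an assumption, not a statement taken from the paper, as are the function names and the truncation horizon.

```python
import math


def F(y, gamma):
    """Retirement reward F(y) = -u^{-1}(y) for u(pi) = pi**(1/gamma)."""
    return -(y ** gamma)


def face_lifted_F(y, r, rho, gamma):
    """Closed form of the face-lifted reward in the power case, for rho < gamma * r."""
    a = ((r * gamma - rho) / (rho * (gamma - 1.0))) ** (gamma - 1.0)
    return -a * (rho * y / r) ** gamma


def cost_of_exponential_stream(y0, r, rho, gamma, horizon=400.0, n=200000):
    """Midpoint-rule approximation of int_0^horizon rho e^{-rho t} F(p(t)) dt
    for the candidate utility stream p(t) = c * exp(kappa * t).

    The constant c is chosen so that int_0^infty r e^{-r t} p(t) dt = y0,
    i.e. the stream delivers the continuation utility y0 to the agent.
    """
    kappa = (rho - r) / (gamma - 1.0)
    c = y0 * (r * gamma - rho) / (r * (gamma - 1.0))
    dt = horizon / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += rho * math.exp(-rho * t) * F(c * math.exp(kappa * t), gamma) * dt
    return total
```

For instance, with $r=0.1$, $\rho=0.15$, $\gamma=2$ and $y_0=1$, both `face_lifted_F` and `cost_of_exponential_stream` should be close to $-0.75$, which lies strictly above $F(1)=-1$: in this parametrisation, deferring retirement with a suitable rent is cheaper for the principal than an immediate lump sum.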

Our first main result provides a necessary condition for the potential optimality of a Golden Parachute, and then deduces some sufficient conditions which exclude the existence of a Golden Parachute, thus contrasting with the results claimed in Sannikov [54]. Our statement requires introducing the convex conjugate of the cost of effort function
$$h^\star(z) := \sup_{a\in A}\{za-h(a)\},\ z\in\mathbb R. \tag{3.1}$$
We also introduce the corresponding subgradient $\partial h^\star(z) := \{a\in A : h^\star(z)=za-h(a)\}$, together with the second–order differential operator
$$I(v',v'') := \sup_{z\in\mathbb R,\ \hat a\in\partial h^\star(z)}\big\{\hat a+h(\hat a)\,\delta v'+\eta z^2\,\delta v''\big\},\ \text{for all } C^2 \text{ functions } v,\ \text{where } \delta:=\frac{r}{\rho},\ \eta:=\frac12 r\sigma^2. \tag{3.2}$$

Proposition 3.1. If a

Golden Parachute in the sense of Definition 2.4 exists, then
$$\sup_{y\ge\bar y}\big\{I(\bar F',\bar F'')(y)\big\}\le 0,\ \text{for some } \bar y>0,$$
or equivalently
$$\sup_{p\le\bar p}\Big\{I\Big(p,\frac{1}{(\bar F^\star)''(p)}\Big)\Big\}\le 0,\ \text{for some } \bar p<0.$$
In particular, there is no

Golden Parachute whenever either (NGP1) h (0) = 0;(NGP2) or h (0) > , F is non–increasing, and I (cid:0) F (0) , F (0) (cid:1) > or h (0) > , A is an interval, and h ∈ C with inf a ∈ A ( (cid:0) ( h ) (cid:1) ( a ) h ( a ) ) ≥ η sup y ≥ ( − F ( y ) F ( y ) ) , and sup y ≥ n F ( y ) + 2 ηF ( y ) h (0) o ≤ − δh (0) . Remark 3.2.

Assume for simplicity that F (0) = 0 , then F (0) = 0 by Proposition 2.1 . Under our conditionthat ¯ a := max A < ∞ , we have sup z ≥ h (0) (cid:8) z − max ˆ A ( z ) (cid:9) ≤ ¯ a ( h (0)) . Then, when F is non–increasing, the existence of a Golden Parachute implies that ( h (0)) < ¯ a/ (cid:0) − ηF (0) (cid:1) .In other words, the second alternative (NGP2) of Proposition 3.1 states that there is no

Golden Parachute forsuﬃciently large h (0) . As for the third alternative (NGP3) , notice that the ﬁrst condition is automaticallysatisﬁed whenever ( h ) is convex, while the second one again requires h (0) to be large enough. Example 3.3.

Sannikov [54, Figure 1] considers the situation $\delta=1$ (so that $\bar F=F$), and
$$F(y)=-y^2,\ y\ge 0,\qquad h(a):=\tfrac12 ha^2+\beta a,\ a\in A=\mathbb R_+,$$
for some positive constants $h$ and $\beta$. Notice that, given Sannikov's conclusion that a Golden Parachute exists, the unboundedness of $A$ is not problematic, as the optimal effort remains bounded, so that the problem is unchanged by restricting to the corresponding compact subset of $A$. Under the present specification, we have $F''(y)=-2$, and
$$\sup_{z\ge\beta}\big\{z^{-2}\max\hat A(z)\big\}=\sup_{a\ge 0}\Big\{\frac{a}{(ha+\beta)^2}\Big\}=\frac{1}{4h\beta}.$$
Then, since $F'(0)=0$ in this case, the second alternative in Proposition 3.1 can be reformulated as: (NGP2) holds if and only if $8\beta\eta h\ge 1$.

Recall from Proposition 2.1 that $\bar F=0$ when $\delta\gamma\le 1$. Our first result shows that the solution of the contracting problem is degenerate in this case. Indeed, we shall exhibit a sequence of admissible contracts which induces a utility as large as we want for the agent, and reaches the highest possible level for the principal, namely $\bar a$. Roughly speaking, these contracts make small intermediate payments, enforce the highest possible effort for the agent at all times, and promise to pay him an extremely high value after an extremely long time. By exploiting the large discrepancy between the discount rates of the agent and the principal, we show that the continuation utilities of both parties reach their maximum.

We emphasise that this result is in line with the solution of the first–best contracting problem of Section 4 below, where we also exhibit a sequence of contracts which induce arbitrarily large levels of utility for the agent, while providing the principal with a value as close as we want to her universal maximal utility of $\bar a$. We refer the reader to Section 4 and Section 6 for more intuitions on the contracts we construct.

Theorem 3.4.

Let $\rho\ge\gamma r$. Then $V^P=\bar a$, there is no optimal contract achieving this value, and the second–best value of the principal coincides with her first–best value.

The proof of this result is reported in Section 6. We next focus on the more interesting case $\rho<\gamma r$. Similar to Sannikov [54], the solution of the contracting problem is characterised by means of the second–order differential equation
$$v(0)=0,\ \text{and } v-\delta yv'+F^\star(\delta v')-I(v',v'')^+=0,\ \text{on } [0,\infty). \tag{3.3}$$
Our main results hold under the following assumption.

Assumption 3.5.

Either β := h (0) > , or A ⊃ [0 , ¯ a ] for some ¯ a > . Moreover, if ρ ∈ (0 , r ) then lim y →∞ F ( y ) / (cid:0) yF ( y ) (cid:1) exists. Theorem 3.6.

Let Assumption 3.5 hold true, and let $S:=\{v=\bar F\}$. Then

(i) there exists a unique solution $v\in C^2(\mathbb R_+)$ of (3.3), such that $0\le(v-\bar F)(y)\le C\log\big(1+\log(1+y)\big)$, $y\ge 0$, for some $C>0$;

(ii) $v$ is strictly concave, ultimately decreasing, $v'(0)\ge 0$, and whenever $\bar F'(0)=0$, we have $v'(0)>0$ if and only if $I\big(0,\bar F''(0)\big)>0$;

(iii) if $\beta=0$, then $S=\{0\}$, and if $\beta>0$, and in addition the maps $\bar F$ and $I$ of (7.2) are analytic, then $S=\{0\}\cup[y_{gp},\infty)$ for some $y_{gp}\in[0,\infty]$;

(iv) if $S=\{0\}\cup[y_{gp},\infty)$ for some $y_{gp}<\infty$, then $V^P=\sup_{y\ge u(R)}v(y)$, and the supremum is attained at some $\hat y\ge u(R)$. Defining $\hat z:[0,\infty)\longrightarrow\mathbb R$ to be a (measurable) maximiser of $I(v',v'')$, and $\hat\pi:[0,\infty)\longrightarrow\mathbb R$ to be a (measurable) minimiser of $F^\star(\delta v')$, there exists a unique weak solution to the SDE corresponding to $\hat Y:=Y^{\hat y,\hat z(\hat Y),\hat\pi(\hat Y)}$. In particular, the contract $\big(\hat\tau,\hat\pi(\hat Y),u^{-1}(\hat Y_{\hat\tau})\big)$, where $\hat\tau:=\inf\big\{t\ge 0:\hat Y_t\notin(0,y_{gp})\big\}$, is an optimal contract for the relaxed principal problem (2.7).

Remark 3.7.

Sannikov mentions that if ‘the agent had a higher discount rate than the principal, then with time the principal’s benefit from output outweighs the cost of the agent’s effort,’ and that ‘it is sensible to avoid permanent retirement by allowing the agent to suspend effort temporarily’ ([54, p. 959]). Our result shows that this statement is not correct: having $\delta>1$ does not change the nature of the solution to the problem.

Remark 3.8.

The case $S=\{0\}$ is not covered by Theorem 3.6.(iv), due to the fact that in this case, the optimal retirement time $\tau^\star$ may be infinite with positive probability, and therefore cannot satisfy the integrability requirement in Equation (2.11). This is however not a critical issue. Indeed, the integrability condition on admissible stopping times in Equation (2.11) is taken from the general result in [41]. But a detailed reading of their arguments shows that they only require it in order to be able to treat moral hazard problems where the agent is allowed to control the volatility of the output process, for which they need a theory for second–order backward SDEs with random horizon, which is obtained in Lin, Ren, Touzi, and Yang [40], but does not allow for infinite horizon. In our problem of interest, the agent only controls the drift of $X$, meaning that the classical theory of backward SDEs is sufficient, and these objects are known to be well–posed even with infinite horizon, see for instance Papapantoleon, Possamaï, and Saplaouras [49]. With these results in hand, we can straightforwardly extend the general reduction result of Section 5 to include possibly infinite retirement times, and then obtain a verification result general enough to cover these situations. As this is not central to our message, we refrain from going to this level of generality.
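The closed–form criterion of Example 3.3 lends itself to a quick numerical sanity check. The following Python sketch is illustrative only: the function names are ours, it assumes the quadratic specification $h(a)=\tfrac12 ha^2+\beta a$ and $F(y)=-y^2$ of that example, and it merely compares a brute–force grid search for $\sup_{a\ge 0}a/(ha+\beta)^2$ with the value $1/(4h\beta)$, before evaluating the condition $8\beta\eta h\ge 1$.

```python
import numpy as np


def sup_ratio_closed_form(h: float, beta: float) -> float:
    """Value 1 / (4 h beta) stated in Example 3.3 for sup_{a>=0} a / (h a + beta)^2."""
    return 1.0 / (4.0 * h * beta)


def sup_ratio_grid(h: float, beta: float, n: int = 200_001) -> float:
    """Brute-force approximation of the same supremum on a uniform grid.

    The ratio vanishes at a = 0 and as a -> infinity, and its derivative
    (beta - h a) / (h a + beta)^3 changes sign at a = beta / h, so a grid on
    [0, 10 * beta / h] is enough for this illustration.
    """
    a = np.linspace(0.0, 10.0 * beta / h, n)
    return float(np.max(a / (h * a + beta) ** 2))


def ngp2_holds(h: float, beta: float, eta: float) -> bool:
    """Condition 8 * beta * eta * h >= 1 from Example 3.3 (no Golden Parachute)."""
    return 8.0 * beta * eta * h >= 1.0


if __name__ == "__main__":
    # Parameters quoted in the text for Figure 1b: eta = h = 1, beta = 0.01.
    h, beta, eta = 1.0, 0.01, 1.0
    print("closed form:", sup_ratio_closed_form(h, beta))
    print("grid search:", sup_ratio_grid(h, beta))
    print("(NGP2) satisfied:", ngp2_holds(h, beta, eta))
```

With the parameters quoted for Figure 1b, the product $8\beta\eta h$ equals $0.08$, so the sketch reports that (NGP2) fails, which is in line with parameters chosen outside the (NGP2) regime.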

We next provide some numerical results with the cost of eﬀort function from Example 3.3, and utilityfunction u ( π ) := π γ , γ >

1. We of course choose the model parameters so that neither (NGP2) nor (NGP3)are satisﬁed, since in those cases the solution is F everywhere. igure 1a takes the parameters in [54] (with γ = 2, η = 0 . h = 0 . β = 0 .

4, and δ = 1), and shows thearchetypical case where a Golden Parachute exists, as in [54]. Figure 1b however (with γ = 3 / η = h = 1, β = 0 .

01, and δ = 1) suggests strongly that v remains always strictly above F but becomes asymptoticallyclose to it, a case for which a Golden Parachute would not exist.(a) v (red), F (blue) (b) v (red), F (green)The next two sets of ﬁgures show what happens when δ = 1. More precisely, Figure 2a (with γ = 3 / η = h = 1, β = 0 .

01, and δ = 3 /

4) shows a case where v becomes equal to F after a while and a Goldenparachute does exist, while, at least numerically, Figure 2b (with γ = 3, η = h = 1, β = 0 .

01, and δ = 2),seems to show that v remains always above F , and that no Golden Parachute exists.(a) v (red), F (green), F (blue) (b) v (red), F (green), F (blue) In this subsection, we specialise the discussion to the case δ = 1 to better compare with [54]. Notice thatthe HJB equation considered by Sannikov in [54, Equation (5)] is the same as our Equation (3.3) whenrestricted to the continuation region v − yv + F ? ( v ) − I ( v, v ) + = 0 , y ∈ [0 , y gp ] , v (0) = F (0) , v ( y gp ) = F ( y gp ) and v ( y gp ) = F ( y gp ) , which corresponds to the natural guess that the stopping region S = { v = F } is of the form { } ∪ [ y gp , ∞ ),with some free boundary point y gp < ∞ to be determined so as to guarantee that the smooth–ﬁt condition ( y gp ) = F ( y gp ) holds. Such a guess is more naturally justiﬁed by the optimal stopping component of theprincipal’s problem in our formulation. We shall also see that it is necessary in order to apply the veriﬁcationargument of Proposition 7.2 below (which in fact requires C regularity).A few pages later, namely in [54, Equation (6)], the author rewrites this ODE with I instead of I +0 v − yv + F ? ( v ) − I ( v, v ) = 0 , y ∈ [0 , y gp ] , v (0) = F (0) , v ( y gp ) = F ( y gp ) and v ( y gp ) = F ( y gp ) . (3.4)This is motivated by the natural guess that the principal is expected to induce a positive eﬀort for the agenton the continuation region. More importantly, direct manipulations allow to reformulate the last equationequivalently as v = inf z ≥ h (0) , ˆ a ∈ ˆ A ( z ) (cid:26) v − yv + F ? ( v ) − ˆ a − h (ˆ a ) v ηz (cid:27) , (3.5)thus reducing the equation to an explicit non–linear second–order ODE under the additional restriction toa positive marginal cost of eﬀort, that is to say when h (0) > y gp < ∞ , the potential explosion of the solution due to the superlinear feature of F ? is bypassed, as the concavity of v implies that v is bounded in [ v ( y gp ) , v (0)]. 
Although this assumption is not always true, see Proposition 3.1, we continue along the line of Sannikov. Then, it follows from the standard Cauchy–Lipschitz theorem that the last ODE, with initial data $v(0)=0$ and $v'(0)=b$, has a unique classical solution for any choice of $b$, say $v_b$. Then, Sannikov argues that it is possible to choose $b$ so that this solution $v_b$ indeed solves Equation (3.4). Although Sannikov's proof of this claim is not rigorous, we show in the subsequent analysis that this result may be correct for sufficiently small $\beta$. However, notice that our main results, given in Section 3.2 and Section 3.1, show that
• for $\beta=0$, there is no $y_{gp}\ge 0$ such that $v=F$ on $[y_{gp},\infty)$, see (NGP1) of Proposition 3.1;
• for $\beta>0$, the stopping region $S$ is either reduced to $\{0\}$, or is of the form $\{0\}\cup[y_{gp},\infty)$ for some $y_{gp}\ge 0$. This requires some involved technical arguments which are displayed in Section 9 below;
• when the curvature at zero $u''(0)$ of the agent's utility is sufficiently large negative, the stopping region $S$ is always reduced to $\{0\}$ for whatever value of $\beta>0$. See (NGP2) of Proposition 3.1.

Finally, we observe that in [54, Figure 6], the value function $v$ is tangent to $F$ at the point $y_{gp}$, but seems to be strictly above $F$ on $(y_{gp},\infty)$! We believe that the function plotted in this figure is the solution of (3.4). Although this solution coincides with the solution of the dynamic programming equation (3.3) on the continuation region $[0,y_{gp}]$, this figure shows that it lies strictly above it on the stopping region $(y_{gp},\infty)$ where the principal optimally retires the agent. Hence, this seems to be concrete numerical evidence that the dynamic programming equation is not equivalent to (3.4).

This section reports for completeness the solution of the first–best version of the contracting problem
$$V^{P,\mathrm{FB}} := \sup\big\{J^P(\mathbf C,\alpha) : \mathbf C\in\mathfrak C^{\mathrm{FB}},\ \alpha\in\mathcal A,\ \text{and } J^A(\mathbf C,\alpha)\ge u(R)\big\},$$
where $\mathfrak C^{\mathrm{FB}}$ consists of all contracts $(\tau,\pi,\xi)$ where $\tau\in\mathcal T$ is a stopping time with values in $[0,\infty]$, and $(\pi,\xi)$ satisfy the integrability condition of (2.11). In particular, we shall see that the first–best optimal contract exhibits no Golden Parachute.

We first consider the case where δγ ≤

1, which is somewhat degenerate. Indeed, as mentioned earlier,we can ﬁnd a sequence of admissible contracts which ensure a utility as large as we want for the agent, andreaches the highest possible level for the principal, namely ¯ a . The idea is to oﬀer no intermediate payments,to ask the agent to exert maximal eﬀort at all times, and to retire him after a very long time, at which weoﬀer him a very large lump–sum payment. The diﬃculty is then in how to calibrate the speed at which theretirement time and the ﬁnal payment explode, so as to ensure that the principal’s utility still increases. heorem 4.1. Assume that δγ ≤ . Then, we have V P , FB = ¯ a , and there does not exist an optimal contract.Proof. Notice ﬁrst that the limited liability constraints on the payments made to the agent, and the factthat A is bounded by ¯ a imply immediately that for any ( C , α ) ∈ C FB × A , we have J P ( C , α ) ≤ ¯ a. Moreover, the only way this can be an equality is to choose α = ¯ a , and C := ( τ, π, ξ ) such that π = 0and F ( ξ )e ρτ = 0, which means that either τ = ∞ , or ξ = 0. However, such contracts do not satisfy theparticipation constraint of the agent, and therefore there cannot exist an admissible contract attaining theupper bound ¯ a for the principal. We will however show that one can ﬁnd a sequence of admissible contractswhich allows to approach ¯ a as close as we want.For any ε >

0, let us consider the following contract: τ ε := − log( ε ) /ε , π ε := 0, ξ ε := ε − e γ ( r − ε ) τ ε , withthe level of eﬀort α ε := ¯ a . Since these contracts are deﬁned by deterministic components, they automaticallysatisfy the integrability condition of (2.11). Notice also that when ε goes to 0, both τ ε and ξ ε convergeto ∞ . Therefore, we can choose ε small enough and ﬁnd a constant C >

0, independent of ε , such that u ( ξ ε ) ≥ C ( ξ ε ) /γ . The utility received by the agent is thene − rτ ε u (cid:0) ξ ε (cid:1) − h (¯ a ) (cid:0) − e − rτ ε (cid:1) ≥ Cε γ − − h (¯ a ) −→ ε → ∞ , so that the agent’s participation constraint is satisﬁed for ε small enough. The principal’s utility is − e − ρτ ε ξ ε + ¯ a (cid:0) − e − ρτ ε (cid:1) = e ρ (1 − δγ ) log( ε ) ε ε γ − + ¯ a (cid:0) − e − ρτ ε (cid:1) −→ ε → ¯ a, since δγ ≤

1, which ends the proof in this case.When δγ >

1, the problem does not degenerate any longer, unless the reservation utility of the agent istoo low and either ¯ a is too small, or ( F ? ) (0) is too large. The solution is expressed in terms of the function G ? ( p ) := sup a ∈ A (cid:8) a + ph ( a ) (cid:9) , p ∈ R . Theorem 4.2.

Let δγ > . Then (i) if u ( R ) ≤ − h (¯ a ) + (cid:0) F ? (cid:1) (0) , the value function of the ﬁrst–best problem is V P , FB = ¯ a , and there is nooptimal contract which achieves this value ;(ii) otherwise, V P , FB = − λ ? δu ( R ) + (cid:0) G ? − F ? (cid:1)(cid:0) − δλ ? (cid:1) , where λ ? is the unique positive solution of − u ( R ) − Z ∞ r e − rt (cid:0) G ? − F ? (cid:1) (cid:0) − δλ ? e ( ρ − r ) t (cid:1) d t = 0 . Moreover, the agent’s participation constraint is saturated, with ﬁrst–best optimal contract τ ? = ∞ , and π ?t ∈ ˆ U (cid:18) e ( r − ρ ) t δλ ? (cid:19) , a ?t ∈ ˆ A (cid:18) e ( r − ρ ) t δλ ? (cid:19) , t ≥ , where for any z ∈ R , ˆ U ( z ) := argmin p ≥ (cid:8) zp − u ( p ) (cid:9) .Proof. By the standard Karush–Kuhn–Tucker method, we rewrite the ﬁrst best problem asinf λ ≥ (cid:26) − λu ( R ) + sup ( C ,α ) ∈ C FB ×A E P α (cid:20) − (cid:0) e − ρτ ξ − e − rτ λu ( ξ ) (cid:1) − Z τ (cid:16) ρ e − ρt π t − r e − rt λu ( π t ) (cid:17) d t + Z τ (cid:16) ρ e − ρt α t − r e − rt λh ( α t ) (cid:17) d t (cid:21)(cid:27) = inf λ ≥ (cid:26) − λu ( R ) + sup ( τ,α ) ∈T ×A E P α (cid:20) − e − ρτ F ? (cid:0) − λ e ( ρ − r ) τ (cid:1) + Z τ ρ e − ρt (cid:0) G ? − F ? (cid:1)(cid:0) − δλ e ( ρ − r ) t (cid:1) d t (cid:21)(cid:27) = inf λ ≥ n − λu ( R ) + sup T ≥ f ( T ) o , f ( T ) := − e − ρT F ? (cid:0) − λ e ( ρ − r ) T (cid:1) + Z T ρ e − ρt (cid:0) G ? − F ? (cid:1)(cid:0) − δλ e ( ρ − r ) t (cid:1) d t. s G ? ≥

0, and F ? is concave, we have f ( T ) ≥ ρ e − ρT (cid:16) F ? (cid:0) − λ e ( ρ − r ) T (cid:1) − F ? (cid:0) − λδ e ( ρ − r ) T (cid:1) + λ (1 − δ )e ( ρ − r ) T ( F ? ) (cid:0) − λ e ( ρ − r ) T (cid:1)(cid:17) ≥ , T ≥ . Then, the supremum over T ≥ T →∞ f ( T ) = φ ( λ ) < ∞ , as δγ >

1, where φ ( λ ) := Z ∞ ρ e − ρt (cid:0) G ? − F ? (cid:1)(cid:0) − δλ e ( ρ − r ) t (cid:1) d t < ∞ , and V P , FB0 = inf λ ≥ (cid:8) − λu ( R ) + φ ( λ ) (cid:9) . Notice that φ is strictly convex, with φ (0) = ( G ? − F ? )(0) = G ? (0) = ¯ a , and lim λ →∞ φ ( λ ) = ∞ , since G ? has linear growth (recall that A is compact), and F ? grows as ( − p ) γ/ ( γ − at −∞ . We also compute directlythat φ (0) = − (cid:0) G ? − F ? (cid:1) (0) ≤ − (cid:0) G ? − F ? (cid:1) (0) = − h (¯ a ) + (cid:0) F ? (cid:1) (0) , where the last equality follows from the diﬀerentiability of G ? and the observation that G ? ( p ) = ¯ a + ph (¯ a )for p ≥

0. Consequently, the minimum in the last expression of V P , FB is attained • either at λ ? = 0, if − u ( R ) − h (¯ a )+ (cid:0) F ? (cid:1) (0) ≥

0, inducing the value V P , FB = ( G ? − F ? )(0) = G ? (0) = ¯ a , • or at the unique solution λ ? of − u ( R ) − φ ( λ ? ) = 0. Then it follows from a direct integration by partsthat − λ ? (1 − δ ) u ( R ) = λ ? (1 − δ ) φ ( λ ? ) = ( G ? − F ? )( − δλ ? ) − φ ( λ ? ), and therefore V P , FB = − λ ? u ( R ) + φ ( λ ? ) = − λ ? δu ( R ) + (cid:0) G ? − F ? (cid:1)(cid:0) − δλ ? (cid:1) . Notice ﬁnally that when λ ? = 0, similar to the proof of Theorem 4.1, there is no optimal contract. In order to prove our main results reported in Section xsect:mainresults, we use the general approach of Lin,Ren, Touzi, and Yang [41] which justiﬁes the remarkable solution approach introduced by Sannikov [54],reducing the Stackelberg game problem of the principal (2.7) into a standard stochastic control one.To do this, observe that the Hamiltonian of the agent’s problem is given by convex conjugate function h ? introduced in (3.1), and that the corresponding sub–gradient contains all possible optimal agent responsesˆ A ( z ) := ∂h ? ( z ) = { a ∈ A : h ? ( z ) = za − h ( a ) } . As A is closed and h is strictly convex , notice thatˆ A ( z ) = ∅ , whenever h ? ( z ) < ∞ , and ˆ A ( z ) = { } , for z ≤ h (0) , (5.1)because a za − h ( a ) is decreasing whenever z ≤ h (0). We also abuse notations slightly, and for any F –predictable, real–valued process Z and any α ∈ A , we write α ∈ ˆ A ( Z ) whenever α t ∈ ˆ A ( Z t ), d t ⊗ d P –a.e.Then, the lump–sum payment ξ = u − ( ζ ) promised by the principal at τ takes the form ζ = Y Y ,Z,πτ = Y + r Z τ Z t d X t + (cid:0) Y Y ,Z,πt − h ? ( Z t ) − η t (cid:1) d t, (5.2)where Y Y ,Z,π represents the continuation utility of the agent given a continuous consumption stream π = u − ( η ) and Z satisﬁes the integrability conditionsup α ∈A E P α (cid:20) sup ≤ t ≤ τ (cid:0) e − r t | Y t | (cid:1) p (cid:21) < ∞ , and sup α ∈A E P α (cid:20)(cid:18) Z τ (e − r t | Z t | ) d t (cid:19) p (cid:21) < ∞ . (5.3) See Footnote 7. 
The methodology developed in [41] extends the ﬁnite maturity setting of Cvitanić, Possamaï, and Touzi [19]and is largely inspired by the method developed in Sannikov [54]. If A is an interval, then the strict convexity of h guarantees that ˆ A ( z ) is a singleton. However, for a general closed subset A ,the maximiser may not be unique. emark 5.1. As observed by

Sannikov [54] , notice that the non–negativity condition on u and h impliesthat the so–called limited liability condition Y Y ,Z,π ≥ is satisﬁed. Indeed, as the dynamics of the process Y Y ,Z,π are given by d Y Y ,Z,πt = r (cid:0) Y Y ,Z,πt + h ? ( Z t ) − η t (cid:1) d t + σrZ t d W t , under the agent’s optimal response,we see that is an absorption point for the continuation utility with optimal eﬀort . By the main reduction result of [41], we may rewrite the principal’s problem (2.7) as V P = sup Y ≥ u ( R ) V ( Y ) , where V ( Y ) := sup ( τ,Z,π ) ∈Z ( Y a ∈ ˆ A ( Z ) J ( τ, π, Z, ˆ a ) , (5.4)and J ( τ, π, Z, ˆ a ) := E P ˆ a (cid:20) e − ρτ F (cid:0) Y Y ,Z,πτ (cid:1) + Z τ ρ e − ρt (cid:0) ˆ a t + F ( η t ) (cid:1) d t (cid:21) . (5.5)Here Z ( Y ) is the collection of all triples ( τ, Z, π ) such that τ , ˆ a and ξ = − F ( Y Y ,Z,πτ ) satisfy the integrabilityconditions (2.11), for some ˆ a ∈ ˆ A ( Z ), and therefore also (5.3), together with the limited liability condition Y Y ,Z,π ≥ Y Y ,Z,π under the optimal response of the agent(due to the principal’s criterion which does not involve anymore the state variable X )d Y Y ,Z,πt = r (cid:0) Y Y ,Z,πt + h (cid:0) ˆ a t (cid:1) − η t ) (cid:1) d t + rZ t σ d W ˆ at , P ˆ a –a.s., for all ˆ a ∈ ˆ A ( Z ) . (5.6) We now provide the proof of Theorem 3.4 by using the problem reduction from the previous section. Noticeﬁrst that whenever V P = ¯ a , then the result of Theorem 4.1 shows that there cannot exist an optimal contract,and that the ﬁrst–best and second–best value coincide. Our proof is based on an explicit construction of asequence of contracts following the idea used in the proof of Theorem 4.1: we want to have a retirement timegoing to ∞ , associated with a large lump–sum payment. However, because we are now in the second–bestcase, we need to oﬀer the agent contracts which are incentive–compatible with the level of eﬀort ¯ a , meaningthat these contracts cannot be deterministic. 
This can however be achieved by choosing a large enough and constant control process $Z$ in Equation (5.6). The price to pay now with such contracts is that the continuation utility of the agent may reach 0 in finite time with positive probability, thus preventing the principal from offering a large lump–sum payment. This thus requires carefully controlling the probability of early termination of the contract, and we show that by offering the agent a sufficiently large utility, this probability can be made arbitrarily small.

Proof of Theorem 3.4.

Let us ﬁx some y > z > h (¯ a ). It is immediate that in this case ˆ A ( z ) = { ¯ a } . Forarbitrary ε ∈ (0 , r ∧ π εt := u − (cid:0) εY εt (cid:1) , t ≥

0, where Y ε := Y y / √ ε,z,π ε isthe corresponding continuation utility of the agent, which is given by Y εt = y √ ε + Z t (cid:0) ( r − ε ) Y εs + rh (¯ a ) (cid:1) d s + rzσW ¯ at , t ≥ . Notice that Y ε an Ornstein–Uhlenbeck process under P ¯ a , whose SDE can be solved explicitly: Y εt = e ( r − ε ) t y √ ε + rr − ε h (¯ a ) (cid:0) e r ( t − ε ) − (cid:1) + rzσ Z t e ( r − ε )( t − s ) d W ¯ as , t ≥ . Let now T ε := inf (cid:8) t > Y εt = 0 (cid:9) , and consider the contract C ε with retirement time τ ε := (cid:0) − log( ε ) ε (cid:1) ∧ T ε ,continuous payments π ε , and terminal payment ξ ε := u − (cid:0) Y ετ ε (cid:1) . We know from the general results inSection 5 that such a contract provides the agent with utility − log( ε ) y , which he will accept for ε small By the growth condition (2.1) on u , the integrability condition (2.11) implies thatsup α ∈A E P α (cid:20)(cid:0) e − r τ u ( ξ ) (cid:1) γ + Z τ (cid:0) e − r s u ( π s ) (cid:1) γ d s (cid:21) < ∞ , which is precisely the integrability condition required by Lin, Ren, Touzi, and Yang [41]. nough, regardless of the level of his participation constraint. Indeed, all the integrability requirements areobviously satisﬁed here, since z is deterministic, τ ε is bounded, and from the explicit formula for Y ε .We now compute the principal’s utility induced by this contract J P ( C ε , ¯ a ) = E P ¯ a (cid:2) e − ρτ ε F (cid:0) Y ετ ε (cid:1)(cid:3) + E P ¯ a (cid:20) Z τ ε ρ e − ρt F (cid:0) εY εt (cid:1) d t (cid:21) + ¯ a (cid:16) − E P ¯ a (cid:2) e − ρτ ε (cid:3)(cid:17) . Step . For ε < r , we have T ε > ¯ T ε := inf (cid:8) t > Y εt = 0 (cid:9) , where ¯ Y εt := y / √ ε + rh (¯ a ) t + rzσW ¯ at , t ≥ T ε is well–known (see for instance Karatzas and Shreve [36, Equation (5.13)]), and we have P ¯ a (cid:2) T ε < ∞ (cid:3) ≤ P ¯ a (cid:2) ¯ T ε < ∞ (cid:3) = exp (cid:18) − h (¯ a ) y rσ z √ ε (cid:19) −→ ε → . 
(6.1)This implies that E P ¯ a (cid:2) e − ρτ ε (cid:3) = e ρ log( ε ) ε P ¯ a (cid:2) T ε = ∞ (cid:3) + E P ¯ a h e − ρτ ε { T ε < ∞} i ≤ e ρ log( ε ) ε P ¯ a (cid:2) T ε = ∞ (cid:3) + P ¯ a (cid:2) T ε < ∞ (cid:3) −→ ε → . Step . Next, we have that there exists some

C >

0, which may change value from line to line, but isindependent of ε , such that for any t ∈ (cid:2) , T ε (cid:3) ≤ − e − ρt F (cid:0) Y εt (cid:1) ≤ C e − ρt (cid:0) (cid:12)(cid:12) Y εt (cid:12)(cid:12) γ (cid:1) ≤ C e − ρt (cid:18) (cid:0) ε − γ/ (cid:1) e ( r − ε ) γt + e γ ( r − ε ) t (cid:12)(cid:12)(cid:12)(cid:12) Z t e − ( r − ε ) s d W ¯ as (cid:12)(cid:12)(cid:12)(cid:12) γ (cid:19) . Then, as the last stochastic integral is a Gaussian random variable, and δγ ≤

1, we see that0 ≤ − E P ¯ a (cid:2) e − ρτ ε F (cid:0) Y ετ ε (cid:1)(cid:3) ≤ C E P ¯ a (cid:20) e − ρτ ε (cid:18) (cid:0) | log( ε ) | γ (cid:1) e ( r − ε ) γτ ε + e γ ( r − ε ) τ ε (cid:12)(cid:12)(cid:12)(cid:12) Z τ ε e − ( r − ε ) s d W ¯ as (cid:12)(cid:12)(cid:12)(cid:12) γ (cid:19)(cid:21) ≤ C e ρ log( ε ) ε (cid:18) (cid:0) ε − γ/ (cid:1) e ( r − ε ) γ − log( ε ) ε + e − γ ( r − ε ) log( ε ) ε E P ¯ a (cid:20)(cid:12)(cid:12)(cid:12)(cid:12) Z − log( ε ) ε e − ( r − ε ) s d W ¯ as (cid:12)(cid:12)(cid:12)(cid:12) γ (cid:21)(cid:19) + C E P ¯ a (cid:20) { T ε < ∞} e − ρτ ε (cid:18) (cid:0) ε − γ/ (cid:1) e ( r − ε ) γτ ε + e γ ( r − ε ) τ ε (cid:12)(cid:12)(cid:12)(cid:12) Z τ ε e − ( r − ε ) s d W ¯ as (cid:12)(cid:12)(cid:12)(cid:12) γ (cid:19)(cid:21) ≤ C e ρ log( ε ) ε (cid:18) (cid:0) ε − γ/ (cid:1) e − ( r − ε ) γ log( ε ) ε + e − γ ( r − ε ) log( ε ) ε (cid:16) − e r − ε ) log( ε ) ε (cid:17) γ (cid:21)(cid:19) + C (cid:0) ε − γ/ (cid:1) P ¯ a (cid:2) T ε < ∞ (cid:3) + C E P ¯ a (cid:20) { T ε < ∞} (cid:12)(cid:12)(cid:12)(cid:12) Z τ ε e − ( r − ε ) s d W ¯ as (cid:12)(cid:12)(cid:12)(cid:12) γ (cid:21) . (6.2)It can be checked directly that since δγ ≤

1, the ﬁrst term on the right–hand side of Equation (6.2) goes to0 as ε go to 0. By (6.1), the second term also goes to 0 as ε goes to 0. Finally, for the third one, we haveusing Cauchy–Schwarz inequality and Burkholder–Davis–Gundy’s inequality E P ¯ a (cid:20) { T ε < ∞} (cid:12)(cid:12)(cid:12)(cid:12) Z τ ε e − ( r − ε ) s d W ¯ as (cid:12)(cid:12)(cid:12)(cid:12) γ (cid:21) ≤ (cid:16) P ¯ a (cid:2) T ε < ∞ (cid:3)(cid:17) E P ¯ a (cid:20)(cid:12)(cid:12)(cid:12)(cid:12) Z τ ε e − ( r − ε ) s d W ¯ as (cid:12)(cid:12)(cid:12)(cid:12) γ (cid:21) ≤ C (cid:16) P ¯ a (cid:2) T ε < ∞ (cid:3)(cid:17) (cid:18) Z ∞ e − r − ε ) s d s (cid:19) γ ≤ C (cid:16) P ¯ a (cid:2) T ε < ∞ (cid:3)(cid:17) −→ ε → , Step . Notice also at this point that when δγ <

1, we can follow all the steps above but take instead π ε = 0 in the contract. Then, all the terms appearing still converge to 0 when ε goes to 0, and it is enoughin this case to conclude that lim ε → J P ( C ε , ¯ a ) = ¯ a . n the case δγ = 1, it remains to control the continuous payment term0 ≤ − E P ¯ a (cid:20) Z τ ε ρ e − ρt F (cid:0) εY εt (cid:1) d t (cid:21) ≤ − Z ∞ ρ e − ρt E P ¯ a h { ε | Y εt |≤ } F (cid:0) ε | Y εt | (cid:1)i d t + C Z ∞ ρ e − ρt E P ¯ a h { ε | Y εt | > } (cid:0) ε γ | Y εt | γ (cid:1)i d t ≤ − Z ∞ ρ e − ρt E P ¯ a h F (cid:0) ∧ | εY εt | (cid:1)i d t + Cε γ Z ∞ ρ e − ρt E P ¯ a (cid:2) | Y εt | γ (cid:3) d t + C Z ∞ ρ e − ρt P ¯ a (cid:2) ε | Y εt | ≥ (cid:3) d t. Notice next that we have that for any t ≥ ε γ | Y εt | γ ≤ C (cid:18) ε γ/ + ε γ e γ ( r − ε ) t + ε γ (cid:12)(cid:12)(cid:12)(cid:12) Z t e ( r − ε )( t − s ) d W ¯ as (cid:12)(cid:12)(cid:12)(cid:12) γ (cid:19) . Therefore, we have since γ > ≤ ε γ Z ∞ ρ e − ρt E P ¯ a (cid:2) | Y εt | γ (cid:3) d t ≤ ε γ/ + Cρ ε γ − γ + ρε γ Z ∞ e − γεt (cid:16) − e − r − ε ) t (cid:17) γ d t ≤ ε γ ( − log( ε )) γ + 2 ρ ε γ − γ −→ ε → . Finally, since for any t ≥ ε | Y ε | converges P ¯ a –a.s. to 0, it is immediate by dominated convergence that Z ∞ ρ e − ρt P ¯ a (cid:2) ε | Y εt | ≥ (cid:3) d t −→ ε → , and Z ∞ ρ e − ρt E P ¯ a h F (cid:0) ∧ | εY εt | (cid:1)i d t −→ ε → , which concludes the proof. This section prepares for the proof of the remaining main results of Section 3 by applying the dynamicprogramming approach to solve the mixed control–and–stopping problem (5.4)–(5.5).Notice that this problem is stationary in time due to the inﬁnite horizon feature, and the time homogeneityof the dynamics of Y . By standard stochastic control theory, together with Remark 5.1, the correspondingHJB equation is v (0) = 0 , and min (cid:8) v − F , L v (cid:9) = 0 , on (0 , ∞ ) , (7.1)where for any y > L v ( y ) := v − δyv ( y ) + F ? 
(cid:0) δv ( y ) (cid:1) − I (cid:0) v ( y ) , v ( y ) (cid:1) + = v ( y ) − F ( y ) − T F (cid:0) y, δv ( y ) (cid:1) − I (cid:0) v ( y ) , v ( y ) (cid:1) + , and the second order diﬀerential operator I is as introduced in Equation (3.2), and can be rewritten thanksto (5.1) as I ( p, q ) = ∞ { q> } + { q ≤ } sup z ≥ h (0) , ˆ a ∈ ˆ A ( z ) (cid:8) ˆ a + h (ˆ a ) p + ηz q (cid:9) , ( p, q ) ∈ R , (7.2)where T F ( y, p ) := yp − F ( y ) − F ? ( p ) , y ≥ , p ∈ R . Observe by deﬁnition that T F ( y, p ) ≥ , and T F (cid:0) y, F ( y ) (cid:1) = 0 , for all y ≥ . (7.3)Moreover, the face–lifted principal reward function F introduced in (2.5) satisﬁes F − F − T F (cid:0) · , δF (cid:1) = 0 , on R + , (7.4)see the proof of Proposition 2.1 in Appendix A. emark 7.1. (i) Notice that L F ≤ on R + . Indeed L F = F − F − T F ( y, δF ) − I ( F , F ) + = − I ( δF , δF ) + ≤ by (7.4) . (ii) Equation (7.1) is equivalent to v (0) = 0 , and L v = 0 , on (0 , ∞ ) , (7.5) which agrees exactly with Equation (3.3) used in the statement of

Theorem 3.6. Indeed, if $v$ is a solution of (7.1), then $Lv=0$ on $S^c$, where $S:=\{v=\bar F\}$ is the so–called stopping region, and $Lv=L\bar F\ge 0$ on $S$, which implies that $Lv=0$ on $\mathbb R_+$ by part (i) of the present remark.

Conversely, assuming that $Lv=0$, we see that $v=F+T_F(\cdot,\delta v')+I(v',v'')^+\ge F+T_F(\cdot,\delta v')$, and therefore $v$ is a supersolution of (7.4). By Lemma A.2, this implies that $v\ge\bar F$, and we conclude that $v$ solves Equation (7.1).

We next provide a verification argument which is the standard justification of the importance of the dynamic programming equation (7.1), and which guides the subsequent technical analysis to solve the contracting problem.
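The two properties in (7.3) can be checked directly on the quadratic specification $F(y)=-y^2$ of Example 3.3. The sketch below is only an illustration, and it rests on an assumption that is not restated in this section: we take $F^\star$ to be the concave conjugate $F^\star(p):=\inf_{y\ge 0}\{yp-F(y)\}$, which is the convention under which $T_F\ge 0$ holds by definition. For this $F$, one finds $F^\star(p)=-p^2/4$ for $p\le 0$ and $F^\star(p)=0$ otherwise, so that $T_F(y,p)=(y+p/2)^2$ for $p\le 0$. All function names are ours.

```python
import numpy as np


def F(y):
    """Principal's reward in Example 3.3: F(y) = -y^2 on y >= 0."""
    return -y ** 2


def F_prime(y):
    """Derivative of F."""
    return -2.0 * y


def F_star(p):
    """Assumed concave conjugate: inf over y >= 0 of y * p - F(y).

    The infimum of y * p + y^2 over y >= 0 is attained at y = max(-p / 2, 0).
    """
    y_min = np.maximum(-p / 2.0, 0.0)
    return y_min * p + y_min ** 2


def T(y, p):
    """T_F(y, p) := y p - F(y) - F_star(p), as defined before (7.3)."""
    return y * p - F(y) - F_star(p)


if __name__ == "__main__":
    y = np.linspace(0.0, 5.0, 201)
    p = np.linspace(-10.0, 3.0, 261)
    yy, pp = np.meshgrid(y, p)
    # First property in (7.3): T_F is non-negative on the whole grid.
    print("min of T_F on grid:", T(yy, pp).min())
    # Second property in (7.3): T_F vanishes along p = F'(y).
    print("max |T_F(y, F'(y))|:", np.max(np.abs(T(y, F_prime(y)))))
```

Both printed quantities should be zero up to floating–point rounding; the check says nothing about the face–lifted function $\bar F$ when $\delta\neq 1$.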

Proposition 7.2. (i)

Let v ∈ C ( R + ) be a super–solution of (7.1) , i.e. v (0) ≥ , and L v ≥ . Then v ≥ V on R + . (ii) Assume further that v (0) = 0 , L v = 0 on the continuation region S c := { v > F } , and that • for any y > , there exists a maximiser ˆ z ( y ) of I ( δv , δv )( y ) such that the SDE (5.6) , with, for any t ≥ , u ( π ?t ) := ( F ? ) (cid:0) δv ( Y t ) (cid:1) , Z ?t := ˆ z ( Y t ) , and ˆ a ?t ∈ ˆ A ( Z ?t ) , has a weak solution ; • deﬁning τ ? := inf n t : Y y,Z ? ,π ? t

6∈ S c o , the triplet ( τ ? , Z ? , π ? ) belongs to Z ( Y ) .Then v ( Y ) = V ( Y ) . (iii) If in addition v is ultimately decreasing, then the value function of the principal is V P = v ( Y ? ) , forsome Y ? ≥ u ( R ) with optimal contract ξ ? given by u ( ξ ? ) := Y ? + r Z τ ? Z ?t d X t + r Z τ ? (cid:0) Y t − h ? ( Z ?t ) − u ( π ? ) (cid:1) d t. Proof. (i) We ﬁrst prove that v ≥ V . For an arbitrary Y ≥

0, and ( τ, Z, π ) ∈ Z ( Y ) with correspondingˆ a ∈ ˆ A ( Z ), we introduce τ n := τ ∧ inf { t ≥ Y t ≥ n } , and we directly compute by It¯o’s formula that v ( Y ) = e − ρτ n v ( Y τ n ) − Z τ n e − ρt (cid:18) − ρv + ∂ t v + ( y + h (ˆ a t ) − u ( π t )) rv y + 12 σ r Z t v yy (cid:19) ( Y t )d t − Z τ n e − ρt v y ( Y t ) rZ t σ d W ˆ at ≥ e − ρτ n F ( Y τ n ) + Z τ n e − ρt (cid:0) L v ( Y t ) + ˆ a t − π t (cid:1) d t − Z τ n e − ρt v y ( Y t ) rZ t σ d W ˆ at ≥ e − ρτ n F ( Y τ n ) + Z τ n e − ρt (cid:0) ˆ a t − π t (cid:1) d t − Z τ n e − ρt v y ( Y t ) rZ t σ d W ˆ at . Since v y is bounded on [0 , τ n ] and Z satisﬁes (5.3), this implies that v ( Y ) ≥ E P ˆ a (cid:20) e − ρτ n F ( Y τ n ) + Z τ n e − ρt (cid:0) ˆ a t − π t (cid:1) d t (cid:21) −→ E P ˆ a (cid:20) e − ρτ F ( Y τ ) + Z τ e − ρt (cid:0) ˆ a t − π t (cid:1) d t (cid:21) , as n −→ ∞ , where the last convergence follows from the fact that (cid:12)(cid:12) e − ρτ n F ( Y τ n ) (cid:12)(cid:12) ≤ C (cid:0) − ρτ n Y γτ n (cid:1) ≤ C (cid:18) ≤ t ≤ τ (cid:0) e − ργ t Y t (cid:1) γ (cid:19) , by the estimate stated in Proposition 2.1, together with the integrability conditions on π in (2.11) and on Y in (5.3). By the arbitrariness of ( τ, Z, π ) ∈ Z ( Y ), this shows that v ( Y ) ≥ V ( Y ). o prove (ii), we now repeat the previous argument starting from the control ( τ ? , Z ? , π ? ) introduced inthe statement, and denoting Y ? the induced controlled state process. As Z ?t and u ( π ?t ) are maximisers of I ( δv , δv )( Y ?t ) and F ? ( δv ( Y ?t )), respectively, we see that for any ˆ a ? ∈ ˆ A ( Z ? ) v ( Y ) = E P ˆ a? (cid:20) e − ρτ n v ( Y ?τ ?n ) + Z τ ?n e − ρt (cid:0) ˆ α ?t − π ?t (cid:1) d t (cid:21) −→ n →∞ E P ˆ a? (cid:20) e − ρτ ? v (cid:0) Y ?τ ? (cid:1) + Z τ ? e − ρt (cid:0) ˆ a ?t − π ?t (cid:1) d t (cid:21) = E P ˆ a? (cid:20) e − ρτ ? F (cid:0) Y ?τ ? (cid:1) + Z τ ? 
e − ρt (cid:0) ˆ a ?t − π ?t (cid:1) d t (cid:21) , since v = F on the boundary of S .(iii) Finally, v is concave by Remark 7.1 (iii). As it is assumed to be ultimately decreasing, the existenceof a maximiser Y ? of v ( y ) on [ u ( R ) , ∞ ) follows, and we obtain that V P = sup Y ≥ u ( R ) v ( y ) = v ( Y ? ). This section reports the proof of Proposition 3.1 by analysing the action of the operator L on the face–liftedprincipal’s reward F . Indeed, if there is a Golden Parachute, then the value function of the principal coincideswith F on [ y gp , ∞ ), and we must therefore have L F ≥ L F = 0 on [ y gp , ∞ ). By deﬁnition of F , this means that we must have I (cid:0) F ( y ) , F ( y ) (cid:1) + = 0for any large enough y , and that I (cid:0) F ( y ) , F ( y ) (cid:1) > y in a set of non–empty interior. Hence theﬁrst part of the statement. The equivalence with the condition written in terms of F ? can be obtained byevaluating I (cid:0) F ( y ) , F ( y ) (cid:1) + at the point y = (cid:0) F (cid:1) − ( p ), and by computing that F ( y ) = 1 / ( F ? ) ( p ).Consequently, we now justify the suﬃcient conditions of the proposition by verifying some cases where F either solves Equation (7.1) on the whole R + , or nowhere. Lemma 8.1.

Let β := h (0) . We have (i) L F ( y ) = 0 for some y > , if and only if I ( F , F )( y ) + = 0;(ii) if β > , then L F = 0 on [ y , ∞ ) , for some y ≤ (cid:0) F (cid:1) − (cid:16) F (0) ∧ − βδ (cid:17) ;(iii) if β = 0 , and A ⊃ [0 , ¯ a ] for some ¯ a > , then L F < on (0 , ∞ ) .Proof. (i) follows immediately from the deﬁnition of F . To prove (ii), recall from Proposition 2.1 that F isdecreasing and strictly concave on [0 , ∞ ), implying that0 ≤ I (cid:0) δF , δF (cid:1) + ≤ sup z ∈ R , ˆ a ∈ ˆ A ( z ) (cid:8) ˆ a + h (ˆ a ) δF } ≤ sup z ∈ R , ˆ a ∈ ˆ A ( z ) (cid:8) ˆ a (1 + βδF ) } , where the last inequality is a consequence of the convexity of h , which implies that h (ˆ a ) ≥ h (0)+ h (0)ˆ a = β ˆ a .Now, observe that y := ( F ) − (cid:0) F (0) ∧ − βδ (cid:1) is such that 1 + βδF ≤ y , ∞ ). Then, since ˆ A (0) = { } ,we deduce that I ( F , F ) + = 0 on [ y , ∞ ).(iii) As A contains an interval, h is strictly convex, F is concave, and h (0) = 0, we have that I (cid:0) F ( y ) , F ( y ) (cid:1) + = sup z ≥ , ˆ a ∈ ˆ A ( z ) (cid:8) ˆ a + h (ˆ a ) δF ( y ) + ηz δF ( y ) (cid:9) ≥ sup z ≥ , ˆ a ∈ ˆ A ( z ) ⊂ [0 , ¯ a ] (cid:8) ˆ a + h (ˆ a ) δF ( y ) + ηz δF ( y ) (cid:9) = sup z ≥ n ( h ) − ( z ) ∧ ¯ a + h (cid:0) ( h ) − ( z ) ∧ ¯ a (cid:1) δF ( y ) + ηz δF ( y ) o = sup a ∈ [0 , ¯ a ] (cid:8) a + h ( a ) δF ( y ) + η (cid:0) h ( a ) (cid:1) δF ( y ) (cid:9) , y > . Now notice that since h (0) = 0, the derivative at a = 0 of the map inside the supremum above is equal to1 >

0. Therefore, this map is increasing on a right–neighbourhood of 0, and thus for any y >

0, we have I (cid:0) F ( y ) , F ( y ) (cid:1) + > he next result complements Lemma 8.1.(ii) by exploring the regions where L F = 0 may hold underadditional conditions on either F or h . Let φ ( y ) := sup z ≥ β, ˆ a ∈ ˆ A ( z ) (cid:8) ˆ a + h (ˆ a ) δF ( y ) + ηz δF ( y ) (cid:9) , y ≥ , so that I ( F , F ) + = φ + , and notice that φ is non–increasing, whenever F is. Lemma 8.2.

Let β := h (0) > . Then the following holds (i) if F is non–increasing, then (cid:8) L F = 0 (cid:9) = { I ( F , F ) + = 0 (cid:9) = [ y , ∞ ) , where y := inf (cid:8) y ≥ φ ( y ) ≤ (cid:9) < ∞ . In particular, L F = 0 on R + if and only if I (cid:0) F (0) , F (0) (cid:1) + = 0;(ii) if A = [0 , ¯ a ] for some ¯ a > , and h ∈ C with inf a ∈ A ( (cid:0) ( h ) (cid:1) ( a ) h ( a ) ) > , and F (0) + 2 ηF (0) h (0) < − βδ , (8.1) then I ( F , F ) + = 0 , on [0 , y ] ∪ [ y , ∞ ) , for some y > , and y ≤ ( F ) − (cid:18) F (0) ∧ − βδ (cid:19) ;(iii) In the context of (NGP3) we have I (cid:0) δF , δF (cid:1) = 0 , on (0 , ∞ ) . Proof. (i) Since I +0 ≥ L F = I (cid:0) F , F (cid:1) + as in the previous proof, the ﬁrst part of (i) followsimmediately, and we see that y < ∞ by Lemma 8.1.(iii). Next, we just observe that φ (0) ≤ a + h (ˆ a ) δF (0) + ηz δF (0) ≤ , for all z ≥ β and ˆ a ∈ ˆ A ( z ), which provides the required condition giventhat F (0) ≤ y is direct as in (i). Next, under our assumption on A , we have directly that I (cid:0) F ( y ) , F ( y ) (cid:1) + = sup a ∈ A n a + h ( a ) δF ( y ) + η (cid:0) h ( a ) (cid:1) δF ( y ) | {z } =: ψ ( a,y ) o + , y > . Notice that ∂ aa ψ ( a, y ) = h ( a ) δF ( y ) + η (cid:0) ( h ) (cid:1) ( a ) δF ( y ) , ( a, y ) ∈ A × [0 , ∞ ) , so that the ﬁrst condition in Equation (8.1) implies that sup a ∈ A ∂ aa ψ ( a, <

$0$, and therefore, by continuity, $\sup_{a\in A}\partial_{aa}\psi(a,y)\le 0$ for all $y\in[0,y_1]$, for some $y_1>0$. This shows that $\psi(\cdot,y)$ is concave in $a$, for $y$ in this interval. We next compute that, reducing $y_1$ if necessary,
$$\partial_a\psi(0,y) = 1+\beta\delta\Big(\bar F'(y)+2\eta \bar F''(y)h''(0)\Big)\le 0,\quad y\in[0,y_1],$$
by the second condition in Equation (8.1) and the continuity of $\bar F'$ and $\bar F''$ at $0$. Hence, for any $y\in[0,y_1]$, the function $a\longmapsto\psi(a,y)$ is non–increasing, concave, and thus attains its maximum at $a=0$, implying that $I\big(\bar F'(y),\bar F''(y)\big)^+=\psi(0,y)^+=0$.

(iii) This is very similar to (ii). Simply notice that now the first condition in (NGP3) implies that for any $y\ge 0$, the map $A\ni a\longmapsto\psi(a,y)$ is concave, while the second condition in (NGP3) implies that for any $y\ge 0$, $\partial_a\psi(0,y)\le 0$, and thus that $\psi$ is non–increasing in $a$ for any $y\ge 0$, and therefore the desired result.
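The first–order criterion used in the proof above can be illustrated numerically. The sketch below is not part of the paper: it assumes a quadratic cost $h(a)=ha^2/2+\beta a$ on $A=[0,\bar a]$, and replaces $\bar F'(y)$ and $\bar F''(y)$ by hypothetical negative numbers `p` and `q` (all parameter values are placeholders). It evaluates the map $a\longmapsto a+h(a)\delta \bar F'(y)+\eta\big(h'(a)\big)^2\delta \bar F''(y)$ on a grid and compares the positive part of its supremum with the sign of its slope at $a=0$.

```python
# Illustrative sketch only: p and q stand in for F'(y) and F''(y), and every
# numerical value is a hypothetical placeholder, not taken from the paper.

def psi_map(a, p, q, delta, eta, h, beta):
    """psi(a, y) = a + h(a)*delta*p + eta*(h'(a))**2*delta*q."""
    cost = 0.5 * h * a * a + beta * a      # assumed quadratic cost h(a)
    marginal_cost = h * a + beta           # its derivative h'(a)
    return a + cost * delta * p + eta * marginal_cost ** 2 * delta * q


def slope_at_zero(p, q, delta, eta, h, beta):
    """d/da psi(0, y) = 1 + beta*delta*(p + 2*eta*q*h''(0)), where h''(0) = h here."""
    return 1.0 + beta * delta * (p + 2.0 * eta * q * h)


def sup_positive_part(p, q, delta, eta, h, beta, a_bar=2.0, n=20_001):
    """Grid approximation of the positive part of sup_{a in [0, a_bar]} psi(a, y)."""
    best = max(psi_map(a_bar * i / (n - 1), p, q, delta, eta, h, beta)
               for i in range(n))
    return max(best, 0.0)


if __name__ == "__main__":
    params = dict(p=-1.0, q=-0.5, delta=1.0, eta=0.5, h=1.0)
    for beta in (2.0, 0.2):
        print(beta, slope_at_zero(beta=beta, **params),
              sup_positive_part(beta=beta, **params))
```

With these placeholder values, a large marginal cost at zero ($\beta=2$) gives a non–positive slope and a vanishing positive part, whereas a small one ($\beta=0.2$) gives a positive slope and a strictly positive supremum. Since `p` and `q` are negative, the map is concave in $a$ for this quadratic cost, which is what makes the slope at zero decisive.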

Throughout this section, Assumption 3.5 is in force. We start with proving the strict concavity of v . Lemma 9.1.

Any continuous solution of

Equation (7.1) is strictly concave. roof. To prove concavity, suppose to the contrary that v is strictly convex on some non–empty openinterval ( y , y ) ⊂ R + , then we would have that − v < y , y ), and thus that − I ( v , v ) + = −∞ on ( y , y ) (still in the viscosity sense), contradicting the fact that v is a continuousviscosity solution of Equation (7.1).The strict concavity follows the same line of argument as in [54]. Suppose to the contrary that v ( y ) = b + by for y in some interval [ y , y ] ⊂ R + , then b + (1 − δ ) by + F ? ( δb ) − I ( b, + = 0 , y ∈ [ y , y ] . If δ = 1, this implies that b = 0, and therefore b = I (0 , + = ¯ a >

0. In particular I +0 = I .We next argue that this ODE is in addition uniformly elliptic. This is immediate when β >

0. For β = 0,we have [0 , ¯ a ] ⊂ A by Assumption 3.5, and I (cid:0) δv , δv (cid:1) + ≥ sup z ≥ , ˆ a ∈ ˆ A ( z ) ⊂ [0 , ¯ a ] (cid:8) ˆ a + δh (ˆ a ) v + δηz v ( y ) (cid:9) = sup a ∈ [0 , ¯ a ] (cid:8) a + δh ( a ) v ( y ) + δη (cid:0) h ( a ) (cid:1) v ( y ) | {z } =:Φ( a,y ) (cid:9) , and Φ(0 , y ) = 0, ∂ a Φ(0 , y ) = 1, for any y >

0. Then, for any compact subset of (0 , ∞ ), the supremum in I (cid:0) δv , δv (cid:1) + is attained on [ ε, ¯ a ] for some ε >

0, independent of y (but of course depending on the chosencompact set). Hence, the ODE can always be written in explicit form on any compact subset of (0 , ∞ ).Consequently, the standard Cauchy–Lipschitz existence and uniqueness theory applies. By uniquenessof the solution of Equation (7.1) with boundary condition v ( y ) = b and v ( y ) = 0, we deduce that v = b on [0 , y ], contradicting the boundary condition v (0) = 0. If δ = 1, we also see by the same argument that v ( y ) = b + by on [0 , y ], so that v (0) = 0 implies that b = 0, and we get F ? ( δb ) − I ( δb, + = 0 andtherefore F ? ( δb ) = I ( δb, + = 0, which again cannot happen. Remark 9.2. By Lemma 9.1 , it is natural to introduce the concave dual function v ? ( p ) := inf y ≥ { yp − v ( y ) } , p ∈ R . Then, if in addition v is a C solution of the dynamic programming equation, v ? solves the equation L ? v ? ( p ) := v ? ( p ) − F ? ( δp ) + ( δ − p ( v ? ) ( p ) + I (cid:18) p, v ? ) ( p ) (cid:19) + = 0 , p ∈ R . (9.1) This follows by evaluating (7.5) at the point y = ( v ) − ( p ) and by computing that v ( y ) = 1 / ( v ? ) ( p ) . Lemma 9.3.

There is a unique solution v of Equation (7.1) , such that ≤ ( v − F )( y ) ≤ C log(1+log(1+ y )) , y ≥ , for some C > . Besides, v is strictly concave, ultimately decreasing, and belongs to C ( R + ) . Proof.

By Remark 7.1 and Lemma B.1 below, F and F b are respectively sub–solution and super–solutionof (7.1), for b large enough. Lemma B.3 below shows that this equation satisﬁes comparison betweensub–solutions and super–solutions lying between F and F b . Then, since F (0) = F b (0) = 0, we deduce fromPerron’s existence result, see Crandall, Ishii, and Lions [13, Theorem 4.1], that Equation (7.1) has a viscositysolution v b lying between F and F b . Using Lemma B.3, this solution must be unique, and thus does notdepend on b , so that we can denote it by v .Recall that v is strictly concave by Lemma 9.1. Since it is below F b , it has to be ultimately decreasing,as F b is. In addition, v is diﬀerentiable Lebesgue–a.e., and we may deﬁne the measurable set I := (cid:8) y ≥ v ( y ) − F ( y ) − T F (cid:0) y, δv ( y ) (cid:1) = 0 (cid:9) . For almost–every y ∈ I , we have by using the inequality v − F ≥ y since it holdsin the viscosity sense and both v and F are continuous) and the deﬁnitions of F ? and T F ? (cid:0) δv ( y ) (cid:1) = δyv ( y ) − v ( y ) ≤ δyv ( y ) − F ( y ) ≤ F ? (cid:0) δv ( y ) (cid:1) . Such a transformation can also be conducted if the solution is expressed in the sense of viscosity solutions (as it will be neededlater), but one has to be careful as strict convexity is not suﬃcient, see Alvarez, Lasry, and Lions [3, Proposition 5] and the remarkafter its proof. herefore, the above inequalities must be equalities almost–everywhere on I , which in particular impliesthat v = F , a.e. on I , and therefore everywhere on I since v and F are continuous. We also automaticallyhave that v is C on I .On the other hand, we have on R + \ I that I (cid:0) v , v (cid:1) + = I (cid:0) v , v (cid:1) >

0, almost–everywhere. Arguyingas in the proof of Lemma 9.1, we see that v is a viscosity solution of a (locally) uniformly elliptic ODE v = L ( y, v, v ) , on R + \ I, for some locally Lipschitz nonlinearity L . Consequently, since R + \ I = (cid:0) v − F (cid:1) − ((0 , ∞ )) is an open set bycontinuity of v and F , the standard Cauchy–Lipschitz theorem then shows that v must also be smooth on R + \ I . Finally, any point y on the boundary ∂I ∩ (0 , ∞ ) is a minimiser of the diﬀerence v − F . As such • we have by the ﬁrst order condition that v ( y − ) − F ( y ) ≤ ≤ v ( y +) − F ( y ), implying by concavitythat v is diﬀerentiable at y and v ( y ) = F ( y ); • we also have v ( y ) ≥ F ( y ), implying by continuity that0 = L v ( y ) = − I (cid:0) F ( y ) , v ( y ) (cid:1) ≤ − I (cid:0) F ( y ) , F ( y ) (cid:1) ≤ , which implies that v ( y ) = F (0), and I (cid:0) F ( y ) , F ( y ) (cid:1) + = 0. Hence, y ∈ (cid:8) I (cid:0) F , F (cid:1) + = 0 (cid:9) . Lemma 9.4.

Let β > , assume that F and I are analytic, and let v be the solution from Lemma 9.3 .Then (cid:8) v = F (cid:9) ∩ (0 , ∞ ) is a possibly empty interval unbounded to the right.Proof. (i) Suppose v = F on some interval [ y , y ], and let us show that we must have v = F on [ y , ∞ ),which implies in particular that I (cid:0) F , F ) + = 0 on [ y , ∞ ), and consequently I ∩ (0 , ∞ ) has the claimedform, where the set I is introduced in the proof of Lemma 9.3.To see this, suppose to the contrary that L F < y , y ) to the right of y , anddenote φ ( y ) := (cid:0) v − F (cid:1) ( y ) − T F ( y, δv ( y )). Clearly, φ ≥ R + and φ = 0 on [ y , y ], so that0 = (cid:0) v − F (cid:1) ( y ) = min y ∈ ( y ,y ) (cid:0) v − F (cid:1) ( y ) , and 0 = φ ( y ) = min y ∈ ( y ,y ) φ ( y ) . (9.2)Next, reducing y > y if necessary, we have that on ( y , y ), both v and F are strictly concave anddecreasing, and we claim that the supremum in I ( δv , δv ) + cannot be attained on a right–neighbourhoodof β . Indeed, otherwise I ( δv , δv ) + would be equal to 0 on some interval at the right of y , since we have I ( v , v ) = sup ˆ a ∈ ˆ A ( β ) (cid:8) ˆ a + δv h (ˆ a ) + δηv β (cid:9) = δηv β < , so that, by continuity, this supremum remains negative on a right–neighbourhood of y , and I ( v , v ) + = I ( v , v ) + = 0. Consequently, v and F are both solutions of the ODE w − δyw + F ? ( δw ) = 0 withsame boundary condition at y , implying that v = F on some interval at the right of y , and thereforecontradicting the deﬁnition of y as a (right) extreme point of (cid:8) v = F (cid:9) .Moreover, since A is bounded, the supremum over z must be attained on a compact set, meaning thatfor some β min > β and some ﬁnite β max > β min I ( v , v ) + = I ( v , v ) = max z ∈ [ β min ,β max ] , ˆ a ∈ ˆ A ( z ) (cid:8) ˆ a + δv h (ˆ a ) + δηz v (cid:9) . 
By our assumptions on F and I , the function v solves on R + \ I an explicit ODE with analytic non–linearity.Indeed, I is invertible and does not take the value 0, implying that its reciprocal function is still analytic.By Cauchy–Kowaleski’s theorem, we deduce that v is also analytic on [ y , y ), and therefore C ∞ . Using allthe above results, we have that on ( y , y ) I (cid:0) F , F (cid:1) + > − L v = I ( v , v ) + − φ ≥ I ( F , F ) + − φ + δ (cid:16) ηβ (cid:0) v − F (cid:1) + − ηβ (cid:0) v − F (cid:1) − + h ( a ) (cid:0) v − F (cid:1) + (cid:17) − δh ( a ) (cid:0) v − F (cid:1) − . enoting c := ( ηβ ) ∧ h ( a ) > c := ( ηβ ) ∨ h ( a ) >

0, we have then δc (cid:2) ( v − F ) + + (cid:0) v − F (cid:1) + (cid:3) − δc (cid:2) ( v − F ) − + (cid:0) v − F (cid:1) − (cid:3) < φ, on ( y , y ) . (9.3)As v ( y ) = F ( y ), v ( y +) ≥ F ( y ) and φ ( y ) = 0, by (9.2), these inequalities imply by sending y & y ,that ( v − F )( y ) = 0. This implies in turn that φ is diﬀerentiable at y , and, since v and F are C ∞ onthe right of y , and y is a local minimum of v − F (cid:0) v − F (cid:1) ( y +) ≥ , and φ ( y ) = − v ( y ) (cid:0) y − ( F ) − ◦ v ( y ) (cid:1) = 0 , Then, dividing (9.3) by ( y − y ) and sending y & y , we get ( v − F )( y +) = 0, and we deduce that φ istwice diﬀerentiable at y , and (cid:0) v − F (cid:1) ( y +) ≥ , and φ ( y ) = 0 . Direct iteration of this argument shows that (cid:0) v − F (cid:1) is inﬁnitely diﬀerentiable at the point y , with 0derivative of any order. Since it is an analytic function on a right–neighbourhood on y , we deduce that v − F = 0 on a right–neighbourhood of y , which contradicts the deﬁnition of y .Together with the previous lemmatas, the following result concludes the proof of Theorem 3.6.(i)–(ii)–(iii). Lemma 9.5.

Let v be the solution constructed in Lemma 9.3 , and deﬁne S := (cid:8) v = F (cid:9) . Then (i) we have v (0) ≥ , and whenever F (0) = 0 , we have v (0) > if and only if I (cid:0) , δF (0) (cid:1) > if β = 0 , then S = { } .Proof. (i) By continuity, we have0 = L v (0) = F ? (cid:0) δv (0) (cid:1) − I (cid:0) δv (0) , δv (0) (cid:1) . Since F ? ≤ I ≥

0, it follows that F ? (cid:0) δv (0) (cid:1) = 0, and consequently v (0) ≥

0, since F ? < −∞ , I ( p, q ) is non–decreasing in p , we deduce that 0 = I (cid:0) δv (0) , δv (0) (cid:1) ≥ I (cid:0) , δv (0) (cid:1) . However, underour assumptions, I (cid:0) , δF (0) (cid:1) >

0. Then it follows from the non–decrease of I ( p, q ) in q that v (0) < F (0).Consequently, we have v ≥ F , v (0) = F (0), v (0) ≥ F (0) = 0 and v (0) < F (0) ≤

0, implying that v (0) >

0, as required.(ii) By Lemma 8.1.(iii), we know that F never solves the ODE, and we claim that this implies that v > F on (0 , ∞ ). Indeed, notice that any contact point y of v and F is a local minimiser of the diﬀerence v − F ,so that v = F at such a point. Then, as I ≥

0, it follows from Equation (7.1) that T F (cid:0) y , δF ( y ) (cid:1) = 0which cannot happen unless y = 0.We ﬁnally prove Theorem 3.6.(iv) by using the veriﬁcation result provided in Proposition 7.2, in order toshow that one can identify the value function of the principal with the function v constructed in Theorem 3.6. Proof of

Theorem 3.6.(iv). The existence of $\hat y$ is immediate by the strict concavity of $v$, and the fact that it is ultimately decreasing. Then, the rest of the proof simply requires checking that the assumptions in Proposition 7.2 are satisfied here. First of all, notice that the map $\hat z$ is bounded from above since $A$ is compact, and from below by $\beta$, and it is continuous on $S^c$ because $v$ is $C^2$ there. Similarly, the map $\hat\pi$ is bounded on $S^c$, from below by $0$ and from above as well because $S^c$ is a bounded set under our assumptions. The existence of a unique weak solution for $\widehat Y$ is then direct from Stroock and Varadhan [64, Corollary 6.4.4] (the drift of $\widehat Y$ is not bounded as required in [64, Corollary 6.4.4], because of the term $r\widehat Y$; however, it suffices to apply the result to $\big(\mathrm{e}^{rt}\widehat Y_t\big)_{t\ge 0}$). Notice in addition that $\widehat Y$ has moments of any order under $\mathbb{P}^0$ (and thus under any $\mathbb{P}^\alpha$, $\alpha\in\mathcal{A}$, recall that $A$ is compact). It remains to verify that $\hat\tau$ satisfies (2.11). However, $\widehat Y$ is a one–dimensional Markov process for which the boundaries $0$ and $y_{\rm gp}$ are regular and accessible, it is therefore well–known that $\hat\tau$ is finite with probability $1$. Since $A$ is compact, the densities $\mathrm{d}\mathbb{P}^\alpha/\mathrm{d}\mathbb{P}^0$ all have moments of any order, uniformly in $\alpha\in\mathcal{A}$, from which it is immediate that (2.11) holds.

In this section, we explore whether adding risk–aversion for the principal fundamentally modifies the results for the existence of a Golden Parachute. As such, we consider a variation which mixes both Sannikov's model [54] and Holmström and Milgrom's [34]. There is a finite horizon

$T>0$, the contracts only stipulate a lump–sum payment at $\tau$ (meaning that $\pi$ is always $0$). Moreover, the agent also has a CARA utility and, abusing notations slightly, for any admissible contract $\mathbf{C}:=(\tau,\xi)$, we have
$$V^{\rm A}(\mathbf{C}):=\sup_{\alpha\in\mathcal{A}}J^{\rm A}(\mathbf{C},\alpha),\quad\text{where}\quad J^{\rm A}(\mathbf{C},\alpha):=\mathbb{E}^{\mathbb{P}^\alpha}\bigg[U_{\rm A}\bigg(\xi-\int_0^\tau h(\alpha_s)\,\mathrm{d}s\bigg)\bigg],$$
where $U_{\rm A}(x):=-\mathrm{e}^{-\psi x}$, $x\in\mathbb{R}$, for some $\psi>0$. Using again the general approaches from Cvitanić, Possamaï, and Touzi [19] and Lin, Ren, Touzi, and Yang [41], we can show that the lump–sum payment $\xi$ at time $\tau$ takes the form
$$\xi=Y^{Y_0,Z}_\tau=Y_0+\int_0^\tau Z_t\,\mathrm{d}X_t-\int_0^\tau\Big(h^\star(Z_t)-\frac{1}{2}\psi\sigma^2Z_t^2\Big)\mathrm{d}t,$$
where this time $Y^{Y_0,Z}$ should be interpreted as the certainty equivalent of the agent. In turn, the principal's problem now boils down to
$$V^{\rm P}=\sup_{Y_0\ge R}\ \sup_{(\tau,Z,\hat a)\in\mathcal{Z}^0(Y_0)\times\hat{\mathcal{A}}(Z)}\mathbb{E}^{\mathbb{P}^{\hat a}}\Big[U_{\rm P}\Big(X_{\tau\wedge T}-Y^{Y_0,Z}_{\tau\wedge T}\Big)\Big],\quad(10.1)$$
where $U_{\rm P}(x):=-\mathrm{e}^{-\eta x}$, $x\in\mathbb{R}$, for some $\eta>0$, and $\mathcal{Z}^0(Y_0)$ is a proper reformulation of $\mathcal{Z}(Y_0)$ in this context.

In the present context, a Golden Parachute is a situation where the optimal retirement time chosen by the principal lies in $(0,T)$ with positive probability. As $V^{\rm P}(t,x,y)=U_{\rm P}(x-y)$ upon retirement, the existence of a Golden Parachute is reduced to the non–emptiness of the stopping region before maturity $\big\{(t,x,y):t<T\text{ and }V^{\rm P}(t,x,y)=U_{\rm P}(x-y)\big\}$.

Similar to Section 8, we shall explore the potential existence of a Golden Parachute by analysing the action of the dynamic programming operator on the obstacle $U_{\rm P}(x-y)$. In the present context, the dynamic programming equation corresponding to the reduced principal problem (10.1) is given by
$$\min\Big\{v-U_{\rm P}(x-y);\ -\partial_tv-\frac{1}{2}\sigma^2v_{xx}-M\big(v_x,v_y,v_{yy},v_{xy}\big)\Big\}=0,\quad\text{on }[0,T)\times\mathbb{R}\times(0,\infty),$$
with boundary condition $v(t,x,0)=U_{\rm P}(x)$ and terminal condition $v(T,x,y)=U_{\rm P}(x-y)$ for $(x,y)\in\mathbb{R}\times[0,\infty)$, where
$$M(q_1,q_2,\gamma_1,\gamma_2):=\sup_{(z,\hat a)\in\mathbb{R}\times\hat A(z)}\bigg\{\hat a(z)q_1+\Big(\frac{1}{2}\sigma^2\psi z^2+h\big(\hat a(z)\big)\Big)q_2+\frac{1}{2}\sigma^2z^2\gamma_1+\sigma^2z\gamma_2\bigg\}.$$
For the sake of clarity, the following result focuses on the case where the agent's cost of effort is quadratic.
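To make the operator $M$ concrete, here is a minimal sketch, assuming the quadratic cost $h(a)=ha^2/2+\beta a$ on $A=[0,\infty)$ used in the result that follows. Under that assumption the agent's best response to a sensitivity $z$ is $\hat a(z)=(z-\beta)^+/h$, the maximiser of $a\longmapsto az-h(a)$. The function names and numerical values are illustrative and do not come from the paper.

```python
# Illustrative sketch: function names and parameter values are hypothetical.

def cost(a, h, beta):
    """Assumed quadratic cost of effort h(a) = h*a^2/2 + beta*a."""
    return 0.5 * h * a * a + beta * a


def best_response(z, h, beta):
    """Closed-form maximiser of a*z - cost(a) over a >= 0."""
    return max(z - beta, 0.0) / h


def best_response_grid(z, h, beta, a_max=5.0, n=50_001):
    """Same maximiser, found by brute force on a grid (sanity check)."""
    grid = [a_max * i / (n - 1) for i in range(n)]
    return max(grid, key=lambda a: a * z - cost(a, h, beta))


def hamiltonian_term(z, q1, q2, g1, g2, sigma, psi, h, beta):
    """Expression inside the supremum defining M, at the agent's best response."""
    a = best_response(z, h, beta)
    return (a * q1
            + (0.5 * sigma ** 2 * psi * z ** 2 + cost(a, h, beta)) * q2
            + 0.5 * sigma ** 2 * z ** 2 * g1
            + sigma ** 2 * z * g2)


if __name__ == "__main__":
    print(best_response(1.0, 1.0, 0.2), best_response_grid(1.0, 1.0, 0.2))
    print(hamiltonian_term(1.0, 1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0, 0.2))
```

Maximising `hamiltonian_term` over `z` then gives a numerical value of $M$ for any given arguments $(q_1,q_2,\gamma_1,\gamma_2)$.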

Footnote to the approaches of [19] and [41] mentioned above: see also Cvitanić, Possamaï, and Touzi [18], Aïd, Possamaï, and Touzi [2], Élie and Possamaï [23], Élie, Mastrolia, and Possamaï [25], or Élie, Hubert, Mastrolia, and Possamaï [24] for models with CARA utilities using this approach.

Lemma 10.1. In the present setting, assume that $A=[0,\infty)$, and $h(a)=ha^2/2+\beta a$, $a\ge 0$, for some $h>0$ and $\beta\ge 0$. A necessary condition for a Golden Parachute to exist is
$$\beta\ge\underline{\beta}:=\Bigg(1-\sqrt{\frac{\sigma^2h\psi(1+\sigma^2h\eta)}{1+\sigma^2h(\psi+\eta)}}\Bigg)^+.$$

Proof. With the choice of $A$ and $h$ in the statement of the lemma, we have that when $\sigma^2h(\psi q_2+\gamma_1)+q_2<0$, and $\psi q_2+\gamma_1<0$,
$$M(q_1,q_2,\gamma_1,\gamma_2)=\frac{1}{2h}\max\bigg\{\sup_{z\ge\beta}\Big\{\big(\sigma^2h(\psi q_2+\gamma_1)+q_2\big)z^2+2\big(\sigma^2h\gamma_2+q_1\big)z-\beta\big(2q_1+\beta q_2\big)\Big\},\ \sup_{z<\beta}\Big\{\sigma^2h\big(\psi q_2+\gamma_1\big)z^2+2\sigma^2h\gamma_2z\Big\}\bigg\}.$$
We now evaluate $M$ along the appropriate derivatives of the map $(x,y)\longmapsto U_{\rm P}(x-y)$, which corresponds to the substitutions
$$q_1\longleftrightarrow-\eta U_{\rm P}(x-y),\quad q_2\longleftrightarrow\eta U_{\rm P}(x-y),\quad\gamma_1\longleftrightarrow\eta^2U_{\rm P}(x-y),\quad\gamma_2\longleftrightarrow-\eta^2U_{\rm P}(x-y).$$
Defining $\overline{M}(x,y):=M\big(-\eta U_{\rm P}(x-y),\eta U_{\rm P}(x-y),\eta^2U_{\rm P}(x-y),-\eta^2U_{\rm P}(x-y)\big)$, we compute directly that for $\psi>0$,
$$\frac{\eta}{\psi+\eta}<\frac{\sigma^2h\eta+1}{\sigma^2h(\psi+\eta)+1},$$
from which we have
$$\overline{M}(x,y)=-\frac{\eta U_{\rm P}(x-y)}{2h}\begin{cases}\dfrac{(\sigma^2h\eta+1)^2}{\sigma^2h(\psi+\eta)+1}+\beta(\beta-2),&\text{if }\dfrac{\eta}{\psi+\eta}>\beta,\\[2mm]\dfrac{\sigma^2h\eta^2}{\psi+\eta},&\text{if }\dfrac{\sigma^2h\eta+1}{\sigma^2h(\psi+\eta)+1}<\beta.\end{cases}$$
In the intermediary case $\frac{\eta}{\psi+\eta}\le\beta\le\frac{\sigma^2h\eta+1}{\sigma^2h(\psi+\eta)+1}$, to find the maximum in the expression of $\overline{M}$, we need to solve the inequality
$$\frac{(\sigma^2h\eta+1)^2}{\sigma^2h(\psi+\eta)+1}+\beta(\beta-2)\ge\frac{\sigma^2h\eta^2}{\psi+\eta}\iff\beta^2-2\beta+\frac{\sigma^2h\eta^2+2\sigma^2h\eta\psi+\eta+\psi}{(\psi+\eta)\big(1+\sigma^2h(\psi+\eta)\big)}\ge 0.$$
It is straightforward to check that the second–order polynomial in $\beta$ above has two positive roots, only the smallest of which belongs to $\big[\frac{\eta}{\psi+\eta},\frac{\sigma^2h\eta+1}{\sigma^2h(\psi+\eta)+1}\big]$, and therefore that for $\frac{\eta}{\psi+\eta}\le\beta\le\frac{\sigma^2h\eta+1}{\sigma^2h(\psi+\eta)+1}$,
$$\beta^2-2\beta+\frac{\sigma^2h\eta^2+2\sigma^2h\eta\psi+\eta+\psi}{(\psi+\eta)\big(1+\sigma^2h(\psi+\eta)\big)}\ge 0\iff\beta\le 1-\psi\sigma\sqrt{\frac{h}{(\psi+\eta)\big(1+\sigma^2h(\psi+\eta)\big)}}=:\bar\beta.$$
Overall, we thus have
$$\overline{M}(x,y)=-\frac{\eta U_{\rm P}(x-y)}{2h}\begin{cases}\dfrac{(\sigma^2h\eta+1)^2}{\sigma^2h(\psi+\eta)+1}+\beta(\beta-2),&\text{if }\bar\beta>\beta,\\[2mm]\dfrac{\sigma^2h\eta^2}{\psi+\eta},&\text{if }\bar\beta\le\beta.\end{cases}$$
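The case distinction above can be cross–checked numerically. The sketch below is an illustration only, with hypothetical parameter values $\sigma=h=\psi=\eta=1$. It normalises by the positive factor $-\eta U_{\rm P}(x-y)/(2h)$, computes the remaining bracket by brute force over $z$ directly from the definition of $M$, and compares it with the closed form. It also evaluates the resulting obstacle operator $\eta\sigma^2h$ minus that bracket, and checks its sign against the threshold given in the statement of the lemma.

```python
import math

# Illustrative sketch: parameter values are hypothetical, and everything is
# normalised by the positive factor -eta * U_P(x - y) / (2h).

def cost(a, h, beta):
    return 0.5 * h * a * a + beta * a


def bracket_brute_force(beta, sigma, h, psi, eta, z_max=3.0, n=30_001):
    """2h * sup_z of the Hamiltonian with q1=1, q2=-1, gamma1=-eta, gamma2=eta."""
    best = -math.inf
    for i in range(n):
        z = z_max * i / (n - 1)
        a = max(z - beta, 0.0) / h            # agent's best response to z
        value = (a - (0.5 * sigma ** 2 * psi * z * z + cost(a, h, beta))
                 - 0.5 * sigma ** 2 * eta * z * z + sigma ** 2 * eta * z)
        best = max(best, 2.0 * h * value)
    return best


def beta_bar(sigma, h, psi, eta):
    s = sigma ** 2 * h
    return 1.0 - psi * sigma * math.sqrt(h / ((psi + eta) * (1.0 + s * (psi + eta))))


def bracket_closed_form(beta, sigma, h, psi, eta):
    s = sigma ** 2 * h
    if beta < beta_bar(sigma, h, psi, eta):
        return (s * eta + 1.0) ** 2 / (s * (psi + eta) + 1.0) + beta * (beta - 2.0)
    return s * eta ** 2 / (psi + eta)


def beta_lower(sigma, h, psi, eta):
    """Threshold from the statement of the lemma."""
    s = sigma ** 2 * h
    return max(1.0 - math.sqrt(s * psi * (1.0 + s * eta) / (1.0 + s * (psi + eta))), 0.0)


def obstacle_operator(beta, sigma, h, psi, eta):
    """Normalised value of the PDE operator applied to the obstacle."""
    return eta * sigma ** 2 * h - bracket_closed_form(beta, sigma, h, psi, eta)


if __name__ == "__main__":
    args = (1.0, 1.0, 1.0, 1.0)  # sigma, h, psi, eta
    for b in (0.1, 0.4, 0.55, 0.62, 0.9):
        print(b, bracket_brute_force(b, *args), bracket_closed_form(b, *args),
              obstacle_operator(b, *args))
    print("thresholds:", beta_lower(*args), beta_bar(*args))
```

For these placeholder values, the normalised operator is negative at $\beta=0.1$ and positive at $\beta=0.3$, on either side of the threshold of the lemma, which is approximately $0.18$ here.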
Hence, the diffusion operator in the PDE applied to $(x,y)\longmapsto U_{\rm P}(x-y)$ is exactly equal to
$$-\frac{\eta U_{\rm P}(x-y)}{2h}\begin{cases}\eta\sigma^2h-\dfrac{(\sigma^2h\eta+1)^2}{\sigma^2h(\psi+\eta)+1}-\beta(\beta-2),&\text{if }\bar\beta>\beta,\\[2mm]\eta\sigma^2h-\dfrac{\sigma^2h\eta^2}{\psi+\eta},&\text{if }\bar\beta\le\beta.\end{cases}$$
When $\beta\ge\bar\beta$, the above quantity is always non–negative, and when $\beta<\bar\beta$, one can check that the second–order polynomial in $\beta$ we get always has real roots, that the largest one is above $\bar\beta$, the lowest one is below $\bar\beta$, and therefore that it will be non–negative if and only if $\beta\ge\underline{\beta}$.

The message from the previous lemma is that whenever the lower bound for $\beta$ given in the statement is positive, which is equivalent to having
$$1>\sigma^2h\eta\big(\sigma^2h\psi-1\big),\quad(10.2)$$
a Golden Parachute cannot exist for small values of $\beta$. Since the classical Holmström and Milgrom's model has exactly $\beta=0$, a Golden Parachute cannot exist as soon as Equation (10.2) holds, which happens for small risk aversions for either the principal or the agent. This means that Golden Parachutes can only arise in situations where we have either high risk–aversions, or high marginal costs for the agent, or high uncertainty on the returns of the output $X$. Though one should keep in mind that the setting is now somewhat different from [54], this is in stark contrast with the statement that 'if we allow the principal to be explicitly risk averse we can expect the qualitative features of the optimal contract (including retirement) to be the same as with risk neutrality.' ([54, Remark 3]).

Appendices

A Face–lifted principal's reward

This section is dedicated to the proof of Proposition 2.1.

A.1 Very impatient principal may reduce her loss to zero

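We first consider the case $\rho\ge\gamma r$ of Proposition 2.1.(i). Notice that we always have $\bar F(0)=0$, and that since $F$ is non–positive, we have $\bar F\le 0$. Besides, by our assumptions on $u$, there exist $M>0$ and $C>0$, such that for any $y\ge M$, $F(y)\ge-Cy^\gamma$. Fix some $y_0>0$ and $\varepsilon>0$, and consider then the following control
$$p(t):=\mathbf{1}_{[t^\star,\infty)}(t)\,\rho\varepsilon\,y(t),\quad t\ge 0,$$
where $t^\star$ is the first instant at which $y^{y_0,0}$ reaches the value $M$. We immediately have that
$$y^{y_0,p}(t)=y_0\,\mathrm{e}^{rt}\mathbf{1}_{[0,t^\star)}(t)+M\mathrm{e}^{(r-\rho\varepsilon)(t-t^\star)}\mathbf{1}_{[t^\star,\infty)}(t),\quad t\ge 0.$$
In particular, $T^{y_0,p}=\infty$, and for $T>t^\star$
$$\begin{aligned}\bar F(y_0)&\ge\mathrm{e}^{-\rho T}F\big(y^{y_0,p}(T)\big)+\int_0^T\rho\,\mathrm{e}^{-\rho t}F\big(p(t)\big)\mathrm{d}t\\&=\mathrm{e}^{-\rho T}F\big(M\mathrm{e}^{(r-\rho\varepsilon)(T-t^\star)}\big)+\int_{t^\star}^T\rho\,\mathrm{e}^{-\rho t}F\big(\rho\varepsilon M\mathrm{e}^{(r-\rho\varepsilon)(t-t^\star)}\big)\mathrm{d}t\\&\ge-CM^\gamma\mathrm{e}^{\gamma(\rho\varepsilon-r)t^\star-\rho T\left(1-\gamma\frac{r}{\rho}+\varepsilon\gamma\right)}-C\varepsilon^\gamma\rho^\gamma M^\gamma\frac{\mathrm{e}^{-\rho t^\star}}{1-\gamma\frac{r}{\rho}+\gamma\varepsilon}\Big(1-\mathrm{e}^{-\rho(T-t^\star)\left(1-\gamma\frac{r}{\rho}+\gamma\varepsilon\right)}\Big)\\&\underset{T\to\infty}{\longrightarrow}-C\varepsilon^\gamma\rho^\gamma M^\gamma\frac{\mathrm{e}^{-\rho t^\star}}{1-\gamma\delta+\gamma\varepsilon},\end{aligned}$$
by the condition $\rho\ge\gamma r$. As $\gamma>1$, the last limit converges to $0$ as $\varepsilon\searrow 0$. Together with $\bar F\le 0$, this shows that $\bar F(y_0)=0$.

A.2 Non–degenerate face–lifted utility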

By standard control theory, the Hamilton–Jacobi equation corresponding to the mixed control–stopping problem defining $\bar F$ is
$$\min\Big\{w-F,\ w-\delta yw'+F^\star\big(\delta w'\big)\Big\}=0,\ \text{on }(0,\infty),\quad w(0)=0.$$
Notice first that, similar to Remark 7.1.(ii), this ODE is equivalent to
$$w-\delta yw'+F^\star\big(\delta w'\big)=0,\ \text{on }(0,\infty),\quad w(0)=0,\quad\text{(A.1)}$$
as the last equation implies that $w-F=\delta yw'-F-F^\star\big(\delta w'\big)\ge 0$. When $\delta=1$, it has a unique strictly concave solution given by $F$. The following lemma addresses the general case.

Lemma A.1.

Let δ = 1 and γδ > , Denote by w ? be the function introduced in (2.9) , and let w := ( w ? ) ? beits concave conjugate. Then w is a solution of (A.1) satisfying ¯ c ( − y γ ) ≤ w ( y ) ≤ ¯ c ( − y γ ) . Moreover, w (0) = F (0) δ { δ ≥ } .Proof. Notice ﬁrst that if w solves Equation (A.1), then, whenever w (0) is ﬁnite, by letting y go to 0, weget F ? (cid:0) δw (0) (cid:1) = 0. This implies that δw (0) ≥ F (0), and as w ≥ F we deduce that w (0) ≥ F (0)1 ∨ δ , aninequality obviously satisﬁed when w (0) = ∞ , which is the only possible inﬁnite value by concavity of w .We now show that w (0) = F (0) 1 δ { δ ≥ } . (A.2)To see this, we consider the following alternative cases. δ ≥

1. Assume to the contrary that δw (0) > F (0), then δw > F (0) on [0 , ε ) for some ε >

0, bycontinuity. This in turn implies that F ? ( δw ) = 0 on [0 , ε ), and equation (A.1) reduces to w ( y ) − δyw ( y ) = 0,on [0 , ε ), and we get w ( y ) = Const . w ( y ) y δ y δ , < y ≤ y < ε. (A.3)As w (0) = 0 and w ≤

0, we see that w ( y ) /y δ −→

0, as y &

0, and therefore w = 0 on [0 , ε ), contradictingthe strict concavity of w . • δ <

1. Since w (0) ≥ F (0), we have again F ? ( δw ) = 0 on [0 , ε ), for some ε >

0, and by arguying asin the previous case, we arrive to again to the same conclusion (A.3). However, as δ <

1, the only way toavoid explosion of w ( y ) /y δ as y & w (0) = 0.In particular, notice that (A.2) implies that any strictly concave solution of (A.1) is decreasing. Next, we canuse convex duality and consider the dual of w , given by w ? ( p ) := inf y ≥ (cid:8) py − w ( y ) (cid:9) . Notice that since weproved that we needed to have w (0) = F (0) δ { δ> } =: f δ , the domain over which w ? is naturally deﬁnedis ( −∞ , f δ ]. As such the ODE satisﬁed by w ? is − w ? ( p ) + (1 − δ ) p (cid:0) w ? (cid:1) ( p ) + F ? ( δp ) = 0 , p < f δ , w ? ( f δ ) = 0 . (A.4)This linear ODE has the generic solution, for any C ∈ R and ε > w ? ( p ) = ( − p ) − δ − (cid:18) C − − δ Z f δ − εp F ? ( δx )( − x ) − δ d x (cid:19) , p < f δ . (A.5)We need to study the behaviour of the solution when p goes to f δ − . We will thus consider three cases. Case 1: f δ < . In this case, we can take directly ε = 0 in Equation (A.5), as the integrand has no singularity,and the boundary condition at f δ imposes that C = 0, so that the solution is uniquely determined by w ? ( p ) = ( − p ) − δ − δ − Z f δ p F ? ( δx )( − x ) − δ d x, p ≤ f δ . In this case, we necessarily have δ >

1, therefore, by Lemma A.4, w ? is strictly concave, increasing, and w ? ≤ F ? . This immediately proves that w is unique, strictly concave, decreasing, and above F . Besides,the explicit formula we obtained shows by direct integration and using Equation (2.8) that w ? satisﬁes alsoEquation (2.8) with appropriate constants, which directly implies the required inequalities for w . Case 2: f δ = 0 and δ > . Under this condition, we can take ε = 0 in Equation (A.5), leading to( − p ) − δ − − δ Z p F ? ( δx )( − x ) − δ d x = p → − o (1) , and thus converges to 0 as p goes to 0 − . Therefore, when δ >

1, we need to take again C = 0 in Equa-tion (A.5) to satisfy the boundary condition w ? (0) = 0, and our solution is uniquely determined. Moreover,by Lemma A.4, w ? is strictly concave, increasing, and w ? ≤ F ? . This immediately proves that w is unique,strictly concave, decreasing, and above F . We deduce that w satisﬁes the required inequalities as in theprevious case. Case 3: δ < . In this case, it can be checked that for any C ∈ R and ε >

0, we havelim p → − ( − p ) − δ − (cid:18) C − − δ Z − εp F ? ( δx )( − x ) − δ d x (cid:19) = 0 . We therefore have, a priori , inﬁnitely many possible solutions to the ODE. However, notice that the growthimposed on w translates into ¯ c ∗ ( − | p | γγ − ) ≤ w ? ( p ) ≤ ¯ c ∗ (1 + | p | γγ − ), and this implies that | p | − − δ w ? ( p ) ≤ ¯ c ∗ (cid:16) | p | − − δ + | p | − γδ (1 − δ )( γ − (cid:17) −→ p →−∞ , (A.6) s γδ >

1. Then, C and ε must be such that C = − δ R − ε −∞ F ? ( δx )( − x )

1+ 11 − δ d x, where the ﬁniteness of the lastintegral is satisﬁed in our setting again by γδ >

1. Consequently, w ? is uniquely determined and given by w ? ( p ) = ( − p ) − δ − (1 − δ ) Z p −∞ F ? ( δx )( − x ) − δ d x, p ≤ , (A.7)and we can again use Lemma A.4 to conclude.The following is the easy inequality in a generic veriﬁcation theorem for F . Lemma A.2.

Let w be a C super–solution of w − δyw + F ? ( δw ) ≥ on R + . Then w ≥ F .Proof. We ﬁrst observe that w − F ≥ T F ( y, δw ) ≥

0. We next compute for all p ∈ B R + and T ≤ T y ,p that w ( y ) = e − ρT w (cid:0) y y ,p ( T ) (cid:1) + Z T ρ e − ρt (cid:16) w ( y y ,p ( t )) − δ (cid:0) y y ,p ( t ) − p ( t ) w ( y y ,p ( t ) (cid:1)(cid:17) d t ≥ e − ρT F (cid:0) y y ,p ( T ) (cid:1) + Z T ρ e − ρt F (cid:0) p ( t ) (cid:1) d t, by the super–solution property of w . The arbitrariness of p ∈ B R + and T ≤ T y ,p implies that w ≥ F . Lemma A.3.

Let δ = 1 and δγ > . Assume further that Assumption 3.5 holds. Then F = (cid:0) F ? (cid:1) ? , where F ? is given explicitly in (2.9) , is the unique solution of Equation (A.1) in the class of functions satisfying ¯ c ( − y γ ) ≤ F ( y ) ≤ ¯ c ( − y γ ) . Moreover, F is a strictly concave decreasing majorant of F , with F (0) = δ − F (0) { δ> } .Proof. We show by a standard veriﬁcation argument that w = F where w is the solution of (A.1) whoseconcave dual was derived explicitly in the Lemma A.1. By Lemma A.2, w is an upper bound for F , i.e. F ≤ w .On the other hand, consider for any y > F ? ( δw ( y )), that is to say( F ? ) (cid:0) δw ( y ) (cid:1) as a feedback control˙ y ?t = r (cid:0) y ?t − p ?t (cid:1) , where p ?t := ( F ? ) (cid:0) δw ( y ?t ) (cid:1) . Direct diﬀerentiation of (A.1) provides that for any y >

0, (1 − δ ) w ( y ) = (cid:0) y − ( F ? ) (cid:0) δw ( y )) (cid:1) δw ( y ), sothat ˙ y ?t = r (cid:18) δ − (cid:19) y ?t h ( y ?t ) , t ≥ , where h ( y ) := w ( y ) yw ( y ) , y > . Since h ≥

0, we see that y ? is decreasing when δ >

1, and is therefore well–deﬁned at least until the hittingtime of zero T ? := T y ,p ? < ∞ . In contrast, when δ < y ? is increasing until some explosion time ¯ T , and T ? = ∞ .Following the same calculation as in the ﬁrst step of the present proof, we see that under the control p ? ,all inequalities are turned into equalities, leading for any T ∈ [0 , ¯ T ] to w ( y ) = e − ρT ∧ T ? w ( y ?T ∧ T ? ) + Z T ∧ T ? ρ e − ρt F ( p ?t )d t. (A.8)First, by the previous step, when δ >

1, we have T ? < ∞ , and we obtain by sending T to ∞ and usingthe boundary condition w (0) = 0 that ( T ? , p ? ) attains the upper bound w ( y ), and is therefore an optimalcontrol for the problem F .In the alternative case δ <

1, we have T ? = ∞ . In the rest of this proof, we show that¯ h := sup y ≥ ˆ y { h ( y ) } < γ (1 − δ ) , for some ˆ y > . (A.9) hen, y ? is deﬁned on R + , i.e. ¯ T = ∞ , and since ˆ T := inf { t ≥ y ?t ≥ ˆ y } < ∞ , we deduce from the growthof w that for some C >

0, whose value may change from line to line, and any $t \ge \hat T$,
$$e^{-\rho(t-\hat T)} |w(y^\star_t)| \le C e^{-\rho(t-\hat T)} \big(1 + |y^\star_t|^\gamma\big) \le C e^{-\rho(t-\hat T)} \Big(1 + e^{\rho\gamma(1-\delta)\int_{\hat T}^t h(y^\star_s)\,\mathrm{d}s}\Big) \le C e^{-\rho(t-\hat T)} \Big(1 + e^{\rho\gamma(1-\delta)\bar h (t-\hat T)}\Big) \underset{t\to\infty}{\longrightarrow} 0,$$
as $1 - \gamma(1-\delta)\bar h >$

0. Sending T to ∞ in (A.8) this provides again that ( T ? , p ? ) = ( ∞ , p ? ) attains the upperbound w ( y ).In order to verify (A.9), we prove equivalently that the concave dual w ? satisﬁessup p ≤ ˆ p (cid:26) p ( w ? ) ( p )( w ? ) ( p ) (cid:27) < γ (1 − δ ) , for some ˆ p < . (A.10)Diﬀerentiating the ODE (A.4) satisﬁed by w ? , and using the expression of w ? from Equation (A.7) in thepresent case, we see that p ( w ? ) ( w ? ) = δ − δ − Ψ( δp ) , with Ψ( p ) := − pψ ( p ) ψ ( p ) , and ψ ( p ) := ( − p ) − − δ F ? ( p ) − Z p −∞ F ? ( u )(1 − δ )( − u ) − δ d u. Notice that lim p →−∞ ψ ( p ) = 0, and that ψ is easily shown to be non–negative and non–decreasing. Therefore,if − pψ ( p ) does not go to 0 as p goes to −∞ , we have that lim p →−∞ Ψ( δp ) = ∞ , and Equation (A.10)automatically holds. Now if − pψ ( p ) −→ p →−∞

0, it follows from l’Hôpital’s rule and our assumptions thatlim y →∞ (cid:26) F ( y ) yF ( y ) (cid:27) = lim p →−∞ (cid:26) p ( F ? ) ( p )( F ? ) ( p ) (cid:27) = δ − δ − lim p →−∞ (cid:26) − ψ ( p ) − pψ ψ ( p ) (cid:27) . Then, using again l’Hôpital’s rule, we deducelim sup p →−∞ (cid:26) p ( w ? ) ( p )( w ? ) ( p ) (cid:27) = δ − δ − lim inf p →−∞ (cid:26) − pψ ( p ) ψ ( p ) (cid:27) = δ − δ − lim inf p →−∞ (cid:26) − ψ ( p ) − pψ ψ ( p ) (cid:27) = lim y →∞ (cid:26) F ( y ) yF ( y ) (cid:27) . Then, assuming to the contrary that (A.10) does not hold means that, for ﬁxed γ ∈ (1 + γ (1 − δ ) , γ ), wemay ﬁnd y > F ( y ) F ( y ) ≤ ( γ − y , for y ≥ y . Integrating twice and recalling that F ≤

0, thisimplies that F ( y ) ≥ F ( y ) + y F ( y ) γ (cid:18)(cid:18) yy (cid:19) γ − (cid:19) , for all y ≥ y , which in turn leads to the following contradiction y F ( y ) γ ≤ lim F ( y ) y γ = −∞ by our assumption on thegrowth of F together with the fact that γ < γ .We end this section with the result used in the proof of Lemma A.1. Lemma A.4.

Let δ = 1 , and let F ? be a solution of − F ? + (1 − δ ) p (cid:0) F ? (cid:1) + F ? ( δp ) = 0 , p < F (0) δ { δ> } , F ? (cid:18) F (0) δ { δ> } (cid:19) = 0 . (A.11) Then F ? ≤ F ? , F ? is strictly concave and increasing.Proof. Denote φ := F ? − F ? , and notice that Equation (A.11) says that for any p < F (0) /δ { δ> } =: f δ φ ( p ) = F ? ( p ) − F ? ( δp ) + ( δ − p (cid:0) F ? (cid:1) ( p ) ≥ (1 − δ ) pφ ( p ) , by the concavity of F ? . In other words, if we deﬁne for p < f δ , ψ ( p ) := ( − p ) δ − , we have ψ ( p ) = ( − p ) δ − − δ (cid:0) φ ( p ) − (1 − δ ) pφ ( p ) (cid:1) . e need to distinguish two cases, depending on whether δ > δ <

1. First, if δ >

1, we have that ψ isnon–increasing, and thus for any p < f δ ( − p ) δ − φ ( p ) ≥ lim p → f δ n ( − p ) δ − φ ( p ) o = 0 , since F ? ( f δ ) = F ? ( f δ ) = 0 (recall that F ? is 0 above F (0), which is itself below f δ , since δ > δ <

1, by arguying as in (A.6), we arrive at the conclusion φ ≥

0, and thus that F ? ≤ F ? , as desired.Next, by direct diﬀerentiation of Equation (A.11), then substituting the expression of ( F ? ) from Equa-tion (A.11), and ﬁnally using the strict concavity of F ? , we deduce that for any p < f δ ( δ − p ( F ? ) ( p ) = δ ( δ − p (cid:0) ( F ? ) ( δp ) − ( F ? ) ( p ) (cid:1) = δ ( δ − p ( F ? ) ( δp ) − δ (cid:0) F ? ( δp ) − F ? ( p ) (cid:1) < δ (cid:0) F ? ( p ) − F ? ( p ) (cid:1) ≤ , thus proving the strict concavity of F ? .Finally, since F ? is strictly concave, remains below F ? which is increasing on ( −∞ , F (0)], and 0 on[ F (0) , f δ ∧ F (0)], then F ? must also be increasing on its domain. B Ingredients for Perron’s existence method

This section provides two main technical results which were needed to justify the existence of a solution of the dynamic programming equation in Lemma 9.3. We first prove the existence of a super–solution for the dynamic programming equation with appropriate growth. Then, we show that, despite the exploding feature of $F^\star$, the dynamic programming equation satisfies a comparison result. Lemma B.1. (Super–solution)

Let $g : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ be a $C^2$, increasing, strictly concave function, with at most logarithmic growth at infinity, such that $g(0) = 0$ and the following possibly infinite limit exists
$$c_o := \lim_{y \to \infty} -\frac{y\,g''(y)}{g'(y)} \ge \big(1 - \delta^{-1}\big)^+.$$
For $b > 0$, let $F_b(y) := F(y) + b\,g(y)$, $y \ge 0$. Then, for $b$ sufficiently large, we may find a super–solution $\bar v$ of (7.1) with growth at infinity controlled by $F_b$.

Proof. We proceed in five steps.

Step . As I is Lipschitz in ( p, q ), it follows from the standard Cauchy–Lipschitz theorem that we mayconsider the maximal solution v b on [0 , ¯ y ), for some ¯ y , of v b − δyv b + F ? (cid:0) δv b (cid:1) − I (cid:0) δv b , δv b ) = 0 , v b (0) = 0 , v b (0) = b, by writing this equation in its explicit form (3.5). Indeed, this is immediate when β >

0, and when β = 0,we can argue as in the proof of Lemma 9.1.Next, as long as v b is non–decreasing, we necessarily have v b ≥ F (recall that F (0) = 0 and F isdecreasing), and therefore I ( v b , v b ) = v b − δyv b + F ? (cid:0) δv b (cid:1) ≥ F − δyv b + F ? (cid:0) δv b (cid:1) ≥

0. Consequently I ( δv b , δv b ) = I ( δv b , δv b ) + as long as v b is non–decreasing, and v b is a solution of the required equation(7.1) on this region. Step . We ﬁrst consider the case where v b remains increasing on [0 , ¯ y ). Then, v b solves the ODE on [0 , ∞ ),i.e. ¯ y = ∞ , and we claim that0 ≤ v b ≤ ¯ a, and ¯ v b := v b + F is a super–solution of (7.1) . he last statement follows from the fact that L ¯ v b ( y ) = v b ( y ) + F ( y ) − δy (cid:0) v b ( y ) + F ( y ) (cid:1) + F ? (cid:0) δv b ( y ) + δF ( y ) (cid:1) − I (cid:0) δv b ( y ) + δF ( y ) , δv b ( y ) + δF ( y ) (cid:1) + ≥ v b ( y ) + F ( y ) − δy (cid:0) v b ( y ) + F ( y ) (cid:1) + F ? (cid:0) δv b ( y ) + δF ( y ) (cid:1) − I (cid:0) δv b ( y ) , δv b ( y ) (cid:1) + − sup z ∈ R , ˆ a ∈ ˆ A ( z ) (cid:8) h (ˆ a ) δF ( y ) + ηz δF ( y ) (cid:9) = F ? (cid:0) δv b ( y ) + δF ( y ) (cid:1) − F ? (cid:0) δF ( y ) (cid:1) − sup z ∈ R , ˆ a ∈ ˆ A ( z ) (cid:8) h (ˆ a ) δF ( y ) + ηz δF ( y ) (cid:9) ≥ − sup z ∈ R , ˆ a ∈ ˆ A ( z ) (cid:8) h (ˆ a ) δF ( y ) + ηz δF ( y ) (cid:9) ≥ , by the non–decrease of F ? , together with direct manipulation of the supremum and the negativity of F and F . To verify the claim that v b ≤ ¯ a , we consider two separate cases.(i) Case δ ≤

1: by the concavity of v b and the increase of I in q , we have0 = v b ( y ) − δyv b ( y ) − I (cid:0) δv b ( y ) , δv b ( y ) (cid:1) ≥ v b ( y ) − δyv b ( y ) − ¯ a − h (¯ a ) δv b ( y ) . Assume to the contrary that v b ( y ) > ¯ a , for some y >

0. Then, v b > ¯ a on [ y , y ) for some y > y , and itfollows from the last inequality that v b ( y ) v b ( y ) − ¯ a ≥ δy + h (¯ a ) , and therefore v b ( y ) − ¯ a ≥ ( v b ( y ) − ¯ a ) (cid:18) δy + h (¯ a ) δy + h (¯ a ) (cid:19) δ , y ∈ [ y , y ) . This shows that we may take y = ∞ . Moreover, this completes the proof in the case δ < v b . In the remaining case δ = 1, this shows that v b is aﬃne to the right of y , and as thisaﬃne function solves the ODE on R + , we deduce that v b ( y ) = b + by is aﬃne on R + . But this cannothappen as the equation imposes that the constant b = I ( b,

0) = ¯ a + bh (¯ a ), while the boundary conditionimposes that v b (0) = b = 0.(ii) Case δ >

1: as v b is increasing, we have 0 = v b − δyv b − I ( δv b , δv b ) + ≤ v b − δyv b , which implies that v b ( y ) ≤ y δ . Indeed, if we deﬁne w b ( y ) := y /δ − v b ( y ), then the previous inequality implies directly that (cid:0) w b ( y ) y − /δ (cid:1) ≥

0. Since for δ >

1, we have lim y → w b ( y ) y − /δ = 1 >

0, we obtain the desired result.Next, let ψ be a non–negative continuous function deﬁned on a neighbourhood of the origin with ψ (0) = 0.We shall specify this function later, and we denote ψ ε := ψ ( ε ) for all small ε >

0. Assuming to the contrarythat 2 η := v b ( y o ) − ¯ a >

0, for some y o ∈ (0 , ¯ y ), we may ﬁnd for all ε > y ε > M ε := max y ≥ (cid:8) v b ( y ) − ¯ a − εy δ + ψ ε (cid:9) = v b ( y ε ) − ¯ a − εy δ + ψ ε ε . Besides, for ε small enough, we have M ε ≥ v b ( y o ) − ¯ a − εy δ + ψ ε o ≥ η > , showing that y ε must be an interior maximiser for ε small enough. In particular, for such small values of ε ,we have v b ( y ε ) = ε (cid:0) δ + ψ ε (cid:1) y δ + ψ ε − ε , as well as v b ( y ε ) ≤

0, and it follows from the deﬁnition of v b that0 = v b ( y ε ) − δy ε v b ( y ε ) − I (cid:0) δv b ( y ε ) , δv b ( y ε ) (cid:1) + ≥ v b ( y ε ) − δy ε v b ( y ε ) − I (cid:0) δv b ( y ε ) , (cid:1) + = v b ( y ε ) − δy ε v ( y ε ) − ¯ a − h (¯ a ) v ( y ε ) ≥ η − δεψ ε y δ + ψ ε ε − h (¯ a ) δ (cid:18) δ + ψ ε (cid:19) εy δ + ψ ε − ε . (B.1)On the other hand, we have η ≤ v b ( y ε ) − ¯ a − εy δ + ψ ε ε ≤ y δ ε − ¯ a − εy δ + ψ ε ε , which implies that as ε goes to 0 • either ( y ε ) ε> remains bounded (and is in any case bounded away from 0, since v b is non–decreasing)and, by sending ε &

0, (B.1) implies that 0 ≥ η , contradicting the positivity of η ; or y ε −→ ε → ∞ along some sub–sequence, with εy ψ ε ε <

1. We claim that we may choose the function ψ so that for ε small enough ψ ε = ε δψε . (B.2)In this case, εψ ε y δ + ψ ε ε = ε y ψ ε ε ψ ε y δε ε ≤ ε ψ ε , and (B.1) provides again a contradiction by sending ε & ψ ε satisfying (B.2) by verifying directly that for ε small enough, thefunction f ε ( ψ ) := ψε − − δψ is decreasing on [ ε, ∞ ), with f ( ε ) > ψ > f ( ε δψ ) <

1. inparticular, for any ψ > ε δψ < ψ ε < ε implying that ψ ε −→ ε & Step

3: otherwise, if v b is ultimately decreasing, then we may ﬁnd a point of maximum ˆ y b >

0, and we shalljustify in

Step v b (ˆ y b ) = 0 , and ˆ y b % ∞ , as b % ∞ . (B.3)Let ˆ b := − F (ˆ y b ), and set¯ v b ( y ) := v b ( y ) [0 , ˆ y b ] ( y ) + (cid:0) v b (ˆ y b ) + F ˆ b ( y ) − F ˆ b (ˆ y b ) (cid:1) (ˆ y b , ∞ ) ( y ) , y ≥ . By deﬁnition L ¯ v b = 0 on [0 , ˆ y b ]. On [ˆ y b , ∞ ), we compute directly that L ¯ v b ( y ) = v b (ˆ y b ) − F (ˆ y b ) + L F ˆ b ( y ) , Since v b (ˆ y b ) > > F (ˆ y b ), this implies that L ¯ v b ( y ) > L F ˆ b ( y ) ≥ , for large ˆ b, by Step 5 below, together with (B.3) which implies that ˆ b = − F (ˆ y b ) % ∞ as b % ∞ . Step

4: we next justify (B.3). By Remark 9.2, the concave dual v ?b of v b solves the ODE (9.1), which reduceson [0 , b ] (that is to say the domain where y ≤ ˆ y b ) to − ( v ?b ) ( p ) = δη inf z ≥ β, ˆ a ∈ ˆ A ( z ) (cid:26) z v ? ( p ) + ˆ a + δh (ˆ a ) p + ( δ − p ( v ?b ) ( p ) (cid:27) , on [0 , b ] , v ?b ( b ) = 0 , ( v ?b ) (0) = ˆ y b , by the identity ( v ?b ) = ( v b ) − , and the fact that on the domain where v b is increasing, we can replace I +0 by I in the ODE.Assume ﬁrst that δ ≤

1. As 0 = v (0) = − sup p ∈ R v ? ( p ), and since v ?b is concave and increasing on [0 , b ],we deduce from the previous PDE that − ( v ?b ) ≥ ηδ β ¯ a + δh (¯ a ) p , which provides by direct integration between 0 and b thatˆ y b = ( v ?b ) (0) ≥ − (cid:0) ( v ?b ) ( b ) − ( v ?b ) (0) (cid:1) ≥ ηδβ (cid:0) log (cid:0) ¯ a + δh (¯ a ) b (cid:1) − log (¯ a ) (cid:1) −→ ∞ , as b % ∞ . Similarly, when δ >

1, we have − ( v ?b ) ≥ ηδ β ¯ a + ( δh (¯ a ) + ( δ − y b ) p , which provides by direct integration between 0 and b thatˆ y b ≥ ηδβ (cid:0) log (cid:0) ¯ a + ( δh (¯ a ) + ( δ − y b ) b (cid:1) − log (¯ a ) (cid:1) . Now, if ˆ y b remained bounded as b goes to ∞ , the above inequality would lead to a contradiction for b large.We can then let b go to ∞ and deduce again that ˆ y b goes to ∞ as b goes to ∞ . tep

5: It remains to prove that $F_b$ is a super–solution of (7.1) on $\{F_b' \le 0\}$ for sufficiently large $b > 0$. We have $\{F_b' \le 0\} = [y(b), \infty)$, where $y(b)$ is the unique solution of $F'\big(y(b)\big) + b\,g'\big(y(b)\big) = 0$, which exists as $F_b$ is strictly concave and ultimately decreasing. Moreover, we get by direct differentiation, and using the fact that $g$ is increasing, and both $g$ and $F$ are strictly concave,
$$y'(b) = -\frac{g'(y(b))}{b\,g''(y(b)) + F''(y(b))} > 0,\; b \ge 0.$$
Hence $b \longmapsto y(b)$ is increasing, and it is immediate from the equation defining $y(b)$ that it must go to $\infty$ as $b$ goes to $\infty$. Next, by the non–decrease of $F^\star$ and $I(p, \cdot)^+$, we directly compute that
$$\mathcal{L} F_b \ge b f(y) - I\big(\delta F_b', \delta F''\big) \ge b f(y) - \bar a, \text{ where } f(y) := g(y) - \delta y\,g'(y),\; y \ge 0.$$
Direct calculations show that
$$f'(y) = g'(y)\Big(1 - \delta - \delta\,\frac{y\,g''(y)}{g'(y)}\Big),\; y \ge 0.$$
If $c_o = \infty$, then for $y$ large enough $f$ must be increasing. Otherwise, we have $f'(y) \sim_{y \to \infty} g'(y)\big(1 - \delta + \delta c_o\big)$, and $f$ is still increasing for large values of $y$. We can then choose $b$ large enough so that $f$ is increasing on $[y(b), \infty)$, and we can then deduce that $\mathcal{L} F_b \ge b f\big(y(b)\big) - \bar a$ on $[y(b), \infty)$. We now complete the proof by showing that $b f\big(y(b)\big) - \bar a \ge 0$ for $b$ large. However this is immediate since we have that $f$ is increasing on $[y(b), \infty)$ for $b$ large, that it converges to $\infty$ as $y$ goes to $\infty$, and we have already proved that $y(b)$ increases to $\infty$ as $b$ goes to $\infty$.

Remark B.2. In Lemma B.1, there are several possible choices for the function $g$. For instance, $g(y) = \log(1 + y)$, or $g(y) = \log\big(1 + \log(1 + y)\big)$, both verify the required properties with $c_o = 1$. This actually extends to arbitrarily many iterations of the logarithm. In particular, the upper bound for $v_b$ in Lemma 9.3 below can be improved, but the one we give is enough for our purpose here.
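The value $c_o = 1$ stated in Remark B.2 can be checked by hand. The following worked computation is only a sketch: it assumes that the second example reads $g(y) = \log(1 + \log(1+y))$ (the formula is damaged in this copy), and the names $g_1$, $g_2$ and the shorthand $\ell(y) := \log(1+y)$ are introduced here for the computation and are not notation from the paper.

```latex
% Example 1: g_1(y) = \log(1+y)
\begin{align*}
g_1'(y) &= \frac{1}{1+y}, \qquad g_1''(y) = -\frac{1}{(1+y)^2}, \\
-\frac{y\,g_1''(y)}{g_1'(y)} &= \frac{y}{1+y} \xrightarrow[y\to\infty]{} 1 .
\end{align*}
% Example 2 (assumed form): g_2(y) = \log(1+\ell(y)), with \ell(y) := \log(1+y)
\begin{align*}
g_2'(y) &= \frac{1}{(1+y)\,(1+\ell(y))}, \\
g_2''(y) &= -\frac{1}{(1+y)^2\,(1+\ell(y))}\Big(1 + \frac{1}{1+\ell(y)}\Big), \\
-\frac{y\,g_2''(y)}{g_2'(y)} &= \frac{y}{1+y}\Big(1 + \frac{1}{1+\ell(y)}\Big) \xrightarrow[y\to\infty]{} 1 .
\end{align*}
```

In both cases $g(0) = 0$, $g' > 0$ and $g'' < 0$ on $\mathbb{R}_+$, and the growth is at most logarithmic since $\log(1+x) \le x$ for $x \ge 0$. The requirement $c_o \ge (1 - \delta^{-1})^+$ of Lemma B.1 then holds for every $\delta > 0$, and the factor $1 - \delta + \delta c_o$ appearing in the asymptotics of $f'$ in Step 5 equals $1$.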

We conclude this section by reporting the comparison result used in the proof of Lemma 9.3.

Lemma B.3. (Comparison)

Let $u$ and $v$ be respectively a viscosity sub–solution and a viscosity super–solution of (7.1), such that for $\varphi \in \{u, v\}$ and for some $b > 0$,
$$F(y) \le \varphi(y) \le F(y) + b \log\big(1 + \log(1 + y)\big),\; y \ge 0.$$
Then $u \le v$ on $\mathbb{R}_+$. Remark B.4.

The speciﬁc upper bound in the statement of

Lemma B.3 with an iterated logarithm is notimportant per se . Indeed, the proof goes through as long as the upper bound is of the form bg ( y ) for somepositive, increasing, strictly concave map g , null at , growing strictly slower at ∞ than log( y ) . And wehave already seen in Remark B.2 that we could ﬁnd inﬁnitely many such functions such that F + bg is asuper–solution of Equation (7.1) for b large enough.Proof of Lemma B.3 . Notice that u and v are respectively viscosity sub–solution and super–solution of theequation w + G ( y, w , w ) = 0 , on R + , (B.4)where the nonlinearity G is given, for any ( y, p, q ) ∈ R + × R , by G ( y, p, q ) := − δyp + F ? ( δp ) − I (cid:0) δp, δq (cid:1) + , Our objective is to follow Crandall, Ishii, and Lions [13, Section 3], and adapt the arguments there to ourcontext. Fix some ν > µ := 2 ∨ ( γ + ν ). We consider for any α > ε > ψ α,ε ( x, y ) := u ( x ) − v ( y ) − αµ | x − y | µ − ε log(1 + y ) , ( x, y ) ∈ R . e let for any α > ε > M α,ε := sup ( x,y ) ∈ R ψ α,ε ( x, y ) . By the growth assumptions on u and v , the supremum in the deﬁnition of M α,ε is attained, and we candeﬁne an R –valued sequence ( x α,ε , y α,ε ) α> such that for any α > ε > M α,ε = ψ α,ε ( x α,ε , y α,ε ) . Since the supremum is attained on a compact set, we can ﬁnd for any ε > x εn , y εn ) n ∈ N := ( x α n ,ε , y α n ,ε ) n ∈ N , converging to some (ˆ x ε , ˆ y ε ). Moreover, by standard arguments fromviscosity solution theory (see for instance Crandall, Ishii, and Lions [13, Proposition 3.7]), we haveˆ x ε = ˆ y ε , lim n →∞ α n (cid:12)(cid:12) x εn − y εn (cid:12)(cid:12) µ = 0 , M ε := lim n →∞ M α n ,ε = sup y ≥ ( u − v )( y ) − ε log (cid:0) y ε (cid:1) . Let us now assume that there is some y o > η := ( u − v )( y o ) >

0. Then, we have for any n ∈ N and ε > η − ε log(1 + y o ) ≤ M α n ,ε = ψ α,ε ( x εα n , y εα n ) . In particular, for n suﬃciently large, we have that x εn and y εn are both positive, and we assume for notationalsimplicity that we took the appropriate subsequence. Using Crandall–Ishii’s lemma (see Crandall, Ishii, andLions [13, Theorem 3.2]), we can ﬁnd for each integer n , an R –valued sequence ( X εn , Y εn ) n ∈ N such that (cid:0) α n ( x εn − y εn ) µ − , X εn ) ∈ J , + u ( x εn ) , (cid:18) α n ( x εn − y εn ) µ − − ε y εn , Y εn (cid:19) ∈ J , − v ( y εn ) , with the notation x φ := sgn( x ) | x | φ for all φ > x ∈ R , and − (cid:18) λ + k C εn k (cid:19) ≤ (cid:18) X εn − Y εn (cid:19) ≤ C εn (cid:18) I + λC εn (cid:19) , for all λ > , where I is the two–dimensional identity matrix, and C εn := a εn A + b εn B, a εn := α n ( µ − | x εn − y εn | µ − , b εn := ε (1 + y εn ) , A := (cid:18) − − (cid:19) , B := (cid:18) (cid:19) , and where we use the spectral norm for symmetric matrices. Take λ = k C εn k − , we get − k C εn k I ≤ (cid:18) X εn − Y εn (cid:19) ≤ C εn (cid:18) I + C εn k C εn k (cid:19) . This implies in particular (simply multiply the above inequality by (1 ,

1) to the left and (1 , > to the right)that for any n ∈ N X εn − Y εn ≤ ( b εn ) k C εn k + b εn . By the sub–solution and super–solution properties of u and v , we have for any n ∈ N u ( x εn ) + G (cid:0) x εn , α n ( x εn − y εn ) µ − , X εn (cid:1) ≤ ≤ v ( y εn ) + G (cid:18) y εn , α n ( x εn − y εn ) µ − − ε y εn , Y εn (cid:19) . We deduce that η − ε log(1 + y o ) ≤ u ( x εn ) − v ( y εn ) ≤ F n,ε , where F n,ε := G (cid:18) y εn , α n ( x εn − y εn ) µ − − ε y εn , Y εn (cid:19) − G (cid:0) x εn , α n ( x εn − y εn ) µ − , X εn (cid:1) . ow notice that since I +0 is Lipschitz continuous and non–decreasing with respect to its second variable,and since F ? is non–decreasing, we have for some c o > F n,ε = δα n | x εn − y εn | µ + δε y εn y εn + F ? (cid:18) δα n ( x εn − y εn ) µ − − δε y εn (cid:19) − F ? (cid:0) δα n ( x εn − y εn ) µ − (cid:1) + I (cid:0) δα n ( x εn − y εn ) µ − , X εn (cid:1) + − I (cid:18) δα n ( x εn − y εn ) µ − − δε y εn , Y εn (cid:19) + ≤ δα n | x εn − y εn | µ + δε + I (cid:18) δα n ( x εn − y εn ) µ − , Y εn + ( b εn ) k C εn k + b εn (cid:19) + − I (cid:18) δα n ( x εn − y εn ) µ − − δε y εn , Y εn (cid:19) + ≤ δα n | x εn − y εn | µ + δε + c o (cid:18) ( b εn ) k C εn k + b εn (cid:19) + c o δε y εn ≤ δα n | x εn − y εn | µ + (1 + c o ) δε + c o b εn + c o ( b εn ) k C εn k . We now want to let n go to ∞ , and will distinguish two cases. First, if (cid:0) k C εn k (cid:1) n ∈ N is unbounded, taking asubsequence if necessary, we deduce by letting n go to ∞ that η − ε log(1 + y o ) ≤ (1 + c o ) δε + c o ε (1 + ˆ y ε ) , which gives a contradiction when ε goes to 0.If now (cid:0) k C εn k (cid:1) n ∈ N remains bounded, we take a converging subsequence, and notice that we then have forsome a ε ∈ R k C εn k −→ n →∞ k C ε k := (cid:13)(cid:13)(cid:13) a ε A + ε (1 + ˆ y ε ) | {z } =: b ε B (cid:13)(cid:13)(cid:13) . 
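If the sequence $(a_\varepsilon)_{\varepsilon > 0}$ is unbounded, we take again a subsequence and get a contradiction by letting $\varepsilon$ go to $0$ in
$$\eta - \varepsilon \log(1 + y_o) \le (1 + c_o)\delta\varepsilon + c_o b_\varepsilon + c_o \frac{(b_\varepsilon)^2}{\|C_\varepsilon\|}. \tag{B.5}$$
If now $(a_\varepsilon)_{\varepsilon > 0}$ remains bounded and is such that $\overline{a} := \limsup_{\varepsilon \to 0} a_\varepsilon \neq 0$, or $\underline{a} := \liminf_{\varepsilon \to 0} a_\varepsilon \neq 0$, then taking another subsequence, we obtain a contradiction by letting $\varepsilon$ go to $0$ in Equation (B.5). Finally, if $\lim_{\varepsilon \to 0} a_\varepsilon = 0$, then three cases can occur

(i) first, if $a_\varepsilon = o(b_\varepsilon)$ as $\varepsilon \to 0$, then $\frac{(b_\varepsilon)^2}{\|C_\varepsilon\|} \sim b_\varepsilon \longrightarrow 0$ as $\varepsilon \to 0$, and we conclude again by letting $\varepsilon$ go to $0$ in Equation (B.5);

(ii) if instead $b_\varepsilon = o(a_\varepsilon)$ as $\varepsilon \to 0$, then $\frac{(b_\varepsilon)^2}{\|C_\varepsilon\|} \sim \frac{b_\varepsilon}{a_\varepsilon}\, b_\varepsilon \longrightarrow 0$ as $\varepsilon \to 0$, and we conclude similarly;

(iii) finally, if $a_\varepsilon \sim c\, b_\varepsilon$ as $\varepsilon \to 0$ for some $c \neq 0$, then $\frac{(b_\varepsilon)^2}{\|C_\varepsilon\|} \sim \frac{b_\varepsilon}{\|cA + B\|} \longrightarrow 0$ as $\varepsilon \to 0$, and we get once more a contradiction.

References

[1] T. Adrian and M.M. Westerfield. Disagreement and learning in a dynamic contracting model.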

Review of Financial Studies, 22(10):3873–3906, 2009.
[2] R. Aïd, D. Possamaï, and N. Touzi. Optimal electricity demand response contracting with responsiveness incentives. arXiv preprint arXiv:1810.09063, 2018.
[3] O. Alvarez, J.-M. Lasry, and P.-L. Lions. Convex viscosity solutions and state constraints. Journal de Mathématiques Pures et Appliquées, 76(3):265–288, 1997.
[4] B. Biais, T. Mariotti, G. Plantin, and J.-C. Rochet. Dynamic security design: convergence to continuous time and asset pricing implications. The Review of Economic Studies, 74(2):345–390, 2007.
[5] B. Biais, T. Mariotti, J.-C. Rochet, and S. Villeneuve. Large risks, limited liability, and dynamic moral hazard. Econometrica, 78(1):73–118, 2010.
[6] B. Biais, T. Mariotti, and J.-C. Rochet. Dynamic financial contracting. In D. Acemoglu, M. Arellano, and E. Dekel, editors, Advances in economics and econometrics, 10th world congress of the Econometric Society, volume 1, economic theory, number 49 in Econometric Society Monographs, pages 125–171. Cambridge University Press, 2013.
[7] B. Bouchard and N. Touzi. Explicit solution to the multivariate super–replication problem under transaction costs. The Annals of Applied Probability, 10(3):685–708, 2000.
[8] M. Broadie, J. Cvitanić, and H.M. Soner. Optimal replication of contingent claims under portfolio constraints. The Review of Financial Studies, 11(1):59–79, 1998.
[9] A. Capponi and C. Frei. Dynamic contracting: accidents lead to nonlinear contracts. SIAM Journal on Financial Mathematics, 6(1):959–983, 2015.
[10] J.-F. Chassagneux, R. Élie, and I. Kharroubi. When terminal facelift enforces Delta constraints. Finance and Stochastics, 19(2):329–362, 2015.
[11] P. Cheridito, H.M. Soner, and N. Touzi. The multi–dimensional super–replication problem under gamma constraints. Annales de l’institut Henri Poincaré, analyse non linéaire (C), 22(5):633–666, 2005.
[12] S.M. Choi. On Sannikov’s continuous–time principal–agent problem. PhD thesis, University of California Berkeley, 2014.
[13] M.G. Crandall, H. Ishii, and P.-L. Lions. User’s guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society, 27(1):1–67, 1992.
[14] J. Cvitanić and J. Zhang. Contract theory in continuous–time models. Springer, 2012.
[15] J. Cvitanić, X. Wan, and J. Zhang. Optimal contracts in continuous–time models. International Journal of Stochastic Analysis, 2006(095203), 2006.
[16] J. Cvitanić, X. Wan, and J. Zhang. Principal–agent problems with exit options. The B.E. Journal of Theoretical Economics, 8(1):23, 2008.
[17] J. Cvitanić, X. Wan, and J. Zhang. Optimal compensation with hidden action and lump–sum payment in a continuous–time model. Applied Mathematics and Optimization, 59(1):99–146, 2009.
[18] J. Cvitanić, D. Possamaï, and N. Touzi. Moral hazard in dynamic risk management. Management Science, 63(10):3328–3346, 2017.
[19] J. Cvitanić, D. Possamaï, and N. Touzi. Dynamic programming approach to principal–agent problems. Finance and Stochastics, 22(1):1–37, 2018.
[20] J.-P. Décamps and S. Villeneuve. A two–dimensional control problem arising from dynamic contracting theory. Finance and Stochastics, 23(1):1–28, 2019.
[21] P.M. DeMarzo and Y. Sannikov. Optimal security design and dynamic capital structure in a continuous–time agency model. The Journal of Finance, 61(6):2681–2724, 2006.
[22] P.M. DeMarzo, M.J. Fishman, Z. He, and N. Wang. Dynamic agency and the q theory of investment. The Journal of Finance, 67(6):2295–2340, 2012.
[23] R. Élie and D. Possamaï. Contracting theory with competitive interacting agents. SIAM Journal on Control and Optimization, 57(2):1157–1188, 2019.
[24] R. Élie, E. Hubert, T. Mastrolia, and D. Possamaï. Mean–field moral hazard for optimal energy demand response management. arXiv preprint arXiv:1902.10405, 2019.
[25] R. Élie, T. Mastrolia, and D. Possamaï. A tale of a principal and many many agents. Mathematics of Operations Research, 44(2):440–467, 2019.
[26] K. Fong. Evaluating skilled experts: optimal scoring rules for surgeons. Stanford university, 2009.
[27] P. Guasoni, M. Rásonyi, and W. Schachermayer. Consistent price systems and face–lifting pricing under transaction costs. The Annals of Applied Probability, 18(2):491–520, 2008.
[28] I. Hajjej, C. Hillairet, M. Mnif, and M. Pontier. Optimal contract with moral hazard for public private partnerships. Stochastics: An International Journal of Probability and Stochastic Processes, 89(6–7):1015–1038, 2017.
[29] I. Hajjej, C. Hillairet, and M. Mnif. Optimal stopping contract for public private partnerships under moral hazard. arXiv preprint arXiv:1910.05538, 2019.
[30] Z. He. Optimal executive compensation when firm size follows geometric brownian motion. Review of Financial Studies, 22(2):859–892, 2009.
[31] M.F. Hellwig. The role of boundary solutions in principal–agent problems of the Holmström–Milgrom type. Journal of Economic Theory, 136(1):446–475, 2007.
[32] M.F. Hellwig and K.M. Schmidt. Discrete–time approximations of the Holmström–Milgrom Brownian–motion model of intertemporal incentive provision. Econometrica, 70(6):2225–2264, 2002.
[33] F. Hoffmann and S. Pfeil. Reward for luck in a dynamic agency model. The Review of Financial Studies, 23(9):3329–3345, 2010.
[34] B. Holmström and P. Milgrom. Aggregation and linearity in the provision of intertemporal incentives. Econometrica, 55(2):303–328, 1987.
[35] N. Ju and X. Wan. Optimal compensation and pay–performance sensitivity in a continuous–time principal–agent model. Management Science, 58(3):641–657, 2012.
[36] I. Karatzas and S.E. Shreve. Brownian motion and stochastic calculus, volume 113 of Graduate texts in mathematics. Springer–Verlag New York, 2nd edition, 1998.
[37] K.L. Keiber. Overconfidence in the continuous–time principal–agent problem. Technical report, WHU Otto Beisheim Graduate School of Management, 2003.
[38] K. Larsen, H.M. Soner, and G. Žitković. Facelifting in utility maximization. Finance and Stochastics, 20(1):99–121, 2016.
[39] R.C.W. Leung. Continuous–time principal–agent problem with drift and stochastic volatility control: with applications to delegated portfolio management. Technical report, Haas School of Business, University of California Berkeley, 2014.
[40] Y. Lin, Z. Ren, N. Touzi, and J. Yang. Second order backward SDE with random terminal time. arXiv preprint arXiv:1802.02260, 2018.
[41] Y. Lin, Z. Ren, N. Touzi, and J. Yang. Random horizon principal–agent problem. arXiv preprint arXiv:2002.10982, 2020.
[42] J.A. Mirrlees and R.C. Raimondo. Strategies in the principal–agent model. Economic Theory, 53(3):605–656, 2013.
[43] H.M. Müller. The first–best sharing rule in the continuous–time principal–agent problem with exponential utility. Journal of Economic Theory, 79(2):276–280, 1998.
[44] H.M. Müller. Asymptotic efficiency in dynamic principal–agent problems. Journal of Economic Theory, 91(2):292–301, 2000.
[45] R.B. Myerson. Leadership, trust, and power: dynamic moral hazard in high office. University of Chicago, 2008.
[46] H. Ou-Yang. Optimal contracts in a continuous–time delegated portfolio management problem. Review of Financial Studies, 16(1):173–208, 2003.
[47] H. Pagès. Bank monitoring incentives and optimal ABS. Journal of Financial Intermediation, 22(1):30–54, 2013.
[48] H. Pagès and D. Possamaï. A mathematical treatment of bank monitoring incentives. Finance and Stochastics, 18(1):39–73, 2014.
[49] A. Papapantoleon, D. Possamaï, and A. Saplaouras. Existence and uniqueness for BSDEs with jumps: the whole nine yards. Electronic Journal of Probability, 23(121):1–68, 2018.
[50] T. Piskorski and A. Tchistyi. Optimal mortgage design. Review of Financial Studies, 23(8):3098–3140, 2010.
[51] T. Piskorski and M.M. Westerfield. Optimal dynamic contracts with moral hazard and costly monitoring. Journal of Economic Theory, 166:242–281, 2016.
[52] W.P. Rogerson. The first–order approach to principal–agent problems. Econometrica, 53(6):1357–1368, 1985.
[53] Y. Sannikov. Agency problems, screening and increasing credit lines. Princeton university, 2007.
[54] Y. Sannikov. A continuous–time version of the principal–agent problem. The Review of Economic Studies, 75(3):957–984, 2008.
[55] Y. Sannikov. Contracts: the theory of dynamic principal–agent relationships and the continuous–time approach. In D. Acemoglu, M. Arellano, and E. Dekel, editors, Advances in economics and econometrics, 10th world congress of the Econometric Society, volume 1, economic theory, number 49 in Econometric Society Monographs, pages 89–124. Cambridge University Press, 2013.
[56] H. Schättler and J. Sung. The first–order approach to the continuous–time principal–agent problem with exponential utility. Journal of Economic Theory, 61(2):331–371, 1993.
[57] U. Schmock, S.E. Shreve, and U. Wystup. Valuation of exotic options under shortselling constraints. Finance and Stochastics, 6(2):143–172, 2002.
[58] M.D. Schroder, S. Sinha, and S. Levental. The continuous–time principal–agent problem with moral hazard and recursive preferences. Technical report, Michigan State University, 2010.
[59] H.M. Soner and N. Touzi. Superreplication under gamma constraints. SIAM Journal on Control and Optimization, 39(1):73–96, 2000.
[60] H.M. Soner and N. Touzi. Dynamic programming for stochastic target problems and geometric flows. Journal of the European Mathematical Society, 4(3):201–236, 2002.
[61] H.M. Soner and N. Touzi. The problem of super–replication under constraints. In P. Bank, E. Baudoin, H. Föllmer, L.C.G. Rogers, H.M. Soner, and N. Touzi, editors, Paris–Princeton lectures on mathematical finance 2002, volume 1814 of Lecture notes in mathematics, pages 133–172. Springer, 2003.
[62] H.M. Soner and N. Touzi. Hedging under gamma constraints by optimal stopping and face–lifting. Mathematical Finance, 17(1):59–79, 2007.
[63] S.E. Spear and S. Srivastava. On repeated moral hazard with discounting. The Review of Economic Studies, 54(4):599–617, 1987.
[64] D.W. Stroock and S.R.S. Varadhan. Multidimensional diffusion processes, volume 233 of Grundlehren der mathematischen Wissenschaften. Springer–Verlag Berlin Heidelberg, 1997.
[65] B. Strulovici and M. Szydlowski. On the smoothness of value functions and the existence of optimal strategies in diffusion models. Journal of Economic Theory, 159:1016–1055, 2015.
[66] J. Sung. Linearity with project selection and controllable diffusion rate in continuous–time principal–agent problems. The RAND Journal of Economics, 26(4):720–743, 1995.
[67] J. Sung. Corporate insurance and managerial incentives. Journal of Economic Theory, 74(2):297–332, 1997.
[68] N. Van Long and G. Sorger. A dynamic principal–agent problem as a feedback Stackelberg differential game. Central European Journal of Operations Research, 18(4):491–509, 2010.
[69] M.M. Westerfield. Optimal dynamic contracts with hidden actions in continuous time. University of Southern California, 2006.
[70] N. Williams. On dynamic principal–agent problems in continuous time. University of Wisconsin, Madison, 2008.
[71] N. Williams. Persistent private information. Econometrica, 79(4):1233–1275, 2011.
[72] N. Williams. A solvable continuous time dynamic principal–agent model. Journal of Economic Theory, 159(part B):989–1015, 2015.
[73] Y. Zhang. Dynamic contracting with persistent shocks. Journal of Economic Theory, 144(2):635–675, 2009.
[74] Y. Zhou. Principal–agent analysis in continuous–time. Technical report, The Chinese University of Hong Kong, 2006.
[75] J.Y. Zhu. Sticky incentives and dynamic agency. PhD thesis, University of California Berkeley, 2011.