Contracting over persistent information
WEI ZHAO, CLAUDIO MEZZETTI, LUDOVIC RENOU, AND TRISTAN TOMALA

ABSTRACT. We consider a dynamic moral hazard problem between a principal and an agent, where the sole instrument the principal has to incentivize the agent is the disclosure of information. The principal aims at maximizing the (discounted) number of times the agent chooses a particular action, e.g., to work hard. We show that there exists an optimal contract, where the principal stops disclosing information as soon as its most preferred action is a static best reply for the agent, or else continues disclosing information until the agent perfectly learns the principal's private information. If the agent perfectly learns the state, he learns it in finite time with probability one; the more patient the agent, the later he learns it.
KEYWORDS: Dynamic, contract, information, revelation, disclosure, sender, receiver, persuasion.

JEL CLASSIFICATION: C73; D82.
Date: July 14, 2020. Claudio Mezzetti gratefully acknowledges financial support from the Australian Research Council Discovery grant DP190102904. Ludovic Renou gratefully acknowledges the support of the Agence Nationale pour la Recherche under grant ANR CIGNE (ANR-15-CE38-0007-01) and through the ORA Project "Ambiguity in Dynamic Environments" (ANR-18-ORAR-0005). We thank Emmanuel Macron and Boris Johnson for granting us an extended period of concentrated research.
1. INTRODUCTION
Public authorities routinely attempt to persuade the public to follow specific recommendations. For instance, national health services regularly promote the health benefits of eating fruits and vegetables, of exercising, or of limiting alcohol consumption. As another instance, during the Covid-19 outbreak, authorities were instructing the public to practice social distancing. In these instances, as in many others, the public may not fully embrace the recommendations. The public may elect to follow some but not all recommendations, or to interpret them flexibly. E.g., an individual may decide that eating three portions of fruits and vegetables a day is plenty, despite a recommendation of seven. Similarly, during the Covid-19 lockdowns, an individual may decide to exercise outdoors two to three times a day, despite public authorities recommending at most one daily outdoor exercise. In most of these instances, public authorities cannot, or do not choose to, enforce their recommendations. Yet, they can still nudge the public towards full compliance with a judicious provision of information. The problem of the optimal provision of information over time is naturally not limited to persuading the public to follow some recommendations. Other examples include financial advisers persuading clients to follow their advice, firms persuading customers to purchase their products, or employers persuading employees to work hard. How best to do so?

To study this question, we consider a simple "principal-agent" model, where the sole instrument the principal has is information. The principal aims at incentivizing the agent to choose the same action, e.g., be fully compliant, buy the product or work hard, as often as possible, and can only do so by disclosing information about an unknown (binary) state, e.g., whether the policy recommendation is effective, the product is good or the task is easy. We assume the principal commits to a disclosure policy.
E.g., during the Covid-19 outbreak, the UK government committed to reporting daily the number of new cases of Covid-19 (and was choosing the testing policy). We refer to the commitment to a disclosure policy as a "contract." (That is, the principal cannot make transfers, terminate the relationship, choose allocations or constrain the agent's choices. Public Health England was providing the number of new cases; as Public Health England is subject to the Freedom of Information Act, the public had the opportunity to verify the accuracy of the numbers reported. See https://coronavirus.data.gov.uk/about.)

The main contribution of this paper is to provide a complete characterization of an optimal contract. We now discuss the salient features of the optimal contract we characterize. We do so in the context of a public authority (the principal) persuading the public
(the agent) to follow a specific recommendation. The main property of the optimal disclosure policy is its gradualism: information is gradually and continuously revealed over time. At each period, it incentivizes the agent to follow the principal's recommendation with the promise of further information disclosure in the future.

We further illustrate the main properties of our policy – particularly how beliefs evolve over time – with the help of Figure 1. Figure 1 plots four representative evolutions of the agent's belief about the "high" state – the state where the cost to incentivize the agent relative to the benefit is the highest. In each panel, the grey region "OPT" indicates the region at which following the recommendation is (statically) optimal. An arrow pointing from one belief to another indicates how the agent revised his belief within the period following a signal's realization. Within a period, the agent takes a decision after the beliefs have been revised. Arrows have different colors/patterns. At all beliefs at the end of continuous black arrows, the agent follows the principal's recommendation. At all beliefs at the end of dotted magenta arrows, he does not (and chooses what is best given his current belief).

FIGURE 1. Evolution of actions and beliefs over time. (Four panels (A)–(D), each plotting beliefs against periods, with the grey region "OPT".)

The following are general properties of our optimal policy. The first observation to make is that the agent continuously updates his belief until either he perfectly learns the state
or following the recommendation becomes (statically) optimal. Moreover, if the agent learns the state, he learns it in finite time. We provide an explicit characterization of the priors at which the agent eventually learns the state.

Second, along the paths at which the agent follows the recommendation, his beliefs about the "high" state are decreasing. Intuitively, the optimal contract exploits the asymmetry in opportunity costs and lowers the perceived opportunity cost – hence making it easier to incentivize the agent – by sometimes informing him when the opportunity cost is high.

Third, with the exception of panel (D), the policy does not disclose information to the agent at the first period. Thus, adopting the definition of persuasion as the act of changing the agent's beliefs prior to him making any decision, information disclosure rewards the agent for following the recommendation, but does not persuade him in panels (A), (B) and (C). Yet, as panel (D) illustrates, the policy sometimes needs to persuade the agent. For instance, if the promise of full information disclosure at the next period would not incentivize the agent, then persuading the agent is necessary, that is, the policy must generate a strictly positive value of information. However, there are other circumstances at which persuading the agent may be necessary. Persuading the agent may sufficiently reduce the perceived opportunity cost to compensate for the instantaneous loss.

Finally, with the exception of panel (B), the policy does not induce the agent to believe that following the recommendation is optimal. This is markedly different from what we would expect from the static analysis of Kamenica and Gentzkow [2011]. Intuitively, the "static" policy is sub-optimal because it does not extract all the informational surplus it creates, that is, it creates a strictly positive value of information, but does not extract it all.
(The participation constraint of the agent does not bind.) And even in panel (B), the beliefs do not jump immediately to the "OPT" region. In fact, the belief process may approach the "OPT" region only asymptotically.

Related literature.
(To be precise, under our policy, upon perceiving the signal "the opportunity cost is high," the agent learns that this is indeed true. However, the signal is not sent with probability one; this corresponds to the arrows ending at 1 in Figure 1.)

The paper is part of the literature on Bayesian persuasion, pioneered by Kamenica and Gentzkow [2011], and recently surveyed by Kamenica [2019]. The two most closely related papers are Ball [2019] and Ely and Szydlowski [2020]. In common with our paper, both papers study the optimal disclosure of information in dynamic games and show how the disclosure of information can be used as an incentive tool. The observation that information can be used to incentivize agents is not new and dates back to the literature on repeated games with incomplete information, e.g., Aumann et al. [1995]. See Garicano and Rayo [2017] and Fudenberg and Rayo [2019] for some more recent papers exploring the role of information provision as an incentive tool.

The class of dynamic games studied differs considerably from one paper to another, which makes comparisons difficult. In Ely and Szydlowski [2020], the agent has to repeatedly decide whether to continue working on a project or to quit; quitting ends the game. The principal aims at maximizing the number of periods the agent works on the project and can only do so by disclosing information about the complexity of the project, modeled as the number of periods required to complete the project. Thus, their dynamic game is a quitting game, while ours is a repeated game. When the project is either easy or difficult, the optimal disclosure policy initially persuades the agent that the task is easy, so that he starts working. (Naturally, if the agent is sufficiently convinced that the project is easy, there is no need to persuade him initially.) If the project is in fact difficult, the policy then discloses it at a later date, when completing the project is now within reach. A main difference with our optimal disclosure policy is that information comes in lumps in Ely and Szydlowski [2020], i.e., information is disclosed only at the initial period and at a later period, while information is continuously disclosed in our model. In the words of Ely and Szydlowski, our policy continuously leads the agent towards following the recommendation. Another main difference is as follows.
In Ely and Szydlowski, only when the promise of full information disclosure at a later date is not enough to incentivize the agent to start working does the principal persuade the agent initially. This is not so with our policy: the principal persuades the agent in a larger set of circumstances. This initial persuasion reduces the cost of incentivizing the agent in future periods.

(See Orlov et al. [2019] and Smolin [2018] for two other papers on quitting games and information disclosure. To illustrate the difficulties of comparing models, Orlov et al. [2019] consider a quitting game, where the principal also aims at delaying the quitting time as far as possible. The quitting time is the time at which the agent decides to exercise an option, which has different values to the principal and the agent. The principal chooses a disclosure policy informing the agent about the option's value. The optimal disclosure policy is to fully reveal the state with some delay. Note that the principal is referred to as the agent in their work. This policy is not optimal in Ely and Szydlowski [2020]. Seemingly innocuous differences in models have important consequences.)

Ball [2019] studies a continuous time model of information provision, where the state changes over time and payoffs are the ones of the quadratic example of Crawford and
Sobel [1982]. Ball shows that the optimal disclosure policy requires the sender to disclose the current state at a later date, with the delay shrinking over time. The main difference between his work and ours is the persistence of the state (also, we consider two different classes of games). When the state is fully persistent, as in Ely and Szydlowski [2020] and our model, full information disclosure with delay is not optimal in general. (See Example 1.)

2. THE PROBLEM
A principal and an agent interact over an infinite number of periods, indexed by t ∈ {1, 2, . . . }. At the first stage, before the interaction starts, the principal learns a payoff-relevant state ω ∈ Ω = {ω₀, ω₁}, while the agent remains uninformed. The prior probability of state ω is p₀(ω) > 0. At each period t, the principal sends a signal s ∈ S and, upon observing the signal s, the agent takes decision a ∈ A. The sets A and S are finite. The cardinality of S is as large as necessary for the principal to be unconstrained in his signaling. (From Makris and Renou [2020], it is enough to have the cardinality of S as large as the cardinality of A.)

We assume that there exists a∗ ∈ A such that the principal's payoff is strictly positive whenever a∗ is chosen, and zero otherwise. The principal's payoff function is thus v : A × Ω → R, with v(a∗, ω₀) > 0, v(a∗, ω₁) > 0 and v(a, ω₀) = v(a, ω₁) = 0 for all a ∈ A \ {a∗}. The agent's payoff function is u : A × Ω → R. The (common) discount factor is δ ∈ (0, 1).

We write A^{t−1} for A × · · · × A (t − 1 times) and S^{t−1} for S × · · · × S (t − 1 times), with generic elements a^{t−1} and s^{t−1}, respectively. A behavioral strategy for the principal is a collection of maps (τ_t)_t, with τ_t : A^{t−1} × S^{t−1} × Ω → ∆(S). Similarly, a behavioral strategy for the agent is a collection of maps (σ_t)_t with σ_t : A^{t−1} × S^{t−1} × S → ∆(A).

We write V(τ, σ) for the principal's payoff and U(τ, σ) for the agent's payoff under the strategy profile (σ, τ). The objective is to characterize the maximal payoff the principal achieves if he commits to a strategy τ, that is, sup_{(τ,σ)} V(τ, σ), subject to U(τ, σ) ≥ U(τ, σ′) for all σ′.
Several comments are worth making. First, we interpret the strategy the principal commits to as a contract specifying, as a function of the state, the information to be disclosed at each history of realized signals and actions. That is, the contract specifies a statistical experiment at each history of realized signals and states. The principal chooses the contract prior to learning the state. An alternative interpretation is that neither the principal nor the agent knows the state, but the principal has the ability to conduct statistical experiments contingent on past signals and actions. We can partially dispense with the commitment assumption. Indeed, since the choices of statistical experiments are observable, we can construct strategies that incentivize the principal to implement the specified statistical experiments. Second, the only additional information the agent obtains each period is the outcome of the statistical experiment. Third, the state is fully persistent and the principal perfectly monitors the action of the agent. Finally, the only instrument available to the principal is information. The principal can neither remunerate the agent, nor terminate the relationship, nor allocate different tasks to the agent. We purposefully make all these assumptions to address our main question of interest: what is the optimal way to incentivize the agent with information only?
Example 1.
Throughout the paper, we illustrate most of our results with the help of the following example. The agent has three possible actions a₀, a₁ and a∗, with a₀ (resp., a₁) the agent's optimal action when the state is ω₀ (resp., ω₁). The prior probability of ω₁ is 1/3 and the discount factor is 1/2. The per-period payoffs are in Table 1, with the first coordinate corresponding to the principal's payoff.

TABLE 1. Payoff table of Example 1

        a₀        a₁        a∗
ω₀     0, 1      0, 0      1, 1/2
ω₁     0, 0      0, 2      1, 1/2

We start with a few preliminary observations. First, regardless of the agent's belief, action a∗ is never optimal. Second, if the agent knew the state, he would choose a₀ (resp., a₁) in state ω₀ (resp., ω₁), resulting in an expected payoff of 4/3. Third, the opportunity cost of playing a∗ is the highest when the state is ω₁, i.e., u(a₁, ω₁) − u(a∗, ω₁) > u(a₀, ω₀) − u(a∗, ω₀). It is harder to incentivize the agent to play a∗ when he is more confident that the state is ω₁. As we shall see, the optimal policy exploits this property.

We now consider some simple strategies the principal may commit to. To start with, assume that the principal commits to disclose information at the initial stage only. Clearly, since a∗ is never optimal, the principal's payoff is 0. The principal must condition his information disclosure on the agent's actions. The simplest such disclosure policy is to "reward" the agent with full information disclosure for playing a∗ sufficiently often at the beginning of the relationship, say up to period T∗. Note that if the agent deviates, the harshest punishment the principal can impose is to withhold all information in all subsequent periods, inducing a normalized expected payoff of 2/3. (The simplest such strategy is to have the agent play a ≠ a∗ in all future periods after a deviation.) We are thus looking for the largest T∗ such that

(1 − δ)[ (1/2)(1 + δ + · · · + δ^{T∗−1}) + (4/3)(δ^{T∗} + δ^{T∗+1} + · · · ) ] ≥ 2/3.

Since δ = 1/2, this reduces to 1/2 + (5/6)·2^{−T∗} ≥ 2/3, i.e., 2^{−T∗} ≥ 1/5. With such a simple strategy, T∗ = ⌊ln(5)/ln(2)⌋ = 2 and the principal's payoff is 1 − δ^{T∗} = 3/4. There is yet another simple strategy the principal can commit to. The principal can commit to a "recursive" policy, where he fully discloses the state with probability α at period t (and withholds all information with the complementary probability) if the agent plays a∗ at period t − 1.
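The cutoff T∗ and the delay policy's payoff can be checked numerically. A minimal sketch, treating the Example 1 values above as given (δ = 1/2, the agent's flow payoff 1/2 from a∗, full-information payoff 4/3, outside option 2/3); these numbers are the reconstruction used throughout this section:

```python
from math import floor, log

delta = 0.5      # discount factor
u_star = 0.5     # agent's per-period payoff from playing a*
full_info = 4/3  # agent's payoff once the state is fully disclosed
outside = 2/3    # static best-reply (punishment) payoff m(p0)

# Incentive constraint: u_star*(1 - delta**T) + full_info*delta**T >= outside,
# i.e. T <= log_{1/delta}((full_info - u_star) / (outside - u_star)).
T_star = floor(log((full_info - u_star) / (outside - u_star), 1 / delta))

# The principal collects 1 per period while the agent plays a*.
payoff_delay = 1 - delta ** T_star
print(T_star, payoff_delay)  # prints: 2 0.75
```

The ratio inside the logarithm equals 5, reproducing the ⌊ln(5)/ln(2)⌋ = 2 cutoff and the 3/4 payoff of the delay policy.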
(Again, if the agent deviates, the harshest punishment is to withhold all information in all subsequent periods.) Thus, if we write V (resp., U) for the principal's (resp., agent's) payoff, the best recursive policy is to choose α so as to maximize

V = (1/2) · 1 + (1/2)(1 − α)V,

subject to

U = (1/2)(1/2) + (1/2)[(1 − α)U + α(4/3)] ≥ 2/3.

The principal's best payoff is V = 4/5 with α = 1/4. The "recursive" policy does better than the policy of fully disclosing the state with delay. Intuitively, the "recursive" policy performs better because it makes it possible to incentivize the agent to play a∗ for a discounted number of periods slightly larger than under full disclosure with delay. (In continuous time, both policies would be equivalent.) As full information with delay plays an important role in the work of Ball [2019] and Orlov et al. [2019], we will compare the recursive policy with our optimal policy later on. For now, it suffices to say that it is not optimal in Example 1.

3. OPTIMAL CONTRACTS
This section fully characterizes the optimal contract and discusses its most salient properties. We first start with a recursive formulation.

3.1. A recursive formulation.
The first step in deriving an optimal contract is to reformulate the principal's problem as a recursive problem. To do so, we introduce two state variables. First, it is well-known that classical dynamic contracting problems admit recursive formulations if one introduces promised continuation payoffs as a state variable and imposes promise-keeping constraints, e.g., Spear and Srivastava [1987]. The second state variable we need to introduce is beliefs. We now turn to a formal reformulation of the problem.

We first need some additional notation. For any p ∈ ∆(Ω), we let u(a, p) := Σ_ω p(ω)u(a, ω) be the agent's expected payoff of choosing a when his belief is p, m(p) := max_{a∈A} u(a, p) be the agent's best payoff when his belief is p, and M(p) := Σ_ω p(ω) max_{a∈A} u(a, ω) be the agent's expected payoff if he learns the state prior to choosing an action. It is worth noting that m is a piece-wise linear convex function, that M is linear and that m(p) ≤ M(p) for all p. Similarly, we let v(a, p) be the principal's payoff when the agent chooses a and the principal's belief is p. Finally, let P := {p ∈ [0,
1] : m(p) = u(a∗, p)} be the set of beliefs at which a∗ is optimal. If non-empty, the set P is the closed interval [p̲, p̄].

Let W ⊆ [0, 1] × R be such that (p, w) ∈ W if and only if w ∈ [m(p), M(p)]. Throughout, we restrict attention to functions V : W → R, with the interpretation that V(p, w) is the principal's payoff if he promises a continuation payoff of w to the agent when the agent's current belief is p.

The principal's maximal payoff is V∗(p₀, m(p₀)), where V∗ is the unique fixed point of the contraction T, defined by

T(V)(p, w) := max_{(λ_s, (p_s, w_s), a_s)_{s∈S} ∈ ([0,1] × W × A)^S}  Σ_{s∈S} λ_s [(1 − δ)v(a_s, p_s) + δV(p_s, w_s)],

subject to:

(1 − δ)u(a_s, p_s) + δw_s ≥ m(p_s),
Σ_{s∈S} λ_s [(1 − δ)u(a_s, p_s) + δw_s] ≥ w,
Σ_{s∈S} λ_s p_s = p,
Σ_{s∈S} λ_s = 1.

(A nearly identical reformulation has already appeared in Ely [2015], one of the working versions of Ely [2017]. We remind the reader that Ely [2017] analyzes the interaction between a long-run principal and a sequence of short-run agents. See also Renault et al. [2017]. While discussing the extension of his model to the interaction between a long-run principal and a long-run agent, Ely [2015] has derived a recursive reformulation nearly identical to ours. However, he did not go further. We start from the recursive formulation and use it to derive an optimal policy. See Section A.2 for a detailed comparison of both formulations.)

We briefly comment on the maximization program. The first constraint is an incentive constraint: the agent must have an incentive to play a_s when w_s is the agent's promised continuation payoff and p_s the agent's belief. To understand the right-hand side, observe that the agent can always play a static best-reply to any belief he has, so that his expected payoff must be at least m(p_s) when his current belief is p_s. Conversely, if the contract specifies action a_s and the agent does not execute that action, the contract can specify no further information revelation, in which case the agent's payoff is at most m(p_s). Therefore, m(p_s) is the agent's min-max payoff. The second constraint is the promise-keeping constraint: if the principal promises the continuation payoff w at a period, the contract must honor that promise in subsequent periods. The third constraint states that the policy selects a splitting of p, that is, a distribution over posteriors with expectation p.

Throughout, we slightly abuse notation and denote by τ a policy, that is, a function from W to ([0, 1] × W × A)^{|S|}. A policy is feasible if it specifies a feasible tuple ((λ_s, (p_s, w_s), a_s))_{s∈S} for each (p, w), i.e., a tuple satisfying the constraints of the maximization problem T(V)(p, w).

Two important observations are worth making. First, for any function V, T(V) is a concave function in (p, w). Concavity reflects the fact that the more information the principal discloses, the harder it is to reward the agent in the future. Second, T(V) is a decreasing function in w, that is, the more the principal promises to the agent, the harder it is to incentivize the agent to play a∗. We will repeatedly make use of these two properties, which we formally record in the following proposition.
Proposition 1.
The value function V∗ is concave in both arguments and decreasing in w.

Proposition 1 together with the recursive formulation has a number of additional implications. First, there is at most one signal s at which the principal recommends the agent to play a∗. Moreover, whenever the principal recommends a∗, the agent is indifferent between obeying the recommendation or deviating. In other words, the promised continuation payoff does not leave rents to the agent. Second, if the principal does not recommend a∗ at a period, then the principal never recommends a∗ at a subsequent period, that is, the principal's continuation value is zero. In other words, as soon as an action other than a∗ is played, the principal stops incentivizing the agent to play a∗.

(More precisely, if the agent's belief at period τ is p_τ, he obtains the payoff m(p_τ) by playing a static best-reply. Since the function m is convex and beliefs follow a martingale, his expected payoff is therefore at least (1 − δ) Σ_{τ′≥τ} δ^{τ′−τ} E[m(p_{τ′}) | F_τ] ≥ m(p_τ), where F_τ is the agent's filtration at period τ.)

(A real-valued function f is increasing (resp., strictly increasing) if x ≥ y implies that f(x) ≥ f(y) (resp., f(x) > f(y)). The function f is (resp., strictly) decreasing if −f is (resp., strictly) increasing.)
Finally, if the principal induces the posterior p_s while recommending the action a_s and promising the continuation payoff w_s, the principal should not have an incentive to further disclose information in that period. The following proposition formally states these three implications.

Proposition 2.
For all (p, w), there exists a solution (λ_s, p_s, w_s, a_s)_{s∈S} to T(V∗)(p, w) such that

(i): There exists at most one signal s∗ ∈ S such that λ_{s∗} > 0 and a_{s∗} = a∗. Moreover, (1 − δ)u(a_{s∗}, p_{s∗}) + δw_{s∗} = m(p_{s∗}).
(ii): For all s ∈ S such that λ_s > 0 and a_s ≠ a∗, V∗(p_s, w_s) = 0.
(iii): For all s ∈ S such that λ_s > 0, we have (1 − δ)v(a_s, p_s) + δV∗(p_s, w_s) = V∗(p_s, (1 − δ)u(a_s, p_s) + δw_s).

Proposition 2 states key properties that an optimal policy possesses. We conclude this section with a partial converse, that is, we state properties that guarantee the optimality of a policy. To do so, we need to introduce two additional pieces of notation. We first let Q be the set of beliefs at which the agent has an incentive to play a∗ if he is promised full information disclosure at the next period, that is,

Q := {p ∈ [0, 1] : (1 − δ)u(a∗, p) + δM(p) ≥ m(p)}.

If Q is empty, then all policies are optimal, as the principal can never incentivize the agent to play a∗. If Q is non-empty, then it is a closed interval [q̲, q̄]. Note that q̲ = 0 if and only if a∗ is optimal at p = 0. For a graphical illustration, see Figure 2.

FIGURE 2. Construction of the set Q. (The figure plots m(·), M(·), u(a∗, ·) and (1 − δ)u(a∗, ·) + δM(·); the endpoints q̲ and q̄ are where the latter curve crosses m(·).)

Second, we write w̲(p) ∈ [m(p), M(p)] for the continuation payoff that makes the agent indifferent between playing action a∗ today and receiving the continuation payoff w̲(p) in the future, and playing a best reply to the belief p forever; that is, w̲(p) solves:

(1 − δ)u(a∗, p) + δw̲(p) = m(p).

Theorem 1.
Consider any feasible policy inducing the value function Ṽ. If Ṽ is concave in both arguments, decreasing in w and satisfies

Ṽ(p, m(p)) ≥ (1 − δ)v(a∗, p) + δṼ(p, w̲(p)) for all p ∈ Q,

then the policy is optimal.

Proof. We argue that Ṽ is the fixed point of the operator T, hence Ṽ = V∗. Let (λ_s, p_s, w_s, a_s)_{s∈S} be a solution to the maximization problem T(Ṽ)(p, w). We first start with the following observation. Consider any s such that a_s ≠ a∗. We have

(1 − δ)v(a_s, p_s) + δṼ(p_s, w_s) = δṼ(p_s, w_s) ≤ Ṽ(p_s, w_s) ≤ Ṽ(p_s, (1 − δ)u(a_s, p_s) + δw_s),

where the last inequality follows from the fact that Ṽ is decreasing in w and

m(p_s) ≤ (1 − δ)u(a_s, p_s) + δw_s ≤ (1 − δ)m(p_s) + δw_s ≤ w_s.

Consider now any s such that a_s = a∗. Since (λ_s, p_s, w_s, a_s)_{s∈S} is feasible, we have that (1 − δ)u(a∗, p_s) + δw_s ≥ m(p_s), hence p_s ∈ Q and, therefore,

Ṽ(p_s, m(p_s)) ≥ (1 − δ)v(a∗, p_s) + δṼ(p_s, w̲(p_s)),  with w̲(p_s) = [m(p_s) − (1 − δ)u(a∗, p_s)]/δ.

The concavity of Ṽ implies that

Ṽ(p_s, (1 − δ)u(a∗, p_s) + δw_s) − Ṽ(p_s, m(p_s)) ≥ δ[Ṽ(p_s, w_s) − Ṽ(p_s, w̲(p_s))],

where we use the identity (1 − δ)u(a∗, p_s) + δw_s − m(p_s) = δ(w_s − w̲(p_s)) and observation (a) about concave functions in Section A.1.

Combining the above two inequalities implies that

Ṽ(p_s, (1 − δ)u(a∗, p_s) + δw_s) ≥ (1 − δ)v(a∗, p_s) + δṼ(p_s, w_s).
It follows that

T(Ṽ)(p, w) = Σ_{s∈S} λ_s [(1 − δ)v(a_s, p_s) + δṼ(p_s, w_s)]
           ≤ Σ_{s∈S} λ_s Ṽ(p_s, (1 − δ)u(a_s, p_s) + δw_s)
           ≤ Ṽ( Σ_{s∈S} λ_s p_s, Σ_{s∈S} λ_s ((1 − δ)u(a_s, p_s) + δw_s) )
           ≤ Ṽ(p, w),

where the second inequality follows from the concavity of Ṽ and the third inequality from Ṽ being decreasing in w.

Conversely, since the policy inducing Ṽ is feasible, we must have that T(Ṽ)(p, w) ≥ Ṽ(p, w) for all (p, w). This completes the proof. □
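In Example 1, the objects Q and w̲(p) are easy to compute. A minimal numerical sketch; the payoff values (m(p) = max(1 − p, 2p), M(p) = 1 + p, u(a∗, p) = 1/2, δ = 1/2) are assumptions carried over from the reconstruction of Example 1:

```python
DELTA = 0.5

def m(p):        # agent's static best-reply payoff (a* is always dominated here)
    return max(1 - p, 2 * p)

def M(p):        # full-information payoff; linear in p
    return 1 + p

def in_Q(p):     # p in Q iff (1-d)u(a*,p) + d M(p) >= m(p), with u(a*,p) = 1/2
    return (1 - DELTA) * 0.5 + DELTA * M(p) >= m(p) - 1e-12

# Q is a closed interval; scan a grid for its endpoints.
grid = [i / 100000 for i in range(100001)]
inside = [p for p in grid if in_Q(p)]
q_lo, q_hi = inside[0], inside[-1]
print(round(q_lo, 4), round(q_hi, 4))  # prints: 0.1667 0.5

def w_bar(p):    # solves (1-d)u(a*,p) + d w = m(p)
    return (m(p) - (1 - DELTA) * 0.5) / DELTA
```

With these values Q = [1/6, 1/2]: the promise of full disclosure tomorrow can support a∗ today only at beliefs in that interval, and w̲(1/3) = 5/6 lies in [m(1/3), M(1/3)] as required.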
The objective of this section is to define a policy, which welater prove to be optimal. Throughout, the number p ∈ [0 , refers to the probabilityof ω . We denote a p a maximizer of u ( · , p ) . Without loss of generality, assume that v ( a ∗ , v ( a ∗ , ≥ m (0) − u ( a ∗ , m (1) − u ( a ∗ , , that is, the principal’s benefit of a ∗ in state ω relative to state ω ishigher than the agent’s opportunity cost in state ω relative to state ω . (A symmetricargument applies if the reverse inequality holds.) Observe that if a ∗ is optimal for theagent at p = 1 , i.e., ∈ P , then a ∗ is also optimal at p = 0 and, consequently, P = [0 , ,i.e., a ∗ is optimal at all beliefs. In what follows, we exclude this trivial case and assumethat / ∈ P .Define the functions λ : W → [0 , and ϕ : W → [0 , , with ( λ ( p, w ) , ϕ ( p, w )) the uniquesolution to: pw = λ ( p, w ) ϕ ( p, w ) m ( ϕ ( p, w )) + (1 − λ ( p, w )) m (1) . (1)for all w > m ( p ) , and λ ( p, m ( p )) , ϕ ( p, m ( p ))) = (1 , p ) . Geometrically, the solution ( ϕ ( p, w ) , m ( ϕ ( p, w )) is the unique intersection between the line connecting ( p, w ) and (1 , m (1)) and the graphof m . See Figure 3 for an illustration. Note that both functions are continuous. m ( · ) M ( · ) pw m (1) ϕ ( p, w ) m ( ϕ ( p, w )) F IGURE
3. Construction of λ and ϕ We now define a family of policies ( τ q ) q ∈ [ q ,q ] and show later the existence of q ∗ ∈ [ q , q ] such that the policy τ q ∗ is optimal. For each q ∈ [ q , q ] , there are four regions to consider: W q := (cid:110) ( p, w ) : p ∈ [0 , q ) , w ≤ q − pq m (0) + pq m ( q ) (cid:111) , W q := (cid:110) ( p, w ) : p ∈ ( q, , − p − q m ( q ) + p − q − q m (1) < w ≤ − p − q m ( q ) + p − q − q m (1) (cid:111)(cid:91) (cid:110) ( p, w ) : p ∈ [ q , q ] , w ≤ − p − q m ( q ) + p − q − q m (1) (cid:111) , W q := (cid:110) ( p, w ) : p ∈ ( q, , w ≤ − p − q m ( q ) + p − q − q m (1) (cid:111) , W q := W \ ( W q ∪ W q ∪ W q ) . The four regions partition the set W . The first region corresponds to the points ( p, w ) below the line connecting (0 , m (0)) to ( q , m ( q )) . The second region corresponds to thepoints ( p, w ) below the line connecting ( q , m ( q )) and (1 , m (1)) but above the line connect-ing ( q, m ( q )) and (1 , m (1)) . The third region corresponds ( p, w ) below the line connecting ( q, m ( q )) and (1 , m (1)) , while the fourth region corresponds to all other points. Figure 4illustrates the four regions with W q the black region, W q the region with vertical lines, W q the gray region, and W q the region with north west lines. m ( · ) M ( · ) q pw q q F IGURE
Figure 4. The regions W¹_q, W²_q, W³_q and W⁴_q.
It is worth observing that the regions W¹_q and W⁴_q do not depend on the parameter q, while the other two do. The policy τ_q differs from one region to another, as we now explain.

If (p, w) ∈ W¹_q, the policy splits p into 0 and q̲ with probability (q̲ − p)/q̲ and p/q̲, respectively. Conditional on 0, the policy recommends a₀ and promises a continuation payoff of m(0). Conditional on q̲, the policy recommends a∗ and promises a continuation payoff of w(q̲).

If (p, w) ∈ W²_q, the policy splits p into ϕ(p, w) and 1 with probability λ(p, w) and 1 − λ(p, w), respectively. Conditional on ϕ(p, w), the policy recommends action a∗ and promises a continuation payoff of w(ϕ(p, w)). Conditional on 1, the policy recommends action a₁ and promises a continuation payoff of m(1).

If (p, w) ∈ W³_q, the policy splits p into q and 1 with probability (1 − p)/(1 − q) and (p − q)/(1 − q), respectively. Conditional on 1, the policy recommends a₁ and promises a continuation payoff of m(1). Conditional on q, the policy recommends a∗ and promises a continuation payoff of w(q).

If (p, w) ∈ W⁴_q, the policy splits p into 0, q, and 1 with probability λ₀, λ_q and λ₁, respectively. Conditional on 0 (resp., 1), the policy recommends action a₀ (resp., a₁) and promises a continuation payoff of m(0) (resp., m(1)). Conditional on q, the policy recommends action a∗ and promises a continuation payoff of w(q). The probabilities (λ₀, λ_q, λ₁) ∈ R₊ × R₊ × R₊, with λ₀ + λ_q + λ₁ = 1, are the unique solution to:

λ₀ (0, m(0)) + λ_q (q, m(q)) + λ₁ (1, m(1)) = (p, w).

A solution exists since W⁴_q is included in the convex hull of (0, m(0)), (q, m(q)) and (1, m(1)).

This completes the description of the policy. Intuitively, in regions W¹_q and W³_q, the policy generates a strictly positive information value, i.e., the policy leaves rents to the agent. Clearly, this is needed when even the promise of full information disclosure at the next period does not incentivize the agent to play a∗ even once, i.e., when his belief is in [0, 1] ∖ Q.
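The weights (λ₀, λ_q, λ₁) used in region W⁴_q are barycentric coordinates of (p, w) with respect to the three vertices; eliminating λ₀ leaves a 2×2 linear system. A small sketch (the helper name `barycentric_split` is ours):

```python
def barycentric_split(p, w, q, m0, mq, m1):
    # Solve l0*(0, m0) + lq*(q, mq) + l1*(1, m1) = (p, w) with l0 + lq + l1 = 1.
    # Substituting l0 = 1 - lq - l1 leaves a 2x2 system in (lq, l1).
    a11, a12, b1 = q, 1.0, p                   # martingale constraint
    a21, a22, b2 = mq - m0, m1 - m0, w - m0    # promised-payoff constraint
    det = a11 * a22 - a12 * a21
    lq = (b1 * a22 - a12 * b2) / det
    l1 = (a11 * b2 - b1 * a21) / det
    return 1.0 - lq - l1, lq, l1

# Illustrative vertices (0, 1), (1/3, 2/3), (1, 2) and target point (0.2, 1.0):
l0, lq, l1 = barycentric_split(0.2, 1.0, 1/3, 1.0, 2/3, 2.0)
```

The weights are non-negative exactly when (p, w) lies in the triangle spanned by the three vertices, which is why the splitting is well defined on W⁴_q.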
As we shall see later, this is true even for some beliefs in Q, as a way to minimize the cost of incentivizing the agent relative to the benefit to the principal. In region W²_q, the policy does not leave rents to the agent and continuously incentivizes him to play a∗ with the promise of future information disclosure. This is the most important region. Finally, region W⁴_q is mostly for completeness. The policy enters it only at beliefs sufficiently close to q̲. For instance, in the introductory Figure 1, panel (D) represents the evolution of beliefs starting from a prior belief in the region W³_q, transitioning to the region W²_q at the next period and staying there for three periods, and transitioning then to the region W⁴_q. To illustrate the policy further, we revisit our running example.

Example 1 (continued).
We have that M(p) = 1 + p, m(p) = max(1 − p, 2p) and w(p) = 2 max(2p, 1 − p) − 1/2. Therefore, Q = [1/6, 1/2]. Assume that q = 1/3 (we will soon show that this is optimal in this example). Remember that 1/3 is the prior probability of ω₁ and δ = 1/2. Let us start with the pair (p, m(p)) = (1/3, 2/3), which is in region W²_{1/3}. The policy recommends a∗ to the agent and promises a continuation payoff of w(1/
3) = 5/6. The next state is therefore (1/3, 5/6), which is again in W²_{1/3}. If the agent has been obedient, the policy then splits the prior probability 1/3 into 3/11 and 1 with probability 11/12 and 1/12, respectively. To see this, note that we indeed have:

5/6 = (22/24) m(3/11) + (2/24) m(1).

Conditional on the posterior 3/11, the policy recommends a∗ to the agent and promises a continuation payoff of w(3/
11) = 21/22. Conditional on the posterior 1, the policy recommends a₁ and promises a continuation payoff of m(1) = 2. Therefore, the next state is either (3/11, 21/22) or (1, 2), with the former again in W²_{1/3}. In the latter case, the policy yet again recommends a₁ and a continuation payoff of 2. In the former case, the policy splits 3/11 into 7/39 and 1, with probability 39/44 and 5/44, respectively. Conditional on the posterior 7/39, the policy recommends a∗ to the agent and promises a continuation payoff of w(7/
39) = 89/78. Conditional on the posterior 1, the policy recommends a₁ and promises a continuation payoff of m(1) = 2. Finally, at the state (7/39, 89/78), which is in region W⁴_{1/3}, the policy does a penultimate split of 7/39 into 0, 1/3 and 1 with probability 148/
195, 18/195 and 29/195, respectively. Conditional on the posterior 1/3, the policy recommends a∗ and promises a continuation payoff of 4/3, i.e., full information disclosure at the next period. Thus, the policy fully discloses the state in finite time to the agent. See Figure 5 for the evolution of the beliefs at the beginning of each period. At all beliefs other than 0 and 1, the agent is recommended to play a∗. The principal's expected payoff can then be computed directly from this sequence of splits.
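The W²-stage of this computation can be replayed mechanically with exact rational arithmetic. The sketch below uses our reconstruction of the example's primitives (m(p) = max(1 − p, 2p), u(a∗, ·) ≡ 1/2, δ = 1/2, hence w(p) = 2m(p) − 1/2); the helper names are ours.

```python
from fractions import Fraction as F

def m(p):                          # reconstructed static value: max(1 - p, 2p)
    return max(1 - p, 2 * p)

def w(p):                          # promise keeping the agent indifferent:
    return 2 * m(p) - F(1, 2)      # w(p) = (m(p) - (1 - delta) u) / delta with delta = u = 1/2

def w2_split(p, wv):
    # Split (p, wv) onto (phi, m(phi)) and (1, m(1) = 2). For phi <= 1/3 we have
    # m(phi) = 1 - phi, so the martingale and promise equations solve in closed form.
    lam = (3 - p - wv) / 2
    phi = 1 - (1 - p) / lam
    return lam, phi

path = []
p = F(1, 3)
for _ in range(2):                 # the two W2-region splits described above
    lam, phi = w2_split(p, w(p))
    path.append((lam, phi, w(phi)))
    p = phi
```

The loop reproduces the splits above: (11/12, 3/11) with continuation 21/22, then (39/44, 7/39) with continuation 89/78.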
Figure 5. Evolution of the beliefs.

3.3.
Construction of q∗ and optimality. Let V_q : W → R be the value function induced by the policy τ_q. Note that, for all q, V_q(1, m(1)) = 0 since a∗ is not optimal at p = 1, and V_q(0, m(0)) = 0 if a∗ is not optimal at p = 0 (resp., = v(a∗, 0) if a∗ is optimal at p = 0). Also, V_q(q̲, m(q̲)) = (1 − δ)v(a∗, q̲) if q̲ > 0 (resp., V_q(0, m(0)) = v(a∗, 0) if q̲ = 0, since a∗ is then optimal at p = 0). Therefore, any two policies τ_q and τ_{q′} induce the same values at all (p, w) ∈ W¹_q ∪ W⁴_q = W¹_{q′} ∪ W⁴_{q′}. (Remember that the regions W¹_q and W⁴_q do not vary with q; see Figure 4.) Similarly, any two policies τ_q and τ_{q′} induce the same values at all (p, w) ∈ W³_q ∩ W³_{q′}. In particular, τ_q and τ_{q̄} induce the same values at all (p, w) ∈ W ∖ W³_q. Finally, at all (p, w) ∈ W³_q,

V_q(p, w) = ((1 − p)/(1 − q)) V_q(q, m(q)) = ((1 − p)/(1 − q)) V_{q̄}(q, m(q)).

(See Section A.4 for more details.) We are now ready to state our main result. Let

q∗ = sup{p ∈ [q̲, q̄] : V_{q̄}(p, m(p)) ≥ V_{q̄}(p, w) for all w}.

Theorem 2.
The policy τ_{q∗} is optimal and, therefore, V_{q∗} = V∗.

To understand the role of q∗, recall that for all p ∈ (q∗, q̄], the policy leaves rents to the agent. (Specifically, the agent is promised a payoff of ((1 − p)/(1 − q∗)) m(q∗) + ((p − q∗)/(1 − q∗)) m(1) > m(p).) To minimize the rents left to the agent, we would therefore like to have q∗ as high as possible, i.e., equal to q̄. However, V_{q̄}(·, m(·)) is not guaranteed to be concave in p, a necessary condition for optimality. To see why concavity is necessary, consider any pair (p, p′) ∈ [0, 1] × [0, 1] and α ∈ [0, 1]. We have

αV∗(p, m(p)) + (1 − α)V∗(p′, m(p′)) ≤ V∗(αp + (1 − α)p′, αm(p) + (1 − α)m(p′)) ≤ V∗(αp + (1 − α)p′, m(αp + (1 − α)p′)),

where the first inequality follows from the concavity of V∗ in both arguments and the second from V∗ being decreasing in w together with the convexity of m. The optimal choice of q∗ is thus the largest q which guarantees V_q(·, m(·)) to be concave.

More precisely, as we show in Section A.5, the definition of q∗ guarantees that V_{q∗} is concave in both arguments and decreasing in w, so that V_{q∗}(·, m(·)) is a concave function of p. We also prove that V_{q∗}(p, m(p)) ≥ V_{q̄}(p, m(p)) for all p. Since it is clearly the smallest such concave function, V_{q∗}(·, m(·)) is the concavification of V_{q̄}(·, m(·)). In particular, q∗ = q̄ if V_{q̄}(·, m(·)) is already concave. Figure 6 illustrates the concavification in the context of Example 1.
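The concavification step can be illustrated numerically: sample a candidate value function on a grid and take its upper concave envelope with a standard monotone-chain scan. This is a generic sketch (the function name `concavify` is ours), not the paper's construction of q∗ itself.

```python
def concavify(xs, ys):
    """Upper concave envelope of points (xs[i], ys[i]) with xs increasing.

    Scan left to right, popping the last kept point whenever it falls below
    the chord joining its neighbours (chord slopes must stay decreasing).
    """
    hull = []
    for x, y in zip(xs, ys):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (x - x2) <= (y - y2) * (x2 - x1):
                hull.pop()    # middle point lies below the chord: drop it
            else:
                break
        hull.append((x, y))
    return hull

# A dip below the chord is concavified away; concave data are left untouched.
env = concavify([0.0, 0.5, 1.0], [0.0, 0.1, 1.0])
```

Here `env` keeps only the endpoints, i.e., the envelope is the chord from (0, 0) to (1, 1).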
Figure 6. The concavification of V_{q̄}(·, m(·)) in Example 1.

The policy we construct leaves rents to the agent for all priors in [0, q̲) ∪ (q∗, q̄], that is, the (ex-ante) participation constraint does not bind. This is quite natural for all priors in [0, 1] ∖ Q, since the agent cannot then be incentivized to play a∗ even once. In the language of Ely and Szydlowski [2020], "the goalposts need to move," that is, one needs to disclose information at the ex-ante stage to persuade the agent to play a∗. However, our policy also leaves rents for all priors in (q∗, q̄]. The intuitive reason is that the initial information disclosure reduces the cost of incentivizing the agent in subsequent periods sufficiently to compensate for the initial loss. (When the realized posterior is 1, the agent never plays a∗, thus creating the loss.)

3.4. Properties of the policy.
The policy we construct has a number of noteworthyfeatures, which we now explore.
Information disclosure.
The policy discloses information gradually over time, with beliefs evolving until either the agent learns the state or believes that a∗ is (statically) optimal. We can be more specific. Let Q∞ = [p̲, q̄∞], with q̄∞ the solution to

m(q̄∞) = (1 − δ)u(a∗, q̄∞) + δ(((1 − q̄∞)/(1 − p̲)) m(p̲) + ((q̄∞ − p̲)/(1 − p̲)) m(1)),

if P is non-empty, and Q∞ = ∅ otherwise. (Remember that P is the set of beliefs at which a∗ is statically optimal.) Note that P ⊆ Q∞. See Figure 7 for a graphical illustration.
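The threshold q̄∞ is easy to compute by bisection for concrete primitives. A sketch with simple two-action primitives of our own choosing (m(p) = max(1 − p, p), u(a∗, p) = 1 − p, δ = 1/2, p̲ = inf P = 0); the function name `q_bar_inf` is ours:

```python
def q_bar_inf(m, u_star, delta, p_lo, tol=1e-10):
    # Bisect for the belief at which one period of a* plus the promised split
    # onto (p_lo, m(p_lo)) and (1, m(1)) exactly matches the outside option m(q).
    def gap(q):
        line = ((1.0 - q) * m(p_lo) + (q - p_lo) * m(1.0)) / (1.0 - p_lo)
        return (1.0 - delta) * u_star(q) + delta * line - m(q)

    lo, hi = 0.5, 1.0 - 1e-9        # gap is positive at 0.5, negative near 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if gap(mid) >= 0 else (lo, mid)
    return 0.5 * (lo + hi)

# With these assumed primitives the threshold works out to 2/3.
q_inf = q_bar_inf(lambda p: max(1.0 - p, p), lambda p: 1.0 - p, 0.5, 0.0)
```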
Figure 7. Construction of q̄∞.

Intuitively, the set Q∞ has the "fixed-point property": if one starts with a prior p ∈ Q∞, then the belief ϕ(p, w(p)) ∈ Q∞. Since ϕ(p, w(p)) ≤ p (with a strict inequality if p ∉ P), we thus obtain a decreasing sequence of beliefs converging to an element of P. See panel (B) of Figure 1 for an illustration.

At all priors in [0, 1] ∖ Q∞, there exists T_δ < ∞ such that the belief process is absorbed in the degenerate beliefs 0 or 1 after at most T_δ periods. In other words, the agent learns the state for sure in finite time. The number of periods T_δ corresponds to the maximal number of periods the agent can be incentivized to play a∗. (We provide an explicit computation in Section A.4; in Example 1, the state is fully revealed at the beginning of the fifth period.) Moreover, the date T_δ is increasing in δ and converges to +∞ as δ converges to 1. (Note that it is also uniform in that it does not depend on p ∈ [0, 1] ∖ Q∞.)

At all priors in Q∞, the belief process is either absorbed in 1 or in some p ∈ P; the process may be absorbed only asymptotically.

The economics of our policy.
We now provide some economic insights as to why our policy is optimal. From Makris and Renou [2020], we can view the principal's problem as selecting, among the set of Bayes correlated equilibria of the decision problem the agent faces, an equilibrium maximizing the principal's payoff. At a Bayes correlated equilibrium, an omniscient mediator makes periodic recommendations of actions to the agent. The recommendation at period t depends on all past recommendations, past actions and payoff-relevant states. At an equilibrium, the agent has an incentive to be obedient, provided he has been obedient in the past. Denote π_t(a | a^{t−1}, ω) the probability of recommending action a at period t conditional on the profile of past recommendations (and actions) a^{t−1} and the state ω, and let

π^{t−1}(a^{t−1} | ω) = π_1(a_1 | ω) × π_2(a_2 | a_1, ω) × ⋯ × π_{t−1}(a_{t−1} | a^{t−2}, ω).

At a Bayes correlated equilibrium, the principal's payoff is

(1 − δ) Σ_ω p(ω) (Σ_t Σ_{a^{t−1}} δ^{t−1} π_t(a∗ | a^{t−1}, ω) π^{t−1}(a^{t−1} | ω)) v(a∗, ω) = λ∗ v(a∗, p∗),

where

λ∗ := (1 − δ) Σ_ω p(ω) (Σ_t Σ_{a^{t−1}} δ^{t−1} π_t(a∗ | a^{t−1}, ω) π^{t−1}(a^{t−1} | ω))

is the discounted probability of recommending action a∗ and

p∗ := (1 − δ) p(ω₁) Σ_t Σ_{a^{t−1}} δ^{t−1} π_t(a∗ | a^{t−1}, ω₁) π^{t−1}(a^{t−1} | ω₁) / λ∗

is the average discounted probability of ω₁ when a∗ is recommended. Notice that p∗ cannot be lower than q̲ since the agent would never play a∗ at lower beliefs. Similarly, let p⋆ be the average discounted probability of ω₁ when a∗ is not recommended. Since the belief process is a martingale, λ∗ p∗ + (1 − λ∗) p⋆ = p.

We now turn our attention to the agent's expected payoff. Since the agent cannot obtain more than max_{a ∈ A} u(a, ω) whenever he does not play a∗, his expected payoff is bounded from above by:

λ∗ u(a∗, p∗) + (1 − λ∗) M(p⋆) = λ∗ (u(a∗, p∗) − M(p∗)) + M(p).
Moreover, since the agent's payoff must be at least m(p), there exists a number c ≥ 0 such that

λ∗ (u(a∗, p∗) − M(p∗)) + M(p) − c = m(p).

The number c captures two effects. First, the optimal solution may leave some rents to the agent, so that the agent's payoff is m(p) + c₁ for some c₁ ≥ 0. Second, the agent's payoff may be bounded away from the upper bound by some number c₂ ≥ 0. It follows that c = c₁ + c₂.

(More accurately, we should write π_t(a | a^{t−1}, â^{t−1}, ω), where the second component â^{t−1} corresponds to past recommendations. Since our discussion focuses on the equilibrium path, i.e., when a^{t−1} = â^{t−1}, the simpler notation suffices.)
We can then rewrite the principal's expected payoff as:

(v(a∗, p∗)/(M(p∗) − u(a∗, p∗))) (M(p) − m(p) − c).

The first term captures the average benefit of incentivizing the agent to play a∗ relative to the cost. Since v(a∗, 0)/v(a∗, 1) ≥ (m(0) − u(a∗, 0))/(m(1) − u(a∗, 1)), it is decreasing in p∗. Ceteris paribus, the lower the average belief at which the agent plays a∗, the higher the principal's expected payoff. In a sense, this term represents the "debt" the principal accumulates over time as the agent repeatedly plays a∗. The second term captures how the principal repays his debt with his only instrument: information. The term M(p) − m(p) is the maximal value of information the principal can create. Ceteris paribus, the principal's payoff is decreasing in c; that is, the best is to leave no rents to the agent and to create as much information as necessary to repay the agent. Notice that c = 0 is only achieved by both leaving no rents to the agent and fully informing the agent of the state (i.e., whenever the agent does not play a∗, his belief is either 0 or 1).

In general, the principal needs to trade off a lower p∗ for a lower c. It is worth noting that our policy guarantees c = 0 for all p ∈ [q̲, q∗]. The principal uses to the full extent possible the information available to him. For p ∈ [0, q̲], it is then clear why our policy of splitting p into 0 (= p⋆) and q̲ (= p∗) is optimal. It minimizes the average cost of incentivizing the agent to play a∗, attains the upper bound for the agent's expected payoff since M(0) = m(0), i.e., c₂ = 0, and leaves as little rents as possible. A similar argument applies to p ∈ [q̄, 1]. When p ∈ (q∗, q̄], the policy leaves strictly positive rents to the agent, i.e., c > 0. The gain is to reduce the average cost of incentivizing a∗ in the future, which outweighs the cost.

Non-uniqueness and comparison with the KG policy.
The policy is not always uniquely optimal. We demonstrate the non-uniqueness with the help of a simple example and then discuss how our policy compares with the KG policy (short for the Kamenica-Gentzkow policy).
Example 2.
The agent has two possible actions a₀ and a₁, with a₀ (resp., a₁) the agent's optimal action when the state is ω₀ (resp., ω₁). The principal wants to induce a₀ as often as possible, i.e., a∗ = a₀. The discount factor is 1/2. The payoffs are in Table 2, with the first coordinate corresponding to the principal's payoff.

In Example 2, we have that: m(p) = max(1 − p, p), M(p) = 1 and u(a∗, p) = 1 − p. Thus, a∗ is optimal for all p ∈ P = [0, 1/2]. Moreover, Q = [0, 2/3] and w(p) = 3p − 1 for p ∈ (1/2, 2/3].
Table 2. Payoff table of Example 2

          a₀        a₁
ω₀      1, 1      0, 0
ω₁      1, 0      0, 1

We now provide an explicit characterization of the value function. We first compute the value function V_{q̄}(·, m(·)) and check whether it is concave. For p ∈ [0, 1/2], the policy recommends a∗ and promises a continuation payoff of m(p). That is, since a∗ is optimal, the principal does not need to incentivize the agent. For p ∈ (1/2, 2/3], the policy recommends a∗ and promises a continuation payoff of w(p). At (p, w(p)) with p ∈ (1/2, 2/3], the policy splits p into ϕ(p, w(p)) and 1, with probability λ(p, w(p)) and 1 − λ(p, w(p)), respectively. (See Equation (1).) We obtain that λ(p, w(p)) = 3 − 4p and ϕ(p, w(p)) = (2 − 3p)/(3 − 4p). Note that ϕ(p, w(p)) = (2 − 3p)/(3 − 4p) ≤ 1/2 since p ∈ (1/2, 2/3]. After splitting p into ϕ(p, w(p)), the principal therefore obtains a payoff of 1 in all subsequent periods. It follows that the principal's expected payoff is
(1/2) + (1/2)λ(p, w(p)) = 2(1 − p). Finally, if p ∈ (2/3, 1], the policy splits p into 2/3 and 1 with probability 3(1 − p) and 1 − 3(1 − p), respectively. The principal's expected payoff is then 3(1 − p) × [
(1/2) + (1/2)λ(2/3, w(2/3))] = 3(1 − p) × (2/3) = 2(1 − p). So, the value function V_{q̄} induced by the policy τ_{q̄} is such that V_{q̄}(p, m(p)) = 1 for all p ∈ [0, 1/2] and V_{q̄}(p, m(p)) = 2(1 − p) for all p ∈ (1/2, 1]. Since it is concave in p, this guarantees that q∗ = q̄ and, thus, the policy is indeed optimal.

We now consider another policy, which we call the KG policy. The aim of the KG policy is to persuade the agent to choose a∗ as often as possible by disclosing information at the initial stage only. In other words, the KG policy uses information disclosure neither to reward nor to punish the agent, but only to persuade. The best payoff the principal can obtain with a KG policy is:

max_{(λ_s, p_s, a_s)_s} Σ_s λ_s v(a_s, p_s), subject to u(a_s, p_s) ≥ m(p_s) for all s, and Σ_s λ_s p_s = p.

In Example 2, the KG policy differs from our policy only when p > 1/2, and consists in splitting p into 1/2 and 1, with probability 2(1 − p) and 1 − 2(1 − p), respectively. The
KG policy induces the same value function as our policy, hence is also optimal. We now prove that this is not accidental.

Suppose that there are only two actions, a₀ and a₁, such that a₀ (resp., a₁) is optimal at state ω₀ (resp., ω₁). The principal aims at implementing a₀ as often as possible, i.e., a∗ = a₀. Remember that a₀ is optimal at all beliefs in [p̲, p̄]. Since a₀ is optimal at 0, p̲ = 0. To streamline the exposition, assume that the prior p > p̄. (If p ≤ p̄, an optimal policy is to never reveal any information.) It is then immediate to see that the KG policy consists in splitting the prior p into p̄ and 1, with probability (1 − p)/(1 − p̄) and 1 − (1 − p)/(1 − p̄), respectively. Intuitively, the principal designs a binary experiment, with one signal perfectly informing the agent that the state is ω₁ and the other partially informing the agent, so that his posterior belief is p̄.

We can contrast the KG policy with ours. Unlike the KG policy, our policy does not reveal information to the agent at the first period, and only reveals information if he plays a₀. If the agent plays a₀ at the first period, the policy splits p into ϕ(p, w(p)) and 1 with probability λ(p, w(p)) and 1 − λ(p, w(p)), respectively. Note that ϕ(p, w(p)) ≤ p since w(p) ≥ m(p). Thus, our policy guarantees that the agent plays a∗ for sure at the first period. However, this comes at a cost: the principal needs to reveal more information to the agent at the next period and, consequently, induces the agent to play a₀ with a lower probability. Somewhat surprisingly, both policies are optimal, regardless of the discount factor.

Corollary 1.
If there are only two actions, then the KG policy is also optimal.
As Example 1 shows, the KG policy is not always optimal. Yet, we might wonder whether this is due to a∗ being strictly dominated. The answer is negative, as the next example shows.

Example 3.
The agent has three possible actions a₀, a₁ and a∗, with a₀ (resp., a₁) the agent's optimal action when the state is ω₀ (resp., ω₁). The prior probability of ω₁ is 1/16 and the discount factor is 1/2. The payoffs are in Table 3, with the first coordinate corresponding to the principal's payoff.

In Example 3, we have that: M(p) = 1 + p, m(p) = max(1 − p, 3/4, 2p), [p̲, p̄] = [1/4, 3/8] and q̲ = 1/12. It is straightforward to show that the KG policy consists in splitting the

(Recall that we assume that m(1) − u(a∗, 1) ≥ m(0) − u(a∗, 0). Thus, if a∗ = a₀, then m(1) − u(a₀, 1) ≥ m(0) − u(a₀,
0) = u(a₀, 0) − u(a₀, 0) = 0, i.e., a₁ is also optimal when the agent believes that the state is ω₁ with probability 1.)
Table 3. Payoff table of Example 3

          a₀        a₁        a∗
ω₀      0, 1      0, 0      1, 3/4
ω₁      0, 0      0, 2      1, 3/4

prior 1/16 into 0 and 1/4, inducing a payoff of 1/4. Note that the KG policy does not create a strictly positive value of information, since (3/4)m(0) + (1/4)m(1/
4) = m(1/16), while ours does. Yet, our policy is optimal and induces a payoff of 3/8 (= (3/4) × (1 − δ)v(a∗, q̲)).

Comparison with the "recursive" policy.
Finally, we compare our policy with the "recursive" policy. Remember that the "recursive" policy is essentially equivalent to the policy of fully disclosing the state with delay, a policy which plays a prominent role in the work of Ball [2019] and Orlov et al. [2019].

We first compute the principal's best payoff if he commits to the best "recursive" policy, that is, when the principal promises to fully disclose the state with probability α at period t + 1 (and to withhold all information with the complementary probability) if the agent plays a∗ at period t. To ease the exposition, we assume that a∗ is not optimal at the belief p = 0. Assume that p ∈ Q. The best recursive policy is thus the solution to the maximization problem:

V = max_{α ∈ [0,1]} (1 − δ)v(a∗, p) + δ(1 − α)V,

subject to

U = (1 − δ)u(a∗, p) + δ[αM(p) + (1 − α)U] ≥ m(p).

(When a∗ is optimal at p = 0, we need to add to the objective the term δα(1 − p)v(a∗, 0), which corresponds to the payoff the principal obtains when the disclosed state is ω₀.) The optimal solution is

α∗ = (w(p) − m(p))/(M(p) − m(p)) = ((1 − δ)/δ) (m(p) − u(a∗, p))/(M(p) − m(p)),

inducing the value

(1 − δ) Σ_{t≥0} δ^t ((M(p) − w(p))/(M(p) − m(p)))^t v(a∗, p) = ((M(p) − m(p))/(M(p) − u(a∗, p))) v(a∗, p).

(As we vary the prior in Example 3, the induced value function is 6p when p < 1/12, 1 when p ∈ [1/4, 3/8] and 8(1 − p)/5 when p ∈ (3/8, 1]; it is concave in p.)

The formula has a natural interpretation. Whenever the agent is recommended to play a∗, no information has been revealed yet, so that the maximal value of information the principal can create is M(p) − m(p). To incentivize the agent, the principal needs to promise a continuation payoff of w(p) in the future and thus needs to create an information value of w(p) − m(p).
However, to create an information value of w(p) − m(p), the principal commits to fully disclose the state with some probability, hence foregoing the opportunity to incentivize the agent to play a∗ in the future. Therefore, the highest probability with which the principal can incentivize the agent to play a∗ is (M(p) − w(p))/(M(p) − m(p)). Yet, as we now explain, the principal can do better by exploiting the fact that it is easier to incentivize the agent to play a∗ at some beliefs than at others.

To see how the principal can do better, we study the relaxed version of our problem, where only the (ex-ante) participation constraint needs to be satisfied. Consider the following policy. The principal discloses information at the ex-ante stage, i.e., chooses a splitting (λ_s, p_s)_s of p, and recommends the agent to play a∗ at all periods with probability α_s when the realized signal is s. We continue to assume that p ∈ Q. The policy satisfies the participation constraint if

Σ_s λ_s [α_s u(a∗, p_s) + (1 − α_s) m(p_s)] ≥ m(p).

We can rewrite the participation constraint as:

Σ_s λ_s (1 − α_s)(m(p_s) − u(a∗, p_s)) ≥ m(p) − u(a∗, p),   (2)

where m(p_s) − u(a∗, p_s) is the opportunity cost of following the recommendation at belief p_s. The principal aims at maximizing Σ_s λ_s α_s v(a∗, p_s). Clearly, the participation constraint binds at a maximum. Moreover, since m is convex, the best for the principal is to fully disclose all information at the ex-ante stage. It only remains to characterize the optimal α_s. Note that if the principal recommends a∗ with the same probability at all s, then his payoff is

((M(p) − m(p))/(M(p) − u(a∗, p))) v(a∗, p),

which is precisely the payoff of the recursive policy.
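The recursive policy's closed-form value is just a geometric series, and the identity can be sanity-checked numerically with arbitrary illustrative numbers (our own choices, satisfying u < m < M):

```python
# Check: (1 - d) * sum_t [d (M - w) / (M - m)]^t * v equals v (M - m) / (M - u),
# where w = (m - (1 - d) u) / d is the promise leaving the agent indifferent.
d, v = 0.5, 1.0                     # discount factor, per-period value of a*
M, m, u = 1.5, 1.0, 0.75            # illustrative values with u < m < M
w = (m - (1 - d) * u) / d
ratio = d * (M - w) / (M - m)       # per-period survival of the incentive scheme
series = (1 - d) * v * sum(ratio ** t for t in range(200))
closed = v * (M - m) / (M - u)      # closed form claimed in the text
```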
To see this, note that if the principal recommends a∗ with the same probability α at all s, it is as if the agent plays a∗ with probability α, regardless of the state, and plays a₀ (resp., a₁) with probability (1 − α)(1 − p) (resp., (1 − α)p). Since α binds the participation constraint (2), we obtain the same payoff as the recursive policy. (When a∗ is optimal at p = 0, we need to add the term (1 − p)(1 − (M(p) − m(p))/(M(p) − u(a∗, p))) v(a∗, 0).)

However, the principal can do better by exploiting the difference in opportunity costs at the two extreme beliefs 0 and 1. Writing α₀ (resp., α₁) for the probability of recommending a∗ conditional on the posterior being 0 (resp., 1), the principal maximizes p α₁ v(a∗,
1) + (1 − p) α₀ v(a∗, 0) subject to:

p α₁ (m(1) − u(a∗, 1)) + (1 − p) α₀ (m(0) − u(a∗, 0)) ≤ M(p) − m(p).

The right-hand side is the maximal value of information the principal can create, while the left-hand side is the expected opportunity cost of following the recommendation. As with the recursive policy, the principal needs to generate the maximal value of information; this is the maximal value the principal can use to incentivize the agent. However, unlike with the recursive policy, the principal needs to use the surplus created asymmetrically, as it is easier to incentivize the agent in state ω₀ than in state ω₁.

More precisely, the problem is linear in (α₀, α₁). Therefore, since the slope v(a∗, 0)/v(a∗, 1) is larger than the slope (m(0) − u(a∗, 0))/(m(1) − u(a∗, 1)), the optimal solution is to set α₀ as high as possible. For instance, if M(p) − m(p) ≤ (1 − p)(m(0) − u(a∗, 0)), the best is to set

(α₀, α₁) = ((M(p) − m(p))/((1 − p)(m(0) − u(a∗, 0))), 0),

resulting in a payoff of

((M(p) − m(p))/(m(0) − u(a∗, 0))) v(a∗, 0) ≥ ((M(p) − m(p))/(M(p) − u(a∗, p))) v(a∗, p),

with a strict inequality if the opportunity cost is strictly higher in state ω₁. This is the solution to the relaxed problem. (See Appendix A.8 for the full characterization.)

While our policy also needs to incentivize the agent to follow the recommendation, it exploits the same asymmetries in opportunity costs as the above policy, which explains why it outperforms the recursive policy.

To conclude, note that if v(a∗, 0)/v(a∗, 1) = (m(0) − u(a∗, 0))/(m(1) − u(a∗, 1)), then the recursive policy solves the relaxed problem and, therefore, is also optimal.
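Because the relaxed problem in (α₀, α₁) is a linear program with a single budget constraint, its optimum is a corner solution: exhaust the cheaper state first. A sketch under the maintained ordering v(a∗, 0)/(m(0) − u(a∗, 0)) ≥ v(a∗, 1)/(m(1) − u(a∗, 1)); the names `relaxed_alphas`, `d0`, `d1` are ours:

```python
def relaxed_alphas(p, value_of_info, d0, d1):
    # d0 = m(0) - u(a*, 0) and d1 = m(1) - u(a*, 1): state-wise opportunity costs.
    # Maximize (1-p) a0 v0 + p a1 v1 subject to the budget
    # (1-p) a0 d0 + p a1 d1 <= value_of_info with 0 <= a0, a1 <= 1,
    # when a0 has the better value-per-cost ratio: fill a0 first, then a1.
    budget = value_of_info
    a0 = min(1.0, budget / ((1.0 - p) * d0))
    budget -= (1.0 - p) * a0 * d0
    a1 = min(1.0, budget / (p * d1))
    return a0, a1

# Small information budget: only the cheap state-0 recommendations are used.
a0, a1 = relaxed_alphas(0.5, 0.1, 0.5, 1.0)
```

This mirrors the case above in which M(p) − m(p) ≤ (1 − p)(m(0) − u(a∗, 0)) and α₁ = 0.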
Appendix A. Appendices
A.1.
Mathematical preliminaries.
We collect, without proofs, some useful results about concave functions. Let f : [a, b] → R be a concave function and let a ≤ x < y < z ≤ b. The following properties hold:

(a) (f(y) − f(x))/(y − x) ≥ (f(z) − f(y))/(z − y),
(b) (f(y) − f(a))/(y − a) ≥ (f(z) − f(a))/(z − a),
(c) (f(b) − f(x))/(b − x) ≥ (f(b) − f(y))/(b − y).

Note that property (a) implies that

(f(y) − f(x))/(y − x) ≥ (f(y + ∆) − f(x + ∆))/(y − x),

for all ∆ ≥ 0 such that y + ∆ ≤ b. (This is true irrespective of whether x + ∆ ≶ y.) We will repeatedly use these properties in the proofs that follow.

To prove Lemma 5, we will use the following property: if f : [a, b] → R satisfies (f(x) − f(a))/(x − a) ≥ (f(y) − f(a))/(y − a) for all a < x ≤ y ≤ b, then f is concave.

A.2. Recursive formulation: Theorem 4 of Ely [2015, p. 44].
We first note that the operator T is monotone, i.e., for all V ≥ V′, T(V) ≥ T(V′). It also satisfies T(V + c) ≤ T(V) + δc for all constants c ≥ 0, for all V. Hence, it is indeed a contraction by Blackwell's theorem.

Ely [2015] proves that the principal's maximal payoff is max_{w ∈ [m(p₁), M(p₁)]} V̂∗(p₁, w), with V̂∗ the unique fixed point of the contraction T̂, where T̂ differs from T in that the promise-keeping constraint holds with equality; all other constraints are the same. Note that the operator T̂ is also monotone.

We now prove that both formulations are equivalent. Clearly, we have that T(V)(p, w) ≥ T̂(V)(p, w) for all (p, w) ∈ W, for all V. Let w₁ ∈ arg max_{w ∈ [m(p₁), M(p₁)]} V̂∗(p₁, w). We have that

V∗(p₁, m(p₁)) ≥ V∗(p₁, w₁) = T(V∗)(p₁, w₁) ≥ T̂(V∗)(p₁, w₁) ≥ T̂²(V∗)(p₁, w₁) ≥ ⋯ ≥ lim_{n→∞} T̂ⁿ(V∗)(p₁, w₁) = V̂∗(p₁, w₁),

where the first inequality follows from V∗ decreasing in w.

Conversely, let (λ∗_s, p∗_s, w∗_s, a∗_s)_{s∈S} be a maximizer of T(V∗)(p₁, m(p₁)). We have that

M(p₁) ≥ Σ_{s∈S} λ∗_s M(p∗_s) ≥ Σ_{s∈S} λ∗_s [(1 − δ)u(a∗_s, p∗_s) + δw∗_s] := w₁ ≥ Σ_{s∈S} λ∗_s m(p∗_s) ≥ m(p₁),

hence (λ∗_s, p∗_s, w∗_s, a∗_s)_{s∈S} is a maximizer for T̂(V∗)(p₁, w₁) and, consequently, V∗(p₁, m(p₁)) = V̂∗(p₁, w₁) ≤ max_{w ∈ [m(p₁), M(p₁)]} V̂∗(p₁, w).
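Blackwell's two sufficient conditions (monotonicity and discounting) are easy to see at work on a toy operator; the example below is our own illustration, not the paper's operator T:

```python
# T(v) = max_a [r(a) + delta * v] on a single state is monotone and satisfies
# T(v + c) = T(v) + delta * c, hence is a delta-contraction by Blackwell's theorem;
# value iteration converges to the fixed point max_a r(a) / (1 - delta).
delta = 0.5
rewards = [0.0, 1.0, 0.3]

def T(v):
    return max(r + delta * v for r in rewards)

v = 0.0
for _ in range(100):           # errors shrink by a factor delta each step
    v = T(v)                   # fixed point here: 1 / (1 - 0.5) = 2
```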
Proposition 2.
We break Proposition 2 into several lemmata.
Lemma 1.
Let (λ_s, p_s, w_s, a_s)_{s∈S} be a solution to the maximization program T(V∗)(p, w). For all s ∈ S such that λ_s > 0, we have

(1 − δ)v(a_s, p_s) + δV∗(p_s, w_s) = V∗(p_s, (1 − δ)u(a_s, p_s) + δw_s).

Proof.
By contradiction, assume that there exists s′ ∈ S such that λ_{s′} > 0 and

(1 − δ)v(a_{s′}, p_{s′}) + δV∗(p_{s′}, w_{s′}) < V∗(p_{s′}, (1 − δ)u(a_{s′}, p_{s′}) + δw_{s′}).

Let (λ∗_s, p∗_s, w∗_s, a∗_s)_{s∈S} be the policy which achieves V∗(p_{s′}, (1 − δ)u(a_{s′}, p_{s′}) + δw_{s′}) and consider the new policy ((λ_s, p_s, w_s, a_s)_{s∈S∖{s′}}, (λ_{s′}λ∗_s, p∗_s, w∗_s, a∗_s)_{s∈S}). By construction, the new policy is feasible. Moreover, we have that

Σ_{s∈S∖{s′}} λ_s [(1 − δ)v(a_s, p_s) + δV∗(p_s, w_s)] + λ_{s′} Σ_{s∈S} λ∗_s [(1 − δ)v(a∗_s, p∗_s) + δV∗(p∗_s, w∗_s)]
= Σ_{s∈S∖{s′}} λ_s [(1 − δ)v(a_s, p_s) + δV∗(p_s, w_s)] + λ_{s′} V∗(p_{s′}, (1 − δ)u(a_{s′}, p_{s′}) + δw_{s′})
> Σ_{s∈S} λ_s [(1 − δ)v(a_s, p_s) + δV∗(p_s, w_s)],

a contradiction with the optimality of (λ_s, p_s, w_s, a_s)_{s∈S}. Since the fixed point satisfies V∗(p_s, (1 − δ)u(a_s, p_s) + δw_s) ≥ (1 − δ)v(a_s, p_s) + δV∗(p_s, w_s), we have the desired result. □

Lemma 2.
Let (λ_s, p_s, w_s, a_s)_{s∈S} be a solution to the maximization program T(V∗)(p, w). For all s ∈ S such that λ_s > 0, V∗(p_s, w_s) = 0 if a_s ≠ a∗.

Proof. Let s ∈ S be such that λ_s > 0 and a_s ≠ a∗. We have

(1 − δ)v(a_s, p_s) + δV∗(p_s, w_s) = δV∗(p_s, w_s) ≥ V∗(p_s, (1 − δ)u(a_s, p_s) + δw_s) ≥ V∗(p_s, w_s),

where the first inequality follows from Lemma 1 and the second from V∗ decreasing in w together with w_s ≥ u(a_s, p_s), which is needed for (1 − δ)u(a_s, p_s) + δw_s ≥ m(p_s) to hold. It follows that V∗(p_s, w_s) = 0. □
Lemma 3.
Let (λ′_s, p′_s, w′_s, a′_s)_{s∈S′} be a solution to the maximization program T(V∗)(p, w). There exists another solution (λ_s, p_s, w_s, a_s)_{s∈S} such that a_s = a∗ for at most one s ∈ S with λ_s > 0.

Proof. Let (λ′_s, p′_s, w′_s, a′_s)_{s∈S′} be a solution to the maximization program T(V∗)(p, w). Let S∗ ⊆ S′ be the set of signals such that a_s = a∗ and λ_s > 0. If S∗ is empty, there is nothing to prove. If S∗ is non-empty, define p∗ by

Σ_{s∈S∗} (λ′_s / Σ_{s∈S∗} λ′_s) p′_s = p∗,

and let Σ_{s∈S∗} λ′_s = λ∗. From the concavity of V∗, we have that

Σ_{s∈S∗} λ′_s (v(a∗, p′_s)(1 − δ) + δV∗(p′_s, w′_s)) = λ∗ (v(a∗, p∗)(1 − δ) + δ Σ_{s∈S∗} (λ′_s/λ∗) V∗(p′_s, w′_s)) ≤ λ∗ (v(a∗, p∗)(1 − δ) + δV∗(p∗, w∗)),

where w∗ = Σ_{s∈S∗} (λ′_s / Σ_{s∈S∗} λ′_s) w′_s. Notice that w∗ ∈ [m(p∗), M(p∗)] since the convexity of m implies

M(p∗) ≥ Σ_{s∈S∗} (λ′_s / Σ_{s∈S∗} λ′_s) M(p′_s) ≥ Σ_{s∈S∗} (λ′_s / Σ_{s∈S∗} λ′_s) w′_s ≥ Σ_{s∈S∗} (λ′_s / Σ_{s∈S∗} λ′_s) m(p′_s) ≥ m(p∗).

It is routine to verify that the new contract ((λ′_s, p′_s, w′_s, a′_s)_{s∈S′∖S∗}, (λ∗, p∗, w∗, a∗)) is feasible and, therefore, also optimal. □

Lemma 4.
Let (λ′_s, p′_s, w′_s, a′_s)_{s∈S′} be a solution to the maximization program T(V*)(p, w). There exists another solution (λ_s, p_s, w_s, a_s)_{s∈S} such that

    (1 − δ)u(a_s, p_s) + δw_s = m(p_s),

for all s such that λ_s > 0 and a_s = a*.

Proof. Assume that there exists s* such that λ′_{s*} > 0, a′_{s*} = a* and

    (1 − δ)u(a′_{s*}, p′_{s*}) + δ(w′_{s*} − ε) ≥ m(p′_{s*})

for some ε > 0. From Lemma 3, we may assume that there is a single such s*. Consider the new tuple (λ_s, p_s, w_s, a_s)_{s∈S}, where w_{s*} = w′_{s*} − ε, w_s̃ = w′_s̃ + (λ_{s*}/λ_s̃)ε for some s̃ ≠ s* such that λ_s̃ > 0, w_s = w′_s for all s ∈ S \ {s*, s̃}, and (λ_s, p_s, a_s) = (λ′_s, p′_s, a′_s) for all s. This new contract is feasible and increases the principal's payoff. □

A.4. Value functions.
This section proves that the policy τ_q induces a well-defined value function V_q. As explained in the text, if the value function V_q is well-defined for one threshold q, then so are all the value functions V_q. We start with the definition of some important subsets of [0, 1].

A.4.1. Construction of the sets Q_k. Let Q_0 := [0, 1]. We define inductively the sets Q_k ⊆ [0, 1], k ≥ 0. We write q_k (resp., q̄_k) for inf Q_k (resp., sup Q_k). For any k ≥ 0, define the function U_k : [q_k, 1] → ℝ by

    U_k(q) := ((1 − q)/(1 − q_k)) m(q_k) + ((q − q_k)/(1 − q_k)) m(1),

with the convention that U_k ≡ m(1) if q_k = 1. Note that U_0(q) = M(q) and U_k(q) ≥ m(q) for all k. We define Q_{k+1} as follows:

    Q_{k+1} = {q ∈ Q_k : (1 − δ)u(a*, q) + δU_k(q) ≥ m(q)}.

For a graphical illustration, see Figure 8.

[Figure 8. Construction of the thresholds: the figure plots m(·), U_k(·), u(a*, ·) and (1 − δ)u(a*, ·) + δU_k(·), and marks q_k, m(q_k), m(1), q_{k+1} and q̄_{k+1}.]

A few observations are worth making. First, we have that P ⊆ Q_k for all k. Second, we have a decreasing sequence, i.e., Q_{k+1} ⊆ Q_k for all k. Third, if Q_k and P are non-empty, then they are closed intervals. Fourth, the limit Q_∞ = lim_{k→∞} Q_k = ∩_k Q_k exists and includes P. Moreover, if P ≠ ∅, then q_∞ = p̲, where p̲ := inf P. If P = ∅, then Q_∞ = ∅; consequently, there exists k* < ∞ such that ∅ = Q_{k*+1} ⊂ Q_{k*} ≠ ∅.

The first to third observations are readily proved, so we concentrate on the proof of the fourth. The limit exists because we have a decreasing sequence of sets.
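Before completing the proof of the fourth observation, the inductive construction of the sets Q_k can be sketched numerically. The primitives below are illustrative assumptions (a two-action m and a linear u(a*, ·)) chosen so that P = ∅; consistent with the fourth observation, the sets shrink and become empty after finitely many steps:

```python
# Numerical sketch of the construction of the sets Q_k (illustrative primitives).
delta = 0.9

def m(p):            # agent's best static payoff: max of two linear actions
    return max(0.3, p)

def u_star(p):       # payoff from the principal's preferred action a*
    return 0.1 + 0.1 * p   # strictly below m everywhere, so P is empty

grid = [i / 2000 for i in range(2001)]
Q = list(grid)       # Q_0 = [0, 1] on a grid
history = [Q]
for k in range(100):
    q_low = min(Q)   # inf Q_k
    def U(q, s=q_low):  # chord of m from (inf Q_k, m(inf Q_k)) to (1, m(1))
        return (1 - q) / (1 - s) * m(s) + (q - s) / (1 - s) * m(1)
    Q_next = [q for q in Q if (1 - delta) * u_star(q) + delta * U(q) >= m(q)]
    if not Q_next:   # Q_{k*+1} = emptyset: the construction terminates
        break
    Q = Q_next
    history.append(Q)
```

Each iteration steepens the chord U_k by raising inf Q_k, which tightens the defining inequality; with P = ∅ this forces Q_k to empty out in finite time.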
We prove that if P = ∅, then Q_∞ = ∅. We first argue that if Q_k = Q_{k−1} ≠ ∅ for some k ≥ 1, hence Q_{k′} = Q_{k−1} for all k′ ≥ k, then P ≠ ∅. From the convexity and continuity of m and the linearity of u, Q_{k−1} is the closed interval [q_{k−1}, q̄_{k−1}], with both boundary points solutions to

    (1 − δ)u(a*, q) + δU_{k−1}(q) = m(q).

Therefore, if (q_k, q̄_k) = (q_{k−1}, q̄_{k−1}), we have that:

    m(q_{k−1}) = (1 − δ)u(a*, q_{k−1}) + δm(q_{k−1}),
    m(q̄_{k−1}) = (1 − δ)u(a*, q̄_{k−1}) + δ[((1 − q̄_{k−1})/(1 − q_{k−1})) m(q_{k−1}) + ((q̄_{k−1} − q_{k−1})/(1 − q_{k−1})) m(1)]
               ≤ (1 − δ)u(a*, q̄_{k−1}) + δm(q̄_{k−1}).

This implies that u(a*, q_{k−1}) = m(q_{k−1}) and u(a*, q̄_{k−1}) = m(q̄_{k−1}) and, therefore, ∅ ≠ Q_{k−1} ⊆ P, a contradiction. Therefore, we must have an infinite sequence of strictly decreasing non-empty closed intervals. Let ε := min_{p∈[0,1]} (m(p) − u(a*, p)). Since P = ∅, ε > 0 and, for all p ∈ Q_∞ and all k,

    m(p) ≤ (1 − δ)u(a*, p) + δU_k(p) ≤ (1 − δ)(m(p) − ε) + δU_k(p).

It follows that U_k(q_∞) ≥ m(q_∞) + ε(1 − δ)/δ for all k; taking the limit as k → ∞, since q_k → q_∞ and m is continuous, U_k(q_∞) → m(q_∞), so that m(q_∞) ≥ m(q_∞) + ε(1 − δ)/δ, a contradiction.

We now prove that if P ≠ ∅, then q_∞ = p̲. From the above, we have that if Q_k = Q_{k−1} ≠ ∅ for some k ≥ 1, hence Q_{k′} = Q_{k−1} for all k′ ≥ k, then P = Q_k, since P ⊆ Q_k. If instead we have an infinite sequence of strictly decreasing sets, then for all q ∈ Q_∞,

    (1 − δ)u(a*, q) + δ[((1 − q)/(1 − q_∞)) m(q_∞) + ((q − q_∞)/(1 − q_∞)) m(1)] ≥ m(q).

Taking the limit q ↓ q_∞, we obtain that u(a*, q_∞) ≥ m(q_∞), hence u(a*, q_∞) = m(q_∞), i.e., q_∞ ∈ P. Since q_k ≤ p̲ for all k, we conclude that q_∞ = p̲.

A.4.2. Value functions.
We first argue that V_q is well-defined at all (p, w) ∈ W \ W_q. To start with, V_q(1, m(1)) = 0, since a* is not optimal at p = 1. Similarly, V_q(0, m(0)) = 0 if a* is not optimal at p = 0, while V_q(0, m(0)) = v(a*, 0) if a* is optimal at p = 0. Also, V_q(q, m(q)) = (1 − δ)v(a*, q) if q > 0, while V_q(0, m(0)) = v(a*, 0) if q = 0, since a* is then optimal at p = 0. With the function V_q defined at these three points, it is then defined at all points (p, w) in W \ W_q. In particular, it is easy to show that

    V_q(q, w) = ((M(q) − w)/(M(q) − m(q))) (1 − δ)v(a*, q) = ((M(q) − w)/(M(q) − u(a*, q))) v(a*, q),

for all w ∈ [m(q), M(q)]. At all remaining points (p, w) ∈ W \ W_q,

    V_q(p, w) = ((1 − p)/(1 − q)) V_q(q, m(q)).

Therefore, V_q is well-defined at all (p, w) ∈ W \ W_q.

At all points (p, w) ∈ W_q, V_q(p, w) is defined via the recursive equation:

    V_q(p, w) = λ(p, w)[(1 − δ)v(a*, φ(p, w)) + δV_q(φ(p, w), w(φ(p, w)))] = λ(p, w) V_q(φ(p, w), m(φ(p, w))).

Since V_q(p, w) = λ(p, w) V_q(φ(p, w), m(φ(p, w))), the value function is well-defined at all (p, w) if it is well-defined at all (p, m(p)), which we now prove.

By construction of the sets Q_k, observe that if p ∈ Q_k \ Q_{k+1}, then w(p) ∈ (U_k(p), U_{k+1}(p)] and, therefore, φ(p, w(p)) ∈ [q_{k−1}, q_k) ⊂ Q_{k−1} \ Q_k. Moreover, φ(q_k, w(q_k)) = q_{k−1}. We now use these observations to prove that V_q is well-defined.

For all p ∈ Q_0 \ Q_1, the pair (p, w(p)) is such that φ(p, w(p)) falls in the region where V_q is already defined. Since

    V_q(p, m(p)) = (1 − δ)v(a*, p) + δV_q(p, w(p)),

V_q(p, m(p)) is well-defined for all p ∈ Q_0 \ Q_1. By induction, assume that it is well-defined for all p ∈ ∪_{ℓ≤k} (Q_{ℓ−1} \ Q_ℓ); the same argument then defines it for all p ∈ Q_k \ Q_{k+1}. It remains to consider p ∈ Q_∞. From the definition of Q_∞, we have that

    w(p) ≤ ((1 − p)/(1 − q_∞)) m(q_∞) + ((p − q_∞)/(1 − q_∞)) m(1)

and, therefore, φ(p, w(p)) ∈ Q_∞.
Consequently, the restriction of V_q(p, m(p)) to Q_∞ is defined via the contraction:

    V_q(p, m(p)) = (1 − δ)v(a*, p) + δλ(p, w(p)) V_q(φ(p, w(p)), m(φ(p, w(p)))).

The unique solution to this fixed-point problem is given by:

    V_q(p, m(p)) = v(a*, p) − ((m(p) − u(a*, p))/(m(1) − u(a*, 1))) v(a*, 1), for all p ∈ Q_∞.

To see this, with a slight abuse of notation, write (λ, φ) for (λ(p, w(p)), φ(p, w(p))), and note that:

    (1 − δ)v(a*, p) + δλ[v(a*, φ) − ((m(φ) − u(a*, φ))/(m(1) − u(a*, 1))) v(a*, 1)]
    = (1 − δ)v(a*, p) + δ[v(a*, p) − (1 − λ)v(a*, 1)] − (([m(p) − (1 − δ)u(a*, p) − δ(1 − λ)m(1)] − δ[u(a*, p) − (1 − λ)u(a*, 1)])/(m(1) − u(a*, 1))) v(a*, 1)
    = v(a*, p) − ((m(p) − u(a*, p))/(m(1) − u(a*, 1))) v(a*, 1),

where we use the identities λφ + (1 − λ)·1 = p, λm(φ) + (1 − λ)m(1) = w(p), and δw(p) = m(p) − (1 − δ)u(a*, p). This completes the proof that V_q is well-defined. Note that V_q and, therefore, all the value functions V_q, are continuous functions.

A.4.3. Value functions: another representation. We now present another construction of V_q. For any q ∈ [q̲, q̄], define the function m_q : [0, 1] → ℝ as

    m_q(p) = (1 − p/q̲)m(0) + (p/q̲)m(q̲)                    if p ∈ [0, q̲],
    m_q(p) = m(p)                                           if p ∈ (q̲, q],
    m_q(p) = ((1 − p)/(1 − q))m(q) + ((p − q)/(1 − q))m(1)  if p ∈ (q, 1].

Note that m_q is convex, m_q(p) ≥ m(p) for all p ∈ [0, 1], m_q(0) = m(0) and m_q(1) = m(1). For a graphical illustration, see Figure 9.

[Figure 9. The function m_q, plotted together with m(·) and M(·), with q̲ and q marked.]

It is straightforward to check that we have the following formula:

    V_q(p, w) = λ(p, w) V_q(φ(p, w), m_q(φ(p, w))),

where the functions λ and φ are defined as in the main text, but with m_q instead of m.

A.5. Theorem 2. To prove Theorem 2, we prove the following proposition and invoke Theorem 1.

Proposition 3.
Let V_{q*} be the value function induced by the policy τ_{q*}, with

    q* = sup{p ∈ Q : V_q(p, m(p)) ≥ V_q(p, w) for all w}.

Then, V_{q*} is concave in (p, w), decreasing in w, and satisfies:

    V_{q*}(p, m(p)) ≥ (1 − δ)v(a*, p) + δV_{q*}(p, w(p)), for all p ∈ Q.

We start with two preliminary observations.

Observation A. We have the following identity:

    V_q(p, w) = ((1 − p)/(1 − p′)) V_q(p′, ((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m_q(1)).

The proof is as follows. Let w′ = ((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m_q(1). Assume that w′ > m_q(p′). Since

    λ(p′, w′)·(φ(p′, w′), m_q(φ(p′, w′))) + (1 − λ(p′, w′))·(1, m_q(1)) = (p′, w′),

we have

    ((1 − p)/(1 − p′))λ(p′, w′)·(φ(p′, w′), m_q(φ(p′, w′))) + (1 − ((1 − p)/(1 − p′))λ(p′, w′))·(1, m_q(1)) = (p, w).

Therefore, λ(p, w) = ((1 − p)/(1 − p′))λ(p′, w′) and φ(p, w) = φ(p′, w′), since the solution is unique when w′ > m_q(p′).

Finally, if m_q(p′) = ((1 − p′)/(1 − p))m_q(p) + ((p′ − p)/(1 − p))m_q(1), the result follows from continuity:

    V_q(p, m_q(p)) = lim_{w→m_q(p)} V_q(p, w)
    = lim_{w→m_q(p)} ((1 − p)/(1 − p′)) V_q(p′, ((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m_q(1))
    = ((1 − p)/(1 − p′)) V_q(p′, ((1 − p′)/(1 − p))m_q(p) + ((p′ − p)/(1 − p))m_q(1))
    = ((1 − p)/(1 − p′)) V_q(p′, m_q(p′)).

Note that this implies that

    V_q(p, w(p) + c) = λ(p, w(p)) V_q(φ(p, w(p)), m_q(φ(p, w(p))) + c/λ(p, w(p))),

where c is a positive constant.

Observation B.
The value function V_q(p, ·) : [m_q(p), M(p)] → ℝ is concave in w, for each p. See Lemma 5 in Section A.6.

A.5.1. Proposition 3(a). We prove that V_{q*} is decreasing in w. To start with, fix p ∈ [0, 1] and (w, w′) ∈ [m_{q*}(p), M(p)] × [m_{q*}(p), M(p)], with w′ > w.

First, assume that p ≤ q*. If w = m_{q*}(p), then V_{q*}(p, w′) ≤ V_{q*}(p, w) by construction of q*. If w > m_{q*}(p), we have that

    (V_{q*}(p, w′) − V_{q*}(p, w))/(w′ − w) = (V_q(p, w′) − V_q(p, w))/(w′ − w)
    ≤ (V_q(p, w) − V_q(p, m_{q*}(p)))/(w − m_{q*}(p)) = (V_{q*}(p, w) − V_{q*}(p, m_{q*}(p)))/(w − m_{q*}(p)) ≤ 0,

where the first inequality follows from the concavity of V_q with respect to w, for all w ≥ m_q(p), and the final inequality from the construction of q*. (Recall that m_{q*}(p) = m_q(p) for all p ≤ q*.)

Second, assume that p > q*. We show in detail how to make use of Observation A to deduce the result; we repeatedly use similar computations later on. We have

    V_{q*}(p, w′) = λ(p, w′) V_{q*}(φ(p, w′), m_{q*}(φ(p, w′)))
    = λ(p, w′) ((1 − φ(p, w′))/(1 − φ(p, w))) V_{q*}(φ(p, w), ((1 − φ(p, w))/(1 − φ(p, w′))) m_{q*}(φ(p, w′)) + (1 − (1 − φ(p, w))/(1 − φ(p, w′))) m_{q*}(1))
    = λ(p, w) V_{q*}(φ(p, w), (λ(p, w′)/λ(p, w)) m_{q*}(φ(p, w′)) + (1 − λ(p, w′)/λ(p, w)) m_{q*}(1))
    = λ(p, w) V_{q*}(φ(p, w), m_{q*}(φ(p, w)) + (w′ − w)/λ(p, w)),

where the first line follows from the construction of V_{q*}, the second line from Observation A, the third line from the definition of the functions λ and φ, and the last line from the following computations:

    (λ(p, w′)/λ(p, w)) m_{q*}(φ(p, w′)) + (1 − λ(p, w′)/λ(p, w)) m_{q*}(1)
    = (1/λ(p, w)) w′ + (1 − 1/λ(p, w)) m_{q*}(1)
    = (1/λ(p, w)) w′ + (1 − 1/λ(p, w)) [(w − λ(p, w) m_{q*}(φ(p, w)))/(1 − λ(p, w))]
    = m_{q*}(φ(p, w)) + (w′ − w)/λ(p, w).

Thus, we are able to express V_{q*}(p, w′) as λ(p, w) V_{q*}(φ(p, w), w̃), with w̃ the above expression. Moreover, φ(p, w) ≤ q* as w ≥ m_{q*}(p). We can use the (already established) monotonicity of V_{q*} in w for each p ≤ q* to deduce the desired result. More precisely, we have that:

    (V_{q*}(p, w′) − V_{q*}(p, w))/(w′ − w)
    = λ(p, w)(V_{q*}(φ(p, w), m_{q*}(φ(p, w)) + (w′ − w)/λ(p, w)) − V_{q*}(φ(p, w), m_{q*}(φ(p, w))))/(w′ − w) ≤ 0,

where the inequality follows from V_{q*} being decreasing in w at all p ≤ q*.

Lastly, since V_{q*}(p, w) = V_{q*}(p, m_{q*}(p)) for all w ∈ [m(p), m_{q*}(p)], the result immediately follows for all (w, w′) with w ∈ [m(p), m_{q*}(p)].

A.5.2. Proposition 3(b). We prove the concavity of V_{q*} with respect to both arguments (p, w). Let (p, w) ∈ W, (p′, w′) ∈ W and α ∈ [0, 1]. Write (p_α, w_α) for α(p, w) + (1 − α)(p′, w′). Without loss of generality, assume that p ≤ p′.
We have that:

    αV_{q*}(p, w) + (1 − α)V_{q*}(p′, w′)
    = α((1 − p)/(1 − p′)) V_{q*}(p′, ((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m_{q*}(1)) + (1 − α)V_{q*}(p′, w′)
    ≤ (α(1 − p)/(1 − p′) + (1 − α)) V_{q*}(p′, [α((1 − p)/(1 − p′))(((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m_{q*}(1)) + (1 − α)w′] / (α(1 − p)/(1 − p′) + (1 − α)))
    = ((1 − p_α)/(1 − p′)) V_{q*}(p′, ((1 − p′)/(1 − p_α))w_α + ((p′ − p_α)/(1 − p_α))m_{q*}(1))
    = V_{q*}(p_α, w_α),

where the second argument of V_{q*} in the first term is at least m_{q*}(p′), the inequality follows from the concavity of V_q with respect to w for each p, together with the property that V_{q*}(p, w) = V_q(p, w) for all (p, w) such that w ≥ m_{q*}(p). Notice that we use Observation A twice.

Finally, for all (p, w) ∈ W, for all (p′, w′) ∈ W and for all α, we have that:

    αV_{q*}(p, w) + (1 − α)V_{q*}(p′, w′) = αV_{q*}(p, max(w, m_{q*}(p))) + (1 − α)V_{q*}(p′, max(w′, m_{q*}(p′)))
    ≤ V_{q*}(p_α, α max(w, m_{q*}(p)) + (1 − α) max(w′, m_{q*}(p′)))
    ≤ V_{q*}(p_α, w_α),

since α max(w, m_{q*}(p)) + (1 − α) max(w′, m_{q*}(p′)) ≥ w_α and V_{q*} is decreasing in w for all p. This completes the proof of concavity.

A.5.3. Proposition 3(c). We prove that V_{q*}(p, m(p)) ≥ (1 − δ)v(a*, p) + δV_{q*}(p, w(p)) for all p ∈ Q.

The statement is true for all p ≤ q* by definition. For p > q*, we have that

    ((1 − p)/(1 − q*))m_{q*}(q*) + ((p − q*)/(1 − q*))m_{q*}(1) − w(p) = ((1 − p)/(1 − q*))m(q*) + ((p − q*)/(1 − q*))m(1) − w(p) = m_{q*}(p) − w(p) ≥ 0,

hence V_{q*}(p, w(p)) = V_q(p, w(p)).
Since V_q(p, m(p)) = (1 − δ)v(a*, p) + δV_q(p, w(p)) for all p ∈ Q and V_{q*}(p, m(p)) = V_{q*}(p, m_{q*}(p)) = V_q(p, m_{q*}(p)), it is enough to prove that

    V_q(p, m_{q*}(p)) ≥ V_q(p, m(p)).

Clearly, there is nothing to prove if m_{q*}(p) = m(p) for all p ∈ Q, i.e., if q* = q̄ (remember that m_q̄(p) = m(p) for all p ∈ Q). So, assume that m_{q*}(p) > m(p) for some p ∈ (q*, q̄), hence m_{q*}(p) > m(p) for all p ∈ (q*, q̄). We now argue that if V_q(p, w) > V_q(p, m(p)) for some w ≥ m_{q*}(p), then

    V_q(p′, m(p′)) < ((1 − p′)/(1 − p)) V_q(p, w), for all p′ > p.

To see this, observe that w > m(p) and, accordingly,

    ((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m(1) − m(p′) > 0,

since m is convex. Hence,

    0 < (V_q(p, w) − V_q(p, m(p)))/(w − m(p))
    = ((1 − p)/(1 − p′))[V_q(p′, ((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m(1)) − V_q(p′, ((1 − p′)/(1 − p))m(p) + ((p′ − p)/(1 − p))m(1))]/(w − m(p))
    ≤ [V_q(p′, ((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m(1)) − V_q(p′, m(p′))]/[((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m(1) − m(p′)],

where the equality follows from Observation A and the inequality from the concavity of V_q in w for each p. Since

    V_q(p, w) = ((1 − p)/(1 − p′)) V_q(p′, ((1 − p′)/(1 − p))w + ((p′ − p)/(1 − p))m(1)),

we have the desired result.

Finally, from the definition of q*, for all n > 0, there exist p_n ∈ (q*, min(q* + 1/n, q̄)] and w_n ≥ m(p_n) such that V_q(p_n, m(p_n)) < V_q(p_n, w_n). From the concavity of V_q in w for all p, V_q(p_n, m(p_n)) < V_q(p_n, m_{q*}(p_n)) for all n. From the above argument, for all p and for all n sufficiently large, i.e., such that p_n < p, we have that

    V_q(p, m(p)) < ((1 − p)/(1 − p_n)) V_q(p_n, m_{q*}(p_n)).
Taking the limit as n → ∞, we obtain that

    V_q(p, m(p)) ≤ ((1 − p)/(1 − q*)) V_q(q*, m_{q*}(q*)) = V_q(p, m_{q*}(p)),

which completes the proof.

A.6. Concavity of V_q with respect to w for each p.

Lemma 5. The value function V_q(p, ·) : [m_q(p), M(p)] → ℝ is concave in w, for each p.

This section proves that V_q is concave in w for each p. To do so, we prove that

    [V_q(p, m_q(p) + η(m_q(1) − u(a*, 1))) − V_q(p, m_q(p))]/η ≥ [V_q(p, m_q(p) + η′(m_q(1) − u(a*, 1))) − V_q(p, m_q(p))]/η′,

for all (η, η′) such that η′ ≥ η. (See the observations on concave functions.) We start with some preliminary results.

A.6.1. Preliminary results. We study how the function φ(p, w(p)) varies with p.

Lemma 6. There exists a non-empty interval [q̲, q̄] such that:
(1) For any p′ < p ≤ q̲ or p′ > p ≥ q̄, φ(p, w(p)) ≥ φ(p′, w(p′)).
(2) The ratio (m(1) − m(φ(p, w(p))))/(1 − φ(p, w(p))) is constant for all p ∈ [q̲, q̄].

Proof of Lemma 6. Observe that

    (m(1) − w(p))/(1 − p) = (m(1) − m(φ(p, w(p))))/(1 − φ(p, w(p))).

Therefore, the convexity of m implies that if (m(1) − w(p))/(1 − p) < (m(1) − w(p′))/(1 − p′), then φ(p, w(p)) < φ(p′, w(p′)).

Consider the function h : [0, 1] → ℝ, defined by h(p) = (m(1) − w(p))/(1 − p). We argue that h is quasi-concave.
For all (p, p′) and α ∈ [0, 1], we have that

    (m(1) − w(αp + (1 − α)p′))/(α(1 − p) + (1 − α)(1 − p′))
    ≥ (α(m(1) − w(p)) + (1 − α)(m(1) − w(p′)))/(α(1 − p) + (1 − α)(1 − p′))
    = (α(1 − p)/(α(1 − p) + (1 − α)(1 − p′)))·(m(1) − w(p))/(1 − p) + ((1 − α)(1 − p′)/(α(1 − p) + (1 − α)(1 − p′)))·(m(1) − w(p′))/(1 − p′)
    ≥ min((m(1) − w(p))/(1 − p), (m(1) − w(p′))/(1 − p′)),

where the first inequality follows from the convexity of w. (Note that the inequality is strict if w(αp + (1 − α)p′) < αw(p) + (1 − α)w(p′).) It follows that h(p′′) ≥ min(h(p), h(p′)) for all p′′ ∈ (p, p′), i.e., h is quasi-concave. Since h is quasi-concave and continuous, the set of its maximizers is a non-empty closed interval [q̲, q̄], the function is increasing for all p < q̲ and decreasing for all p > q̄. (Note that

    m(1) − w(1) = (1 − δ)(u(a*, 1) − m(1))/δ < 0,

hence h diverges to −∞ as p → 1.) □

We can make a few additional observations about the interval [q̲, q̄]. Let k* := sup{k : Q_k ≠ ∅}. Since φ(q̄_k, w(q̄_k)) = q̄_{k−1}, the function h is decreasing for all p ≥ q̄_{k*}. Similarly, since φ(q_k, w(q_k)) = q_{k−1}, the function h is increasing for all p ≤ q_{k*}. Therefore, [q̲, q̄] ⊂ Q_{k*}.

If P ≠ ∅, so that k* = ∞, then for all p ∈ P, the function h is increasing by the convexity of m, since w(p) = m(p). (This is clearly true since φ(p, m(p)) = p in that region.) Therefore, p̲ ≤ q̲ if P ≠ ∅.

Finally, let p̃ := inf{p : m(p) = u(ā, p)}, where ā denotes an action that is optimal at p = 1. We have that q̄ < p̃. To see this, observe that for all p ≥ p̃,

    (m(1) − w(p))/(1 − p) = (u(ā, 1) − u(ā, p))/(δ(1 − p)) − ((1 − δ)/δ)·(u(ā, 1) − u(a*, p))/(1 − p),

where the first term is constant in p (u(ā, ·) is linear) and the second term is increasing in p, since u(ā, 1) − u(a*, 1) = m(1) − u(a*, 1) > 0; hence h is decreasing in p on [p̃, 1].
(If there are multiple optimal actions at p = 1, the argument applies to all of them and, therefore, to the one that induces the smallest p̃.)

The second preliminary result is technical. For any p ∈ (0, 1) and any η ∈ [0, (M(p) − m_q(p))/(m_q(1) − u(a*, 1))], define w(p; η) as

    w(p; η) := m_q(p) + η[m_q(1) − u(a*, 1)],

and write (λ_η, φ_η) for (λ(p, w(p; η)), φ(p, w(p; η))). To ease notation, we do not explicitly write the dependence of (λ_η, φ_η) on p. We have the following:

Lemma 7. φ_η, λ_η, and (1 − λ_η)/η are all decreasing in η.

The proof follows directly from the definition of (λ_η, φ_η) and is omitted.

Finally, we conclude with the following implication of Observation A, which we use throughout. For all (p, w, w′) with w ≤ w′, we have that:

    V_q(p, w) − V_q(p, w′) = λ(p, w)[V_q(φ(p, w), m_q(φ(p, w))) − V_q(φ(p, w), m_q(φ(p, w)) + (w′ − w)/λ(p, w))].

A.6.2. Proof of Lemma 5. We now prove that the gradient

    G(p; η) := [V_q(p, m_q(p)) − V_q(p, w(p; η))]/η

is increasing in η ∈ [0, (M(p) − m_q(p))/(m_q(1) − u(a*, 1))], for all p. We prove it on three separate intervals I₁, I₂ and I₃. If P = ∅, the three intervals are [0, q̲], (q̲, q̄] and (q̄, 1], respectively. If P ≠ ∅, the three intervals are [0, p̲], (p̲, q_∞] and (q_∞, 1], respectively.

A.6.3. For all p ∈ I₁, G(p; η) is increasing in η. We limit attention to the case P ≠ ∅. (The case P = ∅ is identical.) The proof is by induction. First, consider the interval [0, q]. Remember that at q, we have a closed-form solution for V_q(q, w), for all w, given by

    V_q(q, w) = ((M(q) − w)/(M(q) − u(a*, q))) v(a*, q).
Therefore,

    [V_q(q, m_q(q)) − V_q(q, w(q; η))]/η
    = (1/η)[((M(q) − m_q(q))/(M(q) − u(a*, q))) v(a*, q) − ((M(q) − w(q; η))/(M(q) − u(a*, q))) v(a*, q)]
    = (v(a*, q)/(M(q) − u(a*, q)))·[m_q(q) + η(m_q(1) − u(a*, 1)) − m_q(q)]/η
    = [q v(a*, 1) + (1 − q) v(a*, 0)] / [q + (1 − q)(m_q(0) − u(a*, 0))/(m_q(1) − u(a*, 1))]
    ≥ v(a*, 1).

We now consider any p ∈ [0, q). From Observation A, we have that:

    V_q(p, m_q(p)) = ((1 − p)/(1 − q)) V_q(q, ((1 − q)/(1 − p))m_q(p) + (1 − (1 − q)/(1 − p))m_q(1)),
    V_q(p, w(p; η)) = ((1 − p)/(1 − q)) V_q(q, ((1 − q)/(1 − p))m_q(p) + (1 − (1 − q)/(1 − p))m_q(1) + ((1 − q)/(1 − p))η[m_q(1) − u(a*, 1)]).

It follows that

    [V_q(p, m_q(p)) − V_q(p, w(p; η))]/η ≥ ((1 − p)/(1 − q))·((1 − q)/(1 − p)) v(a*, 1) = v(a*, 1).

Therefore, G(p; η) ≥ v(a*, 1) for all η, for all p ∈ [0, q]. Moreover, the gradient G(p; η) is independent of η for all p ∈ [0, q], hence (weakly) increasing.

By induction, assume that G(p; η) ≥ v(a*, 1) for all p ∈ [0, q_k] and increasing in η; we want to prove that both properties also hold for all p ∈ (q_k, q_{k+1}].
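Before the induction step, the base-step observation — the gradient at q is constant in η because V_q(q, ·) is linear in w — can be spot-checked numerically. All values below are illustrative assumptions, not from the paper:

```python
# At the threshold q the value function is linear in w:
#   V(q, w) = (M(q) - w) / (M(q) - u(a*, q)) * v(a*, q),
# so the difference quotient G(q; eta) does not depend on eta.
M_q, u_q, v_q, m_q = 0.9, 0.5, 1.0, 0.6   # M(q), u(a*, q), v(a*, q), m_q(q): assumed
c = 0.7                                    # increment scale m_q(1) - u(a*, 1): assumed

def V(w):
    return (M_q - w) / (M_q - u_q) * v_q

def G(eta):  # the gradient [V(q, m_q(q)) - V(q, m_q(q) + eta*c)] / eta
    return (V(m_q) - V(m_q + eta * c)) / eta

grads = [G(eta) for eta in (0.01, 0.1, 0.25)]
```

By linearity, every difference quotient equals c·v(a*, q)/(M(q) − u(a*, q)), so the list contains one and the same value, illustrating why G(p; η) is (weakly) increasing on [0, q].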
We rewrite V_q(p, w(p; η)) as follows:

    V_q(p, w(p; η)) = λ_η V_q(φ_η, m_q(φ_η))
    = λ_η[(1 − δ)v(a*, φ_η) + δV_q(φ_η, w(φ_η))]
    = (1 − δ)λ_η v(a*, φ_η) + δλ_η V_q(φ_η, w(φ_η))
    = (1 − δ)λ_η v(a*, φ_η) + δV_q(p, λ_η w(φ_η) + (1 − λ_η)m_q(1))
    = (1 − δ)λ_η v(a*, φ_η) + δV_q(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)]).

The second-to-last equality follows from Observation A, while the last equality follows from:

    λ_η w(φ_η) + (1 − λ_η)m_q(1)
    = λ_η[−(1 − δ)u(a*, φ_η) + m_q(φ_η)]/δ + (1 − λ_η)m_q(1)
    = −((1 − δ)/δ)λ_η u(a*, φ_η) + (1/δ)λ_η m_q(φ_η) + (1 − λ_η)m_q(1)
    = −((1 − δ)/δ)[u(a*, p) − (1 − λ_η)u(a*, 1)] + (1/δ)[w(p; η) − (1 − λ_η)m_q(1)] + (1 − λ_η)m_q(1)
    = −((1 − δ)/δ)[u(a*, p) − (1 − λ_η)u(a*, 1)] + (1/δ)[m_q(p) + η(m_q(1) − u(a*, 1)) − (1 − λ_η)m_q(1)] + (1 − λ_η)m_q(1)
    = [−((1 − δ)/δ)u(a*, p) + (1/δ)m_q(p)] + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)].

For future reference, recall that

    λ_η w(φ_η) + (1 − λ_η)m_q(1) = λ_η[λ(φ_η, w(φ_η))m_q(φ(φ_η, w(φ_η))) + (1 − λ(φ_η, w(φ_η)))m_q(1)] + (1 − λ_η)m_q(1),

so that

    φ(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)]) = φ(φ_η, w(φ_η)),
    λ(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)]) = λ_η λ(φ_η, w(φ_η)).

Since φ_η is decreasing in η, we have that φ_{η′} ≤ φ_η when η′ > η and, therefore, φ(φ_η, w(φ_η)) ≤ φ(φ_{η′}, w(φ_{η′})), since φ_{η′} ≤ φ_η ≤ p ≤ q̲. Similarly, since φ_η < p ≤ q̲, we have that φ(φ_η, w(φ_η)) ≤ φ(p, w(p)) and, therefore,

    (η − (1 − δ)(1 − λ_η))/δ > 0.

We now return to the computation of the gradient.
We have:

    G(p; η) = ([(1 − δ)v(a*, p) + δV_q(p, w(p))] − [(1 − δ)λ_η v(a*, φ_η) + δV_q(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)])])/η
    = ((1 − δ)/η)[v(a*, p) − λ_η v(a*, φ_η)] + (δ/η)[V_q(p, w(p)) − V_q(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)])]
    = ((1 − δ)/η)(1 − λ_η)v(a*, 1) + (δ/η)[V_q(p, w(p)) − V_q(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)])].   (3)

(Here we use V_q(p, m_q(p)) = (1 − δ)v(a*, p) + δV_q(p, w(p)) and, by linearity, v(a*, p) − λ_η v(a*, φ_η) = (1 − λ_η)v(a*, 1).)

We further develop the above expression. To ease notation, we write (φ(p), λ(p)) for (φ(p, w(p)), λ(p, w(p))). Note that φ(p) ∈ (q_{k−1}, q_k], since p ∈ (q_k, q_{k+1}]. As (η − (1 − δ)(1 − λ_η))/δ > 0, we have

    G(p; η) = ((1 − δ)(1 − λ_η)/η)v(a*, 1) + [1 − (1 − δ)(1 − λ_η)/η]·[V_q(p, w(p)) − V_q(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)])] / [(η − (1 − δ)(1 − λ_η))/δ]
    = ((1 − δ)(1 − λ_η)/η)v(a*, 1) + [1 − (1 − δ)(1 − λ_η)/η]·[V_q(φ(p), m_q(φ(p))) − V_q(φ(p), m_q(φ(p)) + ((η − (1 − δ)(1 − λ_η))/(δλ(p)))[m_q(1) − u(a*, 1)])] / [(η − (1 − δ)(1 − λ_η))/(δλ(p))]
    ≥ ((1 − δ)(1 − λ_η)/η)v(a*, 1) + [1 − (1 − δ)(1 − λ_η)/η]v(a*, 1) = v(a*, 1),

where we use Observation A and the induction hypothesis (the last quotient is G(φ(p); ·), which is at least v(a*, 1)).

We now show that the gradient is increasing in η. To start with, note that (η − (1 − δ)(1 − λ_η))/δ is increasing in η, since (1 − λ_η)/η is decreasing in η (see Lemma 7).
For any η > η′, we have the following:

    [V_q(p, w(p)) − V_q(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)])] / [(η − (1 − δ)(1 − λ_η))/δ]
    = [λ(p)V_q(φ(p), m_q(φ(p))) − λ(p)V_q(φ(p), m_q(φ(p)) + ((η − (1 − δ)(1 − λ_η))/(δλ(p)))[m_q(1) − u(a*, 1)])] / [(η − (1 − δ)(1 − λ_η))/δ]
    = [V_q(φ(p), m_q(φ(p))) − V_q(φ(p), m_q(φ(p)) + ((η − (1 − δ)(1 − λ_η))/(δλ(p)))[m_q(1) − u(a*, 1)])] / [(η − (1 − δ)(1 − λ_η))/(δλ(p))]
    ≥ [V_q(φ(p), m_q(φ(p))) − V_q(φ(p), m_q(φ(p)) + ((η′ − (1 − δ)(1 − λ_{η′}))/(δλ(p)))[m_q(1) − u(a*, 1)])] / [(η′ − (1 − δ)(1 − λ_{η′}))/(δλ(p))]
    = [V_q(p, w(p)) − V_q(p, w(p) + ((η′ − (1 − δ)(1 − λ_{η′}))/δ)[m_q(1) − u(a*, 1)])] / [(η′ − (1 − δ)(1 − λ_{η′}))/δ],

where the inequality follows from the fact that φ(p) ∈ (q_{k−1}, q_k] and, therefore, the gradient G(φ(p); η) being increasing in η by the induction hypothesis.

Finally, we have that

    (1/η)[V_q(p, m_q(p)) − V_q(p, w(p; η))]
    = ((1 − δ)(1 − λ_η)/η)v(a*, 1) + [1 − (1 − δ)(1 − λ_η)/η]·[V_q(p, w(p)) − V_q(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)])] / [(η − (1 − δ)(1 − λ_η))/δ]
    ≥ ((1 − δ)(1 − λ_η)/η)v(a*, 1) + [1 − (1 − δ)(1 − λ_η)/η]·[V_q(p, w(p)) − V_q(p, w(p) + ((η′ − (1 − δ)(1 − λ_{η′}))/δ)[m_q(1) − u(a*, 1)])] / [(η′ − (1 − δ)(1 − λ_{η′}))/δ]
    = ((1 − δ)(1 − λ_{η′})/η′)v(a*, 1) + [1 − (1 − δ)(1 − λ_{η′})/η′]·[V_q(p, w(p)) − V_q(p, w(p) + ((η′ − (1 − δ)(1 − λ_{η′}))/δ)[m_q(1) − u(a*, 1)])] / [(η′ − (1 − δ)(1 − λ_{η′}))/δ]
      + [((1 − δ)(1 − λ_{η′})/η′) − ((1 − δ)(1 − λ_η)/η)]·([V_q(p, w(p)) − V_q(p, w(p) + ((η′ − (1 − δ)(1 − λ_{η′}))/δ)[m_q(1) − u(a*, 1)])] / [(η′ − (1 − δ)(1 − λ_{η′}))/δ] − v(a*, 1))
    ≥ (1/η′)[V_q(p, m_q(p)) − V_q(p, w(p; η′))]
      + [((1 − δ)(1 − λ_{η′})/η′) − ((1 − δ)(1 − λ_η)/η)]·([V_q(φ(p), m_q(φ(p))) − V_q(φ(p), m_q(φ(p)) + ((η′ − (1 − δ)(1 − λ_{η′}))/(δλ(p)))[m_q(1) − u(a*, 1)])] / [(η′ − (1 − δ)(1 − λ_{η′}))/(δλ(p))] − v(a*, 1))
    ≥ (1/η′)[V_q(p, m_q(p)) − V_q(p, w(p; η′))].

The last inequality follows from the fact that the gradient in the second bracket is weakly larger than v(a*, 1) by the induction hypothesis and the fact that (1 − λ_η)/η < (1 − λ_{η′})/η′ (Lemma 7).

Since lim_{k→∞} q_k = p̲ when P ≠ ∅, this completes the proof that the gradient is greater than v(a*, 1) for all p ∈ [0, p̲].

A.6.4. For all p ∈ I₂, G(p; η) is increasing in η. We first treat the case P ≠ ∅. Recall that for all p ∈ (p̲, q_∞], we have an explicit expression for the value function V_q(p, m_q(p)):

    V_q(p, m_q(p)) = v(a*, p) − ((m_q(p) − u(a*, p))/(m_q(1) − u(a*, 1))) v(a*, 1).

Define η̄(p) as the solution to φ_{η̄(p)} = φ(p, w(p; η̄(p))) = p̲. Note that for any p ∈ (p̲, q_∞] and any η ≤ η̄(p), φ_η ∈ [p̲, q_∞]. Therefore,

    V_q(p, w(p; η)) = λ_η V_q(φ_η, m_q(φ_η)) = λ_η[v(a*, φ_η) − ((m_q(φ_η) − u(a*, φ_η))/(m_q(1) − u(a*, 1))) v(a*, 1)] = v(a*, p) − ((w(p; η) − u(a*, p))/(m_q(1) − u(a*, 1))) v(a*, 1).

It follows that the gradient is equal to v(a*, 1) for all p ∈ (p̲, q_∞], for all η ≤ η̄(p). Consider now η > η̄(p).
We rewrite the gradient G(p; η) as follows:

    G(p; η) = [V_q(p, m_q(p)) − V_q(p, w(p; η))]/η
    = [V_q(p, m_q(p)) − V_q(p, w(p; η̄(p))) + V_q(p, w(p; η̄(p))) − V_q(p, w(p; η))]/η
    = (η̄(p)/η)·[V_q(p, m_q(p)) − V_q(p, w(p; η̄(p)))]/η̄(p) + ((η − η̄(p))/η)·[V_q(p, w(p; η̄(p))) − V_q(p, w(p; η))]/(η − η̄(p))
    = (η̄(p)/η) v(a*, 1) + ((η − η̄(p))/η)·((1 − p)/(1 − p̲))·[V_q(p̲, m_q(p̲)) − V_q(p̲, w(p̲; (η − η̄(p))(1 − p̲)/(1 − p)))]/(η − η̄(p))
    = (η̄(p)/η) v(a*, 1) + ((η − η̄(p))/η)·G(p̲; (η − η̄(p))(1 − p̲)/(1 − p)).

Since we have already shown that G(p̲; η) is increasing in η and weakly larger than v(a*, 1), we have that the gradient G(p; η) is also weakly increasing in η (and greater than v(a*, 1)).

We now treat the case P = ∅. Define η̄(p) as the solution to φ_{η̄(p)} = φ(p, w(p; η̄(p))) = q̲. Note that for any p ∈ [q̲, q̄], for any η ≤ η̄(p), φ_η ∈ [q̲, q̄]. Therefore, for all η ≤ η̄(p), η = (1 − δ)(1 − λ_η), since the ratio (m_q(1) − w(φ_η))/(1 − φ_η) is constant in η and so is φ(φ_η, w(φ_η)). (Recall that we vary η at a fixed p.) It then follows from Equation (3) that

    G(p; η) = ((1 − δ)/η)(1 − λ_η)v(a*, 1) + (δ/η)[V_q(p, w(p)) − V_q(p, w(p) + ((η − (1 − δ)(1 − λ_η))/δ)[m_q(1) − u(a*, 1)])] = ((1 − δ)/η)(1 − λ_η)v(a*, 1) = v(a*, 1).

We have that the gradient G(p; η) is equal to v(a*, 1) for all p ∈ (q̲, q̄], for all η ≤ η̄(p). Finally, when η > η̄(p), the same decomposition as in the case P ≠ ∅ completes the proof.

A.6.5. For all p ∈ I₃, the gradient G(p; η) is increasing in η. We only treat the case P ≠ ∅. (The case P = ∅ is treated analogously.) Define η̄(p) as the solution to φ_{η̄(p)} = φ(p, w(p; η̄(p))) = q_∞. By construction, for all p ∈ (q_∞, 1], for all η ≤ η̄(p), we have that φ_η ∈ (q_∞, 1]. Therefore, φ_η > q̄. Choose η̄(p) ≤ η′ ≤ η.
We have that $\varphi_{\eta'} \ge \varphi_\eta \ge q$ since $q_\infty \ge q$ and, therefore,
\begin{align*}
\varphi\!\left(p,\ w(p) + \frac{\eta-(1-\delta)(1-\lambda_\eta)}{\delta}\,[m_q(1)-u(a^*,1)]\right)
&= \varphi(\varphi_\eta, w(\varphi_\eta))\\
&\ge \varphi(\varphi_{\eta'}, w(\varphi_{\eta'}))
= \varphi\!\left(p,\ w(p) + \frac{\eta'-(1-\delta)(1-\lambda_{\eta'})}{\delta}\,[m_q(1)-u(a^*,1)]\right).
\end{align*}
Also, since $q \le \varphi_\eta \le p$, we have that $\varphi(\varphi_\eta, w(\varphi_\eta)) \ge \varphi(p, w(p))$ and, therefore, $\frac{\eta-(1-\delta)(1-\lambda_\eta)}{\delta} \le 0$. The same applies to $\eta'$. Finally, as already shown,
\[
\frac{\eta-(1-\delta)(1-\lambda_\eta)}{\delta} < \frac{\eta'-(1-\delta)(1-\lambda_{\eta'})}{\delta}.
\]
To ease notation, define $(\tilde\lambda_\eta, \tilde\varphi_\eta)$ as follows:
\begin{align}
\tilde\lambda_\eta &= \lambda\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}\,[m_q(1)-u(a^*,1)]\right),\nonumber\\
\tilde\varphi_\eta &= \varphi\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}\,[m_q(1)-u(a^*,1)]\right).\tag{4}
\end{align}
Notice that $\tilde\varphi_\eta = \varphi(\varphi_\eta, w(\varphi_\eta)) \in I$ since $\varphi_\eta > q_\infty$.

The rest of the proof is purely algebraic and mirrors the case $p \in I$. First, we have the following:
\begin{align*}
&\frac{V_q(p, w(p)) - V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}\,[m_q(1)-u(a^*,1)]\right)}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}}\\
&\qquad= \frac{\tilde\lambda_\eta\, V_q\!\left(\tilde\varphi_\eta,\ m_q(\tilde\varphi_\eta) + \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}\,[m_q(1)-u(a^*,1)]\right) - \tilde\lambda_\eta\, V_q(\tilde\varphi_\eta, m_q(\tilde\varphi_\eta))}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}}\\
&\qquad= \frac{V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}\right)\right) - V_q(\tilde\varphi_\eta, m_q(\tilde\varphi_\eta))}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}},
\end{align*}
where we again use Observation A.
Similarly, we have:
\begin{align*}
&\frac{V_q(p, w(p)) - V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}\,[m_q(1)-u(a^*,1)]\right)}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}}\\
&\qquad= \frac{\tilde\lambda_\eta\, V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}\right)\right) - \tilde\lambda_\eta\, V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta} - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta\tilde\lambda_\eta}\right)\right)}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}}\\
&\qquad= \frac{V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}\right)\right) - V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta} - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta\tilde\lambda_\eta}\right)\right)}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta\tilde\lambda_\eta}},
\end{align*}
where again we use Observation A and the fact that $\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta} > \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta\tilde\lambda_\eta}$. Since $\tilde\varphi_\eta \in I$, we have that:
\begin{align*}
&\frac{V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}\right)\right) - V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta} - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta\tilde\lambda_\eta}\right)\right)}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta\tilde\lambda_\eta}}\\
&\qquad\le \frac{V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}\right)\right) - V_q(\tilde\varphi_\eta, m_q(\tilde\varphi_\eta))}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}},
\end{align*}
where the inequality follows from our previous argument on the interval $I$. It follows that:
\[
\frac{V_q(p, w(p)) - V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}\,[m_q(1)-u(a^*,1)]\right)}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}}
\le
\frac{V_q(p, w(p)) - V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}\,[m_q(1)-u(a^*,1)]\right)}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}}.
\]
From Equation (3), we then have that
\begin{align*}
&\frac{1}{\eta}\left[ V_q(p, m_q(p)) - V_q(p, w(p;\eta)) \right]\\
&\quad= \frac{(1-\delta)(1-\lambda_\eta)}{\eta}\, v(a^*,1) + \left[\frac{(1-\delta)(1-\lambda_\eta)}{\eta} - 1\right] \frac{V_q(p, w(p)) - V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}\,[m_q(1)-u(a^*,1)]\right)}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}}\\
&\quad\ge \frac{(1-\delta)(1-\lambda_\eta)}{\eta}\, v(a^*,1) + \left[\frac{(1-\delta)(1-\lambda_\eta)}{\eta} - 1\right] \frac{V_q(p, w(p)) - V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}\,[m_q(1)-u(a^*,1)]\right)}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}}\\
&\quad= \frac{(1-\delta)(1-\lambda_\eta)}{\eta}\, v(a^*,1) + \left[1 - \frac{(1-\delta)(1-\lambda_\eta)}{\eta}\right] \frac{V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}\,[m_q(1)-u(a^*,1)]\right) - V_q(p, w(p))}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}}\\
&\quad= \frac{(1-\delta)(1-\lambda_{\eta'})}{\eta'}\, v(a^*,1) + \left[1 - \frac{(1-\delta)(1-\lambda_{\eta'})}{\eta'}\right] \frac{V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}\,[m_q(1)-u(a^*,1)]\right) - V_q(p, w(p))}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}}\\
&\qquad\quad+ \left[\frac{(1-\delta)(1-\lambda_{\eta'})}{\eta'} - \frac{(1-\delta)(1-\lambda_\eta)}{\eta}\right]\left[\frac{V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}\,[m_q(1)-u(a^*,1)]\right) - V_q(p, w(p))}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}} - v(a^*,1)\right]\\
&\quad\ge \frac{1}{\eta'}\left[ V_q(p, m_q(p)) - V_q(p, w(p;\eta')) \right],
\end{align*}
where the last inequality follows from:
\[
\frac{V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}\,[m_q(1)-u(a^*,1)]\right) - V_q(p, w(p))}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}}
= \frac{\tilde\lambda_{\eta'}\, V_q(\tilde\varphi_{\eta'}, m_q(\tilde\varphi_{\eta'})) - \tilde\lambda_{\eta'}\, V_q\!\left(\tilde\varphi_{\eta'},\ w\!\left(\tilde\varphi_{\eta'};\ \frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta\tilde\lambda_{\eta'}}\right)\right)}{\frac{(1-\delta)(1-\lambda_{\eta'})-\eta'}{\delta}}
\ge v(a^*,1).
\]
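The manipulations above repeatedly split a difference quotient over a step $\eta$ into a convex combination of difference quotients over the sub-steps $\eta'$ and $\eta-\eta'$, and then bound each sub-quotient separately. This splitting is a purely algebraic identity, so a short numerical sketch can sanity-check it; the numbers below are hypothetical stand-ins for the values $V_q(p,\cdot)$, not quantities from the model:

```python
import math

def convex_split(A, B, C, eta, eta1):
    """Rewrite (A - C)/eta as a convex combination of the sub-step
    quotients (A - B)/eta1 and (B - C)/(eta - eta1)."""
    assert 0 < eta1 < eta
    return (eta1 / eta) * ((A - B) / eta1) \
        + ((eta - eta1) / eta) * ((B - C) / (eta - eta1))

# Hypothetical values standing in for V_q(p, m_q(p)), V_q(p, w(p; eta1)),
# and V_q(p, w(p; eta)); the identity holds for any such numbers.
A, B, C = 2.7, 1.9, 0.4
eta, eta1 = 0.8, 0.3
assert math.isclose((A - C) / eta, convex_split(A, B, C, eta, eta1))
```

Since the weights $\eta_1/\eta$ and $(\eta-\eta_1)/\eta$ are nonnegative and sum to one, any common bound on the two sub-quotients immediately transfers to the full quotient, which is exactly how the bounds on the gradient are propagated above.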
We now show that the gradient $G(p;\eta)$ is smaller than $v(a^*,1)$ for any $\eta \le \bar\eta(p)$. From Equation (3), we have that:
\begin{align*}
&\frac{1}{\eta}\left[ V_q(p, m_q(p)) - V_q(p, w(p;\eta)) \right]\\
&\quad= \frac{(1-\delta)(1-\lambda_\eta)}{\eta}\, v(a^*,1) - \left[\frac{(1-\delta)(1-\lambda_\eta)}{\eta} - 1\right] \frac{V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}\,[m_q(1)-u(a^*,1)]\right) - V_q(p, w(p))}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}}\\
&\quad= v(a^*,1) - \left[\frac{(1-\delta)(1-\lambda_\eta)}{\eta} - 1\right] \left[\frac{V_q\!\left(p,\ w(p) - \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}\,[m_q(1)-u(a^*,1)]\right) - V_q(p, w(p))}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}} - v(a^*,1)\right]\\
&\quad= v(a^*,1) - \left[\frac{(1-\delta)(1-\lambda_\eta)}{\eta} - 1\right] \left[\frac{\tilde\lambda_\eta\, V_q(\tilde\varphi_\eta, m_q(\tilde\varphi_\eta)) - \tilde\lambda_\eta\, V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}\right)\right)}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta}} - v(a^*,1)\right]\\
&\quad= v(a^*,1) - \underbrace{\left[\frac{(1-\delta)(1-\lambda_\eta)}{\eta} - 1\right]}_{\ge 0}\ \underbrace{\left[\frac{V_q(\tilde\varphi_\eta, m_q(\tilde\varphi_\eta)) - V_q\!\left(\tilde\varphi_\eta,\ w\!\left(\tilde\varphi_\eta;\ \frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}\right)\right)}{\frac{(1-\delta)(1-\lambda_\eta)-\eta}{\delta\tilde\lambda_\eta}} - v(a^*,1)\right]}_{\ge 0} \le v(a^*,1),
\end{align*}
where the inequality follows from the fact that $\tilde\varphi_\eta \le p$ and, therefore, from our arguments on the interval $I$ (where we showed that the gradient is larger than $v(a^*,1)$). Finally, we can use a similar decomposition as in the case $p \in I$ to prove that the gradient is increasing for all $\eta$.

A.7. Proof of Corollary 1. We first compute the principal's payoff induced by our policy. To ease notation, we write $\varphi$ for $\varphi(p, w(p))$. We first assume that $q^* = q$, compute the value function $V_q(p, m(p))$ for all $p$, and check that it is concave. By construction, the principal's payoff satisfies:
\[
V_q(p, m(p)) = (1-\delta)\, v(a^*,p) + \delta\, V_q(p, w(p)) = (1-\delta)\, v(a^*,p) + \delta\,\frac{1-p}{1-\varphi}\, v(a^*,\varphi).
\]
Remember that
\[
w(p) = \frac{m(p) - (1-\delta)\, u(a^*,p)}{\delta} = \frac{1-p}{1-\varphi}\, m(\varphi) + \frac{p-\varphi}{1-\varphi}\, m(1).
\]
Since $w(p) = m(p) = u(a^*,p)$ when $p \le \bar p$, we have that $\varphi = p$ and, therefore, the principal's payoff is $v(a^*,p)$ when $p \le \bar p$. Assume that $p > \bar p$. We have that:
\[
w(p) = \frac{u(a,p) - (1-\delta)\, u(a^*,p)}{\delta} = \frac{1-p}{1-\varphi}\, u(a^*,\varphi) + \frac{p-\varphi}{1-\varphi}\, u(a,1),
\]
since $m(\varphi) = u(a^*,\varphi)$ and $\varphi \le \bar p$. (To see this, if $\varphi > \bar p$, then $m(\varphi) = u(a,\varphi)$, hence $w(p) = m(p)$, a contradiction with $w(p) > m(p)$ when $p > \bar p$.) The above equation is equivalent to:
\[
(1-\varphi)\left[ u(a,p) - (1-\delta)\, u(a^*,p) \right] = \delta\left[ (1-p)\, u(a^*,\varphi) + (p-\varphi)\, u(a,1) \right].
\]
Observing that $u(a,p) = (1-p)\,(u(a,0) - u(a,1)) + u(a,1)$ for all $a$ and, similarly, for $\varphi$, we can simplify the above expression to
\[
\delta\,\frac{1-p}{1-\varphi} = \delta - p + (1-p)\,\frac{u(a^*,0) - u(a,0)}{u(a,1) - u(a^*,1)}.
\]
Lastly, remember that the threshold $\bar p$ is the solution to:
\[
1 - \bar p = \frac{u(a,1) - u(a^*,1)}{u(a,1) - u(a^*,1) + u(a^*,0) - u(a,0)},
\]
and, therefore,
\begin{align*}
V_q(p, m(p)) &= v(a^*,p) + \delta\left(\frac{1-p}{1-\varphi} - 1\right) v(a^*,1)\\
&= \frac{1-p}{1-\bar p}\, v(a^*,\bar p) + \left[ 1 - \frac{1-p}{1-\bar p} + \delta\left(\frac{1-p}{1-\varphi} - 1\right) \right] v(a^*,1)\\
&= \frac{1-p}{1-\bar p}\, v(a^*,\bar p).
\end{align*}
Since the KG policy induces the same payoff, it is also optimal.

A.8. First best. This section provides details on the solution to the first-best problem. Let
\[
\alpha^* = 1 - \frac{m(p) - u(a^*,p)}{p\,(m(1) - u(a^*,1))} = \frac{M(p) - m(p) - (1-p)\,(m(0) - u(a^*,0))}{p\,(m(1) - u(a^*,1))}.
\]
Note that $\alpha^* \le 1$, with equality if $m(p) = u(a^*,p)$, and $\alpha^* < 0$ if $M(p) - m(p) - (1-p)\,(m(0) - u(a^*,0)) < 0$. At an optimum, the participation constraint clearly binds. If $m(0) - u(a^*,0) = 0$, the solution is clearly $(\alpha_0, \alpha_1) = \left(1,\ \frac{M(p) - m(p)}{p\,(m(1) - u(a^*,1))}\right)$. Assume that $m(0) - u(a^*,0) > 0$.
We can rewrite the principal's objective as a function of $\alpha_1$:
\[
\begin{cases}
p\,\alpha_1\, v(a^*,1) + (1-p)\, v(a^*,0) & \text{if } \alpha_1 \le \max(0, \alpha^*),\\[4pt]
p\,\alpha_1\left( v(a^*,1) - v(a^*,0)\,\dfrac{m(1) - u(a^*,1)}{m(0) - u(a^*,0)} \right) + \dfrac{M(p) - m(p)}{m(0) - u(a^*,0)}\, v(a^*,0) & \text{if } \max(0, \alpha^*) \le \alpha_1 \le \dfrac{M(p) - m(p)}{p\,(m(1) - u(a^*,1))},\\[4pt]
-\infty & \text{otherwise}.
\end{cases}
\]
Note that the objective is continuous in $\alpha_1$. The optimal payoff is therefore:
\[
p\,\max(0, \alpha^*)\, v(a^*,1) + (1-p)\,\min\!\left( \frac{M(p) - m(p)}{(1-p)\,(m(0) - u(a^*,0))},\ 1 \right) v(a^*,0),
\]
obtained with $(\alpha_0, \alpha_1) = \left( \frac{M(p) - m(p)}{(1-p)\,(m(0) - u(a^*,0))},\ 0 \right)$ if $\frac{M(p) - m(p)}{(1-p)\,(m(0) - u(a^*,0))} \le 1$, and $(\alpha_0, \alpha_1) = (1, \alpha^*)$ otherwise.

References

R. J. Aumann, M. B. Maschler, and R. E. Stearns. Repeated Games with Incomplete Information. MIT Press, Cambridge (Mass.) and London, 1995.
I. Ball. Dynamic information provision: Rewarding the past and guiding the future. Available at SSRN 3103127, 2019.
V. P. Crawford and J. Sobel. Strategic information transmission. Econometrica, 50:1431–1451, 1982.
J. Ely and M. Szydlowski. Moving the goalposts. Journal of Political Economy, 128, 2020.
J. C. Ely. Beeps. Technical report, Northwestern University, 2015.
J. C. Ely. Beeps. American Economic Review, 107(1):31–53, January 2017.
D. Fudenberg and L. Rayo. Training and effort dynamics in apprenticeship. American Economic Review, 109:3780–3812, 2019.
L. Garicano and L. Rayo. Relational knowledge transfers. American Economic Review, 107(9):2695–2730, 2017.
E. Kamenica. Bayesian persuasion and information design. Annual Review of Economics, 11:249–72, 2019.
E. Kamenica and M. Gentzkow. Bayesian persuasion. American Economic Review, 101:2590–2615, 2011.
M. Makris and L. Renou. Information design in multi-stage games. 2020.
D. Orlov, A. Skrzypacz, and P. Zryumov. Persuading the principal to wait. Forthcoming, Journal of Political Economy, 2019.
J. Renault, E. Solan, and N. Vieille.
Optimal dynamic information provision. Games and Economic Behavior, 104:329–349, 2017.
A. Smolin. Dynamic evaluation design. R&R at American Economic Journal: Microeconomics, 2018.
S. E. Spear and S. Srivastava. On repeated moral hazard with discounting. The Review of Economic Studies, 54:599–617, 1987.

Wei Zhao, HEC Paris and GREGHEC-CNRS, 1 rue de la Libération, 78351 Jouy-en-Josas, France
E-mail address: wei.zhao1(at)hec.fr

Claudio Mezzetti, University of Queensland, Australia
E-mail address: c.mezzetti(at)uq.edu.au

Ludovic Renou, Queen Mary University of London and University of Adelaide, Mile End, E1 4NS, London, UK
E-mail address: lrenou.econ(at)gmail.com

Tristan Tomala, HEC Paris and GREGHEC-CNRS, 1 rue de la Libération, 78351 Jouy-en-Josas, France
E-mail address: