Governmental incentives for green bonds investment

Abstract

Motivated by the recent studies on the green bond market, we build a model in which an investor trades on a portfolio of green and conventional bonds, both issued by the same governmental entity. The government provides incentives to the bondholder in order to increase the amount invested in green bonds. These incentives are, optimally, indexed on the prices of the bonds, their quadratic variation and covariation. We show numerically on a set of French governmental bonds that our methodology outperforms the current tax-incentives systems in terms of green investments. Moreover, it is robust to model specification for bond prices and can be applied to a large portfolio of bonds using classical optimisation methods.



Bastien Baldacci ∗, Dylan Possamaï †

January 5, 2021


Keywords: green bonds, moral hazard, incentives, regulation.

Green bonds are fixed income products, issued by governments or companies to finance their debt. The only difference with the so-called conventional bonds is that they finance environmental or climate-related activities. Since its inception in 2007, the green bonds market has expanded rapidly to reach a total amount issued of USD 100 billion in 2019. Corporate and finance companies issue more than 70% of the total amount of green bonds, whereas governments issue approximately 9% of this total, see for example the report of the Financial Stability Board in [11] or the OECD reports in [36; 37]. The role of financial markets in promoting environmental policies via the green bonds is well documented in Park [39]. The characteristics a bond must have to be defined as ‘green’ are given by the Green Bond Principles, which are ‘voluntary process guidelines that recommend transparency and disclosure, and promote integrity in the development of the Green Bond market by clarifying the approach for issuance of a Green Bond’, see the definition in the guidelines [6], published by the ICMA. These principles led the green bonds to become a standardised asset class, part of the traditional asset allocation.

There is a substantial literature on the influence of green bonds on gas emissions and environmental ratings. In Flammer [24; 25], the author shows that the stock of a company responds positively to the announcement of green bond issues, and that these issuances lead to an improvement of the environmental performance. The pricing and ownership of green bonds in the United States is studied in Baker, Bergstresser, Serafeim, and Wurgler [8], where the authors show in particular that green municipal bonds are issued at a premium to otherwise similar ordinary bonds. Similarly, the impact of corporate green bonds on the credit quality of the issuer and on the shareholders is well documented by Tang and Zhang [48].
In de Angelis, Tankov, and Zerbib [19], the authors show how green investments can help companies to reduce their greenhouse gas emissions by raising their cost of capital. In particular, they provide empirical evidence on the US markets that an increase of assets managed by green investors leads to a decrease of carbon emissions by the companies.

The idea of financing renewable projects through green bonds is even more important since institutional investors, in particular pension funds and asset managers, have been considering the possibility of including sustainable environmental investments in their assets. As such, “sustainable investing” now accounts for more than one quarter of total assets under management (AUM) in the United States and more than half in Europe, see the report of the GSIA [3] for a detailed survey on the subject. The motivations of sustainable investing can be the search for higher alpha or lower risk (see Nilsson [35], Bauer and Smeets [9], Krüger [30]), or the will for a more socially responsible image (see Hong and Kacperczyk [28]). The two major practices in sustainable investing are exclusionary screening and environmental, social and governance (ESG) integration. Exclusionary screening involves the exclusion of certain assets from the range of eligible investments on ethical grounds, such as the so-called sin stocks, while ESG integration involves underweighting assets with low ESG ratings and overweighting those with high ESG ratings. In Zerbib [51], the author builds a sustainable CAPM based on these two principles and shows how sustainable investing affects asset returns.

∗ École Polytechnique, CMAP, 91128, Palaiseau, France, [email protected].
† ETH Zürich, Department of Mathematics, Rämistrasse 101, 8092 Zürich, Switzerland, [email protected].
arXiv [q-fin.MF]
Although the issuance of green corporate bonds has increased over the last years, the public sector accounts for two-thirds of the investments in sustainable energy infrastructure. This argues in favour of a greater issuance of green bonds by public entities to finance their sustainable projects, which will be the focus of the present paper.

However, there are still several barriers to the development of the green bond market, such as a lack of green bond definition, framework, and transparency. In that regard, Zerbib [49; 50] investigates the existence of a yield premium for green bonds. The results show that there exists a small negative premium, meaning that the yield of a green bond is lower than that of a conventional bond. In the existing literature, this negative yield differential is mainly attributed to intangible asset creation, which is imperfectly captured in the models of rating agencies, see for example Porter and Van der Linde [40], Ambec and Lanoie [4], or Brooks and Oikonomou [12]. The price difference between a green and a conventional bond is studied in Hachenberg and Schiereck [26], where the authors show that financial and corporate green bonds trade tighter than their conventional counterparts, and governmental bonds on the other hand trade marginally wider. Finally, Ekeland and Lefournier [21] relativize the use of green bonds to finance the ecological transition. As the green bond principles are by no means legally mandatory, and the investors are not necessarily motivated by the green transition, there is no intrinsic difference between a green bond and its conventional counterpart. An important safeguard against green-washing, that is when the investors use the funding obtained with the green bonds to finance non-sustainable projects, is the issuer’s reputation or green third-party verifications, as stated in Bachelet, Becchetti, and Manfredonia [7].
These studies show the several components which slow down the development of the green bonds market. It is therefore important to put in place practical solutions to overcome these constraints. Some mechanisms are already developed by the policy makers to facilitate the investment in this market.

Indeed, there are several types of incentives policy-makers can put in place to support green bond issuance, see Morel and Bordier [34], and Della Croce, Kaminker, and Stewart [20]: support for research and development (R&D), investment incentives (capital grants, loan guarantees and low-interest rate loans), policies which target the cost of investment in capital by hedging or mitigating risk, and tax incentives policies. For a complete survey of renewable energy promotion policies, we refer to Table 3 in Della Croce, Kaminker, and Stewart [20]. In particular, tax incentives are attractive from a cost-efficiency perspective, as they can provide a big boost to investment with a relatively low impact on public finances. In Agliardi and Agliardi [1], the authors show that governmental tax-based incentives play a significant role in scaling up the green bonds market. Finally, tax incentives (accelerated depreciation, tax credits, tax exemptions and rebates) can be provided either to the investor or to the issuer under the following forms. The data provided below can be found at .

• Tax credit bonds: Bond investors receive tax credits instead of interest payments, so issuers do not pay coupon interests. Instead, they quarterly accrue phantom taxable income and tax credit equal to the amount of phantom income to holders, see Klein [29].
• Direct subsidy bonds: Bond issuers receive cash rebates from the government to subsidise their net interest payments. This type of incentives is mainly used by US municipalities, see for example Ang, Bhansali, and Xing [5].
• Tax-exempt bonds: Bond investors do not have to pay income tax on interest from the green bonds they hold (so the issuer can get a lower interest rate). This type of tax incentive is typically applied to municipal bonds in the US market, see Calabrese and Ely [13] for a survey of the use of these tax-incentives.

All these incentives can be modelled as a function of the amount invested in green bonds. However, it should be clear that policy-makers cannot necessarily control or monitor directly the actions of the investor. This leads for example to the so-called ‘green-washing’ practice, when the investors use the funding obtained with the green bonds to finance non-sustainable projects, see Della Croce, Kaminker, and Stewart [20]. Moreover, the incentives are not dynamic, in the sense that they do not depend on the evolution of market conditions (for example the price differences between green and conventional bonds). Thus, the incentives mechanism in the green bonds market is subject to a moral hazard component.

In this article, we propose an alternative to tax incentives policy which is based on contract theory, and designed so as to increase the investment in green bonds. Moral hazard, whose related theory has been developed since the early 70’s, occurs when one person or entity (the Agent) is able to make decisions and/or take actions on behalf of, or that impact, another person or entity: the Principal. The classical continuous-time setting works as follows: the Principal hires an Agent to manage a ‘risky’ project, represented as a controlled stochastic differential equation. In exchange for the effort he puts into his work, the Agent receives a salary from the Principal, which takes the form of a ‘contract’. The Principal’s goal is to offer the Agent a contract that maximises the Principal’s own utility, which is a function of the terminal value of the project. The problem is addressed by solving a Stackelberg game, in two stages:
(i) With a fixed contract, solve the problem of the Agent and obtain his optimal effort given a contract proposed by the Principal.
(ii) Inject into the problem of the Principal the best-response effort of the Agent previously found, and solve the Principal’s problem, providing the optimal contract offered to the Agent.

Our goal is to propose a dynamic incentives model based on the prices and returns of green and conventional bonds issued by a government. We build a Principal-Agent model in which an investor (the Agent) runs a portfolio of green and conventional bonds. Without intervention of the government, the Agent has specific investment targets coming from his strategy. The policy-maker (the Principal) proposes incentives to the investor in order to achieve two objectives:

(i) increase the amount invested in green bonds according to a determined target;
(ii) maximise the value of the portfolio of bonds issued by the government.

We show that without loss of utility for the government, we can consider incentives which take the form of stochastic integrals with respect to the portfolio process, the price of the bonds and their quadratic (co)variation. In order to propose tractable incentives for a possibly high number of bonds, we propose a form of contract that is based only on the dynamics of the portfolio process, the green bonds, an index of conventional bonds, and their respective quadratic (co)variations. In the case of deterministic short-term rates for the green and conventional bonds, both the Agent’s and the Principal’s problems can be solved by maximising deterministic functions with classical root-finding methods.
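The two-stage logic described above (Agent's best response first, then the Principal's optimisation) can be illustrated on a deliberately simplified static toy problem. This sketch is not the model of the paper: the linear incentive `z`, the quadratic deviation costs, and all function names are hypothetical choices made only to show the order in which the two problems are solved.

```python
def agent_best_response(z, alpha, beta):
    """Stage (i): the Agent maximises z*a - 0.5*beta*(a - alpha)**2 over a.

    The first-order condition z - beta*(a - alpha) = 0 gives a closed form.
    """
    return alpha + z / beta


def principal_objective(z, alpha, beta, target, kappa):
    """Principal's payoff once the Agent's best response is injected.

    Toy assumption: the Principal pays z per unit invested and is penalised
    quadratically for the distance between the investment and its target.
    """
    a = agent_best_response(z, alpha, beta)
    return -kappa * (target - a) ** 2 - z * a


def solve_stackelberg(alpha, beta, target, kappa, z_max=5.0, n_grid=5001):
    """Stage (ii): grid search over the incentive z in [0, z_max]."""
    grid = [z_max * i / (n_grid - 1) for i in range(n_grid)]
    z_star = max(
        grid,
        key=lambda z: principal_objective(z, alpha, beta, target, kappa),
    )
    return z_star, agent_best_response(z_star, alpha, beta)


if __name__ == "__main__":
    z_star, a_star = solve_stackelberg(alpha=1.0, beta=2.0, target=2.0, kappa=10.0)
    print(f"incentive z* = {z_star:.3f}, induced investment a* = {a_star:.3f}")
```

In this toy setting the induced investment lies strictly between the Agent's own target `alpha` and the Principal's target: closing the gap entirely would cost the Principal more in incentives than it gains.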
When a one-factor stochastic volatility model is used for short-term rates, we have to rely on stochastic control theory, and determining the incentives of the policy-maker is equivalent to solving a high-dimensional Hamilton-Jacobi-Bellman equation.

What we propose in this paper is aimed to be used by governments as an alternative to the existing tax incentives, in order to increase the investment in green bonds. We summarise below the key features of our approach.

• The methodology we develop is completely tractable from a numerical point of view, thus the incentives can be designed on a large set of bonds.
• The remuneration we propose takes into account the moral hazard between the investor and the government: the amount invested in the bonds is observed but not controlled by the government.
• The form of the optimal incentives is robust to model error: we show numerically that a more complex dynamics of the short-term rates of the bonds does not lead to an important loss in utility for the government, and causes minor variations in the form of the incentives.
• On a one-year horizon, the incentives show a rather constant behaviour. By using this, we show that the optimal incentives can be directly implemented with tradable financial products such as futures, log-contracts and variance swaps on the bonds.
• We compare our methodology with the current tax-incentives policy and show that, on a one-year period and for the same target in green investments, our incentives policy leads to a higher value of the portfolio of bonds (15% to 20% on average).

In the numerical experiments, we provide general guidelines for the government to calibrate the model parameters, in particular the risk aversions, according to its objectives.

This article makes several contributions to the literature. First, to the best of our knowledge, it offers the first Principal-Agent framework to tackle the design of governmental incentives for green bonds.
Contrary to articles like Zerbib [50] and Febi, Schäfer, Stephan, and Sun [23], where the authors provide a thorough descriptive analysis of the green bond market (risk premium, liquidity premium, ...) and examine the impact of green investing, our article focuses on answering a practical incentives problem from a quantitative viewpoint. The comparison with existing incentives policy on a set of French governmental bonds shows the benefits of our method for the government. The article also contributes to the Principal-Agent literature with volatility control, of which we give a brief overview. This literature has been growing since the study of the well-posedness of second-order backward stochastic differential equations, see for example Possamaï, Tan, and Zhou [41], or Soner, Touzi, and Zhang [44]. A rigorous study of the Principal-Agent problem with volatility control in a general case can be found in Cvitanić, Possamaï, and Touzi [18]. Contrary to the papers of Sung [46] and Ou-Yang [38], the Principal observes the whole path of the controlled output process. Moreover, in our framework, moral hazard arises from unobservable sources of risk. In Lioui and Poncet [32], the authors consider a first-best problem with volatility control and assume that the agent has enough bargaining

Notations:

For $(v_1, v_2) \in \mathbb{R}^d \times \mathbb{R}^d$, $v_1 \cdot v_2 \in \mathbb{R}$ denotes the scalar product between $v_1$ and $v_2$, whereas $v_1 \circ v_2 \in \mathbb{R}^d$ is the component-wise multiplication of the vectors. Let $\mathbb{N}^\star$ be the set of all positive integers. For any $(\ell, c) \in \mathbb{N}^\star \times \mathbb{N}^\star$, $\mathbb{M}_{\ell,c}(\mathbb{R})$ will denote the space of $\ell \times c$ matrices with real entries. Elements of the matrix $M \in \mathbb{M}_{\ell,c}(\mathbb{R})$ are denoted $(M_{i,j})_{(i,j) \in \{1,\dots,\ell\} \times \{1,\dots,c\}}$ and the transpose of $M$ is denoted $M^\top$. We identify $\mathbb{M}_{\ell,1}(\mathbb{R})$ with $\mathbb{R}^\ell$. When $\ell = c$, we let $\mathbb{M}_\ell(\mathbb{R}) := \mathbb{M}_{\ell,\ell}(\mathbb{R})$. For any $x \in \mathbb{M}_{\ell,c}(\mathbb{R})$, and for any $i \in \{1,\dots,\ell\}$ and $j \in \{1,\dots,c\}$, $x_{i,:} \in \mathbb{M}_{1,c}(\mathbb{R})$ and $x_{:,j} \in \mathbb{R}^\ell$ denote respectively the $i$-th row and the $j$-th column of $x$. For any $d \in \mathbb{N}^\star$, $\mathbb{S}_d$ is the space of $d \times d$ symmetric matrices. For any $(\ell, c) \in \mathbb{N}^\star \times \mathbb{N}^\star$, we define $I_\ell$ as the identity matrix of $\mathbb{M}_\ell(\mathbb{R})$, and $0_{\ell,c}$ as the matrix in $\mathbb{M}_{\ell,c}(\mathbb{R})$ with all entries equal to zero. We define the function $\mathrm{diag} : \mathbb{R}^d \longrightarrow \mathbb{M}_d(\mathbb{R})$ such that for $v \in \mathbb{R}^d$ and any $(i,j) \in \{1,\dots,d\}^2$, $\mathrm{diag}(v)_{i,j} := v_i$ if $i = j$, and $0$ otherwise. For $x \in \mathbb{M}_{\ell,c}(\mathbb{R})$, we define $\|x\|^2 := \sum_{(i,j) \in \{1,\dots,\ell\} \times \{1,\dots,c\}} x_{i,j}^2$.

Throughout the article, we work on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P})$ under which all stochastic processes are defined. We refer to Appendix A for the rigorous weak formulation of the problem, and we intend the present section to have a more accessible (and therefore more heuristic) flavour.

We consider an investor wishing to develop his bonds’ portfolio. He wants to acquire both green and conventional bonds issued by the same governmental entity or company, with possibly different amounts issued and different maturities. We assume that we are given a time horizon $T > 0$, and positive integers $d_g$ and $d_c$. The investor manages, over the horizon $[0, T]$, $d_g$ green bonds, $d_c$ conventional bonds, and an index of conventional bonds, with dynamics given by

$$
\begin{aligned}
\mathrm{d}P^g(t, T^g) &:= P^g(t, T^g) \circ \Big( \big( r^g(t) + \eta^g(t) \circ \sigma^g(t) \big) \mathrm{d}t + \mathrm{diag}\big( \sigma^g(t) \big) \mathrm{d}W^g_t \Big), \\
\mathrm{d}P^c(t, T^c) &:= P^c(t, T^c) \circ \Big( \big( r^c(t) + \eta^c(t) \circ \sigma^c(t) \big) \mathrm{d}t + \mathrm{diag}\big( \sigma^c(t) \big) \mathrm{d}W^c_t \Big), \\
\mathrm{d}I_t &:= I_t \big( \mu^I(t) \mathrm{d}t + \sigma^I(t) \mathrm{d}W^I_t \big).
\end{aligned}
\tag{2.1}
$$

In the above equations, $T^g$ is an $\mathbb{R}^{d_g}$-valued vector representing the maturities of each green bond and $T^c$ is an $\mathbb{R}^{d_c}$-valued vector representing the maturities of each conventional bond. The functions $\mu^I : [0,T] \longrightarrow \mathbb{R}$ and $\sigma^I : [0,T] \longrightarrow \mathbb{R}$ represent respectively the drift and volatility of the index of conventional bonds $(I_t)_{t \in [0,T]}$. Similarly, the functions $r^g : [0,T] \longrightarrow \mathbb{R}^{d_g}$ and $r^c : [0,T] \longrightarrow \mathbb{R}^{d_c}$ represent the vectors of short-term rates of the green and conventional bonds, the functions $\eta^g : [0,T] \longrightarrow \mathbb{R}^{d_g}$ and $\eta^c : [0,T] \longrightarrow \mathbb{R}^{d_c}$ represent the vectors of risk premia of the green and conventional bonds, while the functions $\sigma^g : [0,T] \longrightarrow \mathbb{R}^{d_g}$ and $\sigma^c : [0,T] \longrightarrow \mathbb{R}^{d_c}$ represent the vectors of volatilities of the green and conventional bonds. The processes $(W^g_t)_{t \in [0,T]}$, $(W^c_t)_{t \in [0,T]}$ and $(W^I_t)_{t \in [0,T]}$ are respectively $\mathbb{R}^{d_g}$-, $\mathbb{R}^{d_c}$- and $\mathbb{R}$-valued Brownian motions.

We define the index as an average of the dynamics of the conventional bonds. In practice, the investor may trade a large quantity of conventional bonds and only a couple of green bonds. Thus, we argue that it is more convenient for the government to index the remuneration proposed on an average dynamics of conventional bonds, in order to have more granularity for the green bonds’ incentives.

Finally,

$$
W := \begin{pmatrix} W^g \\ W^c \\ W^I \end{pmatrix}
$$

is an $\mathbb{R}^{d_g + d_c + 1}$-valued Brownian motion, whose covariance structure is given by $\mathrm{d}\langle W \rangle_t = \Sigma \, \mathrm{d}t$, where

$$
\Sigma \in \mathbb{M}_{d_g + d_c + 1}(\mathbb{R}), \quad
\Sigma := \begin{pmatrix}
\Sigma^g & \Sigma^{g,c} & \Sigma^{g,I} \\
(\Sigma^{g,c})^\top & \Sigma^c & \Sigma^{c,I} \\
(\Sigma^{g,I})^\top & (\Sigma^{c,I})^\top & \Sigma^I
\end{pmatrix},
$$

with

$$
\begin{aligned}
&\Sigma^g \in \mathbb{M}_{d_g}(\mathbb{R}), && \Sigma^g_{i,j} := \rho^g_{i,j} \in [-1, 1] \text{ if } i \neq j, \quad \Sigma^g_{i,i} := 1, && (i,j) \in \{1,\dots,d_g\}^2, \\
&\Sigma^c \in \mathbb{M}_{d_c}(\mathbb{R}), && \Sigma^c_{i,j} := \rho^c_{i,j} \in [-1, 1] \text{ if } i \neq j, \quad \Sigma^c_{i,i} := 1, && (i,j) \in \{1,\dots,d_c\}^2, \\
&\Sigma^{g,c} \in \mathbb{M}_{d_g,d_c}(\mathbb{R}), && \Sigma^{g,c}_{i,j} := \rho^{gc}_{i,j} \in [-1, 1], && (i,j) \in \{1,\dots,d_g\} \times \{1,\dots,d_c\}, \\
&\Sigma^{g,I} \in \mathbb{R}^{d_g}, && \Sigma^{g,I}_i := \rho^{gI}_i \in [-1, 1], && i \in \{1,\dots,d_g\}, \\
&\Sigma^{c,I} \in \mathbb{R}^{d_c}, && \Sigma^{c,I}_i := \rho^{cI}_i \in [-1, 1], && i \in \{1,\dots,d_c\}.
\end{aligned}
$$

Remark 2.1.

All these quantities are assumed to be deterministic, in order to derive a governmental incentive that is tractable for a large number of bonds. We will show in Appendix C that, at the expense of a higher computational cost and the use of stochastic control theory, one can also derive incentives for the investor when short-term rates are stochastic. In Section 4, we show numerically that the use of stochastic short-term rates for the green bonds does not impact qualitatively our results. In particular, when the short-term rates are driven by Ornstein-Uhlenbeck processes, the optimal investment policy in this case oscillates slightly around the one obtained with deterministic rates. Thus, the methodology we propose appears to be robust to model specification.
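To make the price dynamics (2.1) concrete, the following minimal sketch simulates one green and one conventional bond price with correlated Brownian drivers. It is only an illustration under simplifying assumptions that are not in the text: a single bond of each type, constant coefficients, and an exact log-normal step. The function name `simulate_pair` and all parameter values are hypothetical.

```python
import math
import random


def simulate_pair(p0_g, p0_c, r_g, r_c, eta_g, eta_c, sig_g, sig_c, rho,
                  T=1.0, n_steps=250, seed=0):
    """Simulate dP = P * ((r + eta*sigma) dt + sigma dW) for two bonds.

    The two Brownian motions have correlation rho. Coefficients are taken
    constant (an assumption), so each step uses the exact log-normal update
    d log P = (r + eta*sigma - sigma^2 / 2) dt + sigma dW.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    log_g, log_c = math.log(p0_g), math.log(p0_c)
    path = [(p0_g, p0_c)]
    for _ in range(n_steps):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dw_g = sqrt_dt * z1
        # Build the second increment so that corr(dw_g, dw_c) = rho.
        dw_c = sqrt_dt * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        log_g += (r_g + eta_g * sig_g - 0.5 * sig_g ** 2) * dt + sig_g * dw_g
        log_c += (r_c + eta_c * sig_c - 0.5 * sig_c ** 2) * dt + sig_c * dw_c
        path.append((math.exp(log_g), math.exp(log_c)))
    return path


if __name__ == "__main__":
    path = simulate_pair(100.0, 100.0, 0.01, 0.02, 0.1, 0.1, 0.05, 0.04, rho=0.8)
    print("terminal prices (green, conventional):", path[-1])
```

Extending the sketch to $d_g + d_c + 1$ assets would replace the two-dimensional correlation step by a Cholesky factor of $\Sigma$.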

Throughout the paper, we use the following technical assumption.

Assumption 2.2.

The functions $r^g$, $r^c$, $\eta^c$, $\eta^g$, $\sigma^g$, $\sigma^c$, $\mu^I$, and $\sigma^I$ are uniformly bounded on $[0, T]$.

The investment policy is defined by a vector of control processes $\pi = (\pi^g_t, \pi^c_t, \pi^I_t)_{t \in [0,T]} \in \mathcal{A}$, representing the amount of money invested at time $t$, where

$$
\mathcal{A} := \big\{ (\pi_t)_{t \in [0,T]} : K\text{-valued and } \mathbb{F}\text{-predictable processes} \big\}
$$

is the set of admissible control processes, where $K := [\varepsilon, b_\infty]^{d_g} \times [\varepsilon, b_\infty]^{d_c} \times [\varepsilon, b_\infty]$ for some $0 < \varepsilon < b_\infty$, and $\mathbb{F} := (\mathcal{F}_t)_{t \in [0,T]}$ is the natural filtration of the process $(X, W)$, with $X$ defined below. We force the control processes to be strictly positive so that the density of the canonical process in Appendix A is invertible and we can define properly the weak formulation of the control problem. Practically, this simply means that the investor has to invest in the index, and in at least one of the conventional and one of the green bonds.

We define the dynamics of the vectors of returns on the bonds as

$$
\begin{aligned}
\mathrm{d}R^g(t, T^g) &= \big( r^g(t) + \eta^g(t) \circ \sigma^g(t) \big) \mathrm{d}t + \mathrm{diag}\big( \sigma^g(t) \big) \mathrm{d}W^g_t, \\
\mathrm{d}R^c(t, T^c) &= \big( r^c(t) + \eta^c(t) \circ \sigma^c(t) \big) \mathrm{d}t + \mathrm{diag}\big( \sigma^c(t) \big) \mathrm{d}W^c_t, \\
\mathrm{d}R^I_t &= \mu^I(t) \mathrm{d}t + \sigma^I(t) \mathrm{d}W^I_t.
\end{aligned}
$$

For every $\pi \in \mathcal{A}$, one can define a probability measure $\mathbb{P}^\pi$ such that the dynamics of the value of the portfolio of bonds is given by

$$
\mathrm{d}X_t := \pi^g_t \cdot \mathrm{d}R^g(t, T^g) + \pi^c_t \cdot \mathrm{d}R^c(t, T^c) + \pi^I_t \, \mathrm{d}R^I_t.
$$

See Appendix A for the weak formulation of the control problem, which explains how to construct $\mathbb{P}^\pi$. We also denote by $\mathbb{E}^\pi_t$ the conditional expectation under the probability measure $\mathbb{P}^\pi$ with respect to $\mathcal{F}_t$, for all $t \in [0, T]$.

Throughout the investment period $[0, T]$, the investor wants to maintain his investment in bonds at some pre-defined levels, which can be seen as his investment profile. We introduce the vector $\alpha = (\alpha^g, \alpha^c, \alpha^I) \in \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R}$ and the cost function $k : \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R} \longrightarrow \mathbb{R}$, where for any $p := (p^g, p^c, p^I) \in \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R}$

$$
k(p) := \frac{1}{2} \beta^g \cdot (p^g - \alpha^g)^2 + \frac{1}{2} \beta^c \cdot (p^c - \alpha^c)^2 + \frac{1}{2} \beta^I (p^I - \alpha^I)^2,
$$

where the squares of vectors are understood component-wise and $\beta := (\beta^g, \beta^c, \beta^I) \in \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R}$ are what we coin intensity vectors. For instance, at some time $t \in [0, T]$, the investor pays a cost to move the amount $(\pi^g_t)_i$ invested in the $i$-th green bond away from the initial target $\alpha^g_i$, and this cost is equal to $\frac{1}{2} \beta^g_i \big( (\pi^g_t)_i - \alpha^g_i \big)^2$. Thus, $(\beta^g, \beta^c, \beta^I)$ represent the cost intensity of changing the investments of the agent: the higher these coefficients, the more incentives the investor will demand to change his investment profile.

In order to modify an investment policy $\pi \in \mathcal{A}$, the government proposes a remuneration to the investor. It takes the form of an $\mathcal{F}_T$-measurable random variable denoted by $\xi$, and we will see later that the form of remuneration considered is an indexation on the value of the portfolio of bonds as well as the sources of risk of each bond.

The optimisation problem of the investor with CARA utility function writes, for a given contract provided by the government, as

$$
V_A(\xi) := \sup_{\pi \in \mathcal{A}} \mathbb{E}^\pi \bigg[ U_A \bigg( \xi - \int_0^T k(\pi_s) \mathrm{d}s \bigg) \bigg], \quad U_A(x) := -\exp(-\gamma x),
$$

where $\gamma > 0$ is the risk aversion of the investor. We only consider contracts satisfying the integrability condition

$$
\sup_{\pi \in \mathcal{A}} \mathbb{E}^\pi \big[ \exp(-\gamma' \xi) \big] < +\infty, \quad \text{for some } \gamma' > \gamma. \tag{2.2}
$$

Remark 2.3.

We emphasise here that the notion of price for a bond is meaningless, as it is not quoted on the National Best Bid and Offer (NBBO): this is an OTC market where the liquidity is provided by one or several dealers. In particular, even though there is a quantity defined as the bond price on Bloomberg, it serves only as an indication, as the dealers have no obligation to buy or sell at this price. However, especially in the case of treasury bonds, futures on the bonds are listed on the Chicago Board of Trade, where the notion of price is meaningful. Thus, throughout the article, the notion of bond price should be understood as the price of a future on the considered bond.
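The investor's criterion introduced in this section can be sketched numerically. The snippet below assumes that the cost $k$ is quadratic in the deviation from the targets $\alpha$, and evaluates the CARA utility for a constant investment policy and a deterministic payment `xi`. These simplifications, as well as the function names, are illustrative assumptions: in the model both the policy and the contract are stochastic processes.

```python
import math


def cost(p, alpha, beta):
    """Quadratic cost k(p) = 0.5 * sum_i beta_i * (p_i - alpha_i)^2.

    p, alpha and beta stack the green, conventional and index components.
    """
    return 0.5 * sum(b * (x - a) ** 2 for x, a, b in zip(p, alpha, beta))


def cara_utility(x, gamma):
    """CARA utility U_A(x) = -exp(-gamma * x), with risk aversion gamma > 0."""
    return -math.exp(-gamma * x)


def investor_utility_constant_policy(p, alpha, beta, gamma, T, xi=0.0):
    """Utility of holding the constant position p over [0, T].

    With a constant policy, the running cost integrates to T * k(p).
    """
    return cara_utility(xi - T * cost(p, alpha, beta), gamma)


if __name__ == "__main__":
    alpha = [1.0, 2.0, 3.0]   # hypothetical targets (green, conventional, index)
    beta = [0.5, 0.5, 1.0]    # hypothetical cost intensities
    at_target = investor_utility_constant_policy(alpha, alpha, beta, gamma=2.0, T=1.0)
    shifted = investor_utility_constant_policy([2.0, 2.0, 3.0], alpha, beta, gamma=2.0, T=1.0)
    print(at_target, shifted)
```

Without any payment, staying at the targets costs nothing and yields utility $-1$, while any deviation strictly lowers the utility. This is why a deviation towards green bonds has to be compensated by the contract.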

On the other hand, the government wishes to maximise the portfolio value of the bonds issued while increasing the amount invested in green bonds. Thus, it wants to maximise, on average, the quantity

$$
X_T - \sum_{i=1}^{d_g} \int_0^T \kappa \Big( G_i - \big( \hat\pi^g_t(\xi) \big)_i \Big)^2 \mathrm{d}t,
$$

where for $i \in \{1,\dots,d_g\}$, $G_i$ is the investment target in the $i$-th green bond of the government entity, $\kappa > 0$ is the cost intensity of moving away from the targets $(G_1,\dots,G_{d_g})$, and $\hat\pi(\xi)$ is a best response of the investor to a given contract $\xi$. We assume that the cost of moving away from the targets is the same for each green bond, meaning that the government does not have different preferences for each bond (this assumption can of course be relaxed). The government also subtracts from this quantity the contract $\xi$ offered to the investor. Thus, its optimisation problem with CARA utility function writes

$$
V_P = \sup_{\xi \in \mathcal{C}} \sup_{\hat\pi \in \mathcal{A}(\xi)} \mathbb{E}^{\hat\pi} \bigg[ U_P \bigg( X_T - \sum_{i=1}^{d_g} \int_0^T \kappa \Big( G_i - \big( \hat\pi^g_t(\xi) \big)_i \Big)^2 \mathrm{d}t - \xi \bigg) \bigg], \quad U_P(x) = -\exp(-\nu x), \tag{2.3}
$$

where $\nu > 0$ is the risk aversion of the government,

$$
\mathcal{A}(\xi) := \bigg\{ \hat\pi \in \mathcal{A} : V_A(\xi) = \mathbb{E}^{\hat\pi} \bigg[ -\exp\bigg( -\gamma \bigg( \xi - \int_0^T k(\hat\pi_s) \mathrm{d}s \bigg) \bigg) \bigg] \bigg\}
$$

is the set of best responses of the Agent to a given contract $\xi$, and

$$
\mathcal{C} = \big\{ \xi : \mathbb{R}\text{-valued, } \mathcal{F}_T\text{-measurable random variable such that } V_A(\xi) \geq R, \text{ and (2.2) is satisfied} \big\}
$$

is the set of admissible contracts for the government, where $R < 0$ is the reservation utility of the investor: he does not accept a contract $\xi$ unless the contract is such that his expected utility is above $R$.

Remark 2.4.

We consider here that the reservation utility corresponds to the utility function of the investor in the case $\xi = 0$, that is

$$
R = V_A(0) = \sup_{\pi \in \mathcal{A}} \mathbb{E}^\pi \bigg[ -\exp\bigg( \gamma \int_0^T k(\pi_s) \mathrm{d}s \bigg) \bigg] = -1,
$$

where the supremum is reached by choosing $\pi = (\alpha^g, \alpha^c, \alpha^I)$. We will see in the following section that the optimal contract proposed by the government will always saturate this constraint, that is, the Principal will provide the Agent with the minimum reservation utility $R$ he requires.

We will see later that there might be several best responses of the Agent. Thus, following the tradition in the moral hazard literature, we assume that the Principal has enough bargaining power to be able to choose the best response of the Agent that maximises his own utility.

Solving the optimisation problem

In this section, we derive the optimal governmental incentives proposed to the investor. As it would be unrealistic (and hardly tractable) to offer a compensation based on the whole universe of governmental bonds, we suggest a remuneration based on the green bonds, the value of the portfolio and an index of conventional bonds. This way, the contract is only indexed on $d_g + 2$ variables. The optimal incentives are obtained by maximising a deterministic function, which makes the problem easily tractable for a large number of green bonds. We begin this section with the definition of contractible and non-contractible variables.

Definition 3.1.

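The set of contractible variables is defined as the $\mathbb{R}^{d_g+2}$-valued process

$$
B^{\mathrm{obs}} := \begin{pmatrix} X \\ W^g \\ W^I \end{pmatrix}.
$$

The set of non-contractible variables is defined as the $\mathbb{R}^{d_c}$-valued process $B^{\overline{\mathrm{obs}}} := W^c$, with the following dynamics

$$
\mathrm{d}B^{\mathrm{obs}}_t := \mu^{\mathrm{obs}}(t, \pi_t) \mathrm{d}t + \Sigma^{\mathrm{obs}}(t, \pi_t) \mathrm{d}W_t, \quad
\mathrm{d}B^{\overline{\mathrm{obs}}}_t := \mu^{\overline{\mathrm{obs}}} \mathrm{d}t + \Sigma^{\overline{\mathrm{obs}}} \mathrm{d}W_t,
$$

where $\mu^{\overline{\mathrm{obs}}} := 0_{d_c,1}$, $\Sigma^{\overline{\mathrm{obs}}} := \begin{pmatrix} 0_{d_c,d_g} & I_{d_c} & 0_{d_c,1} \end{pmatrix}$, and the maps $\mu^{\mathrm{obs}} : [0,T] \times \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R} \longrightarrow \mathbb{R}^{d_g+2}$, as well as $\Sigma^{\mathrm{obs}} : [0,T] \times \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R} \longrightarrow \mathbb{M}_{d_g+2, d_g+d_c+1}(\mathbb{R})$, are defined for any $p := (p^g, p^c, p^I) \in \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R}$ and $t \in [0,T]$ by

$$
\mu^{\mathrm{obs}}(t, p) := \begin{pmatrix}
p^g \cdot \big( r^g(t) + \eta^g(t) \circ \sigma^g(t) \big) + p^c \cdot \big( r^c(t) + \eta^c(t) \circ \sigma^c(t) \big) + p^I \mu^I(t) \\
0_{d_g,1} \\
0
\end{pmatrix},
\quad
\Sigma^{\mathrm{obs}}(t, p) := \begin{pmatrix}
\big( p^g \circ \sigma^g(t) \big)^\top & \big( p^c \circ \sigma^c(t) \big)^\top & p^I \sigma^I(t) \\
I_{d_g} & 0_{d_g,d_c} & 0_{d_g,1} \\
0_{1,d_g} & 0_{1,d_c} & 1
\end{pmatrix}.
$$

Finding the optimal contract $\xi$ in the optimisation problem (2.3) is an arduous task, as we search for a solution in the space of $\mathcal{F}_T$-measurable random variables. However, it has been shown, see Cvitanić, Possamaï, and Touzi [18], that without reducing the utility of the Principal, we can restrict our study to admissible contracts which have a specific form. In order to describe this result, we first need to introduce additional notations.

We define the quantities

$$
B := \begin{pmatrix} B^{\mathrm{obs}} \\ B^{\overline{\mathrm{obs}}} \end{pmatrix}, \quad
\mu(t, p) := \begin{pmatrix} \mu^{\mathrm{obs}}(t, p) \\ \mu^{\overline{\mathrm{obs}}} \end{pmatrix}, \quad
\Sigma(t, p) := \begin{pmatrix} \Sigma^{\mathrm{obs}}(t, p) \\ \Sigma^{\overline{\mathrm{obs}}} \end{pmatrix}, \quad
(t, p) \in [0,T] \times \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R}.
$$

We will also need to introduce the map $h : [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \times K \longrightarrow \mathbb{R}$, with

$$
h(t, z, g, p) = -k(p) + z \cdot \mu(t, p) + \frac{1}{2} \mathrm{Tr}\Big[ g \, \Sigma(t, p) \, \Sigma \, \big( \Sigma(t, p) \big)^\top \Big], \quad (t, z, g, p) \in [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \times K,
$$

and for all $(t, z, g) \in [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R})$,

$$
O(t, z, g) := \Big\{ \hat p \in K : \hat p \in \operatorname*{argmax}_{p \in K} \big\{ h(t, z, g, p) \big\} \Big\}
$$

is the set of the maximisers of $h$ with respect to its last variable, for $(t, z, g)$ given. Following Schäl [43], there exists at least one Borel-measurable map $\hat\pi : [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \longrightarrow K$ such that for every $(t, z, g) \in [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R})$, $\hat\pi(t, z, g) \in O(t, z, g)$. We denote by $\mathcal{O}$ the corresponding set of all such maps.

Theorem 3.2.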

Without reducing the utility of the Principal, we can restrict the study of admissible contracts to the set $\mathcal{C}_1$, where any $\xi \in \mathcal{C}_1 \subset \mathcal{C}$ is of the form $\xi = Y_T^{y_0, Z, \Gamma, \hat\pi}$ where, for $t \in [0,T]$,

$$
Y_t^{y_0, Z, \Gamma, \hat\pi} := y_0 + \int_0^t Z_s \cdot \mathrm{d}B_s + \frac{1}{2} \int_0^t \mathrm{Tr}\Big[ \big( \Gamma_s + \gamma Z_s Z_s^\top \big) \mathrm{d}\langle B \rangle_s \Big] - \int_0^t h\big( s, Z_s, \Gamma_s, \hat\pi(s, Z_s, \Gamma_s) \big) \mathrm{d}s, \tag{3.1}
$$

where $y_0 \in \mathbb{R}$, $\hat\pi \in \mathcal{O}$ and $(Z, \Gamma)$ are respectively $\mathbb{R}^{d_g+d_c+2}$- and $\mathbb{S}_{d_g+d_c+2}(\mathbb{R})$-valued, $\mathbb{F}$-predictable processes such that Condition (2.2) is satisfied for $Y_T^{y_0, Z, \Gamma, \hat\pi}$, and $V_A\big( Y_T^{y_0, Z, \Gamma, \hat\pi} \big) \geq U_A(y_0)$. We denote by $\mathcal{ZG}$ the set of such processes, which is properly defined in Equation (B.3). Moreover, we have

$$
V_A\big( Y_T^{y_0, Z, \Gamma, \hat\pi} \big) = U_A(y_0), \quad
\mathcal{A}\big( Y_T^{y_0, Z, \Gamma, \hat\pi} \big) = \Big\{ \big( \hat\pi(t, Z_t, \Gamma_t) \big)_{t \in [0,T]} : \hat\pi \in \mathcal{O}, \ (Z, \Gamma) \in \mathcal{ZG} \Big\}.
$$

The term $\int_0^T Z_t \cdot \mathrm{d}B_t$ is a remuneration indexed linearly on the state variables. Contrary to the classical Principal-Agent problem where the agent controls the drift of the output process, see Sannikov [42] for example, the admissible contracts (3.1) are not only linear functions of the state variables but depend also linearly on their quadratic variation and covariation. This comes from the fact that by investing in the bonds, the investor controls directly the volatility of the portfolio process $X$. Using standard tools of static hedging, this contract can be replicated using futures, log-contracts and volatility products such as variance swaps, see Section 3.2.2 for details. In particular, this ensures that the contracts we recommend are practically implementable.

As stated at the beginning of this section, we wish to build an optimal contract based only on the green bonds, the portfolio process and the index of conventional bonds. In this regard, the form we obtained in Equation (3.1) is too general, which is why we are now going to restrict our attention to a slightly smaller class of contracts.
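The best-response map $\hat\pi$ appearing in Theorem 3.2 is a pointwise maximiser of the Hamiltonian over the compact set $K$. The sketch below illustrates this step in a drastically reduced setting that is an assumption of the example, not the model of the paper: a single asset with position $p \in [\varepsilon, b_\infty]$, a single contractible variable (the portfolio value), a quadratic cost, and constant coefficients, so that the Hamiltonian reduces to $-\frac{1}{2}\beta(p-\alpha)^2 + z\,p\,m + \frac{1}{2} g\, p^2 \sigma^2$ with $m$ the expected return. All names are hypothetical.

```python
def hamiltonian(p, z, g, alpha, beta, m, sigma):
    """One-dimensional Hamiltonian: minus cost, plus drift and volatility payments."""
    return -0.5 * beta * (p - alpha) ** 2 + z * p * m + 0.5 * g * (p * sigma) ** 2


def best_response(z, g, alpha, beta, m, sigma, eps, b_max):
    """Maximise the Hamiltonian over p in [eps, b_max].

    The Hamiltonian is quadratic in p. When it is strictly concave
    (beta > g * sigma^2) the stationary point is a candidate; otherwise the
    maximum is attained at one of the two bounds.
    """
    candidates = [eps, b_max]
    curvature = beta - g * sigma ** 2
    if curvature > 0:
        p_interior = (beta * alpha + z * m) / curvature
        if eps < p_interior < b_max:
            candidates.append(p_interior)
    return max(candidates, key=lambda p: hamiltonian(p, z, g, alpha, beta, m, sigma))


if __name__ == "__main__":
    params = dict(alpha=1.0, beta=2.0, m=0.03, sigma=0.1, eps=0.01, b_max=10.0)
    print(best_response(z=0.0, g=0.0, **params))    # no incentive: stay at target
    print(best_response(z=10.0, g=0.0, **params))   # drift incentive raises p
    print(best_response(z=0.0, g=300.0, **params))  # strong volatility incentive
```

With a large positive `g` the Hamiltonian becomes convex in `p` and the maximiser jumps to a bound of the interval. This illustrates why the compactness of $K$ matters and why the set of maximisers need not be a singleton in general.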
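We thus define for any $(Z,\Gamma) \in \mathcal{ZG}$
$$Z_t =: \begin{pmatrix} Z_t^{\mathrm{obs}} \\ Z_t^{\overline{\mathrm{obs}}} \end{pmatrix}, \qquad \Gamma_t =: \begin{pmatrix} \Gamma_t^{\mathrm{obs}} & \Gamma_t^{\mathrm{obs},\overline{\mathrm{obs}}} \\ \big(\Gamma_t^{\mathrm{obs},\overline{\mathrm{obs}}}\big)^\top & \Gamma_t^{\overline{\mathrm{obs}}} \end{pmatrix},$$
where for Lebesgue-almost every $t \in [0,T]$
$$Z_t^{\mathrm{obs}} \in \mathbb{R}^{d_g+2}, \quad Z_t^{\overline{\mathrm{obs}}} \in \mathbb{R}^{d_c}, \quad \Gamma_t^{\mathrm{obs}} \in \mathbb{S}^{d_g+2}(\mathbb{R}), \quad \Gamma_t^{\overline{\mathrm{obs}}} \in \mathbb{S}^{d_c}(\mathbb{R}), \quad \Gamma_t^{\mathrm{obs},\overline{\mathrm{obs}}} \in \mathbb{M}_{d_g+2,d_c}(\mathbb{R}).$$
We then consider a simplified Hamiltonian $h^{\mathrm{obs}} : [0,T] \times \mathbb{R}^{d_g+2} \times \mathbb{S}^{d_g+2}(\mathbb{R}) \times K \longrightarrow \mathbb{R}$ given by
$$h^{\mathrm{obs}}(t, z^{\mathrm{obs}}, g^{\mathrm{obs}}, p) = -k(p) + z^{\mathrm{obs}} \cdot \mu^{\mathrm{obs}}(t,p) + \frac12 \mathrm{Tr}\Big[g^{\mathrm{obs}}\, \Sigma^{\mathrm{obs}}(t,p)\, \Sigma\, \big(\Sigma^{\mathrm{obs}}(t,p)\big)^\top\Big],$$
and for all $(t, z^{\mathrm{obs}}, g^{\mathrm{obs}}) \in [0,T] \times \mathbb{R}^{d_g+2} \times \mathbb{S}^{d_g+2}(\mathbb{R})$, we define
$$\mathcal{O}^{\mathrm{obs}}(t, z^{\mathrm{obs}}, g^{\mathrm{obs}}) := \Big\{\hat p \in K : \hat p \in \operatorname*{argmax}_{p \in K} \big\{h^{\mathrm{obs}}(t, z^{\mathrm{obs}}, g^{\mathrm{obs}}, p)\big\}\Big\}.$$
Following again Schäl [43], there exists at least one Borel-measurable map $\hat\pi : [0,T] \times \mathbb{R}^{d_g+2} \times \mathbb{S}^{d_g+2}(\mathbb{R}) \longrightarrow K$ such that for every $(t, z^{\mathrm{obs}}, g^{\mathrm{obs}}) \in [0,T] \times \mathbb{R}^{d_g+2} \times \mathbb{S}^{d_g+2}(\mathbb{R})$, $\hat\pi(t, z^{\mathrm{obs}}, g^{\mathrm{obs}}) \in \mathcal{O}^{\mathrm{obs}}(t, z^{\mathrm{obs}}, g^{\mathrm{obs}})$, and we let $\mathcal{O}^{\mathrm{obs}}$ be the corresponding set of all such maps.

We can now state precisely the class of contracts we are concerned with in this paper.

Assumption 3.3.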

We consider the subset of contracts C := n Y y ,Z, Γ , ˆ πT ∈ C : Z obs = d c , Γ obs = d c ,d c , Γ obs , obs = d g +2 ,d c o . In particular, any ξ ∈ C is of the form ξ = Y y ,Z obs , Γ obs , ˆ πT , where for any t ∈ [0 , T ] , Y y ,Z obs , Γ obs , ˆ πt := y + Z t Z obs s · d B obs s + 12 Tr h(cid:0) Γ obs s + γZ obs s ( Z obs s ) > (cid:1) d h B obs i s i − h obs (cid:16) s, Z obs s , Γ obs s , ˆ π ( s, Z obs s , Γ obs s )) (cid:17) d s, (3.2) where y ≥ , ˆ π ∈ O obs and ( Z obs , Γ obs ) ∈ ZG obs with ZG obs := n ( Z obs , Γ obs ) : R d g +2 × S d g +2 ( R )-valued, F -predictable, s.t. Y y ,Z obs , Γ obs , ˆ πT ∈ C o . The optimisation problem of the government that we now consider is e V P = sup y ≥ sup ( Z obs , Γ obs , ˆ π ) ∈ZG obs ×O obs E ˆ π ( Z, Γ) (cid:20) U P (cid:18) X T − d g X i =1 Z T κ (cid:16) G i − (cid:0) ˆ π g ( t, Z obs t , Γ obs t ) (cid:1) i (cid:17) d t − Y y ,Z obs , Γ obs , ˆ πT (cid:19)(cid:21) , (3.3)This assumption allows us to consider more tractable contracts for a large portfolio of bonds, even if we consider lessgeneral contracts compared to (3.1). Moreover, as the objective of the government is to encourage the acquisitionof green bonds, it is natural to consider a more granular contract with respect to the green bonds and to use onlythe index of conventional bonds as a representative contractible variable of this set of bonds. As we used onlydeterministic functions to model the risk premium, short-term rate and volatility processes, the optimal incentivesof the government can be obtained by maximising a deterministic function, which leads to the following theorem. We use the notation E (ˆ π ( t,Z t , Γ t )) t ∈ [0 ,T ] [ · ] =: E ˆ π ( Z, Γ) [ · ] heorem 3.4 (Main result) . The optimal contract ξ ? ∈ C is given by ξ ? = Y ,z ? obs ,g ? obs ,π ? T = Z T z ? obs ( t ) · d B obs t + 12 Tr h(cid:0) g ? obs ( t ) + γz ? obs ( t )( z ? obs ( t )) > (cid:1) d h B obs i t i − h obs (cid:16) t, z ? obs ( t ) , g ? obs ( t ) , π ? 
(cid:0) t, z ? obs ( t ) , g ? obs ( t ) (cid:1)(cid:17) d t, (3.4) where for all t ∈ [0 , T ] , z ? obs ( · ) , g ? obs ( · ) , π ? (cid:0) · , z ? obs ( · ) , g ? obs ( · ) (cid:1) are deterministic functions of time, solving sup ( z,g, ˆ π ) ∈ P ×O obs H (cid:0) t, z, g, ˆ π ( t, z, g ) (cid:1) , (3.5) where P := R d g +2 × S d g +2 ( R ) and H : [0 , T ] × P × K −→ R is given by H ( t, z, g, p ) := − d g X i =1 (cid:0) G i − p i (cid:1) −

12 Tr h ( g + γzz > )Σ obs (cid:0) t, p (cid:1) Σ(Σ obs (cid:0) t, p (cid:1) > (cid:1)i + h obs (cid:0) t, z, g, p (cid:1) + (cid:16) µ obs (cid:0) t, p (cid:1)(cid:17) − z > µ obs (cid:0) t, p (cid:1) − ν (cid:16)(cid:16) Σ obs (cid:0) t, p (cid:1)(cid:17) , : − z > Σ obs (cid:0) t, p (cid:1)(cid:17) > Σ (cid:16)(cid:16) Σ obs (cid:0) t, p (cid:1)(cid:17) , : − z > Σ obs (cid:0) t, p (cid:1)(cid:17) . Moreover e V P = U P (cid:18) Z T H (cid:16) t, z ?, obs ( t ) , g ?, obs ( t ) , π ? (cid:0) t, z ?, obs ( t ) , g ?, obs ( t ) (cid:1)(cid:17) d t (cid:19) . Proof.

The term in the exponential of the optimisation problem (3.3) is a linear function of y hence the reservationutility of the investor is saturated using y ? = 0. Deﬁne for any martingale M the operator E ( M ) T := exp (cid:18) − νM T + 12 ν h M i T (cid:19) . The government has now to solvesup ( Z, Γ , ˆ π ) ∈ZG obs ×O obs E ˆ π ( Z, Γ) " U P (cid:18) Z T (cid:18)(cid:0) µ obs (cid:0) t, ˆ π ( t, Z t , Γ t ) (cid:1)(cid:17) − d g X i =1 (cid:0) G i − ˆ π i ( t, Z t , Γ t ) (cid:1) −

12 Tr h(cid:16) Γ (cid:0) t, ˆ π ( t, Z t , Γ t ) (cid:1) + γZ t Z > t (cid:17) Σ obs (cid:0) t, ˆ π ( t, Z t , Γ t ) (cid:1) Σ (cid:16) Σ obs (cid:0) t, ˆ π ( t, Z t , Γ t ) (cid:1)(cid:17) > i + h obs (cid:0) t, Z t , Γ t , ˆ π ( t, Z t , Γ t ) (cid:1)(cid:19) d t (cid:19) × exp (cid:18) − ν Z T (cid:16)(cid:16) Σ obs (cid:0) t, ˆ π ( t, Z t , Γ t ) (cid:1)(cid:17) , : − Z > t Σ obs (cid:0) t, ˆ π ( t, Z t , Γ t ) (cid:1)(cid:17) d W t (cid:19) . We make appear the stochastic exponential so that the previous supremum becomessup ( Z, Γ , ˆ π ) ∈ZG obs ×O obs E ˆ π ( Z, Γ) " U P (cid:18) Z T H (cid:0) t, Z t , Γ t , ˆ π ( t, Z t , Γ t ) (cid:1) d t (cid:19) × E (cid:18) Z · (cid:16)(cid:16) Σ obs (cid:0) t, ˆ π ( t, Z t , Γ t ) (cid:1)(cid:17) , : − Z > t Σ obs (cid:0) t, ˆ π ( t, Z t , Γ t ) (cid:1)(cid:17) d W t (cid:19) T . As the function U P ( x ) is increasing and the expectation of a stochastic exponential is bounded by one, we obtain e V P ≤ U P (cid:18) Z T sup ( z,g, ˆ π ) ∈ P ×O obs H (cid:0) t, z, g, ˆ π ( t, z, g ) (cid:1) d t (cid:19) . We have H (cid:0) t, z, g, ˆ π ( t, z, g ) (cid:1) ≤ −

12 Tr h γzz > Σ obs (cid:0) t, ˆ π ( t, z, g ) (cid:1) Σ(Σ obs (cid:0) t, ˆ π ( t, z, g ) (cid:1) > (cid:1)i + (cid:16) µ obs (cid:0) t, ˆ π ( t, z, g ) (cid:1)(cid:17) .

As $0 < \hat\pi(t,z,g) < +\infty$ is uniformly bounded and strictly positive, $\Sigma$ is positive definite and the components of $\Sigma^{\mathrm{obs}}$ are positive, we observe that when $\|z\| + \|g\| \longrightarrow +\infty$, the first term goes to $-\infty$ while the second term is bounded. Therefore, the supremum on $\mathcal{O}^{\mathrm{obs}}$ cannot be attained for infinite values.

If we now choose the incentives $z^{\star,\mathrm{obs}}(t)$, $g^{\star,\mathrm{obs}}(t)$, $\pi^\star\big(t, z^{\star,\mathrm{obs}}(t), g^{\star,\mathrm{obs}}(t)\big)$ as the maximisers of $H$, they are Borel-measurable deterministic functions of $t \in [0,T]$, thus belong to the set $\mathcal{ZG}^{\mathrm{obs}}$, and are bounded on $[0,T]$, so that
$$\mathcal{E}\bigg(\int_0^\cdot \Big(\big(\Sigma^{\mathrm{obs}}\big(t, \pi^\star(t, z^{\star,\mathrm{obs}}(t), g^{\star,\mathrm{obs}}(t))\big)\big)_{1,:} - z^{\star,\mathrm{obs}}(t)^\top \Sigma^{\mathrm{obs}}\big(t, \pi^\star(t, z^{\star,\mathrm{obs}}(t), g^{\star,\mathrm{obs}}(t))\big)\Big)\,\mathrm{d}W_t\bigg)_T$$
is a $\mathbb{P}^{\pi^\star}$-martingale and we obtain
$$\widetilde V^P = U_P\bigg(\int_0^T H\Big(t, z^{\star,\mathrm{obs}}(t), g^{\star,\mathrm{obs}}(t), \pi^\star\big(t, z^{\star,\mathrm{obs}}(t), g^{\star,\mathrm{obs}}(t)\big)\Big)\,\mathrm{d}t\bigg).$$

Static maximisation (3.5) over $(z,g) \in P$ can easily be handled with classic root-finding algorithms for a large portfolio of green bonds. Before moving to the numerical experiments, we discuss the form and implementability of the optimal contract.
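To make the remark on root-finding concrete, here is a minimal Python sketch of Newton's method applied to the first-order condition of a strictly concave objective. The objective used below is a hypothetical toy function chosen only because its gradient and Hessian are explicit; it is not the function $H$ of Theorem 3.4, and the names `newton_maximise`, `toy_grad` and `toy_hess` are assumptions of this example. In the setting of the paper, one such maximisation would be run for each time $t$ on a grid of $[0,T]$.

```python
import numpy as np


def newton_maximise(grad, hess, x0, tol=1e-10, max_iter=50):
    """Find a root of `grad` with Newton's method (maximiser of a concave map)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # Newton step: solve Hessian * step = gradient, then move against it.
        x = x - np.linalg.solve(hess(x), g)
    return x


# Toy strictly concave objective (assumption, stands in for z -> H(t, z, g, p)):
# f(z) = b . z - 0.5 * z^T A z - sum_i exp(z_i), with A symmetric positive definite.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 0.3])


def toy_grad(z):
    return b - A @ z - np.exp(z)


def toy_hess(z):
    return -A - np.diag(np.exp(z))


z_star = newton_maximise(toy_grad, toy_hess, np.zeros(2))
```

Because the toy Hessian is negative definite everywhere, the root of the gradient is the unique maximiser; for the actual $H$, concavity would have to be checked for the chosen parameters.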

The contract consists of the following elements:

• The term $Z^{\star,\mathrm{obs}}_X$ is a compensation given to the investor with respect to the risk associated with the evolution of the portfolio process. Whether $Z^{\star,\mathrm{obs}}_X > 0$ or $Z^{\star,\mathrm{obs}}_X < 0$, between two times $t_2 > t_1$ the investor receives approximately the amount $(Z^{\star,\mathrm{obs}}_X)_{t_1}(X_{t_2} - X_{t_1})$.

• For $i \in \{1, \dots, d_g\}$, the term $Z^{\star,\mathrm{obs}}_i$ is a compensation given to the investor with respect to the volatility risk associated with the evolution of the $i$-th green bond price. Between two times $t_2 > t_1$, the investor receives approximately the amount $(Z^{\star,\mathrm{obs}}_i)_{t_1}(W^i_{t_2} - W^i_{t_1})$: if $Z^{\star,\mathrm{obs}}_i$ is close to zero, the government gives no compensation with respect to the volatility of the $i$-th green bond, and conversely for $Z^{\star,\mathrm{obs}}_i$ far from zero. The intuition behind $Z^{\star,\mathrm{obs}}_I$ is the same.

• The diagonal terms of $\Gamma^{\star,\mathrm{obs}}$ are compensations with respect to the quadratic variation of the portfolio process and the risk sources of the green bonds and the index. For example, if $\Gamma^{\star,\mathrm{obs}}_X > 0$, the government provides remuneration to the investor for a high quadratic variation (which here can be thought of as volatility) of the portfolio process. If $\Gamma^{\star,\mathrm{obs}}_X < 0$, the government penalises a high volatility of the portfolio process.

• The non-diagonal terms of $\Gamma^{\star,\mathrm{obs}}$ are compensations with respect to the quadratic covariation of the portfolio process and the risk sources of the green bonds and the index. For example, if $\Gamma^{\star,\mathrm{obs}}_{X,i} > 0$ for $i \in \{1, \dots, d_g\}$, the government provides remuneration to the investor for similar moves of the portfolio process and the $i$-th green bond. If $\Gamma^{\star,\mathrm{obs}}_{X,i} <$

0, the government encourages opposite moves of the portfolio process and the i -th green bond.• The term G obs ( t, Z ? obs , Γ ? obs ) is a continuous coupon that is given to the investor. It corresponds to the utilityof the investor in the case ξ = 0.For reasonable choices of parameters ( α, β, G ), the supremum of h obs and in (3.5) are strictly concave functions sothat an optimiser is quickly found using root-ﬁnding algorithms. Note that the optimal contract is indexed on theportfolio process X the sources of risk coming from the green bonds W g and the one coming from the index W I .This can be reformulated as an indexing on X and the prices of the bonds. In this case we deﬁne B obs ,p :=  X log( P g )log( P I )  , B obs ,p := log( P c ) , In practice, we observe that for the set of parameters we choose for the numerical experiences, the function h obs is strictly concavewith respect to its last variable thus admits a unique maximizer ˆ π . B obs ,pt := µ obs ,p ( t, π t )d t + Σ obs ,p ( t, π t )d W t , d B obs ,pt := µ obs ,p ( t )d t + Σ obs ,p ( t )d W t , where µ obs ,p ( t, π ) :=  π g · (cid:0) r g ( t ) + η g ( t ) ◦ σ g ( t ) (cid:1) + π c · (cid:0) r c ( t ) + η c ( t ) ◦ σ c ( t ) (cid:1) + π I µ I ( t ) r g ( t ) + η g ( t ) ◦ σ g ( t ) − ( σ g ( t )) > Σ g σ g ( t ) µ I ( t ) − (cid:0) σ I ( t ) (cid:1)  , Σ obs ,p ( t, π ) :=  ( π g ◦ σ ( t ) g ) > ( π c ◦ σ ( t ) c ) > π I σ I ( t )diag (cid:0) σ g ( t ) (cid:1) d g ,d c d g , ,d g ,d c σ I ( t )  ,µ obs ,p ( t ) := (cid:0) r c ( t ) + η c ( t ) ◦ σ c ( t ) − ( σ c ( t )) > Σ c σ c ( t ) (cid:1) , Σ obs ,p ( t ) := (cid:0) d c ,d g diag (cid:0) σ c ( t ) (cid:1) d c , (cid:1) . This leads to minor changes in the computations and the optimal incentives.

We will show in the numerical section that the processes ( π ? , Z ? , Γ ? ) show a rather constant behaviour through theperiod [0 , T ]. Thus, the optimal contract does not need a frequent re-calibration throughout the year. This suggeststhe following approximation ξ ? ≈ ξ ? + ¯ Z ? obs · B obs T + 12 Tr h (¯Γ ? obs + γ ¯ Z ? obs ( ¯ Z ? obs ) > ) h B obs i T i − Z T h obs (cid:16) t, ¯ Z ? obs , ¯Γ ? obs , π ? ( t, ¯ Z ? obs , ¯Γ ? obs ) (cid:17) d t, (3.6)where ¯ Z ? obs , and ¯Γ ? obs are constants corresponding the average of z ? obs ( t ) , g ? obs ( t ) over [0 , T ] deﬁned by¯ Z ? obs = (cid:0) ¯ Z ? obs X , ¯ Z ? obs1 , . . . , ¯ Z ? obs d g , ¯ Z ? obs I (cid:1) > ∈ R d g +2 , ¯Γ ? obs =  ¯Γ ? obs X Γ ? obs X, . . . Γ ? obs X,d g ¯Γ ? obs X,I ¯Γ ? obs X, ¯Γ ? obs1 . . . ¯Γ ? obs1 ,d g ¯Γ ? obs1 ,I ... ... . . . ... ...... ... ... . . . Γ ? obs d g ,I ¯Γ ? obs X,I Γ ? obs1 ,I . . . Γ ? obs d g ,I Γ ? obs I  ∈ S d g +2 ( R ) . In order to provide a practical implementation of the contract, we propose a static replication of its payoﬀ usingﬁnancial instruments. First, note that the incentives ¯ Z ? obs X , and ¯Γ ? obs X are indexed on the holdings of the investor,thus do not need any replication using ﬁnancial instruments. The portion ¯ Z ? obs · B obs T of the contract can be easilyreplicated using log-contracts. For example, for i ∈ { , . . . , d g } , we replicate ¯ Z ? obs i ( B obs T ) i using a long position ofsize Z ? obs i on a log-contract on the i -th green bond with maturity T . In this section, all the derivatives products willhave a maturity equal to T .The portion of the contract with respect to quadratic variation and covariation terms are more subtle to replicate.Deﬁne the matrix ˜ C ∈ S d g +2 ( R ) whose coeﬃcients are given by˜ C i,j := d g +2 X k =1 C i,k h B obs k,j i T , C i,j := ¯Γ ? obs i,j + γ ¯ Z ? obs i ¯ Z ? obs j , ( i, j ) ∈ { , . . . , d g + 2 } . Then, we can rewrite Tr h(cid:0) ¯Γ ? obs + γ ¯ Z ? 
obs ( ¯ Z ? obs ) > (cid:1) h B obs i T i = P d g +2 i =1 ˜ C i,i . Following the reasoning of Carr andLee [14], we note that the quadratic variations and covariations on the logarithm of the green bonds and the indexof conventional bonds can be replicated statically using variance and covariance swaps on the bonds. Finally, theportfolio process is equivalent to holding π ?,g green bonds, π ?,c conventional bonds and π ?,I index. Thus, thequadratic covariation between the portfolio process X and the bonds can be replicated using a linear combination ofvariance and covariance swaps.We are now in position to state the replication strategy for the implementation of the contract. The proof is anapplication of the no-arbitrage principle and It¯o’s formula on the logarithm of the bond prices. Proposition 3.5.

The replication strategy on $[0,T]$ of the optimal contract in (3.6) is as follows:

• For $i \in \{1, \dots, d_g\}$, hold a position of size $\bar Z^{\star,\mathrm{obs}}_i$ in a log-contract on the $i$-th green bond. Hold a position of size $\bar Z^{\star,\mathrm{obs}}_I$ in a log-contract on the index of conventional bonds.

• For $i \in \{2, \dots, d_g+1\}$, $k \in \{2, \dots, d_g+1\}$, hold a position of size $C_{i,k}$ in a covariance swap between the $(i-1)$-th and the $(k-1)$-th green bonds.

• For $i = d_g+2$, $k \in \{2, \dots, d_g+1\}$, hold a position of size $C_{i,k}$ in a covariance swap between the index of conventional bonds and the $(k-1)$-th green bond.

• For $i = k = d_g+2$, hold a position of size $C_{i,i}$ in a variance swap on the index of conventional bonds.

• For $i = 1$, $k \in \{2, \dots, d_g+1\}$, $l_g \in \{1, \dots, d_g\}$, $l_c \in \{1, \dots, d_c\}$, hold a position of size $C_{i,k}\,\pi^{\star,g}_{l_g}$ in a (co)variance swap between the $(k-1)$-th and the $l_g$-th green bonds, a position of size $C_{i,k}\,\pi^{\star,c}_{l_c}$ in a covariance swap between the $(k-1)$-th green bond and the $l_c$-th conventional bond, and a position of size $C_{i,k}\,\pi^{\star,I}$ in a covariance swap between the index of conventional bonds and the $(k-1)$-th green bond.

• For $i = 1$, $k = d_g+2$, $l_g \in \{1, \dots, d_g\}$, $l_c \in \{1, \dots, d_c\}$, hold a position of size $C_{i,k}\,\pi^{\star,g}_{l_g}$ in a covariance swap between the index of conventional bonds and the $l_g$-th green bond, a position of size $C_{i,k}\,\pi^{\star,c}_{l_c}$ in a covariance swap between the index of conventional bonds and the $l_c$-th conventional bond, and a position of size $C_{i,k}\,\pi^{\star,I}$ in a variance swap on the index of conventional bonds.

The contract can thus be implemented in practice using only the value of the portfolio of bonds, log-contracts, and variance and covariance swaps on the different bonds.
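As a rough illustration of the bookkeeping behind this replication, the Python sketch below builds the matrix of position sizes $C_{i,j} = \bar\Gamma^{\star,\mathrm{obs}}_{i,j} + \gamma \bar Z^{\star,\mathrm{obs}}_i \bar Z^{\star,\mathrm{obs}}_j$ and evaluates the trace term $\sum_i \tilde C_{i,i}$ for a given terminal covariation matrix. All inputs are hypothetical numbers for a single green bond ($d_g = 1$), with components ordered as (portfolio, green bond, index); the function names are assumptions of this example.

```python
import numpy as np


def swap_position_sizes(gamma_bar, z_bar, risk_aversion):
    """Return C = Gamma_bar + gamma * Z_bar Z_bar^T (sizes of the swap positions)."""
    z_bar = np.asarray(z_bar, dtype=float)
    gamma_bar = np.asarray(gamma_bar, dtype=float)
    return gamma_bar + risk_aversion * np.outer(z_bar, z_bar)


def variation_payoff(c_matrix, qv_terminal):
    """Return Tr[C <B>_T], i.e. the sum of the diagonal of C~ = C <B>_T."""
    c_tilde = c_matrix @ qv_terminal
    return float(np.trace(c_tilde))


# Hypothetical inputs (not calibrated values), ordered as (X, green bond, index).
gamma_bar = np.array([[0.10, 0.02, -0.01],
                      [0.02, 0.00, 0.00],
                      [-0.01, 0.00, 0.00]])
z_bar = np.array([0.5, 0.0, 0.0])
qv_terminal = np.array([[0.040, 0.010, 0.012],
                        [0.010, 0.020, 0.008],
                        [0.012, 0.008, 0.030]])

C = swap_position_sizes(gamma_bar, z_bar, risk_aversion=1.0)
# Quadratic-variation part of the approximate contract: 0.5 * Tr[C <B>_T].
payoff = 0.5 * variation_payoff(C, qv_terminal)
```

Each entry `C[i, k]` is the notional that the proposition assigns to the corresponding variance or covariance swap, before the additional weighting by the holdings $\pi^\star$ in the rows involving the portfolio process.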

Remark 3.6.

We would like to emphasise that, even though it is possible to replicate in practice the optimal contractusing variance and covariance swaps on the government bonds, these derivatives might be highly illiquid on ﬁnancialmarkets. However, it is possible to replicate these volatility derivatives using the log-contracts and the bonds. Indeed,a variance swap on a bond P t ( we omit to describe the type of bond for notational simplicity ) of maturity T can bereplicated by holding for all t ∈ [0 , T ] one log-contract that pays − P T /P ) and /P t bonds P t . A covariance swapon the bonds P t , and P t can be replicated by holding for all t ∈ [0 , T ] one log-contract that pays − P T /P ) ,one log-contract that pays − P T /P ) , short variance swap on P , and short variance swap on P , long / ( P t P t ) bond P t := P t P t . Thus, the optimal contract ξ in (3.4) can be implemented only using bond prices andlog-contracts. Finally, note that if vanilla options on the futures on the bonds are available on the market, one can use the Carr-Madan formula, see Carr and Madan [15] to replicate the log-contract payoﬀs in Remark 3.6. Thus, the optimalcontract in (3.6) can be implemented in practice in three diﬀerent ways: using the bond prices, the portfolio process,the variance and covariance swaps on the bonds; using the bond prices, the portfolio process, and the log-contractson the bonds; or using the bond prices, the portfolio process, and vanilla options on the bond prices.
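The replication of a variance swap with a log-contract and a dynamic position in the bond rests on Itô's formula applied to $\log P$, which gives $\int_0^T \sigma_t^2\,\mathrm{d}t = 2\int_0^T \mathrm{d}P_t/P_t - 2\log(P_T/P_0)$. The following Python sketch checks this identity numerically on a simulated path. The geometric Brownian motion, its parameters and the factor-2 normalisation are assumptions of this example and may differ from the conventions used in the remark.

```python
import numpy as np


def simulate_gbm(p0, drift, vol, horizon, n_steps, seed=0):
    """Simulate a geometric Brownian motion path (illustrative price model)."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    shocks = rng.standard_normal(n_steps)
    log_incr = (drift - 0.5 * vol**2) * dt + vol * np.sqrt(dt) * shocks
    return p0 * np.exp(np.concatenate([[0.0], np.cumsum(log_incr)]))


def realized_variance(prices):
    """Sum of squared log-returns, the floating leg of a variance swap."""
    return float(np.sum(np.diff(np.log(prices)) ** 2))


def log_contract_replication(prices):
    """Payoff of the replicating strategy: 2 * sum(dP / P) - 2 * log(P_T / P_0)."""
    simple_returns = np.diff(prices) / prices[:-1]   # gains of holding 1/P_t bonds
    log_leg = np.log(prices[-1] / prices[0])         # log-contract payoff
    return float(2.0 * np.sum(simple_returns) - 2.0 * log_leg)


prices = simulate_gbm(p0=100.0, drift=0.01, vol=0.1, horizon=1.0, n_steps=10000)
rv = realized_variance(prices)
replicated = log_contract_replication(prices)
```

On a fine grid the two quantities agree up to third-order terms in the returns, which is the discrete counterpart of the static replication argument.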

In the current section, we provide numerical examples illustrating the eﬃciency of our incentives method.

We illustrate our methodology on an example with real-world data. The dataset is composed of 3 French governmental bonds, one green bond and two conventional bonds, with the following characteristics.

| | Bloomberg Ticker | Valuation date | Maturity | Amount issued | Issue price | Coupon |
|---|---|---|---|---|---|---|
| Green bond | FRTR 1 3/4 | 24/01/2017 | 25/06/2039 | 27.375b | 100.162 | 1.75 |
| Conv. bond 1 | FRTR 6 | 02/01/1994 | 25/10/2025 | 30.654b | 95.29 | 6 |
| Conv. bond 2 | FRTR 4 | 09/03/2010 | 25/04/2060 | 16.000b | 96.34 | 4 |

We also deﬁne the index of conventional bonds I t as a geometric average of the conventional bonds, weighted by theamount issued. We perform the calibration using the daily prices of the bonds from 10 / / / / r g ( t ) = a r,g + b r,g ( T g − t ) , η g ( t ) = a ξ,g + b ξ,g ( T g − t ) , σ g ( t ) = a σ,g + b σ,g ( T g − t ) ,r ,c ( t ) = a r,c + b r,c ( T ,c − t ) , η ,c ( t ) = a ξ,c + b ξ,c ( T ,c − t ) , σ ,c ( t ) = a σ,c + b σ,c ( T ,c − t ) ,r ,c ( t ) = a r,c + b r,c ( T ,c − t ) , η ,c ( t ) = a ξ,c + b ξ,c ( T ,c − t ) , σ ,c ( t ) = a σ,c + b σ,c ( T ,c − t ) µ I ( t ) = a µ,I + b µ,I ( T I − t ) , σ I ( t ) = a σ,I + b σ,I ( T I − t ) , with T g = 19 . , T ,c = 6 . T ,c = 40 .

58, and T I = 18 .

29. In order to calibrate the dynamics of the bonds in (2.1)over the period, we use a classic least-square algorithm and we obtain the following set of parameters a r,g = − . , b r,g = 0 . , a ξ,g = 0 . , b ξ,g = 0 . , a σ,g = 0 . , b σ,g = 0 . ,a r,c = − . , b r,c = − . , a ξ,c = 0 . , b ξ,c = 0 . , a σ,c = 0 . , b σ,c = 0 . ,a r,c = 0 . , b r,c = 0 . , a ξ,c = 0 . , b ξ,c = − . , a σ,c = 0 . , b σ,c = − . ,a µ,I = − . , b µ,I = 0 . , a σ,I = 0 . , b σ,I = 0 . , and the correlation matrix is given by Σ =  . . . . . . . . . . . .  . The time horizon of the investor and the government is equal to one year, i.e T = 1. We deﬁne a so-called referencecase, which is a reference to analyze the impact of our incentives policy. In this setting, ν = γ = 1 , G = d g , κ = 0 , β = (0 . , . , . , . , α = (0 . , . , . , . . Thus, the investor and the government have the same risk aversion, and the government has no speciﬁc incentivesto increase the investments in the green bond. The only objective of the government is to maximise the value of theportfolio of bonds. The investor has the same cost intensity for every bonds and wishes to invest more in the indexand the second conventional bond compared to the green and the ﬁrst conventional bond. This corresponds to arisk-averse investor who prefers a diversiﬁed portfolio of conventional bonds, and is reluctant to invest in the greenbonds. 
Finally, the reservation utility of the investor is set equal to his utility in the case $\xi = 0$.

We summarise the important empirical findings coming from the numerical results.

• The methodology we propose significantly outperforms the current tax-incentives policy: for the same result in terms of green investments, our methodology leads to a value of the portfolio process 15% to 20% higher.

• The optimal investment policy is robust to model specification: by using a one-factor model on the short-term rates of the green bond, we observe that the investor's strategy oscillates slightly around the one obtained with deterministic rates.

• The optimal controls show a rather constant behaviour throughout the year: the government does not have to frequently recalibrate the optimal contract.

• The government can increase the amount invested in the green bonds by means of $G$ and $\kappa$. This decreases his utility, as he must provide higher incentives to the investor.

• The most important incentive with respect to the contractible variables is $Z^\star_X$: the government always encourages a higher value of the portfolio of bonds by setting $Z^\star_X > 0$.

We also provide some general remarks for the policy-maker.

• The parameters $(\alpha, \beta)$ modelling the preferences of the investor should be calibrated using the historical data on the issuance of bonds. For example, for $i \in \{1, \dots, d_g\}$, the coefficient $\alpha_i$ should be equal to the historical amount invested in the green bond $P^g_i$, and $\beta_i$ should be equal to the variance of the amount invested in this green bond throughout the year. Note however that historical data on bonds with the same characteristics may not be available, especially for countries with small amounts issued.
Thus, the parameters ( α, β ) mightbe re-scaled depending on the maturity and the coupon of the newly issued bond: A risky investor such asa ﬁxed-income hedge fund might increase his investment in the bond if it oﬀers a higher coupon, whereasinstitutional investors such as pension funds will tend to buy bonds with a better rating.• The risk-aversion parameter γ should be chosen such that, in the case ξ = 0 and with ( α, β ) chosen as explainedpreviously, the optimal controls π ? correspond roughly to the historical positions of the investor.• The risk-aversion parameter ν should be chosen heuristically such that the optimal contract oﬀered to theinvestor bring the investments closer to the target G and the amount ξ ? oﬀered by the government is reasonable.The terms ‘closer to’ and ‘reasonable’ have to be interpreted by the policy-maker in view of their own budgetconstraints and political objectives.• In the case of a small number of bonds issued, the government can, for sake of simplicity, propose a contractindexed only on the value of the portfolio. In the absence of a contract, that is ξ = 0, the investor matches his investments π ? ( ξ ) with the target α as he hasno incentives to deviate. Thus, the optimal investments are given by π ?g (0) = 0 . , π ?c (0) = (0 . , . , π ?I (0) = 0 . π ? and the optimal incentives Z ? , and Γ ? through time. One can seethat, even if the risk premia, the short-term rates and volatility processes have a deterministic aﬃne structure withrespect to time, the processes ( π ? , Z ? , Γ ? ) show a rather constant behaviour through the year. Thus, the optimalcontract does not need frequent recalibration through the year.Figure 1: Optimal investment policy (upper left), optimal incentives Z ? (upper right) and Γ ? (bottom) as a function of time. 
Compared to the case $\xi = 0$, we observe that the contract increases the investment in the green bond and the second conventional bond, while reducing the investment in the index and the first conventional bond. Given the dynamics of the bonds described previously, as well as the preferences of the investor, it is natural that he invests mostly in the index and the second conventional bond. As the green bond has a higher short-term rate and risk premium than the first conventional bond, the trader invests a larger part of his wealth in it.

The optimal incentives with respect to the sources of risk are as follows: the incentives with respect to the green bond and the index of conventional bonds are set to zero, whereas the incentive with respect to the value of the portfolio of bonds is strictly positive. Thus, the government provides incentives only to increase the value of the portfolio. We observe at the bottom of Figure 1 the incentives with respect to the quadratic variations of the contractible variables. The government provides no incentives with respect to the quadratic variation of the index and the green bond, while it encourages a high quadratic variation of the portfolio process. The incentives with respect to the quadratic covariations are as follows: the government penalises a high covariation between the portfolio process and the index as well as between the green bonds and the index, while encouraging a high covariation between the portfolio and the green bond.

To illustrate the benefits of the use of a contract, we plot in Figure 2 some simulations of the evolution of the portfolio process over the year with and without contract (that is, when $\xi = 0$). We observe that the portfolio process is higher when the government provides a contract to the investor. This is also illustrated in Figure 3, where we show the cumulated difference between the portfolio processes with and without contract, using 10000 simulations.

Figure 2: Some trajectories of the optimal portfolio process with and without contract.

Figure 3: Average absolute difference of portfolio value over time, for 10000 simulations.

As the notion of incentives with respect to quadratic variation might not be easy to understand, we present in Figure 4 the optimal investment and incentives when the government sets $\Gamma = 0$.

Figure 4: Optimal investment policy (left) and optimal incentives $Z^\star$ (right) as a function of time.

Compared to Figure 1, we observe that the government sets a higher incentive on the value of the portfolio, while the optimal investment policy is slightly higher on every asset, but not materially different from a framework with an optimal contract depending on both the dynamics and the quadratic variations of the contractible variables. Thus, for the sake of simplicity, a government can build an optimal incentives scheme based only on the dynamics of the green bonds, the value of the portfolio and the index of conventional bonds.

We show that, using a more complex model for the short-term rates of the green bonds, the results are qualitativelythe same. Using the methodology in Appendix C, we assume that the short-term rate of the green bond is drivenby a one-factor stochastic model, that is d r gt = θ g ( m g − r gt )d t + σ g d W g,rt , (4.1)where W g,r is a one-dimensional Brownian and ( θ g , m g , σ g ) ∈ R . Using a least-square algorithm, a calibration onthe short-term rate curve of the green bond gives the following parameters θ g = 0 . , m g = 0 . , σ g = 0 . . We show in Figure 5 the optimal investment policy when the short-term rate of the green bond is driven by (4.1). Thisis obtained by solving the 4-dimensional HJB equation (C.5) using a fully implicit scheme and locally unidimensionalmethods on sparse grids. Note that the optimisation is much harder to complete since for every π ? ( t, z, g, r g ) wehave to solve a 4-dimensional HJB equation and iterate until we ﬁnd the optima ( z ? , Γ ? ). We observe that theoptimal policy oscillates around the values obtained in the case of deterministic short-term rates in Figure 1. Asthe bonds are all positively but not perfectly correlated, a change of investment in the green bond induces a changeof smaller magnitude in the other bonds. The magnitude of oscillation around the value with deterministic rates isnot high, thus we observe same results from a qualitative point of view. As the use of stochastic rates can only beviable for a small portfolio of bonds, and as the diﬀerence of behaviour is negligible, we can argue that the use ofdeterministic short-term rates is more suited to practical applications.Figure 5: Optimal investment policy with stochastic rates. In particular, as the bond prices do not vary drastically during the year, we use 10 time steps, 40 space steps for the cash process,10 for the stochastic rate and 20 for the risk factors of the green bond and index of conventional bonds. 
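For readers who want to experiment with the one-factor dynamics (4.1), the following Python sketch simulates the short-term rate with the exact Gaussian transition of the process, $r_{t+\Delta} = m + (r_t - m)e^{-\theta\Delta} + \sigma\sqrt{(1 - e^{-2\theta\Delta})/(2\theta)}\,\varepsilon$. The function name and all parameter values used in the example call are illustrative assumptions, not the calibrated parameters of the paper.

```python
import numpy as np


def simulate_ou_rate(r0, theta, mean_level, vol, horizon, n_steps, n_paths, seed=0):
    """Simulate dr = theta * (m - r) dt + vol dW with the exact discretisation."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    decay = np.exp(-theta * dt)
    # Standard deviation of the Gaussian transition over one time step.
    step_std = vol * np.sqrt((1.0 - decay**2) / (2.0 * theta))
    rates = np.empty((n_paths, n_steps + 1))
    rates[:, 0] = r0
    for k in range(n_steps):
        noise = rng.standard_normal(n_paths)
        rates[:, k + 1] = mean_level + (rates[:, k] - mean_level) * decay + step_std * noise
    return rates


# Hypothetical parameters, chosen only to exercise the function.
paths = simulate_ou_rate(r0=0.0, theta=0.5, mean_level=0.01, vol=0.005,
                         horizon=1.0, n_steps=50, n_paths=20000)
terminal_mean = paths[:, -1].mean()
```

The exact transition avoids the time-discretisation bias of an Euler scheme, so the number of time steps only controls how finely the path is observed.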
.2.5 Comparison with current tax-incentives policy The purpose of the paper is to show that a form of incentives based on the value of the portfolio and the prices of thebonds performs better than the current tax-incentives policy. As stated in the introduction, the incentives policy toincrease investment in green bonds takes the form of tax credit or cash rebate, depending on the amount invested.Thus, in our Principal-Agent framework, it takes the following form ξ = c Z T d g X i =1 π gt d t, where c > c so that theamount invested in green bonds is the same as in Figure 1. In Figure 6, we plot the average relative diﬀerencebetween the cash processes of the government using our optimal policy and the actual tax-incentives. We observethat the diﬀerence increases with time, thus for a same result in terms of green investments our optimal contractincreases its utility compared to the actual incentives policy.Figure 6: Average relative diﬀerence (in %) of portfolio value over time for 10000 simulations.
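The tax-incentives benchmark pays the investor an amount proportional to the time-integrated green holdings, $\xi = c\int_0^T \sum_{i=1}^{d_g} \pi^{g,i}_t\,\mathrm{d}t$. A minimal Python sketch of this payoff, using a trapezoidal rule on a uniform time grid, is given below. The function name, the grid and the numerical values are assumptions made for illustration only.

```python
import numpy as np


def tax_incentive_payoff(c, green_holdings, horizon):
    """Approximate c * int_0^T sum_i pi^{g,i}_t dt with the trapezoidal rule.

    `green_holdings` has shape (n_steps + 1, d_g): holdings in each green bond
    sampled on a uniform grid of [0, horizon].
    """
    holdings = np.asarray(green_holdings, dtype=float)
    total = holdings.sum(axis=1)                 # total green holdings at each date
    dt = horizon / (len(total) - 1)
    integral = dt * (0.5 * total[0] + total[1:-1].sum() + 0.5 * total[-1])
    return c * integral


# Hypothetical example: constant holding of 0.4 in one green bond over one year.
constant_holdings = np.full((11, 1), 0.4)
xi_tax = tax_incentive_payoff(c=0.02, green_holdings=constant_holdings, horizon=1.0)
```

Since this payoff depends only on the amount invested and not on prices or their variations, it gives a simple baseline against which the price-indexed contract can be compared in simulations.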

We also show in Figure 7 some trajectories of the value of the portfolio process with the optimal contract and with the tax-incentives policy. We observe that the value of the portfolio process is always (slightly) higher in the presence of the optimal contract. In the next subsection, we show that when the government wants to achieve a specific target in green investments, the difference between the two policies becomes larger.

Figure 7:

Some trajectories of the optimal portfolio process with the optimal contract and with the tax-incentives policy(labeled ‘without contract’). .3 Inﬂuence of the green target We now study the impact of the incentives policy we propose when the government seeks to achieve a speciﬁcinvestment target in the green bond. We take G = 3 , κ = 0 . Optimal investment policy (upper left), optimal incentives Z ? (upper right) and Γ ? (bottom) as a function of time. The behaviour of the investor is drastically diﬀerent compared to Figure 1. He now invests mostly in the greenbond, while increasing the amount invested in the other assets. This comes from the fact that all assets are positivelycorrelated so that the additional amount invested in the index is higher than the one invested in the ﬁrst conventionalbond. The government sets a higher incentive with respect to the value of the portfolio. The incentives with respectto the quadratic variation are now all positive and higher than in Figure 1. While Γ G and Γ I are still set to zero, theincentive with respect to every covariations are now positive. In particular, Γ XI has changed from − . α g = 0 .

2, the government has to provide higher incentives to force the investor to shift his preferences toward amuch higher investment in the green bond. As in the reference case, we show in Figure 9 some simulations of theevolution of the portfolio process compared to the case without contract. We observe that the higher investment ingreen bonds leads to a higher average value of the portfolio process. Moreover due to the higher incentives on thequadratic variations, the portfolio process with the contract is more volatile, as it can be seen in Figure 10Figure 9:

Average diﬀerence of portfolio value over time, for 10000 simulations.

Figure 10: Some trajectories of the optimal portfolio process with and without contract.

We have seen in Figures 6 and 7 that, without a specific target in green investments, the optimal contract we propose leads to a higher value of the portfolio process compared to the tax-incentives policy. Here, we set the tax incentive $c$ so that the investor matches the investment in green bonds obtained with the optimal contract in Figure 8. We plot in Figures 11 and 12 some trajectories and the average relative difference of cash processes obtained with the optimal contract and the tax-incentives policy.

In this case, the relative differences of value are much higher compared to Figures 6 and 7. Thus, if the government has a specific investment target in green bonds, the optimal contract we propose guarantees a much higher value of the portfolio than the tax-incentives policy, for a similar result in terms of green investments.

Figure 11: Some trajectories of the optimal portfolio process with the optimal contract and with the tax-incentives policy (labeled ‘without contract’).

Figure 12: Average relative difference (in %) of portfolio value over time (with the optimal contract and tax-incentives policy), for 10000 simulations.

$G$, and $\kappa$

In Figure 13, we show that reducing the value of $\kappa$ makes the government target harder to achieve.

Figure 13: Optimal investment policy (upper left), optimal incentives $Z^\star$ (upper right) and $\Gamma^\star$ (bottom) as a function of time.

In particular, we observe that the amount invested in all the assets has been reduced, and especially the amount invested in the green bond. In this case, the government proposes a much higher incentive with respect to the dynamics of the portfolio compared to Figure 8: as the investment target $G$ is less important (because of a lower $\kappa$), he aims at maximising the value of the portfolio. Moreover, a high quadratic covariation between the green bond and the index is now penalised, while a high variance of the portfolio is encouraged in order to maximise its value.

In Figure 14, we show that with the parameters κ = 0 . G = 1, the investment target of the government can be reached more easily. In this case, the trader invests roughly the same amount in the green bond and the second conventional bond. The government increases the incentive corresponding to the value of the portfolio compared to Figure 8. Moreover, he encourages a high variance of the portfolio process while keeping the incentives $\Gamma^G$, $\Gamma^I$ equal to zero.

Figure 14: Optimal investment policy (upper left), optimal incentives $Z^\star$ (upper right) and $\Gamma^\star$ (bottom) as a function of time.

$\alpha$ and $\beta$

We studied in the previous section the influence of the government's parameters, that is the target $G$ and the cost intensity $\kappa$. We now show the influence of the targets $\alpha^g, \alpha^c, \alpha^I$ and the cost intensities $\beta^g, \beta^c, \beta^I$ of the investor. In Figure 15, we place ourselves in the context of the reference case of Figure 1, except that we set $\alpha^g = 0$. This means that the investor is not willing to put money in the green bond. Compared to Figure 1, we see that in the absence of specific incentives for green investing, the investor effectively sets $\pi^g$ equal to zero.

The other investment policies are slightly changed, as there is now more investment in the second conventional bond than in the index. As neither the government nor the investor is interested in the green bond, the government provides higher incentives $Z^X$ in order to maximise the value of the portfolio. The incentive $\Gamma^{XG}$ becomes negative while $\Gamma^X$ becomes positive, meaning that the government encourages opposite moves between the price of the green bond and the portfolio process. Moreover, $\Gamma^{XI}$ becomes positive: the government encourages similar moves between the price of the index and the portfolio process. Finally, the incentives corresponding to the quadratic variation of the green bond and the index remain equal to zero.

Figure 15: Optimal investment policy (upper left), optimal incentives $Z^\star$ (upper right) and $\Gamma^\star$ (bottom) as a function of time.

In Figure 16, we compare these results with the case G = 3 , κ = 0 . α g = 0 but lower than in Figure 8, where the investor has $\alpha^g = 0.2$. The incentives with respect to the quadratic variations become positive, meaning that the government encourages similar moves of all the contractible variables. In particular, compared to Figure 15, the government gives higher incentives toward similar moves of the portfolio value and the green bond.

We conclude this section by showing in Figure 17 the influence of the cost intensity. We take the same parameters as in Figure 16 except that we set $\beta^g = 0.5$. As the cost intensity for moving the green bond target of the investor is higher than in Figure 16, the optimal investment policy in the green bond is lower. The government sets a higher incentive $Z^X$ to encourage a higher value of the portfolio. The incentives with respect to quadratic variations are materially different compared to Figure 16. In particular, the government encourages opposite moves between the green bond and the index.

Figure 16: Optimal investment policy (upper left), optimal incentives $Z^\star$ (upper right) and $\Gamma^\star$ (bottom) as a function of time.

Figure 17: Optimal investment policy (upper left), optimal incentives $Z^\star$ (upper right) and $\Gamma^\star$ (bottom) as a function of time.

A Weak formulation of the problem

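We work on the canonical space $\mathcal{Q}$ of continuous functions on $[0,T]$ with Borel $\sigma$-algebra $\mathcal{F}$. The $(d_g+d_c+2)$-dimensional canonical process is

$$B := \begin{pmatrix} X \\ W^g \\ W^c \\ W^I \end{pmatrix},$$

and $\mathbb{F} = (\mathcal{F}_t)_{t\in[0,T]}$ is its natural filtration. We define $\mathbb{P}$ as the $(d_g+d_c+1)$-dimensional Wiener measure on $\mathcal{Q}$. Thus, $B$ is a $(d_g+d_c+2)$-dimensional Brownian motion where $(W^g, W^c, W^I)$ has a correlation matrix $\Sigma$ under $\mathbb{P}$. We also define $\mathcal{M}(\mathcal{Q})$ as the set of probability measures on $(\mathcal{Q}, \mathcal{F}_T)$ and

$$\mathcal{H}(\mathbb{P}) := \bigg\{ (\pi_t)_{t\in[0,T]} : B\text{-valued}, \ \mathbb{F}\text{-predictable processes such that } \mathbb{E}^{\mathbb{P}}\bigg[\int_0^T \|\pi_t\| \,\mathrm{d}t\bigg] < +\infty \bigg\}.$$

We consider the following family of processes, indexed by $\pi \in \mathcal{H}(\mathbb{P})$,

$$X^\pi_t := \begin{pmatrix} \int_0^t \Sigma^{\mathrm{obs}}(s,\pi_s)\,\mathrm{d}B_s \\ \int_0^t \Sigma^{\overline{\mathrm{obs}}}\,\mathrm{d}B_s \end{pmatrix},$$

and define the set $\mathcal{P}_m$ as the set of probability measures $\mathbb{P}^\pi \in \mathcal{M}(\mathcal{Q})$ of the form

$$\mathbb{P}^\pi = \mathbb{P} \circ (X^\pi)^{-1}, \ \text{for all } \pi \in \mathcal{H}(\mathbb{P}).$$

Thanks to Bichteler [10], we can define a pathwise version of the quadratic variation process $\langle B \rangle$ and of its density process with respect to the Lebesgue measure

$$\hat\alpha_t := \frac{\mathrm{d}\langle B \rangle_t}{\mathrm{d}t}.$$

As the processes $\pi \in \mathcal{A} \subset \mathcal{H}(\mathbb{P})$ have all their coordinates strictly positive, the volatility of $B$ is invertible, which implies in particular that the process $W_t = \int_0^t \hat\alpha_s^{-1/2}\,\mathrm{d}B_s$ is an $\mathbb{R}^{d_g+d_c+2}$-valued, $\mathbb{P}$-Brownian motion with correlation matrix $\Sigma$ for every $\mathbb{P} \in \mathcal{P}_m$. According to Soner, Touzi, and Zhang [45], there exists an $\mathbb{F}^B$-progressively measurable mapping $\beta_\pi : [0,T] \times \mathcal{Q} \longrightarrow \mathbb{R}^{d_g+d_c+2}$ such that

$$B = \beta_\pi(X^\pi), \ \mathbb{P}\text{-a.s.}, \quad W = \beta_\pi(B), \ \mathbb{P}^\pi\text{-a.s.}, \quad \hat\alpha(B) = \pi\big(\beta_\pi(B)\big), \ \mathrm{d}t \otimes \mathrm{d}\mathbb{P}^\pi\text{-a.e.}$$

In particular, the canonical process $B$ admits the following dynamics for all $\pi \in \mathcal{A}$:

$$B_t = \begin{pmatrix} \int_0^t \Sigma^{\mathrm{obs}}(s,\pi(W_\cdot))\,\mathrm{d}W_s \\ \int_0^t \Sigma^{\overline{\mathrm{obs}}}\,\mathrm{d}W_s \end{pmatrix}, \ \mathbb{P}^\pi\text{-a.s.}$$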
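Since the incentives are indexed on quadratic variations and covariations, it may help to see how the density $\hat\alpha_t = \mathrm{d}\langle B\rangle_t/\mathrm{d}t$ can be approximated from a discretely observed path. The following is a minimal sketch, assuming a path sampled on a regular grid; the function names and the moving-window estimator are illustrative choices, not the construction of Bichteler [10].

```python
import numpy as np


def realized_covariation(path):
    """Running realised covariation of a discretely sampled path.

    path has shape (n_steps + 1, dim); the result has shape (n_steps, dim, dim)
    and entry k is the sum of outer products of the first k + 1 increments.
    """
    increments = np.diff(path, axis=0)
    outer = increments[:, :, None] * increments[:, None, :]
    return np.cumsum(outer, axis=0)


def covariation_density(path, dt, window):
    """Moving-window estimate of d<B>_t / dt over `window` consecutive steps."""
    qv = realized_covariation(path)
    return (qv[window:] - qv[:-window]) / (window * dt)
```

For a path with constant volatility matrix $A$, the last entry of `realized_covariation` should be close to $A A^\top T$ when the grid is fine.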
The first coordinate of the canonical process is the desired output process, the $d_g$ next coordinates are the contractible sources of risk, that is the $d_g$ green bonds and the index of conventional bonds, and the last $d_c$ coordinates are the non-contractible sources of risk. Then, we can easily introduce the drift of the output process by means of Girsanov's theorem. Denote

$$\frac{\mathrm{d}\mathbb{Q}}{\mathrm{d}\mathbb{P}^\pi} := \mathcal{E}\bigg(\int_0^\cdot \tilde\Sigma(s)\,\mathrm{d}W_s\bigg)_T,$$

a change of measure independent of the control process $\pi$, where $\tilde\Sigma : [0,T] \longrightarrow \mathcal{M}_{d_g+d_c+2}(\mathbb{R})$ is such that

$$\tilde\Sigma(t) := \begin{pmatrix} \Big(\frac{r^g(t)+\eta^g(t)\circ\sigma^g(t)}{\sigma^g(t)}\Big)^\top & \Big(\frac{r^c(t)+\eta^c(t)\circ\sigma^c(t)}{\sigma^c(t)}\Big)^\top & \frac{\mu^I(t)}{\sigma^I(t)} \\ 0_{d_g+d_c+1,d_g} & 0_{d_g+d_c+1,d_c} & 0_{d_g+d_c+1,1} \end{pmatrix}.$$

We finally obtain the desired dynamics for the output process and the $d_g+d_c+1$ sources of risk.

B Proof of Theorem 3.2

We can define the functions $\sigma : [0,T] \times K \longrightarrow \mathcal{M}_{d_g+d_c+2,\,d_g+d_c+1}(\mathbb{R})$, $\lambda : [0,T] \longrightarrow \mathbb{R}^{d_g+d_c+1}$ such that the set of contractible variables $(B_t)_{t\in[0,T]}$ can be rewritten for all $\pi \in \mathcal{A}$ as

$$\mathrm{d}B_t = \sigma(t,\pi_t)\big(\lambda(t)\,\mathrm{d}t + \mathrm{d}W_t\big), \tag{B.1}$$

where for all $(t,p) \in [0,T] \times K$,

$$\sigma(t,p) := \begin{pmatrix} \big(p^g\sigma^g(t)\big)^\top & \big(p^c\sigma^c(t)\big)^\top & p^I\sigma^I(t) \\ \mathrm{diag}(\sigma^g(t)) & 0_{d_g,d_c} & 0_{d_g,1} \\ 0_{1,d_g} & 0_{1,d_c} & \sigma^I(t) \\ 0_{d_c,d_g} & \mathrm{diag}(\sigma^c(t)) & 0_{d_c,1} \end{pmatrix},$$

$$\lambda(t) := \bigg( \Big(\frac{r^g(t)+\eta^g(t)\circ\sigma^g(t)}{\sigma^g(t)}\Big)^\top \ \ \Big(\frac{r^c(t)+\eta^c(t)\circ\sigma^c(t)}{\sigma^c(t)}\Big)^\top \ \ \frac{\mu^I(t)}{\sigma^I(t)} \bigg)^\top.$$

Thanks to Assumption 2.2 and the definition of $\mathcal{A}$, the functions $\sigma$ and $\lambda$ are bounded. As the function $\sigma(t,\pi)$ is continuous in time for some constant control process $\pi \in \mathcal{A}$, there always exists a weak solution to (B.1). Thanks to the boundedness of the function $\lambda$, we can use Girsanov's theorem, which guarantees that every $\pi \in \mathcal{A}$ induces a weak solution for

$$B_t = B_0 + \int_0^t \sigma(s,\pi_s)\,\mathrm{d}W_s, \quad \frac{\mathrm{d}\mathbb{P}^\pi}{\mathrm{d}\mathbb{P}}\bigg|_{\mathcal{F}_T} = \mathcal{E}\bigg(\int_0^\cdot \lambda(s)\cdot\mathrm{d}W_s\bigg)_T,$$

where $W$ is a $\mathbb{P}$-Brownian motion.

The cost function $k : K \to \mathbb{R}$ is measurable and bounded by boundedness of the elements of $K$. We introduce the norms

$$\|Z^e\|^p_{\mathbb{H}^p} = \sup_{\pi\in\mathcal{A}} \mathbb{E}^\pi\bigg[\bigg(\int_0^T \big|\tilde\sigma(t,\pi_t) Z^e_t\big|^2\,\mathrm{d}t\bigg)^{p/2}\bigg], \quad \|Y^e\|^p_{\mathbb{D}^p} = \sup_{\pi\in\mathcal{A}} \mathbb{E}^\pi\bigg[\sup_{t\in[0,T]} |Y^e_t|^p\bigg],$$

for any $\mathbb{F}$-predictable, $\mathbb{R}^{d_g+d_c+2}$-valued process $Z^e$ and $\mathbb{R}$-valued process $Y^e$, where $\tilde\sigma : [0,T] \times K \to \mathcal{M}_{d_g+d_c+2}(\mathbb{R})$ is such that, for all $(t,p) \in [0,T] \times K$, $\tilde\sigma(t,p) = \sigma(t,p)\sigma^\top(t,p)$.

We also define the functions $H^e : [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \times \mathbb{R} \longrightarrow \mathbb{R}$ and $h^e : [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \times \mathbb{R} \times K \longrightarrow \mathbb{R}$ as

$$H^e(t,z,g,y) := \sup_{p\in K} h^e(t,z,g,y,p),$$

$$h^e(t,z,g,y,p) := -\gamma k(p)\,y + z\cdot\sigma(t,p)\lambda(t) + \frac12 \mathrm{Tr}\Big[g\,\sigma(t,p)\,\Sigma\,\big(\sigma(t,p)\big)^\top\Big].$$

We introduce the set of so-called admissible incentives $\mathcal{ZG}^e$ as the set of $\mathbb{F}$-predictable processes $(Z^e,\Gamma^e)$ valued in $\mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R})$ such that

$$\|Z^e\|^p_{\mathbb{H}^p} + \|Y^{e,Z^e,\Gamma^e}\|^p_{\mathbb{D}^p} < +\infty, \tag{B.2}$$

for some $p > 1$, where, for $y^e \in \mathbb{R}$,

$$Y^{e,y^e,Z^e,\Gamma^e}_t := y^e + \int_0^t Z^e_s\cdot\mathrm{d}B_s + \frac12 \int_0^t \mathrm{Tr}\big[\Gamma^e_s\,\mathrm{d}\langle B\rangle_s\big] - \int_0^t H^e\big(s, Z^e_s, \Gamma^e_s, Y^{e,Z^e,\Gamma^e}_s\big)\,\mathrm{d}s.$$

Condition (B.2) guarantees that the process $(Y^{e,y^e,Z^e,\Gamma^e}_t)_{t\in[0,T]}$ is well defined: provided that the right-hand side integrals are well defined, and by noting that $H^e$ is Lipschitz in its last variable (since the cost function $k$ is bounded), $(Y^{e,y^e,Z^e,\Gamma^e}_t)_{t\in[0,T]}$ is the unique solution of an ODE with random coefficients. Moreover, as $K$ is a compact set and $h^e$ is continuous with respect to its last variable, the supremum with respect to $p$ is always attained. As $(Z^e,\Gamma^e) = (0_{d_g+d_c+2}, 0_{d_g+d_c+2,\,d_g+d_c+2}) \in \mathcal{ZG}^e$, this set is non-empty and we are in the setting of Cvitanić, Possamaï, and Touzi [18].

Using [18, Proposition 3.3 and Theorem 3.6], we obtain that without reducing the utility of the Principal, any admissible contract admits the representation

$$U_A(\xi) = Y^{e,Z^e,\Gamma^e}_T.$$

Define for all $t \in [0,T]$ the processes

$$Z_t := -\frac{Z^e_t}{\gamma Y^{e,y^e,Z^e,\Gamma^e}_t}, \quad \Gamma_t := -\frac{\Gamma^e_t}{\gamma Y^{e,y^e,Z^e,\Gamma^e}_t},$$

$$Y^{y_0,Z,\Gamma}_t = y_0 + \int_0^t Z_s\cdot\mathrm{d}B_s + \frac12 \int_0^t \mathrm{Tr}\Big[\big(\Gamma_s + \gamma Z_s Z_s^\top\big)\,\mathrm{d}\langle B\rangle_s\Big] - \int_0^t H\big(s, Z_s, \Gamma_s\big)\,\mathrm{d}s,$$

where $H : [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \longrightarrow \mathbb{R}$ is defined by $H(t,z,g) = \sup_{p\in K} h(t,z,g,p)$ and

$$\mathcal{ZG} := \Big\{ (Z_t,\Gamma_t)_{t\in[0,T]} : \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R})\text{-valued}, \ \mathbb{F}\text{-predictable processes s.t. } \big(-\gamma Z_t U_A(Y^{y_0,Z,\Gamma}_t), -\gamma \Gamma_t U_A(Y^{y_0,Z,\Gamma}_t)\big)_{t\in[0,T]} \in \mathcal{ZG}^e \Big\}. \tag{B.3}$$

An application of Itō's formula leads to $\xi = Y^{y_0,Z,\Gamma}_T$. Thus, we obtain the desired representation for admissible contracts and $V_A(Y^{y_0,Z,\Gamma}_T) = U_A(y_0)$. The characterisation of $\mathcal{A}(Y^{y_0,Z,\Gamma}_T)$ is a direct consequence of Cvitanić, Possamaï, and Touzi [18, Proposition 3.3].

C Green investments with stochastic interest rates

C.1 Framework

In the article, we considered a deterministic structure for the short-term rates. However, this omits some important stylised facts of the yield curve. In this section we show that, at the expense of the use of stochastic control, the government can provide incentives based on short-term rates following a one-factor stochastic model.

We now assume that the vectors of short rate dynamics of the green bonds are given by

$$\mathrm{d}r^g_t := a^g(t, r^g_t)\,\mathrm{d}t + \mathrm{diag}(b^g)\,\mathrm{d}W^{g,r}_t, \tag{C.1}$$

where $b^g \in \mathbb{R}^{d_g}_+$, $a^g : [0,T] \times \mathbb{R}^{d_g} \longrightarrow \mathbb{R}^{d_g}$ and $W^{g,r}$ is a $d_g$-dimensional Brownian motion of correlation matrix $\Sigma^{g,r}$.

Remark C.1.

For notational simplicity, we assume no dependence between the risk sources of the short-term ratesand the ones of the bonds. Allowing such dependence is straightforward and does not lead to a higher dimension ofthe control problem.
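To make the dynamics (C.1) concrete, here is a minimal Euler–Maruyama sketch for the vector of green short rates. The drift $a^g$ is left general in the text; the mean-reverting form `speed * (level - r)` used below is purely an assumption for illustration, as are the function name `simulate_short_rates` and all parameter names.

```python
import numpy as np


def simulate_short_rates(r0, speed, level, b, corr, horizon=1.0,
                         n_steps=250, n_paths=1, seed=0):
    """Euler scheme for dr = a(t, r) dt + diag(b) dW with correlated W.

    Assumed drift (not from the paper): a(t, r) = speed * (level - r).
    Returns an array of shape (n_paths, n_steps + 1, d_g).
    """
    r0, speed, level, b = (np.asarray(a, dtype=float) for a in (r0, speed, level, b))
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    chol = np.linalg.cholesky(np.asarray(corr, dtype=float))
    rates = np.empty((n_paths, n_steps + 1, r0.size))
    rates[:, 0] = r0
    for k in range(n_steps):
        # Brownian increments with correlation matrix `corr`.
        dw = rng.standard_normal((n_paths, r0.size)) @ chol.T * np.sqrt(dt)
        drift = speed * (level - rates[:, k])
        rates[:, k + 1] = rates[:, k] + drift * dt + b * dw
    return rates
```

Any other drift can be substituted by replacing the line that computes `drift`; the multiplication `b * dw` is the componentwise form of $\mathrm{diag}(b^g)\,\mathrm{d}W^{g,r}_t$.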

We contract only on the portfolio process, the risk factors of the green bonds and of the stochastic short-term rate of the green bonds, and the risk factor of the index of conventional bonds. The new sets of state variables are

$$B^{\mathrm{obs},S} = \begin{pmatrix} X \\ W^g \\ r^g \\ W^I \end{pmatrix}, \quad B^{\overline{\mathrm{obs}},S} = W^c,$$

where the superscript $S$ stands for stochastic, which can be written as

$$\mathrm{d}B^{\mathrm{obs},S}_t := \mu^{\mathrm{obs},S}(t,\pi_t,r^g_t)\,\mathrm{d}t + \Sigma^{\mathrm{obs},S}(t,\pi_t)\,\mathrm{d}W_t, \quad \mathrm{d}B^{\overline{\mathrm{obs}},S}_t := \mu^{\overline{\mathrm{obs}},S}(t)\,\mathrm{d}t + \Sigma^{\overline{\mathrm{obs}},S}(t)\,\mathrm{d}W_t,$$

where for all $t \in [0,T]$, $p = (p^g,p^c,p^I) \in \mathbb{R}^{d_g} \times \mathbb{R}^{d_c} \times \mathbb{R}$, $r^g \in \mathbb{R}^{d_g}$,

$$W_t := \begin{pmatrix} W^g_t \\ W^{g,r}_t \\ W^I_t \\ W^c_t \end{pmatrix}, \quad \mu^{\mathrm{obs},S}(t,p,r^g) := \begin{pmatrix} p^g\cdot\big(r^g + \eta^g(t)\circ\sigma^g(t)\big) + p^c\cdot\big(r^c(t) + \eta^c(t)\circ\sigma^c(t)\big) + p^I\mu^I(t) \\ 0_{d_g,1} \\ a^g(t,r^g) \\ 0 \end{pmatrix},$$

$$\Sigma^{\mathrm{obs},S}(t,p) := \begin{pmatrix} \big(p^g\circ\sigma^g(t)\big)^\top & 0_{1,d_g} & p^I\sigma^I(t) & \big(p^c\circ\sigma^c(t)\big)^\top \\ I_{d_g} & 0_{d_g,d_g} & 0_{d_g,1} & 0_{d_g,d_c} \\ 0_{d_g,d_g} & \mathrm{diag}(b^g) & 0_{d_g,1} & 0_{d_g,d_c} \\ 0_{1,d_g} & 0_{1,d_g} & 1 & 0_{1,d_c} \end{pmatrix},$$

$$\mu^{\overline{\mathrm{obs}},S}(t) = 0_{d_c,1}, \quad \Sigma^{\overline{\mathrm{obs}},S}(t) = \begin{pmatrix} 0_{d_c,d_g} & 0_{d_c,d_g} & 0_{d_c,1} & I_{d_c} \end{pmatrix}.$$

We now specify the new set of admissible contracts that we consider for the incentives proposed by the government.

C.2 Representation of admissible contracts

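Define $\mathcal{C}^S$ as the set of admissible contracts in the case of stochastic short-term rates (the admissibility conditions are the same as for the set $\mathcal{C}$) and for any $\pi \in \mathcal{A}$ we introduce the following quantities

$$B^S := \begin{pmatrix} B^{\mathrm{obs},S} \\ B^{\overline{\mathrm{obs}},S} \end{pmatrix}, \quad \mu^S(t,\pi) := \begin{pmatrix} \mu^{\mathrm{obs},S}(t,\pi) \\ \mu^{\overline{\mathrm{obs}},S} \end{pmatrix}, \quad \Sigma^S(t,\pi) := \begin{pmatrix} \Sigma^{\mathrm{obs},S}(t,\pi) \\ \Sigma^{\overline{\mathrm{obs}},S} \end{pmatrix}.$$

We define $h^S : [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \times \mathbb{R}^{d_g} \times K \longrightarrow \mathbb{R}$ such that

$$h^S(t,z,g,r^g,p) = -k(p) + z\cdot\mu^S(t,p,r^g) + \frac12 \mathrm{Tr}\Big[g\,\Sigma^S(t,p)\,\Sigma\,\big(\Sigma^S(t,p)\big)^\top\Big],$$

and for all $(t,z,g,r^g) \in [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \times \mathbb{R}^{d_g}$ we define

$$O^S(t,z,g,r^g) := \Big\{ \hat p \in K : \hat p \in \operatorname*{argmax}_{p\in K} h^S(t,z,g,r^g,p) \Big\},$$

as the set of maximisers of $h^S$ with respect to its last variable for $(t,z,g,r^g)$ fixed. Following Schäl [43], there exists at least one Borel-measurable map $\hat\pi : [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \times \mathbb{R}^{d_g} \longrightarrow K$ such that for every $(t,z,g,r^g) \in [0,T] \times \mathbb{R}^{d_g+d_c+2} \times \mathbb{S}_{d_g+d_c+2}(\mathbb{R}) \times \mathbb{R}^{d_g}$, $\hat\pi(t,z,g,r^g) \in O^S(t,z,g,r^g)$. We denote by $\mathcal{O}^S$ the set of all such maps. By analogy with Theorem 3.2, the following theorem states the form of any admissible contract in this setting.

Theorem C.2. Without reducing the utility of the Principal, we can restrict the study of admissible contracts to the set $\overline{\mathcal{C}}^S$ where any $\xi \in \overline{\mathcal{C}}^S$ is of the form $\xi = Y^{y_0,Z^S,\Gamma^S,\hat\pi}_T$ where for all $t \in [0,T]$,

$$Y^{y_0,Z^S,\Gamma^S,\hat\pi}_t := y_0 + \int_0^t Z^S_s\cdot\mathrm{d}B^S_s + \frac12 \int_0^t \mathrm{Tr}\Big[\big(\Gamma^S_s + \gamma Z^S_s (Z^S_s)^\top\big)\,\mathrm{d}\langle B^S\rangle_s\Big] - \int_0^t h^S\big(s, Z^S_s, \Gamma^S_s, r^g_s, \hat\pi(s, Z^S_s, \Gamma^S_s, r^g_s)\big)\,\mathrm{d}s, \tag{C.2}$$

where $\hat\pi \in \mathcal{O}^S$ and $(Z^S_t)_{t\in[0,T]}$, $(\Gamma^S_t)_{t\in[0,T]}$ are respectively $\mathbb{R}^{d_g+2+d_c}$ and $\mathbb{S}_{d_g+2+d_c}(\mathbb{R})$-valued, $\mathbb{F}$-predictable processes satisfying similar conditions as the elements of $\mathcal{ZG}$. We denote the set of admissible incentives as $\mathcal{ZG}^S$. Moreover, in the present case of stochastic rates for green bonds,

$$V_A\big(Y^{y_0,Z^S,\Gamma^S,\hat\pi}_T\big) = U_A(y_0), \quad \mathcal{A}\big(Y^{y_0,Z^S,\Gamma^S,\hat\pi}_T\big) = \Big\{ \big(\hat\pi(t, Z^S_t, \Gamma^S_t, r^g_t)\big)_{t\in[0,T]}, \ \hat\pi \in \mathcal{O}^S, \ (Z^S_t,\Gamma^S_t)_{t\in[0,T]} \in \mathcal{ZG}^S \Big\}.$$

We now set

$$Z^S_t = \begin{pmatrix} Z^{\mathrm{obs},S}_t \\ Z^{\overline{\mathrm{obs}},S}_t \end{pmatrix}, \quad \Gamma^S_t = \begin{pmatrix} \Gamma^{\mathrm{obs},S}_t & \Gamma^{\mathrm{obs},\overline{\mathrm{obs}},S}_t \\ \big(\Gamma^{\mathrm{obs},\overline{\mathrm{obs}},S}_t\big)^\top & \Gamma^{\overline{\mathrm{obs}},S}_t \end{pmatrix},$$

where for all $t \in [0,T]$,

$$Z^{\mathrm{obs},S}_t \in \mathbb{R}^{d_g+2}, \ Z^{\overline{\mathrm{obs}},S}_t \in \mathbb{R}^{d_c}, \ \Gamma^{\mathrm{obs},S}_t \in \mathbb{S}_{d_g+2}(\mathbb{R}), \ \Gamma^{\overline{\mathrm{obs}},S}_t \in \mathbb{S}_{d_c}(\mathbb{R}), \ \Gamma^{\mathrm{obs},\overline{\mathrm{obs}},S}_t \in \mathcal{M}_{d_g+2,d_c}(\mathbb{R}).$$

We define $h^{\mathrm{obs},S} : [0,T] \times \mathbb{R}^{d_g+2} \times \mathbb{S}_{d_g+2}(\mathbb{R}) \times \mathbb{R}^{d_g} \times K \longrightarrow \mathbb{R}$ such that

$$h^{\mathrm{obs},S}(t, z^{\mathrm{obs},S}, g^{\mathrm{obs},S}, r^g, p) = -k(p) + z^{\mathrm{obs},S}\cdot\mu^{\mathrm{obs},S}(t,p,r^g) + \frac12 \mathrm{Tr}\Big[g^{\mathrm{obs},S}\,\Sigma^{\mathrm{obs},S}(t,p)\,\Sigma\,\big(\Sigma^{\mathrm{obs},S}(t,p)\big)^\top\Big],$$

and for all $(t, z^{\mathrm{obs},S}, g^{\mathrm{obs},S}, r^g) \in [0,T] \times \mathbb{R}^{d_g+2} \times \mathbb{S}_{d_g+2}(\mathbb{R}) \times \mathbb{R}^{d_g}$ we define

$$O^{\mathrm{obs},S}(t, z^{\mathrm{obs},S}, g^{\mathrm{obs},S}, r^g) := \Big\{ \hat p \in K : \hat p \in \operatorname*{argmax}_{p\in K} h^{\mathrm{obs},S}(t, z^{\mathrm{obs},S}, g^{\mathrm{obs},S}, r^g, p) \Big\}.$$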
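The maximiser set defined above can be illustrated in a scalar toy setting, where the investor's best response is available both by grid search over the compact set $K$ and in closed form. Everything specific below is an assumption made for the sake of the example: a single asset, a quadratic cost $k(p) = \tfrac{\beta}{2}(p-\alpha)^2$, constant drift and volatility, $K = [0,1]$, and the hypothetical function names.

```python
import numpy as np


def h_obs(z, g, p, drift=0.02, vol=0.05, alpha=0.2, beta=0.5):
    """Scalar toy version of h: -k(p) + z * mu(p) + 0.5 * g * sigma(p)^2."""
    cost = 0.5 * beta * (p - alpha) ** 2  # assumed quadratic cost k(p)
    return -cost + z * p * drift + 0.5 * g * (p * vol) ** 2


def best_response_grid(z, g, bounds=(0.0, 1.0), n_grid=10_001, **params):
    """Approximate argmax of h over K = [bounds[0], bounds[1]] on a uniform grid."""
    grid = np.linspace(bounds[0], bounds[1], n_grid)
    return float(grid[np.argmax(h_obs(z, g, grid, **params))])


def best_response_closed_form(z, g, bounds=(0.0, 1.0), drift=0.02, vol=0.05,
                              alpha=0.2, beta=0.5):
    """First-order condition of the concave quadratic, projected onto K."""
    curvature = beta - g * vol ** 2
    if curvature <= 0:
        raise ValueError("h is not strictly concave in p; use the grid search")
    return float(np.clip((beta * alpha + z * drift) / curvature, *bounds))
```

In this toy case the maximiser is unique whenever $\beta > g\,\sigma^2$, so the measurable selection is trivial; the grid search is the fallback when no closed form is available.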
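Using again Schäl [43], there exists at least one Borel-measurable map $\hat\pi : [0,T] \times \mathbb{R}^{d_g+2} \times \mathbb{S}_{d_g+2}(\mathbb{R}) \times \mathbb{R}^{d_g} \longrightarrow K$ such that for every $(t, z^{\mathrm{obs},S}, g^{\mathrm{obs},S}, r^g) \in [0,T] \times \mathbb{R}^{d_g+2} \times \mathbb{S}_{d_g+2}(\mathbb{R}) \times \mathbb{R}^{d_g}$, $\hat\pi(t, z^{\mathrm{obs},S}, g^{\mathrm{obs},S}, r^g) \in O^{\mathrm{obs},S}(t, z^{\mathrm{obs},S}, g^{\mathrm{obs},S}, r^g)$, and $\mathcal{O}^{\mathrm{obs},S}$ denotes the set of all such maps. We consider the subset of admissible contracts

$$\widehat{\mathcal{C}}^S := \Big\{ Y^{y_0,Z^S,\Gamma^S,\hat\pi}_T \in \overline{\mathcal{C}}^S : Z^{\overline{\mathrm{obs}},S} = 0_{d_c}, \ \Gamma^{\overline{\mathrm{obs}},S} = 0_{d_c,d_c}, \ \Gamma^{\mathrm{obs},\overline{\mathrm{obs}},S} = 0_{d_g+2,d_c} \Big\} \subset \overline{\mathcal{C}}^S \subset \mathcal{C}^S,$$

where any contract in $\widehat{\mathcal{C}}^S$ is of the form $Y^{y_0,Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi}_T$ where for all $t \in [0,T]$,

$$Y^{y_0,Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi}_t := y_0 + \int_0^t Z^{\mathrm{obs},S}_s\cdot\mathrm{d}B^{\mathrm{obs},S}_s + \frac12 \int_0^t \mathrm{Tr}\Big[\big(\Gamma^{\mathrm{obs},S}_s + \gamma Z^{\mathrm{obs},S}_s (Z^{\mathrm{obs},S}_s)^\top\big)\,\mathrm{d}\langle B^{\mathrm{obs},S}\rangle_s\Big] - \int_0^t h^{\mathrm{obs},S}\Big(s, Z^{\mathrm{obs},S}_s, \Gamma^{\mathrm{obs},S}_s, r^g_s, \hat\pi\big(s, Z^{\mathrm{obs},S}_s, \Gamma^{\mathrm{obs},S}_s, r^g_s\big)\Big)\,\mathrm{d}s, \tag{C.3}$$

where $y_0 \geq 0$, $\hat\pi \in \mathcal{O}^{\mathrm{obs},S}$ and $(Z^{\mathrm{obs},S}, \Gamma^{\mathrm{obs},S}) \in \mathcal{ZG}^{\mathrm{obs},S}$ with

$$\mathcal{ZG}^{\mathrm{obs},S} := \Big\{ (Z^{\mathrm{obs},S}, \Gamma^{\mathrm{obs},S}) : \mathbb{R}^{d_g+2} \times \mathbb{S}_{d_g+2}(\mathbb{R})\text{-valued}, \ \mathbb{F}\text{-predictable s.t. } Y^{y_0,Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi}_T \in \widehat{\mathcal{C}}^S \Big\}.$$

We can now formulate the stochastic control problem faced by the government.

C.3 The Hamilton-Jacobi-Bellman equation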

Let us define the process $(Q^{y_0,Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi}_t)_{t\in[0,T]}$ where for all $(t, y_0, Z^{\mathrm{obs},S}, \Gamma^{\mathrm{obs},S}, \hat\pi) \in [0,T] \times \mathbb{R} \times \mathcal{ZG}^{\mathrm{obs},S} \times \mathcal{O}^{\mathrm{obs},S}$,

$$Q^{y_0,Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi}_t := X_t - \int_0^t \sum_{i=1}^{d_g} \big(G_i - \hat\pi^g_i(s, Z^{\mathrm{obs},S}_s, \Gamma^{\mathrm{obs},S}_s, r^g_s)\big)\,\mathrm{d}s - Y^{y_0,Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi}_t.$$

The optimisation problem of the government that we consider here is

$$\widetilde{V}^P = \sup_{y_0 \geq 0} \ \sup_{(Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi) \in \mathcal{ZG}^{\mathrm{obs},S} \times \mathcal{O}^{\mathrm{obs},S}} \mathbb{E}^{\hat\pi(Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S})}\Big[-\exp\Big(-\nu Q^{y_0,Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi}_T\Big)\Big]. \tag{C.4}$$

Due to the presence of state variables in the best response of the Agent, the optimal control of the Principal will no longer be deterministic, and we have to rely on the Hamilton-Jacobi-Bellman formulation of the stochastic control problem. First, we note that the supremum over $y_0$ is attained by setting $y_0 = 0$. Next, the state variables of the control problem are $\big(t, B^{\mathrm{obs},S}_t, Q^{0,Z^{\mathrm{obs},S},\Gamma^{\mathrm{obs},S},\hat\pi}_t\big)$ and, as is standard in control problems with CARA utility function, the last variable can be simplified. Define $P^S = \mathbb{R}^{d_g+2} \times \mathbb{S}_{d_g+2}(\mathbb{R})$, and the Hamiltonian $H^{\hat\pi} : [0,T] \times P^S \times \mathbb{R} \times P^S \longrightarrow \mathbb{R}$,

$$\begin{aligned} H^{\hat\pi}(t,z,g,u,u_b,u_{bb}) := {} & \nu u \bigg( z\cdot\mu^{\mathrm{obs},S}\big(t, \hat\pi(t,z,g,r^g), r^g\big) + \sum_{i=1}^{d_g} \big(G_i - \hat\pi^g_i(t,z,g,r^g)\big) \\ & \quad + \frac12 \mathrm{Tr}\Big[(g + \gamma z z^\top)\,\Sigma^{\mathrm{obs},S}\big(t, \hat\pi(t,z,g,r^g)\big)\Big(\Sigma^{\mathrm{obs},S}\big(t, \hat\pi(t,z,g,r^g)\big)\Big)^\top\Big] - h^{\mathrm{obs},S}\big(t,z,g,r^g,\hat\pi(t,z,g,r^g)\big) \bigg) \\ & + \frac12 \nu^2 u\, \mathrm{Tr}\Big[z z^\top\,\Sigma^{\mathrm{obs},S}\big(t, \hat\pi(t,z,g,r^g)\big)\Big(\Sigma^{\mathrm{obs},S}\big(t, \hat\pi(t,z,g,r^g)\big)\Big)^\top\Big] \\ & + u_b\cdot\mu^{\mathrm{obs},S}\big(t, \hat\pi(t,z,g,r^g), r^g\big) + \frac12 \mathrm{Tr}\Big[\Sigma^{\mathrm{obs},S}\big(t, \hat\pi(t,z,g,r^g)\big)\Big(\Sigma^{\mathrm{obs},S}\big(t, \hat\pi(t,z,g,r^g)\big)\Big)^\top u_{bb}\Big]. \end{aligned}$$
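As a numerical aside on the CARA criterion in (C.4): when the terminal value $Q_T$ is Gaussian with mean $m$ and standard deviation $s$, the objective has the closed form $-\exp(-\nu m + \tfrac12 \nu^2 s^2)$, which shows how the variance of $Q_T$ is penalised at rate $\nu^2/2$. The sketch below compares a Monte Carlo estimate with this formula; the Gaussian assumption and the function names are illustrative only and are not part of the model above.

```python
import numpy as np


def cara_objective_mc(q_terminal, nu):
    """Monte Carlo estimate of E[-exp(-nu * Q_T)] from simulated terminal values."""
    return float(np.mean(-np.exp(-nu * np.asarray(q_terminal, dtype=float))))


def cara_objective_gaussian(mean, std, nu):
    """Closed form of E[-exp(-nu * Q_T)] when Q_T is Gaussian (assumption)."""
    return float(-np.exp(-nu * mean + 0.5 * (nu * std) ** 2))
```

Feeding `cara_objective_mc` with terminal values simulated under a candidate contract gives a direct way to rank contracts by the government's criterion, without solving the HJB equation.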
The value function of the control problem of the Principal is solution of the following Hamilton-Jacobi-Bellman equation

$$\begin{cases} \partial_t U(t,b) + \sup_{(z,g,\hat\pi) \in P^S \times \mathcal{O}^{\mathrm{obs},S}} H^{\hat\pi}\big(t,z,g,b,U,U_b,U_{bb}\big) = 0, \\ U(T,b) = -1, \end{cases} \tag{C.5}$$

where $U : [0,T] \times \mathbb{R}^{d_g+2} \longrightarrow \mathbb{R}$ and for all $(i,j) \in \{1,\dots,d_g+2\}^2$, $(U_b)_i = \partial_{b_i} U$, $(U_{bb})_{i,j} = \partial_{b_i b_j} U$, in the sense that $\widetilde{V}^P = U(0,b_0)$ where $B^{\mathrm{obs},S}_0 = b_0$ and $y_0 = 0$. Thus, the incentives provided to the investor are obtained up to the resolution of a $(2d_g+2)$-dimensional HJB equation. Although it provides greater flexibility on the modelling of short-term rates, this approach can only be applied to a small portfolio of bonds using classic numerical schemes on sparse grids.

References

[1] E. Agliardi and R. Agliardi. Financing environmentally–sustainable projects with green bonds. Environment and Development Economics, 24(6):608–623, 2019.

[2] R. Aïd, D. Possamaï, and N. Touzi. Optimal electricity demand response contracting with responsiveness incentives. ArXiv preprint arXiv:1810.09063, 2018.

[3] Global Sustainable Investment Alliance. 2016 global sustainable investment review. Technical report, GSIA, 2017.

[4] S. Ambec and P. Lanoie. Does it pay to be green? A systematic overview. The Academy of Management Perspectives, 22(4):45–62, 2008.

[5] A. Ang, V. Bhansali, and Y. Xing. Build America bonds. The Journal of Fixed Income, 20(1):67–73, 2010.

[6] International Capital Market Association. Green bond principles, 2016. Technical report, ICMA, 2016.

[7] M.J. Bachelet, L. Becchetti, and S. Manfredonia. The green bonds premium puzzle: the role of issuer characteristics and third–party verification. Sustainability, 11(4):1098, 2019.

[8] M. Baker, D. Bergstresser, G. Serafeim, and J. Wurgler. Financing the response to climate change: the pricing and ownership of US green bonds. Technical Report w25194, National Bureau of Economic Research, 2018.

[9] R. Bauer and P. Smeets. Social identification and investment decisions. Journal of Economic Behavior & Organization, 117:121–134, 2015.

[10] K. Bichteler. Stochastic integration and $L^p$–theory of semimartingales. The Annals of Probability, 9(1):49–89, 1981.

[11] Financial Stability Board. Global shadow banking monitoring report 2015. Technical report, FSB, 2015.

[12] C. Brooks and I. Oikonomou. The effects of environmental, social and governance disclosures and performance on firm value: a review of the literature in accounting and finance. The British Accounting Review, 50(1):1–15, 2018.

[13] T.D. Calabrese and T.L. Ely. Borrowing for the public good: the growing importance of tax–exempt bonds for public charities. Nonprofit and Voluntary Sector Quarterly, 45(3):458–477, 2016.

[14] P. Carr and R. Lee. Robust replication of volatility derivatives. Mathematics in finance working paper series 2008–3, Courant Institute of Mathematical Sciences, New York University, 2008.

[15] P. Carr and D. Madan. Option valuation using the fast Fourier transform. Journal of Computational Finance, 2(4):61–73, 1999.

[16] X. Chen and J. Sung. Managerial compensation and outcome volatility. Available at SSRN 3140205, 2018.

[17] J. Cvitanić, D. Possamaï, and N. Touzi. Moral hazard in dynamic risk management. Management Science, 63(10):3328–3346, 2017.

[18] J. Cvitanić, D. Possamaï, and N. Touzi. Dynamic programming approach to principal–agent problems. Finance and Stochastics, 22(1):1–37, 2018.

[19] T. de Angelis, P. Tankov, and O.D. Zerbib. Environmental impact investing. SSRN preprint 3562534, 2020.

[20] R. Della Croce, C. Kaminker, and F. Stewart. The role of pension funds in financing green growth initiatives. Working papers on finance, insurance and private pensions 10, OECD, 2011.

[21] I. Ekeland and J. Lefournier. L'obligation verte : homéopathie ou incantation ? Technical report, Université Paris–Dauphine, 2019.

[22] R. Élie, E. Hubert, T. Mastrolia, and D. Possamaï. Mean–field moral hazard for optimal energy demand response management. Mathematical Finance, to appear, 2019.

[23] W. Febi, D. Schäfer, A. Stephan, and C. Sun. The impact of liquidity risk on the yield spread of green bonds. Finance Research Letters, 27:53–59, 2018.

[24] C. Flammer. Corporate green bonds. Journal of Financial Economics, 2018.

[25] C. Flammer. Green bonds: effectiveness and implications for public policy. Environmental and Energy Policy and the Economy, 1(1):95–128, 2020.

[26] B. Hachenberg and D. Schiereck. Are green bonds priced differently from conventional bonds? Journal of Asset Management, 19(6):371–383, 2018.

[27] N. Hernández Santibáñez and T. Mastrolia. Contract theory in a VUCA world. SIAM Journal on Control and Optimization, 57(4):3072–3100, 2019.

[28] H. Hong and M. Kacperczyk. The price of sin: the effects of social norms on markets. Journal of Financial Economics, 93(1):15–36, 2009.

[29] M. Klein. Tax credit bonds. CitiBank Investment Management Review, 11:27–31, 2009.

[30] P. Krüger. Corporate goodness and shareholder wealth. Journal of Financial Economics, 115(2):304–329, 2015.

[31] R.C.W. Leung. Continuous–time principal–agent problem with drift and stochastic volatility control: with applications to delegated portfolio management. Technical report, Haas School of Business, University of California Berkeley, 2014.

[32] A. Lioui and P. Poncet. Optimal benchmarking for active portfolio managers. European Journal of Operational Research, 226(2):268–276, 2013.

[33] T. Mastrolia and D. Possamaï. Moral hazard under ambiguity. Journal of Optimization Theory and Applications, 179(2):452–500, 2018.

[34] R. Morel and C. Bordier. Financing the transition to a green economy: their word is their (green) bond? Climate Brief, 14, 2012.

[35] J. Nilsson. Investment with a conscience: examining the impact of pro-social attitudes and perceived financial performance on socially responsible investment behavior. Journal of Business Ethics, 83(2):307–325, 2008.

[36] OECD. Mobilising the debt capital markets for a low carbon transition. Green finance and investment. Éditions OCDE, Paris, 2017.

[37] OECD. Investing in climate, investing in growth. Éditions OCDE, Paris, 2017.

[38] H. Ou-Yang. Optimal contracts in a continuous–time delegated portfolio management problem. Review of Financial Studies, 16(1):173–208, 2003.

[39] S.K. Park. Investors as regulators: green bonds and the governance challenges of the sustainable finance revolution. Stanford Journal of International Law, 54:1, 2018.

[40] M.E. Porter and C. Van der Linde. Toward a new conception of the environment–competitiveness relationship. Journal of Economic Perspectives, 9(4):97–118, 1995.

[41] D. Possamaï, X. Tan, and C. Zhou. Stochastic control for a class of nonlinear kernels and applications. The Annals of Probability, 46(1):551–603, 2018.

[42] Y. Sannikov. A continuous–time version of the principal–agent problem. The Review of Economic Studies, 75(3):957–984, 2008.

[43] M. Schäl. A selection theorem for optimization problems. Archiv der Mathematik, 25(1):219–224, 1974.

[44] H.M. Soner, N. Touzi, and J. Zhang. Wellposedness of second order backward SDEs. Probability Theory and Related Fields, 153(1-2):149–190, 2012.

[45] H.M. Soner, N. Touzi, and J. Zhang. Dual formulation of second order target problems. The Annals of Applied Probability, 23(1):308–347, 2013.

[46] J. Sung. Linearity with project selection and controllable diffusion rate in continuous–time principal–agent problems. The RAND Journal of Economics, 26(4):720–743, 1995.

[47] J. Sung. Optimal contracting under mean–volatility ambiguity uncertainties. SSRN preprint 2601174, 2015.

[48] D.Y. Tang and Y. Zhang. Do shareholders benefit from green bonds? Journal of Corporate Finance, 61:101427, 2020.

[49] O.D. Zerbib. The green bond premium. SSRN preprint 2890316, 2017.

[50] O.D. Zerbib. The effect of pro-environmental preferences on bond prices: evidence from green bonds. Journal of Banking & Finance, 98:39–60, 2019.

[51] O.D. Zerbib. A sustainable capital asset pricing model (S–CAPM): evidence from green investing and sin stock exclusion.