Optimal dynamic regulation of carbon emissions market: A variational approach

Abstract

We consider the problem of reducing the carbon emissions of a set of firms over a finite horizon. A regulator dynamically allocates emission allowances to each firm. Firms face idiosyncratic as well as common economic shocks on emissions, and have linear-quadratic abatement costs. Firms can trade allowances so as to minimise total expected costs from abatement and trading, plus a quadratic terminal penalty. Using variational methods, we exhibit in closed form the market equilibrium as a function of the regulator's dynamic allocation. We then solve the Stackelberg game between the regulator and the firms. Again, we obtain a closed-form expression of the dynamic allocation policies that achieve a desired expected emission reduction. Optimal policies are not unique but share common properties. Surprisingly, all optimal policies induce a constant abatement effort and a constant price of allowances. Dynamic allocations outperform static ones because of adjustment costs and uncertainty, in particular given the presence of common shocks. Our results are robust to some extensions, like risk aversion of firms or different penalty functions.



René Aïd ‡ Sara Biagini § February 25, 2021


Keywords: Stochastic optimization, environmental economics, cap and trade, linear quadratic problem, Fréchet differentiability, market equilibrium, social cost minimisation.

AMS subject classiﬁcations:

JEL subject classiﬁcations:

C62, E63, H23, Q52, Q58.

Acknowledgements:

We warmly thank Bruno Bouchard, Anna Creti, Paolo Guasoni, Peter Tankov and Nizar Touzi for discussions on the topic.

‡ University Paris-Dauphine, PSL Research University, Department of Economics, Place du Marechal de Lattre de Tassigny, 75 775 Paris, France. This project received support from the Finance For Energy Markets Research Initiative and the EcoRESS ANR under grant ANR-19-CE05-0042.

§ Department of Economics and Finance, LUISS University, viale Romania 32, 00197 Rome, Italy. Financial support from LUISS Visiting Program is gratefully acknowledged.

Since its inception in 2005, the European Union Trading System has been a major innovative tool to manage carbon emissions and help EU member states reach their agreed reduction targets. Earlier examples of cap-and-trade mechanisms to reduce pollution include the successful cap-and-trade market for sulfur dioxide and nitrogen oxides in the US (Title IV of Clean Air Act Amendments, 1990). However, the striking novelty of the EU carbon market is its scale: more than 30 billion euros of value, gathering more than 15 thousand stationary installations across 30 countries. The seminal paper by Montgomery (1972) [26] on markets for licences has found here a spectacular illustration of the idea that market mechanisms can be efficiently developed to achieve pollution reduction. Nevertheless, after 15 years, it is clear that the EUTS is facing some issues. Figure 1 provides the price of allowances from January, 2008 to June, 2020 and the cumulative difference between total verified emissions per year and total allowances. It shows that the market price of carbon is highly sensitive to the relation between supply (allowances) and demand (emissions). The 2008 financial crisis led to a large surplus that lasted until 2013 and to a depressed market price, which fell below 5 €/tCO2. During this period, emissions were reduced, not because of firms' abatement efforts, but because the world was experiencing a major recession.
This phenomenon led the EU to design the Market Stability Reserve mechanism (MSR), to reduce market price volatility and oversupply (EU Decision 2015/1814 of October 6th, 2015 on EU Directive 2003/87). In a nutshell, the MSR regulates the potential market imbalances either by backloading allowances to the future or providing more allowances through auctions during the current phase and the next ones. This mechanism makes the carbon market regulation a dynamic process.

Figure 1: EUA price in €/tCO2 (left axis) compared to the difference between total verified emissions and total allocations in MtCO2-equiv (right axis) from January 2008 to June 2020; source: Eikon Reuters for EUA prices and European Environment Agency EU Emission Trading System data viewer.

Thus, the MSR mechanism implements the idea that contingent regulations should be preferred to fixed ones. This concept has been widely supported in the literature, since Weitzman's (1974) [37] seminal paper on regulation under uncertainty. From there, an intense research activity has focused on dynamic regulation in different settings. Hahn (1989) [17] and more recently Hepburn (2006) [18] both provide surveys on regulation through prices or quantities as well as a framing of the tools admissible to regulators.

In our specific case of the carbon market, there exists an extensive literature which analyzes the carbon emissions reduction problem under its many different aspects. The optimality of banking permits from one period to the next has been studied in Rubin (1996) [33], Schennach (2000) [34], Chaton et al. (2015) [4], Lintunen and Kuusela (2018) [25] and Kuusela and Lintunen (2020) [23]. Mechanism designs are proposed in Carmona et al. (2010) [7] in a discrete time model to reduce electricity producers' windfall profits.
An important stream of the literature on diffusive pollution deals with imperfect competition and strategic interaction among polluters. Requate (1993) [31], Von Der Fehr (1993) [36] and recently Anand and Giraud-Carrier (2020) [1] develop static models of imperfect competition which point out the firms' capacity to increase their profit with emissions regulation.

The focus in the present paper is continuous time regulation of carbon emissions with dynamic allowances allocation, see also Kollenberg and Taschini (2016, 2019) [21, 22], Grüll and Taschini (2011) [16], Pizer (2002) [29] and Pizer and Prest (2020) [30]. However, we leave for future research the coupling of the dynamical aspects of emissions reductions with imperfect competition among polluters. We regard carbon reduction as a stochastic Stackelberg game. Firms are the Followers, and the regulator is the Leader. The model for firms is a continuous time stochastic model, largely inspired by the Kollenberg and Taschini (2016, 2019) [21, 22] carbon emissions market model. Firms experience an individual emissions growth rate, plus idiosyncratic as well as common economic shocks. Abatement costs are heterogeneous and linear-quadratic. Although there are significant uncertainties on the marginal abatement costs of carbon reduction (see Gillingham and Stock (2018) [14] for an introduction on the topic), we make the assumption that at the time scale of a carbon market emissions phase, less than ten years, these costs are known and constant. Firms can trade the carbon emissions allowances provided by the regulator, and emissions market imperfection is taken into account by an impact on the price of carbon. The key feature is that each firm is endowed with a bank account, as in the cited papers by Kollenberg and Taschini.
The bank account position is the result of the allocation received and the traded permits, minus the realised emissions. Further, firms face a terminal quadratic penalty on their bank position at the end of the regulated period; this is a strong incentive to achieve emission reduction. There are quite large uncertainties on the damage cost function induced by carbon emissions (see Hsiang and Kopp's [19] and Auffhammer [2] papers for a thorough introduction to the uncertainties in the carbon emissions damage function). Nevertheless, the convex terminal penalty can be interpreted as the expected future damage costs induced by non-compliant emissions. The objective of the authority is to achieve a given expected carbon emission reduction at the least possible expected total cost from abatement, trading and terminal penalty. Dynamic allocation processes available to the regulator are chosen in the space of semimartingales, i.e. processes which are the sum of a diffusion part and a jump component.

Usually, the carbon emissions equilibrium price is tackled using forward-backward stochastic differential equations, as illustrated in Carmona et al. (2009, 2010, 2011, 2013) [6, 7, 5, 8]. Further, stochastic dynamic Stackelberg games are typically solved using Bellman's principle and coupled HJB equations. On this, the reader is referred to Bensoussan et al. (2014) [3] for a survey on the problem and illustrations in the linear quadratic case.

Instead, our approach is in the spirit of Duffie's utility maximisation via stochastic gradient methods (see Duffie (2001) [9]). Since the objective functions in our problem are linear quadratic, they are Fréchet differentiable over the space of square integrable processes. This smoothness allows for stochastic variational techniques and leads to explicit and compact expressions for the equilibrium price and optimal controls. In fact, using the variational approach we provide closed form expressions for the best response of each firm.
Unsurprisingly, the optimal abatement effort equates the marginal abatement cost to the marginal penalty. More surprisingly, the abatement effort process is a martingale irrespective of the allocation. We obtain in closed form a unique market equilibrium, reached at the trading rates that clear the market. The corresponding equilibrium price $\hat P$ is a martingale given by the conditional expectation of the (average of) marginal penalties. Its dynamics are driven by the (conditional) expectation of global allocations and emission shocks. Of course, the higher the allocations, the lower the market price, and conversely for shocks. Further, we show that the same methodology can be applied to different situations, like risk-averse firms or alternative terminal penalty functions.

Our main findings consist in showing that optimal dynamic policies are not unique, though they share common properties. All optimal dynamic policies induce a constant abatement effort. As a consequence of constant abatement efforts, the market equilibrium price is also constant. All dynamic allocations provide the same expected allocation. A simple example of optimal dynamic policy consists in first debiting the firms' accounts by the level of the desired emission reduction and then crediting them at the business-as-usual emission rate, including shocks. By acting in this way, the regulator hedges or protects firms against heavy adjustment costs. Far from being a drawback, the non-uniqueness of the optimal dynamic allocation suggests that by the same tool the regulator could reach other goals on top of the reduction of expected emission at the minimum social cost. An example is adding constraints on the financing of new technologies.
We leave this analysis to future research.

Indeed, an accurate comparison with a static allocation mechanism as implemented in the first phases of the EUTS shows that dynamic allocation mechanisms outperform static allocations only in the presence of adjustment costs, business cycles and, in particular, of common shocks. If any of these three features is absent, there is no benefit from the implementation of a dynamic allocation mechanism. Using crude data of verified emissions of the industrial sectors involved in the EUTS over the period 2008 to 2012, we calibrated our model. Using the closed formulas, we then calculated the difference in cost between the optimal dynamic allocation described above and three alternative existing policies, namely the static allocation mechanism that prevailed during Phase I and II, an MSR-like mechanism and the pure tax policy. We find a significant difference in cost between static and dynamic allocations, approximately when flexibility is low. On the other hand, the pure tax policy induces costs of a higher order of magnitude.

The paper is organised as follows. Section 2 describes the underlying stochastic model and formalises the regulator's problem. Section 3 deals with the firm's individual optimisation problem for a fixed allocation process and a given market price process. Section 4 provides the market equilibrium both with and without market impact. Section 5 solves the regulator's problem. Comparisons with alternative policies, like the Market Stability Reserve, pure tax and static allocation, are gathered in Section 6. There, numerical illustrations are also given. Finally, the Appendix contains technical results and computationally intensive parts of the proofs of the main Theorems.

The regulation of carbon allowances is indeed quite complex. In particular, it occurs over several periods, and allowances can be banked from one period to the other.
We abstract from these features and focus on a single period of $T$ years at the end of which compliance is assessed, as in Carmona et al. (2010, 2013) [7, 8], Kollenberg and Taschini (2016, 2019) [21, 22], Fell and Morgenstern (2010) [11].

A regulator (the Leader) wishes to reduce the pollution produced by a set of $N$ firms (Followers) over a period $[0, T]$. To this end, she allocates carbon permits to the firms. Given the received allocation, each firm minimizes its reduction, trade and terminal penalty cost till the system reaches a market equilibrium. Then, the regulator minimizes the optimal social cost over possible allocations. The stochastic model formalization goes as follows. Consider a filtered probability space $(\Omega, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$, in which the filtration is the augmented Brownian filtration generated by a standard $N+1$ dimensional Brownian motion $(\widetilde W^0, \widetilde W^1, \ldots, \widetilde W^N)$. Fix further correlation factors $(k_1, \ldots, k_N)$ and let

$$W^i_t = \sqrt{1 - k_i^2}\, \widetilde W^i_t + k_i \widetilde W^0_t \tag{2.1}$$

In particular, the correlation between $W^i$ and $W^j$ is $\rho_{ij} := k_i k_j$. The index notation in what follows is self explanatory, in the sense that $Y^j$ denotes a process, while $Y_j$, $y_j$ refer to constants or deterministic functions. The Business As Usual (BAU) cumulative emissions dynamics for firm $i$ are:

$$E^i_t = \mu_i t + \sigma_i W^i_t, \qquad 0 \le t \le T, \tag{2.2}$$

where $\mu_i$ and $\sigma_i > 0$ are the average growth rate and standard deviation rate of its emissions. Thus, the emission of firm $i$ is affected by its own idiosyncratic noise $d\widetilde W^i$ and by the common economic business cycle $d\widetilde W^0$. A positive shock induces an increase in emissions.

In the BAU case, the expected total emission over the period $T$ is $\mathbb{E}[E_T] = N \bar\mu T$, where $E_t = \sum_{i=1}^N E^i_t$ and $N \bar\mu = \sum_{i=1}^N \mu_i$. The regulator wishes to reduce the expected emissions to $\rho\, T N \bar\mu$, $0 < \rho < 1$, i.e. a fraction $1 - \rho$ of reduction compared to BAU.
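The BAU model (2.1)-(2.2) is easy to check by simulation. The following is a minimal sketch, not part of the original paper: the function names, parameter values and sample size are illustrative assumptions. It draws terminal emissions for each firm from one common and $N$ idiosyncratic Gaussian factors, so that the sample correlation between two firms can be compared with $k_i k_j$ and the sample mean of total emissions with $N \bar\mu T$.

```python
import math
import random


def simulate_bau_terminal(mu, sigma, k, T, n_paths, seed=0):
    """Draw samples of terminal BAU emissions (E^1_T, ..., E^N_T).

    mu, sigma, k are lists of per-firm parameters (illustrative values only).
    """
    rng = random.Random(seed)
    sqrt_T = math.sqrt(T)
    samples = []
    for _ in range(n_paths):
        w_common = rng.gauss(0.0, sqrt_T)  # common factor \tilde W^0_T
        row = []
        for mu_i, sigma_i, k_i in zip(mu, sigma, k):
            w_own = rng.gauss(0.0, sqrt_T)  # idiosyncratic factor \tilde W^i_T
            # Equation (2.1): mix own and common noise.
            w_i = math.sqrt(1.0 - k_i ** 2) * w_own + k_i * w_common
            # Equation (2.2): trend plus noise.
            row.append(mu_i * T + sigma_i * w_i)
        samples.append(row)
    return samples


def sample_correlation(xs, ys):
    """Plain sample correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    draws = simulate_bau_terminal([1.0, 2.0], [0.5, 0.8], [0.6, 0.5], 1.0, 20000)
    e1 = [d[0] for d in draws]
    e2 = [d[1] for d in draws]
    print("sample correlation:", sample_correlation(e1, e2), "theory:", 0.6 * 0.5)
```

Because the emissions are affine in $W^i_T$, the correlation of the terminal emissions coincides with that of the driving noises, which is why the comparison with $k_i k_j$ is meaningful.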
The regulator has several instruments at her disposal (taxes, quotas) but she wishes to implement a dynamic cap-and-trade system working in the following way. At $t = 0$ she opens for each firm a bank account $X^i$ and allocates permits, summed up in the cumulative process $\tilde A^i$. The $i$-th bank account follows the dynamics:

$$dX^i_t = \beta^i_t\, dt + d\tilde A^i_t - dE^{i,\alpha^i}_t, \qquad dE^{i,\alpha^i}_t = \big(\mu_i - \alpha^i_t\big)\, dt + \sigma_i\, dW^i_t. \tag{2.3}$$

In the above, $\alpha^i$ is the abatement rate and $\beta^i$ is the trading rate in the (liquid) allowances market. When making the effort $\alpha^i_t$, the emissions $E^{i,\alpha^i}$ of firm $i$ grow at the rate $\mu_i - \alpha^i_t$.

Assumptions: Firms' controls are square integrable with respect to $d\mathbb{P} \otimes dt$, namely they belong to $L^2 := L^2(\Omega \times [0, T], (\mathcal{F}_t)_t, \mathbb{P} \otimes dt)$. The allocation process $\tilde A^i$ is a square integrable semimartingale, that is $\mathbb{E}[(\tilde A^i_t)^2] < \infty$ for all $t$, decomposable into a square integrable finite variation part $F$ and a square integrable stochastic integral:

$$\tilde A^i_t = F^i_t + \sum_{j=0}^N \int_0^t \tilde b^j_s\, d\widetilde W^j_s$$

We do not require that the finite variation part $F$ of $\tilde A^i$ is absolutely continuous with respect to $dt$. Therefore, the regulator is free e.g. to allocate (credit or debit) permits at discrete instants, which can be fixed or stopping times, and/or at a rate $\tilde a$. In addition, the Lebesgue Decomposition Theorem allows the decomposition of $F$ into the sum of a part which is absolutely continuous with respect to the Lebesgue measure, and of a singular part $\tilde S^i$. So,

$$\tilde A^i_t = \tilde S^i_t + \int_0^t \tilde a^i_s\, ds + \sum_{j=0}^N \int_0^t \tilde b^j_s\, d\widetilde W^j_s \tag{2.4}$$

Each of the three addenda can be null. As already mentioned, the singular part $\tilde S^i$ can be e.g. a pure jump part, namely a weighted sum of Dirac deltas along a targeted (stochastic) time grid, in which the regulator provides/cancels allowances in a discrete way:

$$\tilde S^i_t = k^i_{t_1} \delta_{t_1}(t) + k^i_{t_2} \delta_{t_2}(t) + \ldots$$

where the dates $t_h$ form an increasing sequence of stopping times and $k^i_{t_h} \in \mathcal{F}_{t_h}$. This covers the case in which there is an initial allocation.
In fact, to obtain $X^i_0 = \tilde S^i_0$ it is enough to include $0$ in the time grid, with $\tilde S^i_0 \neq 0$. Given an allowance scheme, the allowance bank $X^i$ of firm $i$ gives at all times $t$ the net position of the firm in terms of emissions, abatement, allowances trading and allowances endowment by the regulator. A positive economic shock to the emissions induces a decrease in the bank accounts, while an increase in the allocations makes the accounts grow. The bank dynamics can be rewritten as

$$dX^i_t = dA^i_t + \big(\alpha^i_t + \beta^i_t\big)\, dt - \sigma_i\, dW^i_t, \tag{2.5}$$

in which $A^i$ is the net allocation process over the trend,

$$A^i_t = \tilde A^i_t - \mu_i t.$$

Since our results on the optimal regulation depend only on $A^i$, the trend rate $\mu_i$ does not necessarily have to be a constant. Here, we set it constant for ease of presentation. For future use, note that

$$\mathbb{E}\big[A^i_T\big] = \mathbb{E}\big[\tilde S^i_T\big] + \mathbb{E}\Big[\int_0^T (\tilde a^i_t - \mu_i)\, dt\Big].$$

For a given allowance price process $P$ and a given (net) allowance scheme $A^i$, firm $i$ aims to solve its cost minimization problem:

$$\inf_{\alpha^i, \beta^i} J^i(\alpha^i, \beta^i) := \inf_{\alpha^i, \beta^i} \mathbb{E}\Big[\int_0^T \Big(c_i(\alpha^i_t) + P_t \beta^i_t + \frac{1}{2\nu} (\beta^i_t)^2\Big)\, dt + \lambda (X^i_T)^2\Big]. \tag{2.6}$$

In the objective function $J^i$ of firm $i$, the abatement cost function $c_i$ is supposed to be quadratic,

$$c_i(\alpha) = h_i \alpha + \frac{1}{2\eta_i} \alpha^2, \qquad h_i, \eta_i > 0,$$

to consider both the linear and the adjustment costs, the latter proportional to the square of the abatement rate. This choice is in line with the literature on carbon emission reduction (see Gollier (2020) [15] and references therein). Further, the linear-quadratic form captures the non-decreasing feature of the marginal abatement cost. From an investment point of view, it means that there is some irreversibility in the abatement decision.
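As a small illustration of the linear-quadratic abatement cost, the sketch below evaluates $c_i$, its marginal cost $c_i'(\alpha) = h_i + \alpha/\eta_i$, and the abatement rate at which the marginal cost equals a given price level. The function names and numerical values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def abatement_cost(alpha, h, eta):
    """Linear-quadratic abatement cost c(alpha) = h*alpha + alpha^2 / (2*eta)."""
    return h * alpha + alpha ** 2 / (2.0 * eta)


def marginal_abatement_cost(alpha, h, eta):
    """Derivative c'(alpha) = h + alpha / eta, non-decreasing in alpha."""
    return h + alpha / eta


def abatement_matching_price(price, h, eta):
    """Abatement rate at which the marginal cost equals `price`.

    Solving h + alpha / eta = price gives alpha = eta * (price - h).
    """
    return eta * (price - h)


if __name__ == "__main__":
    h, eta = 2.0, 0.5  # illustrative parameters
    for price in (2.0, 3.0, 5.0):
        alpha = abatement_matching_price(price, h, eta)
        print(price, alpha, abatement_cost(alpha, h, eta))
```

A larger $\eta_i$ flattens the marginal cost curve, so the same price gap above $h_i$ supports a larger abatement rate.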
The higher the values of $\eta_i$, the higher the flexibility of the abatement process and thus the higher the reversibility of the decision.

As for trading costs, we take into account a price impact effect as in the original Kyle (1985) model [24], with constant market depth parameter $\nu > 0$. The term $\lambda (X^i_T)^2$, with $\lambda > 0$ equal for all firms, is the terminal monetary penalty on the bank accounts set by the regulator. The firm is going to pay both if its bank is above or below the compliance zero level. It is a regularized version of the actual terminal (cap) penalty function, which is zero if the firm is compliant and linear otherwise.

The objective of the regulator is to design dynamic allocation schemes $A = (A^1, \ldots, A^N)$ to reduce expected emissions, while minimising the social cost

$$\inf_A \mathbb{E}\Big[\sum_{i=1}^N \Big(\int_0^T \Big(c_i(\alpha^i_t) + \beta^i_t P_t + \frac{1}{2\nu} (\beta^i_t)^2\Big)\, dt + \lambda (X^i_T)^2\Big)\Big], \qquad \mathbb{E}\Big[\sum_{i=1}^N E^{i,\alpha^i}_T\Big] = \rho\, T N \bar\mu, \tag{2.7}$$

when the firms behave optimally and are at equilibrium, namely when $\sum_i \beta^i_t = 0$.

So, the regulator/firms problem falls in the category of dynamic stochastic Stackelberg games, since the regulator aims at minimizing the social cost resulting from the firms' optimal reduction and trade. Thus, she acts as a Leader while firms act as Followers (see Bensoussan et al. (2014) [3] for a survey).

The focus here is on the single firm cost minimization, for a given exogenous allowances price $P$ and a net allocation scheme $A^i$. We define

$$M^i_t := \mathbb{E}_t\big[A^i_T\big], \qquad R^i_t := \mathbb{E}_t\big[A^i_T - A^i_t\big], \qquad g_i(t) := \frac{2\lambda \eta_i}{1 + 2\lambda(\eta_i + \nu)(T - t)} \tag{3.1}$$

in which

• $M^i$ is the martingale closed by $A^i_T$. It gives, at time $t$, the (conditional) expectation of the firm's cumulative (net) endowment $A^i_T$. Different intertemporal allocations with the same cumulative value on the regulatory horizon $[0, T]$ give rise to the same $M^i$.
From the definitions of $A^i$ and $M^i$,

$$M^i_0 = \mathbb{E}\big[A^i_T\big] = \mathbb{E}\big[\tilde A^i_T\big] - \mu_i T.$$

• $R^i$ gives, at time $t$, the conditional expectation of the residual net allocation on $[t, T]$.

As anticipated in the Introduction, the firm minimization problem can be tackled by variational methods. In our case, the functional $J^i : L^2 \times L^2 \to \mathbb{R}$,

$$J^i(\alpha^i, \beta^i) = \mathbb{E}\Big[\int_0^T \Big(c_i(\alpha^i_t) + P_t \beta^i_t + \frac{1}{2\nu} (\beta^i_t)^2\Big)\, dt + \lambda (X^i_T)^2\Big],$$

is linear-quadratic, strictly convex and smooth. The optimal solution is unique and can be obtained by setting the stochastic gradient to zero.

By the Riesz Representation Theorem, any continuous linear form on a Hilbert space can be represented by an element of the space itself. Therefore, the differential of $J^i$ can be represented by the gradient, which belongs to $L^2 \times L^2$. The gradient is then a couple of square integrable, adapted processes on which we are going to write the first order conditions (FOC) in the proof of the next Theorem.

Theorem 3.1.

For a given cumulative, net allocation scheme $A^i$ and for a given allowances price $P$, the $i$-th firm cost minimization problem (2.6) has a unique, explicit solution $(\hat\alpha^i, \hat\beta^i)$ in $L^2 \times L^2$.

(i) The optimal abatement $\hat\alpha^i$ is the solution of the following SDE

$$d\hat\alpha^i_t = -g_i(t) \Big(d(M^i_t - \sigma_i W^i_t) + d\, \mathbb{E}_t\Big[\int_0^T \nu (h_i - P_s)\, ds\Big]\Big), \tag{3.2}$$

$$\hat\alpha^i_0 = -g_i(0) \Big(\frac{h_i}{2\lambda} + M^i_0 + \mathbb{E}\Big[\int_0^T \nu (h_i - P_t)\, dt\Big]\Big) \tag{3.3}$$

and is therefore a martingale.

(ii) The optimal trade $\hat\beta^i$ is

$$\hat\beta^i_t = \nu \Big(h_i + \frac{\hat\alpha^i_t}{\eta_i} - P_t\Big). \tag{3.4}$$

(iii) Both optimal controls can be rewritten in feedback form as functions of the bank state $\hat X^i_t$:

$$\hat\alpha^i_t = \hat\alpha^i_t(\hat X^i_t) = -g_i(t) \Big(\frac{h_i}{2\lambda} + \hat X^i_t + R^i_t + \mathbb{E}_t\Big[\int_t^T \nu (h_i - P_s)\, ds\Big]\Big) \tag{3.5}$$

$$\hat\beta^i_t = \hat\beta^i_t(\hat X^i_t) = \nu \Big(h_i + \frac{\hat\alpha^i_t(\hat X^i_t)}{\eta_i} - P_t\Big) \tag{3.6}$$

in which the expected residual allocation process $R^i$ appears in place of $M^i$.

Proof.

The proof is given in Section A.1. Here, we just anticipate the FOCs written on the gradient of $J^i$, as they will be explicitly referred to in the rest of the paper. They are

$$h_i + \frac{\alpha^i_t}{\eta_i} + 2\lambda\, \mathbb{E}_t\big[X^i_T\big] = 0, \tag{3.7}$$

$$P_t + \frac{\beta^i_t}{\nu} + 2\lambda\, \mathbb{E}_t\big[X^i_T\big] = 0. \tag{3.8}$$

Let us comment on these findings.

(i) The FOCs can be written

$$c_i'(\alpha^i_t) = -2\lambda\, \mathbb{E}_t\big[X^i_T\big], \qquad \beta^i_t = \nu \big(c_i'(\alpha^i_t) - P_t\big).$$

The marginal abatement cost is equal to the conditional expectation (C.E.) of the marginal penalty, and so is the marginal cost of trading, $P_t + \beta^i_t/\nu$. Consistently with economic intuition, the firm buys (resp. sells) if its marginal abatement cost is higher (resp. lower) than the market price.

(ii) The optimal abatement $\hat\alpha^i$ is a martingale. In fact, it is a stochastic integral, with a bounded, deterministic integrand $-g_i$, of three explanatory martingales: $M^i$, namely the C.E. of the cumulative net allocation over $[0, T]$; the conditional expectation of the integrated price; and the emission noise $\sigma_i W^i$. A fortiori, it is not the full intertemporal structure of the net allocation $A^i$, as it appears in (2.4), which matters. The key quantity here is $M^i$.

The firm compares the dynamics of the expectation of what will be given during the whole regulatory period, $A^i_T$, to noise and to integrated price, and then makes the decision on effort. If there is a positive economic shock $dW^i_t$, everything else being equal, the firm's effort increases. It decreases if $dM^i_t$ is positive, i.e. if the firm anticipates an increase in total expected net allocation.

The integrand $g_i$ in the martingale representation for $\hat\alpha^i$ depends on the following parameters: the individual firm adjustment cost of abatement $\eta_i$, the common penalty coefficient $\lambda$ and the market depth coefficient $\nu$.

When written in feedback form, $\hat\alpha^i$ depends only on the state $\hat X^i$, on the C.E. of the net residual allocation $R^i$ at time $t$, $R^i_t = \mathbb{E}_t[A^i_T - A^i_t]$, and on the C.E. of the (residual) integrated price.

Finally, we remark that the martingality of $\hat\alpha^i$ would be preserved if the abatement first order cost $h_i$ became a martingale.

(iii) The optimal trade $\hat\beta^i$ is not a martingale, unless $P$ is a martingale as well. This will occur at equilibrium (see Section 4).

Absence of market frictions is a common assumption in the literature (see Kollenberg and Taschini (2016) [21] or Carmona et al. (2010) [7] and the references therein). To better compare with this case, let us solve the firm optimisation problem when the market has infinite depth, $\nu = \infty$. The problem becomes

$$\inf_{\alpha^i, \beta^i} \tilde J(\alpha^i, \beta^i) := \inf_{\alpha^i, \beta^i} \mathbb{E}\Big[\int_0^T \Big(h_i \alpha^i_t + \frac{(\alpha^i_t)^2}{2\eta_i} + P_t \beta^i_t\Big)\, dt + \lambda (X^i_T)^2\Big] \tag{3.9}$$

If the optimizers exist, we cannot expect $\hat\beta^i$ to be unique. The objective function in fact loses strict convexity in the $\beta$ argument. The quadratic terminal penalty however involves the cumulative trade $B^i_T = \int_0^T \beta^i_t\, dt$, for which uniqueness will be obtained.

Proposition 3.1.

Problem (3.9) admits a solution if and only if $P$ is a martingale. In case $P$ is a martingale, the abatement effort of firm $i$ is unique and given by:

$$\hat\alpha^i_t = \eta_i \big(P_t - h_i\big). \tag{3.10}$$

The optimal trade rate is not unique. Any $\beta^i \in L^2$ satisfying

$$\int_0^T \beta^i_t\, dt = \hat B^i_T \tag{3.11}$$

is optimal, where $\hat B^i$ is the $L^2$ martingale satisfying the Cauchy problem

$$d\hat B^i_t = -\Big(\frac{1 + 2\lambda (T - t) \eta_i}{2\lambda}\, dP_t + d(M^i_t - \sigma_i W^i_t)\Big), \qquad \hat B^i_0 = -\Big(\frac{P_0 (1 + 2\lambda \eta_i T)}{2\lambda} + M^i_0 - \eta_i h_i T\Big). \tag{3.12}$$

The complete proof follows the same lines as that of Theorem 3.1. We briefly highlight the main differences. The resulting FOCs are:

$$c_i'(\alpha^i_t) + 2\lambda\, \mathbb{E}_t\big[X^i_T\big] = 0, \tag{3.13}$$

$$P_t + 2\lambda\, \mathbb{E}_t\big[X^i_T\big] = 0. \tag{3.14}$$

If $P$ is not a martingale, there are no stationary points and thus no minimizers. When $P$ is a martingale, as economic intuition suggests, each firm equates the marginal cost of abatement to the market price $P_t$. Also, the market price is equal to the conditional expectation of the marginal penalty. In Theorem 3.1, frictions introduce deviations from these equalities. With finite $\nu$ in fact, we saw that the marginal cost of abatement equals the marginal cost of trading, $P_t + \beta^i_t/\nu$, as follows from (3.7), (3.8). The same holds for the relation between the marginal cost of trading and the marginal penalty.

Further, the FOC equations here do not involve $\beta^i$ directly, but only the martingale $B^i$ generated by the total trade:

$$B^i_t := \mathbb{E}_t\Big[\int_0^T \beta^i_t\, dt\Big].$$

This is the main novelty: now the optimisation problem is strictly convex only in the total trade $B^i_T$. Therefore the optimal $\hat B^i_T$ and the generated martingale $\hat B^i$ are unique.
In fact, this martingale is found by replacing $\hat\alpha^i$ in relation (3.14) by its expression (3.10) as a function of $P$:

$$P_t + 2\lambda M^i_t + 2\lambda\, \mathbb{E}_t\Big[\int_0^T \eta_i (P_s - h_i)\, ds\Big] + 2\lambda B^i_t - 2\lambda \sigma_i W^i_t = 0 \tag{3.15}$$

An application of Lemma A.2, together with evaluation at $t = 0$, gives

$$d\hat B^i_t = -\Big(\frac{1 + 2\lambda (T - t) \eta_i}{2\lambda}\, dP_t + dM^i_t - \sigma_i\, dW^i_t\Big), \tag{3.16}$$

$$\hat B^i_0 = -\Big(\frac{P_0 (1 + 2\lambda \eta_i T)}{2\lambda} + M^i_0 - \eta_i h_i T\Big). \tag{3.17}$$

Therefore, any $\beta^i$ satisfying $\int_0^T \beta^i_t\, dt = \hat B^i_T$ is optimal.

We are now ready to tackle the equilibrium problem of the system of $N$ firms. Recall that the noises $W^i$ in the firms' activity have a quite general dependence structure, as described at the beginning of Section 3. Fix a net allocation policy of the regulator $A = (A^1, \ldots, A^N) \in (L^2)^N$. Define the positive, deterministic function

$$\pi_i(t) := \frac{g_i(t)}{\eta_i} \Big(1 - \frac{\nu (T - t)}{N} \sum_{k=1}^N \frac{g_k(t)}{\eta_k}\Big)^{-1}, \tag{4.1}$$

where $g_i$ is defined in (3.1). If firms share the same $\eta_i$, all the functions $\pi_i$ are equal. Market equilibrium consists in finding a price $\hat P$ that satisfies the market clearing condition:

$$\sum_{i=1}^N \hat\beta^i_t(\hat P) = 0, \qquad \forall t \in [0, T], \tag{4.2}$$

in which the $\hat\beta^i$ are given by the system (3.4). The price $\hat P$ is then called an equilibrium price. The market equilibrium is described in the following Theorem.

Theorem 4.1.

For a given net cumulative allocation $A$:

(i) The equilibrium price $\hat P$ is the unique solution to the Cauchy problem:

$$d\hat P_t = -\frac{1}{N} \sum_{i=1}^N \pi_i(t) \big(dM^i_t - \sigma_i\, dW^i_t\big), \qquad \hat P_0 = \frac{1}{N} \sum_{i=1}^N \pi_i(0) \big(\eta_i h_i T - M^i_0\big). \tag{4.3}$$

The price $\hat P$ is therefore a martingale.

(ii) The equilibrium price $\hat P$ can be written in feedback form as

$$\hat P_t = \frac{1}{N} \sum_{i=1}^N \pi_i(t) \Big(\eta_i h_i (T - t) - (\hat X^i_t + R^i_t)\Big). \tag{4.4}$$

(iii) The optimal controls $\hat\alpha^i$, $\hat\beta^i$ are the unique solutions of the following Cauchy problems:

$$d\hat\alpha^i_t = -g_i(t) \Big[d(M^i_t - \sigma_i W^i_t) - \nu (T - t)\, d\hat P_t\Big], \qquad \hat\alpha^i_0 = -g_i(0) \Big[h_i \Big(\frac{1}{2\lambda} + \nu T\Big) + M^i_0 - \nu T \hat P_0\Big], \tag{4.5}$$

$$d\hat\beta^i_t = \nu\, d\Big(h_i + \frac{\hat\alpha^i_t}{\eta_i} - \hat P_t\Big), \qquad \hat\beta^i_0 = \nu \Big(h_i + \frac{\hat\alpha^i_0}{\eta_i} - \hat P_0\Big). \tag{4.6}$$

(iv) In feedback form,

$$\hat\alpha^i_t(\hat X^i_t, R^i_t) = g_i(t) \Big[\nu (T - t)(\hat P_t - h_i) - \Big(\frac{h_i}{2\lambda} + \hat X^i_t + R^i_t\Big)\Big], \qquad \hat\beta^i_t(\hat X^i_t, R^i_t) = \nu \Big(h_i + \frac{\hat\alpha^i_t}{\eta_i} - \hat P_t\Big). \tag{4.7}$$

Proof.

See Appendix A.2.

Let us comment on the above results.

(i) The explanatory processes in (4.3) for the equilibrium price $\hat P$ are the martingales $(M^i - \sigma_i W^i)$, $i = 1, \ldots, N$. We observe that if all these martingales experience a positive shock, the price decreases. In short, if the regulator promises to all firms more (resp. less) future total net allocation than the effect of their economic shock, the price decreases (resp. increases).

(ii) When the adjustment costs $\eta_i$ are equal, the deterministic coefficients $\pi_i$ become equal and can be factorized out in (4.4). The equilibrium price $\hat P$ in this case depends only on the aggregate quantities

$$Z_t := \sum_{i=1}^N (M^i_t - \sigma_i W^i_t) \quad \text{and} \quad N \bar h := \sum_{i=1}^N h_i \tag{4.8}$$

(iii) The optimal efforts $\hat\alpha^i$ are obtained from (3.2) exploiting the martingality of $\hat P$. The trade $\hat\beta^i$ keeps its structure.

When there are no market frictions, we have seen in Section 3.2 that the optimal trade is not unique. However, there is a unique (martingale) equilibrium price $\hat P$, and consequently a unique abatement effort $\hat\alpha^i$, $i = 1, \ldots, N$. The equilibrium price $\hat P$ depends only on the aggregate quantities as in (4.8), plus the average adjustment cost coefficient $\bar\eta$. If $(\hat\alpha, \hat\beta, \hat P)$ denotes an equilibrium triplet, the next Proposition sums up the results in this particular framework. We omit the proof, since it follows from combining Theorem 4.1 (sending $\nu$ to infinity) and Proposition 3.1.

Proposition 4.1. When there are no market frictions,

(i)

The equilibrium price dynamics become d ˆ P t = − f ( t ) (cid:16) d ¯ M t − d ¯ W t (cid:17) , ˆ P = f (0) (cid:16) T ¯ H − ¯ M (cid:17) , (4.9) with ¯ M t := 1 N N (cid:88) i =1 M it , ¯ W t := 1 N N (cid:88) i =1 σ i W it , f ( t ) := 2 λ λ ¯ η ( T − t ) . in which ¯ H := N (cid:80) Ni =1 η i h i , and ¯ η := N (cid:80) Ni =1 η i . Its expression in closed-loop is given by ˆ P t = f ( t ) (cid:16) ( T − t ) ¯ H − ¯ X t − ¯ R t (cid:17) (4.10) where ¯ X denotes the average bank account process. (ii) The abatement eﬀort of ﬁrm i is unique and given by: ˆ α it = η i (cid:0) ˆ P t − h i (cid:1) . (4.11)(iii) The trading rates are not unique. Any β i ∈ L satisfying (cid:90) T β it dt = ˆ B iT (4.12) is optimal, where ˆ B it satisﬁes the Cauchy problem d ˆ B it = − (cid:18) λ ( T − t ) η i λ dP t + dM it − σ i dW it (cid:19) , ˆ B i = − (cid:18) ˆ P (1 − λη i T )2 λ + M i + η i h i T (cid:19) . (4.13) Market frictions are small compared to the cost of abatement required to achieve the carbon emissionreduction targeted by the European Union. As documented by Frino et. al. (2010), the carbon marketquality has been constantly increasing with tick size decreasing from 5 c e /t to 1 c e /t and a bid–ask spreadof 5 c e /t as of 2008. Nowadays, the value of the bid–ask on the December contract is around 2 c e /t for aquoted carbon price around 30 e /t ∗ , which makes a transaction cost less than 0.06%. Hence, we neglectthem in the regulator’s problem and assume hereafter that there are no market frictions. We address now the optimisation problem (2.7) of the regulator when the market is at equilibrium. Theregulator faces: inf A C ( A ) := E (cid:104) N (cid:88) i =1 (cid:90) T (cid:16) h i ˆ α it + ( ˆ α it ) η i (cid:17) dt + λ ( ˆ X iT ) (cid:105) , E (cid:2) E ˆ αT (cid:3) = ρ T N ¯ µ. (5.1)in which ˆ α = ( ˆ α , . . . , ˆ α N ) is the optimal eﬀort of the system given the allocation A , while E ˆ α is thesystem emission under this eﬀort. 
Using Proposition 4.1 (i) and (ii),
$$\mathbb E\big[E^{\hat\alpha}_T\big] = N T\,\big(\bar\mu - \bar\eta\hat P_0 + \bar H\big).$$

$^*$ Source: Thomson-Reuters Refinitiv quotations of the EU EUA December contract.

Hence the emission target amounts to a constraint on the equilibrium price:
$$\hat P_0 = \frac{1}{\bar\eta}\big(\bar H + (1-\rho)\bar\mu\big). \tag{5.2}$$
The relation (5.2) gives the average carbon price that must emerge from the market if the regulator wishes to reduce carbon emissions by a factor $1-\rho$. This average price captures all the features of the system in a simple way: it is proportional to the inflexibility $1/\bar\eta$ of the system, and it increases with the average abatement cost $\bar H$, with the growth rate of the emissions $\bar\mu$ and with the ambition of the reduction $1-\rho$.

Equating the constraint on $\hat P_0$ in (5.2) with the expression (4.9) of $\hat P_0$ obtained in Proposition 4.1 (i), we get
$$\bar M_0 = -\frac{1}{2\lambda\bar\eta}\Big[\bar H + \big(1 + 2\lambda\bar\eta T\big)(1-\rho)\bar\mu\Big] =: \ell(\rho) < 0. \tag{5.3}$$
Thus, the regulator withdraws allowances on average, as $\bar M_0$ is negative. This holds regardless of the intertemporal allocation processes $A^i$ from the (detrended) equation (2.4), as the relevant processes are the $M^i$. If e.g. firms receive allowances in the beginning, so that the banks $X^i$ satisfy $\sum_i X^i_0 = \sum_i A^i_0 = \sum_i \tilde S^i_0 > 0$, then in $(0, T]$ the regulator on average will withdraw permits. Conversely, if initially firms are given $\bar X_0 := \sum_i X^i_0/N < \ell(\rho)$, then the regulator will on average credit back permits in $(0, T]$. In fact, using the decomposition of $A^i$,
$$\bar M_0 = \mathbb E[\bar A_T] = \bar X_0 + \frac{1}{N}\sum_{i=1}^N \mathbb E\Big[\tilde S^i_T - \tilde S^i_0 + \int_0^T a^i_t\,dt\Big] = \ell(\rho),$$
and the second addendum must then be positive.

By the Predictable Representation Property of the Brownian filtration generated by the $N+1$ noises $\widetilde W^j$, $j = 0, \dots, N$ (see (2.1)), each $M^i$ can be written as
$$M^i_t = M^i_0 + \int_0^t \sum_{j=0}^N \gamma^{i,j}_s\,d\widetilde W^j_s$$
with $\gamma^{i,j} \in L^2$, $j = 0, \dots, N$. If $\gamma^i = (\gamma^{i,0}, \dots, \gamma^{i,N})$ denotes the integrands vector for firm $i$, the problem can be re-parametrized on controls $M \equiv ((M^1_0, \gamma^1), \dots, (M^N_0, \gamma^N))$. With this formulation, we state our main result.

Theorem 5.1.

The social cost minimisation problem
$$\inf_A C(A) = \inf_M \mathbb E\Big[\sum_{i=1}^N \Big(\int_0^T \Big(h_i\hat\alpha^i_t + \frac{(\hat\alpha^i_t)^2}{2\eta_i}\Big)dt + \lambda(\hat X^i_T)^2\Big)\Big], \qquad \bar M_0 = \ell(\rho), \tag{5.4}$$
has the following solution structure:

(1) Optimizers $\hat\gamma^i$, $i = 1, \dots, N$, annihilate the volatility of the price. The vectors $\hat\gamma^i$ are non-unique. One set of optimizers is obtained by tracking the volatility of each firm separately, thus allocating to the $i$-th firm exactly the systemic ($j = 0$) and idiosyncratic ($j = i$) components:
$$\hat\gamma^{i,0}_t = \sigma_i k_i, \qquad \hat\gamma^{i,i}_t = \sigma_i\sqrt{1 - k_i^2}, \qquad \hat\gamma^{i,j}_t = 0 \ \text{ for } j \neq 0 \text{ and } j \neq i.$$
The optimal martingales $\hat M^i$ become
$$\hat M^i_t = \hat M^i_0 + \sigma_i W^i_t, \qquad i = 1, \dots, N.$$

(2) Expected optimal allocations $\hat M^i_0$ are also non-unique. The regulator is free to allocate permits as long as the expectation of the total number of permits satisfies the constraint $\sum_{i=1}^N \hat M^i_0 = N\ell(\rho) < 0$. An example is the equal assignment in expectation: $\hat M^i_0 = \ell(\rho)$ for all $i$.

(3) Optimal allocations $A$ are, as a consequence, non-unique. If the regulator chooses the firm-by-firm volatility tracking and equal assignment in expectation as in items (1) and (2) above, there is an optimal set of net martingale allocations:
$$A^i_t = \ell(\rho) + \sigma_i W^i_t = \hat M^i_t.$$

(4) As the price has zero volatility, it is constant:
$$\hat P_t = \hat P_0 = \frac{1}{\bar\eta}\big(\bar H + (1-\rho)\bar\mu\big). \tag{5.5}$$

(5) Firms' optimal abatements are unique and constant: $\hat\alpha^i_t = \hat\alpha^i_0 = \eta_i(\hat P_0 - h_i)$.

(6) Firms' optimal trading rates are non-unique. Any $\hat\beta^i$ satisfying (4.12) is optimal. If the regulator chooses the firm-by-firm individual tracking in item (1) above, the dynamics of the associated $\hat B^i$ are null. If, in addition, there is equal endowment in expectation, then an optimal solution is trading at a constant rate
$$\hat\beta^i_t = \frac{\hat B^i_0}{T} = \frac{1}{T}\Big(\frac{1 + 2\lambda\eta_i T}{2\lambda}\hat P_0 + \ell(\rho) - \eta_i h_i T\Big).$$

(7) The minimum social cost is
$$\hat C = \frac{N}{4\lambda}\big(1 + 2\lambda\bar\eta T\big)\hat P_0^2 - \frac{T}{2}\sum_{i=1}^N \eta_i h_i^2. \tag{5.6}$$

Proof.

The cost function in (5.4) is convex and differentiable in $M^i_0$, $\gamma^i$ for all $i$. The Lagrangian is
$$L(M, \xi) = \sum_{i=1}^N \mathbb E\Big[\int_0^T \Big(h_i\hat\alpha^i_t + \frac{(\hat\alpha^i_t)^2}{2\eta_i}\Big)dt + \lambda(\hat X^i_T)^2\Big] + \xi\Big(\sum_{i=1}^N M^i_0 - N\ell(\rho)\Big),$$
in which the optimal controls are from Proposition 4.1. From (3.14), $\hat P_T = -2\lambda\hat X^i_T$ for all firms $i$. Substituting the optimal firm abatement controls with $\hat P$ as from (3.13), the optimal cost for the $i$-th firm is
$$\mathbb E\Big[\int_0^T \Big(h_i\eta_i(\hat P_t - h_i) + \frac{1}{2\eta_i}\big(\eta_i(\hat P_t - h_i)\big)^2\Big)dt + \frac{\lambda}{4\lambda^2}(\hat P_T)^2\Big].$$
Since $h_i\eta_i(\hat P_t - h_i) + \frac{\eta_i}{2}(\hat P_t - h_i)^2 = \frac{\eta_i}{2}(\hat P_t^2 - h_i^2)$, the Lagrangian becomes
$$L(M, \xi) = -\frac{T}{2}\sum_{i=1}^N h_i^2\eta_i + \frac{1}{2}\sum_{i=1}^N \eta_i\,\mathbb E\Big[\int_0^T \hat P_t^2\,dt\Big] + \frac{N}{4\lambda}\mathbb E\big[\hat P_T^2\big] + \xi\Big(\sum_{i=1}^N M^i_0 - N\ell(\rho)\Big).$$
Also, by the martingale property of $\hat P$,
$$\mathbb E\big[\hat P_t^2\big] = \hat P_0^2 + 2\hat P_0\,\mathbb E\big[\hat P_t - \hat P_0\big] + \mathbb E\big[(\hat P_t - \hat P_0)^2\big] = \hat P_0^2 + \mathbb E\big[\langle\hat P\rangle_t\big],$$
so that
$$L(M, \xi) = -\frac{T}{2}\sum_{i=1}^N h_i^2\eta_i + \frac{N}{4\lambda}(2\lambda T\bar\eta + 1)\hat P_0^2 + \frac{N\bar\eta}{2}\mathbb E\Big[\int_0^T \langle\hat P\rangle_t\,dt\Big] + \frac{N}{4\lambda}\mathbb E\big[\langle\hat P\rangle_T\big] + \xi\Big(\sum_{i=1}^N M^i_0 - N\ell(\rho)\Big).$$

Optimality conditions:

• $\nabla_{M_0} L$. Since $\partial\hat P_0/\partial M^i_0 = -f(0)/N$ by (4.9), we have to impose
$$\frac{\partial L}{\partial M^i_0} = \xi - \frac{N}{4\lambda}(2\lambda T\bar\eta + 1)\,\frac{2\hat P_0 f(0)}{N} = \xi - \hat P_0 = 0, \qquad i = 1, \dots, N.$$

• $\nabla_\gamma L$. The matrix process $\gamma$ in the representation of $M$ is involved only in the dynamics of $\hat P$. In the Lagrangian, it thus enters only in the quadratic variation. This implies that the minimum with respect to $\gamma$ is attained when the regulator annihilates the quadratic variation, i.e. the volatility of $\hat P$. From Proposition 4.1, we just need to impose that
$$\langle\bar M - \bar W\rangle = 0,$$
in which $\bar W = \frac{1}{N}\sum_{i=1}^N \sigma_i W^i$. Using (2.1), and recalling $M^i_t = M^i_0 + \int_0^t \sum_{j=0}^N \gamma^{i,j}_s\,d\widetilde W^j_s$, the previous equation can be rewritten as
$$\Big\langle \sum_{i=1}^N \int_0^\cdot \sum_{j=0}^N \gamma^{i,j}_s\,d\widetilde W^j_s - \sum_{i=1}^N \sigma_i\Big(\sqrt{1 - k_i^2}\,\widetilde W^i + k_i\widetilde W^0\Big)\Big\rangle = 0.$$
The above boils down to the system
$$\sum_{i=1}^N (\gamma^{i,0}_s - \sigma_i k_i) = 0, \quad j = 0, \qquad \sum_{i=1}^N \gamma^{i,j}_s - \sigma_j\sqrt{1 - k_j^2} = 0, \quad j = 1, \dots, N. \tag{5.7}$$
Namely, the regulator allocates on the $j$-th Brownian motion $\widetilde W^j$ the aggregate volatility the system has in that shock. Thus, the net effect is that the regulator kills the exposure to shocks in the whole system. Clearly, a particular solution is to kill exposure firm by firm, with $\gamma^{i,0}_s - \sigma_i k_i = 0$, $\gamma^{j,j}_s - \sigma_j\sqrt{1 - k_j^2} = 0$, and $\gamma^{i,j}_s = 0$ otherwise, as stated in item (1).

• Feasibility is simply $\sum_i M^i_0 - N\ell(\rho) = 0$.

• The price will then be unique, the positive constant
$$\hat P_0 = \frac{2\lambda}{1 + 2\lambda\bar\eta T}\big(T\bar H - \ell(\rho)\big) = \frac{1}{\bar\eta}\big(\bar H + (1-\rho)\bar\mu\big).$$

• The optimal efforts $\hat\alpha^i$ are unique and constant, as per (4.11).

• The optimal conditional expectations (C.E.) of total trade $\hat B^i_0$ are non-unique, because they depend on the C.E. of the individual optimal total allocation $\hat M^i_0$ as detailed in (4.13). If the regulator tracks individual volatility, then $\hat M^i_t = \ell(\rho) + \sigma_i W^i_t$ and $\hat B^i = \text{const} = \Big(\frac{1 + 2\lambda\eta_i T}{2\lambda}\hat P_0 + \ell(\rho) - \eta_i h_i T\Big)$.

Finally, the expression of the minimum social cost easily follows from the above relations. $\square$

Hence, the effect of dynamic regulation is a constant market price. However, this condition is not imposed a priori: it is a consequence of social cost minimisation. Indeed, the level of carbon emission reduction fixes the average level of required effort, and thus the average required price. But price fluctuations induce variations of effort, which in turn produce irreversible cost because of the inflexibility of the system.
In fact, social costs are increased by price fluctuations. By annihilating price changes, the regulator spares firms an expensive stop-and-go.

In the optimal policy given in Theorem 5.1, one has
$$\tilde A^i_t = A^i_t + \mu_i t = \ell(\rho) + \sigma_i W^i_t + \mu_i t,$$
i.e. the regulator provides an initial (negative) allocation and then credits the whole BAU emissions to the firms. As a consequence, the bank accounts have a deterministic, linear dynamic:
$$\hat X^i_t = \ell(\rho) + (\hat\alpha^i + \hat\beta^i)\,t, \qquad i = 1, \dots, N.$$
In other optimal schemes, the regulator eliminates all the economic uncertainty involved in the dynamics of the carbon emissions. Regulation does not necessarily kill individual emission noises, but uncertainty is tackled as a whole, as soon as the optimality condition (5.7) is respected.

Although more complex to implement than a tax or a static initial endowment, optimal dynamic policies offer a powerful tool in emissions control. In particular, non-uniqueness of optimal dynamic policies is a key feature in our model. Indeed, because they are non-unique, the regulator can achieve more goals using the same device. This can be obtained by e.g. adding more constraints to the optimal control problem of the regulator. An example could be an additional constraint on the financing of new technologies. We leave these developments for future research.
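As a numerical sanity check on Theorem 5.1, the following Python sketch evaluates the closed-form quantities (5.3), (5.5), (4.11) and (5.6) for a small hypothetical system. The function name `optimal_policy` and all parameter values are illustrative assumptions (they are not the calibration used later in the paper); only the formulas are taken from the text.

```python
def optimal_policy(eta, h, mu_bar, rho, lam, T):
    """Closed-form quantities of Theorem 5.1 (frictionless market).

    eta, h : per-firm flexibility and linear cost coefficients (lists)
    mu_bar : average BAU emission growth rate
    rho    : target ratio of expected emissions to BAU emissions
    lam, T : terminal penalty coefficient and time horizon
    """
    n = len(eta)
    eta_bar = sum(eta) / n
    H_bar = sum(e * x for e, x in zip(eta, h)) / n

    # Required constant price, equation (5.5).
    price = (H_bar + (1 - rho) * mu_bar) / eta_bar
    # Average expected net allocation l(rho), equation (5.3).
    ell = -(H_bar + (1 + 2 * lam * eta_bar * T) * (1 - rho) * mu_bar) / (2 * lam * eta_bar)
    # Same price recovered from the market equilibrium (4.9) with M_bar_0 = l(rho).
    f0 = 2 * lam / (1 + 2 * lam * eta_bar * T)
    price_market = f0 * (T * H_bar - ell)

    # Constant abatement efforts, equation (4.11).
    efforts = [e * (price - x) for e, x in zip(eta, h)]
    # Expected emissions N*T*(mu_bar - eta_bar*P + H_bar), i.e. BAU minus mean effort.
    expected_emissions = n * T * (mu_bar - sum(efforts) / n)
    # Minimum social cost, equation (5.6).
    cost = (n / (4 * lam) * (1 + 2 * lam * eta_bar * T) * price ** 2
            - T / 2 * sum(e * x ** 2 for e, x in zip(eta, h)))
    return {
        "price": price,
        "price_market": price_market,
        "ell": ell,
        "efforts": efforts,
        "expected_emissions": expected_emissions,
        "cost": cost,
    }


# Hypothetical three-firm system in arbitrary units.
result = optimal_policy(eta=[1.0, 2.0, 3.0], h=[0.3, 0.2, 0.1],
                        mu_bar=2.0, rho=0.8, lam=0.5, T=10.0)
```

In this sketch the two price expressions coincide, $\ell(\rho)$ is negative, and the expected emissions equal $\rho T N \bar\mu$, which is the consistency between (5.2), (5.3) and (4.9) used in the proof.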

Remark 5.1 (When there are frictions). The same procedure used in Theorem 5.1 applies to the case where there are market frictions. In that case, Theorem 4.1 states that the optimal efforts and trading rates are martingales. Also, the terminal values of the bank accounts still are deterministic functions of $\hat P_T$, as can be easily deduced from (3.7) and (4.7). Therefore the regulator's general problem (2.7) also amounts to annihilating the volatility of $\hat P$. This goal is uniquely achieved by individual tracking, giving $M^i = M^i_0 + \sigma_i W^i$ to firm $i$. The dynamic allocation proposed in Theorem 5.1, item (3), is now the unique optimal policy. Optimal efforts and optimal trading rates will also be unique.

Cap-and-trade mechanism

A cap-and-trade mechanism is often described by a terminal penalty in the form of a cap on emissions. This corresponds to a put on the bank, with strike equal to the maximum tolerated emission level $L$. We show that the firm optimization problem can still be solved by the variational methodology. Suppose the cost function $K^i$ of the agent is
$$K^i(\alpha^i, \beta^i) = \mathbb E\Big[\int_0^T \Big(h_i\alpha^i_t + \frac{(\alpha^i_t)^2}{2\eta_i} - P_t\beta^i_t + \frac{1}{2\nu}(\beta^i_t)^2\Big)dt\Big] + \mathbb E\big[\lambda(L^i - X^i_T)^+\big],$$
where the cap $L$ is the minimum tolerated bank level at the end of the regulated period. Beyond this threshold, the firm pays at maturity $\lambda$ per ton exceeding the bank level. This formulation can be seen as a continuous-time version of the model in Carmona, Fehr and Hinz [6], with the additional features of price impact and quadratic abatement costs. In order to solve
$$\min_{\alpha^i, \beta^i} K^i(\alpha^i, \beta^i),$$
we observe that the functional $K^i$ is finite, strictly convex and coercive on $L^2 \times L^2$, and so there exists a unique optimizer. $K^i$ is also sub-differentiable. In fact, the pointwise (non adapted) subdifferentials of the put expectation are
$$\tilde\lambda = -\big(\lambda\,\mathbf 1_{\{L - X^i_T > 0\}} + \tau\,\mathbf 1_{\{L - X^i_T = 0\}}\big),$$
in which $0 \le \tau \le \lambda$ is an $\mathcal F_T$-measurable r.v. Note that when $P(L - X^i_T = 0) = 0$, all the subdifferentials at $X^i_T = L$ coincide almost surely, so we can safely choose $\tau = \lambda$. The put becomes differentiable. This happens e.g. when the allocation of the regulator does not perfectly track the bank noise. In fact, in this case the distribution of $X^i_T$ has no atoms thanks to the presence of a diffusion term. In the paper [6], the assumption (19) in Theorem 1 is a very similar condition on the net firm position, which ensures that the subdifferential of the put is in fact unique. To better compare with [6], assume the put is differentiable. The FOCs become
$$\frac{\partial}{\partial\alpha^i}K^i = h_i + \frac{\alpha^i_t}{\eta_i} - \lambda\,\mathbb E_t\big[\mathbf 1_{\{L - X^i_T \ge 0\}}\big] = 0, \tag{5.8}$$
$$\frac{\partial}{\partial\beta^i}K^i = -P_t + \frac{\beta^i_t}{\nu} - \lambda\,\mathbb E_t\big[\mathbf 1_{\{L - X^i_T \ge 0\}}\big] = 0, \tag{5.9}$$
and the system admits a unique solution in $L^2 \times L^2$. At equilibrium, by market clearing the $\beta$ term vanishes in the second FOC and the equilibrium price becomes the average of the conditional expectations of the marginal penalties. If, as in [6], we neglect price impact, at equilibrium one gets the simplified relation
$$P_T = \lambda\,\mathbf 1_{\{L - X^i_T \ge 0\}}, \qquad i = 1, \dots, N.$$
This implies that the sets $\{L - X^i_T \ge 0\}$, $i = 1, \dots, N$, coincide a.s. and
$$N L - \sum_{i=1}^N X^i_T \ge 0 \ \text{a.s.} \quad\text{iff}\quad L - X^i_T \ge 0 \ \text{a.s.},$$
and therefore the price can be seen as the marginal penalty of the aggregate put,
$$P_T = -\lambda\,\mathbf 1_{\{NL - \sum_i X^i_T \ge 0\}},$$
and equals the marginal abatement costs.

As in the cited [6], optimal strategies cannot be found in closed form. Nevertheless, we observe that the market equilibrium price is still a martingale, depending on the allocation processes, and that the abatement efforts are a linear function of the price.
As in Theorem 5.1, then, optimal allocations annihilate the volatility of the price.

Risk aversion taken into account

Let us denote by
$$Y^i_T = \int_0^T \Big(h_i\alpha^i_t + \frac{1}{2\eta_i}(\alpha^i_t)^2 - P_t\beta^i_t + \frac{(\beta^i_t)^2}{2\nu}\Big)dt + \lambda(X^i_T)^2$$
the random cost incurred in $[0, T]$ by a single firm. Suppose the agent has a concave utility $U_i$, finite on $\mathbb R$, smooth and with $U_i' > 0$. Consider the expected utility functional
$$(\alpha, \beta) \in L^2 \times L^2 \to \mathbb E[U_i(-Y_T)].$$
Assume that the functional is proper, namely there exists a couple $\alpha^0, \beta^0$ in $L^2$ such that the expected utility of the associated cost $Y^0$, $\mathbb E[U_i(-Y^0_T)]$, is finite. The agent seeks to solve
$$\max_{\alpha, \beta} \mathbb E[U_i(-Y_T)].$$
Differentiating under the expectation, the gradient of the functional is $-U_i'(-Y_T)\,DY$, in which $DY$ is the bivariate process
$$DY_t = \Big(h_i + \frac{\alpha_t}{\eta_i} + 2\lambda\,\mathbb E_t[X_T],\ -P_t + \frac{\beta_t}{\nu} - 2\lambda\,\mathbb E_t[X_T]\Big).$$
Since $U_i' \neq 0$, the FOC condition is
$$DY = 0 \quad dP \otimes dt \ \text{a.e.},$$
which is equivalent to (A.2) and therefore the same we had in the risk-neutral case. Thus, the solutions found in Theorem 3.1 are the candidate optimal couple. We only need to check that the associated cost $\hat Y_T$ has finite utility. This is straightforward:
$$-\infty < \mathbb E[U_i(-Y^0_T)] \le \mathbb E[U_i(-\hat Y_T)] \overset{\text{concavity}}{\le} U_i(-\mathbb E[\hat Y_T]) < \infty,$$
since $\hat Y_T$ here is integrable and $U_i$ is finite on $\mathbb R$. Therefore, $(\hat\alpha, \hat\beta)$ from Theorem 3.1 continues to be the optimal couple even in the presence of risk aversion.

We compare here the optimal dynamic allocation policy suggested in Theorem 5.1 with three alternative existing policies: the initial static allocation, the pure tax system, and a Market Stability Reserve-like allocation mechanism. In each case, we compute the social cost to achieve the same expected total carbon emissions reduction. For ease of presentation, the firms' adjustment costs are assumed to be equal: $\eta_i = \eta$ for all $i = 1, \dots, N$.

For the sake of comparison, consider an ETS Phase 2-like mechanism, i.e. a static allocation: an initial endowment $X^i_0 = \tilde S^i_0 = x^i_0$ and zero intertemporal allocation, $\tilde a^i_t = 0$, $\tilde S^i_t = 0$ for all $i = 1, \dots, N$ and $0 < t \le T$.
Under this policy we now calculate the social cost (5.1), which will necessarily be suboptimal. Denote by $\bar x_0$ the average initial endowment. Since $a^i_t = -\mu_i$, by Proposition 4.1 (i) rewritten with $\eta_i = \eta$ the equilibrium price is
$$\hat P_t = \frac{2\lambda}{1 + 2\lambda\eta T}\big(T\eta\bar h - \bar x_0 + T\bar\mu\big) + \int_0^t \frac{2\lambda}{1 + 2\lambda\eta(T-s)}\,d\bar W_s, \tag{6.1}$$
since $\bar M_0 = \bar x_0 - T\bar\mu$. The expected emissions under optimal effort $\hat\alpha$ given the above price are
$$\mathbb E\big[E^{\hat\alpha}_T\big] = N T\big(\bar\mu - \eta(\hat P_0 - \bar h)\big).$$
Then, under a static allocation the regulator achieves the objective of emission reduction from the BAU trend $T N\bar\mu$ to $\rho T N\bar\mu$ by setting
$$N T\big(\bar\mu - \eta(\hat P_0 - \bar h)\big) = \rho T N\bar\mu.$$
Thus, the initial price $\hat P_0$ becomes identical to (5.5) and the regulator has to allocate on average
$$\bar x_0 = T\rho\bar\mu - \frac{1}{2\lambda}\Big[\bar h + \frac{(1-\rho)\bar\mu}{\eta}\Big]. \tag{6.2}$$
Note that, contrary to the optimal dynamic allocation scheme (Theorem 5.1 (2)), the initial allocation here can be positive. Further, let us compare with the intuitive initial allocation of a cap-and-trade system. There, if the regulator wishes to reduce the emissions to $\rho N\bar\mu T$, this value will be the aggregate cap. Then, she would set precisely the initial endowment at the cap level $\rho\bar\mu T$ per firm. Here instead the optimal initial endowment is lower than this intuitive cap.

The corresponding social cost is
$$C_{stat} = \mathbb E\Big[\int_0^T \sum_{i=1}^N \frac{\eta}{2}\big(\hat P_t^2 - h_i^2\big)dt + \frac{N\hat P_T^2}{4\lambda}\Big].$$
Let us denote $d\langle\bar W\rangle_t = \sigma^2 dt$, with $N^2\sigma^2 := \sum_{i=1}^N \sigma_i^2 + 2\sum_{i<j}\rho_{ij}\sigma_i\sigma_j$, where $\rho_{ij}$ is the correlation between $W^i$ and $W^j$. Since, by (6.1),
$$\langle\hat P\rangle_t = \frac{2\lambda\sigma^2}{\eta}\Big(\frac{1}{1 + 2\lambda\eta(T-t)} - \frac{1}{1 + 2\lambda\eta T}\Big),$$
straightforward computations lead to
$$C_{stat} = \frac{N}{4\lambda}\big(1 + 2\lambda\eta T\big)\hat P_0^2 + \frac{N\sigma^2}{2\eta}\ln\big[1 + 2\lambda\eta T\big] - \frac{T\eta}{2}\sum_{i=1}^N h_i^2. \tag{6.3}$$
Thus, the difference in social cost between the static allocation (6.3) and the optimal allocation $\hat C$ in (5.6) is given by
$$\Delta_{stat} := \frac{N\sigma^2}{2\eta}\ln\big[1 + 2\lambda\eta T\big]. \tag{6.4}$$
Such difference stems from the presence of uncertainty and inflexibility in the system. Further, suppose there are $N$ firms with identical $\sigma_i = \bar\sigma$ and identical $k_i$, so that we have $\rho_{ij} = \bar\rho$. In this case, $N^2\sigma^2 = N\bar\sigma^2 + \bar\rho\bar\sigma^2 N(N-1)$. Thus, when $N$ is large, the volatility $\sigma^2$ tends to zero, unless there is some correlation with the common shocks. In fact, when $\bar\rho$ is non-zero the per unit difference in cost does not vanish:
$$\lim_{N\to\infty}\frac{\Delta_{stat}}{N} = \frac{\bar\rho\bar\sigma^2}{2\eta}\ln\big[1 + 2\lambda\eta T\big]. \tag{6.5}$$
Thus, for $N$ large, the optimal dynamic policies continue to outperform the static allocation in the presence of common economic shocks.

Further, by the relation (6.1), the price quadratic variation satisfies
$$\langle\hat P\rangle_T = \frac{4\lambda^2\sigma^2 T}{1 + 2\lambda\eta T}, \tag{6.6}$$
which provides a way to estimate the flexibility parameter $\eta$. Indeed, it satisfies
$$\eta = \frac{4\lambda^2\sigma^2 T - \langle\hat P\rangle_T}{2\lambda T\langle\hat P\rangle_T}. \tag{6.7}$$
When the term $4\lambda^2\sigma^2 T$ is large (see the numerical illustration below in section 6.4), we have
$$\frac{\langle\hat P\rangle_T}{\sigma^2} \approx \frac{2\lambda}{\eta}. \tag{6.8}$$
This makes explicit the relation between the volatility of the exogenous economic shocks and the equilibrium market price volatility. The penalty factor $\lambda$ and the flexibility parameter $\eta$ act as the transmission belts of the economic shocks to the market price volatility. The higher the flexibility $\eta$, the greater the compensation of an economic shock in the equilibrium market price.

6.2 Pure tax

As the name suggests, in a pure tax system there is no bank account for net emissions positions, nor allowances.
Firm $i$ makes abatement effort only because of a proportional tax $\tau$ on total realized emissions $E^{i,\alpha^i}_T$ from (2.3). Each firm then faces the minimization problem
$$\inf_{\alpha^i} \mathbb E\Big[\int_0^T c_i(\alpha^i_t)\,dt + \tau E^{i,\alpha^i}_T\Big], \qquad i = 1, \dots, N, \tag{6.9}$$
which admits a unique solution, the constant effort $\hat\alpha^i = \eta(\tau - h_i)$. So, the regulator would set the tax at
$$\tau = \bar h + (1-\rho)\bar\mu/\eta \tag{6.10}$$
to induce a reduction of expected total emissions by a factor $1-\rho$. Unsurprisingly, the tax level is equal to the constant price $\hat P_0$ in (5.5) of the dynamic allocation schemes, because there the expected emissions reduction is determined by the average carbon price. The social cost becomes
$$C_{tax} := \mathbb E\Big[\sum_{i=1}^N \Big(\int_0^T c_i(\hat\alpha^i_t)\,dt + \tau E^{i,\hat\alpha^i}_T\Big)\Big] = N T\Big(\frac{\eta}{2}\tau^2 - \frac{\eta}{2N}\sum_{i=1}^N h_i^2 + \rho\bar\mu\tau\Big) \tag{6.11}$$
and must be compared with $\hat C$ in (5.6). Quick computations show that the tax is more efficient than an optimal dynamic allocation when
$$\lambda < \frac{\bar h + (1-\rho)\bar\mu/\eta}{4\rho\bar\mu T}. \tag{6.12}$$
The difference in cost between the tax system and the optimal dynamic allocation should be understood as follows. In a tax system, firms pay for each ton they emit, whereas in the cap-and-trade system they pay only for those tons which are non-compliant with the targets. Hence, the more carbon the system emits, the more socially expensive it is to set a tax. The relation (6.12) translates this phenomenon into a threshold on the penalty factor, i.e. the damage function.

To cope with the imbalances of the EU ETS between realised emissions and total allocations, the EU launched in 2019 a Market Stability Reserve (MSR). The policy rules under MSR specify that when the number of allowances in circulation falls below a certain threshold value (400 million allowances), the regulator auctions off a share of new allowances (…%).
Further, if the total number of allowances exceeds another threshold (800 million allowances), the same fraction of allowances is withdrawn from the market.

To compare the MSR policy with our framework, we consider a continuous-time version of this mechanism. We work under identical initial endowments and allocations, since the social cost is a function of the aggregate quantities. So, consider a net allocation scheme with equal initial endowment $X^i_0 = \tilde S^i_0 = x^i_0 = \bar x_0$ (but no intertemporal singular allocation, $\tilde S^i_t = 0$ for all $i = 1, \dots, N$ and $0 < t \le T$), and identical net allocation rates $a^i$ given by
$$a^i_t = \bar a_t = \delta\Big(\frac{T-t}{T}\bar x_0 - \bar X_t\Big), \qquad \text{for all } i = 1, \dots, N. \tag{6.13}$$
The rationale behind the above allocation mechanism is that the regulator would like to drive the accounts from $X^i_0 = \bar x_0$ to an aggregate position $\bar X_T$ such that $\mathbb E[\bar X_T] \approx 0$ by following a linear trajectory. In this scheme, deviations from the average expected trajectory of carbon emissions reduction are considered to be market imbalances and are compensated continuously at a rate proportional to the imbalance. The parameter $\delta$ acts as a mean-reversion factor, trying to make the average bank accounts go back to the desired trajectory.

Since the allocation process is fixed, we need to determine $\bar x_0$ such that
$$\bar x_0 + \mathbb E\Big[\int_0^T \bar a_t\,dt\Big] = \ell(\rho) \tag{6.14}$$
to ensure that the expected total emissions are reduced by a factor $\rho$.
Under this MSR-like mechanism, the market equilibrium follows the dynamics
$$d\bar X_t = \Big(\eta(\hat P_t - \bar h) + \delta\Big(\frac{T-t}{T}\bar x_0 - \bar X_t\Big)\Big)dt - d\bar W_t, \qquad \bar X_0 = \bar x_0, \tag{6.15}$$
$$\hat P_t = F(t)\Big[(1 - \delta z(t))\Big(\frac{T-t}{T}\bar x_0 - \bar X_t\Big) + z(t)\Big(\eta\bar h - \frac{\bar x_0}{T}\Big)\Big], \tag{6.16}$$
$$z(t) = \frac{1 - e^{-\delta(T-t)}}{\delta}, \qquad F(t) := \frac{f(t)}{1 - \eta f(t)\big[T - t - z(t)\big]}.$$
Since $\bar X$ can be found explicitly, we can calculate the initial allocation that ensures a reduction of emissions of level $\rho$:
$$\bar x_0 = \frac{\delta T}{1 - e^{-\delta T}}\Big[\ell(\rho) + \Big(T + \frac{e^{-\delta T} - 1}{\delta}\Big)\eta\big(\hat P_0 - \bar h\big)\Big]. \tag{6.17}$$
Computations are detailed in Appendix A.3. The complexity of the expression for the induced cost leads us to resort to numerical illustrations, see the next Section.

We illustrate here the firms' behavior and provide some orders of magnitude of social costs in the various policy schemes above. As a reference situation, we consider an objective of reduction of 20% of carbon emissions ($\rho = 0.8$) over a period of $T = 10$ years. This setup is quite close to the objective of the European Union in their climate policy adopted in 2008 for Phase 2 of the EU ETS, except that we consider a longer period of time, ten years instead of five. We consider the six main sectors covered by the EU ETS (Public Power and Heat, Pulp and Paper, Cement, Lime and Glass, Metals, Oil and Gas, Other), so $N = 6$. The average emission growth rate of the EU 27 members included in the EU ETS is around $N\bar\mu = 2$ Gton per year. The average standard deviation of the emission rate is $\sigma = 0.…$ Gton/year.

The volatility has been estimated on data provided by the European Environment Agency EU Emission Trading System data viewer. For the sake of simplicity, we assume that sectors share an identical emission volatility $\sigma_i = \sigma/N$. To estimate the correlation matrix among sectors, we considered the yearly verified emissions from 2008 to 2012.
The result is an average correlation to the common shock of $k_i = 0.…$, equal for all sectors. The terminal penalty parameter $\lambda$ is chosen so as to ensure that, in the optimal dynamic scheme, the reduction target is reached with a discrepancy of …. Gt. This means that $\lambda$ verifies $N\hat P_0 = 2\lambda|\bar X_T|$ with $|\bar X_T| = 0.…$ Gt, namely $\lambda = 7.… \cdot 10^{-…}$ €/ton$^2$.

The estimation of the marginal abatement cost function per firm is the subject of a lively, ongoing debate amongst economists (see Gillingham and Stock (2018) [14] for an introduction on the topic). Our purpose here is to provide illustrative examples to fix ideas on the difference in social costs across policies. Thus, we base our choice for $h$, $\eta$ on the estimation performed by Gollier (2020) [15], which is in turn based on the MIT Emissions Prediction and Policy Analysis of Morris et al. (2012) [27]. It leads to taking $N\bar h = 25$ €/ton and a nominal value of the flexibility parameter $\eta = 6 \cdot 10^{…}$ ton$^2$/(€ year). Further, we make a sensitivity analysis of the costs as a function of $\eta$.

Figure 2: Simulation of one trajectory under the optimal dynamic policy (OPT), the ETS policy and the MSR-like policy of (a) the total bank accounts, (b) the equilibrium market prices, (c) total emission (with the BAU trajectory), (d) average abatement effort and (e) net allocation minus initial allocation. Parameter values: $T = 10$ years, $N = 6$ sectors, $k_i = k = 0.…$, $\rho = 0.8$, $\mu_i = 2/N$ Gt/year, $\sigma_i = 0.…/\sqrt N$ Gt/year, $\bar h = 25$ €/t, $\eta = 6 \cdot 10^{…}$ ton$^2$/(€ year), $\lambda = 7.… \cdot 10^{-…}$ €/ton$^2$, $\delta = 0.…$. Picture (f): social costs (billion euro) of the different allocation policies as a function of $\eta$; Theorem 5.1 (7) for the optimal dynamic allocation, formula (6.3) for the static (ETS) allocation, Monte-Carlo estimation for the MSR-like mechanism.

Figure 2 illustrates the behaviour of the different policies in the presence of common shocks. With an average growth rate of 2 Gt per year on a 10 year period with a reduction objective of 20%, it leads to an expected reduction of 4 Gt. The optimal dynamic allocation scheme starts by allocating a debt of approximately 5 Gt to the firms, while the static ETS scheme allocates approximately 15 Gt in the system, close to the natural cap of 16 Gt. Both the static and dynamic allocation schemes achieve nearly the same reduction at the end of the regulated period, as the trajectories show. But to reach that end, each scheme follows a different path. Under optimal dynamic allocation policies, total bank accounts, total abatement efforts and market equilibrium price are deterministic while the net cumulated allocation is random. Under the static ETS policy the opposite is true, while under the MSR-like policy each process involved is stochastic. Also, in the static policy the volatility is monotone increasing in time. Indeed, the relation (6.1) gives that
$$d\langle\hat P\rangle_t = \frac{4\lambda^2\sigma^2}{\big(1 + 2\lambda\eta(T-t)\big)^2}\,dt.$$
Hence, compared to its value at time zero, the variance rate at maturity is multiplied by a factor $(1 + 2\lambda\eta T)^2 \approx (2\lambda\eta T)^2$, which explains the large oscillations observed at maturity.
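The growth of the price volatility under the static allocation can be checked numerically. The sketch below integrates the variance rate implied by (6.1) with a midpoint rule, compares the result with the closed form (6.6), and inverts it to recover $\eta$ as in (6.7). The function names and the parameter values are hypothetical, chosen only for illustration in arbitrary units.

```python
def variance_rate(t, lam, eta, sigma2, T):
    """Instantaneous variance rate d<P>_t/dt implied by the static price (6.1)."""
    return 4 * lam ** 2 * sigma2 / (1 + 2 * lam * eta * (T - t)) ** 2


def quad_var_numeric(lam, eta, sigma2, T, steps=100_000):
    """Midpoint-rule approximation of <P>_T, the integral of the variance rate."""
    dt = T / steps
    return sum(variance_rate((k + 0.5) * dt, lam, eta, sigma2, T)
               for k in range(steps)) * dt


def quad_var_closed(lam, eta, sigma2, T):
    """Closed form (6.6)."""
    return 4 * lam ** 2 * sigma2 * T / (1 + 2 * lam * eta * T)


def eta_from_quad_var(qv, lam, sigma2, T):
    """Flexibility parameter recovered from the quadratic variation, as in (6.7)."""
    return (4 * lam ** 2 * sigma2 * T - qv) / (2 * lam * T * qv)


# Hypothetical parameters (not the paper's calibration).
lam, eta, sigma2, T = 0.5, 2.0, 0.04, 10.0
qv_num = quad_var_numeric(lam, eta, sigma2, T)
qv_exact = quad_var_closed(lam, eta, sigma2, T)
eta_back = eta_from_quad_var(qv_exact, lam, sigma2, T)
# Ratio of the variance rate at maturity to its value at time zero.
ratio = variance_rate(T, lam, eta, sigma2, T) / variance_rate(0.0, lam, eta, sigma2, T)
```

With these inputs the numerical and closed-form quadratic variations agree, `eta_back` returns the value of `eta` that was fed in, and `ratio` equals $(1 + 2\lambda\eta T)^2$.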
Figure 2 (f) shows the effect of the choice η − (10 − , − ) on the different allocation schemes. First, we observe the hockey stick form of the costs. When η is in the range (10 ; 10 ), the system is flexible enough so that there is no significant difference in cost among the various allocation policies. When η goes below , the difference in cost exhibits a linear growth. The discrepancy between the static ETS mechanism and the optimal dynamic scheme becomes approximately % ( billion euros compared to an optimal cost of billion). The fact that the MSR-like policy succeeds in getting close to the optimal trajectory of the bank accounts translates into a reduced cost compared to the static allocation. Thus, in the range of values for the flexibility of the system we picked, we find a significant difference in costs between static and optimal dynamic allocation, but not an order of magnitude. The pure tax scheme is excluded from these comparisons because its cost is one order of magnitude higher than the other policies. In fact, even in the base scenario of η = 6 10 we get an expected social cost of the allowances policies around billion euros, while the tax social cost is greater than billion euros.

We find the optimal dynamic allocation processes that achieve a given expected reduction of carbon emissions. They are non-unique, but have the same effect on the system: they induce constant abatement efforts. As a result, the equilibrium market price is also constant. A priori, the regulator is not pursuing any price control mechanism. A posteriori, however, the desired reduction goals naturally lead to that conclusion. The efficiency of optimal dynamic allocation schemes compared to sub-optimal yet intuitive policies strongly depends on the flexibility of the system and on its dependence on common business cycles. Non-uniqueness in the optimal policy suggests that within our framework the regulator can accomplish not only emission reduction at minimal social cost, but also consistently include other features such as the financing of new non-emissive technology. We leave these extensions for future research.
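The dependence of the static-versus-dynamic cost gap on common business cycles can be made concrete with the formulas (6.4) and (6.5). The following sketch, with hypothetical function names and purely illustrative parameter values, evaluates the per-firm gap $\Delta_{stat}/N$ for a growing number of identical firms, with and without correlation to the common shock.

```python
import math


def aggregate_variance(n, sigma_bar, rho_bar):
    """sigma^2 from N^2 sigma^2 = N sigma_bar^2 + rho_bar sigma_bar^2 N (N - 1)."""
    return (n * sigma_bar ** 2 + rho_bar * sigma_bar ** 2 * n * (n - 1)) / n ** 2


def delta_stat(n, sigma_bar, rho_bar, eta, lam, T):
    """Extra social cost of the static allocation, equation (6.4)."""
    sigma2 = aggregate_variance(n, sigma_bar, rho_bar)
    return n * sigma2 / (2 * eta) * math.log(1 + 2 * lam * eta * T)


def per_firm_limit(sigma_bar, rho_bar, eta, lam, T):
    """Large-N limit of delta_stat / N, equation (6.5)."""
    return rho_bar * sigma_bar ** 2 / (2 * eta) * math.log(1 + 2 * lam * eta * T)


# Hypothetical parameters in arbitrary units.
sigma_bar, eta, lam, T = 0.2, 2.0, 0.5, 10.0
for n in (10, 1_000, 1_000_000):
    with_common = delta_stat(n, sigma_bar, 0.3, eta, lam, T) / n
    without_common = delta_stat(n, sigma_bar, 0.0, eta, lam, T) / n
    print(n, with_common, without_common)
```

In this sketch the per-firm gap without common shocks shrinks like $1/N$, while with $\bar\rho > 0$ it approaches the strictly positive limit (6.5).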

A Proofs and computations

The following facts will be used in the proofs.

Lemma A.1.

Let $h$ be a process in $L^2$ and $Y \in L^2(\Omega, \mathcal F_T, P)$. Then
$$\mathbb E\Big[Y\int_0^T h_t\,dt\Big] = \mathbb E\Big[\int_0^T dt\,h_t\,\mathbb E_t[Y]\Big].$$
Namely, the scalar product of $Y$ and the pathwise integral $\int_0^T h_t\,dt$ equals the scalar product in $L^2$ of $h$ and the martingale process closed by $Y$, namely $(\mathbb E_t[Y])_t$.

Proof. The proof is one line:
$$\mathbb E\Big[Y\int_0^T h_t\,dt\Big] = \mathbb E\Big[\int_0^T Y h_t\,dt\Big] = \int_0^T dt\,\mathbb E[Y h_t] = \int_0^T dt\,\mathbb E\big[h_t\,\mathbb E_t[Y]\big] = \mathbb E\Big[\int_0^T dt\,h_t\,\mathbb E_t[Y]\Big],$$
where the equalities from the second onwards follow from Fubini's Theorem and from the properties of conditional expectation. An alternative is to apply integration by parts to the product of the martingale $(\mathbb E_t[Y])_t$ and the bounded variation process $\int_0^\cdot h_t\,dt$. $\square$

The proof of the next Lemma is straightforward.

Lemma A.2. Let $\alpha \in L^2$ be a martingale. Then, consider the martingale $M$ closed by $\int_0^T \alpha_s\,ds$,
$$M_t := \mathbb E_t\Big[\int_0^T \alpha_s\,ds\Big],$$
which also belongs to $L^2$. Then,
$$M_t = \int_0^t \alpha_s\,ds + (T-t)\,\alpha_t,$$
and the dynamics of $M$ are given by
$$dM_t = (T-t)\,d\alpha_t.$$

A.1 Proof of Theorem 3.1

For ease of notation, we drop the dependence on $i$ of the coefficients and of the controls. We split the cost function $J$ in two parts, running cost $C$ and terminal cost $F$:

$$J(\alpha, \beta) = \mathbb{E}\Big[ \int_0^T \Big( h\alpha_t + \frac{\alpha_t^2}{2\eta} + P_t \beta_t + \frac{1}{2\nu} \beta_t^2 \Big) dt \Big] + \mathbb{E}\big[ \lambda X_T^2 \big] = C(\alpha, \beta) + F(\alpha, \beta).$$

From the structure it is apparent that both $C$ and $F$ are differentiable. The differential of the running cost

$$C(\alpha, \beta) := \mathbb{E}\Big[ \int_0^T \Big( h\alpha_t + \frac{\alpha_t^2}{2\eta} + P_t \beta_t + \frac{1}{2\nu} \beta_t^2 \Big) dt \Big]$$

is given by differentiation inside the integral:

$$D_\alpha C_t = h + \frac{\alpha_t}{\eta}, \qquad D_\beta C_t = P_t + \frac{\beta_t}{\nu}.$$

In the terminal penalty $F(\alpha, \beta) := \mathbb{E}[\lambda X_T^2]$, the bank $X_T$ equals

$$X_T = A_T + \int_0^T (\alpha_t + \beta_t) \, dt - \sigma W_T.$$

If we apply the chain rule inside the expectation in $F$, namely to $G(\alpha, \beta) := \lambda X_T^2(\alpha, \beta)$, we get the (non adapted) Fréchet gradient

$$DG_t = 2\lambda X_T \, DX_T,$$

in which $DX_T$ is the gradient of $X_T$ w.r.t. $(\alpha, \beta)$. As $X_T$ is a linear map, $DX_T$ is simply the constant bidimensional process $DX_T = (1 \;\; 1)$. Given the regularity of the problem, we can differentiate $F = \mathbb{E}[G(\alpha, \beta)]$ under the expectation sign. This means that the Fréchet derivative $DF \in L^2 \times L^2$ must verify

$$\mathbb{E}\Big[ \int_0^T DG_t \begin{pmatrix} Y_t^1 \\ Y_t^2 \end{pmatrix} dt \Big] = \mathbb{E}\Big[ \int_0^T DF_t \begin{pmatrix} Y_t^1 \\ Y_t^2 \end{pmatrix} dt \Big] \qquad \forall \, Y^1, Y^2 \in L^2.$$

To find the two components of $DF$, make the scalar product of e.g. the first derivative $D_\alpha G$ with a generic $Y$:

$$\mathbb{E}\Big[ \int_0^T D_\alpha G_t \, Y_t \, dt \Big] = 2\lambda \, \mathbb{E}\Big[ \int_0^T X_T \, Y_t \, dt \Big] = 2\lambda \, \mathbb{E}\Big[ \int_0^T \mathbb{E}_t[X_T] \, Y_t \, dt \Big],$$

where the last equality is Lemma A.1, so that $D_\alpha F_t = 2\lambda \, \mathbb{E}_t[X_T]$. Similarly for $D_\beta G$.
Finally,

$$DF_t = \big( 2\lambda \, \mathbb{E}_t[X_T] \quad 2\lambda \, \mathbb{E}_t[X_T] \big). \tag{A.1}$$

Putting things together, the differential of the cost function $J$ is

$$D_\alpha J = h + \frac{\alpha_t}{\eta} + 2\lambda \, \mathbb{E}_t[X_T], \qquad D_\beta J = P_t + \frac{\beta_t}{\nu} + 2\lambda \, \mathbb{E}_t[X_T]. \tag{A.2}$$

The FOC equations for the generic agent $i$ write

$$D_\alpha J = 0, \qquad D_\beta J = 0. \tag{A.3}$$

Subtracting the two equations above, $\beta_t = \nu \big( h + \frac{\alpha_t}{\eta} - P_t \big)$, which can be substituted into the first equation:

$$h + \frac{\alpha_t}{\eta} + 2\lambda \, \mathbb{E}_t\Big[ A_T + \int_0^T \Big( \alpha_s + \nu \Big( h + \frac{\alpha_s}{\eta} - P_s \Big) \Big) ds \Big] - 2\lambda\sigma W_t = 0, \qquad \forall \, t \geq 0. \tag{A.4}$$

Thus, it is straightforward that the optimal abatement (if it exists) $\hat\alpha$ is a martingale, solution of the equation:

$$\hat\alpha_t = -\eta \Big( h + 2\lambda M_t + 2\lambda \, \mathbb{E}_t\Big[ \int_0^T \Big( \Big( 1 + \frac{\nu}{\eta} \Big) \hat\alpha_s + \nu (h - P_s) \Big) ds \Big] - 2\lambda\sigma W_t \Big). \tag{A.5}$$

In particular, using Lemma A.2 and the definition of $g$, $M$ in (3.1), the initial value of $\hat\alpha$ is:

$$\hat\alpha_0 = -g(0) \Big( \frac{h}{2\lambda} + M_0 + \nu \, \mathbb{E}\Big[ \int_0^T (h - P_t) \, dt \Big] \Big). \tag{A.6}$$

To solve for $\hat\alpha$, we rewrite (A.5) in differential version:

$$d\hat\alpha_t = -\eta \Big\{ 2\lambda \, dM_t + 2\lambda \Big( 1 + \frac{\nu}{\eta} \Big) d\,\mathbb{E}_t\Big[ \int_0^T \hat\alpha_s \, ds \Big] + 2\lambda\nu \, d\,\mathbb{E}_t\Big[ \int_0^T (h - P_s) \, ds \Big] - 2\lambda\sigma \, dW_t \Big\}.$$

By Lemma A.2, $d\,\mathbb{E}_t\big[ \int_0^T \hat\alpha_s \, ds \big] = (T - t) \, d\hat\alpha_t$ and thus

$$\big( 1 + 2\lambda (\eta + \nu)(T - t) \big) \, d\hat\alpha_t = -2\eta\lambda \Big( dM_t + \nu \, d\,\mathbb{E}_t\Big[ \int_0^T (h - P_s) \, ds \Big] - \sigma \, dW_t \Big)$$

or

$$d\hat\alpha_t = -g(t) \Big( dM_t + \nu \, d\,\mathbb{E}_t\Big[ \int_0^T (h - P_s) \, ds \Big] - \sigma \, dW_t \Big), \tag{A.7}$$

where the term between parentheses is (the differential of) a square integrable martingale. The Cauchy problem given by the SDE and the initial condition (A.6) uniquely identifies $\hat\alpha$, since the dynamics of $P$ and $a$ are exogenous here. The optimal trade $\hat\beta_t$ is then

$$\hat\beta_t = \nu \Big( h + \frac{\hat\alpha_t}{\eta} - P_t \Big). \tag{A.8}$$

Note that the optimal trade is a martingale if and only if $P$ is a martingale.
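The step from (A.5) to (A.7) rests on Lemma A.2. As a sanity check, the following minimal Python sketch tests a discrete-time analogue of that lemma. It is an illustration under stated assumptions, not part of the proof: the martingale $\alpha$ is taken to be a symmetric random walk, the integral is replaced by a left Riemann sum, and all function names are hypothetical. In this discrete setting the increment identity reads $M_{k+1} - M_k = (T - t_{k+1})(\alpha_{k+1} - \alpha_k)$.

```python
import random
from itertools import product


def random_walk(n_steps, dt, rng):
    """Symmetric random walk with steps +/- sqrt(dt): a discrete-time martingale."""
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.choice((-1.0, 1.0)) * dt ** 0.5)
    return path


def closed_martingale(alpha, dt):
    """M_k = sum_{j<k} alpha_j*dt + (T - t_k)*alpha_k, the formula of Lemma A.2."""
    n_steps = len(alpha) - 1
    m, running = [], 0.0
    for k, a in enumerate(alpha):
        m.append(running + (n_steps - k) * dt * a)
        running += a * dt  # left Riemann sum of alpha up to t_{k+1}
    return m


def max_increment_gap(n_steps=200, horizon=1.0, seed=0):
    """Largest |dM_k - (T - t_{k+1}) * d(alpha)_k| along one simulated path."""
    dt = horizon / n_steps
    alpha = random_walk(n_steps, dt, random.Random(seed))
    m = closed_martingale(alpha, dt)
    return max(
        abs((m[k + 1] - m[k]) - (n_steps - k - 1) * dt * (alpha[k + 1] - alpha[k]))
        for k in range(n_steps)
    )


def conditional_mean_by_enumeration(prefix, n_steps, dt):
    """Return (closed form, E_k[sum_j alpha_j*dt]) averaging over all equally likely futures."""
    alpha = [0.0]
    for s in prefix:
        alpha.append(alpha[-1] + s * dt ** 0.5)
    k = len(prefix)
    closed = sum(a * dt for a in alpha[:k]) + (n_steps - k) * dt * alpha[k]
    totals = []
    for future in product((-1.0, 1.0), repeat=n_steps - k):
        path = list(alpha)
        for s in future:
            path.append(path[-1] + s * dt ** 0.5)
        totals.append(sum(a * dt for a in path[:n_steps]))
    return closed, sum(totals) / len(totals)
```

Calling `max_increment_gap()` should return a value at floating-point noise level, and `conditional_mean_by_enumeration((1.0, -1.0, 1.0), 8, 0.125)` should return two numbers that agree up to rounding. The enumeration is exponential in the number of remaining steps, so it is only meant for very short horizons.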
To conclude, we rewrite the optimal couple as a function of the state $\hat X$. Start again from the first FOC,

$$h + \frac{\hat\alpha_t}{\eta} + 2\lambda \, \mathbb{E}_t\big[ \hat X_T \big] = 0$$

and rewrite

$$h + \frac{\hat\alpha_t}{\eta} + 2\lambda \hat X_t + 2\lambda \, \mathbb{E}_t\Big[ A_T - A_t + \int_t^T \Big( a_s + \hat\alpha_s + \nu \Big( h + \frac{\hat\alpha_s}{\eta} - P_s \Big) \Big) ds \Big] = 0.$$

An application of Lemma A.2 leads to

$$h + \frac{\hat\alpha_t}{\eta} + 2\lambda \Big( 1 + \frac{\nu}{\eta} \Big) (T - t) \, \hat\alpha_t + 2\lambda \hat X_t + 2\lambda R_t + 2\lambda \, \mathbb{E}_t\Big[ \int_t^T \nu (h - P_s) \, ds \Big] = 0.$$

Finally, $\hat\alpha$ in feedback form reads as

$$\hat\alpha_t = -g(t) \Big( \frac{h}{2\lambda} + \hat X_t + R_t + \nu \, \mathbb{E}_t\Big[ \int_t^T (h - P_s) \, ds \Big] \Big)$$

and this concludes the proof, since $M_0 = R_0$. $\square$

A.2 Proof of Theorem 4.1

(i) Summing up over all $i$ the relations (3.4) and using the market clearing condition,

$$N P_t = \sum_{i=1}^N \Big( h_i + \frac{\alpha_t^i}{\eta_i} \Big). \tag{A.9}$$

From Theorem 3.1 (i),

$$d\alpha_t^i = -g_i(t) \Big( dM_t^i - \sigma_i \, dW_t^i - \nu \, d\,\mathbb{E}_t\Big[ \int_0^T P_s \, ds \Big] \Big). \tag{A.10}$$

Thus,

$$N \, dP_t = -\sum_{i=1}^N \frac{g_i(t)}{\eta_i} \Big( dM_t^i - \nu \, d\,\mathbb{E}_t\Big[ \int_0^T P_s \, ds \Big] - \sigma_i \, dW_t^i \Big),$$

which shows that $P$ must be a martingale. Hence, by Lemma A.2,

$$\Big( N - \sum_i \frac{g_i(t) \, \nu \, (T - t)}{\eta_i} \Big) dP_t = -\sum_{i=1}^N \frac{g_i(t)}{\eta_i} \big( dM_t^i - \sigma_i \, dW_t^i \big),$$

which gives the SDE for the equilibrium price $\hat P$ in (4.3).

For the initial condition of the equilibrium price dynamics, first using the first FOC in (A.4) written for $t = 0$ and substituting $\beta^i$ as a function of $\alpha^i$ and $P$, we find that

$$\alpha_0^i = -g_i(0) \Big( x^i + M_0^i + h_i \Big( \frac{1}{2\lambda} + \nu T \Big) - \nu T P_0 \Big). \tag{A.11}$$

Then substituting the value above in (A.9), we get after tedious computations:

$$P_0 = \frac{1}{N} \sum_{i=1}^N \pi_i(0) \big( \eta_i h_i T - M_0^i \big). \tag{A.12}$$

(ii) The equilibrium price $\hat P$ is given by the market clearing condition

$$\sum_{i=1}^N \hat\beta^i \big( \hat P(a), a \big) = 0.$$
According to Theorem 3.1 (ii), we deduce that:

$$\hat P_t = \frac{1}{N} \sum_{i=1}^N \Big( h_i + \frac{\hat\alpha_t^i}{\eta_i} \Big), \tag{A.13}$$

from which we deduce that the equilibrium price is a martingale. Recalling that

$$\hat\alpha^i(\hat X_t^i, P) = -g_i(t) \Big( \frac{h_i}{2\lambda} + \hat X_t^i + R_t^i + \nu (T - t) h_i - \nu \, \mathbb{E}_t\Big[ \int_t^T P_s \, ds \Big] \Big),$$

and using the fact that $\hat P$ is a martingale, we get that the equilibrium price is given by:

$$\Big[ 1 - \frac{1}{N} \sum_{i=1}^N \frac{\nu \, g_i(t) (T - t)}{\eta_i} \Big] \hat P_t = \frac{1}{N} \sum_{i=1}^N \Big[ h_i - \frac{g_i(t)}{\eta_i} \Big( \frac{h_i}{2\lambda} + \hat X_t^i + R_t^i + \nu (T - t) h_i \Big) \Big]. \tag{A.14}$$

Hence,

$$\Big[ 1 - \frac{1}{N} \sum_{i=1}^N \frac{\nu \, g_i(t) (T - t)}{\eta_i} \Big] \hat P_t = \frac{1}{N} \sum_{i=1}^N \Big[ h_i \Big\{ 1 - \frac{g_i(t)}{\eta_i} \Big( \frac{1}{2\lambda} + \nu (T - t) \Big) \Big\} - \frac{g_i(t)}{\eta_i} \big( \hat X_t^i + R_t^i \big) \Big]. \tag{A.15}$$

Further,

$$1 - \frac{g_i(t)}{\eta_i} \Big( \frac{1}{2\lambda} + \nu (T - t) \Big) = g_i(t) (T - t), \tag{A.16}$$

from which

$$\Big[ 1 - \frac{1}{N} \sum_{i=1}^N \frac{\nu \, g_i(t) (T - t)}{\eta_i} \Big] \hat P_t = \frac{1}{N} \sum_{i=1}^N \frac{g_i(t)}{\eta_i} \big( \eta_i h_i (T - t) - \hat X_t^i - R_t^i \big).$$

Hence,

$$\hat P_t = \frac{1}{N} \sum_{i=1}^N \pi_i(t) \big( \eta_i h_i (T - t) - \hat X_t^i - R_t^i \big).$$

(iii) Using relation (A.10) and the expression of the dynamics of the equilibrium price, the relations are immediate.

(iv) Direct consequence of the initial condition on the controls and on the price. $\square$

A.3 Computations for the MSR-like mechanism

In our MSR-like mechanism, allocations consist of an initial endowment plus a net allocation rate $a^i$ of mean-reverting type:

$$X_0^i := \bar x, \qquad a_t^i = \bar a_t = \delta \Big( \frac{T - t}{T} \bar x - \bar X_t \Big), \qquad \text{for all } i = 1, \ldots, N. \tag{A.17}$$

The goal is finding $\bar x$ to ensure an expected emissions reduction to a factor $\rho$. From Section 5.1, we know that

$$\bar x + \int_0^T \mathbb{E}[\bar a_t] \, dt = \ell(\rho), \tag{A.18}$$

to ensure the desired reduction.
The dynamics of the average bank account verifies

$$d\bar X_t = \Big( \eta (\hat P_t - \bar h) + \delta \Big( \frac{T - t}{T} \bar x - \bar X_t \Big) \Big) dt - d\bar W_t, \qquad \text{with } \bar X_0 = \bar x, \tag{A.19}$$

where we used the expression of the abatement effort rates $\hat\alpha^i$ given by (4.11) and the market clearing condition. The solution is

$$\bar X_t = e^{-\delta t} \bar x + e^{-\delta t} \int_0^t e^{\delta s} \Big[ \delta \frac{T - s}{T} \bar x + \eta (\hat P_s - \bar h) \Big] ds - e^{-\delta t} \int_0^t e^{\delta s} \, d\bar W_s. \tag{A.20}$$

Exploiting martingality of $\hat P$,

$$\mathbb{E}[\bar X_t] = \Big[ \eta (\hat P_0 - \bar h) + \frac{\bar x}{T} \Big] \frac{1 - e^{-\delta t}}{\delta} + \frac{T - t}{T} \bar x.$$

Since the $\hat P_0$ ensuring a reduction to a factor $\rho$ must satisfy $\hat P_0 = \bar h + (1 - \rho) \bar\mu / \eta$, the relation (A.18) becomes

$$\bar x - \Big( \eta (\hat P_0 - \bar h) + \frac{\bar x}{T} \Big) \Big( T + \frac{e^{-\delta T} - 1}{\delta} \Big) = \ell(\rho).$$

So,

$$\bar x = \frac{\delta T}{1 - e^{-\delta T}} \Big[ \ell(\rho) + \Big( T + \frac{e^{-\delta T} - 1}{\delta} \Big) \eta \big( \hat P_0 - \bar h \big) \Big]. \tag{A.21}$$

We now solve for the equilibrium price $\hat P$ corresponding to the MSR-like allocation. According to Proposition 4.1 (1),

$$\hat P_t = f(t) \big( (T - t) \eta \bar h - \bar X_t - \bar R_t \big).$$

The residual expected allocation is

$$\bar R_t = \mathbb{E}_t\Big[ \int_t^T \bar a_s \, ds \Big] = \int_t^T \delta \Big( \frac{T - s}{T} \bar x - \mathbb{E}_t\big[ \bar X_s \big] \Big) ds.$$

Using the solution of the dynamics of $\bar X$ (A.20), we have that, for $s \geq t$,

$$\delta \Big( \mathbb{E}_t\big[ \bar X_s \big] - \frac{T - s}{T} \bar x \Big) = \delta e^{-\delta (s - t)} \Big( \bar X_t - \frac{T - t}{T} \bar x \Big) + \Big[ \eta (\hat P_t - \bar h) + \frac{\bar x}{T} \Big] \big( 1 - e^{-\delta (s - t)} \big),$$

so that

$$\bar R_t = \Big( \frac{T - t}{T} \bar x - \bar X_t \Big) \Big[ 1 - e^{-\delta (T - t)} \Big] - \Big( \eta (\hat P_t - \bar h) + \frac{\bar x}{T} \Big) \Big[ T - t - \frac{1 - e^{-\delta (T - t)}}{\delta} \Big],$$

which we rewrite as

$$\bar R_t = z(t) \, \bar a_t + \Big( \eta \bar h - \frac{\bar x}{T} \Big) \big[ T - t - z(t) \big] - \eta \big[ T - t - z(t) \big] \hat P_t, \qquad \text{with } z(t) := \frac{1 - e^{-\delta (T - t)}}{\delta}.$$
Hence, solving for $\hat P_t$ gives

$$\hat P_t = F(t) \Big[ (1 - \delta z(t)) \Big( \frac{T - t}{T} \bar x - \bar X_t \Big) + z(t) \Big( \eta \bar h - \frac{\bar x}{T} \Big) \Big], \qquad F(t) := \frac{f(t)}{1 - \eta f(t) \big[ T - t - z(t) \big]}.$$

Thus, under this MSR-like mechanism, the market equilibrium follows the dynamics:

$$d\bar X_t = \Big( \eta (\hat P_t - \bar h) + \delta \Big( \frac{T - t}{T} \bar x - \bar X_t \Big) \Big) dt - d\bar W_t, \qquad \bar X_0 = \bar x, \tag{A.22}$$

$$\hat P_t = F(t) \Big[ (1 - \delta z(t)) \Big( \frac{T - t}{T} \bar x - \bar X_t \Big) + z(t) \Big( \eta \bar h - \frac{\bar x}{T} \Big) \Big], \tag{A.23}$$

with $\bar x$ given by (A.21).
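As a numerical sanity check on the MSR-like computations, the following Python sketch evaluates the initial endowment from (A.21) and then verifies the target (A.18) by Euler-integrating the mean of the bank dynamics (A.19), using only the fact that the equilibrium price is a martingale (so its mean stays at its initial value). This is a minimal sketch under stated assumptions: the function names and all parameter values in the usage note are hypothetical and are not the calibration used in the paper.

```python
import math


def msr_initial_endowment(ell, delta, horizon, eta, p0, h_bar):
    """Initial endowment x_bar from (A.21); p0 is the initial equilibrium price."""
    c = eta * (p0 - h_bar)
    k = horizon + (math.exp(-delta * horizon) - 1.0) / delta
    return delta * horizon / (1.0 - math.exp(-delta * horizon)) * (ell + k * c)


def expected_total_allocation(x_bar, delta, horizon, eta, p0, h_bar, n_steps=20000):
    """x_bar + int_0^T E[a_t] dt, with E[X_t] from an Euler scheme on the mean of (A.19)."""
    dt = horizon / n_steps
    c = eta * (p0 - h_bar)  # E[eta * (P_t - h_bar)] is constant since P is a martingale
    mean_x = x_bar          # E[X_0] = x_bar
    total = x_bar
    for step in range(n_steps):
        t = step * dt
        # Expected mean-reverting allocation rate from (A.17)
        mean_a = delta * ((horizon - t) / horizon * x_bar - mean_x)
        total += mean_a * dt
        # The Brownian term has zero mean and drops out of the mean dynamics
        mean_x += (c + mean_a) * dt
    return total
```

For instance, with the purely illustrative values `ell=10.0, delta=0.5, horizon=10.0, eta=2.0, p0=1.5, h_bar=1.0`, computing `x_bar = msr_initial_endowment(...)` and then `expected_total_allocation(x_bar, ...)` should return a value close to `ell`, up to the discretisation error of the Euler scheme.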