Incentives, lockdown, and testing: from Thucydides's analysis to the COVID-19 pandemic
Emma Hubert, Thibaut Mastrolia, Dylan Possamaï, Xavier Warin
September 2, 2020
Abstract
We consider the control of the COVID–19 pandemic via incentives, through either stochastic SIS or SIR compartmental models. When the epidemic is ongoing, the population can reduce interactions between individuals in order to decrease the rate of transmission of the disease, and thus limit the epidemic. However, this effort comes at a cost for the population. Therefore, the government can put into place incentive policies to encourage the lockdown of the population. In addition, the government may also implement a testing policy in order to know more precisely the spread of the epidemic within the country, and to isolate infected individuals. We provide numerical examples, as well as an extension to a stochastic SEIR compartmental model to account for the relatively long latency period of the COVID–19 disease. The numerical results confirm the relevance of a tax and testing policy to improve the control of an epidemic. More precisely, if a tax policy is put into place, even in the absence of a specific testing policy, the population is encouraged to significantly reduce its interactions, thus limiting the spread of the disease. If the government also adjusts its testing policy, less effort is required on the population side, so individuals can interact almost as usual, and the epidemic is largely contained by the targeted isolation of positively–tested individuals.
Key words.
COVID–19, stochastic epidemic models, epidemic control, optimal incentives, moral hazard, compartment model.
AMS 2020 subject classifications.
Primary: 92D30; Secondary: 91B41, 60H30, 93E20.
The authors gratefully acknowledge the support of the ANR project PACMAN ANR–16–CE05–0027 and the FACE Foundation – Thomas Jefferson Fund.
Emma Hubert: LAMA, Université Gustave Eiffel, France.
Thibaut Mastrolia: CMAP, École Polytechnique, Palaiseau, France.
Dylan Possamaï: ETH Zürich, Mathematics department, Switzerland.
Xavier Warin: EDF R&D and FiME, Laboratoire de Finance des Marchés de l'Énergie.

Starting around 430 BC, and known as the first historically well–documented epidemic, the Plague of Athens killed between a quarter and a third of Athenians, as reported by Thucydides. He described the reaction of common Athenians and physicians of the time alike in these terms:

For a while physicians, in ignorance of the nature of the disease, sought to apply remedies; but it was in vain, and they themselves were among the first victims, because they oftenest came into contact with it. No human art was of any avail, and as to supplications in temples, inquiries of oracles, and the like, they were utterly useless, and at last men were overpowered by the calamity and gave them all up. (Jowett [60, Volume I, Book II, pp. 135])

Thucydides analysed the consequences of this epidemic, and concluded that it had led to a moral upheaval for the Athenians, faced with the complete lack of any useful cure. Indeed, they realised that their traditionally used policies (mostly of a religious nature) to face tragedies had no effect on the epidemic, and that in the end, the disease was only stopped thanks to the development of a natural immunity within the population, during the first four years of the epidemic phase. Concerning now more specifically the spread of the disease itself, Thucydides wrote the following:

Appalling too was the rapidity with which men caught the infection; dying like sheep if they attended on one another; and this was the principal cause of mortality.
When they were afraid to visit one another, the sufferers died in their solitude, so that many houses were empty because there had been no one left to take care of the sick; or if they ventured they perished, especially those who aspired to heroism. For they went to see their friends without thought of themselves and were ashamed to leave them, at a time when the very relations of the dying were at last growing weary and ceased even to make lamentations, overwhelmed by the vastness of the calamity. (Jowett [60, Volume I, Book II, pp. 138])

In Thucydides's analysis of the Plague of Athens, we can isolate three fundamental questions that need to be addressed whenever an unknown epidemic occurs.

(1) How can one model a disease when one has, at best, parsimonious information on how it is spreading among the population?
(2) How can one solve the Gordian knot associated with interactions within the population: enjoying on the one hand the presence of others and avoiding isolation and solitude, and on the other hand potentially dramatically spreading the disease?
(3) How can governments and decision–makers incentivise people in order to better control the spread of the epidemic?

The first question is naturally linked to several strands of fundamental research, both for mathematicians and physicians, dealing with the problem of choosing a relevant epidemic model. While the paternity of the first mathematical model designed to describe the evolution of an epidemic is often attributed to Bernoulli, who proposed one for smallpox as early as 1760 in [17], the real mathematical development of the theory had to wait for the 20th century, with fundamental contributions for the development of deterministic models by Hamer [52], Ross [88; 89; 90], Soper [98], and later Kermack and McKendrick [64], McKendrick [75], and Bartlett [13] who proposed one of the first general investigations of the evolution of deterministic interacting systems, which was then applied to epidemiology in Kendall [63].
The previous list is by no means comprehensive, and we refer the interested reader to the monograph by Bailey [12] for more historical details. It was rapidly noticed that deterministic models were insufficient to account for the uncertainty associated with the spread of the disease, and the technical difficulties usually encountered for its detection. This acknowledgement helped nurture the development of stochastic models, whose first instances seem to be traceable to McKendrick [74] and Greenwood [48]. For a precise comparison between deterministic and stochastic models in discrete–time settings, we refer our readers to Bailey [12], Bartlett [14], and Allen and Burgin [5], and to Allen [4] for more up–to–date references and an overview of recent epidemiological models.

We will now describe some specific types of epidemiological models, belonging to the general class of compartmental models, which will be at the heart of our work. The first one considers a sort of worst–case scenario, in which no immunity is developed after infection. This is especially relevant for instance for some sexually transmitted infections, or bacterial diseases. In such a case, infected people can either die of the infection, or be cured and therefore become once more susceptible to contracting the disease. Such models have been coined SIS (for Susceptible–Infected–Susceptible), and consider a population divided into two groups. Susceptible individuals interact with infected ones, and therefore move from one class to the other repeatedly. This model was first discussed in Weiss and Dishon [106], generalising a simpler version by Bailey [11], where it was linked to birth and death interacting processes. It was then further studied by Kryscio and Lefèvre [67], who computed the mean time of extinction of the infection. These discrete–time models were then extended by Nåsell [78; 79], who found the quasi–stationary distribution of a continuous–time stochastic SIS model with no births nor deaths.
More recently, Gray, Greenhalgh, Hu, Mao, and Pan [47] proposed to model a stochastic SIS process in continuous–time, as a solution to a bi–dimensional SDE driven by a Brownian motion. This is the model we will follow in our SIS framework.

As an alternative to this quite pessimistic scenario, one can also assume that an immunity will appear after infection. In that case, we can distinguish three classes: susceptible individuals who can contract the disease, infected people who are currently infected by the disease, and recovered people who have been cured and developed antibodies. Introduced originally by Kermack and McKendrick [64], this so–called SIR model was studied in depth by Anderson and May [8] in a deterministic setting, while stochastic perturbations were introduced by Beretta, Kolmanovskii, and Shaikhet [16]. Modelling a stochastic SIR process as a solution to an SDE driven by a Brownian motion was then proposed in Tornatore, Buccellato, and Vetro [103], and Jiang, Yu, Ji, and Shi [59]. This will be our model choice in this case.

For a more realistic modelling in the case of the COVID–19 disease, and especially to account for the relatively long latent phase of this disease, one could also assume that once a susceptible individual contracts the disease, he does not immediately become contagious. This led us to provide some extensions of our reasoning, in particular to a SEIR model, used by Bacaër [10], Dolbeault and Turinici [32] and Élie, Hubert, and Turinici [38] to model the COVID–19 disease. In this type of model, an intermediary class between susceptible and infected is introduced, usually referred to as the class of exposed individuals. This class makes it possible to model individuals who are infected but not yet infectious.
As for the SIR/SIS models, another variation on this model considers that there is only a partial immunity, and individuals having recovered may revert to the class of susceptible: in this case, the model is usually coined SEIRS. In our framework we need to consider continuous–time stochastic versions of these models, and will therefore use the ones introduced in Mummert and Otunuga [77].

The question of the model being now settled, we can focus more on the second question raised above, which is linked to the spread of the disease through interactions within the population. In classical SIS/SIR models, the infection grows into the population through an incidence rate β, and proportionally to the product of the number of susceptible and infected individuals. In the absence of a cure or a vaccine, this transmission rate appears therefore as the only control variable of individuals or public institutions, in order to reduce the spread of an epidemic. Our take on the second question will therefore be from a control–theoretic perspective. At the heart of this approach is the simple idea that when faced with an epidemic, a perfectly rational population will try to find an equilibrium interaction rate, balancing the need to still connect with others, and the natural fear of spreading the infection itself. This is by no means a new point of view, and papers discussing the use of formal control theory in epidemiology can be dated back to the 70s, see among others Taylor [100], Jaquette [58], Sanders [92], Gupta and Rink [50; 51], Abakuks [1], Morton and Wickwire [76], Wickwire [107], or Sethi and Staats [97]. More recently and closer to our purpose, we can also refer to Behncke [15], Riley et al. [87], who studied the impact of the control of transmission rate on the 2002–2004 SARS outbreak in Hong Kong and on the ways to interfere with the disease spreading, Piunovskiy and Clancy [84], Hansen and Day [53], Fenichel et al.
[39], Kandhway and Kuri [61], Sélley, Besenyei, Kiss, and Simon [96], and more broadly to the monograph by Lenhart and Workman [69].

An important, and slightly unrealistic, aspect of the framework we just described is that the population is perfectly rational. Though it seems reasonable to assume that at least some individuals, being afraid of getting sick, will naturally decrease their interaction rates, it would however clearly be a stretch to assume that all individuals will have access to enough information, compared for instance to public institutions, for them to assess whether they are really acting in a way which is truly beneficial to the population as a whole. This is one of the reasons why quarantine and lockdown measures can be introduced in addition by governments, in order to help slow down a pandemic, when neither a cure nor a vaccine has been developed, and there is a risk for medical facilities to be overwhelmed by a large influx of patients. As should be expected, a significant part of the recent literature on the COVID–19 pandemic has also adopted this point of view, and such measures as well as their medical, societal, and economic impacts are discussed by, among others, Alvarez, Argente, and Lippi [6], Anderson, Heesterbeek, Klinkenberg, and Hollingsworth [9], Colbourn [24], Del Rio and Malani [30], Djidjou-Demasse, Michalakis, Choisy, Sofonea, and Alizon [31], Élie, Hubert, and Turinici [38], Ferguson et al. [40], Fowler, Hill, Levin, and Obradovich [41], Grigorieva, Khailov, and Korobeinikov [49], Hatchimonji, Swendiman, and Seamon [54], Kantner [62], Ketcheson [65], Piguillem and Shi [83], Thunstrom, Newbold, Finnoff, Ashworth, and Shogren [101], Toda [102], or Wilder-Smith, Chiew, and Lee [108].

A telling example in the above list is the report of the Imperial College London by Ferguson et al. [40], which assesses the impact of non–pharmaceutical interventions to reduce the contact rate within a population for the COVID–19 pandemic.
They distinguish between mitigation strategies (i.e. reduction of the peak hospitalisation levels by protecting the most susceptible individuals from getting infected, with shelter–in–place policies or social distancing), and suppression strategies (i.e. aiming at reversing the disease growth with home isolation and social distancing for the entire population). It has been shown that mitigation policies 'might reduce deaths seen in the epidemic by up to half, and peak healthcare demand by two–thirds,' (Ferguson et al. [40, pp. 15]) but will lead to numerous deaths and saturation of health systems. The suppression strategy thus appears in this report as a preferred policy. In light of the issues we have raised, a natural conclusion was, at least for us, that even if a control–theoretic approach to mitigate the impact of an epidemic is clearly desirable, there is a priori no evidence that in face of clear public policies, a population will directly adopt a social distancing behaviour leading to an optimal transmission rate for the welfare of the society. Moreover, in the absence of a system allowing to actually keep track of the level of interaction within the population, governments are faced with a clear situation of moral hazard. Consequently, an incentive policy should […] The present paper proposes to fill this gap by studying how a lockdown policy, seen as a suppression strategy to echo [40], can limit the number of infected people during an epidemic, with uncertainties on the actual number of affected individuals, and on their level of adherence to such a policy. More specifically, we aim at solving this moral hazard problem by finding

(i) the best reaction effort of the population to reduce the interaction given a specific government policy;
(ii) the optimal policy composed of an aggregated tax paid by the population at some fixed maturity, and a testing policy to reduce the uncertainty on the estimated number of infected people.

Footnote: It is worth pointing out here that several countries worldwide have decided to use contact–tracing tools, such as mobile phone apps, designed to help track down subsequent exposures after an infected individual is identified, see for instance Cho, Ippolito, and Yu [23], or Reichert, Brack, and Scheuermann [86]. Using these would in principle erase any possibility of moral hazard, provided that all the population uses the app, and that testing is organised on a massive scale. Even admitting that this would be the case, it remains that these tools have raised complex issues of privacy, see Ienca and Vayena [57] or Park, Choi, and Ko [82], and thus are still extremely polemical. In any case, the incentive–based approach we propose can always be considered as a useful complement to any other adopted strategy.

Footnote: There are a certain number of papers studying disease spreading through the lens of either moral hazard or adverse selection. However, these papers are mostly interested in livestock–related diseases, where producers naturally have private information on preventive measures they may have adopted, prior to contamination (ex ante moral hazard), and may or may not declare whether their herd is infected after contamination (ex post adverse selection). Such issues and the design of appropriate policies are considered for instance in Valeeva and Backus [104], Gramig, Horan, and Wolf [45; 46], but the problem is completely different from the one we are interested in. A notable exception can be found in the work of Carmona and Wang [22, Section 5], where the authors consider an application of their moral hazard theory for agents interacting through a finite state mean–field game to the containment of an epidemic.

As we already mentioned, this problem perfectly fits with a classical principal–agent problem with moral hazard, and boils down to finding a Stackelberg equilibrium between the principal (the leader, here the government) proposing a policy to an agent (the follower, here the population) to interact optimally in order to reduce the spread of the disease. Principal–agent problems have a long history in the economics literature, dating back to, at least, the 60s. It is not our goal here to review the whole literature on the subject, and we refer the interested reader to the seminal books by Laffont and Martimort [68], Bolton and Dewatripont [19], or Salanié [91]. For our purpose here, we will content ourselves with mentioning that this literature regained a strong momentum in the past two decades, where continuous–time models were developed and shown to be more flexible and tractable than the earlier static or discrete–time models. Main contributors in these regards are Holmström and Milgrom [55], Schättler and Sung [95], Sannikov [93], Williams [109], see also the monograph by Cvitanić and Zhang [27]. More recently, Cvitanić, Possamaï, and Touzi [28; 29] developed a general theory allowing to tackle a great number of contract–theory problems, which has then been extended and applied in many different situations. The basic idea is to identify a sub–class of contracts offered by the principal, which are revealing in the sense that the best–reaction function of the agent, and his optimal control, can be computed straightforwardly, and then to prove that restricting one's attention to this class is without loss of generality. With this approach, the problem faced by the principal now becomes a standard optimal control problem. There are however two fundamental assumptions for this theory to work, one of them being a specific structure condition, which enforces that the drift of the process controlled by the agent, meaning here for us the pair (S, I) giving the number of susceptible and infected people in the population, must be in the range of the volatility matrix of this process. This fundamental assumption is not satisfied in our model, because roughly speaking, there is only one Brownian motion driving the two processes, and we therefore cannot directly rely on existing results to tackle our problem. In these so–called degenerate problems, the literature has so far relied on the Pontryagin stochastic maximum principle, see for instance [56], but this requires extremely stringent assumptions, such as linear dynamics, which are automatically precluded for SIS/SIR models. We however prove that in our specific problem, it is possible to identify a whole family of contract representations (unlike the unique one in non–degenerate models), which is different from the one obtained in [29], but which still allows us to re–interpret the problem of the principal as a standard stochastic control problem. As far as we know, ours is the first paper in the literature which uses a dynamic programming approach to solve a degenerate principal–agent problem, and this constitutes our main mathematical contribution.

Footnote: For such extensions and applications, see among others Aïd, Possamaï, and Touzi [2], Alasseur, Farhat, and Saguan [3], El Euch, Mastrolia, Rosenbaum, and Touzi [33], Cvitanić and Xing [26], Élie and Possamaï [35], Élie, Mastrolia, and Possamaï [37], Élie, Hubert, Mastrolia, and Possamaï [36], Kharroubi, Lim, and Mastrolia [66], Lin, Ren, Touzi, and Yang [72], Mastrolia and Ren [73].

Unfortunately, but of course expectedly for a relatively general framework, there is no way to extract explicit results from our model, especially on the shape of optimal controls. It is therefore necessary to perform numerical simulations, by implementing semi–Lagrangian schemes, proposed for the first time by Camilli and Falcone [21], using some truncated high–order interpolators, as proposed by Warin [105]. The numerical results for both SIS and SIR models are conclusive, and confirm the relevance of a tax and testing policy to improve the control of an epidemic. First, in the benchmark case, considered as the case where the government does not put into place a specific policy, the efforts of the population are not sufficient to contain the epidemic. In our opinion, this supports the need for incentives. Indeed, if a tax policy is put into place, even in the absence of a specific testing policy, the population is then encouraged to significantly reduce its interactions, thus containing the epidemic until the end of the period under consideration. However, for a fixed containment period, the population relaxes its effort at the very end, leading to a resumption of the epidemic at that point. Finally, if the government also adjusts its testing policy, less effort is required on the population side, so individuals can interact almost in a business–as–usual fashion, and the epidemic is largely contained by the targeted isolation of positively–tested individuals.
Notations.
We let $\mathbb{N}^\star$ be the set of positive integers, $\mathbb{R}_+ := [0, \infty)$ and $\mathbb{R}_+^\star := (0, \infty)$. We fix a time horizon $T > 0$, set a priori by the government. For every $n \in \mathbb{N}^\star$, $\mathbb{S}^n$ represents the set of $n \times n$ symmetric positive matrices with real entries. We also denote by $\mathcal{C}^n$ the space of continuous functions from $[0, T]$ into $\mathbb{R}^n$, and simplify notations when $n = 1$ by setting $\mathcal{C} := \mathcal{C}^1$. The set $\mathcal{C}^n$ will always be endowed with the topology associated to the uniform convergence on the compact $[0, T]$. For every finite dimensional Euclidean space $E$, and any $n \in \mathbb{N}^\star$, we let $C_b(E, \mathbb{R})$ be the space of bounded, continuous functions from $E$ to $\mathbb{R}$, as well as $C^n_b(E, \mathbb{R})$ the subset of $C_b(E, \mathbb{R})$ of all $n$–times continuously differentiable functions on $E$, with bounded derivatives. For every $\varphi \in C^2_b(E, \mathbb{R})$, we denote by $\nabla \varphi$ its gradient vector, and by $D^2 \varphi$ its Hessian matrix.

In this section, in order to highlight the results we obtained throughout this paper, we present our model in an informal way. We thus detail the compartmental epidemic models we consider to represent the spreading of the virus, i.e., either a SIS or a SIR model. Indeed, at the beginning of an epidemic, it is unlikely that decision–makers, let alone the population, will have sufficient data to conclude that infected individuals become immune to the virus in question once they have recovered. This is particularly the case when the virus is new, as in the case of COVID–19. With this in mind, we concentrate our attention on two classical models in epidemiology: the SIS model, for the case where infected individuals do not develop an immunity to the disease, and can therefore re–contract it, and the SIR model in the opposite case. Our study is therefore able to deal with both models, and one of the important points will be to compare the results obtained for each of them. We insist on the fact that this entire section is informal, and the reader is referred to Section 4 for the rigorous mathematical study.
Some parameters will be common to the considered models. In particular, they both involve four non–negative parameters, $\lambda$, $\mu$, $\beta$ and $\gamma$. The parameters $\lambda$ and $\mu$ represent respectively the birth and (natural) death rates among the population, and therefore reflect the demographic dynamics unrelated to the epidemic, while $\gamma$ represents the death rate associated to the disease. All these parameters are assumed to be constant and exogenous. In most epidemic models, the parameter $\beta$, representing the transmission rate of the disease, is also assumed to be constant and exogenous. Nevertheless, in our framework, we will consider that $\beta$ is endogenous and time–dependent, in order to model the influence that the population can have on this transmission rate.

Footnote: It should be noted that if the length of the epidemic is relatively short in relation to the life expectancy at birth in the concerned country, the demographic dynamics become less relevant and may be dismissed altogether, by setting $\lambda = \mu = 0$. Nevertheless, for the sake of generality, we choose to take these dynamics into account, in order to allow for a straightforward application of our study to other types of epidemics.

The transmission rate $\beta$ depends essentially on two factors: the disease characteristics and the contact rate within the population. Although the population cannot modify the disease characteristics, each individual can choose (or be incentivised) to reduce his/her contact rate with other individuals in the population. We will thus assume that the population can control the transmission rate $\beta$ of the disease, by reducing social interactions. With this in mind, we will denote by $\bar{\beta} > 0$ the baseline transmission rate of the disease, i.e., without any control measures or effort from the population. Unfortunately, reducing social interactions is costly for the population. This cost takes into account both the obvious social cost, due to accrued isolation during the lockdown period, and an economic cost (loss of employment due to the lockdown, ...).
From now on, $\beta$ will thus denote the time–dependent transmission rate of the disease, controlled by the population. More precisely, we fix some constant $\beta^{\max} \geq \bar{\beta}$ representing the maximum rate of interaction that can be considered, and we define $B := [0, \beta^{\max}]$. The process $\beta$ will be assumed to be $B$–valued, and we will denote by $\mathcal{B}$ the corresponding set of processes.

One of the two epidemic models we will study is inspired by the well–known SIS (Susceptible–Infected–Susceptible) compartment model, which mainly considers two classes $S$ and $I$ within the population: the class $S$ represents the 'Susceptible', while the class $I$ represents the 'Infected'. In this model, during the epidemic, each individual can be either susceptible or infected, and $(S_t, I_t)$ denotes the proportion of each category at time $t \geq 0$. More precisely, as in classical SIS models, we assume that an infected individual returns, after recovery, to the class of susceptible individuals, and can therefore re–contract the disease. We denote by $\nu$ the associated rate, which is assumed to be a non–negative constant. We also take into account the demographic dynamics of the population, i.e., births and deaths (related to the considered disease or not), through the previously mentioned parameters $\lambda$, $\mu$ and $\gamma$. To sum up, the model is represented in Figure 1 below, and the (continuous–time) evolution of the disease is described by the following system
$$
\begin{cases}
\mathrm{d}S_t = \big(\lambda - \mu S_t + \nu I_t - \beta_t S_t I_t\big)\,\mathrm{d}t,\\
\mathrm{d}I_t = -\big((\mu + \nu + \gamma) I_t - \beta_t S_t I_t\big)\,\mathrm{d}t,
\end{cases}
\quad \text{for } t \in [0, T], \tag{2.1}
$$
for an initial compartmental distribution of individuals at time 0, denoted by $(s_0, i_0) \in \mathbb{R}_+^2$, supposed to be known.

Figure 1: SIS model with demographic dynamics. Compartments: Susceptible, Infected, Death. Flows: births $\lambda\,\mathrm{d}t$ into Susceptible; infections $\beta_t S_t I_t\,\mathrm{d}t$ from Susceptible to Infected; recoveries $\nu I_t\,\mathrm{d}t$ from Infected to Susceptible; deaths $\mu S_t\,\mathrm{d}t$ from Susceptible and $(\mu + \gamma) I_t\,\mathrm{d}t$ from Infected.
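As an illustration of system (2.1), the following minimal sketch integrates the deterministic SIS dynamics with an explicit Euler scheme. It is not part of the original analysis: it assumes a constant transmission rate `beta` (whereas $\beta$ is a controlled process in the paper), and the function name `simulate_sis`, the step count, and all parameter values are hypothetical choices made only for the example.

```python
def simulate_sis(s0, i0, beta, lam=0.0, mu=0.0, nu=0.1, gamma=0.0,
                 T=100.0, n_steps=10_000):
    """Explicit Euler discretisation of the deterministic SIS system (2.1).

    Assumption (not from the paper): beta is held constant over [0, T].
    Returns a list of (t, S_t, I_t) tuples.
    """
    dt = T / n_steps
    s, i = s0, i0
    path = [(0.0, s, i)]
    for k in range(1, n_steps + 1):
        infection = beta * s * i  # incidence term beta * S * I
        ds = (lam - mu * s + nu * i - infection) * dt
        di = -((mu + nu + gamma) * i - infection) * dt
        s, i = s + ds, i + di
        path.append((k * dt, s, i))
    return path


# Hypothetical parameter values, without demography (lam = mu = gamma = 0).
path = simulate_sis(s0=0.99, i0=0.01, beta=0.5, nu=0.1)
t_end, s_end, i_end = path[-1]
print(f"t={t_end:.1f}, S={s_end:.4f}, I={i_end:.4f}")
```

With $\lambda = \mu = \gamma = 0$, the two drift terms in (2.1) cancel, so the sketch keeps $S_t + I_t$ constant, which gives a simple sanity check on the discretisation.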
The second epidemic model we will focus on is the classical SIR (Susceptible–Infected–Recovered) compartment model. As in the SIS model, the class $S$ represents the 'Susceptible' and the class $I$ represents the 'Infected'. The SIR model is used to describe epidemics in which infected individuals develop immunity to the virus. This therefore involves a third class, namely $R$, representing the 'Recovered', i.e., individuals who have contracted the disease, are now cured, and therefore immune to the virus under consideration. We denote by $\rho$ the recovery rate, which is assumed to be a fixed non–negative constant. Therefore, during the epidemic, each individual can be either susceptible, infected or recovered, and $(S_t, I_t, R_t)$ denotes the proportion of each category at time $t \geq 0$. As in the previously described SIS model, we also take into account the demographic dynamics of the population, through the parameters $\lambda$, $\mu$ and $\gamma$. To sum up, the epidemic scheme is represented in Figure 2, and the (continuous–time) evolution of the disease is described by the following system
$$
\begin{cases}
\mathrm{d}S_t = \big(\lambda - \mu S_t - \beta_t S_t I_t\big)\,\mathrm{d}t,\\
\mathrm{d}I_t = -\big((\mu + \gamma + \rho) I_t - \beta_t S_t I_t\big)\,\mathrm{d}t,\\
\mathrm{d}R_t = (\rho I_t - \mu R_t)\,\mathrm{d}t,
\end{cases}
\quad \text{for } t \in [0, T], \tag{2.2}
$$
for a given initial distribution of individuals at time 0, denoted by $(s_0, i_0, r_0) \in \mathbb{R}_+^3$ and assumed to be known.

Footnote: We refer to Section 4.1.2 for a more precise definition of the set $\mathcal{B}$ of transmission–rate processes, taking into account the information flow in the model.

Figure 2: SIR model with demographic dynamics. Compartments: Susceptible, Infected, Recovery, Death. Flows: births $\lambda\,\mathrm{d}t$ into Susceptible; infections $\beta_t S_t I_t\,\mathrm{d}t$ from Susceptible to Infected; recoveries $\rho I_t\,\mathrm{d}t$ from Infected to Recovery; deaths $\mu S_t\,\mathrm{d}t$ from Susceptible, $(\gamma + \mu) I_t\,\mathrm{d}t$ from Infected, and $\mu R_t\,\mathrm{d}t$ from Recovery.
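As a quick consistency check on system (2.2), which is not stated in the original text, one can sum its three equations. The symbol $N_t$ for the total proportion of living individuals is introduced here for illustration only.

```latex
% Assumption: N_t := S_t + I_t + R_t is notation introduced for this check only.
\begin{align*}
\mathrm{d}N_t
  &= \mathrm{d}S_t + \mathrm{d}I_t + \mathrm{d}R_t \\
  &= \big(\lambda - \mu S_t - \beta_t S_t I_t\big)\,\mathrm{d}t
     - \big((\mu + \gamma + \rho) I_t - \beta_t S_t I_t\big)\,\mathrm{d}t
     + (\rho I_t - \mu R_t)\,\mathrm{d}t \\
  &= \big(\lambda - \mu N_t - \gamma I_t\big)\,\mathrm{d}t .
\end{align*}
```

The infection and recovery terms cancel, since they only move individuals between compartments. The total therefore changes only through births, natural deaths, and disease–related deaths; in particular, when $\lambda = \mu = 0$, it decreases at rate $\gamma I_t$.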
The use of a deterministic model is widespread and generally justified for most epidemics. However, in our case study, and given what is currently happening in many countries, it appears that the number of infected individuals is not so simple to quantify and estimate. Indeed, without a large testing campaign, it seems complicated to know precisely the proportion of infected in the population. This is particularly true in the case of the COVID–19 epidemic: the absence of symptoms for a significant proportion of infected individuals leads to uncertainty about the actual number of susceptible and infected.

As a consequence, it seems more realistic in our study to turn both the SIS and SIR deterministic controlled models previously described into stochastic controlled models. Concerning the deterministic part, the dynamics written in the previous systems remain identical. The volatility is partly represented by a fixed and deterministic parameter $\sigma > 0$, and by a time–dependent process $\alpha$, representing the actions of the government in terms of testing policy. More precisely, in our model, an increase of the number of tests in the population, represented by a decrease of the parameter $\alpha$, leads to a decrease in the volatility of the processes $S$ and $I$. Hence, both the population and the government have a clearer view of the number of susceptible and infected, and thus of the epidemic. However, this strategy comes at an economic cost for the government. We then assume that, without any specific effort of the government, $\alpha$ is equal to 1. We also fix a small parameter $\varepsilon \in (0, 1)$ to consider the subset $A := [\varepsilon, 1]$. The control $\alpha$ of the government is assumed to be $A$–valued, and we denote by $\mathcal{A}$ the corresponding set of processes.

In addition, the testing policy allows the government to isolate individuals with positive test results. Therefore, the control $\alpha$ also has an impact on the effective transmission rate of the disease. More precisely, without any testing policy, i.e. $\alpha = 1$, the government cannot isolate contaminated individuals efficiently. In this case, all infected people spread the disease, and the transmission rate of the virus is given by $\beta$. Conversely, if a testing policy is put into place by the government, i.e. when $\alpha < 1$, we consider that individuals with positive test results can be isolated, and as a consequence fewer infected people spread the disease. In this case, the effective transmission rate is lower. We however do not assume that the impact of the testing policy on the volatility of $S$ and $I$, and on the transmission rate, has the same magnitude. Indeed, we expect a lower reduction of the effective transmission rate, compared to the volatility reduction for a given policy $\alpha$. This should be understood as a manifestation of the fact that it is easier to reduce the uncertainty on the number of infected people than to actually isolate individuals who have been identified as infected. We thus assume a linear dependency with respect to $\alpha$ for the volatility of both $S$ and $I$, while the effective transmission rate is chosen equal to $\beta\sqrt{\alpha}$, so that the number of infected people spreading the disease at time $t$ is given by $\sqrt{\alpha} I_t$.

Footnote: The lower bound $\varepsilon$ is here to insist on the fact that it is not possible, or prohibitively expensive, to cancel completely the uncertainty linked to the disease's dynamics, by taking $\alpha$ to be 0.

Footnote: We refer to Section 4.1 for the rigorous definition of the set $\mathcal{A}$.
We can now consider the SIS model previously defined by (2.1), but in its stochastic version: the number of infected, and therefore the number of susceptible, are impacted at each time $t$ by a Brownian motion $W_t$. More precisely, the dynamics of the epidemic are now given by the following system
$$\begin{aligned}
S_t &= s_0 + \int_0^t \big(\lambda - \mu S_s + \nu I_s - \beta_s \sqrt{\alpha_s}\, S_s I_s\big)\,\mathrm{d}s + \int_0^t \sigma \alpha_s S_s I_s\,\mathrm{d}W_s, \quad t \in [0,T],\\
I_t &= i_0 - \int_0^t \big((\mu + \nu + \gamma) I_s - \beta_s \sqrt{\alpha_s}\, S_s I_s\big)\,\mathrm{d}s - \int_0^t \sigma \alpha_s S_s I_s\,\mathrm{d}W_s, \quad t \in [0,T].
\end{aligned} \tag{2.3}$$

Similarly to the SIS model, we consider that the deterministic SIR model described by (2.2) is also subject to noise in the estimation of the proportion of susceptible and infected individuals. Inspired by the stochastic SIR model in Tornatore, Buccellato, and Vetro [103], the dynamics of the epidemic are now given by the following system
$$\begin{aligned}
S_t &= s_0 + \int_0^t \big(\lambda - \mu S_s - \beta_s \sqrt{\alpha_s}\, S_s I_s\big)\,\mathrm{d}s + \int_0^t \sigma \alpha_s S_s I_s\,\mathrm{d}W_s, \quad t \in [0,T],\\
I_t &= i_0 - \int_0^t \big((\mu + \rho + \gamma) I_s - \beta_s \sqrt{\alpha_s}\, S_s I_s\big)\,\mathrm{d}s - \int_0^t \sigma \alpha_s S_s I_s\,\mathrm{d}W_s, \quad t \in [0,T],\\
R_t &= r_0 + \int_0^t (\rho I_s - \mu R_s)\,\mathrm{d}s, \quad t \in [0,T].
\end{aligned} \tag{2.4}$$

Note that the proportion $R$ of individuals in recovery is also uncertain, but only through its dependency with respect to $I$. More precisely, we assume that there is no uncertainty on the recovery rate $\rho$, implying that if the proportion of infected individuals is perfectly known, the proportion of recovered is also known without uncertainty. This modelling choice is consistent with most stochastic SIR models, and emphasises that the major uncertainty in the current epidemic is related to the non–negligible proportion of (nearly) asymptomatic individuals. Indeed, an asymptomatic individual may be mis–classified as susceptible. This is also the case for an individual in recovery who has been asymptomatic, but the uncertainty is solely related to the fact that he was not classified as infected when he actually was.

In order to provide a unified framework for both the SIS and SIR models, and simplify the presentation, we will consider the following dynamics for the epidemic
$$\begin{aligned}
S_t &= s_0 + \int_0^t \big(\lambda - \mu S_s + \nu I_s - \beta_s \sqrt{\alpha_s}\, S_s I_s\big)\,\mathrm{d}s + \int_0^t \sigma \alpha_s S_s I_s\,\mathrm{d}W_s, \quad t \in [0,T],\\
I_t &= i_0 - \int_0^t \big((\mu + \nu + \gamma + \rho) I_s - \beta_s \sqrt{\alpha_s}\, S_s I_s\big)\,\mathrm{d}s - \int_0^t \sigma \alpha_s S_s I_s\,\mathrm{d}W_s, \quad t \in [0,T],\\
R_t &= r_0 + \int_0^t (\rho I_s - \mu R_s)\,\mathrm{d}s, \quad t \in [0,T].
\end{aligned} \tag{2.5}$$
Notice that to recover the SIS model, one has to set $\rho = 0$, and conversely, $\nu = 0$ to obtain the SIR model.

In addition to the choice of a testing policy, the government can also incentivise the population to limit their social interactions, in order to decrease the transmission rate of the disease, by introducing financial penalties. More precisely, at time $0$, the government informs the population about its testing policy $\alpha \in \mathcal{A}$, as well as its fine policy $\chi \in \mathcal{C}$, for the lockdown period $[0,T]$. Knowing this, the population will choose an interacting behaviour according to the following rules:
(i) an increase in the tax lowers its utility;
(ii) an increase in the level of interaction (up to a specific threshold, namely $\overline{\beta}$) improves its well–being;
(iii) the population is scared of having a large number of infected people.

See Section 4.1.3 for a rigorous definition of the set $\mathcal{C}$ of admissible fine policies.

2.3.1 Population optimisation problem

We stylise the previous facts by considering that the population solves the following optimal control problem, for a given pair $(\alpha, \chi) \in \mathcal{A} \times \mathcal{C}$
$$V_0^{\mathrm{A}}(\alpha, \chi) := \sup_{\beta \in \mathcal{B}} \mathbb{E}\bigg[\int_0^T u(t, \beta_t, I_t)\,\mathrm{d}t + U(-\chi)\bigg], \tag{2.6}$$
where $u : [0,T] \times B \times \mathbb{R}_+ \longrightarrow \mathbb{R}$ and $U : \mathbb{R} \longrightarrow \mathbb{R}$ are continuous functions in all their arguments, and $U$ is a bijection from $\mathbb{R}$ to $\mathbb{R}$.
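To make the dynamics (2.5) more concrete, the following minimal sketch discretises them with an Euler–Maruyama scheme, under the simplifying assumption that the contact rate and the testing parameter are held constant. The rates $\gamma = 0.01$, $\rho = 0.1$, $\lambda = \mu = \nu = 0$ and $T = 200$ follow the SIR setting used later in the paper; the function name `simulate_epidemic`, the volatility `sigma`, the initial state and the two constant controls are illustrative assumptions, not values taken from the paper.

```python
import math
import random


def simulate_epidemic(beta, alpha, s0, i0, r0, T=200.0, n_steps=2000,
                      lam=0.0, mu=0.0, nu=0.0, gamma=0.01, rho=0.1,
                      sigma=0.1, seed=0):
    """Euler-Maruyama discretisation of (2.5) with constant controls.

    beta  : contact rate chosen by the population (held constant here)
    alpha : testing parameter in [eps, 1]; alpha = 1 means no testing
    sigma, the initial state and the constant controls are assumptions.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    s, i, r = s0, i0, r0
    path = [(0.0, s, i, r)]
    for n in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        new_inf = beta * math.sqrt(alpha) * s * i  # effective transmission term
        noise = sigma * alpha * s * i * dw         # same noise, opposite signs in S and I
        ds = (lam - mu * s + nu * i - new_inf) * dt + noise
        di = (new_inf - (mu + nu + gamma + rho) * i) * dt - noise
        dr = (rho * i - mu * r) * dt
        s, i, r = s + ds, i + di, r + dr
        path.append(((n + 1) * dt, s, i, r))
    return path


# Hypothetical comparison: no testing (alpha = 1) versus a testing policy (alpha = 0.25).
no_testing = simulate_epidemic(beta=0.25, alpha=1.0, s0=0.999, i0=0.001, r0=0.0)
with_testing = simulate_epidemic(beta=0.25, alpha=0.25, s0=0.999, i0=0.001, r0=0.0)
peak_no_testing = max(i for _, _, i, _ in no_testing)
peak_with_testing = max(i for _, _, i, _ in with_testing)
```

Since the Brownian increments enter $S$ and $I$ with opposite signs, the sum $S + I$ carries no noise in this discretisation, which mirrors the degeneracy of the model discussed in the text.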
Given a pair $(\alpha, \chi)$, the set of optimal contact rates $\beta$ will be denoted $\mathcal{B}^\star(\alpha, \chi)$.

The functions $u$ and $U$ should be interpreted as functions translating respectively the actual value of interaction from the point of view of the population, and the disutility associated to the fine. More precisely, the function $U$ is assumed to be an increasing function, according to (i) above. Concerning the function $u$, it should be non–decreasing in the second variable up to $\overline{\beta}$, and then non–increasing, modelling (ii) above. On the other hand, the function $u$ is assumed to be non–increasing with respect to the proportion of infected individuals in the population. In particular, this allows us to take into account both the fear of the infection (as mentioned in (iii) above) and the cost that is incurred if an individual is infected. From the population's point of view, this cost is not actually expressed in terms of money, but mainly corresponds to medical side effects or general morbidity. We refer to Anand and Hanson [7], Zeckhauser and Shepard [110] and Sassi [94], for an introduction to QALY/DALY (Quality– and Disability–Adjusted Life–Year), the generic measures of disease burden used in economic evaluation to assess the value of medical interventions.

We choose to normalise the utility of the population to zero when there is no epidemic. In other words, if $i_0 = 0$, then $I_t = 0$ for all $t \in [0,T]$, and thus the utility of the population should be equal to $0$. With this in mind, we assume that $U(0) = 0$, which means that without a fine, the population does not suffer any disutility. Moreover, when there is no epidemic, the population should not reduce its social interaction, meaning that for all $t \in [0,T]$, $\beta_t = \overline{\beta}$. This leads us to assume that $u(t, \overline{\beta}, 0) = 0$, for all $t \in [0,T]$.

Example 2.1 (Utility functions for the population). As previously mentioned, the function $u : [0,T] \times B \times \mathbb{R}_+ \longrightarrow \mathbb{R}$ represents the social cost of the lockdown policy, and thus should capture the two rules (ii) and (iii), as well as satisfy $u(t, \overline{\beta}, 0) = 0$ for all $t \in [0,T]$. In particular, we could consider a separable utility function $u$ of the form
$$u(t, b, i) := -u_I(t, i) - u_\beta(t, b), \quad (t, b, i) \in [0,T] \times B \times \mathbb{R}_+, \tag{2.7}$$
where the function $u_I$ represents the fear of the infection for the population. In order to choose this function, we would like to model the fact that when the proportion of infected is close to $0$, the population underestimates the epidemic, while when this proportion becomes large, the population becomes irrationally afraid. Therefore, we can consider a function independent of $t$, and take
$$u_I(t, i) = c_p\, i^3, \quad (t, i) \in [0,T] \times \mathbb{R}_+, \quad \text{for some } c_p \geq 0. \tag{2.8}$$
Next, the function $u_\beta$ represents the sensitivity of the population with respect to the initial transmission rate $\overline{\beta}$ of the disease, i.e., without any lockdown measure. During the lockdown period, the social cost of distancing measures becomes more and more important for the population, and we thus expect the cost $u_\beta$ to also reflect this sensitivity with respect to time. More precisely, we can consider two particular functions to model these stylised facts
(i) either $u_\beta(t, b) := \eta_p \psi(t) (\overline{\beta} - b)^2 / 2$, $(t, b) \in [0,T] \times B$, for some $\eta_p > 0$, to insist on the fact that it is costly for the population to deviate from its usual contact rate, i.e. its level of interactions in an epidemic–free environment, inducing the natural transmission rate of the disease $\overline{\beta}$;
(ii) or $u_\beta(t, b) := \eta_p \psi(t) \big(\overline{\beta}^{\eta_p} b^{-\eta_p} - 1\big)$, $(t, b) \in [0,T] \times B$, for some $\eta_p > 0$, to emphasise that it is very costly, if not impossible, to reduce the level of interaction between the population to zero, and thus to prevent the transmission of the disease.
In the two previous forms, $\psi$ is a non–decreasing and convex $\mathbb{R}_+$–valued function, to represent the increasing aversion to the lockdown for the population as time passes.
Indeed, the longer the lockdown period, the more sensitive the population is to the social cost of distancing measures. In other words, deviating from its usual level of interaction entails a social cost to the population that is greater as the duration increases. More precisely, we can imagine that the function $\psi$ takes the form $\psi(t) := \mathrm{e}^{\tau_p t}$, $t \in [0,T]$, for some $\tau_p > 0$.

Finally, concerning the utility of the Agent with respect to the tax $\chi$, we choose a mixed CARA – risk–neutral utility function
$$U(x) := \frac{1 - \mathrm{e}^{-\theta_p x}}{\theta_p} + \phi_p x, \quad x \in \mathbb{R},$$
where $\theta_p > 0$ is the risk–aversion of the population, and $\phi_p > 0$, so that $U(0) = 0$, and $U$ is an increasing and strictly concave bijection from $\mathbb{R}$ to $\mathbb{R}$. For later use, we record that the inverse of $U$, denoted by $U^{(-1)}$, has an explicit form (see Corless, Gonnet, Hare, Jeffrey, and Knuth [25] for more details about the LambertW function)
$$U^{(-1)}(y) := \frac{1}{\theta_p} \mathrm{LambertW}\Big(\phi_p^{-1}\, \mathrm{e}^{\frac{1 - \theta_p y}{\phi_p}}\Big) + \frac{\theta_p y - 1}{\theta_p \phi_p}, \quad y \in \mathbb{R}.$$

Once again, the reader is referred to Section 4.1.3, and more precisely to Equation (4.8), for a rigorous definition of $\mathcal{B}^\star$.

Before turning to the principal–agent problem itself, we aim at solving (4.7) for $\alpha = 1$ fixed, and $\chi = 0$, i.e. without tax and testing policy. Similar problems have been studied in, for instance, Kandhway and Kuri [61]. Mathematically speaking, the optimisation problem faced by the population without contract is informally given by
$$V_0^{\mathrm{A}}(1, 0) = \sup_{\beta \in \mathcal{B}} \mathbb{E}\bigg[\int_0^T u(t, \beta_t, I_t)\,\mathrm{d}t\bigg], \tag{2.9}$$
since we assumed $U(0) = 0$. Notice that by assumption on the function $u$, in the no–epidemic case, i.e., if $i_0 = 0$, the population should not make any effort, and therefore the optimal contact rate $\beta$ over the period $[0,T]$ is equal to $\overline{\beta}$. We thus consider in the following a fixed initial condition $(s_0, i_0) \in (\mathbb{R}_+^\star)^2$, which implies that for all $t \in [0,T]$, both $S_t$ and $I_t$ are (strictly) positive.

Without tax, the population's problem boils down to a standard control problem, with two state variables $S$ and $I$. We will give the associated PDE in Section 2.4.1 below.

One of the main theoretical results of our study is given by Theorem 4.7. Informally, this theorem states that given an admissible contract, namely a testing policy $\alpha \in \mathcal{A}$ and a tax $\chi \in \mathcal{C}$, there exist a unique $Y_0$ and $Z$ such that the following representation holds
$$U(-\chi) = Y_0 - \int_0^T \Big(Z_t (\mu + \nu + \gamma + \rho) I_t + u(t, \beta_t^\star, I_t) - \beta_t^\star \sqrt{\alpha_t}\, S_t I_t Z_t\Big)\,\mathrm{d}t - \int_0^T Z_t\,\mathrm{d}I_t, \tag{2.10}$$
where $\beta^\star$ is the unique optimal contact rate for the population. More precisely, we can state that for (Lebesgue–almost every) $t \in [0,T]$, $\beta_t^\star := b^\star(t, S_t, I_t, Z_t, \alpha_t)$ is the maximiser of the function $b \in B \longmapsto u(t, b, I_t) - b \sqrt{\alpha_t}\, S_t I_t Z_t$. Under some assumptions for existence and smoothness of the inverse of the function $U$, the previous equation gives a representation for the tax $\chi$.

Based on (2.10), the tax $\chi$ will be indexed on the variation of the proportion of infected $I$, through the stochastic integral $\int_0^\cdot Z_s\,\mathrm{d}I_s$, and not on the variation of susceptible $S$ (though it is indexed on $S$ through the $\mathrm{d}t$ integral). Nevertheless, using the link between the dynamics of $I$ and $S$, namely $\mathrm{d}(S_t + I_t) = \big(\lambda - \mu S_t - (\mu + \gamma + \rho) I_t\big)\,\mathrm{d}t$, we can write a representation equivalent to (2.10)
$$U(-\chi) = Y_0 - \int_0^T \Big(u(t, \beta_t^\star, I_t) - \beta_t^\star \sqrt{\alpha_t}\, S_t I_t Z_t + Z_t (\lambda - \mu S_t + \nu I_t)\Big)\,\mathrm{d}t + \int_0^T Z_t\,\mathrm{d}S_t. \tag{2.11}$$
Through this equation, we can state that the tax can be indexed on $S$ instead of $I$. Therefore, given the strong link between the number of Susceptible and the number of Infected, it is sufficient to index the tax on only one of these two quantities, and one can therefore choose indifferently to index the tax $\chi$ on the variations of $I$ or $S$.

The reader familiar with contract theory in continuous–time will have noticed that the previous representation for the tax $\chi$ is not exactly the expected one. Indeed, referring for instance to Cvitanić, Possamaï, and Touzi [29], the contract is usually the sum of three components:
(i) a constant similar to $Y_0$, chosen by the Principal in order to satisfy the participation constraint of the Agent;
(ii) an integral with respect to time $t \in [0,T]$ of the Agent's Hamiltonian;
(iii) a stochastic integral with respect to the controlled process, i.e., in our framework, $(S, I)$.

Neither representation (2.10) nor (2.11) is, a priori, of this form. This difference is due to the fact that the dynamics of $(S, I)$ are degenerate. More precisely, there is a fundamental structure condition in [29] requiring that the drift of the output process belongs to the range of its volatility. In words, defining for $(s, i) \in \mathbb{R}^2$ and $(a, b) \in A \times B$,
$$\sigma(i, s, a) := \sigma a s i \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad \text{and} \quad \lambda(s, i, b, a) := \begin{pmatrix} \lambda - \mu s + \nu i - b \sqrt{a}\, s i \\ -(\mu + \nu + \gamma + \rho) i + b \sqrt{a}\, s i \end{pmatrix},$$
the condition assumed in [29, Equation (2.1)] implies that
$$\lambda(s, i, b, a) \propto \sigma(i, s, a), \quad \text{for any } (s, i, a, b) \in \mathbb{R}^2 \times A \times B,$$
which is obviously impossible here. Therefore, we cannot use directly any existing result in the literature, and we should not expect, a priori, to be able to obtain a contract representation similar to the one in [29], nor that the so–called dynamic programming approach will prove effective in our case. Indeed, as far as we know, such degenerate models have only been tackled using the stochastic maximum principle, see Hu, Ren, and Touzi [56].

However, and somewhat surprisingly, the form we exhibit for the tax is actually strongly related to the usual representation. The reason for this is twofold. First, up to the sign, the volatilities in the dynamics of both $S$ and $I$ are exactly the same. Second, both the processes $S$ and $I$ are driven by the same Brownian motion $W$. Therefore, intuitively, in order to provide incentives to the population, the government can afford to index the tax on only one of the two processes. Mathematically, it is also straightforward to show that given an arbitrary decomposition of the process $Z$ in Equation (2.10) of the form $Z =: Z^s - Z^i$, we have
$$U(-\chi) = Y_0 - \int_0^T H(t, S_t, I_t, Z_t^s, Z_t^i)\,\mathrm{d}t + \int_0^T Z_t^s\,\mathrm{d}S_t + \int_0^T Z_t^i\,\mathrm{d}I_t,$$
where $H$ is the Hamiltonian of the population, and this is exactly the general form provided in [29]. The main difference is that in [29], $Z^s$ and $Z^i$ are both uniquely given, while in our representation, only their difference actually matters.
Hence, there is an infinite number of possible representations for the tax $\chi$ in our degenerate model.

As already explained, the government can choose the tax $\chi \in \mathcal{C}$ paid by the population together with the testing policy $\alpha \in \mathcal{A}$. It aims at minimising the number of infected people until the end of the quarantine period, and we informally write its optimisation problem as
$$V_0^{\mathrm{P}} := \sup_{(\alpha, \chi) \in \Xi}\ \sup_{\beta \in \mathcal{B}^\star(\alpha, \chi)} \mathbb{E}\bigg[\chi - \int_0^T \big(c(I_t) + k(t, \alpha_t, S_t, I_t)\big)\,\mathrm{d}t\bigg], \tag{2.12}$$
where $c : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ and $k : [0,T] \times A \times \mathbb{R}_+ \times \mathbb{R}_+ \longrightarrow \mathbb{R}$ are continuous functions. The function $c$ denotes the instantaneous cost implied by the proportion of infected people during the quarantine period, and is thus assumed to be non–decreasing, while the function $k$ represents the cost of the testing policy.

In addition, the set $\Xi$ takes into account the so–called participation constraint for the population. This means that the government is benevolent, which translates into the fact that it has committed to ensure that the living conditions of the population do not fall below a minimal level. Mathematically, the government can only implement policies $(\alpha, \chi) \in \mathcal{A} \times \mathcal{C}$ such that $V_0^{\mathrm{A}}(\alpha, \chi) \geq v$, where the minimal utility $v \in \mathbb{R}$ is given. This is what is encoded in the set $\Xi$.

Example 2.2 (Cost functions for the government). The function $c$ can be linear to represent the cost per unit of infected people, or quadratic to highlight the cost induced by the saturation of the health–care system when the number of infected is too high. Typically, we would take $c(i) := c_g (i + i^2)$, $i \in \mathbb{R}_+$, for some $c_g > c_p$, to take into account that the marginal cost linked to the proportion of infected people in the population is higher for the government than for the population itself. We also point out that we choose a linear–quadratic cost in $i$ for the government, while we took a purely cubic one for the population. This choice emphasises that, on the one hand, even for a small number of infected, the marginal cost faced by the government is not close to $0$ (hence the linear term). On the other hand, the population is more likely to incur very high and lasting costs (loss of revenues, employment, life, ...) when the disease spreads uncontrollably, when compared to the government, which mostly faces pecuniary costs.

Concerning the cost function $k$ associated with the testing policy, we recall that $\alpha = 1$ means no testing policy, so no cost for the government. As soon as $\alpha$ is different from $1$, the cost has to be higher. We may consider the following function for the testing policy $k$, for some $\eta_g > 0$ and $\kappa_g > 0$,
$$k(a) := \frac{\kappa_g}{\eta_g} \big(a^{-\eta_g} - 1\big), \quad a \in A. \tag{2.13}$$
This function highlights the fact that it is very costly, if not impossible, to eliminate the uncertainty associated with the epidemic. Indeed, in a relatively populous country, it seems impossible to develop a testing policy sufficient to know exactly the proportion of susceptible and infected people.
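Returning to the utility $U$ of Example 2.1, the closed form of its inverse can be checked numerically. The sketch below is only an illustration: the values `theta = 1.0` and `phi = 0.5` are hypothetical, and the helper `lambert_w` is a simple Newton iteration written for this example (valid for positive arguments only), not a routine taken from the paper or from a library.

```python
import math


def utility(x, theta=1.0, phi=0.5):
    """Mixed CARA / risk-neutral utility U of Example 2.1 (theta, phi are assumptions)."""
    return (1.0 - math.exp(-theta * x)) / theta + phi * x


def lambert_w(z, tol=1e-14, max_iter=100):
    """Principal branch of the Lambert W function for z > 0, by Newton's method."""
    w = math.log1p(z)  # starts at or above the root, so the iterates decrease monotonically
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def utility_inverse(y, theta=1.0, phi=0.5):
    """Closed-form inverse of U recorded in Example 2.1."""
    z = math.exp((1.0 - theta * y) / phi) / phi
    return lambert_w(z) / theta + (theta * y - 1.0) / (theta * phi)


# Round trip x -> U(x) -> U^(-1)(U(x)) on a few sample points.
sample_points = (-2.0, -0.5, 0.0, 0.3, 1.0, 5.0)
round_trip_errors = [abs(utility_inverse(utility(x)) - x) for x in sample_points]
```

For very negative values of `y`, the exponential inside `utility_inverse` can overflow in floating point; a production implementation would need to work with logarithms in that regime.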
Another interesting case to compare our results with corresponds to the so–called first–best case. This is the best–possible scenario where the government can enforce whichever interaction rate $\beta \in \mathcal{B}$ it desires, and simply has to satisfy the participation constraint of the population. From the practical point of view, this could correspond to a situation where the government would be able to track every individual and force them to stop interacting. The problem faced by the government is then
$$V_0^{\mathrm{P,FB}} := \sup_{(\alpha, \chi, \beta) \in \mathcal{A} \times \mathcal{C} \times \mathcal{B}} \mathbb{E}\bigg[\chi - \int_0^T \big(c(I_t) + k(t, \alpha_t, S_t, I_t)\big)\,\mathrm{d}t\bigg], \quad \text{such that } \mathbb{E}\bigg[\int_0^T u(t, \beta_t, I_t)\,\mathrm{d}t + U(-\chi)\bigg] \geq v. \tag{2.14}$$

In this section, we present the main theoretical results obtained when the dynamics of the epidemic are given by (2.5). Recall that, in order to consider the SIS or the SIR model, one has to set respectively $\rho = 0$ or $\nu = 0$.

As mentioned in Section 2.3.2, the benchmark problem is a standard Markovian stochastic control problem, whose Hamiltonian is defined, for $t \in [0,T]$, $(s, i) \in (\mathbb{R}_+^\star)^2$, $p := (p_1, p_2) \in \mathbb{R}^2$ and $M \in \mathbb{S}^2$ by
$$H^{\mathrm{A}}(t, s, i, p, M) := \sup_{b \in B} \big\{-b s i (p_1 - p_2) + u(t, b, i)\big\} + (\lambda - \mu s + \nu i) p_1 - (\mu + \nu + \gamma + \rho) i p_2 + \frac{1}{2} \sigma^2 (s i)^2 (M_{11} - 2 M_{12} + M_{22}). \tag{2.15}$$
We then have the natural identification $V_0^{\mathrm{A}}(1, 0) = v(0, s_0, i_0)$, where $v$ solves the associated Hamilton–Jacobi–Bellman (HJB for short) equation
$$-\partial_t v(t, s, i) - H^{\mathrm{A}}(t, s, i, \nabla v, D^2 v) = 0, \quad (t, s, i) \in \mathcal{D}, \qquad v(T, s, i) = 0, \quad (s, i) \in \mathcal{D}_T, \tag{2.16}$$
where
$$\mathcal{D} := \big\{(t, s, i) \in [0,T) \times (\mathbb{R}_+^\star)^2 : 0 < s + i \leq F(t, s_0, i_0)\big\}, \quad \mathcal{D}_T := \big\{(s, i) \in \mathbb{R}^2 : 0 < s + i < F(T, s_0, i_0)\big\},$$
for a particular function $F$ defined by (4.4) in Section 4.1. Note that if we consider separable utilities with the form
$$u(t, b, i) := -u_I(t, i) - \frac{\eta_p \psi(t)}{2} (\overline{\beta} - b)^2, \tag{2.17a}$$
or
$$u(t, b, i) := -u_I(t, i) - \eta_p \psi(t) \big(\overline{\beta}^{\eta_p} b^{-\eta_p} - 1\big), \tag{2.17b}$$
the maximiser of the Hamiltonian is given by $b^\circ(s, i, p_1 - p_2)$, where the map $b^\circ : (\mathbb{R}_+^\star)^2 \times \mathbb{R} \longrightarrow B$ is defined for all $(s, i, z) \in (\mathbb{R}_+^\star)^2 \times \mathbb{R}$ by
$$b^\circ(s, i, z) := \begin{cases} \beta^{\max}, & \text{if } z < -\dfrac{\eta_p \psi(t)}{s i} (\beta^{\max} - \overline{\beta}), \\[2mm] \overline{\beta} - \dfrac{s i z}{\eta_p \psi(t)}, & \text{if } z \in \bigg[-\dfrac{\eta_p \psi(t) (\beta^{\max} - \overline{\beta})}{s i}, \dfrac{\overline{\beta} \eta_p \psi(t)}{s i}\bigg], \\[2mm] 0, & \text{if } z > \dfrac{\overline{\beta} \eta_p \psi(t)}{s i}, \end{cases} \tag{2.18a}$$
or
$$b^\circ(s, i, z) := \begin{cases} \beta^{\max}, & \text{if } z < \dfrac{\eta_p^2 \psi(t) \overline{\beta}^{\eta_p}}{s i (\beta^{\max})^{\eta_p + 1}}, \\[2mm] \bigg(\dfrac{\eta_p^2 \psi(t) \overline{\beta}^{\eta_p}}{s i z}\bigg)^{\frac{1}{\eta_p + 1}}, & \text{otherwise}. \end{cases} \tag{2.18b}$$
In particular, the optimal interaction rate is given by $\beta_t^\circ = b^\circ\big(S_t, I_t, (\partial_s v - \partial_i v)(t, S_t, I_t)\big)$, $t \in [0,T]$.

To find the optimal interaction rate $\beta \in \mathcal{B}$, as well as the optimal contract $(\alpha, \chi) \in \mathcal{A} \times \mathcal{C}$, in the first–best case, one has to solve the government's problem defined by (2.14). Mathematical details are postponed to Section 4.3.3, but we present here an overview of the main results.

To take into account the inequality constraint in the definition of $V_0^{\mathrm{P,FB}}$, one has to introduce the associated Lagrangian. Given a Lagrange multiplier $\varpi > 0$, we first remark that the optimal tax is constant and given by
$$\chi^\star(\varpi) := -\big(U'\big)^{(-1)}\bigg(\frac{1}{\varpi}\bigg). \tag{2.19}$$
Then, defining for any $\varpi > 0$
$$V(\varpi) := \sup_{(\alpha, \beta) \in \mathcal{A} \times \mathcal{B}} \mathbb{E}\bigg[\int_0^T \big(\varpi u(t, \beta_t, I_t) - c(I_t) - k(t, \alpha_t, S_t, I_t)\big)\,\mathrm{d}t\bigg], \tag{2.20}$$
we have
$$V_0^{\mathrm{P,FB}} = \inf_{\varpi > 0} \Big\{\chi^\star(\varpi) + \varpi \big(U\big(-\chi^\star(\varpi)\big) - v\big) + V(\varpi)\Big\}. \tag{2.21}$$
Note that $V(\varpi)$ is the value function of a standard stochastic control problem, and therefore we expect to have $V(\varpi) = v^\varpi(0, s_0, i_0)$, for a function $v^\varpi : [0,T] \times \mathbb{R}^2 \longrightarrow \mathbb{R}$ solution to the following HJB PDE
$$\begin{cases} -\partial_t v^\varpi(t, s, i) + c(i) - (\lambda - \mu s + \nu i) \partial_s v^\varpi + (\mu + \nu + \gamma + \rho) i\, \partial_i v^\varpi - H^\varpi(t, s, i, \partial v^\varpi, D^2 v^\varpi) = 0, & (t, s, i) \in \mathcal{D}, \\ v^\varpi(T, s, i) = 0, & (s, i) \in \mathcal{D}_T, \end{cases}$$
where the Hamiltonian is defined, for $t \in [0,T]$, $(s, i) \in (\mathbb{R}_+^\star)^2$, $p := (p_1, p_2) \in \mathbb{R}^2$ and $M \in \mathbb{S}^2$ by
$$H^\varpi(t, s, i, p, M) := \sup_{a \in A} \bigg\{\sup_{b \in B} \big\{\varpi u(t, b, i) - b s i \sqrt{a} (p_1 - p_2)\big\} - k(t, a, s, i) + \frac{1}{2} \sigma^2 (s i)^2 a^2 (M_{11} - 2 M_{12} + M_{22})\bigg\}.$$
In particular, if we consider separable utilities with the forms (2.17), for a given testing policy $\alpha \in \mathcal{A}$ and a Lagrange multiplier $\varpi > 0$, the optimal interaction rate is given for all $t \in [0,T]$ by $\beta_t^\varpi = b^\varpi\big(S_t, I_t, \partial v^\varpi(t, S_t, I_t), \alpha_t\big)$, where
$$b^\varpi(s, i, p, a) := b^\circ\big(s, i, \sqrt{a} (p_1 - p_2) / \varpi\big), \quad \text{for all } (s, i, p, a) \in (\mathbb{R}_+^\star)^2 \times \mathbb{R}^2 \times A,$$
recalling that $b^\circ$ is defined by (2.18).

2.4.3 The general case

Thanks to the reasoning developed in Section 4, we are able to determine the optimal design of the fine policy, the optimal testing policy, as well as the optimal effort of the population.

First, as informally explained in Section 2.3.3, to implement a tax policy $\chi \in \mathcal{C}$, the government only needs to choose a constant $Y_0$ and a process $Z$. Given these two parameters, we can state that the optimal contact rate for the population is defined by $\beta_t^\star := b^\star(t, S_t, I_t, Z_t, \alpha_t)$, such that the function $b \in B \longmapsto u(t, b, I_t) - b \sqrt{\alpha_t}\, S_t I_t Z_t$ is maximised for (Lebesgue–almost every) $t \in [0,T]$.

Remark 2.3. Note that if we consider separable utilities with the forms (2.17), the maximiser $b^\star$ is defined for all $(t, s, i, z, a) \in [0,T] \times (\mathbb{R}_+^\star)^2 \times \mathbb{R} \times A$ by $b^\star(t, s, i, z, a) := b^\circ(s, i, z \sqrt{a})$, recalling that $b^\circ$ is defined by (2.18).

It thus remains to solve the government's problem in order to determine the optimal choice of $Y_0$ and $Z$. The reader is referred to Section 4.3 for the rigorous government's problem, but, to summarise the results, the optimal process $Z$ as well as the optimal testing policy $\alpha$ are determined so as to maximise the government's Hamiltonian, given by
$$\begin{aligned} H^{\mathrm{P}}(t, s, i, p, M) = {} & \sup_{z \in \mathbb{R},\, a \in A} \bigg\{b^\star(t, s, i, z, a) \sqrt{a}\, s i (p_2 - p_1) + \frac{1}{2} \sigma^2 a^2 (s i)^2 f(z, M) - k(t, a, s, i) - u^\star(t, s, i, z, a) p_3\bigg\} \\ & + (\lambda - \mu s + \nu i) p_1 - (\mu + \nu + \gamma + \rho) i p_2 - c(i), \end{aligned}$$
for $(t, s, i, p, M) \in [0,T] \times \mathbb{R}^2 \times \mathbb{R}^3 \times \mathbb{S}^3$, and where, in addition, for $z \in \mathbb{R}$,
$$f(z, M) := M_{11} - 2 M_{12} + M_{22} - 2 z (M_{23} - M_{13}) + z^2 M_{33}, \quad \text{and} \quad u^\star(t, s, i, z, a) := u\big(t, b^\star(t, s, i, z, a), i\big).$$
Finally, it remains to solve numerically the following HJB equation, for all $t \in [0,T]$ and $x := (s, i, y) \in \mathbb{R}^3$
$$-\partial_t v(t, x) - H^{\mathrm{P}}\big(t, s, i, \nabla_x v, D_x^2 v\big) = 0, \quad (t, x) \in \mathcal{O}, \qquad v(T, x) = -U^{(-1)}(y), \quad x \in \mathcal{O}_T, \tag{2.22}$$
where the natural domain over which the above PDE must be solved is
$$\mathcal{O} := \big\{(t, s, i, y) \in [0,T) \times \mathbb{R}^2 \times \mathbb{R} : 0 < s + i < F(t, s_0, i_0)\big\}, \quad \mathcal{O}_T := \big\{(s, i, y) \in \mathbb{R}^2 \times \mathbb{R} : 0 < s + i < F(T, s_0, i_0)\big\}.$$

The results presented in Section 2.4 are quite theoretical: except for the optimal transmission rate, it is complicated to obtain explicit formulae for the other variables sought, in particular for the optimal testing policy $\alpha$, even if we consider separable utility functions as in (2.17). It is therefore necessary to perform numerical simulations to evaluate the optimal efforts of the population and the government, as well as the optimal tax policy.
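As a sanity check on the explicit maximiser (2.18a) and on the rescaling of Remark 2.3, the following sketch compares the closed form with a brute–force search over a grid of contact rates, for the quadratic utility (2.17a). All numerical values (`beta_bar`, `beta_max`, `eta_p`, `tau_p`, and the test state) are hypothetical and chosen only for illustration; the term $u_I$ is omitted from the objective since it does not depend on $b$.

```python
import math


def b_circ(t, s, i, z, beta_bar=0.25, beta_max=0.5, eta_p=1.0, tau_p=0.0):
    """Closed-form maximiser (2.18a) of b -> u(t, b, i) - b*s*i*z over [0, beta_max]."""
    psi = math.exp(tau_p * t)
    unconstrained = beta_bar - s * i * z / (eta_p * psi)
    # The objective is concave in b, so clipping reproduces the three cases of (2.18a).
    return min(max(unconstrained, 0.0), beta_max)


def b_star(t, s, i, z, a, **params):
    """Maximiser of Remark 2.3: the testing policy a rescales z by sqrt(a)."""
    return b_circ(t, s, i, z * math.sqrt(a), **params)


def brute_force(t, s, i, z, beta_bar=0.25, beta_max=0.5, eta_p=1.0, tau_p=0.0, n=5001):
    """Grid search over [0, beta_max] for the same objective (u_I dropped)."""
    psi = math.exp(tau_p * t)

    def objective(b):
        return -0.5 * eta_p * psi * (beta_bar - b) ** 2 - b * s * i * z

    grid = [beta_max * k / (n - 1) for k in range(n)]
    return max(grid, key=objective)


# Compare both approaches for values of z covering the three regimes of (2.18a).
checks = []
for z in (-100.0, 0.0, 1.0, 10.0):
    checks.append((z, b_circ(0.0, 0.8, 0.1, z), brute_force(0.0, 0.8, 0.1, z)))
```

With these hypothetical values, a very negative $z$ pushes the contact rate to the upper bound, $z = 0$ returns the usual rate $\overline{\beta}$, and a large positive $z$ drives the contact rate to zero.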
Given the similarities in the results between the SIS and SIR models, only those related to the SIR model are presented in this section. The reader will find in Appendix A the results corresponding to the SIS model.

The following numerical experiments are implemented using the utility and cost functions respectively mentioned in Example 2.1 for the population and in Example 2.2 for the government. To summarise, we choose

for the population:
$$u(t, b, i) := -c_p\, i^3 - \frac{\eta_p \psi(t)}{2} (\overline{\beta} - b)^2, \quad \text{with } \psi(t) := \mathrm{e}^{\tau_p t}, \quad \text{and} \quad U(x) := \frac{1 - \mathrm{e}^{-\theta_p x}}{\theta_p} + \phi_p x, \tag{3.1a}$$
for the government:
$$k(a) := \frac{\kappa_g}{\eta_g} \big(a^{-\eta_g} - 1\big), \quad \text{and} \quad c(i) := c_g (i + i^2), \tag{3.1b}$$
for all $(t, x, s, i) \in [0,T] \times \mathbb{R} \times \mathbb{R}^2$, $a \in A := [\varepsilon, 1]$ and $b \in B := [0, \beta^{\max}]$. These functions require specifying several parameters, provided in Table 1.

(a) Characteristics of the population
Parameters: $c_p$ | $\eta_p$ | $\theta_p$ | $\tau_p$ | $\phi_p$ | $\beta^{\max}$
Values: […] | […] | […] | […] | […] | […]

(b) Characteristics of the government
Parameters: $\kappa_g$ | $c_g$ | $\eta_g$ | $\varepsilon$
Values: 0.001 | 1 | 0.01 | 0.[…]

Table 1: Set of parameters for cost and utility functions

In addition, the set of parameters used for the simulations of the epidemic dynamics given by (2.5) are provided in Table 2 and are inspired by those chosen by Élie, Hubert, and Turinici [38]. Recall that the parameter $\overline{\beta}$ denotes the usual contact rate within the population, before the beginning of the lockdown. In other words, $\overline{\beta}$ represents the initial and effective transmission rate of the disease, without any specific effort of the population. The associated reproduction number $R_0$, commonly defined by $R_0 := \overline{\beta} / (\nu + \rho)$ in the literature on epidemic models, is equal to 2. $\lambda$ and $\mu$ represent respectively the birth and (natural) death rates among the population, and therefore reflect the demographic dynamics unrelated to the epidemic, while $\gamma$ represents the death rate associated to the disease. To simplify, and since the duration of the COVID–19 epidemic should be relatively short in comparison to the life expectancy at birth, we choose to disregard the demographic dynamics by setting $\lambda = \mu = 0$. In contrast, we set $\gamma = 1\%$, since the mortality associated with the disease appears to be significant. Finally, recall that the parameters $\nu$ and $\rho$ correspond respectively to the recovery rates in the SIS and SIR models, i.e., the inverse of the virus contagious period. Since we want to consider here a SIR dynamic, we let $\nu = 0$ and $\rho = 0.1$, to account for the average 10–day duration of COVID–19 disease.

Parameters: $T$ | $(s_0, i_0, r_0)$ | $(\lambda, \mu)$ | $\gamma$ | $\nu$ | $\rho$ | $\sigma$ | $\overline{\beta}$
SIR model (2.4): 200 | (0.[…], […] $\times 10^{-[…]}$, […] $\times 10^{-[…]}$) | (0, 0) | 0.01 | 0 | 0.1 | […] | […]

[…] $v$. To fix this level, we assume that the government wants to ensure at the very least to the population the same living conditions it would have had in the event of an uncontrolled epidemic, i.e., without any effort on the part of either the population or the government, meaning $\beta = \overline{\beta}$, $\alpha = 1$ and $\chi = 0$. Mathematically, this is equivalent to the following, since $u$ is separable of the form (2.7), such that for all $t \in [0,T]$, $u_\beta(t, \overline{\beta}) = 0$ and $u_I$ satisfies (2.8)
$$v := \mathbb{E}^{\mathbb{P}^{1, \overline{\beta}}}\bigg[\int_0^T u(t, \overline{\beta}, I_t)\,\mathrm{d}t + U(0)\bigg] = \mathbb{E}^{\mathbb{P}^{1, \overline{\beta}}}\bigg[-\int_0^T \big(u_I(t, I_t) + u_\beta(t, \overline{\beta})\big)\,\mathrm{d}t\bigg] = -c_p\, \mathbb{E}^{\mathbb{P}^{1, \overline{\beta}}}\bigg[\int_0^T I_t^3\,\mathrm{d}t\bigg]. \tag{3.2}$$
Notice that the reservation utility $v$ is given by the worst case scenario, without any sanitary precaution from either the population or the government. This level may be judged too severe, and one could consider a model where the government is more benevolent. In particular, one could set $v$ closer to the value that the population achieves in the benchmark case, i.e., when it makes optimal efforts in the absence of government policy. Nevertheless, the value of $v$ should not be of major importance, since it should only impact the initial value $Y_0$.

In order to solve Equation (2.16) corresponding to the population's problem in the benchmark case, as well as Equation (2.22) for the government's problem, we need a method able to deal with degenerate HJB equations. We choose to implement semi–Lagrangian schemes, first proposed in Camilli and Falcone [21]. These are explicit schemes using a given time–step $\Delta t$, and requiring interpolation on the grid of points where the equation is solved. This interpolation can be either linear, as proposed in [21], or using some truncated higher–order interpolators, as proposed by Warin [105], leading to convergence of the numerical solution to the viscosity solution of the problem.
A keypoint here, which makes the approach delicate, is that the domain over which the PDEs are solved is unbounded.It is therefore necessary to define a so–called resolution domain, over which the numerical solution will be actuallycomputed, which on the one hand must be large enough, and which on the other hand creates additional difficultiesin the treatment of newly introduced boundary conditions. In order to treat these issues, we use two special tricks:( i ) picking randomly the control in (2.5) for the benchmark case, and in (4.16) for the general case, and using theforward SDE with an Euler scheme, a Monte–Carlo method allows us to get an envelop of the reachable domainwith a high probability at each time–step. Then, given a discretisation step fixed once and for all, the gridof points used by the semi–Lagrangian scheme is defined at each time–step with bounds set by the reachabledomain estimated by Monte–Carlo. Therefore, at time step 0, the grid is only represented by a single mesh,while the number of meshes can reach millions near T ;( ii ) since the scheme is explicit, starting at a given point at date t , it requires to use only some discretisation pointsat date t + ∆ t , and a modification of the general scheme is implemented to use only points inside the grid atdate t + ∆ t , as shown in [105].Lastly, in dimension 3 or above, parallelisation techniques defined in [105] have to be used in order to accelerate theresolution of the problems. The numerical results below are obtained using the StOpt library, see Gevret, Langrené,Lelong, Warin, and Maheshwari [43]. We first focus on the benchmark case, when the government does not implement any particular policy to tackle theepidemic, i.e. , α = 1 and χ = 0. Recall that in this case, the population’s problem is given by (2.9), and is thenequivalent to solving the HJB equation (2.16).For our simulations, we choose a number of time–steps equal to 200, and a discretisation step equal to 0 . 
$b^\circ$ used to maximise the Hamiltonian is discretised with 200 points given a step discretisation of 0 . S and I , meaning with the optimal transmission rate $b^\circ$. In order to check the accuracy of the method described in Section 3.2, we implement two versions of the resolution:
( i ) the first version is a direct resolution of (2.16) with the Hamiltonian (2.15);
( ii ) the second one relies on a change of variable. More precisely, we consider $(s, x := s + i)$ as state variables, instead of $(s, i)$, and then solve the problem (2.16), but with a slightly modified Hamiltonian to take into account this change of variable
$$\widetilde H^{\rm A}(t, s, x, p, M) := \sup_{b \in B} \big\{ - b s (x - s) p_1 + u(t, b, x - s) \big\} + \big(\lambda - (\mu + \nu) s + \nu x\big) p_1 + \big(\lambda - (\mu + \gamma + \rho) x + (\gamma + \rho) s\big) p_2 + \frac{1}{2} \sigma^2 \big(s (x - s)\big)^2 M_{11}.$$
The advantage of the second representation is that the dispersion of $I_t + S_t$ is zero, and thus smaller than that of $I_t$, which allows the use of grids with a smaller number of points.
First, to give an overview of the overall trend, we plot, on Figure 3, 100 trajectories of the optimal interaction rate $\beta^\star$, and the associated proportions $S_t$ and $I_t$ of susceptible and infected, using the resolution method ( i ) mentioned above, i.e. , with state variables ( S, I ). For more accurate trajectories, we compare on Figure 4 two different trajectories of the optimal interaction rate $\beta^\star$, together with the corresponding dynamic of the proportion I of infected. For these two simulations, we compare the results given by the two aforementioned methods. More precisely, while the blue curve is obtained through the direct resolution, the orange one results from the second method, i.e. , with state variables ( S, S + I ). Finally, on Figures 5 and 6, we test the influence of the parameter $\tau_p$ by setting $\tau_p = 0.01$, instead of 0.
Figure 3: 100 simulations with respect to time of the SIR model in the benchmark case (panels: optimal effort $\beta^\star$, proportion I of infected, proportion S of susceptible).
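The reachable-domain estimate used in trick ( i ) of the numerical scheme can be illustrated with a short sketch. This is not the StOpt implementation: it is a minimal NumPy illustration, assuming the benchmark dynamics without testing (α = 1), controls drawn uniformly in [0, beta_max], an envelope defined through empirical quantiles, and purely illustrative parameter values (`lam`, `mu`, `nu`, `gamma`, `rho`, `sigma`, `beta_max`, `q`) that are not the paper's calibration. The function names are hypothetical.

```python
import numpy as np


def reachable_envelope(s0, i0, T=200.0, n_steps=200, n_paths=10_000,
                       lam=0.0, mu=0.0, nu=0.0, gamma=0.0, rho=0.1,
                       sigma=0.1, beta_max=0.2, q=0.001, seed=0):
    """Estimate, by Monte Carlo, bounds of the reachable (S, I) domain.

    Returns an array of shape (n_steps + 1, 4) whose rows contain
    (s_min, s_max, i_min, i_max) at each time step. All default parameter
    values are illustrative assumptions, not the paper's calibration.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    s = np.full(n_paths, float(s0))
    i = np.full(n_paths, float(i0))
    bounds = np.empty((n_steps + 1, 4))
    bounds[0] = (s0, s0, i0, i0)  # at time 0 the domain is a single point
    for n in range(1, n_steps + 1):
        b = rng.uniform(0.0, beta_max, n_paths)     # control picked at random
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)  # Brownian increments
        # Net flow from S to I over one Euler step (drift and noise parts).
        flow = b * s * i * dt - sigma * s * i * dw
        s_next = s + (lam - mu * s + nu * i) * dt - flow
        i_next = i - (mu + nu + gamma + rho) * i * dt + flow
        # The Euler scheme may leave the positive quadrant, so we clip.
        s, i = np.maximum(s_next, 0.0), np.maximum(i_next, 0.0)
        bounds[n] = (np.quantile(s, q), np.quantile(s, 1.0 - q),
                     np.quantile(i, q), np.quantile(i, 1.0 - q))
    return bounds


def grid_sizes(bounds, h):
    """Number of grid points per time step for a fixed mesh size h."""
    n_s = np.ceil((bounds[:, 1] - bounds[:, 0]) / h).astype(int) + 1
    n_i = np.ceil((bounds[:, 3] - bounds[:, 2]) / h).astype(int) + 1
    return n_s * n_i
```

Calling `grid_sizes(reachable_envelope(0.99, 0.01), h=0.005)` returns one grid size per time step, starting from a single point at time 0 and growing as the simulated cloud spreads, which mirrors the growth of the grid described in the text.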
Voluntary lockdown of the population.
As expected, the optimal behaviour $\beta^\star$ starts close to β , and then decreases as the disease spreads in the population. More specifically, two waves of effort can be observed: the first one delays the acceleration of the epidemic, and the second, generally more significant, takes place during the peak of the epidemic. Approaching the fixed maturity, individuals come back to their usual behaviour β . However, even if the population chooses to decrease the interaction rate among individuals, the range of $\beta^\star$ stays quite small, with minimum 0.16 and maximum β = 0 .
Figure 4: The optimal transmission rate β and the resulting proportion I in the benchmark case. Comparison between the two aforementioned methods on two simulations (panels: optimal effort and % of infected, for each simulation).
Sensitivity with respect to the method.
As we can notice in Figure 4 (top), the optimal effort obtained for these two simulations exhibits the same features as those previously described. Moreover, the blue curve and the orange curve, representing respectively the results of the two aforementioned methods, are very close, except at the beginning of the time interval, probably because of the very small initial value $i_0$. Nevertheless, we can see on the bottom graphs that the two methods lead to the same dynamic for the proportion of infected, since the blue and orange curves are almost superimposed. Therefore, a small error in the computation of the optimal effort at the beginning does not impact the optimally controlled trajectories of I . The resolution with respect to ( s, s + i ) seems to be more regular, and may give a control closer to the analytical one.
The fear of the infection is not enough.
Without a proper government policy to encourage the lockdown, the natural reduction of the interaction rate among individuals is not sufficient to contain the disease, so that it spreads with a high infection peak, up to 0 . S at time T = 200 lies between 0 . . 4. In conclusion, without some governmental measures, the fear of the epidemic is not sufficient to encourage the population to make sufficient effort, in order to significantly reduce the rate of transmission of the disease. The introduction by the government of an effective lockdown policy together with an active testing policy should improve the results of the benchmark case, in particular by reducing the peak of infection and the total number of infected people over the considered period.
Figure 5: The optimal transmission rate β and the resulting proportion I with $\tau_p = 0.01$. Comparison between the two methods on two simulations (panels: optimal effort and % of infected, for each simulation).
The lockdown fatigue.
By setting $\tau_p = 0.01$ instead of 0, the cost of the lockdown from the population’s point of view now increases with time. This makes it possible to take into account the fatigue the population may suffer if the lockdown continues for too long. As expected, by comparing Figures 3 and 6, the impatience of the population gives higher values of the optimal interaction rate β . Moreover, comparing Figures 4 and 5, we can see that in both simulations, the second wave of effort is of course more impacted ( i.e. , the contact rate is less reduced) by the impatience of the population than the first one.
Figure 6: 100 simulations of the SIR model in the benchmark case with $\tau_p = 0.01$ (panels: optimal control $\beta^\star$, proportion I of infected, proportion S of susceptible).
We focus in this section on the tax policy, by assuming that A = {1}. In words, we assume that the government does not implement a specific testing policy, which means α = 1 as in the benchmark case, but only encourages the population to lock down through the tax policy χ . In such a situation, i.e. , without a proper testing policy, the detection and hence the isolation of ill people becomes very intricate. The only possibility to regain control of the epidemic is to reduce the interaction rate of the population.
This case is interesting, as it corresponds to the lockdown policy that most Western countries implemented in 2020, when faced with the COVID–19 disease, while a very small number of tests was available. Indeed, most countries put in place systems of fines, or even prison sentences, to incentivise people to lock down. Although the penalties for non–compliance are not as sophisticated as in our model, most governments did adapt the level of penalties according to the stage of the epidemic: higher fines during periods of strict lockdown (hence at the peak of the epidemic), or in case of recidivism, for example.
This reflects the adjustment of sanctions in many countries according to the health situation, and therefore a notion of dynamic adaptation to circumstances, which is exactly what is suggested by our tax system. It is clear that our model differs from reality, since we consider a fine/compensation paid at some terminal time T and equal for each individual, whereas in most countries the fine is paid by a particular individual who has not complied with the injunctions. We nevertheless believe that it highlights sensible guidelines.
The numerical approach is highly similar to the method used to solve the benchmark case. One difference is that we have to estimate the reservation utility of the population, namely v , given by (3.2). Using a Monte–Carlo method and an Euler scheme with a time–discretisation of 200 time–steps and 10 trajectories, we obtain an approximated value v = − . s, i, y ) corresponding to (0 . , . , . × × 800 (for $Z_{\max}$ = 30).
A last technical point concerns the domain of the control Z . Although this control of the government, used to index the tax on the proportion of infected, can take high values, we have to bound its domain in order to perform the numerical simulations. We choose to restrict its domain to an interval $[-Z_{\max}, Z_{\max}]$, and consider a discretisation step equal to 0.5. One would naturally expect that a larger choice of $Z_{\max}$ would lead to somewhat better solutions. However, this neglects a fundamental numerical issue: large values of Z increase the numerical cost, as they enlarge the volatility of the process Y (given by σZIS ). As such, since the volatility cone becomes larger, it is necessary to sample a much larger grid in order to cover the region where Y will most likely take its values. Too large values of $Z_{\max}$ therefore become numerically intractable, unless one is willing to sacrifice accuracy. A balance needs to be struck, which is why we capped $Z_{\max}$ at 30. A sensitivity analysis with respect to variations of $Z_{\max}$ is provided in Figure 8. Though the trajectories of the optimal Z are somewhat impacted, Figure 7 confirms that this has minimal impact on the trajectories of I itself. Indeed, for different values of $Z_{\max}$, the shape of the parameter Z remains the same. More importantly, we will see that the paths of the optimal transmission rate, namely $\beta^\star$, associated with different $Z_{\max}$, are almost superimposed. As a consequence, the dynamic of I also follows almost the same paths independently of $Z_{\max}$.
First, we present in Figure 7 different trajectories of the proportion I of infected when the government implements the optimal tax policy, and compare them to the trajectories obtained in the benchmark case. As mentioned before, we also want to study the sensitivity with respect to the arbitrary bound $Z_{\max}$, and we thus represent the paths of I in three cases, in addition to the benchmark case: for $Z_{\max}$ = 10 (orange curves), $Z_{\max}$ = 20 (green), and $Z_{\max}$ = 30 (red). Then, the corresponding simulations of the optimal control Z of the government, used to index the tax on the proportion of infected, are given in Figure 8. We compare the optimal controls β and Z for the tax policy with different lockdown time periods in Figure 9. Finally, Figure 10 regroups the simulations of the optimal transmission rate $\beta^\star$ obtained with the tax policy, and compares them to $\beta^\circ$ obtained in the benchmark case.
Figure 7: Optimal trajectories of I without testing. Comparison for different values of $Z_{\max}$ and with the benchmark case (six simulations).
The epidemic is at best contained, and at worst delayed.
Compared to the benchmark case, we observe in Figure 7 that the optimal lockdown policy prevents the epidemic peak in most cases, by keeping the epidemic at low levels of infection during the lockdown period. Therefore, the government has more time to prepare for a possible infection peak after the lockdown, specifically to increase hospital capacity and provide safety equipment (surgical masks, hydro–alcoholic gel, respirators...). The government can also use this time to fund the development of tests to detect the virus, as well as the research on a vaccine or a remedy for the related disease.
Nevertheless, we can see that at the end of the lockdown period, in many cases the virus is not eradicated and the epidemic may even restart. This is particularly well illustrated by Figure 11, representing 500 trajectories of I , obtained with the optimal control. Such a phenomenon can be understood as follows: the lockdown slows down the epidemic, so that only a very small proportion of the population has been infected and is therefore immune. We thus cannot rely on herd immunity, which is reached if at least 50% of the population has been contaminated, to prevent a resurgence of the epidemic. Consequently, this lockdown policy is a powerful lever to control an epidemic, but this tool needs to be supplemented by alternative policies, such as those mentioned above, in order to be fully effective. If the time saved through lockdown is not exploited, it will have no impact on the final consequences of the epidemic, measured by the economic and social cost associated with the total number of people infected and deceased during the total duration of the epidemic.
Figure 8: Optimal trajectories of the control Z without testing. Comparison for different values of $Z_{\max}$, with A = {1} (six simulations).
Policy implications
By comparing the graphs in Figure 8, we first remark that the shape of the optimal indexation parameter Z remains the same, regardless of the simulation and the value of $Z_{\max}$. The control takes the most negative value possible ($-Z_{\max}$) for about 20 days, then increases almost instantaneously to reach the maximum value $Z_{\max}$, before slowly decreasing to 0. Therefore, the optimal tax scheme set by the government is as follows. First, at the beginning of the epidemic, it seems optimal to give the population the largest possible compensation (corresponding to a negative tax), by setting $Z = -Z_{\max}$. Though this may be a numerical artefact, given that the initial values of I and its variations are extremely low, the fact that the same phenomenon appeared in virtually all our simulations tends to show that it is actually significant. We interpret this as the government anticipating the negative consequences of the lockdown policy by immediately providing monetary relief to the population. This is exactly what happened in several countries, for instance in the USA with stimulus checks sent to every citizen, and our model endogenously reproduces this aspect. Policy–wise, it also shows that maximum efficiency for such stimulus packages is attained when they are provided to the population as early as possible. After this initial phase, when the epidemic spreads among the population, the government suddenly increases Z , so that the tax becomes positive and is in fact maximum, in order to deter people from interacting.
Approaching the maturity, the government eases the lockdown little by little. However, this end of lockdown may be premature, since we have observed in the previous figures that the epidemic may restart at the end of the considered period. Indeed, considering a final time horizon is equivalent to assuming that ‘the world’ stops at that time: all the potential costs generated by the epidemic after T are not taken into account in the model.
The government thus has no interest in implementing costly measures, whose subsequent impact on the epidemic will not be measured. Nevertheless, this boundary effect has no impact on the previous results and interpretations. Indeed, we remark in the numerical results that if we consider a more distant time T , the lockdown certainly lasts longer, but follows the exact same paths during most of the lockdown period, and its release occurs around the same time before maturity (see Figure 9 below). Moreover, the lockdown period should still end at some time, which is why a finite terminal time is assumed. This time may correspond to an estimate of the time needed to implement other more sustainable policies than lockdown, such as the implementation of an active testing policy, or to hope for the discovery of a vaccine or cure, as mentioned above.
Figure 9: Maturity effect for the tax policy in the SIR model. Comparison of the optimal trajectories of Z for T = 200 and T = 250, with $Z_{\max}$ = 30 (panels: optimal β , optimal control Z ).
Optimal tax sensitivity with respect to the lockdown duration.
On Figure 9, we give two trajectories of the optimal contact rate β (on the left) and the optimal indexation parameter Z (on the right) for two different maturities. It is clear that both trajectories follow the same paths until some point. Regardless of the maturity, the contact rate β and the parameter Z have the same characteristics as those shown respectively in Figures 10 and 8. As one approaches the shortest maturity, i.e. , T = 200, the parameter Z decreases towards 0 for the contract of this maturity, while the other remains at the maximum, and decreases later, as its own maturity approaches. Therefore, the fact that Z decreases at maturity, as mentioned in the paragraph ‘Policy implications’ above, appears to be a boundary effect, since its timing relative to the maturity is not sensitive to the maturity itself.
Figure 10: Optimal transmission rate β without testing. Comparison for different $Z_{\max}$ and with the benchmark case, in the case A = {1} (six simulations).
Optimal interaction rate and comparison with the benchmark case.
We now explain the general trend of the optimal interaction rate. In the beginning, recall that Z is negative, meaning that the tax is negatively indexed on the variation of I . In other words, since I is globally (but very slightly) increasing at the beginning of the epidemic, the compensation increases with I , which means that the population is not incentivised at all to decrease their contact rate, and thus the transmission rate of the virus, which remains equal to the initial level β . Then, as the epidemic spreads, Z becomes very high, which now incentivises the population to reduce the transmission rate below β . Finally, near the end of the lockdown period, Z plunges to zero, which naturally implies that the optimal contact rate $\beta^\star$ goes back to its usual level β . By comparing with the benchmark case, we see that the tax policy succeeds in significantly reducing the interaction rate.
As a consequence, and as we have seen in Figure 7, the tax policy contains the spread of the disease during the considered time period, unlike in the case without intervention of the government.
Figure 11: 500 simulations of the proportion I of infected in the SIR model. Comparison between the case with tax policy (but without testing) on the left (contract case) and the benchmark case on the right.
In this section, we now study the case where the government can implement an active testing policy, in addition to the incentive policy for lockdown, to contain the spread of the epidemic. This policy is similar to the one adopted by most European governments in June 2020, after relatively strict containment periods and at a time when the COVID–19 epidemic seemed to be under control. Indeed, the lockdown periods in Europe have generally made it possible to delay the epidemic, and thus to give public authorities time to prepare a meaningful testing policy by developing and increasing the number of available tests.
This testing policy has two major benefits. First, it allows the identification of clusters, and therefore provides a more precise knowledge of the dynamics of the epidemic in real time in the different territories. Second, by identifying infected people, we can force them to remain isolated, in order to avoid the contamination of their relatives. This policy therefore constitutes another lever, in addition to containment, to reduce the contact rate within the population. Thus, by developing a robust testing policy, public authorities can in fact relax the lockdown while keeping the rate of disease transmission at a sufficiently low level. Therefore, comparing with the no–testing policy case, we expect that
( i ) the government will be able to control the epidemic at least as well as with just the lockdown policy;
( ii ) it will allow the population to regain a contact rate closer to the desired and initial level β .
To study the optimal testing policy $\alpha^\star$, taking values in A := [ ε, k given by (3.1b). This cost function emphasises the fact that testing the entire population every day is inconceivable, and therefore results in an explosion of cost when α takes values close to 0. Recall that the parameters for the function k , namely $\kappa_g$ and $\eta_g$, are given in Table 1b. Finally, A is discretised with a step equal to 0.05 and we consider $Z_{\max}$ = 30.
As we can see from the six selected simulations below, the control Z is very regular (see Figure 12), while the control α is less regular and concentrated at the heart of the epidemic (see Figure 13). Figure 16 gives a global overview of the 500 simulations, which confirms the intuition given by the six selected ones.
Figure 12: Optimal trajectories of Z with testing policy (six simulations).
Figure 13: Optimal trajectories of the testing policy α (six simulations).
Figure 14: Optimal effective transmission rate β√α with testing policy. Comparison between the three cases: the benchmark, with, and without testing (six simulations).
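The trade-off between lockdown and testing can be made concrete with a little arithmetic on the effective transmission rate β√α, which is the quantity plotted in Figure 14. The sketch below uses hypothetical numbers (a usual contact rate of 0.2 and a target effective rate of 0.1, neither taken from the paper) and a hypothetical helper `allowed_contact_rate`. It computes the contact rate the population may keep for several testing intensities α, where α = 1 means no testing and smaller values mean more intensive testing.

```python
import math


def allowed_contact_rate(target_effective_rate, alpha, beta_bar):
    """Largest contact rate beta <= beta_bar with beta * sqrt(alpha) <= target."""
    return min(beta_bar, target_effective_rate / math.sqrt(alpha))


beta_bar = 0.2  # hypothetical usual contact rate
target = 0.1    # hypothetical effective transmission rate to be achieved

for alpha in (1.0, 0.64, 0.36, 0.25):
    beta = allowed_contact_rate(target, alpha, beta_bar)
    effective = beta * math.sqrt(alpha)
    print(f"alpha={alpha:.2f}: beta={beta:.3f}, effective rate={effective:.3f}")
```

In this toy computation, the more intensive the testing, the closer the admissible contact rate gets to its usual level for the same effective transmission rate, which is the qualitative behaviour expected in points ( i ) and ( ii ) above.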
Relaxed lockdown but lower effective transmission rate.
First, comparing Figures 8 and 12, the optimal control Z presents the same shape in both cases, except at the beginning, since now Z is not negative initially. In fact, in this case, we observe that the government asks for less effort from the population, and therefore the initial stimulus mentioned in the paragraph ‘Policy implications’ still happens, but later and for a much shorter period. Figure 15 also shows that the optimal contact rate is closer to the initial level β , which should induce a more violent spread of the disease. Nevertheless, the control α , representing the testing policy and given by Figure 13, balances this effect. Indeed, testing allows the isolation of targeted infected individuals, and therefore contributes to the decrease of the effective transmission rate of the disease, represented in Figure 14. Therefore, comparing Figure 16 with Figure 11, we notice that the control of the epidemic is more efficient than in the case A = {1}, since the proportion of infected is globally decreased.
Figure 15: 500 simulations of the transmission rate with testing policy (panels: optimal contact rate β , effective transmission rate β√α ).
Figure 16: 500 simulations of optimal government’s controls, with the resulting trajectories of I (panels: optimal control α , optimal control Z , proportion I of infected).
First, remark that, with the particular choice of utility functions, we have
$$\chi^\star(\varpi) = \frac{1}{\theta_p} \ln\bigg(\frac{1}{\varpi} - \phi_p\bigg), \text{ if } 0 < \varpi < \frac{1}{\phi_p} = 2.$$
Otherwise, if $\varpi \geq 2$, the optimal tax policy is equal to −∞, which cannot be optimal from the government’s point of view, since it leads to an infimum on $\varpi$ equal to +∞ (see (2.21)). For each value of the Lagrange parameter, a two–dimensional PDE with a two–dimensional control ( α, β ) is considered. A step discretisation for the grid in ( s, i ) is taken equal to (0 . , . A = [ ε, 1] is discretised with 20 values and the values of β are discretised with 80 equally spaced values (to reduce the cost of optimisation). We then search for the optimal $\varpi$ parameter with a step of 0 . , . 64 and we give on Figure 17 the results, which show in particular that the epidemic is controlled in a similar way as in the second–best case, with incentives and testing policy.
Figure 17: 500 trajectories obtained in the first–best case (panels: transmission rate β , testing policy α , proportion of infected I ).
The shape of the optimal controls β and α , as well as the trajectories for the proportion I of infected, are highly similar to those obtained in the previous case. The only clear difference is the principal’s value. Indeed, we can compare the optimal value $V_0^{\rm P}$ for the government in the moral hazard case, to the first–best value $V_0^{\rm P, FB}$. Using 10 trajectories and the previously computed optimal control, we estimate $V_0^{\rm P, FB}$ = − . 249 while $V_0^{\rm P}$ = − . β very close to its usual value β , as shown on Figure 17.
4 Incentive policy for epidemic stochastic models
We fix a small parameter $\varepsilon \in (0, 1)$ to consider the subset $A := [\varepsilon, 1]$, and we denote by $\mathbb{A}$ the set of all finite and positive Borel measures on $[0, T] \times A$, whose projection on $[0, T]$ is the Lebesgue measure. In other words, every $q \in \mathbb{A}$ can be disintegrated as $q(\mathrm{d}s, \mathrm{d}v) = q_s(\mathrm{d}v)\mathrm{d}s$, for an appropriate Borel measurable kernel $(q_s)_{s \in [0,T]}$, meaning that for any $s \in [0, T]$, $q_s$ is a finite positive Borel measure on $A$, and the map $[0, T] \ni s \longmapsto q_s$ is Borel measurable, when the space of measures on $A$ is endowed with the topology of weak convergence. We then define the following canonical space $\Omega := \mathcal{C}^2 \times \mathbb{A}$, whose canonical process is denoted by $(S, I, \Lambda)$, in the sense that
$$S_t(s, \iota, q) := s(t), \; I_t(s, \iota, q) := \iota(t), \; \Lambda(s, \iota, q) := q, \; \forall (t, s, \iota, q) \in [0, T] \times \Omega.$$
We let $\mathcal{F}$ be the Borel σ–algebra on $\Omega$, and $\mathbb{F} := (\mathcal{F}_t)_{t \in [0,T]}$ be the natural filtration of the canonical process
$$\mathcal{F}_t := \sigma\Big( \big(S_s, I_s, \Delta_s(\Upsilon)\big) : (s, \Upsilon) \in [0, t] \times C_b\big([0, T] \times A, \mathbb{R}\big) \Big), \; t \in [0, T],$$
where for any $(s, \Upsilon) \in [0, T] \times C_b([0, T] \times A, \mathbb{R})$,
$$\Delta_s(\Upsilon) := \iint_{[0,s] \times A} \Upsilon(r, a) \Lambda(\mathrm{d}r, \mathrm{d}a).$$
Recall that in this framework $\mathcal{F} = \mathcal{F}_T$. Let $\mathcal{M}$ be the set of probability measures on $(\Omega, \mathcal{F}_T)$. For any $\mathbb{P} \in \mathcal{M}$, we let $\mathcal{N}^{\mathbb{P}}$ be the collection of all $\mathbb{P}$–null sets, that is to say
$$\mathcal{N}^{\mathbb{P}} := \big\{ N \in 2^{\Omega} : \exists N' \in \mathcal{F}_T, \; N \subset N', \; \mathbb{P}[N'] = 0 \big\},$$
where we recall that $2^{\Omega}$ represents the set of all subsets of $\Omega$, and we let $\mathbb{F}^{\mathbb{P}} := (\mathcal{F}_t^{\mathbb{P}})_{t \in [0,T]}$ be the $\mathbb{P}$–augmentation of $\mathbb{F}$, where $\mathcal{F}_t^{\mathbb{P}} := \mathcal{F}_t \vee \sigma(\mathcal{N}^{\mathbb{P}})$. We let $\mathbb{F}^{\mathbb{P}+} := (\mathcal{F}_t^{\mathbb{P}+})_{t \in [0,T]}$ be the corresponding right limit. Similarly, for any subset $\Pi \subset \mathcal{M}$, we let $\mathbb{F}^{\Pi} := (\mathcal{F}_t^{\Pi})_{t \in [0,T]}$ be the $\Pi$–universal completion of $\mathbb{F}$.
Let us now introduce the drift and volatility functions for our controlled model, namely $B : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ and $\Sigma : \mathbb{R}^2 \times A \longrightarrow \mathbb{R}^2$, defined by
$$B(x, y) := \begin{pmatrix} \lambda - \mu x + \nu y \\ -(\mu + \nu + \gamma + \rho) y \end{pmatrix}, \; \Sigma(x, y, a) := \begin{pmatrix} \sigma a x y \\ -\sigma a x y \end{pmatrix}, \; (x, y, a) \in \mathbb{R}^2 \times A,$$
where the parameters $(\lambda, \mu, \nu, \gamma, \sigma) \in [0, \infty)^4 \times \mathbb{R}_+^\star$ are given. For any $(s, \varphi) \in [0, T] \times C_b^2(\mathbb{R}^2, \mathbb{R})$, we set
$$M_s(\varphi) := \varphi(S_s, I_s) - \iint_{[0,s] \times A} \Big( B(S_r, I_r) \cdot \nabla\varphi(S_r, I_r) + \frac{1}{2} \mathrm{Tr}\big[ D^2\varphi(S_r, I_r) \big(\Sigma\Sigma^\top\big)(S_r, I_r, a) \big] \Big) \Lambda(\mathrm{d}r, \mathrm{d}a),$$
and we give ourselves some initial values $(s_0, i_0) \in \mathbb{R}_+^2$.
Definition 4.1.
We define the subset $\mathcal{P} \subset \mathcal{M}$ as the one composed of all $\mathbb{P} \in \mathcal{M}$ such that
( i ) $M(\varphi)$ is an $(\mathbb{F}, \mathbb{P})$–local martingale on $[0, T]$ for all $\varphi \in C_b^2(\mathbb{R}^2, \mathbb{R})$;
( ii ) $\mathbb{P}\big[(S_0, I_0) = (s_0, i_0)\big] = 1$;
( iii ) with $\mathbb{P}$–probability 1, the canonical process $\Lambda$ is of the form $\delta_{\phi_\cdot}(\mathrm{d}v)\mathrm{d}t$ for some Borel function $\phi : [0, T] \longrightarrow A$, where as usual, for any $a \in A$, $\delta_a$ is the Dirac mass at $a$.
We can follow Bichteler [18], or Neufeld and Nutz [80, Proposition 6.6] to define a pathwise version of the density of the quadratic variation of $S$, denoted by $\widehat\sigma : [0, T] \times \Omega \longrightarrow \mathbb{R}$, by
$$\widehat\sigma_t(\omega) := \limsup_{n \to \infty} n \big( \langle S \rangle_t(\omega) - \langle S \rangle_{t - 1/n}(\omega) \big), \; (t, \omega) \in [0, T] \times \Omega.$$
[Footnote: Notice that the initial value $r_0$ of $R$, which appears in the SIR version of the model, is irrelevant at this stage.]
The process
$$W_t := \int_0^t \widehat\sigma_s^{-1/2} \mathbf{1}_{\widehat\sigma_s \neq 0} \, \mathrm{d}S_s, \; t \in [0, T], \quad (4.1)$$
is an $(\mathbb{F}^{\mathbb{P}}, \mathbb{P})$–Brownian motion for any $\mathbb{P} \in \mathcal{P}$. For any $\mathbb{P} \in \mathcal{P}$, we denote by $\mathcal{A}^o(\mathbb{P})$ the set of $\mathbb{F}$–predictable and $A$–valued processes $\alpha := (\alpha_s)_{s \in [0,T]}$ such that, $\mathbb{P}$–a.s.,
$$S_t = s_0 + \int_0^t \big(\lambda - \mu S_s + \nu I_s\big) \mathrm{d}s + \int_0^t \sigma \alpha_s S_s I_s \, \mathrm{d}W_s, \; t \in [0, T],$$
$$I_t = i_0 - \int_0^t (\mu + \gamma + \nu + \rho) I_s \, \mathrm{d}s - \int_0^t \sigma \alpha_s S_s I_s \, \mathrm{d}W_s, \; t \in [0, T]. \quad (4.2)$$
Once again, it is a classical result (see for instance Stroock and Varadhan [99, Theorem 4.5.2], Lin, Ren, Touzi, and Yang [71, Lemma 2.2], or Élie, Hubert, Mastrolia, and Possamaï [36, Lemma 2.3]) that $\mathcal{A}^o(\mathbb{P})$ is not empty.
We recall that the term λ ≥ µ ≥ γ ≥ ν and ρ correspond to recovery rates, depending on whether we are considering a SIS or a SIR model, see the remark below for more details.
Remark 4.2.
From now on, one should have in mind that
( i ) if ρ = 0, the constant ν ≥ 0 is the rate of recovery for infected people, going back in the class of susceptible. This case corresponds to the classical SIS model whose dynamics are described by the system (2.3);
( ii ) if ν = 0, the constant ρ ≥ 0 is the recovery rate for infected individuals, going into a class of recovered people, whose proportion is denoted by R . This case corresponds to the classical SIR model described by (2.4).
It can be noted that our model, which results from a mixing of the SIS and SIR models, can be interpreted as an SIR model with partial immunisation, in the sense that only a part of the population develops antibodies for the disease after being infected. Thus, a proportion ρ of the infected moves to the class R , and can no longer be infected. Conversely, the proportion of the infected who do not develop antibodies reverts to the class S , and can therefore contract the disease again. This resulting model is similar to the one developed by Zhang, Wu, Zhao, Su, and Choi [111] and called SISRS. This type of model seems in fact well suited to model epidemics related to new viruses, such as COVID–19, when the immunity of infected persons has not yet been proved.
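To make the two special cases of Remark 4.2 concrete, the following sketch discretises the system (4.2) with an Euler scheme for a constant testing control α. It is only an illustration: the function name `simulate_sis_sir`, the time grid, and all numerical parameter values are assumptions and are not taken from the paper. Setting `rho = 0` gives the SIS dynamics, and `nu = 0` the SIR dynamics. Since the same Brownian increment enters S and I with opposite signs, the noise cancels in the sum S + I, which can be checked on the simulated paths.

```python
import numpy as np


def simulate_sis_sir(s0, i0, alpha=1.0, lam=0.0, mu=0.0, nu=0.0, gamma=0.0,
                     rho=0.1, sigma=0.1, T=50.0, n_steps=500, seed=0):
    """Euler scheme for the system (4.2) with a constant control alpha.

    All default values are illustrative assumptions. rho = 0 corresponds
    to the SIS model, nu = 0 to the SIR model.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    S = np.empty(n_steps + 1)
    I = np.empty(n_steps + 1)
    S[0], I[0] = s0, i0
    for n in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))             # Brownian increment
        noise = sigma * alpha * S[n] * I[n] * dw      # common diffusion term
        S[n + 1] = S[n] + (lam - mu * S[n] + nu * I[n]) * dt + noise
        I[n + 1] = I[n] - (mu + nu + gamma + rho) * I[n] * dt - noise
    return S, I


if __name__ == "__main__":
    S_sir, I_sir = simulate_sis_sir(0.9, 0.1, nu=0.0, rho=0.1)   # SIR case
    S_sis, I_sis = simulate_sis_sir(0.9, 0.1, nu=0.1, rho=0.0)   # SIS case
    print("SIR: final S + I =", S_sir[-1] + I_sir[-1])
    print("SIS: final S + I =", S_sis[-1] + I_sis[-1])
```

In the SIS run with λ = µ = γ = 0, the sum S + I stays constant along the path up to rounding, while in the SIR run it decreases as infected individuals move to the class R.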
Before pursuing, we need a bit more notation, and will consider the following sets
$$\mathcal{A}^o := \bigcup_{\mathbb{P} \in \mathcal{P}} \mathcal{A}^o(\mathbb{P}),$$
as well as, for any $\alpha \in \mathcal{A}^o$, $\mathcal{P}(\alpha) := \big\{ \mathbb{P} \in \mathcal{P} : \alpha \in \mathcal{A}^o(\mathbb{P}) \big\}$. We will require that the controls chosen by the government lead to only one weak solution to Equation (4.2), and are such that the processes $S$ and $I$ remain non–negative. We will therefore concentrate our attention on the set $\mathcal{A}$ of admissible controls defined by
$$\mathcal{A} := \big\{ \alpha \in \mathcal{A}^o : \mathcal{P}(\alpha) \text{ is a singleton } \{\mathbb{P}^\alpha\}, \text{ and } (S, I) \text{ is } \mathbb{R}_+^2\text{–valued, } \mathbb{P}^\alpha\text{–a.s.} \big\}.$$
Notice that the set $\mathcal{A}$ is not empty. Indeed, any constant $A$–valued process automatically belongs to $\mathcal{A}$, as a direct consequence of Gray, Greenhalgh, Hu, Mao, and Pan [47, Section 3] or Gao, Song, Wang, and Liu [42, Lemma 2.3]. Notice that for any $\alpha \in \mathcal{A}$, we have $\widehat\sigma_t = \sigma S_t I_t \alpha_t$, $\mathrm{d}\mathbb{P}^\alpha \otimes \mathrm{d}t$–a.e.
[Footnote: More precisely, one should first use the result of Stroock and Varadhan [99, Theorem 4.5.2] to obtain that on an enlargement of $(\Omega, \mathcal{F}_T)$, there is for any $\mathbb{P} \in \mathcal{P}$, a Brownian motion $W^{\mathbb{P}}$, and an $\mathbb{F}$–predictable, $A$–valued process $\alpha^{\mathbb{P}}$ such that
$$S_t = s_0 + \int_0^t (\lambda - \mu S_s + \nu I_s) \mathrm{d}s + \int_0^t \sigma \alpha_s^{\mathbb{P}} S_s I_s \, \mathrm{d}W_s^{\mathbb{P}}, \; t \in [0, T], \; \mathbb{P}\text{–a.s.}$$
The result for $W$ is then immediate. Notice in addition that since $W$ is defined as a stochastic integral, it should also depend explicitly on $\mathbb{P}$. We can however use Nutz [81, Theorem 2.2] to define $W$ universally, as an $\mathbb{F}^{\mathcal{P}}$–adapted and continuous process. This requires some set–theoretic assumptions which we implicitly consider here, see Possamaï, Tan, and Zhou [85, Footnote 7] for details.]
For any $\alpha \in \mathcal{A}$, we have
$$S_t + I_t = s_0 + i_0 + \int_0^t \big( \lambda - \mu (S_s + I_s) - (\gamma + \rho) I_s \big) \mathrm{d}s, \; t \in [0, T], \; \mathbb{P}^\alpha\text{–a.s.}$$
We thus deduce, using the positivity of $S$ and $I$, that
$$0 \leq S_t + I_t = \mathrm{e}^{-\mu t} (s_0 + i_0) + \int_0^t \mathrm{e}^{-\mu (t - s)} \big( \lambda - (\gamma + \rho) I_s \big) \mathrm{d}s \leq F(t, s_0, i_0), \; t \in [0, T], \; \mathbb{P}^\alpha\text{–a.s.}, \quad (4.3)$$
where for all $(t, s, i) \in [0, T] \times \mathbb{R}_+^2$,
$$F(t, s, i) := \mathrm{e}^{-\mu t} (s + i) + \lambda \bigg( \frac{1 - \mathrm{e}^{-\mu t}}{\mu} \mathbf{1}_{\{\mu > 0\}} + t \mathbf{1}_{\{\mu = 0\}} \bigg). \quad (4.4)$$
This result proves in particular that $S$ and $I$ are actually $\mathbb{P}^\alpha$–almost surely bounded, for any $\alpha \in \mathcal{A}$. Moreover, if $(s_0, i_0) \in (\mathbb{R}_+^\star)^2$, then for all $t \in [0, T]$, both $S_t$ and $I_t$ are (strictly) positive.
Remark 4.3.
Note that in the SIR model described by the system (2.4), we have, for all $t \in [0, T]$,
$$R_t = r_0 \mathrm{e}^{-\mu t} + \rho \int_0^t I_s \mathrm{e}^{-\mu (t - s)} \mathrm{d}s,$$
so that $R_t$ depends only on the observation of $I_s$ for $s \leq t$. In addition to that,
$$0 \leq S_t + I_t + R_t \leq \mathrm{e}^{-\mu t} (s_0 + i_0 + r_0) + \int_0^t \mathrm{e}^{-\mu (t - s)} (\lambda - \gamma I_s) \mathrm{d}s \leq F(t, s_0, i_0) + r_0 \mathrm{e}^{-\mu t}.$$
The basic model from (4.2) takes into account the testing policy put into place by the government, but ignores so far the interacting behaviour of the population. We model this through an additional control process chosen by the population. More precisely, we fix some constant $\beta^{\max} > 0$ and the set $B := [0, \beta^{\max}]$. Let $\mathcal{B}$ be the set of all $\mathbb{F}$–predictable and $B$–valued processes. Given a testing policy $\alpha \in \mathcal{A}$ implemented by the government, notice that the following stochastic exponential
$$\bigg( \exp\bigg( - \int_0^t \frac{\beta_s}{\sigma \sqrt{\alpha_s}} \mathrm{d}W_s - \frac{1}{2} \int_0^t \frac{\beta_s^2}{\sigma^2 \alpha_s} \mathrm{d}s \bigg) \bigg)_{t \in [0,T]}$$
is an $(\mathbb{F}, \mathbb{P}^\alpha)$–martingale, given that the process $\beta / (\sigma \sqrt{\alpha})$ takes values in $\big[0, \beta^{\max} / (\sigma \sqrt{\varepsilon})\big]$, $\mathbb{P}^\alpha$–a.s. Therefore, for any $(\alpha, \beta) \in \mathcal{A} \times \mathcal{B}$, we can define a probability measure $\mathbb{P}^{\alpha,\beta}$ on $(\Omega, \mathcal{F})$, equivalent to $\mathbb{P}^\alpha$, by
$$\frac{\mathrm{d}\mathbb{P}^{\alpha,\beta}}{\mathrm{d}\mathbb{P}^\alpha} := \exp\bigg( - \int_0^T \frac{\beta_s}{\sigma \sqrt{\alpha_s}} \mathrm{d}W_s - \frac{1}{2} \int_0^T \frac{\beta_s^2}{\sigma^2 \alpha_s} \mathrm{d}s \bigg).$$
Using Girsanov’s theorem, we know that the process
$$W_t^\beta := W_t + \int_0^t \frac{\beta_s}{\sigma \sqrt{\alpha_s}} \mathrm{d}s, \; t \in [0, T], \quad (4.5)$$
is an $(\mathbb{F}, \mathbb{P}^{\alpha,\beta})$–Brownian motion, and we have
$$S_t = s_0 + \int_0^t \big( \lambda - \mu S_s + \nu I_s - \beta_s \sqrt{\alpha_s} S_s I_s \big) \mathrm{d}s + \int_0^t \sigma \alpha_s S_s I_s \, \mathrm{d}W_s^\beta, \; t \in [0, T],$$
$$I_t = i_0 - \int_0^t \big( (\mu + \nu + \gamma + \rho) I_s - \beta_s \sqrt{\alpha_s} S_s I_s \big) \mathrm{d}s - \int_0^t \sigma \alpha_s S_s I_s \, \mathrm{d}W_s^\beta, \; t \in [0, T]. \quad (4.6)$$
4.1.3 Optimisation problems
At time 0, the government informs the population about its testing policy $\alpha \in \mathcal{A}$, as well as its fine policy $\chi$, which for now will be an $\mathcal{F}_T$–measurable and $\mathbb{R}$–valued random variable (a set we denote by $\mathcal{C}$).
The population solves the following optimal control problem
$$V_0^{\rm A}(\alpha, \chi) := \sup_{\beta \in \mathcal{B}} \mathbb{E}^{\mathbb{P}^{\alpha,\beta}} \bigg[ \int_0^T u(t, \beta_t, I_t) \mathrm{d}t + U(-\chi) \bigg]. \quad (4.7)$$
The interpretation of the functions $u$ and $U$ is detailed in Section 2.3.1, where the population’s problem was informally defined.
For any $(\alpha, \chi) \in \mathcal{A} \times \mathcal{C}$, we recall that we denoted by $\mathcal{B}^\star(\alpha, \chi)$ the set of optimal controls for $V_0^{\rm A}(\alpha, \chi)$, that is to say
$$\mathcal{B}^\star(\alpha, \chi) := \bigg\{ \beta \in \mathcal{B} : V_0^{\rm A}(\alpha, \chi) = \mathbb{E}^{\mathbb{P}^{\alpha,\beta}} \bigg[ \int_0^T u(t, \beta_t, I_t) \mathrm{d}t + U(-\chi) \bigg] \bigg\}. \quad (4.8)$$
We require minimal integrability assumptions at this stage, and insist that there exists some $p > 1$ such that
$$\mathbb{E}^{\mathbb{P}^\alpha} \big[ |U(-\chi)|^p \big] < \infty, \text{ for any } \alpha \in \mathcal{A}. \quad (4.9)$$
Remark 4.4.
Notice that since for any $\alpha \in \mathcal{A}$ the Radon–Nikodým density $\mathrm{d}\mathbb{P}^{\alpha,\beta} / \mathrm{d}\mathbb{P}^\alpha$ has moments of any order under $\mathbb{P}^\alpha$ (since any $\beta \in \mathcal{B}$ is bounded and any $\alpha \in \mathcal{A}$ is bounded and bounded away from 0), a simple application of Hölder’s inequality ensures that (4.9) implies that for any $p' \in (1, p)$ and any $\beta \in \mathcal{B}$,
$$\mathbb{E}^{\mathbb{P}^{\alpha,\beta}} \big[ |U(-\chi)|^{p'} \big] < \infty.$$
Recall that the government can only implement policies $(\alpha, \chi) \in \mathcal{A} \times \mathcal{C}$ such that $V_0^{\rm A}(\alpha, \chi) \geq v$, where the minimal utility $v \in \mathbb{R}$ is given. We denote the subset of $\mathcal{A} \times \mathcal{C}$ satisfying this constraint and Equation (4.9) by $\Xi$.
In line with the informal reasoning developed in Section 2.3.4, the government aims at minimising the number of infected people until the end of the lockdown period, and we write rigorously its optimisation problem as
$$V_0^{\rm P} := \sup_{(\alpha, \chi) \in \Xi} \sup_{\beta \in \mathcal{B}^\star(\alpha, \chi)} \mathbb{E}^{\mathbb{P}^{\alpha,\beta}} \bigg[ \chi - \int_0^T \big( c(I_t) + k(t, \alpha_t, S_t, I_t) \big) \mathrm{d}t \bigg], \quad (4.10)$$
where the functions $c : \mathbb{R}_+ \longrightarrow \mathbb{R}_+$ and $k : [0, T] \times A \times \mathbb{R}_+ \times \mathbb{R}_+ \longrightarrow \mathbb{R}$ were introduced in Section 2.3.4.
Since the fine policy $\chi$ is an $\mathcal{F}_T$–measurable random variable, where $\mathbb{F}$ is the filtration generated by the process $(S, I)$, we should expect that in general $V_0^{\rm A}(\alpha, \chi) = v(0, s_0, i_0)$, where the map $v : [0, T] \times \mathcal{C}^2 \longrightarrow \mathbb{R}$ satisfies an informal Hamilton–Jacobi–Bellman (HJB for short) equation, and as such has the dynamic
$$\mathrm{d}v(t, S_t, I_t) = - H(t, S_t, I_t, Z_t^s, Z_t^i, \alpha_t) \mathrm{d}t + Z_t^s \mathrm{d}S_t + Z_t^i \mathrm{d}I_t,$$
where the population’s Hamiltonian $H : [0, T] \times (\mathbb{R}_+^\star)^2 \times \mathbb{R}^2 \times A \longrightarrow \mathbb{R}$ is defined, for $(t, s, i, z, z', a) \in [0, T] \times (\mathbb{R}_+^\star)^2 \times \mathbb{R}^2 \times A$, by
$$H(t, s, i, z, z', a) := \sup_{b \in B} h(t, s, i, z, z', a, b),$$
where
$$h(t, s, i, z, z', a, b) := \big( \lambda - \mu s + \nu i - b \sqrt{a} s i \big) z - \big( (\mu + \nu + \gamma + \rho) i - b \sqrt{a} s i \big) z' + u(t, b, i), \text{ for } b \in B.$$
In particular, defining $Z := Z^s - Z^i$, we should have
\begin{align*}
U(-\chi) &= V^A_0(\alpha,\chi) - \int_0^T H(t,S_t,I_t,Z^s_t,Z^i_t,\alpha_t)\mathrm dt + \int_0^T Z^s_t\mathrm dS_t + \int_0^T Z^i_t\mathrm dI_t \\
&= V^A_0(\alpha,\chi) - \int_0^T \sup_{b\in B}\big\{u(t,b,I_t) - b\sqrt{\alpha_t}S_tI_t(Z^s_t-Z^i_t)\big\}\mathrm dt + \int_0^T \sigma\alpha_tS_tI_t\big(Z^s_t-Z^i_t\big)\mathrm dW_t \\
&= V^A_0(\alpha,\chi) - \int_0^T \Big((\mu+\nu+\gamma+\rho)I_tZ_t + \sup_{b\in B}\big\{u(t,b,I_t) - b\sqrt{\alpha_t}S_tI_tZ_t\big\}\Big)\mathrm dt - \int_0^T Z_t\mathrm dI_t. \tag{4.11}
\end{align*}
Given the supremum appearing above, the following assumption will be useful for us.

Assumption 4.5. There exists a unique Borel–measurable map $b^\star:[0,T]\times\mathbb R_+^\star\times\mathbb R_+^\star\times\mathbb R\times A\longrightarrow B$ such that
\[
b^\star(t,s,i,z,a)\in\operatorname*{argmax}_{b\in B}\big\{u(t,b,i)-b\sqrt a\,siz\big\}, \quad \forall(t,s,i,z,a)\in[0,T]\times(\mathbb R_+^\star)^2\times\mathbb R\times A. \tag{4.12}
\]

Remark 4.6.
We would like to insist on the fact that for the SIR model and in view of Remark 4.3, it is not necessary to consider that the process $R$ is a state variable. Indeed, its value at time $t$ can be deduced from the paths of $I$ until time $t$. More precisely, following the previous reasoning to find the relevant form of contracts, one could consider
\[
\mathrm dv(t,S_t,I_t) = -\widetilde H(t,S_t,I_t,R_t,Z^s_t,Z^i_t,Z^r_t,\alpha_t)\mathrm dt + Z^s_t\mathrm dS_t + Z^i_t\mathrm dI_t + Z^r_t\mathrm dR_t,
\]
where, in this case, the population's Hamiltonian $\widetilde H:[0,T]\times(\mathbb R_+^\star)^3\times\mathbb R^3\times A\longrightarrow\mathbb R$ is defined by
\[
\widetilde H(t,s,i,r,z,z',\tilde z,a) := \sup_{b\in B}\big\{h(t,s,i,z,z',a,b)\big\} + (\rho i-\mu r)\tilde z,
\]
for any $(t,s,i,r,z,z',\tilde z,a)\in[0,T]\times(\mathbb R_+^\star)^3\times\mathbb R^3\times A$. Since the dynamics of $R$ is deterministic and not controlled, a simplification occurs between the additional part of the Hamiltonian $(\rho i-\mu r)\tilde z$ and the integral with respect to $\mathrm dR$, which leads to the same form for the utility function as previously mentioned, i.e., Equation (4.11).

Let us start this section by defining two useful spaces. For any $\alpha\in\mathcal A$, and any $m\in\mathbb N^\star$, we define $\mathbb S^m(\mathbb P^\alpha)$ and $\mathbb H^m(\mathbb P^\alpha)$ as respectively the set of $\mathbb R$–valued, $\mathbb F^{\mathbb P^\alpha+}$–adapted continuous processes $Y$ such that $\|Y\|_{\mathbb S^m(\mathbb P^\alpha)}<\infty$, and the set of $\mathbb F^{\mathbb P^\alpha}$–predictable, $\mathbb R$–valued processes $Z$ with $\|Z\|_{\mathbb H^m(\mathbb P^\alpha)}<\infty$, where
\[
\|Y\|^m_{\mathbb S^m(\mathbb P^\alpha)} := \mathbb E^{\mathbb P^\alpha}\bigg[\sup_{t\in[0,T]}|Y_t|^m\bigg], \quad \|Z\|^m_{\mathbb H^m(\mathbb P^\alpha)} := \mathbb E^{\mathbb P^\alpha}\bigg[\bigg(\int_0^T\big|\widehat\sigma_sZ_s\big|^2\mathrm ds\bigg)^{m/2}\bigg], \quad (Y,Z)\in\mathbb S^m(\mathbb P^\alpha)\times\mathbb H^m(\mathbb P^\alpha).
\]

Theorem 4.7.
Let $(\alpha,\chi)\in\Xi$. There exists a unique $\mathcal F^{\mathbb P^\alpha+}_0$–measurable random variable $Y_0$ and a unique $Z\in\mathbb H^p(\mathbb P^\alpha)$ such that
\[
U(-\chi) = Y_0 - \int_0^T\Big(Z_t(\mu+\nu+\gamma+\rho)I_t + u(t,\beta^\star_t,I_t) - \beta^\star_t\sqrt{\alpha_t}S_tI_tZ_t\Big)\mathrm dt - \int_0^T Z_t\mathrm dI_t, \quad \mathbb P^\alpha\text{–a.s.}, \tag{4.13}
\]
with $\beta^\star_t := b^\star(t,S_t,I_t,Z_t,\alpha_t)$ for all $t\in[0,T]$. Moreover, $\mathcal B^\star(\alpha,\chi)=\{\beta^\star\}$ and $V^A_0(\alpha,\chi)=\mathbb E^{\mathbb P^\alpha}[Y_0]$.

Proof. Fix $(\alpha,\chi)\in\Xi$ as in the statement of the theorem. Let us consider the solution $(Y,Z)$ of the following BSDE
\[
Y_t = U(-\chi) + \int_t^T\sup_{b\in B}\big\{u(r,b,I_r) - Z_rb\sqrt{\alpha_r}S_rI_r\big\}\mathrm dr - \int_t^T Z_r\sigma\alpha_rS_rI_r\mathrm dW_r, \quad t\in[0,T]. \tag{4.14}
\]
Since $\chi\in\mathcal C$, $u$ is continuous, $I$ and $S$ are bounded, and $B$ is a compact set, it is immediate that this BSDE is well–posed and admits a unique solution $(Y,Z)\in\mathbb S^p(\mathbb P^\alpha)\times\mathbb H^p(\mathbb P^\alpha)$ (in a more general context, one may refer for instance to Bouchard, Possamaï, Tan, and Zhou [20, Theorem 4.1]). Therefore, using the dynamics of $I$ under $\mathbb P^\alpha$, given by Equation (4.2), as well as the definition of $\beta^\star$, and letting $t=0$, we obtain that (4.13) is satisfied. Next, using this representation for $U(-\chi)$, notice that for any $\beta\in\mathcal B$, we have
\begin{align*}
\mathbb E^{\mathbb P^{\alpha,\beta}}\bigg[\int_0^T u(t,\beta_t,I_t)\mathrm dt + U(-\chi)\bigg]
&= \mathbb E^{\mathbb P^{\alpha,\beta}}\bigg[Y_0 + \int_0^T\Big(u(t,\beta_t,I_t) - Z_t(\mu+\nu+\gamma+\rho)I_t - u(t,\beta^\star_t,I_t) + \beta^\star_t\sqrt{\alpha_t}S_tI_tZ_t\Big)\mathrm dt - \int_0^T Z_t\mathrm dI_t\bigg] \\
&= \mathbb E^{\mathbb P^{\alpha}}[Y_0] + \mathbb E^{\mathbb P^{\alpha,\beta}}\bigg[\int_0^T\Big(u(t,\beta_t,I_t) - \beta_t\sqrt{\alpha_t}S_tI_tZ_t - u(t,\beta^\star_t,I_t) + \beta^\star_t\sqrt{\alpha_t}S_tI_tZ_t\Big)\mathrm dt\bigg] \\
&\le \mathbb E^{\mathbb P^{\alpha}}[Y_0],
\end{align*}
where we used the fact that $Z\in\mathbb H^p(\mathbb P^\alpha)$, and that the process
\[
\mathcal E^\beta_\cdot := \exp\bigg(-\int_0^\cdot\frac{\beta_s}{\sigma\sqrt{\alpha_s}}\mathrm dW_s - \int_0^\cdot\frac{\beta_s^2}{2\sigma^2\alpha_s}\mathrm ds\bigg)
\]
is continuous, and both an $(\mathbb F^{\mathbb P^\alpha},\mathbb P^\alpha)$– and an $(\mathbb F^{\mathbb P^\alpha+},\mathbb P^\alpha)$–martingale (see for instance Neufeld and Nutz [80, Proposition 2.2]), so that for any $\beta\in\mathcal B$
\[
\mathbb E^{\mathbb P^{\alpha,\beta}}[Y_0] = \mathbb E^{\mathbb P^{\alpha}}\big[\mathcal E^\beta_TY_0\big] = \mathbb E^{\mathbb P^{\alpha}}\big[\mathcal E^\beta_0Y_0\big] = \mathbb E^{\mathbb P^{\alpha}}[Y_0].
\]
The previous inequality implies that $V^A_0(\alpha,\chi)\le\mathbb E^{\mathbb P^\alpha}[Y_0]$. Moreover, thanks to Assumption 4.5, equality is achieved if and only if we choose the control $\beta^\star$. This shows that $V^A_0(\alpha,\chi)=\mathbb E^{\mathbb P^\alpha}[Y_0]$, and $\mathcal B^\star(\alpha,\chi)=\{\beta^\star\}$.

In the previous result, the fact that Equation (4.13) holds with an $\mathcal F^{\mathbb P^\alpha+}_0$–measurable random variable and not a constant is somewhat annoying. The next lemma shows that we can actually have the representation with a constant without loss of generality.
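The martingale property of the density process $\mathcal E^\beta$ used in the proof of Theorem 4.7 can be checked numerically in the simplest case. The following is a minimal sketch, assuming constant controls $\beta$ and $\alpha$ (an assumption made purely for illustration, as are the function name and the parameter values): it estimates $\mathbb E^{\mathbb P^\alpha}[\mathcal E^\beta_T]$ by Monte Carlo, which should be close to $\mathcal E^\beta_0 = 1$.

```python
import math
import random


def stochastic_exponential_mean(beta, alpha, sigma, horizon, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E[E_T] for constant beta and alpha (illustrative assumption).

    With theta = beta / (sigma * sqrt(alpha)), the density process reads
    E_T = exp(-theta * W_T - 0.5 * theta**2 * T).
    """
    rng = random.Random(seed)
    theta = beta / (sigma * math.sqrt(alpha))
    dt = horizon / n_steps
    total = 0.0
    for _ in range(n_paths):
        log_density = 0.0
        for _ in range(n_steps):
            # Brownian increment over one time step
            dw = rng.gauss(0.0, math.sqrt(dt))
            log_density += -theta * dw - 0.5 * theta**2 * dt
        total += math.exp(log_density)
    return total / n_paths
```

Since the expectation of the density is one for every admissible $\beta$, changing the measure from $\mathbb P^\alpha$ to $\mathbb P^{\alpha,\beta}$ does not alter the expectation of an $\mathcal F_0$–measurable quantity such as $Y_0$, which is the step used in the proof.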
Lemma 4.8. Let $\alpha\in\mathcal A$, and fix an $\mathcal F^{\mathbb P^\alpha+}_0$–measurable random variable $Y_0$ and some $Z\in\mathbb H^p(\mathbb P^\alpha)$. Define the following contracts
\begin{align*}
\chi^1 &:= -U^{(-1)}\bigg(Y_0 - \int_0^T\Big(Z_t(\mu+\nu+\gamma+\rho)I_t + u(t,\beta^\star_t,I_t) - \beta^\star_t\sqrt{\alpha_t}S_tI_tZ_t\Big)\mathrm dt - \int_0^T Z_t\mathrm dI_t\bigg), \\
\chi^2 &:= -U^{(-1)}\bigg(\mathbb E^{\mathbb P^\alpha}[Y_0] - \int_0^T\Big(Z_t(\mu+\nu+\gamma+\rho)I_t + u(t,\beta^\star_t,I_t) - \beta^\star_t\sqrt{\alpha_t}S_tI_tZ_t\Big)\mathrm dt - \int_0^T Z_t\mathrm dI_t\bigg).
\end{align*}
Then
\[
V^A_0(\alpha,\chi^1) = V^A_0(\alpha,\chi^2) = \mathbb E^{\mathbb P^\alpha}[Y_0], \quad \mathcal B^\star(\alpha,\chi^1) = \mathcal B^\star(\alpha,\chi^2) = \{\beta^\star\}.
\]

Proof. The equalities for $(\alpha,\chi^1)$ are immediate from Theorem 4.7. For $(\alpha,\chi^2)$, we have, using the fact that $Z\in\mathbb H^p(\mathbb P^\alpha)$, and thus $Z\in\mathbb H^q(\mathbb P^{\alpha,\beta})$ for any $\beta\in\mathcal B$ and any $q\in(1,p)$
\begin{align*}
V^A_0(\alpha,\chi^2) &= \sup_{\beta\in\mathcal B}\mathbb E^{\mathbb P^{\alpha,\beta}}\bigg[\mathbb E^{\mathbb P^\alpha}[Y_0] + \int_0^T\Big(u(t,\beta_t,I_t) - Z_t(\mu+\nu+\gamma+\rho)I_t - u(t,\beta^\star_t,I_t) + \beta^\star_t\sqrt{\alpha_t}S_tI_tZ_t\Big)\mathrm dt - \int_0^T Z_t\mathrm dI_t\bigg] \\
&= \mathbb E^{\mathbb P^\alpha}[Y_0] + \sup_{\beta\in\mathcal B}\mathbb E^{\mathbb P^{\alpha,\beta}}\bigg[\int_0^T\Big(u(t,\beta_t,I_t) - \beta_t\sqrt{\alpha_t}S_tI_tZ_t - u(t,\beta^\star_t,I_t) + \beta^\star_t\sqrt{\alpha_t}S_tI_tZ_t\Big)\mathrm dt\bigg] \\
&\le \mathbb E^{\mathbb P^\alpha}[Y_0].
\end{align*}
Since the equality is attained if and only if we choose $\beta=\beta^\star$, this ends the proof.

We introduce the class $\overline\Xi$ of contracts defined by all pairs $\big(\alpha,-U^{(-1)}\big(Y^{y_0,Z}_T\big)\big)$ with $\alpha\in\mathcal A$, and $Y^{y_0,Z}$ a process given, $\mathbb P^\alpha$–a.s., for all $t\in[0,T]$ by
\[
Y^{y_0,Z}_t = y_0 - \int_0^t\Big(Z_r(\mu+\nu+\gamma+\rho)I_r + u\big(r,b^\star(r,S_r,I_r,Z_r,\alpha_r),I_r\big) - b^\star(r,S_r,I_r,Z_r,\alpha_r)\sqrt{\alpha_r}S_rI_rZ_r\Big)\mathrm dr - \int_0^t Z_r\mathrm dI_r,
\]
with $Z\in\mathbb H^p(\mathbb P^\alpha)$ and $y_0\in[\underline v,\infty)$. We also denote for simplicity $\mathbb P^{\star,\alpha,Z} := \mathbb P^{\alpha,b^\star(\cdot,S_\cdot,I_\cdot,Z_\cdot,\alpha_\cdot)}$.

Lemma 4.9.
The problem of the government given by (4.10) can be rewritten
\[
V^P_0 = \sup_{(\alpha,Z)\in\mathcal A\times\mathbb H^p(\mathbb P^\alpha)}\mathbb E^{\mathbb P^{\star,\alpha,Z}}\bigg[-U^{(-1)}\big(Y^{\underline v,Z}_T\big) - \int_0^T\big(c(I_s)+k(s,\alpha_s,S_s,I_s)\big)\mathrm ds\bigg]. \tag{4.15}
\]

Proof. From Theorem 4.7 and Lemma 4.8, we know that $\Xi\subset\overline\Xi$. To prove the reverse inclusion, let us now consider a pair $\big(\alpha,-U^{(-1)}\big(Y^{y_0,Z}_T\big)\big)\in\overline\Xi$. We simply need to ensure that $-U^{(-1)}\big(Y^{y_0,Z}_T\big)\in\mathcal C$. We have, using the fact that $u$ is continuous, $B$ is compact, $\alpha$ is bounded below by $\varepsilon$, and $S$ and $I$ are bounded, that there exists some constant $C>0$ such that
\begin{align*}
\mathbb E^{\mathbb P^\alpha}\Big[\big|Y^{y_0,Z}_T\big|^p\Big] &\le C\bigg(1+\mathbb E^{\mathbb P^\alpha}\bigg[\bigg(\int_0^T|S_rI_rZ_r|\mathrm dr\bigg)^p + \bigg|\int_0^T\widehat\sigma_rZ_r\mathrm dW_r\bigg|^p\bigg]\bigg) \\
&\le C\bigg(1+\mathbb E^{\mathbb P^\alpha}\bigg[\bigg(\int_0^T\sigma\alpha_r|S_rI_rZ_r|\mathrm dr\bigg)^p\bigg] + \|Z\|^p_{\mathbb H^p(\mathbb P^\alpha)}\bigg) \\
&\le C\Big(1+\|Z\|^p_{\mathbb H^p(\mathbb P^\alpha)}\Big) < \infty,
\end{align*}
where we used Burkholder–Davis–Gundy's inequality and Cauchy–Schwarz's inequality. This proves the reverse inclusion and thus that $\Xi=\overline\Xi$.

Next, we use Lemma 4.8 to realise that
\[
\mathcal B^\star\big(\alpha,-U^{(-1)}\big(Y^{y_0,Z}_T\big)\big) = \big\{b^\star(\cdot,S_\cdot,I_\cdot,Z_\cdot,\alpha_\cdot)\big\}, \quad\text{and}\quad V^A_0\big(\alpha,-U^{(-1)}\big(Y^{y_0,Z}_T\big)\big) = y_0,
\]
which shows
\[
V^P_0 = \sup_{y_0\ge\underline v}\ \sup_{(\alpha,Z)\in\mathcal A\times\mathbb H^p(\mathbb P^\alpha)}\mathbb E^{\mathbb P^{\star,\alpha,Z}}\bigg[-U^{(-1)}\big(Y^{y_0,Z}_T\big) - \int_0^T\big(c(I_s)+k(s,\alpha_s,S_s,I_s)\big)\mathrm ds\bigg].
\]
To conclude, it is enough to notice that the following map
\[
[\underline v,\infty)\ni y_0\longmapsto\mathbb E^{\mathbb P^{\star,\alpha,Z}}\bigg[-U^{(-1)}\big(Y^{y_0,Z}_T\big) - \int_0^T\big(c(I_s)+k(s,\alpha_s,S_s,I_s)\big)\mathrm ds\bigg]\in\mathbb R
\]
is non–increasing.

Lemma 4.9 states that the problem of the government can be reduced to a more standard stochastic control problem.
However, in the current formulation, one of the three state variables, namely $Y$, is considered in the strong formulation, while the other state variables $S$ and $I$ are considered in weak formulation. Indeed, the variable $Y$ is indexed by the control $Z$, while the control $(\alpha,Z)$ only impacts the distribution of $S$ and $I$ through $\mathbb P^{\star,\alpha,Z}$. As highlighted by Cvitanić and Zhang [27, Remark 5.1.3], it makes little sense to consider a control problem of this form directly. Therefore, contrary to what is usually done in principal–agent problems (see, e.g., [29]), we decided to adopt the weak formulation to rigorously write the problem of the principal, since this is the formulation which makes sense for the agent's problem. We will thus formulate it below, for the sake of thoroughness.

Footnote: Notice that at the end of the day, this is not really an issue. Indeed, provided that the problem has enough regularity (typically some semi–continuity of the terminal and running reward with respect to state), one can expect the strong and weak formulations to coincide. See for instance El Karoui and Tan [34, Theorem 4.5].

Let $V := \mathbb R\times A$ and consider the set $\mathbb V$ as we defined $\mathbb A$ in Section 4.1.1. The intuition is that the principal's problem depends only on time and on the state variable $X=(S,I,Y)$. Following the same methodology used for the agent's problem, to properly define the weak formulation of the principal's problem, we are led to consider the following canonical space $\Omega^P := \mathcal C^3\times\mathbb V$, with canonical process $(S,I,Y,\Lambda^P)$, where for any $(t,\mathfrak s,\iota,y,q)\in[0,T]\times\Omega^P$
\[
S_t(\mathfrak s,\iota,y,q) := \mathfrak s(t), \quad I_t(\mathfrak s,\iota,y,q) := \iota(t), \quad Y_t(\mathfrak s,\iota,y,q) := y(t), \quad \Lambda^P(\mathfrak s,\iota,y,q) := q.
\]
We let $\mathcal G$ be the Borel $\sigma$–algebra on $\Omega^P$, and $\mathbb G := (\mathcal G_t)_{t\in[0,T]}$ the natural filtration of $(S,I,Y,\Lambda^P)$, defined in the same way as $\mathbb F$ in the previous canonical space $\Omega$ (see Section 4.1). Let then $\mathcal M^P$ be the set of probability measures on $(\Omega^P,\mathcal G_T)$. For any $\mathbb P\in\mathcal M^P$, we can define $\mathbb G^{\mathbb P}$ the $\mathbb P$–augmentation of $\mathbb G$, its right limit $\mathbb G^{\mathbb P+}$, as well as $\mathbb G^\Pi := (\mathcal G^\Pi_t)_{t\in[0,T]}$ the $\Pi$–universal completion of $\mathbb G$ for any subset $\Pi\subset\mathcal M^P$.

The drift and volatility of the process $X$ are now defined for any $(t,s,i,z,a)\in[0,T]\times(\mathbb R_+^\star)^2\times V$ by
\[
B^P(t,s,i,z,a) := \begin{pmatrix}\lambda-\mu s+\nu i-b^\star(t,s,i,z,a)\sqrt a\,si\\ -(\mu+\nu+\gamma+\rho)i+b^\star(t,s,i,z,a)\sqrt a\,si\\ -u^\star(t,s,i,z,a)\end{pmatrix}, \quad \Sigma^P(s,i,z,a) := \sigma asi\begin{pmatrix}1\\-1\\z\end{pmatrix}, \tag{4.16}
\]
where $u^\star(t,s,i,z,a) := u(t,b^\star(t,s,i,z,a),i)$, for all $(t,s,i,z,a)\in[0,T]\times(\mathbb R_+^\star)^2\times V$. For any $(t,\varphi^P)\in[0,T]\times C^2_b(\mathbb R^3,\mathbb R)$, we define
\[
M^P_t(\varphi^P) := \varphi^P(X_t) - \iint_{[0,t]\times V}\Big(B^P(r,S_r,I_r,v)\cdot\nabla\varphi^P(X_r) + \frac12\mathrm{Tr}\big[D^2\varphi^P(X_r)\big(\Sigma^P(\Sigma^P)^\top\big)(r,S_r,I_r,v)\big]\Big)\Lambda^P(\mathrm dr,\mathrm dv).
\]
In the spirit of Definition 4.1 for $\mathcal P\subset\mathcal M$, we define the subset $\mathcal Q\subset\mathcal M^P$ as the one consisting of all $\mathbb P\in\mathcal M^P$ such that

(i) $M^P(\varphi^P)$ is a $(\mathbb G,\mathbb P)$–local martingale on $[0,T]$ for all $\varphi^P\in C^2_b(\mathbb R^3,\mathbb R)$;
(ii) $\mathbb P\big[X_0=x_0\big]=1$, where $x_0 := (s_0,i_0,\underline v)$;
(iii) with $\mathbb P$–probability $1$, the canonical process $\Lambda^P$ is of the form $\delta_{\phi_\cdot}(\mathrm dv)\mathrm dt$ for some Borel function $\phi:[0,T]\longrightarrow V$.

Still following the line of Section 4.1, we know that for any $\mathbb P\in\mathcal Q$, we can define a $(\mathbb G^{\mathcal Q},\mathbb P)$–Brownian motion $W^{\mathbb P}$. We then denote by $\mathcal V^o(\mathbb P)$ the set of $\mathbb G$–predictable and $V$–valued processes $(Z,\alpha)$ such that, $\mathbb P$–a.s. and for all $t\in[0,T]$,
\begin{align*}
S_t &= s_0 + \int_0^t\big(\lambda-\mu S_r+\nu I_r-b^\star(r,S_r,I_r,Z_r,\alpha_r)\sqrt{\alpha_r}S_rI_r\big)\mathrm dr + \int_0^t\sigma\alpha_rS_rI_r\mathrm dW^{\mathbb P}_r, \\
I_t &= i_0 - \int_0^t\big((\mu+\nu+\gamma+\rho)I_r-b^\star(r,S_r,I_r,Z_r,\alpha_r)\sqrt{\alpha_r}S_rI_r\big)\mathrm dr - \int_0^t\sigma\alpha_rS_rI_r\mathrm dW^{\mathbb P}_r, \\
Y_t &= \underline v - \int_0^t u^\star(r,S_r,I_r,Z_r,\alpha_r)\mathrm dr + \int_0^t Z_r\sigma\alpha_rS_rI_r\mathrm dW^{\mathbb P}_r. \tag{4.17}
\end{align*}

Thanks to the analysis conducted in the previous subsection, the problem of the government given by (4.10) can now be written rigorously in weak formulation
\[
V^P_0 = \sup_{\mathbb P\in\mathcal Q}\mathbb E^{\mathbb P}\bigg[-U^{(-1)}(Y_T) - \int_0^T\big(c(I_s)+k(s,\alpha_s,S_s,I_s)\big)\mathrm ds\bigg]. \tag{4.18}
\]
We then define the Hamiltonian of the government, for all $t\in[0,T]$, $x := (s,i,y)\in\mathbb R^3$ and $(p,M)\in\mathbb R^3\times\mathbb S^3$, by
\[
H^P(t,x,p,M) := \sup_{(z,a)\in V}\Big\{B^P(t,s,i,z,a)\cdot p + \frac12\mathrm{Tr}\big[M\big(\Sigma^P(\Sigma^P)^\top\big)(t,s,i,z,a)\big] - k(t,a,s,i)\Big\} - c(i), \tag{4.19}
\]
where $\mathbb S^3$ represents the set of $3\times3$ symmetric matrices. More explicitly, the Hamiltonian can be written
\begin{align*}
H^P(t,x,p,M) ={}& \sup_{z\in\mathbb R,\,a\in A}\Big\{b^\star(t,s,i,z,a)\sqrt a\,si(p_2-p_1) - u^\star(t,s,i,z,a)p_3 + \frac12\sigma^2a^2(si)^2f(z,M) - k(t,a,s,i)\Big\} \\
&+ (\lambda-\mu s+\nu i)p_1 - (\mu+\nu+\gamma+\rho)ip_2 - c(i),
\end{align*}
where $f(z,M) := M_{11}-2M_{12}+M_{22}-2z(M_{23}-M_{13})+z^2M_{33}$, for all $(z,M)\in\mathbb R\times\mathbb S^3$.

We are then led to consider the following HJB equation, for all $t\in[0,T]$ and $x=(s,i,y)\in\mathbb R^3$:
\[
-\partial_tv(t,x) - H^P\big(t,x,\nabla_xv,D^2_xv\big) = 0, \quad (t,x)\in\mathcal O, \tag{4.20}
\]
with terminal condition $v(T,x) := -U^{(-1)}(y)$, and where the natural domain over which the above PDE must be solved is
\[
\mathcal O := \big\{(t,s,i,y)\in[0,T)\times\mathbb R^2\times\mathbb R : 0<s+i<F(t,s_0,i_0)\big\},
\]
recalling that $F$ is defined by (4.4).

Remark 4.10.
Standard arguments from viscosity solution theory allow one to prove that $V^P_0 = v^P(0,x_0)$ (recalling that $x_0=(s_0,i_0,\underline v)$), where $v^P$ should be understood as the unique viscosity solution, in an appropriate class of functions, of the PDE (4.20). Obtaining further regularity results is by far more challenging. Indeed, it is a second–order, fully non–linear, parabolic PDE, which is clearly not uniformly elliptic, the corresponding diffusion matrix being degenerate. This makes the question of proving the existence of an optimal contract a very complicated one, which is clearly outside the scope of our study. As a sanity check though, we recall that $\varepsilon$–optimal contracts always exist, and can indeed be approximated numerically. See for instance Kharroubi, Lim, and Mastrolia [66] for an explicit construction of such $\varepsilon$–optimal contracts in a particular case dealing with the stochastic logistic equation.

As already mentioned, the first–best case corresponds to the case where the government can enforce whichever interaction rate $\beta\in\mathcal B$ it desires (in addition to a contract $(\alpha,\chi)\in\mathcal A\times\mathcal C$), and simply has to satisfy the participation constraint of the population. In order to find the optimal interaction rate in this scenario, as well as the optimal contract, one has to solve the government's problem defined by (2.14).

The simplest way to take into account the inequality constraint in the definition of $V^{P,\mathrm{FB}}_0$ is to introduce the associated Lagrangian. By strong duality, we then have
\[
V^{P,\mathrm{FB}}_0 = \inf_{\varpi>0}\ \sup_{(\alpha,\chi,\beta)\in\mathcal A\times\mathcal C\times\mathcal B}\bigg\{\mathbb E^{\mathbb P^{\alpha,\beta}}\bigg[\chi-\int_0^T\big(c(I_t)+k(t,\alpha_t,S_t,I_t)\big)\mathrm dt\bigg] + \varpi\bigg(\mathbb E^{\mathbb P^{\alpha,\beta}}\bigg[\int_0^T u(t,\beta_t,I_t)\mathrm dt+U(-\chi)\bigg]-\underline v\bigg)\bigg\}.
\]
First, by concavity of $U$, it is immediate that for any given Lagrange multiplier $\varpi>0$, the optimal tax is constant and given by (2.19). Then, using the definition of $V(\varpi)$ for any $\varpi>0$, we obtain
\[
V^{P,\mathrm{FB}}_0 = \inf_{\varpi>0}\Big\{\chi^\star(\varpi) + \varpi\big(U\big(-\chi^\star(\varpi)\big)-\underline v\big) + V(\varpi)\Big\}.
\]
Note that $V(\varpi)$ is the value function of a standard stochastic control problem. Therefore, we expect to have $V(\varpi)=v^\varpi(0,s_0,i_0)$, where the function $v^\varpi:[0,T]\times\mathbb R^2\longrightarrow\mathbb R$ solves the following HJB PDE
\[
\begin{cases}
-\partial_tv^\varpi(t,s,i) + c(i) - (\lambda-\mu s+\nu i)\partial_sv^\varpi + (\mu+\nu+\gamma+\rho)i\,\partial_iv^\varpi - H^\varpi\big(t,s,i,\partial v^\varpi,D^2v^\varpi\big) = 0, & (t,s,i)\in\mathcal D, \\
v^\varpi(T,s,i) = 0, & (s,i)\in\mathcal D_T,
\end{cases}
\]
where the Hamiltonian is defined, for $t\in[0,T]$, $(s,i)\in(\mathbb R_+^\star)^2$, $p := (p_1,p_2)\in\mathbb R^2$ and $M\in\mathbb S^2$ by
\[
H^\varpi(t,s,i,p,M) := \sup_{a\in A}\Big\{\sup_{b\in B}\big\{\varpi u(t,b,i) - bsi\sqrt a(p_1-p_2)\big\} - k(t,a,s,i) + \frac12\sigma^2(si)^2a^2(M_{11}-2M_{12}+M_{22})\Big\}.
\]
To simplify, let us consider separable utilities with the forms (2.17). We focus on the maximisation of the Hamiltonian $H^\varpi$ with respect to $b\in B$, to obtain the optimal interaction rate $\beta^\varpi$. The maximiser $b^\varpi$ is defined by
\[
b^\varpi(s,i,p,a) := b^\circ\big(s,i,\sqrt a(p_1-p_2)/\varpi\big), \quad\text{for all } (s,i,p,a)\in(\mathbb R_+^\star)^2\times\mathbb R^2\times A,
\]
recalling that $b^\circ$ is defined by (2.18). In particular, for a given testing policy $\alpha\in\mathcal A$ and a Lagrange multiplier $\varpi>0$, the optimal interaction rate is given for all $t\in[0,T]$ by $\beta^\varpi_t = b^\varpi\big(S_t,I_t,\partial v^\varpi(t,S_t,I_t),\alpha_t\big)$. We thus obtain
\[
H^\varpi(t,s,i,p,M) = \sup_{a\in A}\Big\{\varpi u^\varpi(t,s,i,p,a) - b^\varpi(s,i,p,a)si\sqrt a(p_1-p_2) - k(t,a,s,i) + \frac12(\sigma sia)^2(M_{11}-2M_{12}+M_{22})\Big\},
\]
for all $(t,s,i,p,M)\in[0,T]\times(\mathbb R_+^\star)^2\times\mathbb R^2\times\mathbb S^2$, where in addition for $a\in A$, $u^\varpi(t,s,i,p,a) := u(t,b^\varpi(s,i,p,a),i)$. Then, the optimal testing policy is given for all $t\in[0,T]$ by $\alpha^\varpi_t := a^\varpi\big(t,S_t,I_t,\partial v^\varpi(t,S_t,I_t),D^2v^\varpi(t,S_t,I_t)\big)$, where $a^\varpi:[0,T]\times(\mathbb R_+^\star)^2\times\mathbb R^2\times\mathbb S^2\longrightarrow A$ is the maximiser of the previous Hamiltonian on $a\in A$, if it exists.
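The inner maximisation over $b\in B$ in the first–best Hamiltonian can be illustrated by a simple grid search, in the spirit of the discretised control described in the appendix. The following is only a sketch under stated assumptions: the utility `u` below is a hypothetical quadratic specification chosen for illustration (it is not the form (2.17)), the set $B$ is taken to be $[0,\beta^{\max}]$, and the function name and default grid size are ours.

```python
import math


def grid_argmax_interaction(varpi, s, i, a, p1, p2, beta_max, n_points=200):
    """Grid search for sup over b in [0, beta_max] of
    varpi * u(b, i) - b * s * i * sqrt(a) * (p1 - p2)."""

    def u(b, infected):
        # Hypothetical utility: quadratic cost of reducing b below beta_max,
        # plus a linear disutility from the proportion of infected.
        return -0.5 * (beta_max - b) ** 2 - infected

    slope = s * i * math.sqrt(a) * (p1 - p2)
    best_b, best_value = 0.0, -math.inf
    for k in range(n_points):
        b = beta_max * k / (n_points - 1)  # uniform grid on [0, beta_max]
        value = varpi * u(b, i) - b * slope
        if value > best_value:
            best_b, best_value = b, value
    return best_b, best_value
```

For this particular hypothetical utility, the maximiser is available in closed form as $\beta^{\max}-si\sqrt a(p_1-p_2)/\varpi$ truncated to $[0,\beta^{\max}]$, which gives a convenient check of the grid search.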
Footnote: The boundary of the domain cannot be reached by the processes $S$ and $I$, which is why it is not necessary to specify a boundary condition there. Notice though that the upper bound can formally only be attained when $I$ is constantly $0$, in which case $S$ becomes deterministic, and the government's best choice for $\alpha$ is clearly $1$, and its choice of $Z$ becomes irrelevant. In such a situation, we would immediately have $V^P_0=\underline v$.

5 Extensions and generalisations
We now focus on the SEIR/S (Susceptible–Exposed–Infected–Recovered or Susceptible) compartment model. Again, the class $S$ represents the 'Susceptible' and the class $I$ represents the 'Infected' and infectious. The SEIR and SEIS models are used to describe epidemics in which individuals are not directly contagious after contracting the disease. This therefore involves a fourth class, namely $E$, representing the 'Exposed', i.e., individuals who have contracted the disease but are not yet infectious. With this in mind, we denote by $\iota$ the rate at which an exposed person becomes infectious, which is assumed to be a fixed non–negative constant. Therefore, during the epidemic, each individual can be either 'Susceptible' or 'Exposed' or 'Infected' or in 'Recovery', and $(S_t,E_t,I_t,R_t)$ denotes the proportion of each category at time $t\ge0$. The difference between SEIS and SEIR models is embedded into the immunity toward the disease: for SEIR models, it is assumed that the immunity is permanent, i.e., after being infected, an individual goes and stays in the class $R$, whereas for SEIS models, there is no immunity, i.e., infected individuals come back in the susceptible class at rate $\nu\ge0$, similarly to SIS models. As in the previously described SIR model, we also take into account the demographic dynamics of the population, through the parameters $\lambda$, $\mu$ and $\gamma$. To sum up, the epidemic dynamics is represented in Figure 18.

[Figure 18: SEIR/SEIS model with demographic dynamics. Compartments: Susceptible, Exposed, Infected, Recovery, Death. Flows: births $\lambda\,\mathrm dt$; infection $\beta_tS_tI_t\,\mathrm dt$; onset of infectiousness $\iota E_t\,\mathrm dt$; recovery $\rho I_t\,\mathrm dt$; loss of immunity $\nu I_t\,\mathrm dt$; deaths $\mu S_t\,\mathrm dt$, $\mu E_t\,\mathrm dt$, $(\gamma+\mu)I_t\,\mathrm dt$, $\mu R_t\,\mathrm dt$.]

Similarly to the previous models, we consider that the dynamic of the epidemic is subject to a noise in the estimation of the proportion of susceptible and infected individuals. Inspired by the stochastic model in Mummert and Otunuga [77, Equation (3)], we therefore consider that the dynamics of the epidemic is given by the following system
\begin{align*}
S_t &= s_0 + \int_0^t\big(\lambda-\mu S_s-\beta_s\sqrt{\alpha_s}S_sI_s+\nu I_s\big)\mathrm ds + \int_0^t\sigma\alpha_sS_sI_s\mathrm dW_s, \\
E_t &= e_0 - \int_0^t\big((\mu+\iota)E_s-\beta_s\sqrt{\alpha_s}S_sI_s\big)\mathrm ds - \int_0^t\sigma\alpha_sS_sI_s\mathrm dW_s, \\
I_t &= i_0 - \int_0^t\big((\mu+\nu+\gamma+\rho)I_s-\iota E_s\big)\mathrm ds, \\
R_t &= r_0 + \int_0^t(\rho I_s-\mu R_s)\mathrm ds, \quad\text{for } t\in[0,T]. \tag{5.1}
\end{align*}
Note that the proportion $I$ of infected and infectious is also uncertain, but only through its dependence on $E$, and the proportion $R$ of recovered individuals is uncertain only through its dependence on $I$. More precisely, we assume that there is no uncertainty on the recovery rate $\rho$, the rate $\iota$ at which exposed people become infectious, and the (potential) rate $\nu$ at which an individual loses immunity, implying that if the proportion of exposed individuals is perfectly known, the proportion of infected is also known without uncertainty and consequently the proportion of recovered individuals is also known with certainty. Again this modelling choice is consistent with most stochastic SEIRS models, and emphasises that the major uncertainty in the current epidemic is related to the non–negligible proportion of (nearly) asymptomatic individuals.
Indeed, an asymptomatic individual may be misclassified as susceptible or exposed.

5.1.2 The contracting problem

We will now give (informally) the optimisation problems faced by both the population and the government; the rigorous treatment can be done following the lines of Section 4. The most important change compared to SIS/SIR models is that the criteria should now depend on the sum $E+I$, representing the proportion of the population having contracted the disease, rather than just the proportion $I$ of infectious people. Unless otherwise stated, the notations are those of Section 4.

The problem of the population is now
\[
V^A_0(\alpha,\chi) := \sup_{\beta\in\mathcal B}\mathbb E\bigg[\int_0^T u\big(t,\beta_t,E_t+I_t\big)\mathrm dt + U(-\chi)\bigg], \tag{5.2}
\]
while that of the government becomes
\[
V^P_0 := \sup_{(\alpha,\chi)\in\Xi}\ \sup_{\beta\in\mathcal B^\star(\alpha,\chi)}\mathbb E\bigg[\chi-\int_0^T\big(c\big(E_t+I_t\big)+k(t,\alpha_t,S_t,I_t)\big)\mathrm dt\bigg]. \tag{5.3}
\]
Notice that in the cost function $k$, we did not replace $I$ by $I+E$. This is due to the fact that this cost should scale with the volatility of $I+E$ (see the discussion in Example 2.2), which is still $\sigma\alpha_\cdot(S_\cdot I_\cdot)$ in the model (5.1).

The population's Hamiltonian is given by $H:[0,T]\times(\mathbb R_+^\star)^3\times\mathbb R_+\times\mathbb R^4\times A\longrightarrow\mathbb R$ and defined, for any $(t,s,e,i,r,z^s,z^e,z^i,z^r,a)\in[0,T]\times(\mathbb R_+^\star)^3\times\mathbb R_+\times\mathbb R^4\times A$, by
\[
H(t,s,e,i,r,z^s,z^e,z^i,z^r,a) := \sup_{b\in B}\big\{h(t,s,e,i,z^s,z^e,a,b)\big\} + (\rho i-\mu r)z^r - \big((\mu+\nu+\gamma+\rho)i-\iota e\big)z^i,
\]
where
\[
h(t,s,e,i,z^s,z^e,a,b) := \big(\lambda-\mu s+\nu i-b\sqrt a\,si\big)z^s - \big((\mu+\iota)e-b\sqrt a\,si\big)z^e + u(t,b,i+e), \quad\text{for } b\in B.
\]
Given the supremum appearing above, and similarly to Assumption 4.5, we make the following assumption.

Assumption 5.1.
There exists a unique Borel–measurable map $b^\star:[0,T]\times(\mathbb R_+^\star)^3\times\mathbb R\times A\longrightarrow B$ such that
\[
b^\star(t,s,e,i,z,a)\in\operatorname*{argmax}_{b\in B}\big\{u(t,b,i+e)-b\sqrt a\,siz\big\}, \quad \forall(t,s,e,i,z,a)\in[0,T]\times(\mathbb R_+^\star)^3\times\mathbb R\times A. \tag{5.4}
\]
Therefore, a straightforward adaptation of our earlier arguments will show that every admissible contract will take the form $\chi := -U^{(-1)}(Y_T)$ where
\[
Y_t := Y_0 - \int_0^t\Big(Z_r(\mu+\iota)E_r + u(r,\beta^\star_r,E_r+I_r) - \beta^\star_r\sqrt{\alpha_r}S_rI_rZ_r\Big)\mathrm dr - \int_0^t Z_r\mathrm dE_r, \tag{5.5}
\]
where $\beta^\star_t := b^\star(t,S_t,E_t,I_t,Z_t,\alpha_t)$ for all $t\in[0,T]$ is the optimal control of the population. It thus remains to solve the government's problem. Unlike in the previous SIS/SIR models, there are now four state variables for the government's problem, namely $(S,E,I,Y)$, whose dynamics under the optimal effort of the population is as follows
\begin{align*}
S_t &= s_0 + \int_0^t\big(\lambda-\mu S_s-b^\star(s,S_s,E_s,I_s,Z_s,\alpha_s)\sqrt{\alpha_s}S_sI_s+\nu I_s\big)\mathrm ds + \int_0^t\sigma\alpha_sS_sI_s\mathrm dW_s, \\
E_t &= e_0 - \int_0^t\big((\mu+\iota)E_s-b^\star(s,S_s,E_s,I_s,Z_s,\alpha_s)\sqrt{\alpha_s}S_sI_s\big)\mathrm ds - \int_0^t\sigma\alpha_sS_sI_s\mathrm dW_s, \\
I_t &= i_0 - \int_0^t\big((\mu+\nu+\gamma+\rho)I_s-\iota E_s\big)\mathrm ds, \\
Y_t &= Y_0 - \int_0^t u^\star(s,S_s,E_s,I_s,Z_s,\alpha_s)\mathrm ds + \int_0^t Z_s\sigma\alpha_sS_sI_s\mathrm dW_s, \quad\text{for } t\in[0,T], \tag{5.6}
\end{align*}
where $u^\star(t,s,e,i,z,a) := u\big(t,b^\star(t,s,e,i,z,a),i+e\big)$, for all $(t,s,e,i,z,a)\in[0,T]\times(\mathbb R_+^\star)^3\times\mathbb R\times A$.

We then define the Hamiltonian of the government, for all $t\in[0,T]$, $x := (s,e,y,i)\in\mathbb R^4$ and $(p,M)\in\mathbb R^4\times\mathbb S^4$, by
\begin{align*}
H^P(t,x,p,M) ={}& \sup_{z\in\mathbb R,\,a\in A}\Big\{b^\star(t,s,e,i,z,a)\sqrt a\,si(p_2-p_1) - u^\star(t,s,e,i,z,a)p_3 + \frac12\sigma^2a^2(si)^2f(z,M) - k(t,a,s,i)\Big\} \\
&+ (\lambda-\mu s+\nu i)p_1 - (\mu+\iota)ep_2 - \big((\mu+\nu+\gamma+\rho)i-\iota e\big)p_4 - c(e+i), \tag{5.7}
\end{align*}
where $f(z,M) := M_{11}-2M_{12}+M_{22}-2z(M_{23}-M_{13})+z^2M_{33}$, for all $(z,M)\in\mathbb R\times\mathbb S^4$.
We are then led to consider the following HJB equation
\[
-\partial_tv(t,x) - H^P\big(t,x,\nabla v,D^2v\big) = 0, \quad (t,x)\in\widetilde{\mathcal O}, \tag{5.8}
\]
with terminal condition $v(T,x) := -U^{(-1)}(y)$, and where the domain over which the above PDE must be solved is
\[
\widetilde{\mathcal O} := \big\{(t,s,e,i,y)\in[0,T)\times\mathbb R^3\times\mathbb R : 0<s+i+e<F(t,s_0,i_0+e_0)\big\},
\]
recalling that $F$ is defined by (4.4). Solving (5.8) numerically is considerably more challenging than in the SIS/SIR case, since the dimension of the problem increases. A numerical investigation seems to be complicated as far as we know, and we leave these numerical issues for future research.

There is of course a plethora of generalisations of the models we have considered so far. For instance, in SEIRS (or also SIRS) models, the immunity is temporary, i.e. people in the class $R$ may come back into the class $S$ at rate $\nu$. Using a similar stochastic extension of this model, it is straightforward that all our results extend, mutatis mutandis, to this case as well, albeit with one important difference: the control problem faced by the government now has 5 state variables, namely $(S,E,I,R,Y)$. Even more generally, our approach can readily be adapted to compartmental models considering additional classes: for instance the SIDARTHE ('Susceptible' (S), 'Infected' (I), 'Diagnosed' (D), 'Ailing' (A), 'Recognised' (R), 'Threatened' (T), 'Healed' (H) and 'Extinct' (E)) model investigated in Giordano, Blanchini, Bruno, Colaneri, Di Filippo, Di Matteo, and Colaneri [44] for COVID–19. Of course the price to pay is that the number of state variables in the government's problem will increase with the number of compartments, and numerical procedures to solve the HJB equation will become more delicate to implement, and could be based on neural networks.
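To make the SEIR/S dynamics (5.1) of this section more concrete, the following is a minimal simulation sketch based on an Euler scheme, assuming constant controls $\beta$ and $\alpha$ and purely illustrative parameter values (the function name, the tuple layout of `params` and all numbers used with it are our assumptions, not the calibration of the paper).

```python
import math
import random


def simulate_seir(s0, e0, i0, r0, beta, alpha, params, horizon, n_steps, seed=0):
    """Euler scheme for the stochastic SEIR/S system (5.1) with constant beta and alpha.

    params = (lam, mu, nu, gamma, rho, iota, sigma); all values are illustrative.
    Returns the list of states (S, E, I, R) on the time grid.
    """
    lam, mu, nu, gamma, rho, iota, sigma = params
    rng = random.Random(seed)
    dt = horizon / n_steps
    s, e, i, r = s0, e0, i0, r0
    path = [(s, e, i, r)]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        infection = beta * math.sqrt(alpha) * s * i * dt  # flow from S to E
        noise = sigma * alpha * s * i * dw  # same noise, opposite signs in S and E
        ds = (lam - mu * s + nu * i) * dt - infection + noise
        de = -(mu + iota) * e * dt + infection - noise
        di = (iota * e - (mu + nu + gamma + rho) * i) * dt
        dr = (rho * i - mu * r) * dt
        s, e, i, r = s + ds, e + de, i + di, r + dr
        path.append((s, e, i, r))
    return path
```

Because the Brownian term enters $S$ and $E$ with opposite signs, the total $S+E+I+R$ evolves without noise, and it stays constant when $\lambda=\mu=\gamma=0$; this gives a simple consistency check for the scheme.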
Appendix.
A Simulations for the SIS model
Similar to Section 3, we present in this appendix the numerical results obtained when considering a SIS compartmentalmodel, whose dynamic is given by (2.3), or equivalently by (2.5) with ρ = 0. Choice of parameters.
We take the same parameters as for the SIR case to model the preferences of the government and the population, i.e. the parameters given in Table 1, except for $\beta^{\max}=0.5$. To model the SIS dynamic, we consider a different set of parameters (see Table 3), in order to obtain the same shape for the proportion of infected at the beginning of the epidemic in both cases of an SIR and SIS dynamics. This choice is made to model the fact that, at the beginning of a relatively unknown epidemic such as that of COVID–19, the proportion of infected people is observed (with noise), but the authorities do not necessarily know whether this disease allows immunity to be acquired.

                   T     (s_0, i_0, r_0)           (λ, μ)    γ      ν      ρ    σ      β
SIS model (2.3)    600   (0.…, ….…×10^{−…}, 0)     (0, 0)    0.01   0.04   0    0.…    ….…

A.1 The benchmark case
To solve the benchmark case, we follow the method described in Section 3.3, although we choose here a number of time steps equal to 600, a time step discretisation equal to 0.[…] […] $\beta$ used to maximise the Hamiltonian is discretised with 200 points given a step discretisation of 0.[…] […] $(S,I)$. As for the numerical resolution of the benchmark case for the SIR model, we implement two versions of the resolution, with variables $(S,I)$ or $(S,S+I)$.

[Figure 19: Two simulations of the SIS in the benchmark case. Comparison between the two methods. Panels: optimal effort; % of infected.]
As the numerical results obtained in the benchmark case when the epidemic dynamic is given by a SIS model havethe same characteristics as with the SIR dynamic, we describe the graphs only briefly below.
Figure 19. As in the SIR case, the trajectories of $\beta$ obtained through the two aforementioned resolutions are rather close, and the corresponding trajectories for $I$ coincide.

Figure 20. We plot 100 trajectories of the optimal interaction rate $\beta^\star$, the proportion of susceptible $S$, as well as the proportion of infected $I$. The population's behaviour is similar to the one obtained in the SIR model: first the population behaves as usual, then begins to reduce $\beta$, which finally goes back to its usual values as the epidemic disappears. Once again, the population's fear of infection is not sufficient to prevent the epidemic.

[Figure 20: Optimal trajectories in the benchmark case, for an SIS dynamic. Panels: optimal control $\beta$; proportion $I$ of infected; proportion $S$ of susceptible.]

A.2 The lockdown policy without testing
As for the benchmark case, the numerical method to obtain the optimal lockdown policy is similar to the one used in the case of an SIR dynamics. We only recall here the key points of the method. We first solve (2.22) with the semi–Lagrangian scheme, taking $\underline v$ given by (3.2) and estimated with a Monte Carlo method, and by using an Euler scheme with a time–discretisation of 600 time steps and 10 trajectories. The estimated value for $\underline v$ is −[…]. […] $(s,i,y)$ corresponding to […], […, 10] for the control $Z$, and a step discretisation equal to 0.[…].

Figure 21. We present some trajectories of the optimal controls $\beta$ and $Z$, as well as the resulting proportion $I$ of infected individuals.

Figure 22. We compare on some simulations the optimal transmission rate obtained with the contract to the one obtained in the benchmark case. We see that the tax succeeds in significantly reducing the interaction rate compared to the no–tax policy case.

Figure 23. As a consequence, we see through different simulations of the trajectory of the proportion of infected $I$ that a tax policy contains the spreading of the disease along the considered time period, contrary to the benchmark case. The optimal control thus makes it possible to limit the high values of $I$.

[Figure 21: 500 trajectories of optimal $\beta$, $Z$ and $I$, without testing policy. Panels: optimal transmission rate $\beta$; optimal control $Z$; proportion $I$ of infected.]

[Figure 22: Trajectories of the optimal transmission rate without testing policy. Comparison with the benchmark case. Panels: Simulation 1; Simulation 2; Simulation 3.]

[Figure 23: Proportion $I$ of infected resulting from optimal lockdown without testing policy. Comparison with the benchmark case. Panels: three simulations.]
A.3 The testing policy
Due to the larger terminal time horizon, the computation time is particularly significant. To reduce it, the discretisation used to find the optimal control $Z$ is reduced to 1. The resulting graphs are briefly described below.

Figure 24. We present trajectories of the optimal controls $\beta$, $\alpha$ and $Z$, and the resulting proportion $I$ of infected.

Figure 25. We compare on simulations the optimal proportion of infected with the two previous cases (benchmark case and only tax policy): with testing, the epidemic is now totally under control.

Figure 26. We present simulations of the optimal effective transmission rate in this case, and compare it to the optimal $\beta$ obtained in the benchmark case and without testing policy.

Figure 27. We present simulations of the optimal $\alpha$: its quick variations explain the swift changes in the effective $\beta$.

[Figure 24: 500 optimal trajectories of $\beta$, $\alpha$, $Z$, and $I$, with testing policy. Panels: optimal contact rate $\beta$; optimal testing policy $\alpha$; optimal control $Z$; proportion $I$ of infected.]

[Figure 25: Optimal trajectories of $I$ with and without testing, as well as in the benchmark case. Panels: three simulations.]

[Figure 26: Optimal effective transmission rate $\beta\sqrt\alpha$, compared to the previous cases. Panels: three simulations.]

[Figure 27: Optimal trajectories of the testing policy $\alpha$. Panels: three simulations.]

References

[1] A. Abakuks. An optimal isolation policy for an epidemic. Journal of Applied Probability, 10(2):247–262, 1973.
[2] R. Aïd, D. Possamaï, and N. Touzi. Optimal electricity demand response contracting with responsiveness incentives. arXiv preprint arXiv:1810.09063, 2018.
[3] C. Alasseur, H. Farhat, and M. Saguan. A principal–agent approach to study capacity remuneration mechanisms. arXiv preprint arXiv:1911.12623, 2019.
[4] L.J.S. Allen. An introduction to stochastic epidemic models. In F. Brauer, P. van den Driessche, and J. Wu, editors, Mathematical epidemiology, volume 1945 of Lecture notes in mathematics, pages 81–130. Springer Berlin Heidelberg, 2008.
[5] L.J.S. Allen and A.M. Burgin. Comparison of deterministic and stochastic SIS and SIR models in discrete time. Mathematical Biosciences, 163(2):1–33, 2000.
[6] F.E. Alvarez, D. Argente, and F. Lippi. A simple planning problem for COVID–19 lockdown. Technical Report 26981, National Bureau of Economic Research, 2020.
[7] S. Anand and K. Hanson. Disability–adjusted life years: a critical review. Journal of Health Economics, 16(6):685–702, 1997.
[8] R.M. Anderson and R.M. May. Population biology of infectious diseases: part I. Nature, 280(5721):361–367, 1979.
[9] R.M. Anderson, H. Heesterbeek, D. Klinkenberg, and T.D. Hollingsworth. How will country–based mitigation measures influence the course of the COVID–19 epidemic? The Lancet, 395(10228):931–934, 2020.
[10] N. Bacaër. Un modèle mathématique des débuts de l'épidémie de coronavirus en France. Mathematical Modelling of Natural Phenomena, 15(29):1–10, 2020.
[11] N.T.J. Bailey. A simple stochastic epidemic. Biometrika, 37(3/4):193–202, 1950.
[12] N.T.J. Bailey. The mathematical theory of infectious diseases and its applications. Charles Griffin & Company, London, 2nd edition, 1975.
[13] M.S. Bartlett. Some evolutionary stochastic processes. Journal of the Royal Statistical Society. Series B (Methodological), 11(2):211–229, 1949.
[14] M.S. Bartlett. Deterministic and stochastic models for recurrent epidemics. In J. Neyman, editor,
Proceedings of the thirdBerkeley symposium on mathematical statistics and probability, volume 4: contributions to biology and problems of health ,pages 81–109, 1956.[15] H. Behncke. Optimal control of deterministic epidemics.
Optimal Control Applications and Methods , 21(6):269–285, 2000.[16] E. Beretta, V. Kolmanovskii, and L. Shaikhet. Stability of epidemic model with time delays influenced by stochasticperturbations.
Mathematics and Computers in Simulation , 45(3–4):269–277, 1998.[17] D. Bernoulli. Essai d’une nouvelle analyse de la mortalité causée par la petite vérole, et des avantages de l’inoculationpour la prévenir. In
Histoire de l’Académie Royale des Sciences. Année M . DCCLX . Avec les mémoires de mathématique& de physique, pour la même année, tirés des registres de cette académie , pages 1–45 (Mémoires). Imprimerie Royale,Paris, 1760.[18] K. Bichteler. Stochastic integration and L p –theory of semimartingales. The Annals of Probability , 9(1):49–89, 1981.[19] P. Bolton and M. Dewatripont.
Contract theory . MIT press, 2005.[20] B. Bouchard, D. Possamaï, X. Tan, and C. Zhou. A unified approach to a priori estimates for supersolutions of BSDEsin general filtrations.
Annales de l’institut Henri Poincaré, Probabilités et Statistiques (B), 54(1):154–172, 2018.[21] F. Camilli and M. Falcone. An approximation scheme for the optimal control of diffusion processes.
ESAIM: MathematicalModelling and Numerical Analysis – Modélisation Mathématique et Analyse Numérique , 29(1):97–122, 1995.[22] R. Carmona and P. Wang. Finite–state contract theory with a principal and a field of agents. arXiv preprintarXiv:1808.07942 , 2018.[23] H. Cho, D. Ippolito, and Y.W. Yu. Contact tracing mobile apps for COVID–19: privacy considerations and relatedtrade–offs. arXiv preprint arXiv:2003.11511 , 2020.[24] T. Colbourn. COVID–19: extending or relaxing distancing control measures.
The Lancet Public Health , 5(3):E236–E237,2020.[25] R.M. Corless, G.H. Gonnet, D.E.G. Hare, D.J. Jeffrey, and D.E. Knuth. On the LambertW function.
Advances inComputational mathematics , 5(1):329–359, 1996.[26] J. Cvitanić and H. Xing. Asset pricing under optimal contracts.
Journal of Economic Theory , 173:142–180, 2018.[27] J. Cvitanić and J. Zhang.
Contract theory in continuous–time models . Springer, 2012.[28] J. Cvitanić, D. Possamaï, and N. Touzi. Moral hazard in dynamic risk management.
Management Science , 63(10):3328–3346, 2017.[29] J. Cvitanić, D. Possamaï, and N. Touzi. Dynamic programming approach to principal–agent problems.
Finance andStochastics , 22(1):1–37, 2018.[30] C. Del Rio and P.N. Malani. COVID–19–new insights on a rapidly changing epidemic.
Journal of the American MedicalAssociation , 323(14):1339–1340, 2020.[31] R. Djidjou-Demasse, Y. Michalakis, M. Choisy, M.T. Sofonea, and S. Alizon. Optimal COVID–19 epidemic control untilvaccine deployment. medRxiv 2020.04.02.20049189 , 2020.[32] J. Dolbeault and G. Turinici. Heterogeneous social interactions and the COVID–19 lockdown outcome in a multi–groupSEIR model. arXiv:2005.00049 , 2020.[33] O. El Euch, T. Mastrolia, M. Rosenbaum, and N. Touzi. Optimal make–take fees for market making regulation. arXivpreprint arXiv:1805.02741 , 2018.
34] N. El Karoui and X. Tan. Capacities, measurable selection and dynamic programming part II: application in stochasticcontrol problems. arXiv preprint arXiv:1310.3364 , 2013.[35] R. Élie and D. Possamaï. Contracting theory with competitive interacting agents.
SIAM Journal on Control and Opti-mization , 57(2):1157–1188, 2019.[36] R. Élie, E. Hubert, T. Mastrolia, and D. Possamaï. Mean–field moral hazard for optimal energy demand responsemanagement. arXiv preprint arXiv:1902.10405 , 2019.[37] R. Élie, T. Mastrolia, and D. Possamaï. A tale of a principal and many many agents.
Mathematics of Operations Research ,44(2):440–467, 2019.[38] R. Élie, E. Hubert, and G. Turinici. Contact rate epidemic control of COVID–19: an equilibrium view. arXiv preprintarXiv:2004.08221 , 2020.[39] E.P. Fenichel, C. Castillo-Chavez, M.G. Ceddia, G. Chowell, P.A.G. Parra, G.J. Hickling, G. Holloway, R. Horan, B. Morin,C. Perrings, M. Springborn, L. Velazquez, and C. Villalobos. Adaptive human behavior in epidemiological models.
Proceedings of the National Academy of Sciences of the United States of America , 108(15):6306–6311, 2011.[40] N. Ferguson, D. Laydon, G. Nedjati-Gilani, N. Imai, K. Ainslie, M. Baguelin, S. Bhatia, A. Boonyasiri, Z. Cucunubá,G. Cuomo-Dannenburg, A. Dighe, I. Dorigatti, H. Fu, K. Gaythorpe, W. Green, A. Hamlet, W. Hinsley, L.C. Okell, S. vanElsland, H. Thompson, R. Verity, E. Volz, H. Wang, Y. Wang, P.G.T. Walker, C. Walters, P. Winskill, C. Whittaker, C.A.Donnely, S. Riley, and A.C. Ghani. Report 9: Impact of non–pharmaceutical interventions (NPIs) to reduce COVID–19mortality and healthcare demand. Technical report, Imperial College London, 2020.[41] J.H. Fowler, S.J. Hill, R. Levin, and N. Obradovich. The effect of stay–at–home orders on COVID–19 infections in theUnited States. arXiv preprint arXiv:2004.06098 , 2020.[42] N. Gao, Y. Song, X. Wang, and J. Liu. Dynamics of a stochastic SIS epidemic model with nonlinear incidence rates.
Advances in Difference Equations , 2019(1):41, 2019.[43] H. Gevret, N. Langrené, J. Lelong, X. Warin, and A. Maheshwari. STochastic OPTimization library in C++.
HALpreprint hal-01361291 , 2018.[44] G. Giordano, F. Blanchini, R. Bruno, P. Colaneri, A. Di Filippo, A. Di Matteo, and M. Colaneri. Modelling the covid–19epidemic and implementation of population-wide interventions in italy.
Nature Medicine , 26:855–860, 2020.[45] B.M. Gramig, R.D. Horan, and C.A. Wolf. A model of incentive compatibility under moral hazard in livestock diseaseoutbreak response. Technical report, Michigan State University, 2005.[46] B.M. Gramig, R.D. Horan, and C.A. Wolf. Livestock disease indemnity design when moral hazard is followed by adverseselection.
American Journal of Agricultural Economics , 91(3):627–641, 2009.[47] A. Gray, D. Greenhalgh, L. Hu, X. Mao, and J. Pan. A stochastic differential equation SIS epidemic model.
SIAM Journalon Applied Mathematics , 71(3):876–902, 2011.[48] M. Greenwood. On the statistical measure of infectiousness.
The Journal of Hygiene , 31(3):336–351, 1931.[49] E. Grigorieva, E. Khailov, and A. Korobeinikov. Optimal quarantine strategies for COVID–19 control models. arXivpreprint arXiv:2004.10614 , 2020.[50] N.K. Gupta and R.E. Rink. A model for communicable disease control. In
Proceedings of the 24th annual conference onengineering in medicine and biology, 1971, Las Vegas , volume 13, page 296, 1971.[51] N.K. Gupta and R.E. Rink. Optimum control of epidemics.
Mathematical Biosciences , 18(3–4):383–396, 1973.[52] W.H. Hamer. The Milroy lectures on epidemic disease in England – the evidence of variability and of persistency of type.
The Lancet , 167(4306):655–662, 1906.[53] E. Hansen and T. Day. Optimal control of epidemics with limited resources.
Journal of Mathematical Biology , 62(3):423–451, 2011.[54] J.S. Hatchimonji, R.A. Swendiman, and M.J. Seamon. Trauma does not quarantine: violence during the COVID–19pandemic.
Annals of Surgery , to appear, 2020.
55] B. Holmström and P. Milgrom. Aggregation and linearity in the provision of intertemporal incentives.
Econometrica , 55(2):303–328, 1987.[56] K. Hu, Z. Ren, and N. Touzi. Continuous–time principal–agent problem in degenerate systems. arXiv preprintarXiv:1910.10527 , 2019.[57] M. Ienca and E. Vayena. On the responsible use of digital data to tackle the COVID–19 pandemic.
Nature Medicine , 26(4):463–464, 2020.[58] D.L. Jaquette. A stochastic model for the optimal control of epidemics and pest populations.
Mathematical Biosciences ,8(3–4):343–354, 1970.[59] D. Jiang, J. Yu, C. Ji, and N. Shi. Asymptotic behavior of global positive solution to a stochastic SIR model.
Mathematicaland Computer Modelling , 54(1–2):221–232, 2011.[60] B. Jowett.
Thucydes translated into English, to which is prefixed an essay on inscriptions and a note on the geography ofThucydides , volume I. Oxford University Press, 2nd revised edition, 1900.[61] K. Kandhway and J. Kuri. How to run a campaign: optimal control of SIS and SIR information epidemics.
AppliedMathematics and Computation , 231:79–92, 2014.[62] M. Kantner. Beyond just" flattening the curve": optimal control of epidemics with purely non–pharmaceutical interven-tions. arXiv preprint arXiv:2004.09471 , 2020.[63] D.G. Kendall. Deterministic and stochastic epidemics in closed populations. In J. Neyman, editor,
Proceedings of thethird Berkeley symposium on mathematical statistics and probability, volume 4: contributions to biology and problems ofhealth , pages 149–165, 1956.[64] W.O. Kermack and A.G. McKendrick. A contribution to the mathematical theory of epidemics.
Proceedings of the RoyalSociety of London. Series A , CXV(772):700–721, 1927.[65] D.I. Ketcheson. Optimal control of an SIR epidemic through finite–time non–pharmaceutical intervention. arXiv preprintarXiv:2004.08848 , 2020.[66] I. Kharroubi, T. Lim, and T. Mastrolia. Regulation of renewable resource exploitation.
SIAM Journal on Control andOptimization , 58(1):551–579, 2020.[67] R.J. Kryscio and C. Lefèvre. On the extinction of the S–I–S stochastic logistic epidemic.
Journal of Applied Probability ,26(4):685–694, 1989.[68] J.-J. Laffont and D. Martimort.
The theory of incentives: the principal–agent model . Princeton University Press, 2002.[69] S. Lenhart and J.T. Workman.
Optimal control applied to biological models . Mathematical and computational Biologyseries. Chapman & Hall/CRC, 2007.[70] Q. Li, X. Guan, P. Wu, X. Wang, L. Zhou, Y. Tong, R. Ren, K.S.M. Leung, E.H.Y. Lau, J.Y. Wong, X. Xing, N. Xiang,Y. Wu, C. Li, Q. Chen, D. Li, T. Liu, J. Zhao, M. Liu, W. Tu, C. Chen, L. Jin, R. Yang, Q. Wang, S. Zhou, R. Wang,H. Liu, Y. Luo, Y. Liu, G. Shao, H. Li, Z. Tao, Y. Yang, Z. Deng, B. Liu, Z. Ma, Y. Zhang, G. Shi, T.T.Y. Lam, J.T.Wu, G.F. Gao, B.J. Cowling, B. Yang, G.M. Leung, and Z. Feng. Early transmission dynamics in Wuhan, China, of novelcoronavirus–infected pneumonia.
New England Journal of Medicine , to appear, 2020.[71] Y. Lin, Z. Ren, N. Touzi, and J. Yang. Second order backward SDE with random terminal time. arXiv preprintarXiv:1802.02260 , 2018.[72] Y. Lin, Z. Ren, N. Touzi, and J. Yang. Random horizon principal–agent problem. arXiv preprint arXiv:2002.10982 , 2020.[73] T. Mastrolia and Z. Ren. Principal–agent problem with common agency without communication.
SIAM Journal onFinancial Mathematics , 9(2):775–799, 2018.[74] A.G. McKendrick. Applications of mathematics to medical problems.
Proceedings of the Edinburgh Mathematical Society ,44:98–130, 1925.[75] A.G. McKendrick. The dynamics of crowd infection.
Edinburgh Medical Journal , 47(2):117–136, 1940.[76] R. Morton and K.H. Wickwire. On the optimal control of a deterministic epidemic.
Advances in Applied Probability , 6(4):622–635, 1974.
77] A. Mummert and O.M. Otunuga. Parameter identification for a stochastic SEIRS epidemic model: case study influenza.
Journal of Mathematical Biology , 79(2):705–729, 2019.[78] I. Nåsell. The quasi–stationary distribution of the closed endemic SIS model.
Advances in Applied Probability , 28(3):895–932, 1996.[79] I. Nåsell. On the quasi–stationary distribution of the stochastic logistic epidemic.
Mathematical Biosciences , 156(1–2):21–40, 1999.[80] A. Neufeld and M. Nutz. Measurability of semimartingale characteristics with respect to the probability law.
StochasticProcesses and their Applications , 124(11):3819–3845, 2014.[81] M. Nutz. Pathwise construction of stochastic integrals.
Electronic Communications in Probability , 17(24):1–7, 2012.[82] S. Park, G.J. Choi, and H. Ko. Information technology–based tracing strategy in response to COVID–19 in South Korea– privacy controversies.
Journal of the American Medical Association , to appear, 2020.[83] F. Piguillem and L. Shi. The optimal COVID–19 quarantine and testing policies. Technical report, Einaudi Institute forEconomics and Finance, 2020.[84] A.B. Piunovskiy and D. Clancy. An explicit optimal intervention policy for a deterministic epidemic model.
OptimalControl Applications and Methods , 29(6):413–428, 2008.[85] D. Possamaï, X. Tan, and C. Zhou. Stochastic control for a class of nonlinear kernels and applications.
The Annals ofProbability , 46(1):551–603, 2018.[86] L. Reichert, S. Brack, and B. Scheuermann. Privacy-preserving contact tracing of covid-19 patients. Technical Report2020/375, Cryptology ePrint Archive, 2020.[87] S. Riley, C. Fraser, C.A. Donnelly, A.C. Ghani, L.J. Abu-Raddad, A.J. Hedley, G.M. Leung, L.-M. Ho, T.-H. Lam, andT.Q. Thach. Transmission dynamics of the etiological agent of SARS in Hong Kong: impact of public health interventions.
Science , 300(5627):1961–1966, 2003.[88] R. Ross.
The prevention of malaria . E.P. Dutton & Company, New York, 1910.[89] R. Ross. Some a priori pathometric equations.
British Medical Journal , 1(2830):546–547, 1915.[90] R. Ross. An application of the theory of probabilities to the study of a priori pathometry – part I.
Proceedings of theRoyal Society of London. Series A , 92(638):204–230, 1916.[91] B. Salanié.
The economics of contracts: a primer . MIT press, 2005.[92] J.L. Sanders. Quantitative guidelines for communicable disease control programs.
Biometrics , 27(4):883–893, 1971.[93] Y. Sannikov. A continuous–time version of the principal–agent problem.
The Review of Economic Studies , 75(3):957–984,2008.[94] F. Sassi. Calculating QALYs, comparing QALY and DALY calculations.
Health Policy and Planning , 21(5):402–408,2006.[95] H. Schättler and J. Sung. The first–order approach to the continuous–time principal–agent problem with exponentialutility.
Journal of Economic Theory , 61(2):331–371, 1993.[96] F. Sélley, Á. Besenyei, I.Z. Kiss, and P.L. Simon. Dynamic control of modern, network–based epidemic models.
SIAMJournal on Applied Dynamical Systems , 14(1):168–187, 2015.[97] S.P. Sethi and P.W. Staats. Optimal control of some simple deterministic epidemic models.
Journal of the OperationalResearch Society , 29(2):129–136, 1978.[98] H.E. Soper. The interpretation of periodicity in disease prevalence.
Journal of the Royal Statistical Society , 92(1):34–73,1929.[99] D.W. Stroock and S.R.S. Varadhan.
Multidimensional diffusion processes , volume 233 of
Grundlehren der mathematischenWissenschaften . Springer–Verlag Berlin Heidelberg, 1997.[100] H.M. Taylor. Some models in epidemic control.
Mathematical Biosciences , 3:383–398, 1968. Journal of Benefit–Cost Analysis , to appear, 2020.[102] A.A. Toda. Susceptible–infected–recovered (SIR) dynamics of COVID–19 and economic impact. arXiv preprintarXiv:2003.11221 , 2020.[103] E. Tornatore, S.M. Buccellato, and P. Vetro. Stability of a stochastic SIR system.
Physica A: Statistical Mechanics andits Applications , 354(15):111–126, 2005.[104] N.I. Valeeva and G.B.C. Backus. Incentive systems under ex post moral hazard to control outbreaks of classical swinefever in the Netherlands. Technical report, Agricultural Economics Research Institute and Wageningen University, 2007.[105] X. Warin. Some non–monotone schemes for time dependent Hamilton–Jacobi–Bellman equations in stochastic control.
Journal of Scientific Computing , 66(3):1122–1147, 2016.[106] G.H. Weiss and M. Dishon. On the asymptotic behavior of the stochastic and deterministic models of an epidemic.
Mathematical Biosciences , 11(3–4):261–265, 1971.[107] K.H. Wickwire. Optimal isolation policies for deterministic and stochastic epidemics.
Mathematical Biosciences , 26(3-4):325–346, 1975.[108] A. Wilder-Smith, C.J. Chiew, and V.J. Lee. Can we contain the COVID–19 outbreak with the same measures as forSARS?
The Lancet Infectious Diseases , to appear, 2020.[109] N. Williams. On dynamic principal–agent problems in continuous time. University of Wisconsin, Madison, 2009.[110] R. Zeckhauser and D. Shepard. Where now for saving lives?
Law and Contemporary Problems , 40(4):5–45, 1976.[111] X. Zhang, J. Wu, P. Zhao, X. Su, and D. Choi. Epidemic spreading on a complex network with partial immunization.
Soft Computing , 22(14):4525–4533, 2018., 22(14):4525–4533, 2018.