Optimal Lockdown Policy for Covid-19: A Modelling Study

Abstract

As COVID-19 spreads across the world, prevention measures are becoming essential weapons to combat the pandemic in this period of crisis. The lockdown measure is the most controversial one as it imposes an overwhelming impact on our economy and society. In particular, when and how to enforce lockdown measures are the most challenging questions, considering both economic and epidemiological costs. In this paper, we extend the classic SIR model to find optimal decision making that balances the economy and people's health during the outbreak of COVID-19. In our model, we solve a two-phase optimisation problem: policymakers control the lockdown rate to maximise the overall welfare of society; people in different health statuses make different decisions on their working hours and consumption to maximise their utility. We develop a novel method to estimate parameters for the model through various additional sources of data. We use the Cournot equilibrium to model people's behaviour and also consider the cost of death in order to balance economic and epidemic costs. The analysis of simulation results provides scientific suggestions for policymakers to make critical decisions on when to start the lockdown and how strong it should be during the whole period of the outbreak. Although the model is originally proposed for the COVID-19 pandemic, it can be generalised to address similar problems of controlling the outbreak of other infectious diseases with lockdown measures.



Yuting Fu Haitao Xiang Hanqing Jin Ning Wang

Mathematical Institute, University of Oxford, Oxford, UK


Keywords:

COVID-19, Equilibrium, SIR, Lockdown, Optimal Control

Preprint submitted to Elsevier, February 12, 2021 (arXiv, physics.soc-ph)

1. Introduction

As one of the most devastating pandemics in human history, the current outbreak of COVID-19 has already caused more than one million deaths around the globe. Governments have adopted numerous prevention measures to control the spread of the virus, and these have been studied by [11] and [24]. For example, medical measures, such as research on testing, medicine, and vaccines, have been accelerated; relatively easy measures, like face masking and social distancing, are also widely accepted and applied. Essentially, the most effective prevention of COVID-19 is the lockdown measure, which stops the movement of people and thus slows down the spread of the disease. However, the lockdown measure is highly controversial as it imposes a tremendous impact on our society and economy. Hence it might be the most difficult decision for governments to make. In particular, when and how to impose the lockdown measure is one of the most challenging questions for both politicians and scientists. To address this question, there is a need to develop a mathematical model combining both epidemiology and economics.

Epidemiological models have been widely studied to analyse the dynamics of the pandemic ([21], [22], [25], [26]). However, there is less discussion on how lockdown policies influence the economic decisions of people and the spread of disease, and how policymakers can make optimal policy during the epidemic. [10] and [9] analyse government intervention using epidemiological models with exogenous parameters and evaluate the effect of the intervention by simulation results. Some recent papers focus on the analysis of optimal policy and policy effects in the framework of the SIR model or its variants.
They studied the effect of different measures including fiscal policy ([7], [6], [15]), testing and quarantine ([23], [4], [18], [2]), intervention policy on multi-aged groups ([5], [1], [12]), social distancing ([19], [8], [14]) and lockdown control ([3], [13], [1]). In previous works, [3] studied the optimal lockdown policy that minimises the value of fatalities and the output costs of the lockdown policy by locking down part of the susceptible and infectious population, [1] researched the optimal lockdown policies for people of different age groups, and [13] maximised the economic activity level subject to the burden on the health-care system.

We extend the classic SIR model ([16], [20]) and incorporate an equilibrium framework to study the optimal lockdown policy during the pandemic period. Our innovation relative to previous works is that they all took only the governments' perspective and did not take people's own reaction to the pandemic and the government policy into consideration, while we adopt the extension to the SIR model from [7] by involving people's economic decision making (consumption and working hours) and embed the SIR model in a simple Cournot equilibrium framework to model people's reaction to each other. Different from [7], which studied the optimal containment policy by controlling the tax rate, we control the level of lockdown, which is more direct and effective for governments, especially in the early stage of the pandemic. Furthermore, we emphasise the cost of death in the policymakers' objective, which is an important factor in real-world government decision making. Using this method, the lockdown policy can strike a balance between the impact of the epidemic on the economy and on people's health.

The motivation for this work is to address the following questions that policymakers may face in reality. The main findings are given as short answers to these questions.
• What difference does optimal control make to the economic and health outcomes of the epidemic compared to no control? We find that optimal lockdown measures could significantly reduce the deaths and infections caused by the epidemic. Although there is a short-term recession with lockdown control, it has better long-term economic outcomes than no control.
• How does the timing of starting and ending affect the optimal lockdown control itself as well as its economic and health consequences? Our results suggest that the timing of both starting and ending the lockdown control policy makes a difference to both the economic and epidemic outcomes. It is best to start the control as early as possible, and it is more important to avoid ending the control too early.
• How does the cost of death affect the lockdown control policy and the outcomes? Whether policymakers regard deaths as a negative influence on society leads to different results. Regarding deaths as negative results in a stricter lockdown control policy, which leads to much better epidemic outcomes and slightly worse economic outcomes.
• What if policymakers have additional information on people's health status? Additional information about the health status of people is beneficial, as optimal separate control of people in different health statuses reaches much better economic and epidemic outcomes.

The rest of this paper is organised as follows. In section 2, we describe the SIR-Lockdown model and analyse its properties. In section 3, we discuss the parameter estimation of the model. We present the numerical results of the SIR-Lockdown model in section 4. Section 5 concludes.

2. Model

In this section, we first describe the extension to the canonical SIR model. We then analyse the behaviour of susceptible, infectious, and recovered people with regard to their decisions on consumption and working hours under lockdown regulation and formulate the optimal control problem. Finally, we add the cost of death to our model objective.

As shown in the classic SIR model ([20], [17]), we classify people into three categories according to [16]:
• Infectious (I) are those who have tested positive for the virus;
• Recovered (R) are those who have tested positive for the virus and have now recovered;
• Susceptible (S) are those who have not tested positive for the virus.

We assume that all susceptible people may be infected with some probability through direct contact with infectious people, and infectious people will recover with a constant probability $\pi_r$ or die with another constant probability $\pi_d$. Our extension concerns the infection. All infection happens via direct contact between susceptible and infected people in three types of activities: purchasing and/or consumption of goods and services, working with other people, and other daily activities. A lockdown policy can be applied to control the working contact, and hence change the income flow, which indirectly imposes constraints on purchasing and consumption.

We use the following equations (1) to (5) to describe our extended SIR model for the transitions among the Susceptible, Infectious and Recovered categories and the death outcome.

$$T_t = \pi_{s1}\,(S_t C^s_t)(I_t C^i_t) + \pi_{s2}\,(S_t N^s_t)(I_t N^i_t) + \pi_{s3}\,S_t I_t, \tag{1}$$
$$S_{t+1} = S_t - T_t, \tag{2}$$
$$I_{t+1} = I_t + T_t - (\pi_r + \pi_d) I_t, \tag{3}$$
$$R_{t+1} = R_t + \pi_r I_t, \tag{4}$$
$$D_{t+1} = D_t + \pi_d I_t. \tag{5}$$

In this system of equations, $S_t$, $I_t$, $R_t$ and $D_t$ represent the numbers of people in the Susceptible, Infectious, Recovered and Dead categories respectively at time $t$. We use $(C^s_t, N^s_t)$ to model the (average) consumption and working hours of susceptible people, $(C^i_t, N^i_t)$ to model the (average) consumption and working hours of infectious people, and $(C^r_t, N^r_t)$ to model the (average) consumption and working hours of recovered people.
$T_t$ in equation (1) is the number of newly infected people in the time period $t$ to $t+1$, and the three terms on the right-hand side of this equation describe infection through the three different types of contact between susceptible and infectious people: consumption, working, and other contact.

We use several constant parameters to describe the transition rates between categories. $\pi_{s1}$ is the rate at which a susceptible person is infected by infectious people through direct contact via purchasing/consuming. Similarly, $\pi_{s2}$ is the transition rate from direct contact via working, and $\pi_{s3}$ is the transition rate from other contacts.

Denote $\Delta Y_t = Y_{t+1} - Y_t$ for $Y = S, I, R$; then the dynamics of the SIR model are
$$\Delta S_t = -T_t, \qquad \Delta I_t = T_t - (\pi_r + \pi_d) I_t, \qquad \Delta R_t = \pi_r I_t.$$
We use vectors and matrices to simplify our presentation. Denote $X_t = (S_t, I_t, R_t)^\top$, $C_t = (C^s_t, C^i_t, C^r_t)^\top$, $n_t = (n^s_t, n^i_t, n^r_t)^\top$, and for any $x = (x_1, x_2, x_3)^\top$, $c = (c_1, c_2, c_3)^\top$, $n = (n_1, n_2, n_3)^\top$, define
$$T(x, c, n) = x_1 x_2\,(\pi_{s1} c_1 c_2 + \pi_{s2} n_1 n_2 + \pi_{s3}), \tag{6}$$
$$F(x, c, n) = \big(-T(x, c, n),\; T(x, c, n) - (\pi_r + \pi_d) x_2,\; \pi_r x_2\big)^\top, \tag{7}$$
then the system can be described as
$$\Delta X_t = F(X_t, C_t, n_t). \tag{8}$$

We study the rational behaviour of all people, who maximise their own welfare by choosing consumption and working hours as in normal times, i.e., the virus does not change people's rationality and preferences. We use the following utility function to model the utility of an individual from consumption and working,
$$u(c, n) = \ln c - \frac{\theta}{2} n^2, \tag{9}$$
where $c$ is the consumption and $n$ is the working hours. The first term measures the utility from consumption, and the second term measures the disutility from working.
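The transition equations (1) to (5) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the names `Rates` and `sir_step` are hypothetical, the three transmission parameters are written as `pi_s1`, `pi_s2`, `pi_s3`, and every numerical value below is an assumption chosen only so that the example runs.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    pi_s1: float  # transmission through consumption contacts
    pi_s2: float  # transmission through work contacts
    pi_s3: float  # transmission through other contacts
    pi_r: float   # per-period recovery probability
    pi_d: float   # per-period death probability


def sir_step(state, susceptible, infectious, rates):
    """Advance (S, I, R, D) by one period following equations (1)-(5)."""
    S, I, R, D = state
    c_s, n_s = susceptible  # average consumption and hours of susceptible people
    c_i, n_i = infectious   # average consumption and hours of infectious people
    # Equation (1): new infections from the three contact channels.
    T = (rates.pi_s1 * (S * c_s) * (I * c_i)
         + rates.pi_s2 * (S * n_s) * (I * n_i)
         + rates.pi_s3 * S * I)
    removed = (rates.pi_r + rates.pi_d) * I
    return (S - T, I + T - removed, R + rates.pi_r * I, D + rates.pi_d * I)


# Hypothetical numbers, not estimates from the paper.
rates = Rates(pi_s1=1e-7, pi_s2=1e-4, pi_s3=0.3, pi_r=0.38, pi_d=0.003)
state = (0.999, 0.001, 0.0, 0.0)
next_state = sir_step(state, (550.0, 36.9), (440.0, 36.9), rates)
```

Because every person leaving one category enters another, the four components of `next_state` sum to the same total as `state`, which is a quick sanity check on any implementation of the system.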
Denote by $A$ the average wage per hour of a person; the labour income of an individual with working hours $n$ is then $An$, which is the upper bound of consumption, i.e. $An \ge c$.

Denote by $\bar{n}$ the full working hours in a unit time before the spread of the virus, which is officially guided by the government. It is natural to assume that $\bar{n}$ is set optimally for the society, and this optimality carries information about the parameter $\theta$. If a person follows the full working hours $\bar{n}$ optimally, then her labour income will be $A\bar{n}$. Since the utility function is strictly increasing in consumption, all labour income should be consumed, hence the optimal consumption should be $\bar{c} = A\bar{n}$. Then by the optimality of $\bar{n}$, we have
$$\frac{\partial u(An, n)}{\partial n}\Big|_{n = \bar{n}} = \frac{1}{\bar{n}} - \theta\bar{n} = 0,$$
from which we choose $\theta = 1/\bar{n}^2$.

The total utility of a flow of consumption and working hours $\{(c_\tau, n_\tau)\}_{\tau = t, \dots, T}$ is defined by
$$U(c_\cdot, n_\cdot) = \sum_{\tau = t}^{T} \beta^{\tau} u(c_\tau, n_\tau). \tag{10}$$

To contain the spread of the virus, governments need to apply a lockdown policy to reduce direct contacts between people, which imposes stricter constraints on their behaviour. In this paper, we study the lockdown policy as a constraint on the ratio $L \in [0, 1]$ of working hours to the full working capacity, i.e., given the full working hours $\bar{n}$, the maximal working hours cannot exceed $\bar{n}L$. We suppose the government cannot easily identify which category an individual belongs to, so that the lockdown constraint on working hours is the same for all people. We formulate the decision-making problem for each category with a given lockdown policy $L_\cdot$, and then study the lockdown policy-making problem for the government.

Suppose the lockdown measure $L_t \in [0, 1]$ is given for any time $t$. A recovered person aims at maximising his total utility
$$J^r(c^r_\cdot, n^r_\cdot; t) = \sum_{\tau = t}^{T} \beta^{\tau - t} u(c^r_\tau, n^r_\tau) \tag{11}$$
with the constraints $c^r_\tau \le A n^r_\tau$ and $n^r_\tau \le \bar{n} L_\tau$.

Theorem 2.1.

At time $t$ with state $X_t$ and the lockdown policy $\{L_\tau : \tau \in [t, T]\}$, the optimal $(c^r, n^r)$ is
$$c^{r*}_\tau = A\bar{n}L_\tau, \qquad n^{r*}_\tau = \bar{n}L_\tau, \qquad \tau = t, \dots, T.$$

Proof. Since $\partial J^r(c^r, n^r; t)/\partial c^r_\tau = \beta^{\tau - t}/c^r_\tau > 0$, we have $c^{r*}_\tau = A n^r_\tau$ for all $\tau = t, \dots, T$. Denote $f(c^r, n^r, \lambda^{rn}; t) = J^r(c^r, n^r; t) + \sum_{\tau = t}^{T} \lambda^{rn}_\tau (\bar{n}L_\tau - n^r_\tau)$. Then by the KKT conditions, for all $\tau = t, \dots, T$,
$$\frac{\partial f(c^{r*}, n^r, \lambda^{rn}; t)}{\partial n^r_\tau} = 0 \;\Rightarrow\; \beta^{\tau - t}\Big(-\theta n^r_\tau + \frac{1}{n^r_\tau}\Big) - \lambda^{rn}_\tau = 0,$$
$$\lambda^{rn}_\tau(\bar{n}L_\tau - n^r_\tau) = 0, \qquad \lambda^{rn}_\tau \ge 0.$$
Since $\bar{n}^2\theta = 1$ and $-\theta n + 1/n$ is strictly decreasing in $n$, any $n^r_\tau < \bar{n}$ gives
$$\lambda^{rn}_\tau = \beta^{\tau - t}\Big(-\theta n^r_\tau + \frac{1}{n^r_\tau}\Big) > \beta^{\tau - t}\Big(-\theta\bar{n} + \frac{1}{\bar{n}}\Big) = 0,$$
so by complementary slackness the lockdown constraint binds. Thus $n^{r*}_\tau = \bar{n}L_\tau$ and $c^{r*}_\tau = A\bar{n}L_\tau$ for all $\tau = t, \dots, T$.

Notice that the behaviour of recovered people $(c^r_\cdot, n^r_\cdot)$ plays no role in the spread of the virus, hence the behaviour of recovered people does not affect people in other categories. This is why we start with this easy-to-handle category.

2.2.2. Optimal behaviour of infectious people

Similar to the case of recovered people, infectious people also need to choose their optimal consumption and working hours $\{(c^i_t, n^i_t)\}_{t = 0, 1, \dots, T}$ to maximise their total utility from consumption and working, subject to the constraints that the consumption $c^i_t$ cannot exceed the labour income for the working hours $n^i_t$, and that $n^i_t$ must be no more than the lockdown bound $\bar{n}L_t$.

The labour income of an infectious person is different from that of other categories. Because they are infected, their health condition is usually worse than that of other people. So we introduce a constant $\phi$ to discount their working efficiency, and the labour income from $n$ working hours is $A\phi n$. Furthermore, since an infectious person has a constant probability $\pi_r$ of recovering and a probability $\pi_d$ of dying, we need to calculate the distribution over all categories at a future time. An infectious person at time $t$ has probability $\pi_r$ of recovering in the next unit time, probability $\pi_d$ of dying, and the remaining probability $1 - \pi_r - \pi_d$ of staying in the infected category. From this evolution, we can get the conditional probabilities for his health state at a future time $\tau > t$.
Denote by $p_{i,i}(t, \tau)$ the probability of him still being infected, $p_{i,r}(t, \tau)$ the probability of being recovered, and $p_{i,d}(t, \tau)$ the probability of being dead. Then we can deduce that
$$p_{i,i}(t, \tau) = (1 - \pi_r - \pi_d)^{\tau - t}, \tag{12}$$
$$p_{i,r}(t, \tau) = \frac{\pi_r\big(1 - (1 - \pi_r - \pi_d)^{\tau - t}\big)}{\pi_r + \pi_d}, \tag{13}$$
$$p_{i,d}(t, \tau) = \frac{\pi_d\big(1 - (1 - \pi_r - \pi_d)^{\tau - t}\big)}{\pi_r + \pi_d}. \tag{14}$$
If he recovers, he should behave optimally as a recovered person, while if death has unfortunately happened, we cease the accumulation of any utility. So, for a given flow $(c^i_\cdot, n^i_\cdot)$ of consumption and working hours taken by the infectious person from time $t$, the accumulated utility he can get is
$$J^i(c^i_\cdot, n^i_\cdot; t) = \sum_{\tau = t}^{T} \beta^{\tau - t}\big[p_{i,i}(t, \tau)\,u(c^i_\tau, n^i_\tau) + p_{i,r}(t, \tau)\,u(c^{r*}_\tau, n^{r*}_\tau)\big], \tag{15}$$
where $(c^{r*}, n^{r*})$ is the optimal behaviour of a recovered person determined in the previous case, and $p_{i,i}$, $p_{i,r}$ are as defined in equations (12) and (13).

Theorem 2.2.

Given the lockdown policy $L_\cdot$, the optimal $(c^i, n^i)$ is
$$c^{i*}_\tau = A\phi\bar{n}L_\tau, \qquad n^{i*}_\tau = \bar{n}L_\tau, \qquad \tau = t, \dots, T. \tag{16}$$

Proof. Since $\partial J^i(c^i, n^i; t)/\partial c^i_\tau = \big(\beta(1 - \pi_r - \pi_d)\big)^{\tau - t}/c^i_\tau > 0$, we have $c^{i*}_\tau = A\phi n^i_\tau$ for all $\tau = t, \dots, T$. Denote $f(c^i, n^i, \lambda^{in}; t) = J^i(c^i, n^i; t) + \sum_{\tau = t}^{T} \lambda^{in}_\tau(\bar{n}L_\tau - n^i_\tau)$. Then by the KKT conditions, for all $\tau = t, \dots, T$,
$$\frac{\partial f(c^{i*}, n^i, \lambda^{in}; t)}{\partial n^i_\tau} = 0 \;\Rightarrow\; \big(\beta(1 - \pi_r - \pi_d)\big)^{\tau - t}\Big(-\theta n^i_\tau + \frac{1}{n^i_\tau}\Big) - \lambda^{in}_\tau = 0,$$
$$\lambda^{in}_\tau(\bar{n}L_\tau - n^i_\tau) = 0, \qquad \lambda^{in}_\tau \ge 0.$$
Since $\bar{n}^2\theta = 1$ and $-\theta n + 1/n$ is strictly decreasing in $n$, any $n^i_\tau < \bar{n}$ gives
$$\lambda^{in}_\tau = \big(\beta(1 - \pi_r - \pi_d)\big)^{\tau - t}\Big(-\theta n^i_\tau + \frac{1}{n^i_\tau}\Big) > \big(\beta(1 - \pi_r - \pi_d)\big)^{\tau - t}\Big(-\theta\bar{n} + \frac{1}{\bar{n}}\Big) = 0,$$
so by complementary slackness the lockdown constraint binds. Thus $n^{i*}_\tau = \bar{n}L_\tau$ and $c^{i*}_\tau = A\phi\bar{n}L_\tau$ for all $\tau = t, \dots, T$.

Different from the recovered case, the behaviour of an infectious person $(c^i, n^i)$ is involved in our extended SIR model for the spread of the virus, and hence makes the decision problem for susceptible people much harder.

The decision planning for a susceptible person from time $t$ is much more complicated if we consider the possibilities for this person to become infectious, recovered, or dead at different future times. We avoid this complexity by taking advantage of the optimal value function for an infected person, and model the objective function of a susceptible person recursively.

As for the previous two categories, we start from time $t$ and pick a susceptible person. Denote the state of the SIR model at the starting time as $X_t$, and suppose the lockdown policy $L_\cdot$ is fully given. Suppose he follows a given flow of consumption and working hours $(c^s_\tau, n^s_\tau)_{\tau = t, t+1, \dots, T}$ before being infected, and then follows the optimal behaviour after being infected, i.e., his consumption and working hours after infection switch to the optimal control for an infected person from the infection time.
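The optimal value of an infected person, which the recursion for susceptible people relies on, follows directly from equations (12) to (15) and Theorems 2.1 and 2.2. The sketch below is illustrative only: the function names are hypothetical, the utility is assumed to take the form $\ln c - (\theta/2) n^2$ implied by the first-order condition $1/n - \theta n = 0$, the recovered branch is assumed to enter the sum with a positive weight, and all parameter values (including `A`, `phi` and `beta`) are assumptions, not the paper's estimates. Lockdown levels must be strictly positive so that the logarithm is defined.

```python
import math


def utility(c, n, theta):
    """Per-period utility, assumed to be ln c - (theta / 2) * n**2."""
    return math.log(c) - 0.5 * theta * n * n


def health_probs(k, pi_r, pi_d):
    """Equations (12)-(14): (infected, recovered, dead) probabilities after k periods."""
    stay = (1.0 - pi_r - pi_d) ** k
    leave = 1.0 - stay
    return stay, pi_r * leave / (pi_r + pi_d), pi_d * leave / (pi_r + pi_d)


def infected_value(t, L, A, n_bar, theta, phi, beta, pi_r, pi_d):
    """Value of an infectious person from period t under the lockdown path L."""
    total = 0.0
    for tau in range(t, len(L)):
        p_ii, p_ir, _ = health_probs(tau - t, pi_r, pi_d)
        n = n_bar * L[tau]                    # both groups work at the lockdown bound
        u_i = utility(A * phi * n, n, theta)  # Theorem 2.2: c = A * phi * n
        u_r = utility(A * n, n, theta)        # Theorem 2.1: c = A * n
        total += beta ** (tau - t) * (p_ii * u_i + p_ir * u_r)
    return total


# Hypothetical inputs, used only to exercise the functions.
pi_r, pi_d = 0.38, 0.003
policy = [0.8] * 10
value = infected_value(0, policy, A=15.0, n_bar=36.9, theta=1 / 36.9 ** 2,
                       phi=0.8, beta=0.999, pi_r=pi_r, pi_d=pi_d)
```

The three probabilities returned by `health_probs` sum to one for every horizon, and in the final period the value collapses to the one-period utility of an infectious person.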
We denote his objective value as
$$J^s(c^s_\cdot, n^s_\cdot; t, X_t, L_\cdot) = u(c^s_t, n^s_t) + \beta\tau_t J^{i*}(t+1, L_\cdot) + \beta(1 - \tau_t) J^s(c^s_\cdot, n^s_\cdot; t+1, X_{t+1}, L_\cdot), \tag{17}$$
$$J^s(c^s_\cdot, n^s_\cdot; T, X_T, L_\cdot) = u(c^s_T, n^s_T), \tag{18}$$
where $\tau_t = \pi_{s1}\bar{n}A\phi I_t L_t c^s_t + \pi_{s2}\bar{n} I_t L_t n^s_t + \pi_{s3} I_t$ is the probability of a susceptible person being infected in the next unit time, $J^{i*}(t+1, L_\cdot)$ is the optimal objective value achievable for an infected person starting from time $t+1$, and $X_{t+1}$ is the SIR state at time $t+1$ resulting from people's behaviour $(c^s_t, n^s_t, c^{i*}_t, n^{i*}_t, c^{r*}_t, n^{r*}_t)$ and the time $t$ state $X_t$.

Now it is natural that we aim at maximising the objective $J^s(c^s_\cdot, n^s_\cdot; t, X_t, L_\cdot)$ over feasible control flows $(c^s_\cdot, n^s_\cdot)$, i.e., the optimal behaviour of a susceptible person is the solution of the optimisation
$$\max\; J^s(c^s_\cdot, n^s_\cdot; t, X_t, L_\cdot) \quad \text{s.t.}\quad c^s_\tau \le A n^s_\tau,\; n^s_\tau \le \bar{n}L_\tau,\; \forall \tau \in \{t, t+1, \dots, T\}. \tag{19}$$

Theorem 2.3.

At time $t$ with state $X_t$ and the lockdown policy $\{L_\tau : \tau \in [t, T]\}$, the optimal $(c^s, n^s)$ satisfies
$$c^{s*}_\tau = A n^{s*}_\tau, \qquad \tau = t, \dots, T. \tag{20}$$

Proof. We fix the lockdown policy $L_\cdot$ and omit it when no confusion will arise. Denote the value function as $V(t, X_t) = J^s(c^{s*}_\cdot, n^{s*}_\cdot; t, X_t, L_\cdot)$. According to the dynamic programming principle, $V$ must satisfy
$$V(t, X_t) = \max_{c^s_t \le A n^s_t,\; n^s_t \le \bar{n}L_t} \big[u(c^s_t, n^s_t) + \beta\tau_t J^{i*}(t+1, L_\cdot) + \beta(1 - \tau_t) V(t+1, X_{t+1})\big]$$
$$= u(c^{s*}_t, n^{s*}_t) + \beta\tau^*_t J^{i*}(t+1, L_\cdot) + \beta(1 - \tau^*_t) V(t+1, X^*_{t+1}),$$
where $\tau^*_t$ and $X^*_{t+1}$ are the corresponding infection probability and time $t+1$ state of the SIR model.

If $c^{s*}_t < A n^{s*}_t$, then, because $\tau_t$ is strictly increasing in both $c^s$ and $n^s$, we can find a value $m \in (c^{s*}_t, A n^{s*}_t)$ and construct another control $c^s_t = m$ and $n^s_t = m/A$ such that the corresponding $\tau_t$ is the same as $\tau^*_t$, hence $X_{t+1}$ is also the same as $X^*_{t+1}$. But since $c^s_t > c^{s*}_t$ and $n^s_t < n^{s*}_t$, we have $u(c^s_t, n^s_t) > u(c^{s*}_t, n^{s*}_t)$, which contradicts the optimality of $(c^{s*}, n^{s*})$ in the dynamic programming principle.

With the optimal behaviour in each category under a given lockdown policy $L_\cdot$, we can formulate the optimal policy-making problem as an optimal control problem.

Suppose we start the lockdown problem from some time $t_0$ with the contamination state $X_{t_0}$ given by $S_{t_0} = s$, $I_{t_0} = i$ and $R_{t_0} = r$; then the optimal lockdown policy solves the optimal control problem
$$\max_{L_\cdot} J(L_\cdot; t_0, X_{t_0}) = \sum_{t = t_0}^{T} \beta^{t - t_0}\big[S_t u(c^{s*}_t, n^{s*}_t) + I_t u(c^{i*}_t, n^{i*}_t) + R_t u(c^{r*}_t, n^{r*}_t)\big], \tag{21}$$
where $(c^{ca*}_t, n^{ca*}_t)$ are the optimal consumption and working hours for people in category $ca$ ($ca$ can be $s$, $i$ or $r$), which are all determined in the previous optimisation problems.

In the previous objective $J$, we leave out all deaths. In reality, since death from disease has a strong negative impact on a household as well as on society, regulators should not ignore any death.
We include the strong impact of deaths by introducing a penalty term into the objective
$$J_\lambda(L_\cdot; t, X_t) = \sum_{\tau = t}^{T} \beta^{\tau - t}\big[S_\tau u(c^{s*}_\tau, n^{s*}_\tau) + I_\tau u(c^{i*}_\tau, n^{i*}_\tau) + R_\tau u(c^{r*}_\tau, n^{r*}_\tau) - \lambda D_\tau u(c^{r*}_\tau, n^{r*}_\tau)\big]. \tag{22}$$
In this new objective, we measure the cost of a death by a multiple of the optimal utility of a recovered person, and the multiple is $\lambda > 0$. When $\lambda = 0$, $J_\lambda$ reduces to our previous objective $J$.

With this new objective, the problem for a regulator is to solve
$$\max_{L_\cdot} J_\lambda(L_\cdot; t, X_t), \quad \text{s.t.}\quad L_t \in [0, 1]\;\; \forall t \in [0, T]. \tag{23}$$

In Problem (23), or its reduced version (21), the optimal decisions of individuals in all three categories are involved. Fortunately, the optimal decisions of recovered and infectious people are trivial thanks to the structure of the model, which leaves us to tackle the optimal decision problem (19) for susceptible people before Problem (23).

We start our solving scheme by tackling Problem (19) with a given lockdown policy $L_\cdot$. Because of the lockdown constraint, it is almost hopeless to get an explicit solution. We solve this optimal control problem numerically in the same way as in [7]. In this approach, the optimal control at each time step is regarded as a static optimisation with two constraints, from the consumption budget and from the lockdown policy on working hours, and solutions are obtained by solving the corresponding KKT conditions.

With the optimal control $(c^{s*}_\cdot, n^{s*}_\cdot)$ as functions of the lockdown policy $L_\cdot$, we treat the optimal control problem (23) as an optimisation over the high-dimensional space $[0, 1]^T$ and solve it by the gradient-based interior-point method used in the Matlab function fmincon. Although we have no theoretical proof of the convergence of our scheme, our numerical results show that it converges.

Parts of the code in our scheme are from [7].
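To make the structure of Problem (23) concrete, the following Python sketch evaluates the penalised objective (22) along a lockdown path and picks the best constant lockdown level from a coarse grid. It is a deliberately simplified stand-in, not the scheme described above: a grid search over constant policies replaces the interior-point search over $[0, 1]^T$, susceptible people are assumed to work at the lockdown bound and consume all income instead of solving Problem (19), the utility is assumed to be $\ln c - (\theta/2) n^2$, and every name and parameter value is hypothetical.

```python
import math


def utility(c, n, theta):
    """Per-period utility, assumed to be ln c - (theta / 2) * n**2."""
    return math.log(c) - 0.5 * theta * n * n


def welfare(L, lam, x0, par):
    """Objective (22) along the path L; susceptible people are assumed to work at the bound."""
    S, I, R, D = x0
    total = 0.0
    for t, L_t in enumerate(L):
        n = par["n_bar"] * L_t
        c, c_i = par["A"] * n, par["A"] * par["phi"] * n
        u_s = u_r = utility(c, n, par["theta"])
        u_i = utility(c_i, n, par["theta"])
        total += par["beta"] ** t * (S * u_s + I * u_i + R * u_r - lam * D * u_r)
        # Equations (1)-(5) with the behaviour above.
        T = S * I * (par["pi_s1"] * c * c_i + par["pi_s2"] * n * n + par["pi_s3"])
        S, I, R, D = (S - T, I + T - (par["pi_r"] + par["pi_d"]) * I,
                      R + par["pi_r"] * I, D + par["pi_d"] * I)
    return total


def best_constant_lockdown(lam, x0, par, horizon, grid):
    """Grid search over constant lockdown levels (a stand-in for fmincon)."""
    return max(grid, key=lambda level: welfare([level] * horizon, lam, x0, par))


# Hypothetical parameters, not the estimates of section 3.
par = {"A": 15.0, "n_bar": 36.9, "theta": 1 / 36.9 ** 2, "phi": 0.8, "beta": 0.999,
       "pi_s1": 1e-7, "pi_s2": 1e-4, "pi_s3": 0.3, "pi_r": 0.38, "pi_d": 0.003}
grid = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
x0 = (0.999, 0.001, 0.0, 0.0)
level = best_constant_lockdown(10.0, x0, par, 100, grid)
```

With no infectious people in the initial state the sketch selects the loosest level in the grid, since lockdown then only reduces income; with infections present, the chosen level depends on the assumed transmission rates and on `lam`.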

3. Model Parameters

In this section, we study how to estimate the parameters in our model from real data, and apply the method in an example with COVID-19 data in the UK to get the numerical results for optimal lockdown control.

In our model, we have quite a lot of parameters, and some of them are well-estimated and available from different sources. Let us start from the easily accessible ones.

For the extended SIR model, without loss of generality, we standardise the total population to $N = 1$, which makes $S_t$, $I_t$, $R_t$ and $D_t$ the proportions of each category in the total population.

The unit of a time step is not an essential parameter; we simply count time in weeks.

$\pi_r$ and $\pi_d$ in the extended SIR model can be easily estimated from historical data, which has been done in several data sources like the HPCC covid19 data cluster. In our example, we use the estimation from [7].

$\pi_{s1}$, $\pi_{s2}$ and $\pi_{s3}$ are complicated to estimate, and we defer their discussion until after all the easy ones.

For the characterisation of the decision making of individuals, we still need the parameters $\bar{n}$, $\theta$, $\beta$, $A$, and $\phi$. Most of them are quite flexible, and in our […]

Footnote: In fact, when we apply the numerical scheme proposed in [7] to our problem, the derivative used in the KKT condition is not correct due to the absence of a complicated term from the third term in equation (17). We decide to ignore this absence for the following two reasons: (1) if we recover this complicated term, the calculation will be extremely complicated; (2) from real data in the COVID-19 pandemic, we know the coefficient in the third term ($\beta\tau_t$) is very close to 0, which is also observed in our numerical results.

[…] $\pi_{s1}$, $\pi_{s2}$, $\pi_{s3}$. At any time $t$, we have $\pi_{s1} c^s_t c^i_t + \pi_{s2} n^s_t n^i_t + \pi_{s3} = \pi_t$, where $\pi_t$ is the transmission rate in the classic SIR model. Similar to $\pi_r$ and $\pi_d$, the quantity $\pi_t$ is also available from different data sources such as the HPCC covid-19 data cluster. To estimate $\pi_{s1}$, $\pi_{s2}$, $\pi_{s3}$, we choose two different time spots $t_1$ and $t_2$.
The first time spot $t_1$ can be any time between the onset of the spread of the virus and the first lockdown measure, and the second time spot $t_2$ must be in a period when a lockdown measure was applied. With the observations of $\pi_{t_1}$ and $\pi_{t_2}$, we have
$$\pi_{s1} A^2 \bar{n}^2 + \pi_{s2} \bar{n}^2 + \pi_{s3} = \pi_{t_1},$$
$$\pi_{s1} A^2 \bar{n}^2 L_{t_2}^2 + \pi_{s2} \bar{n}^2 L_{t_2}^2 + \pi_{s3} = \pi_{t_2},$$
where $L_{t_2}$ is an estimate of the actual lockdown rate at time $t_2$. These two equations are not enough to determine the values of the three parameters; we need one more equation. In the case (as happened in the UK) that no different (non-null) lockdown measures have been applied, the third equation is officially not available. So we assume that π_s[…] n̄[…] × […] × 16 = π_s[…]. This equation comes from the assumption that susceptible people spend about 1/[…] $\pi_{s1}$, $\pi_{s2}$ and $\pi_{s3}$.

We take COVID-19 in the UK as our example, which started in the year 2019. The only lockdown took place on 23 March 2020 and was lifted in July 2020.

For the estimation of $\pi_{s1}$, $\pi_{s2}$ and $\pi_{s3}$, we need to specify some other parameters. We choose $t_1$ to be a time in January 2020 and $t_2$ to be some time in April 2020.

The government released experimental results of the pilot Office for National Statistics (ONS) online time-use study (collected 28 March to 26 April 2020 across Great Britain) compared with the 2014 to 2015 UK time-use study, which reported the working-not-from-home time. According to the study, the average daily time (in minutes) of working not from home is 97.[…] […] $L_{t_2}$ = 97.[…]/[…] ≈ […] […], thereby we set $\bar{n}$ = 36.9. According to $\bar{n}^2\theta = 1$, we set $\theta$ = 0.[…] […]6% from [7]. As in [7], we assume that each infected case takes 18 days on average to either recover or die. Since our model is weekly, we have $\pi_d$ = 0.[…] × 7/18 and $\pi_r = 7/18 - \pi_d$.

The reproduction number $R$ at time $t_1$ in January 2020 is around 1.95 without control measures, and between 0.[…] […]; we use the middle point 0.[…] as $R$ for the calculation of $\pi_{t_2}$. Since in the classic SIR model $R = \beta/\gamma$, where $\beta$ and $\gamma$ are the infection and recovery rates, […] $\pi_{t_1} = 1.95 \times 7/18$ and $\pi_{t_2}$ = 0.[…] × 7/18. Given a published average annual income $A$ = 15.[…] […] $\pi_{s1}$, $\pi_{s2}$, $\pi_{s3}$), we get the solution $\pi_{s1}$ = 1.[…] × 10^−[…], $\pi_{s2}$ = 1.[…] × 10^−[…], $\pi_{s3}$ = 0.[…].

Footnotes:
HPCC systems covid19: https://covid19.hpccsystems.com/
Coronavirus wikipedia: https://en.wikipedia.org/wiki/Coronavirus_disease_2019
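The conversion from an 18-day infection period and a reproduction number to weekly model rates can be written out as a short calculation. The sketch below assumes that $\gamma$ is the total weekly exit rate $7/18$ from the infectious category and that the death rate is a fixed share of that exit rate; the function name is hypothetical and the mortality value passed in is a placeholder, not the figure used in the paper.

```python
def weekly_rates(r_number, mortality, days_infected=18.0):
    """Convert a reproduction number and a mortality share into weekly SIR rates."""
    removal = 7.0 / days_infected  # pi_r + pi_d: weekly exit rate from the infectious state
    pi_d = mortality * removal     # share of exits that are deaths
    pi_r = removal - pi_d          # remaining exits are recoveries
    pi_t = r_number * removal      # transmission rate from R = pi_t / removal
    return pi_t, pi_r, pi_d


# R = 1.95 is the January 2020 value quoted in the text; 0.005 is a placeholder mortality.
pi_t1, pi_r, pi_d = weekly_rates(1.95, mortality=0.005)
```

Under these assumptions the weekly transmission rate before lockdown is $1.95 \times 7/18$, and the recovery and death rates always add up to $7/18$.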

By the value $\bar{n}$ = 36.9, we take $\theta = 1/(36.9)^2$. Finally, we copy the value $\phi$ = 0.[…]
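The calibration of $\theta$ from the full working hours can be checked numerically. This is a small sketch under the assumption that utility at full income is $\ln(An) - (\theta/2) n^2$, consistent with the first-order condition $1/\bar{n} - \theta\bar{n} = 0$; the function names and the wage value `A` are hypothetical.

```python
import math


def calibrate_theta(n_bar):
    """theta for which n_bar maximises ln(A * n) - (theta / 2) * n**2."""
    return 1.0 / n_bar ** 2


def utility_at_full_income(n, A, theta):
    """Utility when all labour income A * n is consumed."""
    return math.log(A * n) - 0.5 * theta * n * n


n_bar = 36.9
theta = calibrate_theta(n_bar)  # about 7.34e-4
A = 15.0                        # hypothetical wage; it does not affect the optimum
best = utility_at_full_income(n_bar, A, theta)
```

Because the objective is strictly concave in `n`, working one hour more or one hour less than `n_bar` gives a strictly lower value than `best`, whatever wage is assumed.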

4. Numerical Results

In this section, we present the results of our numerical experiments under the parameter setting in section 3.2. We do experiments to analyse the impact of the optimal lockdown control policy, the policy when different levels of the cost of death are taken into consideration, early exit from and late start of the lockdown policy, and finally the smart containment policy. For every experiment, the initial state is $(S, I, R)$ = (0.[…], […], 0) and the time horizon is 100 weeks.

As Figure 1 (a), (d) (page 21) shows, if there is no lockdown control, i.e. the lockdown rate is constantly 1 for all time, then under our parameter setting, around 15% of the population will be infected, 0.3% of the population will die and the peak of infection will be above 0.6% at week 50. Under the optimal lockdown control, the proportion of infectious people decreases to 5.[…] × 10^−[…] at week 50, then rises to 2.[…] × 10^−[…] at week 100. 0.37% of the population will become infected and 0.[…] […]8% and reduces the number of deaths by 97.[…] […]6% with the optimal lockdown measure.

In Figure 1 (f) (page 21), the optimal lockdown rate starts from around 80%, then is gradually released to above 95%; the speed of the increase of the lockdown rate first decreases until around week 50, then increases until week 100.

The increase of the infected proportion is because our model has a finite time horizon, and does not take the consequences beyond the 100-week horizon into consideration. In the beginning, the aggregate consumption under optimal lockdown control is 20% less but becomes 8.2% more than that of no control in the end. The reason that the optimal lockdown control policy does not cause a severe recession might be that in the no-control case, susceptible people cut back their working hours, as well as their consumption, as the infected population increases, whereas the optimal lockdown control restricts the infected population so that susceptible people do not cut back their consumption as much. We proved in section 2 that recovered and infectious people will work as much as possible in order to maximise their own utility, but the behaviour of susceptible people is not certain. In the parameter setting of our experiments, susceptible people work almost as much as possible, just as infected and recovered people do, but slightly reduce their working hours below the upper bound of the lockdown constraint near the end of the time horizon; this behaviour may be due to the increase of the infected proportion, which raises the risk of infection for susceptible people.

In general, the optimal lockdown policy saves lives and is more robust in economic recovery; it brings long-term health benefits and economic growth at the cost of a short-term recession.

In this subsection, we study how the severity of death as regarded by the planner affects the optimal lockdown policy. We set the penalty coefficient of death λ in (17) as 0 , , , 50, which means the death of one person is regarded by the planner as the loss of 0 , , , 50 recovered people. When λ = 0, the model is the same as the original optimal control model.

Our results in Figure 2 (page 22) show that adding a penalty on deaths makes a huge difference. It significantly slows down the increase of the lockdown rate (Figure 2 (f) (page 22)) and thus reduces the proportion of deaths to a great extent: 76 . , . , . 2% respectively (Figure 2 (d) (page 22)). It also avoids the substantial rise of the infectious population (Figure 2 (a) (page 22)). These outcomes are beneficial in terms of the mental impact on society, as low numbers of deaths and infections relieve the pressure on both the public and the planner. As the penalty coefficient increases, the optimal policy becomes steadily stricter. The relation between the death penalty coefficient and the resulting optimal control rate is sublinear. As Figure 2 (f) (page 22) shows, although the optimal lockdown policies with different death penalty coefficients start with quite different lockdown rates, 0 . , . , . 46 for penalty coefficient 10 , , 50 respectively, they quickly become close. At the end of control, the aggregate consumption, as well as the lockdown rate, of the optimal policy with the death penalty is extremely close to that of the policy without the death penalty. Compared to the original optimal lockdown policy, there is a slight recession when adding a penalty on the number of deaths: the average aggregate consumption decreases by 3 . , . , .

5% for penalty coefficient 10 , ,

Practically, policymakers may be under intense pressure from economic loss that forces them to end the containment policy in the middle of the pandemic. In this subsection, we discuss the consequences of doing so. As we see in section 4.1, the infected population reaches its lowest point at week 50, which may seem to be a good time to end the lockdown policy.

Our results in Figure 3 (e) (page 23) show that there is an instant bounce of consumption right after the end of lockdown control, but this causes an instant rise of the infectious population (Figure 3 (a) (page 23)), and at the end, the infectious population is 72 times larger than that of week 50. The burst of infection would result in a recession of 10.8% from the peak at the end (Figure 3 (e) (page 23)).

So, ending the lockdown policy prematurely may not bring long-term economic benefit and, what is worse, it would result in a substantial additional number of deaths. Therefore we suggest that policymakers avoid terminating the lockdown policy during the pandemic in pursuit of only a short-term economic benefit.
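The rebound described above can be illustrated with a toy weekly SIR-D recursion. This is only a sketch under stated assumptions, not the optimisation model of this paper: the function `simulate`, the parameter values `beta`, `gamma`, `mu`, and `i0`, the constant lockdown rate of 0.8, and the rule that new infections scale with the square of the lockdown rate are all hypothetical choices made for illustration.

```python
# Toy weekly SIR-D model (hypothetical parameters, not the paper's estimates).
# lockdown[t] in [0, 1] is the share of normal activity allowed in week t;
# as an assumption, new infections scale with its square because both
# susceptible and infectious people reduce their activity.

def simulate(lockdown, beta=0.6, gamma=0.39, mu=0.002, i0=0.001):
    """Return a list of (S, I, R, D) population shares, one entry per week."""
    s, i, r, d = 1.0 - i0, i0, 0.0, 0.0
    path = [(s, i, r, d)]
    for rate in lockdown:
        new_infections = beta * rate ** 2 * s * i
        new_recovered = gamma * i
        new_deaths = mu * i
        s -= new_infections
        i += new_infections - new_recovered - new_deaths
        r += new_recovered
        d += new_deaths
        path.append((s, i, r, d))
    return path


WEEKS = 100
sustained = simulate([0.8] * WEEKS)                      # lockdown kept for all weeks
ended_early = simulate([0.8] * 50 + [1.0] * (WEEKS - 50))  # lockdown lifted at week 50

if __name__ == "__main__":
    print("infected share at week 50:", ended_early[50][1])
    print("peak infected share after lifting:", max(p[1] for p in ended_early[50:]))
    print("deaths by week 100, sustained lockdown:", sustained[-1][3])
    print("deaths by week 100, lockdown ended early:", ended_early[-1][3])
```

In this toy setting the two paths coincide up to week 50, after which lifting the lockdown lets infections grow again. The magnitudes depend entirely on the placeholder parameters and should not be compared with the figures reported in this section.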

Policymakers could also face situations in which something prevents them from taking the lockdown measure in the early stage of the pandemic.

Our results in Figure 4 (f) (page 24) show the optimal lockdown policy that starts at week 13 (around 3 months later). Compared to the optimal lockdown policy that starts at week 0 with the lockdown rate 0.8, the late-started optimal lockdown policy starts with a stricter constraint rate of 0 . . 8% by week 100.

In general, the earlier the lockdown control policy starts, the better; and although a late start brings additional loss, it is much better than applying no containment policy or abandoning it too early.

Vaccination is an effective method of preventing infectious diseases. We now incorporate vaccination into the SIR model. Assume that in each time period a fixed number of susceptible people, $\delta_v$ of the starting population, get a vaccination that prevents them from getting COVID-19, and assume that governments bear the cost of vaccination. Once susceptible people are vaccinated, they are regarded as recovered. Thus the objective value of susceptible people becomes:

$$J^{s}(c^{s}_{\cdot}, n^{s}_{\cdot}; t, X_t, L_{\cdot}) = u(c^{s}_{t}, n^{s}_{t}) + \beta \left(1 - \frac{\delta_v}{S_t}\right) \tau_t J^{i*}(t+1, L_{\cdot}) + \beta \left(1 - \frac{\delta_v}{S_t}\right) (1 - \tau_t) J^{s}(c^{s}_{\cdot}, n^{s}_{\cdot}; t+1, X_{t+1}, L_{\cdot}) + \beta \frac{\delta_v}{S_t} J^{r*}(t+1, L_{\cdot}) \tag{24}$$

With the cost of vaccination denoted as $p$, the optimal control problem of policymakers becomes:

$$\max_{L_{\cdot}} J(L_{\cdot}; t, X_t) = \sum_{k=t}^{T} \beta^{k-t} \left[ S_k u(c^{s*}_{k}, n^{s*}_{k}) + I_k u(c^{i*}_{k}, n^{i*}_{k}) + R_k u(c^{r*}_{k}, n^{r*}_{k}) - p \delta_v \right] \tag{25}$$

We set $\delta_v = 1/104$ in simulation. Results in Figure 7 (a), (b), (d) (page 27) show that vaccination could eliminate the epidemic without a rebound of infection and reduce the number of deaths compared to the case without vaccination. Figure 7 (e), (f) shows that vaccination reduces the severity of the recession and leads to a less strict optimal lockdown control.

4.6. Smart Lockdown Control Policy

In the lockdown control policies we have studied so far, the government chooses the same lockdown rate for all three kinds of people (susceptible, infectious, and recovered). In this subsection, we consider smart containment, by which we mean that the policymaker directly chooses working hours for each of the three kinds of people, with the same objective function as in the previous models. There is no need to apply any lockdown to recovered people, because their utility is maximised when their working hours are at the maximum and they do not affect the utility or the transition of susceptible and infectious people.

Our results show that under the smart lockdown control policy, infectious people almost do not work at the beginning, but the planner then gradually increases their working hours as the infected population decreases rapidly, and susceptible people work almost without fear of becoming infected. Figure 5 (page 25) shows that, compared to the previous optimal lockdown control policy, the smart lockdown policy is much better: it reduces the number of deaths to a great extent and almost avoids the recession, because the proportion of infectious people is extremely small. Implementing a smart lockdown control policy requires the planner to know the health status of all people and to have control over their working hours. In reality, knowledge of people’s status requires measures such as medical testing and relies on the accuracy of that testing. Our results suggest that measures and information that help implement a smart lockdown policy are beneficial for social welfare.

The reproduction number (R) is now a basis for some governments to make decisions in reaction to the pandemic. We present the R of the lockdown policies in all our experiments in Figure 6 (page 26). The R of the smart lockdown policy is much smaller than that of all other lockdown policies. The R of lockdown policies with the same lockdown rate for all three kinds of people behaves similarly to their lockdown rate, whereas in the no control case R decreases steadily, and the R of the smart lockdown policy behaves similarly to the lockdown rate of infectious people. This is because the behaviour of all three kinds of people is in accordance with the lockdown rate under the optimal lockdown control policies, while the behaviour of susceptible people varies if there is no control. In the smart lockdown case, the lockdown rate of susceptible and recovered people remains almost constant at 1 throughout the control process, so its R behaves in accordance with the lockdown rate of infectious people. Notice that although the R in the no control case decreases below 1, and the R of lockdown control policies with or without the death penalty increases above 1, policies with lockdown control are much better than no control, as analysed in previous subsections. We therefore suggest that whether R is larger or smaller than 1 cannot be the only foundation for planners to make judgements or decisions on the current situation.
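As a rough illustration of why R on its own can mislead, consider a textbook SIR setting rather than the full model of this paper. The transmission rate $\beta$, the removal rate $\gamma + \mu$, and the assumption that infections scale with the square of a common lockdown rate $L_t$ are all hypothetical simplifications; the paper's own definition of R is not restated in this section.

```latex
% Hypothetical textbook approximation, not the paper's definition of R.
% beta: transmission rate; gamma + mu: removal rate (recovery plus death);
% L_t: common lockdown rate in [0, 1]; S_t: susceptible share of the population.
\[
  R_t \approx \frac{\beta \, L_t^{2} \, S_t}{\gamma + \mu}
\]
```

Under these assumptions, $R_t$ can fall for two very different reasons: a stricter lockdown (smaller $L_t$) or a depleted susceptible pool (smaller $S_t$). In this simplified setting without behavioural responses, the no control case has $L_t = 1$, so $R_t$ drops below 1 only after many people have already been infected, whereas under lockdown $S_t$ stays high, so $R_t$ can climb back above 1 as $L_t$ is relaxed even though cumulative infections are lower.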

5. Conclusion

In this paper, we extend the canonical epidemiological SIR model to find optimal decisions that balance the economy against people’s health. In our model, people in different health statuses take different decisions on their working hours and consumption to maximise their own utility, while policymakers control the lockdown rate to maximise the overall welfare, which leads to a two-phase optimisation problem. Several parameters in our model are not straightforward to specify using the common epidemic data for modelling. We develop a novel method of parameter estimation through various additional sources of data. Our results show that lockdown measures could effectively reduce the deaths and infections caused by COVID-19. There is an inevitable trade-off between the short-term recession and the health problems caused by the pandemic, and how policymakers deal with this could lead to very different decisions. We quantify the trade-off by emphasising the cost of death in the model objective, which enables the optimal lockdown policy to find a balance between the economic and epidemic outcomes. The timing of starting and ending the lockdown control policy makes a large difference in terms of both the economic and epidemic outcomes: the earlier the control starts, the better the results will be, and it is crucial to avoid ending the control prematurely. In the analysis of the smart containment policy, the results suggest that additional information about the health status of people is beneficial, as the optimal lockdown control policy reaches much better outcomes if it can be implemented separately on people with different health statuses. Through comparison of lockdown policies, we suggest that R cannot be the only foundation for policy-making.

[Figure: no control vs. optimal control, plotted against Weeks. Panels: (a) Infected, I; (b) Susceptibles, S; (c) Recovered, R; (d) Deaths, D (vertical axis: Share of Initial Population); (e) Aggregate Consumption, C (vertical axis: % Dev. from Initial Steady State).]