[PDF] A control theory approach to optimal pandemic mitigation

Abstract

In the framework of homogeneous susceptible-infected-recovered (SIR) models, we use a control theory approach to identify optimal pandemic mitigation strategies. We derive rather general conditions for reaching herd immunity while minimizing the costs incurred by the introduction of societal control measures (such as closing schools, social distancing, lockdowns, etc.), under the constraint that the infected fraction of the population does never exceed a certain maximum corresponding to public health system capacity. Optimality is derived and verified by variational and numerical methods for a number of model cost functions. The effects of immune response decay after recovery are taken into account and discussed in terms of the feasibility of strategies based on herd immunity.

Full PDF

AA control theory approach to optimal pandemic mitigation

P. Godara, S. Herminghaus, K. M. Heidemann

Max Planck Institute for Dynamics and Self-Organization, Am Fassberg 17, 37073 G¨ottingen, Germany (Dated: October 19, 2020)In the framework of homogeneous susceptible-infected-recovered (SIR) models, we use a controltheory approach to identify optimal pandemic mitigation strategies. We derive rather general condi-tions for reaching herd immunity while minimizing the costs incurred by the introduction of societalcontrol measures (such as closing schools, social distancing, lockdowns, etc.), under the constraintthat the infected fraction of the population does never exceed a certain maximum correspondingto public health system capacity. Optimality is derived and veriﬁed by variational and numericalmethods for a number of model cost functions. The eﬀects of immune response decay after recoveryare taken into account and discussed in terms of the feasibility of strategies based on herd immunity.

I. INTRODUCTION

The recent outbreak of the illness COVID-19, causedby the SARS-CoV-2 virus, has resulted in a pandemicwith unprecedented impact on societies all over the globe.Mitigation measures included complete lockdowns of so-cietal life, with severe psychic, social, and economic con-sequences [1, 2]. Hence a major focus of governments ison designing containment strategies which are as mild aspossible, but substantial enough to limit the severity ofthe outbreak in order not to overwhelm the health ser-vice system (HSS). This requires reliable forecast, basedon careful collection of data on the fraction of infectedcitizens, as well as extensive simulation [2, 3].We discuss the system in terms of a so-called SIRmodel [4], referring to the fraction of susceptible (S), in-fected (I), and recovered (R) citizens in the population.We identify the recovered with all those who are neithersusceptible nor infected ( R = 1 − S − I ); the dynamicsare thus fully described by a set of two equations: ∂ t S = − βSI ,∂ t I = βSI − Iτ , (1)where S, I ∈ [0 ,

1] are the fraction of susceptible andinfected individuals in the population, respectively. Notethat I ( t ) denotes the actual fraction of acutely infectedcitizens at time t , no matter whether or not the infectionhas been realized by the individual, or has even beenrecorded. The infection of a susceptible individual by aninfected is described by the infection rate β >

0, while τ is the average duration of the infection of an individualuntil her recovery.The task we address in this study is to limit, duringthe whole period of the pandemic, the current numberof infected individuals such as to prevent the number ofthose needing intensive care from exceeding the capacityof the deployed HSS. Such control may be described bya control parameter α ( t ), which quantiﬁes the eﬀect ofmitigation strategies upon the infection rate. We maywrite β = β (1 − α ), such that α = 0 and α = 1 corre-spond to usual societal life and complete mutual isolationof citizens, respectively. FIG. 1. Sketch of the space of possible mitigation measures(grey shade), spanned by their eﬀect on the infection rate, α , and their socio-economic cost, f ( α ). Normal societal lifeis at the origin, while the upper right corner corresponds tototal mutual isolation of all citizens, which is the strongestpossible intervention. The dashed and dotted curves are pos-sible choices for mitigation measures; such curves correspondto the cost functions referred to in the manuscript. In order to deﬁne α in a general way, we state that acertain value of α = 1 − β/β denotes the subset of allpossible mitigation measures which lead to an infectionrate β ≤ β . We thus do not need to refer to any spe-ciﬁc measures, but can formulate our approach in a verygeneral way. The (more or less) accurate determinationof these subsets is then the task of careful social (e.g.,infection history) data analysis among citizens. This isillustrated in Fig. 1, where mitigation strategies, followedby the public authorities, are indicated by the dashedand dotted curves, within a space spanned by the eﬀectof the measures upon the infection rate, α , and the costincurred for economy and society as a whole, f ( α ).As indicated by the explicit time dependence of α ( t ),we follow a control theoretic approach. This is in somecontrast to earlier treatments which have assumed miti-gation measures to be constant over time [5–8]. Instead,we aim at determining the optimal function α ( t ) whichminimizes the impact on society, while at the same timeavoiding the HSS to become overloaded. At the end ofthe mitigation scenario, herd immunity shall be reached,so that the epidemic comes to an end without further a r X i v : . [ q - b i o . P E ] O c t control. We do not consider potential vaccination scenar-ios here (as is done elsewhere [9, 10]), so immunizationcan only be achieved via infection with the virus in thepresent study.A dimensionless quantity which is frequently used inepidemiology is the reproduction number , R := βτ S ,which denotes the average number of susceptibles in-fected by one infected individual. At the beginning ofan epidemic ( S ≈

1) and without mitigation measuresdeployed ( β = β ), one observes the basic reproductionnumber R = β τ . In case of the COVID-19 pandemic,typical estimates are R ≈ τ ≈ ten days [1, 12].It has been shown before that the inherently random,network-like structure of interactions between individu-als mainly results in a readjustment of R [13]. Hence wefollow a mean ﬁeld approach, disregarding small scale in-homogeneities of the system. We consider a homogeneousscenario, where β ( t ) depends on time, but is spatiallyconstant. Deﬁning a normalized time variable, θ := βt ,we can rewrite Eq. (1) as ∂ θ S = − SI ,∂ θ I = SI − IR (1 − α ) . (2)The trajectory of the system in the S - I -plane, I ( S ), canbe obtained by dividing the equations displayed in Eq. (2)by each other. This yields dIdS = 1 R (1 − α ) S − , (3)which has the solution: I ( S ) = ln SR (1 − α ) − S + C , (4)where C is a constant of integration. When no mitigationmeasures are in place ( α = 0), we have I (1) = 0 and thusobtain I ( S ) = ln SR − S + 1 . (5)This is plotted as the dashed curve in Fig. 2 for the case R = 3. The maximum turns out to occur at S peak =1 /R , where it reaches a value of I peak = 1 − ln R R − R . (6)At S = S peak = 1 /R , the population has reached herdimmunity since from then on the number of infected cit-izens decreases until zero.If the disease is serious, one is faced with the prob-lem that with a fraction of I peak people being infected,the number of those in need of hospitalization or even in-tense care may exceed the capacity of the HSS. We denoteby I h < I peak the maximum fraction of infected citizenswhich can be managed by the HSS. It is limited by in-frastructural aspects, such as the availability of staﬀ or FIG. 2. Trajectories of the system in the (

S, I )-plane, for R = 3. Dashed curve: trajectory with no mitigation. Itstarts at ( S, I ) = (1 , I h (here we have set I h = 0 . α ( t ). The corresponding characteris-tic of α ( S ) follows the dash-dotted curve in the intermediateregime, which we call phase II. the size and areal density of hospitals, and is indicated bythe horizontal dotted line in Fig. 2. Any substantial over-shoot of the dashed curve over the dotted line constitutesa catastrophe, as a major fraction of the population willthen not receive proper health care or treatment. Thismust clearly be avoided by means of suitable measures,such as reducing mutual contacts between individuals,banning major assemblies, reducing mobility etc., thusreducing the infection rate. Such measures are describedby the parameter α ( t ), which is to be discussed next. II. OPTIMAL CONTROL PROBLEM

It is clear that the aforementioned measures will havea more or less substantial impact on society, mainlythrough their detrimental eﬀects on economy, but alsothrough other societal (e.g., cultural) damage. This maybe described by means of a cost functional , K { α } := (cid:90) f ( α ( t )) dt , (7)where the cost function f corresponds to the mitigationstrategy chosen, i.e., the curve chosen in Fig. 1. It de-notes the cost incurred at a given control α , along withthe assumption ∂f /∂α ≥ ∀ α ; later we will require ∂ f∂α ≥ α ( t ), suchthat the impact on society, as described by K , is beingminimized under the constraint that I ( t ) never exceeds I h (capacity of the HSS) and at the end of mitigation—at unknown terminal time t e —herd immunity is reached,i.e., S ( t e ) = 1 /R (end of phase II in Fig. 2). We thusneed to minimize K { α } = (cid:90) t e f ( α ( t )) dt , such that ˙ x ( t ) = h ( x , α ( t )) , x (0) = x ,I ( t ) ≤ I h , S ( t e ) = 1 /R , (8)where x = ( S, I ), and h ( x , α ( t )) as given by Eq. 1 (with β = β (1 − α )). Solving Eq. 8 can be recast into mini-mization of the following functional: J := (cid:90) t e f ( α ( t )) + λ ( t ) · [ ˙ x ( t ) − h ( x , α ( t ))]+ µ ( t )( I − I h ) dt , (9)where λ ( t ) = ( λ S ( t ) , λ I ( t )), µ ( t )) are Lagrange multi-pliers. The introduction of µ ( t ) for the inequality con-straint introduces additional constraints on µ ( t ), namely µ ( t ) ≥ µ ( t )( I ∗ − I h ) = 0. These are also known as KKT condi-tions [15]. The star ( ∗ ) represents the optimal quantities.Additionally, S ( t e ) = 1 /R and x (0) = x need to be en-forced.The necessary conditions for optimality can be evalu-ated by setting the ﬁrst variation of Eq. 9 to zero (for adetailed derivation see supporting information), we ob-tain: f ( α ∗ ( t ∗ e )) − λ ( t ∗ e ) · h ( x ∗ ( t ∗ e ) , α ∗ ( t ∗ e )) = 0 , (10)˙ λ ( t ) = − λ ( t ) · ∇ x h | x ∗ ( t ) + µ ( t ) ∇ x ( I − I h ) | x ∗ ( t ) , (11) ∂ α f | α ∗ ( t ) − λ ( t ) · ∂ α h | α ∗ ( t ) = 0 , (12) λ I ( t ∗ e ) = 0 , (13) µ ( t ) ≥ , (14) µ ( t )( I ∗ − I h ) = 0 . (15)In addition to these, one also has the optimal system dy-namics ˙ x ∗ ( t ) = h ( x ∗ , α ∗ ( t )). The necessary conditionsbecome suﬃcient conditions if h ( x , α ) and f ( α ) are con-vex in x and α [16]. The former can be checked to be validfor the SIR model and the latter implies that ∂ f∂α ≥ λ ( t ) and µ ( t ) which simultaneously satisfy the above conditions.This task usually involves a numerical search for the ini-tial conditions of the Lagrange multipliers and evolve thesystem of ODE’s until the terminal conditions given byEqs. 10 and 13 are met. We escape the numerical dif-ﬁculties arising with this procedure by ﬁrst guessing asolution and then ﬁnding the appropriate Lagrange mul-tipliers which verify optimality. A. Heuristic approach

Let us ﬁrst consider what is necessary to keep the frac-tion of infected citizens at a constant value, I c . Since S varies with time, dI/dt = 0 entails dI/dS = 0, and hencefrom Eq. 3 we obtain α ( t ) = 1 − R S ( t ) . (16)This is indicated by the dash-dotted curve in Fig. 2. Notethat α ( t ) does not depend on the value of I c .Next we consider the cost function for proceeding fromsome S = S to some S = S < S while maintaining I = I c . Inserting Eq. 16 in Eq. 1, we ﬁnd dS = − I c dtτ . (17)We use this substitution to express the cost function as K = t ( S ) (cid:90) t ( S ) f ( α ( t )) dt = τI c S (cid:90) S f ( α ( S )) dS . (18)Hence if S and S are ﬁxed, I c must be as large aspossible to minimize K . This now guides our heuristic:we should control α such as to maintain I = I h for aslong as possible.If our guess is valid, the trajectory we have to fol-low in order to optimally control the pandemic is theone indicated as the solid curve in Fig. 2. It starts at( S, I ) = (1 ,

0) and proceeds until I = I h is reached. Thiscompletes phase I of the process, during which we set α = 0. Mitigation measures are then deployed, such that α jumps upwards to the dash-dotted curve. It followsthat curve all through phase II, hence keeping I = I h constant. As S decreases, α is gradually reduced untilit reaches zero at the end of phase II. All through phaseIII, α is maintained at zero, while I gradually decays tozero because R <

1. This ends the pandemic.

B. Validation of the solution

We now proceed to verify our heuristic solution. Wefocus on phase II, as this is where the pandemic will spendthe most amount of time. To do this we ask the question:Is it true that if the pandemic starts with I = I h , thenfor all S > /R and for all the cost functions f ( α ( t ))such that ∂f∂α , ∂ f∂α ≥

0, optimal pandemic control mustkeep I ( t ) = I h until S ( t e ) = R is reached?As we will see, the answer to the above question isyes. We proceed by setting α ∗ ( t ) = 1 − R S ∗ ( t ) and S ∗ ( t ) = S − I h τ t for t ∈ [0 , t ∗ e ] and t ∗ e = ( S − R ) τI .The terminal conditions for the dynamics are given by x ∗ ( t ∗ e ) = ( R , I h ). We can substitute the terminal con-ditions in Eq. 10 and get the terminal condition for λ S (the terminal condition for λ I is given by Eq. 13). Thetask remains to ﬁnd µ ( t ) such that Eqs. 11 and 12 aresatisﬁed simultaneously.Let’s have a look at Eqs. 11 and 12 after making thesubstitutions. We have˙ λ S ( t ) = I h τ S ∗ ( t ) [ λ S − λ I ] , ˙ λ I ( t ) = 1 τ λ S + µ ( t ) , (19)and ∂f∂α (cid:12)(cid:12)(cid:12)(cid:12) α ∗ ( t ) = ( λ S ( t ) − λ I ( t )) βI h S ∗ ( t ) . (20)One can hence ﬁnd the Lagrange parameter µ ( t ) as µ ( t ) = − λ S τ + ∂ f∂α R S ∗ . (21)Lastly, there is also the issue of non-negativity of µ . Ifwe assume the convexity of f we have ∂ f∂α ≥

0. Hencethe second summand in Eq. 21 is non-negative. ∂f∂α ≥ λ S is monotonically increasing (Eqs. 19,20). If the cost of zero control is zero, then the terminalcondition Eq. 10 implies that λ S ( t ∗ e ) = 0, thereby imply-ing λ S ( t ) ≤ , t ∗ e ]. This shows that fortime independent cost functions f under the assumptionsthat ∂f∂α , ∂ f∂α ≥ f (0) = 0 our heuristic solution isoptimal in phase II. III. NUMERICAL RESULTS

We have shown that an optimal trajectory starting onthe boundary ( I = I h ) remains on that boundary. Toobtain optimal control trajectories for arbitrary initialconditions, we perform direct numerical optimization us-ing the software library PSOPT [17]. In Fig. 3 we showthe numerical solutions to the control problem Eq. (8)for various cost functions f i ( α ( t )) ∈ { α ( t ) , α ( t ) , α ( t ) } .Clearly, in all scenarios the optimal trajectory I ∗ reaches the threshold value I h and remains there until S ∗ ( t ∗ e ) is reached (phase II). Phase I, however, depends onthe cost function applied. For linear costs, α ( t ) = 0 until I = I h [18]. With higher order cost terms, we observenon-zero control from the very beginning (see Fig. 3).This is to reduce the amount of time spent at large con-trol values α and thereby the total integrated costs. Theoptimal terminal time t ∗ e increases with the order of thecost function (see Fig. 3). We should note, however, thatthe inﬂuence of the functional form of f ( α ( t )), as ex-pressed in the diﬀerent shapes of the numerically derivedcurves, is minute, since the time axis is logarithmic, andthe deviations are noticeable only during a very smallfraction of time. Hence we see that the inﬂuence of thecost function, which corresponds to the chosen mitiga-tion strategy, is ﬁnite, but can be regarded as negligible for practical purposes. . . . . . I / I h . . . . α − − t / τ . . . . . R R ∞ cubic costsquadratic costs linear costsheuristic FIG. 3. Optimal control trajectories for diﬀerent costfunctions f i ( α ( t )) ∈ { α ( t ) , α ( t ) , α ( t ) } . The corre-sponding optimal terminal times t ∗ e are determined as { . τ, . τ, . τ } . I = 0 . I h = 0 . R = 3, S ( t ∗ e ) = R − . R ∞ is the asymptotic reproduction numberfor t → ∞ , given by R ∞ = − W (exp( − − R I h )), with theLambert W function. IV. DURATION OF THE PANDEMIC

If immune response acquired after recovery from aninfection is permanent, the pandemic will last until herdimmunity is reached at the end of phase II. This is when S ( t ) = S = 1 /R , as indicated by the left vertical dottedline in Fig. 2. This is the start of phase III, in whichthe number of infected citizens decays with no mitigationmeasures anymore in place (i.e., at α = 0). We will nowdiscuss the time we expect it to take until this point isreached. Using Eq. (17) with I c = I h , we can write dt = − τI h dS (22)and hence for the total duration of the pandemic, T ,until herd immunity is reached, T = − τI h S (cid:90) S dS ≈ τI h (cid:18) − R (cid:19) . (23)Here we have exploited the fact that in all cases of practi-cal relevance, I h will be very small as compared to unity.Consequently, the duration of phase I will be very smallas compared to phase II, such that the evaluation of thetrue duration of phase I is of minor importance. As avery good approximation, we have simply set S = 1and neglected the impact of α ( t ) on the dynamics for theshort period of phase I. A. Inﬂuence of immune response decay

The introductory discussion was based on the idea thatrecovered patients stay immune for all times. However, itis well known that for some diseases, in particular of theSARS-CoV type, the immune response tends to decay af-ter some time [19]. Hence there is some ﬁnite probabilitythat recovered patients become susceptible again.We now assume that the transition from the recoveredto the susceptible state can be described as a Poisson pro-cess. In other words, we assume the probability that arandomly chosen, formerly infected citizen becomes sus-ceptible in a time interval [ t, t + dt ] to be proportionalto dt and independent of t . This modiﬁes the dynamicalsystem (1) to ∂ t S = − βSI + ρ (1 − S − I ) ,∂ t I = βSI − Iτ , (24)with β = β (1 − α ), and ρ the average life time of theimmune state, averaged over all formerly infected indi-viduals. Note that we conceptually include those whofell victim to the disease and thus do not become suscep-tible again. Their contribution to the average resuscep-tibilization frequency is zero, which merely increases theaverage immune lifetime, ρ . From the data in [19], weﬁnd that after three years the average IgG immune re-sponse against SARS-CoV had decayed to 55.6 percent.For a corresponding Poissonian process we can estimate ρ ≈

931 days.In Eq. 24, we see immediately that the conditions tofulﬁl ∂ t I = 0 have not changed with respect to Eq. 1.Hence the optimal control trajectory still obeys α ( t ) =1 − R S ( t ) . In phase II, with optimal control, we obtain ∂ t S = − I h τ + 1 ρ (1 − I h ) − ρ S (25)with the solution S ( t ) = I h (cid:16) ρτ (cid:17) (cid:104) e − t/ρ − (cid:105) + 1 . (26)Again, herd immunity (and hence the end of the pan-demic) is reached when S = 1 /R , at a time we call T r ,referring to resusceptibilization (i.e., decaying immuneresponse). Inserting this into Eq. 26 yields1 − e − T r /ρ = 1 − /R I h (1 + ρ/τ ) (27) FIG. 4. (a) The normalized duration of the pandemic, T r /T ,as a function of the variable X = T / ( τ + ρ ). (b) Solid curves:The minimum required health system capacity ˆ I h to reachherd immunity (Eq. 30) as a function of the duration of im-munity after recovery, for diﬀerent values of R (from 1 . . . R → ∞ . Circlesrepresent the scenario for ρ = 93 τ . Open: I h = 0 .

01. Closed: I h = 0 . and hence T r = − ρ ln (cid:20) − − /R I h (1 + ρ/τ ) (cid:21) . (28)Although this looks rather awkward, it can be rewrit-ten conveniently in terms of the pandemic duration, T ,which we would ﬁnd for inﬁnite ρ . Deﬁning the variable X = T / ( τ + ρ ), we ﬁnd T r T = − X ln [1 − X ] . (29) X is the total duration of the pandemic if no loss ofimmunity occurs, divided by the average time it takesfor a patient from her infection to the loss of immunityafter recovery, τ + ρ . If immunity lasts very much longerthan T , X is small. In this case, the logarithm in Eq. 29can be expanded and we recover T r ≈ T . If, however, ρ is . . . . . . . I . . . . . . . . S I h ˆ I h ( I ∞ , S ∞ ) ( I , S ) R ˙ S = 0˙ I = 0 FIG. 5. Phase portrait, ( ˙ I ( S, I ) , ˙ S ( S, I )), for the uncontrolledSIR model ( α = 0) with ﬁnite immune response (Eq. 24). Thesolid green curve shows a trajectory in phase III, with initialconditions I = I h and S = 1 /R (circle). The dashed curves(orange) show the nullclines, ˙ I = 0 for S = 1 /R or I = 0, and˙ S = 0 for S = (1 − I ) / ( IR ρ/τ + 1). The stable ﬁxed point(diamond) is given by I ∞ = ˆ I h , S ∞ = 1 /R . Parameters: R = 3, ρ/τ = 93, I h = 0 . of order T (remember that T (cid:29) τ in all relevant cases), T r diverges. This behaviour is summarized in Fig. 4a,in which X is chosen as the abscissa. We see that theduration of the pandemic becomes uncomfortably largewhen the total time from infection to resusceptibilization, τ + ρ , comes close to the pandemic duration with inﬁniteimmunity, T (vertical dotted line).We might now ask how many acute infections thehealth system must be able to deal with in order to reachherd immunity at all. This can be derived by demandinglim t →∞ S ( t ) = 1 /R in Eq. 26. It is readily shown thatthe health system capacity required for reaching herd im-munity is given by ˆ I h = 1 − R ρτ . (30)This is shown in Fig. 4b for diﬀerent values of R . Thedotted curve represents the (unrealistic) limiting case R −→ ∞ .Only with inﬁnite immune response lifetime ( ρ → ∞ ),we observe an exponential decay of I after herd immu-nity has been reached (see also Fig. 3). To understandthe long time dynamics after mitigation (phase III) forﬁnite ρ , we draw the phase portrait (see Fig. 5). Thereexist two ﬁxed points, ( I , S ) = (0 , R ≥

1, and ( I ∞ , S ∞ ) = ( ˆ I h , /R ), a stable ﬁxed pointfor R ≥ I >

0, the uncontrolled system will approachthe stationary state, ( I ∞ , S ∞ ). Interestingly, the station-ary fraction of infected coincides with the minimal HSScapacity ˆ I h needed to reach herd immunity.We have thus shown that given I h > ˆ I h , herd immunity can be reached in ﬁnite time during mitigation phaseII. After mitigation measures have been released (phaseIII), I converges to ˆ I h in the long time limit. Moreover, I remains below I h (see Fig. 5) since re-entering the regime I < I h from above would require a trajectory to crossitself (not possible for an autonomous system of ODEs(Eq. 24) with unique solutions). V. A FEW SCENARIOS

Let us ﬁnally consider a few scenarios for some typicalparameters, as we have in the current COVID-19 pan-demic. The maximum number of known acute infectionsin Germany in spring 2020 was around 100.000, whichwas well tolerable for the HSS. Depending upon the per-centage of cases which are oﬃcially recorded, the actualnumber of infected citizens may be considerably larger. Ifwe assume a factor of two here, corresponding to 200.000cases, we have I h ≈ . ≈

80 million citizensin Germany). On the other hand, if there are more, andif we also take into account that the HSS could well takea few more patients, we might also consider a scenariowith 800.000 acute infections at a time, correspondingthen to I h ≈ .

01. In both cases we also have to vary theaverage lifetime of the immune state, ρ , and observe itseﬀect on the duration of the pandemic.The two sets of scenarios are represented in the graphsin Fig. 6. The remaining fraction of susceptible cit-izens is shown as the solid curves, while the dashedcurves represent the control parameter, α . As before,we have assumed R = 3 for the basic reproductionnumber. We take the value ρ = 931 days mentionedabove for another SARS-CoV strain as a reasonable esti-mate. Using τ = ten days, this corresponds to ρ ≈ τ .In order to cover this case, we have used the values ρ/τ ∈ { , , , ∞} . For Germany, an HSS capacityof I h = 0 .

01 (top (a) graph) would correspond to roughly32% of hospital beds [20] (500 .

000 in total) utilized forpatients with COVID-19, if we assume a hospitalizationrate of 20% [21]. The bottom (b) graph corresponds toa smaller HSS capacity ( I h = 0 . α ( t ), it is obvious that the mitigation measures canbe gradually alleviated as time proceeds. In the top (a)graph ( I h = 0 .

01) for inﬁnite immunity ( ρ → ∞ ), onewould reach the end of mitigation measures after abouttwo years (= 66 . τ , with τ = 10 days). This is, however,hardly realistic. For the more realistic case, ρ/τ = 93,it would take about three years ( ≈ . τ ). For an HSScapacity of I h = 0 . ρ/τ = 93. Instead, one would not reach any furtherthan α ≈ .

5, which still corresponds to rather harshmeasures.It should ﬁnally be noted that the number of fatalitiesis limited for all cases where herd immunity is reached. t/τ . . . . . . S , α I h = 0 . R (a) S ( t ) α ( t ) ρ/τ = 50 ρ/τ = 93 ρ/τ = 200 ρ/τ = ∞ t/τ . . . . . . S , α I h = 0 . R (b) FIG. 6. Typical pandemic scenarios for diﬀerent average im-munity loss times, ρ/τ ∈ { , , , ∞} , corresponding tocurves from right to left (or see color code), and diﬀerent val-ues for I h , namely 0 .

01 in the top (a) graph and 0 . S ( t ). Dashed curves: α ( t ). The fraction of acutely infected citizens is kept at I h inphase II until herd immunity is reached ( S = 1 /R , horizontaldashed line). If this is successful (if I h > ˆ I h , see Eq. 30) phaseIII begins: Mitigation measures are being released ( α = 0)and S ( t ) oscillates around its limiting value S ∞ = 1 /R .Other parameters: R = 3, τ = 10 days. In particular, if φ is the fraction of fatalities among thosewhich are infected, the fraction of fatalities with respect to the population is F = φ (1 − R − ) for inﬁnite ρ . If ρ is ﬁnite, we ﬁnd F = φT r T (cid:18) − R (cid:19) = (cid:18) − R (cid:19) φ ln(1 − X ) X . (31)This is precisely the scaling described by the curve dis-played in Fig. 4a, which shows that already well below X = 1, the death toll incurred by the herd immunitystrategy becomes prohibitively high if immune responsedecay plays a signiﬁcant role. VI. CONCLUSIONS

We have shown that for a wide class of cost functions,in order to reach herd immunity without vaccination, itprovides an optimal control strategy to keep the eﬀectivereproduction number, R, at unity during the majority ofthe duration of the pandemic. Deviations which dependupon the speciﬁc form of the cost function are limited toa narrow time window and can be considered negligiblefor practical purposes.Reducing R can be achieved through various measures,e.g., increased hygiene, physical distancing, or contacttracing [22]. Keeping R at the critical value of unity—above which epidemic spreading sets in—is, however,hardly feasible in practice, due to uncertainties as wellas observation delays concerning the eﬀects of mitigationmeasures on the reproduction number. Development of robust optimal control scenarios taking such uncertain-ties into account is left for future investigations.Explicit expressions for the expected duration of thepandemic have been given, and we have seen that theduration of the pandemic increases strongly as the aver-age lifetime of the immune state decreases. In particu-lar, we can conclude that in case the immune responseto SARS-CoV2 decays in a similar manner as for the for-merly encountered SARS-CoV1 strain [19], there is noreasonable route to using herd immunity as a vaccina-tion strategy for SARS-CoV2, unless the capacity of hos-pitals is dramatically enhanced (see Fig. 6a). However,as a consequence of global mobility there may be morepandemics coming which show diﬀerent infection and im-mune response behaviour. We therefore think that ourresults should be borne in mind for future use, as theyare of rather general nature. [1] W. H. O. et al., Report of the who-china joint mission oncoronavirus disease 2019 (covid-19) , Tech. Rep. (2020).[2] M. Enserink and K. Kupferschmidt, Science , 1414(2020).[3] J. Dehning, J. Zierenberg, F. P. Spitzner, M. Wibral,J. P. Neto, M. Wilczek, and V. Priesemann, Science (2020), 10.1126/science.abb9789.[4] H. W. Hethcote, SIAM Review , 599 (2000).[5] T. Harko, F. Lobo, and M. Mak, Appl. Math. and Com- put. , 184 (2014).[6] C. Gros, R. Valenti, L. Schneider, K. Valenti, andD. Gros, “Containment eﬃciency and control strategiesfor the corona pandemic costs,” (2020).[7] V. Zlatic, I. Barjasic, A. Kadovic, H. Stefancic, andA. Gabrielli, “Bi-stability of sudr+k model of epidemicsand test kits applied to covid-19,” (2020).[8] N. Ferguson, D. Laydon, G. Nedjati Gilani, N. Imai,K. Ainslie, M. Baguelin, S. Bhatia, A. Boonyasiri, Z. Cu- cunuba Perez, G. Cuomo-Dannenburg, A. Dighe, I. Dori-gatti, H. Fu, K. Gaythorpe, W. Green, A. Hamlet,W. Hinsley, L. Okell, S. Van Elsland, H. Thompson,R. Verity, E. Volz, H. Wang, Y. Wang, P. Walker, P. Win-skill, C. Whittaker, C. Donnelly, S. Riley, and A. Ghani, Report 9: Impact of non-pharmaceutical interventions(NPIs) to reduce COVID19 mortality and healthcare de-mand , Tech. Rep. (Imperial College London, 2020).[9] A. Perkins and G. Espana,

Optimal control of theCOVID-19 pandemic with non-pharmaceutical interven-tions , preprint (Epidemiology, 2020).[10] R. Djidjou-Demasse, Y. Michalakis, M. Choisy, M. T. So-fonea, and S. Alizon,

Optimal COVID-19 epidemic con-trol until vaccine deployment , preprint (Infectious Dis-eases (except HIV/AIDS), 2020).[11] Y. Liu, A. A. Gayle, A. Wilder-Smith, and J. Rockl¨ov,Journal of Travel Medicine , taaa021 (2020).[12] X. He, E. H. Lau, P. Wu, X. Deng, J. Wang, X. Hao,Y. C. Lau, J. Y. Wong, Y. Guan, X. Tan, et al. , NatMed , 672 (2020).[13] R. Parshani, S. Carmi, and S. Havlin, Phys. Rev. Lett. , 258701 (2010).[14] Convexity of the cost function is a reasonable assumptionsince one can imagine that chosen mitigation measuresbecome more and more costly as α approaches unity.[15] H. W. Kuhn and A. W. Tucker, in Proceedings of theSecond Berkeley Symposium on Mathematical Statisticsand Probability (University of California Press, Berkeley,Calif., 1951) pp. 481–492.[16] O. L. Mangasarian, SIAM Journal on Control , 139(1966), https://doi.org/10.1137/0304013.[17] V. M. Becerra, in (IEEE, 2010)pp. 1391–1396.[18] The same holds for a constant cost function, i.e. if we as-sume societal costs to be independent of α ( t ); numericaldata not shown here.[19] L.-P. Wu, Emerging Infectious Diseases , 1562 (2007).[20] We do not consider the need for intensive care units here,one reason being lack of knowledge about the fraction ofICU cases.[21] This is a conservative estimate given that the numberpublished by the Robert Koch Institute [23] (17%) isbased on reported cases only; the true hospitalizationrate might be signiﬁcantly lower.[22] V. B. Bulchandani, S. Shivam, S. Moudgalya, andS. L. Sondhi, “Digital Herd Immunity and COVID-19,”(2020), arXiv: 2004.07237.[23] Robert Koch Institute, Coronavirus Disease 2019(COVID-19). Daily Situation Report of the Robert KochInstitute 29/04/2020 , Tech. Rep. (Robert Koch Insti-tute, 2020).

Appendix A: First order necessary conditions foroptimality

The problem of optimal control is given in Eq. (8).Deﬁning ψ ( x ( t )) = I ( t ) − I h we rewrite the functional in Eq. 9: J { α } = (cid:90) t e f ( α ( t )) + λ ( t ) · [ ˙ x ( t ) − h ( x , α ( t ))]+ µ ( t ) ψ ( x ( t )) dt , (A1)with the complimentary slackness condition µ ( t ) ψ ( x ∗ ( t )) = 0 and µ ( t ) ≥

0. The slacknesscondition can be seen as activation of the constraint,i.e., in the region when the optimal trajectory satisﬁes ψ ( x ∗ ( t )) < δJ = 0. Tothis end we consider the variations x ( t ) = x ∗ ( t ) + η σ ( t ), α ( t ) = α ∗ ( t ) + η θ ( t ) and t e = t ∗ e + η χ for some inﬁnites-imal scalar parameter η . Upon making the substitutionswe have: J = J ∗ + η δJ = (cid:90) t ∗ e + ηχ f ( α ∗ ( t )) + ∂ α f | α ∗ ( t ) η θ ( t )+ λ ( t ) · (cid:2) ˙ x ∗ ( t ) + η ˙ σ ( t ) − h ( x ∗ ( t ) , α ∗ ( t )) − ∇ x h | x ∗ ( t ) · η σ ( t ) − ∂ α h | α ∗ ( t ) · η θ ( t ) (cid:3) + µ ( t ) ψ ( x ∗ ( t )) + ∇ x ψ | x ∗ ( t ) · η σ ( t ) µ ( t ) dt , (A2)where J ∗ = (cid:82) t ∗ e f ( α ∗ ( t )) + λ ( t ) · [ ˙ x ∗ ( t ) − h ( x ∗ ( t ) , α ∗ ( t ))]+ µ ( t ) ψ ( x ∗ ( t )) dt . Now noting that (cid:82) t ∗ e + ηχ A ( t ) dt = (cid:82) t ∗ e A ( t ) dt + (cid:82) t ∗ e + ηχt ∗ e A ( t ) dt and (cid:82) t ∗ e + ηχt ∗ e A ( t ) dt = A ( t ∗ e ) ηχ and considering terms only up to ﬁrst order in η , we have δJ = (cid:20) f ( α ∗ ( t ∗ e )) + λ ( t ∗ e ) · [ ˙ x ( t ∗ e ) − h ( x ( t ∗ e ) , α ∗ ( t ∗ e ))]+ µ ( t ∗ e ) ψ ( x ∗ ( t ∗ e )) (cid:21) χ + (cid:90) t ∗ e ∂ α f | α ∗ ( t ) θ ( t ) + λ ( t ) · (cid:2) ˙ σ ( t ) − ∇ x h | x ∗ ( t ) · σ ( t ) − ∂ α h | α ∗ ( t ) θ ( t ) (cid:3) + ∇ x ψ | x ∗ ( t ) · σ ( t ) µ ( t ) dt . (A3)Making use of the slackness condition, performing inte-gration by parts on the λ ( t ) · ˙ σ ( t ) term and rearrangingthe terms we get: δJ = (cid:20) f ( α ∗ ( t ∗ e )) + λ ( t ∗ e ) · [ ˙ x ∗ ( t ∗ e ) − h ( x ∗ ( t ∗ e ) , α ∗ ( t ∗ e ))] (cid:21) χ + (cid:90) t ∗ e θ ( t ) (cid:20) ∂ α f | α ∗ ( t ) − λ ( t ) · ∂ α h | α ∗ ( t ) (cid:21) dt + (cid:90) t ∗ e σ ( t ) · (cid:20) − ˙ λ ( t ) − λ ( t ) · ∇ x h | x ∗ ( t ) + ∇ x ψ | x ∗ ( t ) µ ( t ) (cid:21) dt +[ λ ( t ) · σ ( t )] t ∗ e . (A4)For the last term we make use of the initial conditions x (0) = x , hence we require σ (0) = . The termi-nal constraint implies R − = S ( t e ) = S ( t ∗ e + ηχ ) = S ∗ ( t ∗ e ) + η ( ˙ S ∗ ( t ∗ e ) χ + σ s ( t ∗ e )) + O ( η ), and therefore˙ S ∗ ( t ∗ e ) χ + σ s ( t ∗ e ) = 0. Taking these into account we ﬁ-nally get the ﬁrst order necessary conditions for optimalcontrol: f ( α ∗ ( t ∗ e )) − λ ( t ∗ e ) · h ( x ∗ ( t ∗ e ) , α ∗ ( t ∗ e )) = 0 , (A5)˙ λ ( t ) = − λ ( t ) · ∇ x h | x ∗ ( t ) + ∇ x ψ | x ∗ ( t ) µ ( t ) , (A6) ∂ α f | α ∗ ( t ) − λ ( t ) · ∂ α h | α ∗ ( t ) = 0 , (A7) λ I ( t ∗ e ) = 0 . (A8) Appendix B: Stability analysis of the uncontrolledSIR model with ﬁnite immune response

The uncontrolled SIR model with ﬁnite immune re-sponse (Eq. 24 with α = 0) can be written as: ∂ t S = − τ R SI + ρ (1 − S − I ) ,∂ t I = τ ( R SI − I ) . (B1)It has a ﬁxed point at x ∞ = ( S ∞ , I ∞ ) = (cid:18) R , − /R ρ/τ (cid:19) . (B2) Linear stability analysis requires computation of the Ja-cobian of the dynamical system evaluated at the ﬁxedpoint. It is given via: J | x ∞ = (cid:34) − R τ + ρ − /ρ − /τ − /ρ R − τ + ρ (cid:35) . (B3)Its eigenvalues are given by: λ , = − ρR + τ ρ ( τ + ρ ) ± (cid:115)(cid:18) ρR + τ ρ ( τ + ρ ) (cid:19) + 1 − R τ ρ . (B4)The ﬁrst term is always negative, we thus have threeregimes: λ , ∈ R ∧ λ , < λ , ∈ R ∧ λ > , λ < λ , ) (cid:54) = 0 ∧ Re( λ , ) < x ∞ is stable for any R ≥

1. The stablenode regime is limited to a very small, rather unrealisticparameter regime (mainly ρ ≤ τ , i.e., loss of immunityis faster than recovery from the infection). So for mostcases, the ﬁxed point is a stable spiral , in particular forthe estimated values for SARS-CoV-2 ( ρ/τ ≈ R ≈ ρ/τ R saddlestable node stable spiral FIG. 7. Stability of the ﬁxed point x ∞ of the uncontrolled( α = 0) SIR model with ﬁnite immune response (Eq. B1) fordiﬀerent values of R and ρ/τρ/τ