A New Mathematical Model for Controlled Pandemics Like COVID-19 : AI Implemented Predictions
Liam Dowling Jones, Malik Magdon-Ismail, Laura Mersini-Houghton, Steven Meshnick
August 25, 2020
Department of Physics and Astronomy and COSMS Institute, UNC-Chapel Hill, NC 27599, USA
Computer Science Department, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA
UNC Gillings School of Global Public Health, UNC-Chapel Hill, NC 27599-7435, USA
Abstract
We present a new mathematical model that explicitly captures the effects of three restriction measures (the lockdown date and duration; social distancing and masks; and school and border closings) on controlling the spread of COVID-19 infections $i(r, t)$. Before restrictions were introduced, the random spread of infections described by the SEIR model grew exponentially. The addition of control measures introduces a mixing of order and disorder in the system's evolution, which falls under a different mathematical class of models that can eventually lead to critical phenomena. A generic analytical solution is hard to obtain. We use machine learning to solve the new equations for $i(r, t)$, the infections $i$ in any region $r$ at time $t$, and derive predictions for the spread of infections over time as a function of the strength and duration of each specific measure. The machine is trained on all of the published COVID-19 data for each region, county, state, and country in the world. It uses optimization to learn the best-fit values of the model's parameters from past data in each region, and it updates the predicted infection curves for any future restrictions that may be added or relaxed anywhere. We hope this interdisciplinary effort, a new mathematical model that predicts the impact of each measure in slowing down infection spread combined with the solving power of machine learning, is a useful tool in the fight against the current pandemic and potentially future ones.

It is the first time in recent human history that the spread of a novel virus is not allowed to be random: strict control measures such as locking down countries, closing borders, masks and social distancing, closing schools, and quarantining are undertaken at a massive global scale. New measures require a new mathematical model which transcends the traditional
SEIR model used in epidemiology to study the spread of viral diseases: a model which will allow us to predict and prepare for pandemics well in advance by enabling us to evaluate the combination of restrictions with the highest impact.

When a virus diffuses unrestricted through a population, it spreads randomly. Therefore, the number of infections grows exponentially with time, obeying a nearly Gaussian distribution curve, the hallmark of randomness. The distribution of random infections in each region before control measures are introduced maps human social behavior (contact rate) as well as population density and demographics. The epidemiological model which has been used to describe the spreading of viruses through a population of $N$ people is the class of SEIR models. The driving force of infections in SEIR-type models is a coupling between the susceptible and infected groups, where the coupling strength $\beta$ is directly proportional to the now famous contagion parameter $R_0$.

In this work we propose a stochastic mathematical model for the infection spread which, in addition to the driving force of infections (the $\beta$ term of the SEIR model), explicitly adds the impact that each of the three measures (lockdown, social distancing plus masks, and school and border closing), imposed worldwide to disrupt exposure and break the transmission of the pandemic, has on the diffusion of infection through the population. Since time is of the essence during a pandemic, we implement our new model and solve its equations using an AI approach to predict what will happen for the rest of the year in any region. The machine is trained on all the
COVID-19 data available from Johns Hopkins University for any region, county, state, or country before and after restrictions were imposed.

Data from the early phase of the pandemic, when the spread of infection was unrestricted, is very useful in training the machine to learn the population distribution and demographics as well as the characteristic social behavior (contact rates) of each region, and to include them in its estimate of the restriction parameters introduced in our model. This approach allows us to estimate and predict in a matter of seconds how infections may spread for the rest of the year in any region as a function of the time and type of control measures being introduced and of the population characteristics.

Furthermore, we offer an estimate of the number of undetected infection cases for each confirmed positive test in a region, a parameter we call $\gamma(\vec{r}, t)$. Through this parameter we can convert the confirmed daily infection curves for any region into knowledge of the 'reservoir', that is, the total number of infected people in that region at any time, whether they are tested or not. The machine also learns to account for the time delay between infection and its reported confirmation by testing. However, since the machine is trained on historic public data, the findings reported here are limited by the accuracy of the reported data: the time lapse between the true first case of infection in a region and the first confirmed positive test, as well as the noise in the reported data, i.e. the amount of uncertainty in the public data for confirmed positives.

Let us first review the basics and terminology of the SEIR model before replacing it with a new model which adds new terms to include the restriction measures. The letters in
SEIR stand for 'number of people susceptible $S$ to the virus, number of people exposed $E$ to it, the number of infected people $I$, and the number of people removed or recovered $R$', within a population of size $N$. In its simplest form it is mathematically described by [1, 4]:

$$
\begin{aligned}
\frac{ds}{dt} &= -\beta s i, & s &= S/N \\
\frac{di}{dt} &= \beta s i - \tilde{\nu} i, & i &= I/N \\
\frac{dr}{dt} &= \tilde{\nu} i, & r &= R/N
\end{aligned}
\tag{1}
$$

The parameters $(\tilde{\nu}, \beta)$, the removal rate and effective contact rate respectively, are in general time dependent. Without losing generality, the exposed $e = E/N$ and infected $i = I/N$ fractions of the population in the above equations are grouped together into $i$ for simplicity. An explicit equation for $E$ obeys the same first-order differential equation sourced by the susceptible population on the right-hand side.

Combining the above and recalling that $e$ is included in $i$ gives:

$$\frac{d(i + r)}{dt} = \beta s i = -\frac{ds}{dt} \tag{2}$$

Not surprisingly, as can be seen from the opposite signs in the time derivative terms of $s$ and $(i + r)$ in Eqn. 2, what is a driving force for infections, $F_{seir} = \beta s i$, is a damping force with the same strength for the susceptible population. We will make use of Eqn. 2 in our model below.

Traditionally, Eqns. 1 are combined in a different way, in the manner of Eqn. 3 below:

$$\frac{di}{ds} = -1 + \frac{\tilde{\nu}}{\beta s} = -1 + \frac{1}{R_0 s}, \tag{3}$$

because of the explicit dependence on the crucial parameter $R_0$, the viral reproduction number, defined as

$$R_0 = \frac{\beta}{\tilde{\nu}} = \frac{\tau \tilde{c}}{\tilde{\nu}} \tag{4}$$

where the terms on the right-hand side are, respectively:

$$\tau = \text{transmissibility given contact of } s \text{ with } i, \tag{5}$$

$$\tilde{c} = \text{rate of contact between } i \text{ and } s, \tag{6}$$

and

$$T = \frac{1}{\tilde{\nu}} = \text{duration of infection} \tag{7}$$

It is clear from Eqn. 4 and Eqn. 3 that $R_0$ describes how infectious a virus is, since it gives the average number of people who will get the disease from one infected person. As such, its value is crucial in estimating how fast the infection can spread uncontrolled.
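To make the role of $R_0$ concrete, the following is a minimal sketch that integrates Eqns. 1 with a forward-Euler step, assuming constant $\beta$ and $\tilde{\nu}$. The function names `integrate_sir` and `reproduction_number`, the step size, and the parameter values in the usage note are illustrative assumptions and are not taken from the paper or its code.

```python
def reproduction_number(beta, nu):
    """R_0 = beta / nu, as in Eqn. 4."""
    return beta / nu


def integrate_sir(beta, nu, i0, days, dt=0.1):
    """Forward-Euler integration of Eqns. 1 with constant beta and nu.

    s, i, r are population fractions (exposed are grouped into i).
    Returns one (s, i, r) tuple per day, starting with day 0.
    The Euler scheme and dt are choices made for this sketch only.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    steps_per_day = round(1.0 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # driving force beta*s*i
            removals = nu * i * dt               # removal term nu*i
            s, i, r = s - new_infections, i + new_infections - removals, r + removals
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate R_0 > 1.
    curve = integrate_sir(beta=0.5, nu=0.2, i0=1e-4, days=200)
    peak_day = max(range(len(curve)), key=lambda d: curve[d][1])
    print("R_0 =", reproduction_number(0.5, 0.2), "peak day =", peak_day)
```

With $R_0 > 1$ the infected fraction rises until $s$ falls to about $1/R_0$, where Eqn. 3 gives $di/ds = 0$; with $R_0 < 1$ it only decays.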
It should also be noted that in the early stages, when the virus spreads unrestricted, this parameter depends on the population density and social behaviour of the region. We report the dependence of $R$ on a region's characteristics for three such examples in Section 4. For the case of COVID-19, epidemiological studies indicate that when the virus is spreading unrestricted at $t = 0$, $R(t = 0)$ can range from . [1] to . [5]. Additionally, $R$ is an effective parameter since it changes over time. Integrating Eqn. 3 highlights the well known fact that damping the spread of infections translates into bringing $R$ to stable values of one or less. Mathematically, $\beta(t) = \tilde{\nu} R(t)$ contains the same information as $R$ on how the spread of the disease changes over time, because $\tilde{\nu}$ is on average$^{i}$ an unchanging characteristic of the virus. Therefore, any changes introduced by restrictions on the effective reproduction number $R$ of a region over time translate into changes in $\beta(t)$.

The SEIR model has been further compartmentalized into an $SE_j \ldots I_j \ldots R$ model by breaking down the numbers of exposed "$E_i$" and infected "$I_j$" cases into sub-classes for the different stages of exposure, infection and quarantine, as studied in [6, 1, 2, 4]. Understanding these compartments is important for epidemiological contact tracing and surveillance, and for clinical purposes when studying the probability of infection of a region. However, we won't need the details of the compartmentalized SEIR models in what follows, as their net effect is absorbed in the $\beta$ driving force term already embedded in the new model below, and is implicitly contained in the data on which the machine is trained and which it uses to optimize the best fit for $\beta(t)$ in real time.

The paper is organized as follows: In Section 2 we introduce the new mathematical model and describe the long term behavior of the model. In Section 3 we describe the method for obtaining the solutions using machine learning.
In Section 4 we present the results of our AI-implemented model and illustrate them for the case of three US states by showing the predicted curves of infection growth as a function of a representative combination of control measures given by their three parameters. We also estimate and show the number of true infections, the reservoir, for each confirmed infection, a number which for example in the US varies from 4 to 11 depending on the state. We are making the mathematical model and the code publicly available here [3]. This will allow users to instantly derive how the predicted growth of infections changes in their region any time a measure or a combination of them may be introduced and of

$^{i}$ Perhaps different strains of the virus may have different removal rates or their mutation may change $\tilde{\nu}$.

To account for the order introduced by the new global measures in slowing the random virus spreading anywhere in space and time, we enhance the
SEIR type models by adding new force terms to Eqns. 1 which capture the damping forces on infections imposed by lockdowns, masks and social distancing, and school and border closings. The remaining random source of infection spread, not contained in the above, is given by a white noise and captured by a diffusion parameter. These terms are in addition to the $\beta s i$ term of Eqn. 2, the driving force of infections which describes the contagion spread through random contact between the infected and susceptible populations in the SEIR model.

As described below, the addition of the new forces to Eqn. 2 leads to a new stochastic model which is in a completely different mathematical class from the SEIR models. Let's promote the nonsusceptibles, i.e. the part of the population exposed to the virus at any level of infection, into an 'infection field' $\tilde{i}(\vec{x}, t) = i(\vec{x}, t) + e(\vec{x}, t) + r(\vec{x}, t)$ $^{ii}$ spreading on $(2 + 1)$-D (two spatial dimensions $\vec{x}$ and one temporal dimension $t$).

In this model we make the following assumptions: (a) no vaccine will become massively available until next year; (b) the recovered, infected and exposed (the EIR part in SEIR) will gain some immunity, and therefore will not be susceptible to being sick again this year or until the vaccine is available. The latter is expected on general grounds from how our immune system protects us from viruses, but it is not verified to be true for the novel COVID-19 virus. (Indeed, recent medical findings [8] seem to indicate COVID-19 may be unique in the sense that those who recovered from it may not have a long lasting immunity to reinfection.)

The infection field $\tilde{i}(\vec{x}, t)$ is a restricted self-avoiding random walker on a two dimensional space and in time. The restricted feature of the random walk of the infection seeping through the population of some region of space $\vec{x}$ with density $\rho(\vec{x}, t)$ at some time $t$ is due to the control measures taken to restrict the virus spreading. The self-avoiding feature is related to the assumption of immunity, namely that a person who has already been infected once will not be infected again during the time interval from now until a vaccine is available. Typically, this class of models is expected to lead to critical phenomena of polynomial growth of infections at large times, which display a universal scale-free behavior described by what in critical phenomena are known as critical exponents $\nu, \mu$ $^{iii}$. In two spatial dimensions $\nu$ provides the fractal critical dimension of the fractal space available to the restricted random walker $\tilde{i}(\vec{x}, t)$ for that region. According to Flory's theory [9, 10, 11] for critical exponents, analytically the fractal dimension $\nu$ in two dimensions is expected to be about $\nu \sim$ / instead of . That is, without vaccines, at large times $\tilde{i}(\vec{x}, t)$ is asymptotically expected to approach a stable fractal distribution of self similar 'hot' and 'cold' infection clusters on a space of dimension less than .
Despite a vast literature on the subject, analytical results in closed form for this class of critical phenomena for a restricted self-avoiding (SAW) random walker in two spatial dimensions do not exist even for the simpler case of time independent parameters. Our model therefore has an additional

$^{ii}$ We changed notation for spatial coordinates from $\vec{r}$ to $\vec{x}$ here to avoid any confusion of the SEIR variable $r$ for the recovered population with the spatial coordinate $\vec{r}$. However, both $\vec{r}$ and $\vec{x}$ have the same meaning, the region's location on a two dimensional surface, and are used interchangeably.

$^{iii}$ The critical exponent $\nu$ here should not be confused with the removal rate parameter $\tilde{\nu}$ in Eqns. 1 of the SEIR model.

$\tilde{i}(\vec{x}, t)$ are time dependent. Therefore we will use machine learning for solving these equations to predict the late time growth of infection spreading as a function of restriction measures.

Consider the infection variable $\tilde{i}$ to be our system. We promoted it into a field $i(\vec{x}, t)$ random walking through a 'host' environment field $s(\vec{x}, t)$ living on a two dimensional region over time.

In what follows, we will also make use of the parameter $\gamma$ mentioned in Section 1, defined in Eqn. 8 as the ratio between the confirmed infections $i(\vec{x}, t)$ and the actual reservoir of the non-susceptible population $\tilde{i}(\vec{x}, t)$:

$$\gamma(\vec{x}, t) = \frac{i(\vec{x}, t)}{\tilde{i}(\vec{x}, t)} \tag{8}$$

Clearly, $\gamma(\vec{x}, t)$ is a space and time dependent parameter, since it quantifies the true number of infections in a population located at $\vec{x}$ per each case confirmed there by testing, at any time $t$.

The Force of Lockdown:
Let us now model the effect of the lockdown measure on reducing the infection spread with a quadratic confining potential $V_i(\tilde{i})$. A single well potential for a field, in our case the active infections, describes the effect of lockdown in a region located at $\vec{r}$ because such a potential drives the active confirmed infection field $i(\vec{r}, t)$ to eventually roll down to zero.$^{iv}$ The strength of the confinement potential captures the impact of the lockdown in damping active infections and is given by its space and time dependent coupling constant $\alpha = \alpha[(t_f - t_i), t, r]$, which varies with region $\vec{r}$ and depends on $t_i, t_f$, the initial and final dates of the lockdown of the region respectively (their difference being its duration). Since our system is $\tilde{i}(\vec{r}, t)$, then, using the rescaling between the confirmed infections $i$ and $\tilde{i}$ of Eqn. 8, we can rewrite the lockdown potential restricting the random walker in terms of $\tilde{i}$ by absorbing $\gamma(\vec{r}, t)$ into the strength $\alpha(\vec{r}, t)$ of the lockdown potential $V_i$.

With these considerations the lockdown potential becomes

$$V_i = \frac{1}{2} \tilde{\alpha} \tilde{i}^2 \tag{9}$$

where the rescaled lockdown strength is $\tilde{\alpha} = \alpha \gamma^2$. The damping force of the lockdown on infections is simply the derivative of its potential, $F_i = -\frac{dV_i}{d\tilde{i}}$. From now on we will suppress the symbols $t_i, t_f$ in $\alpha$ for ease of notation.

The Force of Social Distancing and Masks:
The purpose of masks and social distancing is to eliminate the possibility of exposure to the virus, in other words to minimize the chance that susceptibles are physically close to the path of droplets from others, or to block their flux with a mask altogether. We assume here that the part of the population which follows guidelines for eliminating exposure to the virus is equally likely to use both methods of blocking exposure, masks and social distancing; therefore the impact of the social distancing potential in slowing down infection spread discussed here includes the effect of masks as well. While we don't include any of the biological or clinical parameters, such as the role of viral load in infecting susceptibles, the risk from potential airborne properties, or a variation in the size and trajectory of droplets, we gauge the net effect of social distancing in eliminating exposure to droplets from $i(\vec{r}, t)$ as a damping force on infection growth by relying on a simple physics based approximation: the trajectory of a flux of droplets ejected by $i$ (through breathing, coughing and sneezing) follows a typical classical projectile motion described by

$^{iv}$ A quartic or any type of concave single well confining potential function would work equally well as the quadratic type of Eqn. 9, so we don't lose any information by choosing the simplest such potential.

$j$th susceptible person $s_j$ and droplets flux from a $k$th infected person $i_k$ would take the form $V_{si} = -\tilde{\delta} \sum \frac{s_j i_k}{r_{jk}^{\zeta}}$, where $\tilde{\delta}$ is the strength parameter of the potential, $r_{jk}$ is the distance between them, and the sum is over all $(j, k)$. For the purely gravitational forces on the droplets' motion, $\zeta = 2$.
For a more detailed analysis which may account for changes in the projectile motion due to other forces on the droplets, such as a drifting force from wind and other weather elements, or an air buoyancy force on the droplets inducing floating, we could allow $\zeta$ to be an additional parameter $\zeta \neq 2$, the value of which would be determined by simulations. Since the susceptible part of the population is much larger than the infected part so far in all the reported data for any region, we replace the sum over $j$ of $s_j$ with an averaged value $s(\vec{r}, t) = (1 - \tilde{i})$, up to a proportionality constant which depends on the population distribution and social behavior and is absorbed in $\tilde{\delta}$ for each region through machine learning. As before, the re-scaling factor $\gamma$ between $i$ and $\tilde{i}$ of Eqn. 8 is included in $\tilde{\delta}$. As mentioned, the machine is trained to learn the population density distribution $\rho(\vec{r}, t)$ and social interactions of each region within radius $\vec{r}$ at time $t$ from earlier data, when the spread was random, and to include this information in the coupling constants $\alpha, \tilde{\delta}$ of each potential. This allows us to further approximate the typical average distance $r \approx \langle r_{jk} \rangle$ between infected and susceptible people from different households on a two dimensional surface with an ensemble averaged distance $r \approx \tilde{i}^{-1/2}$ for that region's population density. Any numerical factors that arise from this approximation are roughly $\tilde{i}^{(-1/2)}$ and are also absorbed by $\tilde{\delta}$. Therefore, we denote the rescaled strength parameter of this potential by $\delta$. Under the above simplifications, the potential $V_D$ capturing the effect of the social distancing and masks restriction, designed to avoid contact between droplets from the infected and the susceptibles in the $V_{s,i}$ interactions, takes the following form

$$V_D = \delta \frac{\tilde{i} s}{r^{\zeta}} \tag{10}$$

where

$$s = (1 - \tilde{i}) \tag{11}$$

To reduce the number of parameters the machine needs to estimate and best fit to data, we now take $\zeta = 2$.
The damping force on infections from the masks and social distancing potential is then $F_D = -\frac{dV_D}{d\tilde{i}}$. Similarly, the coupling strength $\beta$ for the driving force of infection spread depends on space and time.

Collecting all the restriction forces acting on the infection field $\tilde{i}$ leads to

$$\frac{d\tilde{i}}{dt} \approx \tilde{\beta}(1 - i)\tilde{i} - \tilde{\alpha}\tilde{i} - \delta\frac{(1 - i)}{r^{2}} + f(t) = \tilde{\beta} s \tilde{i} - V'_i - V'_D + f(t) \tag{12}$$

The prime denotes $\frac{d}{d\tilde{i}}$. The first term in Eqn. 12 is the familiar SEIR driving force of infections, which in our model corresponds to a potential function $V_{SEIR} = -\beta s i$; $\tilde{\beta}$ is the space and time dependent coefficient $\beta$ rescaled by $\gamma$ and by population demographics and social behavior. The second term is the damping force with finite duration due to the lockdown. The third term is the damping force on infections from the percentage of the population which follows the rules on masks and social distancing.

The Noise Term from the Environment and Diffusion:
The last term $f(t)$ in Eqn. 12 is an important one which deserves some explanation. This term is a source of noise from the background $s(\vec{r}, t)$ acting on the system $\tilde{i}$, a random local force driving infection spreading, the details of which are not known. The only information known about the noise term $f(t)$ is its averaged values, such that on average $\langle f(t) \rangle = 0$ and
It is convenient to introduce a new state variable $q(t)$ to denote the number of newly created infections at time $t$ in region $r$. $q(t)$ approximately corresponds to the rate $d\tilde{i}/dt$ of Eqn. (12) in region $r$. Recall that $\tilde{i}(t)$ is the cumulative total number of individuals up to time $t$ who have been infected by the disease (including those who have recovered), and $e(t)$ is the number of exposed and currently infectious individuals at time $t$. At time $t$, let $s(t)$ be the number of susceptible individuals. We introduce two additional variables $r_s(t)$ and $x(t)$, which have no exact counterpart in the standard SEIR model. $r_s(t)$ is the cumulative number of individuals up to time $t$ who contracted the disease at some earlier time and have now silently recovered. Silently means without detection, usually because there were no symptoms. $x(t)$ is the cumulative number of confirmed infections up to time $t$. The contribution to $x(t)$ at time $t$ comes from the fraction of new infections at time $t - k$ which get confirmed at time $t$ (the remaining fraction silently recover). Although we did not explicitly show the $\vec{r}$ dependence in the variables defined above, it should be understood that they are functions of space as well as time. The coupled finite difference equations are:

$$q(t) = \beta e(t-1) s(t-1) - \alpha e(t-1) - \delta e(t-1)\,(1 - e(t-1)) \tag{21}$$

$$s(t) = s(t-1) - q(t) \tag{22}$$

$$e(t) = e(t-1) + q(t) - q(t-k) \tag{23}$$

$$x(t) = x(t-1) + \gamma q(t-k) \tag{24}$$

$$r_s(t) = r_s(t-1) + (1-\gamma)\, q(t-k) \tag{25}$$

Equation (21) contains the infectious force and damping potentials from (12), which control the rate at which new infections are produced, $q(t)$. Equation (22) merely says that the susceptible population decreases by the new infections. Equation (23) says that the infectious individuals increase by the new infections minus the infectious individuals who recovered silently or who became symptomatic and got discovered (and quarantined). Due to the lag $k$, the loss of infectious individuals is exactly $q(t-k)$. Equations (24) and (25) say that a fraction $\gamma$ of the infectious become symptomatic and are discovered, and the other $(1-\gamma)$-fraction silently recover without serious symptoms. Recall that $\tilde{i}(t) = e(t) + x(t) + r_s(t)$. It is only $x(t)$ which is observed.

The parameters $(\beta, \alpha, \delta, \gamma)$ are unknown. From data provided by Johns Hopkins University [15], we observe very noisy estimates $\{\hat{x}(0), \hat{x}(1), \hat{x}(2), \ldots, \hat{x}(T)\}$, where timestep 0 indicates the first confirmed infection. We set $i(-k) = x(0)/N$, where $N$ is the population of the region being analysed, and correspondingly $s(-k) = 1 - i(-k)$, $q(-k) = i(-k)$, $x(-k) = 0$, $r_s(-k) = 0$. Given these initial conditions, the trajectory of $x(t)$ is determined by $(\beta, \alpha, \delta, \gamma)$ and Equations (21)–(25). Hence we identify $(\beta^*, \alpha^*, \delta^*, \gamma^*)$ as the parameters which minimize the mean squared error between the observed trajectory $\hat{x}$ and the model's trajectory $x(\beta, \alpha, \delta, \gamma)$. That is,

$$(\beta^*, \alpha^*, \delta^*, \gamma^*) = \underset{(\beta, \alpha, \delta, \gamma)}{\operatorname{argmin}} \sum_{t=0}^{T} \left( x(t \mid \beta, \alpha, \delta, \gamma) - \hat{x}(t) \right)^2. \tag{26}$$

Lockdown, Social Distancing and Masks:
Again, to address the practical situation with COVID-19, we have to account for the effects of lockdown, social distancing and masks. Indeed, one of the advantages of our model is that it gives us simple switches for turning various social distancing measures on and off. We assume that lockdown begins at some time $\tau$ and lasts until $\tau + L$. In general fitting of the data, the lockdown period $L$ is about 90 days for all USA regions. Naturally, this parameter $L$ is tuned to each specific region, and it is calculated by the machine learning algorithm from JHU data [15]. The start of the lockdown, $\tau$, is determined by a robust changepoint analysis of the time series (a changepoint is a sample or time instant at which some statistical property of a signal changes abruptly; for example see [18]).

Parameters representing the strength of each measure in any region vary with time and duration. Being trained on past data, the machine learns the strength of the measures $\alpha, \delta$, the remaining degree of randomness $\beta$, and the times when they were introduced suddenly in the past. Therefore, we include that time dependence in $\beta, \alpha, \delta$ in (21) by breaking it down into three time intervals, namely: the 'random spread time', i.e. the time before the lockdown, $t < \tau$, when all restrictions are zero and the virus is spreading randomly, i.e. when $\beta = \beta_{max}$; the 'lockdown duration', i.e. the time interval $\tau < t < \tau + L$ during which the first full lockdown, $\alpha = \alpha_{max}$, is in effect in a region. $L$ should be understood to be a function of space and time, $L(\vec{r}, t)$, since the lockdown time and duration vary by region, and the machine can learn to read this parameter from data. The social distancing and masks restriction, $\delta = \delta_{max}$, is also in full effect during this time and can extend beyond.
As a result, $\beta$ drops from its maximum value to $\lambda_{pm}\beta$, where the reduction parameter $\lambda_{pm}$ in this interval is estimated by the machine from past data; and the future time interval, which can extend for two years, i.e. $t > \tau + L$, when some of the restrictions are relaxed or not followed rigorously. For example, although lockdown officially ended in May, the machine estimates that its impact parameter $\alpha$ on infection spread was still at half its strength until August because many people were encouraged to work remotely and continued to do so, schools were closed during summer, and the travel and hospitality industries were at a minimum, so many people stayed home. The future increase or decrease in $\beta$ is captured by the parameter $\lambda_m$, and it depends on future decisions taken on restrictions in any region. The strength of social distancing follows similar patterns in past data and is learned by the machine. The change, for example a decrease, in the strength of lockdown and social distancing restrictions that can be imposed in the future on any region is captured by the region and time dependent parameters $\lambda_L(t), \lambda_{sd}(t)$ respectively. The time dependence of these parameters allows us the freedom to predict what happens to the daily infection curves if restrictions are introduced at certain times in the future. All past data parameters are best fitted by the machine. With these definitions, we have

$$\beta(t) = \begin{cases} \beta_{max} & t < \tau; \\ \lambda_{pm}\beta & \tau \le t \le \tau + L; \\ \lambda_{m}\beta & t > \tau + L. \end{cases} \tag{27}$$

$$\alpha(t) = \begin{cases} 0 & t < \tau; \\ \alpha(t) & \tau \le t \le \tau + L; \\ \lambda_{L}(t)\,\alpha & t > \tau + L. \end{cases} \tag{28}$$

$$\delta(t) = \begin{cases} 0 & t < \tau; \\ \delta(t) & \tau \le t \le \tau + L; \\ \lambda_{sd}(t)\,\delta & t > \tau + L. \end{cases} \tag{29}$$

Equation (27) captures the effects of the control measures $\alpha$ and $\delta$ in reducing $\beta$ after lockdown and in the future. Although $\beta$ does not return to its maximum initial value in the future, it increases by a percentage defined by the factor $\lambda_m$ when restrictions ease.
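The finite difference system (21)–(25), driven by the step schedules (27)–(29), can be iterated directly. The following is a minimal sketch under stated assumptions and is not the authors' released code: the names `step_schedule` and `simulate` are hypothetical, all quantities are treated as population fractions, $q(t)$ is taken to be zero before the seed at $t = -k$, $q(t)$ is clipped at zero so that damping cannot create negative new infections, and the $\beta$ appearing in the later branches of (27) is read as $\beta_{max}$. The parameter values in the usage note are arbitrary.

```python
def step_schedule(before, during, after, tau, L):
    """Piecewise-constant parameter with the shape of Eqs. (27)-(29):
    `before` for t < tau, `during` for tau <= t <= tau + L, `after` later."""
    def value(t):
        if t < tau:
            return before
        return during if t <= tau + L else after
    return value


def simulate(beta, alpha, delta, gamma, k, e0, days):
    """Iterate Eqs. (21)-(25) from t = -k to t = days - 1.

    beta, alpha, delta: callables of the day t (e.g. from step_schedule).
    gamma: confirmed fraction; k: lag in days (k >= 1); e0: seed fraction at t = -k.
    Returns a dict of lists indexed by t = 0 .. days - 1.
    """
    n = days + k
    q, s, e, x, rs = ([0.0] * n for _ in range(5))
    q[0], e[0], s[0] = e0, e0, 1.0 - e0            # initial state at t = -k
    for j in range(1, n):
        t = j - k
        prev = e[j - 1]
        force = (beta(t) * prev * s[j - 1]
                 - alpha(t) * prev
                 - delta(t) * prev * (1.0 - prev))  # right-hand side of Eq. (21)
        q[j] = max(force, 0.0)                      # assumption: clip at zero
        lagged = q[j - k] if j >= k else 0.0        # q(t - k); zero before the seed
        s[j] = s[j - 1] - q[j]                      # Eq. (22)
        e[j] = e[j - 1] + q[j] - lagged             # Eq. (23)
        x[j] = x[j - 1] + gamma * lagged            # Eq. (24)
        rs[j] = rs[j - 1] + (1.0 - gamma) * lagged  # Eq. (25)
    return {"s": s[k:], "e": e[k:], "x": x[k:], "rs": rs[k:]}


if __name__ == "__main__":
    tau, L = 30, 90   # hypothetical lockdown start day and duration
    beta = step_schedule(0.3, 0.5 * 0.3, 0.8 * 0.3, tau, L)
    alpha = step_schedule(0.0, 0.05, 0.5 * 0.05, tau, L)
    delta = step_schedule(0.0, 0.05, 0.5 * 0.05, tau, L)
    out = simulate(beta, alpha, delta, gamma=0.2, k=10, e0=1e-5, days=240)
    print("cumulative confirmed fraction on the last day:", out["x"][-1])
```

Because every new infection leaves $s$, enters $e$, and exits after $k$ days into either $x$ or $r_s$, the sum $s + e + x + r_s$ stays constant in this sketch, which gives a quick sanity check on any implementation. Fitting as in (26) would wrap `simulate` in a search over the parameters that minimizes the squared difference between `out["x"]` and the observed series.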
Equations (28) and (29) capture changes over time in the impact of the lockdown and social distancing potentials on damping infection spread.

The step functions in Equations (27)–(29) can be smoothed to better approximate reality, since control measures are not instantly adopted throughout the whole population (except perhaps a government mandated lockdown or curfew). A transition from, say, a function $a(t)$ to another function $b(t)$ at time $t_0$ can be compactly written as a single function $a(t) + \Theta(t - t_0)(b(t) - a(t))$, where $\Theta(\cdot)$ is the Heaviside threshold function. One can smooth this by replacing $\Theta(t - t_0)$ with any smooth approximation, such as $\Theta(t - t_0) \approx (1 + \tanh(c(t - t_0)))/2$ for $c > 0$ (smaller $c$ is smoother), and the machine can be set to determine the value of $c$.

In summary, the lockdown start time $\tau$ is determined by the first changepoint in a robust changepoint analysis of the time series of confirmed infections. For example, in the US the lockdown duration is about $L = 90$ days, but the machine learns this duration from data for any region. We fit the parameters $\beta(\vec{r}, t), \alpha(\vec{r}, t), \delta, \gamma(\vec{r}, t), \lambda_i(\vec{r}, t)$ (where $i$ stands for $m, L, sd$) to the data by minimizing the MSE in (26) using an exhaustive search over the ranges β ∈ [0 . , , α ∈ [0 , . , δ ∈ [0 , . , γ ∈ [0 , . and λ m ∈ [0 . , . .

Changing the future strength of the restrictions through $\alpha(t)$, $\delta(t)$ and $\beta$ allows us to predict and investigate future scenarios in which the different social distancing measures are relaxed at certain times. We will demonstrate some of these scenarios next in the Results section.

Results
In addition to the new mathematical model we present in this paper, we have made the code publicly available in [3], including the machine learning algorithm. The data file is available daily from [15]. The machine is trained on every region of the world; therefore our model, as implemented by the deposited code, can be used for predicting daily infection curves for any region in the world. We plan to post infection curves for every region in the world on our website [7], for the two future control measure scenarios described below. Here, we discuss in detail the results for three states:
North Carolina, New York and Florida.

These results are obtained by fitting the model to the region's reported data in [15], shown by the red dots in the figures below, to obtain the coupling parameters $\beta(t), \alpha(t), \delta(t), \gamma, \lambda_m$. We show predicted curves in these three states by considering these two hypothetical future scenarios:

(a) Scenario A.
The official lockdown of last Spring ends. However, the population remains in a semi-lockdown state with a decrease of 50% in the impact of this restriction in slowing down infection spread, that is $\alpha \to \alpha/2$. Social distancing is extended for a short time interval (approximately from the end of lockdown until August 2020) after the official lockdown ends, before this restriction too decreases to about 50%. The exact time intervals when these measures are reduced can be seen in the plots below. The reduction in these restrictions occurs around the anticipated start of K-12 schooling in each state, assuming that social distancing measures will consequently decrease and $\beta, \lambda_m$ will increase due to the difficulty of enforcing these measures in schools.

(b) Scenario B.
The same as above, but social distancing is maintained for a longer time (past August 2020 into January 2021) before being reduced to 50% impact. In this scenario the reduction in social distancing, and therefore the increase in $\beta$, starts in January 2021. This is done under the assumption that states can maintain restrictions such as social distancing and masks without a strict lockdown through January, for example by moving K-12 schools to a remote learning model for the Fall Semester. This assumption does not include the possibility of universities operating in-person classes, if we were to assume that college students would follow guidelines like the rest of the adult population.

We are not proposing that these scenarios be adopted by policy makers; we are simply using these two examples to illustrate the predictive power of our mathematical model and its machine learning implementation. One could instead analyze any combination of future restrictions, such as additional lockdowns, partial mask compliance, etc.

Note that a different choice and/or duration of future restrictions in a region will produce very different daily infection curves. The danger of infection spread can be postponed but will continue to loom over the susceptible population until a vaccine is developed. Hence, it is up to health economists to evaluate the human and economic loss of different scenarios and come up with the optimal redistribution of the infection counts, trading off healthcare-system overload against prolonged economic slump. Our AI-implemented predictive model can help in this effort by efficiently determining, months in advance, the infection spread dynamics as a function of future decisions on the schedule and strength of control measures. We illustrate these predictions next.

The infection plots for the three states are in Figure 1 (North Carolina), Figure 2 (New York) and Figure 3 (Florida). We show the two scenarios, Scenario A and Scenario B.
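To make the two scenarios concrete, their relaxation schedules can be written as piecewise-constant strengths. The following is a minimal Python sketch, not the code deposited in [3]: the function names, the day indices and the normalization of $\alpha$ and $\delta$ to 1 are illustrative assumptions, and sharp steps are used rather than smoothed ones.

```python
import numpy as np

def piecewise_strength(days, base, relax_day, relaxed_fraction=0.5):
    """Strength equal to `base` before `relax_day` and `relaxed_fraction * base` from then on."""
    days = np.asarray(days, dtype=float)
    return np.where(days < relax_day, base, relaxed_fraction * base)

def build_scenario(days, alpha0, delta0, lockdown_end, distancing_end):
    """Hypothetical schedules for the lockdown strength alpha(t) and the
    social distancing/mask strength delta(t) in one scenario."""
    return {
        "alpha": piecewise_strength(days, alpha0, lockdown_end),
        "delta": piecewise_strength(days, delta0, distancing_end),
    }

# All numbers below are illustrative placeholders, not fitted values from the paper.
days = np.arange(0, 400)
lockdown_end = 90  # lockdown impact is halved from this day in both scenarios

# Scenario A: social distancing is halved shortly after the lockdown ends.
scenario_a = build_scenario(days, alpha0=1.0, delta0=1.0,
                            lockdown_end=lockdown_end, distancing_end=170)

# Scenario B: identical lockdown schedule, social distancing kept at full strength longer.
scenario_b = build_scenario(days, alpha0=1.0, delta0=1.0,
                            lockdown_end=lockdown_end, distancing_end=320)
```

Only the day on which social distancing is relaxed differs between the two dictionaries, which mirrors how Scenario B is defined relative to Scenario A.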
In each figure, we first show the past confirmed daily infections (red circles) from JHU data [15] and the infection curves predicted by our new mathematical model (solid black line). Uncertainties in the infection curves are represented by gray uncertainty bands. To determine the uncertainty in the model predictions, the algorithm was set to search for all models with a fit error within 5% of the optimal. The gray uncertainty bands represent the set of possible outcomes from this set of models, and the model curve (solid black line) is the average of these models. The largest source of uncertainty in our predictions is most likely noise in the publicly reported data. Because the machine learns from past data, any noise or uncertainty in that data is reflected as uncertainty in the future, and the farther the model projects into the future, the larger the region of uncertainty. We then show the time-dependent learned coupling parameters $\alpha(t)$, $\delta(t)$, $\beta(t)$ (normalized to their maximum value), and $\gamma(t)$; see Equations (27), (28), (29) and (8). Recall the definition of $\gamma$ as the reservoir parameter, which in the variable definitions of the finite difference model is computed as $\gamma(t) = x(t) / (e(t) + x(t) + r_s(t))$ from Equations (23)-(25). In our illustration below, we choose not to initiate a future full lockdown, as such a drastic measure seems unlikely. Instead, we focus on extending social distancing and mask mandates, which are more practical control measures. Our results show these are even more effective in reducing infection curves than the lockdown.

We highlight several points from the figures.

(i) Each region and state has very different population densities and social interactions, hence the machine learning produces very different coupling parameters for each state. In contrast to the SEIR model, this finding emphasizes the importance of the space- and time-dependent restriction parameters contained in our model and the usefulness of machine learning in finding this regional dependence. On our website [7], we will show results down to the county level.

(ii) The strength of the implemented control measures is determined by the level of social compliance. Since we do not know how compliant the population of a region is, the machine learning must infer this automatically from the observed past data during the phase when the virus spreading is completely random. This knowledge can be useful for other medical and viral diseases that rely on regional characteristics. The results vary considerably from county to county, once again underlining the value of heterogeneity.

(iii) Despite limited testing, we have developed the tools to estimate the true size of the non-susceptible population, the 'reservoir', as a function of time in the past and for any predicted future scenario, through the parameter $\gamma(\vec{r}, t)$ shown for the three states below.

(a) Scenario A. (b) Scenario B.
Figure 1: North Carolina (NC). (Left) Scenario A relaxes control measures in accordance with real-time policy decisions from each state [15, 17]. This results in relaxation of the lockdown measures on May 8, with social distancing and $\lambda_m$ being relaxed around 80 days later, resulting in the spike of infections seen around October. (Right) Scenario B extends social distancing and $\lambda_m$ an additional 120 days, accounting for a scenario in which K-12 schools move to a remote learning model (as is the case in NC) in fall 2020, so the spike in infections is delayed until March 2021 should measures be relaxed next Spring. Prolonging restrictions reduces $\beta$ and daily infection curves. The multiplying parameter for the reservoir $\gamma$ is also shown below. In both scenarios R = 3 . , α max = 0 . , δ max = 1 . e − , γ today = 0 . , which corresponds to roughly 1 in 15 infections being confirmed.

(a) Scenario A. (b) Scenario B.
Figure 2: New York (NY). (Left) Scenario A relaxes control measures in accordance with real-time policy decisions from each state [15, 17]. This results in relaxation of the lockdown measures on May 8, with social distancing relaxing around 80 days later and $\lambda_m$ relaxing 43 days after the extension in social distancing. This is in accordance with NY's current plan to delay school openings until September 23. (Right) Scenario B extends social distancing and $\lambda_m$ an additional 200 days after lockdown is relaxed, accounting for a scenario in which schools move to a remote learning model for the fall semester, so the spike in infections is delayed until March 2021. In NY R = 6 . , α max = 0 . , δ max = 0 . , γ today = 0 . , which corresponds to 1 in 11 infections being confirmed.

(a) Scenario A. (b) Scenario B.
Figure 3: Florida (FL). (Left) Scenario A. In this case all lockdown control is relaxed around early May, while social distancing and $\lambda_m$ measures are extended 80 days to the beginning of August. This seems to match the data in the confirmed infection curve very closely, predicting a spike in infections if social distancing restrictions are not extended past August. (Right) Scenario B. In this case for Florida we extend social distancing and the $\lambda_m$ measures for an additional 90 days in comparison to Scenario A, meaning social distancing and $\lambda_m$ are now relaxed around early December. This not only pushes the second peak back but also reduces the peak height while increasing the peak width, further spreading out infections. In FL R = 3 . , α max = 0 . , δ max = 0 . , γ today = 0 . , which, similar to NC, corresponds to approximately 1 in 15 infections being confirmed.
For both of these scenarios we note that the gray uncertainty bands are present but cannot be seen in the plots because the uncertainty is relatively small.

The area under the curve in all the plots shown gives the total confirmed number of infected, which through the parameter $\gamma$ of each region can be converted into the total number of non-susceptibles in each region.

Conclusions
The impact of the Covid-19 pandemic on countries around the world and on the daily life of citizens is unprecedented. The traditional class of models used by epidemiologists to study viral infections spreading randomly through a population has been the SEIR model, which we briefly reviewed in Section 1. The latter cannot account for the impact of global restriction measures on the pandemic, quantify the reservoir, or capture the space and time dependence of these restrictions. We offer here a new and predictive mathematical model for the pandemic dynamics which, combined with machine learning, includes this space and time dependence.

In contrast to previous viral diseases, the type and extent of global measures taken to contain the spread of Covid-19 and prevent a health crisis are new. In this new situation, part of the challenge is finding a predictive model for the evolution of the pandemic which describes the mix of the order imposed by global restrictions with the disorder of a randomly spreading virus, and quantifies the impact of these control measures (their duration and the time they are introduced) in constraining the spread over extended periods of time.

In this work we presented an interdisciplinary approach to predicting the dynamics of future infections. Firstly, in Section 2 we developed a mathematical model which uses physics-based considerations to capture the effect of individual restrictions on the infection spread as a function of space and time. In our model the infection spread is a restricted self-avoiding random walker through a population. Our model belongs to a different mathematical class from SEIR-type models. It is stochastic, and its probability distribution function $P(\vec{r}, t)$ in any region at any time is found by solving a Fokker-Planck type equation (Eqn. 16). The strengths of the lockdown and of social distancing and masks on the infection curves in our model are given by the time-dependent parameters $\alpha(\vec{r}, t)$, $\delta(\vec{r}, t)$ respectively.
These restrictions introduce order in the random walk. The amount of disorder in the random walk which may persist through a population (for example a random mixing of susceptible and infectious groups through school openings) is still described by the SEIR-type term with strength $\beta(t)$; note, however, that in contrast to the SEIR models, $\beta$ is a function of space and time and captures population demographics and social behavior. The diffusion of the infection through a population not captured in the above terms (infections which somehow escape detection or the set of restrictions and quarantine) is given by a diffusion parameter $D(t)$ sourced by a noise term $f(t)$. This model can be applied to potential future pandemics, although of course we hope there won't be any.

Secondly, given the complexity of the model, we take full advantage of the benefits of AI in Section 3 to implement our model and estimate future infection curves for any region within a short time. Viruses need hosts to feed and multiply; therefore their unrestricted spreading phase depends on population demographics and human social behavior, in addition to the virus's characteristic reproductive rate $R_0$. We take advantage of this feature in training the machine. The machine learns the population density and social behavior networks for any region, state and country from the pre-lockdown data (March 2020), when the virus spread unconstrained. The machine then applies these learned characteristics when estimating future infection curves from our model and estimates the ratio of true infections for each confirmed infection, $\gamma(\vec{r}, t)$. In this work we implemented a simplified version of our stochastic model by taking diffusion to zero and solving the Langevin equation of Eqn. 12.
We deposited the code in Github [3] and will post examples of the predicted infection curves for each region located at $r$ as a function of a particular choice of restrictions $\alpha(t)$, $\delta(t)$, $\beta(t)$, $\gamma(t)$ on our website [7].

For the case of time-independent parameters, restricted SAW models in two dimensions can lead to critical phenomena of self-similar hot and cold clusters of infections at large times. In reality, the control measure parameters in our model are time dependent, as restrictions change with time and country. Therefore, it is not clear whether the evolution of this pandemic would mathematically be in the same universality class, with the same fractal properties and polynomial growth at late times, as its well-studied stationary SAW counterpart. We plan to show our AI-implemented infection curves for the complete mathematical model presented here, including diffusion, in a sequel paper. Among other elements, diffusion is an important indicator of the possibility of re-infection when travel is freely opened. In the present work we included the effect of border closing by imposing absorbing boundary conditions on our equations, meaning that the flux of population crossing boundaries in and out of a region is taken to be zero.

How to interpret our results: Our new AI-implemented model predicts future infection curves for any region as a function of time and of the restriction measures taken in that region. It explicitly quantifies the impact of each control measure on future infection curves over extended periods of time. It also quantifies the 'reservoir' through the parameter $\gamma$, which tells us the number of undetected infections for each confirmed one in every region. Different scenarios, such as the choice of one restriction over another or a combination of them, as well as the time they are imposed in the future, result in different predicted infection curves for the rest of the year. An added benefit is the speed of the machine in deriving these predictions. We illustrate the comparison of different outcomes for the same region in Section 4 by showing the predicted infection curves for a few regions under two different scenarios, one with schools closed and partial future lockdown and social distancing rules, and one with schools open and no second lockdown but extended social distancing. The model can be applied to any pandemic or disease whose spread is restrained by control measures. We hope that advance knowledge of the predicted infection curves as a function of regional restrictions given by our model will be useful in public health and economic considerations on the duration and combination of future restrictions that is optimal for the needs of each region.

Acknowledgements
LMH would like to thank Dr. C. Carothers and Dr. B. Yener for their readiness to help with AI resources available at RPI, and Dr. S. Alexander and Dr. J. Engel for useful discussions in the early stage of this project. We thank Dr. D. Budenz and Dr. D. Peterson for their time and expert advice on the public health and medical applications of our results. We are grateful to Dr. K. Simon and Dr. A. Bento for their feedback on our work. LMH acknowledges support from the Bahnson funds.

⋆ We are saddened that one of our colleagues and a co-author of this paper, Steven Meshnick, passed away shortly before our paper was finished.
References
Mathematical Biosciences and Engineering 4(4), 675-686. https://dx.doi.org/10.3934/mbe.2007.4.675
[5] Sanche, S., Lin, Y., Xu, C., Romero-Severson, E., Hengartner, N., & Ke, R. (2020). High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2. Emerging Infectious Diseases 26(7), 1470-1477. https://dx.doi.org/10.3201/eid2607.200282
[6] Ngonghala, C., Iboi, E., Eikenberry, S., Scotch, M., MacIntyre, C., Bonds, M., Gumel, A. (2020). Mathematical assessment of the impact of non-pharmaceutical interventions on curtailing the 2019 novel Coronavirus. Mathematical Biosciences 325, 108364.
[7] Under construction.
[8] Long, Q., Tang, X., Shi, Q. et al. (2020). Clinical and immunological assessment of asymptomatic SARS-CoV-2 infections. Nat Med.
[9] Guillotin-Plantard, N., Schott, R. (8). Dynamic Random Walks
[10] Kardar, M., Parisi, G., Zhang, Y. (1986). Dynamic Scaling of Growing Interfaces. Physical Review Letters 56(9), 889-892. https://dx.doi.org/10.1103/physrevlett.56.889; Kardar, M., Zhang, Y. (1987). Scaling of Directed Polymers in Random Media. Physical Review Letters 58(20), 2087-2090. https://dx.doi.org/10.1103/physrevlett.58.2087
[11] Flory, P. (1971). Principles of Polymer Chemistry, Chap. XII (Cornell University Press, Ithaca, N.Y.)
[12] Pascucci, A., & Pesce, A. (2019). On stochastic Langevin and Fokker-Planck equations: the two-dimensional case. arXiv: Probability
[13] "AI implemented predictions of a stochastic diffusive model", in progress.
[14] Perkins, T., Foxall, E., Glass, L. et al. (2014). A scaling law for random walks on networks.