A fractional model for the COVID-19 pandemic: Application to Italian data
Elisa Alòs, Maria Elvira Mancino, Raúl Merino, Simona Sanfelici
AA fractional model for the COVID-19pandemic: Application to Italian data
Elisa Alòs ∗ , Maria Elvira Mancino , Raúl Merino , and Simona Sanfelici Dpt. Economia i Empresa, Universitat Pompeu Fabra,c/Ramón Trias Fargas, 25-27, Barcelona, Spain. Barcelona Graduate School of Economics,c/Ramón Trias Fargas, 25-27, Barcelona, Spain. Department of Economics and Management, University of Florence,Via delle Pandette 32,50127 Florence, Italy. Facultat de Matemàtiques i Informàtica, Universitat de Barcelona,Gran Via 585, 08007 Barcelona, Spain. VidaCaixa S.A., Market Risk Management Unit,C/Juan Gris, 2-8, 08014 Barcelona, Spain. Department of Economics and Management, University of Parma,Via J.F. Kennedy, Parma - Italy.
August 4, 2020
Abstract
We provide a probabilistic SIRD model for the COVID-19 pandemic in Italy, where weallow the infection, recovery and death rates to be random. In particular, the underlyingrandom factor is driven by a fractional Brownian motion. Our model is simple and needs onlysome few parameters to be calibrated.
Keywords : SIRD model, fractional Brownian Motion, COVID-19
MSC classification : 92 C 60, 92 D 30, 60 G 22
JEL classification : C 02, C 32, C 63, I12 ∗ Corresponding author, [email protected] a r X i v : . [ q - b i o . P E ] J u l Introduction
Since the start of the COVID-19 pandemic many models have been proposed in the literature toexplain its dynamics. Most of these approaches are compartmental models, where the populationis divided into compartments that describe the different situations regarding the infectious disease(like susceptible, infected, etc.), and people progress between them. The basic reference compart-mental model is the SIR model, where the population is divided into susceptible (S), infected (I)and recovered (R). As this model is too simple to describe the complexity of several epidemics,some extensions have been proposed in the literature. For example , the SIRD model considersalso the compartment of deaths (D), and the SEIR introduces exposed (E). Some other classicalmodels, like the SEIRS, also allow to model the lost of immunity after recovery. Some recent ap-proaches pay special attention to non directly observable compartments as asymptomatic (A) (see,for example, [3] and the references therein) that, even not being directly observable, play a crucialrole in the pandemic. Other extensions consider time-dependent parameters, as in [1], where thekinetic of the rates of infection ( β ) and death ( µ ) are modeled by exponential functions, while therecovery rate ( γ ) is of logistic type. Recent literature also exploits stochastic models, where themain variables account for a noise (Brownian) component, or branching processes [2, 5, 6, 7].In this paper, we introduce a probabilistic SIRD model, where we allow the coefficients to bestochastic processes determined by a few number of parameters. This randomness is a way toaccommodate the effect of several unobservable factors, as the loose of immunity or the existenceof asymptomatic. Our construction of the model is motivated by the descriptive analysis of theparameter time series, where we observe that the β returns have negatively correlated increments,an observation that suggests to model them by a fractional Brownian motion (fBm) with a Hurstparameter H < [4]. We recall that this approach does not need to consider a drift for β , but thisdrift arises simply by the properties of the fBm. Once modeled β , we see that simple relationshipsbetween the evolution of diagnosed and infected, allow us to model γ and µ .The model is calibrated in such a way the mean paths of infected, death and recovered fit thecorresponding observed values, as well as the variability of β, γ , and µ . Our approach is not onlyable to adjust observed data, but also to study the different possible scenarios according to itsstochastic behavior.The paper is organized as follows. In Section 2 we present a descriptive analysis of β , γ and µ corresponding to the evolution of the pandemic in Italy ( https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-andamento-nazionale/dpc-covid19-ita-andamento-nazionale.csv ). Section 3 is devoted to present our stochastic SIRD model for the Italian COVID-19 out-break. The model is calibrated in Section 4, while we simulate some different scenarios in Section5. Finally, a discussion of the results and proposals of future research are presented in Section 6. Let us consider a stochastic SIRD model of the form S n +1 = S n − β n I n S n /NI n +1 = I n (1 + β n S n /N − γ n − µ n ) R n +1 = R n + γ n I n D n +1 = D n + µ n I n , (1)where S, I, R, D = { S n , I n , R n , D n , n = 1 , ..., } denote the number of daily observed susceptible,infected, recovered and death, N is the population size and β, γ, µ = { β n , γ n , µ n , n = 1 , ..., } S/N ≈ in all the data set, we considerthe following simpler version of the above model S n +1 = S n − β n I n I n +1 = I n (1 + β n − γ n − µ n ) R n +1 = R n + γI n D n +1 = D n + µI n , (2)Now we observe the behavior of this model in Italy in the period from the 24/2/2020 to the28/7/2020. In Figure 1 we can observe the paths of I ( totale positivi in the data set), R ( dimessi-guariti ), D ( deceduti ) and the total number of diagnosed ( I + R + D , totale casi ).Figure 1: Evolution pandemic in Italy. β Now let us observe β, γ, µ . In Figure 2 we can see the behavior of β , that is a decreasing function oftime. This fits what expected due to the lock-down, that reduced the number of contacts betweenindividuals. In Figure 3, we can see the corresponding increments (i.e., ∆ β = β n +1 − β n ), thatare more variable at the beginning of the sample, when β is higher.3igure 2: Evolution of β .Figure 3: Increments of β .Moreover, in Figure 4 we can see the returns ∆ ββ that have a stationary mean. After deletingthe outliers at days 115 and 119 (making them equal to zero), the corresponding autocorrelationfunction, see Figure 5, reveals a negative correlation between increments. The empirical analysissuggests to model β as a process with negative correlated increments with a decreasing mean.This process is designed in Section 3. 4igure 4: Evolution of the returns of β .Figure 5: Autocorrelation function of β returns. µ The time behavior of µ can be seen in Figure 6. We observe that it is similar to the behavior of β , but with a delay. Moreover, Figures 7 and 8 show a clear relationship between the daily newdiagnosed and deaths, with a delay of 4 days. In Figure 9, we can see a strong link between totaldiagnosed and deaths. In Figures 10 and 11 we can see that this relationship is linear. Then, asthe number of new infected is given by βI , the daily increments of deaths ( ∆ D ) would be modeled This finding appears consistent with data by Istituto Superiore di Sanità . c µ β n − I n − , for some positive constant c µ . This leads to the following model for µ : µ n = c µ β n − I n − /I n , Figure 6: Evolution of µ .Figure 7: Comparing the time series of 0.14 · ∆ Infected against ∆ Death.6igure 8: Comparing the time series of 0.14 · ∆ Infected against ∆ Death.Figure 9: Comparing the time series of diagnosed (I+R+D) with 4 days lag against Death.7igure 10: Linear Dependence between ∆ Diagnosed (with 4 days lag) and ∆ DeathFigure 11: Linear Dependence between total number of Diagnosed (with 4 days lag) and Deaths γ Finally, let us observe the path of γ in Figure 12. This recovery rate seems to move between anupper and a lower bound, that we can interpret as the rate under stressed conditions of the healthsystem, and the rate when this system adapts to the new scenario and the epidemic curve decays.This is connected to the approach in [1], where γ is assumed to be described by a logistic function,and the rate of recovery increases from a initial value γ and it stabilizes at some higher level.8igure 12: Evolution of γ .A measure of the stress of the model can be defined as the quantity ( I + R + D ) − n / ( I + R + D ) n , (3)where ( I + R + D ) − n is the average of the diagnosed people the last days. If there are nonew infections, the number of diagnosed remains stable, and then the above quantity is near 1.Otherwise, a high increment of new cases translates into a decrease of the value of this ratio. InFigure 13, we can see the behavior of this processFigure 13: ( I + R + D ) − n / ( I + R + D ) n We can also observe in Figure 12 a big variance, as expected due to the fact that not everybodyrecovers at the same speed. Then we would like to add some noise, multiplying this time seriesby an adequate random factor, as we detail in the following section.9
A stochastic SIRD model
We propose a stochastic SIRD model where randomness is embedded by modeling the parametersof infection ( β ), mortality ( µ ) and recovery ( γ ). In the following subsections, we will describe themodel for each one of the parameters. β . As the return of β are negatively correlated, it is natural to construct a model based on thefractional Brownian motion (fBm). The fBm is a Gaussian process with stationary increments,which depends on a parameter H ∈ (0 , called the Hurst parameter. More precisely, a process B H = B Ht , t ∈ [0 , T ] is called a fractional Brownian motion (fBm) if • E ( B H ) = 0 . • E ( B Ht B Hs ) = ( s H + t H − | t − s | H ) .Except in the Brownian motion case H = 1 / , the fBm increments are correlated. Denote X n = B Hn − B Hn − and define ρ H ( n ) := Cov ( X , X n +1 ) . Then we have ρ H ( n ) = E ( B H ( B Hn +1 − B Hn ))= 12 (1 H + ( n + 1) H − n H ) −
12 (1 H + n H − ( n − H )= 12 (cid:0) ( n + 1) H + ( n − H − n H (cid:1) (4)Notice that this quantity is positive for H > and negative if H < .Then, a natural candidate to model β is given by ˆ β n +1 = ˆ β n (1 + ( B Hn +1 − B Hn )) , (5)where B H is a fractional Brownian motion with H < and k is a constant. Apart from thenegative correlated returns, this process has also a decreasing mean, as we prove in the followingresult. Proposition 3.1.
Consider a fractional Brownian motion B H with H < and the process definedby (8). Then, provided β > , E ( ˆ β n ) is decreasing as n → ∞ .Proof. A recursive computation gives us that ˆ β n := ˆ β A n , with A n := Π nm =1 (1 + k ∆ B Hm ) , where ∆ B Hm = B Hm +1 − B Hm . The above product can be written as n (cid:88) i =1 k i (cid:32) (cid:88) m < ··· 12n the other hand, the observations in Section 2 lead to a model for γ of the type γ n = c γ [( I + R + D ) − n / ( I + R + D ) n ] exp( c γ ( B H γ n +1 − B H γ n )) , for some positive constants c γ , c γ and for some adequate Hurst parameter H γ . The above models for β, γ , and µ lead to the following stochastic SIRD model: β n +1 = (cid:80) i =1 β in , , with β in +1 = β in (1 + c β ( B i,Hn +1 − B i,Hn )) µ n = c µ β n − I n − I n γ n = c γ [( I + R + D ) − n / ( I + R + D ) n ] exp( c γ ( B H γ n +1 − B H γ n ) S n +1 = S n − β n I n I n +1 = I n (1 + β n − γ n − µ n ) R n +1 = R n + γI n D n +1 = D n + µI n . (9)Notice that only 7 parameters have to be calibrated are m , H , H γ and c β , c µ , c γ , c γ . We see inthe following section how the set of the first three parameters m , H , H γ , that define the drivingprocesses of the model, is chosen based on empirical observations, while the last group is calibratedby means of a classical least squares method. The calibration procedure is as follows. • In a first step, we fix reasonable values of m and the Hurst parameters according to observeddata, and • fixed m , H , and H γ , we calibrate c β , c µ , c γ , c γ by a least squares method.Step 1 focuses on choosing adequate driving random processes for the model. As estimating withprecision and robustness this group of parameters is not straightforward (see for example Glotter(2007)), so we simply proceed empirically. More precisely, we have seen in Section 2 that themaximum and the minimum paths in the case H = 0 . and m = 10 envelope the observed β timeseries. Moreover, the observed correlation of beta returns (for lag=1) is equal to − . , while(from a 1000 simulations sample), we estimate this quantity to be − . for H = 0 . and m = 10 .This leads to choose m = 10 , H = 0 . in our model.In order to choose H γ , we compute the autocorrelation function of log (cid:18) γ n [( I + R + D ) − n / ( I + R + D ) n ] (cid:19) . At lag=1, this autocorrelation function is equal to . . Then, taking into account Equation (4),we get a estimation of H γ = 0 . . Then we choose H γ = 0 . for the sake of simplicity. c β , c µ , c γ , c γ and global calibration In order to be able to start the calibration of the other parameters, we will need to have an initialguess. 13o find the parameter c β , we are going to fit the average ˆ β against the realized β . This givesus a value of 0.32250809.For µ , as we have seen previously, we can obtain a nice estimation by doing a linear regressionbetween the infected and death people. The slope ( 0.13904755) will be the initial guess for c µ .In the case of γ , in order to obtain the variables c γ , c γ , we minimize the distance betweenthe average ˆ γ and the observed γ , as well as the distance between the standard deviation of thereturns of ˆ γ and the observed γ . This gives us (0.03512639,0.5) as initial guess.Once we have the initial guess, we find the best possible parameters to minimize the distancebetween the infected, the death and the recovered time series as well as the empirical variance ofthe γ by OLS. Following the procedure in the previous section, we get the following estimation of the parametersof the model: c β = 0 . , c µ = 0 . , c γ = 0 . , c γ = 0 . . Simulating the model and taking the corresponding mean paths we fit the observed data, as wecan see in Figure 17. Figure 17: Simulated mean paths β , γ , and µ The same analysis can be done for β , γ , and µ :14igure 18: betaFigure 19: mu15igure 20: gamma Some simulated paths for β, γ and µ can be observed in the following figures:Figure 21: β µ Figure 23: γ Different random scenarios can be seen in the following figures:17igure 24: Simulated scenarioFigure 25: Simulated scenario18igure 26: Simulated scenario In the present work, we have extended the classic SIRD model including stochastic parametersand we obtain an easy-to-calibrate pure probabilistic model. We have been able to reproduce theevolution of the parameters of the Italian outbreak using only seven parameters. The propertiesof the fractional Brownian motion and the relationship between the parameters of the model areable to reproduce the exponential and logistic trends observed for example in [1]. Therefore, wehave been able to fit the empirical data as well as imitate the noise of the variables. One of theadvantages of having a stochastic model is that we are capable of generating a wide variety ofscenarios.One challenging problem is now to model the evolution after a lockdown. This translates intoa new dynamics of β , probably driven by a fBm with H > . Moreover, the comparison betweendifferent countries that applied different policies during the pandemic would be of great interest. References [1] D. Caccavo: Chinese and Italian COVID-19 outbreaks can be correctly described by a modifiedSIRD model. MedRxiv, 2020.[2] L. Changguo, P. Yongzhen, Z. Meixia, and D. Yue: Parameter Estimation on a Stochastic SIRModel with Media Coverage. Discrete Dynamics in Nature and Society Volume 2018, ArticleID 3187807.[3] B. Ivorra, M. R. Ferrández, M. Vela-Pérez, A. M. Ramos: Mathematical modeling of the spreadof the coronavirus disease 2019 (COVID-19) taking into account the undetected infections. Thecase of China. Communications in Nonlinear Science and Numerical Simulation , Vol. 88, 2020.[4] B. Mandelbrot, and J.W. van Ness: Fractional Brownian motions, fractional noises and appli-cations,