A Data-driven Understanding of COVID-19 Dynamics Using Sequential Genetic Algorithm Based Probabilistic Cellular Automata

Abstract

The COVID-19 pandemic is severely impacting the lives of billions across the globe. Even after massive protective measures such as nation-wide lockdowns, discontinuation of international flight services and rigorous testing, the infection is still spreading steadily, causing thousands of deaths and a serious socio-economic crisis. Thus, identifying the major factors of the infection spreading dynamics is becoming crucial to minimize the impact and lifetime of COVID-19 and any future pandemic. In this work, a probabilistic cellular automata based method has been employed to model the infection dynamics for a significant number of different countries. This study proposes that, for accurate data-driven modeling of this infection spread, cellular automata provide an excellent platform, with a sequential genetic algorithm for efficiently estimating the parameters of the dynamics. To the best of our knowledge, this is the first attempt to understand and interpret COVID-19 data using cellular automata optimized through a genetic algorithm. It has been demonstrated that the proposed methodology can be flexible and robust at the same time, and can be used to model the daily active cases, total number of infected people and total death cases through systematic parameter estimation. Elaborate analyses of COVID-19 statistics of forty countries from different continents have been performed, with markedly divergent time evolution of the infection spreading because of demographic and socioeconomic factors. The substantial predictive power of this model has been established, with conclusions on the key players in this pandemic dynamics.


Sayantari Ghosh (a), Saumik Bhattacharya (b, *)
(a) Department of Physics, National Institute of Technology Durgapur, India
(b) Department of E & ECE, Indian Institute of Technology Kharagpur, India


Keywords: Epidemiological model; Probabilistic cellular automata; Genetic algorithm; Real data modeling.

* Corresponding author

Email addresses: [email protected] (Sayantari Ghosh), [email protected] (Saumik Bhattacharya*)

Preprint submitted to Elsevier, August 28, 2020. arXiv: [q-bio.QM]

1. Introduction

Since its outbreak in Wuhan, China, coronavirus disease 2019 (COVID-19) has spread across the world within a few months. Owing to its explosive growth and considerable fatality rate, the World Health Organization (WHO) declared COVID-19 a pandemic and a global health emergency [1]. According to the available statistics in June, 2020, the total number of infections by SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2), the causative agent of this disease, is approaching 19 million around the world, causing around 700,000 deaths in 213 countries and territories, with no effective vaccine available on the market so far. Beyond respiratory discomforts including pneumonia, dry cough, cold and sneezing [2, 3], it has been reported to cause liver and gastrointestinal tract maladies, kidney dysfunction and heart inflammation in cases of severe infection [4, 5, 6]. This highly infectious disease is transmitted from person to person through respiratory droplets produced by an infected person. Fomite-mediated and nosocomially acquired infections are also being identified as important sources of viral diffusion [7, 8, 9]. A typical incubation time from exposure to symptoms has been reported for COVID-19, while infection transmission from asymptomatic individuals has been observed as well [10, 11, 12].

Immediately after the detection of human-to-human transmission, the government agencies of various countries started implementing several mitigation strategies to control the epidemic. The measures taken include social distancing, restrictions on domestic as well as international travel, cancellation of social events, and the shutting down of public as well as commercial activities, all of which can effectively reduce the possibilities of physical human contact. Moreover, contact tracing, aggressive testing, and hospital or home quarantine for infected individuals and suspected cases have also been carried out to track and prevent further spread. However, these strategies are directly contributing to enormous economic loss. The optimum estimation of the dynamics of this novel disease is emerging as a challenging problem in this context. The immense disruption caused by COVID-19, resulting in overwhelming disorder in the health, economy and lives of billions of people around the globe, has brought the necessity for accurate modelling of infectious diseases into focus. The effect and effectiveness of this complex interplay between differing length-scales and time-scales with the applied control strategies can only be understood and predicted with the help of precisely designed quantitative models.

With a tremendous effort from researchers around the world, a spectrum of mathematical and computational approaches is being used to understand and predict COVID-19 statistics, addressing its different perspectives. In a rudimentary sense, the studies being pursued can be divided into two categories: (i) data science and machine learning approaches and (ii) differential equation based mathematical modelling techniques. The first group of studies relied mostly on data mining from national/international repositories (e.g., WHO, country-specific data centres etc.) or popular social media platforms to

Figure 1: An overview of the dynamics: (a) Object process diagram of the proposed model; (b) The schematic diagram of the disease transmission dynamics in the form of a modified SEIQR model. Transition probabilities $p_{se}$, $p_{ei}$, $p_{iq}$, $p_{ir}$ and $p_{qr}$ are pointed out. The associated state transition delays are indicated on the timeline of the disease dynamics. (c) Time evolution of the spatial lattice during spread of the infection in a population. The colors of the respective subpopulations (i.e., susceptible, exposed, infected, quarantined and removed) are the same as depicted in (a).

Table 1: Comparison of the proposed method with the state-of-the-art COVID-19 models

Basic methodology | Differential equation models | Data science approaches
References | [33-37], [39], [40] | [13-22]
Limitations | a) Homogeneous mixing; b) Most models are deterministic | a) No way to track person-to-person transmission; b) No neighborhood consideration
Contribution | The proposed method a) accommodates heterogeneity in population; b) includes stochasticity and probabilistic dynamics; c) estimates optimum epidemic dynamics parameters; d) considers neighborhood and demography explicitly; e) performs robust prediction with limited data.

In this study, we propose a probabilistic cellular automata based dynamical model, optimised through a sequential genetic algorithm, for an accurate assessment of the extent of COVID-19 dynamics. The major motivation for using cellular automata (CA) is their ability to depict extremely complex macroscopic outcomes while being based on local interactions, which rely on the interaction of a multitude of single individuals [41, 42]. This methodology is capable of giving a direct correspondence to the physical system and also rectifies the major drawbacks of ODE models by (i) tracking individual contact processes, (ii) giving room for introducing probabilistic individual behaviour, and (iii) capturing neighbourhood as well as global spatial information. For these reasons, CA based approaches have been successfully used as a competent substitute method to simulate physical, biological, environmental and social contagion-like spreading [43, 44, 45, 46]. For studying past epidemics as well as interpreting COVID-19, some studies have proposed cellular automata as an alternative method [47, 48, 49, 50]. However, capturing and interpreting the behaviour of real data through CA needs a large-scale parameter optimization that could be time consuming as well as sub-optimal. Thus, despite being extremely flexible and powerful, CA has not yet been optimized to understand and interpret COVID-19 data for countries worldwide. To explore this, in this study a genetic algorithm (GA) has been employed, which is a well-known method for generating the optimal parameter subset through stochastic search procedures based on the principle of the survival of the fittest [51, 52, 53, 54, 55]. Crossover and mutation, two key operations of a genetic algorithm, help to optimize the parameter set efficiently in limited steps. Cellular automata coupled with a genetic algorithm have been used before to explore evolutionary aspects of game theoretical problems [56], but to the best of our knowledge, analyzing and developing understanding from real pandemic data like COVID-19 using an optimized CA platform has not been attempted yet.
The main contributions of this work are as follows:

• To build a CA model which is probabilistic, so that it can take into account demographic variations, neighbourhood diversity and uncertainties of real dynamics.

• To create an easily implementable framework where optimization using GA is done sequentially for all parameters associated with the transition rules of the CA model for real data interpretation.

• To interpret and understand COVID-19 disease transmission dynamics with an optimized CA framework, which can be extended for prediction as well.

Through this, on one hand, one can track the individual contact process through time and space; on the other hand, a self-adapting process of evolutionary strategies has been created by designing the chromosome with parametric genes and establishing a fitness function that is maximised over the generations. The main limitations of the state-of-the-art algorithms and the major contributions of the proposed method are listed in Table 1 for a clear understanding. The main rationale behind this approach is that it is extremely difficult to find the optimal parameters of a complex spatial epidemiological model using random search or analytical techniques. The proposed GA based framework helps to search the parameter space more efficiently for the optimal performance of the entire algorithm.

The rest of this article is organized as follows: Section 2 presents the proposed concepts of the epidemiological model, the probabilistic cellular automata and the sequential genetic algorithm used in this work. In Section 3, the results are discussed in detail, where the optimized CA model has been employed for simultaneously understanding and analyzing active infections, total infections and total deaths caused by COVID-19 for several countries, considering the demographic and spatial population density variations. Section 4 contains concluding remarks.

2. Proposed Methodology

An object process diagram of the proposed method is depicted in Fig. 1(a). The methodology starts with the infection spreading according to the SEIQR epidemiological model in a random human population over a 2D grid, initialized on a country-specific basis. The parameters of the epidemiological model are continuously optimized using the proposed sequential genetic algorithm to match the real country-specific infection spread data. The proposed methodology consists of three distinct parts: (A) an epidemiological model that governs the infection spreading, (B) probabilistic cellular automata (PCA) to model the dynamics of the pandemic spread, and (C) optimization of the parameters associated with the PCA using a genetic algorithm (GA) to fit real-world data.

Table 2: Descriptions of the parameters used in the proposed work

Notation | Description
$L$ | Spatial lattice
$A$ | Set of possible states on lattice
$n_i^t$, $i \in A \setminus \{0\}$ | Total number of people at state $a_i$ at time $t$
$\Omega_d$ | $d$-neighbourhood of $x \in L$
$\Lambda_t^a$ | Mapping $L \to a$ at time $t$
$p_{ij}^t$ | Probability at time $t$ that $x \in L$ moves from $a_i$ to $a_j$
$\tau_{ij}$ | Transitional delay for $x$ to move from $a_i$ to $a_j$
$e_t$, $i_t$ | Number of exposed and infected people in the $d$-neighbourhood of $x$ at time $t$
$p_e$, $p_i$ | Probabilities that an exposed or an infected person spreads the infection to a susceptible person when they meet
$\Theta$ | A gene containing all the parameters of PCA method
$B$ | Binary encoded representation of $\Theta$
$G(\Theta)$ | The PCA model with parameter $\Theta$
$y$ | Time series of an epidemiological state in a country
$\hat{y}$ | Time series estimate of epidemiological state from PCA
$e_i^j$ | Estimation error of $j$th gene in $i$th generation
$N_g$ | Total number of chromosomes in gene pool
$F$ | Number of parents selected for mating from $N_g$
$p_\beta$ | Fraction of $r_t$ that recovers from the disease
$\rho$ | Fraction of parents $F$ that lives in the next generation

2.1. Epidemiological model

In the epidemiological model, the entire population is partitioned into five distinct parts. At the very beginning, every person is healthy but vulnerable to the infection. These people are denoted as the susceptible ($S$) subpopulation. At time instance $t = 0$, some people in the population get exposed to the infection from some known or unknown source. These exposed people do not have any particular symptom of the infection, but they can spread the infection to susceptible people. These asymptomatic people are referred to as the exposed ($E$) subpopulation. At time instance $t = 0$, there are also some people who have clear symptoms of the infection and also have the potential to spread the infection among susceptible people. These symptomatic people are considered as the infected ($I$) subpopulation. After an incubation period, some of the exposed people show the symptoms of the infection and move to subpopulation $I$. Because of the health facilities and testing time, the infected people are detected with some average delay and put into quarantine. The people who are quarantined cannot spread the infection to other people, though they themselves remain in the infectious stage. These people are denoted as the quarantined ($Q$) subpopulation. Both the quarantined people and the infected (but not detected) people come out of the infectious stage eventually, and after that they no longer contribute to the infection spreading dynamics. These people are denoted as the removed ($R$) subpopulation in the model. This removed subpopulation contains two kinds of people: those who have recovered from the infection completely and neither infect nor get infected in future, and those who have died owing to the severity of the infection. A schematic diagram of the transitions, probabilities and timelines corresponding to the dynamics of infection is shown in Fig. 1(b). In the analysis, normalized subpopulations have been considered, and the respective normalized subpopulation is denoted using the same lowercase character. For example, the normalized susceptible and infected subpopulations are denoted by $s$ and $i$ respectively.
As shown in Fig. 1(c), this epidemiological time evolution has been implemented on a 2D lattice using PCA as discussed below.

2.2. Probabilistic cellular automata

Let $L$ be a finite subset of $\mathbb{Z}^2$ at time instance $t$, denoted as $L \sqsubset \mathbb{Z}^2$, which defines a regular 2D lattice. Every point $x \in L$ on this lattice can acquire one of a finite number of states $A$. In this particular problem, the set $A$ can be defined as $A = \{0, s, e, i, q, r\}$, where the terms $s$, $e$, $i$, $q$ and $r$ denote the possible states of infection as discussed in Sec. 2.1, and $0$ denotes no human occupant, i.e., an empty space. At time $t = 0$, $n_i$ points are randomly selected on $L$ and assigned the state $a_i$, where $i \in A$. The total initial population is defined as $N = \sum_{i \in A \setminus \{0\}} n_i$. At any instance of time $t$, $n_i^t$, $i \in A \setminus \{0\}$, denotes the total number of people in state $a_i$.

For the neighbourhood criterion, a modified Moore neighbourhood, or $d$-neighbourhood, has been used. A finite subset $\Omega_d \sqsubset \mathbb{Z}^2$ is defined, containing the origin $\mathbf{0} = (0, 0)$; the number of neighbours of a cell for a given $d$ is $4d(d+1)$. A general probabilistic cellular automaton (PCA) is a stochastic process that describes a sequence of mappings $\Lambda_t^a : L \to a$, $a \in A$, where any particular state $\Lambda_t^a(x)$ of $x \in L$ at a particular time instance $t$ depends, with certain probabilities, on the previous states of the $d$-neighbourhood of $x$, denoted as $x + \Omega_d = \{x + \omega : \forall \omega \in \Omega_d\}$. More precisely, in COVID-19 infection spread, $\Lambda_t^E(x)$ is decided by $\Lambda_{t-1}(x + \omega)$, $\forall \omega \in \Omega_d$. The other mappings $\Lambda_t^a(x)$, $a \in A \setminus E$, depend on the sequence of states $\Lambda_\kappa^a(x)$, $0 \le \kappa < t$.

The transition probability $p_{a_i a_j}^t$ denotes the probability of transition at time $t$ from state $a_i$ to state $a_j$, where $a_i, a_j \in A$. Without any loss of generality, $p_{a_i a_j}^t$ is denoted as $p_{ij}^t$ and the transition from state $a_i$ to $a_j$ as $a_{ij}$ in the rest of the discussion for a simpler notation.
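As a rough illustration of the lattice set-up described above, the following Python sketch builds the offsets of a $d$-neighbourhood and scatters an initial population over a square lattice. The function names (`neighbourhood_offsets`, `init_lattice`), the dictionary-of-counts interface and the string labels for the states are assumptions made for this illustration; the paper does not prescribe an implementation.

```python
import random

EMPTY = "0"  # state 0: empty cell; the other states are "s", "e", "i", "q", "r"


def neighbourhood_offsets(d):
    """Offsets of the d-neighbourhood (modified Moore), origin not counted."""
    return [(dx, dy)
            for dx in range(-d, d + 1)
            for dy in range(-d, d + 1)
            if (dx, dy) != (0, 0)]


def init_lattice(size, counts, seed=None):
    """Assign counts[state] individuals to distinct, randomly chosen cells."""
    rng = random.Random(seed)
    total = sum(counts.values())
    if total > size * size:
        raise ValueError("population exceeds the number of lattice cells")
    cells = rng.sample(range(size * size), total)  # distinct cell indices
    lattice = [[EMPTY] * size for _ in range(size)]
    pos = 0
    for state, n in counts.items():
        for idx in cells[pos:pos + n]:
            lattice[idx // size][idx % size] = state
        pos += n
    return lattice


# Small usage example: 10 x 10 lattice with 20 susceptible, 5 exposed, 2 infected.
demo = init_lattice(10, {"s": 20, "e": 5, "i": 2}, seed=1)
```

For any $d$, `len(neighbourhood_offsets(d))` equals $(2d+1)^2 - 1 = 4d(d+1)$, which matches the neighbour count quoted in the text.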
In cases where $a_i \neq a_j$, $p_{ij}^t$ is referred to as the state transitional probability, and if $a_i = a_j$, $p_{ii}^t$ is called the self transitional probability.

If a state transition $a_{ij}$, $i \neq j$, happens in $x$ at time $t$ following the transition probability $p_{ij}^t$, and the transition $a_{ij}$ has a transitional delay $\tau_{ij}$, then

$$p_{ij}^t = \begin{cases} 0 & \text{if } t < t_{ui} + \tau_{ij} \\ p_{ij} & \text{if } t \ge t_{ui} + \tau_{ij} \end{cases}$$

where $t_{ui}$ is the time instance when the transition $a_{ui}$, $u \neq i$, happened. In this infection diffusion model, only the state transitional probabilities $p_{se}^t$, $p_{ei}^t$, $p_{iq}^t$, $p_{qr}^t$ and $p_{ir}^t$ are considered to be nonzero at certain instances of time, and for all the other transitional probabilities, $\tau_{ij}$ is set to infinity, where $p_{ij}$ and $\tau_{ij}$ are user defined parameters. However, for the transition $a_{se}$, $t_{ui}$ and $\tau_{ij}$ are set to zero, and for $x \in L$, let us define $p_{se}^t = p_{ij} = 1 - p_{ss}^t$ and the self-transition probability

$$p_{ss}^t = (1 - p_i)^{i_{t-1}} (1 - p_e)^{e_{t-1}}$$

where $i_{t-1}$ and $e_{t-1}$ are the numbers of cells in states $i$ and $e$ respectively in the $\Omega_d$ neighbourhood of $x$ at time $t - 1$. Here, $p_e$ and $p_i$ are defined as 'infection probabilities', which can be considered as the probabilities that a susceptible person becomes exposed to the infection when that person meets an exposed or an infected person respectively. An empty cell does not contribute to the infection spread, and thus the self transitional probability $p_{00}^t = 1$, $\forall t$. Among the total removed population $r_t$ at time instance $t$, a population fraction $p_\beta r_t$ is considered to recover from the infection at time instance $t$ and acquire long-term immunity towards the disease, and a population fraction $(1 - p_\beta) r_t$ is considered to be deceased.

Figure 2: Time series data for active cases (blue) of the COVID-19 pandemic in different countries where the peak of the first wave of the infection spread has passed, and estimated active cases (red) from the proposed PCA-GA method.
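The two transition rules above can be sketched in a few lines of Python. This is only a minimal illustration under the definitions given in the text; the function names and argument order are hypothetical, and `t_entered` stands for the time $t_{ui}$ at which the cell entered its current state.

```python
def self_transition_prob(p_i, p_e, n_inf, n_exp):
    """p_ss^t = (1 - p_i)^(i_{t-1}) * (1 - p_e)^(e_{t-1}).

    n_inf and n_exp are the numbers of infected and exposed cells found in
    the d-neighbourhood of the susceptible cell at the previous time step.
    """
    return (1.0 - p_i) ** n_inf * (1.0 - p_e) ** n_exp


def exposure_prob(p_i, p_e, n_inf, n_exp):
    """p_se^t = 1 - p_ss^t: probability that a susceptible cell gets exposed."""
    return 1.0 - self_transition_prob(p_i, p_e, n_inf, n_exp)


def delayed_prob(p_ij, tau_ij, t, t_entered):
    """p_ij^t: zero until tau_ij steps after the cell entered state a_i."""
    return p_ij if t >= t_entered + tau_ij else 0.0


# A forbidden transition can be expressed with an infinite delay.
never = delayed_prob(0.9, float("inf"), t=1000, t_entered=0)
```

With no exposed or infected neighbours the exposure probability is exactly zero, and it grows monotonically with each additional infectious neighbour, which is the behaviour the product form is meant to capture.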
The removed population $r_t$ is not considered further in the infection dynamics, and it is taken that $p_{rr}^{t'} = 1$, $t' > t$.

2.3. Parameter optimization using GA

Though PCA has the potential to model the probabilistic transition of states on a spatial lattice, the main challenge in using it to model a real-world scenario is to find the optimal parameters for the PCA. As the search space for the proposed PCA model is very large, it is practically impossible to search for the optimal parameter setting manually in order to analyse the characteristics of the infection spread from real data. Thus, a genetic algorithm (GA) has been applied to find the optimal parameter set given a real time series.

Let us assume a discrete time signal $y[n]$, $0 \le n \le (T - 1)$, associated with the real-world infection spread. The PCA model is denoted by $G(\Theta)$, where $\Theta = [\theta_1, \theta_2, \ldots, \theta_h]$ denotes the set of parameters used for the PCA model. If $\hat{y}[n]$, $0 \le n \le (T - 1)$, is the time evolution of the desired variable in the model $G(\Theta)$, then the objective is to find an optimal parameter set $\Theta^*$ such that $\hat{y}[n] \to y[n]$, $\forall n$. To apply GA, each $\theta_i$, $1 \le i \le h$, is encoded as a string of binary digits $b_i$ [54, 55], assuming that $\theta_i$ has a bound $|\theta_i| < \zeta_i$, $1 \le i \le h$. This binary string is referred to as a gene, and the genes concatenated in the order of appearance of the respective $\theta_i$ in $\Theta$ are called the chromosome. For example, if $B$ is the chromosome corresponding to parameter set $\Theta$, $G(B)$ is equivalent to $G(\Theta)$. A collection of $N_g$ chromosomes of estimated parameters, often referred to as the gene pool, is evaluated at every time step (called a generation). In our work, the error of each chromosome has been evaluated using the $l_1$ norm distance. At the $i$th generation, the error of the $j$th chromosome $B_i^j$ is computed as

$$e_i^j = \| y - \hat{y}_i^j \|_1 = \sum_{n=0}^{T-1} \left| y[n] - \hat{y}_i^j[n] \right|$$

where $\hat{y}_i^j$ is the estimated output of $G(B_i^j)$ in vector form and $\hat{y}_i^j[n]$ is the value of $\hat{y}_i^j$ at time instance $n$. At each generation, GA finds $\min(e_i^j)$, $\forall j$, and tries to make $e_i^j \to 0$ as $i \to \infty$. In the proposed framework, some of the parameters are probabilities with a range of 0 to 1, and some of the parameters are associated with time (in days), which are discrete integers greater than or equal to zero in our case. Thus, the parameters are initialized randomly, keeping their domain restrictions intact.

For mating, two chromosomes, often referred to as parents, are selected from the gene pool considering their 'fitness'. For the two selected parents, a crossover point or splice point is selected at $b_i$, $1 \le i \le h$, in both chromosomes, and a crossover [55] happens that produces two offspring. In our approach, the fitness $f_i^j$ of each chromosome has been defined as the inverse of its error at a particular generation.
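The encoding, error and fitness definitions above can be illustrated with the following Python sketch. It assumes a uniform fixed-point quantisation with a fixed number of bits per gene and a small constant `eps` in the fitness to avoid division by zero; neither choice is stated in the paper, and all function names are hypothetical.

```python
def encode(value, lo, hi, bits=8):
    """Quantise a bounded parameter to a fixed-width binary string (one gene)."""
    levels = (1 << bits) - 1
    clipped = min(hi, max(lo, value))
    q = round((clipped - lo) / (hi - lo) * levels)
    return format(q, "0{}b".format(bits))


def decode(gene, lo, hi):
    """Map a binary gene back to a parameter value in [lo, hi]."""
    levels = (1 << len(gene)) - 1
    return lo + int(gene, 2) / levels * (hi - lo)


def to_chromosome(theta, bounds, bits=8):
    """Concatenate the genes in the order the parameters appear in theta."""
    return "".join(encode(v, lo, hi, bits) for v, (lo, hi) in zip(theta, bounds))


def l1_error(y, y_hat):
    """e = sum_n |y[n] - y_hat[n]| (l1 norm of the residual)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat))


def fitness(y, y_hat, eps=1e-12):
    """Inverse of the l1 error; eps is an assumption, not from the paper."""
    return 1.0 / (l1_error(y, y_hat) + eps)
```

Integer-valued parameters such as the delays can use the same helpers by choosing integer bounds and rounding the decoded value.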
At each generation, the $F$ best chromosomes, i.e., those with the maximum fitness, are selected from the gene pool for mating. Following the idea of [52], $\rho F$ parents are kept in the next generation along with the new chromosomes to ensure that the error in the next generation is always less than or equal to that of the current generation. After selecting $\rho F$ chromosomes from the parents, $N_g - \rho F$ children are produced from mating to keep the size of the gene pool constant. After the offspring are generated, $s$ genes are randomly selected in the parameter space and small perturbations are added to them individually to mimic mutation.

Figure 3: Parameter estimations and goodness of model estimation: (a) RMSE, correlation and $\chi^2$ distances, $d_{l_2}$, $d_c$ and $d_{\chi^2}$, for all 40 countries considered in this work in terms of goodness of agreement with model estimations, shown in percentage. The colors green, orange and red signify the level of agreement. Values between (0:0.05) for $d_{l_2}$, (0:0.01) for $d_c$ and (0:1) for $d_{\chi^2}$ are considered as good (green). Values between (0.05:0.08) for $d_{l_2}$, (0.01:0.1) for $d_c$ and (1:3) for $d_{\chi^2}$ are considered as moderate (orange). Values above moderate are considered as poor (red). For all three metrics, 65-75% of the countries have shown good agreement with model estimation; (b) and (c) represent boxplots for the best-fit parameters of state transition probabilities and state transitional delays respectively, for all the 20 countries shown in Fig. 2. The height of the boxplots represents the interquartile range (IQR). The dark line inside the box represents the median. The lower and upper whiskers extend to the lowest and highest values within 1.5 IQR of the first and third quartile, respectively.

As shown by several researchers [57], the homogeneity in the gene pool increases with the generations, and as the perturbations due to mutation are typically small, the reduction of error becomes a problem after a few generations. Thus, to restrict homogeneity in the gene pool, a small number of offspring $\mu$ are selected from the total $N_g - \rho F$ generated offspring and replaced with randomly generated chromosomes to maintain diversity. This step is called 'diversification' of the gene pool.

In our problem, the parameters $\Theta$ of the PCA model $G(\Theta)$ are the state transitional probabilities $p_{ei}$, $p_{iq}$, $p_{ir}$, $p_{qr}$, the infection probabilities $p_e$ and $p_i$, the state transition delays $\tau_{ei}$, $\tau_{iq}$, $\tau_{qr}$, $\tau_{ir}$, the neighbourhood $d$, and the death probability $p_\beta$, as mentioned in Sec. 2.2. As optimizing this many parameters simultaneously might be challenging and require a huge amount of resources, we propose a variant of GA with a sequential evolution mechanism where, instead of optimizing the solutions simultaneously, the parameters are optimized sequentially. Let us define a set of generations as an era. For the first era, containing a small number of generations, the traditional GA methodology discussed thus far is followed to obtain a set of initial parameters. From the next era onward, two parameters are fixed and optimized sequentially in that era.
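One generation of the conventional GA step described above (selection of the $F$ fittest parents, retention of $\rho F$ of them, crossover, mutation and diversification) might be sketched as follows. For brevity the chromosomes are kept as lists of real-valued parameters rather than binary strings, the retained parents are taken to be the best $\rho F$, and the mutation is a clipped Gaussian perturbation with relative width `sigma`; these are assumptions made for the sketch, not details from the paper.

```python
import random


def next_generation(pool, errors, F, rho, mu, s, bounds, rng, sigma=0.05):
    """Return a new gene pool of the same size as `pool`.

    pool   : list of chromosomes, each a list of h parameter values
    errors : l1 error of each chromosome (fitness is its inverse)
    bounds : list of (lo, hi) per gene; requires h >= 2 and F >= 2
    """
    h, n_g = len(bounds), len(pool)
    # Highest fitness means lowest error: sort ascending by error.
    ranked = [c for _, c in sorted(zip(errors, pool), key=lambda p: p[0])]
    parents = ranked[:F]
    n_keep = int(rho * F)
    survivors = [list(c) for c in parents[:n_keep]]  # parents carried over
    children = []
    while len(children) < n_g - n_keep:
        a, b = rng.sample(parents, 2)
        cut = rng.randint(1, h - 1)  # splice point on a gene boundary
        children.append(a[:cut] + b[cut:])
        children.append(b[:cut] + a[cut:])
    children = children[:n_g - n_keep]
    # Mutation: small perturbations on s randomly chosen genes.
    for _ in range(s):
        c, g = rng.randrange(len(children)), rng.randrange(h)
        lo, hi = bounds[g]
        value = children[c][g] + rng.gauss(0.0, sigma * (hi - lo))
        children[c][g] = min(hi, max(lo, value))
    # Diversification: replace mu offspring with random chromosomes.
    for c in rng.sample(range(len(children)), mu):
        children[c] = [rng.uniform(lo, hi) for lo, hi in bounds]
    return survivors + children
```

Because the retained parents are copied unchanged, the best error in the pool cannot increase from one generation to the next, which is the property the text attributes to keeping $\rho F$ parents.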
Mutation and crossover are restricted to those two respective genes, whereas parent selection is done based on the performance of the entire chromosomes. This newly proposed sequential optimization of the parameters of PCA using GA is defined as PCA-GA. The proposed approach can optimize a large number of parameters efficiently using limited resources. All the notations used in PCA-GA are briefly summarized in Table 2.

Figure 4: Time series data for active cases (blue) of the COVID-19 pandemic in different countries where the cases are saturating, and estimated active cases (red) from the proposed PCA-GA method.

The proposed PCA-GA has a complexity which can be approximated as $O(N_g T_g O(f))$, where $N_g$ is the size of the gene pool, $T_g$ is the total number of generations and $O(f)$ is the complexity of measuring the fitness in the GA. For a large enough $N_g$, $T_g$ is considered as a comparatively smaller constant, and thus the complexity of the entire algorithm is mainly governed by $N_g$ and $O(f)$. The complexity of estimating the fitness can be approximated as $O(f) = O(T + 8N\tau T)$ for the Moore neighbourhood criterion, where $N$ is the total population on the 2D grid. The length of the original time series data $T$, and $\tau$, the maximum of $\tau_{ij}$, are both constant, and thus $O(f)$ can be represented as $O(N)$.

Though GA has been selected as the strategy to optimize the parameters of the proposed PCA model, it is evident that, because of the generalized construction of the proposed framework, other meta-heuristic methods could also be employed to search the parameters of the spatially driven SEIQR model, which is the main focus of this work. However, the presence of mutation and diversification in GA helps to search for better solutions, as the search space is extremely large.
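The era-wise restriction of Section 2.3 can be pictured as a schedule that says which genes are free to change in each era. The sketch below assumes that the parameters are grouped into consecutive pairs and visited cyclically, and that a restricted crossover simply exchanges the active genes between two parents; the paper does not specify the pairing, the order or the exact operator, so these are illustrative choices with hypothetical names.

```python
from itertools import cycle


def era_schedule(n_params, n_eras, group=2):
    """Yield, for each era, the indices of the genes allowed to change.

    Era 0 is a conventional GA era in which every gene is free; each later
    era frees one group of `group` consecutive genes, cycling if needed.
    """
    yield list(range(n_params))
    groups = [list(range(k, min(k + group, n_params)))
              for k in range(0, n_params, group)]
    for _, active in zip(range(n_eras - 1), cycle(groups)):
        yield active


def restricted_crossover(a, b, active):
    """Exchange only the active genes between two parent chromosomes."""
    c1, c2 = list(a), list(b)
    for g in active:
        c1[g], c2[g] = b[g], a[g]
    return c1, c2


# With the 12 PCA parameters listed in the text and 4 eras:
schedule = list(era_schedule(12, 4))
```

Parent selection would still rank whole chromosomes by their error, as the text requires; only the variation operators consult the active gene list.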

3. Results

To validate the effectiveness of the proposed PCA-GA framework, the actual statistics of COVID-19 spread until 20th June, 2020 in different countries are used. For finalizing the data-set from the available data of 213 countries, several aspects have been considered. At first, 102 countries were dropped owing to a low number of reported cases (fewer than 1000 reported cases until 20th June 2020). Out of the remaining countries, some, like Iran, Greece and Paraguay, were removed because of data inconsistency, and finally 40 countries were randomly selected while ensuring the following points:

• At least 2 countries from each continent are selected to maintain demographic diversity in our data.

• Care has been taken to maintain significant variation in population density, which we believe to be a major factor contributing to disease transmission.

• It was ensured that countries from three distinct stages of COVID-19 infection are considered: (i) where the infection has significantly diminished, (ii) where the peak infection has been reached but substantial infection still persists, and (iii) where consistent growth in infection is occurring.

With this widely varying spectrum of time series data, we proceed to quantitative calibration and interpretation through the proposed methodology. All data samples are taken from the website worldometers.info.

To point out the major contributing factors in the dynamics of infection spread, three available time series, namely daily active cases, total number of infected cases and total number of deaths, are collected for every country under consideration. Out of these three series, the daily active cases time series is used for model formulation, and the rest are used for model validation.
It is important to mention that the population $q_t$ is the relevant observable here, as the infected and exposed people, $i_t$ and $e_t$, remain latent and undetected in the population. The reported daily active case data are associated with the lifetime of the infection, and are used in this study to check the effectiveness of the proposed framework as follows. By applying PCA-GA to the daily active case data of a particular country, the parameters $\Theta^*$ that give the minimum $l_1$ error are extracted. For validation of the optimized parameters and to understand the robustness of the algorithm, results generated by using $G(\Theta^*)$ for the total infected states and deceased states are then compared with the real-world data. Here it must be noted that the optimal parameters $\Theta^*$ remain unaltered and no further optimization is performed.

For all the simulations, PCA is initialized with a fixed lattice size of 100 × 100, with $n_e = 50$ and $n_i = 4$. The populations $n_q$ and $n_r$ are set to zero at $t = 0$. The susceptible population $n_s$ has been initialized depending on the population density of a country as follows: among the countries considered in our study, for the country with the lowest population density (Canada), $n_s = 2500$ has been selected, and for the country with the highest population density (Singapore), $n_s = 6000$ has been fixed. For any other country, $n_s$ has been assigned within this range using logarithmic scaling based on the population of that country. As each of the parameters of PCA-GA has physical relevance, the sequential search process has been initiated under the following range restrictions. It is important to note that, in our problem, genes associated with probabilities are initialized in the range $[0, 1]$ and clipped accordingly during the optimization process. The state transition delays $\tau_{ei}$ (incubation period) and $\tau_{iq}$ (testing delay) are considered to be within the range (0, …), and $\tau_{ir}$ and $\tau_{qr}$ (the corresponding recovery periods) are initialized in the range (20, …).

Figure 5: Time series data for active cases (blue) of the COVID-19 pandemic in different countries where the cases are increasing exponentially, and estimated active cases (red) from the proposed PCA-GA method.

The daily active cases can be defined as $c_t = c_{t-1} + q_t - r_t$, where $c_t$ is the number of active cases at time instance $t$, with the initialization $c_0 = 0$. In Fig. 2, the active cases of 20 different countries are shown along with the respective active cases estimated using the PCA-GA model. For the countries shown in Fig. 2, the first peak of the infection has already been crossed and a steady fall in the infection spread is observed. It can also be seen that the active cases of some countries, like China, Israel and Switzerland, follow smooth bell-shaped curves, whereas for other countries, like Australia, Cyprus and Hungary, the time series data deviate from bell-shaped curves with a substantial degree of noise. In all the cases, PCA-GA has successfully captured the trend of the time series data while estimating the parameters of the epidemiological process. To measure the goodness of the model estimation, three different metrics have been used. The root mean square error (RMSE) distance, correlation distance and chi-square distance [58, 59, 60], denoted as $d_{l_2}$, $d_c$ and $d_{\chi^2}$ respectively, are computed between the real data and the estimated values from the PCA-GA model to evaluate the effectiveness of the optimized model. For two vectors $u$ and $v$, we define

$$d_{l_2} = \sqrt{\frac{1}{T} \sum_{i=1}^{T} (u_i - v_i)^2}, \qquad d_c = 1 - \frac{(u - \bar{u}) \cdot (v - \bar{v})}{\| u - \bar{u} \|_2 \, \| v - \bar{v} \|_2}, \qquad d_{\chi^2} = \sum_{i=1}^{T} \frac{(u_i - v_i)^2}{v_i}$$

where $T$ is the length of each vector, $u_i$ and $v_i$ are the $i$th elements of $u$ and $v$ respectively, and $(\cdot)$ denotes the dot product of two vectors. As shown in Fig. 3(a), the proposed model performs well in modelling the real data.
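For concreteness, the active-case recurrence and the three distance measures can be written out in plain Python as below. This is a sketch under the definitions given in the text; the function names are hypothetical, and skipping entries with $v_i = 0$ in the chi-square distance is an assumption added here to avoid division by zero.

```python
import math


def active_cases(q, r):
    """c_t = c_{t-1} + q_t - r_t with c_0 = 0; q and r start at t = 1."""
    c, out = 0.0, []
    for q_t, r_t in zip(q, r):
        c += q_t - r_t
        out.append(c)
    return out


def rmse_distance(u, v):
    """d_l2: root mean square difference between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))


def correlation_distance(u, v):
    """d_c: one minus the Pearson correlation of u and v."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return 1.0 - num / den


def chi_square_distance(u, v):
    """d_chi2: sum of (u_i - v_i)^2 / v_i, skipping entries where v_i == 0."""
    return sum((a - b) ** 2 / b for a, b in zip(u, v) if b != 0)
```

As the text notes, the distances are meant to be evaluated on normalized series, so both inputs would typically be rescaled before calling these functions.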
When evaluated over all the countries considered in this work, the proposed model fits the data well; depending on the evaluation metric, the fit was poor for only 0% to 12.5% of the cases. It is important to mention that all the distance measures are evaluated on normalized data.

In Fig. 2, an interesting point to notice is that the peaks of the active cases are located at markedly differing time instances, and the other properties of the observed distributions, like variance and skewness, also vary drastically. The fundamental differences between the fitted curves are quantified with the help of boxplots of the parameters in Fig. 3(b)-(c) by analysing basic statistical properties. The reported boxplots are specifically for the countries selected in Fig. 2. It can be noted that $p_e$, $p_i$ and $p_{ei}$ exhibit a wide variability in Fig. 3(b). During our analysis, a strong positive correlation of $p_e$ and $p_i$ with population density was also observed. It can thus be inferred that the variation in population density across the considered countries causes the wide range of these parameters, and that a high population density increases the probability of transmission of the disease. The considerable difference in the mean magnitudes of the infection-associated probabilities ($p_e$, $p_i$ and $p_{ei}$) and the recovery-related probabilities ($p_{iq}$, $p_{ir}$ and $p_{qr}$) indicates the sharper rise and slower fall of the active case curves, which results in a skewed distribution in most of the cases (see Fig. 2). In Fig. 3(c), it is also shown that $\tau_{ei}$, which is identified as the incubation time in the model, exhibits a range of 3-14 days with a mean of 7.3, which is consistent with the cases observed around the world [61].
In this figure, a wide variability in the range of $\tau_{ir}$ and $\tau_{qr}$ is observed, which points to the substantial difference in the health infrastructure of these countries.

Here it must be mentioned that, while performing this statistical analysis with all 40 countries, some countries were detected as consistent outliers (not included in Fig. 3(b)-(c)) in terms of four transitional parameters: $p_{ir}$, $p_{qr}$, $\tau_{ir}$ and $\tau_{qr}$. While analyzing the active case distributions of these outliers, it was found that the time series data for all these countries have a saturating trend, where the daily active cases do not show an average descent with time. Some such cases are shown in Fig. 4. Even for these data, which have a drastically different qualitative trend compared to the countries shown in Fig. 2, the proposed PCA-GA framework has accurately captured the trend of the real time series data.

There are also certain countries, like India, Brazil, Chile and Mexico, for which the infection spreading started later than in countries like China or Italy, and the daily active cases are still growing almost exponentially. As shown in Fig. 5, PCA-GA is able to estimate the time series data for these countries where the infection is spreading rapidly. The dynamics of COVID-19 spread in these countries are of particular interest, as the prediction of the peak positions in these countries might help immensely to understand the maximum socioeconomic impact of the disease at a time in that geographical location.

Figure 6: Total infected cases (blue) of COVID-19 pandemic in different countries, and estimated total cases (red) from proposed PCA-GA method.

While analyzing complex dynamics like the spread of a pandemic, it is not always sufficient to model the input real data only. The optimized model should be robust and should provide meaningful interpretations without further retraining or parameter tuning for real-world applications.
To validate the robustness and the effectiveness of the proposed algorithm, the optimized model is now employed for three different tasks. First, the robustness of the optimized model is checked by estimating the total number of infected cases, followed by the total number of death cases, without any further training, tuning or supervision. Finally, to further validate the efficiency of the model, its performance is evaluated on the prediction task by training the model with partitioned data and evaluating its future predictions without any further optimization.

Figure 7: Total deaths (blue) of COVID-19 pandemic in different countries, and estimated total deaths (red) from proposed PCA-GA method.

The total number of infected cases $z_t$ at time instance $t$ can be defined as $z_t = \sum_{i=0}^{t} q_i$. This cumulative sum indicates the total number of people who have suffered from the disease at any point of time. For a country where the first wave of the infection has passed, e.g., Croatia or Italy, $z_t$ approximately follows a sigmoid function, whereas for countries like India and Mexico, where the infection has not reached its peak, $z_t$ follows an exponential function. As PCA-GA is optimized using the time series information of daily active cases $c_t$, $z_t$ is used to validate the parameters learnt by the sequential GA framework in the following way. Once a particular country is selected, $\Theta^*$ is estimated using PCA-GA with the actual $c_t$. Next, $\hat{z}_t$ for $G(\Theta^*)$ is calculated without any further fine-tuning of the parameters, and $\hat{z}_t$ is compared with the actual $z_t$. In Fig. 6, the total cases (blue) of six such countries are shown along with the best-fit results obtained from PCA-GA (red), which depict an excellent agreement with the data. It must be mentioned that for all three dynamical stages of infection spreading discussed in Sec. 3.2, i.e., where the first wave of infection has passed, where the active cases are currently almost saturated, or where the active cases are increasing rapidly, our estimated $\hat{z}_t$ closely matches $z_t$ without any further parameter optimization. When evaluated over all 40 countries for the number of infected people, the proposed method gives average $d_l$, average $d_c$ and average $d_\chi$ as 0.037, 0.006 and 0.53 respectively, which exhibits the robustness of the model.

To further validate the 'goodness' of the estimated parameters, the parameter set $\Theta^*$ optimized over the daily active cases of a particular country is taken, and the identical parameter values are used to compare the estimated total deaths with the actual total deaths of that country. Death in the population is the prime concern in the case of the COVID-19 pandemic, and as mentioned in Sec. 2.2.1, the daily deceased population is a fraction of $r_t$ in our model. So, the total estimated death cases can be defined as $\hat{d}_t = (1 - p_\beta)\sum_{i=0}^{t} r_i$, where $p_\beta$ and $r_i$ for $0 \le i \le t$ are given by $\Theta^*$ and $G(\Theta^*)$ respectively. Fig. 7 compares the actual total death cases $d_t$ with the estimated total death cases $\hat{d}_t$ for $\Theta^*$, the identical set of parameters used previously for estimating the active cases as well as the total cases. The same countries shown in Fig. 6 have been selected to show the robustness of the parameters $\Theta^*$ estimated using the proposed technique. Excellent agreement with data has been found for this case as well; when evaluated over all 40 countries for the total number of death cases, the proposed method gives average $d_l$, average $d_c$ and average $d_\chi$ as 0.041, 0.006 and 0.48 respectively.

Figure 8: Prediction of daily active cases from truncated data. For Israel and Switzerland, real data up to 54 and 43 days has been used to predict the daily active cases for 100 days. For prediction, the average of 50 independent PCA-GA simulations is considered.

Prediction of future events is always challenging in data modeling [62]. For the final stage of validation of the methodology, the predictive power of the model has been tested. As the impact of this pandemic is far-reaching and varies with the socioeconomic context, a reasonably accurate prediction of the dynamics of the infection spread can be crucial and useful in many ways. As PCA-GA successfully estimates the optimal parameter set $\Theta^*$, this set of parameters can also be utilised to predict the future course of the infection in that country. To validate the capacity of the prediction strategy, the daily active cases of a country $c_t$ are truncated to $c_P$, keeping the first $P$ values.
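The cumulative quantities used in the two validation steps, and the truncation used for the prediction test, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, `q` and `r` stand for the simulated daily series $q_t$ and $r_t$ produced by $G(\Theta^*)$, and `p_beta` is the value of $p_\beta$ taken from $\Theta^*$.

```python
from itertools import accumulate
from typing import List, Sequence


def total_infected(q: Sequence[float]) -> List[float]:
    """z_t = sum_{i=0}^{t} q_i, the running total of detected cases."""
    return list(accumulate(q))


def total_deaths(r: Sequence[float], p_beta: float) -> List[float]:
    """d_hat_t = (1 - p_beta) * sum_{i=0}^{t} r_i."""
    return [(1.0 - p_beta) * s for s in accumulate(r)]


def truncate(c: Sequence[float], P: int) -> List[float]:
    """c_P: the first P values of the daily active case series."""
    return list(c[:P])
```

For example, `total_infected([0, 2, 3, 1])` gives `[0, 2, 5, 6]`, and `truncate(c, 54)` would keep the first 54 days of a series `c`, as in the Israel example of Fig. 8.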
PCA-GA is applied on $c_P$ to estimate the parameters $\Theta_P$. Then $\Theta_P$ is used to predict the daily active cases $\hat{c}_t$. As shown in Fig. 8, for two countries, Israel and Switzerland, the daily active case information up to 54 and 43 days respectively is considered for an attempt to predict the daily active cases up to 100 days. In the figure, the estimated curve (shown in red) is optimized using all the real data points available, whereas the predicted curve (shown in black) is optimized using the truncated real data. It can be observed that the predictive estimation closely follows the real active case data, even though only ∼50% of the data points are used for parameter estimation. For Israel and Switzerland, the 100-day prediction of the algorithm produces $(d_l, d_c, d_\chi)$ as (0 . , . , . 95) and (0 . , . , . […] $(d_l, d_c, d_\chi)$ as (0 . , . , . 82) and (0 . , . , . 6) respectively for the truncated time series of Switzerland. SVM regression with RBF kernel performs satisfactorily on the same truncated data and produces $(d_l, d_c, d_\chi)$ as (0 . , . , . […] $(d_l, d_c, d_\chi)$ as (0 . , . , .

As the PCA-GA methodology has been elaborately validated in Section 3.3, in this section it is employed for the prediction of consistently rising real epidemic data. Though the parameter estimation works well even when only minimal information about the peak position in $c_t$ is available, the prediction task becomes really challenging when $c_t$ is exponential in nature. For a particular country where $c_t$ is rising almost exponentially, the best set of parameters $\Theta^*$ is first detected by PCA-GA, with fitness $f^*$ and error $e^*$. As the drop of the infection depends heavily on the transitional probabilities $p_{ir}$, $p_{qr}$ and the state transition delays $\tau_{ir}$ and $\tau_{qr}$, these parameters are tuned to find a region of predictions bounded by the possible best-case and worst-case scenarios. While estimating the best-case scenario, $p_{ir}$ and $p_{qr}$ are chosen equal to the maximum and minimum $p_{ir}$ and $p_{qr}$ observed in the continent to which the country belongs. The reason behind this strategy is that the parameters related to the infection spreading are different in each continent, which is also observed in [64]. In the best-case scenario, the transitional delays $\tau^*_{ir}$ and $\tau^*_{qr}$ are reduced to obtain the best-case transitional delays $\tau^{\ominus}_{ir}$ and $\tau^{\ominus}_{qr}$ respectively, such that the fitness remains within 90% of $f^*$, where $\tau^*_{ir}$ and $\tau^*_{qr}$ are the corresponding optimized delays available in $\Theta^*$. For the worst-case scenario, we consider $\tau^{\oplus}_{ir} = \tau^*_{ir} + \alpha_{ir}$ and $\tau^{\oplus}_{qr} = \tau^*_{qr} + \alpha_{qr}$, where $\alpha_{ir} = \tau^*_{ir} - \tau^{\ominus}_{ir}$ and $\alpha_{qr} = \tau^*_{qr} - \tau^{\ominus}_{qr}$.

Fig. 9 depicts the prediction of the daily active cases using the method discussed so far. In Fig.
9, the black dotted line indicates the prediction using the optimal parameters $\Theta^*$ estimated using PCA-GA. The orange line indicates the best-case scenario, where the maximum daily active cases would be minimized given the real data. The red line indicates the worst-case scenario based on the specific conditions mentioned above. The best-case and worst-case scenarios act as limiting cases of an area (shaded in pink) of probable future states. Any curve inside the pink region that contains the real data could be the future evolution of the daily active cases, given the real time series data that is currently in an exponentially rising state. This indicates that for India, which is now one of the biggest epicenters of COVID-19 in South-eastern Asia, the disease can start to decline very soon if vigorous measures from the government and complete support from the public can be achieved. It also shows that the maximum number of active cases on a day, which puts a direct burden on the health infrastructure of the country, can be restricted below 750,000 if people participate in government-indicated mitigation strategies and the recovery rate remains at its current value. In that case, the peak of the disease is expected to pass during mid-September to mid-October, and the first wave of the disease can be over by March 2021. But these predictions also imply that the range of future states that are possible for exponentially rising daily active cases not only depends on the evolution of the epidemic so far, but is also highly affected by the consistency and implementation efficiency of mitigation strategies.

Figure 9: Prediction of the course of the disease: Exponentially rising daily active cases for India (blue) till 20th July, 2020 are used for parameters estimation and the predictions.
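The construction of the best-case and worst-case delays described above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the function name, the integer step of one day, the lower bound `tau_min`, and the reading of "within 90% of $f^*$" as `fitness >= 0.9 * f_star` are all assumptions, and `fitness` is a hypothetical callable standing in for a PCA-GA fitness evaluation with the delay changed and every other parameter of $\Theta^*$ held fixed.

```python
from typing import Callable, Tuple


def scenario_delays(
    tau_star: int,
    fitness: Callable[[int], float],
    f_star: float,
    keep: float = 0.9,
    tau_min: int = 1,
) -> Tuple[int, int]:
    """Return (best-case delay, worst-case delay) around the optimized delay tau_star."""
    tau_best = tau_star
    # Reduce the delay one day at a time while the fitness stays within keep * f_star.
    while tau_best - 1 >= tau_min and fitness(tau_best - 1) >= keep * f_star:
        tau_best -= 1
    # alpha = tau* - tau_best; the worst case mirrors it on the other side of tau*.
    alpha = tau_star - tau_best
    return tau_best, tau_star + alpha


# Toy fitness that peaks at a delay of 20 days (purely illustrative).
def toy_fitness(tau: int) -> float:
    return 100.0 - 3.0 * abs(tau - 20)


best, worst = scenario_delays(tau_star=20, fitness=toy_fitness, f_star=100.0)
```

With this toy fitness the delay can be reduced to 17 days before the fitness falls below 90% of its optimum, so the best-case and worst-case delays are 17 and 23 days. The same routine would be applied separately to $\tau_{ir}$ and $\tau_{qr}$.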

4. Conclusion

The COVID-19 outbreak has created a massive impact all across the globe. Even after nation-wide lockdowns, extensive testing strategies and medical support, the spread of the virus has overwhelmed several countries. Thus, it is becoming more and more important to understand the nature of the infection spread and the key parameters that control it. In this work, we proposed a probabilistic cellular automata model to understand and depict COVID-19 spread using an appropriate choice of loss functions and an evolutionary optimization framework. The parameters of this cellular automata model are optimised using a sequential evolutionary genetic algorithm. It has been shown that this self-adapting methodology can be highly flexible and has the power to accurately estimate time trajectories of epidemics. This model works with physically interpretable parameters, which are accessible for analysis, data collection and further experiment, and can be readily identified with ground reality. This model has been successfully employed for optimizing all these parameters simultaneously for the daily active cases, total infected cases and total deaths with extreme robustness. The performance of the model has been exhibited for a large number of countries with huge diversity in population density, continents and available healthcare infrastructures. The predictive strength of the model has also been validated extensively, and demonstrated by estimating the course of the pandemic for the countries where the infection peak has not been reached yet.

It is important to mention that the motivation of the work was to develop a data-driven, generalized, spatial framework that can be used to estimate relevant epidemiological parameters. This methodology is so powerful and flexible that physical interpretations of the results obtained from these analyses can have wide-ranging implications.
Once the data is properly interpreted with the proposed methodology, interesting realistic features can be identified for specific countries. For example, in a pandemic situation, easily relatable factors like population clusters, variable population density and variable health facilities at different places of a country can be studied to understand and predict the emergence of new hotspots, which can be used to design selective area containment strategies. While we propose and establish the applicability and strength of this framework in this work, we wish to address these application perspectives in our upcoming research studies.

With this proposed platform, the impact of individuality on the contagion process can be explicitly studied, which might be directly related to questions like lockdown behavioral differences, influence of rumors, vaccination opinion differences etc. As the effects of more complex dynamical factors like periodic lockdown or population clusters are not considered in the present model, its prediction capability is not satisfactory, in the present form, for time series data with abrupt discontinuities. The proposed framework could be enhanced with other $l_p$ norm distances and different optimization techniques like the multi-objective genetic algorithm or the strength Pareto evolutionary algorithm. Other swarm-based optimization techniques can also be explored for further refinement of the model. The potential of the proposed approach can be utilized to better understand the spreading and control of disease, beyond the pandemic the world is currently facing, by keeping track of the spatial information of the dynamics, incorporating realistic behavioural aspects, and optimizing in terms of demographic as well as socioeconomic features.

References

[1] World Health Organization Coronavirus disease (COVID-2019) situation reports, Available at url: (accessed June 2020), 2020.
[2] X. Jin, J.-S. Lian, J.-H. Hu, J. Gao, L. Zheng, Y.-M. Zhang, S.-R.
Hao, H.-Y. Jia, H. Cai, X.-L. Zhang, et al., Epidemiological, clinical and virological characteristics of 74 cases of coronavirus-infected disease 2019 (COVID-19) with gastrointestinal symptoms, Gut 69 (2020) 1002–1009.
[3] L. Pan, M. Mu, P. Yang, Y. Sun, R. Wang, J. Yan, P. Li, B. Hu, J. Wang, C. Hu, et al., Clinical characteristics of COVID-19 patients with digestive symptoms in Hubei, China: a descriptive, cross-sectional, multicenter study, The American journal of gastroenterology 115 (2020).
[4] Y. Cheng, R. Luo, K. Wang, M. Zhang, Z. Wang, L. Dong, J. Li, Y. Yao, S. Ge, G. Xu, Kidney disease is associated with in-hospital death of patients with COVID-19, Kidney International (2020).
[5] C. Han, C. Duan, S. Zhang, B. Spiegel, H. Shi, W. Wang, L. Zhang, R. Lin, J. Liu, Z. Ding, et al., Digestive symptoms in COVID-19 patients with mild disease severity: clinical presentation, stool viral RNA testing, and outcomes, The American journal of gastroenterology (2020).
[6] Y.-Y. Zheng, Y.-T. Ma, J.-Y. Zhang, X. Xie, COVID-19 and the cardiovascular system, Nature Reviews Cardiology 17 (2020) 259–260.
[7] C. Wang, R. Pan, X. Wan, Y. Tan, L. Xu, C. S. Ho, R. C. Ho, Immediate psychological responses and associated factors during the initial stage of the 2019 coronavirus disease (COVID-19) epidemic among the general population in China, International journal of environmental research and public health 17 (2020) 1729.
[8] Y. Wang, Y. Wang, Y. Chen, Q. Qin, Unique epidemiological and clinical features of the emerging 2019 novel coronavirus pneumonia (COVID-19) implicate special control measures, Journal of medical virology 92 (2020) 568–576.
[9] N. van Doremalen, T. Bushmaker, D. H. Morris, M. G. Holbrook, A. Gamble, B. N. Williamson, A. Tamin, J. L. Harcourt, N. J. Thornburg, S. I. Gerber, et al., Aerosol and surface stability of SARS-CoV-2 as compared with SARS-CoV-1, New England Journal of Medicine 382 (2020) 1564–1567.
[10] Y. Bai, L. Yao, T. Wei, F. Tian, D.-Y. Jin, L. Chen, M.
Wang, Presumed asymptomatic carrier transmission of COVID-19, JAMA 323 (2020) 1406–1407.
[11] H. Nishiura, T. Kobayashi, T. Miyama, A. Suzuki, S. Jung, K. Hayashi, R. Kinoshita, Y. Yang, B. Yuan, A. R. Akhmetzhanov, et al., Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19), medRxiv (2020).
[12] P. Yu, J. Zhu, Z. Zhang, Y. Han, A familial cluster of infection associated with the 2019 novel coronavirus indicating possible person-to-person transmission during the incubation period, The Journal of infectious diseases 221 (2020) 1757–1761.
[13] G. Giordano, F. Blanchini, R. Bruno, P. Colaneri, A. Di Filippo, A. Di Matteo, M. Colaneri, Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy, Nature Medicine (2020) 1–6.
[14] Z. Yang, Z. Zeng, K. Wang, S.-S. Wong, W. Liang, M. Zanin, P. Liu, X. Cao, Z. Gao, Z. Mai, et al., Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions, Journal of Thoracic Disease 12 (2020) 165.
[15] V. Volpert, M. Banerjee, S. Petrovskii, On a quarantine model of coronavirus infection and data analysis, Mathematical Modelling of Natural Phenomena 15 (2020) 24.
[16] C. Li, L. J. Chen, X. Chen, M. Zhang, C. P. Pang, H. Chen, Retrospective analysis of the possibility of predicting the COVID-19 outbreak from internet searches and social media data, China, 2020, Eurosurveillance 25 (2020) 2000199.
[17] L. Li, Z. Yang, Z. Dang, C. Meng, J. Huang, H. Meng, D. Wang, G. Chen, J. Zhang, H. Peng, et al., Propagation analysis and prediction of the COVID-19, Infectious Disease Modelling 5 (2020) 282–292.
[18] S. J. Fong, G. Li, N. Dey, R. G. Crespo, E. Herrera-Viedma, Composite Monte Carlo decision making under high uncertainty of novel coronavirus epidemic using hybridized deep learning and fuzzy rule induction, Applied Soft Computing (2020) 106282.
[19] K. Chatterjee, K. Chatterjee, A. Kumar, S.
Shankar, Healthcare impact of COVID-19 epidemic in India: A stochastic mathematical model, Medical Journal Armed Forces India (2020).
[20] S. J. Fong, G. Li, N. Dey, R. G. Crespo, E. Herrera-Viedma, Finding an accurate early forecasting model from small dataset: A case of 2019-nCoV novel coronavirus outbreak, arXiv preprint arXiv:2003.10776 (2020).
[21] G. Baltas, F. A. Prieto Rodríguez, M. Frantzi, C. García Alonso, P. Rodríguez Cortés, et al., Monte Carlo deep neural network model for spread and peak prediction of COVID-19 (2020).
[22] D. Khatua, A. De, S. Kar, E. Samanta, A. A. Seikh, D. Guha, A fuzzy dynamic optimal model for COVID-19 epidemic in India based on granular differentiability, Available at SSRN 3621640 (2020).
[23] P. Liu, P. Beeler, R. K. Chakrabarty, COVID-19 progression timeline and effectiveness of response-to-spread interventions across the United States, medRxiv (2020).
[24] M. C. Traini, C. Caponi, G. V. De Socio, Modelling the epidemic 2019-nCoV event in Italy: a preliminary note, medRxiv (2020).
[25] S. Lai, I. I. Bogoch, N. W. Ruktanonchai, A. Watts, X. Lu, W. Yang, H. Yu, K. Khan, A. J. Tatem, Assessing spread risk of Wuhan novel coronavirus within and beyond China, January-April 2020: a travel network-based modelling study, medRxiv (2020).
[26] L. Wynants, B. Van Calster, M. M. Bonten, G. S. Collins, T. P. Debray, M. De Vos, M. C. Haller, G. Heinze, K. G. Moons, R. D. Riley, et al., Prediction models for diagnosis and prognosis of COVID-19 infection: systematic review and critical appraisal, BMJ 369 (2020).
[27] C. T. Bauch, J. O. Lloyd-Smith, M. P. Coffee, A. P. Galvani, Dynamically modeling SARS and other newly emerging respiratory illnesses: past, present, and future, Epidemiology (2005) 791–801.
[28] G. R. Shinde, A. B. Kalamkar, P. N. Mahalle, N. Dey, J. Chaki, A. E. Hassanien, Forecasting models for coronavirus disease (COVID-19): A survey of the state-of-the-art, SN Computer Science 1 (2020) 1–15.
[29] B. M. Althouse, J. Lessler, A. A. Sall, M. Diallo, K.
A. Hanley, D. M. Watts, S. C. Weaver, D. A. Cummings, Synchrony of sylvatic dengue isolations: a multi-host, multi-vector SIR model of dengue virus transmission in Senegal, PLoS Negl Trop Dis 6 (2012) e1928.
[30] R. M. Anderson, R. M. May, Infectious diseases of humans: dynamics and control, Oxford University Press, 1992.
[31] H. W. Hethcote, Asymptotic behavior in a deterministic epidemic model, Bulletin of Mathematical Biology 35 (1973) 607–614.
[32] H. Behncke, Optimal control of deterministic epidemics, Optimal control applications and methods 21 (2000) 269–285.
[33] S. Bhattacharya, K. Gaurav, S. Ghosh, Viral marketing on social networks: An epidemiological perspective, Physica A: Statistical Mechanics and its Applications 525 (2019) 478–490.
[34] Y. Liu, A. A. Gayle, A. Wilder-Smith, J. Rocklöv, The reproductive number of COVID-19 is higher compared to SARS coronavirus, Journal of travel medicine (2020).
[35] E. Shim, A. Tariq, W. Choi, Y. Lee, G. Chowell, Transmission potential and severity of COVID-19 in South Korea, International Journal of Infectious Diseases (2020).
[36] A. J. Kucharski, T. W. Russell, C. Diamond, Y. Liu, J. Edmunds, S. Funk, R. M. Eggo, F. Sun, M. Jit, J. D. Munday, et al., Early dynamics of transmission and control of COVID-19: a mathematical modelling study, The Lancet Infectious Diseases (2020).
[37] L. Peng, W. Yang, D. Zhang, C. Zhuge, L. Hong, Epidemic analysis of COVID-19 in China by dynamical modeling, arXiv preprint arXiv:2002.06563 (2020).
[38] W. O. Kermack, A. G. McKendrick, A contribution to the mathematical theory of epidemics, Proceedings of the Royal Society of London. Series A, Containing papers of a mathematical and physical character 115 (1927) 700–721.
[39] A. Rachah, D. F. Torres, Analysis, simulation and optimal control of a SEIR model for Ebola virus with demographic effects, arXiv preprint arXiv:1705.01079 (2017).
[40] T. Berge, J.-S. Lubuma, G. Moremedi, N. Morris, R.
Kondera-Shava, A simple mathematical model for Ebola in Africa, Journal of biological dynamics 11 (2017) 42–74.
[41] T. Toffoli, N. Margolus, Cellular automata machines: a new environment for modeling, MIT Press, 1987.
[42] S. Wolfram, Cellular automata and complexity: collected papers, CRC Press, 2018.
[43] N. Boccara, K. Cheong, M. Oram, A probabilistic automata network epidemic model with births and deaths exhibiting cyclic behaviour, Journal of Physics A: Mathematical and General 27 (1994) 1585.
[44] C. Beauchemin, J. Samuel, J. Tuszynski, A simple cellular automaton model for influenza A viral infections, Journal of theoretical biology 232 (2005) 223–234.
[45] H. Fuks, A. T. Lawniczak, Individual-based lattice model for spatial spread of epidemics, Discrete Dynamics in Nature and Society 6 (2001).
[46] R. Willox, B. Grammaticos, A. Carstea, A. Ramani, Epidemic dynamics: discrete-time and cellular automaton models, Physica A: Statistical Mechanics and its Applications 328 (2003) 13–22.
[47] P. Eosina, T. Djatna, H. Khusun, A cellular automata modeling for visualizing and predicting spreading patterns of dengue fever, Telkomnika 14 (2016) 228.
[48] K. S. Pokkuluri, S. U. D. Nedunuri, A novel cellular automata classifier for COVID-19 prediction, Journal of Health Sciences 10 (2020) 34–38.
[49] M. Dascalu, M. Malita, A. Barbilian, E. Franti, G. M. Stefan, Enhanced cellular automata with autonomous agents for COVID-19 pandemic modeling, Romanian Journal of Information Science and Technology 23 (2020) S15–S27.
[50] S. Ghosh, S. Bhattacharya, Computational model on COVID-19 pandemic using probabilistic cellular automata, arXiv preprint arXiv:2006.11270 (2020).
[51] A. H. Wright, Genetic algorithms for real parameter optimization, in: Foundations of genetic algorithms, volume 1, Elsevier, 1991, pp. 205–218.
[52] L. Yao, W. A. Sethares, Nonlinear parameter estimation via the genetic algorithm, IEEE Transactions on signal processing 42 (1994) 927–935.
[53] S. Katare, A. Bhan, J. M.
Caruthers, W. N. Delgass, V. Venkatasubramanian, A hybrid genetic algorithm for efficient parameter estimation of large kinetic models, Computers & chemical engineering 28 (2004) 2569–2581.
[54] M. Gulsen, A. Smith, D. Tate, A genetic algorithm approach to curve fitting, International Journal of Production Research 33 (1995) 1911–1923.
[55] C. L. Karr, B. Weck, D.-L. Massart, P. Vankeerberghen, Least median squares curve fitting using a genetic algorithm, Engineering Applications of Artificial Intelligence 8 (1995) 177–189.
[56] P. H. Schimit, Evolutionary aspects of spatial prisoner's dilemma in a population modeled by continuous probabilistic cellular automata and genetic algorithm, Applied Mathematics and Computation 290 (2016) 178–188.
[57] J. H. Holland, et al., Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence, MIT Press, 1992.
[58] T. W. Liao, Clustering of time series data - a survey, Pattern recognition 38 (2005) 1857–1874.
[59] J. Gao, H. Sultan, J. Hu, W.-W. Tung, Denoising nonlinear time series by adaptive filtering and wavelet shrinkage: a comparison, IEEE signal processing letters 17 (2009) 237–240.
[60] O. Salem, Y. Liu, A. Mehaoua, Anomaly detection in medical WSNs using enclosing ellipse and chi-square distance, in: 2014 IEEE International Conference on Communications (ICC), IEEE, pp. 3658–3663.
[61] World Health Organization coronavirus disease (COVID-2019) situation reports, Available at url: