Are official confirmed cases and fatalities counts good enough to study the COVID-19 pandemic dynamics? A critical assessment through the case of Italy
Krzysztof Bartoszek, Emanuele Guidotti, Stefano Maria Iacus, Marcin Okrój
Krzysztof Bartoszek ∗, Emanuele Guidotti †, Stefano Maria Iacus ‡, Marcin Okrój §
May 19, 2020
Abstract
As the COVID–19 outbreak is developing, the two most frequently reported statistics seem to be the raw confirmed case and case fatalities counts. Focusing on Italy, one of the hardest hit countries, we look at how these two values could be put in perspective to reflect the dynamics of the virus spread. In particular, we find that merely considering the confirmed case counts would be very misleading. The number of daily tests grows, while the daily fraction of confirmed cases to total tests has a change point. It (depending on region) generally increases with strong fluctuations till (around, depending on region) 15th–22nd March and then decreases linearly after. Combined with the increasing trend of daily performed tests, the raw confirmed case counts are not representative of the situation and are confounded with the sampling effort. This we observe when regressing on time the logged fraction of positive tests and, for comparison, the logged raw confirmed count. Hence, calibrating model parameters for this virus's dynamics should not be done based only on confirmed case counts (without rescaling by the number of tests), but should also take fatalities and hospitalization counts into consideration as variables not prone to be distorted by testing efforts. Furthermore, reporting statistics on the national level does not say much about the dynamics of the disease, which are taking place at the regional level. These findings are based on the official data of total death counts up to 15th April 2020 released by ISTAT and up to 10th May 2020 for the number of cases. In this work we do not fit models but we rather investigate whether this task is possible at all.

This work also informs about a new tool to collect and harmonize official statistics coming from different sources in the form of a package for the R statistical environment and presents the “COVID-19 Data Hub”.
Highlights
• confirmed cases are related to the total tests and time
• confirmed case counts without number of tests would likely misinform epidemiological models
• national level statistics do not say much about the dynamics of the disease
• a new R package and a COVID–19 web hub with harmonized data is made available

Keywords: COVID-19, coronavirus, R language, data

∗ Department of Computer and Information Science, Linköping University, Linköping SE–581 83, Sweden
† Institut d’analyse financière, University of Neuchâtel, Switzerland
‡ European Commission, Joint Research Centre, Via E. Fermi 2749, I–21027 Ispra (VA), Italy
§ Department of Cell Biology and Immunology, Intercollegiate Faculty of Biotechnology, University of Gdańsk and Medical University of Gdańsk, Poland

Contents
R package
Discussion or Should we use these data to calibrate epidemiological models?
Source code and scripts
Acknowledgments

Introduction
In December 2019 the first cases of pneumonia of unknown etiology were reported in Wuhan city, People’s Republic of China. Analyses of samples collected from the patients’ respiratory tracts revealed that a novel coronavirus, later named severe acute respiratory syndrome coronavirus 2 (SARS–CoV–2), is the pathogen responsible for the infection (Huang et al., 2020). The disease, officially called COVID–19 by the World Health Organization (WHO), is characterized by higher transmissibility and infectivity but lower mortality than Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS) caused by other coronaviruses (Wang et al., 2020).

Apart from the source of infection, the spread of the virus depends on the transmission route and general susceptibility of the population. SARS–CoV–2 is believed to be transmitted mostly by close contact (and further carry-over to the mucous surfaces of the body) and inhalation of aerosol produced by an infected person. The presence of the virus was also reported in samples from the gastrointestinal tract (Xiao et al., 2020) but the potential role of the oral–fecal route of infection is unknown. The evidence of asymptomatic carriers who may unintentionally transmit the virus, together with a relatively long incubation period of up to 24 days (Bai et al., 2020), increases the risk of viral spread worldwide and makes prevention measures difficult. On the other hand, separation of identified cases, prior immunity to SARS–CoV–2 or cross-reactivity of human antibodies naturally raised against other viruses would act as a barrier for virus transmission. The latter is probable as RNA sequences of SARS–CoV–2 are 79% identical to the sequences of SARS–CoV, responsible for the previous pandemic in Far East countries in 2002, and 50% identical to MERS–CoV (Lu et al., 2020).
All the above-mentioned issues would act as confounding factors for any modelling of pandemic progression.

Apart from the city of Wuhan, where the first reports of COVID–19 were announced in December 2019, there was another outbreak of the disease, which took place in January–February 2020 on the Diamond Princess cruise ship with more than 3700 people onboard. As such a great number of people were locked in a confined space using common facilities, air-conditioning systems, restaurants etc., and as the chronology of infections, symptoms and undertaken health measures is known (Nakazawa et al., 2020; Rocklöv et al., 2020; Zhang et al., 2020), one can consider this as a unique, naturally occurring epidemiological study useful for prediction of mortality, disease spread and other parameters of the COVID–19 pandemic. Since the virus has spread across the world and new pandemic epicenters like Italy, Spain, Iran, South Korea and USA have emerged, a multitude of new data has appeared. Different countries have applied different strategies of testing people for the coronavirus (mass testing vs. testing of selected patients), different testing methods (serological vs. PCR-based assays) and different ways of counting case fatalities (solely SARS–CoV–2 positive tested cases vs. cases with comorbidities). Therefore, any direct comparison of pandemic dynamics is difficult but still, comparison to a “gold standard”, which the Diamond Princess case could be considered as, may be useful.

Since the outbreak of the disease a multitude of papers modelling the dynamics of the infection have appeared, especially on the arXiv preprint server. They are usually concerned with connecting the pandemic with various epidemiological models (e.g. Kumar and Hembram (2020); Morais (2020); Singer (2020); Zullo (2020) following a brief survey of arXiv at the start of April 2020). However, such models of course require data concerning the infected individuals. Furthermore, the media today bombard us with two basic numbers (for each country): the number of confirmed cases and the number of case fatalities. Given that supposedly the vast majority of people are asymptomatic and testing is not done as random sampling of the population but according to particular protocols, these values by themselves might be misleading. We can only second Wood (2020) in “Despite millions of tests having been performed, there are still no results from statistically well founded sampling based testing programmes to establish basic epidemic quantities such as infection fatality rate and infection rates. In the absence of such direct data, epidemic management has to proceed on the basis of data produced largely as a side effect of the clinical response to the disease.” As a motivating example we present Figure 1, from which we can see that in Italy the case fatality to confirmed ratio is constant while the ratio of confirmed cases to number of tests has been decreasing since around March 22nd. Indeed, the time period since March 22nd is longer than the median time of 19 .

[Figure 1 panels: Confirmed / Tested (left), Deaths / Confirmed (right); x-axis: date, Mar 01 to May 01; legend: Abruzzo, Basilicata, Calabria, Campania, Emilia-Romagna, Friuli Venezia Giulia, Lazio, Liguria, Lombardia, Marche, Molise, P.A. Bolzano, P.A. Trento, Piemonte, Puglia, Sardegna, Sicilia, Toscana, Umbria, Valle d'Aosta, Veneto]

Figure 1:
Cumulative confirmed cases and case fatalities for all the regions of Italy. Right: cumulative case fatalities divided by confirmed cases, left: cumulative confirmed cases divided by the cumulative number of tests.

• With each country having their own reporting standard and testing strategy, are these raw numbers comparable across countries?
• Do these data actually mean what they are being said to mean and are they appropriate for model fitting at all?

Clearly, the curves presented in Figure 1 suggest that a more in-depth look at the raw numbers is required and that there is a need to put the data in a correct perspective before trying to fit any epidemiological model to them, especially because the viral dynamics are starting to be inferred from reported case fatalities (Britton, 2020a; Pugliese and Sottile, 2020; Vattay, 2020).

In this work we approach these issues by looking in detail at the available infection data for individual Italian regions (Section 2) and present the R (R Core Team, 2020) package COVID19 (Section 3) that unifies COVID–19 datasets across different sources in order to simplify the data acquisition process and the subsequent analysis. Section 4 contains a discussion on what other data would be useful (if of course possible to collect for the already overworked public services) in understanding the dynamics of the pandemic. Most regional analyses are contained in the Appendices.
Italy is a country which is being extremely hard-hit by the COVID–19 pandemic. It is currently (as of 13th May 2020) as a whole in lockdown and the medical services are extremely strained. However, due to this situation it also has very detailed epidemiological data that has been made publicly available. Its constantly increasing infected and case fatality count has led us to look in greater detail into this data, especially as it is used for curve-fitting of epidemiological models (e.g. Kumar and Hembram (2020); Morais (2020); Singer (2020); Zullo (2020) following a brief survey of arXiv) and presented in public media.

The first hurdle that one comes across is what the presented counts actually represent. This seems to be region dependent (initially the Veneto region blanket tested a significant part of the population, while Lombardy did not; private communication with Marco Picariello and Paola Aliani). Furthermore, any deceased whose test result is found positive is classified as a COVID–19 case fatality, regardless of any past or underlying diseases, and this methodology has been consistently applied in Italy since the beginning (Picariello and Aliani, 2020). It is important to point out that different countries seem to have different testing strategies and classification systems of deaths, hence raw counts between countries might not be comparable. Given the huge amount of tests performed in Italy (2735628 as of 13th May 2020 (COVID19 package), Guidotti and Ardia, 2020) an important question is: “what fraction of them were serological tests?” as there is no official data on this. A serological test may not distinguish between a person actively infected with the virus and a person that was exposed to the virus in the past. Alternatively, a serological test may not detect a person who is actively infected but still has a low titer of anti-virus antibodies. On the other hand, if the protocol is to test only people exhibiting symptoms and medical personnel, then given that it is hypothesised that the vast majority of cases are asymptomatic, such a raw count might not be representative of the scale of the epidemic.

Given the above uncertainties we set out to see how the Italian regional data could be presented in a standardized manner. Furthermore, we see how the data of each region compares to the Diamond Princess’ data. We focus on the two values that are being presented everywhere: the confirmed case count and the case fatalities count. However, these should be scaled. We scale the confirmed case count by the total number of tests performed. Scaling the case fatalities is more problematic. A common way is to present them as the case fatality ratio but this may be misleading when estimated during an epidemic (Böttcher et al., 2020). Furthermore, assuming that the vast majority of cases are asymptomatic, hence not tested and not inside the case count, we are uncertain what the fatalities would actually be compared to.

Given the lack of hard data, another objective approach would be to compare the daily count of case fatalities to the total deceased count for the day. To the best of our knowledge such statistics are not centrally reported in Italy in real time. Daily deceased counts (from nearly all of the Italian municipalities, see Discussion) are available though for the period 1st January–15th April 2020. Hence, for this time period we are able to plot the weekly “nearly”-desired ratios (see Section 4). We aggregate per week to remove daily fluctuations, which obscure the picture. Furthermore, the same data source provides deceased counts for the years 2015–2019 (for the same time period). This allows us to also visualize the excess mortality (with respect to the per week average from the past five years). Beyond this time interval, it is impossible to provide such curves. However, having daily case fatalities counts and past mortality (this is taken as a constant value equalling the average number of deceased for 15th April) we are able to plot the (per week) ratio of case fatalities to previous average mortality. This provides some indication of the magnitude of excess mortality. However, it is worth noticing that when looking at the current excess mortality it could be appropriate to compare with past mortality peaks (e.g. for UK death toll, the 2014 / / , Figs. 1, 5 and 6 of Thomas, 2020), taking into consideration the causes of death. Here for Italy and its regions, in Figs. 4, 6 and Fig. 28 in Appendix B we compare the current deceased peak with the seasonal start of the January one.

We should remark that perhaps more focus should be on the cumulative positive test fraction instead of the daily positive test fraction. This is because the daily fraction is extremely noisy and furthermore it sometimes happens that this fraction, in the official data source for Italy, exceeds 1. For similar reasons we plot the weekly scaled deaths and cumulative scaled deaths. The daily counts are extremely noisy as well.

We plot the scaled daily and cumulative positive test count and scaled case fatalities next to the cumulative positive tested fraction of passengers on the Diamond Princess. Here we present the graphs from two special regions in Italy: Lombardy and Veneto. The remaining regions are presented in Appendix A. Lombardia is the center of the epidemic, where the cases and deaths counts are the highest. Veneto seems to be a region where the pandemic’s dynamics are special: it was a region that very early on undertook population-wide testing and drastic lockdown measures.

On all of the graphs the curve labels have the following meaning.
1. (DP) Confirmed Scaled: cumulative number of cases on the Diamond Princess divided by 3711, the number of passengers and crew onboard
2. (IT) Confirmed/Tests: cumulative confirmed case to cumulative number of tests ratio for Italy or region

Footnotes: A similar graphical analysis for appeared in The Economist (2020); Wu and McCann (2020); Giles (2020). Private communication with Marco Picariello and Paola Aliani.

[Figure 2 panels: Lombardia and Lombardia (log), shown for the scaled curves and for the confirmed cases and tests; x-axis: date, Mar 01 to May 01]
Figure 2:
Comparison of curves for Lombardy region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Slope of regression with 95% confidence interval: a_f = − . − . , − . . . , . ∼ time is a_raw = − . − . , − . . . , . ∼ time is 0 . . , . . . , . a_f/a_raw = 1 . . − log(cumulative tested) ∼ time is − . − . , − . . . , .
3. (IT) COVID frac cumul deaths: cumulative number of case fatalities to cumulative number of deceased in 2020 ratio for Italy or region
4. (IT) COVID frac cumul deaths wrt past: cumulative number of case fatalities to cumulative number of average from 2015–2019 number of deceased ratio for Italy or region
5. (IT) COVID frac excess deaths: number of case fatalities for given week to difference between deaths in 2020 and average from 2015–2019 number of deceased for given week ratio for Italy or region
6. (IT) COVID frac weekly deaths: number of case fatalities for given week to number of deceased in 2020 for given week ratio for Italy or region
7. (IT) COVID frac weekly deaths wrt past: number of case fatalities for given week to average from 2015–2019 number of deceased for given week ratio for Italy or region

[Figure 3 panels: Veneto and Veneto (log), shown for the scaled curves and for the confirmed cases and tests; x-axis: date, Mar 01 to May 01]
Figure 3:
Comparison of curves for Veneto region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Slope of regression with 95% confidence interval: a_f = − . − . , − . . . , . ∼ time is a_raw = − . − . , − . . . , . ∼ time is 0 . . , . . . , . a_f/a_raw = 1 . . − log(cumulative tested) ∼ time is − . − . , − . . . , .
8. (IT) deaths in 2020 wrt past: number of deceased in 2020 for given week to average from 2015–2019 number of deceased for given week ratio for Italy or region
9. (IT) Confirmed: daily number of confirmed cases for Italy or region
10. (IT) Confirmed - Tests: log((IT) Confirmed) − log((IT) Tests)
11. (IT) Confirmed - Tests cumulative: log(cumulative number of confirmed cases in Italy or region) − log(cumulative number of tests performed in Italy or region)
12. (IT) Tests: daily number of tests performed for Italy or region

We obtain data for the period 24th February–10th May 2020 and we plot the curves from the moment of the first death. From both Figures 2 and 3 (and those present in Appendix A) we can notice a number of facts. Firstly, the daily fraction of infected cases fluctuates very wildly and sometimes can be greater than 1. This can only be due to some changes in protocols or reporting.

[Figure 4 panels: Lombardia TOTAL, Lombardia WOMEN, Lombardia MEN, Veneto TOTAL, Veneto WOMEN, Veneto MEN; y-axis: number of deaths per week]
Figure 4:
Weekly raw death toll comparison in different age groups between 2020 and 2015–2019 for Lombardy and Veneto.
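The weekly scaled death curves defined in the list of curve labels (items 5 to 8) can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the function name, the dictionary keys and all numbers are hypothetical, and returning `None` when 2020 mortality does not exceed the past average is an assumption made here, since the text does not say how such weeks are handled.

```python
def weekly_ratios(covid_deaths, deaths_2020, deaths_past_avg):
    """Per-week ratios from three equal-length lists of weekly counts.

    covid_deaths:    COVID-19 case fatalities per week
    deaths_2020:     all deceased per week in 2020
    deaths_past_avg: average deceased per week over 2015-2019
    """
    rows = []
    for covid, total, past in zip(covid_deaths, deaths_2020, deaths_past_avg):
        excess = total - past  # excess mortality with respect to past years
        rows.append({
            "covid_frac_weekly_deaths": covid / total,
            "covid_frac_weekly_deaths_wrt_past": covid / past,
            "deaths_2020_wrt_past": total / past,
            # Assumption: ratio left undefined when there is no excess mortality.
            "covid_frac_excess_deaths": covid / excess if excess > 0 else None,
        })
    return rows


# Hypothetical weekly counts, for illustration only.
covid = [10, 200, 900]
total = [1000, 1500, 2500]
past = [1000.0, 1000.0, 1000.0]
ratios = weekly_ratios(covid, total, past)
for week, row in enumerate(ratios, start=1):
    print(week, row)
```

In the second hypothetical week, 200 case fatalities against an excess of 500 deaths gives an excess-death fraction of 0.4, while the first week has no excess and the ratio is left undefined.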
Similarly, such an explanation seems plausible for the fluctuations in the fractions. In fact, Picariello and Aliani (2020) report a change in the way positive cases and deaths are calculated on 10th March. The cumulative case fraction on the other hand does not exhibit such fluctuations. For most regions it is flat and then decreasing. In a number of regions (e.g. Abruzzo, Basilicata, Campania, Friuli Venezia Giulia, Lazio, Molise, Puglia, Sardegna, Sicily, Toscana, Umbria, Veneto) on the log-scale graphs the cumulative case to tests ratio curve seems to peak around or below the Diamond Princess’ cumulative case curve and then start dropping. The scaled death curves exceed this curve.

When looking at the graphs of the number of tests per day two things can be seen. Firstly, the number of positive cases closely follows the number of tests (this is clearly visible on the log-scale graphs and supported by the regression study). We look at this issue in detail and present for each region and Italy the confirmed cases with respect to the total tests carried out. We also regress log(daily confirmed cases) − log(daily total tests) on time (in days) and log(cumulative confirmed cases) − log(cumulative total tests) on time (in days). The slope of such a regression can be presented in terms of the half-life, if it is negative. Such a presentation in terms of effect sizes is important, otherwise it is difficult to assess if the raw slope is big or small. The linear model approach means that the proportion of infected behaves exponentially,

$$p(t) := \frac{\text{daily (cumulative) confirmed cases}}{\text{daily (cumulative) total tests}} = b e^{at}.$$

To get the half-life (for $a$ negative and $t_2 > t_1$) one takes

$$2 = \frac{p(t_1)}{p(t_2)} = e^{a(t_1 - t_2)},$$

obtaining $t_2 - t_1 = \log(2)/(-a)$. For $a > 0$ the doubling time is $t_2 - t_1 = \log(2)/a$. It is important to point out that this is a rather rule-of-thumb approach: our aim is not to model the dynamics of infections, but rather to visualize and understand what the data in front of us is. These regressions were not performed from the first day, as initially there seems to be a lot of noise in the tests; the starting time considered is visible in each graph as the point where the fitted line with its prediction band begins. We performed a regression for both the daily and cumulative counts. For some regions (Molise, Valle d’Aosta) no regression is performed as the daily counts seem too noisy.
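The rule-of-thumb regression and the half-life conversion described above can be sketched in a few lines. This is an illustrative Python version using ordinary least squares written out by hand, not the authors' `lm()` code; the function names and the synthetic series (a positive fraction decaying at an assumed rate of 0.05 per day) are hypothetical. Days with zero counts are skipped, since their logarithm is not finite.

```python
import math


def ols_slope_intercept(t, y):
    """Ordinary least squares fit of y ~ intercept + slope * t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def log_fraction_series(confirmed, tests):
    """Return (days, log(confirmed) - log(tests)), skipping zero counts."""
    days, values = [], []
    for day, (c, n) in enumerate(zip(confirmed, tests)):
        if c > 0 and n > 0:
            days.append(day)
            values.append(math.log(c) - math.log(n))
    return days, values


def time_to_halve_or_double(slope):
    """Half-life if slope < 0, doubling time if slope > 0: log(2) / |slope|."""
    return math.log(2) / abs(slope)


# Synthetic data, for illustration only: tests grow linearly, the positive
# fraction decays as 0.3 * exp(-0.05 * day), and day 10 has no tests reported.
tests = [1000 + 50 * d for d in range(30)]
tests[10] = 0
confirmed = [n * 0.3 * math.exp(-0.05 * d) for d, n in enumerate(tests)]

days, log_frac = log_fraction_series(confirmed, tests)
slope, intercept = ols_slope_intercept(days, log_frac)
half_life = time_to_halve_or_double(slope)
print(f"slope = {slope:.4f} per day, half-life = {half_life:.2f} days")
```

With real regional data the fit would start only after the initial noisy period, as described in the text, and the confidence and prediction bands would come from a proper regression routine.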
Secondly, one can very clearly identify days when something must have changed in the testing methodology in the Emilia–Romagna region: there are huge dips in the numbers of tests performed. Hence, for this region the dates 28th–30th March were removed for the regression estimation. In the Basilicata and Calabria regions drops to 0 can also be observed. These are also removed, as on the log scale they would result in infinite values which cannot be handled by the regression procedure in lm(). However, such dips require careful investigation.

The directly plotted death toll in Figs. 4, 5 and 28 shows that in the regions Emilia–Romagna, Lombardia, P.A. Bolzano combined with P.A. Trento, and Valle d’Aosta there is a larger current (spring 2020) mortality peak than the past December/January (2015–2020 are plotted separately) maximum one. In the regions Liguria, Marche and Piemonte such a larger current peak is present for men only. In the other regions, for all age groups and both men and women, the current “COVID–19 peak” seems to be approximately of the same height as, or lower than, past December/January (2015–2019) maximum ones. Looking at Italy, for men and both sexes combined it is higher, but women seem to have the same peak height. However, it must be stressed that this is only considering the peak’s height, not the total amount of deceased during the current peak and the December/January ones.

R package

We used the COVID19 R package, available on CRAN (https://cran.r-project.org/web/packages/COVID19/), for the purpose of obtaining the data. The package unifies COVID–19 datasets across different sources in order to simplify the data acquisition process and the subsequent analysis. COVID–19 data are pulled in real time and merged with demographic indicators from several trusted sources including but not limited to: Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE, https://github.com/CSSEGISandData/COVID-19); World Bank Open Data (https://data.worldbank.org/); World Factbook by CIA; Ministero della Salute, Dipartimento della Protezione Civile (https://github.com/pcm-dpc/COVID-19); Istituto Nazionale di Statistica; Swiss Federal Statistical Office; Open Government Data Zurich (https://github.com/openZH/covid_19). Besides worldwide data, the dataset includes fine-grained data for the Diamond Princess, Switzerland and Italy. At the time of writing, these include the number of confirmed cases, deaths and tests, total population, population ages 0–14, 15–64 and 65+ (% of total population), median age of population, population density per km², and population mortality rate. Depending on the data provider, the data are available at the country level, state level, or city level. For non-R users, the combined datasets are available in csv format (https://covid19datahub.io).

Discussion or Should we use these data to calibrate epidemiological models?

In this work we analyzed in depth the two statistics that are commonly reported for the currently ongoing COVID–19 pandemic, the number of confirmed cases and the number of case fatalities, for the different regions of Italy. We found significant variability between regions but also some common insights. In particular, the number of confirmed cases is clearly related to the number of tests and their ratio seems to have been decaying for some time now in all regions. This is confirmed when looking at the log-scale plot.
The difference between the logarithm of the cumulative number of tests and the logarithm of the cumulative number of confirmed cases seems to be (visually) dropping linearly for all regions (apart from the extremely noisy ones mentioned below) and for Italy as a whole. Furthermore, for a number of regions (Molise, Valle d’Aosta), on the log scale, the total tests, the positive tests and their difference behave very chaotically, suggesting various test handling situations rather than any pattern. Such oscillations can be visible in all regions at the initial stages, but they settle down (apart from the previously mentioned three regions). However, in regions with seemingly well-behaved curves individual huge dips can be observed (Emilia–Romagna, Marche). Therefore, reports claiming the growth of the epidemic based only on the increasing number of confirmed individuals will not be catching its dynamics.

Furthermore, studying daily positively tested counts could be misleading. On a number of days we found (for some regions) that this count was greater than the number of tests performed. This can certainly be understood as the result of reporting procedures in a crisis situation. However, this also implies that any statistical analysis or modelling of such data has to be done very carefully. We find that the cumulative positively tested fraction behaves much more stably, even though decreases can be observed in the official cumulative counts.

More importantly, using the raw confirmed case counts one could risk combining the sampling effort with the actual disease spread. In our regressions of the logarithm of the ratio of confirmed cases to total tests on time, the fitted slopes are all negative (indicating that the virus is receding; this was also observed by Corica and Vito (2020)).

Furthermore, these slopes are steeper than the slopes of the logarithm of the raw confirmed case counts on time. With the exceptions of Lazio and P.A. Bolzano, the 95% confidence intervals for these two slopes do not overlap, or overlap very slightly. The ratios of the two slopes lie between 1 . . 717 (Piemonte). We report these ratios alongside the slope estimates in the captions of Figs. 7–27. This means that the number of confirmed cases will be confounded by the number of performed tests and cannot be analyzed without them as a point of reference.

Hence, the raw confirmed case counts are not representative of the virus’ infection dynamics. The logarithm of the fraction of confirmed cases to total tests is modelled well by a linear function while an increasing number of daily tests is being performed, and it has a steeper slope than the logarithm of the confirmed case counts. Drawing conclusions from raw confirmed case data would seem to be mixing in the study of the sampling effort (it is important to stress that we do not make any statements here concerning the interpretation of the confirmed cases to tests fraction). Therefore, calibrating model parameters for this virus’s dynamics should not be done based solely on confirmed case counts, but maybe rather also on case fatalities or hospitalization data (given that classification protocols are taken into account) as, e.g., Britton (2020a); Pugliese and Sottile (2020); Vattay (2020) do. In fact, Flaxman et al. (2020) already criticised looking at case counts (as Pugliese and Sottile, 2020, later also did following them) and postulated a focus on the “observed deaths”, while Vattay (2020) writes that “the cumulative number of deaths can be regarded as a master variable”. Britton (2020b) developed an estimation method based on the cumulative reported number of case fatalities.

On the other hand, we also looked at the ratio of case fatalities to the number of deceased per day. This has the analytical advantage of referring to something certain and well measured: detailed records are collected (sooner or later) on the exact number of deceased in a given time period. Here, there is hardly any chance of missing asymptomatic (of being dead) people.
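The confounding of raw confirmed counts with the testing effort, discussed above for the slopes of the fraction and of the raw counts, can be illustrated with a small synthetic example. Since log(confirmed) = [log(confirmed) − log(tests)] + log(tests), and an ordinary least squares slope is linear in the response, the raw slope equals the fraction slope plus the testing slope whenever all three regressions use the same days. The sketch below is in Python with hypothetical rates (−0.04 and 0.03 per day), not the paper's estimates; with real data, where different days may be dropped from each regression, the identity would hold only approximately.

```python
import math


def ols_slope(t, y):
    """Ordinary least squares slope of y regressed on t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    return sxy / sxx


# Hypothetical scenario: the positive fraction decays while testing grows.
days = list(range(40))
frac_rate = -0.04  # assumed daily rate of the positive fraction
test_rate = 0.03   # assumed daily growth rate of the number of tests
tests = [500.0 * math.exp(test_rate * d) for d in days]
confirmed = [n * 0.25 * math.exp(frac_rate * d) for d, n in zip(days, tests)]

# Slope of log(confirmed) - log(tests), of log(tests) and of log(confirmed).
a_f = ols_slope(days, [math.log(c) - math.log(n) for c, n in zip(confirmed, tests)])
a_tests = ols_slope(days, [math.log(n) for n in tests])
a_raw = ols_slope(days, [math.log(c) for c in confirmed])

print(f"a_f = {a_f:.4f}, a_tests = {a_tests:.4f}, a_raw = {a_raw:.4f}")
print(f"a_f / a_raw = {a_f / a_raw:.2f}")
```

In this synthetic setting the raw count appears to decline four times more slowly than the positive fraction, purely because testing expands; growing testing can mask part of a decline when only raw counts are inspected.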
If the assumption, mentioned in the Introduction, that a significant proportion of the tests are serological is true, then the ratio of case fatalities to all deceased should be telling us something about the cumulative proportion of infected individuals. Our graphs (especially on the log-scale) do not contradict this: while the cumulative proportion of confirmed cases changes very slowly, the ratio of case fatalities to total deceased per day seems to look like an epidemic growth curve. Since Italy has very high quality data on the case fatalities, this data could be further studied to assess the dynamics of the pandemic (e.g. Morais (2020) uses the raw death counts for assessing the dynamics of the pandemic, albeit at the country level). This seems to be supported by the fact that if one compares the curves to a potential “gold standard”, the cumulative fraction of confirmed cases on the Diamond Princess, then the case fatalities ratio seems to shadow this curve (on the initial part when the epidemic was taking place on the cruise ship and for some regions like Emilia–Romagna or Lombardia) but exceeds it. One could hope that once all curves flatten at the same level, the epidemic will have reached its plateau. Unfortunately, at the level of some of the regions (e.g. Emilia–Romagna or Lombardia), the scaled case fatalities grew and exceeded both the Diamond Princess curve and the cumulative fraction of confirmed cases.

We also compared the regional results to the same curves for the whole of Italy, Figure 5. On the one hand the same patterns are visible: the number of confirmed cases is related to the testing effort, the case fatalities exceed the Diamond Princess’ cumulative confirmed cases, and the confirmed cases fraction seems to be stabilizing around the Diamond Princess’ curve and then dropping. However, these graphs completely miss the regional variation. This is particularly visible when looking at the total death tolls directly (Figs. 6 and 28). Combined Italy shows a visible increase in the death toll during the March–April period compared to previous years and the seasonal December/January peak. However, this peak is driven by particular regions: Emilia–Romagna, Lombardia and Piemonte (Liguria, Marche, P.A. Bolzano combined with P.A. Trento, and Valle d’Aosta also show a big increase, but in raw numbers they are much smaller than the other three). All the other regions’ peaks are on the same level as or lower than the December/January one, and for some the death toll is on similar levels to the March–April one from previous years. Furthermore, looking at epidemiological country level data would be especially misleading for Italy as Lombardy acted differently from Veneto in terms of their testing strategies.

We believe that our presented view on the Italian regional data gives some insights into how the pandemic data reporting can be improved (if of course given the difficult situation it would be possible in practice). For the confirmed cases count a break-down should be provided: how many of these were medical personnel, how many had symptoms, how many were seriously hospitalized, how many were tested for other reasons (e.g. after contact). Similarly for the number of tests carried out and their type (serological or not). The case fatalities counts should also be put in perspective, with a report of how many people died in total on the given day and how many deceased were tested negatively. This would allow for estimating excess mortality (crudely, compared to previous years’ average, or more exactly if the number of deaths for the given time period is available) and for correct scaling to compare to other ratios.
In fact, for the time period 1st January–15th April we are able to visualize the excess mortality directly: the number of deceased (in each week) in 2020 compared to the average from the past five years. The dataset is based on the 7904 Italian municipalities.

To the best of our knowledge the counts presented here are at the moment the best available data that can be used for scaling and putting the deceased counts in Italy in perspective. The death counts seem to be collected in a consistent manner, both the number of case fatalities and the population death counts (used here). This means that such counts could be used as a proxy for monitoring the dynamics of the virus.

It is also a question whether the Diamond Princess can be considered a gold standard. Certainly at the beginning it seems to behave like the other curves presented here. However, the data end very quickly, when the passengers were disembarked. We do not know whether it reached the plateau or would have grown further. The confirmed case ratio usually seems to stay below or around this curve, go slightly above it and then drop. Scaled case fatality curves exceed the curve.

Finally, the counting methodology should be made readily available for easy comparison between different countries. While of course each country is free to follow its own protocol, without putting numbers into context one can analyze data in an over–pessimistic or over–optimistic way. The effect of different counting methods is pointed out by Picariello and Aliani (2020): when fitting parameters to the confirmed case counts (in Lombardy, Bergamo and Brescia), one sees a change of coefficients following 10th March and 17th March. The latter may be due to containment measures, but the former, the authors are convinced, is due to a change in the counting methodology. We have also abstained here from fitting any models to the data (the regression performed does not aim at modelling but at formally testing what the respective curve could be telling us). It is known that, due to different protocols between regions and changes in the protocols over time, the data are not homogeneous. In order to fit any model one would have to obtain documentation of what the measurement strategies were for each region in each time period. In fact, when Alberti and Faranda (2020) modelled the cumulative number of infections in Italy through time (obtained using the
COVID19 package), they performed fits to the data separately in different time intervals, which corresponded to the various confinement measures introduced by the government.

Figure 5: Comparison of curves for the whole of Italy. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
[Legend, top panels: (DP) Confirmed Scaled; (IT) Confirmed/Tests; (IT) COVID frac cumul deaths; (IT) COVID frac cumul deaths wrt past; (IT) COVID frac excess deaths; (IT) COVID frac weekly deaths; (IT) COVID frac weekly deaths wrt past; (IT) deaths in 2020 wrt past. Legend, bottom panels: (IT) Confirmed; (IT) Confirmed − Tests; (IT) Confirmed − Tests cumulative; (IT) Tests.]

The COVID19 package is available from https://cran.r-project.org/web/packages/COVID19/ . The R script used to generate the graphics is available from https://github.com/krzbar/COVID19 .

We would like to thank Marco Picariello (MIUR-IISS) and Paola Aliani (Cognizant) for sharing their preliminary work on data analysis of the Veneto region with us. We would like to thank Pierpaolo Malinverni (JRC) for his critical reading of an earlier version of the manuscript. All responsibility for the content of this document lies with the authors. K.B. is supported by the Swedish Research Council (Vetenskapsrådet) grant no. 2017–04951. M.O. is supported by National Science Centre Poland grant no. 2014 / /E/N Z /

Figure 6: Weekly raw death toll comparison in different age groups between 2020 and 2015–2019 for whole of Italy.
[Panels: Italy TOTAL, Italy WOMEN, Italy MEN; y-axis: number of deaths per week.]
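As a rough illustration of the comparison shown in Figure 6, the sketch below computes a crude weekly excess mortality as the 2020 weekly death count minus the average of the same week over previous years, and then the share of that excess accounted for by reported COVID–19 fatalities. It is written in Python purely for illustration (the authors' own script is in R); the function names and all numbers are hypothetical and are not taken from the ISTAT data.

```python
from statistics import mean


def excess_mortality(deaths_2020, deaths_past_years):
    """Return (baseline, excess) per week.

    deaths_2020: list of weekly death counts for 2020.
    deaths_past_years: list of lists, one list of weekly counts per past year.
    The baseline is the plain average of the same week over the past years.
    """
    baseline = [mean(week) for week in zip(*deaths_past_years)]
    excess = [d - b for d, b in zip(deaths_2020, baseline)]
    return baseline, excess


def covid_share_of_excess(covid_deaths, excess):
    """Fraction of the weekly excess attributed to reported COVID-19 deaths.

    Weeks without positive excess give None, since the ratio is not meaningful.
    """
    return [c / e if e > 0 else None for c, e in zip(covid_deaths, excess)]


# Hypothetical toy data: five past years, three weeks each.
past = [
    [100, 102, 98],
    [104, 100, 101],
    [96, 98, 100],
    [100, 100, 102],
    [100, 100, 99],
]
deaths_2020 = [101, 150, 220]
covid_deaths = [0, 30, 60]

baseline, excess = excess_mortality(deaths_2020, past)
share = covid_share_of_excess(covid_deaths, excess)
for week, (b, e, s) in enumerate(zip(baseline, excess, share), start=1):
    print(f"week {week}: baseline={b}, excess={e}, covid share={s}")
```

A share well below one in such a calculation would suggest that the reported case fatalities under-count the deaths occurring in that week, which is the kind of scaling the text argues for.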
References

Alberti, T. and D. Faranda (2020). On the uncertainty of real-time predictions of epidemic growths: a COVID–19 case study for China and Italy. https://arxiv.org/abs/2004.10060
Bai, Y., L. Yao, T. Wei, F. Tian, D.-Y. Jin, L. Chen, and M. Wang (2020, 04). Presumed asymptomatic carrier transmission of COVID–19. JAMA 323(14), 1406–1407.
Böttcher, L., M. Xia, and T. Chou (2020). Why estimating population-based case fatality rates during epidemics may be misleading. https://doi.org/10.1101/2020.03.26.20044693
Britton, T. (2020a). Basic estimation-prediction techniques for Covid–19, and a prediction for Stockholm. medRxiv.
Britton, T. (2020b). Basic prediction methodology for covid–19: estimation and sensitivity considerations. medRxiv.
Corica, A. and L. D. Vito (2020). La prima foto del virus a milano: la mappa dei contagi in città e nell'hinterland. https://milano.repubblica.it/cronaca/2020/05/10/news/coronavirus_mappa_del_virus_milano_provincia_lodi_contagi-256225645/
Flaxman, S., S. Mishra, A. Gandy, H. J. T. Unwin, T. A. M. H. Coupland, H. Zhu, T. Berah, J. W. Eaton, P. N. P. Guzman, N. Schmit, L. Callizo, K. E. C. Ainslie, M. Baguelin, I. Blake, A. Boonyasiri, O. Boyd, L. Cattarino, C. Ciavarella, L. Cooper, Z. Cucunubá, G. Cuomo–Dannenburg, A. Dighe, B. Djaafara, I. Dorigatti, S. van Elsland, R. FitzJohn, H. Fu, K. Gaythorpe, L. Geidelberg, N. Grassly, W. Green, T. Hallett, A. Hamlet, W. Hinsley, B. Jeffrey, D. Jorgensen, E. Knock, D. Laydon, G. Nedjati–Gilani, P. Nouvellet, K. Parag, I. Siveroni, H. Thompson, R. Verity, E. Volz, P. G. T. Walker, C. Walters, H. Wang, Y. Wang, O. Watson, C. Whittaker, P. Winskill, X. Xi, A. Ghani, C. A. Donnelly, S. Riley, L. C. Okell, M. A. C. Vollmer, N. M. Ferguson, and S. Bhatt (2020, 03). Report 13: Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID–19 in 11 European countries. Technical report, Imperial College London.
Giles, C. (2020). Coronavirus death toll in UK twice as high as official figure.
Guidotti, E. and D. Ardia (2020). COVID–19 data hub. https://covid19datahub.io
Huang, C., Y. Wang, X. Li, L. Ren, J. Zhao, Y. Hu, L. Zhang, G. Fan, J. Xu, X. Gu, Z. Cheng, T. Yu, J. Xia, Y. Wei, W. Wu, X. Xie, W. Yin, H. Li, M. Liu, Y. Xiao, H. Gao, L. Guo, J. Xie, G. Wang, R. Jiang, Z. Gao, Q. Jin, J. Wang, and B. Cao (2020). Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 395(10223), 497–506.
Kumar, J. and K. P. S. S. Hembram (2020). Epidemiological study of novel coronavirus (COVID–19). https://arxiv.org/abs/2003.11376
Lu, R., X. Zhao, J. Li, P. Niu, B. Yang, H. Wu, W. Wang, H. Song, B. Huang, N. Zhu, Y. Bi, X. Ma, F. Zhan, L. Wang, T. Hu, H. Zhou, X. Hu, W. Zhou, L. Zhao, J. Chen, Y. Meng, J. Wang, Y. Lin, J. Yuan, Z. Xie, J. Ma, W. J. Liu, D. Wang, W. Xu, E. C. Holmes, G. F. Gao, G. Wu, W. Chen, W. Shi, and W. Tan (2020). Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. Lancet 395(10224), 565–574.
Morais, A. F. (2020). Logistic approximations used to describe new outbreaks in the 2020 COVID–19 pandemic. https://arxiv.org/abs/2003.11149
Nakazawa, E., H. Ino, and A. Akabayashi (2020). Chronology of COVID–19 cases on the Diamond Princess cruise ship and ethical considerations: a report from Japan. Disaster Med. Public Health Prep., 1–27.
Picariello, M. and P. Aliani (2020). Covid–19: Data analysis of the Lombardy region and the provinces of Bergamo and Brescia. https://arxiv.org/abs/2003.10518
Pugliese, A. and S. Sottile (2020). Inferring the COVID–19 infection curve in Italy. https://arxiv.org/abs/2004.09404
R Core Team (2020). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
Rocklöv, J., H. Sjödin, and A. Wilder–Smith (2020, 02). COVID–19 outbreak on the Diamond Princess cruise ship: estimating the epidemic potential and effectiveness of public health countermeasures. J. Travel Med.
Singer, H. M. (2020). Short–term predictions of country-specific Covid–19 infection rates based on power law scaling exponents. https://arxiv.org/abs/2003.11997
The Economist (2020). Tracking covid–19 excess deaths across countries: Official covid–19 death tolls still under-count the true number of fatalities.
Thomas, D. M. (2020). Excess registered deaths in England and Wales during the COVID–19 pandemic, March 2020 and April 2020. https://arxiv.org/abs/2004.11355
Vattay, G. (2020). Forecasting the outcome and estimating the epidemic model parameters from the fatality time series in COVID–19 outbreaks. https://arxiv.org/abs/2004.089773
Wang, L. S., Y. R. Wang, D. W. Ye, and Q. Q. Liu (2020). A review of the 2019 Novel Coronavirus (COVID–19) based on current evidence. Int. J. Antimicrob. Agents, 105948.
Wood, S. N. (2020). Simple models for COVID–19 death and fatal infection profiles. https://arxiv.org/abs/2005.02090
Wu, J. and A. McCann (2020). 25,000 missing deaths: Tracking the true toll of the coronavirus crisis.
Xiao, F., M. Tang, X. Zheng, Y. Liu, X. Li, and H. Shan (2020). Evidence for gastrointestinal infection of SARS-CoV–2. Gastroenterology.
Zhang, S., M. Diao, W. Yu, L. Pei, Z. Lin, and D. Chen (2020). Estimation of the reproductive number of novel coronavirus (COVID–19) and the probable outbreak size on the Diamond Princess cruise ship: A data-driven analysis. Int. J. Infect. Dis. 93, 201–204.
Zhou, F., T. Yu, R. Du, G. Fan, Y. Liu, Z. Liu, J. Xiang, Y. Wang, B. Song, X. Gu, L. Guan, Y. Wei, H. Li, X. Wu, J. Xu, S. Tu, Y. Zhang, H. Chen, and B. Cao (2020). Clinical course and risk factors for mortality of adult inpatients with COVID–19 in Wuhan, China: a retrospective cohort study. Lancet 395(10229), 1054–1062.
Zullo, F. (2020). Some numerical observations about the COVID–19 epidemic in Italy. https://arxiv.org/abs/2003.11363

Appendix A: Curves for regions of Italy
Figure 7: Comparison of curves for Abruzzo region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
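The regression named in these captions, log(daily confirmed) − log(daily tested) ∼ time, and its raw-count counterpart can be sketched as below. This is a minimal illustration in Python, not the authors' R script: the helper names are hypothetical, the data are synthetic, and the confidence interval uses a normal quantile as an approximation (an exact interval would use the Student t quantile with n − 2 degrees of freedom).

```python
import math
from statistics import NormalDist, mean


def ols_slope(t, y):
    """Least-squares fit y ~ a*t + b; returns (a, b, standard error of a)."""
    n = len(t)
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    a = sxy / sxx
    b = y_bar - a * t_bar
    rss = sum((yi - (a * ti + b)) ** 2 for ti, yi in zip(t, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return a, b, se


def slope_ci(a, se, level=0.95):
    """Approximate confidence interval for the slope (normal quantile)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return a - z * se, a + z * se


# Synthetic example (assumed numbers): testing grows by about 8% per day
# while the fraction of positive tests decays by about 5% per day.
days = list(range(10))
tested = [1000 * math.exp(0.08 * d) for d in days]
confirmed = [n * 0.3 * math.exp(-0.05 * d) for n, d in zip(tested, days)]

log_fraction = [math.log(c) - math.log(n) for c, n in zip(confirmed, tested)]
log_raw = [math.log(c) for c in confirmed]

a_f, _, se_f = ols_slope(days, log_fraction)   # slope of the logged positive fraction
a_raw, _, se_raw = ols_slope(days, log_raw)    # slope of the logged raw counts

print("a_f   =", round(a_f, 4), slope_ci(a_f, se_f))
print("a_raw =", round(a_raw, 4), slope_ci(a_raw, se_raw))
print("a_f / a_raw =", round(a_f / a_raw, 4))
```

In this synthetic setting the raw counts rise while the positive fraction falls, so the two slopes have opposite signs; it is only meant to show how growing sampling effort can mask the trend that the rescaled series reveals.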
Figure 8: Comparison of curves for Basilicata region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 9: Comparison of curves for Calabria region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 10: Comparison of curves for Campania region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 11: Comparison of curves for Emilia Romagna region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable]. Observations from […]th–30th March are removed for the regression analysis.
Figure 12: Comparison of curves for Friuli Venezia Giulia region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 13: Comparison of curves for Lazio region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 14: Comparison of curves for Liguria region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 15: Comparison of curves for Lombardia region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 16: Comparison of curves for Marche region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 17: Comparison of curves for Molise region. Left: y-axis on normal scale, right: on logarithmic scale.
Figure 18: Comparison of curves for P.A. Bolzano region. Left: y-axis on normal scale, right: on logarithmic scale. Top row: scaling with respect to population death tolls is not presented, as P.A. Bolzano is merged with P.A. Trento in the deaths data provided by ISTAT. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 19: Comparison of curves for P.A. Trento region. Left: y-axis on normal scale, right: on logarithmic scale. Top row: scaling with respect to population death tolls is not presented, as P.A. Bolzano is merged with P.A. Trento in the deaths data provided by ISTAT. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 20: Comparison of curves for Piemonte region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 21: Comparison of curves for Puglia region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 22: Comparison of curves for Sardegna region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 23: Comparison of curves for Sicilia region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 24: Comparison of curves for Toscana region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 25: Comparison of curves for Umbria region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].
Figure 26: Comparison of curves for Valle d'Aosta region. Left: y-axis on normal scale, right: on logarithmic scale.
Figure 27: Comparison of curves for Veneto region. Left: y-axis on normal scale, right: on logarithmic scale. Regression line shown for log(daily confirmed) − log(daily tested) ∼ time with 95% prediction band. Regression slopes a_f (this regression), a_raw (logged raw confirmed counts ∼ time) and their ratio a_f/a_raw are reported with 95% confidence intervals [numerical values not recoverable].

Appendix B: Death tolls for regions of Italy
[Figure 28 panels: number of deaths per week (TOTAL, WOMEN, MEN) for Abruzzo, Basilicata, Calabria, Campania, Emilia–Romagna, Friuli Venezia Giulia, Lazio, Liguria, Lombardia, Marche, Molise, P.A. Bolzano/P.A. Trento, Piemonte, Puglia, Sardegna, Sicilia, Toscana, Umbria, Valle d'Aosta and Veneto.]
Figure 28: