Biomedical and environmental sciences : BES | 2021

Testing-Related and Geo-Demographic Indicators Strongly Predict COVID-19 Deaths in the United States during March of 2020.

 
 
 
 
 
 
 
 

Abstract


The COVID-19 pandemic has wreaked havoc around the globe and caused significant disruptions across multiple domains. Moreover, different countries have been differentially impacted by COVID-19, a phenomenon that is due to a multitude of complex and often interacting determinants. Understanding such complexity and interacting factors requires both compelling theory and appropriate data analytic techniques. Regarding data analysis, one question that arises is how to analyze extremely non-normal data, such as those variables evidencing L-shaped distributions. A second question concerns the appropriate selection of a predictive modelling technique when the predictors derive from multiple domains (e.g., testing-related variables, population density), and both main effects and interactions are examined. To address these questions, we propose a novel statistical approach for analyzing and understanding complex data interactions. Using data collected in the USA during the first month in which COVID-19 testing was performed (March of 2020; Supplementary Table S1, available at www.besjournal.com), we examined the following six predictors of COVID-19 related deaths: (i) the proportion of all tests conducted during the first week of testing; (ii) the cumulative number of (test-positive) cases through March 31, 2020; (iii) the number of tests performed/million inhabitants; (iv) the cumulative number of inhabitants tested; (v) the number of cases/million inhabitants (cases/mill inh); and (vi) the number of diagnostic tests performed in week one of testing/million inhabitants/state-specific population density (w1DT/MI/PD), where “population density” is defined as the number of inhabitants per square kilometer. The purpose of this study was to examine the ability of the six variables to predict COVID-19 related deaths in the United States during March of 2020.
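
The composite predictor (vi) can be illustrated with a minimal sketch. It assumes that the slashes in w1DT/MI/PD denote successive division (week-one tests per million inhabitants, divided by inhabitants per square kilometer); the abstract does not spell out the exact computation, and the function name, argument names, and input figures below are hypothetical.

```python
def w1dt_mi_pd(week1_tests, population, area_km2):
    """Week-one diagnostic tests per million inhabitants, per unit of
    population density (assumed reading of the w1DT/MI/PD indicator)."""
    tests_per_million = week1_tests / (population / 1_000_000)
    density = population / area_km2  # inhabitants per square kilometer
    return tests_per_million / density


# Hypothetical state: 2,000 tests in week one, 4 million inhabitants,
# 100,000 square kilometers (illustrative numbers, not study data).
example_value = w1dt_mi_pd(week1_tests=2_000, population=4_000_000, area_km2=100_000)
print(example_value)
```

Under this reading, two states with the same per-capita testing rate receive different values when their population densities differ, which is how the indicator combines a testing-related and a geo-demographic dimension.
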
We ran the predictive model twice, once for each dependent variable: mortality count (overall number of deaths), and deaths per million inhabitants. Because our model (a) uses predictors that leverage information from multiple domains, (b) captures both nationwide and state-specific dimensions, and (c) examines two different mortality-related outcomes, the results are expected to have relevance for policy-makers. All data used in this study were obtained from three sources in the public domain: Worldometer (https://www.worldometers.info/coronavirus/), World Population Review (https://worldpopulationreview.com/states), and Covidtracking (https://covidtracking.com/). The data were processed and analyzed using IBM SPSS, Minitab, and R. Univariate skewness and kurtosis values indicated that all predictors and outcomes were non-normally distributed, with a few variables evidencing L-shaped distributions. The L-shaped variables were normalized using the rank-based inverse normal (RIN) transformation. For extremely non-normal data, the RIN method is a highly effective normalizing transformation. The prediction models were first examined using linear multiple regression, with the RIN-transformed versions of all variables used in the regressions. Because the homoscedasticity assumption (i.e.,
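
As an illustration of the rank-based inverse normal (RIN) transformation mentioned in the abstract, the following is a minimal sketch using only the Python standard library. It assumes the commonly used Blom offset (c = 3/8) and average ranks for ties; the abstract does not state which RIN variant or software routine the authors used, and the function names and sample values are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rin_transform(values, c=3 / 8):
    """Map each value to a standard-normal quantile based on its rank.

    Uses z = Phi^{-1}((r - c) / (n - 2c + 1)); c = 3/8 is Blom's offset,
    chosen here as an assumption.
    """
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1))
            for r in average_ranks(values)]


# Hypothetical L-shaped sample: many small counts, one very large one.
deaths = [0, 0, 1, 1, 2, 3, 5, 8, 40, 1500]
z_scores = rin_transform(deaths)
print([round(z, 3) for z in z_scores])
```

Because the transformation depends only on ranks, the extreme value 1500 ends up no further from the center than the largest of any ten observations would, which is why the method suits heavily skewed variables.
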

Volume 34, Issue 9
Pages 734-738
DOI 10.3967/bes2021.102
Language English
Journal Biomedical and environmental sciences : BES

Full Text