The Association of Opening K-12 Schools with the Spread of COVID-19 in the United States: County-Level Panel Data Analysis
THE ASSOCIATION OF OPENING K-12 SCHOOLS AND COLLEGES WITH THE SPREAD OF COVID-19 IN THE UNITED STATES: COUNTY-LEVEL PANEL DATA ANALYSIS
VICTOR CHERNOZHUKOV, HIROYUKI KASAHARA, AND PAUL SCHRIMPF
Abstract.
This paper empirically examines how the opening of K-12 schools and colleges is associated with the spread of COVID-19 using county-level panel data in the United States. Using data on foot traffic and K-12 school opening plans, we analyze how an increase in visits to schools and opening schools with different teaching methods (in-person, hybrid, and remote) is related to the 2-weeks forward growth rate of confirmed COVID-19 cases. Our debiased panel data regression analysis with a set of county dummies, interactions of state and week dummies, and other controls shows that an increase in visits to both K-12 schools and colleges is associated with a subsequent increase in case growth rates. The estimates indicate that fully opening K-12 schools with in-person learning is associated with a 5 (SE = 2) percentage points increase in the growth rate of cases. We also find that the positive association of K-12 school visits or in-person school openings with case growth is stronger for counties that do not require staff to wear masks at schools. These results have a causal interpretation in a structural model with unobserved county and time confounders. Sensitivity analysis shows that the baseline results are robust to timing assumptions and alternative specifications.
Introduction
Does opening K-12 schools and colleges lead to the spread of COVID-19? Do mitigation strategies such as mask-wearing requirements help reduce the transmission of SARS-CoV-2 at school? These are important policy relevant questions. If in-person school openings substantially increase COVID-19 cases, then local governments could promote enforcing mitigation measures at schools (universal and proper masking, social distancing, and hand-washing) to lower the risk of COVID-19 spread. Furthermore, the government could prioritize vaccines for education workers in case of in-person school openings. This paper uses county-level panel data on K-12 school opening plans and mitigation strategies together with foot traffic data to investigate how an increase in the visits to K-12 schools and colleges/universities is associated with a subsequent increase in the growth rates of COVID-19 cases in the United States.
Date : February 23, 2021.
Key words and phrases.
K-12 school openings | in-person, hybrid, and remote | mask-wearing requirements for staff | foot traffic data | debiased estimator.
We are very grateful to Emily Oster for her helpful comments. All mistakes are our own.
arXiv [econ.GN]
Figure 1.
The evolution of cases, deaths, and visits to K-12 schools and restaurants before and after the opening of K-12 schools
[Figure: four panels, each with one line per school opening plan (In-person/No-Mask, In-person/Yes-Mask, Hybrid/No-Mask, Hybrid/Yes-Mask, Remote). (a) Cases by K-12 Opening Modes, y-axis: weekly cases per 1000 persons. (b) Deaths by K-12 Opening Modes, y-axis: weekly deaths per 1000 persons. (c) K-12 School Visits, y-axis: K-12 school visits per device. (d) Full-Time Workplace Visits, y-axis: full-time work per device.]
Notes: (a)-(b) plot the evolution of weekly cases or deaths per 1000 persons averaged across counties within each group of counties classified by K-12 school teaching methods and mitigation strategy of mask requirements against the days since K-12 school opening. We classify counties that implement in-person teaching as their dominant teaching method into “In-person/Yes-Mask” and “In-person/No-Mask” based on whether at least one school district requires staff to wear masks or not. Similarly, we classify counties that implement hybrid teaching into “Hybrid/Yes-Mask” and “Hybrid/No-Mask” based on whether mask-wearing is required for staff. We classify counties that implement remote teaching as “Remote.” (c) and (d) plot the evolution of per-device visits to K-12 schools and full-time workplaces, respectively, against the days since K-12 school opening using the same classification as (a) and (b).
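The grouping rule described in the notes can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name, the string labels for teaching methods, and the input `n_districts_requiring_staff_masks` are hypothetical, the dominant teaching method is taken as given, and applying the "at least one district" rule to hybrid counties is an assumption (the notes state it explicitly only for in-person counties).

```python
def classify_county(dominant_method, n_districts_requiring_staff_masks):
    """Return the Fig. 1 group label for one county.

    dominant_method: "in-person", "hybrid", or "remote" (hypothetical labels).
    n_districts_requiring_staff_masks: number of school districts in the
        county that require staff to wear masks (hypothetical input).
    """
    if dominant_method == "remote":
        # Remote counties are not split by mask requirement.
        return "Remote"
    prefix = {"in-person": "In-person", "hybrid": "Hybrid"}[dominant_method]
    # Assumption: "Yes-Mask" means at least one district requires staff masks.
    mask = "Yes-Mask" if n_districts_requiring_staff_masks >= 1 else "No-Mask"
    return f"{prefix}/{mask}"


labels = [
    classify_county("in-person", 0),
    classify_county("hybrid", 3),
    classify_county("remote", 2),
]
```

Keeping the mask rule in one place makes it easy to try a stricter threshold (for example, a majority of districts) and see how the groups change.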
We begin with simple suggestive evidence. Fig. 1 provides visual evidence for the association of opening K-12 schools with the spread of COVID-19 as well as the role of school mitigation strategies. Fig. 1(a) and (b) plot the evolution of average weekly cases and deaths per 1000 persons, respectively, against days since school opening across different teaching methods as well as mask requirements for staff. In Fig. 1(a), the average number of weekly cases starts increasing after 2 weeks of opening schools in-person or hybrid,
especially for counties with no mask mandates for staff. This possibly suggests that mask mandates at school reduce the transmission of SARS-CoV-2. In Fig. 1(b), the number of deaths starts increasing after 3 to 5 weeks of opening schools, especially for counties that adopt in-person/hybrid teaching methods with no mask mandates. Alternative mitigation strategies of requiring students to wear masks, prohibiting sports activities, and promoting online instruction also appear to help reduce the number of cases after school openings (see SI Appendix, Fig. S1(i)-(p)).
Fig. 1(c) shows that opening K-12 schools in-person or hybrid increases the number of per-device visits to K-12 schools more than opening remotely, especially when no mask mandates are in place. Fig. 1(d) and SI Appendix, Fig. S1(e)-(f) show that visits to full-time and part-time workplaces increase after school openings with in-person teaching, suggesting that the opening of schools allows parents to return to work. On the other hand, we observe no drastic changes in per-device visits to restaurants, recreational facilities, and churches after school openings (SI Appendix, Fig. S1(b)-(d)).
Fig. 2 and SI Appendix, Fig. S2 provide further descriptive evidence that opening colleges and universities with in-person teaching leads to the spread of COVID-19 in counties where the University of Wisconsin (UW)-Madison, the University of Oregon, the University of Arizona, Michigan State University, Pennsylvania State University, Iowa State University, and the University of Illinois-Champaign are located.
What happened in Dane county, WI, is also illustrative. The left panel of Fig. 2 presents the evolution of the number of cases by age groups, the number of visits to colleges and universities, and the number of visits to bars and restaurants in Dane county, WI.
The first panel shows that the number of cases for age groups of 10-19 and 20-29 sharply increased in mid-September while few cases were reported for other age groups. The second to the fourth panels suggest that this sharp increase in cases among the 10-29 age cohort in mid-September is associated with an increase in visits to colleges/universities, bars, and restaurants in late August and early September. The fall semester with in-person classes at UW-Madison began on September 2, 2020, when many undergraduates started living together in residential halls and likely visited bars and restaurants. This resulted in increases in COVID-19 cases on campus; according to the letter from Dane County Executive Joe Parisi to UW-Madison (Parisi, 2020), nearly 1,000 positive cases were confirmed on the UW-Madison campus by September 9, 2020, accounting for at least 74 percent of confirmed cases from September 1 to 8, 2020 in Dane county.
While Figs. 1-2 as well as SI Appendix, Figs. S1-S2 are suggestive, the patterns observed in them may be driven by a variety of confounders. Therefore, we analyze the effect of opening K-12 schools and colleges/universities by panel data regression analysis with fixed effects to capture unobserved confounding.
We conduct the analysis using county-level data in the United States. As an outcome variable, we use the weekly growth rate of confirmed cases approximated by the log-difference in reported weekly cases over two weeks, where the log of weekly cases is set to be −
Figure 2.
The number of cases by age groups and the number of visits to colleges/universities and bars in Dane county, WI, and Lane county, OR
[Figure: two columns of panels, Dane County, WI (left) and Lane County, OR (right). Rows of panels: Cases by Age Groups (moving average of cases, one line per 10-year age group from 0-9 to 80+ years), College Visits (moving average of college visits), Bar Visits (moving average of bar visits), Restaurant Visits (moving average of restaurant visits). x-axis: Date.]
Notes: The first, the second, and the third figures in the left panel show the evolution of the number of cases by age groups, the number of visits to colleges/universities, and bars, respectively, in Dane County, WI. The right panel shows the corresponding figures for Lane County, OR.
per-device visits to K-12 schools and colleges/universities from SafeGraph foot traffic data (SI Appendix, Fig. S3(3)(6)).
We also consider the variable for school openings with different teaching methods (in-person, hybrid, and remote) from MCH Strategic Data (SI Appendix, Fig. S3(11)). Foot
traffic data has the advantage over school opening data in that it provides more accurate information on the actual visits to schools over time, possibly capturing unrecorded changes in teaching methods and school closures beyond the information provided by MCH Strategic Data. Furthermore, foot traffic data covers all counties while there is missing information for some school districts in MCH Strategic Data, which may cause sample selection issues.
To investigate the role of mitigation strategies at school on the transmission of SARS-CoV-2, we examine how the coefficients of K-12 school visits and K-12 school opening depend on the mask-wearing requirement for staff by adding an interaction term, for example, between K-12 school visits and mask-wearing requirements for staff at schools.
As confounders, we consider a set of county fixed effects as well as interaction terms between state and week fixed effects to control for unobserved time-invariant county-level factors as well as unobserved time-varying state-level factors. County fixed effects control for permanent differences across counties in unobserved personal risk-aversion and attitude toward mask-wearing, hand-washing, and social distancing. Interaction terms between state dummy variables and week dummy variables capture any change over time in people’s behaviors and non-pharmaceutical policy interventions (NPIs) that are common within a state; they also control for changes in weather, temperature, and humidity within a state.
We also include county-level NPIs (mask mandates, bans on gatherings of more than 50 persons, stay-at-home orders) lagged by 2 weeks to control for the effect of people’s behavioral changes driven by policies on case growth beyond the effect of state-level policies.
Furthermore, the logarithms of past weekly cases with lags of 2, 3, and 4 weeks are included to capture people’s voluntary behavioral response to new information on transmission risks. The growth rate of the number of tests recorded at the daily frequency for each state is also added as a control for case growth regression.
Footnote: MCH Strategic Data provides the school district level data on whether each school district adopts the following mitigation strategies: (i) mask requirements for staff, (ii) mask requirements for students, (iii) prohibiting sports activities, and (iv) online instruction increases, among other measures. We decided to use mask requirements for staff as the main variable for school mitigation strategy because it has a relatively smaller number of missing values. For regression analysis with the mask requirement variable, we drop counties from the sample when more than 50 percent of students in a county attend school districts for which mask requirements for staff are unknown or pending. Similarly, for specification with different teaching methods, we drop counties from the sample when more than 50 percent of students in a county attend school districts for which teaching methods are unknown or pending.
Footnote: The decision to reopen schools in some states such as California and Oregon depended on trends in local case counts or hospitalizations (Goldhaber-Fiebert, Studdert, and Mello, 2020).
Because the fixed effects estimator with a set of county dummies for dynamic panel regression could be severely biased when the time dimension is short (Nickell, 1981), we employ the debiased estimator by implementing bias correction (e.g., Chen, Chernozhukov, and Fernández-Val, 2019). Our empirical analysis uses 7-day moving averages of daily variables to deal with periodic fluctuations within a week. Our data set contains 3144 counties for regression analysis using foot traffic data but some county observations are dropped out of samples due to missing values for school opening teaching methods and staff mask requirements in some regression specifications. Our sample period is from April 1, 2020, to December 2, 2020. The analysis was conducted using R software (version 4.0.3).
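To illustrate what county dummies and state-by-week dummies absorb, here is a minimal Python sketch of the plain within (fixed effects) estimator on a simulated balanced panel. It is not the debiased estimator used in the paper and it omits lagged outcomes and other controls; the data-generating process, the function names, and the single regressor `x` are all hypothetical. For a balanced panel, subtracting the county mean and the state-week mean and adding back the state mean removes both sets of effects.

```python
import random
from collections import defaultdict


def simulate_panel(n_states=3, counties_per_state=4, n_weeks=6,
                   beta=0.47, noise_sd=0.0, seed=0):
    """Balanced toy panel: y = beta*x + county effect + state-week effect + noise."""
    rng = random.Random(seed)
    rows = []
    for s in range(n_states):
        state_week_fx = [rng.gauss(0, 1) for _ in range(n_weeks)]
        for c in range(counties_per_state):
            county_fx = rng.gauss(0, 1)
            for t in range(n_weeks):
                # x is correlated with both unobserved effects (confounding).
                x = 0.5 * county_fx + 0.5 * state_week_fx[t] + rng.gauss(0, 1)
                y = beta * x + county_fx + state_week_fx[t] + rng.gauss(0, noise_sd)
                rows.append({"state": s, "county": (s, c), "week": t, "x": x, "y": y})
    return rows


def group_means(rows, var, key):
    """Mean of rows[var] within each group defined by key(row)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        k = key(r)
        sums[k] += r[var]
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


def within_transform(rows, var):
    """Remove county and state-week effects (exact for a balanced panel)."""
    county = group_means(rows, var, lambda r: r["county"])
    state_week = group_means(rows, var, lambda r: (r["state"], r["week"]))
    state = group_means(rows, var, lambda r: r["state"])
    return [r[var] - county[r["county"]]
            - state_week[(r["state"], r["week"])] + state[r["state"]]
            for r in rows]


def fixed_effects_slope(rows):
    """OLS slope of transformed y on transformed x."""
    x = within_transform(rows, "x")
    y = within_transform(rows, "y")
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


beta_hat = fixed_effects_slope(simulate_panel())
```

With `noise_sd=0.0` the transformed outcome is exactly `beta` times the transformed regressor, so the slope is recovered up to floating-point error even though `x` is correlated with both sets of effects. The toy model has no lagged outcome among the regressors, so it does not exhibit the short-panel bias that motivates the bias correction.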
Results
Table 1 reports the debiased estimates of panel data regression. Clustered standard errors at the state level are reported in brackets to provide valid inference under possible dependency over time and across counties within each state. The results suggest that an increase in the visits to K-12 schools and colleges/universities as well as opening K-12 schools with in-person learning mode is associated with an increase in the growth rates of cases with 2 weeks lag when schools implement no mask mandate for staff.
In column (1), the estimated coefficient of per-device visits to colleges is 0.14 (SE = 0.07) while that of per-device visits to K-12 schools is 0.47 (SE = 0.07). The changes in top 5 percentile values of per-device visits to colleges/universities and K-12 schools between June and September among counties are around 0.1 and 0.15, respectively, in SI Appendix, Fig. S4(d)(e). Taking these values as a benchmark for full openings, fully opening colleges/universities may be associated with (0.14 × ×
Footnote: Our regression analysis uses 2788 counties for specification with K-12 school opening with different teaching modes while the sample contains 2204 counties for specification with mask requirements for staff.
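As a worked illustration of the benchmark calculation for column (1), one can multiply each estimated coefficient by the corresponding benchmark change in per-device visits quoted above (0.1 for colleges/universities and 0.15 for K-12 schools). Reading a change of 0.01 in the log-difference outcome as roughly one percentage point of case growth is an assumption here, chosen to match the way the abstract reports the in-person coefficient.

```latex
\begin{align*}
\text{colleges/universities:}\quad & 0.14 \times 0.1 = 0.014 \;\approx\; 1.4 \text{ percentage points},\\
\text{K-12 schools:}\quad & 0.47 \times 0.15 = 0.0705 \;\approx\; 7 \text{ percentage points}.
\end{align*}
```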
Consistent with evidence from U.S. state-level panel data analysis in Chernozhukov, Kasahara, and Schrimpf (2021), the estimated coefficients of county-wide mask mandate policy are negative and significant in columns (1)-(4), suggesting that mandating masks reduces case growth. The estimated coefficients of bans on gatherings and stay-at-home orders are also negative. The negatively estimated coefficients of the log of past weekly cases are consistent with a hypothesis that the information on higher transmission risk induces people to take precautionary actions voluntarily to reduce case growth. The table also highlights the importance of controlling for the test growth rates as a confounder.
Evidence on the role of schools in the spread of COVID-19 from other studies is mixed. Papers that focus on contact tracing of cases among students find limited spread from student infections (Zimmerman et al., 2021; Brandal et al., 2021; Ismail et al., 2020; Gillespie et al., 2021; Falk et al., 2021; Willeit et al., 2021). There is also some evidence that school openings are associated with increased cases in the surrounding community. Bignami et al. (2021) provide suggestive evidence that school openings are associated with increased cases in Montreal neighborhoods. Auger et al. (2020) use US state-level data to argue that school closures at the start of the pandemic substantially reduced.
Two closely related papers also examine the relationship between schools and county-level COVID-19 outcomes in the US. Goldhaber et al. (2021) examine the relationship between schooling and cases in counties in Washington and Michigan. They find that in-person schooling is only associated with increased cases in areas with high pre-existing COVID-19 cases.
Similarly, Harris, Ziedan, and Hassig (2021) analyze US county-level data on COVID-19 hospitalizations and find that in-person schooling is not associated with increased hospitalizations in counties with low pre-existing COVID-19 hospitalization rates.
As discussed in SI Appendix, our regression specification is motivated by a SIRD model, and the dependent variable in our analysis is case growth rates instead of new cases or hospitalizations. Consistent with Goldhaber et al. (2021) and Harris, Ziedan, and Hassig (2021), our finding of a constant increase in growth rates implies a greater increase in cases in counties with more pre-existing cases.
We next provide sensitivity analysis with respect to changes to our regression specification and assumption about delays between infection and reporting cases as follows:
(1) Baseline specifications in columns (1) and (2) of Table 1.
(2),(3) Alternative time lags of 10 and 18 days for visits to colleges and K-12 schools as well as NPIs.
(4) Setting the log of weekly cases to 0 when we observe zero weekly cases to compute the log-difference in weekly cases for outcome variable.
(5) Add the log of weekly cases lagged by 5 weeks and per-capita cumulative number of cases lagged by 2 weeks as controls.
(6) Add per-device visits to restaurants, bars, recreational places, and churches lagged by 2 and 4 weeks as controls.
(7) Add per-device visits to full-time and part-time workplaces and a proportion of devices staying at home lagged by 2 weeks as controls.
(8) All of (5)-(7).
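Specifications (2)-(4) vary two construction choices: the lag applied to the regressors and the value substituted for the log of weekly cases when a county reports zero cases in a week. The Python sketch below shows one way these pieces could be built from daily series. It is a hypothetical illustration, not the authors' code: the function names are invented, daily series are plain lists indexed by day, and `zero_log` is left as a required argument because only the value used in specification (4), namely 0, is stated in this text.

```python
import math


def weekly_total(daily, t):
    """Sum of the 7 daily values ending on day index t (inclusive)."""
    if t < 6:
        raise ValueError("need at least 7 days of history")
    return sum(daily[t - 6 : t + 1])


def log_weekly_cases(daily_cases, t, zero_log):
    """Log of weekly cases, with zero_log substituted when the week has no cases."""
    total = weekly_total(daily_cases, t)
    return math.log(total) if total > 0 else zero_log


def case_growth(daily_cases, t, zero_log, span=14):
    """Log-difference in weekly cases between day t and day t - span."""
    return (log_weekly_cases(daily_cases, t, zero_log)
            - log_weekly_cases(daily_cases, t - span, zero_log))


def lagged_moving_average(daily, t, lag=14):
    """7-day moving average of a daily regressor, lagged by `lag` days."""
    return weekly_total(daily, t - lag) / 7.0


# Toy series: 7 cases in week 1, none in week 2, 14 in week 3.
cases = [1] * 7 + [0] * 7 + [2] * 7
visits = list(range(21))

growth_2w = case_growth(cases, 20, zero_log=0.0)          # log(14) - log(7)
growth_1w = case_growth(cases, 20, zero_log=0.0, span=7)  # log(14) - zero_log
visits_lag10 = lagged_moving_average(visits, 20, lag=10)  # mean of days 4..10
```

Changing `lag` to 10, 14, or 18 mirrors the timing assumptions compared in specifications (1)-(3), and changing `zero_log` mirrors the comparison in specification (4).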
Table 1.
The Association of School/College Openings and NPIs with Case Growth in the United States: Debiased Estimator
Dependent variable: Case Growth Rates
(1) (2) (3) (4)
College Visits, 14d lag 0.139 ∗ ∗∗ ∗∗∗ ∗∗∗ (0.070) (0.070)
K-12 Visits × No-Mask 0.297 ∗∗∗ (0.070)
K-12 In-person, 14d lag 0.047 ∗∗∗ − − ∗∗∗ (0.014) (0.013)
K-12 Remote, 14d lag − ∗∗∗ − ∗∗∗ (0.016) (0.015)
K-12 In-person × No-Mask 0.041 ∗∗ (0.019)
K-12 Hybrid × No-Mask 0.049 ∗∗∗ (0.017)
Mandatory mask, 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ (0.018) (0.017) (0.020) (0.019)
Ban gatherings, 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ (0.033) (0.044) (0.033) (0.042)
Stay at home, 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ (0.031) (0.039) (0.034) (0.040)
log(Cases), 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ (0.009) (0.010) (0.010) (0.010)
log(Cases), 21d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ (0.005) (0.005) (0.005) (0.005)
log(Cases), 28d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ (0.003) (0.003) (0.004) (0.004)
Test Growth Rates 0.009 ∗∗ ∗ ∗∗ ∗∗ (0.004) (0.004) (0.004) (0.004)
County Dummies Yes Yes Yes Yes
State × Week Dummies Yes Yes Yes Yes
Observations 690,297 545,131 612,963 528,941
R
Notes: Dependent variable is the log difference in weekly positive cases across 2 weeks. Regressors are 7-day moving averages of corresponding daily variables and lagged by 2 weeks to reflect the time between infection and case reporting except that we don’t take any lag for the log difference in test growth rates. All regression specifications include county fixed effects and state-week fixed effects to control for any unobserved county-level factors and time-varying state-level factors such as various state-level policies. The debiased fixed effects estimator is applied. The results from the estimator without bias correction are presented in SI Appendix, Table S1. Asymptotic clustered standard errors at the state level are reported in brackets. ∗ p < ∗∗ p < ∗∗∗ p <
Because the actual time lag between infection and reporting cases may be shorter or longer than 14 days, we consider the alternative time lags in (2) and (3).
Specification (4) checks the sensitivity of handling zero weekly cases to construct the outcome variable of the log difference in weekly cases.
A major concern for interpreting our estimate in Table 1 as the causal effect is that a choice of opening timing, teaching methods, and mask requirements may be endogenous. Our baseline specification mitigates this concern by controlling for county fixed effects, state-week fixed effects, and the log of past cases, but a choice of school openings may still be correlated with time-varying unobserved factors at the county level. Therefore, we estimate a specification with additional time-varying county-level controls in (5)-(8).
Fig. 3(a) takes column (1) of Table 1 as a baseline specification and plots the estimated coefficients for visits to colleges and K-12 schools with the 90 percent confidence intervals across different specifications using the debiased estimator; the estimates using the standard estimator without bias correction are qualitatively similar and reported in SI Appendix, Fig. S3. The estimated coefficients of K-12 school visits and college visits are all positive across different specifications, suggesting that an increase in visits to K-12 schools and colleges is robustly associated with an increase in case growth. On the other hand, the estimated coefficients often become smaller when we add more controls. In particular, relative to the baseline, adding full-time/part-time workplace visits and staying home devices leads to somewhat smaller estimated coefficients for both K-12 school and college visits, suggesting that opening schools and colleges is associated with people returning to work and/or going outside more frequently.
In Fig.
3(b), the estimated coefficients on the interaction term of K-12 school visits and no mask-wearing requirements for staff in column (2) of Table 1 are all positive and significant, robustly indicating a possibility that mask-wearing requirements for staff may have helped to reduce the transmission of SARS-CoV-2 at schools when K-12 schools opened with the in-person teaching method.
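For readers who want to reproduce the kind of 90 percent interval plotted in Fig. 3, the following Python sketch computes a normal-approximation interval from a point estimate and its standard error, using the rounded column (1) values quoted in the text. The function name is hypothetical, and the use of a standard normal critical value is an assumption; the intervals in the figure may be constructed differently.

```python
from statistics import NormalDist


def confidence_interval(estimate, se, level=0.90):
    """Two-sided normal-approximation interval: estimate +/- z * se."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.645 for level=0.90
    return estimate - z * se, estimate + z * se


# Rounded column (1) estimates quoted in the Results text.
k12_low, k12_high = confidence_interval(0.47, 0.07)
college_low, college_high = confidence_interval(0.14, 0.07)
```

Under this approximation the interval for K-12 school visits runs from roughly 0.35 to 0.59 and the one for college visits from roughly 0.02 to 0.26, so both exclude zero at the 90 percent level.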
Association between School Openings and Mobility.
As highlighted by a modeling study for the United Kingdom (Panovska-Griffiths et al., 2020), there are at least two reasons why opening K-12 schools in-person may increase the spread of COVID-19. First, opening K-12 schools increases the number of contacts within schools, which may increase the risk of transmission among children, parents, education workers, and communities at large. Second, reopening K-12 schools allows parents to return to work and increase their mobility in general, which may contribute to the transmission of COVID-19 at schools and workplaces.
To give insight on the role of reopening K-12 schools for parents to return to work and to increase their mobility, we conduct panel data regression analysis by taking visits to full-time workplaces and a measure of staying home devices as outcome variables and use a similar set of regressors as in Table 1 but without taking 2-week time lags.
Table 2(a) shows how the proportion of devices at full-time workplaces and that of staying home devices are associated with visits to K-12 schools as well as their in-person openings. In columns (1) and (2), the estimated coefficients of per-device K-12 school visits and opening K-12 schools for full-time work outcome variables are positive and especially large for in-person K-12 school opening. Similarly, the estimates in columns (3) and (4) suggest the negative association of per-device K-12 school visits and opening K-12 schools with the proportion of devices that do not leave their home. This is consistent with a hypothesis that opening K-12 schools allows parents to return to work and spend more time outside. This result may also reflect education workers returning to work.
Table 3 presents regression analysis similar to that in Table 1 but including the proportion of devices at full-time/part-time workplaces and those at home as additional regressors, which corresponds to specification (7) in Fig. 3. The estimates indicate that the proportion of staying home devices is negatively associated with the subsequent case growth while the proportion of devices at full-time workplaces is positively associated with the case growth. Combined with the estimates in Table 2(a), these results suggest that school openings may have increased the transmission of SARS-CoV-2 by encouraging parents to return to work and to spend more time outside. This mechanism can partially explain the discrepancy between our findings and various studies that focus on cases among students. Contact tracing of cases in schools, such as Falk et al. (2021), Zimmerman et al. (2021), Willeit et al. (2021), Brandal et al. (2021), and Ismail et al. (2020), often finds limited direct spread among students. On the other hand, Vlachos, Hertegård, and B.
Svaleryd (2021) find that parents and teachers of students in open schools experience increases in infection rates.
In columns (1)-(2) of Table 3, the estimated coefficients on K-12 school visits remain positive and large in magnitude even after controlling for the mobility measures of returning to work and being outside home, which are mediator variables to capture the indirect effect of school openings on case growth through its effect on mobility. The coefficients on K-12 school visits are approximately 75% as large in Table 3 as in Table 1. This suggests that within-school transmission may be the primary channel through which school openings affect the spread of COVID-19.
One likely reason why college openings may increase cases is that students go out to bars (KA et al., 2020; Chang et al., 2021), where properly wearing masks and practicing social distancing are difficult. Table 2(b) presents how visits to restaurants and bars are associated with visits to colleges/universities from panel regressions using per-device visits to restaurants and bars as outcome variables. These results indicate that bar visits are positively associated with college visits, consistent with a hypothesis that the transmission of SARS-CoV-2 may be partly driven by an increase in visits to bars by students.
Death Growth Regression.
Many county-day observations report zero weekly deaths in our data set (SI Appendix, Table S4 and Fig. S4(4)). We approximate the weekly death growth rate by the log difference in weekly deaths, where the log of weekly deaths is replaced with −
death growth regression, we use the sub-sample of larger counties by dropping 10 percent of the smallest counties in terms of their population size for which zero weekly deaths occur more frequently.
Fig. 4 illustrates the estimated coefficients of visits to colleges and K-12 schools across different specifications for death growth regressions. SI Appendix, Table S3 presents the estimates of death growth regression under baseline specification with a time lag of 21 days.
Fig. 4(a) shows that the coefficients of visits to colleges and K-12 schools are positively estimated for (1) baseline, (3) an alternative time lag of 35 days, (4) an alternative measure of death growth, and adding more controls in (5)-(8), providing evidence that an increase in visits to colleges and K-12 schools is positively associated with the subsequent increase in weekly death growth rates. The magnitude of the estimated coefficient of K-12 school visits becomes smaller when the time lag is set to 28 days in (2). Fig. 4(b) shows that the association of K-12 school visits with death growth is stronger when no mask mandate for staff is in place.
Limitations.
Our study has the following limitations. First, our study is observational and therefore should be interpreted with great caution. It only has a causal interpretation in a structural model under exogeneity assumptions that might not hold in reality (see the Model and Method in SI Appendix). While we present sensitivity analysis with a variety of controls including county dummies and interactions of state dummies and week dummies, the decisions to open K-12 schools and colleges/universities may be endogenous and correlated with other unobserved time-varying county-level factors that affect the spread of COVID-19. For example, people’s attitudes toward social distancing, hand-washing, and mask-wearing may change over time (which we are not able to observe in the data) and their changes may be correlated with school opening decisions beyond the controls we added to our regression specifications.
Our analysis is also limited by the quality and the availability of the data as follows. The reported number of cases is likely to understate true COVID-19 incidence, especially among children and adolescents because they are less likely to be tested than adults given that children exhibit milder or no symptoms. County-level testing data is not used because of a lack of data although state-week fixed effects control for the weekly difference across counties within the same state and we also control for daily state-level test growth rates.
Because foot traffic data is constructed from mobile phone location data, the data on K-12 school visits likely reflects the movements of parents and older children who are allowed to carry mobile phones to schools and excludes those of younger children who do not own mobile phones.
Because COVID-infected children and adolescents are known to be less likely to be hospitalized or die from COVID, the consequence of transmission among children and adolescents driven by school openings crucially depends on whether the transmission of SARS-CoV-2 from infected children and adolescents to the older population can be prevented. Our analysis does not provide any empirical analysis on how school opening is associated with the transmission across different age groups due to data limitations. Vlachos, Hertegård, and B. Svaleryd (2021) show that teachers in open schools experience higher COVID-19 infection rates compared to teachers in closed schools. They also show that this increase in infection rate also occurs in partners of teachers and parents of students in open schools, albeit to a lesser degree.
The impact of school openings on case growth may be different across counties and over time because it may depend not only on in-school mitigation measures but also on contact tracing, testing strategies, and the prevalence of community transmissions (Goldhaber-Fiebert, Studdert, and Mello, 2020; Ziauddeen et al., 2020). We do not investigate how the association between school openings and case growth depends on contact tracing and testing strategies at the county level.
The result on the association between school opening and death growth in Fig. 4 is suggestive but must be viewed with caution because the magnitude of the estimated coefficient of K-12 school visits is sensitive to the assumption on the time lag from infection to death reporting. The time lag between infection and death is stochastic and spreads over time, making it difficult to uncover the relationship between the timing of school openings and subsequent deaths. Furthermore, while we provide sensitivity analysis for how to handle zero weekly deaths to approximate death growth, our construction of the death growth outcome variable remains somewhat arbitrary.
Finally, our result does not necessarily imply that K-12 schools should be closed. Closing schools has negative impacts on children’s learning and may cause declining mental health among children. The decision to open or close K-12 schools requires careful assessments of the cost and the benefit.
Footnote: The time lag of 21 days is taken as a baseline to take into account the time lag of infection and death reporting but we also report the estimates for the time lag of 28 and 35 days in specifications (2) and (3). These choices of time lags are motivated by the numbers reported in Table 2 of . For the age group above 65, the days from exposure to onset range up to 6 days; the interquartile range of days from symptom onset to death is given by 8 and 21 days; the interquartile range of days from death to reporting is 5 and 44 days.
Footnote: This is consistent with CDC data which shows the lower testing volume and the higher rate of positive tests among children and adolescents than adults (Leidman et al., 2021).
Footnote: We also focus on limited Points-Of-Interest: K-12 schools, colleges and universities, restaurants, drinking places, other recreational places including gyms, and churches. We check the robustness by including visits to assisted living facilities for the elderly as well as nursing care facilities as additional controls but the results are not sensitive to their inclusion.
Footnote: In the meta-analysis of 54 studies on the household transmission of SARS-CoV-2 (Madewell et al., 2020), the estimated household secondary attack rate to child contacts was 16.8%. Miyahara et al. (2021) report that the household secondary attack rate from children and adolescents to other family members was 23.8% and higher than other age groups in Japan.
Footnote: CDC collects the data on the number of reported cases by age groups from each state whenever such data is available. However, for many counties, the reported cases by age groups are missing or there exists a substantial gap between the sum of cases across different age groups reported by CDC and the total number of cases reported in NYT case data (see, for example, the case of Ingham, MI, in SI Appendix, Fig. S2).
Figure 3. Sensitivity analysis for the estimated coefficients of K-12 visits and college visits of case growth regressions: Debiased Estimator

[Figure: coefficient plots with confidence intervals. Panel (a), Case Growth Estimates, shows School Visits and College Visits. Panel (b), Case Growth Estimates with School Visits × No Mask, shows School Visits × No-Mask, School Visits, and College Visits. Horizontal axis: Estimated Coefficients. Models: (1) Baseline; (2) Lag = 10; (3) Lag = 18; (4) Alt. Case Growth; (5) Past Cases; (6) Bars etc.; (7) Fulltime + Home; (8) All of (5)-(7).]

Notes: (a) presents the estimates of college visits and K-12 school visits with the 90 percent confidence intervals across different specifications, taking column (1) of Table 1 as baseline. (b) presents the estimates of college visits, K-12 school visits, and the interaction between K-12 school visits and no mask-wearing requirement for staff, taking column (2) of Table 1 as baseline. The results are based on the debiased estimator. SI Appendix, Fig. S3 presents the results based on the estimator without bias correction.
Figure 4. Sensitivity analysis for the estimated coefficients of K-12 visits and college visits of death growth regressions: Debiased Estimator

[Figure: coefficient plots with confidence intervals. Panel (a), Death Growth Estimates, shows School Visits and College Visits. Panel (b), Death Growth Estimates with School Visits × No Mask, shows School Visits × No-Mask, School Visits, and College Visits. Horizontal axis: Estimated Coefficients. Models: (1) Baseline, Lag = 21; (2) Lag = 28; (3) Lag = 35; (4) Alt. Death Growth; (5) Past Deaths; (6) Bars etc.; (7) Fulltime + Home; (8) All of (5)-(7).]

Notes: (a) presents the estimates of college visits and K-12 school visits with the 90 percent confidence intervals across different specifications, taking column (1) of SI Appendix, Table S3 as baseline. (b) presents the estimates of college visits, K-12 school visits, and the interaction between K-12 school visits and no mask-wearing requirement for staff, taking column (2) of SI Appendix, Table S3 as baseline.
Materials and Methods
Data.
Cases and deaths for each county are obtained from the New York Times. SafeGraph provides foot traffic data based on a panel of GPS pings from anonymous mobile devices. Per-device visits to K-12 schools, colleges/universities, restaurants, bars, recreational places, and churches are constructed as the ratio of daily device visits to these point-of-interest locations to the number of devices residing in each county. Full-time and part-time workplace visits are the ratio of the number of devices that spent more than 6 hours and between 3 and 6 hours, respectively, at one location other than one’s home location to the total number of device counts. The staying home device variable is the ratio of the number of devices that do not leave home locations to the total number of device counts.

MCH Strategy Data provides information on the date of school openings with different teaching methods (in-person, hybrid, and remote) as well as mitigation strategies at 14,703 school districts. We link school district-level MCH data to county-level data from NYT and SafeGraph using the file for School Districts and Associated Counties at the US Census Bureau. School district data is aggregated up to the county level using the enrollment of students at the district level. Specifically, we construct the proportion of students with different teaching methods for each county-day observation using the district-level information on school opening dates and teaching methods. We also construct a county-level dummy variable of “No mask requirement for staff”, which takes a value of 1 if there exists at least one school district without any mask requirement for staff and 0 otherwise. Our regressors are 7-day moving averages of these variables. A substantial fraction of school districts report “unknown” or “pending” for teaching methods and mask requirements. We drop county observations for which more than 50 percent of students attend school districts that report unknown or pending for teaching methods or mask requirements when these variables are included in regressors.

NPI data on stay-at-home orders and gathering bans are from Killeen et al. (2020), while the data on mask policies are from Wright et al. (2020). These NPI data contain information up to the end of July; in our regression analysis, we set the value of these policy variables after August to be the same as the value on the last day of observations. Cases by age groups for Fig. 2 are from CDC. SI Appendix, Tables S5-S6 present summary statistics and the correlation matrix. SI Appendix, Fig. S4 presents the evolution of percentiles of these variables over time.
Methods.
Our research design closely follows Chernozhukov, Kasahara, and Schrimpf (2021). Fig. 5 is a causal path diagram for our model that describes how policies, behavior, and information interact together:

• The forward health outcome, $Y_{i,t+\ell}$, is determined last after all other variables have been determined;
• The policies, $P_{it}$, affect health outcome $Y_{i,t+\ell}$ either directly, or indirectly by altering human behavior $B_{it}$, which may be only partially observed;
• Information variables, $I_{it}$, such as lagged values of outcomes can affect human behavior and policies, as well as outcomes;
• The confounders $W_{it}$, which vary across counties and time, affect all other variables; these include unobserved but estimable county, time, state, state-week effects.

Figure 5. The causal path diagram for our model
[Figure: directed graph with nodes $P_{it}$, $I_{it}$, $Y_{i,t+\ell}$, $B_{it}$, and $W_{it}$.]

The index $i$ denotes the county, and $t$ and $t+\ell$ denote the time, where $\ell$ represents the time lag between infection and case confirmation or death. Our health outcomes are the growth rates in Covid-19 cases and deaths; policy variables include school reopening in various modes, mask mandates, gathering bans, and stay-at-home orders; and the information variables include lagged values of the outcome (as well as other variables described in the sensitivity analysis).

The causal structure allows for the effect of the policy to be either direct or indirect. For example, school openings not only directly affect case growth through within-school transmission but also indirectly affect case growth by increasing parents’ mobility. The structure also allows for changes in behavior to be brought about by changes in policies and information. The information variables, such as the number of past cases, can cause people to spend more time at home, regardless of adopted policies; these changes in behavior, in turn, affect the transmission of SARS-CoV-2.

Our measurement equation will take the form:

$$\Delta \log(\Delta C_{it}) = X_{i,t-14}'\theta + \delta_T \Delta \log(T_{it}) + \epsilon_{it},$$

where $i$ is county, $t$ is day, $\Delta C_{it}$ is weekly confirmed cases over 7 days, $T_{it}$ is the number of tests over 7 days, $\Delta$ is a 7-day differencing operator, and $\epsilon_{it}$ is an unobserved error term. $X_{i,t-14}$ collects other behavioral, policy, and confounding variables, where the lag of 14 days captures the time lag between infection and confirmed case (see MIDAS (2020)).
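As a small illustration of the dependent variable in the measurement equation, the following Python sketch computes $\Delta \log(\Delta C_{it})$ from a cumulative case series. The function names and the toy series are hypothetical, and returning `None` when a week has no new cases is an assumption made only for this sketch; it is not a description of how the paper treats zero counts.

```python
import math

def weekly_new_cases(cumulative, t):
    """Delta C_t: new confirmed cases over the 7 days ending at day t."""
    return cumulative[t] - cumulative[t - 7]

def case_growth(cumulative, t):
    """Delta log(Delta C_t) = log(Delta C_t) - log(Delta C_{t-7}).

    Returns None if either week has no new cases (sketch-only convention).
    """
    now = weekly_new_cases(cumulative, t)
    before = weekly_new_cases(cumulative, t - 7)
    if now <= 0 or before <= 0:
        return None
    return math.log(now) - math.log(before)

# Hypothetical cumulative counts: 10 new cases per day for one week,
# then 20 new cases per day for the next week.
cumulative = [0]
for day in range(1, 15):
    cumulative.append(cumulative[-1] + (10 if day <= 7 else 20))

# Weekly cases double from 70 to 140, so the log growth is log(2).
growth = case_growth(cumulative, 14)
```

In the regression, this outcome at day $t$ is paired with regressors dated $t-14$.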
In SI Appendix, we relate this specification to the SIRD model.

The main regressors of interest are the visits to K-12 schools and colleges/universities as well as the K-12 school opening variables with different teaching methods, together with their interactions with mask requirements for staff. As confounders, $X_{i,t-14}$ includes a set of county dummies and a set of all interaction terms between state dummies and week dummies. We also consider 2, 3, and 4 weeks lagged log values of weekly cases as well as three NPI policy variables. The growth rate of tests, $\Delta \log(T_{it})$, is captured by the observed growth rate of tests at the state level as well as interaction terms between state dummy variables and week dummy variables. The standard errors are clustered at the state level, on the rationale that county-level stochastic shocks may be correlated across counties, especially within a state.

Our specification effectively contains lagged dependent variables among the regressors because the log of past weekly cases with different lag lengths can be transformed into the log-differences of past weekly cases. Our model is therefore a dynamic panel regression model in which the fixed effects estimator with a set of county dummies may suffer from the Nickell bias (Nickell, 1981). To eliminate the bias, we construct an estimator with bias correction as follows.

Given our panel data with sample size $(N, T)$, denote the set of counties by $\mathcal{N} = \{1, 2, \ldots, N\}$. We randomly and repeatedly partition $\mathcal{N}$ into two sets $\mathcal{N}^j_1$ and $\mathcal{N}^j_2 = \mathcal{N} \setminus \mathcal{N}^j_1$ for $j = 1, 2, \ldots, J$, where $\mathcal{N}^j_1$ and $\mathcal{N}^j_2$ contain (approximately) the same number of counties. For each $j = 1, \ldots, J$, consider two sub-panels (where $i$ stands for county and $t$ stands for the day) defined by $S^j_1 = S^j_{11} \cup S^j_{22}$ and $S^j_2 = S^j_{12} \cup S^j_{21}$ with $S^j_{1k} = \{(i,t) : i \in \mathcal{N}^j_k,\ t \le \lceil T/2 \rceil\}$ and $S^j_{2k} = \{(i,t) : i \in \mathcal{N}^j_k,\ t \ge \lfloor T/2 \rfloor\}$ for $k = 1, 2$, where $\lceil \cdot \rceil$ and $\lfloor \cdot \rfloor$ are the ceiling and floor functions. We form the estimator with bias correction as

$$\hat\beta_{BC} := \hat\beta - \underbrace{(\tilde\beta - \hat\beta)}_{\text{bias estimator}} = 2\hat\beta - \tilde\beta \quad \text{with} \quad \tilde\beta := \frac{1}{J} \sum_{j=1}^{J} \tilde\beta_{S^j_1 \cup S^j_2},$$

where $\hat\beta$ is the standard estimator with a set of $N$ county dummies, while $\tilde\beta_{S^j_1 \cup S^j_2}$ denotes the estimator that uses the data set $S^j_1 \cup S^j_2$ but treats the counties in $S^j_1$ differently from those in $S^j_2$; namely, we include approximately $2N$ county dummies to compute $\tilde\beta_{S^j_1 \cup S^j_2}$. We choose $J = 2$ in our empirical analysis (for some specifications, we also experimented with $J = 5$ and obtained results similar to those with $J = 2$). We report asymptotic standard errors with state-level clustering, justified by the standard asymptotic theory of bias-corrected estimators.

Table 2.
The Association of School/College Openings with Mobility in the United States: Debiased Estimator

(a) Full-time Workplace Visits and Staying Home Devices
Dependent variable: Full Time (1); Full Time (2); Stay Home (3); Stay Home (4)
College Visits: − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ (0.004) (0.006) (0.024) (0.026)
K-12 School Visits: 0.078∗∗∗ − ∗∗ (0.006) (0.026)
Open K-12 In-person: 0.999∗∗∗ − ∗∗∗ (0.125) (0.382)
Open K-12 Hybrid: 0.509∗∗∗ ∗∗∗
Note: ∗ p < ∗∗ p < ∗∗∗ p <

(b) Visits to Restaurants and Bars
Dependent variable: Restaurants (1); Restaurants (2); Bars (3); Bars (4)
College Visits: 0.064 0.034 0.016∗∗∗ ∗∗ (0.053) (0.051) (0.006) (0.005)
K-12 School Visits: 0.006 0.008 (0.046) (0.006)
Open K-12 In-person: − ∗∗∗ − ∗∗∗ (0.404) (0.041)
Open K-12 Hybrid: − ∗∗∗ − ∗∗∗ (0.272) (0.038)
Open K-12 Remote: − ∗

Notes: All regression specifications include county fixed effects, state-week fixed effects, three NPI variables, and the log of cases without lag and lagged by 1 and 2 weeks. See SI Appendix, Table S1 for the estimated coefficients for NPIs and the log of current and past cases. The debiased estimator is used. Clustered standard errors at the state level are reported in parentheses. SI Appendix, Table S2 reports the estimates for NPIs and past cases. ∗ p < ∗∗ p < ∗∗∗ p <

Table 3.
The Association of School/College Openings, Full-time/Part-time Work, and Staying Home with Case Growth in the United States: Debiased Estimator

Dependent variable: Case Growth Rates, columns (1) (2) (3) (4)
College Visits, 14d lag: 0.060 0.012 0.114∗ ∗∗∗ ∗∗∗ (0.075) (0.087)
K-12 Visits × No-Mask: 0.287∗∗∗ (0.071)
K-12 In-person, 14d lag: 0.015 − − ∗∗ − ∗∗∗ (0.013) (0.013)
K-12 Remote, 14d lag: − ∗∗∗ − ∗∗∗ (0.015) (0.014)
K-12 In-person × No-Mask: 0.034∗ (0.020)
K-12 Hybrid × No-Mask: 0.043∗∗∗ (0.017)
Full-time Work Device, 14d lag: − ∗∗ ∗∗ (0.417) (0.490) (0.384) (0.436)
Part-time Work Device, 14d lag: 0.262 0.466 0.820∗∗∗ ∗∗∗ (0.259) (0.305) (0.276) (0.309)
Staying Home Device, 14d lag: − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ (0.057) (0.069) (0.061) (0.067)
Observations: 690,297 545,131 612,963 528,941
R

Notes: Dependent variable is the log difference in weekly positive cases across 2 weeks. All regression specifications include county fixed effects and state-week fixed effects, three NPIs, and 2, 3, and 4 weeks lagged log of cases. See SI Appendix, Table S3 for the estimated coefficients for NPIs and the log of current and past cases. The debiased fixed effects estimator is applied. Asymptotic clustered standard errors at the state level are reported in parentheses. ∗ p < ∗∗ p < ∗∗∗ p <

References
Auger, Katherine A., Samir S. Shah, Troy Richardson, David Hartley, Matthew Hall, Amanda Warniment, Kristen Timmons, Dianna Bosse, Sarah A. Ferris, Patrick W. Brady, Amanda C. Schondelmeyer, and Joanna E. Thomson. 2020. “Association Between Statewide School Closure and COVID-19 Incidence and Mortality in the US.” JAMA 324 (9):859–870. URL https://doi.org/10.1001/jama.2020.14348.

Bignami, Simona, Yacine Boujija, John Sandberg, and Olivier Drouin. 2021. “Enfants, écoles et COVID-19: le cas montréalais.”

Brandal, Lin T, Trine S Ofitserova, Hinta Meijerink, Rikard Rykkvin, Hilde M Lund, Olav Hungnes, Margrethe Greve-Isdahl, Karoline Bragstad, Karin Nygård, and Brita A Winje. 2021. “Minimal transmission of SARS-CoV-2 from paediatric COVID-19 cases in primary schools, Norway, August to November 2020.” Euro Surveill. URL https://doi.org/10.2807/1560-7917.ES.2020.26.1.2002011.

Chang, Serina, Emma Pierson, Pang Wei Koh, Jaline Gerardin, Beth Redbird, David Grusky, and Jure Leskovec. 2021. “Mobility network models of COVID-19 explain inequities and inform reopening.” Nature 589 (7840):82–87. URL https://doi.org/10.1038/s41586-020-2923-3.

Chen, Shuowen, Victor Chernozhukov, and Iván Fernández-Val. 2019. “Mastering panel metrics: causal impact of democracy on growth.” In AEA Papers and Proceedings, vol. 109. 77–82.

Chen, Shuowen, Victor Chernozhukov, Iván Fernández-Val, Hiroyuki Kasahara, and Paul Schrimpf. 2020. “Cross-Over Jackknife Bias Correction for Non-Stationary Nonlinear Panel Data.”

Chernozhukov, Victor, Hiroyuki Kasahara, and Paul Schrimpf. 2021. “Causal impact of masks, policies, behavior on early covid-19 pandemic in the U.S.” Journal of Econometrics 220 (1):23–62.

Falk, A, A Benda, P Falk, S Steffen, Z Wallace, and TB Høeg. 2021. “COVID-19 Cases and Transmission in 17 K–12 Schools — Wood County, Wisconsin, August 31–November 29, 2020.” Morbidity and Mortality Weekly Report. URL http://dx.doi.org/10.15585/mmwr.mm7004e3.

Fisher, KA, MW Tenforde, LR Feldstein, Christopher J. Lindsell, Nathan I. Shapiro, D. Clark Files, Kevin W. Gibbs, Heidi L. Erickson, Matthew E. Prekker, Jay S. Steingrub, Matthew C. Exline, Daniel J. Henning, Jennifer G. Wilson, Samuel M. Brown, Ithan D. Peltan, Todd W. Rice, David N. Hager, Adit A. Ginde, H. Keipp Talbot, Jonathan D. Casey, Carlos G. Grijalva, Brendan Flannery, Manish M. Patel, and Wesley H. Self. 2020. “Community and Close Contact Exposures Associated with COVID-19 Among Symptomatic Adults ≥18 Years in 11 Outpatient Health Care Facilities — United States, July 2020.” MMWR Morb Mortal Wkly Rep. URL http://dx.doi.org/10.15585/mmwr.mm6936a5.

Gillespie, Darria Long, Lauren Ancel Meyers, Michael Lachmann, Stephen C Redd, and Jonathan M Zenilman. 2021. “The Experience of Two Independent Schools with In-Person Learning During the COVID-19 Pandemic.” medRxiv.

Goldhaber, Dan, Scott A Imberman, Katharine O Strunk, Bryant Hopkins, Nate Brown, Erica Harbatkin, and Tara Kilbride. 2021. “To What Extent Does In-Person Schooling Contribute to the Spread of COVID-19? Evidence from Michigan and Washington.” Working Paper 28455, National Bureau of Economic Research.

Goldhaber-Fiebert, Jeremy D., David M. Studdert, and Michelle M. Mello. 2020. “School Reopenings and the Community During the COVID-19 Pandemic.” JAMA Health Forum. URL https://doi.org/10.1001/jamahealthforum.2020.1294.

Hahn, Jinyong and Whitney Newey. 2004. “Jackknife and Analytical Bias Reduction for Nonlinear Panel Models.” Econometrica 72 (4):1295–1319. URL https://EconPapers.repec.org/RePEc:ecm:emetrp:v:72:y:2004:i:4:p:1295-1319.

Harris, Douglas N., Engy Ziedan, and Susan Hassig. 2021. “The Effects of School Reopenings on COVID-19 Hospitalizations.” Tech. rep.

Hobbs, Charlotte V., Lora M. Martin, Sara S. Kim, Brian M. Kirmse, Lisa Haynie, Sarah McGraw, Paul Byers, Kathryn G. Taylor, Manish M. Patel, Brendan Flannery, and CDC COVID-19 Response Team. 2020. “Factors Associated with Positive SARS-CoV-2 Test Results in Outpatient Health Facilities and Emergency Departments Among Children and Adolescents Aged <18 Years - Mississippi, September-November 2020.” MMWR. Morbidity and Mortality Weekly Report 69 (50):1925–1929. URL https://pubmed.ncbi.nlm.nih.gov/33332298.

Ismail, Sharif A., Vanessa Saliba, Jamie Lopez Bernal, Mary E. Ramsay, and Shamez N. Ladhani. 2020. “SARS-CoV-2 infection and transmission in educational settings: a prospective, cross-sectional analysis of infection clusters and outbreaks in England.” The Lancet Infectious Diseases. URL https://doi.org/10.1016/S1473-3099(20)30882-3.

Killeen, Benjamin D., Jie Ying Wu, Kinjal Shah, Anna Zapaishchykova, Philipp Nikutta, Aniruddha Tamhane, Shreya Chakraborty, Jinchi Wei, Tiger Gao, Mareike Thies, and Mathias Unberath. 2020. “A County-Level Dataset for Informing the United States’ Response to COVID-19.”

Leidman, Eva, Lindsey M. Duca, John D. Omura, Krista Proia, James W. Stephens, and Erin K. Sauber-Schatz. 2021. “COVID-19 Trends Among Persons Aged 0–24 Years — United States, March 1–December 12, 2020.” MMWR Morb Mortal Wkly Rep 70. URL http://dx.doi.org/10.15585/mmwr.mm7003e1.

Madewell, Zachary J., Yang Yang, Ira M. Longini Jr., M. Elizabeth Halloran, and Natalie E. Dean. 2020. “Household Transmission of SARS-CoV-2: A Systematic Review and Meta-analysis.” JAMA Network Open. URL https://doi.org/10.1001/jamanetworkopen.2020.31756.

MIDAS. 2020. “MIDAS 2019 Novel Coronavirus Repository: Parameter Estimates.” URL https://github.com/midas-network/COVID-19/tree/master/parameter_estimates/2019_novel_coronavirus.
Miyahara, Reiko, Naho Tsuchiya, Ikkoh Yasuda, Yura Ko, Yuki Furuse, Eiichiro Sando, Shohei Nagata, Tadatsugu Imamura, Mayuko Saito, Konosuke Morimoto, Takeaki Imamura, Yugo Shobugawa, Hiroshi Nishiura, Motoi Suzuki, and Hitoshi Oshitani. 2021. “Familial Clusters of Coronavirus Disease in 10 Prefectures, Japan, February-May 2020.” Emerging Infectious Diseases 27 (3).

Nickell, Stephen. 1981. “Biases in Dynamic Models with Fixed Effects.” Econometrica 49 (6):1417–26. URL https://EconPapers.repec.org/RePEc:ecm:emetrp:v:49:y:1981:i:6:p:1417-26.

Panovska-Griffiths, Jasmina, Cliff C Kerr, Robyn M Stuart, Dina Mistry, Daniel J Klein, Russell M Viner, and Chris Bonell. 2020. “Determining the optimal strategy for reopening schools, the impact of test and trace interventions, and the risk of occurrence of a second COVID-19 epidemic wave in the UK: a modelling study.” The Lancet Child & Adolescent Health.

Pearl, Judea. 2009. Causality. Cambridge University Press.

Vlachos, Jonas, Edvin Hertegård, and Helena B. Svaleryd. 2021. “The effects of school closures on SARS-CoV-2 among parents and teachers.” medRxiv 118 (9).

White House, The. 2020. “Guidelines for Opening Up America Again.”

Willeit, Peter, Robert Krause, Bernd Lamprecht, Andrea Berghold, Buck Hanson, Evelyn Stelzl, Heribert Stoiber, Johannes Zuber, Robert Heinen, Alwin Köhler, David Bernhard, Wegene Borena, Christian Doppler, Dorothee von Laer, Hannes Schmidt, Johannes Pröll, Ivo Steinmetz, and Michael Wagner. 2021. “Prevalence of RT-PCR-detected SARS-CoV-2 infection at schools: First results from the Austrian School-SARS-CoV-2 Study.” medRxiv.

Wright, Austin L., Konstantin Sonin, Jesse Driscoll, and Jarnickae Wilson. 2020. “Poverty and Economic Dislocation Reduce Compliance with COVID-19 Shelter-in-Place Protocols.” SSRN Electronic Journal.

Ziauddeen, Nida, Kathryn Woods-Townsend, Sonia Saxena, Ruth Gilbert, and Nisreen A Alwan. 2020. “Schools and COVID-19: Reopening Pandora’s box?” Public Health in Practice.

Zimmerman, Kanecia O., Ibukunoluwa C. Akinboyo, M. Alan Brookhart, Angelique E. Boutzoukas, Kathleen McGann, Michael J. Smith, Gabriela Maradiaga Panayotti, Sarah C. Armstrong, Helen Bristow, Donna Parker, Sabrina Zadrozny, David J. Weber, and Daniel K. Benjamin. 2021. “Incidence and Secondary Transmission of SARS-CoV-2 Infections in Schools.” Pediatrics. URL https://pediatrics.aappublications.org/content/early/2021/01/06/peds.2020-048090.

Supplementary Information Appendix
The Model and Methods.
The Structural Causal Model.
Our approach draws on the framework presented in our previous paper Chernozhukov, Kasahara, and Schrimpf (2021). Here we summarize the approach for completeness, highlighting the main difference (here we do not assume that all relevant social distancing behavioral variables are observed).

We begin with a qualitative description of the model via a causal path diagram shown in Figure 6, which describes how policies, behavior, and information interact together:

• The forward health outcome, $Y_{i,t+\ell}$, is determined last, after all other variables have been determined;
• The adopted vector of policies, $P_{it}$, affects health outcome $Y_{i,t+\ell}$ either directly, or indirectly by altering individual distancing and other precautionary behavior $B_{it}$, which may be only partially observed;
• Information variables, $I_{it}$, such as lagged values of outcomes and other lagged observable variables (see robustness checks) can affect human behavior and policies, as well as outcomes;
• The confounding factors $W_{it}$, which vary across counties and time, affect all other variables; these include unobserved though estimable county, time, state, state-week effects.

The index $i$ denotes the observational unit, the county, and $t$ and $t+\ell$ denote the time, where $\ell$ represents the typical time lag between infection and case confirmation or death.

[Figure: directed graph with nodes $P_{it}$, $I_{it}$, $Y_{i,t+\ell}$, $B_{it}$, and $W_{it}$.]
Figure 6. The causal path diagram for our model.

Our main outcomes of interest are the growth rates in Covid-19 cases and deaths; policy variables include school reopening in various modes, mask mandates, gathering bans, and stay-at-home orders; and the information variables include lagged values of the outcome (as well as other variables described in the sensitivity checks).

The role of behavioral variables in the model is two-fold. First, the presence of these variables in the model requires us to control for the information variables, even when information variables affect outcomes only through policies or behavior. In this case, conditioning on the information blocks the backdoor path (see Pearl (2009)) creating confounding:

$$Y_{i,t+\ell} \longleftarrow B_{it} \longleftarrow I_{it} \longrightarrow P_{it}.$$

Therefore, conditioning on the information is important even when there is no direct effect $I_{it} \longrightarrow Y_{i,t+\ell}$. This observation motivates our main dynamic specification below, where information variables include lagged growth rates and new cases or new deaths per capita. Second, while not all behavioral variables may be observable, we can still study, as a matter of supporting analysis, the effects of policies on observed behavioral variables (the portion of time in workplaces, restaurants, and bars) and of behavioral variables on outcomes, thereby gaining insight as to whether policies have changed private behavior and to what extent this private behavior changed the outcomes (for the analysis of early pandemic data in this vein, see our previous paper).

The causal structure allows for the effect of the policy to be either direct or indirect. The structure also allows for changes in behavior to be brought about by changes in policies and information. These are all realistic properties that we expect from the context of the problem. Policies such as closures and reopenings of schools, non-essential businesses, and restaurants affect behavior in strong ways. In contrast, policies such as mandating employees to wear masks can potentially affect the Covid-19 transmission directly.
The information variables, such as recent growth in the number of cases, can cause people to spend more time at home, regardless of adopted policies; these changes in behavior, in turn, affect the transmission of Covid-19.

The causal ordering induced by this directed acyclic graph is determined by the following timing sequence:
(1) information and confounders get determined at $t$;
(2) policies are set in place, given information and confounders at $t$;
(3) behavior is realized, given policies, information, and confounders at $t$;
(4) outcomes get realized at $t+\ell$ given policies, behavior, information, and confounders.

The model also allows for direct dynamic effects of information variables on the outcome through autoregressive structures that capture persistence in growth patterns. We do not highlight these dynamic effects and only study the short-term effects (longer-run effects get typically amplified; see our previous paper Chernozhukov, Kasahara, and Schrimpf (2021) for more details).

Our quantitative model for the causal structure in Figure 6 is given by the following econometric structural equation model:

$$
\begin{aligned}
Y_{i,t+\ell}(b, p, \iota) &:= \alpha' b + \pi' p + \mu' \iota + \delta_Y' W_{it} + \varepsilon^y_{it}, \\
B_{it}(p, \iota) &:= \beta' p + \gamma' \iota + \delta_B' W_{it} + \varepsilon^b_{it}, \\
P_{it}(\iota) &:= p(\eta' \iota, W_{it}, \varepsilon^p_{it}),
\end{aligned}
\tag{SEM}
$$

which is a collection of structural potential response functions (potential outcomes), where the stochastic shocks are decomposed into an observable part $\delta' W$ and an unobservable part $\varepsilon$. Lower case letters $\iota$, $b$, and $p$ denote the potential values of information, behavior, and policy variables. The restrictions on shocks are described below.

The observed outcomes, policy, and behavior variables are generated by setting $\iota = I_{it}$ and propagating the system from the last equation to the first:

$$
\begin{aligned}
Y_{i,t+\ell} &:= Y_{i,t+\ell}(B_{it}, P_{it}, I_{it}), \\
B_{it} &:= B_{it}(P_{it}, I_{it}), \\
P_{it} &:= P_{it}(I_{it}).
\end{aligned}
$$
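To make the propagation of (SEM) concrete, here is a toy Python simulation with scalar variables. All coefficient values, the linear form chosen for the policy rule $p(\cdot)$, and the function names are hypothetical choices for this sketch, not quantities from the paper. The sketch draws observations by running the equations from last to first, then checks that substituting the behavior equation into the outcome equation gives the same outcome with total policy coefficient $\alpha\beta + \pi$.

```python
import random

# Hypothetical scalar coefficients (illustrative only, not estimates).
ALPHA, PI, MU, DELTA_Y = 0.8, -0.3, 0.5, 1.0   # outcome equation
BETA, GAMMA, DELTA_B = -0.6, -0.4, 0.7          # behavior equation
ETA = 0.9                                       # policy equation

def policy(info, w, eps_p):
    """P_it(iota): an assumed linear policy rule standing in for p(eta'iota, W, eps^p)."""
    return ETA * info + 0.5 * w + eps_p

def behavior(p, info, w, eps_b):
    """B_it(p, iota) = beta*p + gamma*iota + delta_B*W + eps^b."""
    return BETA * p + GAMMA * info + DELTA_B * w + eps_b

def outcome(b, p, info, w, eps_y):
    """Y_{i,t+l}(b, p, iota) = alpha*b + pi*p + mu*iota + delta_Y*W + eps^y."""
    return ALPHA * b + PI * p + MU * info + DELTA_Y * w + eps_y

def simulate(rng):
    """Draw one observation by propagating from the last equation to the first."""
    info, w = rng.gauss(0, 1), rng.gauss(0, 1)
    eps_p, eps_b, eps_y = (rng.gauss(0, 1) for _ in range(3))
    p = policy(info, w, eps_p)
    b = behavior(p, info, w, eps_b)
    y = outcome(b, p, info, w, eps_y)
    return {"I": info, "W": w, "P": p, "B": b, "Y": y,
            "eps_b": eps_b, "eps_y": eps_y}

def reduced_form(obs):
    """Outcome after substituting the behavior equation into the outcome equation."""
    a = ALPHA * BETA + PI              # indirect plus direct policy effect
    b_coef = ALPHA * GAMMA + MU
    c = ALPHA * DELTA_B + DELTA_Y
    shock = ALPHA * obs["eps_b"] + obs["eps_y"]
    return a * obs["P"] + b_coef * obs["I"] + c * obs["W"] + shock

rng = random.Random(0)
draws = [simulate(rng) for _ in range(1000)]
max_gap = max(abs(d["Y"] - reduced_form(d)) for d in draws)
```

Here `max_gap` is zero up to floating-point rounding, since the two expressions for the outcome are algebraically identical; behavior $B_{it}$ does not need to be observed to write the outcome in terms of policy, information, and confounders.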
The orthogonality restrictions on the stochastic components are as follows: the stochastic shocks $\varepsilon^y_{it}$ and $\varepsilon^p_{it}$ are centered and furthermore,

$$
\begin{aligned}
\varepsilon^y_{it} &\perp (\varepsilon^b_{it}, P_{it}, W_{it}, I_{it}), \\
\varepsilon^b_{it} &\perp (P_{it}, W_{it}, I_{it}), \\
\varepsilon^p_{it} &\perp\!\!\!\perp (W_{it}, I_{it}),
\end{aligned}
\tag{O}
$$

where we say that $V \perp U$ if $\mathrm{E}[VU] = 0$. This is a standard way of representing restrictions on errors in structural equation modeling. The last equation states that variation in policies is exogenous conditionally on confounders and information variables.

The system above together with orthogonality restrictions (O) implies the following collection of stochastic equations for realized variables:

$$Y_{i,t+\ell} = \alpha' B_{it} + \pi' P_{it} + \mu' I_{it} + \delta_Y' W_{it} + \varepsilon^y_{it}, \quad \varepsilon^y_{it} \perp B_{it}, P_{it}, I_{it}, W_{it} \tag{BPI→Y}$$

$$B_{it} = \beta' P_{it} + \gamma' I_{it} + \delta_B' W_{it} + \varepsilon^b_{it}, \quad \varepsilon^b_{it} \perp P_{it}, I_{it}, W_{it} \tag{PI→B}$$

As discussed below, the information variable includes case growth. Therefore, the orthogonality restriction $\varepsilon^y_{it} \perp P_{it}$ holds if the government does not have knowledge of future case growth beyond what is predicted by the information set and the confounders; even when the government has some knowledge of $\varepsilon^y_{it}$, the orthogonality restriction may hold if there is a time lag for the government to implement its policies based on $\varepsilon^y_{it}$.

We stress that our main analysis does not require all components of $B_{it}$ to be observable.

Main Implication.
The model stated above implies the following projection equation:
$$Y_{i,t+\ell} = a' P_{it} + b' I_{it} + c' W_{it} + \bar\varepsilon_{it}, \quad \bar\varepsilon_{it} \perp P_{it}, I_{it}, W_{it}, \tag{PI→Y}$$

where

$$a' := (\alpha'\beta' + \pi'), \quad b' := (\alpha'\gamma' + \mu'), \quad c' := (\alpha'\delta_B' + \delta_Y').$$

This follows immediately from plugging equation (PI→B) into equation (BPI→Y) and verifying that the composite stochastic shock $\bar\varepsilon_{it} = \alpha'\varepsilon^b_{it} + \varepsilon^y_{it}$ obeys the orthogonality condition stated in (PI→Y).

The main parameter of interest is the structural causal effect of the policy:

$$a' = (\alpha'\beta' + \pi').$$

It comprises the direct policy effect $\pi'$ as well as the indirect effect $\alpha'\beta'$, realized by the policy changing observed and unobserved behavior variables $B_{it}$. The coefficients $a$ and $b$ can be estimated directly using the dynamic panel data methods described in more detail below.

As additional analysis, we can estimate the determinants of the observed behavioral mobility measures, that is, the observed part of $B_{it}$.

Identification and Parameter Estimation.
The orthogonality equations imply that the main equation is the projection equation, and parameters $a$ and $b$ are identified if $P_{it}$ and $I_{it}$ have sufficient variation left after partialling out the effect of controls:

$$\tilde Y_{i,t+\ell} = a' \tilde P_{it} + b' \tilde I_{it} + \bar\varepsilon_{it}, \quad \bar\varepsilon_{it} \perp \tilde P_{it}, \tilde I_{it}, \tag{1}$$

where $\tilde V_{it} = V_{it} - W_{it}' \mathrm{E}[W_{it} W_{it}']^{-1} \mathrm{E}[W_{it} V_{it}]$ denotes the residual after removing the orthogonal projection of $V_{it}$ on $W_{it}$. The residualization is a linear operator, implying that (1) follows immediately from the above. The parameters of (1) are identified as projection coefficients in these equations, provided that the residualized vectors have a non-singular variance matrix:

$$\mathrm{Var}(\tilde P_{it}', \tilde I_{it}') > 0. \tag{2}$$

Our main estimation method is the fixed effects estimator, where the county, state, and state-week effects are treated as unobserved components of $W_{it}$ and estimated directly from the panel data, so they are rendered (approximately) observable once the history is sufficiently long. The stochastic shocks $\{\varepsilon_{it}\}_{t=1}^{T}$ are treated as independent across states and can be arbitrarily dependent across time $t$ within a state. In other words, the standard errors will be clustered at the state level. When histories are not long, substantial biases emerge from working with the estimated version $\widehat W_{it}$ of $W_{it}$ (known as the Nickell bias (Nickell, 1981)), and they need to be removed using debiasing methods. In our context, debiasing changes the magnitudes of the original biased fixed effects estimator but does not change the qualitative conclusions reached without any debiasing.

Formulating Outcome and Key Confounders via SIR model.
Letting $C_{it}$ denote the cumulative number of confirmed cases in county $i$ at time $t$, our outcome
$$Y_{it} = \Delta \log(\Delta C_{it}) := \log(\Delta C_{it}) - \log(\Delta C_{i,t-7}) \tag{3}$$
approximates the weekly growth rate in new cases from $t-7$ to $t$. Here $\Delta$ denotes the differencing operator over 7 days from $t$ to $t-7$, so that $\Delta C_{it} := C_{it} - C_{i,t-7}$ is the number of new confirmed cases in the past 7 days.
We chose this metric as this is the key metric for policymakers deciding when to relax Covid mitigation policies. The U.S. government’s guidelines for state reopening recommend that states display a “downward trajectory of documented cases within a 14-day period” (White House, 2020). A negative value of $Y_{it}$ is an indication of meeting these criteria for reopening. By focusing on weekly cases rather than daily cases, we smooth idiosyncratic daily fluctuations as well as periodic fluctuations associated with the days of the week.
Our measurement equation for estimating equations (BPI$\to$Y) and (PI$\to$Y) will take the form:
$$\Delta \log(\Delta C_{it}) = X_{i,t-14}' \theta + \delta_T \Delta \log(T_{it}) + \epsilon_{it}, \tag{M-C}$$
where $i$ is county, $t$ is day, $C_{it}$ is cumulative confirmed cases, $T_{it}$ is the number of tests over 7 days, $\Delta$ is a 7-day differencing operator, and $\epsilon_{it}$ is an unobserved error term. $X_{i,t-14}$ collects other behavioral, policy, and confounding variables, depending on whether we estimate (BPI$\to$Y) or (PI$\to$Y), where the lag of 14 days captures the time lag between infection and confirmed case (see MIDAS (2020)). Here
$$\Delta \log(T_{it}) := \log(T_{it}) - \log(T_{i,t-7})$$
is the key confounding variable, derived from considering the SIR model below. We describe other confounders in the empirical analysis section.
Our main estimating equation (M-C) is motivated by a variant of the SIR model, where we add confirmed cases and infection detection via testing. Let $S$, $I$, $R$, and $D$ denote the number of susceptible, infected, recovered, and dead individuals in a given state. Each of these variables is a function of time.
We model them as evolving as
$$\dot{S}(t) = -\frac{S(t)}{N} \beta(t) I(t) \tag{4}$$
$$\dot{I}(t) = \frac{S(t)}{N} \beta(t) I(t) - \gamma I(t) \tag{5}$$
$$\dot{R}(t) = (1 - \kappa) \gamma I(t) \tag{6}$$
$$\dot{D}(t) = \kappa \gamma I(t) \tag{7}$$
where $N$ is the population, $\beta(t)$ is the rate of infection spread, $\gamma$ is the rate of recovery or death, and $\kappa$ is the probability of death conditional on infection.
We may show that $\log(\Delta C_{it}) - \log(\Delta C_{i,t-7})$ approximates the average growth rate of cases from $t-7$ to $t$.
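As a rough numerical illustration of equations (4)-(7) and of the outcome in (3), the system can be integrated with a simple forward-Euler scheme. This is a sketch under stated assumptions, not part of the paper's analysis: the parameter values, the function name `simulate_sir`, the constant infection rate, and the constant detection rate `tau` used to accumulate confirmed cases are all chosen only for the example.

```python
import numpy as np

def simulate_sir(beta, gamma, kappa, tau, N, I0, days, steps_per_day=100):
    """Forward-Euler integration of the SIR system (4)-(7) with constant beta.

    Confirmed cases C accumulate as dC/dt = tau * I, where the constant
    detection rate tau is an assumption of this sketch.
    Returns one row per day with columns S, I, R, D, C.
    """
    dt = 1.0 / steps_per_day
    S, I, R, D, C = N - I0, float(I0), 0.0, 0.0, 0.0
    path = np.empty((days + 1, 5))
    path[0] = (S, I, R, D, C)
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_inf = beta * S / N * I * dt   # flow from S to I
            removed = gamma * I * dt          # flow out of I
            detected = tau * I * dt           # newly confirmed cases
            S -= new_inf
            I += new_inf - removed
            R += (1 - kappa) * removed
            D += kappa * removed
            C += detected
        path[day] = (S, I, R, D, C)
    return path

# Hypothetical parameter values, per day.
beta, gamma, N = 0.3, 0.1, 1_000_000
path = simulate_sir(beta, gamma, kappa=0.01, tau=0.05, N=N, I0=10, days=28)
S, I, R, D, C = path.T

new_cases = C[7:] - C[:-7]                          # Delta C_t for t = 7..28
Y = np.log(new_cases[7:]) - np.log(new_cases[:-7])  # Y_t for t = 14..28

t = 21
# Seven times the average daily growth rate S/N * beta - gamma over the window.
implied = 7 * (beta * S[t - 14:t + 1].mean() / N - gamma)
print(Y[t - 14], implied)
```

With a constant detection rate, the weekly log difference of new confirmed cases should track seven times the daily growth rate of infections, which is the sense in which the outcome approximates a weekly growth rate.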
Confirmed cases, $C(t)$, evolve as
$$\dot{C}(t) = \tau(t) I(t), \tag{8}$$
where $\tau(t)$ is the rate at which infections are detected.
Our goal is to examine how the rate of infection $\beta(t)$ varies with observed policies and measures of social distancing behavior. A key challenge is that we only observe $C(t)$ and $D(t)$, but not $I(t)$. The unobserved $I(t)$ can be eliminated by differentiating (8) and using (5) as
$$\frac{\ddot{C}(t)}{\dot{C}(t)} = \frac{S(t)}{N} \beta(t) - \gamma + \frac{\dot{\tau}(t)}{\tau(t)}. \tag{9}$$
We consider a discrete-time analogue of equation (9) to motivate our empirical specification by relating the detection rate $\tau(t)$ to the number of tests $T_{it}$ while specifying $\frac{S(t)}{N} \beta(t)$ as a linear function of variables $X_{i,t-14}$. This results in
$$\underbrace{\Delta \log(\Delta C_{it})}_{\ddot{C}(t)/\dot{C}(t)} = \underbrace{X_{i,t-14}' \theta + \epsilon_{it}}_{\frac{S(t)}{N} \beta(t) - \gamma} + \underbrace{\delta_T \Delta \log(T)_{it}}_{\dot{\tau}(t)/\tau(t)}$$
which is equation (M-C), where $X_{i,t-14}$ captures a vector of variables related to $\beta(t)$.
Structural Interpretation. The component $X_{i,t-14}' \theta$ is the projection of $\beta_i(t) S_i(t)/N_i(t) - \gamma$ on $X_{i,t-14}$ (including the testing variable).
Growth Rate in Deaths as Outcome. By differentiating (7) and (8) with respect to $t$ and using (9), we obtain
$$\frac{\ddot{D}(t)}{\dot{D}(t)} = \frac{\ddot{C}(t)}{\dot{C}(t)} - \frac{\dot{\tau}(t)}{\tau(t)} = \frac{S(t)}{N} \beta(t) - \gamma. \tag{10}$$
Our measurement equation for the growth rate of deaths is based on equation (10) but accounts for a 21-day lag between infection and death as
$$\Delta \log(\Delta D_{it}) = X_{i,t-21}' \theta + \epsilon_{it}, \tag{M-D}$$
where
$$\Delta \log(\Delta D_{it}) := \log(\Delta D_{it}) - \log(\Delta D_{i,t-7}) \tag{11}$$
approximates the weekly growth rate in deaths from $t-7$ to $t$ in state $i$. Sensitivity analysis also provides results for lags of 28 and 35 days.
Debiased Fixed Effects Dynamic Panel Data Estimator.
We apply jackknife bias corrections; see Chen et al. (2020) and Hahn and Newey (2004) for more details. Here, we briefly describe the debiased fixed effects estimator we use.
Given our panel data with sample size $(N, T)$, denote a set of counties by $\mathcal{N} = \{1, 2, ..., N\}$. We randomly and repeatedly partition $\mathcal{N}$ into two sets as $\mathcal{N}_1^j$ and $\mathcal{N}_2^j = \mathcal{N} \setminus \mathcal{N}_1^j$ for $j = 1, 2, ..., J$, where $\mathcal{N}_1^j$ and $\mathcal{N}_2^j$ (approximately) contain the same number of counties. For each of $j = 1, ..., J$, consider two sub-panels (where $i$ stands for county and $t$ stands for the day) defined by $S_1^j = S_{11}^j \cup S_{22}^j$ and $S_2^j = S_{12}^j \cup S_{21}^j$ with $S_{1k}^j = \{(i, t) : i \in \mathcal{N}_k^j, t \le \lceil T/2 \rceil\}$ and $S_{2k}^j = \{(i, t) : i \in \mathcal{N}_k^j, t \ge \lfloor T/2 + 1 \rfloor\}$ for $k = 1, 2$, where $\lceil \cdot \rceil$ and $\lfloor \cdot \rfloor$ are the ceiling and floor functions. Each of these two subpanels, $S_1^j$ and $S_2^j$, includes observations for all cross-sectional units and time periods.
We form the estimator with bias-correction as
$$\widehat{\beta}_{BC} := 2\widehat{\beta} - \widetilde{\beta} \quad \text{with} \quad \widetilde{\beta} := \frac{1}{J} \sum_{j=1}^{J} \widetilde{\beta}_{S_1^j \cup S_2^j},$$
where $\widehat{\beta}$ is the standard estimator with a set of $N$ county dummies while $\widetilde{\beta}_{S_1^j \cup S_2^j}$ denotes the estimator using the data set $S_1^j \cup S_2^j$ but treats the counties in $S_1^j$ differently from those in $S_2^j$ to form the estimator; namely, we include approximately $2N$ county dummies to compute $\widetilde{\beta}_{S_1^j \cup S_2^j}$. Thus, $(\widetilde{\beta} - \widehat{\beta})$ is the approximation to the bias of $\widehat{\beta}$, subtracting which from $\widehat{\beta}$ gives the formula given above. We set $J = 2$ in our empirical analysis. Choosing $J = 5$ for some specifications gave similar results.
An alternative jackknife bias-corrected estimator is
$$\widehat{\beta}_{CBC} = 2\widehat{\beta} - \frac{1}{J} \sum_{j=1}^{J} \left(\widetilde{\beta}_{S_1^j} + \widetilde{\beta}_{S_2^j}\right)/2,$$
where $\widetilde{\beta}_{S_k^j}$ denotes the fixed effect estimator using the subpanel $S_k^j$ for $k = 1, 2$. In our empirical analysis, these two cross-over jackknife bias-corrected estimators give similar results; in simulation experiments, the first form performed somewhat better, so we settled our choice on it.
We report asymptotic standard errors with state-level clustering, justified by the standard asymptotic theory of bias-corrected estimators. The rationale for state-level clustering is that the stochastic shocks in the model can be correlated across counties, especially within the state. A simple way to model this is to allow for arbitrary within-state correlation and adjust the standard errors to account for this (state-level clustering).
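The split-panel construction described above can be sketched on simulated data. This is a minimal illustration, not the authors' implementation: the AR(1) data-generating process, the sample sizes, the helper name `within_slope`, and the omission of state-week dummies and other controls are assumptions made only for the example. The lagged outcome is used as the single regressor because it makes the Nickell bias of the within estimator easy to see.

```python
import numpy as np

def within_slope(y, x, groups):
    """Fixed effects (within) slope: demean y and x by group, then run OLS."""
    counts = np.bincount(groups)
    y_dm = y - (np.bincount(groups, weights=y) / counts)[groups]
    x_dm = x - (np.bincount(groups, weights=x) / counts)[groups]
    return float(x_dm @ y_dm / (x_dm @ x_dm))

# Hypothetical dynamic panel: y_it = alpha_i + rho * y_{i,t-1} + noise.
rng = np.random.default_rng(0)
N, T, rho, J = 500, 20, 0.5, 2
alpha = rng.normal(size=N)
y_full = np.empty((N, T + 1))
y_full[:, 0] = alpha / (1 - rho) + rng.normal(size=N) / np.sqrt(1 - rho**2)
for t in range(1, T + 1):
    y_full[:, t] = alpha + rho * y_full[:, t - 1] + rng.normal(size=N)

y = y_full[:, 1:].ravel()                 # outcome, periods 1..T
x = y_full[:, :-1].ravel()                # lagged outcome as the regressor
county = np.repeat(np.arange(N), T)
period = np.tile(np.arange(1, T + 1), N)
first_half = period <= np.ceil(T / 2)

# Standard estimator with N county dummies.
beta_hat = within_slope(y, x, county)

# First form: all data, with separate dummies per county and half (about 2N).
beta_tilde = within_slope(y, x, 2 * county + (~first_half))
beta_bc = 2 * beta_hat - beta_tilde

# Alternative cross-over form: average the two sub-panel estimators over
# J random partitions of the counties.
sub_estimates = []
for _ in range(J):
    in_N1 = np.zeros(N, dtype=bool)
    in_N1[rng.permutation(N)[: N // 2]] = True
    obs_in_N1 = in_N1[county]
    S1 = (obs_in_N1 & first_half) | (~obs_in_N1 & ~first_half)
    for mask in (S1, ~S1):
        sub_estimates.append(within_slope(y[mask], x[mask], county[mask]))
beta_cbc = 2 * beta_hat - np.mean(sub_estimates)

print(f"true {rho:.3f}  FE {beta_hat:.3f}  BC {beta_bc:.3f}  CBC {beta_cbc:.3f}")
```

In this stripped-down setting with only county dummies, the estimator with county-by-half dummies does not depend on which counties fall in each partition, so it is computed once; the random partition matters only for the cross-over form.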
Department of Economics and Center for Statistics and Data Science, MIT, MA 02139
Email address: [email protected]
Vancouver School of Economics, UBC, 6000 Iona Drive, Vancouver, BC.
Email address: [email protected]
Vancouver School of Economics, UBC, 6000 Iona Drive, Vancouver, BC.
Email address: [email protected]
Table S1.
The Association of School/College Openings and NPI Policies with Case Growth in the United States: Standard Fixed Effects Estimator without Bias Correction
Dependent variable: Case Growth Rates
(1) (2) (3) (4)
College Visits, 14d lag 0.359∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗
  (0.071) (0.073) (0.064) (0.076)
K-12 Visits, 14d lag 0.393∗∗∗ ∗∗∗
  (0.070) (0.070)
K-12 Visits × No-Mask 0.100
  (0.070)
K-12 In-person, 14d lag 0.062∗∗∗ ∗∗∗
  (0.017) (0.021)
K-12 Hybrid, 14d lag 0.040∗∗∗ ∗∗
  (0.014) (0.013)
K-12 Remote, 14d lag 0.030∗ ∗
  (0.016) (0.015)
K-12 In-person × No-Mask 0.009
  (0.019)
K-12 Hybrid × No-Mask 0.032∗
  (0.017)
Mandatory mask 14d lag − − − − − ∗ − − ∗∗ − − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.031) (0.039) (0.034) (0.040)
log(Cases), 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.009) (0.010) (0.010) (0.010)
log(Cases), 21d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.005) (0.005) (0.005) (0.005)
log(Cases), 28d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.003) (0.003) (0.004) (0.004)
Test Growth Rates 0.009∗∗ ∗ ∗∗ ∗
  (0.004) (0.004) (0.004) (0.004)
County Dummies Yes Yes Yes Yes
State × Week Dummies Yes Yes Yes Yes
Observations 690,297 545,131 612,963 528,941
R²
Notes: Dependent variable is the log difference in weekly positive cases across 2 weeks. Regressors are 7-day moving averages of corresponding daily variables and lagged by 2 weeks to reflect the time between infection and case reporting, except that we don’t take any lag for the log difference in test growth rates. All regression specifications include county fixed effects and state-week fixed effects to control for any unobserved county-level factors and time-varying state-level factors such as various state-level policies, as well as 2, 3, and 4 weeks lagged log of cases. The standard fixed effects estimator without bias-correction is applied. Asymptotic clustered standard errors at the state level are reported in the bracket. ∗ p < ∗∗ p < ∗∗∗ p <
Figure S1.
Average weekly cases and deaths are associated with different modes of opening K-12 schools, visits to K-12 schools, and visits to colleges/universities
[Figure with 24 panels.
(a) K-12 School Visits; (b) Restaurant Visits; (c) Recreation Facility Visits; (d) Church Visits; (e) Full-Time Workplace Visits; (f) Part-Time Workplace Visits; (g) Staying Home Devices; (h) Bar Visits. Each panel plots the per-device measure by plan: In-person/No-Mask, In-person/Yes-Mask, Hybrid/No-Mask, Hybrid/Yes-Mask, Remote.
(i)-(l) Cases by Opening Modes and (m)-(p) Deaths by Opening Modes, split by Student Mask Requirements, Staff Mask Requirements, Sports Activities, and Online Instruction Increase, respectively.
(q)-(t) Cases by K-12 Visits and (u)-(x) Deaths by K-12 Visits, with the same four splits, by visit group (High, Medium, Low) against calendar date (Sep to Dec).]
Notes: (a)-(h) plot the evolution of corresponding variables in the title before and after the day of school openings and correspond to figures reported in Fig. 1(c)(d) in the main text. (i)-(p) correspond to Fig.(a)(b) and plot the evolution of weekly cases or deaths per 1000 persons averaged across counties within each group of counties classified by K-12 school teaching methods and different mitigation strategies (mask requirements for students, mask requirements for staff, allowing for sports activities, and increase in online instruction) against the days since K-12 school opening. In (i) and (m), counties that implement in-person teaching are classified into “In-person/Yes-Mask” and “In-person/No-Mask” based on whether at least one school district requires students to wear masks or not. In (k) and (o), counties that implement in-person teaching are classified into “In-person/Yes-Sports” and “In-person/No-Sports” based on whether at least one school district allows sports activities or not. In (l) and (p), counties that implement in-person teaching are classified into “In-person/No-Online” and “In-person/Yes-Online” based on whether at least one school district answers that there is no increase in online instruction. (q)-(x) are similar to (i)-(p) but classify counties by the volume of per-device K-12 school visits and take the calendar dates instead of the days since opening schools as x-axis, where “Low,” “Middle,” and “High” are county-day observations of which 14 days lagged per-device K-12 school visits are less than the first quartile, between the first and the third quartiles, and larger than the third quartile, respectively. In (q) and (u), “Low/No-Mask,” “Middle/No-Mask,” and “High/No-Mask” are a subset of low, middle, and high visits groups of counties for which at least one school district does not require students to wear masks.
Figure S2.
The number of cases by age groups and the number of visits to colleges/universities, bars, restaurants, recreation facilities, K-12 schools, and a comparison of reported cases between CDC and NYT data
[Figure with one column per county (Pima, AZ; Ingham, MI; Centre, PA; Story, IA; Champaign, IL) and one row per series: Cases by Age Groups, College Visits, Bar Visits, Restaurant Visits, Recreation Facility Visits, School Visits, and CDC vs. NYT Cases. Each panel plots a moving average of the series against date.]
Notes: Figure corresponds to Fig. 2 in the main text but for Pima, AZ, Ingham, MI, Centre, PA, Story, IA, and Champaign, IL. Across various counties, we also report the evolution of visits to recreation facilities and K-12 school visits. The last panel at the bottom compares the sum of weekly cases across all age groups reported in the CDC dataset with the weekly reported cases in the NYT dataset.
Figure S3.
Sensitivity analysis for the estimated coefficients of K-12 visits and college visits of case growth regressions: Estimator without Bias Correction
[Figure with two panels plotting estimated coefficients (horizontal axis from 0.0 to 0.8) by model.
(a) Case Growth Estimates: School Visits, College Visits.
(b) Case Growth Estimates with School Visits × No Mask: School Visits × No-Mask, School Visits, College Visits.
Models: (1) Baseline, (2) Lag = 10, (3) Lag = 18, (4) Alt. Case Growth, (5) Past Cases, (6) Bars etc., (7) Fulltime + Home, (8) All of (5)-(7).]
Notes: These figures correspond to Fig. 3 of the main text but report the result of the (standard) fixed effects estimator without bias correction.
Figure S4.
Evolution of Cases/Deaths per 1000 Persons, Case/Death Growth, Visits to K-12 Schools, Colleges, Restaurants, Bars, Gyms, Churches, K-12 School Opening Modes, and NPIs across U.S. counties
[Figure with 12 panels plotted against date (Apr to Jan): (1) Case Growth; (2) Cases per 1000; (3) Visits to K-12 Schools; (4) Death Growth; (5) Deaths per 1000; (6) Visits to Colleges; (7) Visits to Restaurants; (8) Visits to Bars; (9) Visits to Rec. Facilities; (10) Visits to Churches; (11) School Opening Modes (portion of counties by plan: In-person, Remote, Hybrid, Unknown); (12) NPIs (portion of counties by policy: mask-mandate, stay-at-home, ban-gathering).]
Notes: (1)-(10) report the evolution of various percentiles of corresponding variables in the title over time. (11) reports the proportion of counties that open K-12 schools with different teaching methods including “Unknown” over time while (12) reports the proportion of counties that implement three NPIs over time.
Table S2.
The Association of School/College Openings with Mobility in the United States: All Estimates
(a) Full-time Workplace Visits and Staying Home Devices
Dependent variable: Full Time Full Time Stay Home Stay Home
(1) (2) (3) (4)
College Visits − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.004) (0.006) (0.024) (0.026)
K-12 School Visits 0.078∗∗∗ − ∗∗
  (0.006) (0.026)
Open K-12 In-person 0.999∗∗∗ − ∗∗∗
  (0.125) (0.382)
Open K-12 Hybrid 0.509∗∗∗ ∗∗∗ − ∗∗∗ − ∗∗∗ ∗ − − ∗∗∗ ∗∗∗
  (0.031) (0.033) (0.330) (0.340)
log(Cases), 14d lag 0.004 0.007 0.273∗∗∗ ∗∗∗
  (0.004) (0.005) (0.028) (0.028)
log(Cases), 21d lag 0.002 − ∗∗∗ ∗∗∗
  (0.002) (0.003) (0.019) (0.017)
log(Cases), 28d lag 0.006∗∗∗ ∗ ∗∗∗ ∗∗∗
  (0.002) (0.002) (0.023) (0.024)
County Dummies Yes Yes Yes Yes
State × Week Dummies Yes Yes Yes Yes
Observations 670,909 595,886 670,909 595,886
R²
(b) Visits to Restaurants and Bars
Dependent variable: Restaurants Restaurants Bars Bars
(1) (2) (3) (4)
College Visits 0.064 0.034 0.016∗∗∗ ∗∗
  (0.053) (0.051) (0.006) (0.005)
K-12 School Visits 0.006 0.008
  (0.046) (0.006)
Open K-12 In-person − ∗∗∗ − ∗∗∗
  (0.404) (0.041)
Open K-12 Hybrid − ∗∗∗ − ∗∗∗
  (0.272) (0.038)
Open K-12 Remote − ∗ ∗∗∗ − − − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.203) (0.241) (0.025) (0.025)
log(Cases), 14d lag − ∗∗ − − ∗ − − ∗∗∗ − ∗∗∗ − ∗∗ − ∗
  (0.032) (0.032) (0.005) (0.005)
log(Cases), 28d lag − ∗∗∗ − ∗∗∗ − ∗∗ − ∗∗∗
  (0.043) (0.042) (0.005) (0.005)
County Dummies Yes Yes Yes Yes
State × Week Dummies Yes Yes Yes Yes
Observations 670,909 595,886 670,909 595,886
R²
Notes: These tables report the omitted estimates of Table 2 in the main text. All regression specifications include county fixed effects and state-week fixed effects. The debiased estimator is used. Clustered standard errors at the state level are reported in the bracket. ∗ p < ∗∗ p < ∗∗∗ p <
Table S3.
The Association of School/College Openings, NPI Policies, Full-time/Part-time Work, and Staying Home Devices with Case Growth in the United States: Debiased Fixed Effects Estimator
Dependent variable: Case Growth Rates
(1) (2) (3) (4)
College Visits, 14d lag 0.060 0.012 0.114∗ ∗∗∗ ∗∗∗
  (0.075) (0.087)
K-12 Visits × No-Mask 0.287∗∗∗
  (0.071)
K-12 In-person, 14d lag 0.015 − − ∗∗ − ∗∗∗
  (0.013) (0.013)
K-12 Remote, 14d lag − ∗∗∗ − ∗∗∗
  (0.015) (0.014)
K-12 In-person × No-Mask 0.034∗
  (0.020)
K-12 Hybrid × No-Mask 0.043∗∗∗
  (0.017)
Full-time Work Device, 14d lag − ∗∗ ∗∗
  (0.417) (0.490) (0.384) (0.436)
Part-time Work Device, 14d lag 0.262 0.466 0.820∗∗∗ ∗∗∗
  (0.259) (0.305) (0.276) (0.309)
Staying Home Device, 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.057) (0.069) (0.061) (0.067)
Mandatory mask 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.018) (0.017) (0.019) (0.019)
Ban gatherings 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.034) (0.044) (0.034) (0.043)
Stay at home 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.033) (0.040) (0.034) (0.040)
log(Cases), 14d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.009) (0.010) (0.010) (0.010)
log(Cases), 21d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.004) (0.005) (0.005) (0.005)
log(Cases), 28d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.003) (0.003) (0.004) (0.003)
Test Growth Rates 0.009∗∗ ∗ ∗∗ ∗∗
  (0.004) (0.004) (0.004) (0.004)
County Dummies Yes Yes Yes Yes
State × Week Dummies Yes Yes Yes Yes
Observations 690,297 545,131 612,963 528,941
R²
Notes: Dependent variable is the log difference in weekly positive cases across 2 weeks. All regression specifications include county fixed effects and state-week fixed effects to control for any unobserved county-level factors and time-varying state-level factors such as various state-level policies. The debiased fixed effects estimator is applied. Asymptotic clustered standard errors at the state level are reported in the bracket. ∗ p < ∗∗ p < ∗∗∗ p <
Table S4.
The Association of School/College Openings and NPI Policies with Death Growth in the United States: Debiased Fixed Effects Estimator
Dependent variable: Death Growth Rates
(1) (2) (3) (4)
College Visits, 21d lag 0.142∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗
  (0.047) (0.057) (0.058) (0.055)
K-12 Visits, 21d lag 0.160∗∗∗ × No-Mask 0.174∗∗
  (0.073)
K-12 In-person, 21d lag − − × No-Mask 0.050∗∗∗
  (0.016)
K-12 Hybrid × No-Mask 0.017
  (0.015)
Mandatory mask, 21d lag − ∗∗ − ∗∗ − ∗∗∗ − ∗∗
  (0.009) (0.009) (0.009) (0.009)
Ban gatherings, 21d lag − − ∗∗ − ∗∗ − ∗∗
  (0.027) (0.025) (0.027) (0.025)
Stay at home, 21d lag − ∗∗∗ − ∗∗ − ∗∗∗ − ∗∗
  (0.032) (0.030) (0.030) (0.029)
log(Deaths), 21d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.004) (0.005) (0.004) (0.006)
log(Deaths), 28d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.004) (0.005) (0.005) (0.005)
log(Deaths), 35d lag − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗
  (0.004) (0.005) (0.004) (0.005)
County Dummies Yes Yes Yes Yes
State × Week Dummies Yes Yes Yes Yes
Observations 628,061 490,568 557,219 476,794
R²
Notes: Dependent variable is the log difference in weekly reported deaths across 2 weeks. Regressors are 7-day moving averages of corresponding daily variables and lagged by 3 weeks to reflect the time between infection and case reporting. All regression specifications include county fixed effects and state-week fixed effects to control for any unobserved county-level factors and time-varying state-level factors such as various state-level policies. The debiased fixed effects estimator is applied. Asymptotic clustered standard errors at the state level are reported in the bracket. Estimates are based on the sample of counties after dropping the smallest 10 percent in population sizes because the number of reported deaths is zero for many observations in small counties. ∗ p < ∗∗ p < ∗∗∗ p <
Table S5.
Summary Statistics
Statistic N Mean St. Dev. Min Pctl(25) Pctl(75) Max
Case Growth Rate 698,278 0.099 0.901 − − − − − − − − −
Notes: Based on observations from April 15, 2020 to December 2, 2020 for the maximum of 3142 counties.
Table S6.