Evaluating Policies Early in a Pandemic: Bounding Policy Effects with Nonrandomly Missing Data
Understanding the Effects of Tennessee’s Open Covid-19 Testing Policy: Bounding Policy Effects with Nonrandomly Missing Data ∗†
Brantly Callaway ‡ Tong Li §
May 27, 2020
Abstract
Increased testing for Covid-19 is seen as one of the most important steps to re-open the economy. The current paper considers Tennessee’s “open-testing” policy where the state substantially increased testing while removing symptom requirements for individuals to be tested. To understand the effect of the policy, we combine standard identifying assumptions with additional weak assumptions to deal with non-random testing that lead to bounds on policy effects of interest. Our results suggest that Tennessee’s open-testing policy has reduced total cases (which are not fully observed), confirmed cases, and trips to work among counties with a fast-growing number of confirmed cases.
Widespread testing for Covid-19 is seen as one of the requirements for re-opening the economy. For example, in California, “the ability to monitor and protect our communities through testing, contact tracing, isolating, and supporting those who are positive or exposed” is the first of six indicators for when the state would potentially modify its stay-at-home order. The idea is that mass testing would include individuals with mild symptoms or even no symptoms. Individuals that test positive could be isolated for some period of time and their contacts could be traced, notified of their potential exposure, and potentially also be tested. In principle, this extra testing would further result in detecting and mitigating localized outbreaks much earlier than they would be otherwise.

∗ Li dedicates this paper to the memory of his late mother, Mrs. Qing Zheng, who passed away on April 21, 2020.
† We thank Lesley Turner, John Weymark, Jun Zhao, Lily Zhao, and seminar participants at Vanderbilt University for helpful comments.
‡ Department of Economics, University of Georgia. [email protected]
§ Department of Economics, Vanderbilt University. [email protected]
arXiv [econ.EM]

In the current paper, we examine a policy implemented by Tennessee that expanded their testing capacity and made testing available to anyone who wanted to take the test. This policy was substantially different from neighboring states that all have had fewer tests per capita along with substantial eligibility requirements for taking the test. The goal of the paper is to examine whether or not Tennessee’s open-testing policy has affected the number of Covid-19 cases in the state as well as other economic outcomes (in particular, we focus on number of trips to work).

Our identification strategy is to take counties in Tennessee and compare them to “similar” counties in nearby states.
In particular, we compare outcomes in Tennessee and Alabama among counties that had similar populations, Covid-19 cases, and had conducted a similar number of tests prior to Tennessee’s open-testing policy. Under standard assumptions, differences in outcomes experienced by counties in Tennessee relative to outcomes in counties in Alabama with similar characteristics can be attributed to the policy differences between the two states.

This would be a relatively straightforward exercise if testing were administered randomly in each state; but that is not the case. In particular, individuals who are more likely to have Covid-19 appear to be much more likely to take the test. For most states, including Alabama, this is by construction: showing certain symptoms is a requirement to be able to be tested for Covid-19. Even in Tennessee, where the policy allows any individual that would like to get a test to take it for free, it still seems likely to be the case that there is some selection into taking the test. In practice, this creates two challenges. First, it is not possible to directly compare counties that had the same number of total Covid-19 cases before the open-testing policy was implemented because the total number of Covid-19 cases is not observed. Second, and for the same reason, it is challenging to evaluate the effect of the policy on the total number of Covid-19 cases. We propose several strategies for dealing with nonrandomly missing testing data (discussed at length below).
In particular, we make relatively weak assumptions on the fraction of untested individuals that have had Covid-19 that lead to bounds on the policy effects of interest.

For observed outcomes such as confirmed cases and trips to work (which are relatively simpler due to only suffering from the first issue mentioned above), our results indicate that the open-testing policy (i) decreased the number of confirmed cases, and (ii) decreased the amount of travel to work in counties that experienced relatively large increases in their number of confirmed cases over time. For the per capita number of total Covid-19 cases (which is the relatively harder case due to suffering from both issues mentioned above), we obtain non-trivial bounds on the effect of the policy suggesting that the policy has reduced the number of total cases in Tennessee relative to what they would have been in the absence of the policy. The most important driver of these results is that it simultaneously appears that the open-testing policy has increased the number of tests while decreasing the number of confirmed cases; together, these form a strong piece of evidence that the policy has reduced total cases even though total cases are unobserved and it is hard to come up with reasonable assumptions that lead to point identification of the effect of the policy on total cases. Taken together, these results suggest that Tennessee’s open-testing policy has had a positive effect along several dimensions.

Footnote: To be clear on the terminology, we use the phrase total cases to refer to the total number of Covid-19 cases; this is in general not observed. We use the phrase confirmed cases to refer to the number of positive tests for Covid-19.

Our paper’s main contribution is two-fold. First, it has been a widely held view that more testing is important to contain the outbreak and to reopen the economy. It is thus of profound importance to quantify the effects of adopting the open-testing policy.
To the best of our knowledge, our paper is the first one to evaluate the effects of an open Covid-19 testing policy where we study the effect of the policy in Tennessee which is the first state to offer open-testing. Second, we make a new methodological contribution in providing a method to bound the policy effects with nonrandomly missing data, which is a serious concern in our case, and cannot be dealt with employing the standard methods used in the treatment effects literature. Building on Manski and Molinari (2020), who study bounding the Covid-19 infection rate under weak assumptions, we provide a novel method to bound the policy effects of the open-testing policy. Our method can also be applied to other policy evaluation applications with nonrandomly missing data. And, in particular, our methodology is potentially useful for studying the effects of other policies related to Covid-19 (e.g., evaluating stay-at-home policies as in Friedson, McNichols, Sabia, and Dave (2020) and Dave, Friedson, Matsuzawa, and Sabia (2020)).

The paper is organized as follows. Section 2 provides more details about Tennessee’s open-testing policy. Section 3 discusses the data that we use to study the effects of the policy. Section 4 considers our approach to bounding total Covid-19 cases and evaluating policy effects in the presence of nonrandomly missing data. Section 5 provides our main results on analyzing the effects of Tennessee’s open-testing policy. Section 6 concludes.
On Wednesday, April 15, Tennessee’s Republican governor Bill Lee announced free testing in the state for anyone who wanted a test. That Saturday, April 18, more than 6,500 Tennessee residents were tested at 20 different testing locations across the state. Unlike almost all other states at that time, obtaining a test did not require an individual to be showing symptoms or to be in a high risk group. These tests were also available on the weekends of April 25 and May 2. Over the course of those three weekends, over 23,000 individuals were tested at a total of 67 different testing sites. Following those three weekends, the open-testing policy has been somewhat modified. Tennessee has increased their emphasis on testing high-risk populations. That being said, the requirements to be tested for Covid-19 in Tennessee appear to be minimal even after the policy was modified.

Footnote: For example, in his tweet on May 14, 2020, U.S. Senator Lamar Alexander (R-Tenn) says that more testing is key to ensuring people are safe as they go back to work and go back to school; https://twitter.com/SenAlexander/status/1260957179210653697.

Notes:
Per capita tests for Tennessee and its six neighboring states over time. The vertical black line is set at April 17 which is approximately the start of Tennessee’s open-testing policy.
Sources:
Covid Tracking Project (https://covidtracking.com/) and Census Bureau
Compared to its six neighboring states, Tennessee was already a high testing state when the new policy was implemented (see Figure 1). Out of these seven states, Tennessee was roughly tied with Mississippi for the most tests per capita when the open-testing policy was implemented. Between April 17 and May 8, the number of tests per capita increased in Tennessee by over 2 percentage points, more than in any of its neighboring states. And by May 8, Tennessee had conducted more tests per capita than any of its neighboring states (46% more than Alabama, 68% more than Arkansas, 66% more than Georgia, 96% more than Kentucky, 31% more than Mississippi, and 109% more than North Carolina).

On the other hand, Tennessee was closer to the middle in terms of number of confirmed cases. On April 17, Tennessee was essentially tied with Alabama for third out of seven in terms of per capita confirmed cases (see Figure 2) behind Georgia and Mississippi. By May 8, Tennessee had moved somewhat ahead of Alabama in terms of per capita number of confirmed cases but was still behind both Georgia and Mississippi. It is also important to remember that the number of confirmed cases depends on the number of tests, especially in cases like Covid-19 where the number of tests is relatively low and there may be a large number of asymptomatic cases or cases with relatively mild symptoms. Thus, for example, the increase in confirmed cases in Tennessee relative to Alabama could be explained by an increase in actual cases in Tennessee relative to Alabama or just due to a mechanical increase in the number of confirmed cases arising from more extensive testing.

Footnote: One main reason for Tennessee’s large number of tests is that, unlike most other states, Tennessee has been directly paying private labs. See https://wpln.org/post/tennessees-secret-to-plentiful-coronavirus-testing-picking-up-the-tab/.
Notes:
Per capita confirmed Covid-19 cases for Tennessee and its six neighboring states over time. The vertical black line is set at April 17 which is approximately the start of Tennessee’s open-testing policy.
Sources:
Covid Tracking Project (https://covidtracking.com/) and Census Bureau

[Figure panels: (a) Per Capita Tests by County; (b) Per Capita Confirmed Cases by County]
Notes:
Per capita tests and confirmed cases for all counties in Tennessee except Trousdale and Bledsoe. The highlighted counties are the most populated counties in Tennessee. Davidson County includes Nashville. Williamson County is the largest suburban county of Nashville. Shelby County includes Memphis. Knox County includes Knoxville. The vertical black line is set at April 17 which is approximately the start of Tennessee’s open-testing policy.
Sources:

Footnote: In some of our descriptive analysis, we keep these counties, but in our main results, we drop these counties.

This section discusses our approach to (i) bounding the rate of total Covid-19 cases across counties, (ii) evaluating the effect of Tennessee’s open-testing policy on other outcomes (in particular, county-level confirmed cases and county-level number of trips to work), and (iii) evaluating the effect of Tennessee’s open-testing policy on the total number of Covid-19 cases.

We use the following notation:
• $C_{ilt}$: a binary variable for whether or not individual $i$ in county $l$ has had Covid-19 by time period $t$.
• $R_{ilt}$: a binary variable for whether or not individual $i$ in county $l$ has tested positive for Covid-19 by time period $t$.
• $T_{ilt}$: a binary variable for whether or not individual $i$ in county $l$ has taken a test for Covid-19 by time period $t$.

Our first goal is descriptive: to see what fraction of the population has had Covid-19 by time period $t$ in a particular county $l$ (note that the same arguments would apply for another fixed location such as a state as well). That is, our interest centers on $P(C_{ilt} = 1)$. To be clear about the notation here, this is the fraction of the population in county $l$ at time period $t$ that has had Covid-19. That is, we are averaging over all individuals in a particular county $l$ at time period $t$.

Identifying the fraction of individuals that have ever had Covid-19 is challenging because (i) not all individuals have been tested and (ii) for individuals that have been tested for Covid-19, testing has not been randomly assigned. The goal of this section is to develop non-trivial bounds on the fraction of total Covid-19 cases in a particular location at a particular time under plausible identifying assumptions. In particular, following Manski and Molinari (2020), notice that

$$P(C_{ilt} = 1) = P(C_{ilt} = 1 \mid T_{ilt} = 1)\,P(T_{ilt} = 1) + P(C_{ilt} = 1 \mid T_{ilt} = 0)\,P(T_{ilt} = 0) \tag{1}$$

which follows immediately by the law of total probability.
Next, consider each of these terms individually:
• $P(C_{ilt} = 1 \mid T_{ilt} = 1)$. This is the fraction of the population in county $l$ at time period $t$ that has had Covid-19 conditional on being tested. We discuss this term in more detail below.
• $P(T_{ilt} = 1)$ is the (observed) fraction of the population in county $l$ at time period $t$ who have been tested for Covid-19.
• $P(T_{ilt} = 0)$ is the (observed) fraction of the population in county $l$ at time period $t$ who have not been tested for Covid-19.
• $P(C_{ilt} = 1 \mid T_{ilt} = 0)$ is the (unobserved) fraction of individuals who have had Covid-19 among those who have not been tested in county $l$ by time period $t$. This term is the hardest to identify, and we discuss plausible assumptions that lead to bounds on this term below.

Footnote: Other recent work on estimating the number of cases in the presence of nonrandomly missing data includes Hortaçsu, Liu, and Schwieg (2020), though their approach is substantially different from the approach taken in this section.

Next, consider $P(C_{ilt} = 1 \mid T_{ilt} = 1)$. It can be written as

$$\begin{aligned} P(C_{ilt} = 1 \mid T_{ilt} = 1) &= P(C_{ilt} = 1 \mid T_{ilt} = 1, R_{ilt} = 1)\,P(R_{ilt} = 1 \mid T_{ilt} = 1) + P(C_{ilt} = 1 \mid T_{ilt} = 1, R_{ilt} = 0)\,P(R_{ilt} = 0 \mid T_{ilt} = 1) \\ &= P(R_{ilt} = 1 \mid T_{ilt} = 1) + P(R_{ilt} = 0 \mid T_{ilt} = 1, C_{ilt} = 1)\,P(C_{ilt} = 1 \mid T_{ilt} = 1) \end{aligned}$$

where the first equality holds by the law of total probability and the second equality holds because (i) $R_{ilt} = 1 \implies T_{ilt} = 1$ (i.e., in order to test positive, an individual has to be tested), (ii) we suppose that the false positive rate of the test is equal to 0, which implies that $P(C_{ilt} = 1 \mid R_{ilt} = 1) = 1$, and (iii) repeated application of the definition of conditional probability for the second term. Then, rearranging implies that

$$P(C_{ilt} = 1 \mid T_{ilt} = 1) = \frac{P(R_{ilt} = 1 \mid T_{ilt} = 1)}{1 - P(R_{ilt} = 0 \mid T_{ilt} = 1, C_{ilt} = 1)} \tag{2}$$

where
• $P(R_{ilt} = 1 \mid T_{ilt} = 1)$ is the (observed) fraction of tests that have come back positive in county $l$ at time period $t$.
• $P(R_{ilt} = 0 \mid T_{ilt} = 1, C_{ilt} = 1)$ is the false negative rate of the test. This is a property of the test, and we set the false negative rate to be equal to 0.25.

Equation (2) says that the probability of having Covid-19 conditional on being tested is increasing in the fraction of positive tests and the false negative rate of the test. It also implies that every term in Equation (1) is identified except $P(C_{ilt} = 1 \mid T_{ilt} = 0)$. Without employing some additional assumption on this term, the bounds on the rate of total cases are given by

$$\frac{P(R_{ilt} = 1)}{1 - P(R_{ilt} = 0 \mid T_{ilt} = 1, C_{ilt} = 1)} \le P(C_{ilt} = 1) \le \frac{P(R_{ilt} = 1)}{1 - P(R_{ilt} = 0 \mid T_{ilt} = 1, C_{ilt} = 1)} + P(T_{ilt} = 0) \tag{3}$$

Footnote: The false positive rate is given by $P(C_{ilt} = 0 \mid R_{ilt} = 1)$.

Footnote: Manski and Molinari (2020) put bounds on a closely related term called the Negative Predictive Value of the test; we could similarly put bounds on the false negative rate of the test. We do not do this in the current paper in order to mainly focus on the bounds arising from non-random testing. In the results presented below, in general, the bounds are not very sensitive to different reasonable values of the false negative rate of the test. More specifically, a higher false negative rate increases both the lower bound and the upper bound, but it increases the upper bound relatively more under Assumption 1 (see Equation (4) below); thus bounds tend to be somewhat wider (although the actual difference is small) under larger values of the false negative rate of the test.

In our case, these sorts of bounds would be extremely wide. For example, for the whole state of Tennessee, $P(R_{ilt} = 1)$ is about 0.2% and $P(T_{ilt} = 0)$ is about 97% (i.e., about 3% of Tennessee’s population has been tested and about 0.2% have had a positive test). If the only restriction on $P(C_{ilt} = 1 \mid T_{ilt} = 0)$ is that it is bounded between 0 and 1, then this will lead to extremely wide bounds on Covid-19 cases (essentially uninformative).
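The arithmetic behind these worst-case bounds can be sketched numerically. The snippet below is only an illustration, not the authors' code: the function name `worst_case_bounds` and its argument names are hypothetical, and the inputs are the rounded statewide figures quoted in the text (about 0.2% with a positive test, about 97% untested) together with the false negative rate of 0.25.

```python
import math

def worst_case_bounds(p_confirmed, p_untested, false_negative_rate=0.25):
    """Bounds on P(C=1) from Equation (3), with no restriction on the untested.

    p_confirmed : P(R=1), share of the population with a positive test
    p_untested  : P(T=0), share of the population that has not been tested
    """
    # First term of Equation (1): P(C=1, T=1) = P(R=1) / (1 - false negative rate)
    cases_among_tested = p_confirmed / (1.0 - false_negative_rate)
    lower = cases_among_tested               # prevalence among the untested set to 0
    upper = cases_among_tested + p_untested  # prevalence among the untested set to 1
    return lower, upper

# Rounded statewide Tennessee figures quoted in the text (illustrative only).
lower, upper = worst_case_bounds(p_confirmed=0.002, p_untested=0.97)
print(f"lower = {lower:.4f}, upper = {upper:.4f}")
```

With these rounded inputs the interval runs from roughly 0.3% to roughly 97% of the population, which is why the bounds are described as essentially uninformative.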
Instead (and continuing to follow Manski and Molinari (2020)), we make the following assumption.

Assumption 1 (Covid-19 Bound for Untested Individuals). $P(C_{ilt} = 1 \mid T_{ilt} = 0) \le P(C_{ilt} = 1 \mid T_{ilt} = 1)$

Assumption 1 says that the fraction of individuals who have had Covid-19 (in a particular county) is no higher among the group of individuals who have not been tested than among those who have been tested. This is a fairly weak assumption. This assumption is likely to hold for two reasons. First, tests have been predominantly given to individuals exhibiting Covid-19 symptoms. Second, even in Tennessee where testing has been available to anyone who wants to take a test, (i) individuals exhibiting symptoms are still among those most likely to take the test and (ii) it seems likely that there is some self-selection into taking the test among individuals who think they may have Covid-19 even if they do not have the right combination of symptoms to otherwise warrant a test. It is also helpful to think about the limiting cases of the assumption. $P(C_{ilt} = 1 \mid T_{ilt} = 0) = 0$ in the case when no untested individuals have had Covid-19. $P(C_{ilt} = 1 \mid T_{ilt} = 0) = P(C_{ilt} = 1 \mid T_{ilt} = 1)$ if the probability of having had Covid-19 is the same for individuals who have not been tested as for individuals who have been tested. This condition would hold if testing were randomly assigned. In practice, either of these limiting conditions would be strong enough to point identify $P(C_{ilt} = 1)$; however, based on the above discussion, neither of these limiting conditions seems likely to hold. Instead, Assumption 1 imposes the much weaker condition that the probability of having had Covid-19 for the group of individuals who have not been tested falls in between these two limiting cases.

Assumption 1 does not affect the lower bound on the total number of cases, but it is potentially very useful in lowering the upper bound on the number of Covid-19 cases in a particular location.
In particular, notice that under Assumption 1,

$$P(C_{ilt} = 1) \le P(C_{ilt} = 1 \mid T_{ilt} = 1) \tag{4}$$

This can lead to a much tighter bound, especially when $P(C_{ilt} = 1 \mid T_{ilt} = 1)$ is substantially less than one. For example, for the whole state of Tennessee, $P(C_{ilt} = 1 \mid T_{ilt} = 1)$ is roughly equal to 8%. This immediately leads to a much tighter bound on the total number of cases relative to not putting any restrictions on $P(C_{ilt} = 1 \mid T_{ilt} = 0)$.

4.2 Policy Evaluation with Nonrandomly Missing Data

The previous section discussed how to bound the total number of Covid-19 cases in a particular location. The second goal of the paper is to go beyond these descriptive bounds and evaluate how Tennessee’s open-testing policy has affected the (unobserved) total number of Covid-19 cases as well as other outcomes such as confirmed cases and trips to work.

The notation for this section is somewhat different from the previous section. In particular, define $C_{lt} = P(C_{ilt} = 1)$, $R_{lt} = P(R_{ilt} = 1)$, $T_{lt} = P(T_{ilt} = 1)$. These are defined at the county level and correspond to the fraction of the population in the county that has had Covid-19, that has tested positive for Covid-19, and that has been tested for Covid-19, respectively. We also suppose that we have access to county-level covariates $X_l$ that do not vary over time; the main covariate that we use is county population. Some of the results below consider policy effects on other outcomes; in that case we denote the county-level outcome in time period $t$ by $Y_{lt}$.

To think about the effect of Tennessee’s open-testing policy, we define potential outcomes for county $l$ in time period $t$. In particular, let $C_{lt}(1)$, $T_{lt}(1)$, $R_{lt}(1)$, and $Y_{lt}(1)$ denote the per capita number of total Covid-19 cases, the per capita number of tests, the per capita number of positive tests, as well as other county-level outcomes that would occur in county $l$ in time period $t$ if the open-testing policy were in place.
Similarly, if the policy is not in place for county $l$ in time period $t$, we denote the potential outcomes that would occur in this case by $C_{lt}(0)$, $T_{lt}(0)$, $R_{lt}(0)$, and $Y_{lt}(0)$. To conserve on notation, define $Z_{lt}(d) = (Y_{lt}(d), R_{lt}(d), T_{lt}(d), X_l')'$ for $d \in \{0, 1\}$. This collects the covariates and all potential outcomes except for $C_{lt}(d)$. Also, define $Z^*_{lt}(d) = (Z_{lt}(d)', C_{lt}(d))'$.

Next, let $D_l$ be a binary variable indicating treatment participation. For counties that participate in the open-testing policy (i.e., counties in Tennessee), $D_l = 1$; otherwise, $D_l = 0$. Also suppose that there are two time periods, $t^*$ and $t^* - 1$, and that the policy is implemented between time periods $t^* - 1$ and $t^*$. In this setup, we observe

$$Z_{lt^*} = D_l Z_{lt^*}(1) + (1 - D_l) Z_{lt^*}(0) \quad \text{and} \quad Z_{lt^*-1} = Z_{lt^*-1}(0)$$

In other words, in post-treatment time periods we observe “treated” potential outcomes for counties that participate in the treatment (i.e., counties in Tennessee) and observe “untreated” potential outcomes for counties that do not participate in the treatment (i.e., counties in Alabama). In pre-treatment time periods, we observe untreated potential outcomes for all counties; these are the outcomes under the baseline policy of restricting testing to individuals meeting the symptom requirements.

Evaluating the Effect of Open-Testing on Observed Outcomes
To start with, consider identifying the effect of Tennessee’s open-testing policy on some observed outcome (e.g., the number of confirmed Covid-19 cases or the number of trips to work) in county $l$ at time period $t^*$. We start with this case because it is simpler: $Y_{lt}$, the outcome, is fully observed while $C_{lt}$, the per capita number of total cases in county $l$, is not. Our interest in this section is in identifying

$$ATT_Y(Z_{lt^*-1}) = E[Y_{lt^*}(1) - Y_{lt^*}(0) \mid Z_{lt^*-1}, D_l = 1] \quad \text{and} \quad ATT_Y = E[Y_{lt^*}(1) - Y_{lt^*}(0) \mid D_l = 1]$$

$ATT_Y(Z_{lt^*-1})$ is the average effect of the open-testing policy on the outcome for counties in Tennessee with pre-treatment characteristics $Z_{lt^*-1}$. $ATT_Y$ is the overall average effect of the open-testing policy on the outcome across all counties in Tennessee.

Footnote: Also, notice that we do not need to estimate these quantities; rather each of them is exactly observed.

Footnote: Our results extend immediately to the case where there are more available time periods.

We make the following assumption.

Assumption 2 (Unconfoundedness). $E[Y_{lt^*}(0) \mid Z^*_{lt^*-1}(0), D_l = 1] = E[Y_{lt^*}(0) \mid Z^*_{lt^*-1}(0), D_l = 0]$

Assumption 2 is a standard and widely used assumption to identify the effect of some economic policy (see, for example, Imbens and Wooldridge (2009)). It says that, if the policy had not been enacted, on average outcomes in counties in Tennessee would have been the same as outcomes in counties in Alabama that had the same pre-treatment characteristics; i.e., the same outcomes in the previous period, the same per capita number of confirmed cases, the same number of per capita tests, the same population, as well as the same per capita number of total cases.

One cannot immediately use Assumption 2 because $Z^*_{lt^*-1}(0)$ includes $C_{lt^*-1}(0)$, the per capita number of total Covid-19 cases in a particular county, which is unobserved.
But, in practice, most outcomes in period $t^*$ are likely to depend heavily on how widespread Covid-19 has been, even if it has gone largely undetected. Therefore, it seems quite important to control for the (unobserved) number of cases. To address this issue, we make the following assumption.

Assumption 3 (Conditional Independence of Pre-Policy Total Covid-19 Cases). $C_{lt^*-1}(0) \perp\!\!\!\perp D_l \mid Z_{lt^*-1}(0)$

Assumption 3 says that, in the pre-treatment period, the distribution of the per capita number of total Covid-19 cases was the same for counties in Tennessee and Alabama that had the same pre-policy characteristics. To be clear here, Assumption 3 does not imply that the unobserved number of total cases is exactly the same across counties in Tennessee and Alabama. Rather, it rules out things like systematic differences in unobserved total cases in the pre-treatment period for counties in Tennessee and Alabama with similar populations and that had run a similar number of tests and had confirmed a similar number of cases. Under these assumptions, we have the following result.

Footnote: Assumption 3 is likely to be much more plausible when the observed conditioning variables include county-level number of tests. As noted above, county-level testing data is not widely available, and its availability for Alabama is a main reason that we focus on comparisons between counties in Tennessee and Alabama that have similar characteristics (including having run a similar number of tests, which is not available at the county level for any of the other states surrounding Tennessee).

Proposition 1.
Under Assumptions 2 and 3,
$ATT_Y(Z_{lt^*-1})$ and $ATT_Y$ are identified and given by

$$ATT_Y(Z_{lt^*-1}) = E[Y_{lt^*} \mid Z_{lt^*-1}, D_l = 1] - E[Y_{lt^*} \mid Z_{lt^*-1}, D_l = 0]$$

and

$$ATT_Y = E[ATT_Y(Z_{lt^*-1}) \mid D_l = 1]$$

The proof of Proposition 1 is in Appendix A. The result in Proposition 1 says that the average effect of Tennessee’s open-testing policy on the outcome can be obtained by the difference in average outcomes between counties in Tennessee and counties in Alabama that had the same outcomes, per capita number of confirmed cases, per capita number of tests, and population, all before Tennessee implemented its open-testing policy.

Evaluating the Effect of Open-Testing on Total Covid-19 Cases
Next, we consider trying to identify the effect of Tennessee’s open-testing policy on the per capita number of total Covid-19 cases. This is distinctly more challenging than the previous case because the total number of cases is not observed. To start with, we continue to make Assumption 3, and we modify Assumption 2 to hold for total Covid-19 cases:
Assumption 4 (Covid Unconfoundedness). $E[C_{lt^*}(0) \mid Z^*_{lt^*-1}(0), D_l = 1] = E[C_{lt^*}(0) \mid Z^*_{lt^*-1}(0), D_l = 0]$

Assumption 4 is analogous to Assumption 2 but for total Covid-19 cases. It says that, in the absence of the policy intervention, on average, the per capita number of total Covid-19 cases would have been the same for counties in Tennessee and Alabama that had the same pre-treatment characteristics (including the same number of unobserved total Covid-19 cases). Similarly to the previous section, we focus on identifying

$$ATT_C(Z_{lt^*-1}) = E[C_{lt^*}(1) - C_{lt^*}(0) \mid Z_{lt^*-1}, D_l = 1] \quad \text{and} \quad ATT_C = E[C_{lt^*}(1) - C_{lt^*}(0) \mid D_l = 1] \tag{5}$$

$ATT_C(Z_{lt^*-1})$ is the average effect of the policy on the per capita number of total Covid-19 cases for counties in Tennessee with pre-treatment characteristics $Z_{lt^*-1}$. $ATT_C$ is the overall average effect of the policy on the per capita number of total Covid-19 cases in Tennessee. In addition, all of the results in the previous section go through, suggesting that

$$\begin{aligned} ATT_C(Z_{lt^*-1}) &= E[C_{lt^*} \mid Z_{lt^*-1}, D_l = 1] - E[C_{lt^*} \mid Z_{lt^*-1}, D_l = 0] \\ &= P(C_{ilt^*} = 1 \mid Z_{lt^*-1}, D_l = 1) - P(C_{ilt^*} = 1 \mid Z_{lt^*-1}, D_l = 0) \end{aligned} \tag{6}$$

and

$$ATT_C = E[C_{lt^*} \mid D_l = 1] - E\big[ E[C_{lt^*} \mid Z_{lt^*-1}, D_l = 0] \mid D_l = 1 \big]$$

The problem here is that $C_{lt^*}$ is not observed. Instead, we only have the bounds given in Equation (4). The next proposition provides bounds on the policy effects on the total number of Covid-19 cases under the assumptions that we have made so far. It essentially holds by using the same bounds as in the previous section, invoking Assumptions 3 and 4, and then taking differences across counties in Tennessee and Alabama that have the same characteristics. Before stating this result, we define two more terms to conserve on notation below.
First, for $d \in \{0, 1\}$, define

$$\gamma_d(Z_{lt^*-1}) := P(C_{ilt^*} = 1, T_{ilt^*} = 1 \mid Z_{lt^*-1}, D_l = d)$$

This corresponds to the first term in Equation (1) (now conditional on $Z_{lt^*-1}$ and $D_l = d$) and is point identified. Second, for $d \in \{0, 1\}$, define

$$\tau_d(Z_{lt^*-1}) := P(T_{ilt^*} = 1 \mid Z_{lt^*-1}, D_l = d)$$

which is also an observed quantity.

Proposition 2.
Under Assumptions 1, 3 and 4,

$$C^{B,L}_{lt^*}(Z_{lt^*-1}) \le ATT_C(Z_{lt^*-1}) \le C^{B,U}_{lt^*}(Z_{lt^*-1})$$

where

$$C^{B,L}_{lt^*}(Z_{lt^*-1}) := \gamma_1(Z_{lt^*-1}) - \gamma_0(Z_{lt^*-1}) - \gamma_0(Z_{lt^*-1}) \frac{1 - \tau_0(Z_{lt^*-1})}{\tau_0(Z_{lt^*-1})}$$

$$C^{B,U}_{lt^*}(Z_{lt^*-1}) := \gamma_1(Z_{lt^*-1}) - \gamma_0(Z_{lt^*-1}) + \gamma_1(Z_{lt^*-1}) \frac{1 - \tau_1(Z_{lt^*-1})}{\tau_1(Z_{lt^*-1})}$$

and

$$E\big[ C^{B,L}_{lt^*}(Z_{lt^*-1}) \mid D_l = 1 \big] \le ATT_C \le E\big[ C^{B,U}_{lt^*}(Z_{lt^*-1}) \mid D_l = 1 \big]$$

The proof of Proposition 2 is provided in Appendix A. These sorts of bounds arise under the combination of (i) standard identifying assumptions for policy effects and (ii) Assumption 1, that the probability of having had Covid-19 is no higher among untested individuals than among tested individuals. The term in common for each of the bounds, $\gamma_1(Z_{lt^*-1}) - \gamma_0(Z_{lt^*-1})$, comes from (i) differences in the number of confirmed cases per test between counties in Tennessee and Alabama with similar characteristics and (ii) differences in the testing rate between counties in Tennessee and Alabama with similar characteristics. The extra term for the lower bound comes from setting the fraction of untested individuals in counties in Tennessee who have had Covid-19 to be equal to zero while setting the fraction of untested individuals in counties in Alabama who have had Covid-19 to be equal to the fraction who have had Covid-19 conditional on being tested (this comes from the bound in Assumption 1). The upper bound comes from doing the opposite: setting the fraction of untested individuals who have had Covid-19 to be equal to zero for counties in Alabama and setting the fraction of untested individuals who have had Covid-19 to be as large as possible (under Assumption 1) for counties in Tennessee. The weights on these terms (the terms involving $\tau_0$ and $\tau_1$) also tend to be very large because the fraction of untested individuals is much larger than the fraction of tested individuals.
This implies that $\tau_d(Z_{lt^*-1}) \ll 1 - \tau_d(Z_{lt^*-1})$ for $d \in \{0, 1\}$.

The drawback of these bounds is that they are unlikely to be informative about the sign of the policy effect. To see this, notice that the terms involving $\gamma_d(Z_{lt^*-1})$ are often quite small. On the other hand, the extra terms can be orders of magnitude larger. In our case, these bounds cover 0 for all counties and are not very informative. In order to deliver tighter bounds, we make some additional assumptions.

Assumption 5 (Joint Unconfoundedness). $P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 0 \mid Z^*_{lt^*-1}(0), D_l = 1) = P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 0 \mid Z^*_{lt^*-1}(0), D_l = 0)$

Assumption 6 (Bound on Total Cases and Untested Individuals). $P(C_{ilt^*}(1) = 1, T_{ilt^*}(1) = 0 \mid Z_{lt^*-1}(0), D_l = 1) \le P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 0 \mid Z_{lt^*-1}(0), D_l = 1)$

Assumption 5 says that, in the absence of the policy, the joint probability that a person has Covid-19 and that they were not tested for Covid-19 is the same for individuals located in counties with similar pre-policy characteristics regardless of whether or not the county experiences the policy. This is essentially an unconfoundedness assumption (similar to, e.g., Assumption 4), but it strengthens that assumption to hold jointly for both the fraction of the population that have had Covid-19 and the fraction that have not been tested, both in the absence of the policy.

Assumption 6 says that, for individuals in counties that experience the policy, the joint probability of having Covid-19 and not being tested under the policy is no higher than the joint probability of having Covid-19 and not being tested in the absence of the policy.
We provide a set of more primitive conditions and more detailed discussion of this assumption in Appendix B.1. At a high level, though, this assumption is plausible under the conditions that (i) the open-testing policy does not make tests less available than they otherwise would have been without the policy, (ii) the open-testing policy does not increase the number of Covid-19 cases among the untested relative to what they would have been without the policy, and (iii) there is not negative selection into taking the test among those who would be tested under the policy but would not be tested in the absence of the policy.

Next, we discuss tighter bounds on the effect of the open-testing policy under these additional assumptions.

Footnote: In practice, the extreme cases that lead to the lower bound and upper bound seem unlikely to hold. This suggests that these bounds are likely to be quite conservative.

Proposition 3. Under Assumptions 1 and 3 to 6,

$$C^{C,L}_{lt^*}(Z_{lt^*-1}) \le ATT_C(Z_{lt^*-1}) \le C^{C,U}_{lt^*}(Z_{lt^*-1})$$

where

$$C^{C,L}_{lt^*}(Z_{lt^*-1}) := C^{B,L}_{lt^*}(Z_{lt^*-1})$$

$$C^{C,U}_{lt^*}(Z_{lt^*-1}) := \gamma_1(Z_{lt^*-1}) - \gamma_0(Z_{lt^*-1})$$

and

$$E\big[ C^{C,L}_{lt^*}(Z_{lt^*-1}) \mid D_l = 1 \big] \le ATT_C \le E\big[ C^{C,U}_{lt^*}(Z_{lt^*-1}) \mid D_l = 1 \big]$$

The proof of Proposition 3 is provided in Appendix A. Notice that the lower bound is the same as it was in the previous case, but that the upper bound can be substantially tighter. In particular, the upper bound does not contain the same extra term as in Proposition 2; as discussed earlier, this term is the "dominant" term in the upper bound, and it is removed under the additional conditions in Assumptions 5 and 6.

Moreover, recall that $\gamma_1(Z_{lt^*-1}) - \gamma_0(Z_{lt^*-1})$ comes from the difference between confirmed cases in counties in Tennessee relative to counties in Alabama with similar characteristics. The number of confirmed cases depends on two things.
First, it depends positively on differences in the fraction of positive cases per test among counties in Tennessee relative to counties in Alabama with similar characteristics; this term will tend to be negative if the policy is expanding testing availability to individuals who are less likely to have Covid-19. Second, it depends positively on differences in the fraction of individuals who take the test among counties in Tennessee relative to counties in Alabama with similar characteristics. This difference will tend to be positive if the policy is increasing the number of tests. Although it is hard to reason about whether the number of confirmed cases will go up in response to the policy, our estimates suggest that this term is somewhat negative; i.e., we find that the number of confirmed cases appears to decrease in response to the policy.

Footnote: This difference is also scaled by $(1 - \text{false negative rate of the test})^{-1}$, but the scaling does not matter for the sign of the upper bound.

Before concluding this section, it is worth mentioning that the upper bound in Proposition 3 is still likely to be quite conservative and discussing why this is the case. Omitting covariates below in order to simplify the expressions, notice that

$$\begin{aligned}
P(C_{ilt^*}(1) = 1 \mid D_l = 1) - P(C_{ilt^*}(0) = 1 \mid D_l = 1)
&= P(C_{ilt^*}(1) = 1, T_{ilt^*}(1) = 1 \mid D_l = 1) - P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 1 \mid D_l = 1) && \text{(A)} \\
&\quad + P(C_{ilt^*}(1) = 1, T_{ilt^*}(1) = 0 \mid D_l = 1) - P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 0 \mid D_l = 1) && \text{(B)}
\end{aligned}$$

which holds immediately by rewriting the marginal probabilities in terms of joint probabilities and rearranging. Term (A) is equal to $\gamma_1 - \gamma_0$ (the upper bound in Proposition 3) and is what we just discussed. Next, consider Term (B). This term is, in general, not point identified because it depends on cases among untested individuals. Assumption 6 says that this term is non-positive, and the upper bound comes from setting this term equal to 0.
In practice, though, it seems quite likely that this term would be negative (implying that our upper bound is conservative). To see this, notice that it can be written as

$$\begin{aligned}
&\underbrace{\Big( P(C_{ilt^*}(1) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) - P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \Big)}_{\text{Term (B.1)}} \, P(T_{ilt^*}(1) = 0 \mid D_l = 1) \\
&\quad + P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \, \underbrace{\Big( P(T_{ilt^*}(1) = 0 \mid D_l = 1) - P(T_{ilt^*}(0) = 0 \mid D_l = 1) \Big)}_{\text{Term (B.2)}}
\end{aligned} \tag{7}$$

First, it is likely that Term (B.1) $\le 0$. Term (B.1) holds fixed the group of individuals in a particular county that are untested (here, it is the group that would be untested under the open-testing policy), and compares the probability of untested individuals having Covid-19 under the open-testing policy relative to the absence of the policy. The condition will hold as long as the open-testing policy does not increase the probability of having Covid-19 for this group. The case that Term (B.2) is negative is even stronger. It will be negative if testing increases under the open-testing policy, which seems very likely to be the case.

The above discussion suggests that the bounds that we report on the effect of Tennessee's open-testing policy on the total number of cases are likely to be conservative. Open-testing may lead to fewer actual cases, but (holding the number of actual cases fixed) increased testing mechanically leads to a larger number of confirmed cases. If the number of confirmed cases is decreasing at the same time as the number of tests is increasing (which is what we find in the application), this is therefore a strong piece of evidence that Tennessee's policy is decreasing the total number of cases, even if we are not able to provide plausible assumptions that lead to point identification.
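The bounds in Propositions 2 and 3 are simple functions of $\gamma_d$ and $\tau_d$, so a small numerical sketch may help show why the Proposition 2 interval tends to cover zero while the Proposition 3 upper bound does not. The function name `att_bounds` and the input values are illustrative assumptions (the inputs are loosely based on the per-capita averages quoted later in the text and ignore the false-negative scaling), not estimates from the paper.

```python
def att_bounds(gamma1, gamma0, tau1, tau0):
    """Bounds on ATT_C for one covariate cell (illustrative helper).

    gamma_d: (scaled) confirmed cases per capita; d=1 Tennessee, d=0 Alabama.
    tau_d:   fraction of the population that has been tested.
    """
    common = gamma1 - gamma0
    # Lower bound (Propositions 2 and 3): no cases among the untested in
    # Tennessee, tested-level prevalence among the untested in Alabama.
    lower = common - gamma0 * (1.0 - tau0) / tau0
    # Upper bound in Proposition 2: the opposite extreme.
    upper_prop2 = common + gamma1 * (1.0 - tau1) / tau1
    # Upper bound in Proposition 3: the extra term is removed.
    upper_prop3 = common
    return lower, upper_prop2, upper_prop3


# Hypothetical inputs: 0.83 vs 1.68 confirmed cases and 23.4 vs 20.7 tests
# per 1000 individuals, expressed as fractions of the population.
lower, upper2, upper3 = att_bounds(
    gamma1=0.00083, gamma0=0.00168, tau1=0.0234, tau0=0.0207
)
print(lower, upper2, upper3)
```

With these inputs the Proposition 2 interval contains zero, whereas the Proposition 3 upper bound is simply the (negative) difference in confirmed cases, which mirrors the discussion above.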
The identification results above are constructive and suggest plug-in estimators of each parameter of interest. There are a number of possibilities here (e.g., regressions or weighting estimators), but we found it natural to use a matching estimator where, for each county in Tennessee, we found a "match" in Alabama based on pre-policy county characteristics, $Z_{lt^*-1}$. Overall, average effects can be calculated by taking the average outcome experienced by counties in Tennessee and subtracting the overall average outcome experienced by "matched" counties in Alabama.

In practice, to construct the matched dataset, we match on per capita tests and per capita confirmed cases on April 17 (the pre-treatment date when we observe tests by county in Alabama). We also match on county population. And, finally, we match on pre-treatment declines in county-level trips to work from the pre-Covid baseline.

To actually construct the matches, for county $l$ in Tennessee, we construct its match by choosing the county in Alabama that minimizes the squared standardized Euclidean distance

$$d(Z_{lt^*-1}, Z_{jt^*-1}) = (Z_{lt^*-1} - Z_{jt^*-1})' S^{-1} (Z_{lt^*-1} - Z_{jt^*-1})$$

where $j$ indexes counties in Alabama and $S$ is a diagonal matrix containing the diagonal elements of the variance matrix of $Z_{jt^*-1}$. Finally, we drop all counties that do not have a close match. This results in a matched dataset that includes 77 (out of 95 total) counties in Tennessee.

Another issue is that, in practice, it is not immediately clear how to conduct inference in our case. We essentially have no sampling uncertainty because we observe the exact number of confirmed cases and tests in each county for all counties in Tennessee and Alabama. In light of this, to conduct inference, we focus on design-based uncertainty (see, for example, Imbens and Rubin (2015) and Abadie, Athey, Imbens, and Wooldridge (2020)).
For some parameter of interest, we approximate its distribution under the sharp null hypothesis of no county-level effect of the policy by repeatedly randomly assigning a variable indicating participation in the policy among counties in Tennessee and their match in Alabama. Then, we construct a p-value by calculating the fraction of times that the estimated parameters that arise from repeatedly randomly assigning participation in the policy are at least as large in absolute value as the original estimated parameter of interest. This is similar to the sorts of randomization inference procedures used in the difference-in-differences, synthetic control, and matching literatures (e.g., Bertrand, Duflo, and Mullainathan (2004), Abadie, Diamond, and Hainmueller (2010) and Ferman (2019)).
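The matching and randomization-inference steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, matching is done with replacement, the "close match" cutoff is left to the caller because the text does not state one, and the p-value uses the conventional definition (the share of random reassignments whose estimate is at least as large in absolute value as the observed one). Reassigning treatment within a matched pair amounts to flipping the sign of that pair's outcome difference.

```python
import random
from statistics import variance


def match_counties(z_treated, z_control):
    """For each treated county, return (index of nearest control, distance).

    Distance is the squared Euclidean distance with each covariate scaled by
    its sample variance across control counties (the diagonal matrix S).
    """
    k = len(z_control[0])
    s = [variance([row[c] for row in z_control]) for c in range(k)]
    matches = []
    for z in z_treated:
        dists = [
            sum((z[c] - w[c]) ** 2 / s[c] for c in range(k)) for w in z_control
        ]
        j = min(range(len(z_control)), key=dists.__getitem__)
        matches.append((j, dists[j]))
    return matches


def randomization_pvalue(y_treated, y_matched, n_draws=2000, seed=0):
    """Average matched difference and its randomization p-value."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(y_treated, y_matched)]
    n = len(diffs)
    observed = sum(diffs) / n
    extreme = 0
    for _ in range(n_draws):
        # Under the sharp null, swapping labels within a pair flips the sign.
        placebo = sum(d * rng.choice((1.0, -1.0)) for d in diffs) / n
        if abs(placebo) >= abs(observed):
            extreme += 1
    return observed, extreme / n_draws


# Hypothetical covariates (two characteristics per county).
z_tn = [[0.0, 0.0], [10.0, 10.0]]
z_al = [[9.0, 11.0], [1.0, -1.0], [5.0, 5.0]]
pairs = match_counties(z_tn, z_al)
print(pairs)
```

Counties whose returned distance exceeds a chosen threshold could then be dropped before computing average differences; any such threshold is an assumption of the user rather than something specified in the text.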
Challenges to Identification
Before presenting our main results, we briefly discuss the timing of other policy decisions made by Tennessee and Alabama. We list the timing of implementing major policies by Tennessee and Alabama in Table 1. The timing of other policies is important in this context because (i) states have been implementing a number of policies in response to the Covid-19 pandemic and (ii) if the policies themselves or the timing of implementing these policies differs substantially across Tennessee and Alabama, then our results would mix together the effects of the open-testing policy as well as other policy differences between Tennessee and Alabama.

Table 1 shows that the timing of major other policies has been very similar for Tennessee and Alabama. To give some examples, Alabama closed public schools on March 19 and Tennessee closed them on March 20. Tennessee had a stay-at-home order from April 2 to April 29; Alabama had a stay-at-home order from April 4 to April 30. These close similarities in terms of other policies across states provide one piece of evidence in favor of interpreting our results below as being due to the open-testing policy.

Footnote: Our matched dataset does not include Davidson County (which contains Nashville) or Shelby County (which contains Memphis). These are notable exclusions from our dataset, but they occur because it is relatively harder to find a reasonable match for large urban counties relative to less populated and rural counties.

Footnote: Randomization inference is discussed in more detail in the working paper version (Bertrand, Duflo, and Mullainathan (2002)).
Table 1: Timing of major policies in Tennessee and Alabama

Policy                            TN Start   TN End     AL Start   AL End
Schools Closed                    March 20   -          March 19   -
Gathering Restrictions            March 23   -          March 19   -
Stay-at-Home Order                April 2    April 29   April 4    April 30
Non-essential Businesses Closed   April 1    April 27   March 28   April 30

Sources: Institute for Health Metrics and Evaluation (IHME). https://covid19.healthdata.org/united-states-of-america/tennessee and https://covid19.healthdata.org/united-states-of-america/alabama
Descriptive Bounds on the Per Capita Number of Total Cases across Counties
Next, we compute bounds on the per capita number of total Covid-19 cases across counties, separately for Tennessee and Alabama. These results are available in Figure 4 for May 8 and in Figure 7 in Appendix C for April 17. For both states, the bounds are somewhat narrower by May 8 than they were on April 17; this should not be surprising as the number of tests had increased substantially in both states over time. Focusing on the bounds on May 8, it is immediately clear that the bounds on the per capita number of total cases tend to be noticeably tighter in Tennessee counties than in Alabama counties. The mean width of the bounds is 0.06 in Tennessee and 0.13 in Alabama.

It is also worth discussing the bounds in some particular cases. The two counties with the most informative bounds are Trousdale County and Bledsoe County, but these are both rural counties that have experienced large Covid-19 outbreaks in prisons and that have had extensive testing in those prisons. Besides those two counties, the county with the next highest upper bound on total Covid-19 cases is Davidson at 18%. Davidson County is home to Nashville, the capital and largest city in Tennessee. It is also interesting to consider why the upper bound for Davidson County is high. Recall that the upper bound mainly depends on the number of confirmed cases per test (the upper bound will tend to be high when this ratio is high). Outside of Trousdale and Bledsoe Counties, Davidson County has the highest number of confirmed cases per test (13%), which leads to the large upper bound. Out of the top 10 counties in terms of the upper bound on total Covid-19 cases (excluding Trousdale and Bledsoe), 7 are Davidson County and surrounding suburban counties or Shelby County (Memphis) and surrounding suburban counties. The other three are Perry County, Bedford County, and Grundy County, which are all rural.

The lower bound on the rate of total Covid-19 cases is uniformly quite small as it is primarily driven by the number of confirmed cases. To give an example, Davidson County has the 4th highest lower bound on total Covid-19 rates, and it is only 0.6%.
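To make the source of these descriptive bounds concrete, the sketch below computes them from county-level counts. It is a reconstruction under assumptions rather than the paper's exact formula: it uses only Assumption 1 as described in the text (prevalence among the untested lies between zero and prevalence among the tested), and the `false_negative_rate` argument, the function name, and the example counts are hypothetical.

```python
def total_case_bounds(confirmed, tests, population, false_negative_rate=0.0):
    """Bounds on the per capita number of total cases in one county.

    Lower bound: no untested individual has had Covid-19, so total cases
    equal (scaled) confirmed cases per capita.
    Upper bound: untested individuals are as likely to have had Covid-19 as
    tested individuals, so total cases equal (scaled) confirmed cases per test.
    """
    scale = 1.0 / (1.0 - false_negative_rate)  # assumed adjustment
    lower = scale * confirmed / population
    upper = scale * confirmed / tests
    return lower, upper


# Hypothetical county: 130 confirmed cases out of 1,000 tests and a
# population of 50,000.
lower, upper = total_case_bounds(confirmed=130, tests=1000, population=50000)
print(lower, upper, upper - lower)
```

Because the lower bound scales with confirmed cases per capita and the upper bound with confirmed cases per test, the interval narrows as a larger share of the population is tested, which is consistent with the narrower bounds on May 8 than on April 17.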
Main Results: Policy Effects of Tennessee’s Open-Testing Policy
[Figure 4. Panels: (a) Tennessee; (b) Alabama.]
Notes: Bounds on the total (unobserved) per capita Covid-19 cases by county in Tennessee and Alabama using the bounds described in the text. These bounds are for May 8, 2020.

Next, we consider the effect of Tennessee's open-testing policy on several different outcomes. First, we consider the effect of the policy on the number of tests and on the number of confirmed cases. These results are available in Panels (a) and (b) of Figure 5. The results in that figure come from comparing the actual number of tests or confirmed cases for particular counties in Tennessee to those of their "match" county in Alabama. On average, the number of tests per capita increased by 13% in counties in Tennessee relative to what they would have experienced if they had not enacted the policy. As discussed above, since we observe the full population, it does not seem appropriate to use conventional confidence intervals based on large samples from an infinite population. Instead, we consider the sharp null hypothesis that the number of tests in a particular county is exactly the same as it would be if the policy had not been enacted. Then, we can test if the policy is having an effect on the number of tests by comparing the estimated average effect of the policy on the number of tests to the distribution of this estimate when we randomly assign participation in the policy between each county in Tennessee and its match in Alabama. Under this null (as well as the maintained assumptions), we compute a p-value as the fraction of times that the estimate obtained when participation in the policy is randomly assigned is at least as large in absolute value as the original estimated effect of the policy. The p-value in this case is 0.359. This is not statistically significant at conventional significance levels, but it is at least suggestive that the policy is actually increasing the number of tests.

Footnote: This is computed by taking the average number of tests per capita across counties in Tennessee (23.4 per 1000 individuals), subtracting the average number of tests per capita in matched counties (20.7 per 1000 individuals), and dividing by the latter term.

[Figure 5. Panels: (a) Tests; (b) Confirmed Cases; (c) Work Trips.]
Notes: The effect of Tennessee's open-testing policy on the per capita number of tests, confirmed cases, and percentage change in trips to work by county. In Panel (a), the orange dots are the difference in per capita tests for counties in Tennessee relative to their matched county in Alabama. In Panel (b), the orange dots are the difference in per capita confirmed cases for counties in Tennessee relative to their matched county in Alabama. In Panel (c), the blue dots are the percentage change in trips to work for particular counties in Tennessee relative to their matched county in Alabama. The orange dots are the per capita change in confirmed cases over time for each county in Tennessee.

Next, we consider a similar exercise in terms of the number of confirmed Covid-19 cases. These results are available in Panel (b) of Figure 5. Here, we estimate that, on average, the open-testing policy decreased the number of confirmed cases by 50% relative to what they would have been in the absence of the policy. Using the same randomization inference procedure as above, we get a p-value of 0.006. These results suggest that Tennessee's open-testing policy is substantially decreasing the number of confirmed cases.

Footnote: This estimate comes from averaging the per capita number of confirmed cases across counties in Tennessee (0.83 per 1000 individuals), subtracting the average number of confirmed cases in their matched counties from Alabama (1.68 per 1000 individuals), and dividing by the latter term.

Next, we consider how the policy is affecting the number of trips to work. It is important to be careful here.
Our strategy is to see how the number of trips to work changes as a function of the change in the number of confirmed cases over time. Panel (c) of Figure 5 plots the percentage change in trips to work for counties in Tennessee relative to their matched county in Alabama along with the observed change in confirmed cases between April 17 and May 8 among counties in Tennessee. We regress the difference in trips to work among matched counties on the observed change in confirmed cases for counties in Tennessee and estimate that the number of trips to work declines by 1.28 percent when the number of confirmed cases per 1000 individuals increased by 1 between April 17 and May 8; this parameter has a p-value of 0.022. This result suggests that, at least to some extent, individuals in counties where the number of confirmed cases was increasing the most over time reduced their number of trips to work relative to the number of trips that they would have taken without the policy.

Footnote: Calculating the p-value here involves a slight modification to our approach for previous parameters of interest. As before, we repeatedly randomly assign participation in the policy across matches in Tennessee and Alabama, but now, at each iteration, we run a regression of the difference in number of trips to work between each county and its match (here, the "treated" county can be the opposite of what it is in the actual data) on the increase in cases over time. We calculate a p-value as the fraction of times that the estimated parameters coming from repeatedly randomly assigning treatment status are at least as large (in absolute value) as the original estimate.

Finally, we consider the effect of Tennessee's open-testing policy on the total number of per-capita Covid-19 cases. These results are provided in Figure 6. The lower bound on the policy effect is quite low: across counties, the average is a decrease of total cases by 99 per 1000 individuals relative to what they would have been without the policy. This is probably an unrealistically large effect as it occurs when no untested individuals in Tennessee have had Covid-19, but untested individuals in Alabama have the same probability of having Covid-19 as tested individuals. It is more interesting to consider the upper bound (this is the smallest possible effect of the policy given the data and identifying assumptions); see Panel (b) of Figure 6. The average of the upper bound is a reduction of total cases by 1.12 per 1000 individuals. This seems quite small, but it is also useful to compare this number to the number of confirmed cases per 1000 individuals, which is only 0.8. This suggests that even the upper bound on the effect of the policy should be interpreted as a non-trivial reduction in Covid-19 cases. The randomization p-value is 0.006. These results suggest that Tennessee's policy is leading to fewer total cases. At a higher level, that the upper bound is negative is driven by the fact that confirmed cases appear to have decreased in Tennessee due to the open-testing policy. This decrease in confirmed cases in the presence of an increase in total tests is a strong piece of evidence that Tennessee's open-testing policy is decreasing the total number of cases.

Footnote: Proposition 3, in combination with the expression in Equation (2), implies that the upper bound is a scaled version of the difference in confirmed cases across counties in Tennessee and counties in Alabama with similar characteristics. Therefore Panel (b) of Figure 6 is very similar to Panel (b) of Figure 5. For the same reason, the p-values reported in this section are exactly the same as well.

As one final step to help interpret our bounds, we consider a sensitivity analysis. Recall that our bounds on the effect of the open-testing policy on the per capita number of total cases primarily come from (i) the probability of having had Covid-19 conditional on not being tested being unknown and (ii) differences in the probability of having had Covid-19 conditional on not
being tested under the open-testing policy relative to the same probability in the absence of the policy (see Equation (7)). To perform a sensitivity analysis, we write

$$P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) = \alpha \, P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 1, D_l = 1) \tag{8}$$

$$P(C_{ilt^*}(1) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) = \beta \, P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1)$$

for $\alpha$ and $\beta$ between 0 and 1. The assumptions in the paper are consistent with any of these possible values for $\alpha$ and $\beta$, but researchers may have some priors on what values these parameters are more likely to take. Given values for $\alpha$ and $\beta$, this is enough to pin down the total number of Covid-19 cases (both under the policy and in the absence of the policy), and Table 2 reports the average number of Covid-19 cases per 1000 individuals in the absence of the policy and the percentage reduction in total Covid-19 cases due to the policy for those particular values of $\alpha$ and $\beta$. Notice that the number of total cases strongly depends on $\alpha$; that is, how likely untested individuals are to have had Covid-19 relative to tested individuals. The assumptions we have made in the paper allow for $\alpha$ to take any value between 0 and 1; it is hard to know precisely what value this should take, but one would suspect that it is at least somewhat less than one. Our overall lower bound occurs when $\alpha = 1$ and $\beta = 0$; as discussed above, this is the case where, under the policy, no untested individuals have had Covid-19, but, in the absence of the policy, the probability of having had Covid-19 among the untested is the same as the probability of having had Covid-19 among tested individuals. This results in a decrease in cases of 99%, but, in light of the previous discussion, seems like an unlikely case. It is even harder to get a good idea about what value $\beta$ should take. The open-testing policy has larger effects for small values of $\beta$. But even for large values of $\beta$, our estimates indicate substantial effects of the open-testing policy in reducing the total number of cases as long as $\alpha$ is not large.

Footnote: In particular, given values of $\alpha$ and $\beta$ (and, for simplicity, ignoring conditioning on covariates), the average effect of the policy, $P(C_{ilt^*}(1) = 1 \mid D_l = 1) - P(C_{ilt^*}(0) = 1 \mid D_l = 1)$, is given by

$$\gamma_1 - \gamma_0 + P(C_{ilt^*} = 1 \mid T_{ilt^*} = 1, D_l = 0) \Big( \alpha \beta \, P(T_{ilt^*} = 0 \mid D_l = 1) - \alpha \, P(T_{ilt^*} = 0 \mid D_l = 0) \Big)$$

and all terms are identified (this result holds immediately from Equations (7) and (9)).

[Figure 6. Panels: (a) Bounds; (b) Upper Bound.]
Notes: Panel (a) provides estimates of the overall bounds on the effect of Tennessee's open-testing policy on the per capita number of total cases separately by each county in Tennessee. Panel (b) reproduces the estimates of the upper bound alone with a modified scale.

[Table 2. Rows and columns indexed by $\beta$ and $\alpha$; numerical entries not recoverable.]
Notes: The table provides estimates of the percentage reduction of total Covid-19 cases due to Tennessee's open-testing policy for particular values of the sensitivity analysis parameters $\alpha$ and $\beta$, which are defined in Equation (8). The last row of the table provides the number of cases per 1000 individuals for particular values of $\alpha$ (and does not depend on $\beta$) in the absence of the open-testing policy.

In this paper, we have considered the effects of Tennessee's open Covid-19 testing policy which increased the number and accessibility of tests in the state. Understanding the effect of this policy is substantially hampered by nonrandom testing. Our approach has been to combine
standard policy evaluation identifying assumptions with additional, plausible assumptions that lead to bounds on the effect of the policy. Overall, we found suggestive evidence that Tennessee's policy has decreased the number of total and confirmed cases in Tennessee and allowed workers to decrease their trips to work in counties that had experienced greater increases in their confirmed case count over time. In this sense, it seems that Tennessee's open-testing policy has benefited the state.

Conditional on being able to increase the total number of tests, it is less clear if open-testing is, in some sense, optimal relative to other testing schemes. It seems that there are a number of tradeoffs here. Clearly, understanding the effects of the policy (as well as understanding things like the overall spread of Covid-19) would be greatly enhanced by random testing. On the other hand, more targeted testing could be useful for detecting outbreaks and protecting high risk groups.

References

[1] Abadie, Alberto, Susan Athey, Guido W Imbens, and Jeffrey M Wooldridge. "Sampling-based versus design-based uncertainty in regression analysis". Econometrica
Journal of the American Statistical Association
The Quarterly Journal of Economics
arXiv preprint arXiv:1909.05093 (2019).
[7] Friedson, Andrew I, Drew McNichols, Joseph J Sabia, and Dhaval Dave. "Did California's shelter-in-place order work? Early coronavirus-related public health effects". Working Paper. 2020.
[8] Hortaçsu, Ali, Jiarui Liu, and Timothy Schwieg. "Estimating the fraction of unreported infections in epidemics with a known epicenter: An application to Covid-19". Working Paper. 2020.
[9] Imbens, Guido W and Donald B Rubin. Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press, 2015.
[10] Imbens, Guido and Jeffrey Wooldridge. "Recent developments in the econometrics of program evaluation". Journal of Economic Literature
Journal of Econometrics (2020).
A Proofs

Proof of Proposition 1. The result follows because

$$\begin{aligned}
ATT_Y(Z_{lt^*-1}) &= E[Y_{lt^*}(1) \mid Z_{lt^*-1}, D_l = 1] - E[Y_{lt^*}(0) \mid Z_{lt^*-1}, D_l = 1] \\
&= E[Y_{lt^*}(1) \mid Z_{lt^*-1}, D_l = 1] - E\Big[ E[Y_{lt^*}(0) \mid Z^*_{lt^*-1}, D_l = 1] \,\Big|\, Z_{lt^*-1}, D_l = 1 \Big] \\
&= E[Y_{lt^*}(1) \mid Z_{lt^*-1}, D_l = 1] - E\Big[ E[Y_{lt^*}(0) \mid Z^*_{lt^*-1}, D_l = 0] \,\Big|\, Z_{lt^*-1}, D_l = 1 \Big] \\
&= E[Y_{lt^*}(1) \mid Z_{lt^*-1}, D_l = 1] - E\Big[ E[Y_{lt^*}(0) \mid Z^*_{lt^*-1}, D_l = 0] \,\Big|\, Z_{lt^*-1}, D_l = 0 \Big] \\
&= E[Y_{lt^*}(1) \mid Z_{lt^*-1}, D_l = 1] - E[Y_{lt^*}(0) \mid Z_{lt^*-1}, D_l = 0] \\
&= E[Y_{lt^*} \mid Z_{lt^*-1}, D_l = 1] - E[Y_{lt^*} \mid Z_{lt^*-1}, D_l = 0]
\end{aligned}$$

which is the result. The first equality is the definition of $ATT_Y(Z_{lt^*-1})$; the second equality holds by the law of iterated expectations (the outer expectation averages over the distribution of $C_{lt^*-1}(0)$ conditional on $Z_{lt^*-1}$ and $D_l = 1$); the third equality holds by Assumption 2; the fourth holds by Assumption 3; the fifth equality holds by the law of iterated expectations; and the sixth equality holds because $Y_{lt^*}(1)$ is the observed outcome when $D_l = 1$ and $Y_{lt^*}(0)$ is the observed outcome when $D_l = 0$. The result for $ATT_Y$ holds immediately by averaging $ATT_Y(Z_{lt^*-1})$ over the distribution of $Z_{lt^*-1}$ conditional on $D_l = 1$.

Proof of Proposition 2. First, recall that

$$\begin{aligned}
ATT_C(Z_{lt^*-1}) &= E[C_{lt^*}(1) - C_{lt^*}(0) \mid Z_{lt^*-1}, D_l = 1] \\
&= E[C_{lt^*} \mid Z_{lt^*-1}, D_l = 1] - E[C_{lt^*} \mid Z_{lt^*-1}, D_l = 0] \\
&= P(C_{ilt^*} = 1 \mid Z_{lt^*-1}, D_l = 1) - P(C_{ilt^*} = 1 \mid Z_{lt^*-1}, D_l = 0)
\end{aligned}$$

where the second equality holds using the same arguments as in the proof of Proposition 1 and the third equality holds by the definition of $C_{lt^*}$. Omitting the dependence on $Z_{lt^*-1}$ for notational simplicity, and then plugging in Equation (1) and the definition of $\gamma_d(Z_{lt^*-1})$, further implies that

$$\begin{aligned}
P(C_{ilt^*}(1) = 1 \mid D_l = 1) - P(C_{ilt^*}(0) = 1 \mid D_l = 1) &= \gamma_1 - \gamma_0 \\
&\quad + \underline{P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 1)} \, P(T_{ilt^*} = 0 \mid D_l = 1) \\
&\quad - \underline{P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 0)} \, P(T_{ilt^*} = 0 \mid D_l = 0)
\end{aligned} \tag{9}$$

where the two underlined terms are not identified because the number of Covid-19 cases is not observed for individuals that have not been tested. But bounds on the effect of Tennessee's open-testing policy on total Covid-19 cases arise from restrictions on these terms. In particular, Assumption 1 says that, for $d \in \{0, 1\}$,

$$P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = d) \le P(C_{ilt^*} = 1 \mid T_{ilt^*} = 1, D_l = d)$$

$C^{B,U}_{lt^*}(Z_{lt^*-1})$, the upper bound in the proposition, comes from setting $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 1) = P(C_{ilt^*} = 1 \mid T_{ilt^*} = 1, D_l = 1)$ (its maximum value under Assumption 1) and from setting $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 0) = 0$. $C^{B,L}_{lt^*}(Z_{lt^*-1})$, the lower bound in the proposition, comes from setting $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 1) = 0$ and from setting $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 0) = P(C_{ilt^*} = 1 \mid T_{ilt^*} = 1, D_l = 0)$ (its maximum value under Assumption 1). The bounds on $ATT_C$ arise from averaging over the bounds for $ATT_C(Z_{lt^*-1})$ as discussed in the text.

Next, we provide an auxiliary result that is useful for proving Proposition 3.

Lemma 1. Under Assumptions 1 and 3 to 6,

$$P(C_{ilt^*} = 1, T_{ilt^*} = 0 \mid Z_{lt^*-1}, D_l = 1) \le P(C_{ilt^*} = 1, T_{ilt^*} = 0 \mid Z_{lt^*-1}, D_l = 0) \tag{10}$$

Proof. To show the result (and omitting conditioning on $Z_{lt^*-1}$), notice that

$$\begin{aligned}
P(C_{ilt^*} = 1, T_{ilt^*} = 0 \mid D_l = 1) &= P(C_{ilt^*}(1) = 1, T_{ilt^*}(1) = 0 \mid D_l = 1) \\
&\le P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 0 \mid D_l = 1) \\
&= P(C_{ilt^*} = 1, T_{ilt^*} = 0 \mid D_l = 0)
\end{aligned}$$

where the first equality holds because treated potential outcomes are observed outcomes when $D_l = 1$, the second line holds by Assumption 6, and the third line holds by Assumptions 3 and 5.

Proof of Proposition 3. Following the same logic as in the proof of Proposition 2 (see Equation (9) in particular) and continuing to omit conditioning on covariates to simplify the notation, the lower bound arises by making $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 1)$ as small as possible while making $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 0)$ as large as possible. Neither Assumption 5 nor Assumption 6 has any additional effect on these terms, though, so the lower bound remains unchanged.

For the upper bound, plugging the result of Lemma 1 into Equation (9) implies that

$$P(C_{ilt^*}(1) = 1 \mid D_l = 1) - P(C_{ilt^*}(0) = 1 \mid D_l = 1) \le \gamma_1 - \gamma_0$$

which implies the result for the upper bound of $ATT_C(Z_{lt^*-1})$. The result for $ATT_C$ holds by averaging over the $Z_{lt^*-1}$ in $ATT_C(Z_{lt^*-1})$.

B More Details on Methodology
B.1 Additional Discussion on Assumption 6
This section gives some more primitive conditions for Assumption 6 to hold. We consider the following conditions:

Extra Conditions:

(i) $P(C_{ilt^*}(1) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) \le P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1)$

(ii) $P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 1, D_l = 1) \ge P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 0, D_l = 1)$

(iii) $T_{ilt^*}(1) = 0 \implies T_{ilt^*}(0) = 0$

Extra Condition (i) says that the probability of untested individuals having Covid-19 does not increase under the policy relative to the absence of the policy, holding the group of untested individuals fixed (here, it is equal to the group that would be untested under the policy).

Extra Condition (ii) says that the probability of having Covid-19 is greater for the group of individuals that would be tested if the policy is implemented but not tested if the policy is not implemented than for the group of individuals that would not be tested under either policy.

Footnote: Another way to explain this condition is that there is positive self-selection into taking the test among individuals that become tested under the open-testing policy but would not have been tested without the open-testing policy.

Extra Condition (iii) says that all individuals who would have been tested in the absence of the policy (i.e., individuals meeting the symptoms requirement and who had sought a test) would continue to be tested under the open-testing policy.

Next, notice that Assumption 6 holds if the following difference is less than or equal to 0.

$$\begin{aligned}
&P(C_{ilt^*}(1) = 1, T_{ilt^*}(1) = 0 \mid D_l = 1) - P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 0 \mid D_l = 1) \\
&\quad = \underbrace{\Big( P(C_{ilt^*}(1) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) - P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) \Big)}_{\text{Term (A)}} \, P(T_{ilt^*}(1) = 0 \mid D_l = 1) \\
&\qquad + \underbrace{\Big( P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) - P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \Big)}_{\text{Term (B)}} \, P(T_{ilt^*}(1) = 0 \mid D_l = 1) \\
&\qquad + P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \, \underbrace{\Big( P(T_{ilt^*}(1) = 0 \mid D_l = 1) - P(T_{ilt^*}(0) = 0 \mid D_l = 1) \Big)}_{\text{Term (C)}}
\end{aligned}$$

where the equality holds by adding and subtracting $P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) P(T_{ilt^*}(1) = 0 \mid D_l = 1)$ and $P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) P(T_{ilt^*}(1) = 0 \mid D_l = 1)$. Term (A) $\le 0$ by Extra Condition (i). For Term (B), notice that

$$\begin{aligned}
P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) &= P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 0, D_l = 1) \, P(T_{ilt^*}(1) = 0 \mid T_{ilt^*}(0) = 0, D_l = 1) \\
&\quad + P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 1, D_l = 1) \, P(T_{ilt^*}(1) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1)
\end{aligned}$$

which holds by the law of total probability. Then, applying Extra Condition (ii) implies that

$$P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \ge P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 0, D_l = 1)$$

and Extra Condition (iii) additionally implies that

$$P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \ge P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1)$$

which implies that Term (B) $\le 0$. That Term (C) $\le 0$ follows from Extra Condition (iii), which implies that $P(T_{ilt^*}(1) = 0 \mid D_l = 1) \le P(T_{ilt^*}(0) = 0 \mid D_l = 1)$.

C Additional Figures

Figure 7: Bounds on Per Capita Number of Pre-Policy Total Covid-19 Cases by County. Panels: (a) Tennessee; (b) Alabama.