A Selective Review of Negative Control Methods in Epidemiology
Xu Shi ∗, Wang Miao, and Eric Tchetgen Tchetgen
Department of Biostatistics, University of Michigan; School of Mathematical Sciences, Peking University; Statistics Department, The Wharton School, University of Pennsylvania
Abstract
Purpose of Review
Negative controls are a powerful tool to detect and adjust for bias in epidemiological research. This paper introduces negative controls to a broader audience and provides guidance on principled design and causal analysis based on a formal negative control framework.
Recent Findings
We review and summarize causal and statistical assumptions, practical strategies, and validation criteria that can be combined with subject matter knowledge to perform negative control analyses. We also review existing statistical methodologies for detection, reduction, and correction of confounding bias, and briefly discuss recent advances towards nonparametric identification of causal effects in a double negative control design.
Summary
There is great potential for valid and accurate causal inference leveraging contemporary healthcare data in which negative controls are routinely available. Design and analysis of observational data leveraging negative controls is an area of growing interest in health and social sciences. Despite these developments, further effort is needed to disseminate these novel methods to ensure they are adopted by practicing epidemiologists.

Keywords: bias correction, bias detection, bias reduction, negative control, unmeasured confounding.

∗ Email: [email protected]. The authors have no conflicts to disclose. Human and Animal Rights: This article does not contain any studies with human or animal subjects performed by any of the authors.

Introduction
Despite ongoing efforts to improve study design and statistical analysis of epidemiological research, failure to rule out non-causal explanations of empirical findings has prompted substantial discussion in the health sciences [1, 2]. A powerful tool increasingly recognized to mitigate bias is negative control study design and analysis [3–5]. Negative controls have a long history in laboratory experiments and epidemiology [3, 6–8]. However, they have mainly been used to detect bias rather than to remove bias. More recent methodological advances that enable both bias detection and bias removal have not been fully recognized. As a result, the potential for valid and accurate causal inference leveraging contemporary healthcare data with abundant negative controls has to date not been fully realized. This paper aims to introduce negative controls to a broader audience and provide guidance on principled design and causal analysis based on a formal negative control framework. We focus on resolving bias due to unmeasured confounding in observational studies, although negative controls have recently also been used to tackle a variety of biases such as selection bias [3, 4, 9], measurement bias [3, 4], and homophily bias [10, 11] in both observational studies and randomized trials [5].
A negative control outcome (NCO) is a variable known not to be causally affected by the treatment of interest. Likewise, a negative control exposure (NCE) is a variable known not to causally affect the outcome of interest. To the extent possible, both NCO and NCE should be selected such that they share a common confounding mechanism with the exposure and outcome variables of primary interest, although this is not always necessary [12, 13]. These known-null effects have been used to detect residual confounding bias: presence of an association between the NCE and the outcome (or between the NCO and the exposure) constitutes compelling evidence of residual confounding bias, while absence of such an association implies no empirical evidence of such bias. For example, in a study of the effects of influenza vaccination on influenza hospitalization in the elderly (Figure 1), injury/trauma hospitalization was considered as an NCO because it cannot be causally affected by influenza vaccination, but may be subject to the same confounding mechanism, mainly driven by health-seeking behavior [14]. The authors found that despite efforts to control for confounding, influenza vaccination not only appeared to reduce risk of influenza hospitalization after influenza season (risk ratio 0.82, 95% CI 0.73–0.92), but also appeared to reduce risk of injury/trauma hospitalization (risk ratio 0.83, 95% CI 0.75–0.91). This was interpreted as evidence of bias due to inadequately controlled confounding. Likewise, annual wellness visit history can be considered as an NCE as it is unlikely to cause flu-related hospitalization.

In the following, we adopt the potential outcome framework, which we use to formally define causal effects as well as to articulate sufficient identification conditions to perform valid causal inferences from observational data.
We proceed under the fundamental assumption that for each subject in the target population there exists a potential outcome variable Y(a) that would be observed if, possibly contrary to fact, the subject were exposed to treatment value a, for all possible treatment values a in a set 𝒜. In the common setting where the treatment is dichotomous, 𝒜 = {0, 1}, the assumption states that each subject has a well-defined pair of potential outcomes (Y(0), Y(1)) corresponding to their outcome under control treatment a = 0 and active treatment a = 1, respectively [15, 16]. In such a setting, our goal is to make inferences about the population average treatment effect (ATE), defined as ATE = E[Y(1) − Y(0)].

[Figure 1: directed acyclic graph with nodes flu shot (A), influenza hospitalization (Y), health-seeking behavior (U, unmeasured), annual wellness visit history (Z, NCE), injury/trauma hospitalization (W, NCO), and physician preference (IV).]
Figure 1: An illustrative example of different types of negative controls: consider studying the causal effect of flu shot (A) on influenza hospitalization (Y), subject to confounding by unmeasured health-seeking behavior (U). Annual wellness visit history (Z) is an NCE which does not causally affect Y. Injury/trauma hospitalization (W) is an NCO which is not causally affected by A. Both Z and W are proxies of health-seeking behavior. Physician’s prescribing preference (IV) is an instrumental variable which likely induces variation in the choice of treatment, and may not affect the outcome other than through its influence on the treatment. As discussed in Sections 1.1 and 3.1, both a valid instrumental variable and an invalid instrumental variable associated with U are valid NCEs. All arguments are made implicitly conditional on measured covariates X. Independence between A and Z (or Y and W) conditional on U is not necessary. See more examples in Table A.1 of the Appendix.

Now, consider an observational study in which one observes independent and identically distributed samples of (Y, A, X), where A is a subject’s observed binary treatment assignment, Y is his/her observed outcome, and X are observed confounders of the association between A and Y. We sometimes refer to A as the primary treatment and Y as the primary outcome. We assume that the treatment is defined with enough specificity such that among subjects with A = a, the observed outcome Y is a realization of the potential outcome value Y(a), that is,

Assumption 1 (Consistency). Y(a) = Y when A = a.

Much of the literature on causal inference in observational studies relies on the strong assumption of no unmeasured confounding for the purpose of identification, i.e., A ⊥⊥ Y(a) | X, which is sometimes referred to as the ignorability assumption. This assumption essentially rules out the existence of unmeasured common causes, denoted as U, of the treatment and outcome variables. It is untestable and is often the source of much skepticism about causal interpretation of associations found in observational data. We do not make such an ignorability assumption to establish causation. Instead, we invoke the following assumption that describes the relationship between treatment and outcome in the presence of both measured and unmeasured confounding.

Assumption 2 (Latent ignorability). A ⊥⊥ Y(a) | U, X.
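To make the contrast between ignorability and latent ignorability concrete, the following is a minimal simulation sketch. It is not taken from any of the reviewed studies: the data-generating values (a binary U, a true ATE of 1.0, and the function names `simulate`, `naive_difference`, `u_standardized_difference`) are hypothetical choices made for illustration, and X is omitted. Under these assumptions, the crude treated-versus-control contrast is confounded by U, while standardizing over U, which would be possible only if U were observed, recovers the ATE.

```python
import random


def simulate(n=40000, seed=1):
    """Draw (u, a, y) with a binary unmeasured confounder U and a true ATE of 1.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = 1 if rng.random() < 0.5 else 0                   # unmeasured confounder (hypothetical)
        a = 1 if rng.random() < (0.8 if u else 0.2) else 0   # treatment depends on U only
        y0 = 2.0 * u + rng.gauss(0.0, 1.0)                   # potential outcome Y(0)
        y1 = y0 + 1.0                                        # potential outcome Y(1)
        y = y1 if a == 1 else y0                             # consistency: Y = Y(A)
        rows.append((u, a, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Crude contrast E[Y | A = 1] - E[Y | A = 0], ignoring U."""
    treated = [y for _, a, y in rows if a == 1]
    control = [y for _, a, y in rows if a == 0]
    return mean(treated) - mean(control)


def u_standardized_difference(rows):
    """Average the within-U contrasts over the marginal distribution of U."""
    total = 0.0
    for level in (0, 1):
        stratum = [(a, y) for u, a, y in rows if u == level]
        contrast = (mean([y for a, y in stratum if a == 1])
                    - mean([y for a, y in stratum if a == 0]))
        total += contrast * len(stratum) / len(rows)
    return total


rows = simulate()
naive = naive_difference(rows)
adjusted = u_standardized_difference(rows)
print(f"naive: {naive:.2f}  standardized over U: {adjusted:.2f}  truth: 1.00")
```

In practice U is not observed, so the second estimator is infeasible; negative controls are introduced precisely to stand in for the missing information about U.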
In addition to (A, Y, X), suppose that one has also observed a secondary outcome W and/or a secondary exposure Z, and let Y(a, z) and W(a, z) denote the corresponding counterfactual values that would be observed had the primary treatment and secondary exposure taken value (a, z). W and Z are formally defined as negative control outcome and exposure variables provided that the following assumptions hold.

Assumption 3 (Negative control outcome). W(a, z) = W and W ⊥⊥ A | U, X.

Assumption 4 (Negative control exposure). Y(a, z) = Y(a) and Z ⊥⊥ (Y(a), W) | U, X.

Assumptions 3 and 4 entail: (1) there is no remaining unmeasured common cause between (A, Z) and (Y, W) conditional on (U, X); (2) there is no causal effect of Z on Y conditional on U, A and X, and there is no causal effect of A and Z on W conditional on U and X, which are referred to as the exclusion restrictions. We refer to a pair of W and Z as the double negative control. It is not necessary to have both an NCO and an NCE, although the double negative control will be sufficient for nonparametric identification of the ATE as detailed in Section 3.2.

Figure 1 illustrates a directed acyclic graph (DAG) encoding the above assumptions. Consider a study of the effectiveness of flu shot (A) on influenza-related hospitalization (Y). A major concern in such studies is potential hidden bias due to unmeasured health-seeking behavior (U), a well-known common cause of flu shot status and influenza hospitalization. In such a study, routinely captured information on a person’s annual wellness visit history offers a good candidate NCE (Z) satisfying Assumption 4, as it reflects a person’s tendency to engage in healthy behavior, and is unlikely to cause influenza hospitalization. Similarly, recorded data on a person’s injury/trauma hospitalization provides a compelling candidate NCO (W) satisfying Assumption 3, as it is likely associated with health-seeking behavior and unaffected by flu shot. In addition, we can view an instrumental variable (IV) as an NCE [12, 17]. An IV is a pre-treatment variable satisfying the following three core assumptions: (IV relevance) the IV must be associated with the treatment; (Exclusion restriction) the IV must not have a direct effect on the outcome that is not mediated by the treatment; (IV independence) the IV must be independent of unmeasured confounders. For example, physician’s prescribing preference is often taken as an IV in comparative effectiveness studies, because it likely induces variation in the choice of treatment, and may not affect the outcome other than through its influence on the treatment [18].
A valid IV satisfies Assumption 4 and hence is a valid NCE, which is further explained in Section 3.1. Besides the above three IV conditions, a fourth condition is necessary to identify a causal effect, such as the monotonicity assumption or the no current treatment interaction assumption [19–22]. Alternatively, causal effect identification using an IV is also made possible by further incorporating an NCO under a double negative control framework introduced in Section 3.2.

It is important to note that Figure 1 is not the only DAG satisfying the negative control assumptions. For example, a more general DAG would allow Z to affect A, corresponding to the case where an annual wellness visit could result in flu vaccination during flu season. Moreover, physician preferences are not randomized and may be associated with U via physician-patient interactions, potentially violating the IV independence assumption. Such an invalid IV violating the IV independence assumption is still a valid NCE as long as the exclusion restriction holds, regardless of whether the IV relevance assumption holds. In this case, an NCO can be used to repair an invalid IV for causal effect identification under a double negative control framework [12, 17]. Additional DAGs illustrating settings in which Assumptions 2-3 hold are provided in Table A.1 of the Appendix. As demonstrated in [12] and [17], an NCE can be either a pre- or post-treatment variable. Unmeasured common causes of the Z-A association and Y-W association can also be present without necessarily invalidating Assumptions 3-4. A key insight is that a valid NCO does not necessarily need to be an outcome variable and may in fact precede the treatment in view, while a valid NCE need not necessarily be a treatment and may in fact be ascertained either together with the primary outcome of interest or subsequently.
In prior literature, NCO has been referred to as falsification outcome/end point [23–26], control outcome [14, 27, 28], secondary outcome [29, 30], supplementary response [6] and unaffected outcome [31]. NCE has been referred to as control exposure [27] and residual-confounding indicator [32, 33]. Both NCO and NCE have been referred to as proxies of the unmeasured confounder [34–36]. In addition, an exposure-outcome pair known a priori to be unrelated has also been referred to as a negative control pair [37–41].

The literature reviewed in the current paper is largely limited to papers that use the aforementioned nomenclature. Although [3] and [27] review negative control literature, to the best of our knowledge, this paper is the first to systematically summarize both formal causal and statistical methodology together with applications of negative controls. The rest of the paper is organized as follows. Design and validation of negative controls are discussed in Section 2. We then review both assumptions and methods for using negative controls to detect, reduce, and remove unmeasured confounding bias in Section 3. We use a simple example to illustrate double negative control adjustment (i.e., leveraging NCE and NCO when both are available) of confounding bias in Section 3.2. We close with a summary in Section 4.
Existing applications of negative controls mainly focus on detection of uncontrolled confounding bias. We list in Table 1 selected studies that employed negative controls to detect residual confounding and to strengthen causal conclusions. Among these studies, eight used NCEs and nine used NCOs. Table 1 is by no means comprehensive: hundreds of studies have leveraged negative control variables, as evidenced by the number of recent articles that have cited [3] as the foundational paper on the use of negative control exposures and outcomes in epidemiology. Rather, it is a representative set of examples that help illustrate strategies for identifying compelling candidate negative controls.
Effect of influenza vaccination on influenza hospitalization: using injury/trauma hospitalization as an NCO
As detailed in Section 1.1, to study the effects of influenza vaccination on influenza hospitalization in the elderly, injury/trauma hospitalization was taken as an NCO to detect confounding by unmeasured health-seeking behavior [14]. Influenza hospitalization before the flu season was also used as an NCO, because flu vaccine cannot protect against influenza hospitalization when there is little flu virus circulation.
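The detection logic of this design can be sketched on simulated data. The example below is illustrative only: it does not use or reproduce the data of [14], and the logistic data-generating model, its coefficients, and the helper names (`simulate`, `nco_risk_ratio`) are assumptions made here. Vaccination never affects the simulated NCO, so a risk ratio away from 1 can only reflect the shared unmeasured cause U.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n=20000, seed=2):
    """Simulate (a, w): vaccination status and a binary NCO, both driven by U."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                                  # unmeasured health-seeking behavior
        a = 1 if rng.random() < sigmoid(u) else 0                # vaccination, more likely when U is high
        w = 1 if rng.random() < sigmoid(-1.0 - 0.7 * u) else 0   # NCO: depends on U, never on A
        rows.append((a, w))
    return rows


def nco_risk_ratio(rows):
    """Risk ratio of the NCO by vaccination status with a Wald interval on the log scale."""
    n1 = sum(1 for a, _ in rows if a == 1)
    n0 = len(rows) - n1
    x1 = sum(w for a, w in rows if a == 1)
    x0 = sum(w for a, w in rows if a == 0)
    rr = (x1 / n1) / (x0 / n0)
    se_log = math.sqrt(1 / x1 - 1 / n1 + 1 / x0 - 1 / n0)
    log_rr = math.log(rr)
    ci = (math.exp(log_rr - 1.96 * se_log), math.exp(log_rr + 1.96 * se_log))
    return rr, ci, log_rr / se_log


rr, ci, z = nco_risk_ratio(simulate())
print(f"NCO risk ratio: {rr:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f}), z = {z:.1f}")
```

A confidence interval that excludes 1 for an outcome the treatment cannot affect is the signal of residual confounding; an interval that includes 1 would indicate no empirical evidence of such bias, not proof of its absence.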
Effect of maternal exposure on offspring outcomes: using paternal exposure as an NCE
A number of publications have used paternal exposure as an NCE to study the intrauterine effect of maternal exposure on offspring outcome. Specifically, [42–46] studied the association between maternal smoking and offspring outcomes, and compared paternal and maternal associations to detect potential bias due to unmeasured confounding by family-level factors or parental phenotypes. Similarly, [47] compared maternal and paternal distress and their associations with offspring asthma. Evaluation of the validity of paternal exposure as an NCE has also been considered in [48]. They found that cotinine levels from exposure to partner smoking were low in non-smoking pregnant women, which suggests that using paternal smoking as an NCE for investigating intrauterine effects is valid.
Effect of air pollution on health outcomes: using future air pollution as an NCE
Besides use of paternal exposures, NCEs are also used in air pollution studies. For example, [32, 33, 49, 50] studied statistical methods that utilize future air pollution as an NCE for bias detection and bias reduction, because the future is not expected to causally affect the past. In addition, [51] studied the effect of air pollutant on asthma, and leveraged two different NCEs: air pollutant level in the future and air pollutant level in a distant city.
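As a rough sketch of how a future exposure can flag confounding, the simulation below regresses the current outcome on current and next-period exposure and inspects the Wald statistic of the future-exposure coefficient. Several simplifications are assumptions made here rather than features of the cited methods: an ordinary linear model replaces the log-linear model used in the literature, the unmeasured confounder is a hypothetical autoregressive series, measured covariates are omitted, and the standard errors ignore serial correlation. All function names are illustrative.

```python
import math
import random


def invert(m):
    """Gauss-Jordan inverse of a small square matrix."""
    k = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(m)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]


def ols(columns, y):
    """Least squares of y on an intercept plus the given columns; returns (coef, se)."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(c * x for c, x in zip(coef, r))) ** 2
              for r, yi in zip(design, y))
    sigma2 = rss / (n - k)
    return coef, [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]


def future_exposure_wald(n=5000, confounded=True, seed=3):
    """Wald z statistic for the coefficient of A_{t+1} in a regression of Y_t on (A_t, A_{t+1})."""
    rng = random.Random(seed)
    u, us, exposure = rng.gauss(0.0, 1.0), [], []
    for _ in range(n + 1):
        us.append(u)
        exposure.append(u + rng.gauss(0.0, 1.0))   # A_t depends on U_t
        u = 0.8 * u + rng.gauss(0.0, 1.0)          # U_t is autocorrelated
    gamma = 1.0 if confounded else 0.0             # effect of U_t on Y_t
    y = [0.5 * exposure[t] + gamma * us[t] + rng.gauss(0.0, 1.0) for t in range(n)]
    coef, se = ols([exposure[:n], exposure[1:]], y)
    return coef[2] / se[2]


z_confounded = future_exposure_wald(confounded=True)
z_unconfounded = future_exposure_wald(confounded=False)
print(f"Wald z with U -> Y: {z_confounded:.1f}; without: {z_unconfounded:.1f}")
```

In the confounded scenario the future exposure predicts the current outcome only because both track U, which is what the test detects. A real time-series analysis would need standard errors that account for autocorrelation.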
In addition to the above examples, various negative control designs are also summarized in Table 1. Rather than detailing each study in Table 1, we summarize these studies below in terms of their respective strategies to identify negative control variables. A commonly used strategy to select negative controls leverages temporal and spatial constraints that essentially guarantee the exclusion restrictions in Assumptions 3-4. Temporal ordering leverages the universal truth that the future cannot causally affect the past. For example, as detailed above, [32, 33, 49–51] specify future measurements of air pollution as an NCE to study the effect of current air pollution on health outcomes. Similarly, [46] proposed to look at maternal exposure before and after pregnancy in studying the intrauterine effect of maternal exposure on offspring outcome. An essential prerequisite for this design is that the primary outcome does not cause subsequent exposure (at least in the short term), certainly a reasonable assumption in air pollution settings. Prior information about timing of exposure also sometimes allows one to leave out an essential ingredient [3]. For instance, [14] defined as NCO the number of hospitalizations prior to influenza season in order to estimate the effect of influenza vaccination on influenza hospitalization, as little to no flu circulates prior to flu season for influenza vaccination to be protective against. Spatial distancing has also been considered as an effective means to enforce exclusion restrictions in Assumptions 3-4. For instance, [51] took air pollutant level in a distant city as an NCE to study the effect of air pollutant on asthma.
[52, 53] studied screening sigmoidoscopy and mortality from colon tumor, and selected tumor from proximal colon that is beyond the reach of the sigmoidoscopy as an NCO.

| Reference | Exposure | Outcome | Negative Control Exposure | Negative Control Outcome |
|---|---|---|---|---|
| [42] | Maternal smoking | Low birth weight | Paternal smoking | |
| [43] | Maternal smoking | Sudden infant death syndrome | Paternal smoking | |
| [44] | Maternal smoking | Offspring height, ponderal index, body mass index | Paternal smoking | |
| [45] | Maternal smoking | Offspring blood pressure | Paternal smoking | |
| [47] | Maternal distress | Offspring asthma | Paternal distress | |
| [46, 48] | Maternal smoking, alcohol use or dietary patterns | Offspring development | Paternal smoking, alcohol use or dietary patterns | |
| [51] | Air pollutant | Asthma | Future air pollutant, air pollutant elsewhere | |
| [54] | Mammography-screening participation | Death from breast cancer | Dental-care participation | Death from causes other than breast cancer and from external causes such as accidents, intentional self-harm and assaults |
| [14] | Influenza vaccination | Mortality and pneumonia/influenza hospitalization | | Outcome before and after influenza season; injury/trauma hospitalization |
| [55] | Air pollutant | Asthma hospitalization | | Appendicitis hospitalization |
| [56–59] | Smoking | Mortality from lung cancer | | Other causes of death |
| [60] | Psychological stress post earthquake | Deaths from cardiac events | | Other causes of death, e.g. cancer |
| [52, 53] | Screening sigmoidoscopy | Mortality from distal colon tumor | | Mortality from proximal colon tumor (above the reach of the sigmoidoscopy) |

Table 1: Summary of selected applications using negative controls for detection of confounding bias.

Another strategy is to select as NCO an outcome analogous to the primary outcome but resulting from a mechanism known a priori to be unrelated to the primary treatment. As an illustration of this approach, consider [14], which took hospitalization due to injury/trauma as an NCO for the primary outcome, hospitalization due to influenza. Similarly, to evaluate the effect of air pollution on hospitalization due to asthma, [55] defined hospitalization due to appendicitis as an NCO. In addition, several studies routinely use death from other causes as NCO: [56–59] studied the effect of smoking on lung cancer with mortality from other causes as an NCO, [60] studied the effect of psychological stress on deaths from cardiac events after an earthquake with death from other causes as an NCO, and [54] selected death from causes other than breast cancer and from external causes such as accidents, intentional self-harm and assaults as NCO to estimate the effect of mammography-screening participation on breast cancer mortality.

Despite the various strategies in the literature to find candidate negative controls, researchers should rigorously validate the choice of negative controls and be aware of possible violations of negative control assumptions. Similar to the assumption of no unmeasured confounding, negative control assumptions (Assumptions 3 and 4) are causal assumptions that can only be established by subject matter considerations and not by empirical test without additional assumptions. In practice, we recommend checking the following criteria in finding a candidate negative control.

• “Irrelevant to Y (or A)”: The NCE should not cause the outcome of interest, while the NCO should not be caused by the treatment of interest nor the NCE. These conditions are formally implied by Assumptions 3 and 4.
• “Comparable to A (or Y)”: In most cases it is important to have the source of bias in mind before designing a negative control study, although this is not always necessary [12, 13]. The unmeasured confounding mechanism of negative controls should be comparable to that of A and Y in the following sense: the NCE must be associated with unmeasured confounders conditional on measured confounders and primary treatment; the NCO must be associated with unmeasured confounders conditional on measured confounders. Hence the negative control variable is often viewed as a proxy of the unmeasured confounders. A variable completely irrelevant to all mechanisms under consideration would not provide any useful information. These conditions are formally required by Assumptions 5 and 7 in Section 3.

• “Adequate Negative Control Power”: The NCE and NCO are not exceedingly rare relative to primary treatment and outcome variables, respectively. For example, in the event that the negative control variable is a rare binary variable, or if the association between unmeasured confounder and negative control variable is weak, then a large sample may be necessary to achieve sufficient power for detecting confounding bias [61, 62].

We list examples of possible violations of negative control assumptions in the Appendix.

Review of methods
Key assumption and rationale for bias detection
Assumptions 3 and 4 give rise to formal statistical tests of the null hypothesis that adjustment for observed covariates suffices to control for confounding bias, rejection of which indicates presence of an unmeasured confounder U. A key assumption for this bias detection strategy is that the negative control exposure or outcome is U-comparable to the primary exposure or outcome:

Assumption 5 (U-comparable). W ⊥̸⊥ U | X and Z ⊥̸⊥ U | A, X.

The U-comparability assumption requires that the unmeasured confounders U of the A-Y association are identical to those of the A-W association and Z-Y association, such that a non-null A-W or Z-Y association can be attributed to U. Therefore, presence of an association between primary and negative control variables implies residual confounding bias, while absence of such associations implies no empirical evidence of unmeasured confounding. It is important to note that when evaluating the Z-Y association one must also adjust for A to rule out the potential association between Z and Y due to the pathway Z − A → Y (the arrow between Z and A could either be Z → A or Z ← A). Examples of such relationships are listed in Table A.1 of the Appendix. Notably, conditional on X, a valid IV independent of U and associated with A satisfies Assumption 5 because of conditioning on a collider A on the IV → A ← U pathway [12, 17]; likewise an invalid IV that violates the IV independence assumption defined in Section 1.1 would also satisfy Assumption 5 regardless of whether IV and A are associated, as mentioned in Section 1.1.

Methods
As detailed in Section 2, the majority of existing applications used negative controls for bias detection, by testing for an association between primary and negative control variables. A review of bias detection methods is presented in Table 2. For example, [32] formalized bias detection as a Wald test of the coefficient of the NCE in a regression model of the outcome on the primary and negative control exposures. Moreover, [63, 64] noted that an invalid NCE that violates the exclusion restriction but satisfies the U-comparable assumption can nevertheless validate a causal interpretation when it does not appear to be associated with the outcome adjusting for the treatment of interest.

Summary of literature
Beyond bias detection, recent developments have made it possible to reduce and sometimes completely remove unmeasured confounding bias using negative controls. In air pollution studies, current and future pollutant levels are often positively correlated and are associated with unmeasured confounders in the same direction. In this setting, [33] showed that incorporating future air pollution, an NCE, in the outcome model can reduce confounding bias. Further bias attenuation was proposed in [49] by incorporating both past and future exposures. Bias reduction using an NCO was considered by [65] in estimation of the standardized mortality ratio, where the standardized mortality ratio of the NCO was used to reduce bias in that of the primary outcome. In addition, [38, 40] considered calibrating p-values and confidence intervals by deriving an empirical null distribution from the association between primary and negative control variables.

Table 2: Summary of published methodologies using negative controls for detection (D), reduction (R), and correction (C) of confounding bias. Each entry lists the reference and setting, the main assumptions besides Assumptions 2-5, and the methods.

D (detection)

[32]: Time-series study. Z = future air pollution A_{t+1}.
Main assumptions: (1) A_{t+1} ⊥ Y_t | A_t, U_t, X_t. (2) log[E(Y_t)] = α + β A_t + γ X_t + β_f A_{t+1}.
Methods: Bias detection by Wald test on β_f.

[63, 64]: Invalid NCE Z.
Main assumptions: (1) Violation of exclusion restriction: Y(a, z) ≠ Y(a). (2) Z is U-comparable with A: Z ⊥̸⊥ U | A, X.
Methods: No evidence of Z-Y association adjusting for A implies no residual confounding of the A-Y association.

R (reduction)

[33, 49]: Time-series study. Z = future air pollution A_{t+1}.
Main assumptions: (1) A_{t+1} ⊥ Y_t | A_t, U_t, X_t; A_{t+1} ⊥̸⊥ (A_t, U_t) | X_t. (2) Y_t(a_t, x_t, u_t) = β_0 + β_1 a_t + β_2 x_t + β_3 u_t + ε_t; E[ε_t | A_t = a_t, U_t = u_t, X_t = x_t] = 0. (3) E[U_t | A_t = a_t, A_{t+1} = a_{t+1}, X_t = x_t] = α_0 + α_1 a_t + α_2 x_t + α_3 a_{t+1}; sign(α_1) = sign(α_3). (4) E[A_{t+1} | A_t = a_t, X_t = x_t] = γ_0 + γ_1 a_t + γ_2 x_t; γ_1 > 0.
Methods: Bias reduction by fitting E[Y_t | A_t, X_t, A_{t+1}] instead of fitting E[Y_t | A_t, X_t]. Further bias reduction considered in [49] by incorporating X_{t+1} or A_{t−1}. Identification of β_1 is possible with multiple future exposures under an autoregressive model for the exposure time series.

[65]: Standardized mortality ratio in occupational cohort study.
Main assumptions: (1) E[Y(1) | X = k] / E[Y_ref | X = k] = exp(α_k − δ_k) and E[W | X = k] / E[W_ref | X = k] = exp(−ε_k). (2) sign(ε_k) = sign(δ_k) and 0 < |ε_k| < |δ_k|.
Methods: Adjust for bias δ_k via {E[Y(1) | X = k] E[W_ref | X = k]} / {E[Y_ref | X = k] E[W | X = k]}.

[38, 40]: Define negative controls as drug–outcome pairs where one believes no causal effect exists.
Main assumptions: (1) For a negative control drug-outcome pair, the effect estimate β_i ∼ N(θ_i, τ_i²), i = 1, . . . , n, where θ_i ∼ N(µ, σ²) is the true bias. (2) Under the null hypothesis H_0 of no treatment effect, the effect estimate β_{n+1} ∼ N(µ, σ² + τ_{n+1}²).
Methods: Estimate µ, σ by MLE with L(µ, σ | θ, τ) = Π_{i=1}^{n} ∫ p(β_i | θ_i, τ_i) p(θ_i | µ, σ) dθ_i. Calibrated p-value computed via Wald test of β_{n+1}. Confidence interval calibrated similarly using the distribution generated by positive controls.

C (correction)

[66, 67]: W, Y = time-to-event outcomes.
Main assumptions: (1) There exist monotonic functions that describe the U-Y and U-W associations: Y(0) = h_y(U, X), W = h_w(U, X). (2) Cox models for Y and W with hazard ratios e^{β_y} and e^{β_w}.
Methods: The hazard ratio measuring the causal effect of treatment is e^{β_y − β_w}.

[13, 68]: Generalized difference-in-differences using NCO.
Main assumptions: (1) There exist monotonic functions that describe the U-Y and U-W associations: Y(0) = h_y(U, X), W = h_w(U, X). (2) Positivity: if 0 < f_{W|A=1,X}(W∗) then 0 < f_{W|A=0,X}(W∗) < 1, where W∗ = (W | A = 1, X) is distributed as W in the exposed group.
Methods: The average treatment effect on the treated is E[Y(1) − Y(0) | A = 1] = E[Y | A = 1] − E[F^{−1}_{Y|A=0,X}(F_{W|A=0,X}(W∗))]. Generalized the difference-in-differences approach to the broader context of NCO.

[69]: Calibration using NCO.
Main assumptions: (1) W ⊥ A | X, Y(1), Y(0). (2) Rank preservation: Y = Y(0) + ΨA, and hence W ⊥ A | X, Y(0) by (1). (3) E[W | A, Y(0) = Y − ΨA, X] = β_0 + β_1 X + β_2 Y(Ψ) + β_3 A, where β_3 = 0 by (1).
Methods: The 95% CI for Ψ consists of all Ψ for which β̂_3(Ψ) ± 1.96 SE[β̂_3(Ψ)] contains 0. Under (1)-(3), fit E[W | A, Y, X] = β_0 + β_1 X + β_2 Y + β_Ψ A; then the causal effect is Ψ = −β_Ψ / β_2.

[70–72]: Removing unwanted variation in gene-expression analysis.
Main assumptions: (1) Y_{1×p} = X_{1×q} β_{q×p} + U_{1×r} Γ_{r×p} + ε_{1×p}, p ≥ r + 1. (2) W_{1×s} = U_{1×r} Γ^W_{r×s} + ε^W_{1×s}, s ≥ r, Rank(Γ^W_{r×s}) = r. (3) (ε, ε^W) ∼ N(0, diag(σ_1², . . . , σ_{p+s}²)), (ε, ε^W) ⊥⊥ (X, U). (4) U_{1×r} = X_{1×q} α_{q×r} + ε^U_{1×r}, ε^U ∼ N(0, I_r), ε^U ⊥⊥ X.
Methods: [70, 71]: Estimate U by factor analysis of (2), then estimate β from (1). [72]: Estimate Γ^W and Γ by factor analysis of Y = X(β + αΓ) + (ε^U Γ + ε) (5) and W = XαΓ^W + (ε^U Γ^W + ε^W) (6). Then estimate α from (6), and estimate β from (5).

[12, 17, 36]: Nonparametric identification.
Main assumptions: Assumption 7.
Methods: Identify h in E[Y | A, Z, X] = E[h(W, A, X) | A, Z, X], then ATE = E[h(W, A = 1, X)] − E[h(W, A = 0, X)].

Several methods were developed to achieve full bias removal, under certain assumptions such as monotonicity [13, 66–68], rank preservation [69], and a linear model for unmeasured confounding. Specifically, [66, 67] considered bias correction by using a negative control time-to-event outcome under a monotonicity assumption that describes the U-Y and U-W associations. Under a similar monotonicity assumption, [13] generalized the difference-in-differences method to an NCO method, which is further extended by [68]. In addition, [69] developed an outcome calibration approach with a rank preservation assumption under which the counterfactual primary outcome can account for the unmeasured confounding of the A-W association. Lastly, [70–72] assumed a linear model for the unmeasured confounder and proposed to estimate U by factor analysis.

Nonparametric identification in a double negative control design
The above methods remove unmeasured confounding bias under relatively stringent assumptions. [36] established sufficient conditions under which the ATE can be nonparametrically identified leveraging an NCE and an NCO, i.e., via a double negative control design [17]. That is, the ATE can be uniquely expressed as a function of the observed data distribution without imposing any restriction on the observed data distribution, such that distinct data generating mechanisms are guaranteed to lead to distinct ATE values. Further method developments include semiparametric estimation under categorical negative controls and unmeasured confounding [17] and alternative strategies to identify the ATE via a so-called confounding bridge function [12].

Double negative controls are widely available in health sciences. For example, in air pollution studies, [12] used future air pollution level and past health outcome as negative control exposure and outcome, respectively. [17] took two routinely monitored control outcomes from administrative healthcare data in vaccine safety studies as double negative control, in the setting where both control outcomes are independent of the primary outcome and satisfy both Assumption 3 and Assumption 4. In the influenza vaccine effectiveness research presented in Figure 1, annual wellness visit and injury/trauma hospitalization can serve as double negative control. In addition, when an IV is available, identification is made possible by further incorporating an NCO such as a pretreatment measurement of the outcome.

Below we will first detail the identification conditions established in [36] and then introduce identification methods proposed in [36] and [12].
Assumption 6 (Positivity). $0 < P(A = a, Z = z \mid X) < 1$ for all $a$, $z$.

Assumption 7 (Completeness). (a) For all $a$, $W \not\perp\!\!\!\perp Z \mid A = a, X$. (b) For any square integrable function $g$, if $E[g(W) \mid Z = z, A = a, X] = 0$ for almost all $z$, $a$, then $g(W) = 0$.

Assumption 6 is a regular positivity assumption ensuring that in all strata of $X$, there are always some individuals with $A = a$, $Z = z$ for all $a$, $z$. Assumption 7 is a commonly used completeness condition for identification [73]. Specifically, Assumption 7(a) essentially requires $U$-comparability. That is, both $Z$ and $W$ should be associated with $U$ such that variation in $U$ can be recovered from variation in $Z$ and $W$. Assumption 7(b) aims to ensure that the underlying unmeasured confounding mechanism in $E[Y \mid A, U]$ can be identified using $Z$ and $W$. For example, suppose $U$ is a binary variable. Then Assumption 7 further requires that $Z$ and $W$ have at least two categories, and that $E[W \mid A = a, Z = 1, X = x] - E[W \mid A = a, Z = 0, X = x]$ is not equal to zero for all $a$, $x$.

Rationale
In the presence of unmeasured confounding by a latent variable $U$, an observed difference in the outcome between the treatment and control groups is a combination of the underlying causal effect and confounding bias. One cannot directly disentangle the variation in the outcome due to the treatment from the unwanted variation due to $U$, as $U$ is not measured. We seek to indirectly remove such unwanted variation, i.e., unmeasured confounding bias, by leveraging available proxies of $U$. An important example of such a proxy is an NCO chosen to be associated with $U$ but not causally affected by the treatment (Figure 1). Therefore, any difference in the NCO, $W$, between the treatment and control groups can only be attributed to $U$. Such a difference can uncover the unwanted variation due to $U$ assuming that the $U$-$Y$ and $U$-$W$ associations are the same, and that there is no $U$-$A$ additive interaction on $Y$. An example of such $W$ is the pre-exposure baseline measure of the outcome, in which case bias adjustment reduces to the well-known difference-in-differences approach [13].

The above describes identification of the ATE under assumptions that are generally untenable, because the $U$-$Y$ and $U$-$W$ associations will often be on different scales, and there may be $U$-$A$ interactions in the model for $Y$. In order to nonparametrically identify unmeasured confounding bias, we make use of the NCE $Z$. Because $Z$ is associated with $Y$ or $W$ only through $U$, the ratio of the $Z$-$Y$ and $Z$-$W$ associations captures the ratio of the $U$-$Y$ and $U$-$W$ associations, allowing for $U$-$A$ interactions. In summary, leveraging a double negative control design one can nonparametrically identify the magnitude of unmeasured confounding bias via the following mechanism: the NCO uncovers the confounding bias up to a scale that reflects the difference between the $U$-$Y$ and $U$-$W$ associations, while the NCE recovers the scale leveraging the $Z$-$Y$ and $Z$-$W$ associations. This mechanism is further illustrated in an example below.

Example
To further illustrate the idea of identification using double negative control, consider a simple example where we assume the following linear structural equation models involving unmeasured confounding $U$, although the nonparametric identification proposed in [36] does not rely on any restriction about the data generating models. We suppress measured confounders $X$ to ease notation; all arguments are made implicitly conditional on $X$.

Had $U$ been measured, we could fit (1) and obtain the true causal effect, which is $\beta_{YA}$. When in fact $U$ is not measured, to leverage double negative control, we additionally assume the $U$-$W$ relationship in (2) and the $U$-$Z$ relationship in (3).

\begin{align}
E[Y \mid A, U] &= \beta_{Y0} + \beta_{YA} A + \beta_{YU} U \tag{1}\\
E[W \mid U] &= \beta_{W0} + \beta_{WU} U \tag{2}\\
E[U \mid A, Z] &= \beta_{U0} + \beta_{UA} A + \beta_{UZ} Z. \tag{3}
\end{align}

Models (1)–(3) imply the following models that one could actually fit using the observed data $(Y, A, W, Z)$. These models are obtained by replacing $U$ with $E[U \mid A, Z]$ in the primary and negative control outcome models (1) and (2), which is justified because, given $U$, $Z$ carries no further information about $Y$ beyond $A$, and $(A, Z)$ carry no further information about $W$.

\begin{align}
E[Y \mid A, Z] &\overset{(1)}{=} \beta_{Y0} + \beta_{YA} A + \beta_{YU} E[U \mid A, Z] \tag{4}\\
&\overset{(3)}{=} \beta_{Y0} + \beta_{YA} A + \beta_{YU} (\beta_{U0} + \beta_{UA} A + \beta_{UZ} Z) \tag{5}\\
E[W \mid A, Z] &\overset{(2)}{=} \beta_{W0} + \beta_{WU} E[U \mid A, Z] \tag{6}\\
&\overset{(3)}{=} \beta_{W0} + \beta_{WU} (\beta_{U0} + \beta_{UA} A + \beta_{UZ} Z). \tag{7}
\end{align}

From (1) we know that the true causal effect is $\beta_{YA}$. However, if one were to regress $Y$ on $A$ and $Z$ without accounting for $U$, such as in [33], then the coefficient of $A$ would be equal to $\beta_{YA} + \beta_{YU}\beta_{UA}$. Here $\beta_{YU}\beta_{UA}$ is the confounding bias, which arises when there exists a $U$ that is associated with both $Y$ and $A$. One cannot directly separate the confounding bias from the true causal effect because $U$ is not observed. Nevertheless, the coefficients in the observed models (5) and (7) allow us to infer $\beta_{YU}\beta_{UA}$. To facilitate discussion, we introduce notation for the coefficients in models (5) and (7).
Let $\delta_{YA} = \beta_{YA} + \beta_{YU}\beta_{UA}$ and $\delta_{YZ} = \beta_{YU}\beta_{UZ}$ denote the coefficients of $A$ and $Z$ in the primary outcome model (5), respectively, and let $\delta_{WA} = \beta_{WU}\beta_{UA}$ and $\delta_{WZ} = \beta_{WU}\beta_{UZ}$ denote the coefficients of $A$ and $Z$ in the negative control outcome model (7), respectively.

We detail three strategies to identify the unmeasured confounding bias $\beta_{YU}\beta_{UA}$ leveraging a single NCO, a single NCE, or the double negative control. First, we note that the coefficient of $A$ in the primary outcome model, $\delta_{YA}$, is a combination of both the true causal effect and confounding bias, whereas the coefficient of $A$ in the negative control outcome model, $\delta_{WA}$, reflects pure confounding bias because $A$ does not causally affect $W$. In fact, if the $U$-$Y$ and $U$-$W$ associations are equal on the additive scale, i.e., $\beta_{WU} = \beta_{YU}$, then $\delta_{WA}$ matches the confounding bias $\beta_{YU}\beta_{UA}$. That is, under the assumption of equal $U$-$Y$ and $U$-$W$ additive association, a form of “additive outcome equi-confounding” [13], the apparent treatment effect on the NCO is equal to the unmeasured confounding bias. Hence the causal effect can be recovered by backing out the association of the treatment with the NCO from the association of the treatment with the primary outcome. Note that in this scenario it is not necessary to have an NCE: one can fit the primary and negative control outcomes on treatment without adjusting for the NCE, and then take the difference in treatment effects. When the NCO is the baseline outcome, the above reduces to the difference-in-differences method [13].

Second, the coefficient of $Z$ in the primary outcome model, $\delta_{YZ}$, would be zero if there were no unmeasured confounding because $Z$ does not causally affect $Y$. Therefore, the coefficient of $Z$ in the outcome model reflects pure confounding bias. In fact, if the $U$-$A$ and $U$-$Z$ associations are equal on the additive scale, i.e., $\beta_{UA} = \beta_{UZ}$, then $\delta_{YZ}$ captures the bias $\beta_{YU}\beta_{UA}$ due to unmeasured confounding.
That is, under the assumption of equal $U$-$A$ and $U$-$Z$ additive association, a form of “additive treatment equi-confounding”, the NCE effect on the primary outcome is equal to the unmeasured confounding bias. Hence the causal effect is given by the difference in coefficients of treatment and NCE in the primary outcome model. Note that in this scenario it is not necessary to have an NCO: one can fit the primary outcome on treatment and NCE, and then take the difference in effects of treatment and NCE on $Y$.

In both scenarios described above, the “additive outcome equi-confounding” or “additive treatment equi-confounding” is a rather strong assumption, as it requires $Y$ and $W$, or $Z$ and $A$, to operate on the same scale. To relax these assumptions, we can leverage the double negative control. Specifically, if the $U$-$Y$ and $U$-$W$ associations are unequal, then $\delta_{WA}$ reflects pure confounding bias up to a scale which is equal to $\beta_{YU}/\beta_{WU}$. Because the $Z$-$Y$ ($Z$-$W$) association is a product of the $U$-$Z$ and $U$-$Y$ ($U$-$W$) associations, the ratio of the $Z$-$Y$ and $Z$-$W$ associations is equal to the ratio of the $U$-$Y$ and $U$-$W$ associations. That is, $\beta_{YU}/\beta_{WU} = \delta_{YZ}/\delta_{WZ}$. The confounding bias is thus equal to $\delta_{WA}$ scaled by $\delta_{YZ}/\delta_{WZ}$, and the true causal effect is given by $\delta_{YA} - \delta_{WA} \times \delta_{YZ}/\delta_{WZ}$. It is important to note that the first two adjustment methods are special cases of the general adjustment method, in that the confounding bias is always equal to $\delta_{WA}\delta_{YZ}/\delta_{WZ}$ across all three scenarios.

To summarize, the confounding bias

$$
\beta_{YU}\beta_{UA} = \delta_{WA}\delta_{YZ}/\delta_{WZ} =
\left\{
\begin{array}{lll}
\delta_{WA} & \text{if } \beta_{WU} = \beta_{YU} & \text{(8a)}\\
\delta_{YZ} & \text{if } \beta_{UA} = \beta_{UZ} & \text{(8b)}\\
\delta_{WA}\delta_{YZ}/\delta_{WZ} & \text{if } \beta_{WU} \neq \beta_{YU} \text{ and } \beta_{UA} \neq \beta_{UZ}. & \text{(8c)}
\end{array}
\right.
$$

Hence the true causal effect is identified as

$$
\beta_{YA} = \delta_{YA} - \delta_{WA}\delta_{YZ}/\delta_{WZ}. \tag{9}
$$

It is important to note that equation (9) is only meaningful when $\delta_{WZ}$ is not equal to zero. If $\delta_{WZ} = 0$ then either there is no evidence of the presence of $U$ and $\beta_{YU}\beta_{UA} = 0$, or a selected negative control variable is not sufficiently associated with $U$, violating Assumption 7. Similar arguments apply to $\delta_{WA}$ and $\delta_{YZ}$. In fact, as summarized in Table 2, many negative control methods detect, reduce, and remove unmeasured confounding bias using analogues of scenario (8a) [13, 65–67] and scenario (8b) [32, 33, 49].

In practice, identification via (9) relies on fitting the primary and negative control outcome models $E[Y \mid A, Z]$ and $E[W \mid A, Z]$. Alternatively, one could directly make an assumption about the underlying unmeasured confounding mechanism $E[Y \mid A, U]$, as proposed in [12]. To illustrate, consider again the example above. Let $\widetilde{U}_W = (W - \beta_{W0})/\beta_{WU}$; then by (2) $\widetilde{U}_W$ is a good proxy of $U$ in the sense that $E[\widetilde{U}_W \mid U] = U$. In particular, let $h(W, A) = \beta_{Y0} + \beta_{YA} A + \beta_{YU} \widetilde{U}_W$; then by (1) we have

\begin{align}
E[Y \mid A, U] &= E[h(W, A) \mid A, U], \tag{10}\\
E[Y \mid A, Z] &= E[h(W, A) \mid A, Z], \tag{11}
\end{align}

where (11) is obtained by taking the expectation of both sides of (10) conditional on $(A, Z)$. The above equations indicate that $h$ captures the relationship between the $U$-$Y$ and $U$-$W$ associations via (10), which can be identified by the relationship between the $Z$-$Y$ and $Z$-$W$ associations via (11). Because of this key observation, $h$ is referred to as the confounding bridge function in [12]. The functional form of $h$ is implied by (1) and (2). Once $h$ is identified, we have that $E[Y(a)] = E_U\{E[Y \mid A = a, U]\} = E[h(W, A = a)]$, where the last equality follows from (10). In practice, one may assume a familiar linear model for the functional form of $h$ that satisfies (10), such as

$$
h(W, A; \theta) = \theta_0 + \theta_A A + \theta_W W. \tag{12}
$$

Then under Assumption 7, $\theta$ can be identified by the population moment equation $E[g(A, Z)\{Y - h(W, A; \theta)\}] = 0$ using the generalized method of moments (GMM) [74], where $g(A, Z)$ is a user-specified vector of functions of $(A, Z)$ with at least as many components as $\theta$. With $\theta$ identified, the ATE is given by

$$
\text{ATE} = E[h(W, A = 1; \theta)] - E[h(W, A = 0; \theta)]. \tag{13}
$$

A simple version of the above GMM procedure can be realized via a two stage least squares procedure as follows [12]:

Stage I: regress $W$ on $A$ and $Z$ (with intercept), and obtain the fitted value $\widehat{W}$ as a proxy of $U$;
Stage II: regress $Y$ on $A$ (with intercept), adjusting for $\widehat{W}$;

then the coefficient of $A$ is the true causal effect $\beta_{YA}$ assuming (1) and (2). The two stage least squares approach given above provides a simple implementation of the NC method using existing and widely disseminated IV software packages such as the ivregress, ivreg, or ivreg2 command in Stata, the gmm, sem, ivpack, or AER package in R, and the SYSLIN procedure in SAS.
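
As a hedged illustration of equation (9) and of the two stage least squares procedure above, the following minimal simulation sketch generates data from linear models of the form (1) and (2), with (U, A, Z) jointly Gaussian so that a linear model of the form (3) holds. All numerical parameter values, the variable layout, and the helper names `ols` and `estimate_effects` are assumptions made for this example only; they do not come from the cited papers. The sketch compares the coefficient of A from regressing Y on A and Z (which ignores U), the estimate based on (9), and the two stage least squares estimate.

```python
import numpy as np


def ols(y, *regressors):
    """Least-squares fit of y on an intercept plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # coef[0] is the intercept


def estimate_effects(n=100_000, beta_YA=2.0, seed=0):
    """Simulate a linear double negative control setting (illustrative values)."""
    rng = np.random.default_rng(seed)
    U = rng.normal(size=n)                    # unmeasured confounder
    Z = 0.8 * U + rng.normal(size=n)          # NCE: related to Y and W only via U
    A = 0.7 * U + rng.normal(size=n)          # treatment, confounded by U
    W = 0.5 + 1.5 * U + rng.normal(size=n)    # NCO: model (2), not affected by A
    Y = 1.0 + beta_YA * A + 3.0 * U + rng.normal(size=n)  # model (1)

    # Observed-data regressions corresponding to models (5) and (7).
    _, d_YA, d_YZ = ols(Y, A, Z)
    _, d_WA, d_WZ = ols(W, A, Z)

    # Equation (9): subtract the rescaled NCO coefficient from the coefficient of A.
    eq9 = d_YA - d_WA * d_YZ / d_WZ

    # Two stage least squares: Stage I fits W on (A, Z); Stage II fits Y on (A, W_hat).
    W_hat = np.column_stack([np.ones(n), A, Z]) @ ols(W, A, Z)
    tsls = ols(Y, A, W_hat)[1]

    return {"unadjusted": d_YA, "eq9": eq9, "tsls": tsls}


if __name__ == "__main__":
    for name, value in estimate_effects().items():
        print(f"{name}: {value:.3f}")
```

In this linear setting the two adjusted estimates coincide up to floating point error, because the Stage I fitted value is a linear combination of the intercept, A, and Z, so both procedures project Y onto the same space. The unadjusted coefficient, by contrast, absorbs the confounding bias.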
Negative controls are innovative and important tools in observational studies. Development of negative control methods will encourage researchers to routinely check for evidence of confounding bias and rigorously adjust for residual confounding bias. Negative control variables are widely available in routinely collected healthcare data such as administrative claims and electronic health records data, because information on secondary treatments and outcomes beyond the primary treatment and outcome of interest is often recorded, and such secondary treatments and outcomes can potentially serve as negative controls. Therefore development of negative control methods is critical to unlocking the full potential of contemporary healthcare data and ultimately improving the validity of research findings. It is important to note that other sources of bias, such as selection bias and misclassification bias, are typical in routinely collected healthcare data. Developing negative control methods accounting for bias beyond residual confounding is thus an important area of future research.

We have specified statistical assumptions, practical strategies, and validation criteria that can be combined with subject matter knowledge to design negative control studies in Section 2. We also illustrated identification of the ATE by either fitting the observed primary and negative control outcome models or through an assumption on the unmeasured confounding mechanism followed by a simple two stage least squares procedure in Section 3. We believe that these examples can provide practical guidance on the use of negative control methods to a broader audience.

Appendix
A.1 Examples of invalid negative controls that violate some assumption
Violation 1: no arrow between U and W
There must be an arrow between U and W, because an NCO is a proxy of the unmeasured confounder. It recovers the confounding bias by reflecting variation due to U.

Violation 2: no arrow between U and Z, and Z ↛ A
The only scenario in which Z does not need to be associated with U is when Z is an instrumental variable (see first cell of Table A.1). In this case, A is a collider between Z and U, such that Z and U are marginally independent. Conditioning on a collider creates collider bias such that Z and U become conditionally dependent. The requirements about Z in Assumptions 5 and 7 are all made conditional on A. Therefore an instrumental variable is a valid NCE.

Violation 3: Y → W
If the outcome causes the NCO, then the treatment causes the NCO via the path A → Y → W, which violates Assumption 3.

Violation 4: Z → U ← W
The direction of the arrow between U and the negative control doesn’t always matter. For example, we can have Z → U, U → Z, W → U, or U → W. However, if both Z and W cause U, then U is a collider in the path Z → U ← W. In this case, conditional on U, Z and W will become associated. This violates Assumption 2.

A.2 Example of causal graphs encoding the negative control assumptions
Below we enumerate the possible relationships among Z, A, U and among Y, W, U in Table A.1. These partial graphs can be combined into a directed acyclic graph that encodes the negative control assumptions. Grey colored graphs are invalid because of violation of key assumptions.

Table A.1: Examples of graphs for Z, A, U relationships and for W, Y, U relationships. The two pieces of graphs can be combined into a directed acyclic graph that encodes the negative control assumptions. Grey colored graphs are invalid because of violation of key assumptions.
Examples of graphs for Z, A, U relationships (graph panels not reproduced)
Columns: Z → A (pre-treatment); A → Z (post-treatment); Z ⊥⊥ A.
No arrow between U and Z (may violate Assumption 5 and 7): Instrumental variable (IV); Violate Assumption 5 and 7; Violate Assumption 5 and 7.
U → Z: Invalid IV; Post-treatment proxy of U; Surrogate of U.
Z → U: May violate Assumption 2 if there is W → U.

Examples of graphs for W, Y, U relationships (graph panels not reproduced)
Columns: W → Y(a); Y(a) → W; Y(a) ⊥⊥ W | (U, X) (violate Assumptions 3 and 4).
No arrow between U and W (violate Assumption 5 and 7): Violate Assumption 5 and 7; Violate Assumptions 3, 5, and 7; Violate Assumption 5 and 7.
U → W: Violate Assumption 3.
W → U: May violate Assumption 2 if there is Z → U; Violate Assumption 3.

References
[1] John PA Ioannidis. “Why most published research findings are false”. In: PLOS Medicine
American Journal of Epidemiology
[3] •• Marc Lipsitch, Eric J Tchetgen Tchetgen, and Ted Cohen. “Negative controls: a tool for detecting confounding and bias in observational studies”. In:
Epidemiology
This paper is the first to formally define negative control exposure and outcome with conditions for bias detection as well as examples in epidemiology.
[4] Benjamin F Arnold, Ayse Ercumen, Jade Benjamin-Chung, and John M Colford Jr. “Brief report: negative controls to detect selection bias and measurement bias in epidemiologic studies”. In:
Epidemiology
Journal of the American Medical Association
Biometrics
Epidemiology
Experimental Design for Biologists. Cold Spring Harbor Laboratory Press, 2014.
[9] Zhihong Cai and Manabu Kuroki. “On identifying total effects in the presence of latent variables and selection bias”. In:
Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence. 2008, pp. 62–69.
[10] Lan Liu and Eric Tchetgen Tchetgen. “Regression-based Negative Control of Homophily in Dyadic Peer Effect Analysis”. In: arXiv preprint arXiv:2002.06521 (2020).
[11] Naoki Egami. “Identification of Causal Diffusion Effects Under Structural Stationarity”. In: arXiv preprint arXiv:1810.07858 (2018).
[12] • Wang Miao, Xu Shi, and Eric J Tchetgen Tchetgen. “A Confounding Bridge Approach for Double Negative Control Inference on Causal Effects”. In: (2020). In progress, a prior version can be found at https://arxiv.org/abs/1808.04945. This paper introduces the confounding bridge function that links primary and negative control outcome distributions for identification of the average treatment effect leveraging a negative control exposure.
[13] Tamar Sofer, David B Richardson, Elena Colicino, Joel Schwartz, and Eric J Tchetgen Tchetgen. “On negative outcome control of unobserved confounding as a generalization of difference-in-differences”. In:
Statistical Science
International Journal of Epidemiology
Statistical Science (1990), pp. 465–472.
[16] Donald B Rubin. “Estimating causal effects of treatments in randomized and nonrandomized studies.” In:
Journal of Educational Psychology
[17] • Xu Shi, Wang Miao, and Eric J Tchetgen Tchetgen. “Multiply robust causal inference with double negative control adjustment for categorical unmeasured confounding”. In:
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
This paper provides a general semiparametric framework for obtaining inferences about the average treatment effect under categorical unmeasured confounding and negative controls.
[18] M Alan Brookhart, Jeremy A Rassen, and Sebastian Schneeweiss. “Instrumental variable methods in comparative safety and effectiveness research”. In:
Pharmacoepidemiology and Drug Safety
Journal of the American Statistical Association
Epidemiology (2006), pp. 360–372.
[21] James M Robins. “Correcting for non-compliance in randomized trials using structural nested mean models”. In:
Communications in Statistics-Theory and methods
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Journal of the American Medical Association
Annals of Internal Medicine
BMC Public Health
Scientific Reports
Health Services Research
Design of observational studies. New York, NY: Springer-Verlag, 2010.
[29] Marcus R Munafò, Kate Tilling, Amy E Taylor, David M Evans, and George Davey Smith. “Collider scope: when selection bias can substantially influence observed associations”. In:
International Journal of Epidemiology
Journal of the American Statistical Association
Biometrika
Epidemiology
[33] • W Dana Flanders, Matthew J Strickland, and Mitchel Klein. “A new method for partial correction of residual confounding in time-series and other observational studies”. In:
American Journal of Epidemiology
This paper develops a regression-based method taking future air pollution as a negative control exposure to reduce residual confounding bias in a time-series study on air pollution effects.
[34] Xavier de Luna, Philip Fowler, and Per Johansson. “Proxy variables and nonparametric identification of causal effects”. In:
Economics Letters
150 (2017), pp. 152–154.
[35] Manabu Kuroki and Judea Pearl. “Measurement bias and effect restoration in causal inference”. In:
Biometrika
[36] •• Wang Miao, Zhi Geng, and Eric J Tchetgen Tchetgen. “Identifying causal effects with proxy variables of an unmeasured confounder”. In:
Biometrika
This paper establishes sufficient conditions for nonparametric identification of the average treatment effect using double negative control.
[37] • David Madigan, Paul E Stang, Jesse A Berlin, Martijn Schuemie, J Marc Overhage, Marc A Suchard, Bill Dumouchel, Abraham G Hartzema, and Patrick B Ryan. “A systematic statistical approach to evaluating evidence from observational studies”. In:
Annual Review of Statistics and Its Application
This paper provides a systematic review of challenges in observational studies and describes a data-driven approach to calculating calibrated p-values leveraging negative controls.
[38] Martijn J Schuemie, Patrick B Ryan, William DuMouchel, Marc A Suchard, and David Madigan. “Interpreting observational studies: why empirical calibration is needed to correct p-values”. In:
Statistics in Medicine
Statistics in Medicine
Proceedings of the National Academy of Sciences
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
American Journal of Epidemiology
Pediatrics
International Journal of Epidemiology
Hypertension
Basic & Clinical Pharmacology & Toxicology
Scandinavian Journal of Public Health
Drug and Alcohol Dependence
139 (2014), pp. 159–163.
[49] Wang Miao and Eric J Tchetgen Tchetgen. “Invited commentary: bias attenuation and identification of causal effects with multiple negative controls”. In:
American Journal of Epidemiology
American Journal of Epidemiology (2020).
[51] Thomas Lumley and Lianne Sheppard. “Assessing seasonal confounding and model selection bias in air pollution epidemiology using positive and negative control analyses”. In:
Environmetrics
New England Journal of Medicine
Digestive Diseases and Sciences
[54] • Mette Lise Lousdal, Timothy L Lash, W Dana Flanders, M Alan Brookhart, Ivar Sønbø Kristiansen, Mette Kalager, and Henrik Støvring. “Negative controls to detect uncontrolled confounding in observational studies of mammographic screening comparing participants and non-participants”. In:
International Journal of Epidemiology (2020).
This paper uses both negative control exposure and negative control outcome to detect residual confounding in an observational study of mammographic screening comparing participants and non-participants.
[55] Lianne Sheppard, Drew Levy, Gary Norris, Timothy V Larson, and Jane Q Koenig. “Effects of ambient air pollution on nonelderly asthma hospital admissions in Seattle, Washington, 1987-1994”. In:
Epidemiology (1999), pp. 23–30.
[56] E Cuyler Hammond and Daniel Horn. “The relationship between human smoking habits and death rates: a follow-up study of 187,766 men”. In:
Journal of the American Medical Association
British Medical Journal
British Medical Journal
Journal of the National Cancer Institute
The Lancet
Epidemiology
Epidemiology (Cambridge, Mass.)
Epidemiology
Epidemiology
Epidemiology
American Journal of Epidemiology
https://biostats.bepress.com/harvardbiostat/paper192/.
[68] Adam Glynn and Nahomi Ichino. “Generalized Nonlinear Difference-in-Difference-in-Differences”. In:
V-Dem Working Paper
90 (2019). Available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3410888.
[69] Eric Tchetgen Tchetgen. “The control outcome calibration approach for causal inference with unobserved confounding”. In:
American Journal of Epidemiology
Biostatistics
Biostatistics
[72] • Jingshu Wang, Qingyuan Zhao, Trevor Hastie, and Art B Owen. “Confounder adjustment in multiple hypothesis testing”. In:
Annals of Statistics
This paper unifies unmeasured confounding adjustment methods in multiple hypothesis testing and provides theoretical guarantees for these methods.
[73] Whitney K Newey and James L Powell. “Instrumental variable estimation of nonparametric models”. In:
Econometrica