Accounting for recall bias in case-control studies: a causal inference approach

Abstract

A case-control study is designed to help determine if an exposure is associated with an outcome. However, since case-control studies are retrospective, they are often subject to recall bias. Recall bias can occur when study subjects do not remember previous events accurately. In this paper, we first define the estimand of interest: the causal odds ratio (COR) for a case-control study. Second, we develop estimation approaches for the COR and present estimates as a function of recall bias. Third, we define a new quantity called the \textit{R-factor}, which denotes the minimal amount of recall bias that leads to altering the initial conclusion. We show that a failure to account for recall bias can significantly bias estimation of the COR. Finally, we apply the proposed framework to a case-control study of the causal effect of childhood physical abuse on adulthood mental health.


ACCOUNTING FOR RECALL BIAS IN CASE-CONTROL STUDIES: A CAUSAL INFERENCE APPROACH

KWONSANG LEE AND FRANCESCA DOMINICI

Department of Statistics, Sungkyunkwan University, Seoul, Republic of Korea
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA

Key words and phrases. Causal effect; Mantel-Haenszel estimate; Prognostic score; Stratification.

arXiv [stat.ME]

1. Introduction

A case-control (or case-referent, case-noncase) study is designed to investigate whether there is evidence of an association between one or more risk factors and an outcome. In a case-control study, we first identify the cases (a group known to have the outcome) and the controls (a group known to be free of the outcome). Then, we look back in time to learn which subjects in each group had the exposure(s), comparing the frequency of the exposure in the case group to the control group. Therefore, by definition, a case-control study is always retrospective because it starts with an outcome and then retrospectively collects information about risk factors or exposures for the cases and the controls (Lewallen and Courtright, 1998). Compared to other study designs, case-control studies have several advantages. They are time-efficient, inexpensive, and particularly suitable for rare outcomes (Mann, 2003). However, they tend to be more susceptible to biases than other comparative studies (Schulz and Grimes, 2002).

A common problem in case-control studies is that it is difficult to measure exposure to a risk factor correctly. Recall bias is a problem in studies that use self-reporting, such as case-control studies and retrospective cohort studies. Recall bias is a systematic error that occurs when participants do not remember previous events or experiences accurately or omit details: the accuracy and volume of memories may be influenced by subsequent events or experiences. Recall bias can be random or differential (Rothman, 2012). Differential recall bias occurs when the exposure is under-reported for controls and over-reported for cases (or vice versa) (Barry, 1996; Chouinard and Walter, 1995). For example, in the data set analyzed in this paper there is evidence that adults with a mental health problem and high levels of anger scores (cases) tend to under-report their exposure, that is, whether or not they have been abused as children, compared to the controls. This could be due to several factors, such as repression by the victim as a means of self-protection or a desire to avoid discussing these experiences (Fergusson et al., 2000). In addition to differential recall bias, random recall bias can often occur in case-control studies, but it occurs equally in the case and control groups, for instance, due to memory failure itself. Raphael (1987) pointed out that such random error will lead to measurement error that will usually lead to a loss of statistical power. In contrast, differential recall bias is likely to lead to a biased estimate.

In this paper, we focus on differential recall bias rather than random recall bias. Many studies have documented that differential recall bias can have a significant impact on associational measures (Chouinard and Walter, 1995; Coughlin, 1990; Drews and Greenland, 1990; Barry, 1996; Greenland, 1996). Despite the importance of accounting for recall bias, there is still a gap between previous studies of recall bias and current causal inference studies. First, to the best of our knowledge, contributions regarding how recall bias affects measures with causal interpretations are scarce. For example, Zhang (2008) proposed an approach based on a logistic regression model for estimating a marginal causal odds ratio (COR). Persson and Waernbaum (2013) proposed several approaches for estimating this marginal COR in case-control studies. However, neither accounted for recall bias. In contrast, many of the existing methods accounting for recall bias can reveal only an association between exposure and outcome.
Furthermore, their analysis is limited to either estimating a marginal OR without adjusting for confounders or estimating a conditional OR relying on restricted models such as logistic regression. For instance, Greenland and Kleinbaum (1983) proposed an approach for accounting for recall bias that uses a matrix correction in the context of a misclassification problem. However, the correcting matrix must be derived either from earlier studies or from a validation study carried out on a subsample of the study subjects. Though using the matrix is conceptually simple, it is not feasible to adjust for confounders. Alternatively, Barry (1996) proposed a logistic regression method to assess the impact of recall bias on the conclusion by postulating a simpler misclassification model while adjusting for confounders. However, the confounder adjustment depends heavily on the logistic regression model, so the target estimand is restricted to a conditional OR.

The overall goal of this paper is to introduce a causal inference framework for case-control studies, for estimating the causal odds ratio (COR) in the presence of recall bias. More specifically, first, we define the CORs as a function of tuning parameters that quantify recall bias. The true COR and the naive COR (which assumes that there is no recall bias even when recall bias is present in reality) can be analytically compared in terms of these tuning parameters. Second, we introduce two approaches for estimating the COR as a function of these tuning parameters. We focus on estimating the marginal COR, which is often used to assess the exposure effect in the population as a whole. To estimate the marginal COR, we develop a maximum likelihood estimation method and a stratification method based on the prognostic score (Hansen, 2008). Third, we also introduce and provide estimation approaches for other COR measures such as conditional and common CORs. Finally, we introduce the R-factor, defined as the minimal amount of recall bias that would lead to reporting a spurious association between exposure and disease.

2. Notation and Odds Ratios in Case-control Studies

2.1. Causal Inference Framework in Case-Control Studies.

We start by introducing the notation for a matched case-control study, which also covers an unmatched case-control study with one or more strata. In a matched case-control study, we first stratify the population into strata defined by age, gender, race or other demographics and then match the cases and controls within strata (Breslow, 1982). Within each stratum, potential confounders that affect both exposure and outcome can be adjusted for by certain criteria such as exact matching. If the outcome is rare, there are relatively few cases available in each stratum for the later analysis, which may result in inefficient estimates. Alternatively, researchers may want to find cases first and then match them to similar controls (Breslow and Day, 1980). Age and sex are often exactly matched in this matched case-control design.

Case-control studies are usually retrospective in nature. Most of the time, risk factors are retrospectively investigated to find the cause of the outcome, so exposure to a risk factor is never randomized. A naive comparison of the prevalence of the outcome between the exposed and unexposed groups can be misleading due to confounding bias. To establish causation between exposure and outcome, we rely on the potential outcome framework (Splawa-Neyman, 1990; Rubin, 1974). Assume that there are $I$ strata. Each stratum $i$, $i = 1, \ldots, I$, contains $n_i$ individuals. There are $N = \sum_{i=1}^{I} n_i$ individuals in total. We denote by $ij$ the $j$th individual in stratum/matched set $i$ for $j = 1, \ldots, n_i$. For example, if the data are collected from an unmatched case-control design without stratification, then there is only one stratum. We let $T_{ij} = 1$ indicate that individual $ij$ was exposed to a certain risk factor, and $T_{ij} = 0$ otherwise. We define potential outcomes as follows: if $T_{ij} = 0$, individual $ij$ exhibits response $Y_{ij}(0)$, whereas if $T_{ij} = 1$, individual $ij$ exhibits $Y_{ij}(1)$. Depending on whether individual $ij$ was exposed or not, only one of the two potential outcomes can be observed. The response exhibited by individual $ij$ is $Y_{ij} = T_{ij} Y_{ij}(1) + (1 - T_{ij}) Y_{ij}(0)$. In this paper, both $Y_{ij}(1)$ and $Y_{ij}(0)$ are assumed to be binary. If $Y_{ij} = 1$, individual $ij$ is considered a case, and if $Y_{ij} = 0$, individual $ij$ is a control. Let $X_{ij}$ denote a vector of covariates. In a matched case-control study with exact matching, within the same stratum $i$, two individuals $ij$ and $ij'$ share the same covariates (i.e., $X_{ij} = X_{ij'}$). For an unmatched case-control study, a function of $X_{ij}$ such as the propensity score (Rosenbaum and Rubin, 1983) can be used to construct strata adjusting for confounding bias. In this case, two individuals $ij$ and $ij'$ may have different values of the covariates, but the same value of the propensity score.

2.2. Definition and Identification of Conditional and Marginal Causal Odds Ratios.

In this section, we introduce the parameter of interest, which is the marginal causal odds ratio (COR) defined as $\psi = \{p_1(1 - p_0)\} / \{p_0(1 - p_1)\}$ where $p_t = E[Y_{ij}(t)]$; $t = 0, 1$. In some instances we might be interested in the conditional COR at a given level of $X = x$, which is defined as $\psi(x) = \{p_1(x)(1 - p_0(x))\} / \{p_0(x)(1 - p_1(x))\}$ where $p_t(x) = E[Y_{ij}(t) \mid X_{ij} = x]$. We note that $p_t = E_X[p_t(x)]$.
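The two definitions can be evaluated on a toy example. The sketch below is only an illustration: the helper `odds_ratio` and the two covariate levels with their probabilities are hypothetical and do not come from the paper's data.

```python
def odds_ratio(p1, p0):
    """Causal odds ratio {p1 (1 - p0)} / {p0 (1 - p1)}."""
    return (p1 * (1.0 - p0)) / (p0 * (1.0 - p1))

# Hypothetical two-level covariate: each tuple is (P(X = x), p_1(x), p_0(x)).
levels = [(0.5, 0.8, 0.5), (0.5, 0.5, 0.2)]

# Conditional CORs psi(x), one per covariate level.
conditional = [odds_ratio(p1x, p0x) for _, p1x, p0x in levels]

# Marginal COR psi, built from p_t = E_X[p_t(x)].
p1 = sum(w * p1x for w, p1x, _ in levels)
p0 = sum(w * p0x for w, _, p0x in levels)
marginal = odds_ratio(p1, p0)
```

With these made-up numbers both conditional CORs equal 4, while the marginal COR computed from the averaged probabilities is about 3.45.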

However, in general $\psi \neq E_X[\psi(x)]$, which is often referred to as the non-collapsibility of the odds ratio (Greenland and Robins, 1986).

To identify these causal parameters, we consider two assumptions: (1) unconfoundedness and (2) positivity. The unconfoundedness assumption means that the potential outcomes $(Y_{ij}(1), Y_{ij}(0))$ are conditionally independent of the treatment $T_{ij}$ given $X_{ij}$, i.e., $(Y_{ij}(1), Y_{ij}(0)) \perp\!\!\!\perp T_{ij} \mid X_{ij}$. The second assumption means that the probability $P(T_{ij} = 1 \mid X_{ij})$ lies in $(0, 1)$. Together, these two assumptions are referred to as strong ignorability. We introduce $p_{y|t}(x) = P(Y_{ij} = y \mid T_{ij} = t, X_{ij} = x)$ for $t = 0, 1$; $y = 0, 1$. We note that these probabilities can be computed from observable variables. Under the strong ignorability assumption, $p_1(x)$ and $p_0(x)$, which are based on potential outcomes, can be identified as $p_{1|1}(x)$ and $p_{1|0}(x)$ respectively. Also, the conditional and marginal CORs can be identified as

$$\psi(x) = \frac{p_{1|1}(x)\, p_{0|0}(x)}{p_{1|0}(x)\, p_{0|1}(x)}, \qquad \psi = \frac{E[p_{1|1}(x)]\, E[p_{0|0}(x)]}{E[p_{1|0}(x)]\, E[p_{0|1}(x)]}. \tag{1}$$

Using these identification results, many estimation methods based on the propensity score have been proposed. The performance of methods for the marginal and conditional ORs is compared in Austin (2007) and Austin et al. (2007) respectively. However, most of the developed methods are not suitable in the presence of recall bias.

3. Impact of Recall Bias on Causal Parameters

3.1. Recall Bias Model.

In this paper, we consider situations with differential recall bias where the exposure is over-reported (or under-reported) differently among cases and controls. In the presence of recall bias, the underlying true exposure $T_{ij}$ is not observed. Instead, we observe the biased exposure $T^*_{ij}$. If there is no recall bias, then $T_{ij} = T^*_{ij}$.

We introduce two tuning parameters to measure differential over-reporting recall bias:

$$\eta_1 = P(T^*_{ij} = 1 \mid Y_{ij} = 1, T_{ij} = 0, X_{ij} = x), \qquad \eta_0 = P(T^*_{ij} = 1 \mid Y_{ij} = 0, T_{ij} = 0, X_{ij} = x). \tag{2}$$

The parameters $(\eta_1, \eta_0)$ represent the probabilities of over-reporting among the cases and the controls respectively. We implicitly assume in (2) that the magnitude of recall bias does not depend on the covariates $X_{ij}$. Also, for each $ij$, $T_{ij} = 1$ implies $T^*_{ij} = 1$, but $T_{ij} = 0$ implies either $T^*_{ij} = 0$ or $T^*_{ij} = 1$. Therefore, recall bias occurs only when $T_{ij} = 0$. If $T_{ij} = 0$, $Y_{ij}(0)$ is observed as $Y_{ij}$, and the observed exposure $T^*_{ij}$ depends on $Y_{ij}(0)$. Similarly, we can define another set of two tuning parameters for under-reporting situations:

$$\zeta_1 = P(T^*_{ij} = 0 \mid Y_{ij}(1) = 1, T_{ij} = 1, X_{ij} = x), \qquad \zeta_0 = P(T^*_{ij} = 0 \mid Y_{ij} = 0, T_{ij} = 1, X_{ij} = x). \tag{3}$$

The parameters $(\zeta_1, \zeta_0)$ represent the probabilities of under-reporting among the cases and the controls respectively. We use only one of the two sets of parameters, depending on whether exposure is over-reported or under-reported. Note that if there is no recall bias, then $\eta_1 = \eta_0 = 0$ (or $\zeta_1 = \zeta_0 = 0$).

For our example that will be discussed in Section 6, we consider a study of child abuse and adult mental health, especially anger. Adults are likely not to report childhood abuse even if it occurred. Thus, exposure was generally under-reported, and we use the parameter set $(\zeta_1, \zeta_0)$. The parameter $\zeta_1$ ($\zeta_0$) is the proportion of adults with higher (lower) anger scores who fail to recall child abuse correctly.
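The reporting mechanisms in (2) and (3) can be written as a small sampling routine. This is a minimal sketch under the stated model; the function name `report_exposure` and its argument layout are illustrative choices, not part of the paper.

```python
import random

def report_exposure(t, y, rates, direction="over", rng=random):
    """Return the reported exposure T* for true exposure t and outcome y.

    rates = (rate for cases, rate for controls), i.e. (eta_1, eta_0) when
    direction == "over" and (zeta_1, zeta_0) when direction == "under".
    """
    rate = rates[0] if y == 1 else rates[1]
    if direction == "over":
        # Model (2): only truly unexposed subjects can be misreported.
        if t == 1:
            return 1
        return 1 if rng.random() < rate else 0
    # Model (3): only truly exposed subjects can be misreported.
    if t == 0:
        return 0
    return 0 if rng.random() < rate else 1
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible; setting both rates to 0 returns the true exposure unchanged.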
In this child abuse study, it is known from previous literature that under-reporting of child abuse is equally likely for cases and controls. This information can impose an additional restriction such as $\zeta_1 = \zeta_0$. We will discuss two analyses, with and without this additional information, in Section 6.

3.2. Change in Target Parameters Due to Recall Bias.

In this section, we investigate analytically the impact of recall bias on the marginal and conditional CORs. If there is no recall bias and the exposure $T_{ij}$ is observed correctly, the conditional and marginal CORs, $\psi(x)$ and $\psi$, can be identified based on the conditional probabilities $p_{y|t}(x)$ as in (1). In the presence of recall bias, if we make inference based on $T^*_{ij}$ rather than $T_{ij}$, then we obtain a biased estimate of the COR since $(Y_{ij}(1), Y_{ij}(0)) \not\!\perp\!\!\!\perp T^*_{ij} \mid X_{ij}$. To describe this, consider the probabilities based on observable variables, $p^*_{y|t}(x) = P(Y_{ij} = y \mid T^*_{ij} = t, X_{ij} = x)$ for $y = 0, 1$; $t = 0, 1$. In the presence of recall bias, an estimator that treats $T^*_{ij}$ as the true exposure targets a different estimand $\psi^*$ (or $\psi^*(x)$) instead of $\psi$ (or $\psi(x)$):

$$\psi^*(x) = \frac{p^*_{1|1}(x)\, p^*_{0|0}(x)}{p^*_{1|0}(x)\, p^*_{0|1}(x)}, \qquad \psi^* = \frac{E[p^*_{1|1}(x)]\, E[p^*_{0|0}(x)]}{E[p^*_{1|0}(x)]\, E[p^*_{0|1}(x)]}.$$

The following propositions compare $\psi(x)$, $\psi$ with $\psi^*(x)$, $\psi^*$ respectively.

Proposition 1. For given $0 \le \eta_1, \eta_0 \le 1$, $\psi(x) \le \psi^*(x) \Leftrightarrow q^*_1(x)\,\eta_0 \le q^*_0(x)\,\eta_1$ for each $x$, where $q^*_y(x) = P(T^*_{ij} = 1 \mid Y_{ij} = y, X_{ij} = x)$ for $y = 0, 1$. Similarly, if the source of bias is under-reporting of exposure, for given $0 \le \zeta_1, \zeta_0 \le 1$, $\psi(x) \le \psi^*(x) \Leftrightarrow \{1 - q^*_1(x)\}\,\zeta_0 \ge \{1 - q^*_0(x)\}\,\zeta_1$.

This proposition implies that the true conditional COR $\psi(x)$ is no larger than $\psi^*(x)$ whenever $q^*_1(x)\,\eta_0 \le q^*_0(x)\,\eta_1$. Using models or stratification for $q^*_y(x)$, the condition can be checked for fixed values of $(\eta_1, \eta_0)$ or $(\zeta_1, \zeta_0)$. In particular, if $\eta_0 = 0$ (i.e., no over-reporting bias for controls), $\psi(x) \le \psi^*(x)$. Proposition 1 gives the condition for $\psi(x) \le \psi^*(x)$ for conditional CORs, but this condition cannot be generalized to the marginal COR due to non-collapsibility (Greenland et al., 1999). Under certain additional conditions, $\psi \le \psi^*$ can be claimed.

Proposition 2.

Assume that exposure is over-reported.

(i) Suppose $\eta_0 \le \eta_1$. If $\psi(x) \le \eta_1/\eta_0$, then $\psi(x) \le \psi^*(x)$. Furthermore, if $\max_x \psi(x) \le \eta_1/\eta_0$, then $\psi \le \psi^*$.

(ii) Suppose $\eta_0 \ge \eta_1$. If $\psi(x) \ge \eta_1/\eta_0$, then $\psi(x) \ge \psi^*(x)$. Furthermore, if $\min_x \psi(x) \ge \eta_1/\eta_0$, then $\psi \ge \psi^*$.

This proposition means that, when exposure is over-reported, if the maximum of the conditional COR, $\max_x \psi(x)$, is less than or equal to $\eta_1/\eta_0$, then recall bias leads to $\psi \le \psi^*$. Also, if $\eta_0 = 0$, then $\psi(x)$ is always less than $\eta_1/\eta_0$, so $\psi(x) \le \psi^*(x)$ for all $x$ and thus $\psi \le \psi^*$. When $\eta_1 = \eta_0$, $\psi(x) \le \psi^*(x)$ if $\psi(x) \le 1$. A similar argument can be applied to the situation where exposure is under-reported. When assuming that exposure is under-reported, the following statements hold: (i) Suppose $\zeta_0 \le \zeta_1$. If $\psi(x) \ge \zeta_0/\zeta_1$, then $\psi(x) \ge \psi^*(x)$. Furthermore, if $\min_x \psi(x) \ge \zeta_0/\zeta_1$, then $\psi \ge \psi^*$. (ii) Suppose $\zeta_0 \ge \zeta_1$. If $\psi(x) \le \zeta_0/\zeta_1$, then $\psi(x) \le \psi^*(x)$. Furthermore, if $\max_x \psi(x) \le \zeta_0/\zeta_1$, then $\psi \le \psi^*$. Similarly, when $\zeta_1 = \zeta_0$, $\psi(x) \ge \psi^*(x)$ if $\psi(x) \ge 1$. Proofs for both propositions are in the Supplementary Materials.

Propositions 1 and 2 enable us to assess the impact of recall bias on both marginal and conditional CORs before starting the main analysis. For example, it is known from previous literature that child abuse and adult anger are positively associated based on logistic regression, which implies $\psi(x) \ge 1$ for all $x$. Given the additional information that $\zeta_1 = \zeta_0$, from Proposition 2 (i) for under-reported exposures, we expect $\psi(x) \ge \psi^*(x)$, and further $\psi \ge \psi^*$. This relationship shows that if the underlying true COR $\psi(x)$ is recovered by accounting for recall bias, then it should be greater than the biased estimand $\psi^*(x)$. Even without this prior knowledge, Proposition 1 provides a way to predict the impact of recall bias. Using a model for $q^*_y(x)$, $y = 0, 1$, it is possible to check whether the condition $\{1 - q^*_1(x)\}\,\zeta_0 \ge \{1 - q^*_0(x)\}\,\zeta_1$ is satisfied or not. For instance, a logistic regression model $q^*_y(x) = \exp(\beta_y Y + \beta_X X) / \{1 + \exp(\beta_y Y + \beta_X X)\}$ can be considered, and then used for assessing the condition with various values of $(\zeta_1, \zeta_0)$. In particular, when $\zeta_1 = \zeta_0$, the condition for $\psi(x) \ge \psi^*(x)$ simplifies to $\beta_y \ge 0$.

3.3. R-factor: how much recall bias is needed to alter the initial conclusion?

The above propositions show that, when there is recall bias, estimators that do not account for it can be biased. In this section, we illustrate with a small simulation study that even a small amount of recall bias can bias the COR estimate substantially. We take $N = 2000$, a single stratum ($I = 1$), and no confounding. We assume $T_j \sim$ Bernoulli(0. ), $Y_j(1) \sim$ Bernoulli(0.25) and $Y_j(0) \sim$ Bernoulli(0.25), so that the true COR $\psi$ is 1. Then, we compute $Y_j = Y_j(1) T_j + Y_j(0)(1 - T_j)$. We consider a situation of over-reporting among the cases only, that is, $\eta_1 > \eta_0 = 0$. Therefore, if $T_j = 1$, then $T^*_j = 1$, but if $T_j = 0$, then $T^*_j \sim$ Bernoulli($\eta_1$) among the cases. For each simulated dataset, we calculate the bias of $\hat\psi^*$ with respect to the true $\psi$ as a function of $\eta_1$. Figure 1 shows the estimate $\hat\psi^*$ and its 95% confidence intervals (CI) as a function of $\eta_1 \in$ [0, . ]. As $\eta_1$ increases, $\hat\psi^*$ increases rapidly. We can see that failing to account for recall bias can lead to a misleading conclusion even for a small amount of recall bias.
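The effect of over-reporting among cases on the naive odds ratio can be traced with expected cell counts instead of random draws. The sketch below rests on several assumptions that are not taken from the paper: the exposure prevalence `p_t = 0.25` is a hypothetical choice, the interval is a Wald (Woolf) interval on the log odds ratio, and `expected_table` and `r_factor` are illustrative names. It searches a grid for the smallest $\eta_1$ at which the naive 95% interval excludes 1.

```python
import math

def naive_log_or_ci(a, b, c, d, z=1.96):
    """Log odds ratio and Wald (Woolf) interval for a 2x2 table.

    a, b: reported-exposed cases and controls; c, d: reported-unexposed.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, log_or - z * se, log_or + z * se

def expected_table(eta1, n=2000, p_t=0.25, p_y=0.25):
    """Expected cell counts when only unexposed cases over-report (eta_0 = 0)."""
    a, b = n * p_t * p_y, n * p_t * (1 - p_y)              # truly exposed
    c, d = n * (1 - p_t) * p_y, n * (1 - p_t) * (1 - p_y)  # truly unexposed
    return a + eta1 * c, b, (1 - eta1) * c, d

def r_factor(step=0.01, max_eta=0.5):
    """Smallest eta_1 on the grid whose naive interval excludes log(1) = 0."""
    k = 0
    while k * step <= max_eta:
        _, lo, hi = naive_log_or_ci(*expected_table(k * step))
        if lo > 0 or hi < 0:
            return k * step
        k += 1
    return None
```

Under these assumed settings the naive odds ratio equals $(1 + 3\eta_1)/(1 - \eta_1)$, and `r_factor()` returns 0.06 on a grid of width 0.01. A different exposure prevalence or interval method would move this threshold.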

Figure 1. Changes in $\psi$ depending on the size of recall bias. Estimates and 95% confidence intervals of $\psi$ are shown for different values of $\eta_1$. The red line represents the true COR $\psi = 1$.

In this paper, we introduce an R-factor to quantify the minimum amount of recall bias that can alter the initial conclusion. As shown in Figure 1, for $\eta_1 > 0.06$, we would conclude that there is a statistically significant relationship between exposure and disease when in reality there is none ($\psi = 1$). In this context, the R-factor is 6%, which means that if, among the cases, at least 6% reported being exposed when in reality they were not, the initial conclusion of no effect would be altered. More specifically, for $\eta_1 > 0.06$, since the 95% CI does not contain 1, we would reject the null hypothesis even though the true COR is 1.

The R-factor can be defined differently in a different context. For instance, suppose the initial conclusion rejects the null hypothesis of no effect. Then, the R-factor can be defined as the minimum amount of recall bias (in this case, in terms of $\eta_1$ while fixing $\eta_0 = 0$) needed for the same hypothesis not to be rejected. The R-factor can be used as a summary statistic that shows the sensitivity of the conclusion.

4. Two Methods for Recovering the Marginal Causal Odds Ratio in the Presence of Recall Bias

In this section, we propose two estimation methods that provide a consistent estimate of the marginal COR $\psi$ in the presence of recall bias and confounding: (1) maximum likelihood estimation and (2) stratification using prognostic scores. For given values of $(\eta_1, \eta_0)$ or $(\zeta_1, \zeta_0)$, to get a consistent estimate of $\psi$, the ML-based method requires the models for the two potential outcomes and the exposure to be correctly specified. The stratification-based method requires fewer model assumptions. Stratification can be implemented based on either propensity scores or prognostic scores (Hansen, 2008). In the following subsections, we discuss these two estimation methods in more detail. For simplicity, we consider the over-reported exposure case with the tuning parameters $(\eta_1, \eta_0)$. Under-reported exposure with $(\zeta_1, \zeta_0)$ can be dealt with in a similar way.

4.1. Maximum Likelihood Estimation (ML).

Consider the outcome models $m_t(x) = P(Y_{ij} = 1 \mid T_{ij} = t, X_{ij} = x)$, $t = 0, 1$, for $p_{1|1}(x)$ and $p_{1|0}(x)$. Also, consider the model $e(x)$ for the probability $\pi(x) = P(T_{ij} = 1 \mid X_{ij} = x)$, which is often called the propensity score (Rosenbaum and Rubin, 1983). In Section 2.2, we discussed that the probability $p_1(x)$ can be identified as $p_{1|1}(x)$, and thus can be estimated by $m_1(x)$. In the absence of recall bias, either $m_t(x)$ or $e(x)$ is required to be correctly specified to obtain a consistent estimate. However, in the presence of recall bias, since we do not observe $T_{ij}$, neither $m_t(x)$ nor $e(x)$ can be estimated directly from the observable dataset. Instead, we can estimate the marginal COR as a function of the tuning parameters of the recall bias model.

The first method, presented in this subsection, uses maximum likelihood estimation. To construct the likelihood function, both $m_t(x)$ and $e(x)$ must be specified to obtain an estimate for given values of $(\eta_1, \eta_0)$. Using the recall bias model (2), the joint probability $P(Y_{ij}, T^*_{ij} \mid X_{ij})$ of observable variables can be represented as a function of $m_1(x)$, $m_0(x)$ and $e(x)$. We assume models $m_t(x; \gamma_t)$, $t = 0, 1$, and $e(x; \beta)$ with parameters $\gamma_t$ and $\beta$. For instance, logistic regressions can be used, such as $m(T, X; \gamma) = \exp(\gamma_t T + \gamma_X^T X) / (1 + \exp(\gamma_t T + \gamma_X^T X))$ with $m_1(X) = m(1, X; \gamma)$ and $m_0(X) = m(0, X; \gamma)$, and $e(X) = \exp(\beta^T X) / (1 + \exp(\beta^T X))$.
These model parameters can be estimated by solving the following maximization problem:

$$\hat\theta = (\hat\beta, \hat\gamma_1, \hat\gamma_0) = \arg\max_{\beta, \gamma_1, \gamma_0} \sum_{i=1}^{I} \sum_{j=1}^{n_i} \log P(Y_{ij} = y_{ij}, T^*_{ij} = t^*_{ij} \mid X_{ij} = x_{ij}), \tag{4}$$

where

$$\begin{aligned}
P(Y_{ij} = 1, T^*_{ij} = 1 \mid X_{ij} = x) &= m_1(x; \gamma_1)\, e(x; \beta) + \eta_1\, m_0(x; \gamma_0) \{1 - e(x; \beta)\}, \\
P(Y_{ij} = 0, T^*_{ij} = 1 \mid X_{ij} = x) &= \{1 - m_1(x; \gamma_1)\}\, e(x; \beta) + \eta_0 \{1 - m_0(x; \gamma_0)\} \{1 - e(x; \beta)\}, \\
P(Y_{ij} = 1, T^*_{ij} = 0 \mid X_{ij} = x) &= (1 - \eta_1)\, m_0(x; \gamma_0) \{1 - e(x; \beta)\}, \\
P(Y_{ij} = 0, T^*_{ij} = 0 \mid X_{ij} = x) &= (1 - \eta_0) \{1 - m_0(x; \gamma_0)\} \{1 - e(x; \beta)\}.
\end{aligned}$$

Once we obtain the estimate $\hat\theta$, we can compute $\hat m_t(x) = m_t(x; \hat\gamma_t)$ and $\hat e(x) = e(x; \hat\beta)$. The marginal probabilities $p_t$, $t = 0, 1$, can be estimated by averaging $\hat m_t(x)$ over the sample:

$$\hat p_{1,ML} = \frac{1}{N} \sum_{i=1}^{I} \sum_{j=1}^{n_i} \hat m_1(X_{ij}), \qquad \hat p_{0,ML} = \frac{1}{N} \sum_{i=1}^{I} \sum_{j=1}^{n_i} \hat m_0(X_{ij}).$$
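The four cell probabilities that enter (4) can be coded directly, which also makes it easy to confirm that they sum to one. The sketch below is illustrative only: it assumes a scalar covariate, and the intercept `gamma_0` and the two-element `beta` are a hypothetical parameterization of the logistic models rather than the paper's exact specification.

```python
import math

def expit(u):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-u))

def cell_probabilities(m1, m0, e, eta1, eta0):
    """P(Y = y, T* = t | X = x) under the over-reporting model (2).

    Keys are (y, t_star); m1, m0 and e are the model values at x.
    """
    return {
        (1, 1): m1 * e + eta1 * m0 * (1 - e),
        (0, 1): (1 - m1) * e + eta0 * (1 - m0) * (1 - e),
        (1, 0): (1 - eta1) * m0 * (1 - e),
        (0, 0): (1 - eta0) * (1 - m0) * (1 - e),
    }

def log_likelihood(data, beta, gamma_t, gamma_x, gamma_0, eta1, eta0):
    """Objective of (4) for rows (y, t_star, x) with scalar x.

    Hypothetical parameterization: e(x) = expit(beta[0] + beta[1] * x) and
    m_t(x) = expit(gamma_0 + gamma_t * t + gamma_x * x).
    """
    total = 0.0
    for y, t_star, x in data:
        e = expit(beta[0] + beta[1] * x)
        m1 = expit(gamma_0 + gamma_t + gamma_x * x)
        m0 = expit(gamma_0 + gamma_x * x)
        total += math.log(cell_probabilities(m1, m0, e, eta1, eta0)[(y, t_star)])
    return total
```

For fixed `eta1` and `eta0`, this objective could be handed to any numerical optimizer over the remaining parameters; the fitted `m1` and `m0` values are then averaged to form the marginal probability estimates.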

Table 1. The 2 × 2 table for the $i$th stratum.

                            Case               Control            Total
Exposed ($T^* = 1$)         $a^*_i$            $b^*_i$            $a^*_i + b^*_i$
Not exposed ($T^* = 0$)     $c^*_i$            $d^*_i$            $c^*_i + d^*_i$
Total                       $a^*_i + c^*_i$    $b^*_i + d^*_i$    $n^*_i$

The marginal COR can be estimated by

$$\hat\psi_{ML} = \frac{\hat p_{1,ML} (1 - \hat p_{0,ML})}{\hat p_{0,ML} (1 - \hat p_{1,ML})}.$$

Since the estimate $\hat\theta$ can vary with the values of $\eta_1$ and $\eta_0$, the estimator $\hat\psi_{ML}$ can be considered as a function of $\eta_1$ and $\eta_0$. We will use the notation $\hat\psi_{ML}(\eta_1, \eta_0)$ if an explicit expression is necessary. Unlike usual situations in causal inference, to obtain a valid estimate $\hat\psi_{ML}$, both $m_t(x; \gamma_t)$ and $e(x; \beta)$ have to be specified correctly. However, when computing $\hat\psi_{ML}$, the estimated parameter $\hat\beta$ is not used directly. This parameter can be understood as a nuisance parameter. We note that since $T_{ij}$ is not observable, the inverse probability weighted estimator proposed in Forbes and Shortreed (2008) cannot be considered here.

Under the recall bias model (2), if $(\eta_1, \eta_0)$ are known and the three models for $m_1$, $m_0$ and $e$ are correctly specified, then $\hat\psi_{ML}$ is a consistent estimator for $\psi$. For different values of $\eta_1$ and $\eta_0$, $\hat\psi_{ML}$ can change but $\hat\psi^*$ remains unchanged. Therefore, we can consider $\hat\psi_{ML}$ as an estimator that recovers the true marginal COR when we believe $\eta_1$ and $\eta_0$ are correctly specified. However, $\eta_1$ and $\eta_0$ are unknown in practice. For data applications, it is recommended to consider a plausible region of $(\eta_1, \eta_0)$ based on previous knowledge. Then, the causal conclusion can be evaluated over the considered region.

Also, for given $(\eta_1, \eta_0)$, the asymptotic variance of the estimator $\hat\psi_{ML}(\eta_1, \eta_0)$ can be obtained by using the theory of M-estimation (Stefanski and Boos, 2002). However, this is not recommended because the formula for the variance is difficult to derive in general. Instead, the bootstrap can be used to construct confidence intervals for $\psi$.

4.2. Stratification.

Stratification can alternatively be used to estimate $\psi$ by aiming to balance the covariate distributions between exposed and unexposed groups. Among stratification-based methods, Graf and Schumacher (2008) proposed an estimator for $\psi$ based on stratum-specific probabilities for strata defined by the propensity score, $p_{1i} = E_{X \mid \text{stratum } i}[p_1(x)]$ and $p_{0i} = E_{X \mid \text{stratum } i}[p_0(x)]$. If the assumption that $(Y_{ij}(1), Y_{ij}(0))$ is independent of $T_{ij}$ within each stratum $i$ holds, then these probabilities can be identified from the 2 × 2 table for stratum $i$. However, due to recall bias, $T^*_{ij}$ is observed instead of $T_{ij}$. Under the model (2), for $p_{yt}(x) = P(Y_{ij} = y, T_{ij} = t \mid X_{ij} = x)$ and $p^*_{yt}(x) = P(Y_{ij} = y, T^*_{ij} = t \mid X_{ij} = x)$, the following relationships hold:

$$p_{11}(x) = p^*_{11}(x) - \frac{\eta_1}{1 - \eta_1}\, p^*_{10}(x), \qquad p_{10}(x) = \frac{p^*_{10}(x)}{1 - \eta_1},$$

$$p_{01}(x) = p^*_{01}(x) - \frac{\eta_0}{1 - \eta_0}\, p^*_{00}(x), \qquad p_{00}(x) = \frac{p^*_{00}(x)}{1 - \eta_0}.$$

For each generated stratum, assume that Table 1 is observed. Using the above relationships, the probabilities $p_{1i}$ and $p_{0i}$ are estimated by $\hat p_{1i} = a_i / \{a_i + b_i\}$ and $\hat p_{0i} = c_i / \{c_i + d_i\}$, where $a_i = \{a^*_i - \eta_1 (a^*_i + c^*_i)\} / (1 - \eta_1)$, $b_i = \{b^*_i - \eta_0 (b^*_i + d^*_i)\} / (1 - \eta_0)$, $c_i = c^*_i / (1 - \eta_1)$ and $d_i = d^*_i / (1 - \eta_0)$. Similarly, for the under-reporting exposure case, $a_i = a^*_i / (1 - \zeta_1)$, $b_i = b^*_i / (1 - \zeta_0)$, $c_i = \{c^*_i - \zeta_1 (a^*_i + c^*_i)\} / (1 - \zeta_1)$, and $d_i = \{d^*_i - \zeta_0 (b^*_i + d^*_i)\} / (1 - \zeta_0)$. The marginal probabilities can be estimated by the weighted average of these stratum-specific probabilities with weights $s_i = n_i / N$, i.e., $\hat p_{1,S} = \sum_{i=1}^{I} s_i \hat p_{1i}$ and $\hat p_{0,S} = \sum_{i=1}^{I} s_i \hat p_{0i}$. Therefore, the marginal COR is estimated by

$$\hat\psi_S = \frac{\hat p_{1,S} (1 - \hat p_{0,S})}{\hat p_{0,S} (1 - \hat p_{1,S})}.$$

The estimator $\hat\psi_S$ was discussed, and its variance formula was given, in Stampf et al. (2010). We provide a modified variance formula that accounts for recall bias in the Supplementary Materials. However, this variance estimator is valid only when strata are fixed. If strata are determined by the observed data, bootstrap procedures are recommended.

Compared to the estimator $\hat\psi_{ML}$, this approach does not use the outcome logistic regressions ($m_1$ and $m_0$). Therefore, $\hat\psi_S$ requires fewer modeling assumptions and less computational effort. However, like many stratification-based methods, this method relies on the assumption that stratification achieves covariate balance at least approximately. Furthermore, strata are formed based on the biased propensity score estimate $\hat e^*(x)$, which uses $T^*_{ij}$ instead of the unobservable $T_{ij}$. It is not feasible to compare the covariate distributions between the exposed ($T_{ij} = 1$) and unexposed ($T_{ij} = 0$) groups.
Since we assume that recall bias is independent of covariates conditional on the observed outcome, if $\eta_1 = \eta_0$, the covariate balance between the $T^*_{ij} = 1$ and $T^*_{ij} = 0$ groups is asymptotically the same as that between the $T_{ij} = 1$ and $T_{ij} = 0$ groups.

As we discussed, constructing strata based on the propensity score can be problematic if $\eta_1$ and $\eta_0$ are significantly different from 0. Instead of using the propensity score, the prognostic score (Hansen, 2008) can be used to construct strata. If there is $\Psi(X_{ij})$ such that $Y_{ij}(0) \perp\!\!\!\perp X_{ij} \mid \Psi(X_{ij})$, we call $\Psi(x)$ the prognostic score. Like propensity score stratification, prognostic stratification permits estimation of exposure effects within the exposed group. If it is assumed that there is no effect modification, prognostic stratification is valid for estimating overall exposure effects. For instance, if $m(T_{ij}, X_{ij}; \gamma) = \exp(\gamma_t T_{ij} + \gamma_X^T X_{ij}) / (1 + \exp(\gamma_t T_{ij} + \gamma_X^T X_{ij}))$ is assumed, $\Psi(X_{ij}) = \gamma_X^T X_{ij}$ is the prognostic score. As with propensity scores, stratification on the prognostic score leads to a desirable and balanced structure. Since we do not know $\Psi(X_{ij})$ a priori, it has to be estimated from the data. Since exposure was over-reported, we know that $T^*_{ij} = 0$ always implies $T_{ij} = 0$. We can estimate $\gamma_X$ using the data of the $T^*_{ij} = 0$ group, and then estimate $\Psi(X_{ij})$ for all individuals.

Furthermore, the stratification method provides an explicit expression of the estimator in terms of stratum-specific cell counts. Thus, the stratification estimate is tractable. In particular, the estimator $\hat\psi_S(\eta_1, \eta_0)$ is an increasing function of $\eta_0$ and a decreasing function of $\eta_1$.
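Because the stratification estimator is an explicit function of the cell counts, it can be sketched in a few lines. The function names `corrected_counts` and `stratified_cor` are illustrative, the code covers only the over-reporting case, and how the strata are formed (propensity or prognostic score) is left outside the sketch.

```python
def corrected_counts(a_s, b_s, c_s, d_s, eta1, eta0):
    """Undo over-reporting in one stratum's 2x2 table (starred counts in)."""
    a = (a_s - eta1 * (a_s + c_s)) / (1 - eta1)
    b = (b_s - eta0 * (b_s + d_s)) / (1 - eta0)
    c = c_s / (1 - eta1)
    d = d_s / (1 - eta0)
    return a, b, c, d

def stratified_cor(tables, eta1, eta0):
    """Stratification estimate of the marginal COR.

    tables: list of (a*, b*, c*, d*) per stratum, laid out as in Table 1.
    """
    n_total = sum(sum(t) for t in tables)
    p1 = p0 = 0.0
    for t in tables:
        a, b, c, d = corrected_counts(*t, eta1, eta0)
        s = sum(t) / n_total            # stratum weight s_i = n_i / N
        p1 += s * a / (a + b)           # estimate of p_{1i}
        p0 += s * c / (c + d)           # estimate of p_{0i}
    return p1 * (1 - p0) / (p0 * (1 - p1))
```

Evaluating `stratified_cor` over a grid of `(eta1, eta0)` values gives the recovered COR as a function of the assumed recall bias. The corrected counts must stay positive, which restricts how large `eta1` and `eta0` can be for a given table.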

Proposition 3. The function $\hat\psi_S(\eta_1, \eta_0)$ satisfies

$$\hat\psi_S(\eta_1, \eta_0) \le \hat\psi_S(\eta_1, \eta_0') \quad \text{if } \eta_0 \le \eta_0',$$

$$\hat\psi_S(\eta_1, \eta_0) \ge \hat\psi_S(\eta_1', \eta_0) \quad \text{if } \eta_1 \le \eta_1'.$$

The proof of this proposition is given in the Supplementary Materials. The result of this proposition is intuitive. For example, if it is suspected that exposure is more over-reported among controls (i.e., a higher $\eta_0$), then the odds of exposure among controls are thought to be more inflated. Therefore, the recovered COR value is higher. Similarly, we can prove that $\hat\psi_S(\zeta_1, \zeta_0)$ is a decreasing function of $\zeta_0$ and an increasing function of $\zeta_1$.

5. Simulation

We conduct a simulation study to compare the performance of the methods proposed in the previous sections: (1) the ML method and (2) the stratification method. We consider two approaches for the stratification (S) method: one based on propensity score stratification and the other based on prognostic score stratification. We call the former $S_{prop}$ and the latter $S_{prog}$.

We consider four binary covariates, say $X_{i1}, X_{i2}, X_{i3}, X_{i4}$. All of the covariates are finely balanced in the sample population, which means that each of the $2^4 = 16$ possible combinations has $n/16$ individuals. For each individual $i$, we randomly generate an exposure and potential outcomes:

$$T_i \sim \text{Bernoulli}(p_{i,T}), \qquad Y_i(0) \sim \text{Bernoulli}(p_{i,Y(0)}), \qquad Y_i(1) \sim \text{Bernoulli}(p_{i,Y(1)}),$$

where, for one pair of covariates $(k, l)$,

$$\begin{aligned}
\text{logit}(p_{i,T}) &= \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \beta_4 X_{i4} + \beta_{k,l} X_{ik} X_{il}, \\
\text{logit}(p_{i,Y(0)}) &= \gamma_0 + \gamma_1 X_{i1} + \gamma_2 X_{i2} + \gamma_3 X_{i3} + \gamma_4 X_{i4} + \gamma_{k,l} X_{ik} X_{il}, \\
\text{logit}(p_{i,Y(1)}) &= \gamma_T + \text{logit}(p_{i,Y(0)}).
\end{aligned}$$

Since we do not consider effect modification, the effect of an exposure on an outcome is constant for every individual, and the true conditional COR is $\exp(\gamma_T)$. However, the true marginal COR is computed by $\{\bar p_{Y(1)} (1 - \bar p_{Y(0)})\} / \{\bar p_{Y(0)} (1 - \bar p_{Y(1)})\}$ where $\bar p_{Y(0)} = (1/n) \sum_{i=1}^{n} p_{i,Y(0)}$ and $\bar p_{Y(1)} = (1/n) \sum_{i=1}^{n} p_{i,Y(1)}$.

Due to recall bias, we cannot observe the exposure $T_i$; instead we observe the biased exposure $T^*_i$. For this simulation study, we assume that exposure is over-reported. We generate $T^*_i$ based on the observed outcome $Y_i = Y_i(1) T_i + Y_i(0)(1 - T_i)$ as

$$T^*_i = T_i + (1 - T_i)\, Y_i \cdot RB_{i1} + (1 - T_i)(1 - Y_i) \cdot RB_{i0},$$

where $RB_{i1} \sim \text{Bernoulli}(\eta_1)$ and $RB_{i0} \sim \text{Bernoulli}(\eta_0)$. $RB_{i1}$ and $RB_{i0}$ are indicators of recall bias under $Y_i = 1$ and $Y_i = 0$ respectively. If $T_i = 1$, $RB_{i1}$ and $RB_{i0}$ have no impact on $T^*_i$ since $T^*_i$ is always 1.

Table 2.

Performance of the estimation methods for recovering the marginal COR. Three methods are compared: (1) maximum likelihood, (2) stratification based on propensity scores, and (3) stratification based on prognostic scores. The Crude method, which does not account for recall bias, is also reported. The log values of the marginal CORs are reported. The size of recall bias is $(\eta_0, \eta_1)$ = (0. , . ).

Scenario      n       True    Crude       ML   S_prop   S_prog
(cor, cor)    800    0.000    0.597    0.002    0.102    0.046
                     0.357    0.917    0.358    0.466    0.401
                     0.706    1.231    0.714    0.827    0.751
              2000   0.000    0.591   -0.001    0.115    0.040
                     0.357    0.919    0.360    0.477    0.400
                     0.706    1.226    0.704    0.827    0.740
(mis, cor)    800    0.000    0.481    0.008   -0.065    0.016
                     0.357    0.826    0.367    0.311    0.371
                     0.706    1.174    0.731    0.690    0.729
              2000   0.000    0.472   -0.002   -0.064   -0.001
                     0.357    0.822    0.364    0.317    0.360
                     0.706    1.167    0.720    0.686    0.709
(cor, mis)    800    0.000    0.681    0.010    0.110    0.054
                     0.310    0.958    0.313    0.428    0.364
                     0.607    1.229    0.603    0.726    0.660
              2000   0.000    0.674    0.005    0.123    0.049
                     0.310    0.955    0.307    0.435    0.356
                     0.607    1.222    0.602    0.740    0.656
(mis, mis)    800    0.000    0.258   -0.062   -0.183   -0.016
                     0.310    0.541    0.256    0.154    0.295
                     0.607    0.807    0.555    0.475    0.581
              2000   0.000    0.262   -0.056   -0.168   -0.003
                     0.310    0.536    0.250    0.151    0.297
                     0.607    0.792    0.543    0.467    0.584

We consider logistic regression models for exposure and outcome with only the four covariates and without the interaction term. Thus, if $\beta_{j,k} \neq 0$, the exposure model is misspecified. Similarly, if $\gamma_{j,k} \neq 0$, the outcome model is misspecified. We consider four simulation scenarios where the exposure and outcome models are correctly specified or misspecified: (i) (cor, cor), (ii) (mis, cor), (iii) (cor, mis), and (iv) (mis, mis). For example, (mis, cor) means that the exposure model is misspecified (i.e., $\beta_{j,k} \neq 0$), but the outcome model is correctly specified (i.e., $\gamma_{j,k} = 0$). For our simulation study, we set $(\beta_0, \beta_1, \beta_2, \beta_3, \beta_4)$ = ( − , , − , , 0) and $(\gamma_0, \gamma_1, \gamma_2, \gamma_3, \gamma_4)$ = ( − , , − , , ), with $(\beta_{j,k}, \gamma_{j,k})$ = (0, 0) for (cor, cor), (2, 0) for (mis, cor), (0, −2) for (cor, mis), and (2, −2) for (mis, mis). We compare the considered methods in terms of how successfully they recover the true marginal CORs under different model misspecification scenarios.
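The data-generating process described above can be sketched as follows. This is an illustrative reading of the simulation design rather than the authors' code: the coefficient values passed in the usage example are placeholders (not the values used in the paper), and the choice of which two covariates interact (`pair`) as well as the names `simulate` and `crude_log_or` are assumptions of this sketch.

```python
from itertools import product

import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def simulate(n, gamma_T, eta1, eta0, beta, gamma,
             beta_int=0.0, gamma_int=0.0, pair=(0, 1), seed=0):
    """Generate one dataset with over-reported exposure.

    beta, gamma: length-5 arrays (intercept plus four main effects).
    beta_int, gamma_int: interaction coefficients; nonzero values make the
    main-effects exposure / outcome models misspecified.
    pair: which two covariates interact (an assumption of this sketch).
    """
    if n % 16 != 0:
        raise ValueError("n must be a multiple of 16 for fine balance")
    rng = np.random.default_rng(seed)
    cells = np.array(list(product([0, 1], repeat=4)))  # the 16 covariate patterns
    X = np.repeat(cells, n // 16, axis=0)              # n/16 individuals per pattern
    inter = X[:, pair[0]] * X[:, pair[1]]

    lin_T = beta[0] + X @ beta[1:] + beta_int * inter
    lin_Y0 = gamma[0] + X @ gamma[1:] + gamma_int * inter
    p_T, p_Y0, p_Y1 = expit(lin_T), expit(lin_Y0), expit(gamma_T + lin_Y0)

    T = rng.binomial(1, p_T)
    Y0 = rng.binomial(1, p_Y0)
    Y1 = rng.binomial(1, p_Y1)
    Y = Y1 * T + Y0 * (1 - T)                          # observed outcome

    RB1 = rng.binomial(1, eta1, size=n)                # recall-bias indicator if Y = 1
    RB0 = rng.binomial(1, eta0, size=n)                # recall-bias indicator if Y = 0
    T_star = T + (1 - T) * Y * RB1 + (1 - T) * (1 - Y) * RB0

    m1, m0 = p_Y1.mean(), p_Y0.mean()
    true_log_cor = np.log(m1 * (1 - m0) / (m0 * (1 - m1)))  # true marginal log COR
    return X, T, T_star, Y, true_log_cor


def crude_log_or(T_obs, Y):
    """Log odds ratio from the unadjusted 2x2 table of exposure by outcome."""
    a = np.sum((T_obs == 1) & (Y == 1))
    b = np.sum((T_obs == 0) & (Y == 1))
    c = np.sum((T_obs == 1) & (Y == 0))
    d = np.sum((T_obs == 0) & (Y == 0))
    return np.log(a * d / (b * c))


# Usage with placeholder coefficients (hypothetical values):
beta = np.array([-1.0, 0.5, -0.5, 0.5, 0.0])
gamma = np.array([-1.0, 0.5, -0.5, 0.5, 0.5])
X, T, T_star, Y, true_log_cor = simulate(800, 0.5, 0.2, 0.1, beta, gamma)
```

Comparing `crude_log_or(T_star, Y)` with `true_log_cor` over many seeds gives a rough sense of the bias that the Crude column summarizes; because the odds ratio is non-collapsible, `true_log_cor` is generally closer to zero than `gamma_T`.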


Besides this factor of model misspecification, we consider two sample sizes ($n = 800$ or $2000$) and three values of $\gamma_T$ between 0 and 1. Also, we fix the values of the recall bias parameters as $(\eta_0, \eta_1)$ = (0. , .1) throughout this simulation. Table 2 shows the simulation results, which are obtained from 2000 simulated datasets. The Crude method is inherently biased since confounding bias is not controlled. As shown in the table, Crude is biased in every scenario. Also, S_prop is biased in all considered scenarios. The propensity score is estimated by regressing $T^*_i$ on $(X_{i1}, X_{i2}, X_{i3}, X_{i4})$; thus, the estimated score is biased. As we discussed in Section 4.2, we expect that when $\eta_0 = \eta_1$, stratification based on this biased score will adjust for confounding bias, but the simulation shows that S_prop always provides biased estimates. However, stratification based on prognostic scores, S_prog, performs well in recovering the true marginal COR. In particular, for the case of (mis, mis), S_prog shows the best performance. Finally, the ML method provides the least biased estimates except for (mis, mis). Although it requires correct models for exposure and outcome, if either the exposure or the outcome model is correctly specified, ML shows the best performance. However, when both models are misspecified, ML provides a more biased estimate than S_prog does.

6. Data Example: Child Abuse and Adult Anger

We consider a retrospective case-control study to examine the question “Does child abuse by either parent increase the likelihood of adult anger?” This study is the 1993-94 sibling survey of the Wisconsin Longitudinal Study (WLS), which is publicly available. We define the exposure variable by combining two responses asking whether there was abuse in their childhood by father or mother, respectively. The responses were measured in four categories: “not at all”, “a little”, “some”, and “a lot,” and, following Springer et al. (2007), the exposure of child physical abuse is defined as an indicator of “some” or “a lot” for at least one of the two responses. Since these two questions asked adults to recall their childhood exposures, there may exist a systematic bias in recalling exposures. The outcome was initially measured on the anger scale of Spielberger (1988). We define a binary outcome variable; an individual is a case if his/her anger score is greater than or equal to 18, which is the 90th percentile of the measured anger scores. Also, seven covariates are considered: sex, age at the time of the interview, father’s education, mother’s education, parental income, farm background, and an indicator of parents’ marital problems or single parent. See Springer et al. (2007) and Small et al. (2013) for more details about the WLS data.

With the same WLS data, Springer et al. (2007) used a logistic regression model and found that childhood physical abuse is associated with anger, with odds ratio 2.02 and 95% CI (1.44, 2.84). However, in their discussion, they pointed out that the results may be affected by a tendency toward under-reporting of abuse, which is a common weakness in studies of child abuse and adult health. Fergusson et al. (2000) summarized three properties of reporting/measuring exposure in childhood: (1) absence of false positive responses, (2) high rates (about 50%) of false negative responses, and (3) independence of psychiatric state and reporting errors.
The first property means that if $T_{ij} = 0$, $T^*_{ij}$ cannot be 1. Second, recall bias is severe and there is a high proportion of reporting $T^*_{ij} = 0$ when $T_{ij} = 1$ in reality. Finally, recall bias does not depend on the observed outcome, which implies $\zeta_0 = \zeta_1$.

We applied (i) the ML method using logistic regression (ML) and (ii) stratification based on the prognostic score (S) for estimating the marginal odds ratio. For the ML method, the logistic outcome regression with the seven covariates and without interaction terms is considered. For the S method, the prognostic score is estimated based on the same outcome model. Five strata are constructed by using the quintile values of the estimated prognostic score.

If there was no recall bias (i.e., $\zeta_0 = \zeta_1 = 0$ and $T_{ij} = T^*_{ij}$), the estimates are as reported in Table 3. The two methods for the marginal OR provide similar estimates, $\hat{\psi}_{ML} = 1.84$ with 95% CI (1.35, 2.52) and $\hat{\psi}_S = 1.80$ with 95% CI (1.30, 2.49). We conducted a sensitivity analysis of recall bias with parameters $(\zeta_0, \zeta_1)$. Based on the previous literature, we focus on the line $0 \le \zeta_0 = \zeta_1 \le .5$. The estimates for various values of $\zeta_0$ and $\zeta_1$ are shown in Table 4. All the estimates increase as $\zeta_0 = \zeta_1$ increases. Also, none of the 95% confidence intervals contains 1. This implies that the under-reporting issue does not alter the initial conclusion; on the contrary, it strengthens the conclusion that there is significant evidence that child abuse increases the odds of adult anger. Note that the variances of the estimates are obtained by using 500 bootstrapped samples.

Table 3. Estimated marginal CORs for the effect of child abuse on adult anger without accounting for recall bias

Method                                 $\hat{\psi}$   SE(log($\hat{\psi}$))   95% CI
ML using logistic regressions (ML)     1.84           0.16                    (1.36, 2.51)
Stratification (S)                     1.79           0.16                    (1.31, 2.44)

Table 4. Sensitivity analysis of recall bias for five values of $\zeta_0 = \zeta_1$. The estimates and 95% confidence intervals are displayed for the maximum likelihood and stratification methods

($\zeta_0$, $\zeta_1$)   ML                   S
(.1, .1)                 1.87 (1.36, 2.57)    1.81 (1.32, 2.49)
(.2, .2)                 1.90 (1.36, 2.65)    1.84 (1.33, 2.56)
(.3, .3)                 1.94 (1.37, 2.75)    1.89 (1.34, 2.65)
(.4, .4)                 1.98 (1.37, 2.88)    1.95 (1.36, 2.80)
(.5, .5)                 2.03 (1.35, 3.06)    2.06 (1.38, 3.06)

Although the ML method uses an additional model assumption that the S method does not, they provide almost identical results. For larger values of $\zeta_0 = \zeta_1$, the estimates start to increase rapidly. The curves of the estimates across $\zeta_0 = \zeta_1$ are displayed in Figure 2. For all values of $\zeta_0 = \zeta_1$, the confidence bands are located far above the red line that indicates the null effect. We further extended the sensitivity analysis on the line $\zeta_0 = \zeta_1$ to the region $0 \le \zeta_0, \zeta_1 \le .$ Figure 3 shows the estimates over values of $\zeta_0$ and $\zeta_1$ in this region. The dashed diagonal line indicates the line $\zeta_0 = \zeta_1$. The estimates along this line are represented in Figure 2. As shown in this figure, most of the estimates are above one. For the region of $\zeta_0 \ge \zeta_1 + 0.$ $\le \zeta_1 \le .1$, the estimates are below 1.

Figure 2. Estimates of the marginal OR with 95% CIs across the line of $\zeta_0 = \zeta_1$.

As discussed in Section 3.3, the R-factor may be of interest. The initial conclusion obtained when $\zeta_0 = \zeta_1 = 0$ is that there is a strong effect of child abuse on adult anger. The R-factor indicates when this conclusion would be altered. Since Proposition 3 shows that $\hat{\psi}_S(\zeta_0, \zeta_1)$ is a decreasing function of $\zeta_0$, we fix $\zeta_1 = 0$ and find the minimum value of $\zeta_0$ such that the 95% CI contains 1. The R-factor is computed as 0.24 in this application. This shows that unless more than 24% of controls under-reported their exposures while all cases reported correctly, the initial conclusion remains unchanged. Therefore, the R-factor can be used for examining the robustness to recall bias.

7. Discussion

Figure 3. A contour plot for the values of $(\zeta_0, \zeta_1)$ in the region $0 \le \zeta_0, \zeta_1 \le .$ Values of $\hat{\psi}$ are obtained from the stratification method. The diagonal line corresponds to the line of $\zeta_0 = \zeta_1$.

In this paper, we have introduced a causal inference framework for case-control studies that accounts for recall bias, using causal estimands such as the marginal, conditional, and common CORs. We considered a set of two tuning parameters that characterizes all possible combinations of recall bias. We have proposed two estimation approaches for recovering the marginal COR (maximum likelihood and stratification) in the presence of recall bias. Our proposed approach has the following features. First, to the best of our knowledge, ours is the first attempt to estimate the marginal COR accounting for recall bias while adjusting for confounding bias. Second, we demonstrated theoretically and empirically that failing to account for recall bias can lead to substantial bias in the estimation of the COR. We also showed easy-to-check conditions under which $\psi^*$ is greater than $\psi$ or vice versa. Third, we developed two estimation approaches, ML and stratification, based on the identification results in causal inference. In particular, stratification can be used to reduce the risk of model misspecification. Finally, we developed a sensitivity analysis and introduced the R-factor, which provides information about how much recall bias is needed to qualitatively alter the causal conclusion. This offers practical guidance for practitioners to examine the robustness of their findings.
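The R-factor search described in Section 6 (fix $\zeta_1 = 0$ and increase $\zeta_0$ until the 95% CI first contains 1) can be sketched on a toy example. The sketch below is a deliberate simplification and not the authors' estimator: it uses a single unadjusted 2x2 table instead of the stratified or ML estimator, a delta-method standard error instead of the bootstrap, and a grid search with a hypothetical step size. All function names and the counts in the usage line are assumptions made for illustration, and the search presumes the initial odds ratio is above 1.

```python
import math


def corrected_log_or(a1, n1, a0, n0, zeta1, zeta0):
    """Log OR and delta-method SE from one 2x2 table, correcting under-reporting.

    a1 / n1: reported-exposed count / total among cases (Y = 1)
    a0 / n0: reported-exposed count / total among controls (Y = 0)
    zeta_y = P(T* = 0 | T = 1, Y = y), treated as known; no false positives.
    """
    logits, variances = [], []
    for a, n, zeta in ((a1, n1, zeta1), (a0, n0, zeta0)):
        p_obs = a / n                    # reported exposure prevalence
        p = p_obs / (1.0 - zeta)         # corrected prevalence
        if not 0.0 < p < 1.0:
            raise ValueError("zeta is incompatible with the observed counts")
        logits.append(math.log(p / (1.0 - p)))
        var_p = p_obs * (1.0 - p_obs) / n / (1.0 - zeta) ** 2
        variances.append(var_p / (p * (1.0 - p)) ** 2)
    return logits[0] - logits[1], math.sqrt(sum(variances))


def ci_lower(a1, n1, a0, n0, zeta0, zeta1=0.0, z=1.96):
    """Lower 95% confidence limit on the log-OR scale."""
    log_or, se = corrected_log_or(a1, n1, a0, n0, zeta1, zeta0)
    return log_or - z * se


def r_factor(a1, n1, a0, n0, step=0.01):
    """Smallest grid value of zeta0 (zeta1 fixed at 0) whose 95% CI includes OR = 1."""
    k = 0
    while k * step < 1.0:
        try:
            if ci_lower(a1, n1, a0, n0, k * step) <= 0.0:
                return k * step
        except ValueError:
            return None   # corrected prevalence left (0, 1) before the CI reached 1
        k += 1
    return None


# Hypothetical counts: 60 of 200 cases and 300 of 1800 controls report exposure.
r = r_factor(60, 200, 300, 1800)
```

The search only moves $\zeta_0$ because, by the monotonicity in Proposition 3, under-reporting among controls is the direction that pulls the estimate toward the null; a larger returned value means the initial conclusion is more robust to recall bias.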

We proposed and compared two estimation methods for recovering the marginal COR. However, both rely on model assumptions. Though the stratification method requires fewer assumptions than the ML method does, prognostic stratification requires an additional assumption that there is no effect modification. Another limitation is that the two tuning parameters may be too simple to describe recall bias. We assumed that the misclassification probabilities $(\eta_0, \eta_1)$ or $(\zeta_0, \zeta_1)$ are equal for all individuals, but this may not be realistic. Instead of assigning the same probability, we may restrict the probability to lie within a certain interval. We plan to extend the proposed method to matching, which usually requires fewer model assumptions. We leave this for future research.

Although our main goal is to make inference for the marginal COR, in sparse data cases such as matched case-control study data, methods for estimating the marginal COR may fail to provide a meaningful estimate. In such cases, the Mantel-Haenszel (MH) method can be used for summarizing a large number of strata and estimating a common OR across strata. The MH estimator is not an appropriate estimator for the marginal COR since the common COR is typically not the same as the marginal COR (Austin, 2007). However, the MH estimator can work well if the stratum-specific CORs are expected to be equal. Also, it can produce a simple summary of several contingency tables. We developed a new MH-type estimation method for a common COR that accounts for recall bias; it is described in the Supplementary Material. The MH method uses matching to create strata, but does not require any of the model assumptions that the ML method makes. Furthermore, the MH method provides a closed-form estimator, so the impact of recall bias is analytically tractable. However, it requires an assumption on the target parameter, namely that the stratum-specific CORs are identical. This may be implausible in some cases.

8. Software

The methods described in this paper are available at https://github.com/kwonsang/recall_bias_case_control_study.

Supplementary Material

Supplementary material is available online. These materials contain the Mantel-Haenszel method for estimating the common COR. Also, the population marginal COR is discussed. Proofs for Propositions 1, 2, and 3 are described and the variance estimators are discussed.

Acknowledgments

Conflict of Interest: The idea behind the development of this paper came from a consultation on a lawsuit against Colgate, but in a different context with a different data set.

Funding

This work was supported by NIH grants (R01ES026217, R01MD012769, R01ES028033, 1R01ES030616, 1R01AG066793-01R01, 1R01ES029950, R01ES028033-S1), the Alfred P. Sloan Foundation (G-2020-13946), and the Vice Provost for Research at Harvard University (Climate Change Solutions Fund).

References

Austin, P. C. (2007). The performance of different propensity score methods for estimating marginal odds ratios. Statistics in Medicine, 26(16):3078–3094.

Austin, P. C., Grootendorst, P., Normand, S.-L. T., and Anderson, G. M. (2007). Conditioning on the propensity score can result in biased estimation of common measures of treatment effect: A Monte Carlo study. Statistics in Medicine, 26:754–768.

Barry, D. (1996). Differential recall bias and spurious associations in case/control studies. Statistics in Medicine, 15(23):2603–2616.

Breslow, N. (1982). Design and analysis of case-control studies. Annual Review of Public Health, 3(1):29–54.

Breslow, N. E. and Day, N. E. (1980). Statistical methods in cancer research, Volume 1 - The analysis of case-control studies. International Agency for Research on Cancer: Lyon.

Chouinard, E. and Walter, S. (1995). Recall bias in case-control studies: an empirical analysis and theoretical framework. Journal of Clinical Epidemiology, 48(2):245–254.

Coughlin, S. S. (1990). Recall bias in epidemiologic studies. Journal of Clinical Epidemiology, 43(1):87–91.

Drews, C. D. and Greenland, S. (1990). The impact of differential recall on the results of case-control studies. International Journal of Epidemiology, 19(4):1107–1112.

Fergusson, D. M., Horwood, L. J., and Woodward, L. J. (2000). The stability of child abuse reports: a longitudinal study of the reporting behaviour of young adults. Psychological Medicine, 30(3):529–544.

Forbes, A. and Shortreed, S. (2008). Inverse probability weighted estimation of the marginal odds ratio: Correspondence regarding ‘The performance of different propensity score methods for estimating marginal odds ratios’ by P. Austin, Statistics in Medicine, 2007; 26: 3078–3094. Statistics in Medicine, 27(26):5556–5559.

Graf, E. and Schumacher, M. (2008). Comments on ‘The performance of different propensity score methods for estimating marginal odds ratios’ by Peter C. Austin, Statistics in Medicine 2007; 26(16): 3078–3094. Statistics in Medicine, 27(19):3915–3917.

Greenland, S. (1996). Basic methods for sensitivity analysis of biases. International Journal of Epidemiology, 25(6):1107–1116.

Greenland, S. and Kleinbaum, D. G. (1983). Correcting for misclassification in two-way tables and matched-pair studies. International Journal of Epidemiology, 12(1):93–97.

Greenland, S. and Robins, J. M. (1986). Identifiability, exchangeability, and epidemiological confounding. International Journal of Epidemiology, 15(3):413–419.

Greenland, S., Robins, J. M., and Pearl, J. (1999). Confounding and collapsibility in causal inference. Statistical Science, 14(1):29–46.

Hansen, B. B. (2008). The prognostic analogue of the propensity score. Biometrika, 95(2):481–488.

Lewallen, S. and Courtright, P. (1998). Epidemiology in practice: case-control studies. Community Eye Health, 11(17492047):57–58.

Mann, C. J. (2003). Observational research methods. Research design II: cohort, cross sectional, and case-control studies. Emergency Medicine Journal, 20(1):54–60.

Persson, E. and Waernbaum, I. (2013). Estimating a marginal causal odds ratio in a case-control design: Analyzing the effect of low birth weight on the risk of type 1 diabetes mellitus. Statistics in Medicine, 32(14):2500–2512.

Raphael, K. (1987). Recall bias: a proposal for assessment and control. International Journal of Epidemiology, 16(2):167–170.

Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55.

Rothman, K. J. (2012). Epidemiology: an introduction. Oxford University Press.

Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688.

Schulz, K. F. and Grimes, D. A. (2002). Case-control studies: research in reverse. The Lancet, 359(9304):431–434.

Small, D. S., Cheng, J., Halloran, M. E., and Rosenbaum, P. R. (2013). Case definition and design sensitivity. Journal of the American Statistical Association, 108(504):1457–1468.

Spielberger, C. D. (1988). State-trait anger expression inventory professional manual. Psychological Assessment Resources.

Splawa-Neyman, J. (1990). On the application of probability theory to agricultural experiments. Essay on principles. Section 9. Statistical Science, pages 465–472.

Springer, K. W., Sheridan, J., Kuo, D., and Carnes, M. (2007). Long-term physical and mental health consequences of childhood physical abuse: Results from a large population-based sample of men and women. Child Abuse & Neglect, 31(5):517–530.

Stampf, S., Graf, E., Schmoor, C., and Schumacher, M. (2010). Estimators and confidence intervals for the marginal odds ratio using logistic regression and propensity score stratification. Statistics in Medicine, 29(7-8):760–769.

Stefanski, L. A. and Boos, D. D. (2002). The calculus of M-estimation. The American Statistician, 56(1):29–38.

Zhang, Z. (2008). Estimating a marginal causal odds ratio subject to confounding.