On the plausibility of the latent ignorability assumption

Abstract

The estimation of the causal effect of an endogenous treatment based on an instrumental variable (IV) is often complicated by attrition, sample selection, or non-response in the outcome of interest. To tackle the latter problem, the latent ignorability (LI) assumption imposes that attrition/sample selection is independent of the outcome conditional on the treatment compliance type (i.e. how the treatment behaves as a function of the instrument), the instrument, and possibly further observed covariates. As a word of caution, this note formally discusses the strong behavioral implications of LI in rather standard IV models. We also provide an empirical illustration based on the Job Corps experimental study, in which the sensitivity of the estimated program effect to LI and alternative assumptions about outcome attrition is investigated.


arXiv [econ.EM]

Martin Huber

June 4, 2020

University of Fribourg, Dept. of Economics


Keywords: instrument, non-response, attrition, sample selection, latent ignorability.

JEL classification: C21, C24, C26.

Address for correspondence: Martin Huber, University of Fribourg, Bd. de Pérolles 90, 1700 Fribourg, Switzerland, [email protected].

Introduction

A frequently encountered complication when estimating the effect of a potentially endogenous treatment based on instrumental variable (IV) methods is attrition/sample selection/non-response bias in the outcome. To account for this problem, the missing at random (MAR) assumption (e.g. Rubin (1976)), for instance, requires outcome attrition to depend only on observable variables. Alternatively, Frangakis and Rubin (1999) propose a latent ignorability (LI) restriction, which assumes attrition to be independent of the outcome conditional on the instrument and the treatment compliance type (i.e. whether one is a complier or non-complier in the notation of Angrist, Imbens, and Rubin (1996)). In the IV framework both assumptions can be combined (e.g. Mealli, Imbens, Ferro, and Biggeri (2004)), imposing independence conditional on the compliance type, the instrument, and further observables.

We argue that LI is nevertheless quite restrictive, as attrition is not allowed to be related to unobservables affecting the outcome in a very general way. Section 2 formally discusses the strong behavioral implications of LI in standard IV models with non-response. This assumption should therefore be cautiously scrutinized in applications. As an example, consider Barnard, Frangakis, Hill, and Rubin (2003), who assess a randomized voucher program for private schooling with noncompliance (where the IV is the randomization and the treatment is private schooling) and attrition in the test score outcomes, because some children did not take the test. Unobservables such as ability or motivation likely affect both test taking and test scores. LI (combined with MAR) requires that conditional on the compliance type (i.e. private schooling as a function of voucher receipt), voucher assignment, and observed covariates, test taking is not related to ability or motivation (and thus, test scores). Among compliers (only in private schooling when randomized in), those taking the test must thus have the same distribution of ability and motivation as those abstaining. However, even within compliers, heterogeneity in ability and motivation may be sufficiently high to selectively affect test taking such that LI fails. Section 3 provides an empirical illustration using the Job Corps experimental study, in which the estimated program effect under LI is compared to estimates under alternative assumptions about outcome attrition.

IV models with nonresponse

Assume the following parametric IV model with nonresponse:

$$Y = \alpha_0 + D\alpha_1 + U, \qquad D = 1(\beta_0 + Z\beta_1 \geq V), \qquad R = 1(\gamma_0 + D\gamma_1 \geq W). \tag{1}$$

$Y$ is the outcome of interest, $D$ is the binary (and potentially endogenous) treatment, and $R$ is the response indicator. Note that $1(\cdot)$ is the indicator function that is equal to one if its argument is satisfied and zero otherwise. $Y$ is only observed if $R = 1$ and unknown if $R = 0$, implying non-response, sample selection, or attrition. $Z$ is a randomly assigned instrument affecting $D$ (but not directly $Y$ or $R$) and assumed to be binary, e.g., the randomization indicator in an experiment. $U, V, W$ denote arbitrarily associated unobservables, and $\alpha_0, \alpha_1, \beta_0, \beta_1, \gamma_0, \gamma_1$ are coefficients.

Angrist, Imbens, and Rubin (1996) define four compliance types, denoted by $T$, based on how the potential treatment status depends on the instrument: an individual is a complier (defier) if her potential treatment state is one (zero) in the presence and zero (one) in the absence of the instrument, and an always-taker (never-taker) if the potential treatment is always (never) one, independent of the instrument. Assume that $\beta_1$ is positive (a symmetric case could be made for a negative $\beta_1$). Then, an individual is a complier if $\beta_0 + \beta_1 \geq V > \beta_0$, an always-taker if $\beta_0 \geq V$, and a never-taker if $\beta_0 + \beta_1 < V$. Defiers do not exist due to the positive sign of $\beta_1$.

We now impose the following latent ignorability (LI) assumption, see Frangakis and Rubin (1999), and critically assess it in the light of our standard IV model with attrition:

Assumption 1 (latent ignorability): $Y \perp R \mid Z, T$ (where '$\perp$' denotes independence),

which is equivalent to $Y \perp R \mid Z, D, T$ as $Z$ and $T$ perfectly determine $D$. Furthermore, we assume that the error term $U$ is continuous, such that $Y$ is continuous. Finally, for the moment we also impose that $U = V = W$ such that the same unobservable (e.g. motivation) affects the outcome (e.g. test score), treatment (e.g. private schooling), and response (e.g. test taking).

Note that Assumption 1 implies that the distribution of $U$ among compliers is the same across response states given the instrument:

$$\begin{aligned}
& E(f(Y) \mid Z = 1, T = c, R = 1) = E(f(Y) \mid Z = 1, T = c, R = 0) \\
\Leftrightarrow\ & E(f(U) \mid Z = 1,\ \beta_0 + \beta_1 \geq U > \beta_0,\ \gamma_0 + \gamma_1 \geq U) = E(f(U) \mid Z = 1,\ \beta_0 + \beta_1 \geq U > \beta_0,\ \gamma_0 + \gamma_1 < U),
\end{aligned} \tag{2}$$

where $f(\cdot)$ denotes an arbitrary function with a finite expectation and the second line follows from the parametric model in (1). Obviously, the joint satisfaction of $U = V = W$ and (2) is impossible in this context, as the distributions of $U$ conditional on $\gamma_0 + \gamma_1 \geq U$ and $\gamma_0 + \gamma_1 < U$, respectively, are non-overlapping. An analogous impossibility result holds for $E(f(Y) \mid Z = 0, T = c, R = 1) = E(f(Y) \mid Z = 0, T = c, R = 0)$, which is also implied by Assumption 1.

Imposing $U = V = W$ seems too extreme for most applications and was chosen for illustrative purposes. However, even if the unobserved terms in the various equations are not the same, but non-negligibly correlated as commonly assumed in IV models, identification may seem questionable. Suppose, for instance, that $W = \delta V + \epsilon$, where $\epsilon$ is random noise and $\delta$ is a coefficient. Then, Assumption 1 and the model in (1) imply that

$$\begin{aligned}
& E(f(U) \mid Z = 1,\ \beta_0 + \beta_1 \geq V > \beta_0,\ \gamma_0 + \gamma_1 \geq \delta V + \epsilon) \\
&\quad = E(f(U) \mid Z = 1,\ \beta_0 + \beta_1 \geq V > \beta_0,\ \gamma_0 + \gamma_1 < \delta V + \epsilon) \\
\Leftrightarrow\ & E\left(f(U) \,\Big|\, Z = 1,\ \min\left(\beta_0 + \beta_1, \frac{\gamma_0 + \gamma_1 - \epsilon}{\delta}\right) \geq V > \beta_0\right) \\
&\quad = E\left(f(U) \,\Big|\, Z = 1,\ \beta_0 + \beta_1 \geq V > \max\left(\beta_0, \frac{\gamma_0 + \gamma_1 - \epsilon}{\delta}\right)\right).
\end{aligned} \tag{3}$$

The rewriting in terms of $V$ presumes $\delta > 0$. If $U$ is associated with either $\epsilon$, $V$, or both, the latter equality does not hold in general, but only if the association of $U$, $\epsilon$, $V$ is of a very specific form, which raises concerns about Assumption 1.

Finally, we consider an IV model that is more general in terms of functional form assumptions, where $Y$, $D$, and $R$ are given by nonparametric functions denoted by $\phi$, $\psi$, and $\eta$, respectively:

$$Y = \phi(D, U), \qquad D = 1(\psi(Z, V) \geq 0), \qquad R = 1(\eta(D, W) \geq 0). \tag{4}$$

Under this model, Assumption 1 implies that

$$E(f(U) \mid Z = 1,\ \psi(1, V) \geq 0,\ \psi(0, V) < 0,\ \eta(1, W) \geq 0) = E(f(U) \mid Z = 1,\ \psi(1, V) \geq 0,\ \psi(0, V) < 0,\ \eta(1, W) < 0). \tag{5}$$

This can be satisfied in special cases, for instance if $U = \pi 1(\psi(1, V) \geq 0, \psi(0, V) < 0) + \varepsilon$, with $\pi$ denoting the (homogeneous) effect of being a complier and $\varepsilon$ being random noise. Then, (5) simplifies to

$$E(f(\varepsilon) \mid Z = 1,\ \psi(1, V) \geq 0,\ \psi(0, V) < 0,\ \eta(1, W) \geq 0) = E(f(\varepsilon) \mid Z = 1,\ \psi(1, V) \geq 0,\ \psi(0, V) < 0,\ \eta(1, W) < 0),$$

which holds if $\varepsilon$ is independent of $W$. In general, identification requires that $T$ is a sufficient statistic to control for the endogeneity introduced by conditioning on $R$. This, however, implies that the association between $U$, $V$, and $W$ is quite specific, otherwise Assumption 1 does not hold.

Empirical illustration

As an illustration for treatment evaluation under LI and alternative assumptions about attrition, we consider the experimental evaluation of the U.S. Job Corps program (see for instance Schochet, Burghardt, and Glazerman (2001)), which provides training and education for young disadvantaged individuals. We aim at estimating the effect of program participation ($D$) in the first or second year after randomization into Job Corps ($Z$) on log weekly wages of females in the third year ($Y$). Of the 4,765 females in the experimental sample with observed treatment status, wages are only observed for 3,682 individuals ($R = 1$), while 1,083 do not report to work.

Reconsidering the IV model of (4), we assume that in each of $\phi$, $\psi$, and $\eta$ a vector of observed covariates, denoted by $X$, may enter as additional explanatory variables.
Similar to Frölich and Huber (2014), Section 2.2, we assume that (i) Assumption 1 holds conditional on $X$ (thus combining LI and MAR), (ii) $U \perp Z \mid X, T$ such that the instrument affects the outcome only through the treatment, (iii) $T \perp Z \mid X$, which is implied by random assignment, (iv) $\Pr(T = c) > 0$ and $\Pr(T = d) = 0$ so that compliers exist and defiers are ruled out, and (v) $0 < \Pr(Z = 1 \mid X) < 1$. $X$ (measured prior to randomization) includes education, ethnicity, age and its square, school and working status, and receipt of Aid to Families with Dependent Children (AFDC) and food stamps.

Table 1: Descriptive statistics

                                          total sample           working       not working
                                         mean  std.dev     mean  std.dev     mean  std.dev
education: 12 years                      0.23     0.42     0.25     0.44     0.17     0.37
education: 13 or more years              0.03     0.18     0.04     0.19     0.01     0.10
race: black                              0.54     0.50     0.53     0.50     0.56     0.50
race: Hispanic                           0.19     0.39     0.18     0.38     0.21     0.40
age                                     18.59     2.18    18.66     2.19    18.37     2.14
in school prior to randomization         0.63     0.48     0.63     0.48     0.61     0.49
school information missing               0.02     0.14     0.02     0.13     0.03     0.17
in job prior to randomization            0.61     0.49     0.65     0.48     0.47     0.50
received AFDC                            0.41     0.49     0.40     0.49     0.45     0.50
received food stamps                     0.54     0.50     0.52     0.50     0.60     0.49
treatment: Job Corps participation       0.45     0.50     0.46     0.50     0.41     0.49
instrument: randomization                0.64     0.48     0.66     0.48     0.60     0.49
instrument: kids under 6                 0.77     0.90     0.73     0.88     0.88     0.95
instrument: kids under 15                1.15     1.26     1.12     1.23     1.25     1.34

We compare semiparametric LATE estimation based on the latter assumptions (see Theorem 1 in Frölich and Huber (2014)) to (i) MAR-based LATE estimation as in Section 2.3 of Frölich and Huber (2014) (assumptions: $Y \perp R \mid X, Z, D$, $(U, T) \perp Z \mid X$, $\Pr(T = c) > 0$, $\Pr(T = d) = 0$, $0 < \Pr(Z = 1 \mid X) < 1$), (ii) the Wald estimator among observations with $R = 1$ (ignoring sample selection), and (iii) the method of Fricke, Frölich, Huber, and Lechner (2020), which tackles sample selection and treatment endogeneity by two distinct instruments. In the latter approach, which allows for non-ignorable selection related to $U$ in a more general way than LI, we use the number of kids younger than 6 in the household 2.5 years after random assignment as instrument for $R$. We apply a semiparametric version of the estimator outlined in equation (23) of Fricke, Frölich, Huber, and Lechner (2020) along with the weighting function in their expression (21).

Table 2: Effect estimates

                                      LI + MAR       MAR      Wald     2 IVs
effect                                    0.12      0.16      0.12      0.16
standard error                            0.06      0.06      0.05      0.33
bootstrap p-values (quantile-based)       0.05      0.00      0.03      0.65
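As a point of reference for the simplest of the estimators compared here, the Wald estimator restricted to respondents can be sketched as follows. This is a minimal illustration and not the implementation behind Table 2: the function name `wald_estimate` and the toy data are assumptions made for this sketch, and covariates are ignored.

```python
import numpy as np

def wald_estimate(y, d, z, r):
    """Wald (ratio) estimator computed on respondents only (r == 1).

    Hypothetical helper for illustration; it ignores covariates and
    any selection into response.
    """
    y, d, z, r = (np.asarray(a, dtype=float) for a in (y, d, z, r))
    keep = r == 1                      # outcomes are observed only if r == 1
    y, d, z = y[keep], d[keep], z[keep]
    # Intention-to-treat effects of the instrument on outcome and treatment
    itt_outcome = y[z == 1].mean() - y[z == 0].mean()
    first_stage = d[z == 1].mean() - d[z == 0].mean()
    return itt_outcome / first_stage

# Toy data (made up): y = 1 + d for respondents, missing otherwise
z = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
r = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y = [2, 2, 2, 1, np.nan, 1, 1, 1, 2, np.nan]
toy_effect = wald_estimate(y, d, z, r)
print(toy_effect)
```

In the toy data the outcome equals one plus the treatment for every respondent, so the ratio recovers an effect of one. With selective non-response this estimator offers no such guarantee, which is why it serves only as a benchmark in the comparison.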

Table 1 provides descriptive statistics for the covariates, the treatment, and the instruments in the total sample and for working and not working females. Across the latter groups, for instance, education, aid receipt, previous job status, and Job Corps participation differ substantially, pointing to non-random selection into employment. Table 2 presents the effect estimates, standard errors, and p-values based on 1999 bootstrap replications using the quantile method. The effect under LI + MAR (based on Theorem 1 of Frölich and Huber (2014)) of 0.12 log points is virtually identical to the Wald estimate, which ignores sample selection bias, and both are statistically significantly different from zero. The MAR-based estimate is one third higher, but not significantly so. The method of Fricke, Frölich, Huber, and Lechner (2020) based on two instruments (2 IVs) yields virtually the same effect as MAR and is neither statistically significantly different from any other estimator, nor from zero at any conventional level.

It seems important to understand the differences in the behavioral assumptions of the estimators. LI + MAR, for instance, assumes that given the covariates and program assignment, unobservables like ability and motivation do not jointly affect employment and wages among compliers. In contrast, the method of Fricke, Frölich, Huber, and Lechner (2020) does not rely on this restriction and allows for more general forms of sample selection, at the cost of also requiring a valid instrument for employment. In our illustration, the results turned out to be rather robust to the different assumptions considered, which need not necessarily hold in other contexts.
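The behavioral restriction behind LI can also be made concrete with a small simulation of the parametric model (1), using the variant in which the response unobservable is W = delta * V + eps. The sketch below is only an illustration under assumed values: the function name `complier_gap`, all coefficient values, the normal distributions, and the parameter `rho` (which governs how strongly eps is associated with U) are hypothetical choices and do not come from the paper. It compares mean outcomes of responding and non-responding compliers with Z = 1, a comparison that is feasible only in simulated data, where outcomes of non-respondents are known.

```python
import numpy as np

def complier_gap(delta, rho, n=200_000, seed=0):
    """Mean outcome of responding minus non-responding compliers with Z = 1.

    Simulates model (1) with W = delta * V + eps. All numerical values
    are hypothetical choices for illustration only.
    """
    rng = np.random.default_rng(seed)
    a0, a1 = 0.0, 1.0     # outcome equation (assumed values)
    b0, b1 = -0.5, 1.0    # treatment equation; b1 > 0 rules out defiers
    g0, g1 = 0.0, 0.5     # response equation (assumed values)

    z = rng.integers(0, 2, size=n)        # randomly assigned instrument
    v = rng.standard_normal(n)
    e_u = rng.standard_normal(n)
    u = 0.5 * v + e_u                     # U is associated with V
    # eps is associated with U (through e_u) whenever rho != 0
    eps = rho * e_u + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    w = delta * v + eps

    d = (b0 + z * b1 >= v).astype(float)  # treatment
    r = g0 + d * g1 >= w                  # response indicator
    y = a0 + d * a1 + u                   # outcome, known here even if r is False

    complier = (v > b0) & (v <= b0 + b1)
    sel = complier & (z == 1)
    return y[sel & r].mean() - y[sel & ~r].mean()

gap_independent = complier_gap(delta=0.0, rho=0.0)  # W unrelated to U and V
gap_dependent = complier_gap(delta=1.0, rho=0.8)    # W related to U and V
print(gap_independent, gap_dependent)
```

Under LI the gap would be zero. In this sketch it should be close to zero when W is pure noise and clearly negative when W is associated with U and V, since respondents then have systematically lower values of the outcome unobservable.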

References

Angrist, J., G. Imbens, and D. Rubin (1996): “Identification of Causal Effects using Instrumental Variables,” Journal of the American Statistical Association, 91, 444–472 (with discussion).

Barnard, J., C. Frangakis, J. Hill, and D. Rubin (2003): “A Principal Stratification Approach to Broken Randomized Experiments: A Case Study of School Choice Vouchers in New York City,” Journal of the American Statistical Association, 98, 299–323.

Frangakis, C., and D. Rubin (1999): “Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes,” Biometrika, 86, 365–379.

Fricke, H., M. Frölich, M. Huber, and M. Lechner (2020): “Endogeneity and Non-Response Bias in Treatment Evaluation - Nonparametric Identification of Causal Effects by Instruments,” forthcoming in the Journal of Applied Econometrics.

Frölich, M., and M. Huber (2014): “Treatment evaluation with multiple outcome periods under endogeneity and attrition,” Journal of the American Statistical Association, 109, 1697–1711.

Mealli, F., G. Imbens, S. Ferro, and A. Biggeri (2004): “Analyzing a randomized trial on breast self-examination with noncompliance and missing outcomes,” Biostatistics, 5, 207–222.

Rubin, D. (1976): “Inference and Missing Data,” Biometrika, 63, 581–592.

Schochet, P., J. Burghardt, and S. Glazerman (2001): “National Job Corps Study: The Impacts of Job Corps on Participants’ Employment and Related Outcomes,” Report (Washington, DC: Mathematica Policy Research, Inc.).