On the plausibility of the latent ignorability assumption
Martin Huber
June 4, 2020
University of Fribourg, Dept. of Economics
Abstract:
The estimation of the causal effect of an endogenous treatment based on an instrumental variable (IV) is often complicated by attrition, sample selection, or non-response in the outcome of interest. To tackle the latter problem, the latent ignorability (LI) assumption imposes that attrition/sample selection is independent of the outcome conditional on the treatment compliance type (i.e. how the treatment behaves as a function of the instrument), the instrument, and possibly further observed covariates. As a word of caution, this note formally discusses the strong behavioral implications of LI in rather standard IV models. We also provide an empirical illustration based on the Job Corps experimental study, in which the sensitivity of the estimated program effect to LI and alternative assumptions about outcome attrition is investigated.
Keywords: instrument, non-response, attrition, sample selection, latent ignorability.
JEL classification:
C21, C24, C26.
Address for correspondence: Martin Huber, University of Fribourg, Bd. de Pérolles 90, 1700 Fribourg, Switzerland, [email protected].
Introduction
A frequently encountered complication when estimating the effect of a potentially endogenous treatment based on instrumental variable (IV) methods is attrition/sample selection/non-response bias in the outcome. To account for this problem, the missing at random (MAR) assumption (e.g. Rubin (1976)), for instance, requires outcome attrition to depend only on observable variables. Alternatively, Frangakis and Rubin (1999) propose a latent ignorability (LI) restriction, which assumes attrition to be independent of the outcome conditional on the instrument and the treatment compliance type (i.e. whether one is a complier or non-complier in the notation of Angrist, Imbens, and Rubin (1996)). In the IV framework both assumptions can be combined (e.g. Mealli, Imbens, Ferro, and Biggeri (2004)), imposing independence conditional on the compliance type, the instrument, and further observables.

We argue that LI is nevertheless quite restrictive, as attrition is not allowed to be related to unobservables affecting the outcome in a very general way. Section 2 formally discusses the strong behavioral implications of LI in standard IV models with non-response. This assumption should therefore be carefully scrutinized in applications. As an example, consider Barnard, Frangakis, Hill, and Rubin (2003), who assess a randomized voucher program for private schooling with noncompliance (where the IV is the randomization and the treatment is private schooling) and attrition in the test score outcomes, because some children did not take the test. Unobservables such as ability or motivation likely affect both test taking and test scores. LI (combined with MAR) requires that conditional on the compliance type (i.e. private schooling as a function of voucher receipt), voucher assignment, and observed covariates, test taking is not related to ability or motivation (and thus, test scores). Among compliers (only in private schooling when randomized in), those taking the test must thus have the same distribution of ability and motivation as those abstaining. However, even within compliers, heterogeneity in ability and motivation may be sufficiently high to selectively affect test taking such that LI fails. Section 3 provides an empirical illustration using the Job Corps experimental study, in which the estimated program effect under LI is compared to alternative assumptions about outcome attrition.
IV models with nonresponse
Assume the following parametric IV model with nonresponse:

$$Y = \alpha_0 + D\alpha_1 + U, \quad D = 1(\beta_0 + Z\beta_1 \geq V), \quad R = 1(\gamma_0 + D\gamma_1 \geq W). \tag{1}$$

$Y$ is the outcome of interest, $D$ is the binary (and potentially endogenous) treatment, and $R$ is the response indicator. Note that $1(\cdot)$ is the indicator function that is equal to one if its argument is satisfied and zero otherwise. $Y$ is only observed if $R = 1$ and unknown if $R = 0$, implying non-response, sample selection, or attrition. $Z$ is a randomly assigned instrument affecting $D$ (but not directly $Y$ or $R$) and assumed to be binary, e.g., the randomization indicator in an experiment. $U, V, W$ denote arbitrarily associated unobservables, and $\alpha_0, \alpha_1, \beta_0, \beta_1, \gamma_0, \gamma_1$ are coefficients.

Angrist, Imbens, and Rubin (1996) define four compliance types, denoted by $T$, based on how the potential treatment status depends on the instrument: An individual is a complier (defier) if her potential treatment state is one (zero) in the presence and zero (one) in the absence of the instrument, and an always-taker (never-taker) if the potential treatment is always (never) one, independent of the instrument. Assume that $\beta_1$ is positive (a symmetric case could be made for a negative $\beta_1$). Then, an individual is a complier if $\beta_0 + \beta_1 \geq V > \beta_0$, an always-taker if $\beta_0 \geq V$, and a never-taker if $\beta_0 + \beta_1 < V$. Defiers do not exist due to the positive sign of $\beta_1$.

We now impose the following latent ignorability (LI) assumption, see Frangakis and Rubin (1999), and critically assess it in the light of our standard IV model with attrition:

Assumption 1 (latent ignorability): $Y \perp R \mid Z, T$ (where '$\perp$' denotes independence),

which is equivalent to $Y \perp R \mid Z, D, T$, as $Z$ and $T$ perfectly determine $D$. Furthermore, we assume that the error term $U$ is continuous, such that $Y$ is continuous. Finally, for the moment we also impose that $U = V = W$, such that the same unobservable (e.g. motivation) affects the outcome (e.g. test score), treatment (e.g. private schooling), and response (e.g. test taking).

Note that Assumption 1 implies that the distribution of $U$ among compliers is the same across response states given the instrument:

$$E(f(Y) \mid Z=1, T=c, R=1) = E(f(Y) \mid Z=1, T=c, R=0) \tag{2}$$
$$\Leftrightarrow E(f(U) \mid Z=1,\ \beta_0+\beta_1 \geq U > \beta_0,\ \gamma_0+\gamma_1 \geq U) = E(f(U) \mid Z=1,\ \beta_0+\beta_1 \geq U > \beta_0,\ \gamma_0+\gamma_1 < U),$$

where $f(\cdot)$ denotes an arbitrary function with a finite expectation and the second line follows from the parametric model in (1). Obviously, the joint satisfaction of $U = V = W$ and (2) is impossible in this context, as the distributions of $U$ conditional on $\gamma_0+\gamma_1 \geq U$ and on $\gamma_0+\gamma_1 < U$, respectively, do not overlap. An analogous impossibility result holds for $E(f(Y) \mid Z=0, T=c, R=1) = E(f(Y) \mid Z=0, T=c, R=0)$, which is also implied by Assumption 1.

Imposing $U = V = W$ seems too extreme for most applications and was chosen for illustrative purposes. However, even if the unobserved terms in the various equations are not the same, but non-negligibly correlated as commonly assumed in IV models, identification may seem questionable. Suppose, for instance, that $W = \delta V + \epsilon$, where $\epsilon$ is random noise and $\delta$ is a coefficient. Then, for $\delta > 0$, Assumption 1 and the model in (1) imply that

$$E(f(U) \mid Z=1,\ \beta_0+\beta_1 \geq V > \beta_0,\ \gamma_0+\gamma_1 \geq \delta V + \epsilon) = E(f(U) \mid Z=1,\ \beta_0+\beta_1 \geq V > \beta_0,\ \gamma_0+\gamma_1 < \delta V + \epsilon) \tag{3}$$
$$\Leftrightarrow E\left(f(U) \mid Z=1,\ \min\left(\beta_0+\beta_1, \frac{\gamma_0+\gamma_1-\epsilon}{\delta}\right) \geq V > \beta_0\right) = E\left(f(U) \mid Z=1,\ \beta_0+\beta_1 \geq V > \max\left(\beta_0, \frac{\gamma_0+\gamma_1-\epsilon}{\delta}\right)\right).$$

If $U$ is associated with either $\epsilon$, $V$, or both, the latter equality does not hold in general, but only if the association of $U$, $\epsilon$, $V$ is of a very specific form, which raises concerns about Assumption 1.

Finally, we investigate an IV model that is more general in terms of functional form assumptions, where $Y$, $D$, and $R$ are given by nonparametric functions denoted by $\phi$, $\psi$, and $\eta$, respectively:

$$Y = \phi(D, U), \quad D = 1(\psi(Z, V) \geq 0), \quad R = 1(\eta(D, W) \geq 0). \tag{4}$$

Under this model, Assumption 1 implies that

$$E(f(U) \mid Z=1,\ \psi(1,V) \geq 0,\ \psi(0,V) < 0,\ \eta(1,W) \geq 0) = E(f(U) \mid Z=1,\ \psi(1,V) \geq 0,\ \psi(0,V) < 0,\ \eta(1,W) < 0). \tag{5}$$

This can be satisfied in special cases, for instance if $U = \pi \cdot 1(\psi(1,V) \geq 0,\ \psi(0,V) < 0) + \varepsilon$, with $\pi$ denoting the (homogeneous) effect of being a complier and $\varepsilon$ being random noise. Then, (5) simplifies to

$$E(f(\varepsilon) \mid Z=1,\ \psi(1,V) \geq 0,\ \psi(0,V) < 0,\ \eta(1,W) \geq 0) = E(f(\varepsilon) \mid Z=1,\ \psi(1,V) \geq 0,\ \psi(0,V) < 0,\ \eta(1,W) < 0),$$

which holds if $\varepsilon$ is independent of $W$. In general, identification requires that $T$ is a sufficient statistic to control for the endogeneity introduced by conditioning on $R$. This, however, implies that the association between $U$, $V$, and $W$ is quite specific; otherwise Assumption 1 does not hold.

Empirical illustration

As an illustration for treatment evaluation under LI and alternative assumptions about attrition, we consider the experimental evaluation of the U.S. Job Corps program (see for instance Schochet, Burghardt, and Glazerman (2001)), which provides training and education for young disadvantaged individuals. We aim at estimating the effect of program participation ($D$) in the first or second year after randomization into Job Corps ($Z$) on log weekly wages of females in the third year ($Y$). Of the 4,765 females in the experimental sample with observed treatment status, wages are only observed for 3,682 individuals ($R = 1$), while 1,083 do not report to work.

Reconsidering the IV model of (4), we assume that in each of $\phi$, $\psi$, and $\eta$ a vector of observed covariates, denoted by $X$, may enter as additional explanatory variables.
Similar to Frölich and Huber (2014), Section 2.2, we assume that (i) Assumption 1 holds conditional on $X$ (thus combining LI and MAR), (ii) $U \perp Z \mid X, T$ such that the instrument affects the outcome only through the treatment, (iii) $T \perp Z \mid X$, which is implied by random assignment, (iv) $\Pr(T = c) > 0$ and $\Pr(T = d) = 0$, so that compliers exist and defiers are ruled out, and (v) $0 < \Pr(Z = 1 \mid X) < 1$. $X$ (measured prior to randomization) includes education, ethnicity, age and its square, school and working status, and receipt of Aid to Families with Dependent Children (AFDC) and food stamps.

Table 1: Descriptive statistics

| | total sample: mean | total sample: std.dev | working: mean | working: std.dev | not working: mean | not working: std.dev |
|---|---|---|---|---|---|---|
| education: 12 years | 0.23 | 0.42 | 0.25 | 0.44 | 0.17 | 0.37 |
| education: 13 or more years | 0.03 | 0.18 | 0.04 | 0.19 | 0.01 | 0.10 |
| race: black | 0.54 | 0.50 | 0.53 | 0.50 | 0.56 | 0.50 |
| race: Hispanic | 0.19 | 0.39 | 0.18 | 0.38 | 0.21 | 0.40 |
| age | 18.59 | 2.18 | 18.66 | 2.19 | 18.37 | 2.14 |
| in school prior to randomization | 0.63 | 0.48 | 0.63 | 0.48 | 0.61 | 0.49 |
| school information missing | 0.02 | 0.14 | 0.02 | 0.13 | 0.03 | 0.17 |
| in job prior to randomization | 0.61 | 0.49 | 0.65 | 0.48 | 0.47 | 0.50 |
| received AFDC | 0.41 | 0.49 | 0.40 | 0.49 | 0.45 | 0.50 |
| received food stamps | 0.54 | 0.50 | 0.52 | 0.50 | 0.60 | 0.49 |
| treatment: Job Corps participation | 0.45 | 0.50 | 0.46 | 0.50 | 0.41 | 0.49 |
| instrument: randomization | 0.64 | 0.48 | 0.66 | 0.48 | 0.60 | 0.49 |
| instrument: kids under 6 | 0.77 | 0.90 | 0.73 | 0.88 | 0.88 | 0.95 |
| instrument: kids under 15 | 1.15 | 1.26 | 1.12 | 1.23 | 1.25 | 1.34 |

We compare semiparametric LATE estimation based on the latter assumptions (see Theorem 1 in Frölich and Huber (2014)) to (i) MAR-based LATE estimation as in Section 2.3 of Frölich and Huber (2014) (assumptions: $Y \perp R \mid X, Z, D$, $(U, T) \perp Z \mid X$, $\Pr(T = c) > 0$, $\Pr(T = d) = 0$, $0 < \Pr(Z = 1 \mid X) < 1$), (ii) the Wald estimator in the subsample with $R = 1$ (ignoring sample selection), and (iii) the method of Fricke, Frölich, Huber, and Lechner (2020), which tackles sample selection and treatment endogeneity by two distinct instruments. In the latter approach, which allows for non-ignorable selection related to $U$ in a more general way than LI, we use the number of kids younger than 6 in the household 2.5 years after random assignment as instrument for $R$. We apply a semiparametric version of the estimator outlined in equation (23) of Fricke, Frölich, Huber, and Lechner (2020) along with the weighting function in their expression (21).

Table 2: Effect estimates

| | LI + MAR | MAR | Wald | 2 IVs |
|---|---|---|---|---|
| effect | 0.12 | 0.16 | 0.12 | 0.16 |
| standard error | 0.06 | 0.06 | 0.05 | 0.33 |
| bootstrap p-values (quantile-based) | 0.05 | 0.00 | 0.03 | 0.65 |
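For reference, the Wald estimator reported in Table 2 is a ratio: the effect of the instrument on the outcome divided by its effect on the treatment, computed only on observations with R = 1. The following is a minimal sketch without covariates. The function name `wald_among_respondents` and the toy data are hypothetical and not taken from the Job Corps sample, and the sketch is not the implementation behind the reported estimates.

```python
from statistics import fmean


def wald_among_respondents(y, d, z, r):
    """Wald (ratio) estimate using only units with an observed outcome (r == 1)."""
    rows = [(yi, di, zi) for yi, di, zi, ri in zip(y, d, z, r) if ri == 1]
    y1 = [yi for yi, _, zi in rows if zi == 1]
    y0 = [yi for yi, _, zi in rows if zi == 0]
    d1 = [di for _, di, zi in rows if zi == 1]
    d0 = [di for _, di, zi in rows if zi == 0]
    first_stage = fmean(d1) - fmean(d0)   # effect of Z on D among respondents
    if first_stage == 0:
        raise ValueError("instrument does not shift treatment among respondents")
    reduced_form = fmean(y1) - fmean(y0)  # effect of Z on Y among respondents
    return reduced_form / first_stage


# Hypothetical toy data: y is None whenever the outcome is missing (r == 0).
z = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
r = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y = [3.0, 3.0, 2.0, 1.0, None, 3.0, 1.0, 2.0, 1.0, None]
toy_estimate = wald_among_respondents(y, d, z, r)
```

Units with missing outcomes are simply dropped, which is exactly why this estimator ignores sample selection.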
Table 1 provides descriptive statistics for the covariates, the treatment, and the instruments in the total sample and for working and not working females. Across the latter groups, education, aid receipt, previous job status, and Job Corps participation, for instance, differ importantly, pointing to non-random selection into employment. Table 2 presents the effect estimates, standard errors, and p-values based on 1999 bootstraps using the quantile method. The effect under LI + MAR (based on Theorem 1 of Frölich and Huber (2014)) of 0.12 log points is virtually identical to the Wald estimate, which ignores sample selection bias, and both are statistically significantly different from zero. The MAR-based estimate is one third higher, but not significantly differently so. The method of Fricke, Frölich, Huber, and Lechner (2020) based on two instruments (2 IVs) yields virtually the same effect as MAR and is neither statistically significantly different from any other estimator, nor from zero at any conventional level.

It seems important to understand the differences in the behavioral assumptions of the estimators. LI + MAR, for instance, assumes that given the covariates and program assignment, unobservables like ability and motivation do not jointly affect employment and wages among compliers. In contrast, the method of Fricke, Frölich, Huber, and Lechner (2020) does not rely on this restriction and allows for more general forms of sample selection, at the cost of also requiring a valid instrument for employment. In our illustration, the results turned out to be rather robust to the different assumptions considered, which need not necessarily hold in other contexts.
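The behavioral implication discussed for model (1) can also be illustrated numerically. The sketch below is not part of the original analysis. It simulates the treatment and response equations of model (1) with W = δV + ε, keeps compliers with Z = 1, and compares the mean of U between respondents and nonrespondents. All coefficient values, the normal distributions, and the function name `complier_gap` are assumptions made purely for illustration.

```python
import random
from statistics import fmean

# Hypothetical coefficient values, chosen only for illustration.
BETA0, BETA1 = -0.5, 1.0     # treatment: D = 1(BETA0 + BETA1*Z >= V)
GAMMA0, GAMMA1 = -0.2, 0.2   # response:  R = 1(GAMMA0 + GAMMA1*D >= W)
DELTA = 1.0                  # W = DELTA*V + eps
PI = 0.3                     # homogeneous shift in U for compliers (special case)


def complier_gap(u_related_to_v, n=200_000, seed=0):
    """Mean of U among Z=1 compliers: respondents minus nonrespondents."""
    rng = random.Random(seed)
    responders, nonresponders = [], []
    for _ in range(n):
        z = rng.random() < 0.5
        v = rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 0.5)
        is_complier = BETA0 < v <= BETA0 + BETA1
        if not (z and is_complier):
            continue
        w = DELTA * v + eps
        # Compliers with Z=1 are treated, so D=1 in the response equation.
        r = GAMMA0 + GAMMA1 >= w
        if u_related_to_v:
            u = v + noise                 # U associated with V
        else:
            u = PI * is_complier + noise  # U depends on V only via the type
        (responders if r else nonresponders).append(u)
    return fmean(responders) - fmean(nonresponders)


if __name__ == "__main__":
    print("gap when U is related to V:", complier_gap(True))
    print("gap when U depends only on the type:", complier_gap(False))
```

Under these assumed values, the first gap is clearly negative, because respondents have lower V even within the complier range, so the equality required by Assumption 1 fails. The second gap is close to zero, in line with the special case in which U depends on V only through the compliance type.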
References
Angrist, J., G. Imbens, and D. Rubin (1996): “Identification of Causal Effects using Instrumental Variables,” Journal of the American Statistical Association, 91, 444–472 (with discussion).
Barnard, J., C. Frangakis, J. Hill, and D. Rubin (2003): “A Principal Stratification Approach to Broken Randomized Experiments: A Case Study of School Choice Vouchers in New York City,” Journal of the American Statistical Association, 98, 299–323.
Frangakis, C., and D. Rubin (1999): “Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes,” Biometrika, 86, 365–379.
Fricke, H., M. Frölich, M. Huber, and M. Lechner (2020): “Endogeneity and Non-Response Bias in Treatment Evaluation - Nonparametric Identification of Causal Effects by Instruments,” forthcoming in the Journal of Applied Econometrics.
Frölich, M., and M. Huber (2014): “Treatment evaluation with multiple outcome periods under endogeneity and attrition,” Journal of the American Statistical Association, 109, 1697–1711.
Mealli, F., G. Imbens, S. Ferro, and A. Biggeri (2004): “Analyzing a randomized trial on breast self-examination with noncompliance and missing outcomes,” Biostatistics, 5, 207–222.
Rubin, D. (1976): “Inference and Missing Data,” Biometrika, 63, 581–592.
Schochet, P., J. Burghardt, and S. Glazerman (2001): “National Job Corps Study: The Impacts of Job Corps on Participants’ Employment and Related Outcomes,” Report (Washington, DC: Mathematica Policy Research, Inc.).