Debiased/Double Machine Learning for Instrumental Variable Quantile Regressions
Jau-er Chen (a, b) and Jia-Jyun Tien (c)
(a) Institute for International Strategy, Tokyo International University. (b) Center for Research in Econometric Theory and Applications, National Taiwan University. (c) Department of Economics, National Taiwan University.
Abstract
The aim of this paper is to investigate estimation and inference on a low-dimensional causal parameter in the presence of high-dimensional controls in an instrumental variable quantile regression. The estimation and inference are based on Neyman-type orthogonal moment conditions, which are relatively insensitive to the estimation of the nuisance parameters. The Monte Carlo experiments show that the econometric procedure performs well. We also apply the procedure to reinvestigate two empirical studies: the quantile treatment effect of 401(k) participation on accumulated wealth, and the distributional effect of job-training program participation on trainee earnings.
Keywords: instrumental variable, quantile regression, treatment effect, LASSO, double machine learning.
JEL Classification: C21; C26.
Correspondence: Jau-er Chen. E-mail: [email protected] Address: 1-13-1 Matobakita Kawagoe, Saitama 350-1197, Japan. This version: September 2019. We are grateful to Masayuki Hirukawa, Tsung-Chih Lai, and Hsin-Yi Lin for discussions and comments. This paper has benefited from presentations at the 2nd International Conference on Econometrics and Statistics (EcoSta 2018), and the Ryukoku University. The authors declare no conflict of interest. The usual disclaimer applies.
Funding: This research was partly funded by the personal research fund from Tokyo International University, and financially supported by the Center for Research in Econometric Theory and Applications (Grant no. 107L900203) from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan.

Introduction
Model selection and variable selection are widely discussed in the area of prediction. Much less attention, however, has been paid to adapting prediction methods to causal machine learning in economics, cf. Athey (2017) and Athey (2018). As one of the pioneering papers, within the linear framework of instrumental variable estimation, Belloni et al. (2014) proposed a double-selection procedure to correct for omitted variable bias in a high-dimensional framework. Building a general framework that encompasses the results of Belloni et al. (2014), Chernozhukov et al. (2015) and Chernozhukov et al. (2018a) proposed a unified procedure, double/debiased machine learning (DML), which remains valid for nonlinear or semi-nonparametric models. The aim of this paper is to investigate estimation and inference on a low-dimensional causal parameter in the presence of high-dimensional controls in an instrumental variable quantile regression. In particular, our procedure follows the idea outlined by Chernozhukov et al. (2018b). To the best of our knowledge, the present study is the first to investigate the Monte Carlo performance and empirical applications of the double machine learning procedure within the framework of instrumental variable quantile regressions. The Monte Carlo experiments show that our econometric procedure performs well.

Causal machine learning has been actively studied in economics in recent years, largely along two approaches: double machine learning, cf. Chernozhukov et al. (2018a), and generalized random forests, cf. Athey, Tibshirani and Wager (2019). Chen and Hsiang (2019) investigate the generalized random forests model using instrumental variable quantile regression. In contrast to the DML for instrumental variable quantile regressions, their econometric procedure yields a measure of variable importance in terms of heterogeneity among control variables. Although related to our paper, Chen and Hsiang (2019) do not consider the setting of high-dimensional controls.

We apply the proposed procedure to empirically investigate the causal quantile effects of 401(k) participation on net financial assets. Our empirical results indicate that 401(k) participants with low savings propensity are more strongly associated with the nonlinear income effect, which complements the findings of Chernozhukov et al. (2018a) and Chiou et al. (2018). A second empirical example, on job training program participation, is investigated as well.

The rest of the paper is organized as follows. The model specification and estimation procedure are introduced in Section 2. Section 3 presents Monte Carlo experiments. Section 4 presents two empirical applications. Section 5 concludes the paper.

The Model
We briefly review the conventional instrumental variable quantile regression (IVQR), and then the IVQR within the framework of high-dimensional controls. This section also introduces our DML procedure for the IVQR, which builds on a tentative procedure suggested by Chernozhukov et al. (2018b).
The following conditional moment restriction yields an IVQR estimator:

$$P[Y \le q(\tau, D, X) \mid X, Z] = \tau, \tag{1}$$

where $q(\cdot)$ is the structural quantile function, $\tau$ stands for the quantile index, and $D$, $X$ and $Z$ are, respectively, the target variable, control variables and instruments. Condition (1) and a linear structural quantile specification lead to the following unconditional moment restriction:

$$E\big[\big(\tau - 1(Y - D'\alpha - X'\beta \le 0)\big)\Psi\big] = 0, \tag{2}$$

where $\Psi := \Psi(X, Z)$ is a vector of functions of the instruments and control variables. The parameters depend on the quantile of interest, but we suppress the $\tau$ associated with $\alpha$ and $\beta$ for simplicity of presentation. Equation (2) leads to a particular moment condition for partialling out:

$$g_\tau(V, \alpha; \beta, \delta) = \big(\tau - 1(Y \le D'\alpha + X'\beta)\big)\,\Psi(\alpha, \delta(\alpha)) \tag{3}$$

with "instrument"

$$\Psi(\alpha, \delta(\alpha)) := Z - \delta(\alpha) X, \tag{4}$$

$$\delta(\alpha) = M(\alpha) J^{-1}(\alpha),$$

where $\delta$ is a matrix parameter,

$$M(\alpha) = E[Z X' f_\epsilon(0 \mid X, Z)], \qquad J(\alpha) = E[X X' f_\epsilon(0 \mid X, Z)],$$

and $f_\epsilon(0 \mid X, Z)$ is the conditional density of $\epsilon = Y - D'\alpha - X'\beta(\alpha)$, with $\beta(\alpha)$ defined by

$$E\big[\big(\tau - 1(Y \le D'\alpha + X'\beta(\alpha))\big) X\big] = 0. \tag{5}$$

We first construct the grid search interval for $\alpha$ and, for each $\alpha$ in the interval, profile out the coefficients on the exogenous variables by equation (5). That is,

$$\hat\beta(a) = \arg\min_{b \in \mathcal{B}} \frac{1}{N} \sum_{i=1}^{N} \rho_\tau(Y_i - D_i' a - X_i' b).$$

We build the sample counterpart of the population moment condition based on equations (2)–(5). That is,

$$\hat g_N(a) = \frac{1}{N} \sum_{i=1}^{N} g_\tau\big(V_i, a; \hat\beta(a), \hat\delta(a)\big), \tag{6}$$

where $\hat\delta(a) = \hat M(a) \hat J^{-1}(a)$ for

$$\hat M(a) = \frac{1}{N h_N} \sum_{i=1}^{N} Z_i X_i' K_{h_N}\big(Y_i - D_i' a - X_i' \hat\beta(a)\big),$$

$$\hat J(a) = \frac{1}{N h_N} \sum_{i=1}^{N} X_i X_i' K_{h_N}\big(Y_i - D_i' a - X_i' \hat\beta(a)\big),$$

where $K_{h_N}$ is a kernel function with bandwidth $h_N$. We can thus solve for the parameters by optimizing the GMM criterion function. Specifically,

$$\hat\alpha(\tau) = \arg\min_{a \in \mathcal{A}} N \hat g_N(a)' \hat\Sigma(a, a)^{-1} \hat g_N(a), \tag{7}$$

$$\hat\Sigma(a_1, a_2) = \frac{1}{N} \sum_{i=1}^{N} g_\tau\big(V_i, a_1; \hat\beta(a_1), \hat\delta(a_1)\big)\, g_\tau\big(V_i, a_2; \hat\beta(a_2), \hat\delta(a_2)\big)',$$

where $\hat\Sigma(a_1, a_2)$ is a weighting matrix used in the GMM estimation. Notice that the estimator $\hat\alpha$ based on the inverse quantile regression (i.e. IVQR) is first-order equivalent to the estimator defined by the GMM.

We modify the procedure introduced in Subsection 2.1 in order to deal with a dataset of high-dimensional control variables. We construct the grid search interval for $\alpha$ and profile out the coefficients on the exogenous variables using the $L_1$-norm penalized quantile regression estimator:

$$\hat\beta(a) = \arg\min_{b \in \mathcal{B}} \frac{1}{N} \sum_{i=1}^{N} \rho_\tau(Y_i - D_i' a - X_i' b) + \lambda \sum_{j=1}^{\dim(b)} |b_j|. \tag{8}$$

In addition, we estimate

$$\hat M(a) = \frac{1}{N h_N} \sum_{i=1}^{N} Z_i X_i' K_{h_N}\big(Y_i - D_i' a - X_i' \hat\beta(a)\big),$$

$$\hat J(a) = \frac{1}{N h_N} \sum_{i=1}^{N} X_i X_i' K_{h_N}\big(Y_i - D_i' a - X_i' \hat\beta(a)\big).$$

We also perform dimension reduction on $J$ because of the large dimension of $X$. In particular, we implement the following regularization:

$$\hat\delta_j(a) = \arg\min_{\delta} \frac{1}{2} \delta' \hat J(a) \delta - \hat M_j(a) \delta + \vartheta \|\delta\|_1.$$

The regularization above amounts to a weighted LASSO of each instrument on the control variables, and consequently the $L_1$-norm optimization obeys the Karush-Kuhn-Tucker condition

$$\big\| \hat\delta_j(a)' \hat J(a) - \hat M_j(a) \big\|_\infty \le \vartheta, \quad \forall j. \tag{9}$$

After implementing the double machine learning procedure outlined above for the IVQR, we can solve for the low-dimensional causal parameter $\alpha$ by optimizing the GMM criterion defined as follows. The sample counterpart of the moment condition is

$$\hat g_N(a) = \frac{1}{N} \sum_{i=1}^{N} \big(\tau - 1\big(Y_i - D_i' a - X_i' \hat\beta(a) \le 0\big)\big)\, \Psi_i(a, \hat\delta(a)), \qquad \Psi_i(a, \hat\delta(a)) := Z_i - \hat\delta(a) X_i. \tag{10}$$

Accordingly,

$$\hat\alpha = \arg\min_{a \in \mathcal{A}} N \hat g_N(a)' \hat\Sigma(a, a)^{-1} \hat g_N(a).$$

More importantly, the aforementioned double machine learning procedure (DML-IVQR hereafter) satisfies the Neyman orthogonality conditions, cf. Chernozhukov et al. (2018b).
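To make the mechanics of the partialled-out moment and the GMM criterion concrete, the following is a minimal sketch for the simplest case of one instrument and one control, using only the Python standard library. It is an illustration under stated assumptions and not the authors' implementation: the Gaussian kernel with $K_h(u) = K(u/h)$, the bandwidth, the toy data design, the intercept-only control, and the helper names (`delta_hat`, `gmm_criterion`, `beta_of_a`) are all hypothetical choices made here. In the high-dimensional case the profiling step would be replaced by the penalized quantile regression and the ratio in `delta_hat` by the regularized estimator.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def sample_quantile(values, tau):
    """Simple order-statistic estimate of the tau-quantile."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, math.ceil(tau * len(ordered)) - 1))
    return ordered[index]


def delta_hat(resid, x, z, h):
    """Scalar analogue of M_hat(a) J_hat(a)^{-1}, with K_h(u) = K(u / h)."""
    n = len(resid)
    w = [gaussian_kernel(e / h) for e in resid]
    m_hat = sum(zi * xi * wi for zi, xi, wi in zip(z, x, w)) / (n * h)
    j_hat = sum(xi * xi * wi for xi, wi in zip(x, w)) / (n * h)
    return m_hat / j_hat


def gmm_criterion(a, tau, y, d, x, z, beta_of_a, h):
    """N * g_N(a)^2 / Sigma_hat(a, a) for one instrument and one control."""
    b = beta_of_a(a)
    resid = [yi - di * a - xi * b for yi, di, xi in zip(y, d, x)]
    delta = delta_hat(resid, x, z, h)
    # Moment contributions (tau - 1(resid <= 0)) * (Z_i - delta * X_i).
    g = [(tau - (1.0 if e <= 0.0 else 0.0)) * (zi - delta * xi)
         for e, zi, xi in zip(resid, z, x)]
    n = len(g)
    g_bar = sum(g) / n
    sigma = sum(gi * gi for gi in g) / n
    return n * g_bar * g_bar / sigma


# Toy data (hypothetical design, not the paper's): the true alpha equals 1.
random.seed(0)
n, tau, h = 1000, 0.5, 0.5
z = [random.gauss(0.0, 1.0) for _ in range(n)]
u = [random.gauss(0.0, 1.0) for _ in range(n)]
d = [zi + 0.5 * ui for zi, ui in zip(z, u)]   # endogenous through u
x = [1.0] * n                                  # intercept as the only control
y = [1.0 + di + ui for di, ui in zip(d, u)]


def beta_of_a(a):
    # With x = 1, profiling out beta reduces to the tau-quantile of y - d * a.
    return sample_quantile([yi - di * a for yi, di in zip(y, d)], tau)


grid = [0.5 + 0.05 * j for j in range(21)]     # candidate values a in [0.5, 1.5]
w_values = [gmm_criterion(a, tau, y, d, x, z, beta_of_a, h) for a in grid]
alpha_hat = grid[w_values.index(min(w_values))]
print(alpha_hat)
```

The grid minimizer `alpha_hat` plays the role of the GMM estimator, and the list `w_values` is the criterion profile over the grid.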
Under the regularity conditions listed in Chernozhukov and Hansen (2008), the asymptotic normality of the GMM estimator with a nonsmooth objective function is guaranteed. At the true parameter value we have

$$\sqrt{N}\, \hat g_N(a) \xrightarrow{d} N(0, \Sigma(a, a)). \tag{11}$$

Consequently,

$$N \hat g_N(a)' \hat\Sigma(a, a)^{-1} \hat g_N(a) \xrightarrow{d} \chi^2_{\dim(Z)}.$$

We define

$$W_N(a) \equiv N \hat g_N(a)' \hat\Sigma(a, a)^{-1} \hat g_N(a).$$

It then follows that a valid $100(1 - p)$ percent confidence region for the true parameter, $\alpha_0$, may be constructed as the set

$$CR := \{\alpha \in \mathcal{A} : W_N(\alpha) \le c_{1-p}\},$$

where $c_{1-p}$ is the critical value such that $P[\chi^2_{\dim(Z)} > c_{1-p}] = p$, and $\mathcal{A}$ can be numerically approximated by the grid $\{\alpha_j, j = 1, \ldots, J\}$.

The suggested double machine learning algorithm involves solving an L1-norm optimization problem, which is a nontrivial task. Researchers often represent the L1-norm penalized quantile objective function as a linear programming problem. Specifically, the problem

$$\min_{\theta_0 \in \mathbb{R},\, \theta \in \mathbb{R}^p} \sum_{i=1}^{N} \rho_\tau(Y_i - \theta_0 - W_i' \theta) + \lambda \|\theta\|_1 \tag{12}$$

can be rewritten as

$$\min_{\theta_0 \in \mathbb{R},\, \theta \in \mathbb{R}^p,\, \xi \in \mathbb{R}^N} \sum_{i=1}^{N} \{\tau (\xi_i)^+ + (1 - \tau)(\xi_i)^-\} + \lambda \|\theta\|_1 \quad \text{subject to} \quad \theta_0 + W_i' \theta + \xi_i = Y_i, \quad i = 1, \ldots, N.$$

With

$$z := [\theta_0^+,\ \theta_0^-,\ (\theta^+)',\ (\theta^-)',\ (\xi^+)',\ (\xi^-)']',$$
$$c := [0,\ 0,\ \mathbf{0}',\ \mathbf{0}',\ \tau \mathbf{1}',\ (1 - \tau)\mathbf{1}']',$$
$$a := [0,\ 0,\ \mathbf{1}',\ \mathbf{1}',\ \mathbf{0}',\ \mathbf{0}']',$$
$$A := [\mathbf{1},\ -\mathbf{1},\ \mathbf{W},\ -\mathbf{W},\ I,\ -I], \qquad b := Y,$$

where $\theta = [\alpha', \beta']'$, $W = [D', X']'$ and $\mathbf{W}$ is the matrix whose rows are $W_i'$, this is the linear program that minimizes $(c + \lambda a)' z$ subject to $A z = b$ and $z \ge 0$.

However, it turns out that the computation is challenging and time-consuming. For instance, it often runs into singular designs in the high-dimensional framework. As an alternative, we utilize the algorithm developed by Yi and Huang (2017), who use the Huber loss function to approximate the quantile loss function.
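As a small numerical illustration of this smoothing idea, the sketch below compares the check loss with a Huber-smoothed surrogate. It assumes the standard Huber form, quadratic on $[-\gamma, \gamma]$ and linear outside, and writes the Huber threshold as `gamma` to keep it distinct from the quantile index `tau`; the function names and the chosen values of `tau` and `gamma` are illustrative and not taken from the paper or from any package.

```python
def check_loss(t, tau):
    """Quantile check loss rho_tau(t) = t * (tau - 1(t < 0))."""
    return t * (tau - (1.0 if t < 0.0 else 0.0))


def huber(t, gamma):
    """Huber loss: quadratic on [-gamma, gamma], linear outside."""
    if abs(t) <= gamma:
        return t * t / (2.0 * gamma)
    return abs(t) - gamma / 2.0


def smoothed_check_loss(t, tau, gamma):
    """Differentiable surrogate 0.5 * (h_gamma(t) + (2 * tau - 1) * t)."""
    return 0.5 * (huber(t, gamma) + (2.0 * tau - 1.0) * t)


tau, gamma = 0.25, 0.01
points = [-2.0 + 0.05 * k for k in range(81)]        # t in [-2, 2]
gaps = [check_loss(t, tau) - smoothed_check_loss(t, tau, gamma) for t in points]
max_gap = max(gaps)
# The surrogate never exceeds the check loss and differs by at most gamma / 4.
print(max_gap <= gamma / 4.0 + 1e-12)
```

Because the gap is bounded by `gamma / 4` uniformly in `t`, shrinking `gamma` makes the surrogate arbitrarily close to the check loss while keeping it differentiable at zero.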
In equation (12), $\rho_\tau$ is not differentiable, and

$$\rho_\tau(t) = (1 - \tau) t_- + \tau t_+ = \tfrac{1}{2}\{|t| + (2\tau - 1) t\}.$$

Since $h_\gamma(t) \to |t|$ as $\gamma \to 0^+$, where $h_\gamma(t)$ is the Huber loss function of $t$ with threshold $\gamma$ defined in Yi and Huang (2017), we have $\rho_\tau(t) \approx \tfrac{1}{2}\{h_\gamma(t) + (2\tau - 1) t\}$ for small $\gamma$. Therefore equation (12) can be approximated by

$$\min_{\theta_0 \in \mathbb{R},\, \theta \in \mathbb{R}^p} \sum_{i=1}^{N} \tfrac{1}{2}\big\{h_\gamma(Y_i - \theta_0 - W_i' \theta) + (2\tau - 1)(Y_i - \theta_0 - W_i' \theta)\big\} + \lambda \|\theta\|_1. \tag{13}$$

The optimization above is the Huber approximation. It is computationally more tractable because the loss function is differentiable.

We evaluate the finite-sample performance, in terms of RMSE and MAD, of the double machine learning for the IVQR. The following data generating process is modified from the one considered in Chen and Lee (2018):

$$(u_i, \epsilon_i)' \sim N\big(0, [\,\ldots\,]\big), \qquad (x_i, z_i, v_i)' \sim N(0, I),$$
$$Z_i = z_i + v_i + x_i, \qquad D_i = \Phi(z_i + \epsilon_i), \qquad X_i = \Phi(x_i),$$
$$Y_i = 1 + D_i + X_i^{T} + D_i \cdot u_i,$$

where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal random variable. Consequently, $\alpha(\tau) = 1 + F_\epsilon^{-1}(\tau)$, where $F_\epsilon(\cdot)$ is the cumulative distribution function of $\epsilon$.

3.1 Partialing out and non-partialing out Z on X

We focus on comparing the MAD and RMSE resulting from different models under the exact specification (10 control variables). po-GMM stands for partialing out Z on X. GMM stands for no partialing out of Z on X. Table 1 shows that partialing out Z on X leads to an efficiency gain across quantiles, especially when the sample size is moderate.

Table 1: Partialing out and non-partialing out Z on X

                      n = 500             n = 1000
                      RMSE     MAD        RMSE     MAD
α_τ (po-GMM)          0.1888   0.1510     0.1219   0.0950
α_τ (GMM)             0.4963   0.2559     0.1631   0.1138

α_τ (po-GMM)          0.1210   0.0966     0.0812   0.0654
α_τ (GMM)             0.1782   0.1179     0.0963   0.0754

α_τ (po-GMM)          0.0989   0.0716     0.0689   0.0436
α_τ (GMM)             0.1436   0.1016     0.0801   0.0542

α_τ (po-GMM)          0.1374   0.1066     0.0828   0.0676
α_τ (GMM)             0.2403   0.1710     0.1146   0.0848

α_τ (po-GMM)          0.2437   0.1839     0.1391   0.1067
α_τ (GMM)             0.8483   0.5340     0.3481   0.1967

The data generating process considers ten control variables. po-GMM stands for partialing out Z on X. GMM stands for no partialing out of Z on X.

We now evaluate the finite-sample performance of the IVQR with high-dimensional controls. The data generating process involves 100 control variables with an approximate sparsity structure. In particular, the exact model (true model) depends only on 10 relevant control variables out of the 100 controls. GMM uses 100 control variables without regularization. Table 2 shows that the RMSE and MAD of the DML-IVQR are close to those from the exact model. In addition, Figure 1 plots the distributions of the IVQR estimator with and without double machine learning. The DML-IVQR stands for the double machine learning for the IVQR with high-dimensional controls. The histograms indicate that the DML-IVQR estimator is more efficient and less biased than the IVQR using many control variables. Since a weak-identification robust inference procedure results naturally from the IVQR, cf. Chernozhukov and Hansen (2008), we construct the robust confidence regions for the GMM and the DML-IVQR estimators. Figure 2 indicates that, across quantiles, the weak-identification (or weak-instrument) robust confidence region based on the DML-IVQR is relatively sharp. The Monte Carlo experiments show that the DML-IVQR procedure performs well.

Table 2

                      n = 500             n = 1000
                      RMSE     MAD        RMSE     MAD
α_τ (GMM)             0.7648   0.6645     0.3917   0.3442
α_τ (exact-GMM)       0.1888   0.1510     0.1219   0.0950
α_τ (DML-IVQR)        0.3112   0.2389     0.1376   0.1085

α_τ (GMM)             0.2712   0.2212     0.1646   0.1361
α_τ (exact-GMM)       0.1210   0.0966     0.0812   0.0654
α_τ (DML-IVQR)        0.1562   0.1254     0.0991   0.0804

α_τ (GMM)             0.1627   0.1234     0.1038   0.0754
α_τ (exact-GMM)       0.0989   0.0716     0.0689   0.0436
α_τ (DML-IVQR)        0.1168   0.0846     0.0775   0.0510

α_τ (GMM)             0.3421   0.2806     0.1747   0.1452
α_τ (exact-GMM)       0.1374   0.1066     0.0828   0.0676
α_τ (DML-IVQR)        0.1495   0.1167     0.0930   0.0741

α_τ (GMM)             0.9449   0.8032     0.4320   0.3681
α_τ (exact-GMM)       0.2437   0.1839     0.1391   0.1067
α_τ (DML-IVQR)        0.3567   0.2608     0.1649   0.1231

Note on the figures: DML-IVQR results are plotted in green. Results from the GMM with many controls are in orange.
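The robust confidence regions discussed in this section are obtained by inverting the statistic $W_N(a)$ over a grid of candidate values. The sketch below illustrates that inversion for a single instrument, so that the reference distribution is chi-square with one degree of freedom and its critical value can be computed from the standard normal quantile. The function `toy_w_stat` is a purely hypothetical stand-in for $W_N(a)$, and the grid and level are illustrative choices, not the paper's settings.

```python
from statistics import NormalDist


def chi2_1_critical(p):
    """Critical value c with P[chi2_1 > c] = p (one degree of freedom)."""
    # A chi-square(1) variable is the square of a standard normal variable.
    return NormalDist().inv_cdf(1.0 - p / 2.0) ** 2


def confidence_region(grid, w_stat, p=0.05):
    """Grid points a at which W_N(a) does not exceed the critical value."""
    c = chi2_1_critical(p)
    return [a for a in grid if w_stat(a) <= c]


def toy_w_stat(a):
    # Hypothetical stand-in for W_N(a): quadratic around 1 with std. error 0.1.
    return ((a - 1.0) / 0.1) ** 2


grid = [0.5 + 0.01 * j for j in range(101)]          # a in [0.5, 1.5]
region = confidence_region(grid, toy_w_stat, p=0.05)
print(min(region), max(region))
```

In practice `toy_w_stat` would be replaced by the GMM criterion evaluated at each grid point. With more than one instrument, the critical value would come from a chi-square distribution with $\dim(Z)$ degrees of freedom, which is not available in the standard library.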
We reinvestigate the impact of 401(k) participation on accumulated wealth. Total wealth or net financial assets is the outcome variable Y. The treatment variable D is a binary variable standing for participation in the 401(k) plan. The instrument Z is an indicator for being eligible to enroll in the 401(k) plan. The vector of covariates X consists of income, age, family size, a married indicator, an individual retirement account (IRA) indicator, a defined benefit status indicator, a home ownership indicator and the different education-year indicator variables. The data consist of 9,915 observations.

Following the regression specification in Chernozhukov and Hansen (2004), Table 3 presents quantile treatment effects obtained from the estimation procedures defined in the previous section, namely IVQR, po-GMM and GMM. The corresponding results are similar. As to the high-dimensional analysis, we create 119 technical control variables including those constructed by the polynomial bases, interaction terms, and cubic splines (thresholds). To ensure each basis has equal length, we utilize the minimax normalization for all technical control variables. We then use the plug-in method to determine the value of the penalty when doing the LASSO under the moment condition, and tune the penalty in the quantile L1-norm objective function based on the Huber approximation by 5-fold cross validation. The DML-IVQR also implements feature normalization of the outcome variable for the sake of computational efficiency. To make the estimated treatment effects across different estimation procedures roughly comparable, Table 4 shows the effect obtained through the DML-IVQR multiplied by the standard deviation of the outcome variable. Weak identification/instrument robust inference on quantile treatment effects is depicted in Figures 4 and 5. The robust confidence intervals widen at the upper quantiles, where observations are sparser; nevertheless, the estimated quantile treatment effects are significantly different from zero. We could use the result from the DML-IVQR as a data-driven robustness check on those summarized in Table 3.

Tables 5 and 6 present the selected important variables across different quantiles. The approximate sparsity is asymmetric across the conditional distribution in the sense that the number of selected variables decreases as the quantile index τ increases. This pattern may, however, also reflect the relatively small number of observations at the upper quantiles. Our empirical results also indicate that 401(k) participants with low savings propensity are more strongly associated with the nonlinear income effect than those with high savings propensity, which complements the results in Chernozhukov et al. (2018a) and Chiou et al. (2018). In this particular example, τ captures the rank variable which governs the unobservable heterogeneity: savings propensity. Small values of τ represent participants with low savings propensity. The nonlinear income effects, across quantiles in (0, 0.5], are picked up by selected income threshold variables of the form max(0, inc − …), among others. Technical variables in terms of age, education, family size, and income are more frequently selected. In addition, these four variables are also identified as important variables in the context of the generalized random forests, cf. Chen and Hsiang (2019).

[Tables 3 and 4: table bodies not recoverable.] Note: We create 119 technical control variables including those constructed by the polynomial bases, interaction terms, and cubic splines (thresholds). The DML-IVQR estimates the distributional effect, which shows an asymmetric pattern similar to the one identified in Chernozhukov and Hansen (2004).
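The construction of technical controls described in this section (polynomial terms, interactions, threshold terms, then min-max normalization) can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' code: the helper names, the use of squares as the only polynomial terms, the linear threshold form max(0, x − knot), and the knot values and toy data are all hypothetical.

```python
def minmax(col):
    """Rescale a column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(col), max(col)
    if hi == lo:
        return [0.0] * len(col)
    return [(v - lo) / (hi - lo) for v in col]


def hinge(col, knot):
    """Threshold term max(0, x - knot)."""
    return [max(0.0, v - knot) for v in col]


def technical_controls(data, knots):
    """Build raw, squared, threshold and pairwise interaction features."""
    feats = {}
    names = sorted(data)
    for name in names:
        col = data[name]
        feats[name] = list(col)
        feats[name + "^2"] = [v * v for v in col]
        for k in knots.get(name, []):
            feats["max(0," + name + "-" + str(k) + ")"] = hinge(col, k)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            feats[first + "*" + second] = [
                u * v for u, v in zip(data[first], data[second])
            ]
    # Normalize every generated feature to a common [0, 1] scale.
    return {key: minmax(col) for key, col in feats.items()}


# Toy inputs (hypothetical values and knot).
data = {"age": [25.0, 40.0, 60.0], "inc": [10.0, 30.0, 80.0]}
knots = {"inc": [30.0]}
features = technical_controls(data, knots)
print(sorted(features))
```

Normalizing after the expansion puts every generated column on the same scale, which matters because a single L1 penalty level is applied to all coefficients.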
Figure 4: Weak Instrument Robust Inference, P401(K) on TW with hqreg L1-norm

Figure 5: Weak Instrument Robust Inference, P401(K) on NFTA with hqreg L1-norm

Table 5: Total Wealth

Quantile   Selected Variables
0.15       ira, educ, educ, age*ira, age*inc, fsize*educ, fsize*hmort, ira*educ, ira*inc, hval*inc, marr, male, i…, a…, twoearn, marr*fsize, pira*inc, max(0, age − …), max(0, educ − …), max(0, educ − …), max(0, age − …)
…          ira, age*fsize, age*ira, age*inc, fsize*educ, ira*educ, ira*inc, hval*inc, marr, male, i…, twoearn, marr*fsize, pira*inc, twoearn*fsize, max(0, inc − …)
…          inc, age*fsize, age*ira, age*inc, fsize*educ, ira*educ, ira*hval, ira*inc, hval*inc, male, a…, a…, pira*inc, twoearn*age, twoearn*fsize, twoearn*hmort, twoearn*educ, max(0, educ − …)
…          inc, ira, age*ira, age*hval, age*inc, educ*inc, hval*inc, pira*inc, pira*age
…          inc, ira, age*hval, age*inc, ira*educ, educ*inc, hval*inc, pira*inc, pira*hval

Selected variables across τ, tuned via cross validation. ira: individual retirement account (IRA), inc: income, fsize: family size, hequity: home equity, hval: home value, educ: education years, marr: married, smcol: college, db: defined benefit pension, hown: home owner, hmort: home mortgage, a1: less than 30 years old, a2: 30-35 years old, a3: 36-44 years old, a4: 45-54 years old, a5: 55 years old or older, i1: < $10K, i2: $10-20K, i3: $20-30K, i4: $30-40K, i5: $40-50K, i6: $50-75K, i7: $75K+.

Table 6: Net Financial Assets

Quantile   Selected Variables
…          ira, educ, fsize, hval, educ, age*educ, age*hmort, age*inc, fsize*hmort, fsize*inc, ira*educ, ira*inc, hval*inc, marr, db, male, i…, i…, i…, i…, twoearn, marr*fsize, pira*inc, pira*educ, twoearn*inc, twoearn*ira, max(0, age − …), max(0, age − …), max(0, age − …), max(0, inc − …), max(0, inc − …), max(0, educ − …)
…          ira, hmort, age*hmort, age*inc, fsize*hmort, fsize*inc, ira*educ, ira*inc, hval*inc, db, smcol, male, i…, i…, i…, i…, a…, a…, twoearn, pira*inc, pira*age, pira*fsize, twoearn*inc, twoearn*ira, twoearn*hmort, max(0, age − …), max(0, age − …), max(0, inc − …), max(0, inc − …), max(0, inc − …), max(0, educ − …)
…          age, ira, age*fsize, age*ira, age*inc, fsize*educ, fsize*hmort, ira*educ, ira*inc, hval*inc, hown, male, i…, i…, a…, a…, a…, pira*inc, pira*fsize, twoearn*inc, twoearn*fsize, twoearn*hmort, twoearn*educ, max(0, inc − …)
…          ira, age*inc, hval*inc, pira*inc, pira*age
…          ira, age*inc, educ*inc, hval*inc, pira*inc

Selected variables across τ, tuned via cross validation. Variable definitions are as in Table 5.

4.2 Effects of subsidized training on male and female trainee earnings

Abadie, Angrist and Imbens (2002) use the Job Training Partnership Act (JTPA) data to estimate the quantile treatment effect of job training on the earnings distribution. The data are from Title II of the JTPA in the early 1990s and consist of 11,204 observations, of which 5,102 are male and 6,102 are female. In estimation, they take thirty-month earnings as the outcome variable, enrollment for JTPA service as the treatment variable, and a randomized offer of JTPA enrollment as the instrumental variable. The control variables include binary variables for black and Hispanic applicants, high-school graduates, married applicants, five age groups, AFDC receipt (for women), whether the applicant worked at least 12 weeks in the 12 months preceding random assignment, dummies for the original recommended service strategy (classroom, OJT/JSA, other) and a dummy for whether earnings data are from the second follow-up survey.

Table 7 presents quantile treatment effects for the male and female groups respectively, obtained from several estimation procedures including IVQR, po-GMM, and GMM. As to the high-dimensional analysis, we create 85 technical control variables including those constructed by the polynomial bases, interaction terms, and cubic splines (thresholds). Table 8 shows the quantile treatment effect obtained through the DML-IVQR. Table 7 together with the existing findings in the literature suggests that, for females only, the job training program generates a significantly positive treatment effect on earnings at the 0.5 and 0.75 quantiles. The DML-IVQR yields similar results, which can be confirmed by the identification-robust confidence intervals depicted in Figures 6 and 7. The selected variables are collected in the online appendix. Thus, the existing empirical conclusions in the literature are reinforced by the IVQR using the double machine learning procedure.

Table 7: Estimations with the Specification of Abadie et al. (2002)

Quantiles           0.1     0.15    0.25    0.5     0.75    0.85    0.9
Male (IVQR)         0       -200    400     500     3300    3100    1700
Male (po-GMM)       0       -100    500     1900    5000    6800    7800
Male (GMM)          0       -100    500     1600    5100    5800    7200
Female (IVQR)       0       0       400     1600    2500    1900    1400
Female (po-GMM)     0       200     700     3300    5200    6500    6900
Female (GMM)        100     200     700     3200    5200    6500    6900

Online appendix. Selected variables for the male group: https://github.com/FieldTien/DML-QR/blob/master/Empirical_work/hqreg_data/selected_male.csv ; selected variables for the female group: https://github.com/FieldTien/DML-QR/blob/master/Empirical_work/hqreg_data/selected_female.csv

[Table 8: table body not recoverable.] Note: We create 85 technical control variables including those constructed from the polynomial bases, interaction terms, and cubic splines (thresholds).
Figure 6: Weak Instrument Robust Inference. The male group.

Figure 7: Weak Instrument Robust Inference. The female group.
Conclusion
The performance of a debiased/double machine learning algorithm within the framework of high-dimensional IVQR is investigated. The simulation results indicate that the proposed procedure is more efficient than the conventional estimator with many controls. Furthermore, we evaluate the corresponding weak-identification robust confidence interval of the low-dimensional causal parameter. Given a large number of technical controls, we reinvestigate quantile treatment effects of 401(k) participation on accumulated wealth and then highlight the nonlinear income effects driven by the savings propensity.

References

Abadie, A., Angrist, J., and G. Imbens. 2002. “Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings,” Econometrica, 70(1): 91–117.

Athey, S. 2017. “Beyond Prediction: Using Big Data for Policy Problems,” Science, 355: 483–485.

Athey, S. 2018. “The Impact of Machine Learning on Economics,” working paper, Stanford GSB.

Athey, S., Tibshirani, J., and S. Wager. 2019. “Generalized Random Forests,” The Annals of Statistics, 47(2): 1148–1178.

Belloni, A., Chernozhukov, V., and C. Hansen. 2014. “High-Dimensional Methods and Inference on Structural and Treatment Effects,” Journal of Economic Perspectives, 28: 29–50.

Chen, J.-E. and C.-W. Hsiang. 2019. “Causal Random Forests Model using Instrumental Variable Quantile Regression,” working paper, Center for Research in Econometric Theory and Applications, National Taiwan University.

Chen, L.-Y., and S. Lee. 2018. “Exact Computation of GMM Estimators for Instrumental Variable Quantile Regression Models,” Journal of Applied Econometrics, forthcoming.

Chernozhukov, V., and C. Hansen. 2004. “The Impact of 401(k) Participation on the Wealth Distribution: An Instrumental Quantile Regression Analysis,” Review of Economics and Statistics, 86: 735–751.

Chernozhukov, V. and C. Hansen. 2008. “Instrumental Variable Quantile Regression: A Robust Inference Approach,” Journal of Econometrics, 142: 379–398.

Chernozhukov, V., Hansen, C., and M. Spindler. 2015. “Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach,” Annual Review of Economics, 7: 649–688.

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and J. Robins. 2018a. “Double/debiased Machine Learning for Treatment and Structural Parameters,” Econometrics Journal, 21: C1–C68.

Chernozhukov, V., Hansen, C., and K. Wüthrich. 2018b. “Instrumental Variable Quantile Regression,” Handbook of Quantile Regression.

Chiou, Y.-Y., Chen, M.-Y., and J.-E. Chen. 2018. “Nonparametric Regression with Multiple Thresholds: Estimation and Inference,” Journal of Econometrics, 206(2): 472–514.

Yi, C., and J. Huang. 2017. “Semismooth Newton Coordinate Descent Algorithm for Elastic-net Penalized Huber Loss Regression and Quantile Regression,” Journal of Computational and Graphical Statistics, 26(3): 547–557.