Identifying causal channels of policy reforms with multiple treatments and different types of selection ∗
Annabelle Doerr
UC Berkeley, University of Basel
Anthony Strittmatter
University of St. Gallen
October 13, 2020
Abstract
We study the identification of channels of policy reforms with multiple treatments and different types of selection for each treatment. We disentangle reform effects into policy effects, selection effects, and time effects under the assumptions of conditional independence, common trends, and an additional exclusion restriction on the non-treated. Furthermore, we show the identification of direct and indirect policy effects after imposing additional sequential conditional independence assumptions on mediating variables. We illustrate the approach using the German reform of the allocation system of vocational training for unemployed persons. The reform changed the allocation of training from a mandatory system to a voluntary voucher system. Simultaneously, the selection criteria for participants changed, and the reform altered the composition of course types. We consider the course composition as a mediator of the policy reform. We show that the empirical evidence from previous studies reverses when considering the course composition. This has important implications for policy conclusions.

JEL-Classification: C21, J68, H43
Keywords: Difference-in-Differences, Mediation Analysis, Treatment Effects Evaluation, Administrative Data, Training Voucher

∗ This study is part of the project “Regional Allocation Intensities, Effectiveness and Reform Effects of Training Vouchers in Active Labor Market Policies”, IAB project 1155. This is a joint project of the Institute for Employment Research (IAB) and the University of Freiburg. We gratefully acknowledge financial and material support from the IAB. The paper was presented at ESPE in Aarhus, CAFE Workshop in Børkop, SOLE in Washington, EALE in Ljubljana, Joint Research Centre of the European Commission, Centre for European Economic Research, and the University of Bern.
We thank participants for helpful comments, in particular Hugo Bodory, Bernd Fitzenberger, Hans Fricke, Michael Lechner, Michael Knaus, Thomas Kruppe, Marie Paul, and Gesine Stephan. We are particularly grateful for detailed comments and remarks from Conny Wunsch. Furthermore, we thank two anonymous referees. The usual disclaimer applies. Correspondence: [email protected], [email protected]

Introduction
A popular approach to evaluating the effectiveness of policy reforms in quasi-experimental settings is the Difference-in-Differences (DiD) method. The baseline version of DiD requires observing one treatment group and one control group. Both groups are untreated before the policy reform. After the reform, the treatment group receives the treatment while the control group remains untreated. The effectiveness of the reform can be estimated by comparing the differences in outcomes of both groups before and after the reform implementation. This comparison leads to unbiased estimates under the common trend assumption, i.e., when the outcomes of both groups would have developed in parallel in the absence of the policy reform.

Often the evaluation of policy instruments is not that simple. For this reason, the literature proposes several extensions of the baseline DiD method. Some studies impose conditional independence assumptions to account for selection into treatment based on observable characteristics (e.g., Abadie, 2005, Heckman, Ichimura, and Todd, 1997, Lechner, 2010). Other studies consider multiple treatments (e.g., Fricke, 2017). This captures situations in which the policy of interest is the reform of an existing policy instrument, for example an increase of treatment intensity, instead of the implementation of a new instrument. There are studies that combine both extensions (e.g., Felfe, Nollenberger, and Rodriguez-Planas, 2014, Havnes and Mogstad, 2011a,b). Some studies even consider multiple treatments and different types of selection for each treatment (e.g., Card and Hyslop, 2005, Rinne, Uhlendorff, and Zhao, 2013).
However, the results from these studies are not formally identified without the implementation of a structural model for the specific policy question.

As the first contribution of our paper, we formally show how policy reforms can be non-parametrically decomposed into effects of changing the policy instrument and selection effects, as well as other time-varying factors such as business cycle effects. We mainly rely on conditional independence and common trend assumptions. We highlight that an additional exclusion restriction on the untreated is sufficient to identify the policy and time effects, a point that is not recognised in the previous literature. The imposed assumptions are necessary and sufficient to achieve additive separability, which is, for example, imposed by Rinne, Uhlendorff, and Zhao (2013).

Second, we focus on the direct and indirect channels through which changes in a policy instrument may unfold their effects. Relying on mediation analysis (see, e.g., Huber, Lechner, and Mellace, 2017, for a review of mediation analysis), we are the first to explore direct and indirect effects of a quasi-experimental policy reform. The existing approaches of the mediation literature investigate direct and indirect treatment effects instead of policy reform effects (see, e.g., Flores and Flores-Lagunes, 2009, Huber, Lechner, and Strittmatter, 2018, Imai, Keele, and Yamamoto, 2010, Imai, Keele, Tingley, and Yamamoto, 2011, Petersen, Sinisi, and van der Laan, 2006, Van der Weele, 2009). Closely related is the study by Deuchert, Huber, and Schelker (2019), who use common trend assumptions to identify direct and indirect policy effects.
In contrast, we rely on common trend assumptions to identify the policy effects and impose additional sequential independence assumptions on the mediators to identify the direct and indirect effects of the policy reform on the outcome of interest.

Footnote: Relatedly, Huber, Schelker, and Strittmatter (2019) use a changes-in-changes framework to identify direct and indirect channels.

Third, we illustrate our approach using a large-scale reform of the allocation system of unemployed individuals to vocational training in Germany. The reform replaced the existing mandatory allocation system with a voucher allocation system. The voucher system offers voluntary participation, and participants have (some) influence on the course choice (see the detailed discussion in Doerr, Fitzenberger, Kruppe, Paul, and Strittmatter, 2017). Under the mandatory system, participation was compulsory and caseworkers in local employment agencies allocated participants to specific courses. Additionally, the reform changed the criteria for selecting unemployed persons into training programmes. Under the pre-reform system, caseworkers assigned training based on subjective criteria, whereas the new selection rule focuses on predicted future employment outcomes. Caseworkers were incentivised to select unemployed persons with an expected re-employment probability of at least 70% within six months after the end of training. Accordingly, this reform offers a setting in which the overall reform effect is a composition of time effects, [...]

However, it is unexplored whether training is more effective under voluntary or mandatory participation, net of the course composition that might change under different allocation systems. Our study also contributes to closing this research gap.

We build on the work of Rinne, Uhlendorff, and Zhao (2013), who investigate the same reform.
They disentangle the effects of the reform of the allocation system from the changing selection criteria and find positive but mostly insignificant short-term effects of the voucher reform. We replicate their results using a larger data set and an efficient estimation method. Rinne, Uhlendorff, and Zhao (2013) observe 1,319 training participants after the reform and match control observations by single-nearest-neighbour matching. In contrast, we obtained administrative data from the Federal Employment Agency of Germany, which contain the population of vocational training participants during the years 2001-2004. Our evaluation sample consists of more than 26,000 training participants in each time period. We apply the doubly robust and locally efficient auxiliary-to-study tilting estimator proposed in Graham, De Xavier Pinto, and Egel (2016).

Furthermore, our data allow us to consider long-term effects over a time period of more than seven years after programme entry (in contrast to 1.5 years in Rinne, Uhlendorff, and Zhao, 2013). Qualitatively, we confirm the findings for the time period under consideration in Rinne, Uhlendorff, and Zhao (2013). Our results suggest positive effects of the reform of the allocation system in the short term. Moreover, we find that the reform of the allocation system reduces the re-employment probabilities between the first and second year after the start of training. After three years, the effects turn positive and remain at an approximately stable level until seven years after the training started. This suggests that it is crucial to consider long-term reform effects.

Footnote: For example, Perez-Johnson, Moore, and Santillano (2011) provide experimental evidence for the relative effectiveness of different degrees of participants’ influence on the course choice under voluntary participation in training. They find that increasing participants’ course choices has no effect on their re-employment probability and negative effects on their earnings.

In contrast to Rinne, Uhlendorff, and Zhao (2013), we consider the type and duration of training as mediators, i.e., intermediate outcomes on the causal path from the assignment system to the individual labour market outcomes. Our results show that the short-term positive effects of the reform are mainly driven by a different composition of training course types and durations after the reform. More individuals participate in shorter courses in the post-reform period, which leads to an improvement of labour market outcomes in the short term but not in the long term. This is almost a mechanical effect, because participants in courses with short durations are distracted from intensive job search for a shorter time period.

This is an example of Manski’s (1997) mixing problem in programme evaluations. Treatment variation occurs because participants can self-select into different types and durations of training. This makes the evaluation of the treatment particularly complicated, because it is difficult to disentangle variations in the allocation system from variations in the treatment. Manski (1997) suggests a partial identification approach to address the mixing problem (see also the discussion in Gundersen, Kreider, Pepper, and Tarasuk, 2017). We follow a different strategy and use a mediation analysis framework (e.g., Imai, Keele, Tingley, and Yamamoto, 2011) to separate the effects of the voluntary allocation system from the variation in the types and durations of training.

We are particularly interested in the effects of voluntary participation net of the effects from a changing course composition. We find negative employment effects during the first three years after programme entry. During the lock-in period, the re-employment chances decrease by up to four percentage points. A possible explanation for this result is lower job search intensity under voluntary participation, which may be explained by a higher motivation to attend and complete the courses. The effects tend to turn positive in the long term. Possibly, unemployed individuals accumulate more human capital under the voluntary system than under the mandatory system, which pays off in the long term. These results show that causal channels strongly affect the policy conclusions. From a policy maker's perspective, voluntary participation should only be offered when the programmes’ objective is a long-term investment in human capital. Mandatory assignment appears to be more successful in the short term. Accordingly, this allocation system should be used when fast reintegration is the major programme goal.

The remainder of this paper is structured as follows. In the next section, we show the identification of the policy effect of a reform and its causal channels in a setting with multiple treatments and selection. We discuss the parameter of interest, identification, and estimation strategy. A detailed illustration of this approach using the example of the allocation reform of vocational training in Germany follows in Section 3. The final section concludes. Additional information is provided in Online Appendices A-E.
We define the parameter of interest within the potential outcome framework proposed by Rubin (1974). We denote random variables by capital letters and realized values by small letters. Assume we have a random sample of individuals from a large population. For each individual in the sample, we observe the treatment state $D = d \in \{0, 1\}$, which indicates whether the individual receives a treatment ($D = 1$) or not ($D = 0$). Furthermore, we assume that a reform of the policy instrument took place at some point in time. Let $T$ be an indicator for the time period that can take on the values $t \in \{0, 1\}$ for the pre-reform or post-reform time period, respectively. Finally, we consider a policy system indicator $S = s \in \{b, a\}$ that is $b$ before the reform was implemented and $a$ afterwards.

We indicate the potential outcomes by $Y^{d}_{t}(s)$. They can be stratified into eight groups: $Y^{1}_{0}(b)$ and $Y^{1}_{1}(b)$ indicate the potential outcomes that would be observed under pre-reform system treatment in the pre- or post-reform period, respectively. $Y^{1}_{0}(a)$ and $Y^{1}_{1}(a)$ are the potential outcomes under post-reform system treatment in the pre- or post-reform period. $Y^{0}_{0}(b)$ and $Y^{0}_{1}(b)$ are the potential outcomes under pre-reform system non-treatment before or after the reform. $Y^{0}_{0}(a)$ and $Y^{0}_{1}(a)$ are the analogous potential outcomes under post-reform system non-treatment in both time periods.

We only observe one potential outcome for each individual. We never observe pre-reform system treatments after the reform took place ($Y^{1}_{1}(b)$, $Y^{0}_{1}(b)$). Similarly, we never observe the post-reform system treatments in the pre-reform period ($Y^{1}_{0}(a)$, $Y^{0}_{0}(a)$) because the post-reform policy system was implemented as part of the reform.
The observed outcome equals
$$Y = \sum_{d \in \{0,1\}} \sum_{t \in \{0,1\}} \sum_{s \in \{b,a\}} G(d, t, s)\, Y^{d}_{t}(s),$$
where $G(d, t, s)$ is an indicator function with $G(d, t, s) = 1\{D = d, T = t, S = s\}$ for $d, t \in \{0, 1\}$ and $s \in \{b, a\}$, which is the stable unit treatment value assumption (SUTVA) (e.g., Cox, 1958).

In our application, $D$ specifies whether an unemployed individual participates in a vocational training programme and $S$ specifies the training allocation system before (mandatory system $m$) and after the reform (voucher system $v$). The outcome $Y$ measures different labour market outcomes.

We are primarily interested in the policy effect, i.e., the effect of the reform of the allocation into training from a mandatory to a voucher system. Policy effects are the expected difference of potential outcomes under the voucher and mandatory systems, holding treatment status and time period constant. In particular, we focus in our application on the policy effect under treatment in the post-reform period,
$$\gamma^{p} = E\left[ Y^{1}_{1}(v) - Y^{1}_{1}(m) \mid D = 1, T = 1 \right].$$
Consider the following thought experiment to clarify the interpretation of this policy effect: Compare the employment outcomes of training participants who receive a training voucher after the reform with the employment outcomes that they would obtain if they [...]

Footnote: Alternatively, the policy effect could be defined under non-treatment status or for the pre-reform period.

$$\gamma^{ba} = E\left[ Y^{1}_{1}(v) - Y^{0}_{1}(v) \mid D = 1, T = 1 \right] - E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0 \right].$$
We show below how to decompose the overall effect into the selection effect, the time effect, and the policy effect. It is often the case that reforms of policy instruments also affect the selection of those who are treated with the policy instrument. In our example, part of the reform was the implementation of stricter selection criteria for participants. As a consequence, treated individuals before and after the reform may differ in their observed characteristics.
The selection effect under the mandatory system in the pre-reform period can be formalised as
$$\gamma^{sel} = E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right] - E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0 \right].$$
The treated population may change before and after the reform, but the policy system and time effects are held constant. The following thought experiment may clarify the interpretation of the selection effect: Assign participants from the post-reform period to training in the pre-reform period. Then, compare them to actually observed participants in the pre-reform period.

Furthermore, the labour market outcomes of individuals could differ before and after the reform because of time effects, even after controlling for treatment state and policy system. In our setting, it is likely that business cycle effects occur. In our application, we define the business cycle effects under the mandatory system for the treated population after the reform, which we formalise as
$$\gamma^{bc}_{1} = E\left[ Y^{1}_{1}(m) - Y^{1}_{0}(m) \mid D = 1, T = 1 \right] \quad \text{and} \quad \gamma^{bc}_{0} = E\left[ Y^{0}_{1}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right].$$
The parameter $\gamma^{bc}_{1}$ defines business cycle effects under treatment in the mandatory system, and $\gamma^{bc}_{0}$ defines business cycle effects under non-treatment in the mandatory system. In the following, we discuss the sufficient assumptions to identify the effects of interest.

The identification of the overall reform effect $\gamma^{ba}$ and the selection effect $\gamma^{sel}$ from the joint distribution of the random variables $(Y, G(d, t, s), X)$ can be achieved by controlling for a large set of $K$ confounding pre-treatment variables $X$ with support $\mathcal{X} \subseteq \mathbb{R}^{K}$ to account for the possibility of selection into treatment based on observed characteristics.

Assumption 1a (Conditional Mean Independence)
For all $d, d', t, t' \in \{0, 1\}$, $s \in \{m, v\}$ and $x \in \mathcal{X}$,
$$E\left[ Y^{d}_{t}(s) \mid D = d', T = t', X = x \right] = E\left[ Y^{d}_{t}(s) \mid D = d, T = t, X = x \right]$$
and all necessary moments exist.

This assumption implies that the expected potential outcomes are independent of the treatment $D$ and time period $T$ after controlling for the pre-treatment control variables $X$. All confounding variables that jointly influence the expected potential outcomes and treatment status must be included in the vector $X$. Note that Assumption 1a also includes a time dimension, i.e., we assume that individuals being treated in $t = 1$ would have the same expected potential outcomes as treated individuals in $t = 0$ if they were treated under the pre-reform policy system before the reform (conditional on $X$). This assumption holds if those treated before and after the reform do not differ systematically in unobserved characteristics that influence both the treatment probability and potential outcomes.

Assumption 2a (Support)
$$0 < Pr\left( G(d, t, s) = 1 \mid X = x \right) < 1 \quad \forall\, d, t \in \{0, 1\}$$
for the subpopulation with $G(d', t', s) = 1$ $\forall\, d', t' \in \{0, 1\}$.

Assumption 2a requires overlap in the propensity score distributions of the different sub-populations, which can be tested in the data (see the discussion in Lechner and Strittmatter, 2019).

Under Assumptions 1a and 2a, for all $d, d', t, t' \in \{0, 1\}$ and $s \in \{m, v\}$,
$$E\left[ Y^{d}_{t}(s) \mid D = d', T = t' \right] = E\left[ \frac{p_{d',t',s}(X)}{p_{d',t',s} \cdot p_{d,t,s}(X)}\, G(d, t, s)\, Y \right] \tag{1}$$
is identified from observed data on the joint distribution of $(Y, G(d, t, s), G(d', t', s), X)$, with $p_{k,l,s}(x) = Pr(G(k, l, s) = 1 \mid X = x)$ and $p_{k,l,s} = Pr(G(k, l, s) = 1)$ for $k \in \{d, d'\}$ and $l \in \{t, t'\}$ (see, e.g., Rosenbaum and Rubin, 1983).
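The steps behind (1) can be sketched informally as follows. This is only a heuristic outline under Assumptions 1a and 2a, not the formal proof; the shorthand $\mu(X)$ is introduced here for illustration, and the sketch treats conditioning on $D = k, T = l$ as equivalent to conditioning on $G(k, l, s) = 1$, because the policy system is tied to the time period.

```latex
% Heuristic sketch of equation (1); \mu(X) is shorthand introduced for this sketch.
\begin{aligned}
E\left[ \frac{p_{d',t',s}(X)}{p_{d',t',s} \cdot p_{d,t,s}(X)}\, G(d,t,s)\, Y \right]
 &= E\left[ \frac{p_{d',t',s}(X)}{p_{d',t',s}} \cdot
      \frac{E\left[ G(d,t,s)\, Y \mid X \right]}{p_{d,t,s}(X)} \right]
      && \text{(iterated expectations)} \\
 &= E\left[ \frac{p_{d',t',s}(X)}{p_{d',t',s}}\, \mu(X) \right],
      \quad \mu(X) = E\left[ Y^{d}_{t}(s) \mid D = d, T = t, X \right]
      && \text{(observation rule)} \\
 &= E\left[ \mu(X) \mid D = d', T = t' \right]
      && \text{(Bayes' rule)} \\
 &= E\left[ E\left[ Y^{d}_{t}(s) \mid D = d', T = t', X \right] \mid D = d', T = t' \right]
      && \text{(Assumption 1a)} \\
 &= E\left[ Y^{d}_{t}(s) \mid D = d', T = t' \right].
\end{aligned}
```

Assumption 2a guarantees that the ratio of propensity scores in the first line is well defined.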
For completeness, a formal proof of (1) can be found in Online Appendix B.1.

Accordingly, the before-after effect $\gamma^{ba}$ can be calculated as the difference between the average treatment effects on the treated (ATT) before and after the reform. The pre-reform ATT can be formalised as
$$\gamma^{pre} = E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0 \right].$$
The expected potential outcome $E[Y^{1}_{0}(m) \mid D = 1, T = 0]$ is directly observed from the data. $E[Y^{0}_{0}(m) \mid D = 1, T = 0]$ is the counterfactual expected potential outcome, because $Y^{0}_{0}(m)$ is never observed for treated individuals before the reform. In our setting, $\gamma^{pre}$ is the average effect of training participation under the mandatory system in the pre-reform period for unemployed persons who mandatorily participate. The pre-reform ATT is identified from observed data as
$$\gamma^{pre} \overset{A1a,A2a}{=} E\left[ \frac{1}{p_{1,0,m}}\, G(1, 0, m)\, Y \right] - E\left[ \frac{p_{1,0,m}(X)}{p_{1,0,m} \cdot p_{0,0,m}(X)}\, G(0, 0, m)\, Y \right].$$
The post-reform ATT can be indicated by
$$\gamma^{post} = E\left[ Y^{1}_{1}(v) - Y^{0}_{1}(v) \mid D = 1, T = 1 \right].$$
The expected potential outcome $E[Y^{1}_{1}(v) \mid D = 1, T = 1]$ is directly observed from the data. $E[Y^{0}_{1}(v) \mid D = 1, T = 1]$ is a counterfactual expected potential outcome, because $Y^{0}_{1}(v)$ is never observed for treated individuals in the post-reform period. Here, the parameter $\gamma^{post}$ is the average effect of participation in the post-reform period for participants under the voucher system. The post-reform ATT is identified from observed data as
$$\gamma^{post} \overset{A1a,A2a}{=} E\left[ \frac{1}{p_{1,1,v}}\, G(1, 1, v)\, Y \right] - E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{0,1,v}(X)}\, G(0, 1, v)\, Y \right].$$
Next, we focus on the selection effect. In our setting, programme participants before and after the reform are likely to differ in their observed characteristics due to changes in the selection criteria.
We are interested in the differences between the effectiveness of training that arise solely from the changing characteristics of participants, holding everything else constant at the pre-reform situation,
$$\gamma^{sel} = E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right] - E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0 \right].$$
The expected potential outcome $E[Y^{1}_{0}(m) \mid D = 1, T = 0]$ is directly observed from the data. The selection effect is identified under Assumptions 1a and 2a by
$$\begin{aligned}
\gamma^{sel} \overset{A1a,A2a}{=}\; & E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{1,0,m}(X)}\, G(1, 0, m)\, Y \right] - E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{0,0,m}(X)}\, G(0, 0, m)\, Y \right] \\
& - \left[ E\left[ \frac{1}{p_{1,0,m}}\, G(1, 0, m)\, Y \right] - E\left[ \frac{p_{1,0,m}(X)}{p_{1,0,m} \cdot p_{0,0,m}(X)}\, G(0, 0, m)\, Y \right] \right].
\end{aligned}$$
The identification of business cycle effects and the policy effect requires two additional assumptions because we never observe the pre-reform policy system after the reform and the post-reform policy system before the reform. First, we assume that potential outcomes of the non-treated are independent of the policy system, i.e., we assume that the reform has no effects on the outcomes of the untreated. This is a plausible assumption if only a relatively small fraction of the population is affected by the policy system such that general equilibrium effects can be neglected.

Assumption 3 (Exclusion Restriction on Untreated)
$$E\left[ Y^{0}_{1}(v) \mid D = 1, T = 1 \right] = E\left[ Y^{0}_{1}(m) \mid D = 1, T = 1 \right].$$
Second, we impose the assumption of common trends. Thereby, we assume the business cycle effects to be independent of the treatment status, i.e., in the absence of the reform the time trends of the potential outcomes would be similar under treatment and non-treatment in the mandatory system if the characteristics of the participants were fixed.
Assumption 4 (Common Trend Assumption)
$$\gamma^{bc}_{1} = \gamma^{bc}_{0}.$$
Under Assumptions 1a, 2a, 3, and 4, we can identify the business cycle effect under mandatory treatment $\gamma^{bc}_{1}$ from observed data as
$$\begin{aligned}
\gamma^{bc}_{1} \overset{A4}{=} \gamma^{bc}_{0} &= E\left[ Y^{0}_{1}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right] \\
&\overset{A3}{=} E\left[ Y^{0}_{1}(v) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right] \\
&\overset{A1a,A2a}{=} E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{0,1,v}(X)}\, G(0, 1, v)\, Y \right] - E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{0,0,m}(X)}\, G(0, 0, m)\, Y \right].
\end{aligned}$$
Footnote: A possible extension is to focus on bounds instead of point identification (see the discussion in, e.g., Kikuchi, 2017, Twinam, 2017).

Now, we focus on the parameter of primary interest in this study. The policy effect is the difference of potential outcomes of the treated due to a change in the policy instrument [...] $E[Y^{0}_{1}(v) \mid D = 1, T = 1]$ and using $E[Y^{0}_{1}(v) \mid D = 1, T = 1] = E[Y^{0}_{1}(m) \mid D = 1, T = 1]$ (A3), we can rewrite the policy effect as
$$\begin{aligned}
\gamma^{p} &= E\left[ Y^{1}_{1}(v) - Y^{1}_{1}(m) \mid D = 1, T = 1 \right] \\
&\overset{A3}{=} E\left[ Y^{1}_{1}(v) - Y^{0}_{1}(v) \mid D = 1, T = 1 \right] - E\left[ Y^{1}_{1}(m) - Y^{0}_{1}(m) \mid D = 1, T = 1 \right].
\end{aligned}$$
The potential outcome $Y^{1}_{1}(m)$ is never observed for treated individuals after the reform. However, under the imposed assumptions the policy effect can be decomposed into the different reform parameters by adding and subtracting $E[Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0]$ and $E[Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1]$.
Thus, the policy effect is equal to the overall reform effect minus business cycle effects minus the selection effect, which are all, as shown above, identified from observed data:
$$\begin{aligned}
\gamma^{p} =\; & E\left[ Y^{1}_{1}(v) - Y^{0}_{1}(v) \mid D = 1, T = 1 \right] - E\left[ Y^{1}_{1}(m) - Y^{0}_{1}(m) \mid D = 1, T = 1 \right] \\
& + E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right] - E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right] \\
& + E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0 \right] - E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0 \right] \\
=\; & \underbrace{E\left[ Y^{1}_{1}(v) - Y^{0}_{1}(v) \mid D = 1, T = 1 \right] - E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0 \right]}_{\gamma^{ba}} \\
& - \underbrace{\left[ E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right] - E\left[ Y^{1}_{0}(m) - Y^{0}_{0}(m) \mid D = 1, T = 0 \right] \right]}_{\gamma^{sel}} \\
& - \underbrace{E\left[ Y^{1}_{1}(m) - Y^{1}_{0}(m) \mid D = 1, T = 1 \right]}_{\gamma^{bc}_{1}} + \underbrace{E\left[ Y^{0}_{1}(m) - Y^{0}_{0}(m) \mid D = 1, T = 1 \right]}_{\gamma^{bc}_{0}}.
\end{aligned}$$
Accordingly, $\gamma^{p} = \gamma^{ba} - \gamma^{sel} - (\gamma^{bc}_{1} - \gamma^{bc}_{0})$, which is the additive separability assumption imposed in Rinne, Uhlendorff, and Zhao (2013). We show the sufficient conditions to achieve additive separability. Imposing Assumptions 1a, 2a, 3 and 4, we have shown that the total change in the effectiveness of the policy instrument from before to after the reform can be decomposed into the effect of changing the selection, a time effect, and the policy effect, and that these, in turn, are identified from observed data. Because $\gamma^{bc}_{1} = \gamma^{bc}_{0}$ under Assumption 4, the policy effect equals $\gamma^{ba} - \gamma^{sel}$ and can be estimated from observed data as
$$\begin{aligned}
\gamma^{p} \overset{A1a,A2a,A3,A4}{=}\; & E\left[ \frac{1}{p_{1,1,v}}\, G(1, 1, v)\, Y \right] - E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{0,1,v}(X)}\, G(0, 1, v)\, Y \right] \\
& - E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{1,0,m}(X)}\, G(1, 0, m)\, Y \right] + E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{0,0,m}(X)}\, G(0, 0, m)\, Y \right].
\end{aligned}$$
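To make the decomposition concrete, the following is a minimal simulation sketch and not the estimator used in this paper (which is the auxiliary-to-study tilting estimator). It assumes a single discrete confounder, so that propensity scores can be estimated by simple cell frequencies, and a made-up data-generating process in which common trends and the exclusion restriction hold by construction and the true policy effect is 0.2. All function names, cell codes, and numbers are hypothetical.

```python
import numpy as np

# Hypothetical cell codes for G(d, t, s); the system is tied to the period.
V1, V0, M1, M0 = 0, 1, 2, 3  # (1,1,v), (0,1,v), (1,0,m), (0,0,m)

def simulate(n=400_000, seed=0):
    """Toy data with a true policy effect of 0.2 (all numbers are made up)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 3, size=n)                 # one discrete confounder
    probs = np.array([[0.10, 0.40, 0.20, 0.30],    # Pr(cell | X = 0)
                      [0.15, 0.35, 0.15, 0.35],    # Pr(cell | X = 1)
                      [0.25, 0.25, 0.10, 0.40]])   # Pr(cell | X = 2)
    cum = probs[x].cumsum(axis=1)
    g = (rng.random(n)[:, None] > cum[:, :3]).sum(axis=1)  # cell in 0..3
    d = np.isin(g, [V1, M1]).astype(float)         # treated
    t = np.isin(g, [V1, V0]).astype(float)         # post-reform period
    # Heterogeneous treatment effect, common trend 0.3, policy effect 0.2.
    y = x + d * (1.0 + 0.5 * x) + 0.3 * t + 0.2 * d * t + rng.normal(size=n)
    return y, x, g

def reweighted_mean(y, x, g, source, target):
    """Sample analogue of the weighting in equation (1): mean outcome of cell
    `source`, reweighted to the covariate distribution of cell `target`."""
    p_target = np.mean(g == target)
    p_src_x = np.empty(len(x))
    p_tgt_x = np.empty(len(x))
    for value in np.unique(x):                     # frequency-based propensities
        rows = x == value
        p_src_x[rows] = np.mean(g[rows] == source)
        p_tgt_x[rows] = np.mean(g[rows] == target)
    weights = p_tgt_x / (p_target * p_src_x)
    return np.mean(weights * (g == source) * y)

def decompose(y, x, g):
    m = lambda source, target: reweighted_mean(y, x, g, source, target)
    post = m(V1, V1) - m(V0, V1)                   # post-reform ATT
    pre = m(M1, M1) - m(M0, M1)                    # pre-reform ATT
    sel = (m(M1, V1) - m(M0, V1)) - pre            # selection effect
    ba = post - pre                                # before-after effect
    return {"ba": ba, "sel": sel, "policy": ba - sel}

y, x, g = simulate()
effects = decompose(y, x, g)
print({name: round(float(value), 3) for name, value in effects.items()})
```

In this toy design the before-after effect mixes the policy effect with a selection effect, because the treated in the post-reform period have higher values of the confounder; subtracting the reweighted selection term should recover a value close to the simulated policy effect of 0.2.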
We apply a mediation framework (see, for instance, the seminal paper by Baron and Kenny, 1986) to isolate the causal channels through which the policy effect works. In our setting, we aim to separate the effects of voluntary participation (in the following ’assignment effect’) from the effect of increased course choice (in the following ’composition effect’). Thereby, we consider the type and duration of training as so-called mediators, i.e., intermediate outcomes on the causal path from the assignment system to the individual labour market outcomes. Let $C$ denote the composition of programmes. To investigate the reform channels, we augment the notation of the potential outcomes with programme composition. This new notation of potential outcomes is directly linked to the former notation by $Y^{d}_{t}(s) = Y^{d}_{t}(s, C = c) = Y^{d}_{t}(s, c)$. We start with the policy effect expressed as the total effect of the change from a mandatory to a voucher system by
$$\gamma^{p} = E\left[ Y^{1}_{1}(v, c^{v}) - Y^{1}_{1}(m, c^{m}) \mid D = 1, T = 1 \right], \tag{2}$$
where $c^{m}$ denotes the realised programme composition under mandatory assignment and $c^{v}$ the realised programme composition under voucher assignment.

This extended notation allows us to define further parameters of interest. The impact of the policy effect may be (partly) due to increased course choice or to a direct effect of voluntary participation. In the following, we show how these two effects can be disentangled. First, we are particularly interested in the so-called controlled direct effect (see, for instance, Pearl, 2001). It can be formalised as
$$\rho = E\left[ Y^{1}_{1}(v, c^{v}) - Y^{1}_{1}(m, c^{v}) \mid D = 1, T = 1 \right].$$
This is the direct effect of the voucher system for the type and duration composition of training as under the voucher system, i.e., the assignment effect. Second, the effect of increased course choice can be formalised as
$$\delta = E\left[ Y^{1}_{1}(m, c^{v}) - Y^{1}_{1}(m, c^{m}) \mid D = 1, T = 1 \right].$$
This is the indirect effect of increased course choice, i.e., the assignment system is kept constant while the composition of programme types and durations varies. As can be seen from adding and subtracting $Y^{1}_{1}(m, c^{v})$ in the expectation of expression (2), the direct effect $\rho$ and the indirect effect $\delta$ sum up to the total policy effect $\gamma^{p}$.

However, causal mechanisms are not easily identified. Even if the policy effect is identified, this would not imply identification of the mediator effects. Addressing the endogeneity of mediators requires that they are independent of the potential outcomes conditional on the policy system and the covariates.

Assumption 1b (Sequential Conditional Mean Independence)
For all $s, s' \in \{m, v\}$ and $x \in \mathcal{X}$,
$$E\left[ Y^{1}_{1}(s, c^{s'}) \mid D = 1, T = 1, C = c^{s'}, X = x \right] = E\left[ Y^{1}_{1}(s, c^{s'}) \mid D = 1, T = 1, X = x \right]$$
and all necessary moments exist.

Assumption 1b implies for the treated in the post-reform period that, given the observed pre-treatment confounders, the expected potential outcomes are independent of the type and duration of training. The selection of the type and duration of training causally succeeds the selection into treatment. Therefore, we call Assumption 1b sequential conditional mean independence. The combination of Assumptions 1a and 1b is analogous to the sequential conditional independence assumption invoked in the non-parametric mediation literature for identifying direct effects (see, e.g., Imai, Keele, Tingley, and Yamamoto, 2011). In contrast, a multiple treatment framework would assume contemporaneous selection into treatment and selection of the type and duration of training (Imbens, 2000, Lechner, 2001). Then Assumptions 1a and 1b would have to hold contemporaneously instead of sequentially.

Assumption 2b (Support)
$$0 < Pr\left( G(d, t, s) = 1 \mid C = c, X = x \right) < 1 \quad \forall\, s \in \{m, v\},\; d, t \in \{0, 1\}.$$
Assumption 2b requires overlap in the propensity score distributions of the mediators under both systems and the control variables. Finally, under Assumptions 1a,b, 2a,b, 3 and 4 the controlled direct and the indirect effects can be identified as
$$\begin{aligned}
\rho \overset{A1a,b,A2a,b,A3,A4}{=}\; & E\left[ \frac{1}{p_{1,1,v}}\, G(1, 1, v)\, Y \right] - E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{0,1,v}(X)}\, G(0, 1, v)\, Y \right] \\
& - E\left[ \frac{p_{1,1,v}(X, C)}{p_{1,1,v}(C) \cdot p_{1,0,m}(X, C)}\, G(1, 0, m)\, Y \right] + E\left[ \frac{p_{1,1,v}(X)}{p_{1,1,v} \cdot p_{0,0,m}(X)}\, G(0, 0, m)\, Y \right],
\end{aligned}$$
and
$$\delta \overset{A1a,b,A2a,b,A3,A4}{=} \gamma^{p} - \rho,$$
with $p_{1,1,v}(x, c) = Pr(G_i(1, 1, v) = 1 \mid C = c, X_i = x)$ and $p_{1,1,v}(c) = Pr(G_i(1, 1, v) = 1 \mid C = c)$ (see, e.g., Huber, 2014).
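The relation $\delta = \gamma^{p} - \rho$ used above follows directly from the definitions of the three parameters. Written out as a worked step, using only the notation already introduced:

```latex
% Adding and subtracting Y^1_1(m, c^v) inside the expectation of expression (2).
\begin{aligned}
\gamma^{p}
 &= E\left[ Y^{1}_{1}(v, c^{v}) - Y^{1}_{1}(m, c^{m}) \mid D = 1, T = 1 \right] \\
 &= E\left[ Y^{1}_{1}(v, c^{v}) - Y^{1}_{1}(m, c^{v}) \mid D = 1, T = 1 \right]
  + E\left[ Y^{1}_{1}(m, c^{v}) - Y^{1}_{1}(m, c^{m}) \mid D = 1, T = 1 \right] \\
 &= \rho + \delta .
\end{aligned}
```

Once $\gamma^{p}$ and $\rho$ are identified, the indirect effect therefore requires no separate weighting expression.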
We illustrate our approach using a large-scale reform of the allocation system of unemployed individuals to vocational training in Germany. This reform presents an illustrative example in which policy effects, selection effects, and time effects are part of the overall reform effect. The main objective of vocational training for unemployed persons is the adjustment of their skills to changing requirements in the labour market and/or changed individual conditions (due to health problems, for example). In Germany, vocational training comprises three types of programmes: practice firm training, classical vocational training, and retraining. Classical vocational training courses take place in classrooms or on the job and are categorised by their planned durations. We distinguish between short training (a maximum duration of six months) and long training (a minimum duration of six months). Practice firm training simulates a work environment in a practice firm. Retraining (also called degree course) has long durations of up to three years. It leads to the completion of a (new) vocational degree within the German apprenticeship system. Further descriptions and examples of courses can be found in Table 1.

Table 1 around here
Before 2003, caseworkers’ assignment of unemployed individuals to courses was mandatory and based on subjective criteria. The introduction of a voucher system on January 1, 2003 was intended to increase the responsibility of training participants and to establish market systems for training providers (Bruttel, 2005). Potential training participants receive a training voucher that allows them to select the provider and course. Their choice is subject to the following restrictions: First, the voucher specifies the objective, content, and maximum duration of the course. Second, it can be redeemed within a one-day commuting zone. Third, the validity of training vouchers varies between one week and a maximum of three months. Importantly, caseworkers cannot impose sanctions if a voucher is not redeemed.

Simultaneously with the voucher system, the reform introduced stricter selection criteria for potential training participants. The post-reform paradigm of the German Federal Employment Agency focuses on direct and rapid placement of unemployed individuals, high reintegration rates, and low dropout rates. Caseworkers award vouchers such that at least 70% of all voucher recipients are expected to find jobs within six months of completing training. The enforcement of the 70% criterion was difficult, because satisfying the rule had no consequences. For this reason, the selection rule was abolished after 2004.

Data, treatment and sample

This study is based on administrative data provided by the German Federal Employment Agency. The data set contains information on all individuals in Germany who participated in a training programme between 2001 and 2004. Individual records are collected from the Integrated Employment Biographies (IEB). The IEB is a rich administrative database and the source of the sub-samples of data used in all recent studies that evaluate German ALMP programmes (e.g., Biewen, Fitzenberger, Osikominu, and Paul, 2014, Lechner, Miquel, and Wunsch, 2011, Lechner and Wunsch, 2013). The IEB is a merged data file containing individual records collected from four different administrative processes: the IAB Employment History (Beschäftigten-Historik), the IAB Benefit Recipient History (Leistungsempfänger-Historik), the Data on Job Search originating from the Applicants Pool Database (Bewerberangebot), and the Participants-in-Measures Data (Maßnahme-Teilnehmer-Gesamtdatenbank). IAB (Institut für Arbeitsmarkt- und Berufsforschung) is the abbreviation for the research department of the German Federal Employment Agency.

The sample used as the comparison group originates from the same database. It is constructed as a 3% random sample of individuals who experience at least one transition from employment to non-employment. We account for the different sampling probabilities in all calculations whenever necessary.

The treatment is defined as the first participation in a vocational training programme during the first year of unemployment. We follow a static evaluation approach and impute (pseudo) participation starts (similar to, e.g., Lechner, 1999, Lechner and Smith, 2007). The evaluation sample is constructed as an inflow sample into unemployment. The baseline sample (Sample A) consists of individuals who became unemployed in 2001 under the mandatory system or in 2003 under the voucher system, after having been continuously employed for at least three months. Additionally, we use an alternative sample definition (Sample B) for which we alter the pre-reform sample restrictions. We consider individuals who enter unemployment in 2002 and start training within the following 12 months but no later than December 2002. Thereby, we approximate the timing of the reform implementation with respect to inflow into unemployment. Sample B is used for robustness tests. A graphical illustration of the samples is presented in Figure 1.

Figure 1 around here

Entering unemployment is defined as the transition from (non-subsidised, non-marginal, non-seasonal) employment to non-employment of at least one month. We focus on individuals who are eligible for unemployment benefits at the time of inflow into unemployment.
The baseline Sample A includes 206,511 unweighted or 1,011,125 reweighted observations. We account for the fact that we use a 100% sample of programme participants and a 3% random sample of non-participants using the inverse inclusion probabilities as weights. We observe 26,341 unemployed individuals who redeem vouchers and 69,216 participants who are directly assigned to a training course. This is the full sample of vocational training participants in Germany that satisfies our sample selection criteria. The sample includes 420,014 reweighted control persons before and 495,554 reweighted control persons after the reform.

Table 2 around here

In Table 2, we report the sample first moments of the observed characteristics with a large standardised difference. Additionally, we present descriptive statistics for observed characteristics with small standardised differences in Table A.1 in Online Appendix A. In the first two columns of Table 2, we report the sample first moments of the control variables for participants and non-participants under the voucher system. The respective sample moments under the mandatory system can be found in the third and fourth columns. The last three columns display the standardised differences between the different sub-samples and the treatment group under the voucher system. Training participants are on average younger, have fewer instances of incapacity and are better educated. They have more successful employment and welfare histories than unemployed individuals in the comparison group. These patterns are observed under both systems. The primary differences are observed in the employment histories of participants and the regional characteristics.
Training participants under the voucher system have been employed longer and have higher cumulative earnings than participants under the mandatory system. Furthermore, participants under the voucher system are more likely to reside in local employment agency districts with low employment in the construction sector and a high share of male unemployment.
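The standardised differences discussed above can be illustrated with a short sketch. The text does not state the exact formula, so the definition below (absolute mean difference divided by the square root of the average of the two group variances, multiplied by 100) is an assumption; the function names are hypothetical. For a binary characteristic the variance follows from the share itself, which allows a rough check against a reported row of Table 2.

```python
import math


def binary_variance(share):
    """Variance of a 0/1 variable with the given share of ones."""
    return share * (1.0 - share)


def standardised_difference(mean_a, var_a, mean_b, var_b):
    """Absolute standardised difference in percent (assumed definition)."""
    return 100.0 * abs(mean_a - mean_b) / math.sqrt((var_a + var_b) / 2.0)


# "Older than 50 years" under the voucher system, shares taken from Table 2:
share_treated, share_control = 0.010, 0.111
sd_older_50 = standardised_difference(
    share_treated, binary_variance(share_treated),
    share_control, binary_variance(share_control),
)
print(round(sd_older_50, 1))
```

Under this assumed definition the result is close to the value of 43.3 reported in column (5) of Table 2 for this characteristic. Small deviations for other rows are to be expected, because the published shares are rounded and the underlying moments are computed with sampling weights.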
Assumptions 1a and 1b are strong, but standard in the programme evaluation literature. The plausibility of similar assumptions has been studied by Biewen et al. (2014) and Lechner and Wunsch (2013) for training programme evaluations. Their findings suggest that such assumptions are plausible for training programme evaluations when rich data is available. We use exceptionally rich data, which includes the control variables used in the previous literature and additional new variables. In particular, we use baseline personal characteristics, the timing of programme starts, regions, benefit and unemployment insurance claims, pre-programme outcomes, and labour market histories (see Table 2 and Table A.1 in Online Appendix A). In addition to the standard variables, we control for proxy information concerning physical or mental health problems, lack of motivation, and reported sanctions. Furthermore, we control for regional characteristics at the level of local employment agency districts, which are often not available with such precision. Thus, the imposed assumptions appear to be plausible in our setting.

Assumption 2a,b can be tested using the data. In unreported calculations, we perform simple support tests and do not observe any incidence of support problems.

Assumption 3 requires that the reform has no effect on the non-treated. After controlling for the changed selection of treated before and after the reform, which can indeed change the composition of the non-treated, it is plausible that the assignment system is independent of the potential outcomes of non-participants. The main argument for this is that the reform of the assignment mechanism only affects participants and the share of participants is relatively small, such that general equilibrium effects can be neglected.

We show several plausibility tests for Assumption 4, which requires that the potential outcomes of participants and non-participants would follow the same trend in the absence of the reform.
We present three different types of supporting evidence for the plausibility of this assumption. First, Figure 2 reports the long-term trends in the outcome variables for different samples for the years between 1990 and 2012. Prior to the treatment start dates in 2001 and 2003, the outcomes of the participants and non-participants samples evolve in parallel over many years. Given these parallel trends, it is likely that we would observe the same respective patterns after 2001 or 2003 in the absence of a treatment.

Figure 2 around here

Second, we experiment with additional information on local employment agency districts (i.e., regional control variables). We observe the monthly regional unemployment rate (by gender and citizen status), the ratio of vacant full-time jobs, employment shares by sector and population density. We assess the sensitivity of our findings with respect to these factors. If our results are not sensitive to the regional control variables, we expect that possible interactions between the effectiveness of training participation and the unemployment rate (or the business cycle in general) are not important in our application. This would support the plausibility of the common trend assumption.

Third, we use an alternative sample definition (Sample B) for which we alter the pre-reform sample restrictions. We consider individuals who enter unemployment in 2002 and start training within the following twelve months but no later than December 2002. Consequently, not all individuals in Sample B can participate during the first twelve months of their unemployment period (e.g., an individual who enters unemployment in October can only receive treatment under the mandatory system in the following three months). Using Sample B, we approximate the timing of the reform implementation with respect to the inflow into unemployment. We argue that the common trend assumption is more likely to hold if the time difference between the pre- and post-reform periods is smaller.
However, in contrast to the baseline sample (Sample A), Sample B is not balanced in the pre- and post-reform periods (cf. Figure 1).

Estimation
We apply a semi-parametric reweighting estimator, Auxiliary-to-Study Tilting (Graham, De Xavier Pinto, and Egel, 2016), in all estimations. This estimator is well suited to our empirical design because it balances the efficient sample first moments exactly. Furthermore, it is $\sqrt{N}$-consistent and asymptotically normal. The estimator is described in Online Appendix B.2.

We start this section by showing the overall reform effect. Figure 3 presents the ATTs for participants in vocational training courses before the reform ($\gamma_{pre}$) and after the reform ($\gamma_{post}$). The outcome of interest is non-subsidised and non-marginal employment that is subject to social security contributions (‘employment’ in the following). Subsidised employment is employment in the context of an ALMP. Marginal employment is defined, according to social security regulations in Germany, as employment of only a few hours per week. Results for monthly earnings are available in the online appendix. We report separate effects for every month during the 88 months following the course start. The lines are monthly point estimates and the diamonds indicate significant effects at the 5% level.

Figure 3 around here

Training participants suffer from negative lock-in effects before and after the reform. The lock-in effects are steeper in the pre-reform period but have longer durations after the reform. The long-term effects of participation in vocational training courses on employment probability are positive. Training participation increases the long-term employment probability (seven years after the start of training) by five percentage points before the reform and by 7.5 percentage points after the reform.

The raw difference between the post- and pre-reform effectiveness of training identifies the overall difference in effects before and after the reform ($\gamma_{ba}$). In Figure 3, the red solid line shows a positive difference in effects before and after the reform in the short-term [...] $\gamma_{sel}$), which are reported in Figure 4.
The effects show the differences in the effectiveness of training that can be solely explained by a different participant selection in terms of their characteristics, holding time and policy instrument constant. The results suggest that stricter selection criteria only have a minor influence on the effectiveness of training. If anything, we find negative selection effects over the long-term. Given the small differences in most observed characteristics, such small and mostly insignificant selection effects are plausible.

Figure 5 presents the business cycle effects of non-participation ($\gamma_{bc}$) for Samples A and B with and without additional regional control variables. The time effects show an immediate, sharp increase of employment probabilities which peaks after three years. Thereafter, the effects evolve to a 3-5 percentage point higher employment probability in the post-reform period compared to the pre-reform period.

Figure 5 around here

The general pattern of the time effects is not sensitive to the sample definition or to the inclusion of additional regional labour market characteristics. This supports the plausibility of the common trend assumption. However, the German labour market was intensively reformed through the Hartz reforms during the observation period, particularly in 2005. An improvement of labour market conditions can be observed over the long-term. This does not alter the plausibility of our identifying assumptions as long as all groups are equally affected by the Hartz reforms.

Finally, Figure 6 displays the policy effects for Samples A and B, with and without additional regional control variables. They show the difference in the effectiveness of training that can be solely explained by the change of the assignment system from mandatory assignment to vouchers, holding participants’ characteristics and time period fixed.

The pattern of the policy effects varies in the different periods after course start.
In the short term, the policy effects are positive, implying that training is more effective under the voucher system. In the best case, training participants who receive a voucher have employment probabilities that are approximately 2-3 percentage points higher compared to participants in the mandatory system. Over the medium term, the policy effects are negative. The specifications using Sample B present a slightly more negative picture. In the worst-case scenario, the employment probability decreases by 5 percentage points. Three years after the start of training, we observe an increase to slightly positive but mostly insignificant policy effects. After seven years, the effects are positive for all specifications. However, the effects are only significant for Sample A with a 4-5 percentage point increase in employment probabilities. The results of all specifications are relatively stable between 40 and 80 months after training participation begins. This mitigates concerns that our findings are greatly altered by the financial crisis in 2008.
To interpret the policy effect, it is necessary to investigate the channels through which the reform affects the employment outcome. First, training is voluntary after the reform. Thus, the effectiveness of training might differ between a voucher and a mandatory system because voluntarily assigned participants are more motivated than compulsorily assigned participants. Second, voucher-assigned participants have free course choice conditional on the specification on the voucher. In Table 3, we report descriptive statistics for different types and durations of training programmes before and after the reform. The share of short training programmes increases from 21% to 42% after the reform. Moreover, the share of long training programmes decreases from 41% to 19%. The average planned and actual duration of long programmes (practice firm) decreases by nearly three (two) months after the reform. The share of participants in retraining courses increases from 19% to 25%. The average planned duration is extended by more than one month.

Table 3 around here

Accordingly, the composition of programme types and durations changed substantially after the reform. We observe higher shares of participants in programmes with a duration of less than six months and higher shares of participants in very long programmes with durations of more than two years. The first development might reflect increased freedom of choice under the voucher system. Training vouchers are determined with respect to the maximum programme duration. The unemployed individuals are free to choose a training provider and may self-select into shorter courses.

To disentangle the effects of voluntary participation from the effects of increased course choice, we apply a mediation framework (see, e.g., Robins and Greenland, 1992, Baron and Kenny, 1986). We consider the type and duration of training as mediators, i.e., [...]

In 2003, there was also a reduction in the total number of vocational training programmes for political reasons.

[...] $p_{1,1,v}(x, c_v)$.
Figure 7 shows the policy effects, the course composition effects and the effects of voluntary participation for Samples A and B with regional control variables. We find positive short-term effects that can be explained by the larger share of short programmes after the reform. After 2-3 years, the effects turn negative, which can be explained by a larger share of retraining programmes in the voucher system. In the long term, the course composition effects become slightly positive but remain close to zero.

Figure 7 around here

The effects of voluntary participation become negative immediately after the start of training. After two years, voluntary participation leads to a 3-5 percentage point decline in the employment probability compared to mandatory participation. The voluntary participation effects remain negative until three years after the start of training. Unemployed individuals might perceive less pressure to find a job under voluntary participation, as they feel more accommodated, have more positive attitudes towards the training course and a higher motivation to complete the programme. A descriptive analysis of dropout rates supports this interpretation (see Online Appendix D). We find that the dropout [...]

We generate dummies for the planned programme durations (less than 6 months, between 6 and 12 months, between 12 and 24 months, and more than 24 months). These durations correspond to different programme types. Furthermore, we account for interactions between these dummies and the planned programme duration to allow for linear trends within each period.
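The construction of the duration controls described above can be sketched as follows. This is an illustration rather than the authors' code: the function name `duration_features`, the feature labels, and in particular the treatment of the interval boundaries (which band contains exactly 6, 12 or 24 months) are assumptions, since the text does not specify them.

```python
import numpy as np


def duration_features(planned_months):
    """Duration-band dummies and their interactions with planned duration.

    Hypothetical helper; the boundary handling below is an assumption.
    """
    d = np.asarray(planned_months, dtype=float)
    bands = {
        "lt6": d < 6,                    # less than 6 months
        "6to12": (d >= 6) & (d < 12),    # between 6 and 12 months
        "12to24": (d >= 12) & (d <= 24), # between 12 and 24 months
        "gt24": d > 24,                  # more than 24 months
    }
    features = {}
    for name, mask in bands.items():
        # Dummy for the band.
        features["dur_" + name] = mask.astype(float)
        # Interaction with the planned duration: a linear trend within the band.
        features["dur_" + name + "_x_months"] = mask * d
    return features


example = duration_features([3, 6, 18, 30])
for key in sorted(example):
    print(key, example[key])
```

Each observation falls into exactly one band, so the four dummies sum to one, and the interaction terms let the outcome vary linearly with the planned duration inside each band instead of forcing a single level per band.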
Our results qualitatively confirm the findings in Rinne, Uhlendorff, and Zhao (2013) for the time horizon of 1.5 years after treatment. We find positive effects of the reform of the allocation system in the short-term. Moreover, we find that the reform of the allocation system reduces the re-employment probabilities between the first and second year after the start of training. Our application shows that the consideration of long-term effects is crucial. In the long-term, the policy effects turn positive and remain on an approximately stable level until seven years after the training started.

Compared to earlier studies, we show that it is important to consider direct and indirect effects of a policy reform. We provide evidence that the short-term positive policy effects are mainly driven by a changing composition of training course types and durations after the reform. The share of individuals who participate in shorter courses increased in the post-reform period. The selection into shorter courses improves the labour market outcomes in the short-term. This is almost a mechanical effect, because participants in shorter courses are distracted from intensive job search for a shorter time period.

If we focus on the direct effect of voluntary participation net of the course composition effect, we observe a reduction of training effectiveness in the short-term and a significant increase in the long run. This can be explained by a higher motivation of participants under the voucher system to focus on the course contents and to complete training instead of intensively searching for a new job during course participation.
In this study, we formally show the identification of channels of policy reforms with multiple treatments and different selection into each type of treatment. We discuss the assumptions that are sufficient to identify the different components of the policy reform, which are selection effects, time effects and the policy effects. Furthermore, we provide a formal framework of the causal channels through which the policy effects may affect the outcome of interest using mediation analysis.

We illustrate the empirical approach using a large reform of the allocation of vocational training programmes in Germany. The pre-reform system granted caseworkers substantial authority through mandatory allocation of unemployed individuals to training courses. The post-reform voucher system introduces voluntary participation and some freedom of course choice. Additionally, the reform changed the criteria for selecting unemployed persons into training programmes. This reform is an illustrative example in which the overall reform effect can be decomposed into selection effects, time effects and the policy effects of interest. We separate the different reform components from each other and investigate the channels through which the reform of the allocation system operates. We are mainly interested in the direct effect of changing the allocation of vocational training from a mandatory to a voluntary system net of indirect effects that may occur through the increased course choice.

The empirical results show the importance of considering causal channels since they may operate in opposing directions. Here, the policy effect indicates an increased effectiveness of training after the reform in the short run. We show that the positive effect mainly comes from indirect effects of the policy reform, whereas the direct effects show a short-term reduction in the effectiveness of training. This is important knowledge for policy makers because it allows them to target policy instruments more precisely. Depending on the short- and long-term objectives of policy makers, it may even reverse the application of policy instruments.
References
Abadie, A. (2005): “Semiparametric Difference-in-Differences Estimators,” Review of Economic Studies, 72(1), 1–19.
Baron, R. M., and D. A. Kenny (1986): “The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations,” Journal of Personality and Social Psychology, 51, 1173–1182.
Biewen, M., B. Fitzenberger, A. Osikominu, and M. Paul (2014): “The Effectiveness of Public Sponsored Training Revisited: The Importance of Data and Methodological Choices,” Journal of Labor Economics, 32(4), 837–897.
Bruttel, O. (2005): “Delivering Active Labour Market Policy Through Vouchers: Experiences with Training Vouchers in Germany,” International Review of Administrative Sciences, 71(3), 391–404.
Card, D., and D. R. Hyslop (2005): “Estimating the Effects of a Time-Limited Earnings Subsidy for Welfare-Leavers,” Econometrica, 73(6), 1723–1770.
Cox, D. R. (1958): “The Regression Analysis of Binary Sequences,” Journal of the Royal Statistical Society: Series B (Methodological), 20, 215–242.
Deuchert, E., M. Huber, and M. Schelker (2019): “Direct and Indirect Effects Based on Difference-in-Differences with an Application to Political Preferences Following the Vietnam Draft Lottery,” Journal of Business and Economic Statistics, 37(4), 710–720.
Doerr, A., B. Fitzenberger, T. Kruppe, M. Paul, and A. Strittmatter (2017): “Employment and Earnings Effects of Awarding Training Vouchers in Germany,” Industrial and Labor Relations Review, 70(3), 767–812.
Felfe, C., N. Nollenberger, and N. Rodriguez-Planas (2014): “Can’t Buy Mommy’s Love? Universal Childcare and Children’s Long-Term Cognitive Development,” Journal of Population Economics, 283(2), 393–422.
Flores, C., and A. Flores-Lagunes (2009): “Identification and Estimation of Causal Mechanisms and Net Effects of a Treatment under Unconfoundedness,” IZA Discussion Paper, 4237.
Fricke, H. (2017): “Identification based on Difference-in-Differences Approaches with Multiple Treatments,” Oxford Bulletin of Economics and Statistics, 79(3), 426–433.
Graham, B. S., C. C. De Xavier Pinto, and D. Egel (2016): “Efficient Estimation of Data Combination Models by the Method of Auxiliary-to-Study Tilting,” Journal of Business & Economic Statistics, 34(2), 288–301.
Gundersen, C., B. Kreider, J. Pepper, and V. Tarasuk (2017): “Food Assistance Programs and Food Insecurity: Implications for Canada in Light of the Mixing Problem,” Empirical Economics, 52(3), 1065–1087.
Havnes, T., and M. Mogstad (2011a): “Money for Nothing? Universal Child Care and Maternal Employment,” Journal of Public Economics, 95(11-12), 1455–1465.
Havnes, T., and M. Mogstad (2011b): “No Child Left Behind: Subsidized Child Care and Children’s Long-Run Outcomes,” American Economic Journal: Economic Policy, 3(2), 97–129.
Heckman, J. J., H. Ichimura, and P. E. Todd (1997): “Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme,” Review of Economic Studies, 64(4), 605–654.
Hirano, K., G. W. Imbens, and G. Ridder (2003): “Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score,” Econometrica, 71(4), 1161–1189.
Huber, M. (2014): “Identifying Causal Mechanisms (Primarily) Based on Inverse Probability Weighting,” Journal of Applied Econometrics, 29(6), 920–943.
Huber, M., M. Lechner, and G. Mellace (2017): “Why Do Tougher Caseworkers Increase Employment? The Role of Programme Assignment as a Causal Mechanism,” Review of Economics and Statistics, 99(1), 180–183.
Huber, M., M. Lechner, and A. Strittmatter (2018): “Direct and Indirect Effects of Training Vouchers for the Unemployed,” Journal of the Royal Statistical Society, Series A, 181(2), 441–463.
Huber, M., M. Schelker, and A. Strittmatter (2019): “Direct and Indirect Effects based on Changes-in-Changes,” arXiv:1909.04981.
Imai, K., L. Keele, D. Tingley, and T. Yamamoto (2011): “Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies,” American Political Science Review, 105(4), 765–789.
Imai, K., L. Keele, and T. Yamamoto (2010): “Identification, Inference and Sensitivity Analysis for Causal Mediation Effects,” Statistical Science, 25, 51–71.
Imbens, G. (2000): “The Role of the Propensity Score in Estimating Dose-Response Functions,” Biometrika, 87(3), 706–710.
Kikuchi, N. (2017): “Intergenerational Transmission of Education in Japan: Nonparametric Bounds Analysis with Multiple Treatments,” ISER Discussion Paper No. 1011.
Lechner, M. (1999): “Earnings and Employment Effects of Continuous Off-the-job Training in East Germany after Unification,” Journal of Business and Economic Statistics, 17(1), 74–90.
Lechner, M. (2001): “Identification and Estimation of Causal Effects of Multiple Treatments under the Conditional Independence Assumption,” in Econometric Evaluation of Labour Market Policies, ed. by M. Lechner, and F. Pfeiffer, pp. 43–58. ZEW Economic Studies 13. New York: Springer-Verlag.
Lechner, M. (2010): “The Estimation of Causal Effects by Difference-in-Difference Methods,” Foundations and Trends in Econometrics, 4(3), 165–224.
Lechner, M., R. Miquel, and C. Wunsch (2011): “Long-run Effects of Public Sector Sponsored Training,” The Journal of the European Economic Association, 9(4), 742–784.
Lechner, M., and J. Smith (2007): “What is the Value Added by Caseworkers?,” Labour Economics, 14(2), 135–151.
Lechner, M., and A. Strittmatter (2019): “Practical Procedures to Deal with Common Support Problems in Matching Estimation,” Econometric Reviews, 38(2), 193–207.
Lechner, M., and C. Wunsch (2013): “Sensitivity of Matching-Based Program Evaluations to the Availability of Control Variables,” Labour Economics, 21(C), 111–121.
Manski, C. (1997): “The Mixing Problem in Programme Evaluation,” Review of Economic Studies, 64(4), 537–553.
McCall, B., J. A. Smith, and C. Wunsch (2016): “Government-Sponsored Vocational Education for Adults,” Handbook of the Economics of Education, 5, 479–652.
Paul, M. (2015): “Many Dropouts? Never mind! - Employment Prospects of Dropouts from Training Programs,” Annals of Economics and Statistics, 119-120, 235–267.
Pearl, J. (2001): “Direct and Indirect Effects,” Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, pp. 411–420.
Perez-Johnson, I., Q. Moore, and R. Santillano (2011): “Improving the Effectiveness of Individual Training Accounts: Long-Term Findings from an Experimental Evaluation of Three Service Delivery Models,” Final Report, Mathematica Policy Research, Princeton, NJ.
Petersen, M. L., S. E. Sinisi, and M. J. van der Laan (2006): “Estimation of Direct Causal Effects,” Epidemiology, 17, 276–284.
Rinne, U., A. Uhlendorff, and Z. Zhao (2013): “Vouchers and Caseworkers in Training Programs for the Unemployed,” Empirical Economics, 45(3), 1089–1127.
Robins, J., and S. Greenland (1992): “Identifiability and Exchangeability for Direct and Indirect Effects,” Epidemiology, 3, 143–155.
Rosenbaum, P., and D. Rubin (1983): “The Central Role of Propensity Score in Observational Studies for Causal Effects,” Biometrika, 70(1), 41–55.
Rubin, D. B. (1974): “Estimating the Causal Effect of Treatments in Randomized and Non-Randomized Studies,” Journal of Educational Psychology, 66(5), 688–701.
Strittmatter, A. (2016): “What Effect Do Vocational Training Vouchers Have on the Unemployed?,” IZA World of Labor, 316.
Tomini, F., W. Groot, and H. Maassen van den Brink (2016): “The Effectiveness of the Voucher Training Programs: A Systematic Review of the Evidence from Evaluations,” TIER Working Paper Series, 16/08.
Twinam, T. (2017): “Complementarity and Identification,” Econometric Theory, 33(5), 1154–1185.
Van der Weele, T. J. (2009): “Marginal Structural Models for the Estimation of Direct and Indirect Effects,” Epidemiology, 20, 18–26.

Figures
Figure 1: Graphical illustration of Sample A and B (a) Sample A(b) Sample B
Figure 2: Time trends of employment probabilities for different subgroups of individualsfor the 1991-2012 period
Note: We report time trends for the years between 1990 and 2012. The outcome variables are reweighted as described inOnline Appendix B.2. Similar findings are obtained without reweighting.
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant pointestimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate pointestimates that are not significantly different from zero. We use baseline Sample A and control for local employment agencydistrict characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A).
Figure 4: Selection and overall reform effects
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant pointestimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate pointestimates that are not significantly different from zero. We use baseline Sample A and control for local employment agencydistrict characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A).
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant pointestimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate pointestimates that are not significantly different from zero. We use baseline Sample A and control for local employment agencydistrict characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A).
Figure 6: Policy effects
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant point estimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate point estimates that are not significantly different from zero. We use baseline Sample A and control for local employment agency district characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A).
(a) Sample A (b) Sample B
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant point estimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate point estimates that are not significantly different from zero. We use baseline Sample A and control for local employment agency district characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A). In the duration effects, we account for the planned course durations and interactions using fixed duration dummies.

Tables

Table 1: Vocational training programmes
| Programme type | Description | Examples |
|---|---|---|
| Practice firm training | Courses that took place in practice firms to simulate a work environment. | Training in commercial software, for office clerks, in data processing |
| Short training | Provision of occupation specific skills (duration ≤ … | |

Table 2: Sample first moments of observed characteristics with large standardised differences

| | Voucher system: treatment group (1) | Voucher system: control group (2) | Mandatory system: treatment group (3) | Mandatory system: control group (4) | Abs. SD between (1) and (2) (5) | Abs. SD between (1) and (3) (6) | Abs. SD between (1) and (4) (7) |
|---|---|---|---|---|---|---|---|
| Personal characteristics | | | | | | | |
| Age | 38.8 | 41.3 | 38.7 | 41.5 | 28.5 | 0.9 | 31.4 |
| Older than 50 years | .010 | .111 | .019 | .125 | 43.3 | 7.1 | 47.0 |
| Incapacity (e.g., illness, pregnancy) | .022 | .050 | .032 | .062 | 15.4 | 6.2 | 20.2 |
| Health | .083 | .128 | .093 | .146 | 14.5 | 3.4 | 20.0 |
| Education and occupation | | | | | | | |
| University entry degree (Abitur) | .229 | .170 | .197 | .142 | 14.7 | 7.9 | 22.5 |
| White-collar | .382 | .476 | .440 | .527 | 19.2 | 12.0 | 29.5 |
| Manufacturing | .069 | .101 | .101 | .147 | 11.7 | 11.4 | 25.3 |
| Employment and welfare history | | | | | | | |
| Half months empl. (last 2 years) | 45.6 | 44.9 | 44.5 | 43.7 | 10.1 | 15.4 | 25.7 |
| Half months since last unempl. in last 2 years | 46.8 | 46.2 | 45.6 | 44.4 | 11.6 | 19.7 | 35.0 |
| Half months since last OLF (last 2 years) | 45.8 | 44.6 | 44.9 | 43.3 | 15.5 | 12.5 | 29.9 |
| Eligibility unempl. benefits | 13.5 | 14.7 | 13.2 | 14.8 | 21.1 | 5.9 | 20.7 |
| Remaining unempl. insurance claim | 25.6 | 22.3 | 23.4 | 21.4 | 25.0 | 18.0 | 31.7 |
| Cumulative earnings (last 4 years) | 91,204 | 83,632 | 80,913 | 81,156 | 15.6 | 21.8 | 21.0 |
| Timing of unemployment and programme start | | | | | | | |
| Start unempl. in September | .151 | .079 | .099 | .075 | 22.9 | 15.7 | 24.2 |
| Elapsed unempl. duration | 5.06 | 3.55 | 4.53 | 3.45 | 46.0 | 15.7 | 49.0 |
| Characteristics of local employment agency districts | | | | | | | |
| Share of empl. in construction industry | .064 | .065 | .077 | .077 | 2.3 | 54.3 | 55.5 |
| Share of male unempl. | .564 | .563 | .541 | .541 | 1.1 | 50.8 | 53.5 |

Note: See Table A.1 in Online Appendix A for sample first moments of observed characteristics with small standardised differences. In columns (1)-(4), we report the sample first moments of observed characteristics for the treated and non-treated sub-samples. Information on individual characteristics refers to the time of inflow into unemployment, with the exception of the elapsed unemployment duration and monthly regional labour market characteristics, which refer to the (pseudo) treatment time. In columns (5)-(7), we report the standardised differences between the different sub-samples and the treatment group under the voucher system. A description of how we measure absolute standardised differences is available in Online Appendix B.2. Rosenbaum and Rubin (1983) classify absolute standardised differences of more than 20 as “large”. OLF is the acronym for “out of labour force”.

Online Appendix to “Identifying causal channels of policy reforms with multiple treatments and different types of selection”
Annabelle Doerr and Anthony Strittmatter

Sections:
A. Descriptive statistics
B. Supplements to the empirical approach
C. Matching quality
D. The change in dropout rates
E. Results for monthly earnings
F. Heterogeneous results by programme type

Annabelle Doerr, UC Berkeley, and Anthony Strittmatter, Department of Economics, University of St. Gallen.

A. Descriptive statistics
Table A.1: Sample first moments of observed characteristics with small standardised differences.

| | Voucher system: treatment group (1) | Voucher system: control group (2) | Mandatory system: treatment group (3) | Mandatory system: control group (4) | SD between (1) and (2) (5) | SD between (1) and (3) (6) | SD between (1) and (4) (7) |
|---|---|---|---|---|---|---|---|
| Personal characteristics | | | | | | | |
| Female | .472 | .447 | .477 | .411 | 5.0 | .9 | 12.4 |
| No German citizenship | .054 | .080 | .052 | .071 | 10.5 | 1.0 | 7.2 |
| Children under 3 years | .042 | .035 | .040 | .031 | 3.7 | 1.2 | 6.1 |
| Single | .300 | .285 | .270 | .251 | 3.4 | 6.7 | 11.1 |
| Sanction | .007 | .007 | .009 | .008 | .2 | 2.0 | .5 |
| Lack of motivation | .007 | .007 | .009 | .008 | .2 | 2.0 | .5 |
| Education and occupation | | | | | | | |
| No schooling degree | .036 | .068 | .036 | .056 | 14.3 | .4 | 9.3 |
| Schooling degree without Abitur | .720 | .731 | .750 | .770 | 2.4 | 6.8 | 11.4 |
| Missing | .014 | .031 | .017 | .032 | 10.9 | 2.4 | 11.9 |
| No vocational degree | .203 | .227 | .218 | .219 | 5.9 | 3.6 | 3.8 |
| Academic degree | .112 | .096 | .081 | .063 | 5.4 | 10.6 | 17.3 |
| Agriculture, Fishery | .012 | .020 | .015 | .023 | 6.7 | 3.2 | 8.7 |
| Construction | .054 | .032 | .027 | .022 | 10.9 | 13.3 | 16.6 |
| Trade and Retail | .127 | .169 | .148 | .175 | 11.8 | 6.2 | 13.5 |
| Communication and Information Service | .108 | .137 | .122 | .128 | 8.6 | 4.2 | 6.1 |
| Employment and welfare history | | | | | | | |
| Half months unempl. in last 2 years | .398 | .370 | .578 | .581 | 1.6 | 9.5 | 9.7 |
| No unempl. in last 2 years | .914 | .921 | .877 | .878 | 2.7 | 11.8 | 11.6 |
| Unemployed in last 2 years | .034 | .040 | .046 | .052 | 3.1 | 6.2 | 9.1 |
| Timing of unemployment and programme start | | | | | | | |
| Start unempl. in January | .060 | .101 | .117 | .105 | 15.0 | 19.8 | 16.1 |
| Start unempl. in February | .070 | .089 | .108 | .089 | 7.2 | 13.4 | 7.0 |
| Start unempl. in March | .096 | .083 | .105 | .085 | 4.5 | 3.0 | 3.7 |
| Start unempl. in April | .102 | .088 | .120 | .086 | 4.8 | 5.7 | 5.8 |
| Start unempl. in June | .059 | .078 | .058 | .072 | 7.6 | .6 | 5.3 |
| Start unempl. in July | .052 | .080 | .053 | .078 | 11.1 | .3 | 10.4 |
| Start unempl. in August | .081 | .078 | .080 | .078 | 1.0 | .3 | .9 |
| Start unempl. in October | .127 | .078 | .085 | .082 | 16.4 | 13.8 | 14.9 |
| Start unempl. in November | .086 | .079 | .045 | .082 | 2.6 | 16.6 | 1.7 |
| Start unempl. in December | .045 | .082 | .040 | .089 | 15.0 | 2.8 | 17.6 |
| State of residence | | | | | | | |
| Baden-Württemberg | .087 | .113 | .095 | .090 | 8.6 | 2.9 | 1.2 |
| Bavaria | .159 | .138 | .111 | .115 | 6.1 | 14.1 | 12.8 |
| Berlin, Brandenburg | .093 | .093 | .107 | .111 | .1 | 4.7 | 6.0 |
| Hamburg, Mecklenburg Western Pomerania, Schleswig Holstein | .076 | .088 | .098 | .092 | 4.3 | 7.9 | 5.6 |
| Hesse | .064 | .068 | .063 | .058 | 1.7 | .1 | 2.3 |
| Northrhine-Westphalia | .232 | .206 | .182 | .197 | 6.2 | 12.4 | 8.6 |
| Rhineland Palatinate, Saarland | .056 | .054 | .055 | .049 | .9 | .6 | 3.4 |
| Saxony-Anhalt, Saxony, Thuringia | .123 | .142 | .189 | .190 | 5.5 | 18.4 | 18.5 |
| Characteristics of local employment agency districts | | | | | | | |
| Population per km² | 910 | 889 | 789 | 895 | 1.3 | 7.5 | .9 |
| Unemployment rate (in %) | 12.2 | 12.3 | 12.1 | 12.0 | 1.9 | 1.4 | 3.8 |
| Share of empl. in production industry | .250 | .246 | .246 | .241 | 5.1 | 4.7 | 9.9 |
| Share of empl. in trade industry | .150 | .150 | .150 | .150 | 1.8 | 2.7 | 2.8 |
| Share of non-German unempl. | .139 | .141 | .126 | .128 | 2.5 | 14.3 | 12.1 |
| Share of vacant fulltime jobs | .794 | .794 | .800 | .799 | 0 | 8.4 | 7.6 |

Note: See Table 2 for sample first moments of observed characteristics with large standardised differences. In columns (1)-(4), we report the sample first moments of observed characteristics for the treated and non-treated sub-samples. Information on individual characteristics refers to the time of inflow to unemployment, with the exception of the elapsed unemployment duration and monthly regional labour market characteristics, which refer to the (pseudo) treatment time. In columns (5)-(7), we report the standardised differences between the different sub-samples and the treatment group under the voucher system. A description of how we measure standardised differences is available in Online Appendix B.2. OLF is the acronym for “out of labour force”.
Table A.2: Efficient first moments of observed characteristics

| | Voucher system (1) | Mandatory system (2) | SD between (1) and (2) (3) |
|---|---|---|---|
| Personal characteristics | | | |
| Female | .472 | .476 | .8 |
| Age | 38.754 | 38.697 | .8 |
| Older than 50 years | .011 | .019 | 7.1 |
| No German citizenship | .054 | .052 | 1.1 |
| Children under 3 years | .042 | .040 | 1.2 |
| Single | .300 | .270 | 6.6 |
| Health problems | .083 | .093 | 3.7 |
| Sanction | .007 | .009 | 2.1 |
| Incapacity (e.g., illness, pregnancy) | .022 | .032 | 6.3 |
| Lack of motivation | .007 | .009 | 2.1 |
| Education and occupation | | | |
| No schooling degree | .036 | .035 | .5 |
| Schooling degree without Abitur | .719 | .762 | 9.8 |
| University entry degree (Abitur) | .230 | .185 | 11.2 |
| No vocational degree | .204 | .217 | 3.3 |
| Academic degree | .114 | .081 | 11.3 |
| White-collar | .383 | .440 | 11.7 |
| Agriculture, Fishery | .012 | .015 | 3.2 |
| Manufacturing | .069 | .101 | 11.6 |
| Construction | .053 | .027 | 13.1 |
| Trade and Retail | .127 | .148 | 6.1 |
| Communication and Information Service | .109 | .122 | 4.1 |
| Employment and welfare history | | | |
| Half months empl. in last 2 years | 45.5 | 44.5 | 15 |
| Half months unempl. in last 2 years | .401 | .587 | 10 |
| Half months since last unempl. in last 2 years | 46.7 | 45.6 | 20.7 |
| No unempl. in last 2 years | .913 | .876 | 12.1 |
| Unempl. in last 2 years | .034 | .047 | 6.3 |
| Timing of unemployment and programme start | | | |
| Start unempl. in January | .059 | .116 | 19.9 |
| Start unempl. in February | .070 | .108 | 13.4 |
| Start unempl. in March | .095 | .104 | 2.9 |
| Start unempl. in April | .102 | .120 | 5.7 |
| Start unempl. in June | .059 | .058 | .4 |
| Start unempl. in July | .052 | .053 | .7 |
| Start unempl. in August | .082 | .080 | .6 |
| Start unempl. in September | .152 | .099 | 16.2 |
| Start unempl. in October | .127 | .085 | 13.5 |
| Start unempl. in November | .087 | .046 | 16.6 |
| Start unempl. in December | .045 | .040 | 2.6 |
| Elapsed unempl. duration | 5.08 | 4.54 | 16.2 |
| State of residence | | | |
| Baden-Württemberg | .085 | .093 | 2.9 |
| Bavaria | .159 | .113 | 13.4 |
| Berlin, Brandenburg | .090 | .103 | 4.3 |
| Hamburg, Mecklenburg Western Pomerania, Schleswig Holstein | .077 | .099 | 8 |
| Hesse | .064 | .064 | 0 |
| Northrhine-Westphalia | .231 | .180 | 12.7 |
| Rhineland Palatinate, Saarland | .056 | .055 | .7 |
| Saxony-Anhalt, Saxony, Thuringia | .125 | .191 | 18 |
| Characteristics of local employment agency districts | | | |
| Share of empl. in production industry | .250 | .246 | 5.1 |
| Share of empl. in construction industry | .064 | .077 | 52.4 |
| Share of empl. in trade industry | .150 | .150 | 3.1 |
| Share of male unempl. | .564 | .541 | 46.9 |
| Share of non-German unempl. | .138 | .126 | 13.3 |
| Share of vacant fulltime jobs | .793 | .800 | 8.9 |
| Population per km² | 902 | 778 | 7.4 |
| Unemployment rate (in %) | 12.2 | 12.1 | 2.5 |

Note: In columns (1)-(2), we report the efficient first moments of observed characteristics for the treated sub-samples. They are exactly equal in the other re-weighted sub-samples, which are not reported. Information on individual characteristics refers to the time of inflow to unemployment, with the exception of the elapsed unemployment duration and monthly regional labour market characteristics, which refer to the (pseudo) treatment time. In column (3), we report the standardised differences (SD) between the two treatment groups. A description of how we measure standardised differences is available in Online Appendix B.2. OLF is the acronym for “out of labour force”.

B. Supplements to the empirical approach
B.1 Proof of Equation (1)
We show that $E[Y^{d}_{i,t}(s) \mid D_i = g, T_i = q]$ can be identified from the joint distribution of the random variables $(Y, G(d,t,s), G(d',t',s), X)$ under Assumptions 1a and 2a (comp. Hirano, Imbens, and Ridder, 2003; Rosenbaum and Rubin, 1983):

\begin{align*}
E[Y^{d}_{i,t}(s) \mid D_i = d', T_i = t']
&= \int E[Y^{d}_{i,t}(s) \mid D_i = d', T_i = t', X_i = x] \, f_X(x \mid D_i = d', T_i = t') \, dx \\
&= \int E[Y^{d}_{i,t}(s) \mid D_i = d, T_i = t, X_i = x] \, f_X(x \mid D_i = d', T_i = t') \, dx \\
&= \int E[Y_i \mid D_i = d, T_i = t, X_i = x] \, f_X(x \mid D_i = d', T_i = t') \, dx \\
&= \int E[G_i(d,t,s) Y_i \mid D_i = d, T_i = t, X_i = x] \, f_X(x \mid D_i = d', T_i = t') \, dx \\
&= \int \frac{1}{p_{d,t,s}(x)} E[G_i(d,t,s) Y_i \mid X_i = x] \, f_X(x \mid D_i = d', T_i = t') \, dx \\
&= \int \frac{p_{d',t',s}(x)}{p_{d',t',s} \cdot p_{d,t,s}(x)} E[G_i(d,t,s) Y_i \mid X_i = x] \, f_X(x) \, dx \\
&= \int \frac{p_{d',t',s}(x)}{p_{d',t',s} \cdot p_{d,t,s}(x)} G_i(d,t,s) Y_i \, f_X(x) \, dx \\
&= E\left[ \frac{p_{d',t',s}(x)}{p_{d',t',s} \cdot p_{d,t,s}(x)} G_i(d,t,s) Y_i \right].
\end{align*}

In the first equality we apply the law of iterated expectations. In the second equality we condition on $D_i = d$, which is possible because we assume that the expected potential outcomes are independent of the treatment after controlling for $X_i$ (Assumption 1). In equality three we replace the potential outcome by the observed outcome. In equality four we multiply the outcome $Y_i$ by the group dummy $G_i(d,t,s)$. In equality five we use the fact that $E[DY] = E[DY \mid D = 1] \Pr(D = 1)$. In equality six we apply Bayes’ rule. We make a backward application of the law of iterated expectations in equality seven. Finally, we replace the integral by an expectation in equality eight.
□

B.2 Estimation strategy

A straightforward estimation strategy is based on the sample analogue of (1),

\begin{equation*}
\hat{E}[Y^{d}_{i,t}(s) \mid D_i = d', T_i = t'] = \frac{1}{N} \sum_{i=1}^{N} \hat{\omega}_i Y_i, \quad \text{with} \quad \hat{\omega}_i = \frac{G_i(d,t,s)}{\frac{1}{N} \sum_{j=1}^{N} \hat{p}_{d',t',s}(X_j)} \cdot \frac{\hat{p}_{d',t',s}(X_i)}{\hat{p}_{d,t,s}(X_i)}, \tag{1}
\end{equation*}

where $\hat{p}_{d',t',s}(X_i)$ and $\hat{p}_{d,t,s}(X_i)$ indicate the estimated conditional treatment probabilities (henceforth, propensity scores). This is an Inverse Probability Weighting (IPW) estimator. Hirano, Imbens, and Ridder (2003) demonstrate that the consistency and efficiency of an IPW estimator critically depend on the estimated propensity scores. Parametric specifications of the propensity score do not necessarily lead to efficient estimates. One reason is that (1) seeks to balance the sample covariate distributions, which equal

\begin{equation*}
\hat{F}_{d',t'} = \frac{1}{\sum_{i=1}^{N} \hat{p}_{d',t',s}(X_i)} \sum_{i=1}^{N} G_i(d',t',s) \, 1\{X_i \leq x\},
\end{equation*}

when $d = d'$ and $t = t'$. However, $\hat{F}_{d',t'}$ can be more efficiently estimated using information from the entire population rather than from the random sample $d', t'$ alone. The efficient estimators for the covariate distributions of subpopulation $d', t'$ equal

\begin{equation*}
\hat{F}^{\mathit{eff}}_{d',t'} = \frac{1}{\sum_{i=1}^{N} \hat{p}_{d',t',s}(X_i)} \sum_{i=1}^{N} \hat{p}_{d',t',s}(X_i) \, 1\{X_i \leq x\}.
\end{equation*}

Accordingly, reweighting estimators that recover $\hat{F}^{\mathit{eff}}_{d',t'}$ rather than $\hat{F}_{d',t'}$ may be more efficient.
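The IPW weights in (1) can be illustrated with a minimal sketch. It assumes that the propensity scores have already been estimated and are passed in as plain lists; the function names `ipw_weights` and `ipw_mean` and the toy data are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def ipw_weights(g: Sequence[int],
                p_target: Sequence[float],
                p_source: Sequence[float]) -> List[float]:
    """Weights omega_i = G_i / mean_j(p_target_j) * p_target_i / p_source_i.

    g         -- group dummy G_i(d, t, s) of the sample whose outcomes are used
    p_target  -- estimated propensity score for subpopulation (d', t')
    p_source  -- estimated propensity score for subpopulation (d, t)
    """
    n = len(g)
    mean_p_target = sum(p_target) / n  # sample analogue of p_{d',t',s}
    return [gi / mean_p_target * pt / ps
            for gi, pt, ps in zip(g, p_target, p_source)]


def ipw_mean(y: Sequence[float],
             g: Sequence[int],
             p_target: Sequence[float],
             p_source: Sequence[float]) -> float:
    """Weighted outcome mean (1/N) * sum_i omega_i * Y_i."""
    weights = ipw_weights(g, p_target, p_source)
    return sum(w * yi for w, yi in zip(weights, y)) / len(y)


# Toy data (illustrative only): constant propensity scores of 0.5,
# so the estimator reduces to the outcome mean in the group with G_i = 1.
y = [2.0, 0.0, 4.0, 0.0]
g = [1, 0, 1, 0]
p_target = [0.5, 0.5, 0.5, 0.5]
p_source = [0.5, 0.5, 0.5, 0.5]
estimate = ipw_mean(y, g, p_target, p_source)
```

With constant propensity scores the weights equal 2 for observations in the group and 0 otherwise, so the estimate is the simple group mean of the outcome. With covariate-dependent scores, observations that resemble the target subpopulation receive more weight.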
We report the efficient first moments for all control variables and both treatment groups in Table A.2 in Online Appendix A.

Graham, De Xavier Pinto, and Egel (2016) recently proposed a double robust and locally efficient semiparametric version of IPW, named Auxiliary-to-Study Tilting (AST). This estimator precisely balances the efficient first moments of all control variables in each treatment sample. (Exact balancing is not guaranteed for the sample moments using conventional IPW estimators.) Using AST, the propensity score is estimated in a conventional parametric way. We use the probit model $\hat{p}_{d',t',s}(X_i) = \Phi(X_i' \hat{\beta})$, where $\Phi(\cdot)$ denotes the standard normal cumulative distribution function and $X_i' \hat{\beta}$ is the estimated linear index. The estimated propensity score $\hat{p}_{d,t,s}(x)$ is replaced by $\tilde{p}_{d,t,s}(x)$. It is estimated under the moment condition

\begin{equation*}
\frac{1}{N} \sum_{i=1}^{N} \frac{G_i(d,t,s)}{\frac{1}{N} \sum_{j=1}^{N} \hat{p}_{d',t',s}(X_j)} \cdot \frac{\hat{p}_{d',t',s}(X_i)}{\tilde{p}_{d,t,s}(X_i)} \cdot X_i = \frac{1}{N} \sum_{i=1}^{N} \frac{\hat{p}_{d',t',s}(X_i)}{\frac{1}{N} \sum_{j=1}^{N} \hat{p}_{d',t',s}(X_j)} \cdot X_i, \tag{2}
\end{equation*}

where $\tilde{p}_{d,t,s}(X_i) = \Phi(X_i' \tilde{\beta})$ is specified such that the left and right sides of (2) are numerically equivalent for all elements in $X_i$ (including a constant term). The right side is the efficient first moment estimate. As the efficient first moment estimates do not depend on the subpopulation $d, t$, the first moments are exactly balanced in all treatment groups for $d, t \in \{0, 1\}$ using this procedure. The constant guarantees that the weights sum to one. The expected potential outcomes are estimated using

\begin{equation*}
\tilde{E}[Y^{d}_{i,t}(s) \mid D_i = d', T_i = t'] = \frac{1}{N} \sum_{i=1}^{N} \tilde{\omega}_i Y_i, \quad \text{with} \quad \tilde{\omega}_i = \frac{G_i(d,t,s)}{\frac{1}{N} \sum_{j=1}^{N} \hat{p}_{d',t',s}(X_j)} \cdot \frac{\hat{p}_{d',t',s}(X_i)}{\tilde{p}_{d,t,s}(X_i)}.
\end{equation*}
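The role of the constant term can be made explicit with a short worked step. It is a sketch that evaluates the balancing condition (2) only for the element of $X_i$ that equals one for every observation; the left side is then the average of the weights $\tilde{\omega}_i$, and the right side collapses to one.

```latex
\begin{align*}
\frac{1}{N} \sum_{i=1}^{N} \tilde{\omega}_i
&= \frac{1}{N} \sum_{i=1}^{N} \frac{G_i(d,t,s)}{\frac{1}{N} \sum_{j=1}^{N} \hat{p}_{d',t',s}(X_j)} \cdot \frac{\hat{p}_{d',t',s}(X_i)}{\tilde{p}_{d,t,s}(X_i)} \cdot 1 \\
&= \frac{1}{N} \sum_{i=1}^{N} \frac{\hat{p}_{d',t',s}(X_i)}{\frac{1}{N} \sum_{j=1}^{N} \hat{p}_{d',t',s}(X_j)} \cdot 1
 = \frac{\frac{1}{N} \sum_{i=1}^{N} \hat{p}_{d',t',s}(X_i)}{\frac{1}{N} \sum_{j=1}^{N} \hat{p}_{d',t',s}(X_j)} = 1.
\end{align*}
```

Under this reading, "sum to one" refers to the normalisation $\frac{1}{N} \sum_i \tilde{\omega}_i = 1$, which matches the $\frac{1}{N}$ factor in the outcome estimator.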
It can be shown that this estimator is $\sqrt{N}$-consistent and asymptotically normally distributed. (The large sample properties of AST are subject to assumptions regarding the specification of the propensity score. These assumptions imply that the propensity score is correctly specified, strictly increasing in its arguments, differentiable, and well located within the unit interval.) Similar to Graham, De Xavier Pinto, and Egel (2016), we compute the significance levels (p-values) of our estimated parameters based on a non-parametric bootstrapping procedure (sampling individual observations with replacement).

C. Matching quality
We assess the matching quality by reporting the moments (mean, variance, skewness, kurtosis) and standardised differences for the control variables in all four samples. The standardised differences are defined by

\begin{equation*}
SD = \frac{|\mu_{d,t,s} - \mu_{d',t',s}|}{\sqrt{0.5 \, (\sigma^2_{\mu_{d,t,s}} + \sigma^2_{\mu_{d',t',s}})}} \cdot 100,
\end{equation*}

where $\mu_{d,t,s}$ is the moment and $\sigma^2_{\mu_{d,t,s}}$ is the variance of the moment in the respective treatment group $G_i(d,t,s)$ with $d, d', t, t' \in \{0, 1\}$ and $s \in \{v, m\}$. The pre-matching standardised differences between the sample first moments are reported in Table 2. The post-matching standardised differences between the efficient first moments are exactly zero, as the first moments are precisely balanced (see the discussion in Online Appendix B.2). Therefore, we do not report the standardised difference of the matched treatment and control samples in Table A.2 (only between the voucher and mandatory system).

In the optimal case, matching estimators balance the complete distributions of all control variables rather than only the first moments. For all binary variables, this requirement is satisfied because the first moments are balanced. In the main specifications, we control for 63 variables, 43 of which are binary. For the other variables, we report the variance, skewness, and kurtosis for the different samples matched to the treatment group under the voucher system in Table C.1. Furthermore, we present the higher moments for the different samples matched to the treatment group under the mandatory system in Table C.2. For most moments, we report small standardised differences. However, particularly for the monthly regional labour market characteristics, we find large differences in the higher moments for the samples that are matched to the treatment group under the mandatory system.

Table C.1: Higher moments of observed characteristics matched to the treatment group under the voucher system.
| | Voucher system: treatment group (1) | Voucher system: control group (2) | Mandatory system: treatment group (3) | Mandatory system: control group (4) | SD between (1) and (2) (5) | SD between (1) and (3) (6) | SD between (1) and (4) (7) |
|---|---|---|---|---|---|---|---|
| Variance | | | | | | | |
| Age | 55.48 | 64.64 | 56.33 | 63.1 | 13.62 | 1.35 | 11.68 |
| Half months empl. in the last 24 months | 41.45 | 40.17 | 38.23 | 36.72 | 1.16 | 2.98 | 4.47 |
| Half months unempl. in the last 24 months | 2.98 | 3.12 | 3.12 | 3.14 | .72 | .67 | .77 |
| Time since last unemployment in the last 24 months (half-months) | 20.07 | 20.45 | 20.73 | 20.5 | .42 | .67 | .44 |
| Skewness | | | | | | | |
| Age | 46.23 | 110.42 | 81.03 | 115.84 | 5.20 | 3.18 | 5.87 |
| Half months empl. in the last 24 months | -691.31 | -663.08 | -610.47 | -566.93 | 1.18 | 3.50 | 5.52 |
| Half months unempl. in the last 24 months | 29.37 | 32.61 | 34.63 | 35.01 | .97 | 1.40 | 1.52 |
| Time since last unemployment in the last 24 months (half-months) | -381.07 | -400.22 | -449.84 | -439.91 | .89 | 2.61 | 2.30 |
| Kurtosis | | | | | | | |
| Age | 6984 | 9302 | 7132 | 8593 | 13.28 | 1.05 | 9.75 |
| Half months empl. in the last 24 months | 14214 | 13745 | 12377 | 11302 | .88 | 3.64 | 5.95 |
| Half months unempl. in the last 24 months | 375 | 440 | 521 | 520 | .96 | 1.82 | 1.84 |
| Time since last unemployment in the last 24 months (half-months) | 8409 | 9053 | 11762 | 11308 | 1.22 | 4.30 | 3.95 |
| Unemployment rate (in %) | 1740 | 1941 | 1219 | 1239 | 3.82 | 12.50 | 11.49 |

Note: In columns (1)-(4), we report the variance, skewness, and kurtosis of observed characteristics for the treated and non-treated sub-samples. Information on individual characteristics refers to the time of inflow into unemployment, with the exception of the elapsed unemployment duration and monthly regional labour market characteristics, which refer to the (pseudo) treatment time. In columns (5)-(7), we report the standardised differences between the different sub-samples and the treatment group under the voucher system. All control variables that are not reported in this table have binary distributions. The higher moments of these variables are precisely balanced in the matched samples.
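The standardised difference used in the balance tables can be sketched for first moments as follows. This is a minimal illustration under two assumptions that are not confirmed by the source: the sample variance of the variable stands in for the variance term, and the result is scaled by 100 so that it is comparable with the threshold of 20. The function name `standardised_difference` and the toy inputs are hypothetical.

```python
import math
import statistics
from typing import Sequence


def standardised_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Absolute standardised difference (in percent) between two samples.

    Computes |mean_a - mean_b| / sqrt(0.5 * (var_a + var_b)) * 100,
    using sample variances (an assumption made for this sketch).
    """
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return abs(mean_a - mean_b) / math.sqrt(0.5 * (var_a + var_b)) * 100


# Toy example (illustrative only): means 1 and 2, sample variance 2 in both groups.
sd_example = standardised_difference([0.0, 2.0], [1.0, 3.0])
```

In this toy example the difference in means is one and the pooled standard deviation is the square root of two, so the standardised difference is about 70.7 and would count as large under the threshold of 20.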
Table C.2: Higher moments of observed characteristics matched to the treatment group under the mandatory system.

| | Voucher system: treatment group (1) | Voucher system: control group (2) | Mandatory system: treatment group (3) | Mandatory system: control group (4) | SD between (1) and (2) (5) | SD between (1) and (3) (6) | SD between (1) and (4) (7) |
|---|---|---|---|---|---|---|---|
| Variance | | | | | | | |
| Age | 59.75 | 60.3 | 72.45 | 66.49 | .82 | 16.27 | 8.64 |
| Half months empl. in the last 24 months | 58.91 | 54.18 | 62.6 | 53.03 | 3.98 | 6.51 | 1 |
| Half months unempl. in the last 24 months | 3.93 | 4.32 | 4.4 | 4.2 | 1.83 | .33 | .51 |
| Time since last unemployment in the last 24 months (half-months) | 41.11 | 45.83 | 44.56 | 44.64 | 3.34 | .84 | .75 |
| Skewness | | | | | | | |
| Age | 85.74 | 110.07 | 153.47 | 163.78 | 2 | 2.96 | 3.9 |
| Half months empl. in the last 24 months | -894.18 | -795.24 | -1055.27 | -772.03 | 3.87 | 8.51 | .9 |
| Half months unempl. in the last 24 months | 31.99 | 43.57 | 39.91 | 40.72 | 3.36 | 1.02 | .69 |
| Time since last unemployment in the last 24 months (half-months) | -732.4 | -1014.77 | -896.02 | -969.59 | 7.23 | 2.76 | .96 |
| Cumulative benefits (last 4 years before unemployment) | 3005.55 | 2948.6 | 2372.48 | 2995.96 | .25 | 2.75 | .18 |
| Elapsed unemployment duration | 9.71 | 12.41 | 11.45 | 11.55 | 3.23 | 1.07 | 1 |
| Share of empl. in production industry | .0001537 | .000418 | .0001786 | .0004147 | 13.62 | 11.99 | .14 |
| Share of empl. in construction industry | .0000071 | .0000112 | .0000112 | .0000135 | 8.1 | .06 | 4.02 |
| Share of empl. in trade industry | .0000054 | .0000044 | .0000049 | .0000045 | 3.59 | 1.7 | .44 |
| Share of male unempl. | -.0000996 | -.0000117 | -.0000718 | -.0000223 | 23.68 | 18.35 | 3.3 |
| Share of non-German unempl. | .0005108 | .0002599 | .0004424 | .0002689 | 11 | 8.29 | .44 |
| Share of vacant fulltime jobs | -.0002477 | -.0004009 | -.0002327 | -.0003027 | 5.25 | 6.01 | 3.66 |
| Kurtosis | | | | | | | |
| Age | 7962.29 | 8231.21 | 11811 | 10104.53 | 1.62 | 15.71 | 8.85 |
| Half months empl. in the last 24 months | 18211.33 | 16415.04 | 23904.57 | 15972.42 | 3.16 | 9.7 | .7 |
| Half months unempl. in the last 24 months | 331.3 | 602.69 | 439.63 | 540.75 | 3.88 | 2.29 | .72 |
| Time since last unemployment in the last 24 months (half-months) | 15975.52 | 27816.77 | 22022.23 | 26122.82 | 10.04 | 4.4 | 1.13 |

Note: In columns (1)-(4), we report the variance, skewness, and kurtosis of observed characteristics for the treated and non-treated sub-samples. Information on individual characteristics refers to the time of inflow into unemployment, with the exception of the elapsed unemployment duration and monthly regional labour market characteristics, which refer to the (pseudo) treatment time. In columns (5)-(7), we report the standardised differences between the different sub-samples and the treatment group under the voucher system. All control variables that are not reported in this table have binary distributions. The higher moments of these variables are precisely balanced in the matched samples.

D. The change in dropout rates
In our interpretation of the negative effects of voluntary participation over the short- and medium-term after course start (comp. Section 3.7), we argue that participants might change their attitudes towards training in a positive way and participate with higher motivation. If an increase in motivation actually occurs, we should see a lower dropout rate under the voucher system compared to the mandatory system. Therefore, we implement a simple descriptive analysis to investigate the change in dropout rates under both allocation systems. Course completion or dropout is only observed for treated individuals. Following Paul (2015), we define dropout as completing less than 80% of the planned course duration.

Table D.1: Marginal changes of dropout rate in the mandatory vs. voucher system
| Dep. variable: Dropout yes/no | (1) | (2) | (3) |
|---|---|---|---|
| Post-reform period | -.047 (.002) | -.046 (.002) | -.037 (.002) |
| Personal characteristics | No | Yes | Yes |
| Education and occupation | No | Yes | Yes |
| Employment and welfare history | No | Yes | Yes |
| Timing of unemployment and programme start | No | Yes | Yes |
| State of residence | No | Yes | Yes |
| Programme type and durations | No | No | Yes |
Note: Marginal effects after probit estimations based on the sample of treated individuals in Sample A.
We estimate different specifications in which we add more control variables. In column (3), we use all available control variables, including dummies for different planned course durations. In all specifications (1)-(3), the marginal effect of the time dummy on the dropout rate is significantly negative, implying that the dropout rate decreases after the reform by about 4-5 percentage points. This supports our argument.
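The dropout definition used in this section can be written as a short sketch. The function names `is_dropout` and `dropout_rate`, the unit of the durations, and the toy values are hypothetical; only the 80% threshold comes from the text.

```python
from typing import Sequence


def is_dropout(completed: float, planned: float, threshold: float = 0.8) -> bool:
    """True if less than `threshold` of the planned course duration was completed."""
    return completed < threshold * planned


def dropout_rate(completed: Sequence[float], planned: Sequence[float]) -> float:
    """Share of participants classified as dropouts."""
    flags = [is_dropout(c, p) for c, p in zip(completed, planned)]
    return sum(flags) / len(flags)


# Toy durations in days (illustrative only): two of four participants drop out.
rate = dropout_rate([70, 80, 100, 30], [100, 100, 100, 60])
```

A participant who completes exactly 80% of the planned duration is not counted as a dropout under this reading of "less than 80%".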
E. Results for monthly earnings
Figure E.1: Overall reform, post-reform, and pre-reform treatment effects
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant point estimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate point estimates that are not significantly different from zero. We use baseline Sample A and control for local employment agency district characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A).
Figure E.2: Selection and overall reform effects
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant point estimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate point estimates that are not significantly different from zero. We use baseline Sample A and control for local employment agency district characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A).
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant point estimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate point estimates that are not significantly different from zero. We use baseline Sample A and control for local employment agency district characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A).
Figure E.4: Policy effects
Note: We estimate separate effects for each of the 88 months following the treatment. Diamonds indicate significant point estimates at the 5%-level. Significance levels are bootstrapped with 499 replications. Lines without diamonds indicate point estimates that are not significantly different from zero. We use baseline Sample A and control for local employment agency district characteristics and the full set of observed characteristics (see Table A.2 in Online Appendix A).
(a) Sample A (b) Sample B