Estimation and Sensitivity Analysis for Causal Decomposition in Health Disparity Research
Running head: CAUSAL DECOMPOSITION ANALYSIS

Soojin Park, Chioun Lee, and Xu Qin
University of California, Riverside
University of Pittsburgh

Abstract

In the field of disparities research, there has been growing interest in developing a counterfactual-based decomposition analysis to identify underlying mediating mechanisms that help reduce disparities in populations. Despite rapid development in the area, most prior studies have been limited to regression-based methods, undermining the possibility of addressing complex models with multiple mediators and/or heterogeneous effects. We propose an estimation method that effectively addresses complex models. Moreover, we develop a novel sensitivity analysis for possible violations of identification assumptions. The proposed method and sensitivity analysis are demonstrated with data from the Midlife Development in the US study to investigate the degree to which disparities in cardiovascular health at the intersection of race and gender would be reduced if the distributions of education and perceived discrimination were the same across intersectional groups.
1. Introduction
Despite gradual declines in cardiovascular disease (CVD) mortality in the US over 50 years, racial differences in the burden of CVD continue to play a substantial role in maintaining racial differences in life expectancy (Carnethon et al., 2017; Leigh, Alvarez, & Rodriguez, 2016). In 2010, the American Heart Association (AHA) introduced a new metric, “ideal cardiovascular health,” to improve cardiovascular health and reduce health disparities in populations (Lloyd-Jones et al., 2010). Yet, cumulative evidence shows that African Americans (hereafter “Blacks”) have worse cardiovascular health than non-Hispanic Whites (hereafter “Whites”) and that such a racial gap appears larger for women than men (Pool, Ning, Lloyd-Jones, & Allen, 2017), indicating that populations that fall into multiple minority statuses (Black women) are particularly vulnerable to poor cardiovascular health. In order to identify mediating factors that potentially reduce disparities, we investigated hypothetical interventions that would simultaneously equalize the distributions between non-marginalized groups (e.g., White men) and marginalized groups (e.g., Black women) of two well-established psychosocial mediators: education and perceived discrimination.

The current study is motivated by three methodological challenges to investigating multiple mediating mechanisms underlying health disparities across different race-gender groups. First, investigators who have used multiple mediators are often concerned about the causal ordering of mediators, particularly when their temporal ordering is unclear. A methodologically rigorous investigation is required to look at whether results are still valid if the causal ordering of mediators is reversed. Second, the causal structural model underlying the identification of mediators is complex in many circumstances; for example, Bauer and Scheim (2019) have shown differential effects. That is, the effect of perceived discrimination on psychological distress varies by race-gender group.
Thus, it is essential to use an estimator that addresses complex models with heterogeneous effects. Third, because results may not be valid if one of the identification assumptions for the effects of interest is violated, sensitivity analysis for possible violations of assumptions is needed. The goals of the current study, therefore, are 1) to examine the identification assumptions and results when the causal ordering of mediators is reversed, 2) to develop an estimator based on inverse-propensity scores that can address a complex model with heterogeneous effects and/or multiple mediators, and 3) to develop a novel sensitivity analysis based on the coefficients of determination.

This paper proceeds as follows. Section 2 reviews causal decomposition analysis in the context of health disparities research, and Section 3 presents the identification assumptions and results. Section 4 presents our estimation method in comparison to regression-based methods, and Section 5 presents our sensitivity analysis in comparison with other techniques. In Section 6, we demonstrate our estimation method and sensitivity analysis using data from the Midlife Development in the US (MIDUS) study. Finally, Section 7 discusses the implications of the study for best practices for disparity research. The R code used for our case study is given in an e-Appendix.
2. Causal Decomposition Analysis: A Review
2.1. Observed Disparities across Intersectional Groups
Intersectionality is a theoretical framework that views multiple categories (such as gender and race) as interacting in a matrix of domination, producing distinct inequalities with adverse outcomes for marginalized groups (Collins, 1990). For example, suppose that we investigate disparities in cardiovascular health at the intersection of gender and race. Following the intercategorical approach used in Bauer and Scheim (2019), the nexus of self-identified race and gender implies four intersectional groups: White men, White women, Black men, and Black women. Intersectionality theory suggests that women of color (Black women) will have poorer cardiovascular health than other race-gender groups.

The issue is that the causal effect of intersectional status on cardiovascular health is hard to obtain because 1) the causal effect of intersectional status is difficult to define precisely and 2) intersectional status cannot be randomized. VanderWeele and Robinson (2014) have argued that the effect of race may indicate, separately or jointly, the effect of genetic background, a person's own physical features, or the physical features of the person's parents. Even if the effect of intersectional status could be precisely defined, intersectional group membership is not randomized, so its effect would be correlated with confounding variables, such as genetic vulnerability, family socioeconomic status (SES), neighborhood SES, etc. To address these issues, VanderWeele and Robinson (2014) suggested a way of interpreting the effect of socially defined characteristics (e.g., being Black or a woman) that does not require causal inferences. Simply, they suggested focusing on the observed disparity between the reference and comparison groups (e.g., White men vs. Black women) when distributions of baseline covariates are equal across the groups. This interpretation circumvents the issue of discussing the causal effects of ascribed characteristics, which are essentially non-modifiable.
Therefore, throughout this manuscript, we adopt this approach and focus on the observed disparity in cardiovascular health between non-marginalized and marginalized groups when distributions of baseline covariates are equal across the groups.
2.2. Effect of Mediating Variables
Simply observing health disparities between non-marginalized and marginalized groups does not necessarily explain why the disparities exist or how to reduce them (Jackson, 2017). But investigating mediating mechanisms can help to inform policy interventions that reduce health disparities. Estimating natural direct and indirect effects (Pearl, 2001; Robins & Greenland, 1992) defined under the potential-outcome framework has been an integral part of the history of causal mediation analysis. Natural indirect effects are defined, for example, as the expected change in the outcome in response to a change in a mediator (from the value that would have resulted under one exposure to the value that would have resulted under another exposure). The natural direct and indirect effects require setting the mediator value for each individual to a potential value that would have resulted under a particular exposure.

Influenced by causal mediation analysis based on natural direct and indirect effects, Jackson (2017) discussed the possibility of estimating natural direct and indirect effects in the context of intersectional disparity. Bauer and Scheim (2019) also adopted this approach and applied VanderWeele’s 3-way decomposition analysis in the context of health disparities at the intersection of gender, sexuality, and race/ethnicity. However, identifying natural direct and indirect effects of intersectional status (given that the initial disparity is observed rather than causally defined) requires two assumptions: 1) no omitted pre-exposure mediator and outcome confounding given baseline covariates and 2) no post-exposure mediator and outcome confounding.
Both assumptions are strong but the latter is particularly strong given that health disparities can be determined by a myriad of factors throughout the life course.

VanderWeele and Robinson (2014) considered the use of randomized interventional analogues of natural direct and indirect effects (hereafter, interventional effects) in the context of health disparities when no post-exposure confounding exists. Following this approach, Jackson and VanderWeele (2018) proposed a new way of decomposition under the rubric of “causal decomposition analysis” in which the confounder between the mediator and the outcome is controlled for but the relationship between the confounder and the outcome is intact.

Jackson (2017) and Jackson and VanderWeele (2018) provided identification results using interventional effects in the presence of post-exposure confounding given that the confounder is measured. Randomized interventional effects set the mediator value to a randomly drawn value from the distribution of the mediator among all of those with a particular exposure, instead of setting the mediator value for each individual to a potential value that would have resulted under a particular exposure (VanderWeele & Robinson, 2014). By this simple change, the identification of direct and indirect effects does not require the assumption of no post-exposure confounding. Instead, it requires the weaker assumption of no omitted post-exposure confounding. This assumption of no omitted confounding is weaker in the sense that, when post-exposure confounding exists (provided that it is measured), the interventional effects are still identified while the natural indirect effects cannot be identified.
This weaker assumption allows the identification of path-specific effects involving multiple mediators as long as the causal ordering between the mediators is determined as shown in Jackson (2018).

Despite methodological developments in recent years, prior studies have largely relied on regression estimators (e.g., Jackson & VanderWeele, 2018; VanderWeele & Robinson, 2014). Regression is a suitable method if the causal structural model is simple with a single mediator and no differential effects. In many instances, however, a causal structural model underlying a substantive problem is complex with multiple mediators and/or differential effects. In the example that we will provide below, we consider two potential mediators (perceived discrimination and education) jointly that may explain health disparities across intersectional groups. In addition, following Bauer and Scheim (2019), we assume different effects, that is, that the effect of perceived discrimination varies by intersectional group. Recently, Jackson (2019) proposed a weighting method, using ratio-of-mediator probability and inverse odds ratios, that handles models with differential effects and a single mediator effectively. Therefore, as a next step, it is essential to develop an estimator that can address complex models with differential effects and multiple mediators in the context of disparity research.

In addition to employing a flexible estimator, an important problem is that results may be sensitive to possible violation of assumptions invoked for the identification of the effects of interest.
Many sensitivity analysis techniques have been developed to evaluate possible violations of confounding assumptions in the context of causal mediation studies based on natural indirect effects (e.g., Hong, Qin, & Yang, 2018; VanderWeele, 2010; Imai, Keele, & Yamamoto, 2010; VanderWeele & Chiba, 2014; Imai & Yamamoto, 2013). Sensitivity analysis is an essential part of causal mediation studies but it has not yet appeared in the disparity literature. This is perhaps because the existing sensitivity analysis techniques based on natural indirect effects have not been extended to randomized interventional effects, which are often employed in disparity research.
3. Identification
3.1. Notation and Definitions
On the pathways from the intersectional groups to cardiovascular health, we considered three mediators: child abuse, discrimination, and education. As a primary model, we assume that education depends on child abuse and perceived discrimination and that perceived discrimination depends on child abuse (see Figure 1). The causal ordering between mediators is arguable; for example, perceived discrimination could affect education, but education could also affect perceived discrimination. Therefore, we also provide identification assumptions and results for the effects of interest when the ordering of mediators is reversed.

Figure 1 depicts a directed acyclic graph (DAG) that represents the proposed intervention to reduce cardiovascular health disparities across intersectional groups. To define the effects of interest more precisely, let covariates (age) be denoted as $C_1$, which is correlated with complex historical structures that are responsible for racism and sexism ($H$). The history gives rise to an association between the intersectional group ($R$), childhood SES ($X_1$), and genetic vulnerability (parental history of cardiovascular and metabolic health, $C_2$). Let $R$ be the intersectional group indicator ($R = 0$: White men; $R = 1$: White women; $R = 2$: Black men; and $R = 3$: Black women), and $Y$ be cardiovascular health. We assume that White men ($R = 0$) are the reference group and that the rest of the groups are the comparison groups. There are multiple mediators, which are $X_2$ (child abuse), $D$ (perceived discrimination), and $M$ (education). The supports of the distributions of $X_2$, $D$, and $M$ are $\mathcal{X}_2$, $\mathcal{D}$, and $\mathcal{M}$, respectively.

[Figure 1. Directed acyclic graph showing relationship between intersectional status, cardiovascular health, and three potential mediators. Nodes: $R$, $D$, $M$, $Y$, $H$, $X_1$, $X_2$, $C_1$, $C_2$.]
Note. 1) Diagram represents the relationship between race and gender intersectional status $R$, cardiovascular health $Y$, discrimination $D$, and education $M$, as well as history $H$, age $C_1$, genetic vulnerability $C_2$, childhood SES $X_1$, and child abuse $X_2$.
2) Solid lines represent relationships that are preserved and dashed lines represent relationships that are removed by intervening on $D$ and $M$.
3) Placing a box around the conditioning variables implies that a disparity is considered within levels of these variables.

Here, the exposure is the intersectional status, and the intervening mediators are $D$ and $M$. Given this, $X_2$ is a post-exposure confounder as it is measured after a child was born and also confounds the mediator-outcome relationship. For notational simplicity, let $\mathbf{C}$ denote the vector of baseline covariates that consists of $C_1$ and $C_2$; and let $\mathbf{X}$ denote the vector of mediator-outcome confounders that are measured at the same time as the exposure ($X_1$) and measured after the exposure ($X_2$). The supports of the distributions of $\mathbf{C}$ and $\mathbf{X}$ are $\mathcal{C}$ and $\mathcal{X}$, respectively.

Under the stable-unit-treatment-value assumption (SUTVA; see Footnote 1), let $G_{d|c}(r)$ and $G_{m|c}(r)$ be, respectively, random draws, given $\mathbf{C} = \mathbf{c}$, from the distributions of $D$ and $M$ under $R = r$. As a result, $E[Y(G_{d|c}(0), G_{m|c}(0)) \mid R = r]$ is the expected cardiovascular health for a comparison group ($R = r$) that would have been observed if perceived discrimination and education were randomly drawn from the distributions of these mediators for members of the reference group (i.e., White men) who have the same age and same genetic vulnerability.

Footnote 1. The SUTVA assumes 1) that an individual does not affect the outcome of another individual and 2) that there is no variation in the treatment.

The observed disparity ($\tau$) is defined as the difference in the observed health outcome between a comparison group ($R = r$) and White men ($R = 0$) who have the same distributions of age and genetic vulnerability.
We consider disparities within the same levels of age ($C_1$) and genetic vulnerability ($C_2$). This is because, on one hand, age is the variable that affects the outcome and also is correlated with complex historical processes that are responsible for gender and race differences. For example, older generations might have experienced more discriminatory events due to their gender or racial status than younger generations. On the other hand, genetic vulnerability is affected by complex historical structures related to racism and sexism and, in turn, affects the outcome. This makes genetic vulnerability correlated with the intersectional group membership, and we believe that estimating the initial disparity within the same level of genetic vulnerability is necessary because genetic vulnerability is given and not manipulable. Formally,

$$\tau(r, 0) \equiv \sum_{c} E[Y \mid R = r, \mathbf{c}]\, P(\mathbf{c}) - \sum_{c} E[Y \mid R = 0, \mathbf{c}]\, P(\mathbf{c}),$$

where $r \in \{1, 2, 3\}$ and $\mathbf{c} \in \mathcal{C}$. Since no causal interpretation is given to this estimand ($\tau$), no causal identification assumptions are required. However, positivity (i.e., $P(R = r \mid \mathbf{c}) > 0$) is required for a nonparametric identification.

Our main interest is how much the observed disparity would be reduced or remain if we intervened so that the distributions of education and perceived discrimination were the same between a comparison group and White men. Formally, the disparity reduction ($\delta$) and disparity remaining ($\zeta$) are defined as

$$\begin{aligned}
\delta(r) &\equiv \sum_{c} E[Y \mid R = r, \mathbf{c}]\, P(\mathbf{c}) - E[Y(G_{d|c}(0), G_{m|c}(0)) \mid R = r], \\
\zeta(0) &\equiv E[Y(G_{d|c}(0), G_{m|c}(0)) \mid R = r] - \sum_{c} E[Y \mid R = 0, \mathbf{c}]\, P(\mathbf{c}).
\end{aligned} \qquad (1)$$

The disparity reduction $\delta(r)$ is the degree to which cardiovascular health for a comparison group ($R = r$) would change if the distributions of perceived discrimination and education were the same as those of White men ($R = 0$), that is, how much the disparity would be reduced by the hypothetical intervention of equalizing distributions of perceived discrimination and education between the two groups. The disparity remaining is $\zeta(0)$, which is the degree to which the disparity for the comparison group ($R = r$) would remain if the distributions of the mediators were the same as those of White men ($R = 0$). By combining the disparity reduction and disparity remaining, we can obtain the observed disparity as $\tau(r, 0) = \delta(r) + \zeta(0)$.

We employ causal decomposition analysis based on the interventional effects because our main questions of interest are based on the hypothetical intervention of equalizing the distribution of mediators. If natural direct and indirect effects were used, we would have to consider an intervention of changing the values of mediators (discrimination and education) for each Black woman to the value that would have been observed if that individual was a White man. This intervention, although it is hypothetical, is strange to consider. Compared to this, intervening to set the distributions of mediators for all Black women to be the same as those for all White men is less problematic (VanderWeele & Vansteelandt, 2014). Another benefit of using the interventional effects instead of natural effects is that the assumption of no post-exposure mediator and outcome confounding is not required. The assumption is not plausible in the context of our example because child abuse confounds the relationship between the other mediators (i.e., perceived discrimination and education) and cardiovascular health, and is also affected by the intersectional group. This introduces the issue of post-exposure confounding. The interventional effects resolve this issue.

We present the identification assumptions and results for disparity reduction ($\delta(r)$) and disparity remaining ($\zeta(0)$) when interventional effects are used. Three assumptions (A1-A3) that permit the identification are as follows.

A1. Conditional Ignorability: $Y(d, m) \perp \{D, M\} \mid R = r, \mathbf{X} = \mathbf{x}, \mathbf{C} = \mathbf{c}$ for all $r \in \{0, 1, 2, 3\}$, $d \in \mathcal{D}$, $m \in \mathcal{M}$, $\mathbf{x} \in \mathcal{X}$, and $\mathbf{c} \in \mathcal{C}$.
A2. Positivity and overlap: $0 < P(R = r \mid \mathbf{c}) < 1$ and $0 < P(D = d, M = m \mid R = r, \mathbf{C} = \mathbf{c})$ for all $r \in \{0, 1, 2, 3\}$, $d \in \mathcal{D}$, $m \in \mathcal{M}$, and $\mathbf{c} \in \mathcal{C}$.
A3. Consistency: if $D_i = d$ and $M_i = m$ then $Y_i = Y(d, m)$ for all $d \in \mathcal{D}$ and $m \in \mathcal{M}$.

Conditional ignorability (A1) states that no unmeasured confounding exists between cardiovascular health ($Y$) and education and perceived discrimination ($D$, $M$) jointly given the intersectional group, mediator-outcome confounders, and baseline covariates. Unlike mediator ignorability in the causal mediation literature (e.g., Imai, Keele, & Yamamoto, 2010; Pearl, 2001) based on natural indirect effects, this assumption is conditioned on post-exposure mediator and outcome confounding, which is $X_2$. This is an important advantage of using interventional effects instead of natural effects since it may not be plausible to assume the absence of post-exposure confounding in many settings. However, assumption A1 is still strong and cannot be guaranteed to be met even when conditioned on the intersectional group, mediator-outcome confounders, and baseline covariates. It is therefore essential to conduct sensitivity analysis that evaluates the robustness of the findings to potential violations of this assumption. Our second assumption (A2) implies that 1) the conditional probability of intersectional status as well as the conditional probability of mediators are positive (positivity), and 2) there is sufficient overlap in covariates across different race and gender combinations (region of common support). Our third assumption (A3) is that the outcome observed under particular mediator values is the same as the outcome after intervening to set the mediators to those values (consistency).

Under assumptions A1-A3, the counterfactual $E[Y(G_{d|c}(0), G_{m|c}(0)) \mid R = r]$ is non-parametrically identified as

$$\sum_{x, d, m, c} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}]\, P(\mathbf{X} = \mathbf{x} \mid R = r, \mathbf{c})\, P(D = d, M = m \mid R = 0, \mathbf{c})\, P(\mathbf{c}). \qquad (2)$$

A proof is given in Appendix A.

When multiple mediators are considered, the causal ordering between mediators is consequential to correctly identify natural direct and indirect effects (Imai, Keele, & Tingley, 2010; VanderWeele & Vansteelandt, 2014). In the context of our example, the causal ordering between mediators is unclear because questionnaires measuring the mediators were administered in the baseline survey. Therefore, we examine how robust our identification results are when the causal ordering of the mediators is reversed. First, we consider a scenario in which the causal ordering between the two intervening mediators (i.e., $D$ and $M$) is reversed as $X_2 \rightarrow M \rightarrow D$. In this scenario, the identification assumptions A1-A3 are the same as before; and the identification result is also the same as shown in equation (2). This is because the two mediators are considered jointly, regardless of whether $D$ affects $M$ or $M$ affects $D$.

Second, we consider another scenario in which $X_2$ occurs between two intervening variables: $D \rightarrow X_2 \rightarrow M$. The assumptions permitting identification of disparity reductions are slightly different than before. Instead of A1, we assume B1) that the relationship between $D$ and $Y$ is unconfounded given the intersectional group, childhood SES, and baseline covariates (i.e., $Y(d, m) \perp D \mid R = r, X_1 = x_1, \mathbf{C} = \mathbf{c}$) and B2) that the relationship between $M$ and $Y$ is unconfounded given the intersectional group, childhood SES, child abuse, discrimination, and baseline covariates (i.e., $Y(d, m) \perp M \mid R = r, \mathbf{X} = \mathbf{x}, D = d, \mathbf{C} = \mathbf{c}$). A2 and A3 remain the same. Under assumptions B1, B2, A2, and A3, the counterfactual $E[Y(G_{d|c}(0), G_{m|c}(0)) \mid R = r]$ is identified as

$$\sum_{d, m, c, x} E[Y \mid R = r, d, \mathbf{x}, m, \mathbf{c}]\, P(x_1 \mid R = r, \mathbf{c})\, P(x_2 \mid R = r, x_1, d, \mathbf{c})\, P(d, m \mid R = 0, \mathbf{c})\, P(\mathbf{c}). \qquad (3)$$

A proof for this scenario is given in Appendix B.
If mediator-outcome confounders $\mathbf{X}$ do not exist, both equations (2) and (3) reduce to the same expression: $\sum_{d, m, c} E[Y \mid R = r, d, m, \mathbf{c}]\, P(d, m \mid R = 0, \mathbf{c})\, P(\mathbf{c})$.
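To make the identification result concrete, the following minimal sketch evaluates equation (2) by direct summation and checks that the observed disparity splits into the disparity reduction and the disparity remaining. Everything in it is an assumption made for illustration: the probability tables, the outcome function, the restriction to two groups with binary x, d, m, and c, and the simplification that D and M are independent of X given (r, c). None of the numbers come from MIDUS.

```python
from itertools import product

# All numbers below are invented for illustration; they are not MIDUS estimates.
P_C = {0: 0.6, 1: 0.4}  # P(c) for a single binary baseline covariate c


def p_x(x, r, c):
    """P(X = x | R = r, c) for a binary mediator-outcome confounder x."""
    p1 = 0.2 + 0.3 * r + 0.1 * c
    return p1 if x == 1 else 1.0 - p1


def p_dm(d, m, r, c):
    """P(D = d, M = m | R = r, c); toy assumption: D and M independent given (r, c)."""
    p_d1 = 0.3 + 0.4 * r
    p_m1 = 0.7 - 0.3 * r + 0.1 * c
    return (p_d1 if d == 1 else 1.0 - p_d1) * (p_m1 if m == 1 else 1.0 - p_m1)


def mean_y(r, x, d, m, c):
    """E[Y | R = r, x, d, m, c], with an r-by-d interaction (a differential effect)."""
    return 5.0 - 0.5 * r - 0.8 * x - 0.6 * d + 1.0 * m - 0.3 * c - 0.4 * r * d


def standardized_mean(r, mediator_group):
    """Sum over x, d, m, c of E[Y|r,x,d,m,c] P(x|r,c) P(d,m|mediator_group,c) P(c)."""
    return sum(
        mean_y(r, x, d, m, c) * p_x(x, r, c) * p_dm(d, m, mediator_group, c) * P_C[c]
        for x, d, m, c in product((0, 1), repeat=4)
    )


r = 1
# With mediators drawn from the group's own distribution, the sum equals
# sum_c E[Y | R = r, c] P(c) in this toy setup (D, M independent of X given r, c).
observed_r = standardized_mean(r, mediator_group=r)
observed_0 = standardized_mean(0, mediator_group=0)
counterfactual = standardized_mean(r, mediator_group=0)  # equation (2)

tau = observed_r - observed_0        # observed disparity
delta = observed_r - counterfactual  # disparity reduction
zeta = counterfactual - observed_0   # disparity remaining
print(f"tau={tau:.3f}, delta={delta:.3f}, zeta={zeta:.3f}")
```

The single helper `standardized_mean` covers all three quantities because the only thing that changes between them is which group supplies the mediator distribution.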
4. Estimation
In this section, we propose an estimator that can be used in disparities research. The estimator is built on the approach developed by Albert (2012) and VanderWeele and Vansteelandt (2014) for natural direct and indirect effects. This estimator provides a convenient setting when multiple mediators are considered because empirical distributions of mediators are used instead of modeling the mediators. In addition, this estimator easily addresses differential effects.

To calculate the effects of interest, we need to first calculate $\sum_{c} E[Y \mid R = r, \mathbf{c}]\, P(\mathbf{c})$. After some algebra, $\sum_{c} E[Y \mid R = r, \mathbf{c}]\, P(\mathbf{c})$ can be computed by the weighted average of $y$ among $R = r$ given the weight

$$\hat{W}_r = \frac{\hat{P}(R = r)}{\hat{P}(R = r \mid \mathbf{c})}, \qquad (4)$$

where $r \in \{0, 1, 2, 3\}$ and $\mathbf{c} \in \mathcal{C}$. The probability of $R = r$ given covariates can be obtained by fitting a probit or logistic regression model. For instance, by fitting a multinomial logistic regression, $\hat{P}(R = 1 \mid \mathbf{c}) = \frac{\exp(\hat{\lambda}_1 \mathbf{c})}{1 + \sum_{r=1}^{3} \exp(\hat{\lambda}_r \mathbf{c})}$, where the $\hat{\lambda}_r$ represent the logit coefficients for $R = r$. The functional form fitted for the intersectional group given covariates should be correctly specified for a valid result. Using equation (4), the observed disparity between a comparison group ($R = r$) and White men given covariates can be estimated as $\hat{\tau}(r, 0) = E[\hat{W}_r y \mid R = r] - E[\hat{W}_0 y \mid R = 0]$.

To estimate $E[Y(G_{d|c}(0), G_{m|c}(0)) \mid R = r]$, we first begin by fitting an outcome model and compute $E[Y \mid R = r, \mathbf{x}_i, d_i, m_i, \mathbf{c}_i]$ for each subject $i$ among the reference group ($R = 0$), which is the predicted value of $Y$ for individual $i$, if the individual was in a comparison group ($R = r$) but using the individual’s own values of mediators ($d_i$ and $m_i$) and covariates ($\mathbf{c}_i$). Then, we fit a model for confounders $\mathbf{X}$ and compute $P(\mathbf{x}_i \mid R = r, \mathbf{c}_i)$ for each subject $i$ among the reference group ($R = 0$), which is the joint probability of $\mathbf{x}_i$, if the individual was in a comparison group ($R = r$) but using the individual’s own values of covariates ($\mathbf{c}_i$).

Finally, the predicted values of $Y$ after incorporating the predicted values of confounders $\mathbf{X}$ (generated from the joint probability of $\mathbf{x}_i$) will be averaged over $i$ given a weight as $E[\hat{W}_0 \sum_{x} \hat{\mu}_{rxDMc}\, \hat{\psi}_{x|rc} \mid R = 0]$, where $\hat{\mu}_{rxDMc} = \hat{E}[Y \mid R = r, \mathbf{x}, d_i, m_i, \mathbf{c}]$ and $\hat{\psi}_{x|rc} = \hat{P}(\mathbf{x} \mid R = r, \mathbf{c})$. A proof is given in Appendix C. Based on this result, the disparity reduction and disparity remaining are estimated, respectively, as

$$\begin{aligned}
\hat{\delta}(r) &= E[\hat{W}_r y \mid R = r] - E\Big[\hat{W}_0 \sum_{x} \hat{\mu}_{rxDMc}\, \hat{\psi}_{x|rc} \,\Big|\, R = 0\Big] \quad \text{and} \\
\hat{\zeta}(0) &= E\Big[\hat{W}_0 \sum_{x} \hat{\mu}_{rxDMc}\, \hat{\psi}_{x|rc} \,\Big|\, R = 0\Big] - E[\hat{W}_0 y \mid R = 0].
\end{aligned} \qquad (5)$$

The estimation requires several steps: calculating weights, calculating predicted values of $Y$ and $\mathbf{X}$, and calculating the weighted average. Therefore, we used nonparametric bootstrapping in order to obtain correct standard errors.

When fitting the outcome model, differential effects are assumed regarding perceived discrimination ($D$) across different intersectional groups, which is consistent with Bauer and Scheim (2019). Differential effects regarding education ($M$) across different intersectional groups can be easily specified but we did not include them in our model because the interaction effect was not significant.
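The steps above (weights, predicted outcomes, weighted averages) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' R implementation: the covariate `c` is a single discrete value so that the propensities in equation (4) can be estimated by sample proportions rather than by a fitted multinomial logit, and `mu_hat` and `psi_hat` are hypothetical stand-ins for an already fitted outcome model and confounder model. The six records are invented toy data.

```python
from collections import Counter


def group_weights(records, r):
    """W_r(c) = P(R = r) / P(R = r | c), both estimated by sample proportions."""
    n = len(records)
    n_c = Counter(rec["c"] for rec in records)
    n_rc = Counter(rec["c"] for rec in records if rec["r"] == r)
    p_r = sum(n_rc.values()) / n
    return {c: p_r / (n_rc[c] / n_c[c]) for c in n_rc}


def decompose(records, r, mu_hat, psi_hat, x_values):
    """Weighted-average estimates of tau(r, 0), delta(r), and zeta(0), as in (5)."""
    w_r, w_0 = group_weights(records, r), group_weights(records, 0)
    comp = [rec for rec in records if rec["r"] == r]
    ref = [rec for rec in records if rec["r"] == 0]
    mean_r = sum(w_r[rec["c"]] * rec["y"] for rec in comp) / len(comp)
    mean_0 = sum(w_0[rec["c"]] * rec["y"] for rec in ref) / len(ref)
    # Predicted outcome under R = r at each reference subject's own (d, m, c),
    # averaged over x with probabilities P(x | R = r, c), then weighted by W_0.
    cf = sum(
        w_0[rec["c"]]
        * sum(mu_hat(r, x, rec["d"], rec["m"], rec["c"]) * psi_hat(x, r, rec["c"])
              for x in x_values)
        for rec in ref
    ) / len(ref)
    return {"tau": mean_r - mean_0, "delta": mean_r - cf, "zeta": cf - mean_0}


# Hypothetical fitted models and toy data, invented for illustration only.
def mu_hat(r, x, d, m, c):
    """Stand-in for a fitted outcome model E[Y | R = r, x, d, m, c]."""
    return 5.5 - 1.0 * r - 1.0 * x - 1.0 * d + 1.0 * m


def psi_hat(x, r, c):
    """Stand-in for a fitted confounder model P(x | R = r, c) with binary x."""
    return 0.5


records = [
    {"r": 0, "c": 0, "d": 0, "m": 1, "y": 6.0},
    {"r": 0, "c": 0, "d": 1, "m": 1, "y": 8.0},
    {"r": 0, "c": 1, "d": 0, "m": 0, "y": 4.0},
    {"r": 1, "c": 0, "d": 1, "m": 0, "y": 5.0},
    {"r": 1, "c": 1, "d": 1, "m": 0, "y": 2.0},
    {"r": 1, "c": 1, "d": 1, "m": 1, "y": 3.0},
]
result = decompose(records, r=1, mu_hat=mu_hat, psi_hat=psi_hat, x_values=(0, 1))
print(result)
```

Because the mediators enter only through each reference subject's observed `d` and `m`, no model for the mediators is needed. Standard errors would come from repeating the whole procedure, including any model fitting, on bootstrap resamples.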
For a valid result, the outcome model should be correctly specified. The estimate will be biased if differential effects are present but are omitted from the outcome model.

As discussed above, this estimator is particularly useful when multiple mediators are considered because modeling mediators ($D$ and $M$) is not necessary. Specifying a correct functional form for multiple mediators can be challenging as the number of mediators increases.
An Aside: Regression Estimator
We review a regression estimator used by Jackson and VanderWeele (2018) to estimate the effects of interest (i.e., $\delta(r)$ and $\zeta(0)$) and discuss its strengths and weaknesses. To begin, we assume the simplest model that does not assume any differential effects.

First, the observed disparity can be estimated by fitting the following regression: $Y = \phi_0 + \sum_{r=1}^{3} \phi_r I(R = r) + \phi_c \mathbf{C} + e_1$, where $I(R = r)$ is a dummy variable indicating $R = r$ for $r \in \{1, 2, 3\}$, and $e_1$ follows a normal distribution. Here, $\hat{\phi}_r$ would be the estimated observed disparity after conditioning on covariates.

Second, the disparity reduction and disparity remaining are estimated by fitting the following regressions:

$$\begin{aligned}
Y &= \gamma_0 + \sum_{r=1}^{3} \gamma_r I(R = r) + \gamma_x \mathbf{X} + \gamma_c \mathbf{C} + e_2, \quad \text{and} \\
Y &= \alpha_0 + \sum_{r=1}^{3} \alpha_r I(R = r) + \alpha_x \mathbf{X} + \alpha_d D + \alpha_m M + \alpha_c \mathbf{C} + e_3,
\end{aligned} \qquad (6)$$

where $e_2$ and $e_3$ follow standard normal distributions. The term $\hat{\alpha}_r$ is the disparity remaining estimate after intervening on perceived discrimination and education ($D$ and $M$) within the same level of mediator-outcome confounders ($\mathbf{X} = \mathbf{x}$). This estimand is not desirable because the disparity is estimated within a subgroup in which the level of mediator-outcome confounders (childhood SES and abuse) is the same, for instance, the group that had no exposure to child abuse. Also, by conditioning on $\mathbf{X} = \mathbf{x}$, this estimand prevents us from estimating a part of the disparity remaining, namely the path mediated via mediator-outcome confounders (i.e., $R \rightarrow \mathbf{X} \rightarrow Y$). Therefore, to estimate the disparity remaining defined in equation (1), this path mediated via mediator-outcome confounders $\mathbf{X}$ should be added, which is $\frac{\hat{\alpha}_x}{\hat{\gamma}_x} \cdot (\hat{\phi}_r - \hat{\gamma}_r)$.
This is the disparity reduction when equalizing $\mathbf{X}$ alone but only scaled by the unmediated path between $\mathbf{X}$ and $Y$. Therefore, by combining these two effect estimates, $\zeta(0)$ is estimated as $\hat{\alpha}_r + \frac{\hat{\alpha}_x}{\hat{\gamma}_x} \cdot (\hat{\phi}_r - \hat{\gamma}_r)$.

The term $\hat{\gamma}_r - \hat{\alpha}_r$ is the disparity reduction estimate after intervening on perceived discrimination and education ($D$ and $M$) within the same level of $\mathbf{X} = \mathbf{x}$. However, the purpose is not to obtain disparity reduction within the same level of $\mathbf{X}$ and, thus, $\hat{\gamma}_r - \hat{\alpha}_r$ is insufficient to capture the disparity reduction. To obtain the disparity reduction defined in equation (1), we have to add $(1 - \frac{\hat{\alpha}_x}{\hat{\gamma}_x}) \cdot (\hat{\phi}_r - \hat{\gamma}_r)$, which is the disparity reduction when equalizing $\mathbf{X}$ alone but only scaled by the mediated path between $\mathbf{X}$ and cardiovascular health. By combining these two effect estimates, the disparity reduced is estimated as $\hat{\delta}(r) = \hat{\gamma}_r - \hat{\alpha}_r + (1 - \frac{\hat{\alpha}_x}{\hat{\gamma}_x}) \cdot (\hat{\phi}_r - \hat{\gamma}_r)$. For proofs, refer to Jackson and VanderWeele (2018).

The regression estimator yields in general efficient estimates; and the estimation is straightforward if the estimation is based on the subpopulation of $\mathbf{X} = \mathbf{x}$ (refer to the difference method in the structural equation model framework (Baron & Kenny, 1986; MacKinnon & Luecken, 2008)). However, if the estimation is not based on the subpopulation, the regression estimator is no longer straightforward even after assuming the simplest model in which no differential effects exist. As models change, the disparity reduction and disparity remaining should be recalculated. This calls for an estimator that is suitable for a complex model with multiple mediators and possible differential effects such as the proposed estimator.
Moreover, compared to the regression estimator, the proposed estimator can exploit the covariate balancing property of propensity scores (Imbens & Rubin, 2015), in which researchers can select a sample where the intersectional groups are more balanced in terms of baseline covariates.
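Returning to the regression estimator of the aside, a short consistency check may help: adding the regression-based disparity remaining and disparity reduction should recover the regression-based observed disparity, mirroring $\tau(r, 0) = \delta(r) + \zeta(0)$. The sketch below assumes that the scaling factor is read as the ratio $\hat{\alpha}_x / \hat{\gamma}_x$ and treated as a scalar; the cancellation holds for any value of that factor.

```latex
\begin{aligned}
\hat{\zeta}(0) + \hat{\delta}(r)
  &= \Big[\hat{\alpha}_r + \tfrac{\hat{\alpha}_x}{\hat{\gamma}_x}\,(\hat{\phi}_r - \hat{\gamma}_r)\Big]
   + \Big[\hat{\gamma}_r - \hat{\alpha}_r
   + \Big(1 - \tfrac{\hat{\alpha}_x}{\hat{\gamma}_x}\Big)(\hat{\phi}_r - \hat{\gamma}_r)\Big] \\
  &= \hat{\gamma}_r + (\hat{\phi}_r - \hat{\gamma}_r) \\
  &= \hat{\phi}_r = \hat{\tau}(r, 0).
\end{aligned}
```

The $\hat{\alpha}_r$ terms cancel and the two scaled pieces of $\hat{\phi}_r - \hat{\gamma}_r$ add up to the whole, so the split of that quantity between reduction and remaining is the only place where the scaling factor matters.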
5. Sensitivity Analysis
Identification and estimation crucially rely on assumption A1, which is not empirically testable. The conditional ignorability assumption requires no omitted confounding between the outcome and the two mediators (perceived discrimination and education) simultaneously, given the intersectional group, mediator-outcome confounders, and baseline covariates. In order for this assumption to be met, even approximately, substantive knowledge is required about the confounding structure between the outcome and mediators. This is because omitted variable bias could be amplified after conditioning on observed confounders depending on the type of the observed confounders (see, for example, Steiner & Kim, 2016). To address possible violations of this assumption, we develop a novel sensitivity analysis that systematically assesses the validity of results based on the coefficients of determination.

Among the sensitivity analysis techniques developed for natural direct and indirect effects, VanderWeele's (2010) approach is flexible enough to apply to interventional effects. Building on VanderWeele's approach, Park and Esterling (2020) provided bias formulas that can be used as a sensitivity analysis for interventional effects. Yet, this sensitivity analysis technique requires unobserved confounders to be binary, and the interpretation of sensitivity parameters is based on the scale of the independent and dependent variables. Most importantly, the bias formulas are calculated assuming independence between the unmeasured confounders and the existing mediator-outcome confounders ($\mathbf{X}$).
The proposed sensitivity analysis technique is distinct from Park and Esterling (2020) in that 1) dependence between the unmeasured confounders and the existing mediator-outcome confounders is allowed, 2) it does not require unobserved confounders to be binary, and 3) the interpretation of sensitivity parameters is based on the coefficients of determination, which are scale-free measures.

Some conditions are required to calculate bias: 1) an unobserved confounder $U$ exists that confounds the relationship between the mediators (discrimination and education) and the outcome given covariates (formally, $Y(d, m) \perp \{D, M\} \mid R = r, \mathbf{X} = \mathbf{x}, \mathbf{C} = \mathbf{c}, U = u$) and 2) the unobserved confounder $U$ is measured before post-exposure confounder $\mathbf{X}$. Figure 2 represents the scenarios that meet these conditions. These unobserved confounders may include, for example, neighborhood environments at the time of birth.

Figure 2. When unobserved confounder $U$ between the mediators and outcome exists
Note. 1) Diagram represents the relationship between race and gender intersectional status $R$, cardiovascular health $Y$, discrimination $D$, and education $M$, as well as history $H$, age $C_1$, genetic vulnerability $C_2$, childhood SES $X_1$, and child abuse $X_2$. 2) Solid lines represent relationships that are preserved and dashed lines represent relationships that are removed by intervening on $D$ and $M$. 3) Placing a box around the conditioning variables implies that a disparity is considered within levels of these variables.

If the disparity reduction for $R = r$ is estimated given observed covariates, then

\[
\delta(r) = \sum_{\mathbf{c}} E[Y \mid R = r, \mathbf{c}] P(\mathbf{c}) - \sum_{\mathbf{x}, d, m, \mathbf{c}} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}] P(\mathbf{X} = \mathbf{x} \mid R = r, \mathbf{c}) P(D = d, M = m \mid R = 0, \mathbf{c}) P(\mathbf{c}).
\]

If $U$ exists, this expression will lead to a biased estimate. We define this bias as the difference between the expected value of the estimate and the true effect. For
example, the bias for disparity reduction for $R = r$, $\mathrm{bias}(\delta(r))$, is defined as

\[
\begin{aligned}
& \sum_{\mathbf{c}} E[Y \mid R = r, \mathbf{c}] P(\mathbf{c}) - \sum_{\mathbf{x}, d, m, \mathbf{c}} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
& - \sum_{\mathbf{c}} E[Y \mid R = r, \mathbf{c}] P(\mathbf{c}) + \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] P(\mathbf{x} \mid R = r, \mathbf{c}, u) P(d, m \mid R = 0, \mathbf{c}, u) P(\mathbf{c}, u).
\end{aligned}
\]

Given this definition and under some simplifying assumptions, the biases for disparity reduction and disparity remaining can be expressed using coefficients of determination. The simplifying assumptions required are 1) the effect of unobserved confounder ($U$) on the outcome ($Y$) is constant within the strata of the intersectional group ($R$), mediator-outcome confounders ($\mathbf{X}$), mediators ($D, M$), and baseline covariates ($\mathbf{C}$), and 2) the effect of the mediators ($D$ and $M$) jointly on unobserved confounder ($U$) is constant within the strata of the intersectional group ($R$), mediator-outcome confounders ($\mathbf{X}$), and baseline covariates ($\mathbf{C}$). Let the partial $R^2$ value of $U$ be denoted as $R^2_{Y \sim U \mid I(R=r), \mathbf{X}, D, M, \mathbf{C}}$; and the partial $R^2$ value of $\{D, M\}$ be denoted as $R^2_{U \sim \{D,M\} \mid I(R=r), \mathbf{X}, \mathbf{C}}$. Then, the absolute value of biases for disparity reduction and disparity remaining for $R = r$ are expressed as

\[
|\mathrm{bias}| = se(\gamma_{dm}) \sqrt{\frac{R^2_{Y \sim U \mid I(R=r), \mathbf{X}, D, M, \mathbf{C}} \times R^2_{U \sim \{D,M\} \mid I(R=r), \mathbf{X}, \mathbf{C}}}{1 - R^2_{U \sim \{D,M\} \mid I(R=r), \mathbf{X}, \mathbf{C}}} \, df} \times \Big| \sum_{\mathbf{c}} \{ P(d, m \mid r, \mathbf{c}) - P(d, m \mid 0, \mathbf{c}) \} P(\mathbf{c}) \Big|, \tag{7}
\]

where $se(\gamma_{dm})$ can be obtained from regressing $Y$ on $D$ and $M$ jointly after conditioning on $\mathbf{X}$ and $\mathbf{C}$, and $df$ (degrees of freedom) can be obtained from the regression. In addition, $\big| \sum_{\mathbf{c}} \{ P(d, m \mid r, \mathbf{c}) - P(d, m \mid 0, \mathbf{c}) \} P(\mathbf{c}) \big|$ can be obtained from the data by regressing $D$ and $M$ jointly on $R$ and $\mathbf{C}$. A proof is given in Appendix D.
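Once its inputs are available, equation (7) is a one-line computation. The sketch below is a minimal illustration, assuming the standard error, degrees of freedom, and mediator-distribution difference have already been estimated as described above; the function name `abs_bias`, its argument names, and the numeric inputs are hypothetical and are not taken from the study.

```python
import math


def abs_bias(se_gamma_dm, df, r2_y_u, r2_u_dm, mediator_shift):
    """Absolute bias following equation (7).

    se_gamma_dm    : standard error from regressing Y on D and M given X and C
    df             : degrees of freedom of that regression
    r2_y_u         : partial R^2 of U on Y given I(R=r), X, D, M, C
    r2_u_dm        : partial R^2 of {D, M} on U given I(R=r), X, C
    mediator_shift : sum_c {P(d,m|r,c) - P(d,m|0,c)} P(c), estimated from data
    """
    if not (0.0 <= r2_y_u <= 1.0 and 0.0 <= r2_u_dm < 1.0):
        raise ValueError("partial R^2 values must lie in [0, 1), with r2_y_u <= 1")
    strength = math.sqrt(r2_y_u * r2_u_dm / (1.0 - r2_u_dm) * df)
    return se_gamma_dm * strength * abs(mediator_shift)


# Hypothetical inputs, for illustration only.
example = abs_bias(se_gamma_dm=0.02, df=1900, r2_y_u=0.025, r2_u_dm=0.05,
                   mediator_shift=0.5)
print(round(example, 4))
```

Because both sensitivity parameters enter only through their product and the factor $1/(1 - R^2)$, the bias is zero whenever either partial $R^2$ is zero and grows without bound as the second one approaches 1.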
Equation (7) states that the bias depends on two sensitivity parameters: 1) how much the unobserved confounder $U$ explains the variance of the outcome $Y$ after controlling for the intersectional group, two mediators, mediator-outcome confounders, and existing covariates ($R^2_{Y \sim U \mid I(R=r), \mathbf{X}, D, M, \mathbf{C}}$) and 2) how much the mediators jointly explain the variance of the unobserved confounder $U$ given the intersectional group, mediator-outcome confounders, and existing covariates ($R^2_{U \sim \{D,M\} \mid I(R=r), \mathbf{X}, \mathbf{C}}$).

The equation also states that the biases for disparity reduction and disparity remaining have the same absolute value. Because the observed disparity is not causally defined, the bias for the observed disparity due to the unobserved confounder is zero; the biases for disparity reduction and disparity remaining are therefore equal in magnitude and opposite in sign. These bias formulas are obtained in part as a result of extending Cinelli and Hazlett (2020), who addressed the bias for treatment effects.

The simplifying assumptions required for calculating the bias may be too strong to be met in some cases. In this case, one can use the original bias formula shown in Appendix D (equation (14)), which may not be very practical since the number of sensitivity parameters becomes unwieldy. Therefore, we recommend modifying the original bias formula depending on the particular violation(s). Suppose that the first simplifying assumption is violated such that the effect of the unobserved confounder ($U$) on the outcome ($Y$) depends on the value of mediator-outcome confounders ($\mathbf{Z}$) as $\gamma_{zu}$; and that the second assumption is met that the joint effect of the mediators ($D, M$) on the unobserved confounder ($U$) is constant across the level of confounder ($\mathbf{X}$) and covariates ($\mathbf{C}$) as $\beta_{dm}$. Then the following modified bias formula can be used as an alternative.
\[
\mathrm{bias}(\delta(r)) = \sum_{\mathbf{z}, \mathbf{c}} \gamma_{zu} P(\mathbf{z} \mid R = 0, \mathbf{c}) \times \beta_{dm} \{ P(d, m \mid R = r, \mathbf{c}) - P(d, m \mid R = 0, \mathbf{c}) \} P(\mathbf{c}), \tag{8}
\]

where $d \in \mathcal{D}$, $m \in \mathcal{M}$, and $\mathbf{c} \in \mathcal{C}$. The conditional probability of $\mathbf{z}$ (i.e., $P(\mathbf{z} \mid R = 0, \mathbf{c})$) can be drawn from the data. Then the overall value of $\gamma_u$ will be obtained by summing over values of $\gamma_{zu}$ weighted by this conditional probability of $\mathbf{z}$.

Although many studies have developed sensitivity analysis techniques when natural direct and indirect effects are used, very few have been extended to a case where interventional effects are used with multiple mediators. Therefore, we discuss the extendibility of existing sensitivity analyses and compare them with our approach.

Imai, Keele, and Yamamoto (2010) used the correlation between the mediator and outcome as a measure of omitted pretreatment confounding, and they examined the change of the estimate depending on the change of this correlation. This approach is advantageous in terms of inference because it provides standard errors of the estimates for varying correlation values. However, extending this sensitivity analysis technique to the case of interventional effects with multiple mediators may not be straightforward given the multiple correlations between the errors in the $M$-$Y$, $D$-$M$, and $D$-$Y$ relationships. Even if the extension is possible, the bias formulas will be a lot more complicated than the single mediator case.

Hong et al. (2018) developed a sensitivity analysis based on weighting. One sensitivity parameter is the correlation between the weight difference and the outcome; the other sensitivity parameter is the standard deviation of the weight difference. This approach can be easily extended to interventional effects if weights are modified to accommodate multiple mediators and post-exposure confounders. Then, the bias formulas will remain the same as the single mediator case.
One issue with this extension is that all mediators should be correctly modeled, which is challenging as the number of mediators increases.

VanderWeele (2010) derived bias formulas due to omitted pretreatment confounding, which, under some simplifying assumptions, are reduced to the multiplication of two sensitivity parameters. The two sensitivity parameters used were 1) the effect of the unobserved confounder on the outcome and 2) the difference in prevalence of the unobserved confounder between treatment and control groups. One issue with using these sensitivity parameters is that eliciting possible values of sensitivity parameters from different studies may not be straightforward due to the outcomes having different scales. In contrast, eliciting partial $R^2$ values from different studies is arguably straightforward even when different scales are used.
6. Application to the MIDUS data
6.1. Data and Measures
Extending analyses by Lee, Park, and Boylan (2020), we extracted baseline and outcome data from MIDUS and the MIDUS Refresher. We limited the sample to those respondents (n = 1978) who participated in MIDUS wave 2 or MIDUS Refresher biological data collection and identified themselves either as non-Hispanic White or non-Hispanic Black. Intersectional status was created by following the intercategorical complexity approach (e.g., Bauer & Scheim, 2019). Racial and gender statuses were created using the nexus of self-identified race/ethnicity and gender. Cardiovascular health was assessed in accordance with the AHA's criteria (Lloyd-Jones et al., 2010). A composite was created to reflect the criteria for ideal, intermediate, or poor cardiovascular health, respectively, on each of seven metrics: smoking, BMI, physical activity, diet, total cholesterol, blood pressure, and fasting glucose.

We have considered two life-course mediators (perceived discrimination and education) that explain cardiovascular disparities across intersectional groups. As for perceived discrimination, respondents were asked to report the number of times in their life they faced "discrimination" in 11 questions. Each item was recoded 1 if respondents reported 1 or more times, otherwise 0. An inventory of lifetime discrimination was constructed by summing the items with possible scores ranging from 0 to 11 (Williams et al., 1997). Education is a variable that indicates the highest level of degree completed, which ranges from 1 = no school/some grade school to 12 = PhD, MD, or other professional degree.

Mediator-outcome confounders include childhood SES and abuse. Childhood SES is an index measure including parental education, poverty, financial status, and employment status of parent(s).
Abuse is an index, drawn from items on the Childhood Trauma Questionnaire (Bernstein & Fink, 1998), measuring experiences of emotional, physical, or sexual abuse, with possible responses to each item ranging from 1 (never true) to 5 (very often true).

For covariates, we included age and parental history of cardiovascular and metabolic illness (heart problems, stroke, and diabetes), which may reflect genetic susceptibility and shared lifestyle/environments associated with a reduced cardiovascular health score for the respondent.

According to Lee et al. (2020), while Black women have the lowest (i.e., unhealthiest) cardiovascular health scores (6.95), White women have the highest cardiovascular health scores (8.74). White men have higher scores than Black men (7.95 vs. 7.27), and there is no significant gender difference among Blacks.

We compared the results from both proposed and regression-based methods. For the proposed method, we used the estimator shown in equation (5) and calculated the estimates with and without considering differential effects across intersectional groups. For the regression-based method, we used the regression estimator suggested by Jackson and VanderWeele (2018) and calculated the estimates without considering differential effects. It is possible to consider differential effects with the regression estimator but this requires some calculations.

The results of all comparison groups (Black men, White women, and Black women) are available, but for simplicity, we only present the disparity reduction and disparity remaining for Black women when compared to White men after intervening on education and perceived discrimination simultaneously.

Table 1. Estimates of the disparity reduction and disparity remaining for Black women vs. White men
| Estimator | Weighting-Based | Regression-Based |
|---|---|---|
| Observed disparity ($\tau(3, 0)$) | -0.976 | -0.927 |
| (95% CI) | (-1.270, -0.667) | (-1.240, -0.617) |
| Without Differential Effects | | |
| Disparity remaining ($\zeta(0)$) | -0.377 | -0.396 |
| (95% CI) | (-0.690, -0.038) | (-0.704, -0.072) |
| Disparity reduction ($\delta(3)$) | -0.599 | -0.531 |
| (95% CI) | (-0.783, -0.425) | (-0.688, -0.386) |
| % reduction | 61.4% | 57.3% |
| With Differential Effects | | |
| Disparity remaining ($\zeta(0)$) | -0.512 | |
| (95% CI) | (-0.852, -0.184) | |
| Disparity reduction ($\delta(3)$) | -0.464 | |
| (95% CI) | (-0.697, -0.241) | |
| % reduction | 47.5% | |

Note.
CI = confidence interval.

Table 1 shows initial observed disparities ($\tau(3, 0)$), disparity remaining ($\zeta(0)$), and disparity reduction ($\delta(3)$) in terms of cardiovascular health. The results from weighting-based methods show that, compared to White men, initial disparity for Black women (after controlling for covariates) is -0.98. Similarly, the results from regression-based methods show that, compared to White men, initial disparity for Black women is -0.93. The difference is probably due to the following reasons: 1) the proposed estimator requires the covariate balancing property of propensity scores and 2) the proposed estimator standardizes covariates across intersectional groups rather than conditioning on them.

After equalizing the distributions of both education and perceived discrimination across groups, the results from weighting-based methods (without considering differential effects) show that the initial disparity would be reduced by 61.4% for Black women, when compared to White men. The results from regression-based methods show lower levels of disparity reduction for Black women (57.3%) when compared to White men. The discrepancy may be due to the fact that the weighting-based method relies on correctly specifying the intersectional group, mediator-outcome confounders, and outcome models while the regression-based method relies on correctly specifying the three different outcome models (i.e., the covariates-only model and equations (6)).

As we consider whether the effect of perceived discrimination on cardiovascular health varies by intersectional group, the results from weighting-based methods show that the initial disparity would be reduced by 47.5% for Black women. The difference in the percentage of reduction when considering differential effects indicates the importance of addressing differential effects.
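The percentages quoted above follow from the point estimates in Table 1 as the ratio of the disparity reduction to the observed disparity. The short check below reproduces that arithmetic; the dictionary layout and function name are illustrative choices, and the with-differential-effects row is assumed to use the weighting-based observed disparity, since Table 1 reports only one.

```python
# Point estimates copied from Table 1 (Black women vs. White men).
table1 = {
    "weighting, no differential effects":
        {"observed": -0.976, "remaining": -0.377, "reduction": -0.599},
    "regression, no differential effects":
        {"observed": -0.927, "remaining": -0.396, "reduction": -0.531},
    "weighting, differential effects":
        {"observed": -0.976, "remaining": -0.512, "reduction": -0.464},
}


def percent_reduction(observed, reduction):
    """Share of the observed disparity removed by the intervention, in percent."""
    return 100.0 * reduction / observed


results = {}
for label, row in table1.items():
    # The decomposition is additive: remaining + reduction = observed.
    gap = abs(row["remaining"] + row["reduction"] - row["observed"])
    results[label] = (round(percent_reduction(row["observed"], row["reduction"]), 1), gap)
    print(label, results[label][0])
```

In each row the disparity remaining and disparity reduction sum to the observed disparity, so the percentage reduced and the percentage remaining add to 100.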
Figure 3 shows the change in the estimates with the change in the two sensitivity parameters: 1) the partial $R^2$ value of an unobserved confounder on the outcome given the intersectional status, mediator-outcome confounders, mediators, and baseline covariates and 2) the partial $R^2$ value of mediators on an unobserved confounder given the intersectional status, mediator-outcome confounders, and covariates. According to Figure 3A, the estimate for disparity reduction will become zero if the partial $R^2$ values are both about 0.14. This implies that the disparity reduction for Black women due to equalizing the distributions of discrimination and education will be completely washed away if 1) the unobserved confounder explains 14% of the variance of cardiovascular health after controlling for intersectional status, mediator-outcome confounders, mediators, and baseline covariates and 2) the mediators explain 14% of the variance of the unobserved confounder after controlling for intersectional status, mediator-outcome confounders, and baseline covariates. This amount of confounding is unlikely given the range of the sensitivity parameters drawn from the strongest existing covariate (triangle and circle points from Figure 3A). The strongest existing covariate (age) explains 2.5% of the variance in cardiovascular health after controlling for intersectional status, mediator-outcome confounders, mediators, and the rest of baseline covariates. When the same or even twice the amount of variance explained is assumed for the partial $R^2$ value of the mediators on an unobserved confounder, the disparity reduction estimate would still be strongly negative (i.e., between -0.4 and -0.2).

Some investigators may be more interested in the inference than the points where the estimate becomes zero. If standard errors are assumed to be invariant depending on the amount of confounding, the 95% confidence interval will cover zero if the partial $R^2$ values are both about 0.075.
This amount of confounding is still greater than any existing covariate.

The same applies to the disparity remaining estimate. Figure 3B suggests that the estimate for disparity remaining will become zero if the partial $R^2$ values are both about 0.15. The 95% confidence interval will cover zero if the partial $R^2$ values are both about 0.06. This amount of confounding is still greater than any existing covariate. Therefore, we conclude that the significance level of disparity remaining may change if there exist unobserved confounders that are as strong as existing covariates (triangle and circle points from Figure 3B).

[Figure 3: panels A) Disparity reduction and B) Disparity remaining; horizontal axis: partial $R^2$ of $U$ on $Y$; vertical axis: partial $R^2$ of $D, M$ on $U$.]

Figure 3. Sensitivity of the estimates using $R^2$ values
Note. 1) Bold lines represent the points at which the estimates become zero. 2) Standard lines represent the points at which the estimates become the respective value (e.g., -0.2, -0.1, 0.1, 0.2, etc.). 3) Dashed lines represent the points at which the upper and lower confidence intervals include zero. 4) A triangle point represents the partial $R^2$ value drawn from the strongest existing covariate, assuming equal $R^2$ values between the two sensitivity parameters. 5) A circle point represents the partial $R^2$ value drawn from the strongest existing covariate, assuming that the partial $R^2$ of $D, M$ on $U$ is twice the size of the partial $R^2$ of $U$ on $Y$.
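The point at which an estimate is driven to zero can also be found without reading it off a plot. When both partial $R^2$ values are set to a common value $k$, equation (7) gives $|\mathrm{bias}| = A \, k / \sqrt{1 - k}$ with $A = se(\gamma_{dm}) \sqrt{df} \, |\sum_{\mathbf{c}} \{P(d,m \mid r, \mathbf{c}) - P(d,m \mid 0, \mathbf{c})\} P(\mathbf{c})|$, and setting this equal to the absolute estimate yields a quadratic in $k$. The sketch below solves it in closed form. The function names are hypothetical, and the standard error, degrees of freedom, and mediator-distribution difference are made-up inputs, so the resulting value is not expected to reproduce the 0.14 reported above.

```python
import math


def implied_bias(k, se_gamma_dm, df, mediator_shift):
    """Absolute bias from equation (7) when both partial R^2 values equal k."""
    return se_gamma_dm * math.sqrt(df) * abs(mediator_shift) * k / math.sqrt(1.0 - k)


def equal_r2_tipping_point(estimate, se_gamma_dm, df, mediator_shift):
    """Common partial R^2 at which the absolute bias equals |estimate|.

    Solves k^2 / (1 - k) = q, i.e. k^2 + q*k - q = 0, for the root in [0, 1).
    """
    scale = se_gamma_dm * math.sqrt(df) * abs(mediator_shift)
    if scale == 0.0:
        raise ValueError("bias is identically zero; no tipping point exists")
    q = (estimate / scale) ** 2
    return (math.sqrt(q * q + 4.0 * q) - q) / 2.0


# Estimate taken from Table 1; the remaining inputs are hypothetical.
k_star = equal_r2_tipping_point(estimate=-0.599, se_gamma_dm=0.05, df=1900,
                                mediator_shift=1.5)
print(round(k_star, 3))
```

The same function applied to a confidence limit instead of the point estimate gives the value at which the interval would start to cover zero, under the assumption stated in the text that standard errors do not change with the amount of confounding.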
7. Discussion
In this paper, we study identification, estimation, and sensitivity analysis for the disparity reduction and disparity remaining between intersectional groups. Our paper contributes to the causal decomposition analysis literature in several ways. First, we developed a nonparametric estimator, which effectively accommodates a complex model with multiple mediators and/or differential effects. Second, we developed a novel sensitivity analysis based on coefficients of determination. The proposed sensitivity analysis technique can help researchers to assess the robustness of their findings to possible violations of assumptions even when the model underlying their study is complex.

The recent work by Jackson (2019) proposed alternative weighting methods (i.e., ratio of mediator probability and inverse odds ratio weightings) in the context of disparity research that can be used when a single mediator is considered. These estimators are also flexible enough to accommodate differential effects and can be modified to accommodate multiple mediators. They require a functional form to be correctly specified for every intervening mediator and intersectional group, while the proposed estimator requires a functional form to be correctly specified for the outcome, mediator-outcome confounders, and intersectional group. Researchers can choose between these estimators depending on the context of their studies; yet, the proposed estimator is less demanding in terms of modeling as the number of mediators increases.

The assumptions that permit the identification of the disparity reduction and disparity remaining are strong and, in general, not empirically testable. Given that, it is surprising that no sensitivity analysis has yet been applied to disparities research. The proposed sensitivity analysis can address complex models with multiple mediators and differential effects.
In addition, using coefficients of determination makes the interpretation of sensitivity parameters straightforward. We hope that the method and sensitivity analysis will be useful to disparity researchers who investigate multiple mediators and differential effects.

While the primary purpose of this paper is methodological, we note issues related to perceived discrimination. Our measure of discrimination mainly captures individuals' awareness of or willingness to report discrimination. It is possible that experiencing discrimination might have different meanings and/or reporting thresholds across intersectional groups. For example, White men might view discrimination as a loss of white supremacy or privilege while Black women might encounter multiple forms of discrimination through racial or male supremacy. If a differential construct is used as a mediator, it is difficult to interpret the disparity reduction/remaining in a meaningful way since it is not clear how to equalize, even hypothetically, the distribution of perceived discrimination across groups. Jackson and VanderWeele (2019) discussed this problem of differential constructs and offered several solutions for it.

There are limitations and possible future directions for the proposed methods. First, one important limitation of the proposed sensitivity analysis technique is that standard errors of the estimates are assumed to be invariant with a varying amount of confounding. This is a strong assumption that may not be met in practice. Therefore, it will be an important area of future research to develop a numerical or analytical approach to obtain correct standard errors with a varying amount of confounding. Second, a possible future direction would be to allow time-varying mediators and outcomes. This represents an important research topic if the causal ordering between the mediator and outcome is not clear (e.g., low income can cause cardiovascular disease and poor health status can cause low income).
Third, another important area for future study is to address possible measurement errors in confounders and mediators. Given that measurement error for confounders and mediators is common, it would be necessary to develop sensitivity analysis to check the robustness of the results to possible measurement error in the context of disparity research.

Appendix A: Identification of $E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r]$

Under assumptions A1-A3, the counterfactual $E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r]$ can be expressed as

\[
\begin{aligned}
& E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r] \\
&= \sum_{\mathbf{c}} E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r, \mathbf{c}] P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}} E[Y(d, m) \mid R = r, G_{d|\mathbf{c}}(0) = d, G_{m|\mathbf{c}}(0) = m, \mathbf{c}] P(G_{d|\mathbf{c}}(0) = d, G_{m|\mathbf{c}}(0) = m \mid R = r, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}} E[Y(d, m) \mid R = r, \mathbf{c}] P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}, \mathbf{x}} E[Y(d, m) \mid R = r, \mathbf{x}, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}, \mathbf{x}} E[Y(d, m) \mid R = r, \mathbf{x}, d, m, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}, \mathbf{x}} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}).
\end{aligned} \tag{9}
\]

The second and fourth equalities are due to the law of iterated expectations. The third equality is because $G_{d|\mathbf{c}}(0) = d$ and $G_{m|\mathbf{c}}(0) = m$ are random given $\mathbf{C} = \mathbf{c}$. The fifth equality holds due to assumption A1 ($Y(d, m) \perp \{D, M\} \mid R = r, \mathbf{X} = \mathbf{x}, \mathbf{C} = \mathbf{c}$). The sixth equality holds due to A3 (consistency). This completes the proof.

Appendix B: Identification of $E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r]$ with an alternative ordering of the mediators

We assume that $M$ depends on $D$ and $X_2$, and $X_2$ depends on $D$ (i.e., $D \rightarrow X_2 \rightarrow M$).
Assuming this causal ordering of mediators, the counterfactual $E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r]$ can be expressed as

\[
\begin{aligned}
& E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r] \\
&= \sum_{\mathbf{c}} E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r, \mathbf{c}] P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}} E[Y(d, m) \mid R = r, G_{d|\mathbf{c}}(0) = d, G_{m|\mathbf{c}}(0) = m, \mathbf{c}] P(G_{d|\mathbf{c}}(0) = d, G_{m|\mathbf{c}}(0) = m \mid R = r, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}} E[Y(d, m) \mid R = r, \mathbf{c}] P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}, x_1} E[Y(d, m) \mid R = r, x_1, \mathbf{c}] P(x_1 \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}, x_1} E[Y(d, m) \mid R = r, x_1, d, \mathbf{c}] P(x_1 \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}, x_1, x_2} E[Y(d, m) \mid R = r, x_1, d, x_2, \mathbf{c}] P(x_1 \mid R = r, \mathbf{c}) P(x_2 \mid R = r, x_1, d, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}, x_1, x_2} E[Y(d, m) \mid R = r, x_1, d, x_2, m, \mathbf{c}] P(x_1 \mid R = r, \mathbf{c}) P(x_2 \mid R = r, x_1, d, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{d, m, \mathbf{c}, x_1, x_2} E[Y \mid R = r, x_1, d, x_2, m, \mathbf{c}] P(x_1 \mid R = r, \mathbf{c}) P(x_2 \mid R = r, x_1, d, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}).
\end{aligned} \tag{10}
\]

The second, fourth, and sixth equalities are due to the law of iterated expectations. The third equality is because $G_{d|\mathbf{c}}(0) = d$ and $G_{m|\mathbf{c}}(0) = m$ are random given $\mathbf{C} = \mathbf{c}$. The fifth and seventh equalities are due to B1 ($Y(d, m) \perp D \mid R = r, X_1 = x_1, \mathbf{C} = \mathbf{c}$) and B2 ($Y(d, m) \perp M \mid R = r, D = d, X_1 = x_1, X_2 = x_2, \mathbf{C} = \mathbf{c}$), respectively. The eighth equality is due to A3 (consistency). This completes the proof.

Appendix C: Estimation of $\delta(r)$ and $\zeta(0)$

As defined in equation (1), $\delta(r) = \sum_{\mathbf{c}} E[Y \mid R = r, \mathbf{c}] P(\mathbf{c}) - E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r]$, and $\zeta(0) = E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r] - \sum_{\mathbf{c}} E[Y \mid R = 0, \mathbf{c}] P(\mathbf{c})$.
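First, we estimate $\sum_{\mathbf{c}} E[Y \mid R = r, \mathbf{c}] P(\mathbf{c})$ using weight as

\[
\begin{aligned}
\sum_{\mathbf{c}} E[Y \mid R = r, \mathbf{c}] P(\mathbf{c}) &= \sum_{\mathbf{c}, y} y P(y \mid R = r, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{\mathbf{c}, y} \frac{P(R = r)}{P(R = r \mid \mathbf{c})} \, y P(y \mid R = r, \mathbf{c}) P(R = r \mid \mathbf{c}) P(\mathbf{c}) \frac{1}{P(R = r)} \\
&= \sum_{\mathbf{c}, y} \frac{P(R = r)}{P(R = r \mid \mathbf{c})} \, y P(y, \mathbf{c} \mid R = r) \\
&= E\left[ \frac{P(R = r)}{P(R = r \mid \mathbf{c})} \, y \,\Big|\, R = r \right] \\
&= E[W_r y \mid R = r],
\end{aligned} \tag{11}
\]

where $W_r = \frac{P(R = r)}{P(R = r \mid \mathbf{c})}$. The first equality is due to the law of iterated expectations. The third equality is due to Bayes' theorem. The fourth equality is because $E[P(y, \mathbf{c} \mid R = r) \mid R = r] = 1$.

Next, we estimate $E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r]$. According to equation (2), $E[Y(G_{d|\mathbf{c}}(0), G_{m|\mathbf{c}}(0)) \mid R = r]$ equals

\[
\begin{aligned}
& \sum_{\mathbf{c}, \mathbf{x}, d, m} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{\mathbf{c}, \mathbf{x}, D, M, R} I(R = 0) E[Y \mid R = r, \mathbf{x}, D, M, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(D, M \mid R, \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{\mathbf{c}, \mathbf{x}, D, M, R} \frac{I(R = 0)}{P(R = 0 \mid \mathbf{c})} E[Y \mid R = r, \mathbf{x}, D, M, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(D, M \mid R, \mathbf{c}) P(R \mid \mathbf{c}) P(\mathbf{c}) \\
&= \sum_{\mathbf{c}, \mathbf{x}, D, M, R} \frac{I(R = 0)}{P(R = 0 \mid \mathbf{c})} E[Y \mid R = r, \mathbf{x}, D, M, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(D, M, R, \mathbf{c}) \\
&= E\left[ \frac{I(R = 0)}{P(R = 0 \mid \mathbf{c})} \sum_{\mathbf{x}} E[Y \mid R = r, \mathbf{x}, D, M, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) \right] \\
&= E\left[ \frac{P(R = 0)}{P(R = 0 \mid \mathbf{c})} \sum_{\mathbf{x}} E[Y \mid R = r, \mathbf{x}, D, M, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) \,\Big|\, R = 0 \right] \\
&= E\left[ W_0 \sum_{\mathbf{x}} \mu_{r \mathbf{x} D M \mathbf{c}} \, \psi_{\mathbf{x} \mid r \mathbf{c}} \,\Big|\, R = 0 \right],
\end{aligned} \tag{12}
\]

where $W_0 = \frac{P(R = 0)}{P(R = 0 \mid \mathbf{c})}$, $\mu_{r \mathbf{x} D M \mathbf{c}} = E[Y \mid R = r, \mathbf{x}, D, M, \mathbf{c}]$, and $\psi_{\mathbf{x} \mid r \mathbf{c}} = P(\mathbf{x} \mid R = r, \mathbf{c})$. In the first equality, we use $D$, $M$, and $R$ to represent that these are random variables. The fourth equality is because $\sum_b E[a \mid b] P(b) = E[E[a \mid b]]$.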
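The weighting identity in equation (11) can be checked numerically on a toy data set. The sketch below is an illustration only: it assumes a single discrete covariate and plugs in empirical frequencies for $P(R = r)$ and $P(R = r \mid \mathbf{c})$, whereas in practice the conditional probability would come from a fitted model. The data, the tuple layout `(group, covariate, outcome)`, and the function names are hypothetical. It covers only the first term of $\delta(r)$; equation (12) additionally needs an outcome model and $P(\mathbf{x} \mid R = r, \mathbf{c})$.

```python
from collections import Counter


def standardized_mean(rows, r):
    """sum_c E[Y | R=r, c] P(c), using empirical probabilities."""
    n = len(rows)
    total = 0.0
    for c, count in Counter(c for _, c, _ in rows).items():
        ys = [y for g, cc, y in rows if g == r and cc == c]
        total += (sum(ys) / len(ys)) * (count / n)
    return total


def weighted_mean(rows, r):
    """E[W_r Y | R=r] with W_r = P(R=r) / P(R=r | c)."""
    n = len(rows)
    n_r = sum(1 for g, _, _ in rows if g == r)
    n_c = Counter(c for _, c, _ in rows)
    n_rc = Counter((g, c) for g, c, _ in rows)
    total = 0.0
    for g, c, y in rows:
        if g == r:
            weight = (n_r / n) / (n_rc[(g, c)] / n_c[c])
            total += weight * y
    return total / n_r


# Hypothetical toy data: (group, covariate stratum, outcome).
rows = [
    (0, "young", 9.0), (0, "young", 8.0), (0, "old", 7.0),
    (3, "young", 8.0), (3, "old", 6.0), (3, "old", 5.0), (3, "old", 7.0),
]

print(standardized_mean(rows, 3), weighted_mean(rows, 3))
```

Both functions return the same number because the weight rescales each group member's covariate stratum to its share in the full sample, which is what standardizing over $P(\mathbf{c})$ does. The check requires every covariate stratum to contain at least one member of group $r$.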
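This completes the proof.

Appendix D: Bias formulas of $\delta(r)$ and $\zeta(0)$

The bias for disparity reduction for $R = r$ ($\mathrm{bias}(\delta(r))$) is defined as the difference between the expected estimate and the true value as

\[
\begin{aligned}
& \sum_{\mathbf{c}} E[Y \mid R = r, \mathbf{c}] P(\mathbf{c}) - \sum_{\mathbf{x}, d, m, \mathbf{c}} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
& \quad - \sum_{\mathbf{c}} E[Y \mid R = r, \mathbf{c}] P(\mathbf{c}) + \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] P(\mathbf{x} \mid R = r, \mathbf{c}, u) P(d, m \mid R = 0, \mathbf{c}, u) P(\mathbf{c}, u) \\
&= - \sum_{\mathbf{x}, d, m, \mathbf{c}} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}] P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
& \quad + \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] P(\mathbf{x} \mid R = r, \mathbf{c}, u) \Big\{ \sum_{\mathbf{x}} P(d, m \mid R = 0, \mathbf{x}, \mathbf{c}, u) P(\mathbf{x} \mid R = 0, \mathbf{c}, u) \Big\} P(\mathbf{c}, u) \\
&= - \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] P(u \mid R = r, \mathbf{x}, d, m, \mathbf{c}) P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
& \quad + \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] P(\mathbf{x} \mid R = r, \mathbf{c}, u) \Big\{ \sum_{\mathbf{x}} \frac{P(u \mid R = 0, \mathbf{x}, d, m, \mathbf{c})}{P(u \mid R = 0, \mathbf{x}, \mathbf{c})} P(d, m \mid R = 0, \mathbf{x}, \mathbf{c}) \\
& \qquad \times \frac{P(u \mid R = 0, \mathbf{x}, \mathbf{c})}{P(u \mid R = 0, \mathbf{c})} P(\mathbf{x} \mid R = 0, \mathbf{c}) \Big\} P(u \mid \mathbf{c}) P(\mathbf{c}) \\
&= - \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] P(u \mid R = r, \mathbf{x}, d, m, \mathbf{c}) P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
& \quad + \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] \frac{P(u \mid r, \mathbf{x}, \mathbf{c})}{P(u \mid r, \mathbf{c})} P(\mathbf{x} \mid R = r, \mathbf{c}) \Big\{ \sum_{\mathbf{x}} P(u \mid R = 0, \mathbf{x}, d, m, \mathbf{c}) P(d, m \mid R = 0, \mathbf{x}, \mathbf{c}) \\
& \qquad \times P(\mathbf{x} \mid R = 0, \mathbf{c}) \Big\} P(\mathbf{c}) \\
&= - \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] P(u \mid R = r, \mathbf{x}, d, m, \mathbf{c}) P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
& \quad + \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] \frac{P(u \mid r, \mathbf{x}, \mathbf{c})}{P(u \mid R = r, \mathbf{c})} P(\mathbf{x} \mid R = r, \mathbf{c}) P(u \mid R = 0, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
&= - \sum_{\mathbf{x}, d, m, \mathbf{c}, u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] \{ P(u \mid r, \mathbf{x}, d, m, \mathbf{c}) - P(u \mid r, \mathbf{x}, \mathbf{c}) \} P(\mathbf{x} \mid r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}).
\end{aligned} \tag{13}
\]

The first and the last equalities are because of the law of total probability.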
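The second equality is due to Bayes' theorem. The third and fifth equalities are because $U$ only confounds the relationship between the mediators and the outcome (and thus, $U \perp R = r \mid \mathbf{C} = \mathbf{c}$). The fourth equality is because of the law of total probability:

$$
\begin{aligned}
\sum_{d,m} \Big\{ \sum_{\mathbf{x}} P(u \mid R = 0, \mathbf{x}, d, m, \mathbf{c}) P(d, m \mid R = 0, \mathbf{x}, \mathbf{c}) P(\mathbf{x} \mid R = 0, \mathbf{c}) \Big\}
&= \sum_{\mathbf{x},d,m} \big\{ P(u \mid R = 0, \mathbf{x}, d, m, \mathbf{c}) P(d, m \mid R = 0, \mathbf{x}, \mathbf{c}) P(\mathbf{x} \mid R = 0, \mathbf{c}) \big\} P(d, m \mid R = 0, \mathbf{c}) \\
&= P(u \mid R = 0, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}).
\end{aligned}
$$

By using the law of total probability, the last expression of equation (13) equals

$$
\begin{aligned}
={}& - \sum_{\mathbf{x},d,m,\mathbf{c},u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] \Big\{ \sum_{d,m} P(u \mid r, \mathbf{x}, d, m, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) - P(u \mid r, \mathbf{x}, d, m, \mathbf{c}) P(d, m \mid r, \mathbf{x}, \mathbf{c}) \Big\} \\
& \qquad \times P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
={}& - \sum_{\mathbf{x},d,m,\mathbf{c},u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] \Big\{ \sum_{d,m} P(u \mid r, \mathbf{x}, d, m, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) - P(u \mid r, \mathbf{x}, d, m, \mathbf{c}) P(d, m \mid r, \mathbf{c}) \Big\} \\
& \qquad \times P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}) \\
={}& \sum_{\mathbf{x},d,m,\mathbf{c},u} E[Y \mid R = r, \mathbf{x}, d, m, \mathbf{c}, u] \Big\{ \sum_{d,m} P(u \mid r, \mathbf{x}, d, m, \mathbf{c}) \big\{ P(d, m \mid R = r, \mathbf{c}) - P(d, m \mid R = 0, \mathbf{c}) \big\} \Big\} \\
& \qquad \times P(\mathbf{x} \mid R = r, \mathbf{c}) P(d, m \mid R = 0, \mathbf{c}) P(\mathbf{c}).
\end{aligned}
\tag{14}
$$

The first equality is because $P(u \mid R = r, \mathbf{x}, \mathbf{c}) = \sum_{d,m} P(u \mid R = r, \mathbf{x}, d, m, \mathbf{c}) P(d, m \mid R = r, \mathbf{x}, \mathbf{c})$. The second equality is because $\sum_{\mathbf{x}} P(d, m \mid R = r, \mathbf{x}, \mathbf{c}) P(\mathbf{x} \mid R = r, \mathbf{c}) = P(d, m \mid r, \mathbf{c})$. The last expression of equation (14) is a nonparametric bias formula that can be used without any assumptions.

Now, suppose that 1) the effect of $U$ on $Y$ is constant within the strata of $R$, $\mathbf{X}$, $D$, $M$, and $\mathbf{C}$ and equals $\gamma_u$; and 2) the effect of the mediators $D, M$ on $U$ is constant within the strata of $R$, $\mathbf{X}$, and $\mathbf{C}$ and equals $\beta_{dm}$. Then the bias formula reduces to

$$
= \gamma_u \times \beta_{dm} \sum_{\mathbf{c}} \big\{ P(d, m \mid R = r, \mathbf{c}) - P(d, m \mid R = 0, \mathbf{c}) \big\} P(\mathbf{c}).
\tag{15}
$$

Let the partial $R^2$ value of the unobserved confounder $U$ on the outcome given $I(R = r)$, $\mathbf{X}$, $D$, $M$, and $\mathbf{C}$ be denoted as $R^2_{Y \sim U \mid I(R = r), \mathbf{X}, D, M, \mathbf{C}}$; and let the partial $R^2$ value of the mediators on the unobserved confounder $U$ given $I(R = r)$, $\mathbf{X}$, and $\mathbf{C}$ be denoted as $R^2_{U \sim \{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C}}$. Then, the bias can be expressed as

$$
\begin{aligned}
\text{bias}(\delta(r)) ={}& \frac{\text{cov}(Y, U \mid I(R = r), \mathbf{X}, D, M, \mathbf{C})}{\text{var}(U \mid I(R = r), \mathbf{X}, D, M, \mathbf{C})} \times \frac{\text{cov}(U, \{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C})}{\text{var}(\{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C})} \\
& \times \sum_{\mathbf{c}} \big\{ P(d, m \mid r, \mathbf{c}) - P(d, m \mid R = 0, \mathbf{c}) \big\} P(\mathbf{c}) \\
={}& \frac{\text{cor}(Y, U \mid I(R = r), \mathbf{X}, D, M, \mathbf{C}) \, \text{sd}(Y \mid I(R = r), \mathbf{X}, D, M, \mathbf{C})}{\text{sd}(U \mid I(R = r), \mathbf{X}, D, M, \mathbf{C})} \\
& \times \frac{\text{cor}(U, \{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C}) \, \text{sd}(U \mid I(R = r), \mathbf{X}, \mathbf{C})}{\text{sd}(\{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C})} \\
& \times \sum_{\mathbf{c}} \big\{ P(d, m \mid r, \mathbf{c}) - P(d, m \mid R = 0, \mathbf{c}) \big\} P(\mathbf{c}),
\end{aligned}
\tag{16}
$$

where cov = covariance, cor = correlation, and sd = standard deviation. The second equality is because $\text{cov}(A, B \mid C) = \text{cor}(A, B \mid C) \, \text{sd}(A \mid C) \, \text{sd}(B \mid C)$. Then, the absolute value of the bias can be expressed as

$$
\begin{aligned}
|\text{bias}(\delta(r))| ={}& \sqrt{\frac{R^2_{Y \sim U \mid I(R = r), \mathbf{X}, D, M, \mathbf{C}} \times R^2_{U \sim \{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C}}}{1 - R^2_{U \sim \{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C}}}} \times \frac{\text{sd}(Y \mid I(R = r), \mathbf{X}, D, M, \mathbf{C})}{\text{sd}(\{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C})} \\
& \times \Big| \sum_{\mathbf{c}} \big\{ P(d, m \mid r, \mathbf{c}) - P(d, m \mid R = 0, \mathbf{c}) \big\} P(\mathbf{c}) \Big| \\
={}& \text{se}(\gamma_{dm}) \sqrt{\frac{R^2_{Y \sim U \mid I(R = r), \mathbf{X}, D, M, \mathbf{C}} \times R^2_{U \sim \{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C}}}{1 - R^2_{U \sim \{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C}}} \, df} \\
& \times \Big| \sum_{\mathbf{c}} \big\{ P(d, m \mid r, \mathbf{c}) - P(d, m \mid R = 0, \mathbf{c}) \big\} P(\mathbf{c}) \Big|.
\end{aligned}
\tag{17}
$$

The first equality is because $\frac{\text{sd}(U \mid I(R = r), \mathbf{X}, \mathbf{C})}{\text{sd}(U \mid I(R = r), \mathbf{X}, D, M, \mathbf{C})} = \sqrt{\frac{1}{1 - R^2_{U \sim \{D, M\} \mid I(R = r), \mathbf{X}, \mathbf{C}}}}$ and $\text{cor}(A, B \mid C) = R_{A \sim B \mid C}$.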
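As a rough numerical illustration of equation (17), the Python sketch below evaluates the absolute bias for a chosen pair of sensitivity parameters. It is only a sketch under stated assumptions: the function names, argument names, and all input values are hypothetical and are not taken from the study, and the mediator-distribution term $\sum_{\mathbf{c}} \{P(d, m \mid r, \mathbf{c}) - P(d, m \mid R = 0, \mathbf{c})\} P(\mathbf{c})$ is treated as a single number supplied by the user. The second function uses the se-based form, which relies on the relation between $\text{se}(\gamma_{dm})$, the conditional standard deviations, and $df$ stated in this appendix.

```python
import math


def _strength(r2_yu, r2_um):
    """Square-root term shared by both forms of equation (17)."""
    if not (0.0 <= r2_yu <= 1.0) or not (0.0 <= r2_um < 1.0):
        raise ValueError("need 0 <= r2_yu <= 1 and 0 <= r2_um < 1")
    return math.sqrt(r2_yu * r2_um / (1.0 - r2_um))


def abs_bias_sd_form(r2_yu, r2_um, sd_y_resid, sd_m_resid, dist_diff):
    """Absolute bias from the sd-based (first) form of equation (17).

    r2_yu:      partial R^2 of U with Y given I(R = r), X, D, M, C
    r2_um:      partial R^2 of the mediators with U given I(R = r), X, C
    sd_y_resid: sd of Y given I(R = r), X, D, M, C
    sd_m_resid: sd of the mediators given I(R = r), X, C
    dist_diff:  mediator-distribution difference between R = r and R = 0
    """
    return _strength(r2_yu, r2_um) * (sd_y_resid / sd_m_resid) * abs(dist_diff)


def abs_bias_se_form(r2_yu, r2_um, se_gamma, df, dist_diff):
    """Absolute bias from the se-based (second) form of equation (17)."""
    return se_gamma * _strength(r2_yu, r2_um) * math.sqrt(df) * abs(dist_diff)


# Hypothetical inputs chosen only to exercise the formulas.
r2_yu, r2_um = 0.05, 0.10
sd_y_resid, sd_m_resid, df = 2.0, 1.0, 400
dist_diff = 0.3

# se implied by the sd ratio and df, so that the two forms should agree.
se_gamma = (sd_y_resid / sd_m_resid) * math.sqrt(1.0 / df)

bias_sd = abs_bias_sd_form(r2_yu, r2_um, sd_y_resid, sd_m_resid, dist_diff)
bias_se = abs_bias_se_form(r2_yu, r2_um, se_gamma, df, dist_diff)
print(bias_sd, bias_se)
```

In a sensitivity analysis one would typically evaluate such a function over a grid of the two partial $R^2$ values and compare the resulting bias with the estimated disparity reduction; the grid and any comparison threshold are choices left to the analyst.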
The second equality is because $\text{se}(\gamma_{dm}) = \frac{\text{sd}(Y \mid I(r), \mathbf{X}, D, M, \mathbf{C})}{\text{sd}(D, M \mid I(r), \mathbf{X}, \mathbf{C})} \sqrt{\frac{1}{df}}$, where $\gamma_{dm}$ is obtained from regressing $Y$ on $D$ and $M$ jointly after conditioning on $I(R = r)$, $\mathbf{X}$, and $\mathbf{C}$, and $df$ can be obtained from the degrees of freedom of the regression.

Since no causal interpretation is given to the effect of $R$ on $Y$, the bias for the observed disparity due to $U$ is zero. Therefore, $\text{bias}(\zeta(0)) = -\text{bias}(\delta(r))$, which implies that the absolute values of the biases for disparity reduction and disparity remaining are the same. This completes the proof.

References

Albert, J. M. (2012). Mediation analysis for nonlinear models with confounding. Epidemiology (Cambridge, Mass.), (6), 879–888.

Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, (6), 1173–1182.

Bauer, G. R., & Scheim, A. I. (2019). Methods for analytic intercategorical intersectionality in quantitative research: Discrimination as a mediator of health inequalities. Social Science & Medicine, 236–245.

Bernstein, D., & Fink, L. (1998). Manual for the Childhood Trauma Questionnaire. New York: The Psychological Corporation.

Carnethon, M. R., Pu, J., Howard, G., Albert, M. A., Anderson, C. A., Bertoni, A. G., . . . others (2017). Cardiovascular health in African Americans: A scientific statement from the American Heart Association. Circulation, (21), e393–e423.

Cinelli, C., & Hazlett, C. (2020). Making sense of sensitivity: Extending omitted variable bias. Journal of the Royal Statistical Society: Series B (Statistical Methodology), (1), 39–67.

Collins, P. H. (1990). Black feminist thought in the matrix of domination. Black feminist thought: Knowledge, consciousness, and the politics of empowerment, 221–238.

Hong, G., Qin, X., & Yang, F. (2018). Weighting-based sensitivity analysis in causal mediation studies. Journal of Educational and Behavioral Statistics, (1), 32–56.

Imai, K., Keele, L., & Tingley, D. (2010). A general approach to causal mediation analysis. Psychological Methods, 309–334.

Imai, K., Keele, L., & Yamamoto, T. (2010). Identification, inference and sensitivity analysis for causal mediation effects. Statistical Science, 51–71.

Imai, K., & Yamamoto, T. (2013). Identification and sensitivity analysis for multiple causal mechanisms: Revisiting evidence from framing experiments. Political Analysis, 141–171.

Imbens, G. W., & Rubin, D. B. (2015). Causal inference in statistics, social, and biomedical sciences. Cambridge University Press.

Jackson, J. W. (2017). Explaining intersectionality through description, counterfactual thinking, and mediation analysis. Social Psychiatry and Psychiatric Epidemiology, (7), 785–793.

Jackson, J. W. (2018). On the interpretation of path-specific effects in health disparities research. Epidemiology, (4), 517–520.

Jackson, J. W. (2019). Meaningful causal decompositions in health equity research: Definition, identification, and estimation through a weighting framework. arXiv preprint arXiv:1909.10060.

Jackson, J. W., & VanderWeele, T. (2018). Decomposition analysis to identify intervention targets for reducing disparities. Epidemiology, (6), 825–835.

Jackson, J. W., & VanderWeele, T. J. (2019). Intersectional decomposition analysis with differential exposure, effects, and construct. Social Science & Medicine, 254–259.

Lee, C., Park, S., & Boylan, J. (2020). Cardiovascular health at the intersection of race and gender: Identifying life-course process to reduce health disparities. (Unpublished manuscript)

Leigh, J. A., Alvarez, M., & Rodriguez, C. J. (2016). Ethnic minorities and coronary heart disease: An update and future directions. Current Atherosclerosis Reports, (2), 9.

Lloyd-Jones, D. M., Hong, Y., Labarthe, D., Mozaffarian, D., Appel, L. J., Van Horn, L., . . . others (2010). Defining and setting national goals for cardiovascular health promotion and disease reduction: The American Heart Association's strategic impact goal through 2020 and beyond. Circulation, (4), 586–613.

MacKinnon, D. P., & Luecken, L. J. (2008). How and for whom? Mediation and moderation in health psychology. Health Psychology, (2S), S99–S100.

Park, S., & Esterling, K. M. (2020). Sensitivity analysis for pretreatment confounding with multiple mediators. Journal of Educational and Behavioral Statistics, 1076998620934500.

Pearl, J. (2001). Direct and indirect effects. In Proceedings of the seventeenth conference on uncertainty in artificial intelligence (pp. 411–420).

Pool, L. R., Ning, H., Lloyd-Jones, D. M., & Allen, N. B. (2017). Trends in racial/ethnic disparities in cardiovascular health among US adults from 1999–2012. Journal of the American Heart Association, (9), e006027.

Robins, J. M., & Greenland, S. (1992). Identifiability and exchangeability for direct and indirect effects. Epidemiology, 143–155.

Steiner, P. M., & Kim, Y. (2016). The mechanics of omitted variable bias: Bias amplification and cancellation of offsetting biases. Journal of Causal Inference, (2).

VanderWeele, T. (2010). Bias formulas for sensitivity analysis for direct and indirect effects. Epidemiology (Cambridge, Mass.), (4), 540.

VanderWeele, T., & Chiba, Y. (2014). Sensitivity analysis for direct and indirect effects in the presence of exposure-induced mediator-outcome confounders. Epidemiology, Biostatistics, and Public Health, (2).

VanderWeele, T., & Robinson, W. R. (2014). On causal interpretation of race in regressions adjusting for confounding and mediating variables. Epidemiology (Cambridge, Mass.), (4), 473–483.

VanderWeele, T., & Vansteelandt, S. (2014). Mediation analysis with multiple mediators. Epidemiologic Methods, 2