A multistate approach for mediation analysis in the presence of semi-competing risks with application in cancer survival disparities

Abstract

We propose a novel methodology to quantify the effect of stochastic interventions on non-terminal time-to-events that lie on the pathway between an exposure and a terminal time-to-event outcome. Investigating these effects is particularly important in health disparities research when we seek to quantify inequities in timely delivery of treatment and its impact on patients' survival time. Current approaches fail to account for semi-competing risks arising in this setting. Under the potential outcome framework, we define and provide identifiability conditions for causal estimands for stochastic direct and indirect effects. Causal contrasts are estimated in continuous time within a multistate modeling framework and analytic formulae for the estimators of the causal contrasts are developed. We show via simulations that ignoring censoring in mediator and/or outcome time-to-event processes, or ignoring competing risks, may give misleading results. This work demonstrates that rigorous definition of the direct and indirect effects and joint estimation of the outcome and mediator time-to-event distributions in the presence of semi-competing risks are crucial for valid investigation of mechanisms in continuous time. We employ this novel methodology to investigate the role of delaying treatment uptake in explaining racial disparities in cancer survival in a cohort study of colon cancer patients.

Linda Valeri ∗, Cecile Proust-Lima, Weijia Fan, Jarvis T. Chen, Helene Jacqmin-Gadda

Department of Biostatistics, Columbia University Mailman School of Public Health, 722 W 168th St, New York, NY, USA
Department of Biostatistics and Epidemiology, Universite de Bordeaux, Talence, France
Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA

∗ To whom correspondence should be addressed; E-mail: [email protected]

arXiv [stat.ME]

Mediation analysis is a popular approach to decompose the total effect of an exposure on an outcome into direct and indirect effects through an intermediate factor, called a mediator (Greenland and Robins, 1992). Literature in this field is fast growing, now accommodating a wealth of designs, variable dimensions and types (VanderWeele, 2015). Along with the estimation of “natural” effects, other effects that involve interventions on mediators might be of interest. Specifically, controlled direct effects represent the effect of the exposure on the outcome while the mediator is further fixed to a specific level or an intervention on its distribution is sought (i.e., stochastic interventions) (Vansteelandt and Daniel, 2017; Muñoz and van der Laan, 2012).

Recently, the causal inference community has drawn attention to the study of causal effects in the presence of competing or semi-competing risks. Competing risks arise when a terminal failure time outcome competes with another terminal event (Andersen et al., 2012). Semi-competing events arise when the outcome of interest is a non-terminal event that competes with a terminal time-to-event (e.g. death) which can occur before or after the outcome of interest (Fine, Jiang and Chappell, 2001). The consequence of both competing and semi-competing events is to render non-observable, effectively undefined, the primary outcome of interest when the competing event occurs first, a phenomenon that is also referred to as “truncation” (Zhang and Rubin, 2003). This poses challenges to formalizing and estimating causal contrasts for total effects as well as direct and indirect effects.
In particular, excluding from the analysis individuals for whom the competing event occurs first and the outcome is not observed has been shown to lead to selection bias (Tchetgen et al., 2012).

We consider the setting in which interest lies in estimating the effect of interventions on a non-terminal time-to-event mediator in the pathway between the exposure and a terminal time-to-event outcome. The present work is motivated by the study of the role of disparities in access to care as determinants of racial disparities in survival among cancer patients. Recently, causal inference approaches have been proposed to investigate determinants of such disparities (VanderWeele and Robinson, 2014; Valeri et al., 2016; Devick et al., 2020). Here we wish to investigate the role of waiting time to surgery in explaining racial disparities in survival among colon cancer (CC) patients, adopting a causal inference approach. Specifically, we wish to estimate the extent to which racial disparities in survival would remain had the distribution of the intermediate time-to-event, time to surgery, in the least advantaged population been equalized to that observed in the most advantaged population. We address this question in the Cancer Care Outcomes Research and Surveillance Consortium (CanCORS) patient population. A challenge in quantifying the impact of this intervention arises from the fact that the time to surgery is partially unobserved, as some patients died prior to receiving treatment. Exploratory analyses suggest that Black colon cancer patients are more likely to die before receiving treatment than White patients. Furthermore, among those who survive, Black patients display a longer waiting time to surgery. Here we set out to address this question while overcoming the challenge of semi-competing risks.

Our work builds upon the substantial body of literature that describes and proposes solutions to the problem of semi-competing and competing risks when estimating total effects, direct and indirect effects.
Recently, solutions to define, identify and estimate total effects in the presence of truncation have been proposed. When the outcome is a time-to-event, Young, Tchetgen, and Hernán (2018) and Stensrud et al. (2020) proposed novel causal estimands for the average treatment effect in the presence of competing risks. Others focused on the study of survivor average causal effects (SACE) for non-survival outcomes in the presence of semi-competing risks (Ding et al., 2011; Wang, Zhou and Richardson, 2017; Comment et al., 2019). Yet, the problem of semi-competing risks has received relatively little attention when interest lies in estimating direct and indirect effects on a failure time outcome. Novel causal contrasts and g-methods for mediation analysis in the context of semi-competing events have been proposed for the setting in which the outcome is a failure time and the mediator is time-dependent (Aalen et al., 2020; Zheng and Van der Laan, 2017; Lin et al., 2017; Tai et al., 2020). Although tremendous progress has been made to advance the formulation of causal effects in the presence of semi-competing events, unsolved questions remain. First, thus far no causal contrasts have been proposed in the context of mediation analysis when both mediator and outcome are times-to-event. Second, causal mediation contrasts put forth by the most recent contributions, where the outcome is a time-to-event and the mediator is fixed-time or longitudinal, are defined on the hazard ratio scale. Several authors have shown that regardless of how competing events are accommodated, contrasts of hazards cannot generally be interpreted as causal effects (Hernán, 2010). Third, most commonly mediation analysis proceeds by modeling the mediator and outcome regressions separately. In the presence of non-linearities, this estimation strategy is often affected by a model incompatibility issue that precludes valid effect decomposition.
Such issues could be particularly exacerbated when both outcome and mediator are potentially censored.

To address these gaps, we propose interventional analogues to direct and indirect effects to estimate the effect of stochastic interventions on a time-to-event mediator in the presence of semi-competing risks. We provide conditions that lead to non-parametric identification and propose a novel multistate modeling approach for estimation and inference of the causal effects. We further demonstrate the necessity to properly account for semi-competing risks via simulations and illustrate our methodology in the study of racial disparities in colon cancer survival.

Let $A$ denote the exposure variable (race in our context) and $X$ denote cancer stage at diagnosis (I−IV), capturing the illness severity when the cancer is diagnosed. Let $T$, $S$ and $K$ denote the time to surgery, the time to death and the time to censoring, respectively. Censoring time is assumed to be non-informative. The observation times for the $T$ (a non-terminal event) and $S$ (a terminal event) processes are denoted by $Y^T = \min(T, S, K)$ and $Y^S = \min(S, K)$, with event observation indicators $\delta^T = I(Y^T = T)$ and $\delta^S = I(Y^S = S)$. We also observe $C$, a vector of baseline covariates that are confounders of the $T$-$S$ relationship. The observed data for individual $i$ is $O_i = (Y^T_i, \delta^T_i, Y^S_i, \delta^S_i, X_i, A_i, C_i)$.

The directed acyclic graph in Figure 1A describes stage and time to treatment as determinants of racial differences in cancer survival. In particular, for patients in each stage, we are interested in understanding the role of racial differences in time to treatment from diagnosis in explaining the racial differences in survival.

We can define racial disparities (TE, total effect) in survival among colon cancer patients using different scales. For example, we might be interested in estimating racial disparities with respect to the survival function (eq. 2.1), or we could also consider the restricted mean survival time, restricted at time $r$ (eq. 2.2).

$$TE_s = P(S > s \mid A = 1, X, C) - P(S > s \mid A = 0, X, C) \tag{2.1}$$

$$TE_r = E(\min(S, r) \mid A = 1, X, C) - E(\min(S, r) \mid A = 0, X, C) \tag{2.2}$$

We are interested in a fixed or stochastic intervention $g$ on the mediator. $G(\cdot)$ denotes a random draw from an arbitrary distribution in a hypothetical population. This distribution could be completely synthetic and of arbitrary choice of the investigator, or learned from the data.
In the latter case, $g = G(T_{x,c})(A = a) = G(T_{x,c})(a)$ denotes a random draw from the time to treatment distribution observed among individuals with $A = a$, $X = x$ and $C = c$.

We can consider survival differences across exposure under an intervention that fixes the mediator distribution in both exposure groups:

$$SDE_s = P(S_g > s \mid A = 1, X, C) - P(S_g > s \mid A = 0, X, C) \tag{2.3}$$

We can interpret (eq. 2.3) as the residual difference in the probability of surviving after time $s$ between White and Black patients had the distribution of time to treatment $g$ been the same across the two exposure groups. Setting $g = G(T_{x,c})(0)$, we formalize a stochastic intervention for which the time to treatment $T$ of all individuals in group $A = 1$ with a certain stage $x$ and baseline covariates $c$ is randomly drawn from the distribution of $T$ in the group of individuals with $A = 0$, $X = x$, and $C = c$. We refer to this causal contrast as the “stochastic direct effect” ($SDE_s$) because this causal effect fixes the intermediate time-to-event distribution to be the same in both exposure groups.

We can also consider causal contrasts within an exposure group considering fixed or stochastic interventions, $g$ and $g'$, on the intermediate time-to-event:

$$SIE_s = P(S_g > s \mid A = a, X, C) - P(S_{g'} > s \mid A = a, X, C) \tag{2.4}$$

We can interpret (eq. 2.4) as the change in the probability of surviving after time $s$ for subjects in exposure group $A = a$ for a change in the time to treatment level or distribution from $g$ to $g'$. For example, setting $g = G(T_{x,c})(1)$ and $g' = G(T_{x,c})(0)$ indicates a change in the time to treatment distribution from what is observed in group $A = 1$ (Black patients in our case) to what is observed in group $A = 0$ (White patients).
We refer to this causal contrast as the “stochastic indirect effect” ($SIE_s$) because, for this effect to be different from zero, race has to be associated with the intermediate time-to-event, which in turn needs to be causally related to the survival time.

In what follows we focus on the causal estimand in (eq. 2.3), the “residual disparity” or, more generally, the “stochastic direct effect” ($SDE_s$), but the results apply also to the “stochastic indirect effect”. Note that for these causal estimands we are not requiring the time to treatment to be always observed. We allow for censoring (under a random mechanism conditional on observed covariates) and for the terminal event to occur prior to the intermediate event.

Several assumptions must be met to identify these causal contrasts involving the potential outcome $S_g$.

Assumption 1.

Consistency of potential outcomes. Let $S_i \mid T_i = t$ denote the survival time of individual $i$ given that this subject is observed to receive treatment at time $t$, and let $S_{i,g=t}$ denote the potential survival time for subject $i$ had we intervened setting the treatment time equal to $t$. For each subject $i$ and each level of time to treatment $T = t$ we assume

$$S_{i,g=t} = S_i \mid T_i = t$$

That is, for each $i$ and each $t$, the survival in a world where we intervene setting the time to treatment to a specific value $t$ (via a fixed or stochastic intervention) is the same as the survival in the real world where we observe a time to treatment equal to $t$.

Assumption 2.

Conditional exchangeability (no unmeasured confounding). The observed time-to-treatment assignment does not depend on the potential outcomes after accounting for the set of measured covariates $A$, $X$, and $C$:

$$S_g \perp T \mid A, C, X$$

Assumption 3.

Non-informative censoring of event times. The vector of censoring times $K$ is conditionally independent of all potential event times (which implies that the observed censoring time is conditionally independent of all potential event times):

$$S_g \perp K \mid A, C, X$$

Assumption 4.

We assume that positivity holds. That is, for each covariate pattern $(A, X, C)$ that has positive probability in the data, i.e., such that the joint density $f(A, X, C) > 0$, the probability of the (fixed or stochastic) time-to-treatment intervention $g = t$ is positive, i.e., $Pr(g = t \mid A, X, C) > 0$ with probability 1.

Under these assumptions, the g-formula for non-parametric identification of the residual disparity (eq. 2.3), where $g = G(T_{x,c})(0)$, is:

$$
\begin{aligned}
& P(S_{G(T_{x,c})(0)} > s \mid A = 1, X, C) - P(S_{G(T_{x,c})(0)} > s \mid A = 0, X, C) \\
&= \int_t P(S_t > s \mid G(T_{x,c})(0) = t, A = 1, X, C)\, f_{G(T_{x,c})(0)}(t \mid A = 1, X, C)\, dt \\
&\quad - \int_t P(S_t > s \mid G(T_{x,c})(0) = t, A = 0, X, C)\, f_{G(T_{x,c})(0)}(t \mid A = 0, X, C)\, dt \\
&= \int_t P(S_t > s \mid G(T_{x,c})(0) = t, A = 1, X, C)\, f_{G(T_{x,c})(0)}(t \mid A = 1, X, C)\, dt - P(S > s \mid A = 0, X, C) \\
&= \int_t P(S_t > s \mid T = t, A = 1, X, C)\, f(t \mid A = 0, X, C)\, dt - P(S > s \mid A = 0, X, C) \\
&= \int_t \left\{ P(S > s \mid T = t, A = 1, X, C) - P(S > s \mid T = t, A = 0, X, C) \right\} f(t \mid A = 0, X, C)\, dt
\end{aligned}
\tag{2.5}
$$

where the first equality is due to the assumption of conditional exchangeability, the second equality solves the integral for the $A = 0$ group, the third equality uses the definition of stochastic intervention that fixes the random variable $T$ to $t$, and the fourth equality uses consistency and re-expands the integral for the $A = 0$ group.

The multistate model (Figure 1B) is a natural framework to operationalize the nonparametric estimator derived in the previous section. Our setting involves the following two scenarios: (1) the individual transits from diagnosed to treated, and then from treated to death, or (2) the individual transits from diagnosed directly to death and time to treatment is censored at time of death.
In this context our intervention aims at controlling the intensity of the transition from diagnosed to treated only.

We define $\alpha_{01}(t \mid A, X, C)$, $\alpha_{02}(t \mid A, X, C)$ and $\alpha_{12}(t \mid t', A, X, C)$ as the instantaneous hazards of transiting from diagnosed to treated, from diagnosed to death, and from treated to death, respectively. $\Lambda_{01}(s \mid A, X, C)$, $\Lambda_{02}(s \mid A, X, C)$ and $\Lambda_{12}(s \mid T, A, X, C)$ are the corresponding cumulative transition intensity functions. The survival function is the sum of the probability of being alive and not treated, $P_{00}(s \mid A, X, C)$, and the probability of being alive and treated, $P_{01}(s \mid A, X, C)$:

$$P(S > s \mid A, X, C) = P_{00}(s \mid A, X, C) + P_{01}(s \mid A, X, C)$$

where the probability of being alive and untreated at time $s$ is:

$$P_{00}(s \mid A, X, C) = e^{-\Lambda_{01}(s \mid A, X, C) - \Lambda_{02}(s \mid A, X, C)} \tag{2.6}$$

and the probability of being alive and treated at time $s$ is:

$$P_{01}(s \mid A, X, C) = \int_0^s e^{-\Lambda_{01}(u \mid A, X, C) - \Lambda_{02}(u \mid A, X, C)}\, \alpha_{01}(u \mid A, X, C)\, e^{-\Lambda_{12}(s \mid T = u, A, X, C) + \Lambda_{12}(u \mid T = u, A, X, C)}\, du \tag{2.7}$$

Let $g = G(T_{x,c})(0)$ be a random draw from the time to treatment distribution in the group $A = 0$. The probability of being alive and untreated at time $s$ under the time to treatment distribution observed in the $A = 0$ subgroup, $P^g_{00}(s \mid A, X, C)$, and the probability of being alive and treated at time $s$ under the time to treatment distribution observed in the $A = 0$ subgroup, $P^g_{01}(s \mid A, X, C)$, are given by:

$$P^g_{00}(s \mid A, X, C) = e^{-\Lambda_{01}(s \mid A = 0, X, C) - \Lambda_{02}(s \mid A, X, C)} \tag{2.8}$$

and

$$P^g_{01}(s \mid A, X, C) = \int_0^s e^{-\Lambda_{01}(u \mid A = 0, X, C) - \Lambda_{02}(u \mid A, X, C)}\, \alpha_{01}(u \mid A = 0, X, C)\, e^{-\Lambda_{12}(s \mid T = u, A, X, C) + \Lambda_{12}(u \mid T = u, A, X, C)}\, du \tag{2.9}$$

Our estimator for the stochastic direct effect defined in (2.3) can be obtained as the combination of the four probabilities given in equations (2.6)-(2.9):

$$
\begin{aligned}
SDE_s &= P(S_g > s \mid A = 1, X, C) - P(S_g > s \mid A = 0, X, C) \\
&= \left\{ P^g_{00}(s \mid A = 1, X, C) + P^g_{01}(s \mid A = 1, X, C) \right\} - \left\{ P_{00}(s \mid A = 0, X, C) + P_{01}(s \mid A = 0, X, C) \right\}.
\end{aligned}
\tag{2.10}
$$

We can interpret the estimator in equation (2.10) as the average difference between the global survival of a hypothetical population of Black patients having the time to treatment distribution observed among White patients who survived to receive treatment and the global survival observed among White patients. The first term is the probability of being alive and untreated at $s$ for a Black patient with the treatment transition hazard (from untreated to treated) of a White patient and the death transition hazard of a Black patient. The second term is the probability of being alive and treated at $s$ for a patient with the treatment transition hazard of a White patient and the death transition hazard of a Black patient, given the intervention on the time to treatment. The third term is the probability of being alive and untreated at $s$ for a White patient.
The fourth term is the probability of being alive and treated at $s$ for a White patient.

Similarly, it can be shown that the estimator for $SIE_s$ introduced in (2.4) is given by:

$$
\begin{aligned}
SIE_s &= P(S_{G(T_{x,c})(1)} > s \mid A = 1, X, C) - P(S_{G(T_{x,c})(0)} > s \mid A = 1, X, C) \\
&= \int_0^s P(S > s \mid t, A = 1, C, X)\, f(t \mid A = 1, C, X)\, dt - \int_0^s P(S > s \mid t, A = 1, C, X)\, f(t \mid A = 0, C, X)\, dt \\
&= \left\{ P_{00}(s \mid A = 1, X, C) + P_{01}(s \mid A = 1, X, C) \right\} - \left\{ P^{g'}_{00}(s \mid A = 1, X, C) + P^{g'}_{01}(s \mid A = 1, X, C) \right\} \\
&= \Big\{ e^{-\Lambda_{01}(s \mid A = 1, X, C) - \Lambda_{02}(s \mid A = 1, X, C)} \\
&\qquad + \int_0^s e^{-\Lambda_{01}(t \mid A = 1, X, C) - \Lambda_{02}(t \mid A = 1, X, C)}\, \alpha_{01}(t \mid A = 1, X, C)\, e^{-\Lambda_{12}(s \mid t, A = 1, X, C) + \Lambda_{12}(t \mid t, A = 1, X, C)}\, dt \Big\} \\
&\quad - \Big\{ e^{-\Lambda_{01}(s \mid A = 0, X, C) - \Lambda_{02}(s \mid A = 1, X, C)} \\
&\qquad + \int_0^s e^{-\Lambda_{01}(t \mid A = 0, X, C) - \Lambda_{02}(t \mid A = 1, X, C)}\, \alpha_{01}(t \mid A = 0, X, C)\, e^{-\Lambda_{12}(s \mid t, A = 1, X, C) + \Lambda_{12}(t \mid t, A = 1, X, C)}\, dt \Big\}.
\end{aligned}
\tag{2.11}
$$

We propose a semi-parametric approach to estimate the causal contrasts defined above. In particular, we specify a semi-parametric proportional intensity model for the hazard of transitioning from diagnosed to treated ($\alpha_{01}(t \mid A, X, C)$), from diagnosed to death ($\alpha_{02}(t \mid A, X, C)$) and from treated at $t'$ to death at $t$ ($\alpha_{12}(t \mid t', A, X, C)$).
$$\alpha_{01}(t \mid A, X, C) = \alpha_{01,0}(t)\, e^{\beta_1 A + \beta_2 X + \beta_3' C} \tag{2.12}$$

$$\alpha_{02}(t \mid A, X, C) = \alpha_{02,0}(t)\, e^{\gamma_1 A + \gamma_2 X + \gamma_3' C} \tag{2.13}$$

$$\alpha_{12}(t \mid t', A, X, C) = \alpha_{12,0}(t)\, e^{\delta_1 A + \delta_2 t' + \delta_3 A \ast t' + \delta_4 X + \delta_5' C} \quad \text{for } t' < t \tag{2.14}$$

Recalling that the observation times for the $T$ (the non-terminal event) and $S$ (the terminal event) processes are denoted by $Y^T = \min(T, S, K)$ and $Y^S = \min(S, K)$, and that event observation indicators are given by $\delta^T = I(Y^T = T)$ and $\delta^S = I(Y^S = S)$, the observed data likelihood contribution for person $i$ defined by these hazards is:

$$L_i = \left[ e^{-\Lambda_{01}(Y^T_i) - \Lambda_{02}(Y^T_i)}\, \alpha_{01}(Y^T_i)\, e^{-\Lambda_{12}(Y^S_i) + \Lambda_{12}(Y^T_i)}\, \alpha_{12}(Y^S_i \mid Y^T_i)^{\delta^S_i} \right]^{\delta^T_i} \left[ e^{-\Lambda_{01}(Y^S_i) - \Lambda_{02}(Y^S_i)}\, \alpha_{02}(Y^S_i)^{\delta^S_i} \right]^{(1 - \delta^T_i)}.$$

The estimation procedure for the $SDE_s$ and the $SIE_s$ can then proceed by fitting the multistate model, estimating the cumulative hazard via the Breslow method (Lin, 2007), and solving the integrals involved in the estimator by computing $\alpha$ and $\Lambda$ using the rectangular (or Simpson) method (Atkinson and Kendall, 1989) along the time interval $(0, s)$.

The estimation procedure for the causal quantities can be summarized in three steps:

(1) estimate the multistate regression model coefficients. We here decided to use the R package mstate (de Wreede et al., 2011) for this purpose, but other alternatives can be considered (Jackson, 2011).
(2) predict baseline hazards, hazards, and cumulative hazards for each transition for observed times in the range $t \in (0, s]$, in new data for each racial/ethnic group and time to treatment $t \in (0, s]$. We used the function msfit() in this step.
(3) calculate the causal effects of interest by plugging in the estimated hazards and cumulative hazards.

Inference can be conducted via bootstrapping. R code for the procedure can be found on the GitHub page of the corresponding author (add url).

We conducted an extensive simulation study to evaluate the performance of our proposed method (“multi-state”) and to compare it with two other commonly used methods when analyzing semi-competing risk data (“exclude $T > S$” and “censor $T > S$”). For our proposed method, we include all subjects. The “exclude $T > S$” approach excludes from the analytic sample subjects who experienced the terminal event but not the non-terminal event, such that $T > S$. The “censor $T > S$” approach includes all subjects in the analysis, treating all subjects who experienced the terminal event but not the non-terminal event as being censored, with the parameters referring to the transition from diagnosed to death set to zero. We examined three semi-competing risks situations: no semi-competing risks (i.e., everyone either has the intermediate event or is censored for $T$ and $S$), and 10% and 40% semi-competing risks (10% or 40% of subjects experience the terminal event without experiencing the intermediate event). For each semi-competing risk situation, we considered four scenarios for the total effect ($TE_s$), stochastic direct effect ($SDE_s$) and stochastic indirect effect ($SIE_s$), all generated under a proportional intensity multistate model (2.12)-(2.14), allowing for either a linear or partially linear model (exposure-mediator interaction) specification. In scenarios 1 and 2 we set $SIE_s \neq 0$ and $SDE_s \neq 0$, and specify the model for the hazard of transitioning from the intermediate non-terminal event to the terminal time-to-event outcome without (scenario 1) and with (scenario 2) interaction between the intermediate time-to-event and exposure. In scenario 3 we specify $SIE_s \neq 0$ and $SDE_s = 0$, and in scenario 4 we set $SIE_s = 0$ and $SDE_s \neq 0$.

We simulated 100 datasets with a sample size of 2000, and we calculated quantities of interest for time points from baseline to $s = 24$ months of follow-up in increments of 0.5 months. Standard errors of the estimates were computed by bootstrap with 100 bootstrap samples. Performance of each method under each scenario was assessed for bias, confidence interval (CI) coverage probability, mean squared error (MSE), and type I error rate (null cases) for effect estimates at 24 months. Full details of the simulation strategy are given in Appendix A of the Supplementary Material available at Biostatistics online.
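To make the plug-in formulas concrete, the following is a minimal Python sketch of how the state occupation probabilities (2.6)-(2.7) and the stochastic direct effect (2.8)-(2.10) could be evaluated on the grid of time points described above. It is an illustration under simplifying assumptions that are not part of the paper: the three transition hazards are taken to be constant in time, there are no covariates, and the treated-to-death hazard does not depend on the treatment time. All function names and hazard values are hypothetical, and the integral is solved with a trapezoid rule rather than the rectangular or Simpson rule mentioned in the text.

```python
import numpy as np

def occupation_probs(s, a01, a02, a12, n_grid=2001):
    """Return (P00, P01) at time s for an illness-death model with constant hazards.

    a01: diagnosed -> treated, a02: diagnosed -> death, a12: treated -> death.
    """
    # Eq. (2.6): alive and untreated means no exit from "diagnosed" before s.
    p00 = np.exp(-(a01 + a02) * s)
    # Eq. (2.7): integrate over the possible treatment times u in (0, s).
    u = np.linspace(0.0, s, n_grid)
    integrand = np.exp(-(a01 + a02) * u) * a01 * np.exp(-a12 * (s - u))
    du = u[1] - u[0]
    # Trapezoid rule written out explicitly to avoid version-specific helpers.
    p01 = du * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return p00, p01

def survival(s, a01, a02, a12):
    """P(S > s) = P00(s) + P01(s)."""
    p00, p01 = occupation_probs(s, a01, a02, a12)
    return p00 + p01

def stochastic_direct_effect(s, haz_a1, haz_a0):
    """Eqs. (2.8)-(2.10): group A=1 keeps its own death hazards but
    receives the diagnosed -> treated hazard of group A=0."""
    surv_g = survival(s, haz_a0["a01"], haz_a1["a02"], haz_a1["a12"])
    return surv_g - survival(s, **haz_a0)

# Hypothetical monthly hazards for the two exposure groups (illustration only).
haz_a0 = {"a01": 0.50, "a02": 0.05, "a12": 0.02}
haz_a1 = {"a01": 0.30, "a02": 0.08, "a12": 0.04}

# Grid from baseline to s = 24 months by 0.5 months, as in the simulation design.
times = np.arange(0.0, 24.5, 0.5)
te = np.array([survival(s, **haz_a1) - survival(s, **haz_a0) for s in times])
sde = np.array([stochastic_direct_effect(s, haz_a1, haz_a0) for s in times])
```

With these illustrative values, the untreated death hazard exceeds the treated one, so equalizing the treatment hazard shrinks the survival gap without removing it. In an actual analysis the constant hazards would be replaced by the fitted hazards and cumulative hazards from step (2) of the estimation procedure.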

Figure 2 compares the performance of different methods for scenario 1 with 10% and 40% semi-competing risks. Figures S1-S3 in Appendix B of the Supplementary Material show results for the other simulation scenarios.

For all four effect type scenarios and all levels of semi-competing risks, the proposed multistate approach provides unbiased estimates, CI coverage probability close to 95% and the lowest MSE compared to the alternative approaches for all three effects of interest. The approaches “exclude $T > S$” and “censor $T > S$” produce biased estimates and suboptimal coverage for $SDE_s$ and $SIE_s$ even when semi-competing risks are low (10%) (Figures 2A, 2C and S1). As expected, for the estimation of $SDE_s$ under the scenario in which $SDE_s = 0$ and for the estimation of $SIE_s$ under the scenario in which $SIE_s = 0$ (Figures S2 and S3), all approaches are unbiased. For all other scenarios, the estimates for $SDE_s$ and $SIE_s$ may be severely biased and confidence interval coverage probabilities are unsatisfactory when these alternative approaches are adopted.

The simulation results underscore that severe bias will arise when semi-competing risks are ignored, and that bias will increase as the semi-competing risks increase (Figures 2B and 2D) and in the presence of non-linearities, such as an exposure-mediator interaction (Figure S2). Type I errors are invalid for the test of the $SIE_s$ when semi-competing risks are high and the “exclude $T > S$” method is adopted (Figure S6). The simulation study demonstrates that adopting the multistate approach to mediation analysis that we propose appropriately addresses semi-competing risks under the identifiability assumptions discussed in section 2.2 and correct model specification.
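The performance measures reported above (bias, MSE, CI coverage and type I error rate) can be computed from replicate-level output along the following lines. This is a minimal Python sketch, not the authors' code: the function name is hypothetical, and it assumes Wald-type 95% intervals built from the bootstrap standard errors, which the text does not specify.

```python
import numpy as np

def performance(estimates, std_errors, truth, z=1.96):
    """Summarize simulation replicates of one effect estimate.

    estimates, std_errors: one entry per simulated dataset.
    truth: the true value of the effect under the data-generating model.
    Assumes Wald intervals estimate +/- z * SE (an assumption of this sketch).
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    lower, upper = est - z * se, est + z * se
    return {
        "bias": est.mean() - truth,
        "mse": np.mean((est - truth) ** 2),
        # Share of intervals that contain the true value.
        "coverage": np.mean((lower <= truth) & (truth <= upper)),
        # Share of intervals excluding zero; a type I error rate when truth == 0.
        "rejection_rate": np.mean((lower > 0) | (upper < 0)),
    }

# Synthetic replicates for a null effect (illustrative values, not paper results).
rng = np.random.default_rng(0)
fake_estimates = rng.normal(loc=0.0, scale=0.05, size=100)
fake_se = np.full(100, 0.05)
summary = performance(fake_estimates, fake_se, truth=0.0)
```

When the true effect is zero, `rejection_rate` plays the role of the type I error rate; otherwise it is the empirical power of the corresponding test.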

We obtained data from the National Cancer Institute’s CanCORS Consortium, which contains detailed information from cancer patients and physicians (Ayanian et al., 2004). The consortium collected data on colorectal cancer (CRC) cases in multiple regions and health care delivery systems across the U.S. The study population consisted of non-Hispanic White and non-Hispanic Black patients enrolled between 2003 and 2005. To be eligible for enrollment as a CRC case, the study required patients to be at least 21 years old and with newly diagnosed invasive adenocarcinoma of the colon or rectum. The CanCORS Consortium oversampled minority groups at several of the sites. Of the ten CanCORS centers that enrolled patients with CRC, eight centers collected survival data: Henry Ford Health System (HFHS), Kaiser Permanente Hawaii (KPHI), Kaiser Permanente Northwest (KPNW), Northern California Cancer Center (NCCC), State of Alabama (UAB), Los Angeles County (UCLA), North Carolina (UNC), and Veterans Health Administration (VA). CanCORS collected stage at diagnosis information via medical record abstraction or from cancer registries, categorized as stage I-IV according to the American Joint Committee on Cancer (AJCC) staging criteria (Edge et al., 2010). For the purpose of the analysis, we selected patients diagnosed with colon cancer (CC), and conducted analyses stratified by stage. We calculated time from diagnosis to treatment from the corresponding dates reported in CanCORS data for treatment. We focused on surgery as the treatment of interest since reporting of dates of other treatments (chemotherapy and radiotherapy) was sparse in this sample and surgery is the first line of therapy for most patients diagnosed with colon cancer across stages (Lawler et al., 2020; Libutti et al., 2019).

For our CanCORS data analysis, we used survival time in months since diagnosis. We considered age at diagnosis (age < , − , and > years), sex (female versus male), income level (<\$40k, \$40-80k, >\$80k; with the middle category being the reference), and medical center to be potential confounders of the time to surgery-survival relationship. Table S1 provides descriptive statistics of the sample of CC patients by racial-ethnic group. We inspected the cumulative incidence curves for waiting time to surgery and survival time (Figure 3) by racial-ethnic group. We fit a multistate Cox proportional hazards regression for the transitions from diagnosis to surgery, from diagnosis to death and from surgery to death. Motivated by Valeri et al. (2016), we conducted backwards model selection allowing for race by income/gender/age, time to surgery by race, and time to surgery by income interactions. We also considered quadratic effects of time to surgery in the transition to death. We performed analyses using the best fitting models for the three transitions. In CanCORS, we estimated the Black-White disparity in 5-year (60 month) survival probability prior to intervention on the time to surgery distribution ($TE_{60}$). Further, we estimated the residual difference in 5-year (60 month) survival probability after a hypothetical intervention on the distribution of waiting time to surgery of the Black patients to match the distribution of waiting time to surgery observed among the White patients ($SDE_{60}$) using our multistate modeling approach. We considered our approach along with the two alternative estimation procedures that ignore semi-competing risks, namely “exclude $T > S$” and “censor $T > S$”. We calculated 95% confidence intervals for the disparity ($TE_{60}$), the residual disparity under intervention ($SDE_{60}$) and the proportion of the disparity eliminated by the intervention ($PE = (TE_{60} - SDE_{60}) / TE_{60}$) using the bootstrap with 100 bootstrap samples.

Results

From the descriptive statistics in Table S1 and the cumulative incidence curves in Figure 3, we note that Black patients had longer waiting times to surgery and slightly shorter survival times on average compared to White patients. We also found that 6.2% of the patients died prior to receiving treatment, indicating a semi-competing risk problem, albeit a small one. At diagnosis, Black patients were younger and presented at higher stages than White patients. We observed income and gender differences as well.

We report results of analyses for stage II patients (n=283, comprising 23.2% of the sample) in the main manuscript. Results for other stages can be found in Appendix B of the Supplementary Material (Figures S7-S9).

Table 1 displays the output of the multistate modeling analyses. Adjusting for gender, age, income and center, there is suggestive evidence that stage II Black patients experienced a reduction in the hazard of surgery compared to White patients (HR=0.75, 95% confidence interval=0.55 – 1.03) and an increase in the hazard of dying prior to receiving surgery (HR=3.26, 95% confidence interval=0.51 – 20.0). Considering the transition from treatment to death modeled without adjustment for the mediator, we found evidence of racial-ethnic disparities that varied by income category. Black patients within the middle income group experienced more than three times the hazard of dying compared to White patients within the middle income group (HR=3.71, 95% confidence interval=2.19 – 6.30). When adjustment is made for the mediator, the racial coefficient for this income group is reduced (HR=1.39, 95% confidence interval=0.82 – 2.36).
In addition to the race-income interaction, we found suggestive evidence of a non-linear effect of time to treatment, whereby the hazard of death increased with longer waiting time to treatment, with this effect diminishing over time.

Figure 4 displays the estimated racial disparity for stage II, middle income CC patients on the survival probability difference scale and the residual disparity on the survival probability difference scale had the hazard of transitioning from diagnosis to treatment been the same between the two racial-ethnic groups. Results are shown for the three modeling strategies. Table 2 shows the effect estimates for the difference in 5-year survival probability between the two racial-ethnic groups before (TE) and after (SDE) the intervention on the time to treatment distribution. Using the proposed multistate approach, we estimate a Black-White disparity in 5-year survival probability of TE = -0.29, 95% confidence interval=(-0.51, -0.05), and, if we implemented an intervention so that Black and White patients displayed the same distribution for time to surgery, the residual disparity after the intervention would be SDE = -0.26, 95% confidence interval=(-0.47, -0.04). These results indicate that 8% of the racial-ethnic disparity would be eliminated by such an intervention. The alternative approach, which selects only patients who were treated, leads to a slightly reduced estimated disparity of TE = -0.25, 95% confidence interval=(-0.49, 0.00), and no change when considering the intervention (SDE = -0.25, 95% confidence interval=(-0.45, 0.00)). Similar results are obtained from the analysis that considers the semi-competing event as a censoring event. Results for other stages indicate a similarly weak impact of the intervention in our sample (Figures S7-S9). Our application shows that ignoring semi-competing risks, even when the contribution of semi-competing risks is small (as in these data), may lead to different estimates of the disparities and the impact of the intervention.

We have presented a novel approach for mediation analysis when mediator and outcome are times-to-event. To our knowledge, this is the first method presented in the causal inference literature to estimate randomized interventional analogues of direct and indirect effects when the mediator is a time-to-event and semi-competing risks are present. This method allows for joint modeling of outcome and mediator through the multistate model formulation. Our extension of causal mediation methodology that allows for semi-competing risks and interventions on intermediate times-to-event is important for many applications in public health.
For instance, this approach could be applied to quantify the change in racial disparities in survival for patients affected by any health condition, say COVID-19, after an intervention on timing of care, say admission to ICU. In other epidemiological settings across the life-course, this approach could be applied to evaluate the role of intermediate time-to-events in early or mid-life, such as cardiovascular events, in explaining disparities or exposure effects on late-life events, such as dementia.

Our simulation study shows that our proposed multistate approach performs better than current methods that ignore semi-competing risks. In the presence of semi-competing risks, we therefore advise using our approach over other methods.

Applying this novel methodology to the CanCORS cohort, we found a racial disparity in colon cancer survival for middle income stage II patients and a negative association of race and time to surgery. If the time to surgery distribution were fixed for both groups to what was observed in the White population, the disparity would be reduced by about 8%, suggesting that interventions improving timeliness of treatment could slightly reduce racial disparities in survival.

Our approach recovers the counterfactual survival curve in the group of Black patients if their time to treatment distribution were fixed to what was observed among the White patients who survived long enough to receive treatment. This causal interpretation is weaker than the one put forth by natural direct and indirect effects.
The identification of natural effects requires stronger assumptions than the ones we have discussed for stochastic direct and indirect effects. We would need to assume that no confounders of the mediator-outcome relationship are affected by the exposure and that there are no confounders of the exposure-mediator relationship. Moreover, natural effects in the presence of semi-competing risks require the identification of the counterfactual distribution of the time-to-event mediator when death could be treated either as a conditioning or a censoring event (Young et al., 2018). The extension to natural effects in the presence of competing risks, as well as the setting of multiple time-to-event mediators, is a direction for future work.

Some limitations of our method are due to the reliance on no-unmeasured-confounding assumptions and on the correct specification of the multistate model transition probabilities as a function of the exposure, confounders and mediator. In our data application, our results are limited by several factors, such as potential residual confounding by socio-economic factors beyond income (for example, education) and measurement error in time to surgery, which was self-reported. The causal effect should therefore be interpreted with caution. The stochastic direct and indirect effects that we propose consider a change in the waiting time to treatment distribution observed among the White and Black patients who survived. One might question whether these distributions are representative of the target population. This depends on the hazard of the transition from diagnosis to death in the target population, which is a function of both individual and quality-of-care factors. The generalizability of the racial disparity estimates beyond CanCORS may be limited, as only research medical centers were involved in the cohort study. Such medical centers typically display higher quality of care.
Finally, we have considered only treatment timing as a potential determinant of racial disparities in survival; other aspects of treatment quality should also be considered.

In future work, we plan to evaluate the performance of our approach in the presence of measurement error in the time-to-event mediator and to extend the approach to allow for multiple sequential time-to-event mediators and time-dependent confounders.
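To make the estimands discussed above concrete, the following is a minimal sketch of TE, SDE and PE in a three-state illness-death model (diagnosed, treated, dead). It is a deliberate simplification and not the authors' implementation: it assumes constant transition hazards instead of covariate-adjusted Cox models, it omits the bootstrap, and every hazard value is hypothetical and chosen only for illustration. The names `survival_prob`, `white` and `black` are likewise illustrative.

```python
import math


def survival_prob(t, h01, h02, h12, steps=2000):
    """P(alive at time t) in a three-state illness-death model.

    h01: constant hazard, diagnosis -> treatment
    h02: constant hazard, diagnosis -> death without treatment
    h12: function of the waiting time s returning the hazard of death
         after treatment received at time s (constant in time thereafter)
    """
    leave = h01 + h02
    # Still untreated and alive at t.
    alive_untreated = math.exp(-leave * t)
    # Treated at some s < t and alive at t, integrated over s (midpoint rule).
    ds = t / steps
    alive_treated = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        alive_treated += h01 * math.exp(-leave * s) * math.exp(-h12(s) * (t - s)) * ds
    return alive_untreated + alive_treated


# Hypothetical monthly hazards (assumptions, not estimates from CanCORS).
white = {"h01": 0.50, "h02": 0.005, "h12": lambda s: 0.004 * math.exp(0.05 * s)}
black = {"h01": 0.35, "h02": 0.015, "h12": lambda s: 0.008 * math.exp(0.05 * s)}

t = 60.0
s_white = survival_prob(t, **white)
s_black = survival_prob(t, **black)
# Stochastic intervention: Black patients receive the White
# diagnosis -> treatment hazard; their other hazards are unchanged.
s_black_cf = survival_prob(t, white["h01"], black["h02"], black["h12"])

te = s_black - s_white        # disparity before the intervention
sde = s_black_cf - s_white    # residual disparity after the intervention
pe = (te - sde) / te          # proportion of the disparity eliminated
print(f"TE={te:.3f}  SDE={sde:.3f}  PE={pe:.3f}")
```

The design choice worth noting is that survival is computed from all three transitions jointly, so patients who die before treatment contribute to the survival probability rather than being excluded or censored.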

Software in the form of R code, together with a sample input data set and complete documentation, is available on request from the GitHub page: https://github.com/wf2213/multistate.

Acknowledgments

This study makes use of data generated by the CanCORS Consortium. This work was supported by the National Institute of Mental Health award K01 MH118477.

Conflict of Interest: None declared.

References

Aalen, O. O., Stensrud, M. J., Didelez, V., Daniel, R., Røysland, K., & Strohmaier, S. (2020). Time-dependent mediators in survival analysis: Modeling direct and indirect effects with the additive hazards model. Biometrical Journal, 62(3), 532-549.

Andersen, P. K., Geskus, R. B., de Witte, T., & Putter, H. (2012). Competing risks in epidemiology: possibilities and pitfalls. International Journal of Epidemiology, 41(3), 861-870.

Atkinson, K. E. (1989). An Introduction to Numerical Analysis (2nd ed.). John Wiley & Sons. ISBN 0-471-50023-2.

Ayanian, J. Z., Chrischilles, E. A., Wallace, R. B., Fletcher, R. H., Fouad, M. N., Kiefe, C. I., ... & West, D. W. (2004). Understanding cancer treatment and outcomes: the cancer care outcomes research and surveillance consortium. Journal of Clinical Oncology: official journal of the American Society of Clinical Oncology, 22(15), 2992-2996.

Comment, L., Mealli, F., Haneuse, S., & Zigler, C. (2019). Survivor average causal effects for continuous time: a principal stratification approach to causal inference with semicompeting risks. arXiv preprint arXiv:1902.09304.

Devick, K. L., Valeri, L., Chen, J., Jara, A., Bind, M. A., & Coull, B. A. (2020). The role of body mass index at diagnosis of colorectal cancer on Black–White disparities in survival: a density regression mediation approach. Biostatistics. kxaa034, https://doi.org/10.1093/biostatistics/kxaa034

de Wreede, L. C., Fiocco, M., & Putter, H. (2011). mstate: an R package for the analysis of competing risks and multi-state models. Journal of Statistical Software, 38(7), 1-30.

Ding, P., Geng, Z., Yan, W., & Zhou, X. H. (2011). Identifiability and estimation of causal effects by principal stratification with outcomes truncated by death. Journal of the American Statistical Association, 106(496), 1578-1591.

Edge, S. B., Byrd, D. R., Carducci, M. A., Compton, C. C., Fritz, A. G., & Greene, F. L. (2010). AJCC cancer staging manual (Vol. 649). S. B. Edge (Ed.). New York: Springer.

Fine, J. P., Jiang, H., & Chappell, R. (2001). On semi-competing risks data. Biometrika, 88(4), 907-919.

Hernán, M. A. (2010). The hazards of hazard ratios. Epidemiology (Cambridge, Mass.), 21(1), 13.

Jackson, C. H. (2011). Multi-state models for panel data: the msm package for R. Journal of Statistical Software, 38(8), 1-29.

Lawler, M., Johnston, B., Van Schaeybroeck, S., Salto-Tellez, M., Wilson, R., Dunlop, M., and Johnston, P. G. Chapter 74 – Colorectal Cancer. In: Niederhuber JE, Armitage JO, Dorshow JH, Kastan MB, Tepper JE, eds. Abeloff's Clinical Oncology. 6th ed. Philadelphia, Pa. Elsevier: 2020.

Libutti, S. K., Saltz, L. B., Willett, C. G., and Levine, R. A. Ch 62 - Cancer of the Colon. In: DeVita VT, Hellman S, Rosenberg SA, eds. DeVita, Hellman, and Rosenberg's Cancer: Principles and Practice of Oncology. 11th ed. Philadelphia, Pa: Lippincott-Williams Wilkins; 2019.

Lin, D. Y. (2007). On the Breslow estimator. Lifetime Data Analysis, 13(4), 471-480.

Lin, S. H., Young, J. G., Logan, R., & VanderWeele, T. J. (2017). Mediation analysis for a survival outcome with time-varying exposures, mediators, and confounders. Statistics in Medicine, 36(26), 4153-4166.

Díaz, I. M. and van der Laan, M. (2012). Population intervention causal effects based on stochastic interventions. Biometrics, 68(2), 541-549.

Robins, J. M., and Greenland, S. (1992). Identifiability and exchangeability for direct and indirect effects. Epidemiology: 143-155.

Stensrud, M. J., Young, J. G., Didelez, V., Robins, J. M., & Hernán, M. A. (2020). Separable Effects for Causal Inference in the Presence of Competing Events. Journal of the American Statistical Association, (just-accepted), 1-9.

Tai, A. S., Tsai, C. A., & Lin, S. H. (2020). Survival mediation analysis with the death-truncated mediator: The completeness of the survival mediation parameter. Harvard University Biostatistics Working Paper Series. Working Paper 223. https://biostats.bepress.com/harvardbiostat/paper223

Tchetgen, E. J. T., Glymour, M. M., Shpitser, I., & Weuve, J. (2012). Rejoinder: to weight or not to weight? On the relation between inverse-probability weighting and principal stratification for truncation by death. Epidemiology, 23(1), 132-137.

Valeri, L., Chen, J. T., Garcia-Albeniz, X., Krieger, N., VanderWeele, T. J., & Coull, B. A. (2016). The role of stage at diagnosis in colorectal cancer black–white survival disparities: a counterfactual causal inference approach. Cancer Epidemiology and Prevention Biomarkers, 25(1), 83-89.

VanderWeele, T. J. (2015). Explanation in causal inference: methods for mediation and interaction. Oxford University Press.

VanderWeele, T. J., & Robinson, W. R. (2014). On causal interpretation of race in regressions adjusting for confounding and mediating variables. Epidemiology (Cambridge, Mass.), 25(4), 473-484.

Vansteelandt, S., & Daniel, R. M. (2017). Interventional effects for mediation analysis with multiple mediators. Epidemiology (Cambridge, Mass.), 28(2), 258.

Wang, L., Zhou, X. H., & Richardson, T. S. (2017). Identification and estimation of causal effects with outcomes truncated by death. Biometrika, 104(3), 597-612.

Young, J. G., Tchetgen Tchetgen, E. J., & Hernán, M. A. (2018). The choice to define competing risk events as censoring events and implications for causal inference. arXiv preprint arXiv:1806.06136.

Zhang, J. L., & Rubin, D. B. (2003). Estimation of causal effects via principal stratification when some outcomes are truncated by "death". Journal of Educational and Behavioral Statistics, 28(4), 353-368.

Zheng, W., & van der Laan, M. (2017). Longitudinal mediation analysis with time-varying mediators and exposures, with application to survival outcomes. Journal of Causal Inference, 5(2). DOI: https://doi.org/10.1515/jci-2016-0006.

Figure 1: (A) Directed acyclic graph encoding our assumptions on conditional independences among the nodes. (B) Illness-death model representation of our study, where individuals can transition from diagnosed (dx) to treated and then to death, or can transition directly from diagnosed to death.

Figure 2: Simulation results for scenario 1 (SDEs ≠ 0 and SIEs ≠ 0, no exposure-mediator interaction). A) and C) Curves of the effects on the survival probability difference scale for 10% and 40% competing risks, respectively. B) and D) Comparison of approaches in the estimation of the effects on differences in the probability of surviving after 24 months in terms of bias, coverage probability and MSE for 10% and 40% competing risks, respectively.

Table 2: Multistate Cox proportional hazards model for the transitions diagnosed-treated, diagnosed-death, and treated-death in stage II colon cancer patients. First column group without and second column group with adjustment for time to surgery in the third transition. Models are adjusted for income, age, gender, and medical center, and allow in the third transition for race-income and race-time to surgery interactions and a nonlinear effect of time.

| Predictors | Est | CI | p | Est | CI | p |
|---|---|---|---|---|---|---|
| Transition diagnosed-treated | | | | | | |
| Race | 0.75 | 0.55 – 1.03 | 0.072 | 0.75 | 0.55 – 1.03 | 0.072 |
| Age | 0.97 | 0.80 – 1.16 | 0.713 | 0.97 | 0.80 – 1.16 | 0.713 |
| Gender | 1.10 | 0.84 – 1.43 | 0.492 | 1.10 | 0.84 – 1.43 | 0.492 |
| Income < K > K | | | | | | |
| Transition diagnosed-death | | | | | | |
| Race | 3.26 | 0.51 – 20.4 | 0.207 | 3.26 | 0.51 – 20.4 | 0.207 |
| Age | 1.25 | 0.32 – 4.78 | 0.749 | 1.25 | 0.32 – 4.78 | 0.749 |
| Income | 0.97 | 0.41 – 2.24 | 0.942 | 0.97 | 0.41 – 2.24 | 0.942 |
| Gender | 1.84 | 0.27 – 12.51 | 0.534 | 1.84 | 0.27 – 12.51 | 0.534 |
| Center | 1.44 | 0.84 – 2.44 | 0.178 | 1.44 | 0.84 – 2.44 | 0.178 |
| Transition treated-death | | | | | | |
| Race | 3.71 | 2.19 – 6.30 | 0.029 | 1.39 | 0.82 – 2.36 | 0.780 |
| Age | 1.30 | 0.63 – 2.66 | 0.193 | 1.28 | 0.62 – 2.65 | 0.213 |
| Gender | 1.08 | 0.39 – 2.99 | 0.794 | 1.09 | 0.39 – 3.03 | 0.772 |
| Income < K > K < K > K | | | | | | |

TE: racial difference in the >60 months survival probability; SDE: racial difference in the >60 months survival probability after the intervention on the time to treatment distribution (shifting the time to treatment distribution in Black patients to match the one observed for White patients, i.e. reducing waiting times to surgery in the most disadvantaged group); PE: the proportion of racial disparities eliminated by the intervention.

| Method | TE | SDE | PE |
|---|---|---|---|
| Multistate | -0.29 (-0.51, -0.05) | -0.26 (-0.47, -0.04) | 8% |
| exclude T > S | -0.25 (-0.49, 0.00) | -0.25 (-0.45, 0.00) | 0% |
| censor T > S | -0.25 (-0.48, 0.00) | -0.25 (-0.46, 0.00) | 0% |
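As a worked check of the definition PE = (TE − SDE)/TE against the effect estimates tabulated above, the sketch below evaluates PE from the two-decimal values and also the range of PE compatible with rounding TE and SDE to two decimals. The function names are hypothetical, and this is only a reading of the reported numbers, not a reconstruction of how the authors computed PE.

```python
def proportion_eliminated(te, sde):
    """PE = (TE - SDE) / TE."""
    return (te - sde) / te


def pe_range_under_rounding(te, sde, half_width=0.005):
    """Smallest and largest PE when te and sde are known only to +/- half_width.

    For negative te and sde, PE is monotone in each argument, so the
    extremes are attained at the corners of the rounding box.
    """
    candidates = [
        proportion_eliminated(te + dt, sde + ds)
        for dt in (-half_width, half_width)
        for ds in (-half_width, half_width)
    ]
    return min(candidates), max(candidates)


# Multistate row: TE = -0.29, SDE = -0.26 (rounded values from the table).
pe_multistate = proportion_eliminated(-0.29, -0.26)
lo, hi = pe_range_under_rounding(-0.29, -0.26)
# "exclude T > S" row: TE = SDE = -0.25.
pe_exclude = proportion_eliminated(-0.25, -0.25)
print(f"multistate PE={pe_multistate:.3f}, range under rounding=({lo:.3f}, {hi:.3f})")
print(f"exclude PE={pe_exclude:.3f}")
```

The rounded multistate values give a PE of roughly 0.10, and the reported 8% falls inside the range that two-decimal rounding of TE and SDE allows, so small differences between the tabulated PE and one recomputed from the table are to be expected.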

[Figure 3 plot: cumulative incidence functions; x-axis: Months; y-axis: Cumulative incidence of event; legend: race (White, Black)]

Figure 3: Cumulative incidence curves stratified by racial-ethnic group for time-to-surgery (top curves) and time-to-death (bottom curves).

[Figure 4 plot: x-axis: Time in months; y-axis: TE and SDE]

Figure 4: Total effect (TE, solid line) and stochastic direct effect (SDE, dashed line) on the survival probability difference scale estimated in the CanCORS data for subjects diagnosed at stage II under the three approaches: multistate model (black), exclude T > S (dark gray) and censor