Hierarchical causal variance decomposition for institution and provider comparisons in healthcare
HHierarchical causal variance decomposition forinstitution and provider comparisons in healthcare
Bo Chen and Olli Saarela ∗ Dalla Lana School of Public Health, University of TorontoMay 25, 2020
Abstract
Disease-specific quality indicators (QIs) are used to compare institutionsand health care providers in terms processes or outcomes relevant to treatmentof a particular condition. In the context of surgical cancer treatments, the per-formance variations can be due to hospital and/or surgeon level differences, cre-ating a hierarchical clustering. We consider how the observed variation in carereceived at patient level can be decomposed into that causally explained by thehospital performance, surgeon performance within hospital, patient case-mix,and unexplained (residual) variation. For this purpose, we derive a four-wayvariance decomposition, with particular attention to the causal interpretationof the components. For estimation, we use inputs from a mixed-effect modelwith nested random hospital/surgeon-specific effects, and a multinomial logis-tic model for the hospital/surgeon-specific patient populations. We investigatethe performance of our methods in a simulation study.
Keywords: causal inference, variance decomposition, quality indica-tor, nested random effects model ∗ Correspondence to: Olli Saarela, Dalla Lana School of Public Health, 155 College Street,Toronto, Ontario M5T 3M7, Canada. Email: [email protected] a r X i v : . [ s t a t . M E ] M a y Introduction
Data on health care utilization and patient outcomes have multilevel structure, withclusters formed for example by administrative subregions, referral networks, hospi-tals, and physicians (Daniels and Gatsonis, 1999). Quantifying the between clustervariation in processes of care and outcomes can reveal quality of care related issues,motivating the practice of hospital or provider profiling, with reviews of statisticalapproaches given for example by Goldstein and Spiegelhalter (1996); Shahian andNormand (2008); Racz and Sedransk (2010). Some of the modeling approaches areaimed at identifying outlier clusters Farrell et al. (2010), while others focus on quan-tifying and explaining the sources of variation (Hawley et al., 2006). We will discussthese issues in the context of disease-specific quality indicators (QIs) for surgical careof kidney cancer (Wood et al., 2013; Lawson et al., 2017), where the clustering ofinterest are surgeons nested within hospitals. The QIs we consider are either processtype, capturing variations in care delivered, or outcome type, capturing variation inpatient outcomes (Donabedian, 1988).Although it is possible that some surgeons operate in multiple hospitals, a hierar-chical clustering can be constructed by considering each hospital-surgeon combinationas a separate category. To introduce some notation, let Y ∈ R represent the observedprocess or outcome experienced by a given patient, and X = ( X , . . . , X p ) representa vector of patient characteristics necessary for case-mix adjustment in the compar-isons. Also, let Z ∈ { , ..., m } indicate the hospital in which the patient was actuallytreated, and S ∈ { , ..., h z } the surgeon that treated the patient in hospital z . Thetreatment received/outcome of the patient can be modeled through a generalizedlinear model of the form E [ Y | Z = z, S = s, X = x ] = g − ( α + α z + γ zs + β (cid:48) x ) , (1)where g is the link function. The model can be made identifiable by setting the fixedeffects α = 0 and γ z = 0, z = 1 , . . . m , or alternatively through random effectsby taking them to be IID as α z ∼ N (0 , τ ) and γ zs ∼ N (0 , κ ). The covariancebetween the hospital and surgeon effects is 0 due to the two-level categories beingnested rather than crossed, this model is referred to as a nested random effects model(Norberg, 1986; Longford, 1987; Rabe-Hesketh et al., 2005). It also involves the ad-ditional assumption that the random effects are independent of the individual-levelcharacteristics (Dieleman and Templin, 2014; Clarke et al., 2015). The choice betweenthe fixed effect and random effect formulations depends in part on the numbers ofclusters at different levels and numbers of observations per cluster. With large num-ber of clusters, some of these small, the random effect model can provide more stableestimation due to the shrinkage effect for the small clusters, which may be desirableeven if the distributional assumption on the random effect is violated. Model (1) as-sumes the absence of interactions between the cluster effects and the individual-level2haracteristics, but this can be relaxed by allowing for the interactions, which canagain be either fixed or random (Bell et al., 2019). While in some contexts modelssuch as (1) are fitted to estimate the effects β of the individual-level characteristics, inthe hospital/provider profiling context such a model is typically used to estimate thecluster effects while adjusting for the case-mix factors X ; predictions from the modelcan be used to calculate directly standardized estimates of the hospital/provider spe-cific means. The problem of directly standardized comparisons between hospitals ofcan be framed in a causal inference framework using potential outcomes, as outlinedby (Varewyck et al., 2014).Another use for models such as (1) is to quantify how much variation in the out-come is explained by the cluster-level effects. In the present context this answersthe question of whether quality of care differences exist in the health care systemoverall, in particular, adjusted for patient case-mix, whether similar kinds of patientsreceive different level of care. Demonstrating such variation is often the first stepof validating a proposed process or outcome measure as a QI. In the case of iden-tity link, or at the link function scale, an X -conditional variance decomposition canbe directly given in terms of the random effect variance parameters (Merlo et al.,2006). More generally at the outcome scale, variance decompositions can still becalculated making use of model-based predictions. However, generally there existsseveral alternative variance decompositions depending on the order of conditioningon the variables (Bowsher and Swain, 2012). In Chen et al. (2019) we demonstratedthat a certain ordering of the variables results in a variance decomposition where thebetween-hospital component can be given a causal interpretation. We also analyzedhospital-level variation in the quality of surgical care of kidney cancer in Ontario,Canada. The proposed quality indicators we considered were the proportion of par-tial (versus radical) nephrectomies among stage T1a nephrectomy patients, the sameproportion restricted to the subpopulation of patients with chronic kidney diseaseor its risk factors diabetes or hypertension, minimally invasive surgery among T1-T2 radical nephrectomy patients, and readmission within 30 days of the surgery forT1-T4 radical nephrectomy patients. The first three of these are process type, whilereadmission is an outcome. While we found significant between-hospital and case-mixvariation in several of the indicators, also the residual variances were large, raising thequestion of how much within-hospital between-surgeon variation is captured by theseindicators. This motivates us to further develop a causal variance decomposition andcorresponding estimators for hierarchical clusterings, for the purpose of quantifyingthe contribution of hospital and surgeon level effects while adjusting for patient case-mix. This requires introduction of potential outcomes notation for multiple levels ofexposures, which can be adapted from instrumental variable (Angrist et al., 1996)and causal mediation analysis (VanderWeele and Vansteelandt, 2009; VanderWeeleet al., 2014) literature. 3 .2 Objectives Based on the objectives motivated above, the structure of the paper is as follows.In Section 2.1, we adapt potential outcomes notation to represent multiple nestedexposure levels, and state the necessary assumptions for unconfounded comparisonsbetween the levels. In Section 2.2, we generalize the three-way causal variance de-composition of Chen et al. (2019) to a four-way decomposition capturing variancecomponents due to patient case-mix, hospitals’ performance, surgeons’ performance,and unexplained variation, and discuss its causal interpretation. In Section 2.4, weconnect the variance decomposition to measures of intra-class correlation. We pro-pose an estimation method based on nested random-effect models in Section 3 andstudy its properties in a simulation study in Section 4. We end with a discussion onlimitations and future research directions in Section 5.
We suppress the individual level index i , and let Y ∈ R represent the observed processor outcome experienced by a given patient, used to construct a QI. Let Y ( z, s ) ∈ R represent the counterfactual outcomes of the same patient received care via surgeon s ∈ { , . . . , h z } operating in a hospital z ∈ { , . . . , m } . Let further X = ( X , . . . , X p )be a vector of covariates relevant to the case-mix adjustment, which include de-mographic, comorbidity, and disease progression information. Let Z ∈ { , . . . , m } indicate the hospital where the patient was actually treated, and S ∈ { , . . . , h z } the surgeon that operated the patient. Let further S ( z ) ∈ { , .., h z } indicate thesurgeon that potentially operates the patient, if referred to hospital z . The observedvariables Y and S are linked to their potential counterparts under the counterfactualconsistency/stable unit treatment value assumption (SUTVA), by Y = Y ( Z, S ( Z ))and S = S ( Z ). Causal inferences on the hospital and surgeon effects are possibleunder the assumption of strong ignorability of the joint hospital and surgeon assign-ment mechanism, which states that 0 < P ( Z = z, S = z | X = x ) < z ∈ { , . . . , m } , s ∈ { , . . . , h z } and x (positivity) and Y ( z, s ) ⊥⊥ ( Z, S ) | X (condi-tional exchangeability) Rubin and Rosenbaum (1983); Hern´an and Robins (2006). Wenote that positivity is assumed only over the observed hospital/surgeon combinations,to make the levels nested rather than crossed. We introduce the shorthand notations g ( s ; z, X ) ≡ P ( S = s | Z = z, X ) and e ( z ; X ) = P ( Z = z | X ) for the correspondingassignment probabilities. The hypothesized causal relationships are illustrated in thedirected acyclic graph (DAG) in Figure 1. Here I and H represent additional factorsthat influence the hospital/surgeon assignment and the patient characteristics, butare not confounders. In particular, adjustment for the instrumental variables I wouldlikely lead to positivity violations. 4 SZ YIH
Figure 1: Causal mechanism for hospital ( Z ) assignment, surgeon ( S ) assignmentand a process of care ( Y ). X represents a vector of potential confounders relevant tothe case-mix adjustment, while I represents instrumental variables that predict thehospital assignment but are not confounders. H represents latent history that caninfluence I and X but is not in itself confounder.We note that since Y ( z, S ( z )) = Y ( z ), the above notation reduces to the hospital-level potential outcome notation used by for example (Varewyck et al., 2014) andChen et al. (2019). For this one-level clustering, in Chen et al. (2019) we derived avariance decomposition V [ Y ] = V X (cid:40)(cid:88) z E ( Y ( z ) | X ) e ( z ; X ) (cid:41) + E X (cid:88) z (cid:34) E ( Y ( z ) | X ) − (cid:88) z (cid:48) E ( Y ( z (cid:48) ) | X ) e ( z (cid:48) ; X ) (cid:35) e ( z ; X ) + E X (cid:40)(cid:88) z V ( Y ( z ) | X ) e ( z ; X ) (cid:41) (2)= variance explained by the patient case-mix+ average variance causally explained by the between-hospital differences in performanceconditional on case-mix+ residual variance . Here the second variance component captures the average squared differences fromthe average level of care for similar patients between the hospitals. However, it doesnot capture between provider variation within the hospitals, which is included in theresidual variance. In the following we derive a four-way decomposition that introduces5 new term to capture the within hospital between provider variation.
Under counterfactual consistency/SUTVA, we have V [ Y ] = V [ Y ( Z, S ( Z ))]. We beginwith the two-way variance decomposition V [ Y ( Z, S ( Z ))] = V X [ E ( Y ( Z, S ( Z )) | X )] + E X [ V ( Y ( Z, S ( Z ))) | X )] . (3)In the Eqution (3), the first term can further write V X [ E ( Y ( Z, S ( Z )) | X )] = V X [ E Z | X [ E ( Y ( Z, S ( Z )) | Z, X )]]= V X { E Z | X [ E S ( Z ) | Z,X [ E ( Y ( Z, S ( Z )) | S ( Z ) , Z, X )]] } (4)In the latter term, we can further write E X [ V ( Y ( Z, S ( Z )) | X )]= E X [ V Z | X [ E ( Y ( Z, S ( Z )) | Z, X )]]+ E X [ E Z | X [ V ( Y ( Z, S ( Z )) | Z, X )]]= E X { V Z | X [ E S ( Z ) | Z,X [ E ( Y ( Z, S ( Z )) | S ( Z ) , Z, X )]] } + E X { E Z | X [ V S ( Z ) | Z,X [ E ( Y ( Z, S ( Z )) | S ( Z ) , Z, X )]] } + E X { E Z | X [ E S ( Z ) | Z,X [ V ( Y ( Z, S ( Z )) | S ( Z ) , Z, X )]] } . (5)Substituting these Equations (4) and (5) into Equation (3), we obtain V [ Y ( Z, S ( Z ))] = V X { E Z | X [ E S ( Z ) | Z,X [ E ( Y ( Z, S ( Z )) | S ( Z ) , Z, X )]] } + E X { V Z | X [ E S ( Z ) | Z,X [ E ( Y ( Z, S ( Z )) | S ( Z ) , Z, X )]] } + E X { E Z | X [ V S ( Z ) | Z,X [ E ( Y ( Z, S ( Z )) | S ( Z ) , Z, X )]] } + E X { E Z | X [ E S ( Z ) | Z,X [ V ( Y ( Z, S ( Z )) | S ( Z ) , Z, X )]] } . (6)Due to strong ignorability assumption, we have E ( Y ( z, s ) | S = s, Z = z, X ) = E ( Y ( z, s ) | X )and V ( Y ( z, s ) | S = s, Z = z, X ) = V ( Y ( z, s ) | X ) . V [ Y ] = V X (cid:26) (cid:88) z (cid:88) s E ( Y ( z, s ) | X ) g ( s ; z, X ) e ( z ; X ) (cid:27) + E X (cid:26) (cid:88) z (cid:20) (cid:88) s E ( Y ( z, s ) | X ) g ( s ; z, X ) − (cid:88) z (cid:48) (cid:88) s E ( Y ( z (cid:48) , s ) | X ) g ( s ; z (cid:48) , X ) e ( z (cid:48) ; X ) (cid:21) e ( z ; X ) (cid:27) + E X (cid:26) (cid:88) z (cid:20) (cid:88) s (cid:18) E ( Y ( z, s ) | X ) − (cid:88) s (cid:48) E ( Y ( z, s (cid:48) ) | X ) g ( s (cid:48) ; z, X ) (cid:19) g ( s ; z, X ) (cid:21) e ( z ; X ) (cid:27) + E X (cid:26) (cid:88) z (cid:88) s V ( Y ( z, s ) | X ) g ( s ; z, X ) e ( z ; X ) (cid:27) . (7)The interpretation is V [ Y ] = total observed variance in care received= variance explained by the patient case-mix+ average variance causally explained by the between hospitaldifferences in performance conditional on case-mix+ average variance causally explained by the surgeon performanceconditioning on patient case-mix and hospital performance+ unexplained (residual) variance . We note that the first and second term in (7) are equivalent to the first and secondterm in (2). This also implies that the third, residual, variance component in (2) isequivalent to the sum of the third and fourth components in (7), meaning that theadditional term in (7) is a result of splitting the residual variance in (2). We willconsider the second and the third terms as the causal quantities of interest. Thesehave a causal interpretation, as further discussed in Section 2.3, and can be linkedto the observable quantities under the causal assumptions, working backwards from(7). In the second term, the hospital performance is compared to the average levelacross all the hospitals for a patient with characteristics X and then averaged overthe patient population. In the third term, the surgeon performance is compared tothe average level of the surgeons in the same hospital for a patient with characteristics X , and then averaged over the hospitals and the patient population.There are two possible approaches to estimate the variance components in (7). Inthe first approach, directly based on the factorization in (7), we can estimate thembased on modeling E [ Y | S, Z, X ], P ( S | Z, X ), P ( Z | X ) and using the empiricaldistribution of X . Because the four-way variance decomposition can be also expressed7s V [ Y ] = V X [ E ( Y | X )]+ E Z,X (cid:8) [ E ( Y | Z, X ) − E ( Y | X )] (cid:9) + E S,Z,X (cid:8) [ E ( Y | S, Z, X ) − E ( Y | Z, X )] (cid:9) + E S,Z,X [ V ( Y | S, Z, X )] , (8)the variance components can be also estimated based on modeling E [ Y | S ( Z ) , Z, X ], E [ Y | Z, X ], E [ Y | X ] and using the empirical distribution of ( S, Z, X ). We willdiscuss both approached in Section 3; the former is based on factorization of thelikelihood and can be used to construct an approximate Bayesian inference procedure.
To better understand the causal interpretation of the variance decomposition, we con-sidered a special case with two hospitals with two surgeons each. The interpretationof the case-mix and between-hospital components is unchanged and was already dis-cussed by Chen et al. (2019), so we focus on the interpretation of the within-hospitalbetween-surgeon component. With two hospitals with indexed by z = 1 ,
2, and twosurgeons in each indexed by s = 1 ,
2, we denote the hospital assignment probabilitywith P ( Z = 1 | X ) = e (1; X ) = e ( X ), with P ( Z = 2 | X ) = 1 − e ( X ), and the sur-geon assignment probability within hospital with P ( S = 1 | Z = z, X ) = g (1; z, X ) = g ( z, X ), with P ( S = 2 | Z = z, X ) = 1 − g ( z, X ). Now the third term (7) becomes E X (cid:110) e ( X ) g (1 , X )(1 − g (1 , X )) [ E ( Y (1 , | X ) − E ( Y (1 , | X )] + (1 − e ( X )) g (2 , X )(1 − g (2 , X )) (cid:2) E ( Y (2 , | X ) − E ( Y (2 , | X ) (cid:3) (cid:111) = E X (cid:110) e ( X ) V ( S | Z = 1 , X ) [ E ( Y (1 , | X ) − E ( Y (1 , | X )] + (1 − e ( X )) V ( S | Z = 2 , X ) [ E ( Y (2 , | X ) − E ( Y (2 , | X )] (cid:111) . (9)We consider three scenarios to illustrate the causal interpretation of (9) via the rela-tionship of X, Z, S and Y in Figure 1. Scenario 1.
In the absence of the arrow X → Y , which implies Y ⊥⊥ X | ( Z, S ) and E ( Y ( z, s ) | X ) = E ( Y ( z, s )), (9) becomes[ E ( Y (1 , − E ( Y (1 , E X [ e ( X ) V ( S | Z = 1 , X )]+ [ E ( Y (2 , − E ( Y (2 , E X [(1 − e ( X )) V ( S | Z = 2 , X )] . The first multiplicative terms represent squared pairwise causal contrasts. Thisis multiplied by the second terms, the magnitude of which depends on the8olume of patients of type X in each hospital, as well as the variation in thesurgeon assignment for patients of type X . The latter is maximized whenboth surgeons treat similar patient populations. If the surgeons specialize ontreatment of specific kinds of patients, so that there is no overlap in the patientpopulations treated by the two surgeons, the positivity assumption is violatedand the corresponding component is equal to 0. Thus, for between-surgeonperformance differences to manifest through this variance component, theremust be some overlap in the patient population they treat. Scenario 2.
In the absence of the arrow X → S , X → Z , I → S , I → Z , whichimplies ( Z, S ) ⊥⊥ X , we have e ( X ) = e , g (1 , X ) = g (1), and g (2 , X ) = g (2).Now (9) becomes eV ( S | Z = 1) E X (cid:26)(cid:2) E ( Y (1 , | X ) − E ( Y (1 , | X ) (cid:3) (cid:27) + (1 − e ) V ( S | Z = 2) E X (cid:26)(cid:2) E ( Y (2 , | X ) − E ( Y (2 , | X ) (cid:3) (cid:27) . We note that under this completely randomized setting the magnitude of thecausal effects are proportional to the terms E X (cid:8) [ E ( Y ( z, | X ) − E ( Y ( z, | X )] (cid:9) = E X (cid:8) E ( Y ( z, − Y ( z, | X ) (cid:9) = V X [ E ( Y ( z, − Y ( z, | X )] + E X { E ( Y ( z, − Y ( z, | X ) } = V X [ E ( Y ( z, − Y ( z, | X )] + E [ Y ( z, − Y ( z, , that is, proportional to the sum of the variance of the covariate conditionalcausal effects and the squared population average causal effect. The formercaptures effect modification by the patient characteristics, showing that any ef-fect modification adds to the measure, rather than canceling out, and the lattercaptures the overall performance difference. The resulting variance component,as expressed for two levels being compared, has similarities to the causal inter-pretation recently given to the model reliance metric used to measure variableimportance in machine learning contexts (Fisher et al., 2019). Our results showthat a similar kind of effect measure can be derived through a variance decom-position argument under a randomized assignment, and generalize this from twoto multiple exposure levels being compared. Since the variance component un-der a randomized assignment mechanism may be of interest in itself as a causalquantity, in Section 2.4 we show that it can always be derived and estimatedunder a hypothetical target assignment mechanism of interest regardless of theactual mechanism that assigns patients for hospitals and surgeons. Scenario 3.
In the absence of the arrow S → Y , which implies Y ⊥⊥ S | ( Z, X ),we have E ( Y (1 , | X ) = E ( Y (1 , | X ) and E ( Y (2 , | X ) = E ( Y (2 , | ), and the between-surgeon component is zero, as it should it the absence ofindividual-level causal effects. The variance decomposition (7) was derived for the observed marginal variance ofthe outcome, which is why it depends on the mechanism that assigns patients tohospitals and surgeons, including the hospital and surgeon volume. Alternatively, wecan derive a variance decomposition under a hypothetical “randomized” assignment,where for example each hospital/surgeon treats similar kind of patient population,and/or similar patient volume. This will also allow us to derive a connection betweenthe causal variance decomposition and well-known intra-class correlation measures.Let A ∈ { , . . . , m } and B ( A ) ∈ s ∈ { , . . . , h a } be hospital and surgeon assignmentsrandomly drawn with specified probabilities ˜ e ( a ; X ) = P ( A = a | X ) and ˜ g ( b ; a, X ) ≡ P ( B = b | A = a, X ), chosen such that ( A, B ) ⊥⊥ ( Z, S ) | X , 0 < P ( A = a, B = b | X = x ) < Y ( a, b ) ⊥⊥ ( A, B ) | X . Here choosing for example ˜ e ( a ; X ) = P ( Z = a )and ˜ g ( b ; a, X ) = P ( S = b | Z = a ) would correspond to a mechanism where eachhospital/surgeon treats similar patient population, but retaining the original patientvolumes. Choosing ˜ e ( a ; X ) ≡ /m and ˜ g ( b ; a, X ) = 1 /h a would also mean setting thevolumes to be the same.For the variance under the hypothetical assignment mechanism we get the decom-position V [ Y ( A, B ( A ))]= V X (cid:26) (cid:88) a (cid:88) b E ( Y ( a, b ) | X )˜ g ( b ; z, X )˜ e ( a ; X ) (cid:27) + E X (cid:26) (cid:88) a (cid:20) (cid:88) b E ( Y ( a, b ) | X )˜ g ( b ; a, X ) − (cid:88) a (cid:48) (cid:88) b E ( Y ( a (cid:48) , b ) | X )˜ g ( b ; a (cid:48) , X )˜ e ( a (cid:48) ; X ) (cid:21) ˜ e ( a ; X ) (cid:27) + E X (cid:26) (cid:88) a (cid:20) (cid:88) b (cid:18) E ( Y ( a, b ) | X ) − (cid:88) b (cid:48) E ( Y ( a, b (cid:48) ) | X )˜ g ( b (cid:48) ; a, X ) (cid:19) ˜ g ( b ; a, X ) (cid:21) ˜ e ( a ; X ) (cid:27) + E X (cid:26) (cid:88) a (cid:88) b V ( Y ( a, b ) | X )˜ g ( b ; a, X )˜ e ( a ; X ) (cid:27) . (10)Because under the above assumptions, we have E [ Y ( a, b ) | A = a, B = b, X ] = E [ Y ( a, b ) | X ]= E [ Y ( a, b ) | Z = a, S = b, X ]= E [ Y | Z = a, S = b, X ] , e ( a ; X ) and ˜ g ( b ; a, X ) are fixed quantities, this is also estimable from theobserved data on ( Y, Z, S, X ).Under the special case of ˜ e ( a ; X ) ≡ /m and ˜ g ( b ; a, X ) = 1 /h a and the linearmixed-effect model E [ Y ( a, b ) | Z = a, S = b, X ] = E [ Y ( a, b ) | X ] = α + α a + γ ab + β (cid:48) X, where the nested hospital and surgeon random effects are IID as α a ∼ N (0 , τ ) and γ ab ∼ N (0 , κ ) and residuals distributed as Y − E [ Y | Z, S, X ] ∼ N (0 , σ ), the secondterm in (10) can be written as1 m (cid:88) a (cid:40)(cid:32) α a − m (cid:88) a (cid:48) α a (cid:48) (cid:33) + (cid:32) h a (cid:88) b γ ab − m (cid:88) a (cid:48) h a (cid:48) (cid:88) b γ a (cid:48) b (cid:33)(cid:41) . Keeping m is fixed and letting h a → ∞ for all a ∈ { , . . . , m } , the terms h a (cid:80) b γ ab converge to E ( γ ab ) = 0, and the second term in becomes (10)1 m (cid:88) a (cid:26) α a − m (cid:88) a (cid:48) α a (cid:48) (cid:27) , (11)which in turn converges to V ( α a ) = τ , the variance of the between-hospital effects.Hence, for this variance component we obtain the same result as for the three-waycausal variance decomposition proposed by Chen et al. (2019).The third term in (10) can be written as1 m (cid:88) a h a (cid:88) b (cid:32) α ab − h a (cid:88) b (cid:48) γ ab (cid:48) (cid:33) . (12)Keeping m fixed and letting h a → ∞ for all a ∈ { , . . . , m } , (12) converges to V ( α ab ) = κ , the variance of the within-hospital between-surgeon effects. Therefore,under this special case, the X -conditional causal variance decomposition is V ( Y ( A, B ( A )) | X ) = V A | X [ E B ( A ) | A,X [ E ( Y ( A, B ( A )) | B ( A ) , A, X )]]+ E A | X [ V B ( A ) | A,X [ E ( Y ( A, B ( A )) | B ( A ) , A, X )]]+ E A | X [ E B ( A ) | A,X [ V ( Y ( A, B ( A )) | B ( A ) , A, X )]] → τ + κ + σ when h a → ∞ ∀ a ∈ { , .., m } and m → ∞ , that is, asymptotically equivalent to the variance decomposition obtained through themodel-based random effect and residual variances. Here ( τ + κ ) / ( τ + κ + σ ) wouldcorrespond to the within-hospital within-surgeon intra-class correlation coefficient.11 Estimators
We outline estimators based on the decomposition (7), which requires fitting hospital,surgeon and case-mix conditional outcome model and case-mix conditional assign-ment model. The same estimation approach works also for decomposition (9), butsubstituting fixed target assignment probabilities in place of the observed ones.To model the outcomes, we use a generalized linear mixed model E [ Y ( z, s ) | X ; θ ] = E [ Y | Z = z, S = s, X ; θ ] = g − (cid:16) α + α z + γ zs + β (cid:48) X (cid:17) . (13)where θ = ( α , α z , α s , β ) and where the nested hospital and surgeon random effects aretaken to be IID as α z ∼ N (0 , τ ) and γ zs ∼ N (0 , κ ). For the joint hospital/surgeonassignment mechanism, we fit a multinomial logistic regression model P ( Z = z, S = s | X ; η ) = (cid:80) ma =2 exp( ψ a + φ (cid:48) a X ))+ (cid:80) h b =2 exp( ψ b + φ (cid:48) b X ))+ (cid:80) ma =2 (cid:80) hzb =2 exp( ψ ab + φ (cid:48) ab X ) z = 1 , s = 1 exp( ψ z + φ (cid:48) z X )1+ (cid:80) ma =2 exp( ψ a + φ (cid:48) a X ))+ (cid:80) h b =2 exp( ψ b + φ (cid:48) b X )+ (cid:80) ma =2 (cid:80) hzb =2 exp( ψ ab + φ (cid:48) ab X ) z (cid:54) = 1 , s = 1 exp( ψ s + φ (cid:48) s X )1+ (cid:80) ma =2 exp( ψ a + φ (cid:48) a X ))+ (cid:80) h b =2 exp( ψ b + φ (cid:48) b X )+ (cid:80) ma =2 (cid:80) hzb =2 exp( ψ ab + φ (cid:48) ab X ) z = 1 , s (cid:54) = 1 exp( ψ zs + φ (cid:48) zs X i )1+ (cid:80) ma =2 exp( ψ a + φ (cid:48) a X ))+ (cid:80) h b =2 exp( ψ b + φ (cid:48) b X )+ (cid:80) ma =2 (cid:80) hzb =2 exp( ψ ab + φ (cid:48) ab X ) , z (cid:54) = 1 , s (cid:54) = 1(14)where η = { ( ψ ab , φ ab ) : ( a, b ) (cid:54) = (1 , } . Hence, we can obtain e ( z ; X, η ) = P ( Z = z | X ; η ) = h z (cid:88) s =1 P ( Z = z, S = s | X ; η )and g ( s ; z, X, η ) = P ( S = s | Z = z, X ; η ) = P ( Z = z, S = s | X ; η ) P ( Z = z | X ; η ) . Alternatively, a multinomial logistic assignment model can be first fitted at hospi-tal level to estimate the quantities e ( z ; X, η ), after which surgeon level multinomialassignment models are fitted conditionally on each hospital to estimate g ( s ; z, X, η ).We denote the fitted values for the expected outcomes µ i ( z, s ; θ ) = E ( Y i | Z i = z, S i = s, x i ; θ ). Under the mixed-effects model these are obtained by using empiricalBayes prediction for the random hospital and surgeon-level intercepts (e.g Skrondaland Rabe-Hesketh, 2004, Chapter 7). Further, we denote V [ Y ; θ, η ] = ω ( θ, η ) + ω ( θ, η ) + ω ( θ, η ) + ω ( θ, η ) for the four terms in the parametrized version of the12ariance decomposition (7). The first (case-mix) component can now be estimatedby ω (ˆ θ, ˆ η ) = 1 n − n (cid:88) i =1 (cid:26) (cid:88) z (cid:88) s µ i ( z, s ; ˆ θ ) g ( s ; z, x i , ˆ η ) e ( z ; x i , ˆ η ) − n n (cid:88) i (cid:48) =1 (cid:88) z (cid:88) s µ i ( z, s ; ˆ θ ) g ( s ; z, x i (cid:48) , η ) e ( z ; x i (cid:48) , ˆ η ) (cid:27) . The second (between-hospital) component can be estimated by ω (ˆ θ, ˆ η ) = 1 n n (cid:88) i =1 (cid:26) (cid:88) z (cid:20) (cid:88) s µ i ( z, s ; ˆ θ ) g ( s ; z, x i , ˆ η ) (cid:21) e ( z ; x i , ˆ η ) − (cid:20) (cid:88) z (cid:88) s µ i ( z, s ; ˆ θ ) g ( s ; z, x i , ˆ η ) e ( z ; x i , ˆ η ) (cid:21) (cid:27) . The second (between-surgeon) component can be estimated by ω (ˆ θ, ˆ η ) = 1 n n (cid:88) i =1 (cid:26) (cid:88) z (cid:20) (cid:88) s µ i ( z, s ; ˆ θ ) g ( s ; z, x i , ˆ η ) − (cid:18) (cid:88) s µ i ( z, s ; ˆ θ ) g ( s ; z, x i , ˆ η ) (cid:19) (cid:21) e ( z ; x i , ˆ η ) (cid:27) . The fourth (residual) variance can be estimated by subtracting the sum of the abovethree components from the empirical marginal variance, or alternatively, based on thedistributional assumption in the outcome model. In particular, for a binary outcomewe have V ( Y i | Z i = z, S i = s, x i ; θ ) = µ i ( z, s ; θ )[1 − µ i ( z, s ; θ )], and the residualvariance component is given by ω (ˆ θ, ˆ η ) = 1 n n (cid:88) i =1 (cid:26) (cid:88) z (cid:88) s µ i ( z, s ; ˆ θ )[1 − µ i ( z, s ; ˆ θ )] g ( s ; z, x i , ˆ η ) e ( z ; x i , ˆ η ) (cid:27) . If the parameters can be estimated consistently such that ˆ θ p → θ and ˆ η p → η , thecomponent estimators ω j (ˆ θ, ˆ η ), j ∈ { , , , } , are also consistent by the continuousmapping theorem and applying the law of large numbers for the sample averagesover the empirical covariate distribution. We will briefly investigate their asymptoticnormality through simulation in Section 4, but the variance estimation approach wepropose in 3.2 does not make use of asymptotic normality.Alternatively to the above model-based estimators, as noted in Section 2.2, a semi-parametric estimation procedure is suggested by decomposition (8), based on threedifferent outcome models. The semi-parametric approach is applicable to the decom-position of the empirical marginal variance, while the model-based approach can alsobe used to estimate decomposition under hypothetical assignment mechanisms. Wewill briefly compare the two approaches in the simulation study of Section 4.13 .2 Variance estimation The point estimators suggested in Section 3.1, are entirely model-based, conditionalon the empirical covariate distribution. Because the model components correspondto the factorization of the likelihood, we can evaluate the uncertainty in the variancecomponent estimates via approximate Bayesian inference. This is based on drawingsamples from the joint posterior distribution of the parameters θ and η , given by f ( θ, η | Y , Z , S , X ) = f ( Y , Z , S | X , θ, η ) f ( θ | X ) f ( η | X ) f ( Y , Z , S | X )= f ( Y | θ, Z , S , X ) f ( θ | X ) f ( Y | Z , S , X ) f ( Z , S | η, X ) f ( η | X ) f ( Z , S | X )= f ( θ | Y , Z , S , X ) f ( η | Z , S , X ) . Posterior samples for the variance components can be obtained by sampling θ and η from their posterior distributions, and recalculating ω ( θ, η ) , ω ( θ, η ) , ω ( θ, η ) and ω ( θ, η ) for each draw. For the outcome model parameters θ , we approximated theposterior using the parametric bootstrap, by resampling outcomes from the fittedmodel, refitting the model and calculating new fitted values µ i ( z, s ; θ ). For the as-signment model parameters, we used the normal approximation to sample the η froma multivariate normal distribution M V N (ˆ η, V (ˆ η )), where ˆ η is the maximal likeli-hood estimator and V (ˆ η ) is the asymptotic variance-covariance matrix via the fittedmultinomial logistic regression. We used simulation to study the properties of the methods proposed in Section 3. Theobjectives for the simulation study were to (a) study the asymptotic properties (con-sistency, asymptotic normality) of the proposed point estimators, (b) to demonstratethat the new four-way decomposition is consistent with our previously proposed three-way decomposition, and (c) to compare the model-based estimators to the alternativesemi-parametric decomposition to verify that both are estimating the same quantity.We used a data-generating mechanism similar to Figure 1, omitting the variables I and H for simplicity. The asymptotic behavior of the estimators was studies by vary-ing the total number of hospitals m , the total number of surgeons q , and the totalnumber of patients n . We generated two patient case-mix factors, X ∼ N (0 ,
1) and X ∼ Bernoulli(0 . Z ) and surgeon ( S ) assignments were generatedbased on multinomial logistic model (14), where the intercepts were generated from N (0 , .
25) and coefficients from N (0 , . E [ Y ( z, s ) | X ] = α z + γ zs + X + 2 X α s and surgeon effects γ zs were generated independentlyfrom N (0 , Y ( z, s ) = E [ Y ( z, s ) | X ] + ε , where ε ∼ Logistic(0 , { Y ( z,s ) ≥ } . The observed outcomes were takes to be Y = Y ( Z, S ). For estima-tion, we fitted mixed effect logistic models with nested random effects as in (13) andmultinomial assignment models as in (14). The resulting estimates for the variancecomponents were compared to the true values calculated under the above specifiedparameter values.
The bars in Figure 2 show the simulated sampling distribution means for the threevariance components for the binary outcomes under different combinations of n (totalnumber of patients), m (total number of hospitals), and q (total number of surgeons),based on 1000 replications. The 95% quantile interval for the sampling distributionis represented by the black error bar. The 95% confidence interval for the mean isrepresented by the blue error bar; this reflects the Monte Carlo error in the estimatedmean of the sampling distribution. The red dots indicate the true values of thevariance components. From the results, we can observe that the between-hospitaland case-mix components are well estimated under all scenarios. Accurate estimationof the within-hospital between-surgeon component requires sufficient surgeon-specificpatient volumes, which are the largest under the scenario n = 5000, m = 5 and q = 25,explaining the more precise estimate for this variance component therein. Figure 3shows density plots for the simulated sampling distributions of the three variancecomponents with varying n and fixed m and q . These are fairly normal-shaped, anddemonstrate decreasing variability with increasing number of patients.The gray bars in Figure 4a show the simulated sampling distribution means for theestimated components of the three-way decomposition (2), and the white bars showthe corresponding components obtained through the four-way decomposition (7) byadding up the third (between surgeon) and fourth (residual) variance components.The estimates are similar, demonstrating that the new between-surgeon variancecomponent is part of the residual variance in the three-way decomposition.The gray bars in Figure 4b show the simulated sampling distribution means forthe three variance components using the model-based formulation (7) and the whitebars the alternative semi-parameric formulation (8). The point estimates, as wellas their variability are similar under both approaches, demonstrating that both areappropriate for the point estimation. The methods in the present paper are aimed at helping to assess the usefulness ofa given process or outcome in constructing a disease-specific quality indicator for15 l l
Components V a r i an c e E s t i m a t e s n=2500, m=5, q=25 ll l Components V a r i an c e E s t i m a t e s n=5000, m=5, q=25 ll l Components V a r i an c e E s t i m a t e s n=5000, m=5, q=50 ll l Components V a r i an c e E s t i m a t e s n=5000, m=10, q=50 Figure 2: Simulated sampling distribution means for the three variance components(without residual variance) under the random-effect model for the binary outcomesunder different combinations of n (total number of patients), m (total number ofhospitals), and q (total number of surgeons), based on 1000 replications. The red dotsindicate the true variances. The 95% quantile interval of the sampling distributionis represented by the black error bar. The 95% confidence interval for the mean isrepresented by the blue error bar, reflecting the Monte Carlo error in the estimatedmean of the sampling distribution.identifying performance related between hospital and between surgeon variation. Al-though here we focused on two levels of hierarchical clustering due to the motivating16 Estimated Hospital Variance den s i t y n m=5, q=25 Estimated Patient Variance den s i t y n m=5, q=25 Estimated Surgeon Variance den s i t y n m=5, q=25 Figure 3: Density plots for the simulated sampling distributions of the case-mix,between-hospital, and between-surgeon variance components with a binary outcome,based on 1000 replications.application, the proposed four-way variance decomposition proposed here could begeneralized to arbitrary number of levels of hierarchical clustering, by introducingfurther conditioning variables. The additional variance components will come outof the residual variation for lower level clusters introduced, as we observed for thesurgeons within hospitals. While in the present context there are no more lower levelclusters to introduce, we could introduce higher level clusters such as the Local HealthIntegration Networks (LHINs) which are health administrative subregions in Ontario.For estimation of the variance decompositions, we used nested random effect mod-els, as these can easily accommodate large number of small categories without iden-tifiability problems that would be present with corresponding fixed effect models.For simplicity, we also omitted hospital-case-mix and surgeon-case-mix interactionterms from the models; however, in principle these can be easily incorporated, eitherthrough fixed or random effects, as the interpretation of the variance decompositionsis separate from the parametrization of the hospital and surgeon effects. In fact,17 lll ll
Components V a r i an c e E s t i m a t e s Method l l
Four−way Three−way (a) llll ll
Components V a r i an c e E s t i m a t e s Method l l (7) (8) (b)
Figure 4: Panel (a): Simulated sampling distribution means for the variance com-ponents of the three-way decomposition (2), and the same components estimatedthrough the four-way decomposition (7). The red dots indicate the true variances.The 95% quantile interval of the sampling distribution is represented by the black er-ror bar. The 95% confidence interval for the mean is represented by the blue error bar,reflecting the Monte Carlo error in the estimated mean of the sampling distribution.Panel (b): Simulated sampling distribution means for the three variance components(without residual variance) estimated using the model-based formulation (7) and thealternative semi-parameric formulation (8), based on 1000 replications. The red dotsindicate the true variances. The 95% quantile interval of the sampling distributionis represented by the black error bar. The 95% confidence interval for the mean isrepresented by the blue error bar, reflecting the Monte Carlo error in the estimatedmean of the sampling distribution.the formulation of the causal variance decompositions does not dictate what kind ofmodels are used to estimate the predictive means/probabilities needed for calcula-tion of the decomposition. Instead of parametric models, predictions derived throughmachine learning algorithms might be useful as well, though it is an open questionhow well these can capture the effects of large number of levels in the categoricalexposures. In the context of multi-category categorical exposures, the componentsin the causal variance decomposition can be seen as a way to concisely summarize alarge number of pairwise causal contrasts. In principle the same approach could alsobe extended to other types of exposures, including continuous and function valued18xposures, which is one further research direction we are pursuing.We borrowed nested potential outcome notation from causal mediation analysisand instrumental variable estimation literature to represent the nested exposure lev-els. However, although the path hospital → surgeon → outcome in the causal diagram1 resembles mediation, we note that the current problem is not a mediation problem,due to the surgeons being nested within the hospitals by definition in our analysis.Because of this, the surgeon effect is separate from the hospital effect, rather than acomponent of it. However, in Daignault et al. (2019) we proposed methodology formediation analysis in the quality of care context, with the aim of quantifying howmuch of between hospital differences in an outcome type measure could be accountedfor by of between hospital differences in a process type measure considered as a me-diator, using the hospital → minimally invasive surgery → length of stay pathway asan example. This raises the question of decomposing between hospital variation in amediation analysis sense. This problem has connections to various R and effect sizetype measures that have been suggested in the psychometric literature for measuringmediation in the linear structural equation modeling framework (e.g. de Heus, 2012;Lachowicz et al., 2018; Mioˇcevi´c et al., 2018). We are currently working on extend-ing the causal variance decomposition approach to allow for decomposing betweenhospital variance into direct and indirect effects. Acknowledgement
This work was supported by a Discovery Grant from the Natural Sciences and En-gineering Research Council of Canada (to OS) and the Ontario Institute for CancerResearch through funding provided by the Government of Ontario (to BC).
References
Angrist, J. D., Imbens, G. W., and Rubin, D. B. (1996). Identification of causaleffects using instrumental variables.
Journal of the American statistical Association ,91(434):444–455.Bell, A., Fairbrother, M., and Jones, K. (2019). Fixed and random effects models:making an informed choice.
Quality & Quantity , 53(2):1051–1074.Bowsher, C. G. and Swain, P. S. (2012). Identifying sources of variation and the flowof information in biochemical networks.
Proceedings of the National Academy ofSciences , 109(20):E1320–E1328.Chen, B., Lawson, K. A., Finelli, A., and Saarela, O. (2019). Causal variance decom-positions for institutional comparisons in healthcare.
Statistical methods in medicalresearch , page 0962280219880571. 19larke, P., Crawford, C., Steele, F., and Vignoles, A. (2015). Revisiting fixed-andrandom-effects models: some considerations for policy-relevant education research.
Education Economics , 23(3):259–277.Daignault, K., Lawson, K. A., Finelli, A., and Saarela, O. (2019). Causal mediationanalysis for standardized mortality ratios.
Epidemiology , 30(4):532–540.Daniels, M. J. and Gatsonis, C. (1999). Hierarchical generalized linear models in theanalysis of variations in health care utilization.
Journal of the American StatisticalAssociation , 94(445):29–42.de Heus, P. (2012). R squared effect-size measures and overlap between direct andindirect effect in mediation analysis.
Behavior Research Methods , 44(1):213–221.Dieleman, J. L. and Templin, T. (2014). Random-effects, fixed-effects and the within-between specification for clustered data in observational health studies: A simula-tion study.
PLoS One , 9(10):e110257.Donabedian, A. (1988). The quality of care. How can it be assessed?
JAMA ,260(12):1743–1748.Farrell, P. J., Groshen, S., MacGibbon, B., and Tomberlin, T. J. (2010). Outlierdetection for a hierarchical bayes model in a study of hospital variation in surgicalprocedures.
Statistical methods in medical research , 19(6):601–619.Fisher, A., Rudin, C., and Dominici, F. (2019). All models are wrong, but many areuseful: Learning a variables importance by studying an entire class of predictionmodels simultaneously.
Journal of Machine Learning Research , 20(177):1–81.Goldstein, H. and Spiegelhalter, D. J. (1996). League tables and their limitations:statistical issues in comparisons of institutional performance.
Journal of the RoyalStatistical Society, Series A , 159:385–443.Hawley, S. T., Hofer, T. P., Janz, N. K., Fagerlin, A., Schwartz, K., Liu, L., Deapen,D., Morrow, M., and Katz, S. J. (2006). Correlates of between-surgeon variationin breast cancer treatments.
Medical Care , pages 609–616.Hern´an, M. A. and Robins, J. M. (2006). Estimating causal effects from epidemio-logical data.
Journal of Epidemiology and Community Health , 60:578–586.Lachowicz, M. J., Preacher, K. J., and Kelley, K. (2018). A novel measure of effectsize for mediation analysis.
Psychological Methods , 23(2):244.Lawson, K. A., Saarela, O., Abouassaly, R., Kim, S. P., Breau, R. H., and Finelli, A.(2017). The impact of quality variations on patients undergoing surgery for renalcell carcinoma: a national cancer database study.
European urology , 72(3):379–386.20ongford, N. T. (1987). A fast scoring algorithm for maximum likelihood estimation inunbalanced mixed models with nested random effects.
Biometrika , 74(4):817–827.Merlo, J., Chaix, B., Ohlsson, H., Beekman, A., Johnell, K., Hjerpe, P., ˙Rstam,L., and Larsen, K. (2006). A brief conceptual tutorial of multilevel analysis insocial epidemiology: using measures of clustering in multilevel logistic regressionto investigate contextual phenomena.
Journal of Epidemiology and CommunityHealth , 60(4):290–297.Mioˇcevi´c, M., ORourke, H. P., MacKinnon, D. P., and Brown, H. C. (2018). Statisticalproperties of four effect-size measures for mediation models.
Behavior researchmethods , 50(1):285–301.Norberg, R. (1986). Hierarchical credibility: analysis of a random effect linear modelwith nested classification.
Scandinavian Actuarial Journal , 1986(3-4):204–222.Rabe-Hesketh, S., Skrondal, A., and Pickles, A. (2005). Maximum likelihood estima-tion of limited and discrete dependent variable models with nested random effects.
Journal of Econometrics , 128(2):301–323.Racz, M. J. and Sedransk, J. (2010). Bayesian and frequentist methods for providerprofiling using risk-adjusted assessments of medical outcomes.
Journal of the Amer-ican Statistical Association , 105:48–58.Rubin, D. B. and Rosenbaum, P. R. (1983). The central role of the propensity scorein observational studies for causal effects.
Biometrika , 70(1):41–55.Shahian, D. M. and Normand, S.-L. T. (2008). Comparison of “risk-adjusted” hospitaloutcomes.
Circulation: Journal of the American Heart Association , 117:1955–1963.Skrondal, A. and Rabe-Hesketh, S. (2004).
Generalized latent variable modeling:Multilevel, longitudinal, and structural equation models . Crc Press.VanderWeele, T. and Vansteelandt, S. (2009). Conceptual issues concerning media-tion, interventions and composition.
Stat Interface , 2:457–468.VanderWeele, T., Vansteelandt, S., and Robins, J. (2014). Effect decomposition inthe presence of an exposure-induced mediator-outcome confounder.
Epidemiology ,25(2):300–306.Varewyck, M., Goetghebeur, E., Eriksson, M., and Vansteelandt, S. (2014). Onshrinkage and model extrapolation in the evaluation of clinical center performance.
Biostatistics , 15(4):651–664. 21ood, L., Bjarnason, G. A., Black, P. C., Cagiannos, I., Heng, D. Y., Kapoor,A., Kollmannsberger, C. K., Mohammadzadeh, F., Moore, R. B., Rendon, R. A.,Soulieres, D., Tanguay, S., Venner, P., Jewett, M., and Finelli, A. (2013). Using theDelphi technique to improve clinical outcomes through the development of qualityindicators in renal cell carcinoma.