A regression framework for a probabilistic measure of cost-effectiveness
Nicholas Illenberger, Nandita Mitra, Andrew J. Spieker
University of Pennsylvania, Department of Biostatistics, Epidemiology, and Informatics, Philadelphia, PA
Vanderbilt University Medical Center, Department of Biostatistics, Nashville, TN
Abstract
To make informed health policy decisions regarding a treatment, we must consider both its cost and its clinical effectiveness. In past work, we introduced the net benefit separation (NBS) as a novel measure of cost-effectiveness. The NBS is a probabilistic measure that characterizes the extent to which a treated patient will be more likely to experience benefit as compared to an untreated patient. Due to variation in treatment response across patients, uncovering factors that influence cost-effectiveness can assist policymakers in population-level decisions regarding resource allocation. In this paper, we introduce a regression framework for NBS in order to estimate covariate-specific NBS and find determinants of variation in NBS. Our approach is able to accommodate informative cost censoring through inverse probability weighting techniques, and addresses confounding through a semiparametric standardization procedure. Through simulations, we show that NBS regression performs well in a variety of common scenarios. We apply our proposed regression procedure to a realistic simulated data set as an illustration of how our approach could be used to investigate the association between cancer stage, comorbidities, and cost-effectiveness when comparing adjuvant radiation therapy and chemotherapy in post-hysterectomy endometrial cancer patients.
Introduction

Cost-effectiveness analyses are a useful tool for aggregating information on the differences in cost and clinical effectiveness between two comparator health interventions. Typically, a cost-effectiveness analysis relies on defining a particular kind of summary measure, which may be used to optimize the tradeoff between treatment costs and effectiveness. Historically, most analyses have involved either the incremental cost-effectiveness ratio (ICER) or the equivalent net monetary benefit (NMB). Each of these compares the mean difference in cost and the mean difference in some clinical measure between treatments. Specifically, the ICER is defined as the ratio of these two quantities and therefore has units of cost per unit of the clinical measure (for instance, $USD per year, if the clinical outcome is marked by post-treatment survival time). The experimental treatment is declared cost-effective if the ICER lies below a certain threshold, known as the willingness-to-pay (WTP). The NMB was introduced by Stinnett and Mullahy (Stinnett & Mullahy, 1998) as a refined, yet equivalent measure of cost-effectiveness that suffers from neither a singularity in the denominator nor the known poor statistical properties associated with ratio quantities.

When analyzing medical costs that are accumulated over time, censoring poses specific barriers to straightforward estimation of mean costs. Correlation between costs at time of censoring and those at end-of-study precludes the use of methods which rely on the non-informative censoring assumption. This challenge is most commonly addressed by inclusion of inverse probability of censoring weights (IPCW) (Bang & Tsiatis, 2000). Li et al.
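As a concrete illustration of these two summary measures, the sketch below computes the ICER and the NMB from mean cost and effectiveness differences. The numbers are hypothetical, and NMB is written in the common form NMB(λ) = λ·ΔE − ΔC:

```python
# Hypothetical summary statistics: mean cost difference (USD) and mean
# effectiveness difference (life-years) between treatment and control.
delta_cost = 20_000.0   # treatment costs $20,000 more on average
delta_effect = 0.5      # treatment adds 0.5 life-years on average

# ICER: incremental cost per unit of clinical benefit.
icer = delta_cost / delta_effect        # $40,000 per life-year gained

# NMB at a willingness-to-pay (WTP) of lambda dollars per life-year.
wtp = 50_000.0
nmb = wtp * delta_effect - delta_cost   # $5,000

# Treatment is declared cost-effective when ICER < WTP or, equivalently
# (provided delta_effect > 0), when NMB > 0.
print(icer, nmb, icer < wtp, nmb > 0)
```

Note that the NMB formulation avoids the ratio entirely, which is what sidesteps the singularity when ΔE is near zero.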
expanded this approach with a doubly robust method to account for exposure-outcome confounding in settings where data are derived from observational studies (Li, Vachani, Epstein, & Mitra, 2018).

In practice, the extent of cost-effectiveness associated with a treatment may vary across levels of an observed covariate. As an example, consider the setting of endometrial cancer, in which treatment often involves total hysterectomy. Following surgery, there is a large degree of variation in the treatment a patient receives, with some receiving adjuvant radiation, some receiving adjuvant chemotherapy, and some receiving neither (Latif, Haggerty, Jean, Lin, & Ko, 2014). While costs associated with radiation and chemotherapy are certain to exceed those associated with a control condition, the survival benefit associated with either radiation or chemotherapy relative to neither could reasonably be expected to vary by patient covariates, e.g., cancer stage. Specifically, those with a lower cancer stage (and therefore, a lower overall risk of recurrence) may not experience the same degree of survival benefit from adjuvant treatment as compared to those with a higher cancer stage.

In previous work, regression methods for NMB have been developed to compare cost-effectiveness across subgroups defined by observed covariates. Willan et al. (Willan, Briggs, & Hoch, 2004) show how NMB may be characterized as a function of the parameters in a regression of cost and effectiveness outcomes on treatment and observed covariates. Under the assumption of bivariate normal errors, the authors proposed a hypothesis test for the effects of covariates on NMB. Because cost distributions are typically skewed, Nixon et al. presented an alternative approach which could accommodate more flexible error structures (Nixon & Thompson, 2005). While positing regression models similar to those of Willan et al.
(Willan et al., 2004), their method implements Markov chain Monte Carlo methods to accommodate gamma and log-normal distributed errors. These approaches are useful if we are interested in how the cost-effectiveness of an intervention (as measured by NMB or ICER) varies across levels of measured covariates.

In recent work, Spieker et al. (Spieker, Illenberger, Roy, & Mitra, 2021) introduced a novel probabilistic measure of cost-effectiveness, the net benefit separation (NBS). The NBS characterises the stochastic ordering of individual net benefits (INB) between treated and untreated populations. For a given WTP, INB is defined as the difference between the cost a payer is willing to incur for an observed effectiveness measure and that which is truly incurred. Compared with NMB and ICER, which are functions of the average difference in cost and effectiveness measures between treatments, NBS measures the probability that a patient receiving treatment will experience greater treatment benefit than a patient receiving control. Because the distributions of medical costs, and consequently INBs, are often skewed, cost-effectiveness comparators that do not rely on the average values of cost or measures of effectiveness can be useful. In settings where treatment is randomized, NBS can be estimated nonparametrically using a scaled variant of the Wilcoxon rank-sum statistic.

As with NMB and ICER, extensions of NBS to settings with informative cost censoring and confounding have been proposed. Spieker et al. (Spieker et al., 2021) introduce a semiparametric Monte Carlo standardization procedure to estimate NBS while adjusting for differences in the distribution of confounding variables across treatment arms. Models used within the standardization procedure may be fit using IPCW techniques if there are censored cost data. However, no methods currently exist which can explain variability in NBS using covariates.
In this paper, we introduce regression methodology to account for variability in the NBS using observed covariates. In previous work, Pepe et al. developed regression methods for measures of stochastic ordering for use in diagnostic testing settings (Pepe, 1997). Alonzo and Pepe later extended this work to improve computational efficiency in large samples (Alonzo & Pepe, 2002). Because the NBS is defined as the stochastic ordering of INBs between treatment groups, we are able to embed their approach within our proposed methodology. We also describe how Monte Carlo standardization and IPCW methods may be used to account for censoring and confounding when performing NBS regression. Additionally, we propose a hypothesis testing framework for determining the strength of evidence for the effects of covariates on NBS.

The remainder of this paper is organized as follows: In Section 2 we review the net benefit separation as proposed by Spieker et al. (Spieker et al., 2021). In Section 3 we define a conditional variant of the NBS and introduce our regression-based estimation procedure. In Section 4 we conduct simulation studies to examine the finite sample properties of this estimator. In Section 5, we use our proposed approach to analyze the cost-effectiveness of adjuvant treatments for endometrial cancer. In the final section, we summarise the utility of our proposed cost-effectiveness parameters and discuss limitations of our methodology.
Net benefit separation
Assume we have individual patient data on costs and treatment effectiveness. Let A be a binary treatment indicator, Z be a measure of effectiveness, Y be a measure of medical cost, and let i = 1, ..., n index sample units. In a typical setting, Z and Y may represent survival time and medical cost in USD ($). For a predefined WTP, λ, INB is defined as B(λ) = λZ − Y. We assume without loss of generality that larger values of Z are desirable, so that greater INB indicates greater treatment benefit. If we let B_a(λ) represent a randomly sampled INB from a hypothetical population receiving treatment A = a, then NBS is defined as

θ(λ) = P(B_1(λ) > B_0(λ)).

NBS can be interpreted as the probability that a randomly sampled unit from a population receiving treatment will have a greater INB than an independently sampled unit from a control population. For a particular λ, if θ(λ) = 0.5, then the treatments are equally cost-effective with respect to NBS. Note that θ(λ) > 0.5 does not imply E[B_1(λ)] > E[B_0(λ)], i.e., cost-effectiveness with respect to NBS does not imply greater expected INB. For example, if the distribution of INBs is skewed, then the expected benefit under control may be larger than that under treatment even if the majority of patients receiving treatment experience greater benefit than those receiving control. Thus, NBS can help provide a more complete view of the range of patient experiences, particularly when cost and effectiveness metrics are skewed.

For simplicity, we first consider estimation in a setting where treatment is randomized and with no censoring. Here, the observed distributions of INBs in the treatment and control arms form nonparametric estimates of the distributions of B_1(λ) and B_0(λ). Thus, NBS can be consistently estimated using a scaled variant of the Wilcoxon rank-sum statistic

θ̂(λ) = (1 / (N_0 N_1)) [ Σ_{i=1}^{N} A_i R_i(λ) − N_1(N_1 + 1)/2 ],

where N_a is the number of participants receiving treatment A = a and R_i(λ) is the pooled rank of the INB for the i-th individual. Spieker et al. proposed a flexible standardization approach to estimate the NBS in the presence of measured confounders (Spieker et al., 2021). By modelling the joint distribution of observed variables, they are able to apply Monte Carlo methodology to sample INBs from treated and untreated populations which have the same underlying distribution of confounders. Applying the scaled Wilcoxon rank-sum statistic to the generated INBs provides an estimate of NBS that accounts for measured confounding. If data are censored, then IPCW can be incorporated in the modeling step to account for censored cost data. In the section that follows, we will extend this methodology to evaluate the association between measured covariates and the NBS.

NBS regression

In this section we introduce regression methodology for NBS.
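As a point of reference before developing the regression extension, the unconditional estimator reviewed above can be sketched as follows (this is an illustrative sketch with assumed distributions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def nbs_wilcoxon(inb_treated, inb_control):
    """Scaled Wilcoxon rank-sum estimate of theta(lambda): the proportion
    of (treated, control) pairs whose INBs satisfy B1 > B0."""
    b1 = np.asarray(inb_treated, dtype=float)
    b0 = np.asarray(inb_control, dtype=float)
    pooled = np.concatenate([b1, b0])
    ranks = pooled.argsort().argsort() + 1.0    # pooled ranks (continuous data, no ties)
    n1, n0 = len(b1), len(b0)
    u = ranks[:n1].sum() - n1 * (n1 + 1) / 2.0  # Mann-Whitney U for the treated arm
    return u / (n0 * n1)

# INBs B(lambda) = lambda * Z - Y under assumed, illustrative distributions.
lam = 2.0
z1, z0 = rng.exponential(6.0, 2000), rng.exponential(5.0, 2000)        # effectiveness
y1, y0 = rng.lognormal(1.5, 0.5, 2000), rng.lognormal(1.0, 0.5, 2000)  # cost
theta_hat = nbs_wilcoxon(lam * z1 - y1, lam * z0 - y0)

# Sanity check: with no ties, the rank statistic equals the brute-force
# proportion of pairs with B1 > B0.
brute = (np.subtract.outer(lam * z1 - y1, lam * z0 - y0) > 0).mean()
print(theta_hat, abs(theta_hat - brute) < 1e-12)
```

The rank-based form avoids the O(N_0 N_1) pairwise comparison while producing the identical estimate.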
Section 3.1 defines a conditional variant of NBS and describes procedures for estimation and hypothesis testing when treatment is randomized. In Section 3.2 we propose a Monte Carlo standardization procedure to address measured confounding. Section 3.3 discusses how to incorporate IPCW to account for informative cost censoring.
NBS regression under randomization

Suppose X is some covariate of interest which may explain variation in the NBS. In an endometrial cancer setting, X may represent cancer stage or age at diagnosis. We want to assess how the cost-effectiveness of a treatment changes with respect to X. Specifically, we may seek to (1) characterize the association between X and NBS, (2) assess the NBS across levels of X, and (3) evaluate the strength of evidence for association. When treatment is randomized and X is discrete, the first two goals can be accomplished by estimating θ(λ | X = x) using the scaled Wilcoxon rank-sum statistic within subsets defined by X = x. On the other hand, if X is continuous, then this is not possible without first discretizing X. In either case, there are currently no methods to assess the strength of association between X and the NBS. In this section, we introduce a procedure that can accomplish all three of the stated goals for continuous or discrete X.

For a given λ, the NBS conditional on X is defined as

θ(λ | X) = P(B_1(λ) > B_0(λ) | X). (1)

It can be shown that the conditional NBS is equivalent to

θ(λ | X) = ∫_0^1 S_{1|X}(S_{0|X}^{-1}(ω)) dω, (2)

where S_{a|X}(u) = P(B_a(λ) > u | X) denotes the conditional survivor function for INB in a hypothetical population receiving treatment A = a and S_{a|X}^{-1}(u) is the quantile function for the same population. To estimate θ(λ | X), we build methodology parallel to that used by Alonzo and Pepe (Alonzo & Pepe, 2002) for ROC regression. Earlier work by Pepe (Pepe, 2000) noted that if we define U_ij = 1(B_i(λ) > B_j(λ)), where i and j index treated and untreated patients, then S_{1|x}(S_{0|x}^{-1}(ω_j)) is equivalent to the expectation of U_ij conditional on S_{0|x}(B_j) = ω_j and X = x. Given estimated quantiles ω̂_j, standard GLM procedures can be used to estimate S_{1|x}(S_{0|x}^{-1}(ω)). Because U_ij must be defined for every pair of treated and untreated units, this approach is intractable for large samples.
Instead, we use an alternative approach proposed by Alonzo and Pepe (Alonzo & Pepe, 2002) which improves on the efficiency of the original estimator. Let Ω be a finite set of values between 0 and 1. For ω ∈ Ω, if we define U_iω = 1(B_i(λ) > S_{0|X_i}^{-1}(ω)) for each unit i receiving treatment, then E[U_iω] = S_{1|X_i}(S_{0|X_i}^{-1}(ω)). For a given probability, ω ∈ Ω, this definition of U_iω gives the indicator of whether a treated unit's observed INB is greater than the ω-quantile of the distribution of INBs in an untreated population. In practice, because S_{0|X}^{-1}(ω) is unknown, U_iω is defined using a consistent estimator of the quantile function: U_iω = 1(B_i(λ) > Ŝ_{0|X_i}^{-1}(ω)). Alonzo and Pepe (Alonzo & Pepe, 2002) suggest using empirical estimates of the quantiles when possible (i.e., for discrete X) and quantile regression (Koenker & Bassett Jr, 1978) otherwise.

The conditional expectation of U_iω can be modeled using any flexible regression approach. One sensible modeling choice is the probit form: E[U_ω | X] = Φ(β_0 + β_1 X + β_2 Φ^{-1}(ω)), where Φ(·) is the cumulative distribution function of the standard normal distribution. This is the form that would arise if B_1(λ) and B_0(λ) were normally distributed or if they could be monotonically rescaled to be normally distributed. Hanley (Hanley, 1988) shows that when fitting ROC curves using the binormal form, estimates of AUC exhibit low bias even when the model is misspecified. Because we characterize NBS as the area under the fit binormal model, it can be seen as analogous to AUC. Once the conditional expectation has been modeled, NBS can be estimated through numerical integration using standard software (Piessens, de Doncker-Kapenga, Überhuber, & Kahaner, 2012). Under the probit model, our estimate is given by

θ̂(λ | X) = ∫_0^1 Φ(β̂_0 + β̂_1 X + β̂_2 Φ^{-1}(ω)) dω. (3)
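To make the placement-value mechanics concrete, the sketch below simulates binormal INBs with a binary covariate, builds the indicators U_iω from empirical control quantiles, fits the probit model with a small hand-rolled IRLS routine (only to stay dependency-free; in practice any GLM software would do), and integrates numerically. All distributions, sample sizes, and coefficient values are assumed for illustration:

```python
import math
import numpy as np

rng = np.random.default_rng(1)
Phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))

def phi_inv(p):
    """Standard normal quantile by bisection (avoids a scipy dependency)."""
    lo, hi = -8.0, 8.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if Phi(mid) < p else (lo, mid)
    return 0.5 * (lo + hi)

def fit_probit(D, y, iters=25):
    """Probit regression via iteratively reweighted least squares."""
    beta = np.zeros(D.shape[1])
    for _ in range(iters):
        eta = D @ beta
        p = np.clip(Phi(eta), 1e-10, 1 - 1e-10)
        dens = np.maximum(np.exp(-0.5 * eta**2) / math.sqrt(2 * math.pi), 1e-10)
        w = dens**2 / (p * (1 - p))   # GLM working weights
        z = eta + (y - p) / dens      # GLM working response
        beta = np.linalg.solve(D.T @ (w[:, None] * D), D.T @ (w * z))
    return beta

# Simulated INBs: B1 | X ~ N(0.5 + 0.5 X, 1), B0 | X ~ N(0, 1), X binary,
# so the probit model holds exactly with beta = (0.5, 0.5, 1).
n = 3000
x1, x0 = rng.binomial(1, 0.5, n), rng.binomial(1, 0.5, n)
b1 = rng.normal(0.5 + 0.5 * x1, 1.0)
b0 = rng.normal(0.0, 1.0, n)

# Build U_{i,omega}: does a treated INB exceed the empirical omega-quantile
# of the control survivor function within its own level of X?
omegas = np.arange(1, 20) / 20.0
rows_D, rows_y = [], []
for w_ in omegas:
    q = {x: np.quantile(b0[x0 == x], 1.0 - w_) for x in (0, 1)}  # S0^{-1}(omega)
    thresh = np.where(x1 == 1, q[1], q[0])
    rows_D.append(np.column_stack([np.ones(n), x1, np.full(n, phi_inv(w_))]))
    rows_y.append((b1 > thresh).astype(float))
beta = fit_probit(np.vstack(rows_D), np.concatenate(rows_y))

# Numerically integrate Phi(b0 + b1 x + b2 Phi^{-1}(omega)) over omega.
grid = (np.arange(200) + 0.5) / 200.0
ginv = np.array([phi_inv(g) for g in grid])
theta = {x: Phi(beta[0] + beta[1] * x + beta[2] * ginv).mean() for x in (0, 1)}
# Truth here: theta(x) = Phi((0.5 + 0.5 x) / sqrt(2)), roughly 0.64 and 0.76.
print(theta[0], theta[1])
```

As the text notes, model-based standard errors from this fit are invalid because each treated unit contributes one row per ω; the bootstrap described below is used for inference instead.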
Testing whether X influences θ(λ | X) is equivalent to testing whether the coefficient(s) associated with X are equal to zero. For example, under the probit model this is a test of whether β_1 = 0. Because multiple values U_iω are defined for each individual (one for each ω ∈ Ω), model-based variance estimates that assume independent observations are invalid. Instead, we propose a nonparametric bootstrap hypothesis test. An important consideration in the regression of NBS is the choice of Ω. For sets consisting of N_ω equally spaced points, Ω = {j/(N_ω + 1); j = 1, ..., N_ω}, Alonzo and Pepe found that even modest values of N_ω were able to achieve close to maximal efficiency while mitigating computational burden.

Adjustment for measured confounding

Suppose that there are measured covariates, L, which affect the probability of receiving treatment as well as the cost or effectiveness outcomes. Our proposed estimation procedure consists of two steps: a Monte Carlo standardization step, and a regression step. The first step constructs two pseudo-populations which differ in treatment status, but where the distribution of confounding variables, L, is equivalent. Let f̂(z, y | A, X, L) denote some model for the conditional distribution of cost and effectiveness measures. The pseudo-population under treatment A = a is built by drawing M observations from the empirical distribution of the baseline covariates, F̂(L, X), and then using the sampled values of L and X to sample from the fit model f̂(z, y | A = a, X, L). From these pseudo-populations, we may sample from the joint distributions of the INBs and X among treated and untreated units, (B_1(λ), X) and (B_0(λ), X) respectively. This methodology ensures that INBs are sampled from populations which receive different treatments, but have the same underlying distribution of measured confounders.
This procedure is semiparametric because, while parametric models may be used to fit the distributions of Z and Y, the joint distribution F(X, L) is estimated nonparametrically. The Monte Carlo standardization procedure is summarized below:

1. Model the conditional distributions f(Z | A, X, L) and f(Y | A, X, L, Z), where f(·) and F(·) denote the probability density function and the cumulative distribution function of a random variable. Let f̂(Z | A, X, L) and f̂(Y | A, X, L, Z) denote the estimated distributions.
2. For a = 0, 1:
   • Sample l̃_m and x̃_m for m = 1, ..., M from the empirical distribution F̂(L, X).
   • For m = 1, ..., M, sample z̃_m and ỹ_m from f̂(Z | A = a, x̃_m, l̃_m) and f̂(Y | A = a, x̃_m, l̃_m, z̃_m).
   • Calculate B̃_{a,m}(λ) for each pair (z̃_m, ỹ_m).

The models for f(Z | A, X, L) and f(Y | A, X, L, Z) can be flexibly estimated to reduce bias. For example, zero-inflated gamma regression models can be used to accommodate skewed cost distributions with structural zeros. By including effectiveness as a predictor of cost, this procedure allows for correlation between Z and Y.

The second step of our estimation procedure entails regressing the NBS using the sampled INBs. Because (B_1(λ), X) and (B_0(λ), X) are sampled from populations where treatment is independent of measured confounders, NBS regression proceeds as described in Section 3.1. The regression step is outlined below:

1. For each ω ∈ Ω, use the sampled values B̃_{0,m}(λ) to estimate S_{0|X}^{-1}(ω).
2. For each generated B̃_{1,m}(λ) and ω ∈ Ω, define U_mω = 1(B̃_{1,m}(λ) > Ŝ_{0|X}^{-1}(ω)).
3. Fit a model for E[U_mω | X].
4. For X = x, estimate the subgroup-specific NBS by numerically integrating:

θ̂(λ | X = x) = ∫_0^1 Ê[U_ω | X = x] dω.

Accounting for informative cost censoring

Up until this point, we have developed methods which are suitable for cases with no censoring, e.g., when Z is a continuous outcome with no missing data. Often, the effectiveness outcome of greatest clinical interest is time to death post treatment. Because patients followed over time may be lost to follow-up, we introduce methodology to account for censoring when estimating NBS. Suppose we are interested in estimating costs up until a maximum time, τ. For observation i, let C_i be the censoring time, T_i be the survival time, and T*_i = min(T_i, τ). Define δ_i = 1(T_i ≤ C_i) and δ*_i = 1(T*_i ≤ C_i) to be indicators of fully observed survival time and fully observed costs, respectively. The observable data are δ, δ*, and Z = min(T_i, C_i). Because costs at time of censoring can be correlated with costs at event time, it is necessary to use IPCW to estimate f(Y | A, X, L, Z). Let G(t) = P(t ≤ C) denote the probability that the censoring time is beyond t. If we assume T and C are independent, then G(t) can be modeled within each treatment group using the Kaplan-Meier product-limit estimator based on the data (C, δ*). If instead we assume that survival and censoring times are independent only once we condition on treatment status and other measured variables, W, then we can use the stratified Cox model to estimate G(t). Under this model, we assume that the hazard function for censoring is given by

h(t | V, W) = exp(η^T W) h_V(t),

where V denotes stratifying variables (treatment status, A, and other discrete covariates).
The coefficients in this model, η, can be estimated using the Cox partial likelihood (Cox, 1975), yielding the Breslow-type estimator

Ĝ(T*_i | V_i, W_i) = exp( − Σ_{j=1}^n [ δ_j 1(V_i = V_j, T*_i > X_j) exp(η̂^T W_i) / Σ_{k=1}^n 1(V_i = V_k, X_k ≥ X_j) exp(η̂^T W_k) ] ).

For each individual with uncensored cost data, the IPCW is defined as

w_i = δ*_i / Ĝ(T*_i).

The weights, w_i, can be incorporated into the Monte Carlo procedure when modeling f(Y | A, X, L, Z). For example, suppose we assume that costs follow a log-normal distribution. To fit f(Y | A, X, L, Z), we can perform weighted log-normal regression using only patients with fully observed cost data (i.e., those with δ*_i = 1). Any modeling procedure that incorporates weighting may be used to estimate the cost distribution. The resulting estimate, f̂(Y | A, X, L, Z), can be used in the standardization procedure described in Section 3.2.
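Under the simpler independent-censoring assumption, the Kaplan-Meier route to the weights can be sketched as follows. This is a minimal numpy-only sketch under assumed exponential survival and censoring distributions; the censoring distribution is estimated by treating censoring as the "event" in the product-limit estimator:

```python
import numpy as np

rng = np.random.default_rng(2)

def km_survivor(times, events):
    """Kaplan-Meier survivor estimate; 'events' here flags censoring events,
    since we are modeling the censoring time C."""
    order = np.argsort(times)
    t, d = times[order], events[order]
    uniq = np.unique(t[d == 1])
    s_vals, s = [], 1.0
    for u in uniq:
        at_risk = np.sum(t >= u)
        failures = np.sum((t == u) & (d == 1))
        s *= 1.0 - failures / at_risk
        s_vals.append(s)
    return uniq, np.array(s_vals)

def G_hat(uniq, s, t):
    """Evaluate the survivor curve just before t, i.e., an estimate of P(C >= t)."""
    idx = np.searchsorted(uniq, t, side="left") - 1
    return 1.0 if idx < 0 else s[idx]

# Simulated follow-up: T = survival, C = censoring, tau = cost horizon.
n, tau = 4000, 5.0
T = rng.exponential(4.0, n)
C = rng.exponential(10.0, n)
T_star = np.minimum(T, tau)
delta_star = (C >= T_star).astype(int)   # 1 = costs fully observed through T*
follow = np.minimum(T, C)
cens_event = (C < T).astype(int)         # censoring observed as the "event"

uniq, s = km_survivor(follow, cens_event)
obs = delta_star == 1
# IPCW for units with fully observed costs: w_i = 1 / G_hat(T*_i).
w = np.array([1.0 / G_hat(uniq, s, t) for t in T_star[obs]])
print(w.mean() * obs.mean())   # close to 1: weights re-inflate to the full sample
```

The weights would then enter as case weights in the chosen cost model (e.g., weighted log-normal regression on the δ* = 1 subsample), as described above.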
Simulation studies

We perform a simulation study to determine the finite-sample properties of the proposed regression estimator. Although data used for cost-effectiveness analyses often have thousands of observations, it is beneficial to understand how our methodology performs in both small and large sample settings. We consider simulation settings with low sample size (n = 500) and high sample size (n = 5000) as well as low (10%), medium (30%), and high (50%) rates of censoring. Finally, we consider the cases where the stratifying variable is and is not associated with θ(λ | X). For each simulation, we estimate the NBS under two WTP thresholds, λ = 2 and 12. These represent the cost a payer is willing to incur for a unit change in effectiveness.

We generate the data as follows: L ∼ N(0, 1), X ∼ Bernoulli(p_x), A ∼ Bernoulli(p_a = expit(L)), T ∼ Weibull(k = 2) with scale exp(4. + 0. A + 0. L + β_x X + β_ax A×X), C ∼ Weibull(k = 2) with scale exp(γ + 0. A), and Y ∼ Lognormal with location linear in T and A. When X is associated with θ(λ | X), we set β_x = 0. and β_ax = 0.5. Under these parameter values, θ(2 | X = 0) = 0.353 and θ(12 | X = 0) = 0.527, with larger values when X = 1: those with X = 1 benefit more from treatment (with respect to survival) than those with X = 0. The separation between the distributions of INBs between treated and untreated subjects is larger in a population where X = 1 than in one with X = 0. In the setting where θ(λ | X) does not depend on X, both β_x and β_ax are set to zero. In these simulations, θ(2 | X) = 0.353 and θ(12 | X) = 0.527 regardless of the value of X. In each setting, γ is set to values calibrated to achieve the low, medium, and high censoring rates; different values are required depending on whether X affects NBS.

To estimate θ(λ | X), we use the standardization procedure described in Section 3.2. A Weibull regression model is used to estimate the distribution of T and a log-normal model is used for costs. Because Y is informatively censored, we use IPCW when estimating the cost model. To estimate the probability of being uncensored, we use the Cox proportional hazards model. In the standardization step, we draw M = 5000 observations within each level of A. For the regression of S_{1|X}(S_{0|X}^{-1}(ω)), we obtain Ŝ_{0|X}^{-1}(ω) for ω ∈ Ω using the empirical quantiles within levels of X. The previously described probit link is used to model U_iω as a function of X. Standard errors are estimated as the standard deviation of B = 300 bootstrap replicates. The nominal type-I error rate is set to 0.05 for our hypothesis tests. Hypothesis tests are performed using symmetric 95% CIs based on the bootstrap replicates.

For each setting, we simulate 1000 datasets, estimate the NBS conditional on X, and perform the nonparametric bootstrap hypothesis test. We report the mean estimate of the NBS across simulations, the mean bootstrap estimated standard error of the NBS, the standard deviation of estimated NBS across simulations, and the proportion of hypothesis tests in which the null hypothesis was rejected. Tables 1 and 2 provide simulation results for the settings where X affects and does not affect the NBS, respectively.

Average point estimates of NBS are close to their true values for all tested values of X and λ. Bootstrap estimated standard errors closely approximate the empirical standard errors observed across simulations. Due to a lower effective sample size, standard errors increase as the proportion of censored units increases. Similarly, higher variability in the estimates of θ(λ | X = 0) can be attributed to the fact that there is a lower prevalence of X = 0 in our simulations. In simulations where X affects NBS, power is lowest when there is a sample size of 500, 50% censoring, and a WTP of 2; all other observed rejection rates are substantially higher. When X does not affect cost-effectiveness, the probability of rejecting the null is close to the desired 0.05 level.

Sensitivity to unmeasured confounding

In settings where we estimate NBS using data from observational databases, we must consider the potential effects of unmeasured confounding. Because the Monte Carlo procedure described in Section 3.2 only ensures that INBs are sampled from populations with the same underlying distribution of measured covariates, it is necessary to understand how estimates of NBS may be affected by unmeasured confounding. We perform simulations to determine how sensitive the proposed regression estimator of NBS is to unmeasured confounding. We simulate two standard normal variates, U_1 and U_2.
The confounder U_1 influences treatment probability and survival time, while U_2 affects treatment and cost. We consider two settings: (1) where only the exposure-survival relationship is confounded and (2) where only the exposure-cost relationship is confounded. Variables L, X, and C are simulated as in the previous simulations, and treatment is now generated as A ∼ Bernoulli(expit(L + γ_1 U_1 + γ_2 U_2)). T ∼ Weibull(k = 2) and Y ∼ Lognormal are generated as before, except that the log-scale linear predictor for T (with intercept 4.05) includes an additional term −η_1 U_1 and the location parameter for Y includes an additional term +η_2 U_2. The parameters γ_1, γ_2, η_1, and η_2 control the strength of unmeasured confounding. In simulations where the exposure-survival relationship is confounded, we examine low, medium, and high levels of confounding, with the strongest setting using γ_1 = 1 and η_1 = 0.3; the parameters γ_2 and η_2 equal zero in this setting. When the exposure-cost relationship is confounded, γ_1 = η_1 = 0, and low, medium, and high levels of confounding are simulated analogously, with the strongest setting using γ_2 = 1 and η_2 = 1.

We simulate 1000 datasets with 5000 observations and a 30% censoring rate. We do not account for U_1 or U_2 when estimating NBS using the Monte Carlo standardization procedure. Results for these simulations are provided in Table 3. As the level of confounding increases, the mean estimate of the NBS becomes more biased. When confounding is strongest, percent bias exceeds 10%. Because cost is given less weight in INB for large WTP, the bias from unmeasured exposure-cost confounding decreases as WTP increases. Similarly, bias increases with WTP when there is unmeasured confounding of the exposure-survival relationship. Bootstrap estimated standard errors are able to capture the true variability in estimates of NBS regardless of the strength of confounding.

Cost-effectiveness of adjuvant treatments for endometrial cancer

The standard treatment for early-stage endometrial cancer is complete hysterectomy. Patients sometimes receive additional adjuvant radiation or chemotherapy to decrease the risk of recurrence (Latif et al., 2014). However, there is insufficient evidence that adjuvant radiation therapy increases overall survival (Shaeffer & Randall, 2005). Because more than half of the cost associated with the treatment of endometrial cancer is accrued in the period after initial treatment (Mariotto, Robin Yabroff, Shao, Feuer, & Brown, 2011), it is important to ensure that only those patients who are likely to benefit from adjuvant therapies receive additional treatment. We illustrate how to determine the cost-effectiveness of adjuvant radiation and chemotherapy using empirically-based synthetic data developed by Spieker et al. (Spieker et al., 2021). This dataset was developed based on data from a large observational endometrial cancer database.
Follow-up for patients in the original database is insufficient to perform a meaningful cost-effectiveness analysis. Covariates in the synthetic data are drawn from their empirical distribution in the cancer database. Cost and survival outcomes are simulated based on relations observed in the database and from literature surrounding the prognoses of stage I and stage II endometrial cancer (Spieker, Ko, Roy, & Mitra, 2020; Shaeffer & Randall, 2005; Susumu et al., 2008). The data have covariate, treatment, and outcome distributions similar to what would be observed in practice. The use of synthetic data is motivated by a desire to make the code and data for this analysis available as well as to ensure sufficient follow-up.

We consider three possible courses of treatment post-hysterectomy: (1) receipt of adjuvant radiation therapy (RT), (2) receipt of adjuvant chemotherapy (CT), and (3) receipt of neither therapy, hereafter referred to as control. In our first analysis we compare those receiving RT to those receiving control, and in our second we compare CT to control. Because the data are simulated, the results of our analysis are to be used solely for illustrating how NBS regression can be used in a practical setting. This work does not make use of data from human subjects and does not require IRB approval.

The data contain monthly follow-up information on N = 13526 endometrial cancer patients. The upper bound on follow-up time is ten years, at which point all observations are administratively censored. Our goal is to evaluate the association between two measures of patient health, cancer stage and Charlson comorbidity index (Charlson, Szatrowski, Peterson, & Gold, 1994), and the cost-effectiveness of adjuvant RT or CT compared to control. We define cost-effectiveness using medical costs in USD ($) and survival time in years.
We assume that age at diagnosis, cancer stage, Charlson index, baseline receipt of RT or CT, and number of hospitalizations in the first month post-surgery are confounders of the exposure-outcome relationship. Treatment status is defined over the period from two to four months post-hysterectomy. For example, if a patient received adjuvant RT during this period, then they are included in the RT group in our analysis regardless of their treatment status throughout the remainder of the study period. If a patient received both RT and CT during this period, they are included in both treatment groups. The mean age at diagnosis is 73.70 (SD = 6.[…]). […] 32% of patients have a Charlson index of zero. In the first month post-hysterectomy, the baseline period, 74 (0.[…]) […].

We wish to estimate the NBS comparing each adjuvant therapy to control conditional on cancer stage and Charlson comorbidity index. We consider a range of WTP values (from $50k to $120k per year), cancer stages (stages I and II), and categorized Charlson indices (0, 1, and 2+). Because baseline covariates affect both treatment status and patient outcomes, we apply the standardization procedure described in Section 3.2. We assume survival time can be modelled using a Weibull distribution. To account for structural zeros in cost data, we use a two-stage zero-inflated cost model in which the probability of having zero costs is estimated using a logistic model and the distribution of nonzero costs is fit using a log-normal model. All models are fit conditional on treatment status and confounders. For each treatment, we draw M = 10,000 Monte Carlo observations from the fitted models for survival and medical costs and use a probit model when finding the regression estimator of the NBS. For our analysis of RT versus control, we model the NBS using the form:

\theta(\lambda \mid \text{Charlson}, \text{Stage}) = \int_0^1 \Phi\left( \hat\beta_0^R + \hat\beta_1^R \,\text{Stage II} + \hat\beta_2^R \,\text{Charlson}(1) + \hat\beta_3^R \,\text{Charlson}(2+) + \hat\beta_4^R \,\Phi^{-1}(\omega) \right) d\omega

The variables Charlson(1) and Charlson(2+) are indicators of whether a patient's Charlson score is equal to one or at least two, respectively. Analogous models are used when estimating the NBS comparing CT to control conditional on cancer stage and Charlson index. Parameters for the regression model comparing CT to control are denoted by the superscript C. Hypothesis testing is based on B = 1000 nonparametric bootstrap replicates.

In Table 4, we provide estimates of the NBS for each of the treatments as a function of Charlson index and cancer stage. In almost every setting, the estimated NBS is greater than 0.5, indicating that patients are likely to benefit from treatment regardless of their cancer stage or number of comorbidities at diagnosis. Because there are few subjects who have stage II cancer and receive adjuvant chemotherapy (n = 45), estimates of the NBS within this cohort are too uncertain to claim cost-effectiveness. For a WTP of $50k, the estimated NBS comparing RT to control among patients with stage I cancer and a comorbidity index of zero is 0.54 with 95% CI (0.51, 0.55). The 95% CI for θ(λ = 50 | Stage I, Charlson(2+)) comparing RT to control is (0.48, 0.53). To test whether cancer stage is associated with the NBS, we test H^R: β_1^R = 0 and H^C: β_1^C = 0 against their alternatives. At λ = $50k, the p-values for these tests are 0.007 and 0.[…], respectively. At a WTP of $120k, the estimated NBS comparing RT to control among patients with a Charlson index of zero is 0.55 for those with stage I cancer and 0.62 for those with stage II cancer. This indicates that patients with more advanced cancers may benefit more from receiving adjuvant radiation therapy than those with stage I cancer. A similar conclusion is drawn at a WTP of $50k.

To test whether patients' comorbidity indices are associated with the NBS after accounting for cancer stage, we perform a test of the hypotheses H^R: β_2^R = β_3^R = 0 and H^C: β_2^C = β_3^C = 0 against their respective alternatives. At a WTP of $50k, the p-value for the test of parameters in the adjuvant radiation model is 0.[…].

In this paper, we introduce a regression framework for NBS that allows us to explain variability in cost-effectiveness arising from covariates. The proposed method is the first to enable estimation of NBS within levels of measured covariates and allow for testing the effects of covariates on NBS. Regression of NBS is done in three steps: (1) estimate the distribution of cost and effectiveness outcomes conditional on treatment and measured covariates, (2) implement Monte Carlo standardization to sample INBs from treated and untreated populations with the same underlying distribution of confounders, and (3) apply ROC regression techniques developed by Alonzo and Pepe (Alonzo & Pepe, 2002) to estimate NBS. Our proposed methodology generalizes the work of Spieker et al. (Spieker et al., 2021) on estimating NBS to allow for covariate adjustment and hypothesis testing. Understanding how patient characteristics influence cost-effectiveness can help policy makers allocate resources towards groups which are most likely to benefit from treatment.

In cost-effectiveness analyses where patients are lost to follow-up, estimators that assume independence between costs at time of censoring and those at time of death are known to be inconsistent (Lin, Feuer, Etzioni, & Wax, 1997). We provide methodology for incorporating IPCW in models for the conditional distribution of treatment costs.
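In the special case where censoring does not depend on covariates, these weights reduce to inverse Kaplan-Meier weights based on the censoring distribution: a subject whose death is observed at time T_i receives weight 1/K̂(T_i⁻), where K̂ is the Kaplan-Meier estimator applied to censoring times. The sketch below illustrates that special case only; the helper names and toy data are our own, and the covariate-specific Cox-based version described in the paper is not shown.

```python
import numpy as np

def censoring_survival_steps(time, event):
    """Kaplan-Meier estimate of the censoring survival K(t) = P(C > t).
    Here the "events" are censorings (event == 0); deaths are treated
    as censored observations of the censoring time."""
    time = np.asarray(time, dtype=float)
    cens = np.asarray(event) == 0
    steps = []           # (censoring time u, K(u)) after each drop
    surv = 1.0
    for u in np.unique(time[cens]):
        at_risk = np.sum(time >= u)
        n_cens = np.sum((time == u) & cens)
        surv *= 1.0 - n_cens / at_risk
        steps.append((u, surv))
    return steps

def ipcw_weights(time, event):
    """IPCW: 1/K(T_i-) for observed deaths, zero for censored subjects
    (the observed deaths are up-weighted to represent them)."""
    steps = censoring_survival_steps(time, event)
    weights = np.zeros(len(time))
    for i, (t_i, e_i) in enumerate(zip(time, event)):
        if e_i == 1:
            k = 1.0
            for u, s in steps:
                if u < t_i:      # evaluate K just before the death time
                    k = s
            weights[i] = 1.0 / k
    return weights

# Toy data: deaths at t = 1, 3, 4 and one censoring at t = 2.
w = ipcw_weights([1, 2, 3, 4], [1, 0, 1, 1])  # -> [1.0, 0.0, 1.5, 1.5]
```

The weights sum to the original sample size (here 4.0), reflecting that the two deaths observed after the censoring time each stand in for part of the censored subject.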
When the probability of censoring depends on measured covariates, we describe how Cox proportional hazards models can be used to determine covariate-specific IPCW. In simulations, we find that our method is able to attain low bias and adequate power in settings with small sample size and up to 50% censoring. Moreover, estimates of standard error obtained using the nonparametric bootstrap are representative of empirical standard errors across various levels of censoring.

Our proposed estimation procedure assumes that models for the conditional distributions of cost and effectiveness are properly specified. Flexible modeling techniques present a potential route to reduce the risk of model misspecification. Bayesian nonparametric methods for zero-inflated data have been developed by Oganisian et al. (Oganisian, Mitra, & Roy, 2018) to estimate cost distributions, but extensions of this methodology to cost-effectiveness analyses have not been explored. Similarly, ensemble learning algorithms such as Super Learner (Van der Laan, Polley, & Hubbard, 2007) could be used to model complex cost and effectiveness distributions. For example, in their doubly robust estimator of NMB, Li et al. (Li et al., 2018) successfully use Super Learner to estimate both propensity score and outcome models. In future work, it would be useful to determine the best ways to incorporate these methods into regression of NBS. After the standardization step, parametric regression models for NBS may also be misspecified. Prior work by Hanley (Hanley, 1988) examined the robustness of the binormal form for modeling ROC curves. Those results suggest that estimates of AUC arising from the binormal form, and analogously NBS, may exhibit low bias even when the form is misspecified.
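Under the binormal form just discussed, an NBS model of the type θ = ∫₀¹ Φ(a + b Φ⁻¹(ω)) dω also admits the closed form Φ(a / √(1 + b²)), which is convenient both for computation and for checking a quadrature-based evaluation. A short numerical check (the coefficient values below are illustrative, not fitted values from our analysis):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def nbs_probit(lin_pred, slope):
    """theta = integral_0^1 Phi(lin_pred + slope * Phi^{-1}(w)) dw,
    evaluated by adaptive quadrature."""
    val, _ = quad(lambda w: norm.cdf(lin_pred + slope * norm.ppf(w)), 0.0, 1.0)
    return val

def nbs_probit_closed_form(lin_pred, slope):
    """Closed form under the binormal model: Phi(lin_pred / sqrt(1 + slope^2))."""
    return norm.cdf(lin_pred / np.sqrt(1.0 + slope**2))

# Illustrative coefficients: intercept plus a stage II effect, unit slope.
lin = 0.10 + 0.25
theta_quad = nbs_probit(lin, 1.0)
theta_cf = nbs_probit_closed_form(lin, 1.0)
```

The two evaluations agree to numerical precision, so the quadrature route is only needed when the linear predictor itself must be averaged over covariate distributions.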
Additionally, because the underlying distribution of INBs is left unspecified, the proposed NBS regression framework can be described as "distribution-free" (Alonzo & Pepe, 2002).

Because cost-effectiveness data are often drawn from observational databases, unmeasured confounding may influence estimates of NBS. In simulations with low levels of unmeasured confounding, our regression estimator of NBS exhibits small to moderate bias (between 2% and 8%). As expected, we found that as the strength of confounding increases, estimates of NBS become more biased. In any studies involving observational databases, researchers should consider potential sources of unmeasured confounding. Methods to assess sensitivity of NBS to unmeasured confounding may also be useful and will be a subject of future work.

Note that NBS is defined in terms of baseline treatment status. In our endometrial cancer example, the comparison groups are those who received some adjuvant therapy in the period from two to four months after surgery and those who did not. Because treatment effects may depend on the duration or frequency of treatment, conclusions concerning cost-effectiveness may change depending on whether treatment is considered time-stable, as in our definition, or time-dependent. To estimate cumulative medical costs under a time-dependent treatment strategy, Spieker et al. (Spieker, Roy, & Mitra, 2018) developed a nested g-computation procedure.
Similar methodologies that can incorporate time-dependent treatment may be useful when we are interested in comparing the cost-effectiveness of multiple treatment strategies, and are the subject of future work.

Because patient covariates may influence the cost-effectiveness of a treatment, our proposed regression framework for NBS provides a useful approach to identify which groups may benefit the most from treatment. The results from our synthetic data analysis illustrate how we can evaluate the effects of cancer stage or number of comorbidities on the cost-effectiveness of adjuvant therapies for endometrial cancer patients. This information may be useful to policy makers aiming to better allocate resources.
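As a toy, end-to-end illustration of steps (2) and (3) of the procedure, the sketch below draws INBs for treated and untreated pseudo-populations from stand-in normal distributions, applies the Alonzo and Pepe binary-regression device on a grid of false-positive fractions, and recovers the NBS from a probit fit. All distributions, sample sizes, and the scipy-based probit MLE are our own illustrative choices, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2021)

# Step 2 stand-in: INB samples for treated and untreated pseudo-populations.
# True NBS here is Phi((1 - 0) / sqrt(1^2 + 1^2)) = Phi(1/sqrt(2)) ~ 0.76.
inb_treated = rng.normal(loc=1.0, scale=1.0, size=2000)
inb_control = rng.normal(loc=0.0, scale=1.0, size=2000)

# Step 3: dichotomize treated INBs at control quantiles over a grid of
# false-positive fractions w, pairing each indicator with Phi^{-1}(w).
grid = np.linspace(0.02, 0.98, 49)
ys, xs = [], []
for w in grid:
    thr = np.quantile(inb_control, 1.0 - w)
    ys.append((inb_treated >= thr).astype(float))
    xs.append(np.full(inb_treated.size, norm.ppf(w)))
y = np.concatenate(ys)
x = np.concatenate(xs)

def neg_log_lik(params):
    """Probit log-likelihood for P(indicator = 1) = Phi(a + b * Phi^{-1}(w))."""
    a, b = params
    p = np.clip(norm.cdf(a + b * x), 1e-10, 1.0 - 1e-10)
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

a_hat, b_hat = minimize(neg_log_lik, x0=[0.0, 1.0], method="Nelder-Mead").x

# Binormal closed form for the area under the fitted curve, i.e. the NBS.
nbs_hat = norm.cdf(a_hat / np.sqrt(1.0 + b_hat**2))
```

With these stand-in distributions the estimate lands near the true value of about 0.76; in the actual framework the covariates (stage, Charlson index) would simply enter the linear predictor alongside the intercept.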
References
Alonzo, T. A., & Pepe, M. S. (2002). Distribution-free ROC analysis using binary regression techniques. Biostatistics, (3), 421–432.
Bang, H., & Tsiatis, A. A. (2000). Estimating medical costs with censored data. Biometrika, (2), 329–343.
Charlson, M., Szatrowski, T. P., Peterson, J., & Gold, J. (1994). Validation of a combined comorbidity index. Journal of Clinical Epidemiology, (11), 1245–1251.
Cox, D. R. (1975). Partial likelihood. Biometrika, (2), 269–276.
Hanley, J. A. (1988). The robustness of the "binormal" assumptions used in fitting ROC curves. Medical Decision Making, (3), 197–203.
Koenker, R., & Bassett Jr, G. (1978). Regression quantiles. Econometrica: Journal of the Econometric Society, 33–50.
Latif, N. A., Haggerty, A., Jean, S., Lin, L., & Ko, E. (2014). Adjuvant therapy in early-stage endometrial cancer: a systematic review of the evidence, guidelines, and clinical practice in the US. The Oncologist, (6), 645–653.
Li, J., Vachani, A., Epstein, A., & Mitra, N. (2018). A doubly robust approach for cost-effectiveness estimation from observational data. Statistical Methods in Medical Research, (10), 3126–3138.
Lin, D., Feuer, E., Etzioni, R., & Wax, Y. (1997). Estimating medical costs from incomplete follow-up data. Biometrics, 419–434.
Mariotto, A. B., Robin Yabroff, K., Shao, Y., Feuer, E. J., & Brown, M. L. (2011). Projections of the cost of cancer care in the United States: 2010–2020. Journal of the National Cancer Institute, (2), 117–128.
Nixon, R. M., & Thompson, S. G. (2005). Methods for incorporating covariate adjustment, subgroup analysis and between-centre differences into cost-effectiveness evaluations. Health Economics, (12), 1217–1229.
Oganisian, A., Mitra, N., & Roy, J. A. (2018). A Bayesian nonparametric model for zero-inflated outcomes: Prediction, clustering, and causal estimation. Biometrics.
Pepe, M. S. (1997). A regression modelling framework for receiver operating characteristic curves in medical diagnostic testing. Biometrika, (3), 595–608.
Pepe, M. S. (2000). An interpretation for the ROC curve and inference using GLM procedures. Biometrics, (2), 352–359.
Piessens, R., de Doncker-Kapenga, E., Überhuber, C. W., & Kahaner, D. K. (2012). Quadpack: a subroutine package for automatic integration (Vol. 1). Springer Science & Business Media.
Shaeffer, D. T., & Randall, M. E. (2005). Adjuvant radiotherapy in endometrial carcinoma. The Oncologist, (8), 623–631.
Shiroiwa, T., Sung, Y.-K., Fukuda, T., Lang, H.-C., Bae, S.-C., & Tsutani, K. (2010). International survey on willingness-to-pay (WTP) for one additional QALY gained: what is the threshold of cost effectiveness? Health Economics, (4), 422–437.
Spieker, A. J., Illenberger, N., Roy, J. A., & Mitra, N. (2021). Net benefit separation and the determination curve: A probabilistic framework for cost-effectiveness estimation. To appear in Statistical Methods in Medical Research.
Spieker, A. J., Ko, E. M., Roy, J. A., & Mitra, N. (2020). Nested g-computation: a causal approach to analysis of censored medical costs in the presence of time-varying treatment. Journal of the Royal Statistical Society: Series C (Applied Statistics), (5), 1189–1208.
Spieker, A. J., Roy, J., & Mitra, N. (2018). Analyzing medical costs with time-dependent treatment: The nested g-formula. Health Economics, (7), 1063–1073.
Stinnett, A. A., & Mullahy, J. (1998). Net health benefits: A new framework for the analysis of uncertainty in cost-effectiveness analysis. Medical Decision Making, (2 suppl), S68–S80.
Susumu, N., Sagae, S., Udagawa, Y., Niwa, K., Kuramoto, H., Satoh, S., & Kudo, R. (2008). Randomized phase III trial of pelvic radiotherapy versus cisplatin-based combined chemotherapy in patients with intermediate and high-risk endometrial cancer: A Japanese gynecologic oncology group study. Gynecologic Oncology, (1), 226–233.
Van der Laan, M. J., Polley, E. C., & Hubbard, A. E. (2007). Super learner. Statistical Applications in Genetics and Molecular Biology, (1).
Willan, A. R., Briggs, A. H., & Hoch, J. S. (2004). Regression methods for covariate adjustment and subgroup analysis for non-censored cost-effectiveness data. Health Economics, (5), 461–475.

Cens.  n     λ   X  θ(λ|X)  Est.   Est. SE  ESE    Pr. Reject
10%    500   2   0  0.353   0.352  0.028    0.030  0.986
10%    500   2   1  0.588   0.589  0.049    0.049  −
10%    500   12  0  0.527   0.526  0.028    0.030  0.988
10%    500   12  1  0.746   0.745  0.038    0.038  −
10%    5000  2   0  0.353   0.354  0.011    0.010  1.000
10%    5000  2   1  0.588   0.587  0.019    0.020  −
10%    5000  12  0  0.527   0.527  0.011    0.010  1.000
10%    5000  12  1  0.746   0.745  0.015    0.015  −
30%    500   2   0  0.353   0.351  0.032    0.032  0.924
30%    500   2   1  0.588   0.591  0.061    0.065  −
30%    500   12  0  0.527   0.526  0.032    0.032  0.964
30%    500   12  1  0.746   0.746  0.045    0.046  −
30%    5000  2   0  0.353   0.352  0.012    0.012  1.000
30%    5000  2   1  0.588   0.588  0.023    0.023  −
30%    5000  12  0  0.527   0.526  0.012    0.012  1.000
30%    5000  12  1  0.746   0.745  0.017    0.017  −
50%    500   2   0  0.353   0.353  0.041    0.040  0.694
50%    500   2   1  0.588   0.580  0.083    0.087  −
50%    500   12  0  0.527   0.527  0.037    0.036  0.842
50%    500   12  1  0.746   0.742  0.057    0.055  −
50%    5000  2   0  0.353   0.353  0.015    0.016  1.000
50%    5000  2   1  0.588   0.586  0.034    0.033  −
50%    5000  12  0  0.527   0.527  0.013    0.014  1.000
50%    5000  12  1  0.746   0.745  0.020    0.019  −

Table 1: Results for the setting where X has an effect on the NBS. Columns represent the probability of being censored (Cens.), sample size (n), WTP (λ), value of X, true NBS (θ(λ|X)), mean point estimate (Est.), mean estimated standard error (Est. SE), empirical standard error (ESE), and proportion of simulations in which the null hypothesis of no effect of X is rejected (Pr. Reject).

Cens.  n     λ   X  θ(λ|X)  Est.   Est. SE  ESE    Pr. Reject
10%    500   2   0  0.353   0.355  0.028    0.029  0.054
10%    500   2   1  0.353   0.355  0.047    0.049  −
10%    500   12  0  0.527   0.528  0.029    0.029  0.048
10%    500   12  1  0.527   0.530  0.047    0.049  −
10%    5000  2   0  0.353   0.354  0.011    0.010  0.048
10%    5000  2   1  0.353   0.353  0.018    0.017  −
10%    5000  12  0  0.527   0.528  0.011    0.010  0.046
10%    5000  12  1  0.527   0.527  0.018    0.018  −
30%    500   2   0  0.353   0.353  0.033    0.034  0.040
30%    500   2   1  0.353   0.356  0.055    0.054  −
30%    500   12  0  0.527   0.526  0.032    0.035  0.046
30%    500   12  1  0.527   0.529  0.053    0.051  −
30%    5000  2   0  0.353   0.352  0.012    0.013  0.054
30%    5000  2   1  0.353   0.353  0.021    0.020  −
30%    5000  12  0  0.527   0.525  0.012    0.012  0.054
30%    5000  12  1  0.527   0.527  0.020    0.020  −
50%    500   2   0  0.353   0.354  0.043    0.041  0.046
50%    500   2   1  0.353   0.356  0.069    0.068  −
50%    500   12  0  0.527   0.528  0.038    0.037  0.040
50%    500   12  1  0.527   0.529  0.062    0.061  −
50%    5000  2   0  0.353   0.354  0.017    0.016  0.036
50%    5000  2   1  0.353   0.354  0.027    0.027  −
50%    5000  12  0  0.527   0.527  0.014    0.014  0.054
50%    5000  12  1  0.527   0.527  0.023    0.023  −

Table 2: Results for the setting where X has no effect on the NBS. Columns represent the probability of being censored (Cens.), sample size (n), WTP (λ), value of X, true NBS (θ(λ|X)), mean point estimate (Est.), mean estimated standard error (Est. SE), empirical standard error (ESE), and proportion of simulations in which the null hypothesis of no effect of X is rejected (Pr. Reject).

              Survival Confounded              Cost Confounded
Conf.  λ   X  θ(λ|X)  Est.   Est. SE  ESE     θ(λ|X)  Est.   Est. SE  ESE
low    2   0  0.354   0.346  0.012    0.012   0.358   0.329  0.011    0.012
low    2   1  0.588   0.578  0.024    0.023   0.586   0.559  0.024    0.024
low    12  0  0.527   0.516  0.011    0.012   0.527   0.519  0.012    0.012
low    12  1  0.746   0.735  0.018    0.017   0.745   0.740  0.017    0.017
med.   2   0  0.356   0.323  0.012    0.011   0.384   0.285  0.011    0.011
med.   2   1  0.585   0.546  0.024    0.023   0.571   0.475  0.026    0.026
med.   12  0  0.526   0.482  0.012    0.012   0.522   0.490  0.012    0.012
med.   12  1  0.739   0.700  0.019    0.020   0.741   0.718  0.018    0.018
high   2   0  0.362   0.287  0.011    0.011   0.423   0.251  0.011    0.011
high   2   1  0.576   0.488  0.024    0.024   0.556   0.386  0.027    0.027
high   12  0  0.523   0.425  0.012    0.012   0.512   0.427  0.012    0.012
high   12  1  0.722   0.631  0.020    0.020   0.720   0.651  0.020    0.021

Table 3: Results concerning sensitivity to unmeasured confounding of the exposure-survival and the exposure-cost relationship. Columns represent the level of unmeasured confounding (Conf.), the WTP (λ), value of X, true NBS (θ(λ|X)), mean point estimate (Est.), mean estimated standard error (Est. SE), and empirical standard error (ESE).
Treatment  λ ($k)  Charlson(0)         Charlson(1)         Charlson(2+)
Stage I
RT         50      0.54 (0.51, 0.55)   0.52 (0.50, 0.54)   0.51 (0.48, 0.53)
RT         120     0.55 (0.52, 0.62)   0.53 (0.52, 0.56)   0.53 (0.50, 0.55)
CT         50      0.63 (0.56, 0.68)   0.60 (0.53, 0.66)   0.56 (0.49, 0.63)
CT         120     0.67 (0.60, 0.71)   0.64 (0.58, 0.70)   0.60 (0.54, 0.67)
Stage II
RT         50      0.59 (0.54, 0.64)   0.57 (0.54, 0.63)   0.56 (0.52, 0.62)
RT         120     0.62 (0.56, 0.65)   0.60 (0.56, 0.65)   0.60 (0.54, 0.64)
CT         50      0.58 (0.37, 0.69)   0.54 (0.34, 0.67)   0.49 (0.30, 0.63)
CT         120     0.64 (0.45, 0.74)   0.61 (0.43, 0.72)   0.58 (0.40, 0.70)

Table 4: Estimated NBS comparing RT to control and CT to control conditional on Charlson index and cancer stage. Nonparametric bootstrap-based 95% CIs for each estimated NBS are provided in parentheses.

[Figure 1: two panels of CED curves; x-axis λ ($ × 1,000), y-axis θ(λ | Charlson, Stage), curves keyed by Charlson index.]
Figure 1: Estimates of the NBS comparing RT to control as a function of the WTP, λ. CED curves are provided for patient cohorts conditional on cancer stage and Charlson comorbidity index. Results for patients with stage I cancer are provided on the left, while those for patients with stage II cancer are on the right. Estimated NBS within the range of primary interest are denoted in black. Gray indicates estimated NBS outside of this region, provided to observe the behavior of the NBS.

[Figure 2: two panels of CED curves; x-axis λ ($ × 1,000), y-axis θ(λ | Charlson, Stage), curves keyed by Charlson index.]
Figure 2: Estimates of the NBS comparing CT to control as a function of the WTP, λ.