Calculating the Expected Value of Sample Information in Practice: Considerations from Three Case Studies
Anna Heath, Natalia R. Kunst, Christopher Jackson, Mark Strong, Fernando Alarid-Escudero, Jeremy D. Goldhaber-Fiebert, Gianluca Baio, Nicolas A. Menzies, Hawre Jalal
on behalf of the Collaborative Network for Value of Information (ConVOI)

Affiliations: The Hospital for Sick Children; University of Toronto; University College London; University of Oslo; Yale University School of Medicine; Amsterdam UMC; LINK Medical Research; MRC Biostatistics Unit, University of Cambridge; School of Health and Related Research, University of Sheffield; Center for Research and Teaching in Economics (CIDE); Stanford Health Policy, Centers for Health Policy and Primary Care and Outcomes Research, Stanford University; Harvard TH Chan School of Public Health; University of Pittsburgh
Abstract
Investing efficiently in future research to improve policy decisions is an important goal. Expected Value of Sample Information (EVSI) can be used to select the specific design and sample size of a proposed study by assessing the benefit of a range of different studies. Estimating EVSI with the standard nested Monte Carlo algorithm has a notoriously high computational burden, especially when using a complex decision model or when optimizing over study sample sizes and designs. Therefore, a number of more efficient EVSI approximation methods have been developed. However, these approximation methods have not been compared, so their relative advantages and disadvantages are not clear.

A consortium of EVSI researchers, including the developers of several approximation methods, compared four EVSI methods using three previously published health economic models. The examples were chosen to represent a range of real-world contexts, including situations with multiple study outcomes, missing data, and data from an observational rather than a randomized study. The computational speed and accuracy of each method were compared, and the relative advantages and implementation challenges of the methods were highlighted.

In each example, the approximation methods took minutes or hours to achieve reasonably accurate EVSI estimates, whereas the traditional Monte Carlo method took weeks. Specific methods are particularly suited to problems where we wish to compare multiple proposed sample sizes, when the proposed sample size is large, or when the health economic model is computationally expensive. All the evaluated methods gave estimates similar to those given by traditional Monte Carlo, suggesting that EVSI can now be efficiently computed with confidence in realistic examples.
Introduction
The Expected Value of Sample Information (EVSI) [1, 2] quantifies the expected benefit of undertaking a potential future study that aims to reduce uncertainty about the parameters of a health economic model. The expected net benefit of sampling (ENBS), which is the difference between EVSI and the expected research study costs, can be used to inform decisions regarding study design and research prioritization. The future study with the highest ENBS should be prioritized if we wish to maximize economic efficiency. Thus, EVSI has the potential to determine the value of future research and to guide its design when accounting for economic constraints.

Despite this potential, EVSI has rarely been used in practical settings for a variety of reasons [3]. In the past, calculating EVSI in real-world scenarios has been based on nested Monte Carlo (MC) sampling [4], and this is computationally costly if we wish to produce accurate estimates with high precision. This computational burden is further increased when one aims to compute EVSI for multiple trial designs in order to determine the optimal (i.e., with the highest ENBS) research study [5, 6]. High performance computing resources can be used to overcome some of these barriers, but often at the expense of an increased requirement for programming skills and an increase in the complexity of the analysis.

Several methods have been developed to overcome these computational barriers and unlock the potential of EVSI as a tool for research prioritization and trial design optimization [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. However, as many of these methods have been developed concurrently, they have not been compared.
Additionally, EVSI estimation methods are typically evaluated using health economic models and trial designs chosen for computational convenience rather than those that reflect real-world decision making.

Some of the EVSI estimation methods that have been proposed place restrictions on the structure of the underlying health economic model and/or the study design [7, 8, 9]. These restrictions typically take the form of an assumption about the study data that ensures that the prior and posterior model parameter distributions take the same form (conjugacy), and by doing so, allow for computationally efficient EVSI estimation. This, however, restricts the applicability of these methods. EVSI estimation based on minimal modelling, where a comprehensive clinical trial is available to inform EVSI estimation, has also been proposed [18]. However, this paper aims to review EVSI estimation procedures for three case studies where the health economic models are based on a diverse evidence base and, when combined with the proposed study designs, do not fulfill the assumptions required for these restrictive or minimal modelling methods.

Thus, our comparison is restricted to four recent calculation methods developed by (in chronological order) Strong et al. [10], Menzies [11], Jalal and Alarid-Escudero [13] (extending a method proposed in Jalal et al. [12]), and Heath et al. [15, 16, 17]. Whilst these methods are all based on different approaches and assumptions, they all provide estimation techniques for approximating EVSI that, in comparison to nested MC sampling methods, are less computationally demanding whilst retaining accuracy.

Our primary goal is to test the four EVSI estimation methods across a range of health economic models and trial designs to gain a greater understanding of their behaviour in practice. We will evaluate the accuracy of the EVSI estimation methods across the three models and the computational time required to obtain these estimates.
These three models have several key features that reflect real-world trial design and may make it challenging to estimate EVSI in practice. These are: the presence of multiple trial outcomes, missingness or loss to follow-up in the data, and a study design that is observational rather than randomized.

Notation and Key Concepts
Health economic decision making aims to determine the intervention, from some set of feasible alternatives, that is expected to be optimal in terms of utility (which is usually net monetary benefit or net health benefit [19]). We characterize a health economic model as a function that takes as an input a vector of parameters θ, and returns the costs and health effects associated with each intervention in the set of alternatives. Uncertainty in the input parameters is represented using a probability distribution p(θ). To find the optimal intervention, costs and effects are combined into a single measure of economic value by calculating the net benefit for each of the T treatment options considered relevant, conditional on θ. Uncertainty about θ induces uncertainty about the net benefit for each treatment t = 1, ..., T. We denote the net benefit for treatment t given parameters θ as NB_t^θ. Under the assumption of a rational, risk-neutral decision maker, the optimal intervention given current evidence is the intervention associated with the maximum expected net benefit.

We consider that the model parameters can be split into two sets θ = (φ, ψ), where φ is a subset of parameters that we wish to obtain more information on, and ψ are the remaining parameters. For example, clinical trials are informative for clinical outcomes but may not collect information about health state utilities or costs. The economic value of eliminating all uncertainty about φ (assuming risk neutrality) is equal to the Expected Value of Partial Perfect Information (EVPPI) [20, 21, 22]. This is given by

EVPPI = E_φ[ max_t E_{θ|φ}[NB_t^θ] ] − max_t E_θ[NB_t^θ].   (1)

The EVSI is the value of collecting additional data, denoted X, to inform the parameters φ, and is bounded above by the EVPPI.
If these data had been collected and observed to have a value x, they would be combined with the current evidence to generate an updated distribution for φ, p(φ | x). Under a Bayesian approach, this would in turn be used to update the distribution of the net benefit of each treatment. The optimal intervention conditional on the data x is the treatment associated with the maximum expected net benefit based on the updated knowledge about the relevant parameters φ. If the optimal intervention changes, compared to the current decision, then the information in x has value. However, as the data have not been collected yet (and may never be), the average value over all possible datasets is considered. Mathematically, EVSI is defined as

EVSI = E_X[ max_t E_{θ|X}[NB_t^θ] ] − max_t E_θ[NB_t^θ],   (2)

where the distribution of X can be defined through p(X, θ) = p(θ) p(X | θ), with p(X | θ) = p(X | φ) being the sampling distribution for the data given the parameters. We assume that the sampling distribution for the data is only defined conditional on φ, i.e., it does not provide information on the value of the parameters ψ, except through any relationship with φ.

Calculation Methods for EVSI
It is rarely possible to compute EVSI analytically, as the net benefit is often a complex function of θ. Additionally, it is challenging to compute the expectation of a maximum analytically, as required in the first term of equation (2). Therefore, a range of methods have been developed to approximate EVSI.

Nested Monte Carlo Computations for EVSI
The simplest approximation method [4] computes all the expectations in equation (2) using MC simulation. The second term can be computed by simulating s = 1, ..., S parameter values, θ_s, from p(θ). The simulated values are used as inputs to a health economic model to obtain S simulations of the net benefit for each intervention, denoted NB_t^{θ_s}. Note that this process is required to perform a "probabilistic sensitivity analysis" (PSA) [23], used to assess the impact of parametric uncertainty on the decision uncertainty, which is mandatory in various jurisdictions [24, 25, 26]. The average of NB_t^{θ_1}, ..., NB_t^{θ_S} for each intervention can be computed, and max_t E_θ[NB_t^θ] is estimated by the maximum of these means.

The first term in equation (2) is more complex to compute by simulation. Firstly, S datasets X_s must be generated conditional on the simulated θ_s from the assumed sampling distribution p(X | θ_s). For each X_s, we simulate R values from the updated distribution of the model parameters p(θ | X_s). These R simulations are used as inputs to the health economic model to simulate from the updated distribution of the net benefit for each intervention. The mean net benefit for each treatment option is then calculated to estimate E_{θ|X}[NB_t^θ] for t = 1, ..., T. The maximum of these simulated means is then selected for each X_s. Thus, to compute EVSI by MC simulation, we require S × R runs of the health economic model. This is computationally expensive for standard choices of S and R, which are typically in the thousands. Therefore, the following methods focus on approximating the updated mean of the incremental net benefit associated with each intervention t using a smaller simulation burden. We denote the expectation of the incremental net benefit, conditional on data X, as μ_t^X = E_{θ|X}[NB_t^θ].
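As a concrete illustration of this nested scheme (not one of the case-study models), consider a toy decision problem with a single focal parameter φ, a Normal prior, and a conjugate Normal likelihood, so the inner posterior update is exact; all parameter values and the net benefit function below are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: phi ~ Normal(0, 2^2); the proposed study of n patients is
# summarised by a sample mean with Xbar | phi ~ Normal(phi, obs_sd^2 / n).
prior_mean, prior_sd, obs_sd, n = 0.0, 2.0, 4.0, 50

def inb(phi):
    # incremental net benefit of treatment 2 vs treatment 1 (invented)
    return 1000.0 * phi - 500.0

S, R = 2000, 2000
evsi_terms = np.empty(S)
for s in range(S):
    phi_s = rng.normal(prior_mean, prior_sd)
    # outer loop: simulate one future dataset (its sample mean)
    xbar = rng.normal(phi_s, obs_sd / np.sqrt(n))
    # conjugate Normal-Normal posterior for phi given the simulated data
    post_var = 1.0 / (1.0 / prior_sd**2 + n / obs_sd**2)
    post_mean = post_var * (prior_mean / prior_sd**2 + xbar * n / obs_sd**2)
    # inner loop: R posterior draws pushed through the "model"
    phi_post = rng.normal(post_mean, np.sqrt(post_var), R)
    evsi_terms[s] = max(0.0, inb(phi_post).mean())

# current value: best option under the existing (prior) uncertainty
current_value = max(0.0, inb(rng.normal(prior_mean, prior_sd, S * R)).mean())
evsi = evsi_terms.mean() - current_value
print(round(evsi, 2))
```

Even in this trivially cheap model the scheme needs S × R = 4 million "model runs"; replacing `inb` with a Markov model that takes seconds per run is what makes nested MC take weeks in the case studies.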
In a similar manner, we also denote the expectation of the incremental net benefit, conditional on some value of the parameters of interest φ, as μ_t^φ = E_{θ|φ}[NB_t^θ]. Finally, to increase the numerical stability of the following approximation methods, it is easier to work in terms of the incremental net benefit or loss, defined, without loss of generality, as INB_t^θ = NB_t^θ − NB_1^θ for t = 2, ..., T.

Strong et al.

The Strong et al. method estimates EVSI by fitting a regression model between the simulated values of the incremental net benefit, as the 'dependent' or 'response' variable, and a scalar or low-dimensional summary of the simulated dataset X as the 'independent' or 'predictor' variable(s) [10]. This low-dimensional summary for X should reflect how the data would be summarized if the study were to go ahead and must be computed for each simulated dataset X_s. Once this regression model has been fitted, μ_t^X is estimated by the fitted values from this regression model. EVSI is then estimated directly from these estimates of μ_t^X.
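A minimal sketch of this regression approach, using an invented one-parameter model, a binomial study summarized by its sample proportion, and a cubic polynomial standing in for the flexible smoothers (e.g., GAMs) used in practice:

```python
import numpy as np

rng = np.random.default_rng(2)
S, n = 5000, 30

# PSA: phi ~ Beta(8, 12) is a response probability; the incremental net
# benefit is an invented nonlinear function of phi
phi = rng.beta(8, 12, S)
inb = 20000.0 * (phi - 0.45) ** 3 + 1000.0 * (phi - 0.4)

# one simulated future dataset per PSA draw, summarised by its sample mean
x = rng.binomial(n, phi) / n

# flexible regression of INB on the summary statistic (cubic here; any
# well-fitting smoother works)
coefs = np.polyfit(x, inb, 3)
mu_x = np.polyval(coefs, x)          # fitted values estimate E[INB | X]

evsi = np.maximum(mu_x, 0.0).mean() - max(inb.mean(), 0.0)
print(round(evsi, 2))
```

The key practical step, as the text notes, is choosing a summary statistic and a regression model that fit well; residual plots are the usual diagnostic.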
Menzies

Menzies [11] presents two EVSI estimation methods, the most accurate of which estimates μ_t^X by reweighting simulations of μ_t^φ. This reweighting is based on the likelihood of observing a simulated dataset X conditional on different values for φ. The term likelihood is used in the statistical sense and is equal to p(X | φ).

This method simulates S future datasets X_s from p(X | φ_s). The likelihood of every simulated vector for φ is then calculated conditional on X_s. For the sample X_s, μ_t^{X_s} is estimated as the average of μ_t^φ, weighted by the likelihood of the dataset X_s, and the method can therefore be seen as an example of importance sampling [27, 28]. EVSI is estimated based on the estimate of μ_t^{X_s} for each future sample.
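The likelihood reweighting can be sketched as follows, reusing the same style of invented one-parameter binomial model; note that the binomial coefficient cancels once the weights are normalized:

```python
import numpy as np

rng = np.random.default_rng(3)
S, n = 4000, 25

# PSA draws of the focal parameter and the conditional mean INB given phi
# (invented model, as in the other sketches)
phi = rng.beta(8, 12, S)
mu_phi = 20000.0 * (phi - 0.45) ** 3 + 1000.0 * (phi - 0.4)

# future datasets: x_s successes out of n, one per PSA draw
x = rng.binomial(n, phi)

mu_x = np.empty(S)
for s in range(S):
    # likelihood p(x_s | phi) evaluated at every PSA draw of phi
    w = phi ** x[s] * (1.0 - phi) ** (n - x[s])
    # importance-sampling (likelihood-weighted) estimate of E[INB | X_s]
    mu_x[s] = np.sum(w * mu_phi) / np.sum(w)

evsi = np.maximum(mu_x, 0.0).mean() - max(mu_phi.mean(), 0.0)
print(round(evsi, 2))
```

This sketch also exposes the failure mode discussed later for the CRC example: as n grows, the likelihood concentrates and most weights underflow towards 0, so the weighted average rests on very few PSA draws.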
Jalal et al.

The Jalal et al. method published by Jalal and Alarid-Escudero [13], building on work from Jalal et al. [12], fits a linear meta-model between the simulated incremental net benefit values, as the response variable, and simulations for φ, as the predictor variables. Each term of the linear meta-model is then rescaled based on a Gaussian-Gaussian Bayesian updating approach to estimate its "posterior" expectation across different future datasets X. These estimated distributions are then recombined using the coefficients of the linear model to estimate μ_t^X and compute EVSI.

For a proposed future data collection strategy of size N, the rescaling factor for each term of the linear meta-model is equal to N / (N + N_0), where N_0 is known as the prior effective sample size. For some prior-likelihood pairs, N_0 can be obtained analytically. In other settings, N_0 can be estimated using one of two estimation methods. If the data X can be summarized using a summary statistic W(X), then N_0 can be computed as a function of the variance of W(X). If a suitable statistic cannot be derived, then nested posterior sampling can be used to estimate N_0. In this method, S future datasets X_s, s = 1, ..., S, are simulated. Each of these samples is used to update the information about the model parameters p(θ | X_s), typically using R simulations, and to compute the mean of φ. The variance of this mean, across the different samples X_s, is then used to estimate N_0. This nested sampling approach is relatively computationally expensive compared to the other two proposals for determining N_0. However, the calculation of N_0 is only needed once to compute EVSI across study sample sizes.
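The rescaling step can be sketched for an invented linear model with a Normal focal parameter, where an assumed prior effective sample size N_0 shrinks the PSA draws towards their mean by the factor N / (N + N_0):

```python
import numpy as np

rng = np.random.default_rng(4)
S = 10000

# PSA draws: a single Normal focal parameter, and an INB that is
# approximately linear in phi (invented model, with meta-model residual)
phi = rng.normal(0.0, 1.0, S)
inb = 1200.0 * phi - 300.0 + rng.normal(0.0, 50.0, S)

# linear meta-model INB ~ b0 + b1 * phi
b1, b0 = np.polyfit(phi, inb, 1)

# Gaussian-Gaussian rescaling: deviations of phi from its mean shrink by
# N / (N + N0), where N0 = 25 is the assumed prior effective sample size
n0 = 25.0
evsi_by_n = {}
for n in (10, 100, 1000):
    shrink = n / (n + n0)
    mu_x = b0 + b1 * (phi.mean() + shrink * (phi - phi.mean()))
    evsi_by_n[n] = np.maximum(mu_x, 0.0).mean() - max(inb.mean(), 0.0)

print({n: round(v, 1) for n, v in evsi_by_n.items()})
```

As N grows, the shrinkage factor tends to 1, the rescaled draws approach the μ_t^φ simulations, and the EVSI estimate increases towards the EVPPI, which is why a single N_0 suffices to trace EVSI across all proposed sample sizes.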
Heath et al.

The Heath et al. [16, 17] estimation method combines the simulations μ_t^φ and a modified nested MC sampling method to estimate EVSI. This method reduces the number of times the updated distribution of the net benefit must be simulated to estimate EVSI from S, typically at least 1000, to Q, usually between 30 and 50 [17]. Thus, EVSI is estimated with Q × R health economic model runs.

The Heath et al. method uses nested MC sampling to estimate the variance of the incremental net benefit for different future datasets. These estimated variances rescale simulations of μ_t^φ for t = 2, ..., T to approximate simulations of μ_t^X, which can be used to estimate EVSI. The Heath et al. method only requires a single nested simulation procedure to estimate EVSI across sample sizes [17]. A "linear" model is required for this method. However, non-linear functions of φ can be defined and combined linearly to account for flexible relationships between the incremental net benefit and the parameters φ.

Case Studies

These EVSI methods are applied to three case studies designed to explore trial designs using health economic models that make EVSI estimation reflective of real-world decision making. The first case study is a stylized chemotherapy example used to evaluate EVSI estimation in the presence of multiple outcomes, reflecting a realistic trial design with a single primary, and multiple secondary, outcomes. The second case study evaluates EVSI methods in the presence of missingness in the data, using a previously published health economic model to explore EVSI estimation when we account for standard considerations in trial design and development. Finally, we evaluate EVSI methods for a health economic model based on a time-dependent natural history model where the main data source is observational.
Case Study 1: A New Chemotherapy Treatment
This model was developed in Heath and Baio [16] to evaluate two chemotherapy interventions, i.e., the current standard of care and a novel treatment that reduces the number of adverse events. These two options are equal in their clinical outcomes, so we focus on the adverse events. The probability of adverse events for the standard of care is denoted π_1, and ρ denotes the proportional reduction in the probability of adverse events with the novel treatment.

All patients incur a treatment cost of £110 for the standard of care or £420 for the novel treatment. Patients without adverse events, or those who have recovered, have a quality of life (QoL) measure of q. The health economic impact of adverse events is modelled with a Markov model depicted in Figure 1. In this model, γ_1 and γ_2 denote the constant probability of requiring hospital care and dying, respectively, and λ_1 and λ_2 denote the constant probability of recovery given that an individual remains at home or enters hospital, respectively. The cycle length is 1 day and the time horizon is 15 days. Recovered patients incur no further cost, while patients who die have a one-time cost of terminal care. There are costs and QoL measures associated with home and hospital care. PSA distributions for the model parameters are informed using previous data or defined using expert opinion, with all distributional assumptions given in the supplementary material.

Figure 1: A four-state Markov model used to model the health economic impact of adverse events from a chemotherapy treatment.

Sampling Distributions for X

The EVSI is computed for a future two-arm randomized control trial whose primary outcome is the number of adverse events. As a secondary set of measures, the study monitors the treatment pathway for patients who experience adverse events. Thus, the trial directly informs six model parameters φ = (π_1, ρ, γ_1, γ_2, λ_1, λ_2) by collecting six outcomes. We will enrol 150 patients per arm.

To define the sampling distribution for the six outcomes, we model the number of adverse events in each arm using binomial distributions conditional on π_1 and ρ:

X_1^AE ∼ Bin(150, π_1) and X_2^AE ∼ Bin(150, ρπ_1).

The number of patients treated in hospital and the number of patients who die are modelled as

X^Hosp ∼ Bin(X_1^AE + X_2^AE, γ_1) and X^Death ∼ Bin(X^Hosp, γ_2).

Recovery times are collected to inform λ_1 and λ_2. The recovery time for every patient who recovers at home is modelled as

T_i^HC ∼ Exponential(η_1), with η_1 = −log(λ_1) and i = 1, ..., X_1^AE + X_2^AE − X^Hosp.

The recovery time for every patient who recovers in hospital is modelled as

T_j^H ∼ Exponential(η_2), with η_2 = −log(λ_2) and j = 1, ..., X^Hosp − X^Death.
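Under illustrative parameter values (in practice these would be a single PSA draw), one simulated dataset from this six-outcome sampling scheme might be generated as follows; the numeric values are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative parameter values (the paper draws these from PSA distributions)
pi1, rho = 0.40, 0.65          # P(adverse event), proportional reduction
gamma1, gamma2 = 0.20, 0.05    # P(hospital care), P(death in hospital)
lam1, lam2 = 0.30, 0.25        # daily recovery probabilities (home, hospital)
n_arm = 150

# Primary outcome: adverse events in each arm
x_ae1 = rng.binomial(n_arm, pi1)
x_ae2 = rng.binomial(n_arm, rho * pi1)

# Secondary outcomes: care pathway for patients with adverse events
x_hosp = rng.binomial(x_ae1 + x_ae2, gamma1)
x_death = rng.binomial(x_hosp, gamma2)

# Recovery times, exponential with rates eta = -log(lambda) as in the text
# (NumPy's exponential takes the scale, i.e. 1 / rate)
eta1, eta2 = -np.log(lam1), -np.log(lam2)
t_home = rng.exponential(1.0 / eta1, x_ae1 + x_ae2 - x_hosp)
t_hosp = rng.exponential(1.0 / eta2, x_hosp - x_death)

print(x_ae1, x_ae2, x_hosp, x_death, t_home.size, t_hosp.size)
```

Each nested outcome is conditional on the one before it, so the number of observations informing γ_2, λ_1 and λ_2 is itself random, which is one reason conjugate shortcut methods do not apply here.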
Case Study 2: A Model for Chronic Pain

This example uses a cost-effectiveness model developed by Sullivan et al. [29], and extended in Heath et al. [17], to evaluate treatments for chronic pain. This is based on a Markov model with 10 states, where each state has an associated QoL and cost. The model is initiated when a cohort of patients receive their initial treatment for chronic pain. Patients can experience adverse events due to treatment and can withdraw from treatment due to adverse events or lack of efficacy. Following this, they can be offered an alternative therapy or withdraw from treatment. If they withdraw from this second line of treatment, they can receive further treatment or discontinue, both considered absorbing states as the model does not include a death state.

As a treatment for chronic pain, a patient can first be offered either morphine or an innovative treatment. If they withdraw, they are offered oxycodone as an alternative treatment. Thus, the only difference between the two options is the first-line treatment, where the innovative treatment is more effective, more expensive and causes fewer adverse events. A more in-depth presentation of all the model parameters is given in [29], where the parameter distributions are gamma for costs and beta for probabilities and utilities. The means of these distributions are informed by relevant studies identified following a literature review, and the standard deviation is taken as 10% of the underlying mean estimate. The per-person lifetime EVSI is calculated, assuming a discount factor of 0.03 per year over 15 years.
Sampling Distributions for X

EVSI is computed for a study that investigates the QoL weights of patients who remain on treatment without any adverse events and of patients who withdraw from the first line of treatment due to lack of efficacy. The individual-level variability in these two QoL weights is modelled, for simplicity, using independent beta distributions, although the assumption of independence may be invalid [30]. The population-level mean QoL weight, i.e., the mean of the individual-level QoL distribution, is defined as the value of those two health states in the Markov model. The standard deviations of the individual-level distributions are then set equal to 0.3, for patients who remain on treatment, and 0.31, for patients who withdraw due to lack of efficacy [31]. We compute EVSI for trials enrolling 10, 25, 50, 100 and 150 patients. We assume that only a proportion of the questionnaires are returned, leading to missingness in the data.

To generate the data, a response rate of 68.7% is assumed, consistent with the return rate observed in [32]. We generate a response indicator for each patient in the trial using a Bernoulli distribution. If this indicator is 1, then we assume the patient returned the questionnaire and therefore we have observed utility scores for both states for that patient, simulated from the beta distributions specified above, conditional on the model parameters.
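A minimal sketch of this data-generation scheme, with invented population-level mean QoL weights and beta distributions moment-matched to the stated standard deviations:

```python
import numpy as np

rng = np.random.default_rng(6)

def beta_params(mean, sd):
    # moment-matched beta parameters from a mean and standard deviation
    nu = mean * (1.0 - mean) / sd ** 2 - 1.0
    return mean * nu, (1.0 - mean) * nu

# Illustrative population-level mean QoL weights (PSA draws in the paper)
mean_on_treat, mean_withdraw = 0.60, 0.45
a1, b1 = beta_params(mean_on_treat, 0.30)
a2, b2 = beta_params(mean_withdraw, 0.31)

n, response_rate = 100, 0.687
responded = rng.random(n) < response_rate   # questionnaire returned?

# utilities observed for responders only; non-responders are missing (NaN)
u_on = np.where(responded, rng.beta(a1, b1, n), np.nan)
u_wd = np.where(responded, rng.beta(a2, b2, n), np.nan)

print(responded.sum(), np.nanmean(u_on).round(3), np.nanmean(u_wd).round(3))
```

Because the missingness indicator is independent of the utilities, the missingness is non-informative; it simply reduces the effective sample size of the proposed study.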
Case Study 3: A Model for Colorectal Cancer

This example uses a health economic model developed by Alarid-Escudero et al. [33] to evaluate a screening strategy for colorectal cancer (CRC) and pre-cancerous lesions known as adenomas. This model is based on a nine-state Markov model with age-dependent transition intensities, which govern the onset of adenomas (pre-cancerous growths) and the risk of all-cause mortality. The onset of adenomas is modelled using a Weibull hazard conditional on age,

l(a) = λ g a^{g−1},

where λ and g are the scale and shape parameters of the Weibull distribution and a is the age of the patient. Model parameters are calibrated to the observed literature, and uncertainty in the model parameters g and λ reflects the uncertainty in these calibration targets.

The costs and QoL associated with each health state are used to evaluate the economic burden of CRC. The screening strategy is assumed to capture patients with adenomas and early cancer so they can be operated on before the cancer progresses and becomes clinically detected. The proposed screening strategy has a sensitivity with a mean of 0.98 and a specificity with a mean of 0.87. Some members of the general population have undiagnosed adenomas and early-stage CRC at the onset of the simulation.

Sampling Distributions for X

EVSI is computed for a study that investigates the onset of adenomas in the general population to inform the shape and scale of the Weibull hazard function. A cross-section of the general population aged between 25 and 90 without any screening history will be screened for the presence of adenomas with a gold standard test with 100% sensitivity and specificity. (This sampling distribution for the data causes some minor issues for the Gibbs sampling procedure used in the JAGS program for Bayesian updating.) Upon enrollment, the age of the subjects is recorded to determine the age-specific risk.
EVSI is computed for trials enrolling 5, 40, 100, 200, 500, 750, 1000 and 1500 participants.

To generate prospective data, we simulate the enrolment age of the participants. Demographic data from Canada in 2011, obtained from the Human Mortality Database [34], were used to generate study subjects with an age distribution representative of the general population, with ages restricted to between 25 and 90 years. Conditional on their age a, a participant has a probability

p(a) = 1 − e^{−λ a^g}

of having an adenoma or CRC. The outcome for a specific subject was simulated from a Bernoulli distribution conditional on p(a_i),

X_i ∼ Ber(p(a_i)),

where a_i is the age of participant i. We assumed that there are no missing data, as participants are enrolled and undergo the test at the same clinic visit and no other data are collected.
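A minimal sketch of this data-generation step, with invented Weibull parameters and, for simplicity, a uniform stand-in for the Canadian age distribution used in the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative Weibull parameters (calibrated PSA draws in the paper)
lam, g = 2e-6, 2.7
n = 500

# enrolment ages: a simple stand-in for the Canada 2011 age distribution,
# truncated to 25-90 years as in the study design
age = rng.uniform(25.0, 90.0, n)

# P(adenoma or CRC by age a) from the cumulative Weibull hazard
p = 1.0 - np.exp(-lam * age ** g)

# screening outcome per participant (gold standard test, no missingness)
x = rng.binomial(1, p)

print(x.sum(), round(p.mean(), 3))
```

Because the study is cross-sectional and observational, each binary outcome carries a different success probability p(a_i), so no single sufficient summary statistic is available, which complicates some of the approximation methods.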
Analysis

Comparing the presented EVSI estimation methods is challenging, as their accuracy and computational time depend on choices made by the modeller and on the computational efficiency of the method implementation. Table 1 outlines the simulation choices that were made for the case studies. These choices were made to achieve EVSI estimates with a reasonable level of precision while keeping the computation time manageable. For example, smaller sample sizes were necessary for models with a greater computational cost. We compared the speed and accuracy achievable by each method, and identified their relative advantages and challenges in practice.

The prior effective sample size for the Jalal et al. method needs to be computed only once to estimate the EVSI across sample sizes. As posterior updating is slower for larger sample sizes, it is preferable to estimate N_0 with a small proposed sample X. However, the estimation of N_0 also relies on a Gaussian approximation, so the sample size of X should be sufficiently large to assume normality. Thus, Table 1 (Jalal et al. future sample size) reports the sample size of X used in the nested posterior sampling to estimate N_0, chosen to balance accuracy and computational speed.

For the first two case studies, we computed a standard error for the EVSI estimates by recomputing the EVSI 200 times, each time with the same PSA samples, so that this standard error reflects uncertainty arising from any simulation involved in the EVSI estimation procedure itself.

To obtain the computational time for the four recent approximation methods, computations were undertaken on a computer with an i7 Intel processor and 16 GB of RAM in R version 3.5.1. The nested MC computations were undertaken on a Linux Google Compute Engine virtual machine. The computation time given below is the total time across all cores.
Code to undertake the computations in this paper is available from GitHub at https://github.com/convoigroup/EVSI-in-practice .

| Simulation choices | Chemotherapy side effects (1) | Chronic pain (2) | CRC screening (3) |
| --- | --- | --- | --- |
| Initial PSA size | 100,000 | 100,000 | 5,000 |
| Number of μ_t^φ simulations from EVPPI calculation | 100,000 | 100,000 | 5,000 |
| Nested simulation outer loop size | 100,000 | 100,000 | NA |
| Nested simulation inner loop size | 100,000 | 100,000 | NA |
| Strong et al. sample size | 100,000 | 100,000 | 5,000 |
| Menzies sample size | 20,000 | 5,000 | 2,500 |
| Jalal et al. N_0 computation method | nested posterior sampling | nested posterior sampling | nested posterior sampling |
| Jalal et al. N_0 estimation outer loop size | 1,000 | 1,000 | 5,000 |
| Jalal et al. N_0 estimation inner loop size | 10,000 | 10,000 | 5,000 |
| Jalal et al. N_0 estimation future sample size | 30 | 40 | 40 |
| Heath et al. outer loop size | 50 | 50 | 50 |
| Heath et al. inner loop size | 10,000 | 10,000 | 5,000 |

Table 1: The simulation choices used to compute EVSI for the four recent approximation methods and the nested MC method for case studies 1, 2 and 3.

Results
Case Study 1: Chemotherapy Side Effects
Figure 2 displays the 95% central intervals for the four faster EVSI approximation methods, with the nested MC estimate shown as a vertical line. All the methods produce EVSI estimates that are relatively close to the EVSI estimated by nested MC sampling, which we assume is accurate given the large simulation size. The 95% central interval for the Heath et al. method is the only interval that contains the "true" value, represented by the nested MC EVSI. At the same time, the Heath et al. estimate is associated with substantial variability compared to the other methods.
Figure 2: The mean per-person EVSI estimates, across 200 simulated estimation procedures, for the five methods under consideration for the chemotherapy example with a future sample size of 150 and a fixed willingness-to-pay of £

Implementing the Strong et al. and Jalal et al. methods involves finding a flexible regression model that fits well and is computationally feasible to estimate. As there are six parameters in this example, finding such a model was relatively challenging and required examination of residual plots.

Case Study 2: Chronic Pain

Figure 3 shows that the 95% central intervals for the Heath et al. and the Menzies methods contain the nested MC estimate, which we assume to be accurate given the large simulation size, for all sample sizes. However, all methods are relatively close to the nested MC estimate. The Strong et al. method produced the shortest 95% central intervals, while the three alternatives are relatively comparable. Note that the Menzies estimate is based on a smaller PSA simulation size but still shows similar variability compared to the other methods.
Figure 3: The mean EVSI estimates, across 200 simulated estimation procedures, for the five methods under consideration for the chronic pain example. EVSI was calculated across 5 different sample sizes for the future trial. The 95% central intervals from these 200 simulations are shown as horizontal lines and the gold standard MC estimator is shown as a vertical line.

In this example, the summary statistics used for the Strong et al. method are the geometric means of X and 1 − X. These statistics are sufficient to estimate the model parameters of the beta distribution and were derived using the Fisher–Neyman factorization theorem [35]. Summarizing X using the arithmetic mean and variance gives incorrect EVSI estimates for this case study.
Figure 4 demonstrates a broad consensus among the four recent approximation methods for the CRC screening model. Nested MC simulations were not undertaken for this case study due to the computational time required to obtain suitably accurate estimates for comparison. Thus, while we can note that the four methods give similar results, we cannot assert that these EVSI estimates are "correct."

For a sample size of 1,500, the Menzies EVSI estimate is incorrect. This is because the likelihood tends to 0 for large sample sizes, making the weighted samples difficult to approximate. Furthermore, the Menzies method slightly over-estimates the EVSI for sample sizes between 500 and 1000. This is because we only use a subset of the PSA simulations to obtain this EVSI estimate, and the EVPPI (the upper limit for EVSI) estimated using this subset is slightly over-estimated, judging from the full 5,000 PSA simulations.
Figure 4: EVSI estimates for the four methods under consideration for the CRC model. EVSI is calculated for 9 different sample sizes for the future trial and is plotted against sample size. The sample size is plotted on the log scale with the sample sizes marked on the natural scale. The EVPPI, computed using the Strong et al. EVPPI computation method [36], is included as a black line on this figure.
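The EVPPI upper bound shown in Figure 4 is computed by regression on the PSA sample. As a rough illustration of this idea, the following sketch regresses incremental net benefit on a hypothetical focal parameter; all names and values are illustrative, and a simple polynomial basis stands in for the flexible nonparametric regression (e.g. GAM or Gaussian process) used by the Strong et al. method.

```python
import numpy as np

rng = np.random.default_rng(2)
S = 10_000

# Toy PSA: incremental net benefit depends on a focal parameter phi plus
# noise induced by all the other (remaining) model parameters
phi = rng.normal(0.0, 1.0, size=S)               # parameter of interest
inb = 1_000 * phi + rng.normal(0, 800, size=S)   # INB of option 2 vs option 1

# Regress INB on phi; the fitted values estimate E[INB | phi]
basis = np.vander(phi, 3)                        # quadratic basis + intercept
coef, *_ = np.linalg.lstsq(basis, inb, rcond=None)
fitted = basis @ coef

# EVPPI: value of deciding with perfect knowledge of phi vs deciding now
evppi = np.mean(np.maximum(fitted, 0)) - max(np.mean(inb), 0)
print(round(evppi, 2))
```

Because EVSI can never exceed the value of learning the parameters exactly, this quantity bounds every EVSI curve in Figure 4 from above.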
Computational Time
Table 2 shows the computational time for the five EVSI computation methods for each of the three case studies. For the first two case studies, all four alternatives are considerably faster than the nested MC method. For the third case study, the computational cost of the underlying CRC model meant that it was not computationally feasible to use the nested MC method.

For the first two case studies, the Heath et al. method has the lowest computation time as the underlying health economic model is fast. The Heath et al. method also estimates EVSI across multiple sample sizes simultaneously, which improves the computation time for the chronic pain example compared with the Strong et al. and Menzies methods. For these two examples, the computation time required to fit an accurate regression model is relatively high, increasing the computation time for the Strong et al. method. The Jalal et al. method has the highest computation time as it uses nested MC simulation to calculate N. However, after estimating N, EVSI can be re-estimated for any sample size. Thus, if EVSI were to be estimated across more sample sizes, the Jalal et al. method would offer computational savings over the Strong et al. and Menzies methods. For the chemotherapy example, the Menzies method has a similar computational cost to the other three methods. However, it is estimated based on a reduced simulation size; if all 100,000 PSA simulations are used, the computation time is greater than 2 hours. For the chronic pain example, the Menzies method is noticeably slower as the computation time for the likelihood increases when the proposed sample size of X is larger.

Case Study            | Nested MC | Strong et al. | Menzies | Jalal et al. | Heath et al.
1: Chemotherapy       | 60480     | 6.45          | 4.45    | 7.47         | 1.48
2: Chronic Pain       | 223200    | 12.05         | 86      | 22.27        | 2.46
3: Colorectal Cancer  | ∗         |               |         |              |

Table 2: The computational time (mins) required to produce EVSI estimates for the five methods under consideration for the three case studies presented in this review.

For the CRC screening example, the Jalal et al. method is fastest because, even though N is estimated through nested MC simulation, it must only be computed once to estimate the EVSI across sample sizes. In contrast, for the Strong et al. method, X is summarized by finding the maximum likelihood estimates (MLEs) for g and λ, which must be computed, using relatively slow numerical optimization procedures, for each sample X_s, s = 1, ..., S, and each sample size. Thus, estimating the summary statistics is slow in this case study. The Heath et al. method is more computationally expensive as the underlying probabilistic sensitivity analysis for the CRC health economic model is expensive and must be rerun Q × S times.

Discussion
This paper uses three case studies to assess four novel methods for approximating EVSI. These methods were developed in response to the immense computational burden of estimating EVSI using nested MC simulation. As these methods were developed concurrently, no head-to-head comparison had been undertaken. Additionally, these methods have typically been assessed using health economic models designed for computational simplicity rather than models reflecting real-life decision making.

Thus, we compared these four methods using case studies designed to cover a number of different trial designs, interventions, and health economic model structures that may make EVSI estimation more challenging. In general, the EVSI estimates were accurate when the underlying assumptions of the respective methods were met, highlighting the importance of checking these assumptions. The computational complexity of these methods varies with the health economic model, the sampling distribution of the future data, and whether optimization over different sample sizes is required.

In general, we find that the four methods are comparable in terms of accuracy and computational time in these more realistic situations. However, appropriately assessing accuracy is challenging because differences in the EVSI estimate could lead to alternative future research recommendations, even when the difference is small. This is especially true for diseases with high incidence, as the EVSI is multiplied by the incidence to determine whether the trial offers value for research investment. The determination of whether the EVSI is sufficiently precise will depend on the decision problem at hand, so care should be taken when interpreting these results.

It is likely to be more useful to compare these methods on their ease of implementation.
The “optimal” estimation method, trading off accuracy, precision, computational time, and ease of implementation, will change depending on the health economic model structure, the proposed trial design, and the analyst's expertise. Due to the differences between these four methods and the inherent differences in health economic models and trial designs, giving general-purpose recommendations is not simple and would not be unconditional.

Nonetheless, this analysis highlights that the Strong et al. method is accurate and efficient, provided the analyst can correctly summarize the trial data and fit a regression model. The Menzies method is accurate but computationally relatively expensive for large PSA simulation sizes. The Jalal et al. method is efficient when estimating EVSI across sample sizes but may require nested posterior sampling when considering realistic data collection exercises. Finally, the Heath et al. method is accurate and efficient when the health economic model has a low computation time but becomes infeasible as the model becomes more expensive. The Jalal et al. and Heath et al. methods required expertise in Bayesian methods for all the examples in this paper.

While further research is required to give comprehensive guidance on the situations in which each of these methods is most useful, we can conclude that, provided the underlying assumptions of the chosen method are met, any of the four methods is likely to produce reasonable estimates in a reasonable amount of time.
Contributions
AH conceived the study, performed the analysis, and drafted the paper; NRK advised on the study design and contributed to the analysis, results interpretation, and drafting the paper; CJ advised on the study design and contributed to drafting the paper; MS advised on the study design, verified the implementation of the Strong et al. method, and contributed to drafting the paper; FA-E advised on the study design, contributed to drafting the paper, and verified the implementation of the Jalal et al. method; JDG-F advised on the study design and contributed to results interpretation and drafting the paper; GB contributed to drafting the paper and the results interpretation; NM advised on the study design and contributed to drafting the paper; HJ conceived the study, contributed to drafting the paper, and verified the implementation of the Jalal et al. method. All authors approved the final draft.
Acknowledgements
AH was funded by the Canadian Institute of Health Research through the SPOR Innovative Clinical Trial Multi-Year Grant. NRK was funded by the Research Council of Norway (276146) and LINK Medical Research. CJ was funded by the UK Medical Research Council programme MRC MC UU 00002/11. This paper draws on work that MS conducted while supported by an NIHR Post-Doctoral Fellowship (PDF-2012-05-258) from 2013 to 2016. FA-E was funded by the National Cancer Institute (U01-CA-199335) as part of the Cancer Intervention and Surveillance Modeling Network (CISNET). JDG-F was funded in part by a grant from Stanford's Precision Health and Integrated Diagnostics Center (PHIND). GB was partially funded by a research grant sponsored by Mapi/ICON at University College London. NM was supported by the National Institutes of Health (NIH) (R01AI112438-02). HJ was funded by NIH/NCATS grant 1KL2TR0001856. The funding agreements ensured the authors' independence in designing the study, interpreting the data, and writing and publishing the report. The authors would also like to thank Alan Brennan, Michael Fairley, David Glynn, Howard Thom, and Ed Wilson for their comments and discussion as part of the ConVOI group.
References

[1] R. Schlaifer. Probability and Statistics for Business Decisions. McGraw-Hill, 1959.
[2] H. Raiffa and R. Schlaifer. Applied Statistical Decision Theory. Harvard University Press, Boston, MA, 1961.
[3] L. Steuten, G. van de Wetering, K. Groothuis-Oudshoorn, and V. Retèl. A systematic and critical review of the evolving methods and applications of value of information in academia and practice. PharmacoEconomics, 31(1):25–48, 2013.
[4] A. Brennan, S. Kharroubi, A. O'Hagan, and J. Chilcott. Calculating partial expected value of perfect information via Monte Carlo sampling algorithms. Medical Decision Making, 27:448–470, 2007.
[5] S. Conti and K. Claxton. Dimensions of design space: a decision-theoretic approach to optimal research design. Medical Decision Making, 29(6):643–660, 2009.
[6] E. Jutkowitz, F. Alarid-Escudero, K. Kuntz, and H. Jalal. The Curve of Optimal Sample Size (COSS): a graphical representation of the optimal sample size from a value of information analysis. PharmacoEconomics, 2019.
[7] A. Ades, G. Lu, and K. Claxton. Expected value of sample information calculations in medical decision modeling. Medical Decision Making, 24:207–227, 2004.
[8] N. Welton, J. Madan, D. Caldwell, T. Peters, and A. Ades. Expected value of sample information for multi-arm cluster randomized trials with binary outcomes. Medical Decision Making, 34(3):352–365, 2014.
[9] A. Brennan and S. Kharroubi. Expected value of sample information for Weibull survival data. Health Economics, 16(11):1205–1225, 2007.
[10] M. Strong, J. Oakley, A. Brennan, and P. Breeze. Estimating the expected value of sample information using the probabilistic sensitivity analysis sample: a fast nonparametric regression-based method. Medical Decision Making, 35(5):570–583, 2015.
[11] N. Menzies. An efficient estimator for the expected value of sample information. Medical Decision Making, 36(3):308–320, 2016.
[12] H. Jalal, J. Goldhaber-Fiebert, and K. Kuntz. Computing expected value of partial sample information from probabilistic sensitivity analysis using linear regression metamodeling. Medical Decision Making, 35(5):584–595, 2015.
[13] H. Jalal and F. Alarid-Escudero. A Gaussian approximation approach for value of information analysis. Medical Decision Making, 38(2):174–188, 2018.
[14] A. Brennan and S. Kharroubi. Efficient computation of partial expected value of sample information using Bayesian approximation. Journal of Health Economics, 26(1):122–148, 2007.
[15] A. Heath, I. Manolopoulou, and G. Baio. Efficient Monte Carlo estimation of the expected value of sample information using moment matching. Medical Decision Making, 38(2):163–173, 2018.
[16] A. Heath and G. Baio. Calculating the expected value of sample information using efficient nested Monte Carlo: a tutorial. Value in Health, 21(11):1299–1304, 2018.
[17] A. Heath, I. Manolopoulou, and G. Baio. Bayesian curve fitting to estimate the expected value of sample information using moment matching across different sample sizes. Medical Decision Making, in press, 2018.
[18] D. Meltzer, T. Hoomans, J. Chung, and A. Basu. Minimal modeling approaches to value of information analysis for health research. Medical Decision Making, 31(6):E1–E22, 2011.
[19] A. Stinnett and J. Mullahy. Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Medical Decision Making, 18(2):S68–S80, 1998.
[20] J. Felli and G. Hazen. Sensitivity analysis and the expected value of perfect information. Medical Decision Making, 18:95–109, 1998.
[21] D. Coyle and J. Oakley. Estimating the expected value of partial perfect information: a review of methods. The European Journal of Health Economics, 9(3):251–259, 2008.
[22] A. Heath, I. Manolopoulou, and G. Baio. A review of methods for analysis of the expected value of information. Medical Decision Making, 37(7):747–758, 2017.
[23] G. Baio and P. Dawid. Probabilistic sensitivity analysis in health economics. Statistical Methods in Medical Research, 24(6):615–634, 2011.
[24] EUnetHTA. Methods for health economic evaluations: a guideline based on current practices in Europe, second draft, 29 September 2014.
[25] Department of Health and Ageing. Guidelines for preparing submissions to the Pharmaceutical Benefits Advisory Committee: version 4.3, 2008.
[26] Canadian Agency for Drugs and Technologies in Health. Guidelines for the economic evaluation of health technologies: Canada, 3rd edition, 2006.
[27] C. Robert and G. Casella. Monte Carlo Statistical Methods. Springer-Verlag, Secaucus, NJ, USA, 2005.
[28] D. B. Rubin. Using the SIR algorithm to simulate posterior distributions. Bayesian Statistics, 3:395–402, 1988.
[29] W. Sullivan, M. Hirst, S. Beard, D. Gladwell, F. Fagnani, J. Bastida, C. Phillips, and W. Dunlop. Economic evaluation in chronic pain: a systematic review and de novo flexible economic model. The European Journal of Health Economics, 17(6):755–770, 2016.
[30] J. Goldhaber-Fiebert and H. Jalal. Some health states are better than others: using health state rank order to improve probabilistic analyses. Medical Decision Making, 36(8):927–940, 2016.
[31] R. Ikenberg, N. Hertel, Andrew M., M. Obradovic, G. Baxter, P. Conway, and H. Liedgens. Cost-effectiveness of tapentadol prolonged release compared with oxycodone controlled release in the UK in patients with severe non-malignant chronic pain who failed 1st line treatment with morphine. Journal of Medical Economics, 15(4):724–736, 2012.
[32] S. Gates, M. Williams, E. Withers, E. Williamson, S. Mt-Isa, and S. Lamb. Does a monetary incentive improve the response to a postal questionnaire in a randomised controlled trial? The MINT incentive study. Trials, 10(1):44, 2009.
[33] F. Alarid-Escudero, R. MacLehose, Y. Peralta, K. Kuntz, and E. Enns. Nonidentifiability in model calibration and implications for medical decision making. Medical Decision Making, 38(7):810–821, 2018.
[34] Human Mortality Database (data downloaded on 2019-01-22), 2019.
[35] R. Hogg and A. Craig. Introduction to Mathematical Statistics, 5th edition. Englewood Cliffs, New Jersey, 1995.
[36] M. Strong, J. Oakley, and A. Brennan. Estimating multiparameter partial expected value of perfect information from a probabilistic sensitivity analysis sample: a nonparametric regression approach. Medical Decision Making, 34(3):311–326, 2014.
A Inputs for the Chemotherapy Model
Model Input                                      | Distribution | 1st Prior Parameter | 2nd Prior Parameter | Previous Data
π - Probability of adverse events                | Beta         | 1                   | 1                   | Number of adverse events
ρ - Reduction in adverse events with treatment   | Normal       | Mean: 0.65          | Precision: 100      | No
q - QoL weight with no adverse events            | Beta         | 18.23               | 0.372               | No
Γ - Probability of hospitalization               | Beta         | 1                   | 1                   | Number of hospitalizations
Γ - Probability of death                         | Beta         | 1                   | 1                   | Number of deaths
γ - Daily transition probability to hospital     | -            | -                   | -                   | -
γ - Daily probability of death                   | -            | -                   | -                   | -
λ - Daily probability of recovery from home care | Beta         | 5.12                | 6.26                | No
λ - Daily probability of recovery from hospital  | Beta         | 3.63                | 6.74                | No
Cost of death                                    | LogNormal    | 8.33                | 0.13                | No
Cost of home care                                | LogNormal    | 7.74                | 0.039               | No
Cost of hospitalization                          | LogNormal    | 8.77                | 0.15                | No
QoL weight for home care                         | Beta         | 5.75                | 5.75                | No
QoL weight for hospitalization                   | Beta         | 0.87                | 3.47                | No