Eliciting judgements about dependent quantities of interest: The SHELF extension and copula methods illustrated using an asthma case study
Björn Holzhauer, Lisa V. Hampson, John Paul Gosling, Björn Bornkamp, Joseph Kahn, Markus R. Lange, Wen-Lin Luo, Caterina Brindicci, David Lawrence, Steffen Ballerstedt, Anthony O'Hagan
Novartis Pharma AG, Analytics, Basel, Switzerland; JBA Risk Management Ltd, Skipton, United Kingdom; Novartis Pharmaceuticals Corporation, Analytics, East Hanover, NJ, USA; The University of Sheffield, School of Mathematics and Statistics, Sheffield, United Kingdom
February 16, 2021
Abstract
Pharmaceutical companies regularly need to make decisions about drug development programs based on the limited knowledge from early stage clinical trials. In this situation, eliciting the judgements of experts is an attractive approach for synthesising evidence on the unknown quantities of interest. When calculating the probability of success for a drug development program, multiple quantities of interest, such as the effect of a drug on different endpoints, should not be treated as unrelated. We discuss two approaches for establishing a multivariate distribution for several related quantities within the SHeffield ELicitation Framework (SHELF). The first approach elicits experts’ judgements about a quantity of interest conditional on knowledge about another one. For the second approach, we first elicit marginal distributions for each quantity of interest. Then, for each pair of quantities, we elicit the concordance probability that both lie on the same side of their respective elicited medians. This allows us to specify a copula to obtain the joint distribution of the quantities of interest.
We show how these approaches were used in an elicitation workshop that was performed to assess the probability of success of the registrational program of an asthma drug. The judgements of the experts, which were obtained prior to completion of the pivotal studies, were well aligned with the final trial results.
arXiv [stat.ME], Feb 2021
1 Introduction
The decision to continue or stop the development of a new drug is an example of high-stakes decision making in the pharmaceutical industry. To continue usually means a commitment to large and costly clinical trials that may expose the enrolled patients to risks, while to stop may mean a missed opportunity to help patients. At the same time, only limited data are usually available. Thus, improving the decision making in these situations is an important problem.
For decision making with no or limited directly relevant data, eliciting the judgements of a group of experts is one approach to effectively combining the available direct and indirect evidence. Expert knowledge elicitation is the process of capturing expert knowledge about one or more uncertain quantities in the form of a probability distribution. It is an important tool to provide understanding of uncertain phenomena and inputs to decision-making processes. There has been a steadily growing demand for elicitation in many fields throughout industry, government and science; see, for example, Garthwaite and O’Hagan [2000], Gosling et al. [2012], Usher and Strachan [2013], and Bamber et al. [2019]. In particular, elicitation has been advocated and used in pharmaceutical science [Kinnersley and Day, 2013, Dallow et al., 2018] and public health [Ren et al., 2017, Soares and Bojke, 2018]. Because experts are subject to cognitive biases, several frameworks and procedures have been proposed to guide the elicitation process and minimise these biases. The SHeffield ELicitation Framework (SHELF) described in Section 3 of this paper is one such framework.
It can be challenging to elicit judgements about a quantity of interest (QoI) when these judgements are being made conditional on knowledge about another quantity. Similarly, QoIs are often likely to be dependent, in which case the challenge of eliciting a joint distribution for several QoIs arises.
There are many methods in the literature for capturing knowledge about dependencies between multiple variables [Daneshkhah and Oakley, 2010, Werner et al., 2018]. However, these methodologies are typically reported in the literature as standalone methods rather than forming part of a complete elicitation protocol like SHELF. Also, whereas SHELF is a generic protocol that is applicable to a very wide range of applications, most of these methodologies have considerable restrictions.
• They may constrain the type of variables and distributions to be fitted, for example to Dirichlet distributions for proportions [Elfadaly and Garthwaite, 2013, Zapata-Vázquez et al., 2014].
• They may be tailored for a specific application, for example land cover [Baey et al., 2017] or system reliability [Norrington et al., 2008].
• They may consider complex restructuring for large numbers of dependent variables [Sigurdsson et al., 2001, Bedford and Cooke, 2001, Truong et al., 2013].
We present generic methods for eliciting joint distributions through judgements that experts can realistically make. Like the SHELF protocol itself, these methods are applicable in all areas where elicitation is required, and to use them effectively there are important choices to be made. Examples of their use, and the choices made, in any specific field can therefore serve as valuable guides for others to follow. We illustrate their use within SHELF in a pharmaceutical example that we fully describe in Section 2. The example concerns assessing the probability of success (PoS) of a Phase 3 drug development program. Such programs are expensive, resource-intensive long-term commitments for any organisation.
The decision to proceed with a Phase 3 program depends on many considerations including the unmet medical need and market opportunity a new drug may address, as well as the probability of success to address these needs.
As part of a pilot project to evaluate a new PoS framework at Novartis, we conducted a PoS assessment for an asthma drug. While Phase 2 studies had provided information on the effect of the drug on a surrogate outcome, no data were available on the primary endpoint of the key Phase 3 studies: moderate-to-severe asthma exacerbations, which are potentially life-threatening events with a significant burden on patients’ lives [Global Initiative for Asthma, 2020]. Additionally, there was an important key secondary endpoint, forced expiratory volume in 1 second (FEV1), an endpoint commonly used in asthma trials, for which Phase 2 data were available, but experts’ judgements were sought on the effect of the different treatment duration and trial population in Phase 3. If the drug worked on one endpoint, it was considered more likely to work on the other endpoint. Thus, a joint distribution was required. Techniques to address both problems through expert elicitation are available within the SHELF framework.
In Section 3, we first give a brief overview of elicitation methods and of SHELF. Then we describe the extension method for eliciting judgements about Phase 3 outcomes by linking to Phase 2 results and the copula method for eliciting joint distributions. In Section 4, we return to our motivating example and describe how we used these techniques to estimate the PoS of that drug development program. We also compare the obtained expert judgements with the outcomes of the Phase 3 studies. We finish with some conclusions and recommendations in Section 5.
The Phase 3 program of fevipiprant, a prostaglandin D2 receptor 2 antagonist for the treatment of asthma, was selected to pilot a new PoS framework that has since been introduced at Novartis [Hampson et al., 2021].
At the time of the PoS assessment, fevipiprant had been studied in several Phase 2 randomised controlled trials (RCTs) and the Phase 3 clinical trials comparing two fevipiprant doses (150 or 450 mg once a day) with placebo were underway, with data collection almost complete. This timing was one reason the program was selected as a pilot, because it ensured that the PoS assessment could not be influenced by the Phase 3 data, while at the same time minimising the time until the PoS assessment could be compared to the Phase 3 results. In reality, the assessment of the program and the decision to proceed with Phase 3 had already been taken at the end of Phase 2 based on more limited information.
One major challenge was that, unlike the Phase 2 trials, the key Phase 3 trials focused on patients with more severe asthma, and in particular on the sub-population with a blood eosinophil count ≥ 250 cells/µL. The primary null hypotheses for this sub-population were tested first in the trials’ testing procedures [Brightling et al., 2020]. None of the Phase 2 trials evaluated the effect of fevipiprant on moderate-to-severe asthma exacerbations. The annualised rate of such exacerbations was the primary endpoint of the two most important trials in the Phase 3 program [Brightling et al., 2020]. Instead, a surrogate endpoint of reduction in sputum eosinophil counts had been measured in one of the Phase 2 trials [Gonem et al., 2016]. FEV1 was a key secondary endpoint in the Phase 3 program and has high regulatory acceptance as a measure of asthma control [Committee for Medicinal Products for Human Use, 2015]. FEV1 had been a primary or secondary endpoint of several of the Phase 2 studies including for dose ranging [Bateman et al., 2017], but these trials were of shorter duration and had a patient population with milder asthma than the Phase 3 trials.
As per the newly implemented PoS framework at Novartis, success was defined as regulatory approval with point estimates for key endpoints achieving or exceeding targets specified as part of a target product profile (TPP). It was assumed that regulatory approval would require statistical significance at the one-sided 0.025 significance level for at least one dose for both exacerbations and FEV1 in both of the key Phase 3 trials. Thus, to calculate the PoS, we needed a joint prior distribution for the effects of fevipiprant on exacerbations and FEV1.
Given the data that were available at the time of the PoS assessment, we decided to do this by eliciting the judgements of a group of experts.
The question then was how best to structure the elicitation process: we wanted to explicitly leverage the Phase 2 data on the surrogate endpoint of reduction in sputum eosinophil counts, since this was arguably the most relevant evidence we had for informing beliefs about the effect of fevipiprant on exacerbations.
We also expected experts to judge a larger effect of fevipiprant on FEV1 to be more likely the larger the effect of the drug on asthma exacerbations is. As a consequence, in order to fully characterise the joint distribution of these two treatment effects we would need to understand the size and direction of the dependence between these two quantities. In the next section, we describe the various approaches considered for the elicitation, before we return to the motivating example in Section 4 and describe how we practically applied these methods.
Elicitation can be done informally, but numerous pitfalls await the inexperienced practitioner, including well-established sources of bias in expert judgements [O’Hagan et al., 2006, European Food Safety Authority, 2014, O’Hagan, 2019a]. Therefore, when the expert judgements are sufficiently important it is necessary to employ a formal procedure in the interests of quality and defensibility. A small number of established elicitation protocols have been developed and refined by experienced practitioners [an overview is given in Dias et al., 2018].
The SHELF protocol is characterised by carefully structured sequences of judgements designed to minimise biases and a unique way of eliciting a consensus probability distribution from a group of experts [Gosling, 2018]. It is one of the most widely used elicitation protocols in the field of pharmaceutical science. The SHELF package of advice, templates and tools to support researchers wishing to conduct expert knowledge elicitation may be freely downloaded from the SHELF website [O’Hagan, 2019b]. Since its inception in 2008, SHELF has been steadily expanded with new advice and methods.
For example, the extension method described in Section 3.3 was added in version 4 [Oakley and O’Hagan, 2019].
The SHELF protocol is distinguished by a number of key elements.
• Individual elicitation – discussion – group elicitation. Serious elicitation almost always requires using a group of experts in order to capture their combined knowledge. SHELF elicits a single distribution from the group but begins by eliciting judgements from each expert independently. This is followed by the experts discussing their differences to share their expertise, opinions and interpretations of the evidence. Finally, group judgements are elicited and the result is a “consensus” distribution fitted to these judgements. This combination of individual and group elicitations is the most important distinguishing feature of SHELF. The individual elicitations show each expert’s beliefs and form a basis for the subsequent discussion. The discussion is an opportunity to share and debate those opinions with a view to achieving a common understanding and is intended to extract maximum value from their joint expertise.
• The SHELF workshop. The discussion and group elicitation phases require that the experts come together in what is called a SHELF workshop. Typically, they are physically together in a room, although SHELF can be used with other arrangements, including video-conferencing.
• The evidence dossier. Prior to the workshop, a dossier is prepared summarising the evidence regarding the QoIs. In a typical elicitation there is some relevant evidence available, but there is not enough direct evidence to identify the value of any QoI (otherwise expert judgement would not be needed). The dossier ensures that all experts have access to the same evidence and that it is all fresh in their minds when they make their judgements. The essence of expert knowledge elicitation is that different experts interpret and weight the evidence differently, based on their own experience. The discussion phase in SHELF is where these differences are aired and debated.
• The rational impartial observer (RIO). Even after discussing and debating, SHELF does not expect the experts to reach complete agreement (such that they now have the same knowledge and beliefs about an uncertain quantity, represented by the same probability distribution). Instead they are asked to judge what a rational impartial observer, called RIO, might reasonably believe, having seen their individual judgements and listened to their discussion. By taking the perspective of RIO, experts can reach agreement on a distribution that represents a rational impartial view of their combined knowledge.
• The facilitator. The SHELF workshop is led by a facilitator, who has expertise in the process of eliciting expert knowledge, and in particular is familiar with SHELF. The facilitator works with the experts to accurately capture their knowledge, facilitates the group discussion and leads them in applying the RIO perspective. The facilitator’s role may also be found in other elicitation protocols, but it is particularly important in SHELF. The group interaction in SHELF’s discussion is another possible source of biases, which must be managed by the skill and experience of the facilitator. Many other protocols do not admit group discussion, thereby avoiding the risk of those biases but also losing the opportunity for the experts to share and debate their judgements.
• SHELF templates. The conduct of the workshop is recorded on SHELF templates, which play a dual role. First, they organise the progress of the workshop through a predefined series of steps. In particular, both the individual elicitations and the group elicitation are directed through controlled sequences of judgements. The entire process and the elicitation sequences used are based on research into the psychology of judgement, and on extensive experience in practical elicitation.
Second, the templates serve to document a SHELF workshop, such that the conduct of the workshop and the development of each elicited distribution are clearly set out.
The result of an elicitation for a single uncertain QoI is a probability distribution. Accordingly, the judgements that experts are asked to make are probabilistic. The basic sequence of judgements at the individual judgements stage is as follows.
1. Plausible range. Experts are first asked to specify upper and lower plausible bounds such that they judge values of the QoI outside that range to be implausible. A numerical interpretation of ‘implausible’, for instance as a 1% or 5% probability, is not generally made, since the primary function of this step is to encourage the experts to think of all possibilities, thereby reducing any tendency to overconfidence.
2. Median. Experts are next asked to specify their median value for the QoI, such that they regard it as equally likely that the QoI would be above or below this value.
3. Quartiles or tertiles. Finally, experts specify their quartile or tertile values (the choice of which to ask for being according to the facilitator’s preference). Just as the median divides the plausible range into two intervals that are judged equally likely, quartiles divide it into four equally likely intervals and tertiles into three. In expert elicitations for the Novartis PoS framework we have favoured eliciting tertiles instead of quartiles, because we consider thinking about three instead of four equally likely intervals less challenging for experts.
The SHELF package contains copious advice and tools to help the experts to understand and make these judgements reliably. In particular, by following the SHELF protocol the facilitator asks questions in such a way that biases are minimised and there is no need for the experts to have a thorough understanding of probability or statistical theory.
Training in making these judgements is also available through an online self-paced course accessed from the SHELF website [O’Hagan, 2018].
For the group judgements, the facilitator may ask the experts to agree on probabilities that RIO might assign to three specific propositions, such as that the QoI is negative, or that it exceeds some specified value. A probability distribution is then fitted to the three RIO probability judgements. SHELF provides some R software for fitting distributions using various standard families, such as normal, t, gamma, lognormal or beta distributions [Oakley, 2020]. However, other distributions may be fitted, and indeed a major consideration in SHELF is that the form of the elicited distribution should not be constrained in any way. The facilitator will work with the experts to identify a suitable distribution to represent their judgements. The fitted distribution is the final outcome of the elicitation.
Throughout the process, and particularly when determining the final agreed distribution, the facilitator will prompt and challenge the experts to ensure that the final distribution genuinely represents what RIO might believe after seeing the experts’ judgements and listening to their discussions.
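To make the fitting step concrete, the following minimal Python sketch fits a normal distribution to three probability judgements of the form (value, probability that the QoI lies below that value). It uses least squares on the quantile scale, which is an illustrative stand-in and not necessarily the algorithm used by the SHELF R software. The function name fit_normal and all numbers are assumptions made for this example.

```python
from statistics import NormalDist


def fit_normal(judgements):
    """Fit a normal distribution to (value, cumulative probability) pairs.

    Least squares on the quantile scale: value ~ mu + sigma * z_p, where
    z_p is the standard normal quantile of the judged probability p.
    """
    std = NormalDist()
    z = [std.inv_cdf(p) for _, p in judgements]
    v = [value for value, _ in judgements]
    n = len(judgements)
    z_bar = sum(z) / n
    v_bar = sum(v) / n
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (vi - v_bar) for zi, vi in zip(z, v))
    sigma = sxy / sxx            # slope of the regression line
    mu = v_bar - sigma * z_bar   # intercept of the regression line
    return NormalDist(mu, sigma)


# Hypothetical tertiles and median (illustrative numbers, not elicited values).
elicited = [(60.0, 1 / 3), (90.0, 0.5), (120.0, 2 / 3)]
fitted = fit_normal(elicited)
print(round(fitted.mean, 1), round(fitted.stdev, 1))
```

Because these three hypothetical judgements are symmetric about the median, the fitted normal reproduces them exactly; with asymmetric judgements the fit is only approximate, which is one reason other families may be preferred.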
The SHELF package contains several techniques for eliciting a joint distribution for two or more uncertain quantities, including the extension and copula methods. The extension method is a generic technique that allows considerable flexibility for the form of the joint distribution. It is, for example, suitable for eliciting judgements about the treatment effect for a Phase 3 endpoint (X) based on the Phase 2 results for a surrogate endpoint (Y). The fact that Phase 3 follows Phase 2 chronologically makes it natural to express judgements about X conditional on Y.
For two QoIs, X and Y, the extension method consists of obtaining a marginal distribution for Y and a set of conditional distributions for X given Y = y. The elicitation of joint distributions requires the following steps.
1. A marginal distribution for Y is obtained. This distribution can be elicited as described in Section 3.2, but could also be the result of an analysis of available data. For example, in the asthma case study introduced in Section 2 it is a meta-analytic predictive distribution [Neuenschwander et al., 2010] based on Phase 2 data.
2. A conditional distribution (as always, from the perspective of RIO) is elicited for X conditional on Y equalling the median of its elicited marginal distribution, also following the basic method of Section 3.2.
3. Several other quantiles of the elicited marginal distribution of Y are selected as conditioning points; typically these will be the quartiles, 5th and 95th percentiles. Median values are elicited for X conditional on Y equalling each of these conditioning points (first the 5th and 95th percentiles and then the quartiles). The basic SHELF approach of individual judgements – discussion – group judgements is used for each.
4. The final step is to ‘fit’ a set of conditional distributions to these judgements. First, a median function m(y) is fitted to the elicited conditional medians.
This might for instance be a polynomial or a piecewise-linear fit (with extrapolation), and may be applied on a transformed scale. Second, a model is chosen to determine the conditional distributions based on the distribution at the Y-median elicited in Step 2. For instance, it may be decided that the Y-median distribution can be applied to all conditionals, simply shifted to follow the m(y) function. Alternatively, the variance may also be scaled depending on m(y). These choices are available in the SHELF R software, but again other choices can be made. The facilitator will always work with the experts to identify a ‘fit’ that best represents their judgements.
The extension method is appropriate when the experts perceive a natural causal link from Y to X. Indeed, it is particularly useful when the objective is to elicit a distribution for X but the experts would find it easier to make judgements about X if they knew the value of Y. In this case, the marginal distribution of X is the main outcome of the elicitation process. Although it will not generally be feasible to derive that marginal distribution analytically from the elicited joint structure, a large Monte Carlo sample can be drawn by sampling values y_i from the marginal distribution of Y and then sampling x_i conditional on Y = y_i. The Monte Carlo samples {x_i} are then samples from the marginal distribution of X and, if needed, a distribution can be fitted to the samples.
3.4 The SHELF copula method
When there is no natural ordering of related QoIs based on time or causality, the extension method requires an arbitrary imposition of an ordering and the conditional judgements are more difficult for the experts. The SHELF copula method is appropriate for two or three QoIs and does not require the elicitation of conditional distributions. However, it does place some constraints on the joint distribution. The method has the following steps.
1. Marginal distributions are elicited for each QoI individually, using the basic method of Section 3.2.
2. For each pair of QoIs, a single judgement concerning their degree of correlation is made. This judgement is called the concordance probability, and is the probability that both QoIs lie on the same side of their respective elicited medians.
3. A Gaussian copula joint distribution [Trivedi and Zimmer, 2007] is then fitted to these marginal distributions and concordance probabilities. The facilitator shows the experts suitable displays or summaries of the joint distribution to verify that it is a reasonable representation of their beliefs.
With just two QoIs, the copula method is simple to apply. The Gaussian copula imposes a restriction on the joint distribution but in practice it will usually be an adequate fit to the experts’ judgements.
In principle, the copula method is applicable for larger numbers of QoIs, but it is difficult to use for more than three. With three QoIs, three concordance probabilities need to be elicited. Under the Gaussian copula assumption, each concordance probability can be transformed to a correlation coefficient and the resulting correlation matrix must be positive definite. It is quite possible for the experts’ elicited concordance probabilities to fail to produce a valid correlation matrix, and they must then revisit their judgements with the aid of the facilitator to achieve an adequate fit. With more than three QoIs, the number of concordance probabilities rapidly increases, as does the likelihood of the elicited values not corresponding to a valid correlation matrix.
The SHELF copula method is a natural choice to construct a joint distribution for the effects of a drug on two Phase 3 endpoints, such as a primary and secondary clinical outcome.
The interested reader will find full technical details, as well as much practical advice, on these and other elicitation techniques in the SHELF package [Oakley and O’Hagan, 2019].
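As a hedged illustration of Sections 3.3 and 3.4, the Python sketch below does four things. It draws a Monte Carlo sample from the marginal distribution of X under a shift-only extension model with a piecewise-linear median function. It converts a concordance probability p into a Gaussian copula correlation using the standard bivariate normal identity p = 1/2 + arcsin(ρ)/π. It checks whether three pairwise correlations form a positive definite matrix. Finally, it samples a pair of QoIs from the copula. This is not the SHELF R software: all function names and all numbers are assumptions made for illustration, and normal distributions are used only because the Python standard library provides them.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula transforms


def piecewise_linear(points):
    """Median function m(y) through elicited (y, median of X) points.

    Linear between points; beyond the outer points the nearest segment
    is extended (extrapolation). Needs at least two distinct y values.
    """
    pts = sorted(points)

    def m(y):
        for (y0, x0), (y1, x1) in zip(pts, pts[1:]):
            if y <= y1:
                break  # first segment whose right end covers y
        return x0 + (x1 - x0) * (y - y0) / (y1 - y0)

    return m


def extension_sample(n, y_dist, m, spread, rng):
    """Draw n values of X: sample y_i from Y, then x_i given Y = y_i.

    Shift-only model (an assumption): every conditional distribution is
    normal with median m(y) and the standard deviation `spread` that was
    judged at the median of Y.
    """
    draws = []
    for _ in range(n):
        y = rng.gauss(y_dist.mean, y_dist.stdev)
        draws.append(rng.gauss(m(y), spread))
    return draws


def rho_from_concordance(p):
    """Correlation of a Gaussian copula with concordance probability p."""
    return math.sin(math.pi * (p - 0.5))


def is_valid_correlation_3(r12, r13, r23):
    """True if three pairwise correlations give a positive definite matrix
    (Sylvester's criterion: leading minors 1, 1 - r12^2 and det all > 0)."""
    det = 1 + 2 * r12 * r13 * r23 - r12**2 - r13**2 - r23**2
    return abs(r12) < 1 and det > 0


def copula_sample(n, rho, dist_a, dist_b, rng):
    """Draw n pairs with marginals dist_a, dist_b and Gaussian copula rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        # Map to uniforms, clipped so inv_cdf never sees exactly 0 or 1.
        u1 = min(max(STD_NORMAL.cdf(z1), 1e-12), 1 - 1e-12)
        u2 = min(max(STD_NORMAL.cdf(z2), 1e-12), 1 - 1e-12)
        pairs.append((dist_a.inv_cdf(u1), dist_b.inv_cdf(u2)))
    return pairs


rng = random.Random(2021)
# Illustrative numbers only, not values from any elicitation workshop.
m = piecewise_linear([(20.0, 10.0), (50.0, 25.0), (80.0, 45.0)])
x_draws = extension_sample(10_000, NormalDist(50.0, 15.0), m, 8.0, rng)

rho = rho_from_concordance(0.7)
pairs = copula_sample(10_000, rho, NormalDist(30.0, 10.0),
                      NormalDist(100.0, 40.0), rng)
# Empirical concordance: share of pairs on the same side of both medians.
same_side = sum((a > 30.0) == (b > 100.0) for a, b in pairs) / len(pairs)
```

A concordance probability of 0.7 corresponds to a correlation of about 0.59, and the share of sampled pairs falling on the same side of both medians should be close to 0.7 again, which gives a simple check to show the experts. A concordance probability of 0.5 corresponds to zero correlation.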
In this section, we provide an in-depth description of the expert elicitation and PoS calculation for the example introduced in Section 2. We decided to structure the elicitation process into three parts. First, we followed the SHELF extension method by using Phase 2 data to establish a marginal distribution for the effect of fevipiprant on sputum eosinophil counts and then elicited from a group of experts a set of conditional judgements on the effect on exacerbations in the Phase 3 population given different values for the effects on this surrogate endpoint. Secondly, we elicited the experts’ beliefs on the effect of fevipiprant on FEV1 in the Phase 3 population. Finally, we used the SHELF copula method to elicit the dependence between drug effects on exacerbations and FEV1.
Fevipiprant was studied in four Phase 2 RCTs in asthma and the results of these studies for the FEV1 endpoint are summarised in Figure 1.
1. A Proof of Concept RCT (ClinicalTrials.gov identifier NCT01253603) with a 4 week treatment duration in patients on reliever therapy did not show an effect of fevipiprant on the primary endpoint of FEV1 in the overall trial population, but more favourable results were seen for a subgroup of more severe patients [Erpenbeck et al., 2016].
2. A dose finding RCT (NCT01437735) with a 12 week treatment duration [Bateman et al., 2017] was the basis of the selection of one of the Phase 3 doses.
3. A 12-week RCT looked at potential differences in effects in patients with atopic and non-atopic asthma (NCT01836471).
4. Finally, there was an RCT (NCT01545726) that showed a reduction of sputum eosinophil counts after 12 weeks of treatment with fevipiprant compared with a placebo group [Gonem et al., 2016]. The ratio of a 3.5-fold (95% CI 1.7 to 7.0; p=0.0014) lower ratio of geometric means in
Figure 1: Observed differences in FEV1 to placebo with 95% confidence intervals for fevipiprant in Phase 2 studies in the subgroup with a blood eosinophil count of ≥ 250 cells/µL and in the overall trial populations (A: atopic patients, NA: non-atopic patients).
[Figure 1 panels: “Patients with high blood eosinophil counts” and “Overall population”. x-axis: difference in FEV1 to placebo [mL]; y-axis: ClinicalTrials.gov study identifier (NCT01253603, NCT01437735, NCT01836471 (NA), NCT01836471 (A), NCT01545726); dose labels: 500 QD, 150 QD, 450 QD, 225 BID.]
[Figure 2: rows “Predictive distribution” and “NCT01545726”; x-axis: reduction [%] in sputum eosinophils for fevipiprant 225 mg BID compared with placebo at week 12 (ticks from 90% to −100%); y-axis: study.]
A number of anti-inflammatory treatments that lower sputum eosinophil counts have been shown to reduce exacerbation rates in asthma patients with elevated sputum eosinophil counts [Petsky et al., 2010]. This evidence was mostly generated with corticosteroids, but suggests that sputum eosinophil counts may be a surrogate for a reduction in exacerbations. As part of the evidence dossier for this expert elicitation, we assembled more recent evidence from 22 trials of other drug classes [Green et al., 2002, Chlumský et al., 2006, Jayaram, 2006, Nair et al., 2009, Fleming et al., 2011, Castro et al., 2011, Pavord et al., 2012, Laviolette et al., 2013, Wenzel et al., 2013, Ortega et al., 2014, Gauvreau et al., 2014, Castro et al., 2015, Bleecker et al., 2016, FitzGerald et al., 2016, Corren et al., 2017, Panettieri et al., 2018, Russell et al., 2018, Castro et al., 2018]. The data are shown in Panel A of Figure 3 and the results from a Bayesian meta-regression model are shown in Panel B of the figure. Without data from a variety of different drugs, this meta-regression would be highly questionable, because then its findings might only apply to a specific mode of action. Note that some of these data were not available at the time the Phase 3 program for fevipiprant was started.
For the question of the likely effect of fevipiprant on FEV1 in asthma patients with blood eosinophil counts ≥ 250 cells/µL, the evidence dossier presented the Phase 2 results for the overall population, as well as for subgroups defined by blood eosinophil counts (see Figure 1).
In addition, the evidence dossier gave details of the fevipiprant Phase 3 program, and discussed the strengths and limitations of the available evidence that the experts needed to bear in mind.
Figure 3: Effects of anti-inflammatory asthma therapies on sputum eosinophil counts and exacerbation rates compared with placebo: Estimates with 95% confidence intervals for exacerbation rate ratios and ratio of geometric mean (vs. placebo) ratios of sputum eosinophil levels at the end of the study compared with baseline (Panel A), and meta-regression using random drug effects on intercept and slope of relationship, as well as random study effects (Panel B); Studies 10 and 11 are the two parts of study NCT02414854 that were not blinded against each other.
[Figure 3, Panel A: treatment effect vs. placebo (ratio compared with placebo) by study, for exacerbations and sputum eosinophils. Panel B: exacerbation reduction (%) vs. placebo against reduction (%) in sputum eosinophils vs. placebo, with 50%, 80% and 95% prediction intervals. Treatments shown: benralizumab, mepolizumab, reslizumab, tralokinumab, dupilumab, tezepelumab, sputum based strategy, fevipiprant.]
4.2 Choice of quantities of interest for elicitation
The QoIs to be elicited were chosen based on their importance for meeting the success definition of the PoS framework and lack of evidence to directly inform a predictive distribution. The global project team considered the results in the two exacerbation trials (NCT02555683 and NCT02563067) in the pre-specified subgroup of patients with high eosinophil counts to be the most important to fulfil the TPP. These 1-year exacerbation RCTs compared two doses of fevipiprant with a placebo on top of continued standard of care therapy in severe asthma patients. The rate of asthma exacerbations (TPP target: ≥ 40% relative rate reduction compared with placebo) was the primary endpoint of these studies, while the key secondary endpoint of FEV1 (TPP target: ≥ 120 mL improvement in FEV1 compared with placebo) was considered to be especially important for regulatory approval. There was considerable historical data on the placebo exacerbation rate, the between patient heterogeneity in the exacerbation rate [Holzhauer et al., 2017] and the variability in FEV1 so that these quantities did not require elicitation.
The biggest source of uncertainty regarding the PoS was about the effects of fevipiprant on asthma exacerbations and FEV1, as well as about their correlation. For this reason, these were identified as the QoIs for the expert elicitation. We carefully chose the phrasing of the questions about the QoIs to make it easy for the experts to think about them and express their judgements.
We decided to use the extension method to elicit judgements about the relative rate reduction in exacerbations conditional on a specified reduction in sputum eosinophils, and to use the copula method to elicit the association between the two QoIs. On that basis, we formally defined the following three QoIs:
• X is the average reduction in moderate to severe asthma exacerbations achieved by fevipiprant compared to placebo over the population of eligible patients,
• Y is the average reduction in sputum eosinophil counts achieved by fevipiprant compared to placebo over the population of eligible patients,
• Z is the average increase in FEV1 achieved by fevipiprant compared to placebo over the population of eligible patients.
Eligible patients are defined as matching the inclusion criteria for the NCT02555683 and NCT02563067 Phase 3 trials and having blood eosinophil counts of at least 250 cells/µL. Note that because we had already derived the marginal predictive distribution in Figure 2 for the reduction Y in sputum eosinophil counts from Phase 2 data, the extension method for the QoI X required only conditional distributions to be elicited.
The choice and phrasing of the QoIs in elicitation is an important early task.
Quantities must be clearly and unambiguously defined, in terms that are familiar to the experts. It must be clear that each quantity has a unique, well-defined (but unknown) value. We chose to elicit treatment effects compared with placebo as percentage reductions in exacerbations and improvements in FEV1, because these effect measures are widely used in asthma trials and were familiar to the experts. The effects are defined as averages over all potential patients so that they have well-defined and unique values. The experts would be asked for their judgements on questions such as:

1. Given that an anti-inflammatory drug reduces sputum eosinophil counts by Y, what do you judge to be the likely values for the relative exacerbation rate reduction X in eligible patients?
2. What do you judge to be the likely values for the difference Z between fevipiprant and placebo in FEV1 in millilitres (mL) in eligible patients?
3. Given the judgements about the reduction in exacerbations and the change in FEV1 caused by fevipiprant, how likely do you judge it to be that both X and Z will be on the same side of your median values?

In order to capture the full range of opinions and differing past experiences amongst experts, a group of company-internal experts was convened. The 5 selected experts all had extensive experience in drug development in the respiratory area. Two were part of the fevipiprant team (a clinician and a statistician), while 3 were not members of the fevipiprant team (a clinician, a translational medicine expert and a regulatory affairs expert). These experts were selected because the QoIs appeared to be related to clinical trials and understanding mechanistic considerations around the drug efficacy. We wanted at least some of this key expertise to be from outside of the fevipiprant project team to ensure an outside opinion would be heard.
A statistician was considered important to provide a perspective on the available evidence, and the expert in regulatory affairs was selected due to a broad experience with multiple previous programs. Prior to the elicitation workshop, all experts were encouraged to work through an online course on expert elicitation [O’Hagan, 2018] and they were guided through a practice exercise by the facilitator at the start of the workshop.
The elicitation workshop was an in-person 4-hour meeting with one facilitator, one recorder and five experts. While the facilitator guided the meeting and asked the experts questions, the role of the recorder was to operate the SHELF software, project relevant visualisations for the experts and take minutes of the meeting.

Elicitation of first quantity of interest
The median of the marginal distribution of Y shown in Figure 2, based on a Bayesian analysis of Phase 2 sputum eosinophil data, was a 66% reduction (80% interval from 52 to 76%). Round numbers are easier for experts to condition on, and so, for the first QoI, the median of 66% was rounded to 65%. Thus, the experts were first asked for their judgement on X conditional on Y being a 65% reduction in sputum eosinophil counts.

For the individual judgements about this QoI, the tertile method was used. Each expert first independently wrote down their plausible range for the QoI, followed by their median and the points that divide the plausible range into equally probable thirds. At each step the experts were asked to challenge their own judgements. For instance, after specifying their plausible range, experts were asked to consider their reaction if a large study estimated X to be outside that range; would they acknowledge that their range was too narrow, or would they be suspicious of the reported estimate? If their reaction would be the former one, then they should widen their plausible range.

Then the individual judgements were revealed to the group and the experts were asked to explain their judgements. In this wide-ranging discussion, a number of points were raised and the main arguments were recorded using the SHELF templates. Afterwards, consensus judgements were obtained using the probability method: experts were asked what probability RIO (the Rational Impartial Observer) would assign to the relative exacerbation rate reduction being less than 25%, greater than 40% and less than 35%. After considerable discussion, the group agreed that RIO would assign probabilities of 30%, 30% and 50%, respectively. A Beta(2.81, 3.05) distribution scaled to a plausible range of 0 to 70% was fitted to these judgements and shown to the experts. The experts felt that this distribution, with a median at 33.4% (90% credible interval 11.9 to 55.8%), adequately represented their knowledge.
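The fitted group distribution can be inspected numerically. The following is a minimal sketch, assuming SciPy is available and that "scaled to a plausible range of 0 to 70%" means a standard Beta distribution stretched linearly onto that interval; the variable names are illustrative and not part of the SHELF software. It compares the probabilities implied by the fitted curve with the three elicited group judgements and recomputes the median and 90% interval.

```python
from scipy.stats import beta

# Fitted group distribution from the text: Beta(2.81, 3.05) rescaled to 0-70 %.
A, B = 2.81, 3.05
LOWER, UPPER = 0.0, 70.0  # plausible range for X, in percent

# Assumption: the scaling is linear, which SciPy expresses through loc/scale.
x_given_y65 = beta(A, B, loc=LOWER, scale=UPPER - LOWER)

# Probabilities implied by the fitted curve, to set against the elicited
# group judgements of 30 %, 30 % and 50 %.
p_below_25 = x_given_y65.cdf(25.0)
p_above_40 = x_given_y65.sf(40.0)
p_below_35 = x_given_y65.cdf(35.0)

# Summary statistics of the fitted distribution.
median = x_given_y65.ppf(0.5)
lower_5, upper_95 = x_given_y65.ppf([0.05, 0.95])

print(f"P(X < 25%) = {p_below_25:.2f} (elicited 0.30)")
print(f"P(X > 40%) = {p_above_40:.2f} (elicited 0.30)")
print(f"P(X < 35%) = {p_below_35:.2f} (elicited 0.50)")
print(f"median = {median:.1f}%, 90% interval = {lower_5:.1f}% to {upper_95:.1f}%")
```

A fitted two-parameter curve cannot in general reproduce three elicited probabilities exactly, so small differences between the implied and elicited values are to be expected; showing the curve to the experts, as described above, is what confirms that the fit is acceptable.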
The result of this elicitation was a distribution for X (exacerbation reduction), given that Y (sputum eosinophil reduction) is 65%. The results of the individual judgements and the group judgement are shown on the left-hand side of Figure 4.

Then, the experts were asked for their conditional judgement about the median percentage reduction in exacerbations given an effect on sputum eosinophils of 50%, then for 75%, 60% and 70%. These numbers correspond approximately to the 10%, 90%, 25% and 75% points of the marginal predictive distribution for effects of fevipiprant on sputum eosinophil counts, respectively. Thus, they characterise conditional judgements across the bulk of this distribution. Their order was chosen in order to minimise known sources of cognitive bias and to ensure that experts needed to think carefully about each judgement. The elicited medians are shown in Panel A of Figure 5.

Figure 4: Distributions elicited from individual experts, linear pool of these distributions and group judgements. Left panel: relative exacerbation rate reduction given 65% sputum eosinophil reduction compared with placebo (experts A to E). Right panel: treatment difference in pre-bronchodilator FEV1 [mL] compared with placebo (experts A to D). Each panel marks the plausible limits, tertiles and median for each expert, the pool and the group judgement, together with the density of the group judgement.

It was agreed that over the plausible range of effects on sputum eosinophil counts, there was no probability that the drug could increase the number of exacerbations, because the assumption that fevipiprant reduced sputum eosinophils indicated at least some positive benefit. It was therefore appropriate to model the distributions of exacerbation reductions at intermediate sputum eosinophil
effects through a log transformation, i.e. to assume that median(log(X | Y)) is a piecewise linear function of Y. The experts were shown the resulting median relationship in Panel A of Figure 5 and agreed that it represented a reasonable RIO opinion.

Using the log transformation, the shape of the conditional distribution given Y = 65% was also assumed for X conditional on other values of Y, but scaled to follow the elicited median model, i.e. we shifted the median of each Beta distribution according to Panel A of the figure and kept the variance on the log scale constant. The recorder showed the experts the resulting conditional distribution plot in Panel B of Figure 5. The facilitator pointed out how the scaling had resulted in less uncertainty conditional on Y = 50% but more conditional on Y = 75%. The experts confirmed that this was a reasonable representation of their beliefs.

The elicitation of the first QoI was now complete and the required (marginal) distribution for X was computed by Monte Carlo simulation by combining the elicited conditional relationship with the predictive distribution for Y from Figure 2. It is shown in the top-most panel of Figure 7.

Figure 5: Piecewise-linear median model for the elicited medians (Panel A) and conditional distributions for the relative exacerbation rate reduction across the range of plausible effects on sputum eosinophil counts (Panel B). Panel A plots the conditional median relative rate reduction in exacerbations [%] (X) against the effect on sputum eosinophils (Y); Panel B plots the relative rate reduction in exacerbations [%] given the effect on sputum eosinophil counts (X) against the effect on sputum eosinophils (Y).

The elicitation for the second QoI then proceeded using the tertile method for individual judgements, followed by a discussion and, again, using the probability method for the consensus judgement. The resulting judgements are shown on the right-hand side of Figure 4.

[Scatter plot based on 10,000 Monte Carlo samples. Axes: relative exacerbation rate reduction given 65% sputum eosinophil reduction compared with placebo; treatment difference in pre-bronchodilator FEV1 [mL] compared with placebo.]

The joint distribution of the treatment effects on exacerbations and FEV1, X and Z, was then elicited using the copula method. The correlation was elicited through the concordance probability, i.e. RIO’s judgement of the probability that the true values of X and Z would both be on the same side of their elicited medians. The experts found the concordance probability difficult to judge. After the facilitator gave an alternative explanation in terms of the conditional probability that one variable was above its median given that the other was above its median, a concordance probability of 0.7 was tentatively agreed by the experts.
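The two computational steps described above, the Monte Carlo marginal for X under the extension method and the link between X and Z through the concordance probability, can be sketched as follows. This is an illustration under stated assumptions, not the SHELF implementation: it assumes a Gaussian copula, for which a concordance probability p corresponds to a correlation of sin(π(p − 1/2)); the predictive distribution for Y, the conditional medians at Y = 50, 60, 70 and 75%, and the marginal for Z are hypothetical stand-ins, because their numerical values are only shown in figures. Only the Beta(2.81, 3.05) distribution on 0 to 70% and the concordance probability of 0.7 are taken from the text.

```python
import numpy as np
from scipy.stats import beta, norm

rng = np.random.default_rng(2021)
N = 10_000

# --- Extension method: marginal of X from X | Y and the predictive for Y ---
# HYPOTHETICAL stand-in for the Phase 2 predictive distribution of Y (in %):
# normal on the logit scale with median 66 %, only roughly matched in spread.
y = 100.0 / (1.0 + np.exp(-rng.normal(np.log(0.66 / 0.34), 0.42, N)))

# Elicited X | Y = 65 % from the text: Beta(2.81, 3.05) scaled to 0-70 %.
x_ref_dist = beta(2.81, 3.05, loc=0.0, scale=70.0)
x_ref_median = x_ref_dist.ppf(0.5)

# Conditional medians of X (in %) at the five conditioning values of Y.
# The value at Y = 65 is the fitted median; the other four are HYPOTHETICAL.
y_knots = np.array([50.0, 60.0, 65.0, 70.0, 75.0])
x_medians = np.array([22.0, 30.0, x_ref_median, 37.0, 41.0])

# Piecewise-linear model for median(log X | Y). np.interp holds the end
# values flat outside 50-75 %, which is a simplification of this sketch.
log_median = np.interp(y, y_knots, np.log(x_medians))

# Shift the reference distribution on the log scale to the modelled median,
# which keeps the variance of log X constant across values of Y.
x_ref = x_ref_dist.rvs(size=N, random_state=rng)
x = np.exp(np.log(x_ref) + log_median - np.log(x_ref_median))

# --- Copula method: join X and Z through an (assumed) Gaussian copula ---
concordance = 0.7
rho = np.sin(np.pi * (concordance - 0.5))

# Correlated standard normals, transformed to correlated uniforms.
g1 = rng.standard_normal(N)
g2 = rho * g1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(N)
u1, u2 = norm.cdf(g1), norm.cdf(g2)

# Map the uniforms through each marginal's quantile function.
x_joint = np.quantile(x, u1)                       # empirical quantiles of X
z_marginal = beta(2.0, 4.0, loc=0.0, scale=200.0)  # HYPOTHETICAL marginal for Z (mL)
z_joint = z_marginal.ppf(u2)

# Check: share of draws with X and Z on the same side of their medians.
same_side = (x_joint > np.median(x)) == (z_joint > z_marginal.ppf(0.5))
empirical_concordance = same_side.mean()

print(f"copula correlation for concordance {concordance}: {rho:.3f}")
print(f"empirical concordance of the joint draws: {empirical_concordance:.3f}")
```

Because the copula only fixes the dependence between the ranks of X and Z, the marginal distributions can be replaced by the elicited ones without changing the concordance probability, which is why the final check should recover a value close to 0.7 whatever marginals are plugged in.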
The experts were shown a graphic similar to Figure 6 for the case of a concordance probability of 0.7 and found it very helpful and in accord with their expectations. Alternative concordance probabilities were explored using the same graphical display. The correlation was too tight with 0.8 concordance, and the experts felt that there was appreciable positive correlation, so 0.5 concordance was not considered appropriate. The resulting joint distribution is shown in Figure 6.

We already described the basic aims of the newly introduced PoS framework at Novartis at a high level in Section 2. Its practical application involves the following four steps [Hampson et al., 2021]. First, a benchmark probability of approval for a project at the start of Phase 2 is estimated based on a small number of program characteristics by a logistic regression model trained on a database of drug development projects. Second, a Bayesian analysis is conducted, in which the prior for the efficacy effects is set based on the benchmark probability of efficacy success in both Phase 2 and 3. This prior is then used in combination with Phase 2 data to obtain a posterior distribution for drug efficacy. Phase 3 studies are then simulated using samples from the posterior in order to estimate the probability of the key efficacy endpoints meeting TPP criteria in the Phase 3 program. Benchmark information is also used to account for the risk of program failure due to an unexpected safety issue and of not obtaining regulatory approval despite a successful Phase 3 program. Third, a program risk assessment is done to capture other risks not already covered by the previous calculations. This assessment is then used to adjust the probability of a registration with a label meeting TPP criteria to obtain the PoS. The adjustment in this step was also determined using an elicitation process.
Finally, in exceptional circumstances a fourth step allows for an adjustment for factors not captured by the preceding three steps.

In this case study, the Bayesian analysis in the second step of the PoS approach could not directly inform the PoS of the Phase 3 program due to the differences in endpoints and population between Phase 2 and 3. Thus, the results of the Bayesian analysis for sputum eosinophil counts in Figure 2 were linked to the efficacy on asthma exacerbations in Phase 3 using an expert elicitation in the manner described in Section 3.3. In contrast, the effect of fevipiprant on FEV1 was elicited directly from the experts and the joint distribution of the efficacy of fevipiprant for both endpoints was then obtained as described in Section 3.4.

For pragmatic reasons the Novartis PoS approach foresees that only one or two key endpoints should be considered in the definition of success. For this reason, it was decided to ignore the other two key secondary endpoints (asthma control questionnaire and asthma related quality of life questionnaire) of these Phase 3 trials for the purposes of the PoS calculation.

The estimated benchmarks for the first indication of a respiratory orally administered small molecule without an FDA breakthrough designation were:

• a Phase 2 success probability of 24%,
• a Phase 3 success probability of 60% conditional on Phase 2 success, and
• an approval probability of 94% conditional on Phase 2 and 3 success.

The program risk assessment [Hampson et al., 2021] considered the majority of categories to fall into the lowest risk category with one question falling into the intermediate risk category.

When these numbers were combined with simulated Phase 3 outcomes based on the elicited quantities, a PoS of 4% was calculated. The main hurdle was FEV1 and the high TPP target for exacerbation reduction. If one only considered a TPP requiring a relative exacerbation reduction of 30% with no requirements for FEV1, the PoS became 41%.
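For orientation, the three benchmark probabilities listed above can be chained by the multiplication rule for conditional probabilities to give a benchmark probability of approval as seen from the start of Phase 2. The short sketch below shows only this arithmetic; it is not the PoS calculation of the framework, which replaces the efficacy benchmarks with the Bayesian analysis, the simulated Phase 3 outcomes and the risk adjustment described earlier. The variable names are illustrative.

```python
# Benchmark probabilities quoted in the text.
p_phase2 = 0.24                # Phase 2 success
p_phase3_given_phase2 = 0.60   # Phase 3 success, given Phase 2 success
p_approval_given_both = 0.94   # approval, given Phase 2 and Phase 3 success

# Multiplication rule:
# P(approval) = P(Ph2) * P(Ph3 | Ph2) * P(approval | Ph2, Ph3)
p_benchmark_approval = p_phase2 * p_phase3_given_phase2 * p_approval_given_both

print(f"benchmark probability of approval from start of Phase 2: {p_benchmark_approval:.1%}")
```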
The whole PoS process required approximately 2 months. After an initial review, we identified that an expert elicitation workshop would be needed. On 28 May 2019, we identified the facilitator for the workshop and compiled a list of candidate dates. In the meantime, the team worked to assemble an evidence dossier. By 12 June, we had arranged an elicitation workshop on 12 July after confirming the availability of five experts. By 1 July, the evidence dossier had been drafted by the biostatistics team and shared with the facilitator and recorder; it was finalised on 8 July after a review by internal experts, four days before the workshop. One lesson learned was that we should have shared the dossier with the experts earlier in order to allow them to provide feedback on its contents so that additional evidence could have been introduced up-front. On 12 July the workshop took place using version 4 of the SHELF methodology and on 20 July 2019 the final report of the elicitation meeting was issued. All recordings from the meeting were made using the templates provided as part of the SHELF documents package and participants were kept anonymous in these minutes by using the letters A to E for the experts, as well as Z for the facilitator.
The results of the Phase 3 trials, for which we conducted the expert elicitation, are shown in Figure 7. As can be seen, only one comparison within one of the two trials was associated with a confidence interval that excluded no effect, but this result was not considered statistically significant after an adjustment for multiplicity [Brightling et al., 2020]. The results of the Phase 3 trials are very informative in the sense that the 95% confidence intervals essentially exclude the TPP targets.

These results are consistent with the elicited prior information from the experts: the experts essentially excluded the possibility that the true effect of the studied fevipiprant doses on FEV1 meets the TPP target, while for the primary exacerbation endpoint, the experts judged that there was a reasonable possibility that the true effect was at or above the TPP target. On the basis of these Phase 3 results Novartis did not pursue a filing for an indication in asthma.

Figure 7: Implied distribution for true effect of fevipiprant 450 mg QD on exacerbations and FEV1 based on elicited expert judgements, and study results in the high blood eosinophil subgroup of the Phase 3 exacerbation trials. Left: relative exacerbation rate reduction for fevipiprant compared with placebo (axis from 75% to −50%). Right: treatment difference in pre-bronchodilator FEV1 [mL] compared with placebo (axis from −50 to 150). In each, the upper panel shows the density of the expert elicited prior and the lower panel shows the study results for 150 mg QD and 450 mg QD in NCT02563067, NCT02555683 and pooled, with the TPP target marked.

The quality of decisions in the presence of uncertainty can be improved by taking the judgements of experts based on the available evidence into account. When stakes are high, as with major investment decisions by a pharmaceutical company, the necessary effort and cost of obtaining experts’ judgements is negligible compared to the cost of a wrong decision. This is one of the reasons why the new Novartis PoS framework, which is applied for the decision to initiate pivotal trials for a project, recommends expert elicitation when substantial direct evidence about QoIs is not available. The SHELF extension method and the SHELF copula method address two common scenarios in this setting: when we extrapolate the evidence from surrogate endpoints to Phase 3 endpoints, and when the size of a drug's effect on one endpoint changes our judgement of its effect on other endpoints.

There are currently no published examples of how to apply these methods as part of the SHELF protocol in the pharmaceutical industry. Therefore, we felt it would be helpful to share an example illustrating the full extent of real-world complexities and the relevant practical considerations. This will hopefully help others who wish to use expert elicitation to inform clinical drug development or other types of high-stakes decisions.

We do not wish to overemphasise the outcomes from a single example. Nevertheless, the close alignment of the experts’ group judgements with the trial outcomes, which were not known to the experts at the time of the elicitation workshop, supports the validity of expert elicitation in drug development. If a similar elicitation outcome had been available at the time of the decision to start the Phase 3 program for fevipiprant, it would have suggested a lower PoS than assigned at the time and may have led to re-evaluation of the assumptions regarding the secondary FEV1 endpoint.
However, this proof of concept for elicitation as part of a new PoS framework was performed 4 years after this decision and used information that only became available subsequently.

The project team noted that the evidence dossier and the discussions in the elicitation workshop were extremely helpful for assembling and understanding the existing evidence on the efficacy of the drug. Teams are sometimes very well aware of the clinical trials conducted for their product, but have not systematically reviewed the indirect evidence that is available from other sources. After the elicitation workshop the experts said that they appreciated the structured and scientific process, that they found the methodology intuitive, and that they were positively surprised by how fully non-statisticians could participate in the workshop.

While we describe a particular example of an elicitation workshop, we have now run several similar workshops at Novartis and some of the authors of this paper have several years of experience of doing so with other clients. On this basis, we offer a number of practical recommendations. It is important to start preparing the evidence dossier as early as possible so that experts and other stakeholders can give feedback prior to a workshop. This is also an opportunity to let senior leaders with strong positive opinions on projects provide the evidence they wish to be considered. Additionally, it can be difficult for experts to free their schedules for long workshops, and we have found that people find it hard to concentrate in virtual meetings for as long as in in-person workshops. This has led us to investigate options for eliciting individual judgements prior to the main workshop. It is also important to clearly communicate how elicitation results will be used. In the context of the PoS of drug development programs, this meant making it clear that the resulting probability is not the sole determinant of funding for a project.
We now routinely remind teams that investment decisions will also be based on other factors such as the costs of development, market opportunity and unmet medical need.

We thank Ana-Maria Tanase, Christian Hasenfratz and Hanns-Christian Tillmann for being experts for the asthma case study, as well as Kelvin Stott, Giovanni Della Cioppa and Karine Baudou for their support of the pilot phase of the Novartis PoS initiative.