Eliciting judgements about dependent quantities of interest: The SHELF extension and copula methods illustrated using an asthma case study

Abstract

Pharmaceutical companies regularly need to make decisions about drug development programs based on the limited knowledge from early stage clinical trials. In this situation, eliciting the judgements of experts is an attractive approach for synthesising evidence on the unknown quantities of interest. When calculating the probability of success for a drug development program, multiple quantities of interest - such as the effect of a drug on different endpoints - should not be treated as unrelated. We discuss two approaches for establishing a multivariate distribution for several related quantities within the SHeffield ELicitation Framework (SHELF). The first approach elicits experts' judgements about a quantity of interest conditional on knowledge about another one. For the second approach, we first elicit marginal distributions for each quantity of interest. Then, for each pair of quantities, we elicit the concordance probability that both lie on the same side of their respective elicited medians. This allows us to specify a copula to obtain the joint distribution of the quantities of interest. We show how these approaches were used in an elicitation workshop that was performed to assess the probability of success of the registrational program of an asthma drug. The judgements of the experts, which were obtained prior to completion of the pivotal studies, were well aligned with the final trial results.



Björn Holzhauer, Lisa V. Hampson, John Paul Gosling, Björn Bornkamp, Joseph Kahn, Markus R. Lange, Wen-Lin Luo, Caterina Brindicci, David Lawrence, Steffen Ballerstedt, Anthony O’Hagan

Novartis Pharma AG, Analytics, Basel, Switzerland
JBA Risk Management Ltd, Skipton, United Kingdom
Novartis Pharmaceuticals Corporation, Analytics, East Hanover, NJ, USA
The University of Sheffield, School of Mathematics and Statistics, Sheffield, United Kingdom

February 16, 2021

arXiv preprint [stat.ME]

1 Introduction

The decision to continue or stop the development of a new drug is an example of high-stakes decision making in the pharmaceutical industry. To continue usually means a commitment to large and costly clinical trials that may expose the enrolled patients to risks, while to stop may mean a missed opportunity to help patients. At the same time, only limited data are usually available. Thus, improving the decision making in these situations is an important problem.

For decision making with no or limited directly relevant data, eliciting the judgements of a group of experts is one approach to effectively combining the available direct and indirect evidence. Expert knowledge elicitation is the process of capturing expert knowledge about one or more uncertain quantities in the form of a probability distribution. It is an important tool to provide understanding of uncertain phenomena and inputs to decision-making processes. There has been a steadily growing demand for elicitation in many fields throughout industry, government and science; see, for example, Garthwaite and O’Hagan [2000], Gosling et al. [2012], Usher and Strachan [2013], and Bamber et al. [2019]. In particular, elicitation has been advocated and used in pharmaceutical science [Kinnersley and Day, 2013, Dallow et al., 2018] and public health [Ren et al., 2017, Soares and Bojke, 2018]. Due to the cognitive biases that experts are subject to, several frameworks and procedures have been proposed to guide the elicitation process in order to minimise these biases. The SHeffield ELicitation Framework (SHELF) described in Section 3 of this paper is one such framework.

It can be challenging to elicit judgements about a quantity of interest (QoI), when these judgements are being made conditional on knowledge about another quantity. Similarly, QoIs are often likely to be dependent, in which case the challenge of eliciting a joint distribution for several QoIs arises.
There are many methods in the literature for capturing knowledge about dependencies between multiple variables [Daneshkhah and Oakley, 2010, Werner et al., 2018]. However, these methodologies are typically reported in the literature as standalone methods rather than forming part of a complete elicitation protocol like SHELF. Also, whereas SHELF is a generic protocol that is applicable to a very wide range of applications, most of these methodologies have considerable restrictions.

• They may constrain the type of variables and distributions to be fitted, for example to Dirichlet distributions for proportions [Elfadaly and Garthwaite, 2013, Zapata-Vázquez et al., 2014].
• They may be tailored for a specific application, for example land cover [Baey et al., 2017] or system reliability [Norrington et al., 2008].
• They may consider complex restructuring for large numbers of dependent variables [Sigurdsson et al., 2001, Bedford and Cooke, 2001, Truong et al., 2013].

We present generic methods for eliciting joint distributions through judgements that experts can realistically make. Like the SHELF protocol itself, these methods are applicable in all areas where elicitation is required, and to use them effectively there are important choices to be made. Examples of their use, and the choices made, in any specific field can therefore serve as valuable guides for others to follow. We illustrate their use within SHELF in a pharmaceutical example that we fully describe in Section 2. The example concerns assessing the probability of success (PoS) of a Phase 3 drug development program. Such programs are expensive, resource-intensive long-term commitments for any organisation.
The decision to proceed with a Phase 3 program depends on many considerations including the unmet medical need and market opportunity a new drug may address, as well as the probability of success to address these needs.

As part of a pilot project to evaluate a new PoS framework at Novartis, we conducted a PoS assessment for an asthma drug. While Phase 2 studies had provided information on the effect of the drug on a surrogate outcome, no data were available on the primary endpoint of the key Phase 3 studies: moderate-to-severe asthma exacerbations, which are potentially life-threatening events with a significant burden on patients’ lives [Global Initiative for Asthma, 2020]. Additionally, there was an important key secondary endpoint, forced expiratory volume in 1 second (FEV1), an endpoint commonly used in asthma trials, for which Phase 2 data were available, but experts’ judgements were sought on the effect of the different treatment duration and trial population in Phase 3. If the drug worked on one endpoint, it was considered more likely to work on the other endpoint. Thus, a joint distribution was required. Techniques to address both problems through expert elicitation are available within the SHELF framework.

In Section 3, we first give a brief overview of elicitation methods and of SHELF. Then we describe the extension method for eliciting judgements about Phase 3 outcomes by linking to Phase 2 results and the copula method for eliciting joint distributions. In Section 4, we return to our motivating example and describe how we used these techniques to estimate the PoS of that drug development program. We also compare the obtained expert judgements with the outcomes of the Phase 3 studies. We finish with some conclusions and recommendations in Section 5.

The Phase 3 program of fevipiprant, a prostaglandin D2 receptor 2 antagonist for the treatment of asthma, was selected to pilot a new PoS framework that has since been introduced at Novartis [Hampson et al., 2021].
At the time of the PoS assessment, fevipiprant had been studied in several Phase 2 randomised controlled trials (RCTs) and the Phase 3 clinical trials comparing two fevipiprant doses (150 or 450 mg once a day) with placebo were underway, with data collection almost complete. This timing was one reason the program was selected as a pilot, because it ensured that the PoS assessment could not be influenced by the Phase 3 data, while at the same time minimising the time until the PoS assessment could be compared to the Phase 3 results. In reality, the assessment of the program and the decision to proceed with Phase 3 had already been taken at the end of Phase 2 based on more limited information.

One major challenge was that, unlike the Phase 2 trials, the key Phase 3 trials focused on more severe asthma patients, and in particular on the sub-population with a blood eosinophil count ≥ 250 cells/µL. The primary null hypotheses for this sub-population were tested first in the trials’ testing procedures [Brightling et al., 2020]. None of the Phase 2 trials evaluated the effect of fevipiprant on moderate-to-severe asthma exacerbations. The annualised rate of such exacerbations was the primary endpoint of the two most important trials in the Phase 3 program [Brightling et al., 2020]. Instead, a surrogate endpoint of reduction in sputum eosinophil counts had been measured in one of the Phase 2 trials [Gonem et al., 2016]. FEV1 was a key secondary endpoint in the Phase 3 program and has high regulatory acceptance as a measure of asthma control [Committee for Medicinal Products for Human Use, 2015]. FEV1 had been a primary or secondary endpoint of several of the Phase 2 studies including for dose ranging [Bateman et al., 2017], but these trials were of shorter duration and had a patient population with milder asthma than the Phase 3 trials.

As per the newly implemented PoS framework at Novartis, success was defined as regulatory approval with point estimates for key endpoints achieving or exceeding targets specified as part of a target product profile (TPP). It was assumed that regulatory approval would require statistical significance at the one-sided 0.025 significance level for at least one dose for both exacerbations and FEV1 in both of the key Phase 3 trials. Thus, to calculate the PoS, we needed a joint prior distribution for the effects of fevipiprant on exacerbations and FEV1.

Given the data that were available at the time of the PoS assessment, we decided to do this by eliciting the judgements of a group of experts.
The question then was how best to structure the elicitation process: we wanted to explicitly leverage the Phase 2 data on the surrogate endpoint of reduction in sputum eosinophil counts, since this was arguably the most relevant evidence we had for informing beliefs about the effect of fevipiprant on exacerbations.

We also expected experts to judge a larger effect of fevipiprant on FEV1 to be more likely the larger the effect of the drug on asthma exacerbations is. As a consequence, in order to fully characterise the joint distribution of these two treatment effects we would need to understand the size and direction of the dependence between these two quantities. In the next section, we describe the various approaches considered for the elicitation, before we return to the motivating example in Section 4 and describe how we practically applied these methods.

Elicitation can be done informally, but numerous pitfalls await the inexperienced practitioner, including well-established sources of bias in expert judgements [O’Hagan et al., 2006, European Food Safety Authority, 2014, O’Hagan, 2019a]. Therefore, when the expert judgements are sufficiently important it is necessary to employ a formal procedure in the interests of quality and defensibility. A small number of established elicitation protocols have been developed and refined by experienced practitioners [an overview is given in Dias et al., 2018].

The SHELF protocol is characterised by carefully structured sequences of judgements designed to minimise biases and a unique way of eliciting a consensus probability distribution from a group of experts [Gosling, 2018]. It is one of the most widely used elicitation protocols in the field of pharmaceutical science. The SHELF package of advice, templates and tools to support researchers wishing to conduct expert knowledge elicitation may be freely downloaded from the SHELF website [O’Hagan, 2019b]. Since its inception in 2008, SHELF has been steadily expanded with new advice and methods. For example, the extension method described in Section 3.3 was added in version 4 [Oakley and O’Hagan, 2019].

The SHELF protocol is distinguished by a number of key elements.

• Individual elicitation – discussion – group elicitation. Serious elicitation almost always requires using a group of experts in order to capture their combined knowledge. SHELF elicits a single distribution from the group but begins by eliciting judgements from each expert independently. This is followed by the experts discussing their differences to share their expertise, opinions and interpretations of the evidence. Finally, group judgements are elicited and the result is a “consensus” distribution fitted to these judgements. This combination of individual and group elicitations is the most important distinguishing feature of SHELF. The individual elicitations show each expert’s beliefs and form a basis for the subsequent discussion. The discussion is an opportunity to share and debate those opinions with a view to achieving a common understanding and is intended to extract maximum value from their joint expertise.
• The SHELF workshop. The discussion and group elicitation phases require that the experts come together in what is called a SHELF workshop. Typically, they are physically together in a room, although SHELF can be used with other arrangements, including video-conferencing.
• The evidence dossier. Prior to the workshop, a dossier is prepared summarising the evidence regarding the QoIs. In a typical elicitation there is some relevant evidence available, but there is not enough direct evidence to identify the value of any QoI (otherwise expert judgement would not be needed). The dossier ensures that all experts have access to the same evidence and that it is all fresh in their minds when they make their judgements. The essence of expert knowledge elicitation is that different experts interpret and weight the evidence differently, based on their own experience. The discussion phase in SHELF is where these differences are aired and debated.
• The rational impartial observer (RIO). Even after discussing and debating, SHELF does not expect the experts to reach complete agreement (such that they now have the same knowledge and beliefs about an uncertain quantity, represented by the same probability distribution). Instead they are asked to judge what a rational impartial observer, called RIO, might reasonably believe, having seen their individual judgements and listened to their discussion. By taking the perspective of RIO, experts can reach agreement on a distribution that represents a rational impartial view of their combined knowledge.
• The facilitator. The SHELF workshop is led by a facilitator, who has expertise in the process of eliciting expert knowledge, and in particular is familiar with SHELF. The facilitator works with the experts to accurately capture their knowledge, facilitates the group discussion and leads them in applying the RIO perspective. The facilitator’s role may also be found in other elicitation protocols, but it is particularly important in SHELF. The group interaction in SHELF’s discussion is another possible source of biases, which must be managed by the skill and experience of the facilitator. Many other protocols do not admit group discussion, thereby avoiding the risk of those biases but also losing the opportunity for the experts to share and debate their judgements.
• SHELF templates. The conduct of the workshop is recorded on SHELF templates, which play a dual role. First, they organise the progress of the workshop through a predefined series of steps. In particular, both the individual elicitations and the group elicitation are directed through controlled sequences of judgements. The entire process and the elicitation sequences used are based on research into the psychology of judgement, and on extensive experience in practical elicitation. Second, the templates serve to document a SHELF workshop, such that the conduct of the workshop and the development of each elicited distribution are clearly set out.

The result of an elicitation for a single uncertain QoI is a probability distribution. Accordingly, the judgements that experts are asked to make are probabilistic. The basic sequence of judgements at the individual judgements stage is as follows.

1. Plausible range. Experts are first asked to specify upper and lower plausible bounds such that they judge values of the QoI outside that range to be implausible. A numerical interpretation of ‘implausible’, for instance as a 1% or 5% probability, is not generally made, since the primary function of this step is to encourage the experts to think of all possibilities, thereby reducing any tendency to overconfidence.
2. Median. Experts are next asked to specify their median value for the QoI, such that they regard it as equally likely that the QoI would be above or below this value.
3. Quartiles or tertiles. Finally, experts specify their quartile or tertile values (the choice of which to ask for being according to the facilitator’s preference). Just as the median divides the plausible range into two intervals that are judged equally likely, quartiles divide it into four equally likely intervals and tertiles into three. In expert elicitations for the Novartis PoS framework we have favoured eliciting tertiles instead of quartiles, because we consider thinking about three instead of four equally likely intervals less challenging for experts.

The SHELF package contains copious advice and tools to help the experts to understand and make these judgements reliably. In particular, by following the SHELF protocol the facilitator asks questions in such a way that biases are minimised and there is no need for the experts to have a thorough understanding of probability or statistical theory. Training in making these judgements is also available through an online self-paced course accessed from the SHELF website [O’Hagan, 2018].

For the group judgements, the facilitator may ask the experts to agree on probabilities that RIO might assign to three specific propositions, such as that the QoI is negative, or that it exceeds some specified value. A probability distribution is then fitted to the three RIO probability judgements. SHELF provides some R software for fitting distributions using various standard families, such as normal, t, gamma, lognormal or beta distributions [Oakley, 2020]. However, other distributions may be fitted, and indeed a major consideration in SHELF is that the form of the elicited distribution should not be constrained in any way. The facilitator will work with the experts to identify a suitable distribution to represent their judgements. The fitted distribution is the final outcome of the elicitation.

Throughout the process, and particularly when determining the final agreed distribution, the facilitator will prompt and challenge the experts to ensure that the final distribution genuinely represents what RIO might believe after seeing the experts’ judgements and listening to their discussions.
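
As a rough illustration of the fitting step, the sketch below fits a normal distribution to three hypothetical judgements (lower tertile, median, upper tertile) by a straight-line fit of the elicited values on standard normal quantiles. The numbers, the function name `fit_normal_to_quantiles` and the least-squares criterion are assumptions made for this example only; the SHELF R software may use a different fitting criterion and supports other distribution families.

```python
from statistics import NormalDist


def fit_normal_to_quantiles(values, probs):
    """Fit a normal distribution to elicited (value, cumulative probability) pairs.

    Each judgement states P(QoI <= value) = prob. For a normal distribution,
    value = mu + sigma * z, where z is the standard normal quantile of prob,
    so mu and sigma follow from a least-squares line of values on z-scores.
    """
    if len(values) != len(probs) or len(values) < 2:
        raise ValueError("need at least two (value, probability) pairs")
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(p) for p in probs]
    n = len(z)
    z_bar = sum(z) / n
    v_bar = sum(values) / n
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (vi - v_bar) for zi, vi in zip(z, values))
    sigma = sxy / sxx
    if sigma <= 0:
        raise ValueError("elicited values must increase with probability")
    mu = v_bar - sigma * z_bar
    return NormalDist(mu, sigma)


# Hypothetical judgements (illustrative numbers, not taken from the workshop):
# lower tertile, median and upper tertile of a QoI.
elicited_values = [20.0, 30.0, 40.0]
elicited_probs = [1 / 3, 0.5, 2 / 3]

fitted = fit_normal_to_quantiles(elicited_values, elicited_probs)
# Probability that the QoI exceeds a hypothetical target of 40.
prob_exceeds_target = 1 - fitted.cdf(40.0)
```

A plot of the fitted density against the elicited judgements would normally be reviewed with the experts before the distribution is accepted.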

The SHELF package contains several techniques for eliciting a joint distribution for two or more uncertain quantities, including the extension and copula methods. The extension method is a generic technique that allows considerable flexibility for the form of the joint distribution. It is suitable, for example, for eliciting judgements about the treatment effect for a Phase 3 endpoint (X) based on the Phase 2 results for a surrogate endpoint (Y). The fact that Phase 3 follows Phase 2 chronologically makes it natural to express judgements about X conditional on Y.

For two QoIs, X and Y, the extension method consists of obtaining a marginal distribution for Y and a set of conditional distributions for X given Y = y. The elicitation of joint distributions requires the following steps.

1. A marginal distribution for Y is obtained. This distribution can be elicited as described in Section 3.2, but could also be the result of an analysis of available data. For example, in the asthma case study introduced in Section 2 it is a meta-analytic predictive distribution [Neuenschwander et al., 2010] based on Phase 2 data.
2. A conditional distribution (as always, from the perspective of RIO) is elicited for X conditional on Y equalling the median of its elicited marginal distribution, also following the basic method of Section 3.2.
3. Several other quantiles of the elicited marginal distribution of Y are selected as conditioning points; typically these will be the quartiles and the 5th and 95th percentiles. Median values are elicited for X conditional on Y equalling each of these conditioning points (first the 5th and 95th percentiles and then the quartiles). The basic SHELF approach of individual judgements – discussion – group judgements is used for each.
4. The final step is to ‘fit’ a set of conditional distributions to these judgements. First, a median function m(y) is fitted to the elicited conditional medians. This might for instance be a polynomial or a piecewise-linear fit (with extrapolation), and may be applied on a transformed scale. Second, a model is chosen to determine the conditional distributions based on the distribution at the Y-median elicited in Step 2. For instance, it may be decided that the Y-median distribution can be applied to all conditionals, simply shifted to follow the m(y) function. Alternatively, the variance may also be scaled depending on m(y). These choices are available in the SHELF R software, but again other choices can be made. The facilitator will always work with the experts to identify a ‘fit’ that best represents their judgements.

The extension method is appropriate when the experts perceive a natural causal link from Y to X. Indeed, it is particularly useful when the objective is to elicit a distribution for X but the experts would find it easier to make judgements about X if they knew the value of Y. In this case, the marginal distribution of X is the main outcome of the elicitation process. Although it will not generally be feasible to derive that marginal distribution analytically from the elicited joint structure, a large Monte Carlo sample can be drawn by sampling values y_i from the marginal distribution of Y and then sampling x_i conditional on Y = y_i. The Monte Carlo samples {x_i} are then samples from the marginal distribution of X and, if needed, a distribution can be fitted to the samples.

3.4 The SHELF copula method

When there is no natural ordering of related QoIs based on time or causality, the extension method requires an arbitrary imposition of an ordering and the conditional judgements are more difficult for the experts. The SHELF copula method is appropriate for two or three QoIs and does not require the elicitation of conditional distributions. However, it does place some constraints on the joint distribution. The method has the following steps.

1. Marginal distributions are elicited for each QoI individually, using the basic method of Section 3.2.
2. For each pair of QoIs, a single judgement concerning their degree of correlation is made. This judgement is called the concordance probability, and is the probability that both QoIs lie on the same side of their respective elicited medians.
3. A Gaussian copula joint distribution [Trivedi and Zimmer, 2007] is then fitted to these marginal distributions and concordance probabilities. The facilitator shows the experts suitable displays or summaries of the joint distribution to verify that it is a reasonable representation of their beliefs.

With just two QoIs, the copula method is simple to apply. The Gaussian copula imposes a restriction on the joint distribution but in practice it will usually be an adequate fit to the experts’ judgements.

In principle, the copula method is applicable for larger numbers of QoIs, but it is difficult to use for more than three. With three QoIs, three concordance probabilities need to be elicited. Under the Gaussian copula assumption, each concordance probability can be transformed to a correlation coefficient and the resulting correlation matrix must be positive definite. It is quite possible for the experts’ elicited concordance probabilities to fail to produce a valid correlation matrix, and they must then revisit their judgements with the aid of the facilitator to achieve an adequate fit. With more than three QoIs, the number of concordance probabilities rapidly increases, as does the likelihood of the elicited values not corresponding to a valid correlation matrix.

The SHELF copula method is a natural choice to construct a joint distribution for the effects of a drug on two Phase 3 endpoints, such as a primary and secondary clinical outcome.

The interested reader will find full technical details, as well as much practical advice, on these and other elicitation techniques in the SHELF package [Oakley and O’Hagan, 2019].
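
The copula steps above can be sketched in a few lines. For a bivariate normal pair with correlation ρ, the probability that both components fall on the same side of their medians is 1/2 + arcsin(ρ)/π, so a concordance probability p corresponds to ρ = sin(π(p − 1/2)). The code below is a minimal sketch under that relationship; the normal marginals, all numerical values and the function names are assumptions chosen for illustration and are not the judgements from the workshop or the SHELF R implementation.

```python
import math
import random
from statistics import NormalDist


def concordance_to_correlation(p):
    """Gaussian-copula correlation implied by a concordance probability p.

    Uses P(same side of medians) = 1/2 + arcsin(rho) / pi for a bivariate normal.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("concordance probability must lie in [0, 1]")
    return math.sin(math.pi * (p - 0.5))


def is_positive_definite_3x3(r12, r13, r23):
    """Check a 3x3 correlation matrix via its leading principal minors."""
    minor2 = 1.0 - r12 ** 2
    det3 = 1.0 + 2.0 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2
    return minor2 > 0.0 and det3 > 0.0


def sample_gaussian_copula_pair(marginal_x, marginal_z, rho, n, seed=0):
    """Draw n pairs whose marginals are given and whose dependence is a Gaussian copula.

    The marginals only need an inv_cdf method (here: statistics.NormalDist).
    """
    rng = random.Random(seed)
    std = NormalDist()
    eps = 1e-12  # keep uniforms strictly inside (0, 1) for inv_cdf
    pairs = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map correlated standard normals to uniforms, then to the marginals.
        u = min(max(std.cdf(a), eps), 1.0 - eps)
        v = min(max(std.cdf(b), eps), 1.0 - eps)
        pairs.append((marginal_x.inv_cdf(u), marginal_z.inv_cdf(v)))
    return pairs


# Hypothetical elicited marginals and concordance probability (illustrative only).
marginal_x = NormalDist(mu=30.0, sigma=15.0)
marginal_z = NormalDist(mu=100.0, sigma=60.0)
rho = concordance_to_correlation(0.75)
pairs = sample_gaussian_copula_pair(marginal_x, marginal_z, rho, n=20000)

# Sanity check: the simulated concordance should be close to the elicited 0.75.
same_side = sum((x > marginal_x.median) == (z > marginal_z.median) for x, z in pairs)
empirical_concordance = same_side / len(pairs)
```

With three QoIs, `is_positive_definite_3x3` can be applied to the three implied correlations to flag sets of concordance probabilities that the experts would need to revisit.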

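The Monte Carlo step of the extension method (Section 3.3) can be sketched in the same way. The example below assumes a normal marginal for Y, a piecewise-linear median function m(y) through five conditioning points, and a conditional distribution for X that is the distribution elicited at the Y-median simply shifted to follow m(y). All numbers and function names are hypothetical and serve only to illustrate the mechanics.

```python
import random
from statistics import NormalDist, median


def piecewise_linear(points):
    """Return m(y): linear interpolation through (y, conditional median of X) points.

    Values outside the range of the conditioning points are extrapolated
    along the nearest end segment.
    """
    pts = sorted(points)
    if len(pts) < 2:
        raise ValueError("need at least two conditioning points")

    def m(y):
        # Stop at the first segment whose right end is >= y; if none is,
        # the loop ends on the last segment, which is then extrapolated.
        for (y0, x0), (y1, x1) in zip(pts, pts[1:]):
            if y <= y1:
                break
        return x0 + (x1 - x0) * (y - y0) / (y1 - y0)

    return m


def sample_marginal_x(marginal_y, median_fn, x_at_y_median, n, seed=0):
    """Monte Carlo draws from the marginal of X under the 'shifted' model.

    Assumes normal distributions throughout: y_i is drawn from the marginal
    of Y, then x_i is drawn around median_fn(y_i) with the spread of the
    conditional distribution elicited at the Y-median.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        y = rng.gauss(marginal_y.mean, marginal_y.stdev)
        draws.append(rng.gauss(median_fn(y), x_at_y_median.stdev))
    return draws


# Hypothetical inputs (illustrative numbers only).
marginal_y = NormalDist(mu=60.0, sigma=15.0)         # marginal of Y
cond_probs = (0.05, 0.25, 0.50, 0.75, 0.95)          # conditioning quantiles of Y
ys = [marginal_y.inv_cdf(p) for p in cond_probs]
elicited_medians = [10.0, 22.0, 30.0, 36.0, 42.0]    # medians of X given Y = y
m = piecewise_linear(list(zip(ys, elicited_medians)))
x_at_y_median = NormalDist(mu=30.0, sigma=10.0)      # X given Y at its median

draws = sample_marginal_x(marginal_y, m, x_at_y_median, n=20000)
marginal_x_median = median(draws)
```

A distribution could then be fitted to `draws` if a parametric form for the marginal of X is needed.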
In this section, we provide an in-depth description of the expert elicitation and PoS calculation for the example introduced in Section 2. We decided to structure the elicitation process into three parts. First, we followed the SHELF extension method by using Phase 2 data to establish a marginal distribution for the effect of fevipiprant on sputum eosinophil counts and then elicited from a group of experts a set of conditional judgements on the effect on exacerbations in the Phase 3 population given different values for the effects on this surrogate endpoint. Secondly, we elicited the experts’ beliefs on the effect of fevipiprant on FEV1 in the Phase 3 population. Finally, we used the SHELF copula method to elicit the dependence between drug effects on exacerbations and FEV1.

Fevipiprant was studied in four Phase 2 RCTs in asthma and the results of these studies for the FEV1 endpoint are summarised in Figure 1.

1. A Proof of Concept RCT (ClinicalTrials.gov identifier NCT01253603) with a 4 week treatment duration in patients on reliever therapy did not show an effect of fevipiprant on the primary endpoint of FEV1 in the overall trial population, but more favourable results were seen for a subgroup of more severe patients [Erpenbeck et al., 2016].
2. A dose finding RCT (NCT01437735) with a 12 week treatment duration [Bateman et al., 2017] was the basis of the selection of one of the Phase 3 doses.
3. A 12-week RCT looked at potential differences in effects in patients with atopic and non-atopic asthma (NCT01836471).
4. Finally, there was an RCT (NCT01545726) that showed a reduction of sputum eosinophil counts after 12 weeks of treatment with fevipiprant compared with a placebo group [Gonem et al., 2016]. The ratio of a 3.5-fold (95% CI 1.7 to 7.0; p=0.0014) lower ratio of geometric means in

Figure 1: Observed differences in FEV1 to placebo with 95% confidence intervals for fevipiprant in Phase 2 studies in the subgroup with a blood eosinophil count of ≥ 250 cells/µL and in the overall trial populations (A: atopic patients, NA: non-atopic patients)

[Figure 1 panels: “Patients with high blood eosinophil counts” and “Overall population”. x-axis: Difference in FEV1 to placebo [mL]; y-axis: ClinicalTrials.gov study identifier (NCT01253603, NCT01437735, NCT01836471 (NA), NCT01836471 (A), NCT01545726); dose labels: 500 QD, 150 QD, 450 QD, 225 BID.]

PredictivedistributionNCT01545726 90% 75% 50% 25% 0% −100%

Reduction [%] in sputum eosinophils for fevipiprant 225 mg BIDcompared with placebo at week 12 S t ud y A number of anti-inﬂammatory treatments that lower sputum eosinophilcounts have been shown to reduce exacerbation rates in asthma patients with el-evated sputum eosinophil counts [Petsky et al., 2010]. This evidence was mostlygenerated with corticosteroids, but suggests that sputum eosinophil counts maybe a surrogate for a reduction in exacerbations. As part of the evidence dossierfor this expert elicitation, we assembled more recent evidence from 22 trials ofother drug classes [Green et al., 2002, Chlumsk´y et al., 2006, Jayaram, 2006,Nair et al., 2009, Fleming et al., 2011, Castro et al., 2011, Pavord et al., 2012,Laviolette et al., 2013, Wenzel et al., 2013, Ortega et al., 2014, Gauvreau et al.,2014, Castro et al., 2015, Bleecker et al., 2016, FitzGerald et al., 2016, Correnet al., 2017, Panettieri et al., 2018, Russell et al., 2018, Castro et al., 2018].The data are shown in Panel A of Figure 3 and the results from a Bayesianmeta-regression model are shown in Panel B of the ﬁgure. Without data froma variety of diﬀerent drugs, this meta-regression would be highly questionable,because then its ﬁndings might only apply to a speciﬁc mode of action. Notethat some of these data were not available at the time the Phase 3 program forfevipiprant was started.For the question of the likely eﬀect of fevipiprant on FEV in asthma patientswith blood eosinophil counts ≥

250 cells/ µ L, the evidence dossier presented thePhase 2 results for the overall population, as well as for subgroups deﬁned byblood eosinophil counts (see Figure 1).In addition, the evidence dossier gave details of the fevipiprant Phase 3program, and discussed the strengths and limitations of the available evidencethat the experts needed to bear in mind.11igure 3: Eﬀects of anti-inﬂammatory asthma therapies on sputum eosinophilcounts and exacerbation rates compared with placebo: Estimates with 95%conﬁdence intervals for exacerbation rate ratios and ratio of geometric mean (vs.placebo) ratios of sputum eosinophil levels at the end of the study compared withbaseline (Panel A), and meta-regression using random drug eﬀects on interceptand slope of relationship, as well as random study eﬀects (Panel B); Studies 10and 11 are the two parts of study NCT02414854 that were not blinded againsteach other. benralizumab mepolizumab reslizumab tralokinumab dupilumab tezepelumab sputum based strategy mepolizumab sputum based strategy benralizumab reslizumab tralokinumab dupilumab tezepelumab fevipiprant

Exacerbations Sputum eosinophils T r ea t m en t e ff e c t vs . pbo Ratio compared with placebo S t ud y A l ll l ll l benralizumab mepolizumabreslizumab tralokinumab dupilumabtezepelumab sputum based strategy % p r e d i c t i o n i n t e r v a l

50% 80% 95%

Choice of quantities of interest for elicitation

The QoI to be elicited were chosen based on their importance for meeting the success definition of the PoS framework and the lack of evidence to directly inform a predictive distribution. The global project team considered the results in the two exacerbation trials (NCT02555683 and NCT02563067) in the pre-specified subgroup of patients with high eosinophil counts to be the most important to fulfil the TPP. These 1-year exacerbation RCTs compared two doses of fevipiprant with a placebo on top of continued standard of care therapy in severe asthma patients. The rate of asthma exacerbations (TPP target: ≥ 40% relative rate reduction compared with placebo) was the primary endpoint of these studies, while the key secondary endpoint of FEV1 (TPP target: ≥ 120 mL improvement in FEV1 compared with placebo) was considered to be especially important for regulatory approval. There was considerable historical data on the placebo exacerbation rate, the between-patient heterogeneity in the exacerbation rate [Holzhauer et al., 2017] and the variability in FEV1, so that these quantities did not require elicitation.

The biggest source of uncertainty regarding the PoS was about the effects of fevipiprant on asthma exacerbations and FEV1, as well as about their correlation. For this reason, these were identified as the QoIs for the expert elicitation. We carefully chose the phrasing of the questions about the QoIs to make it easy for the experts to think about them and express their judgements.

We decided to use the extension method to elicit judgements about the relative rate reduction in exacerbations conditional on a specified reduction in sputum eosinophils, and to use the copula method to elicit the association between the effects on exacerbations and FEV1. On that basis, we formally defined the following three QoIs:

• X is the average reduction in moderate to severe asthma exacerbations achieved by fevipiprant compared to placebo over the population of eligible patients,
• Y is the average reduction in sputum eosinophil counts achieved by fevipiprant compared to placebo over the population of eligible patients,
• Z is the average increase in FEV1 achieved by fevipiprant compared to placebo over the population of eligible patients.

Eligible patients are defined as matching the inclusion criteria for the NCT02555683 and NCT02563067 Phase 3 trials and having blood eosinophil counts of at least 250 cells/µL. Note that because we had already derived the marginal predictive distribution in Figure 2 for the reduction Y in sputum eosinophil counts from Phase 2 data, the extension method for the QoI X required only conditional distributions to be elicited.

The choice and phrasing of the QoIs in elicitation is an important early task.
Quantities must be clearly and unambiguously defined, in terms that are familiar to the experts. It must be clear that each quantity has a unique, well-defined (but unknown) value. We chose to elicit treatment effects compared with placebo as percentage reductions in exacerbations and improvements in FEV1, because these effect measures are widely used in asthma trials and were familiar to the experts. The effects are defined as averages over all potential patients so that they have well-defined and unique values. The experts would be asked for their judgements on questions such as:

1. Given that an anti-inflammatory drug reduces sputum eosinophil counts by Y, what do you judge to be the likely values for the relative exacerbation rate reduction X in eligible patients?
2. What do you judge to be the likely values for the difference Z between fevipiprant and placebo in FEV1 in millilitres (mL) in eligible patients?
3. Given the judgements about the reduction in exacerbations and the change in FEV1 caused by fevipiprant, how likely do you judge it to be that both X and Z will be on the same side of your median values?

In order to capture the full range of opinions and differing past experiences amongst experts, a group of company-internal experts was convened. The 5 selected experts all had extensive experience in drug development in the respiratory area. Two were part of the fevipiprant team (a clinician and a statistician), while 3 were not members of the fevipiprant team (a clinician, a translational medicine expert and a regulatory affairs expert). These experts were selected because judging the QoIs appeared to require knowledge of clinical trials and an understanding of mechanistic considerations around the drug's efficacy. We wanted at least some of this key expertise to be from outside of the fevipiprant project team to ensure an outside opinion would be heard.
A statistician was considered important to provide a perspective on the available evidence, and the expert in regulatory affairs was selected due to broad experience with multiple previous programs.

Prior to the elicitation workshop, all experts were encouraged to work through an online course on expert elicitation [O'Hagan, 2018], and they were guided through a practice exercise by the facilitator at the start of the workshop.

The elicitation workshop was an in-person 4-hour meeting with one facilitator, one recorder and five experts. While the facilitator guided the meeting and asked the experts questions, the role of the recorder was to operate the SHELF software, project relevant visualisations for the experts and take minutes of the meeting.

Elicitation of first quantity of interest

The median of the marginal distribution of Y shown in Figure 2, based on a Bayesian analysis of Phase 2 sputum eosinophil data, was a 66% reduction (80% interval from 52 to 76%). Round numbers are easier for experts to condition on, and so, for the first QoI, the median of 66% was rounded to 65%. Thus, the experts were first asked for their judgement on X conditional on Y being a 65% reduction in sputum eosinophil counts.

For the individual judgements about this QoI, the tertile method was used. Each expert first independently wrote down their plausible range for the QoI, followed by their median and the points that divide the plausible range into equally probable thirds. At each step the experts were asked to challenge their own judgements. For instance, after specifying their plausible range, experts were asked to consider their reaction if a large study estimated X to be outside that range: would they acknowledge that their range was too narrow, or would they be suspicious of the reported estimate? In the former case, they should widen their plausible range.

Then the individual judgements were revealed to the group and the experts were asked to explain their judgements. In this wide-ranging discussion, a number of points were raised and the main arguments were recorded using the SHELF templates. Afterwards, consensus judgements were obtained using the probability method: experts were asked what probability RIO (the Rational Impartial Observer) would assign to the relative exacerbation rate reduction being less than 25%, greater than 40% and less than 35%. After significant discussion, the group agreed that RIO would assign probabilities of 30%, 30% and 50%, respectively. A Beta(2.81, 3.05) distribution scaled to a plausible range of 0 to 70% was fitted to these judgements and shown to the experts. The experts felt that this distribution, with a median of 33.4% (90% credible interval 11.9 to 55.8%), adequately represented their knowledge.
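As an illustrative check of the probability method described above (a minimal sketch, not the SHELF software itself), the reported Beta(2.81, 3.05) distribution on 0 to 70% can be compared with the three consensus judgements, and a least-squares fit to those judgements can be rerun. The use of SciPy, the sum-of-squares criterion, the starting values and all variable names are assumptions made for this example.

```python
import numpy as np
from scipy import optimize, stats

UPPER = 70.0  # upper plausible limit in %; the lower limit is 0%

# Consensus judgements as (threshold in %, P(X < threshold)).
# "Greater than 40% with probability 30%" is rewritten as P(X < 40%) = 0.70.
judgements = [(25.0, 0.30), (35.0, 0.50), (40.0, 0.70)]


def scaled_beta_cdf(x, a, b, upper=UPPER):
    """CDF of a Beta(a, b) distribution stretched to the range [0, upper]."""
    return stats.beta.cdf(x / upper, a, b)


def loss(log_params):
    """Sum of squared differences between fitted and elicited probabilities.

    Parameters are optimised on the log scale so that a and b stay positive.
    """
    a, b = np.exp(log_params)
    return sum((scaled_beta_cdf(x, a, b) - p) ** 2 for x, p in judgements)


# Distribution reported in the text.
reported = stats.beta(2.81, 3.05, scale=UPPER)
reported_probs = [reported.cdf(x) for x, _ in judgements]
reported_median = reported.median()
reported_interval = reported.ppf([0.05, 0.95])

# Assumed fitting criterion: least squares on the CDF, arbitrary start (2, 2).
fit = optimize.minimize(loss, x0=np.log([2.0, 2.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(fit.x)

print("Reported fit, P(X < 25, 35, 40%):", np.round(reported_probs, 3))
print("Reported fit, median (%):", round(reported_median, 1))
print("Reported fit, 90% interval (%):", np.round(reported_interval, 1))
print("Refitted Beta parameters:", round(a_hat, 2), round(b_hat, 2))
```

With three judgements and two free parameters the fit cannot match every probability exactly, which is why the experts were shown the fitted distribution and asked whether it adequately represented their knowledge.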
The result of this elicitation was a distribution for X (exacerbation reduction), given that Y (sputum eosinophil reduction) is 65%. The results of the individual judgements and the group judgement are shown on the left-hand side of Figure 4.

Then, the experts were asked for their conditional judgement about the median percentage reduction in exacerbations given an effect on sputum eosinophils of 50%, then for 75%, 60% and 70%. These numbers correspond approximately to the 10%, 90%, 25% and 75% points of the marginal predictive distribution for effects of fevipiprant on sputum eosinophil counts, respectively. Thus, they characterise conditional judgements across the bulk of this distribution. Their order was chosen to minimise known sources of cognitive bias and to ensure that experts needed to think carefully about each judgement. The elicited medians are shown in Panel A of Figure 5.

It was agreed that over the plausible range of effects on sputum eosinophil counts, there was no probability that the drug could increase the number of exacerbations, because the assumption that fevipiprant reduced sputum eosinophils indicated at least some positive benefit. It was therefore appropriate to model the distributions of exacerbation reductions at intermediate sputum eosinophil effects through a log transformation, i.e. to assume that median(log(X | Y)) is a piecewise linear function of Y. The experts were shown the resulting median relationship in Panel A of Figure 5 and agreed that it represented a reasonable RIO opinion.

Figure 4: Distributions elicited from individual experts, linear pool of these distributions and group judgements. [Left panel: relative exacerbation rate reduction given 65% sputum eosinophil reduction compared with placebo. Right panel: treatment difference in pre-bronchodilator FEV1 [mL] compared with placebo. Each panel shows, per expert and for the pool and group judgement, the plausible limits, the median and the lower, middle and upper thirds of the distribution, together with the density of the group judgement.]

Using the log transformation, the conditional distribution given Y = 65% was assumed for X conditional on other values of Y, but scaled to follow the elicited median model, i.e. we shifted the median of each Beta distribution according to Panel A of the figure and kept the variance on the log scale constant. The recorder showed the experts the resulting conditional distribution plot in Panel B of Figure 5. The facilitator pointed out how the scaling had resulted in less uncertainty conditional on Y = 50% but more conditional on Y = 75%. The experts confirmed that this was a reasonable representation of their beliefs.

The elicitation of the first QoI was now complete and the required (marginal) distribution for X was computed by Monte Carlo simulation by combining the elicited conditional relationship with the predictive distribution for Y from Figure 2. It is shown in the top-most panel of Figure 7.

The elicitation for the second QoI then proceeded using the tertile method for individual judgements, followed by a discussion and, again, using the probability method for the consensus judgement. The resulting judgements are shown on the right-hand side of Figure 4.

The joint distribution of the treatment effects on exacerbations and FEV1, X and Z, was then elicited using the copula method. The correlation was elicited through the concordance probability, i.e.
RIO's judgement of the probability that the true values of X and Z would both be on the same side of their elicited medians. The experts found the concordance probability difficult to judge. After the facilitator gave an alternative explanation in terms of the conditional probability that one variable was above its median given that the other was above its median, a concordance probability of 0.7 was tentatively agreed by the experts.

Figure 5: Piecewise-linear median model for the elicited medians (Panel A) and conditional distributions for the relative exacerbation rate reduction across the range of plausible effects on sputum eosinophil counts (Panel B). [Panel A: conditional median relative rate reduction in exacerbations [%] (X) against effect on sputum eosinophils (Y). Panel B: relative rate reduction in exacerbations [%] given effect on sputum eosinophil counts (X), by effect on sputum eosinophils (Y).]

[Scatter plot based on 10,000 Monte Carlo samples. Axes: relative exacerbation rate reduction given 65% sputum eosinophil reduction compared with placebo; treatment difference in pre-bronchodilator FEV1 [mL] compared with placebo.]
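To illustrate how an elicited concordance probability can be turned into a dependence parameter, the following minimal sketch assumes a Gaussian copula; the copula family is not stated in this passage, so that choice is an assumption. For a bivariate normal distribution, the concordance probability p and the correlation ρ satisfy p = 1/2 + arcsin(ρ)/π, so ρ = sin(π(p − 1/2)). Because concordance is unchanged by monotone transformations of the margins, the relationship can be checked by simulation on the normal scale. Function names, sample size and seed are hypothetical.

```python
import numpy as np


def gaussian_copula_corr(concordance):
    """Correlation of a Gaussian copula implied by a concordance probability.

    Uses p = 1/2 + arcsin(rho) / pi for a bivariate normal distribution.
    """
    return np.sin(np.pi * (concordance - 0.5))


def simulated_concordance(rho, n=200_000, seed=1):
    """Monte Carlo estimate of P(both variables on the same side of their medians)."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # The median of each standard normal margin is 0.
    same_side = (z[:, 0] > 0) == (z[:, 1] > 0)
    return same_side.mean()


rho = gaussian_copula_corr(0.7)     # concordance probability agreed by the experts
check = simulated_concordance(rho)  # should be close to 0.7

print("Implied Gaussian copula correlation:", round(float(rho), 3))
print("Simulated concordance probability:", round(float(check), 3))
```

To obtain joint samples on the original scales, the simulated normal variates would be converted to uniforms with the normal CDF and then passed through the quantile functions of the elicited marginal distributions.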
The experts were shown a graphic similar to Figure 6 for the case of a concordance probability of 0.7 and found it very helpful and in accord with their expectations. Alternative concordance probabilities were explored using the same graphical display. The experts judged the correlation implied by a concordance probability of 0.8 to be too tight, and because they felt that there was appreciable positive correlation, a concordance probability of 0.5 was not considered appropriate. The resulting joint distribution is shown in Figure 6.

We already described the basic aims of the newly introduced PoS framework at Novartis at a high level in Section 2. Its practical application involves the following four steps [Hampson et al., 2021]. First, a benchmark probability of approval for a project at the start of Phase 2 is estimated based on a small number of program characteristics by a logistic regression model trained on a database of drug development projects. Second, a Bayesian analysis is conducted, in which the prior for the efficacy effects is set based on the benchmark probability of efficacy success in both Phase 2 and 3. This prior is then used in combination with Phase 2 data to obtain a posterior distribution for drug efficacy. Phase 3 studies are then simulated using samples from the posterior in order to estimate the probability of the key efficacy endpoints meeting TPP criteria in the Phase 3 program. Benchmark information is also used to account for the risk of program failure due to an unexpected safety issue and of not obtaining regulatory approval despite a successful Phase 3 program. Third, a program risk assessment is done to capture other risks not already covered by the previous calculations. This assessment is then used to adjust the probability of a registration with a label meeting TPP criteria to obtain the PoS. The adjustment in this step was also determined using an elicitation process.
Finally, in exceptional circumstances a fourth step allows for an adjustment for factors not captured by the preceding three steps.

In this case study, the Bayesian analysis in the second step of the PoS approach could not directly inform the PoS of the Phase 3 program due to the differences in endpoints and population between Phase 2 and 3. Thus, the results of the Bayesian analysis for sputum eosinophil counts in Figure 2 were linked to the efficacy on asthma exacerbations in Phase 3 using an expert elicitation in the manner described in Section 3.3. In contrast, the effect of fevipiprant on FEV1 was elicited directly from the experts and the joint distribution of the efficacy of fevipiprant for both endpoints was then obtained as described in Section 3.4.

For pragmatic reasons the Novartis PoS approach foresees that only one or two key endpoints should be considered in the definition of success. For this reason, it was decided to ignore the other two key secondary endpoints (asthma control questionnaire and asthma related quality of life questionnaire) of these Phase 3 trials for the purposes of the PoS calculation.

The estimated benchmarks for the first indication of a respiratory orally administered small molecule without an FDA breakthrough designation were:

• a Phase 2 success probability of 24%,
• a Phase 3 success probability of 60% conditional on Phase 2 success, and
• an approval probability of 94% conditional on Phase 2 and 3 success.

The program risk assessment [Hampson et al., 2021] considered the majority of categories to fall into the lowest risk category with one question falling into the intermediate risk category.

When these numbers were combined with simulated Phase 3 outcomes based on the elicited quantities, a PoS of 4% was calculated. The main hurdles were FEV1 and the high TPP target for exacerbation reduction. If one only considered a TPP requiring a relative exacerbation reduction of 30% with no requirements for FEV1, the PoS became 41%.
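For orientation only, the three benchmark probabilities listed above can be chained to give the implied benchmark probability of passing all three stages from the start of Phase 2. This is a worked illustration of the chain rule, not the PoS calculation itself, which replaces the benchmark Phase 3 probability with simulations based on the elicited quantities and adds the risk adjustment.

```latex
\Pr(\text{all three stages passed})
  = \Pr(\text{Phase 2 success})
    \times \Pr(\text{Phase 3 success} \mid \text{Phase 2 success})
    \times \Pr(\text{approval} \mid \text{Phase 2 and 3 success})
  = 0.24 \times 0.60 \times 0.94 \approx 0.135
```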
The whole PoS process required approximately 2 months. After an initial review, we identified that an expert elicitation workshop would be needed. On 28 May 2019, we identified the facilitator for the workshop and compiled a list of candidate dates. In the meantime, the team worked to assemble an evidence dossier. By 12 June, we had arranged an elicitation workshop on 12 July after confirming the availability of five experts. By 1 July, the evidence dossier had been drafted by the biostatistics team and shared with the facilitator and recorder; it was finalised on 8 July after a review by internal experts, four days before the workshop. One lesson learned was that we should have shared the dossier with the experts earlier in order to allow them to provide feedback on its contents so that additional evidence could have been introduced up-front. On 12 July the workshop took place using version 4 of the SHELF methodology and on 20 July 2019 the final report of the elicitation meeting was issued. All recordings from the meeting were made using the templates provided as part of the SHELF documents package and participants were kept anonymous in these minutes by using the letters A to E for the experts, as well as Z for the facilitator.

The results of the Phase 3 trials, for which we conducted the expert elicitation, are shown in Figure 7. As can be seen, only one comparison within one of the two trials was associated with a confidence interval that excluded no effect, but this result was not considered statistically significant after an adjustment for multiplicity [Brightling et al., 2020]. The results of the Phase 3 trials are very informative in the sense that the 95% confidence intervals essentially exclude the TPP targets.

These results are consistent with the elicited prior information from the experts: the experts essentially excluded the possibility that the true effect of the studied fevipiprant doses on FEV1 meets the TPP target, while for the primary exacerbation endpoint, the experts judged that there was a reasonable possibility that the true effect was at or above the TPP target. On the basis of these Phase 3 results Novartis did not pursue a filing for an indication in asthma.

Figure 7: Implied distribution for true effect of fevipiprant 450 mg QD on exacerbations and FEV1 based on elicited expert judgements, and study results in the high blood eosinophil subgroup of the Phase 3 exacerbation trials. [Panels: density of expert elicited prior (top) and study results for 150 mg QD and 450 mg QD in NCT02563067, NCT02555683 and pooled (bottom), each with the TPP target marked. Horizontal axes: relative exacerbation rate reduction for fevipiprant compared with placebo; treatment difference in pre-bronchodilator FEV1 [mL] compared with placebo.]

The quality of decisions in the presence of uncertainty can be improved by taking the judgements of experts based on the available evidence into account. When stakes are high, as with major investment decisions by a pharmaceutical company, the necessary effort and cost of obtaining experts' judgements is negligible compared to the cost of a wrong decision. This is one of the reasons why the new Novartis PoS framework, which is applied for the decision to initiate pivotal trials for a project, recommends expert elicitation when substantial direct evidence about QoIs is not available. The SHELF extension method and the SHELF copula method address two common scenarios in this setting: when we extrapolate the evidence from surrogate endpoints to Phase 3 endpoints, and when the size of a drug's effect on one endpoint changes how large we judge its effect on other endpoints to be.

There are currently no published examples of how to apply these methods as part of the SHELF protocol in the pharmaceutical industry. Therefore, we felt it would be helpful to share an example illustrating the full extent of real-world complexities and the relevant practical considerations. This will hopefully help others who wish to use expert elicitation to inform clinical drug development or other types of high-stakes decisions.

We do not wish to overemphasise the outcomes from a single example. Nevertheless, the close alignment between the experts' group judgements and the trial outcomes, which were not known to the experts at the time of the elicitation workshop, supports the validity of expert elicitation in drug development. If a similar elicitation outcome had been available at the time of the decision to start the Phase 3 program for fevipiprant, it would have suggested a lower PoS than assigned at the time and may have led to re-evaluation of the assumptions regarding the secondary FEV1 endpoint.
However, this proof of concept for elicitation as part of a new PoS framework was performed 4 years after this decision and used information that only became available subsequently.

The project team noted that the evidence dossier and the discussions in the elicitation workshop were extremely helpful for assembling and understanding the existing evidence on the efficacy of the drug. It may sometimes be the case that teams are very well aware of the clinical trials conducted for their product, but have not systematically reviewed the indirect evidence that is available from other sources. After the elicitation workshop the experts said that they appreciated the structured and scientific process, that they found the methodology intuitive, and that they were pleasantly surprised at how fully non-statisticians could participate in the workshop.

While we describe a particular example of an elicitation workshop, we have now run several similar workshops at Novartis and some of the authors of this paper have several years of experience of doing so with other clients. On this basis, we offer a number of practical recommendations. It is important to start preparing the evidence dossier as early as possible so that experts and other stakeholders can give feedback prior to a workshop. This is also an opportunity to let senior leaders with strong positive opinions on projects provide the evidence they wish to be considered. Additionally, it can be difficult for experts to free up time for long workshops, and we have found that people find it harder to concentrate for long periods in virtual meetings than in in-person workshops. This has led us to investigate options for eliciting individual judgements prior to the main workshop. It is also important to clearly communicate how elicitation results will be used. In the context of the PoS of drug development programs, this meant making it clear that the resulting probability is not the sole determinant of funding for a project. We now routinely remind teams that investment decisions will also be based on other factors such as the costs of development, market opportunity and unmet medical need.

We thank Ana-Maria Tanase, Christian Hasenfratz and Hanns-Christian Tillmann for being experts for the asthma case study, as well as Kelvin Stott, Giovanni Della Cioppa and Karine Baudou for their support of the pilot phase of the Novartis PoS initiative.

References

Charlotte Baey, Ullrika Sahlin, Yann Clough, and Henrik G Smith. A model to account for data dependency when estimating floral cover in different land use types over a season. Environmental and ecological statistics, 24(4):505–527, 2017.

Jonathan L Bamber, Michael Oppenheimer, Robert E Kopp, Willy P Aspinall, and Roger M Cooke. Ice sheet contributions to future sea-level rise from structured expert judgment. Proceedings of the National Academy of Sciences, 116(23):11195–11200, 2019.

Eric D. Bateman, Alfredo G. Guerreros, Florian Brockhaus, Björn Holzhauer, Abhijit Pethe, Richard A. Kay, and Robert G. Townley. Fevipiprant, an oral prostaglandin DP2 receptor (CRTh2) antagonist, in allergic asthma uncontrolled on low-dose inhaled corticosteroids. European Respiratory Journal, 50(2):1700670, 8 2017. doi: 10.1183/13993003.00670-2017. URL https://doi.org/10.1183%2F13993003.00670-2017.

Tim Bedford and Roger M Cooke. Probability density decomposition for conditionally dependent random variables modeled by vines. Annals of Mathematics and Artificial intelligence, 32(1-4):245–268, 2001.

Eugene R Bleecker, J Mark FitzGerald, Pascal Chanez, Alberto Papi, Steven F Weinstein, Peter Barker, Stephanie Sproule, Geoffrey Gilmartin, Magnus Aurivillius, Viktoria Werkström, and Mitchell Goldman. Efficacy and safety of benralizumab for patients with severe asthma uncontrolled with high-dosage inhaled corticosteroids and long-acting β2-agonists (SIROCCO): a randomised, multicentre, placebo-controlled phase 3 trial. The Lancet, 388(10056):2115–2127, 10 2016. doi: 10.1016/s0140-6736(16)31324-1. URL https://doi.org/10.1016%2Fs0140-6736%2816%2931324-1.

Christopher E Brightling, Eugene R Bleecker, Veit J Erpenbeck, Sebastian Fucile, Pablo Altman, David Lawrence, Caterina Brindicci, and Barbara Knorr. Luster-1 and -2: Two randomized controlled trials of the prostaglandin D2 receptor 2 antagonist, fevipiprant, in asthma. Clinical Investigation, 9(2):55–63, 2019.

Christopher E Brightling, Mina Gaga, Hiromasa Inoue, Jing Li, Jorge Maspero, Sally Wenzel, Samopriyo Maitra, David Lawrence, Florian Brockhaus, Thomas Lehmann, Caterina Brindicci, Barbara Knorr, and Eugene R Bleecker. Effectiveness of fevipiprant in reducing exacerbations in patients with severe asthma (LUSTER-1 and LUSTER-2): two phase 3 randomised controlled trials. The Lancet Respiratory Medicine, 9 2020. doi: 10.1016/s2213-2600(20)30412-4. URL https://doi.org/10.1016%2Fs2213-2600%2820%2930412-4.

Mario Castro, Sameer Mathur, Frederick Hargreave, Louis-Philippe Boulet, Fang Xie, James Young, H. Jeffrey Wilkins, Timothy Henkel, and Parameswaran Nair. Reslizumab for poorly controlled, eosinophilic asthma. American Journal of Respiratory and Critical Care Medicine, 184(10):1125–1132, 11 2011. doi: 10.1164/rccm.201103-0396oc. URL https://doi.org/10.1164%2Frccm.201103-0396oc.

Mario Castro, James Zangrilli, Michael E Wechsler, Eric D Bateman, Guy G Brusselle, Philip Bardin, Kevin Murphy, Jorge F Maspero, Christopher O'Brien, and Stephanie Korn. Reslizumab for inadequately controlled asthma with elevated blood eosinophil counts: results from two multicentre, parallel, double-blind, randomised, placebo-controlled, phase 3 trials. The Lancet Respiratory Medicine, 3(5):355–366, 5 2015. doi: 10.1016/s2213-2600(15)00042-9. URL https://doi.org/10.1016%2Fs2213-2600%2815%2900042-9.

Mario Castro, Jonathan Corren, Ian D. Pavord, Jorge Maspero, Sally Wenzel, Klaus F. Rabe, William W. Busse, Linda Ford, Lawrence Sher, J. Mark FitzGerald, Constance Katelaris, Yuji Tohda, Bingzhi Zhang, Heribert Staudinger, Gianluca Pirozzi, Nikhil Amin, Marcella Ruddy, Bolanle Akinlade, Asif Khan, Jingdong Chao, Renata Martincova, Neil M.H. Graham, Jennifer D. Hamilton, Brian N. Swanson, Neil Stahl, George D. Yancopoulos, and Ariel Teper. Dupilumab efficacy and safety in moderate-to-severe uncontrolled asthma. New England Journal of Medicine, 378(26):2486–2496, 6 2018. doi: 10.1056/nejmoa1804092. URL https://doi.org/10.1056%2Fnejmoa1804092.

J Chlumský, I Striz, M Terl, and J Vondracek. Strategy aimed at reduction of sputum eosinophils decreases exacerbation rate in patients with asthma. Journal of International Medical Research, 34(2):129–139, 3 2006. doi: 10.1177/147323000603400202. URL https://doi.org/10.1177%2F147323000603400202.

Committee for Medicinal Products for Human Use. Guideline on the clinical investigation of medicinal products for the treatment of asthma, 2015. CHMP/EWP/2922/01 Rev.1.

Jonathan Corren, Jane R. Parnes, Liangwei Wang, May Mo, Stephanie L. Roseti, Janet M. Griffiths, and René van der Merwe. Tezepelumab in adults with uncontrolled asthma. New England Journal of Medicine, 377(10):936–946, 9 2017. doi: 10.1056/nejmoa1704064. URL https://doi.org/10.1056%2Fnejmoa1704064.

Nigel Dallow, Nicky Best, and Timothy H Montague. Better decision making in drug development through adoption of formal prior elicitation. Pharmaceutical Statistics, 17(4):301–316, 2018.

Alireza Daneshkhah and JE Oakley. Eliciting multivariate probability distributions. Rethinking risk measurement and reporting, 1:23, 2010.

Luis C Dias, Alec Morton, and John Quigley. Elicitation. Springer International Publishing. MR3700912. doi: https://doi.org/10.1007/978-3-319-65052-4, 1(2):3, 2018.

Fadlalla G Elfadaly and Paul H Garthwaite. Eliciting Dirichlet and Connor–Mosimann prior distributions for multinomial models. Test, 22(4):628–646, 2013.

Veit J. Erpenbeck, Todor A. Popov, David Miller, Steven F. Weinstein, Sheldon Spector, Baldur Magnusson, Wande Osuntokun, Paul Goldsmith, Markus Weiss, and Jutta Beier. The oral CRTh2 antagonist QAW039 (fevipiprant): A phase II study in uncontrolled allergic asthma. Pulmonary Pharmacology & Therapeutics, 39:54–63, 8 2016. doi: 10.1016/j.pupt.2016.06.005. URL https://doi.org/10.1016%2Fj.pupt.2016.06.005.

European Food Safety Authority. Guidance on expert knowledge elicitation in food and feed safety risk assessment. EFSA Journal, 12(6):3734, 2014.

J Mark FitzGerald, Eugene R Bleecker, Parameswaran Nair, Stephanie Korn, Ken Ohta, Marek Lommatzsch, Gary T Ferguson, William W Busse, Peter Barker, Stephanie Sproule, Geoffrey Gilmartin, Viktoria Werkström, Magnus Aurivillius, and Mitchell Goldman. Benralizumab, an anti-interleukin-5 receptor α monoclonal antibody, as add-on treatment for patients with severe, uncontrolled, eosinophilic asthma (CALIMA): a randomised, double-blind, placebo-controlled phase 3 trial. The Lancet, 388(10056):2128–2141, 10 2016. doi: 10.1016/s0140-6736(16)31322-8. URL https://doi.org/10.1016%2Fs0140-6736%2816%2931322-8.

Louise Fleming, Nicola Wilson, Nicolas Regamey, and Andrew Bush. Use of sputum eosinophil counts to guide management in children with severe asthma. Thorax, 67(3):193–198, 8 2011. doi: 10.1136/thx.2010.156836. URL https://doi.org/10.1136%2Fthx.2010.156836.

Paul H Garthwaite and Anthony O'Hagan. Quantifying expert opinion in the UK water industry: an experimental study. Journal of the Royal Statistical Society: Series D (The Statistician), 49(4):455–477, 2000.

Gail M. Gauvreau, Paul M. O'Byrne, Louis-Philippe Boulet, Ying Wang, Donald Cockcroft, Jeannette Bigler, J. Mark FitzGerald, Michael Boedigheimer, Beth E. Davis, Clapton Dias, Kevin S. Gorski, Lynn Smith, Edgar Bautista, Michael R. Comeau, Richard Leigh, and Jane R. Parnes. Effects of an anti-TSLP antibody on allergen-induced asthmatic responses. New England Journal of Medicine, 370(22):2102–2110, 5 2014. doi: 10.1056/nejmoa1402895. URL https://doi.org/10.1056%2Fnejmoa1402895.

Global Initiative for Asthma. Global strategy for asthma management and prevention, 2020. URL https://ginasthma.org/.

Sherif Gonem, Rachid Berair, Amisha Singapuri, Ruth Hartley, Marie F M Laurencin, Gerald Bacher, Björn Holzhauer, Michelle Bourne, Vijay Mistry, Ian D Pavord, Adel H Mansur, Andrew J Wardlaw, Salman H Siddiqui, Richard A Kay, and Christopher E Brightling. Fevipiprant, a prostaglandin D2 receptor 2 antagonist, in patients with persistent eosinophilic asthma: a single-centre, randomised, double-blind, parallel-group, placebo-controlled trial. The Lancet Respiratory Medicine, 4(9):699–707, 9 2016. doi: 10.1016/s2213-2600(16)30179-5. URL https://doi.org/10.1016%2Fs2213-2600%2816%2930179-5.

John Paul Gosling. SHELF: the Sheffield elicitation framework. In Elicitation, pages 61–93. Springer, 2018.

John Paul Gosling, Andy Hart, David C Mouat, Mirzet Sabirovic, Simon Scanlan, and Alick Simmons. Quantifying experts' uncertainty about the future cost of exotic diseases. Risk Analysis: An International Journal, 32(5):881–893, 2012.

Ruth H Green, Christopher E Brightling, Susan McKenna, Beverley Hargadon, Debbie Parker, Peter Bradding, Andrew J Wardlaw, and Ian D Pavord. Asthma exacerbations and sputum eosinophil counts: a randomised controlled trial. The Lancet, 360(9347):1715–1721, 11 2002. doi: 10.1016/s0140-6736(02)11679-5. URL https://doi.org/10.1016%2Fs0140-6736%2802%2911679-5.

Lisa V. Hampson, Björn Bornkamp, Björn Holzhauer, Joseph Kahn, Markus R. Lange, Wen-Lin Luo, Giovanni Della Cioppa, Kelvin Stott, and Steffen Ballerstedt. Improving the assessment of the probability of success in late stage drug development. arXiv e-prints, art. arXiv:2102.02752, February 2021.

Björn Holzhauer, Craig Wang, and Heinz Schmidli. Evidence synthesis from aggregate recurrent event data for clinical trial design and analysis. Statistics in Medicine, 37(6):867–882, 11 2017. doi: 10.1002/sim.7549. URL https://doi.org/10.1002%2Fsim.7549.

L. Jayaram. Determining asthma treatment by monitoring sputum cell counts: effect on exacerbations. European Respiratory Journal, 27(3):483–494, 3 2006. doi: 10.1183/09031936.06.00137704. URL https://doi.org/10.1183%2F09031936.06.00137704.

Nelson Kinnersley and Simon Day. Structured approach to the elicitation of expert beliefs for a Bayesian-designed clinical trial: a case study. Pharmaceutical statistics, 12(2):104–113, 2013.

Michel Laviolette, David L. Gossage, Gail Gauvreau, Richard Leigh, Ron Olivenstein, Rohit Katial, William W. Busse, Sally Wenzel, Yanping Wu, Vivekananda Datta, Roland Kolbeck, and Nestor A. Molfino. Effects of benralizumab on airway eosinophils in asthmatic patients with sputum eosinophilia. Journal of Allergy and Clinical Immunology, 132(5):1086–1096.e5, 1 2013. doi: 10.1016/j.jaci.2013.05.020. URL https://doi.org/10.1016%2Fj.jaci.2013.05.020.

Parameswaran Nair, Marcia M.M. Pizzichini, Melanie Kjarsgaard, Mark D. Inman, Ann Efthimiadis, Emilio Pizzichini, Frederick E. Hargreave, and Paul M. O'Byrne. Mepolizumab for prednisone-dependent asthma with sputum eosinophilia. New England Journal of Medicine, 360(10):985–993, 3 2009. doi: 10.1056/nejmoa0805435. URL https://doi.org/10.1056%2Fnejmoa0805435.

Beat Neuenschwander, Gorana Capkun-Niggli, Michael Branson, and David J Spiegelhalter. Summarizing historical information on controls in clinical trials. Clinical Trials, 7(1):5–18, 2010. doi: 10.1177/1740774509356002. URL https://doi.org/10.1177/1740774509356002. PMID: 20156954.

Lisa Norrington, John Quigley, Ashley Russell, and Robert Van der Meer. Modelling the reliability of search and rescue operations with Bayesian Belief Networks. Reliability Engineering & System Safety, 93(7):940–949, 2008.

Jeremy Oakley. SHELF - Tools to Support the Sheffield Elicitation Framework, 2020. URL https://CRAN.R-project.org/package=SHELF. R package version 1.7.0.

Jeremy E. Oakley and Anthony O'Hagan. SHELF: the Sheffield Elicitation Framework (version 4). School of Mathematics and Statistics, University of Sheffield, UK, 2019. Available at http://tonyohagan.co.uk/shelf.

Anthony O'Hagan. Probabilistic judgements for expert elicitation (e-learning course), 2018.

Anthony O'Hagan. Expert knowledge elicitation: subjective but scientific. The American Statistician, 73(sup1):69–81, 2019a.

Anthony O'Hagan. SHELF: the Sheffield Elicitation Framework, 2019b. (accessed on 1 January 2021).

Anthony O'Hagan, Caitlin E Buck, Alireza Daneshkhah, J Richard Eiser, Paul H Garthwaite, David J Jenkinson, Jeremy E Oakley, and Tim Rakow. Uncertain judgements: eliciting experts' probabilities. John Wiley & Sons, Chichester, 2006.

Hector G. Ortega, Mark C. Liu, Ian D. Pavord, Guy G. Brusselle, J. Mark FitzGerald, Alfredo Chetta, Marc Humbert, Lynn E. Katz, Oliver N. Keene, Steven W. Yancey, and Pascal Chanez. Mepolizumab treatment in patients with severe eosinophilic asthma. New England Journal of Medicine, 371(13):1198–1207, 9 2014. doi: 10.1056/nejmoa1403290. URL https://doi.org/10.1056%2Fnejmoa1403290.

Reynold A Panettieri, Ulf Sjöbring, AnnaMaria Péterffy, Peter Wessman, Karin Bowen, Edward Piper, Gene Colice, and Christopher E Brightling. Tralokinumab for severe, uncontrolled asthma (STRATOS 1 and STRATOS 2): two randomised, double-blind, placebo-controlled, phase 3 clinical trials. The Lancet Respiratory Medicine, 6(7):511–525, 7 2018. doi: 10.1016/s2213-2600(18)30184-x. URL https://doi.org/10.1016%2Fs2213-2600%2818%2930184-x.

Ian D Pavord, Stephanie Korn, Peter Howarth, Eugene R Bleecker, Roland Buhl, Oliver N Keene, Hector Ortega, and Pascal Chanez. Mepolizumab for severe eosinophilic asthma (DREAM): a multicentre, double-blind, placebo-controlled trial. The Lancet, 380(9842):651–659, 8 2012. doi: 10.1016/s0140-6736(12)60988-x. URL https://doi.org/10.1016%2Fs0140-6736%2812%2960988-x.

H L Petsky, C J Cates, T J Lasserson, A M Li, C Turner, J A Kynaston, and A B Chang. A systematic review and meta-analysis: tailoring asthma treatment on eosinophilic markers (exhaled nitric oxide or sputum eosinophils). Thorax, 67(3):199–208, 10 2010. doi: 10.1136/thx.2010.135574. URL https://doi.org/10.1136%2Fthx.2010.135574.

S Ren, JE Oakley, and JW Stevens. Evidence synthesis for health technology assessment with limited studies. Value in Health, 20(9):A770, 2017.

Richard J Russell, Latifa Chachi, J Mark FitzGerald, Vibeke Backer, Ronald Olivenstein, Ingrid L Titlestad, Charlotte Suppli Ulrik, Timothy Harrison, Dave Singh, Rekha Chaudhuri, Brian Leaker, Lorcan McGarvey, Salman Siddiqui, Millie Wang, Martin Braddock, Lars H Nordenmark, David Cohen, Himanshu Parikh, Gene Colice, Christopher E Brightling, Michel Laviolette, Tina Skjold, Læge Carl Nielsen, and Peter Howarth. Effect of tralokinumab, an interleukin-13 neutralising monoclonal antibody, on eosinophilic airway inflammation in uncontrolled moderate-to-severe asthma (MESOS): a multicentre, double-blind, randomised, placebo-controlled phase 2 trial. The Lancet Respiratory Medicine, 6(7):499–510, 7 2018. doi: 10.1016/s2213-2600(18)30201-7. URL https://doi.org/10.1016%2Fs2213-2600%2818%2930201-7.

JH Sigurdsson, LA Walls, and JL Quigley. Bayesian belief nets for managing expert judgement and modelling reliability. Quality and Reliability Engineering International, 17(3):181–190, 2001.

Marta O Soares and Laura Bojke. Expert elicitation to inform health technology assessment. In Elicitation, pages 479–494. Springer, 2018.

Pravin K Trivedi and David M Zimmer. Copula Modeling: An Introduction for Practitioners. now Publishers Inc., Hanover, 2007.

Phuong N Truong, Gerard BM Heuvelink, and John Paul Gosling. Web-based tool for expert elicitation of the variogram. Computers & geosciences, 51:390–399, 2013.

Will Usher and Neil Strachan. An expert elicitation of climate, energy and economic uncertainties. Energy policy, 61:811–821, 2013.

Sally Wenzel, Linda Ford, David Pearlman, Sheldon Spector, Lawrence Sher, Franck Skobieranda, Lin Wang, Stephane Kirkesseli, Ross Rocklin, Brian Bock, Jennifer Hamilton, Jeffrey E. Ming, Allen Radin, Neil Stahl, George D. Yancopoulos, Neil Graham, and Gianluca Pirozzi. Dupilumab in persistent asthma with elevated eosinophil levels. New England Journal of Medicine, 368(26):2455–2466, 6 2013. doi: 10.1056/nejmoa1304048. URL https://doi.org/10.1056%2Fnejmoa1304048.

Christoph Werner, Anca M Hanea, and Oswaldo Morales-Nápoles. Eliciting multivariate uncertainty from experts: Considerations and approaches along the expert judgement process. In Elicitation, pages 171–210. Springer, 2018.

Rita Esther Zapata-Vázquez, Anthony O'Hagan, and Leonardo Soares Bastos. Eliciting expert judgements about a set of proportions.