What is the probability that a vaccinated person is shielded from Covid-19? A Bayesian MCMC based reanalysis of published data with emphasis on what should be reported as 'efficacy'

Abstract

Based on the information communicated in press releases, and finally published towards the end of 2020 by Pfizer, Moderna and AstraZeneca, we have built up a simple Bayesian model, in which the main quantity of interest plays the role of vaccine efficacy ('$\epsilon$'). The resulting Bayesian network is processed by a Markov Chain Monte Carlo (MCMC), implemented in JAGS interfaced to R via rjags. As outcome, we get several probability density functions (pdf's) of $\epsilon$, each conditioned on the data provided by the three pharma companies. The result is rather stable against large variations of the number of people participating in the trials and it is 'somehow' in good agreement with the results provided by the companies, in the sense that their values correspond to the most probable value ('mode') of the pdf's resulting from MCMC, thus reassuring us about the validity of our simple model. However we maintain that the number to be reported as 'vaccine efficacy' should be the mean of the distribution, rather than the mode, as was already very clear to Laplace about 250 years ago (his 'rule of succession' follows from the simplest problem of the kind). This is particularly important in the case in which the number of successes equals the number of trials, as happens with the efficacy against 'severe forms' of infection, claimed by Moderna to be 100%. The implication of the various uncertainties on the predicted number of vaccinated infectees is also shown, using both MCMC and approximated formulae.


Giulio D'Agostini and Alfredo Esposito

Università "La Sapienza" and INFN, Roma, Italia
Retired

"... the most important questions of life ... are indeed for the most part only problems in probability"

"we find that an event having occurred successively any number of times, the probability that it will happen again the next time is equal to this number increased by unity divided by the same number, increased by two units"

(Laplace)

Introduction

Our perspectives about living with Covid-19 changed dramatically in the last months of 2020 with the results from the vaccine trials by Pfizer and Moderna and, a bit later, by AstraZeneca. The first one claimed a 90% efficacy [1] (then updated to 95% in a further press release [2] and in a much more detailed paper [3]) and the second one 94.5% [4] (later updated to 94.1% [5, 6]), while AstraZeneca press released and then published two values (90.0% and 62.1%) depending on two (unplanned [7, 8]) different experimental settings (footnote 1).

In an unpublished first paper (footnote 2) based on the first press releases by Pfizer and Moderna, we remarked that, since the announcements did not mention any uncertainty, we understood that the initial Pfizer number was the result of a rounding, with uncertainty of the order of the percent. But then, we continued, we were highly surprised by the Moderna announcement, providing the tenths of the percent, as if it were much more precise. We had indeed the impression that the 'point five' was taken very seriously, not only by media speakers, who put the emphasis on the third digit, but also by experts from whom we would have expected a phrasing implying some uncertainty in the result (see e.g. Ref. [10]). The same remarks apply to the later AstraZeneca announcement. In fact, a fast exercise shows that, in order to have an uncertainty of the order of a few tenths of percent, the number of vaccine-treated individuals that got Covid-19 had to be at least of the order of several hundreds. But this was not the case. In fact, the actual numbers were much smaller, as we learned from the Moderna first press release [4]: "This first interim analysis was based on 95 cases, of which 90 cases of COVID-19 were observed in the placebo group versus 5 cases observed in the mRNA-1273 group, resulting in a point estimate of vaccine efficacy of 94.5% (p < 0.…)".

Now, it is a matter of fact that if a physicist reads for an experimental result a number like '5', she tends to associate to it, as a rule of thumb, an uncertainty of the order of its square root, that is ≈ 2.2. Applied to the Moderna claims, this implies an inefficacy of about (5.… ± …)% [...] about (94.… ± …)% [...] 'p < 0.…' [...] p-values and other frequentist prescriptions (see Ref. [12] and references therein).

In our first paper [9] we tried then to understand whether it was possible to get an idea of the possible values of efficacy consistent with the data, each one associated with its degree of belief on the basis of the few data available in those days. In other words, our purpose was and is to arrive at a probability density function (pdf), although not obtained in closed form, of the quantity of interest.

[Footnote 1: In one of the experimental settings, indicated here as 'low dose'-'standard dose', the first vaccine dose was half of the planned one ('standard dose'-'standard dose' setting).]

[Footnote 2: The paper is available on the web site of one of the authors, together with the slides of a related webinar and the code to reproduce the results [9].]

                             Date      n_V - n_P         n_VI   n_PI   n_VIs   n_PIs
    Moderna-1 [4]            Nov 16    14134 - 14073        5     90       –       –
    Moderna-2 [5, 6]         Nov 30    (same)              11    185       0      30
    Pfizer [2, 3]            Nov 18    18198 - 18325        8    162       1       9
    AstraZeneca LD-SD [7, 8] Nov 23    1367 - 1374          3     30       –       –
    AstraZeneca SD-SD [7, 8] Nov 23    4440 - 4455         27     71       –       –

Table 1: Bare data concerning the number of infected in the vaccine group ($n_{VI}$) and in the placebo group ($n_{PI}$). 'Moderna-1' is just an interim result, based on the same sample of 'Moderna-2', that we had used in Ref. [9]. In the case of Moderna and Pfizer comprehensive results, also the numbers for the occurrence of 'severe forms of infection' were reported ($n_{VIs}$ and $n_{PIs}$, respectively). In the case of AstraZeneca, LD-SD stands for 'low dose followed by standard dose', SD-SD for 'two consecutive standard doses'.

                             efficacy value    95% 'uncertainty interval'
    Moderna-1 [4]            0.945             –
    Moderna-2 [6]            0.941             [0.…, 0.…] (confidence interval)
    Pfizer [3]               0.950             [0.…, 0.…] (credible interval)
    AstraZeneca LDSD [7]     0.900             [0.…, 0.…] (confidence interval)
    AstraZeneca SDSD [7]     0.621             [0.…, 0.…] (confidence interval)

Table 2: Published vaccine efficacy (value and 95% 'uncertainty interval').

In the present paper we not only extend our analysis to the published data [3, 6, 8] (see Tab. 1), but can also compare our results with the published ones (see Tab. 2), which also include an indication of the uncertainty to be associated with them. What makes us confident about the validity of our simple model is that the press released and finally published results concerning 'efficacy' (see Tab. 2) are in excellent agreement with the mode of the distribution we get analyzing our model through a Markov Chain Monte Carlo (MCMC). This is not a surprise to us, indeed. In fact we are aware of statistical methods which tend to produce as 'estimate' the most probable value of the quantity of interest, that would be inferred starting from a flat prior [13]. The fact that different kinds of 'uncertainty intervals' are provided will be discussed at the due point. We only anticipate here that they have in this case equivalent meaning.

The paper is organized as follows. In Sec. 2 we describe and show how to implement in JAGS [14] the causal model connecting in a probabilistic way the quantities of interest, among which the primary role is played by the 'efficacy' $\epsilon$. We also give, in footnote 4, some indications on how to proceed in order to get exact results for $f(\epsilon)$, although they can only be obtained numerically. The MCMC results are shown and discussed in Sec. 3. Then the question asked in the title is tackled, with a didactic touch and including some historical remarks, in Sec. 4. The observation that the resulting pdf's of $\epsilon$ can be approximated rather well by Beta distributions (Sec. 5) leads us to discuss in further detail the role of the priors, initially chosen simply uniform. Then Sec. 6 is devoted to the related question of predicting the number of vaccinated people that shall result infected, taking into account several sources of uncertainty. Finally, in Sec. 7 we extend our analysis to the level of protection given by the vaccines against the disease severity, in which the outcome of a simple application of probability theory is at odds with simplistic, extraordinary claims. In Sec. 8 we sum up the analysis strategy and the outcome of the paper. Then some conclusions and final remarks follow.
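The square-root rule of thumb recalled in this Introduction can be tried out in a few lines. The following is only a rough Python sketch (the paper's own analysis uses JAGS and R), under assumptions that are ours: equal group sizes, and only the fluctuation of the small vaccine-group count is taken into account. The function names are hypothetical.

```python
import math

def rough_inefficacy(n_vaccine_cases, n_placebo_cases):
    """Return (inefficacy, rough uncertainty) from the two case counts.

    Assumptions of this sketch: equal group sizes, and only the
    square-root fluctuation of the (small) vaccine-group count matters.
    """
    ineff = n_vaccine_cases / n_placebo_cases
    sigma = math.sqrt(n_vaccine_cases) / n_placebo_cases
    return ineff, sigma

def cases_needed(ineff, target_sigma):
    """Vaccine-group cases n such that ineff / sqrt(n) equals target_sigma."""
    return (ineff / target_sigma) ** 2

# Moderna first interim analysis: 5 cases (vaccine) versus 90 (placebo)
ineff, sigma = rough_inefficacy(5, 90)
print(f"inefficacy ~ {100 * ineff:.1f}% +- {100 * sigma:.1f}%")
print(f"cases needed for +-0.3%: {cases_needed(ineff, 0.003):.0f}")
```

With these simplifications, an uncertainty of a few tenths of a percent requires a number of infected vaccinees of the order of several hundreds, which is the point made in the text.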

Dealing with problems of this kind, we have learned (see e.g. [16]) the importance of building up a graphical representation of the causal model relating the quantities of interest, some of them 'observed' and others 'unobserved', among the latter the quantities we wish to infer. Also in this case, despite some initial skepticism about the possibility of getting some meaningful results, once we had built up the model, very basic indeed, it was clear that the main outcome concerning the vaccine efficacy did not depend on the many aspects of the trials. Our initial doubts were in fact related to the several details concerning the people involved in the test campaign, but they finally turned out to be much less critical than we had at first thought.

The causal model used in this analysis is implemented in the Bayesian network of Fig. 1. The top nodes $n_V$ and $n_P$ stand for the number of individuals in the vaccine and placebo (i.e. control) groups, respectively, as the subscripts indicate, while the bottom ones ($n_{VI}$ and $n_{PI}$) are the number of individuals of the two groups resulting infected during the trial. These are the observed nodes of our model and their values are summarized in Tab. 1.

[Diagram with nodes $p_A$, $n_V$, $n_P$, $n_{VA}$, $n_{PA}$, $n_{VI}$, $n_{PI}$ and $1-\epsilon$]

Figure 1: Simplified Bayesian network of the vaccine vs placebo experiment (see text).

Then, there is the question of how to relate the numbers of infectees to the numbers of the participants in the trial. This depends in fact on several variables, like the prevalence of the virus in the population(s) of the involved people, their social behavior, personal life-style, age, health state and so on. And, hopefully, it depends on the fact that a person has been vaccinated or not. Lacking detailed information, we simplify the model introducing an assault probability $p_A$, that is a catch-all term embedding the many real life variables, apart from being vaccinated or not. Nodes $n_{VA}$ and $n_{PA}$ in the network of Fig. 1 represent then the number of 'assaulted individuals' in each group, and they are modeled according to binomial distributions, that is

    $$ n_{VA} \sim \mathrm{Binom}(n_V, p_A) \tag{1} $$
    $$ n_{PA} \sim \mathrm{Binom}(n_P, p_A) , \tag{2} $$

represented in the graphical model by solid arrows.

The 'assaulted individuals' of the control group are then assumed to be all infected, and hence the deterministic link with dashed arrow relating node $n_{PA}$ to node $n_{PI}$ follows (indeed the two numbers are exactly the same in our model, and we make this distinction only for graphical symmetry with respect to the vaccine group). Instead, the 'assaulted individuals' of the other group are 'shielded' by the vaccine with probability $\epsilon$, that we therefore identify with efficacy, although we shall come back at the due point to what should be reported as 'efficacy'. The probability of becoming infected if assaulted is therefore equal to $1-\epsilon$, so that node $n_{VI}$ is related to node $n_{VA}$ by

    $$ n_{VI} \sim \mathrm{Binom}(n_{VA}, 1-\epsilon) . \tag{3} $$

At this point all the rest is a matter of calculations, that we do (footnote 4) by MCMC techniques with the help of the program JAGS [14] interfaced with R [17] via rjags [18] (footnote 5).

    model {
      nP.I ~ dbin(pA, nP)       # 1.
      nV.A ~ dbin(pA, nV)       # 2.
      pA   ~ dbeta(1, 1)        # 3.
      nV.I ~ dbin(ffe, nV.A)    # 4.
      ffe  ~ dbeta(1, 1)        # 5.
      eff  <- 1 - ffe           # 6.
    }

We easily recognize in lines 1. and 2. of the R code the above Eqs. (1) and (2), while line 4. stands for Eq. (3). Line 6. is simply the transformation of '$1-\epsilon$' ('ffe' in the code) to $\epsilon$, the quantity we want to trace in the 'chain'. Finally lines 3. and 5. describe the priors of the 'unobserved nodes' that have no 'parents', in this case $p_A$ and $1-\epsilon$. We use for both a uniform prior (footnote 6), modeled by a Beta distribution (see Sec. 4.1 for details) with parameters {1, 1}. Then we have to provide the data, in our case $n_V$, $n_P$, $n_{VI}$ and $n_{PI}$. The program samples the space of possibilities and returns lists of numbers (a 'chain') for each 'monitored variable', which can then be analyzed 'statistically'. For example the frequency of occurrence of the values in each list is expected to be proportional to the probability of those values of the variable (Bernoulli's theorem). Similarly we can evaluate correlations among variables.

[Footnote 3: 'All 30 cases occurred in the placebo group and none in the mRNA-1273 vaccinated group.' '...and vaccine efficacy against severe COVID-19 was 100%' (Moderna press release [5], based on no severe cases out of the 11 infectees in the vaccine group). This result has been reported as 100% efficacy with (uncritical!) great emphasis also in the media [15] – a reminder of C. Sagan's quote that "extraordinary claims require extraordinary evidence" is here in order.]

[Footnote 4: Indeed one could try to get an exact solution for the pdf of $\epsilon$. The steps needed are: write down the joint pdf of all the variables in the network; condition on the certain variables; marginalize over all the uncertain variables besides $\epsilon$. Referring to Ref. [16] for details, here is the structure of the unnormalized pdf obtained starting from uniform priors over $\epsilon$ and $p_A$:

    $$ f(\epsilon \mid n_V, n_P, n_{VI}, n_{PI}) \propto \sum_{n_{VA}=n_{VI}}^{n_V} \int_0^1 \! dp_A \left[ \frac{n_{VA}!}{(n_{VA}-n_{VI})!} \cdot (1-\epsilon)^{n_{VI}} \cdot \epsilon^{\,n_{VA}-n_{VI}} \right] \cdot \left[ \frac{1}{n_{VA}!\,(n_V-n_{VA})!} \cdot p_A^{\,n_{VA}} \cdot (1-p_A)^{n_V-n_{VA}} \right] \cdot \left[ p_A^{\,n_{PA}} \cdot (1-p_A)^{n_P-n_{PA}} \right] , $$

where the three terms within square brackets are the three binomial distributions entering the model, stripped of all irrelevant constant factors. Simplifying and reorganizing the various terms we get

    $$ f(\epsilon \mid n_V, n_P, n_{VI}, n_{PI}) \propto (1-\epsilon)^{n_{VI}} \cdot \sum_{n_{VA}=n_{VI}}^{n_V} \frac{\epsilon^{\,n_{VA}-n_{VI}}}{(n_{VA}-n_{VI})!\,(n_V-n_{VA})!} \cdot \int_0^1 p_A^{\,n_{VA}+n_{PA}} \cdot (1-p_A)^{n_V-n_{VA}+n_P-n_{PA}} \, dp_A . $$

We recognize that the integral $\int_0^1 x^{r-1} \cdot (1-x)^{s-1}\, dx$, in terms of a generic variable $x$, defines the special function beta $B(r, s)$, thus obtaining

    $$ f(\epsilon \mid \ldots) \propto (1-\epsilon)^{n_{VI}} \cdot \sum_{n_{VA}=n_{VI}}^{n_V} \frac{\epsilon^{\,n_{VA}-n_{VI}}}{(n_{VA}-n_{VI})!\,(n_V-n_{VA})!} \cdot B(n_{VA}+n_{PA}+1,\; n_V-n_{VA}+n_P-n_{PA}+1) . $$

Then the integral over $\epsilon$ follows, in order to get the normalization factor. Finally, all moments of interest can be evaluated. All this can be done numerically. However, we proceed to MCMC, since its use is much simpler and also for the flexibility it offers (for example in the case we need to extend the model, as we shall do in Secs. 6 and 7).]

[Footnote 5: Those who have no experience with JAGS can find in Ref. [16] several ready-to-run R scripts.]

[Footnote 6: We cannot go here into the details of this choice that we consider quite reasonable, given the information provided by the data, and refer for the details to Ref. [16] and references therein. The fact that, as we shall see in the next section, the modes of the distributions of $\epsilon$ that result from our analysis practically coincide with the efficacy values reported by the three companies means that they have also used 'flat priors', or frequentist methods which implicitly entail a flat prior [13]. We shall see in Sec. 5.1 how 'informative priors' (e.g. by experts) can be taken into account in a second step, without the need of repeating the analysis for each choice of priors.]

We run the model against the published data and (for Moderna) also those from the first press release, all summarized in Tab. 1. The MCMC results concerning the 'efficacy parameter' $\epsilon$ are summarized in Tab. 3 (for the reader's convenience the results of Tab. 2 are repeated below it).

    mean ± stand. unc.     centr. 95% cred. int.     P(ε ≥ …)
    0.… ± 0.028            [0.…, 0.…]                …
    0.… ± 0.019            [0.…, 0.…]                …
    0.… ± 0.075            [0.…, 0.…]                …
    0.… ± 0.090            [0.…, 0.…]                …

                             efficacy value    95% 'uncertainty interval'
    Moderna-1 [4]            0.945             –
    Moderna-2 [6]            0.941             [0.…, 0.…] (confidence interval)
    Pfizer [3]               0.950             [0.…, 0.…] (credible interval)
    AstraZeneca LDSD [7]     0.900             [0.…, 0.…] (confidence interval)
    AstraZeneca SDSD [7]     0.621             [0.…, 0.…] (confidence interval)

Table 3: Top table: MCMC results for the model parameter $\epsilon$ (see text). Bottom table: same as Tab. 2 for easier comparison with the MCMC results.

[Plot of $f(\epsilon)$ versus $\epsilon$ in the range 0.4 to 1.0; curves: Moderna_1, Moderna_2, Pfizer, AZ-LDSD, AZ-SDSD]

Figure 2: MCMC results for the vaccine efficacies. The vertical lines indicate the results provided by the pharma companies, in practically perfect agreement with the mode of the MCMC estimated probability distributions.
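As a cross-check of the MCMC approach, the 'exact' unnormalized pdf outlined in footnote 4 can also be evaluated numerically on a grid. What follows is a minimal Python sketch, not the authors' R/JAGS code: the function name, the grid size and the cut on negligible terms of the sum are our own choices, and $n_{PA}$ is set equal to $n_{PI}$ as in the model.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def efficacy_posterior(n_v, n_p, n_vi, n_pi, grid_size=1000):
    """Normalized f(eps | data) on a grid, from the sum over n_VA (flat priors)."""
    # log-coefficient of eps**k, with k = n_VA - n_VI
    log_w = []
    for k in range(n_v - n_vi + 1):
        n_va = n_vi + k
        log_w.append(-math.lgamma(k + 1)
                     - math.lgamma(n_v - n_va + 1)
                     + log_beta(n_va + n_pi + 1, n_v - n_va + n_p - n_pi + 1))
    top = max(log_w)
    # since eps <= 1, terms far above the peak of the coefficients are negligible
    k_hi = max(k for k, lw in enumerate(log_w) if lw >= top - 50.0)
    log_w = [lw - top for lw in log_w[:k_hi + 1]]

    eps = [(i + 0.5) / grid_size for i in range(grid_size)]
    log_f = []
    for e in eps:
        log_e = math.log(e)
        terms = [lw + k * log_e for k, lw in enumerate(log_w)]
        m = max(terms)
        log_sum = m + math.log(sum(math.exp(t - m) for t in terms))
        log_f.append(n_vi * math.log(1.0 - e) + log_sum)
    m = max(log_f)
    dens = [math.exp(v - m) for v in log_f]
    norm = sum(dens) / grid_size
    return eps, [d / norm for d in dens]

# 'Moderna-2' row of Tab. 1
eps, dens = efficacy_posterior(n_v=14134, n_p=14073, n_vi=11, n_pi=185)
step = eps[1] - eps[0]
mean = sum(e * d for e, d in zip(eps, dens)) * step
var = sum((e - mean) ** 2 * d for e, d in zip(eps, dens)) * step
std = math.sqrt(var)
mode = eps[dens.index(max(dens))]
print(f"mode = {mode:.3f}, mean = {mean:.3f}, std = {std:.3f}")
```

If the model is as described, the mode should land close to the published efficacy value, with the mean slightly below it, in line with what the text reports for the MCMC chains.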

The MCMC based pdf's of $\epsilon$ are plotted in Fig. 2 with smooth curves showing the profile of the histograms of the $\epsilon$ values in the chains. For comparison, the vertical dashed lines show the results of the pharma companies ('efficacy value' in Tab. 2). As we can see, they correspond practically exactly to the modal values of the distributions. This makes us quite confident about the validity of our simple model for this quantitative analysis, although we maintain that the single number for the efficacy to be provided is not the mode, but rather the mean of the distribution, as we shall argue in Sec. 4. However, some remarks are in order already at this point. In fact, although there is no doubt that the most complete description of a probabilistic inference is given by the pdf of the quantity of interest, it is also well understood that it is often convenient to summarize the distribution with just a few numbers.

Usually, when inferring physical quantities, the preference goes to the mean and the standard deviation (the latter being related to the concept of standard uncertainty [11]) because of rather general probability theory theorems which make their use convenient for further evaluations ('propagations'). Other ways to summarize a probability distribution with just a couple of numbers are intervals which contain the uncertain value of the variable of interest at a given probability level (credible interval). We report then in Tab. 3 the 95% central credible interval (footnote 7) evaluated from the MCMC chains as well as the 90% 'right side credible interval'. Other useful summaries, depending on the problem of interest, can be the most probable value of the distribution (mode) and the median, i.e. the value that divides the possible values into two equally probable intervals. As we have stated above, the modes of the MCMC based pdf's coincide with the values reported as 'efficacy value' in Tab. 2, which contains also what we have generically indicated as 95% 'uncertainty interval', in the form of credible interval for Pfizer and confidence interval for the other two companies (footnote 8).

The MCMC also provides results for the other 'unobserved' nodes of the causal model, in our case $p_A$ and $n_{VA}$. We refrain from quoting results on the 'assault probability', because they could easily be misunderstood, as they strongly depend, contrary to $\epsilon$, on the values of $n_V$ and $n_P$, $p_A$ being a catch-all quantity embedding several real life variables, including the virus prevalence. We have however checked that our main results on $\epsilon$ are stable against (simultaneous) variations of $n_V$ and $n_P$ by orders of magnitude (thus implying similarly large variations of $p_A$).

We give, instead, the results concerning $n_{VA}$, that we expect to be around $n_{PI}$. We get, in fact, respectively for Moderna-1, Moderna-2, Pfizer, AstraZeneca (LDSD) and AstraZeneca (SDSD) the following values: 89 ± 13, 185 ± 19, 160 ± 18, 29 ± … and … ± 12 (note that the standard uncertainty is not simply the square root of $n_{VI}$, as a rule of thumb would suggest).

It is now time to come to the question asked in the title. We have already used the noun efficacy, associated to the uncertain variable $\epsilon$ of our model of Fig. 1. Then, analyzing the published data, we have got by MCMC several pdf's of $\epsilon$, that is $f(\epsilon \mid \text{Moderna-1})$, $f(\epsilon \mid \text{Moderna-2})$, and so on (see Fig. 2). Hereafter, since what we are going to say is rather general, we shall indicate the generic pdf by $f(\epsilon \mid \text{data}, H)$, where $H$ stands for the set of hypotheses underlying our inference and not specified in detail (footnote 9).

[Footnote 7: The meaning of such an interval is that, conditioned on the data used and on the model assumptions, we consider $P(\epsilon < \epsilon_{low}) = P(\epsilon > \epsilon_{high}) = 2.5\%$, where $\epsilon_{low}$ and $\epsilon_{high}$ are the boundaries of the interval.]

[Footnote 8: It is important to understand that, strictly speaking, a 95% frequentistic confidence interval does not provide the interval in which the authors are 95% confident that the 'true value' of interest lies (see Refs. [12, 13] and references therein), although it is 'often the case' for 'routine measurements' [13].]

[Footnote 9: In general we are used to indicating by $I$ the background state of information [16], but we reserve here the symbol $I$ for 'infected'.]

Let us now focus on the probability that an assaulted individual gets infected. Indicating by $A$ the condition 'the individual is assaulted', by $V$ the condition 'vaccinated' and by $I$ the event 'the individual gets infected' (and therefore by $\overline{A}$, $\overline{V}$ and $\overline{I}$ their logical negations), we get, rather trivially,

    $$ P(I \mid \overline{A}) = 0 , \tag{4} $$

while, in the case of assault, the probability of infection depends on whether the individual has been vaccinated or not. In the case of placebo, following our model, we simply get

    $$ P(I \mid A, \overline{V}) = 1 . \tag{5} $$

Instead, in case the individual has been vaccinated, the probability of infection will depend on $\epsilon$, that is, for the special cases of perfect shielding and no shielding (i.e. no better than the placebo),

    $$ P(I \mid A, V, \epsilon = 1) = 0 \tag{6} $$
    $$ P(I \mid A, V, \epsilon = 0) = 1 . \tag{7} $$

In general, if we were certain about the precise value of $\epsilon$, the probability of getting infected or not would be related in a simple way to $\epsilon$:

    $$ P(I \mid A, V, \epsilon) = 1 - \epsilon \tag{8} $$
    $$ P(\overline{I} \mid A, V, \epsilon) = \epsilon . \tag{9} $$

The above equations, and in particular Eq. (9), express in mathematical terms the meaning we associate to efficacy, in terms of the model parameter $\epsilon$: the probability that a vaccinated person gets shielded from a virus (or from any other agent). But the value of $\epsilon$ cannot be known precisely. It is, instead, affected by an uncertainty, as (practically) always happens for results of measurements [11] (and indeed also the pharma companies accompany their results with uncertainties, see Tab. 2). In a probabilistic approach, this means that there are values of $\epsilon$ we believe more and values we believe less. All this, we repeat it, is summarized by the probability density function $f(\epsilon \mid \text{data}, H)$.

The way to take into account all possible values of $\epsilon$, each weighted by $f(\epsilon \mid \text{data}, H)$, is to follow the rules of probability theory, thus obtaining

    $$ P(\overline{I} \mid A, V, \text{data}, H) = \int_0^1 \epsilon \cdot f(\epsilon \mid \text{data}, H) \, d\epsilon , \tag{10} $$

which represents the probability that a vaccinated person, not belonging to the trial sample, gets shielded from Covid-19, on the basis of the data obtained from the trial and all (possibly reasonable) hypotheses assumed in the data analysis. It is easy to understand that $P(\overline{I} \mid A, V, \text{data}, H)$ is what really matters and what should therefore be communicated as efficacy to the scientific community and to the general public.

Now, technically, Eq. (10) is nothing but the mean of the distribution of $\epsilon$. This should then be the number to report, and not the mode of the distribution, which has no immediate probabilistic meaning for the questions of interest.

Now, if we compare the 'efficacy values' of Tab. 2 with the mean values of Tab. 3 we see that in most cases the differences are rather small (about 1/3 to 1/2 of a standard deviation), although the modal values (to which, as we have shown above, the published efficacies correspond) are always a bit higher than the mean values, due to the left skewness of the pdf's. Therefore our point is mostly methodological, with some worries when the mean value and the most probable one differ significantly.

Problem in the Doctrine of Chances

Evaluating the probability of future events on the basis of the outcomes of previous trials on 'apparently the same conditions' is an old, classical problem in probability theory that goes back to about 250 years ago and is associated with the names of Bayes [20] and Laplace [21]. The problem can be sketched as considering events whose probability of occurrence depends on a parameter which we generically indicate as $p$, i.e.

    $$ P(E \mid p) = p . $$

Idealized examples of the kind are the proportion of white balls in a box containing a large number of white and black balls (with the extracted ball put back into the box after each extraction), the bias of a coin and the ratio of the chosen surface in which a ball thrown 'at random' can stop, with respect to the total surface of a horizontal table (this was the case of Bayes' 'billiard', although the Reverend did not mention a billiard).

A related problem concerns the number of times ('$X$') events of a given kind occur in $n$ trials, assuming that $p$ remains constant. The result is given by the well known binomial, that is

    $$ X \sim \mathrm{Binom}(n, p) , \tag{11} $$

whose graphical causal model is shown in the left diagram of Fig. 3. The problem first tackled in quantitative terms by Bayes and Laplace was how to evaluate the probability of a 'future' event $E_f$, based on the information that in the past $n$ trials the event of that kind occurred $X = x$ times ('number of successes') and on the assumption of a regular flow from past to future (footnote 11), that is assuming $p$ constant although uncertain. In symbols, we are interested in

    $$ P(E_f \mid n, x, H) , $$

where $H$ stands, as above, for all underlying hypotheses. Both Bayes and Laplace realized that the problem goes through two steps: first finding the probability distribution of $p$ and then evaluating $P(E_f \mid n, x, H)$ taking into account all possible values of $p$. In modern terms

    $$ 1. \;\rightarrow\; f(p \mid n, x, H) \tag{12} $$
    $$ 2. \;\rightarrow\; P(E_f \mid n, x, H) = \int_0^1 P(E_f \mid p) \cdot f(p \mid n, x, H) \, dp \tag{13} $$
    $$ \phantom{2. \;\rightarrow\; P(E_f \mid n, x, H)} = \int_0^1 p \cdot f(p \mid n, x, H) \, dp . \tag{14} $$

[Two diagrams with nodes $n$, $p$ and $X$]

Figure 3: Graphical models of the binomial distribution (left) and its 'inverse problem'. The symbol '√' indicates the 'observed' nodes of the network, that is the value of the quantity associated to it is (assumed to be) certain. The other node (only one in this simple case) is 'unobserved' and it is associated to a quantity whose value is uncertain.

[Footnote 10: In this paper we only focus on efficacy, without even trying to enter into the related topic of effectiveness, that refers to how well the vaccine performs in the real world (see e.g. [19]), and that is influenced by several other factors.]

[Footnote 11: A reminder of Russell's inductivist turkey is a must at this point!]

The basic reasoning behind these two steps is expressly outlined in the Sixth and Seventh Principle of the Calculus of Probabilities, expounded by Laplace in Chapter III of his Philosophical Essay on Probabilities [22]:

• the Sixth Principle, in terms of the possible causes $C_i$ responsible for the observed event $E$, is essentially what is presently known as Bayes' theorem, that is

    $$ P(C_i \mid E) = \frac{P(E \mid C_i) \cdot P(C_i)}{\sum_k P(E \mid C_k) \cdot P(C_k)} , \tag{15} $$

  in which $P(C_i)$ is the so called prior probability of $C_i$, i.e. not taking into account the piece of information provided by the observation of $E$. Note that the role of $P(C_i)$ was explicitly considered by Laplace, who 1) first gave the rule in the case of $P(C_i)$ numerically all equal, which then drop from Eq. (15); 2) then specified that "if these various causes, considered à priori, are unequally probable, it is necessary, in the place of the probability of the event resulting from each cause, to employ the product of this probability by the possibility of the cause itself." (Here 'possibility' and 'probability' are clearly used as synonyms.) Then, the importance of the finding is stressed:

      "This is the fundamental principle of this branch of the analysis of chances which consists in passing from events to causes."

  Generalizing this 'principle' to an infinite number of causes, associated to all possible values of the parameter $p$, with the 'event' being the observation of $X = x$ successes in $n$ trials, we get the case sketched in the right diagram of Fig. 3, in which the unobserved node is now $p$. Equation (15) becomes then, in terms of the probability function of $X$ and of the pdf of $p$ [for which we take the freedom of using the same symbol '$f()$'],

    $$ f(p \mid n, x) = \frac{f(x \mid n, p) \cdot f(p)}{\int_0^1 f(x \mid n, p) \cdot f(p) \, dp} . \tag{16} $$

• The Seventh Principle then states that "the probability of a future event is the sum of the products of the probability of each cause, drawn from the event observed, by the probability that, this cause existing, the future event will occur", that is

    $$ P(E_f) = \sum_i P(C_i) \cdot P(E_f \mid C_i) . \tag{17} $$

  Generalizing also this 'principle' to an infinite number of causes associated to all possible values of the parameter $p$ we get Eq. (13), and then Eq. (14): the probability of interest is the mean of the distribution of $p$.

The solution of Eq. (16), in the case $X$ is described by Eq. (11) and we consider all values of $p$ à priori equally likely, is a Beta pdf (footnote 12), that is

    $$ p \sim \mathrm{Beta}(r, s) \tag{18} $$

with $r = x + 1$ and $s = n - x + 1$. Mean value and variance of the possible values of $p$ are then

    $$ \mu \equiv \mathrm{E}(p) = \frac{r}{r+s} \tag{19} $$
    $$ \sigma^2 \equiv \mathrm{Var}(p) = \frac{r \cdot s}{(r+s+1) \cdot (r+s)^2} . \tag{20} $$

Finally, using Eq. (13) and Eq. (14) we get Laplace's rule of succession

    $$ P(E_f \mid n, x, H) = \frac{x+1}{n+2} . \tag{21} $$

Thus, in the special case of '$n$ successes in $n$ trials', "we find that an event having occurred successively any number of times, the probability that it will happen again the next time is equal to this number increased by unity divided by the same number, increased by two units" [22], i.e.

    $$ P(E_f \mid n, x = n, H) = \frac{n+1}{n+2} . \tag{22} $$

In the case of $x = n = 11$ we have then 12/13, or 92.3%. Reporting thus 100% (see footnote 3) can be at least misleading, especially because such a value can be (as it has indeed been) nowadays promptly broadcast uncritically by the media (see e.g. [15] – we have heard so far no criticism in the media of such an incredible claim, but only sarcastic comments by colleagues).

[Footnote 12: See e.g. Ref. [16] and references therein.]

Moving to our results about the 'model parameter $\epsilon$' (it is now time to be more careful with names), reported in Tab. 3 and Fig. 2, it should now be clear why the number to report as efficacy should be the mean of the distribution.
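The difference between mode and mean in the binomial problem with a flat prior, Eqs. (18)-(22), is easy to tabulate. The short Python sketch below is only illustrative: the function names are ours, and exact fractions are used just to make the 12/13 of the text explicit.

```python
from fractions import Fraction

def rule_of_succession(x, n):
    """Mean of Beta(x + 1, n - x + 1), Eq. (21): probability of the next success."""
    return Fraction(x + 1, n + 2)

def most_probable_value(x, n):
    """Mode of the same Beta pdf (flat prior): the observed frequency x / n."""
    return Fraction(x, n)

# (11, 11) is the 'severe forms' case; the other pairs are arbitrary examples
for x, n in [(11, 11), (0, 1), (9, 10)]:
    mean, mode = rule_of_succession(x, n), most_probable_value(x, n)
    print(f"x={x}, n={n}: mode = {float(mode):.3f}, mean = {float(mean):.3f}")
```

For 11 successes in 11 trials the mode sits at the edge of the allowed range, while the mean stays below it, which is why the two summaries answer different questions.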
As far as the distribution of $\epsilon$ is concerned, given the similarity with the inferential problem that was first solved by Bayes and Laplace, we have good reasons to expect that it should not ‘differ much’ from a Beta. In order to test the correctness of our guess we have done the simple exercise of superimposing on the MCMC distributions of Fig. 2 the Beta pdf’s evaluated from the means and standard deviations of Tab. 3. [Footnote: To make it clear, no ‘fit’ on the MCMC histogram has been performed.] The distribution parameters can in fact be obtained by solving Eqs. (19)-(20) for $r$ and $s$:

$$ r = \frac{(1 - \mu) \cdot \mu^2}{\sigma^2} - \mu \tag{23} $$

$$ s = \frac{1 - \mu}{\mu} \cdot \left[ \frac{(1 - \mu) \cdot \mu^2}{\sigma^2} - \mu \right] . \tag{24} $$

[Figure 4: plot of $f(\epsilon)$ versus $\epsilon$, with $\epsilon$ ranging from 0.4 to 1.0; curves labelled Moderna_2, Pfizer, AZ-LDSD and AZ-SDSD.]

Figure 4: MCMC inferred distributions of $\epsilon$ (solid lines, exactly as in Fig. 2) with superimposed (dashed lines, often coinciding with the solid ones) the corresponding Beta distributions evaluated from the mean values and the standard deviations resulting from MCMC.

The result is shown in Fig. 4. As we can see, the agreement is rather good in all cases, especially for Moderna and Pfizer, for which the Beta and MCMC curves practically coincide.

The fact that the MCMC results are described with a high degree of accuracy by a Beta distribution is not just a sterile curiosity, but has indeed an interesting practical consequence.

As we have seen, the pdf’s of $\epsilon$ have been obtained starting from a uniform prior. The same must hold for Pfizer, since they also did a Bayesian analysis, as explicitly stated in their paper [3] and as revealed by the expression ‘credible interval’ (see Tab. 2), and their values practically coincide with ours. Instead, in the case of the other results, the expression ‘confidence interval’ seems to refer to a frequentist analysis, in which “there are no priors”. But in reality it is not difficult to show that sound frequentist analyses (e.g. those based on the likelihood) can be seen as approximations of Bayesian analyses in which a flat prior was used (see e.g. Ref. [13]). The resulting ‘estimate’ corresponds to the mode of the posterior distribution under that assumption.

The question is now what to do if an expert has a ‘non flat’ informative prior (indeed nobody would à priori believe that values of $\epsilon$ close to zero or to unity are equally likely!). Should she ask to repeat the analysis inserting her prior distribution of $\epsilon$? Fortunately this is not necessary. Indeed, as we have discussed in Ref. [16], due to the symmetric and peer roles of likelihood and prior in the so-called Bayes’ rule, each of the two has the role of ‘reshaping’ the other.
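The moment matching of Eqs. (23)-(24), used to draw the Beta curves of Fig. 4, can be sketched in R as follows. The function name `beta_from_moments` and the input values (mean 0.944, standard deviation 0.019) are illustrative assumptions; the round trip through Eqs. (19)-(20) is only a consistency check, not a fit.

```r
# Beta parameters from mean and standard deviation, Eqs. (23)-(24).
# 'beta_from_moments' is a hypothetical helper name used only in this sketch.
beta_from_moments <- function(mu, sigma) {
  # A Beta with mean mu exists only if sigma^2 < mu * (1 - mu)
  stopifnot(mu > 0, mu < 1, sigma > 0, sigma^2 < mu * (1 - mu))
  r <- (1 - mu) * mu^2 / sigma^2 - mu   # Eq. (23)
  s <- (1 - mu) / mu * r                # Eq. (24)
  c(r = r, s = s)
}

bp <- beta_from_moments(mu = 0.944, sigma = 0.019)  # illustrative inputs
r <- bp[["r"]]
s <- bp[["s"]]

# Round trip: Eqs. (19)-(20) must give back the input mean and standard deviation
mu.back    <- r / (r + s)
sigma.back <- sqrt(r * s / ((r + s + 1) * (r + s)^2))
cat(sprintf("r = %.2f, s = %.2f -> mean = %.3f, sd = %.3f\n",
            r, s, mu.back, sigma.back))
```

The resulting pdf can then be drawn with `dbeta(e, r, s)` over a grid of values `e` and overlaid on the MCMC histogram.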
Moreover, since a posterior distribution based on a uniform prior on the variable of interest can be interpreted as a likelihood (apart from factors irrelevant for the inference), we can apply an expert’s prior to it at a later time (see Ref. [16] for details). The importance of the observation that the pdf’s of $\epsilon$ derived by MCMC can be approximated by Beta distributions then becomes clear: from the MCMC mean and standard deviation we can evaluate the Beta of interest, as we have seen above; this function can then be easily multiplied by the expert’s prior; the normalization can be done by numerical integration; and finally the posterior distribution of $\epsilon$, also conditioned on the expert’s prior, is obtained.

This second-step implementation of the expert opinion becomes particularly simple if her prior is also modeled by a Beta, recognized to be a quite flexible distribution. For example, indicating by $r_F$ and $s_F$ the Beta parameters [calculated with Eqs. (23)-(24)] obtained from a flat prior, and by $r$ and $s$ the parameters of the informative Beta prior, the posterior distribution will still be a Beta, with parameters

$$ r_p = r + r_F - 1 \tag{25} $$

$$ s_p = s + s_F - 1 \,. \tag{26} $$

[Footnote: In fact, assuming that the MCMC based pdf of $\epsilon$ starting from a flat prior (‘$F$’) can be approximated by a Beta, we can write it, neglecting irrelevant factors, as $f_F(\epsilon) \propto \epsilon^{r_F - 1} \cdot (1 - \epsilon)^{s_F - 1}$. Expressing also the informative prior by a Beta, that is $f(\epsilon) \propto \epsilon^{r - 1} \cdot (1 - \epsilon)^{s - 1}$, and applying Bayes’ rule, we get for the posterior (‘$p$’) $f_p(\epsilon) \propto f_F(\epsilon) \times f(\epsilon) \propto \epsilon^{r_F - 1 + r - 1} \cdot (1 - \epsilon)^{s_F - 1 + s - 1} \propto \epsilon^{(r_F + r - 1) - 1} \cdot (1 - \epsilon)^{(s_F + s - 1) - 1}$, from which Eqs. (25)-(26) follow.]

[Figure 5: Bayesian network diagram; the model of Fig. 1 (nodes $p_A$, $n_V$, $n_P$, $n_{V_A}$, $n_{P_A}$, $n_{V_I}$, $n_{P_I}$, $\epsilon$) is enclosed in a box and extended with the nodes $p'_A$, $n'_V$, $n'_{V_A}$ and $n'_{V_I}$.]

Figure 5: Variation of the model of Fig. 1, in this figure inside a box, in order to consider the effect of the vaccine on $n'_V$ individuals of a different population (see text).

Once we have got the efficacy of a vaccine, or, even better, the full information about $\epsilon$ inferred from the data, we can tackle another interesting problem: if we vaccinate $n'_V$ individuals in (possibly) another population, how many of them will become infected? Obviously, this depends not only on $f(\epsilon)$ but also on many other parameters, which we can model with the assault probability $p'_A$ in the new population, and, of course, on the fact that $n'_V$ must be only a small part of the entire population, so that we do not have to consider issues related to herd immunity. In order to do this exercise we need to enlarge our causal model, thus getting the one shown in Fig. 5. But in reality there is no need to set up a new JAGS model and rerun the MCMC. We can just use the chain of $\epsilon$ obtained by processing the original model of Fig. 1 through the MCMC, and do the remaining work with ‘direct’ Monte Carlo. However, having observed that the resulting pdf of $\epsilon$ is well approximated by a Beta, we can also do it starting from the mean and standard deviation obtained from the MCMC. Here is e.g. how the evaluation of $n'_{V_I}$ can be performed by sampling (in the R code we have indicated $n'_V$ as nV, and so on):

    mu <- 0.944; sigma <- 0.019; e.rep <- 0.950
    mu <- 0.599; sigma <- 0.090; e.rep <- 0.621
    […]

One hundred thousand vaccinated individuals have been used, with an absolutely hypothetical value of the assault probability of 1%. The script can also be used to simulate the effect of a precise value of $\epsilon$, thus exactly corresponding to the efficacy, by just setting its standard deviation to a very small value. The result of this idealized situation, using e.g. $\epsilon$ = 0.[…], is shown in the top histogram of Fig. 6. In this idealized case the distribution of $n'_{V_I}$ is simply a binomial, i.e.
$$ n'_{V_I} \sim \mathrm{Binom}(n'_V, \, p_{ov}) \,, $$

with the overall probability $p_{ov}$ being equal to $p'_A \cdot (1 - \epsilon)$, that is $p_{ov}$ = 0.[…] […] $\epsilon$ is shown in the second (top-down) histogram of the same figure. As we can see, the distribution becomes remarkably wider and more asymmetric, with a right-hand skewness, an effect of the left-hand skewness of $f(\epsilon)$.

[Footnote: There is nothing special about this choice, and what follows is little more than an exercise, strongly dependent on the assumption on $p'_A$.]

[Figure 6: four histograms of density versus $n'_{V_I}$, top-down: mean = 56.0, std = 7.5; mean = 56.0, std = 20.4; mean = 56.0, std = 9.4; mean = 56.0, std = 21.3.]

Figure 6: Distribution of the predicted number of vaccinated infectees, subject to (top-down): exact values of $\epsilon$ and $p'_A$; uncertain $\epsilon$; uncertain $p'_A$; uncertain $\epsilon$ and $p'_A$ (see text for details).
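The ‘direct’ Monte Carlo behind the four scenarios of Fig. 6 can be sketched as below. This is not the authors’ script, only a minimal reconstruction under stated assumptions: $\epsilon$ is sampled from the Beta obtained with Eqs. (23)-(24) from `mu` and `sigma`; the uncertain $p'_A$ is also modeled by a Beta with standard deviation `0.1 * pA` (the shape of that distribution is our assumption); the seed and the sample size `nr` are arbitrary.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
nV    <- 1e5                      # vaccinated individuals (n'_V in the text)
pA    <- 0.01                     # hypothetical assault probability (p'_A)
mu    <- 0.944; sigma <- 0.019    # mean and standard deviation of epsilon
spA   <- 0.1 * pA                 # assumed standard deviation of p'_A
nr    <- 1e5                      # Monte Carlo sample size (assumption)

# Beta parameters from mean and standard deviation, Eqs. (23)-(24)
beta_pars <- function(m, s) {
  r <- (1 - m) * m^2 / s^2 - m
  c(r = r, s = (1 - m) / m * r)
}

be   <- beta_pars(mu, sigma)
eps  <- rbeta(nr, be[["r"]], be[["s"]])     # uncertain epsilon
bA   <- beta_pars(pA, spA)
pA.s <- rbeta(nr, bA[["r"]], bA[["s"]])     # uncertain p'_A (Beta is an assumption)

# Number of vaccinated infectees in the four scenarios of Fig. 6 (top-down)
nVI <- list(
  exact    = rbinom(nr, nV, pA   * (1 - mu)),
  unc.eps  = rbinom(nr, nV, pA   * (1 - eps)),
  unc.pA   = rbinom(nr, nV, pA.s * (1 - mu)),
  unc.both = rbinom(nr, nV, pA.s * (1 - eps))
)

for (k in names(nVI)) {
  cat(sprintf("%-9s mean = %.1f  std = %.1f\n", k, mean(nVI[[k]]), sd(nVI[[k]])))
}
```

Under these assumptions the sample means and standard deviations should land close to the values quoted in the panels of Fig. 6; small differences are expected from the Monte Carlo fluctuations and from the Beta used here for $p'_A$.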

We see then, in the third histogram, the effect of a hypothetical uncertainty about $p'_A$, modeled here with a standard deviation of $\sigma(p'_A) = 0.1 \times p'_A$ (but this has to be understood really as an exercise done only to get an idea of the effect, because a reasonable uncertainty could indeed be much larger). Finally, including both sources of uncertainty, we get the histogram and the numbers at the bottom of the figure. Vertical lines show the values of $n'_{V_I}$ predicted using the MCMC mean value (solid line) and the modal value (dashed lines).

As a further step, following Ref. [16] (see in particular Secs. 5.2.1 and 5.3.1 there), let us try to get approximated formulae for the expected value and the standard deviation of $n'_{V_I}$. The idea, we briefly recall, is to start from the expected value and variance evaluated at the expected values of $\epsilon$ and $p'_A$, and then make a ‘propagation of uncertainty’ by linearization (as if $\epsilon$ and $p'_A$ were ‘systematics’). Here are the resulting formulae

$$ \mathrm{E}(n'_{V_I}) \approx n'_V \cdot \mathrm{E}(p'_A) \cdot [1 - \mathrm{E}(\epsilon)] \tag{27} $$

$$
\begin{aligned}
\sigma^2(n'_{V_I}) &\approx \sigma^2\big[ n'_{V_I} \,|\, p'_A = \mathrm{E}(p'_A), \, \epsilon = \mathrm{E}(\epsilon) \big] + \big( n'_V \cdot [1 - \mathrm{E}(\epsilon)] \big)^2 \cdot \sigma^2(p'_A) + \big( n'_V \cdot \mathrm{E}(p'_A) \big)^2 \cdot \sigma^2(\epsilon) \\
&\approx n'_V \cdot \mathrm{E}(p'_A) \cdot [1 - \mathrm{E}(\epsilon)] \cdot \big[ 1 - \mathrm{E}(p'_A) \cdot [1 - \mathrm{E}(\epsilon)] \big] + {n'_V}^2 \cdot [1 - \mathrm{E}(\epsilon)]^2 \cdot \sigma^2(p'_A) + {n'_V}^2 \cdot \mathrm{E}^2(p'_A) \cdot \sigma^2(\epsilon) \,,
\end{aligned}
\tag{28}
$$

which we have checked to be in excellent agreement with the results from direct Monte Carlo.

[Footnote: For example, here is the R code to be added to the above script, immediately after the assignment ‘pA <- 0.01’, in order to implement Eqs. (27)-(28):

    spA <- 0.1 * pA
    est.nvI <- nV * pA * (1-mu)
    est.sigma.nvI <- sqrt( nV * pA * (1-mu) * (1 - pA * (1-mu)) +
                           (nV*(1-mu))^2 * spA^2 + (nV*pA)^2 * sigma^2 )
    cat(sprintf("Approximated nvI: mean+-sigma: %.1f +- %.1f\n",
                est.nvI, est.sigma.nvI))
]

A last point we wish to address in this paper is related to the efficacy of the vaccines against disease severity, based on the data reported by Moderna and Pfizer (see Tab. 1): in the first case 30 people got a ‘severe form’ out of 185 infectees in the control group, while none of the severe cases occurred in the group of 11 vaccinated infectees; in the second the corresponding numbers are 9 in 162 and 1 in 8.

In order to analyze these further pieces of information we can simply extend the Bayesian network of Fig. 1 by adding four nodes (see Fig. 7): $n_{V_{Is}}$ and $n_{P_{Is}}$ represent the number of infected individuals that got the disease in a severe form, while $p_{V_{Is}}$ and $p_{P_{Is}}$ represent the corresponding probabilities of developing the disease in such a way in the two cases.

[Figure 7: Bayesian network diagram; the nodes of Fig. 1 ($p_A$, $n_V$, $n_P$, $n_{V_A}$, $n_{P_A}$, $n_{V_I}$, $n_{P_I}$, $\epsilon$) plus the added nodes $p_{V_{Is}}$, $p_{P_{Is}}$, $n_{V_{Is}}$ and $n_{P_{Is}}$.]

Figure 7: Extended Bayesian network of the vaccine vs placebo experiment (see text).

Again we can use binomial distributions:

$$ n_{V_{Is}} \sim \mathrm{Binom}(n_{V_I}, \, p_{V_{Is}}) \tag{29} $$

$$ n_{P_{Is}} \sim \mathrm{Binom}(n_{P_I}, \, p_{P_{Is}}) \,, \tag{30} $$

and the following JAGS model would result:

    model {
      nP.I ~ dbin(pA, nP)
      […]

[…] $n_{V_I}$ and $n_{P_I}$ being observed nodes (i.e. $n_{V_I}$ and $n_{P_I}$ are just data), the bottom nodes involving ($n_{V_{Is}}$, $n_{V_I}$, $p_{V_{Is}}$) and ($n_{P_{Is}}$, $n_{P_I}$, $p_{P_{Is}}$) get ‘separated’ from the rest of the network. In other words, there is no flow of evidence from ($n_{V_{Is}}$, $p_{V_{Is}}$), or from ($n_{P_{Is}}$, $p_{P_{Is}}$), to the rest of the network. Therefore the problem has a rather simple solution. In particular, using uniform priors for $p_{V_{Is}}$ and $p_{P_{Is}}$, we get

$$ p_{V_{Is}} \sim \mathrm{Beta}(n_{V_{Is}} + 1, \, n_{V_I} - n_{V_{Is}} + 1) $$

$$ p_{P_{Is}} \sim \mathrm{Beta}(n_{P_{Is}} + 1, \, n_{P_I} - n_{P_{Is}} + 1) \,. $$

Nevertheless, for didactic purposes, we report in Fig. 8 the histograms of the MCMC results with the Beta pdf’s superimposed (solid lines; we shall come back later to the meaning of the dashed line added in the case of Moderna).

As far as the control groups are concerned (green, narrower histograms and curves in Fig. 8), the results from Moderna and Pfizer data are quite different. In both cases we get rather narrow distributions, as expected from the rather large numbers involved (and therefore the central values are close to the proportion of severe cases with respect to the total number). But they differ substantially and, using mean and standard deviation to summarize them, we get

Moderna: $p_{P_{Is}}$ = 0.[…] ± […]
Pfizer: $p_{P_{Is}}$ = 0.[…] ± […],

differing by 0.[…] ± […]
[…] “approximately 10-15% of cases progress to severe disease, and about 5% become critically ill” [23]. Obviously, we are not in a position to make any statement about the reason for the disagreement, which could be due to the different populations on which the trials have been performed. But, if this were the case, one could have some doubts about the validity of comparing the different sets of data. We can only leave the question to epidemiology experts.

Passing to the vaccine groups (red, broader histograms and curves in Fig. 8), the crude summaries in terms of mean and standard deviation give

Moderna: $p_{V_{Is}}$ = 0.[…] ± […]
Pfizer: $p_{V_{Is}}$ = 0.[…] ± […].

If we consider just the mean values, it seems that there is a very large difference between the performances of the two vaccines. However, the difference is hardly significant, being −0.[…] ± 0.14. Also, the fact that the two curves look substantially different should not impress a statistics expert if she knew from which numbers they have been derived. The main difference is due to the fact that […] $p_{V_{Is}}$ = 0.[…] ± 0.096: practically the same, an experienced physicist would say.

[Figure 8: two panels, Moderna and Pfizer, each showing density versus severe disease probability.]

Figure 8: Distribution of the probability of getting a severe form of Covid-19 (see text).

It is quite evident that it is not possible to draw general conclusions on the efficacy of a generic vaccine in softening the impact of the disease. But the real point we wish to highlight, given the spread of the distributions, is that we do not have enough data for drawing sound conclusions. For this reason, we point out that, even for this aspect, press releasing a 100% effect, without dealing with the unavoidable uncertainties and their impact when applied to decision making, is quite misleading. Figure 8 indeed shows that the probability of becoming severely ill in the vaccine group is definitely low but, quite obviously, not zero, and with a relevant overlap with the distribution evaluated for the control group.
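The Beta posteriors for the probability of a severe form can be evaluated directly from the counts quoted in the text (30 in 185 and 0 in 11 for Moderna, 9 in 162 and 1 in 8 for Pfizer), assuming uniform priors. The R sketch below is only illustrative: the helper name `severe_posterior` is ours, and the difference between the two vaccine groups is summarized by combining the standard deviations in quadrature, which assumes the two trials to be independent.

```r
# Mean and standard deviation of Beta(n.severe + 1, n.infected - n.severe + 1),
# i.e. the posterior of the severe-form probability under a uniform prior.
# 'severe_posterior' is a hypothetical helper name used only in this sketch.
severe_posterior <- function(n.severe, n.infected) {
  r <- n.severe + 1
  s <- n.infected - n.severe + 1
  c(mean = r / (r + s),
    sd   = sqrt(r * s / ((r + s + 1) * (r + s)^2)))
}

# Counts as quoted in the text (severe cases out of infectees)
counts <- data.frame(
  trial      = c("Moderna", "Moderna", "Pfizer", "Pfizer"),
  group      = c("placebo", "vaccine", "placebo", "vaccine"),
  n.severe   = c(30, 0, 9, 1),
  n.infected = c(185, 11, 162, 8)
)

post <- t(mapply(severe_posterior, counts$n.severe, counts$n.infected))
print(cbind(counts, round(post, 3)))

# Difference between the vaccine groups (Moderna minus Pfizer),
# with the standard deviations combined in quadrature (independent trials assumed)
d    <- post[2, "mean"] - post[4, "mean"]
sd.d <- sqrt(post[2, "sd"]^2 + post[4, "sd"]^2)
cat(sprintf("vaccine groups, Moderna - Pfizer: %.2f +- %.2f\n", d, sd.d))
```

The same function applied to a hypothetical count, e.g. `severe_posterior(1, 11)`, shows how much a single additional severe case would move the summary when the numbers are this small.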

In this paper, spurred first by the press releases and then by the published results about the performance of the candidate vaccines, we have reanalyzed the published data with the help of a Bayesian network processed with MCMC methods. The aim was to obtain, for each data set, the pdf of the model variable $\epsilon$, whose meaning is the following: if we assume for it an exact value, then the probability that a ‘virus assaulted individual’ gets infected would be exactly $1 - \epsilon$.

Our results are in excellent agreement with the published ones, if the latter are properly interpreted. In fact, it turns out that they coincide with the modes of the respective pdf’s $f(\epsilon)$, although we maintain that if a single number has to be provided, especially to the media, it should be the mean of the distribution. This has, in fact, the meaning of the probability that a vaccinated person will be shielded from the virus, taking into account the unavoidable uncertainty on $\epsilon$, fully described by $f(\epsilon)$. And this is what really matters to define the efficacy of a vaccine. If we nevertheless reduce the result of our analysis to a single number to be compared with the released ones, we get, respectively and with reasonable rounding, 93%, 94%, 94%, 86% and 60% for Moderna-1, Moderna-2, Pfizer, AstraZeneca (LDSD) and AstraZeneca (SDSD), versus 94.5%, 94.1%, 95.0%, 90.0% and 62.1% of Tab. 2. Therefore, as far as these numbers are concerned, there is a substantial agreement of the outcome of our analysis with the published results, simply because when a probability distribution is unimodal and rather symmetrical, mode and mean tend to coincide. With respect to the main results, our contribution on this point is thus mainly methodological. The probability theory based result is, instead, at odds with Moderna’s claimed 100% efficacy against severe disease, for which a more sound 92% should be quoted.
In order to summarize more effectively the probability distribution of $\epsilon$ with just a couple of numbers, our preference goes to mean and standard deviation, although we also report the bounds of the central 95% credible interval. This interval is, once more, in excellent agreement not only with the Pfizer result, for which an interval having exactly the same meaning has also been published, but also with the uncertainty intervals of the other companies, although they provide confidence intervals, which, strictly speaking, do not have the same meaning as credible intervals. This is not a surprise to us. We are in fact aware that in many practical cases not only are frequentist point estimates equivalent to the mode of the posterior distribution of the model parameter, if a uniform prior is used in a Bayesian analysis based on the same data, but also ‘95% confidence intervals’ tend to be numerically equal to the 95% probability intervals.

This takes us to the question of the priors. As just recalled, a uniform prior over $\epsilon$ has been used in our analysis. But, clearly, not because we believe that the efficacy of a vaccine that has reached the Phase-3 trial has the same chance to be close to zero or to one. Instead, a flat prior can be considered a convenient practical choice if the inference is dominated by the data, as is often the case. Moreover, the advantage of a uniform prior in parametric inference is that the effect of an informative prior reflecting the opinion of experts can be taken into account at a later time. This ‘posterior use of priors’ might sound paradoxical, but it is important to remember that in Bayesian inference ‘prior’ does not indicate time order but rather ‘based on the status of knowledge without taking into account the new piece of information’ provided by the data entering the specific analysis.
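The ‘posterior use of priors’ can be made concrete with a short R sketch based on Eqs. (23)-(26). All numbers are illustrative assumptions: the flat-prior result is represented by a Beta with mean 0.944 and standard deviation 0.019, and the expert’s informative prior is a hypothetical Beta(3, 2). The numerical integration is there only to check that multiplying the two Beta pdf’s and normalizing gives the Beta of Eqs. (25)-(26).

```r
# Beta parameters from mean and standard deviation, Eqs. (23)-(24)
# ('beta_from_moments' is a hypothetical helper name)
beta_from_moments <- function(mu, sigma) {
  r <- (1 - mu) * mu^2 / sigma^2 - mu
  c(r = r, s = (1 - mu) / mu * r)
}

flat  <- beta_from_moments(mu = 0.944, sigma = 0.019)  # r_F, s_F (illustrative inputs)
prior <- c(r = 3, s = 2)                               # hypothetical expert prior

# Eqs. (25)-(26): the posterior is again a Beta
post <- c(r = prior[["r"]] + flat[["r"]] - 1,
          s = prior[["s"]] + flat[["s"]] - 1)
post.mean <- post[["r"]] / (post[["r"]] + post[["s"]])
cat(sprintf("posterior Beta(%.2f, %.2f), mean = %.4f\n",
            post[["r"]], post[["s"]], post.mean))

# Cross-check: flat-prior pdf times expert prior, normalized numerically
unnorm <- function(e) {
  dbeta(e, flat[["r"]], flat[["s"]]) * dbeta(e, prior[["r"]], prior[["s"]])
}
norm     <- integrate(unnorm, lower = 0, upper = 1, rel.tol = 1e-8)$value
e.grid   <- c(0.85, 0.90, 0.95)
max.diff <- max(abs(unnorm(e.grid) / norm -
                    dbeta(e.grid, post[["r"]], post[["s"]])))
cat(sprintf("max difference on the grid: %.2e\n", max.diff))
```

With a vague prior such as the one assumed here the posterior mean moves only slightly away from the flat-prior mean; a much more peaked prior would pull it further.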
Since prior and likelihood have, in fact, symmetric and peer roles in Bayes’ rule, an expert can use her prior to ‘reshape’ the posterior pdf resulting from the data analysis, if a flat prior was used, without having to ask to repeat the analysis (see Ref. [16] for details). This reshaping becomes particularly simple if the prior is modeled by a convenient, rather flexible probability distribution such as the Beta. In fact, as we have seen, the pdf of $\epsilon$ starting from a flat prior tends to resemble a Beta. The same is then true of the posterior if the prior is also modeled by a Beta (this is related to the well-known fact that the Beta is the conjugate prior of a binomial distribution, even though our model is not just a simple binomial). These observations are particularly interesting because they lead to the expressions (25)-(26), which, together with Eqs. (23)-(24), allow the expert priors to be taken into account easily. In fact, if the priors are rather vague, $r$ and $s$ appearing in Eqs. (25)-(26) are quite small (although larger than one, since $\epsilon = 0$ and $\epsilon = 1$ are à priori reasonably ruled out) and, in particular, smaller than $r_F$ and $s_F$. If, instead, the expert has a strong opinion about the possible values of $\epsilon$, $r$ and $s$ will play a role in her posterior, and in that of her community, if its members trust her.

[Footnote on the numerical equality of confidence and probability intervals: But this is not a general rule, as discussed in detail in Ref. [13].]

[Footnote on $r_F$ and $s_F$: Just to have an idea of the numbers we are dealing with, the values of $\{r_F, s_F\}$ resulting from our analysis are equal to {[…], […]}, {[…], […]}, {[…], […]}, {[…], […]} and {[…], […]}, respectively for Moderna-1, Moderna-2, Pfizer, AstraZeneca (LDSD) and AstraZeneca (SDSD).]

Coming back to the way to summarize $f(\epsilon)$, our preference goes to its mean and standard deviation. The mean because, as recalled above, it has the meaning of efficacy for vaccinated people not involved in the trial, if all possible values of $\epsilon$ are taken into account.
The standard deviation because it is the most convenient quantity, together with the mean, for making use of the result of the inference in further considerations and in ‘propagation of uncertainties’, thanks to general probability rules.

We have just recalled the utility of mean and standard deviation in order to re-obtain $f(\epsilon)$, under the hypothesis that it is almost a Beta distribution, making use of Eqs. (23)-(24). The application related to ‘propagation of uncertainties’ that we have seen in the paper has to do with predicting the number of individuals that will get infected in a group that is going to be vaccinated. This is a problem in probabilistic forecasting, and the number of interest is uncertain for several reasons. There is, unavoidably, the uncertainty deriving from the inherent binomial distribution, having assumed an assault probability $p'_A$ in the new population. But the uncertainties on the values of $p'_A$ and $\epsilon$ also play a role, which can even be dominant with respect to the ‘statistical’ effect of the binomial.

Now, the probability distribution of the number of vaccinated infectees can be evaluated by extending our basic Bayesian network, as we have done here. But we have also stressed the importance of having approximated expressions, based on linearization, for its expected value and standard deviation. Such expressions, obtained by considering $\epsilon$ and $p'_A$ as ‘systematics’ [13], depend on their means and standard deviations. For example, the contribution to $\sigma(n'_{V_I})$ due to the uncertain $\epsilon$, to be added ‘in quadrature’ to the other sources of uncertainty, is given by $n'_V \cdot \mathrm{E}(p'_A) \cdot \sigma(\epsilon)$. This gives at a glance the contribution to the global uncertainty without having to run a Monte Carlo.

Finally, a comment on how to possibly reduce $\sigma(\epsilon)$ is in order. In fact, the relative uncertainty on $\epsilon$ depends on the small number of vaccinated infectees.
This suggests that the quality of its ‘measurement’ could be improved, keeping constant the total number of individuals entering the trial, if the size of the placebo group is reduced. We have checked by simulation that, reducing it by 2/3, thus having about a factor of five between the two groups, $\sigma(\epsilon)$ is expected to be reduced by about 20%. Not much indeed, but this different sharing of individuals between the two groups would have the […]

[Footnote on the expert’s community: Remember that Science and its popularization are based on a long chain of rational beliefs [13]. Think for example of the reasons you believe in gravitational waves, provided that you really believe that they could exist and that they have finally been detected on Earth starting from 2015; our trusted source assures us that 67 of them have been ‘observed’ so far [24].]

[Footnote on the relative uncertainty on $\epsilon$: Note also that what really enters in Eqs. (27)-(28) is $1 - \epsilon$, whose relative uncertainty is around 30% even in the best cases of Moderna-2 and Pfizer [it reaches 54% for AstraZeneca (LDSD), becoming ‘only’ 22% for AstraZeneca (SDSD), characterized however by a large value of $1 - \epsilon$].]

A re-analysis of the 2020 data on vaccine trials by Pfizer, Moderna and AstraZeneca has been performed with didactic intent, focusing on uncertainties and on what should most effectively be reported as outcome. In this respect, we appreciate once more the role played by Bayesian networks in clarifying the meaning of each variable entering the model and its relation with the others. In particular, we make a distinction between the ‘efficacy variable’ $\epsilon$ and the efficacy to be reported to the scientific community and to the general public as the probability that a newly vaccinated person will be shielded from Covid-19.
The uncertain values of $\epsilon$ are characterized by the pdf $f(\epsilon)$, obtained in this work by MCMC, whose mean value has exactly the meaning of efficacy, in analogy to Laplace’s rule of succession to quantify the probability of a future event.

We have stressed not only the importance of providing the most complete information on $f(\epsilon)$, whose graphical representation conveys its uncertainty better than many words, but also that of summarizing the results (if only two numbers have to be chosen) with the mean and standard deviation of the distribution. In fact, although the probability distribution of $\epsilon$ is what is really needed for the development of predictive what-if scenarios, mean and standard deviation are the most useful quantities to be used for further approximated evaluations.

With regard to the comparison with the published results, the efficacy values obtained in this analysis as means of the probability distributions of $\epsilon$ are in good agreement with them, for the reasons just recalled in the previous section, with the exception of the 100% claim, which speaks for itself. This agreement plays an important role in validating the causal model on which the analysis is based, indeed very simple (but not simplistic!), which can be used by students and researchers to repeat the analysis; for this reason pieces of programming code have also been provided.

References

[1] Pfizer Inc., Press Release, November 9, 2020.

[2] Pfizer Inc., Press Release, November 18, 2020.

[3] F.P. Polack et al. for the C4591001 Clinical Trial Group (Pfizer), Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine, The New England Journal of Medicine.

[4] Moderna Inc., Press Release, November 16, 2020, https://investors.modernatx.com/news-releases/news-release-details/modernas-covid-19-vaccine-candidate-meets-its-primary-efficacy .

[5] Moderna Inc., Press Release, November 30, 2020, https://investors.modernatx.com/news-releases/news-release-details/moderna-announces-primary-efficacy-analysis-phase-3-cove-study .

[6] L.R. Baden et al., Efficacy and Safety of the mRNA-1273 SARS-CoV-2 Vaccine, The New England Journal of Medicine.

[7] AstraZeneca PLC, News Release, 23 November 2020.

[8] M. Voysey et al. (AstraZeneca), Safety and efficacy of the ChAdOx1 nCoV-19 vaccine (AZD1222) against SARS-CoV-2: an interim analysis of four randomised controlled trials in Brazil, South Africa, and the UK, December 8, 2020, https://doi.org/10.1016/S0140-6736(20)32661-1 .

[9] G. D’Agostini and A. Esposito, Inferring vaccine efficacies and their uncertainties. A simple model implemented in JAGS/rjags, 19 November 2020.

[10] A. Fauci at NBC News, November 16, 2020, https://youtu.be/8CG8aI4XCGw?t=32 .

[11] International Organization for Standardization (ISO), Guide to the expression of uncertainty in measurement, Geneva, Switzerland, 1993.

[12] G. D’Agostini, The Waves and the Sigmas (To Say Nothing of the 750 GeV Mirage), arXiv:1609.01668 [physics.data-an].

[13] G. D’Agostini, Bayesian Reasoning in Data Analysis. A critical Introduction, World Scientific, 2003.

[14] M. Plummer, JAGS: A Program for Analysis of Bayesian Graphical Models Using Gibbs Sampling, Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20–22, Vienna, Austria. ISSN 1609-395X, http://mcmc-jags.sourceforge.net/ .

[15] See, e.g.,
J. Cohen, Science, ‘Absolutely remarkable’: No one who got Moderna’s vaccine in trial developed severe COVID-19, November 30, 2020.
[…] Moderna Applies for Emergency F.D.A. Approval for Its Coronavirus Vaccine, November 30, 2020.
Repubblica, Vaccino, Moderna chiede l’autorizzazione all’uso di emergenza alla Fda e all’Ema. L’azienda americana annuncia i risultati della fase 3: “Efficacia pari al 94,1% e fino al 100% nei casi gravi”, November 30, 2020.
E. Cohen, CNN, Moderna applies for FDA authorization for its Covid-19 vaccine, December 1, 2020, https://edition.cnn.com/2020/11/30/health/moderna-vaccine-fda-eua-application/index.html .

[16] G. D’Agostini and A. Esposito, Checking individuals and sampling populations with imperfect tests, arXiv:2009.04843 [q-bio.PE].

[17] R Core Team (2018), R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

[18] M. Plummer, rjags: Bayesian Graphical Models using MCMC. R package version 4-10, https://CRAN.R-project.org/package=rjags .

[19]

[20] T. Bayes and R. Price, An Essay towards solving a Problem in the Doctrine of Chances. By the late Rev. Mr. Bayes, communicated by Mr. Price, in a letter to John Canton, A.M.F.R.S., Philosophical Transactions of the Royal Society of London, 53: 370–418 (1763), https://doi.org/10.1098%2Frstl.1763.0053 .

[21] P.S. Laplace, Mémoire sur la probabilité des causes par les événements, Mémoire de l’Académie royale des Sciences de Paris (Savants étrangers), Tome VI, p. 621, 1774, https://gallica.bnf.fr/ark:/12148/bpt6k77596b/f32 .

[22] P.S. Laplace, Essai philosophique sur les probabilités, 1814, http://books.google.it/books?id=JrEWAAAAQAAJ . (English quotes from the classical translation by F.W. Truscott and F.L. Emory: https://bayes.wustl.edu/Manual/laplace A philosophical essay on probabilities.pdf .)

[23] World Health Organization, , have%20lasting%20health%20effects.

[24] LIGO-Virgo Cumulative Event/Candidate Rate Plot O1-O3, https://dcc.ligo.org/LIGO-G1901322/public .