Annual modulations from secular variations: not relaxing DAMA?
PPrepared for submission to JCAP
Annual modulations from secularvariations: not relaxing DAMA?
Andrea Messina a,b
Marco Nardecchia a,b
Stefano Piacentini a,b, a Sapienza, Università di Roma, Italy b INFN, Sezione di Roma, ItalyE-mail: [email protected], [email protected],[email protected]
Abstract.
In a recent paper [arXiv:2002.00459], Buttazzo et al. show how the annuallymodulated rate of the DAMA experiments can be possibly interpreted as an artefact dueto the interplay between a time-dependent background and the method to account forit. In this work, we compare this hypothesis against the sinusoidal dark matter signal asproposed by the DAMA collaboration. We produce in a Bayesian approach a quantitativecomparison of how much the experimental observations are in support of each hypothesis.Our conclusions are that the odds against the hypothesis of a time varying backgroundbeing responsible for the annual modulation are decreased by a Bayes factor larger than after considering the public available data of the DAMA/NaI and DAMA/LIBRA ex-periments. In this work we also elaborate on general aspects of the analysis procedure.Indeed, in order to optimise the background subtraction procedure, the DAMA collabora-tion only considers data-taking cycles with a duration of roughly one year. We argue thatany data-taking cycle is informative, and we propose a procedure to include this effect, aswell as the possibility to include a slowly varying component for the background. Keywords: dark matter detectors, dark matter experiments
ArXiv ePrint: Corresponding author. a r X i v : . [ h e p - e x ] M a r ontents JAGS and rjags
JAGS
MCMC simulation 20
B.1 Prior choices 21B.2 Posterior results 23
The results of the DAMA/NaI and DAMA/LIBRA experiments are interpreted by theDAMA collaboration as a strong evidence of the presence of Dark Matter (DM) particlesin the galactic halo [1–6]. This conclusions derive from the compatibility of the annuallymodulated rate with a sinusoidal signal, characterized by a phase and a period in agreementwith those expected from the interaction of DM particles.In a recent paper [7] it has been discussed the possibility that, due to the analysisprocedure followed by the DAMA collaboration, the observed annual modulation could bereproduced by a slowly varying time-dependent background. This possibility has been usedto interpret the modulation of residuals of the single-hit scintillation rate as a function oftime published by the DAMA collaboration [2, 4, 6].The argument proposed in ref. [7] goes as follows: since the residual rate is computedby subtracting the average total rate in every cycle of data-taking, if the background rate isnot constant over time this procedure can generate a time modulated rate. More precisely,in the hypothesis of the actual presence of a DM signal, the total rate r ( t ) is given by: r ( t ) = r ( t ) + A cos (cid:18) πtT − φ (cid:19) , (1.1)where r ( t ) is the time-dependent background, A is the amplitude of the oscillating DMsignal, T is the period of the oscillation, which is equal to 1 year, and φ is a phase such thatthe peak of the oscillation is around the nd June, as expected in DM galactic halo models.– 1 –f r is constant over time, then one can choose a time window of width ∆ equal to anymultiple of T and average the rate in that window to isolate the background contribution.Under such an hypothesis, the residuals obtained by subtracting window by window thisaverage are: S ( t ) ≡ r ( t ) − (cid:104) r ( t ) (cid:105) ∆ = A cos (cid:18) πtT − φ (cid:19) (1.2)and then the signal is isolated from the background. This is what has been done by theDAMA collaboration, with the time windows ∆ chosen to be of the order of 1 year.On the other hand, as it has been pointed out, if the background is slowly varyingover time, for example linearly, the choice of using time windows of the order of the pe-riod of the wanted signal can produce residuals with a modulation of the same period.The possibility of a time-varying background is supported by the fact that in undergrounddetectors, especially in the keV energy range, the features of the background are not com-pletely modeled. In particular in ref. [7] explanations due to out-of-equilibrium physical orinstrumental effects are considered. The authors of ref. [7] conclude that the available datacould not safely exclude the extreme possibility that the modulation in the residual rate isproduced by a slowly varying background only. Such a conclusion is based on the debat-able argument that the DAMA residuals could be fitted by an linearly growing backgroundwith a χ / d.o.f (cid:39) .In ref. [8] a Bayesian comparison between the cosine and the null hypothesis onthe DAMA (2-6) keVee energy window dataset has been performed, leading to a decisiveevidence against the null hypothesis. The aim of this paper is to perform a thoroughcomparison of the two hypotheses of a cosine modulation and a slowly varying background,and give a quantitative conclusion on their performance by studying the likelihood ratio,the model complexity, and the posterior probabilities. We highlight that the odds in favorof the cosine modulation hypothesis are decreased by a Bayes factor of the order of with respect to pure linearly varying background hypothesis, when considering the (2-6) keVee energy window of all the three experimental phases of DAMA. For the sake ofcompleteness, this conclusion stands out also considering only the likelihood ratio of thehypotheses. However, we second the proposal of ref. [7] to encourage the experimentalcollaboration to add a slowly varying background component in the fitting procedure. Anovel observation we point out in this work is that there is no conceptual obstruction touse data-taking cycles with generic time intervals (i.e. with a duration different than 1year).This paper is organized as follows: in Section 2 we briefly describe the experimentaldata considered in our study and we motivate the possibility to use generic time intervals.In Section 3 we explain the details of our analysis, performed using the Bayesian approach.In Section 4 we show our results and in particular we give a quantitative comparisonbetween the various models in terms of Bayesian factors, likelihood ratios and Ockham’sfactors. Finally, conclusions are given in Section 5. The DAMA and DAMA/LIBRA experiments use ultra-radiopure NaI(TI) scintillatingcrystals as active target, coupled with photomultipliers to measure the amount of de-posited energy. The single-hit scintillation events rate is used to look for a possible signalof DM interactions with matter over the large backgrounds due to natural radioactivity ofthe detector and surrounding environment. The event rate, expressed in in cpd/kg/keVee(where ’ee’ stands for electron equivalent), has only been published [2, 4, 6] after the sub-traction of its time-average over each cycle (of roughly one-year duration) following the– 2 –rocedure outlined in the previous section. Each cycle starts every year roughly aroundthe beginning of September.The residual rates are available in different energy windows: for the DAMA/NaI andthe DAMA/LIBRA Phase I experiments, the energy windows are (2-4), (2-5), and (2-6)keVee; for the DAMA/LIBRA Phase II the energy windows are (1-3), (1-6), and (2-6)keVee. The (2-6) keVee energy window is the one with the smallest uncertainties, and it iscommon to the three phases. For this reason, to give a quantitative and fair comparison ofthe different hypotheses in the three stages we decided to study only the residuals in the(2-6) KeVee energy window.In principle, our analysis could be easily extended to other energy bins, although webelieve our conclusions would not change significantly.
The algorithm used by the DAMA collaboration to extract the time-dependent residualrate from the data has a two-fold objective: account for any constant or slowly varyingcomponent while keeping the sinusoidal time structure of an hypothetical DM signal. Theserequirements put constraints on the optimal time intervals of the data-taking cycle.The variance of the sinusoidal signal is given by the quantity ( α − β ) , with the timeaverages α = (cid:104) cos π/T ( t − t ) (cid:105) and β = (cid:104) cos 2 π/T ( t − t ) (cid:105) . For ∆ = T , the quantity ( α − β ) is . (i.e. β = 0 ). In ref. [6] the value of 0.5 is taken as a figure of merit to asseswhether the detectors were operational evenly throughout the period of the modulation ofone year. To the best of our understanding a value of ( α − β ) much different than 0.5implies that either the average of the signal is not vanishing ( β (cid:54) = 0 ), or the data-takingperiod is not close to one year, or both.The DAMA collaboration used the argument that ( α − β ) (cid:54) = 0 . to remove the firstcycle of data-taking in table 1 of ref. [6] from the the analysis. t i ( days ) Δ i ( da ys ) β ( t i , Δ i ) - - - - Figure 1 . Left : Contour plot of the function β given in eq. (2.1); the points represent thedata-taking cycles given in table 1 of ref. [6]. In particular, the point number 1 corresponds to thefirst and discarded data-taking cycle. Right : Fit of a sinusoidal model (red) to the DAMA/LI-BRA Phase II residual rate in the (2-6) keVee energy window obtained taking into account thesubtraction-bias effect. The dotted red line is the corresponding sinusoidal modulation without thesubtraction. The grey are is the region discarded as explained in ref. [6]: since, assuming the sinu-soidal signal, the effect is perfectly quantifiable, in principle also this region could be used for theresiduals analysis. – 3 –he procedure to extract the residual rate might induce a subtraction of the signal ifthe time interval is not chosen carefully. In particular for a data-taking starting in t i andextending to t i + ∆ i , the average of the signal is a function of t i and ∆ i : β ( t i , ∆ i ) = 1∆ i (cid:90) t i +∆ i t i cos (cid:18) πT ( t (cid:48) − t ) (cid:19) dt (cid:48) . (2.1)The contour plot of β ( t i , ∆ i ) as a function of t i and ∆ i is given in fig. 1. Independentlyof t i , if ∆ i is equal to T , as expected, β ( t i , ∆ i = T ) = 0 . This is the approximate condi-tion used by the DAMA collaboration to include a data-taking cycles in the modulationanalysis. However, as it is shown in fig. 1, there are other combination of t i and ∆ i thatresult in β ( t i , ∆ i ) = 0 ; of particular relevance there is the one with t i and t i + ∆ i chosensymmetrically with respect to the time when the cosine is zero. Figure 1 also reports thevalues of β ( t i , ∆ i ) for the 7 data-taking cycles reported in ref. [6]; we note that for thefirst and the second cycle the relative contribution of the signal average is of the order of10%. In particular, we computed the effect of such a subtraction for the first and discardedcycle and represented it graphically in the plot on the right side of fig. 1. The plot showsthe expected behaviour in the discarded region (shaded area): the dotted line correspondsto a sinusoidal model with no subtraction (as generally used), and the continuous line isthe corrected sinusoidal model. The difference between the dotted and continuous linescorresponds to the value of β in the two considered cycles. Although this effect is generallysmall in the cycles selected by the DAMA collaboration, we suggest to account for it in thefitting procedure and use the entire dataset without being forced to drop any data-takingcycle, even if completely asymmetrical. It would be sufficient to compute the integral ineq. (2.1) and include it in eq. (1.2) potentially as a function of A , T , and t .Finally, we stress that if the background is not constant as a function of time it willhave an impact on the value of the average of the rate in the relative data-taking cycle.In particular the linear term of expansion of the background as a function of time willproduce a sawtooth-shape in the residual rate. The possibility to associate a probability value to any uncertain quantity is at the basis ofthe Bayesian approach to data analysis. This may apply to the outcome of a measurementbefore it is carried out, to a parameter of interest within a given model, as well as tothe model itself. Models and outcomes of measurements are then related by the rules ofprobability theory. We can compute, at least in principle, the probability of any specificpropositions given some state of information. This is particularly useful in the context ofmodel comparison: given two or more models, we can compute the probability that theactual experimental observations are in support of each of them, and then rank the modelsaccording to decreasing probability.In the following, we summarize how models can be compared in the Bayesian ap-proach, while in Appendix A we give all the details. Let’s suppose to have a dataset D = { x i } composed of n measurements x i , a set of models or hypotheses H i where eachhypothesis could in general depend on a vector of parameters (cid:126)θ i . Within a model, weassume to know the prior probability π ( H i ) , the likelihood function L ( H i , (cid:126)θ i ; D ) , and theprior probability density function (pdf) of all the model’s parameters π ( (cid:126)θ i | H i ) .The posterior odds ratio is given as: p ( H A | D ) p ( H B | D ) = BF AB × π ( H A ) π ( H B ) , with BF AB = L ( H A ; D ) L ( H B ; D ) ; (3.1)– 4 –here L ( H A ; D ) is the so called marginal likelihood defined as: L ( H A ; D ) = (cid:90) L ( (cid:126)θ A , H A ; D ) π ( (cid:126)θ A | H A ) d(cid:126)θ A . (3.2)The Bayes factor BF AB encapsulates all the new information associated to the measure-ments, although, for parametric models, it could critically depend on the choice of thepriors π ( (cid:126)θ i | H i ) . In any real situation, the integral of eq. (3.2) is not solvable analyticallyand can only be estimated numerically. Nevertheless, in the case where the informationcontent of the measurements is such that the priors can be considered sufficiently vaguein the range of the parameters space where the likelihood is sizable, we can consider theprior as a constant and use the Laplace approximation to estimate the integral. This is avery useful approximation to get sense of the different contributions to the Bayes factorbecause it can be factorised as: BF AB = LR AB × OF AB . (3.3)where we defined ( LR AB ) as the ratio of the likelihoods evaluated at their maximum, and OF AB , the so called Ockham’s factor, estimated by working out the integral in eq. (3.2)as shown in more details in Appendix A. In this factorization LR AB depends solely onthe measurements, while OF AB also on the priors. The Ockham’s factor is a measure ofthe complexity of the two models that depends on the number and the a-priori variabilityof the model’s parameters. The message behind this factorization is that even though amodel with more parameters can be more flexible and thus better fit the data producing anhigher likelihood, there is a price to pay for having a more complex model. In other words:even if the likelihood ratio pushes towards a more complex model which often better fitsthe data, on the other hand the Ockham’s factor penalises it.Finally, two other useful criteria to quantify the ability of a model to describe theobservations are the Bayesian Information Criterion (BIC) [9] and the Akaike InformationCriterion (AIC) [10], a review of which is given for example in ref.[11]. Given a dataset ofsize n , and a model A , with a number k A of parameters (cid:126)θ A whose value that maximise thelikelihood is ˆ (cid:126)θ A , the BIC is defined as: BIC A = k A ln ( n ) − (cid:16) L (ˆ (cid:126)θ A , H A ; D ) (cid:17) ; (3.4)and the AIC is defined as: AIC A = 2 k A − (cid:16) L (ˆ (cid:126)θ A , H A ; D ) (cid:17) ; (3.5)According to these criteria, the smaller the value of the BIC (AIC) is, the better thedescription of the observations.When comparing two models, the ∆ BIC AB and ∆ AIC AB can be defined as: ∆ BIC AB = BIC A − BIC B = ln ( n ) ( k A − k B ) − LR AB ) . (3.6) ∆ AIC AB = AIC A − AIC B = 2 ( k A − k B ) − LR AB ) . (3.7) Note that for normal likelihoods the LR AB can be expressed in terms of χ A ( B ) = (cid:80) i (cid:20) ( x i − µ A ( B ) ( x i )) σ i (cid:21) as: LR AB = exp (cid:104) − χ A − χ B (cid:105) , where µ A ( B ) ( x i ) represents the true value of the measurement x i in the model A ( B ) . This means that the LR AB of the two models evaluated at their best fit values is simply given by exp [ − ∆ ˆ χ AB / . In a frequentistic approach to model comparison, the ∆ ˆ χ AB is often taken as test-statisticand its probability distribution used to compute p-values. For nested models, where the more general modelcontains the simpler plus k additional parameters, the Wilks’ theorem says that pdf (∆ ˆ χ AB ) is itself a χ distribution with k degrees of freedom. For non-nested models one would have to estimate pdf (∆ ˆ χ AB ) ,possibly by sampling it with pseudo-experiments. – 5 –he previous equations show how the ∆ BIC AB and ∆ AIC AB are closely related to the LR AB , plus a penalty term that penalises the model which is more complex.For the interpretation of the odds ratio or Bayes factor we refer to the criterion basedon Jeffreys scale [12]: a value of > represents strong evidence in favor of model A, anda value of > represents decisive evidence. Similar evaluation can be done based on the ∆ BIC and the ∆ AIC . We decided to consider, in our analysis, three different models: • COS model: this is the pure-cosine model, where the only free parameter is theamplitude A of the modulation: S COS ( t ) = A cos (cid:18) πT ( t − t ) (cid:19) , (3.8)where the period T is assumed to be fixed to 1 year and t = 152 . d such that thepeak of the modulation is on the nd June. • SAW model: this is the pure-sawtooth model, where the only free parameter is theslope B of the linearly varying sawtooth: S SAW ( t ) = B ( t − t i ) with t i − ∆ i < t < t i + ∆ i , (3.9)where ∆ i is the width of the time window to which t belongs, and t i is the center ofthis time window. • C+S model: this is the cosine plus sawtooth model, where this time the free param-eters to be considered are two, A and B : S C + S ( t ) = S COS ( t ) + S SAW ( t ) , (3.10)where S COS and S SAW are defined in the eq. (3.8) and (3.9), respectively.What we would like to stress is that the models have been chosen in such a way thatthe COS and the SAW model have the same number of parameter, in order to minimisethe impact of the Ockham’s factor (and thus of the priors of the parameters) in the finalBayes factor. Indeed, we expect that, if the C+S model doesn’t give a strong improvementin the likelihoods with respect to the other two models, the Ockham’s factor will drive theBayes factor in favour of the simpler models.
JAGS and rjags
The computation of the posterior probabilities, even for models relatively simple as thosedescribed in the previous section, is often only possible by Monte Carlo integration. Themost common way to solve problems of this kind is by sampling the unnormalised poste-rior distribution by a Markov Chain Monte Carlo (MCMC). For our study we used thegeneral analysis framework R [14] and the MCMC algorithm called Gibbs Sampler as im-plemented in
JAGS [15] and interfaced with R in the package rjags [16]. The details of theimplementation are given in Appendix B.We assume, as it is done by the DAMA collaboration [1], that the measurements atdifferent times are independent and follow a normal likelihood in each time bin t i with– 6 –nown standard deviation σ i given by the experimental uncertainty. The total likelihoodis then given by: L ( { µ i } , { σ i } ; { D i } ) = n (cid:89) i = i √ πσ i exp (cid:20) − ( y i − µ i ) σ i (cid:21) , (3.11)where the true parameters { µ i } are given as a function of { t i } by implementing one of thethree models outlined above, for example for the COS model: µ i = A cos (cid:18) πT ( t i − t ) (cid:19) (3.12)with the only unknown parameter represented by the amplitude A . For each unknownparameter in the model we have to give a prior distribution probability. In the followingwe used flat priors for all parameters.The implementation in JAGS of the model outlined above and the discussion on theprior’s choice is described in Appendix B. The Monte Carlo simulation gives the unnor-malised posterior probability of the parameters of interest sampled using the Gibbs algo-rithm. The results reported in this work are obtained with a single Markov chain with · steps. Finally, to estimate the marginal likelihood and the Bayes factor we used the bridgesampling package [17]. This package uses the same Markov chain used to samplethe posterior probability for the integration of the marginal likelihood. In the following, toget a feeling of the contribution of the priors we also report the value of the maximum ofthe log-likelihood (best fit χ ); when comparing two models, we computed analytically thelikelihood ratio, and the Ockham’s factor as described in Appendix A using eq. (A.10). We used a toy simulation to qualify the performance of the model selection criteria outlinedabove on the specific problem discussed in this work. We generated a few cycles of pseudo-data with the same time structure of those of DAMA, and residual rates sampled fromindependent normal pdfs with true values given by a mixture of COS and SAW modelsand known σ . We choose a value of A for the COS model, and B for the SAW modelsimilar to the amplitude of the DAMA/LIBRA phase II residuals, and a σ comparablewith their uncertainties. We fitted these pseudo-datasets with the three hypotheses: COS,SAW and C+S, and checked the sensitivity of the model comparison criteria to indicatethe correct model used in the generation. The source code for the simulated analysis ispublicly available on GitHub [13].Figure 2 shows the pseudo-datasets with the underlying true model on the left-handside and the best fit for each of the three models on the right-hand side. Table 1 reportsfor the three pseudo-datasets the values of the model comparison criteria. We can see thatin all cases the underlying model is correctly selected for each of the criterion of modelcomparison. The largest scores occur when we compare COS and SAW models, but sinceC+S is an extension of both COS and SAW, in some cases that include C+S we may havea relatively small BF (of the order of 10), mostly due to OF contribution, namely thedifference in complexity between the two compared models.
As already said at the beginning of the Section 3, we decided to analyse the residualrate in the (2-6) keVee energy window. Since the precision of the measurements and thesize of the modulation are different in the various phases, as a first step we analysed the– 7 –OS generated data BF [dB] LR [dB] ∆ BIC ∆ AICCOS vs SAW . . − . − . COS vs C+S . − . − . − . SAW vs C+S − . − . . . C+S generated data BF [dB] LR [dB] ∆ BIC ∆ AICCOS vs SAW . . − . − . COS vs C+S − . − . .
30 7 . SAW vs C+S − . − . . . SAW generated data BF [dB] LR [dB] ∆ BIC ∆ AICCOS vs SAW − . − . . . COS vs C+S − . − . . . SAW vs C+S . . − . − . Table 1 . Comparison between the various models in the three pseudo-datasets in terms of Bayesfactor (BF), likelihood ratio (LR), and difference of Bayesian Information Criterion ( ∆ BIC) andAkaike Information Criterion ( ∆ AIC).
DAMA phase BF C,S [dB] LR C,S [dB] OF C,S [dB] ∆ BIC ∆ AICDAMA/NaI − . − . . .
34 8 . LIBRA Phase I . . . − . − . LIBRA Phase II . . . − . − . All . . . − . − . Table 2 . Bayes factor (BF), likelihood ratio (LR), Ockham’s factor (OF), Bayesian InformationCriterion (BIC) difference and Akaike Information Criterion (AIC) difference between the COSand the SAW model for the three stages of DAMA experiments. For the first three rows, the ∆ BICand the ∆ AIC values are equal because the two models have the same number of parameters. Thelast row refers to the comparison between COS and a SAW model with 3 different slopes as detailedin Section 4.3. data individually on each of the three datasets, testing only the two hypotheses of pure-cosine (COS) and pure-sawtooth (SAW) contribution. We tested on the most informativedataset also the third model (C+S), and then we performed the analysis on the wholedataset. Finally, we evaluated the performance of the cosine model promoting the periodand the phase to free parameters.
In table 2 we show the results of the comparison between the COS and the SAW modelsin terms of Bayes factor, likelihood ratio, Ockham’s factor, Bayesian Information Criteriondifference and Akaike Information Criterion difference in every experimental stage. Sincesome of these quantities can eventually be very large or very small, BF, LR and OF aregiven in decibels . This table also shows how the contribution of the Ockham’s factor,which is the component of the Bayes factor critically dependent on the choice of the priorsof the parameters, is marginal. As explicitly shown in Appendix B, for each of the threedatasets we chose a flat prior on B as well as on A . However, for all the three datasetswe tested different possible priors (uniform, normal and gamma distributions), but in allcases the Ockham’s factor contribution is always under control, namely it changes of at Given a quantity x , the decibel d is defined as d = 10 log ( x ) . – 8 – igure 2 . Left:
Simulated data (points) and true underlying model (red line) for the COS, C+S,and SAW models respectively.
Right: best fit for the COS (red solid line), SAW (green dottedline), and C+S (blue dashed line) models respectively. the most 4-5 units in decibels with respect to our final choice of the prior. The results forthe different stages can be summarized as follows: • For the DAMA/NaI dataset, the Bayes factor, which is − . in normal units,disfavours the pure-cosine model in favour of the pure-sawtooth one. However this isthe less informative dataset, namely it is the one with relative uncertainties on eachpoint larger than in the other phases. The fit is shown in fig. 3. • For the DAMA/LIBRA Phase I dataset, the Bayes factor is of the order of . ,and this means that already at this stage, where the relative uncertainties are smallerthan in the previous case, the cosine model starts to win on the pure-sawtooth model.The fit is shown in fig. 4. • For the DAMA/LIBRA Phase II dataset, the Bayes factor is . . This meansthat in the region in where the data give the maximum of the information, in thesense that we explained before, the pure-cosine model is greatly preferred to thepure-sawtooth model. In fig. 5 the result of the fit is presented.– 9 –AMA Fit to COS model Fit to SAW modelphase A [cpd/kg/keVee] χ / dof B [cpd/kg/keVee/yr] χ / dof DAMA/NaI . ± . . /
36 0 . ± . . / LIBRA I . ± . . /
49 0 . ± . . / LIBRA II . ± . . /
51 0 . ± . . / Table 3 . Results of the fit for the cosine amplitude A of the COS model and the sawtoothcoefficient B of the SAW model obtained on the DAMA residuals in the (2-6) keVee energy windowduring the different phases, together with the corresponding χ / d.o.f. The details of the three fits are described in table 3, in which all the reported valuesare obtained computing the expected value and the standard deviation on the posteriorprobability of each parameter.
Figure 3 . Fit to the DAMA/NaI residual rate in the (2-6) keVee energy window. The regionsin which the blue line is constant (and equal to zero) are the regions between two non-contiguousdata-taking cycles.
Figure 4 . Fit to the DAMA/LIBRA Phase I residual rate in the (2-6) keVee energy window.The regions in which the blue line is constant (and equal to zero) are the regions between twonon-contiguous data-taking cycles.
Since the third dataset is the most informative one, as already shown in fig. 5, we decided totest not only the COS and the SAW model, but also the C+S model, as defined in Section– 10 – igure 5 . Fit to the DAMA/LIBRA Phase II residual rate in the (2-6) keVee energy window.The regions in which the blue line is constant (and equal to zero) are the regions between twonon-contiguous data-taking cycles. B obtained in the C+S model is consistent withzero, B = ( − . ± . cpd/kg/keVee/yr, and the value of A is consistent with thevalue of A obtained in the COS model. Indeed, looking at the fig. 5, the green (S+C fit)and the red (COS fit) lines are very close: in fact the χ /dof of the C+S fit is very similarto that of the COS fit, or, in other words, the LR of the models is very close to one (itis − . = 0 . in normal units). Therefore the Bayes factor in the COS vs C+S case isdriven by the Ockham’s factor, and since the number of parameters is different in the twomodels, the COS model wins against the C+S model. The C+S model is an extension ofthe COS model but, even if it’s more complex, it doesn’t improve the goodness of fit, andthe price to pay for the greater complexity is reflected in the Bayes factor. Since in this casethe value of BF depends critically on the priors, we tried to use different possible priors forthe parameters as before, but in all cases the BF favoured the COS model. On the otherhand, for the SAW and C+S models, although the Ockham’s factor pushes in the directionof the simpler SAW model, the contribution of the LR drives the BF in favour of the C+Smodel. The final message of this analysis is that the most informative dataset availableto us can be better represented by models that contain a dominant cosine component, asthe COS and C+S models, with respect to the SAW model. In addition, the sawtoothcomponent of the C+S model, quantified by the B parameter, is compatible with zero. Since adding the sawtooth component to the cosine modulation in the most informativedataset didn’t produce any change with respect to the pure-cosine model, the next step ofour analysis was comparing the SAW and COS model on all the available data. In principlein the various phases the size of the sawtooth variation could be different, thus we allowedto have three different B parameters. The results of the global fit is shown in fig. 6 andthe details can be summarized as follows: • For the COS fit: A = (0 . ± . cpd/kg/keVee χ /dof = 116 . / (4.1)– 11 –it results A [cpd/kg/keVee] B [cpd/kg/keVee/yr] χ / dof C+S fit . ± . − . ± . . / Model comparison BF [dB] LR [dB] OF [dB] ∆ BIC ∆ AICCOS vs SAW . . . − . − . COS vs C+S . − . . − . − . SAW vs C+S − . − . . . . Table 4 . Top : Results of the fit of the cosine amplitude A and the sawtooth coefficient B of the C+S model obtained on the DAMA residuals in the (2-6) keVee energy window during theDAMA/LIBRA Phase II, together with the corresponding χ / d.o.f. Bottom : Comparison betweenthe various models in the DAMA/LIBRA Phase II dataset in terms of Bayes factor (BF), likelihoodratio (LR), Ockham’s factor (OF) and difference of Bayesian Information Criterion ( ∆ BIC) andAkaike Information Criterion ( ∆ AIC). For all these four metrics (BF, LR, ∆ BIC, ∆ AIC) theSAW model is largely disfavoured.
Figure 6 . Fit to the DAMA residual rate in all the three stages in the (2-6) keVee energy window. • For the SAW fit: B NaI = (0 . ± . cpd/kg/keVee/yr B LIBRAI = (0 . ± . cpd/kg/keVee/yr χ /dof = 145 . / B LIBRAII = (0 . ± . cpd/kg/keVee/yr (4.2)As a cross check, for the SAW model in each of the phases we obtain for the parameters B the same results of table 3. The comparison between COS and SAW model gives BF = 88 . dB ,LR = 64 . dB ,OF = 24 . dB , ∆ BIC = − . , ∆ AIC = − . . (4.3)In this case the Ockham’s factor is still quite marginal with respect to the LR, and thefinal Bayes factor, which is . in normal units, pushes again strongly towards the COSmodel. – 12 – .3.1 Extraction of phase and period of the cosine model Finally, we decided to fit all the available data with a pure-cosine model in which the phaseand the period of the modulation are not fixed, that is to say that t and T defined ineq. (3.8) are now treated as parameters. In this case, choosing uniform priors for the twonew parameters, we obtain: A = (0 . ± . cpd/kg/keVee t = (0 . ± . yr χ /dof = 112 . / T = (1 . ± . yr (4.4)These results are compatible with a period of 1 year and a t = 152 . d = 0 . yr . Thecomparison between cosine and SAW model this time gives BF = 62 . dB ,LR = 71 . dB ,OF = − . dB , ∆ BIC = − . , ∆ AIC = − . . (4.5)Therefore, again, even if now the SAW model enters the game with the same number ofparameters of the cosine model (in fact now the OF slightly favours the SAW model), theBF, which is . in normal units, still pushes strongly towards the cosine model. All our studies indicate that the time modulation of the DAMA residual rate in the (2 − keVee energy window cannot be described by an artefact due to the algorithm to subtracta slowly varying background.Nevertheless, we believe that such an algorithm could potentially have an impact onthe extraction of the parameter of interest of the signal. In particular the definition of thetime window used to average the rate can introduce the following problems in the residualrate: • a non zero contribution due to the possible presence of a signal as discussed in Sec. 2.1; • a sawtooth time modulation due to the presence of a slowly varying background; sucha modulation can enhance or reduce the amplitude of a sinusoidal signal, as well asaffect its phase.For this reason we suggest to use a model that takes these effects into account.Let’s consider a data-taking cycle that starts at t ∗ and extends for a period of time ∆ , the true value µ i of the observed rate in the time bin t i is: µ i = A cos (cid:18) πT ( t i − t ) (cid:19) − A ∆ (cid:90) t ∗ +∆ t ∗ cos (cid:18) πT ( t (cid:48) − t ) (cid:19) dt (cid:48) + B (cid:18) t i − ∆2 (cid:19) (4.6)where A is the amplitude, T the period, and (2 π t /T ) the phase of the sinusoidal signal,while B is the slope of a linearly varying background. For a different time dependenceof the background a linear model is however the first order contribution in a time powerseries.For the sake of completeness, we deployed such a model to fit the three phases ofthe experiment in the (2-6) keVee energy window. Like we did in Section 4.3, since in the– 13 –arious phases the size of the sawtooth variation could be different, we allowed to havethree different B parameters. We performed a first fit (F1) keeping both T and t fixedto the values expected for a DM signal, as well as a second one (F2) allowing them tovary. To better understand the differences with respect to the COS model fit, we decidedto show in the fig. 7 and fig. 8 the marginal posteriors pdfs of all the fit parameters for F1and F2, respectively. F1 F2A [cpd/kg/keVee] . ± . . ± . B1 [cpd/kg/keVee/yr] . ± . . ± . B2 [cpd/kg/keVee/yr] . ± . . ± . B3 [cpd/kg/keVee/yr] − . ± . − . ± . Table 5 . Results of the F1 and F2 fit in terms of bet-fit values of the parameters.
As shown in table 5, except for B which is positive, the other two are compatible with zeroin both cases. The value of A extracted in this way is slightly smaller with respect to theprevious cases because of the expected anti-correlation with the Bi parameters. The smallcorrelation between the Bi parameters, assumed to be a-priori independent, is an inducedeffect of their common anti-correlation with the A parameter. For the F2 fit, the valuesof t and T are consistent with those expected for a DM signal. Finally, we remark thateven though we included into the model the correction for a signal “double-counting” asdescribed in Section 2.1, this effect is not relevant with the current choice of time intervalsfor the experimental cycles. – 14 – igure 7 . Marginal posterior pdfs for the free parameters of the proposed model (Sec. 4.4) at fixed T and t for the fit of the whole (2-6) keVee DAMA dataset. The 4 plots on the diagonal of thefigure are the uni-dimensional pdfs of each single free parameter obtained by marginalising on theall the others. The 10 bi-dimensional pdfs in the bottom-left corner of the figure give the marginalpdfs of each pair of parameters obtained by marginalising on the other. The plots show also thecredible regions at . , . , . probability as black, green, and blue contours respectively. Thecorrelation coefficients are given in the upper-right corner of the figure. – 15 – igure 8 . Marginal posterior pdfs for the free parameters of the proposed model (Sec. 4.4)for the fit of the whole (2-6) keVee DAMA dataset. The 6 plots on the diagonal of the figureare the uni-dimensional pdfs of each single free parameter obtained by marginalising on the allthe others. The 15 bi-dimensional pdfs in the bottom-left corner of the figure give the marginalpdfs of each pair of parameters obtained by marginalising on the other. The plots show also thecredible regions at . , . , . probability as black, green, and blue contours respectively. Thecorrelation coefficients are given in the upper-right corner of the figure. – 16 – Conclusions
In this paper we elaborated on the proposal of ref. [7] to interpret the DAMA modulationas a possible effect of a slow varying time dependent background. We performed a Bayesiananalysis of the DAMA residual rate in the (2-6) keVee energy range with the aim of givinga quantitative comparison of the two possible hypotheses: the cosine modulation claimedby the DAMA collaboration and the sawtooth model proposed in ref. [7]. The source codeused in this work has been made publicly available on
GitHub [13].For the different models considered we estimated the posterior probabilities, andfor each pair of models we computed both the Bayes factor, factorised also in terms oflikelihood ratio and Ockham’s factor, clearly separating the contributions coming from thedata and the prior choice, the Bayesian Information Criterion and the Akaike InformationCriterion.The conclusion of this work can be summarised as follows. With an exception forthe DAMA/NaI dataset, which is however the less informative one, in all the comparisonswe performed in Section 4 the sawtooth hypothesis is always disfavoured with respect tothe cosine modulation hypothesis. This is quantifiable in the Bayes factor which is in allcases between and . To be more specific, in the case of using the entire dataset andcomparing the cosine with just the amplitude as a free parameter in the fit against thesawtooth with three independent and free to vary slope parameters for the three phasesof data-taking, we obtain a Bayes factor of . in favour of the cosine model. Theeffect of the priors choice has been extensively tested and can be reasonably quantifiedin a contribution at the most of the order to the total BF. Finally, we used the fullcosine model with free period and phase and still found a BF of order . For this case weobtained a value of the period and phase compatible with those reported by the DAMAcollaboration.Finally, we point out that the background subtraction algorithm used by DAMA canintroduce a bias in the parameters of the fitted signal, produced by a residual contributionfrom the expected signal as well as a contribution from a slowly varying background.Although the time intervals are chosen in a such a way that this bias is small, we showedthat these effects can be safely taken into account in the analysis as shown in Section 4.4. Acknowledgments
We thank Dario Buttazzo, Fabio Cappella, and Paolo Panci for useful discussions.– 17 –
Bayesian model comparison
In the following we shall suppose to have a dataset D = { x i } composed of n measurements x i , a set of models or hypotheses H i where each hypothesis could in general depend on avector of parameters (cid:126)θ i . Within a model, we assume to know the likelihood function: L ( H i , (cid:126)θ i ; D ) = p ( D | H i , (cid:126)θ i , I ) , (A.1)the expression on the right-hand-side of the previous equation is the probability densityfunction (pdf) to observe some specific data D given the model H i , its parameters (cid:126)θ i , andall the relevant information I before the measurement is carried out. The likelihood, whichis numerically identical to p ( D | H i , (cid:126)θ i , I ) , has to be intended as a function of the model andits parameters at fixed (observed) data. We shall assign a prior probability π ( H i | I ) to eachmodel, as well as a prior probability π ( (cid:126)θ i | H i , I ) to its parameters, both conditioned to thestate of information I . Priors are intended to encapsulate the probability that the model isvalid before data are seen. For the parameters, the priors define their range of variabilityby setting the probability to find them in any given interval of the allowed space. In thefollowing, all probabilities are conditional to the state of information I even if, to keep thenotation simple, we often don’t write it explicitly.Suppose we want to compare two different models H i and H j , which for the momentwe assume don’t depend on any parameter. To compare these models we can compute theratio of the posterior probabilities: O ij ≡ p ( H i | D, I ) p ( H j | D, I ) , (A.2)this ratio gives the odds in favour of one model over the other. To express the odds asa function of the likelihoods and priors, we shall apply the Bayes theorem; that is, forexample for the numerator of eq. (A.2), p ( H i | D, I ) = p ( D | H i , I ) π ( H i | I ) π ( D | I ) . (A.3)The denominator π ( D | I ) is the prior probability of data and it can be more easily inter-preted if expressed in terms of the likelihoods by imposing that (cid:80) i p ( H i | D, I ) = 1 . Todo this we need to have a complete class of hypotheses, and this is not often the case.However, if we are only interested in the odds ratio we can neglect this term as it cancelsout in the odds. The ratio between the probabilities of the two models becomes: O ij ≡ p ( D | H i , I ) p ( D | H j , I ) × π ( H i | I ) π ( H j | I ) . (A.4)The data dependent part is the so called “Bayes factor” (BF), and it is defined as: BF ij ≡ p ( D | H i , I ) p ( D | H j , I ) . (A.5)In the hypothesis in which, due to our a priori ignorance, all the model’s priors are equal,considering the odds ratio is equivalent to considering the Bayes factor.The case of a model H i ( (cid:126)θ i ) dependent on a vector of parameters (cid:126)θ i is more complicatedbecause we have to compute the marginal likelihood of the model, which corresponds tothe likelihood of the model averaged over all possible values of the parameters. For this– 18 –eason, it is often called average or marginalised likelihood. The marginal likelihood iscomputed by marginalising over the model’s parameters: p ( D | H i , I ) = (cid:90) p ( D, (cid:126)θ i | H i , I ) d(cid:126)θ i = (cid:90) p ( D | H i , (cid:126)θ i , I ) p ( (cid:126)θ i | H i , I ) d(cid:126)θ i , = (cid:90) L ( (cid:126)θ i , H i ; D ) π ( (cid:126)θ i | H i ) d(cid:126)θ i , (A.6)where we applied the multiplication law of probabilities, and, in the last equality, expressedthe integrand in terms of likelihood and prior probabilities of the parameters. In any realsituation, this integral is not solvable analytically and can only be estimated numerically.Nevertheless, in the case where the information content of our measurement is such thatthe priors can be considered sufficiently vague in the range of the parameters space wherethe likelihood is sizable, we can consider the prior as a constant and use the Laplaceapproximation to estimate the integral. This is a very useful approximation to get senseof the different contributions to the marginal likelihood. For simplicity, let’s assume themodel depends just on a single parameter θ i , the best fit to the data occurs for θ = ˆ θ (inthe following we will use ˆ x to refer to the best fit value of the quantity x ), the likelihoodis reasonably normal, and the prior is flat over a range ∆ θ i . Under this assumptions theintegral can be approximated as: p ( D | H i , I ) = (cid:90) L ( θ i , H i ; D ) π ( θ i | H i ) dθ i , (cid:39) θ i √ π σ ( ˆ θ i ) L ( ˆ θ i , H i ; D ) . (A.7)The last expression makes it clear that the marginal likelihood corresponds to the likelihoodevaluated at the best fit value of θ i weighted by a volume factor corresponding to theparameter’s uncertainty. In this oversimplified example, the proportional factor betweenthe global and the best fit likelihood is given by the ratio of two quantities, namely ∆ θ i and σ ( θ i ) , which control respectively the possible range of θ i before and after the data are seen.This factor is often referred to as the Ockham’s factor as it naturally penalises a model bya quantity proportional to the necessary complexity of the model to get a good fit withrespect to the a priori model complexity; the ideal case where the model is a good one andthe a priori variability of the parameter is just enough to fit the data, the Ockham’s factoris of order one. The Ockham’s factor enters in the BF through the marginal likelihood, andthus when we compare two models we automatically account for the potential differencein their complexity.In order to visualise that, let’s make a useful example. Let’s consider two model: • Model A : this models has only one parameter, let’s call it θ A , and the prior over thisparameter is a uniform distribution in the range [ θ A , θ A ] of width ∆ θ A , and thus thepdf is: p ( θ A | H A , I ) = 1∆ θ A (A.8) • Model B : this models has two parameters, θ B and θ B , that are uncorrelated andthe prior over this parameters is a uniform distribution of width ∆ θ B and ∆ θ B ,respectively: p ( θ B , θ B | H B , I ) = p ( θ B | H B , I ) p ( θ B | H B , I ) = 1∆ θ B θ B (A.9)– 19 –ssuming normal likelihoods, and using the eq. (A.7), the BF can be estimated as: BF AB (cid:39) p ( D | ˆ θ A , H A , I ) p ( D | ˆ θ B , ˆ θ B , H B , I ) × √ πσ (ˆ θ A )∆ θ A (cid:30) (cid:32) (2 π ) σ (ˆ θ B ) σ (ˆ θ B )∆ θ B ∆ θ B (cid:33) = LR AB × OF AB , (A.10)where we defined the likelihood ratio ( LR AB ) and the Ockham’s factor ( OF AB ) as LR AB = p ( D | ˆ θ A , H A , I ) p ( D | ˆ θ B , ˆ θ B , H j , I ) ,OF AB = √ πσ (ˆ θ A )∆ θ A (cid:30) (cid:32) (2 π ) σ (ˆ θ B ) σ (ˆ θ B )∆ θ B ∆ θ B (cid:33) (A.11)The message behind eq. (A.10) is that even though a model with more parameterscan be more flexible and thus better fit the data producing an higher likelihood, one hasto pay the price of having a more complex model. In other words: even if the likelihoodratio pushes towards a more complex model which often better fits the data, on the otherhand the Ockham’s factor penalises it.It is important to stress that for parametric models the Bayes factor critically dependson the choice of priors, that, in this case, is represented by the choice of the widths ∆ θ i .For models with the same number of parameters, especially if the parameters have thesame physical meaning, the Ockham’s factor should be of order one, and the BF shouldfairly be insensitive to priors’ choice.For models with a different number of parameters one has to pay particular attentionto avoid introducing biases due to an ‘unreasonable choice of priors’. For this reason, inthe following we will start by comparing models with just one parameter and then moveto more complex and realistic scenarios exploring the sensitivity of our conclusions to theprior’s choice. B Details of the
JAGS
MCMC simulation
The full hierarchical probability model described in Section (3.2) is implemented in
JAGS .To define the model one uses the
BUGS language [18–20], which, for example for the COSmodel, looks as follows: model {for ( i in 1: n ) {y [ i ] ~ dnorm ( mu [ i ] , 1. / ( sd [ i ]^2))mu [ i ] <- A * cos (2. * pi * ( t [ i ] - t0 ) / T )} where the for loop is over the entries of the data n -tuple, dnorm() stands for the normaldistribution density function whose parameters are the true amplitudes mu[i] as given bythe functional form of the model, and the standard deviations sd[i] given as τ i = 1 /σ i .The last line defines the prior pdf for the parameter of interest A , which, in this case isgiven by a uniform distribution in the range [0 , maxA ] . This model will be fit to time seriesdata described by an n -tuple { D i } = { t i , y i , σ i } of length n , containing for each time bin t i the measurement y i and the symmetric uncertainty σ i associated to it.– 20 – igure 9 . Posterior pdf (black lines) and priors pdf (red lines) for the free parameters A of theCOS model (left) and B of the SAW model (right) for the fit of the DAMA/LIBRA Phase II data. Figure 10 . Likelihood of the DAMA/LIBRA Phase II data for the COS model (left) and theSAW model (right).
B.1 Prior choices
The priors used in our analysis are the following: • for all the cases in which there’s a cosine modulation the amplitude A has a uniformprior: π ( A ) = (cid:40) A if ≤ A ≤ ∆ A otherwise (B.1)with ∆ A = 0 . cpd/kg/keV ee . One could in general choose a wider prior, but withthe upper limit set by the results of the other DM experiments. • for all the cases in which there’s a sawtooth contribution, the slope B has a uniform– 21 – igure 11 . Posterior pdf (black lines) and priors pdf (red lines) for the free parameter A of theCOS model for the fit of the DAMA/LIBRA Phase II data: the priors in these cases are gammadistributions with a shape α = 1 and a scale θ = 100 (left) or θ = 5000 (right). As expected, theposterior is quite different from the flat prior case of fig. 9 only in the most extreme case (right). prior: π ( B ) = (cid:40) B if − ∆ B ≤ B ≤ ∆ B otherwise (B.2)with ∆ B = 0 . cpd/kg/keV ee/yr . In this case we allowed B to have negative values inorder to take into account a possible slowly decreasing sawtooth. As in the previouscase, one could in general choose a wider prior, but with the upper limit on thebackground set by the knowledge of the DAMA experimental apparatus. • for the case in which we let the period and the phase of the cosine free to vary, thetwo parameters T and t have a uniform prior: π ( T ) = (cid:40) T if ˜ T − ∆ T ≤ T ≤ ˜ T + ∆ T otherwiseπ ( t ) = (cid:40) t if ˜ t − ∆ t ≤ t ≤ ˜ t + ∆ t otherwise (B.3)where: ˜ T = 1 yr ˜ t = 152 . d = 0 . yr ∆ T = ˜ T / . yr ∆ t = ˜ t / . yr (B.4)In this case, since we would like to study the effect of letting these two parametersfree to vary, we chose quite large priors around the expected values for a DM signal.Since in all cases, the likelihoods are quite narrow with respect to the priors wechoose, we were able to easily estimate the OF using the Laplace approximation as shownin Appendix A. – 22 –n all this cases these priors could be different in principle, and we checked, by chang-ing both the width of the priors and their functional form, that the conclusions of our anal-ysis wouldn’t change with other reasonable choices. As an example, in fig. 9 we show theposterior and the prior for the parameters of the COS and SAW models in the DAMA/LI-BRA Phase II experiment; for the sake of completeness, in fig. 10 we show the likelihoodsof the two models in the same dataset. In fig. 11, instead, we show, always as an example,how the posterior of the COS parameter in the DAMA/LIBRA Phase II dataset wouldchange with two different choices of priors (a gamma distrubution in both cases): in onecase, the prior is significantly different from zero in the region where the likelihood has themaximum, and thus it does not affect significantly the posterior pdf which is comparablewith that of fig. 9 (fig. 11: left-hand side); in the other case, the prior pdf is so muchconcentrated in zero that the posterior changes significantly (fig. 11: right-hand side). B.2 Posterior results
The results of our fits, given in terms of posterior pdfs, are showed in fig. 12, 13, 14, 15, 16,17, 18. In all cases, the MCMC chains reach equilibrium after few steps showing a stabletrace. The numerical error on the results sampled with O (10 ) steps is negligible. As onecan see from all the figures listed here, all the posteriors are reasonably normal, as couldbe expected from the assumption of normal likelihoods and uniform priors. In particular: • fig. 15 shows the expected anti-correlation between A and B of the C+S model and,in addition, indicates that B is consistent with zero. • fig. 17 shows, as expected, that the three Bi slopes of the sawtooth for the SAWmodel are not correlated when fitted on the three stages of the experiment. • fig. 18 shows the posterior distributions of the parameters A , T and t of the cosinemodel when we let the period and the phase free to vary. As expected T and t arestrongly anti-correlated, while we don’t see any remarkable correlation between theother parameters.Finally, in fig. 8 we show the result of a complete fit on all the (2-6) keVee dataset,obtained using a C+S model with three different slopes in the three different stages andtreating the period and the phase of the cosine modulation as free parameters, and, infig. 7, we display the result of the same fit, but fixing the period and the phase of thecosine modulation to their expected values ( T = 1 yr and t = 152 . d ).– 23 – igure 12 . Posterior pdfs for the free parameters of the two models for the fit of the DAMA/NaIdata. On the left hand side, the parameter B for the SAW model, and, on the right hand side, theparameter A for the COS model. Figure 13 . Posterior pdfs for the free parameters of the two models for the fit of the DAMA/LI-BRA I data. On the left hand side, the parameter B for the SAW model, and, on the right handside, the parameter A for the COS model. – 24 – igure 14 . Posterior pdfs for the free parameters of the two models for the fit of the DAMA/LI-BRA II data. On the left hand side, the parameter B for the SAW model, and, on the right handside, the parameter A for the COS model. – 25 – igure 15 . Posterior pdfs for the free parameters of the C+S model for the fit of the DAMA/LI-BRA II data. The figure shows in the lower-left corner the combined pdf reporting also the credibleregions at . , . , . probability as black, green, and blue contours respectively. The upper-leftand lower-right plots are the marginal pdfs for the parameter A and B respectively. The correlationcoefficient is given in the upper-right corner of the figure. – 26 – igure 16 . Posterior pdf for the free parameter of the COS model for the fit of the whole (2-6)keVee DAMA dataset. – 27 – igure 17 . Marginal posterior pdfs for the free parameters of the SAW model with three differentslopes for the fit of the whole (2-6) keVee DAMA dataset. The 3 plots on the diagonal of thefigure are the uni-dimensional pdfs of each single free parameter obtained by marginalising on theall the others. The 3 bi-dimensional pdfs in the bottom-left corner of the figure give the marginalpdfs of each pair of parameters obtained by marginalising on the other. The plots show also thecredible regions at . , . , . probability as black, green, and blue contours respectively. Thecorrelation coefficients are given in the upper-right corner of the figure. – 28 – igure 18 . Marginal posterior pdfs for the free parameters of the cosine model for the fit ofthe whole (2-6) keVee DAMA dataset. The 3 plots on the diagonal of the figure are the uni-dimensional pdfs of each single free parameter obtained by marginalising on the all the others.The 3 bi-dimensional pdfs in the bottom-left corner of the figure give the marginal pdfs of eachpair of parameters obtained by marginalising on the other. The plots show also the credible re-gions at . , . , . probability as black, green, and blue contours respectively. The correlationcoefficients are given in the upper-right corner of the figure. – 29 – eferences [1] R. Bernabei et al., Riv. Nuovo Cim. , 1-73 (2003), and refs. therein,[arXiv:astro-ph/0307403].[2] R. Bernabei et al., Phys. Lett. B480
D13 , 2127 (2005), [arXiv:astro-ph/0501412].[4] R. Bernabei et al., Phys. J.
C56 , 333 (2008), arXiv:0804.2741 [astro-ph].[5] R. Bernabei et al., Eur. Phys. J.
C73 , 2648 (2013), arXiv:1308.5109 [astro-ph.GA].[6] R. Bernabei et al., Nucl. Phys. At. Energy (2018) 307, arXiv:1805.10486 [hep-ex].[7] D. Buttazzo, P. Panci, N. Rossi and A. Strumia, arXiv:2002.00459 [hep-ex].[8] A. Krishak, A. Dantuluri and S. Desai, JCAP (2020) no.02, 007, arXiv:1906.05726[astro-ph.CO].[9] Akaike H., IEEE T. Automat. Contr. (1974) 716.[10] Schwarz G., Ann. Statist. (1978) 461.[11] A. R. Liddle, Mon. Not. Roy. Astron. Soc. (2007) L74, astro-ph/0701113.[12] H. Jeffreys, Theory of probability, 3rd ed, Oxford Classics series (reprinted 1998) (OxfordUniversity Press, Oxford, UK, 1961).[13] GitHub repository for the source code of this work https://github.com/piacent/bayes_analysis [14] R Core Team (2018), R Foundation for Statistical Computing, Vienna, Austria. , R Core Team (2018).[15] M. Plummer, Proceedings of the 3rd International Workshop on Distributed StatisticalComputing (DSC 2003), March 20-22, Vienna, Austria. ISSN 1609-395X, http://mcmc-jags.sourceforge.net/ .[16] M. Plummer, R package version 4-10, https://cran.r-project.org/web/packages/rjags/index.html .[17] Q. F. Gronau et al. , https://github.com/quentingronau/bridgesampling .[18] W.R. Gilks, A. Thomas , D.J. Spiegelhalter, The Statistician , 169-178 (1994).[19] D. Lunn, D. Spiegelhalter, A. Thomas, N. Best, Statistics in Medicine, (2009) 3049-67.[20] The BUGS Project, ..