BlurRing
Lydia Brenner and Carsten Burgard
September 4, 2018
Abstract
A code package, BlurRing, is developed as a method to allow for multi-dimensional likelihood visualisation. From the BlurRing visualisation, additional information about the likelihood can be extracted. The spread in any direction of the overlaid likelihood curves gives information about the uncertainty on the confidence intervals presented in the two-dimensional likelihood plots.
Introduction
The predictive power of the Standard Model (SM) of particle physics has, so far, been confirmed with every measurement at the Large Hadron Collider (LHC). With the discovery of the Higgs boson in 2012 by the ATLAS and CMS collaborations [3, 4], the SM is able to map theoretical predictions to experimental measurements with a level of precision that has not been seen before. Due to the discovery of the Higgs boson, new ways to look at the data coming from the LHC have become of high interest to the experiments. However, many tools are not equipped to handle these new methods. One of the main complications with the new type of measurements performed at the LHC comes from the inability to represent multi-dimensional likelihoods. While previous measurements, and in particular searches for new particles, had one or two parameters of interest, current measurements take more than two parameters into account.

While it is perfectly possible to represent likelihoods in one or two dimensions without approximating the models, it is not possible to represent likelihoods with higher dimensionality in a comprehensive way on a two-dimensional page. The current method for multi-dimensional likelihoods profiles the remaining parameters [1]. This method fits all parameters simultaneously and uses the best fitted values for the remaining, i.e. not plotted, parameters. The profiling method incorporates the remaining parameters into the likelihood, rather than fixing them to their expected values. However, it does not show how the likelihood changes due to possible changes in the values of the other parameters. Furthermore, the profiled likelihood curve in two dimensions does not necessarily map to a continuous path in the full parameter space. To incorporate this missing information in plotting higher-dimensional likelihoods, the BlurRing method is developed.
The BlurRing method is explained in detail below; the full package is available through tinyurl.com/BlrRng and is fully implemented in RooFit [5].

Method
The BlurRing method is based on a random sampling of the parameter space of the remaining parameters. The method is introduced through the use of a three-dimensional example. The example model is built by adding a Gaussian signal to a uniform background. The parameters of interest are the signal strength, n, the central value of the Gaussian, m, and the width of the Gaussian, w. The full model is

n · Gauss(v, m, w) + (1 − n) · Unif(v),

where v is the fitted variable, n ∈ [0, 1], m ∈ [−…, 5] with a nominal value of 0, and w ∈ [0.…, 4] with a nominal value of 4. This model is used to generate 100 or 1000 events. The distribution of events in the model is shown in figure 1.

Figure 2 shows the likelihood curves for a two-dimensional slice of the parameter space;

L(n, w) = L(n, w, m_nom) / L(n̂, ŵ, m̂).

In this case the third parameter is set to its nominal value. Although slices can be made in all three planes, the overall structure of the likelihood in three dimensions is not clear from the three possible slices. The remaining slices can be found in Appendix A. For a Gaussian likelihood, the likelihood has a well-known elliptical shape. At the minimum of any likelihood, a Gaussian approximation can be constructed from the covariance matrix estimate. The Gaussian form of the likelihood, created from the shape of the likelihood at the minimum of the fit, is called the Hessian approximation;

L_Hessian ∝ exp( (1/2) Σ_ij x_i · H_ij · x_j ),

where x is the vector of the sampled model parameters, measured relative to the minimum x̂, and H_ij is the Hessian matrix

H_ij = ∂² ln L(x) / (∂x_i ∂x_j) |_{x = x̂}.

The Hessian approximation of the likelihood for the example in this paper is shown in the same figure, and the elliptical shape is easily recognisable.
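The example model and its unbinned negative log-likelihood can be sketched as follows. This is an illustrative stand-in, not the paper's RooFit implementation: the observable range for v, the random seed, and the parameter values used below are assumptions chosen for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

V_LO, V_HI = -5.0, 5.0  # assumed observable range for the fitted variable v

def model_pdf(v, n, m, w):
    """Mixture n*Gauss(v; m, w) + (1 - n)*Unif(v), normalised on [V_LO, V_HI]."""
    gauss = np.exp(-0.5 * ((v - m) / w) ** 2) / (w * np.sqrt(2.0 * np.pi))
    unif = 1.0 / (V_HI - V_LO)
    return n * gauss + (1.0 - n) * unif

def generate_toys(n_events, n, m, w):
    """Draw toy events: each event is signal with probability n, else background."""
    is_signal = rng.random(n_events) < n
    sig = rng.normal(m, w, size=n_events)
    bkg = rng.uniform(V_LO, V_HI, size=n_events)
    return np.where(is_signal, sig, bkg)

def nll(data, n, m, w):
    """Unbinned negative log-likelihood of a toy data set."""
    return -np.sum(np.log(model_pdf(data, n, m, w)))

# Generate a 100-event toy data set, as in the example of this paper.
data = generate_toys(100, n=0.5, m=0.0, w=1.0)
```

Evaluating `nll` on a grid of (n, w) points at fixed m produces the two-dimensional slices shown in figure 2.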
Sampling
Instead of fixing the third parameter to the nominal value or best fitted value, the BlurRing method uses a random sampling method for the remaining parameters;

L(x | a, b),

where x is the vector of the sampled model parameters and a and b are the scanned model parameters. In the example of this paper, there is only one remaining parameter, which is sampled randomly, but the method is independent of the number of sampled parameters.

Figure 1: Example model with uniform background and Gaussian signal for 100 events (left) and 1000 events (right).

Figure 2: The Negative Log Likelihood (nll) contours for w vs n (left) and its Hessian approximation (right) for a model with 100 events.

In principle, the samples can be chosen arbitrarily. However, in order for the resulting representation to be indicative of the true shape of the likelihood, the likelihood itself is normalised and then interpreted as a probability density function to draw the sample values from. Samples are thus drawn from the Bayesian posterior density function with a flat prior. This way, not only are the shapes of the contours reflective of the likelihood, but the distribution of samples itself also provides a visual indication of the structure of the likelihood.

The technical implementation accompanying this paper uses a simple rejection-sampling method, which is efficient enough for the example provided here. Samples are drawn uniformly from the parameter space restricted to the Z-sigma hyper-ellipse to avoid outliers. Samples are accepted if a randomly chosen value is smaller than the likelihood value at this point. For each accepted sample, the likelihood is evaluated on a grid of points, from which the contours are then extracted.
For some Zσ ellipse U around the minimum x̂, the samples are chosen uniformly as x ∼ Unif(U) and the comparison value is chosen as y ∼ Unif([0, 1]), such that

P_accept(x) = P( y < L(x)/L(x̂) ) = L(x)/L(x̂),

and hence the accepted points are distributed as x ∼ L(x)/L(x̂), where x is the vector of the sampled model parameters.

A second implementation, also accompanying this publication, uses a multivariate Gaussian as a Hessian approximation of the likelihood and draws samples from this approximation. This is possible using standard random number generators without rejecting any points, simply using

x ∼ L_Hessian(x).

A third method is also provided in the package. Employing the Gibbs-sampling [2] method, Markov Chain Monte Carlo (MCMC) sampling is used to more efficiently cover the parameter space with samples. For each individual component, the rejection sampler from the first implementation is used. Since the example discussed in this paper considers a single sampled parameter, Gibbs sampling is identical to rejection sampling for this case and is therefore not shown separately.

In cases of few parameters and very complicated shapes, rejection sampling is the safest and most accurate, but also slowest, option. For higher-dimensional problems, Gibbs sampling should be used, where the full likelihood structure is still preserved, but the independence of the individual samples is not guaranteed. For problems with an extremely large parameter space, the Hessian sampling provides a very efficient method at the cost of some of the elegance of the BlurRing method.
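The first two sampling strategies can be sketched as below. This is a minimal illustration, not the package's code: the one-parameter likelihood-ratio function, its range, and the Hessian mean and covariance used here are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def likelihood_ratio(x):
    """Hypothetical L(x)/L(x_hat) for one remaining parameter, with x_hat = 1.
    The x-dependent width makes the shape mildly asymmetric (non-Gaussian)."""
    return np.exp(-0.5 * ((x - 1.0) / (0.5 + 0.2 * x)) ** 2)

def rejection_sample(n_samples, lo, hi):
    """Accept/reject: draw x uniformly on [lo, hi] and y uniformly on [0, 1];
    accept x when y < L(x)/L(x_hat), so accepted points follow the likelihood."""
    out = []
    while len(out) < n_samples:
        x = rng.uniform(lo, hi)
        if rng.uniform(0.0, 1.0) < likelihood_ratio(x):
            out.append(x)
    return np.array(out)

samples = rejection_sample(2000, 0.0, 3.0)

# Hessian-approximation alternative: draw directly from a multivariate
# Gaussian (mean and covariance here are illustrative, not fitted values).
hess_samples = rng.multivariate_normal(
    mean=[1.0, 0.0], cov=[[0.04, 0.01], [0.01, 0.09]], size=2000)
```

The rejection sampler preserves the true (asymmetric) shape at the cost of rejected draws; the Gaussian draw accepts every point but only reproduces the elliptical approximation.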
Visualisation
By drawing the likelihood contours for randomly sampled values of the remaining parameters, the effect of varying the profiled parameters becomes more clear. For the Hessian approximation case, where an elliptical shape is expected, each slice of two parameters clearly shows the expected shape, as can be seen in figure 3. The dotted lines give the one and two sigma contours of the likelihood where the remaining parameters are fixed to their nominal values. Figure 3 also shows the one and two sigma likelihood contours for the full likelihood, where the non-Gaussian shape is very clear. In figure 4 the likelihood contours are normalised to the conditional minimum at the sampled values, to give information purely on the distortion of the shape of the likelihood contour. Additional figures can be found in Appendix A.

Figure 3: The Negative Log Likelihood (nll) contours with BlurRing for w vs n (left) and its Hessian approximation (right) for a model with 100 events.

Figure 4: The Negative Log Likelihood (nll) contours normalised with BlurRing for w vs n (left) and its Hessian approximation (right) for a model with 100 events.

Not only does this method represent the full likelihood in a more comprehensive way, the spread in any direction also gives information related to the uncertainty on the confidence interval given on any single parameter or pair of parameters. A large spread in the likelihood curves in the BlurRing plots indicates that a small change in the value of the not-plotted parameters can have a large effect on the confidence interval determined from the likelihood curve. Unlike scans where the remaining parameters are fixed to their nominal values, the BlurRing method does not use a simplified model.
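The relation between the spread of the overlaid curves and the stability of a confidence interval can be made concrete numerically. The sketch below uses an assumed quadratic Delta-NLL in n whose minimum and curvature depend on the remaining parameter m; the functional forms and coefficients are hypothetical, chosen only to illustrate how the 1-sigma interval varies across sampled m values.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

def delta_nll(n, m):
    """Hypothetical Delta-NLL in n; best fit and curvature depend on m."""
    n0 = 0.4 + 0.05 * m           # best-fit n shifts with m
    sigma = 0.10 + 0.02 * abs(m)  # interval width changes with m
    return 0.5 * ((n - n0) / sigma) ** 2

def one_sigma_interval(m, grid=np.linspace(0.0, 1.0, 2001)):
    """1-sigma interval of n: the grid values where Delta-NLL <= 0.5."""
    inside = grid[delta_nll(grid, m) <= 0.5]
    return inside.min(), inside.max()

# Stand-in for values of m sampled from the likelihood, as in BlurRing.
sampled_m = rng.normal(0.0, 1.0, size=20)
intervals = [one_sigma_interval(m) for m in sampled_m]
lows, highs = zip(*intervals)
print(f"spread of lower interval edges: {max(lows) - min(lows):.3f}")
```

A large spread of the interval edges across the sampled m values signals, exactly as in the overlaid BlurRing contours, that the quoted confidence interval is sensitive to the not-plotted parameter.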
While profiled likelihoods extracted from simultaneous fits only show the true likelihood curves near the minimum of the fit, incorporating the BlurRing method allows the likelihood curves to be correctly presented throughout the full parameter space. New information can be obtained from the plots about the stability of the models and the correctness of the likelihood curves.

Conclusions
The BlurRing method allows for multi-dimensional likelihood visualisation from which additional information about the likelihood can be extracted. The spread in any direction of the likelihood curves gives information about the uncertainty on the confidence intervals presented in the two-dimensional likelihood plots.

Acknowledgments
We would like to thank Gottfried Herold and Dimitri Scheftelowitsch for providing useful hints on sampling implementations. We would also like to thank Glen Cowan and Wouter Verkerke for interesting discussions. Finally, we would like to thank the DESY and Nikhef institutes for giving us the opportunity to pursue this endeavour.
References

[1] G. Cowan, Statistical Data Analysis, Oxford Science Publications, Clarendon Press, 1998.

[2] S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6 (1984), pp. 721–741.

[3] The ATLAS Collaboration, Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC, Phys. Lett. B716 (2012), pp. 1–29.

[4] The CMS Collaboration, Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC, Phys. Lett. B716 (2012), pp. 30–61.

[5] W. Verkerke and D. Kirkby, The RooFit toolkit for data modeling, (2003).
Appendix: Additional figures

Figure 5: Negative Log Likelihood (nll) contours for m vs w (left), m vs n (middle) and w vs n (right) for a model with 100 events.

Figure 6: Hessian approximation of Negative Log Likelihood (nll) contours for m vs w (left), m vs n (middle) and w vs n (right) for a model with 100 events.

Figure 7: Hessian approximation of Negative Log Likelihood (nll) contours with BlurRing for m vs w (left), m vs n (right) and w vs n (bottom) for a model with 100 events.

Figure 8: Hessian approximation of Negative Log Likelihood (nll) contours normalised with BlurRing for m vs w (left), m vs n (right) and w vs n (bottom) for a model with 100 events.

Figure 9: Negative Log Likelihood (nll) contours with BlurRing for m vs w (left), m vs n (right) and w vs n (bottom) for a model with 100 events.

Figure 10: Negative Log Likelihood (nll) contours normalised with BlurRing for m vs w (left), m vs n (right) and w vs n (bottom) for a model with 100 events.