HistFitter software framework for statistical data analysis
M. Baak, G.J. Besjes, D. Cote, A. Koutsman, J. Lorenz, D. Short
October 7, 2014    http://cern.ch/histfitter
M. Baak (a), G.J. Besjes (b,c), D. Côté (d), A. Koutsman (e), J. Lorenz (f,g) and D. Short (h)
(a) CERN, Geneva, Switzerland
(b) Experimental and Theoretical High Energy Physics, IMAPP, Faculty of Science, Radboud University Nijmegen, The Netherlands
(c) Nikhef, Amsterdam, The Netherlands
(d) University of Texas, Arlington, USA
(e) TRIUMF, Vancouver, Canada
(f) Ludwig-Maximilians-Universität München, München, Germany
(g) Excellence Cluster Universe, Garching, Germany
(h) University of Oxford, Oxford, UK
Abstract
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple data models at once, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication-quality style through a simple command-line interface.
Introduction
This paper describes a software framework for statistical data analysis, called "HistFitter", that has been used extensively by the ATLAS Collaboration [1] to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider (LHC) at CERN. Most notably, HistFitter has become a de facto standard in searches for supersymmetric particles since 2012, see for example [2-19], with some usage for Exotic [20, 21] and Higgs boson [22] physics. HistFitter is written in Python and C++, the former being used for configuration and the latter for CPU-intensive calculations. Internally, HistFitter uses the software packages HistFactory [23] and RooStats [24], which are based on RooFit [25] and ROOT [26, 27], to construct parametric models and perform statistical tests of the data. HistFitter extends these tools in four key areas:
1. Programmable framework:
HistFitter performs complete statistical analyses of pre-formatted input data samples, from a single user-defined configuration file, by putting together tools from several sources in a coherent and programmable framework.
2. Analysis strategy:
HistFitter has built-in concepts of control, signal and validation regions, which are used to constrain, extrapolate and validate data model predictions across an analysis. The framework also introduces a statistically rigorous treatment of the validation regions.
3. Bookkeeping:
HistFitter can keep track of numerous data models, including all generated input histograms, both before and after adjustment to measured data, and can perform statistical tests and model-parameter scans of all these models in an organized way. This introduces a powerful additional level of abstraction, which aids the processing of large collections of signal hypothesis tests.
4. Presentation and interpretation:
HistFitter provides a collection of methods to determine the statistical significance of signal hypotheses, estimate the quality of likelihood fits, and produce publication-quality tables and plots expressing these results.

This paper details these extensions and is organized as follows. Section 2 summarizes the data analysis strategy and the statistical formalism used for many searches and measurements at the LHC. Section 3 describes how this strategy is ingrained in the HistFitter framework. Section 4 sketches how support for multiple Probability Density Function (PDF) instances of nearly arbitrary complexity has been implemented in HistFitter with a modular object-oriented design. Section 5 explains how the PDFs can be used to perform statistical fits of various types. Section 6 describes how the fit results can be conveniently presented and visualized with different methods. Finally, Sec. 7 shows how signal hypotheses can be tested quantitatively in several ways. The publicly available release of HistFitter is described in Sec. 8, before concluding in Sec. 9.

Footnote: Also referred to as "regressions" in the literature.

Data analysis strategy
Particle physics experiments require the careful analysis of large data samples, coming from an experimental apparatus, in order to measure the properties of fundamental particles. A very active field of research is focused on using these datasets to discover physical processes that have been predicted by theoretical models, but have not yet been observed in nature.

Analyses generally rely on external predictions for the various background and signal components in the data to aid the interpretation of observations, where the signal component describes the process of interest. In particle physics, simulations of known and hypothesized physics processes are run through a detailed detector simulation, and are subsequently reconstructed with the same algorithms as the data. In addition, background samples can be constructed using data-driven methods. The simulated samples may depend on one or many model parameters, for example the masses of hypothesized new particles such as those foreseen by supersymmetry. It may be required, for instance when signals are analyzed over a multi-dimensional space of model parameters, to sample from a "grid" of potential signal scenarios, with each point on that grid corresponding to a unique point in the multi-dimensional parameter space. If no excess is observed in the data, exclusion limits may be set within this grid, excluding a subset of the tested parameter values.

HistFitter configures and builds parametric models to describe the observed data, and provides tools to interpret the data in terms of these models. It uses the concepts of control, validation, and signal regions in the construction and handling of these models. A key innovation of HistFitter is to weave these concepts into its very fabric, and to treat them with statistically rigorous methods. The technical implementation of HistFitter is detailed in the following sections, where we explain two key ideas in data analysis strategy that have helped shape HistFitter.
Any physics analysis aiming to study a specific phenomenon involves defining a region of phase space, obtained by applying selections to a set of kinematic observables, where a particular signal model predicts a significant excess of events over the predicted background level. Such a signal-enriched region is called a signal region, or SR.

To estimate background processes contaminating the SR(s) in a semi-data-driven way, one typically defines control region(s), or CR(s), in which the dominant background(s) can be controlled by comparison to the data samples. CRs are specifically designed to have a high purity for one type of background, and should be free of signal contamination.

A third important component of data analysis is the validation of the model used to predict the number of background events in the SR(s).
Validation region(s), VR(s), are defined for this purpose. VR(s) are typically placed in between the CR(s) and SR(s). Hence, the choice of VR(s) is typically a trade-off between maximizing statistical significance and minimizing signal contamination, while controlling the assumptions in the extrapolation from CR(s) to SR(s).

The concept of extrapolation between CRs, VRs and SRs is schematically shown in Fig. 1. Any such region can have one or many event bins, as illustrated by the dashed lines. The extrapolation happens in observables chosen to separate the regions, as discussed in Sec. 2.2, and shown by the arrows on the figure.

Figure 1: A schematic view of an analysis strategy with multiple control, validation and signal regions. All regions can have single or multiple bins, as illustrated by the dashed lines. The extrapolation from the control to the signal regions is verified in the validation regions that lie in the extrapolation phase space.

To extract accurate and quantitative information from the data, particle physicists frequently use a Probability Density Function (PDF) whose parameters are adjusted with a fitting procedure. The fit to data is based on statistically independent CRs and SRs, which ensures that they can be modeled by separate PDFs and combined into a simultaneous fit. A crucial point of the HistFitter analysis strategy is the sharing of PDF parameters in all regions: CRs, SRs and VRs. This procedure enables the use of information from each signal and background component, as well as systematic uncertainties, consistently in all regions.

The analysis strategy flow is schematically shown in Fig. 2. Through the fit to data, the observed event counts in CR(s) are used to coherently normalize background estimates in all regions, notably the SR(s). If the dominant background processes are estimated with Monte Carlo (MC) simulations, their initial predictions are scaled to observed levels in the corresponding CRs using normalization factors computed in the fit. This results in so-called "normalized background predictions". These are then used for extrapolation into the VRs and SRs, as discussed in the next sub-section.
Extrapolation and transfer factors

An underlying assumption has been made in the previous sections, notably that extrapolations over the kinematic variables used to differentiate SR(s) from CR(s) are well modeled after fitting the PDF to data in CR(s). Once the dominant background processes have been normalized in CR(s), the corresponding modifications to the PDF can be extrapolated to the VR(s), which is (are)
then used to verify the validity of this assumption. In HistFitter, this extrapolation is statistically rigorous because the PDF is coherently defined in all the CRs, SRs and VRs, even though the VRs are not used as a constraint in the fit. Technical details on the comprehensive extrapolation technique used in HistFitter are given in Sec. 5.2.

Figure 2: Overview of a typical analysis strategy flow with HistFitter.

Once a satisfactory agreement is found between normalized background predictions and observed data in the VRs, the background predictions are further extrapolated to the SR(s), and, by convention, are only then compared with the observed data (see Fig. 2); a process generally called "unblinding" or "opening the box". This unblinding strategy is useful for validating the performance of the extrapolations, and, in a wider physics context, a) to gain confidence in the methods used and b) to prevent analyzers from using premature SR predictions, and thus potentially biasing the physics results.

Key ingredients of the fitting procedure are the ratios of expected event counts, called transfer factors, or TFs, of each normalized background process between each SR and each CR. The TFs allow the observations in the CRs to be converted into background estimates in the SRs, using:

  N_p(SR, est.) = N_p(CR, obs.) × [ MC_p(SR, raw) / MC_p(CR, raw) ] = µ_p × MC_p(SR, raw),   (1)

where N_p(SR, est.) is the SR background estimate for each simulated physics process p considered in the analysis, N_p(CR, obs.) is the observed number of data events in the CR for the process, and MC_p(SR, raw) and MC_p(CR, raw) are the raw, unnormalized estimates of the contributions from the process to the SR and CR respectively, as obtained from MC simulation. Similar equations enable the background estimates to be normalized coherently across all the CRs and the VRs. The ratio appearing in the square brackets in Eq. 1 is defined as the transfer factor TF.
N_p(CR, obs.) is often rewritten as µ_p multiplied by a fixed, nominal background prediction, where µ_p is the actual normalization factor obtained in the fit to data.

An important feature of using TFs is that systematic uncertainties on the predicted background processes can cancel in the extrapolation; a virtue of using the ratio of MC estimates. The total uncertainty on the number of background events in the SR is then a combination of the statistical uncertainties in the CR(s) and the residual systematic uncertainties of the extrapolation. For this reason, CRs are often defined by somewhat looser cuts than the SR, in order to increase the number of CR data events without significantly increasing the residual uncertainties in the TFs, which in turn reduces the extrapolation uncertainties to the SR. More information on the use of normalized systematic uncertainties is given in Sec. 4.4.
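The bookkeeping behind Eq. 1 can be illustrated with a short Python snippet. This is a toy calculation with invented numbers, not HistFitter code: it simply shows that the two forms of Eq. 1 give the same SR estimate.

```python
# Toy illustration of Eq. 1: converting an observed CR count into an SR
# background estimate via a transfer factor. All numbers are invented.

mc_cr_raw = 200.0   # raw MC prediction for process p in the control region
mc_sr_raw = 12.0    # raw MC prediction for process p in the signal region
n_cr_obs = 180.0    # observed CR events attributed to process p

tf = mc_sr_raw / mc_cr_raw    # the transfer factor (square bracket in Eq. 1)
mu_p = n_cr_obs / mc_cr_raw   # normalization factor obtained from the CR
n_sr_est = n_cr_obs * tf      # SR estimate, first form of Eq. 1

# The second form of Eq. 1 gives the same estimate:
assert abs(n_sr_est - mu_p * mc_sr_raw) < 1e-9
```

Note that any systematic uncertainty that shifts mc_sr_raw and mc_cr_raw by the same relative amount leaves tf, and hence the SR estimate, unchanged; this is the cancellation mentioned above.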
HistFitter software framework

HistFitter provides a programmable framework to build and test a set of data models. To do so, HistFitter takes a user-defined configuration as input, together with raw data. The HistFitter processing sequence then consists of three steps, illustrated by Fig. 3. From left to right:

1. Based on the user-defined configuration, HistFitter automatically prepares initial histograms, using ROOT, from the provided input source(s) that model the physics processes in the data. (The user-defined configuration and histogram creation are discussed further below and in Sec. 4.)

2. According to each specified configuration, the generated histograms are combined by HistFactory to construct a corresponding PDF. At the end of this process, each PDF is stored in a RooWorkspace object together with the dataset and model configuration.

3. The constructed PDFs are used to perform fits of the data with RooFit, to perform statistical tests with RooStats, and to produce plots and tables.

These steps all require a substantial bookkeeping and configuration machinery, which is provided by HistFitter. The following sub-sections summarize the central HistFitter configuration tool and the prominent features of the HistFactory and RooStats software tools that HistFitter utilizes.

The various steps can be executed individually or consecutively in a single run, and are all controlled with a single (and simple) user-defined configuration file. For example, in the early stages of an analysis, cut selections may need to be determined, requiring frequent regeneration of just the histograms that describe the data. When moving to later stages in an analysis, being able to go straight from a low-level description of the data to a high-level result (such as the statistical significance determination) can be quite beneficial.

One of the key benefits of having a single configuration is to aid collaboration between the various members of an analysis group. For example, the ability to rerun an analysis reasonably quickly, as long as the histograms have been created, helps tremendously in sharing the workload within a collaboration, as does having the statistical tools easily accessible to the various members of an analysis group. Likewise, the process of combining existing analyses is made more efficient than if each group had to independently submit histograms to some third party for a statistical combination.
Configuration manager
Figure 3 : Overview of the HistFitter processing sequence.
The central HistFitter configuration and bookkeeping machinery is built around a configuration manager, configManager, implemented by two singleton objects: one in Python and one in C++. When executing HistFitter, users interact with the Python interface of the configManager to define, for each data model, a fitConfig object. The configuration manager can hold any number of fitConfig objects.

A fitConfig object contains a PDF describing the CR, SR and VR data belonging to the model, together with meta-data required for the sequence of building, fitting, visualizing and interpreting each configuration (i.e. one entire row in Fig. 3), including the generation of relevant input histograms. The fitConfig class is described further in Sec. 4.1.

In terms of design patterns, the configManager can be seen as a "factory of factories", since it generates the construction of fitConfig objects that are themselves factories of PDF objects. By producing a list of data models, HistFitter introduces an additional level of abstraction which allows hypothesis tests to be performed over grids of signal models.

The construction of each data model typically requires the preparation of tens to hundreds of histograms. This can lead to memory exhaustion problems for long lists of models. However, while signal samples tend to be unique to each model, the background samples are often identical in most of the models. When preparing input histograms for each sample of each model, the configManager stores unique auto-generated names in a Python dictionary. The dictionary is used in turn to
efficiently identify and re-use the histograms that can be shared between independent data models (see Fig. 3), which significantly reduces the memory usage of the software. Additionally, the generated histograms are stored in an external file, allowing them to be directly loaded when rerunning HistFitter in the same configuration. This avoids the need for their, usually time-consuming, regeneration, and helps in sharing the workload within a collaboration.
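The caching scheme described above can be sketched in a few lines of plain Python. This is a toy stand-in for the configManager bookkeeping, not the actual implementation; the function names are invented for the example.

```python
# Toy sketch of histogram caching: histograms shared by several data models
# are built once and re-used via a name -> histogram dictionary.

hist_cache = {}

def get_histogram(process, region, builder):
    """Return the histogram for (process, region), building it only once."""
    key = "h_%s_%s" % (process, region)   # auto-generated unique name
    if key not in hist_cache:
        hist_cache[key] = builder(process, region)
    return hist_cache[key]

def expensive_build(process, region):
    # placeholder for the time-consuming histogram generation
    return [process, region]

# A background histogram is shared between two signal models:
h1 = get_histogram("ttbar", "SR", expensive_build)   # built here
h2 = get_histogram("ttbar", "SR", expensive_build)   # re-used, not rebuilt
assert h1 is h2
```

Persisting hist_cache to a file would correspond to the external histogram file mentioned above, which lets a rerun skip the generation step entirely.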
HistFactory

HistFitter uses the HistFactory package to construct a parametric model describing the data, based on provided input histograms. This parametric model describes the nominal prediction and associated systematic variations of multiple signal and background processes in multiple regions, up to nearly arbitrary complexity. The input histograms can be generated by HistFitter, or can be provided externally by users.

As detailed in Ref. [23], the PDF constructed by HistFactory describes the parameter(s) of interest, such as the rate of a signal process, the normalization factors for background processes (as estimated from the data), and the so-called nuisance parameters that parametrize the impact of systematic uncertainties. Each systematic uncertainty i is described with a nuisance parameter, θ_i, that continuously interpolates between the variation and nominal templates, e.g. θ_i = ±1 for the ±1σ variations and θ_i = 0 for the nominal template, where 1σ means one standard deviation.

The general likelihood L of the analyses considered here is the product of Poisson distributions of event counts in the SR(s) and/or CR(s) and of additional distributions that implement the constraints on systematic uncertainties. It can be written as:

  L(n, θ⁰ | µ_sig, b, θ) = P_SR × P_CR × C_syst
                         = P(n_S | λ_S(µ_sig, b, θ)) × ∏_{i ∈ CR} P(n_i | λ_i(µ_sig, b, θ)) × C_syst(θ⁰, θ).   (2)

The first two factors of Eq. 2 (the Poisson terms P) reflect the Poisson measurements of n_S and n_i, the number of observed events in the signal region and in each control region i. The Poisson expectations λ_S and λ_i are functions depending on the predictions b for various background sources, the nuisance parameters that parametrize systematic uncertainties, the normalization factors for background processes, µ_p, and also the signal strength parameter µ_sig.
For µ_sig = 0 the signal component is turned off, and for µ_sig = 1 the signal expectation equals the nominal value of the model under consideration. The predictions for signal and background sources are forced to be positive in HistFactory for any values of the nuisance parameters and in any histogram bin.

Systematic uncertainties are included using the probability density function C_syst(θ⁰, θ), where θ⁰ are the central values of the auxiliary measurements around which θ can be varied, for example when maximizing the likelihood. The impact of changes in nuisance parameters on the expectation values is described completely by the functions predicting the amount of signal and background, λ_S and λ_i. For independent nuisance parameters, C_syst is simply a product of the probability distributions corresponding to the auxiliary measurements describing each of the systematic uncertainties, typically Gaussians G with unit width,

  C_syst(θ⁰, θ) = ∏_{j ∈ S} G(θ⁰_j − θ_j),   (3)

where S is the full set of systematic uncertainties considered. The auxiliary measurements θ⁰ are typically fixed to zero, but can be varied when generating pseudo experiments (see below).

Several interpolation (and extrapolation) algorithms are employed in HistFactory to describe the PDF for all values of the nuisance parameters θ_j. Some details of these algorithms are discussed in Sec. 4.4, but for a complete overview the reader is referred to Ref. [23].

The execution of HistFactory results in a RooWorkspace, a persistent RooFit object containing the parametrized PDF, the dataset, and a helper object summarizing the model configuration. As discussed in the next sub-section, these are used as input to perform statistical tests with the RooStats package.
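For illustration, the structure of Eqs. 2 and 3 for a single signal region and a single control region with one overall background uncertainty can be written out as a toy Python function. All rates and uncertainties below are invented, and the real PDF is of course constructed automatically by HistFactory; this sketch only shows how the Poisson and constraint factors multiply together.

```python
import math

# Toy version of the likelihood of Eq. 2: one SR, one CR, one nuisance
# parameter theta constrained by a unit-width Gaussian (Eq. 3).

def poisson(n, lam):
    # P(n | lam)
    return math.exp(-lam) * lam**n / math.factorial(n)

def gauss(x):
    # unit-width Gaussian constraint term G(theta0 - theta), with theta0 = 0
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def likelihood(n_sr, n_cr, mu_sig, mu_b, theta,
               s_sr=5.0, b_sr=10.0, b_cr=100.0, sigma_b=0.2):
    # Expectations lambda_S and lambda_CR; theta shifts the background
    # prediction by its relative uncertainty sigma_b (invented numbers).
    lam_sr = mu_sig * s_sr + mu_b * b_sr * (1.0 + sigma_b * theta)
    lam_cr = mu_b * b_cr * (1.0 + sigma_b * theta)
    return poisson(n_sr, lam_sr) * poisson(n_cr, lam_cr) * gauss(theta)
```

Maximizing this function over mu_b and theta for fixed mu_sig is, in miniature, what the fit described in the following sections does; note how the CR term dominates the determination of mu_b, which is the normalization-factor mechanism of Sec. 2.2.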
RooStats

HistFitter is capable of applying a list of pre-configured statistical tests to one or several dataset(s) from a single command-line call. To do so, it interfaces with the RooStats package. These tests are:

1. hypothesis tests of signal models;

2. the construction of expected and observed confidence intervals on model parameters, for example the 95% confidence level upper limit on the rate of a signal process;

3. the significance determination of a potentially observed event excess.

A suite of statistical calculations can be performed, as configured by the user, ranging from Bayesian to Frequentist philosophies and using various test statistic quantities as input. By default, HistFitter employs a Frequentist method to perform hypothesis tests and uses the profile likelihood ratio q_µsig as test statistic (details below in Sec. 3.3.1). The CLs method [28] is used to test the exclusion of new physics hypotheses. Whenever appropriate, this method is approximated by asymptotic formulae [29] to speed up the evaluation process. Though not strictly HistFitter specific, some details follow in the next two sub-sections. More details about how hypothesis tests are performed with HistFitter are given in Sec. 7.

Footnote: Supported test statistics are: a maximum likelihood estimate of the parameter of interest; a simple likelihood ratio −2 ln(L(µ, θ̃)/L(0, θ̃)), as used by the LEP collaborations; a ratio of profile likelihoods −2 ln(L(µ, θ̂̂)/L(0, θ̂̂)), as used by the Tevatron collaborations; or a profile likelihood ratio −2 ln(L(µ, θ̂̂)/L(µ̂, θ̂)), as used by the LHC collaborations. For the latter case, the hypothesis tests can be evaluated as one- or two-sided. The sampling of the test statistics is done either with a Bayesian, Frequentist, or a hybrid calculator [24].
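The CLs quantity itself is simple to state: in one common convention it is the signal-plus-background p-value divided by one minus the background-only p-value. The sketch below uses invented p-values purely for illustration; RooStats computes the actual quantities.

```python
# Toy sketch of the CLs prescription of Ref. [28] (one common convention;
# the p-values below are invented for illustration).

def cls(p_sb, p_b):
    """CLs = p_{s+b} / (1 - p_b); a model is excluded at 95% CL if CLs < 0.05."""
    return p_sb / (1.0 - p_b)

assert abs(cls(0.02, 0.5) - 0.04) < 1e-12   # CLs < 0.05: excluded at 95% CL
assert cls(0.04, 0.3) > 0.05                # not excluded
```

Dividing by 1 − p_b protects against excluding signal models to which the analysis has little sensitivity, which is why CLs rather than the raw p_{s+b} is used for exclusion.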
As described in detail in Ref. [29], the likelihood function L used in the profile likelihood ratio is built from the observed data and the parametric model that describes the data. The profile log likelihood ratio for one hypothesized signal rate µ_sig is given by the test statistic:

  q_µsig = −2 ln( L(µ_sig, θ̂̂) / L(µ̂_sig, θ̂) ),   (4)

where µ̂_sig and θ̂ maximize the likelihood function, and θ̂̂ maximizes the likelihood for the specific, fixed value of the signal strength µ_sig. Different definitions of q_µsig apply to discovery and signal model exclusion hypothesis tests, and also to different ranges of µ_sig, as discussed in detail in Ref. [29].

The Frequentist probability value, or p-value, assigned by a hypothesis test of the data, e.g. a discovery or signal model exclusion test, is calculated using a distribution of the test statistic, f(q_µsig | µ_sig, θ⁰). This distribution can be obtained by throwing multiple pseudo experiments that randomize the number of observed events and the central values of the auxiliary measurements.

The test statistic q_µsig has an important property. According to Wilks' theorem [30] the distribution f(q_µsig | µ_sig, θ⁰) is known in the case of a large statistics data sample. For a single signal parameter, µ_sig, it follows a χ² distribution with one degree of freedom and is independent of the actual values of the auxiliary measurements, thus making it easy to approximate. The case of large statistics is also called "the asymptotic regime". The approximation of large statistics holds reasonably well in most cases, e.g. from as few as O(10) data events. Typically one therefore uses asymptotic formulae to evaluate the p-value of the hypothesis test, avoiding the need for time-costly pseudo experiments. When not working in the asymptotic regime, i.e. in cases of low statistics, the distribution of the test statistic f(q_µsig | µ_sig, θ⁰) needs to be sampled using pseudo experiments.
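A toy version of Eq. 4 for a single-bin counting experiment with a known background, together with the asymptotic (Wilks) p-value, can be sketched as follows. This is illustrative only: the rates s and b are invented, no nuisance parameters are profiled, and the discovery/exclusion-specific refinements of q_µsig mentioned above are ignored.

```python
import math

# Toy profile likelihood ratio q_mu for a single-bin counting experiment
# with fixed, known background (a simplified illustration of Eq. 4).

def nll(n, lam):
    # -ln Poisson(n | lam), dropping the ln(n!) constant (cancels in Eq. 4)
    return lam - n * math.log(lam)

def q_mu(n_obs, mu_sig, s=5.0, b=10.0):
    # numerator of Eq. 4: likelihood at the tested mu_sig (no nuisance
    # parameters in this toy, so no profiling is needed)
    num = nll(n_obs, mu_sig * s + b)
    # denominator: likelihood at the best-fit signal strength mu_hat >= 0
    mu_hat = max(0.0, (n_obs - b) / s)
    den = nll(n_obs, mu_hat * s + b)
    return 2.0 * (num - den)

def p_value(q):
    # asymptotic regime: q follows a chi2 with one degree of freedom,
    # so P(chi2_1 > q) = erfc(sqrt(q/2))
    return math.erfc(math.sqrt(q / 2.0))

q = q_mu(n_obs=10, mu_sig=1.0)   # data at the background-only expectation
```

Outside the asymptotic regime one would instead histogram q_mu over many pseudo experiments, as described next, and read the p-value off that sampled distribution.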
As the true values of the auxiliary measurements are unknown, one ideally scans µ_sig and θ⁰ to generate a sufficiently high number of pseudo experiments for each set. In this way one can find the values that give the most conservative p-value for the parameter of interest. For example, one cannot exclude a signal model if there is any set of auxiliary measurement values for which the CLs value is greater than 5%. This is not a practical procedure when there is a large set of auxiliary measurements to consider. However, it turns out a good guess can be made of what values of θ⁰ maximize the p-value. The idea is the following. As a p-value is based on the observed data, the largest p-value essentially corresponds to the scenario that is most compatible with the data. Therefore one first fits the nuisance parameters based on the observed data and the hypothesized value of µ_sig, including all fit regions. These are then used to set the auxiliary measurement values. In statistical terms: the nuisance parameters have been "profiled" on the observed data. Based on this, one generates the pseudo experiments that are expected to maximize the p-value over the auxiliary measurements, and the observed p-value is evaluated as usual. This procedure is called "the profile construction". This procedure guarantees exact statistical coverage for a counting experiment in the case where the fitted values of θ⁰ correspond to their true values. Towards the asymptotic regime, however, the distribution f(q_µsig | µ_sig, θ⁰) becomes independent of the values of the auxiliary measurements used to generate the pseudo experiments.

Footnote: Note that the maximization of the likelihood function forces the need for continuous and smooth parametric models to describe the signal and background processes present in the data.

Footnote: Equal to taking the median and width of a collection of pseudo experiments, see discussion in Ref. [29].
As a result, when using this procedure, the p-value obtained from the hypothesis test is robust, and generally will not undercover. Both the observed and expected p-values depend equally on the unknown true values of the auxiliary measurements. For consistency reasons, the convention adopted at the LHC is to use the same values to obtain the expected p-value as the observed p-value on the data, i.e. the same fitted background levels are used to generate pseudo experiments for both cases, such that the predicted expectation is the most compatible assessment for the actual observation. Through this choice the expected p-value now depends indirectly on the observed data in the SR(s). A consequence is discussed in Sec. 7.3.

Programming of Probability Density Functions

HistFitter is designed to build and manipulate PDFs of nearly arbitrary complexity. In the terminology of HistFactory, the likelihood function in Eq. 2 has multiple channels, which need inputs in the form of samples, corresponding to the signal and background processes for each region. In turn, the various samples have systematic uncertainties, or systematics. A HistFactory "channel" is a synonym for a "region", generically referring to either a CR, SR or VR in this section. The systematic uncertainties can be statistical, theoretical or experimental in nature. HistFitter mirrors (overloads) these HistFactory C++ classes in Python, and extends them by adding the flexibility to construct multiple PDFs from these building blocks in a programmable way, as discussed further in this section.

An example HistFitter configuration file, written in Python and demonstrating these components, is shown in Appendix A.

The fit configuration
HistFitter uses the fitConfig class to construct its PDFs. The design of this class allows for the creation of highly complex PDFs, describing highly non-trivial analysis setups, with only a few lines of intuitive code. This is configured by users as follows:

from configManager import configMgr
myFitConfig = configMgr.addFitConfig("myAnalysisName")

where myFitConfig is a reference to a new fitConfig object owned by the configManager. The fitConfig class logically corresponds to a PDF decorated with meta-data about the properties of the contained channels (CR, SR, VR), including visualization, fitting and interpretation options.
Figure 4: Illustration of a fit configuration in HistFitter. Each fitConfig instance defines a PDF built from a list of channel (i.e. CR, SR or VR), sample and systematic objects. Each channel owns a list of samples and each sample owns a list of systematic uncertainties. Correlated samples and systematics are declared by being given identical names. Otherwise they are treated as uncorrelated.
Figure 5: The methods addChannel(), addSample() and addSystematic() are used to build complex PDFs in an intuitive way. The methods addSample() and addSystematic() implement a "trickle-down" mechanism, discussed in the text.
During configuration, instances of channels, samples and systematics are put together by fitConfig objects, together with links to the corresponding input histograms. During execution, the fitConfig information is used to steer the HistFactory package's creation of a RooSimultaneous object modelling the actual PDF with RooFit.

Fig. 4 illustrates the modular design of a typical HistFitter fit configuration. The user interface provides the methods addChannel(), addSample() and addSystematic() to build up data models in an intuitive manner. For instance, samples and systematics can be efficiently added to multiple channels through a "trickle-down" mechanism, as illustrated by Fig. 5. This means that fitConfig.addSample() adds a sample to all the channels owned by the fitConfig, while channel.addSample() adds a sample to one specific channel. Similarly, sample.addSystematic() only adds a systematic to one specific sample, while channel.addSystematic() adds a systematic to all the samples owned by the channel, and fitConfig.addSystematic() adds a systematic to all the samples of all the channels owned by the fitConfig.

Since different channels often share the same samples (meaning: physics processes), and different samples often share correlated systematic uncertainties, the trickle-down mechanism is an extremely useful feature. It means that complex configurations of PDFs can often be described with only a few lines of code. As illustrated in Fig. 5, one simply adds all channels, samples, and systematic uncertainties directly to the fitConfig object and lets these "trickle down", thereby automatically creating a highly advanced fit configuration.

A basic fit configuration can also be conveniently cloned and extended to specify new configurations, a feature which is frequently used to build data models corresponding to multiple signal hypotheses from a common background description.
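The trickle-down mechanism can be sketched with simplified stand-in classes. This is a toy illustration of the ownership structure of Figs. 4 and 5, not the real HistFitter implementation:

```python
# Toy stand-ins (not the actual HistFitter classes) showing trickle-down:
# a sample added at the fit-configuration level propagates to every channel.

class Channel:
    def __init__(self, name):
        self.name = name
        self.samples = []

    def addSample(self, sample):
        # channel-level call: affects this channel only
        self.samples.append(sample)

class FitConfig:
    def __init__(self):
        self.channels = []

    def addChannel(self, name):
        ch = Channel(name)
        self.channels.append(ch)
        return ch

    def addSample(self, sample):
        # trickle down: the sample is added to all owned channels
        for ch in self.channels:
            ch.addSample(sample)

cfg = FitConfig()
cfg.addChannel("CR")
cfg.addChannel("SR")
cfg.addSample("ttbar")   # one call, both channels receive the sample
assert all("ttbar" in ch.samples for ch in cfg.channels)
```

The same pattern repeats one level down for systematics owned by samples, which is why a handful of top-level calls can describe a fully correlated multi-region PDF.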
Channel objects contain data from a region of phase space defined by event selection criteria on the input dataset. Channels can represent either a simple event count (i.e. one bin) or the multi-binned distribution of a physical observable. New binned and un-binned channels can be added to a fitConfig by calling:

myChannel = myFitConfig.addChannel("myObs", ["mySelection"], nBins, varLow, varHigh)
myUnbinnedChannel = myFitConfig.addChannel("cuts", ["mySelection"], 1, 0.5, 1.5)

where myObs is the name of an element of the input dataset, nBins, varLow and varHigh indicate the number of bins and the range of values as for a one-dimensional histogram, and mySelection specifies the selection criteria of the considered region. For un-binned channels, cuts is a reserved keyword indicating that only the total number of events passing the selection criteria needs to be considered. As discussed in Sec. 5.1, a
Channel object can represent a CR, SR or VR. (A single-bin channel is sometimes referred to as a "cut-and-count" experiment in the literature.) This information is configured by users as follows:
myFitConfig.setBkgConstrainChannels(myChannel)
myFitConfig.setValidationChannels(myChannel)
myFitConfig.setSignalChannels(myChannel)
It is possible to add an arbitrary number of channels to a given fitConfig by simply calling addChannel() multiple times. Consequently, HistFitter automatically performs simultaneous fits constrained by the data of all BkgConstrainChannels (CRs) and SignalChannels (SRs), but not by the ValidationChannels (VRs). The data itself is described by a list of Sample objects owned by each channel, as discussed in the next sub-section.
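The trickle-down mechanism described above can be sketched with a few lines of plain Python. The classes below are simplified stand-ins, not the real HistFitter API: they only illustrate how adding a sample at the fitConfig level propagates it to every owned channel, while adding it at the channel level affects that channel alone.

```python
# Stand-in classes (not the real HistFitter API) illustrating the
# "trickle-down" mechanism: an addition at a higher level propagates
# to every object owned below that level.
class Channel:
    def __init__(self, name):
        self.name = name
        self.samples = []

    def addSample(self, sample):
        self.samples.append(sample)

class FitConfig:
    def __init__(self, name):
        self.name = name
        self.channels = []

    def addChannel(self, name):
        channel = Channel(name)
        self.channels.append(channel)
        return channel

    def addSample(self, sample):
        # trickle down: the sample is added to every owned channel
        for channel in self.channels:
            channel.addSample(sample)

fit = FitConfig("bkgOnly")
cr = fit.addChannel("CR")
sr = fit.addChannel("SR")
fit.addSample("ttbar")   # trickles down to both CR and SR
sr.addSample("signal")   # added to the SR only

print(cr.samples)  # ['ttbar']
print(sr.samples)  # ['ttbar', 'signal']
```

The same ownership logic applies one level further down for systematics, which trickle from fitConfig to channels to samples.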
The Sample class logically corresponds to a component of a RooFit PDF decorated with HistFitter meta-data. In a typical particle physics analysis, each sample corresponds to a specific physics process, and several samples are needed to model a complete dataset. In HistFitter, samples can be defined in a specific channel or defined simultaneously in multiple channels. The Sample class also owns a list of objects representing its systematic uncertainties. Importantly, samples provide the link between PDF components and raw input data. Three types of inputs are supported:

1. TTree: a ROOT data structure, stored in a TFile, in which a list of events is mapped to a list of key-value pairs characterizing the properties of each event;
2. Float: floating-point numbers provided by users through the Python interface of HistFitter;
3. Histogram: pre-made histograms using the ROOT TH1 data structure, stored in an external TFile.

The most commonly used type of input is TTree, which provides maximal flexibility and features but requires the largest amount of processing power and disk I/O. Float inputs tend to be used for quick tests and simple processes. Histogram inputs can be used for compatibility with external frameworks, and also allow the user to conveniently skip the TTree-to-histogram processing when re-building PDFs. In all cases, the raw input is transformed into histograms as specified by Sample objects, before being saved to a temporary file and passed to HistFactory to build the RooFit PDFs (see Sec. 4.1).

A basic sample can be created and configured by users as follows:

mySample = Sample("SampleName", myColor)
myChannel.addSample(mySample)

which constructs a sample object owned by myChannel and displayed with the color myColor by the visualization tools. In this example, HistFitter takes inputs from a TTree object named SampleName in the default ROOT file specified at the configManager level. To construct the sample, HistFitter uses the event selection criteria of the parent channel and applies a default sample weight.
The default settings can be over-written by users to achieve specific goals. For instance, a sample can be built from Float input with:

mySample.buildHisto([100,34,220], "region", "observable")

where the list [100,34,220] specifies the values of three bins in a histogram. The default sample weight and path to the input data can also be over-written as follows:

mySample.setWeight(("weight1","weight2"))
mySample.setFileList(["File1.root","File2.root"])
mySample.setTreeName("ArbitraryName")
mySample.setHistoName("ArbitraryName")

Weights are passed as strings to also allow the easy use of weights stored in a ROOT TTree. In addition, the Sample class has optional methods to configure its corresponding RooFit PDF, such as:

mySample.setStatConfig(False)
mySample.setNormFactor("myNorm", 1.0, 0.0, 10.0)

resulting in the deactivation of built-in Poisson statistical uncertainties (activated by default), and in the creation of a fit normalization factor myNorm with initial value 1.0 and allowed range 0.0 to 10.0, respectively. Last but not least, HistFitter provides many features for modeling the systematic uncertainties associated to each sample, as discussed in the next sub-section.

Systematic uncertainties
For each model component, a nominal distribution representing the best available prediction is typically provided to the physics analysis as a histogram owned by a Sample object. These components typically have systematic uncertainties whose impact is quantified in dedicated studies. This is often modeled as variations of one standard deviation around the nominal prediction, provided to the physics analysis as sets of two additional histograms. These systematic uncertainties are parametrized in the PDF with nuisance parameters, as in Eq. 2.

In HistFitter, systematic uncertainties are implemented with a dedicated Systematic class with several options. In a typical analysis, several Systematic objects are built and owned by a parent Sample. Through the trickle-down mechanism described in Sec. 4, systematics can be defined for a specific sample or defined simultaneously for multiple samples and/or multiple channels.

A Systematic object can be conceived as a doublet of samples specifying up and down variations around the parent Sample. Hence Systematic objects can be constructed from the same types of inputs as Samples, namely: TTree, Float and histogram. When using TTree inputs, two methods can be used to compute the up/down variations of a systematic: weight-based or tree-based. In the weight-based method, histograms are always built
Basic systematic methods in HistFactory:
  overallSys              : uncertainty on the global normalization, not affecting the shape
  histoSys                : correlated uncertainty of shape and normalization
  shapeSys                : uncertainty of statistical nature applied to a sum of samples, bin by bin

Additional systematic methods in HistFitter:
  overallNormSys          : overallSys constrained to conserve the total event count in a list of region(s)
  normHistoSys            : histoSys constrained to conserve the total event count in a list of region(s)
  normHistoSysOneSide     : one-sided normHistoSys uncertainty built from tree-based or weight-based inputs
  normHistoSysOneSideSym  : symmetrized normHistoSysOneSide
  overallHistoSys         : factorized normalization and shape uncertainty, described with overallSys and histoSys respectively
  overallNormHistoSys     : overallHistoSys in which the shape uncertainty is modeled with a normHistoSys and the global normalization uncertainty with an overallSys
  shapeStat               : shapeSys applied to an individual sample

Table 1: Sub-set of the systematic methods available in HistFitter. The methods are specified by a string argument containing a combination of basic HistFactory methods and optional HistFitter keywords: norm, OneSide and/or Sym. Systematic objects can be built with tree-based, weight-based, Float or histogram input methods in all cases.

from the same
TTree, using three different sets of weights: up, nominal and down. In the tree-based method, histograms are built from three different TTrees using the same set of weights. If only one variation is available, users can either build a one-sided uncertainty or symmetrize the variation as nominal ± (up − nominal). Systematic objects can be created by users as follows:

mySys = Systematic("myTreeSys", "ASample", "ASample_UP", "ASample_DOWN", "tree", "myMethods")
mySys = Systematic("myWeightSys", ["nominalWeights"], ["upWeights"], ["downWeights"], "weight", "myMethods")
mySys = Systematic("myUserSys", ["nominalWeights"], 1.1, 0.8, "user", "myMethods")

where myTreeSys and myWeightSys rely on the tree-based and weight-based methods. myUserSys relies on the Float input discussed above, and, in this example, has asymmetric up and down input uncertainty values of 10% and 20%. The last argument, myMethods, is discussed below. Systematic objects are then associated to
Sample or Channel objects with:

mySample.addSystematic(mySys)
myChannel.addSystematic(mySys)
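The symmetrization of a one-sided variation described above can be written out explicitly. The helper below is hypothetical (not a HistFitter function); it just applies down = nominal − (up − nominal) bin by bin, with made-up bin contents.

```python
# Hypothetical helper sketching the symmetrization of a one-sided variation:
# the missing "down" histogram is mirrored bin by bin as nominal - (up - nominal),
# i.e. 2 * nominal - up.
def symmetrize(nominal, up):
    return [2.0 * n - u for n, u in zip(nominal, up)]

nominal = [100.0, 50.0, 20.0]
up      = [110.0, 52.0, 26.0]
down = symmetrize(nominal, up)
print(down)  # [90.0, 48.0, 14.0]
```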
As illustrated in Fig. 4, correlated systematic uncertainties are declared simply by giving them identical names in the corresponding Samples. Otherwise they are treated as uncorrelated.
When turning the above into nuisance parameters, additional input is required to specify the interpolation (extrapolation) algorithm and constraint parametrization for each systematic uncertainty. This is done with the argument myMethods above. Several possible analysis strategies can be envisaged, requiring a detailed discussion, case by case. To address this, HistFitter does not enforce a specific strategy but provides users with as many methods as possible to cover all reasonable possibilities.

The basic methods for systematic uncertainties defined in HistFactory are called overallSys, histoSys and shapeSys, and are listed in the top half of Tab. 1. An overallSys describes an uncertainty on the global normalization of the sample. This method does not affect the shape of the distributions of the sample. A histoSys describes a correlated uncertainty of the shape and normalization. Both methods use a Gaussian constraint to model an uncertainty and allow for asymmetric uncertainties, simply by providing asymmetric input values or histograms, respectively. By default they are configured to use a 6th-order polynomial interpolation technique between the ±1σ and nominal histograms and a linear extrapolation beyond |1σ| [23], though they can be configured differently.

A shapeSys describes an uncertainty of statistical nature, typically arising from limited MC statistics. In HistFactory, shapeSys is modeled with an independent parameter for each bin of each channel that is, however, shared between all samples with StatConfig==True (see Sec. 4.2). For simplicity, users can also set a threshold below which samples are neglected when building a shapeSys. The interpolation and extrapolation techniques used for shapeSys are as for histoSys, and the constraint is parametrized as Poissonian.

To respond to various use cases encountered during real-life analysis of ATLAS Run-1 data, HistFitter provides additional systematic methods derived from the basic HistFactory methods.
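The idea of vertical interpolation between the nominal and ±1σ histograms can be illustrated with the simplest possible scheme. The sketch below is piecewise-linear, not the 6th-order polynomial default described above, but any such scheme reproduces the input histograms exactly at α = −1, 0, +1; the numbers are invented.

```python
# Sketch of the simplest (piecewise-linear) vertical interpolation of a single
# bin content as a function of the nuisance parameter alpha. HistFactory's
# default polynomial + linear scheme smooths the kink at alpha = 0 but agrees
# with this at alpha = -1, 0 and +1.
def interpolate(nominal, down, up, alpha):
    if alpha >= 0.0:
        return nominal + alpha * (up - nominal)
    return nominal + alpha * (nominal - down)

nom, dn, up = 100.0, 92.0, 110.0
print(interpolate(nom, dn, up, 1.0))   # 110.0 (the +1 sigma input)
print(interpolate(nom, dn, up, -1.0))  # 92.0  (the -1 sigma input)
print(interpolate(nom, dn, up, 0.5))   # 105.0 (half-way up)
```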
A sub-set of the systematic methods available in HistFitter is listed in the bottom half of Tab. 1. These methods can be specified with combinations of the norm, OneSide and Sym keywords. The norm keyword indicates that the total event count is required to remain invariant in a user-specified list of normalization region(s) when constructing up/down variations; this describes uncertainties of the shape only. Such a systematic uncertainty is transformed from an uncertainty on event counts in each region into a systematic uncertainty on the transfer factors, as discussed in Sec. 2 (Eq. 1). The OneSide and Sym keywords indicate that a one-sided or a symmetrized uncertainty should be constructed when using tree-based or weight-based inputs.
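The effect of the norm keyword can be sketched numerically. The helper below is hypothetical (not the HistFitter implementation): it rescales a varied histogram so that its total event count in the normalization region matches the nominal one, leaving only the shape difference; the bin contents are made up.

```python
# Illustration (with invented numbers) of the "norm" keyword: the varied
# histogram is rescaled so that its total event count in the normalization
# region(s) equals the nominal one, turning a normalization + shape variation
# into a pure shape uncertainty.
def normalize_variation(nominal, varied):
    scale = sum(nominal) / sum(varied)
    return [v * scale for v in varied]

nominal = [60.0, 30.0, 10.0]   # 100 events in total
varied  = [75.0, 30.0, 15.0]   # 120 events in total: +20% normalization shift
shape_only = normalize_variation(nominal, varied)
print(round(sum(shape_only), 6))  # 100.0 -- total conserved, only the shape differs
```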
Performing fits

Different fit strategies are commonly used in physics analyses, differing by the usage of particular combinations of CRs, SRs and VRs, and by whether a signal model is considered or not. Fit strategies aim either to derive background estimates in VRs and SRs, or to make quantitative statements on the compatibility of the background estimate(s) with the observed data in the SR(s). HistFitter is tailored specifically to the design and implementation of such fit strategies. Also discussed in this section are the technical and statistical details of the extrapolation of background processes across CRs, SRs and VRs.
Common fit strategies
The three most commonly used fit strategies in HistFitter are defined as: the "background-only fit"; the "model-dependent signal fit"; and the "model-independent signal fit". This section describes the details of each fit strategy, as also summarized in Tab. 2 at the end of the section. The application of these to validation and hypothesis-testing purposes is described in detail in Secs. 6 and 7, respectively.

Background-only fit
The purpose of this fit strategy is to estimate the total background in SRs and VRs, without making assumptions on any signal model. As the name suggests, only background samples are used in the model. The CRs are assumed to be free of signal contamination. The fit is only performed in the CR(s), and the dominant background processes are normalized to the observed event counts in these regions. As the background parameters of the PDF are shared in all different regions, the result of this fit is used to predict the background event levels in the SRs and VRs.

The background predictions from the background-only fit are independent of the observed number of events in each SR and VR, as only the CR(s) are used in the fit. This allows for an unbiased comparison between the predicted and observed number of events in each region. In Sec. 6 the background-only fit predictions are used to present the validation of the transfer factor-based background level predictions.

Another important use case for background-only fit results in the SR(s) is for external groups to perform a hypothesis test on an untested signal model, which has not been studied by the experiment. With the complex fits currently performed at the LHC, it may be difficult (if not impossible) for outsiders to reconstruct these. An independent background estimate in the SR, as provided by the background-only fit, is then the correct estimate to use as input to any hypothesis test (see Sec. 7). Technical details of the extrapolation approach are discussed in Sec. 5.2, and validation examples are given in Sec. 6.
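The transfer-factor logic behind the background-only fit reduces, in the simplest one-CR, one-SR case, to a two-line calculation. The numbers below are invented for illustration; the real fit does this through shared PDF parameters for many backgrounds and regions simultaneously.

```python
# Toy numbers sketching the transfer-factor logic of the background-only fit:
# a normalization factor mu is determined by the data in the CR and carried
# over to the SR through the shared background normalization parameter.
n_obs_cr = 240.0   # observed events in the control region
n_mc_cr  = 200.0   # nominal MC background prediction in the CR
n_mc_sr  = 10.0    # nominal MC background prediction in the SR

mu = n_obs_cr / n_mc_cr      # fitted normalization factor, here 1.2
n_pred_sr = mu * n_mc_sr     # extrapolated background estimate in the SR
print(n_pred_sr)  # 12.0
```

Equivalently, n_pred_sr = n_obs_cr × (n_mc_sr / n_mc_cr), where the ratio in parentheses is the transfer factor of Eq. 1.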
Model-dependent signal fit
This fit strategy is used with the objective of studying a specific signal model. In the absence of a significant event excess in the SR(s), as concluded with the background-only fit configuration, exclusion limits can be set on the signal models under study. In case of an excess, the model-dependent signal fit can be used to measure properties such as the signal strength. The fit is performed in the CRs and SRs simultaneously. Along with the background samples, a signal sample is included in all regions, not just the SR(s), to correctly account for possible signal contamination in the CRs. A normalization factor, the signal strength parameter µ_sig, is assigned to the signal sample. (Other nomenclature for the model-dependent and model-independent signal fits, deemed confusing here, is "exclusion fit" and "discovery fit" respectively.)

Note that this fit strategy can be run with multiple SRs (and CRs) simultaneously, as long as these
are statistically independent, non-overlapping regions. If multiple SRs are sensitive to the same signal model, performing the model-dependent signal fit on the statistical combination of these regions shall, in general, give better (or equal) exclusion sensitivity than obtained in the individual analyses. An example of this is given in Fig. 10 (right) of Sec. 7.1. As shown in Sec. 4, combining the channels of multiple analyses into a single fit configuration is a straightforward exercise in HistFitter.

In a similar fashion, using multiple bins of a signal-sensitive observable in the definition of the SR(s) will generally give a better sensitivity to any signal model studied. An example of such a "shape-fit" signal region is shown in Fig. 8 of Sec. 6.1.

Typically, a grid of signal samples for a particular signal model is produced by varying some of its model parameters, such as the masses of supersymmetric particles. The model-dependent signal fit is repeated for each of these grid points, thereby probing the phase space of the model. Examples of this are provided in Secs. 7.1 and 7.2.

Model-independent signal fit
An analysis searching for new physics phenomena typically sets model-independent upper limits on the number of events beyond the expected number of events in each SR. In this way, for any signal model of interest, anyone can estimate the number of signal events predicted in a particular signal region and check if the model has been excluded by current measurements or not.

Setting the upper limit is accomplished by performing a model-independent signal fit. For this fit strategy, both the CRs and SRs are used, in the same manner as for the model-dependent signal fit. Signal contamination is not allowed in the CRs, but no other assumptions are made for the signal model, also called a "dummy signal" prediction. The SR in this fit configuration is constructed as a single-bin region, since having more bins requires assumptions on the signal spread over these bins. The number of signal events in the signal region is added as a parameter to the fit. Otherwise, the fit proceeds in the same way as the model-dependent signal fit. Examples of upper limits on numbers of beyond-the-SM events, obtained using this setup, are provided in Sec. 7.3.

The model-independent signal fit strategy, fitting both the CRs and each SR, is also used to perform the background-only hypothesis test, which quantifies the significance of any observed excess of events in a SR, again in a manner that is independent of any particular signal model. More details on the background-only hypothesis test are discussed in Sec. 7.4. One main but subtle difference between the model-independent signal hypothesis test (Sec. 7.3) and the background-only hypothesis test (Sec. 7.4) is that the signal strength parameter is set to one or zero in the profile likelihood numerator, respectively. Both the addition of simultaneous SRs and of shape information in these SRs will make an analysis more versatile.
Since the shape of signal models over multiple bins or multiple SRs will in general be different from the background prediction, the fit thereby gains separation power to distinguish the two. In particular, the sensitivity to signal models not considered in the optimization of the SRs is often retained.
Extrapolation and error propagation
Fit setup       Background-only fit   Model-dependent signal fit   Model-independent signal fit
Samples used    backgrounds           backgrounds + signal         backgrounds + dummy signal
Fit regions     CR(s)                 CR(s) + SR(s)                CR(s) + SR

Table 2: Summary of the fit strategies supported in HistFitter, as described in the text.
This section discusses the extrapolation of coherently normalized background estimates from the CR(s) to each SR and VR, as obtained from the background-only fit discussed in Sec. 5.1. (The background-only fit estimates are sometimes called "blinded" background estimates, as the SR(s) and VR(s) are not included in the fit.) The basic strategy behind the background extrapolation approach is to share the background parameters of the PDF in all the different regions: CRs, SRs and VRs.

As discussed in Sec. 3.2, a likelihood function is built from both the parametric model and the observed data. In other words, a background-only fit to the CRs technically requires a PDF modeling only the CRs. On the other hand, the extrapolation of the normalized background processes from the CRs to the SRs and VRs, which uses the background-only fit result, requires a different PDF containing all these regions.

In HistFitter, the technical construction of these various PDFs proceeds as follows. First a total PDF describing all CRs, SRs and VRs is constructed using HistFactory. This PDF is not used to fit the data, as the likelihood is unaware of the concept of different region types. HistFitter has dedicated functions to deconstruct and reconstruct PDFs, based on the channel types discussed in Sec. 4.2. To perform the background-only fit, the total PDF is deconstructed and then reconstructed describing only the CRs. The result of the background-only fit is stored, containing the values, the errors and the covariance matrix corresponding to all fit parameters. After this fit, the normalized backgrounds are extrapolated to the SRs (or VRs). For this, HistFitter deconstructs and reconstructs the total PDF, now describing the CRs and SRs (or VRs). The background-only fit result is then incorporated into this PDF to obtain the extrapolated background prediction b in any SR (or VR).

Once the background-only fit to data has been performed and the total PDF been reconstructed, an estimate of the uncertainty on an extrapolated background prediction, σ_{b,tot}, can be calculated. The determination of this error requires the uncertainties and correlations from the stored fit result. The total error on b is calculated using the typical error propagation formula

\sigma_{b,\mathrm{tot}}^{2} = \sum_{i}^{n} \left( \frac{\partial b}{\partial \eta_i} \right)^{2} \sigma_{\eta_i}^{2} + \sum_{i}^{n} \sum_{j \neq i}^{n} \rho_{ij} \left( \frac{\partial b}{\partial \eta_i} \right) \left( \frac{\partial b}{\partial \eta_j} \right) \sigma_{\eta_i} \sigma_{\eta_j} \,, \qquad (5)

where η_i are the floating fit parameters, consisting of normalization factors µ_k and nuisance parameters θ_l, ρ_ij is the correlation coefficient between η_i and η_j, and σ_{η_i} is the standard deviation of η_i. Any partial derivatives of b are evaluated on the fly.
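Eq. 5 can be evaluated numerically once the gradient of b with respect to the fit parameters and the fitted uncertainties and correlations are known. The sketch below uses invented two-parameter inputs; it is a direct transcription of the formula, not HistFitter's internal implementation.

```python
import math

# Numerical sketch of Eq. (5) with made-up inputs: propagate fit-parameter
# uncertainties and correlations to a background prediction b through the
# gradient db/deta evaluated at the best-fit point.
def propagated_error(grad, sigma, rho):
    n = len(grad)
    # diagonal term: sum_i (db/deta_i)^2 sigma_i^2
    var = sum((grad[i] * sigma[i]) ** 2 for i in range(n))
    # off-diagonal term: sum_{i, j != i} rho_ij (db/deta_i)(db/deta_j) sigma_i sigma_j
    var += sum(rho[i][j] * grad[i] * grad[j] * sigma[i] * sigma[j]
               for i in range(n) for j in range(n) if j != i)
    return math.sqrt(var)

grad  = [5.0, 2.0]            # partial derivatives db/deta_i
sigma = [0.1, 0.3]            # standard deviations of the fit parameters
rho   = [[1.0, -0.5],         # correlation matrix from the fit result
         [-0.5, 1.0]]
print(propagated_error(grad, sigma, rho))
```

With these inputs the variance is 0.25 + 0.36 − 0.30 = 0.31; the anti-correlation between the two parameters reduces the total error below the quadrature sum.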
The after-fit parameter values, errors and correlations are saved in the RooFit class RooFitResult. Let us take the example of a background-only fit result (from CRs only) that needs to be extrapolated to a SR. The total PDF (consisting of CRs and SRs) contains a set of parameters that can be subdivided as follows:

1. a large set of parameters shared between the CRs and SR, η_shared, for example the background normalization factors and most systematic uncertainties;
2. a subset of parameters connected only to the CRs, η_CR, for example the uncertainties due to limited Monte Carlo statistics in the CRs;
3. another subset of parameters connected only to the SR, η_SR.

When the fit is performed with the (deconstructed) CRs-only PDF, only the parameters η_shared and η_CR are evaluated and saved in the fit result. Hence when this fit result is propagated to the SR, the estimated error only contains the parameters that are shared between the CRs and the SR, and thus is incomplete. The uncertainties corresponding to η_SR are not picked up in Eq. 5, as these are not contained in the fit result.

HistFitter uses an expanded version of the RooFitResult class, called RooExpandedFitResult, that contains all of the nuisance parameters of all regions in the extrapolation PDF, even if these are not used in the background-only fit configuration. This expansion makes it possible to extrapolate all of the shared parameters, while keeping the unshared parameters, such that a complete error can be calculated in any region. In the expanded fit result, the correlations between the shared and unshared parameters are set to zero.

Using the RooExpandedFitResult class, the VRs can now provide a rigorous statistical cross-check. If the background-only fit to the CRs finds that changing the background normalization and/or shape parameters of a kinematic distribution gives a better description of the data, this will be reflected automatically in the VRs. Likewise, if the uncertainty on these parameters has a strong impact, and is reduced by the fit, the effect will be readily propagated.

In HistFitter, the before-fit parameter values, errors and correlations are stored in an expanded fit result object as well. The before-fit and after-fit background value and uncertainty predictions can thus be easily compared. A few assumptions are made to construct this before-fit object. First, all correlations are set to zero prior to the fit, effectively removing the second term of Eq. 5. Second, the errors on the normalization factors of the background processes are unknown prior to the fit, and hence set to zero.

The fit strategies of Sec. 5.1 are illustrated in Fig. 6, together with the PDF restructuring detailed in this section. In Fig. 6, the various constructed PDFs are indicated as rounded squares and the fit configurations, on the right-hand side, as squares.
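The covariance-matrix expansion behind this idea can be sketched in a few lines. The function below is stand-in code, not the real RooExpandedFitResult class: it appends the SR-only parameters with their own variances and leaves their correlations to the fitted parameters at zero, exactly as described above; the numbers are invented.

```python
# Sketch of the idea behind RooExpandedFitResult (stand-in code, not the real
# class): parameters absent from the CRs-only fit are appended with their own
# variances, with zero correlation to the fitted parameters.
def expand_covariance(cov, extra_variances):
    n, m = len(cov), len(extra_variances)
    size = n + m
    expanded = [[0.0] * size for _ in range(size)]
    for i in range(n):                     # copy the fitted block
        for j in range(n):
            expanded[i][j] = cov[i][j]
    for k, var in enumerate(extra_variances):
        expanded[n + k][n + k] = var       # unshared parameters: uncorrelated
    return expanded

cov = [[0.04, 0.01],
       [0.01, 0.09]]
full = expand_covariance(cov, [0.25])      # one SR-only nuisance parameter
print(full[2])  # [0.0, 0.0, 0.25]
```

Feeding the expanded matrix into Eq. 5 then yields a complete error in any region, since the η_SR terms are no longer silently dropped.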
Presentation of results

HistFitter also contains an extensive array of user-friendly functions and scripts, which help with the understanding and detailing of the results obtained from the fits. These scripts and plotting functions are generalized, such that for every model built with HistFitter all of these features come without any need for further coding. All scripts and plotting functions can be called by single-line commands.

Two main presentation components are the visualization of fit results and scripts for producing event yield and uncertainty tables. Both rely critically on the fits to data and uncertainty extrapolation features discussed in Sec. 5. All tables and plots, discussed in the next two sections, can be produced for any fit configuration of a defined model, as well as before and after the fit to the data. Multiple details, such as the legends on plots or the set of regions to be processed for tables, can easily be set in the configuration file or from the command line. All tables and figures shown in this section come directly from publications by the ATLAS collaboration, and serve only as illustrations of the HistFitter tools that are discussed.

[Figure 6: the total PDF (CRs + VRs + SRs) is deconstructed into a CRs-only PDF (background-only fit), a CRs + VRs PDF (validation and extrapolation) and a CRs + SRs PDF (model-dependent and model-independent signal fits).]

Figure 6: An overview of the various PDFs HistFitter uses internally, together with their typical use. The large PDF for all regions is automatically deconstructed into separate, smaller ones defined on those subsets of regions, depending on the fit and/or statistical test performed. The PDFs are indicated as rounded squares, and the fit configurations as squares.

Visualization of fit results
HistFitter can produce several classes of figures to visualize fit results, as detailed below. Fig. 7 shows an example plot of a multi-bin (control) region before (left) and after (right) the fit to the data, taken from Ref. [2]. Similar plots can be produced for any region defined in HistFitter, either single- or multi-binned. Each sample in the example region (channel) is portrayed by a different color. The samples can also be plotted separately (not shown), for the purpose of understanding the distribution of the uncertainties over the samples. The impact of the fit to data can be studied by comparing the before-fit to after-fit distributions. In the after-fit plot, as a result of the fit, the normalization, shapes and corresponding uncertainties of the background samples
have been adjusted to best describe the observed data over all bins, illustrating the analysis strategy outlined in Sec. 2. An example of two multi-binned SRs, used in Ref. [19], is shown in Fig. 8. Two different supersymmetry models, for which each SR is sensitive to strong variations over the bins, are superimposed on the before-fit background predictions.

[Figure 7: missing transverse momentum distributions with data, the Standard Model total and its background components, shown before and after the fit; reproduced here only as its caption.]

Figure 7: Example produced by the ATLAS collaboration and taken from Ref. [2]. Distribution of missing transverse momentum in the single-lepton W+jets control region before (left) and after (right) the final fit to all background control regions.

[Figure 8: effective mass distributions in the signal regions SR1b and SR0b with data, the SM total and two SUSY benchmark models; reproduced here only as its caption.]

Figure 8: Example produced by the ATLAS collaboration and taken from Ref. [19]. Effective mass distributions in the signal regions SR0b and SR1b, used as input for model-dependent signal fits. For illustration, predictions are shown from two SUSY signal models with particular sensitivity in each signal region.
Fig. 9 shows an example of a pull distribution for a set of non-overlapping VRs, as produced with HistFitter and taken from Ref. [2]. (Note that one single-bin control region only has a handle on the normalization of one background sample, and not on the background shape.) The example relies on the background prediction in each VR, as obtained from a fit to the CRs, and tests the validity of the transfer factor-based extrapolation. The pull χ is calculated as the difference between the observed, n_obs, and predicted, n_pred, event numbers, divided by the total systematic uncertainty on the background prediction, σ_pred, added in quadrature to the Poissonian variation on the expected number of background events, σ_stat,exp:

\chi = \frac{n_{\mathrm{obs}} - n_{\mathrm{pred}}}{\sigma_{\mathrm{tot}}} \,, \qquad (6)

\sigma_{\mathrm{tot}} = \sqrt{\sigma_{\mathrm{pred}}^{2} + \sigma_{\mathrm{stat,exp}}^{2}} \,. \qquad (7)

If, on average, the pulls for all the validation regions were negative (positive), the data would be overestimated (underestimated) and the model would need to be corrected. If the background model is properly tuned, on average good agreement is found between the data and the estimated background model.

[Figure 9: pull (n_obs − n_pred)/σ_tot, in the range −3 to +3, for each validation region in the electron, muon and electron-muon channels; reproduced here only as its caption.]

Figure 9: Example produced by the ATLAS collaboration and taken from Ref. [2]. Summary of the fit results in the validation regions. The difference between the observed and predicted number, divided by the total (statistical and systematic) uncertainty on the prediction, is shown for each validation region.
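The pull of Eqs. 6 and 7 is a one-line calculation once the yields and uncertainties are known. The function below is illustrative, with invented numbers; it takes the Poissonian term as the square root of the predicted yield.

```python
import math

# Toy computation of the pull defined in Eqs. (6) and (7): the systematic
# uncertainty and the expected Poisson fluctuation are added in quadrature
# to form the total uncertainty on the prediction.
def pull(n_obs, n_pred, sigma_pred):
    sigma_stat = math.sqrt(n_pred)                        # Poissonian variation
    sigma_tot = math.sqrt(sigma_pred ** 2 + sigma_stat ** 2)
    return (n_obs - n_pred) / sigma_tot

# Invented example: 30 observed, 25 +/- 4 (syst) predicted.
print(round(pull(n_obs=30.0, n_pred=25.0, sigma_pred=4.0), 3))  # 0.781
```

A well-tuned background model gives pulls scattered around zero across the VRs, as in Fig. 9.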
Other validation plots can be produced, mostly helpful for internal or debugging purposes. Likelihood scans can be made for any of the fit parameters to help understand the likelihood maximization performed in the fit. Furthermore, the correlation matrix of any fit can be plotted to study correlations between the fit parameters and possible degenerate degrees of freedom. Those examples are not unique to HistFitter, and are therefore not shown here.
Scripts for event yield, systematic uncertainty and pull tables
The production of detailed tables showing the estimated background event levels, the number of observed events and the breakdown of the systematic uncertainties is an essential part of every analysis. HistFitter includes several scripts to produce publication-ready (LaTeX) tables. Table 3 shows the results of the background-only fit to the CRs, extrapolated to a set of SRs and broken down into the various background processes, as taken from Ref. [3] and produced with HistFitter. The total background prediction, combined with the number of observed events in a signal region, allows the discovery p-value or limit setting to be re-derived by others, externally from the original analysis. The background predictions before the fit are shown in parentheses. The error on the total background estimate shows the statistical (from limited MC simulation and CR statistics combined) and systematic uncertainties separately, while for the individual background samples the combined uncertainties are given as a single number. The uncertainties on the predicted background event yields are quoted as symmetric, except where the negative error reaches below zero predicted events, in which case the negative error is truncated to zero. The errors shown are the after-fit uncertainties, though before-fit uncertainties can also be shown by the table-production script.

[Table 3 (numerical entries not recoverable from the extracted text): fitted and pre-fit background yields per process (tt̄ + single top, Z+jets, W+jets, ...) and the total background estimate in the signal regions SR-A tight through SR-E tight.]

Table 3: Example produced by the ATLAS collaboration and taken from Ref. [3]. Illustration of observed numbers of events in data and fitted background components in each SR, as obtained from a background-only fit to CRs. For the total background estimates, the quoted uncertainties give the statistical (MC simulation and CR combined) and systematic uncertainties respectively. For the individual background components, the total uncertainties are given, while the values in parentheses indicate the pre-fit predictions.
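The truncation rule described above, where a downward error is cut off once the prediction minus the error would fall below zero events, amounts to the following; `truncated_errors` is a hypothetical helper name used only for illustration, not a HistFitter function:

```python
def truncated_errors(pred, err):
    """Symmetric uncertainty +/- err on a predicted yield `pred`,
    with the downward error truncated so that the prediction minus
    the error never falls below zero events (hypothetical helper
    illustrating the quoting convention described in the text)."""
    up = err
    down = min(err, pred)  # truncate so that pred - down >= 0
    return up, down

# e.g. a prediction of 0.4 events with a 0.7-event uncertainty
up, down = truncated_errors(0.4, 0.7)  # down is truncated to 0.4
```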
There are two methods implemented in HistFitter to calculate the systematic uncertainty on a background-level prediction of an analysis associated to a specific (set of) nuisance parameter(s), such as detector response effects or theoretical uncertainties.

1. The first method takes the nominal after-fit result and sets all floating parameters constant. Then, iteratively, it sets each (or several, as requested) nuisance parameter(s) η_i floating, and calculates the uncertainty propagated to the background prediction due to the specific parameter(s), using the covariance matrix of the nominal fit and Eq. 5.

2. The second method sets a single (or multiple, as requested) floating nuisance parameter(s) constant and then refits the data, thus excluding these systematic uncertainties from the model. The quadratic difference between the total error of the nominal setup and the fixed-parameter(s) setup is then assigned as the systematic uncertainty, as follows:

\sigma_\eta = \sqrt{ \left(\sigma_\mathrm{tot}^\mathrm{nominal}\right)^2 - \left(\sigma_\mathrm{tot}^{\eta = C}\right)^2 } .   (8)

Table 4 shows the systematic breakdown of the background estimate uncertainty in a set of signal regions, as produced with method one and taken from Ref. [19]. Each row shows the uncertainty corresponding to one or more nuisance parameters, as detailed in the reference.
[Table 4 (numerical entries not recoverable from the extracted text): number of observed events, total expected background events, and the breakdown of systematic uncertainties (fake-lepton background, E_T^miss scale and resolution, b-jet tagging, ttV/ttH/tZ/tt̄tt̄ theory, ...) for the signal regions SR3b, SR0b, SR1b, SR3Llow and SR3Lhigh.]

Table 4: Example produced by the ATLAS collaboration and taken from Ref. [19]. Number of observed data events and expected backgrounds and summary of the systematic uncertainties on the background predictions for SR3b, SR0b, SR1b, SR3Llow and SR3Lhigh. The breakdown of the systematic uncertainties on the expected backgrounds, expressed in units of events, is also shown. The individual uncertainties are correlated and therefore do not necessarily add up in quadrature to the total systematic uncertainty.
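The quadratic difference of Eq. 8 can be sketched as follows (a standalone illustration with assumed inputs; the arguments stand for the total fit errors of the nominal and fixed-parameter setups):

```python
import math

def quadratic_difference(sigma_tot_nominal, sigma_tot_fixed):
    """Systematic uncertainty assigned to the fixed parameter(s) via
    Eq. 8: the quadratic difference between the total error of the
    nominal fit and of the refit with the parameter(s) held constant.
    Guarded against a (numerically) larger fixed-parameter error."""
    diff = sigma_tot_nominal**2 - sigma_tot_fixed**2
    return math.sqrt(max(diff, 0.0))

sigma_eta = quadratic_difference(5.0, 4.0)  # sqrt(25 - 16) = 3.0
```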
Interpretation of results

HistFitter provides the functionality to perform hypothesis tests of the data through calls to the appropriate RooStats classes, and to interpret the corresponding results in the form of plots and tables. Four different statistical tests are available in HistFitter. Each of these depends on the fit setups outlined in Sec. 5.1. In each of these setups both the CR(s) and SR(s) are part of the input to the fit.

In the absence of an observed excess of events in one or more SR(s), the first two methods set exclusion limits on specific signal models. Both use the model-dependent signal fit configuration. The third approach obtains exclusion upper limits on any potential new physics signal, without model dependency. The fourth interpretation performs the significance determination of a potentially observed event excess. The latter two rely on the model-independent signal fit configuration.

These different statistical tests are discussed in the following sections. As discussed in Sec. 3.3, a Frequentist approach is used in all of the methods explained below, together with the CLs method in case of exclusion hypothesis tests.

In Sec. 5.1 we have introduced background-level estimates that are obtained from a background-only fit performed in the CRs only. For completeness we also introduce the concept of background-level estimates obtained from a background-only fit in both the CRs and SRs (sometimes confusingly called "unblinded" background-level estimates). These fit results are obtained with any model-(in)dependent signal fit setup, where the signal component has been turned off. As discussed in Sec. 3.3, the RooStats routines employed here, using the profile likelihood ratio as test statistic, perform such a background-only fit to both the CRs and SRs before running the p-value determination; a strategy using the most accurate background-level estimates available. A consequence of this is discussed in Sec. 7.3, together with an example.

All tables and figures shown in this section (except for Tab. 6) come directly from publications by the ATLAS collaboration and serve only as illustrations of the HistFitter tools that are discussed.

Signal model hypothesis test

The various tools and scripts provided in HistFitter to execute the signal hypothesis tests, as well as to visualize the results, are explained here. In the signal model hypothesis test, a specific model of new physics is tested against the background-only model assumption. A signal model prediction is present in all CRs and SRs, as implemented in the model-dependent signal limit fit configuration of Sec. 5.1. The parameter of interest used in these hypothesis tests is the signal strength parameter, where a signal strength of zero corresponds to the background-only model, and a signal strength of one to the background-plus-signal model. A fit of the background-plus-signal model is performed first, with the signal strength being a free normalization parameter, to obtain an idea about potential fit failures or problems in the later hypothesis testing.
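For reference, the CLs prescription used in the exclusion tests combines the p-values of the two hypotheses as sketched below (a minimal illustration; in HistFitter the value is computed by the RooStats machinery):

```python
def cls_value(p_sb, p_b):
    """CLs prescription: the p-value of the signal-plus-background
    hypothesis divided by one minus the p-value of the background-only
    hypothesis. A signal point is excluded at 95% confidence level
    when CLs < 0.05."""
    return p_sb / (1.0 - p_b)

excluded = cls_value(0.02, 0.50) < 0.05  # CLs = 0.04, so excluded
```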
The fit result is stored for later usage in the interpretation of the hypothesis test results.

Usually, signal hypothesis tests are run for multiple signal scenarios making up a specific model grid, e.g. by modifying a few parameters of a specific supersymmetry model. HistFitter provides the possibility to collect the results for the different signal scenarios in a plain-text data file, collecting in particular the observed and expected CLs values, but also the p-values for the various signals. Only results of hypothesis tests with a successful initial free fit are saved to this file. Another macro transforms these entries into two-dimensional histograms, for example showing the CLs values versus the SUSY parameter values (or particle masses) of the signal scenarios tested. A linear algorithm is used to interpolate the CLs values between signal model parameter values.

HistFitter provides macros to visualize the results of the hypothesis tests graphically. An example is shown in Fig. 10 (left), taken from Ref. [2]. The exclusion limits are shown at 95% confidence level, based on the CLs prescription, in a so-called one decay step (1-step) simplified model [31]. There are only two free parameters in these particular SUSY models, the gluino and neutralino masses m(g̃) and m(χ̃), which are used as the variables on the axes to represent this specific SUSY model grid. The dark dashed line indicates the expected limit as a function of gluino and neutralino masses and the solid red line the observed limit. The yellow band gives the 1σ uncertainty on the expected limit, excluding the theoretical uncertainties on the signal prediction. The dotted red lines show the impact of the theoretical uncertainties of the signal model prediction on the observed exclusion contour.

Figure 10: Examples produced by the ATLAS collaboration. Excluded regions at 95% confidence level in a 1-step simplified model (left), with initial gluino pair production and subsequent decay of the gluinos via g̃ → qq χ̃± → qqW χ̃, taken from Ref. [2]. Observed and expected limits on gluino-mediated top squark production (right), obtained from a simultaneous fit to four signal regions, together with the expected exclusion limits from the individual signal regions, taken from Ref. [19].

Likewise, Fig. 10 (right) shows the observed and expected exclusion limits on a gluino-mediated top squark production model, taken from Ref. [19], as obtained from the statistical combination of four multi-binned SRs performed with HistFitter. Besides the expected exclusion limit from the simultaneous fit to all SRs, the expected exclusion limits from the individual SRs are shown for comparison.

Signal strength upper limit

As in Sec. 7.1, we consider a specific signal model and the model-dependent signal limit fit configuration in this section. HistFitter provides the possibility to set an upper limit on the signal strength parameter μ_sig given the observed data in the signal regions. To do so, the value of the signal strength needs to be evaluated for which the CLs value falls below a certain level, usually 5% (for a 95% CL upper limit). In an initial scan, multiple hypothesis tests are executed using the asymptotic calculator [29] to evaluate the CLs values for a wide range of signal strength values.
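The two-stage scan can be sketched with a toy CLs curve standing in for the asymptotic-calculator evaluation; the function name, grid sizes and toy curve below are illustrative assumptions, not the HistFitter implementation:

```python
import numpy as np

def upper_limit_scan(cls_of_mu, mu_max=10.0, n_coarse=20, n_fine=40):
    """Coarse scan over the signal strength to bracket the point
    where CLs drops below 0.05, followed by a finer scan of the
    bracketed interval to refine the 95% CL upper limit."""
    coarse = np.linspace(0.0, mu_max, n_coarse)
    cls_coarse = np.array([cls_of_mu(m) for m in coarse])
    below = coarse[cls_coarse < 0.05]
    if below.size == 0:
        return None  # no exclusion within the scanned range
    hi = below[0]                            # first coarse point below 0.05
    lo = max(hi - mu_max / (n_coarse - 1), 0.0)
    fine = np.linspace(lo, hi, n_fine)
    cls_fine = np.array([cls_of_mu(m) for m in fine])
    return fine[np.argmax(cls_fine < 0.05)]  # first fine point below 0.05

# Toy, monotonically falling CLs curve crossing 0.05 at mu = ln(20) ~ 3.0
mu_ul = upper_limit_scan(lambda mu: np.exp(-mu))
```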
A second scan follows in a smaller, refined interval, using the expected upper limit derived from the first scan.

The obtained upper limit on the signal strength can then easily be converted into an upper limit on the excluded cross section of the signal model tested initially. These cross section upper limits are often displayed together with the limits obtained from the signal hypothesis test, introduced in Sec. 7.1. The example discussed in Fig. 10 (left) includes the cross section upper limits as grey numbers for each of the tested signal models.

Model-independent upper limit

[Table 5 (numerical entries not recoverable from the extracted text): 95% CL upper limits on the visible cross section ⟨σ_vis⟩ and on the observed and expected numbers of signal events, together with the discovery p-value, for the signal channels SR3b, SR0b, SR1b, SR3Llow and SR3Lhigh.]

Table 5: Example produced by the ATLAS collaboration and taken from Ref. [19]. The 95% CL upper limits on the visible cross section (⟨σ_vis⟩), defined as the product of acceptance, reconstruction efficiency and production cross section, and the observed and expected 95% CL upper limits on the number of signal events (S_obs^95 and S_exp^95). The last column shows the probability, capped at 0.5, that a background-only experiment is more signal-like than the observed number of events in a signal region (discussed in Sec. 7.4).
To obtain the 95% CL upper limit on the number of events from a "beyond the Standard Model" prediction for each SR, the fit in the SR proceeds in the same way as the background-only fit, except that the number of events observed in the signal region (evaluated in one bin) is added as an input to the fit. The signal strength parameter is constrained to be non-negative. The statistical test uses the model-independent signal fit configuration described in Sec. 5.1. This model-independent upper limit is evaluated using the same approach as in Sec. 7.2.

By normalizing the signal strength from the fit to the integrated luminosity of the data sample, and accounting for the uncertainty on the recorded luminosity, this can be interpreted as the upper limit on the visible cross section of new physics, σ_vis. Here σ_vis is defined as the product of acceptance, reconstruction efficiency and production cross section.

HistFitter includes a script to calculate and present the upper limits on the number of signal events and on the visible cross section in a publication-ready (LaTeX) table. An example, based on the background estimates of Tab. 4, is shown in Tab. 5.

As discussed in Sec. 3.3, the profile-likelihood based hypothesis tests use the background-level estimates obtained from a background-only fit to both the CRs and SRs (the best estimates available). For consistency, both the observed and expected upper limit (or p-value) determinations use the same background-level estimates, such that the expected limit is the most compatible and predictive assessment for the observed limit. As a consequence, the expected upper limit depends indirectly on the observed data.

This feature is demonstrated in Tab. 6, which shows a counting experiment with a constant background expectation and an increasing number of observed events, resulting in a consistent rise in the internal background-level estimates. As a result, the 95% CL upper limit on the expected number of signal events rises as a function of the number of observed events. This behavior, though perhaps counter-intuitive, is a consequence of the profile-likelihood based limit setting procedure employed here.

[Table 6 (numerical entries not recoverable from the extracted text): observed and expected 95% CL upper limits on the number of signal events for a fixed background expectation and an increasing number of observed events.]

Table 6: The observed and expected 95% CL upper limits on the number of signal events (S_obs^95 and S_exp^95), as a function of the background expectation and the observed number of events, as obtained with asymptotic formulas for a single-bin counting experiment. The third column shows the background estimate obtained from a fit to the expected background and observed number of events.

Background-only hypothesis test

For completeness, yet not tailored to HistFitter needs, the background-only hypothesis test quantifies the significance of an excess of events in the signal region by the probability that a background-only experiment is more signal-like than observed, also called the discovery p-value. The same fit configuration is used as in Sec. 7.3. An example of calculated discovery p-values is shown in the last column of Tab. 5. The probability of the SM background to fluctuate to the observed number of events or higher in each SR has been capped at 0.5.

The HistFitter software package is publicly available through the web page http://cern.ch/histfitter, which requires ROOT release v5.34.20 or greater. The web page contains a description of the source code, a tutorial on how to set up an analysis, and working examples of how to run and use the code. Support is provided on a best-effort basis.
Conclusion

We have presented a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the LHC at CERN.

HistFitter provides a programmable framework to build and test a set of data models of nearly arbitrary complexity. Starting from an input configuration, defined by users, it uses the software packages HistFactory, RooStats, RooFit and ROOT to construct PDFs that are fitted to data and interpreted with statistical tests, automatically.

HistFitter brings forth several innovative features. It provides a modular configuration interface with a trickle-down mechanism that is very efficient and intuitive for users. It has built-in concepts of control, signal and validation regions, with rigorous statistical treatment, tailored to support a complete particle physics analysis. It is capable of working with multiple data models at once, which introduces an additional level of abstraction that is powerful when searching for new phenomena in large experimental datasets. Finally, HistFitter provides a sizable collection of tools and options, resulting from experience gained during real-life analysis of ATLAS Run-1 data, that allows, through simple command-line commands, the presentation of end results with publication-level quality.
Acknowledgments
We are grateful to the RooFit, RooStats, and HistFactory authors for a fruitful collaboration and useful feedback, in particular to Kyle Cranmer, Lorenzo Moneta and Wouter Verkerke. We would like to thank our ATLAS colleagues for permission to reproduce the published figures and tables in Secs. 6 and 7 to illustrate the HistFitter tools discussed. We also thank the ATLAS collaboration and its SUSY physics group for useful discussions and suggestions for the development of HistFitter, and in particular thank Monica D'Onofrio and Jamie Boyd for their feedback on this paper. We are specifically grateful to the following members of the ATLAS SUSY physics group for their support and contributions to the development of HistFitter: Andreas Hoecker, Till Eifert, Zachary Marshall, Emma Sian Kuwertz, Evgeny Khramov, Sophio Pataraia and Marcello Barisonzi.

This work was supported by CERN, Switzerland; the DFG cluster of excellence "Origin and Structure of the Universe"; the Natural Sciences and Engineering Research Council of Canada and the ATLAS-Canada Subatomic Physics Project Grant; the Department of Energy and the National Science Foundation of the United States of America; FOM and NWO, the Netherlands; and STFC, United Kingdom.
A Example configuration
An example HistFitter configuration file is shown here, using the programmable components of Sec. 4. A description of the example follows below. The example configuration illustrates a single-bin counting experiment. In short:

• The single defined channel is called SR.

• This contains two background samples A and B, besides the data sample Data. All required inputs are extracted from the files fileA.root and fileB.root.

• There are two systematic uncertainties defined, treeSys and weightSys, where the latter is only applied to sample A. There is also a luminosity uncertainty, applied to both samples.

• There are two fit configuration objects created, one for a discovery hypothesis test, labeled Discovery, and one for a model-dependent exclusion fit, labeled Exclusion. These contain a (dummy) signal sample (predicting 1 signal event) and a specific signal sample, called Signal, respectively.

Additional options and comments are given in-line.

    from configManager import configMgr
    from ROOT import kBlack, kGreen, kAzure, kMagenta, kPink
    from configWriter import fitConfig, Measurement, Channel, Sample
    from systematic import Systematic

    configMgr.calculatorType = 2
    configMgr.testStatType = 3
    configMgr.nPoints = 20

    configMgr.analysisName = "MyOneBinExample"
    configMgr.histCacheFile = "data/" + configMgr.analysisName + ".root"
    configMgr.outputFileName = "results/" + configMgr.analysisName + "_Output.root"

    configMgr.inputLumi = 0.001
    configMgr.outputLumi = 4.713
    configMgr.setLumiUnits("fb-1")
    configMgr.blindSR = False
    configMgr.useSignalInBlindedData = False

    bgdFiles = []
    if configMgr.readFromTree:
        bgdFiles.append("fileA.root")
        bgdFiles.append("fileB.root")
    else:
        bgdFiles = [configMgr.histCacheFile]
        pass
    configMgr.setFileList(bgdFiles)

    configMgr.cutsDict["SR"] = "leptonPt > 20 && (( met > 160 && met/meff > 0.2 ) || met > 1000)"

    configMgr.weights = ("genWeight", "eventWeight", "leptonWeight", "triggerWeight")

    highWeights = ("eventWeight", "weightUp")
    lowWeights = ("eventWeight", "weightDown")
    weightSys = Systematic("KtScaleTop", configMgr.weights, highWeights, lowWeights, "weight", "overallSys")

    treeSys = Systematic("SYS", "_NoSys", "_SYS_up", "_SYS_down", "tree", "overallSys")
    configMgr.nomName = "_NoSys"

    aSample = Sample("A", kGreen)
    bSample = Sample("B", kAzure)
    dataSample = Sample("Data", kBlack)
    dataSample.setData()

    discoveryFitConfig = configMgr.addFitConfig("Discovery")
    meas = discoveryFitConfig.addMeasurement(name="NormalMeasurement", lumi=1.0, lumiErr=0.039)
    meas.addPOI("mu_SIG")

    discoveryFitConfig.addSamples([aSample, bSample, dataSample])
    discoveryFitConfig.getSample("A").addSystematic(weightSys)
    discoveryFitConfig.addSystematic(treeSys)

    SR = discoveryFitConfig.addChannel("cuts", ["SR"], 1, 0.5, 1.5)
    discoveryFitConfig.setSignalChannels([SR])
    SR.addDiscoverySamples(["SIG"], [1.], [0.], [100.], [kMagenta])

    exclusionFitConfig = configMgr.addFitConfig("Exclusion")
    meas = exclusionFitConfig.addMeasurement(name="NormalMeasurement", lumi=1.0, lumiErr=0.039)
    meas.addPOI("mu_SIG")

    exclusionFitConfig.addSamples([aSample, bSample, dataSample])
    exclusionFitConfig.getSample("A").addSystematic(weightSys)
    exclusionFitConfig.addSystematic(treeSys)
    SR = exclusionFitConfig.addChannel("cuts", ["SR"], 1, 0.5, 1.5)
    exclusionFitConfig.setSignalChannels([SR])

    sigSample = Sample("Signal", kPink)
    sigSample.setFileList(["signal.root"])
    sigSample.setNormByTheory()
    sigSample.setNormFactor("mu_SIG", 1., 0., 5.)
    exclusionFitConfig.addSamples(sigSample)
    exclusionFitConfig.setSignalSample(sigSample)
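As an aside, the selection string registered in configMgr.cutsDict["SR"] above can be rendered as a plain-Python predicate to make its logic explicit; `passes_sr` is a hypothetical helper for illustration only (in HistFitter the string is evaluated by ROOT on the input trees):

```python
def passes_sr(leptonPt, met, meff):
    """Plain-Python rendering of the SR selection string
    "leptonPt > 20 && (( met > 160 && met/meff > 0.2 ) || met > 1000)"
    from the example configuration (illustrative helper only)."""
    return leptonPt > 20 and ((met > 160 and met / meff > 0.2) or met > 1000)

passes_sr(35.0, 200.0, 800.0)   # met/meff = 0.25, event passes
passes_sr(35.0, 200.0, 1500.0)  # met/meff ~ 0.13 and met < 1000, event fails
```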
References

[1] ATLAS Collaboration, JINST 3, S08003 (2008). See http://atlas.web.cern.ch/Atlas/Collaboration/.
[2] ATLAS Collaboration, Phys. Rev. D86, 092002 (2012), [1208.4688].
[3] ATLAS Collaboration, Phys. Rev. D87, 012008 (2013), [1208.0949].
[4] ATLAS Collaboration, Phys. Rev. Lett., 211802 (2012), [1208.1447].
[5] ATLAS Collaboration, Phys. Rev. Lett., 211803 (2012), [1208.2590].
[6] ATLAS Collaboration, Phys. Lett. B718, 879 (2013), [1208.2884].
[7] ATLAS Collaboration, Phys. Lett. B718, 841 (2013), [1208.3144].
[8] ATLAS Collaboration, Phys. Lett. B720, 13 (2013), [1209.2102].
[9] ATLAS Collaboration, JHEP, 094 (2012), [1209.4186].
[10] ATLAS Collaboration, Eur. Phys. J. C72, 2215 (2012), [1210.1314].
[11] ATLAS Collaboration, JHEP, 124 (2012), [1210.4457].
[12] ATLAS Collaboration, Eur. Phys. J. C73, 2362 (2013), [1212.6149].
[13] ATLAS Collaboration, JHEP, 130 (2013), [1308.1841].
[14] ATLAS Collaboration, JHEP, 189 (2013), [1308.2631].
[15] ATLAS Collaboration, JHEP, 169 (2014), [1402.7029].
[16] ATLAS Collaboration, 1403.4853.
[17] ATLAS Collaboration, Eur. Phys. J. C74, 2883 (2014), [1403.5222].
[18] ATLAS Collaboration, JHEP, 071 (2014), [1403.5294].
[19] ATLAS Collaboration, JHEP, 035 (2014), [1404.2500].
[20] ATLAS Collaboration, JHEP, 075 (2013), [1210.4491].
[21] ATLAS Collaboration, 1405.4254.
[22] ATLAS Collaboration, ATLAS-CONF-2013-079 (2013).
[23] ROOT Collaboration, K. Cranmer, G. Lewis, L. Moneta, A. Shibata and W. Verkerke, CERN-OPEN-2012-016 (2012).
[24] L. Moneta et al., PoS ACAT2010, 057 (2010), [1009.1003].
[25] W. Verkerke and D. P. Kirkby, eConf C0303241, MOLT007 (2003), [physics/0306116].
[26] R. Brun and F. Rademakers, Nucl. Instrum. Meth.