A First Determination of Parton Distributions with Theoretical Uncertainties
NNPDF Collaboration, Rabah Abdul Khalek, Richard D. Ball, Stefano Carrazza, Stefano Forte, Tommaso Giani, Zahari Kassabov, Emanuele R. Nocera, Rosalyn L. Pearson, Juan Rojo, Luca Rottoli, Maria Ubiali, Cameron Voisey, Michael Wilson
CAVENDISH-HEP-19-07, DAMTP-2019-20, Edinburgh 2019/07, Nikhef 2019-013, TIF-UNIMI-2019-4
Department of Physics and Astronomy, VU Amsterdam, De Boelelaan 1081, NL-1081 HV Amsterdam, The Netherlands
Nikhef, Science Park 105, NL-1098 XG Amsterdam, The Netherlands
The Higgs Centre for Theoretical Physics, University of Edinburgh, JCMB, KB, Mayfield Rd, Edinburgh EH9 3JZ, Scotland
Tif Lab, Dipartimento di Fisica, Università di Milano and INFN, Sezione di Milano, Via Celoria 16, I-20133 Milano, Italy
Cavendish Laboratory, University of Cambridge, Cambridge, CB3 0HE, United Kingdom
Dipartimento di Fisica G. Occhialini, U2, Università degli Studi di Milano-Bicocca, Piazza della Scienza 3, 20126 Milano, Italy
INFN, Sezione di Milano-Bicocca, 20126 Milano, Italy
DAMTP, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WA, United Kingdom

October 18, 2019
Abstract.
The parton distribution functions (PDFs) which characterize the structure of the proton are currently one of the dominant sources of uncertainty in the predictions for most processes measured at the Large Hadron Collider (LHC). Here we present the first extraction of the proton PDFs that accounts for the missing higher order uncertainty (MHOU) in the fixed-order QCD calculations used in PDF determinations. We demonstrate that the MHOU can be included as a contribution to the covariance matrix used for the PDF fit, and then introduce prescriptions for the computation of this covariance matrix using scale variations. We validate our results at next-to-leading order (NLO) by comparison to the known next order (NNLO) corrections. We then construct variants of the NNPDF3.1 NLO PDF set that include the effect of the MHOU, and assess their impact on the central values and uncertainties of the resulting PDFs.
The search for new physics at present [1] and future [2] high-energy colliders, and specifically at the LHC, has turned from the mapping of the energy frontier to the exploration of the precision frontier: looking for subtle deviations from Standard Model predictions. In this endeavor, an accurate estimate of uncertainties associated with these predictions is crucial. At present, these uncertainties have two main origins. The first is the missing higher order uncertainty (MHOU) from the truncation of the QCD perturbative expansion. The second is related to knowledge of the structure of the colliding protons, as encoded in the parton distributions (PDFs) [3].

PDFs are extracted by comparing theoretical predictions to experimental data. Currently, PDF uncertainties only account for the propagated statistical and systematic errors on the measurements used in their determination. However, the same MHOU which affects predictions at the LHC also affects predictions for the various processes that enter the PDF determination. These MHOUs are currently neglected, perhaps because they are believed to be generally less important than experimental uncertainties. However, as PDFs become more precise, in particular thanks to ever tighter constraints from LHC data [4], MHOUs in PDF determinations will eventually become significant. Already in recent PDF sets making extensive use of LHC data, such as NNPDF3.1 [5], the shift between PDFs at next-to-leading order (NLO) and the next order (NNLO) is sometimes larger than the PDF uncertainties from the experimental data.

Here we present the first PDF extraction that systematically accounts for the MHOU in the QCD calculations used in the extraction. MHOUs are routinely estimated by varying the arbitrary renormalization ($\mu_r$) and factorization ($\mu_f$) scales of perturbative computations [1], though alternative methods have also been proposed [6,7,8].
Our inclusion of the MHOU in a PDF fit involves two steps: first we establish how theoretical uncertainties can be included in such a fit through a covariance matrix [9,10], and then we find a way of computing and validating the covariance matrix associated with the MHOU using scale variations [11]. By producing variants of NNPDF3.1 which include the MHOU, we are then able to finally address the long-standing question of their impact on state-of-the-art PDF sets. A detailed discussion of our results is presented in a companion paper [12], to which we refer for full computational details, definitions, proofs and results.

Assuming that theory uncertainties can be modeled as Gaussian distributions, in the same way as experimental systematics, then the associated theory covariance matrix $S_{ij}$ can be expressed in terms of nuisance parameters

$$S_{ij} = N \sum_k \Delta_i^{(k)} \Delta_j^{(k)}, \qquad (1)$$

where $\Delta_i^{(k)} = T_i^{(k)} - T_i^{(0)}$ is the expected shift with respect to the central theory prediction for the $i$-th cross-section, $T_i^{(0)}$, due to the theory uncertainty, and $N$ is a normalization factor determined by the number of independent nuisance parameters. Since theory uncertainties are independent of the experimental ones, the two can be combined in quadrature: the $\chi^2$ used to assess the agreement of theory and data is given by

$$\chi^2 = \sum_{i,j=1}^{N_{\rm dat}} \left( D_i - T_i^{(0)} \right) (S + C)^{-1}_{ij} \left( D_j - T_j^{(0)} \right), \qquad (2)$$

with $D_i$ the central experimental value of the $i$-th data point, and $C_{ij}$ the experimental covariance matrix. More details of the implementation of the theory covariance matrix in PDF fits may be found in Refs. [9,10].

The choice of nuisance parameters $\Delta_i^{(k)}$ used in Eq. (1) to estimate a particular theoretical uncertainty is not unique, reflecting the fact that such estimates always have some degree of arbitrariness.
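As a minimal numerical sketch of Eqs. (1) and (2), the snippet below builds a theory covariance matrix from a set of shift vectors and evaluates the $\chi^2$ with the combined matrix $C + S$. The function names, the `norm` argument (standing in for the normalization factor of Eq. (1)) and the toy numbers are illustrative assumptions, not the collaboration's code.

```python
import numpy as np

def theory_covmat(shifts, norm):
    """Theory covariance matrix built from nuisance-parameter shifts.

    shifts: array of shape (n_nuisance, n_dat); row k holds
            Delta^(k)_i = T^(k)_i - T^(0)_i.
    norm:   overall normalization factor multiplying the sum over k
            (an input here; its value depends on the prescription).
    """
    shifts = np.asarray(shifts, dtype=float)
    return norm * shifts.T @ shifts

def chi2(data, theory, exp_covmat, th_covmat):
    """chi^2 of Eq. (2), using the combined covariance matrix C + S."""
    resid = np.asarray(data, dtype=float) - np.asarray(theory, dtype=float)
    total = np.asarray(exp_covmat, dtype=float) + np.asarray(th_covmat, dtype=float)
    # Solve (C + S) x = resid rather than forming the inverse explicitly.
    return float(resid @ np.linalg.solve(total, resid))

# Toy example (hypothetical numbers): three data points, two varied predictions.
data = np.array([1.0, 2.0, 3.0])
central = np.array([1.1, 1.9, 3.2])
varied = np.array([[1.2, 2.0, 3.3],
                   [1.0, 1.85, 3.1]])
C = np.diag([0.01, 0.01, 0.01])
S = theory_covmat(varied - central, norm=0.5)

print("chi2 with C only:", chi2(data, central, C, np.zeros_like(C)))
print("chi2 with C + S :", chi2(data, central, C, S))
```

Because $S$ is positive semi-definite by construction, adding it to $C$ can only lower (or leave unchanged) the $\chi^2$ at fixed central predictions.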
Here we focus on the MHOU, and choose to use scale variations to estimate $\Delta_i^{(k)}$. A standard procedure [1] is the so-called 7-point prescription, in which the MHOU is estimated from the envelope of results obtained with the following scales

$$(k_f, k_r) \in \left\{ (1,1),\ (2,2),\ (\tfrac{1}{2},\tfrac{1}{2}),\ (2,1),\ (1,2),\ (\tfrac{1}{2},1),\ (1,\tfrac{1}{2}) \right\},$$

where $k_r = \mu_r/\mu_r^{(0)}$ and $k_f = \mu_f/\mu_f^{(0)}$ are the ratios of the renormalization and factorization scales to their central values. Varying $\mu_r$ estimates the MHOU in the hard coefficient function of the specific process, while the $\mu_f$ variation estimates the MHOU in PDF evolution.

In order to compute a covariance matrix, we must not only choose a set of scale variations, but also make some assumptions about the way they are correlated. We do this by, first of all, classifying the input datasets used in PDF fits into processes as indicated in Table 1: charged-current (CC) and neutral-current (NC) deep-inelastic scattering (DIS), Drell-Yan (DY) production of gauge bosons (invariant mass, transverse momentum, and rapidity distributions), single-jet inclusive and top pair production cross-sections. Note that this step requires making an educated guess as to which cross-sections are likely to have a similar structure of higher-order corrections.

Process Type | Datasets
DIS NC | NMC, SLAC, BCDMS, HERA NC
DIS CC | NuTeV, CHORUS, HERA CC
DY | CDF, D0, ATLAS, CMS, LHCb ($y$, $p_T$, $M_{ll}$)
JET | ATLAS, CMS inclusive jets
TOP | ATLAS, CMS total+differential cross-sections

Table 1. Classification of datasets into process types.
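The 7-point set of scale ratios quoted above can be generated programmatically, which is convenient when looping over scale-varied theory predictions. The sketch below is only an illustration: the helper names and the central scale of 91.2 GeV in the usage loop are hypothetical choices.

```python
from fractions import Fraction
from itertools import product

HALF, ONE, TWO = Fraction(1, 2), Fraction(1), Fraction(2)

def seven_point():
    """(k_f, k_r) pairs of the 7-point prescription, central point first."""
    grid = product((HALF, ONE, TWO), repeat=2)
    # Drop the two points where the scales move in opposite directions.
    excluded = {(HALF, TWO), (TWO, HALF)}
    points = [p for p in grid if p not in excluded]
    # Stable sort on a boolean key: the central point (1, 1) comes first.
    points.sort(key=lambda p: p != (ONE, ONE))
    return points

def scaled(mu_f0, mu_r0, point):
    """Apply one (k_f, k_r) variation to the central scales."""
    k_f, k_r = point
    return float(k_f) * mu_f0, float(k_r) * mu_r0

# Usage with a hypothetical central scale (in GeV) for both mu_f and mu_r.
for pt in seven_point():
    print(pt, scaled(91.2, 91.2, pt))
```

Exact fractions are used for the ratios so that set membership tests do not depend on floating-point rounding.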
Next, we formulate a variety of prescriptions for how to construct Eq. (1) by picking a set of scale variations and correlation patterns. A simple possibility is the 3-point prescription, in which we vary both scales coherently (thus setting $k_f = k_r$) by a fixed amount about the central value, independently for each process. More sophisticated prescriptions vary the two scales independently, but by the same amount, and assume that while $\mu_r$ is only correlated within a given process, $\mu_f$ is fully correlated among processes. This assumption is based on the observation that $\mu_f$ variations estimate the MHOU in the evolution equations, which are universal (process-independent), though it is an approximation given that the evolution of different PDFs is governed by different anomalous dimensions, which do not necessarily share the same MHO corrections.

We then proceed to the validation of the resulting covariance matrices at NLO. We use the same experimental data and theory calculations as in the NNPDF3.1 $\alpha_s$ study [13] with two minor differences: the value of the lower kinematic cut has been increased from $Q^2 = 2.69$ GeV$^2$ to $13.96$ GeV$^2$ in order to ensure the validity of the perturbative QCD expansion when scales are varied downwards, and the HERA $F_2^b$ and fixed-target Drell-Yan cross-sections have been removed, for technical reasons related to difficulties in implementing scale variation. In total we then have $N_{\rm dat} = 2819$ data points. The theory covariance matrix $S_{ij}$ has been constructed by means of the ReportEngine software [14] taking as input the scale-varied NLO theory cross-sections $T_i(k_f, k_r)$, provided by APFEL [15] for the DIS structure functions and by APFELgrid [16] combined with APPLgrid [17] for the hadronic cross-sections.

Since for the processes in Table 1 the NNLO predictions are known, we can validate the NLO covariance matrix against the known NNLO results. For this exercise, a common input NLO PDF is used in both cases. In order to validate the diagonal elements of $S_{ij}$, which correspond to the overall size of the MHOU, we first normalize it to the central theory prediction, $\widehat{S}_{ij} = S_{ij}/(T_i^{(0)} T_j^{(0)})$. Then we compare in Fig. 1 the relative uncertainties, $\sigma_i = \sqrt{\widehat{S}_{ii}}$, to the relative shifts between predictions at NLO and NNLO, $\delta_i = (T_i^{(0),{\rm nnlo}} - T_i^{(0),{\rm nlo}})/T_i^{(0),{\rm nlo}}$, for each of the $N_{\rm dat} = 2819$ observables. In all cases, $\delta_i$ turns out to be smaller than or comparable to $\sigma_i$, showing that the 9-point prescription used in Fig. 1 provides a good (if somewhat conservative) estimate of the diagonal theory uncertainties.
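The diagonal check described above amounts to a few array operations. The following sketch, with hypothetical function names and toy numbers, computes $\sigma_i$ from a theory covariance matrix and compares it with the relative NLO-NNLO shift $\delta_i$.

```python
import numpy as np

def relative_uncertainty(th_covmat, central):
    """sigma_i: square root of the diagonal of S_ij / (T_i T_j)."""
    central = np.asarray(central, dtype=float)
    norm_cov = np.asarray(th_covmat, dtype=float) / np.outer(central, central)
    return np.sqrt(np.diag(norm_cov))

def relative_shift(nnlo, nlo):
    """delta_i = (T_nnlo - T_nlo) / T_nlo, element by element."""
    nnlo = np.asarray(nnlo, dtype=float)
    nlo = np.asarray(nlo, dtype=float)
    return (nnlo - nlo) / nlo

# Toy inputs (hypothetical): three observables.
nlo = np.array([10.0, 20.0, 5.0])
nnlo = np.array([10.5, 19.0, 5.1])
S = np.diag([1.0, 4.0, 0.04])

sigma = relative_uncertainty(S, nlo)
delta = relative_shift(nnlo, nlo)
# True where the scale-variation estimate covers the actual shift.
covered = np.abs(delta) <= sigma
print(sigma, delta, covered)
```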
Fig. 1. The relative uncertainties $\sigma_i$ (9-point prescription) on the 2819 data points used in the PDF fit, compared to the known NLO-NNLO relative shifts $\delta_i$ in theory prediction.

The validation of the full covariance matrix including correlations is more subtle. We first diagonalize $\widehat{S}_{ij}$, by finding the (orthonormal) eigenvectors $e_i^a$ which correspond to positive eigenvalues $(s^a)^2$: these define a subspace $S$ orthogonal to the large null subspace. The dimension $N_S$ of $S$ depends on the total number of independent scale variations, the number of processes, and the correlation pattern. Its determination is nontrivial: it requires first computing the total number of distinct scale variations for any pair of processes, i.e., the total number of vectors $\Delta^{(k)}$ in Eq. (1), and then determining the full set of linear relations between them in order to establish how many of them are independent (see Ref. [12]).

For the 5 processes in Table 1, and the 9-point prescription, we find $N_S = 28$, while for the simpler 3-point prescription $N_S = 6$. We then compute the $N_S$ projections $\delta^a$ of the NLO-NNLO shifts $\delta_i$ along each eigenvector, and compare them to the square root of the corresponding eigenvalues, $s^a$. Finally we compute the length $|\delta_i^{\rm miss}|$ of the remaining component of the vector $\delta_i$ that lies in the null subspace of $\widehat{S}$.

The validation can be considered successful if the angle $\theta = \arcsin(|\delta_i^{\rm miss}|/|\delta_i|)$ is small, meaning that the NNLO-NLO shift lies substantially within the subspace $S$ estimated by the scale variations, and furthermore if $|\delta^a| \simeq |s^a|$, so that the size of the shift along each eigenvector is correctly estimated by the corresponding eigenvalue. Using the 9-point prescription, for individual processes we find $\theta$ = 3°, …°, …°, …°, …° for top, jets, DY, NC and CC DIS respectively. For the complete dataset with the same prescription we find $\theta = 26^\circ$.

The projected shifts and eigenvalues are compared in Fig. 2.
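The eigenvector-based validation just described can be sketched in a few lines of linear algebra. The code below is a hedged illustration rather than the procedure of Ref. [12]: the function name, the relative threshold `rtol` used to decide which eigenvalues count as positive, and the toy inputs are all assumptions.

```python
import numpy as np

def validate_subspace(norm_cov, delta, rtol=1e-10):
    """Project a shift vector onto the non-null eigenvectors of a covariance matrix.

    Returns (s, proj, delta_miss, theta_deg): the square roots of the positive
    eigenvalues (largest first), the projections of delta on the matching
    eigenvectors, the component of delta in the null subspace, and the
    angle theta in degrees.
    """
    norm_cov = np.asarray(norm_cov, dtype=float)
    delta = np.asarray(delta, dtype=float)
    evals, evecs = np.linalg.eigh(norm_cov)       # eigenvalues in ascending order
    keep = evals > rtol * evals.max()             # numerically positive eigenvalues
    order = np.argsort(evals[keep])[::-1]         # largest eigenvalue first
    s = np.sqrt(evals[keep][order])
    basis = evecs[:, keep][:, order]              # orthonormal columns
    proj = basis.T @ delta
    delta_miss = delta - basis @ proj
    ratio = np.linalg.norm(delta_miss) / np.linalg.norm(delta)
    theta = np.degrees(np.arcsin(min(1.0, ratio)))  # guard against rounding above 1
    return s, proj, delta_miss, theta

# Toy example (hypothetical): two shift vectors span the subspace in 4 dimensions.
shifts = np.array([[0.10, 0.10, 0.00, 0.00],
                   [0.00, 0.05, 0.05, 0.00]])
S_hat = 0.5 * shifts.T @ shifts
delta = np.array([0.08, 0.10, 0.03, 0.01])

s, proj, delta_miss, theta = validate_subspace(S_hat, delta)
print("sqrt eigenvalues:", s)
print("projections     :", proj)
print("theta (degrees) :", theta)
```

The sign of each projection is arbitrary, since eigenvectors are defined only up to a sign; only $|\delta^a|$ is meaningful in the comparison with $s^a$.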
The size of the eigenvalues generally falls as the projected shifts get smaller. For the six largest eigenvalues, the eigenvalue is always larger than the shift and, in all but two cases, of very similar size to the shift. The seventh eigenvalue is smaller than, but of the same order as, the shift, while the eighth eigenvalue significantly underestimates the shift. However, given that the eighth eigenvalue is already one order of magnitude smaller than the first, this means that most of the shift is well described by the theory covariance matrix, and somewhat overestimated by it in just a few cases. We conclude that the validation is successful: remarkably, the pattern of correlations of theory shifts in a 2819-dimensional vector space is well captured by just 28 nuisance parameters.

Fig. 2. The square root eigenvalues $s^a$ of the theory covariance matrix $\widehat{S}_{ij}$ computed using the 9-point prescription, and the projections $\delta^a$ of the NNLO-NLO shift vector $\delta_i$ on the eigenvectors. The length $|\delta_i^{\rm miss}|$ of the component of $\delta_i$ lying in the null subspace of $\widehat{S}_{ij}$ is also shown. [Plot legend: $|\delta^a|$, $|s^a|$, $|\delta^{\rm miss}|$, and the ratio $|\delta^a/s^a|$ with a reference line at $|\delta^a/s^a| = 1$; number of eigenvalues = 28.]

Adding the theory covariance matrix $S_{ij}$ to the experimental covariance matrix $C_{ij}$, while increasing the diagonal uncertainty on each individual prediction, also (and perhaps more importantly) introduces a set of theory-induced correlations between different experiments and processes, even when the experimental data points are uncorrelated. This is illustrated in Fig. 3, showing the combined experimental and theoretical (9-point) correlation matrix: it is clear that sizable correlations appear even between experimentally unrelated measurements.

[Table 2: columns $C$, $C + S$ (3pt), $C + S$ (9pt); rows $\chi^2$ per data point, $\phi$.]

Table 2. The central $\chi^2$ per data point and the average uncertainty reduction $\phi$ for the 3-point and 9-point fits.

We can now proceed to an NLO global PDF determination with a theory covariance matrix $S_{ij}$ computed using the 9-point prescription. From the point of view of the NNPDF fitting methodology, the addition of the theory contribution to the covariance matrix does not entail any changes: we follow the procedure of Ref. [18], but with the covariance matrix $C_{ij}$ now replaced by $C_{ij} + S_{ij}$, both in the Monte Carlo replica generation and in the fitting. In Table 2 we show some fit quality estimators for the resulting PDF sets, obtained using only the experimental covariance matrix and using in addition the theory covariance matrix with two different prescriptions.
In particular, we show the $\chi^2$ per data point and the $\phi$ estimator [18], which gives the ratio of the uncertainty in the predictions using the output PDFs to that of the original data, averaged in quadrature over all data. The quality of the fit is improved by the inclusion of the MHOU, with the 9-point prescription performing rather better than the 3-point one. Interestingly, $\phi$ only increases by around 30% when one includes the theory covariance matrix, much less than the 70% one would expect taking into account the relative size of the NLO MHOU and experimental uncertainties. This means that in the region of the data, taking the MHOU into account increases the PDF uncertainties only rather moderately. This suggests that the addition of the MHOU is resolving some of the tension between data and theory, so that the larger overall uncertainty is partly compensated by the improved fit quality, though of course the highly correlated nature of theory uncertainties also plays a role in reducing their impact.

Fig. 3. The combined experimental and theoretical (9-point) correlation matrix for the $N_{\rm dat}$ cross-sections in the fit. Plot title: Experimental + Theory Correlation Matrix (9 pt).

In Fig. 4 we compare at $Q = 10$ GeV the gluon and quark singlet PDFs obtained at NLO with and without a theory covariance matrix, normalized to the latter. We also show the central NNLO result when the theory covariance matrix is not included. Three features of this comparison are apparent. First, when including the MHOU, the increase in PDF uncertainty in the data region is quite moderate, in agreement with the $\phi$ values of Table 2. Second, the NLO-NNLO shift is fully compatible with the overall uncertainty. Finally, the central value is also modified by the inclusion of $S_{ij}$ in the fit, as the balance between different data sets adjusts according to their relative theoretical precision. Interestingly, the central prediction shifts towards the known NNLO result, showing that, thanks to the inclusion of the MHOU, the overall fit quality has improved.

Fig. 4. The gluon and quark singlet PDFs from the NNPDF3.1 NLO fits without and with the MHOU (9-point) in the covariance matrix at $Q = 10$ GeV, normalized to the former. The central NNLO result is also shown. [Panels: $g(x,Q)/g(x,Q)[{\rm ref}]$ and $\Sigma(x,Q)/\Sigma(x,Q)[{\rm ref}]$ versus $x$, NNPDF3.1 Global, $Q = 10$ GeV; legend: NLO, C; NLO, C+S (9pt); NNLO, C.]

Finally, in Fig. 5 we show the dependence of the fit results on the specific choice of prescription for $S_{ij}$, comparing the 3- and 9-point cases, normalized to the latter. In general the two results are consistent, but results with the 3-point prescription have somewhat smaller uncertainties and, more importantly, their central value is closer to that obtained when the MHOU is not included (see Fig. 4). The improved agreement between the NLO and full NNLO results noted in Fig. 4 would therefore be mostly lost if the 3-point prescription were adopted, which provides further support for preferring the 9-point prescription.

Fig. 5. Same as Fig. 4 for the gluon, comparing the 3-point and 9-point prescriptions as a ratio to the latter. [Legend: NLO, C+S (9pt); NLO, C+S (3pt).]

It is important to understand that the meaning of PDFs and their uncertainties changes once the theory covariance matrix is included, so the error bands, e.g. in Fig. 4, have a different meaning according to whether the theory covariance matrix is included. When it is included, PDF uncertainties account for data and methodological uncertainties, but also for MHOUs. Also, their central values now optimize the agreement with data based on a $\chi^2$ which includes MHOUs.

The usage of these PDFs is accordingly different. First, they should be combined with hard cross-sections which also include MHOU. The MHOU on the prediction and the PDF uncertainty (now also including MHOUs) should be combined in the standard way (i.e. in quadrature), since with a universal PDF it is not possible to keep track of the correlations (which surely exist) between the MHOU in processes used for PDF determination and the MHOU in the prediction itself. This neglected correlation is likely to be a small effect in most situations [12], and it leads to a conservative uncertainty estimate. Second, it is important to keep in mind that MHOUs in the theory prediction must be included in the computation of the $\chi^2$ when assessing the agreement of these PDFs with new data, since, as we have seen, their central value is shifted as a consequence of the inclusion of the MHOUs.

In summary, we have presented the first global PDF analysis that accounts for the MHOU associated with the fixed order QCD perturbative calculations used in the fit. The inclusion of the MHOU shifts central values by an amount that is not negligible on the scale of the PDF uncertainty, moving the NLO result towards the NNLO result. PDF uncertainties increase only moderately, because of the improvement of fit quality due to the rebalancing of datasets according to their theoretical precision. For this to be effective, the correlations in $S_{ij}$ play a crucial role. These correlations are rather more extensive than those related to experimental systematics, since all different measurements of the same process are correlated through their common MHO corrections, and different processes are correlated through MHO corrections to perturbative evolution. A more accurate treatment of these correlations (especially those related to perturbative evolution) will be the subject of future studies.

Our results pave the way towards a fully consistent treatment of MHOU for precision LHC phenomenology. The NLO results presented here will be upgraded to NNLO, facilitated by tools such as the APPLfast grid interface to the NNLOJET program [19]. We thus anticipate that the upcoming NNPDF4.0 PDF set will be able to fully account for MHOU both at NLO and NNLO, as well as other sources of theory uncertainty, such as those related to nuclear corrections [10,20].
Acknowledgments.
R. D. B. is supported by the UK Science and Technology Facilities Council through grant ST/P000630/1. S. F. is supported by the European Research Council under the European Union's Horizon 2020 research and innovation Programme (grant agreement n.740006). T. G. is supported by The Scottish Funding Council, grant H14027. Z. K. is supported by the European Research Council Consolidator Grant "NNLOforLHC2". E. R. N. is supported by the European Commission through the Marie Skłodowska-Curie Action ParDHonS FFs.TMDs (grant number 752748). R. L. P. and M. W. are supported by the STFC grant ST/R504737/1. J. R. is supported by the European Research Council Starting Grant "PDF4BSM" and by the Netherlands Organization for Scientific Research (NWO). L. R. is supported by the European Research Council Starting Grant "REINVENT" (grant number 714788). M. U. is partially supported by the STFC grant ST/L000385/1 and funded by the Royal Society grants DH150088 and RGF/EA/180148. C. V. is supported by the STFC grant ST/R504671/1.
References
1. D. de Florian, et al., (2016), 1610.07922
2. M. Cepeda, et al., (2019), 1902.00134
3. J. Gao, L. Harland-Lang, J. Rojo, Phys. Rept. , 1 (2018), 1709.04922
4. J. Rojo, et al., J. Phys. G42, 103103 (2015), 1507.00556
5. R.D. Ball, et al., Eur. Phys. J. C77(10), 663 (2017), 1706.00428
6. M. Cacciari, N. Houdeau, JHEP , 039 (2011), 1105.5152
7. A. David, G. Passarino, Phys. Lett. B726, 266 (2013), 1307.1843
8. E. Bagnaschi, M. Cacciari, A. Guffanti, L. Jenniches, JHEP , 133 (2015), 1409.5036
9. R.D. Ball, A. Deshpande, (2018), 1801.04842
10. R.D. Ball, E.R. Nocera, R.L. Pearson, Eur. Phys. J. C79(3), 282 (2019), 1812.09074
11. R.L. Pearson, C. Voisey, Nucl. Part. Phys. Proc. , 24 (2018), 1810.01996
12. R. Abdul Khalek, et al., (2019), 1906.10698
13. R.D. Ball, S. Carrazza, L. Del Debbio, S. Forte, Z. Kassabov, J. Rojo, E. Slade, M. Ubiali, Eur. Phys. J. C78(5), 408 (2018), 1802.03398
14. Z. Kassabov, Reportengine: A framework for declarative data analysis, https://doi.org/10.5281/zenodo.2571601 (2019)
15. V. Bertone, S. Carrazza, J. Rojo, Comput. Phys. Commun. , 1647 (2014), 1310.1394
16. V. Bertone, S. Carrazza, N.P. Hartland, Comput. Phys. Commun. , 205 (2017), 1605.02070
17. T. Carli, et al., Eur. Phys. J.