The Bjorken sum rule with Monte Carlo and Neural Network techniques
FREIBURG-PHENO-09/04
L. Del Debbio
SUPA, School of Physics and Astronomy, University of Edinburgh, Edinburgh EH9 3JZ, Scotland
A. Guffanti
Physikalisches Institut, Albert-Ludwigs-Universität, Hermann-Herder-Straße 3, 79104 Freiburg, Germany
A. Piccione
I.T.I.S. Pininfarina, Via Ponchielli 16, 10024 Moncalieri, Italy, and I.N.F.N. Milano, Via Celoria 16, 20133 Milano, Italy
Abstract:
Determinations of structure functions and parton distribution functions have recently been obtained using Monte Carlo methods and neural networks as universal, unbiased interpolants for the unknown functional dependence. In this work the same methods are applied to obtain a parametrization of polarized Deep Inelastic Scattering (DIS) structure functions. The Monte Carlo approach provides a bias-free determination of the probability measure in the space of structure functions, while retaining all the information on experimental errors and correlations. In particular, the error on the data is propagated into an error on the structure functions that has a clear statistical meaning. We present the application of this method to the parametrization, from polarized DIS data, of the photon asymmetries $A_1^p$ and $A_1^d$, from which we determine the structure functions $g_1^p(x,Q^2)$ and $g_1^d(x,Q^2)$, and discuss the possibility of extracting physical parameters from these parametrizations. This work can be used as a starting point for the determination of polarized parton distributions.

Keywords: Polarized DIS, structure functions, $g_1^p$, Neural Networks.

Contents
1. Introduction
2. Experimental data
3. The NNPDF approach
4. Phenomenology
5. Results
6. Conclusions
A. Experimental errors
1. Introduction
In QCD the description of scattering processes at large momentum transfer ($Q^2 \gg \Lambda_{\rm QCD}^2$) involving (polarized) hadrons in the initial state is based on the factorization theorem. The latter allows a separation of the high-energy dynamics, described by coefficient functions which are calculable in perturbative QCD, from low-energy, non-perturbative effects, binding partons into hadrons, which are encoded into (polarized) parton distribution functions (PDFs).

The growth in statistics and increase in precision of data from experiments involving polarized hadron scattering calls for a more accurate determination of polarized PDFs and their errors. A crucial problem in this respect is the determination of the uncertainty on a function (i.e. a probability measure on a space of functions) from a finite set of experimental data points. In the standard approach to PDF extraction the infinite-dimensional space of continuous functions is mapped into a finite-dimensional space of parameters by choosing a particular basis in the space of functions and truncating the basis to a finite number of elements. This procedure entails some degree of arbitrariness. Any sensible choice must strike a balance between two competing requirements: on the one hand a small number of parameters introduces a bias in the determination of both the functional form and the errors, as the chosen parametrization would not allow enough flexibility; on the other hand a large number of parameters could spoil the convergence of the fit, or be too sensitive to the statistical fluctuations of the experimental data.

This problem has been addressed by the NNPDF Collaboration in the case of unpolarized Deep Inelastic Scattering (DIS) structure functions in Refs. [1, 2], and in the case of the unpolarized PDFs in Refs.
[3–5] using a method based on statistical inference and neural networks as an interpolating tool.

While avoiding technical complications linked to the extraction of PDFs from observables, the determination of structure functions addresses the main issue of devising a faithful estimation of errors on a function extracted from experimental data. The main ingredient in the studies above is the usage of Monte Carlo methods to obtain a representation of the probability measure in the space of structure functions. An ensemble of artificial data is generated, which reproduces all the statistical features (i.e. variances and correlations) of the original experimental data. Each set of artificial data is called a replica. A structure function, parametrized by a neural network, is then fitted to each replica. The net result of this procedure is an ensemble of fitted functions, which provides a representation of the measure in the space of structure functions. Errors and correlations of any observable involving the structure functions are obtained by averaging over the ensemble of fits. Moreover, suitable statistical estimators can be defined from the Monte Carlo ensemble which provide a quantitative description of the possible biases and inconsistencies in the fitting procedure. This method has been described in great detail in Refs. [1–4], to which the interested reader should refer.

The aim of this work is to apply the same techniques to obtain a bias-free parametrization of the photon asymmetries $A_1^p$ and $A_1^d$ from available polarized DIS data and extract from them the corresponding structure functions $g_1^p$ and $g_1^d$. We provide further testing of the Monte Carlo method, and produce statistically meaningful error bars for the structure function.
Besides allowing us to address all systematics related to the data and the method, such a parametrization might be an ideal input for a fit based on factorization scheme-invariant evolution equations to determine $\alpha_s$, as proposed in Refs. [6, 7]. As shown in this work, a careful treatment of statistical and systematic errors leads to a reliable extraction of physically meaningful parameters such as $\alpha_s$, $g_A$, and the higher-twist contributions to the structure functions. While these are not the best determinations available for these parameters, the results we obtain are in agreement with other determinations, and show the robustness of the Monte Carlo method.

We shall now discuss in turn the two steps that are needed to produce the Monte Carlo sample of fitted functions: first the treatment of the experimental data, and then the actual fitting procedure. The experimental data points included in the fit are discussed in Section 2; Section 3 briefly summarizes the NNPDF approach and the characteristics of the neural networks used for this particular study. The results of our fits, together with their phenomenological implications, are presented and discussed in Sections 4 and 5.

2. Experimental data

The cross section asymmetry for parallel and anti-parallel configurations of longitudinal beam and target polarizations is given by:
\[ A_{\parallel} = \frac{\sigma^{\uparrow\downarrow} - \sigma^{\uparrow\uparrow}}{\sigma^{\uparrow\downarrow} + \sigma^{\uparrow\uparrow}} \tag{2.1} \]
and it is related to the virtual-photon asymmetries $A_1$, $A_2$ by:
\[ A_{\parallel} = D\,(A_1 + \eta A_2) \simeq D A_1 . \tag{2.2} \]
The photon depolarization factor $D$ depends on kinematic factors and on the ratio:
\[ R(x,Q^2) = \frac{\sigma_L(x,Q^2)}{\sigma_T(x,Q^2)} , \tag{2.3} \]
where $\sigma_L$ and $\sigma_T$ are the longitudinal and the transverse cross sections respectively (see e.g. Refs.
[8, 9] for a detailed definition of all the quantities).

The polarized structure functions $g_1$ and $g_2$ are related to the virtual-photon asymmetries by:
\[ A_1(x,Q^2) = \frac{g_1(x,Q^2) - \gamma^2 g_2(x,Q^2)}{F_1(x,Q^2)} , \tag{2.4} \]
\[ A_2(x,Q^2) = \gamma\,\frac{g_1(x,Q^2) + g_2(x,Q^2)}{F_1(x,Q^2)} ; \tag{2.5} \]
where
\[ F_1(x,Q^2) = \frac{(1+\gamma^2)}{2x\,[1+R(x,Q^2)]}\, F_2(x,Q^2) \tag{2.6} \]
is the unpolarized structure function, $\gamma^2 = 4 m^2 x^2/Q^2$, and $m$ denotes the nucleon mass.

The main features of each experimental data set used in the present analysis are summarized in Tab. 1, and their kinematical coverage of the $(x,Q^2)$-plane is shown in Fig. 1. We observe that the kinematical coverage of the available data is rather small, especially when compared to that of the available unpolarized DIS data; thus there is a sizable region of the kinematical plane in which the fit extrapolates the behaviour extracted from the region covered by data.

From Tab. 1 we infer that the systematic errors are on average one order of magnitude smaller than the statistical ones. This justifies the procedure of neglecting correlations for systematic errors and the procedure of summing errors in quadrature when computing the figure of merit ($\chi^2$) to be minimized in the fitting procedure.

Finally we notice that E155 data have been corrected to yield $A_1$ by adding in Eq. (2.4) the $g_2$ contribution evaluated with the Wandzura–Wilczek relation and using the parametrization of $g_1/F_1$ given in Ref. [14]: this shift is also added as a source of uncertainty in the total error of the data set.
Experiment   x range            Q^2 range (GeV^2)   N_dat   <sigma_stat>   <sigma_syst>   <sigma_norm>   Type      Ref.
Proton
EMC          0.015 - 0.466      3.5 - 29.5          10      0.077          0.024          0.028          A_1       [10]
SMC          0.001 - 0.480      0.3 - 58.0          15      0.026          0.003          0.012          A_1       [11]
SMC low-x    …                  …                   …       …              …              …              A_1       [12]
E143         0.031 - 0.75       1.27 - 9.52         28      0.045          0.016          0.012          A_1       [13]
E155         0.015 - 0.75       1.22 - 34.72        24      0.043          0.018          0.026          g_1/F_1   [14]
HERMES06     0.0058 - 0.7311    0.26 - 14.29        45      0.126          0.019          0.017          A_1       [15]
Deuteron
COMPASS      0.0051 - 0.474     1.18 - 47.5         12      0.034          0.017          0.011          A_1       [16]
SMC          0.001 - 0.480      0.3 - 58.0          15      0.032          0.003          0.006          A_1       [11]
SMC low-x    …                  …                   …       …              …              …              A_1       [12]
E143         0.031 - 0.75       1.27 - 9.52         28      0.066          0.011          0.008          A_1       [13]
E155         0.015 - 0.75       1.22 - 34.72        24      0.091          0.009          0.011          g_1/F_1   [14]
HERMES06     0.0058 - 0.7311    0.26 - 14.29        45      0.089          0.007          0.009          A_1       [15]

Table 1: The proton and deuteron experimental data sets included in the present analysis. We show the kinematic range, the number of points, the average statistical, systematic and normalization uncertainty, and the measured observable.
Figure 1: Experimental data in the $(x,Q^2)$ plane used in the present analysis for the proton (left) and for the deuteron (right) target.
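The kinematic relations of this section, Eqs. (2.4) and (2.6) together with $\gamma^2 = 4m^2x^2/Q^2$, can be sketched in a few lines of Python. This is only an illustration: the function names and the default nucleon mass of 0.938 GeV are assumptions introduced here, and the inputs ($A_1$, $F_2$, $R$, $g_2$) are plain numbers supplied by the caller rather than any parametrization used in the analysis.

```python
def gamma2(x, q2, m=0.938):
    """gamma^2 = 4 m^2 x^2 / Q^2, with m in GeV and Q^2 in GeV^2."""
    return 4.0 * m * m * x * x / q2


def f1_from_f2(x, q2, f2, r, m=0.938):
    """Eq. (2.6): F1 = (1 + gamma^2) / (2 x (1 + R)) * F2."""
    return (1.0 + gamma2(x, q2, m)) / (2.0 * x * (1.0 + r)) * f2


def g1_from_a1(x, q2, a1, f2, r, g2=0.0, m=0.938):
    """Eq. (2.4) solved for g1: g1 = A1 * F1 + gamma^2 * g2."""
    return a1 * f1_from_f2(x, q2, f2, r, m) + gamma2(x, q2, m) * g2
```

With `g2=0.0` the last function reduces to $g_1 = A_1 F_1$, which is the natural starting point when $g_2$ is unknown.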
3. The NNPDF approach
In this section we briefly review the approach used to extract an unbiased determination of the asymmetry $A_1$ and the structure function $g_1$ from the available inclusive polarized DIS data, following the analysis performed by the NNPDF Collaboration for the determination of the unpolarized structure function $F_2$ [1, 2] and the parton densities [3–5].

The core idea underlying the NNPDF approach is to use Monte Carlo methods to build a representation of the probability measure in the space of structure functions, and to parametrize the space of structure functions using neural networks. We refer the interested reader to the papers cited above for a detailed description of the methods; in the following we briefly discuss the settings used in this analysis. It is worthwhile emphasizing that the Monte Carlo method does not require the use of neural networks, and would yield a robust determination of the errors with any parametrization of the structure function, provided the parametrization is sufficiently flexible. A comparison of Monte Carlo analyses based on different parametrizations was performed in the framework of the HERA-LHC workshop by comparing the standard H1 and NNPDF analyses. The greater flexibility of the neural network parametrization compared to fixed functional forms is reflected in the larger error bands obtained using the NNPDF method, especially when considering the $x$ region not covered by data (i.e. the extrapolation region). These features are illustrated by the results in Sects. 3.2 and 3.4 of Ref. [17].

We generate $N_{\rm rep}$ Monte Carlo replicas of the experimental data according to
\[ A_1^{({\rm art}),k}(x,Q^2) = \left(1 + r_{k,N}\,\frac{\sigma_N}{A_1^{({\rm exp})}(x,Q^2)}\right) \left[ A_1^{({\rm exp})}(x,Q^2) + r_{k,t}\,\sigma_t(x,Q^2) \right] , \tag{3.1} \]
where $r_k$ are Gaussian distributed random numbers, $\sigma_N$ is the quadratic sum of the normalization errors and $\sigma_t$ is the total error, obtained by summing in quadrature the statistical and systematic errors, the latter assumed to be uncorrelated.

Following Ref.
[18], the covariance matrix for experimental data points is evaluated using:
\[ {\rm cov}_{ij} = \sigma_{N,i}\,\sigma_{N,j} + \delta_{ij}\,\sigma_{i,t}^2 , \tag{3.2} \]
while during the fit for each replica, we minimize:
\[ \chi^{2\,(k)} = \sum_{i=1}^{N_{\rm data}} \left( \frac{A_1^{({\rm art}),k}(x,Q^2) - A_1^{({\rm net}),k}(x,Q^2)}{\bar\sigma^{(k)}_{i,t}} \right)^2 , \tag{3.3} \]
where
\[ \bar\sigma^{(k)}_{i,t} = \left(1 + r_{k,N}\,\frac{\sigma_N}{A_1^{({\rm exp})}(x,Q^2)}\right) \sigma_{i,t} . \tag{3.4} \]

The number of Monte Carlo replicas of the data is determined by requiring that the average over the replicas reproduces the features (central values, errors and correlations) of the original experimental data to a required accuracy. The quantitative check is performed by means of the statistical estimators described in the appendix of Ref. [2], and the results for sets of 10, 100 and 1000 replicas are collected in Tab. 2 for the proton target data and in Tab. 3 for the deuteron target data. We observe that all the considered estimators have the correct scaling behaviour as the number of replicas grows. We also point out that the large percentage error on the deuteron central values is due to a bulk of data whose values are close to zero.

Artificial neural networks, see e.g. Ref. [19], are a class of algorithms which provide a robust and universal approximant to incomplete or noisy data, with the only requirement of continuity. Neural networks are universal approximators for measurable functions [20].

[Table 2 layout: columns $N_{\rm rep}$ = 10, 100, 1000; rows $\langle PE[\langle A_1^{({\rm art})}\rangle_{\rm rep}]\rangle$, $r[A_1^{({\rm art})}]$, $\langle V[\sigma^{({\rm art})}]\rangle_{\rm dat}$, $\langle PE[\langle\sigma^{({\rm art})}\rangle]\rangle_{\rm dat}$, $\langle\sigma^{({\rm art})}\rangle_{\rm dat}$, $r[\sigma^{({\rm art})}]$, $\langle V[\rho^{({\rm art})}]\rangle_{\rm dat}$, $\langle\rho^{({\rm art})}\rangle_{\rm dat}$, $r[\rho^{({\rm art})}]$, $\langle V[{\rm cov}^{({\rm art})}]\rangle_{\rm dat}$, $\langle{\rm cov}^{({\rm art})}\rangle_{\rm dat}$, $r[{\rm cov}^{({\rm art})}]$. Numerical entries are not recoverable.]

Table 2: Statistical estimators for Monte Carlo replicas of $A_1$ for the proton data. The experimental data have $\langle\sigma^{({\rm exp})}\rangle_{\rm dat} = 0.$…, $\langle\rho^{({\rm exp})}\rangle_{\rm dat} = 0.$…, $\langle{\rm cov}^{({\rm exp})}\rangle_{\rm dat} = 0.$…
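The replica generation can be illustrated with a short Python sketch. It assumes that Eq. (3.1) has the form $(1 + r_{k,N}\,\sigma_N/A_1^{(\rm exp)})\,[A_1^{(\rm exp)} + r_{k,t}\,\sigma_t]$, with one normalization random number shared by all points of an experiment within a replica and one independent random number per point for the total error. The function names, the toy inputs and the use of Python's `random` module are choices made for this illustration only, not the implementation used in the analysis. Under these assumptions the sample covariance of two different points should approach $\sigma_{N,i}\sigma_{N,j}$, in line with Eq. (3.2).

```python
import random


def make_replica(a_exp, sigma_t, sigma_n, rng):
    """One artificial data set for a single experiment (assumed form of Eq. 3.1).

    a_exp   : measured asymmetries (must be non-zero, since sigma_n is divided by them)
    sigma_t : total uncorrelated errors (statistical and systematic in quadrature)
    sigma_n : absolute normalization errors, one per point
    """
    r_n = rng.gauss(0.0, 1.0)  # shared normalization fluctuation
    replica = []
    for a, st, sn in zip(a_exp, sigma_t, sigma_n):
        r_t = rng.gauss(0.0, 1.0)  # independent point-by-point fluctuation
        replica.append((1.0 + r_n * sn / a) * (a + r_t * st))
    return replica


def make_ensemble(a_exp, sigma_t, sigma_n, n_rep, seed=0):
    """Generate n_rep replicas with a reproducible random stream."""
    rng = random.Random(seed)
    return [make_replica(a_exp, sigma_t, sigma_n, rng) for _ in range(n_rep)]


def sample_mean(ensemble, i):
    """Average of point i over the replicas."""
    return sum(rep[i] for rep in ensemble) / len(ensemble)


def sample_cov(ensemble, i, j):
    """Covariance of points i and j over the replicas."""
    mi, mj = sample_mean(ensemble, i), sample_mean(ensemble, j)
    return sum((rep[i] - mi) * (rep[j] - mj) for rep in ensemble) / (len(ensemble) - 1)
```

Comparing `sample_mean` and `sample_cov` with the input central values and with Eq. (3.2) for increasing `n_rep` is a simplified version of the check summarized in Tabs. 2 and 3.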
[Table 3 layout: same columns and rows as Table 2, for the deuteron data. Numerical entries are not recoverable.]

Table 3: Statistical estimators for Monte Carlo replicas of $A_1$ for the deuteron data. The experimental data have $\langle\sigma^{({\rm exp})}\rangle_{\rm dat} = 0.$…, $\langle\rho^{({\rm exp})}\rangle_{\rm dat} = 0.$…, $\langle{\rm cov}^{({\rm exp})}\rangle_{\rm dat} = 0.$…

This means that any continuous function can be approximated to any degree of accuracy by a sufficiently large neural network with one hidden layer and non-linear neuron activation function.

One of the main reasons to use neural networks in place of any other redundant parametrization is the existence of efficient techniques for training them, i.e. determining the parameters of the network (thresholds and weights) so that it reproduces a given set of input-output data. Equivalently one could say that a sufficiently large neural network provides a description of the data which is largely free of functional bias.

The analysis presented here uses a class of neural networks known as multilayer feed-forward perceptrons, trained using a genetic algorithm [21, 22]. The networks we employed have one hidden layer and a 2-4-1 architecture, which gives us a total of 17 free parameters for each network to be determined during the training. The guiding principle in the choice of the network architecture is that it should provide a redundant parametrization for the data to be fitted, i.e. the network should have enough flexibility to fit not only the underlying physical law but also the statistical fluctuations of the experimental data.

This property is crucial in ensuring that the fit results are not biased by the specific parametrization. The lack of functional bias is established a posteriori by verifying that fits performed with networks with different architectures lead to statistically equivalent results. This is achieved using the statistical estimators introduced in the NNPDF Collaboration's studies; the results of these comparisons are presented and discussed later.

The training of the individual networks to the Monte Carlo replicas is performed by minimizing the figure of merit given in Eq. (3.3). Given the extensive size and complex structure of the parameter space (a neural network with $n$ parameters, weights and thresholds, has $2\,n!$ equivalent global minimum configurations), the most efficient training algorithm turns out to be a genetic algorithm. The details of the implementation are discussed in Ref. [3].

Figure 2: $\chi^2$ for the training (red) and validation (green) sets of one replica in the reference fit to $A_1^p$.

As already pointed out in various references, the fact that we adopt a redundant parametrization and that the figure of merit minimized in the training procedure is monotonically decreasing might lead to overfitting the data: the neural network reproduces not only the underlying physical law but also the statistical noise of the data sample. To prevent this from happening and to determine the optimal fit we adopt a criterion to stop our fit based on the cross-validation method. Once again our procedure is completely analogous to the one used for the unpolarized NNPDF fits.

For each replica of the experimental data we subdivide the data into a training and a validation set, respectively containing a fraction $f_{\rm tr}$ and $(1 - f_{\rm tr})$ of randomly chosen data points of each experiment.

We train one neural network on each replica of the data using the $\chi^2$ of the training set as a figure of merit to be minimized. In parallel we compute the $\chi^2$ of the validation set. We stop the training when we find that the $\chi^2$ smeared over a given number of generations is decreasing for the training set while increasing for the validation set. A graphical illustration of such a behaviour for one of the replicas in the reference fit is given in Fig. 2.

4. Phenomenology

The study of the first moments of polarized structure functions is of phenomenological interest, since they can be used to extract information on the fraction of polarization carried by partons and on physical couplings.
In the $\overline{\rm MS}$ scheme we have
\[ \Gamma_1^{p,n}(Q^2) = \int_0^1 dx\, g_1^{p,n}(x,Q^2) = \frac{1}{36}\left[ (a_8 \pm 3a_3)\,\Delta C_{\rm NS}^{\overline{\rm MS}}(\alpha_s(Q^2)) + 4a_0\,\Delta C_{\rm S}^{\overline{\rm MS}}(\alpha_s(Q^2)) \right] , \tag{4.1} \]
where $\Delta C_{\rm NS}^{\overline{\rm MS}}(\alpha_s(Q^2))$ and $\Delta C_{\rm S}^{\overline{\rm MS}}(\alpha_s(Q^2))$ are the first moments of the non-singlet and singlet Wilson coefficient functions, respectively, and
\[ a_3 = (\Delta u + \Delta\bar u) - (\Delta d + \Delta\bar d) , \tag{4.2} \]
\[ a_8 = (\Delta u + \Delta\bar u) + (\Delta d + \Delta\bar d) - 2(\Delta s + \Delta\bar s) , \tag{4.3} \]
\[ a_0 = (\Delta u + \Delta\bar u) + (\Delta d + \Delta\bar d) + (\Delta s + \Delta\bar s) \equiv \Delta\Sigma . \tag{4.4} \]
Using isotopic spin invariance, it can be shown that $a_3$ is the axial coupling $g_A = G_A/G_V$ that governs neutron $\beta$-decay. Accurate measurements yield (see e.g. Ref. [9]):
\[ g_A = 1.\ldots \pm \ldots \tag{4.5} \]
The difference of the $g_1$ moments for proton and neutron leads to the Bjorken sum rule
\[ \Gamma_1^{\rm NS}(Q^2) = \Gamma_1^p(Q^2) - \Gamma_1^n(Q^2) = \frac{1}{6}\, g_A\, \Delta C_{\rm NS}^{\overline{\rm MS}}(\alpha_s(Q^2)) + \delta_T + \delta_\tau ; \tag{4.6} \]
where $\delta_T$ is the target mass correction and $\delta_\tau$ is the correction due to higher twists. Target mass corrections have been studied in Refs. [23–26] and can be evaluated for any moment $n$ at first order in $m^2/Q^2$ using [24]:
\[ \delta_T = g_1^{(n)}(Q^2) - g_{10}^{(n)}(Q^2) = \frac{m^2}{Q^2}\, \frac{n(n+1)}{(n+2)^2} \left[ (n+4)\, g_{10}^{(n+2)}(Q^2) + 4\,\frac{n+2}{n+1}\, g_{20}^{(n+2)}(Q^2) \right] , \tag{4.7} \]
where $g_i^{(n)}(Q^2) = \int_0^1 dx\, x^{n-1} g_i(x,Q^2)$ and $g_{i0}$ is the structure function taken at zero mass of the nucleon. The higher-twist contribution is simply
\[ \delta_\tau = \frac{\mu}{Q^2} , \tag{4.8} \]
where $\mu$ can be extracted from experimental data at low $Q^2$ such as the CLAS data [27]. Finally, the coefficient function of Eq. (4.6) has been calculated in Ref. [28] and up to order $\alpha_s^3$ is given by:
\[ \Delta C_{\rm NS}^{\overline{\rm MS}}(\alpha_s(Q^2)) = 1 - \frac{\alpha_s(Q^2)}{\pi} - \left( \frac{55}{12} - \frac{n_f}{3} \right) \left( \frac{\alpha_s(Q^2)}{\pi} \right)^2 - \left( 41.440 - 7.607\, n_f + \frac{115}{648}\, n_f^2 \right) \left( \frac{\alpha_s(Q^2)}{\pi} \right)^3 . \tag{4.9} \]
For the running coupling we use the expanded solution of the renormalization group equation; up to NNLO we have:
\[ \alpha_s(Q^2) = \alpha_s(Q^2)_{\rm LO} \left[ 1 + \alpha_s(Q^2)_{\rm LO} \left[ \alpha_s(Q^2)_{\rm LO} - \alpha_s(M_Z^2) \right] (b_2 - b_1^2) + \alpha_s(Q^2)_{\rm NLO}\, b_1 \ln \frac{\alpha_s(Q^2)_{\rm NLO}}{\alpha_s(M_Z^2)} \right] , \tag{4.10} \]
with
\[ \alpha_s(Q^2)_{\rm NLO} = \alpha_s(Q^2)_{\rm LO} \left[ 1 - b_1\, \alpha_s(Q^2)_{\rm LO} \ln\left( 1 + \beta_0\, \alpha_s(M_Z^2) \ln \frac{Q^2}{M_Z^2} \right) \right] , \tag{4.11} \]
\[ \alpha_s(Q^2)_{\rm LO} = \frac{\alpha_s(M_Z^2)}{1 + \beta_0\, \alpha_s(M_Z^2) \ln \frac{Q^2}{M_Z^2}} , \tag{4.12} \]
and the beta function coefficients given by
\[ Q^2 \frac{d a_s(Q^2)}{dQ^2} = - \sum_{k=0}^{2} \beta_k\, a_s(Q^2)^{k+2} , \qquad a_s(Q^2) = \frac{\alpha_s(Q^2)}{4\pi} , \tag{4.13} \]
where
\[ \beta_0 = 11 - \frac{2}{3} n_f , \qquad \beta_1 = 102 - \frac{38}{3} n_f , \qquad \beta_2 = \frac{2857}{2} - \frac{5033}{18} n_f + \frac{325}{54} n_f^2 , \tag{4.14} \]
and $b_i \equiv \beta_i/\beta_0$.
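As a quick numerical illustration of Eqs. (4.6), (4.9) and (4.14), the Python sketch below evaluates the non-singlet coefficient function, the beta-function coefficients and the sum rule. The numerical coefficients are the standard values from the literature (rounded to three decimals at third order); the function names, the default $n_f = 3$ and the treatment of $\delta_T$ and $\delta_\tau$ as plain numbers supplied by the caller are assumptions made for this example.

```python
import math


def delta_c_ns(alpha_s, n_f=3):
    """First moment of the non-singlet coefficient function up to order alpha_s^3."""
    a = alpha_s / math.pi
    c2 = 55.0 / 12.0 - n_f / 3.0
    c3 = 41.440 - 7.607 * n_f + 115.0 / 648.0 * n_f ** 2
    return 1.0 - a - c2 * a ** 2 - c3 * a ** 3


def beta_coefficients(n_f):
    """(beta_0, beta_1, beta_2) in the normalization a_s = alpha_s / (4 pi)."""
    beta0 = 11.0 - 2.0 / 3.0 * n_f
    beta1 = 102.0 - 38.0 / 3.0 * n_f
    beta2 = 2857.0 / 2.0 - 5033.0 / 18.0 * n_f + 325.0 / 54.0 * n_f ** 2
    return beta0, beta1, beta2


def bjorken_sum(g_a, alpha_s, n_f=3, delta_tmc=0.0, delta_ht=0.0):
    """Eq. (4.6): g_A / 6 * Delta C_NS plus target-mass and higher-twist terms."""
    return g_a / 6.0 * delta_c_ns(alpha_s, n_f) + delta_tmc + delta_ht
```

For $\alpha_s \to 0$ the sum rule reduces to $g_A/6$, and the perturbative corrections lower it; inverting `bjorken_sum` numerically for `alpha_s` or `g_a` at a given scale is the kind of extraction discussed in the following section.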
5. Results
In this section we present our parametrization of the proton and deuteron asymmetries and the structure functions extracted from them.

We assess the quality of the fit by comparing our extraction with the experimental data included in the analysis, and by studying the stability of our results under variations of the parametrization used for the networks.

Then, as an example of a possible application of our results to a phenomenological analysis, we study the extraction of physical parameters (the strong coupling constant $\alpha_s$ and the axial coupling $g_A$) from the Bjorken sum rule. In order to give a faithful error on the extracted quantities we study the impact of the different assumptions which are needed to reconstruct the structure functions and then to evaluate the Bjorken sum rule. Results are compared to existing estimates.

In Tab. 4 we show the $\chi^2/N_{\rm data}$ for each target and each experimental data set included in the present analysis. We first observe the overall good quality of our fit. For the proton the small values of $\chi^2$ for EMC, SMC and HERMES can be explained by a possible overestimate of experimental errors. For the deuteron all the $\chi^2$ are of order 1, except for the E155 data set which has a value of $\chi^2$ significantly smaller than one. The somewhat larger value of $\chi^2$ for the E143 deuteron data set can be understood by looking at Figs. 3 and 4, where we present a comparison of our fit to experimental data in different kinematical regions. We observe that in the case of E143 the deuteron data show small incompatibilities among themselves, and the large value of $\chi^2$ is a reflection of this. It is interesting to remark that a careful analysis of the $\chi^2$ value for each experiment allows the identification of potentially incompatible data. This feature had already been pointed out in the unpolarized studies by the NNPDF Collaboration.

Proton       chi^2      Deuteron     chi^2
EMC          0.370      COMPASS      0.885
SMC          0.480      SMC          1.100
SMC low-x    …          SMC low-x    …
[remaining rows are not recoverable]

Table 4: The $\chi^2$ of the fit for proton and deuteron data.

In Tabs. 5, 6, and 7 we study the self-stability of the fit and the stability against the variation of the parametrization with respect to a smaller and a larger architecture. To this end we define four different regions: one where we expect our fit to be an interpolation of the available data (Data region) and three where its behaviour is extrapolated to regions of the $(x,Q^2)$-plane not covered by present data:
• Data: $0.$… $< x < 0.75$ and 2 GeV$^2$ $< Q^2 <$ 20 GeV$^2$;
• Low-$x$: $0.$… $< x < 0.001$ and 2 GeV$^2$ $< Q^2 <$ 20 GeV$^2$;
• Low-$Q^2$: $0.$… $< x < 0.$… and … $< Q^2 <$ …;
• High-$Q^2$: $0.$… $< x < 0.$… and … $< Q^2 <$ 60 GeV$^2$.

We observe that all the estimators for self-stabilities are of order unity (or smaller), meaning that different subsets within the whole ensemble of replicas have the same statistical features.

When we compare our final fit to a fit performed using networks with a smaller architecture, we notice that the two fits are statistically equivalent. The same happens for the comparison with a fit done with networks with a larger architecture, with the only exception of the errors on the deuteron fit in the extrapolation regions (all distances are of order 1.5), which show some minor instability.

In order to reconstruct the structure function $g_1$ from data on the asymmetry $A_1$ as given in Eq. (2.4) some additional assumptions are needed. In the following we assess the impact of our assumptions for $g_2$, $F_2$ and $R$ on the determination of the first moment of $g_1$. These checks are done using an ensemble of 100 replicas, which is enough for this purpose, and in a range of $x$ and $Q^2$ which is entirely in the data region in order to avoid any extrapolation effects. Finally, we compare our result for 1000 replicas with the sum rules obtained by experimental collaborations.

Figure 3: The fitted asymmetries compared to proton (left) and deuteron (right) data for 0.01 GeV$^2$ $< Q^2 <$ … (upper row), 1 GeV$^2$ $< Q^2 <$ … (central row) and 3 GeV$^2$ $< Q^2 <$ … (lower row). In the plots $A_1$ is evaluated at the central value of each $Q^2$ range.

Figure 4: The fit compared to proton (left) and deuteron (right) data for 5 GeV$^2$ $< Q^2 <$ 10 GeV$^2$ (upper row), 10 GeV$^2$ $< Q^2 <$ 30 GeV$^2$ (central row) and 30 GeV$^2$ $< Q^2 <$ 60 GeV$^2$ (lower row). In the plots $A_1$ is evaluated at the central value of each $Q^2$ range.

The first assumption whose impact we consider is the one on the structure function $g_2$, which is evaluated from the Wandzura–Wilczek relation [29]
\[ g_2^{WW}(x,Q^2) = -g_1(x,Q^2) + \int_x^1 \frac{dy}{y}\, g_1(y,Q^2) . \tag{5.1} \]
Inserting this expression into Eq. (2.4) gives
\[ g_1(x,Q^2) = \frac{1}{1+\gamma^2} \left( A_1(x,Q^2)\, F_1(x,Q^2) + \gamma^2 \int_x^1 \frac{dy}{y}\, g_1(y,Q^2) \right) . \tag{5.2} \]

[Table 5 layout: columns Data, Low-$x$, Low-$Q^2$, High-$Q^2$; rows $\langle d[A_1]\rangle$ and $\langle d[\sigma_{A_1}]\rangle$, given separately for proton and deuteron. Numerical entries are not recoverable.]

Table 5: Self-stability estimators evaluated with 100 replicas. The entries in the table show the statistical differences between results based on different subsets of 100 replicas randomly chosen in our Monte Carlo ensemble.
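Equation (5.2) determines $g_1$ only implicitly, since $g_1$ also appears under the integral. A minimal Python sketch of a fixed-point iteration on a grid in $x$ at fixed $Q^2$ is given below. It is an illustration under stated assumptions rather than the procedure used in the analysis: the grid, the trapezoidal rule for $\int_x^1 dy\, g_1(y)/y$, the starting point $g_1 = A_1 F_1$ (i.e. $g_2 = 0$), the default nucleon mass and all function names are choices made here.

```python
def tail_integral(xs, g):
    """Trapezoid estimate of int_x^xmax dy g(y)/y at every grid point (xs ascending)."""
    out = [0.0] * len(xs)
    for i in range(len(xs) - 2, -1, -1):
        h = xs[i + 1] - xs[i]
        out[i] = out[i + 1] + 0.5 * h * (g[i] / xs[i] + g[i + 1] / xs[i + 1])
    return out


def iterate_ww(xs, a1f1, q2, n_iter, m=0.938):
    """Iterate Eq. (5.2) n_iter times on the grid xs.

    a1f1 holds the product A1 * F1 at each grid point; the last grid
    point plays the role of the upper integration limit x = 1.
    """
    gam2 = [4.0 * m * m * x * x / q2 for x in xs]
    g1 = list(a1f1)  # iteration 0: g2 = 0, hence g1 = A1 * F1
    for _ in range(n_iter):
        tail = tail_integral(xs, g1)
        g1 = [(af + g2c * t) / (1.0 + g2c) for af, g2c, t in zip(a1f1, gam2, tail)]
    return g1
```

Because $\gamma^2$ vanishes at small $x$ and the integral vanishes as $x \to 1$, each iteration changes the result by a rapidly decreasing amount, so a small number of iterations is usually sufficient; at large $Q^2$ the correction is negligible from the start.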
[Table 6 layout: columns Data, Low-$x$, Low-$Q^2$, High-$Q^2$; rows $\langle d[A_1]\rangle$ and $\langle d[\sigma_{A_1}]\rangle$, given separately for proton and deuteron. Numerical entries are not recoverable.]

Table 6: Stability estimators for the reference fit (architecture 2-4-1) compared to a fit with a smaller architecture (2-3-1).
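The stability estimators are distances between two ensembles of fits. The text does not spell out their definition, so the Python sketch below assumes the form commonly used in NNPDF studies for central values: the difference of the two ensemble means divided by the quadrature sum of the uncertainties on those means. The function name and the use of the `statistics` module are choices made for this illustration; the analogous distance for the uncertainties, $d[\sigma_{A_1}]$, is not covered here.

```python
import math
import statistics


def central_value_distance(ensemble_a, ensemble_b):
    """Distance between the means of two replica ensembles at one kinematic point.

    Each ensemble is a list of fitted values (one per replica) and needs at
    least two entries; the two ensembles must not both have zero spread.
    """
    mean_a = statistics.mean(ensemble_a)
    mean_b = statistics.mean(ensemble_b)
    # variance of the mean = sample variance / number of replicas
    var_mean_a = statistics.variance(ensemble_a) / len(ensemble_a)
    var_mean_b = statistics.variance(ensemble_b) / len(ensemble_b)
    return abs(mean_a - mean_b) / math.sqrt(var_mean_a + var_mean_b)
```

With this definition, two ensembles drawn from the same underlying distribution give distances of order one on average, which is the criterion used in the text to call two fits statistically equivalent.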
[Table 7 layout: columns Data, Low-$x$, Low-$Q^2$, High-$Q^2$; rows $\langle d[A_1]\rangle$ and $\langle d[\sigma_{A_1}]\rangle$, given separately for proton and deuteron. Numerical entries are not recoverable.]

Table 7: Stability estimators for the reference fit (architecture 2-4-1) compared to a fit with a larger architecture (2-5-1).

Equation (5.2) needs to be evaluated iteratively. To this purpose we take the initial value $g_1^{(0)}(x,Q^2)$ evaluated with $g_2(x,Q^2) = 0$, and we use
\[ g_1^{(i_{WW})}(x,Q^2) = \frac{1}{1+\gamma^2} \left( A_1(x,Q^2)\, F_1(x,Q^2) + \gamma^2 \int_x^1 \frac{dy}{y}\, g_1^{(i_{WW}-1)}(y,Q^2) \right) . \tag{5.3} \]
From Tab. 8 we see that for $Q^2$ values above 2 GeV$^2$ one iteration is enough to stabilize the result for the first moment of $g_1$ computed in the data region. For lower scales, say $Q^2 \simeq$ …, at least two iterations of the Wandzura–Wilczek relation are needed in order to obtain a stable result. In the following the index $i_{WW}$ will be omitted, as the number of iterations used should be evident from the scale at which the first moment of $g_1$ is evaluated.

For the unpolarized structure function $F_2$ we use the parametrization given in Ref. [2] for the proton and the one given in Ref. [1] for the deuteron. Since these parametrizations have also been extracted using a Monte Carlo procedure, ensembles of replicas are available for $F_2$; hence the result for $g_1$ is evaluated as:
\[ g_1(x,Q^2) = \frac{1}{N_{\rm rep}} \sum_{k=1}^{N_{\rm rep}} \left[ A_1^{(k)}(x,Q^2)\, \frac{(1+\gamma^2)}{2x\,[1+R(x,Q^2)]}\, F_2^{(k)}(x,Q^2) + \gamma^2 g_2^{(k)}(x,Q^2) \right] , \tag{5.4} \]
which takes into account both the uncertainty on $A_1$ and the one on $F_2$ (with $g_2^{(k)}(x,Q^2)$ we denote the expression in Eq. (5.1) evaluated for the $k$-th replica). Since there is no correlation between the extraction of $A_1$ and the one of $F_2$, the replicas of $A_1$ and $F_2$ can be sampled independently.

[Table 8 layout: columns $Q^2$ = 2, 5, 10, 20 GeV$^2$; rows $i_{WW}$ = 0, 1, 2, given separately for proton and deuteron. Numerical entries are not recoverable.]

Table 8: First moment for $x$ between 0.01 and 0.75 for different numbers of iterations of the Wandzura–Wilczek relation.

In order to estimate the contribution of the uncertainty on $F_2$ to the uncertainty on $g_1$, we can recompute $g_1$ as
\[ g_1(x,Q^2) = \frac{1}{N_{\rm rep}} \sum_{k=1}^{N_{\rm rep}} \left[ A_1^{(k)}(x,Q^2)\, \frac{(1+\gamma^2)}{2x\,[1+R(x,Q^2)]}\, \langle F_2 \rangle(x,Q^2) + \gamma^2 g_2^{(k)}(x,Q^2) \right] , \tag{5.5} \]
where for each $k$-th replica of $A_1$ we use the averaged value of the unpolarized structure function
\[ \langle F_2 \rangle(x,Q^2) = \frac{1}{N_{\rm rep}} \sum_{k=1}^{N_{\rm rep}} F_2^{(k)}(x,Q^2) . \tag{5.6} \]
This procedure clearly freezes the fluctuations in $F_2$, which is kept fixed to its average value. The result is given in Tab. 9, where we see that the contribution to the uncertainty on the first moment of $g_1$ due to $F_2$ is negligible. In the following we will always use $g_1$ as given by Eq. (5.4).

[Table 9 layout: columns $Q^2$ = 2, 5, 20 GeV$^2$; rows Eq. (5.4) and Eq. (5.5), given separately for proton and deuteron. Numerical entries are not recoverable.]

Table 9: First moment for $x$ between 0.01 and 0.75 with and without the error on $F_2$.

Finally a parametrization of $R(x,Q^2)$ is needed in order to extract $g_1$ from $A_1$. Here we use $R_{\rm SLAC}(x,Q^2)$ given in Refs. [30, 31]. Such a parametrization also provides an error estimate, which we use to assess the impact of $R_{\rm SLAC}(x,Q^2)$ on the total uncertainty of the first moment of $g_1$. In Tab. 10 we compare the sum rule evaluated with the central value of $R_{\rm SLAC}(x,Q^2)$ with the one obtained by taking into account the error on $R_{\rm SLAC}(x,Q^2)$. This is achieved by letting $R_{\rm SLAC}(x,Q^2)$ fluctuate within its own error in the Monte Carlo sample; for the $k$-th replica we use:
\[ R_{\rm SLAC}(x,Q^2) + r^{(k)}\, \Delta R_{\rm SLAC}(x,Q^2) , \tag{5.7} \]
where $\Delta R_{\rm SLAC}(x,Q^2)$ is the error on the parametrization, and $r^{(k)}$ is a univariate Gaussian random number. Since $R_{\rm SLAC}(x,Q^2)$ is a parametrization of experimental data, we take the error as a statistical one, with no correlation between different replicas, and thus we use a different random number each time a value of $R_{\rm SLAC}(x,Q^2)$ is needed. From the results collected in Tab. 10 we conclude that the error on $R_{\rm SLAC}(x,Q^2)$ is also negligible.

[Table 10 layout: columns $Q^2$ = 2, 5, 20 GeV$^2$; rows $R_{\rm SLAC}(x,Q^2)$ and $R_{\rm SLAC}(x,Q^2) + r^{(k)} \Delta R_{\rm SLAC}(x,Q^2)$, given separately for proton and deuteron. Numerical entries are not recoverable.]

Table 10:
First moment for $x$ between 0.01 and 0.75 with and without the error on $R$.

We will now compare our results for the integral of $g_1$ at different scales and over different $x$ ranges, obtained using our ensemble of 1000 replicas, with those obtained by different experimental collaborations.

In Tab. 11 results for the proton and the deuteron sum rules are compared to the result of Ref. [11]. We observe that the results are compatible within errors, and that our evaluation has a larger error.

Target   SMC98              This Analysis
$Q^2 = 10$ GeV$^2$
p        0.[…] ± […]009     0.[…] ± […]
[…]      […] ± […]007       0.[…] ± […]

Table 11: Comparison of the proton and deuteron sum rules, $\int_{[\ldots]}^{[\ldots]} dx\, g_1(x, Q^2)$, as determined in the present analysis with the results obtained by the SMC collaboration [11].

In Tab. 12 we compare our result with the ones in Refs. [13] and [15]. First we notice that the errors are of the same size, while our central values are systematically smaller, with a significant difference for the proton at low $Q^2$. A substantial part of this effect can be attributed to the different parametrization used for the unpolarized structure function. Indeed, if we evaluate the sum rule of E143 with the SMC98 $F_2^p$ parametrization [11, 32, 33], at $Q^2 = 2$ GeV$^2$ we obtain 0.[…] ± […], […] $Q^2 = 2.[\ldots]$ we get 0.[…] ± […]. […] $F_2^p$ used in the different analyses.

It is clear that, while the different $F_2^p$ parametrizations agree in the kinematical region covered by experimental data, they differ significantly at low $Q^2$ in the large-$x$ region, where there are no data and an extrapolation is needed. For the ALLM and the SMC98 parametrizations the large-$x$ behaviour is determined by the chosen functional form; the NNPDF parametrization interpolates by continuity from the last experimental point to the kinematical constraint $F_2(x = 1, Q^2) = 0$. The difference among the parametrizations is then enhanced once we multiply by the asymmetry $A_1$ to reconstruct the polarized structure function $g_1$.

Figure 5: Comparison of NNPDF, SMC98 and ALLM parametrizations of the unpolarized structure function $F_2^p$ in the region where we evaluate the Bjorken sum rule.

Target   E143               This Analysis
$Q^2 = 2$ GeV$^2$
p        0.[…] ± […]007     0.[…] ± […]
[…]      […] ± […]006       0.[…] ± […]
[…]      −[…] ± […]         −[…] ± […]
[…]      […] ± […]016       0.[…] ± […]
$Q^2 = 5$ GeV$^2$
p        0.[…] ± […]007     0.[…] ± […]
[…]      […] ± […]004       0.[…] ± […]
[…]      −[…] ± […]         −[…] ± […]
[…]      […] ± […]013       0.[…] ± […]014

Target   HERMES             This Analysis
$Q^2 = 2.[\ldots]$
p        0.[…] ± […]        […] ± […]
[…]      […] ± […]          […] ± […]
[…]      −[…] ± […]         −[…] ± […]
[…]      […] ± […]          […] ± […]
$Q^2 = 5$ GeV$^2$
p        0.[…] ± […]        […] ± […]
[…]      […] ± […]          […] ± […]
[…]      −[…] ± […]         −[…] ± […]
[…]      […] ± […]          […] ± […]

Table 12:
Comparison of the integral of $g_1$ over different $x$ ranges, at different scales, as determined from the present analysis with the results obtained by E143 (left: $\int_{[\ldots]}^{[\ldots]} dx\, g_1(x, Q^2)$) and HERMES (right: $\int_{[\ldots]}^{[\ldots]} dx\, g_1(x, Q^2)$).

Since no neutron target data have been used in our fit, the neutron structure function, $g_1^n$, is evaluated from the proton and deuteron ones as

$$ g_1^n(x, Q^2) = \frac{2\, g_1^d(x, Q^2)}{1 - 1.5\,\omega_D} - g_1^p(x, Q^2), \qquad (5.8) $$

where $\omega_D$ is the probability of the deuteron to be in the D state; we use $\omega_D = 0.05$, which covers most of the published values [35]. We observe that, even if the neutron sum rule is a pure prediction, it is compatible with other estimations.

In Fig. 6 we show a comparison of the polarized structure function as extracted in this analysis to data, in the region where we evaluate the Bjorken sum rule. The comparison shows a good agreement.

Figure 6:
Plot of the structure functions in the region where we evaluate the Bjorken sum rule. The fit curves are taken at $Q^2 = 5$ GeV$^2$.

In order to extract the strong coupling $\alpha_s$ and the axial coupling $g_A$ from the values of the Bjorken sum rule we need to extrapolate our fit in the Bjorken variable $x$ down to $x = 0$ and up to $x = 1$.

In this section we discuss the impact of these extrapolations on the extraction of the couplings, and we assess the impact of target mass corrections. Finally we present the results we obtain for $\alpha_s$ and $g_A$ from our fit. All checks are performed with 100 replicas, while for final results we use the full set of 1000 replicas.

The extrapolation at large $x$ is embedded in the parametrization of $F_2$, as discussed in the previous section. Therefore we do not need any further assumption to constrain the large-$x$ behaviour.

The low-$x$ behaviour of the structure function $g_1$ is instead very weakly constrained by data, and Regge behaviour is usually assumed; following Ref. [36] we write

$$ g_1(x, Q^2) \simeq A\, x^b, \qquad (5.9) $$

with $0 < b < .5$. Such an assumption requires choosing a value $x_{\rm match}$ such that for $x < x_{\rm match}$ the Regge behaviour is assumed to set in. The normalization factor $A$ in Eq. (5.9) is then determined by the matching condition

$$ g_1^{\rm (fit)}(x_{\rm match}, Q^2) = A\, x_{\rm match}^b. \qquad (5.10) $$

In order to choose the matching point, we fix the Regge exponent to $0.[\ldots]$, […] $Q^2$ in the range $0 < x < x_{\rm match}$, and we look for the minimum value of the error for each value of $Q^2$.

From the results collected in Tab. 13 we see that $x_{\rm match}$ grows as $Q^2$ gets larger. This can be understood by looking at Fig. 1, where we see that for larger scales the coverage of the data moves towards higher values of $x$.

$Q^2$ (GeV$^2$)   $x_{\rm match}$   $\Gamma_1^{\rm NS}(Q^2)$   Error
 1                0.0100            0.12499                    0.020989
 2                0.0100            0.1356                     0.018239
 3                0.0100            0.14324                    0.017827
 4                0.0200            0.13847                    0.018275
 5                0.0200            0.14322                    0.019021
 6                0.0200            0.14757                    0.020115
 7                0.0200            0.15142                    0.021429
 8                0.0300            0.1458                     0.02262
 9                0.0300            0.14827                    0.023859
10                0.0300            0.15054                    0.025165
11                0.0300            0.15265                    0.02653
12                0.0500            0.14111                    0.027703
13                0.0500            0.14241                    0.028749
14                0.0500            0.14366                    0.029828
15                0.0500            0.14487                    0.030941
16                0.0500            0.14604                    0.032087
17                0.0500            0.14718                    0.033267
18                0.0500            0.14829                    0.03448
19                0.0800            0.13397                    0.03552
20                0.0800            0.13461                    0.036394

Table 13: First moment of $g_1$ for different values of $Q^2$: a 100% uncertainty on the low-$x$ extrapolation has been added to the error on $\Gamma_1^{\rm NS}(Q^2)$.

Once the matching point has been determined, in order to take into account the uncertainty on the value of the Regge exponent and the one on the choice of the matching point, we randomize the Regge exponent in the range $-[\ldots] < b < [\ldots]$ and the matching point in the range $[\ldots]\, x_{\rm match} < x < [\ldots]\, x_{\rm match}$.

In Tab. 14 we present the comparison for the first moment of the structure function $g_1$ evaluated with and without the target mass correction as given in Eq. (4.7). We observe that the shift in the values of the moment due to the inclusion of these effects is smaller than the experimental error even at the lowest $Q^2$.

In principle we could extract $g_A$, $\alpha_s$ and the higher-twist term by fitting Eq. (4.6) evaluated from data at a given value of $Q^2$. In practice we evaluate $N_{Q^2}$ different moments taken at different $Q^2$ in the kinematical region where we have a good coverage by experimental data: 2 GeV$^2$ $\leq Q^2 \leq$ 20 GeV$^2$. Indeed for $Q^2 > 20$ GeV$^2$ the errors on the computed moments become so large that their weighted contribution in the combination is negligible. On the lower side of the energy range we choose to start from $Q^2 = 2$ GeV$^2$, since below this scale a perturbative QCD approach might not be reliable. For this reason we do not fit the higher-twist term, but we will assess its contribution by varying the lower cut in $Q^2$.

Proton   $Q^2 = 1$ GeV$^2$   $Q^2 = 2$ GeV$^2$
noTMC    0.[…] ± […]         […] ± […]
[…]      […] ± […]           […] ± […]

[…]      $Q^2 = 1$ GeV$^2$   $Q^2 = 2$ GeV$^2$
noTMC    0.[…] ± […]         […] ± […]
[…]      […] ± […]           […] ± […]

Table 14: First moment with and without TMC with $b = 0.[\ldots]$ […] $x$ extrapolation matched at $x = 0.[\ldots]$.

We then proceed following the procedure described in detail in Sect. 4.3 of Ref. [38]: the extraction of the couplings is done by combining moments at different values of $Q^2$ in the chosen range and fitting Eq. (4.6) using MINUIT [39], where $g_A$ and $\alpha_s$ are chosen as free parameters. The moments at different $Q^2$ are correlated, since they are computed using the same fitted parametrization. As detailed in Ref. [38], these correlations induce numerical instabilities in the inversion of the correlation matrix and off-diagonal instabilities due to non-diagonal elements in the correlation matrix becoming dominant. Both these instabilities lead to unreliable results for the extracted couplings.

In order to fix the maximum value of $N_{Q^2}$ for which the extraction of the parameters is numerically stable and reliable, we study the error on the determination of $g_A$ and $\alpha_s$ as we vary the number of included moments. Once we exclude moments with large correlations, we are left with a small number of combinations, which are shown in Tab. 15.

$N_{Q^2}$   $Q^2$   $g_A$          $\alpha_s(M_Z)$
[…]         2+5     1.04 ± […]     […] ± […]
[…]

Table 15: Fits with different choices of $Q^2$.

The combination giving the smallest error ($Q^2 = 2, [\ldots]$), once we evaluate asymmetric errors, yields

$$ \alpha_s(M_Z) = 0.[\ldots]^{+0.[\ldots]}_{-[\ldots]}, \qquad (5.11) $$

for the strong coupling, while the error on $g_A$ is found to be symmetric.

$g_A$   $\alpha_s(M_Z)$   $k_F$
[…]

Table 16: Reference fit ($Q^2 = 2, [\ldots]$, NNLO, 1000 reps, $k_F = 1$) compared with variations of the factorization scale.

The only sources of theoretical uncertainty left to consider are the one due to the choice of factorization scale $Q = k_F\, m_q$, which we study by varying $k_F$ in the range $0.[\ldots] < k_F < [\ldots]$, […] $Q^2$ value is moved up to $Q^2 = 3$ GeV$^2$ (see Tab. 15) and down to $Q^2 = 1$ GeV$^2$ ($g_A = 0.[\ldots] \pm [\ldots]14$ and $\alpha_s(M_Z) = 0.[\ldots] \pm [\ldots]$). […] $g_A$ and the strong coupling constant $\alpha_s$

$$ g_A = 1.[\ldots] \pm [\ldots]\ ([\ldots])\ {}^{+0.[\ldots]}_{-[\ldots]}\ ({\rm theo.}) = 1.[\ldots] \pm [\ldots]\ ([\ldots]) \qquad (5.12) $$

$$ \alpha_s(M_Z) = 0.[\ldots]^{+0.[\ldots]}_{-[\ldots]}\ ({\rm exp.})\ {}^{+0.[\ldots]}_{-[\ldots]}\ ({\rm theo.}) = 0.[\ldots]^{+0.[\ldots]}_{-[\ldots]}\ ({\rm tot.}), $$

which are compatible with previous extractions [42, 43] from polarized DIS and the Bjorken sum rule.
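For illustration, the low-$x$ matching of Eqs. (5.9)-(5.10) and the neutron combination of Eq. (5.8) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the code used in the analysis: the function names and the numerical inputs are hypothetical, the D-state factor 1.5 is the standard value and is assumed here, and the low-$x$ contribution is obtained by integrating $A\,x^b$ analytically over $0 < x < x_{\rm match}$, which requires $b > -1$.

```python
import math


def regge_normalization(g1_at_match: float, x_match: float, b: float) -> float:
    """Solve the matching condition g1_fit(x_match) = A * x_match**b for A."""
    return g1_at_match / x_match**b


def low_x_moment(g1_at_match: float, x_match: float, b: float) -> float:
    """Integral of A * x**b over 0 < x < x_match, with A fixed by the matching.

    The closed form A * x_match**(b + 1) / (b + 1) only holds for b > -1.
    """
    if b <= -1.0:
        raise ValueError("the integral of x**b down to x = 0 diverges for b <= -1")
    a = regge_normalization(g1_at_match, x_match, b)
    return a * x_match ** (b + 1.0) / (b + 1.0)


def neutron_g1(g1_p: float, g1_d: float, omega_d: float = 0.05) -> float:
    """Neutron g1 from proton and deuteron values (1.5 factor assumed standard)."""
    return 2.0 * g1_d / (1.0 - 1.5 * omega_d) - g1_p


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the formulas.
    g1_fit_value, x_match, b = 0.3, 0.01, 0.5
    print("A             =", regge_normalization(g1_fit_value, x_match, b))
    print("low-x moment  =", low_x_moment(g1_fit_value, x_match, b))
    print("g1 neutron    =", neutron_g1(g1_p=0.15, g1_d=0.0925))
```

Since the extrapolated contribution reduces to $g_1^{\rm (fit)}(x_{\rm match}, Q^2)\, x_{\rm match}/(b+1)$, its size is controlled directly by the matching point and the Regge exponent, which is why both are varied when estimating the uncertainty.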
6. Conclusions
We extracted a parametrization of the spin asymmetries $A_1^{p,d}$, based on all available DIS data, using Monte Carlo sampling techniques and neural networks as basic interpolation tools. We checked in the process that the statistical methods developed for the unpolarized studies by the NNPDF Collaboration can be naturally extended to handle the new data sets considered in this work. Our main result is an effective tool, which we used to test different assumptions needed to reconstruct the polarized structure function $g_1$. As an example of possible applications we compared our results to previous estimations of experimental sum rules, and we found that the parametrization used for the unpolarized structure function $F_2$ can be a sizable source of error at low values of $Q^2$. We also performed a study of the Bjorken sum rule, and the extraction of the axial coupling and the strong coupling, obtaining values which are compatible with previous analyses.

It would be interesting to compare the results obtained for the Bjorken sum rule when determined from global QCD fits to polarized DIS, SDIS and hadron-hadron collision data, like the one presented in [44], especially once W production data from RHIC are included in such fits, providing an extra constraint on the light flavour separation.

The present study is also meant to be a first step towards the application of the NNPDF techniques to the determination of a set of polarized PDFs with a faithful error estimation.

Acknowledgments
We thank S. Forte, G. Ridolfi and M. Anselmino for useful discussion and suggestions.We are greatly indebted to R. De Vita (CLAS), L. De Nardo and A. Fantoni (HERMES),A. Bressan (COMPASS) and T. S. Toole (E155) for detailed information on experimentaldata. The work of LDD is supported by an STFC Advanced Fellowship.
A. Experimental errors
Experimentally, we have

$$ A_{\parallel} \simeq \frac{C}{f\, P_b\, P_t}\, \frac{N^- - N^+}{N^- + N^+}, \qquad ({\rm A.1}) $$

where

• $C$ is a nuclear correction that depends on the material the target is made of;
• $f$ is the dilution factor, which accounts for the fact that only a fraction of the target nucleons is polarizable;
• $P_b$ and $P_t$ are the beam and target polarizations;
• $N^{-(+)}$ is the number of scattered electrons/muons per incident charge for negative (positive) beam helicity.

Thus, most of the errors quoted by experiments are normalization errors.

• EMC [10]: $A_2 = 0$ is assumed; 9.6% overall normalization due to beam and target polarization; multiplicative errors on $R$ and $f$; additive errors on $A_2$, the false asymmetry $K$ and the radiative correction.
• SMC98 [11]: $A_2 = 0$ is assumed; multiplicative errors on $P_t$, $P_b$, $R$, $f$ and the polarized background $\Delta P_{\rm bg}$; additive errors on $A_2$, the false asymmetry $\Delta A_{\rm false}$, the radiative correction and the momentum resolution.
• SMC low-$x$ [12]: $A_2 = 0$ is assumed; multiplicative errors on $P_t$, $P_b$, $R$, $f$ and the polarized background $\Delta P_{\rm bg}$; additive errors on $A_2$, the false asymmetry $\Delta A_{\rm false}$ and the radiative correction.
• E143 [13]: $g_2$ is evaluated using the Wandzura-Wilczek relation

$$ g_2^{WW}(x, Q^2) = -g_1(x, Q^2) + \int_x^1 \frac{dy}{y}\, g_1(y, Q^2), \qquad ({\rm A.2}) $$

using an empirical fit of $g_1/F_1 = a\, x^{\alpha} (1 + bx + cx^2)(1 + C f(Q^2))$; multiplicative errors on $P_t$, $P_b$, $f$ and the nuclear correction $C$, which account for a total of 3.7% for the proton and 4.9% for the deuteron; additive uncorrelated error on the radiative corrections.
• E155 [14]: $g_2$ is evaluated in the same way as for E143, but the parameters of the fitted functional form have different values; we add as a shift the difference between $A_1$ and $g_1/F_1$; multiplicative errors on $P_t$, $P_b$, $f$ and the nuclear correction $C$, which account for a total of 7.6% for the proton; additive uncorrelated error on the radiative corrections.
• HERMES06 [15]: a parametrization for $g_2$ is fitted to existing data; normalization errors of 5.2% for the proton and 5% for the deuteron are included in the systematic error quoted for each data point; additional additive error on the parametrization used for $g_2$.
• COMPASS [16]: $A_2 = 0$ is assumed; multiplicative errors on $P_t$, $P_b$, the dilution factor $f$ and the depolarization factor $D$; additive errors on the false asymmetry and the radiative correction.

References

[1] S. Forte, L. Garrido, J. I. Latorre and A. Piccione, JHEP (2002) 062 [arXiv:hep-ph/0204232].
[2] L. Del Debbio, S. Forte, J. I. Latorre, A. Piccione and J. Rojo [NNPDF Collaboration], JHEP , 080 (2005) [arXiv:hep-ph/0501067].
[3] L. Del Debbio, S. Forte, J. I. Latorre, A. Piccione and J. Rojo [NNPDF Collaboration], JHEP (2007) 039 [arXiv:hep-ph/0701127].
[4] R. D. Ball et al. [NNPDF Collaboration], Nucl. Phys. B (2009) 1 [arXiv:0808.1231 [hep-ph]].
[5] R. D. Ball et al. [NNPDF Collaboration], arXiv:0906.1958 [hep-ph].
[6] J. Blumlein and H. Bottcher, Nucl. Phys. B , 225 (2002) [arXiv:hep-ph/0203155].
[7] J. Blumlein and A. Guffanti, AIP Conf. Proc. , 261 (2005).
[8] M. Anselmino, A. Efremov and E. Leader, Phys. Rept. , 1 (1995) [Erratum-ibid. , 399 (1997)] [arXiv:hep-ph/9501369].
[9] S. E. Kuhn, J. P. Chen and E. Leader, arXiv:0812.3535 [hep-ph].
[10] J. Ashman et al. [European Muon Collaboration], Nucl. Phys. B (1989) 1.
[11] B. Adeva et al. [Spin Muon Collaboration], Phys. Rev. D (1998) 112001.
[12] B. Adeva et al. [Spin Muon Collaboration], Phys. Rev. D , 072004 (1999) [Erratum-ibid. D , 079902 (2000)].
[13] K. Abe et al. [E143 Collaboration], Phys. Rev. D , 112003 (1998) [arXiv:hep-ph/9802357].
[14] P. L. Anthony et al. [E155 Collaboration], Phys. Lett. B (2000) 19 [arXiv:hep-ph/0007248].
[15] A. Airapetian et al. [HERMES Collaboration], Phys. Rev. D (2007) 012007 [arXiv:hep-ex/0609039].
[16] E. S. Ageev et al. [COMPASS Collaboration], Phys. Lett. B (2005) 154 [arXiv:hep-ex/0501073].
[17] M. Dittmar et al., arXiv:0901.2504 [hep-ph].
[18] G. D'Agostini, Bayesian reasoning in data analysis: A critical introduction, World Scientific, 2003.
[19] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford 1995.
[20] K. Hornik, M. Stinchcombe and H. White, Neural Networks, vol. 2, 359 (1989).
[21] M. Mitchell, An introduction to genetic algorithms, MIT Press, 1998.
[22] J. I. Latorre and J. Rojo, JHEP , 055 (2004).
[23] S. Matsuda and T. Uematsu, Nucl. Phys. B (1980) 181.
[24] A. Piccione and G. Ridolfi, Nucl. Phys. B (1998) 301 [arXiv:hep-ph/9707478].
[25] J. Blumlein and A. Tkabladze, Nucl. Phys. B (1999) 427 [arXiv:hep-ph/9812478].
[26] A. Accardi and W. Melnitchouk, Phys. Lett. B (2008) 114 [arXiv:0808.2397 [hep-ph]].
[27] C. Simolo, arXiv:0807.1501 [hep-ph].
[28] S. A. Larin and J. A. M. Vermaseren, Phys. Lett. B (1991) 345.
[29] S. Wandzura and F. Wilczek, Phys. Lett. B (1977) 195.
[30] L. W. Whitlow, S. Rock, A. Bodek, E. M. Riordan and S. Dasu, Phys. Lett. B (1990) 193.
[31] K. Abe et al. [E143 Collaboration], Phys. Lett. B , 194 (1999) [arXiv:hep-ex/9808028].
[32] A. Milsztajn, A. Staude, K. M. Teichert, M. Virchaux and R. Voss, Z. Phys. C (1991) 527.
[33] M. Arneodo et al. [New Muon Collaboration], Phys. Lett. B (1995) 107 [arXiv:hep-ph/9509406].
[34] H. Abramowicz and A. Levy, arXiv:hep-ph/9712415.
[35] W. Buck and F. Gross, Phys. Rev. D 20, 2361 (1979); M. Z. Zuilhof and J. A. Tjon, Phys. Rev. C 22, 2369 (1980); M. Lacombe et al., ibid. 21, (1980); R. Machleidt et al., Phys. Rep. , 1 (1987).
[36] F. E. Close and R. G. Roberts, Phys. Lett. B (1994) 257 [arXiv:hep-ph/9407204].
[37] R. Abbate and S. Forte, Phys. Rev. D (2005) 117503 [arXiv:hep-ph/0511231].
[38] S. Forte, J. I. Latorre, L. Magnea and A. Piccione, Nucl. Phys. B (2002) 477 [arXiv:hep-ph/0205286].
[39] F. James and M. Roos, Comput. Phys. Commun. , 343 (1975).
[40] A. Deur et al., Phys. Rev. Lett. (2004) 212001 [arXiv:hep-ex/0407007].
[41] A. Deur et al., Phys. Rev. D (2008) 032001 [arXiv:0802.3198 [nucl-ex]].
[42] G. Altarelli, R. D. Ball, S. Forte and G. Ridolfi, Nucl. Phys. B (1997) 337 [arXiv:hep-ph/9701289].
[43] G. Altarelli, R. D. Ball, S. Forte and G. Ridolfi, Acta Phys. Polon. B (1998) 1145 [arXiv:hep-ph/9803237].
[44] D. de Florian, R. Sassot, M. Stratmann and W. Vogelsang, arXiv:0904.3821 [hep-ph].