Extraction of the multiplicity dependence of Multiparton Interactions from LHC pp data using Machine Learning techniques
Antonio Ortiz, Erik Zepeda
Instituto de Ciencias Nucleares, Universidad Nacional Autónoma de México, Apartado Postal 70-543, México Distrito Federal 04510, México
Abstract
Over the last years, Machine Learning (ML) methods have been successfully applied to a wealth of problems in high-energy physics. For instance, in a previous work we reported that ML techniques can be used to extract the Multiparton Interactions (MPI) activity from minimum-bias pp data. Using the available LHC data on transverse momentum spectra as a function of multiplicity, we reported the average number of MPI ($\langle N_{\rm mpi} \rangle$) for minimum-bias pp collisions at $\sqrt{s} = 5.02$ and 13 TeV. In this work, we apply the same analysis to a new set of data. We report that $\langle N_{\rm mpi} \rangle$ amounts to 3.[...] ± [...] for minimum-bias pp collisions at $\sqrt{s} = 7$ TeV. These complementary results suggest a modest center-of-mass energy dependence of $\langle N_{\rm mpi} \rangle$. The study is further extended to extract the multiplicity dependence of $\langle N_{\rm mpi} \rangle$ for the three center-of-mass energies. We show that our results qualitatively agree with existing ALICE measurements sensitive to MPI. Namely, $\langle N_{\rm mpi} \rangle$ increases approximately linearly with the charged-particle multiplicity, but it deviates from the linear dependence at large charged-particle multiplicities. The deviation from the linear trend can be explained in terms of a bias towards harder processes given the multiplicity selection at mid-pseudorapidity. The results reported in this paper provide additional evidence of the presence of MPI in pp collisions, and they can be useful for a better understanding of the heavy-ion-like behaviour observed in pp data.

Keywords: Multiparton Interactions, Machine Learning, LHC

∗ Corresponding authors
Email address: [email protected], [email protected] (Antonio Ortiz, Erik Zepeda)
Preprint submitted to Journal of LaTeX Templates, January 26, 2021
1. Introduction
The possibility of having Multiparton Interactions (MPI), i.e. several parton-parton interactions within the same hadron-hadron collision, is expected given the composite nature of hadrons. Indeed, at Large Hadron Collider (LHC) energies, already at a transverse momentum transfer of a few GeV/$c$ the cross section for leading order (LO) parton-parton scatterings exceeds the total pp inelastic cross section [1]. This apparent inconsistency can be resolved by considering additional partonic scatterings within the same pp collision [2]. Data support the presence of MPI in pp collisions [3, 4, 5, 6]. For instance, in pp collisions at $\sqrt{s} <$ [...] the multiplicity distribution follows the Koba-Nielsen-Olesen (KNO) scaling with scaling variable $z = N_{\rm ch} / \langle N_{\rm ch} \rangle$ [7]. However, such a scaling is violated at higher energies [8]. This effect can be interpreted as a consequence of particle production through MPI [9].

Beyond the importance of Multiparton Interactions for high-energy physics, the study of their effects in pp collisions has recently attracted the attention of the heavy-ion community (see e.g. [10, 11]). This is because multiplicity-dependent studies of pp data unveiled heavy-ion-like features, i.e. azimuthal anisotropies [12], the enhancement of (multi-)strange hadrons [13], as well as the mass ordering in the hadron $p_{\rm T}$ spectra [14]. Besides the hydrodynamical approach [15, 16], Multiparton Interactions, a key mechanism of Monte Carlo (MC) generators like PYTHIA 8 [17] and HERWIG 7 [18], offer an alternative possibility to explain the observed phenomena. For instance, color reconnection and MPI can mimic radial flow patterns in pp collisions [19]. Models based on the QCD theory of MPI have been shown to explain collectivity from interference effects in hadronic collisions with $N_{\rm mpi}$ parton-parton scatterings [10, 11]. PYTHIA 8 with the rope hadronization model [20], which assumes the formation of ropes due to the overlapping of strings in a high-multiplicity environment (high $N_{\rm mpi}$), describes the strangeness enhancement [21]. Regarding the phenomena at large transverse momentum ($p_{\rm T}$), the model also produces some features which are present in heavy-ion data [22, 23, 24].

It is worth mentioning that the early LHC data already suggested that in high-multiplicity pp collisions the MPI activity could be more relevant than assumed, and that this could give rise to new effects [3]. For this reason we have proposed the extraction of MPI from minimum-bias pp data using Machine Learning (ML) methods [6]. In this paper, we extend that study, aiming at extracting the multiplicity dependence of $N_{\rm mpi}$ from the available LHC data [25, 14]. Our results are contrasted with ALICE data [4] at lower energies, and discussed in terms of what we know from the PYTHIA 8 model.

The paper is organised as follows: section 2 describes the analysis, where the input variables and the models used for the study are discussed. Results are presented in section 3, and finally section 4 contains a summary and outlook.
2. Analysis
The goal of our work is the extraction of the average number of MPI from LHC data. Our approach relies on a multivariate regression technique based on Boosted Decision Trees (BDT), which follows the strategy reported in Ref. [6]. The study is conducted using the Toolkit for Multivariate Analysis (TMVA) framework, which provides a ROOT-integrated machine learning environment for the processing and parallel evaluation of multivariate classification and regression techniques [26]. The training is performed using pp collisions at $\sqrt{s} = 13$ TeV simulated with the PYTHIA 8.244 [17] event generator (tune 4C [27]). For the multiplicity-dependent studies, we use the same input variables as reported in Ref. [6]. The choice of the variables is based on their correlation with $N_{\rm mpi}$ [28], as well as their availability as published data. We consider the event-by-event average transverse momentum and its correlation with the mid-pseudorapidity charged-particle multiplicity ($N_{\rm ch}$), as it encodes information about the underlying particle production mechanism. These quantities are calculated for primary charged particles within $|\eta| < 0.8$; in addition, the average $p_{\rm T}$ considers tracks with transverse momentum above 0.15 GeV/$c$.

[Figure 1 plot: $N_{\rm mpi} / \langle N_{\rm mpi} \rangle$ versus $({\rm d}N_{\rm ch}/{\rm d}\eta) / \langle {\rm d}N_{\rm ch}/{\rm d}\eta \rangle$ ($|\eta| < 0.8$); pp, Pythia 8.244 (tune 4C); panels for $\sqrt{s}$ = 5.02, 7 and 13 TeV; BDT, MC for training: Pythia tune 4C, Pythia tune Monash, Pythia tune 2C, Herwig soft tune]

Figure 1: Monte Carlo closure test using pp collisions at $\sqrt{s} = 5.02$, 7 and 13 TeV simulated with PYTHIA 8 tune 4C. The self-normalized average number of Multiparton Interactions as a function of the self-normalized mid-pseudorapidity charged-particle multiplicity (solid markers) is shown along with BDT results (lines). The results from ML are obtained considering different MC models for training: 4C (solid line), Monash (dashed line), 2C (dotted line) and the soft tune of HERWIG 7 (dash-dotted line). The grey band around the MC prediction indicates the systematic uncertainties (see the text for more details).

The systematic uncertainty assigned in our previous study took into account a variation of the model. To this end, different PYTHIA 8 tunes were used for training: 2C, 4C and Monash 2013. In the present study, we proceed in the same way. The main features of the PYTHIA 8 tunes used in our analysis are listed below.

• The tune 2C was obtained from fits to Tevatron data; therefore, it was not presented as a “complete” MC tune for the LHC [27]. Instead, it was provided as a starting point for more sophisticated tunes using the LHC data. This explains why this tune gives the worst description of the LHC data. We have chosen this model in order to evaluate the impact on our results if the BDT are trained with a model which is known to fail to describe the data.
• The model 4C, on the other hand, used the early LHC minimum-bias and underlying-event data (pp at $\sqrt{s}$ = 0.[...]).
• The Monash 2013 model is tuned to a bigger set of LHC data [29]. Contrary to the previous tunes, Monash 2013 starts from a more careful tune to LEP data, and it involves several parameter changes.

The effect of the hadronization model used for training is also investigated, using the Monte Carlo generator HERWIG 7.2 [30] for training instead of PYTHIA 8. The effects of both the MPI model and the hadronization model are considered in the systematic uncertainties.

Before processing the data using the trained BDT, we first show that the procedure is robust against the MC model used for training. To this end, we perform a Monte Carlo closure test. Figure 1 shows the correlation between the self-normalized number of Multiparton Interactions ($N_{\rm mpi} / \langle N_{\rm mpi} \rangle$) and the self-normalized mid-pseudorapidity charged-particle multiplicity ($N_{\rm ch} / \langle N_{\rm ch} \rangle$) in pp collisions at $\sqrt{s} = 5.02$, 7 and 13 TeV. The results were obtained using PYTHIA 8 tune 4C. For $N_{\rm ch} / \langle N_{\rm ch} \rangle < 3$, the self-normalized $N_{\rm mpi}$ increases linearly with the event multiplicity, while for higher multiplicities we observe a deviation of the self-normalized $N_{\rm mpi}$ with respect to the linear trend. This observation suggests that very high multiplicity pp collisions can only be produced by high-multiplicity jets [31]. The figure also displays the results obtained from regression (lines). Namely, the MC information (average $p_{\rm T}$ and multiplicity) of pp collisions at $\sqrt{s} = 5.02$, 7 and 13 TeV simulated with PYTHIA 8 tune 4C was evaluated using four different sets of BDT, each trained with a different MC model: the three PYTHIA 8 tunes described above, as well as the soft tune of HERWIG 7.2. Figure 1 shows that using ML-based regression, one can recover the energy and multiplicity dependence. The small variations with respect to the true correlation (markers) are well covered by the systematic uncertainties, which amount to 30% for $N_{\rm ch} / \langle N_{\rm ch} \rangle \rightarrow 0$, and 15% for $N_{\rm ch} / \langle N_{\rm ch} \rangle >$ [...].

[...] $\langle p_{\rm T} \rangle$ and $N_{\rm ch}$ was developed. To this end, we built a toy MC using the available ALICE data [14, 25], which contain the $p_{\rm T}$ spectra for different multiplicity classes defined by the event activity at either mid-pseudorapidity (Tracklets-based estimator) or forward pseudorapidity (VZERO-based estimator). For simplicity, each event class was simulated assuming that its multiplicity spectrum follows a Poisson distribution [32]. The corresponding average multiplicity values, as well as the contribution of each class to the inelastic cross section, were taken from [14, 25]. With this information, $N_{\rm ch}$ pseudo-particles were generated in each event, where each pseudo-particle had a transverse momentum drawn from the $p_{\rm T}$ spectra reported by ALICE [14, 25]. Figure 2 displays the mean transverse momentum as a function of the average charged-particle multiplicity density in pp collisions at $\sqrt{s} = 5.02$, 7 and 13 TeV, comparing the toy MC with the data. Within uncertainties, the toy MC reproduces the correlation between $\langle p_{\rm T} \rangle$ and $\langle {\rm d}N_{\rm ch}/{\rm d}\eta \rangle$. In our approach, the information produced by the toy MC is processed with the trained BDT.

Figure 2: Mean transverse momentum as a function of the average charged-particle multiplicity density in pp collisions at $\sqrt{s} = 5.02$, 7 and 13 TeV. ALICE data [25, 14] (solid markers) are compared with results from a toy Monte Carlo (solid lines). Boxes around the data indicate the systematic uncertainties.

The toy MC approach was validated using PYTHIA 8: the MC non-closure ($N_{\rm mpi}$ from regression compared to the true $N_{\rm mpi}$) was found to be significantly smaller than the systematic uncertainty due to model dependence. In addition, different conditions were varied to estimate a systematic uncertainty on the target variable. Fixing the spectral shape of the transverse momentum distribution, we vary the average charged-particle multiplicity density to its minimum and maximum values given by the corresponding uncertainties. On the other hand, we vary the average transverse momentum in the same way, while fixing the average charged-particle multiplicity density at its mean value. These variations provide an additional source of systematic uncertainty in our target variable; however, their contributions are also negligible with respect to the one due to the model dependence discussed before.
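The toy MC procedure described in this section can be illustrated with a minimal Python sketch. It is not the implementation used in the analysis: the class fractions, mean multiplicities and mean $p_{\rm T}$ values are made-up placeholders, a shifted exponential stands in for the measured ALICE $p_{\rm T}$ spectra, and the function names (`poisson`, `sample_pt`, `toy_events`) are hypothetical. It only shows the logic of drawing a Poisson-distributed $N_{\rm ch}$ per multiplicity class and computing the event-by-event average $p_{\rm T}$.

```python
import math
import random


def poisson(rng, mu):
    """Draw a Poisson-distributed integer with mean mu (Knuth's method).

    Suitable for modest means only; exp(-mu) underflows for very large mu.
    """
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_pt(rng, mean_pt, pt_min=0.15):
    """Draw one pT (GeV/c) from a placeholder spectrum: an exponential
    shifted to start at pt_min, with the requested mean (mean_pt > pt_min)."""
    return pt_min + rng.expovariate(1.0 / (mean_pt - pt_min))


def toy_events(rng, classes, n_events):
    """Generate (N_ch, event-wise mean pT) pairs.

    classes: list of (fraction, mean_nch, mean_pt) tuples, one per
    multiplicity class. Events with N_ch == 0 get a mean pT of None.
    """
    weights = [c[0] for c in classes]
    events = []
    for _, mean_nch, mean_pt in rng.choices(classes, weights=weights, k=n_events):
        n_ch = poisson(rng, mean_nch)
        if n_ch == 0:
            events.append((0, None))
            continue
        pts = [sample_pt(rng, mean_pt) for _ in range(n_ch)]
        events.append((n_ch, sum(pts) / n_ch))
    return events


# Made-up classes (NOT ALICE values): (fraction of events, <N_ch>, <pT> in GeV/c)
classes = [(0.5, 3.0, 0.5), (0.3, 10.0, 0.6), (0.2, 20.0, 0.7)]
events = toy_events(random.Random(1), classes, n_events=4000)
```

In this sketch the list `events` plays the role of the toy MC output, i.e. the per-event ($N_{\rm ch}$, average $p_{\rm T}$) pairs that would then be passed to the trained BDT.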
3. Results
Firstly, we report a result which complements those reported in Ref. [6]. Following the same strategy discussed in Ref. [6], we use the ALICE data from pp collisions at $\sqrt{s} = 7$ TeV [14] to get the average MPI activity. The average number of Multiparton Interactions is found to be $\langle N_{\rm mpi} \rangle$ = 3.[...] ± [...] $\sqrt{s} = 13$ TeV ($\sqrt{s} = 5.02$ TeV). Figure 3 displays the average number of MPI as a function of the center-of-mass energy, for pp collisions at $\sqrt{s} = 7$ TeV [14], and at 5.02 and 13 TeV [25]. Within 3$\sigma$, we obtain a regression value which is above unity; therefore, our results support the presence of MPI in pp collisions. We also observe a modest energy dependence, which is similar to that predicted by PYTHIA 8 [6].

[Figure 3 plot: $\langle N_{\rm mpi} \rangle$ versus $\sqrt{s}$ (TeV); pp (INEL>0) collisions, i.e. events with at least one primary charged particle within $|\eta| < 1$; data from ALICE, EPJC 79 (2019) no.10, 857 ($|\eta| < 0.8$) and ALICE, PRC 99, 024906 (2019) ($|\eta| < 0.5$)]

Figure 3: Average number of MPI as a function of the center-of-mass energy. The trained BDT were applied to ALICE data [14, 25]. Results for pp collisions at $\sqrt{s} = 7$ TeV are compared to those for pp collisions at $\sqrt{s} = 5.02$ and 13 TeV reported in [6].

Secondly, figure 4 displays the self-normalized number of MPI ($N_{\rm mpi} / \langle N_{\rm mpi} \rangle$) as a function of the self-normalized mid-pseudorapidity charged-particle multiplicity ($N_{\rm ch} / \langle N_{\rm ch} \rangle$) in pp collisions at $\sqrt{s} = 5.02$, 7 and 13 TeV from ALICE data. We observe that $N_{\rm mpi} / \langle N_{\rm mpi} \rangle$ vs. $N_{\rm ch} / \langle N_{\rm ch} \rangle$ does not show a significant center-of-mass energy dependence. Moreover, for $N_{\rm ch} < \langle N_{\rm ch} \rangle$ the self-normalized $N_{\rm mpi}$ increases linearly with the event multiplicity, while for higher multiplicities we observe a deviation of the self-normalized $N_{\rm mpi}$ with respect to the linear trend. This result qualitatively agrees with PYTHIA 8 (see figure 1).

[Figure 4 plot: $N_{\rm mpi} / \langle N_{\rm mpi} \rangle$ versus $N_{\rm ch} / \langle N_{\rm ch} \rangle$; SPD-based multiplicity estimator: pp $\sqrt{s}$ = 13 TeV and 5.02 TeV; V0M-based multiplicity estimator: pp $\sqrt{s}$ = 13 TeV, 7 TeV and 5.02 TeV; data from ALICE, PRC 99, 024906 (2019) and EPJC 79 (2019) no.10, 857; training: Pythia 8.244 tune 4C]

Figure 4: The self-normalized average number of Multiparton Interactions as a function of the self-normalized mid-pseudorapidity charged-particle multiplicity is shown for pp collisions at $\sqrt{s} = 5.02$, 7 and 13 TeV. The color boxes around the MC prediction indicate the systematic uncertainties (see the text for more details).

Last but not least, it is worth mentioning how our results compare with those from the “mini-jet analysis” of ALICE [4]. That analysis consists of the measurement of pair yields per trigger in two-particle azimuthal correlations between charged trigger and associated particles in pp collisions at $\sqrt{s} = 0.9$, 2.76 and 7 TeV. The analysis was performed at mid-pseudorapidity ($|\eta| < 0.9$) for transverse momentum thresholds for trigger particles of $p_{\rm T}^{\rm trigg.} >$ [...] GeV/$c$ and for associated particles of $p_{\rm T}^{\rm assoc.} >$ [...] GeV/$c$. Based on PYTHIA simulations, the so-called number of uncorrelated seeds is defined, and the results from data are discussed in the context of semi-hard parton–parton interactions. The data indicate that the charged-particle multiplicity increases approximately linearly with the number of uncorrelated seeds. However, it deviates from the linear dependence at large charged-particle multiplicities. In addition, the data exhibit a weak center-of-mass energy dependence. These observations are fully consistent with our results which use Machine Learning, and they suggest that at the highest multiplicities (at mid-pseudorapidity) a further increase of the number of Multiparton Interactions becomes very improbable; instead, high multiplicities can only be reached by selecting events with many high-multiplicity jets [4]. A similar conclusion is obtained from a study of the jet production as a function of event multiplicity in pp collisions [33, 31].

4. Conclusions
In this work, we report the extraction of the average number of Multiparton Interactions from pp data at LHC energies. Using the existing data on $p_{\rm T}$ spectra as a function of event multiplicity in pp collisions at $\sqrt{s} = 7$ TeV, we have found $\langle N_{\rm mpi} \rangle$ = 3.[...] ± [...].01 for minimum-bias pp collisions. The comparison with our previous results for pp collisions at $\sqrt{s} = 5.02$ and 13 TeV indicates a modest energy dependence of $N_{\rm mpi}$. This observation is consistent with predictions by PYTHIA 8.244. Implicitly, our results also provide experimental evidence of the presence of MPI in hadronic interactions. In addition, we report the multiplicity dependence of $N_{\rm mpi}$ for the three center-of-mass energies. We have found that for $N_{\rm ch} < \langle N_{\rm ch} \rangle$ the event multiplicity increases linearly with the self-normalized $N_{\rm mpi}$, while for $N_{\rm ch} > \langle N_{\rm ch} \rangle$ a deviation with respect to the linear trend of the self-normalized $N_{\rm mpi}$ as a function of event multiplicity is observed. This suggests that these collisions can only be reached by selecting events with high-multiplicity jets. All the results reported in this paper are fully consistent with the so-called “mini-jet analysis” of ALICE, where a quantity sensitive to MPI was measured as a function of multiplicity and the center-of-mass energy. Based on all the crosschecks performed using MC, and the agreement with an independent measurement of ALICE at lower center-of-mass energies, the present results confirm that our approach is robust. Therefore, it can be used by experiments in order to study particle production as a function of MPI. This will help to rule out models, and would contribute to the understanding of the heavy-ion-like features observed in pp data.
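One way to quantify the deviation from the linear trend summarised above is sketched below in Python. This is only an illustration under stated assumptions: the per-class numbers are invented and are not measured values, the helper names (`self_normalise`, `ratio_to_linear`) are hypothetical, and the threshold `x_max` below which the trend is treated as linear is a free parameter of the sketch.

```python
def self_normalise(values, weights):
    """Divide each value by the weighted average over all classes."""
    mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)
    return [v / mean for v in values]


def ratio_to_linear(x, y, x_max):
    """Fit y = slope * x through the origin using points with x <= x_max,
    then return (slope, [y_i / (slope * x_i) for every point])."""
    pts = [(xi, yi) for xi, yi in zip(x, y) if xi <= x_max]
    slope = sum(xi * yi for xi, yi in pts) / sum(xi * xi for xi, _ in pts)
    return slope, [yi / (slope * xi) for xi, yi in zip(x, y)]


# Invented multiplicity classes, NOT measured values
weights = [0.4, 0.3, 0.2, 0.1]   # fraction of events per class
n_ch = [2.0, 6.0, 10.0, 30.0]    # average multiplicity per class
n_mpi = [1.0, 3.0, 5.0, 10.0]    # last class grows slower than linearly

x = self_normalise(n_ch, weights)
y = self_normalise(n_mpi, weights)
slope, ratios = ratio_to_linear(x, y, x_max=3.0)
```

A ratio close to one means that the class follows the linear trend fitted at low multiplicity, while a ratio below one signals the kind of saturation discussed in the text.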
5. Acknowledgments
The authors acknowledge Antonio Paz for providing the simulations with HERWIG 7.2. Support for this work has been received from CONACyT under Grant No. A1-S-22917. E. Z. acknowledges the fellowship of CONACyT.

References

[1] Manuel Bahr, Jonathan M. Butterworth, and Michael H. Seymour. The Underlying Event and the Total Cross Section from Tevatron to the LHC. JHEP, 01:065, 2009.
[2] Torbjorn Sjostrand and Maria van Zijl. Multiple Parton-parton Interactions in an Impact Parameter Picture. Phys. Lett. B, 188:149–154, 1987.
[3] Betty Abelev et al. Transverse sphericity of primary charged particles in minimum bias proton-proton collisions at $\sqrt{s} = 0.9$, 2.76 and 7 TeV. Eur. Phys. J. C, 72:2124, 2012.
[4] Betty Abelev et al. Multiplicity dependence of two-particle azimuthal correlations in pp collisions at the LHC. JHEP, 09:049, 2013.
[5] Antonio Ortiz. Experimental results on event shapes at hadron colliders. Adv. Ser. Direct. High Energy Phys., 29:343–357, 2018.
[6] Antonio Ortiz, Antonio Paz, José D. Romo, Sushanta Tripathy, Erik A. Zepeda, and Irais Bautista. Multiparton interactions in pp collisions from machine learning-based regression. Phys. Rev. D, 102(7):076014, 2020.
[7] Z. Koba, Holger Bech Nielsen, and P. Olesen. Scaling of multiplicity distributions in high-energy hadron collisions. Nucl. Phys. B, 40:317–334, 1972.
[8] G.J. Alner et al. Scaling Violations in Multiplicity Distributions at 200-GeV and 900-GeV. Phys. Lett. B, 167:476–480, 1986.
[9] I.M. Dremin and V.A. Nechitailo. Soft multiple parton interactions as seen in multiplicity distributions at Tevatron and LHC. Phys. Rev. D, 84:034026, 2011.
[10] Boris Blok, Christian D. Jäkel, Mark Strikman, and Urs Achim Wiedemann. Collectivity from interference. JHEP, 12:074, 2017.
[11] Boris Blok and Urs Achim Wiedemann. Collectivity in pp from resummed interference effects? Phys. Lett. B, 795:259–265, 2019.
[12] Vardan Khachatryan et al. Observation of Long-Range Near-Side Angular Correlations in Proton-Proton Collisions at the LHC. JHEP, 09:091, 2010.
[13] Jaroslav Adam et al. Enhanced production of multi-strange hadrons in high-multiplicity proton-proton collisions. Nature Phys., 13:535–539, 2017.
[14] Shreyasi Acharya et al. Multiplicity dependence of light-flavor hadron production in pp collisions at $\sqrt{s} = 7$ TeV. Phys. Rev. C, 99(2):024906, 2019.
[15] Piotr Bozek. Collective flow in p-Pb and d-Pd collisions at TeV energies. Phys. Rev. C, 85:014911, 2012.
[16] James L. Nagle and William A. Zajc. Small System Collectivity in Relativistic Hadronic and Nuclear Collisions. Ann. Rev. Nucl. Part. Sci., 68:211–235, 2018.
[17] Torbjörn Sjöstrand, Stefan Ask, Jesper R. Christiansen, Richard Corke, Nishita Desai, Philip Ilten, Stephen Mrenna, Stefan Prestel, Christine O. Rasmussen, and Peter Z. Skands. An introduction to PYTHIA 8.2. Comput. Phys. Commun., 191:159–177, 2015.
[18] Stefan Gieseke, Christian Rohr, and Andrzej Siodmok. Colour reconnections in Herwig++. Eur. Phys. J. C, 72:2225, 2012.
[19] Antonio Ortiz, Peter Christiansen, Eleazar Cuautle Flores, Ivonne Maldonado Cervantes, and Guy Paić. Color Reconnection and Flowlike Patterns in pp Collisions. Phys. Rev. Lett., 111(4):042001, 2013.
[20] Christian Bierlich and Jesper Roy Christiansen. Effects of color reconnection on hadron flavor observables. Phys. Rev. D, 92(9):094010, 2015.
[21] Ranjit Nayak, Subhadip Pal, and Sadhana Dash. Effect of rope hadronization on strangeness enhancement in p−p collisions at LHC energies. Phys. Rev. D, 100(7):074023, 2019.
[22] Aditya Nath Mishra, Antonio Ortiz, and Guy Paic. Intriguing similarities of high-$p_{\rm T}$ particle production between pp and A−A collisions. Phys. Rev. C, 99(3):034911, 2019.
[23] P.M. Jacobs. Search for jet quenching effects in high multiplicity pp collisions at $\sqrt{s} = 13$ TeV. Nucl. Phys. A, 1005:121924, 2021.
[24] Gyula Bencédi, Antonio Ortiz, and Sushanta Tripathy. Apparent modification of the jet-like yield in proton-proton collisions with large underlying event. J. Phys. G, 48(1):015007, 2020.
[25] Shreyasi Acharya et al. Charged-particle production as a function of multiplicity and transverse spherocity in pp collisions at $\sqrt{s} = 5.02$ and 13 TeV. Eur. Phys. J. C, 79(10):857, 2019.
[26] A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. von Toerne, H. Voss, M. Backes, T. Carli, O. Cohen, A. Christov, D. Dannheim, K. Danielowski, S. Henrot-Versille, M. Jachowski, K. Kraszewski, A. Krasznahorkay Jr., M. Kruk, Y. Mahalalel, R. Ospanov, X. Prudent, A. Robert, D. Schouten, F. Tegenfeldt, A. Voigt, K. Voss, M. Wolter, and A. Zemla. TMVA - Toolkit for Multivariate Data Analysis, 2007.
[27] Richard Corke and Torbjorn Sjostrand. Interleaved Parton Showers and Tuning Prospects. JHEP, 03:032, 2011.
[28] Eleazar Cuautle, Antonio Ortiz, and Guy Paic. Effects produced by multi-parton interactions and color reconnection in small systems. Nucl. Phys. A, 956:749–752, 2016.
[29] Peter Skands, Stefano Carrazza, and Juan Rojo. Tuning PYTHIA 8.1: the Monash 2013 Tune. Eur. Phys. J. C, 74(8):3024, 2014.
[30] Johannes Bellm et al. Herwig 7.2 release note. Eur. Phys. J. C, 80(5):452, 2020.
[31] Antonio Ortiz, Gyula Bencedi, and Héctor Bello. Revealing the source of the radial flow patterns in proton–proton collisions using hard probes. J. Phys. G, 44(6):065001, 2017.
[32] A.I. Golokhvastov. Independent production and Poisson distribution.