On feasibility of azimuthal flow studies with Principal Component Analysis

Abstract

It is shown that Principal Component Analysis applied to azimuthal single-particle distributions makes it possible to perform flow analysis in ways that are analogous to the traditional approaches based on multi-particle correlations. In particular, symmetric cumulants are considered. It is also demonstrated that statistical fluctuations due to a finite number of particles per event play practically no role for higher-order PCA-based cumulants.

Igor Altsybeev
Saint-Petersburg State University, Universitetskaya nab. 7/9, St. Petersburg, 199034, Russia
(Dated: July 27, 2020)

I. INTRODUCTION

Principal Component Analysis (PCA) is a method for decorrelating multivariate data. PCA finds the optimal basis for a given problem and thus reduces its dimensionality. Recently, it was suggested to apply PCA to heavy-ion collision data on two-particle azimuthal correlations [1] in order to reveal hidden patterns in the collective behaviour of the hadronic medium. In [2] it was shown that PCA applied directly to single-particle azimuthal ($\varphi$) distributions in A–A collisions reveals Fourier harmonics as the natural and optimal basis.

In the latter approach, technically, one takes the distributions of particles in $M$ bins in each of $N$ events, normalizes them and subtracts the mean, and then applies PCA to the resulting $N \times M$ matrix (see more details in [2, 3]). The output of PCA is a set of orthonormal eigenvectors $e_i$ ($i = 1, \dots, M$) and a set of coefficients $\alpha_i^{(k)}$ ($k = 1, \dots, N$) of the PCA decomposition, such that the particle distribution in the $k$-th event (denoted $x^{(k)}$, a vector with $M$ elements) can be written as

$$x^{(k)} = \sum_{i=1}^{M} \alpha_i^{(k)} e_i . \tag{1}$$

By construction, the first $K$ components ($K < M$) contain most of the total variance of the dataset.

The azimuthal flow in heavy-ion collisions is typically studied by expanding the azimuthal probability density of particles in a series:

$$f(\varphi) = \frac{1}{2\pi}\Big[1 + 2\sum_{n=1}^{\infty} v_n \cos\big(n(\varphi - \Psi_n)\big)\Big] , \tag{2}$$

where $v_n$ are the flow coefficients. As mentioned above, PCA applied to event-by-event azimuthal single-particle distributions reveals the Fourier basis, and thus the coefficients of the PCA decomposition acquire a definite meaning.
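The procedure described above can be sketched numerically. The following is a minimal toy, not the analysis code of this paper: the function names `toy_histograms` and `pca_decompose`, the toy parameters (2000 events, 500 particles per event, $v_2 = 0.1$), the sampling of bin contents directly from the binned form of (2), and the choice to subtract the per-bin mean over events are all assumptions made for illustration. PCA is done here through a singular value decomposition of the centered matrix.

```python
import numpy as np

def toy_histograms(n_events=2000, n_part=500, v2=0.1, m_bins=48, seed=0):
    """Toy per-event azimuthal histograms with a pure elliptic modulation.

    Assumption: bin contents are drawn from a multinomial distribution whose
    probabilities follow Eq. (2) with n = 2 only, evaluated at bin centres.
    """
    rng = np.random.default_rng(seed)
    phi = (np.arange(m_bins) + 0.5) * 2.0 * np.pi / m_bins  # bin centres
    hist = np.empty((n_events, m_bins))
    for k in range(n_events):
        psi = rng.uniform(0.0, 2.0 * np.pi)                 # random plane angle
        p = 1.0 + 2.0 * v2 * np.cos(2.0 * (phi - psi))
        hist[k] = rng.multinomial(n_part, p / p.sum())
    return phi, hist

def pca_decompose(hist):
    """Return centered data x (N x M), coefficients alpha (N x M), eigenvectors e (M x M)."""
    x = hist / hist.sum(axis=1, keepdims=True)   # normalize each event
    x = x - x.mean(axis=0)                       # subtract the mean over events (assumed)
    u, s, e = np.linalg.svd(x, full_matrices=False)
    alpha = u * s                                # alpha[k, i] multiplies eigenvector e[i]
    return x, alpha, e

phi, hist = toy_histograms()
x, alpha, e = pca_decompose(hist)

# Eq. (1): every event is recovered from its coefficients and the eigenvectors.
print("max reconstruction error:", np.abs(alpha @ e - x).max())

# Squared projection of the two leading eigenvectors onto span{cos 2phi, sin 2phi}.
basis = np.sqrt(2.0 / phi.size) * np.stack([np.cos(2.0 * phi), np.sin(2.0 * phi)])
overlap = np.sum((basis @ e[:2].T) ** 2, axis=0)
print("overlap with second harmonic:", overlap)
```

In this toy the rows of `e` play the role of the eigenvectors $e_i$ and `alpha[k]` holds the coefficients of the $k$-th event. Since only the second harmonic is generated, the two leading eigenvectors are expected to lie close to the plane spanned by $\cos 2\varphi$ and $\sin 2\varphi$.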
If the decomposition (2) is applied event by event, the values of $\hat v_n$ observed in the $k$-th event are related to the PCA coefficients (assuming that the elliptic flow dominates and the triangular flow is next) as follows: $\hat v_2^{(k)} = c_M \sqrt{\big(\alpha_1^{(k)}\big)^2 + \big(\alpha_2^{(k)}\big)^2}$, $\hat v_3^{(k)} = c_M \sqrt{\big(\alpha_3^{(k)}\big)^2 + \big(\alpha_4^{(k)}\big)^2}$, and so on, where $c_M$ denotes the normalization prefactor, a square root of an expression involving the number of bins $M$.

In this paper, it is demonstrated that the coefficients of the PCA decomposition can be combined into expressions that are equivalent to the multi-particle cumulants used in flow studies. The first observable, considered in Section II, is the flow amplitude calculated via two- and four-particle correlations. The so-called symmetric cumulants, which measure correlations between the amplitudes of flow harmonics of different orders, are studied in Section III. It is also estimated how statistical fluctuations contribute to the values extracted using PCA.

II. HIGHER-ORDER CUMULANTS FROM PCA

In flow studies, the simplest way to estimate the amplitudes $v_n$ is to use the two-particle cumulants $c_n\{2\}$:

$$v_n\{2\} = \sqrt{c_n\{2\}} = \sqrt{\langle v_n^2 \rangle} . \tag{3}$$

It is well known that this quantity suffers from so-called non-flow contributions coming from, e.g., resonance decays and jets. The non-flow is typically suppressed by using multi-particle cumulants. For example, the amplitude of the $n$-th harmonic can be estimated from the fourth-order cumulant $c_n\{4\}$ [4] as

$$v_n\{4\} = \sqrt[4]{-c_n\{4\}} = \sqrt[4]{2\langle v_n^2 \rangle^2 - \langle v_n^4 \rangle} . \tag{4}$$

We may try to adapt (3) and (4) for flow studies with PCA. When one deals with event-by-event particle distributions (as in the present application of PCA), it is essential to investigate the influence of statistical fluctuations due to the limited number of particles per event. Following the approach used, for instance, in [5, 6], we denote the true amplitude of the $n$-th Fourier harmonic in a given event as $v_n$ and the amplitude of the statistical noise as $a_n$. If $\hat v_n$ is the observed amplitude, extracted by PCA in a given event, the projections of the corresponding flow vector on the $x$ and $y$ axes in the transverse plane are

$$\hat v_{n,x} = v_{n,x} + a_{n,x} , \qquad \hat v_{n,y} = v_{n,y} + a_{n,y} . \tag{5}$$

The squares of $v_n$, $a_n$ and $\hat v_n$ are

$$v_n^2 = v_{n,x}^2 + v_{n,y}^2 , \tag{6}$$

$$a_n^2 = a_{n,x}^2 + a_{n,y}^2 , \tag{7}$$

$$\hat v_n^2 = \hat v_{n,x}^2 + \hat v_{n,y}^2 = v_n^2 + a_n^2 + 2\,(v_{n,x} a_{n,x} + v_{n,y} a_{n,y}) . \tag{8}$$

After averaging (8) over events, we get

$$\langle \hat v_n^2 \rangle = \langle v_n^2 \rangle + \langle a_n^2 \rangle + 2\langle v_{n,x} a_{n,x} \rangle + 2\langle v_{n,y} a_{n,y} \rangle . \tag{9}$$

If we assume that the signal and the statistical noise are uncorrelated, the last two terms factorize: $\langle v_{n,x} a_{n,x} \rangle = \langle v_{n,x} \rangle \langle a_{n,x} \rangle$ and $\langle v_{n,y} a_{n,y} \rangle = \langle v_{n,y} \rangle \langle a_{n,y} \rangle$.
Since the event-averaged values of the $x$- and $y$-components of $v_n$ and $a_n$ are zero, (9) becomes

$$\langle \hat v_n^2 \rangle = \langle v_n^2 \rangle + \langle a_n^2 \rangle , \tag{10}$$

and the true value $\langle v_n^2 \rangle$ is found by inverting (10):

$$\langle v_n^2 \rangle = \langle \hat v_n^2 \rangle - \langle a_n^2 \rangle . \tag{11}$$

This result was obtained in [2] and can be used to estimate the Fourier amplitudes $v_n\{2\}$ using (3). The values $\langle a_n^2 \rangle$ measure the statistical fluctuations. They can be calculated by applying PCA to the same events, but with randomized $\varphi$-angles.

The effect of the statistical noise correction on $v_2$ is shown in Figure 1 for Pb-Pb events simulated with the AMPT generator (2.8 × […] events). The uncorrected raw $\hat v_2$ values (upper gray diamonds), extracted directly from PCA, become the blue open circles after the correction. These circles lie on top of the values obtained with the traditional two-particle cumulant method (full circles). It can be seen that the effect of the correction is more pronounced for peripheral events, where the number of particles per event is lower.

For the fourth power of the observed magnitude, we can use (8) again:

$$\hat v_n^4 = \big[ v_n^2 + a_n^2 + 2\,(v_{n,x} a_{n,x} + v_{n,y} a_{n,y}) \big]^2 . \tag{12}$$

Averaging (12) over events and taking into account that the $x$ and $y$ components of the noise are independent and their variances are equal, $\langle a_{n,x} a_{n,y} \rangle = \langle a_{n,x} \rangle \langle a_{n,y} \rangle$ and $\langle a_{n,x}^2 \rangle = \langle a_{n,y}^2 \rangle$, we get

$$\langle \hat v_n^4 \rangle = \langle v_n^4 \rangle + \langle a_n^4 \rangle + 4 \langle v_n^2 \rangle \langle a_n^2 \rangle . \tag{13}$$

From (11) and (13),

$$2\langle \hat v_n^2 \rangle^2 - \langle \hat v_n^4 \rangle = 2\langle v_n^2 \rangle^2 - \langle v_n^4 \rangle + 2\langle a_n^2 \rangle^2 - \langle a_n^4 \rangle , \tag{14}$$

and, inverting (14) and substituting into (4), we get the following estimate for $v_n$:

$$v_n\{4\} = \sqrt[4]{2\langle \hat v_n^2 \rangle^2 - \langle \hat v_n^4 \rangle - \big( 2\langle a_n^2 \rangle^2 - \langle a_n^4 \rangle \big)} . \tag{15}$$

From this expression, one may note that when $v_n \gg a_n$, the values of $v_n\{4\}$ are remarkably insensitive to the statistical noise. Indeed, for a hypothetical case of constant magnitudes of the flow and the statistical noise, $v_n\{4\} = \sqrt[4]{\hat v_n^4 - a_n^4}$, while $v_n\{2\} = \sqrt{\hat v_n^2 - a_n^2}$.

In Figure 1, open squares correspond to estimates of $v_n\{4\}$ based on PCA coefficients, formula (15). They match the small closed squares that stand for $v_n\{4\}$ calculated with the traditional approach using four-particle correlations. At the same time, it can be seen that the raw values $\hat v_n\{4\}$ (diamonds), calculated by (15) without the statistical term, are almost on top of the squares even for peripheral events. This indicates that the correction for statistical fluctuations is almost irrelevant for $v_n\{4\}$ as soon as the number of particles in events is large enough. Toy studies showed that this is the case when the flow magnitude is $v_n \sim 0.$[…] $\gtrsim$ […]
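The corrections in (11) and (15) can be tried out on a simple toy. The sketch below is illustrative only and is not the AMPT-based analysis: it assumes a constant true amplitude, isotropic Gaussian noise on the flow vector, and a second, independent noise sample standing in for the events with randomized $\varphi$-angles. The helper names and the parameters `v_true` and `sigma` are hypothetical. For this particular noise model $2\langle a_n^2 \rangle^2 = \langle a_n^4 \rangle$, so the noise term in (15) vanishes in expectation, whereas the two-particle estimate still needs the subtraction in (11).

```python
import numpy as np

def v2_estimate(vhat_sq, a_sq=None):
    """Eq. (3); if a_sq is given, the noise is subtracted as in Eq. (11)."""
    noise = 0.0 if a_sq is None else np.mean(a_sq)
    return np.sqrt(np.mean(vhat_sq) - noise)

def v4_estimate(vhat_sq, a_sq=None):
    """Eq. (15); if a_sq is None, the statistical term is omitted (raw value)."""
    raw = 2.0 * np.mean(vhat_sq) ** 2 - np.mean(vhat_sq ** 2)
    noise = 0.0 if a_sq is None else 2.0 * np.mean(a_sq) ** 2 - np.mean(a_sq ** 2)
    return (raw - noise) ** 0.25

rng = np.random.default_rng(1)
n_events, v_true, sigma = 200_000, 0.08, 0.03      # hypothetical toy values

psi = rng.uniform(0.0, 2.0 * np.pi, n_events)
v_xy = v_true * np.stack([np.cos(psi), np.sin(psi)])   # true flow vector
a_xy = rng.normal(0.0, sigma, size=(2, n_events))      # noise in the "data"
a_ref = rng.normal(0.0, sigma, size=(2, n_events))     # independent noise reference

vhat_sq = np.sum((v_xy + a_xy) ** 2, axis=0)           # Eq. (8), event by event
a_sq = np.sum(a_ref ** 2, axis=0)                      # Eq. (7), from the reference

v2_raw, v2_corr = v2_estimate(vhat_sq), v2_estimate(vhat_sq, a_sq)
v4_raw, v4_corr = v4_estimate(vhat_sq), v4_estimate(vhat_sq, a_sq)
print("v{2}: raw", v2_raw, "corrected", v2_corr)
print("v{4}: raw", v4_raw, "corrected", v4_corr)
```

In this toy the raw two-particle value is biased upwards by the noise, while the raw and corrected four-particle values both stay close to `v_true`.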

[Figure 1: $v_2$ versus centrality (%), AMPT Pb-Pb at 5 TeV. Legend: PCA $v_2\{2\}$, trad. $v_2\{2\}$, PCA $v_2\{4\}$, trad. $v_2\{4\}$.]

FIG. 1. Amplitudes $v_2$ of the second Fourier harmonic as a function of centrality in Pb-Pb collisions at 5 TeV in AMPT (2.8 × […] events). Values extracted by PCA are shown by open markers: circles for $v_2\{2\}$, squares for $v_2\{4\}$. Small full markers show calculations using the standard 2- and 4-particle cumulant methods. Gray diamonds denote PCA values before the correction for statistical noise. The analysis is performed for charged particles with pseudorapidity $|\eta| <$ […]; the transverse momentum ($p_T$) range is 0.2–5 GeV/$c$. The number of $\varphi$ bins used for PCA is $M = 48$.

III. SYMMETRIC CUMULANTS WITH PCA

Following the same strategy, we can investigate the feasibility of studying the so-called symmetric cumulants [7] with PCA. This observable measures correlations between the amplitudes of the $n$-th and $m$-th Fourier harmonics and is defined as

$$\mathrm{SC}(n, m) = \langle v_n^2 v_m^2 \rangle - \langle v_n^2 \rangle \langle v_m^2 \rangle . \tag{16}$$

A previous attempt to study the symmetric cumulants with PCA was made in [3]. Since the PCA basis obtained in [3] was somewhat distorted (i.e. not identical to the Fourier harmonics), the SC values extracted in that paper did not match the "truth" values.

We start with the single-event raw quantity:

$$\hat v_n^2 \hat v_m^2 = (\hat v_{n,x}^2 + \hat v_{n,y}^2)(\hat v_{m,x}^2 + \hat v_{m,y}^2)$$

$$= \big[ (v_{n,x} + a_{n,x})^2 + (v_{n,y} + a_{n,y})^2 \big] \big[ (v_{m,x} + a_{m,x})^2 + (v_{m,y} + a_{m,y})^2 \big]$$

$$= \big[ v_n^2 + a_n^2 + 2\,(v_{n,x} a_{n,x} + v_{n,y} a_{n,y}) \big] \big[ v_m^2 + a_m^2 + 2\,(v_{m,x} a_{m,x} + v_{m,y} a_{m,y}) \big] . \tag{17}$$

Averaging over events and taking into account $\langle v_{n,x} \rangle = \langle v_{n,y} \rangle = \langle a_{n,x} \rangle = \langle a_{n,y} \rangle = 0$ (the same for the $m$-th harmonic), and also the factorization of the noise harmonics, $\langle a_n^2 a_m^2 \rangle = \langle a_n^2 \rangle \langle a_m^2 \rangle$, as well as of the noise and the signal, $\langle v_n^2 a_m^2 \rangle = \langle v_n^2 \rangle \langle a_m^2 \rangle$, we obtain

$$\langle \hat v_n^2 \hat v_m^2 \rangle = \langle v_n^2 v_m^2 \rangle + \langle v_n^2 \rangle \langle a_m^2 \rangle + \langle v_m^2 \rangle \langle a_n^2 \rangle + \langle a_n^2 \rangle \langle a_m^2 \rangle . \tag{18}$$

Using (11) and (18), the desired term $\langle v_n^2 v_m^2 \rangle$ can be expressed via "measurable" quantities as

$$\langle v_n^2 v_m^2 \rangle = \langle \hat v_n^2 \hat v_m^2 \rangle - \langle \hat v_n^2 \rangle \langle a_m^2 \rangle - \langle \hat v_m^2 \rangle \langle a_n^2 \rangle + \langle a_n^2 \rangle \langle a_m^2 \rangle . \tag{19}$$

The final expression for $\mathrm{SC}(n, m)$ is thus

$$\mathrm{SC}(n, m) = \langle v_n^2 v_m^2 \rangle - \langle v_n^2 \rangle \langle v_m^2 \rangle = \langle \hat v_n^2 \hat v_m^2 \rangle - \langle \hat v_n^2 \rangle \langle \hat v_m^2 \rangle . \tag{20}$$

It is remarkable that all terms related to the statistical noise cancel.

Figure 2 shows the centrality dependence of the symmetric cumulants SC(3,2) and SC(4,2) in Pb-Pb collisions from AMPT. It can be seen that the values extracted from PCA (open markers) match the calculations using multi-particle correlations (closed markers). For SC(4,2), there are slight deviations in peripheral centrality classes; a possible reason is the event-plane correlation of these two harmonics. A detailed investigation of this is beyond the scope of this article.
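The cancellation of the noise terms in (20) can be illustrated with a toy calculation. The sketch below is not the AMPT analysis: the correlated amplitudes `v2` and `v4`, the shared fluctuation `g`, the noise width, and the helper names are hypothetical choices, and the noise is assumed Gaussian, isotropic, and independent between the two harmonics and of the signal, as required for the factorizations leading to (18). The plane angles of the two harmonics are drawn independently, so any event-plane correlation is deliberately left out.

```python
import numpy as np

def symmetric_cumulant(vn_sq, vm_sq):
    """SC(n, m) = <v_n^2 v_m^2> - <v_n^2><v_m^2>, cf. Eqs. (16) and (20)."""
    return np.mean(vn_sq * vm_sq) - np.mean(vn_sq) * np.mean(vm_sq)

def observed_sq_amplitude(v, sigma, rng):
    """Squared observed amplitude, Eq. (8), for true amplitudes v plus Gaussian noise."""
    psi = rng.uniform(0.0, 2.0 * np.pi, v.size)
    vx = v * np.cos(psi) + rng.normal(0.0, sigma, v.size)
    vy = v * np.sin(psi) + rng.normal(0.0, sigma, v.size)
    return vx ** 2 + vy ** 2

rng = np.random.default_rng(2)
n_events = 500_000
g = rng.uniform(-1.0, 1.0, n_events)       # shared fluctuation, correlates amplitudes
v2 = 0.08 * (1.0 + 0.3 * g)                # hypothetical true amplitudes
v4 = 0.03 * (1.0 + 0.3 * g)

sc_true = symmetric_cumulant(v2 ** 2, v4 ** 2)
sc_raw = symmetric_cumulant(observed_sq_amplitude(v2, 0.02, rng),
                            observed_sq_amplitude(v4, 0.02, rng))
print("SC from true amplitudes:    ", sc_true)
print("SC from observed amplitudes:", sc_raw)
```

Under these assumptions the value computed from the noisy amplitudes, without any noise subtraction, agrees with the value computed from the true amplitudes up to statistical fluctuations.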

[Figure 2: SC(m, n) versus centrality (%), vertical axis in units of 1e-6, AMPT Pb-Pb at 5 TeV. Legend: SC(3,2) PCA, SC(3,2) trad., SC(4,2) PCA, SC(4,2) trad.]

FIG. 2. Centrality dependence of the symmetric cumulants SC(3,2) and SC(4,2) in Pb-Pb collisions from AMPT. Open markers: values extracted from PCA; closed markers: values from the traditional method of multi-particle correlations. Particles within $|\eta| <$ […], $p_T$ range 0.2–5 GeV/$c$.

IV. SUMMARY

It was shown that Principal Component Analysis applied to event-by-event azimuthal single-particle distributions makes it possible to perform flow analyses that are analogous to the traditional approaches based on multi-particle correlations. As the first example, flow amplitudes based on the fourth-order cumulant were considered. As the second case, correlations between flow amplitudes in terms of symmetric cumulants were calculated. Using realistic events from the AMPT generator, PCA results were directly compared to calculations using the traditional techniques, and a good correspondence was obtained. It was also demonstrated that the contribution of statistical fluctuations due to a finite number of particles per event to the higher-order PCA-based cumulants is small.

ACKNOWLEDGEMENTS

This study is supported by the Russian Science Foundation, grant 17-72-20045.

[1] R. S. Bhalerao, J.-Y. Ollitrault, S. Pal, and D. Teaney, Phys. Rev. Lett. […], 152301 (2015), arXiv:1410.7739 [nucl-th].
[2] I. Altsybeev, Phys. Part. Nucl. […], 314 (2020), arXiv:1909.03979 [nucl-th].
[3] Z. Liu, W. Zhao, and H. Song, Eur. Phys. J. C […], 870 (2019), arXiv:1903.09833 [nucl-th].
[4] N. Borghini, P. M. Dinh, and J.-Y. Ollitrault, Phys. Rev. C […], 054901 (2001), arXiv:nucl-th/0105040.
[5] J. Jia and P. Huo, Phys. Rev. C […], 034905 (2014), arXiv:1402.6680 [nucl-th].
[6] R. He, J. Qian, and L. Huo, (2017), arXiv:1702.03137 [nucl-th].
[7] A. Bilandzic, C. H. Christensen, K. Gulbrandsen, A. Hansen, and Y. Zhou, Phys. Rev. C89