[PDF] Development of a Machine Learning Based Analysis Chain for the Measurement of Atmospheric Muon Spectra with IceCube

Abstract

High-energy muons from air shower events detected in IceCube are selected using state of the art machine learning algorithms. Attributes to distinguish a HE-muon event from the background of low-energy muon bundles are selected using the mRMR algorithm and the events are classified by a random forest model. In a subsequent analysis step the obtained sample is used to reconstruct the atmospheric muon energy spectrum, using the unfolding software TRUEE. The reconstructed spectrum covers an energy range from 10^4\,GeV to 10^6\,GeV. The general analysis scheme is presented, including results using the first year of data taken with IceCube in its complete configuration with 86 instrumented strings.

Full PDF

XXXV European Cosmic Ray Symposium, Turin, Sept. 4-9 2016 Development of a Machine Learning Based Analysis Chain for the Measurementof Atmospheric Muon Spectra with IceCube

Tomasz Fuchs for the IceCube Collaboration ∗ Technical University Dortmund, Experimental Physics V,D-44221 Dortmund, Germany

High-energy muons from air shower events detected in IceCube are selected using state of the artmachine learning algorithms. Attributes to distinguish a HE-muon event from the background oflow-energy muon bundles are selected using the mRMR algorithm and the events are classiﬁed bya random forest model. In a subsequent analysis step the obtained sample is used to reconstructthe atmospheric muon energy spectrum, using the unfolding software TRUEE. The reconstructedspectrum covers an energy range from 10 GeV to 10 GeV. The general analysis scheme is presented,including results using the ﬁrst year of data taken with IceCube in its complete conﬁguration with86 instrumented strings.

I. INTRODUCTION

IceCube is a cubic kilometer detector array locatedat the geographic South Pole. Its 5160 Digital OpticalModules (DOMs) are used to detect secondary muonsproduced either in neutrino interactions with ice orbedrock, or in cosmic ray air showers. The combi-nation of the conventional (i.e. produced in pion andkaon decays) atmospheric and astropysical neutrinoﬂux dominates over the expected prompt componentwhich is produced mainly by charmed hadron decaysin the atmosphere. Therefore an accurate measure-ment of the prompt ﬂux magnitude is a diﬃcult taskfor a large-volume neutrino detector. Since the energyspectrum of atmospheric muons has no astrophysicalcomponent, this ﬂux can be studied to determine themagnitude of the prompt ﬂux which is produced inleptonic decays of charmed hadrons and unﬂavoredmesons.In general high-energy muons from air showersare accompanied by a bundle of low-energy muons.Therefore, the detection of HE-muons within a muonbundle is a challenging task as the IceCube detec-tor lacks the spatial resolution to resolve individualmuons within a muon bundle. To analyze the atmo-spheric muons we developed a data mining based anal-ysis scheme that selects algorithmically from a largepool of reconstructed event properties the most pow-erful ones for distinguishing signal and background.A similar approach was used to analyze atmosphericneutrinos[1] in IceCube.The developed analysis chain was used to recon-struct the spectrum of high-energy muons reachingthe IceCube in-ice detector. For this analysis a HE-muon is deﬁned as a muon which fulﬁlls the followingcondition E µ, HE ≥ . E bundle , tot . . (1) ∗ http://icecube.wisc.edu E µ, HE describes the energy of the HE-muon and E bundle , tot . the total bundle energy including theHE-muon. Since the muons within the bundle cannotbe resolved spatially, high-energy muons can only beidentiﬁed indirectly from the relation between catas-trophic and continuous energy losses, and separationfrom background is challenging.Our analysis chain has the following three parts:a) the selection of event properties (hereafter referredto as attributes) to be used in the analysis, b) theclassiﬁcation of events into signal and background us-ing these attributes, and c) the reconstruction of thespectrum of the HE-muons at the surface. II. ATTRIBUTE SELECTION

Choosing the right attributes is a crucial step forevery analysis. For this proceeding every value whichis measured or reconstructed for every single detectedor simulated event is referred to as an attribute. Fromthe pool of all available attributes we select algorith-mically those that agree between data and simulationusing a comparison of the integrated distributions.This is also a criterion in the attribute selection sinceonly well simulated attributes can be used to drawconclusions for measured events. For following stepsit is common to manually select attributes based onexperience and expert knowledge. In this analysis thelarge dimensionality of the set of attributes rendersthis procedure impossible.An algorithmic approach like the mRMRalgorithm[2] is necessary. In the mRMR algo-rithm the relevance D and the redundancy R of aspeciﬁc set of attributes is analyzed. In a speciﬁcattribute set the relevance D describes the correlationof each attribute to the signal and background eventswhereas the redundancy R quantiﬁes the correlationbetween the attributes in this set. In the mRMRalgorithm a maximization of the equationΦ( D, R ) = D − R (2) eConf TBA a r X i v : . [ a s t r o - ph . I M ] J a n XXV European Cosmic Ray Symposium, Turin, Sept. 4-9 2016 is performed. The attribute set is built up iterativelyby adding a single attribute which maximizes Φ(

D, R )in equation 2 in each step. The advantage of thisprocedure is the possibility to determine the best at-tributes which are able to classify the data for anysize of the attribute set. In this analysis the ﬁnal at-tribute set uses 30 attributes which proved to be agood tradeoﬀ between execution time and separationquality.

III. EVENT CLASSIFICATION

With the selected set of attributes a classiﬁcationof events into signal and background has to be done.This can be achieved by splitting the event sampleat a certain value of a certain attribute (a so called”straight cut”), but algorithms from computer sciencelike deep neural networks[4], boosted decision trees[5]or random forests[6] are increasingly used and showcomparable or superior performance. Here the ran-dom forest algorithm is used to perform the selectionof HE-muons. A random forest is an ensemble of singledecision trees which select random attributes at everyknot and is proven to be more robust against over-training. In the case of a two-class classiﬁcation prob-lem the random forest determines a score which indi-cates the most likely class aﬃliation. Here the scoreis used to distinguish between signal (HE-muons) andbackground events (low-energy muon bundles). Be-cause the random forest builds multiple random deci-sion trees, its score can simply be calculated by thefraction of single trees which classiﬁed an event to besignal T + over the total number of built trees T . S = T + /T (3)To avoid a misleading classiﬁcation result by an over-trained model the data to build the model is split intoa training and a test set with randomly chosen events.The results of the model trained on the training setapplied to the test set can be seen in Figure 1. Bothclasses can be distinguished in the score distribution,so it can be used to separate the HE-muons from back-ground events using a straight cut in the score value.To evaluate the optimal cut values, diﬀerent perfor-mance indicators are used. The most common indica-tors are the eﬃciency and the purity of a potential setof events. The eﬃciency describes the fraction of sig-nal events above a speciﬁc score value. The purity isthe ratio of signal events to the total number of eventsabove a speciﬁc score value. The requirements on thepurity and eﬃciency are problem dependent so that ageneral approach is not available. In this analysis ascore cut of S ≥ . . ± .

6) % and a purity of (93 . ± .

4) %.The score cut was chosen because for higher cut val-ues the eﬃciency of the resulting sample drops signif-icantly. To rule out a possible mismatch between data andsimulation the score distribution on simulated eventsis compared to data events. In Figure 2 an agree-ment within 20 % between simulated and analyzeddata events is seen. Since the data used to train themodel is limited there is an uncertainty for the scoredistributions as well as the eﬃciency and purity. Theuncertainty for the score, eﬃciency and purity are es-timated using a 5-fold cross validation. In a crossvalidation the training set is split into n parts. Themodel is trained on n − FIG. 1: Random Forest score for the training and thetest set for the separation of HE-muons and eventswithout high-energy muons. Additionally the chosenscore limit of 0 . . eConf TBA XV European Cosmic Ray Symposium, Turin, Sept. 4-9 2016 − . . . . . . . log ( d E/ d x SplineMPE BINS / (GeV m − )) . . . . . . l og ( E µ / G e V ) − − − FIG. 3: Correlation between the surface muon energy andthe energy estimator. For each energy estimator bin themean and the standard deviation of the surface energy areshown.

Depth / km . . . . . l og ( E µ / G e V ) − − − FIG. 4: Correlation between the surface muon energy andthe propagated distance in the ice. For each distance bin themean and the standard deviation of the surface energy areshown.

IV. UNFOLDING

The muon energy at the surface cannot be mea-sured directly. Its distribution has to be reconstructedfrom observables that can be measured at the detec-tor. A common method is the forward folding whichﬁts the expected event properties to the data vary-ing parameters of a model of the muon energy spec-trum at the surface. This method has the disadvan-tage that the resulting spectral shape is limited to thechosen parametrization. An alternative approach tothis is the unfolding method which estimates the sur-face muon spectrum directly by inverting the responsefunctions which describe the expected event propertiesfor each value of the surface energy of the muon. This method is based on a Fredholm integral of the secondkind which can be discretized to (cid:126)g = A (cid:126)f + (cid:126)b. (4)The sought after distribution (cid:126)f is folded with a re-sponse matrix A . The response matrix A describesthe expected distribution of a speciﬁc attribute (cid:126)g foreach bin in surface muon energy. A background con-tribution (cid:126)b has to be considered in addition to obtainthe expected distribution of (cid:126)f . The matrix A is deter-mined from simulated events and the reconstruction of (cid:126)f is done using a regularized[7] likelihood ﬁt with thesoftware TRUEE[8]. In the case of HE-muons two at-tributes are used for the reconstruction of the surfaceenergy. This surface energy of the muon is determinedby the reconstructed energy losses of the muon in thedetector and the estimated energy losses occuring dur-ing the propagation from the surface to the detector.To account for their energy loss prior to reaching Ice-Cube, the distance from the surface to the detectorboundary is used since no measurement can be per-formed outside the detector. The correlation betweenthe surface energy E µ and the reconstructed energyloss dE / dx SplineMPEBINS and the distance from the sur-face

Depth of the events are shown in the Figures 3and 4.Using these attributes a surface energy spectrumof the HE-muons can be determined and is shown inFigure 5. To enhance the visibility of spectral fea-tures of the reconstructed ﬂux the result is multipliedby ( E µ / GeV) . The blue lines in Figure 5 are theo-retic predictions for the ﬂux and are split into a con-ventional part (using SIBYLL-2.1[10] as the hadronicinteraction model and GaisserH3a[11] as the cosmic-ray spectrum) and a prompt contribution from charm(ERS[12]) and unﬂavored (Unﬂavored[13]) particles.A previous analysis without a machine learning ap-proach published in [9] is shown with its uncertaintiesas a gray band. The results of this analysis are plottedas black circles.The uncertainty of this analysis has two contribu-tions. A smaller one from the limited statistics of ob-served events and a larger one from the limited statis-tics of simulated signal events. To estimate the uncer-tainty from the limited simulation events the spectrumwas reconstructed using multiple resampled subsets ofthe total set. The variance of the spectra was thenconsidered in the estimation of the uncertainty. Thisyields a large uncertainty, especially for the higher en-ergies ( E µ > GeV). This is due to the fact thatevents with more energy are far less often simulateddue to the assumed E − cosmic ray spectrum in thesimulation. eConf TBA XXV European Cosmic Ray Symposium, Turin, Sept. 4-9 2016 E µ / GeV10 -3 -2 -1 Φ µ × ( E µ / G e V ) / ( c m s s r G e V ) − IceCube Preliminary

SIBYLL-2.1 (GaisserH3a)ERSUnflavoredSIBYLL-2.1 + ERS + UnflavoredAstropart. Phys. 78 (2016) 1-27IC86-I+ σ stat . + σ MC FIG. 5: Unfolded muon spectrum multiplied with E using the IC86-I data. The reconstructed muon spectrum fromAP.78 is shown in gray. In blue the theoretic prediction for a conventional part (SIBYLL-2.1[10]) with the assumedGaisserH3a[11] cosmic ray ﬂux model in brackets are shown and a the contribution by a prompt component from charm(ERS[12]) and unﬂavored (Unﬂavored[13]) particles. V. CONCLUSION AND FUTUREPERSPECTIVES

The described machine learning based analysischain proved that it can be used to seperate HE-muonsfrom background events and unfold their energy spec-trum. The reconstructed spectrum of HE-muons iscompatible with theoretical predictions and with pre-vious results from the IceCube collaboration. Thisspectrum does not include systematic eﬀects com-pared to the published results shown in the gray band.Therefore it is expected that the uncertainty of the re-constructed spectrum will increase with the inclusion of systematic eﬀects. The statistical signiﬁcance is notsuﬃcient to determine whether a prompt contributionis present on the predicted level in the resulting spec-trum.Since the uncertainty is due to the lack of simulatedevents, improvements of the ﬁnal result are expectedwith growing numbers of simulations. The availablesimulation statistics for the years 2012 and above is 7times larger than in this analysis. This will decreasethe statistical uncertainty of the simulation and ex-tend the results of this analysis to higher energies. [1] M. G. Aartsen et al. , The European Physical JournalC (2015)[2] C. Ding et al. , IEEE Transactions on Pattern Analysisand Machine Intelligence (2016)[3] L. Breiman, Machine Learning (2001)[4] Y. Bengio and Samy Bengio, Advances in Neural In-formation Processing Systems 12 (200)[5] Y. Freud and R. E. Schapire, Machine Learning: Pro-ceedings of the Thirteenth International Conference(1996) [6] L. Breiman, Machine Learning (2001)[7] A. N. Tikhonov, Dokl. Akad. Nauk SSSR (1963)[8] N. Milke et al. , NIM A (2013)[9] M.G. Aartsen et al. , Astropart. Phys. (2016)[10] A. Fedynitch et al. , Phys. Rev. D (2012)[11] T.K. Gaisser, Astropart. Phys. (2012)[12] R. Enberg et al. , Phys. Rev. D (2008)[13] A. Fedynitch et al. , PoS(ICRC2015) (2015).(2015).