CNN-based event classification for alpha-decay events in nuclear emulsion
J. Yoshida a,b,∗, H. Ekawa b, A. Kasagi b,c, M. Nakagawa b, K. Nakazawa c, N. Saito b, T.R. Saito b,d,e, M. Taki f, M. Yoshimoto c
a Physics Department, Tohoku University, Aramaki, Aoba-ku, Sendai 980-8578, Japan
b High Energy Nuclear Physics Laboratory, Cluster for Pioneering Research, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
c Physics Department, Gifu University, 1-1 Yanagido, Gifu 501-1193, Japan
d GSI Helmholtz Centre for Heavy Ion Research, Planckstrasse 1, D-64291 Darmstadt, Germany
e School of Nuclear Science and Technology, Lanzhou University, 222 South Tianshui Road, Lanzhou, Gansu Province, 730000, China
f Graduate School of Artificial Intelligence and Science, Rikkyo University, 3-34-1 Nishi-Ikebukuro, Toshima-ku, Tokyo 171-8501, Japan
Abstract
We developed an efficient classifier that separates alpha-decay events from various vertex-like objects in nuclear emulsion using a convolutional neural network (CNN). Alpha-decay events in the emulsion are standard calibration sources for the relation between the track length and kinetic energy in each emulsion sheet. We trained the CNN using 15,885 images of vertex-like objects including 906 alpha-decay events and tested it using a dataset of 46,948 images including 255 alpha-decay events. By tuning the hyperparameters of the CNN, the trained models achieved an Average Precision Score of 0.740 ± [...] 1/7 compared to the estimated load of the former method without a CNN.
Keywords: Machine learning, CNN, Nuclear emulsion, Alpha-decay, Double hypernucleus
∗ Corresponding author. Email address: [email protected] (J. Yoshida)
Preprint submitted to Journal of LaTeX Templates, September 15, 2020 (arXiv, nucl-ex)
1. Introduction
Nuclear emulsion is one of the detectors used for visualising the tracks of charged particles, and it offers the highest spatial resolution, at the micrometre scale or better [1]. Such an excellent spatial resolution has provided numerous opportunities in fundamental studies [2, 3, 4, 5, 6] and applications [7, 8] during the past 80 years.
One of the recent areas to employ the emulsion is the experimental investigation into double hypernuclei, which are baryonic bound states with two strange quarks [9, 10]. Studies on double hypernuclei have extended our understanding of the nuclear force to the general baryon-baryon interaction under the flavoured-SU(3) symmetry. Double hypernuclei are produced through the capture of a Ξ− hyperon, which has two strange quarks, in a nucleus. In the double-strangeness system produced, a conversion process of Ξ−p → ΛΛ takes place, followed by the decay. Because of the small Q-value of the conversion process, i.e., approximately 28 MeV, particles and fragments from the production and decay of the double hypernuclei have small kinetic energy, and the track length of these particles is extremely short, i.e., typically on the order of 10 micrometres in a solid material. When a Ξ− hyperon is stopped in the emulsion, visual information of the tracks can be recorded simultaneously with the production and decay of the double hypernuclei. Through visual analyses of the length and boldness of the recorded tracks, particles and fragments are identified and their kinetic energy can be deduced. Therefore, the produced double hypernucleus can be identified and its mass value can be obtained even with only one event observed in the emulsion.
Experimental studies on double hypernuclei using the emulsion have made significant progress during the last decade. The most recent experiment for studying double hypernuclei was carried out at J-PARC as the E07 experiment [10]. The basic design of the experiment was based on the hybrid emulsion method.
In this method, only a small area of the emulsion sheet is scanned, and the area is defined by tracking information of the Ξ− particle measured by the other precise detectors in front of the emulsion. This drastically reduces the load and time for the analysis. Owing to the high beam intensity, large solid angle of the detectors, and modern techniques used in the emulsion analysis [11], ten times more double hypernuclear events are expected to be detected than were observed in the former experiments [9]. The experimental data are being analysed and several nuclei, including ΛΛBe, have been identified thus far [12]. However, the performance of the hybrid method has yet to be perfected, and only one-third of the expected candidates have been observed.
An exhaustive search method as an alternative to the hybrid method is also being developed [13], and the first Ξ hypernucleus candidate was observed [14, 15]. With this method, called "overall scanning" herein, the entire volume of the irradiated emulsion is scanned, and therefore the method is capable of detecting events of a non-triggered double hypernuclear production. With the overall scanning of emulsion sheets irradiated during the J-PARC E07 experiment, the detection of approximately 10 events related to the production and decay of double hypernuclei is expected. Although the load for the analysis and the amount of scanned data will be drastically increased with the existing overall scanning technique, the technique is expected to take on the main role in studies on double hypernuclei and could replace the presently used hybrid method.
For the overall scanning approach, we previously developed a scanning system called "Vertex Picker" [13]. This system takes exhaustive micrographs of thick emulsion sheets and detects vertex-like objects with three or more tracks originating from a single point.
However, this method detects not only the vertex candidates of hypernuclei but also a large number of other vertex-like objects (e.g., an alpha-decay event and a beam-nucleus interaction) and non-vertex objects (e.g., a cross of unrelated tracks and a black spot similar to dust), as shown in Figure 1. The developed system improves the speed of the vertex search by a factor of approximately 20; however, the speed must be further improved for the overall scanning technique. Moreover, the ratio of the detected vertex events of interest to the other detected events is far from satisfactory.
Figure 1: Images of typical objects detected using Vertex Picker. (a) An alpha-decay event of a thorium series, (b) interaction of a hadron beam particle and a nucleus in an emulsion layer, (c) cross of unrelated tracks, (d) a black spot similar to dust in an emulsion layer. The sizes of these images are 224 pixels × 224 pixels and 87 × [...] µm at the object.
To achieve further improvements, we developed a new technique for detecting vertex-like objects in the emulsion by employing a convolutional neural network (CNN). Image classification using a CNN has made remarkable progress in recent years and has reached a level comparable to that of human visual classification [16, 17].
Table 1: Summary of dataset and numbers of images.
Dataset name          Alpha-decays   Others   Total
Training (TRAIN)      906            14979    15885
Validation (VALID)    214            3814     4028
Test (TEST)           255            46693    46948
Our ultimate goal is to develop a fast method for detecting candidates of vertices related to double hypernuclei with a large detection efficiency and an excellent signal-to-background ratio. However, there are insufficient amounts of training and validation data on double hypernuclei for the development of a method using a CNN. Therefore, as the first step, we attempted to develop a method using a CNN for detecting alpha-decay events in the emulsion. Alpha-decay events are traces of a spontaneous chain decay of long-lived radioisotopes such as uranium and thorium in the emulsion. Recorded images of alpha-decay events of the uranium and thorium series are characterised as consisting of four and five bold tracks of approximately 25-50 µm in length, respectively. Those alpha-decay events and associated tracks are extremely important because we use them for calibrating the relation between the track length and kinetic energy in each emulsion sheet. The present study employs a sufficient number of alpha-decay events selected by Vertex Picker and by the human eye during the development using a CNN.
Furthermore, this study attempts to transition the emulsion techniques from a mature approach to a state-of-the-art technology by introducing a CNN, and the present work is a foundation for further developments towards studies of double hypernuclei with the emulsion.
2. Event classification using CNN
For the development of an event classifier for alpha-decay events using a CNN, we prepared three datasets for training (TRAIN), validation (VALID), and testing (TEST). Prior to the development, we already had images selected from six of the 50 mm × 50 mm × [...]
As discussed in the previous sub-section, we have to find an (N, M) parameter combination that achieves the optimal CNN performance in order to efficiently separate alpha-decay events from other events. To evaluate the performance of the classifiers, we use the area under the Precision-Recall Curve, also known as the "Average Precision Score" [27]. A Precision-Recall Curve is widely used to visualise the performance of a binary classifier, particularly for an imbalanced dataset like ours. The curve consists of pairs of two quantities, widely referred to as precision and recall, evaluated at different threshold values used to discriminate positive and negative samples based on the output values of a CNN. The precision corresponds to the purity of the classified samples, whereas the recall represents the selection efficiency of the classifier.
Figure 2: Evolution of the losses for the TRAIN and VALID dataset for four iterations of the training process at (N = 2, M = 24), as an example. The best epoch for each of the four trials was 36, 54, 55, and 65, respectively.
The procedure for selecting the best (N, M) is as follows. Initially, we conduct the training using a specified (N, M) until the minimum validation loss is observed, and we define the best model as the one having the minimum validation loss. After the training, we evaluate the Average Precision Score for the VALID dataset. We apply this process four times with the same (N, M) and different random seeds to check the reproducibility. We iterate this process with various pairings of (N, M), and the pair for which the mean of the four Average Precision Scores is at its maximum is taken as the best. Finally, to evaluate the averaged performance, we apply the four best models individually to the TEST dataset with the chosen (N, M).
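The Average Precision Score and the selection of the best (N, M) described above can be illustrated with a minimal sketch. It assumes the common step-wise definition of the area under the Precision-Recall Curve (the sum of precision values weighted by the increase in recall as the threshold is lowered); the text does not state which implementation was used. The function names `average_precision` and `select_best_pair`, and the toy labels and scores, are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    labels: 1 for an alpha-decay image, 0 for any other object.
    scores: classifier outputs; a higher score means "more alpha-like".
    """
    n_positive = sum(labels)
    if n_positive == 0:
        raise ValueError("at least one positive sample is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    ap = 0.0
    true_pos = 0
    n_selected = 0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold to the next distinct score and select every
        # sample at that score (tied scores are handled together).
        while i < len(ranked) and ranked[i][0] == threshold:
            true_pos += ranked[i][1]
            n_selected += 1
            i += 1
        precision = true_pos / n_selected   # purity of the selected sample
        recall = true_pos / n_positive      # selection efficiency
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def select_best_pair(ap_by_pair):
    """Return the (N, M) key whose repeated-training scores have the highest mean."""
    return max(ap_by_pair, key=lambda pair: mean(ap_by_pair[pair]))


# Toy example with made-up values (not data from the paper).
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
toy_ap = average_precision(toy_labels, toy_scores)
```

In this sketch, `select_best_pair` would be called with a dictionary that maps each (N, M) pair to the four Average Precision Scores obtained with different random seeds on the VALID dataset.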
3. Results of the trained CNN model
We conducted a grid search of the hyperparameters of RandAugment, i.e., N and M, for N = {2, 4, 6, 8} and M = {6, 12, 18, 24, 30}. For each (N, M) pair, we searched for the best model at the specific epoch providing the minimum loss for the VALID dataset. Figure 2 shows the evolution of losses for four iterations of the training process at (N = 2, M = 24), as an example. The blue and grey lines represent the loss values of the TRAIN and VALID datasets, respectively, for the four trials. The loss values of the TRAIN dataset decrease as the training progresses; however, the validation loss begins to increase gradually at a certain point. As described in the previous section, we define the best model as having the minimum validation loss. To find the epoch number providing the minimum validation loss, we smoothed the validation loss values with a method called the exponentially weighted moving average, which is formulated through the following recurrence formula:
$$S_0 = V_0, \qquad S_i = w S_{i-1} + (1.0 - w) V_i, \tag{1}$$
where $S_i$ and $V_i$ are the $i$-th smoothed value and the original value, respectively, and $w$ is a weight between 0 and 1 that specifies the degree of smoothing. In the present study, we set the weight to 0.9 and stopped training when the smoothed loss exceeded 115% of the minimum.
Through the grid search among N = {2, 4, 6, 8} and M = {6, 12, 18, 24, 30}, we obtained the mean and standard deviation of the four Average Precision Scores for the VALID dataset, as shown in Figure 3. The best combination is achieved with (N = 2, M = 24), and the best score obtained is 0.980 ± [...]
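The smoothing of Eq. (1) and the stopping rule can be sketched as follows. This is only an illustration under stated assumptions: the smoothed series starts from the first validation loss, epochs are counted from zero, and the best epoch is taken as the minimum of the smoothed (not the raw) validation loss; the text does not specify these details. The function names `ewma` and `best_epoch_and_stop` are hypothetical.

```python
def ewma(values, w=0.9):
    """Exponentially weighted moving average, S_i = w * S_{i-1} + (1 - w) * V_i."""
    smoothed = []
    for i, value in enumerate(values):
        if i == 0:
            smoothed.append(value)  # assumed initial condition: S_0 = V_0
        else:
            smoothed.append(w * smoothed[-1] + (1.0 - w) * value)
    return smoothed


def best_epoch_and_stop(val_losses, w=0.9, tolerance=1.15):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    best_epoch: index of the running minimum of the smoothed loss.
    stop_epoch: first index at which the smoothed loss exceeds
                tolerance * minimum, or None if that never happens.
    """
    smoothed = ewma(val_losses, w)
    best_epoch = 0
    for epoch, loss in enumerate(smoothed):
        if loss < smoothed[best_epoch]:
            best_epoch = epoch
        elif loss > tolerance * smoothed[best_epoch]:
            return best_epoch, epoch
    return best_epoch, None
```

With `w=0.9` each smoothed value keeps 90% of the previous smoothed value, so single-epoch fluctuations of the validation loss are damped before the 115% criterion is applied.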
4. Comparison with former method without a CNN
The results of the classification of the alpha-decay events using the developed CNN model were compared to those of a former method without a CNN. In the former method, vertices and tracks are reconstructed in a three-dimensional space by combining the full information on the vertices and associated tracks from Vertex Picker [13]. It should be noted that the method with the CNN developed in this study employs only cropped images from Vertex Picker. With the former method without a CNN, candidates of alpha-decay events are sorted using the information of the track multiplicity from the associated vertex and the track length. The former method without a CNN sorted 2489 alpha-decay candidates from the total number of 46,948 events, including 201 true alpha-decay events. Thus, the precision and recall are 0.081 ± [...] (201/2489) and 0.788 ± [...] (201/255), respectively, [...] √N.
The performance of the developed CNN model was evaluated by comparing it to the result of the former method without a CNN, and a comparison at the same recall value of 0.788 is summarised in Table 2. The developed method using the CNN selected 366 ± 18 alpha-decay candidates including 201 true alpha-decay events. Therefore, the precision of the developed model with the CNN was obtained as 0.547 ± [...]
Figure 3: The Average Precision Score for the VALID dataset for various pairings (N, M) for RandAugment. The best value is 0.980 ± [...]
Figure 4: Distribution of the output values of one of the best CNN models at (N = 2, M = 24) for alpha-decay and other events in the TEST dataset.
Table 2: Comparison of performances between our former method without a CNN (w/o CNN) and the developed CNN method (w/ CNN) at similar recall. The CNN method improved the precision over that of the other approach by a factor of 6.8 ± [...]
Method    Precision       Recall          Number of candidates
w/o CNN   0.081 ± [...]   0.788 ± [...]   2489
w/ CNN    0.547 ± [...]   0.788           366 ± 18
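The central values quoted in this section follow from the candidate counts given in the text, as the short calculation below shows. It is a sketch only: the helper name `precision_recall` is hypothetical, and the uncertainties are not reproduced because their values are not available here.

```python
def precision_recall(true_positives, n_selected, n_actual_positives):
    """Precision (purity) and recall (efficiency) from event counts."""
    precision = true_positives / n_selected
    recall = true_positives / n_actual_positives
    return precision, recall


N_TRUE_ALPHA_IN_TEST = 255   # alpha-decay events in the TEST dataset
N_TRUE_SELECTED = 201        # true alpha-decay events among the candidates

# Former method without a CNN: 2489 candidates.
former_precision, former_recall = precision_recall(
    N_TRUE_SELECTED, 2489, N_TRUE_ALPHA_IN_TEST)

# CNN method at the same recall: 366 candidates (central value).
cnn_precision, cnn_recall = precision_recall(
    N_TRUE_SELECTED, 366, N_TRUE_ALPHA_IN_TEST)

# Ratio of candidates that must be visually inspected.
candidate_ratio = 2489 / 366
```

Using the central candidate count of 366 gives a precision of about 0.549, close to the quoted 0.547; the small difference presumably reflects averaging over the four trained models. The candidate ratio of about 6.8 corresponds to the reduction of the visual-inspection load to roughly 1/7.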
5. Summary
We developed an event classifier with a CNN for alpha-decay events recorded as micrographs of vertex-like objects in nuclear emulsion. The developed CNN models were efficiently trained to discriminate between images of alpha-decay events and other vertex-like objects by employing random augmentation and oversampling. The Average Precision Score for the TEST dataset, which is a metric of the classification performance, was determined to be 0.740 ± [...] the visual inspection will be reduced to approximately 1/7 while maintaining a similar efficiency by introducing the developed CNN method in comparison to the former method without a CNN. The technique described in the present paper will be a foundation for further development to discover a number of double hypernuclei through the overall scanning of the emulsion sheets of the J-PARC E07 experiment.
Figure 5: Precision-Recall curves at (N = 2, M = 24). This curve consists of pairs of precision and recall at different discriminant threshold values for the output of the CNN.
Acknowledgement
This work was supported by JSPS KAKENHI Grant Numbers 16H02180, 20H00155, and 19H05147 (Grant-in-Aid for Scientific Research on Innovative Areas 6005). We thank the J-PARC E07 collaboration for providing the emulsion sheets. We also thank Prof. H. Tamura at Tohoku University for the fruitful discussions.
References
[1] W. H. Barkas, Pure & Applied Physics series, I II, Academic Press, 1963.
[2] C. M. G. Lattes, G. P. S. Occhialini, C. F. Powell, Observations on the tracks of slow mesons in photographic emulsions, Nature 160 (4066) (1947) 453–456. doi:10.1038/160453a0.
[3] K. Niu, E. Mikumo, Y. Maeda, A Possible Decay in Flight of a New Type Particle, Progress of Theoretical Physics 46 (5) (1971) 1644–1646. doi:10.1143/PTP.46.1644.
[4] S. Aoki, et al., Direct Observation of Sequential Weak Decay of a Double Hypernucleus, Progress of Theoretical Physics 85 (6) (1991) 1287–1298. doi:10.1143/PTP.85.1287.
[5] K. Kodama, et al., Observation of tau neutrino interactions, Physics Letters B 504 (3) (2001) 218–224. doi:10.1016/S0370-2693(01)00307-0.
[6] OPERA Collaboration, Observation of tau neutrino appearance in the CNGS beam with the OPERA experiment, Progress of Theoretical and Experimental Physics 2014 (10), 101C01. doi:10.1093/ptep/ptu132.
[7] H. K. Tanaka, et al., High resolution imaging in the inhomogeneous crust with cosmic-ray muon radiography: The density structure below the volcanic crater floor of Mt. Asama, Japan, Earth and Planetary Science Letters 263 (1) (2007) 104–113. doi:10.1016/j.epsl.2007.09.001.
[8] K. Morishima, et al., Discovery of a big void in Khufu's Pyramid by observation of cosmic-ray muons, Nature 552 (7685) (2017) 386–390. doi:10.1038/nature24647.
[9] J. K. Ahn, et al., Double-Λ hypernuclei observed in a hybrid emulsion experiment, Phys. Rev. C 88 (2013) 014003. doi:10.1103/PhysRevC.88.014003.
[10] K. Imai, K. Nakazawa, H. Tamura, J-PARC E07 experiment. Systematic study of double-strangeness system with an emulsion-counter hybrid method, http://j-parc.jp/researcher/Hadron/en/pac_0606/pdf/p07-Nakazawa.pdf.
[11] M. K. Soe, et al., Automatic track following system to study double strangeness nuclei in nuclear emulsion exposed to the observable limit, Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 848 (2017) 66–72. doi:10.1016/j.nima.2016.12.046.
[12] H. Ekawa, et al., Observation of a Be double-Lambda hypernucleus in the J-PARC E07 experiment, Progress of Theoretical and Experimental Physics 2019 (2), 021D02. doi:10.1093/ptep/pty149.
[13] [...] doi:10.1016/j.nima.2016.11.044.
[14] K. Nakazawa, et al., The first evidence of a deeply bound state of Xi–14N system, Progress of Theoretical and Experimental Physics 2015 (3), 033D02. doi:10.1093/ptep/ptv008.
[15] E. Hiyama, K. Nakazawa, Structure of S=-2 hypernuclei and hyperon-hyperon interactions, Annual Review of Nuclear and Particle Science 68 (1) (2018) 131–159. doi:10.1146/annurev-nucl-101917-021108.
[16] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436–444. doi:10.1038/nature14539.
[17] K. He, et al., Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, CoRR abs/1502.01852. arXiv:1502.01852.
[18] K. He, et al., Deep residual learning for image recognition, CoRR abs/1512.03385. arXiv:1512.03385.
[19]–[21] [...] arXiv:1710.05381.
[22] https://github.com/ufoym/imbalanced-dataset-sampler.
[23] E. D. Cubuk, et al., RandAugment: Practical automated data augmentation with a reduced search space (2019). arXiv:1909.13719.
[24] https://github.com/ildoonet/pytorch-randaugment.
[25] J. Deng, et al., ImageNet: A large-scale hierarchical image database, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
[26] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization (2014). arXiv:1412.6980.
[27] J. Davis, M. Goadrich, The relationship between precision-recall and ROC curves, in: Proceedings of the 23rd International Conference on Machine Learning, ICML '06, Association for Computing Machinery, New York, NY, USA, 2006, pp. 233–240. doi:10.1145/1143844.1143874.