Graph neural network for 3D classification of ambiguities and optical crosstalk in scintillator-based neutrino detectors
Saúl Alonso-Monsalve, Dana Douqa, César Jesús-Valls, Thorsten Lux, Sebastian Pina-Otey, Federico Sánchez, Davide Sgalaberna, Leigh H. Whitehead
CERN, The European Organization for Nuclear Research, 1211 Meyrin, Switzerland
Universidad Carlos III de Madrid, Av. de la Universidad, 30, 28911 Madrid, Spain
University of Geneva, Section de Physique, DPNC, 1205 Genève, Switzerland
IFAE, Institut de Física d'Altes Energies, Carrer de Can Magrans, 08193 Barcelona, Spain
Aplicaciones en Informática Avanzada (AIA), 08172 Sant Cugat del Vallès, Barcelona, Spain
ETH Zurich, Institute for Particle Physics and Astrophysics, CH-8093 Zurich, Switzerland
Cavendish Laboratory, University of Cambridge, Cambridge, CB3 0HE, United Kingdom
Deep learning tools are being used extensively in high energy physics and are becoming central in the reconstruction of neutrino interactions in particle detectors. In this work, we report on the performance of a graph neural network in assisting with particle-flow event reconstruction. The three-dimensional reconstruction of particle tracks produced in neutrino interactions can be subject to ambiguities due to high-multiplicity signatures in the detector or leakage of signal between neighboring active detector volumes. Graph neural networks potentially have the capability of identifying all these features to boost the reconstruction performance. As an example case study, we tested a graph neural network, inspired by the GraphSAGE algorithm, on a novel 3D-granular plastic-scintillator detector that will be used to upgrade the near detector of the T2K experiment. The developed neural network has been trained and tested on diverse neutrino interaction samples, showing very promising results: the classification of particle track voxels produced in the detector can be done with efficiencies and purities of 94-96% per event, and most of the ambiguities can be identified and rejected, while being robust against systematic effects.
I. INTRODUCTION
Since 1999, a series of neutrino oscillation experiments have provided deep insight into the nature of neutrinos [1-8]. A number of these are long-baseline neutrino oscillation experiments that use two detectors to characterize a beam of (anti-)neutrinos: a near detector, located a few hundred meters from the target, that measures the original beam composition, and a far detector, located several hundred kilometres away, that allows for the determination of the beam composition after neutrino flavor oscillations.

The energy of these beam neutrinos ranges from a few hundred MeV up to several GeV. Charged particles can be produced in neutrino interactions, and the energy that they deposit as they traverse the detector can be used to reconstruct the events. In general, the larger the energy transferred from the neutrino to the nucleus, the larger the number of particles and particle types produced in the final state. Modeling nuclear interactions in the target nuclei is highly complex, particularly for high energy transfers, where the hadronic component of the interaction is more important. As a result, current long-baseline neutrino oscillation experiments mostly analyze interactions with low particle multiplicity. This situation, however, is expected to change in the coming years. On the one hand, the statistical and systematic uncertainties of current experiments have decreased significantly over recent years, such that neutrino-nucleus modeling is becoming a dominant source of uncertainty [8, 9].
On the other hand, future experiments like DUNE [10] will use a broad-band energy neutrino beam, expecting a significant fraction of the neutrino interactions to have a high energy transfer to the nucleus.

As a result, in recent years the neutrino physics community has turned its attention to measuring neutrino-nucleus interaction cross-sections for different ranges of energies and target materials [11] as a way to constrain the oscillation uncertainties while providing new measurements to further develop the interaction models. In parallel, a new generation of neutrino detectors is under development that aims to resolve and reliably identify short particle tracks even in very complex interactions. To achieve this, two main detector technologies stand out: one is based on Liquid Argon Time Projection Chambers (LArTPCs) [12], and the other is based on finely segmented plastic scintillators with three readout views [13] that will form part of the near detectors for T2K [14] and, possibly, DUNE [15].

For the latter, the detector response to a charged particle is read out into three orthogonal 2D projections. When reconstructing the 3D neutrino event, different types of hits are rebuilt, introducing non-physical entities that can hinder the reconstruction process. Due to the spatial disposition of such hits, an approach utilizing Graph Neural Networks (GNNs) [16] is proposed to perform the classification of 3D hits and provide clean tracks for event reconstruction.

The article proceeds in the following way: Sec. II describes the motivation behind the methodology, given the details of the detector technology. Section III introduces deep learning techniques and explains the specific GNN algorithm used. The simulated data samples and GNN training are discussed in Sec. IV. Results and a study of systematic uncertainties are given in Secs. V and VI, respectively, followed by concluding remarks in Sec. VII.

II. MOTIVATION
A finely segmented scintillator detector consists of a 3D matrix of plastic scintillator cubes. The scintillation light produced by charged particles traversing the cubes is read out by three orthogonal wavelength-shifting (WLS) fibers that transport the scintillation light out of the detector, where silicon photomultipliers (SiPMs) convert it into a certain number of photoelectrons (p.e.), as illustrated in Figs. 1 and 2.

FIG. 1: Geometry of a single SuperFGD element.

Here, we consider the Super Fine-Grained Detector (SuperFGD) [14] as a specific case study. The detector contains 192 × 184 × 56 plastic scintillator cubes, each 1 × 1 × 1 cm³ in size, and provides three orthogonal 2D projections of particle tracks produced by a neutrino interaction, as depicted in Fig. 4a.

To reconstruct neutrino interactions in three dimensions, the light-yield measurements in the three 2D views are matched together, as shown in Fig. 4b. The 3D objects, corresponding to the cubes where the energy deposition is reconstructed, are referred to as voxels. In addition to the cubes where a particle has passed and deposited energy, light leakage between neighboring cubes can create additional crosstalk signals [17, 18], as depicted in Fig. 2. Moreover, ambiguities in the matching process can give rise to ghost voxels, shown in Fig. 3.

To accurately reconstruct neutrino interactions in these detectors, it is crucial to be able to classify each voxel as one of three types:

• Track: a real energy deposit from a charged particle, henceforth referred to as a track signal.
• Crosstalk: a real energy deposit from light leakage between neighboring cubes.
• Ghost: a fake signal arising from ambiguities when matching the three 2D views into 3D.

Figure 4c shows the three types of voxels using truth information after 3D matching has been performed for an example neutrino interaction. Once these voxels are classified, the ghost voxels can be removed before the full event reconstruction proceeds, while simultaneously cleaning the particle tracks of crosstalk.

FIG. 2: Sketch of the signal generation, fiber transport and signal detection processes, highlighting the production of optical crosstalk signals.
FIG. 3: Example of the appearance of a ghost voxel. When matching the coordinates of the fibers that recorded energy depositions, a voxel may appear at (x, y, z) where no true signal exists if real track or crosstalk voxels are present at (x, y', z), (x, y, z'), and (x', y, z); such non-physical voxels are called ghost voxels.

In this article, we represent the voxels as nodes in a graph and classify the signals using a deep learning technique based on a GNN. The abstract data representation provided by graphs makes this method very versatile and applicable to any experiment where the output data from the detector elements can be represented as a list of features with arbitrary dimensionality.
(a) Projections of the observed neutrino interaction onto the three 2D detector views (XY, XZ, and YZ).

(b) 3D view of the neutrino interaction after the 3D matching of the three 2D views. The 3D voxels are shown as dark points; the projections onto the three 2D detector views (XY, XZ, and YZ) are shown as shadows.

(c) 3D view of the neutrino interaction after the 3D matching of the three 2D views. The 3D voxels are labelled as track (red), crosstalk (blue) and ghost (yellow) according to the truth information from the simulation; the projections onto the three 2D detector views (XY, XZ, and YZ) are shown as shadows.

FIG. 4: Visualization of a neutrino interaction in a finely segmented 3D scintillator detector, demonstrating the relationship between the observed 2D projections onto the three orthogonal views, the reconstructed 3D voxels, and the true classification of the voxels.
III. BACKGROUND

A. Deep learning
Deep learning techniques are now commonly applied within the field of neutrino physics. In particular, Convolutional Neural Network (CNN) [19] algorithms that operate on two-dimensional images of neutrino interactions have been very successful in a number of tasks, such as event classification [10, 20-22] and pixel-level identification of track-like (linear) and shower-like (locally dense) energy deposits [23, 24]. However, images of neutrino interactions are typically very sparse, as only those readout channels with a detected signal give rise to pixels with non-zero values; for the detector presented in Sec. II, the average occupancy for a neutrino interaction is less than 0.02%. Thus, much of the computation time is spent unnecessarily applying convolutions to a large number of pixels with zero values.

The goal of this work is to classify 3D voxels as one of three categories (track, crosstalk or ghost), which is a natively three-dimensional problem. Applying a 3D CNN-based algorithm to this detector would require two million voxels to avoid any downsampling or cropping of the input data, which is computationally prohibitive. We instead investigate a sparse data representation, where voxels are represented as nodes in a graph. In computer science, a graph G is a data structure that represents a mathematical concept consisting of vertices V (also known as nodes) and edges E:

G = (V, E).    (1)

A graph can be directed, where each edge has a starting and an ending vertex that define a direction, or undirected, where an edge simply connects two nodes without inducing a sense of direction. In our case, we use an undirected graph, since we are only interested in the spatial connections between vertices.
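As a concrete sketch of Eq. (1), an undirected graph can be stored as node features plus a set of unordered vertex pairs, so that adding an edge in either direction yields the same structure. This is a minimal illustration; the class and method names are ours, not from the paper's code.

```python
class UndirectedGraph:
    """Minimal G = (V, E): nodes carry feature lists, edges are unordered pairs."""

    def __init__(self):
        self.features = {}   # node id -> feature vector
        self.edges = set()   # one frozenset({u, v}) per undirected edge

    def add_node(self, v, feats):
        self.features[v] = feats

    def add_edge(self, u, v):
        # undirected: {u, v} == {v, u}, so no direction is induced
        self.edges.add(frozenset((u, v)))

    def neighbors(self, v):
        return {u for e in self.edges for u in e if v in e and u != v}

g = UndirectedGraph()
g.add_node(0, [12.5]); g.add_node(1, [4.0]); g.add_node(2, [7.5])
g.add_edge(0, 1); g.add_edge(1, 0); g.add_edge(1, 2)   # duplicate edge collapses
print(len(g.edges), sorted(g.neighbors(1)))  # 2 [0, 2]
```

Storing only occupied cubes in such a structure is what makes the representation sparse: memory grows with the number of hits, not with the full detector volume.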
Figure 5 shows a comparison of the 3D CNN and graph data structures, as well as the radial search method used for defining edges between nodes. As mentioned above, each detector cube is represented as a node in a graph, and each node consists of a list of input variables, called features, that describe the physical properties of the detected signal (see Sec. IV and Appendix A). The deep learning algorithm that operates on graphs is the Graph Neural Network (GNN) [16, 25]. GNNs are used in many different fields [26, 27] and can be applied for graph classification [28, 29] or node classification [30-32]. In this article, a GNN inspired by the GraphSAGE algorithm [32] is used to classify individual voxels in SuperFGD events. The application of GNNs to data from neutrino experiments has recently been demonstrated by the IceCube experiment to identify entire events as atmospheric neutrino interactions, outperforming a 3D CNN [33]. Other GNN-based studies have been performed for particle reconstruction in high energy physics detectors [34-36]. To the best of our knowledge, the approach we present in this paper is the first attempt to use GNNs for node classification in neutrino experiments.
FIG. 5: Data and computation size comparison between a 3D image (H × L × W pixels) and a graph (K nodes + N edges). The size of the 3D image on the left is fixed (H × L × W) regardless of the number of hits, as CNNs require fixed image sizes. The connected graph shown on the right is a much more efficient representation of the data: each hit is represented as a graph node, and connections, called edges, are made between neighboring hits within a sphere of radius r.

B. GraphSAGE
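The radial edge-building rule shown in Fig. 5 can be sketched as follows. This is a brute-force O(n²) version with our own names; at full event sizes a spatial index such as a k-d tree would replace the pair scan.

```python
import itertools
import math

def build_edges(voxels, radius=1.75):
    """Connect every pair of voxels whose centers lie within `radius` (cm).

    `voxels` is a list of (x, y, z) cube centers; returns an undirected
    edge list of index pairs.  Illustrative brute-force scan only.
    """
    edges = []
    for i, j in itertools.combinations(range(len(voxels)), 2):
        if math.dist(voxels[i], voxels[j]) <= radius:
            edges.append((i, j))
    return edges

# Three collinear voxels 1 cm apart: only adjacent pairs are linked,
# since the end-to-end distance (2 cm) exceeds the 1.75 cm radius.
print(build_edges([(0, 0, 0), (1, 0, 0), (2, 0, 0)]))  # [(0, 1), (1, 2)]
```

With 1 cm cubes, a 1.75 cm radius also links diagonal neighbors (√3 ≈ 1.73 cm), which is the choice made in Sec. IV B.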
GraphSAGE [32] is a technique that leverages the features of graph nodes V (which can range from physical information to text attributes) to generate efficient representations of previously unseen samples by learning aggregator functions from training nodes. These aggregators can be simple functions (e.g., mean or maximum) or more complex ones, such as LSTM cells [37], and must be functions that take an arbitrary number of inputs without any given order. The model learns not only K aggregator functions that combine information from neighboring nodes, but also a set of weight matrices W^k, ∀k ∈ {1, ..., K}, which are used to propagate information through the K layers of the model and to combine the local information of a node with the aggregated information of its neighbors into an encoding vector (see Algorithm 1). The number of aggregator functions also defines the depth of the model, meaning that a GraphSAGE model has a depth of K. Once trained, the model can produce the embedding of a new node given its input features and neighborhood; this embedding is then used as the input of a multilayer perceptron (MLP) [38] that is responsible for predicting the label.

Since GraphSAGE learns from node features, it allows us to decide which physical information to use for each voxel. This means that the model can follow the particle flow, i.e., predict the label for each voxel based on the physical attributes of the target voxel as well as the features of its neighbors.

Algorithm 1: GraphSAGE embedding generation (i.e., forward propagation) algorithm (from [32])

Input: graph G(V, E); input features {x_v, ∀v ∈ V}; depth K; weight matrices W^k, ∀k ∈ {1, ..., K}; non-linearity σ; differentiable aggregator functions AGGREGATE_k, ∀k ∈ {1, ..., K}; neighborhood function N: v → 2^V
Output: vector representations z_v for all v ∈ V

h_v^0 ← x_v, ∀v ∈ V
for k = 1 ... K do
    for v ∈ V do
        h_{N(v)}^k ← AGGREGATE_k({h_u^{k−1}, ∀u ∈ N(v)})
        h_v^k ← σ(W^k · CONCAT(h_v^{k−1}, h_{N(v)}^k))
    end
    h_v^k ← h_v^k / ‖h_v^k‖₂, ∀v ∈ V
end
z_v ← h_v^K, ∀v ∈ V

IV. METHODOLOGY

A. Data sample generation
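As an illustration of Algorithm 1, the forward pass with a mean aggregator can be sketched in a few lines of NumPy. This is a toy version with our own function and variable names; ReLU stands in for the unspecified non-linearity σ, and every node is assumed to have at least one neighbor (isolated voxels are discarded, as noted in Sec. IV B).

```python
import numpy as np

def graphsage_embed(x, neighbors, weights):
    """Toy GraphSAGE forward pass (Algorithm 1) with a mean aggregator.

    x:         (N, F) array of node features h^0
    neighbors: neighbors[v] is the index list N(v); assumed non-empty
    weights:   K matrices; weights[k] has shape (F_{k+1}, 2 * F_k)
    Returns the final embeddings z_v, one row per node.
    """
    h = x.astype(float)
    for W in weights:                                      # k = 1 ... K
        h_new = np.empty((h.shape[0], W.shape[0]))
        for v in range(h.shape[0]):
            h_nv = h[neighbors[v]].mean(axis=0)            # AGGREGATE_k
            h_new[v] = np.maximum(W @ np.concatenate([h[v], h_nv]), 0.0)
        norms = np.linalg.norm(h_new, axis=1, keepdims=True)
        h = h_new / np.where(norms > 0.0, norms, 1.0)      # normalize h^k
    return h                                               # z_v = h^K

rng = np.random.default_rng(0)
z = graphsage_embed(rng.normal(size=(4, 3)),               # 4 nodes, 3 features
                    [[1], [0, 2], [1, 3], [2]],            # small chain graph
                    [rng.normal(size=(5, 6)), rng.normal(size=(5, 10))])
```

Each row of `z` is the encoding that the MLP head then classifies; in the paper's configuration the depth is K = 3 and the encoding size is 128.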
In order to generate data sets of neutrino interactions with true labels that allow us to train and benchmark the classification algorithm, the steps below are followed for each neutrino interaction:

1. Initial particle types and initial kinematics are specified for all final-state particles produced in the interaction.
2. Initial particles are propagated through the detector geometry, producing further particles and leaving signals in the form of energy deposits.
3. Using the particle energy deposits, the detector response is simulated.
4. The information is stored as a list of voxels with known true labels.
Initial particle types and kinematics

The initial particle types and their associated kinematics were simulated following two approaches. First, GENIE datasets were created using the GENIE-G18.10b neutrino interaction software [39]. For a given neutrino flux [i] and target geometry specification, it generates a list of realistic neutrino event interactions, both in the number and type of outgoing particles (often referred to as event topologies) and in their individual initial kinematics. Second, particle-gun (Pgun, 1-track) and particle-bomb (P-Bomb, multi-track) datasets were constructed to be as complementary as possible to the GENIE datasets by minimizing the modeling dependency. The number of initial particles and their types were fixed in each of these datasets, and their input kinematics were chosen randomly and uniformly in the range 10-1000 MeV/c. A summary of the number of events and voxels in the two datasets, as well as of the class distributions, is presented in Tab. I.

[i] We used the T2K flux, which peaks at 600 MeV/c; see Ref. [40].
GENIE dataset    Training | Validation | Testing
Class fractions: Track 43%, Crosstalk 37%, Ghost 20%

P-Bomb dataset   Training | Validation | Testing
Class fractions: Track 49%, Crosstalk 38%, Ghost 13%

TABLE I: Descriptions of both the GENIE and P-Bomb datasets, displaying the number of events and voxels used for training, validating and testing the models, together with the fractions of the different voxel classes.
Particle propagation simulation in the detector
The SuperFGD detector geometry was simulated as described in Ref. [14]. The particle propagation and physics simulation is done by means of GEANT v4-10.6.1 [41]. GEANT is a Monte Carlo based toolkit that provides realistic propagation of particles through matter; it outputs a list of energy deposits.

All energy deposits [ii] occurring in the same detector cube, including the effect of Birks' quenching [42], are summed to form the list of track voxels. To simulate imperfect cube light-tightness, the 3D voxelized energy is then randomly shared (µ = 2. %) with the neighboring cubes, forming the list of crosstalk voxels (see Figure 2). Then, the 3D voxelized energy of both track and crosstalk voxels is projected onto its three orthogonal planes, where the detector 2D signals are simulated: the continuous energy deposit is converted into discretized photons, weighted by distance-dependent attenuation factors, which are detected with 35% probability. To mimic a minimum detection threshold, only 2D hits with three or more detected photons are kept. Then, the 2D hits are matched into 3D reconstructed voxels only if the same

[ii] Only signals in the first 100 ns are considered. Further delayed signals, such as decays, can be treated as independent graphs.
XYZ coordinate combination can be made using two different combinations of 2D planes. In this process, ambiguities create some extra voxels, the ghost voxels (see Fig. 3). Finally, those track and crosstalk voxels not reconstructed after the 3D matching are discarded from the original lists. An example of the 2D-to-3D reconstruction is shown in Figures 4a and 4b.
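The matching rule just described can be illustrated with a small sketch (our own simplified version: charges, thresholds and detector effects are ignored). A candidate (x, y, z) is kept when its projections appear in all three views, which is equivalent to being buildable from two different pairs of planes.

```python
def match_3d(xy_hits, xz_hits, yz_hits):
    """Match the three 2D views into 3D voxels: keep (x, y, z) when its
    projections appear in all three views.  Simplified sketch only."""
    yz = set(yz_hits)
    voxels = set()
    for (x1, y) in xy_hits:
        for (x2, z) in xz_hits:
            if x1 == x2 and (y, z) in yz:
                voxels.add((x1, y, z))
    return voxels

# Three true hits reproduce the Fig. 3 configuration: the matching also
# yields (0, 0, 0), a ghost voxel with no true energy deposit behind it.
true_voxels = {(0, 1, 0), (0, 0, 1), (1, 0, 0)}
xy = {(x, y) for x, y, z in true_voxels}
xz = {(x, z) for x, y, z in true_voxels}
yz = {(y, z) for x, y, z in true_voxels}
print(match_3d(xy, xz, yz) - true_voxels)  # {(0, 0, 0)}
```

The sketch makes the origin of ghost voxels concrete: every coordinate of the spurious voxel is borrowed from a different real hit, exactly as in Fig. 3.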
Simulation output
The resulting output from the simulation is a list of voxels and their associated energy deposits in the three planes, each with one of the three labels that we want to classify, as described in Sec. II: track, crosstalk, or ghost voxel.
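The per-fiber detection and threshold step of the simulation described above can be sketched as follows. This is a toy model with our own names; light-yield conversion and attenuation are omitted, keeping only the 35% detection probability and the three-photoelectron threshold quoted in the text.

```python
import random

def digitize_hit(n_photons, detect_prob=0.35, threshold=3, rng=None):
    """Detect each photon reaching the sensor with 35% probability and
    keep the 2D hit only if 3 or more photoelectrons are recorded.
    Returns the p.e. count, or None if the hit falls below threshold."""
    rng = rng or random.Random()
    pe = sum(rng.random() < detect_prob for _ in range(n_photons))
    return pe if pe >= threshold else None

rng = random.Random(42)
print(digitize_hit(100, rng=rng))   # typically around 35 p.e.
print(digitize_hit(2, rng=rng))     # at most 2 p.e. -> None
```

Hits suppressed by this threshold are one reason track and crosstalk voxels can be lost before the 3D matching stage.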
B. Network architecture
Each graph is constructed using the spatial proximity of voxels. If two voxels are located within a radius of 1.75 cm [iii] of each other, we consider them to be connected by an edge; the same procedure is repeated for each pair of voxels [iv]. Additionally, we consider a neighborhood depth of three: to produce the embedding of a voxel, we use the voxel features together with the features of its first neighbors, of its second neighbors (the neighbors of its neighbors), and of its third neighbors. The aggregator used to combine the features of the neighbors is the mean aggregator, which produces the average of the neighbors' values. The final embedding is then passed to an MLP consisting of two fully connected layers, each followed by a LeakyReLU activation function, and a final output layer followed by a softmax activation function. Figure 6 illustrates the GraphSAGE-based approach used, while Tab. II shows the architectural parameters chosen. Categorical cross-entropy is chosen as the loss function to minimize during training, as it is the standard choice for multi-class classification problems:

J = −(1/m) ∑_{i=1}^{m} ∑_{j=1}^{c} y_j^{(i)} log ŷ_j^{(i)},    (2)

where:

• y^{(i)}: true values corresponding to the i-th training example.
• ŷ^{(i)}: predicted values corresponding to the i-th training example.
• m: number of training examples.
• c: number of classes/neurons of the output. In this case, the three classes are track, crosstalk, and ghost.

Parameter                            Value
Encoding size                        128
Depth                                3
Aggregator                           mean
Fully connected layer 1              128 neurons
Fully connected layer 2              128 neurons
Fully connected layer 3 (output)     3 neurons

[iii] This radius links only voxels within the 3 × 3 × 3 cube neighborhood, since the farthest such neighbor lies at √(1² + 1² + 1²) ≈ 1.73 cm.
[iv] If a voxel has no neighbors, it is discarded from the graph and cannot be classified; this happens for less than 0.6% of the total number of voxels.
TABLE II: Architectural parameters; for more information about the meaning of the parameters, see Sec. III B.

The output layer of the model consists of three neurons, one for each of the three classes, with values v_i for i = 1, 2, 3. The neuron values satisfy ∑_{i=1}^{3} v_i = 1, such that each value gives a fractional score that can be used to classify voxels. In other words, the model returns a score for each voxel to be each of the three desired outputs, which can be interpreted as a probability: track-like, crosstalk-like, or ghost-like.

C. Training
The network was trained for 50 epochs [v] using Python 3.6.9 and PyTorch 1.3.0 [43] as the deep learning framework, on an NVIDIA RTX 2080 Ti GPU. Adam [44] is used as the optimizer, with a mini-batch size of 32 and an initial learning rate of 0.001 (divided by 10 when the error plateaus, as suggested in [45]). The model has a total of 105,347 parameters. Figure 7 shows the validation results during the training process, measured by the F1-score metric:

F1 = 2 · (precision · recall) / (precision + recall).    (3)

The precision and recall are defined as:

precision = true positives / (true positives + false positives),    (4)
recall = true positives / (true positives + false negatives),    (5)

where the labels are compared as one class vs. all the others. The model later used for inference on new data is the one that maximizes the F1-score on the validation set, as it generalizes best to unseen data.

[v] Epoch: one forward pass and one backward pass of all the training examples. In other words, an epoch is one pass over the entire dataset.

FIG. 6: Visual illustration of the GraphSAGE sample-and-aggregate approach with a depth of three [32]: (a) sample the neighborhood; (b) aggregate feature information from the neighbors; (c) use the aggregated information as input to the fully connected layers and predict the label.
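Equations (3)-(5) can be checked with a short one-vs-rest computation. The label lists below are invented for illustration; note that recall divides the true positives by the true positives plus false negatives.

```python
def precision_recall_f1(y_true, y_pred, cls):
    """One-vs-rest precision, recall and F1-score for label `cls`."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

y_true = ["track", "track", "ghost", "crosstalk", "ghost"]
y_pred = ["track", "ghost", "ghost", "crosstalk", "track"]
print(precision_recall_f1(y_true, y_pred, "track"))  # (0.5, 0.5, 0.5)
```

Averaging the per-class F1-scores over the three labels gives a single validation number comparable to Fig. 7.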
FIG. 7: Validation F1 results on the GENIE and P-Bomb samples as a function of the training epoch, with the best epoch marked for each sample.
V. RESULTS
The GNN voxel-type predictions are compared against the true labels to evaluate the network performance and identify possible areas of improvement. Here, we choose the output class with the highest score as the predicted class of each voxel, although, depending on the type of analysis, different selection criteria could be applied in the future.

The efficiencies and purities of these predictions are calculated by two methods: per voxel and per event. The latter evaluates the correctness of predictions on an event-by-event basis, while the former is an overall calculation of the efficiencies and purities for all voxels in all events of the sample. The results of both methods for four sets of training/testing samples are shown in Tab. III, giving nearly identical performance that is independent of the dataset used to train and test the GNN.

As an example, Fig. 8 shows the voxel prediction results from the GNN when applied to the event shown in Fig. 4, a GENIE event that features a track almost completely composed of ghost voxels. Figure 8a shows the class predicted for each voxel, while Fig. 8b displays which voxels were correctly or incorrectly classified.

A more in-depth analysis of the GNN performance can be carried out by studying the effects of different event properties on the efficiencies and purities of the predictions. For these studies, the results of the GNN trained and tested on the GENIE dataset are used.

One of the factors expected to affect these predictions is the number of voxels in the event. Figure 9 shows the relationship between the mean efficiency and purity per event for each type of voxel as a function of the total number of voxels in the event. The figure also shows the mean number of events in each bin (in light blue). It is clear that both the efficiencies and purities of the three types of voxels decrease as the number of voxels in the event increases. This decrease is coupled with an increase in the fraction of ghost voxels, which are the hardest for the GNN to classify, as the total number of voxels grows.

The number of tracks in the event is an estimate of the complexity of its topology. According to Fig. 10, the classification efficiencies and purities drop as the number of tracks increases. This behaviour is also correlated with the increasing fraction of ghost voxels in the events.

The region around the interaction vertex is of particular interest. It is expected that a high spatial density of voxels within a certain volume of the detector may pose a challenge for the GNN to correctly identify the voxel type. This can be observed by studying the efficiencies and purities as a function of the distance to the interaction vertex, as shown in Fig. 11. At the interaction vertex itself, there are only track voxels, and the GNN identifies them with over 96% efficiency and 100% purity. The following 2 cm exhibit only a small fraction of ghost voxels, mainly due to the
                               GENIE Training              P-Bomb Training
                          Track  Crosstalk  Ghost      Track  Crosstalk  Ghost
GENIE    Per    Efficiency  93%      90%     84%         93%      89%     80%
Testing  Voxel  Purity      93%      87%     91%         91%      86%     89%
         Per    Efficiency  94%      94%     88%         94%      93%     88%
         Event  Purity      96%      91%     92%         95%      91%     91%
P-Bomb   Per    Efficiency  94%      93%     87%         95%      93%     88%
Testing  Voxel  Purity      95%      90%     92%         95%      91%     92%
         Per    Efficiency  94%      94%     87%         95%      93%     88%
         Event  Purity      96%      90%     92%         96%      91%     92%

TABLE III: Mean efficiencies and purities of voxel classification, calculated for the whole sample (per voxel) and as a mean of the event-by-event efficiencies and purities (per event).

high spatial density of voxels with real signals in that volume, which is mainly occupied by track and crosstalk voxels. Moving further from the vertex, the efficiencies and purities increase up to 10 cm, after which the density of voxels decreases, allowing for more ghost voxels. Therefore, at large distances the efficiencies and purities tend to decrease, most notably the efficiencies of crosstalk and ghost voxels and the purity of crosstalk voxels.

As the main goal of this GNN is to identify ghost voxels in order to eliminate them from the events, it is important to ensure that true track and crosstalk voxels are not lost in the process. According to the tests performed, only 1.1% of all true track voxels and 3.3% of crosstalk voxels are incorrectly classified as ghost voxels by the GNN. It is also important not to miss ghost voxels: the GNN correctly identified 84.5% of all ghost voxels, and 72.1% of those classified incorrectly were predicted as crosstalk. Therefore, although not ideal, this issue is not critical, as crosstalk voxels have a smaller influence on future studies than track voxels.

Lastly, we compare the results of the GNN against a conventional method of voxel classification that relies on a charge cut. As described in Appendix A, each voxel has three charges that correspond to the signals from the three fibers passing through it. Since other voxels along the same fiber may have signals causing a larger amplitude to be recorded, we consider the smallest of these three charges to be the most accurate estimate of the true voxel charge. Hence, this minimum charge is used for the purposes of the charge cut.
Since, by definition, we expect higher energy deposition in track voxels than in crosstalk and ghost voxels, we set a lower limit on the minimum charge in a voxel such that any voxel whose minimum charge exceeds the threshold is classified as a track voxel. Figure 12 shows the distribution of the minimum voxel charge for the three types of voxels. From this figure, it is clear that it is not possible to separate ghost from crosstalk voxels; thus, this classification is only binary, with two categories: track or other. We place the cut at 12 p.e., where the track and non-track voxel PDFs intersect.

To compare the results of this cut with those of our GNN, we combine the predictions of the crosstalk and ghost categories. Table IV shows the efficiency and purity of the classifications for the two methods. Using only a charge cut can still yield a track voxel classification efficiency comparable to the GNN; however, it struggles to correctly classify non-track voxels, which, in turn, reduces the purity of the predicted track voxels.

Another improvement of the GNN over the charge cut is the capability of rejecting "fake" tracks, i.e., clusters of ghost voxels that closely resemble the structure of a real particle track. Since fake tracks are usually produced by the shadowing of real tracks, the corresponding number of p.e. measured in the three readout views is higher than 12 p.e., hence the charge cut cannot reject them easily. The ability of the GNN to reject ghost tracks is shown in Appendix C for a number of neutrino interactions and compared to the charge cut method.
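The baseline selection can be sketched directly from this description (the function name and input layout are ours; charges are in p.e., and the 12 p.e. threshold is the PDF intersection point quoted above):

```python
def charge_cut_classify(voxel_charges, threshold=12.0):
    """Baseline classifier: a voxel's charge is taken as the minimum of
    its three fiber charges (in p.e.), and voxels above the threshold
    are labelled 'track', the rest 'other'."""
    labels = []
    for q_x, q_y, q_z in voxel_charges:
        labels.append("track" if min(q_x, q_y, q_z) > threshold else "other")
    return labels

print(charge_cut_classify([(30, 25, 18), (14, 9, 40)]))  # ['track', 'other']
```

Taking the minimum over the three fibers suppresses the shadowing effect of other voxels sharing a fiber, but, unlike the GNN, the cut has no way to split "other" into crosstalk and ghost.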
                 GNN                  Charge Cut
            Track   Other           Track   Other
Efficiency   94%     96%             93%     80%
Purity       96%     95%             80%     91%

TABLE IV: Mean efficiencies and purities of voxel classification for the GNN and a simple charge cut.

Figure 13 shows the advantage of the three-fold classification of the GNN over the binary classification of the charge cut when comparing the fraction of true total deposited energy obtained using each method. In the case of the GNN, the total deposited energy in an event is the sum of the true energy deposited in all non-ghost voxels. For the charge cut, only the energy deposited in track voxels is used. This causes an average energy loss of 5% per event when using a method that also excludes the crosstalk voxels, compared to less than 1% when using
(a) Prediction: voxels are colored based on the GNN predictions.
(b) Accuracy: voxels correctly classified by the GNN are shown in green.

FIG. 8: Example GNN prediction results for the interaction shown in Fig. 4.
the GNN that can isolate ghost voxels.

(a) Efficiency. (b) Purity. (c) Mean fraction of each type of voxel as a function of the number of voxels in the event (blue = track, orange = crosstalk, green = ghost).

FIG. 9: Efficiency and purity as a function of the number of voxels in the event for a sample trained and tested on GENIE simulated data.
VI. SYSTEMATIC UNCERTAINTY CONSIDERATIONS
The results presented in Sec. V show that the GNN is a very powerful technique for removing ghost voxels and identifying optical crosstalk in 3D-reconstructed neutrino interactions. It is important to demonstrate that this technique is robust and does not introduce new systematic uncertainties, potentially arising from a sub-optimal choice of the training sample.

One of the main limitations in the measurement of the neutrino oscillation parameters in long-baseline experiments comes from uncertainties in the modeling of neutrino interactions, which are not yet fully constrained by data and remain partially incomplete in describing all the details of the interaction final state. For example, the modeling of hadron multiplicity and kinematics may considerably change the image of the neutrino interaction, particularly near the neutrino vertex, or the total energy deposited by all the particles produced in the interaction. Hence, it is hard to obtain a data-driven control sample to train a neural network without making prior assumptions. Since the GNN is trained only on a subset of the parameter space, the results could be biased if the detected neutrino interactions belong to a region of the parameter space not well covered by the MC generator. Thus, there must be careful studies of the systematic uncertainties in order to account for a potentially incomplete sampling of the parameter space. In this section, we investigate potential sources of systematic uncertainty.

As described in Sec. V, different training samples (GENIE or P-Bomb) were generated, and the results were summarised in Tab. III, demonstrating that the performance remains very good even when the samples used for training and testing differ.

A few comments about the training sample generation are necessary. Whilst the GENIE sample corresponds to a particular choice of neutrino interaction model, the P-Bomb sample aims, in principle, to be as model-independent as possible. However, generalizing the training sample enough to contain all possible final states of the neutrino interaction is computationally prohibitive.

FIG. 10: Efficiency and purity as a function of the number of tracks in the event for a sample trained and tested on GENIE simulated data. (a) Efficiency and (b) purity per class (track, crosstalk, ghost); (c) mean fraction of each type of voxel as a function of the number of tracks in the event (blue = track, orange = crosstalk, green = ghost).

FIG. 11: Efficiency and purity as a function of the distance to the neutrino interaction vertex for a sample trained and tested on GENIE data. (a) Efficiency and (b) purity per class (track, crosstalk, ghost); (c) mean fraction of each type of voxel as a function of the distance to the vertex (blue = track, orange = crosstalk, green = ghost).
FIG. 12: The distribution of the minimum charge among the three voxel charges [p.e.] for the GENIE sample, shown separately for track, crosstalk and ghost voxels, with the charge-cut value indicated.
FIG. 13: The fraction of the true total deposited energy obtained when using the GNN (trained on GENIE) or the charge cut as the classification method.

The training sample will always have some limitations arising from the subjective decisions about which neutrino interaction topologies are sufficiently improbable to be omitted. Thus, it is necessary to adopt a figure of merit to quantitatively evaluate how two training samples differ in their sampling of the parameter space. Following the prescription described in Ref. [46], we computed the distance between the correlation matrices of the GNN input variables of the different training samples (GENIE and P-Bomb), defined as

    d(C_1, C_2) = 1 - tr{C_1 C_2} / (||C_1|| ||C_2||),    (6)

where C_1 and C_2 are the correlation matrices obtained from two different training samples. In the text we will refer to d(C_1, C_2) simply as the distance between training samples. Although this method is an approximation to a multivariate Gaussian distribution of the probability density function and does not take into account the scales and means of the variables, it still provides a quick way to understand how similar the training samples are. Nevertheless, other computationally heavy methods, such as those involving the difference of probability-density-function integrals, could be used, but are beyond the scope of this article. The correlation matrices of the features for the GENIE and P-Bomb datasets are presented in Fig. 17 in Appendix B. An "alternative" P-Bomb sample, inclusive of non-physical event topologies (e.g. events not predicted by neutrino event generators) with respect to the other one, was generated for the systematic uncertainty evaluation. However, since the distance between the original and the alternative P-Bomb samples is nearly zero, the latter was not used. In Tab. V, the distances of Eq. (6) between the correlation matrices of the three generated training samples are shown.

              GENIE    P-Bomb     Alternative
GENIE           -      0.020075   0.020803
P-Bomb                    -
Alternative                           -
TABLE V: Distance, as defined in Eq. (6), between the correlation matrices obtained from the different training samples. Details of the generation of the GENIE and P-Bomb samples are described in Sec. IV A. The Alternative sample was built with a particle bomb similar to the P-Bomb sample but with additional event topologies.

The robustness of the GNN against model dependencies can be verified by training different neural networks on different event samples and applying them to the same set of neutrino interactions. A difference in the observables used in the physics measurement, such as particle momenta, energy deposits, etc., obtained with the different trainings can be assigned as a systematic uncertainty introduced by the method.

A study was performed to evaluate the impact of the method on the total true energy deposited in the detector. The difference between the total energy deposits computed after rejecting the voxels classified as ghosts by the two network trainings was evaluated. Fig. 14 shows the distribution of the total true deposited energy before and after discarding the voxels classified as ghosts. Both GENIE- and P-Bomb-trained GNNs give very similar results over the full range of total deposited energy. The total true deposited energies computed with and without ghost rejection differ on average by less than 1 MeV; this is expected to improve further by increasing the statistics of the training samples. The total difference between GENIE- and P-Bomb-trained GNNs is found to be less than 1 MeV, with a standard deviation of approximately 5.5 MeV, mainly due to a few outlier entries, and 68% of the events show a difference smaller than 0.192 MeV, as shown in Fig. 15. This corresponds to less than 2% of the mean total deposited energy per event. In Fig. 16 the impact of the different training samples is shown as a function of the total deposited energy.
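The training-sample distance of Eq. (6) is straightforward to compute; a sketch using NumPy, assuming Frobenius norms for ||·|| and toy Gaussian samples in place of the real feature sets:

```python
import numpy as np

def correlation_distance(c1, c2):
    """Distance between two correlation matrices (Eq. 6):
    d = 1 - tr(C1 C2) / (||C1|| ||C2||), with Frobenius norms."""
    num = np.trace(c1 @ c2)
    den = np.linalg.norm(c1) * np.linalg.norm(c2)
    return 1.0 - num / den

# Correlation matrices of two toy 'training samples' with 5 features.
rng = np.random.default_rng(0)
sample_a = rng.normal(size=(1000, 5))
sample_b = rng.normal(size=(1000, 5))
c_a = np.corrcoef(sample_a, rowvar=False)
c_b = np.corrcoef(sample_b, rowvar=False)

print(correlation_distance(c_a, c_a))  # ~0: identical samples
print(correlation_distance(c_a, c_b))  # small but non-zero
```

For symmetric matrices, the Cauchy-Schwarz inequality guarantees the distance is non-negative, vanishing only when the two correlation structures are proportional.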
The fractional standard deviation, defined as the standard deviation of the difference between the deposited energies computed from the different GNN trainings, divided by the true deposited energy, is shown in the bottom panel; it is less than 2% and almost constant as a function of the deposited energy. This means that the performance of the method is roughly independent of the total deposited energy. This study confirms that the GNN can be used for classifying 3D voxels with potentially limited systematic uncertainties in the deposited energy, while drastically improving the tracking capability.
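As a sketch, the fractional standard deviation just defined can be computed as follows (the array names are illustrative; for simplicity this normalizes by the mean true deposited energy rather than bin-by-bin as in Fig. 16):

```python
import numpy as np

def fractional_std(e_dep_genie, e_dep_pbomb, e_dep_true):
    """Std of the per-event difference between deposited energies
    computed from the two GNN trainings, divided by the (mean)
    true deposited energy."""
    diff = np.asarray(e_dep_genie) - np.asarray(e_dep_pbomb)
    return np.std(diff) / np.mean(e_dep_true)

# Toy per-event energies in MeV.
e_genie = [100.0, 200.0, 300.0]
e_pbomb = [101.0, 199.0, 300.0]
e_true = [100.0, 200.0, 300.0]
print(fractional_std(e_genie, e_pbomb, e_true))
```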
FIG. 14: Distribution of the total true deposited energy after rejecting the ghost voxels classified with either the GENIE- (dashed orange) or P-Bomb- (dotted green) trained GNN, and without any ghost rejection (solid blue). The mean total deposited energy per event is about 288 MeV.

FIG. 15: Difference between the total true deposited energies computed after rejecting the ghost voxels classified with the GENIE- and P-Bomb-trained GNNs. The mean is 0.78 MeV while the standard deviation is 5.5 MeV. About 40% of events show no difference between P-Bomb and GENIE, and 68% have a difference within ±0.192 MeV.

Nominal crosstalk (2.7%):
            Track   Crosstalk   Ghost
Efficiency   93%      90%        84%
Purity       92%      87%        91%

Crosstalk 2%:
            Track   Crosstalk   Ghost
Efficiency   92%      89%        81%
Purity       94%      83%        89%

Crosstalk 5%:
            Track   Crosstalk   Ghost
Efficiency   94%      89%        88%
Purity       86%      91%        93%

TABLE VI: Mean efficiencies and purities of voxel classification, per voxel, for different crosstalk values, i.e. 2.7% (nominal), 2% and 5%. The GNN was trained with GENIE training samples with nominal crosstalk and tested on the same GENIE sample with different crosstalk values to study its robustness.
VII. CONCLUSIONS
A graph neural network inspired by GraphSAGE was developed and tested on simulated neutrino interactions in a 3D voxelized fine-granularity plastic-scintillator detector with three 2D readout views. The advantage of this neural network is that graph data structures provide a natural representation of the neutrino interactions.

The neural network was able to identify ambiguities and scintillation light leakage between neighboring active scintillator detector volumes, as well as the real signatures left by particles, with efficiencies and purities in the range of 94-96% per event, a clear improvement with respect to less sophisticated methods. In particular, it can reject fake tracks produced by the shadowing of real tracks observed in the 2D readout views. The performance was tested for neutrino events with different numbers of voxels, different numbers of tracks, and voxels at different distances from the vertex, variables that could hint at interaction-model dependencies of the method. Efficiencies and purities were found to be relatively stable, and the trends were consistent with expectations. The robustness of the neural network against possible systematic uncertainties introduced by the method was tested. The results were obtained using neural networks trained on different samples, produced either with the GENIE event generator or by randomizing the number of final-state particles and their relative momenta to obtain a more generic sample that does not belong to any particular theoretical model. It was found that the bias introduced on the total deposited energy of the event by arbitrarily choosing a different training sample is, on average, less than 1 MeV. The impact of potential mismodeling of the light leakage between neighboring scintillator volumes was also tested.
Results show that the performance of the neural network is robust to expected changes in the crosstalk modeling.

To conclude, we showed that a graph neural network has great potential in assisting a 3D particle-flow reconstruction of neutrino interactions. Similar results may be expected for other types of detectors that aim at a 3D reconstruction of the neutrino event from 2D projections and that share analogous features, such as ambiguities and leakage of signal between detector voxels.
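For context, the mean-aggregation step at the heart of GraphSAGE-style networks can be sketched in a few lines. This is an illustrative NumPy layer, not the authors' implementation, and the function and variable names are our own:

```python
import numpy as np

def graphsage_mean_layer(h, neighbors, w_self, w_neigh):
    """One GraphSAGE-style layer with mean aggregation:
    h'_v = ReLU(W_self h_v + W_neigh * mean of h_u over neighbors u of v).
    h: (n_nodes, d_in) node features; neighbors: list of index lists."""
    out = np.zeros((h.shape[0], w_self.shape[0]))
    for v, nbrs in enumerate(neighbors):
        agg = h[nbrs].mean(axis=0) if nbrs else np.zeros(h.shape[1])
        out[v] = w_self @ h[v] + w_neigh @ agg
    return np.maximum(out, 0.0)  # ReLU

# Toy graph: 3 voxels in a chain 0-1-2, with 2D features each.
h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
neighbors = [[1], [0, 2], [1]]
w = np.eye(2)  # identity weights, so the update is h_v + mean(neighbors)
h_next = graphsage_mean_layer(h, neighbors, w, w)
print(h_next)
```

Stacking such layers lets each voxel's classification depend on progressively larger neighborhoods, which is what allows the network to recognize extended structures such as fake tracks.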
VIII. ACKNOWLEDGMENTS
D. Douqa and F. Sánchez acknowledge the Swiss National Foundation Grant No. 200021 85012. C. Jesús-Valls and T. Lux acknowledge funding from the Spanish Ministerio de Economía y Competitividad (SEIDI-MINECO) under Grants No. FPA2016-77347-C2-2-P and SEV-2016-0588. S. Pina-Otey acknowledges the support of the Industrial Doctorates Plan of the Secretariat of Universities and Research of the Department of Business and Knowledge of the Generalitat of Catalonia. The authors also thank T. Jiang, T. Zhao and D. Wang for their implementation of GraphSAGE (https://github.com/twjiang/graphSAGE-pytorch), on which the software of this paper was based.

This work was initiated in the framework of the T2K Near Detector upgrade project; fruitful discussions in this context with our colleagues are gratefully acknowledged. The authors acknowledge the T2K Collaboration for providing the neutrino interaction and detector simulation software.

Appendix A: Input variables
The list of variables used as features for the graph nodes is given below. Each node is placed at the XYZ coordinates of the center of a cube; however, these center coordinates are not node variables themselves, since the detector response is isotropic. The numbers in front of each variable match those in Fig. 17.

- peXY, peXZ, peYZ: number of photons detected in the XY, XZ or YZ fiber intersecting the cube under consideration, corrected by the expected attenuation.

- mXY, mXZ, mYZ: number of active voxels intersected by the fiber associated to peXY, peXZ or peYZ.

- 6: pewav. Average of the detected photons peXY, peXZ, peYZ, each weighted by the corresponding fiber multiplicity mXY, mXZ, mYZ:

    pewav = (peXY/mXY + peXZ/mXZ + peYZ/mYZ) / 3.

- pullX, pullY, pullZ: relative difference between the light measured in two different 2D planes:

    pullX = (peXY - peXZ) / (peXY + peXZ),
    pullY = (peXY - peYZ) / (peXY + peYZ),
    pullZ = (peXZ - peYZ) / (peXZ + peYZ).

- 10: residual. Similarity of the light yield measured in the three 2D planes, measured as the squared distance from each of peXY, peXZ, peYZ to their average, weighted by the squared average:

    µ = (peXY + peXZ + peYZ) / 3,
    residual = [(peXY - µ)^2 + (peXZ - µ)^2 + (peYZ - µ)^2] / µ^2.

- 11: pullXYZ. Similarity of the light yield measured in the three 2D planes, measured as a combination of 2D pulls (a_1, a_2, a_3) weighted by pewav:

    a_1 = (peXY/mXY - peXZ/mXZ) / (peXY/mXY + peXZ/mXZ),
    a_2 = (peXY/mXY - peYZ/mYZ) / (peXY/mXY + peYZ/mYZ),
    a_3 = (peXZ/mXZ - peYZ/mYZ) / (peXZ/mXZ + peYZ/mYZ),
    pullXYZ = (a_1^2 + a_2^2 + a_3^2) / pewav.

- 12: ratioMQ. Ratio between the average voxel multiplicity in the three fibers and pewav:

    ratioMQ = (mXY + mXZ + mYZ) / (3 pewav).

- R1, R2, R3: number of active neighbor voxels in a sphere of a given radius: R1, r = 1 cm; R2, r = 2 cm; R3, r = 5 cm. R2 was not used as a variable due to its high correlation with R1, but is used to compute RR.

- x+, x-, y+, y-, z+, z-: Boolean variables representing the existence of immediate neighbors in each of the 6 surrounding cubes.

- 21: orthogonal neighbor. It is 1 if any of x+, x-, y+, y-, z+, z- is 1.

- 22: RR. Ratio between the number of close and far voxels, where the small constant ε prevents numerical problems when R3 = 0:

    RR = R2 / (R3 + ε).

- 23: ratioDQ. Ratio between the average voxel distance aveDist around the voxel and the weighted average light yield pewav:

    ratioDQ = aveDist / pewav.

- 24: aveDist. Average distance from the voxel center C to all fired voxel centers C_i within a sphere of radius 2.5 cm:

    aveDist = (1/N) Σ_i EuclideanDist(C, C_i).

A number of these variables are calculated from the same underlying properties of the energy deposits. In theory, an infinitely deep GNN trained on an infinite amount of training data would be able to extract all of the information required for classification from the few underlying properties. In practice, we use a larger number of derived variables to guide the GNN, allowing it to extract information from the data more easily and to converge quickly during training. The global position was intentionally not used as a variable, to avoid the GNN learning neutrino-modelling-specific behaviours.

Appendix B: Comparison of GENIE and P-Bomb simulated data samples
Figure 17 shows the correlations of the input variables defined in Appendix A for the GENIE and P-Bomb data samples. Differences between the two matrices arise from the different topologies of the interactions produced by the two generation methods.
Appendix C: Event Gallery
This section contains a number of visualizations showing the classification performance of the GNN for neutrino interactions of different complexity and topology. Displays are shown for different events in Figs. 18-23: all voxels with their true classification, only the true track voxels, the track voxels selected by the charge-cut method, and the track voxels selected by the GNN. The interactions shown here were chosen to contain many ghost voxels in order to showcase the GNN performance.

The track-voxel classification ability of the charge-cut and GNN methods can be seen by comparing subfigures (c) and (d), respectively, with (b) for each interaction. The GNN rejects ghost voxels very well, as shown in Figs. 19, 21, 22 and 23, where ghost tracks remain when the charge-cut method is used. In general, the performance improvement from the GNN increases with the complexity of the interaction. For simple interactions with only a single muon in the final state, both methods perform similarly.

FIG. 17: Correlation matrices for the input variables of the (a) GENIE and (b) P-Bomb datasets used. Appendix A gives the mapping between the numbers on the axes and the variable names.

Each of Figs. 18-23 shows a 3D visualization of a neutrino interaction in a finely segmented 3D scintillator detector after the 3D matching of the three 2D views, in four panels: (a) the 3D voxels labelled as track (red), crosstalk (blue) and ghost (yellow) according to the truth information from the simulation; (b) only the 3D voxels labelled as track according to the truth information; (c) the 3D voxels labelled as track according to the charge-cut classification; (d) the 3D voxels labelled as track according to the GNN classification.

FIG. 18: The GNN is able to almost entirely reject the fake track traveling in the XZ plane and stopping near the vertex at X ∼ 160 mm and Z ∼ 70 mm, while the charge cut cannot.

FIG. 19: The charge cut is not able to reject two fake tracks, one coming from a vertex at X < 50 mm, Z < 50 mm, traveling in the XZ plane and stopping near the vertex at X ∼ 160 mm and Z ∼ 70 mm. Moreover, the charge cut leaves a bump of ghost voxels around the vertex that could mimic the interaction of a few low-energy protons, an effect that could bias the reconstruction of the neutrino energy.

FIG. 20: In this event the performance of the GNN and the charge cut is quite similar, because the ghost voxels are mainly produced by the overlap of crosstalk hits in the 2D readout views.

FIG. 21: This neutrino event has a rather high multiplicity and the tracks lie quite close to each other. This produces relatively large clusters of ghost voxels, which give rise to at least two fake tracks even after the charge cut. The GNN, instead, classifies ghosts more precisely and recovers the correct number of tracks. Moreover, the charge cut makes true tracks thicker, making their separation harder and, potentially, the particle momentum reconstruction less precise.

FIG. 22: Although this is a relatively simple neutrino event, the charge cut is not able to reject a fake track stopping near the neutrino interaction vertex, while the GNN provides a much cleaner reconstruction.

FIG. 23: In this neutrino event the GNN can easily reject the relatively large cluster of ghost voxels that would otherwise hinder a proper reconstruction of the number of tracks and the corresponding energy, in particular near the interaction vertex.

[1] Y. Fukuda et al. (Super-Kamiokande), Phys. Rev. Lett.
, 1562 (1998), arXiv:hep-ex/9807003 [hep-ex].
[2] B. Aharmim et al. (SNO), Phys. Rev. C, 055502 (2005), arXiv:nucl-ex/0502021 [nucl-ex].
[3] T. Araki et al. (KamLAND), Phys. Rev. Lett., 081801 (2005), arXiv:hep-ex/0406035 [hep-ex].
[4] M. H. Ahn et al. (K2K), Phys. Rev. D, 072003 (2006), arXiv:hep-ex/0606032 [hep-ex].
[5] F. P. An et al. (Daya Bay), Phys. Rev. Lett., 171803 (2012), arXiv:1203.1669 [hep-ex].
[6] P. Adamson et al. (MINOS), Phys. Rev. Lett., 191801 (2014), arXiv:1403.0867.
[7] M. Acero et al. (NOvA), Phys. Rev. Lett., 151803 (2019), arXiv:1906.04907 [hep-ex].
[8] K. Abe et al. (T2K), Nature, 339 (2020), arXiv:1910.03887 [hep-ex].
[9] M. A. Acero et al. (NOvA), Phys. Rev. D98, 032012 (2018), arXiv:1806.00096 [hep-ex].
[10] B. Abi et al. (DUNE), "Deep Underground Neutrino Experiment (DUNE), Far Detector Technical Design Report, Volume II: DUNE Physics" (2020), arXiv:2002.03005 [hep-ex].
[11] M. Tanabashi et al. (Particle Data Group), Phys. Rev. D, 030001 (2018).
[12] C. Rubbia, "The Liquid Argon Time Projection Chamber: A New Concept for Neutrino Detectors" (1977).
[13] A. Blondel, F. Cadoux, S. Fedotov, M. Khabibullin, A. Khotjantsev, A. Korzenev, A. Kostin, Y. Kudenko, A. Longhin, A. Mefodiev, P. Mermod, O. Mineev, E. Noah, D. Sgalaberna, A. Smirnov, and N. Yershov, Journal of Instrumentation, P02006 (2018).
[14] K. Abe et al., "T2K ND280 Upgrade - Technical Design Report" (2019), arXiv:1901.03750 [physics.ins-det].
[15] G. Yang (DUNE), PoS ICHEP2018, 868 (2019).
[16] A. Sperduti and A. Starita, IEEE Transactions on Neural Networks, 714 (1997).
[17] A. Blondel et al., "The SuperFGD Prototype Charged Particle Beam Tests" (2020), arXiv:2008.08861 [physics.ins-det].
[18] O. Mineev et al., Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 134 (2019).
[19] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, Neural Computation, 541 (1989).
[20] A. Aurisano et al., Journal of Instrumentation, P09001 (2016).
[21] R. Acciarri et al. (MicroBooNE), JINST, P03011 (2017), arXiv:1611.05531 [physics.ins-det].
[22] B. Abi et al. (DUNE), "Neutrino interaction classification with a convolutional neural network in the DUNE far detector" (2020), arXiv:2006.15052 [physics.ins-det].
[23] B. Abi et al. (DUNE), "The Single-Phase ProtoDUNE Technical Design Report" (2017), arXiv:1706.07081 [physics.ins-det].
[24] C. Adams et al. (MicroBooNE Collaboration), Phys. Rev. D, 092001 (2019).
[25] J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, "Graph neural networks: A review of methods and applications" (2018), arXiv:1812.08434 [cs.LG].
[26] D. K. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams, in Advances in Neural Information Processing Systems 28, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (Curran Associates, Inc., 2015) pp. 2224-2232.
[27] Y. Shen, H. Li, S. Yi, D. Chen, and X. Wang, in The European Conference on Computer Vision (ECCV) (2018).
[28] Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon, "Dynamic Graph CNN for Learning on Point Clouds" (2018), arXiv:1801.07829 [cs.CV].
[29] Z. Ying, J. You, C. Morris, X. Ren, W. Hamilton, and J. Leskovec, in Advances in Neural Information Processing Systems 31, edited by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Curran Associates, Inc., 2018) pp. 4800-4810.
[30] B. Perozzi, R. Al-Rfou, and S. Skiena, in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14 (Association for Computing Machinery, New York, NY, USA, 2014) pp. 701-710.
[31] T. N. Kipf and M. Welling, "Semi-Supervised Classification with Graph Convolutional Networks" (2016), arXiv:1609.02907 [cs.LG].
[32] W. L. Hamilton, R. Ying, and J. Leskovec, "Inductive Representation Learning on Large Graphs" (2017), arXiv:1706.02216 [cs.SI].
[33] N. Choma, F. Monti, L. Gerhardt, T. Palczewski, Z. Ronaghi, M. Prabhat, W. Bhimji, M. Bronstein, S. Klein, and J. Bruna (2018) pp. 386-391.
[34] S. Farrell, P. Calafiura, M. Mudigonda, Prabhat, D. Anderson, J.-R. Vlimant, S. Zheng, J. Bendavid, M. Spiropulu, G. Cerati, L. Gray, J. Kowalkowski, P. Spentzouris, and A. Tsaris, "Novel deep learning methods for track reconstruction" (2018), arXiv:1810.06111 [hep-ex].
[35] S. R. Qasim, J. Kieseler, Y. Iiyama, and M. Pierini, The European Physical Journal C (2019), 10.1140/epjc/s10052-019-7113-9.
[36] X. Ju, S. Farrell, P. Calafiura, D. Murnane, Prabhat, L. Gray, T. Klijnsma, K. Pedro, G. Cerati, J. Kowalkowski, G. Perdue, P. Spentzouris, N. Tran, J.-R. Vlimant, A. Zlokapa, J. Pata, M. Spiropulu, S. An, A. Aurisano, J. Hewes, A. Tsaris, K. Terao, and T. Usher, "Graph Neural Networks for Particle Reconstruction in High Energy Physics detectors" (2020), arXiv:2003.11603 [physics.ins-det].
[37] S. Hochreiter and J. Schmidhuber, Neural Computation, 1735 (1997).
[38] F. Rosenblatt, Cornell Aeronautical Laboratory (1957), report 85-460-1.
[39] C. Andreopoulos et al., Nucl. Instrum. Meth. A, 87 (2010), arXiv:0905.2517 [hep-ph].
[40] K. Abe et al. (T2K), Phys. Rev. D87, 012001 (2013), [Addendum: Phys. Rev. D87, no. 1, 019902 (2013)], arXiv:1211.0469 [hep-ex].
[41] S. Agostinelli et al. (GEANT4), Nucl. Instrum. Meth. A506, 250 (2003).
[42] J. B. Birks, Proceedings of the Physical Society, Section A, 874 (1951).
[43] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, in Advances in Neural Information Processing Systems 32 (Curran Associates, Inc., 2019) pp. 8024-8035.
[44] D. P. Kingma and J. Ba, CoRR abs/1412.6980 (2014), arXiv:1412.6980.
[45] K. He, X. Zhang, S. Ren, and J. Sun, CoRR abs/1512.03385 (2015), arXiv:1512.03385.
[46] M. Herdin, N. Czink, H. Ozcelik, and E. Bonek, in 2005 IEEE 61st Vehicular Technology Conference