Material Identification in Nuclear Waste Drums using Muon Scattering Tomography and Multivariate Analysis

Abstract

The use of muon scattering tomography for the non-invasive characterisation of nuclear waste is well established. We report here on the application of a combination of feature discriminators and multivariate analysis techniques to locate and identify materials in nuclear waste drums. After successful training and optimisation, the algorithms are tested on a range of material configurations to assess the system's performance and limitations. The system is able to correctly identify uranium, iron and lead objects on a few cm scale. The system's sensitivity to small uranium objects is also established as $0.90^{+0.07}_{-0.12}$, with a false positive rate of $0.12^{+0.12}_{-0.07}$.


Prepared for submission to JINST


M.J. Weekes,𝑎 A.F. Alrheli,𝑎 D. Barker,𝑎 D. Kikoła,𝑏 A.K. Kopp,𝑐 M. Mhaidra,𝑏,𝑐 J.P. Stowell,𝑎 L.F. Thompson,𝑎 J.J. Velthuis𝑐

𝑎 University of Sheffield, Department of Physics and Astronomy, Hounsfield Road, Sheffield, S3 7RH, UK
𝑏 Warsaw University of Technology, Pl. Politechniki 1, 00-661 Warsaw, Poland
𝑐 University of Bristol, School of Physics, HH Wills Physics Laboratory, Tyndall Avenue, Bristol BS8 1TL, UK


Keywords: Analysis and Statistical Methods, Particle Tracking Detectors

1 Introduction

It is important to develop non-destructive methods to determine the contents of sealed nuclear waste packages, in order to minimise the risks of environmental contamination and personnel radiation exposure and to allow for more effective safeguarding. Non-Destructive Assay (NDA) techniques in current use include calorimetry and Muon Scattering Tomography (MST).

NDA techniques can analyse drum contents in a variety of ways. For example, calorimetry can be used to measure the mass of nuclear material inside a container by its heat emission [1]. In contrast, MST (with exposure times of several days to weeks) can produce full 3D images of a volume of interest, allowing individual objects inside the drum to be viewed as well as giving information on their atomic number $Z$ and density [2].

Simulation studies are useful tools to assess MST techniques and algorithms; the technique described in this paper was developed and tested via Monte Carlo simulations. It uses MST data in combination with Multi-Variate Analysis (MVA) classifiers and clustering algorithms to approximately identify the locations and shapes of objects stored in a concrete-filled waste drum. Subsequently, additional trained classifiers are applied to each identified object to classify it as 'iron', 'lead', or 'uranium', representing low-threat medium-$Z$ material, low-threat high-$Z$ material, and high-threat high-$Z$ material respectively. The use of these four materials (concrete, iron, lead and uranium) allows three classification problems of interest to be investigated: separation of stored objects from the concrete background, separating medium- and high-$Z$ materials, and distinguishing between two high-$Z$ materials.

Previous applications of machine learning techniques to MST imaging have demonstrated methods for distinguishing between drums containing uranium and lead blocks [3] and for reconstructing the size of uranium blocks [4].
Our system builds on these through the ability to isolate and identify multiple distinct bodies of different materials and sizes in a waste drum. Other previous research into combining machine learning and MST includes applications in cargo scanning [5][6], a related problem for which short exposure times are required.

Cosmic rays interact with the Earth's atmosphere to produce showers of particles, some of which subsequently decay to muons, resulting in a muon flux at sea level of around 1 cm$^{-2}$ min$^{-1}$ [7]. These cosmic ray muons are highly penetrating due to their large mass and lack of strong interactions. They have an angular distribution that varies approximately as $\cos^2\theta$, where $\theta$ is the zenith angle. Muons are also highly sensitive to the atomic number $Z$ of the material they are passing through, making them suitable candidates for tomographic imaging of nuclear waste drums.

Muons undergo multiple elastic Coulomb scatterings in matter, with the projected scattering angles following an approximately Gaussian distribution with width $\sigma$ given by

$$\sigma \approx \frac{13.6\,\mathrm{MeV}}{\beta c p}\sqrt{\frac{X}{X_0}} \qquad (2.1)$$

where $\beta$ is the muon speed divided by the speed of light in a vacuum, $c$; $p$ is the muon momentum, $X$ is the thickness of the material and $X_0$ is the radiation length of the material [8]. The latter is given by

$$X_0 = \frac{716.4\,A}{Z(Z+1)\ln(287/\sqrt{Z})}\ [\mathrm{g\,cm^{-2}}] \qquad (2.2)$$

where $A$ is the atomic mass [9]; dividing by the material density $\rho$ expresses $X_0$ as a length.

A general MST experiment consists of two sets of particle detectors, one above and one below some volume of interest such as a waste drum (see Figure 1). Multiple layers of detector are necessary in order to construct a three-dimensional trajectory for each muon from the detector hits. This allows the incoming and outgoing trajectories of each muon to be measured and hence the muon scattering angles to be calculated.

Figure 1. Schematic showing the principle of muon scattering tomography applied to a nuclear waste drum containing a block of high-$Z$ material (in green). Particle detectors measure the trajectories of muons before and after encountering the volume of interest, allowing the scattering angle $\theta$ (here exaggerated) to be calculated.

Several algorithms have been developed to enable imaging of a volume of interest from MST data. The simplest is the Point of Closest Approach (PoCA) algorithm [10], which models a muon's multiple scatterings as a single scattering at a single point ('scattering vertex'), found by extrapolating the incoming and outgoing tracks into the volume and finding the point which minimises the distance to each. This assumption allows for fast computation at the expense of image quality. A more advanced MST algorithm has been used in this study (see section 2.1) which builds on PoCA by exploiting the spatial density of scattering vertices; a high density of scattering vertices corresponds to the presence of high-$Z$ material as large-angle muon scatterings occur more often in such materials.

This algorithm, developed in [11], improves on the widely-used Point of Closest Approach (PoCA) muon tomography algorithm [2] by taking into account the degree of spatial clustering of muon scattering vertices. A higher density of vertices corresponds to higher-$Z$ materials (once the muon momentum is accounted for, see below) as strong muon scatterings take place with greater frequency in such materials.

The volume is divided into cubic voxels of side length 1 cm. The incoming and outgoing muon tracks are extrapolated through the volume, and the point at which the distance between the tracks is minimal (the PoCA) is designated as the scattering vertex for the muon. This is repeated for all of the detected muons that encounter the volume of interest. Next, the scattering vertices inside each 1 cm voxel are sorted by the scattering angle of the corresponding muon, and the vertices corresponding to the $n$ largest scattering angles are kept (voxels which contain fewer than $n$ vertices are discarded).
The value of $n$ is an important tunable parameter of the algorithm. High values of $n$ improve the contrast between high- and low-$Z$ materials, as a greater sample of muons is kept, but reduce image 'quality' (i.e. the number of non-empty voxels in the image) as more voxels fall below the cut and are removed from the image.

Figure 2. Comparison of distributions of the binned clustering algorithm discriminator, for 20 cm cubes of uranium and concrete. Lower discriminator values correspond to higher-$Z$ material.

For each of the $\binom{n}{2}$ pairs of vertices $i, j$ in each voxel, a metric value $m_{ij}$ is calculated according to

$$m_{ij} = \frac{|\mathbf{V}_i - \mathbf{V}_j|}{(\theta_i \tilde{p}_i)\cdot(\theta_j \tilde{p}_j)} \qquad (2.3)$$

where $\mathbf{V}_i$, $\theta_i$ and $\tilde{p}_i$ are respectively the position, scattering angle and normalised (by a factor of 3 GeV/c) momentum of muon $i$. Weighting by muon momentum is necessary as large scattering angles could indicate low-momentum muons being scattered in low-$Z$ materials instead of strong scattering in high-$Z$ materials. In an experimental system, the muon momentum can be estimated using the muon scatterings between the detector planes, as the planes are of known material and thickness. Following the method of [12], for our simulations the momentum was obtained by adding a smearing factor to the Monte Carlo truth momentum. The smearing factor was drawn from a Gaussian with width 50% of the Monte Carlo truth momentum.

Finally, the median of the distribution of $\log(m_{ij})$ in a voxel is determined; this is the algorithm's discriminator value for that voxel. Comparing the distributions of this discriminator for high- and low-$Z$ materials shows that the discriminator is sensitive to $Z$ (see Figure 2).

Figure 3. $xy$ (left) and $xz$ (right) slices from binned clustering output images of waste drums containing 10 cm side length cubes of uranium (top) and lead (bottom). Exposure time = 10 days, $n$ = 5. The smearing effect along the $z$ axis is due to uncertainty in the scattering vertex $z$ coordinate for tracks with small scattering angles. Note that the plotted discriminator values have been subtracted from 12 for visual clarity.

For imaging purposes, each voxel is filled with its discriminator value as described above, creating a tomogram of the volume of interest. Viewing slices of discriminator values through the image (see Figure 3) allows regions of high-$Z$ material to be identified visually. This gives a degree of information about the locations and morphologies of objects stored in the drum. However, it is vulnerable to a vertical smearing effect inherent in the PoCA reconstruction, and without an object of known material for comparison, it is difficult to determine the specific materials of objects 'by eye'. Additionally, without any way to automatically remove background materials such as the steel drum and concrete matrix, the 3D image must be viewed in slices to determine the locations of stored objects.

By default, the binned clustering algorithm only takes into account the median of the $\log(m_{ij})$ distribution in each voxel. To test the possibility that additional material information is encoded in the shape of the $\log(m_{ij})$ value distribution, variables capturing the shape were used to train MVA classifiers. These classifiers are then used to separate the regions of the image corresponding to objects stored in the drum from the concrete matrix. Subsequently the classifiers are used to assign a material to each identified object.

All simulations were performed using CRESTA [13], a cosmic ray simulation platform built on the Geant4 [14] particle physics toolkit and the CRY [15] cosmic ray library. Within CRESTA an MST detector system comprising two particle detector modules above and below a waste drum was simulated (see Figure 4). This represents a 'generic' MST detector system, designed for imaging a 1 m waste drum.
The detector modules are 2 m by 2 m and each consists of two layers of resistive plate chambers (RPCs), polystyrene scintillator triggers and three layers of drift chambers. The RPCs and drift chambers have spatial resolutions of ∼ … μm and ∼ … $x$ and $y$ layers, allowing 3D muon hits to be recorded and the incoming and outgoing tracks reconstructed.

Figure 4. MST detector system simulated in CRESTA: detector modules above and below a waste drum, in which objects can be placed. The detector modules are approximately 2 m × 2 m.

The waste drum is made of steel (approx. 91% iron, 9% carbon; element isotopes in natural abundances). It is approximately 100 cm in length and 30 cm in radius (see Figure 5 for precise dimensions), and is filled with homogeneous concrete of density 2.… g cm$^{-3}$.

Figure 5. The simulated concrete-filled steel nuclear waste drum used in CRESTA, with its dimensions.

Frazão et al. [3] used MVA classifiers trained on simulated MST data to discriminate between waste drums containing lead and uranium blocks. This method can be thought of in a 'global' sense, distinguishing between two categories of waste drum but not analysing the specific drum contents in terms of bodies encased in the concrete. Our approach by contrast is 'local', as we are able to produce localised material information down to the scale of single 1 cm voxels. This approach requires longer exposure times (of the order of several days rather than hours) but gives more detailed material information. This allows for the possibility of combining the two techniques: a user could apply the former method with a short exposure to identify drums likely to contain high-$Z$ material, then apply our method with a longer exposure to the flagged drums to identify the stored objects and their materials.

Our MVA classifiers were built, trained and analysed using TMVA [16], a ROOT [17]-integrated machine learning platform.
The variables used as input to the MVA classifiers are obtained via the binned clustering algorithm (see section 2.1). The algorithm calculates a set of metric values for each voxel, with each value corresponding to a pair of muon scattering vertices. By default, the algorithm uses only the median of the distribution of $\log(m_{ij})$ values as the discriminator for each voxel. We build on this by first binning the $\log(m_{ij})$ values into 28 bins (see Figure 6), calculating the normalised bin counts, and passing the counts to the MVA classifiers as the input variables (Figure 7) for that voxel. This approach allows more of the shape of the distribution of metric values to be captured, enhancing the information available to the classifiers.

TMVA allows multiple MVA methods to be trained simultaneously and their efficacy compared. The performance of a binary MVA classifier can be quantified through a Receiver Operating Characteristic (ROC) curve: a plot of the true positive rate (also called sensitivity) against the false positive rate for different cuts on the classifier response for the test sample. The Area Under the Curve (AUC) is a standard measure of the classifier's discriminating power: AUC = 1 corresponds to perfect classification and AUC = 0.5 to random guessing.

Figure 6. Comparison of distributions of log(metric) values for a voxel corresponding to uranium (left) and concrete (right). The median of each distribution is used as the discriminator in binned clustering algorithm images such as Figure 3. The normalised bin counts are used as the MVA input variables.

Figure 7. Example distributions of some of the input variables used to train the MVA classifiers, here specifically a binary uranium-lead classifier. The variables are the normalised bin counts (see Figure 6) of the $\log(m_{ij})$ values calculated by the binned clustering MST algorithm. The signal set (blue) consists of voxels in a 20 cm cube of uranium, and the background set (red) of voxels in an equivalent cube of lead.
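The per-voxel discriminator and the binned input variables described above can be illustrated with a short sketch. It is not the authors' implementation: the function name `voxel_features`, the use of the natural logarithm and the histogram range `log_range` are assumptions made for illustration, since the text specifies only the number of bins (28), the normalisation of the counts and the metric of Eq. (2.3).

```python
import itertools
import math
import statistics


def voxel_features(vertices, n_bins=28, log_range=(0.0, 14.0)):
    """Return (median log-metric, normalised bin counts) for one voxel.

    vertices: list of (position, theta, p_norm) for the n kept muons, where
    position is an (x, y, z) tuple in cm, theta the scattering angle and
    p_norm the momentum divided by 3 GeV/c.
    log_range is a hypothetical histogram range; values outside it are
    clamped into the first or last bin.
    """
    if len(vertices) < 2:
        raise ValueError("at least two scattering vertices are needed")

    # Metric of Eq. (2.3) for every pair of vertices (assumes distinct positions).
    logs = []
    for (v_i, th_i, p_i), (v_j, th_j, p_j) in itertools.combinations(vertices, 2):
        m_ij = math.dist(v_i, v_j) / ((th_i * p_i) * (th_j * p_j))
        logs.append(math.log(m_ij))  # natural log is an assumption

    discriminator = statistics.median(logs)

    # Histogram the log-metric values and normalise to unit sum.
    lo, hi = log_range
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in logs:
        index = min(max(int((value - lo) / width), 0), n_bins - 1)
        counts[index] += 1
    features = [c / len(logs) for c in counts]
    return discriminator, features


example = [((0, 0, 0), 0.1, 1.0), ((1, 0, 0), 0.1, 1.0), ((0, 2, 0), 0.2, 0.5)]
discriminator, features = voxel_features(example)
```

The first return value corresponds to the default binned clustering discriminator, and the second is the kind of 28-component vector that would be passed to a classifier for that voxel.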
Applying the classifier to the training datasets and comparing the resulting AUC for a range of MVA methods (Figure 8) shows that the Gradient-Boosted Decision Tree (BDTG) method is the most suitable, with AUC = … 0.808 and 0.804 respectively. For this reason the BDTG method is used hereafter as the default MVA method.

Figure 8. ROC curves showing discriminating power for several TMVA methods when applied to the described binned clustering algorithm variables, for distinguishing voxels in 20 cm cubes of uranium and lead.

The MVAs were trained on simulated MST muon track data corresponding to a 10 day exposure of four different waste drums: an 'empty' drum containing only concrete, and three drums containing 20 cm side length cubes (see Figure 9) of iron, lead and uranium, in the centre of the drum and aligned with its central axis. Only the voxels in the cube (or the equivalent volume for the homogeneous empty drum) were passed to the classifier. The binned clustering algorithm's $n$ parameter (see section 2.1) was set to 20. The dataset is split into equally sized 'training' and 'testing' sets; the MVA is trained on the former then applied to the latter as an overtraining check.

Figure 9. Example simulated geometry used for MVA training: 20 cm side length uranium cube, in the centre of the waste drum.

For a binary classifier, one dataset of voxel variables is designated as 'signal' and the other 'background', whereas a non-binary classifier is passed a single signal dataset and several background datasets. In each case, the classifier attempts to distinguish signal voxels from background voxels, such that when applied to a new voxel it will be classified correctly as often as is possible from the provided variables and the classifier's discriminating power. The non-binary classifiers are trained to distinguish the signal set from all the provided backgrounds (i.e. one-vs-all classification). TMVA calculates an optimum cut value on the classifier response, with a response above the cut being considered 'signal-like' and below 'background-like'. The optimum cut corresponds to the point at which the signal efficiency is equal to the background rejection.
On the ROC curve, this corresponds to the point with the maximum Youden index [18], defined as signal efficiency + background rejection − 1; this is the point on the curve furthest from the 45° line connecting the curve's ends.

To check for overtraining, TMVA's standard check was used: the training signal and background datasets of voxels are both randomly split into two equal groups, with one being used to train the classifier and the other reserved for testing. The trained classifier is then applied to the test set. The classifier output distributions for the training and test sets are then directly compared (see Figure 10), with a close match between the distributions indicating a low degree of overtraining. A Kolmogorov-Smirnov test is also performed to quantify the similarity of the distributions. In our case, the distributions of the test and training MVA outputs are a close match visually. The Kolmogorov-Smirnov test value is low however, indicating that some degree of overtraining has taken place.

Figure 10. TMVA overtraining check plot for the uranium-lead binary MVA classifier. The MVA output distributions for the signal and background training sets are overlaid with the output distributions for the test sets for comparison and a Kolmogorov-Smirnov test is performed.

3.3 Momentum information

To determine the importance of momentum information for material classification, two alternative approaches to the muon momentum were investigated in addition to the 50% Gaussian smeared truth momentum described in section 2.1. These were using the Monte Carlo truth momentum itself, with no smearing, and fixing the measured muon momentum at a constant value of 3 GeV/c, i.e. removing momentum information entirely. A comparison of binned clustering algorithm output images of a drum containing 15 cm cubes of uranium, lead and iron for the different approaches is shown in Figure 11. Using the Monte Carlo truth momentum results in a slightly sharper image with less variation in the concrete background, whereas using fixed momentum significantly reduces the quality of the image, with the iron cube in particular difficult to distinguish from the concrete background.
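The three momentum treatments compared here can be written as a small helper, shown as a sketch under stated assumptions. The function name `measured_momentum` and the mode labels are hypothetical. The text does not say how non-positive smeared momenta are handled; redrawing until the value is positive is an assumption of this sketch and slightly truncates the Gaussian.

```python
import random

P_NORM = 3.0  # GeV/c, the normalisation constant used in Eq. (2.3)


def measured_momentum(p_true, mode, rng, rel_width=0.5):
    """Return the momentum (GeV/c) passed to the imaging algorithm.

    mode "truth":   Monte Carlo truth momentum, no smearing.
    mode "smeared": truth plus a Gaussian term of width rel_width * p_true.
    mode "fixed":   constant 3 GeV/c, i.e. no momentum information.
    """
    if mode == "truth":
        return p_true
    if mode == "fixed":
        return P_NORM
    if mode == "smeared":
        # Assumption: redraw non-positive values so that log(m_ij) stays defined.
        while True:
            p = rng.gauss(p_true, rel_width * p_true)
            if p > 0.0:
                return p
    raise ValueError(f"unknown mode: {mode}")


rng = random.Random(1)
p_smeared = measured_momentum(4.0, "smeared", rng)
p_tilde = p_smeared / P_NORM  # normalised momentum entering the metric
```

With the fixed mode every muon has a normalised momentum of exactly 1, so the metric depends only on vertex separation and scattering angle.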

Figure 11. $xy$ slices from binned clustering algorithm output images (with the algorithm's discriminator value subtracted from 12) of a waste drum containing 15 cm side length cubes of uranium, lead and iron, with three different muon momentum approaches: using the Monte Carlo truth momentum (left), applying a 50% Gaussian smear to the truth momentum (centre), and removing momentum information entirely by fixing it at a constant value (right). Exposure time = 10 days, $n$ = ….

To quantify the effect on material discrimination, binary uranium-lead MVA classifiers trained as described in section 3.2, but with samples obtained using the three different momentum approaches, were used to create ROC curves for each scenario (Figure 12). Comparing the AUC for each case shows that smearing the momentum slightly reduces the discriminating power of the classifier, with AUC = 0.852 for the truth momentum and AUC = 0.811 for the 50% smeared momentum. The fixed momentum classifier has significantly worse performance with AUC = ….

Figure 12. Comparison of ROC curves and their AUCs for the three momentum approaches (Monte Carlo truth momentum, 50% Gaussian smeared truth momentum, and fixed momentum). The MVA classifier was trained to discriminate uranium and lead voxels using samples taken from drums containing 20 cm cubes, with exposure time 10 days. Smearing the momentum reduces the discriminating power by a small degree, whereas removing momentum information greatly reduces it.

It is necessary to attempt to remove the voxels corresponding to the concrete background and steel shell from the binned clustering algorithm output image. The remaining voxels, corresponding to stored objects, can then be sorted into distinct clusters using the algorithm described in section 4.2. The non-binary concrete classifier's training outputs and ROC curves are shown in Figure 13.

As the dimensions of the drum are known, the steel outer shell voxels can be removed trivially through a cylindrical spatial cut on the image. Subsequently an MVA classifier trained as described in section 3, designating the dataset of concrete voxels as 'signal' and the other materials as 'backgrounds', is applied to the remaining voxels to filter out the concrete voxels. As the classifier is not perfect, some voxels that correspond to concrete in the original simulated geometry remain in the filtered image. The problem is partially mitigated by applying a simple filtering algorithm to remove 'isolated' voxels from the image. Each remaining voxel has its 6 nearest neighbour voxels checked; if they are all empty, the voxel is removed from the image. Figure 14 illustrates the result of applying this process to a simulated geometry of three 15 cm cubes. The removed voxels are coloured white in the images; the remaining voxels are black.
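The isolated-voxel filter can be sketched in a few lines. The representation of the image as a set of integer voxel indices and the function name `remove_isolated` are assumptions for illustration; the sketch also assumes a single pass in which neighbours are checked against the unfiltered image.

```python
# Offsets to the six face-sharing neighbours of a voxel.
FACE_NEIGHBOURS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def remove_isolated(voxels):
    """Return the voxels that have at least one occupied face neighbour.

    voxels: set of (ix, iy, iz) integer indices of voxels that passed the
    concrete classifier.
    """
    kept = set()
    for x, y, z in voxels:
        if any((x + dx, y + dy, z + dz) in voxels for dx, dy, dz in FACE_NEIGHBOURS):
            kept.add((x, y, z))
    return kept


image = {(0, 0, 0), (1, 0, 0), (5, 5, 5), (7, 7, 7), (8, 8, 7)}
filtered = remove_isolated(image)
```

In this example the voxel at (5, 5, 5) is removed, and so are (7, 7, 7) and (8, 8, 7), which touch only diagonally and therefore count as isolated under a 6-neighbour rule.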
To test the performance of the nearest neighbour filtering method, the false positive and false negative rates were calculated for this example. Defining a false positive as a voxel that does not correspond to concrete being filtered out, and a false negative as a voxel that does correspond to concrete passing the filter, the false positive rate was … and the false negative rate was …. The false negative rate however indicates that a large number of concrete voxels remain in the final image; this corresponds to the smearing in the $z$ direction of objects in the drum visible in Figure 3.

Figure 13. MVA training outputs (top) and ROC curves (bottom) for the concrete vs iron/lead/uranium non-binary classifier. The optimum cut (blue) corresponds to the point at which signal efficiency is equal to background rejection.

Figure 14. Illustrative example of the MVA-filtering algorithm applied to a simulated geometry of a drum containing 15 cm cubes of uranium, lead and iron. Voxels passing the MVA filtering process described above are coloured black.

Subsequently these identified and separated 'object' voxels need to be grouped into individual clusters, each corresponding to a body stored in the drum. This will allow material information to be calculated by applying MVA classifiers to each identified body. The clustering is achieved through the widely used $k$-means clustering algorithm, which in its simplest form operates as follows:

• Choose a value for the number of clusters, $k$.
• Pick $k$ randomly selected data points to be the initial cluster centroids.
• For each data point, calculate the Euclidean distance (in geometric space) to each of the centroids and assign the point to the cluster with the closest centroid.
• Calculate new centroids as the new centres of the clusters.
• Repeat until the centroid locations converge.

Though this algorithm is fast and easy to implement, it requires the number of clusters $k$ to be known in advance and used as an input.
One solution is to run the algorithm multiple times with a range of $k$ values as input, and calculate some figure of merit of the clustering output for each. A commonly used figure of merit for clustering algorithms is the Dunn index [19], defined as the ratio between the minimum inter-cluster distance and the maximum intra-cluster distance. A high Dunn index therefore indicates well-separated and compact clusters. The inter- and intra-cluster distances can be defined to suit the problem; in our case the inter-cluster distance metric is the distance between the closest two data points in the two clusters, and the intra-cluster distance metric is the distance between the two furthest-apart points in a cluster. Defined in this way, the $k$ value that corresponds to the maximum Dunn index will represent the most natural choice for $k$. In most cases this will correspond to the number of bodies stored in the waste drum. In some cases the algorithm can under-estimate $k$, e.g. if two objects are in contact or very close together.

In practice, the simple $k$-means algorithm often produces poor clustering solutions if the randomly chosen initial centroids are too close together. This problem is avoided by choosing only the first centroid from a uniform distribution and the subsequent $k-1$ centroids from a distribution weighted towards points far from the centroids already chosen, an approach known as '$k$-means++' [20]. Figure 15 shows the result of applying the $k$-means++ algorithm to a drum containing 15 cm cubes of iron, lead and uranium.

This algorithm occasionally fails when applied to MVA-filtered binned clustering images such as Figure 14, as the 'noise' voxels that do not correspond to a stored object can be treated as a new superfluous cluster. These 'fake' clusters are much more sparse than clusters corresponding to stored objects. This allows the problem to be mitigated by defining a cluster density and removing clusters with densities below some cut. We define cluster density as the ratio of the number of voxels in the cluster to the cube of the mean inter-voxel distance.
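The Dunn index and the cluster density as defined above can be sketched as follows. The function names are hypothetical, and taking the 'mean inter-voxel distance' as the mean over all voxel pairs in the cluster is an assumption of this sketch. Both functions loop over all pairs of points, which is simple but scales quadratically with the number of voxels.

```python
import itertools
import math


def dunn_index(clusters):
    """Minimum inter-cluster distance divided by maximum intra-cluster distance.

    clusters: list of lists of (x, y, z) points. Requires at least two
    clusters and at least one cluster with two or more points.
    Inter-cluster distance: closest pair of points from two different clusters.
    Intra-cluster distance: furthest-apart pair of points within one cluster.
    """
    min_inter = min(
        math.dist(p, q)
        for a, b in itertools.combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_intra = max(
        math.dist(p, q)
        for cluster in clusters
        for p, q in itertools.combinations(cluster, 2)
    )
    return min_inter / max_intra


def cluster_density(cluster):
    """Number of voxels divided by the cube of the mean pairwise distance."""
    pairs = list(itertools.combinations(cluster, 2))
    mean_distance = sum(math.dist(p, q) for p, q in pairs) / len(pairs)
    return len(cluster) / mean_distance ** 3


cluster_a = [(0, 0, 0), (1, 0, 0)]
cluster_b = [(5, 0, 0), (6, 0, 0)]
index = dunn_index([cluster_a, cluster_b])
density = cluster_density(cluster_a)
```

To choose $k$, the clustering would be repeated for each candidate value and the solution with the largest `dunn_index` retained; clusters whose `cluster_density` falls below the chosen cut would then be discarded.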
A density cut of 5 × 10$^{…}$ voxel cm$^{-3}$ is effective at removing the sparse clusters.

A small percentage of voxels that correspond to concrete in the drum will be incorrectly passed by the classifier and included in the filtered image. These will be incorporated into one of the clusters, which could cause an incorrect material decision. These voxels will be outliers in the cluster as the majority of the cluster voxels will be close to the cluster centroid; thus they can be filtered out by placing a cut on the distribution of voxel-centroid distances for each cluster. Choosing the cut so as to remove voxels for which the voxel-centroid distance is greater than one standard deviation from the mean of this distribution is effective at removing outlier voxels.

Finally, a filter is applied to remove approximately the outermost voxel layer (see Figure 16) from the surface of each cluster. This is necessary as in general there will be a degree of smearing between a stored body and the concrete background, due to scattering vertices from muons passing close to the object contributing to the algorithm's metric values (see section 2.1) and hence affecting the variables that are passed to the MVA classifiers. The filtering is achieved by calculating the mean of the centroid-voxel distances for each cluster, and removing voxels for which the distance is greater than the mean.

Figure 15. $xy$ (left) and $xz$ (right) slices of the clustering solution for a simulated waste drum containing three 15 cm cubes of different materials. The voxels separated by the method described in section 4.1 have been grouped into three clusters using the $k$-means++ clustering algorithm.

Figure 16. $xy$ (left) and $xz$ (right) slices of the clustering solution of Figure 15 after filtering the outermost voxels from each object. Here black indicates voxels removed from the cluster.

Further MVA classifiers are now applied to the voxels in each identified cluster to obtain material information for the bodies stored in the drum.
Two additional MVA classifiers are trained: a non-binary classifier that separates iron signal from lead and uranium backgrounds (see Figure 17), and a final binary classifier to discriminate lead and uranium (Figure 18). The training ROC AUCs for these classifiers show that the lead and uranium cases are easily distinguished from iron (as the AUC values are close to 1), whereas the lead/uranium classifier does not perform as well, due to the similarity of the materials' $Z$ values.

Figure 17. MVA training output and ROC curves for the iron/lead/uranium non-binary classifier. The optimum cut corresponds to the point at which signal efficiency is equal to background rejection.

Each MVA classifier will produce a single response value for each voxel it is applied to. If the value falls above the cut (see section 3.2), the voxel will be considered signal-like, and if it falls below, background-like. Each identified object is a set of voxels; we apply the classifiers to each voxel to obtain the object's distributions of response values, then calculate the proportions of response values that fall above the cuts (i.e. the proportion of the object's voxels that are signal-like) to arrive at a single value from each classifier for each object. Figure 19 shows the MVA classifier response distributions for the three identified objects in the 15 cm cube example simulated geometry.

Figure 18. MVA training output and ROC curves for the lead/uranium binary classifier. The optimum cut corresponds to the point at which signal efficiency is equal to background rejection.
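As a minimal sketch of this step, the per-object value from one classifier is simply the fraction of the object's voxel responses that lie above that classifier's cut. The function name `signal_fraction` and the example numbers are hypothetical; treating a response exactly equal to the cut as background-like is an assumption.

```python
def signal_fraction(responses, cut):
    """Fraction of an object's voxel responses strictly above the cut.

    responses: classifier response values, one per voxel in the object.
    cut: the classifier's optimum cut value.
    """
    if not responses:
        raise ValueError("an object must contain at least one voxel")
    return sum(1 for r in responses if r > cut) / len(responses)


# Hypothetical responses for a four-voxel object and a cut at 0.5.
fraction = signal_fraction([0.2, 0.6, 0.9, -0.1], cut=0.5)
```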

Applying the integral method described above to these distributions results in uranium, lead and iron 'material scores' for each object stored in the drum. The uranium and lead material scores are subsequently multiplied by 1 − iron score, i.e. the 'not-iron' score. These scores are very effective at distinguishing objects of different materials once the sizes of the objects are taken into account. The material scores are intuitively viewed as a pie chart (see Figure 20). For the simulated drum containing three 15 cm side length cubes of uranium, lead and iron, each object has the MVA-calculated material score that corresponds to the true material as the largest score. The scores for the uranium and lead blocks are also clearly distinguished from each other. However, this simulation is an idealised case due to the large size of the objects and their similarity to the 20 cm cube training geometries.

Figure 19. Distributions of responses of MVA classifiers applied to found clusters from a simulated waste drum containing 15 cm cubes of uranium (object 2), lead (object 1) and iron (object 3). The optimum cuts for the classifiers correspond to the points at which the signal efficiency is equal to background rejection.

Figure 20. Material scores for simple geometry of three 15 cm cubes, uranium, lead and iron, aligned with voxel grid.

Applying the MVA classifiers to a similar but more challenging geometry of three 10 cm side length cubes (see Figure 21), two effects become apparent. Firstly, the classifiers do not perform as well, i.e. the score corresponding to the true material is not necessarily the largest. For example, the 'uranium' score has reduced from … ± 0.025 for the 15 cm cube case to … ± …. Secondly, the scores depend on the size of the high-$Z$ object: a larger object will lead to larger detected muon scattering angles, and hence a smaller binned clustering metric value (see section 2.1). Hence a large lead object can appear more 'uranium-like' than a smaller lead object. The implication is that the size of stored objects must be taken into account to reliably determine their material composition.

Figure 21. Material estimate results for simple geometry of three 10 cm cubes, uranium, lead and iron, aligned with voxel grid.

To quantify the relations between the object size and the material scores, we applied our system to a series of simulated drums containing spheres of different materials and increasing radii. The results are shown in Figure 22. It is apparent that whilst there is no simple relation between the material scores and the object volume, objects of different material are clearly distinguished for a wide range of volumes.

However, these plots can be used empirically to arrive at a single material decision for each identified stored body in the drum. As the volumes of the clusters (equivalent to the number of constituent voxels) are known, the plots in Figure 22 give the 'expected' material scores for a cluster of that size if the object was composed of one of the three materials. Finally, a material decision is arrived at by comparing the object's actual material scores with each set of expected values. The material with the best match, i.e. the smallest 3D Euclidean distance between the actual and expected material scores, is selected as the final material decision.

Figure 22. Relationship between the MVA-calculated material scores and the size of the stored object. Each simulated geometry contains a single sphere of increasing radius, composed of uranium (top), lead (middle) or iron (bottom).

This approach was tested on more complex simulated geometries.
Figure 23 shows results for a drum similar to the three-cube example of Figure 20, but with objects of irregular size, location and rotation. In this case the system has accurately identified the correct material for each object. Despite the uranium block’s low uranium score compared to the equivalent 15 cm cube (Figure 20), the calibration by volume has correctly identified it as uranium.
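The volume-calibrated decision described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used for the paper: the function names, the linear interpolation between calibration points, and every calibration value are hypothetical placeholders (they are not read off Figure 22), and the calibration curves are assumed to hold scores that have already had the ‘not-iron’ weighting applied.

```python
import math

def apply_not_iron(uranium, lead, iron):
    """Weight the uranium and lead scores by the 'not-iron' score, 1 - iron."""
    not_iron = 1.0 - iron
    return (uranium * not_iron, lead * not_iron, iron)

def expected_scores(curve, volume):
    """Interpolate a calibration curve at a cluster volume (in voxels).

    `curve` is a list of (volume, (uranium, lead, iron)) points sorted by
    volume. Linear interpolation and clamping at the ends are assumptions.
    """
    if volume <= curve[0][0]:
        return curve[0][1]
    if volume >= curve[-1][0]:
        return curve[-1][1]
    for (v0, s0), (v1, s1) in zip(curve, curve[1:]):
        if v0 <= volume <= v1:
            t = (volume - v0) / (v1 - v0)
            return tuple(a + t * (b - a) for a, b in zip(s0, s1))

def decide_material(scores, volume, calibration):
    """Pick the material whose expected scores are closest in 3D score space."""
    return min(
        calibration,
        key=lambda m: math.dist(scores, expected_scores(calibration[m], volume)),
    )

# Hypothetical calibration points, NOT taken from the paper's Figure 22.
CALIBRATION = {
    "uranium": [(50, (0.30, 0.25, 0.20)), (400, (0.70, 0.15, 0.05))],
    "lead":    [(50, (0.15, 0.35, 0.25)), (400, (0.35, 0.50, 0.05))],
    "iron":    [(50, (0.05, 0.10, 0.70)), (400, (0.05, 0.05, 0.90))],
}

# A small cluster with a modest raw uranium score (illustrative values only).
scores = apply_not_iron(uranium=0.40, lead=0.30, iron=0.20)
decision = decide_material(scores, volume=100, calibration=CALIBRATION)
print(decision)
```

With these placeholder curves, a 100-voxel cluster is compared against what each material is expected to score at that volume, so a modest uranium score can still be matched to uranium rather than being judged on its absolute value.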

Figure 23. Material estimate results for more complex geometry of three objects, uranium, lead and iron, not aligned with the voxel grid or centred. By calibrating the three material scores against the volume calibration curves (Figure 22), the correct material has been assigned in each case.

A further example with a larger number of objects is shown in Figure 24. This drum contains five objects (two uranium, two lead, and one iron) of a wider range of shapes, dispersed more evenly through the drum. However, the system still performs well. The identified clusters are a close match to the true locations of the stored objects. Both uranium objects are correctly assigned, as are the iron sphere and one of the lead objects. One lead object, a tube, has been incorrectly identified as iron. This indicates a limitation of the system when attempting to determine the materials of non-spherical objects.

Figure 24. Material estimate results for more complex geometry of five objects of various materials and shapes, dispersed throughout the drum. Note that the 2D cluster plots are viewed as side-on and bird’s eye views of the 3D map; this is necessary to view all the clusters as they do not all intersect the central 𝑥𝑦 and 𝑧𝑥 planes. Four of the objects have been assigned the correct material; one lead object has been incorrectly classified as iron.

To establish the system’s sensitivity and false positive rate, we then applied it to a set of randomly generated waste drum simulations. Each simulation contained three spheres of radius 6 cm, randomly dispersed throughout the drum but constrained to not intersect each other. 100 simulations were run in total. 50 simulations contained one uranium, one lead and one iron sphere, and the remaining 50 contained two lead spheres and one iron sphere. A true positive identification of a uranium object was defined to be an object identified close to the true location of a uranium sphere that was designated as uranium by the system. Conversely, a false positive comprised any assignment of a uranium decision to an object in a drum not containing uranium. With these criteria, we found a sensitivity of $0.90^{+0.07}_{-0.12}$ and a false positive rate of $0.12^{+0.12}_{-0.07}$ (95% Clopper-Pearson confidence intervals).

Conclusions
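The 95% Clopper-Pearson intervals quoted for the sensitivity and false positive rate can be computed from raw counts alone. The sketch below is an illustration using only the Python standard library. The raw counts are not stated in the text, so the values 45 of 50 and 6 of 50 are assumptions chosen to match the quoted central values with a denominator of 50 drums, and the function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, cl=0.95, iters=60):
    """Exact (Clopper-Pearson) interval for k successes in n trials."""
    alpha = 1.0 - cl

    def solve(f, target):
        # Bisection for f(p) = target, where f is decreasing in p on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

# Assumed counts (not given in the text): 45/50 true positives, 6/50 false positives.
sens_lo, sens_hi = clopper_pearson(45, 50)
fpr_lo, fpr_hi = clopper_pearson(6, 50)
print(f"sensitivity 0.90, interval [{sens_lo:.2f}, {sens_hi:.2f}]")
print(f"false positive rate 0.12, interval [{fpr_lo:.2f}, {fpr_hi:.2f}]")
```

The intervals are asymmetric because the binomial distribution is skewed when the rate is close to 0 or 1, which is why the quoted uncertainties differ above and below the central value.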

We have demonstrated that machine learning techniques are a powerful tool for enhancing the information about a waste drum’s contents that can be obtained in a muon scattering tomography experiment. MVA classifiers trained on variables obtained from the distribution of binned clustering algorithm metric values are effective at discriminating materials in waste drums. The concrete matrix can be distinguished from stored objects of mid- and high-𝑍 material, allowing the voxels corresponding to the matrix to be removed, and the remaining object voxels sorted into clusters. Additional material information can be obtained with further MVA classifiers, to discriminate first mid-𝑍 (e.g. iron) from high-𝑍 (lead, uranium) objects, and then between materials with similar 𝑍. The effectiveness of the material discrimination is highly dependent on object size. By establishing the empirical relation between object size and the MVA classifiers’ material output scores, a final material decision can be made for each identified stored body in the simulated waste drum. This has been shown to be accurate for a wide range of object sizes, shapes and drum locations.

When tested against a set of simulated drums containing 6 cm radius spheres of different materials in randomly determined positions, the system performed with a true positive rate of $0.90^{+0.07}_{-0.12}$ and a false positive rate of $0.12^{+0.12}_{-0.07}$, indicating this approach is effective at identifying uranium objects inside waste drums. The main identified vulnerabilities are objects with large differences in 𝑍 (e.g. iron and uranium) being very close to each other, and more spatially extended objects being misidentified, although the latter problem could be mitigated by extending the object size-based decision method (see Figure 22) to account for a wider range of object shapes.

Acknowledgments

This project has received funding from the Euratom research and training programme 2014-2018 under grant agreement No 755371.