Machine Learning for Imaging Cherenkov Detectors

Abstract

Imaging Cherenkov detectors are widely used in modern nuclear and particle physics experiments, where cutting-edge solutions are needed to meet ever-growing computing demands. This is fertile ground for AI-based approaches, and we are now witnessing the onset of new, highly efficient and fast applications. This paper focuses on novel directions with applications to Cherenkov detectors. In particular, recent advances in detector design and calibration, as well as in particle identification, are presented.


Prepared for submission to JINST

International Workshop on Fast Cherenkov Detectors - Photon detection, DIRC design and DAQ, 11-13 September 2019, Castle Rauischholzhausen, Justus-Liebig-University Giessen, Germany

Machine Learning for Imaging Cherenkov Detectors

C. Fanelli (a, b)
(a) Laboratory for Nuclear Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
(b) Jefferson Lab, EIC Center, Newport News, VA 23606, USA

E-mail: [email protected]

Keywords: Cherenkov detectors, reconstruction, calibration, machine learning

Cherenkov detectors are widely used in modern nuclear and particle physics experiments for charged particle identification (PID). Cherenkov radiation is emitted in a cone around the particle momentum direction when a charged particle moves through a dielectric medium at a speed larger than the phase velocity of light in that medium. A Cherenkov detector is typically equipped with single-photon detectors, and charged particles can be identified from the shapes of the detected hit patterns.

The large number of photon detectors gives rise to a pattern recognition task, which is one of the original core domains of Artificial Intelligence (AI). AI is becoming ubiquitous in our field, particularly in high energy physics [1]. Following a standard taxonomy [2], AI encompasses all the concepts related to the integration of human intelligence into machines. Machine learning (ML) is a subset of AI that empowers machines to learn; its algorithms are typically grouped into supervised, unsupervised and reinforcement learning. Deep learning (DL) is a subset of ML, often considered its evolution, based on deep neural networks (i.e., networks made of many hidden layers). In the most frequent applications, features are selected and a model is trained for classification or regression using signal and background examples.

Artificial intelligence can have an impact on a wide range of activities in our field, e.g., DAQ, monitoring, calibration, data reconstruction and analysis, and detector design.

This proceeding focuses on existing approaches and novel directions with specific applications to imaging Cherenkov detectors. The outline of this paper is as follows: existing reconstruction methods, covering different areas of research, are discussed in Sec. 2; novel directions (detector calibration, design optimization, and PID) are presented in Sec. 3; summary and conclusions are reported in Sec. 4.
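The emission condition and cone geometry described in the introduction follow from the standard relation cos θ_c = 1/(nβ). The minimal Python sketch below evaluates it for pions and kaons. The refractive index n = 1.47 (roughly that of fused silica) and the 4 GeV/c momentum are illustrative assumptions, not values taken from this paper.

```python
import math

# Particle masses in GeV/c^2 (rounded).
MASS = {"pi": 0.13957, "K": 0.49368}

def beta(p, m):
    """Velocity beta = p / E for momentum p and mass m (natural units)."""
    return p / math.sqrt(p * p + m * m)

def cherenkov_angle(p, m, n):
    """Cherenkov angle in radians, or None below threshold (n * beta <= 1)."""
    nb = n * beta(p, m)
    if nb <= 1.0:
        return None
    return math.acos(1.0 / nb)

def threshold_momentum(m, n):
    """Minimum momentum for Cherenkov emission: p = m / sqrt(n^2 - 1)."""
    return m / math.sqrt(n * n - 1.0)

if __name__ == "__main__":
    n = 1.47   # assumed refractive index (illustrative)
    p = 4.0    # assumed momentum in GeV/c (illustrative)
    for name, m in MASS.items():
        theta = cherenkov_angle(p, m, n)
        print(f"{name}: threshold {threshold_momentum(m, n):.3f} GeV/c, "
              f"angle {1e3 * theta:.2f} mrad")
```

At a fixed momentum the heavier kaon is slower and therefore radiates at a smaller angle than the pion; this angular difference is what the hit pattern encodes and what PID algorithms exploit.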

State of the Art

The first applications of Artificial Neural Networks (ANN) [3] to pattern recognition in ring-imaging Cherenkov (RICH) detectors date back to the nineties [4], when a two-layer neural network with forward connections was proposed to analyse a set of images produced by an optical RICH detector at CERN.

More recent applications of ANN to pattern recognition have been realized for the Compressed Baryonic Matter (CBM) experiment [5], which aims to measure dileptons emitted from the hot and dense phase in heavy-ion collisions. Electron identification is performed by a RICH and transition radiation detectors. Data processing plays an important role, and in this case fast and efficient algorithms based on the fast Hough transform are used for the ring search. Finally, an ANN is used for ring classification. In particular, a detailed study of the fake-ring elimination procedure is performed, taking into account the optical distortions, which lead to ellipse fitting methods and radius corrections.

Another interesting application comes from the LHCb experiment [6], where ML-based global PID algorithms are developed to improve particle identification by combining the information from several sub-detectors [7]. Two RICH sub-detectors provide charged hadron (π, K, p) identification over a wide momentum range, from 2 to 100 GeV/c. Muons (µ) are identified mainly by dedicated muon chambers, while electron (e) and photon (γ) identification is provided by the calorimeters. Global PID can be obtained by simply computing log-likelihoods (LL) separately for each sub-detector and then combining them. It has been shown, however, that more refined methods such as ANN outperform the LL-based approach. Furthermore, dedicated methods (see, e.g., [8]) have been developed to improve the flatness of the reconstruction efficiencies as a function of the particle phase space, in order to reduce systematic uncertainties.

The Very Energetic Radiation Imaging Telescope Array System (VERITAS) [9] has been one of the first experiments to make use of DL for the detection of Cherenkov rings. The Muon Hunter [10] is an interesting project hosted on the Zooniverse platform, where volunteers inspect images from the VERITAS cameras to identify muon ring images. The volunteers' work provides a reliable set of training data, so that their human experience can then be mimicked by ML.

VERITAS consists of four 12 m diameter imaging atmospheric Cherenkov telescopes sited at the Fred Lawrence Whipple Observatory in southern Arizona. The identification of muon ring images is based on a Convolutional Neural Network (CNN) [11], which is capable of rejecting background events and identifying suitable calibration data to monitor the telescope performance as a function of time. The supervision of the volunteers provided a more efficient machine learning model and helped to identify unexpected images.

In neutrino experiments, water Cherenkov detectors are commonly used to distinguish between e and µ leptons, which determine the flavour of the neutrino interacting in the medium. An example of this kind of detector is Super-Kamiokande [12], where the use of ML, again in the form of a CNN and based on TensorFlow [13], has been explored [14]. Note that the standard fitting algorithm used in the analysis is specifically tuned for the problem of e/µ separation and is based on complex modelling of Cherenkov light propagating and scattering in the detector. The DL approach, on the other hand, has the advantage that it requires no prior knowledge of the Cherenkov physics: all these features are learned directly by the algorithm, which turns out to have performance comparable to the standard PID.

The application of ML techniques to reconstruct lepton energies in water Cherenkov detectors is proposed for TITUS [15], an intermediate detector for the Hyper-Kamiokande experiment. It has been found in [16] that this leads to more than 50% improvement in the energy resolution for all lepton energies compared to standard approaches based upon lookup tables, with performance comparable to likelihood-function based techniques currently in use.

This section describes novel AI-based methods applied to imaging Cherenkov detectors in nuclear and particle physics: methods for detector design and calibration based on Bayesian optimization are described in Sec. 3.1; PID applications based on DL are described in Sec. 3.2.

Bayesian Optimization (BO) [17, 18] is among the most efficient tools for tuning the parameters of a black-box function f(x), searching for the global optimum x* over a bounded domain χ of f. In particular, f can be noisy, non-differentiable and expensive to evaluate. Typically, Gaussian processes [19] are used to build a surrogate model of f, but other regression methods such as decision trees can also be used. Once the probabilistic model is determined, a cheap utility function (also called acquisition function) is used to guide the choice of the next point to evaluate.

In the following, two applications are described: in Sec. 3.1.1 BO is applied to detector calibration and in Sec. 3.1.2 to detector design.

This section shows how Bayesian optimization can be applied to detector calibration. As an example, we consider the DIRC (Detection of Internally Reflected Cherenkov light) detector recently installed in GlueX [20], a particle physics experiment located in Hall D at Jefferson Lab (JLab) [21].

The DIRC will improve the GlueX PID capabilities in the forward region (see Fig. 1 (left)), and it is essential for the physics program of GlueX, whose primary goal is to search for and ultimately study the properties of hybrid mesons [22]. It consists of two photon cameras and four bar boxes, the latter oriented to form a plane 4 m away from the fixed target of the experiment. Each bar box contains 12 fused silica radiators (1.725 × ... × 490 cm) with a small wedge attached to the end that is read out. The DIRC photon camera contains multiple flat mirrors to direct the light to the photodetector plane. Each photon camera is attached to two bar boxes and is equipped with an array of ∼100 Hamamatsu H12700 MaPMTs [23]. A DIRC employs rectangular fused-silica bars both as Cherenkov radiators and as light guides. Typically the detected hit pattern in the PMT plane is sparse, making the reconstruction rather challenging. The readout electronics boards are the same as for the CLAS12 RICH [24] in Hall B at JLab.

During data taking, when the optical box is filled with distilled water, several components could be misaligned. Consequently, a set of (≳10) main alignment parameters should be monitored (see Fig. 1 (right)), e.g., the angles and spatial displacement of the three-segment mirror, the position of the bars relative to the outside of the bar box, the relative distance of the mirror to the PMT plane, etc. For all these reasons, the DIRC calibration can be considered a black-box problem, with potentially many non-differentiable terms, in a noisy environment.
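The surrogate-plus-acquisition loop of Bayesian optimization can be sketched in a few dozen lines. The example below is a toy, one-dimensional illustration in pure Python: it assumes a zero-mean Gaussian process with a squared-exponential kernel, the expected improvement (EI) acquisition function, and a minimization convention. The kernel length scale, the noise level, the grid search over candidates and the toy objective are all hypothetical choices; this is not the framework used in [26].

```python
import math
import random

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_gp(xs, ys, noise=1e-4):
    """Return (K^-1, K^-1 y) for a zero-mean GP with noisy observations."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # K is symmetric, so the columns of its inverse are also its rows.
    K_inv = [solve(K, [1.0 if i == j else 0.0 for i in range(n)])
             for j in range(n)]
    alpha = [sum(K_inv[i][j] * ys[j] for j in range(n)) for i in range(n)]
    return K_inv, alpha

def predict(xs, K_inv, alpha, x_new):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    k = [rbf(x, x_new) for x in xs]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = [sum(row[j] * k[j] for j in range(len(k))) for row in K_inv]
    var = 1.0 - sum(ki * vi for ki, vi in zip(k, v))
    return mean, math.sqrt(max(var, 1e-12))

def expected_improvement(mean, std, best):
    """EI for minimization: E[max(best - f, 0)] under N(mean, std^2)."""
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf

def bayes_opt(f, lo, hi, n_init=3, n_iter=12, seed=0):
    """Minimize f on [lo, hi]; returns (best_x, best_y)."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n_init)]
    ys = [f(x) for x in xs]
    grid = [lo + (hi - lo) * i / 200 for i in range(201)]
    for _ in range(n_iter):
        mu = sum(ys) / len(ys)  # centre targets for the zero-mean prior
        K_inv, alpha = fit_gp(xs, [y - mu for y in ys])
        best = min(ys) - mu
        # Evaluate the cheap acquisition on a grid and pick its maximum.
        x_next = max(grid, key=lambda x: expected_improvement(
            *predict(xs, K_inv, alpha, x), best))
        xs.append(x_next)
        ys.append(f(x_next))  # the only call to the expensive function
    i = min(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]

if __name__ == "__main__":
    # Toy stand-in for an expensive figure of merit with a single offset.
    toy = lambda x: (x - 0.7) ** 2 + 0.1 * math.sin(15 * x)
    print(bayes_opt(toy, 0.0, 1.0))
```

The expensive function is called once per iteration, while the acquisition function is evaluated many times on the cheap surrogate. In a calibration setting the scalar x would be replaced by the vector of offsets and the toy function by the figure of merit.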

Figure 1. (left) A schematic view of the GlueX detector with the DIRC, which consists of four radiator boxes and two photon cameras; (right) schematic diagram of the cross-section of the photon camera. The attached bar boxes are shown on the bottom right. The bottom flat mirror is aligned with the radiators. Image taken from [25].

In the following, the vector of offsets is called θ⃗, and the results shown are taken from a preliminary closure test presented in [26] (where the reader can find more details about the estimated tolerances).

The calibration strategy proposed in [26] allows the offsets to be learned directly from observations of real data. The idea is to use a large, high-purity sample of pions obtained from channels like ρ decays, which are abundantly produced in photoproduction experiments like GlueX. At low momentum these charged particles are well identified by other subdetectors in GlueX (without the DIRC contribution) and can be used as a high-purity sample for the calibration procedure.

The objective function can be any figure of merit (FoM), e.g., in the form of a log-likelihood, that compares noisy observations (the sparse hit pattern detected in the PMT plane) with the expected hit pattern under certain hypotheses: the particle ID, the momentum, location and orientation of the charged particle traversing the bars, and the offsets. These preliminary studies are based on FastDIRC [27], a fast Monte Carlo and reconstruction algorithm for DIRC detectors which is more than 10 times faster than Geant-based simulation and allows the effect of the offsets to be parameterized. About O(...) charged particles are selected to form a high-purity sample by imposing certain requirements, such as fiducial cuts on the incident and azimuthal angles with respect to the normal to the plane containing the fused silica bars, and a minimum distance from the bar center.

The closure test consists essentially of injecting a known set of offsets and recovering them within the allowed tolerances. This has been demonstrated with 7 main parameters and has been tested for different values of the offsets. A framework based on BO has been used, which performs the calibration in much less time than a simple random search (RS). Time performance will be further optimized in future studies.

The closure test is completed by using the determined offsets to calibrate the data and checking whether the detector performance is consistent with what one would obtain with the true offsets. Good proxies to evaluate the quality of the reconstruction are the single photon resolution (SPR), or the effective resolution (SPR normalized to the square root of the photon yield), and the area under the curve (AUC) of a ROC curve [28]. These quantities can be evaluated considering πs and Ks at different kinematics.

Examples of ROC curves obtained in the closure test are shown in Fig. 2: (i) the left plot corresponds to the true values of the offsets; (ii) the middle plot is without calibration; (iii) the right plot shows the ROC after calibration. As shown in [26], the values of the SPR and the AUC obtained after the calibration are consistent with the expected ones, which supports the quality of the calibration procedure.

Figure 2. ROC curves (pion rejection vs kaon efficiency) for particles with kinematics (P, θ, φ): (4 GeV, 4 deg, 40 deg). From left to right: (i) using the 'true' offsets, (ii) without calibration (offsets set to default null values), (iii) after calibration based on the results of the BO. The AUC is used as proxy for the closure test. Further details can be found in [26].

This section shows an example of AI applied to detector design and is based on recent work [29], which proposes a highly parallelized and automated procedure, using Bayesian optimization and machine learning, that encodes detector requirements.

The design of the dual-radiator Ring Imaging Cherenkov (dRICH) [30–32] detector, under development as part of the particle-identification system at the future Electron-Ion Collider (EIC) [33], is considered as a case study. The baseline design consists of two radiators (aerogel and C2F6 gas) sharing the same outward-focusing spherical mirror and highly segmented (≈ ... pixel size) photosensors located outside of the charged-particle acceptance. Details of the dRICH detector design are shown in Fig. 3.

For the dRICH design optimization, eight parameters are considered to improve the PID performance: the refractive index and thickness of the aerogel radiator; the focusing mirror radius, its longitudinal position (which determines the effective thickness of the gas) and its radial position (corresponding to the axis going in the radial direction in each of the six mirror sectors, see Fig. 3); and the 3D shifts of the photon sensor tiles with respect to the mirror center on a spherical surface, which to some extent determine the sensor area and orientation relative to the mirror. These parameters cover rather exhaustively the two major components of the dRICH: its radiators and optics. The regions of parameter space explored are based on previous studies [34], under the constraint that it is possible to implement any values resulting from the optimization with (at worst) only minor hardware issues to solve.

Figure 3. Geant4-based simulation of the dRICH. The aerogel radiator is drawn in transparent wired red and the gas radiator volume in transparent wired green; the mirror sectors are in gray and the photo-detector surfaces (spherical shape) of about 8500 cm² per sector are in dark yellow. A pion of momentum 10 GeV/c is simulated. Image taken from [29].

Since the aim of the design optimization is to maximize the PID performance of the dRICH, i.e., to provide full hadron identification from a few GeV/c up to large momentum values, the separation power between pions and kaons has been chosen as the objective function. In order to simultaneously optimize the combined PID performance of both the aerogel and gas parts of the dRICH, two working points have been determined based on the performance of the baseline design. Figure 4 shows the posterior distribution in 2-dimensional subspaces of the design parameter space. These plots illustrate the possible correlations among the parameters. The optimal point in each subspace is marked with a red dot. Notice that the black points, corresponding to the points evaluated by the BO in its ask-and-tell procedure, tend to form basins of attraction around the minimum.
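To make the notion of π/K separation power concrete, the sketch below uses a definition that is common for RICH detectors: N_σ = |θ_π − θ_K| √N_γ / σ_θ, where σ_θ is the single-photon angular resolution and N_γ the number of detected photons. Whether [29] uses exactly this form is an assumption made here, and all numerical values (refractive index, resolution, photon yield) are hypothetical and chosen only for illustration.

```python
import math

M_PI, M_K = 0.13957, 0.49368  # masses in GeV/c^2 (rounded)

def cherenkov_angle(p, m, n):
    """Cherenkov angle (rad) from cos(theta) = 1 / (n * beta); None below threshold."""
    nb = n * p / math.sqrt(p * p + m * m)
    return math.acos(1.0 / nb) if nb > 1.0 else None

def n_sigma(p, n, sigma_1pe, n_photons):
    """pi/K separation in units of the ring angular resolution."""
    t_pi = cherenkov_angle(p, M_PI, n)
    t_k = cherenkov_angle(p, M_K, n)
    if t_pi is None or t_k is None:
        return None  # at least one species is below threshold
    return abs(t_pi - t_k) * math.sqrt(n_photons) / sigma_1pe

if __name__ == "__main__":
    # Hypothetical aerogel-like radiator: n = 1.02, 3 mrad per photon, 10 photons.
    for p in (5.0, 10.0, 15.0):
        print(p, round(n_sigma(p, n=1.02, sigma_1pe=3e-3, n_photons=10), 2))
```

With such a definition the separation falls with momentum, because the two Cherenkov angles converge as both particles approach β = 1. This is why a second radiator with a lower refractive index is needed to cover the high-momentum range.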

[Plot panels: partial dependence ("partial dep.") versus R [cm], pos r [cm], pos l [cm], tiles x [cm], tiles y [cm], tiles z [cm], n(aerogel) and t(aerogel).]

Figure 4. 2D plot of the objective function (color axis). The optimization strategy of the dRICH design involves tuning 8 parameters. In order to study possible correlations, each parameter is drawn against the others. The evaluations made by the optimizer are shown through an intensity gradient in the point trail ranging from white (first call of parallel observations) to black (last call). After about 55 calls, the stopping criteria are activated. The dotted red lines correspond to the projections on each variable of the optimal point found by the BO. More details can be found in [29].

Stopping criteria are developed to automate the procedure, which converges within a reduced number of iterations compared to RS. A detailed comparison between BO and RS is performed in [29], both in terms of PID and time performance.

The PID capabilities of the dRICH detector are substantially improved using the AI-based approach: the solid curves shown in Fig. 5 correspond to the new optimized design, whereas the dashed curves are related to the previous baseline design [34]. The developed procedure also allows the expected tolerances to be estimated from the posterior distribution, i.e., the ranges within which any variation of the parameters does not alter the determined detector performance.

Currently, there are many ongoing efforts to simulate and analyse EIC detector designs; the same approach can be employed for any such study and can be extended to a global PID optimization of multiple detectors. Interestingly, real-world costs of the components could also be included in the optimization method.

Figure 5. π/K separation as number of σ, as a function of the charged particle momentum [GeV/c], for the aerogel and gas radiators in the optimized and legacy designs. The plot shows the improvement in the separation power with the approach discussed in [29] compared to the legacy baseline design [34]. The curves are drawn with 68% C.L. bands.

This section shows recent applications of deep learning to the DIRC detector for fast simulation and reconstruction of charged particles. Similar approaches can in principle be applied to other imaging Cherenkov detectors.

A deep learning application to simulate the Cherenkov detector response appeared recently in [35], where it has been proposed to use a generative adversarial network (GAN) [36] to bypass low-level details at the photon generation stage. This work is based on events simulated with FastDIRC [27], assuming the design of the GlueX DIRC previously discussed in Sec. 3.1.1. The GAN architecture is trained to reproduce high-level features based on input observables of the incident charged particles, allowing for an improvement in simulation speed. The authors of this work report good precision and very fast performance in their studies.

A novel deep learning algorithm for fast reconstruction, DeepRICH, has been proposed in [37]; it can be applied to any imaging Cherenkov detector. The core of this architecture is a generative model which leverages a custom Variational Auto-encoder (VAE) [38] combined with Maximum Mean Discrepancy (MMD) [39], with a Convolutional Neural Network (CNN) [11] extracting features from the space of the latent variables for classification. A VAE is a particular type of generative model that tries to simulate how the data are generated, in order to understand the underlying causal relations.

To this end, an Encoder produces a vector of latent variables by taking as input certain kinematic parameters, namely the momentum P, the polar and azimuthal angles with respect to the normal to the radiator bars, θ and φ, and the position (X, Y) where the charged particle traverses the bar, concatenated with the hit pattern of the charged particle. The vectors of latent variables associated with the hits of a particle are used to classify the particle. The Decoder reconstructs the input hits using the latent variables and the kinematic parameters. Standard regularization techniques (e.g., dropout, batch normalization) are used in order to prevent overfitting and improve the performance and stability of the network (more details can be found in Table 1 of [37]). A flowchart of the DeepRICH network is shown in Fig. 6.

Figure 6. A flowchart of DeepRICH: the inputs are concatenated (n.b., the ⊕ represents the concatenation between vectors) and injected into the encoder, which generates a set of vectors of latent variables, which are then used both for the classification of the particle and for the reconstruction of the hits. Image taken from [37].

The model is trained by minimizing the total loss function, which consists of three terms: one for reconstruction, one for classification, and one calculated with MMD.

Training samples are prepared in the form of discrete hypercubes in the kinematic parameter space (P, θ, φ, X, Y) of the charged particle. The hyperparameters of DeepRICH have been optimized with a Bayesian optimizer. An example of hit patterns reconstructed by DeepRICH after the tuning of the hyperparameters is shown in Fig. 7.

Figure 7. Example of hit points (x [mm], y [mm], t [ns]) reconstructed by DeepRICH at 4 GeV/c, with an almost perfect overlap between the reconstructed and the injected hits of both pions (left) and kaons (right). Image taken from [37].
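One of the three loss terms mentioned for DeepRICH is computed with MMD. As an illustration of that ingredient only, the sketch below gives a biased estimate of the squared MMD between two sets of vectors using a Gaussian kernel. The kernel choice, the bandwidth, the pairing of encoded latent vectors with prior samples, and the toy data are assumptions made here; they are not details taken from [37].

```python
import math
import random

def gaussian_kernel(x, y, bandwidth=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd2(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))

if __name__ == "__main__":
    rng = random.Random(1)

    def sample(mu, n=200):
        """Draw n two-dimensional Gaussian points centred at (mu, mu)."""
        return [[rng.gauss(mu, 1.0), rng.gauss(mu, 1.0)] for _ in range(n)]

    latent = sample(0.0)    # stand-in for encoded latent vectors
    prior = sample(0.0)     # stand-in for samples drawn from the prior
    shifted = sample(2.0)   # a clearly different distribution
    print("same distribution:   ", round(mmd2(latent, prior), 4))
    print("shifted distribution:", round(mmd2(latent, shifted), 4))
```

The estimate is close to zero when the two samples come from the same distribution and grows as they differ, which is what makes it usable as a regularization term on the latent space.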

The capability of distinguishing πs from Ks, and thus of effectively doing PID, depends on the features and the causal relations learnt in the space of the latent variables. A 3D visualization in the space of the latent variables is shown in Fig. 8, where dimensionality reduction methods (viz., t-SNE [40]) are used to provide a representation of the 20-dimensional space of the latent variables. As expected, the larger the momentum, the lower the π/K separation.

Figure 8. Example of features extracted by the CNN module from πs and Ks at 4 GeV/c (left) and 5 GeV/c (right). These features are then used to classify the particle with an MLP. The plot shows a better separation between π and K at 4 GeV/c, which means that the network has good distinguishing power. As expected, the points become less separated at larger momentum. The 3D visualization is obtained with t-SNE [40]. Image taken from [37].

A thorough comparison with the simulation/reconstruction package FastDIRC is discussed in [37]. DeepRICH has the advantage of bypassing the low-level details needed to build a likelihood, and it allows a significant improvement in computation time with potentially the same reconstruction performance as other established reconstruction algorithms. In fact, preliminary results show high reconstruction efficiency combined with fast inference time: in particular, a reconstruction time of O(ms) per batch of particles makes DeepRICH potentially faster than the established reconstruction methods available at present [37].

AI is spreading across different research areas, improving the science output. In particular, we are witnessing a growing number of applications in particle and nuclear physics, beginning with high-level physics analysis and followed, in more recent years, by the development of reconstruction algorithms. This paper describes novel directions regarding AI-based applications for imaging Cherenkov detectors.

Bayesian optimization is currently being used for detector calibration and design optimization. The misalignment offsets affecting the instrumentation can be inferred directly from real data with BO, and a real-world example has been described for the GlueX DIRC detector. BO can also be used to improve single or multi-detector designs of future experiments like EIC. Recent advances in the field show that this can be done in a highly parallelized and automated way. With regard to design, real-world costs of the components could be included in the optimization process. Furthermore, recent optimization packages have been developed to improve the scalability of BO with the number of observations.

Deep learning applications have recently been explored to provide fast and accurate reconstruction of charged particles for the DIRC detector. The fast reconstruction time, in particular, makes a DL-based algorithm suitable for near real-time applications (e.g., calibration). Another important feature is related to the nature of generative models like the VAE, which suggests the tempting scenario of generalizing these new algorithms to fast generation of events once the behavior in the latent space is learnt.

Finally, another intriguing application could be training these algorithms using pure samples of identified particles from real data, allowing the response of the Cherenkov detectors to be learned in depth.

Acknowledgments

This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Nuclear Physics under contract DE-FG02-94ER40818.

References

[1] K. Albertsson et al., Machine learning in high energy physics community white paper, J. Phys. Conf. (2018).
[2] P. Mehta et al., A high-bias, low-variance introduction to Machine Learning for physicists, Phys. Rep. (2019) 1–124.
[3] J. J. Hopfield, Artificial neural networks, IEEE Circuits Devices Mag. (1988) 3–10.
[4] M. Castellano, E. Nappi, F. Posa and G. Satalino, An artificial neural network computational scheme for pattern matching problems in high-energy physics, tech. rep., CM-P00056571, 1991.
[5] S. Lebedev and G. Ososkov, Fast algorithms for ring recognition and electron identification in the CBM RICH detector, Particles and Nuclei Letters (2009) 161–176.
[6] LHCb Collaboration, LHCb PID upgrade technical design report, tech. rep., 2013.
[7] D. Derkach, M. Hushchyn and N. Kazeev, Machine Learning based Global Particle Identification Algorithms at the LHCb Experiment, EPJ Web Conf. (2019) 06011.
[8] A. Rogozhnikov, A. Bukva, V. Gligorov, A. Ustyuzhanin and M. Williams, New approaches for boosting to uniformity, J. Instrum. (2015) T03002.
[9] J. Holder et al., The first VERITAS telescope, Astropart. Phys. (2006) 391–401.
[10] R. Bird et al., Muon Hunter: a Zooniverse project, arXiv (2018).
[11] Y. LeCun, Y. Bengio et al., Convolutional networks for images, speech, and time series, The handbook of brain theory and neural networks (1995).
[12] S. Fukuda et al., The Super-Kamiokande detector, Nucl. Instrum. Methods Phys. Res. A (2003) 418–462.
[13] M. Abadi et al., TensorFlow: Large-scale machine learning on heterogeneous distributed systems, arXiv (2016).
[14] T. Theodore, Particle Identification in Cherenkov Detectors using Convolutional Neural Networks, tech. rep., 2016.
[15] P. Lasorak and N. Prouse, TITUS: An intermediate distance detector for the Hyper-Kamiokande neutrino beam, arXiv (2015).
[16] E. Drakopoulou, G. Cowan, M. Needham, S. Playfer and M. Taani, Application of machine learning techniques to lepton energy reconstruction in water Cherenkov detectors, J. Instrum. (2018) P04009.
[17] D. R. Jones, M. Schonlau and W. J. Welch, Efficient global optimization of expensive black-box functions, J. Global Optim. (1998) 455–492.
[18] J. Snoek, H. Larochelle and R. P. Adams, Practical Bayesian optimization of machine learning algorithms, Adv. Neural Inf. Process. Syst. (2012) 2951–2959.
[19] C. K. Williams and C. E. Rasmussen, Gaussian processes for machine learning, vol. 2. MIT Press, Cambridge, MA, 2006, 10.1007/978-3-540-28650-9_4.
[20] The GlueX Collaboration, A. Ghoul et al., First results from the GlueX experiment, AIP Conf. Proc. (2016) 020001.
[21] J. Dudek et al., Physics opportunities with the 12 GeV upgrade at Jefferson Lab, Eur. Phys. J. A (2012) 187.
[22] C. A. Meyer and E. Swanson, Hybrid mesons, Prog. Part. Nucl. Phys. (2015) 21–58.
[23] M. Calvi et al., Characterization of the Hamamatsu H12700A-03 and R12699-03 multi-anode photomultiplier tubes, J. Instrum. (2015) P09021.
[24] A. El Alaoui, N. Baltzell and K. Hafidi, A RICH detector for CLAS12 spectrometer, Phys. Procedia (2012) 773–780.
[25] F. Barbosa et al., The GlueX DIRC detector, Nucl. Instrum. Methods Phys. Res. A (2017) 69–71.
[26] C. Fanelli, "Overview of Bayesian Optimization Applied to the GlueX case." Nov 6 2018.
[27] J. Hardin and M. Williams, FastDIRC: a fast Monte Carlo and reconstruction algorithm for DIRC detectors, J. Instrum. (2016) P10007.
[28] J. A. Hanley and B. J. McNeil, The meaning and use of the area under a receiver operating characteristic (ROC) curve, Radiology (1982) 29–36.
[29] E. Cisbani, A. Del Dotto, C. Fanelli, M. Williams et al., AI-optimized detector design for the future Electron-Ion Collider: the dual-radiator RICH case, arXiv (2019).
[30] N. Akopov et al., The HERMES Dual-Radiator Ring Imaging Cherenkov Detector, Nucl. Instrum. Meth. A (2002) 511–530.
[31] The LHCb Collaboration, A. Alves Jr et al., The LHCb Detector at the LHC, J. Instrum. (2008) S08005.
[32] The LHCb Collaboration, M. Adinolfi et al., Performance of the LHCb RICH detector at the LHC, Eur. Phys. J. C (2013) 1–17.
[33] E. Aschenauer et al., Electron-Ion Collider Detector Requirements and R&D Handbook (Version 1.1).
[34] A. Del Dotto et al., Design and R&D of RICH detectors for EIC experiments, Nucl. Instrum. Meth. A (2017) 237–240.
[35] D. Derkach, N. Kazeev, F. Ratnikov, A. Ustyuzhanin and A. Volokhova, Cherenkov detectors fast simulation using neural networks, Nucl. Instrum. Methods Phys. Res. A (2019).
[36] I. Goodfellow et al., Generative adversarial nets, in Advances in Neural Inf. Process Syst., pp. 2672–2680, 2014.
[37] C. Fanelli and J. Pomponi, DeepRICH: Learning Deeply Cherenkov Detectors.
[38] D. P. Kingma and M. Welling, Auto-encoding variational Bayes, CoRR abs/1312.6114 (2013).
[39] A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf and A. J. Smola, A Kernel Method for the Two-Sample Problem, NIPS abs/0805.2368 (2006) 513–520.
[40] L. van der Maaten and G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res. (2008) 2579–2605.