DeepCore: Convolutional Neural Network for high pT jet tracking
Proceedings of the CTD/WIT 2019, PROC-CTD19-020, CMS CR-2019/068, 23rd June 2020
Valerio Bertacchi
On behalf of the CMS Collaboration, Scuola Normale Superiore and INFN Sez. di Pisa, Italy
ABSTRACT

Tracking in high-density environments, such as the core of TeV jets, is particularly challenging both because combinatorics quickly diverge and because tracks may no longer leave individual "hits" but rather large clusters of merged signals in the innermost tracking detectors. In the CMS collaboration, this problem has been addressed in the past with cluster splitting algorithms, working layer by layer, followed by a pattern recognition step where a high number of candidate tracks are tested. Modern Deep Learning techniques can be used to better handle the problem by correlating information on multiple layers and directly providing proto-tracks without the need of an explicit cluster splitting algorithm. Preliminary results will be presented with ideas on how to further improve the algorithms.

PRESENTED AT
Connecting the Dots and Workshop on Intelligent Trackers (CTD/WIT 2019)
Instituto de Física Corpuscular (IFIC), Valencia, Spain
April 2-5, 2019

Events with high-energy jets (transverse momentum p_T^jet ≳ 1 TeV) are part of a rich physics program at the LHC, both for New Physics searches and for Standard Model (SM) physics. The boosted environment improves the performance of analyses involving high-mass SM objects, like vector bosons or b quarks [1-3]. Track reconstruction inside the jet is a fundamental step for all the analyses that investigate the composition of the jets, looking for substructures and specific particle signatures. In the CMS experiment the full reconstruction of the event relies on the Particle Flow algorithm [4], which combines the information of the subdetectors to assign a particle tag to each reconstructed object, rebuilding the entire event.
The silicon tracker information is one of the building blocks of Particle Flow, and any improvement in tracking gives large benefits to the entire event reconstruction of CMS.

The CMS detector is built around a superconducting solenoid which provides a magnetic field of 3.8 T. Within the solenoid volume there are, from the interaction point (IP) outwards, the silicon pixel and strip tracker, the lead tungstate electromagnetic calorimeter, and the hadronic calorimeter composed of alternating layers of brass and scintillators. The muon detector is composed of gas chambers embedded in the steel yoke outside the superconducting solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [5]. In particular, the pixel detector [6] is composed of four layers in the barrel region and three disks in each endcap region (which offer a 4-hit coverage up to |η| = 2.5). The radii of the barrel layers are 29, 68, 109, 160 mm; the distances of the disks from the IP are 291, 396, 516 mm. The pixel size is 100 × 150 µm² in both regions. The resolution is of order 10 µm in the r-φ direction and 25 µm in the z direction.

The CMS track reconstruction algorithm, called Combinatorial Track Finder (CTF), is based on the combinatorial Kalman filter (CKF) [7-10]. The two main ideas of this approach are: to perform pattern recognition and track fitting in the same framework, and to manage the high level of complexity of the events (i.e. the combinatorial burden) with multiple passes of the same reconstruction sequence. The first iterations look for the tracks which are easier to find, and then remove the associated hits. The following iterations look for more difficult kinematic regions (low or very high p_T, displaced vertices, high η, ...), but each iteration searches in a less dense environment because of the removed hits.
This process is called iterative tracking and each pass proceeds in four steps:

1. Seed generation. Building of proto-tracks with the use of a few hits (from 2 to 4) from specific layers of the tracker. This rough estimate of the track parameters is used as the starting point for the second step. The minimum requirement to obtain an estimate of the track parameters is three points, as two 3D hits with the vertex constraint or three 3D hits. For each iteration a set of seeding layers and a tracking region are defined: the seeding layers are the detectors where the seed hits are searched for (a pair, triplet or quadruplet of tracker layers), while the tracking region is the kinematic or geometric selection applied on the hits to identify the seeds, defining the phase space of the region of interest. If the seeding layers are three or four pixel detector layers, a Cellular Automaton [11] is used to produce the seed list instead of the tracking region constraint.

2.
Pattern Recognition. Extrapolation from the track seeds to the outer layers of the tracker, looking for compatible hits and exploring multiple hypotheses. The extrapolation is done taking into account the material effects (multiple scattering, energy losses), first moving outward and then repeating the extrapolation inward to recover precision in the seeding region.

3.
Fitting. The track candidate is fitted using the Kalman filter and smoother [12], moving outward with the Runge-Kutta propagator [13], which takes into account both the material effects and the inhomogeneities of the magnetic field.

4.
Quality flagging. The track candidates are flagged with different tags depending on their quality (based on number of hits, χ², track parameters, ...), or discarded if the quality is too low.

After the selection, the track collections from the various iterations are merged in a single collection, called general tracks. More details about the track reconstruction can be found in Ref. [14].

The number of charged-particle tracks and their spatial density inside a jet grow with the energy of the jet, and a dedicated iteration for high-energy jets was added in 2015, because the tracking performance in the jet core (i.e. the central region) is lower than the average [15]. This iteration, called jetCore, was added as the last step of the iterative tracking and searches for seeds only in a cone of ∆R = √(∆φ² + ∆η²) < 0.1 around the jet axis (from the calorimeter deposit) if p_T^jet is above the jetCore threshold. The seeds are built with pairs of hits on the pixel detector and/or in the inner strip detector barrel, compatible with a minimum p_T requirement, with the vertex constraint. In addition, the CKF tests a larger number of candidates in the jet core cone region (∼50 against the standard 5). The tracks in the jet core, due to the high density, often leave large merged clusters on the pixel layers rather than individual hits. A dedicated k-means [16] based cluster splitter was developed to handle the merged clusters, which exploits the jet axis information to predict the cluster shape and charge for a single-particle cluster or a multiple-particle merged cluster. The performance of the jetCore iteration (called from now on standard jetCore) is suboptimal: the jetCore iteration improves the total tracking efficiency, but a simulation with an ideal cluster splitting reveals that there is still room for improvement.
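The iterative-tracking loop described above can be sketched schematically (a toy illustration, not CMSSW code; the pass names and finders are hypothetical):

```python
# Toy sketch of iterative tracking: each pass runs the same find-tracks
# sequence on the hits left over by the previous, easier passes, so later
# passes face a sparser combinatorial environment.
def iterative_tracking(hits, passes):
    """hits: iterable of hit ids; passes: list of (name, finder), where a
    finder returns (tracks, used_hit_ids) for the hits it is given."""
    all_tracks = []
    remaining = set(hits)
    for name, finder in passes:
        tracks, used = finder(remaining)
        all_tracks.extend((name, t) for t in tracks)
        remaining -= used  # hit removal: the key to taming combinatorics
    return all_tracks, remaining

# Toy finders: a "prompt" pass that uses the even hits, and a later pass
# that picks up what is left.
prompt = lambda h: ([sorted(h)[:3]], {x for x in h if x % 2 == 0})
displaced = lambda h: ([sorted(h)[:2]], set(h))
tracks, leftover = iterative_tracking(range(10),
                                      [("initial", prompt),
                                       ("jetCore", displaced)])
```

Each pass sees only the hits that survived the previous ones, mirroring how the jetCore iteration runs last, on the densest leftover regions.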
However, the tracking efficiency still degrades in the jet core even with the ideal splitting, pointing out that the inefficiency is not due to the merged clusters only. Therefore it has been decided to change approach and develop a new version of the jetCore seeding algorithm, avoiding an explicit splitting step and using the combined information of multiple pixel detector layers to produce a new list of jetCore seeds, instead of focusing on improving a layer-by-layer cluster splitting. An Artificial Neural Network, called DeepCore, has been developed, trained and tested in the CMS reconstruction software to cope with this task. The description of DeepCore together with its performance is presented in the following sections.
In this section the general strategy on which the novel high-p_T jet seeding algorithm is based is presented. Then the details of DeepCore are described and, in the last part, the integration of the network in the CMS CTF is shown. DeepCore is currently developed for the barrel region only, therefore in this framework the pixel detector is simply made of four cylindrical layers.

The purpose of the seeding algorithm is to produce a list of track seeds, i.e. sets of track parameters for the tracking region of interest. The primary goal of the algorithm is to find additional seeds in the jet core region, recovering seeding efficiency while lowering the fake-track rate. This result can be reached by producing better-quality seeds in terms of track parameters. The secondary goal is to lower the time consumption of the jetCore iteration, currently one of the most expensive of the CTF (because of the large number of explored candidates).

Because the previous cluster splitting algorithm proved suboptimal, this explicit step has been skipped. The seeding algorithm produces the list of seeds (i.e. track parameters) directly from the raw pixel detector information, without any clustering algorithm on top. A good candidate to reproduce the function

f : {raw pixel information} → {list of track seeds}

is an Artificial Neural Network (NN). By raw pixel information is meant, from now on, the individual pixel charge and position, without any clustering algorithm, but with the default charge calibration and zero-suppression algorithms applied.

In the wide field of NNs, a Convolutional Neural Network (CNN) has been used to face the problem. CNNs [17] are one of the most natural choices with a 2D-picture input, like the pixel detector information. Each node of the network can be interpreted as a single pixel; each node of the hidden layers is connected only to a few nodes of the previous layer, i.e. it receives information only from a small region of the layer.
The values of the previous layer inside the region are combined with a specific filter to produce the weight of the node of the hidden layer. The CNN swipes the filter along the entire 2D input, looking for common features in the layer while properly sharing the weights. The CNN uses multiple filters to increase the feature-discovery power, exploring the entire layer multiple times. The relevant parameters are the number of filters (how many kinds of features are expected), the dimension of the filters (how many pixels are needed to identify a feature) and the number of convolutional layers (the complexity of the features).

In the tracking environment the pixel detector layers can be interpreted as RGB channels of the same 2D picture (i.e. an additional dimension). The inputs are fixed-size windows of pixels (the jet core regions). The features inside the filters are the track patterns on the four layers, thus the filter dimension must be large enough to include the track hits on the four layers. The network is realized with convolutional layers only: a 2D-picture output allows it to be completely independent of the number of tracks in the layer, depending only on the mean occupancy. In addition, the network can be rescaled for different window sizes or a different tracker geometry without changing the architecture, but only a few hyper-parameters. Another relevant feature of the convolutional approach is that all the seeds are predicted at the same time, without removing the corresponding hits in a sequential way. This approach has been previously used in Ref. [18] to identify a variable number of targets in videos with real-time detection.
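The filter-swiping and weight-sharing mechanism described above can be illustrated with a minimal 2D convolution in plain NumPy (a toy example, not part of DeepCore):

```python
import numpy as np

# One small filter is swiped across the whole 2D input with shared weights,
# so the same local feature (here, a vertical pair of fired pixels) is
# detected wherever it occurs in the window.
def conv2d_valid(image, kernel):
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

img = np.zeros((5, 5))
img[1, 1] = img[2, 1] = 1.0        # a short vertical "track stub"
kernel = np.array([[1.0], [1.0]])  # 2x1 filter sensitive to vertical pairs
resp = conv2d_valid(img, kernel)   # response peaks at the stub position
```

The same kernel weights are reused at every position, which is exactly why the number of parameters depends on the filters and not on the window size.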
Training Input.
The input of the network is four pixel maps centred on the merged clusters. The procedure to build them is the following: for each jet with p_T > 1 TeV, the interception between the jet axis from the calorimeter information and the first layer of the pixel detector is found; then a cone of ∆R = 0.1 is opened and all the merged clusters inside the cone on layer 1 are found. A cluster is flagged as merged if its charge and shape are compatible with multiple particles.∗ If the crossed pixel detector module is inactive, the list of the merged clusters on the next layer, layer 2, is used. Then, for each merged cluster, a 30 × 30 pixels window is opened in each of the four layers, using as centre the interception between the layer and the direction defined by the primary vertex (PV) and the merged cluster. The jet axis is also added as an additional direction to open the four windows. For each direction and each window, the x, y and charge information of the hits inside the window is stored. The charge information is normalized to a fixed value (14000 ADC counts) to obtain a number of order 1, easier to handle for a NN. Each training input is made of the four windows, called pixel maps, thus for each jet multiple overlapping inputs are produced. In addition, the jet η and jet p_T are also given as input, because the shape of the cluster depends on the energy and the crossing angles of the particles. In Figure 1 an example of the four pixel-map inputs is shown.
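A rough sketch of the input construction follows (the 30 × 30 window size and the helper names are illustrative assumptions; the 14000 ADC normalization is the one quoted in Figure 1):

```python
import numpy as np

# Crop one window per pixel layer around the predicted crossing point and
# stack the four layers as channels of a single "picture".
WINDOW, NORM = 30, 14000.0  # assumed window size; ADC normalization

def pixel_map(layer_img, center_row, center_col, window=WINDOW):
    """Crop a window x window region centred on (center_row, center_col)."""
    h = window // 2
    padded = np.pad(layer_img, h)              # guard against window edges
    r, c = center_row + h, center_col + h
    return padded[r - h:r + h, c - h:c + h] / NORM  # ADC counts -> O(1)

# Toy layer images and crossing points (hypothetical sizes and positions).
layers = [np.random.poisson(5, (160, 416)).astype(float) for _ in range(4)]
centers = [(80, 200)] * 4
x = np.stack([pixel_map(l, *c) for l, c in zip(layers, centers)], axis=-1)
```

The resulting tensor has one channel per pixel layer, matching the RGB-channel analogy used above.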
Figure 1: Example of the pixel maps used as input for the DeepCore neural network: the maps show the windows on the four pixel detector layers of CMS, aligned to the jet direction. The ADC counts are divided by 14000.

∗ This assumption is used for the training input only and does not bias the CNN with respect to an MC-truth merged cluster, because of the large overlap between windows.

Training Target.
For each input, the target of the network is made of three copies of a Track Crossing Points (TCP) Map and of a Track-Parameters Map. Each copy of the two Maps is a pair of 30 × 30 matrices. For each pixel of the layer-2 input map, if a track crosses that pixel, a 1 is stored in the corresponding pixel of the first TCP Map, and a 0 is stored otherwise. For each 1-pixel of the TCP Map, the parameters of the track are stored in the corresponding pixel of the Track-Parameters Map. The track parameters are stored in local coordinates: ∆x and ∆y with respect to the centre of the pixel, ∆η and ∆φ with respect to the merged cluster-PV direction, and the p_T of the track. The track parameters are also stored for the pixels within a radius of 2 pixels of each TCP, in the local pixel reference (these pixels are called Near to track Crossing Points, NCP). The rest of the pixels of the Track-Parameters Map are filled with 0. The second and third copies (called Overlap 2 and 3 Maps) are filled to take into account multiple tracks crossing the same pixel: if another track crosses a TCP, another 1 is stored in the TCP Map-Overlap 2, with the corresponding filling of the Track-Parameters Map-Overlap 2. The same holds for the Overlap 3 Maps in case of three tracks in the same pixel. In Figure 2 this complex target is shown graphically.

Architecture.
The architecture of the network is fully convolutional. It is schematically shown in Figure 2. The inputs feed five 2D convolutional layers with decreasing filter size and number; then the network splits in two branches: four 2D convolutional layers to produce the Track-Parameters Maps and four 2D convolutional layers for the TCP Maps. The activation functions are ReLU for all the layers except the last TCP Maps layer, where a Sigmoid is used.‡ The total number of parameters of the network is 77373.

Figure 2: On the left (a), an example of the Target for a single track: the TCP is shown in yellow, the track parameters are stored for all the pixels inside the shaded blue area, and the red pixel is the one with respect to which the parameters are evaluated. The Overlap Maps are not shown. On the right (b), the architecture of DeepCore: the input (p_T^jet, η^jet and the four pixel maps) feeds five shared convolutional layers (50, 20, 20, 18, 18 filters), followed by two branches of four convolutional layers each (with 18, 18, 18, 18 filters and 12, 9, 7, 6 filters), ending in the Track Crossing Points Maps and the Track-Parameters Maps.
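A hedged Keras sketch of such a two-branch fully convolutional network follows; the filter counts follow Figure 2, but the filter sizes, the omission of the jet p_T and η inputs, and the output channel counts (three overlap maps, 3 × 5 parameter channels) are simplifying assumptions, not the exact DeepCore configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of a DeepCore-like fully convolutional network: a shared trunk
# followed by a TCP branch and a track-parameter branch. All sizes below
# the filter counts are illustrative assumptions.
def build_deepcore_like(window=30, n_layers=4, n_overlap=3, n_par=5):
    pix = keras.Input((window, window, n_layers), name="pixel_maps")
    x = pix
    for nf in (50, 20, 20, 18, 18):          # shared trunk
        x = layers.Conv2D(nf, 5, padding="same", activation="relu")(x)
    t = x
    for nf in (18, 18, 18):                  # TCP branch
        t = layers.Conv2D(nf, 3, padding="same", activation="relu")(t)
    tcp = layers.Conv2D(n_overlap, 3, padding="same",
                        activation="sigmoid", name="tcp_maps")(t)
    p = x
    for nf in (12, 9, 7):                    # track-parameter branch
        p = layers.Conv2D(nf, 3, padding="same", activation="relu")(p)
    par = layers.Conv2D(n_overlap * n_par, 3, padding="same",
                        name="par_maps")(p)
    return keras.Model(pix, [tcp, par])

model = build_deepcore_like()
```

Being fully convolutional, the same weights apply to any window size, which is the rescaling property noted in the text.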
Prediction.
The Prediction of the network has the same structure as the Target, i.e. three 30 × 30 TCP Maps and three Track-Parameters Maps. The TCP Maps contain values between 0 and 1 for each pixel, which can thus be interpreted as the probability that a track crosses that pixel. The Track-Parameters Maps contain instead the five parameters for the TCP and NCP pixels in local coordinates.
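Interpreting the prediction maps as described above amounts to reading the parameter map at the pixels whose TCP probability passes a threshold; a minimal sketch (hypothetical function names; the 0.85 threshold is the one quoted later for the first Overlap at integration level):

```python
import numpy as np

# Read out seeds from one TCP probability map and its five-parameter map.
def seeds_from_prediction(tcp_map, par_map, threshold=0.85):
    """tcp_map: (H, W) crossing probabilities; par_map: (H, W, 5) local
    parameters (dx, dy, deta, dphi, pT). Returns one seed per pixel whose
    probability exceeds the threshold."""
    rows, cols = np.where(tcp_map > threshold)
    return [(r, c, par_map[r, c]) for r, c in zip(rows, cols)]

tcp = np.zeros((30, 30))
tcp[12, 17] = 0.97                         # one confident crossing point
par = np.zeros((30, 30, 5))
par[12, 17] = [0.1, -0.2, 0.01, 0.02, 900.0]
seeds = seeds_from_prediction(tcp, par)    # one seed, at pixel (12, 17)
```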
Training details.
The NN has been trained with a large sample of inputs, for which the corresponding target information is also given. During the training the network must predict the target given the input only; then the prediction is compared with the true target. The comparison proceeds with a given metric, i.e. the loss function, which defines the degree of accuracy of the prediction. Two losses, one for each target, have been used to train DeepCore. A weighted Binary Cross Entropy has been used for the TCP Maps, i.e.

L_TCP = -(1/N) Σ_{i=1}^{N} w_i [ y_i^true ln(y_i^pred) + (1 - y_i^true) ln(1 - y_i^pred) ],

where the TCP pixels have weight w_i = 10, the NCPs 1 and the other pixels 0.01. The weighting is needed to avoid a vanishing TCP prediction because of the sparse target. A clipped mean square error has been used for all the parameters,

L_par = Σ_{p ∈ TCP,NCP} min[ (p^pred - p^true)², c ] / N_{TCP+NCP},

where the sum runs only on the TCP and NCP pixels and c is a fixed clipping constant. The clipping is needed to avoid large tails in the prediction which would artificially enlarge the loss. The training sample is composed of 22 million inputs (about 2 million jets), plus two million inputs used for validation, and it is composed of multijet events with the transfer momentum p̂_T between 1.8 and 2.4 TeV. The jets are required to have p_T^jet > 1 TeV and |η^jet| < 1.4, while only the tracks above a minimum p_T have been used to build the targets. The batch size (the number of inputs analysed for each prediction) is 32, which is the largest possible given the available computation power. The chosen optimizer is Adam [19]; the learning rate has been gradually decreased during the 246 epochs of training, and in each epoch all the training sample is explored.

The training of DeepCore has been performed outside of the CMS reconstruction software (CMSSW) on GPU, and then the final weights have been permanently stored and given to CMSSW. DeepCore has been developed with the Keras library [20], both for the training and for the prediction inside CMSSW. DeepCore has been integrated into the jetCore iteration of the CMS reconstruction: the standard jetCore seeding has been disabled and replaced with the following algorithm.

The cluster list in a cone of ∆R = 0.1 with respect to the jet axis is identified for each calorimeter jet above the jetCore p_T threshold. Each cluster defines a new direction on which a DeepCore input is built (the four pixel maps plus p_T^jet and |η^jet|). The input is defined for all the clusters, and not only for the merged ones, to recover as much efficiency as possible at seeding level; the standard duplicate remover takes care of removing overlapping tracks in the following steps of the reconstruction. The input is given to the DeepCore NN, which returns the prediction given the weights of the training. The list of actual seeds is made from the DeepCore prediction with the sets of five track parameters of the most probable pixels. Most probable is defined as a TCP output greater than 0.85, 0.75, 0.65 for the three Overlaps, or greater than 0.5, 0.4, 0.3 in case layer 2 is missing for the given input (because of an inactive module). The threshold is lowered in the latter case because the target of the TCP is built on layer 2, which is thus crucial in the prediction. In addition to the standard duplicate remover, the list of seeds is cleaned from duplicates: if two seeds are closer than fixed thresholds in ∆x, ∆y, ∆η and ∆φ, the one with the lower value of TCP is removed from the list. The uncertainty on the parameters is fixed for all the seeds: σ_pT, σ_η and σ_φ are set to fixed values and σ_xy = σ_z = 44 µm, without off-diagonal terms, based on the performance of the DeepCore prediction (see next section).

The behaviour of DeepCore can be checked during the training with an "event display", developed for optimization studies externally from CMSSW. The same event of Figure 1 is shown in Figure 3, together with the TCP Map, the target and the track-parameter prediction of the most probable hits only, at the end of the training.

‡ Sigmoid is recommended for a limited-range output and binary losses; see the training details.
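The two losses described in the training details above can be sketched in NumPy as follows (the clipping constant c is an assumption, since its value is not restated here):

```python
import numpy as np

# Weighted binary cross entropy for the TCP Maps: pixel weights of 10 (TCP),
# 1 (NCP) and 0.01 (elsewhere) keep the sparse positives from vanishing.
def weighted_bce(y_true, y_pred, weights, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # numerical safety for log
    ll = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return -np.mean(weights * ll)

# Clipped mean square error for the parameter maps: large prediction tails
# are clipped to c so they do not dominate the loss.
def clipped_mse(p_true, p_pred, c=1.0):
    return np.mean(np.minimum((p_pred - p_true) ** 2, c))

y_t = np.array([1.0, 0.0])
y_p = np.array([0.9, 0.2])
w = np.array([10.0, 0.01])
l1 = weighted_bce(y_t, y_p, w)
l2 = clipped_mse(np.array([0.0, 0.0]), np.array([0.1, 5.0]))  # tail clipped
```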
The event display has only a qualitative interpretation, but it reveals an almost full efficiency and an accuracy of 1-2 pixels even with the linear propagation used, with an affordable level of duplication (the duplicate remover has not been run here). Figure 4 shows an example of the quantitative validation of the training performance, in terms of the residual of the η parameter between the prediction and the target. The null average bias, the 1.4% spread and the strong correlation with the target show that DeepCore is able to predict the parameters given the pixel input.

DeepCore has been validated, integrated in the CMS reconstruction, on 20k multijet events with the transfer momentum p̂_T between 1.8 and 2.4 TeV. The jets are required to have p_T^jet > 1 TeV and |η^jet| < 1.4, and on the simulated tracks the typical CMS selections on η, production vertex position and p_T have been applied.

The results for the final tracking performance are shown in Figure 5 in a stacked plot, with the contribution of the various iterations of the CTF in the jet core region highlighted. The tracking efficiency is defined as ε = N_assoc/N_sim, where N_sim is the number of simulated tracks and N_assoc is the number of reconstructed tracks associated to a simulated one. The fake rate is defined as R_F = N_not-assoc/N_reco, where N_reco is the number of reconstructed tracks and N_not-assoc is the number of reconstructed tracks not associated to a simulated one. A reconstructed track is flagged as "associated" if the χ² between its parameters and the simulated ones is lower than 25. This definition replaces the usual CMS one (based on the fraction of true hits used) for these validation studies, because the DeepCore seeding is without pixel hits and with the usual association it would be negatively biased.

Figure 3: Example of the pixel maps used as input. On the top, the crosses mark the crossing points of the target (simulated) tracks and the corresponding DeepCore predictions for the most probable hits. The prediction is produced on layer 2 and propagated linearly to the other layers. The rightmost figure is the map of the predicted crossing points on the window of layer 2, expressed as a probability, with the crosses of the predictions and the targets. The linear propagation is used in the event display only; in the seed production the predicted p_T is used.

Figure 4: On the left (a), the residual between the seed η parameter predicted by DeepCore and the target (simulated) track η parameter. On the right (b), the correlation between the DeepCore prediction and the target, shown as the predicted seed η parameter against the simulated track η parameter.

The improvement given by DeepCore to the CMS reconstruction is better shown in Figure 6, where the performance with the standard jetCore algorithm and with DeepCore are compared. The tracking performance obtained producing the seeds for the jetCore iteration using the simulated track information is also shown (MC truth seeding), for which the seeding efficiency is 100% and the fake rate 0% by definition.
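The efficiency and fake-rate definitions above can be sketched as follows (toy χ² on the track parameters, plain Python; the real validation uses the full parameter covariance):

```python
# Toy chi2: squared distance between parameter vectors (covariance ignored).
def chi2(reco, sim):
    return sum((r - s) ** 2 for r, s in zip(reco, sim))

def efficiency_and_fake_rate(reco_tracks, sim_tracks, cut=25.0):
    """eps = N_assoc / N_sim and R_F = N_not-assoc / N_reco, with a track
    'associated' when chi2 between its parameters and a simulated one < cut."""
    assoc_sim = {i for i, s in enumerate(sim_tracks)
                 if any(chi2(r, s) < cut for r in reco_tracks)}
    n_assoc_reco = sum(any(chi2(r, s) < cut for s in sim_tracks)
                       for r in reco_tracks)
    eff = len(assoc_sim) / len(sim_tracks)
    fake = 1 - n_assoc_reco / len(reco_tracks)
    return eff, fake

sim = [(1.0, 2.0), (8.0, 9.0)]
reco = [(1.1, 2.1), (50.0, 50.0)]          # one good track, one fake
eff, fake = efficiency_and_fake_rate(reco, sim)   # -> 0.5, 0.5
```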
Figure 5: Tracking efficiency (left) and fake rate (right) in the jet core region (∆R < 0.1 between the reconstructed jet axis and the simulated track direction), for QCD events with 1.8 TeV < p̂_T < 2.4 TeV (no PU), p_T^jet > 1 TeV and |η^jet| < 1.4. The contributions of the different iterations of the CKF are shown as stacked histograms. The DeepCore algorithm is used in the iteration dedicated to the cores of the jets [jetCore (purple)]. In the efficiency, the reconstructed tracks shared (duplicated) between various iterations are not removed.
Figure 6: Tracking efficiency (left) and fake rate (right) in the jet core region (∆R < 0.1 between the reconstructed jet axis and the simulated track direction), for QCD events with 1800 GeV < p̂_T < 2400 GeV (no PU), p_T^jet > 1 TeV and |η^jet| < 1.4. The light blue filled histogram is obtained with the standard CMS tracking algorithm. The dark blue histogram is obtained removing the CKF iteration dedicated to the jet cores. The red histogram is obtained using DeepCore in the seeding for the iteration dedicated to the jet cores. The green histogram is obtained producing the seeds for the jetCore iteration using the MC truth seeding. In the lower pads, the differences between the various tracking efficiencies (fake rates) and the MC truth seeding one, divided by the MC truth seeding efficiency (fake rate), are shown.

DeepCore is able to reproduce the perfect seeding efficiency with a degradation below 1%, flat in ∆R. On the other hand, all the fake tracks produced by the standard jetCore are avoided, reducing the seeding fake rate below 5%. In particular, the good purity of the DeepCore seeds lowers the fakes below the rate obtained without the jetCore iteration, because DeepCore is able to correctly reconstruct tracks that are reconstructed as fakes by other iterations in the low-∆R region. The timing performance has also been validated: the DeepCore time consumption is 15% of the average time of the standard jetCore iteration.

CNNs have been shown to be a valid approach to perform seeding for track reconstruction in a dense environment. The DeepCore algorithm, developed and validated with the CMS tracker in the central region, shows better performance than the standard seeding algorithm in such a dense environment: it almost cancels the seeding inefficiencies, reduces the fake rate by up to 60% and the seeding time by 85%. For the track reconstruction to be used in Run 3 of the LHC, CMS plans to extend this approach to the endcap region as well.
The more complex geometry of the endcaps requires some adjustments of the network input and target. Furthermore, an optimization of the training (in terms of batch size, learning rate and architecture) and of the target definition (in order to reduce the strong dependence on layer 2) is planned. In addition, specific studies are required to evaluate the impact of a pixel-less seeding on the inward CKF extrapolation. Finally, the good performance of the DeepCore algorithm suggests studying the impact of applying such an approach to the pattern recognition as well.
References

[1] A.M. Sirunyan, et al., JINST 13 (05), P05011 (2018). DOI 10.1088/1748-0221/13/05/P05011
[2] A.M. Sirunyan, et al., Phys. Rev. D 98, 092014 (2018). DOI 10.1103/PhysRevD.98.092014
[3] L. Asquith, et al., (arXiv:1803.06991) (2018)
[4] A.M. Sirunyan, et al., JINST 12 (10), P10003 (2017). DOI 10.1088/1748-0221/12/10/P10003
[5] S. Chatrchyan, et al., JINST 3, S08004 (2008). DOI 10.1088/1748-0221/3/08/S08004
[6] V. Tavolaro, JINST 11 (12), C12010 (2016). DOI 10.1088/1748-0221/11/12/C12010
[7] A. Strandlie, R. Fruhwirth, Rev. Mod. Phys. 82, 1419 (2010). DOI 10.1103/RevModPhys.82.1419
[8] P. Billoir, Comput. Phys. Commun. 57, 390 (1989). DOI 10.1016/0010-4655(89)90249-X
[9] P. Billoir, S. Qian, Nucl. Instrum. Meth. A294, 219 (1990). DOI 10.1016/0168-9002(90)91835-Y
[10] R. Mankel, Nucl. Instrum. Meth. A395, 169 (1997). DOI 10.1016/S0168-9002(97)00705-5
[11] F. Pantaleo (2017). URL http://cds.cern.ch/record/2293435
[12] R. Frühwirth, Nucl. Instrum. Meth. A262, 444 (1987). DOI 10.1016/0168-9002(87)90887-4
[13] J.R. Cash, A.H. Karp, ACM Trans. Math. Softw. 16 (3), 201 (1990). DOI 10.1145/79505.79507
[14] S. Chatrchyan, et al., JINST 9 (10), P10009 (2014). DOI 10.1088/1748-0221/9/10/P10009
[15] CMS Collaboration. URL https://twiki.cern.ch/twiki/bin/view/CMSPublic/HighPtTrackingDP
[16] J. MacQueen, in Proc. Fifth Berkeley Symp. on Math. Statist. and Prob., Vol. 1 (Univ. of Calif. Press, 1967), p. 281. URL https://projecteuclid.org/euclid.bsmsp/1200512992
[17] V. Dumoulin, F. Visin, (arXiv:1603.07285) (2016)
[18] J. Redmon, A. Farhadi, (arXiv:1804.02767) (2018)
[19] D.P. Kingma, J. Ba, (arXiv:1412.6980) (2014)
[20] F. Chollet, et al. Keras. https://keras.io