A novel approach for nearly-coincident events rejection
M. Borghesi, M. De Gerone, M. Faverzani, M. Fedkevych, E. Ferri, G. Gallucci, A. Giachero, A. Nucciotti, A. Puiu
Eur. Phys. J. C manuscript No. (will be inserted by the editor)
M. Borghesi, M. De Gerone, M. Faverzani, M. Fedkevych, E. Ferri, G. Gallucci, A. Giachero, A. Nucciotti, A. Puiu

Dipartimento di Fisica, Università degli Studi di Genova, Genoa 16146, Italy
Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Genova, Genoa 16146, Italy
Dipartimento di Fisica "G. Occhialini", Università di Milano - Bicocca, Milan 20126, Italy
Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Milano-Bicocca, Milan 20126, Italy
Gran Sasso Science Institute (GSSI), I-67100 L'Aquila, Italy
INFN - Laboratori Nazionali del Gran Sasso, Assergi (L'Aquila) I-67010, Italy

Received: date / Accepted: date
Abstract
We present a novel technique, called DSVP (Discrimination through Singular Vectors Projections), to discriminate spurious events within a dataset. The purpose of this paper is to lay down a general procedure which can be tailored to a broad variety of applications. After describing the general concept, we apply the algorithm to the problem of identifying nearly coincident events in low temperature microcalorimeters, in order to push the time resolution close to its intrinsic limit. In fact, on simulated datasets it was possible to achieve an effective time resolution even shorter than the sampling time of the system considered. The results obtained are contextualized in the framework of the HOLMES experiment, which aims at directly measuring the neutrino mass with the calorimetric approach; the improved pile-up rejection allows a significant improvement of its statistical sensitivity.
1 Introduction

The demand for increasing sensitivities in present-day experiments requires the development of complex analysis tools, tailored to the design and goals of each experiment. In many experiments, a crucial factor in achieving a high sensitivity is the ability to discriminate spurious events. This is a particularly relevant feature for experiments where the statistics of the spurious events might be comparable with, or even exceed, the statistics of the proper events. This is the case, for instance, of the direct measurement of the neutrino mass with the calorimetric approach [1].
So far, only a few techniques are employed for the purpose of discriminating spurious events from proper ones, and they all require that events belonging to these two families differ in some way from each other. In this paper we outline a novel technique, called DSVP (Discrimination through Singular Vectors Projections), based on a previous work by Alpert et al. [2].
In Section 2 we illustrate the DSVP method, while in the second part of the article (Sections 3 and 4) we show an application of this technique in the context of a direct calorimetric neutrino mass measurement. In particular, we will focus on the detectors of the HOLMES experiment [3], for which the expected main source of background will be unrecognized pile-up events.

2 The DSVP method

The aim of the DSVP algorithm is to discriminate as many as possible of the undesirable events (i.e. spurious events which differ from a reference signal) present in a given dataset.
In order to apply the DSVP technique, the following elements are required:
– The measured dataset, M. This n × d matrix consists of the dataset of interest, where each row is an event described by d variables; the events can thus be seen as points in the R^d space. From now on, we call the good events in the dataset A events, while the B events are the ones to be rejected.
We assume that the A events are more numerous than the B ones (N_A > N_B).
– The expected number of B events, N_B, that the algorithm should discard at most.
– A training dataset, T, such that N_A >> N_B. The events of this n′ × d matrix can be distributed in a different region of R^d with respect to the events in M. For instance, in the case of microcalorimeter signals, the events in T can lie in a different energy range with respect to the events in M.
We will use the training dataset T to define a new vector space which will help us to highlight the features that distinguish an A event from a B one. This new vector space, called from now on the projection space, has dimension k, with k << d. The events can be represented as points in the R^k projection space, so that the A events are distributed differently with respect to the B ones. The idea is to find a model (i.e. hypersurfaces) describing the distribution of the A points of M in this new space, so that the B points can be identified as the ones at a larger distance from what is predicted by the model.
In order to find the model we first need to 'clean' the dataset, obtaining a subset M′ ⊂ M which contains mostly A events, at the expense of deleting also some A events.
The next step is to represent the events of M′ in the projection space and to find the model parameters which describe the distribution of the M′ events.
We then define the discrimination parameter and its threshold to distinguish an A event from a B one in R^k.
Finally, we take the original dataset M, represent its events in the projection space, evaluate the discrimination parameter and discard all the events whose discrimination parameter lies above the threshold found.
The procedure (dataset 'cleaning', model and threshold definition and B discrimination) is then
repeated with the surviving events. At each iteration, the M dataset will contain a smaller fraction of B events.
In the following sections, each step of the algorithm is described in detail.

2.1 Raw cleaning with PCA

In the first step, the aim is to create a suitable dataset for modeling the distribution of the A events in the M matrix, lowering the ratio N_B/N_A at the expense of deleting also some A events. Knowing that the mean 'morphology' of the events is closer to the A ones, we can define a suitable parameter using Principal Component Analysis (PCA) [4] to discard mainly B events.
The procedure is the same as the one described in [2], and is reported here for completeness. The singular value decomposition (SVD) [5] is computed for the n × d matrix M, which is decomposed into a product of three matrices, M = U D V^T. The columns of U and V are the left and right singular vectors respectively, while the entries of the diagonal matrix D are the singular values, ordered from 1 to d in order of importance. Only the first j < d singular values are non-negligible. It is convenient to define a new matrix Û which contains only the first j columns of U, each subtracted by its mean; this is equivalent to centering the data matrix, as required by the PCA.
The columns of Û are vectors of length n. They represent the projections of the mean-centered events contained in M on the right singular vectors (i.e. the columns of V, which are called principal vectors in the PCA framework), with the first column of Û expressing the projections on the first right singular vector and so on. The columns of V are vectors of dimension d which represent the directions of greatest variance of the data in M.
Thanks to the properties of the PCA, an appropriate combination of the projections can be used to define a parameter, called norm, which indicates how close an event is to the mean 'morphology' of the events in M.
The precision matrix (σ²)⁻¹ is computed from the j × j empirical covariance σ² = Û^T Û, and it is used to evaluate the parameter norm for each event i = 1, ..., n in the matrix M:

norm_i = Û_{i,∗} (σ²)⁻¹ Û_{i,∗}^T    (1)

Suppose that we have a guess of how many B events N_B are expected in the dataset. B events deviate disproportionately from the mean in this covariance-adjusted sense, so we discard those with the largest norm and repeat the procedure on the remaining data a total of l times, removing N_B/l events with the largest norm on the l-th iteration. In our tests, we use l = 5.
The iterations guarantee that the mean morphology of the events gets closer and closer to the A events at each cycle, as B events are increasingly eliminated.
After the PCA, we have eliminated

N^PCA_del = Σ_l N_B/l    (2)

events, the B events being the ones predominantly discarded. The number of remaining events after the PCA is m = n − N^PCA_del. We call M′ the m × d matrix of the surviving events, which is mostly composed of A events.

2.2 Define a model for the A events

To discriminate the undesirable events, we now need to define a model which describes the distribution of the A points (the ones belonging to M′) in the projection space. First, we need to define this space. We decompose the T matrix using the SVD. Because the training matrix T is mainly composed of A events, we assume that its first k significant right singular vectors {v_1, v_2, ..., v_k} can constitute a base of the projection space.
The events in M′ are projected onto these vectors.
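The raw cleaning of Sec. 2.1 can be sketched in a few lines of NumPy; the function name and the choice of j retained singular vectors below are ours, not the paper's:

```python
import numpy as np

def pca_raw_cleaning(M, n_expected, n_iter=5, j=3):
    """Sketch of Sec. 2.1: iteratively discard the events of M with the
    largest covariance-adjusted norm of Eq. (1).  M is the (n, d) event
    matrix, n_expected the guess for N_B, j the (assumed) number of
    non-negligible singular vectors."""
    keep = np.arange(M.shape[0])
    for l in range(1, n_iter + 1):
        X = M[keep]
        U = np.linalg.svd(X, full_matrices=False)[0][:, :j]
        U = U - U.mean(axis=0)                      # centre, as required by PCA
        cov = U.T @ U                               # j x j empirical covariance
        norm = np.einsum('ij,jk,ik->i', U, np.linalg.inv(cov), U)  # Eq. (1)
        n_drop = max(1, n_expected // l)            # remove N_B / l events
        keep = keep[np.argsort(norm)[:-n_drop]]     # drop the largest norms
    return M[keep], keep
```

Returning the surviving row indices alongside the cleaned matrix makes it easy to check, on simulated data, which fraction of the discarded events were really B events.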
From now on, each event in M′ will be described by k < d variables, its projections onto the right singular vectors of T. We indicate the coordinates of the M′ points along the i-th base vector of the projection space as p_i = M′ · v_i.
To describe the point distribution in the new vector space, the projections are classified into two groups, the k′ independent projections, indicated as p_ind, and the dependent ones, p_dep:

{p}_k = ( p_1, ..., p_{k′} ; p_{k′+1}, ..., p_k ) = ( p_ind ; p_dep )    (3)

The dependent projections can be expressed as functions of the independent ones. There is no general rule to identify which projection is 'independent' and which one is not, since this is related to the specific problem. The training dataset can be used to identify the dependencies among the projections, as shown in Fig. 1.
Fig. 1: An example of the distribution of the points of T in the projection space, taken from Sec. 3. In this particular case we set k′ = 2, thus describing the point distribution with two curves: p_3 = f_3(p_1, p_2) and p_4 = f_4(p_1, p_2).

The distribution of the dependent projections can now be easily described in an R^{k′+1} subspace by a set of curves f:

p_i = f_i(p_ind);  i = k′+1, ..., k    (4)

Knowing precisely the set of curves {f}, we will be able to differentiate between the two distributions of events, because the projections of the B events will not follow the same curves as those of the A events.
Usually the functional form of the different f is unknown. However, we can approximate each f curve with a Taylor expansion and let a (weighted) linear regression find the best parameters of the expansion. In particular, we use a modified version of the random sample consensus (RANSAC) algorithm [6].
The set of curves {f} which describes the M′ events in the projection space is what we call the model.

2.3 Find a discrimination threshold

The difference between the measured dependent projections and the ones expected from the model is evaluated for each event in the M′ matrix. A residual norm is defined as

d = √( Σ_{j=k′+1}^{k} ( p_j − f_j(p_ind) )² )    (5)

In order to discriminate between the A events, the ones with the lowest residual norm, and the B ones, we need to define a threshold value, d_thr.
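The construction of the projection space and the fit of the curves f can be sketched as follows, for the case k = 3, k′ = 2. For brevity, an ordinary least-squares fit of a second-order Taylor expansion stands in for the modified RANSAC regression actually used, and all function names are ours:

```python
import numpy as np

def taylor_design(p_ind):
    """Second-order Taylor expansion in the k' = 2 independent projections."""
    p1, p2 = p_ind[:, 0], p_ind[:, 1]
    return np.column_stack([np.ones_like(p1), p1, p2, p1 * p2, p1**2, p2**2])

def build_model(T, Mp, k=3):
    """Build the projection-space base from T (Sec. 2.2), fit the curve f
    of Eq. (4) on the cleaned dataset Mp, and return a function computing
    the residual norm d of Eq. (5) for any event matrix."""
    V = np.linalg.svd(T, full_matrices=False)[2][:k].T   # right singular vectors
    P = Mp @ V                                           # projections p_1 .. p_k
    coef = np.linalg.lstsq(taylor_design(P[:, :2]), P[:, 2:], rcond=None)[0]

    def residual_norm(events):
        Q = events @ V
        resid = Q[:, 2:] - taylor_design(Q[:, :2]) @ coef
        return np.sqrt((resid**2).sum(axis=1))           # Eq. (5)
    return residual_norm
```

On simulated data, events drawn from the A distribution yield residual norms close to zero, while events displaced from the fitted surface stand out clearly.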
Since the M′ dataset is mainly made of A events, the threshold is chosen as the highest value of d plus the standard deviation of the d distribution:

d_thr = max{d} + std{d}    (6)

This threshold definition should ensure the inclusion not only of the A events in M′, but also of the A events of the original dataset M which were eliminated by the 'PCA cleaning' described in Sec. 2.1. Nevertheless, this definition of the threshold might need to be adapted to the specific problem considered.

2.4 Apply the model

All the components needed to make the algorithm work are now present: a base for the projection space, a set of curves to model the point distribution in that space and a discrimination threshold.
We now use these tools on the original dataset M, namely:
1. Take the inner product of the events in M with the base of the projection space, determining the projections p_1, ..., p_k.
2. Evaluate the residual norm d using the curves describing the A projection distributions.
Fig. 2: Visual representation of some of the steps of the DSVP technique. As an example, the figure reports signals from a TES microcalorimeter, as explained in Sec. 3.2. This particular case was chosen because there are just 3 non-negligible singular values, so the points in the projection space can be easily shown on a graph. (a) Each event (row) of the matrix M is initially described by 400 variables, i.e. samples; we can represent each event as a waveform or as a point in R⁴⁰⁰. M contains two different types of events: single pulses (A) of energy E and pile-up pulses with different arrival times (B) with energies E_1 and E_2 such that E_1 + E_2 = E. (b) The events of M are represented in the projection space. In this space, the two types of events follow two different distributions. (c) In the left (right) panel the matrix M (M′) is represented in the projection space. It is possible to appreciate how the PCA has drastically reduced the fraction of pile-up. (d) The curve f_3 = f(p_1, p_2), which describes the distribution of the events in M′, is used to discriminate between the single pulses and the pile-up pulses.

3. The events with a residual norm above the threshold are discarded.
After the third step, we will have discarded N_del events, almost all of which, if not all, will be spurious B events. All the previous steps (PCA, model and threshold definition) are now repeated with a reduced number of expected B events, N_B′ = N_B − N_del. The iterations successively improve the representation of the A events, as B events are increasingly eliminated. The algorithm stops when N_del = 0 or when N_B′ = 0.

3 Application to the HOLMES experiment

The algorithm described in Sec. 2 is now applied in the framework of HOLMES.
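Summarizing Sec. 2 in code, the outer loop of Sec. 2.4 can be sketched as below; `clean` and `make_residual_norm` are hypothetical stand-ins for the raw cleaning of Sec. 2.1 and for the model of Secs. 2.2-2.3:

```python
import numpy as np

def dsvp(M, T, n_expected, clean, make_residual_norm):
    """Iterate cleaning, model building and thresholding (Sec. 2.4).
    clean(X, n) -> cleaned subset X'; make_residual_norm(T, Mp) -> d(events).
    Stops when no event is discarded (N_del = 0) or when the budget
    n_expected of events to discard is exhausted (N_B' = 0)."""
    keep = np.ones(len(M), dtype=bool)
    n_b = n_expected
    while n_b > 0:
        Mp = clean(M[keep], n_b)                 # raw cleaning (Sec. 2.1)
        d = make_residual_norm(T, Mp)            # model (Secs. 2.2-2.3)
        d_thr = d(Mp).max() + d(Mp).std()        # threshold, Eq. (6)
        above = d(M[keep]) > d_thr
        if not above.any():
            break                                # N_del = 0: stop
        idx = np.flatnonzero(keep)
        keep[idx[above]] = False                 # discard B-like events
        n_b -= int(above.sum())                  # N_B' = N_B - N_del
    return M[keep]
```

Because the surviving events are tracked with a boolean mask, the original ordering of the dataset is preserved across iterations.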
The HOLMES experiment will perform a direct measurement of the neutrino mass with a sensitivity of the order of 1 eV, measuring the energy released in the electron capture (EC) decay of ¹⁶³Ho, as proposed by De Rújula and Lusignoli in [7]. It will also demonstrate the scalability of the calorimetric technique for next generation experiments that could go beyond the current best expected sensitivity of 0.1 eV [8]. In order to reach this sensitivity, HOLMES will use low temperature TES microcalorimeters with ¹⁶³Ho implanted in their absorbers, with an activity of 300 Hz per detector.
The effect of a non-zero neutrino mass on the ¹⁶³Ho EC decay spectrum can be appreciated only in an energy region very close to the end point, where the count rate is low and the ratio of nearly-coincident events, called pile-up events, to single events is greater than one. If a pile-up event is composed of two events of energies E_1 and E_2 which occur within a time interval shorter than the time resolution of the detector, it is recorded as a single event with energy E ≃ E_1 + E_2. Thus, if not correctly identified, pile-up events will distort the decay spectrum of ¹⁶³Ho, lowering the sensitivity to m_ν.
Fig. 3: Expected neutrino mass sensitivity for the HOLMES experiment. The right panel shows the sensitivity as a function of the detector energy resolution for different pile-up fractions. The pile-up fraction over the whole energy spectrum is denoted f^tot_pp.

The neutrino mass sensitivity of HOLMES has been evaluated through Monte Carlo simulations [9], see Fig. 3. Having fixed the number of recorded events to 3 × 10¹³, the simulations have shown that the sensitivity on the neutrino mass does not depend strongly on the energy resolution of the detector (as long as ΔE <
10 eV), but rather on the pile-up fraction f_pp, i.e. the ratio of the number of pile-up events to the number of single events. Its reduction is crucial for the success of the experiment.
Using the terminology of Sec. 2, in the HOLMES experiment an A event is a signal caused by a single energy deposition in the microcalorimeter detector, while a B event is an unrecognized pile-up event. Each signal is a collection of records I_i of the detector's bias current acquired at instants t_i = i × t_samp, where t_samp is the sampling time of the readout system. An example of a microcalorimeter signal is shown in Fig. 2 (a). With the current setup, the sampling time is fixed at 2 µs.
We tested the robustness and efficiency of the algorithm through many simulations which aim at emulating the results expected from the HOLMES experiment. The HOLMES TES microcalorimeters do not have the ¹⁶³Ho implanted yet, therefore a test on real data will be performed at a later time.

3.1 Energy spectrum & ROI definitions
¹⁶³Ho decays via electron capture to an atomic excited state of ¹⁶³Dy, which relaxes mostly by emitting atomic electrons (i.e. the fluorescence yield is negligible). At first order, the probability density of the de-excitation energy spectrum E_c is proportional to

dλ_EC/dE_c ∝ (Q − E_c) √( (Q − E_c)² − m_ν² )    (7)

Since a non-zero neutrino mass affects the shape of the spectrum only in an energy range close to the end point, the one-hole de-excitation spectrum is a good approximation for our purpose. Therefore, the pulses are generated according to the one-hole spectrum with Q = 2.833 keV, m_ν = 0 and with energies between 2.650 keV and 2.900 keV. Although the optimal region of interest (ROI) for determining the neutrino mass will be fixed only when actual data are collected, this energy range can be considered a reasonable ROI.
Each detector must be treated separately with the DSVP technique in order to account for their slightly different characteristics. Thus, to reproduce the statistics expected for a single detector with a target activity of 300 Hz over two years of data taking, we generated 40000 events in the ROI.

3.2 Detector models

For this study we modeled three different TES microcalorimeters, with either the one-body model [10] or the two-body 'dangling' model [11]. In both cases the current pulse profile is obtained by solving the system of electro-thermal differential equations with the fourth-order Runge-Kutta method (RK4), using the transition resistance proposed in [12] to take into account the TES non-linear behavior. To these pulses a noise waveform is added, generated as an autoregressive moving average ARMA(1,1) process with a power spectrum given by the Irwin-Hilton model.
To test the DSVP effectiveness with slightly different signal shapes, the physical parameters in the differential equations are chosen to describe three types of detectors:
a. the detectors in [2], which are characterized by a non-linear response and one thermal body;
b.
the target detectors of HOLMES [13], which have a nearly-linear response and behave according to a two-thermal-body model;
c. detectors with the same nominal design as b. but a different production process, resulting in a significantly weaker link towards the thermal bath. Despite this difference, these detectors show a linear response to energy depositions, with a two-body feature.

Fig. 4: Pulse profiles corresponding to different energies from 0.5 to 5 keV for two different detectors, with non-linear (det. a.) and nearly-linear (det. b.) response. In order to compare the signals, all the pulses are normalized by dividing their amplitude by the energy.

3.3 DSVP & HOLMES

We indicate the ratio between the number of pile-up pulses and the number of single pulses in the ROI as f^ROI_pp. From simulations, setting a time resolution of 10 µs, a value of f^ROI_pp greater than one is expected. The M matrix, which contains the ROI events, must instead have N_A > N_B; thus f^ROI_pp needs to be lowered below one. To reduce this ratio many different strategies can be adopted; a non-exhaustive list follows.
– Adding an additional calibration source.
By adding a source characterized by a monochromatic X-ray emission in the ROI, the number of single pulses in the ROI can be increased while keeping the number of pile-up pulses unchanged. This approach can be very useful because it reshapes the energy spectrum, potentially reducing the probability of discarding single events with energy very close to the end point. A similar approach was investigated by Alpert [2].
– Volumetric cuts.
The events of the training dataset T are distributed in a finite volume of the k-dimensional projection space. The single pulses in the ROI reasonably lie within the same portion of space, while the pile-up events are expected to be distributed in a different region. Thus, if we select only the points in the projection space lying inside the volume which includes the training dataset, we can easily eliminate a large fraction of pile-up events. Before evaluating their projections on the right singular vectors of T, the T and ROI events are normalized to set their amplitude equal to one. Then, we define the region of the k-space in which the T events are distributed, and we enlarge it by a small amount in order to account for small non-linearity effects. Finally, we select only the events of the ROI included inside this region. This method can achieve a good time resolution, but it works only if the detector response does not depart too much from linearity: in our simulations it works for detectors b. and c., but not for a.
– Filtering.
A few filtering techniques allow achieving an effective time resolution close to the sampling time. Among these, a particular Wiener filter, described in [14], is probably the best technique to achieve this goal.
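A minimal sketch of the volumetric cut described above, with an axis-aligned bounding box standing in for the generic enlarged region, and all names ours:

```python
import numpy as np

def volumetric_cut(T, roi, k=3, margin=0.5):
    """Keep only the ROI events whose projections fall inside the (inflated)
    bounding box of the training-set projections.  The box is a simplifying
    assumption; `margin` inflates it generously to absorb small
    non-linearity effects."""
    def normalise(X):
        return X / np.abs(X).max(axis=1, keepdims=True)   # amplitude = 1
    V = np.linalg.svd(normalise(T), full_matrices=False)[2][:k].T
    PT = normalise(T) @ V                    # training-set projections
    lo, hi = PT.min(axis=0), PT.max(axis=0)
    pad = margin * (hi - lo)                 # enlarge the region
    P = normalise(roi) @ V
    inside = np.all((P > lo - pad) & (P < hi + pad), axis=1)
    return inside
```

`roi[volumetric_cut(T, roi)]` then contains mostly single pulses; since normalization removes the amplitude information, the cut acts on pulse shape only, which is why it degrades for strongly non-linear detectors.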
Fig. 5: Simulated first-order de-excitation spectrum of ¹⁶³Ho: the M1 peak provides the training matrix T, while the ROI provides the matrix M. Near the end point the single pulse counts are outnumbered by the pile-up counts.

4 Results

For the HOLMES purposes, the most suitable and practical methods to fulfill the N_A > N_B condition in the ROI are the Wiener filter and the volumetric cuts. As indicated in Table 1, applying these algorithms to the ROI events, f^ROI_pp can be reduced well below one. The events of the M1 capture peak (E ≈ 2 keV) are used to build the training dataset T; this is also the energy range in which the average signal for the Wiener filter is defined. The M1 peak is the most suitable region for two reasons: it is the peak closest to the ROI, thus reducing the non-linearity effects on the filters and on the discrimination algorithm, and it fulfills the condition N_B << N_A. The f_pp in this region is expected to be small, and it can be further reduced with a raw cleaning with PCA, as described in Sec. 2.1.
To quantify the efficiency of the pile-up discrimination algorithms, we define an effective time resolution τ_eff as the ratio of the number of retained pile-up records to single-pulse records after the algorithm, divided by the same ratio for the raw data, times 10 µs:

τ_eff = (pup/single)_final ÷ (pup/single)_initial × 10 µs    (8)

Simulations have shown that even a small fraction of false negatives modifies the single events spectrum and leads to a systematic error on the neutrino mass evaluation. We note that in our simulations no single pulse event was mistaken for pile-up: the DSVP technique described in Sec. 2 is designed to leave the A events unaffected.
In applications where a more aggressive discrimination of the B events is required, it is possible to adapt the algorithm toward this goal, for example by adjusting the threshold definition (Eq. 6), at the expense of increasing the chance of deleting some A events.

4.1 DSVP with Wiener filter and volumetric cuts

We estimated τ_eff on the simulated data processed with the DSVP after lowering the initial f^ROI_pp with either the Wiener filter or the volumetric cut technique. Furthermore, before being processed by the DSVP algorithm, the signals were also whitened, i.e. transformed so as to have white noise, by a fast Cholesky-factor backsolve procedure [15]. The results are reported in Table 1. All the simulations showed that the DSVP is able to reach a time resolution shorter than the sampling time of the signal.
Table 1 shows that the time resolution strongly depends on the sampling time, the faster the better, but also on the rise time of the pulse. While the sampling frequency is constrained by the readout resources, there is more scope to change the rise time of the detectors, acting on the electrical time constant of the biasing circuit.
Also, a non-linear detector response generally improves the efficiency of pile-up recognition algorithms. When two nearly-coincident energy depositions happen inside the TES, the detector has different starting conditions for the second one. The shape of a pile-up pulse therefore differs from a single pulse much more for a non-linear TES than for a linear one, allowing the algorithms to recognize it more efficiently.
As stressed in Sec. 2, the only external parameter required by the DSVP algorithm is the number of events that it should discard at most, N_B. To quantify the influence of this parameter on the effectiveness of the algorithm, we fixed the dataset M and varied N_B, computing the effective time resolution each time. Figure 6 shows that no false positive was detected even when the number of events to eliminate was wrong by up to 50%.
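The whitening step can be sketched as follows; the noise covariance is estimated here from pulse-free records, and `np.linalg.solve` stands in for the dedicated fast triangular backsolve of [15]:

```python
import numpy as np

def whiten(records, noise_records):
    """Estimate the noise covariance from pulse-free records, Cholesky-factor
    it, and backsolve each record: in the transformed records the noise is
    white (unit covariance)."""
    C = np.cov(noise_records, rowvar=False)              # d x d noise covariance
    C += 1e-10 * np.trace(C) / len(C) * np.eye(len(C))   # tiny regularisation
    L = np.linalg.cholesky(C)                            # C = L L^T
    return np.linalg.solve(L, records.T).T               # each record -> L^{-1} x
```

After this transformation, template fits and the SVD steps of the DSVP operate on records whose noise is uncorrelated sample to sample.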
Fig. 6:
Left panel: ROI energy spectrum before and after the application of the WF/volumetric cuts and of the DSVP technique. The light line represents the energy spectrum with a τ_eff of 10 µs, the solid line with a τ_eff of 1.7 µs. Right panel: dependence of τ_eff and of the average percentage of false positives F⁺ on the input parameter N_B (N_in), normalized by the number of pile-up pulses present in the ROI (N_true).

Table 1: Effective time resolution of the various algorithms. We indicate with (*) the algorithm used in that simulation to lower f^ROI_pp below one. For simplicity, we always set N_B equal to the exact number of pile-up pulses in the ROI. The errors associated with the DSVP τ_eff are ≤ 5% and are due to the random nature of the modified RANSAC minimization algorithm.
Detector type | Rise time [µs] | t_sample [µs] | τ_eff Wiener filter | τ_eff volumetric cuts | τ_eff with DSVP
b. | 11 | 2 | 2.26 | 2.12 (*) | 1.55
b. | 17 | 2 | 2.37 (*) | 2.60 | 1.55
b. | 22 | 2 | 2.94 | 2.90 (*) | 2.01
b. | 17 | 1 | 1.66 (*) | 2.00 | 0.94
a. | 10 | 2 | 1.82 (*) | - | 1.24
c. | 19 | 2 | 2.70 (*) | 3.54 | 1.82

4.2 DSVP with an external calibration source

As an alternative approach, we lowered f^ROI_pp by adding an external source of single events with energy inside the region of interest, instead of using preliminary filters. We added a source based on the Lα X-ray emission lines of Pd (2.833, 2.839 keV). Figure 7 shows that increasing the number of photons of the Pd source (thus decreasing f^ROI_pp) improves the effective time resolution of the DSVP. Moreover, τ_eff always remains below the sampling time, even for a pile-up fraction up to 0.9.
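Eq. (8) reduces to simple counting; a minimal helper (names ours):

```python
def effective_time_resolution(pup_final, single_final, pup_initial, single_initial,
                              raw_resolution_us=10.0):
    """Eq. (8): surviving pile-up fraction divided by the initial one,
    scaled by the 10 us time resolution assumed for the raw data."""
    return ((pup_final / single_final)
            / (pup_initial / single_initial) * raw_resolution_us)
```

For instance, a run that keeps all single pulses while retaining 155 of 1000 pile-up pulses yields τ_eff = 1.55 µs, the kind of value reported in Table 1.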
Fig. 7:
Left panel: ROI energy spectrum with the Pd Lα peaks, before and after the application of the DSVP technique. The initial f^ROI_pp was set to 0.58.
Right panel: dependence of τ_eff on f^ROI_pp for detector b., with a rise time of 17 µs and a sampling time of 2 µs. In this case, N_B was set equal to the number of pile-up pulses in the ROI.

5 Conclusions

The DSVP algorithm represents a very powerful technique to decrease the number of undesirable events in a dataset. It does not rely on any particular assumption about the nature of the events, so in principle it can be applied in various scenarios. In this work we applied this improved algorithm to pile-up discrimination, which can lead to a major improvement in experimental sensitivity for experiments such as HOLMES (neutrino mass measurement) or CUPID (neutrinoless double beta decay) [16]. It could also be useful to distinguish the single-site events of 0νββ interactions from multi-site background events in GERDA [17]. We tested the DSVP technique for the HOLMES application and compared its efficiency, represented in this case by the effective time resolution τ_eff, to more 'classical' discrimination techniques.
With the target detectors of HOLMES, the DSVP technique allows us to reduce the total fraction of pile-up events substantially, lowering τ_eff from 10 µs to about 1.5 µs and thus improving the neutrino mass sensitivity Σ(m_ν) from 2 eV to about 1.4 eV. To put this result in perspective, achieving the same improvement in Σ(m_ν) without pile-up rejection would require increasing the acquisition time by a factor of 4: from 3 to 12 years.

References
1. A. Nucciotti, Advances in High Energy Physics (2016)
2. B. Alpert, et al., J. Low Temp. Phys. (1-2), 263 (2016). DOI 10.1007/s10909-015-1402-y
3. B. Alpert, M. Balata, D. Bennett, M. Biasotti, C. Boragno, C. Brofferio, V. Ceriale, D. Corsini, P.K. Day, M. De Gerone, et al., The European Physical Journal C (3), 112 (2015)
4. I. Jolliffe, Principal Component Analysis (Springer Berlin Heidelberg, Berlin, Heidelberg, 2011), pp. 1094-1096. DOI 10.1007/978-3-642-04898-2_455
5. G.H. Golub, C.F. Van Loan, Matrix Computations (3rd Ed.) (Johns Hopkins University Press, USA, 1996)
6. RANSAC in scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RANSACRegressor.html
7. A. De Rújula, M. Lusignoli, Physics Letters B (4-6), 429 (1982)
8. M. Aker, K. Altenmüller, M. Arenz, M. Babutzka, J. Barrett, S. Bauer, M. Beck, A. Beglarian, J. Behrens, T. Bergmann, et al., Physical Review Letters (22), 221802 (2019)
9. A. Nucciotti, The European Physical Journal C (11), 3161 (2014)
10. K.D. Irwin, G.C. Hilton, in Cryogenic Particle Detection (Springer, 2005), pp. 63-150
11. I.J. Maasilta, AIP Advances (4), 042110 (2012)
12. B. Cabrera, Journal of Low Temperature Physics (1-2), 82 (2008)
13. B. Alpert, D. Becker, D. Bennet, M. Biasotti, M. Borghesi, G. Gallucci, M. De Gerone, M. Faverzani, E. Ferri, J. Fowler, et al., The European Physical Journal C (4), 304 (2019)
14. E. Ferri, et al., J. Low Temp. Phys. (1-2), 405 (2016). DOI 10.1007/s10909-015-1466-8
15. J.W. Fowler, B.K. Alpert, W.B. Doriese, D. Fischer, C. Jaye, Y.I. Joe, G. O'Neil, D. Swetz, J. Ullom, The Astrophysical Journal Supplement Series (2), 35 (2015)
16. D. Chernyak, F. Danevich, A. Giuliani, E. Olivieri, M. Tenconi, V. Tretyak, Eur. Phys. J. C, 1989 (2012). DOI 10.1140/epjc/s10052-012-1989-y
17. M. Agostini, et al., Eur. Phys. J. C 73