NNETFIX: An artificial neural network-based denoising engine for gravitational-wave signals

Abstract

Instrumental and environmental transient noise bursts in gravitational-wave detectors, or glitches, may impair astrophysical observations by adversely affecting the sky localization and the parameter estimation of gravitational-wave signals. Denoising of detector data is especially relevant during low-latency operations because electromagnetic follow-up of candidate detections requires accurate, rapid sky localization and inference of astrophysical sources. NNETFIX is a machine learning-based algorithm designed to remove glitches detected in coincidence with transient gravitational-wave signals. NNETFIX uses artificial neural networks to estimate the portion of the data lost due to the presence of the glitch, which allows the recalculation of the sky localization of the astrophysical signal. The sky localization of the denoised data may be significantly more accurate than the sky localization obtained from the original data or by removing the portion of the data impacted by the glitch. We test NNETFIX in simulated scenarios of binary black hole coalescence signals and discuss the potential for its use in future low-latency LIGO-Virgo-KAGRA searches. In the majority of cases for signals with a high signal-to-noise ratio, we find that the overlap of the sky maps obtained with the denoised data and the original data is better than the overlap of the sky maps obtained with the original data and the data with the glitch removed.


Kentaro Mogushi

Institute of Multi-messenger Astrophysics and Cosmology, Missouri University of Science and Technology, Physics Building, 1315 N. Pine St., Rolla, MO 65409, USA

Ryan Quitzow-James

Institute of Multi-messenger Astrophysics and Cosmology, Missouri University of Science and Technology, Physics Building, 1315 N. Pine St., Rolla, MO 65409, USA

Marco Cavaglià

Institute of Multi-messenger Astrophysics and Cosmology, Missouri University of Science and Technology, Physics Building, 1315 N. Pine St., Rolla, MO 65409, USA

Sumeet Kulkarni

Department of Physics and Astronomy, University of Mississippi, MS 38677-1848, USA

Fergus Hayes

SUPA, School of Physics and Astronomy, University of Glasgow, Glasgow G12 8QQ, United Kingdom

14 January 2021


1. Introduction

The field of gravitational-wave (GW) astronomy began with the first direct detection of a GW signal from a binary black hole (BBH) merger [1] on September 14th, 2015. Nine additional BBH mergers were detected with high confidence during the first and second LIGO [2] and Virgo [3] observation runs [4]. During the third LIGO-Virgo observation run, 39 binary merger events were detected with high confidence [5], including two exceptional BBH events [6–8] and a possible neutron star and black hole (NSBH) merger [9].

On August 17th, 2017, the first detection of a GW signal from a binary neutron star (BNS) merger, GW170817, expanded multi-messenger astronomy to include GW observations [10]. A short gamma-ray burst (GRB) was detected approximately 1.7 seconds after the BNS merger time [11]. The sky map calculated from the GW signal allowed the identification of the event with an electromagnetic (EM) counterpart [10, 11]. The association of this GW event with the observed EM transients supports the long-hypothesized model that at least some short GRBs are due to BNS coalescences [12] and has provided many insights into fundamental astrophysics and cosmology. In April 2019, a second BNS merger without an EM counterpart was detected [13].

In order to detect GW signals, ground-based GW detectors must be extremely sensitive, which makes them highly susceptible to instrumental and environmental noise [4]. In particular, transient noise bursts, or glitches, may impair the quality of detector data. The presence of a glitch in the proximity of a GW signal can adversely affect the analysis of the latter, including the calculation of the sky localization of the source. The most notable example of such an occurrence was GW170817, where the effect of a glitch was mitigated in low-latency by removing the contaminated portion of the data and in follow-up studies by applying ad hoc mitigation algorithms [10, 14].

One possibility to mitigate the effect of a contaminating glitch would be to discard the data from the affected detector. This is the simplest and fastest solution; however, it is also likely to impact the analysis and sky localization, especially in cases where data is only available from two detectors. Another technique that can be used in low-latency is gating, which removes the data affected by the glitch. One method of gating is to set the data affected by the glitch to zero using a window function to smoothly transition into and out of the gate [15]. Gating was used in [...]

Figure 1. Left: Whitened time series of a simulated BBH signal with two-detector network signal-to-noise ratio (SNR) $\rho_N = 42.4$ and component masses $(m_1, m_2) = (35, 29)\,M_\odot$ in advanced LIGO (aLIGO) recolored Gaussian noise (gray curve). A 130 ms gate is applied 30 ms before the geocentric merger time (red curve). The vertical black-dashed line denotes the merger time in LIGO-Hanford (H1). Right: The 90% sky localization error regions from the full data (gray area) and the gated data (red empty contours). The star indicates the true sky position of the simulated signal.

The higher detector sensitivity of LIGO and Virgo in their third observing run has led to an increased number of GW candidate detections from different astrophysical populations [6, 9, 13]. Future observation runs with higher sensitivity are expected to produce even greater detection rates, which would lead to higher [...]

2. Algorithm implementation, training and testing

We consider a scenario in which a transient BBH GW signal is observed by a network of at least two detectors and the data of one detector is partially gated to remove a glitch. Without loss of generality, we perform the analysis for the two LIGO detectors, LIGO-Hanford (H1) and LIGO-Livingston (L1), with the gating applied to data from the H1 detector. We assume the merger time at the geometric center of Earth (or geocentric merger time) to be (approximately) known from L1 data. We denote with $s_f(t)$, $s_g(t)$ and $s_r(t)$ the full time series, the gated time series, and the NNETFIX reconstructed time series, respectively. The output of NNETFIX can be thought of as the map

$$ s_r(t) := F[s_g(t)] . \tag{1} $$

We train an ANN regression algorithm to construct the map $F$ such that $s_r(t) \sim s_f(t)$. The NNETFIX implementation uses the scikit-learn [18] Multi-Layer Perceptron (MLP) Regressor, a type of ANN in which the nodes (mathematical functions) are arranged into layers and connected to every node in the preceding and/or succeeding layers [19]. Each node calculates a weighted linear combination [...] (ReLU) activation function [20] and the ADAM stochastic gradient-based optimizer [21] with a learning rate of $10^{-[...]}$. Ten percent of the training data samples are set aside and used for validating the training. The training iteration stops if the ANN performance plateaus with a tolerance level of $10^{-[...]}$ to avoid overtraining. To reconstruct the gated portion of the time series, one hidden layer works better than multiple hidden layers for the loss function of mean square error and the number of hidden layers tested. The values of the loss function depend only weakly on the number of neurons.

To train the algorithm, we first build template banks of simulated non-spinning IMRPhenomD BBH merger waveforms [22] with varying intrinsic and extrinsic parameters. To reduce the potential for overtraining, each template bank also includes a number of (pure) noise time series. We distribute the positions of the injected signals isotropically in the sky. The waveform coalescence phase, polarization angle, and cosine of the inclination angle are uniformly distributed in the intervals $[0, \pi]$, $[0, \pi]$, and $[-$[...] $\rho_N$ [15] of the simulated signals in the range [11.[...] $t_d$ = (50, 75, 130) ms and gate end-times before the geocentric merger time $t_e$ = (15, 30, 90, 170) ms. The time series are sampled at 4096 Hz, whitened, and high-passed. A conservative value of 25 Hz is used for the high-pass filter. The gates are implemented as a reversed Tukey window with a taper of 0.1 s and held fixed with respect to geocentric merger time; however, the merger time seen in the H1 detector naturally shifts due to the [...]

Table 1. Component mass ranges, number of waveforms ($n_s$), number of pure noise series ($n_n$), and dimension of the TT sets for each of the three scenarios and combinations of gate durations and end-times.

Scenario   m_1 [M_⊙]   m_2 [M_⊙]   n_s   n_n    Set dimension (n_s × 50 + n_n)
Low        10–15       8–12        348   1900   19300
Medium     15–25       12–18       251   1350   13900
High       28–42       23–35       61    300    3350

We test the effectiveness of the ANNs by calculating the coefficient of determination for the MLP Regressor in scikit-learn on the testing sets [18]. The coefficient of determination ranges from $-\infty$ (bad) to 1 (perfect estimation), with positive values corresponding to some degree of accuracy. We evaluate the coefficient of determination on each testing set after training the ANN on the corresponding training set. The ranges of the coefficient of determination for the testing sets are [0.773, 0.882], [0.750, 0.883], and [0.691, 0.879] for the low-mass, medium-mass, and high-mass scenarios, respectively, and the means are 0.833, 0.827, and 0.814.

We test for potential statistical effects in the training method by considering the medium-mass scenario with a gate duration of 50 ms and a gate end-time of 30 ms as a representative case. For 100 trials, we find that the coefficient of determination ranges from 0.800 to 0.826 with a mean of 0.815, which is consistent with the ranges of the testing sets.

The effect of NNETFIX on quantities such as SNR and sky localization varies for different component masses, network SNR, and gate settings. Therefore, we construct 108 additional independent exploration sets with fixed network SNR $\rho_N$ = (11.3, 28.3, 42.4) and component masses of (12, 10), (20, 15), (35, 29) $M_\odot$, and identical combinations of gate durations and end-times as the TT sets. Each exploration set consists of 512 independent time series with the remaining parameters distributed as in the TT sets.
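As a rough illustration of the map in Eq. (1), the following sketch gates toy time series with an inverted Tukey window and trains a scikit-learn `MLPRegressor` with one hidden layer, ReLU activation, the Adam solver, and 10% of the training samples held out for validation. Everything else is an assumption made for illustration only: the toy chirps in white noise, the 1024 Hz sampling rate, the hidden-layer width, and the learning-rate and tolerance values (scikit-learn defaults) are not taken from the paper, and the helper names are hypothetical.

```python
import numpy as np
from scipy.signal.windows import tukey
from sklearn.neural_network import MLPRegressor

FS = 1024   # toy sampling rate [Hz]; the text uses 4096 Hz
N = 256     # samples per series (0.25 s)


def inverse_tukey_gate(n_samples, start, gate_len, taper_len):
    """Window equal to 1 away from the gate and 0 inside it, with cosine tapers."""
    total = gate_len + 2 * taper_len
    lo = start - taper_len
    window = np.ones(n_samples)
    window[lo:lo + total] = 1.0 - tukey(total, 2 * taper_len / total)
    return window


def toy_series(rng, n_series):
    """Toy chirps in white Gaussian noise (a stand-in for whitened BBH data)."""
    t = np.arange(N) / FS
    f0 = rng.uniform(30, 60, size=(n_series, 1))      # start frequency [Hz]
    k = rng.uniform(200, 400, size=(n_series, 1))     # chirp rate [Hz/s]
    phase = rng.uniform(0, 2 * np.pi, size=(n_series, 1))
    amp = 3.0 * (t / t[-1]) ** 2                      # grows toward the "merger"
    signal = amp * np.sin(2 * np.pi * (f0 * t + 0.5 * k * t**2) + phase)
    return signal + rng.normal(size=(n_series, N))


rng = np.random.default_rng(0)
gate = inverse_tukey_gate(N, start=150, gate_len=51, taper_len=10)  # ~50 ms gate

s_full = toy_series(rng, 600)
s_gated = s_full * gate                  # s_g(t): input of the map F
x_train, x_test = s_gated[:500], s_gated[500:]
y_train, y_test = s_full[:500], s_full[500:]

model = MLPRegressor(
    hidden_layer_sizes=(64,),            # one hidden layer; width is an assumption
    activation="relu",
    solver="adam",
    learning_rate_init=1e-3,             # assumption (scikit-learn default)
    early_stopping=True,                 # stop when the validation score plateaus
    validation_fraction=0.1,             # 10% of training samples for validation
    tol=1e-4,                            # assumption (scikit-learn default)
    max_iter=300,
    random_state=0,
)
model.fit(x_train, y_train)

s_rec = model.predict(x_test)            # s_r(t) = F[s_g(t)]
r2 = model.score(x_test, y_test)         # coefficient of determination
```

Here `r2` is the coefficient of determination returned by `MLPRegressor.score`, which is at most 1 and can be negative for a poor fit.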

3. Performance in the time-domain

NNETFIX's performance in estimating the full time series can be assessed by computing the amount of SNR lost in the reconstruction process. We define the fractional residual SNR (FRS)

$$ \mathrm{FRS} = \frac{\rho_f - \rho_r}{\rho_f} , \tag{2} $$

where $\rho_f$ and $\rho_r$ are the (single interferometer) peak SNR of the full time series and the reconstructed time series in H1, respectively. Positive values of FRS close to zero generally indicate accurate time series reconstructions. However, FRS $\sim$ [...] $\rho_f \sim \rho_g \sim \rho_r$. These cases can be separated by the fractional SNR gain (FSG)

$$ \mathrm{FSG} = \frac{\rho_r - \rho_g}{\rho_g} , \tag{3} $$

where $\rho_g$ is the peak SNR of the gated series. The FSG characterizes the amount of SNR gained by the reconstructed time series in comparison to the gated time series. Typically, NNETFIX performance is better for smaller values of FRS and larger values of FSG.

Median values of FRS across the exploration sets range from FRS = −0.09 (high-mass case with $\rho_N = 11.3$, $t_d$ = 50 ms and $t_e$ = 170 ms) to FRS = 0.22 (medium-mass case with $\rho_N = 28.3$, $t_d$ = 130 ms and $t_e$ = 15 ms). Sets with smaller gate durations are generally characterized by lower FRS values. All exploration sets with $t_d$ = 50 ms have FRS < 0.1. The fraction of sets with FRS below this threshold reduces to 0.78 and 0.55 for gate durations of 75 ms and 130 ms, respectively. Similarly, exploration sets with gates that are farther away from the time of the merger also tend to have a lower median value of the FRS. All sets with gate end-times at 170 ms before merger have median FRS < 0.07, while only 81%, 55% and 11% of the samples with $t_e$ = 90, 30 and 15 ms have FRS below this threshold, respectively. The effects of the network SNR and component masses on FRS are less significant. About 83% of the sets with $\rho_N = 11.3$ [...] < [...] $\rho_N = 28.3$ [...] $\rho_N = 42.4$. The percentages of low-mass, medium-mass and high-mass sets with FRS < [...]

[...] ∼0.02 (low-mass, low-SNR case with $t_d$ = 50 ms and $t_e$ = 170 ms) to ∼0.89 (high-mass, low-SNR case with $t_d$ = 130 ms and $t_e$ = 15 ms). High-mass (low-mass) exploration sets are typically characterized by higher (lower) values of the FSG. All high-mass sets have FSG > 0.08, while all low-mass sets have FSG < 0.07. Gate end-time and SNR do not seem to significantly affect the value of FSG. The gate duration has a larger effect. Sets with longer gate durations typically have larger median values of the FSG. Two thirds of the sets with $t_d$ = 130 ms have FSG > 0.08 compared to only 42% of the sets with $t_d$ = 50 ms.

A combined threshold of FRS ≳ [...] ≳ 0.01 is a conservative choice for good reconstructions. About 70% of the samples across all the exploration sets satisfy this criterion. Figure 2 shows the NNETFIX data reconstruction for the time series of Fig. 1. As shown in Fig. 3, the reconstructed time series recovers a large signal energy in the gated portion. In this case, FRS = 0.05 and FSG = 0.[...]

Figure 2. The whitened full time series (gray), the gated time series (red) and the reconstructed time series (blue) for the simulated event of Fig. 1. The vertical black-dashed line denotes the merger time in H1.
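The two SNR-based metrics of Eqs. (2) and (3) are simple ratios. A minimal sketch follows, with hypothetical function names and made-up peak SNR values chosen only to show how the two numbers behave together.

```python
def fractional_residual_snr(rho_full: float, rho_rec: float) -> float:
    """Eq. (2): fraction of the full-data peak SNR not recovered by the reconstruction."""
    return (rho_full - rho_rec) / rho_full


def fractional_snr_gain(rho_rec: float, rho_gated: float) -> float:
    """Eq. (3): peak SNR gained by the reconstruction relative to the gated data."""
    return (rho_rec - rho_gated) / rho_gated


# Hypothetical peak SNRs for a single sample (not taken from the paper).
rho_f, rho_g, rho_r = 20.0, 16.0, 19.0
frs = fractional_residual_snr(rho_f, rho_r)   # small FRS: little SNR is lost
fsg = fractional_snr_gain(rho_r, rho_g)       # positive FSG: SNR is regained

# If the gate removes almost no SNR, FRS is near zero without any real gain;
# FSG is then near zero as well, which is why both metrics are used together.
frs_flat = fractional_residual_snr(20.0, 20.0)
fsg_flat = fractional_snr_gain(20.0, 20.0)
```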

Figure 3. Time frequency representations of the full (left), gated (middle), and reconstructed (right) time series for the simulated event of Fig. 2 using the Q transform [27]. The vertical red-dashed line denotes the gate and the vertical white-dotted line denotes the merger time in H1.

A complementary metric to evaluate the performance of the algorithm is the fractional match gain (FMG)

$$ \mathrm{FMG} = \frac{M_r - M_g}{M_f - M_g} , \tag{4} $$

where the match $M_i$ between a time series $s_i$ and the injected waveform $h$ is

$$ M_i = \frac{\langle s_i | h \rangle}{\sqrt{\langle s_i | s_i \rangle \langle h | h \rangle}} . \tag{5} $$

The inner product of two time series $s_i$ and $s_j$ is defined as

$$ \langle s_i | s_j \rangle = 4 \, \Re \int_{f_0}^{f_N} \frac{\tilde{s}_i(f) \, \tilde{s}_j^{*}(f)}{S(f)} \, df , \tag{6} $$

where the tilde indicates the Fourier transform, the star denotes the complex conjugate, $S(f)$ is the detector noise power spectral density (PSD), $f_0$ is the high-pass frequency, and $f_N$ is the Nyquist frequency.

In Eq. (4), we assume $M_f - M_g > 0$. In rare instances (0.5% of all exploration set data samples), $M_g$ becomes larger than $M_f$. This occurs for small values of the single interferometer peak SNR (median value of 4.6) when the gated portion of the data is dominated by noise and anti-correlates with the injected waveform. In the following, we remove these data samples from the exploration sets.

The FMG assesses how well the NNETFIX reconstructed data matches the signal in comparison to the full data and gated data. Positive (negative) values of the FMG correspond to $M_r$ greater (smaller) than $M_g$, indicating that the NNETFIX reconstructed time series has a better (worse) match with the injected waveform than the gated time series. Values of the FMG larger than 1 indicate that the ANN overfits the data, i.e., the reconstructed time series is more similar to the injected waveform than the full time series. Therefore, we consider the reconstructions with 0 < FMG ≤ 1 to be successful.

Figure 4. Scatterplot of $M_r/M_f$ vs. $M_g/M_f$ for the exploration set with $\rho_N = 42.4$, $(m_1, m_2) = (20, 15)\,M_\odot$, $t_d$ = 130 ms and $t_e$ = 30 ms. The circles denote samples with 0 < FMG ≤ 1, the × markers denote samples with FMG ≤ [...] > 1. The gray area denotes the region of the parameter space with 0 < FMG ≤ 1, which contains 95% of the reconstructed time series. Two outliers with values FMG = −[...]

We quantify NNETFIX's performance by estimating the reconstruction efficiency, which we define as the fraction of successfully reconstructed samples, i.e., samples with 0 < FMG ≤ 1. The fractions of samples with FMG ≤ 0, 0 < FMG ≤ [...] > [...]

Figure 5. Scatterplot of $M_r/M_f$ vs. $M_g/M_f$ for the exploration set with $\rho_N = 11.3$, $(m_1, m_2) = (20, 15)\,M_\odot$, $t_d$ = 130 ms and $t_e$ = 30 ms. The circles denote samples with 0 < FMG ≤ 1, the × markers denote samples with FMG ≤ [...] > 1. The gray area denotes the region of the parameter space with 0 < FMG ≤ 1, which contains 59% of the reconstructed time series.

[...] all exploration sets varies from approximately 0.31 to over 0.95. There is a mild dependence on the component masses of the system; the median value of the efficiency decreases from 0.77 for the low-mass scenario to 0.61 for the high-mass scenario when all other parameters (SNR, gate duration and gate end-time) are held fixed.

Within each mass scenario when the gate duration and gate end-time are held fixed, NNETFIX's efficiency typically improves by a factor ∼[...] $\rho_N = 28.3$ [...] $t_d$ = 75 ms and gate end-time $t_e$ = 90 ms. The exploration sets with $\rho_N = 11.3$ [...]

Figure 6. Distribution of the FMG for the exploration sets with component masses $(m_1, m_2) = (20, 15)\,M_\odot$, gate duration $t_d$ = 130 ms, gate end-time $t_e$ = 30 ms, and $\rho_N = 11.3$ [...] $\rho_N = 42.4$ [...] $\rho_N = 11.3$ [...] $\rho_N = 42.4$ [...]

[...] with $t_d$ = 75 ms and $t_e$ = 90 ms to 66% for the low-mass set with $t_d$ = 130 ms and $t_e$ = 15 ms.

Figure 7 shows the efficiency for the exploration sets with component masses $(m_1, m_2) = (20, 15)\,M_\odot$ as a function of the single interferometer peak SNR. The percentage of successful reconstructions ranges from ∼[...] ≳ 80% at high peak SNR, with the lowest values ≲ 40% occurring for the sets with $t_d$ ≥ 75 ms and $t_e$ ≥ 30 ms. Time series with peak SNR above ∼20 show successful reconstructions in 70% or more of the cases, irrespective of gate duration and end-time.

Figure 7. Efficiency as a function of the single interferometer peak SNR for the scenario with component masses $(m_1, m_2) = (20, 15)\,M_\odot$. Each line corresponds to a different gate duration and gate end-time combination. Green (blue, black, red) markers denote gate end-times $t_e$ = 15 (30, 90, 170) ms. Circles (crosses, squares) denote gate durations $t_d$ = 50 (75, 130) ms. The bin width is 5.

Changing the gate duration does not seem to have a significant effect on NNETFIX's efficiency, which only varies slightly at fixed network SNR and gate end-time across all exploration sets. Similarly, for fixed gate duration and network SNR, the gate end-time before merger time also has a marginal effect, although NNETFIX tends to produce better reconstructions when the gate is closer to the merger time, especially for long gate durations in the low-mass and medium-mass scenarios.

In conclusion, we find that NNETFIX may successfully reconstruct gated data of durations up to a few hundreds of milliseconds and as close as a few tens of milliseconds before the merger time for a majority of time series with single interferometer peak SNR greater than 20.
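The match-based quantities of this section can be sketched numerically as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names are hypothetical, the noise-weighted inner product is a plain discretization of the integral in Eq. (6) on the FFT frequency grid, and the flat PSD and sinusoidal test waveform are assumptions standing in for whitened detector data.

```python
import numpy as np


def inner_product(a, b, psd, fs, f_low):
    """Discretized noise-weighted inner product between f_low and the Nyquist frequency."""
    n = len(a)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    a_f = np.fft.rfft(a) / fs            # approximate continuous Fourier transform
    b_f = np.fft.rfft(b) / fs
    df = fs / n
    band = freqs >= f_low
    return 4.0 * np.real(np.sum(a_f[band] * np.conj(b_f[band]) / psd[band])) * df


def match(s, h, psd, fs, f_low):
    """Normalized inner product between a time series s and the waveform h, as in Eq. (5)."""
    num = inner_product(s, h, psd, fs, f_low)
    den = np.sqrt(inner_product(s, s, psd, fs, f_low) * inner_product(h, h, psd, fs, f_low))
    return num / den


def fractional_match_gain(m_rec, m_gated, m_full):
    """Eq. (4); samples with M_f <= M_g are rejected, mirroring their removal in the text."""
    if m_full <= m_gated:
        raise ValueError("FMG is only defined here for M_f > M_g")
    return (m_rec - m_gated) / (m_full - m_gated)


def reconstruction_efficiency(fmg_values):
    """Fraction of samples with 0 < FMG <= 1."""
    fmg = np.asarray(fmg_values, dtype=float)
    return float(np.mean((fmg > 0.0) & (fmg <= 1.0)))


# Toy check with a flat PSD (whitened data) and a 100 Hz sinusoid as the "waveform".
fs = 4096
t = np.arange(fs) / fs
h = np.sin(2 * np.pi * 100.0 * t)
psd = np.ones(fs // 2 + 1)

m_self = match(h, h, psd, fs, f_low=25.0)            # a series matches itself perfectly
fmg = fractional_match_gain(0.9, 0.7, 0.95)          # hypothetical match values
eff = reconstruction_efficiency([-0.2, 0.5, 1.0, 1.3])
```

Note that this match is not maximized over time or phase shifts; it follows Eq. (5) literally.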

4. Performance of sky maps

The NNETFIX reconstructed time series are expected to produce better sky maps, and therefore better sky localization error regions of the astrophysical signal, than the gated time series. We evaluate this improvement by comparing the overlaps of the sky map derived from the full time series with the sky maps derived from the gated time series and the reconstructed time series. In the following, we generate the [...] pycbc_make_skymap, in which the data can be manually gated.

We follow Ref. [29] and define the overlap of two sky maps (1, 2) as

$$ O_{1,2} = \frac{4\pi \int p_1(\Omega) \, p_2(\Omega) \, d\Omega}{\int p_1(\Omega) \, d\Omega \int p_2(\Omega) \, d\Omega} , \tag{7} $$

where $p_1(\Omega)$ and $p_2(\Omega)$ are the sky localization probability densities of the sky maps and the integrals are over the solid angle $\Omega$. The discretized version of Eq. (7) is

$$ O_{1,2} = N \sum_{i=1}^{N} P_{1,i} \, P_{2,i} , \tag{8} $$

where $P_{1,i}$ and $P_{2,i}$ each denote the sky localization probability of the $i$-th pixel of the corresponding sky map, and $N$ is the total number of pixels. Each sky map is normalized such that the sum of the pixel values over the entire map is 1. Equation (8) gives values in the range $(0, N)$. Higher values of $O_{1,2}$ indicate a better overlap between the two maps while lower values denote worse overlaps and/or maps which tend to have less-localized error regions.

A suitable metric to evaluate the improvement in the sky localization of a signal due to NNETFIX's reconstruction is the overlap log ratio (OLR)

$$ \mathrm{OLR} = \log \frac{O_{r,f}}{O_{g,f}} , \tag{9} $$

where $O_{r,f}$ ($O_{g,f}$) denotes the overlap of the sky maps obtained with the reconstructed (gated) time series and the full series. Positive (negative) values of the OLR indicate that the sky map from the reconstructed time series has a larger (smaller) overlap with the sky map from the full time series than the latter has with the sky map from the gated time series. Tables 5-7 give the fraction of samples with positive OLR for all exploration sets.

High values of the OLR are obtained when the overlap of the reconstructed (gated) sky map with the sky map from the full time series is large (small). The former typically occurs for reconstructed time series with large values of the FMG. The latter may happen when the loss of signal due to the gate is high and even small [...] $O_{g,f}$ is shown in Fig. 8 for the exploration set with $\rho_N = 42.4$, component masses $(m_1, m_2) = (20, 15)\,M_\odot$, $t_d$ = 130 ms and $t_e$ = 30 ms. The median value of the OLR is $1.[...]^{+1.[...]}_{-[...]}$, where the error is a 1-$\sigma$ percentile. The OLR is positive in 87% of the samples. Median values of OLR for all exploration sets are given in Tables 8-10.
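The discretized overlap of Eq. (8) and the OLR of Eq. (9) can be sketched in a few lines of NumPy. The function names and the 12-pixel toy maps below are hypothetical, equal-area pixels are assumed, and the base of the logarithm is not recoverable from the text, so the base-10 default is an assumption exposed as a parameter.

```python
import numpy as np


def skymap_overlap(p1, p2):
    """Eq. (8): N * sum_i P1_i * P2_i for two pixelized maps, each normalized to unit sum."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    if p1.shape != p2.shape:
        raise ValueError("sky maps must share the same pixelization")
    p1 = p1 / p1.sum()
    p2 = p2 / p2.sum()
    return p1.size * float(np.sum(p1 * p2))


def overlap_log_ratio(full, gated, rec, log=np.log10):
    """Eq. (9): log of O_{r,f} / O_{g,f}; the log base is an assumption."""
    return float(log(skymap_overlap(rec, full) / skymap_overlap(gated, full)))


# Toy 12-pixel maps (hypothetical values, not from the paper).
n_pix = 12
uniform = np.full(n_pix, 1.0 / n_pix)
delta = np.zeros(n_pix)
delta[0] = 1.0

full = np.zeros(n_pix)
full[:3] = [0.7, 0.2, 0.1]           # well-localized map from the full data
rec = np.zeros(n_pix)
rec[:3] = [0.6, 0.3, 0.1]            # reconstruction close to the full map
gated = uniform                      # gated data: localization washed out

o_uniform = skymap_overlap(uniform, uniform)   # two uniform maps
o_delta = skymap_overlap(delta, delta)         # two identical single-pixel maps
olr = overlap_log_ratio(full, gated, rec)      # positive: reconstruction helps
```

Two uniform maps give an overlap of 1 and two identical single-pixel maps give the upper bound $N$, in line with the range quoted for Eq. (8).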

Figure 8. Scatterplot of OLR and $O_{g,f}$ for the 512 samples from the exploration set with $\rho_N = 42.4$, component masses $(m_1, m_2) = (20, 15)\,M_\odot$, gate end-time $t_e$ = 30 ms and gate duration $t_d$ = 130 ms. The colored circles denote samples with 0 < FMG ≤ 1, the × markers denote samples with FMG ≤ [...] > 1. 87% of the samples have positive OLR. The median value is OLR = $1.[...]^{+1.[...]}_{-[...]}$, where the error is a 1-$\sigma$ percentile.

Values of OLR across the exploration sets generally increase with network SNR, component masses and gate duration. The network SNR of the signal is the main factor that determines the value of the OLR. Because NNETFIX efficiently reconstructs time series containing signals with large SNRs, when network SNRs are large, the sky maps obtained from the full data are typically more similar to the sky maps from the reconstructed data than to the sky maps from the gated data. We [...] $\rho_N \geq 28.3$, irrespective of mass, gate duration and end-time. For these sets, the median values of the OLR for the high SNR sets are greater than the corresponding values for the medium SNR sets by a factor ranging from ∼[...] $t_d$ = 130 ms and $t_e$ = 15 ms to ∼[...] $t_d$ = 75 ms and $t_e$ = 170 ms. The sky maps of reconstructed time series with lower SNR generally show little improvement compared with the sky maps of gated time series. Median values of OLR for the sets with $\rho_N = 11.3$ [...] $\rho_N = 28.3$ [...] ∼[...] ([...]3) for $t_d$ = 130 ms and $t_e$ = 90 ms ($t_d$ = 130 ms and $t_e$ = 15 ms) to ∼[...] ([...]6) for $t_d$ = 50 ms and $t_e$ = 15 ms ($t_d$ = 50 ms and $t_e$ = 30 ms).

Median values of OLR have a roughly linear dependency on gate duration. For the high and medium SNR exploration sets, the median values of OLR for $t_d$ = 130 ms are larger than the corresponding values for $t_d$ = 50 ms by a factor ranging from ∼[...] $\rho_N = 42.4$ [...] $t_e$ = 15 ms) to ∼[...] $\rho_N = 28.3$ [...] $t_e$ = 90 ms). Since longer gate durations correspond to greater signal losses, NNETFIX's reconstruction provides larger SNR gains and OLR values as the gate duration increases.

The portion of a signal close to the merger time has a greater impact on the sky map than the portion of the signal in the early inspiral phase. Therefore, median values of the OLR for the medium and high SNR exploration sets with $t_e$ = 15 ms are typically higher than the corresponding values for the sets with $t_e$ = 170 ms by a factor ranging from ∼[...] $\rho_N = 28.3$ [...] $t_d$ = 90 ms) to ∼[...] $\rho_N = 28.3$ [...] $t_d$ = 50 ms). For shorter signals and larger gate durations, a gate end-time very close to the merger time may lead to large signal losses and removal of the merger portion of the signal in H1, and thus make the reconstruction process less efficient. Figure 9 shows the OLR as a function of the gate end-time for the high-mass sets with $t_d$ = 130 ms and different network SNRs. Figure 10 shows the sky localization error region that is obtained with the NNETFIX reconstructed data for the case of Fig. 1. The value of the OLR is ∼[...]

Figure 9. OLR as a function of the gate end-time $t_e$ for the exploration set with component masses $(m_1, m_2) = (35, 29)\,M_\odot$, gate duration $t_d$ = 130 ms, and different network SNRs $\rho_N = 11.3$ [...] $\rho_N = 28.3$ [...] $\rho_N = 42.4$ [...] $\sigma$ percentiles.

In summary, for a majority of the cases with gate durations up to a few hundreds of milliseconds and as close as a few tens of milliseconds to the merger time, the sky maps of reconstructed time series with network SNR $\rho_N \geq$ [...]

5. Conclusion

In this paper, we have presented NNETFIX, a new machine learning-based algorithm designed to estimate the portion of a BBH GW signal that may be gated due to the presence of an overlapping glitch. We have tested the accuracy of the algorithm with different choices of signal parameters and gate settings, and defined several metrics to assess NNETFIX's performance. Among these metrics, the most important ones are the FMG and the OLR.

Figure 10. The 90% probability sky localization error regions obtained with the reconstructed (dashed-blue), full (gray area) and gated (solid-red) time series for the case of Fig. 1. The star denotes the injection location.

The FMG quantifies the algorithm's efficiency in reconstructing the gated data in the time domain. Positive values of this metric indicate that the full time series better matches the NNETFIX reconstructed time series than the gated time series. The fraction of samples that show improvement varies from approximately one third to over 95% across the cases that we investigated. Results show that NNETFIX may be able to successfully reconstruct a majority of BBH signals with peak single interferometer SNR greater than 20 and gates with durations up to a few hundreds [...]

[...] $O_{r,f}$, can be estimated by looking at the distribution of the OLR for the TT set at hand. The exploration sets that we investigated show that there is a well-defined correlation between the OLR, the FSG and the overlap between the sky maps obtained from the gated data and the NNETFIX reconstructed data, $O_{r,g}$. If this correlation is generally valid, the OLR can be estimated from the observed [...] $O_{r,g}$ and the FSG using a fit calculated from the TT set used to train the selected NNETFIX model. To expedite this process, the samples in each TT set could be clustered according to the distributions of the OLR, the $O_{r,g}$, and the FSG. A classifier could then be trained to estimate the optimal SNR and sky localization error region of the full signal.

Once NNETFIX has been trained, the CPU time required to reconstruct the data is of the order of a few seconds for gate durations up to hundreds of milliseconds. This short turnaround makes the algorithm suitable for use in low-latency. NNETFIX could also be applied to GW signals other than BBH mergers, such as BNS or NSBH mergers. Therefore, it could be beneficial for rapid follow-up of glitch-contaminated, potentially EM-bright candidate detections. In future work, we intend to explore the application of NNETFIX and its effect on the sky localization of BNS signals, as well as detector network configurations with more than two detectors. Improving the sky localizations of potentially EM-bright signals could increase the chances of coincident EM and GW observations and lead to a better understanding of the physical properties of their sources.

Acknowledgements

K.M., R.Q.J., and M.C. are supported by the U.S. National Science Foundation grants PHY-1921006 and PHY-2011334. The authors would like to thank their LIGO Scientific Collaboration and Virgo Collaboration colleagues for their help and useful comments, in particular Tito Dal Canton. The authors are grateful for computational resources provided by the LIGO Laboratory and supported by the U.S. National Science Foundation Grants PHY-0757058 and PHY-0823459, as well as resources from the Gravitational Wave Open Science Center, a service of the LIGO Laboratory, the LIGO Scientific Collaboration and the Virgo Collaboration. Part of this research has made use of data, software and web tools obtained from the Gravitational Wave Open Science Center and openly available at [...]. LIGO was constructed and is operated by the California Institute of Technology and Massachusetts Institute of Technology with funding from the U.S. National Science Foundation under grant PHY-0757058. Virgo is funded by the French Centre National de la Recherche Scientifique (CNRS), the Italian Istituto Nazionale di Fisica Nucleare (INFN) and the Dutch Nikhef, with contributions by Polish and Hungarian institutes. This manuscript has been assigned LIGO Document Control [...]


Table …. Fraction of samples with FMG ≤ …, … < FMG ≤ … and FMG > … for the exploration sets with component masses (m_1, m_2) = (…, …) M☉. Boldface entries denote sets where the fraction of samples with … < FMG ≤ … is larger than …%. Columns: network SNR and t_d [ms]; rows: t_e [ms].

Table …. Fraction of samples with FMG ≤ …, … < FMG ≤ … and FMG > … for the exploration sets with component masses (m_1, m_2) = (…, …) M☉. Boldface entries denote sets where the fraction of samples with … < FMG ≤ … is larger than …%. Columns: network SNR and t_d [ms]; rows: t_e [ms].

Table …. Fraction of samples with FMG ≤ …, … < FMG ≤ … and FMG > … for the exploration sets with component masses (m_1, m_2) = (…, …) M☉. Boldface entries denote sets where the fraction of samples with … < FMG ≤ … is larger than …%. Columns: network SNR and t_d [ms]; rows: t_e [ms].

Table 5. Fraction of samples with positive OLR for the exploration sets with component masses (m_1, m_2) = (12, …) M☉.

Network SNR       11.3                  28.3                  42.4
t_d [ms]       50    75    130      50    75    130      50    75    130
t_e [ms]
   15         0.52  0.53  0.56     0.62  0.71  0.74     0.76  0.83   …
   30         0.56  0.53  0.52     0.68  0.73  0.76     0.81  0.85   …
   90         0.55  0.51  0.53     0.69  0.65  0.74     0.78  0.79   …
  170         0.61  0.58  0.51     0.72  0.64  0.65     0.79  0.77   …

Table 6. Fraction of samples with positive OLR for the exploration sets with component masses (m_1, m_2) = (20, …) M☉. Entries in italic denote sets where the fraction of samples is smaller than 0.5.

Columns: network SNR 11.3, 28.3, 42.4, each with t_d = 50, 75, 130 ms. Rows: t_e = 15, 30, 90, 170 ms.
Entries, left to right and top to bottom (… marks missing entries): …, 0.55, 0.68, 0.74, 0.79, 0.84, 0.85, …, 0.51, 0.69, 0.76, 0.80, 0.85, 0.88, …, 0.61, …, 0.74, 0.75, 0.65, 0.86, 0.85, …, 0.62, …, 0.77, 0.78, 0.67, 0.84, 0.87, …

Table 7. Fraction of samples with positive OLR for the exploration sets with component masses (m_1, m_2) = (35, …) M☉. Entries in italic denote sets where the fraction of samples is smaller than 0.5.

Columns: network SNR 11.3, 28.3, 42.4, each with t_d = 50, 75, 130 ms. Rows: t_e = 15, 30, 90, 170 ms.
Entries, left to right and top to bottom (… marks missing entries): 0.57, …, 0.77, 0.79, 0.69, 0.85, 0.83, …, 0.81, 0.82, 0.91, 0.87, …, 0.65, …, 0.84, 0.80, 0.75, 0.93, 0.89, …, 0.65, 0.64, 0.57, 0.79, 0.84, 0.80, 0.88, 0.91, …

Table 8. Median values of OLR for the exploration sets with component masses (m_1, m_2) = (12, …) M☉.

Columns: network SNR 11.3, 28.3, 42.4, each with t_d = 50, 75, 130 ms. Rows: t_e = 15, 30, 90, 170 ms.
Entries, left to right and top to bottom (… marks missing entries): …, 0.016, 0.031, 0.056, 0.051, 0.086, …, 0.015, 0.028, 0.044, 0.073, …, 0.011, 0.034, 0.026, 0.04, …, 0.018, 0.032, …

Table 9. Median values of OLR for the exploration sets with component masses (m_1, m_2) = (20, …) M☉. Italic entries denote sets with negative values.

Columns: network SNR 11.3, 28.3, 42.4, each with t_d = 50, 75, 130 ms. Rows: t_e = 15, 30, 90, 170 ms.
Entries, left to right and top to bottom (… marks missing entries): …, -0.0007, 0.016, 0.054, 0.087, 0.17, 0.16, 0.24, …, -0.0050, -0.0029, …, 0.044, 0.079, 0.15, 0.13, 0.22, …, -0.0051, 0.025, 0.046, 0.050, 0.074, 0.13, …, -0.0044, 0.023, 0.037, 0.045, 0.046, 0.098, …

Table 10. Median values of OLR for the exploration sets with component masses (m_1, m_2) = (35, …) M☉.

Columns: network SNR 11.3, 28.3, 42.4, each with t_d = 50, 75, 130 ms. Rows: t_e = 15, 30, 90, 170 ms.
Entries, left to right and top to bottom (… marks missing entries): 0.023, …, -0.1400, 0.33, 0.45, 0.68, 0.62, 0.82, …, -0.043, -0.0099, 0.25, 0.42, 0.72, 0.56, 0.83, 1.1, 0.013, …, -0.015, 0.088, 0.21, 0.41, 0.24, 0.48, …, 0.012, 0.016, …, 0.046, 0.081, 0.11, 0.12, 0.22, …
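The two summary statistics reported in Tables 5 to 10, the fraction of samples with positive OLR and the median OLR, could be computed per exploration set along the following lines. This is a minimal sketch rather than the authors' analysis code: the function names `summarize_olr` and `summarize_grid`, the dictionary keyed by (network SNR, t_d, t_e), and the sample values are all hypothetical and chosen only for illustration.

```python
from statistics import median


def summarize_olr(olr_samples):
    """Return (fraction of samples with OLR > 0, median OLR) for one exploration set."""
    samples = [float(x) for x in olr_samples]
    if not samples:
        raise ValueError("at least one OLR sample is required")
    n_positive = sum(1 for x in samples if x > 0.0)
    return n_positive / len(samples), median(samples)


def summarize_grid(olr_by_set):
    """Apply summarize_olr to every (network SNR, t_d [ms], t_e [ms]) cell."""
    return {key: summarize_olr(values) for key, values in olr_by_set.items()}


# Hypothetical OLR samples for two cells of the grid; these numbers are
# illustrative only and are NOT taken from the paper.
example_sets = {
    (11.3, 50, 15): [-0.02, 0.01, 0.03, 0.05],
    (42.4, 130, 15): [0.10, 0.20, -0.05, 0.30, 0.40],
}

stats = summarize_grid(example_sets)
for (snr, t_d, t_e), (fraction, med) in sorted(stats.items()):
    print(f"SNR={snr} t_d={t_d} ms t_e={t_e} ms: "
          f"fraction positive={fraction:.2f}, median OLR={med:.3f}")
```

A strict inequality is used for "positive", so a sample with OLR exactly equal to zero does not count toward the fraction; whether the paper treats that edge case the same way is an assumption.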