[PDF] Compressed Sensing for Time-Frequency Gravitational Wave Data Analysis

Abstract

The potential of compressed sensing for obtaining sparse time-frequency representations for gravitational wave data analysis is illustrated by comparison with existing methods, as regards i) shedding light on the fine structure of noise transients (glitches) in preparation of their classification, and ii) boosting the performance of waveform consistency tests in the detection of unmodeled transient gravitational wave signals using a network of detectors affected by unmodeled noise transient

Full PDF

CCompressed Sensing for Time-Frequency Gravitational Wave Data Analysis

Paolo Addesso, Maurizio Longo, Stefano Marano, Vincenzo Matta

Dept. of Electrical and Computer Engineering and Applied Mathematics,University of Salerno, 84084 Fisciano (SA), Italy

Maria Principe, Innocenzo M Pinto

Waves Group, University of Sannio at Benevento,82100 Benevento, Italy, INFN, LVC and KAGRA (Dated: May 16, 2016)The potential of compressed sensing for obtaining sparse time-frequency representations for gravi-tational wave data analysis is illustrated by comparison with existing methods, as regards i) sheddinglight on the ﬁne-structure of noise transients (glitches) in preparation of their classiﬁcation, and ii)boosting the performance of waveform consistency tests in the detection of unmodeled transientgravitational wave signals using a network of detectors aﬀected by unmodeled noise transients. a r X i v : . [ a s t r o - ph . I M ] M a y . INTRODUCTION AND MOTIVATION An important fraction of the Gravitational Wave (henceforth GW) signals of cosmic origin sought after by runningand/or planned GW detection experiments based on optical interferometers [1]-[4] consists of transient signals. Theseinclude, e.g., binary inspirals/mergers/ringdowns, and supernova outbursts.Transient disturbances of instrumental and/or enviromental origin, globally nicknamed glitches , are also ubiquitouslyobserved in the data of interferometric GW detectors, occurring with various amplitudes, shapes, and rates.In view of the essentially unmodeled nature of transient GWs of astrophysical interest [5], it is challenging to get ridfrom glitches. The problem is further complicated by the fact that glitches infringe the additive Gaussian stationarynoise assumption on which most detection/estimation algorithms are based [6].Substantial work has been made in GW experiments to understand the origin of glitches using additional informationfrom environment and instrument monitoring channels, and to obtain reliable data quality assessment and vetoingcriteria [7]-[9]. Consistency tests among the data gathered by several non co-located interferometers have beensuggested and implemented to discriminate glitches, in view of their essentially local nature [10].In the time-frequency (henceforth TF) domain, where non-stationary signals are most naturally represented [11],typical GW transients as well as typical instrumental glitches exhibit highly localized supports, and are thus technically sparse . GW chirps from inspiraling binary systems with large duration × bandwidth ﬁgures are also sparse in the TFdomain, yielding almost one-dimensional signatures ( ridges ), representing their instantaneous-frequency evolutions[12].In the last few years a unifying conceptual framework for the eﬃcient representation of sparse signals has beendeveloped, globally referred to under the name of compressed sensing (henceforth CS). The power of the CS paradigmis witnessed by the exponentially growing scope of its applications (see, e.g., [13] for a broad and up to date list).Using CS related concepts/tools in GW data analysis we may take advantage of the sparsity of both GW signals andinstrumental glitches in the TF domain, for the purpose of i) enhancing the readability of TF representations of thedata gathered by interferometric detectors, and ii) improving consistency tests among diﬀerent detectors.The potential of sparse representation for GW data analysis was independently recognized in [14] and [15], under adiﬀerent perspective.Our aim in this paper will be to illustrate by examples the potential of CS for TF GW data analysis, extendingprevious results in [16].The paper is accordingly organized as follows. In Section II we review the key features of the relevant signal(s) andnoise(s) and in Section III the basics of the TF representations referred to in the paper.In Section IV we introduce sparse (skeletal) TF representations, and propose a CS-based algorithm for constructingthem, together with pertinent implementation details.In Section V we discuss a number of numerical experiments, where the proposed sparse TF representations areevaluated by comparison with other TF representation tools presently in use in GW data analysis. Conclusions followunder Section VI. II. TRANSIENT GW SIGNALS AND GLITCHES

Transient GW signals can be represented in the form s ( t ) = Re [˜ s ( t )] , where ˜ s ( t ) = S ( t ) exp [ ıψ ( t )] , (1)where the amplitude S ( t ) function evolves adiabatically (i.e., on much longer timescales) compared to the phase ψ ( t ),so that (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ˙ SS (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:28) | ˙ ψ | , and (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ¨ ψ ˙ ψ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:28) . (2)GW signals emitted by compact binary systems during their inspiral [17], merger [18], and ringdown [19] phase, aswell as supernova core-collapse GW waveforms [20] belong to the above class.The data gathered by interferometric GW detectors are embedded in additive noise. Once all narrowband features(power lines, natural-modes of the test-mass suspending wires, etc.) have been subtracted, the residual noise ﬂoor isfound to consists of a locally-stationary, zero mean, coloured Gaussian random process n ( t ), and a generalized impul-sive component g ( t ). This latter is basically a (random) superposition of transients of environmental/instrumentalrigin (known as glitches ), viz. [21]: g ( t ) = K (cid:88) k =1 ψ k ( t ) , (3) K being the (random) number of such transients in the observation time-window, and ψ k ( · ) the (random) glitchestherein.Transient GWs as well as glitches are typically highly localized in the TF plane, their support being restricted to tiny spots or narrow stripes . III. TIME-FREQUENCY REPRESENTATIONS

Time-frequency representations provide the most natural framework for representing non stationary (transient)signals. A variety of diﬀerent TF representations have been proposed, mostly in the XX century [22]-[26].Several have been suggested or used for GW data analysis, including the linear short-time-Fourier, wavelet andconstant-Q transforms (shortly reviewed in Sections III B-III D), the bilinear Bertrand [27] and Wigner-Ville (sectionIII E) transforms, and the special (sparse) decompositions named after Hilbert-Huang (Section III F) and Mallat-Zhang (Section III G).More exist, including the bilinear hyperbolic [28], higher-order (e.g., generalized Cohen-class [29], L-Wigner [30],polyspectral [31], polynomial [32]) , and positive-deﬁnite [33] TF representations, that haven’t been used so far inGW data analysis, to the best of our knowledge.

A. Time-Frequency Duality

Time-frequency duality (also known as the Gabor-Heisenberg principle) expresses the well known fact that thewider the time-support of x ( t ), the narrower its frequency-support, and vice-versa. This property, that plays a keyrole in TF data analysis, can be formalized by introducing the time and frequency barycenters t = (cid:90) + ∞−∞ t | x ( t ) | dt, f = (cid:90) + ∞−∞ f | X ( f ) | df, (4)and the related spreads δ ( x ) t and δ ( x ) f , δ ( x ) t = (cid:20)(cid:90) + ∞−∞ ( t − t ) | x ( t ) | dt (cid:21) / , δ ( x ) f = (cid:20)(cid:90) + ∞−∞ ( f − f ) | X ( f ) | df (cid:21) / , (5) x ( t ) and X ( f ) being a generic time function and its Fourier transform, yielding [11] δ ( x ) t δ ( x ) f = ξ x ≥ (4 π ) − . (6)The quantity ξ x depends on the chosen x function, is known as its TB (time-bandwidth) product, and attains itslower bound when x ( t ) is a (unit-norm) Gabor function x ( t ) = 1 (cid:113) π [ δ ( x ) t ] exp ( − πıf t ) exp  − (cid:32) t − t δ ( x ) t (cid:33)  . (7) B. Short Time Fourier Transform

The short-time Fourier transform (STFT) is likely the simplest and oldest TF representation, and is deﬁned by: F ( h ) x ( t, f ) = (cid:90) + ∞−∞ x ( τ ) h ( t − τ ) exp( − πıf τ ) dτ (8)here h ( t ) is a time-windowing function which vanishes for | t | > δ ( h ) t . The STFT is the projection of x ( t ) into ananalyzing function whose envelope is a time-shifted version of h ( τ ) centered at τ = t , whose (complex) carrier hasfrequency f . The (squared) modulus of the STFT, known as spectrogram , S ( h ) x ( t, f ) = (cid:12)(cid:12)(cid:12) F ( h ) x ( t, f ) (cid:12)(cid:12)(cid:12) (9)has been frequently used as a ﬁducial representation of the frequency-distribution of signal energy for non-stationarysignals. The frequency resolution of the STFT (and spectrogram) is the same for all f . Time-frequency dualityimplies that smaller values of δ ( h ) t (yielding larger time localization) entail poorer frequency resolution. C. Wavelet Transform

The wavelet transform (WT) is deﬁned by: D ( h ) x ( t, f ) = (cid:18) ff h (cid:19) / (cid:90) + ∞−∞ x ( τ ) h ∗ (cid:20) ff h ( t − τ ) (cid:21) dτ. (10)Here h ( t ) is a zero-mean complex-valued time-windowed oscillatory function whose absolute Fourier spectrum isunimodal and peaked at f = f h , known as mother wavelet , viz., h ( t ) = w ( t ) exp ( − πıf h t ) , where w ( t ) = 0 , ∀| t | > δ ( w ) t . (11)It is seen that the wavelet transform is the projection of x ( τ ) into an analyzing function obtained by time-shiftingthe mother wavelet (from τ = 0 to τ = t ), and time-scaling it via the frequency-dependent factor a = ( f /f h ) (whichentails frequency-shifting its carrier from f h to f ).Depending on whether a ≶

1, the above time-scaling corresponds to time-squeezing or time-stretching , and at thesame time, in view of TF duality, to frequency-stretching or frequency-sequeezing , respectively, by the same factor a .The (squared) modulus of the WT, known as scalogram ,Σ ( h ) x ( t, f ) = (cid:12)(cid:12)(cid:12) D ( h ) x ( t, f ) (cid:12)(cid:12)(cid:12) (12)can be used as a ﬁducial representation of the TF distribution of signal energy. We note in passing that both thespectrogram (9) and the scalogram (12) are unitary representations, i.e., satisfy signal energy conservation (Parsevaltheorem), viz., (cid:90) + ∞−∞ | x ( τ ) | dτ = (cid:90) + ∞−∞ dt (cid:90) + ∞−∞ df S ( h ) x ( t, f ) = (cid:90) + ∞−∞ dt (cid:90) + ∞−∞ df Σ ( h ) x ( t, f ) . (13) D. Q-Transform

As noted above, the STFT (and the spectrogram) is a constant-bandwidth representation, where the analyzing functionduration (and bandwidth) is the same for all ( t, f ). In the WT (and the scalogram), on the other hand, the eﬀectivetime-width δ t of the analyzing function at f is inversely proportional to f , and hence (in view of Gabor-Heisenbergtheorem) the spectral-width δ f is proportional to f , so that the ratio f /δ f is the same for all f . Hence, the WT (andthe scalogram) are basically constant-Q representations, Q being the name given to the f /δ f ratio.A diﬀerent, but closely related, constant-Q TF representation is obtained by starting from the STFT , eq. (8), andletting the time-width of the windowing function to be inversely proportional to f , viz.: δ t = Cf (14)so that, in view of TF duality δ f = ξδ t = f ξC (15)ielding fδ f =: Q = Cξ . (16)Under this assumption, equation (8) yields the so called constant-Q transform (henceforth QT), Q ( h ) x ( t, f ; Q ) = (cid:90) ∞−∞ x ( τ ) h Q ( t − τ ; f ) exp( − ı πf τ ) dτ (17)where, denoting as h ( t ) a chosen windowing function with time width δ ( h ) t and TB ﬁgure ξ h , h Q ( t − τ ; f ) = h (cid:34) f δ ( h ) t Qξ h ( t − τ ) (cid:35) . (18)The form of eq. (17) suggests a convenient way for its computation, via a sequence of Fourier transforms (denotedbelow by the F operator), viz.: Q ( h ) x ( t, f ; Q ) = F − ξ → t (cid:2) X ( ξ, f ) H ∗ Q ( ξ ; f ) (cid:3) (19)where, X ( ξ, f ) = F t → ξ [ x ( t ) exp( − ıπf t )] , (20)and H Q ( ξ, f ) = F t → ξ [ h Q ( t ; f )] . (21)The QT tiles the TF plane linearly in time and logarithmically in frequency, as shown in Figure 1, by comparison withthe STFT and WT. In exploratory TF data analysis, the QT is usually computed for a set of (logarithmically-spaced)Q values, lower (higher) Qs yielding lower (higher) resolution in frequency, and higher (lower) resolution in time. TheQT was introduced and developed in [36], [37], and was proposed as a tool for GW data analysis in [38], [39], leadingto several currently perused implementations in LIGO-Virgo, including the the Q , Omega , and

Omicron pipelines[38]-[40].

E. Wigner-Ville Transform

The Wigner-Ville (henceforth WV) transform of a real valued signal x ( t ) reads [41]: W x ( t, f ) = (cid:90) + ∞−∞ ˜ x (cid:16) t + τ (cid:17) ˜ x ∗ (cid:16) t − τ (cid:17) e − i πfτ dτ (22)where ˜ x ( t ) is the the analytic mate of x ( t ), ˜ x ( t ) = x ( t ) + i H [ x ( t )] , (23) H [ · ] denoting the Hilbert transform operator, H [ x ( t )] = 1 π − (cid:90) R x ( τ ) t − τ dτ = F − { U ( f ) F [ x ] } (24)where U ( · ) is Heaviside’s step function.Using the WV has been suggested for diﬀerent purposes in GW data analysis, including the detection of GW chirpsfrom inspiraling binaries, [42]-[44], and the estimation of GW arrival time-delays in a network of detectors for sourcelocalization [45].The WV has a number of nice (and unique) properties [11] among all time-frequency representations: it is a memberof the Cohen Class (i.e., it is covariant with respect to time and/or frequency shifts of its argument); its time(frequency) barycenter at ﬁxed frequency (time) reproduce (and deﬁne) the group-delay and instantaneous frequencyf its argument; its marginal distribution along any radial line in the TF plane yields the energy density of thecorresponding fractional Fourier transform [46], [47]. The WV is also unitary , i.e. energy preserving (Moyal theorem), (cid:20)(cid:90) R | ˜ x ( t ) | dt (cid:21) = (cid:90) (cid:90) R | W x ( t, f ) | dtdf (25)and invertible (up to an irrelevant complex factor):˜ x ( t ) = 1˜ x ∗ (0) (cid:90) ∞−∞ W x ( t/ , f ) exp(2 ıπf t ) df. (26)The main limitation of the WV stems from it bilinear nature, entailing in general the appearance of intermodulationartifacts, which hinder its visual readability. The WV transform is immune from such artifacts only in two specialcases, namely, when x ( t ) is either a single noise-free constant-amplitude chirp whose frequency changes linearly withtime, or a single noise-free Gabor (sine-Gaussian) function [41].Diﬀerent strategies have been proposed to get rid of the WV intermodulation artifacts, such as smoothing kernels [ ? ] and reassignment [49], that are brieﬂy discussed below.

1. Smoothed WV

The 2D Fourier transform of the Wigner-Ville distribution A x ( ξ, η ) = F t → ξf → η [ W x ( t, f )] (27)is known as the Ambiguity (or Woodward) Function (henceforth AF) of x , and the ( ξ, η ) plane is referred to as theAF domain, the ξ and η arguments being referred to as the Doppler-shift [sec − ] and the delay [sec].Interference artifacts in the WV exhibit rapid variations in the TF plane. Accordingly, they map far away from theorigin in the AF plane, and can be eﬀectively suppressed by multiplying the AF by some low-pass 2D-window functionΛ( ξ, η ) vanishing beyond some distance from the origin of the AF plane, and transforming the windowed AF back tothe ( t, f ) domain.Multiplication in the ( ξ, η ) domain corresponds to convolution in the ( t, f ) domain (Borel theorem), and the resultingsmoothed WVD can be written: W ( λ ) x ( t, f ) = (cid:90) (cid:90) R W x ( τ, η ) λ ( t − τ, f − ν ) dτ dν (28)where λ ( t, f ) is the inverse Fourier transform of the 2D-windowing factor Λ( ξ, η ) in the AF domain, and is referredto as smoothing kernel . Smoothing (low pass ﬁltering) of the WV entails some loss in its TF resolution.It is worth noting that both the spectrogram (9) and the scalogram (12) are special smoothed versions of the WV,where the smoothing kernel λ ( t, f ) is the WV transform of the pertinent windowing functions h in eqs. (8) and (10)[ ? ].Several smoothing kernels have been proposed, featuring diﬀerent properties [48]r. In a series of papers [50]-[51]Baraniuk and Jones, developed the idea of seeking a radially-Gaussian windowing function in the AF plane, oftenreferred to as radially Gaussian kernel (RGK) in the technical Literature, tailored to the actual energy distributionin the AF plane. They accordingly use polar coordinates ( r, θ ) in the AF plane, and let K ( r, θ ) = exp (cid:20) − r σ ( θ ) (cid:21) (29)where, in view of the AF symmetry property A x ( − ξ, − τ ) = A x ( ξ, τ ), K ( r, θ + π ) = K ( r, θ ) , (30)and seek σ ( θ ) so as to maximize the energy content of the kernel-weighted AF (cid:90) π dθ (cid:90) ∞ rdr | A x ( r, θ ) K ( r, θ ) | (31)ubject to a measure (area) constraint: (cid:90) π dθ (cid:90) ∞ rdr | K ( r, θ ) | ≤ α. (32)The Baraniuk-Jones radially Gaussian kernels are quite eﬀective in producing intermodulation-artifact free smoothedversions of the WV, with nicely limited (and uniform) loss in TF resolution. The RGK smoothed version of the WVwill be referred to as BJ-smoothed WV in the rest of this paper.

2. Reassignment

Reassignment is, basically, a heuristic procedure for re-focusing the WV, after it was blurred by smoothing, andcan be understood without referring to a particular smoothing kernel [49].According to (28), the value of the smoothed WV at any given point ( t, f ) is a weighted sum of all W x valuesthroughout the TF plane, the weighting factor being represented by the function λ ( · , · ) centered in ( t, f ).Following Flandrin (who re-discovered and elaborated the reassignment concept introduced in [52]), ”it is as if the totalmass of an object were assigned to its geometric center - which is incorrect, except in the special case of homogeneous density”. Reassignment accordingly consists in computing the TF coordinates of the centroid of W x , as weighted bythe ( t, f )-centered λ ( · , · ), viz.: ˆ t ( h ) x = (cid:82)(cid:82) R τ W x ( τ, η ) λ ( t − τ, f − ν ) dτ dν (cid:82)(cid:82) R W x ( τ, η ) λ ( t − τ, f − ν ) dτ dν, (33)ˆ f ( h ) x = (cid:82)(cid:82) R νW x ( τ, η ) λ ( t − τ, f − ν ) dτ dν (cid:82)(cid:82) R W x ( τ, η ) λ ( t − τ, f − ν ) dτ dν (34)and reassigning the value of the WVD originally computed in ( t, f ) to the point (ˆ t ( h ) x , ˆ f ( h ) x ). Reassignment performsvery well in the absence of noise. F. Hilbert-Huang-Transform

The Hilbert-Huang-Transform (henceforth HHT), consists of two steps. The ﬁrst one, known as Empirical ModeDecomposition (EMD) [53], is a constructive recipe for adaptively representing a nonstationary signal as S ( t ) = N (cid:88) k =1 s k ( t ) + r ( t ) (35)where r ( t ) is a monotonic (possibly null) residual, and the intrinsic mode functions (IMF) s k ( t ) are obtained by anexhaustive sifting procedure, consisting in deﬁning the functions m h − ( t ) =  S ( t ) , h = 1 ,S ( t ) − h − (cid:88) k =1 s k ( t ) , h > , (36)computing their mean-envelope µ h − (average of upper and lower envelopes), and letting s h = m h − − µ h − . (37)As shown in [54], EMD attempts (and often succeeds) identifying the various scales at which a signal oscillates, in afully data-driven way. Indeed, the IMF spectra organize spontaneously as an almost constant-Q dyadic wavelet-likeﬁlter bank [55].In the second step, the analytic mate of each IMF is constructed via the Hilbert transform (HT),˜ s k ( t ) = A k ( t ) exp[ ıψ k ( t )] = s k ( t ) + ı H [ s k ( t )] (38)hereby an instantaneous amplitude A k ( t ) (henceforth IA) is associated to a ﬁducial instantaneous angular frequency ˙ ψ k ( t ) (henceforth IF), computed from the (numerical) time-derivative of the instantaneous phase ψ k ( t ). This yieldsthe Hilbert spectrum, aka the HHT: H S ( t, ω ) = 12 π (cid:88) k A k ( t ) δ (cid:34) f − ˙ ψ k ( t )2 π (cid:35) (39)whereby each IMF is represented by a 1 D feature in the TF plane (its ﬁducial instantaneous-frequency line, henceforthIFL), whose points have diﬀerent levels, given by the pertinent IA.The HHT was introduced in GW data analysis in [56], and further exploited in [57]. A recent review of the relatedimplementation aspects can be found in [58].Note that the Hilbert spectrum (39) holds no information about the instantaneous bandwidth (IBW) of the signal[59]-[61]. An IBW-aware version of the HHT can be obtained in principle by estimating the IBW at each point (time)along the IFL, and re-distributing the local energy (squared IA) over an IBW-wide frequency interval, according tosome suitable/ﬁducial fall-oﬀ law.The original EMD method lacks a rigorous mathematical foundation, and hence its convergence properties, andresilience against noise are not well established [53].In principle, it should be noted that any given (smooth, limited) signal x ( t ) can be represented by an inﬁnite numberof possible { a ( t ) , φ ( t ) } pairs such that x ( t ) = a ( t ) cos[ φ ( t )] (see, e.g., [62] for a general discussion, and [63] forsimple examples). In addition, the HHT algorithm is based on the assumption that H [ A ( t ) cos[ ψ ( t )]] = A ( t ) sin[ ψ ( t )],which holds only if i) the Fourier spectra of the envelope A ( t ) and carrier cos[ ψ ( t )] do not nonoverlap [65], and ii) H [cos[ ψ ( t )]] = sin[ ψ ( t )], which is (asymptotically) true under broad but not completely general assumptions [66].A mathematically sound (and better performing, e.g., in the case of multi-tone signals) EMD-like algorithm, knownas the synchro-squeezed wavelet transform was introduced in [67].Conceptual as well as technical (implementation) issues also exist as regards both the EMD and the HT steps of theHHT algorithm: • In the original EMD algorithm [53] spline-ﬁtting is used to construct the upper/lower waveform envelopes, ateach step of the sifting procedure. This choice entails the appearance of end-point over/undershoot, that mayhinder faithful IMF recovery. Alternative EMD algorithms (based on constrained optimization, rather thanspline-ﬁtting) have been proposed to circumvent this problem [68]. Mode mixing, whereby a single IMF mayinclude diﬀerent signal components, or a single component may be split across several IMFs is another knownissue. Modiﬁed EMD algorithm have been used to mitigate this problem, the key idea being that of averagingdiﬀerent IMFs obtained by applying EMD to several superpositions of the signal with diﬀerent (independent)realizations of noise [69]. Better IMF reconstruction has been achieved using wavelet-based projections usingthe Fejer-Korovkin class of wavelet ﬁlters [70]. • The Hilbert Transform step is usually implemented via (fast) discrete Fourier transform (DFT), and is accord-ingly aﬀected by spectral leakage and distorsion, which may spoil the subsequent IF estimation. Numericaldiﬀerentiation used to retrieve the instantaneous frequency is, in addition, quite sensitive to noise. Alternativesto the DFT-HT have been suggested to ﬁx these limitations, with partial success (e.g., interpolative approachesto retrieve the IAs [71], and 1st-order autoregressive modeling for estimating the IFs [72]).None of the above HHT improvements has been applied, to the best of our knowledge, in GW data analysis so far.The most serious limitation of the HHT lies in its limited frequency-resolution (in contrast to claims made in [56]-[58]).Indeed, as shown in [73], a signal consisting of two (pure) tones, will not be solved by the HHT, if the frequencydiﬀerence does not exceed a confusion bandwidth depending on the ratio between the amplitudes of the two tones [74]An improved EMD algorithm with tunable frequency resolution was proposed in [75].In [ ? ] the EMD was used, to derive a set of ambiguity-domain smoothing ﬁlters (one for each IMF) whereby diﬀerent”views” of the WV of the original signal could be obtained, and combined to give a TF representation referred to asEMD-smoothed WV. G. Atomic Decompositions

Mallat and Zhang are are credited for inventing the matching-pursuit algorithm [77] for decomposing a signal intoelementary transients, represented by Gabor atoms [78], [79].he analytic mate of a SG (Gabor) atom s ( t ) can be approximated by˜ s ( t ) = a ( t ) exp[ ıφ ( t )] , (40)with a ( t ) = A exp {− [( t − t ) / ∆ T ] } , φ ( t ) = 2 πf t + ψ, (41)provided the product between the SG carrier frequency f and time-spread ∆ T is large ( (cid:29) W ˜ s ( t, f ) = ∆ T √ πA exp {− t − t ) / ∆ T ] } exp {− π [( f − f )∆ T ] } . (42)Adding the WVs of the individual atoms in the AD yields a sparse, intermodulation artifact-free, ﬁducial energy-distribution of the signal in the TF domain, known as W(i)V(i)gram [77].The AD concept was introduced in GW data analysis in [21] with reference to glitches, and further pursued in [80],[81]. Under the name of Sine-Gaussians (henceforth SG), Gabor atoms have been also used to represent generic GWbursts [82].It should be noted that ADs are non -unique, the choice of the atoms being largely arbitrary. Alternative atom choicesinclude, e.g., exponentially damped sinusoids [83], and modulated Gamma-envelopes [84].The physical readability of atomic representations is strongly aﬀected by the choice of the atoms, and this ultimatelyleads to the problem of ﬁnding the atoms which are the most natural choice for the signals being analyzed [85].MP-derived ADs may display strong atoms in time intervals where the original signal is negligibly small . These atomsusually cancel out in the reconstructed waveform by destructive interference, but stand out in the WiVigram (due tothe positive sign deﬁniteness of 42). Pictorially, such atoms are said to form the dark energy of the AD [86].Modiﬁed MP algorithm have been proposed attempting to minimize not only the reconstruction energy ( L ) error,but also its dark-energy content [87], [88]. IV. FROM WV TO TF SKELETONS VIA CS

The TF representation of the data gathered by an interferometric GW detector contains (noise-blurred) sparse features. These include highly localized juts , representing glitches and GW transients with small time-bandwidthﬁgures, and ridges , i.e. almost one-dimensional features representing the frequency evolution of wideband GW chirps.Together, the above sparse features form the (generalized) TF skeleton of the data.Unfortunately, no TF representation (including the WV, wavelet and Q transform) returns the skeleton of the data.The AD and HHT attempt to ﬁnd sparse TF approximations consisting only of ”juts” or ”ridges” , respectively. Thisis physically unjustiﬁed, and limits their range of meaningful applicability.In this section we use the compressed sensing paradigm to derive general, sparse and highly resolved

TF representationsfrom the WV without making any unjustiﬁed assumption, following the approach originally proposed in [91].As shown in [91], this technique compares favorably to [ ? ]-[49] in terms of eﬀectiveness and computational burden.It is worth stressing, however, that it does not attempt to derive an artifact-free WV representation, but rather to construct the skeleton of the TF distribution which is a-priori unknown, and otherwise unavailable.The CS paradigm relies on the fact that signals that are sparse in the TF domain can be essentially recovered using arelatively small set of samples from the Fourier- conjugate domain, where the signal is expected to be dense , accordingto the Gabor-Heisenberg principle [92], [93].To exploit this property, we start from the AF : A x ( ξ, τ ) = F [ W x ( t, f )] (43)and manage to distill the (sparsest) TF representation (skeleton) of our data using a bunch of values of the AF froma suitable neighbourhood Ω of the origin in the ( ξ, τ ) plane. Doing so we use again the well known property of theWV representation whereby its highly-oscillatory intermodulation artifacts map far away from the origin in the AFplane .Formally, we seek a sparse TF distribution (cid:99) W such that (cid:99) W ( t, f ) = arg min ||W ( t, f ) || : S Ω ( ξ, τ ) {F [ W ( t, f )] − A x ( ξ, τ ) } = 0 . (44)here the L norm || · || is the cardinality of the (discrete) support of its argument, and S Ω ( ξ, τ ) = (cid:26) , ( ξ, τ ) ∈ Ω0 , ( ξ, τ ) / ∈ Ω . (45)The cardinality (size) and shape of Ω should be judiciously chosen for best performance, as further discussed in SectionIV B. A. Computational Cost and Practical Implementation

The optimization problem (44) is combinatorially complex, and thus almost unaﬀordable from a computationalviewpoint. Remarkably, under fairly general conditions one can rather address the viable problem (cid:99) W ( t, f ) = arg min ||W ( t, f ) || : S Ω ( ξ, τ ) {F [ W ( t, f )] − A x ( ξ, τ ) } = 0 (46)where the L norm has replaced the L one [92] . In order to take into account the noisyness of the data it is furtherexpedient to relax the equality constraint in (46), replacing it with a suitable bound on the error L norm, viz. (cid:99) W ( t, f ) = argmin ||W ( t, f ) || : || S Ω ( ξ, τ ) {F [ W ( t, f )] − A x ( ξ, τ ) }|| ≤ (cid:15). (47)The optimization problem (47) can be implemented in several ways, including, e.g., log-barrier, interior-point anditeratively-reweighted least-squares algorithms, whose implementations and computational costs have been discussedin [94], [95], [96], respectively.Convex analysis also tells us that any nontrivial solution of (47) is also a solution of (cid:99) W ( t, f ) = argmin λ ||W ( t, f ) || + 12 || S Ω ( ξ, τ ) {F [ W ( t, f )] − A x ( ξ, τ ) }|| (48)for some λ >

0, as shown, e.g., in [97]. In [98] a class of numerically eﬃcient algorithms is proposed for solving (48),which are computationally cheaper than solving (47) directly. As noted in [98], a sensible choice for λ is λ = γ ||F − [Ω A x ] || ∞ , γ ∼ . (cid:15) ∼ − in (47) in our numerical experiments, which is adequate for our present purposes. B. Choice of the Ω Domain

The choice of the cardinality (number of TF samples) and shape of Ω, aﬀects the skeleton appearance in rathersubtle ways, depending on the features of the signals.As the cardinality (hence the diameter) of Ω increases, we may expect more and more intermodulation products toappear in W , until eventually the whole WV distribution is recovered in the limit where Ω covers the whole TF plane.On the other hand, as the cardinality of Ω is decreased, we expect the skeletal components to be gradually spoiled.In [91] a number of multicomponent signals with known skeletons was considered, and it was shown that for a WVof size N × N , the choice card[Ω] ∼ N (the Heisenberg-Gabor cardinality) was the best one [99]. However, as notedin [91], for a given cardinality N , a better skeleton reconstruction is obtained if the shape of Ω ﬁts the shape of theambiguity function, reﬂecting the coherency properties of the signal.In this connection it would be desirable to ﬁnd a systematic and, as far as possible, non -parametric procedure, todetermine the shape of Ω.We adopted a simple (and rather natural) choice for Ω, namely the domain whose boundary ∂ Ω is the contour levelof the RGK (29) embracing N samples of the AF. As we shall see, this choice yields a more faithful representation ofthe skeleton, in all cases where this latter is known a priori, compared to the isotropic domain [90] suggested in [91].It is worth noting that in the present work, kernel adaptation and sparsity promotion are dealt with sequentially.Alternatively (and perhaps more eﬃciently) [100], the function σ ( θ ) in eq. (29) can be sought as part of the solutionof the sparsiﬁcation problem (44). . NUMERICAL EXPERIMENTS In this section we manage to illustrate by numerical simulations how CS-based TF skeleton representations mayoﬀer sharper time-frequency resolution (which is important for classifying glitches and understanding their origin)and substantial de-noising (which is important for detection purposes). A deeper analysis, in particular as regardsthe statistical properties of these representations, will be the subject of future work.In our numerical experiments, the domain Ω was determined as discussed in Section IV B, using the public-domainRGK toolbox [101]. The gradient projection algorithm proposed in [98], from the public-domain GPSR toolbox [102]was used to solve the optimization problem (48), using eq. (49) for the parameter λ . All waveforms were sampledat 1 KHz in chunks 512 samples wide, unless otherwise stated. Q-transforms were computed using the toolbox from[103], and HHT transforms were computed using public domain codes from the Mathworks repository. A. Gabor Molecules

Simple Gabor molecules, consisting of two SG waveforms (Gabor atoms) are perhaps the simplest multicomponent transient signals, and provide a convenient elementary benchmark for a preliminary, controlled comparison amongdiﬀerent TF representations. Here, following [104], we content ourselves with an operational deﬁnition of a multicom-ponent transient, as a signal whose TF skeleton consists of separate features, i.e., blobs or ridges whose TF supportsexhibit the property that their frequency (time) separation at all times (frequencies) is larger than the sum of theirlocal frequency (time) spreads.The panels in ﬁrst (leftmost) column of Fig.s 2-4 display (from top to bottom) the time domain data, frequencyspectrum, and ﬁducial TF skeleton, which can be computed explicitly from the WV, in this case, by subtracting theintermodulation terms.The second column (from left to right) shows the three visually best Q-transforms (in a range of Q values between 5and 50).The next (third) column shows the WV distribution, the BJ-smoothed WV, and the CC skeleton computed using theoptimized Ω domain.Finally, the two panels in the fourth and last column display the HHT transform, and its plain (top panel) and IBW-aware (bottom panel, here nicknamed fat ) versions. We used eq. (17) from [105] to estimate the IBW and constructthe IBW-aware version of the HHT, assuming the (instantaneous) amplitude to fall oﬀ linearly from a maximum atthe IF, down to zero at the half-bandwidth edges, while keeping the total energy (integral of squared instantaneousamplitude across the IBW) equal to the squared IA of the original HHT.In Fig. 2 the molecule consists of two SG having the same time width, but diﬀerent time centroids, carrier frequenciesand peak amplitudes; in Fig. 3 the two atoms have the same carrier frequency but diﬀerent time centroids; in Fig. 4they have the same time centroid but diﬀerent carrier frequencies.The CC skeleton computed using the optimized Ω domain yields the sharpest TF localization, and goes closest to theactual skeleton for all cases considered. The visually best Q-transform appears less well localized. The fat HHT alsoyields a sharp TF signature for the cases in Figures 2 and 3 - but in the case of atoms with coincident time-centroidsand close carrier frequencies (Fig. 4), it fails to reproduce the actual skeleton, being fooled by the resulting beat.Due to the noted limitation, and for the sake of conciseness, in the following Sections we will not include the HHTamong the compared TF representations.

B. GW Transients and Glitches

The qualitative conclusions valid for Gabor molecules are essentially conﬁrmed for realistic waveforms representingboth sought GW transients and glitches of environmental/instrumental origin.Figure 5 refers to a supernova core-collapse GW waveform from [20]. The sampling frequency is 16

KHz in this case.Here again the TF skeleton obtained using the BJ-optimized Ω domain yields the best TF localization, followed bythe BJ-smoothed WV. These two representations display sharp TF signatures of both the ”bounce” component (the”blob”) and the subsequent ”ringdown” (the ”ridge”) in the signal. The Q-transforms are comparatively less sharpand more blurred, and show an evident tradeoﬀ between time and frequency resolution, as the Q value is varied. TheWV is heavily aﬀected by signal-noise intermodulation artifacts.Figures 6a-c display three prototypical glitches from the LIGO 5th Scientiﬁc Run, from [106] (more glitch skeletonsare available in [107]). Here also the TF skeleton obtained using BJ-optimized Ω domain yields the sharpest TFlocalization. The BJ-smoothed WV comes next. The visually-best Q-transform appears to be close to the TF skeletonobtained using the square Ω domain.he case of Fig. 6a, showing part of the ﬁltered ﬁle

DARM ERR ) of LIGO-Hanford detector (H1) is particularly interesting. Its optimized TF skeleton (and BJ-smoothed WV) clearly showfour elementary TF features (”blobs”), suggesting that this glitch is actually a multicomponent waveform (this is alsoconﬁrmed by atomic decomposition).In the case of the typical ﬁltered power mains glitch sensed by an environmental magnetometer (LIGO

POWMAG channel) shown in Fig. 6b all TF representations yield acceptable results, except the TF skeleton obtained using thesquare Ω domain. This is not surprising, given the highly anisotropic character of the AF in this case.Figure 7 displays a typical ”arch-shaped” glitch seen in the GEO-600 data [108]. Such glitches originate from scatteredlight where seismic motion modulates the scatterers’ velocity, and hence represent nonlinear products of environmentalnoise. Also in this case, the TF skeleton obtained using the BJ-optimized Ω domain and the BJ-smoothed WV yieldquite sharp TF distributions. The TF skeleton obtained using the square domain is on a par with the (visually best)Q-transform in terms of resolution, while looking less noisy compared to this latter.

C. TF Consistency Tests

Sparse representations are intrinsically denoising [92]. Accordingly we may expect that using TF skeletons mayboost the performance of TF consistency tests among multiple detector data.This is illustrated in Fig.s 8a to 8h. Here we assume for simplicity that the direction of arrival of the GWs is ﬁduciallyknown from diﬀerent (e.g., EM or neutrino) observations (triggered detection), and consider the following simplestTF-consistency test:1. let W m the TF representation (plain or BJ-smoothed WV, or TF estimated TF skeleton) built from the datagathered by the m − th receiver, m = 1 , , . . . , M ;2. time-shift the TF representations by the appropriate (known) propagation delays;3. compute the ”coincidence” TF representation W ( c ) ( t, f ) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) M (cid:89) m =1 W m ( t, f ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) /M ; (50)4. deﬁne the pixels in the coincidence TF representation whose level is above the median as ”hot”.In Fig.s 8a to 8h we restrict ourselves to the simplest 2-detectors case. In Fig.s 8a and 8b the data contain a GWchirp corrupted by glitches in additive white Gaussian noise. The SNR of the chirp against the Gaussian backgroundis 15, and is twice that of the glitches. Figure 8a displays the WV, the BJ-smoothed WV and the TF skeleton (usingthe BJ-optimized Ω domain). Figure 8b displays the Q-transforms, for Q = 8, 20 . Q = 8, 12 . D. Coincidence TF Skeletons and Denoising

The nice properties of the coincidence TF skeleton in terms of denosing are illustrated in Fig.s 9a and 9b, showingthe level histograms of the ”hot” pixels drawn from the various coincidence TF representations discussed above. Inach histogram panel the number of hot pixels, their average level, and their cumulative level sum are displayed.In the WV, BJ-smoothed WV and Q-transform cases the number of hot pixels is one half the size of the corresponding(discrete) TF representations. The average level and cumulative level sum of the hot pixels for the two cases where asignal is present is only slightly larger (by a factor always <

2) compared to the case of noise only.In the TF skeleton, as an eﬀect of sparsiﬁcation, the number of hot pixels is smaller by a factor ∼ compared tothe size of the (discrete) WV. In the chirp case (with SNR=15) the energy content of the signal in the TF plane isspread along a ridge (consisting of ∼ . · hot pixels, in our example) while in the merger case (with SNR=10) itis concentrated in a blob (consisting of ∼ · hot pixels, in our example). Compared to the case of noise only, theaverage level of the hot pixels is larger by a factor ∼ ∼

10 (for the merger), and the cumulativesum of the hot pixel levels is larger by a factor ∼ . ∼ . VI. CONCLUSIONS

We reviewed the compressed sensing (CS) paradigm, and suggested its possible use in TF GW data analysis, usingillustrative examples. Gravitational wave chirps and bursts as well as instrumental glitches share the common featureof being represented by sparse

TF objects, including thin 1 D − ridges, and localized 2 D − juts. The main goal of theTF skeleton construction algorithm is to produce readable TF representations of complex, sparse, multi-componentwaveforms, which are free from intermodulation artifacts and still oﬀer as much time-frequency resolution as possible.These are conﬂicting requirements, nonetheless the WV-CS based approach yields an excellent tradeoﬀ between them,as illustrated. We managed to show that TF skeletons may yield better insight into the ﬁne structure of glitches,hopefully improving our understanding about their origin, and our skill in classifying them. We also showed that CS-based sparse TF representations (skeletons) act as an eﬀective denoising algorithm, improving TF-based coincidencetests.We accordingly suggest that sparse CS-based TF skeletons may be nice complements to existing tools for GW dataanalysis. ACKNOWLEDGEMENTS

This work has been supported by the Italian Ministry for University and Scientiﬁc Research (MIUR) through thegrant 20082J7FBN.

REFERENCES (2005) 255.[6] M. Principe and I.M. Pinto, ”Locally Optimum Network Detector of Unmodeled GW Bursts in Glitch Noise,” LIGO-P1000134 (2010).[7] L. Blackburn et al. ”The LSC Glitch Group: Monitoring Noise Transients During the Fifth LIGO Science Run,” Class.Quantum Grav. (2008) 184004.[8] N. Christensen et al., ”LIGO S6 detector characterization studies,” Class. Quantum Grav. (2010) 194010.[9] J.Aasi et al., ”The Characterization of Virgo Data and its Impact on Gravitational-Wave Searches,” Class. QuantumGrav. (2012) 155002.[10] N. Arnaud et al., ”Coincidence and Coherent Data Analysis Methods for Gravitational Wave Bursts in a Network ofInterferometric Detectors,” Phys. Rev. D68 (2003) 102001.[11] L. Cohen, ”Time-Frequency Distributions: a Review,” Proc. of the IEEE, (1989) 941.[12] B. Boashash, ”Estimating and Understanding the Istantaneous Frequency of a Signal - I,” Proc. IEEE (1992) 520.13] The Rice University Compressive Sensing Resources page, http://dsp.rice.edu/CS[14] I.M. Pinto and V. Matta, ”Sparse Representations for Detecting Unmodeled Transients and Representing UnmodeledGlitches,” LIGO Document G1200986 (2012).[15] R. Inta, ”Sparse Methods for Gravitational Wave Detection,” LIGO document G1200593 (2012).[16] P. Addesso et al., ”Sparsifying Time-Frequency Distributions for Gravitational Wave Data Analysis,” Proc. 3rd IEEEIntl. Workshop on Compressed Sensing Th. Appl., Pisa (2015), p. 154.[17] L. Blanchet., ”Gravitational Radiation from Post-Newtonian Sources and Inspiralling Compact Binaries,” Living Rev.Gen. Relativity (2006) 4.[18] J.G. Baker et al., ”Comparison of Binary Black Hole Merger Waveforms,” Class. Quantum Grav. (2007) S25.[19] Th. Damour and A. Nagar, ”New Analytic Representation of the Ringdown Waveform of Coalescing Spinning Black HoleBinaries,” Phys. Rev D90 (2014) 024054.[20] H. Dimmelmaier et al., ”Gravitational Wave Burst Signals from Core Collapse of Rotating Stars,” Phys. Rev.

D78 (2008)064056.[21] M. Principe and I.M. Pinto, ”Modeling the Impulsive Noise Component and its Eﬀect on the Operation of a SimpleCoherent Network Algorithm for Detecting Unmodeled Gravitational Wave Bursts,” Class. Quantum Grav., , 075013(2008).[22] L. Cohen, Time-Frequency Analysis , Englewood Cliﬀs, NJ: Prentice- Hall, 1995.[23] P. Flandrin,

Time-Frequency/Time-Scale Analysis , San Diego, CA: Academic, 1999.[24] S. G. Mallat,

A Wavelet Tour of Signal Processing , New York: Academic, 1998.[25] B. Boashash, ”Time-Frequency Signal Analysis and Processing,” New York, Elsevier, 2015.[26] F. Hlawatsch and G. F. Boudreaux-Bartels, Linear and Quadratic Time Frequency Signal Representations, IEEE SignalProcessing Mag., pp. 2167, Apr. 1992.[27] J. Bertrand and P. Bertrand, A Class of Aﬃne Wigner Distributions with Extended Covariance Properties, J. Math.Phys., , 2515 (1992).[28] A. Papandreou, F. Hlawatsch, and G. F. Boudreaux-Bartels, The Hyperbolic Class of QTFRsPart I: Constant-Q Warping,the Hyperbolic Paradigm, Properties, and Members, IEEE Trans. Signal Processing, SP-41 , 3425 (1993).[29] P. O. Amblard and J. L. Lacoume, Construction of Fourth Order Cohens Class: a Deductive Approach, in Proc. IEEEInt. Symp. on TimeFreq. Time-Scale Anal., 1992, p. 257.[30] L. J. Stankovic, A Method for Improved Distribution Concentration in the Time-Frequency Analysis of MulticomponentSignals Using the L-Wigner Distribution, IEEE Trans. Signal Processing,

SP-43 , 1262 (1995).[31] J.R. Fonollosa, C.L. Nikias, ”Wigner Higher Order Moment Spectra: Deﬁnition Properties, Computation and Applicationsto Transient Signal Analysis,” IEEE Trans.

SP-41 , 245 (1993).[32] B. Boashash and P. OShea, Polynomial WignerVille Distributions and their Relationship to Time-Varying Higher OrderSpectra, IEEE Trans. Signal Processing,

SP-42 , 216 (1994).[33] P. Loughlin, J. Pitton, and L. Atlas, Construction of Positive Time-Frequency Distributions, IEEE Transactions on SignalProcessing , 2697 (1994).[34] L. Blackburn, ”KleineWelle Technical Document,” LIGO Technical Note T060221-00-Z (2007); id. ”Glitch investigationswith Kleine Welle,” LIGO Document G050158 (2005).[35] S. K. Chatterji, ”Introduction to Q-Scan,” LIGO Document G060483 (2006).[36] J.C. Brown, ”Calculation of a Constant Q Spectral Transform,” J. Acoust. Soc. Am., (1991) 425.[37] J.C. Brown and M.S. Puckette, ”An eﬃcient Algorithm for the Calculation of a Constant Q Transform,” J. Acoust. Soc.Am., (1992) 2698.[38] S. K. Chatterji, ”The Search for Gravitational-Wave Bursts in Data from the second LIGO Science Run,” PhD thesis,MIT, Boston MA, USA (2003).[39] S. K. Chatterji et al., ”Multiresolution Techniques for the Detection of GW Bursts,” Class Quantum Grav. (2004)S1809.[40] F. Robinet, ”Omicron: an Algorithm to Detect and Characterize Transient Noise in Gravitational-Wave Detectors.”https://tds.ego-gw.it/ql/?c=10651 (2015).[41] T. A. C. M. Claasen and W. F. G. Mecklenbrauker, The Wigner Distribution-a Tool for Time-Frequency Signal Analysis,Philips J. Res., , 217 (1980); ibid., p. 276.[42] M. Feo et al ”Eﬃcient GW Chirp Detection and Estimation via Time-Frequency Analysis and Edge Detection,” in Proc.7th Marcel Grossman Meeting on General Relativity,

R. T. Jantzen et al., Eds., Singapore: World Scientiﬁc, p 1086.[43] W. G. Anderson and R. Balasubramanian R, 1999 ” Time-Frequency Detection of Gravitational Waves,” Phys. Rev.

D60 (1999) 102001.[44] E. Chassande-Mottin and P. Flandrin ”On the Time-Frequency Detection of Chirps,” Appl. Comp. Harm. Anal. (1999)252.[45] R.P. Croce et al., ”Robust Gravitational Wave Burst Detection and Source Localization in a Network of InterferometersUsing Cross-Wigner Spectra,” Class. Quantum Grav. (2011) 045001.[46] D. Mustard, ”The Fractional Fourier Transform and the Wigner Distribution,” J. Austral. Math. Soc., B 38 , 209 (1996).[47] The equality between the time (frequency) marginal distribution of the WV and the power (spectral density) of itsargument is a special case of the above property.[48] F. Hlawatsch et al., ” Smoothed Pseudo Wigner Distribution, Choi-Williams Distribution and Cone-Kernel Representa-tion: Ambiguity Domain Analysis and Experimental Comparison,” Signal Processing (1995) 149.[49] P. Flandrin, F. Auger, and E. Chassande-Mottin, ”Time-frequency Reassignment: From principles to Algorithms,” in pplications in Time-Frequency Signal Processing , A. Papandreou-Suppappola, Ed., CRC Press, Boca Raton FL, USA(2003), ch. 5.[50] R.G. Baraniuk and D.L. Jones, ”A Signal-Dependent Time-Frequency Representation: Optimal Kernel Design,” IEEETrans. SP-41 (1993) 1589.[51] R.G. Baraniuk and D.L. Jones, ”Signal-Dependent Time-Frequency Analysis using a Radially Gaussian Kernel,” SignalProc. (1993) 263.[52] K. Kodera, C. De Villedary, and R. Gendrin, ”A New Method for the Numerical Analysis of Nonstationary Signals,”Phys. Earth Plan. Int., , 142 (1976).[53] N.E. Huang, et al., ”The Empirical Mode Decomposition and the Hilbert Spectrum for Nonlinear and NonstationaryTime Series Analysis,” Proc. Roy. Soc., A454 , 903 (1998).[54] P. Flandrin and P. Goncalves, ”Empirical Mode Decomposition as Data-Driven Wavelet-like Expansion,” Int. J. Wavelets,Multires. and Info. Proc, , 1 (2004).[55] P. Flandrin, G. Rilling and P. Goncalves, ”Empirical Mode Decomposition as a Filter Bank,” IEEE Sig. Proc. Lett. ,112 (2004).[56] A. Stroer et al., ”Methods for Detection and Characterization of Signals in Noisy Data with the Hilbert-Huang Transform,”Phys. Rev. D79 , 1s24022 (2009).[57] A. Stroer, L. Blackburn and J. Camp, ”Comparison of Signals from Gravitational Wave Detectors with InstantaneousTime-Frequency Maps,” Class. Quantum Grav. , 155001 (2011).[58] H. Takahashi et al., ”On Investigating EMD Parameters to Search for Gravitational Waves,” Adv. Adapt. Data Anal. (2001) 1153.[62] B. Picinbono, ”On Instantaneous Amplitude and Phase of Signals,” IEEE Trans. , 552 (1997).[63] G.R. Putland and B. Boashash. ”Can a Signal be both Monocomponent and Multicomponent?” in Proc. WoSPA 2000,p. 14 (2000).[64] E. Bedrosian, The Analytic Signal Representation of Modulated Waveforms, Proc. IRE, , 2071 (1962).[65] E. Bedrosian, A Product Theorem for Hilbert Transforms, Proc. IEEE , 868 (1963).[66] A.H. Nuttall, On the Quadrature Approximation to the Hilbert Transform of Modulated Signals, Proc. of the of IEEE (1966) 1458.[67] I. Daubechies, L. Jianfeng and H.-T. Wu, ”Synchrosqueezed Wavelet Transforms: an Empirical Mode Decomposition-likeTool,” Appl. Comput. Harmon. Anal.

243 (2011).[68] S. Meignen and V. Perrier, ”A New Formulation for Empirical Mode Decomposition Based on Constrained Optimization,”IEEE Sig. Proc. Lett., , 932 (2007).[69] Z. Wu and N. Huang, ”Ensemble Empirical Mode Decomposition: a Noise-Assisted Data Analysis Method,” Advances inAdaptive Data Analysis, , 1(2009).[70] S. Olhede and A.T. Walden, ”The Hilbert Spectrum via Wavelet Projection,” Proc. Roy. Soc. Lond. A460 , 955 (2004).[71] X. Fang, H. Luo, ”An Improved Hilbert Transform for Nonlinear Vibration Signal Analysis,” Proc. 50thAIAA/ASME/ASCE/AHS/ASC Conference, 2651 (2009).[72] R.T. Rato, M.D. Ortiguera and A.G. Batista, ”On the HHT, its Problems, and Some Solutions,” Mechanical Systemsand Signal Processing, , 1374 (2008).[73] P. Flandrin, ”One or Two Frequencies? The Empirical Mode Decomposition Answers,” IEEE Trans. Signal Proc. , 85(2008).[74] In (G. Rilling and P. Flandrin, Adv. Adapt. Data Anal. , 43 (2009).) it was further noted that even for a single-tone signal, the frequency EMD estimation error blows up with the square of the carrier frequency, and can be small only ifthe ratio between the sampling and signal frequency is an even integer.[75] R. Gupta et al., Estimation of Instantaneous Frequencies Using Iterative Empirical Mode Decomposition, Signal, Imageand Video Proc., (2014) 799.[76] N.J. Stevenson, M. Mesbah and B. Boashash, ”Multiple-View TimeFrequency Distribution Based on the Empirical ModeDecomposition,” IET Signal Proc. , 447 (2010).[77] S. Mallat and Z. Zhang, ”Matching Pursuits with Time-Frequency Dictionaries,” IEEE Trans. Signal Process. , 3397(1993).[78] D. Gabor, ”Theory of Communication,”, J. IEE, (1946) 429.[79] J.M. Bastiaans, ”Gabors Expansion of a Signal into Gaussian Elementary Signals,” Proc. IEEE (1980) 538.[80] T.B. Littenberg and N.J. Cornish, ”Separating Gravitational Wave Signals from Instrument Artifacts,” Phys. Rev. D82 ,103007 (2010).[81] N.J. Cornish and T.B. Littenberg, ”Bayeswave: Bayesian Inference for Gravitational Wave Bursts and InstrumentGlitches,” Class. Quantum Grav. , 135012 (2015).[82] B. Abbot et al., ”Search for Gravitational Waves Associated with 39 Gamma-Ray Bursts using Data from the Second,Third, and Fourth LIGO Runs,” Phys. Rev. D 77 , 062004 (2008).[83] M. M. Goodwin and M. Vetterli, ”Matching Pursuit and Atomic Signal Models Based on Recursive Filter Banks,” IEEETrans. Signal Proc.

SP- 47 , 1890 (1999).84] M. G. Christensen and S. van de Par, ”Eﬃcient Parametric Coding of Transients.”, IEEE Trans. Audio, Speech andLang. Proc., , 1340 (2006).[85] B.L. Sturm and J.J. Shynk, ”Sparse Approximation and the Pursuit of Meaningful Signal Models With InterferenceAdaptation,” IEEE Trans. Audio, Speech, Lang. Proc. , 421 (2009).[86] B. L. Sturm et al., ”Dark Energy in Sparse Atomic Estimations,” IEEE Trans. Audio, Speech and Lang. Proc. , 671(2008).[87] B.L. Sturm, J.J. Shynk and D.H. Kim, ”Pruning Sparse Signal Models Using Interference,” Proc. IEEE CISS’09, p. 454(2009).[88] B.L. Sturm and J.J. Shynk, ”Interference-Driven Adaptation in Sparse Approximation,” Proc. 42nd Asilomar Conferenceon Signals, Systems and Computers, p. 241 (2008).[89] S. Rampone et al., ”Neural Network Aided Glitch-Burst Discrimination and Glitch Classiﬁcation,” International Journalof Modern Physics C24 (2013) 1350084[90] The isotropic Ω domain in the discrete TF plane is a rectangle whose aspect ratio depends on the chosen time-windowwidth and sampling frequency.[91] P. Flandrin and P. Borgnat, ”Time-Frequency Energy Distributions Meet Compressed Sensing,” IEEE Trans.

SP-58 (2010) 2974.[92] D. L. Donoho, ”Compressed Sensing,” IEEE Trans.,

IT-52 (2006) 1289.[93] E. Cand`es, J. Romberg and T. Tao, ”Robust Uncertainty Principles: Exact Signal Reconstruction from Highly IncompleteFrequency Information,” IEEE Trans. Inf. Th.,

IT-52 (2006) 489.[94] E. Cand`es, and J. Romberg, ” (cid:96) − (2001)129.[96] D.L. Donoho, M Elad, and V.N. Temlyakov, ”Stable Recovery of Sparse Overcomplete Representations in the Presenceof Noise,” IEEE Trans. IT-52 (2006) 6.[97] R.T. Rockafellar, ”

Convex Analysys ,” Princeton University Press, Princeton NJ (1970).[98] M.A.T. Figueiredo, R. Nowak and S.J. Wright ”Gradient Projection for Sparse Reconstruction: Application to CompressedSensing and Other Inverse Problems,” IEEE J. Sel. Top. Sig. Proc., (2007) 586.[99] For card[Ω ] ∼ N , the L distance between the actual and constructed skeleton was minimized, and the Renyi entropy of (cid:99) W reproduced that of the actual ∼ (1992) 1520.[106] P. Saulson, ”Listening to Glitches,” LIGO document G070548 (2007) and related project web-pages athttp://physics.syr.edu/research/relativity/ligo/restricted/glitch catalog/ .[107] A. Fusco et al., ”An Atlas of Compressed Coding Retrieved LIGO S5 Glitch Skeletons,” LIGO document T1300598.(2013).[108] M. Was, private communication (2013). igure 1 – Time-frequency tilings for STFT, WLT and QT.Figure 2 – Gabor molecule consisting of two atoms with different time centroids and carrier frequencies. Comparison among the Q transform, the Wigner-Ville (plain, BJ-smoothed and CC cured), and the Hilbert-Huang transform without and with (fat) IBW information. waveformspectrum igure 3 – Gabor molecule consisting of two atoms with different time centroids and equal carrier frequencies. Comparison among the Q transform, the Wigner-Ville (plain, BJ-smooth- ed and CC cured), and the Hilbert-Huang transform without and with (fat) IBW information. waveformspectrum igure 4 – Gabor molecule consisting of two atoms with equal time centroids and different carrier frequencies. Comparison among the Q transform, the Wigner-Ville (plain, BJ-smooth-ed and CC cured), and the Hilbert-Huang transform without and with (fat) IBW infor-mation. waveformspectrum igure 5 - Supernova core-collapse GW waveform from [31]. The panels display the time waveform (noise-free and with added whiteGaussian noise), the Q-transforms for Q=4.8, 6.7, 9.5, and 13.5, the WV and the BJ-smoothed WV, the CC skeletons obtained us-ing the square and BJ-optimized Ω domain, and the pertinent ambiguity function, together with the square and BJ-optimized Ωdomain boundaries.igure 6a - A LIGO glitch from the 5 th Science Run (part of

H1 DARM_ERR_866273637 from [51]). The panels display the time wave-form (noise-free and with added white Gaussian noise), the Q-transforms for Q=4.8, 6.7, 9.5, and 13.5, the WV and the BJ-smooth-ed WV, the CC skeletons obtained using the square and BJ-optimized Ω domain, and the pertinent ambiguity function, together with the square and BJ-optimized Ω domain boundaries.igure 6b - A LIGO ( power_mag ) glitch from the 5 th Science Run from [51]. The panels display the time wave-form (noise-free and with added white Gaussian noise), the Q-transforms for Q=4.8, 6.7, 9.5, and 13.5, the WV and the BJ-smoothed WV, the CC ske-letons obtained using the square and BJ-optimized Ω domain, and the pertinent ambiguity functions, together with the square and BJ-optimized Ω domain boundaries. - -

20. 0. 20. 40. - - @ Hz D d e l a y @ s D ambiguity functionand W domains

0. 0.1 0.2 0.3 0.4 0.5 - - @ s D a m p lit ud e waveform

0. 0.1 0.2 0.3 0.4 0.5 - - @ s D a m p lit ud e waveform H noisy L igure 6c - A LIGO ( power_mag ) glitch from the 5 thth