An Efficient Forecasting Approach to Reduce Boundary Effects in Real-Time Time-Frequency Analysis
Adrien Meynard, Hau-Tieng Wu
Abstract—Time-frequency (TF) representations of time series are intrinsically subject to boundary effects. As a result, the structures of signals that are highlighted by the representations are garbled when approaching the boundaries of the TF domain. In this paper, for the purpose of real-time TF information acquisition of nonstationary oscillatory time series, we propose a numerically efficient approach for the reduction of such boundary effects. The solution relies on an extension of the analyzed signal obtained by a forecasting technique. For a class of locally oscillating signals, we provide a theoretical guarantee of the performance of our approach. Following a numerical verification of the algorithmic performance of our approach, we validate it by implementing it on biomedical signals.
Index Terms—Boundary effects, time-frequency, forecasting, nonstationarity
I. INTRODUCTION

In any digital acquisition system, the study and the interpretation of the measured signals generally require an analysis tool, which enables researchers to point out the useful characteristics of the signal. The need for signal analysis arises from various signals, ranging from audio [1], [2] and mechanical [3] to biomedical signals [4]. For instance, biomedical signals, such as the photoplethysmogram (PPG), contain several characteristics, including respiratory rate or blood pressure, that cannot be interpreted from their run-sequence plot in the time domain. An analysis tool makes the extraction of these useful characteristics possible.

Usually, the measured signals exhibit nonstationary behavior, and the observed quantities might be interfered with by transient phenomena that can vary rapidly and irregularly. In this paper, we focus on oscillatory time series. These signals might, for example, oscillate fast with large amplitudes at one moment, and then oscillate slowly with small amplitudes at the next moment. In order to adapt the analysis to nonstationarities, local spectral analysis is generally performed [5], [6].
A. Meynard is with the Department of Mathematics, Duke University, Durham, NC, 27708 USA. H.-T. Wu is with the Department of Mathematics and Department of Statistical Science, Duke University, Durham, NC, 27708 USA; Mathematics Division, National Center for Theoretical Sciences, Taipei, Taiwan. A. Meynard is the corresponding author (e-mail: [email protected]).

The short-time Fourier transform (STFT) [7], a typical tool built for this purpose, enables the determination of the local frequency content of a nonstationary signal. Windowing is a common method for performing local analysis. Among many others, the STFT [8], the continuous wavelet transform (CWT) [9], the synchrosqueezing transform (SST) [10], and reassignment (RS) [11] are representations that rely on the use of an analysis window. Let x : I → R denote the observed signal, where I denotes the finite interval over which the signal is measured. Let g_s : R → R denote the analysis window, where s is a shape parameter. The support of g_s is localized around the origin and is small with respect to |I|. The translation operator T_τ is defined by T_τ f(t) = f(t − τ), for all f : R → R. Then, the local analysis of x around the instant τ ∈ I relies on the evaluation of the following inner product:

V_x(s, τ) = ⟨x, T_τ g_s⟩_I.   (1)

A major shortcoming of this technique occurs when analyzing the signal x near the boundaries of the interval I. Clearly, at these points, half of the information is missing. Consequently, the results of the inner product (1) are distorted. This phenomenon is usually referred to as the boundary effect. The result of the SST of a PPG is shown in the lower-left corner of Fig. 1 (see Section IV-D3 for a comprehensive description). The distortion resulting from boundary effects is clearly visible on the right side of this representation; this area is highlighted by the red dashed rectangle.
Indeed, while clear lines stand out in the major left part of the image, they become blurred as they approach the right boundary of the image. Therefore, estimations of signal characteristics, such as instantaneous frequencies [12] or amplitudes, from this TF representation appear to be imprecisely determinable (or even likely to fail) in the vicinity of the boundaries. Moreover, this boundary effect unavoidably limits the application of these window-based TF analysis tools for real-time analysis purposes. It is thus desirable to have a solution for eliminating the boundary effects.

Attempts to minimize the boundary effects generally consist in softening the discontinuity at the signal edges. Usually, the approaches can be classified into two main
classes: choosing a proper analysis window, and extending/forecasting the signal. In the first class, a judicious choice of an analysis window whose support does not interfere with the boundary points can minimize the occurrence of aberrant patterns near the boundaries of the TF plane; see, for instance, [13], [14]. Due to the specific relationship between the chosen analysis windows and the TF analysis tool, these techniques do not make it possible to reduce the boundary effects of all TF representations. Another natural idea, the second class, consists of carrying out a preliminary step of extending, or forecasting, the signal beyond its boundaries. Due to its flexibility, various forecasting schemes have been proposed. For example, there exist simple extension schemes that do not take into account the dynamical behavior of the signal, such as zero-padding, periodic extension, symmetric extension [15], [16], or polynomial extrapolation [17]. There exist extension schemes based on physically relevant dynamical models, such as the Extended Dynamic Mode Decomposition (EDMD) [18] and the Gaussian process regression (GPR) [19], [20]. In speech processing, dynamic mode predictors have also been proposed [21] to forecast signals falling into the so-called source–filter model.

Fig. 1. A segment of PPG signal (top), the right boundary of a TF representation determined by the SST without extension (bottom left), and the right boundary of a TF representation determined by the SST with the proposed boundary effect reduction algorithm by forecasting (bottom right). The window length for the SST is 12 seconds. Videos of these real-time SSTs are available online at https://github.com/AdMeynard/BoundaryEffectsReduction.
This gradient-based technique relies on the shadowing approach proposed in [22]. There also exist extension schemes based on stochastic models, such as the Trigonometric, Box-Cox transformation, ARMA errors, Trend and Seasonal components (TBATS) algorithm [23], and dynamic linear models [24]. In the physically relevant dynamical models, the oscillation or trend is usually modeled as the mean of a stochastic process, while in the stochastic models, the oscillation or trend is modeled by the covariance structure of the stochastic process. It is also possible to consider polynomial regression [25], modeling the mean by splines [26], kernel functions [27], and wavelets [28], or nonparametric regression [25], to estimate the mean function before forecasting the signal. Neural network models, such as the long short-term memory model [29], are another approach. The above list is far from exhaustive, and we refer interested readers to [30] for a friendly monograph on the general forecasting topic. While model-based extension schemes give better-extended signals than simple extensions without dynamical models, they generally have a great computational cost.

In this paper, we propose a fast extension algorithm based on a simple dynamical model that optimizes the trade-off between the extension quality and the computational cost, so that real-time analysis can be achieved. The algorithm is composed of two steps.

1) Extend the signal by forecasting it. The aim is to use a simple dynamic model to predict the values taken by the measured signal outside the measurement interval. Once this operation is done, we have access to an extended signal defined on a larger interval I_∆, where ∆ denotes the size of the extension on both boundaries of I.

2) Run the local analysis tool on the extended signal.
Assuming that the support of the analysis window is smaller than 2∆, the local analysis near the boundary of I is now possible without lack of information, thanks to the knowledge brought by the extension. Thus, assuming that the quality of the extension step is sufficient, the analysis results obtained in this way are less sensitive to the boundary effects than the result of the analysis tool applied directly to the nonextended signal. We claim and prove that forecasting oscillatory signals based on the simple dynamical model, combined with the simple least squares approach, is sufficient for reducing the boundary effects for TF analysis, or other kernel-based analysis. To our knowledge, such a theoretical analysis does not exist in the literature. See the bottom of Fig. 1 (in particular, the regions indicated by the dashed boxes) for a snapshot of the result. The main benefit of this simple approach is a numerically efficient solution with a theoretical guarantee for real-time analysis purposes.

The paper is organized in the following way. In Section II, we provide an extension method based on a linear dynamic model, and we derive the corresponding algorithm for boundary effects reduction. In Section III, we show that the dynamic model we consider is sufficient to extend signals taking the form of sums of sine waves; an evaluation of the theoretical performance of our algorithm on this class of signals is also given there. In Section IV, we compare our extension method with more sophisticated methods such as EDMD, GPR, or TBATS, and we show that our algorithm gives fast results of reasonable quality. Finally, we evaluate the performance of our boundary effects reduction algorithm on biomedical signals, such as respiratory signals, and compare it to the theoretical results.

II. ALGORITHM
As explained above, the algorithm for the reduction of boundary effects on TF representations relies on extending the signal by forecasting it before applying the TF analysis.

We start with the notation. Let x : R → R denote a continuous-time signal. In this work, we consider a finite-length discretization of it. Thus, the sampled signal x, whose length is denoted by N, is such that

x[n] = x(n/f_s), ∀n ∈ {0, . . . , N − 1},

where f_s denotes the sampling frequency. Let M and K be two positive integers such that M < K and K + M ≤ N. Then, for all k ∈ {0, . . . , K}, we extract from x ∈ R^N the subsignal x_k ∈ R^M given by:

x_k = (x[N − K − M + k], . . . , x[N − K + k − 1])^T.   (2)

These subsignals are gathered into the matrix X ∈ R^{M×K} such that:

X = (x_0 · · · x_{K−1}).

Notice that these subsignals overlap each other. Indeed, x_{k+1} is a shift of x_k by one sample. We also consider the matrix Y ∈ R^{M×K} given by:

Y = (x_1 · · · x_K).

The boundary effect reduction algorithm is based on manipulating X and Y.

The pseudo-code of the proposed real-time algorithm to reduce boundary effects on windowing-based TF representations is shown in Algorithm 1. We coined the algorithm BoundEffRed. Below, we detail the algorithm, particularly the signal extension SigExt in Algorithm 2.
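The construction of X and Y, together with the least-squares forecast detailed in the following subsections, can be sketched in a few lines. The following is a minimal NumPy illustration (the authors' released implementation is MATLAB; the function names `lag_matrices` and `sig_ext` are ours, not the paper's API):

```python
import numpy as np

def lag_matrices(x, M, K):
    """Columns of X are the overlapping subsignals x_0, ..., x_{K-1} taken
    from the tail of x; Y holds the same windows shifted by one sample."""
    N = len(x)
    assert M < K and K + M <= N
    X = np.column_stack([x[N - K - M + k : N - K + k] for k in range(K)])
    Y = np.column_stack([x[N - K - M + k + 1 : N - K + k + 1] for k in range(K)])
    return X, Y

def sig_ext(x, M, K, L):
    """Fit Y = A X by least squares, then forecast L samples recursively."""
    x = np.asarray(x, dtype=float)
    X, Y = lag_matrices(x, M, K)
    # A_tilde = Y X^T (X X^T)^{-1}, computed through a solve of
    # (X X^T) A^T = X Y^T rather than an explicit inverse
    A = np.linalg.solve(X @ X.T, X @ Y.T).T
    window = x[-M:].copy()            # the last subsignal, x_K
    forecast = np.empty(L)
    for ell in range(L):
        window = A @ window           # one-step prediction of the next subsignal
        forecast[ell] = window[-1]    # its last entry is the new sample
    return np.concatenate([x, forecast])
```

On a noisy two-tone signal, this forecast tracks the clean oscillation well beyond the right edge, which is exactly what the TF representation needs near the boundary.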
Algorithm 1 Tackling boundary effects of a TF representation in real time: F_x = BoundEffRed(x, M, K, L, F)
Inputs: x_N, M, K, L, F
while N increases do
    Real-time input: x_N
    Forecasting step.
        • Signal extension: x̃_{N+L} = SigExt(x_N).
    Representation estimation step.
        • Extended representation evaluation: F(x̃_{N+L}).
        • Restriction of F(x̃_{N+L}) to the current time interval (see (9)) to obtain F_x^{(N)} = F_ext(x_N).
    Real-time output: Signal representation F_x^{(N)}
end while

A. Step 1: Extension by Forecasting

a) Dynamical Model:
Establishing a dynamical model means determining the relation linking Y to X; that is, finding a function f so that

Y = f(X).   (3)

In a general framework, forecasting means estimating the function f from the observed values taken by the signal, in order to predict its future values. For instance, the dynamic mode decomposition [31], [18] or other more complicated models [20], [23], [24], [29] allow this by setting additional constraints on the behavior of f. We will see, in Section III, that considering such a complex dynamic model is not necessary for the study of the oscillatory signals of interest to us. That is why we consider here a naive dynamical model, assuming that we have the following relation:

Y = AX,   (4)

where A ∈ R^{M×M}. In other words, we adopt a classical strategy in the study of dynamical systems, the linearization of a nonlinear system, when the system is sufficiently regular. Notice that this linearized dynamical model can be written equivalently in terms of the subsignals x_k as:

x_{k+1} = A x_k, ∀k ∈ {0, . . . , K − 1}.   (5)

b) Forecasting: Determining the forecast amounts to estimating the unknown matrix A. Indeed, let Ã denote the estimate of A. We then obtain the forecast of x_{K+ℓ} by recursively applying the linear relation (4):

x̃_{K+ℓ} = Ã Ã · · · Ã (ℓ times) x_K = Ã^ℓ x_K.

Let e_M denote the unit vector of length M given by e_M = (0 · · · 0 1)^T. Consequently, given definition (2), the forecast of the signal at time (N − 1 + ℓ)/f_s is provided by

x̃[N − 1 + ℓ] = e_M^T x̃_{K+ℓ} = α^(ℓ) x_K,   (6)

where α^(ℓ) denotes the last row of Ã^ℓ, i.e., α^(ℓ) = e_M^T Ã^ℓ.

c) Model Estimation: To estimate the matrix A, we consider the simple but numerically efficient least squares estimator. That is, we solve the following problem:

Ã = arg min_A L(A),   (7)

where the loss function L is given by:

L(A) = ‖Y − AX‖² = Σ_{k=0}^{K−1} ‖x_{k+1} − A x_k‖².

Therefore, solving the problem (7), i.e., ∇L(Ã) = 0, gives the following estimate Ã of the dynamical model matrix A:

Ã = Y X^T (X X^T)^{−1}.   (8)

Remark 1.
This expression clearly shows that the matrix Ã takes the following companion form:

Ã = ( 0   1   0   ⋯   0
      0   0   1   ⋯   0
      ⋮            ⋱   ⋮
      0   0   ⋯   0   1
      α_1 α_2  ⋯  α_{M−1} α_M ).

Then, except for the row vector α = (α_1 · · · α_M), the matrix Ã is fully determined by the dynamical model.

d) Signal Extension: Since we are building a real-time algorithm, we consider that only the right "side" of the signal has to be extended, the left "side" being fixed since it only concerns the past values of the signal. We therefore construct the extended signal x̃ ∈ R^{N+L} by concatenating the observed signal x and the forward forecast x̃_fw ∈ R^L. We summarize the extension step in Algorithm 2.

Algorithm 2
Signal extension: x̃ = SigExt(x, M, K, L)
Inputs: x_N, M, K, L
Forward forecasting.
    • Estimation of the matrix Ã via equation (8).
    • Forecast x̃_fw ∈ R^L obtained by applying equation (6) with ℓ ∈ {1, . . . , L}.
Output: Extended signal x̃_{N+L} = (x_N x̃_fw)^T.

B. Step 2: Extended Time-Frequency Representation
Let F : R^N → C^{F×N} generically denote the TF representation of interest, which could be, for instance, the STFT, CWT, SST, or RS. Here, F typically denotes the size of the discretization along the frequency axis. Due to the boundary effects, the representation F(x_N) shows undesired patterns near its edges. To alleviate the boundary effects, we apply the representation to the estimated extended signal x̃. This strategy moves the boundary effects out of the time interval I = [0, (N − 1)/f_s]. Finally, the boundary-effects insensitive representation F_ext : R^N → C^{F×N} of x_N is given, for all ν ∈ {0, . . . , F − 1} and n ∈ {0, . . . , N − 1}, by:

F_ext(x_N)[ν, n] = F(x̃_{N+L})[ν, n].   (9)

This amounts to restricting the representation F(x̃_{N+L}) to the current measurement interval of x_N. For the sake of simplicity, we denote the restriction operator by R, where R : C^{F×(N+L)} → C^{F×N}. Consequently, we have:

F_ext(x_N) = R(F(x̃_{N+L})).   (10)

We call F_ext the boundary-free TF representation. F_x^{(N)} ∈ C^{F×N} is the estimation of the boundary-free TF representation at iteration N in Algorithm 1. For numerical purposes, and to make the real-time implementation achievable, we do not perform a full re-estimation of F_ext at iteration N + 1. Instead, the additional knowledge provided by x[N + 1] only influences the values of the last L_win columns of F_ext(x_{N+1}), where L_win denotes the window half-length used by the TF representation. Thus, F_x^{(N+1)} is obtained by the concatenation of the first N − L_win + 1 columns of F_x^{(N)} with the last L_win columns of F_ext(x_{N+1}).

III. THEORETICAL PERFORMANCE
A. Signal Model
We model the meaningful part of the observed signal as a deterministic multicomponent harmonic signal; that is, a sum of sine waves:

z[n] = Σ_{j=1}^{J} Ω_j cos(2π f_j n / f_s + ϕ_j),   (11)

where J denotes the number of components, Ω_j > 0 the amplitude of the j-th component, f_j its frequency, and ϕ_j ∈ [0, 2π) its initial phase. For the sake of simplicity, we make an additional assumption on the frequencies of each component. We assume that for all j ∈ {1, . . . , J}:

∃ p_j, p′_j ∈ N*: f_j = (p_j / M) f_s = (p′_j / K) f_s.   (12)

In addition, the observed signal is assumed to be corrupted by an additive Gaussian white noise. Therefore, the measured discrete signal x is written as:

x = z + σw,   (13)

where z follows model (11), w is a Gaussian white noise whose variance is normalized to one, and σ > 0. Clearly, σ² denotes the variance of the additive noise σw.

B. Forecasting Error
On the forecasting interval, we decompose the estimated signal x̃ as follows:

x̃[n] = z[n] + ε[n],   (14)

where ε is the forecasting error. When n ∈ I = {0, . . . , N − 1}, this error contains only the measurement noise, that is, ε[n] = σw[n]. Outside the interval I, the magnitude of the forecasting error ε is also affected by the loss of information resulting from the linearization of the dynamical model we consider in (4). To evaluate the actual behavior of the forward forecasting error ε[n] when n ≥ N, we determine its first two moments.

1) The mean, or estimation bias, is such that:

μ[n] ≜ E{ε[n]} = E{x̃[n]} − z[n].   (15)

Given the forecasting strategy, we have μ[n] = 0 when n ∈ I, and

μ[n] = E{α^(n−N+1)} z_K + σ E{α^(n−N+1) w_K} − z[n]   (16)

when n ≥ N.

2) The covariance is given by:

γ[n, n′] ≜ E{(ε[n] − μ[n])(ε[n′] − μ[n′])}
         = E{x̃[n] x̃[n′]} − z[n] z[n′] − μ[n] z[n′] − μ[n′] z[n] − μ[n] μ[n′].

Thus, by definition of the noise, we have γ[n, n′] = σ² δ_{n,n′} when (n, n′) ∈ I². When n ≥ N, let us denote ℓ = n − N + 1. Then, we have two cases.

(i) If n′ ∈ I:

γ[n, n′] = σ E{w[n′] α^(ℓ)} z_K + σ² E{w[n′] α^(ℓ) w_K}.   (17)

(ii) If n′ = N − 1 + λ ≥ N:

γ[n, n′] = z_K^T E{α^(ℓ)T α^(λ)} z_K + σ E{(α^(ℓ) w_K) α^(λ)} z_K + σ E{(α^(λ) w_K) α^(ℓ)} z_K + σ² E{(α^(ℓ) w_K)(α^(λ) w_K)} − z[n] z[n′] − z[n] μ[n′] − z[n′] μ[n] − μ[n] μ[n′].   (18)

Besides, we recall that γ[n, n′] = γ[n′, n].

In Theorem 1, we specify the asymptotic behavior of the forecasting bias and covariance when the dataset size K is large.

Theorem 1.
Let x ∈ R^N be a discrete-time random signal following model (13). Let x̃ denote its forecast, obtained using the extension Algorithm 2. Let n ≥ N be a sample index. Then, the first-order moment of the forecasting error ε[n] in (14), defined in (15), satisfies

|μ[n]| ≤ a_1(n) σ + (1/K)(a_2(n) σ + a_3(n)/σ) + o(1/K)   (19)

as K → ∞, where a_1(n), a_2(n), and a_3(n) are positive quantities, independent of K and σ. Its second-order moment γ[n, n′] satisfies the following approximation equations:

(i) if n′ ∈ I = {0, . . . , N − 1}:

|γ[n, n′]| ≤ b_1(n, n′) σ² + (1/K)(b_2(n, n′) + b_3(n, n′) σ²) + o(1/K)   (20)

as K → ∞, where b_1(n, n′), b_2(n, n′), and b_3(n, n′) are positive quantities, independent of K and σ;

(ii) if n′ ≥ N:

|γ[n, n′]| ≤ c_1(n, n′) σ² + (1/K)(c_2(n, n′) σ² + c_3(n, n′) + c_4(n, n′)/σ²) + o(1/K)   (21)

as K → ∞, where c_1(n, n′), c_2(n, n′), c_3(n, n′), and c_4(n, n′) are positive quantities, independent of K and σ.

Proof. See the Supplementary Materials. The proof is mainly based on Taylor expansions of the forecasting error. Nonoptimal values of a_1(n), a_2(n), . . . , b_1(n, n′), . . . , c_1(n, n′), . . . are also provided in the proof.

The forecasting bias and covariance asymptotically depend linearly on the ratio 1/K. This shows the need to use a sufficiently large dataset to obtain an accurate forecast. Ideally, when K → ∞, the forecasting error would behave like the measurement noise σw, i.e., a zero-mean white noise whose variance is of the order of σ². However, Theorem 1 shows that the estimator may remain asymptotically biased (the first term in the bound of equation (19) being independent of K). Concerning the covariance of the forecasting error, the expected asymptotic behavior is verified. Indeed, when K → ∞, the variance of the forecasting estimator increases at most linearly with the noise variance σ², since it is bounded by b_1(n, n′) σ².

Assume now that K is sufficiently large and fixed. The noise influences the quality of the forecasting estimator in two ways. On the one hand, when the noise variance σ² is large, the bias and the variance of the estimator potentially increase with the noise level. This reflects the classical behavior of an estimator of a signal degraded by additive noise.
On the other hand, due to the terms in 1/σ and 1/σ², when the noise variance σ² is small, the bias and the variance of the forecasting estimator are no longer controlled via equations (19) and (21). This illustrates the necessary presence of noise to obtain an accurate prediction. Indeed, if σ is too small, the matrix XX^T is ill-conditioned and its inversion in (8) is sensitive. The forecast is then of poor quality. Noise therefore appears as a regularization term that ensures the invertibility of XX^T. This seemingly counterintuitive fact indicates the challenge we encounter when forecasting with the dynamic model (3): the degeneracy. Note that, from the lag map perspective, X and Y consist of points located on a low-dimensional manifold. Thus, locally, the ranks of X and Y are low, which leads to the degeneracy. However, when noise exists, it naturally fills in the deficient ranks and stabilizes the algorithm.

The dependencies of the forecasting quality on the subsignal length M and on the forecasting index ℓ = n − N + 1 are contained in the quantities a_1(n), . . . , b_1(n, n′), . . . , c_1(n, n′), . . . . The numerical results, discussed in Section IV-C1, allow us to better evaluate these dependencies.

C. Adaptive Harmonic Model
One can extend the previous result to the case where the instantaneous frequencies and amplitudes of the components of the deterministic part of the observed signal are slowly varying. It is an AM-FM model that, in its continuous-time version, takes the following form:

z(t) = Σ_{j=1}^{J} a_j(t) cos(2π φ_j(t)),   (22)

where a_j(t) and φ′_j(t) describe how large and how fast the signal oscillates at time t. Clearly, (11) is a special case satisfying the AM-FM model when the a_j and φ′_j are both constant. In many practical signals, the amplitude and frequency do not change fast. It is thus reasonable to further restrict the regularity and variations of the instantaneous amplitudes and frequencies of these AM-FM functions. When the speeds of variation of the instantaneous amplitudes a_j and frequencies φ′_j are small, the signal can be "locally" well approximated by a harmonic function in (11); that is, locally at time t_0, z(t) can be well approximated, by the Taylor expansion, by

z_0(t) := Σ_{j=1}^{J} a_j(t_0) cos(2π(φ_j(t_0) − t_0 φ′_j(t_0) + φ′_j(t_0) t)).

An AM-FM function satisfying the slow variation of the instantaneous amplitudes a_j and frequencies φ′_j is referred to as the adaptive harmonic model (AHM) (see [32], [33] for mathematical details). It is thus clear that the forecasting error caused by the SigExt algorithm additionally depends on the speed of variation of the instantaneous amplitudes a_j and frequencies φ′_j. For the forecasting purpose, when the speeds of variation of a_j and φ′_j are slow, K can be chosen large while the signal remains well approximated by (11). Hence, Theorem 1 can still be applied to approximate the error. However, how to determine the optimal K based on the signal depends on the application and is out of the scope of this paper. It will be explored in future work for specific application problems.

The models described in Sections III-A and III-C consider the meaningful part of the signal to be a deterministic component. These models are purposely adapted to signals showing local line spectra.
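The locally harmonic approximation above can be checked numerically. The following sketch uses made-up amplitude and frequency laws (not taken from the paper): on a half-second window around t_0, a slowly modulated AM-FM signal deviates from its frozen-coefficient harmonic approximation by only a few percent of the amplitude.

```python
import numpy as np

fs = 100.0
t = np.arange(0.0, 10.0, 1.0 / fs)
a = 1.0 + 0.05 * np.sin(0.2 * np.pi * t)   # slowly varying amplitude a(t)
phi = t + 0.01 * t**2                      # phase phi(t); phi'(t) = 1 + 0.02 t
z = a * np.cos(2 * np.pi * phi)

# Frozen-coefficient harmonic approximation around t0, using the first-order
# Taylor expansion phi(t) ~ phi(t0) - t0*phi'(t0) + phi'(t0)*t:
t0 = 5.0
i0 = int(t0 * fs)
dphi0 = 1.0 + 0.02 * t0
z0 = a[i0] * np.cos(2 * np.pi * (phi[i0] - t0 * dphi0 + dphi0 * t))

mask = np.abs(t - t0) <= 0.5
local_error = float(np.max(np.abs(z[mask] - z0[mask])))
```

With these modulation rates, `local_error` stays a couple of orders of magnitude below the unit amplitude, consistent with the claim that slow variations keep the signal locally indistinguishable from model (11).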
The physiological signals we are interested in, such as respiratory or cardiac signals, have this characteristic (see Section IV-D). Signals with wider local spectra, such as electroencephalography signals, do not fall into this category and are more faithfully modeled as random signals. Thus, the theoretical justifications proposed above are no longer applicable to guarantee the forecasting quality of SigExt.

D. Performance of the Boundary Effects Reduction
While we do not provide a generic proof of the boundary effect reduction for arbitrary TF analysis tools, we discuss the particular case of the SST. Since the SST is designed to analyze signals satisfying the AHM, as discussed in Section III-C, Theorem 1 ensures that the forecasting error ε defined in (14) is controlled and bounded in terms of mean and covariance. Recall Theorem 3 in [32], which states that when the additive noise is stationary, and may be modulated by a smooth function with a small fluctuation, the SST of the observed signal remains close to the SST of the clean signal throughout the TF plane. We refer the reader to [32] for a precise quantification of the error made in the TF plane, which depends notably on the covariance of the additive noise and on the speed of variation of the amplitudes and instantaneous frequencies composing the signal. In our case, while the noise of the historical data is stationary, the forecasting error depends on the historical noise, and hence the overall "noise" is nonstationary. However, this dependence only appears in the extended part of the signal. We claim that the proof of Theorem 3 in [32] can be generalized to explain the robustness of the SST to this noise, and a systematic proof will be reported in our future theoretical work. We verify experimentally that Algorithm 1 is efficient for a large number of representations in the following Section IV. Therefore, in our case, the boundary effect is strongly reduced, since the impact of the forecasting error on the SST is limited. An immediate application is that the instantaneous frequencies can now be well estimated continuously up to the edges of the TF plane, and real-time acquisition of the instantaneous frequency and amplitude information is possible.

IV. NUMERICAL RESULTS
For reproducibility purposes, the MATLAB code and datasets used to produce the numerical results in this section are available online at https://github.com/AdMeynard/BoundaryEffectsReduction.
A. Extension Methods for Comparison
To highlight the fact that the linear dynamical model is sufficient to catch most of the dynamical behavior of signals following the AHM, we compare the performance of Algorithm 1 with a simple extension obtained by pointwise symmetrization [15]. We also evaluate the performance of reference forecasting algorithms that could be used for extending such signals. These methods are:

• The EDMD, developed by Williams et al. [18]. This algorithm is a way to obtain an approximation of the so-called Koopman operator of the observed system, which theoretically allows catching the dynamics of nonlinear systems [34].

• The GPR [19], a method relying on a probabilistic dynamical model. It is based on the Gaussian process structure, and therefore offers more flexibility in the type of dynamics that can be modeled than the linear model (4).

• The TBATS method [23], based on a classical decomposition of time series into trend, seasonal, and ARMA components, with a specific dynamic for the seasonal component. This model demands the estimation of numerous parameters and, by implication, may require a long computation time.
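Among these baselines, the pointwise symmetrization is the simplest: the right edge of the signal is mirrored. A sketch follows (the function name is ours; mirroring around the last sample is one common convention):

```python
import numpy as np

def symmetric_extension(x, L):
    """Extend x on the right by L samples by mirroring around its last sample."""
    x = np.asarray(x)
    assert L < len(x)
    return np.concatenate([x, x[-2:-2 - L:-1]])
```

Such an extension ignores the signal's dynamics: the mirrored samples reverse the local phase progression, which is precisely what the forecasting approach avoids.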
B. Evaluation Metrics
The first evaluation metric quantifies the forecasting quality in the time domain (i.e., it does not depend on ℓ). We evaluate the experimental mean square error MSE_xp(x̃) of the forward forecast extended signals, namely,

MSE_xp(x̃) = (1/L) ‖x̃ − x_ext‖²   (23)
           = (1/L) Σ_{ℓ=1}^{L} (μ_xp[N − 1 + ℓ]² + γ_xp[N − 1 + ℓ, N − 1 + ℓ]),

where x_ext is the ground-truth extended signal; that is, x_ext := (x[−L] · · · x[N − 1 + L])^T. Then, as long as the bias μ[N − 1 + ℓ] and the variance γ[N − 1 + ℓ, N − 1 + ℓ] of the forecasting estimator remain small for all ℓ, the MSE takes small values as well.

The second evaluation metric evaluates the quality of the boundary effect reduction directly on the TF representation. We compare the obtained TF representation to the optimal TF representation F_opt(x_N), defined as the restriction of the representation of the ground-truth extended signal x_ext; that is, we have

F_opt(x_N) = R(F(x_ext)),   (24)

where R is defined in (10). To compare different techniques, we use a criterion proposed in [33] that quantifies the optimal transport (OT) distance between a given TF representation and the optimal one. In general, the OT distance is a quantity that measures how different two probability density functions are. Denote a TF representation as Q. For t fixed, we consider the probability density function defined by p_Q^t(ξ) = |Q(ξ, t)| / ∫_R |Q(ν, t)| dν; that is, we normalize the spectral information at time t. At each time t, the OT distance d_t between two TF representations Q and F_opt is defined by

d_t(Q, F_opt) = ∫_R |P_Q^t(ξ) − P_{F_opt}^t(ξ)| dξ,

where P_Q^t(ξ) = ∫_{−∞}^{ξ} p_Q^t(ν) dν and P_{F_opt}^t(ξ) = ∫_{−∞}^{ξ} p_{F_opt}^t(ν) dν. In other words, the OT distance between Q and F_opt at time t is given by the L¹ norm of the difference of the associated cumulative density functions.
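In discrete form, the distance d_t reduces to comparing cumulative sums of normalized spectral columns; a sketch (with a unit frequency-bin width assumed, and a function name of ours):

```python
import numpy as np

def ot_distance_column(q_col, f_col, d_xi=1.0):
    """L1 distance between the CDFs of two normalized spectral columns,
    i.e., the 1-D optimal transport (Wasserstein-1) distance."""
    p_q = np.abs(q_col) / np.sum(np.abs(q_col))
    p_f = np.abs(f_col) / np.sum(np.abs(f_col))
    return d_xi * float(np.sum(np.abs(np.cumsum(p_q) - np.cumsum(p_f))))
```

For two spectral lines concentrated in single bins, the distance is simply the bin offset, which matches the intuition that the OT distance penalizes spectral content drifting away from its true location, as a blurred boundary does.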
Note that d_t(Q, F_opt) quantifies the proximity between the estimated and actual TF representations at time t. With the OT distance, we define the performance index D(Q, F) for the reduction of boundary effects of a given TF representation Q over another TF representation F by

D(Q, F) = ( ∫_I d_t(Q, F_opt) dt ) / ( ∫_I d_t(F, F_opt) dt ).   (25)

In other words, D(Q, F) is the ratio between the averaged OT distance of Q to the optimal TF representation F_opt (see (24)) and the averaged OT distance of the original TF representation F to F_opt. Thus, D(Q, F) < 1 when the representation Q is less affected by the boundary effects than F.

C. Simulated Signals
We first evaluate the quality of the forecasting step and compare it to the theoretical results provided by Theorem 1. The level of the forecasting error depends on at least two parameters:
• The noise variance σ².
• The size of the training dataset K.
In Sections IV-C1 and IV-C2, we study the influence of these parameters. A comparison with the theoretical results of Section III is also provided.

Fig. 2. Evolution of the experimental forecasting variance as a function of the noise variance for three different values of the forecasting sample index ℓ.
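The role of the noise level discussed in Section III can be checked directly on the Gram matrix XX^T. The following sketch (with hypothetical frequencies and sizes, not the paper's settings) shows that, for a noiseless sum of two sinusoids, XX^T is numerically rank-deficient, while a small noise floor makes it well conditioned:

```python
import numpy as np

def lag_matrix(x, M, K):
    """Overlapping subsignals from the tail of x, as columns of an M-by-K matrix."""
    N = len(x)
    return np.column_stack([x[N - K - M + k : N - K + k] for k in range(K)])

rng = np.random.default_rng(0)
N, M, K = 2000, 33, 600
n = np.arange(N)
z = np.cos(2 * np.pi * 3 * n / 33) + 0.5 * np.cos(2 * np.pi * 7 * n / 33)

cond = {}
for sigma in (0.0, 1e-2):
    X = lag_matrix(z + sigma * rng.standard_normal(N), M, K)
    cond[sigma] = np.linalg.cond(X @ X.T)
# The noiseless signal spans only four dimensions in the 33-dimensional lag
# space, so XX^T is numerically singular; the noisy Gram matrix is invertible.
```

This is the degeneracy phenomenon described in Section III-B: the noise fills in the deficient ranks and stabilizes the inversion in (8).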
1) Sum of sine waves:
We proved that the linear dynamic model is sufficient to catch the dynamical behavior of signals taking the form (11). In order to validate this theoretical result, we apply the forecasting Algorithm 2 to a large number of realizations of the random vector x of size N, following model (13), and such that the deterministic component z takes the form

z[n] = cos( 2π p₁ n/M ) + A cos( 2π p₂ n/M ), ∀n ∈ {1, . . . , N},

with p₂ = 33, fixed values of M, p₁, and A, and noise w ∼ N(0, I).

a) Influence of the Noise Variance σ²: Here, the size of the training dataset K is fixed. The forecasting algorithm is run on a large number of realizations of x for 200 different values of σ², logarithmically equispaced. For each of these values, we determine the experimental bias, denoted as μ_xp[N−1+ℓ], and the experimental variance, denoted as γ_xp[N−1+ℓ, N−1+ℓ]. Fig. 2 shows the experimental forecasting variance as a function of the noise variance for three different values of the forecasting sample index ℓ. The estimation of x[N−1+ℓ] for ℓ = 10 is inaccurate as soon as the noise variance σ² drops below a critical threshold (see the red line in Fig. 2). This validates the result of Theorem 1, highlighting the necessary presence of noise to ensure the numerical stability of the estimator. Furthermore, as soon as σ² exceeds this critical threshold, the estimator variance increases linearly with σ², as predicted by the bound (21) provided in Theorem 1. Note that the parameter ℓ has little influence on the quality of the estimator. This is due to the stationarity of the studied signal, which allows long-range forecasts without deterioration of the estimator.

b) Influence of the Training Dataset Size K: Here, the noise variance σ² is fixed. Then, SigExt is run on 500 realizations of the discrete signal x for 200 different values of the training dataset size K, logarithmically equispaced from 450 to 2000. For each of these values, we determine the experimental bias μ_xp[N−1+ℓ] and variance γ_xp[N−1+ℓ, N−1+ℓ].

Fig. 3. Evolution of the experimental forecasting variance as a function of the dataset size, for three different values of the forecasting sample index ℓ.

Fig. 3 shows the experimental forecasting variance as a function of the training dataset size for three different values of the forecasting sample index ℓ. The variance decreases with K as long as the dataset size is large enough. Indeed, in the log-log plot displayed in Fig. 3, for K large enough the variance tends towards an asymptote of slope −1. In practice, the asymptotic behavior of SigExt, illustrated here on a synthetic signal, can be generalized to the study of real-life signals, as long as the duration of the recorded signal largely exceeds that of the desired extension, which allows the user to set K ≫ L.

This study overlooks the analysis of the influence of the parameter M, whose influence on the value of the experimental variance is numerically not significant as long as M ≪ K and ℓ < M. The choice of this parameter is especially crucial when the deterministic component of the signal is no longer stationary. The AHM, discussed below, is an example.
2) Adaptive Harmonic Model:
Consider a signal satisfying the AHM, so that the instantaneous frequencies and amplitudes of its components vary over time. The deterministic component z of the random vector x, constructed following model (13), takes the following form, for all n ∈ {1, . . . , N}:

z[n] = cos( 2π φ₁[n] ) + R[n] cos( 2π φ₂[n] ),

where the instantaneous amplitude R[n] oscillates slowly, as a sinusoidal function of n/N, around a constant level, and the instantaneous phases φ₁ and φ₂ are slowly varying perturbations of linear phases, so that the corresponding instantaneous frequencies oscillate slowly around fixed values.

TABLE I
AHM Signal: Performance of the Extension Methods and the Associated Boundary-Free STFT
(Columns: extension method; CPU time (s); MSE (mean ± SD); index D of the STFT. Rows: SigExt for three values of M, followed by the other extension methods of Section IV-A.)

Besides, the noise is chosen to be Gaussian, w ∼ N(0, I), and the values of N, P, p₁, and p₂ are fixed.

a) Influence of the subsignal length M: To evaluate the forecasting quality,
SigExt is applied to 1000 realizations of the above-described signal. We forecast signal extensions of 0.1 s on each realization, for several values of the subsignal length M. Note that choosing too small an M does not provide enough information to forecast the signal satisfactorily. Furthermore, too large an M makes the algorithm insensitive to local nonstationarities produced by frequency variations. The user must then make a compromise on the choice of M in order to optimize the performance of the algorithm. A choice of M such that the subsignals contain about a dozen oscillations is generally appropriate.

b) Comparison of the extension methods: The same set of experiments is run on the extension methods outlined in Section IV-A. The results, displayed in Table I, show that the naive extension we propose gives satisfying results. Indeed, when the subsignal length M is optimally chosen, SigExt performance is of the same order as that of more sophisticated methods, like GPR or TBATS. Besides, even though these methods could be slightly more robust to perturbations, they are substantially limited by the computing time they require, which prevents them from being used to exploit real-time data. Thus, SigExt is the extension method that optimizes the trade-off between forecasting quality and computing time. Besides, the last column of Table I gives the performance index D of the boundary-free STFT associated with each extension method. This illustrates the ability of our algorithm to reduce boundary effects to a level comparable with what is possible with the GPR or TBATS extensions. The performance of BoundEffRed on other TF representations is analyzed more thoroughly on real-life signals in the following.
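All boundary-free representations above follow the pattern of (24): extend the signal, compute the TF representation of the extension, and restrict it to the original support. A minimal sketch with SciPy's STFT, using a symmetric reflection as a stand-in for the forecast (the function name and parameter values are ours, not the paper's):

```python
import numpy as np
from scipy.signal import stft

def boundary_free_stft(x, ext, fs, nperseg=256):
    """STFT of an extended signal, cropped back to the support of x.
    `ext` samples are appended on each side; a symmetric reflection
    stands in here for the forecasting extension used in the paper."""
    x_ext = np.concatenate([x[ext:0:-1], x, x[-2:-ext - 2:-1]])
    f, t, Z = stft(x_ext, fs=fs, nperseg=nperseg)
    # Keep only the columns whose centers fall inside the original
    # span, i.e., apply a restriction operator in the spirit of (10).
    keep = (t >= ext / fs) & (t <= (ext + len(x)) / fs)
    return f, t[keep] - ext / fs, Z[:, keep]
```

Only the line building `x_ext` changes when a forecasting extension (SigExt, EDMD, GPR) replaces the reflection; the analysis and cropping steps are identical for every method compared in Table I.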
D. Real Physiological Signals

1) Respiratory Signal:
We first consider a thoracic movement signal of length 6 hours and 20 minutes, recorded by a piezo-electric sensor. The signal is sampled at f_s =
100 Hz. A small portion of this signal is displayed in Fig. 4.
Fig. 4. Extended respiratory signal (blue) obtained by the
SigExt forecasting (first panel), the EDMD forecasting (second panel), the GPR forecasting (third panel), and the TBATS forecasting (fourth panel), superimposed with the ground-truth signal (red dashed line).
From that long signal, we build a dataset of 379 nonoverlapping segments of 60 seconds each, and apply to each of them the SigExt method detailed in Algorithm 2. We forecast 7-second-long extensions on each segment of the signal. The subsignal length M is chosen proportionally to the extension length L and, following Section IV-C1, the training dataset size K is set accordingly. The extensions obtained on one of these segments are shown in Fig. 4. The resulting MSEs of the different methods are given in Table II. Note that the results are given in "mean ± standard deviation" format. The TBATS extension method is not implemented, as its excessive computing time makes it impossible to run on 6000 segments within a reasonable period of time. Note that the MSE of SigExt is on average higher than the MSE of EDMD or of the symmetric extension, with a huge standard deviation. This variability is caused by the presence of a few segments contaminated by artifacts, which are unpredictable via a dynamical model as simple as (4). The left of Fig. 5 illustrates one of these outliers, where
SigExt fails to catch the fast-varying dynamics of the instantaneous amplitude and therefore cannot satisfactorily forecast the signal. The EDMD and symmetric extensions are more robust in those situations, as shown in this example and in Table II. Nevertheless, SigExt provides a sufficiently relevant extension to yield TF representations only sparingly affected by boundary effects. On the right of Fig. 5, we display a comparison between the right boundary of the SST of the same segment of signal (top-right) and its boundary-free SST obtained after the SigExt forecasting (bottom-right). The extension of the instantaneous frequency visible on the right side of the image illustrates the reduction of boundary effects achieved despite an inaccurate signal forecast.

We then apply
BoundEffRed for diverse TF representations: STFT, SST, RS, as well as Concentration of Frequency and Time (ConceFT), a generalized multitaper SST-based representation introduced in [33]. We mention that, unlike other representations, the standard RS is
TABLE II
Respiratory Signal: Performance of the Extension Methods and the Associated Boundary-Free TF Representations
(Columns: extension method; MSE; performance index D (mean ± SD) for the STFT, SST, RS, and ConceFT. Rows: SigExt and the other extension methods.)
Fig. 5. Extensions of a segment of the respiratory signal (left) where
SigExt is outperformed by the EDMD and symmetric extensions. Corresponding SST (top-right) and boundary-free SST obtained with
SigExt (bottom-right).

not causal and rigorously requires knowing the STFT on the whole time-frequency plane before making any reassignment. To enable real-time implementation of RS, we use here the real-time version of RS proposed in [35]. We set the extension length L according to the window length used by the TF analysis tool. For instance, the window used to evaluate the STFT is 1400 samples long. To prevent the STFT from being sensitive to the boundaries, we set L = 700, i.e., equal to half of the width of the window used in the TF transform. In Table II, we give the averaged performance index (25), evaluated over the whole TF representations (including boundaries). Even though SigExt performs only moderately well in the sense of the MSE, the boundary effects are dramatically reduced on the TF representations, and the averaged performance is of the same order as that given by EDMD or GPR. A t-test is performed to compare the performance index of
SigExt with those of the other methods. Under the null hypothesis that the means are equal, with a 5% significance level, the t-tests show no statistically significant difference between
SigExt and EDMD or GPR, regardless of the representation considered.
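Such a comparison can be reproduced with a standard two-sample (Welch) t-test, e.g. with SciPy; the per-segment indices below are synthetic stand-ins drawn with equal means, not the measured values:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
# Hypothetical per-segment performance indices for two extension
# methods with the same mean (stand-ins, not the paper's data).
d_sigext = 0.70 + 0.05 * rng.standard_normal(379)
d_edmd = 0.70 + 0.05 * rng.standard_normal(379)

# Welch's t-test (no equal-variance assumption); a p-value above
# 0.05 means no significant difference at the 5% level.
t_stat, p_value = ttest_ind(d_sigext, d_edmd, equal_var=False)
```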
2) Two-Component Respiratory Signal:
Then, we consider a second respiratory signal that contains not only the respiratory cycle, but also a cardiac component

Fig. 6. Two-component respiratory signal (top, yellow), and the associated
SigExt extension (blue); the simultaneously recorded ECG signal is below. The vertical magenta dotted lines indicate the cardiac cycles in the respiratory signal.

known as the cardiogenic artifact [36]. The top of Fig. 6 shows an excerpt of the measured thoracic impedance-based respiratory signal, recorded during a bronchoscopic sedation procedure, along with the
SigExt extension. Below, the simultaneously recorded ECG signal is depicted. The magenta dotted lines emphasize the coincidences between the cardiogenic artifact of the respiratory signal and the QRS complexes of the ECG signal. Such cardiac information can be harvested and utilized when a first-line heart rate information resource is not available or broken [37].

The top of Fig. 7 displays the ordinary SST and the boundary-free SST of this respiratory signal, sampled at 64 Hz. This TF analysis brings out both components: the respiratory component, indicated by the green arrows, is located around the fundamental frequency 0.3 Hz and its multiples, while the cardiac component, indicated by the blue arrows, is located around 1.5 Hz. Note that, here, boundary effects are only reduced on the right-side boundary of the TF domain, delimited by the red dashed boxes. The middle of Fig. 7 shows a zoom near the right boundary of the ordinary SST, while the bottom of Fig. 7 shows the corresponding zoom on the boundary-free SST. In this area, the boundary-free SST makes it possible to disentangle the components contained in the signal. Moreover, the performance index of this representation takes a value of about D = 0.7; that is, BoundEffRed has reduced the right-side boundary effects by about 30% with respect to the ordinary SST. This shows the ability of our algorithm to work on signals containing several nonstationary components.
3) Photoplethysmogram:
We consider a 640-second photoplethysmogram (PPG) signal, extracted from the PhysioNet dataset [38], [39], sampled at f_s =
125 Hz. A portion of this signal is displayed in Fig. 8. The estimated 5-second extensions of this segment obtained by
SigExt, EDMD, and GPR forecasting are superimposed on the ground-truth extension in Fig. 8. We divide the signal into 32-second-long segments and apply
BoundEffRed to each of them. Table III shows the performance index D of the boundary-free TF representations, averaged over the segments. For all the TF representations considered, the results show that SigExt
Fig. 7. Ordinary SST (top-left) and boundary-free SST (top-right) of a two-component respiratory signal. Middle and bottom: zoom on the areas delimited by the red dashed rectangles; these images are rotated 90 degrees clockwise. The window length for the SSTs is 20 seconds. The instantaneous frequency of the respiration is indicated by the green arrows, while the instantaneous frequency of the cardiogenic artifact is indicated by the blue arrows.
Fig. 8. Extended PPG signal (blue) obtained by the
SigExt forecasting (top), the EDMD forecasting (middle), and the GPR forecasting (bottom), superimposed with the ground-truth signal (red dashed line).

reduces boundary effects about as efficiently as the extensions given by EDMD or GPR. As for the respiratory signal, t-tests show no statistically significant difference between
SigExt and EDMD or GPR, regardless of the representation considered. For visual inspection, the SST resulting from the
BoundEffRed strategy is shown in the bottom-right panel of Fig. 1, where the SST is applied to the portion of PPG displayed in Fig. 8. It produces a significant improvement in the quality of the SST near the boundaries. Indeed, the blurring visible when zooming in on the right boundary of the SST has almost vanished. Real-time tracking of the instantaneous frequencies contained in the measured signal is therefore greatly facilitated.

To further evaluate the influence of the noise level on the performance of
BoundEffRed, we artificially add Gaussian noise to the measured PPG signal. It is thus an additional noise, on top of the measurement noise actually contained in the signal.

TABLE III
PPG Signal: Performance of the Boundary-Free TF Representations According to the Extension Method
(Columns: extension method; MSE; performance index D (mean ± SD) for the STFT, SST, ConceFT, and RS. Rows: SigExt and the other extension methods.)

Fig. 9. Performance index of BoundEffRed applied on a PPG, as a function of the SNR.

Fig. 9 shows the average performance index of
BoundEffRed for different values of the Signal-to-Noise Ratio (SNR). Whatever the representation considered,
BoundEffRed is relatively robust to noise, as long as the SNR remains above 10 dB. Below this level, the reduction of boundary effects gradually deteriorates.
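The corruption step of this experiment, adding white Gaussian noise at a prescribed SNR, can be sketched as follows (the scaling formula is standard; the signal here is a synthetic stand-in for the PPG, not the PhysioNet record):

```python
import numpy as np

def add_noise_at_snr(x, snr_db, rng):
    """Return x plus white Gaussian noise scaled so that
    10*log10(P_signal / P_noise) equals snr_db."""
    p_signal = np.mean(x ** 2)
    p_noise = p_signal / 10 ** (snr_db / 10)
    return x + np.sqrt(p_noise) * rng.standard_normal(x.size)

rng = np.random.default_rng(0)
t = np.arange(0, 10, 1 / 125)            # 10 s at 125 Hz
ppg_like = np.cos(2 * np.pi * 1.2 * t)   # crude 1.2 Hz pulse stand-in
noisy = add_noise_at_snr(ppg_like, snr_db=10, rng=rng)
```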
E. Real-Time Implementation
We simulate the real-time implementation of the SST from
BoundEffRed applied to the PPG described in Section IV-D3, subsampled by a factor of 2. The test is performed on a 2-core Intel Core i5 CPU running at 1.7 GHz with 7.7 GB of RAM.

For this signal, the forecasting step of BoundEffRed takes no more than t_forecast = 46 ms. Denote by H the hop, in samples, between two consecutive updates of the representation; each update of the SST requires the calculation of ⌈L/H⌉ new columns of the SST. In this example, the frequency dimension of the SST being 512, we denote by t_SST the computational time of one column. A sufficient condition on H for real-time implementation is that the computational time needed to update the boundary-free TF representation be smaller than the lag between H samples; that is,

t_forecast + ⌈L/H⌉ t_SST < H / f_s.

In our example, this condition is satisfied as soon as H is large enough.

V. CONCLUSION
In this paper, we propose an algorithm, named
BoundEffRed, for the real-time reduction of boundary effects in TF representations. This method is based on an extension of the signal obtained by a simple-minded and numerically efficient forecasting. We have shown theoretically that the chosen dynamic model is sufficient to extend signals formed by a sum of sine waves. Moreover, the low computational time allows us to switch to a real-time implementation of
BoundEffRed, unlike other existing forecasting methods. The numerical results also confirmed the robustness to noise of
BoundEffRed, as well as its ability to be applied to many TF representations. Additional applications to ECG and PPG are included in the Supplementary Materials.

Various improvements can be considered to make the algorithm more robust. In particular, we have noticed (see Fig. 5) that when the regular oscillations of the observed signal break, the forecasting step is no longer relevant, and only slows down the calculation of the TF representation. A preliminary step should then be added to the algorithm to detect signal activity and disable the forecasting step when possible. More fundamentally, one can also consider accelerating the computational time by optimizing the forecasting step to improve the real-time performance of
BoundEffRed. Indeed, each new forecast requires the inversion of the matrix XXᵀ of size M × M. However, the matrix X used to forecast x_{N+H} at the current iteration differs from the one used at the previous iteration to forecast x_N by only H columns. It then seems natural to take inspiration from the work of Strobach [40], generalized by Badeau et al. [41], which proposes a fast algorithm for the singular value decomposition of successive data matrices taking the same form as X. Their algorithm is limited to the case where H =
1. Developing an extension of this algorithm to variations of H > 1 would enable fast updates of (XXᵀ)⁻¹. These avenues will be explored in our future work.

REFERENCES

[1] D. Stowell,
Computational Analysis of Sound Scenes and Events .Springer, 2018, ch. Computational Bioacoustic Scene Analysis, pp.303–333.[2] M. Müller, D. P. W. Ellis, A. Klapuri, and G. Richard, “Signalprocessing for music analysis,”
IEEE Journal of Selected Topics inSignal Processing , vol. 5, no. 6, pp. 1088–1110, 2011. [3] Z. Peng, F. Chu, and Y. He, “Vibration signal analysisand feature extraction based on reassigned waveletscalogram,” Journal of Sound and Vibration
Detection and Estimation Methods for Biomedical Signals ,1st ed. USA: Academic Press, Inc., 1996.[5] P. Stoica and R. Moses,
Spectral analysis of signals . Upper SaddleRiver, N.J.: Pearson/Prentice Hall, 2005.[6] G. Matz, F. Hlawatsch, and W. Kozek, “Generalized evolutionaryspectral analysis and the Weyl spectrum of nonstationary randomprocesses,”
IEEE Transactions on Signal Processing , vol. 45, no. 6,pp. 1520–1534, Jun. 1997.[7] K. Gröchenig,
Foundations of time-frequency analysis , ser. Appliedand Numerical Harmonic Analysis. Boston, MA: Birkhäuser Inc.,2001.[8] P. Flandrin,
Time-frequency/time-scale analysis , ser. Wavelet Analysisand its Applications. San Diego: Academic Press Inc., 1999,vol. 10.[9] I. Daubechies,
Ten lectures on wavelets , ser. CBMS-NSF RegionalConference Series in Applied Mathematics. Philadelphia, PA:Society for Industrial and Applied Mathematics (SIAM), 1992,vol. 61.[10] I. Daubechies, J. Lu, and H.-T. Wu, “Synchrosqueezedwavelet transforms: An empirical mode decomposition-like tool,”
Applied and Computational Harmonic Analysis
IEEE Signal Processing Magazine ,vol. 30, no. 6, pp. 32–41, 2013.[12] N. Delprat, B. Escudié, P. Guillemain, R. Kronland-Martinet,P. Tchamitchian, and B. Torrésani, “Asymptotic wavelet andGabor analysis: extraction of instantaneous frequencies,”
IEEETransactions on Information Theory , vol. 38, no. 2, pp. 644–664, Mar.1992.[13] C. K. Chui and E. Quak,
Numerical Methods in Approximation The-ory , ser. ISNM 105: International Series of Numerical Mathematics.Basel: Birkäuser, 1992, vol. 9, ch. Wavelets on a Bounded Interval,pp. 53–75.[14] U. Depczynski, K. Jetter, K. Molt, and A. Niemöller,“The fast wavelet transform on compact intervals asa tool in chemometrics: II. Boundary effects, denoisingand compression,”
Chemometrics and Intelligent LaboratorySystems
IEEE Trans-actions on Image Processing , vol. 11, no. 12, pp. 1357–1364, 2002.[16] L. Chen, T. Q. Nguyen, and K.-P. Chan, “Symmetric extensionmethods for m-channel linear-phase perfect-reconstruction filterbanks,”
IEEE Transactions on Signal Processing , vol. 43, no. 11, pp.2505–2511, 1995.[17] J. R. Williams and K. Amaratunga, “A discrete wavelet transformwithout edge effects using wavelet extrapolation,”
The Journal ofFourier Analysis and Applications , vol. 3, pp. 435–449, 1997.[18] M. O. Williams, I. G. Kevrekidis, and C. W. Rowley, “A Data-Driven Approximation of the Koopman Operator: Extending Dy-namic Mode Decomposition,”
Journal of Nonlinear Science , vol. 25,pp. 1307–1346, 2015.[19] C. E. Rasmussen and C. K. I. Williams,
Gaussian Processes for Ma-chine Learning , ser. Adaptive Computation and Machine Learningseries. The MIT Press, 2006.[20] S. Roberts, M. Osborne, M. Ebden, S. Reece, N. Gibson,and S. Aigrain, “Gaussian processes for time-seriesmodelling,”
Philosophical Transactions of the Royal SocietyA: Mathematical, Physical and Engineering Sciences , vol.371, no. 1984, p. 20110550, 2013. [Online]. Available:https://royalsocietypublishing.org/doi/abs/10.1098/rsta.2011.0550[21] J. Vargas and S. McLaughlin, “Speech analysis and synthesisbased on dynamic modes,”
IEEE Transactions on Audio, Speech,and Language Processing , vol. 19, no. 8, pp. 2566–2578, 2011.[22] C. Grebogi, S. M. Hammel, J. A. Yorke, andT. Sauer, “Shadowing of physical trajectories in chaoticdynamics: Containment and refinement,”
Phys. Rev. Lett. , vol. 65, pp. 1527–1530, Sep 1990. [Online]. Available:https://link.aps.org/doi/10.1103/PhysRevLett.65.1527[23] A. M. D. Livera, R. J. Hyndman, and R. D. Snyder, “Forecastingtime series with complex seasonal patterns using exponentialsmoothing,”
Journal of the American Statistical Association ,vol. 106, no. 496, pp. 1513–1527, 2011. [Online]. Available:https://doi.org/10.1198/jasa.2011.tm09771[24] M. West and J. Harrison,
Bayesian forecasting and dynamic models .Springer Science & Business Media, 2006.[25] J. Fan and I. Gijbels,
Local polynomial modelling and its applications:monographs on statistics and applied probability 66 . CRC Press, 1996,vol. 66.[26] P. Hall and J. D. Opsomer, “Theory for penalised spline regres-sion,”
Biometrika , vol. 92, no. 1, pp. 105–118, 2005.[27] Y.-W. Chang, C.-J. Hsieh, K.-W. Chang, M. Ringgaard, and C.-J.Lin, “Training and testing low-degree polynomial data mappingsvia linear svm.”
Journal of Machine Learning Research , vol. 11, no. 4,2010.[28] J. S. Marron, S. Adak, I. Johnstone, M. Neumann, and P. Patil, “Ex-act risk analysis of wavelet regression,”
Journal of Computationaland Graphical Statistics , vol. 7, no. 3, pp. 278–309, 1998.[29] P. R. Vlachas, W. Byeon, Z. Y. Wan, T. P. Sapsis, and P. Koumout-sakos, “Data-driven forecasting of high-dimensional chaotic sys-tems with long short-term memory networks,”
Proceedings of theRoyal Society A: Mathematical, Physical and Engineering Sciences , vol.474, no. 2213, p. 20170844, 2018.[30] R. J. Hyndman and G. Athanasopoulos,
Forecasting: principles andpractice . OTexts, 2018.[31] P. J. Schmid, “Dynamic mode decomposition of numerical andexperimental data,”
Journal of Fluid Mechanics , vol. 656, pp. 5–28,2010.[32] Y.-C. Chen, M.-Y. Cheng, and H.-T. Wu, “Non-parametric and adaptive modelling of dynamic periodicityand trend with heteroscedastic and dependent errors,”
Journal of the Royal Statistical Society Series B , vol. 76,no. 3, pp. 651–682, June 2014. [Online]. Available:https://ideas.repec.org/a/bla/jorssb/v76y2014i3p651-682.html[33] I. Daubechies, Y. G. Wang, and H.-T. Wu, “ConceFT:concentration of frequency and time via a multitaperedsynchrosqueezed transform,”
Philosophical Transactions of theRoyal Society A: Mathematical, Physical and Engineering Sciences ,vol. 374, no. 2065, p. 20150193, 2016. [Online]. Available:https://royalsocietypublishing.org/doi/abs/10.1098/rsta.2015.0193[34] M. Korda and I. Mezi´c, “Linear predictors fornonlinear dynamical systems: Koopman operatormeets model predictive control,”
Automatica
IEEE Transactions on Biomedical Engineering ,vol. 64, no. 1, pp. 145–154, 2017.[36] T. C. Smith, A. Green, and P. Hutton, “Recognition of cardiogenicartifact in pediatric capnograms,”
Journal of Clinical Monitoring ,vol. 10, no. 4, pp. 270–275, Jul. 1994.[37] Y. Lu, H.-T. Wu, and J. Malik, “Recycling cardiogenic artifactsin impedance pneumography,”
Biomedical Signal Processingand Control
IEEETransactions on Biomedical Engineering , vol. 64, no. 8, pp. 1914–1923,2017.[39] A. L. Goldberger, L. A. Amaral, L. Glass, J. M. Hausdorff, P. C.Ivanov, M. R. G., M. J. E., M. G. B., P. C. K., and S. H. E.,“Physiobank, physiotoolkit, and physionet: Components of a newresearch resource for complex physiologic signals. circulation[online]. 101 (23)„”
Circulation , vol. 101, no. 23, pp. e215–e220,2000.[40] P. Strobach, “Square hankel svd subspacetracking algorithms,”
Signal Processing
IEEE Transactions on Signal Processing, vol. 52, no. 1, pp. 1–10, Jan 2004.

Supplementary Materials for "An Efficient Forecasting Approach to Reduce Boundary Effects in Real-Time Time-Frequency Analysis"
Adrien Meynard, Hau-Tieng Wu
I. PROOF OF THEOREM 1

A. Preliminaries

1) Notations:
Consider z ∈ R^N the deterministic signal defined in (11), and denote by z_k ∈ R^M the subsignal such that

z_k[m] = z[N − K − M + k + m], ∀m ∈ {0, . . . , M − 1}, k ∈ {0, . . . , K},

with w_k denoting the subsignal of the noise w defined analogously. Define Z ∈ R^{M×K} and Z′ ∈ R^{M×K}, the matrices such that

Z := ( z_0 ··· z_{K−1} ), Z′ := ( z_1 ··· z_K ).

Let D ∈ R^{M×M} be the matrix defined by D[m, m′] = δ_{(m+1) m′}. Recall the model (13). Based on the definition of the matrices X and Y, we have:

(1/K) XXᵀ = (1/K) ZZᵀ + σ²I + E⁽⁰⁾, where S⁽⁰⁾ := (1/K) ZZᵀ + σ²I,  (26)

(1/K) YXᵀ = (1/K) Z′Zᵀ + σ²D + E⁽¹⁾, where S⁽¹⁾ := (1/K) Z′Zᵀ + σ²D,  (27)

where E⁽ᵃ⁾ := σ E₁⁽ᵃ⁾ + σ² E₂⁽ᵃ⁾, with

E₁⁽ᵃ⁾[m, m′] = (1/K) Σ_{k=0}^{K−1} ( z_k[m + a] w_k[m′] + w_k[m + a] z_k[m′] ),

E₂⁽ᵃ⁾[m, m′] = (1/K) Σ_{k=0}^{K−1} w_k[m + a] w_k[m′] − δ_{(m+a) m′},

and a ∈ {0, 1}. We call E⁽⁰⁾ and E⁽¹⁾ error matrices because

E{E₁⁽⁰⁾} = E{E₂⁽⁰⁾} = E{E₁⁽¹⁾} = E{E₂⁽¹⁾} = 0.

Thus, the random matrix Ã, defined in equation (8), is expressed in terms of the above-defined matrices as

Ã = ( S⁽¹⁾ + E⁽¹⁾ )( S⁽⁰⁾ + E⁽⁰⁾ )⁻¹.

Define A as the deterministic matrix such that

A := S⁽¹⁾ S⁽⁰⁾⁻¹.  (28)

A. Meynard is with the Department of Mathematics, Duke University, Durham, NC, 27708 USA. H.-T. Wu is with the Department of Mathematics and Department of Statistical Science, Duke University, Durham, NC, 27708 USA; Mathematics Division, National Center for Theoretical Sciences, Taipei, Taiwan. A. Meynard is the corresponding author (e-mail: [email protected]).
We denote by α⁽ℓ⁾ and α̃⁽ℓ⁾ the last rows of A^ℓ and Ã^ℓ, respectively. As a result, for ℓ ∈ N*, the error vector h⁽ℓ⁾ defined by h⁽ℓ⁾ := α̃⁽ℓ⁾ − α⁽ℓ⁾ satisfies

h⁽ℓ⁾ = e_Mᵀ ( Ã^ℓ − A^ℓ ) = e_Mᵀ ( ( (S⁽¹⁾ + E⁽¹⁾)(S⁽⁰⁾ + E⁽⁰⁾)⁻¹ )^ℓ − A^ℓ ).  (29)

The randomness of h⁽ℓ⁾ completely comes from the error matrices. Besides, notice that the first M − 1 rows of E⁽¹⁾ equal the last M − 1 rows of E⁽⁰⁾. We therefore gather all sources of randomness into a vector g ∈ R^{M(M+1)}, defined as

g = vec( [ E⁽⁰⁾ ; e_Mᵀ E⁽¹⁾ ] ),  (30)

where "vec" denotes the vectorization operator, which concatenates the columns of a given matrix on top of one another. Then, we can write h⁽ℓ⁾ = f⁽ℓ⁾(g), where f⁽ℓ⁾ is the deterministic function

f⁽ℓ⁾ : R^{M(M+1)} → R^M, g ↦ h⁽ℓ⁾.  (31)

In the following paragraph, we provide some useful lemmas to prove Theorem 1.
2) Lemmas:
Lemma 1 (Expressions of A and S⁽⁰⁾⁻¹). Let S⁽⁰⁾ be the M × M matrix defined in (26), and let A be the M × M matrix defined in (28). Assume the deterministic signal z takes the form (11), and the observed noisy signal takes the form (13). Then, the inverse of the matrix S⁽⁰⁾ is given by

S⁽⁰⁾⁻¹[m, m′] = (1/σ²) δ_{m,m′} − (2/(Mσ²)) Σ_{j=1}^{J} [ 1 / ( 1 + 4σ²/(M Ω_j²) ) ] cos( 2π p_j (m − m′)/M ),  (32)

and the matrix A is given by

A[m, m′] = δ_{(m+1) m′} + (2/M) Σ_{j=1}^{J} [ 1 / ( 1 + 4σ²/(M Ω_j²) ) ] cos( 2π p_j m′/M ) δ_{m, M−1}.  (33)

Let ‖·‖_max denote the maximum norm of a matrix, i.e., ‖B‖_max = max_{n,n′} |B[n, n′]|. Then,

‖S⁽⁰⁾⁻¹‖_max ≤ (1/σ²)( 1 + 2J/M ),  (34)

‖A‖_max ≤ max( 1, 2J/M ).  (35)

Proof. It follows from the signal model (11) that the matrices S⁽⁰⁾ and S⁽¹⁾ take the following form:

S⁽ᵃ⁾[m, m′] = σ² δ_{(m+a) m′} + Σ_{j,j′=1}^{J} Ω_j Ω_{j′} (1/K) Σ_{k=0}^{K−1} cos( 2π (f_j/f_s)(N + m + a + k) + ϕ_j ) cos( 2π (f_{j′}/f_s)(N + m′ + k) + ϕ_{j′} )

= σ² δ_{(m+a) m′} + Σ_{j=1}^{J} (Ω_j²/2) cos( 2π (f_j/f_s)(m + a − m′) ),  (36)

since the sums over k of the oscillating cross terms vanish, the frequencies f_j/f_s being of the form p′_j/K with p′_j integers. Thus, S⁽⁰⁾ is a circulant matrix, and is therefore diagonalizable in the Fourier basis: S⁽⁰⁾ = U Λ⁽⁰⁾ U*, where U[m, m′] = (1/√M) e^{−2πi m m′/M} and Λ⁽⁰⁾ = diag( λ⁽⁰⁾_0, . . . , λ⁽⁰⁾_{M−1} ) with

λ⁽⁰⁾_m = σ² + Σ_{j=1}^{J} (Ω_j²/2) Σ_{q=0}^{M−1} cos( 2π (f_j/f_s) q ) e^{−2πi q m/M} = σ² + (M/4) Σ_{j=1}^{J} Ω_j² ( δ_{m, p_j} + δ_{m, M−p_j} ).

Therefore, S⁽⁰⁾⁻¹ = U Λ⁽⁰⁾⁻¹ U*, which leads to

S⁽⁰⁾⁻¹[m, m′] = (1/σ²) δ_{m,m′} − (2/(Mσ²)) Σ_{j=1}^{J} [ 1 / ( 1 + 4σ²/(M Ω_j²) ) ] cos( 2π p_j (m − m′)/M ).  (37)

Directly, we have

| S⁽⁰⁾⁻¹[m, m′] | ≤ (1/σ²) [ 1 + (2/M) Σ_{j=1}^{J} 1 / ( 1 + 4σ²/(M Ω_j²) ) ] ≤ (1/σ²)( 1 + 2J/M ),

which proves (34). Furthermore, combining equations (36) and (37), we have

A[m, m′] = Σ_{q=0}^{M−1} S⁽¹⁾[m, q] S⁽⁰⁾⁻¹[q, m′] = δ_{(m+1) m′} + (2/M) Σ_{j=1}^{J} [ 1 / ( 1 + 4σ²/(M Ω_j²) ) ] cos( 2π p_j m′/M ) δ_{m, M−1}.

Directly, we have |A[m, m′]| ≤ 1 if m < M − 1, and |A[m, m′]| ≤ (2/M) Σ_{j=1}^{J} 1 / ( 1 + 4σ²/(M Ω_j²) ) ≤ 2J/M if m = M − 1. Thus, (35) follows.

Lemma 2 (Moments of g). Let g ∈ R^{M(M+1)} be the random vector defined in (30). Assume the deterministic signal z takes the form (11), and the observed noisy signal takes the form (13). Then, as K → ∞, the second-order moments of g are bounded as follows:

| E{ g[r] g[r′] } | ≤ (1/K)( C_z σ² + 2σ⁴ ) + o(1/K), ∀(r, r′) ∈ {0, . . . , M(M+1) − 1}²,  (38)

where C_z := 4( Σ_{j=1}^{J} Ω_j )². Besides, higher-order moments of g behave as o(1/K). Proof.
In the following, for all r ∈ {
0, . . . , M ( M + ) − } , we denote g [ r ] = σ g [ r ] + σ g [ r ] , where g [ r ] = E ( a r ) [ m r , m ′ r ] = K K − ∑ k = z k [ m r + a r ] w k [ m ′ r ] + w k [ m r + a r ] z k [ m ′ r ] , g [ r ] = E ( a r ) [ m r , m ′ r ] = K K − ∑ k = w k [ m r + a r ] w k [ m ′ r ] − δ m r + a r , m ′ r ,and m r , m ′ r , a r are the corresponding coordinates of the matrices associated with r through the vectorization opera-tion (30). Thus, order-two moments of this random vector is split as follows: E { g [ r ] g [ r ′ ] } = σ E { g [ r ] g [ r ′ ] } + σ E { g [ r ] g [ r ′ ] } + σ E { g [ r ] g [ r ′ ] } + σ E { g [ r ] g [ r ′ ] } . (39)By definition of the signal z (see equation (11)), we have | z [ n ] | ≤ ∑ Jj = Ω j , for all n ∈ N . Thus, by a direct bound,we have (cid:12)(cid:12) E { g [ r ] g [ r ′ ] } (cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12) K E (cid:26) K − ∑ k , k ′ = z k [ m r + a r ] z k ′ [ m r ′ + a r ′ ] w k [ m ′ r ] w k ′ [ m ′ r ′ ] + z k [ m ′ r ] z k ′ [ m r ′ + a r ′ ] w k [ m r + a r ] w k ′ [ m ′ r ′ ]+ z k [ m r + a r ] z k ′ [ m ′ r ′ ] w k [ m ′ r ] w k ′ [ m r ′ + a r ′ ] + z k [ m ′ r ] z k ′ [ m ′ r ′ ] w k [ m r + a r ] w k ′ [ m r ′ + a r ′ ] (cid:27)(cid:12)(cid:12)(cid:12)(cid:12) ≤ K J ∑ j = Ω j ! K − ∑ k , k ′ = (cid:12)(cid:12) E (cid:8) w k [ m ′ r ] w k ′ [ m ′ r ′ ] (cid:9)(cid:12)(cid:12) + (cid:12)(cid:12) E (cid:8) w k [ m r + a r ] w k ′ [ m ′ r ′ ] (cid:9)(cid:12)(cid:12) + (cid:12)(cid:12) E (cid:8) w k [ m ′ r ] w k ′ [ m r ′ + a r ′ ] (cid:9)(cid:12)(cid:12) + | E { w k [ m r + a r ] w k ′ [ m r ′ + a r ′ ] }| (40)Besides, since w is a white noise,1 K K − ∑ k , k ′ = (cid:12)(cid:12) E (cid:8) w k [ m ′ r ] w k ′ [ m ′ r ′ ] (cid:9)(cid:12)(cid:12) = K K − ∑ k , k ′ = δ k + m ′ r , k ′ + m ′ r ′ = K (cid:0) K − (cid:12)(cid:12) m ′ r − m ′ r ′ (cid:12)(cid:12)(cid:1) .Moreover, 0 ≤ (cid:12)(cid:12) m ′ r − m ′ r ′ (cid:12)(cid:12) ≤ M −
1. Thus,1 K − M − K ≤ K K − ∑ k , k ′ = (cid:12)(cid:12) E (cid:8) w k [ m ′ r ] w k ′ [ m ′ r ′ ] (cid:9)(cid:12)(cid:12) ≤ K .Therefore, 1 K K − ∑ k , k ′ = (cid:12)(cid:12) E (cid:8) w k [ m ′ r ] w k ′ [ m ′ r ′ ] (cid:9)(cid:12)(cid:12) = K + o (cid:18) K (cid:19) .Similar calculations lead to the same results for the other three terms making up the sum (40). Therefore, we have: (cid:12)(cid:12) E { g [ r ] g [ r ′ ] } (cid:12)(cid:12) ≤ J ∑ j = Ω j ! (cid:18) K + o (cid:18) K (cid:19)(cid:19) ≤ C z K + o (cid:18) K (cid:19) . (41)Besides, since odd-order moments of a zero-mean multivariate Gaussian random vector are zero, we have: E { g [ r ] g [ r ′ ] } = K K − ∑ k , k ′ = z k [ m r + a r ] E (cid:8) w k [ m ′ r ] w k ′ [ m r ′ + a r ′ ] w k ′ [ m ′ r ′ ] (cid:9) + z k [ m ′ r ] E (cid:8) w k [ m r + a r ] w k ′ [ m r ′ + a r ′ ] w k ′ [ m ′ r ′ ] (cid:9) − δ m r ′ + a r ′ , m ′ r ′ E { g [ r ] } = E { g [ r ] g [ r ′ ] } = Besides, by a direct calculation, we have: E { g [ r ] g [ r ′ ] } = K K − ∑ k , k ′ = E (cid:8) w k [ m r + a r ] w k [ m ′ r ] w k ′ [ m r ′ + a r ′ ] w k ′ [ m ′ r ′ ] (cid:9) − δ m r + a r , m ′ r K K − ∑ k ′ = E (cid:8) w k ′ [ m r ′ + a r ′ ] w k ′ [ m ′ r ′ ] (cid:9) − δ m r ′ + a r ′ , m ′ r ′ K K − ∑ k = E (cid:8) w k [ m ′ r ] w k ′ [ m ′ r ′ ] (cid:9) + δ m r + a r , m ′ r δ m r ′ + a r ′ , m ′ r ′ = K K − ∑ k , k ′ = E (cid:8) w k [ m r + a r ] w k [ m ′ r ] w k ′ [ m r ′ + a r ′ ] w k ′ [ m ′ r ′ ] (cid:9) − δ m r + a r , m ′ r δ m r ′ + a r ′ , m ′ r ′ (44)Moreover, using the results of the Isserlis’ theorem [1] to express fourth-order moments of a Gaussian randomvector as a function of its second-order moments, we expand sum (44) as follows: E { g [ r ] g [ r ′ ] } = K K − ∑ k , k ′ = E (cid:8) w k [ m r + a r ] w k [ m ′ r ] (cid:9) E (cid:8) w k ′ [ m r ′ + a r ′ ] w k ′ [ m ′ r ′ ] (cid:9) − δ m r + a r , m ′ r δ m r ′ + a r ′ , m ′ r ′ + K K − ∑ k , k ′ = E { w k [ m r + a r ] w k ′ [ m r ′ + a r ′ ] } E (cid:8) w k [ m 
′ r ] w k ′ [ m ′ r ′ ] (cid:9) + K K − ∑ k , k ′ = E (cid:8) w k [ m r + a r ] w k ′ [ m ′ r ′ ] (cid:9) E (cid:8) w k [ m ′ r ] w k ′ [ m r ′ + a r ′ ] (cid:9) = K K − ∑ k , k ′ = δ k + m r + a r , k ′ + m r ′ + a r ′ δ k + m ′ r , k ′ + m ′ r ′ + δ k + m r + a r , k ′ + m ′ r ′ δ k + m ′ r , k ′ + m r ′ + a r ′ = K (cid:16) δ m ′ r − m ′ r ′ , m r + a r − m r ′ − a r ′ ( K − | m ′ r − m ′ r ′ | ) + δ m r + a r − m ′ r ′ , m ′ r − m r ′ − a r ′ ( K − | m r + a r − m ′ r ′ | ) (cid:17) = K (cid:16) δ m ′ r − m ′ r ′ , m r + a r − m r ′ − a r ′ + δ m ′ r + m ′ r ′ , m r + a r + m r ′ + a r ′ (cid:17) + o (cid:18) K (cid:19) .Therefore, (cid:12)(cid:12) E { g [ r ] g [ r ′ ] } (cid:12)(cid:12) ≤ K + o (cid:18) K (cid:19) . (45)Thus, combining results (41), (42), (43) and (45) into expression (39) gives the following bound: (cid:12)(cid:12) E { g [ r ] g [ r ′ ] } (cid:12)(cid:12) ≤ K (cid:16) C z σ + σ (cid:17) + o (cid:18) K (cid:19) .Concerning higher-order moments, let T ≥ E { ∏ T θ = g [ r θ ] } . Thus, E ( T ∏ θ = g [ r θ ] ) = E ( T ∏ θ = (cid:16) σ g [ r θ ] + σ g [ r θ ] (cid:17)) = K T E ( T ∏ θ = K − ∑ k = ρ θ , k (cid:16) w [ k + m ′ r θ ] , w [ k + m r θ + a r θ ] (cid:17)!) = K T K − ∑ k = · · · K − ∑ k T = E ( T ∏ θ = ρ θ , k θ (cid:16) w [ k θ + m ′ r θ ] , w [ k θ + m r θ + a r θ ] (cid:17)) , (46)where ρ θ , k ( u , v ) = σ z k [ m r θ + a r θ ] u + σ z k [ m ′ r θ ] v + σ uv − δ m r θ + a r θ , m ′ r θ .Thus, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E ( T ∏ θ = g [ r θ ] )(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ ν K K T C T ,where C T = max ( k ,..., k T ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E ( T ∏ θ = ρ θ , k θ ( w [ k θ + m ′ r θ ] , w [ k θ + m r θ + a r θ ]) )(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ,and ν K is the number of nonzero terms in the sum (46). Note that C T is independent of K (but depends on z and σ ). The behavior of the order T moment in function of K is therefore only determined by the ratio ν K K T . Let us bound ν K . 
Fix k₁. For each of the other summation indices k_θ (θ ∈ {2, . . . , T}), there are four values that make ρ_{θ,k_θ}(w[k_θ + m'_{r_θ}], w[k_θ + m_{r_θ} + a_{r_θ}]) not independent of ρ_{1,k₁}(w[k₁ + m'_{r₁}], w[k₁ + m_{r₁} + a_{r₁}]). Indeed, since w is a white noise, these quantities are independent except when k_θ satisfies one of

k_θ = k₁ + m'_{r₁} − m'_{r_θ},
k_θ = k₁ + m_{r₁} + a_{r₁} − m'_{r_θ},
k_θ = k₁ + m'_{r₁} − m_{r_θ} − a_{r_θ},
k_θ = k₁ + m_{r₁} + a_{r₁} − m_{r_θ} − a_{r_θ}.

Consequently, for each value of k₁ there exist at least (K−4)^{T−1} combinations of (k₂, . . . , k_T) for which

E{ ∏_{θ=1}^{T} ρ_{θ,k_θ}(w[k_θ + m'_{r_θ}], w[k_θ + m_{r_θ} + a_{r_θ}]) } = E{ ρ_{1,k₁}(w[k₁ + m'_{r₁}], w[k₁ + m_{r₁} + a_{r₁}]) } × E{ ∏_{θ=2}^{T} ρ_{θ,k_θ}(w[k_θ + m'_{r_θ}], w[k_θ + m_{r_θ} + a_{r_θ}]) } = 0,

since E{ρ_{1,k₁}(·,·)} = 0. Therefore, at least K(K−4)^{T−1} terms of the sum (46) are zero. Because T ≥
3, we develop similar arguments on k and k to determine other cases where this correlation termvanishes. We subtract these cases to K T , the total number of combinations of ( k , . . . , k T ) to obtain the followingmaximum bound on the number of nonzero terms in the sum (46): ν K ≤ K T − K ( K − ) T − + K ( K − )( K − ) T − − K ( K − )( K − )( K − ) T − .Thus, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E ( T ∏ θ = g [ r θ ] )(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ C T K T − K ( K − ) T − + K ( K − )( K − ) T − − K ( K − )( K − )( K − ) T − K T ≤ C T − (cid:18) − K (cid:19) T − + (cid:18) − K (cid:19) (cid:18) − K (cid:19) T − − (cid:18) − K (cid:19) (cid:18) − K (cid:19) (cid:18) − K (cid:19) T − ! ≤ C T (cid:18) − + ( T − ) K + − K − ( T − ) K − + K + K + ( T − ) K + o (cid:18) K (cid:19)(cid:19) ≤ C T o (cid:18) K (cid:19) .Therefore, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E ( T ∏ θ = g [ r θ ] )(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = o (cid:18) K (cid:19) , ∀ T ≥ Lemma 3 (Bounds on the derivatives of f ( ℓ ) at the origin) . Let f ( ℓ ) : R M ( M + ) → R M denote the multivariate functiondefined in (31) . Assume the deterministic signal z takes the form (11), and the observed noisy signal takes the form (13). Then,the first-order derivatives of f ( ℓ ) at the origin are bounded as follows: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ d z , M , ℓ σ , ∀ m ∈ {
0, . . . , M − } , r ∈ {
0, . . . , M ( M + ) − } , (47) where d z , M , ℓ ∆ = ( + ( ℓ − ) M ) M ℓ − (cid:18) max (cid:18)
1, 2 JM (cid:19)(cid:19) ℓ − (cid:18) + max (cid:18)
1, 2 JM (cid:19)(cid:19) (cid:18) + JM (cid:19) . Besides, the second-order derivatives of f ( ℓ ) at the origin are bounded as follows: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ d z , M , ℓ σ , ∀ m ∈ {
0, . . . , M − } , ( r , r ′ ) ∈ {
0, . . . , M ( M + ) − } , (48) whered z , M , ℓ ∆ = (cid:18) + JM (cid:19) (cid:18) max (cid:18)
1, 2 JM (cid:19)(cid:19) ℓ − (cid:18) + max (cid:18)
1, 2 JM (cid:19)(cid:19) (cid:18) d M , ℓ + ( d M , ℓ + d ′ M , ℓ ) max (cid:18)
1, 2 JM (cid:19)(cid:19) , and d M , ℓ and d ′ M , ℓ are only depending on M and ℓ .Proof. Concerning the first-order derivative, from (29), we have: ∂ f ( ℓ ) ∂ g [ r ] = e TM ∂ ˜ A ℓ ∂ g [ r ]= ℓ − ∑ λ = e TM ˜ A λ ∂ ˜ A ∂ g [ r ] ˜ A ℓ − − λ .Thus, ∂ f ( ℓ ) ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = = ℓ − ∑ λ = e TM A λ ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = A ℓ − − λ . (49)Furthermore, ∂ ˜ A ∂ g [ r ] = ∂ E ( ) ∂ g [ r ] (cid:16) S ( ) + E ( ) (cid:17) − + (cid:16) S ( ) + E ( ) (cid:17) ∂ (cid:16) S ( ) + E ( ) (cid:17) − ∂ g [ r ]= J m r , m ′ r (cid:16) S ( ) + E ( ) (cid:17) − if a r = (cid:18) ( − δ m r ,0 ) J m r − m ′ r + (cid:16) S ( ) + E ( ) (cid:17) (cid:16) S ( ) + E ( ) (cid:17) − J m r , m ′ r (cid:19) (cid:16) S ( ) + E ( ) (cid:17) − else,where J m r , m ′ r ∈ R M × M is the matrix such that J m r , m ′ r [ m , m ′ ] ∆ = δ m , m r δ m ′ , m ′ r . Thus, ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = = J m r , m ′ r S ( ) − if a r = (cid:16) ( − δ m r ,0 ) J m r − m ′ r + A J m r , m ′ r (cid:17) S ( ) − else.Then, (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) max ≤ ( + k A k max ) (cid:13)(cid:13)(cid:13) S ( ) − (cid:13)(cid:13)(cid:13) max .Given bounds (35) and (34), we have that: (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) max ≤ (cid:18) + max (cid:18)
1, 2 JM (cid:19)(cid:19) (cid:18) + JM (cid:19) σ . (50)Besides given expression (49), for all r ∈ {
0, . . . , M ( M + ) − } , we have: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ M k A ℓ − k max (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) max + M ℓ − ∑ λ = k A λ k max (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) max k A ℓ − − λ k max ≤ ( + ( ℓ − ) M ) M ℓ − k A k ℓ − (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) max .Therefore, given bounds (35) and (50), we have: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ d z , M , ℓ σ .Concerning the second-order derivative, we have: ∂ f ( ℓ ) ∂ g [ r ] ∂ g [ r ′ ] = ℓ − ∑ λ = e TM ∂ ˜ A λ ∂ g [ r ′ ] ∂ ˜ A ∂ g [ r ] ˜ A ℓ − − λ + e TM ˜ A λ ∂ ˜ A ∂ g [ r ] ∂ g [ r ′ ] ˜ A ℓ − − λ + e TM ˜ A λ ∂ ˜ A ∂ g [ r ] ∂ ˜ A ℓ − − λ ∂ g [ r ′ ]= ℓ − ∑ λ = λ − ∑ p = e TM ˜ A p ∂ ˜ A ∂ g [ r ′ ] ˜ A λ − − p ∂ ˜ A ∂ g [ r ] ˜ A ℓ − − λ + ℓ − ∑ λ = e TM ˜ A l ∂ ˜ A ∂ g [ r ] ∂ g [ r ′ ] ˜ A ℓ − − λ + ℓ − ∑ λ = ℓ − λ − ∑ p = e TM ˜ A λ ∂ ˜ A ∂ g [ r ] ˜ A p ∂ ˜ A ∂ g [ r ′ ] ˜ A ℓ − λ − − p . Thus, ∂ f ( ℓ ) ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = = ℓ − ∑ λ = λ − ∑ p = e TM A p ∂ ˜ A ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12) g = A λ − − p ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = A ℓ − − λ + ℓ − ∑ λ = e TM A λ ∂ ˜ A ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12) g = A ℓ − − λ + ℓ − ∑ λ = ℓ − − λ ∑ p = e TM A λ ∂ ˜ A ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12) g = A p ∂ ˜ A ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12) g = A ℓ − λ − − p . 
(51)Besides, the second-order derivative of the matrix ˜ A is given by ∂ ˜ A ∂ g [ r ] ∂ g [ r ′ ] = a r = a ′ r = J m r , m ′ r (cid:16) S ( ) + E ( ) (cid:17) − J m r ′ , m ′ r ′ (cid:16) S ( ) + E ( ) (cid:17) − if a r = a ′ r = J m ′ r , m ′ r ′ (cid:16) S ( ) + E ( ) (cid:17) − J m r , m ′ r (cid:16) S ( ) + E ( ) (cid:17) − if a r = a ′ r = (cid:18) ( − δ m r ,0 ) J m r − m ′ r + (cid:16) S ( ) + E ( ) (cid:17) (cid:16) S ( ) + E ( ) (cid:17) − J m r , m ′ r (cid:19) × (cid:16) S ( ) + E ( ) (cid:17) − J m r ′ , m ′ r ′ (cid:16) S ( ) + E ( ) (cid:17) − + (cid:18) ( − δ m r ′ ,0 ) J m r ′ − m ′ r ′ + (cid:16) S ( ) + E ( ) (cid:17) (cid:16) S ( ) + E ( ) (cid:17) − J m r ′ , m ′ r ′ (cid:19) × (cid:16) S ( ) + E ( ) (cid:17) − J m r , m ′ r (cid:16) S ( ) + E ( ) (cid:17) − else.Thus, the second-order derivative of the matrix ˜ A at the origin is such that ∂ ˜ A ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12) g = = a r = a r ′ = J m r , m ′ r S ( ) − J m r ′ , m ′ r ′ S ( ) − if a r = a r ′ = J m r ′ , m ′ r ′ S ( ) − J m r , m ′ r S ( ) − if a r = a r ′ = (cid:16) ( − δ m r ,0 ) J m r − m ′ r + A J m r , m ′ r (cid:17) S ( ) − J m r ′ , m ′ r ′ S ( ) − + (cid:16) ( − δ m r ′ ,0 ) J m r ′ − m ′ r ′ + A J m r ′ , m ′ r ′ (cid:17) S ( ) − J m r , m ′ r S ( ) − else.Then, (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12) g = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) max ≤ (cid:13)(cid:13)(cid:13) S ( ) − (cid:13)(cid:13)(cid:13) if a r = a r ′ = ( + k A k max ) (cid:13)(cid:13)(cid:13) S ( ) − (cid:13)(cid:13)(cid:13) else ≤ (cid:18) + max (cid:18)
1, 2 JM (cid:19)(cid:19) (cid:18) + JM (cid:19) σ . (52)Returning to equation (51), for all r , r ′ ∈ {
0, . . . , M ( M + ) − } and m ∈ {
0, . . . , M − } , we have: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ d M , ℓ k A k ℓ − (cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ] (cid:13)(cid:13)(cid:13)(cid:13) max (cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ′ ] (cid:13)(cid:13)(cid:13)(cid:13) max + d ′ M , ℓ k A k ℓ − (cid:13)(cid:13)(cid:13)(cid:13) ∂ ˜ A ∂ g [ r ] g [ r ′ ] (cid:13)(cid:13)(cid:13)(cid:13) max ,where d M , ℓ and d ′ M , ℓ are only depending on M and ℓ . Besides, given results (28), (50) and (52), we have: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ d z , M , ℓ σ . B. Expression of the Bias µ . By definition of the measurement noise, µ [ n ] = n ∈ I . Outside the measurement interval I , denote by ℓ the index such that n = N − + ℓ . Then, given that h ( ℓ ) = α ( ℓ ) − α ( ℓ ) , we deduce from expression (16) that µ [ n ] = E { α ( ℓ ) } z K + σ E { α ( ℓ ) w K } − z [ n ]= α ( ℓ ) z K + E { h ( ℓ ) } z K + σ E { h ( ℓ ) w K } − z [ N − + ℓ ] ∆ = ǫ [ ℓ ] + ǫ [ ℓ ] + ǫ [ ℓ ] , (53) where ǫ [ ℓ ] = α ( ℓ ) z K − z [ N − + ℓ ] , (54) ǫ [ ℓ ] = E { h ( ℓ ) } z K , (55) ǫ [ ℓ ] = σ E { h ( ℓ ) w K } . (56)Let us first determine an upper bound on | ǫ [ n ] | . Since α ( ) is the last row of A , we deduce from the expres-sion (33) of A that α ( ) [ m ] = M J ∑ j = + σ M Ω j cos (cid:16) π p j mM (cid:17) . (57)Besides, from equation (33), we also have A z K = z [ N − M + ] ... z [ N − ] α ( ) z K .The upward-shift property is thus successively inducted when ℓ increases; that is, A ℓ z K = z [ N − M + ℓ ] ... z [ N − ] α ( ) z K ... 
α ( ℓ ) z K .Then, α ( ℓ ) , the last row of A ℓ follows the following recurrence relation: α ( ℓ ) z K = α ( ) ˜ A ℓ − z K = M − ℓ ∑ m = α ( ) [ m ] z [ N − M + ℓ + m − ] + M − ∑ m = M − ℓ + α ( ) [ m ] α ( m − M + ℓ ) z K .Hence, ǫ [ ℓ ] = α ( ℓ ) z K − z [ N − + ℓ ]= M ∑ m = α ( ) [ m ] z [ N − M + ℓ + m − ] − z [ N − + ℓ ] + M − ∑ m = M − ℓ + α ( ) [ m ] (cid:16) α ( m − M + ℓ ) z K − z [ N − M + ℓ + m − ] (cid:17) = M ∑ m = α ( ) [ m ] z [ N − M + ℓ + m − ] − z [ N − + ℓ ] + M − ∑ m = M − ℓ + α ( ) [ m ] ǫ [ m − M + ℓ ] .Besides, equation (57) gives M ∑ m = α ( ) [ m ] z [ N − M + ℓ + m − ] = J ∑ j , j ′ = Ω j ′ M + σ M Ω j M − ∑ m = cos (cid:16) π p j mM (cid:17) cos (cid:18) π p j ′ N + mM + ϕ j ′ (cid:19)| {z } = δ j , j ′ M cos ( π p j NM + ϕ j )= J ∑ j = Ω j + σ M Ω j cos (cid:18) π p j NM + ϕ j (cid:19) . Thus: | ǫ [ ℓ ] | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) J ∑ j = Ω j + σ M Ω j − cos (cid:18) π p j NM + ϕ j (cid:19) + M − ∑ m = M − ℓ + α ( ) [ m ] ǫ [ m − M + ℓ ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ J ∑ j = Ω j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + σ M Ω j − (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + JM ℓ − ∑ λ = | ǫ [ λ ] |≤ σ M J ∑ j = Ω j + JM ℓ − ∑ λ = | ǫ [ λ ] | . (58)Then, by induction from the inequality (58), we have | ǫ [ ℓ ] | ≤ σ M (cid:18) + JM (cid:19) ℓ − J ∑ j = Ω j ∆ = c ( ) σ , (59)where c ( ) = M (cid:16) + JM (cid:17) ℓ − ∑ Jj = Ω j . Note that c ( ) is not depending on σ or K .Let us now determine an upper bound on | ǫ [ ℓ ] | . Since E { g [ r ] } = o (cid:18) K (cid:19) , a second-order Taylor expansion of h ( ℓ ) gives E { h ( ℓ ) [ m ] } = M ( M + ) − ∑ r , r ′ = ∂ f ( ℓ ) m ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = E { g [ r ] g [ r ′ ] } + o (cid:18) K (cid:19) . 
(60)Thus, given the bounds (38) on | E { g [ r ] g [ r ′ ] }| and (48) on the second derivative of f ( ℓ ) m , we have: | ǫ [ ℓ ] | ≤ C z M ( M + ) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∂ f ( ℓ ) ∂ g [ r ] ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) max K (cid:16) C z σ + σ (cid:17) + o (cid:18) K (cid:19) ≤ K c ( ) + c ( ) σ ! + o (cid:18) K (cid:19) , (61)where c ( ) ∆ = M ( M + ) C z d z , M , ℓ , c ( ) ∆ = M ( M + ) C z d z , M , ℓ .Let us now determine an upper bound on | ǫ [ ℓ ] | . A second-order Taylor expansion of h ( ℓ ) gives E { h ( ℓ ) w K } = M − ∑ m = M ( M + ) − ∑ r = ∂ f ( ℓ ) m ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = E { g [ r ] w K [ m ] } + o (cid:18) K (cid:19) (62)Indeed, the third-order moments E { g [ r ] g [ r ′ ] w K [ m ] } , behave as o (cid:18) K (cid:19) . Besides, E { g [ r ] w K [ m ] } = σ E { g [ r ] w K [ m ] } + σ E { g [ r ] w K [ m ] } . (63)Then, | E { g [ r ] w K [ m ] }| = K (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) K − ∑ k = z k [ m r + a r ] E (cid:8) w k [ m ′ r ] w K [ m ] (cid:9) + z k [ m ′ r ] E { w k [ m r + a r ] w K [ m ] } (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ K J ∑ j = Ω j ! = C z K . (64) E { g [ r ] w K [ m ] } = K K − ∑ k , = E (cid:8) w k [ m r + a r ] w k [ m ′ r ] w K [ m ] (cid:9) − δ m r + a r , m ′ r E { w K [ m ] } = | E { g [ r ] w K [ m ] }| ≤ C z σ K . (66)Thus, given the bound (47) on the first derivative of f ( ℓ ) m , and from (62) we have: | ǫ [ ℓ ] | ≤ σ M ( M + ) d z , M , ℓ σ C z σ K ∆ = c ( ) K , (67)where c ( ) ∆ = M ( M + ) d z , M , ℓ C z .Thus, combining bounds (59) on ǫ , (61) on ǫ and (67) on ǫ gives the following bound of the bias: | µ [ n ] | ≤ c ( ) σ + K c ( ) σ + c ( ) + c ( ) ! + o (cid:18) K (cid:19) . (68) C. Expression of the Covariance γ . Outside the measurement interval I , denote by ℓ the index such that n = N − + ℓ . 
Then, given that h ( ℓ ) = α ( ℓ ) − α ( ℓ ) , we have γ [ n , n ] = (cid:16) α ( ℓ ) z K (cid:17) + (cid:16) α ( ℓ ) z K (cid:17) E n h ( ℓ ) o z K + E (cid:26)(cid:16) h ( ℓ ) z K (cid:17) (cid:27) + σ α ( ℓ ) E { w K h ( ℓ ) } z K + σ α ( ℓ ) z K E { h ( ℓ ) w K } + σ E { h ( ℓ ) w K h ( ℓ ) } z K + σ (cid:13)(cid:13)(cid:13) α ( ℓ ) (cid:13)(cid:13)(cid:13) + σ E (cid:26)(cid:16) h ( ℓ ) w K (cid:17) (cid:27) + σ α ( ℓ ) E { w K h ( ℓ ) w K } − z [ n ] − z [ n ] µ [ n ] − µ [ n ] = σ (cid:13)(cid:13)(cid:13) α ( ℓ ) (cid:13)(cid:13)(cid:13) − (cid:16) z [ n ] − α ( ℓ ) z K + z [ n ] (cid:17) + E (cid:26)(cid:16) h ( ℓ ) z K (cid:17) (cid:27) + σ α ( ℓ ) E { w K h ( ℓ ) } z K + σ E { h ( ℓ ) w K h ( ℓ ) } z K + σ E (cid:26)(cid:16) h ( ℓ ) w K (cid:17) (cid:27) + σ α ( ℓ ) E { w K h ( ℓ ) w K } ∆ = η [ n ] + η [ n ] + η [ n ] + η [ n ] + η [ n ] + η [ n ] + η [ n ] ,where η [ n ] = σ (cid:13)(cid:13)(cid:13) α ( ℓ ) (cid:13)(cid:13)(cid:13) , η [ n ] = − (cid:16) z [ n ] − α ( ℓ ) z K + z [ n ] (cid:17) , η [ n ] = E (cid:26)(cid:16) h ( ℓ ) z K (cid:17) (cid:27) , η [ n ] = σ α ( ℓ ) E { w K h ( ℓ ) } z K , η [ n ] = σ E { h ( ℓ ) w K h ( ℓ ) } z K , η [ n ] = σ E (cid:26)(cid:16) h ( ℓ ) w K (cid:17) (cid:27) , η [ n ] = σ α ( ℓ ) E { w K h ( ℓ ) w K } .Let us now determine an upper bound on each of these terms.First, since k α ( ℓ ) k ≤ M k A ℓ k , we have: η [ n ] ≤ σ M (cid:13)(cid:13)(cid:13) A ℓ (cid:13)(cid:13)(cid:13) ≤ σ M ℓ k A k ℓ max ≤ σ M ℓ max (cid:18) JM (cid:19) ℓ ! . (69)Second, by definition of ǫ and ǫ (see expressions (55) and (56)), η takes the following form: η [ n ] = ( ǫ [ n ] + ǫ [ n ]) = o (cid:18) K (cid:19) . 
(70) Third, second-order Taylor expansions of h ( ℓ ) give: η [ n ] ≤ C z M − ∑ m , m ′ = (cid:12)(cid:12)(cid:12) E n h ( ℓ ) [ m ] h ( ℓ ) [ m ′ ] o(cid:12)(cid:12)(cid:12) ≤ C z M ( M + ) − ∑ r , r ′ = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = ∂ f ( ℓ ) m ′ ∂ g [ r ′ ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = E { g [ r ] g [ r ′ ] } (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + o (cid:18) K (cid:19) ≤ C z M ( M + ) d z , M , ℓ σ K (cid:16) C z σ + σ (cid:17) + o (cid:18) K (cid:19) ≤ C z M ( M + ) d z , M , ℓ K (cid:18) C z σ + (cid:19) + o (cid:18) K (cid:19) . (71)Fourth, | η [ n ] | ≤ σ C z (cid:13)(cid:13)(cid:13) A ℓ (cid:13)(cid:13)(cid:13) max M − ∑ m , m ′ = (cid:12)(cid:12)(cid:12) E { h ( ℓ ) [ m ] w K [ m ′ ] } (cid:12)(cid:12)(cid:12) But, (cid:12)(cid:12)(cid:12) E { h ( ℓ ) [ m ] w K [ m ′ ] } (cid:12)(cid:12)(cid:12) ≤ M ( M + ) − ∑ r = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:12)(cid:12) E { g [ r ] w K [ m ′ ] } (cid:12)(cid:12) + o (cid:18) K (cid:19) ≤ M ( M + ) d z , M , ℓ σ C z σ K + o (cid:18) K (cid:19) (72)Thus, | η [ n ] | ≤ M ℓ ( M + ) C z d z , M , ℓ (cid:18) max (cid:18)
1, 2 JM (cid:19)(cid:19) ℓ K + o (cid:18) K (cid:19) . (73)Fifth and sixth, second-order Taylor expansions of h ( ℓ ) give η [ n ] = o (cid:18) K (cid:19) , (74) η [ n ] = o (cid:18) K (cid:19) . (75)Seventh, | η [ n ] | ≤ σ (cid:13)(cid:13)(cid:13) A ℓ (cid:13)(cid:13)(cid:13) max M − ∑ m , m ′ = (cid:12)(cid:12)(cid:12) E { h ( ℓ ) [ m ] w K [ m ] w K [ m ′ ] } (cid:12)(cid:12)(cid:12) ≤ σ (cid:13)(cid:13)(cid:13) A ℓ (cid:13)(cid:13)(cid:13) max M − ∑ m , m ′ = M ( M + ) − ∑ r = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∂ f ( ℓ ) m ∂ g [ r ] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) g = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) E { g ( ℓ ) [ r ] w K [ m ] w K [ m ′ ] } (cid:12)(cid:12)(cid:12) + o (cid:18) K (cid:19) .But, E { g ( ℓ ) [ r ] w K [ m ] w K [ m ′ ] } = σ E { g ( ℓ ) [ r ] w K [ m ] w K [ m ′ ] } + σ E { g ( ℓ ) [ r ] w K [ m ] w K [ m ′ ] } .As before, since E { g ( ℓ ) [ r ] w K [ m ] w K [ m ′ ] } is a third-order moment of a multivariate zero-mean Gaussian vector, itvanishes. And, (cid:12)(cid:12)(cid:12) E { g ( ℓ ) [ r ] w K [ m ] w K [ m ′ ] } (cid:12)(cid:12)(cid:12) = K (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) K − ∑ k = E (cid:8) w k [ m r + a r ] w k [ m ′ r ] w K [ m ] w K [ m ′ ] (cid:9) − δ m r + a r , m ′ r E (cid:8) w K [ m ] w K [ m ′ ] (cid:9)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = K (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) K − ∑ k = δ k + m r + a r , K + m δ k + m ′ r , K + m ′ + δ k + m r + a r , K + m ′ δ k + m ′ r , K + m (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ K . Thus, | η [ n ] | ≤ σ (cid:13)(cid:13)(cid:13) A ℓ (cid:13)(cid:13)(cid:13) max M ( M + ) d z , M , ℓ σ σ K + o (cid:18) K (cid:19) ≤ M ℓ + ( M + ) d z , M , ℓ (cid:18) max (cid:18)
1, 2 JM (cid:19)(cid:19) ℓ σ K + o (cid:18) K (cid:19) . (76)To conclude, we combine the expressions (69), (70), (71), (73), (74), (75), and (76) to determine an upper boundon the variance γ [ n , n ] . It follows: γ [ n , n ] ≤ c ( n ) σ + K c ( n ) σ + c ( n ) + c ( n ) σ ! + o (cid:18) K (cid:19) ,where c ( n ) = M ℓ max (cid:18) JM (cid:19) ℓ ! c ( n ) = C z M ( M + ) d z , M , ℓ c ( n ) = C z M ( M + ) d z , M , ℓ + M ℓ ( M + ) C z d z , M , ℓ (cid:18) max (cid:18)
1, 2 JM (cid:19)(cid:19) ℓ c ( n ) = M ℓ + ( M + ) d z , M , ℓ (cid:18) max (cid:18)
1, 2 JM (cid:19)(cid:19) ℓ . a) If n ′ ≥ N: When n ≥ N and n ′ ≥ N , applying the Cauchy-Schwarz inequality, we obtain the followingbound on the covariance γ [ n , n ] : (cid:12)(cid:12) γ [ n , n ′ ] (cid:12)(cid:12) ≤ q γ [ n , n ] γ [ n ′ , n ′ ] ≤ q c ( n ) c ( n ′ ) σ vuut + K σ c ( n ) c ( n ) σ + c ( n ) + c ( n ) σ ! + c ( n ′ ) c ( n ′ ) σ + c ( n ′ ) + c ( n ′ ) σ !! + o (cid:18) K (cid:19) .A first-order Taylor expansion of this bound as K → ∞ gives ≤ q c ( n ) c ( n ′ ) σ + K vuut c ( n ′ ) c ( n ) c ( n ) σ + c ( n ) + c ( n ) σ ! + vuut c ( n ) c ( n ′ ) c ( n ′ ) σ + c ( n ′ ) + c ( n ′ ) σ ! + o (cid:18) K (cid:19) ≤ c ( n , n ′ ) σ + K c ( n , n ′ ) σ + c ( n , n ′ ) + c ( n , n ′ ) σ ! + o (cid:18) K (cid:19) ,where c ( n , n ′ ) = q c ( n ) c ( n ′ ) , c ( n , n ′ ) p = c ( n ) p vuut c ( n ′ ) c ( n ) + c ( n ′ ) p vuut c ( n ) c ( n ′ ) , ∀ p ∈ {
1, 2, 3 } . b) If n ′ ∈ I: When n ≥ N and n ′ ∈ I , equation (17) shows us that: γ [ n , n ′ ] = σ E { w [ n ′ ] h ( ℓ ) } z K + σ E { w [ n ′ ] α ( ℓ ) w K } + σ E { w [ n ′ ] h ( ℓ ) w K } ∆ = β [ n , n ′ ] + β [ n , n ′ ] + β [ n , n ′ ] ,where β [ n , n ′ ] = σ E { w [ n ′ ] h ( ℓ ) } z K , β [ n , n ′ ] = σ E { w [ n ′ ] α ( ℓ ) w K } , β [ n , n ′ ] = σ E { w [ n ′ ] h ( ℓ ) w K } . Besides, thanks to the bound (72) we have: (cid:12)(cid:12) β [ n , n ′ ] (cid:12)(cid:12) ≤ σ C z M − ∑ m = (cid:12)(cid:12)(cid:12) E { w [ n ′ ] h ( ℓ ) [ m ] } (cid:12)(cid:12)(cid:12) ≤ σ C z M ( M + ) d z , M , ℓ σ C z σ K + o (cid:18) K (cid:19) ≤ M ( M + ) C z d z , M , ℓ K + o (cid:18) K (cid:19) . (77)Besides, (cid:12)(cid:12) β [ n , n ′ ] (cid:12)(cid:12) ≤ σ (cid:13)(cid:13)(cid:13) A ℓ (cid:13)(cid:13)(cid:13) max M − ∑ m = (cid:12)(cid:12) E { w [ n ′ ] w K [ m ] } (cid:12)(cid:12) ≤ σ (cid:13)(cid:13)(cid:13) A ℓ (cid:13)(cid:13)(cid:13) max ≤ σ M ℓ − (cid:18) max (cid:18)
1, 2 JM (cid:19)(cid:19) ℓ . (78)Besides, identically to the bound (76) on η , we obtain (cid:12)(cid:12) β [ n , n ′ ] (cid:12)(cid:12) ≤ σ M − ∑ m = (cid:12)(cid:12)(cid:12) E { w [ n ′ ] h ( ℓ ) [ m ] w K [ m ] } (cid:12)(cid:12)(cid:12) ≤ σ M ( M + ) d z , M , ℓ σ σ K + o (cid:18) K (cid:19) ≤ M ( M + ) d z , M , ℓ σ K + o (cid:18) K (cid:19) . (79)Finally, we combine expressions (77), (78), and (79) to determine an upper bound on the variance γ [ n , n ′ ] . It follows: (cid:12)(cid:12) γ [ n , n ′ ] (cid:12)(cid:12) ≤ b ( n , n ′ ) σ + K (cid:16) b ( n , n ′ ) + b ( n ) σ (cid:17) + o (cid:18) K (cid:19) ,where b ( n , n ′ ) = M ℓ − (cid:18) max (cid:18)
1, 2 JM (cid:19)(cid:19) ℓ b ( n , n ′ ) = M ( M + ) C z d z , M , ℓ b ( n , n ′ ) = M ( M + ) d z , M , ℓ .II. A PPLICATION TO AN E LECTROCARDIOGRAM
We provide here an additional implementation of BoundEffRed, applied to an electrocardiogram (ECG) dataset. The dataset is constructed from a 500-second-long ECG, sampled at f_s = 200 Hz, cut into 10 segments of 50 seconds each. Fig. 10 depicts the right boundary of one of these subsignals, together with the 6-second extensions estimated by SigExt (first panel), EDMD (second panel), GPR (third panel), and TBATS (fourth panel). These extensions are superimposed on the ground-truth extension, plotted in red. The sharp and spiky ECG patterns make the AHM model too simplistic to describe this type of signal. Consequently, the forecast produced by SigExt is only moderately satisfactory. Note that TBATS is the only method that seems to accurately capture the locations of the QRS complexes after 37 seconds. We will explore this long-term prediction capability in future work.

Table IV contains the median performance index D of the boundary-free TF representations, over the N subsignals evaluated, according to the extension method. As a result of the fair quality of the forecasts, the reduction of boundary effects is less significant than for the PPG signal. Nevertheless, the results show that BoundEffRed has the same efficiency whether the SigExt, EDMD, or GPR extension is chosen. Indeed, t-tests performed under the null hypothesis that the means are equal, with a 5% significance level, show no statistically significant difference between SigExt and EDMD or GPR, regardless of the representation considered. This justifies the choice of SigExt for real-time implementation.

Fig. 10. Extended ECG (blue) obtained by the SigExt forecasting (first panel), the EDMD forecasting (second panel), the GPR forecasting (third panel), and the TBATS forecasting (fourth panel), superimposed with the ground-truth signal (dashed red).

TABLE IV
ECG: PERFORMANCE OF THE BOUNDARY-FREE TF REPRESENTATIONS ACCORDING TO THE EXTENSION METHOD

Extension method | Median performance index D (STFT, SST, ConceFT, RS)
SigExt
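The appeal of SigExt noted above is that a least-squares linear forecast is cheap enough for real-time use. As a purely illustrative sketch of forecasting-based extension (not the paper's actual SigExt implementation), the snippet below fits a linear predictor on the observed samples of a synthetic two-tone signal and iterates it past the right boundary; the predictor order, signal, and frequencies are arbitrary choices for the example.

```python
import numpy as np

def lp_extend(x, order=50, n_ext=200):
    """Extend x by n_ext samples with a least-squares linear predictor.

    A minimal sketch of forecasting-based extension in the spirit of
    SigExt; `order` and the fitting window are illustrative choices,
    not the parameters used in the paper.
    """
    # Least-squares system: each row holds `order` consecutive samples,
    # the target is the sample immediately following them.
    rows = len(x) - order
    A = np.array([x[i:i + order] for i in range(rows)])
    b = x[order:]
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)

    ext = list(x)
    for _ in range(n_ext):
        ext.append(np.dot(coeffs, ext[-order:]))  # one-step-ahead forecast
    return np.array(ext)

# A noiseless two-tone signal (1.3 Hz and 0.2 Hz, mimicking cardiac and
# respiratory rates): the predictor recovers the oscillation beyond the
# boundary, so a subsequent TF analysis is less distorted there.
t = np.arange(1000) / 200.0
x = np.cos(2 * np.pi * 1.3 * t) + 0.5 * np.cos(2 * np.pi * 0.2 * t)
y = lp_extend(x, order=50, n_ext=200)
```

On such a noiseless harmonic signal the extension is essentially exact; on real ECG or PPG data only the low-frequency structure is recovered, which is the regime discussed in the text.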
III. APPLICATION TO A MULTICOMPONENT CARDIAC SIGNAL
We consider a cardiac signal, namely a photoplethysmogram (PPG) recording, sampled at 300 Hz. In addition to the cardiac cycle measurement, this signal contains a slowly varying component, the respiratory component, known as respiration-induced intensity variation (RIIV) [2]. The top of Fig. 11 displays an excerpt of the signal along with a 3-second-long extension obtained by the SigExt forecasting. The lower part of Fig. 11 shows the respiratory signal recording the concentration of CO₂. This signal was recorded simultaneously with the PPG signal, and visually highlights the low-frequency respiratory component contained in the PPG signal. Indeed, the intervals where the concentration of CO₂ drops (highlighted by the bluish areas) coincide with the decreases in the PPG signal. Note also that the forecasting breaks the waveform of the oscillations because of its inability to forecast the high-frequency harmonics contained in the signal. Nevertheless, as long as the low-frequency harmonics of the components contained in the signal are preserved, the forecasting is sufficient to reduce boundary effects in the low-frequency TF domain.

The ordinary and boundary-free SSTs of this signal are displayed in Fig. 12. These TF analyses bring out both components: the fundamental frequency of the cardiac component, the most energetic one, indicated by the blue arrows, is located around 1.3 Hz, while the respiratory component, less visible, indicated by the green arrows, is located around 0.2 Hz and its multiples. Clearly, near the boundaries, BoundEffRed helps improve the quality of the TF representation. Besides, the performance index of this representation indicates that BoundEffRed has reduced the right-side boundary effects by about 51% with respect to the ordinary SST. This shows the ability of our algorithm to work on signals containing several nonstationary components.

Fig. 11. PPG signal (top, yellow), and the associated SigExt extension (blue). The simultaneously recorded concentration of CO₂ is below. The bluish areas show the synchrony between the drops in CO₂ concentration and the decreases in the PPG signal.

Fig. 12. The ordinary and boundary-free SSTs of the PPG signal; both panels display time (s) versus frequency (Hz).

REFERENCES
[1] L. Isserlis, "On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables," Biometrika, vol. 12, no. 1-2, pp. 134–139, Nov. 1918. [Online]. Available: https://doi.org/10.1093/biomet/12.1-2.134
[2] L. M. Nilsson, "Respiration signals from photoplethysmography," Anesthesia and Analgesia, vol. 117, no. 4, pp. 859–865, 2013.
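The boundary-effect reductions reported above (e.g., the 51% figure for the PPG signal) are obtained by comparing a TF representation computed after extension against the representation of the longer ground-truth recording. The sketch below implements one plausible such index on a synthetic tone; it is an illustrative stand-in, not the exact performance index D defined in the main text, and the naive periodic extension plays the role of the forecast.

```python
import numpy as np

def stft_mag(x, win=128, hop=32):
    """Magnitude STFT using a sliding Hann window (numpy-only sketch)."""
    w = np.hanning(win)
    frames = [x[i:i + win] * w for i in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1))

# Ground truth: a 7-second tone at 1.3 Hz sampled at 200 Hz; only the
# first 6 seconds are "observed".
fs, n_obs, n_ext = 200, 1200, 200
t = np.arange(n_obs + n_ext) / fs
full = np.cos(2 * np.pi * 1.3 * t)
obs = full[:n_obs]

# Two ways to reach the right boundary: zero-padding, versus a naive
# periodic extension that repeats the last observed period.
period = int(round(fs / 1.3))
zero_ext = np.concatenate([obs, np.zeros(n_ext)])
per_ext = np.concatenate([obs, np.tile(obs[-period:], 2)[:n_ext]])

# Relative reduction of the boundary error achieved by the extension
# (1 means the extension fully removes the boundary distortion).
ref = stft_mag(full)
err_zero = np.linalg.norm(ref - stft_mag(zero_ext))
err_per = np.linalg.norm(ref - stft_mag(per_ext))
D = 1.0 - err_per / err_zero
```

Interior frames are identical in all three spectrograms, so only the frames overlapping the boundary contribute to the errors; the index therefore isolates the boundary region without needing to select frames explicitly.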