Generating Sparse Stochastic Processes Using Matched Splines
Leello Dadi, Shayan Aziznejad, and Michael Unser, Fellow, IEEE

Leello Dadi and Shayan Aziznejad contributed equally to this work. This work was done at the Biomedical Imaging Group, École polytechnique fédérale de Lausanne, Switzerland (e-mail: [email protected]; shayan.aziznejad@epfl.ch; michael.unser@epfl.ch). It was funded by the Swiss National Science Foundation under Grant 200020 184646/1.
Abstract—We provide an algorithm to generate trajectories of sparse stochastic processes that are solutions of linear ordinary differential equations driven by Lévy white noises. A recent paper showed that these processes are limits in law of generalized compound-Poisson processes. Based on this result, we derive an off-the-grid algorithm that generates arbitrarily close approximations of the target process. Our method relies on a B-spline representation of generalized compound-Poisson processes. We illustrate numerically the validity of our approach.
Index Terms—Sparse stochastic processes, Lévy-driven CARMA processes, B-splines, compound-Poisson processes.
I. INTRODUCTION
Motivated by tractability and results such as the central-limit theorem, most of the early work in statistical signal processing has focused on Gaussian models [1]. In particular, the theory of Gaussian stationary processes provided justifications for the use of the discrete cosine transform [2] as an approximation of the Karhunen–Loève transform, and the Kalman filter [3] as an optimal estimator.

However, the analysis of real-world signals has revealed that the Gaussian framework may be insufficient to capture the breadth of the underlying behaviors [4], [5]. An important property that escapes the Gaussian framework is that of sparsity in some transform domains [6]. Sparsity being an essential component of modern signal processing [7]–[9], the authors of [10] proposed a wider stochastic framework that encompasses both Gaussian and sparsity-compatible models. Within this framework, a continuous-time signal is a realization of a stochastic process $s$ that can be whitened by some linear, shift-invariant operator $\mathrm{L}$. The key here is that the resulting white noise, or innovation, is not necessarily Gaussian. Put formally, signals are solutions of

$$\mathrm{L} s = w, \qquad (1)$$

where $w$ is a well-defined innovation process called a Lévy white noise [11]. The term Lévy here comes from the fact that $w$ is an object that can be interpreted as the derivative of a Lévy process in the sense of distributions [12], [13]. Whenever $w$ is non-Gaussian, the realizations of $s$ can be shown to be sparse. Accordingly, they have been named sparse stochastic processes [10]. Specific instances of such processes have been used to model natural signals such as images [14], [15], RF echoes in ultrasound [16], and network traffic in communication systems [17]–[19].

The goal of this paper is to generate realizations of the stochastic process $s$ given its whitening operator $\mathrm{L}$ and a statistical characterization of its innovation process $w$. The computer generation of these signals can be of great interest to practitioners who wish to evaluate their reconstruction algorithms. We are thinking of works such as [20]–[23], where optimal estimators for interpolating and denoising such processes have been derived.

A possible approach to generate realizations of $s$ would be to notice that, if $\mathrm{L}$ is a differential operator such as $\mathrm{D} = \frac{\mathrm{d}}{\mathrm{d}t}$ or a polynomial in $\mathrm{D}$, then (1) defines a stochastic differential equation (SDE) [24]. This becomes more apparent when notating $w$ with the alternative notation $\mathrm{d}Z_t$, where $(Z_t)_{t \in \mathbb{R}_+}$ is a Lévy process (Chapter 7.4 in [10]). For example, $(\mathrm{D} - \alpha\mathrm{I})s = w$ can be rewritten as $\mathrm{d}S_t = \alpha S_t\,\mathrm{d}t + \mathrm{d}Z_t$. A suitable SDE solver, such as the one studied in [25], can then be used to generate an approximation of the signal. In particular, a common method is to solve the linear system of stochastic difference equations that is obtained by considering the discrete counterpart of the operator $\mathrm{L}$ (e.g., using finite differences instead of the derivative) and by replacing the innovation process $w$ with a discrete white noise (see, for example, [26]).
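For concreteness, the grid-based baseline just described can be sketched in a few lines of Python. This is a minimal illustration (not the method proposed in this paper) for the first-order example $(\mathrm{D} - \alpha\mathrm{I})s = w$ with a Gaussian innovation; the parameter values are arbitrary choices for the example.

```python
import numpy as np

# Grid-based baseline: replace D with a forward finite difference and the
# innovation w with a discrete white noise. For a Gaussian Lévy process,
# the increments over a bin of length h have variance proportional to h.
alpha, T, h = -1.0, 10.0, 1e-3
rng = np.random.default_rng(0)
n_steps = int(np.ceil(T / h))

s = np.zeros(n_steps)
dZ = rng.normal(scale=np.sqrt(h), size=n_steps)   # discrete white noise
for i in range(1, n_steps):
    # (s[i] - s[i-1]) / h = alpha * s[i-1] + dZ[i] / h
    s[i] = s[i - 1] + h * alpha * s[i - 1] + dZ[i]
```

Such a scheme is tied to the grid from the start, which is precisely the limitation discussed next.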
It turns out that generic SDE solvers do not exploit the linearity of $\mathrm{L}$. Here, the analytic treatment of (1) can be pushed further to obtain an explicit solution. Brockwell shows in [27] that $s$ corresponds to the integral of a deterministic function with respect to a Lévy process. The integral can then be approximated by substituting it with a Riemann sum defined on a partition of the integration interval [28, Theorem 21].

These approaches, although valid, have drawbacks when it comes to the generation of synthetic signals for the evaluation of algorithms. First, they directly depend on the existence of a grid on which the approximation of the continuous process is sampled. This can lead to complications in the context of multiresolution algorithms that manipulate grid-free descriptions of signals. Second, the generated approximations are not solutions of an SDE in the form of (1). In other words, the approximations are not mathematical objects of the same nature as $s$.

In what follows, we propose a method that addresses both issues. It is based on a theoretical result by Fageot et al. [26] that states that any solution $s$ of (1) is the limit in law of a sequence of simpler processes $s_n$. In other words, we have that

$$s_n \overset{\mathcal{L}}{\longrightarrow} s, \quad \text{as } n \longrightarrow \infty.$$

These simpler processes, called generalized Poisson processes [10], have the advantage of having a grid-free numerical representation despite having a continuously defined domain. They fall within the category of (random) signals with a finite rate of innovation [29], [30]. They also have the desirable property of being whitened by the same operator $\mathrm{L}$ as the approximated signal. This implies that they all have the same correlation structure as the target signal (see Proposition 1).

Our method takes a sufficiently large value for $n$ and generates a realization of the process $s_n$ on a chosen interval. To do so, we consider an intermediary process called the generalized increment process. Interestingly, this process can be represented as a weighted sum of shifted B-splines and can be sampled very efficiently [31], [32]. The desired stochastic process $s_n$ is then obtained from the latter by recursive filtering.

The outline of the paper is as follows: In Section II, we provide the necessary mathematical background. In Section III, we give a description of our algorithm: we begin by discussing the simulation of the innovation process in Subsection III-A. We then define the generalized increment process in Subsection III-B, and we show how to generate its trajectories in Subsection III-C. Using this, we provide a recipe for generating sparse stochastic processes in Subsection III-D. In Subsection III-E, we show that our generation method perfectly reproduces the correlation structure of the target stochastic process. Finally, we conduct numerical investigations to show the validity of our method in Section IV.

II. MATHEMATICAL FOUNDATIONS
In this section, we give a brief overview of the mathematical concepts that underlie our approach. For a more detailed exposition, the reader is referred to [13], [26], [33], and references therein.

The Schwartz space $\mathcal{S}(\mathbb{R})$ is the space of smooth and rapidly decaying test functions. Its continuous dual, denoted by $\mathcal{S}'(\mathbb{R})$, is the space of tempered distributions. It is the space of all continuous linear functionals over $\mathcal{S}(\mathbb{R})$.

We denote by $\mathrm{L}$ an operator that is a continuous, linear, shift-invariant mapping from $\mathcal{S}'(\mathbb{R})$ to $\mathcal{S}'(\mathbb{R})$. The operator $\mathrm{L}$ is said to be shift-invariant if, for any test function $\varphi$ and any $t_0 \in \mathbb{R}$, we have that

$$\mathrm{L}\{\varphi\}(t - t_0) = \mathrm{L}\{\varphi(\cdot - t_0)\}(t), \quad t \in \mathbb{R},$$

where $\varphi(\cdot - t_0) : t \mapsto \varphi(t - t_0)$ is the version of $\varphi$ shifted by $t_0$. We restrict ourselves to rational operators in $\mathrm{D} = \frac{\mathrm{d}}{\mathrm{d}t}$, written
$\mathrm{L} = P(\mathrm{D})Q(\mathrm{D})^{-1}$, where $P$ and $Q$ are polynomials such that $\deg(P) > \deg(Q)$. The latter assumption is crucial to have the minimum required regularity (pointwise definition) for the solution $s$ of (1). The case $\mathrm{L} = \mathrm{D}$ is a typical choice that appears, for example, in the modeling of Brownian motion.

Rational operators are defined through their frequency response $\widehat{L}(\omega) = \frac{P(\mathrm{j}\omega)}{Q(\mathrm{j}\omega)}$. They provide a succinct representation of the equation $P(\mathrm{D})s = Q(\mathrm{D})w$ that we can simply rewrite as $\mathrm{L}s = w$.

We are interested in generalized stochastic processes defined over $\mathcal{S}'(\mathbb{R})$. A generalized stochastic process $w$ can be viewed as a random element of $\mathcal{S}'(\mathbb{R})$ in the sense that, for any $\varphi \in \mathcal{S}(\mathbb{R})$, the linear functional $\varphi \mapsto \langle\varphi, w\rangle \in \mathbb{R}$ is a well-defined random variable over $\mathbb{R}$ (see Appendix A for a formal definition).

A. Lévy White Noises
Lévy white noises constitute an important class of generalized stochastic processes, whose specification is essential to our framework. The three important operational properties of Lévy white noises for our purpose are:

1) Stationarity: For any $\varphi \in \mathcal{S}(\mathbb{R})$ and $\tau \in \mathbb{R}$, the random variables $\langle\varphi, w\rangle$ and $\langle\varphi(\cdot - \tau), w\rangle$ are identically distributed.

2) Independence: For any $\varphi_1, \varphi_2 \in \mathcal{S}(\mathbb{R})$ with disjoint supports, the random variables $\langle\varphi_1, w\rangle$ and $\langle\varphi_2, w\rangle$ are independent.

3) Characterization of the probability law: For any Lévy white noise $w$ in $\mathcal{S}'(\mathbb{R})$ and for any test function $\varphi \in \mathcal{S}(\mathbb{R})$, the characteristic function of the random variable $X_\varphi = \langle\varphi, w\rangle$ can be specified as

$$\widehat{P}_{X_\varphi}(\xi) = \mathbb{E}[\mathrm{e}^{\mathrm{j}\xi\langle\varphi, w\rangle}] = \exp\left(\int_{\mathbb{R}} f(\xi\varphi(r))\,\mathrm{d}r\right), \qquad (2)$$

where the function $f : \mathbb{R} \to \mathbb{C}$ is called the Lévy exponent of $w$.

Formally, this Lévy exponent can be obtained as $f(\xi) = \log\big(\widehat{P}_{X_{\mathrm{rect}}}(\xi)\big)$, where $X_{\mathrm{rect}} = \langle\mathrm{rect}_{[0,1]}, w\rangle$ is the observation of $w$ through the rectangular window¹

$$\mathrm{rect}_{[0,1]}(x) = \begin{cases} 1, & 0 < x \leq 1 \\ 0, & \text{otherwise.} \end{cases}$$

The distribution of $X_{\mathrm{rect}}$ gives us the Lévy exponent $f$ that defines (2), so that we can determine all the statistics of $w$ from the knowledge of $X_{\mathrm{rect}}$.

In particular, the following proposition from [10] connects the second-order statistics of $w$ to those of $X_{\mathrm{rect}}$.

Proposition 1 ([10], Theorem 4.15). Let $w$ be a Lévy white noise such that $X_{\mathrm{rect}} = \langle\mathrm{rect}_{[0,1]}, w\rangle$ has zero mean and a finite variance $\sigma_w^2 = \mathbb{E}[X_{\mathrm{rect}}^2]$. Then, for all $\varphi_1, \varphi_2 \in \mathcal{S}(\mathbb{R})$,

$$\mathbb{E}[\langle\varphi_1, w\rangle\langle\varphi_2, w\rangle] = \sigma_w^2\,\langle\varphi_1, \varphi_2\rangle.$$

It turns out that $X_{\mathrm{rect}}$ is an infinitely divisible random variable in the sense of Definition 1 [35].

¹Although $\mathrm{rect}_{[0,1]}$ is not in $\mathcal{S}(\mathbb{R})$, the random variable $X_{\mathrm{rect}}$ can still be defined. For more details, see [34].
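As a quick illustration of (2) and Proposition 1 (a standard computation, added here for concreteness), consider the zero-mean Gaussian Lévy exponent $f(\xi) = -\sigma^2\xi^2/2$. Then (2) gives

$$\widehat{P}_{X_\varphi}(\xi) = \exp\Big(-\frac{\sigma^2\xi^2}{2}\int_{\mathbb{R}}\varphi^2(r)\,\mathrm{d}r\Big),$$

so that $X_\varphi \sim \mathcal{N}(0, \sigma^2\|\varphi\|_{L_2}^2)$, in agreement with Proposition 1 with $\sigma_w = \sigma$.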
Definition 1. A real-valued random variable $X$ is said to be infinitely divisible if, for any natural number $M \in \mathbb{N}$, there exist $M$ independent and identically distributed random variables $X_1, \ldots, X_M$ such that $X = X_1 + \cdots + X_M$ in distribution.
TABLE I: Lévy exponents of some infinitely divisible distributions.

  Distribution                                  Lévy exponent $f(\xi)$
  Gaussian $(\mu, \sigma^2)$                    $\mathrm{j}\mu\xi - \sigma^2\xi^2/2$
  Symmetric $\alpha$-stable $(\alpha, c)$, $\alpha \in (0, 2)$    $-|c\xi|^{\alpha}$
  Gamma $(\alpha, \beta)$                       $-\alpha\log(1 - \mathrm{j}\xi/\beta)$
  Laplace $(\mu, b)$                            $\mathrm{j}\mu\xi - \log(1 + b^2\xi^2)$

To check the infinite divisibility of $X_{\mathrm{rect}}$, one can note that, for any $M \in \mathbb{N}$, we have that

$$X_{\mathrm{rect}} = \langle\mathrm{rect}_{[0,1]}, w\rangle = \Big\langle \sum_{m=0}^{M-1} \mathrm{rect}_{[\frac{m}{M}, \frac{m+1}{M}]}, w \Big\rangle = \sum_{m=0}^{M-1} \big\langle \mathrm{rect}_{[\frac{m}{M}, \frac{m+1}{M}]}, w \big\rangle. \qquad (3)$$

The terms in the sum (3) are independent and identically distributed random variables as a consequence of the independence and stationarity properties of white noises, which certifies that $\langle\mathrm{rect}_{[0,1]}, w\rangle$ is infinitely divisible.

The converse is also true: for any regular² infinitely divisible random variable $X$ with Lévy exponent $f(\xi) = \log(\mathbb{E}[\mathrm{e}^{\mathrm{j}\xi X}])$, there exists a well-defined Lévy white noise $w$ whose statistics are determined by (2) [35]–[37]. This shows that there is a one-to-one correspondence between infinitely divisible distributions and Lévy white noises through $\langle\mathrm{rect}_{[0,1]}, w\rangle$.

The Gaussian, gamma, and $\alpha$-stable distributions are classical examples of infinitely divisible distributions [12]. We can plug their Lévy exponents into (2) to define the corresponding Lévy white noises. We repeat in Table I some infinitely divisible distributions of interest, along with their Lévy exponents [35].

A case of special interest is when $f$ is the Lévy exponent of a compound-Poisson distribution. A compound-Poisson random variable $X$, with rate $\lambda$ and amplitude law $\nu$, is defined as

$$X = \sum_{k=1}^{K} A_k,$$

where the number $K$ is a Poisson random variable with parameter $\lambda$ and $(A_k)_{k=1}^{K}$ is an i.i.d. sequence drawn according to $\nu$. We refer to the corresponding Lévy white noise $w$ as a compound-Poisson innovation. It is known to be equal in law to

$$w = \sum_{k \in \mathbb{Z}} A_k\,\delta(\cdot - \tau_k), \qquad (4)$$

where $(\tau_k)_{k \in \mathbb{Z}}$ are the locations of impulses with rate $\lambda$ [13]. The law of these impulses is as follows: for any interval $[a, b]$, the number of impulses in $[a, b]$ is a Poisson random variable with parameter $\lambda(b - a)$.

On any finite interval, compound-Poisson innovations have a finite representation. They can be stored on a computer with the quantization of real numbers as the sole source of information loss. They are therefore well adapted to simulation purposes.

²The random variable $X$ is said to be regular if $\mathbb{E}[|X|^{\epsilon}] < +\infty$ for some $\epsilon > 0$.
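Sampling a compound-Poisson variable is equally direct. The sketch below is a minimal illustration; the rate and the Gaussian amplitude law $\nu$ are arbitrary choices for the example.

```python
import numpy as np

def compound_poisson(lam, amplitude_sampler, rng):
    """Draw X = A_1 + ... + A_K with K ~ Poisson(lam), A_k i.i.d. ~ nu."""
    K = rng.poisson(lam)
    return amplitude_sampler(rng, K).sum()   # the empty sum is 0.0

rng = np.random.default_rng(0)
# Example: rate lambda = 5 and a standard Gaussian amplitude law nu.
x = compound_poisson(5.0, lambda rng, K: rng.normal(size=K), rng)
```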
B. Generalized Lévy Processes

The sparse-stochastic-process framework of Unser et al. [10] is a comprehensive theory of generalized Lévy processes. These are stochastic processes that can be whitened by some admissible linear, shift-invariant operator. More precisely, $s$ is a generalized Lévy process if there exists an operator $\mathrm{L}$ such that $w = \mathrm{L}s$ is a Lévy white noise. Equivalently, one may view generalized Lévy processes as the solutions of the stochastic differential equation

$$\mathrm{L}s = w. \qquad (5)$$

It has been shown that, under mild technical assumptions on $\mathrm{L}$ and $w$, a solution $s$ of (5) exists and constitutes a properly defined generalized stochastic process over $\mathcal{S}'(\mathbb{R})$ [36].

When $\mathrm{L}$ is an operator with a trivial null space, such as $\mathrm{L} = (\mathrm{D} - \alpha\mathrm{I})$ with $\Re(\alpha) \neq 0$, we can write $s = \mathrm{L}^{-1}w$, where $\mathrm{L}^{-1}$ is the inverse of $\mathrm{L}$. However, when the null space is nontrivial, for instance when $\mathrm{L}$ corresponds to an unstable ordinary differential equation, the specification of boundary conditions becomes necessary to uniquely identify the solution. The boundary conditions take the form

$$\phi_\ell(s) = c_\ell, \quad \ell = 1, \ldots, N_0, \qquad (6)$$

where the $\phi_\ell : s \mapsto \phi_\ell(s) \in \mathbb{R}$ are appropriate linear functionals, $c_\ell \in \mathbb{R}$, and $N_0$ is the dimension of the null space of $\mathrm{L}$. For instance, one can impose that the process $s$ takes fixed values at reference locations $t_1 < \cdots < t_{N_0}$; that is, $\phi_\ell(s) = s(t_\ell) = c_\ell$ for $\ell = 1, \ldots, N_0$. Such boundary conditions appear in the classical definition of Lévy processes (including Brownian motion), where we have that $\phi_1(s) = s(0) = 0$ (Chapter 7 of [10]). We formally write $s = \mathrm{L}_\phi^{-1}w$, where $\mathrm{L}_\phi^{-1}$ is the right inverse of $\mathrm{L}$ that incorporates the boundary conditions (6) (Chapter 5.4 of [10]).

When $w$ is a compound-Poisson innovation of the form (4), the process $s = \mathrm{L}_\phi^{-1}w$ (respectively, $\mathrm{L}^{-1}w$ when the null space of $\mathrm{L}$ is trivial) is called a generalized Poisson process.

The fundamental property for this work is that any sparse stochastic process $s$ that is the solution of (5) can be specified as the limit in law of a sequence $\{s_n\}_{n \in \mathbb{N}}$ of generalized Poisson processes [26]. The corresponding driving processes $w_n = \mathrm{L}s_n$ are compound-Poisson innovations of the form

$$w_n = \sum_{k \in \mathbb{Z}} A_{k,n}\,\delta(\cdot - \tau_{k,n}) \qquad (7)$$

with rates $\lambda_n = n$ and with i.i.d. amplitudes $A_{k,n}$ that are infinitely divisible random variables with Lévy exponent $f_n = \frac{f}{n}$, where $f$ is the Lévy exponent of $w$.

C. Green's Functions
The Green's function of a differential operator $\mathrm{L}$ is a tempered distribution $\rho_{\mathrm{L}} \in \mathcal{S}'(\mathbb{R})$ that satisfies $\mathrm{L}\rho_{\mathrm{L}} = \delta$.
It can be viewed as the impulse response of the inverse of $\mathrm{L}$. The canonical Green's function is

$$\rho_{\mathrm{L}} = \mathcal{F}^{-1}\left\{\frac{1}{\widehat{L}(\omega)}\right\},$$

where $\widehat{L}$ is the frequency response of $\mathrm{L}$ (Chapter 5.2 of [10]). This definition can be made to stay valid even when $\widehat{L}$ vanishes at some points, as long as $\frac{1}{\widehat{L}(\omega)}$ is in $\mathcal{S}'(\mathbb{R})$. For details on how to compute Green's functions, the reader is referred to Appendix B. We have plotted the Green's functions of several operators in Figure 1 to highlight their variety and their dependence on $\mathrm{L}$.

III. METHOD
In this section, we introduce our method for generating (approximate) trajectories of a sparse stochastic process $s$ that is whitened by an operator $\mathrm{L}$ and whose innovation noise is $w$. When necessary, we assume general boundary conditions of the form $\phi_\ell(s) = 0$ for $\ell = 1, \ldots, N_0$, where $N_0$ is the dimension of the null space of $\mathrm{L}$.

As mentioned earlier, the process $s$ is the limit of generalized compound-Poisson processes $s_n$ driven by $w_n = \mathrm{L}s_n$, a compound-Poisson innovation of the form (7). The process $s_n$ can therefore be written as

$$s_n = \sum_{k \in \mathbb{Z}} A_{k,n}\,\rho_{\mathrm{L}}(\cdot - \tau_{k,n}) + p_{0,n},$$

where $\rho_{\mathrm{L}}$ is a Green's function of $\mathrm{L}$ and $p_{0,n}$ is an element of the null space of $\mathrm{L}$ determined by the boundary conditions (it vanishes when $\mathrm{L}$ is invertible). Indeed, we have that

$$\mathrm{L}\Big\{\sum_{k \in \mathbb{Z}} A_{k,n}\,\rho_{\mathrm{L}}(\cdot - \tau_{k,n}) + p_{0,n}\Big\} = \sum_{k \in \mathbb{Z}} A_{k,n}\,\mathrm{L}\{\rho_{\mathrm{L}}(\cdot - \tau_{k,n})\} = \sum_{k \in \mathbb{Z}} A_{k,n}\,\delta(\cdot - \tau_{k,n}) = w_n.$$

For large values of $n$, the process $s_n$ is assumed to be a good approximation of $s$. So, our goal is to generate samples of $s_n$ on any uniform grid over any interval $[0, T]$. More precisely, once an interval $[0, T]$ is specified and a regular grid with step size $h$ is provided, our aim is to obtain the vector $\mathbf{s}_n$ whose components are $[\mathbf{s}_n]_i = s_n(ih)$, for $i = 0, \ldots, (\lceil T/h \rceil - 1)$.

A. Simulating the Innovation Process
We begin by obtaining a realization of the driving innovation $w_n$. It consists of a sequence of impulse locations $(\tau_{k,n})$ and a corresponding sequence of amplitudes $(A_{k,n})$.

The sequence $(\tau_{k,n})$ is a Poisson point process. Its realization on the interval $[0, T]$ is simulated in two steps. First, a Poisson random variable $K$ with parameter $\lambda = nT$ is generated. Then, $K$ impulse locations $(\tau_{k,n})_{k \in \{1, \ldots, K\}}$ are sampled uniformly on $[0, T]$.

The next step is to simulate the $K$ corresponding amplitudes $(A_{k,n})_{k \in \{1, \ldots, K\}}$. The characteristic function of the amplitude variables $A_{k,n}$ is

$$\xi \mapsto \exp\left(\frac{f(\xi)}{n}\right).$$

We refer to it as the $n$th root of the law of $\langle\mathrm{rect}_{[0,1]}, w\rangle$. Our assumption in this paper is that there exists, for any $n \in \mathbb{N}$, a known method to generate infinitely divisible variables with Lévy exponent $\frac{f(\xi)}{n}$.³ For common parametric distributions such as $\alpha$-stable, Laplace, and gamma distributions, such sampling methods [39] are well known and implemented in scientific computing libraries.⁴ Simulating from their $n$th root is a simple matter of rescaling their parameters, as summarized in Table II. By applying the correct rescaling, we simulate $K$ independent amplitudes and thus obtain the sequence $(A_{k,n})_{k \in \{1, \ldots, K\}}$.

TABLE II: The $n$th root of infinitely divisible distributions.

  Distribution                                $n$th root
  Gaussian $(\mu, \sigma^2)$                  Gaussian $(\frac{\mu}{n}, \frac{\sigma}{\sqrt{n}})$
  $\alpha$-stable $(\alpha, \beta, \mu, c)$   if $\alpha \neq 1$: $(\alpha, \beta, \frac{\mu}{n}, \frac{c}{n^{1/\alpha}})$; if $\alpha = 1$: $(\alpha, \beta, \frac{\mu}{n} - \frac{2}{\pi}\frac{c\beta\log(n)}{n}, \frac{c}{n})$
  Gamma $(\alpha, \beta)$                     Gamma $(\frac{\alpha}{n}, \beta)$
  Compound-Poisson of intensity $\lambda$     Compound-Poisson of intensity $\frac{\lambda}{n}$
  Laplace $(\mu, b)$                          $X_n = \frac{\mu}{n} + b\,(G_1^{(n)} - G_2^{(n)})$ with $G_1^{(n)}, G_2^{(n)} \sim \mathrm{Gamma}(\frac{1}{n}, 1)$

³Workarounds exist for when a sampling method for the $n$th root $\frac{f(\xi)}{n}$ is unavailable. For instance, one can opt for an approximate sampling scheme such as in [38].
⁴E. Jones, et al., "SciPy: Open source scientific tools for Python," 2001.
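These two steps translate directly into code. The sketch below draws a realization of $w_n$ on $[0, T]$ when the target noise is Gaussian$(0, \sigma^2)$, whose $n$th root is Gaussian$(0, \sigma^2/n)$ by Table II; the function name and its interface are ours.

```python
import numpy as np

def simulate_innovation(n, T, sigma, rng):
    """Impulse locations and amplitudes of the compound-Poisson innovation
    w_n that approximates a Gaussian(0, sigma^2) white noise."""
    K = rng.poisson(n * T)               # number of impulses, parameter nT
    tau = rng.uniform(0.0, T, size=K)    # uniform impulse locations on [0, T]
    # n-th root rescaling (Table II): Gaussian(0, s^2) -> Gaussian(0, s^2/n).
    A = rng.normal(scale=sigma / np.sqrt(n), size=K)
    return tau, A

tau, A = simulate_innovation(n=1000, T=10.0, sigma=1.0,
                             rng=np.random.default_rng(0))
```

For another distribution of Table II, only the line that draws the amplitudes changes.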
B. Generalized Increment Process

With the impulse locations $(\tau_{k,n})_{k \in \{1, \ldots, K\}}$ and amplitudes $(A_{k,n})_{k \in \{1, \ldots, K\}}$ in hand, we can compute samples of

$$s_n(\cdot) = \sum_{k=1}^{K} A_{k,n}\,\rho_{\mathrm{L}}(\cdot - \tau_{k,n}) + p_{0,n} \qquad (8)$$

on a grid. A direct approach to generate $\mathbf{s}_n$ is to use the expansion (8) and represent the process as a sum of shifted Green's functions. However, in this case, the determination of $s_n(t)$ at any point $t \in [0, T]$ may require the nontrivial computation of each and every term in (8). This stems from the fact that Green's functions are, in general, infinitely supported. There are therefore potential drawbacks to expansions in the basis of shifted Green's functions like (8). To overcome these issues, we propose instead an alternative method based on B-splines.

Recall that $\mathrm{L}$ is a rational operator of the form $P(\mathrm{D})Q(\mathrm{D})^{-1}$, where we take $\{\alpha_1, \ldots, \alpha_{\deg(P)}\}$ to be the roots of $P$, with possible repetitions. Its discrete counterpart $\mathrm{L}_d^h$ is defined as

$$\mathrm{L}_d^h\{f\} = \sum_{m=0}^{\deg(P)} r[m]\,f(\cdot - mh),$$

where the sequence $r$ is determined through its Fourier transform

$$R(\mathrm{e}^{\mathrm{j}\omega}) = \sum_{m=0}^{\deg(P)} r[m]\,\mathrm{e}^{-\mathrm{j}\omega m} = \prod_{m=1}^{\deg(P)} (1 - \mathrm{e}^{\alpha_m h}\mathrm{e}^{-\mathrm{j}\omega h}).$$

It is a finite-impulse-response (FIR) filter. Its null space contains the null space of $\mathrm{L}$ [31]. The function $\beta_{\mathrm{L}}^h := \mathrm{L}_d^h\{\rho_{\mathrm{L}}\}$ is called the B-spline corresponding to $\mathrm{L}$ [40]. The B-spline has the fundamental property of being the shortest possible function within the space of cardinal L-splines (its support is included in $[0, \deg(P) \times h]$) [41], [42]. This will turn out to be crucial for the numerical efficiency of our method. Moreover, B-splines reproduce both the Green's function and the elements of the null space of their corresponding operator $\mathrm{L}$ [10, Section 6.4]. For instance, for $\mathrm{L} = (\mathrm{D} - \alpha\mathrm{I})$ with $\Re(\alpha) \leq 0$, we have $r[0] = 1$, $r[1] = -\mathrm{e}^{\alpha h}$, $\rho_{\mathrm{L}}(t) = \mathrm{e}^{\alpha t}\mathbb{1}_+(t)$, and $\beta_{\mathrm{L}}^h(t) = \mathrm{e}^{\alpha t}$ for $t \in [0, h)$ and $0$ elsewhere. Examples of relevant generalized B-splines are shown in Figure 1 (right plots). Note how they contrast with the corresponding infinitely supported Green's functions (left plots).

The application of $\mathrm{L}_d^h$ to $s_n$ yields

$$u_n(t) = \mathrm{L}_d^h\{s_n\}(t) = \sum_{m=0}^{\deg(P)} r[m]\,s_n(t - mh). \qquad (9)$$

The process $u_n$ in (9) is called the generalized increment process. Interestingly, it can be written as a sum of compactly supported terms, as in

$$u_n(t) = \mathrm{L}_d^h\{s_n\}(t) = \sum_{k=1}^{K} A_{k,n}\,\mathrm{L}_d^h\{\rho_{\mathrm{L}}(\cdot - \tau_{k,n})\}(t) + \mathrm{L}_d^h\{p_{0,n}\}(t) = \sum_{k=1}^{K} A_{k,n}\,\beta_{\mathrm{L}}^h(t - \tau_{k,n}) + 0.$$

The process $u_n$, along with the boundary conditions, is our alternate representation of $s_n$. Now, let $\mathbf{u}_n$ be the vector whose components are $[\mathbf{u}_n]_i = u_n(ih)$, for $i = 0, \ldots, (\lceil T/h \rceil - 1)$. This vector can be computed more efficiently than $\mathbf{s}_n$ since the process $u_n$ admits a representation with compactly supported terms. Moreover, $\mathbf{u}_n$ is linearly related to the vector $\mathbf{s}_n$ via a discrete system of difference equations. Indeed, we have that

$$[\mathbf{u}_n]_i = \sum_{m=0}^{\deg(P)} r[m]\,[\mathbf{s}_n]_{i-m}, \qquad (10)$$

for $\deg(P) \leq i \leq (\lceil T/h \rceil - 1)$. For $0 \leq i < \deg(P)$, we have that

$$[\mathbf{u}_n]_i = \sum_{m=0}^{\deg(P)} r[m]\,s_n((i - m)h),$$

where the values $s_n(-mh)$ for $m = 0, \ldots, (\deg(P) - 1)$ provide the boundary values. These relations are established by writing (9) with $t = ih$.
The boundary values are determined by the null-space term $p_{0,n}$, which is itself determined by the boundary conditions.

Thus, once we have evaluated $\mathbf{u}_n$, we can obtain $\mathbf{s}_n$ by solving (10), which is accomplished by applying a recursive reverse filter to $\mathbf{u}_n$. This is performed by rewriting (10) as

$$[\mathbf{s}_n]_i = \frac{1}{r[0]}\Big([\mathbf{u}_n]_i - \sum_{m=1}^{\deg(P)} r[m]\,[\mathbf{s}_n]_{i-m}\Big). \qquad (11)$$

By substitution of the boundary values when necessary (i.e., taking $s_n((i - m)h)$ instead of $[\mathbf{s}_n]_{i-m}$ when $(i - m) \leq 0$), (11) allows one to recursively compute the components of $\mathbf{s}_n$.
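In code, (11) is a plain all-pole recursion. The sketch below assumes zero boundary values $s_n(-mh) = 0$, as in the recipe of Subsection III-D; `r` holds the FIR coefficients $r[0], \ldots, r[\deg(P)]$.

```python
import numpy as np

def reverse_filter(u, r):
    """Solve (10) for s_n through the recursion (11),
    assuming zero boundary values s_n(-mh) = 0."""
    s = np.zeros_like(u, dtype=float)
    for i in range(len(u)):
        acc = u[i]
        for m in range(1, len(r)):
            if i - m >= 0:              # terms with (i - m) < 0 vanish here
                acc -= r[m] * s[i - m]
        s[i] = acc / r[0]
    return s
```

Equivalently, one can call `scipy.signal.lfilter([1.0], r, u)`, which implements the same recursion.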
C. Computing the Generalized Increment Process

We now describe an efficient procedure to compute the generalized increment process. The components of $\mathbf{u}_n$ are given by

$$[\mathbf{u}_n]_i = \sum_{k=1}^{K} A_{k,n}\,\beta_{\mathrm{L}}^h(ih - \tau_{k,n}).$$

The naive approach here would be to iterate through each grid point $i$ independently and compute $[\mathbf{u}_n]_i$. Doing so would require one to read the entire sequence of impulse locations $(\tau_k)$ for each $i$. This cannot be avoided since there is no information on the sequence $(\tau_k)$, aside from its inclusion in $[0, T]$. We simply would not know which B-spline terms are inactive, so we would have to iterate through them all. A more efficient approach is to iterate through the list of impulses instead of the grid points.

Fig. 2: For a single B-spline term, only the grid points that sit within the support of the B-spline are incremented (black stems).

The idea is as follows: First, initialize the vector $\mathbf{u}_n$ to zeros. Then, read the list of impulse locations one by one. For each impulse at $\tau_k$, find the grid points that lie within the support of the B-spline at $\tau_k$. Then, increment the value of $\mathbf{u}_n$ on those grid points by the contribution of the considered B-spline (see Figure 2). In one pass over the list of impulses, this method computes the values $[\mathbf{u}_n]_i = u_n(ih)$.

This intermediate computation of the generalized increment process provides a considerable gain in terms of efficiency. Instead of having a number of operations that scales with $\lceil T/h \rceil \times K$ for the Green's-function representation, we have one that scales with $\deg(P) \times (\lceil T/h \rceil + K)$.
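A sketch of this impulse-centric pass is given below; `beta` is assumed to evaluate $\beta_{\mathrm{L}}^h$ pointwise, and the function name is ours.

```python
import numpy as np

def generalized_increments(tau, A, beta, degP, T, h):
    """One pass over the impulses: the impulse at tau_k only touches the
    grid points that fall in the support [0, degP * h] of the B-spline."""
    N = int(np.ceil(T / h))
    u = np.zeros(N)
    for tau_k, A_k in zip(tau, A):
        i_grid = int(np.floor(tau_k / h))          # closest grid point
        for i in range(i_grid, min(i_grid + degP + 1, N)):
            u[i] += A_k * beta(i * h - tau_k)
    return u
```

Each impulse costs $\mathcal{O}(\deg(P))$ operations, which yields the announced $\deg(P) \times (\lceil T/h \rceil + K)$ scaling once the reverse filtering is included.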
D. Recipe to Generate Trajectories

Here is a summary of the procedure that generates trajectories of $\mathrm{L}s_n = w_n$.

First, fix the infinitely divisible distribution⁵ that corresponds to $w$ and define the operator $\mathrm{L}$ by identifying the polynomials $P$ and $Q$.

Pick a sufficiently large value for $n$. Intuitively, $n$ should be large enough to ensure the occurrence of several jumps in each bin. In other words, we expect $n$ to be of the same order as $h^{-1}$. This has been validated by our numerical experiments as well, where we show that it provides a good approximation of the underlying statistics of the process (see Subsection IV-C and Figures 5, 6, and 7).

Pick a simulation interval $[0, T]$ and generate $w_n$ as described in Subsection III-A. Determine an explicit form for $\rho_{\mathrm{L}}$. At this point, the grid-free approximation $s_n$ (expressed as in (8)) is available and can be stored.

Fix a grid on $[0, T]$ by choosing a step size $h$. Then determine the vector $\mathbf{s}_n$ with components $[\mathbf{s}_n]_i = s_n(ih)$. Compute the FIR filter $\mathrm{L}_d^h$ and obtain $\beta_{\mathrm{L}}^h = \mathrm{L}_d^h\{\rho_{\mathrm{L}}\}$. Then, compute the generalized increment vector $\mathbf{u}_n$ as described in Subsection III-C.

To obtain $\mathbf{s}_n$, apply the reverse filter to $\mathbf{u}_n$ following (11). Take the values $s_n(-mh)$ for $m = 0, \ldots, (\deg(P) - 1)$ to be zero in most cases, except when $\mathrm{L}$ has a nontrivial null space, in which case they are derived from the boundary conditions. The pseudocode of our method is provided in Algorithm 1.

⁵The choice here is restricted to parametric families that we can rescale and simulate.
Algorithm 1: Procedure to obtain $\mathbf{s}_n$.

Input: Coefficients of $P$ and $Q$, approximation level $n$, interval size $T$, step size $h$
Output: Vector $\mathbf{s}_n$

  Compute $\rho_{\mathrm{L}}$ and the FIR filter $r[m]$
  Compute $\beta_{\mathrm{L}}^h = \mathrm{L}_d^h\{\rho_{\mathrm{L}}\}$
  Generate $[(\tau_1, A_1), \ldots, (\tau_K, A_K)]$
  Initialize $\mathbf{u}_n$ with zeros as an array of size $\lceil T/h \rceil$
  foreach $(\tau_k, A_k)$ do
    Find the closest grid point $i_{\mathrm{grid}} = \lfloor \tau_k/h \rfloor$
    foreach $i \in \{i_{\mathrm{grid}}, \ldots, i_{\mathrm{grid}} + \deg(P)\}$ do
      $[\mathbf{u}_n]_i \leftarrow [\mathbf{u}_n]_i + A_k \times \beta_{\mathrm{L}}^h(ih - \tau_k)$
  Recursively apply the reverse filter to $\mathbf{u}_n$ following (11)
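As a worked instance of Algorithm 1, the self-contained sketch below specializes the recipe to $\mathrm{L} = \mathrm{D}$ with a Gaussian innovation, for which $\deg(P) = 1$, $r = (1, -1)$, $\beta_{\mathrm{D}}^h$ is the indicator of an interval of length $h$, and the reverse filter (11) reduces to a cumulative sum. The parameter values are arbitrary.

```python
import numpy as np

def generate_gaussian_levy(n, T, h, rng):
    """Approximate Brownian motion, i.e., the solution of D s = w with
    s(0) = 0, by following Algorithm 1 for L = D."""
    # Innovation w_n: K ~ Poisson(nT) impulses, Gaussian(0, 1/n) amplitudes.
    K = rng.poisson(n * T)
    tau = rng.uniform(0.0, T, size=K)
    A = rng.normal(scale=1.0 / np.sqrt(n), size=K)
    # Generalized increments: beta_D^h(ih - tau_k) is 1 only for the first
    # grid point at or after tau_k, so each impulse touches a single bin.
    N = int(np.ceil(T / h))
    u = np.zeros(N)
    idx = np.ceil(tau / h).astype(int)
    keep = idx < N                     # drop impulses beyond the last sample
    np.add.at(u, idx[keep], A[keep])   # accumulate repeated indices correctly
    # Reverse filter (11) with r = (1, -1): s_i = u_i + s_{i-1}.
    return np.cumsum(u)

s = generate_gaussian_levy(n=1000, T=10.0, h=0.01,
                           rng=np.random.default_rng(0))
```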
E. Correlation Structure
In this section, we show a merit of our method by proving that the generated approximations preserve the correlation structure of the target process. First, note that, for any Lévy white noise $w$, we have that

$$w_n \overset{\mathcal{L}}{\longrightarrow} w,$$

where the sequence of compound-Poisson innovations $(w_n)_{n \in \mathbb{N}}$ is defined in (7). We refer to this approximating sequence in Proposition 2.

Proposition 2.
Let $w$ be a Lévy white noise such that $X_{\mathrm{rect}} = \langle\mathrm{rect}_{[0,1]}, w\rangle$ has zero mean and the finite variance $\sigma_w^2 = \mathbb{E}[X_{\mathrm{rect}}^2]$. Let $n \in \mathbb{N}$ and let $w_n$ be a compound-Poisson innovation that approximates $w$ as defined in (7). Denoting $X_{\mathrm{rect},n} = \langle\mathrm{rect}_{[0,1]}, w_n\rangle$, we have that $\mathbb{E}[X_{\mathrm{rect},n}] = \mathbb{E}[X_{\mathrm{rect}}] = 0$ and

$$\sigma_{w_n}^2 = \mathbb{E}[X_{\mathrm{rect},n}^2] = \mathbb{E}[X_{\mathrm{rect}}^2] = \sigma_w^2.$$

The proof can be found in Appendix C. Now, if $s_n = \mathrm{L}^{-1}w_n$ is a generalized Poisson process that approximates $s = \mathrm{L}^{-1}w$, then

$$\mathbb{E}[\langle\varphi_1, s_n\rangle\langle\varphi_2, s_n\rangle] = \mathbb{E}[\langle\varphi_1, \mathrm{L}^{-1}w_n\rangle\langle\varphi_2, \mathrm{L}^{-1}w_n\rangle] = \mathbb{E}[\langle\mathrm{L}^{-1*}\varphi_1, w_n\rangle\langle\mathrm{L}^{-1*}\varphi_2, w_n\rangle]$$
$$= \sigma_{w_n}^2\langle\mathrm{L}^{-1*}\varphi_1, \mathrm{L}^{-1*}\varphi_2\rangle = \sigma_w^2\langle\mathrm{L}^{-1*}\varphi_1, \mathrm{L}^{-1*}\varphi_2\rangle = \mathbb{E}[\langle\varphi_1, s\rangle\langle\varphi_2, s\rangle]. \qquad (12)$$

From (12), we conclude that, more than just being approximated, the correlation structure is preserved exactly in our method.

Fig. 3: Trajectories of Lévy processes ($\mathrm{L} = \mathrm{D}$) with different innovations. From top to bottom: Laplace(0, 1), gamma(1, 1), Gaussian(0, 1), and symmetric-$\alpha$-stable.
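Proposition 2 can also be checked empirically. The sketch below estimates the first two moments of $X_{\mathrm{rect},n}$ for a standard Gaussian target noise ($\sigma_w^2 = 1$); it uses the fact that, conditionally on $K$, a sum of $K$ Gaussian$(0, 1/n)$ amplitudes is Gaussian$(0, K/n)$. Parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_samples = 100, 100_000

# X_rect,n = sum of K ~ Poisson(n) i.i.d. amplitudes distributed as the
# n-th root of Gaussian(0, 1), i.e., Gaussian(0, 1/n).
K = rng.poisson(n, size=n_samples)
x = rng.normal(size=n_samples) * np.sqrt(K / n)   # Gaussian(0, K/n) given K

print(x.mean(), x.var())   # both close to (0, 1), as stated in Proposition 2
```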
IV. NUMERICAL EXPERIMENTS

In this section, we validate our approach by conducting several numerical experiments. Let us also mention that a Python library that implements our algorithm can be found online.⁶ Moreover, an accompanying web interface has also been designed and is available.⁷

⁶https://github.com/Biomedical-Imaging-Group/Generating-Sparse-Processes
⁷https://saturdaygenfo.pythonanywhere.com

A. Generating Lévy Processes
Among all the processes we can generate, those that are solutions of $\mathrm{D}s = w$ are called Lévy processes when the boundary condition is $s(0) = 0$. We showcase in Figure 3 different Lévy processes that correspond to several infinitely divisible distributions. For all four simulations, we took $n = 1{,}000$ and $h = 0.001$. As we demonstrate in Subsection IV-C, a reasonable choice for these parameters is to set $nh$ to be a small integer (here, $nh = 1$). The visual appearance of the trajectories matches our expectations: the trajectory driven by a Gaussian innovation has the appearance of Brownian motion; the gamma Lévy process is nondecreasing.

B. Choice of the Operator
Our framework allows for any rational operator of the form $P(\mathrm{D})Q(\mathrm{D})^{-1}$, so long as $\deg(P) > \deg(Q)$. In Figure 4, we generate trajectories of $s$ that are solutions of $\mathrm{L}s = w$, where $w$ is a symmetric-$\alpha$-stable innovation. Here, we took $n = 200$. We see that, for various choices of $\mathrm{L}$, the characteristics of the signal are markedly different, which exhibits the breadth of the modeling framework proposed in [10].

C. Convergence as $n$ Grows
In Figure 5, we illustrate how an increase in $n$ improves the approximation. In addition, we have depicted the convergence of moments in Figure 6. While the two figures emphasize the effect of $n$, they are insufficient to provide a quantitative way to choose $n$.

Fig. 4: Trajectories of the solution $s$ of $\mathrm{L}s = w$ for different operators $\mathrm{L}$. In all cases, we considered a symmetric-$\alpha$-stable white noise $w$.

Fig. 5: Approximations of Brownian motion (solution of $\mathrm{D}s = w$, with $w$ a Gaussian white Lévy noise) as $n$ increases.

Fig. 6: Convergence of $\mathbb{E}[|\langle\mathrm{rect}_{[0,h]}, s_n\rangle|^p]$ to $\mathbb{E}[|\langle\mathrm{rect}_{[0,h]}, s\rangle|^p]$ for fixed $p$ and $h$, and several symmetric-$\alpha$-stable Lévy white noises $w$. The expectations are estimated empirically for each $n$.

Here, we propose a measure that is based on the statistics of the generalized increment process. Since the process $u_n$ is maximally decoupled, we can estimate the distribution of $U_n = \langle\beta_{\mathrm{L}}^{h\vee}, w_n\rangle$ from the samples $\{[\mathbf{u}_n]_i\}_i$ of the generalized increment process on the grid and obtain the empirical cumulative distribution function (CDF) $\tilde{F}_n(\cdot)$ of $U_n$. We then compare this empirical function to the reference CDF $F(\cdot)$ of $U = \langle\beta_{\mathrm{L}}^{h\vee}, w\rangle$. For the comparison, we use the Kolmogorov–Smirnov (KS) divergence [43], defined as

$$\mathrm{KS}(\tilde{F}_n, F) = \max_{x \in \mathbb{R}} |\tilde{F}_n(x) - F(x)|.$$

We then select $n$ such that the KS divergence is smaller than a certain threshold. The choice of the threshold is conditioned by the desired numerical precision: the lower the threshold, the more faithful the trajectories, but the higher the computational cost of the algorithm.

Fig. 7: Kolmogorov–Smirnov (KS) divergence versus the average number of jumps per bin ($N_{\mathrm{jumps}} = nh$).

Intuitively, we expect that it is necessary to have several jumps in each bin in order to properly approximate the statistics of the process. The average number of jumps in each bin of length $h$ is $N_{\mathrm{jumps}} = nh$, so we expect $n$ to be of the order of $h^{-1}$.

In Figure 7, we have validated this intuition by plotting the KS divergence for different values of $N_{\mathrm{jumps}}$ in various settings. In all cases, as $N_{\mathrm{jumps}}$ increases, the KS divergence decreases to a baseline error value that is due to the finite-sample estimation of the underlying distribution.
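This selection rule is simple to reproduce for $\mathrm{L} = \mathrm{D}$ and a standard Gaussian noise, for which the reference variable $U = \langle\beta_{\mathrm{D}}^{h\vee}, w\rangle$ is Gaussian with variance $h$ by Proposition 1. The sketch below tracks the KS divergence as $n$ grows; parameter values are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
T, h = 100.0, 0.01
N = int(np.ceil(T / h))

for n in (10, 100, 1_000, 10_000):
    # Generalized increments for L = D: each bin of length h collects
    # the amplitudes of the impulses of w_n that it contains.
    K = rng.poisson(n * T)
    tau = rng.uniform(0.0, T, size=K)
    A = rng.normal(scale=1.0 / np.sqrt(n), size=K)
    bins = np.minimum((tau / h).astype(int), N - 1)
    u = np.bincount(bins, weights=A, minlength=N)
    # Empirical CDF of the samples versus the reference N(0, h) CDF.
    ks = stats.kstest(u, "norm", args=(0.0, np.sqrt(h))).statistic
    print(f"n = {n:6d}, N_jumps = nh = {n * h:8.2f}, KS = {ks:.3f}")
```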
D. Benefits of Grid-Free Approximations

Recall that a main motivation for our algorithm was to make it compatible with multigrid methods. In our approach, the approximation $s_n$ lives off the grid. It is only after the specification of the step size $h$ that $s_n$ is sampled on a grid. The generation of the random variables that determine $s_n$ and the sampling on a grid are completely decoupled. This means that the same approximation $s_n$ can be viewed through different grids, which we illustrate in Figure 8. The solution of $(\mathrm{D} + \mathrm{I})s = w$, where $w$ is a Gaussian white Lévy noise, is first approximated by $s_n$; the approximation is then viewed on different regular grids.

Fig. 8: Single grid-free approximation sampled on grids that differ by their step size.

Fig. 9: Average computation time for a trajectory of the solution of a first-order equation $\mathrm{L}s = w$, where $w$ is a Gaussian white noise.

E. Computational Efficiency
A crucial component of our approach is the computation of the generalized increment $\mathbf{u}_n$ in order to obtain the values of $s_n$ on a grid. This provides a gain in numerical efficiency that can be felt even on moderately sized simulations. As can be seen in Figure 9, using a Green's-function representation requires significantly more time than using an intermediate B-spline representation.

V. CONCLUSION
We have described a novel approach for generating sparse stochastic processes. Our method leverages the properties of B-splines to guarantee good numerical efficiency. A possible direction for future work is to provide theoretical guidance on how one should choose the parameter $n$ in terms of a prescribed tolerance on the approximation error.

ACKNOWLEDGMENT
The authors would like to thank Dr. Julien Fageot, Pakshal Bohra, and Thomas Debarre for enlightening discussions.
APPENDIX A
GENERALIZED STOCHASTIC PROCESSES
Generalized stochastic processes are random elements of $\mathcal{S}'(\mathbb{R})$ that can be fully specified by their characteristic functionals. These are infinite-dimensional generalizations of the characteristic functions of real random variables.

Definition 2.
The characteristic functional of the generalized stochastic process $s$ is the functional $\widehat{P}_s : \mathcal{S}(\mathbb{R}) \to \mathbb{C}$ such that

$$\widehat{P}_s(\varphi) = \mathbb{E}[\mathrm{e}^{\mathrm{j}\langle\varphi, s\rangle}],$$

for all $\varphi \in \mathcal{S}(\mathbb{R})$. It is a continuous, positive-definite functional with $\widehat{P}_s(0) = 1$.

Just as in finite dimensions, $\widehat{P}_s$ contains all the statistical information about $s$. In particular, for any test function $\varphi \in \mathcal{S}(\mathbb{R})$, the distribution of the real random variable $\langle\varphi, s\rangle$ is entirely determined by $\widehat{P}_s$, as its probability density function $p$ is given by

$$p(t) \propto \mathcal{F}^{-1}\big\{\mathbb{E}[\mathrm{e}^{\mathrm{j}\omega\langle\varphi, s\rangle}]\big\}(t) = \mathcal{F}^{-1}\big\{\widehat{P}_s(\omega\varphi)\big\}(t),$$

where $\mathcal{F}^{-1}$ is the inverse Fourier transform. The construction of such objects was initiated in [44]. Their use for modeling sparse signals was introduced in [10].
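As a standard illustration (added here for concreteness), a zero-mean Gaussian white noise of variance $\sigma^2$, whose Lévy exponent is $f(\xi) = -\sigma^2\xi^2/2$, has the characteristic functional

$$\widehat{P}_w(\varphi) = \exp\Big(-\frac{\sigma^2}{2}\|\varphi\|_{L_2}^2\Big),$$

from which $\langle\varphi, w\rangle \sim \mathcal{N}(0, \sigma^2\|\varphi\|_{L_2}^2)$ follows immediately.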
APPENDIX B
COMPUTING GREEN'S FUNCTIONS

Here, we describe a method to compute the Green's functions of rational operators. We begin with the intermediate computation of the Green's function of
$\mathrm{L} = (\mathrm{D} - \alpha\mathrm{I})^k$. We have that

$$\rho_{\alpha,k}(t) = \mathcal{F}^{-1}\left\{\frac{1}{(\mathrm{j}\omega - \alpha)^k}\right\}(t) = \begin{cases} \mathbb{1}_+(t)\,\frac{t^{k-1}}{(k-1)!}\,\mathrm{e}^{\alpha t}, & \Re(\alpha) \leq 0 \\ -\mathbb{1}_+(-t)\,\frac{t^{k-1}}{(k-1)!}\,\mathrm{e}^{\alpha t}, & \text{otherwise} \end{cases} \qquad (13)$$

is a Green's function of $\mathrm{L}$.

Now, recall that rational operators are of the form $\mathrm{L} = P(\mathrm{D})Q(\mathrm{D})^{-1}$, where $P$ and $Q$ are polynomials. Taking $\{\alpha_1, \ldots, \alpha_m\}$ to be the roots of $P$ with multiplicities $\{\gamma_1, \ldots, \gamma_m\}$, the inverse of the frequency response is given by

$$\frac{1}{\widehat{L}(\omega)} = \frac{Q(\mathrm{j}\omega)}{\prod_{i=1}^{m}(\mathrm{j}\omega - \alpha_i)^{\gamma_i}}.$$

This inverse is known to admit a partial-fraction decomposition of the form

$$\frac{1}{\widehat{L}(\omega)} = \sum_{i=1}^{m}\sum_{k=1}^{\gamma_i} \frac{c_{ik}}{(\mathrm{j}\omega - \alpha_i)^k}$$

for some constants $c_{ik} \in \mathbb{C}$. The corresponding Green's function is then given by

$$\rho_{\mathrm{L}}(t) = \mathcal{F}^{-1}\left\{\frac{1}{\widehat{L}(\omega)}\right\}(t) = \sum_{i=1}^{m}\sum_{k=1}^{\gamma_i} c_{ik}\,\mathcal{F}^{-1}\left\{\frac{1}{(\mathrm{j}\omega - \alpha_i)^k}\right\}(t) = \sum_{i=1}^{m}\sum_{k=1}^{\gamma_i} c_{ik}\,\rho_{\alpha_i,k}(t).$$

The Green's function of $\mathrm{L}$ is thus obtained by summing the Green's functions of the partial fractions, given in (13).
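In practice, the partial-fraction coefficients can be delegated to `scipy.signal.residue`, which returns the constants $c_{ik}$ together with the poles $\alpha_i$ (repeated poles are listed with increasing powers $k$). The sketch below assembles $\rho_{\mathrm{L}}$ from (13); the naming is ours, and polynomial coefficients are given in descending powers.

```python
import numpy as np
from math import factorial
from scipy import signal

def green_function(P_coeffs, Q_coeffs):
    """Return a callable rho_L for L = P(D) Q(D)^{-1}, built from the
    partial-fraction decomposition of Q(s)/P(s)."""
    c, poles, _ = signal.residue(Q_coeffs, P_coeffs)
    terms, k = [], 0
    for i, alpha in enumerate(poles):   # recover the power k of each pole
        k = k + 1 if i > 0 and np.isclose(alpha, poles[i - 1]) else 1
        terms.append((c[i], alpha, k))

    def rho(t):
        t = np.asarray(t, dtype=float)
        out = np.zeros(t.shape, dtype=complex)
        for c_ik, alpha, k in terms:
            base = t ** (k - 1) / factorial(k - 1) * np.exp(alpha * t)
            if alpha.real <= 0:         # causal branch of (13)
                out += c_ik * np.where(t > 0, base, 0.0)
            else:                       # anti-causal branch of (13)
                out -= c_ik * np.where(t < 0, base, 0.0)
        return out.real                 # imaginary parts cancel for real L
    return rho

# Example: L = (D + I)^2, i.e., P(s) = (s + 1)^2 and Q(s) = 1, for which
# rho_L(t) = t * exp(-t) * 1_{+}(t).
rho = green_function([1.0, 2.0, 1.0], [1.0])
```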
APPENDIX C
PROOF OF PROPOSITION 2

Since $w_n$ is a compound-Poisson innovation, $X_{\mathrm{rect},n}$ is a compound-Poisson random variable. It can be written as

$$X_{\mathrm{rect},n} = \sum_{i=1}^{N} A_i,$$

where $N$ is a Poisson random variable with rate $\lambda = n$ and the $A_i$ are independent and identically distributed infinitely divisible random variables, independent of $N$, with Lévy exponent $\frac{f}{n}$. In particular, the characteristic function of $A_1$ is $\mathrm{e}^{f/n}$, while that of $X_{\mathrm{rect}}$ is $\mathrm{e}^{f}$. Differentiating at $\xi = 0$ and using $\mathbb{E}[X_{\mathrm{rect}}] = 0$, we get that $\mathbb{E}[A_1] = \frac{1}{n}\mathbb{E}[X_{\mathrm{rect}}] = 0$ and $\mathbb{E}[A_1^2] = \frac{1}{n}\mathbb{E}[X_{\mathrm{rect}}^2] = \frac{\sigma_w^2}{n}$. The classical moment formulas for compound-Poisson sums then yield

$$\mathbb{E}[X_{\mathrm{rect},n}] = \lambda\,\mathbb{E}[A_1] = 0, \qquad \mathbb{E}[X_{\mathrm{rect},n}^2] = \lambda\,\mathbb{E}[A_1^2] = \sigma_w^2,$$

so that $X_{\mathrm{rect},n}$ and $X_{\mathrm{rect}}$ have the same first- and second-order moments.

REFERENCES
[1] R. Gray and L. Davisson, An Introduction to Statistical Signal Processing. Cambridge University Press, 2004.
[2] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Transactions on Computers, vol. 23, no. 1, pp. 90–93, Jan. 1974.
[3] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, Mar. 1960.
[4] D. Mumford and A. Desolneux, Pattern Theory: The Stochastic Analysis of Real-World Signals. A K Peters/CRC Press, Aug. 2010.
[5] A. Srivastava, A. B. Lee, E. P. Simoncelli, and S.-C. Zhu, "On advances in statistical modeling of natural images," Journal of Mathematical Imaging and Vision, vol. 18, no. 1, pp. 17–33, 2003.
[6] A. Amini, M. Unser, and F. Marvasti, "Compressibility of deterministic and random infinite sequences," IEEE Transactions on Signal Processing, vol. 59, no. 11, pp. 5193–5201, Nov. 2011.
[7] S. Mallat, A Wavelet Tour of Signal Processing. Elsevier, Sep. 1999.
[8] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[9] J.-L. Starck, F. Murtagh, and J. M. Fadili, Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity. Cambridge University Press, May 2010.
[10] M. Unser and P. D. Tafti, An Introduction to Sparse Stochastic Processes. Cambridge University Press, Aug. 2014.
[11] T. Kailath, "The innovations approach to detection and estimation theory," Proceedings of the IEEE, vol. 58, no. 5, pp. 680–695, 1970.
[12] K.-i. Sato, Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Nov. 1999.
[13] M. Unser, P. D. Tafti, and Q. Sun, "A unified formulation of Gaussian versus sparse stochastic processes–Part I: Continuous-domain theory," IEEE Transactions on Information Theory, vol. 60, no. 3, pp. 1945–1962, Mar. 2014.
[14] D. Mumford and B. Gidas, "Stochastic models for generic images," Quarterly of Applied Mathematics, vol. 59, no. 1, pp. 85–111, 2001.
[15] E. Bostan, U. S. Kamilov, M. Nilchian, and M. Unser, "Sparse stochastic processes and discretization of linear inverse problems," IEEE Transactions on Image Processing, vol. 22, no. 7, pp. 2699–2710, Jul. 2013.
[16] M. A. Kutay, A. P. Petropulu, and C. W. Piccoli, "On modeling biomedical ultrasound RF echoes using a power-law shot-noise model," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 48, no. 4, pp. 953–968, 2001.
[17] S. M. Kogon and D. G. Manolakis, "Signal modeling with self-similar α-stable processes: The fractional Lévy stable motion model," IEEE Transactions on Signal Processing, vol. 44, no. 4, pp. 1006–1010, 1996.
[18] J. R. Gallardo, D. Makrakis, and L. Orozco-Barbosa, "Use of α-stable self-similar stochastic processes for modeling traffic in broadband networks," Performance Evaluation, vol. 40, no. 1–3, pp. 71–98, 2000.
[19] N. Laskin, I. Lambadaris, F. C. Harmantzis, and M. Devetsikiotis, "Fractional Lévy motion and its application to network traffic modeling," Computer Networks, vol. 40, no. 3, pp. 363–375, 2002.
[20] A. Amini, P. Thévenaz, J. Ward, and M. Unser, "On the linearity of Bayesian interpolators for non-Gaussian continuous-time AR(1) processes," IEEE Transactions on Information Theory, vol. 59, no. 8, pp. 5063–5074, Aug. 2013.
[21] A. Amini, U. S. Kamilov, E. Bostan, and M. Unser, "Bayesian estimation for continuous-time sparse stochastic processes," IEEE Transactions on Signal Processing, vol. 61, no. 4, pp. 907–920, Feb. 2013.
[22] U. S. Kamilov, P. Pad, A. Amini, and M. Unser, "MMSE estimation of sparse Lévy processes," IEEE Transactions on Signal Processing, vol. 61, no. 1, pp. 137–147, Jan. 2013.
[23] S. J. Godsill and G. Yang, "Bayesian inference for continuous-time ARMA models driven by non-Gaussian Lévy processes," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 5. IEEE, 2006, pp. V–V.
[24] B. Øksendal, Stochastic Differential Equations: An Introduction with Applications, 2nd ed. Berlin: Springer, 1989.
[25] S. Rubenthaler, "Numerical simulation of the solution of a stochastic differential equation driven by a Lévy process," Stochastic Processes and their Applications, vol. 103, no. 2, pp. 311–349, Feb. 2003.
[26] J. Fageot, V. Uhlmann, and M. Unser, "Gaussian and sparse processes are limits of generalized Poisson processes," Applied and Computational Harmonic Analysis, 2018.
[27] P. J. Brockwell, "Lévy-driven CARMA processes," Annals of the Institute of Statistical Mathematics, vol. 53, no. 1, pp. 113–124, Mar. 2001.
[28] P. E. Protter, Stochastic Integration and Differential Equations, 2nd ed., ser. Stochastic Modelling and Applied Probability. Berlin, Heidelberg: Springer-Verlag, 2005.
[29] M. Vetterli, P. Marziliano, and T. Blu, "Sampling signals with finite rate of innovation," IEEE Transactions on Signal Processing, vol. 50, no. 6, pp. 1417–1428, Jun. 2002.
[30] M. Unser and P. D. Tafti, "Stochastic models for sparse and piecewise-smooth signals," IEEE Transactions on Signal Processing, vol. 59, no. 3, pp. 989–1006, 2010.
[31] M. Unser and T. Blu, "Cardinal exponential splines: Part I–Theory and filtering algorithms," IEEE Transactions on Signal Processing, vol. 53, no. 4, pp. 1425–1438, Apr. 2005.
[32] M. Unser, "Cardinal exponential splines: Part II–Think analog, act digital," IEEE Transactions on Signal Processing, vol. 53, no. 4, pp. 1439–1449, Apr. 2005.
[33] M. Unser, P. D. Tafti, A. Amini, and H. Kirshner, "A unified formulation of Gaussian versus sparse stochastic processes–Part II: Discrete-domain theory," IEEE Transactions on Information Theory, vol. 60, no. 5, pp. 3036–3051, May 2014.
[34] J. Fageot and T. Humeau, "Unified view on Lévy white noises: General integrability conditions and applications to linear SPDE," arXiv:1708.02500 [math], Aug. 2017.
[35] A. Amini and M. Unser, "Sparsity and infinite divisibility," IEEE Transactions on Information Theory, vol. 60, no. 4, pp. 2346–2358, Apr. 2014.
[36] J. Fageot, A. Amini, and M. Unser, "On the continuity of characteristic functionals and sparse stochastic modeling," Journal of Fourier Analysis and Applications, vol. 20, no. 6, pp. 1179–1211, Dec. 2014.
[37] R. C. Dalang and T. Humeau, "Lévy processes and Lévy white noise as tempered distributions," The Annals of Probability, vol. 45, no. 6B, pp. 4389–4418, 2017.
[38] L. Bondesson, "On simulation from infinitely divisible distributions," Advances in Applied Probability, vol. 14, no. 4, pp. 855–869, 1982.
[39] L. Devroye, "Nonuniform random variate generation," in Handbooks in Operations Research and Management Science. Elsevier, Jan. 2006, vol. 13, pp. 83–121.
[40] C. de Boor, "On calculating with B-splines," Journal of Approximation Theory, vol. 6, no. 1, pp. 50–62, 1972.
[41] I. J. Schoenberg, "Contributions to the problem of approximation of equidistant data by analytic functions," in I. J. Schoenberg Selected Papers. Springer, 1988, pp. 3–57.
[42] A. Ron, "Factorization theorems for univariate splines on regular grids," Israel Journal of Mathematics, vol. 70, no. 1, pp. 48–68, 1990.
[43] F. J. Massey Jr., "The Kolmogorov–Smirnov test for goodness of fit," Journal of the American Statistical Association, vol. 46, no. 253, pp. 68–78, 1951.
[44] I. M. Gel'fand and N. Ya. Vilenkin, Generalized Functions: Applications of Harmonic Analysis. Academic Press, May 2014.