Generating Sparse Stochastic Processes Using Matched Splines
Leello Dadi, Shayan Aziznejad, and Michael Unser, Fellow, IEEE

Leello Dadi and Shayan Aziznejad contributed equally to this work. This work was done at the Biomedical Imaging Group, École polytechnique fédérale de Lausanne, Switzerland (e-mail: [email protected]; shayan.aziznejad@epfl.ch; michael.unser@epfl.ch). It was funded by the Swiss National Science Foundation under Grant 200020 184646/1.
Abstract—We provide an algorithm to generate trajectories of sparse stochastic processes that are solutions of linear ordinary differential equations driven by Lévy white noises. A recent paper showed that these processes are limits in law of generalized compound-Poisson processes. Based on this result, we derive an off-the-grid algorithm that generates arbitrarily close approximations of the target process. Our method relies on a B-spline representation of generalized compound-Poisson processes. We illustrate numerically the validity of our approach.
Index Terms—Sparse stochastic processes, Lévy-driven CARMA processes, B-splines, compound-Poisson processes.
I. INTRODUCTION
Motivated by tractability and results such as the central-limit theorem, most of the early work in statistical signal processing has focused on Gaussian models [1]. In particular, the theory of Gaussian stationary processes provided justifications for the use of the discrete cosine transform [2] as an approximation of the Karhunen–Loève transform, and the Kalman filter [3] as an optimal estimator.

However, the analysis of real-world signals has revealed that the Gaussian framework may be insufficient to capture the breadth of the underlying behaviors [4], [5]. An important property that escapes the Gaussian framework is that of sparsity in some transform domains [6]. Sparsity being an essential component of modern signal processing [7]–[9], the authors of [10] proposed a wider stochastic framework that encompasses both Gaussian and sparsity-compatible models. Within this framework, a continuous-time signal is a realization of a stochastic process $s$ that can be whitened by some linear, shift-invariant operator $\mathrm{L}$. The key here is that the resulting white noise, or innovation, is not necessarily Gaussian. Put formally, signals are solutions of

$$\mathrm{L} s = w, \qquad (1)$$

where $w$ is a well-defined innovation process called a Lévy white noise [11]. The term Lévy here comes from the fact that $w$ is an object that can be interpreted as the derivative of a Lévy process in the sense of distributions [12], [13]. Whenever $w$ is non-Gaussian, the realizations of $s$ can be shown to be sparse. Accordingly, they have been named sparse stochastic processes [10]. Specific instances of such processes have been used to model natural signals such as images [14], [15], RF echoes in ultrasound [16], and network traffic in communication systems [17]–[19].

The goal of this paper is to generate realizations of the stochastic process $s$ given its whitening operator $\mathrm{L}$ and a statistical characterization of its innovation process $w$. The computer generation of these signals can be of great interest to practitioners who wish to evaluate their reconstruction algorithms. We are thinking of works such as [20]–[23], where optimal estimators for interpolating and denoising such processes have been derived.

A possible approach to generate realizations of $s$ would be to notice that, if $\mathrm{L}$ is a differential operator such as $\mathrm{D} = \frac{\mathrm{d}}{\mathrm{d}t}$ or a polynomial in $\mathrm{D}$, then (1) defines a stochastic differential equation (SDE) [24]. This becomes more apparent when notating $w$ with the alternative notation $\mathrm{d}Z_t$, where $(Z_t)_{t \in \mathbb{R}_+}$ is a Lévy process (Chapter 7.4 in [10]). For example, $(\mathrm{D} - \alpha\mathrm{I})s = w$ can be rewritten as $\mathrm{d}S_t = \alpha S_t\,\mathrm{d}t + \mathrm{d}Z_t$. A suitable SDE solver, such as the one studied in [25], can then be used to generate an approximation of the signal. In particular, a common method is to solve the linear system of stochastic difference equations that is obtained by considering the discrete counterpart of the operator $\mathrm{L}$ (e.g., using finite differences instead of the derivative) and by replacing the innovation process $w$ with a discrete white noise (see, for example, [26]).
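For concreteness, the grid-based baseline just described can be sketched in a few lines of Python. This is a minimal illustration (not the method proposed in this paper) for the first-order example $(\mathrm{D} - \alpha\mathrm{I})s = w$ with a Gaussian innovation; the parameter values are arbitrary choices for the example.

```python
import numpy as np

# Grid-based baseline: replace D with a forward finite difference and the
# innovation w with a discrete white noise. For a Gaussian Lévy process,
# the increments over a bin of length h have variance proportional to h.
alpha, T, h = -1.0, 10.0, 1e-3
rng = np.random.default_rng(0)
n_steps = int(np.ceil(T / h))

s = np.zeros(n_steps)
dZ = rng.normal(scale=np.sqrt(h), size=n_steps)   # discrete white noise
for i in range(1, n_steps):
    # (s[i] - s[i-1]) / h = alpha * s[i-1] + dZ[i] / h
    s[i] = s[i - 1] + h * alpha * s[i - 1] + dZ[i]
```

Such a scheme is tied to the grid from the start, which is precisely the limitation discussed next.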
It turns out that generic SDE solvers do not exploit the linearity of $\mathrm{L}$. Here, the analytic treatment of (1) can be pushed further to obtain an explicit solution. Brockwell shows in [27] that $s$ corresponds to the integral of a deterministic function with respect to a Lévy process. The integral can then be approximated by substituting it with a Riemann sum defined on a partition of the integration interval [28, Theorem 21].

These approaches, although valid, have drawbacks when it comes to the generation of synthetic signals for the evaluation of algorithms. First, they directly depend on the existence of a grid on which the approximation of the continuous process is sampled. This can lead to complications in the context of multiresolution algorithms that manipulate grid-free descriptions of signals. Second, the generated approximations are not solutions of an SDE in the form of (1). In other words, the approximations are not mathematical objects of the same nature as $s$.

In what follows, we propose a method that addresses both issues. It is based on a theoretical result by Fageot et al. [26] that states that any solution $s$ of (1) is the limit in law of a sequence of simpler processes $s_n$. In other words, we have that

$$s_n \overset{\mathcal{L}}{\longrightarrow} s, \quad \text{as } n \longrightarrow \infty.$$

These simpler processes, called generalized Poisson processes [10], have the advantage of having a grid-free numerical representation despite having a continuously defined domain. They fall within the category of (random) signals with a finite rate of innovation [29], [30]. They also have the desirable property of being whitened by the same operator $\mathrm{L}$ as the approximated signal. This implies that they all have the same correlation structure as the target signal (see Proposition 1).

Our method takes a sufficiently large value for $n$ and generates a realization of the process $s_n$ on a chosen interval. To do so, we consider an intermediary process called the generalized increment process. Interestingly, this process can be represented as a weighted sum of shifted B-splines and can be sampled very efficiently [31], [32]. The desired stochastic process $s_n$ is then obtained from the latter by recursive filtering.

The outline of the paper is as follows: In Section II, we provide the necessary mathematical background. In Section III, we give a description of our algorithm: we begin by discussing the simulation of the innovation process in Subsection III-A. We then define the generalized increment process in Subsection III-B, and we show how to generate its trajectories in Subsection III-C. Using this, we provide a recipe for generating sparse stochastic processes in Subsection III-D. In Subsection III-E, we show that our generation method perfectly reproduces the correlation structure of the target stochastic process. Finally, we conduct numerical investigations to show the validity of our method in Section IV.

II. MATHEMATICAL FOUNDATIONS
In this section, we give a brief overview of the mathematical concepts that underlie our approach. For a more detailed exposition, the reader is referred to [13], [26], [33], and references therein.

The Schwartz space $\mathcal{S}(\mathbb{R})$ is the space of smooth and rapidly decaying test functions. Its continuous dual, denoted by $\mathcal{S}'(\mathbb{R})$, is the space of tempered distributions. It is the space of all continuous linear functionals over $\mathcal{S}(\mathbb{R})$.

We denote by $\mathrm{L}$ an operator that is a continuous, linear, shift-invariant mapping from $\mathcal{S}'(\mathbb{R})$ to $\mathcal{S}'(\mathbb{R})$. The operator $\mathrm{L}$ is said to be shift-invariant if, for any test function $\varphi$ and any $t_0 \in \mathbb{R}$, we have that

$$\mathrm{L}\{\varphi\}(t - t_0) = \mathrm{L}\{\varphi(\cdot - t_0)\}(t), \quad t \in \mathbb{R},$$

where $\varphi(\cdot - t_0) : t \mapsto \varphi(t - t_0)$ is the version of $\varphi$ shifted by $t_0$. We restrict ourselves to rational operators in $\mathrm{D} = \frac{\mathrm{d}}{\mathrm{d}t}$, written
$\mathrm{L} = P(\mathrm{D})Q(\mathrm{D})^{-1}$, where $P$ and $Q$ are polynomials such that $\deg(P) > \deg(Q)$. The latter assumption is crucial to have the minimum required regularity (pointwise definition) for the solution $s$ of (1). The case $\mathrm{L} = \mathrm{D}$ is a typical choice that appears, for example, in the modeling of Brownian motion.

Rational operators are defined through their frequency response $\widehat{L}(\omega) = \frac{P(\mathrm{j}\omega)}{Q(\mathrm{j}\omega)}$. They provide a succinct representation of the equation $P(\mathrm{D})s = Q(\mathrm{D})w$ that we can simply rewrite as $\mathrm{L}s = w$.

We are interested in generalized stochastic processes defined over $\mathcal{S}'(\mathbb{R})$. A generalized stochastic process $w$ can be viewed as a random element of $\mathcal{S}'(\mathbb{R})$ in the sense that, for any $\varphi \in \mathcal{S}(\mathbb{R})$, the linear functional $\varphi \mapsto \langle\varphi, w\rangle \in \mathbb{R}$ is a well-defined random variable over $\mathbb{R}$ (see Appendix A for a formal definition).

A. Lévy White Noises
Lévy white noises constitute an important class of generalized stochastic processes, whose specification is essential to our framework. The three important operational properties of Lévy white noises for our purpose are:

1) Stationarity: For any $\varphi \in \mathcal{S}(\mathbb{R})$ and $\tau \in \mathbb{R}$, the random variables $\langle\varphi, w\rangle$ and $\langle\varphi(\cdot - \tau), w\rangle$ are identically distributed.

2) Independence: For any $\varphi_1, \varphi_2 \in \mathcal{S}(\mathbb{R})$ with disjoint supports, the random variables $\langle\varphi_1, w\rangle$ and $\langle\varphi_2, w\rangle$ are independent.

3) Characterization of the probability law: For any Lévy white noise $w$ in $\mathcal{S}'(\mathbb{R})$ and for any test function $\varphi \in \mathcal{S}(\mathbb{R})$, the characteristic function of the random variable $X_\varphi = \langle\varphi, w\rangle$ can be specified as

$$\widehat{P}_{X_\varphi}(\xi) = \mathbb{E}[\mathrm{e}^{\mathrm{j}\xi\langle\varphi, w\rangle}] = \exp\left(\int_{\mathbb{R}} f(\xi\varphi(r))\,\mathrm{d}r\right), \qquad (2)$$

where the function $f : \mathbb{R} \to \mathbb{C}$ is called the Lévy exponent of $w$.

Formally, this Lévy exponent can be obtained as $f(\xi) = \log\big(\widehat{P}_{X_{\mathrm{rect}}}(\xi)\big)$, where $X_{\mathrm{rect}} = \langle\mathrm{rect}_{[0,1]}, w\rangle$ is the observation of $w$ through the rectangular window¹

$$\mathrm{rect}_{[0,1]}(x) = \begin{cases} 1, & 0 < x \leq 1 \\ 0, & \text{otherwise.} \end{cases}$$

The distribution of $X_{\mathrm{rect}}$ gives us the Lévy exponent $f$ that defines (2), so that we can determine all the statistics of $w$ from the knowledge of $X_{\mathrm{rect}}$.

In particular, the following proposition from [10] connects the second-order statistics of $w$ to those of $X_{\mathrm{rect}}$.

Proposition 1 ([10], Theorem 4.15). Let $w$ be a Lévy white noise such that $X_{\mathrm{rect}} = \langle\mathrm{rect}_{[0,1]}, w\rangle$ has zero mean and a finite variance $\sigma_w^2 = \mathbb{E}[X_{\mathrm{rect}}^2]$. Then, for all $\varphi_1, \varphi_2 \in \mathcal{S}(\mathbb{R})$,

$$\mathbb{E}[\langle\varphi_1, w\rangle\langle\varphi_2, w\rangle] = \sigma_w^2\,\langle\varphi_1, \varphi_2\rangle.$$

It turns out that $X_{\mathrm{rect}}$ is an infinitely divisible random variable in the sense of Definition 1 [35].

¹Although $\mathrm{rect}_{[0,1]}$ is not in $\mathcal{S}(\mathbb{R})$, the random variable $X_{\mathrm{rect}}$ can still be defined. For more details, see [34].
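As a quick illustration of (2) and Proposition 1 (a standard computation, added here for concreteness), consider the zero-mean Gaussian Lévy exponent $f(\xi) = -\sigma^2\xi^2/2$. Then (2) gives

$$\widehat{P}_{X_\varphi}(\xi) = \exp\Big(-\frac{\sigma^2\xi^2}{2}\int_{\mathbb{R}}\varphi^2(r)\,\mathrm{d}r\Big),$$

so that $X_\varphi \sim \mathcal{N}(0, \sigma^2\|\varphi\|_{L_2}^2)$, in agreement with Proposition 1 with $\sigma_w = \sigma$.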
Definition 1. A real-valued random variable $X$ is said to be infinitely divisible if, for any natural number $M \in \mathbb{N}$, there exist $M$ independent and identically distributed random variables $X_1, \ldots, X_M$ such that $X = X_1 + \cdots + X_M$ in distribution.
TABLE I: Lévy exponents of some infinitely divisible distributions.

  Distribution                                  Lévy exponent $f(\xi)$
  Gaussian $(\mu, \sigma^2)$                    $\mathrm{j}\mu\xi - \sigma^2\xi^2/2$
  Symmetric $\alpha$-stable $(\alpha, c)$, $\alpha \in (0, 2)$    $-|c\xi|^{\alpha}$
  Gamma $(\alpha, \beta)$                       $-\alpha\log(1 - \mathrm{j}\xi/\beta)$
  Laplace $(\mu, b)$                            $\mathrm{j}\mu\xi - \log(1 + b^2\xi^2)$

To check the infinite divisibility of $X_{\mathrm{rect}}$, one can note that, for any $M \in \mathbb{N}$, we have that

$$X_{\mathrm{rect}} = \langle\mathrm{rect}_{[0,1]}, w\rangle = \Big\langle \sum_{m=0}^{M-1} \mathrm{rect}_{[\frac{m}{M}, \frac{m+1}{M}]}, w \Big\rangle = \sum_{m=0}^{M-1} \big\langle \mathrm{rect}_{[\frac{m}{M}, \frac{m+1}{M}]}, w \big\rangle. \qquad (3)$$

The terms in the sum (3) are independent and identically distributed random variables as a consequence of the independence and stationarity properties of white noises, which certifies that $\langle\mathrm{rect}_{[0,1]}, w\rangle$ is infinitely divisible.

The converse is also true: for any regular² infinitely divisible random variable $X$ with Lévy exponent $f(\xi) = \log(\mathbb{E}[\mathrm{e}^{\mathrm{j}\xi X}])$, there exists a well-defined Lévy white noise $w$ whose statistics are determined by (2) [35]–[37]. This shows that there is a one-to-one correspondence between infinitely divisible distributions and Lévy white noises through $\langle\mathrm{rect}_{[0,1]}, w\rangle$.

The Gaussian, gamma, and $\alpha$-stable distributions are classical examples of infinitely divisible distributions [12]. We can plug their Lévy exponents into (2) to define the corresponding Lévy white noises. We repeat in Table I some infinitely divisible distributions of interest, along with their Lévy exponents [35].

A case of special interest is when $f$ is the Lévy exponent of a compound-Poisson distribution. A compound-Poisson random variable $X$, with rate $\lambda$ and amplitude law $\nu$, is defined as

$$X = \sum_{k=1}^{K} A_k,$$

where the number $K$ is a Poisson random variable with parameter $\lambda$ and $(A_k)_{k=1}^{K}$ is an i.i.d. sequence drawn according to $\nu$. We refer to the corresponding Lévy white noise $w$ as a compound-Poisson innovation. It is known to be equal in law to

$$w = \sum_{k \in \mathbb{Z}} A_k\,\delta(\cdot - \tau_k), \qquad (4)$$

where $(\tau_k)_{k \in \mathbb{Z}}$ are the locations of impulses with rate $\lambda$ [13]. The law of these impulses is as follows: for any interval $[a, b]$, the number of impulses in $[a, b]$ is a Poisson random variable with parameter $\lambda(b - a)$.

On any finite interval, compound-Poisson innovations have a finite representation. They can be stored on a computer with the quantization of real numbers as the sole source of information loss. They are therefore well adapted to simulation purposes.

²The random variable $X$ is said to be regular if $\mathbb{E}[|X|^{\epsilon}] < +\infty$ for some $\epsilon > 0$.
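Sampling a compound-Poisson variable is equally direct. The sketch below is a minimal illustration; the rate and the Gaussian amplitude law $\nu$ are arbitrary choices for the example.

```python
import numpy as np

def compound_poisson(lam, amplitude_sampler, rng):
    """Draw X = A_1 + ... + A_K with K ~ Poisson(lam), A_k i.i.d. ~ nu."""
    K = rng.poisson(lam)
    return amplitude_sampler(rng, K).sum()   # the empty sum is 0.0

rng = np.random.default_rng(0)
# Example: rate lambda = 5 and a standard Gaussian amplitude law nu.
x = compound_poisson(5.0, lambda rng, K: rng.normal(size=K), rng)
```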
B. Generalized Lévy Processes

The sparse-stochastic-process framework of Unser et al. [10] is a comprehensive theory of generalized Lévy processes. These are stochastic processes that can be whitened by some admissible linear, shift-invariant operator. More precisely, $s$ is a generalized Lévy process if there exists an operator $\mathrm{L}$ such that $w = \mathrm{L}s$ is a Lévy white noise. Equivalently, one may view generalized Lévy processes as the solutions of the stochastic differential equation

$$\mathrm{L}s = w. \qquad (5)$$

It has been shown that, under mild technical assumptions on $\mathrm{L}$ and $w$, a solution $s$ of (5) exists and constitutes a properly defined generalized stochastic process over $\mathcal{S}'(\mathbb{R})$ [36].

When $\mathrm{L}$ is an operator with a trivial null space, such as $\mathrm{L} = (\mathrm{D} - \alpha\mathrm{I})$ with $\Re(\alpha) \neq 0$, we can write $s = \mathrm{L}^{-1}w$, where $\mathrm{L}^{-1}$ is the inverse of $\mathrm{L}$. However, when the null space is nontrivial, for instance when $\mathrm{L}$ corresponds to an unstable ordinary differential equation, the specification of boundary conditions becomes necessary to uniquely identify the solution. The boundary conditions take the form

$$\phi_\ell(s) = c_\ell, \quad \ell = 1, \ldots, N_0, \qquad (6)$$

where the $\phi_\ell : s \mapsto \phi_\ell(s) \in \mathbb{R}$ are appropriate linear functionals, $c_\ell \in \mathbb{R}$, and $N_0$ is the dimension of the null space of $\mathrm{L}$. For instance, one can impose that the process $s$ takes fixed values at reference locations $t_1 < \cdots < t_{N_0}$; that is, $\phi_\ell(s) = s(t_\ell) = c_\ell$ for $\ell = 1, \ldots, N_0$. Such boundary conditions appear in the classical definition of Lévy processes (including Brownian motion), where we have that $\phi_1(s) = s(0) = 0$ (Chapter 7 of [10]). We formally write $s = \mathrm{L}_\phi^{-1}w$, where $\mathrm{L}_\phi^{-1}$ is the right inverse of $\mathrm{L}$ that incorporates the boundary conditions (6) (Chapter 5.4 of [10]).

When $w$ is a compound-Poisson innovation of the form (4), the process $s = \mathrm{L}_\phi^{-1}w$ (respectively, $\mathrm{L}^{-1}w$ when the null space of $\mathrm{L}$ is trivial) is called a generalized Poisson process.

The fundamental property for this work is that any sparse stochastic process $s$ that is the solution of (5) can be specified as the limit in law of a sequence $\{s_n\}_{n \in \mathbb{N}}$ of generalized Poisson processes [26]. The corresponding driving processes $w_n = \mathrm{L}s_n$ are compound-Poisson innovations of the form

$$w_n = \sum_{k \in \mathbb{Z}} A_{k,n}\,\delta(\cdot - \tau_{k,n}) \qquad (7)$$

with rates $\lambda_n = n$ and with i.i.d. amplitudes $A_{k,n}$ that are infinitely divisible random variables with Lévy exponent $f_n = \frac{f}{n}$, where $f$ is the Lévy exponent of $w$.

C. Green's Functions
The Green's function of a differential operator $\mathrm{L}$ is a tempered distribution $\rho_{\mathrm{L}} \in \mathcal{S}'(\mathbb{R})$ that satisfies $\mathrm{L}\rho_{\mathrm{L}} = \delta$.
It can be viewed as the impulse response of the inverse of $\mathrm{L}$. The canonical Green's function is

$$\rho_{\mathrm{L}} = \mathcal{F}^{-1}\left\{\frac{1}{\widehat{L}(\omega)}\right\},$$

where $\widehat{L}$ is the frequency response of $\mathrm{L}$ (Chapter 5.2 of [10]). This definition can be made to stay valid even when $\widehat{L}$ vanishes at some points, as long as $\frac{1}{\widehat{L}(\omega)}$ is in $\mathcal{S}'(\mathbb{R})$. For details on how to compute Green's functions, the reader is referred to Appendix B. We have plotted the Green's functions of several operators in Figure 1 to highlight their variety and their dependence on $\mathrm{L}$.

III. METHOD
In this section, we introduce our method for generating (approximate) trajectories of a sparse stochastic process $s$ that is whitened by an operator $\mathrm{L}$ and whose innovation noise is $w$. When necessary, we assume general boundary conditions of the form $\phi_\ell(s) = 0$ for $\ell = 1, \ldots, N_0$, where $N_0$ is the dimension of the null space of $\mathrm{L}$.

As mentioned earlier, the process $s$ is the limit of generalized compound-Poisson processes $s_n$ driven by $w_n = \mathrm{L}s_n$, a compound-Poisson innovation of the form (7). The process $s_n$ can therefore be written as

$$s_n = \sum_{k \in \mathbb{Z}} A_{k,n}\,\rho_{\mathrm{L}}(\cdot - \tau_{k,n}) + p_{0,n},$$

where $\rho_{\mathrm{L}}$ is a Green's function of $\mathrm{L}$ and $p_{0,n}$ is an element of the null space of $\mathrm{L}$ determined by the boundary conditions (it vanishes when $\mathrm{L}$ is invertible). Indeed, we have that

$$\mathrm{L}\Big\{\sum_{k \in \mathbb{Z}} A_{k,n}\,\rho_{\mathrm{L}}(\cdot - \tau_{k,n}) + p_{0,n}\Big\} = \sum_{k \in \mathbb{Z}} A_{k,n}\,\mathrm{L}\{\rho_{\mathrm{L}}(\cdot - \tau_{k,n})\} = \sum_{k \in \mathbb{Z}} A_{k,n}\,\delta(\cdot - \tau_{k,n}) = w_n.$$

For large values of $n$, the process $s_n$ is assumed to be a good approximation of $s$. So, our goal is to generate samples of $s_n$ on any uniform grid over any interval $[0, T]$. More precisely, once an interval $[0, T]$ is specified and a regular grid with step size $h$ is provided, our aim is to obtain the vector $\mathbf{s}_n$ whose components are $[\mathbf{s}_n]_i = s_n(ih)$, for $i = 0, \ldots, (\lceil T/h \rceil - 1)$.

A. Simulating the Innovation Process
We begin by obtaining a realization of the driving innovation $w_n$. It consists of a sequence of impulse locations $(\tau_{k,n})$ and a corresponding sequence of amplitudes $(A_{k,n})$.

The sequence $(\tau_{k,n})$ is a Poisson point process. Its realization on the interval $[0, T]$ is simulated in two steps. First, a Poisson random variable $K$ with parameter $\lambda = nT$ is generated. Then, $K$ impulse locations $(\tau_{k,n})_{k \in \{1, \ldots, K\}}$ are sampled uniformly on $[0, T]$.

The next step is to simulate the $K$ corresponding amplitudes $(A_{k,n})_{k \in \{1, \ldots, K\}}$. The characteristic function of the amplitude variables $A_{k,n}$ is

$$\xi \mapsto \exp\left(\frac{f(\xi)}{n}\right).$$

We refer to it as the $n$th root of the law of $\langle\mathrm{rect}_{[0,1]}, w\rangle$. Our assumption in this paper is that there exists, for any $n \in \mathbb{N}$, a known method to generate infinitely divisible variables with Lévy exponent $\frac{f(\xi)}{n}$.³ For common parametric distributions such as $\alpha$-stable, Laplace, and gamma distributions, such sampling methods [39] are well known and implemented in scientific computing libraries.⁴ Simulating from their $n$th root is a simple matter of rescaling their parameters, as summarized in Table II. By applying the correct rescaling, we simulate $K$ independent amplitudes and thus obtain the sequence $(A_{k,n})_{k \in \{1, \ldots, K\}}$.

TABLE II: The $n$th root of infinitely divisible distributions.

  Distribution                                $n$th root
  Gaussian $(\mu, \sigma^2)$                  Gaussian $(\frac{\mu}{n}, \frac{\sigma}{\sqrt{n}})$
  $\alpha$-stable $(\alpha, \beta, \mu, c)$   if $\alpha \neq 1$: $(\alpha, \beta, \frac{\mu}{n}, \frac{c}{n^{1/\alpha}})$; if $\alpha = 1$: $(\alpha, \beta, \frac{\mu}{n} - \frac{2}{\pi}\frac{c\beta\log(n)}{n}, \frac{c}{n})$
  Gamma $(\alpha, \beta)$                     Gamma $(\frac{\alpha}{n}, \beta)$
  Compound-Poisson of intensity $\lambda$     Compound-Poisson of intensity $\frac{\lambda}{n}$
  Laplace $(\mu, b)$                          $X_n = \frac{\mu}{n} + b\,(G_1^{(n)} - G_2^{(n)})$ with $G_1^{(n)}, G_2^{(n)} \sim \mathrm{Gamma}(\frac{1}{n}, 1)$

³Workarounds exist for when a sampling method for the $n$th root $\frac{f(\xi)}{n}$ is unavailable. For instance, one can opt for an approximate sampling scheme such as in [38].
⁴E. Jones, et al., "SciPy: Open source scientific tools for Python," 2001.
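These two steps translate directly into code. The sketch below draws a realization of $w_n$ on $[0, T]$ when the target noise is Gaussian$(0, \sigma^2)$, whose $n$th root is Gaussian$(0, \sigma^2/n)$ by Table II; the function name and its interface are ours.

```python
import numpy as np

def simulate_innovation(n, T, sigma, rng):
    """Impulse locations and amplitudes of the compound-Poisson innovation
    w_n that approximates a Gaussian(0, sigma^2) white noise."""
    K = rng.poisson(n * T)               # number of impulses, parameter nT
    tau = rng.uniform(0.0, T, size=K)    # uniform impulse locations on [0, T]
    # n-th root rescaling (Table II): Gaussian(0, s^2) -> Gaussian(0, s^2/n).
    A = rng.normal(scale=sigma / np.sqrt(n), size=K)
    return tau, A

tau, A = simulate_innovation(n=1000, T=10.0, sigma=1.0,
                             rng=np.random.default_rng(0))
```

For another distribution of Table II, only the line that draws the amplitudes changes.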
B. Generalized Increment Process

With the impulse locations $(\tau_{k,n})_{k \in \{1, \ldots, K\}}$ and amplitudes $(A_{k,n})_{k \in \{1, \ldots, K\}}$ in hand, we can compute samples of

$$s_n(\cdot) = \sum_{k=1}^{K} A_{k,n}\,\rho_{\mathrm{L}}(\cdot - \tau_{k,n}) + p_{0,n} \qquad (8)$$

on a grid. A direct approach to generate $\mathbf{s}_n$ is to use the expansion (8) and represent the process as a sum of shifted Green's functions. However, in this case, the determination of $s_n(t)$ at any point $t \in [0, T]$ may require the nontrivial computation of each and every term in (8). This stems from the fact that Green's functions are, in general, infinitely supported. There are therefore potential drawbacks to expansions in the basis of shifted Green's functions like (8). To overcome these issues, we propose instead an alternative method based on B-splines.

Recall that $\mathrm{L}$ is a rational operator of the form $P(\mathrm{D})Q(\mathrm{D})^{-1}$, where we take $\{\alpha_1, \ldots, \alpha_{\deg(P)}\}$ to be the roots of $P$, with possible repetitions. Its discrete counterpart $\mathrm{L}_d^h$ is defined as

$$\mathrm{L}_d^h\{f\} = \sum_{m=0}^{\deg(P)} r[m]\,f(\cdot - mh),$$

where the sequence $r$ is determined through its Fourier transform

$$R(\mathrm{e}^{\mathrm{j}\omega}) = \sum_{m=0}^{\deg(P)} r[m]\,\mathrm{e}^{-\mathrm{j}\omega m} = \prod_{m=1}^{\deg(P)} (1 - \mathrm{e}^{\alpha_m h}\mathrm{e}^{-\mathrm{j}\omega h}).$$

It is a finite-impulse-response (FIR) filter. Its null space contains the null space of $\mathrm{L}$ [31]. The function $\beta_{\mathrm{L}}^h := \mathrm{L}_d^h\{\rho_{\mathrm{L}}\}$ is called the B-spline corresponding to $\mathrm{L}$ [40]. The B-spline has the fundamental property of being the shortest possible function within the space of cardinal L-splines (its support is included in $[0, \deg(P) \times h]$) [41], [42]. This will turn out to be crucial for the numerical efficiency of our method. Moreover, B-splines reproduce both the Green's function and the elements of the null space of their corresponding operator $\mathrm{L}$ [10, Section 6.4]. For instance, for $\mathrm{L} = (\mathrm{D} - \alpha\mathrm{I})$ with $\Re(\alpha) \leq 0$, we have $r[0] = 1$, $r[1] = -\mathrm{e}^{\alpha h}$, $\rho_{\mathrm{L}}(t) = \mathrm{e}^{\alpha t}\mathbb{1}_+(t)$, and $\beta_{\mathrm{L}}^h(t) = \mathrm{e}^{\alpha t}$ for $t \in [0, h)$ and $0$ elsewhere. Examples of relevant generalized B-splines are shown in Figure 1 (right plots). Note how they contrast with the corresponding infinitely supported Green's functions (left plots).

The application of $\mathrm{L}_d^h$ to $s_n$ yields

$$u_n(t) = \mathrm{L}_d^h\{s_n\}(t) = \sum_{m=0}^{\deg(P)} r[m]\,s_n(t - mh). \qquad (9)$$

The process $u_n$ in (9) is called the generalized increment process. Interestingly, it can be written as a sum of compactly supported terms, as in

$$u_n(t) = \mathrm{L}_d^h\{s_n\}(t) = \sum_{k=1}^{K} A_{k,n}\,\mathrm{L}_d^h\{\rho_{\mathrm{L}}(\cdot - \tau_{k,n})\}(t) + \mathrm{L}_d^h\{p_{0,n}\}(t) = \sum_{k=1}^{K} A_{k,n}\,\beta_{\mathrm{L}}^h(t - \tau_{k,n}) + 0.$$

The process $u_n$, along with the boundary conditions, is our alternate representation of $s_n$. Now, let $\mathbf{u}_n$ be the vector whose components are $[\mathbf{u}_n]_i = u_n(ih)$, for $i = 0, \ldots, (\lceil T/h \rceil - 1)$. This vector can be computed more efficiently than $\mathbf{s}_n$ since the process $u_n$ admits a representation with compactly supported terms. Moreover, $\mathbf{u}_n$ is linearly related to the vector $\mathbf{s}_n$ via a discrete system of difference equations. Indeed, we have that

$$[\mathbf{u}_n]_i = \sum_{m=0}^{\deg(P)} r[m]\,[\mathbf{s}_n]_{i-m}, \qquad (10)$$

for $\deg(P) \leq i \leq (\lceil T/h \rceil - 1)$. For $0 \leq i < \deg(P)$, we have that

$$[\mathbf{u}_n]_i = \sum_{m=0}^{\deg(P)} r[m]\,s_n((i - m)h),$$

where the values $s_n(-mh)$ for $m = 0, \ldots, (\deg(P) - 1)$ provide the boundary values. These relations are established by writing (9) with $t = ih$.
The boundary values are determined by the null-space term $p_{0,n}$, which is itself determined by the boundary conditions.

Thus, once we have evaluated $\mathbf{u}_n$, we can obtain $\mathbf{s}_n$ by solving (10), which is accomplished by applying a recursive reverse filter to $\mathbf{u}_n$. This is performed by rewriting (10) as

$$[\mathbf{s}_n]_i = \frac{1}{r[0]}\Big([\mathbf{u}_n]_i - \sum_{m=1}^{\deg(P)} r[m]\,[\mathbf{s}_n]_{i-m}\Big). \qquad (11)$$

By substitution of the boundary values when necessary (i.e., taking $s_n((i - m)h)$ instead of $[\mathbf{s}_n]_{i-m}$ when $(i - m) \leq 0$), (11) allows one to recursively compute the components of $\mathbf{s}_n$.
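In code, (11) is a plain all-pole recursion. The sketch below assumes zero boundary values $s_n(-mh) = 0$, as in the recipe of Subsection III-D; `r` holds the FIR coefficients $r[0], \ldots, r[\deg(P)]$.

```python
import numpy as np

def reverse_filter(u, r):
    """Solve (10) for s_n through the recursion (11),
    assuming zero boundary values s_n(-mh) = 0."""
    s = np.zeros_like(u, dtype=float)
    for i in range(len(u)):
        acc = u[i]
        for m in range(1, len(r)):
            if i - m >= 0:              # terms with (i - m) < 0 vanish here
                acc -= r[m] * s[i - m]
        s[i] = acc / r[0]
    return s
```

Equivalently, one can call `scipy.signal.lfilter([1.0], r, u)`, which implements the same recursion.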
C. Computing the Generalized Increment Process

We now describe an efficient procedure to compute the generalized increment process. The components of $\mathbf{u}_n$ are given by

$$[\mathbf{u}_n]_i = \sum_{k=1}^{K} A_{k,n}\,\beta_{\mathrm{L}}^h(ih - \tau_{k,n}).$$

The naive approach here would be to iterate through each grid point $i$ independently and compute $[\mathbf{u}_n]_i$. Doing so would require one to read the entire sequence of impulse locations $(\tau_k)$ for each $i$. This cannot be avoided since there is no information on the sequence $(\tau_k)$, aside from its inclusion in $[0, T]$. We simply would not know which B-spline terms are inactive, so we would have to iterate through them all. A more efficient approach is to iterate through the list of impulses instead of the grid points.

Fig. 2: For a single B-spline term, only the grid points that sit within the support of the B-spline are incremented (black stems).

The idea is as follows: First, initialize the vector $\mathbf{u}_n$ to zeros. Then, read the list of impulse locations one by one. For each impulse at $\tau_k$, find the grid points that lie within the support of the B-spline at $\tau_k$. Then, increment the value of $\mathbf{u}_n$ on those grid points by the contribution of the considered B-spline (see Figure 2). In one pass over the list of impulses, this method computes the values $[\mathbf{u}_n]_i = u_n(ih)$.

This intermediate computation of the generalized increment process provides a considerable gain in terms of efficiency. Instead of having a number of operations that scales with $\lceil T/h \rceil \times K$ for the Green's-function representation, we have one that scales with $\deg(P) \times (\lceil T/h \rceil + K)$.
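A sketch of this impulse-centric pass is given below; `beta` is assumed to evaluate $\beta_{\mathrm{L}}^h$ pointwise, and the function name is ours.

```python
import numpy as np

def generalized_increments(tau, A, beta, degP, T, h):
    """One pass over the impulses: the impulse at tau_k only touches the
    grid points that fall in the support [0, degP * h] of the B-spline."""
    N = int(np.ceil(T / h))
    u = np.zeros(N)
    for tau_k, A_k in zip(tau, A):
        i_grid = int(np.floor(tau_k / h))          # closest grid point
        for i in range(i_grid, min(i_grid + degP + 1, N)):
            u[i] += A_k * beta(i * h - tau_k)
    return u
```

Each impulse costs $\mathcal{O}(\deg(P))$ operations, which yields the announced $\deg(P) \times (\lceil T/h \rceil + K)$ scaling once the reverse filtering is included.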
D. Recipe to Generate Trajectories

Here is a summary of the procedure that generates trajectories of $\mathrm{L}s_n = w_n$.

First, fix the infinitely divisible distribution⁵ that corresponds to $w$ and define the operator $\mathrm{L}$ by identifying the polynomials $P$ and $Q$.

Pick a sufficiently large value for $n$. Intuitively, $n$ should be large enough to ensure the occurrence of several jumps in each bin. In other words, we expect $n$ to be of the same order as $h^{-1}$. This has been validated by our numerical experiments as well, where we show that it provides a good approximation of the underlying statistics of the process (see Subsection IV-C and Figures 5, 6, and 7).

Pick a simulation interval $[0, T]$ and generate $w_n$ as described in Subsection III-A. Determine an explicit form for $\rho_{\mathrm{L}}$. At this point, the grid-free approximation $s_n$ (expressed as in (8)) is available and can be stored.

Fix a grid on $[0, T]$ by choosing a step size $h$. Then determine the vector $\mathbf{s}_n$ with components $[\mathbf{s}_n]_i = s_n(ih)$. Compute the FIR filter $\mathrm{L}_d^h$ and obtain $\beta_{\mathrm{L}}^h = \mathrm{L}_d^h\{\rho_{\mathrm{L}}\}$. Then, compute the generalized increment vector $\mathbf{u}_n$ as described in Subsection III-C.

To obtain $\mathbf{s}_n$, apply the reverse filter to $\mathbf{u}_n$ following (11). Take the values $s_n(-mh)$ for $m = 0, \ldots, (\deg(P) - 1)$ to be zero in most cases, except when $\mathrm{L}$ has a nontrivial null space, in which case they are derived from the boundary conditions. The pseudocode of our method is provided in Algorithm 1.

⁵The choice here is restricted to parametric families that we can rescale and simulate.
Algorithm 1: Procedure to obtain $\mathbf{s}_n$.

Input: Coefficients of $P$ and $Q$, approximation level $n$, interval size $T$, step size $h$
Output: Vector $\mathbf{s}_n$

  Compute $\rho_{\mathrm{L}}$ and the FIR filter $r[m]$
  Compute $\beta_{\mathrm{L}}^h = \mathrm{L}_d^h\{\rho_{\mathrm{L}}\}$
  Generate $[(\tau_1, A_1), \ldots, (\tau_K, A_K)]$
  Initialize $\mathbf{u}_n$ with zeros as an array of size $\lceil T/h \rceil$
  foreach $(\tau_k, A_k)$ do
    Find the closest grid point $i_{\mathrm{grid}} = \lfloor \tau_k/h \rfloor$
    foreach $i \in \{i_{\mathrm{grid}}, \ldots, i_{\mathrm{grid}} + \deg(P)\}$ do
      $[\mathbf{u}_n]_i \leftarrow [\mathbf{u}_n]_i + A_k \times \beta_{\mathrm{L}}^h(ih - \tau_k)$
  Recursively apply the reverse filter to $\mathbf{u}_n$ following (11)
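As a worked instance of Algorithm 1, the self-contained sketch below specializes the recipe to $\mathrm{L} = \mathrm{D}$ with a Gaussian innovation, for which $\deg(P) = 1$, $r = (1, -1)$, $\beta_{\mathrm{D}}^h$ is the indicator of an interval of length $h$, and the reverse filter (11) reduces to a cumulative sum. The parameter values are arbitrary.

```python
import numpy as np

def generate_gaussian_levy(n, T, h, rng):
    """Approximate Brownian motion, i.e., the solution of D s = w with
    s(0) = 0, by following Algorithm 1 for L = D."""
    # Innovation w_n: K ~ Poisson(nT) impulses, Gaussian(0, 1/n) amplitudes.
    K = rng.poisson(n * T)
    tau = rng.uniform(0.0, T, size=K)
    A = rng.normal(scale=1.0 / np.sqrt(n), size=K)
    # Generalized increments: beta_D^h(ih - tau_k) is 1 only for the first
    # grid point at or after tau_k, so each impulse touches a single bin.
    N = int(np.ceil(T / h))
    u = np.zeros(N)
    idx = np.ceil(tau / h).astype(int)
    keep = idx < N                     # drop impulses beyond the last sample
    np.add.at(u, idx[keep], A[keep])   # accumulate repeated indices correctly
    # Reverse filter (11) with r = (1, -1): s_i = u_i + s_{i-1}.
    return np.cumsum(u)

s = generate_gaussian_levy(n=1000, T=10.0, h=0.01,
                           rng=np.random.default_rng(0))
```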
E. Correlation Structure
In this section, we show a merit of our method by proving that the generated approximations preserve the correlation structure of the target process. First, note that, for any Lévy white noise $w$, we have that

$$w_n \overset{\mathcal{L}}{\longrightarrow} w,$$

where the sequence of compound-Poisson innovations $(w_n)_{n \in \mathbb{N}}$ is defined in (7). We refer to this approximating sequence in Proposition 2.

Proposition 2.
Let $w$ be a Lévy white noise such that $X_{\mathrm{rect}} = \langle\mathrm{rect}_{[0,1]}, w\rangle$ has zero mean and the finite variance $\sigma_w^2 = \mathbb{E}[X_{\mathrm{rect}}^2]$. Let $n \in \mathbb{N}$ and let $w_n$ be a compound-Poisson innovation that approximates $w$ as defined in (7). Denoting $X_{\mathrm{rect},n} = \langle\mathrm{rect}_{[0,1]}, w_n\rangle$, we have that $\mathbb{E}[X_{\mathrm{rect},n}] = \mathbb{E}[X_{\mathrm{rect}}] = 0$ and

$$\sigma_{w_n}^2 = \mathbb{E}[X_{\mathrm{rect},n}^2] = \mathbb{E}[X_{\mathrm{rect}}^2] = \sigma_w^2.$$

The proof can be found in Appendix C. Now, if $s_n = \mathrm{L}^{-1}w_n$ is a generalized Poisson process that approximates $s = \mathrm{L}^{-1}w$, then

$$\mathbb{E}[\langle\varphi_1, s_n\rangle\langle\varphi_2, s_n\rangle] = \mathbb{E}[\langle\varphi_1, \mathrm{L}^{-1}w_n\rangle\langle\varphi_2, \mathrm{L}^{-1}w_n\rangle] = \mathbb{E}[\langle\mathrm{L}^{-1*}\varphi_1, w_n\rangle\langle\mathrm{L}^{-1*}\varphi_2, w_n\rangle]$$
$$= \sigma_{w_n}^2\langle\mathrm{L}^{-1*}\varphi_1, \mathrm{L}^{-1*}\varphi_2\rangle = \sigma_w^2\langle\mathrm{L}^{-1*}\varphi_1, \mathrm{L}^{-1*}\varphi_2\rangle = \mathbb{E}[\langle\varphi_1, s\rangle\langle\varphi_2, s\rangle]. \qquad (12)$$

From (12), we conclude that, more than just being approximated, the correlation structure is preserved exactly in our method.

Fig. 3: Trajectories of Lévy processes ($\mathrm{L} = \mathrm{D}$) with different innovations. From top to bottom: Laplace(0, 1), gamma(1, 1), Gaussian(0, 1), and symmetric-$\alpha$-stable.
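Proposition 2 can also be checked empirically. The sketch below estimates the first two moments of $X_{\mathrm{rect},n}$ for a standard Gaussian target noise ($\sigma_w^2 = 1$); it uses the fact that, conditionally on $K$, a sum of $K$ Gaussian$(0, 1/n)$ amplitudes is Gaussian$(0, K/n)$. Parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_samples = 100, 100_000

# X_rect,n = sum of K ~ Poisson(n) i.i.d. amplitudes distributed as the
# n-th root of Gaussian(0, 1), i.e., Gaussian(0, 1/n).
K = rng.poisson(n, size=n_samples)
x = rng.normal(size=n_samples) * np.sqrt(K / n)   # Gaussian(0, K/n) given K

print(x.mean(), x.var())   # both close to (0, 1), as stated in Proposition 2
```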
IV. NUMERICAL EXPERIMENTS

In this section, we validate our approach by conducting several numerical experiments. Let us also mention that a Python library that implements our algorithm can be found online.⁶ Moreover, an accompanying web interface has also been designed and is available.⁷

⁶https://github.com/Biomedical-Imaging-Group/Generating-Sparse-Processes
⁷https://saturdaygenfo.pythonanywhere.com

A. Generating Lévy Processes
Among all the processes we can generate, those that are solutions of $\mathrm{D}s = w$ are called Lévy processes when the boundary condition is $s(0) = 0$. We showcase in Figure 3 different Lévy processes that correspond to several infinitely divisible distributions. For all four simulations, we took $n = 1{,}000$ and $h = 0.001$. As we demonstrate in Subsection IV-C, a reasonable choice for these parameters is to set $nh$ to be a small integer (here, $nh = 1$). The visual appearance of the trajectories matches our expectations: the trajectory driven by a Gaussian innovation has the appearance of Brownian motion; the gamma Lévy process is nondecreasing.

B. Choice of the Operator
Our framework allows for any rational operator of the form $P(\mathrm{D})Q(\mathrm{D})^{-1}$, so long as $\deg(P) > \deg(Q)$. In Figure 4, we generate trajectories of $s$ that are solutions of $\mathrm{L}s = w$, where $w$ is a symmetric-$\alpha$-stable innovation. Here, we took $n = 200$. We see that, for various choices of $\mathrm{L}$, the characteristics of the signal are markedly different, which exhibits the breadth of the modeling framework proposed in [10].

C. Convergence as $n$ Grows
In Figure 5, we illustrate how an increase in $n$ improves the approximation. In addition, we have depicted the convergence of moments in Figure 6. While the two figures emphasize the effect of $n$, they are insufficient to provide a quantitative way to choose $n$.

Fig. 4: Trajectories of the solution $s$ of $\mathrm{L}s = w$ for different operators $\mathrm{L}$. In all cases, we considered a symmetric-$\alpha$-stable white noise $w$.

Fig. 5: Approximations of Brownian motion (solution of $\mathrm{D}s = w$, with $w$ a Gaussian white Lévy noise) as $n$ increases.

Fig. 6: Convergence of $\mathbb{E}[|\langle\mathrm{rect}_{[0,h]}, s_n\rangle|^p]$ to $\mathbb{E}[|\langle\mathrm{rect}_{[0,h]}, s\rangle|^p]$ for fixed $p$ and $h$, and several symmetric-$\alpha$-stable Lévy white noises $w$. The expectations are estimated empirically for each $n$.

Here, we propose a measure that is based on the statistics of the generalized increment process. Since the process $u_n$ is maximally decoupled, we can estimate the distribution of $U_n = \langle\beta_{\mathrm{L}}^{h\vee}, w_n\rangle$ from the samples $\{[\mathbf{u}_n]_i\}_i$ of the generalized increment process on the grid and obtain the empirical cumulative distribution function (CDF) $\tilde{F}_n(\cdot)$ of $U_n$. We then compare this empirical function to the reference CDF $F(\cdot)$ of $U = \langle\beta_{\mathrm{L}}^{h\vee}, w\rangle$. For the comparison, we use the Kolmogorov–Smirnov (KS) divergence [43], defined as

$$\mathrm{KS}(\tilde{F}_n, F) = \max_{x \in \mathbb{R}} |\tilde{F}_n(x) - F(x)|.$$

We then select $n$ such that the KS divergence is smaller than a certain threshold. The choice of the threshold is conditioned by the desired numerical precision: the lower the threshold, the more faithful the trajectories, but the higher the computational cost of the algorithm.

Fig. 7: Kolmogorov–Smirnov (KS) divergence versus the average number of jumps per bin ($N_{\mathrm{jumps}} = nh$).

Intuitively, we expect that it is necessary to have several jumps in each bin in order to properly approximate the statistics of the process. The average number of jumps in each bin of length $h$ is $N_{\mathrm{jumps}} = nh$, so we expect $n$ to be of the order of $h^{-1}$.

In Figure 7, we have validated this intuition by plotting the KS divergence for different values of $N_{\mathrm{jumps}}$ in various settings. In all cases, as $N_{\mathrm{jumps}}$ increases, the KS divergence decreases to a baseline error value that is due to the finite-sample estimation of the underlying distribution.
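This selection rule is simple to reproduce for $\mathrm{L} = \mathrm{D}$ and a standard Gaussian noise, for which the reference variable $U = \langle\beta_{\mathrm{D}}^{h\vee}, w\rangle$ is Gaussian with variance $h$ by Proposition 1. The sketch below tracks the KS divergence as $n$ grows; parameter values are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
T, h = 100.0, 0.01
N = int(np.ceil(T / h))

for n in (10, 100, 1_000, 10_000):
    # Generalized increments for L = D: each bin of length h collects
    # the amplitudes of the impulses of w_n that it contains.
    K = rng.poisson(n * T)
    tau = rng.uniform(0.0, T, size=K)
    A = rng.normal(scale=1.0 / np.sqrt(n), size=K)
    bins = np.minimum((tau / h).astype(int), N - 1)
    u = np.bincount(bins, weights=A, minlength=N)
    # Empirical CDF of the samples versus the reference N(0, h) CDF.
    ks = stats.kstest(u, "norm", args=(0.0, np.sqrt(h))).statistic
    print(f"n = {n:6d}, N_jumps = nh = {n * h:8.2f}, KS = {ks:.3f}")
```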
D. Benefits of Grid-Free Approximations

Recall that a main motivation for our algorithm was to make it compatible with multigrid methods. In our approach, the approximation $s_n$ lives off the grid. It is only after the specification of the step size $h$ that $s_n$ is sampled on a grid. The generation of the random variables that determine $s_n$ and the sampling on a grid are completely decoupled. This means that the same approximation $s_n$ can be viewed through different grids, which we illustrate in Figure 8. The solution of $(\mathrm{D} + \mathrm{I})s = w$, where $w$ is a Gaussian white Lévy noise, is first approximated by $s_n$; the approximation is then viewed on different regular grids.

Fig. 8: Single grid-free approximation sampled on grids that differ by their step size.

Fig. 9: Average computation time for a trajectory of the solution of a first-order equation $\mathrm{L}s = w$, where $w$ is a Gaussian white noise.

E. Computational Efficiency
A crucial component of our approach is the computation of the generalized increment $\mathbf{u}_n$ in order to obtain the values of $s_n$ on a grid. This provides a gain in numerical efficiency that can be felt even on moderately sized simulations. As can be seen in Figure 9, using a Green's-function representation requires significantly more time than using an intermediate B-spline representation.

V. CONCLUSION
We have described a novel approach for generating sparse stochastic processes. Our method leverages the properties of B-splines to guarantee good numerical efficiency. A possible direction for future work is to provide theoretical guidance on how one should choose the parameter $n$ in terms of a prescribed tolerance on the approximation error.

ACKNOWLEDGMENT
The authors would like to thank Dr. Julien Fageot, Pakshal Bohra, and Thomas Debarre for enlightening discussions.
APPENDIX A
GENERALIZED STOCHASTIC PROCESSES
Generalized stochastic processes are random elements of $\mathcal{S}'(\mathbb{R})$ that can be fully specified by their characteristic functionals. These are infinite-dimensional generalizations of the characteristic functions of real random variables.

Definition 2.
The characteristic functional of the generalized stochastic process $s$ is the functional $\widehat{P}_s : \mathcal{S}(\mathbb{R}) \to \mathbb{C}$ such that

$$\widehat{P}_s(\varphi) = \mathbb{E}[\mathrm{e}^{\mathrm{j}\langle\varphi, s\rangle}],$$

for all $\varphi \in \mathcal{S}(\mathbb{R})$. It is a continuous, positive-definite functional with $\widehat{P}_s(0) = 1$.

Just as in finite dimensions, $\widehat{P}_s$ contains all the statistical information about $s$. In particular, for any test function $\varphi \in \mathcal{S}(\mathbb{R})$, the distribution of the real random variable $\langle\varphi, s\rangle$ is entirely determined by $\widehat{P}_s$, as its probability density function $p$ is given by

$$p(t) \propto \mathcal{F}^{-1}\big\{\mathbb{E}[\mathrm{e}^{\mathrm{j}\omega\langle\varphi, s\rangle}]\big\}(t) = \mathcal{F}^{-1}\big\{\widehat{P}_s(\omega\varphi)\big\}(t),$$

where $\mathcal{F}^{-1}$ is the inverse Fourier transform. The construction of such objects was initiated in [44]. Their use for modeling sparse signals was introduced in [10].
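As a standard illustration (added here for concreteness), a zero-mean Gaussian white noise of variance $\sigma^2$, whose Lévy exponent is $f(\xi) = -\sigma^2\xi^2/2$, has the characteristic functional

$$\widehat{P}_w(\varphi) = \exp\Big(-\frac{\sigma^2}{2}\|\varphi\|_{L_2}^2\Big),$$

from which $\langle\varphi, w\rangle \sim \mathcal{N}(0, \sigma^2\|\varphi\|_{L_2}^2)$ follows immediately.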
APPENDIX B
COMPUTING GREEN'S FUNCTIONS

Here, we describe a method to compute the Green's functions of rational operators. We begin with the intermediate computation of the Green's function of
$\mathrm{L} = (\mathrm{D} - \alpha\mathrm{I})^k$. We have that

$$\rho_{\alpha,k}(t) = \mathcal{F}^{-1}\left\{\frac{1}{(\mathrm{j}\omega - \alpha)^k}\right\}(t) = \begin{cases} \mathbb{1}_+(t)\,\frac{t^{k-1}}{(k-1)!}\,\mathrm{e}^{\alpha t}, & \Re(\alpha) \leq 0 \\ -\mathbb{1}_+(-t)\,\frac{t^{k-1}}{(k-1)!}\,\mathrm{e}^{\alpha t}, & \text{otherwise} \end{cases} \qquad (13)$$

is a Green's function of $\mathrm{L}$.

Now, recall that rational operators are of the form $\mathrm{L} = P(\mathrm{D})Q(\mathrm{D})^{-1}$, where $P$ and $Q$ are polynomials. Taking $\{\alpha_1, \ldots, \alpha_m\}$ to be the roots of $P$ with multiplicities $\{\gamma_1, \ldots, \gamma_m\}$, the inverse of the frequency response is given by

$$\frac{1}{\widehat{L}(\omega)} = \frac{Q(\mathrm{j}\omega)}{\prod_{i=1}^{m}(\mathrm{j}\omega - \alpha_i)^{\gamma_i}}.$$

This inverse is known to admit a partial-fraction decomposition of the form

$$\frac{1}{\widehat{L}(\omega)} = \sum_{i=1}^{m}\sum_{k=1}^{\gamma_i} \frac{c_{ik}}{(\mathrm{j}\omega - \alpha_i)^k}$$

for some constants $c_{ik} \in \mathbb{C}$. The corresponding Green's function is then given by

$$\rho_{\mathrm{L}}(t) = \mathcal{F}^{-1}\left\{\frac{1}{\widehat{L}(\omega)}\right\}(t) = \sum_{i=1}^{m}\sum_{k=1}^{\gamma_i} c_{ik}\,\mathcal{F}^{-1}\left\{\frac{1}{(\mathrm{j}\omega - \alpha_i)^k}\right\}(t) = \sum_{i=1}^{m}\sum_{k=1}^{\gamma_i} c_{ik}\,\rho_{\alpha_i,k}(t).$$

The Green's function of $\mathrm{L}$ is thus obtained by summing the Green's functions of the partial fractions, given in (13).
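In practice, the partial-fraction coefficients can be delegated to `scipy.signal.residue`, which returns the constants $c_{ik}$ together with the poles $\alpha_i$ (repeated poles are listed with increasing powers $k$). The sketch below assembles $\rho_{\mathrm{L}}$ from (13); the naming is ours, and polynomial coefficients are given in descending powers.

```python
import numpy as np
from math import factorial
from scipy import signal

def green_function(P_coeffs, Q_coeffs):
    """Return a callable rho_L for L = P(D) Q(D)^{-1}, built from the
    partial-fraction decomposition of Q(s)/P(s)."""
    c, poles, _ = signal.residue(Q_coeffs, P_coeffs)
    terms, k = [], 0
    for i, alpha in enumerate(poles):   # recover the power k of each pole
        k = k + 1 if i > 0 and np.isclose(alpha, poles[i - 1]) else 1
        terms.append((c[i], alpha, k))

    def rho(t):
        t = np.asarray(t, dtype=float)
        out = np.zeros(t.shape, dtype=complex)
        for c_ik, alpha, k in terms:
            base = t ** (k - 1) / factorial(k - 1) * np.exp(alpha * t)
            if alpha.real <= 0:         # causal branch of (13)
                out += c_ik * np.where(t > 0, base, 0.0)
            else:                       # anti-causal branch of (13)
                out -= c_ik * np.where(t < 0, base, 0.0)
        return out.real                 # imaginary parts cancel for real L
    return rho

# Example: L = (D + I)^2, i.e., P(s) = (s + 1)^2 and Q(s) = 1, for which
# rho_L(t) = t * exp(-t) * 1_{+}(t).
rho = green_function([1.0, 2.0, 1.0], [1.0])
```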
APPENDIX C
PROOF OF PROPOSITION 2

Since $w_n$ is a compound-Poisson innovation, $X_{\mathrm{rect},n}$ is a compound-Poisson random variable. It can be written as

$$X_{\mathrm{rect},n} = \sum_{i=1}^{N} A_i,$$

where $N$ is a Poisson random variable with rate $\lambda = n$ and the $A_i$ are independent and identically distributed infinitely divisible random variables, independent of $N$, with Lévy exponent $\frac{f}{n}$. In particular, the characteristic function of $A_1$ is $\mathrm{e}^{f/n}$, while that of $X_{\mathrm{rect}}$ is $\mathrm{e}^{f}$. Differentiating at $\xi = 0$ and using $\mathbb{E}[X_{\mathrm{rect}}] = 0$, we get that $\mathbb{E}[A_1] = \frac{1}{n}\mathbb{E}[X_{\mathrm{rect}}] = 0$ and $\mathbb{E}[A_1^2] = \frac{1}{n}\mathbb{E}[X_{\mathrm{rect}}^2] = \frac{\sigma_w^2}{n}$. The classical moment formulas for compound-Poisson sums then yield

$$\mathbb{E}[X_{\mathrm{rect},n}] = \lambda\,\mathbb{E}[A_1] = 0, \qquad \mathbb{E}[X_{\mathrm{rect},n}^2] = \lambda\,\mathbb{E}[A_1^2] = \sigma_w^2,$$

so that $X_{\mathrm{rect},n}$ and $X_{\mathrm{rect}}$ have the same first- and second-order moments.

REFERENCES
[1] R. Gray and L. Davisson, An Introduction to Statistical Signal Processing. Cambridge University Press, 2004.
[2] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Transactions on Computers, vol. 23, no. 1, pp. 90–93, Jan. 1974.
[3] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, Mar. 1960.
[4] D. Mumford and A. Desolneux, Pattern Theory: The Stochastic Analysis of Real-World Signals. A K Peters/CRC Press, Aug. 2010.
[5] A. Srivastava, A. B. Lee, E. P. Simoncelli, and S.-C. Zhu, "On advances in statistical modeling of natural images," Journal of Mathematical Imaging and Vision, vol. 18, no. 1, pp. 17–33, 2003.
[6] A. Amini, M. Unser, and F. Marvasti, "Compressibility of deterministic and random infinite sequences," IEEE Transactions on Signal Processing, vol. 59, no. 11, pp. 5193–5201, Nov. 2011.
[7] S. Mallat, A Wavelet Tour of Signal Processing. Elsevier, Sep. 1999.
[8] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[9] J.-L. Starck, F. Murtagh, and J. M. Fadili, Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity. Cambridge University Press, May 2010.
[10] M. Unser and P. D. Tafti, An Introduction to Sparse Stochastic Processes. Cambridge University Press, Aug. 2014.
[11] T. Kailath, "The innovations approach to detection and estimation theory," Proceedings of the IEEE, vol. 58, no. 5, pp. 680–695, 1970.
[12] K.-i. Sato, Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Nov. 1999.
[13] M. Unser, P. D. Tafti, and Q. Sun, "A unified formulation of Gaussian versus sparse stochastic processes–Part I: Continuous-domain theory," IEEE Transactions on Information Theory, vol. 60, no. 3, pp. 1945–1962, Mar. 2014.
[14] D. Mumford and B. Gidas, "Stochastic models for generic images," Quarterly of Applied Mathematics, vol. 59, no. 1, pp. 85–111, 2001.
[15] E. Bostan, U. S. Kamilov, M. Nilchian, and M. Unser, "Sparse stochastic processes and discretization of linear inverse problems," IEEE Transactions on Image Processing, vol. 22, no. 7, pp. 2699–2710, Jul. 2013.
[16] M. A. Kutay, A. P. Petropulu, and C. W. Piccoli, "On modeling biomedical ultrasound RF echoes using a power-law shot-noise model," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 48, no. 4, pp. 953–968, 2001.
[17] S. M. Kogon and D. G. Manolakis, "Signal modeling with self-similar α-stable processes: The fractional Lévy stable motion model," IEEE Transactions on Signal Processing, vol. 44, no. 4, pp. 1006–1010, 1996.
[18] J. R. Gallardo, D. Makrakis, and L. Orozco-Barbosa, "Use of α-stable self-similar stochastic processes for modeling traffic in broadband networks," Performance Evaluation, vol. 40, no. 1–3, pp. 71–98, 2000.
[19] N. Laskin, I. Lambadaris, F. C. Harmantzis, and M. Devetsikiotis, "Fractional Lévy motion and its application to network traffic modeling," Computer Networks, vol. 40, no. 3, pp. 363–375, 2002.
[20] A. Amini, P. Thévenaz, J. Ward, and M. Unser, "On the linearity of Bayesian interpolators for non-Gaussian continuous-time AR(1) processes," IEEE Transactions on Information Theory, vol. 59, no. 8, pp. 5063–5074, Aug. 2013.
[21] A. Amini, U. S. Kamilov, E. Bostan, and M. Unser, "Bayesian estimation for continuous-time sparse stochastic processes," IEEE Transactions on Signal Processing, vol. 61, no. 4, pp. 907–920, Feb. 2013.
[22] U. S. Kamilov, P. Pad, A. Amini, and M. Unser, "MMSE estimation of sparse Lévy processes," IEEE Transactions on Signal Processing, vol. 61, no. 1, pp. 137–147, Jan. 2013.
[23] S. J. Godsill and G. Yang, "Bayesian inference for continuous-time ARMA models driven by non-Gaussian Lévy processes," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 5. IEEE, 2006, pp. V–V.
[24] B. Øksendal, Stochastic Differential Equations: An Introduction with Applications, 2nd ed. Berlin: Springer, 1989.
[25] S. Rubenthaler, "Numerical simulation of the solution of a stochastic differential equation driven by a Lévy process," Stochastic Processes and their Applications, vol. 103, no. 2, pp. 311–349, Feb. 2003.
[26] J. Fageot, V. Uhlmann, and M. Unser, "Gaussian and sparse processes are limits of generalized Poisson processes," Applied and Computational Harmonic Analysis, 2018.
[27] P. J. Brockwell, "Lévy-driven CARMA processes," Annals of the Institute of Statistical Mathematics, vol. 53, no. 1, pp. 113–124, Mar. 2001.
[28] P. E. Protter, Stochastic Integration and Differential Equations, 2nd ed., ser. Stochastic Modelling and Applied Probability. Berlin, Heidelberg: Springer-Verlag, 2005.
[29] M. Vetterli, P. Marziliano, and T. Blu, "Sampling signals with finite rate of innovation," IEEE Transactions on Signal Processing, vol. 50, no. 6, pp. 1417–1428, Jun. 2002.
[30] M. Unser and P. D. Tafti, "Stochastic models for sparse and piecewise-smooth signals," IEEE Transactions on Signal Processing, vol. 59, no. 3, pp. 989–1006, 2010.
[31] M. Unser and T. Blu, "Cardinal exponential splines: Part I–Theory and filtering algorithms," IEEE Transactions on Signal Processing, vol. 53, no. 4, pp. 1425–1438, Apr. 2005.
[32] M. Unser, "Cardinal exponential splines: Part II–Think analog, act digital," IEEE Transactions on Signal Processing, vol. 53, no. 4, pp. 1439–1449, Apr. 2005.
[33] M. Unser, P. D. Tafti, A. Amini, and H. Kirshner, "A unified formulation of Gaussian versus sparse stochastic processes–Part II: Discrete-domain theory," IEEE Transactions on Information Theory, vol. 60, no. 5, pp. 3036–3051, May 2014.
[34] J. Fageot and T. Humeau, "Unified view on Lévy white noises: General integrability conditions and applications to linear SPDE," arXiv:1708.02500 [math], Aug. 2017.
[35] A. Amini and M. Unser, "Sparsity and infinite divisibility," IEEE Transactions on Information Theory, vol. 60, no. 4, pp. 2346–2358, Apr. 2014.
[36] J. Fageot, A. Amini, and M. Unser, "On the continuity of characteristic functionals and sparse stochastic modeling," Journal of Fourier Analysis and Applications, vol. 20, no. 6, pp. 1179–1211, Dec. 2014.
[37] R. C. Dalang and T. Humeau, "Lévy processes and Lévy white noise as tempered distributions," The Annals of Probability, vol. 45, no. 6B, pp. 4389–4418, 2017.
[38] L. Bondesson, "On simulation from infinitely divisible distributions," Advances in Applied Probability, vol. 14, no. 4, pp. 855–869, 1982.
[39] L. Devroye, "Nonuniform random variate generation," in Handbooks in Operations Research and Management Science. Elsevier, Jan. 2006, vol. 13, pp. 83–121.
[40] C. de Boor, "On calculating with B-splines," Journal of Approximation Theory, vol. 6, no. 1, pp. 50–62, 1972.
[41] I. J. Schoenberg, "Contributions to the problem of approximation of equidistant data by analytic functions," in I. J. Schoenberg Selected Papers. Springer, 1988, pp. 3–57.
[42] A. Ron, "Factorization theorems for univariate splines on regular grids," Israel Journal of Mathematics, vol. 70, no. 1, pp. 48–68, 1990.
[43] F. J. Massey Jr., "The Kolmogorov–Smirnov test for goodness of fit," Journal of the American Statistical Association, vol. 46, no. 253, pp. 68–78, 1951.
[44] I. M. Gel'fand and N. Ya. Vilenkin, Generalized Functions: Applications of Harmonic Analysis. Academic Press, May 2014.