Reconciling Compressive Sampling Systems for Spectrally-sparse Continuous-time Signals
arXiv [cs.IT]
Michael A. Lexa, Mike E. Davies and John S. Thompson {michael.lexa,mike.davies,john.thompson}@ed.ac.uk
Institute of Digital Communications, University of Edinburgh
Jan 2011, Revised May 2011, Revised again Aug 2011
Abstract
The Random Demodulator (RD) and the Modulated Wideband Converter (MWC) are two recently proposed compressed sensing (CS) techniques for the acquisition of continuous-time spectrally-sparse signals. They extend the standard CS paradigm from sampling discrete, finite dimensional signals to sampling continuous and possibly infinite dimensional ones, and thus establish the ability to capture these signals at sub-Nyquist sampling rates. The RD and the MWC have remarkably similar structures (similar block diagrams), but their reconstruction algorithms and signal models strongly differ. To date, few results exist that compare these systems, and owing to the potential impact they could have on spectral estimation in applications like electromagnetic scanning and cognitive radio, we more fully investigate their relationship in this paper. We show that the RD and the MWC are both based on the general concept of random filtering, but employ significantly different sampling functions. We also investigate system sensitivities (or robustness) to sparse signal model assumptions. Lastly, we show that “block convolution” is a fundamental aspect of the MWC, allowing it to successfully sample and reconstruct block-sparse (multiband) signals. Based on this concept, we propose a new acquisition system for continuous-time signals whose amplitudes are block sparse. The paper includes detailed time and frequency domain analyses of the RD and the MWC that differ, sometimes substantially, from published results.
The theory of compressed sensing (CS) says that if a signal is sufficiently sparse with respect to some basis or frame, it can be faithfully reconstructed from a small set of linear, nonadaptive measurements [1–3]. When the signal belongs to a finite dimensional space, this statement means that it can be reconstructed from a set of measurements whose cardinality may be significantly less than the space’s dimension. It also implies that the measurement process is described by an underdetermined linear system of equations, or equivalently, by a rectangular matrix with more columns than rows. The fundamental work of Candès,
Romberg, and Tao [4] and Donoho [1] established sufficient conditions upon such sensing matrices that, if satisfied, allow the stable inversion of the linear systems. A key aspect of CS, and one which plays an important role in this paper, is that sensing matrices drawn at random often satisfy these conditions.

Conceptually, CS theory has three main thrusts: (1) the development of recovery methods that efficiently and faithfully reconstruct the original signal from its compressed samples, (2) the investigation of new signal models that effectively represent signal sparsity or other signal structure, and (3) the creation of new sampling (measurement) mechanisms that acquire signals in a compressed manner. The first concerns the reconstruction process and asks how one specifically reconstructs the original signal from the CS measurements (see, e.g., [1, 4, 5]). The second concerns the examination of different signal classes of interest and asks whether there are structured representations that can be exploited [6–8]. The third concerns the design of the physical sampling system and asks how one devises a system to acquire CS measurements [7, 9–11]. This paper focuses on the third thrust and examines two sampling systems for two distinct, but related, signal models.

Several CS based signal acquisition systems have been proposed for both continuous (analogue) and discrete signals. For example, the single-pixel camera [12] is a novel compressive imaging system, where light is projected onto a random basis using a micro-mirror device, and then the projected image is captured by a single photo-diode (the single “pixel”). Other examples include random filtering [13] and random convolution [9], which advocate random linear filtering and low rate sampling as a means to collect CS measurements. In these cases, “random” filters are linear filters whose impulse responses are realisations of particular random processes. Along the same lines, the Random Demodulator (RD) [10, 14, 15] and the
Modulated Wideband Converter (MWC) [11, 16–18] have recently been proposed as CS sampling systems that target continuous-time spectrally-sparse signals. The RD is a single channel, uniform sub-Nyquist sampling strategy for acquiring sparse multitone signals; the MWC is a multi-channel, uniform sub-Nyquist sampling strategy for acquiring sparse multiband signals. (Precise definitions for these two signal classes are provided in Section 3.) The RD and the MWC have tremendous potential impact because of the longstanding, proven usefulness of spectral signal models in many engineering and scientific applications (e.g. communications, radar/sonar, medical imaging, etc.). Perhaps owing to the near coincidental emergence of these systems, few results exist to date that reconcile their remarkably similar structures (see Figure 1) with their different reconstruction algorithms. In fact, the current literature paints a somewhat artificial dividing line between the RD and the MWC, preferring to focus primarily on one scheme or another rather than drawing connections between them. One exception is the recent comparative analysis by Mishali et al. [20] that focuses on the systems’ robustness to signal model perturbations and computational and hardware complexity. We comment more on [20] below.

In this paper, we offer new insights into the relationship between the RD and the MWC that complement the original works of Tropp et al. [10] and Mishali and Eldar [11]. We apply tools from modern sampling theory and classical Fourier analysis and show that the RD and the MWC are two manifestations of the same CS sampling approach, namely random filtering/convolution [9, 13].

(Footnote: There are several ways to construct viable random sensing matrices. For example, the entries could simply be independent and identically distributed realisations of a zero mean, unit variance Gaussian random variable.)
(Footnote: An early multi-channel random demodulator was proposed in [19].)
This fact is reflected in the systems’ similar structures. At the same time, we show that the sampling functions characterising the systems strongly distinguish the two schemes. In Section 3, we examine three different properties of the RD and the MWC related to the underlying assumption on signal sparsity. We discuss how sparsity manifests itself in each case and comment on each system’s sensitivity or robustness to changes in sparsity. In Section 4, we highlight, among other insights, the MWC’s use of block convolution as a principal processing step that enables it to successfully sample and recover “block-sparse” signals, i.e. signals whose nonzero components are grouped together. Extending this idea, we propose a new CS based sampling system and show through an example that it can successfully sample and reconstruct continuous-time signals that are block sparse in the time domain.

To be clear, we discuss neither the conditions for successful reconstruction nor implementation issues in this paper. The original works of Tropp et al. [10] and Mishali and Eldar [11], and even some subsequent scholarship [16–18], extensively investigate these issues. Some of the reconstruction conditions will be stated in the descriptions of the systems in Section 2, but the presumption throughout the paper is that the RD and the MWC are theoretically proven CS based techniques to sample and reconstruct continuous-time spectrally-sparse signals. In addition, we only examine the idealised RD and MWC because their fundamental similarities and differences are sharper in this setting (e.g., one does not have to account for the effects of non-ideal filters). Aspects of practical implementations are discussed in [14] and [21].

This paper and the comparative analysis presented in [20] are similar in some respects; however, the approach and the conclusions are very different.
For example, both touch on a model sensitivity issue of the RD, but in [20] this sensitivity is billed as a fundamental shortcoming in comparison to the MWC because the MWC does not exhibit the exact same sensitivity. In contrast, we argue here that the sensitivity is a manifestation of a CS sensitivity and that the MWC also inherits a shortcoming from CS theory, albeit a different one than the RD. In short, the approach of the present paper asks if there is an underlying link in the way the RD and the MWC process their signals and then examines their similarities and differences from this common perspective. This perspective yields a consistent and broader understanding of these systems and their relation to other CS schemes and standard sampling theory.

Throughout the paper, we denote time domain signals by lower case letters (e.g. $x, y, \psi$) and frequency domain signals by upper case letters (e.g. $X, Y$). Vectors and matrices are indicated by boldface type (e.g. $\mathbf{x}$, $\mathbf{Y}$, $\mathbf{\Phi}$). Parameters are denoted by upper case letters with one exception: the number of channels in the MWC is denoted by $q'$.

Contributions.
The main contributions of this paper are:

• a consistent analysis that (i) clearly shows random filtering underlies the RD and the MWC and (ii) highlights system sensitivities;

• the insight that block convolution fundamentally enables the MWC to sample and recover frequency block-sparse signals, and the generalisation of this idea to a new sampling system.

Figure 1:
Time domain block diagrams of the random demodulator (RD) and the modulated wideband converter (MWC). The RD is characterised by $T$, the duration of the observation interval, and $M$, a sampling rate parameter. The MWC is characterised by $L'$, a parameter for the period of $p_i(t)$, $M'$, a sampling rate parameter, and $q'$, the number of channels. The primary structural difference between the systems is the type of filter employed prior to sampling: the RD uses an ideal integrator and the MWC uses ideal low pass filters.

In this section, we examine the sampling mechanisms of the RD and the MWC from a modern sampling theory perspective. We show that the output samples for both systems are equal to the inner products of the input signal with a set of sampling functions that arise from the systems’ designs. We observe that unlike typical sampling functions, these sampling functions involve random waveforms, a central component in many CS sampling systems. If the inner products are interpreted as analogue filtering operations, we show that the samples result from a generalised random filtering or random convolution as described by Romberg [9] and Tropp et al. [13] as a means to acquire CS samples. This analysis suggests that the RD and the MWC are two manifestations of the same sampling approach, but differ in the specific form of the sampling functions. The difference in sampling functions also reflects the difference in the assumed signal models for the RD and the MWC. We do not introduce the notion of signal sparsity in this section because the conclusions reached do not depend on this aspect. Signal sparsity and its consequences are discussed in Section 3.
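As a concrete point of reference for the random filtering idea of [13], the following is a minimal discrete-time sketch: a sparse signal is convolved with an FIR filter whose taps are a realisation of an i.i.d. $\pm 1$ process and then subsampled. The signal length, filter length, decimation factor, and sparsity level are illustrative choices of ours, not parameters from the paper or from [13].

```python
import numpy as np

rng = np.random.default_rng(0)
n_sig, n_taps, decim = 256, 16, 8

# A sparse discrete-time signal: a few nonzero entries at random locations
x = np.zeros(n_sig)
x[rng.choice(n_sig, size=5, replace=False)] = rng.standard_normal(5)

# A "random filter": an FIR filter whose taps are a realisation of an
# i.i.d. +/-1 (Bernoulli) random process
h = rng.choice([-1.0, 1.0], size=n_taps)

# Filter, then sample at a low rate: these are the compressed measurements.
# A CS recovery algorithm would reconstruct x from y and knowledge of h.
y = np.convolve(x, h)[::decim]
print(n_sig, len(y))
```

The point of the sketch is only the measurement side: far fewer measurements than signal samples are retained, and the measurement operator is a (subsampled) convolution with a random impulse response.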
Let $x(t)$ be a continuous-time, complex-valued signal defined on the real line. The RD acquires samples of $x(t)$ on a finite observation interval where here, we assume, without loss of generality, that the samples are collected in the interval $[0, T]$ seconds. In [10], Tropp et al. adopt a particular signal model for $x(t)$ on this interval. They assume in part that $x(t)$ has a Fourier series (FS) expansion on $[0, T]$ which has bounded harmonics, i.e., $-W/2 \le n/T < W/2$, $n \in \mathbb{Z}$. On this interval, $x(t)$ is therefore modeled as

$$x(t) = \sum_{n=-N/2}^{N/2-1} X(n)\, e^{j \frac{2\pi}{T} n t}, \quad t \in [0, T], \qquad (1)$$

where $\{X(n)\}$ denotes the FS coefficients of $x(t)$ and $N = TW$. For ease of exposition, $N$ is assumed to be an even positive integer. This signal model is often called a multitone model.

To acquire the samples, a RD first multiplies $x(t)$ by a waveform $p(t)$ and then filters and samples the product $x(t)p(t)$ on $[0, T]$ (see Figure 1(a)). The signal $p(t)$ is taken to be a realisation of a continuous random process derived from a vector of Bernoulli random variables. Let $Z = [Z_0, \ldots, Z_{N-1}]$ be a vector of independent and identically distributed Bernoulli random variables $Z_l$ taking values $\pm 1$, and let $p(t; Z)$ denote the random process

$$p(t; Z) = Z_l, \quad t \in \left[\frac{l}{W}, \frac{l+1}{W}\right), \quad l = 0, \ldots, N-1. \qquad (2)$$

A realisation $z$ of $Z$ produces a single realisation $p(t; z)$ of $p(t; Z)$. Here, we abbreviate $p(t; z)$ by $p(t)$ and thus consider $p(t)$ to be a deterministic quantity, although its randomness plays an important role in proving performance guarantees [10]. In this paper, we sometimes refer to $p(t)$ as a random waveform in deference to this point. We stress that when acquiring samples on $[0, T]$, the RD uses a single realisation of $p(t; Z)$, but different realisations may be used for other observation intervals.
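The multitone model (1) and the chipping waveform (2) are easy to instantiate numerically on the chip grid $t_l = l/W$; the sketch below does so for one random draw. The values of $T$, $N$, and the sparsity $K$ are illustrative choices of ours, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, K = 1.0, 128, 5          # observation interval, time-frequency product, sparsity
W = N / T                      # bandwidth parameter, N = T*W

# Sparse FS coefficients X(n), n = -N/2, ..., N/2 - 1, with K nonzero entries
n = np.arange(-N // 2, N // 2)
X = np.zeros(N, dtype=complex)
X[rng.choice(N, size=K, replace=False)] = (rng.standard_normal(K)
                                           + 1j * rng.standard_normal(K))

# Evaluate the multitone signal (1) on the chip grid t_l = l/W
t = np.arange(N) / W
x = np.exp(2j * np.pi * np.outer(t, n) / T) @ X

# Chipping waveform (2): piecewise constant +/-1 on [l/W, (l+1)/W)
p = rng.choice([-1.0, 1.0], size=N)
print(x.shape, np.count_nonzero(X))
```

Because both $x(t)$ and $p(t)$ are determined by their values on the chip grid, this finite representation is all that the later matrix formulation needs.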
Note also that $p(t)$ has the FS representation

$$p(t) = \sum_{n=-\infty}^{\infty} P(n)\, e^{j \frac{2\pi}{T} n t}, \quad t \in [0, T], \qquad (3)$$

where $\{P(n)\}$ is the set of FS coefficients of $p(t)$.

The analogue filter in the RD design is taken to be an ideal integrator with impulse response $h(t) = \mathrm{rect}\!\left(\frac{2M}{T}t - 1\right)$, where

$$\mathrm{rect}(x) = \begin{cases} 1, & -1 \le x \le 1 \\ 0, & \text{otherwise}, \end{cases} \qquad (4)$$

and $M \in \mathbb{Z}^+$. The sampling period $T_s$ is taken to be $M$ times shorter than the observation window ($T_s = T/M$). The system therefore samples at the rate of $M/T$ Hz. The multitone signal model and the RD sampling system are therefore parameterised by $N$, the parameter equal to the time-frequency product $TW$, and $M$, the parameter that controls the RD’s sampling rate. Here, we assume that $M < N$.

The goal of the RD is to sample $x(t)$ at low rates while retaining the ability to reconstruct it in the interval $[0, T]$. Reconstruction entails the discovery of the active frequencies (the signal’s spectral support) and the amplitudes of the corresponding FS coefficients. If $x(t)$ is spectrally sparse on $[0, T]$, then reconstruction is possible using CS algorithms [10]. In this case, we note that signal reconstruction only implies the recovery of the spectral content of $x(t)$ in the observation interval. In other words, the samples $y(k)$, $k = 0, \ldots, M-1$, do not convey information about the spectral content of $x(t)$ outside of this interval. To obtain spectral information outside of $[0, T]$, the RD must be applied to other intervals (of possibly different durations). If the RD is applied to consecutive intervals, a time-frequency decomposition of $x(t)$ similar to the short-time Fourier transform can be obtained for multitone signals.

Time domain description.
By inspection of Figure 1(a), the output samples $y(k)$ can be expressed as

$$y(k) = g\!\left(\frac{(k+1)T}{M}\right) = \left.\int_0^T x(\tau)\, p(\tau)\, \mathrm{rect}\!\left(\frac{2M}{T}(t - \tau) - 1\right) d\tau \;\right|_{t = \frac{(k+1)T}{M}} \qquad (5)$$

for $k = 0, \ldots, M-1$, where $g(t) = [x(t)p(t)] * h(t)$ ($*$ denotes convolution). By substituting (1) into this expression and evaluating the integral, the following equation relating the time domain samples $y(k)$ to the FS coefficients $X(n)$ results:

$$y(k) = \sum_{n=-N/2}^{N/2-1} \alpha(n)\, X(n) \sum_{l = kN/M}^{(k+1)N/M - 1} p_l\, e^{j \frac{2\pi}{N} n l}, \qquad (6)$$

for $k = 0, \ldots, M-1$, where $p_l = p(l/W)$ and

$$\alpha(n) = \begin{cases} T\, \dfrac{e^{j \frac{2\pi}{N} n} - 1}{j 2\pi n}, & n \ne 0 \\ 1/W, & n = 0. \end{cases}$$

Tropp et al. derived (6) in [10] by analysing an equivalent digital system. In Appendix 6.1, we provide an alternate derivation that explicitly shows the analogue processing inherent in sampling with the RD. Because sampling is a linear operation with the RD, the samples $y(k)$ can be viewed as inner products of $x(t)$ with the set of sampling functions $\left\{ p(\tau)\, \mathrm{rect}\!\left(2k + 1 - \frac{2M}{T}\tau\right) \right\}$, where

$$y(k) = \left\langle x(\tau),\; p(\tau)\, \mathrm{rect}\!\left(2k + 1 - \frac{2M}{T}\tau\right) \right\rangle \qquad (7)$$

for $k = 0, \ldots, M-1$, and

$$\langle x(t), s(t) \rangle = \int_0^T x(t)\, s^*(t)\, dt$$

for two continuous functions $x(t)$, $s(t)$ on $[0, T]$. These sampling functions have finite duration in time ($T/M$ seconds), but because their Fourier transforms involve sinc functions, they extend infinitely in frequency. In the time-frequency plane, their support partitions the space into vertical strips of equal width (see Figure 2, left panel). We note that unlike modern sampling theory [22], the sampling functions in (7) contain the random waveform $p(t)$, and the conditions they must satisfy to ensure stable recovery are governed by CS theory and not Shannon-Nyquist based sampling theory.
(Refer to [22] and [7] for details regarding the conditions that sampling functions typically must satisfy.)

From (5), it is clear the samples $y(k)$ can be thought of as pointwise evaluations of the convolution between $x(t)p(t)$ and an ideal integrator. Equally valid, however, is the view that the samples are pointwise evaluations of a random, linear filtering operation involving $x(t)$ and the time-varying analogue filter $h(t, \tau) = p(\tau)\, \mathrm{rect}\!\left(\frac{2M}{T}(t - \tau) - 1\right)$:

$$y(k) = \left.\int_0^T x(\tau)\, h(t, \tau)\, d\tau \;\right|_{t = \frac{(k+1)T}{M}} \qquad (8)$$
Figure 2:
The output samples of both the RD and the MWC can be described as inner products of the input signal $x(t)$ with certain sets of sampling functions. The panel on the left depicts the time-frequency support of the RD sampling functions, where each vertical strip represents the support of one sampling function. Similarly, the panel on the right depicts the support of the MWC sampling functions, where each horizontal strip represents the support of one sampling function. For the RD and the MWC, the support characteristics of the sampling functions directly derive from the type of analogue filters used prior to sampling. The RD and the MWC represent two extreme cases: the RD has perfectly localised support in time but completely unlocalised support in frequency. The MWC is the exact opposite.

for $k = 0, \ldots, M-1$. Here, the impulse response $h(t, \tau)$ is considered random because at each time instance it is a windowed portion of a signal that randomly alternates between $\pm 1$. The samples $y(k)$ can therefore be thought of as the result of a random filtering operation, conceptually similar to the random filtering schemes proposed in [9] and [13]. In [13], Tropp et al. proposed a CS sampling scheme where a sparse discrete-time signal is first filtered by a digital filter whose impulse response is a realisation of a sequence of independent and identically distributed random variables, and then subsampled at a low rate. They illustrated through examples that, with the use of CS recovery algorithms, random filtering is a potential sampling structure to acquire CS measurements for sparse discrete time signals. In [9], Romberg proposed and examined a similar idea but considered a specific digital filter that randomly changes the phase of the input signal. Interestingly, Romberg considered the RD as a separate, follow-on processing step to his approach instead of considering it as a generalisation of his notion of random convolution. Here, (8) shows that the sampling mechanism of the RD can be viewed as a random filtering operation applied to continuous-time signals. We note, however, that the filtering operation in (8) is not a convolution because of the time-varying nature of $h(t, \tau)$. Strictly speaking then, (8) is distinct from the systems proposed in [13] and [9], although random filtering remains a common thread.

Frequency domain description.
An equivalent frequency domain expression to (5) can be derived (see Appendix 6.1) that relates the discrete Fourier transform (DFT) of $y(k)$, denoted by $Y(n)$, to the FS coefficients $X(n)$:

$$Y(n) = \frac{T}{M} \sum_{m = n - N/2}^{n + N/2} P(m)\, e^{-j\frac{\pi}{M} n}\, \mathrm{sinc}\!\left(\frac{\pi}{M} n\right) X(n - m), \qquad (9)$$

for $n = 0, \ldots, M-1$, where $\mathrm{sinc}(x) = \sin(x)/x$, $x \in \mathbb{R}$. This equation clearly shows the frequency domain convolution caused by the multiplication with $p(t)$ and the effect of filtering with an ideal integrator (indicated by the presence of the $e^{-j\frac{\pi}{M} n}\,\mathrm{sinc}\!\left(\frac{\pi}{M} n\right)$ term). Thus, one can also interpret $Y(n)$ as the output of a random, frequency-varying filter with impulse response $H(n, m) = P(m)\, e^{-j\frac{\pi}{M} n}\, \mathrm{sinc}\!\left(\frac{\pi}{M} n\right)$. We see therefore that the RD’s output in either the time or frequency domain can be viewed as the output of a random filter or convolution.

We now let $x(t)$ be a bandlimited, finite energy signal. The spectral content of $x(t)$ on $\mathbb{R}$ is then appropriately given by its Fourier transform (FT) $X(\omega)$,

$$X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt.$$

Here, $x(t)$ is bandlimited in the usual sense, i.e., $X(\omega)$ is assumed to have bounded support: $X(\omega) = 0$ for $|\omega| \ge \pi W'$ radians per second, $W' \in \mathbb{R}^+$, where $\pi W'$ is the bandwidth of $x(t)$ and $2\pi W'$ is the Nyquist frequency in radians per second. We adopt the following definition from [23]. The class of multiband signals $\mathcal{B}(\mathcal{F}, W')$ is the set of bandlimited, finite energy signals whose spectral support is a finite union of bounded intervals,

$$\mathcal{B}(\mathcal{F}, W') = \left\{ x(t) \in L^2(\mathbb{R}) : X(\omega) = 0,\; \omega \notin \mathcal{F} \right\} \qquad (10)$$

where

$$\mathcal{F} = \bigcup_{i=1}^{K} [a_i, b_i), \quad |a_i|, |b_i| \le \pi W'. \qquad (11)$$

In the following description of the MWC, primes are added to the parameters to distinguish them from the parameters of the RD. The same letters are, however, used for similar quantities; for example, $W/2$ and $W'/2$ play analogous roles in the two signal models. The $i$th channel of the MWC multiplies $x(t)$ by a random signal $p_i(t)$, then filters and samples the product $x(t)p_i(t)$ at a sub-Nyquist rate (see Figure 1(b)). As in the original formulation, we assume each channel’s filter is an ideal low pass filter, although it has been shown that the MWC can operate with non-ideal low pass filters [17].
Here, we examine its original formulation to make a clearer comparison to the RD. The signals $p_i(t)$, $i = 0, \ldots, q'-1$, are periodic extensions of different realisations of the continuous random process used for the RD. Formally, let $Z = \{Z_l\}$ be a sequence of independent and identically distributed Bernoulli random variables taking values $\pm 1$, and let $p(t; Z)$ denote the random process

$$p(t; Z) = Z_l, \quad t \in \left[\frac{l}{W'}, \frac{l+1}{W'}\right), \quad l = 0, \ldots, L'-1. \qquad (12)$$

Let $z_i$ denote a particular realisation of $Z$. The signals $p_i(t)$ are then periodic extensions of the realisations $p(t; z_i)$ of $p(t; Z)$:

$$p_i(t + mT_p) = p(t; z_i), \quad t \in [0, T_p], \; m \in \mathbb{Z}, \qquad (13)$$

where $T_p = L'/W'$. The impulse response of the ideal low pass analogue filter is $h(t) = \frac{\pi W'}{L'}\,\mathrm{sinc}\!\left(\frac{\pi W'}{L'} t\right)$, implying a cut-off frequency of $\pi W'/L'$ radians per second. (This is a slightly different assumption than that made in [11], where the cut-off frequency was set to $\pi W'/M'$.) Each channel samples at a rate that is $M' \in \mathbb{R}^+$ times slower than the Nyquist rate, i.e. $1/T_s = W'/M'$ Hz, where $T_s$ is the channels’ sampling period. The system’s average sampling rate is $q'W'/M'$ Hz. In [11], Mishali and Eldar showed that a necessary condition for successful reconstruction is $M' \le L'$, or equivalently $T_s \le T_p$. We assume this condition holds for the MWC throughout the paper. We also assume that the average rate is always less than the Nyquist rate ($q' < M'$).

Time domain description.
By inspection of Figure 1(b), we obtain the following time-domain expression for a single channel of the MWC:

$$y_i(k) = g_i\!\left(\frac{kM'}{W'}\right) = \left. \frac{\pi W'}{L'} \int_{-\infty}^{\infty} x(\tau)\, p_i(\tau)\, \mathrm{sinc}\!\left(\frac{\pi W'}{L'}(t - \tau)\right) d\tau \;\right|_{t = \frac{kM'}{W'}}, \qquad (14)$$

for all $k \in \mathbb{Z}$. This expression corresponds to the time domain expression in (5) for the RD. Like the RD, the samples $y_i(k)$ can be interpreted as inner products with a set of sampling functions $\left\{ \frac{\pi W'}{L'}\, p_i(\tau)\, \mathrm{sinc}\!\left(\frac{\pi M'}{L'}k - \frac{\pi W'}{L'}\tau\right) \right\}$, where

$$y_i(k) = \left\langle x(\tau),\; \frac{\pi W'}{L'}\, p_i(\tau)\, \mathrm{sinc}\!\left(\frac{\pi M'}{L'}k - \frac{\pi W'}{L'}\tau\right) \right\rangle. \qquad (15)$$

But in contrast to the RD, these sampling functions have finite frequency support and infinite temporal support. In the time-frequency plane, their support partitions the space into horizontal strips of width $W'/L'$ Hz (see Figure 2, right panel). This particular set of sampling functions represents one instance of a general theory put forth by Eldar [7] to compressively sample continuous-time signals from unions of shift-invariant spaces, of which multiband signals are members. The theory combines modern sampling theory with CS theory in such a way that samples are acquired in a typical manner by projecting the signal onto a set of sampling functions (as in (15)), but CS theory is needed for reconstruction. We do not review the details of this theory here because it does not apply to the RD.

Interpreting (14) as a random filtering, we identify the time-varying impulse response as $h_i(t, \tau) = \frac{\pi W'}{L'}\, p_i(\tau)\, \mathrm{sinc}\!\left(\frac{\pi W'}{L'}(t - \tau)\right)$. Because the MWC employs an ideal low pass filter, the impulse response contains a sinc function instead of a rectangular function as seen in (5). Consequently, the impulse response has infinite temporal extent in this (ideal) setting.
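A single MWC channel is likewise straightforward to simulate on a dense (Nyquist-like) grid: mix with a periodic $\pm 1$ waveform as in (13), low-pass filter, and subsample. The sketch below uses FFT masking as a stand-in for the ideal low pass filter; the grid size, the values playing the roles of $L'$ and $M'$, and the toy two-tone input are all illustrative choices of ours, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samp, L, M = 4096, 32, 32        # dense grid length; L, M play the roles of L', M'
t = np.arange(n_samp)

# Toy stand-in for a multiband input: two narrowband components
x = np.cos(2 * np.pi * 0.31 * t) + 0.5 * np.cos(2 * np.pi * 0.11 * t)

# p_i(t): one +/-1 pattern of L chips, repeated periodically as in (13)
pattern = rng.choice([-1.0, 1.0], size=L)
p = np.tile(pattern, n_samp // L)

# Mix, then apply an ideal low pass filter with (normalised) cut-off 0.5/L
# cycles per sample (the analogue cut-off pi W'/L') via FFT masking
V = np.fft.fft(x * p)
f = np.fft.fftfreq(n_samp)
V[np.abs(f) > 0.5 / L] = 0.0
v = np.fft.ifft(V).real

# Uniform low-rate sampling: one output sample per M dense-grid samples
y = v[::M]
print(y.shape)
```

The periodic mixing aliases every spectral slice of width $W'/L'$ into the filter's passband, which is what the frequency domain description below makes explicit.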
For the MWC, $h_i(t, \tau)$ is random in the same general sense as the RD’s time-varying impulse response: the sinc function is multiplied by a realisation of a random process.

Frequency domain description.
Using standard Fourier analysis techniques, Mishali and Eldar [11] derived the following frequency domain description for the $i$th channel of the MWC:

$$Y_i\!\left(e^{j\omega \frac{M'}{W'}}\right) \mathrm{rect}\!\left(\frac{M'}{\pi W'}\omega\right) = \frac{W'}{M'} \sum_{m = -\lfloor (L'+1)/2 \rfloor + 1}^{\lfloor (L'+1)/2 \rfloor} P_i(m)\, X\!\left(\omega - m\frac{2\pi W'}{L'}\right) \mathrm{rect}\!\left(\frac{L'}{\pi W'}\omega\right) \qquad (16)$$

$$= \sum_{m = -\lfloor (L'+1)/2 \rfloor + 1}^{\lfloor (L'+1)/2 \rfloor} \beta(m)\, X\!\left(\omega - m\frac{2\pi W'}{L'}\right) \mathrm{rect}\!\left(\frac{L'}{\pi W'}\omega\right) \sum_{l=0}^{L'-1} p_{il}\, e^{-j\frac{2\pi}{L'} m l}, \qquad (17)$$

where $P_i(m)$ denotes the FS coefficients of $p_i(t)$, $\lfloor\cdot\rfloor$ denotes the floor rounding operation, $p_{il} = p_i(t)$ for $t \in [l/W', (l+1)/W')$, and

$$\beta(m) = \begin{cases} \dfrac{W'}{M'}\, \dfrac{1 - e^{-j\frac{2\pi}{L'} m}}{j 2\pi m}, & m \ne 0 \\ \dfrac{W'}{M'L'}, & m = 0. \end{cases}$$

Appendix 6.2 contains a slightly different derivation of (17) than that presented in [11]. Comparing (16) to (9), we observe that the DTFT of the output sequences $y_i(k)$ can again be interpreted as the result of a random convolution, where the frequency-varying impulse response is given by $H_i(m, \omega) = P_i(m)\, \mathrm{rect}\!\left(\frac{L'}{\pi W'}\omega\right)$. The spectral content of the samples $y_i(k)$ is expressed by the DTFT, as opposed to the DFT, because $x(t)$ is defined on the entire real line for the MWC instead of on an interval. We also note that, up to the parameter correspondences $T \leftrightarrow W'/M'$ and $N \leftrightarrow L'$, the scalars $\beta(m)$ are the complex conjugates of the $\alpha(n)$ in (6).

Single channel MWC.
There are two ways to collapse the MWC into an equivalent single channel system. One can either lengthen the observation interval by a factor of $q'$ (keeping all other parameters fixed), or one can increase the sampling rate while maintaining the same observation interval. If the observation interval is lengthened, the sequence of samples from a single channel MWC can be partitioned into $q'$ groups of $W'T/M'$ samples, where each group of samples is thought of as the output from an individual channel in the multi-channel configuration. Alternatively, one can set the sampling rate of a single channel MWC equal to the average rate of a multi-channel MWC, i.e., set the sampling rate to $q'W'/M'$ Hz, and accordingly adjust the low pass filter’s cut-off frequency to $\pi q'W'/L'$ radians per second (see Figure 3). Notice that in this case we still maintain the requirement $M' \le L'$. The frequency domain description of this single channel MWC can now be obtained from (16) by substituting $M'/q'$ for $M'$ and $L'/q'$ for $L'$:

$$Y\!\left(e^{j\omega \frac{M'}{q'W'}}\right) \mathrm{rect}\!\left(\frac{M'}{\pi q'W'}\omega\right) = \frac{q'W'}{M'} \sum_{m = -\lfloor (L'/q'+1)/2 \rfloor + 1}^{\lfloor (L'/q'+1)/2 \rfloor} P(m)\, \mathrm{rect}\!\left(\frac{L'}{\pi q'W'}\omega\right) X\!\left(\omega - m\frac{2\pi q'W'}{L'}\right). \qquad (18)$$

Summary.
Equations (5), (9), (14), and (16) all indicate that the sampling mechanisms for the RD and the MWC are based on analogue random filtering/convolution. However, the RD’s integrator and the MWC’s low pass filter induce significant differences in the specific form of the random convolutions, or equivalently, in their sampling functions. In fact, the different filters induce bipolar time-frequency characterisations that make them well-suited for the signal models they target: an ideal integrator with

Figure 3:
Block diagram of a single channel MWC. To be equivalent to the multi-channel system depicted in Figure 1(b), this system samples at a rate $q'$ times faster and has a low pass filter with a cut-off frequency $q'$ times greater. Additional digital processing is also required to form the linear system in (20).

a finite impulse response is well-suited to signals modeled on a finite interval, and an ideal low pass filter with a finite frequency response is well-suited to signals modeled on a finite frequency band.

This section introduces the notion of signal sparsity and briefly discusses signal reconstruction for the RD and the MWC. We show that in practice the MWC can at best recover an approximation of the input multiband signal instead of perfectly reconstructing it. This important fact, although not surprising, is completely missing from the current literature. We also discuss three system characteristics related to sparsity that ultimately stem from their reliance on CS theory and algorithms. The first is a model sensitivity of the RD that is already a familiar limitation [10, 24, 25]. The second concerns a possible sensitivity of the MWC with respect to the number of channels, and while again this behaviour may not be surprising, it is nevertheless important in the design of the MWC. The third characteristic, in contrast to the second, concerns the robustness of the MWC to increases in the width of the spectral bands caused by windowing. The MWC results are new.
In the original formulation of the RD, the input signal was not only modeled as a multitone signal onthe observation interval, but was also assumed to be spectrally sparse [10]. A spectrally sparse multitone signal is a multitone signal that has a small number of nonzero FS coefficients out of the N + 1 possible(refer to (1)). More precisely, letting K denote the number of nonzero coefficients (or equivalently thenumber of nonzero frequencies), a spectrally sparse multitone signal is one that satisfies K ≪ N .The reconstruction of a sparse multitone signal x ( t ) from the samples y ( k ) hinges on the matrix formof (6), y = ΣΨ X (19)where Σ is a M × N matrix of the form Σ = p . . . p NM − · · · · · · · · · · · · p ( M − NM . . . p N − , econciling CS Sampling and y = [ y (0) , . . . , y ( M − ′ Ψ r,l = e − j πN n r l X = [ α ( − N ) X ( − N ) , . . . , α ( N − X ( N − ′ α ( n r ) = T e j πN n r − j πn r , α (0) = 1 /W for n r = − N/ r , r = 0 , . . . , N −
1, and l = 0 , . . . , N −
1, where the apostrophes denote transpose. By construction, (19) is an underdetermined linear system of equations ($\Sigma\Psi$ is $M \times N$ with $M < N$; see Section 2.1), and underdetermined systems do not, in general, have unique solutions. Nevertheless, CS theory has shown that, with the presumed sparsity of $X$, (19) can be solved by a direct application of a number of recently developed recovery algorithms, e.g., $\ell_1$ minimisation [4], orthogonal matching pursuit [5], or iterative hard thresholding [26, 27]. In the CS literature, solving (19) is termed the single measurement vector (SMV) problem. Theoretical guarantees regarding the successful recovery of $X$ are provided in [10] in terms of the degree of sparsity and the number of samples (measurements) collected.

Recall that multiband signals are bandlimited, finite energy signals whose spectral support $F$ is a union of bounded intervals (see (10) and (11)). A sparse multiband signal is a multiband signal whose support has Lebesgue measure that is small relative to the overall signal bandwidth, i.e., $\lambda(F) \ll W'$ [28]. If, for instance, all the occupied bands (intervals) have equal bandwidth $B$ Hz and the signal is composed of $K$ disjoint frequency bands, then a sparse multiband signal is one satisfying $KB \ll W'$. In the CS literature, signals having this type of "block" structure have been studied in various settings, the central question being whether this additional signal structure reduces the minimum number of samples required to reconstruct the original signal (see, e.g., [6, 29, 30]).

MWC signal reconstruction centres on the matrix form of (16),
$$\mathbf{Y}(\omega) = \Phi\Psi\,\mathbf{S}(\omega) \qquad (20)$$
where
$$\mathbf{Y}(\omega) = [Y_0(\omega), \dots, Y_{q'-1}(\omega)]', \qquad Y_i(\omega) = Y_i(e^{j\omega M'/W'})\,\mathrm{rect}\big(\tfrac{M'}{2\pi W'}\omega\big),$$
$$\Phi_{i,l} = p_{il}, \qquad \Psi_{l,r} = e^{-j\frac{2\pi}{L'} l m_r},$$
$$\mathbf{S}(\omega) = [S_0(\omega), \dots, S_{L'-1}(\omega)]', \qquad S_r(\omega) = \beta(m_r)\,\mathrm{rect}\big(\tfrac{L'}{2\pi W'}\omega\big)\, X\big(\omega - m_r\tfrac{2\pi W'}{L'}\big),$$
$$\beta(m_r) = \frac{W'}{M'}\cdot\frac{1 - e^{-j\frac{2\pi}{L'} m_r}}{j2\pi m_r},\ m_r \ne 0, \qquad \beta(0) = \frac{W'}{M'L'},$$
for $i = 0, \dots, q'-1$, $l = 0, \dots, L'-1$, $r = 0, \dots, L'-1$, and $m_r = -\lfloor (L'+1)/2 \rfloor + 1 + r$. Like (19), this linear system of equations is underdetermined since we assume $q' < M' \le L'$ (see Section 2.2). The vector $\mathbf{S}(\omega)$ is sparse in the sense that most of its elements (segments of $X(\omega)$) do not contain the occupied frequency bands that comprise $x(t)$. Equation (20) can also be derived from the single channel MWC, although one has to first extract $q'$ lower rate sample sequences from the single higher rate output sequence. We refer the reader to [11] for details regarding the extra processing steps.

In practice the linear system in (20) cannot in general be computed because it theoretically requires an infinite amount of data. To see the point, consider the inverse DTFT of (20). It immediately follows that the inverse DTFTs of the spectra $Y_i(\omega) = Y_i(e^{j\omega M'/W'})\,\mathrm{rect}\big(\tfrac{M'}{2\pi W'}\omega\big)$ are the time domain sequences $\{y_i(k),\ k \in \mathbb{Z}\}_i$. If one interprets the spectral segments $S_r(\omega) = \beta(m_r)\,\mathrm{rect}\big(\tfrac{L'}{2\pi W'}\omega\big)\, X\big(\omega - m_r\tfrac{2\pi W'}{L'}\big)$ on the right hand side of (20) as single periods of periodic spectra, it follows that their inverse DTFTs are the sequences $\{\gamma_r(k),\ k \in \mathbb{Z}\}_r$, where
$$\gamma_r(k) = \frac{L'}{2\pi W'}\int_{-\pi W'/L'}^{\pi W'/L'} S_r(\omega)\, e^{j\frac{L'}{W'}k\omega}\, d\omega, \qquad (21)$$
and
$$S_r(\omega) = \sum_{k=-\infty}^{\infty}\gamma_r(k)\, e^{-j\frac{L'}{W'}k\omega}, \qquad \omega \in \big[-\tfrac{\pi W'}{L'}, \tfrac{\pi W'}{L'}\big]. \qquad (22)$$
The DTFT transform pair of (20) is therefore the linear system,
$$\mathbf{Y} = \Phi\Psi\,\mathbf{S} \qquad (23)$$
where $\Phi$ and $\Psi$ are as in (20) but $\mathbf{Y}$ and $\mathbf{S}$ are now infinite column matrices: the rows of $\mathbf{Y}$ are the sequences $y_i(k),\ k \in \mathbb{Z}$, and the rows of $\mathbf{S}$ are the sequences $\gamma_r(k),\ k \in \mathbb{Z}$. The matrix $\mathbf{S}$ is described as being jointly sparse because most of its rows are zero, since the zero-valued elements of $\mathbf{S}(\omega)$ correspond to zero-valued sequences $\gamma_r(k)$ (rows of $\mathbf{S}$). In general, a matrix $\mathbf{Z}$ is said to be $K$-joint sparse if there are at most $K$ rows in $\mathbf{Z}$ that contain nonzero elements.
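The joint-sparsity notion just defined is easy to state programmatically. The short sketch below (plain NumPy; the function names are ours, not from the papers under discussion) checks whether a matrix is $K$-joint sparse by examining the energy of its rows.

```python
import numpy as np

def row_support(Z, tol=1e-12):
    """Indices of rows of Z with non-negligible energy (the joint support)."""
    return np.flatnonzero(np.linalg.norm(Z, axis=1) > tol)

def is_joint_sparse(Z, K, tol=1e-12):
    """True if at most K rows of Z contain nonzero elements."""
    return row_support(Z, tol).size <= K

# A toy jointly 2-sparse matrix: only rows 1 and 4 are nonzero.
S = np.zeros((6, 8))
S[1] = 1.0
S[4] = -2.0
assert is_joint_sparse(S, 2)
assert list(row_support(S)) == [1, 4]
```

In the MWC setting, the rows of $\mathbf{S}$ play the role of the sequences $\gamma_r(k)$, and the joint support identifies which spectral slices are occupied.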
The recovery of $\mathbf{S}$ from the measurements $\mathbf{Y}$ in (23) is called an infinite measurement vector (IMV) problem [7, 11, 31] because the columns of $\mathbf{Y}$ are viewed as CS measurements (via the measurement matrix $\Phi\Psi$) of a collection of vectors that share a common sparse support. The limitation of only ever collecting a finite number of samples truncates the rows of $\mathbf{Y}$ and causes the IMV problem in (23) to become a so-called multiple measurement vector (MMV) problem [31–36], where the goal is to recover a finite number of the columns of the jointly sparse matrix $\mathbf{S}$ corresponding to the finite number of acquired samples. Using existing CS methods, this MMV problem can be solved exactly, or with exceedingly high probability, provided the matrix $\Phi\Psi$ satisfies certain conditions and enough samples are collected relative to the joint sparsity of $\mathbf{S}$. The solution, however, is in general a linear approximation to the true spectral slices $Y_i(\omega)$ because the solution only recovers a finite number of coefficients $\gamma_r(k)$ in (22) (see [37] for information about linear approximations). This fact is in contrast to the sparse multitone signal model, which is parameterised by a finite number of parameters and thus only requires a finite number of samples for perfect signal reconstruction.

In [11] and [31], Mishali and Eldar proposed a two step reconstruction process termed the "continuous-to-finite block" that provably recovers $x(t)$ exactly given an infinite amount of data, or in other words, recovers $x(t)$ to an arbitrary precision given sufficient data. The first step recovers the joint support of $\mathbf{S}$ by solving an associated MMV problem, and the second step uses the recovered support to reduce the dimension of the measurement matrix $\Phi\Psi$ such that a unique least squares solution can be found. We stress that even if this two step process perfectly solves the MMV problem derived from (23), the solution can only, in general, approximate $x(t)$ for a finite number of samples.
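The two step idea (support first, then least squares on the reduced system) can be sketched in a few lines. This is an illustrative stand-in, not the authors' continuous-to-finite implementation: support recovery here uses a simultaneous-OMP style greedy loop, and all dimensions are toy values.

```python
import numpy as np

def recover_jointly_sparse(A, Y, k):
    """Two-step MMV recovery sketch: (1) estimate the joint support with a
    simultaneous-OMP style greedy loop, (2) re-solve the reduced least
    squares problem on that support.  A: m x n, Y: m x d, k: joint sparsity."""
    n = A.shape[1]
    support, R = [], Y.copy()
    for _ in range(k):
        scores = np.linalg.norm(A.conj().T @ R, axis=1)  # aggregate over snapshots
        scores[support] = 0                              # never pick an atom twice
        support.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(A[:, support], Y, rcond=None)
        R = Y - A[:, support] @ coef                     # deflate the residual
    S_hat = np.zeros((n, Y.shape[1]))
    S_hat[support] = coef
    return S_hat, sorted(support)

# Toy check: 3 active rows out of 40, 20 measurements, 12 snapshots.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 40)) / np.sqrt(20)
S = np.zeros((40, 12))
S[[3, 17, 29]] = rng.standard_normal((3, 12))
S_hat, supp = recover_jointly_sparse(A, A @ S, 3)
```

Once the support is known, the reduced system is overdetermined and the least squares step is exact in the noiseless case; the truncation of $\mathbf{Y}$ is what limits the final reconstruction to a linear approximation.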
The ability of CS recovery algorithms to recover the FS coefficients in (19) depends fundamentally on the sparsity of $X$, or equivalently, on whether $x(t)$ has a sparse FS representation in the observation window $[0, T]$, and on the number of acquired measurements. For the RD, the sparsity level of $X$ (number of nonzero entries) in (19) not only changes with the number of tones comprising $x(t)$, but can also be increased if there is a mismatch between the Fourier basis in which $x(t)$ is actually sparse and the basis in which $x(t)$ is modeled. In fact, it is known that $x(t)$ may have a sparse Fourier expansion using $\{\exp(j\frac{2\pi}{T}nt)\}_n$ but not a sparse expansion using $\{\exp(j\frac{2\pi}{T+\delta}nt)\}_n$, where $\delta \in \mathbb{R}$ is some small perturbation [37, p. 379–380]. The implication for the RD is that in a blind sensing scenario, where the frequencies of the tones are not known, model mismatches are likely, and if the sparsity level of $X$ rises above a required level, CS recovery may be jeopardized. This possibility has been described as a sensitivity of the RD because a small basis mismatch (small $\delta$) can lead to significant reconstruction errors [24]. This sensitivity was acknowledged by Tropp et al. in [10], highlighted in [16, 20], and studied in [24, 25]. Recent results, however, could potentially alleviate the problem. In [38], Candès et al. showed that sparse multitone signals, defined in terms of an oversampled DFT dictionary, can be effectively recovered from undersampled data. Because this model can sparsely represent multitone signals on $[0, T]$ as well as on $[0, T+\delta]$, these results suggest that a modified RD could be robust to basis mismatch. Similarly, Duarte and Baraniuk [25] proposed a heuristic solution that marries model-based CS [6], redundant DFT frames, and standard spectral estimation techniques.

Because of its reliance on CS, the MWC reconstruction algorithm inherits the CS conditions of successful reconstruction.
One such condition relates the number of CS measurements per unit time (number of rows of $\mathbf{Y}$ in (23)) to the support of $\mathbf{S}$. Assuming that the matrix $\Phi\Psi$ has maximal rank $q'$, a necessary condition for unique recovery of $\mathbf{S}$ from the measurements is $q' \ge 2\,|\mathrm{supp}\,\mathbf{S}|$, where $|\mathrm{supp}\,\mathbf{S}|$ denotes the joint sparsity of $\mathbf{S}$ [11, 35]. Knowing that $q'$ also equals the number of channels in the MWC, it is reasonable to want to minimise $q'$ so as to minimise the processing and hardware complexity of the MWC, i.e., set it as close to $2\,|\mathrm{supp}\,\mathbf{S}|$ as possible while still ensuring successful recovery. (In practice, this lower bound changes depending on the specific CS algorithm employed.) With this in mind, it is natural to ask how the MWC's performance behaves around this condition boundary. Clearly, beyond the theoretical limit of

Figure 4:
Reconstruction error as a function of the number of channels $q'$ for the MWC. $\Omega$ denotes the joint sparsity $|\mathrm{supp}\,\mathbf{S}|$ of $\mathbf{S}$, and S-OMP is used to recover the input signal's support, with $T = 4$ sec, $W' = 500$ Hz, $L' = 50$, $M' = 20$. The plots show the rapid increase in error as $q'$ falls below a critical point. For the cases shown, the error increases exponentially as $q'$ decreases. Left: active bands have equal amplitude; Right: unequal amplitudes.

$2\,|\mathrm{supp}\,\mathbf{S}|$, it is impossible to guarantee recovery of the signal's support. Consequently, one expects the performance to degrade beyond this point. Figures 4 and 5 show that performance (measured as squared error) can indeed decay rapidly, in fact exponentially fast, as a function of the number of channels $q'$. In the example, we consider a sparse multiband signal bandlimited to 500 Hz that is sampled by a MWC with a spectral resolution of 20 Hz ($L' = 50$) and a channel sampling rate of 50 Hz ($M' = 20$) for various sparsity levels and values of $q'$. The average squared error is computed as $\frac{1}{TW'}\|x - \hat{x}\|^2$, where $x$ is a digitally simulated sparse multiband signal, $\hat{x}$ is the reconstructed signal, and $\|\cdot\|$ denotes the $\ell_2$ norm. Figure 4 shows the results when the simultaneous orthogonal matching pursuit algorithm (S-OMP) is used to recover the signal's support. For each value of $q'$, we report an error that is averaged over 100 trials, with each trial using a different randomly generated $\Phi$ matrix. Note that when each band has equal amplitude (left panel), degradation begins at roughly $q' = 2\,|\mathrm{supp}\,\mathbf{S}|\log(L')$, which is consistent with S-OMP [33]. When the band amplitudes are randomly chosen (right panel), the turning points are roughly proportional to $2\,|\mathrm{supp}\,\mathbf{S}|\log(L')$.
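The qualitative behaviour behind Figure 4 (recovery failing sharply once the number of channels drops below a critical value) can be mimicked with a small Monte Carlo sketch. Everything here is illustrative: a Bernoulli sign matrix plays the role of $\Phi$, random jointly sparse matrices play the role of $\mathbf{S}$, and a plain S-OMP loop recovers the support.

```python
import numpy as np

rng = np.random.default_rng(1)
L, d, K, trials = 50, 16, 3, 25     # atoms, snapshots, joint sparsity, trials

def somp_support(A, Y, k):
    """Simultaneous-OMP support estimate: aggregate column correlations over
    all measurement vectors, pick the best atom, deflate, repeat."""
    supp, R = [], Y.copy()
    for _ in range(k):
        scores = np.linalg.norm(A.T @ R, axis=1)
        scores[supp] = 0
        supp.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(A[:, supp], Y, rcond=None)
        R = Y - A[:, supp] @ coef
    return set(supp)

fail_rate = {}
for q in (5, 10, 16, 24):           # number of channels (rows of Phi)
    failures = 0
    for _ in range(trials):
        Phi = rng.choice([-1.0, 1.0], size=(q, L)) / np.sqrt(q)
        true = set(map(int, rng.choice(L, K, replace=False)))
        S = np.zeros((L, d))
        S[sorted(true)] = rng.standard_normal((K, d))
        failures += somp_support(Phi, Phi @ S, K) != true
    fail_rate[q] = failures / trials
print(fail_rate)                    # failure rate typically falls as q grows
```

This toy setup omits the MWC front end entirely; it only illustrates the generic CS phenomenon that support recovery degrades quickly once the number of measurement rows approaches the sparsity-dependent threshold.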
Figure 5 shows the original and reconstructed signals for one specific case.

The point of this example is to show that when the number of channels is near its theoretical limit, a small change in its value can lead to dramatic performance decreases for the MWC. The exact degree of degradation is problem dependent, however.

Let $x(t)$ be a sparse multiband signal with FT $X(\omega)$ and let $z(t)$ be a windowed version of $x(t)$,
$$z(t) = x(t)\,w(t),$$
where $w(t)$ is an indicator function of some sub-interval of the observation interval. Consider the situation where $z(t)$ is the input signal to the MWC, but where the MWC is designed for a multiband signal, i.e., as

Figure 5:
Results of two specific cases from the experiment in Figure 4 ($\Omega = 3$, equal amplitudes). The time and frequency domains of the input multiband signals are shown along with the difference signal between the input and reconstructed signals (green raised plots). Left: $q' = 25$, squared error $= 3\times$…; Right: $q' = 18$, squared error $= 0.$…

Figure 6:
Reconstruction error as a function of signal duration for the MWC. A four band multiband signal bandlimited to 500 Hz was systematically windowed and sampled by a MWC ($q' = 20$, $L' = 50$, $M' = 20$) over an observation interval of 2 sec. Each point represents a squared error averaged over 50 trials. Note that the error remains relatively constant over a wide range of signal durations.

described in Section 2.2. Because time limited signals cannot be bandlimited, this case represents a model mismatch scenario where, essentially, the input signal has shorter duration than expected. The spectrum of $z(t)$ equals $X(\omega)$ convolved with a sinc function; the windowing thus spreads the original spectrum $X(\omega)$ [39]. One might therefore expect that if the spectrum were sufficiently spread, the sparsity of $\mathbf{S}(\omega)$ would increase to a point where the condition $q' \ge 2\,|\mathrm{supp}\,\mathbf{S}|$ is violated and performance would degrade rapidly, as seen above in Figure 4. However, experimental results show that this is not necessarily the case. Figure 6 shows that the MWC can be robust to a wide range of signal durations. In fact, for the two values of $q'$ shown, the error remains relatively constant for signal durations ranging from 100% to 5% of the observation interval. Observe that for $q' = 20$, reconstruction does eventually break down (signal duration less than 0.…).

Figure 7:
Results of two specific cases from the MWC experiment in Figure 6 ($q' = 50$). The time and frequency plots of two windowed multiband signals are shown. They represent the two extreme cases: full duration (2 sec) and short duration (0.1 sec). The raised plots (green) are the difference signals between the original and reconstructed signals. Note that the spread of the spectrum has little effect on reconstruction.

This section ends with a description of a novel sampling system whose inspiration comes from a simple exercise: we exchange the signal models for the RD and the MWC and analyze the relationship between the input and output signals. In particular, we consider sampling sparse multitone signals with the MWC and sparse multiband signals with the RD. We discover that the MWC uses what we call "block convolution" to successfully sample and recover block sparse signals. We also discover that the RD does not use block convolution and thus cannot, without difficulty, sample multiband signals. In Section 4.3, we apply this concept in a new way and propose a system that can sample and recover continuous-time block-sparse signals at low sampling rates.
Consider the problem of using a MWC to sample and recover a sparse multitone signal instead of a sparse multiband signal. Let $x(t)$ be a sparse multitone signal on the observation interval $[0, T]$ with bounded harmonics $-W/2 \le n/T < W/2$, and denote by $X(n)$ the FS coefficients of $x(t)$ on $[0, T]$ (cf. (1)). Let the random waveforms $p_i(t)$, $i = 0, \dots, q'-1$, be as defined in Section 2.2, but with $W'$ replaced by $W$; their common period $T_p$ thus equals $L'/W$, $L' \in \mathbb{Z}^+$. Let the ideal low pass filter have impulse response $h(t) = \frac{\pi W}{L'}\,\mathrm{sinc}\big(\frac{\pi W}{L'}t\big)$ and let the sampling period be $T_s = M'/W$. For this problem and for this section only, we assume that the duration of the observation interval is an integer multiple of $T_p$, i.e., $T = N\frac{L'}{W}$, $N \in \mathbb{Z}^+$. We also assume, for ease of exposition, that the period of $p_i(t)$ equals the sampling period ($T_p = T_s$, or $L' = M'$). These assumptions are not necessary, but help make this section's message clear and easier to understand. Lastly, to make the problem meaningful, we assume $T \ge T_s$. In Appendix 6.3, we derive the following expression relating the FS coefficients of $x(t)$ to the DFT coefficients of the output samples,
$$Y_i(n) = \sum_{m=-\lfloor (L'+1)/2 \rfloor + 1}^{\lfloor (L'+1)/2 \rfloor} \eta(m)\, X(n - Nm)\,\mathrm{rect}\big(\tfrac{n}{N}\big)\sum_{l=0}^{L'-1} p_{il}\, e^{-j\frac{2\pi}{L'}lm}, \qquad (24)$$
where $p_{il} = p_i(t)$ for $t \in [l/W, (l+1)/W)$ and
$$\eta(m) = \frac{1}{N}\cdot\frac{1 - e^{-j\frac{2\pi}{L'}m}}{j2\pi m},\ m \ne 0, \qquad \eta(0) = \frac{1}{NL'},$$
for $-N/2 \le n < N/2$ and $i = 0, \dots, q' -$
1. This expression is analogous to the frequency domain description of the RD given by (9) and is what (16) becomes assuming a sparse multitone signal model and $L' = M'$. In matrix form, (24) becomes the MMV problem,
$$\mathbf{Y} = \Phi\Psi\,\mathbf{S}, \qquad (25)$$
where
$$\mathbf{Y}_{i,v} = Y_i(n_v), \qquad \Phi_{i,l} = p_{il}, \qquad \Psi_{l,r} = e^{-j\frac{2\pi}{L'}lm_r},$$
$$\mathbf{S}_{r,v} = \eta(m_r)\, X(n_v - Nm_r)\,\mathrm{rect}\big(\tfrac{n_v}{N}\big), \qquad \eta(m_r) = \frac{1}{N}\cdot\frac{1 - e^{-j\frac{2\pi}{L'}m_r}}{j2\pi m_r},\ m_r \ne 0, \qquad \eta(0) = \frac{1}{NL'},$$
for $i = 0, \dots, q'-1$, $l = 0, \dots, L'-1$, $v = 0, \dots, N-1$, $n_v = -\lfloor N/2 \rfloor + v$, $r = 0, \dots, L'-1$, and $m_r = -\lfloor (L'+1)/2 \rfloor + 1 + r$. In contrast to sampling a sparse multiband signal, this MWC MMV problem does not result from truncation; rather, its finiteness derives from the fact that multitone signals are finitely parameterised. The Fourier components of $x(t)$ can be recovered by solving (25) using several existing CS algorithms, including greedy algorithms [33], mixed norm approaches [34], MUSIC based recovery algorithms and, in particular, the approach proposed by Mishali and Eldar in [11] and [31]. One can thus successfully sample and recover sparse multitone signals with the MWC. This fact is somewhat surprising since the MWC is designed to sample multiband signals, not multitone signals.

More importantly, this example highlights a property of the MWC that differentiates it from the RD and allows it to successfully sample and recover multitone signals, as well as multiband signals. Specifically, the convolution in (24) involves shifts of $X(n)$ by integer multiples of $N$ that, when $N$ is greater than one, yield a "block convolution". By block convolution, we mean that every DFT coefficient $Y_i(n)$ in (24) is a linear combination of finite segments of $X(n)$. Block convolution is also seen in (16), where the FT of a multiband signal is shifted by integer multiples of $2\pi W'/L'$. In contrast, it is not seen in (9), where the frequency shifts describing the RD are by one.
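The block-shift structure can be checked numerically in a discrete analogue: a mixing sequence with period $N_{\mathrm{tot}}/N$ inside a length-$N_{\mathrm{tot}}$ window has a DFT supported only on multiples of $N$, so multiplying a signal by it shifts the signal's DFT only by multiples of $N$ bins. The sizes below are arbitrary.

```python
import numpy as np

Ntot, N = 256, 8                        # window length and block shift (illustrative)
rng = np.random.default_rng(2)

# random +/-1 sequence with period Ntot // N, repeated N times
p = np.tile(rng.choice([-1.0, 1.0], Ntot // N), N)

# its DFT lives only on bins that are integer multiples of N ...
P = np.fft.fft(p)
tol = 1e-8 * np.abs(P).max()
assert np.all(np.flatnonzero(np.abs(P) > tol) % N == 0)

# ... so multiplying a tone at bin 5 by p spreads it only to bins 5 + kN
x = np.exp(2j * np.pi * 5 * np.arange(Ntot) / Ntot)
Xp = np.fft.fft(x * p)
active = np.flatnonzero(np.abs(Xp) > 1e-8 * np.abs(Xp).max())
assert np.all((active - 5) % N == 0)
```

This is exactly the mechanism that lets the MWC express each output spectrum as a linear combination of a finite number of spectral blocks.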
This aspect of the MWC allows the construction of a linear system like (20) and (25) that describes the original spectrum in terms of linear combinations of these blocks. The blocks themselves represent a partition of the frequency axis that effectively discretises a signal's spectrum. In Section 4.3 below, we incorporate this property into a multi-channel random convolution system that samples and approximately recovers continuous-time block-sparse signals, the time domain analogue of sparse multiband signals.

When the observation interval, the period of $p_i(t)$, and the sampling period are equal, i.e., when $N = 1$ and $L' = M'$, (25) collapses to a SMV problem, where the matrices $\mathbf{Y}$ and $\mathbf{S}$ become the vectors $[Y_0(0), \dots, Y_{q'-1}(0)]'$ and $[\eta(m_0)X(-m_0), \dots, \eta(m_{L'-1})X(-m_{L'-1})]'$, respectively. Note that in this special case, the MWC and the RD produce equivalent SMV problems, the only difference being the timing in how the samples are acquired: the MWC collects a measurement vector in parallel (each channel samples once) while the RD collects its samples sequentially.

When a multiband signal is the assumed signal model for the RD, the system fails to produce a single measurement vector problem whose solution recovers $x(t)$. To be more concrete, let $x(t)$ be a sparse multiband signal bandlimited to $W'/2$ Hz. Consider a RD parameterised by $M$ that samples $x(t)$ on $[0, T]$, and let $p(t)$ be as described in Section 2.1. A similar analysis to that contained in Appendix 6.1 leads to the expression
$$y(k) = \frac{1}{2\pi}\sum_{m=0}^{N'/M - 1} p_{kN'/M + m}\int_{-\pi W'}^{\pi W'} X(\omega)\,\frac{e^{j\omega/W'} - 1}{j\omega}\, e^{j\frac{\omega}{W'}(kN'/M + m)}\, d\omega,$$
where $N'$ represents the number of Nyquist periods within the observation window ($N' = TW'$). Here, for simplicity, we assume $N'$ is a positive integer. This expression relates the time domain output samples $y(k)$ to the Fourier spectrum of $x(t)$.
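A discretised version of the integral relation above turns each sample $y(k)$ into a finite linear functional of grid values $X(\omega_i)$. The sketch below only assembles the matrix of integration weights for increasing grid sizes $D$ (illustrative parameter values; the random sign sequence and the sum over $m$ are omitted), to show that the number of unknowns grows with the frequency resolution while the number of equations stays fixed.

```python
import numpy as np

Wp, M, Np = 100.0, 10, 20            # stand-ins for W', M, N' (illustrative)

def rd_multiband_matrix(D):
    """Riemann-sum discretisation sketch: entry (k, i) is the weight that
    multiplies X(omega_i) in the approximation of y(k).  The sign sequence
    p and the inner sum over m are omitted for brevity."""
    d_omega = 2 * np.pi * Wp / D
    # midpoint grid (even D keeps the grid away from omega = 0)
    omega = -np.pi * Wp + d_omega * (np.arange(D) + 0.5)
    kernel = (np.exp(1j * omega / Wp) - 1) / (1j * omega)
    t = np.arange(M) * (Np / M)      # the k N'/M sample offsets
    return np.exp(1j * np.outer(t, omega / Wp)) * kernel * d_omega / (2 * np.pi)

for D in (128, 512, 2048):
    A = rd_multiband_matrix(D)
    print(A.shape)                   # (10, D): unknowns grow, equations do not
```

The point is structural: every refinement of the $\omega$ grid enlarges the SMV problem, which is the computational burden the block convolution of the MWC avoids.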
To construct a finite dimensional linear system consistent with the RD formulation, one could approximate the integral by discretizing $\omega$:
$$y(k) \approx \frac{1}{2\pi}\sum_{m=0}^{N'/M - 1} p_{kN'/M + m}\sum_{i=0}^{D-1} X(\omega_i)\,\frac{e^{j\omega_i/W'} - 1}{j\omega_i}\, e^{j\frac{\omega_i}{W'}(kN'/M + m)}\,\delta_\omega,$$
where $\delta_\omega = 2\pi W'/D$ for some positive integer $D$ and $\omega_i = -\pi W' + \delta_\omega(i + 1/2)$, $i = 0, \dots, D-1$. Clearly, in cases where one wants to closely approximate the integral or finely discretize the $\omega$ axis, the size of the SMV problem could become computationally unwieldy. In fact, examples from [20] show that naively modeling a multiband signal as a multitone signal can lead to computationally prohibitive or even intractable CS problems given current technology. In comparison to the MWC, one reason for this difficulty is that the RD does not use block convolution.

The class of continuous-time block signals $G(\mathcal{T}, t_0)$ is the set of continuous-time, real-valued, finite energy signals whose support is a finite union of bounded intervals,
$$G(\mathcal{T}, t_0) = \big\{ x(t) \in L^2([0, t_0]) \cap C([0, t_0]) : x(t) = 0,\ t \notin \mathcal{T} \big\},$$
where $C([0, t_0])$ denotes the set of continuous functions on $[0, t_0]$ and
$$\mathcal{T} = \bigcup_{i=1}^{K} [a_i, b_i), \qquad 0 \le a_i, b_i \le t_0 < \infty.$$
A continuous-time block-sparse signal is a continuous-time block signal whose support has Lebesgue measure that is small relative to the signal's overall duration, i.e., $\lambda(\mathcal{T}) \ll t_0$.

The proposed system combines block convolution with the MWC architecture and the random convolution ideas of Romberg to obtain a sampling system for continuous-time block-sparse signals (see Figure 8(a)). The resulting system can also be interpreted as the time domain analogue of the MWC. Matusiak and Eldar [40] recently proposed a sampling system targeting a nearly identical signal class. Their system also shares structural similarities to the MWC but leverages Gabor frames instead of block convolution.

Sampling System.
Let $x(t)$ be a continuous block-sparse signal on the interval $[0, T]$, let $Z$ be a Bernoulli random variable taking values $\pm 1$, and let $\{p_i(l),\ l = 0, \dots, L-1\}_{i=0}^{q-1}$ be an ensemble of random vectors drawn from $Z$ ($L, q \in \mathbb{Z}^+$). The sampling system has $q$ channels and operates in parallel like the MWC: the $i$th channel convolves $x(t)$ with $p_i(l)$, resulting in the continuous-time signal
$$g_i(t) = \sum_{l=0}^{L-1} p_i(l)\, x\big(t - l\tfrac{T}{L}\big), \qquad (26)$$
and then uniformly samples $g_i(t)$. Note that by construction the convolution in (26) is a block convolution (see Section 4.1) and is the standard filtering operation underlying waveform synthesis [39, p. 123]. Restricting the time axis to the interval $[0, T/L]$, (26) becomes
$$g_i(t)\,\mathrm{rect}\big(\tfrac{L}{T}(t - \tfrac{T}{2L})\big) = \sum_{l=0}^{L-1} p_i(l)\, x\big(t - l\tfrac{T}{L}\big)\,\mathrm{rect}\big(\tfrac{L}{T}(t - \tfrac{T}{2L})\big), \qquad (27)$$
for $i = 0, \dots, q -$
1. This system of equations is similar to the frequency domain description of the MWC (equation (16)) in that the unknowns are segments of the continuous signal of interest; here we are interested in recovering $\{x(t - l\frac{T}{L})\,\mathrm{rect}(\frac{L}{T}(t - \frac{T}{2L}))\}_l$, whereas for the MWC we want to recover $\{X(\omega - m\frac{2\pi W'}{L'})\,\mathrm{rect}(\frac{L'}{2\pi W'}\omega)\}_m$. Consequently, given the block sparsity of $x(t)$, one could proceed to solve the linear system in a manner similar to that proposed by Mishali and Eldar for the MWC. Alternatively, one could simply discretize the time axis, form a MMV problem from (27), and solve for samples of the segments $\{x(t - l\frac{T}{L})\,\mathrm{rect}(\frac{L}{T}(t - \frac{T}{2L}))\}_l$. (This type of approach was described in Section 4.2.) Either way, the solution that is obtained would be an approximation of the original block-sparse time domain signal: the MWC reconstruction method would yield a linear approximation (see Section 3.2), and discretizing the time axis would produce samples between which one would have to interpolate. Below, we present another method to compute a linear approximation that, unlike the Mishali and Eldar method, uses CS techniques to directly retrieve both the support of $x(t)$ (at resolution $T/L$) and the coefficients of the approximation simultaneously.

Representing the segments $\{x(t - l\frac{T}{L})\,\mathrm{rect}(\frac{L}{T}(t - \frac{T}{2L}))\}_l$ in an appropriate orthogonal basis, (27) may be written as
$$g_i(t)\,\mathrm{rect}\big(\tfrac{L}{T}(t - \tfrac{T}{2L})\big) = \sum_{l=0}^{L-1} p_i(l)\sum_{n=0}^{\infty}\alpha_l(n)\,\psi_n(t), \qquad (28)$$
where $\alpha_l(n) = \big\langle x(t - l\tfrac{T}{L})\,\mathrm{rect}\big(\tfrac{L}{T}(t - \tfrac{T}{2L})\big),\ \psi_n(t)\big\rangle$. Sampling at a rate $\frac{LM}{T}$
Hz over the interval $[0, \frac{T}{L}]$ yields
$$y_i(k) = g_i\big(k\tfrac{T}{LM}\big) = \sum_{l=0}^{L-1} p_i(l)\sum_{n=0}^{\infty}\alpha_l(n)\,\psi_n\big(k\tfrac{T}{LM}\big), \qquad (29)$$
for $k = 0, \dots, M-1$ and $i = 0, \dots, q-1$. Thus, to recover a $D$-term linear approximation, one needs to solve for the coefficient matrix $\mathbf{A}$ in the matrix form of (29),
$$\mathbf{Y} = \Phi\mathbf{A}\Psi \qquad (30)$$
where
$$\mathbf{Y}_{i,k} = y_i(k), \qquad \Phi_{i,l} = p_i(l), \qquad \mathbf{A}_{l,n} = \alpha_l(n), \qquad \Psi_{n,k} = \psi_n\big(k\tfrac{T}{LM}\big),$$
for $i = 0, \dots, q-1$, $k = 0, \dots, M-1$, $l = 0, \dots, L-$
1, and n = 0 , . . . , D −
1. Given the samples $\{y_i(k)\}$ and the matrices $\Phi$ and $\Psi$, one can form an MMV problem by post-multiplying both sides of (30) by the right-inverse of $\Psi$, when it exists. This yields
$$\mathbf{Y}\Psi^{\dagger} = \Phi\mathbf{A} \qquad (31)$$
where $\Psi^{\dagger}$ denotes the right-inverse of $\Psi$. The right-inverse exists if and only if the columns of $\Psi$ span $\mathbb{R}^D$ [41], which is only possible if $D \le M$, or equivalently, when the approximation order is less than or equal to the number of acquired samples. Thus, assuming $\Psi^{\dagger}$ exists, the maximum approximation order $D$ that can be recovered with this scheme equals the number of samples $M$ acquired per channel. Conversely, the minimum number of samples needed to successfully recover a $D$-order approximation is $D$. However, the desire to recover an approximation that is as accurate as possible while using a minimum number of samples suggests that setting $D = M$ is an optimal choice. Thus, in what follows, we assume $\Psi$ is a square matrix that has an inverse $\Psi^{-1}$. The linear system in (31) therefore becomes
$$\mathbf{Y}\Psi^{-1} = \Phi\mathbf{A}. \qquad (32)$$
From a CS perspective, $\Phi$ is a $q \times L$ Bernoulli sensing matrix where $q < L$, and the elements of the product $\mathbf{Y}\Psi^{-1}$ are the CS measurements. In this case, and in contrast to the RD and the MWC, these measurements are not simply the samples acquired by the system (cf. equations (19) and (23)), but are linear combinations of the samples. The coefficient matrix $\mathbf{A}$ can be solved for using any existing MMV CS solver, see e.g. [31–36]. The conditions under which a unique solution exists have been extensively studied and depend on the properties of $\Phi$ and the degree to which $\mathbf{A}$ is sparse (again, see [31–36]). Note that because $x(t)$ is assumed to be block-sparse, it follows that $\mathbf{A}$ is a jointly sparse matrix.

Letting $\widehat{\mathbf{A}}_{l,n} = \widehat{\alpha}_l(n)$ denote the entries of the CS solution of (32), $x(t)$ can be approximated (reconstructed) by computing an approximation for each segment $\{x(t - l\frac{T}{L})\,\mathrm{rect}(\frac{L}{T}(t - \frac{T}{2L}))\}_l$, $l = 0, \dots, L-1$,
$$x\big(t - l\tfrac{T}{L}\big)\,\mathrm{rect}\big(\tfrac{L}{T}(t - \tfrac{T}{2L})\big) \approx \sum_{n=0}^{D-1}\widehat{\alpha}_l(n)\,\psi_n(t),$$
and then concatenating the $L$ segment approximations.

Example.
Consider the continuous-time block sparse signal depicted in the top panel of Figure 8(b). Samples of the filtered signals $g_i(t)$ (middle panel) are acquired by an 8 channel system ($q = 8$) with a sampling rate of 400 Hz ($M = 40$, $LM/T = 400$), where the time axis is partitioned into 10 segments ($L = 10$). Using a Fourier basis, $\psi_n(t) = e^{j\frac{2\pi}{T/L}tn}$, the linear system (30) was solved using S-OMP [33], resulting in $L$ linear approximations ($D = 40$) of the segments $\{x(t - l\frac{T}{L})\,\mathrm{rect}(\frac{L}{T}(t - \frac{T}{2L}))\}_l$. The reconstructed signal is shown in the bottom panel of Figure 8(b). In this case, the reconstructed signal is a faithful representation of $x(t)$ (normalized squared error $= 0.$…). Because $x(t)$ contains discontinuities, a wavelet basis would likely provide a better approximation than a Fourier basis.

Strictly speaking, continuous-time block sparse signals are not bandlimited, but if one examines the spectrum of the test signal in Figure 8(b), one discovers that the signal is "essentially" bandlimited to about 1000 Hz. Thus, it could be argued, according to the Shannon–Nyquist sampling theorem, that a sampling rate of about 2000 Hz would be required to accurately capture this signal. Relative to 2000 Hz, 400 Hz represents a five fold reduction in sampling rate.

In this paper, we showed that the sampling mechanisms of the RD and the MWC can both be thought of as being based on the underlying concept of random filtering or random convolution. The most substantial difference between the systems stems from the specific form of their sampling functions (or random filters) and from the assumed signal models. The RD has sampling functions that have finite temporal extent but infinite spectral support; the MWC employs sampling functions that have finite spectral support but infinite temporal support.
The randomness in the sampling functions is a hallmark of CS theory that is fundamental in guaranteeing unique solutions to the underdetermined linear systems

Figure 8: (a) Schematic diagram of a continuous-time block-sparse sampling system. The system is a generalization of random convolution as originally proposed in [9]. Each channel convolves $x(t)$ with a random sequence and then samples the result at a low rate. (b) Top panel: Simulated block sparse time domain signal on the unit interval. The signal is a modified version of the "bumps" test signal from the WaveLab toolbox [42]. Middle panel: Overlay plot of the signals $g_i(t)$ resulting from the random convolution. Here, $x(t)$ was sampled with an 8 channel system. Bottom panel: Reconstructed signal approximation. Each 0.1 sec segment is a $D$-term approximation of $x(t)$ on that segment (normalized squared error $= 0.$…).

that characterise the RD and the MWC.

Block convolution is also an important property that differentiates the MWC from the RD because it is one approach that effectively processes infinite dimensional signals that have a block structure. The absence of this property is one primary reason the RD cannot, in general, reconstruct multiband signals. We incorporated block convolution into a new sampling system that samples continuous-time block-sparse signals.

We also offered two novel insights into how the MWC behaves with regard to the underlying sparsity assumption. We showed that if the number of channels is near the minimum required, relatively small changes in the number of channels (or equivalently, if the sparsity changes for a given number of channels) can cause significant reconstruction errors.
On the other hand, we provided evidence that the MWC is robust to increases in the width of the spectral bands caused by windowing.

From this paper's perspective, one can begin to consider generalisations to the RD and the MWC that target different signal classes, in particular, CS sampling systems that have different time-frequency characterisations. For example, a system that "compressively" samples radar pulses and chirps in an efficient time-frequency manner could possibly offer a means to effectively detect and classify these signals while avoiding the overhead of sampling several bands simultaneously or reconstructing the Nyquist equivalent signal. This paper takes a step towards this goal by reconciling some of the core ideas behind these sampling systems.

The following analyses yield basic time and frequency domain descriptions of the sampling strategies. We employ standard Fourier transform properties without explicit explanation for the sake of conciseness. The notational style is that of [39]. To denote transform pairs, we use the shorthand notation
$$x(t) \overset{\mathrm{FT}}{\longleftrightarrow} X(\omega),$$
and use the abbreviations FT, FS, DTFT, and DFT when referring to the Fourier transform, the Fourier series, the discrete time Fourier transform, and the discrete Fourier transform, respectively. Also recall the definitions
$$\mathrm{sinc}(x) = \sin(x)/x,\ x \in \mathbb{R}, \qquad \mathrm{rect}(x) = \begin{cases} 1, & -1/2 \le x \le 1/2 \\ 0, & \text{otherwise.}\end{cases}$$

(6) and (9)

Time domain description.
Let $x(t)$ be a sparse multitone signal on $[0, T]$ and recall the following transform pairs from Section 2.1:
$$x(t) \overset{\mathrm{FS};\,1/T}{\longleftrightarrow} X(n), \qquad p(t) \overset{\mathrm{FS};\,1/T}{\longleftrightarrow} P(n), \qquad h(t) = \mathrm{rect}\big(\tfrac{M}{T}t - \tfrac{1}{2}\big) \overset{\mathrm{FT}}{\longleftrightarrow} H(\omega) = \tfrac{T}{M}\,\mathrm{sinc}\big(\tfrac{T}{2M}\omega\big)\,e^{-j\frac{T}{2M}\omega}.$$
By inspection of Figure 1(a), the time domain description of the RD is
$$g(t) = x(t)p(t) * h(t) = \int_0^T x(\tau)p(\tau)h(t-\tau)\,d\tau = \int_{t-T/M}^{t} x(\tau)p(\tau)\,d\tau = \sum_{n=-N/2}^{N/2-1} X(n)\int_{t-T/M}^{t} p(\tau)\,e^{j\frac{2\pi}{T}n\tau}\,d\tau,$$
where $*$ denotes convolution. Sampling at $t = (k+1)\frac{T}{M}$ for $k = 0, 1, \dots, M-1$ gives
$$y(k) = g\big((k+1)\tfrac{T}{M}\big) = \sum_{n=-N/2}^{N/2-1} X(n)\int_{k\frac{T}{M}}^{(k+1)\frac{T}{M}} p(\tau)\,e^{j\frac{2\pi}{T}n\tau}\,d\tau = \sum_{n=-N/2}^{N/2-1} X(n)\sum_{m=0}^{N/M-1}\int_{k\frac{T}{M}+\frac{m}{W}}^{k\frac{T}{M}+\frac{m+1}{W}} p(\tau)\,e^{j\frac{2\pi}{T}n\tau}\,d\tau = \sum_{n=-N/2}^{N/2-1}\sum_{m=0}^{N/M-1} p\big(k\tfrac{T}{M}+\tfrac{m}{W}\big)\,X(n)\int_{k\frac{T}{M}+\frac{m}{W}}^{k\frac{T}{M}+\frac{m+1}{W}} e^{j\frac{2\pi}{T}n\tau}\,d\tau \qquad (33)$$
where each step follows from the additivity of the integral and the specific nature of $p(t)$. Evaluating the integral, we obtain
$$\int_{k\frac{T}{M}+\frac{m}{W}}^{k\frac{T}{M}+\frac{m+1}{W}} e^{j\frac{2\pi}{T}n\tau}\,d\tau = \begin{cases} T\,\dfrac{e^{j\frac{2\pi}{N}n} - 1}{j2\pi n}\,e^{j\frac{2\pi}{N}n(k\frac{N}{M}+m)}, & n \ne 0 \\[1ex] 1/W, & n = 0. \end{cases} \qquad (34)$$
Substituting (34) into (33) and letting $l = k\frac{N}{M} + m$, (33) may be rewritten as
$$y(k) = \sum_{n=-N/2}^{N/2-1}\alpha(n)\,X(n)\sum_{l=k\frac{N}{M}}^{(k+1)\frac{N}{M}-1} p_l\,e^{j\frac{2\pi}{N}nl},$$
for $k = 0, \dots, M-1$, where $p_l = p(l/W)$ and
$$\alpha(n) = \begin{cases} T\,\dfrac{e^{j\frac{2\pi}{N}n} - 1}{j2\pi n}, & n \ne 0 \\[1ex] 1/W, & n = 0. \end{cases}$$

Frequency domain description.
We also have the following frequency domain description of the RD.
Multiplication/Convolution:

$$x(t)p(t) \;\xleftrightarrow{\;\mathrm{FS};\,1/T\;}\; \sum_{m=n-N/2+1}^{n+N/2} P(m)\,X(n-m),$$

where the summation limits follow from the support of $X(n)$.

Convolution (filtering)/Multiplication:

$$g(t) = x(t)p(t) * h(t) \;\xleftrightarrow{\;\mathrm{FS};\,1/T\;}\; G(n) = \sum_{m=n-N/2+1}^{n+N/2} P(m)\,X(n-m)\,H\!\left(\tfrac{2\pi}{T}n\right) = \frac{T}{M}\sum_{m=n-N/2+1}^{n+N/2} P(m)\,X(n-m)\, e^{-j\frac{\pi}{M}n}\,\mathrm{sinc}\!\left(\tfrac{\pi}{M}n\right).$$

Sampling/Aliasing:

$$y(k) = g\!\left((k+1)\tfrac{T}{M}\right) \;\xleftrightarrow{\;\mathrm{DFT};\,M\;}\; Y(n) = M\sum_{l=-\infty}^{\infty} G(n-lM).$$

Because $Y(n)$ is $M$-periodic, we can, without loss of information, restrict it to one period. This means we need only consider one term in the summation over $l$. Retaining the $l=0$ term yields

$$Y(n) = T \sum_{m=-\infty}^{\infty} P(m)\, e^{-j\frac{\pi}{M}n}\,\mathrm{sinc}\!\left(\tfrac{\pi}{M}n\right) X(n-m), \qquad n = 0, \ldots, M-1. \tag{17}$$
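The sampling/aliasing step can be isolated and checked on any $1/T$-periodic signal with finitely many harmonics. In the sketch below (toy values $M=4$ and $K=10$ are assumptions), we carry explicitly the linear phase $e^{j2\pi n/M}$ that arises because the samples are taken at $t=(k+1)T/M$ rather than $t=kT/M$:

```python
import numpy as np

rng = np.random.default_rng(1)
T, M, K = 1.0, 4, 10                     # assumed: period, samples/period, max harmonic
n = np.arange(-K, K + 1)
G = rng.standard_normal(n.size) + 1j * rng.standard_normal(n.size)  # FS coeffs of g

def g(t):                                # g(t) = sum_n G(n) e^{j 2 pi n t / T}
    return (G * np.exp(2j * np.pi * n * t / T)).sum()

y = np.array([g((k + 1) * T / M) for k in range(M)])  # RD instants t = (k+1)T/M
Y = np.fft.fft(y)                        # Y(nu) = sum_k y(k) e^{-j 2 pi nu k / M}

# Folding: Y(nu) = M e^{j 2 pi nu / M} * (sum of G(n) over n congruent to nu mod M)
Y_fold = np.array([M * np.exp(2j * np.pi * nu / M) * G[n % M == nu].sum()
                   for nu in range(M)])

assert np.allclose(Y, Y_fold)
```

With samples taken at $t=kT/M$ instead, the phase factor is absent and the relation reduces to $Y(n) = M\sum_l G(n-lM)$.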
Let $x(t)$ be a sparse multiband signal and recall the following transform pairs from Section 2.2:

$$x(t) \;\xleftrightarrow{\;\mathrm{FT}\;}\; X(\omega), \qquad p_i(t) \;\xleftrightarrow{\;\mathrm{FS};\,W'/L'\;}\; P_i(m), \qquad h(t) = \tfrac{W'}{L'}\,\mathrm{sinc}\!\left(\tfrac{\pi W'}{L'}t\right) \;\xleftrightarrow{\;\mathrm{FT}\;}\; H(\omega) = \mathrm{rect}\!\left(\tfrac{\omega L'}{2\pi W'}\right).$$

We then have the following time and frequency domain descriptions for the $i$-th channel, $i = 0, \ldots, q'-1$.

Multiplication/Convolution:

$$x(t)p_i(t) \;\xleftrightarrow{\;\mathrm{FT}\;}\; \sum_{m=-\infty}^{\infty} P_i(m)\,X(\omega - m\omega_p) = \sum_{m=\lceil(\omega-\pi W')/\omega_p\rceil+1}^{\lfloor(\omega+\pi W')/\omega_p\rfloor} P_i(m)\,X(\omega - m\omega_p),$$

where $\omega_p = 2\pi W'/L'$ radians per second. The summation limits are finite for a given $\omega$ because $x(t)$ is assumed to be bandlimited.

Convolution (filtering)/Multiplication:

$$g_i(t) = x(t)p_i(t) * h(t) \;\xleftrightarrow{\;\mathrm{FT}\;}\; G_i(\omega) = \sum_{m=\lceil(\omega-\pi W')/\omega_p\rceil+1}^{\lfloor(\omega+\pi W')/\omega_p\rfloor} P_i(m)\,X(\omega-m\omega_p)\,H(\omega) = \sum_{m=-\lfloor(L'+1)/2\rfloor+1}^{\lfloor(L'+1)/2\rfloor} P_i(m)\,X(\omega-m\omega_p)\,\mathrm{rect}\!\left(\tfrac{\omega}{\omega_p}\right).$$

Note that the low pass filter windows $X(\omega)$ and its translates (i.e., restricts them to the interval $[-\omega_p/2, \omega_p/2]$), which removes the dependence on $\omega$ in the summation limits.

Sampling/Aliasing:

$$y_i(k) = g_i(kT_s) \;\xleftrightarrow{\;\mathrm{DTFT}\;}\; Y_i\!\left(e^{j\omega M'/W'}\right) = \frac{W'}{M'} \sum_{n=-\infty}^{\infty} G_i(\omega + n\omega_s). \tag{35}$$

Observe that the translates of $G_i(\omega)$ in (35) do not overlap because $G_i(\omega)$ is bandlimited to $[-\omega_p/2, \omega_p/2]$ and $\omega_p \le \omega_s$ (see Section 2.2). We can therefore, without loss of information, restrict $Y_i(e^{j\omega M'/W'})$ to one period $\bigl(Y_i(e^{j\omega M'/W'})$ is $2\pi W'/M'$-periodic$\bigr)$. This means we need only consider one term in the summation over $n$ in (35). We choose to retain the $n=0$ term and thus have the DTFT

$$Y_i\!\left(e^{j\omega M'/W'}\right)\mathrm{rect}\!\left(\tfrac{M'}{2\pi W'}\omega\right) = \frac{W'}{M'} \sum_{m=-\lfloor(L'+1)/2\rfloor+1}^{\lfloor(L'+1)/2\rfloor} P_i(m)\,X(\omega-m\omega_p)\,\mathrm{rect}\!\left(\tfrac{\omega}{\omega_p}\right).$$
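The mixing step can be illustrated numerically: multiplying by the $T_p$-periodic $p_i(t)$ translates the spectrum of $x(t)$ by multiples of $\omega_p$ with weights $P_i(m)$. In the sketch below (toy values $L'=4$, $W'=8$ and a single tone at $\omega_0 = c\,\omega_p$ are assumptions), the Fourier series coefficient of $x(t)p_i(t)$ at harmonic $k$ equals $P_i(k-c)$:

```python
import numpy as np

rng = np.random.default_rng(3)
Lp, Wp, c = 4, 8.0, 3                    # assumed toy L', W'; tone at omega_0 = c*omega_p
Tp = Lp / Wp                             # period of p_i(t)
wp = 2 * np.pi * Wp / Lp                 # omega_p = 2 pi W'/L' = 2 pi / Tp
p_il = rng.choice([-1.0, 1.0], Lp)       # chip values of p_i(t), constant per chip

def trapz(y, t):                         # trapezoid rule (NumPy-version independent)
    return ((y[1:] + y[:-1]) * np.diff(t)).sum() / 2

def fs_coeff(f, k):                      # k-th FS coefficient of f(t)*p_i(t) over [0, Tp]
    acc = 0.0
    for l in range(Lp):                  # integrate chip by chip (smooth integrand)
        t = np.linspace(l * Tp / Lp, (l + 1) * Tp / Lp, 2001)
        acc += p_il[l] * trapz(f(t) * np.exp(-2j * np.pi * k * t / Tp), t)
    return acc / Tp

tone = lambda t: np.exp(1j * c * wp * t)          # x(t) = e^{j omega_0 t}
ones = lambda t: np.ones_like(t, dtype=complex)   # fs_coeff(ones, m) recovers P_i(m)

for k in range(-4, 9):
    assert abs(fs_coeff(tone, k) - fs_coeff(ones, k - c)) < 1e-9
```

A general multiband $x(t)$ is a superposition of such tones, so by linearity its spectrum is translated and weighted in exactly the same way.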
The Fourier series coefficients of $p_i(t)$ can now be directly computed,

$$P_i(m) = \frac{1}{T_p}\int_0^{T_p} p_i(t)\, e^{-j\frac{2\pi}{T_p}mt}\,dt = \frac{1}{T_p}\sum_{l=0}^{L'-1}\int_{lT_p/L'}^{(l+1)T_p/L'} p_{il}\, e^{-j\frac{2\pi}{T_p}mt}\,dt = \begin{cases} \displaystyle\sum_{l=0}^{L'-1} \frac{p_{il}}{j2\pi m}\left(1 - e^{-j\frac{2\pi}{L'}m}\right) e^{-j\frac{2\pi}{L'}ml}, & m \neq 0 \\[2ex] \dfrac{1}{L'}\displaystyle\sum_{l=0}^{L'-1} p_{il}, & m = 0, \end{cases} \tag{36}$$

where $p_{il} = p_i(t)$ for $t \in [l/W', (l+1)/W')$, to obtain

$$Y_i\!\left(e^{j\omega M'/W'}\right)\mathrm{rect}\!\left(\tfrac{M'}{2\pi W'}\omega\right) = \sum_{m=-\lfloor(L'+1)/2\rfloor+1}^{\lfloor(L'+1)/2\rfloor} \beta(m)\, X\!\left(\omega - m\tfrac{2\pi W'}{L'}\right)\mathrm{rect}\!\left(\tfrac{L'}{2\pi W'}\omega\right) \sum_{l=0}^{L'-1} p_{il}\, e^{-j\frac{2\pi}{L'}ml}, \tag{24}$$

where

$$\beta(m) = \begin{cases} \dfrac{W'}{M'}\cdot\dfrac{1 - e^{-j\frac{2\pi}{L'}m}}{j2\pi m}, & m \neq 0 \\[2ex] \dfrac{W'}{M'}\cdot\dfrac{1}{L'}, & m = 0. \end{cases}$$

Let $x(t)$ be a sparse multitone signal on $[0,T]$ and recall the following transform pairs and parameter relations from Section 4.1:

$$x(t) \;\xleftrightarrow{\;\mathrm{FS};\,1/T\;}\; X(n), \qquad p_i(t) \;\xleftrightarrow{\;\mathrm{FS};\,1/T_p\;}\; P_i(m), \qquad h(t) = \tfrac{W}{L'}\,\mathrm{sinc}\!\left(\tfrac{\pi W}{L'}t\right) \;\xleftrightarrow{\;\mathrm{FT}\;}\; H(\omega) = \mathrm{rect}\!\left(\tfrac{L'}{2\pi W}\omega\right),$$

$$T = \frac{NL'}{W},\;\; N \in \mathbb{Z},\; N > 0, \qquad T_p = \frac{L'}{W}, \qquad T_s = \frac{M'}{W}.$$

Also recall the simplifying assumption $L' = M'$. In the following analysis, we use the periodic extension of $x(t)$, denoted by $\tilde{x}(t)$, because it is defined for all $t \in \mathbb{R}$ (thereby simplifying calculations) and has the same FS coefficients as $x(t)$. Its use does not imply that $x(t)$ must be replicated before the MWC samples it. Rather it implies a discretisation step of the frequency axis that must otherwise be explicitly carried out to form the MMV problem in (25).

Multiplication/Convolution:

$$\tilde{x}(t)p_i(t) \;\xleftrightarrow{\;\mathrm{FS};\,1/T\;}\; \sum_{m=\lceil\frac{1}{N}(n-\frac{NL'}{2})\rceil+1}^{\lfloor\frac{1}{N}(n+\frac{NL'}{2})\rfloor} P_i(m)\,X(n - Nm).$$

Note that the left hand side is defined for all $t \in \mathbb{R}$ and the right is defined for all $n \in \mathbb{Z}$. The summation limits are finite (for a given $n$) because the harmonics of $x(t)$ are assumed to be bounded.
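The closed form (36) derived above can be checked against direct numerical quadrature of the defining integral; the toy values below ($L'=5$, $W'=10$, a random $\pm 1$ chip pattern) are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
Lp, Wp = 5, 10.0                         # assumed toy L' and W'
Tp = Lp / Wp                             # period of p_i(t)
p_il = rng.choice([-1.0, 1.0], Lp)       # chip values p_il on [l/W', (l+1)/W')

def P_closed(m):                         # the closed form (36)
    if m == 0:
        return p_il.sum() / Lp
    l = np.arange(Lp)
    return ((1 - np.exp(-2j * np.pi * m / Lp)) / (2j * np.pi * m)
            * (p_il * np.exp(-2j * np.pi * m * l / Lp)).sum())

def P_quad(m):                           # (1/Tp) * integral of p_i(t) e^{-j 2 pi m t/Tp}
    acc = 0.0
    for l in range(Lp):                  # integrate chip by chip (smooth integrand)
        t = np.linspace(l * Tp / Lp, (l + 1) * Tp / Lp, 2001)
        y = np.exp(-2j * np.pi * m * t / Tp)
        acc += p_il[l] * ((y[1:] + y[:-1]) * np.diff(t)).sum() / 2
    return acc / Tp

for m in range(-7, 8):
    assert abs(P_closed(m) - P_quad(m)) < 1e-5
```

The same closed form drives the multitone analysis that follows, since $p_i(t)$ and its chip structure are unchanged.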
Convolution (filtering)/Multiplication:

$$g_i(t) = \tilde{x}(t)p_i(t) * h(t) \;\xleftrightarrow{\;\mathrm{FS};\,1/T\;}\; G_i(n) = \sum_{m=\lceil\frac{1}{N}(n-\frac{NL'}{2})\rceil+1}^{\lfloor\frac{1}{N}(n+\frac{NL'}{2})\rfloor} P_i(m)\,X(n-Nm)\,H\!\left(\tfrac{2\pi}{T}n\right) = \sum_{m=-\lfloor(L'+1)/2\rfloor+1}^{\lfloor(L'+1)/2\rfloor} P_i(m)\,X(n-Nm)\,\mathrm{rect}\!\left(\tfrac{n}{N}\right),$$

where we note that $g_i(t)$ is $1/T$-periodic and $H\!\left(\tfrac{2\pi}{T}n\right) = H(\omega)\big|_{\omega=2\pi n/T}$. Note that the low pass filter windows $X(n)$ and its translates, i.e., restricts them to the interval $[-\pi W/L', \pi W/L']$ (or $[-N/2, N/2]$ in $n$), which removes the dependence on $n$ in the summation limits.

Sampling/Aliasing:

$$y_i(k) = g_i(kT_s) \;\xleftrightarrow{\;\mathrm{DFT};\,N\;}\; Y_i(n) = \frac{1}{N}\sum_{l=-\infty}^{\infty} G_i(n - lN). \tag{37}$$

Because $G_i(n)$ is "bandlimited" to $[-N/2, N/2]$, we need only retain one period of $Y_i(n)$, or equivalently, only one term in the above summation. Choosing the $l=0$ term, (37) becomes

$$Y_i(n) = \frac{1}{N}\sum_{m=-\lfloor(L'+1)/2\rfloor+1}^{\lfloor(L'+1)/2\rfloor} P_i(m)\,X(n-Nm)\,\mathrm{rect}\!\left(\tfrac{n}{N}\right), \tag{38}$$

for $-N/2 \le n < N/2$. By substituting the expression for $P_i(m)$ from (36) into (38), we have the relation

$$Y_i(n) = \sum_{m=-\lfloor(L'+1)/2\rfloor+1}^{\lfloor(L'+1)/2\rfloor} \eta(m)\,X(n-Nm)\,\mathrm{rect}\!\left(\tfrac{n}{N}\right)\sum_{l=0}^{L'-1} p_{il}\, e^{-j\frac{2\pi}{L'}lm},$$

where $p_{il} = p_i(t)$ for $t \in [l/W, (l+1)/W)$ and

$$\eta(m) = \begin{cases} \dfrac{1}{N}\cdot\dfrac{1 - e^{-j\frac{2\pi}{L'}m}}{j2\pi m}, & m \neq 0 \\[2ex] \dfrac{1}{N}\cdot\dfrac{1}{L'}, & m = 0. \end{cases}$$

References

[1] D. Donoho, "Compressed sensing,"
IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[2] E. J. Candès, "Compressive sampling," in Proc. Int. Congr. Math., 2006, vol. 3, pp. 1433–1452.
[3] E. J. Candès and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21–30, Mar. 2008.
[4] E. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.
[5] J. Tropp, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
[6] R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde, "Model-based compressive sensing," IEEE Trans. Inf. Theory, vol. 56, no. 4, pp. 1982–2001, Apr. 2010.
[7] Y. C. Eldar, "Compressed sensing of analog signals in shift-invariant spaces," IEEE Trans. Signal Process., vol. 57, no. 8, pp. 2986–2997, Aug. 2009.
[8] Y. M. Lu and M. N. Do, "A theory for sampling signals from a union of subspaces," IEEE Trans. Signal Process., vol. 56, no. 6, pp. 2334–2345, Jun. 2008.
[9] J. Romberg, "Compressive sensing by random convolution," SIAM J. Imaging Sci., vol. 2, no. 4, pp. 1098–1128, 2009.
[10] J. Tropp, J. Laska, M. Duarte, J. Romberg, and R. Baraniuk, "Beyond Nyquist: Efficient sampling of sparse bandlimited signals," IEEE Trans. Inf. Theory, vol. 56, no. 1, pp. 520–544, 2010.
[11] M. Mishali and Y. C. Eldar, "From theory to practice: Sub-Nyquist sampling of sparse wideband analog signals," IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 375–391, 2010.
[12] M. Duarte, M. Davenport, D. Takhar, J. Laska, T. Sun, K. Kelly, and R. Baraniuk, "Single-pixel imaging via compressive sampling," IEEE Signal Process. Mag., vol. 25, no. 2, pp. 83–91, Mar. 2008.
[13] J. A. Tropp, M. B. Wakin, M. F. Duarte, D. Baron, and R. G. Baraniuk, "Random filters for compressive sampling and reconstruction," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), 2006, pp. 872–875.
[14] S. Kirolos, J. Laska, M. Wakin, M. Duarte, D. Baron, T. Ragheb, Y. Massoud, and R. Baraniuk, "Analog-to-information conversion using random demodulation," in Proc. 2006 IEEE Dallas/CAS Workshop on Design, Applications, Integration and Software, Oct. 2006, pp. 71–74.
[15] J. Laska, S. Kirolos, M. Duarte, T. Ragheb, R. Baraniuk, and Y. Massoud, "Theory and implementation of an analog-to-information converter using random demodulation," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), May 2007, pp. 1959–1962.
[16] M. Mishali and Y. C. Eldar, "Xampling: Analog data compression," in Proc. Data Compression Conference (DCC), Mar. 2010, pp. 366–375.
[17] Y. Chen, M. Mishali, Y. C. Eldar, and A. O. Hero, "Modulated wideband converter with non-ideal lowpass filters," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), Mar. 2010, pp. 3630–3633.
[18] M. Mishali, A. Elron, and Y. C. Eldar, "Sub-Nyquist processing with the modulated wideband converter," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), Mar. 2010, pp. 3626–3629.
[19] Z. Yu, S. Hoyos, and B. M. Sadler, "Mixed-signal parallel compressed sensing and reception for cognitive radio," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), Apr. 2008, pp. 3861–3864.
[20] M. Mishali, Y. C. Eldar, and A. Elron, "Xampling: Signal acquisition and processing in union of subspaces," [Online]. Available: arXiv:0911.0519, CCIT report.
[21] Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), to appear 2011.
[22] M. Unser, "Sampling - 50 years after Shannon," Proc. IEEE, vol. 88, no. 4, pp. 569–587, Apr. 2000.
[23] Y. Bresler, "Spectrum-blind sampling and compressive sensing for continuous-index signals," in Proc. Information Theory and Applications Workshop, Jan. 2008, pp. 547–554.
[24] Y. Chi, A. Pezeshki, L. Scharf, and R. Calderbank, "Sensitivity to basis mismatch in compressed sensing," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), Mar. 2010, pp. 3930–3933.
[25] M. Duarte and R. Baraniuk, "Spectral compressive sensing," preprint, 2010. [Online]. Available: dsp.rice.edu/cs
[26] T. Blumensath and M. Davies, "Iterative thresholding for sparse approximations," J. Fourier Anal. Appl., vol. 14, no. 5, pp. 629–654, 2008.
[27] T. Blumensath and M. Davies, "Iterative hard thresholding for compressed sensing," Appl. Comput. Harmon. Anal., vol. 27, no. 3, pp. 265–274, 2009.
[28] P. Feng, Universal Minimum-Rate Sampling and Spectrum-Blind Reconstruction for Multiband Signals, Ph.D. dissertation, University of Illinois at Urbana-Champaign, 1997.
[29] M. Stojnic, F. Parvaresh, and B. Hassibi, "On the reconstruction of block-sparse signals with an optimal number of measurements," IEEE Trans. Signal Process., vol. 57, no. 8, pp. 3075–3085, Aug. 2009.
[30] Y. C. Eldar, P. Kuppinger, and H. Bölcskei, "Block-sparse signals: Uncertainty relations and efficient recovery," IEEE Trans. Signal Process., vol. 58, no. 6, pp. 3042–3054, Jun. 2010.
[31] M. Mishali and Y. C. Eldar, "Reduce and boost: Recovering arbitrary sets of jointly sparse vectors," IEEE Trans. Signal Process., vol. 56, no. 10, pp. 4692–4702, Oct. 2008.
[32] S. F. Cotter, B. D. Rao, K. Engan, and K. Kreutz-Delgado, "Sparse solutions to linear inverse problems with multiple measurement vectors," IEEE Trans. Signal Process., vol. 53, no. 7, pp. 2477–2488, Jul. 2005.
[33] J. A. Tropp, A. C. Gilbert, and M. J. Strauss, "Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit," Signal Processing, vol. 86, no. 3, pp. 572–588, 2006.
[34] J. A. Tropp, A. C. Gilbert, and M. J. Strauss, "Algorithms for simultaneous sparse approximation. Part II: Convex relaxation," Signal Processing, vol. 86, no. 3, pp. 589–602, 2006.
[35] J. Chen and X. Huo, "Theoretical results on sparse representations of multiple-measurement vectors," IEEE Trans. Signal Process., vol. 54, no. 12, pp. 4634–4643, Dec. 2006.
[36] M. Davies and Y. C. Eldar, "Rank awareness in joint sparse recovery," Apr. 2010. [Online]. Available: arXiv:1004.4529v1 [cs.IT]
[37] S. Mallat, A Wavelet Tour of Signal Processing, 2nd ed., Academic Press, 1999.
[38] E. J. Candès, Y. C. Eldar, D. Needell, and P. Randall, "Compressed sensing with coherent and redundant dictionaries," Appl. Comput. Harmon. Anal., in press, 2010.
[39] R. Roberts and C. Mullis, Digital Signal Processing, Addison-Wesley, 1987.
[40] E. Matusiak and Y. Eldar, "Sub-Nyquist sampling of short pulses," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), to appear 2011.
[41] G. Strang, Linear Algebra and Its Applications, 3rd ed., Harcourt Brace & Company, 1988.
[42] "Wavelab 850: Wavelet analysis toolbox maintained at Stanford University," [Online]. Available: ∼wavelab