Estimation of Bandlimited Signals in Additive Gaussian Noise: a "Precision Indifference" Principle
aa r X i v : . [ c s . I T ] N ov Estimation of Bandlimited Signals in AdditiveGaussian Noise: a “Precision Indifference” Principle
Animesh Kumar and Vinod M. Prabhakaran Department of Electrical Engineering, Indian Institute of Technology Bombay, India School of Technology and Computer Science, Tata Institute of Fundamental Research, Bombay, IndiaEmails: [email protected], [email protected]
Abstract
The sampling, quantization, and estimation of a bounded dynamic-range bandlimited signal affected by additive independentGaussian noise is studied in this work. For bandlimited signals, the distortion due to additive independent Gaussian noise canbe reduced by oversampling (statistical diversity). The pointwise expected mean-squared error is used as a distortion metric forsignal estimate in this work. Two extreme scenarios of quantizer precision are considered: (i) infinite precision (real scalars);and (ii) one-bit quantization (sign information). If N is the oversampling ratio with respect to the Nyquist rate, then the optimallaw for distortion is O (1 /N ) . We show that a distortion of O (1 /N ) can be achieved irrespective of the quantizer precision byconsidering the above-mentioned two extreme scenarios of quantization. Thus, a quantization precision indifference principle isdiscovered, where the reconstruction distortion law, up to a proportionality constant, is unaffected by quantizer’s accuracy. Index Terms bandlimited signals, sampling, estimation, quantization
I. I
NTRODUCTION
Consider a bandlimited signal (or field) quantization problem, where the samples are affected by additive independent andidentically distributed (i.i.d.) Gaussian noise. For example, a spatial signal affected by additive i.i.d. Gaussian noise has to besampled using an array of sensors. In a distributed setup, where filtering before sampling is not possible, noise in bandlimitedsignals can be reduced by statistical averaging of independent noisy samples. In addition, quantization error can be reduced byoversampling as well as by increasing the analog-to-digital converter (ADC) or quantizer precision. The fundamental tradeoff between oversampling, quantizer precision, and (statistical) average distortion is of interest.The tradeoffs between all subsets of these three quantities has been studied in the literature. Tradeoffs for average distortionand oversampling, additive Gaussian noise and average distortion with unquantized samples, and oversampling and quantizationhave a flurry of work (e.g., see [1], [2], [3], [4], [5]). In this work, the tradeoff between oversampling, quantizer precision,and the average distortion is of interest.If extremely high-precision ADCs are used, then the sample distortion is noise limited. On the other hand, if lowest precisionsingle-bit ADCs are used, then the sample distortion is limited by quantization. At a high-level, it is expected that the distortionoptimal ADC precision should be ‘in between’ these two extreme cases, in the sense that it should be able to resolve thesignal up to the noise level. Contrary to this intuition, in this work it is shown that a distortion inversely proportional to theoversampling above the Nyquist rate is achievable with single-bit quantizers. With unquantized (infinite precision) samples,the optimal distortion is speculated to be inversely proportional to the oversampling above the Nyquist rate in the presence ofindependent Gaussian noise. Accordingly, the focus of this work is on the quantization of an additive independent Gaussiannoise affected bandlimited signal using single-bit ADCs and oversampling. The key result of this paper is the uncovering ofa quantization precision indifference principle, which is stated next.
Precision indifference principle:
Consider a bounded-dynamic range bandlimited signal with samples affected by additiveindependent Gaussian noise and observed through quantizers. If N is the oversampling ratio, with respect to the Nyquist rate,then the optimal law for maximum pointwise mean-squared error is O (1 /N ) , irrespective of the quantizer precision. In otherwords, for large N , the quantizer precision only affects the proportionality constant of the distortion. Prior art:
Averaging and other properties of independent random variables are well studied in statistics [6]. Quantizationerror can be reduced by oversampling as well as by increasing the ADC (quantizer) precision (see [3], [7], [4], [5] for theentire range of results). Estimation of square-integrable signals in the presence of Gaussian noise was studied by Pinsker [2];however, quantization is not addressed in his work. Signal quantization with additive noise as a dither has been studied byMasry [8], however signal was not assumed to be bandlimited. Masry’s results give a decay of O (1 /N / ) for bandlimitedsignals, where N is the oversampling above the Nyquist rate; this decay is slower than an O (1 /N ) decay that we are after.The sampling of signals defined on a finite support, while using single-bit quantizers in the presence of ambient noise, hasbeen also studied [9], [10]. Notation:
The set of bounded signals and the set of finite energy signals will be denoted by L ∞ ( R ) and L ( R ) , respectively.The signal of interest will be denoted by g ( t ) . For a signal s ( t ) in L ( R ) , the Fourier transform will be denoted by ˜ s ( ω ) . TheFourier transform and its inverse are defined as, ˜ s ( ω ) = Z R s ( t ) exp( − jωt ) d t ; s ( t ) = 12 π Z R ˜ s ( ω ) exp( jωt ) d ω. The indicator function of a set A is denoted by ( x ∈ A ) . Random variables or processes will be denoted by uppercase letters.The additive independent Gaussian noise is denoted by W ( t ) . The set of reals and integers will be denoted by R and Z ,respectively. The cumulative distribution function (cdf) of a Gaussian random variable with mean zero and variance σ will bedenoted by F ( x ) , x ∈ R . The convolution and expectation operation will be denoted by ⋆ and E , respectively. It is assumed thatall probability models have an underlying sample space, sigma-field, and probability measure such that (weighted) averagesand indicator functions are measurable. Organization:
The mathematical formulation of our sampling problem is discussed in Sec. II. Short review of stableinterpolation kernels and smoothness properties of associated signals are discussed in Sec. III. The discussion on precisionindifference principle appears in Sec. IV. Estimation with perfect samples and single-bit quantized samples are discussed inSec. IV-A and Sec. IV-B, respectively. Conclusions are presented in Sec. V. To maintain the flow of the paper, long proofsappear in the Appendix. II. P
ROBLEM FORMULATION
The discussion begins with a quick review of a stable bandlimited kernel, which is essential for stable interpolation in L ∞ ( R ) and for defining bandlimited signals. For λ > and a = ( λ − / , consider the kernel φ ( t ) that is given by φ ( t ) = 1 πat sin(( π + a ) t ) sin( at ); φ (0) = 1 + aπ . (1)The kernel decreases sufficiently fast (approximately as /t ) and therefore it is absolutely and square integrable. Its Fouriertransform is illustrated in Fig. 1. This kernel can be used to define the set of bounded bandlimited signals, which is a subset − λπ λπ − π π ˜ φ ( ω ) ω Fig. 1.
Stable interpolation filter:
The kernel φ ( t ) ↔ ˜ φ ( ω ) is defined in (1); this kernel is absolutely integrable and will be used to defineboundedbandlimitedsignals. of the Zakai class of bandlimited signals [11]. Consider BL int := { g ( t ) : | g ( t ) | ≤ and g ( t ) ⋆ φ ( t ) = g ( t ) ∀ t ∈ R } . (2)The above definition ensures that g ( t ) is continuous everywhere. It is easy to verify that the set of bounded bandlimited signalsin L ( R ) with Fourier spectrum zero outside [ − π, π ] also belongs to the set BL int . The set BL int also includes (almost-surely)any sample path of a bounded-dynamic range bandlimited wide-sense stationary process [12]. The quantization of bandlimitedsignals from the set BL int in the presence of additive independent Gaussian noise is studied in this work. The derived results areapplicable to finite energy bounded bandlimited signals as well as (almost surely) to any sample path of a bounded wide-sensestationary bandlimited process.
The signal affected by additive noise, g ( t ) + W ( t ) , is available for sampling. It is assumed that W ( t ) ∼ N (0 , σ ) for all t ∈ R . Independence of noise implies that W ( t ) , W ( t ) , . . . , W ( t n ) for distinct t , t , . . . , t n ∈ R are i.i.d. with N (0 , σ ) distribution. The Nyquist rate at which g ( t ) should be sampled for perfect reconstruction is one sample/second. In the noise-freeregime, when σ = 0 , it is sufficient to sample g ( t ) at the Nyquist rate for convergence in L ∞ ( R ) . In the noise-limited regime,when σ > , the reconstruction based on samples of g ( t ) will have distortion (statistical mean-squared error). This distortioncan be reduced by oversampling. Let N , a positive integer, be the oversampling rate. For any statistical estimate b G rec ( t ) ofthe signal g ( t ) , the maximum pointwise mean-squared error D rec is defined as the distortion , i.e., D rec := sup t ∈ R D rec ( t ) = sup t ∈ R E (cid:12)(cid:12)(cid:12) b G rec ( t ) − g ( t ) (cid:12)(cid:12)(cid:12) . (3)For a pointwise-consistent reconstruction, the distortion in (3) should decrease to zero as the oversampling rate N increasesto infinity [6]. Consistent reconstruction of smooth signals with a random dither, in the presence of single-bit quantizers, has been obtained in the past [8]; therefore, the asymptotic rate of decrease in D rec with N is of interest to us. Due tofinite precision limitations (ADC operation) during acquisition, the signal samples are quantized. Since quantization is a lossyoperation [13], D rec is expected to depend upon the ADC precision employed. As mentioned in Sec. I, it will be shown that D rec decreases as O (1 /N ) , irrespective of the sensor precision. Thus, the ADC precision only manifests in the proportionalityconstant (independent of the oversampling factor N ) in the optimal asymptotic reconstruction distortion.To show the proposed precision indifference principle, two extreme cases of quantizer precision will be analyzed and theirdistortion will be compared: (i) signal distortion with perfect samples; and (ii) signal distortion with samples quantized usingsingle-bit ADCs. The sampling setup for these two cases are illustrated in Fig. 2. In Fig. 2(a), the estimator works with infiniteprecision (unquantized) noisy samples while in Fig. 2(b), the estimator works with poorest precision (one-bit) noisy samples.The role of extra dither noise W d ( t ) will be explained later in Sec. IV-B. The estimator b G ( t ) will be designed and itsdistortion performance will be analyzed in this work. τ = 1 / ( λN ) τ { X ( nτ ) } ( x ≥ estimator b G ( t ) (a)(b) + g ( t ) W ( t ) b G ( t ) estimator τ = 1 / ( λN ) τ { Y ( nτ ) } + g ( t ) W ( t ) W d ( t )+ Fig. 2.
Two extreme scenarios of quantization:
In both the scenarios the signal g ( t ) is observed with additive independent Gaussian noise W ( t ) . In (a), the estimator works with infinite precision (unquantized) samples { Y ( nτ ) , n ∈ Z } . In (b), the estimator works with poorestprecision(one-bit)samples { X ( n ) , n ∈ Z } where X ( nτ ) = ( Y ( nτ ) ≥ . Before we move on to the next section, it should be noted that the kernel φ ( t ) and its derivative φ ′ ( t ) are absolutelyintegrable. This absolute integrability and square integrability of φ ( t ) can be translated into the following observations, whichwill be useful in Sec. IV-B during distortion analysis: C φ := Z t ∈ R | φ ( t ) | d t < ∞ , (4) C ′ φ := sup { t k : t k ∈ [ k/λ, ( k +1) /λ ] ,k ∈ Z } X k ∈ Z | φ ′ ( t k ) | < ∞ . (5)and C ′′ φ := sup t ∈ R X k ∈ Z (cid:12)(cid:12)(cid:12)(cid:12) φ (cid:18) t − kλ (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) < ∞ . (6)The next section will review pertinent mathematical results which will be used in the later sections.III. B ACKGROUND
The stable interpolation formula for Zakai sense bandlimited signals is discussd first. The necessity of W d and associatedvariance-conditions on Gaussian noise (see Fig. 2(b)) are discussed. If the pointwise error in interpolation is bounded withbounded perturbation of samples, it is called as a stable interpolation . The properties of stable interpolation and its implicationson filtering of bounded signals are given at the end of this section.¿From the interpolation formula for Zakai sense bandlimited signals, the signal of interest g ( t ) can be perfectly reconstructedfrom its samples taken at the Nyquist rate. For g ( t ) ∈ BL int , the interpolation formula is given by [5, Lemma 3.1], g ( t ) = λ X n ∈ Z g (cid:16) nλ (cid:17) φ (cid:16) t − nλ (cid:17) . (7)where the equality holds absolutely, pointwise, and in L ∞ ( R ) . Thus, in the absence of noise, it is sufficient to sample g ( t ) ata rate of λ sample per second (or per meter in the context of spatial fields). In the presence of quantization, the reconstructionin (7) is stable in L ∞ ( R ) .The role of W d ( t ) in Fig. 2(b) will now be highlighted. If the noise variance σ is very small compared to the dynamic rangeof the signal g ( t ) , i.e., | σ | ≪ , then the samples ( g ( t ) + W ( t ) ≥ will not capture small scale local variation in g ( t ) . Dueto quantization, estimators such as maximum likelihood are expected to be non-linear and their analysis is too complex. To alleviate this issue, if var ( W ( t )) = σ is very small, then an extra additive independent Gaussian dither W d ( t ) can be addedto ensure that ( g ( t ) + W ( t ) + W d ( t ) ≥ is sufficiently random. It is assumed that W d ( t ) and W ( t ) are independent. Suchdithering allows us to use an analytically tractable reconstruction procedure, which has order-optimal distortion . The block-diagram for sampling with one-bit ADCs is illustrated in Fig. 2(b). The technical condition on σ = var ( W ( t )) + var ( W d ( t )) is stated using the cdf of W + W d . Let F : R → [0 , be the cdf of W + W d . Let f ( x ) be the associated probability densityfunction with f ( ± C φ ) = δ and f (0) = ∆ . Observe that ∆ > δ , since f ( x ) = √ πσ exp( − x / σ ) . It is required that thereis a parameter µ > such that − √ C φ ! δ < µ < , (8)where C φ is the constant in (4). First fix a λ > . Then, C φ = R t ∈ R | φ ( t ) | d t > R t ∈ R φ ( t ) d t = ˜ φ (0) = 1 . That is, C φ √ > √ > . Therefore, the lower bound on µ in (8) is positive. Next, observe that if σ is large but fixed, then δ = f ( C φ ) ≈ f (0) = ∆ .Then δ and ∆ are close enough and the inequality in (8) can be satisfied. In other words, for a fixed λ and hence C φ , thereis a finite number σ for which (8) is satisfied for all σ > σ . If var ( W ( t )) < σ , then var ( W d ( t )) > σ − var ( W ( t )) willensure that var ( W + W d ) > σ . If var ( W ( t )) ≥ σ , then the extra dither is not needed. This condition will be used in thedistortion analysis in Sec. IV-B.For single-bit estimation, the signal F ( g ( t )) − / will be encountered, where F : R → [0 , is the cumulative distributionfunction of the stationary noise random variable W ( t ) + W d ( t ) . Since g ( t ) ∈ [ − , , and F ( x ) has a wider support than thedynamic range of signal (i.e., [ − , ), therefore F ′ ( x ) is finite and non-zero for x ∈ [ − , . Since F (0) = 1 / by symmetry,therefore, F ( g ( t )) − / is more convenient than F ( g ( t )) to work with. For simplicity of notation, let l ( t ) = F ( g ( t )) − / .Then | l ( t ) | ≤ | F (1) | − / , i.e., l ( t ) is bounded. The bound depends only on the noise distribution and the dynamic rangeof g ( t ) . Finally | l ′ ( t ) | = | F ′ ( g ( t )) g ′ ( t ) | ≤ | F ′ (0)2 π | since F ′ (0) maximizes F ′ ( x ) in [ − , and | g ′ ( t ) | ≤ π (see [5,Proposition 3.1]).The definition of BL int involves convolution with a stable kernel and convolution will often appear in the context of erroranalysis. The following short lemma will be quite useful later on. Lemma 3.1:
Let p ( t ) be a signal such that || p || ∞ is finite and P ( t ) be any random process such that P ( t ) is bounded (i.e., sup t ∈ R E ( P ( t )) is finite). Then, || p ⋆ φ || ∞ ≤ C φ || p || ∞ , (9)and E [( | P ( t ) | ⋆ | φ ( t ) | ) ] ≤ C φ sup t ∈ R E ( P ( t )) , (10)where the convolutions are well defined since φ ( t ) is absolutely integrable. Proof:
The proof follows by the definition of convolution and the triangle inequality. We have | p ( t ) ⋆ φ ( t ) | = (cid:12)(cid:12)(cid:12)(cid:12)Z u ∈ R p ( u ) φ ( t − u ) d u (cid:12)(cid:12)(cid:12)(cid:12) , ≤ Z u ∈ R | p ( u ) || φ ( t − u ) | d u, ≤ || p || ∞ Z u ∈ R | φ ( t − u ) | d u, = C φ || p || ∞ . For the second moment bound, note that E ( || P ( t ) | ⋆ | φ ( t ) || )= E (cid:18)Z Z u,v ∈ R | P ( u ) || P ( v ) || φ ( t − u ) | φ ( t − v ) | d u d v (cid:19) , = Z Z u,v ∈ R E ( | P ( u ) || P ( v ) | ) | φ ( t − u ) || φ ( t − v ) | d u d v, ( a ) ≤ sup t ∈ R E ( P ( t )) Z Z u,v ∈ R | φ ( t − u ) || φ ( t − v ) | d u d v, = C φ sup t ∈ R E ( P ( t )) , where ( a ) follows by E (2 | P ( u ) || P ( v ) | ) ≤ E ( P ( u ) + P ( v )) ≤ t E ( P ( t )) . Thus the proof is complete.The two extreme scenarios of quantization as depicted in Fig. 2 and their distortions will now be analyzed in the nextsection. IV. E
STIMATION OF BANDLIMITED SIGNAL
Interpolation of bandlimited signals with perfect samples is a well known topic [1]. Loosely speaking, a bandlimited signalof duration T and bandwidth π has πT degrees of freedom [14]. With N T noisy samples of the field g ( t ) + W ( t ) in duration T , the optimal distortion is expected to be O (1 /N ) . With this note, sampling schemes with oversampling rate N are designedto achieve a distortion of O (1 /N ) for sampling g ( t ) . A. Estimation with perfect samples
A brief review of estimation with perfect samples will be highlighted first. Optimal minimum mean-squared method can befound in the work of Pinsker [2]. For illustration and to get a distortion proportional to O (1 /N ) , it suffices to use the frameexpansion. Let the integer-valued oversampling ratio (above the Nyquist rate) be N , and τ = 1 / ( λN ) . Then, the samples { Y ( nτ ) , n ∈ Z } are available for the reconstruction of g ( t ) . Using frame expansion or the shift-invariance of bandlimitedsignals, g ( t ) = 1 N X n ∈ Z λg ( nτ ) φ ( t − nτ ) , = 1 N N − X i =0 X k ∈ Z λg (cid:18) kλ + iτ (cid:19) φ (cid:18) t − kλ − iN λ (cid:19) , (11)where the equality holds pointwise and in L ∞ ( R ) . It must be noted that the basic operation in (11) is that of averaging; hence,the noise is expected to average out while the signal will be retained. This intuition motivates the following estimator for g ( t ) from noisy data (see Fig. 2(a)). Define b G fr ( t ) := 1 N X n ∈ Z λY ( nτ ) φ ( t − nτ ) , (12) = 1 N N − X i =0 X k ∈ Z λ [ g + W ] (cid:18) kλ + iτ (cid:19) φ (cid:18) t − kλ − iN λ (cid:19) . The distortion of b G fr ( t ) is given by the following proposition. Proposition 4.1 (Frame estimate with O (1 /N ) distortion): Let b G fr ( t ) in (12) be an estimate for the bandlimited field g ( t ) corrupted by additive independent Gaussian noise. Let D fr ( t ) := E | b G fr ( t ) − g ( t ) | . Then, sup t ∈ R D fr ( t ) ≤ C ′′ φ λ σ N (13)where the constants σ and C ′′ φ from (6) do not depend on N . Proof:
See Appendix VI-A.The signal term in (12) converges in L ∞ ( R ) to g ( t ) . The noise term results in an independent sum of zero-mean randomvariables at every t ∈ R . This sum of random variables has a variance that decreases as (1 /N ) due to the finite energy ofthe interpolation kernel φ ( t ) . The constant C ′′ φ depends on the properties of the kernel φ ( t ) . The estimation with single-bitquantizers and associated distortion analysis will be presented next. B. Estimation with single-bit quantized samples
This section will present the key result of this work. Consider the system illustrated in Fig. 2(b). In this section, a b G ( t ) will be obtained such that D scales as O (1 /N ) . This is non-trivial to achieve because the non-linear quantization operationis coupled with the statistical estimation procedure. The result will be established in two parts: (i) it will be shown that suitableinterpolation of one-bit samples converges to a non-linear one-to-one function of g ( t ) with an error term having a pointwisevariance of O (1 /N ) ; and (ii) the obtained non-linear function of g ( t ) can be inverted in a stable manner using recursivecomputation based on contraction-mapping. It will be assumed that var ( W d ( t )) = ( σ − var ( W ( t ))) + , where σ is such that(8) is satisfied.The stability property of kernel φ ( t ) has been discussed Sec. III. For this section, fix τ = 1 / ( N λ ) , where λ > is anarbitrary stability constant. Analogous to (11), consider the random process obtained from the single-bit samples X ( nτ ) , n ∈ Z , H N ( t ) = τ X n ∈ Z ( X ( nτ ) − / φ ( t − nτ ) . (14)Then, the following proposition establishes the convergence of H N ( t ) to a function of the signal of interest g ( t ) . A single bounded constant in additive independent Gaussian noise with N independent readings can be estimated up to a distortion of O (1 /N ) [6]. Proposition 4.2 (Convergence of single-bit interpolation):
Let l ( t ) = ( F ( g ( t )) − / and H N ( t ) be as defined in (14).Then sup t ∈ R E ( H N ( t ) − l ( t ) ⋆ φ ( t )) ≤ C N + C N , (15)where C > and C > are constants independent of N . Proof:
See Appendix VI-B.The factor τ = 1 / ( λN ) provides the normalization for averaging in (14), while the terms ( X ( nτ ) − / φ ( t − nτ ) areweighted independent one-bit samples. The average in (15) converges in mean-square to a convolution. The signal l ( t ) ∈ L ∞ ( R ) and the limit l ( t ) ⋆ φ ( t ) is a lowpass version of l ( t ) . The dependence of l ( t ) ⋆ φ ( t ) on g ( t ) is non-linear due to quantization,which results in the F ( g ( t )) term. The original signal g ( t ) is Zakai sense bandlimited and it has one degree of freedom perunit time. The degree of freedom per unit time of l ( t ) ⋆ φ ( t ) can be up to one as well, and F ( x ) has ‘nice’ properties as afunction. Thus, it is not unreasonable to expect that there might be a class of F ( x ) such that ( F ( g ( t )) − / ⋆ φ ( t ) can beinverted to find g ( t ) , even though this equation is nonlinear.Consider compandors defined by Landau and Miranker [15]. Definition 4.1: [15, pg 100] A compandor is a monotonic function Q ( x ) which has the property that Q ( m ( t )) ∈ L ( R ) if m ( t ) ∈ L ( R ) .Landau and Miranker have shown that if g ( t ) ∈ L ( R ) and ˜ g ( ω ) is zero outside [ − π, π ] , and if Q : [ − , → R is a compandorwith non-zero slope, then there is one to one correspondence between g ( t ) and Q ( g ( t )) ⋆ sinc ( t ) [15]. Further, given any signal m ( t ) ∈ L ( R ) and ˜ m ( ω ) zero outside [ − π, π ] , there exists a unique g m ( t ) ∈ L ( R ) with ˜ g m ( ω ) zero outside [ − π, π ] and Q ( g m ( t )) ⋆ sinc ( t ) = m ( t ) .In our case, g ( t ) need not be in L ( R ) , even though l ( t ) = F ( g ( t )) − / is a compandor. Thus, the procedure of Landauand Miranker does not extend directly to bandlimited signals in L ∞ ( R ) , especially in the presence of statistical perturbations.Suitable modifications of their approach will be used to obtain the results for our problem.The dependence between g ( t ) and l ( t ) ⋆ φ ( t ) is quite non-linear. There is no clear or obvious equation by which g ( t ) canbe obtained from l ( t ) ⋆ φ ( t ) . Therefore, this inversion problem is casted into a recursive setup, where Banach’s fixed-pointtheorem can be leveraged along with contraction mapping [16, Ch. 5]. This approach is inspired from the work of Landauand Miranker. Their recursive setup is noted to be stable to perturbations of g ( t ) in L ( R ) [15]. This work will use a variantof their procedure, since the perturbation due to statistical noise with finite variance is not in L ( R ) . Therefore, our recursiveprocedure to obtain an estimate of g ( t ) from H N ( t ) (see (15)) and its analysis is non-trivial and it will be presented in detail.In summary, an estimate for g ( t ) is required. Due to quantization and noise, which is a non-linear operation, an approximation H N ( t ) of ( F ( g ( t )) − / ⋆φ ( t ) is available. The estimate H N ( t ) , which converges to ( F ( g ( t )) − / ⋆φ ( t ) with sample density N ↑ ∞ , will be inverted to obtain an estimate b G ( t ) for the signal g ( t ) . To establish the precision indifference principle,we wish to show that the mean-square error sup t ∈ R E | b G ( t ) − g ( t ) | decreases as O (1 /N ) . The details are presented next.A ‘clip to one’ function Clip [ x ] is defined first.Clip [ x ] = x if | x | ≤ sgn ( x ) otherwise . (16)Since g ( t ) has a dynamic range bounded by one, by assumption, it will be unaffected by clipping. Note that under the L ∞ norm,this transformation reduces the distance between any two scalars x and x , i.e., | Clip [ x ] − Clip [ x ] | ≤ | x − x | . This can beverified on a case by case basis. For example, if x > and x ∈ [ − , , then | Clip [ x ] − Clip [ x ] | = | − x | ≤ | x − x | .Other cases can be similarly enumerated. This clipping procedure is non-linear and complicates some of the presented analysis;however, we feel that its presence is essential for analysis.Let ψ ( t ) = φ ( λt ) . Then ˜ ψ ( ω ) = φ ( ω/λ ) . Thus, ˜ ψ ( ω ) is flat in [ − λπ, λπ ] and in ± [ λπ, λ π ] decreases linearly to zero.Consider the set of bandlimited signals defined by, S BL,bdd = { m ( t ) : | m ( t ) | ≤ C φ and m ( t ) ⋆ ψ ( t ) = m ( t ) } . (17)Then, S BL,bdd is a complete subset of the Banach space L ∞ ( R ) . Lemma 4.1 ( S BL,bdd is a complete metric space):
Let S BL,bdd be as defined in (17). Then ( S BL,bdd , || . || ∞ ) is a completesubset of ( L ∞ ( R ) , || . || ∞ ) . Proof:
Define the distance function d : S BL,bdd × S
BL,bdd → R + as d ( m , m ) = || m − m || ∞ with m ( t ) , m ( t ) ∈S BL,bdd . It is easy to verify the axioms of distance metric [16]: (i) d ≥ and d ≤ ∞ ; (ii) d ( m , m ) ≡ if and only if m ( t ) = m ( t ) ; (iii) d ( m , m ) = d ( m , m ) ; and (iv) d ( m , m ) ≤ d ( m , m ) + d ( m , m ) for any m ( t ) , m ( t ) , m ( t ) ∈S BL,bdd .It is straightforward to see that S BL,bdd ⊂ L ∞ ( R ) since || m || ∞ ≤ C φ for every m ( t ) ∈ S BL,bdd . To show that the subsetis complete, consider any Cauchy sequence m n ( t ) ∈ S BL,bdd . Since L ∞ ( R ) is complete, therefore m n ( t ) → s ( t ) , where s ( t ) ∈ L ∞ ( R ) . It remains to show that s ( t ) belongs to S BL,bdd . For any ǫ > , there is an n such that || m n − s || ∞ < ǫ for all n > n . Since R R | ψ ( t ) | d t = C φ /λ , therefore, || m n ⋆ψ − s ⋆ ψ || ∞ ≤ || m n − s || ( C φ /λ ) = C φ ǫ/λ for all n > n (see Lemma 3.1). Thus, m n ( t ) ⋆ ψ ( t ) → s ( t ) ⋆ ψ ( t ) . However, m n ( t ) ⋆ ψ ( t ) ≡ m n ( t ) since m n ( t ) ∈ S BL,bdd . Therefore, it follows that s ( t ) = s ( t ) ⋆ ψ ( t ) , or s ( t ) ∈ S BL,bdd . Thus, S BL,bdd is complete.A map T : S BL,bdd −→ S
BL,bdd will be defined next. This map will result in a recursive procedure to obtain g ( t ) from h ( t ) := l ( t ) ⋆ φ ( t ) . Define T [ m ( t )] = Clip h µh ( t ) + [ m ( t ) − µ ( F ( m ( t )) − / ⋆ φ ( t ) i ⋆ φ ( t ) . (18)It will be shown that T is a contraction on ( S BL,bdd , || . || ∞ ) . Lemma 4.2 ( T is a contraction): Let ( S BL,bdd , || . || ∞ ) be the metric space as defined in (17). Let T : S BL,bdd −→ S
BL,bdd be a map as defined in (18). If the condition in (8) is satisfied, then there is a choice of µ such that T is a contraction, i.e., || T [ m ] − T [ m ] || ∞ ≤ α || m − m || ∞ , (19)for some < α < and any m ( t ) , m ( t ) ∈ S BL,bdd . The parameter α does not depend on the choice of m and m . Proof:
See Appendix VI-C.Now the key recursive equation will be stated. Let l ( t ) = ( F ( g ( t )) − / and h ( t ) = l ( t ) ⋆ φ ( t ) be available for obtaining g ( t ) . Then, g k +1 ( t ) := T [ g k ( t )] = Clip h µh ( t ) + [ g k ( t ) − µ ( F ( g k ( t )) − / ⋆ φ ( t ) i ⋆ φ ( t ) , (20)where k ≥ , k ∈ Z and µ > is a constant that will be chosen according to Lemma 4.2. Set g ( t ) ≡ . The original signal g ( t ) is a fixed point of this equation and it can be verified by substitution. The following proposition shows that g ( t ) is the only fixed point of the equation in (20). The proof hinges on Banach’s fixed point theorem or contraction theorem [16, Ch. 5]. Proposition 4.3 (Signal of interest is the fixed point of T ): Let g ( t ) ∈ BL int ⊂ S BL,bdd be a continuous bounded bandlim-ited signal. Let h ( t ) = l ( t ) ⋆ φ ( t ) , where l ( t ) = F ( g ( t )) − / . Consider the recursion g k ( t ) = T [ g k − ( t )] , where T is asdefined in (18). Set g ( t ) ≡ . If µ is selected as in (8), then lim k →∞ || g k − g || ∞ = 0 . (21) Proof:
The proof is straightforward with Lemma 4.1 and Lemma 4.2 in place. Define d ( m , m ) = || m − m || ∞ forany m ( t ) , m ( t ) ∈ S BL,bdd . From Lemma 4.1, note that ( S BL,bdd , d ) is a complete metric space. The signal g ( t ) is in S BL,bdd and it satisfies g ( t ) = T [ g ( t )] , i.e., it is a fixed point for T defined in (18).Pick µ as in (8). Then T is a contraction on ( S BL,bdd , d ) . Thus, by Banach’s fixed point theorem (contraction theorem) [16,Ch. 5], there is exactly one fixed point in S BL,bdd for the equation g ( t ) = T [ g ( t )] . Since g k ( t ) converges to a fixed point, itmust converge to g ( t ) in the distance metric d . Thus the proof is complete.Propostion 4.3 holds with perfect information about l ( t ) ⋆ φ ( t ) . The estimation of signal from H N ( t ) , the statisticalapproximation of l ( t ) ⋆ φ ( t ) , will be discussed now. Let G k ( t ) be the sequence of random waveforms generated from H N ( t ) when it is applied to the recursion in (20). That is, fix G ( t ) ≡ and define G k +1 ( t ) := T [ G k ( t )] = Clip h µH N ( t ) + [ G k ( t ) − µ ( F ( G k ( t )) − / ⋆ φ ( t ) i ⋆ φ ( t ) . (22)Let b G ( t ) = lim k →∞ G k ( t ) . For the same choice of µ which ensures that T is a contraction on ( S BL,bdd , || . || ∞ ) , thedistortion of | b G ( t ) − g ( t ) | has to be established. To this end, the following proposition is noted. Proposition 4.4 (1-bit estimation has distortion O (1 /N ) ): Let H N ( t ) be the estimate of l ( t ) as described in (14) and µ beselected as in (8). With G ( t ) ≡ , let G k ( t ) be the sequence of random waveforms as defined in (22). Define lim k →∞ G k ( t ) = b G ( t ) . Then, D := sup t ∈ R E ( b G ( t ) − g ( t )) = O (1 /N ) , i.e., the distortion D decreases as O (1 /N ) . Proof:
See Appendix VI-D.The results of Proposition 4.1 and Proposition 4.4 can be summarized into the following theorem.
Theorem 4.1 (Precision indifference principle): Let g ( t ) be a bounded dynamic-range bandlimited-signal as defined in (2).Assume that g ( t ) + W ( t ) is available for sampling, where W ( t ) an additive independent Gaussian random process with finite This limit exists since it can be shown that || G k − G k − || ∞ ≤ α || G k − − G k − || ∞ for some < α < by using an analogous procedure as inLemma 4.2. variance. Fix an oversampling factor of N , where N is large for statistical averaging. There exists an estimate b G ( t ) obtained from single-bit samples of g ( t ) + W ( t ) such that sup t ∈ R E | b G ( t ) − g ( t ) | = O (1 /N ) . This distortion is proportional to the best possible distortion of O (1 /N ) that can be obtained with unquantized or perfectsamples. A few remarks highlighting the importance of the results obtained will conclude this section.
C. Remarks on the results obtained1) Comparison with the bit-conservation principle:
The bit-conservation principle [5] is somewhat in contrast to the precisionindifference principle. Loosely speaking, bit-conservation principle states that for sampling a bandlimited signal in a noiselesssetting, the oversampling density can be traded-off against ADC precision while maintaining a fixed bit-rate per Nyquist intervaland an order-optimal pointwise distortion. In the presence of additive independent Gaussian noise, this tradeoff between ADCprecision and oversampling is absent while studying pointwise mean-squared distortion. In the noisy setup, the distortion isproportional to /N , where N is the oversampling density irregardless of the ADC precision. The presence of noise shifts therole of ADC precision towards only the proportionality constant in distortion!
2) Interpretation of precision-indifference principle:
First, it can be argued that the precision indifference principle holdswhile estimating a constant signal (one degree of freedom) in additive independent Gaussian noise. Assume that a constant c ∈ [ − , has to be estimated based on N noisy readings Y i = c + W i , ≤ i ≤ N , where { W i , ≤ i ≤ N } are i.i.d. N (0 , σ ) .In the absence of quantization, b C N = ( P Ni =1 Y i ) /N converges to c in the mean-square sense, and E ( b C N − c ) = σ /N . Thisis the optimal distortion if perfect (unquantized) samples are available. Now consider the case where single-bit readings B i = ( c + W i ≥ , ≤ i ≤ N are available. The random variables { B i , ≤ i ≤ N } are i.i.d. Ber ( q ) where q = P ( W ≥− c ) = P ( W ≤ c ) = F ( c ) . Assume b B N = ( P Ni =1 B i ) /N . It can be shown that E ( b B N − F ( c )) ≤ / (4 N ) since eachvar ( B i ) ≤ F ( c )(1 − F ( c )) ≤ / . Define b C = F − ( b B N ) if b B N ∈ [ F ( − , F (1)] and b C = ± otherwise. Since F ( x ) is invertible and d F − ( x ) / d x is bounded for x ∈ [ F ( − , F (1)] , therefore, using the delta method, b C obtained from b B N has a mean-squared error which decreases as (1 /N ) [6]. Next, it should be noted that bandlimited signals have one degree offreedom in every Nyquist interval. An oversampling factor of N means that there are N samples to observe each degree offreedom on an average. Finally, observing the Nyquist samples of a bandlimited signals with a distortion of O (1 /N ) results,by stable interpolation with kernel φ ( t ) , in a pointwise distortion of O (1 /N ) for the signal estimate at any point.
3) Precision-indifference for a larger class of noise:
Consider the model where each sample Y ( nτ ) = g ( nτ ) + V ( nτ ) isaffected by some non-Gaussian noise. Focus on the case where V ( nτ ) can be written as V ( nτ ) = W ( nτ ) + U ( nτ ) , where W ( nτ ) and U ( nτ ) are i.i.d. for all n ∈ Z , τ ∈ R . If W ( nτ ) is Gaussian, var ( V ( nτ )) = σ < ∞ , and F V ( x ) satisfies (8),then the precision indifference principle will hold. The extension of existing proofs is simple and only its key steps will bementioned here. In the perfect sample case (see Fig. 2(a)), the Gaussian part of V ( t ) will limit the best possible (optimal)distortion to O (1 /N ) ; this is because even if the values of U ( nτ ) are (magically) known the residual W ( nτ ) will limit thedistortion. With single-bit quantization, note that all the proofs in Sec. IV-B only depend upon the existence of a δ and ∆ such that (8) is satisfied, monotonicity of F V ( x ) such that its derivative is bounded away from zero, and F V (0) = 1 / . Therecursive procedure in (20), however, requires the knowledge of F V ( x ) .V. C ONCLUSIONS AND FUTURE WORK
The sampling, quantization, and estimation of a bounded dynamic-range bandlimited signal affected by additive independentGaussian noise was studied. Such setup naturally arises in distributed sampling or where the sampling device itself is noisy.For bandlimited signals, the distortion due to additive independent Gaussian noise can be reduced by oversampling (statisticaldiversity). The maximum pointwise expected mean-squared error (statistical L error) was used as a distortion metric. Using twoextreme scenarios of quantizer precision, namely infinite precision and single-bit precision, a quantizer precision indifferenceprinciple was illustrated. It was shown that the optimal law for distortion is O (1 /N ) , where N is the oversampling ratio withrespect to the Nyquist rate. This scaling of distortion is unaffected by the quantizer precision , which is the key message of theprecision indifference principle. In other words, the reconstruction distortion law, up to a proportionality constant, is unaffectedby quantizer precision.Extensions of the precision indifference principle to other classes of parametric or non-parametric signals is of immediateinterest. Further, this work assumed sufficient dithering by noise because the estimators were linear. It is of interest to looktowards estimation techniques which do not require extra dithering.A CKNOWLEDGMENT
The problem of sampling a smooth signal in the presence of noise in a distributed setup was suggested by Prof. KannanRamchandran, EECS, University of California, Berkeley, CA. Discussions on this problem with Prof. Kannan Ramchandranand Prof. Martin Wainwright, EECS, University of California, Berkeley, CA, Prof. H. Narayanan, EE, IIT Bombay, and Prof.Prakash Ishwar, ECE, Boston University, Cambridge, MA were insightful. R EFERENCES[1] S. Mallat,
A Wavelet Tour of Signal Processing: The Sparse Way . Burlington, MA, USA: Academic Press, 2009.[2] M. S. Pinsker, “Optimal filtration of square-integrable signals in Gaussian noise,”
Problemy Peredachi Informatsil , vol. 16, no. 2, pp. 52–68, Apr. 1980.[3] R. M. Gray, “Oversampled Sigma-Delta Modulation,”
IEEE Trans. Commun. , vol. 35, no. 5, pp. 481–489, May 1987.[4] Z. Cvetkovi´c, I. Daubechies, and B. Logan, “Single-Bit Oversampled A/D Conversion With Exponential Accuracy in the Bit Rate,”
IEEE Transactionson Information Theory , vol. 53, no. 11, pp. 3979–3989, Nov. 2007.[5] A. Kumar, P. Ishwar, and K. Ramchandran, “High-resolution distributed sampling of bandlimited fields with low-precision sensors,”
IEEE Transactionson Information Theory , vol. 57, no. 1, pp. 476–492, Jan. 2011.[6] P. J. Bickel and K. A. Doksum,
Mathematical Statistics Vol I . Upper Saddle River, NJ, USA: Prentice Hall, 2001.[7] R. M. Gray and D. L. Neuhoff, “Quantization,”
IEEE Transactions on Information Theory , vol. IT-44, pp. 2325–2383, Oct. 1998.[8] E. Masry, “The reconstruction of analog signals from the sign of their noisy samples,”
IEEE Transactions on Information Theory , vol. 27, no. 6, pp.735–745, Nov. 1981.[9] Y. Wang and P. Ishwar, “Distributed field estimation with randomly deployed, noisy, binary sensors,”
IEEE Transactions on Signal Processing , vol. 57,no. 3, pp. 1177–1189, Mar. 2009.[10] E. Masry and P. Ishwar, “Field estimation from randomly located binary noisy sensors,”
IEEE Transactions on Information Theory , vol. 55, no. 11, pp.5197–5210, Nov. 2009.[11] M. Zakai, “Band-limited functions and the sampling theorem,”
Information and Control , vol. 8, pp. 143–158, 1965.[12] S. Cambanis and E. Masry, “Zakai’s class of bandlimited functions and processes: Its characterization and properties,”
SIAM Journal on AppliedMathematics , vol. 30, no. 1, pp. 10–21, Jan. 1976.[13] A. Gersho and R. M. Gray,
Vector Quantization and Signal Compression . Boston: Kluwer Academic, 1992.[14] D. Slepian, “On bandwidth,”
Proceedings of the IEEE , vol. 64, no. 3, pp. 292–300, Mar. 1976.[15] H. J. Landau and W. L. Miranker, “The recovery of distorted band-limitd signals,”
Journal of Mathematical Analysis and Applications , vol. 2, no. 1, pp.97–104, Feb. 1961.[16] E. Kreyszig,
Introductory Functional Analysis with Applications , 1st ed. Wiley, 1989.[17] M. J. M. Pelgrom,
Analog-to-Digital Conversion . Springer, NY, 2010.[18] W. Rudin,
Principles of Mathematical Analysis . USA: McGraw-Hill Companies, 1976.
VI. A
PPENDIX
A. Unquantized samples result in a distortion of O (1 /N ) By using (11) and the definition of b G fr ( t ) , first note that, b G fr ( t ) − g ( t ) = 1 N N − X i =0 λ X k ∈ Z W (cid:18) kλ + iτ (cid:19) φ (cid:18) t − kλ − iN λ (cid:19) . Since W ( k + iτ ) are i.i.d. with variance σ , therefore, E | b G fr ( t ) − g ( t ) | = 1 N N − X i =0 X k ∈ Z λ σ (cid:12)(cid:12)(cid:12)(cid:12) φ (cid:18) t − kλ − iN λ (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) . From (6), P k ∈ Z | φ ( t − kλ ) | ≤ C ′′ φ . Thus, E | b G fr ( t ) − g ( t ) | ≤ C ′′ φ λ σ N . (23)The proof is now complete.
B. Estimation of non-linear function of the signal g ( t ) By the linearity of expectation, it is easy to see that E ( H N ( t )) = τ P n ∈ Z l ( nτ ) φ ( t − nτ ) . For variance calculations, notethat each X ( nτ ) is an indicator random variable. Thus, var ( X ( nτ )) ≤ (1 / . Using the independence of { X ( nτ ) , n ∈ Z } ,we get var ( H N ( t )) = τ X n ∈ Z var ( X ( nτ )) | φ ( t − nτ ) | , ≤ τ X l ∈ Z N − X i =0 (cid:12)(cid:12)(cid:12)(cid:12) φ (cid:18) t − lλ − iτ (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) , ≤ τ N C ′′ φ = C ′′ φ λ N = C N , where C ′′ φ is given by (6), and is finite since φ ( t ) decays rapidly and is in L ( R ) . The constant C := C ′′ φ / (4 λ ) .Next it will be shown that as N → ∞ , the expression τ P n ∈ Z F ( g ( nτ )) φ ( t − nτ ) converges to the right hand side of (15).The proof uses the fast decay of φ ( t ) , which ensures absolute integrability of φ ( t ) . Consider the expression e = (cid:20)Z τ l ( u ) φ ( t − u ) d u (cid:21) − τ l ( τ ) φ ( t − τ ) . This summation is well defined since l ( t ) = ( F ( g ( t )) − / is bounded and φ ( t ) is integrable. This expression represents the error in approximating convolution integral in a τ interval by the corresponding term in aRiemann sum [18]. This can be bounded as explained next, | e | = (cid:12)(cid:12)(cid:12)(cid:12)Z τ [ l ( u ) − l ( τ )] φ ( t − u ) d u − l ( τ ) (cid:20) τ φ ( t − τ ) − Z τ φ ( t − u ) d u (cid:21)(cid:12)(cid:12)(cid:12)(cid:12) , ≤ || l ′ || ∞ τ Z τ | φ ( t − u ) | d u + (cid:12)(cid:12)(cid:12)(cid:12) l ( τ ) (cid:20) τ φ ( t − τ ) − Z τ φ ( t − u ) d u (cid:21)(cid:12)(cid:12)(cid:12)(cid:12) . (24)where we use the triangle inequality, | l ( u ) − l ( τ ) | ≤ || l ′ || ∞ ( u − τ ) , and finally | u − τ | ≤ τ . Next, by Lagrange mean-valuetheorem, we note that R τ φ ( t − u ) d u = τ φ ( t − t ) for some t ∈ (0 , τ ) . Thus, from (24), | e | ≤ || l ′ || ∞ τ Z τ | φ ( t − u ) | d u + || l || ∞ τ | φ ( t − τ ) − φ ( t − t ) | , ≤ || l ′ || ∞ τ Z τ | φ ( t − u ) | d u + || l || ∞ τ | φ ′ ( t − u ) | , for some u ∈ (0 , τ ) . We note that | φ ( t − u ) − φ ( t − t ) | = | φ ′ ( t − u ) || u − t | ≤ for some u ∈ ( t , τ ) ⊆ (0 , τ ) (this is byLagrange’s mean-value theorem). In the same fashion, for any [ nτ, ( n + 1) τ ] interval, the following bound can be established: | e n | ≤ || l ′ || ∞ τ Z ( n +1) τnτ | φ ( t − u ) | d u + || l || ∞ τ | φ ′ ( t − u n ) | , (25)where u n ∈ ( nτ, ( n + 1) τ ) . Finally, | E ( H N ( t )) − l ( t ) ⋆ φ ( t ) | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) τ X n ∈ Z l ( nτ ) φ ( t − nτ ) − Z u ∈ R l ( u ) φ ( t − u ) d u (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) , ≤ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X n ∈ Z e n (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) , ≤ X n ∈ Z | e n | , ≤ || l ′ || ∞ τ Z u ∈ R | φ ( t − u ) | d u + || l || ∞ τ "X n ∈ Z | φ ( t − u n ) | , where the last step follows using (25). Using the boundedness of C φ and C ′ φ (see (4) and (5)), and using bounds for l ( t ) and l ′ ( t ) , we get, | E ( H N ( t )) − l ( t ) ⋆ φ ( t )) | ≤ (cid:18) F ′ (0)(2 π ) C φ λ + | F (1) − / | C ′ φ λ (cid:19) N = C N . (26)Finally, E ( H N ( t ) − l ( t ) ⋆ φ ( t )) = var ( H N ( t )) + [ E ( H N ( t )) − l ( t ) ⋆ φ ( t )] ≤ C N + C N → as N → ∞ . The upper bound in the above inequality is independent of t ; therefore, the desired result follows by maximizing the left handside as a function of t . This completes the proof. It should be noted that the mean-squared error between H N ( t ) and l ( t ) ⋆ φ ( t ) is of the order of (1 /N ) . This result will be used to find the mean-squared error of b G ( t ) − g ( t ) . C. The map T is a contraction From (8), first note − C φ ! δ < − √ C φ ! δ < µ < . Since F ′ ( x ) , the pdf of the noise random variable, is positive and F ′ ( x ) ∈ [ δ, ∆] for x ∈ [ − C φ , C φ ] , therefore, δ ≤ F ( x ) − F ( y ) x − y ≤ ∆ for x = y and x, y ∈ [ − C φ , C φ ] . (27)Now consider the map in (18). First, it will be shown that if m ( t ) ∈ S BL,bdd , then T [ m ( t )] ∈ S BL,bdd . Define, r ( t ) := µh ( t ) + [ m ( t ) − µ ( F ( m ( t )) − / ⋆ φ ( t ) . (28) Then T [ m ( t )] = Clip [ r ( t )] ⋆φ ( t ) . Since | Clip [ r ( t )] | ≤ , Lemma 3.1 applied with p ( t ) = Clip [ r ( t )] results in | Clip [ r ( t )] ⋆φ ( t ) | ≤ C φ for all t ∈ R . Next, T [ m ( t )] ⋆ ψ ( t ) = ( Clip [ r ( t )] ⋆ φ ( t )) ⋆ ψ ( t ) = Clip [ r ( t )] ⋆ ( φ ( t ) ⋆ ψ ( t )) = Clip [ r ( t )] ⋆ φ ( t ) = T [ m ( t )] .Note that ˜ φ ( ω ) ˜ ψ ( ω ) = ˜ φ ( ω ) which results in φ ( t ) ⋆ ψ ( t ) = φ ( t ) . That is, T [ m ( t )] satisfies the convolution property needed tobe present in the set S BL,bdd .The contraction property will now be established. Let m ( t ) and m ( t ) be any two signals in S BL,bdd with corresponding r ( t ) and r ( t ) as defined in (28). The transformed signals can be written in terms of r i ( t ) as T [ m i ( t )] = Clip [ r i ( t )] ⋆ φ ( t ) for i = 1 , . The signals m ( t ) and m ( t ) are bounded in [ − C φ , C φ ] . The contraction property is established by the followingsteps: | r ( t ) − r ( t ) | = | [ m ( t ) − m ( t ) − µ ( F ( m ( t )) − F ( m ( t )))] ⋆ φ ( t ) | = (cid:12)(cid:12)(cid:12)(cid:12)Z u ∈ R ( m ( u ) − m ( u )) (cid:18) − µ F ( m ( u )) − F ( m ( u )) m ( u ) − m ( u ) (cid:19) φ ( t − u ) d u (cid:12)(cid:12)(cid:12)(cid:12) ( a ) ≤ Z u ∈ R | m ( u ) − m ( u ) | (cid:12)(cid:12)(cid:12)(cid:12) − µ F ( m ( u )) − F ( m ( u )) m ( u ) − m ( u ) (cid:12)(cid:12)(cid:12)(cid:12) | φ ( t − u ) | d u ( b ) ≤ Z u ∈ R | m ( u ) − m ( u ) | | − µδ || φ ( t − u ) | d u (29) ( c ) ≤ | − µδ ||| m − m || ∞ Z u ∈ R | φ ( t − u ) | d u = C φ | − µδ | · || m − m || ∞ , where ( a ) follows by the triangle inequality, ( b ) follows from (27) and µ ∆ < , and ( c ) follows from the definition of the L ∞ -norm. Next, by using the distance-reduction property of the clip-to-one function, we get | Clip [ r ( t )] − Clip [ r ( t )] | ≤| r ( t ) − r ( t ) | ≤ C φ | − µδ | · || m − m || ∞ . Applying Lemma 3.1 with p ( t ) = r ( t ) − r ( t ) , we get, | T [ m ( t )] − T [ m ( t )] | ≤ C φ | − µδ | · || m − m || ∞ . By taking supremum on t in the left hand side of the above equation, the desired contraction can be obtained: || T [ m ] − T [ m ] || ∞ ≤ C φ | − µδ | · || m − m || ∞ , (30)independent of the choice of m ( t ) , m ( t ) ∈ S BL,bdd . The conditions in (8) ensure that the parameter C φ | − µδ | < . Set α := C φ | − µδ | , where α < . Thus, || T [ m ] − T [ m ] || ∞ ≤ α · || m − m || ∞ , for some < α < . Since C φ , δ , and ∆ do not depend on m and m , therefore α is independent of the choice of m and m . Thus the proof is complete. D. Mean-squared error analysis of | b G ( t ) − g ( t ) | using contraction To analyze the mean-squared error, two sets of recursion will be considered. One will involve H N ( t ) , the statistical estimateof l ( t ) ⋆ φ ( t ) , and corresponding b G k ( t ) . Then b G ( t ) is the limit of b G k ( t ) as k → ∞ . The second recursion will involve h ( t ) = l ( t ) ⋆ φ ( t ) and the corresponding estimate g k ( t ) for the bandlimited signal g ( t ) . In the second recursion, g ( t ) is thelimit of g k ( t ) . Let r k ( t ) and R k ( t ) be defined as follows: r k ( t ) = µh ( t ) − [ g k − ( t ) − µ ( F ( g k − ( t )) − / ⋆ φ ( t ) , (31) R k ( t ) = µH N ( t ) − [ G k − ( t ) − µ ( F ( G k − ( t )) − / ⋆ φ ( t ) . (32)Note that g k ( t ) = Clip [ r k ( t )] ⋆ φ ( t ) and G k ( t ) = Clip [ R k ( t )] ⋆ φ ( t ) . By subtracting (31) from (32), the following equationsare obtained: R k ( t ) − r k ( t ) = µ ( H N ( t ) − h ( t )) − h G k − ( t ) − g k − ( t ) − µ ( F ( G k − ( t )) − F ( g k − ( t )) i ⋆ φ ( t ) , = µ ( H N ( t ) − h ( t )) − Z u ∈ R φ ( t − u ) h G k − ( u ) − g k − ( u ) − µ ( F ( G k − ( u )) − F ( g k − ( u ))) i d u. By applying the triangle inequality twice on the above equation, the following inequalities are obtained: | R k ( t ) − r k ( t ) | ≤ µ | H N ( t ) − h ( t ) | + (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z u ∈ R φ ( t − u ) h G k − ( u ) − g k − ( u ) − µ ( F ( G k − ( u )) − F ( g k − ( u ))) i d u (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ µ | H N ( t ) − h ( t ) | + Z u ∈ R | φ ( t − u ) | | G k − ( u ) − g k − ( u ) | (cid:12)(cid:12)(cid:12)(cid:12) − µ F ( G k − ( u )) − F ( g k − ( u )) G k − ( u ) − g k − ( u ) (cid:12)(cid:12)(cid:12)(cid:12) d u (33) The use of (8) and (27) in (33) results in | R k ( t ) − r k ( t ) | ≤ µ | H N ( t ) − h ( t ) | + | − µδ | Z u ∈ R | φ ( t − u ) || G k − ( u ) − g k − ( u ) | d u. Since the clipping operation reduces distance, therefore | Clip [ R k ( t )] − Clip [ r k ( t )] | ≤ µ | H N ( t ) − h ( t ) | + | − µδ | Z u ∈ R | φ ( t − u ) || G k − ( u ) − g k − ( u ) | d u. = µ | H N ( t ) − h ( t ) | + | − µδ | ( | φ ( t ) | ⋆ | G k − ( t ) − g k − ( t ) | ) . (34)Now the mean-squared error of Clip [ R k ( t )] − Clip [ r k ( t )] will be bounded using (34). First note that for any two randomvariables X and Y , E (( X + Y ) ) ≤ E ( X + Y ) . Thus, taking second moments on both sides of (34) results in E ( | Clip [ R k ( t )] − Clip [ r k ( t )] | ) ≤ µ E ( | H N ( t ) − h ( t ) | ) + 2 | − µδ | E (cid:0) ( | φ ( t ) | ⋆ | G k − ( t ) − g k − ( t ) | ) (cid:1) , ≤ µ E | H N ( t ) − h ( t ) | + 2 | − µδ | (cid:20) sup t E | G k − ( t ) − g k − ( t ) | (cid:21) C φ , (35)where the last inequality follows from Lemma 3.1 with P ( t ) = G k − ( t ) − g k − ( t ) . Taking supremum over t on the left side,the following recursive relationship is obtained: sup t E ( | Clip [ R k ( t )] − Clip [ r k ( t )] | ) ≤ µ (cid:18) C λ N + C N (cid:19) + 2 C φ | − µδ | sup t E ( | G k − ( t ) − g k − ( t ) | ) , (36)where the uniform upper bound on E | H N ( t ) − h ( t ) | from (15) has been used. Since G k ( t ) − g k ( t ) = ( Clip [ R k ( t )] − Clip [ r k ( t )]) ⋆ φ ( t ) , by applying Lemma 3.1 with P ( t ) = Clip [ R k ( t )] − Clip [ r k ( t )] we get, sup t E ( | G k ( t ) − g k ( t ) | ) ≤ C φ µ (cid:18) C λ N + C N (cid:19) + 2 C φ | − µδ | sup t E ( | G k − ( t ) − g k − ( t ) | ) . (37)From (8), it is noted that √ C φ | − µδ | < . Define < β := 2 C φ | − µδ | < . Using the recursion in (37), it follows that lim k →∞ sup t E | G k ( t ) − g k ( t ) | ≤ − β C φ µ (cid:18) C λ N + C N (cid:19) (38)Since we know that G k ( t ) and g k ( t ) converge in L ∞ ( R ) to b G ( t ) and g ( t ) , respectively, therefore, sup t E | b G ( t ) − g ( t ) | ≤ − β µ (cid:18) C λ N + C N (cid:19) (39)Lastly, µ, β, C , λ , and C are constants that do not depend on N , therefore, sup t E | b G ( t ) − g ( t ) | = O (1 /N ) . The proportionality constant depends upon the class of signal S BL,bdd , the noise variance σ , the stability properties of φ ( t ) , andthe chosen constant µ . It does not depend on the individual signal g ( t ))