A FAST TUNABLE BLURRING ALGORITHM FOR SCATTERED DATA∗

GREGOR ROBINSON AND IAN GROOMS†

Abstract.
A blurring algorithm with linear time complexity can reduce the small-scale content of data observed at scattered locations in a spatially extended domain of arbitrary dimension. The method works by forming a Gaussian interpolant of the input data, and then convolving the interpolant with a multiresolution Gaussian approximation of the Green's function to a differential operator whose spectrum can be tuned for problem-specific considerations. Like conventional blurring algorithms, which the new algorithm generalizes to data measured at locations other than a uniform grid, applications include deblurring and separation of spatial scales. An example illustrates a possible application toward enabling importance sampling approaches to data assimilation of geophysical observations, which are often scattered over a spatial domain, since blurring observations can make particle filters more effective at state estimation of large scales. Another example, motivated by data analysis of dynamics like ocean eddies that have strong separation of spatial scales, uses the algorithm to decompose scattered oceanographic float measurements into large-scale and small-scale components.
Key words. signal processing, data assimilation, multiresolution algorithms
AMS subject classifications.
1. Introduction.
We present a linear-time blurring algorithm that works for data measured at scattered points in R^d, for arbitrary dimension d > 0, and that permits choice in shaping its response to different length scales. This algorithm is equivalent to solving a positive definite self-adjoint elliptic partial differential equation. By blurring, we mean the process of attenuating the small-scale content of a dataset. Doing so offers a way to denoise and simplify spatial data whose large features are of primary relevance. Blurring is also a mechanism for isolating small-scale content, by simply subtracting the blurred version from the original. We show an example of each of these use cases.

Applications of blurring in scientific computing generally benefit from having control over the blurring spectrum, i.e. the factor by which it attenuates inputs of different spatial scales. This concept is made more rigorous in section 2. Regularly-spaced data on a periodic domain would enable straightforward application of Fourier methods to implement a blurring algorithm that obeys a desired spectrum. But applications in geophysics, and remote sensing in general, often involve measurements made at irregularly scattered locations in a spatially-extended domain. It is therefore desirable not to require a regular grid.

The paper is organized as follows. Our blurring method is described in section 2; an illustrative synthetic example is shown in section 3; an example of separating large and small scales of scattered oceanographic data is presented in section 4; how this blur can be applied in its original motivating context of meteorological data assimilation is described in section 5, along with a connection to generalized Gaussian random fields that led to our algorithm's discovery; another example using real meteorological data is given in section 6 to show the blur has the desired effect on particle filtering; algorithmic complexity and generalizations toward practical application of our method are discussed in section 7; and conclusions follow in section 8.

∗Submitted to the editors June 16, 2019.
Funding: This work was funded by the National Science Foundation under award no. 1821074.
†Department of Applied Mathematics, University of Colorado Boulder, Boulder, CO ([email protected], [email protected]).
2. Method.
Let z ∈ R^{N_z} be a vector of data at locations q_i ∈ R^d for each i ∈ (1, · · · , N_z), where N_z is the number of observations. The proposed blur S works by forming a discrete approximation of D^{−1}ζ, where D is an elliptic differential operator described in the next paragraph and ζ : R^d → R is a continuous-domain interpolant of the data z expressed as a sum of Gaussians. This obtains by approximating the Green's function g of D as a multiresolution sum of Gaussians, computing the convolution of that approximation with ζ, and evaluating the result at the locations {q_i}.

Define D to be the fractional bound-state Helmholtz operator:

D = (1 − ℓ²Δ)^β, (2.1)

where Δ is the formal Laplacian operator and ℓ > 0 and β > 0 are parameters. Eigenfunctions of D are Fourier modes of wavenumber k, with corresponding eigenvalues (1 + ℓ²|k|²)^β. The characteristic scale of this operator is ℓ/(2π√(2^{1/β} − 1)). Convolution with g attenuates all but constant functions on R^d, so even a constant data vector z will be attenuated to some degree. We will discuss a way to mitigate this effect in section 3.

To represent the data in a continuous form that allows convolution with g, we choose radial basis function (RBF) interpolation [4]. RBF interpolation of the observations requires us to choose a kernel ψ : R_+ → R that is used as the radial basis in which the interpolant will represent the data. The interpolant takes the form

ζ(·) = Σ_{j=1}^{N_z} b_j ψ(‖· − q_j‖), (2.2)

where (b_j) are interpolation weights such that

ζ(q_i) = Σ_{j=1}^{N_z} b_j ψ(‖q_i − q_j‖) = z_i. (2.3)

In matrix form, this linear system becomes

Bb = z. (2.4)

We take the RBF kernel to be ψ(‖·‖) = φ(·; 0, ξI), where φ(·; μ, Σ) is the density of a d-variate Gaussian random variable with mean μ and covariance Σ:

φ(·; μ, Σ) = (2π)^{−d/2} (det Σ)^{−1/2} exp(−(1/2)(· − μ)^T Σ^{−1}(· − μ)). (2.5)

This notation is used as a convenient description of Gaussian functions even though we will not use them to describe any random variables. One may worry that this method is hardly fast, since a naive approach to solving for b requires O(N_z³) operations. Computational complexity of our algorithm, including faster alternatives to solving for b, is discussed in section 7.

In Fourier space, the Green's function of D is ĝ(t) = t^{−β}, where t = 1 + ℓ²|k|². Exponential approximations of inverse power functions like this are studied in [3, 12]. The approach therein is to write a finite trapezoid-type discretization of an integral representation of ĝ. With the change of variables introduced by McLean [12], the integral representation to be discretized is

1/t^β = (1/Γ(β)) ∫_{−∞}^{∞} exp(−ϕ(x, t)) (1 + e^{−x}) dx, (2.6)

where

ϕ(x, t) = t exp(x − e^{−x}) − β(x − e^{−x}). (2.7)

The finite trapezoid rule discretizes this into

1/t^β ≈ (1/Γ(β)) Σ_{n=−M_−}^{M_+} v_n e^{−a_n t}, (2.8)

where Γ(β) is the Gamma function and

a_n = exp(nh − e^{−nh}), (2.9)
v_n = h (1 + e^{−nh}) exp(β(nh − e^{−nh})). (2.10)

Lemma 4 of Ref. [12] shows that the total required number of terms M_− + M_+ + 1 scales as (ln E)² to achieve uniform relative error bounded by E > 0.

Substituting t = 1 + ℓ²k², each exponential term in (2.8) becomes a Gaussian in k. Given weights v_n and exponential rates a_n from the exponential approximation above, we can derive the multiplicative factors required of this equivalent formulation:

v_n e^{−a_n(1 + ℓ²k²)} = v_n e^{−a_n} e^{−a_n ℓ²k²} (2.11)
 = v_n e^{−a_n} (2π/(2ℓ²a_n))^{d/2} ((2π/(2ℓ²a_n))^{−d/2} e^{−ℓ²a_n k²}) (2.12)
 = v_n e^{−a_n} (π/(ℓ²a_n))^{d/2} φ(k; 0, (2ℓ²a_n)^{−1} I). (2.13)

The second line obtains from simultaneously multiplying and dividing by the constant required to normalize the Gaussian term in large parentheses, which is written in that manner to ease visual comparison to the standard form of an isotropic d-variate Gaussian probability density function with mean 0 and variance 1/(2ℓ²a_n). Combining (2.8)-(2.13) yields

(1 + ℓ²|k|²)^{−β} ≈ (1/Γ(β)) Σ_{n=−M_−}^{M_+} v_n e^{−a_n} (π/(ℓ²a_n))^{d/2} φ(k; 0, (2ℓ²a_n)^{−1} I). (2.14)

A plot of the relative error committed by this approximation is shown in Figure 1. Taking the inverse Fourier transform and combining terms finally yields the desired approximation of the Green's function in physical space.
Fig. 1. Relative error, as a function of integer wavenumber, of a Gaussian approximation in the form (2.14) of the Fourier-space Green's function (1 + ℓ²k²)^{−β}, with parameters ℓ = 1, β = 1, M = 32, and N = 28. The relative error remains below 5 × 10⁻⁴ in magnitude over the plotted range of wavenumbers.

In terms of normalized Gaussians,

g(·) ≈ Σ_{n=−M_−}^{M_+} c_n φ(·; 0, ρ_n I), (2.15)

ρ_n = 2ℓ²a_n, (2.16)

c_n = (v_n/(Γ(β) e^{a_n})) (π/(ℓ²a_n))^{d/2} (ρ_n/(2π))^{d/2} = v_n/(Γ(β) e^{a_n}). (2.17)

With approximations of the data and the Green's function now constructed in terms of d-variate isotropic normalized Gaussians, convolution of which is trivial, applying a discrete version of the integral operator that inverts D is just:

D^{−1}ζ ≈ (Σ_{i=−M_−}^{M_+} c_i φ(·; 0, ρ_i I)) ∗ (Σ_{j=1}^{N_z} b_j φ(·; q_j, ξI)) (2.18)
 = Σ_{i=−M_−}^{M_+} Σ_{j=1}^{N_z} c_i b_j φ(·; 0, ρ_i I) ∗ φ(·; q_j, ξI) (2.19)
 = Σ_{j=1}^{N_z} b_j Σ_{i=−M_−}^{M_+} c_i φ(·; q_j, (ρ_i + ξ)I). (2.20)

Evaluating this quantity at the observation locations yields the output of our blur. Equivalently, the blur replaces the interpolation kernel ψ with a blurred kernel ψ̃, given by

ψ̃(·) = Σ_{i=−M_−}^{M_+} c_i φ(·; 0, (ρ_i + ξ)I). (2.21)

The weights b_j of the blurred interpolant are identical to the weights of the input interpolant, which will be valuable later in this section.

We now describe our blurring algorithm more concretely. Algorithm 2.1 ties together pieces of the Green's function approximation specified in (2.6)-(2.17). Algorithm 2.2 combines RBF interpolation of the data with the output of Algorithm 2.1 via a convolution and evaluates the result at the data locations. These algorithms, used together, are a complete description of our blur.

Algorithm 2.1
Gaussian approximation of fractional bound-state Helmholtz kernel
Input
Blurring scale parameter ℓ > 0. Blurring exponent β > 0. Quadrature step size h > 0. Number of negative integration steps M_− ∈ N. Number of positive integration steps M_+ ∈ N. Dimension of each measurement location d ∈ N.
Output
Vector of positive weights c ∈ R^{M_− + M_+ + 1}. Vector of variances ρ ∈ R^{M_− + M_+ + 1}.

function GaussianBSH(ℓ, β, h, M_−, M_+, d)
  for n = −M_− → M_+ do
    â ← exp(nh − exp(−nh)).
    ŵ ← h (1 + exp(−nh)) exp(β(nh − exp(−nh))).
    ŵ′ ← ŵ exp(−â) (π/(ℓ²â))^{d/2} / Γ(β).
    ρ_n ← 2ℓ²â.
    c_n ← ŵ′ (ρ_n/(2π))^{d/2}.
  end for
  return c, ρ.
end function

Algorithm 2.2 defines a linear operator S on R^{N_z}. We will prove that S is positive definite in Theorem 2.1, which will be useful in section 5. The theorem is more general than the specific algorithm so far presented, which will set the stage for potential variants to be described in section 7. To ease into the theorem, we will summarize the preceding development of S and connect it to a briefer alternative formulation that is easier to treat analytically.
Algorithm 2.2
Blur
Input
Vector of measured data z ∈ R^{N_z}. Array of data location vectors q_i ∈ R^d, i ∈ (1, . . . , N_z). RBF scale parameter ξ > 0. Vector of positive weights c ∈ R^{M_− + M_+ + 1}. Vector of variances ρ ∈ R^{M_− + M_+ + 1}.
Output
Array of blurred data z̃_i ∈ R, i ∈ (1, . . . , N_z).

function Blur(z, q, ξ, c, ρ)
  Let φ(‖·‖; 0, ξI) be a unit-mass Gaussian to use as the RBF kernel.
  Generate the RBF matrix B with elements:
  for i = 1 → N_z, j = 1 → N_z do
    B_ij ← φ(‖q_i − q_j‖; 0, ξI).
  end for
  b ← B^{−1} z.
  for i = 1 → N_z do
    z̃_i ← Σ_{j,n} b_j c_n φ(‖q_i − q_j‖; 0, (ξ + ρ_n)I).
  end for
  return z̃.
end function

We described a sequence of mappings between vector spaces, with blurring taking place most explicitly in the function space L²(R^d) of interpolants by way of the convolution ψ_i ↦ Gψ_i, where G denotes an operator that performs convolution with the Gaussian approximation of g, and ψ_i is defined as the interpolation basis function ψ(‖· − q_i‖) centered at location q_i. Taken literally, that conceptual development prescribes the following composition of linear operations:

Y --B^{−1}--> W --F--> X --G--> X̃ --F̃^{−1}--> W̃ --B̃--> Ỹ. (2.22)
Nodes in this diagram represent the various vector spaces found along the way of describing our blurring algorithm:
• Y is the space of input data,
• W is the space of interpolant weights in the basis {ψ_i},
• X = span{ψ_i} ⊂ L²(R^d) is the space of interpolants,
• X̃ = span{Gψ_i} ⊂ L²(R^d) is the space of blurred interpolants,
• W̃ is the space of blurred interpolant weights in the basis {Gψ_i}, and
• Ỹ is the space of blurred data.
Arrows in the diagram represent the action of the operators superscribed on them:
• B^{−1} maps input data to RBF weights,
• F maps RBF weights to interpolated functions,
• G maps interpolated functions to blurred functions,
• F̃^{−1} maps blurred functions to weights in a blurred RBF basis, and
• B̃ maps blurred weights to blurred data.
The complicated sequence of steps above can simplify greatly; observe in (2.20) that the weights in the blurred basis {Gψ_i} are always identical to the weights in the unblurred basis {ψ_i}. Therefore F̃^{−1}GF = I, leaving just S = B̃B^{−1}, where B̃ is the matrix with entries B̃_{i,j} = ψ̃(‖q_i − q_j‖).

This alternative perspective demonstrates that S is equivalent to finding weights b for an RBF interpolant of the unblurred data using a basis {ψ_i}, and then evaluating an RBF interpolant of the blurred data using the same weights b, which now act as coefficients on a blurred basis {Gψ_i}. The following theorem is stated in terms of this simplified perspective.

Theorem 2.1.
Let ψ : R_+ → R be an interpolating radial basis function and let g : R_+ → R be a convolution kernel. Suppose ψ and g each have positive Fourier transforms, and define ψ̃ = g ∗ ψ. Then the matrices B with entries B_ij = ψ(‖q_i − q_j‖) and B̃ with entries B̃_ij = ψ̃(‖q_i − q_j‖) are symmetric positive definite, and the product S = B̃B^{−1} is positive definite.

Proof. A standard theorem of RBF interpolation (e.g. Section 3 of [4]) states that B is positive definite under the assumption that the Fourier transform of ψ is positive. The Convolution Theorem guarantees that ψ̃ = g ∗ ψ has a positive Fourier transform if g and ψ both have positive Fourier transforms, so B̃ is positive definite for the same reason that B is positive definite. Observe that B and B̃ are symmetric by construction, and that B^{−1} is symmetric positive definite since it is the inverse of a symmetric positive definite matrix. Theorem 7.6.3 in [7] states that the product of a positive definite matrix P and a Hermitian matrix Q is a matrix with the same number of negative, zero, and positive eigenvalues as Q. It follows that the product of two Hermitian positive definite matrices is also positive definite. Therefore, since B̃ and B^{−1} are Hermitian positive definite matrices, B̃B^{−1} is positive definite.

The coefficients in the multiresolution approximation (2.15) are all positive, and the Fourier transform of a positive Gaussian is also a positive Gaussian. Therefore S, as defined by Algorithm 2.2 together with Algorithm 2.1, is positive definite as a corollary of Theorem 2.1.
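Theorem 2.1 is easy to exercise numerically. The sketch below assembles B and B̃ on a small planar grid using a generic three-term positive Gaussian mixture in place of the output of Algorithm 2.1 (the weights, variances, grid, and ξ are illustrative choices of ours, not values from the text) and confirms that S = B̃B^{−1} has a strictly positive spectrum.

```python
import numpy as np

def gauss_kernel_matrix(q, var):
    """Matrix of d-variate normalized Gaussians phi(||q_i - q_j||; 0, var*I)."""
    d = q.shape[1]
    r2 = ((q[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)
    return (2.0 * np.pi * var) ** (-d / 2.0) * np.exp(-r2 / (2.0 * var))

def blur_matrix(q, xi, c, rho):
    """S = Btilde B^{-1}: interpolation with variance-xi Gaussians, then
    evaluation in the blurred basis with variances xi + rho_n (eq. 2.21)."""
    B = gauss_kernel_matrix(q, xi)
    Btilde = sum(cn * gauss_kernel_matrix(q, xi + rn) for cn, rn in zip(c, rho))
    # Explicit inverse is for inspection only; practical use solves systems.
    return Btilde @ np.linalg.inv(B)

# Illustrative stand-in for the Green's-function mixture (c_n > 0 required).
c = np.array([0.5, 0.3, 0.2])
rho = np.array([0.5, 2.0, 8.0])

# A 6 x 6 grid of locations in the plane, spacing 1.5; RBF variance xi = 1.
xx, yy = np.meshgrid(np.arange(6) * 1.5, np.arange(6) * 1.5)
q = np.column_stack([xx.ravel(), yy.ravel()])
S = blur_matrix(q, 1.0, c, rho)
```

Since S is similar to the symmetric positive definite matrix B^{−1/2}B̃B^{−1/2}, its eigenvalues are real and positive even though S itself is not symmetric.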
3. Example 1: circular measurement locations embedded in a 2-plane.
We want to verify that the blur's effect resembles what we would expect of a discrete approximation to D^{−1}. To that end, we blur equally-spaced data on a circle embedded in R² to provide insight into the spectral properties of the blur in practice. This example will also describe heuristics for choosing the parameters ℓ, β, and ξ. In the course of this example we will also demonstrate an undesirable phenomenon whereby S attenuates even the largest scales, and suggest a workaround.

Locations were chosen to encircle the origin in R² with N_y = 100 distinct locations separated by unit distance from nearest neighbors, i.e.

q_n = |e^{2πi/100} − 1|^{−1} [cos(2nπ/100), sin(2nπ/100)]^T, n = 1, . . . , 100.

The interpolation kernel was chosen to be the isotropic Gaussian PDF with standard deviation ξ^{1/2} = 2.5. The convolution kernel is the multiresolution Gaussian approximation (2.15) to the fractional bound-state Helmholtz kernel with ℓ = 1 and β = 1, using approximation parameters M = 32 and N = 28. These parameters yield an approximation of ĝ with < 0.05% relative error up to k_max = 49, the Nyquist wavenumber for the one-dimensional problem that this example simulates embedded in two dimensions. Recall that Figure 1 shows this relative error as a function of k.
The blur thus constructed defines a linear operator S on R^{N_y}. For the purpose of inspecting the effect of blurring at different scales we numerically constructed a matrix representation of S (though for practical application of the blurring algorithm it is inadvisable to actually construct S). Due to the rotational symmetry of observation locations, the matrices B^{−1} and B̃ are circulant. The class of circulant matrices is stable under inversion, transposition, and matrix multiplication, so S = B̃B^{−1} is also circulant. Therefore eigenvectors of S are discrete Fourier vectors. This connection provides a rationale for comparing eigenvalues of S to the spectrum of D^{−1}, whose eigenfunctions are Fourier modes. However, it is important to recognize that the comparison is imprecise because the interpolant (2.3) represents these discrete Fourier eigenvectors as functions that differ from Fourier modes on R².

Eigenvalues of S are plotted in Figure 2 as circles, and some examples of eigenfunctions of S are visualized beneath the plot for a selection of wavenumbers k. Eigenfunctions of S are defined here as continuous interpolants of the matrix's eigenvectors found by the RBF interpolation scheme utilized in the blur. This figure also shows a solid trace labelled "Fourier" that plots (1 + ℓ²k²)^{−β}, which is the spectrum of D^{−1} that corresponds to R² Fourier modes. Since eigenvectors of S do not correspond to R² Fourier modes, this trace of the Fourier spectrum is only a rough comparison rather than an analytical prediction that we are trying to match. The observed spectrum of S behaves as expected, with gradual blurring of small scale features.

Recall from section 2 that this method has a drawback of attenuating large scales. That behavior is evident in the eigenvalues plotted in Figure 2, which are all less than 1. Eigenvalues less than 1 correspond to attenuation, so the blur attenuates even the largest scale (eigenmode index 0).
This over-attenuation occurs because convolution with g attenuates every Fourier eigenmode of D except constant functions in R^d. Since a finite Gaussian approximation can never fully describe a nonzero spatial constant, even the largest-scale function in the space of possible RBF interpolants will be attenuated by our blur. The largest-scale eigenfunctions in this example are thin in the direction transverse to the circle, causing those modes to be blurred more than continuous Fourier modes in R² with the same wavenumber (i.e. Fourier modes with planar length scale equal to the circumferential length scale of S eigenfunctions).

For a similar reason that large scales are attenuated too much, the blur does not suppress the smallest-scale eigenmodes as much as G would suppress a true R² Fourier mode of the same wavenumber. This is because the RBF interpolants of the most highly-oscillatory eigenvectors in this example have more large-scale content than R² Fourier modes with the same length scale.

Over-attenuation can be mitigated, so that the largest scales are closer to unity, by rescaling the operator: replace S ↦ S/‖S1‖, where 1 is a unit-norm vector with all entries identical. The eigenvectors of S are usually not discrete Fourier vectors like they are in this symmetric example, so the largest-scale eigenvector is not necessarily 1. Therefore this mitigation technique is only a heuristic, which derives from the idea that an input with identical entries contains little small-scale information.

Choosing the RBF standard deviation parameter ξ^{1/2} is not to be taken lightly. We recommend choosing it to be roughly on the order of the nearest-neighbor distance between measurements. A value too small prevents the RBF interpolation step from resolving gradual transitions from location to location, causing the interpolant to appear as a rugged set of "spikes" that are overly suppressed by the convolution step on account of their inappropriately small scale.
Choosing an interpolation kernel that is too large, however, can cause numerical problems related to ill-conditioning of the linear system we must solve to arrive at RBF coefficients. Choosing ξ^{1/2} to be as large as possible, balanced against insurmountable instability due to ill-conditioning, is considered a best practice in the RBF literature [4].

Fig. 2. Top: points indicate eigenvalues of the matrix S defined using interpolation by Gaussian radial basis functions of standard deviation ξ^{1/2} = 2.5, followed by convolution with a Gaussian approximation of the Green's function for the bound-state fractional Helmholtz operator D = (1 − ℓ²Δ)^β with ℓ = 1 and β = 1, acting on 100 equally spaced points around the origin in R² with unit nearest-neighbor distance. The solid trace shows the spectrum (1 + k²)^{−β} of D^{−1}, eigenfunctions of which are Fourier modes; this serves to highlight the similarity between the spectra of S and of D^{−1}, but is not an analytical solution to match, since the interpolants of the eigenvectors of S are not Fourier modes in the plane. Bottom: some example eigenfunctions, defined as interpolants given by (2.3) of the eigenvectors of S, for a selection of wavenumbers k. Duplicate eigenpairs that arise due to symmetry are suppressed in this figure.
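Because this example is circulant, its spectrum can be reproduced compactly: build S on the circle, then read its eigenvalues off the FFT of its first row. The sketch below uses a stand-in three-term Gaussian mixture for the convolution kernel and a smaller RBF variance (ξ = 1) than the text, to keep B comfortably conditioned; it still exhibits both the decay of the spectrum with wavenumber and the over-attenuation of the largest scale (λ₀ < 1).

```python
import numpy as np

N = 100
radius = 0.5 / np.sin(np.pi / N)            # unit nearest-neighbor spacing
theta = 2.0 * np.pi * np.arange(N) / N
q = radius * np.column_stack([np.cos(theta), np.sin(theta)])

def kmat(var):
    """Normalized 2-D Gaussians of the pairwise chordal distances."""
    r2 = ((q[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)
    return (2.0 * np.pi * var) ** (-1.0) * np.exp(-r2 / (2.0 * var))

xi = 1.0                                    # RBF variance (smaller than the text's)
c = np.array([0.5, 0.3, 0.2])               # stand-in mixture weights
rho = np.array([0.5, 2.0, 8.0])             # stand-in mixture variances

B = kmat(xi)
Btilde = sum(cn * kmat(xi + rn) for cn, rn in zip(c, rho))
S = Btilde @ np.linalg.inv(B)               # symmetric circulant here

lam = np.fft.fft(S[0]).real                 # circulant: eigenvalues via FFT
```

The index of `lam` is the discrete wavenumber, so `lam[0]` is the largest-scale eigenvalue and `lam[50]` sits at the Nyquist wavenumber of the circle.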
4. Example 2: scale separation of ocean temperature measurements.
The application of our algorithm is demonstrated with data from Argo, an international instrumentation project to measure the world's oceans [16]. Argo uses an array of approximately 3600 profiling floats. Every 10 days, these devices sink to a depth of 2000 meters to capture a depth-wise profile of measurements as they float back to the surface.

We apply our blurring algorithm to Argo sea surface temperature measurements to decompose the data into large-scale and small-scale components. Physical oceanographers are interested in how ocean heat content varies across spatial and temporal scales, and the Argo data set is an invaluable tool [23]. Given the irregularity of the
float coverage, we select data from a region of the North Atlantic reaching 60°N and spanning 20°–50°W, over the time period February 1, 2020 through February 10, 2020. A small number of data points were removed to maintain a minimum nearest-neighbor separation of 50 kilometers. A least-squares linear fit is subtracted from the remaining data to arrive at deviations. These deviations are blurred with the method of section 2 using RBF length scale ξ^{1/2} = 175 kilometers, ℓ = 70 kilometers, and exponent β = 8. Subtracting the linear fit to obtain deviations avoids rapid variation of the interpolant that could otherwise happen near the boundaries due to the basis functions' rapid decay.

The original field of sea surface temperature deviations, as interpolated by the RBF stage of our algorithm, is depicted in the top panel of Figure 3. The center panel shows the large-scale component, which is the result of blurring the temperature deviations. The small-scale component, plotted in the bottom panel, is the difference of the unblurred and blurred temperature deviations.

Fig. 3. Sea surface temperature measurements made by floats in the Argo oceanographic instrumentation project. The horizontal and vertical coordinates describe longitudinal and latitudinal displacement, in kilometers, from a fixed center point in the North Atlantic. Contours and colors indicate interpolated temperature deviations in degrees Celsius, as estimated by the original interpolant of the data used in the blurring algorithm with ξ^{1/2} = 175 km, ℓ = 70 km, and β = 8 (top); by the blurred interpolant representing large-scale features (center); and by the small-scale interpolant defined as the difference of the blurred interpolant subtracted from the original interpolant (bottom). Although the linear fit is an important part of both the original and large scales, it is not added back for the purpose of this visualization, in order to show other large-scale features that would otherwise be difficult to see.
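The same decomposition can be sketched end-to-end on synthetic one-dimensional data. The kernel mixture below is an illustrative stand-in rather than the output of Algorithm 2.1, and the check only confirms the qualitative claim of this section: the blurred field tracks the slowly varying component more closely than the raw data do, while the small-scale component is the exact remainder.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(100.0) + rng.uniform(-0.3, 0.3, 100)   # jittered scattered sites
large = np.sin(x / 10.0)                             # slowly varying signal
small = 0.2 * np.sin(3.0 * x)                        # fast oscillation
z = large + small

def kmat(pts, var):
    """Normalized 1-D Gaussians of the pairwise distances (d = 1)."""
    r2 = (pts[:, None] - pts[None, :]) ** 2
    return (2.0 * np.pi * var) ** (-0.5) * np.exp(-r2 / (2.0 * var))

xi = 1.0                                   # RBF variance
c = np.array([0.5, 0.3, 0.2])              # stand-in mixture weights
rho = np.array([0.5, 2.0, 8.0])            # stand-in mixture variances

b = np.linalg.solve(kmat(x, xi), z)        # RBF weights: Bb = z, eq. (2.4)
z_large = sum(cn * kmat(x, xi + rn) for cn, rn in zip(c, rho)) @ b
z_small = z - z_large                      # small scales by subtraction
```

In a faithful reproduction of the Argo example one would also subtract a least-squares linear fit first; it is omitted here because the synthetic signal has no trend.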
5. Application to particle filtering.
The blur described above was originally motivated by an effort to mitigate the dimensional curse of particle filtering, which we will refer to as sequential importance sampling with resampling (SIR) hereinafter.¹ This section will describe how the blur was derived from the motivating considerations about SIR, in terms of an observation error covariance matrix.

SIR begins by running an ensemble of forecasts; each ensemble member is a particle. A set of observations of the true system is then taken; the observations are corrupted by instrument errors, whose distribution defines a likelihood. This likelihood is used to compute weights for each particle so that the weighted ensemble approximates the Bayesian posterior.

SIR is susceptible to a phenomenon called collapse, characterized by essentially all the ensemble weight accumulating on a single ensemble member that is closest to the observations, causing the filter to catastrophically underestimate posterior dispersion. The number of ensemble members required to avoid SIR collapse depends on system covariance and observation error covariance, scaling exponentially in an effective system dimension.

Specific estimates of ensemble size required to avoid collapse, provided in [21, 22], suggest one can reduce the required ensemble size by increasing eigenvalues of the observation error covariance.² Doing so carefully can also improve uncertainty quantification for a fixed number of ensemble members. Ref. [14] suggests inflating the observation error variance at small scales, letting variance grow in wavenumber, since small scales have very limited predictability in geophysical flows [10, 17, 8].

To be more precise, consider an observing system

y(q) = H{x}(q) + r^{1/2}(q) ε(q), (5.1)

where y(q) ∈ R is the observation at location q ∈ R^d, H is a function-valued observation operator acting on x, which describes the scalar system state as a function of location, r^{1/2}(q) can be imagined as the standard deviation of the observation error at q, and r^{1/2}(q)ε(q) is the random observation error.

It is natural to think of ε as a random field. But letting the spectrum grow in wavenumber precludes pointwise definition of ε, with probability 1, so it is not a random field in the traditional sense. The idea of imposing a correlation structure with a growing spectrum can instead be understood in the framework of generalized random fields. In this case ε can be treated as a random process with realizations taking the form of tempered distributions, i.e. elements of the topological dual to a Schwartz space of rapidly decaying functions on R^d.

Since realizations of ε with a growing spectrum cannot be described pointwise, we instead interpret the spatially parametrized terms in (5.1) as averages with respect to a Schwartz function ν that is closely concentrated near q.

¹Sequential importance sampling (SIS) is also known as particle filtering. In this paper we specialize to SIS with resampling (SIR), the most famous variety of particle filter, though the dimensional curse and presumably our applications also extend to other particle filter varieties.
²Decreasing eigenvalues of the system covariance has the same effect, for the same reasons. However, it is harder to justify changes to a dynamical model, and to implement those changes, than to modify the observation model as we propose.
For example,

ε(q) ≡ ∫_{R^d} ε ν dx / ∫_{R^d} ν dx. (5.2)

Narrowing our attention within the scope of generalized random fields, let ε be a mean-zero stationary Gaussian generalized random field (GGRF). Then the vector (ε(q_1), · · · , ε(q_{N_y})) is a multivariate normal random variable with zero mean and a covariance matrix C with entries C_ij that depend only on ‖q_i − q_j‖. Hence the vector of observations y ≡ (y(q_1), · · · , y(q_{N_y})) conditioned on x is a multivariate normal random variable with mean H(x) and covariance R^{1/2}CR^{1/2}, where R^{1/2} is a diagonal matrix of the discrete observation standard deviations r^{1/2}(q_i) that can be treated as instrument errors, and H(·) : R^{N_y} → R^{N_y} is an observation operator acting on the discrete vector x that characterizes the underlying system state. The discrete observing system can be summarized in the form

y = H(x) + R^{1/2}ε ∈ R^{N_y}. (5.3)

In the spirit of [9, 18], a connection between elliptic stochastic partial differential equations and random fields enables us to make use of fast algorithms for PDEs in the context of solving for the likelihood under GGRF models, rather than naively developing a dense approximation of C and then solving the associated linear system. In seeking to build a GGRF error model with a likelihood that is cheap to evaluate in high dimensions, it was this connection that led to the development of the more generally-applicable blurring algorithm presented in this paper.

We specifically treat the continuous field of observation error as ε = DW, where D = (1 − ℓ²Δ)^β acts on a spatial white noise W with mean zero and unit pointwise variance. As in (2.1), Δ is the formal Laplacian operator, ℓ > 0 and β > 0, and eigenfunctions of D are Fourier modes of wavenumber k with corresponding eigenvalues (1 + ℓ²|k|²)^β.
Recall also that the characteristic scale of this operator is ℓ/(2π√(2^{1/β} − 1)). Describing ε in this manner therefore ascribes a variance to large scales that is commensurate with instrument error, but it also progressively and unboundedly inflates variance for small scales at a rate controlled by β. The GGRF description of observation error is thus a kind of surrogate model for the assumption that small scales have limited predictability. To see this equivalence, observe how the correlation matrix features in the Gaussian likelihood, the logarithm of which is proportional to

(y − Hx)^T R^{−1/2} C^{−1} R^{−1/2} (y − Hx). (5.4)

Then consider preprocessing the "standardized innovations" R^{−1/2}(y − Hx) with a linear operation

R^{−1/2}(y − Hx) ↦ SR^{−1/2}(y − Hx). (5.5)

If these blurred observations are now assimilated under the assumption that the errors in the blurred field are standard normal, then the log-likelihood is proportional to

(y − Hx)^T R^{−1/2} S^T S R^{−1/2} (y − Hx). (5.6)

Any valid covariance matrix must be symmetric and positive definite. If S is a positive definite blurring operator (i.e. a positive definite operator with a decaying spectrum toward small scales), then C = (S^T S)^{−1} is a symmetric positive definite operator with a spectrum that grows toward small scales. In light of these considerations, take S to be the algorithm presented in section 2. Although Theorem 2.1 shows that S is positive definite, it is typically not symmetric, and the symmetric part of S is not necessarily positive definite. For this reason we must treat R^{1/2}(S^T S)^{−1}R^{1/2} as a covariance matrix, rather than R^{1/2}S^{−1}R^{1/2}.

In summary, we propose blurring standardized innovations with S in such a way that (S^T S)^{−1} is a covariance for a discretized GGRF.
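Operationally, the proposal adds one linear map to the SIR weight computation. The sketch below is a hypothetical toy setup: any positive definite blurring matrix can play the role of S, and here a row-normalized Gaussian smoother on an index grid stands in for it. It computes particle weights from blurred standardized innovations per (5.6), along with the resulting effective sample size.

```python
import numpy as np

def sir_weights(y, ensemble, r_var, S):
    """Normalized SIR weights from blurred standardized innovations, per (5.6):
    log w_i proportional to -0.5 * || S R^{-1/2} (y - x_i) ||^2."""
    r_inv_sqrt = 1.0 / np.sqrt(r_var)                # diagonal R^{-1/2}
    logw = np.empty(len(ensemble))
    for i, xm in enumerate(ensemble):
        v = S @ (r_inv_sqrt * (y - xm))              # blurred innovations
        logw[i] = -0.5 * float(v @ v)
    w = np.exp(logw - logw.max())                    # stabilize before normalizing
    return w / w.sum()

# Toy setup: 20 members, 50 observation sites, unit observation variance.
rng = np.random.default_rng(1)
ens = rng.normal(size=(20, 50))
y = rng.normal(size=50)

# Stand-in blurring matrix: row-normalized Gaussian smoother on an index grid.
idx = np.arange(50)
S = np.exp(-((idx[:, None] - idx[None, :]) ** 2) / (2.0 * 3.0 ** 2))
S /= S.sum(axis=1, keepdims=True)

w = sir_weights(y, ens, np.ones(50), S)
ess = 1.0 / float(np.sum(w ** 2))                    # effective sample size
```

With smoothing of this kind the spread of the log-weights shrinks, so the effective sample size is typically much larger than with unblurred innovations (S = I), which is the mechanism the section describes for delaying collapse.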
6. Example 3: SIR with radiosonde data.
To demonstrate the behavior of our blurring algorithm on scattered data and its impact on SIR weights, we make use of data from the U.S. National Center for Atmospheric Research (NCAR) Convection-Allowing Ensemble [19, 20]. The NCAR ensemble produced real-time 48-hour forecasts over the conterminous United States (CONUS) from April 7, 2015 to December 30, 2017. The ensemble forecasting system consisted of two components: an 80-member ensemble assimilation system operating at 15 km resolution and a 10-member ensemble forecast system operating at 3 km resolution. We make use of the 80-member ensemble data. The assimilation system used the Advanced Research version of the Weather Research and Forecasting (WRF) model; observations were assimilated in a 6-hour cycle via the Ensemble Adjustment Kalman Filter [2] implemented in the Data Assimilation Research Testbed software suite [1]. Every assimilation cycle processed between 66,000 and 70,000 observations from a variety of sources including radiosondes, aircraft measurements, satellite wind measurements, and Global Positioning System radio occultation data, among others. Further details are provided in [19].

To verify that our blurring algorithm performs as expected on scattered data, we apply it to radiosonde temperature measurements at a single pressure level. Every 12 hours, i.e. every other assimilation window, there are between 90 and 97 radiosonde measurements scattered across North and Central America and the Caribbean available at various pressure levels.

Fig. 4. Left: Temperature fields interpolated from radiosonde data measured on May 15, 2017 at a pressure level of 70 kPa. Right: the result of applying our blur with parameters ξ^{1/2} = 5°, β = 1, and ℓ = 4°. The shape of North America is underlaid to give a sense of scale, and small circles indicate measurement locations.

An example of the locations of these observations at a pressure level of 70 kPa on May 15, 2017 is shown in Fig. 4. The left panel shows the locations of the measurements along with an interpolated temperature field obtained using Gaussian RBFs with standard deviation ξ^{1/2} = 5°. The right panel shows the result of applying our blurring algorithm with blurring exponent β = 1 and ℓ = 4°. Figure 4 provides visual evidence that our algorithm indeed blurs scattered data. Interpolating the raw data directly would leave strange regions of approximately zero Kelvin in the interpolants depicted, so for the purpose of visualization we subtract the mean before applying the blur and then add the mean back to the blurred data.

We next verify that the algorithm has a controllable degree of blurring with the desired effect on SIR weights, viz. that the effective sample size increases as the blurring length scale ℓ increases. (The innovation of an ensemble member is the difference between the observation and the forecast.) To that end we use the 80-member ensemble forecast for temperature at the locations of the radiosonde temperature observations, assume that the forecast weights are all equal to 1/
80, and update the weights based on mismatch to the observations using the standard SIR update formula.

To be precise, let y be the vector of radiosonde temperature observations at a given time, let Hx^{(i)} be the vector of forecast temperatures at the same time and locations for ensemble member i, and let R^{1/2} be a diagonal matrix whose diagonal contains the standard deviations of the observation errors. The un-normalized weight for the i-th ensemble member is

    (6.1) w̃_i = exp{ −(1/(2σ²)) (y − Hx^{(i)})^T R^{−1/2} S^T S R^{−1/2} (y − Hx^{(i)}) },

where S is the matrix corresponding to the blurring operator, σ = ‖S𝟙‖ is a rescaling factor, and 𝟙 is a unit vector with identical entries. The normalized weights are

    (6.2) w_i = w̃_i / Σ_{j=1}^N w̃_j,

and the effective sample size (ESS) is

    (6.3) ESS = 1 / Σ_{i=1}^N w_i².

Fig. 5. Distributions of effective sample sizes, for different values of the blurring length scale ℓ, observed for the posterior ensembles on radiosonde temperature. The posterior ensembles are obtained by performing an importance sampling update on the WRF forecast as a prior ensemble, using actual radiosonde temperature measurements at an atmospheric pressure level of 50 kPa. The distributions in this plot depict ESS values for SIR weights computed twice daily over the entire month of May 2017.

Figure 5 shows violin plots of the ESS computed twice daily over the entire month of May 2017 using radiosonde temperature data at a pressure level of 50 kPa. The standard particle filter (i.e. without blurring, or ℓ = 0) exhibits very poor performance, with ESS only rarely rising much beyond 1. As the blurring length scale ℓ increases from 0, the distribution of ESS also increases. With ℓ = 4°, the median ESS is approximately 3 and ESS occasionally rises beyond 5.
We emphasize that these values of ESS are still quite small for an 80-member ensemble, but the goal here has not been to demonstrate the performance of the particle filter per se, but rather of the blurring algorithm for scattered data. That said, even a tiny increase of the ESS beyond its minimum possible value of 1 is promising, because it offers hope that uncertainty quantification will improve faster upon increasing the ensemble size.
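The weight and ESS computations (6.1)-(6.3) can be sketched as follows. This is a schematic with synthetic inputs, not the NCAR/radiosonde pipeline; the function name and the identity stand-ins for S and R are ours.

```python
import numpy as np

def sir_weights_and_ess(y, Hx_ens, r_std, S):
    """Blurred-innovation SIR weights and effective sample size, per (6.1)-(6.3).

    y: observations (m,); Hx_ens: ensemble forecasts in observation space (N, m);
    r_std: observation error standard deviations (m,); S: blurring matrix (m, m).
    sigma = ||S 1|| rescales the blur so that constant fields are roughly preserved.
    """
    ones = np.ones(len(y)) / np.sqrt(len(y))   # unit vector with identical entries
    sigma = np.linalg.norm(S @ ones)
    log_w = np.empty(len(Hx_ens))
    for i, Hx in enumerate(Hx_ens):
        z = S @ ((y - Hx) / r_std)             # blurred standardized innovation
        log_w[i] = -0.5 * (z @ z) / sigma**2   # log of eq. (6.1)
    w = np.exp(log_w - log_w.max())            # subtract max for numerical stability
    w /= w.sum()                               # normalized weights, eq. (6.2)
    ess = 1.0 / np.sum(w**2)                   # effective sample size, eq. (6.3)
    return w, ess

rng = np.random.default_rng(0)
m, N = 20, 80                                  # observations, ensemble members
y = rng.standard_normal(m)
ens = rng.standard_normal((N, m))
w, ess = sir_weights_and_ess(y, ens, np.ones(m), np.eye(m))
```

By construction the weights sum to one and the ESS lies between 1 (one member carries all the weight) and N (all weights equal); substituting a blurring matrix for the identity damps the small-scale contributions to the quadratic form and pushes the ESS upward.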
7. Computational considerations.
The first step of our blur involves solving a dense linear system (2.3) for interpolation weights b. A naive approach to doing so would require O(N_y²) storage and O(N_y³) operations. This would be highly undesirable, especially if the interpolation matrix B changes between assimilation cycles due to changing observation locations. Happily, there exist algorithms that solve a Gaussian RBF problem much faster. These notably include PetRBF, which is based on GMRES iteration with a restricted additive Schwarz method preconditioner. PetRBF requires O(N_y) storage and O(N_y) operations in arbitrary dimension d, and is scalable to many cores as implemented in PETSc [25].

Another potential bottleneck is evaluating the sum of N_y M Gaussians at N_y target locations, with M = M_− + M_+ + 1 terms in the approximation of g. A direct approach to evaluating this sum, as written in Algorithm 2.2, has time complexity O(N_y² M). This Gaussian sum approximation can be reduced to O(N_y + N_y M) with the Fast Gauss Transform (FGT) [6]. The FGT has exponential time complexity in the location dimensionality d, so it often runs slower than direct evaluation when d is greater than 2. The Improved Fast Gauss Transform (IFGT) exists to eliminate the exponential scaling that hinders the original FGT for large d [24]. The IFGT can be challenging to use in practice, but there exist approaches to assist in automatic tuning, such as [13]. Finally, ASKIT offers a newer approach to the kernel summation problem that extends to non-Gaussian kernels and may be less fragile in high dimensions [11].

Our algorithm uses an interpolation basis function whose width remains fixed throughout the domain. This may be problematic when the density of observation locations is highly heterogeneous, since an RBF standard deviation ξ large enough to resolve smooth features in a sparsely sampled region may be large enough to cause numerical problems in densely sampled regions.
Those numerical problems may arise from ill-conditioning of B, or from a loss of the data locality that divide-and-conquer solvers like PetRBF expect. There is some extant literature on the use of nonuniform RBF width parameters to address this situation [5], so that the size of the basis function can adapt to the density of observations. Using adaptive widths involves interpolation with a basis {ψ_i} that is allowed to vary with i. Adaptive widths can be incorporated into our blur simply by modifying (2.3) and (2.18) to let ξ vary with i. Using nonuniform width parameters no longer comes with guaranteed nonsingularity, but [5] suggests that singularity is the exception rather than the rule.

Unfortunately, many fast solvers for the RBF problem are incompatible with basis functions that vary by location. One possibility for reducing the cost of solving for interpolation weights in this case is to choose compactly supported basis functions ψ_i so that B is sparse. We are unaware of any compactly supported radial basis functions with positive Fourier transforms that are simple to convolve with a Gaussian, particularly for arbitrary d. But performing the interpolation in terms of compactly supported bases ψ_i can be made compatible with the rest of our blurring method simply by approximating each ψ_i with a sum of Gaussians. The resulting Gaussian approximation of the data will not be an interpolant, but careful construction can make it accurate. Theorem 2.1 therefore does not apply, but we can still expect this substitution to yield a good approximation of the convolution acting on the original interpolant.

It is similarly possible to choose a different convolution kernel g to approximate with a sum of Gaussians. This idea can be used to implement a blur of the form presented here with a wider variety of characteristics, such as a non-monotonic response in length scale.
If the Gaussian approximation kernel possesses a positive Fourier transform, and the RBF interpolation employs a uniform basis function, then Theorem 2.1 still applies to guarantee that S^T S is a valid covariance matrix in its application shown in section 5.

To reduce the M prefactor in the convolution step, we can apply a reduction algorithm based on Prony's method, with the suboptimal approximation (2.8) as a starting point [3]. Doing so yields an optimal multiresolution approximation of the integral kernel for given uniform relative error bounds, which may require substantially fewer terms to attain the same relative accuracy. This reduction method may be particularly helpful for forms of D (and hence g) that do not admit such a rapidly convergent approximation as (2.15).
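The two dominant costs discussed in this section, the dense weight solve and the Gaussian-sum convolution, can be combined into a minimal end-to-end sketch. This is the naive dense baseline that PetRBF-style solvers and fast Gauss transforms accelerate; the coefficients c_m and widths tau_m below are illustrative stand-ins for the multiresolution approximation of g, with normalization conventions folded into c_m.

```python
import numpy as np

def blur_scattered(points, values, xi, taus, c):
    """End-to-end sketch: Gaussian RBF weight solve, then Gaussian-sum convolution.

    Convolving a Gaussian RBF of variance xi^2 with a Gaussian kernel term of
    variance tau_m^2 yields a Gaussian of variance xi^2 + tau_m^2, so the blur
    reduces to direct Gaussian sums (the O(N^2 M) loop that the FGT accelerates).
    """
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    B = np.exp(-d2 / (2.0 * xi**2))          # Gaussian RBF interpolation matrix
    b = np.linalg.solve(B, values)           # O(N^3) dense solve for weights
    out = np.zeros(len(values))
    for cm, tau in zip(c, taus):             # sum over kernel terms
        var = xi**2 + tau**2                 # variance of the convolved Gaussian
        out += cm * (np.exp(-d2 / (2.0 * var)) @ b)
    return out

# A 5x5 grid stands in for scattered locations; any distinct points work.
g = np.linspace(0.0, 1.0, 5)
pts = np.array([(x, y) for x in g for y in g])
vals = np.sin(2.0 * np.pi * pts[:, 0])
blurred = blur_scattered(pts, vals, xi=0.1, taus=[0.0, 0.2], c=[0.5, 0.5])
```

With a single term of width zero the "blur" reduces to the interpolant itself, which reproduces the data at the nodes; wider terms attenuate small scales.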
8. Conclusions.
We have described a method to blur data measured at N_y locations that are arbitrarily scattered in R^d, for arbitrary d, by applying a discrete approximation to the integral equation that inverts the fractional bound state Helmholtz operator (1 − ℓ²Δ)^β. The degree of attenuation for different length scales can be tuned by adjusting the parameters ℓ > 0 and β > 0; large scales are attenuated little, but length scales shorter than ℓ/(2π√(2^{1/β} − 1)) are rapidly suppressed at a rate determined by β.

The discrete approximation results from a multiresolution Gaussian approximation to the differential operator's Green's function. This readily permits convolution with a sum of Gaussians that approximates the data; we take the sum-of-Gaussians approximation to be a radial basis function (RBF) interpolant with a Gaussian kernel. The blur is shown to be a positive definite linear operator on R^{N_y} in a more general context where the interpolation basis and the convolution kernel have positive Fourier transforms.

Spectral properties of our blur are examined with an example shown in section 3. This example provides evidence that the algorithm operates as expected: attenuation gradually increases with wavenumber, roughly approximating the differential operator's inverse spectrum, with the caveat that our blur attenuates even the largest scale. In order to preserve large scales, we propose dividing S by ‖S𝟙‖, where 𝟙 is a unit vector with all entries identical.

Section 4 shows an example application of our blurring algorithm to sea surface temperature data measured by oceanographic floats in the Argo project. In contrast to the typical approach of decomposing these data into large-scale and small-scale components, our method circumvents the need for interpolating onto a regular grid.

Our blur is developed with application to Sequential Importance Sampling with Resampling (SIR) particle filters in mind.
Section 5 describes how blurring observations with S before assimilating them as if they have uncorrelated errors is equivalent to assuming that the observation errors have covariance (S^T S)^{−1}, which gives observation errors the correlation structure of a stationary generalized Gaussian random field. Relative to an uncorrelated model, an observation error model of this type decreases the number of ensemble members required to achieve good uncertainty quantification from SIR for spatially extended dynamical systems [14].

Section 6 demonstrates that this blur has the desired effect of helping to balance SIR weights in an example with real meteorological data, which improves uncertainty quantification by reducing the tendency of SIR to produce underdispersed posterior distributions in high dimensions. This example is chosen to be suggestive of potential future applications to geophysical fluid dynamics, but it is worth characterizing the traits of applications that would be more appropriate. The extratropical temperature field in section 6 probably features little dynamical nonlinearity at large scales, and its measurements are linear and Gaussian, so this corpus of data is an excellent candidate for assimilation with any one of the many variants of the Ensemble Kalman Filter. A more appropriate application of blurred-observation SIR would feature substantially non-Gaussian behavior at large scales. That can arise from nonlinear dynamics at large scales, from large dispersion of a non-negative state variable relative to its mean, or from a nonlinear observation operator inducing a non-Gaussian posterior distribution. Moist convective systems, for example, have nonlinear dynamics and substantially skewed sign-definite variables. Examples of nonlinear observation operators that could provide similar motivation for SIR include satellite radiance and precipitation measurements.
Any of these features could provide motivation for accepting the computational challenge of SIR in exchange for provable convergence to the smoothed-observation surrogate model with its controllable bias. However, it remains to be seen whether a smoothed-observation SIR filter can beat methods in the EnKF family when applied to real atmospheric problems.

A naive implementation of Algorithm 2.1 requires O(N_y²) memory and O(N_y³) operations to solve for interpolant weights. However, section 7 describes how specialized kernel matrix solvers and fast kernel summation methods can reduce the asymptotic complexity of our algorithm to O(N_y).

Acknowledgments.
We are grateful to Jeff Anderson and Glen Romine for their help in accessing and using the NCAR ensemble data. We are grateful to Gregory Beylkin for pointing us towards a multiresolution Gaussian approximation of the blurring kernel. The Argo data used here (http://doi.org/10.17882/42182
REFERENCES

[1] J. Anderson, T. Hoar, K. Raeder, H. Liu, N. Collins, R. Torn, and A. Avellano, The data assimilation research testbed: A community facility, Bull. Amer. Meteor. Soc., 90 (2009), pp. 1283–1296.
[2] J. L. Anderson, A local least squares framework for ensemble filtering, Mon. Wea. Rev., 131 (2003), pp. 634–642.
[3] G. Beylkin and L. Monzón, Approximation by exponential sums revisited, Appl. Comput. Harmon. Anal., 28 (2010), pp. 131–149.
[4] B. Fornberg and N. Flyer, A primer on radial basis functions with applications to the geosciences, SIAM, 2015.
[5] B. Fornberg and J. Zuev, The Runge phenomenon and spatially variable shape parameters in RBF interpolation, Computers & Mathematics with Applications, 54 (2007), pp. 379–398.
[6] L. Greengard and J. Strain, The fast Gauss transform, SIAM Journal on Scientific and Statistical Computing, 12 (1991), pp. 79–94.
[7] R. A. Horn and C. R. Johnson, Matrix analysis, Cambridge Univ. Press, Cambridge, 23rd printing, 1990.
[8] F. Judt, Insights into atmospheric predictability through global convection-permitting model simulations, J. Atmos. Sci., 75 (2018), pp. 1477–1497.
[9] F. Lindgren, H. Rue, and J. Lindström, An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach, J. Roy. Stat. Soc. B, 73 (2011), pp. 423–498.
[10] E. N. Lorenz, The predictability of a flow which possesses many scales of motion, Tellus, 21 (1969), pp. 289–307.
[11] W. B. March, B. Xiao, and G. Biros, ASKIT: Approximate skeletonization kernel-independent treecode in high dimensions, SIAM Journal on Scientific Computing, 37 (2015), pp. A1089–A1110.
[12] W. McLean, Exponential sum approximations for t^{−β}, in Contemporary Computational Mathematics: A Celebration of the 80th Birthday of Ian Sloan, Springer, 2018, pp. 911–930.
[13] V. I. Morariu, B. V. Srinivasan, V. C. Raykar, R. Duraiswami, and L. S. Davis, Automatic online tuning for fast Gaussian summation, in Advances in Neural Information Processing Systems, 2009, pp. 1113–1120.
[14] G. Robinson, I. Grooms, and W. Kleiber, Improving particle filter performance by smoothing observations, Monthly Weather Review, 146 (2018), pp. 2433–2446.
[15] D. Roemmich and J. Gilson, The 2004–2008 mean and annual cycle of temperature, salinity, and steric height in the global ocean from the Argo Program, Progress in Oceanography, 82 (2009), pp. 81–100.
[16] D. Roemmich, G. C. Johnson, S. Riser, R. Davis, J. Gilson, W. B. Owens, S. L. Garzoli, C. Schmid, and M. Ignaszewski, The Argo Program: Observing the global ocean with profiling floats, Oceanography, 22 (2009), pp. 34–43.
[17] R. Rotunno and C. Snyder, A generalization of Lorenz's model for the predictability of flows with many scales of motion, J. Atmos. Sci., 65 (2008), pp. 1063–1076.
[18] H. Rue and L. Held, Gaussian Markov random fields: theory and applications, CRC Press, 2005.
[19] C. S. Schwartz, G. S. Romine, R. A. Sobash, K. R. Fossell, and M. L. Weisman, NCAR's experimental real-time convection-allowing ensemble prediction system, Weather Forecast., 30 (2015), pp. 1645–1654.
[20] C. S. Schwartz, G. S. Romine, R. A. Sobash, K. R. Fossell, and M. L. Weisman, NCAR's real-time convection-allowing ensemble project, Bull. Amer. Meteor. Soc., 100 (2019), pp. 321–343.
[21] C. Snyder, T. Bengtsson, P. Bickel, and J. Anderson, Obstacles to high-dimensional particle filtering, Mon. Wea. Rev., 136 (2008), pp. 4629–4640.
[22] C. Snyder, T. Bengtsson, and M. Morzfeld, Performance bounds for particle filters using the optimal proposal, Mon. Wea. Rev., 143 (2015), pp. 4750–4761.
[23] C. Wortham and C. Wunsch, A multidimensional spectral description of ocean variability, Journal of Physical Oceanography, 44 (2014), pp. 944–966.
[24] C. Yang, R. Duraiswami, N. A. Gumerov, and L. Davis, Improved fast Gauss transform and efficient kernel density estimation, IEEE, 2003, p. 464.
[25]