[PDF] Feature augmentation for the inversion of the Fourier transform with limited data

Abstract

We investigate an interpolation/extrapolation method that, given scattered observations of the Fourier transform, approximates its inverse. The interpolation algorithm takes advantage of modelling the available data via a shape-driven interpolation based on Variably Scaled Kernels (VSKs), whose implementation is here tailored for inverse problems. The so-constructed interpolants are used as inputs for a standard iterative inversion scheme. After providing theoretical results concerning the spectrum of the VSK collocation matrix, we test the method on astrophysical imaging benchmarks.

Full PDF

FFeature augmentation for the inversion of theFourier transform with limited data

Emma Perracchione , Anna Maria Massone , , Michele Piana , Dipartimento di Matematica, Universit`a degli Studi di Genova, Via Dodecaneso 35,16146 Genova, Italy CNR-SPIN, Via Dodecaneso 33, 16146 Genova, ItalyE-mail: [email protected]; [email protected];[email protected]

February 2021

Abstract.

We investigate an interpolation/extrapolation method that, givenscattered observations of the Fourier transform, approximates its inverse. Theinterpolation algorithm takes advantage of modelling the available data via a shape-driven interpolation based on Variably Scaled Kernels (VSKs), whose implementationis here tailored for inverse problems. The so-constructed interpolants are used as inputsfor a standard iterative inversion scheme. After providing theoretical results concerningthe spectrum of the VSK collocation matrix, we test the method on astrophysicalimaging benchmarks.

Keywords : Shape-driven interpolation, Variably Scaled Kernels, feature augmentation,Solar X-ray imaging

Submitted to:

Inverse Problems

1. Introduction

The inversion of the Fourier transform with limited data is a well-known issue that leadsto bandlimited extrapolation problems and is common to many applied sciences, such asmicroscopy, medical imaging, seismology, and radio-astronomy (see e.g. [1–4]). In somecases, e.g. when the sampling is not uniform, preliminary interpolation methods areneeded to reconstruct the frequency information. In view of this, we drive our attentiontowards interpolation/extrapolation approaches (refer, e.g. to [5, 6]) and, speciﬁcally, tothe reconstruction scheme that consists of the following two steps: • Interpolation of the scattered observations of the Fourier transform so that auniform sampling in the frequency domain is generated. • Extrapolation and inversion of the so-generated interpolants via Fast FourierTransform (FFT)-based iterative methods. a r X i v : . [ m a t h . NA ] F e b eature augmentation for inverse problems a priori information (when available) leading to a shape-driven approximation.Such additional knowledge is implicitly put into the kernel via a scaling function , whichdetermines the accuracy of the approximation process.The main goal of this study is to tailor the VSKs for the Fourier inversion issue withlimited data. Precisely, given a ﬁrst and possibly rough approximation of the inverseproblem, we solve the forward issue and starting from the so-generated approximationin the frequency domain, we deﬁne the scaling function. In this way we providethe interpolation/extrapolation algorithm with the resources to properly compute theinverse of the Fourier transform by limited data.We further point out that also the kernel basis needs to be properly selectedfor the inversion procedure, which is known to be ill-conditioned. Therefore, for thepractical implementation of the VSK setting, we consider the Mat´ern C kernel (seee.g. [16, 17]) that, when dealing with real and possibly noisy data, takes advantage ofits low regularity, leading to reliable results. For that kernel, we provide an analysis ofits spectrum in the VSK framework. This theoretical study shows that the Mat´ern C VSK might be less aﬀected by ill-conditioning than the classical kernel.After providing the theoretical validation for the use of VSKs in the contextof inverse problems, the method is extensively tested on real and simulated datacoming from the framework of astronomical imaging. We consider samples fromboth the

Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) [18] andthe

Spectrometer/Telescope for Imaging X-rays (STIX) [19], which are two telescopesrecording X-rays from the Sun with the main purpose of observing solar ﬂares. For bothsystems, the imaging problem can be described as an inversion of the Fourier transformwith undersampled data. The results on solar ﬂares reconstructions empiricallydemonstrate that VSKs are the key ingredient for robust interpolation/extrapolationalgorithms. eature augmentation for inverse problems

RHESSI and

STIX visibilities. Extensive numerical experiments are carried out in Section 5. Ourconclusions are oﬀered in Section 6.

2. Preliminaries

The main objective of this work is to address inverse problems involving the Fouriertransform via VSKs. In this direction we consider the following inversion issue.

Problem 2.1.

Let F : L ∩ L ( R d , R ) −→ F ( L ∩ L ( R d , R )) ⊆ L ∩ L ( R d , C ) , with d ∈ N + , be the Fourier operator deﬁned as ( F I )( x ) = V ( u ) , x ∈ X, u ∈ U, (1) where I : X ⊆ R d −→ R and V : U ⊆ R d −→ C . Given some scattered observations ona compact set D ⊆ U of the function V , the problem consists in ﬁnding an approximationof I . Since Problem 2.1 is common to many real imaging problems, we restrict to thecases for which I is a real-valued function. To approximate Problem 2.1, we proposean approach that involves kernel-based interpolation schemes at spatial frequenciesand the projected Landweber iterative method [20]. The latter is a well-knownconstrained iterative scheme that realizes extrapolation and artifact reduction usinga soft-thresholding inversion process. For the interpolation issue, which is the focus ofthe present paper, we study meshless schemes that make use of feature augmentationstrategies. Those methods will be analyzed in Section 3, while here we brieﬂy reviewthe iterative inversion approach. Given the set of scattered observations on D ⊆ R d , the interpolation operator returnsan approximation of the complex-valued function V on U ⊇ D . Then, for stabilitypurposes of the inversion scheme we might not allow extrapolation via interpolation outof the compact set D . Therefore, the interpolant, denoted by P V = ( P Re( V ) , P Im( V ) ),might be ﬁrst projected as follows: P V D := χ D P V , where χ D is the characteristic function operator of the compact set D .Note that, ˜ I = F − (cid:0) P V D (cid:1) , eature augmentation for inverse problems I : X −→ R . An extrapolationout of the band D can be computed via the projected Landweber iteration in followingsteps:(i) Initialize the iterative scheme as I (0) = 0.(ii) For a given threshold τ ∈ R and starting from I ( k ) , compute F ( I ( k +1) ) = τ F ( ˜ I ) + (1 − τ χ D ) F ( I ( k ) ) , k = 1 , , . . . . (2)and apply the non-negativity constraint by means of the projection I ( k +1) = P + I ( k +1) (3)where (cid:0) P + I ( k +1) (cid:1) ( x ) = (cid:40) I ( k +1) ( x ) < ,I ( k +1) ( x ) otherwise . Remark 2.1.

It is a well-established result [21] that the projection onto the convex setof non-negative functions is a way to realize extrapolation.

Remark 2.2.

The computation of I ( k +1) at the right hand side of equation (3) is realizedby applying an FFT-based routine to equation (2). The result of the extrapolation scheme depends on the approximation of V : U ⊆ R d −→ C . For smooth functions, many approximation methods could be successfullyimplemented. However, such functions might be sampled at few scattered data andmight be characterized by steep gradients. In that case we need to drive our attentiontowards data-driven interpolation methods.

3. Interpolation

Given a set scattered nodes U = { u i = ( u i , v i ) , i = 1 , . . . , n } ⊆ U and the set ofassociated function values V = { V i = (Re( V i ) , Im( V i )) , i = 1 , . . . , n } ⊆ C , the aim isto ﬁnd two interpolating functions P Re( V ) and P Im( V ) : U −→ R so that P Re( V ) ( u i ) = Re( V i ) , and P Im( V ) ( u i ) = Im( V i ) , i = 1 , . . . , n. (4)For constructing the interpolants we consider kernel methods that take advantageof being meshless and naturally multivariate. The interpolation problems (4) have a unique solution if P Re( V ) and P Im( V ) ∈ span { K ε ( · , u i ) , u i ∈ U } , where K ε : U × U −→ R is a strictly positive deﬁnite andradial kernel and ε > shape parameter . We also remark that to theradial kernel K we can associate a continuous function ϕ ε : [0 , + ∞ ) −→ R , such that K ε ( w , z ) = ϕ ε ( (cid:107) w − z (cid:107) ) , eature augmentation for inverse problems w , z ∈ U . The function ϕ ε is usually referred to as a Radial Basis Function(RBF).For u ∈ U the interpolants are of the form P Re( V ) ( u ) = n (cid:88) k =1 α k K ε ( u , u k ) , and P Im( V ) ( u ) = n (cid:88) k =1 β k K ε ( u , u k ) . The coeﬃcients α = ( α , . . . , α n ) (cid:124) and β = ( β , . . . , β n ) (cid:124) are determined by solving thefollowing linear systems: K α = Re( V ) , and K β = Im( V ) , (5)where the entries of K ∈ R n × n are given by K ik = K ε ( u i , u k ) , i, k = 1 , . . . , n, (6)moreover, Re( V ) = (Re( V ) , . . . , Re( V n )) (cid:124) and Im( V ) = (Im( V ) , . . . , Im( V n )) (cid:124) are thevectors of data values.The kernels are characterized by diﬀerent regularities and especially for inﬁnitelysmooth kernels the selection of the shape parameter aﬀects the accuracy of thereconstruction, meaning that inappropriate choices of its value might lead to poorapproximations (see e.g. [22, 23] for a general overview). To select safe values of theshape parameter, we need to introduce the so-called native spaces . First, we introducethe pre-Hilbert space H K ε ( U ) with reproducing kernel K ε , given by H K ε ( U ) = span { K ε ( · , u ) , u ∈ U } , and equipped with a bilinear form ( · , · ) H Kε ( U ) . Then, the native space N K ε ( U ) is deﬁnedas the completion of H K ε ( U ) with respect to the norm || · || H Kε ( U ) , i.e. || ν || H Kε ( U ) = || ν || N Kε ( U ) , for all ν ∈ H K ε ( U ) [24, 25].To provide error bounds, we need to introduce one more ingredient, the so-called power function . Let K be the interpolation matrix related to the set of nodes U and ˜ K be the matrix associated to the augmented set { u } ∪ U , u ∈ U . The power function isdeﬁned as [26] P K ε , U ( u ) := (cid:115) det( ˜ K )det( K ) , u ∈ U. The following pointwise error bound, that uses the power function and the norm ofthe sought function in the native space, holds true (see e.g. [24, Th. 14.2, p. 117]).

Theorem 3.1.

Let K ε : U × U −→ R be a strictly positive deﬁnite and continuous kerneland U = { u i , i = 1 , . . . , n } ⊆ U a set of distinct points. For all functions ν : U −→ R ,so that ν ∈ N K ε ( U ) , we have that | ν ( u ) − P ν ( u ) | ≤ P K ε , U ( u ) || ν || N Kε ( U ) , u ∈ U. eature augmentation for inverse problems safe shape parameter. Indeed, we will selectthe value of the shape parameter that minimizes the power function computed over thedata. Remark 3.1.

As we will point out via numerical experiments, this standard kernel-based interpolation procedure might suﬀer when interpolating functions sampled at veryfew points and/or characterized by steep gradients. To partially avoid this drawback weintroduce the VSKs with the scope of encoding into the kernel itself some prior knowledgefor the interpolation issue.3.2. Variably scaled kernels

The VSKs have been introduced in [13] and later they have been studied to preserveshape properties of the reconstruction surfaces or as edge detection strategies; we referthe reader to [14, 15, 27, 28] for further details on the topic. Here, we investigate themin the context of inverse problems.

The deﬁnition of VSKs relies upon a scalingfunction

Ψ : U −→ Σ, where Σ ⊆ R m and m ≥ Deﬁnition 3.1.

Let K : ( U × Σ) × ( U × Σ) −→ R be a continuous strictly positive deﬁnitekernel. Given a scaling function Ψ : U −→ Σ , a variably scaled kernel K Ψ ε : U × U −→ R is deﬁned as K Ψ ε ( w , z ) := K ε (( w , Ψ( w )) , ( z , Ψ( z )) , for w , z ∈ U . In other words, the VSKs lead to an interpolation problem with new features deﬁnedvia the function Ψ, thus realizing a feature augmentation process.Then for u ∈ U , in the varying scale setting, our interpolants are given by P ΨRe( V ) ( u ) = n (cid:88) k =1 α Ψ k K Ψ ε ( u , u k ) , = n (cid:88) k =1 α Ψ k K ε (( u , Ψ( u )) , ( u k , Ψ( u k )) , = P Re( V ) (( u , Ψ( u )) , and P ΨIm( V ) ( u ) = n (cid:88) k =1 β Ψ k K Ψ ε ( u , u k ) , = n (cid:88) k =1 β Ψ k K ε (( u , Ψ( u )) , ( u k , Ψ( u k )) , = P Im( V ) (( u , Ψ( u )) . eature augmentation for inverse problems α Ψ = ( α Ψ1 , . . . , α Ψ n ) (cid:124) and β Ψ = ( β Ψ1 , . . . , β Ψ n ) (cid:124) are determined bysolving the two linear systems deﬁned in (5), with kernel matrix K Ψ ∈ R n × n , whoseentries are given by K Ψ ik = K ε (( u i , Ψ( u i ) , ( u k , Ψ( u k )) , i, k = 1 , . . . , n. (7)The so-constructed interpolants are then evaluated on a grid of N × N data.Precisely, let ˜ U = { ˜ u i = (˜ u i , ˜ v i ) , i = 1 , . . . , N } ⊆ U , for i = 1 , . . . , N , we compute P Ψ Re( V ) ( ˜ u i ) = κ ( ˜ u i ) α Ψ , and P Ψ Im( V ) ( ˜ u i ) = κ ( ˜ u i ) β Ψ , (8)where κ ( ˜ u i ) = ( K ε (( ˜ u i , Ψ( ˜ u i ) , ( u , Ψ( u )) , . . . , K ε (( ˜ u i , Ψ( ˜ u i ) , ( u n , Ψ( u n ))) . To fully deﬁne our interpolants, we now need to discuss the selection of both thescaling function and the kernel. Both choices are tailored to the inversion of the Fouriertransform with limited data, meaning that the aim is to deﬁne a scaling function thatprovides an enriched interpolation space and that, at the same time, keeps the usualill-conditioning of the kernel matrices low.

To improve the accuracy of the interpolationprocedure, we should select a scaling function that mimics the samples. Indeed, asshown in [15], this enables us to preserve shape properties of the target function. Inview of this, we take a ﬁrst and possibly rough approximation of I , namely ¯ I . Forstability purposes ¯ I might be ﬁrst segmented as follows:¯ I ( x ) = (cid:40) ¯ I ( x ) , if | ¯ I (cid:15) ( x ) | > p max x ∈ X ¯ I ( x ) , , elsewhere , (9)where p is a given threshold and x ∈ X . Then, the scaling function will be deﬁned bysolving the forward problem (1), i.e. we compute¯ V ( u ) = ( F ¯ I )( x ) , and we set Ψ( u ) = (Re( ¯ V ( u )) , Im( ¯ V ( u ))).This step certainly allows us to add new features to the original scattered data.Nevertheless, since we might need to deal with noisy data, we also have to select properkernel bases. Among several kernels which diﬀer in terms of regularities, we here select the Mat´ern C RBF whose formula is given by K ε ( w , z ) = e − ε || w − z || . eature augmentation for inverse problems C RBF is characterized bya low regularity and it is thus a reasonable choice for eventually dealing with noisydata. On the opposite, the interpolants constructed via smooth kernels might suﬀerfrom instability due to the ill-conditioning of the kernel matrices. Moreover, in thefollowing we will show that the Mat´ern VSK might enable us to improve expressivenessand stability of the standard setting. This theoretically motivates the choice of thekernel.

The concept of expressiveness is relatedto the complexity of a kernel-based model. Several studies show that the so-called VCdimension [29] and the empirical Rademacher complexity [30] are popular complexityindicators. We further remark that complex models are able to express sophisticatedlinks between the data [31].To better investigate the concept of expressiveness in our setting, we introduce theso-called spectral ratio [32]. For a given kernel matrix of the form (6), this is deﬁned as S ( K ) = (cid:107) K (cid:107) T (cid:107) K (cid:107) F = (cid:80) Ni =1 K ii (cid:113)(cid:80) Ni =1 (cid:80) Nj =1 K ij . The following deﬁnition [32, Deﬁnition 1, p. 8] states that the spectral ratio can beused as an expressiveness measure for kernels.

Deﬁnition 3.2.

Let K (1) ε and K (2) ε : U × U −→ R be two (strictly) positive deﬁnitekernels. We say that K (2) ε is more speciﬁc (or more expressive) than K (1) ε whenever forany dataset U = { u , . . . , u n } ⊆ U , we have S ( K (1) ) ≤ S ( K (2) ) . We now prove that the Mat´ern C VSK is more expressive than the classical one.To this aim, we need to compare the spectral ratios of the kernel matrices deﬁned in (6)and (7).

Proposition 3.1.

Let U = { u i , i = 1 , . . . , n } ⊆ U be a set of distinct data. Let Ψ : U −→ Σ be the scaling function for the VSK setting. Let K ε : U × U −→ R be theMat´ern C kernel, then the VSK kernel K Ψ ε : U × U −→ R is more expressive than K ε .Proof. Being the Mat´ern kernel non-increasing we obtain K ij ≥ K Ψ ij ≥ , i, j = 1 , . . . , n, which implies (cid:107) K (cid:107) F ≥ (cid:107) K Ψ (cid:107) F . Moreover, since for any w ∈ U , K ε ( w , w ) = K Ψ ε ( w , w ) = 1, i.e. K Ψ ii = K ii = 1, i = 1 , . . . , n , we have that (cid:107) K Ψ (cid:107) T = (cid:107) K (cid:107) T = n. eature augmentation for inverse problems S ( K ) = n (cid:107) K (cid:107) F ≤ n (cid:107) K Ψ (cid:107) F = S ( K Ψ ) . (cid:4) On the one hand, being the Mat´ern C VSK more expressive than the standardone, the VSK-based model might be able to deal with more complex interpolationtasks. On the other hand, too complex models might lead to instability. However,under several hypothesis, we are able to prove that the VSK setting might improve alsothe conditioning of the kernel matrices, leading to a robust interpolation tool.

To investigate the condition number of theinterpolation matrices generated via the Mat´ern kernel in the VSK setting, we will makeuse of the following result by Schur [33] that can be traced back to 1911 [34, LemmaA.5].

Theorem 3.2. If E and M ∈ R n × n are positive deﬁnite matrices, denoting by λ min and λ max the smallest and largest eigenvalue of a matrix, we have that λ min ( E ) min i =1 ,...,n M ii ≤ λ i ( E ◦ M ) ≤ λ max ( E ) max i =1 ,...,n M ii , where ◦ is the entry-wise product of matrices. To infer on the spectrum of the Mat´ern C kernel, we further have to introduce theGaussian C ∞ RBF. We remark that the Gaussian kernel K ε,G : U × U −→ R is deﬁnedas K ε,G = e − ε || w − z || . For such kernel, we observe that its VSK matrix is given by K Ψ G = K G ◦ K ϕG , (10)where K ϕij = e − ε (cid:107) Ψ( u i ) − Ψ( u j ) (cid:107) , i, j = 1 , . . . , n , and K G is the Gaussian kernel matrixbased upon the distance matrix D given by D ij = (cid:107) u i − u j (cid:107) , i, j = 1 , . . . , n. Similarly, the distance matrix in the VSK setting is deﬁned as D Ψ ij = (cid:107) ( u i , Ψ( u i )) − ( u j , Ψ( u j )) (cid:107) , i, j = 1 , . . . , n. We now have all the ingredients to study the condition number of the VSK matrixgenerated via the Mat´ern C kernel. For a given matrix M , we focus on the 2-conditionnumber deﬁned as cond( M ) = || M || || M − || . Proposition 3.2.

Let U = { u i , i = 1 , . . . , n } ⊆ U be a set of distinct data. Let Ψ : U −→ Σ be the scaling function for the VSK setting. Let K ε : U × U −→ R be theMat´ern C kernel. Given the VSK matrix K Ψ constructed via K Ψ ε : U × U −→ R , if K ◦ ( ε D − ◦ K ϕG ◦ (cid:16) ( K Ψ ) ◦ ( ε D Ψ − (cid:17) ◦− , eature augmentation for inverse problems is positive deﬁnite, we have that cond( K Ψ ) ≤ cond( K ) . Proof.

At ﬁrst we need to point out the link between the Gaussian and Mat´ern kernelmatrices. With the notation previously introduced, denoted by K the Mat´ern C kernelmatrix, we have that K G = K ◦ ( K ◦ ( ε D − ) , (11)and analogously, K Ψ G = K Ψ ◦ ( K Ψ ) ◦ ( ε D Ψ − . (12)Then, from (10) and (11) we obtain K Ψ G = K G ◦ K ϕG = K ◦ ( K ◦ ( D − ) ◦ K ϕG , which, thanks to (12), implies that K Ψ ◦ ( K Ψ ) ◦ ( ε D Ψ − = K ◦ ( K ◦ ( ε D − ) ◦ K ϕG , and thus, K Ψ = K ◦ (cid:18) K ◦ ( ε D − ◦ K ϕG ◦ (cid:16) ( K Ψ ) ◦ ( ε D Ψ − (cid:17) ◦− (cid:19) . Furthermore, since the Mat´ern kernel is strictly positive deﬁnite the associated kernelmatrices are positive deﬁnite and thus the condition number can be computed ascond( K Ψ ) = λ max ( K Ψ ) λ min ( K Ψ ) , and since (cid:18) K ◦ ( ε D − ◦ K ϕG ◦ (cid:16) ( K Ψ ) ◦ ( ε D Ψ − (cid:17) ◦− (cid:19) ii = 1 , i = 1 , . . . , n, from Theorem 3.2, we obtaincond( K Ψ ) = λ max ( K Ψ ) λ min ( K Ψ ) ≤ λ max ( K ) λ min ( K Ψ ) ≤ λ max ( K ) λ min ( K ) = cond( K ) . (cid:4) These results, together with the error bounds shown in [15], give a theoreticalvalidation for the use of VSKs. Numerical tests, carried out in the next section, showthat the proposed method can be eﬀectively used for many inverse problems, providedthat the scaling function is appropriately selected for the considered application. As anexample, in the following we test the method in the framework astronomical imaging. eature augmentation for inverse problems

4. Applications to astronomical imaging

Fourier-based imaging ﬁnds its main applications in medical and astronomical imaging[3, 4, 35]. For the latter topic, the image reconstruction problem can be designed asfollows.Let u ∈ U , U ⊆ R , be a point in the spatial frequency plane, and I ( x ) the sourcefunction corresponding to the point x = ( x, y ) belonging to the physical plane X ⊆ R .Given the following relation, V ( u ) = (cid:90) R I ( x )e πi u · x d x , (13)and some scattered observations on a compact set D ⊆ U of the function V , the imagereconstruction problem consists in ﬁnding an approximation of I .In the present study we focused on solar hard X-ray imaging and, speciﬁcally,we cast our interpolation-based reconstruction scheme into the framework of the NASA RHESSI and the ESA

STIX missions.

RHESSI had its nominal phase between February2002 and August 2018 and many inversion methods have been formulated to express itsobservations as images [36–42]. Instead,

STIX will begin its nominal phase in September2021 and the few studies devoted to its imaging process involve just synthetic data [40].

RHESSI and

STIX share the same imaging concept [43, 44], in which the measuredcounts are arranged into a set of n samples of the Fourier transform of the incomingphoton ﬂux, named visibilities, each one associated to a speciﬁc point ( u, v ) of theFourier ( u, v )-plane. In the case of RHESSI , nine collimators provided these visibilitieson 9 circles of the ( u, v )-plane with increasing radii from about 2 . × − arcsec − to2 . × − arcsec − , and n depending on the count statistics. We point out that inthe application considered in Section 5 below, we utilized detectors from 3 through 9.The radius of detector 3 is 7 . × − arcsec − . In the case of STIX , 30 subcollimatorsrelying on the Moire pattern technology, will provide 60 visibilities on 10 circles of the( u, v )-plane with increasing radii from about 2 . × − arcsec − to 7 . × − arcsec − .In order to apply our interpolation scheme to these data, we ﬁrst need to computethe scaling function Ψ and the optimal shape parameter for the two instruments. In order to deﬁne the scaling function Ψ we will make use of a ﬁrst approximationof the inverse problem obtained via a standard back-projection algorithm [45] thatcomputes the discretized inverse Fourier transform of the visibilities by means of theIDL source code vis_bpmap available in the NASA

Solar SoftWare (SSW) tree. Giventhe visibilities, the latter algorithm returns an M × M grid, namely ¯ I (cid:15) ( ¯ x i ), i = 1 , . . . , M .For stability purposes such an image is ﬁrst segmented as in (9), where we numericallyobserved that an appropriate choice for the threshold is 0 . ≤ p ≤ .

90. Then, bysolving the forward problem (13), we obtain ¯ V ( ¯ u i ), ¯ u i ∈ U , i = 1 , . . . , M . eature augmentation for inverse problems RHESSI and

STIX ,we need to point out that the interpolation/extrapolation procedure provides an imageof size M × M , with M = 128. Precisely, after evaluating the VSK interpolants on a gridof N × N data, we implement a zero-padding strategy that provides a grid of T × T pixels,with T (cid:29) N . Finally, after the inversion we subsample the T × T grid for obtaining an M × M image. Therefore, given ¯ V (¯ u i , ¯ v j ), i, j = 1 , . . . , M , for maintaining the proportionbetween the grids we take ¯ V (¯ u i , ¯ v j ), i, j = M/ − (cid:98) L/ (cid:99) − , . . . , M/ (cid:98) L/ (cid:99) + 1, where L = (cid:98) T / ( M N ) (cid:99) . Then, those values are interpolated and evaluated at the sets U and˜ U . This step generatesΨ( u i ) := ( P Re( ¯ V ) ( u i ) , P Im( ¯ V ) ( u i )) , i = 1 , . . . , n, (14)and Ψ( ˜ u i ) := ( P Re( ¯ V ) ( ˜ u i ) , P Im( ¯ V ) ( ˜ u i )) , i = 1 , . . . , N . (15)Therefore, we are able to encode the back-projection map into the kernel and implementthe VSK setting. Indeed, thanks to (14) and (15), we can construct the VSK kernelmatrix K Ψ and evaluate the VSK interpolant as in (8). Remark 4.1.

For STIX visibilities, as well as for RHESSI visibilities computed viadetectors 3-9, we ﬁx N = 320 , producing visibility grids of mesh size × − arcsec − .The value of T is then set as . We then subsample the T × T image by covering itwith M masks of size × pixels and we select the ﬁrst pixel for each mask. Thisstep leads to approximated images of pixel size of about arcsec. We conclude this section with a comment on the selection of the shape parameterfor the considered imaging problem.

In this subsection, we propose a criterion to select an optimal shape parameter forclassical kernel-based interpolation. In order to make a fair comparison between classicaland VSK interpolation, the same shape parameter will be used for the VSK setting too.Note that Theorem 3.1 bounds the pointwise error in terms of the power functionwhich depends on the kernel and on the data points but is independent of the functionvalues. This suggests a criterion to select a reliable shape parameter, i.e. the shapeparameter that minimizes the power function computed over the data. While for

RHESSI we should compute the power function for each data conﬁguration (the datalocations may vary during the acquisition process), for

STIX we can provide an a priorioptimal shape parameter. To this aim, we take 100 values of the shape parameter in[0 . ,

1] and we evaluate for each of them the corresponding maximum value of the powerfunction. With the considered Mat´ern C kernel, the result is the one plotted in Figure1. We note that, having a few data by STIX makes the problem quite stable; indeedthe error curve grows monotonically with respect to the shape parameter. Therefore,we ﬁx ε = 0 . eature augmentation for inverse problems Figure 1.

The shape parameter VS the maximum value of the power functioncomputed for the

STIX visibilities.

Despite the fact that the optimal shape parameter is investigated only for

STIX ,numerical evidence shows that such value is reliable for

RHESSI too.

5. Numerical experiments

We ﬁrst consider a motivating example devoted to stress the dependence of the proposedalgorithm on the interpolation routine and more speciﬁcally on the locations of thesampling nodes. We then consider an example involving

STIX synthetic visibilitiessimulated from a realistic ﬂaring source model and an application on experimental

RHESSI data.For all the experiments, we test and compare two reconstruction algorithms bothbased on the Landweber scheme with two diﬀerent interpolation procedures: the ﬁrst,namely Land-RBF, uses classical RBFs and the second, Land-VSK, implements theVSK strategy.

Let us consider the two dimensional Gaussian function I ( x ) = A e − B (cid:107) x − x p (cid:107) , (16)with x p , B and A chosen in such a way to mimic an X-ray emitting source from position x p on the Sun, Full Width at Half Maximum (FWHM) equal to 11 arcsec and photonﬂux equal to 10 photons cm − s − . Using the Monte Carlo code at disposal in the STIX simulation software we produce three sets of visibilities according to the ( u, v ) samplingsrepresented in Figure 2. The left panel of this ﬁgure corresponds to 100 Fibonacci nodes,i.e. well-spaced nodes in the ( u, v )-plane. We remark that the problem of ﬁnding optimaldata locations for kernel-based interpolation problems is a well-known issue studied bymany researchers (see, e.g., [25]). Without giving details, the Fibonacci nodes have eature augmentation for inverse problems Figure 2.

Illustrative example of Fibonacci (left),

RHESSI (middle) and

STIX (right)visibilities.

Figure 3.

Reconstruction of the Gaussian source function (16). From left to right:back-projection map computed via Fibonacci,

RHESSI and

STIX data locations,respectively. These back-projection maps have been used to generate the scalingfunctions exploited by Land-VSK. the property of being well-distributed, and hence any RBF-based interpolation routineshould work properly on them. The middle panel corresponds to the

RHESSI sampleand, in this experiment, we assumed that n = 240 visibilities are provided by RHESSI .Finally, the right panel corresponds to the sampling of n = 60 visibilities performed by STIX . Correspondingly to these three sampling conﬁgurations, back-projection providesthe three images represented in Figure 3. Such maps are used as starting point forconstructing the VSK scaling functions. Then the outcomes for VSK and classicalinterpolations are plotted in Figure 4 and Figure 5, respectively, where we also plottedthe interpolated modulus of the visibility surfaces with the diﬀerent methods.We observe that as long as the data are well-spaced (as in the case of Fibonaccinodes), both reconstructions return comparable results; however, note that the highestpeak intensity is not correctly detected by Land-RBF. The diﬀerences between thetwo schemes become more and more evident when interpolating

RHESSI and

STIX visibilities.

Flaring emission at hard X-ray energies is typically due to the bremsstrahlung interactionbetween electron energies accelerated along the two arms of a magnetic loop and the eature augmentation for inverse problems Figure 4.

Top row, from left to right: ground truth and reconstructions of theGaussian source function via Land-VSK using Fibocacci,

RHESSI and

STIX datalocations. The second row represents the visibility surfaces of the ground truth andthe ones returned by the interpolation algorithm.

Figure 5.

Top row, from left to right: ground truth and reconstructions of theGaussian source function via classical Land-RBF methods using Fibocacci,

RHESSI and

STIX data locations. The second row represents the visibility surfaces of theground truth and the ones returned by the interpolation algorithm. eature augmentation for inverse problems Figure 6.

Top row, from left to right: ground truth and reconstructions of thesynthetic double foot-point ﬂare from

STIX simulated visibilities, using Land-VSKand Land-RBF. The bottom row represents the visibility surfaces of the ground truthand the ones returned by the interpolation algorithms. ambient plasma [46]. As a consequence, a typical ﬂare conﬁguration is the one of Figure6, top left panel, where the conﬁguration parameters are described in Table 1 (groundtruth). We generated 25 sets of synthetic visibilities by as many runs of the Monte Carlocode and applied Land-RBF and Land-VSK to these synthetic sets. Figure 6 comparesthe reconstructions and the corresponding visibility surfaces for the ground truth and thetwo reconstruction algorithms, while the corresponding averaged parameter estimatesare illustrated in Table 1 together with their standard deviations. The table also showsthe Relative Root Mean Square Error (RRMSE) deﬁned asRRMSE = || V − W || F || W || F , where V and W are the moduli of the visibility surfaces corresponding to the groundtruth and the reconstruction, respectively, and (cid:107) · (cid:107) F is the Frobenius norm. Further,to numerically verify Propositions 3.1 and 3.2, we report in Table 2 the spectral ratiosand condition numbers for the standard and VSK Mat´ern kernels for one out of the 25simulations. As last example, we test our procedure on a ﬂare observed by

RHESSI on February 202002 during the time interval 11:06:02–11:06:24 UT. The energy range of the event is 50– eature augmentation for inverse problems Table 1.

Results concerning the reconstruction of the simulated double foot-pointﬂare. x p (arcsec) y p (arcsec) FWHM (arcsec) FLUX (photon cm − s − × )First PeakSimulated -15.0 -15.0 11.0 4.88Land-VSK -13.0 ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± − s − × ) RRMSESimulated 10.00Land-VSK 10.18 ± ± ± ± Table 2.

Condition number and spectral ratio of the classical and VSK kernel matricescomputed for the STIX visibilities, using the data samples of Figure 6. cond( K ) cond( K Ψ ) S ( K ) S ( K Ψ )8.410 e+05 3.346 e+05 1.002 1.00670 keV. The results returned by Land-RBF and Land-VSK are illustrated in Figure 7.In this application with real observations, we also include a comparison with two otheralgorithms included the SSW tree: uv smooth, which is an interpolation/extrapolationalgorithm where the interpolation step is realized by using a spline function [41]; andClean [47], which is a deconvolution algorithm based on thresholding procedures. Theimage of the

RHESSI double foot-point ﬂare reconstructed via Land-VSK contains fewerartifacts than the ones produced by Land-RBF and uv smooth. Artefacts are negligiblein the case of Clean reconstructions. However, Clean has two signiﬁcant drawbacks withrespect to Land-VSK: ﬁrst, it is not completely user-independent, given that the last stepof this iterative scheme requires the convolution with an idealized point spread functionwhose FWHM is manually chosen by the user via heuristic considerations; second, the χ values associated to the interpolation scheme is signiﬁcantly smaller ( χ = 2 . eature augmentation for inverse problems Figure 7.

Reconstruction of the ﬂare observed by

RHESSI on February 20 in 2002.From left to right: Land-VSK, Land-RBF, uv smooth and clean.

Land-VSK; χ = 3 . RHESSI data, in Table 3, we report the spectral ratios and conditionnumbers for the standard and VSK Mat´ern kernel matrices.

Table 3.

Condition number and spectral ratio of the classical and VSK kernel matricescomputed for the

RHESSI visibilities, using the data samples of Figure 7. cond( K ) cond( K Ψ ) S ( K ) S ( K Ψ )3.338 e+05 1.753 e+05 1.003 1.006

6. Comments and conclusions

The theoretical and data analysis results of this study show that the proposed Fourierinversion scheme, based on VSK interpolation and projected Landweber extrapolation,can be eﬀectively used in many applications. As a case study, we focused on astronomicalimaging and we pointed out that the use of VSKs seems to be the key ingredient fordealing with solar ﬂares reconstructions, especially when we do not dispose of well-distributed data. The results on the

STIX single and double foot-point ﬂares point outthat the classical kernel-based interpolation is sensitive when the peaks in the imageplane are close to the boundary. Such shifts directly reﬂect on the visibility surfacesproducing oscillations at the boundary of the visibility domain. Therefore, we need touse a data-driven interpolation via VSKs to make up for the lack of information.Finally, to numerically verify Propositions 3.1 and 3.2, we computed the conditionnumbers and spectral ratios of the kernel matrices. The results are consistent withwhat theoretically proven. Future work concerns theoretical studies about the selectionof kernel centres via greedy methods (refer e.g. to [48]) and the inclusion of Land-VSKin the

SSW tree. eature augmentation for inverse problems Acknowledgements

This research has been accomplished within Rete ITaliana di Approssimazione (RITA).The authors acknowledge the ﬁnancial contribution from the agreement ASI-INAFn.2018-16-HH.0 and the support of GNCS-IN δ AM.

References [1] Bertero M, Brianzi P and Pike E R 1987

Inverse Problems Magnetic resonance in medicine Geophysics Interferometry and synthesis in radio astronomy (Springer Nature)[5] Allavena S, Piana M, Benvenuto F and Massone A 2012

Inverse Probl. Imag. IEEE Transactions on Signal Processing Commun. Pure Appl. Math. Kernel-based Approximation Methods using MATLAB (Singapore: World scientiﬁc)[9] Trefethen L 2013

Approximation Theory and Approximation Practice

Other Titles in AppliedMathematics (SIAM)[10] Schumaker L 2007

Spline Functions: Basic Theory

SIAM Rev

Annals of Numerical Mathematics IMA J. Numer. Anal. BIT Numer. Math. SIAM J. Sci. Comput. B472–B491[16] Mat´ern B 1986

Lecture Notes in Statistics, Springer-Verlag, Berlin [17] Stein M 1999 Interpolation of Spatial Data. Some Theory for Kriging (New York: Springer-Verlag)[18] Lin R, Dennis B, Hurford G, Smith D, Zehnder A, Harvey P, Curtis D, Pankow D, Turin P,Bester M, Csillaghy A, Lewis M, Madden N, Beek H V, Appleby M, Raudorf T, McTiernan J,Ramaty R, Schmahl E, Schwartz R, Krucker S, Abiad R, Quinn T, Berg P, Hashii M, SterlingR, Jackson R, Pratt R, Campbell R, Malone D, Landis D, Barrington-Leigh C, Slassi-SennouS, Cork C, Clark D, Amato D, Orwig L, Boyle R, Banks I, Shirey K, Tolbert A, Zarro D, SnowF, Thomsen K, Henneck R, Mchedlishvili A, Ming P, Fivian M, Jordan J, Wanner R, CrubbJ, Preble J, Matranga M, Benz A, Hudson H, Canﬁeld R, Holman G, Crannell C, Kosugi T,Emslie A, Vilmer N, Brown J, Johns-Krull C, Aschwanden M, Metcalf T and Conway A 2002

Sol. Phy. eature augmentation for inverse problems BR Dennis, HF van Beek, J Rodr´ıguez-Pacheco and RP Lin 2020

A&A

A15 URL https://doi.org/10.1051/0004-6361/201937362 [20] Piana M and Bertero M 1997

Inverse Problems JOSA A SIAM J. Sci. Comput. Comput. Math. Appl. Meshfree Approximation Methods with MATLAB (Singapore: World Scientiﬁc)[25] Wendland H 2004

Scattered Data Approximation

Cambridge Monographs on Applied andComputational Mathematics (Cambridge University Press)[26] S De Marchi R Schaback H W 2005

Adv. Comput. Math. J. Comput. Appl. Math.

532 – 547[28] Rossini M 2018

Dolomites Res. Notes Approx. Theory Probal. Appl. J. Mach. Learn. Res. Learning with Kernels: Support Vector Machines, Regularization,Optimization, and Beyond (MIT Press)[32] Donini M and Aiolli F 2017

Mach. Learn.

Interdisciplinary mathematical sciences Ann. Stat. Magnetic Resonance in Medicine: An Oﬃcial Journal ofthe International Society for Magnetic Resonance in Medicine Sol. Phys.

193 – 211[37] Benvenuto F, Schwartz R, Piana M and Massone A 2013

Astron. Astrophys.

A61[38] Bonettini S, Anastasia C and Prato M 2013

SIAM J. Imaging Sci. Astron. Astrophys.

ApJ

The Astrophysical Journal

Sol. Phys.

SIAM J. Imaging Sci. Proceedings of the IEEE [46] Brown J 1971 Solar Physics Solar Physics

Dolomites Res. Notes Approx.6