Constraint matrix factorization for space variant PSFs field restoration
F. M. Ngolè Mboula, J.-L. Starck, K. Okumura, J. Amiaux, P. Hudelot
CConstraint matrix factorization for space variantPSFs field restoration
F. Ngol`e , J.-L. Starck , K. Okumura , J. Amiaux and P.Hudelot Laboratoire AIM, CEA/DSM-CNRS-Universite Paris Diderot, Irfu, Serviced’Astrophysique, CEA Saclay, Orme des Merisiers, 91191 Gif-sur-Yvette, France Institut d’Astrophysique de Paris, UMR 7095 CNRS & UPMC, 98 bis BoulevardArago, F-75014 Paris, FranceE-mail: [email protected]
November 2015
Abstract.
Context: in large-scale spatial surveys, the Point Spread Function (PSF) variesacross the instrument field of view (FOV). Local measurements of the PSFs are givenby the isolated stars images. Yet, these estimates may not be directly usable forpost-processings because of the observational noise and potentially the aliasing.Aims: given a set of aliased and noisy stars images from a telescope, we want toestimate well-resolved and noise-free PSFs at the observed stars positions, in particular,exploiting the spatial correlation of the PSFs across the FOV.Contributions: we introduce RCA (Resolved Components Analysis) which isa noise-robust dimension reduction and super-resolution method based on matrix-factorization. We propose an original way of using the PSFs spatial correlation inthe restoration process through sparsity. The introduced formalism can be applied tocorrelated data sets with respect to any euclidean parametric space.Results: we tested our method on simulated monochromatic PSFs of Euclidtelescope (launch planned for 2020). The proposed method outperforms existingPSFs restoration and dimension reduction methods. We show that a coupled sparsityconstraint on individual PSFs and their spatial distribution yields a significantimprovement on both the restored PSFs shapes and the PSFs subspace identification,in presence of aliasing.Perspectives: RCA can be naturally extended to account for the wavelengthdependency of the PSFs.
Keywords : Dimension reduction, Spatial analysis, Super-resolution, Matrix factoriza-tion, Sparsity a r X i v : . [ c s . C V ] A ug onstraint matrix factorization for space variant PSFs field restoration
1. Introduction
In many applications such as high precision astronomical imaging or biomedical imaging,the optical system introduces a blurring of the images that needs to be taken into accountfor scientific analyses, and the blurring function, also called Point Spread Function(PSF), is not always stationary on the observed field of view (FOV). A typical exampleis the case of the Euclid space mission [1], to be launched in 2020, where we need tomeasure with a very high accuracy the shapes of more than one billion of galaxies. Anextremely important step to derive such measurements is to get an estimate of the PSFat any spatial position of the observed images. This makes the PSF modeling a criticaltask. In first approximation, the PSF can be modeled as a convolution kernel which istypically space and time-varying. Several works in image processing [2] and specificallyin astronomy [3, 4], address the general problem of restoring images in presence of aspace variant blur, assuming that the convolution kernel is locally known.In astronomical imaging, unresolved objects such as stars, can provide PSFmeasurements at different locations in the FOV. Nevertheless, these images can bealiased given the CCD sensors sizes which makes a super-resolution (SR) step necessary.This is the case for instance for the Euclid mission.The SR is a widely studied topic in general image processing literature [5]. Inastronomy, softwares IMCOM [6] and PSFEx [7], which propose an SR option, arewidely used. The IMCOM provides an oversampled output image from multiple under-sampled input images, assuming that the PSF is perfectly known. It does not dealwith the PSF restoration itself. The PSFEx treats SR as an inverse problem, with aquadratic regularizer. In [8], a sparsity based super-resolution method was proposed,assuming that several low resolution (LR) measurements of the same PSF are available.In practice, we generally don’t have such multiple measurements.In this paper, we consider the case where the PSF is both space variant and under-sampled, and we want to get an accurate modeling at high resolution of the PSFfield, assuming we have under-sampled measurements of different PSFs in the observedfield. We assume that the PSFs vary slowly across the field. Intuitively, this impliesa compressibility of the PSFs field, which leads us to the question of what would be aconcise and easily understandable representation of a spatially indexed set of PSFs.
2. Notations
We adopt the following notation conventions: • bold low case letters are used for vectors; • bold capital case letters are used for matrices; • we treat vectors as column vectors unless explicitly mentioned otherwise.For a matrix M , we note m ij the j th coefficient of the i th line, m ( c ) j or M [: , j ] its j th column and m ( l ) i or M [ i, :] its i th line, that we treat as a line vector. More generally onstraint matrix factorization for space variant PSFs field restoration j ≤ j , we note M [: , j1 : j2 ] the matrix obtained by extracting the columns of M indexed from j to j ; for i ≤ i , M [ i1 : i2 , :] is defined analogously with respect to M ’s lines. For a vector u , u [ k ] refers to its k th component. For a given integer m , wenote I m the identity matrix of size m × m . Let E be a euclidean space ( E can be R or R if we consider spatial or spatio-temporal data respectively). We note U = ( u k ) ≤ k ≤ p a set of vectors in E . In this paper, we only consider the case E = R ; U will be a set ofpositions in a plan.
3. The PSFs field
We assume that we have an image I , which contains p unresolved objects such as stars,which can be used to estimate the PSFs field. Noting y k one of these p objects at spatialposition u k , y k is therefore a small patch of I with n y pixels, around the spatial position u k . We will write y k as a 1D vector. The relation between the ”true” PSF x k and thenoisy y k observation is y k = M k x k + n k (1)where M k is a linear operator and n k is a noise that we assume to be Gaussian andwhite. We will consider two kinds of operators in this paper: the first one is the simplecase where M k = I n x and we have the number of pixels n x in x k is equal to n y , and thesecond one is a shift+downsampling degradation operator and n x = m d n y , where m d isthe downsampling factor in lines and columns, with m d ≥ Y = [ y . . . y p ] the matrix of n y lines and p columns of all observed patches, X = [ x . . . x p ] the matrix n x × p of all unknown PSFs, we can rewrite Eq. 1 as Y = F ( X ) + N (2)where F ( X ) = [ M x , . . . , M p x p ].This rewriting is useful because, as we discuss in the following, the different PSFs x k are not independent, which means that the problems of Eq. 1 should not be solvedindependently for each k . In other terms, the vectors ( x k ) ≤ k ≤ p belong to a specificunknown manifold that needs to be learned by using the data globally. Let Ω be a r dimensional subspace of R n x embedding the PSFs field. We assume thatthere exists a continuous function f : E (cid:55)→ Ω , so that f ( u k ) = x k , ∀ k ∈ (cid:74) , p (cid:75) . Theregularity of f translates the correlation of the data in space (and time).Let ( s i ) ≤ i ≤ r be a basis of Ω . By definition, we can write each x k as a linearcombination of the s i , x k = (cid:80) ri =1 a ik s i , k = 1 . . . p , or equivalently X = SA (3) onstraint matrix factorization for space variant PSFs field restoration S = [ s , . . . , s r ] and A is a r × p matrix containing the coefficients A [: , k ] of thevectors x k ( k = 1 . . . p ) in the dictionary S . Each column of the matrix S , that we alsorefer to as an atom, can be seen as an eigen PSF , i.e. a given PSF’s feature distributedacross the field. We need therefore to minimize (cid:107) Y − F ( X ) (cid:107) F , which is an ill posed problem due to boththe noise and the operator F , (cid:107) . (cid:107) F denoting the Frobenius norm of a matrix. 
There areseveral constraints that may be interesting to use in order to properly regularize thisinverse problem: • positivity constraint: the PSF x k should be positive; • low rank constraint: as described above, we can assume that x k = (cid:80) ri =1 a ik s i , whichmeans that we can instead minimizemin A , S (cid:107) Y − F ( SA ) (cid:107) F ; (4)we assume that r (cid:28) min( n, p ); this dimension reduction has the advantage thatthere are much less unknown to find, leading to more robustness, but the problemis now that the cost function is not convex anymore; • smoothness constraint: we can assume that the vectors x k are structured; the lowrank constraint does not necessarily impose x k to be smooth or piece-wise smooth;adding an additional constraint on S atoms, such as a sparsity constraint, allows tocapture spatial correlations within the PSFs themselves; an additional dictionary Φ s can therefore be introduced which is assumed to give a sparse representation ofthe vectors s k ; • proximity constraint: we can assume that a given x k at a position u k is very closeto another PSF x k (cid:48) at position u k (cid:48) if the distance between u k and u k (cid:48) is small; thismeans that the field f must be regular; this regularity can be forced by addingconstraints on the lines of the matrix A ; indeed, the p values relative to a line A [ i, :] correspond to the contribution of the i th eigen PSF to locations relative tothe spatial positions U .We show in section 5 how these four constraints can be jointly used to derive the solution.Let first review existing methods susceptible to solve this problem.
4. Related work
In all this section, Y refers to the observed data set ( y k ) ≤ k ≤ p . In the first part, theaforementioned degradation operator F is simply the identity. Therefore we review somedimension reduction methods. In the second part F is a shifting and downsamplingoperator; we present a PSF modeling software dealing with this more constrainingsetting. onstraint matrix factorization for space variant PSFs field restoration The principal components analysis is certainly one of the most popular mathematicalprocedure in multivariate data analysis and especially, dimension reduction. In our case,we want to represent Y ’s elements using r vectors, with r ≤ max( p, n y ). A PCA givesan orthonormal family of r vectors in R n y so that the total variance of Y along thesevectors directions is maximized. By definition, the PCA looks for redundant featuresover the whole data set. Therefore, in general, the principal components neither capturelocalized features (in sense of E ) nor have a simple physical interpretation.In [9], a ”regularized” PCA is proposed to address this shortcoming for spatial dataanalysis in atmospheric and earth science. Indeed, as a PCA, the method solves thefollowing problem, min A (cid:107) Y − YA T A (cid:107) F , s . t . AA T = I r , (5)for some chosen small r . Moreover, it jointly imposes a sparsity constraint and asmoothing penalties with respect to the space E , on the matrix A lines. This way, withthe right balance between those two penalties, one favors the extraction of localizedspatial features, making the interpretation of the optimal A easy. Yet, there is noobvious way of setting the sparsity and smoothness parameters, which are crucial;moreover, unless the data actually contain spatially localized and non-overlappingfeatures, the coupled orthogonality and sparsity constraint is likely to yield a biasedapproximation of the data.In the context of remote sensing and multi-channel imaging, two ways of integratingspatial information into PCA are proposed in [10]; the set Y is made of multi-channelpixels. In the first way, the author introduces a weighting matrix indicating the relativeimportance of each pixel. For instance, the weight of a given pixel can be related toits distance to some location of interest in E . Then, the computation of the covariancematrix of image bands is slightly modified to integrate this weighting. This idea is closeto the methodology proposed in [11]. As a consequence, one expects to recover spectralfeatures spatially related to some location of interest within the most important ”eigen-pixels”. Yet, we do not have any specific location of interest in E and we rather wantto recover relevant features across the whole data set.The second approach aims at taking into account the spatial associations andstructural properties of the image. To do so, modified versions of the image bandscovariance matrices are calculated, with increasing shifts between the bands, up to apredetermined maximum shifting amplitude. These covariance matrices, including the”regular” one, are averaged and the principal components are finally derived. Intuitively,one expects the spectral features present in structured images regions to be strengthenedand therefore captured into the principal components. However, we consider a generalsetting where the data are randomly distributed with respect to E , which makes theshifted covariances matrices ill-defined.A review of PCA applications and modifications for spatial data analysis can befound in [12]. 
onstraint matrix factorization for space variant PSFs field restoration M of dimension r embeddedin R n , one can consider using one of the numerous non-linear dimension reductionalgorithms published in the manifold learning literature, such as GMRA [13], [14]. Theidea is to partition the data in smaller subsets of sample close to each other in thesense of the manifold geometry. From this partionning, the manifold tangent spaces areestimated at subsets locations; estimates are then simply given by the best regressionsof these subsets with r − dimensional affine subspaces. The method includes somemultiresolution considerations that are not relevant to our problem. This procedureprovides a dictionary in which each of the original samples need at most r elements tobe represented. Moreover, the local processing of the data, which is necessary in thissetting because of the manifold curvature, makes this approach somehow compatiblewith the considered problem. Indeed, by hypothesis, the closer two samples will be insense of E , the closer they will be in R n , and the more likely they will fall into the samelocal cluster.Another interesting alternative to the PCA can be found in [15]. This constructioncalled ”Treelets” extracts features by uncovering correlated subsets of variables acrossthe data samples. It is particularly useful when the sample size is by far smaller thanthe data dimensionality ( p (cid:28) n y ), which does not hold in the application we considerin the following. In this subsection, F takes the following form: F ( X ) = [ M x ( c )1 , . . . , M p x ( c ) p ] , (6)where M i is a warping and downsampling matrix. Since we consider a set of compactobjects images, the only geometric transformation one has to deal with for registrationis the images shifts with respect to the finest pixel grid, which can be estimated usingthe images centroids [8].To the best of our knowledge, the only method dealing with this specific setting isthe one used in the PSF modeling software PSFEx [7]. This method solves a problemof the form: min ∆ S (cid:107) Y − F (( ∆ S + S ) A ) (cid:107) F + λ (cid:107) ∆ S (cid:107) F . (7) S is a rough first guess of the model components. Each line of the weight matrix A is assumed to follow a monomial law of some given field’s parameters. The number ofcomponents is determined by the maximal degree of the monomials. For instance, letsay that we want to model the PSFs variations as a function of their position in thefield with monomials with degrees up to 3, then: • one needs 6 components corresponding to the monomials 1 , X, X , Y, XY and Y ; • assuming that the i th PSF in Y ’s columns order is located at u i = ( u ix , u iy ) thenthe i th column of A is given by a ( c ) i = [1 , u ix , u ix , u iy , u ix u iy , u iy ] T up to a scalingfactor. onstraint matrix factorization for space variant PSFs field restoration
5. Resolved Components Analysis
We have seen that we can describe the PSFs field f as[ f ( u ) , . . . , f ( u p )] = X = SA . (8)The matrix S is independent of the spatial location, and the i th line of A gives thecontribution of the vector s i to each of the samples. As discussed in section 3.3, thefield’s regularity can be taken into account by introducing a structuring of the matrix A . We can write: A [ i, :] T = N (cid:88) l =1 α il υ l , i = 1 . . . r, (9)where ( υ l ) ≤ l ≤ N is a set of vectors spanning R p . Equivalently, we can write A = α V T ,where V = [ υ , . . . , υ N ] and α is a r × N matrix (see Fig. 1). Figure 1.
Data matrix factorization: the j th sample, which is stored in the j th column of X is linear combination of S columns using A ’s j th column coefficients asthe weights; similarly, the j th line of A is a linear combination V T ’s lines, using α ’s j th line coefficients as the weights. onstraint matrix factorization for space variant PSFs field restoration Physical interpretation
An interesting way to well interpret A is to consider the ideal case where themeasurements are distributed following a regular grid of locations U . In this case, wecan expand the vector A [ i, :] T using the Discrete Cosine Transform (DCT), and vectors υ i in Eq. 9 are regular cosine atoms, and the column index of the matrix is related thefrequency. Hence, lines relative to high frequencies will be related to quicky varyingPSF components in the field, while lines related to low frequencies will be related toPSFs stable components. In practice, the sampling is not regular and the DCT cannotbe used, and V has to be learned in a way to keep the harmonic interpretation valid.We want some lines A [ i, :] to describe stable PSFs components on the FOV, and otherto be more related to local behavior. A As previously mentioned, we want to account for the PSFs field’s regularity byconstraining A ’s lines. Specifically, we want some lines to determine the distributionof stable features across the PSFs field while we want other lines to be related to morelocalized features. In order to build this constraint, let first consider the simple case ofa one dimensional field of regularly spaced PSFs. We first assume that E = R . We supposethat p = 2 k + 1, for some integer k and we consider the 1D vector ψ e,a = ( ψ i ) ≤ i ≤ p defined as follows: ψ i = ψ p − i +1 = − / | u i − u k +1 | e if i (cid:54) = k + 1 , (10) ψ i = p (cid:88) j =1 j (cid:54) = k +1 a/ | u j − u k +1 | e , otherwise (11)for some positive reals e and a . We suppose that ψ e,a is normalized in l norm. We referto this family of signals, parametrized by e and a as ”notch filters”, in reason of theirfrequency responses shapes. Some examples can be found in Fig. 2. One can observethat ψ , is essentially a high pass filter. As e increases, the notch structure clearlyappears, with an increasing notch frequency. It is clear that, for a vector v , minimizingthe functional Ψ e,a ( v ) = (cid:107) v (cid:63) ψ e,a (cid:107) promotes vectors with spectra concentrated aroundthe notch frequency corresponding to the chosen values of e and a . We can directly usethis family of filters to constraint A as follows: we define the functional Ψ : M rp ( R ) (cid:55)→ R + , A → r (cid:88) i =1 Ψ e i ,a ( A [ i, :]) , (12)where ( e i ) i is a set of reals verifying 0 ≤ e < e < · · · < e r and a ∈ [0 , e i , minimizing Ψ promotes varying levelof smoothness of A ’s lines, which is what we wanted to achieve. The filter ψ e,a and the onstraint matrix factorization for space variant PSFs field restoration (a) Direct domain samples(b) Discrete Fourier Transform (DFT) entry-wise moduli Figure 2.
Notch filters examples for different values of the parameter e in Eq. 10 and11. The parameter a is set to 1. functional definitions can be extended to higher dimensions of the space E by involving amultidimensional convolution [16]. Therefore, if the PSFs are distributed over a regulargrid with respect to E , one can implement the proximity constraint by solvingmin A , S (cid:107) Y − F ( SA ) (cid:107) F + λ Ψ ( A ) , (13)for some positive λ . Yet, in practical applications, the observations are in generalirregularly distributed. In the next section, we propose a slightly different penaltywhich is usable for arbitrary observations distributions. onstraint matrix factorization for space variant PSFs field restoration Let define the functional (cid:98) Ψ e,a : R p (cid:55)→ R + , v → p (cid:88) k =1 ( p (cid:88) i =1 i (cid:54) = k av k − v i (cid:107) u k − u i (cid:107) e ) , (14)where e and a are positive reals. Minimizing (cid:98) Ψ e,a ( v ) tends to enforce the similarity ofclose features, with respect to E ; in other terms, the more (cid:107) u k − u i (cid:107) is large, the lessimportant is ( av k − v i ) in (cid:98) Ψ e,a ( v ) and e somehow determines the radius of similarity.For e > (cid:98) Ψ e,a ≈ Ψ e,a because of the uniform spacing of the values u i and the decay of (cid:107) u k − u i (cid:107) e , for sufficiently high p ; we give more details on this approximation in AppendixA in the 1D case. However unlike Ψ e,a , the functional (cid:98) Ψ e,a is still relevant without theuniform sampling hypothesis and we expect qualitatively the same behavior as Ψ e,a withrespect to the frequency domain if the data sampling is sufficiently dense. Therefore weuse (cid:98) Ψ e,a instead of Ψ e,a in the functional Ψ of Eq.13. Besides, we use the term frequencyeven for randomly distributed samples. V The efficiency ofthe regularization of the problem 13 relies on a good choice of the parameters e , . . . , e r and a . Indeed, if the associated notch frequencies does not match with the data setfrequency content, the regularization will more or less bias the PSF estimation dependingon the Lagrange multiplier λ . Besides, setting this parameter might be tricky. Wepropose an alternate strategy for constraining A , which leads to the factorization modelintroduced in Section 5.1 and still builds over the idea of notch filters.For v ∈ R p we can write (cid:98) Ψ e,a ( v ) = (cid:107) P e,a v (cid:107) , (15)where P e,a is a p × p matrix defined by P e,a [ i, j ] = − (cid:107) u i − u j (cid:107) e if i (cid:54) = j, (16) P e,a [ i, i ] = p (cid:88) j =1 j (cid:54) = i a (cid:107) u i − u j (cid:107) e , (17)( i, j ) ∈ (cid:74) , p (cid:75) . Therefore, (cid:98) Ψ e,a ( v ) = v T Q e,a v , (18)where Q e,a = P Te,a P e,a and is symmetric and positive. We consider the singular valuesdecomposition (SVD) of Q e,a : Q e,a = V e,a D e,a V Te,a . The diagonal values of D e,a aresorted in decreasing order. We note d e,a the vector made of these diagonal values, so that d e,a [1] ≥ · · · ≥ d e,a [ p ] ≥
0. Considering the reduced form (cid:98) Ψ e,a ( v ) = (cid:80) pi =1 d e,a [ i ] (cid:104) v , V e,a [: , i ] (cid:105) , it is clear that minimizing (cid:98) Ψ e,a ( v ) promotes vectors correlated with Q e,a lasteigenvectors. In the case of regular sampling with respect to E , these eigenvectors arethe harmonics close to the notch frequency of ψ e,a . We can rewrite the functional Ψ onstraint matrix factorization for space variant PSFs field restoration Ψ ( A ) = r (cid:88) i =1 p (cid:88) j =1 d e i ,a [ j ] (cid:104) v , V e i ,a [: , j ] (cid:105) . (19)It is clear from this expression that minimizing Ψ ( A ) enforces the selection of theeigenvectors associated with the lowest eigenvalues in the set ( d e i ,a [ j ]) i,j for describing A ’s lines. This can be seen as a sparsity constraint over A ’s lines with respectto the atoms ( V e i ,a [: , j ]) i,j ; yet, the small subset of atoms which will carry most ofthe information is somehow predefined through the eigenvalues ( d e i ,a [ j ]) i,j . This isunsuitable if the notch filters parameters are poorly selected; on the contrary, one wouldlike to select in a flexible way the atoms which fit the best the data.Let suppose that we have determined a set of parameters ( e i , a i ) ≤ i ≤ r so that thefilters ψ e i ,a i notch frequencies would cover the range of significant frequencies (withrespect to E ) present in the data. As previously, we note ( V e i ,a i ) ≤ i ≤ r the eigenvector’smatrices associated with the operators (cid:98) Ψ e i ,a i . We note V = [ V e ,a , . . . , V e r ,a r ].Considering the preceding remark, we introduce the following problem:min α , S (cid:107) Y − F ( S α V T ) (cid:107) F s.t. (cid:107) α [ l, :] (cid:107) ≤ η l , l = 1 . . . r (20)Now A = α V T . Each line of A is a sparse linear combination of V T ’s lines, andthe ”active” atoms are optimally selected according to the data. The choice of theparameters ( e i , a i ) ≤ i ≤ r and ( η l ) ≤ i ≤ r is discussed in a forthcoming section. In case a = 1, the matrix P e,a is the laplacianof an undirected fully connected and weighted graph with p nodes 1 . . . p , such thatthe weight of the vertex connecting a node i to a node j is (cid:107) u i − u j (cid:107) e [17]. As proposedin spectral graph theory [18], this gives a natural interpretation of P e,a (and Q e,a )eigenvectors as harmonic atoms in the graph’s geometry. Each line of the matrix A can be seen as a function defined on a family of graphs determined by the observationslocations, so that we enforce the regularity of A ’s lines according to the graphs geometry.Our approach is thereby close to the spectral graphs wavelets framework [19]. However,the graphs wavelets are built on a single graph and a scaling parameter allows one toderive wavelets atoms corresponding to spectral bands of different sizes. In our case, thescales diversity is accounted for by building a dictionary of harmonics corresponding todifferent graphs. Indeed, as e increases, the weight associated to the most distant nodes(in the sense of (cid:107) u i − u j (cid:107) ) becomes negligible, which implies that the correspondinggraph laplacian is determined by nearby nodes, yielding ”higher” frequencies harmonics. S As previously mentioned, each PSF is a structured image. We can account for thisthrough a sparsity constraint. This has proven effective in multiple frame PSFs super-resolution [8]. onstraint matrix factorization for space variant PSFs field restoration eigenPSFs which are S ’s columns. 
Specifically, we promote S ’s columns sparsity with respectto a chosen dictionary Φ s . By definition, a typical imaging system’s PSF concentratesmost of its power in few pixels. Therefore a straightforward choice for Φ s is I n . In otherwords, we will enforce the sparsity of S ’s columns in the pixels domain.On the other hand, we take Φ s as the second generation Starlet forward transform[20], without the coarse scale. The power of sparse prior in wavelet domain for inverseproblems being well established, we shall online emphasize the fact that this particularchoice of wavelet is particularly suitable for images with nearly isotropic features. We define the sets Ω = { α ∈ M r,N ( R ) / (cid:107) α [ l, :] (cid:107) ≤ η l , l = 1 . . . r } and Ω = { ( S , α ) ∈ M nr ( R ) × M r,N ( R ) / S α V T ≥ M np ( R ) } . The aforementioned constraints leads us to thefollowing optimization problem:min α , S (cid:107) Y − F ( S α V T ) (cid:107) F + r (cid:88) i =1 (cid:107) w i (cid:12) Φ s s i (cid:107) + ι Ω ( α ) + ι Ω ( S , α ) . (21)where (cid:12) denotes the Hadamard product and ι C denotes the indicator function of a set C (see Appendix B). The l term promotes the sparsity of S columns with respect to Φ s . The vectors ( w i ) i weight the sparsity against the other constraints and allow someadaptivity of the penalty, with respect to the uncertainties propagated to each entry of S [8].The parametric aspects of this method are made clear in the subsequent sections.The Problem 21 is globally non-convex because of the coupling between S and α andthe l constraint. In particular, the feasible set { ( S , α ) ∈ M nr ( R ) × M r,N ( R ) / S α V T ≥ } , with N = rp is non-convex.Therefore, one can at most expect to find a local minimum. To do so, we considerthe following alternating minimization scheme:(i) Initialization: α ∈ Ω , with N = rp , S = argmin S (cid:107) Y − F ( S α V T ) (cid:107) F + (cid:80) ri =1 (cid:107) w i (cid:12) Φ s S [: , i ] (cid:107) s.t. S α V T ≥ max :(a) α k +1 = argmin α (cid:107) Y − F ( S k α V T ) (cid:107) F s.t. (cid:107) α [ l, :] (cid:107) ≤ η l , l = 1 . . . r ,(b) S k +1 = argmin S (cid:107) Y −F ( S α k +1 V T ) (cid:107) F + (cid:80) ri =1 (cid:107) w i (cid:12) Φ s S [: , i ] (cid:107) s.t. S α k +1 V T ≥ onstraint matrix factorization for space variant PSFs field restoration • the feasible set of (b) is non-empty for any choice of α k +1 ; • allowing α to be outside of the global problem feasible set (for S fixed) brings somerobustness regarding local degenerated solutions.There is an important body of work in the literature on alternate minimizationschemes convergence, and in particular in the non-convex and non-smooth setting(see [24] and the references therein). In the proposed scheme, the analysis is complicatedby the asymmetry of the problems (a) and (b).We define the function H ( α , S ) = 12 (cid:107) Y − F ( S α V T ) (cid:107) F + r (cid:88) i =1 (cid:107) w i (cid:12) Φ s s i (cid:107) (22)and the matrix (cid:98) S k = argmin S (cid:107) S − S k (cid:107) s.t. S α k +1 V T ≥
0. One immediate sufficientcondition for the sequence ( H ( α k , S k )) k to be decreasing (and thereby convergent) is H ( α k +1 , (cid:98) S k ) ≤ H ( α k , S k ) (23)which occurs if ( S k , α k +1 ) stays sufficiently close to Ω . Although we do not prove thisalways holds true, we observe on examples that the matrix S k α k +1 V T in general onlyhas a few and small negative entries for k ≥
1. This follows from the adequacy of thedictionary V for sparsely describing A ’s lines.The complete method is given in Algorithm 1. The resolution of the minimizationsub-problems is detailed in appendices. Algorithm 1
Resolved components analysis (RCA) Parameters estimation and initialization:Harmonic constraint parameters ( e i , a i ) ≤ i ≤ r → V , A Noise level, A → W , Alternate minimization for k = 0 to k max do for j = 0 to j max do S k = argmin S (cid:107) Y − F ( SA k ) (cid:107) F + (cid:80) ri =1 (cid:107) W k,j [: , i ] (cid:12) Φ s S [: , i ] (cid:107) s.t. SA k ≥ update: W k, , S k → update( W k,j +1 ) end for α k +1 = argmin α (cid:107) Y − F ( S k α V T ) (cid:107) F s.t. (cid:107) α [ l, :] (cid:107) ≤ η l update: Noise level, α k +1 → W k +1 , A k +1 = α k +1 V T A k +1 [ i, :] = A k +1 [ i, :] / (cid:107) A k +1 [ i, :] (cid:107) , for i = 1 . . . r end for Return: S k max , A k max . onstraint matrix factorization for space variant PSFs field restoration We consider the terms of the form (cid:107) w k,j (cid:12) Φ s s (cid:107) , where k is the alternate minimization index and j is the re-weighted l minimization index. We first suppose that Φ s = I n . We decompose w k,j as: w k,j = κ β k,j (cid:12) λ k (24)Let consider the minimization problems in S in Algorithm 1. Assuming that we simplyminimize the quadratic term using the following steepest descent update rule, S m +1 = S m + µ F ∗ ( Y − F ( S m A k )) A Tk , (25)for a well chosen step size µ , F ∗ being the adjoint operator one can estimate the entry-wise standard deviations of the noise which propagates from the observations to thecurrent solution S m +1 . For a given matrix X in M np ( R ), we assume that F takes thefollowing general form F ( X ) = [ M X [: , , . . . , M p X [: , p ]]. We define F as: F ( X ) = [( M (cid:12) M ) X [: , , . . . , ( M p (cid:12) M p ) X [: , p ]] (26)We note B the observational noise (or model uncertainty) that we assume to gaussian,white and centered. The propagated noise entry-wise standard deviations are given by Σ k = µ (cid:113) F ∗ (Var( B ))( A Tk (cid:12) A Tk ) , (27)where Var() returns entry-wise variances and F ∗ is the adjoint operator of F . Nowone can proceed to a hypothesis testing on the signal presence in each entry of S m +1 based on Σ k [25], and denoise S m +1 accordingly. For instance, we define the noise-freeversion of S m +1 as follows:ˆ S m +1 [ i , i ] = (cid:40) , if | S m +1 [ i , i ] | ≤ κ Σ k [ i , i ] S m +1 [ i ,i ] | S m +1 [ i ,i ] | ( | S m +1 [ i , i ] | − κ Σ k [ i , i ]) . otherwise; (28)where κ controls the false detection probability; the noise being gaussian, we typicallychoose 3 or 4 for κ .The sequence (ˆ S m ) converges to a solution of the problemargmin S (cid:107) Y − F ( S α k U T ) (cid:107) F + r (cid:88) i =1 κ (cid:107) λ k [: , i ] (cid:12) S [: , i ] (cid:107) , (29)for λ k = κ/µ Σ k . One can find some material on minimization schemes in AppendiceAppendix C. This choice yields a noise-free but biased solution because of thethresholding; this is a well-known drawback of l norm based regularizations. Thepurpose of the vector β k,j is to mitigate this bias [26]. β k, is a vector with ones at allentries. At the step 6 in Algo 1, β k,j is calculated as follows: β k,j = 11 + | S k | κ λ k , (30) onstraint matrix factorization for space variant PSFs field restoration | S k | is the vector made of element-wiseabsolute values of S k entries. Qualitatively, this removes the strongest features fromthe l norm terms by giving them small weights, which makes the debiasing possible;conversely, the entries dominated by noise get weights close to 1, so that the penaltyremains unchanged.For Φ s (cid:54) = I n we follow the same rational. 
To set the sparsity in the transformdomain according to the noise induced uncertainty, we need to further propagate it(the noise) through the operator Φ s . Formally, we need to estimate the element-wisestandard deviations of µ Φ s F ∗ ( B ) A Tk . Let consider the intermediate random matrix Y F = F ∗ ( B ). Assuming that F ( F ∗ ( . )) = λ Id( . ) , (31) Y F ’s lines are statistically independent. Therefore, within a given column of Y F A Tk , theentries are statistically independent from one another. We deduce that the element-wisestandard deviations of µ Φ s F ∗ ( B ) A Tk are given by Σ k = µ (cid:113) ( Φ s (cid:12) Φ s ) F ∗ (Var( B ))( A Tk (cid:12) A Tk ) . (32)Then λ k is obtained as previously and β k,j is calculated as β k,j = 11 + | Φ s S k | κ λ k . (33)The property 31 is approximately true in the case of super-resolution. We do not propose a method to choose the number ofcomponents r . Yet, we observe that because of the sparsity constraint, some lines of thematrix α k +1 at the step 8 in Algorithm 1 are equal to the null vector, when the numberof components is overestimated. The corresponding lines in A k +1 and subsequently thecorresponding columns in S k are simply discarded. This provides an intrinsic mean toselect the number of components. Thus in practice, one can choose the initial r as thedata set dimensionality from the embedding space point of view, which can be estimatedbased on a principal component analysis. In this section, we consider the functionals (cid:98) Ψ e i ,a i and especially the choice of the parameters e i and a i . Let assume that we havedetermine a suitable range for the parameters: ( e i , a i ) ∈ S = [ e min , e max ] × [ a min , a max ]for i = 1 . . . r .For a particular ( e, a ) we consider the matrix Q e,a and its eigenvectors matrix V e,a introduced in section 5.2.3. As previously stated, we want the weights matrix A linesto be sparse with respect to Q e,a ’s eigenvectors. In order to choose the parameters andinitialize the weights matrix, we use the following greedy procedure. We consider asequence of matrices ( R i ) ≤ i ≤ r , with R = Y . For i ∈ (cid:74) , r (cid:75) we define J e,a ( R i ) = max k ∈ (cid:74) ,p (cid:75) (cid:107) R i V e,a [: , k ] (cid:107) , (34) onstraint matrix factorization for space variant PSFs field restoration v ∗ e,a the optimal eigenvector. We choose the i th couple of parameters as:( e i , ai ) = argmax ( e,a ) ∈S J e,a ( R i ) . (35) A [ i, :] = v ∗ e i ,a i and R i +1 = R i − R i V e,a V Te,a .Regarding the set S , we choose the interval a min = 0 and a max = 2. This rangeallows the notch structure, assuming that e min ≥
0; for a < h e,a behaves as a lowpass filter. For a ≥
0, we observe that h e,a becomes a notch filter, with a notchfrequency close to the null frequency for a ≥
2. As previously stated, e determinesthe influence of two samples on one another corresponding coefficients in the matrix A in the algorithmic process. According to Section 5.2.2, we set e min = 1. Let considerthe graph G e introduced in section 5 . .
4. The higher is e , the lower is G e connexity.Considering that we are looking for global features (yet localized in the field frequencydomain), the highest possible value of e should guarantee that the graph G e is connected.This gives us a practical upper bound for e . Once S is determined, we discretize thisset, with a logarithmic step, in such a way to have more samples close to ( e min , a min )which correspond to low notch frequencies. We solve approximately Problem 35 bytaking the best couple of parameters in the discretized version of S . This step is themost computationally demanding, especially for large data samples. The parameters η l are implicitly set by theminimization scheme used at step 8 in 1. This is detailed in Appendix C.
6. Numerical experiments
In this section, we present the data used to test the proposed method, the simulationrealized and comparisons to other existing methods for both dimensionality reductionand super-resolution aspects.
The data set consists of simulated optical Euclid PSFs as in [8], for a wavelength of600 µm . The PSFs distribution across the field is shown on Fig. 3. These PSFs accountfor mirrors polishing imperfections, manufacturing and alignment errors and thermalstability of the telescope. We applied different dimension reduction algorithms to a set of 500 PSFs located in theblue box on Fig. 3. We applied the algorithms to different observations of the fields,with varying level of white gaussian noise. For a discrete signal s of length N corruptedwith a white gaussian noise b , we define the signal to noise ratio (SNR) as:SNR = (cid:107) s (cid:107) N σ b . (36) onstraint matrix factorization for space variant PSFs field restoration Figure 3.
Simulated PSFs distribution across the FOV.
In astronomical surveys, the estimated PSF’s shape is particularly important; precisely,one has to be able to capture the PSF anisotropy. We recall that for an image X = ( x ij ) i,j , the central moments are defined as µ p,q ( X ) = (cid:88) i (cid:88) j ( i − i c ) p ( j − j c ) q x ij (37)with ( p, q ) ∈ N , ( i c , j c ) are the image centroid coordinates. The moments µ , and µ , quantifies the light intensity spreading relatively to the lines { ( i c , y ) , y ∈ R } and { ( x, j c ) , x ∈ R } respectively. Now we consider the moment µ , . We introduce thecentered and rotated pixels coordinates ( x i,θ , y j,θ ) defined by the system of equations x i,θ cos( θ ) + y j,θ sin( θ ) = i − i c (38) − x i,θ sin( θ ) + y j,θ cos( θ ) = j − j c , (39)for some θ ∈ [0 , π ]. Then we have µ , = (cid:88) i (cid:88) j [ sin(2 θ )2 ( − x i,θ + y j,θ ) + (2 cos ( θ ) − x i,θ y j,θ ] x ij , (40)and in particular, µ , = (cid:80) i (cid:80) j [ ( − x i, π + y j, π )] x ij . It becomes clear that µ , quantifiesthe light intensity spreading with respect to the pixels grid diagonals.The ellipticity parameters are defined as, e ( X ) = µ , ( X ) − µ , ( X ) µ , ( X ) + µ , ( X ) (41) e ( X ) = 2 µ , ( X ) µ , ( X ) + µ , ( X ) . (42)We define the vector γ ( X ) = [ e ( X ) , e ( X )] T . This vector characterizes how much X departs from an isotropic shape and indicates its main direction of orientation. onstraint matrix factorization for space variant PSFs field restoration X as follows:S( X ) = ( (cid:80) i (cid:80) j (( i − i c ) + ( j − j c ) ) x ij (cid:80) i (cid:80) j x ij ) / . (43)Assuming that a given PSF is a 2D discrete probability distribution, this quantitymeasures how much this distribution is spread around its mean [ i c , j c ] T . Let note( X i ) ≤ i ≤ p the set of ”original” PSFs and ( ˆ X i ) ≤ i ≤ p the set of corresponding estimatedPSFs with one of the compared methods, at a given SNR. The reconstruction quality isaccessed through the following quantities: • the average error on the ellipticity vector: E γ = (cid:80) pi =1 (cid:107) γ ( X i ) − γ ( ˆ X i ) (cid:107) /p ; • noting Γ = [ γ ( X ) − γ ( ˆ X ) , . . . , γ ( X p ) − γ ( ˆ X p )], the dispersion of the errors onthe ellipticity vector is measured through the nuclear norm B γ = (cid:107) Γ (cid:107) ∗ ; • the average absolute error on the size: E S = (cid:80) pi =1 | S( X i ) − S( ˆ X i ) | /p in pixels; • the dispersion of the errors on the size: σ S = std((S( X i ) − S( ˆ X i )) i ), in pixels. In this section, we compare RCA to PCA, GMRA andthe software PSFEx. We ran a PCA with different number of principal componentsbetween 0 and 15. 10 was the value which provided the best results. GMRA inputwas the data set intrinsic dimension [29], two, since the PSFs only vary as a functionof their position in the field; we provided the absolute squared quadratic error allowedwith respect to the observed data based on the observation noise level. For PSFEx, weused 15 components. Finally, RCA used up to 15 components, and effectively, 2 and 4components respectively for the lowest SNR fields realization. As previously mentioned,we assess the components sparsity’s constraint: • on the one hand we consider Φ s = I n which enforces the components sparsity inpixels domain; this is referred to as ”RCA” in the plots; • on the other hand, we take Φ s as the second generation Starlet forward transform[20], without the coarse scale; this is referred to as ”RCA analysis” in the plots.One can see on the left plot in Fig. 
4 that the proposed method is at least 10 timesmore accurate on the ellipticity vector than the other considered methods. Moreoverthe right plot shows that the accuracy is way more stable. This is true for both choiceof the dictionary Φ s .Fig. 5 shows that the estimated size S( ˆ X i ) is very sensitive to the choice of thedictionary Φ s . The results are by far more accurate with a sparsity constraint on thecomponents in wavelet domain than in direct domain. onstraint matrix factorization for space variant PSFs field restoration (a) Average error on the ellipticity vector (b) Dispersion of the ellipticity vector Figure 4. x axis: SNR (see section 6.2); y axis: log (E γ ) for the left plot, log (B γ )for the right plot.(a) Average absolute error on the size (b) Dispersion of the errors on the size Figure 5. x axis: SNR; y axis: E S for the left plot, σ S for the right plot. For a given estimate of the PSF at a given location, the error on the size parameteris more sensitive to errors on the core of the PSF (main lobe and first rings) and lesssensitive to errors on the outer part of the PSF than one would expect regarding theerror on the ellipticity vector. The error on the outer part of the PSF is essentiallyrelated to the observational noise, whereas the error on core of the PSF - which has ahigh SNR - is more related to the method induced bias. This explains why the PCAperforms quite well for this parameter. On the other hand, the bias induced by thesparsity is not only related to the dictionary choice, but also to the underlying datamodel with respect to the chosen dictionary.As previously explained, the components sparsity term is set in such a way topenalize any feature which does not emerge from the propagated noise, which is a source onstraint matrix factorization for space variant PSFs field restoration • we can model each component as s = Φ Ts α , with α sparse, which is known in thesparse recovery literature as synthesis prior; • we can alternately constraint Φ s s to be sparse.This priors are equivalent if the dictionary is unitary [30]. Therefore the pixel domainsparsity constraint can be considered as falling into both framework. However, thetwo priors are no longer equivalent and potentially yields quite different solutions forovercomplete dictionaries.We observe in practice that unless the simulated PSFs are strictly sparse withrespect to the chosen dictionary - this includes redundant wavelet dictionaries, thesynthesis prior yields a bias on the reconstructed PSF size, since the estimated PSFsare sparse linear combinations of atoms which are in general sharper than a typicalPSF profile. The analysis prior is somehow weaker and appears to be more suitable forapproximately sparse data.We do not observe a significant difference between these methods with respect tothe mean squared error, except for GMRA which gave noisier reconstructions.We applied the aforementioned methods to the PSFs field previously used, withadditional 30 corners PSFs and 30 localized PSFs as shown on Fig. 3 at an SNR of 40.This assess the behavior of the algorithms with respect to spatial clustering and sparsedata distribution. One can see in Fig. 6 examples of simulated observed PSFs fromdifferent areas in the FOV. (a) Observation 1:center PSF (b) Observation 2:center PSF (c) Observation 3:corner PSF (d) Observation 4:corner PSF (e) Observation 5:”local” PSF (f) Observation 6:”local” PSF Figure 6.
Input PSFs at different locations in the FOV for a SNR = 40. Thecorresponding reconstructed PSFs can be seen in Fig. 7
For each of these observed PSFs, the reconstructed PSFs for each method are shownin Fig. 7.One can observe that the proposed method gives noiseless and rather accuratePSFs reconstruction for both the center, the corners and the localized area of the field onstraint matrix factorization for space variant PSFs field restoration Figure 7.
PSFs reconstructions: from the left to the right: original, GMRA, PCA,PSFEx, RCA; from the bottom to the top: 2 ”local” PSFs reconstructions, 2 cornerPSFs reconstructions, 2 center PSFs reconstructions. The observed correspondingPSFs can be seen in Fig. 6 onstraint matrix factorization for space variant PSFs field restoration Φ s considered are not specificallyadapted to curve-like structures. The ring patterns varies across the FOV but are locallycorrelated. Therefore, they can only be recovered where the PSFs are sufficiently denseand numerous, which is the case at the FOV’s center.PCA and PSFEx yield a significant increase of the SNR in their estimated PSFsat the center and in the localized area. Yet, they fail to do so in the corners because ofthe lack of correlation for the PCA and local smoothness for PSFEx.Finally, the poor results obtained with GMRA can be explained by the fact thatthe underlying manifold sampling is not sufficiently dense for the tangent spaces to beestimated reliably. In this section, the data are additionally downsampled toEuclid telescope resolution. PCA and GMRA does not handle the downsampling.Therefore we only consider PSFEx and RCA in this section. For each method, weestimate an upsampled version of each PSF, with a factor 2 in lines and columns; incase of Euclid, this is enough to have a Nyquist frequency greater than half the signalspatial bandwidth [31].As previously, RCA Analysis refers to the proposed method, with the dictionary Φ s chosen as the second generation Starlet forward transform [20], without the coarsescale; RCA LSQ refers to the proposed method with the dictionary Φ s chosen as theidentity matrix, and the weight matrix A simply calculated as (cid:98) A = argmin A (cid:107) Y − F ( (cid:98) SA ) (cid:107) F , (44) (cid:98) S being the current estimate of the components matrix. Among all themethods previously considered for comparison, PSFEx is the only one handling theundersampling.As for the dimension reduction experiment, the proposed method with Φ s chosenas a wavelet dictionary is at least one order of magnitude more accurate over the shapeparameters and the mean square error. Besides, Fig. 10 shows that the proximityconstraint over the matrix A allows one to select a significantly better optimum thana simple least square update of A . Indeed, regularizing the weight matrix estimationreinforces the rejection of F ’s null space.As previously, we restored the complete field of Fig. 3 for a linear SNR of 40, using”RCA Analysis”, with undersampled input PSFs as shown in Fig. 11.The figure 12 shows consistent results with the dimension reduction experiment. Inparticular, the corners PSFs restoration is obviously more accurate.
7. Reproducible research
In the spirit of participating in reproducible research, the data and the codes usedto generate the plots presented in this paper will be made available at onstraint matrix factorization for space variant PSFs field restoration (a) Average error on the ellipticity vector (b) Dispersion of the error on the ellipticity vector Figure 8. x axis: SNR (see section 6.2); y axis: log (E γ ) for the left plot, log (B γ )for the right plot.(a) Average absolute error on the size (b) Dispersion of the errors on the size Figure 9. x axis: SNR; y axis: E S for the left plot, σ S for the right plot. cosmostat.org/software/rca/ .
8. Conclusion
We introduced RCA which is a dimension reduction method for continuous andpositive data field which is noise robust and handles undersampled data. As a lineardimension reduction method, RCA computes the input data as linear combinations offew components which are estimated, as well as the linear combination coefficients,through a matrix factorization.The method was tested over a field of simulated Euclid telescope PSFs. We showthat constraining both the components matrix and the coefficients matrix using sparsity onstraint matrix factorization for space variant PSFs field restoration Figure 10.
Average normalized least square error(a) Observation 1:center PSF (b) Observation 2:center PSF (c) Observation 3:corner PSF (d) Observation 4:corner PSF (e) Observation 5:”local” PSF (f) Observation 6:”local” PSF
Figure 11.
Input PSF at different locations in the field for a SNR = 40. yield at least one order of magnitude more accurate PSFs restoration than existingmethods, with respect to the PSFs shapes parameters. In particular, we show thatthe analysis formulation of the sparsity constraint over the components is particularlysuitable for capturing accurately the PSFs sizes. We also show that constraining thecoefficients matrix yields a significantly better identification of the PSFs embeddingsubspace when the data are undersampled.An important extension of RCA for astronomical imaging would be to account forthe wavelength dependency of the PSFs. Indeed, an unresolved star image is a linearcombination of the PSFs at different wavelengths weighted by the star’s spectrum.Hence, RCA can be naturally extended by replacing the matrix S with a tensor, forwhich each element would be a polychromatic eigen PSF . Acknowledgements
This work is supported by the European Community through the grants PHySIS(contract no. 640174) and DEDALE (contract no. 665044) within the H2020 FrameworkProgram. The authors acknowledge the Euclid Collaboration, the European SpaceAgency and the support of the Centre National d’Etudes Spatiales. onstraint matrix factorization for space variant PSFs field restoration Figure 12.
PSFs reconstructions: from the left to the right: original, PSFEx,RCA; from the bottom to the top: 2 ”local” PSFs reconstructions, 2 corner PSFsreconstructions, 2 center PSFs reconstructions. The observed corresponding PSFs canbe seen in Fig. 11 onstraint matrix factorization for space variant PSFs field restoration References [1] ESA/SRE 2011 EUCLID Mapping the geometry of the dark universe Tech. rep. ESA[2] Escande P and Weiss P 2014
ArXiv e-prints ( Preprint )[3] Miraut D, Ball´e J and Portilla J 2012
EURASIP J. Adv. Sig. Proc.
193 URL http://dblp.uni-trier.de/db/journals/ejasp/ejasp2012.html [4] Denis L, Thi´ebaut E and Soulez F 2011 Fast model of space-variant blurring and its application todeconvolution in astronomy.
ICIP ed Macq B and Schelkens P (IEEE) pp 2817–2820 ISBN 978-1-4577-1304-0 URL http://dblp.uni-trier.de/db/conf/icip/icip2011.html [5] Farsiu S, Robinson D, Elad M and Milanfar P 2004
International Journal of Imaging Systems andTechnology http://dx.doi.org/10.1002/ima.20007 [6] Rowe B, Hirata C and Rhodes J 2011 ( Preprint arXiv:1105.2852v2 )[7] Bertin E 2011 Automated Morphometry with SExtractor and PSFEx
Astronomical Data AnalysisSoftware and Systems XX ( Astronomical Society of the Pacific Conference Series vol 442) edEvans I N, Accomazzi A, Mink D J and Rots A H p 435[8] Ngol`e Mboula F M, Starck J L, Ronayette S, Okumura K and Amiaux J 2015
Astronomy &Astrophysics
A86 (
Preprint )[9] Wang W T and Huang H C 2015
ArXiv e-prints ( Preprint )[10] Cheng Q 2006 Spatial and spatially weighted principal component analysis for images processing
Geoscience and Remote Sensing Symposium, 2006. IGARSS 2006. IEEE InternationalConference on pp 972–975[11] Harris P, Brunsdon C and Charlton M 2011
International Journal of Geographical InformationScience Preprint http://dx.doi.org/10.1080/13658816.2011.554838 ) URL http://dx.doi.org/10.1080/13658816.2011.554838 [12] Demˇsar U, Harris P, Brunsdon C, Fotheringham A S and McLoone S 2013
Annals of the Associationof American Geographers
Preprint http://dx.doi.org/10.1080/00045608.2012.689236 ) URL http://dx.doi.org/10.1080/00045608.2012.689236 [13] Allard W K, Chen G and Maggioni M 2011
ArXiv e-prints ( Preprint )[14] Maggioni M, Minsker S and Strawn N 2014
ArXiv e-prints ( Preprint )[15] Lee A B, Nadler B and Wasserman L 2008
Ann. Appl. Stat. http://dx.doi.org/10.1214/07-AOAS137 [16] Rakhuba M V and Oseledets I V 2014 ArXiv e-prints ( Preprint )[17] Anderson W N and Morley T D 1985
Linear and Multilinear Algebra Spectral graph theory vol 92 (American Mathematical Soc.)[19] Hammond D K, Vandergheynst P and Gribonval R 2009
ArXiv e-prints ( Preprint )[20] Starck J L, Murtagh F and Bertero M 2011 Starlet transform in astronomical data processing
Handbook of Mathematical Methods in Imaging ed Scherzer O (Springer New York) pp 1489–1531 ISBN 978-0-387-92919-4[21] Soussen C, Idier J, Duan J and Brie D 2015
IEEE Transactions on Signal Processing https://hal.archives-ouvertes.fr/hal-00948313 [22] Blumensath T and Davies M 2008 Journal of Fourier Analysis and Applications http://dx.doi.org/10.1007/s00041-008-9035-z [23] Cartis C and Thompson A 2015 Information Theory, IEEE Transactions on Mathematical Programming
Sparse Image and Signal Processing: Wavelets, Curvelets,Morphological Diversity (New York, NY, USA: Cambridge University Press) ISBN 0521119138,9780521119139[26] Cand`es E, Wakin M and Boyd S 2008
Journal of Fourier Analysis and Applications Modern Cosmology ed Dodelson S (Burlington:Academic Press) pp 292 – III ISBN 978-0-12-219141-1 URL onstraint matrix factorization for space variant PSFs field restoration science/article/pii/B9780122191411500292 [28] Paulin-Henriksson S, Amara A, Voigt L, Refregier A and Bridle S L 2008 Astronomy & Astrophysics
Preprint )[29] Little A, Lee J, Jung Y M and Maggioni M 2009 Estimation of intrinsic dimensionality of samplesfrom noisy low-dimensional manifolds in high dimensions with multiscale svd
Statistical SignalProcessing, 2009. SSP ’09. IEEE/SP 15th Workshop on pp 85–88[30] Elad M, Milanfar P and Rubinstein R 2007
Inverse Problems
947 URL http://stacks.iop.org/0266-5611/23/i=3/a=007 [31] Cropper M e a 2013
Monthly Notices of the Royal Astronomical Society ( Preprint arXiv:astro-ph.IM/1210.7691 )[32] Moreau J 1965
Bulletin de la Soci´et´e Math´ematique de France http://eudml.org/doc/87067 [33] Combettes P L, Condat L, Pesquet J C and Cong Vu B 2014 ArXiv e-prints ( Preprint )[34] Beck A and Teboulle M 2009
SIAM Journal on Imaging and Sciences ICIP (IEEE) pp 917–920 URL http://dblp.uni-trier.de/db/conf/icip/icip2008.html
Appendix A. Notch filter approximation
In this appendix, we explain why the functional $\Psi_{e,a}$ introduced in subsection 5.2 can be approximated by the functional $\widehat{\Psi}_{e,a}$. We use the notations of subsection 5.2 and consider the 1D case. The samples $(u_i)_{1 \leq i \leq p}$ are uniformly spaced scalars. We assume that $u_1 < \cdots < u_p$ and we note $\Delta = u_2 - u_1$. Thus,

$\psi_i = \psi_{p-i+1} = -a |k+1-i|^e \Delta^e$ if $i \neq k+1$, and (A.1)

$\psi_{k+1} = 2 \sum_{n=1}^{k} a n^e \Delta^e$. (A.2)

Using the centered definition of the convolution with a zero boundary condition, for a vector $v = (v_i)_{1 \leq i \leq p}$, the vector $h = v \star \psi_{e,a}$ is given by

$h[j] = \sum_{i=1}^{p} v_i \psi_{j+k+1-i}$, (A.3)

for $j \in \llbracket 1, p \rrbracket$ and with the convention that $\psi_{j+k+1-i} = 0$ if $j+k+1-i < 1$ or $j+k+1-i > p$. Combining Eqs. A.1, A.2 and A.3, we can write

$h[j] = \Big( 2 \sum_{n=1}^{k} a n^e \Delta^e \Big) v_j - \sum_{i \in [\max(1, j-k), \min(p, j+k)], \, i \neq j} a |j-i|^e \Delta^e v_i$, (A.4)

and $\Psi_{e,a}(v) = \|h\|_2^2$. On the other hand, $\widehat{\Psi}_{e,a}(v) = \|t_v\|_2^2$, with $t_v$ defined as

$t_v[j] = \Big( 2 \sum_{n=1}^{\min(j-1, p-j)} n^e + \sum_{n=\min(j-1, p-j)+1}^{\max(j-1, p-j)} n^e \Big) a \Delta^e v_j - \sum_{i=1, \, i \neq j}^{p} a |j-i|^e \Delta^e v_i$ if $j \neq k+1$, and

$t_v[k+1] = \Big( 2 \sum_{n=1}^{k} a n^e \Delta^e \Big) v_{k+1} - \sum_{i=1, \, i \neq k+1}^{p} a |k+1-i|^e \Delta^e v_i$. (A.5)

Thus, $t_v[k+1] - h[k+1] = 0$ and for $j \neq k+1$,

$t_v[j] - h[j] = \Big( \sum_{n=k+1}^{\max(j-1, p-j)} n^e - \sum_{n=\min(j-1, p-j)+1}^{k} n^e \Big) a \Delta^e v_j - \sum_{i=1, \, i \neq j}^{\max(1, j-k)} a |j-i|^e \Delta^e v_i - \sum_{i=\min(p, j+k), \, i \neq j}^{p} a |j-i|^e \Delta^e v_i$. (A.6)

Given the symmetry of $\psi_{e,a}$ with respect to $k+1$, we focus on the above difference for $j \leq k$. We further assume that $j \neq 1$. Then, Eq. A.6 simplifies to

$t_v[j] - h[j] = \Big( \sum_{n=k+1}^{p-j} n^e - \sum_{n=j}^{k} n^e \Big) a \Delta^e v_j - a (j-1)^e \Delta^e v_1 - \sum_{n=k}^{p-j} a n^e \Delta^e v_{j+n}$. (A.7)

Now, using the inequalities, for $n > 1$,

$\int_{n-1}^{n} t^e \, dt \leq n^e \leq \int_{n-1}^{n} (t+1)^e \, dt$, (A.8)

and assuming that $e > 1$, we get the following upper bound:

$|t_v[j] - h[j]| \leq a \Delta^e \|v\|_\infty \Big[ \frac{1}{e+1} \max\big( |(p-j+1)^{e+1} + (j-1)^{e+1} - (k+1)^{e+1} - k^{e+1}|, \, |(p-j)^{e+1} + j^{e+1} - (k+1)^{e+1} - k^{e+1}| \big) + (j-1)^e + \frac{(p-j+1)^{e+1} - k^{e+1}}{e+1} \Big]$. (A.9)

We see that the larger $k$ is (we recall that $p = 2k+1$) and the closer $j$ is to $k$, the smaller the error. Therefore, we use $t_v$ as an approximation of $h$, up to boundary errors.
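The approximation can be checked numerically. The following Python sketch is our own illustration, not code from the paper; the function names and the choice $\Psi_{e,a}(v) = \|v \star \psi_{e,a}\|_2^2$ are assumptions. It builds $\psi_{e,a}$ from Eqs. A.1 and A.2, computes $h$ from Eq. A.3 and $t_v$ from Eq. A.5, and verifies that the two vectors coincide at the center and deviate only near the boundaries.

    import numpy as np

    def notch_kernel(k, e, a, delta):
        # Eqs. A.1-A.2 (1-based index i mapped to 0-based arrays):
        # psi_i = -a * |k+1-i|^e * delta^e for i != k+1,
        # psi_{k+1} = 2 * a * sum_{n=1..k} n^e * delta^e
        p = 2 * k + 1
        i = np.arange(1, p + 1)
        psi = -a * np.abs(k + 1.0 - i) ** e * delta ** e
        psi[k] = 2.0 * a * np.sum(np.arange(1, k + 1) ** e) * delta ** e
        return psi

    def h_vector(v, psi, k):
        # Eq. A.3: h[j] = sum_i v_i * psi_{j+k+1-i}, zero boundary condition
        p = v.size
        h = np.zeros(p)
        for j in range(1, p + 1):
            for i in range(1, p + 1):
                m = j + k + 1 - i
                if 1 <= m <= p:
                    h[j - 1] += v[i - 1] * psi[m - 1]
        return h

    def t_vector(v, k, e, a, delta):
        # Eq. A.5; the j = k+1 case is recovered by the generic formula
        p = v.size
        t = np.zeros(p)
        for j in range(1, p + 1):
            lo, hi = min(j - 1, p - j), max(j - 1, p - j)
            diag = 2.0 * np.sum(np.arange(1, lo + 1) ** e) \
                 + np.sum(np.arange(lo + 1, hi + 1) ** e)
            t[j - 1] = diag * a * delta ** e * v[j - 1]
            for i in range(1, p + 1):
                if i != j:
                    t[j - 1] -= a * abs(j - i) ** e * delta ** e * v[i - 1]
        return t

    k, e, a, delta = 20, 2.0, 1.0, 0.1
    v = np.random.default_rng(0).standard_normal(2 * k + 1)
    psi = notch_kernel(k, e, a, delta)
    err = np.abs(t_vector(v, k, e, a, delta) - h_vector(v, psi, k))
    print(err[k], err.max())   # exact match at the center, larger error at the edges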
Appendix B. Convex analysis

In this appendix, we give the general convex analysis material relevant to our work. We consider a finite-dimensional Hilbert space $\mathcal{H}$ equipped with the inner product $\langle \cdot, \cdot \rangle$ and the associated norm $\|\cdot\|$.

Appendix B.1. Proximity operator
Definition: let $F$ be a real-valued function defined on $\mathcal{H}$. $F$ is proper if its domain, defined by $\mathrm{dom}\, F = \{x \in \mathcal{H} : F(x) < +\infty\}$, is non-empty. $F$ is lower semicontinuous (LSC) if $\liminf_{x \to x_0} F(x) \geq F(x_0)$. We define $\Gamma_0(\mathcal{H})$ as the set of proper LSC convex real-valued functions defined on $\mathcal{H}$. For a function $F \in \Gamma_0(\mathcal{H})$, the function $y \mapsto \frac{1}{2}\|\alpha - y\|^2 + F(y)$ achieves its minimum at a unique point denoted $\mathrm{prox}_F(\alpha)$, for all $\alpha \in \mathcal{H}$ [32]; the operator $\mathrm{prox}_F$ is the proximity operator of $F$.

Examples:
• let $C$ be a closed convex subset of $\mathcal{H}$. The indicator function of $C$ is defined as $\iota_C(x) = 0$ if $x \in C$, and $\iota_C(x) = +\infty$ otherwise (B.1); it is clear from this definition that the proximity operator of $\iota_C$ is the orthogonal projector onto $C$;
• for $\mathcal{H} = \mathbb{R}$, $\lambda \in \mathbb{R}_+$ and $f : x \mapsto \lambda |x|$, $\mathrm{prox}_f(y) = \mathrm{SoftThresh}_\lambda(y)$, where $\mathrm{SoftThresh}_\lambda$ denotes the soft-thresholding operator with threshold $\lambda$.

Properties:
• separability: if $\mathcal{H} = \mathcal{H}_1 \times \cdots \times \mathcal{H}_n$, $F \in \Gamma_0(\mathcal{H})$ and $F(x) = F_1(x[1]) + \cdots + F_n(x[n])$ with $F_i \in \Gamma_0(\mathcal{H}_i)$ for $i = 1 \ldots n$, then $\mathrm{prox}_F(y) = (\mathrm{prox}_{F_1}(y[1]), \ldots, \mathrm{prox}_{F_n}(y[n]))$;
• translation: for $F \in \Gamma_0(\mathcal{H})$ and $a \in \mathcal{H}$, define $F_a(x) = F(x - a)$; then $\mathrm{prox}_{F_a}(y) = a + \mathrm{prox}_F(y - a)$.

Appendix B.2. Convex conjugate

Definition: let $F$ be a real-valued function defined on $\mathcal{H}$. The function $F^* : y \mapsto \sup_x \langle x, y \rangle - F(x)$ is the convex conjugate of $F$; it is also known as the Legendre-Fenchel transform of $F$.

Properties:
• Moreau identity: for $F \in \Gamma_0(\mathcal{H})$ and $\lambda \in \mathbb{R}_+^*$, $\mathrm{prox}_{\lambda F}(x) + \lambda \, \mathrm{prox}_{F^*/\lambda}(x/\lambda) = x$;
• Fenchel-Moreau theorem: if $F \in \Gamma_0(\mathcal{H})$, then $F = F^{**}$.
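As an illustration (ours, not the paper's), the two example proximity operators above and the Moreau identity can be checked numerically in a few lines of Python, taking $F = \|\cdot\|_1$, whose convex conjugate is the indicator of the unit $\ell_\infty$ ball:

    import numpy as np

    def soft_thresh(y, lam):
        # proximity operator of f : x -> lam * |x|, applied entrywise
        return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

    def proj_box(y, radius):
        # orthogonal projection onto {x : |x_i| <= radius}, i.e. the proximity
        # operator of the indicator function of that closed convex set
        return np.clip(y, -radius, radius)

    rng = np.random.default_rng(1)
    x, lam = rng.standard_normal(6), 0.7
    # Moreau identity: prox_{lam F}(x) + lam * prox_{F*/lam}(x/lam) = x;
    # F* is an indicator, so prox_{F*/lam} is the projection onto the unit ball
    lhs = soft_thresh(x, lam) + lam * proj_box(x / lam, 1.0)
    print(np.allclose(lhs, x))   # True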
Appendix C. Minimization schemes

This appendix details the practical resolution of the optimization problems involved in the proposed method.

Appendix C.1. Components estimation problem
We consider step 5 in Alg. 1. If $\Phi_s = I_n$, the problem of estimating the components takes the following generic form:

$\min_S F(S) + G(L(S)) + H(S)$, (C.1)

with $F(S) = \sum_{i=1}^{r} \|w_i \odot s_i^{(c)}\|_1$, $G = \iota_{\mathbb{R}_+^{n \times p}}$, $L(S) = SA$ and $H(S) = \frac{1}{2}\|Y - \mathcal{M}(S)\|_F^2$ for some bounded linear operator $\mathcal{M}$. $F \in \Gamma_0(\mathbb{R}^{n \times r})$, $G \in \Gamma_0(\mathbb{R}^{n \times p})$ and $L$ is a bounded linear operator. Moreover, $H$ is convex, differentiable and has a Lipschitz-continuous gradient. This problem can be solved efficiently using, for instance, the primal-dual algorithms introduced in [33]. One only needs to be able to compute the proximity operators of $\lambda F$ and $\alpha G^*$, for given positive reals $\lambda$ and $\alpha$, and the gradient of $H$:

• $\mathrm{prox}_{\lambda F}(S) = (\hat{s}_{ij})_{1 \leq i \leq n, \, 1 \leq j \leq r}$, with $\hat{s}_{ij} = \mathrm{SoftThresh}_{\lambda w_j[i]}(s_j[i])$;
• $\mathrm{prox}_{\alpha G^*}(Z) = Z - (Z)_+$;
• $\nabla H(S) = -\mathcal{M}^*(Y - \mathcal{M}(S))$, where $\mathcal{M}^*$ is the adjoint operator of $\mathcal{M}$.

For an arbitrary dictionary $\Phi_s$, we instead consider the following generic formulation of the problem (see the sketch after this subsection):

$\min_S G_1(L_1(S)) + G_2(L_2(S)) + H(S)$, (C.2)

where $G_1 \circ L_1 = G \circ L$ as above, $G_2(Z) = \sum_{i=1}^{r} \|w_i \odot Z_i^{(c)}\|_1$ and $L_2(S) = [\Phi_s s_1^{(c)}, \ldots, \Phi_s s_r^{(c)}]$. One can use the algorithms suggested before; the minimization additionally requires the proximity operator of $\alpha G_2^*$, for a given positive real $\alpha$, which is simply given by $\mathrm{prox}_{\alpha G_2^*}(Z) = Z - \widehat{Z}$, with $\widehat{Z}[i,j] = \mathrm{SoftThresh}_{w_j[i]}(Z[i,j])$.
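As a concrete illustration of these ingredients, the following Python snippet (a sketch of ours, with a toy observation operator and hypothetical sizes, not the exact algorithm of [33]) solves a small instance of Problem C.1 with a Condat-Vu type primal-dual loop, using the weighted soft-thresholding for $\mathrm{prox}_{\tau F}$ and $Z \mapsto Z - (Z)_+$ for $\mathrm{prox}_{\sigma G^*}$:

    import numpy as np

    rng = np.random.default_rng(2)
    n, r, p = 16, 3, 10                      # hypothetical sizes
    A = rng.standard_normal((r, p))          # coefficients, fixed at this step
    Y = np.abs(rng.standard_normal((n, r))) @ A   # toy noiseless observations
    W = 0.1 * np.ones((n, r))                # sparsity weights w_i

    M = lambda S: S @ A                      # toy linear observation operator
    M_adj = lambda R: R @ A.T                # its adjoint

    soft = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
    grad_H = lambda S: -M_adj(Y - M(S))      # gradient of 0.5 * ||Y - M(S)||_F^2
    prox_F = lambda S, tau: soft(S, tau * W) # entrywise weighted soft-thresholding
    prox_Gc = lambda Z: Z - np.maximum(Z, 0.0)  # prox of the conjugate of the
                                                # positive-orthant indicator

    L2 = np.linalg.norm(A, 2) ** 2           # ||L||^2 for L(S) = S A
    lip = L2                                 # Lipschitz constant of grad_H here
    sigma = 1.0
    tau = 0.9 / (lip / 2.0 + sigma * L2)     # standard primal-dual step condition
    S, Z = np.zeros((n, r)), np.zeros((n, p))
    for _ in range(1000):
        S_new = prox_F(S - tau * (grad_H(S) + Z @ A.T), tau)
        Z = prox_Gc(Z + sigma * (2 * S_new - S) @ A)
        S = S_new
    print(np.linalg.norm(Y - M(S)))          # data residual after the loop

With an arbitrary dictionary $\Phi_s$, the same loop applies with an additional dual variable associated with $G_2 \circ L_2$, thresholded as described above.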
Appendix C.2. Coefficients estimation

We consider step 8 in Alg. 1. The problem takes the generic form:

$\min_\alpha J(\alpha)$ s.t. $\|\alpha[l,:]\|_0 \leq \eta_l$, $l = 1 \ldots r$, (C.3)

where $J$ is convex, differentiable with a Lipschitz-continuous gradient, and $\alpha \in \mathbb{R}^{r \times q}$. This problem is combinatorial and its feasible set is non-convex. For typical data sizes in image processing applications and tractable processing times, one can at best reach a "good" local optimum. There is an extensive literature on optimization problems involving the $\ell_0$ pseudo-norm. We propose a heuristic based on ideas that are by now quite common and that turns out to be convenient from a practical point of view. Let $\alpha^*$ be a global minimum of Problem C.3. For a matrix $M \in \mathbb{R}^{r \times q}$, we define its support as

$\mathrm{Supp}(M) = \{(i,j) \in \llbracket 1, r \rrbracket \times \llbracket 1, q \rrbracket : |M[i,j]| > 0\}$. (C.4)

We note $E_{\alpha^*}$ the set of $r \times q$ real matrices sharing the support of $\alpha^*$:

$E_{\alpha^*} = \{M \in \mathbb{R}^{r \times q} : \mathrm{Supp}(M) = \mathrm{Supp}(\alpha^*)\}$. (C.5)

$E_{\alpha^*}$ is a vector space; in particular, $E_{\alpha^*}$ is a convex set. Therefore, $\alpha^*$ is a solution of the following problem:

$\min_\alpha J(\alpha)$ s.t. $\alpha \in E_{\alpha^*}$. (C.6)

The proposed scheme is motivated by the idea of approximately identifying $E_{\alpha^*}$ along the iterative process. One can think of numerous algorithms to solve Problem C.6, all involving the orthogonal projection onto $E_{\alpha^*}$. We build upon the fast proximal splitting algorithm introduced in [34]. For a vector $u \in \mathbb{R}^q$, we note $\sigma$ a permutation of $\llbracket 1, q \rrbracket$ verifying $|u[\sigma(1)]| \geq \cdots \geq |u[\sigma(q)]|$. For an integer $k \leq q$, we define

$\mathrm{Supp}_k(u) = \{i \in \llbracket 1, q \rrbracket : |u[i]| \geq |u[\sigma(k)]|\}$. (C.7)

Finally, for a matrix $\alpha \in \mathbb{R}^{r \times q}$, we define the subspace

$E_{k,\alpha} = \{M \in \mathbb{R}^{r \times q} : \mathrm{Supp}_k(M[i,:]) = \mathrm{Supp}_k(\alpha[i,:]), \, i = 1 \ldots r\}$. (C.8)

The proposed scheme is given in Algorithm 2; $f$ is a positive-valued, concave, increasing function and $\mathrm{proj}_{E_{\lfloor f(k) \rfloor, U_k}}(\cdot)$ denotes the orthogonal projection onto $E_{\lfloor f(k) \rfloor, U_k}$.
Algorithm 2 Beck-Teboulle proximal gradient algorithm with a variable proximity operator

Initialization: $\alpha_0 = 0_{\mathbb{R}^{r \times q}}$, $\beta_0 = \alpha_0$, $t_0 = 1$, $\mathrm{res}_{-1} = 0$, $\mathrm{res}_0 = 0$, $\mathrm{tol} > 0$, $k = 0$.
Minimization:
while $k < k_{\max}$ and $|(\mathrm{res}_k - \mathrm{res}_{k-1}) / \mathrm{res}_k| > \mathrm{tol}$ do
  $U_k = \beta_k - \rho^{-1} \nabla J(\beta_k)$
  $\alpha_{k+1} = \mathrm{proj}_{E_{\lfloor f(k) \rfloor, U_k}}(U_k)$
  $t_{k+1} = \frac{1 + \sqrt{4 t_k^2 + 1}}{2}$
  $\lambda_k = 1 + \frac{t_k - 1}{t_{k+1}}$
  $\beta_{k+1} = \alpha_k + \lambda_k (\alpha_{k+1} - \alpha_k)$
  $\mathrm{res}_{k+1} = J(\beta_{k+1})$
  $k = k + 1$
end while
Return: $\alpha_{k_{\mathrm{stop}}}$

The support size of the solution is constrained at the projection step of Algorithm 2 and is gradually increased, as shown in Fig. C1. The convergence analysis of this scheme is out of the scope of this paper. However, Fig. C2 suggests that once an index is included in an iterate's support, it is included in the supports of all the forthcoming iterates. This implies that at each support-size step in Fig. C1, the algorithm approximately solves a problem of the following form:

$\min_\alpha J(\alpha)$ s.t. $\alpha \in E$, (C.9)

for a given subspace $E$, which is a convex problem.
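For concreteness, here is a direct Python transcription of Algorithm 2 (our own sketch; the quadratic objective $J$, the step size $1/\rho$ and the support-size function $f$ are illustrative placeholders, not choices prescribed by the paper):

    import numpy as np

    def proj_rows_topk(U, k):
        # orthogonal projection onto E_{k,U} (Eqs. C.7-C.8): in each row, keep
        # the k entries of largest magnitude and zero out the others
        out = np.zeros_like(U)
        for i in range(U.shape[0]):
            keep = np.argsort(-np.abs(U[i]))[:k]
            out[i, keep] = U[i, keep]
        return out

    def algorithm_2(J, grad_J, shape, rho, f, k_max=200, tol=1e-8):
        alpha = np.zeros(shape)
        beta, t = alpha.copy(), 1.0
        res_prev, res = 0.0, np.inf
        for k in range(k_max):
            U = beta - grad_J(beta) / rho
            alpha_new = proj_rows_topk(U, max(1, int(f(k))))
            t_new = (1.0 + np.sqrt(4.0 * t ** 2 + 1.0)) / 2.0
            lam = 1.0 + (t - 1.0) / t_new
            beta = alpha + lam * (alpha_new - alpha)
            alpha, t = alpha_new, t_new
            res_prev, res = res, J(beta)
            if res != 0.0 and abs((res - res_prev) / res) < tol:
                break
        return alpha

    # placeholder objective J(alpha) = 0.5 * ||X alpha - B||_F^2
    rng = np.random.default_rng(3)
    r, q = 4, 50
    X, B = rng.standard_normal((r, r)), rng.standard_normal((r, q))
    J = lambda a: 0.5 * np.linalg.norm(X @ a - B) ** 2
    grad_J = lambda a: X.T @ (X @ a - B)
    rho = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of grad_J
    sol = algorithm_2(J, grad_J, (r, q), rho, lambda x: np.sqrt(x + 1.0))
    print(np.count_nonzero(sol, axis=1))     # gradually grown per-row supports

The initialization of the residuals simply avoids an undefined relative variation during the first iterations; the choice $f(x) = \sqrt{x+1}$ matches the support-size schedule plotted in Fig. C1.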
Figure C1. Support size function; X axis: iteration index $k$ in Algorithm 2; Y axis: $\lfloor f(k) \rfloor$ for $f(x) = \sqrt{x+1}$.
Figure C2. Evolution of the main iterate of Algorithm 2; X axis: $|\alpha_{k+1}[0,:]|$ for the top image and $|\alpha_{k+1}[1,:]|$ for the bottom image; Y axis: iterate index $k$.

This scheme can be viewed as an iterative hard thresholding [22] with a decreasing threshold [35]. Yet, it is quite easy to get an upper bound on the support size (related to the parameters $\eta_l$ in Problem C.3) from the data. Depending on the time one is willing to spend on the coefficients computation, this yields convenient choices for the function $f$.