Bayesian estimation of regularization and PSF parameters for Wiener-Hunt deconvolution
aa r X i v : . [ s t a t . C O ] A p r Bayesian estimation of regularization and PSF parameters forWiener-Hunt deconvolution
Franc¸ ois Orieux, , ∗ Jean-Franc¸ ois Giovannelli, Thomas Rodet, Laboratoire des Signaux et Syst`emes (
CNRS – SUPELEC – Univ. Paris-Sud 11),
SUPELEC ,Plateau de Moulon, 3 rue Joliot-Curie, 91 192 Gif-sur-Yvette, France Laboratoire d’Int´egration du Mat´eriau au Syst`eme (
CNRS – ENSEIRB – Univ. Bordeaux 1 –
ENSCPB ), 351 cours de la Libration, 33405 Talence, France ∗ Corresponding author: [email protected]
This paper tackles the problem of image deconvolution with joint estimation ofPSF parameters and hyperparameters. Within a Bayesian framework, the solutionis inferred via a global a posteriori law for unknown parameters and object. Theestimate is chosen as the posterior mean, numerically calculated by means of aMonte-Carlo Markov chain algorithm. The estimates are efficiently computed inthe Fourier domain and the effectiveness of the method is shown on simulatedexamples. Results show precise estimates for PSF parameters and hyperparametersas well as precise image estimates including restoration of high-frequencies andspatial details, within a global and coherent approach. c (cid:13)
OCIS codes:
1. Introduction
Image deconvolution has been an active research field for several decades and recent contributionscan be found in papers such as [1–3]. Examples of application are medical imaging, astronomy,nondestructive testing and more generally imagery problems. In these applications, degradationsinduced by the observation instrument limit the data resolution while the need of precise interpre-tation can be of major importance. For example, this is particularly critical for long-wavelengthastronomy (see e.g., [4]). In addition, the development of a high quality instrumentation systemmust rationally be completed by an equivalent level of quality in the development of data process-ing methods. Moreover, even for poor performance systems, the restoration method can be used tobypass instrument limitations. 1hen the deconvolution problem is ill-posed a possible solution relies on regularization, i.e., in-troduction of information in addition to the data and the acquisition model [5,6]. As a consequenceof regularization, deconvolution methods are specific to the class of image in accordance with theintroduced information. From this standpoint, the present paper is dedicated to relatively smoothimages encountered for numerous applications in imagery [4, 7, 8]. The second order consequenceof ill-posedness and regularization is the need to balance the compromise between different sourcesof information.In the Bayesian approach [1, 9], information about unknowns is introduced by means of proba-bilistic models. Once these models are designed, the next step is to build the a posteriori law, giventhe measured data. The solution is then defined as a representative point of this law and the twomost classical are (1) the maximizer, and (2) the mean. From a computational standpoint, the firstleads to a numerical optimization problem and the latter leads to a numerical integration problem.However, the resulting estimate depends on two sets of variables in addition to the data.1. Firstly, the estimate naturally depends on the response of the instrument at work, namely thepoint spread function (PSF). The literature is predominantly devoted to deconvolution in thecase of known PSF. On the contrary, the present paper is devoted to the case of unknownor poorly known PSF and there are two main strategies to tackle its estimation from theavailable data set (without extra measurements).(i) In most practical cases, the instrument can be modeled using physical operating descrip-tion. It is thus possible to find the equation for the PSF, at least in a first approximation.This equation is usually driven by a relatively small number of parameters. It is a com-mon case in optical imaging where a Gaussian-shaped PSF is often used [10]. It is alsothe case in other fields: interferometry [11], magnetic resonance force microscopy [12],fluorescence microscopy [13],. . . Nevertheless, in real experiments, the parameter valuesare unknown or imperfectly known and need to be estimated or adjusted in addition tothe image of interest: the question is namely myopic deconvolution.(ii) The second strategy forbears the use of the parametric PSF deduced from the physicalanalysis and the PSF then naturally appears in a non-parametric form. Practically, thenon-parametric PSF is unknown or imperfectly known and needs to be estimated inaddition to the image of interest: the question is referred to as blind deconvolution forexample in interferometry [14–17].From an inference point of view, the difficulty of both myopic and blind problems lies inthe possible lack of information resulting in ambiguity between image and PSF, even in thenoiseless case. In order to resolve the ambiguity, information must be added [3, 18] and itis crucial to make inquiries based on any available source of information. To this end, theknowledge of the parametric PSF represents a precious means to structure the problem andpossibly resolve the degeneracies. Moreover, due to instrument design process, a nominal2alue as well as an uncertainty are usually available for the PSF parameters.In addition, from a practical and algorithmic standpoint, the myopic case, i.e., the case ofparametric PSF, is often more difficult due to the non-linear dependence of the observationmodel with respect to the PSF parameters. On the contrary, the blind case, i.e., the case ofnon-parametric PSF, yields a simpler practical and algorithmic problem since the observationmodel remains linear w.r.t. the unknown elements given the object.Despite the superior technical difficulty, the present paper is devoted to the myopic formatsince it is expected to be more efficient than the blind format from an information standpoint.Moreover, the blind case has been extensively studied and a large amount of paper is available[19–21], while the myopic case has been less investigated, though it is of major importance.2. Secondly, the solution depends on the probability law parameters named hyperparameters(means, variances, parameters of correlation matrix,. . . ). These parameters adjust the shapeof the laws and in the same time they tune the compromise between the information providedby the a priori and the information provided by the data. In real experiments, their values areunknown and need to be estimated: the question is namely unsupervised deconvolution.For both families of parameters (PSF parameters and hyperparameters), two approaches areavailable. In the first one, the parameter values are empirically tuned or estimated in a preliminarystep (with Maximum Likelihood [7] or calibration [22] for example), then the values are used in asecond step devoted to image restoration given the parameters. In the second one, the parametersand the object are jointly estimated [2, 19].For the myopic problem, Jalobeanu et al. [23] address the case of a symmetric Gaussian PSF.The width parameter and the noise variance are estimated in a preliminary step by Maximum-Likelihood. A recent paper [24] addresses the estimation of a Gaussian blur parameter, as in ourexperiment, with an empirical method. They found the Gaussian blur parameter by minimizing theabsolute derivatives of the restored images Laplacian.The present paper addresses the myopic and unsupervised deconvolution problem. We proposea new method that jointly estimates the PSF parameters, the hyperparameters, and the image ofinterest. It is built in a coherent and global framework based on an extended a posteriori lawfor all the unknown variables. The posterior law is obtained via the Bayes rule, founded on apriori laws: Gaussian for image and noise, uniform for PSF parameters and gamma or Jeffreys forhyperparameters.Regarding the image prior law, we have paid special attention to the parametrization of thecovariance matrix in order to facilitate law manipulations such as integration, conditioning or hy-perparameter estimation. The possible degeneracy of the a posteriori law in some limit cases isalso studied.The estimate is chosen as the mean of the posterior law and is computed using Monte-Carlosimulations. To this end, Monte-Carlo Markov chain (MCMC) algorithms [25] enable to draw3amples from the posterior distribution despite its complexity and especially the non-linear depen-dence w.r.t. the PSF parameters.The paper is structured in the following manner. Sec. 2 presents the notations and states theproblem. The three following sections describe our methodology: firstly the Bayesian probabilisticmodels are detailed in Sec. 3; then a proper posterior law is established in Sec. 4; an MCMCalgorithm to compute the estimate is described in Sec. 5. Numerical results are shown in Sec. 6.Finally, Sec. 7 is devoted to conclusion and perspectives.
2. Notations and convolution model
Consider N pixels real square images represented in lexicographic order by vector x ∈ R N , withgeneric elements x n . The forward model writes y = H w x + ǫ (1)where y ∈ R N is the vector of data, H w a convolution matrix, x the image of interest and ǫ themodelization errors or the noise. Vector w ∈ R P stands for the PSF parameters, such as width ororientation of a Gaussian PSF.The matrix H w is block-circulant with circulant-block (BCCB) for computational efficiencyof the convolution in the Fourier space. The diagonalization [26] of H w writes Λ H = F H w F † where F is the unitary Fourier matrix and † is the transpose conjugate symbol. The convolution,in the Fourier space, is then ◦ y = Λ H ◦ x + ◦ ǫ (2)where ◦ x = F x , ◦ y = F y and ◦ ǫ = F ǫ are the 2D discrete Fourier transform (
DFT -2 D ) of image,data and noise, respectively.Since Λ H is diagonal, the convolution is computed with a term-wise product in the Fourierspace. There is a strict equivalence between a description in spatial domain (Eq. (1)) and in Fourierdomain (Eq. (2)). Consequently, for coherent description and computational efficiency, all the de-velopments are equally done in the spatial space or in the Fourier space.For notational convenience, let us introduce the component at null-frequency ◦ x ∈ R and thevector of component at non-null frequencies ◦ x ∗ ∈ C N − so that the whole set of components writes ◦ x = [ ◦ x , ◦ x ∗ ] .Let us note the vector of N components equal to /N , so that t x is the empirical mean levelof the image. The Fourier components are the ◦ n and we have: ◦ = 1 and ◦ n = 0 for n = 0 .Moreover, Λ = F t F † is a diagonal matrix with only one non-null coefficient at null frequency.
3. Bayesian probabilistic model
This section presents the prior law for each set of parameters. Regarding the image of interest, inorder to account for smoothness, the law introduces high-frequency penalization through a differ-4ntial operator on the pixel. A conjugate law is proposed for the hyperparameters and a uniformlaw is considered for the PSF parameters.Moreover, we have paid a special attention to the image prior law parametrization. In the nextsection we present several parametrization in order to facilitate law manipulations such as inte-gration, conditioning or hyperparameter estimation. Moreover, the correlation matrix of the imagelaw may become singular in some limit cases resulting in a degenerated prior law (when p ( x ) = 0 for all x ∈ R N ). Based on this parametrization, Sec. 4 studies the degeneracy of the posterior inrelation with the parameters of the prior law. The probability law for the image is a Gaussian field with a given precision matrix P parametrizedby a vector γ . The pdf reads p ( x | γ ) = (2 π ) − N/ det[ P ] / exp (cid:20) − x t P x (cid:21) . (3)For computational efficiency, the precision matrix is designed (or approximated) in a toroidal man-ner, and it is diagonal in the Fourier domain Λ P = F P F † . Thus, the law for x also writes p ( x | γ ) = (2 π ) − N/ det[ F ] det[ Λ P ] / det[ F † ] exp (cid:20) − x t F † Λ P F x (cid:21) (4) = (2 π ) − N/ det[ Λ P ] / exp (cid:20) − ◦ x † Λ P ◦ x (cid:21) (5)and it is sometimes referred to [27] as a Whittle approximation (see also [28, p.133]) for theGaussian law. The filter obtained for fixed hyperparameters is also the Wiener-Hunt filter [29], asdescribed in Sec. 5.A.This paper focuses on smooth images, thus on positive correlation between pixels. It is intro-duced by high-frequencies penalty using any circulant differential operator: p -th differences be-tween pixels, Laplacian, Sobel. . . The differential operator is denoted by D and its diagonalizedform by Λ D = F DF † . Then, the precision matrix writes P = γ D t D and its Fourier counterpartwrites Λ P = γ Λ † D Λ D = diag (cid:18) , γ | ◦ d | , . . . , γ | ◦ d N − | (cid:19) (6)where γ is a positive scale factor, diag builds a diagonal matrix from elementary components and ◦ d n is the n -th DFT -2 D coefficient of D .Under this parametrization of P , the first eigenvalue is equal to zero corresponding to the ab-sence of penalty for the null frequency ◦ x , i.e., no information accounted for about the empiricalmean level of the image. As a consequence, the determinant vanishes det[ P ] = 0 resulting in adegenerated prior. To manage this difficulty, several approaches have been proposed.Some authors [2, 30] still use this prior despite its degeneracy and this approach can be analyzedin two ways. 5. On the one hand, it can be seen as a non-degenerated law for ◦ x ∗ , the set of non-null frequencycomponents only. In this format, the prior does not affect any probability to the null frequencycomponent and the Bayes rule does not apply to this component. Thus, this strategy yieldsan incomplete posterior law, since the null frequency is not embedded in the methodology.2. On the other hand, it can be seen as a degenerated prior for the whole set of frequencies. Theapplication of the Bayes rule is then somewhat confusing due to degeneracy. In this format,the posterior law cannot be guaranteed to remain non-degenerated.Anyway, none of the two standpoints yields a posterior law that is both non-degenerated andaddressing the whole set of frequencies.An alternative parametrization relies on the energy of x . An extra term γ I , tuned by γ > , inthe precision matrix [31], introduces information for all the frequencies including ◦ x . The precisionmatrix writes Λ P = γ I + γ Λ † D Λ D = diag (cid:18) γ , γ + γ | ◦ d | , . . . , γ + γ | ◦ d N − | (cid:19) (7)with a determinant det[Λ P ] = N − Y n =0 (cid:18) γ + γ | ◦ d n | (cid:19) . (8)The obtained Gaussian prior is not degenerated and undoubtedly leads to a proper posterior. Nev-ertheless, the determinant Eq. (8) is not separable in γ and γ . Consequently, the conditionalposterior for these parameters is not a classical law and future development will be more difficult.Moreover, the non-null frequencies ◦ x ∗ are controlled by two parameters γ and γ p ( ◦ x | γ , γ ) = p ( ◦ x | γ ) p ( ◦ x ∗ | γ , γ ) . (9)The proposed approach to manage the degeneracy relies on the addition of a term for the nullfrequency only Λ = diag (1 , , . . . , P = γ Λ † Λ + γ Λ † D Λ D . (10) = diag (cid:18) γ , γ | ◦ d | , . . . , γ | ◦ d N − | (cid:19) . The determinant has a separable expression det[Λ P ] = γ γ N − N − Y n =1 | ◦ d n | , (11) i.e., the precision parameters have been factorized. In addition, each parameter controls a differentset of frequencies: p ( ◦ x | γ , γ ) = p ( ◦ x | γ ) p ( ◦ x ∗ | γ ) , drives the empirical mean level of the image ◦ x and γ drives the smoothness ◦ x ∗ of the image.With the Fourier precision structure of Eq. (10), we have the non-degenerated prior law for theimage that addresses separately all the frequencies with a factorized partition function w.r.t. ( γ , γ ) p ( x | γ , γ ) = (2 π ) − N/ N − Y n =1 | ◦ d n | γ / γ ( N − / exp (cid:20) − γ k ◦ x k − γ k Λ D ∗ ◦ x ∗ k (cid:21) . (12)where Λ D ∗ is obtained from Λ D without the first line and column. The next step is to write the apriori law for the noise in an explicit form and the other parameters, including the law parameters γ and the instrument parameters w . From a methodological standpoint, any statistic can be included for errors (measurement andmodel errors). It is possible to account for correlations in the error process or to account for anon-Gaussian law, e.g.,
Laplacian law, generalized Gaussian law, or other laws based on robustnorm,. . . In the present paper, the noise is modeled as zero-mean white Gaussian vector with un-known precision parameter γ ǫ p ( ǫ | γ ǫ ) = (2 π ) − N/ γ N/ ǫ exp (cid:20) − γ ǫ k ǫ k (cid:21) . (13)Consequently, the likelihood for the parameters given the observed data writes p ( y | x , γ ǫ , w ) = (2 π ) − N/ γ N/ ǫ exp (cid:20) − γ ǫ k y − H w x k (cid:21) . (14)It naturally depends on the image x , on the noise parameter γ ǫ and on the PSF parameters w embedded in H w . It clearly involves a least squares discrepancy that can be rewritten in the Fourierdomain: k y − H w x k = k ◦ y − Λ H ◦ x k . A classical choice for hyperparameter law relies on conjugate prior [32]: the conditional posteriorfor the hyperparameters is in the same family as its prior. It results in practical and algorithmicfacilities: update of the laws amounts to update of a small number of parameters.The three parameters γ , γ and γ ǫ are precision parameters of Gaussian laws Eq. (12) and (14)and a conjugate law for these parameters is the Gamma law (see Appendix B). Given parameters ( α i , β i ) , for i = 0 , 1 or ǫ , the pdf reads p ( γ i ) = 1 β α i i Γ( α i ) γ α i − i exp ( − γ i /β i ) , ∀ γ i ∈ [0 , + ∞ [ (15)In addition to computational efficiency, the law allows for non-informative priors. With specificparameter values, one obtains two improper non-informative prior : the Jeffreys’ law p ( γ ) = 1 /γ p ( γ ) = U [0 , + ∞ [ ( γ ) with ( α i , β i ) set to (0 , + ∞ ) and (1 , + ∞ ) , respectively.Jeffreys’ law is a classical law for the precisions and is considered as non-informative [33]. Thislaw is also invariant to power transformations: the law of γ n [33, 34] is also a Jeffreys’ law. Forthese reasons development is done using the Jeffreys’ law. Regarding the PSF parameters w , we consider that the instrument design process or a physicalstudy provides a nominal value w with uncertainty δ , that is to say w ∈ [ w − δ , w + δ ] . The”Principle of Insufficient Reason” [33] leads to a uniform prior on this interval p ( w ) = U w , δ ( w ) (16)where U w , δ is a uniform pdf on [ w − δ , w + δ ] . Nevertheless, within the proposed framework,the choice is not limited and other laws, such as Gaussian, are possible. Anyway other choices donot allow easier computation because of the non-linear dependency of the observation model w.r.t.PSF parameters.
4. Proper posterior law
At this point, the prior law of each parameter is available: the PSF parameters, the hyperparametersand the image. Thus, the joint law for all the parameters is built by multiplying the likelihoodEq. (14) and the a priori laws Eq. (12), (15) and (16) p ( ◦ x , γ ǫ , γ , γ , w , ◦ y ) = p ( ◦ y | ◦ x , γ ǫ , w ) p ( ◦ x | γ , γ ) p ( γ ǫ ) p ( γ ) p ( γ ) p ( w ) (17)and explicitly p ( ◦ x , γ ǫ , γ , γ , w , ◦ y ) = (2 π ) − N Q N − n =1 | ◦ d n | β α ǫ ǫ Γ( α ǫ ) β α Γ( α ) β α Γ( α ) γ α ǫ + N/ − ǫ γ α − / γ α +( N − / − exp " − γ ǫ β ǫ − γ β − γ β U w , δ ( w )exp (cid:20) − γ ǫ k ◦ y − Λ H ◦ x k − γ k ◦ x k − γ k Λ D ◦ x k (cid:21) . (18)According to the Bayes rule, the a posteriori law reads p ( ◦ x , γ ǫ , γ , γ , w | ◦ y ) = p ( ◦ x , γ ǫ , γ , γ , w , ◦ y ) p ( ◦ y ) (19)where p ( ◦ y ) is a normalization constant p ( ◦ y ) = Z p ( ◦ y , ◦ x , γ , w ) d ◦ x d γ d w . (20)8s described before, setting γ = 0 leads to degenerated prior and joint laws. However, whenthe observation system preserves the null frequency γ can be considered as a nuisance parameter.In addition, only prior information on the smoothness is available.In Bayesian framework, a solution to eliminate the nuisance parameters is to integrate them outin the a posteriori law. According to our parametrization Sec. 3.A, the integration of γ is theintegration of a Gamma law. Application of Appendix B.B on γ in the a posteriori law Eq. (19)provides p ( ◦ x , γ ǫ , γ , w | ◦ y ) = p ( ◦ x ) p ( ◦ y , ◦ x ∗ , γ ǫ , γ , w | ◦ x ) Z p ( ◦ x ) p ( ◦ y , ◦ x ∗ , γ ǫ , γ , w | ◦ x ) d γ ǫ d γ d w d ◦ x ∗ d ◦ x (21)with p ( ◦ x ) = Z p ( ◦ x | γ ) p ( γ ) d γ = β ◦ x − α − / . (22)Now the parameter is integrated, the parameters α and β are set to remove the null frequencypenalization. Since we have α > and β > we get (1 + β ◦ x / − α − / ≤ and the joint lawis majored β ◦ x − α − / p ( ◦ y , ◦ x ∗ , γ ǫ , γ , w | ◦ x ) ≤ p ( ◦ y , ◦ x ∗ , γ ǫ , γ , w | ◦ x ) . (23)Consequently, by the dominated convergence theorem [35], the limit of the law with α → and β → can be placed under the integral sign at the denominator. Then the null-frequencypenalization p ( ◦ x ) from the numerator and denominator are removed. It is equivalent with theintegration of the γ parameter under a Dirac (see appendix B). The equation is simplified and theintegration with respect to ◦ x in the denominator Eq. (20) Z R p ( ◦ y | ◦ x , γ ǫ , w ) p ( ◦ x ∗ | γ ) p ( γ , γ ǫ , w ) d ◦ x ∝ Z R p ( ◦ y | ◦ x , γ ǫ , w ) d ◦ x (24) ∝ Z R exp " − γ ǫ (cid:18) ◦ y − ◦ h ◦ x (cid:19) d ◦ x (25)converges if and only if ◦ h = 0 : the null frequency is observed. If this condition is met, Eq. (21)with β = 0 and α = 1 is a proper posterior law for the image, the precision parameters and thePSF parameters. In other words, if the average is observed, the degeneracy of the a priori law isnot transmitted to the a posteriori law. 9hen, the obtained a posteriori law writes p ( ◦ x , γ ǫ , γ , w | ◦ y ) = p ( ◦ x , γ ǫ , γ , w , ◦ y ) p ( ◦ y ) ∝ γ α ǫ + N/ − ǫ γ α +( N − / − U w , δ ( w )exp (cid:20) − γ ǫ k ◦ y − Λ H ◦ x k − γ k Λ D ∗ ◦ x ∗ k (cid:21) exp " − γ ǫ β ǫ − γ β . (26)Finally, inference is done on this law Eq. (26). If the null frequency is not observed, or informationmust be added, the previous Eq. (19) can be used.
5. Posterior mean estimator and law exploration
This section presents the algorithm to explore the posterior law Eq. (19) or (26) and to compute anestimate of the parameters. For this purpose, Monte Carlo Markov chain is used to provide samples.Firstly, the obtained samples are used to compute different moments of the law. Afterwards, theyare also used to approximate marginal laws as histograms. These two representations are helpfulto analyse the a posteriori law, the structure of the available information and the uncertainty. Theyare used in Sec. 6.C.2 to illustrate the mark of the ambiguity in the myopic problem.Here, the samples of the a posteriori law are obtained by a Gibbs sampler [25, 36, 37]: it con-sists in iteratively sampling the conditional posterior law for a set of parameters given the otherparameters (obtained at previous iteration). Typically, the sampled laws are the law of ◦ x , γ i and w . After a burn-in time, the complete set of samples are under the joint a posteriori law. The threenext sections present each sampling step. The conditional posterior law of the image is a Gaussian law ◦ x ( k +1) ∼ p (cid:16) ◦ x | ◦ y , γ ( k ) ǫ , γ ( k )0 , γ ( k )1 , w ( k ) (cid:17) (27) ∼ N (cid:16) µ ( k +1) , Σ ( k +1) (cid:17) . (28)The covariance matrix is diagonal and writes Σ ( k +1) = (cid:16) γ ( k ) ǫ | Λ ( k ) H | + γ ( k )0 | Λ | + γ ( k )1 | Λ D | (cid:17) − (29)and the mean µ ( k +1) = γ ( k ) ǫ Σ ( k +1) Λ † H ( k ) ◦ y . (30)where † is the transpose conjugate symbol. The vector µ ( k +1) is the regularized least square solu-tion at the current iteration (or the Wiener-Hunt filter). Clearly, if the null-frequency is not observed ◦ h = 0 and if γ = 0 , the covariance matrix Σ is not invertible and the estimate is not defined asdescribed Sec. 4. 10inally, since the matrix is diagonal, the sample ◦ x ( k +1) is obtained by a term-wise product of F ǫ (where ǫ is white Gaussian) with the standard deviation matrix (cid:16) Σ ( k +1) (cid:17) / followed by theaddition of the mean µ ( k +1) also computed with term-wise products Eq. (30). Consequently, thesampling of the image is effective even with high-dimensional object. The conditional posterior laws of the precisions are Gamma corresponding to their prior law withparameters updated by the likelihood γ ( k +1) i ∼ p (cid:18) γ i | ◦ y , ◦ x ( k +1) , w ( k ) (cid:19) (31) ∼ G (cid:16) γ i | α ( k +1) i , β ( k +1) i (cid:17) . (32)For γ ǫ , γ and γ the parameters law are, respectively, α ( k +1) ǫ = α ǫ + N/ and β ( k +1) ǫ = (cid:18) β − ǫ + 12 k ◦ y − Λ ( k ) H ◦ x ( k +1) k (cid:19) − , (33) α ( k +1)0 = α + 1 / and β ( k +1)0 = (cid:18) β − + 12 (cid:16) ◦ x ( k +1)0 (cid:17) (cid:19) − , (34) α ( k +1)1 = α + ( N − / and β ( k +1)1 = (cid:18) β − + 12 k Λ D ◦ x ( k +1) k (cid:19) − . (35)In the case of Jeffreys’ prior, the parameters are α ( k +1) ǫ = N/ and β ( k +1) ǫ = 2 / k ◦ y − Λ ( k ) H ◦ x ( k +1) k , (36) α ( k +1)0 = 1 / and β ( k +1)0 = 2 / (cid:16) ◦ x ( k +1)0 (cid:17) , (37) α ( k +1)1 = ( N − / and β ( k +1)1 = 2 / k Λ D ◦ x ( k +1) k . (38) Remark 1 — If the a posteriori law Eq. (26) without γ is considered, there is no need to samplethis parameter (Eq. (34) and (37) are not useful) and γ ( k )0 = 0 in Eq. (29).5.C. Sample PSF parameters The conditional law for PSF parameters writes w ( k +1) ∼ p (cid:18) w | ◦ y , ◦ x ( k +1) , γ ( k +1) ǫ (cid:19) (39) ∝ exp " − γ ( k +1) ǫ k ◦ y − Λ H , w ◦ x ( k +1) k (40)where parameters w are embedded in the PSF Λ H . This law is not standard and intricate: no algo-rithm exists for direct sampling and we use the Metropolis-Hastings (M.-H.) method to bypass thisdifficulty. In M.-H. algorithm, a sample w p is proposed and accepted with a certain probability.This probability depends on the ratio between the likelihood of the proposed value and the likeli-hood of the current value w ( k ) . In practice, in the independent form described in appendix C, withprior law as proposition law, it is divided in several steps.11. P ROPOSITION : Sample a proposition w p ∼ p ( w ) = U [ a b ] ( w ) . (41)2. P ROBABILITY OF ACCEPTATION : Calculate the criterion J (cid:16) w ( k ) , w p (cid:17) = γ ( k +1) ǫ (cid:18) k ◦ y − Λ H , w ( k ) ◦ x ( k +1) k − k ◦ y − Λ H , w p ◦ x ( k +1) k (cid:19) . (42)3. U PDATE : Sample t ∼ U [0 1] and takes w ( k +1) = w p if log t < J w ( k ) otherwise . (43) The sampling of ◦ x , γ and w are repeated iteratively until the law has been sufficiently explored.These samples (cid:20) ◦ x ( k ) , γ ( k ) , w ( k ) (cid:21) follow the global a posteriori law of Eq. (19). By the large num-bers law, the estimate, defined as the posterior mean, is approximated by ˆ x = F † E [ ◦ x ] ≈ F † " K K − X k =0 ◦ x ( k ) . (44)As described by Eq. (44), to obtain an estimate of the image in the spatial space, all the computationare achieved recursively in the Fourier space with a single IFFT at the end. An implementationexample in pseudo code is described Fig. 9.
6. Deconvolution results
This section presents numerical results obtained by the proposed method. In order to completelyevaluate the method, true value of all parameters x , w , γ ǫ but also γ , γ is needed. In order toachieve this, an entirely simulated case is studied: image and noise are simulated under their re-spective prior laws Eq. (12) and (13) with given values of γ , γ and γ ǫ . Thanks to this protocol,all experimental conditions are controlled and the estimation method is entirely evaluated.The method has also been applied in different conditions (lower signal to noise ratio, broaderPSF, different and realistic (non-simulated) images, . . . ) and showed similar behaviour. However,in the case of realistic images, since the true value of the hyperparameters γ and γ is unknown,the evaluation cannot be complete. Concretely, a × image is generated in the Fourier space as the product of a complexwhite Gaussian noise and the a priori standard deviation matrix Σ = ( γ Λ † Λ + γ Λ † D Λ D ) − / ,given by Eq. (10). The chosen matrix Λ D results from the FFT -2 D of the Laplacian operator [0 1 0; 1 − / and the parameter values are γ = 1 and γ = 2 .12hese parameters provide the image shown in Fig. 1(a) : it is an image with smooth featuressimilar to a cloud. Pixels have numerical values between − and , and the profile line 68shows fluctuations around a value of − .The a priori law for the hyperparameters are set to the non-informative Jeffreys’ law by fixingthe ( α i , β i ) to (0 , + ∞ ) , as explained in Sec. 3.C. In addition, the PSF is obtained in the Fourierspace by discretization of a normalized Gaussian shape ◦ h ( ν α , ν β ) = exp − π (cid:18) ν α ( w α cos ϕ + w β sin ϕ )+ ν β ( w α sin ϕ + w β cos ϕ )+ 2 ν α ν β sin ϕ cos ϕ ( w α − w β ) (cid:19)! (45)with frequencies ( ν α , ν β ) ∈ [ − .
5; 0 . . This low-pass filter, illustrated in Fig. 2, is controlled bythree parameters: • two width parameters w α and w β set to 20 and 7, respectively. Their a priori laws are uniform: p ( w α ) = U [19 21] ( w α ) and p ( w β ) = U [6 8] ( w α ) corresponding to an uncertainty of about 5%and 15% around the nominal value (see Sec 3.D). • a rotation parameter ϕ set to π/ . The a priori law is also uniform p ( ϕ ) = U [ π/ π/ ( ϕ ) corresponding to 50% uncertainty.Then, the convolution is computed in the Fourier space and the data are obtained by addinga white Gaussian noise with precision γ ǫ = 0 . . Data are shown Fig. 1(b): they are naturallysmoother than the true image and the small fluctuations are less visible and corrupted by the noise.The empirical mean level of the image is correctly observed (the null frequency coefficient of H w is ◦ h = 1 ) so the parameter γ is considered as a nuisance parameter. Consequently it is integratedout under a Dirac (see Sec. 4). This is equivalent to fix its value to 0 in the algorithm Fig. 9, line 4.Finally, the method is evaluated on two different situations.1. The unsupervised and non-myopic case: the parameters w are known. Consequently, thereis no Metropolis-Hastings step (Sec. 5.C): lines 9 to 16 are ignored in the algorithm of Fig. 9and w is set to its true value. To obtain sufficient law exploration, the algorithm is run untilthe difference between two successive empirical means is less than − . In this case, 921samples are necessary and they are computed in approximately 12 seconds on a processor at2.66 GHz with Matlab,2. The unsupervised and myopic case: all the parameters are estimated. To obtain sufficientlaw exploration, the algorithm is run until the difference between two successive empiricalmeans is less than × − . In this case, 18 715 samples are needed and they are computedin approximately 7 minutes. Remark 2 — The algorithm has also been run for up to 1 000 000 samples, in both cases, without erceptible qualitative changes.6.B. Estimation results The two results for the image are given Figs. 1(c) and 1(d) for the non-myopic and the myopiccases, respectively.The effect of deconvolution is notable on the image, as well as on the shown profile. The ob-ject is correctly positioned, the orders of magnitude are respected and the mean level is correctlyreconstructed. The image is restored, more details are visible and the profiles are closer matchingto the true image than data. More precisely, the pixels 20-25 of the 68-th line in Fig. 1 show therestoration of the original dynamic whereas it is not visible in the data. Between pixels 70 and 110,fluctuations not visible in data are also correctly restored.In order to visualize and study the spectral contents of the images, circular average of empiricalpower spectral density is considered and called “spectrum” hereafter. The subjacent spectral vari-able is a radial frequency f such as f = ν α + ν β . The spectrum of the true object, data and restoredobject are shown Figs. 3(a) and 3(b) in non-myopic and myopic cases, respectively. It is clear thatthe spectrum of the true image is correctly retrieved, in both cases, up to the radial frequency f ≈ . . Above this frequency, noise is clearly dominant and information about the image isalmost lost. In other words, the method produces correct spectral equalization in the properly ob-served frequency band. The result is expected from a Wiener-Hunt method but the achievement isthe joint estimation of hyperparameter and instrument parameters in addition to the correct spectralequalization.Concerning a comparison between non-myopic and myopic cases, there is no visual differences.The spectrum Figs. 3(a) and 3(b) in non-myopic and myopic cases respectively are visually indis-tinguishable. This is also the case when comparing Figs. 1(c) and 1(d) and especially 68-th line.From a more precise quantitative evaluation, a slight difference is observed and detailed below.In order to quantify performances, a normalized euclidean distance e = k x − x ∗ k / k x ∗ k (46)between an image x and the true image x ∗ is considered. It is computed between true image andestimate images as well as between true image and data. Results are reported in Tab. 1 and confirmthat the deconvolution is effective with an error of approximately 6 % in myopic case comparedto 11 % with data. Both non-myopic and myopic deconvolution reduce error by a factor 1.7 withrespect to the observed data.Regarding a comparison between non-myopic and myopic case, the errors are almost the same,with a slightly lower value for the non-myopic case, as expected. This difference is coherent with14he intuition: more information are injected in the non-myopic case through the true PSF parame-ters values. Concerning the other parameters, their estimates are close to the true values and are reported inTab. 2. The γ ǫ estimate is very close to the true value with c γ ǫ = 0 . instead of 0.5 in the twocases. The error for the PSF parameters are 0.35%, 2.7% and 1.9% for w α , w β and ϕ , respectively.The value of γ is underestimated in the two cases with approximately 1.7 instead of 2. All the truevalues fall in the ˆ µ ± σ interval.In order to deepen the numerical study, the paper evaluates the capability of the method toaccurately select the best values for hyperparameters and instrument parameters. To this end,we compute the estimation error Eq. (46) for a set of “exhaustive” values of the parameters [ γ ǫ , γ , w α , w β , ϕ ] . The protocol is the following: 1) choose a new value for a parameter ( γ ǫ forexample) and fix the other parameters to the value provided by our algorithm, 2) compute theWiener-Hunt solution (Sec. 5.A) and 3) compute the error index.Results are reported in Fig. 4. In each case, smooth variation of error is observed when varyinghyperparameters and instrument parameters and an unique optimum is visible. By this way, onecan find the value of the parameters that provide the best Wiener-Hunt solution when the true image x ⋆ is known. It is reported on Tab. 1 and shows almost imperceptible improvement: optimizationof the parameters (based on the true image x ⋆ ) allow negligible improvement (smaller than 0.02 %as reported in Tab. 1).So, the main conclusion is that, the unsupervised and myopic proposed approach is a relevanttool in order to tune parameters: it works (without the knowledge of the true image), as well as anoptimal approach (based on the knowledge of the true image). This section describes the a posteriori law using histograms, means and variances of the parame-ters. The sample histograms, Figs. 5 and 6, provide an approximation of the marginal posterior lawfor each parameter. Tabs. 1 and 2 report the variance for the image and law parameters respectivelyand thus allow to quantify the uncertainty.
The histograms for γ ǫ and γ , Fig. 5, are concentrated around a mean value in both non-myopic andmyopic cases. The variance for γ ǫ is lower than the one for γ and it can be explained as follows.The observed data are directly impacted by noise (present at the system output) whereas they areindirectly impacted by the object (present at the system input). The convolution system damagesthe object and not the noise: as a consequence, the parameter γ ǫ (that drives noise law) is more15eliably estimated than γ (that drives object law).A second observation is the smaller variance for γ in the non-myopic case Fig. 5(c) than in themyopic case Fig. 5(d). It is the consequence of the addition of information in the non-myopic casew.r.t. the myopic one, through the value of the PSF parameters. In the myopic case, the estimatesare founded on the knowledge of an interval for the values of the instrument parameters, whereasin the non-myopic case, the estimates are founded on the true values for the instrument parameters. Fig. 6 gives histograms for the three PSF parameters and their appearances are quite differentfrom the one for hyperparameters. The histograms for w α and w β , Figs. 6(a) and 6(b) are not asconcentrated as the one of Fig. 5 for hyperparameters. Their variances are quite large with regardsto the interval of the prior law. On the contrary, the histogram for the parameter ϕ , Fig. 6(c), hasthe smallest variance. It is analyzed as a consequence of a larger sensitivity of the data w.r.t. theparameter ϕ than w.r.t. the parameters w α and w β . In an equivalent manner, the observed data aremore informative about the parameter ϕ than about the parameters w α and w β . Finally, a correlation between parameters ( γ , w α ) and ( γ , w β ) is visible on their joint histogramsFig. 7. It can be interpreted as a consequence of the ambiguity in the primitive myopic deconvolu-tion problem, in the following manner: the parameters γ and w both participate in the interpreta-tion of the spectral content of data, γ as a scale factor and w as a shape factor. An increase of w α or w β results in a decrease of the cutoff frequency of the observation system. In order to explainthe spectral content of a given data set, the spectrum of the original image must contain more highfrequencies, i.e., a smaller γ . This is also observed on the histogram illustrated Fig. 7(a). Globally, the chains of Figs. 5 and 6, have a Markov feature (correlated) and explore the parameterspace. They have a burn-in period followed by a stationary state. This characteristic has alwaysbeen observed regardless the initialization. For fixed experimental conditions, the stationary stateof multiple runs was always around the same value. Considering different initializations, the onlyvisible change is on the length of the burn-in period.More precisely, the chain of γ ǫ is concentrated in a small interval, the burn-in period is veryshort (less than 10 samples) and its evolution seems independent of the other parameters. Thechain of γ has a larger exploration, the burn-in period is longer (approximately 200 samples) andthe histogram is larger. This is in accordance with the analysis of Section 6.C.1.About the PSF parameters, the behaviour is different for ( w α , w β ) and ϕ . The chain of the twowidth parameters has a very good exploration with quasi-instantaneous burn-in period. Conversely,16he chain of ϕ is more concentrated and its burn-in period is approximately 4 000 samples. This isalso in accordance with previous analysis (Section 6.C.2).Acceptation rates in the Metropolis-Hastings algorithm are reported in Tab. 3: they are quitesmall, especially for the rotation parameter. This is due to the structure of the implemented algo-rithm: an independant Metropolis-Hastings algorithm with the prior law as a proposition law. Themain advantage of this choice is its simplicity but as a counterpart, a high rejection rate is observeddue to a large a priori interval for the angle parameter. A future work will be devoted to the designof more accurate proposition law. Fig. 8 illustrates the proposed method on a more realistic image with heterogeneous spatial struc-tures. The original is the Lena image and the data has been obtained with the same Gaussian PSFand also corruption by white Gaussian noise. The Fig. 8(b) shows that the restored image is closerto the true one than the data. Smaller structures are visible and edges are sharper, for examplearound pixel . The estimated parameters are c γ ǫ = 1 . while the true value is γ ⋆ǫ = 2 . Con-cerning the PSF parameters, the results are c w α = 19 . , c w β = 7 . and b ϕ = 1 . while the truevalues are respectively w ⋆α = 20 , w ⋆β = 7 and ϕ ⋆ = 1 . as in the previous section. Here again, theestimated PSF parameters are close to the true values giving a first assessment of the capability ofthe method in a more realistic context.
7. Conclusion and perspectives
This paper presents a new global and coherent method for myopic and unsupervised deconvolutionof relatively smooth images. It is built within a Bayesian framework and a proper extended aposteriori law for the PSF parameters, the hyperparameters and the image. The estimate, definedas the posterior mean, is computed by means of an MCMC algorithm in less than a few minutes.Numerical assessment testifies that the parameters of the PSF and the parameters of the priorlaws are precisely estimated. In addition, results also demonstrate that the myopic and unsuperviseddeconvolved image is closer to the true image than the data and show true restored high-frequenciesas well as spatial details.The paper focuses on linear invariant model often encountered in astronomy, medical imaging,nondestructive testing and especially in optical problems. Non-invariant linear models can alsobe considered in order to address other applications such as spectrometry [4] or fluorescence mi-croscopy [13]. The loss of invariance property precludes entirely Fourier-based computations butthe methodology remains valid and practicable. In particular, it is possible to draw samples of theimage by means of an optimization algorithm [38].Gaussian law, related to L penalization, is known for possible excessive sharp edges penaliza-tion in the restored object. The use of convex L − L penalization [39–41] or non convex L − L a posteriori law, including the new parameters and a Metropolis-Hastings sampler.
8. Acknowledgment
The authors would like to thank Professor Alain Abergel in IAS laboratory at Universit´e Paris-Sud11, France, for fruitful discussions and constructive suggestions. The authors are also grateful toCornelia Vacar, IMS laboratory, for carefully reading the paper.
A. Law in Fourier space
For a Gaussian vector x ∼ N ( µ , Σ ) , the law for ◦ x = F x (the
FFT of x ) is also Gaussian whosefirst two moments are the following: • The mean is ◦ µ = E [ ◦ x ] = F E [ ◦ x ] = F µ . (47) • The covariance matrix is ◦ Σ = E [( ◦ x − ◦ µ )( ◦ x − ◦ µ ) † ] = F Σ F † . (48)Moreover, if the covariance matrix Σ is circulant it writes ◦ Σ = F Σ F † = Λ Σ . (49) i.e., the covariance matrix ◦ Σ is diagonal. B. The Gamma probability density
B.A. Definition
The Gamma pdf for γ > , with given parameter α > and β > , is written G ( γ | α, β ) = 1 β α Γ( α ) γ α − exp ( − γ/β ) . (50)18ab. 4 gives three limit cases for ( α, β ) . The following properties hold: • The mean is E G [ γ ] = αβ • The variance is V G [ γ ] = αβ • The maximiser is β ( α − if and only if α > B.B. Marginalisation
First consider a N dimensional zero-mean Gaussian vector with a given precision matrix γ Γ with γ > . The pdf reads p ( x | γ ) = (2 π ) − N/ γ N/ det[ Γ ] / exp h − γ x t Γ x / i . (51)So consider the conjugate pdf for γ as a Gamma law with parameter ( α, β ) (see previous An-nex). The joint law for ( x , γ ) is the product of the pdf given by Eq. (50) and Eq. (51): p ( x , γ ) = p ( x | γ ) p ( γ ) . The marginalization of the joint law is known [44]: p ( x ) = Z R + p ( x | γ ) p ( γ ) d γ = β N/ det[ Γ ] / Γ ( α + N/ π ) N/ Γ( α ) β x t Γ x ! − α − N/ (52)which is a N dimensional t -Student law of α degrees of freedom with a β Γ precision matrix.Finally, the conditional law reads: p ( γ | x ) = (2 π ) − N/ det[ Γ ] / β α Γ( α ) γ α + N/ − exp h − γ (cid:16) x t Γ x / /β (cid:17)i . (53)Thanks to conjugacy, it is also a Gamma pdf with parameters ¯ α , ¯ β given by ¯ α = α + N/ and ¯ β − = β − + 2 / ( x t Γ x ) . C. The Metropolis-Hastings algorithm
The Metropolis-Hastings algorithm provides samples of a target law f ( w ) that cannot be directlysampled but can be evaluated, at least up to a multiplicative constant. Using the so called “instru-ment law” q (cid:16) w p | w ( t ) (cid:17) , samples of the target law are obtained by the following iterations.1. Sample a proposition w p ∼ q (cid:16) w p | w ( t ) (cid:17) .2. Compute the probability ρ = min f ( w p ) f ( w ( t ) ) q (cid:16) w ( t ) | w p (cid:17) q ( w p | w ( t ) ) , . (54)3. Take w ( t +1) = w p with ρ probability w ( t ) with − ρ probability . (55)19t convergence, the samples follow the target law f ( w ) [25, 36]. When q (cid:16) w p | w ( t ) (cid:17) = q ( w p ) thealgorithm is named independent Metropolis-Hastings. In addition, if the instrument law is uniform,the acceptance probability gets simpler in ρ = min ( f ( w p ) f ( w ( t ) ) , ) . (56) References
1. J. Idier, ed.,
Bayesian Approach to Inverse Problems (ISTE Ltd and John Wiley & Sons Inc.,London, 2008).2. R. Molina, J. Mateos, and A. K. Katsaggelos, “Blind deconvolution using a variational ap-proach to parameter, image, and blur estimation,” IEEE Trans. Image Processing , 3715–3727 (2006).3. P. Campisi and K. Egiazarian, eds., Blind Image Deconvolution (CRC Press, 2007).4. T. Rodet, F. Orieux, J.-F. Giovannelli, and A. Abergel, “Data inversion for over-resolved spec-tral imaging in astronomy,” IEEE J. of Selec. Topics in Signal Proc. , 802–811 (2008).5. A. Tikhonov and V. Arsenin, Solutions of Ill-Posed Problems (Winston, Washington, DC ,1977).6. S. Twomey, “On the numerical solution of Fredholm integral equations of the first kind by theinversion of the linear system produced by quadrature,” J. Assoc. Comp. Mach. , 97–101(1962).7. A. Jalobeanu, L. Blanc-F´eraud, and J. Zerubia, “Hyperparameter estimation for satellite im-age restoration by a MCMC maximum likelihood method,” Pattern Recognition , 341–352(2002).8. J. A. O’Sullivan, “Roughness penalties on finite domains,” IEEE Trans. Image Processing ,1258–1268 (1995).9. G. Demoment, “Image reconstruction and restoration: Overview of common estimation struc-ture and problems,” IEEE Trans. Acoust. Speech, Signal Processing ASSP -37 , 2024–2036(1989).10. P. Pankajakshani, B. Zhang, L. Blanc-F´eraud, Z. Kam, J.-C. Olivo-Marin, and J. Zerubia,“Blind deconvolution for thin-layered confocal imaging,” Appl. Opt. , 4437–4448 (2009).11. E. Thibaut and J.-M. Conan, “Strict a priori constraints for maximum likelihood blind decon-volution,” J. Opt. Soc. Am. A , 485–492 (1995).12. N. Dobigeon, A. Hero, and J.-Y. Tourneret, “Hierarchical bayesian sparse image reconstructionwith application to MRFM,” IEEE Trans. Image Processing (2009).13. B. Zhang, J. Zerubia, and J.-C. Olivo-Marin, “Gaussian approximations of fluorescence mi-croscope point-spread function models,” Appl. Opt. , 1819–1829 (2007).204. L. Mugnier, T. Fusco, and J.-M. Conan, “MISTRAL: a myopic edge-preserving image restora-tion method, with application to astronomical adaptive-optics-corrected long-exposure im-ages,” J. Opt. Soc. Amer. , 1841–1854 (2004).15. E. Thi´ebaut, “MiRA: an effective imaging algorithm for optical interferometry,” in “proc.SPIE: Astronomical Telescopes and Instrumentation,” , vol. 7013 (2008), vol. 7013, pp.70131–I.16. T. Fusco, J.-P. V. ran, J.-M. Conan, and L. M. Mugnier, “Myopic deconvolution method foradaptive optics images of stellar fields,” Astron. Astrophys. Suppl. Ser. , 193 (1999).17. J.-M. Conan, L. Mugnier, T. Fusco, V. Michau, and R. G., “Myopic deconvolution of adaptiveoptics images by use of object and point-spread function power spectra,” Applied Optics ,4614–4622 (1998).18. A. C. Likas and N. P. Galatsanos, “A variational approach for Bayesian blind image deconvo-lution,” IEEE Trans. Image Processing , 2222–2233 (2004).19. T. Bishop, R. Molina, and J. Hopgood, “Blind restoration of blurred photographs via AR mod-elling and MCMC,” in “Image Processing, 2008. ICIP 2008. 15th IEEE Int. Conference on,”(2008).20. E. Y. Lam and J. W. Goodman, “Iterative statistical approach to blind image deconvolution,”J. Opt. Soc. Am. A , 1177–1184 (2000).21. Z. Xu and E. Y. Lam, “Maximum a posteriori blind image deconvolution with Huber–Markovrandom-field regularization,” Opt. Lett. , 1453–1455 (2009).22. M. Cannon, “Blind deconvolution of spatially invariant image blurs with phase,” IEEE Trans.Acoust. Speech, Signal Processing , 58–63 (1976).23. A. Jalobeanu, L. Blanc-Feraud, and J. Zerubia, “Estimation of blur and noise parameters inremote sensing,” in “Proc. IEEE ICASSP,” , vol. 4 (2002), vol. 4, pp. 3580–3583.24. F. Chen and J. Ma, “An empirical identification method of Gaussian blur parameter for imagedeblurring,” Signal Processing, IEEE Trans. on (2009).25. C. P. Robert and G. Casella, Monte-Carlo Statistical Methods , Springer Texts in Statistics(Springer, New York, NY , 2000).26. B. R. Hunt, “A matrix theory proof of the discrete convolution theorem,” IEEE Trans. Automat.Contr. AC-19 , 285–288 (1971).27. M. Calder and R. A. Davis, “Introduction to Whittle (1953) ‘The analysis of multiple stationarytime series’,” Breakthroughs in Statistics , 141–148 (1997).28. P. J. Brockwell and R. A. Davis, Time Series: Theory and Methods (Springer-Verlag, NewYork, 1991).29. B. R. Hunt, “Deconvolution of linear systems by constrained regression and its relationship tothe Wiener theory,” IEEE Trans. Automat. Contr.
AC-17 , 703–705 (1972).30. K. Mardia, J. Kent, and J. Bibby,
Multivariate Analysis (San Diego : Academic Press, 1992),21hap. 2, pp. 36–43.31. C. A. Bouman and K. D. Sauer, “A generalized Gaussian image model for edge-preserving
MAP estimation,” IEEE Trans. Image Processing , 296–310 (1993).32. D. MacKay, Information Theory, Inference, and Learning Algorithms (Cambridge UniversityPress, 2003).33. R. E. Kass and L. Wasserman, “The selection of prior distributions by formal rules,” J. Amer.Statist. Assoc. , 1343–1370 (1996).34. E. T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, 2003).35. S. Lang,
Real and functional analysis (Springer, 1993).36. P. Br´emaud,
Markov Chains. Gibbs fields, Monte Carlo Simulation, and Queues , Texts inApplied Mathematics 31 (Spinger, New York, NY , 1999).37. S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian restora-tion of images,” IEEE Trans. Pattern Anal. Mach. Intell. , 721–741 (1984).38. F. Orieux, O. F´eron, and J.-F. Giovannelli, “Stochastic sampling of large dimension non-stationnary gaussian field for image restoration,” Submitted to ICIP2010.39. H. R. K¨unsch, “Robust priors for smoothing and image restoration,” Ann. Inst. Stat. Math. ,1–19 (1994).40. P. Charbonnier, L. Blanc-F´eraud, G. Aubert, and M. Barlaud, “Deterministic edge-preservingregularization in computed imaging,” IEEE Trans. Image Processing , 298–311 (1997).41. J.-F. Giovannelli, “Unsupervised Bayesian convex deconvolution based on a field with an ex-plicit partition function,” IEEE Trans. Image Processing , 16–26 (2008).42. D. Geman and C. Yang, “Nonlinear image recovery with half-quadratic regularization,” IEEETrans. Image Processing , 932–946 (1995).43. X. Descombes, R. Morris, J. Zerubia, and M. Berthod, “Estimation of Markov random fieldprior parameters using Markov chain Monte Carlo maximum likelihood,” IEEE Trans. ImageProcessing , 954–963 (1999).44. G. E. P. Box and G. C. Tiao, Bayesian inference in statistical analysis (Addison-Wesley pub-lishing, 1972). 22able 1. Error e (Eq. (46)) and averaged standard deviation ˆ σ of the posterior imagelaw. The “Best” error has been obtained with the knowledge of the true image.Data Non-myopic Myopic BestError ( e ) 11.092 % 6.241 % 6.253 % 6.235 % ˆ σ of x law - 3.16 3.25 -23able 2. Quantitative evaluation: true and estimated values of hyperparameters andPSF parameters. b γ ǫ ± ˆ σ b γ ± ˆ σ b w α ± ˆ σ b w β ± ˆ σ b ϕ ± ˆ σ True value 0.5 2 20 7 1.05 ( π/ ) Non-myopic
Estimate 0.49 ± . ± . - - -Error 2.0 % 11 % - - - Myopic
Estimate 0.49 ± . ± . ± . ± . ± . Error 2.0 % 18 % 0.35 % 2.7 % 1.9 %24able 3. Acceptation rate.Parameter w α w β ϕ Acceptation rate 14.50 % 9.44 % 2.14 %25able 4. Specific laws obtained as limit of the Gamma pdf. α β
Jeffreys 0 + ∞ Uniform 1 + ∞ Dirac - 026
50 100−120−100−80−60−40−200 (a) True image
TrueData (b) Data
TrueEstimate (c) Non-myopic
TrueEstimate (d) Myopic
Fig. 1. The figure 1(a) represents a × sample of the a priori law for theobject with γ = 1 and γ = 2 . Fig. 1(b) is the data computed with the PSF shownin Fig. 2. Figs. 1(c) and 1(d) are the estimates with non-myopic and the myopicestimate, respectively. Profiles correspond to the 68-th line.27 Fig. 2. PSF with w α = 20 , w β = 7 and ϕ = π/ . The x-axis and y-axis are reducedfrequency. −10 −5 TrueConvolued imageDataEstimate (a) Non-Myopic −10 −5 TrueConvolued imageDataEstimate (b) Myopic
Fig. 3. Circular average of the empirical power spectral density of the image, theconvolued image, the data (convolued image corrupted by noise) and the estimates,in radial frequency with y-axis in logarithmic scale. The x-axis is the radial fre-quency. 28 −2 −1 (a) γ ǫ −2 −1 (b) γ
19 19.5 20 20.5 210.06250.06260.06270.0628 (c) w α (d) w β (e) ϕ Fig. 4. Computation of the best parameters in the sense e Eq. (46). The symbol’ × ’ is the minimum and the symbol ’.’ is the estimated value by our approach. They-axis of γ ǫ and γ are in logarithmic scale.29 (a) γ ǫ for non-myopic case (b) γ ǫ for myopic case (c) γ for non-myopic case (d) γ for myopic case Fig. 5. Histograms and chains for the non-myopic case in Figs. 5(a)-5(c) and themyopic case in Figs. 5(b)-5(d) for γ ǫ and γ , respectively. The symbol × localizesthe initial value and the dashed line corresponds to the true value. The x-axis areiteration’s index for the chains and parameter value for the histograms.30 (a) w α (b) w β (c) ϕ Fig. 6. Histogram and chain for the PSF parameters w α in Fig. 6(a), w β in Fig. 6(b)and ϕ in Fig. 6(c). The symbol × localizes the initial value and the dashed linecorresponds to the true value. The x-axis for the histograms and the y-axis of thechain are limits of a priori law. 31 w α (a) ( γ , w α ) γ w β (b) ( γ , w β ) Fig. 7. Joint histograms for the couple ( γ , w α ) and ( γ , w β ) in Figs. 7(a) and 7(b)respectively. The x-axis and y-axis are the parameter value. (a) Data (b) Estimated image Fig. 8. Observed image Fig. 8(a) and restored image Fig. 8(b). Profiles correspondto the 68-th line. The solid line is the true profile. Dashed line correspond to data inFig. 8(a) and estimated profiles in Fig. 8(b).32 : Initialisation of (cid:20) ◦ x (0) , γ (0) , w (0) , k = 0 (cid:21) repeat % Sample of ◦ x Σ ← γ ( k ) ǫ | Λ H | + γ ( k )0 | Λ | + γ ( k )1 | Λ D | µ ← γ ( k ) ǫ Σ − Λ ∗ H ◦ y ◦ x ( k ) ← µ + Σ − / . ∗ randn % Sample of γ γ ( k ) ǫ ← gamrnd ( α ǫ , β ǫ ) γ ( k )1 ← gamrnd ( α , β ) γ ( k )0 ← gamrnd ( α , β ) % Sample of w w p ← rand ∗ ( a − b ) + a J ← γ ǫ (cid:16) k ◦ y − Λ H ◦ x k − k ◦ y − Λ H , w p ◦ x k (cid:17) / if log( rand ) < min { J, } then w ( k ) ← w p Λ H ← Λ H , w p else w ( k ) ← w ( k − end if % Empirical mean k ← k + 1 ◦ ¯ x ( k ) ← P i ◦ x ( i ) /k until | ¯ x ( k ) − ¯ x ( k − | / | ¯ x ( k ) | ≤ criterionFig. 9. Pseudo-code algorithm. gamrnd , rand and randnrandn