Nonlocal TV-Gaussian (NLTG) prior for Bayesian inverse problems with applications to limited CT reconstruction

Didi Lv, Qingping Zhou, Jae Kyu Choi, Jinglai Li, Xiaoqun Zhang

School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
School of Mathematical Sciences, Tongji University, Shanghai 200082, People's Republic of China
Department of Mathematical Sciences, University of Liverpool, Liverpool L69 6ZL, United Kingdom
MOE Key Laboratory of Scientific and Engineering Computing, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China

E-mail: [email protected], [email protected]
Abstract.
Bayesian inference methods have been widely applied in inverse problems, largely due to their ability to characterize the uncertainty associated with the estimation results. In the Bayesian framework the prior distribution of the unknown plays an essential role, and a good prior distribution can significantly improve the inference results. In this paper, we extend the total variation-Gaussian (TG) prior in [41], and propose a hybrid prior distribution which combines the nonlocal total variation (NLTV) regularization and the Gaussian distribution. The advantage of the new prior is two-fold. First, through the NLTV term the proposed prior models both texture and geometric structures present in images. Second, the Gaussian reference measure provides the flexibility of incorporating structural information from a reference image. Some theoretical properties are established for the NLTG prior. The proposed prior is applied to the limited-angle tomography reconstruction problem, which suffers from severely missing data. We compute both MAP and CM estimates through two efficient methods, and the numerical experiments validate the advantages and feasibility of the proposed NLTG prior.
1. Introduction
Bayesian inference methods [12, 17] have become a popular tool to solve inverse problems. Such popularity is largely due to their ability to quantify the solution uncertainties. A typical Bayesian treatment consists of assigning a prior distribution to the unknown parameters and then updating the distribution based on the observed data, yielding the posterior distribution. Recently, considerable attention has been paid to infinite-dimensional Bayesian inverse problems, where the unknowns are functions of space or time, for example, images. In particular, a rigorous function-space Bayesian inference framework for inverse problems was developed in [35]. Since most practical inverse problems are highly ill-posed, the performance of Bayesian inference depends critically on the choice of the prior distribution, and prior modeling plays an essential role in the Bayesian inference method. For infinite-dimensional problems, Gaussian measures are arguably the most popular choice of prior distributions, as they have many theoretical and computational advantages in the infinite-dimensional setting [35].

However, in many practical problems, such as medical image reconstruction, the functions or images that one wants to recover are often subject to sharp jumps or discontinuities. Gaussian prior distributions are typically not suitable for modeling such functions [42]. To this end, several non-Gaussian priors have been proposed to model such images, e.g., [37]. Since these prior distributions differ significantly from Gaussians, many sampling schemes based on the Gaussian prior cannot be used directly. To address this issue, a hybrid prior was proposed in [41]. The hybrid prior is motivated by the total variation (TV) regularization [31] in the deterministic setting; however, it has been proven in [20] that the TV-based prior does not converge to a well-defined infinite-dimensional measure as the discretization dimension increases.
The hybrid prior is a combination of the TV term and the Gaussian distribution: it uses a TV term to capture the sharp jumps in the functions, and a Gaussian distribution as a reference measure to ensure that the resulting prior converges to a well-defined probability measure on the function space in the infinite-dimensional limit.

Nonlocal methods are another type of popular regularization method for imaging inverse problems. They were originally proposed for natural image processing to restore repetitive patterns and textures: a heuristic copy-paste technique was first proposed for texture synthesis in [9], a more systematic nonlocal means filter was proposed in [2], and a nonlocal variational framework was established in [14]. The main idea of nonlocal methods is to utilize the similarities present in an image as a weight for restoration, smoothing or regularization. As an extension of TV, nonlocal TV (NLTV) regularization is among the popular variational regularization tools due to its flexibility in recovering both texture and geometry patterns for diverse imaging inverse problems; see [43, 44, 28, 22] for applications. It has been demonstrated that in many practical problems the NLTV method performs better than the standard TV, especially for recovering textures and structures of images. More definitions and details are presented in Section 2.2.

Inspired by the success of nonlocal regularization, we propose to improve the hybrid TG prior by replacing the TV term with an NLTV term. The resulting prior models both texture and geometric structures, and thus handles a larger class of images than the space $H^1$ considered in the TG prior; finally, the Gaussian measure provides some freedom to incorporate other prior information through the covariance matrix, such as structures from a reference image. To demonstrate the effectiveness of the hybrid NLTV-Gaussian (NLTG) prior, we apply it to the limited tomography problem, where only limited projection data are available.
In particular, we consider the two common types of point estimation in the Bayesian framework with the NLTG prior: the maximum a posteriori (MAP) estimate and the conditional mean (CM) estimate. The MAP estimate consists of solving an optimization problem, while computing the CM involves evaluating a high-dimensional integration problem [17, 24].

The remainder of this paper is structured as follows. Section 2 describes the construction of the proposed NLTG prior on a separable Hilbert space and presents the related theoretical properties. In Section 3, we give a brief introduction to limited tomography and solve the reconstruction problem by applying the proposed NLTG prior. The numerical results via both MAP and CM estimates are shown in Section 4. Finally, we draw our conclusions in the last section.
2. The NLTG priors
We first give a brief introduction to the basic setup of Bayesian inference methods for inverse problems. We consider a forward model of the following form:
$$y = F(u) + n, \qquad (1)$$
where $u$ is an unknown function (in this work we restrict ourselves to the situation where $u$ is a real-valued function defined on a domain in $\mathbb{R}^2$, i.e., an image), $y \in \mathbb{R}^m$ is the measured data, and $n$ is an $m$-dimensional zero-mean Gaussian noise with covariance matrix $\Sigma$. Our goal is to estimate the unknown function $u$ from the measured data $y$.

First we assume that the unknown $u$ lives in a Hilbert space of functions, say $X$. We then choose a probability measure defined on $X$, denoted by $\mu_{pr}$, to be the prior measure of $u$. The posterior measure of $u$, denoted $\mu^y$, is represented as the Radon-Nikodym (R-N) derivative with respect to $\mu_{pr}$:
$$\frac{d\mu^y}{d\mu_{pr}} \propto \exp(-\Phi(u)), \qquad (2)$$
where
$$\Phi(u) = \tfrac12\|F(u)-y\|_\Sigma^2 = \tfrac12\big\langle \Sigma^{-1/2}(F(u)-y),\, \Sigma^{-1/2}(F(u)-y)\big\rangle$$
is the data fidelity term of the deterministic inverse problem. In the Bayesian framework, the posterior distribution depends crucially on the prior. Let $\mu_{pr} = \mu_0 = N(0, C)$ be a Gaussian measure defined on $X$ with zero mean and covariance operator $C$. Note that $C$ is a symmetric positive operator of trace class [21]. To better model functions with sharp jumps, Ref. [41] proposes the hybrid TG prior in the form
$$\frac{d\mu_{pr}}{d\mu_0} \propto \exp(-R(u)), \qquad (3)$$
where $R(u)$ represents additional prior information (or regularization) on $u$. In this case, one can write the R-N derivative of $\mu^y$ with respect to $\mu_0$:
$$\frac{d\mu^y}{d\mu_0} \propto \exp(-\Phi(u) - R(u)), \qquad (4)$$
which in turn returns to the conventional formulation with Gaussian priors. Specifically, [41, Section 2.3] chooses the state space $X$ to be the Sobolev space $H^1$ and $R(u) = \lambda\|u\|_{TV} = \lambda\int_\Omega \|\nabla u\|\, dx$.

Here we provide the formulation of the NLTG prior, and we start with a brief introduction to nonlocal regularization.
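Before turning to the nonlocal construction, the abstract setup (1)-(4) can be made concrete with a small discretized sketch. The linear operator, covariances and regularizer below are hypothetical stand-ins chosen for illustration only, not the operators used later in the paper.

```python
import numpy as np

# Minimal discretization of the Bayesian setup (1)-(4).
# A, Sigma, C and R are hypothetical stand-ins for illustration only.
rng = np.random.default_rng(0)
n, m = 16, 8
A = rng.standard_normal((m, n))      # linear forward operator, F(u) = A u
Sigma = 0.01 * np.eye(m)             # noise covariance
C = np.eye(n)                        # covariance of the Gaussian reference measure
u_true = rng.standard_normal(n)
y = A @ u_true + rng.multivariate_normal(np.zeros(m), Sigma)

Sigma_inv = np.linalg.inv(Sigma)
C_inv = np.linalg.inv(C)

def data_misfit(u):
    """Phi(u) = (1/2) ||F(u) - y||_Sigma^2, the data-fidelity term in (2)."""
    r = A @ u - y
    return 0.5 * r @ Sigma_inv @ r

def neg_log_posterior(u, R=lambda u: 0.0):
    """Unnormalized negative log-posterior Phi(u) + R(u) + (1/2)||u||_C^2,
    the finite-dimensional counterpart of (4) taken w.r.t. mu_0."""
    return data_misfit(u) + R(u) + 0.5 * u @ C_inv @ u
```

In this discrete picture, different choices of `R` (TV, NLTV, ...) correspond to different hybrid priors of the form (3).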
For details, one may consult [14] for the variational framework based on nonlocal operators, [13, 23, 43, 44] for a short survey of the theory and applications of NLTV, and [2, 5, 10, 18, 27, 45] for further related surveys. The nonlocal methods can be described as follows. Let $\Omega \subseteq \mathbb{R}^2$ be a bounded set, and $x \in \Omega$. Given a reference image $f: \Omega \to \mathbb{R}$, we define a nonnegative symmetric weight function $\omega: \Omega\times\Omega \to \mathbb{R}$ as follows:
$$\omega(x,y) = \exp\left(-\frac{\big\langle G_a,\, |f(x+\cdot)-f(y+\cdot)|^2\big\rangle}{h^2}\right), \qquad (5)$$
where $G_a$ is a Gaussian kernel with standard deviation $a$, $h$ is a filtering parameter, and $\langle\cdot,\cdot\rangle$ is the standard inner product in $L^2(\Omega)$. Note that in general $h$ corresponds to the noise level; conventionally we set it to be the standard deviation of the noise [23].

Let $u: \Omega\to\mathbb{R}$. Using the weight function $\omega$ in (5), we define the nonlocal (NL) gradient $\nabla_\omega u: \Omega\times\Omega\to\mathbb{R}$ as
$$\nabla_\omega u(x,y) = \big(u(y)-u(x)\big)\sqrt{\omega(x,y)}. \qquad (6)$$
For a given $p: \Omega\times\Omega\to\mathbb{R}$, the NL divergence is defined through the standard adjoint relation with the NL gradient operator, $\langle \nabla_\omega u, p\rangle = -\langle u, \operatorname{div}_\omega p\rangle$, which leads to the explicit formula
$$\operatorname{div}_\omega p(x) = \int_\Omega \big(p(x,y)-p(y,x)\big)\sqrt{\omega(x,y)}\, dy. \qquad (7)$$
The NLTV functional is then defined as
$$J_{NLTV}(u) = \int_\Omega \|\nabla_\omega u(x,\cdot)\|_{L^2(\Omega)}\, dx = \int_\Omega \left(\int_\Omega |u(x)-u(y)|^2\, \omega(x,y)\, dy\right)^{1/2} dx. \qquad (8)$$
We can see that the functional in (8) is analogous to the total variation (TV) seminorm, and the NLTG prior can be constructed similarly. To do this, we first need to specify the state space $X$, which should desirably be a separable Hilbert space. Recall that $\Omega$ is a bounded set in $\mathbb{R}^2$. By convention, we choose $\Omega = [0,1]^2$ throughout this paper. (Note, however, that our analysis is valid for any bounded domain with $C^1$ boundary.)
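On a discrete image (here flattened to a vector of $n$ pixels), the operators (5)-(7) and the adjoint relation can be sketched as follows. For brevity, the Gaussian-windowed patch distance in (5) is replaced by a plain squared pixel difference; this simplification is an assumption of the sketch, not the paper's definition.

```python
import numpy as np

# Discrete sketch of the nonlocal operators (5)-(7).
rng = np.random.default_rng(1)
n, h = 32, 0.5
f = rng.standard_normal(n)                     # reference image (flattened)

# weight (5), simplified patch distance: symmetric, values in (0, 1]
W = np.exp(-np.subtract.outer(f, f) ** 2 / h ** 2)
sqrtW = np.sqrt(W)

def nl_grad(u):
    """(6): (grad_w u)(x, y) = (u(y) - u(x)) * sqrt(w(x, y))."""
    return (u[None, :] - u[:, None]) * sqrtW

def nl_div(p):
    """(7): (div_w p)(x) = sum_y (p(x, y) - p(y, x)) * sqrt(w(x, y))."""
    return ((p - p.T) * sqrtW).sum(axis=1)

# check the adjoint relation: <grad_w u, p> = -<u, div_w p>
u = rng.standard_normal(n)
p = rng.standard_normal((n, n))
lhs = (nl_grad(u) * p).sum()
rhs = -(u * nl_div(p)).sum()
```

In practice the dense weight matrix is sparsified, e.g. keeping only the largest weights per pixel, as done in the experiments of Section 4.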
For a given reference image $f: \Omega\to\mathbb{R}$ and a weight function $\omega: \Omega\times\Omega\to\mathbb{R}$ defined as in (5), we introduce
$$X = \{u \in L^2(\Omega): \nabla_\omega u \in L^2(\Omega\times\Omega)\}, \qquad (9)$$
where the NL gradient $\nabla_\omega u$ is defined as in (6). Obviously, we have $X \subseteq L^2(\Omega)$. In addition, we present the following lemma, which in turn shows that the NLTV functional defined in (8) is well defined on $L^2(\Omega)$.

Lemma 1
Let $f \in L^2(\Omega)$ be a given reference image, and let $\omega: \Omega\times\Omega\to\mathbb{R}$ be a weight function defined as in (5). For $u \in L^2(\Omega)$, we have
$$\|\nabla_\omega u\|_{L^2(\Omega\times\Omega)} \le 2\|u\|_{L^2(\Omega)} \qquad (10)$$
and
$$J_{NLTV}(u) \le 2\|u\|_{L^2(\Omega)}. \qquad (11)$$
As a consequence, $\nabla_\omega: L^2(\Omega) \to L^2(\Omega\times\Omega)$ is a continuous linear operator.

Proof.
First we note that, from the definition (5) of $\omega$, we have $\omega(x,y) \in (0,1]$ for all $x,y \in \Omega$. Hence, using Hölder's inequality (e.g. [11]), we have
$$\begin{aligned}
\|\nabla_\omega u\|_{L^2(\Omega\times\Omega)}^2 &= \int_\Omega\int_\Omega |\nabla_\omega u(x,y)|^2\, dy\, dx = \int_\Omega\int_\Omega |u(x)-u(y)|^2\, \omega(x,y)\, dy\, dx \\
&\le \int_\Omega\int_\Omega |u(x)-u(y)|^2\, dy\, dx \\
&\le \int_\Omega\int_\Omega |u(y)|^2\, dy\, dx + \int_\Omega\int_\Omega |u(x)|^2\, dy\, dx + 2\int_\Omega\int_\Omega |u(x)u(y)|\, dy\, dx \\
&\le \|u\|_{L^2(\Omega)}^2 + \|u\|_{L^2(\Omega)}^2 + 2\|u\|_{L^2(\Omega)}^2 = 4\|u\|_{L^2(\Omega)}^2,
\end{aligned}$$
where we used the fact that $|\Omega| = 1$. This concludes (10).

For (11), we first note that the mapping
$$x \mapsto \|\nabla_\omega u(x,\cdot)\|_{L^2(\Omega)} \qquad (12)$$
lies in $L^2(\Omega, dx)$ by (10), where we specified the Lebesgue measure $dx$ for clarity. Then, applying Hölder's inequality again, we have
$$J_{NLTV}(u) \le \left(\int_\Omega \|\nabla_\omega u(x,\cdot)\|^2_{L^2(\Omega)}\, dx\right)^{1/2} = \|\nabla_\omega u\|_{L^2(\Omega\times\Omega)} \le 2\|u\|_{L^2(\Omega)},$$
and this concludes Lemma 1. $\square$

Combining the definition (9) and Lemma 1, we can see that $X = L^2(\Omega)$. Now by taking
$$R(u) = \lambda J_{NLTV}(u) \qquad (13)$$
in (4), we obtain the NLTG prior distribution.

In this section, we show that the NLTG prior leads to a well-behaved posterior distribution on $X$, where the proofs follow similar lines as [41]. Following [35], we assume that the forward operator $F: L^2(\Omega)\to\mathbb{R}^m$ satisfies the following conditions:

A1. For every $\epsilon > 0$, there exists $M = M(\epsilon) > 0$ such that for all $u \in L^2(\Omega)$,
$$\|F(u)\|_{\Sigma^{-1}} \le \exp\big(\epsilon\|u\|^2_{L^2(\Omega)} + M\big).$$

A2. For every $\delta > 0$, there exists $L = L(\delta) > 0$ such that for all $u_1, u_2 \in L^2(\Omega)$ with $\max\{\|u_1\|_{L^2(\Omega)}, \|u_2\|_{L^2(\Omega)}\} \le \delta$, we have
$$\|F(u_1)-F(u_2)\|_{\Sigma^{-1}} \le L\|u_1-u_2\|_{L^2(\Omega)}.$$

We have the following proposition on $\mu_{pr}$ in (3).

Proposition 1
Let $R: L^2(\Omega)\to\mathbb{R}$ be defined as in (13). Then we have the following:
(i) For all $u \in L^2(\Omega)$, we have $R(u) \ge 0$.
(ii) For every $\delta > 0$, there exists $M = M(\delta) > 0$ such that for all $u \in L^2(\Omega)$ with $\|u\|_{L^2(\Omega)} \le \delta$, we have $R(u) \le M$.
(iii) For every $\delta > 0$, there exists $L = L(\delta) > 0$ such that for all $u_1, u_2 \in L^2(\Omega)$ with $\max\{\|u_1\|_{L^2(\Omega)}, \|u_2\|_{L^2(\Omega)}\} \le \delta$, we have $|R(u_1)-R(u_2)| \le L\|u_1-u_2\|_{L^2(\Omega)}$.

Proof.
Since (i) is trivial, we focus on (ii) and (iii). For (ii), note that (11) of Lemma 1 gives $J_{NLTV}(u) \le 2\|u\|_{L^2(\Omega)}$. For a given $\delta > 0$, we choose $M = 2\lambda\delta$. Then whenever $\|u\|_{L^2(\Omega)} \le \delta$, we have
$$R(u) = \lambda J_{NLTV}(u) \le 2\lambda\|u\|_{L^2(\Omega)} \le M.$$
For (iii), given $u_1, u_2 \in L^2(\Omega)$, we have
$$\begin{aligned}
|J_{NLTV}(u_1)-J_{NLTV}(u_2)| &= \left|\int_\Omega \|\nabla_\omega u_1(x,\cdot)\|_{L^2(\Omega)}\, dx - \int_\Omega \|\nabla_\omega u_2(x,\cdot)\|_{L^2(\Omega)}\, dx\right| \\
&\le \int_\Omega \|\nabla_\omega (u_1-u_2)(x,\cdot)\|_{L^2(\Omega)}\, dx = J_{NLTV}(u_1-u_2) \le 2\|u_1-u_2\|_{L^2(\Omega)},
\end{aligned}$$
where the last inequality follows from (11) of Lemma 1. Then by letting $L = 2\lambda$, we conclude (iii). This completes the proof. $\square$

Theorem 2 states that $\mu^y$ in (4) is a well-defined probability measure on $L^2(\Omega)$ and that it is Lipschitz continuous in the data $y$. Since the theorem is a direct consequence of the fact that $\Phi + R$ satisfies [36, Assumptions 2.6], we omit the proof.

Theorem 2
Assume that $F: L^2(\Omega)\to\mathbb{R}^m$ satisfies A1-A2. Let $R: L^2(\Omega)\to\mathbb{R}$ be defined as in (13). For a given $y \in \mathbb{R}^m$, we define $\mu^y$ as in (4). Then we have the following:
(i) $\mu^y$ is a well-defined measure on $L^2(\Omega)$.
(ii) $\mu^y$ is Lipschitz continuous in the data $y$ with respect to the Hellinger distance. More precisely, if $\mu^y$ and $\mu^{y'}$ are two measures corresponding to data $y$ and $y'$ respectively, then for every $\delta > 0$, there exists $Z = Z(\delta) > 0$ such that for all $y, y' \in \mathbb{R}^m$ with $\max\{\|y\|, \|y'\|\} \le \delta$, we have
$$d_{Hell}(\mu^y, \mu^{y'}) \le Z\|y-y'\|_{\Sigma^{-1}}.$$
As a result, the expectation of any polynomially bounded function $g: L^2(\Omega)\to\mathbb{K}$ is continuous in $y$.

For practical purposes, it is important to consider finite-dimensional approximations of $\mu^y$. In particular, we consider the following finite-dimensional approximation:
$$\frac{d\mu^y_{N_1,N_2}}{d\mu_0} \propto \exp\big(-\Phi^{N_1}(u) - R^{N_2}(u)\big), \qquad (14)$$
where $\Phi^{N_1}(u)$ is the $N_1$-dimensional approximation of $\Phi(u)$, with $F^{N_1}$ being the $N_1$-dimensional approximation of $F$, and $R^{N_2}(u)$ is the $N_2$-dimensional approximation of $R(u)$. Theorem 3 provides the convergence property of $\mu^y_{N_1,N_2}$.

Theorem 3
Assume that $F$ and $F^{N_1}$ satisfy A1 with constants independent of $N_1$, and that $R^{N_2}$ satisfies Proposition 1 (i)-(ii) with constants independent of $N_2$. Assume further that for all $\epsilon > 0$, there exist two sequences $\{a_{N_1}(\epsilon)\}$ and $\{b_{N_2}(\epsilon)\}$, both of which converge to $0$, such that $\mu_0(X^\epsilon) \ge 1-\epsilon$ for all $N_1, N_2 \in \mathbb{N}$, where
$$X^\epsilon = \big\{u \in L^2(\Omega): |\Phi(u)-\Phi^{N_1}(u)| \le a_{N_1}(\epsilon),\ |R(u)-R^{N_2}(u)| \le b_{N_2}(\epsilon)\big\}; \qquad (15)$$
then we have
$$d_{Hell}(\mu^y, \mu^y_{N_1,N_2}) \to 0 \qquad (16)$$
as $N_1, N_2 \to \infty$.

Since $L^2(\Omega)$ is a separable Hilbert space, we can also consider the finite-dimensional approximation of $u$, as presented in the following corollary.

Corollary 4
Let $\{e_k: k\in\mathbb{N}\}$ be a complete orthonormal basis of $L^2(\Omega)$. For $N \in \mathbb{N}$, we define
$$u^N = \sum_{k=1}^N \langle u, e_k\rangle e_k \qquad (17)$$
and
$$\frac{d\mu^y_N}{d\mu_0} \propto \exp\big(-\Phi(u^N) - R(u^N)\big). \qquad (18)$$
If $F$ satisfies A1-A2, then we have
$$d_{Hell}(\mu^y, \mu^y_N) \to 0 \qquad (19)$$
as $N \to \infty$.

The proofs of Theorem 3 and Corollary 4 follow directly from [41, Theorem 2.3, Corollary 2.4], and hence we omit them here.
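Before moving to the application, the bounds (10)-(11) of Lemma 1 can be sanity-checked numerically on a discretization with $|\Omega| = 1$, so that integrals become means. The weights below reuse the simplified pixel-difference form of (5), an assumption of this sketch.

```python
import numpy as np

# Numerical sanity check of Lemma 1 on a discrete grid with |Omega| = 1:
# integrals become means, so ||u||_{L^2} ~ sqrt(mean(u^2)).
rng = np.random.default_rng(2)
n, h = 64, 0.5
f = rng.standard_normal(n)
W = np.exp(-np.subtract.outer(f, f) ** 2 / h ** 2)   # weights in (0, 1]
sqrtW = np.sqrt(W)

def nl_grad(u):
    return (u[None, :] - u[:, None]) * sqrtW         # NL gradient (6)

def l2_norm(u):
    return np.sqrt(np.mean(u ** 2))                  # ||u||_{L^2(Omega)}

violations = 0
for _ in range(200):
    u = rng.standard_normal(n)
    G = nl_grad(u)
    grad_norm = np.sqrt(np.mean(G ** 2))             # ||grad_w u||_{L^2(Omega x Omega)}
    j_nltv = np.mean(np.sqrt(np.mean(G ** 2, axis=1)))  # J_NLTV(u), (8)
    if grad_norm > 2 * l2_norm(u) or j_nltv > 2 * l2_norm(u):
        violations += 1
```

For random test images the bounds (10) and (11) are never violated, as the lemma predicts.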
3. Application to Limited Tomography Reconstruction
In this section, we illustrate the application of the Bayesian inference method with the NLTG prior to the limited tomography problem.
X-ray computed tomography (CT) plays an important role in the diagnosis of diseases of the human body. Let $u$ denote the image to be reconstructed. Throughout this paper, we assume $u \in L^2(\mathbb{R}^2)$ is supported in a domain $\Omega$, and we only consider two-dimensional parallel-beam CT for simplicity. The sinogram (or the projection data) $y$ is obtained by the following Radon transform [30]:
$$y(\alpha, s) = Pu(\alpha, s) = \int_{-\infty}^{\infty} u\big(s\mathbf{n} + z\mathbf{n}^\perp\big)\, dz, \qquad (\alpha, s) \in [0,\pi)\times\mathbb{R}, \qquad (20)$$
where $\mathbf{n} = (\cos\alpha, \sin\alpha)$ and $\mathbf{n}^\perp = (-\sin\alpha, \cos\alpha)$. Given a sinogram $y = Pu$ on $[0,\pi)\times\mathbb{R}$, the tomographic image $u$ is reconstructed via the inverse Radon transform [1, 25]:
$$u(x) = \frac{1}{2\pi^2}\int_0^\pi \int_{-\infty}^{\infty} \frac{1}{x\cdot\mathbf{n}-s}\left[\frac{\partial}{\partial s}y(\alpha, s)\right] ds\, d\alpha, \qquad (21)$$
where $x = (x_1, x_2) \in \mathbb{R}^2$.

From (21), we can easily see that the reconstruction of $u$ requires the so-called complete knowledge of $y$ on $[0,\pi)\times\mathbb{R}$ [19, 29]. However, the problem (20) becomes ill-posed whenever only limited data are available on a subset of $[0,\pi)\times\mathbb{R}$, due to the reduced size of the detector [38, 39] and/or the reduced number of projections [26, 34, 4]. In particular, the projection data $y$ may be available only on $S_r \subsetneq [0,\pi)\times\mathbb{R}$, with
$$S_r = \{(\alpha, s) \in [0,\pi)\times\mathbb{R}: |s| \le r\}. \qquad (22)$$
In this case there exists a function $g$, called the ambiguity of $P$, with $Pg = 0$ on $S_r$ [38]. As this ambiguity $g$ is nonconstant in the region of interest (ROI) $B(0,r)$ [38, 40], the image reconstructed via (21) using $y$ available only on $S_r$ will be deteriorated by $g$.

In the literature, numerous studies have been proposed to remove the ambiguity $g$ caused by the restriction of $y$ to $S_r$. These studies can be classified into two categories: known-subregion based approaches, related to the restoration of a signal from its truncated Hilbert transform, and sparsity-model based approaches, using sparse approximations of tomographic images in the ROI. See [32, 8, 16] for detailed surveys.
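For illustration, a crude discrete parallel-beam projector implementing (20), together with the limited-data restriction (22) to $S_r$, can be sketched as follows. Nearest-neighbour ray sampling, the grid sizes and the radius are arbitrary choices of this example.

```python
import numpy as np

# Crude discrete Radon transform (20) via nearest-neighbour sampling,
# followed by the limited-data restriction to S_r in (22).
def radon(u, angles):
    """Line integrals of image u along directions given by angles."""
    n = u.shape[0]
    c = (n - 1) / 2.0
    s = np.linspace(-c, c, n)                 # detector coordinate
    t = np.linspace(-c, c, n)                 # coordinate along the ray
    S, T = np.meshgrid(s, t, indexing="ij")
    sino = np.zeros((len(angles), n))
    for i, a in enumerate(angles):
        # x = s * n_vec + t * n_perp, shifted to array indices
        X = np.rint(S * np.cos(a) - T * np.sin(a) + c).astype(int)
        Y = np.rint(S * np.sin(a) + T * np.cos(a) + c).astype(int)
        inside = (X >= 0) & (X < n) & (Y >= 0) & (Y < n)
        vals = np.where(inside, u[np.clip(X, 0, n - 1), np.clip(Y, 0, n - 1)], 0.0)
        sino[i] = vals.sum(axis=1)            # integrate along the ray
    return sino, s

# small test image: a centred disk
n = 32
xx, yy = np.meshgrid(np.arange(n) - (n - 1) / 2, np.arange(n) - (n - 1) / 2)
u = (xx ** 2 + yy ** 2 <= 8 ** 2).astype(float)

sino, s = radon(u, [0.0, np.pi / 2])
r = 10.0
sino_limited = sino * (np.abs(s) <= r)        # data available only on S_r
```

The masked sinogram mimics the restriction to $S_r$; detector bins with $|s| > r$ carry no information, which is the source of the ambiguity discussed above.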
In this paper, we use an approach based on a reference image and NLTG regularization.

In this section, we discuss how to compute the two popular point estimators in the Bayesian setting. The first is the MAP estimator; following the same steps as in [7], we can show that the MAP estimator with the NLTG prior is the minimizer of
$$I(u) := \Phi(u) + \lambda J_{NLTV}(u) + \tfrac12\|u\|_K^2 \qquad (23)$$
over $L^2(\Omega)$, where in this problem $\Phi(u) = \tfrac12\|Au-y\|_\Sigma^2$, with $A$ being the linear operator of the limited tomography problem, and $\|u\|_K^2 = \langle C^{-1}u, u\rangle$ is the Cameron-Martin norm of the Gaussian reference measure.

To minimize (23), we adopt the widely used split Bregman method [15], which is equivalent to the alternating direction method of multipliers (ADMM). For the sake of clarity, we present the split Bregman method for (23) in Algorithm 1.

Algorithm 1
Split Bregman Method for MAP estimate with NLTG prior.
Initialization: $u^0 = u_{ref}$, $b^0 = d^0 = 0$, $y$, $\lambda$, $\mu$.
for $k = 0, 1, 2, \cdots$ do
  Update $u^{k+1}$:
  $$u^{k+1} = \arg\min_u\ \Phi(u) + \frac{\mu}{2}\|d^k - \nabla_\omega u - b^k\|^2 + \frac12\|u\|_K^2 \qquad (24)$$
  Update $d^{k+1}$:
  $$d^{k+1} = \arg\min_d\ \lambda\|d\|_1 + \frac{\mu}{2}\|d - \nabla_\omega u^{k+1} - b^k\|^2 \qquad (25)$$
  Update $b^{k+1}$:
  $$b^{k+1} = b^k + \nabla_\omega u^{k+1} - d^{k+1} \qquad (26)$$
end for

To solve (24), we use the conjugate gradient method for the following linear system:
$$\big(A^T\Sigma^{-1}A - \mu\Delta_\omega + C^{-1}\big)u = A^T\Sigma^{-1}y + \mu\operatorname{div}_\omega(b^k - d^k).$$
The $d$-subproblem (25) has the closed-form solution
$$d^{k+1} = \operatorname{shrink}\big(\nabla_\omega u^{k+1} + b^k,\ \lambda/\mu\big),$$
where $\operatorname{shrink}(d, \lambda)$ is the shrinkage formula defined as
$$\operatorname{shrink}(d, \lambda) = \max\big(\|d\| - \lambda,\ 0\big)\,\frac{d}{\|d\|},$$
with the convention that $0/0 = 0$.

The second point estimator is the CM estimate, which we compute by sampling the posterior with the preconditioned Crank-Nicolson (pCN) algorithm [6]. The pCN proposal is
$$v = \sqrt{1-\beta^2}\, u + \beta w, \qquad (27)$$
where $u$ is the present position, $v$ is the proposed position, $\beta \in [0,1]$ is the parameter controlling the stepsize, and $w \sim N(0, C)$ is a draw from the Gaussian reference measure. The associated acceptance probability is
$$a(u, v) = \min\big\{1,\ \exp[\Phi(u) + J(u) - \Phi(v) - J(v)]\big\}, \qquad (28)$$
where $J = \lambda J_{NLTV}$. We describe the complete pCN algorithm in Algorithm 2.

Algorithm 2
The preconditioned Crank-Nicolson (pCN) algorithm.
Initialize $u^{(0)} \in X$;
for $n = 0, \ldots, N-1$ do
  Propose $v$ using Eq. (27);
  Draw $\theta \sim U[0,1]$;
  if $\theta \le a(u^{(n)}, v)$ then
    $u^{(n+1)} = v$;
  else
    $u^{(n+1)} = u^{(n)}$;
  end if
end for

4. Numerical Results

In this section, we present experimental results to demonstrate the performance of the proposed NLTG prior. In particular, we compare the results (both MAP and CM estimates) of the proposed method with those of the filtered back projection (FBP) method, the NLTV regularization model and the TG method. We use the 128 ×
128 XCAT image [33], taking integer values in [0, 100], as the original image $u_{ori}$. The ground truth image $u_{gt}$ is then generated by adding a round-shaped object, which stands for a tumor in the lung, and further adding a sinusoidal wave as an inhomogeneous background.

Figure 1. XCAT images: original, ground truth and references with different noise levels. (a) Original image $u_{ori}$; (b) ground truth $u_{gt}$; (c) reference image $u_1$; (d) reference image $u_2$.

Finally, we generate the reference image $u_{ref}$, which can be considered as a previous CT image of the same patient taken by the same CT modality, by using 500 projections of $u_{ori}$ corrupted by Gaussian noise of different levels. In all experiments, the reference images used are the reconstructions from the projection data with low noise level 5 and high noise level 20, which will be denoted as $u_1$ and $u_2$ respectively. Please refer to Figure 1 for these images.

Given a reference image $u_{ref}$, the covariance matrix $\Sigma$ for the Gaussian measure term is computed as
$$\Sigma(i,j) = \exp\left(-\frac{\|u_{ref}(i) - u_{ref}(j)\|^2}{h^2}\right), \qquad (29)$$
following the idea of the radial basis function kernel in machine learning [3] and the similarity weight in the nonlocal means filter [2]. We note that this choice of covariance matrix differs from the usual Gaussian measure, as the correlation between the pixel values of $u_{ref}$ is used instead of the spatial distance. This choice aims to bring structure and edge information of the reference image into the to-be-reconstructed image, which is especially important for reconstruction from highly missing data. In the comparison with the TG method, we adopt this covariance matrix as well, for fairness. As for the comparison with the NLTV method, in order to save computation and storage memory, we only use the 10 largest weights and the 4 closest neighbors for each pixel, as adopted in [43]. This will be further illustrated in the numerical results.

To synthesize the limited projection data $y$, we generate the forward operator $A$ as the discrete Radon transform followed by the restriction onto the discretization of $[0,\pi)\times[-r, r]$.

Table 1.
MAP results: PSNR for different sinogram noise levels and reference images.

Note that the size of the discrete Radon transform depends on both the size of the CT image and the number of projections. In these experiments, we use 50 equally spaced projections. The projection data are then generated by
$$y = Au_{gt} + \eta,$$
with Gaussian noise $\eta$ of different noise levels. Throughout the experiments, we choose the noise level to be 1 for the low level and 10 for the high level respectively. Finally, the reconstructed images are further improved by imposing the intensity constraint $[0, 100]$:
$$u_{rec} = \min\{\max(u^*, 0), 100\}, \qquad (30)$$
where $u^*$ denotes the output of the four methods in comparison.

In solving (23), the filtering parameter $h$ in both (5) and (29) is chosen as in [43]. In addition, since it is not hard to see that $\Sigma$ is positive definite, the model (23) is convex; we set the maximum outer iteration number to 80, ensuring that the algorithm converges to the global minimizer. Finally, the regularization parameter $\lambda$ is manually chosen so that we obtain the optimal restoration results.

Tables 1 and 2 summarize the PSNR and SSIM values for each case. As we can see from the tables, our NLTG prior consistently outperforms the other reconstruction methods, namely the FBP, TV, TG and NLTV priors. We present the reconstruction results of the different methods in Figures 2 and 3. As can be expected, both TV and FBP cannot successfully reconstruct reasonable results, due to the limited projections. The TG prior based MAP estimate shows better performance, since our proposed Gaussian covariance matrix (29) provides some structure information from the reference image as a prior.
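The PSNR figures reported in Tables 1-2, together with the intensity constraint (30), can be computed as below. The peak value 100 is taken from the stated intensity range, and the inputs here are synthetic stand-ins for the actual reconstructions.

```python
import numpy as np

# PSNR with peak intensity 100, and the clipping step (30).
def clip_intensity(u_star, lo=0.0, hi=100.0):
    """u_rec = min(max(u*, 0), 100), the constraint (30)."""
    return np.clip(u_star, lo, hi)

def psnr(u_rec, u_gt, peak=100.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((u_rec - u_gt) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# synthetic example: ground truth plus a constant error of 1 -> MSE = 1
rng = np.random.default_rng(5)
u_gt = rng.uniform(0.0, 100.0, size=(64, 64))
u_star = u_gt + 1.0
```

With a constant error of 1 and peak 100, the PSNR is exactly $10\log_{10}(100^2/1) = 40$ dB, which gives a quick check of the convention used.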
We can also see that, compared to the TG prior, the NLTG prior shows the advantage of NLTV by exploiting the similarities within the image. In addition, compared to the NLTV prior, the NLTG prior obtains better recovery results, thanks to the presence of the Gaussian term, which extracts more structure information via the covariance matrix computed from the reference image. As we can see from the figures, the visual improvements are consistent with the improvements in the indices.

Table 2.
MAP results: SSIM for different sinogram noise levels and reference images.

(a) Ground truth $u_{gt}$; (b) FBP; (c) TV; (d) TG with $u_1$; (e) NLTV with $u_1$; (f) NLTG with $u_1$; (g) TG with $u_2$; (h) NLTV with $u_2$; (i) NLTG with $u_2$.

Figure 2.
MAP results with sinogram noise level 5.
(a) Ground truth $u_{gt}$; (b) FBP; (c) TV; (d) TG with $u_1$; (e) NLTV with $u_1$; (f) NLTG with $u_1$; (g) TG with $u_2$; (h) NLTV with $u_2$; (i) NLTG with $u_2$.

Figure 3.
MAP results with sinogram noise level 20.
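The CM estimates discussed next are computed with the pCN method of Algorithm 2. A self-contained sketch on a toy two-dimensional problem is given below; the identity prior covariance and the functions `phi` and `j` are simple stand-ins for the data misfit and the NLTV term, not the operators of the CT experiments.

```python
import numpy as np

# Toy pCN sampler (Algorithm 2) with proposal (27) and acceptance (28).
rng = np.random.default_rng(3)
A = np.array([[1.0, 0.5], [0.0, 1.0]])
y = np.array([1.0, -0.5])

def phi(u):
    """Data misfit with unit noise covariance."""
    r = A @ u - y
    return 0.5 * r @ r

def j(u):
    """Placeholder nonnegative regularizer standing in for lambda * J_NLTV."""
    return 0.1 * np.abs(u).sum()

def pcn(n_samples, beta=0.3, burn_in=1000):
    u = np.zeros(2)
    samples, accepted = [], 0
    for _ in range(n_samples + burn_in):
        w = rng.standard_normal(2)                    # w ~ N(0, C), here C = I
        v = np.sqrt(1.0 - beta ** 2) * u + beta * w   # proposal (27)
        a = min(1.0, np.exp(phi(u) + j(u) - phi(v) - j(v)))  # acceptance (28)
        if rng.uniform() <= a:
            u = v
            accepted += 1
        samples.append(u)
    return np.array(samples[burn_in:]), accepted / (n_samples + burn_in)

samples, acc_rate = pcn(5000)
u_cm = samples.mean(axis=0)       # conditional-mean estimate
```

In the experiments the stepsize $\beta$ is tuned so that the acceptance rate is around 25%; the same tuning applies to this sketch.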
The hyperparameters in the CM model come directly from the MAP model. For both the TG prior and the NLTG prior, we run the pCN algorithm with a long chain of samples, an initial portion of which serves as the pre-run. The stepsize $\beta$ has been chosen so that the acceptance probability is around 25%. We show the CM results in Figure 4, and compute the PSNR and SSIM as in Table 3.

Figure 4. CM results: the conditional mean for different sinogram noise levels and different reference images. The upper row shows the results obtained with the NLTG prior, and the lower row shows the results obtained with the TG prior. (a) Sinogram noise level 5 with $u_1$; (b) sinogram noise level 5 with $u_2$; (c) sinogram noise level 20 with $u_1$; (d) sinogram noise level 20 with $u_2$.

One can see from the figures and Table 3 that the CM model with the NLTG prior consistently outperforms the CM model with the TG prior across the different scenarios. We can also see that the PSNR of the CM model is lower than that of the MAP model under low sinogram noise, while it is higher under high sinogram noise. Nonetheless, the MAP model consistently outperforms the CM model in SSIM. The reason for this behavior is that the MAP estimator preserves structures and edges better, while the CM provides smoother images that suppress the noise. We can also see that the PSNR values of the CM estimates with different reference images under a fixed sinogram noise level are almost identical, while the SSIM values are in general dependent on the reference image noise level. This suggests that, in some sense, the SSIM index is more sensitive to the structures that can be brought in from reference images.

The main advantage of Bayesian techniques is that they can measure the uncertainty in the estimates. Figure 5 summarizes the 95% confidence interval (CI) gaps for each setting. Once again, the CM model with the TG prior performs worse than the CM model with the NLTG prior, as one would expect, since the similarity of structures extracted by the NLTG prior can include more information than the detection of local sharp edges by the TG prior. For the sinogram noise level of 5, we can see that the CI gap of the samples with reference noise level 1 shows higher values, especially near the edges of the image, than that of the samples with reference noise level 5. It is worth noting that edges have higher uncertainty than the smooth regions. This may be due to the fact that the NLTG prior frequently tunes the optimal values on the edge pixels, which consequently results in high variance and a large confidence interval. Finally, we can draw a similar conclusion in the case of high sinogram noise level.

Figure 5. The 95% confidence intervals for different sinogram noise levels and different reference images; the range of the values is from 0 to 100. The upper row shows the results obtained with the NLTG prior, and the lower row shows the results obtained with the TG prior. (a) Sinogram noise level 5 with $u_1$; (b) sinogram noise level 5 with $u_2$; (c) sinogram noise level 20 with $u_1$; (d) sinogram noise level 20 with $u_2$.

Table 3.
CM results: SSIM and PSNR for different levels of sinogram noise and reference images. Each row reports, for a sinogram noise level (5 or 20) and a reference image ($u_1$ or $u_2$), the PSNR and SSIM of the NLTG and TG priors.
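The CI-gap maps visualized in Figure 5 are pixelwise widths of 95% credible intervals computed from the posterior samples. With a hypothetical sample array of shape (n_samples, n_pixels), standing in for the actual pCN output, the computation is:

```python
import numpy as np

# Pixelwise 95% credible-interval gap from posterior samples, as
# visualized in Figure 5. The samples here are synthetic stand-ins.
rng = np.random.default_rng(4)
n_samples, n_pixels = 5000, 16
samples = rng.normal(loc=50.0, scale=5.0, size=(n_samples, n_pixels))

lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)
ci_gap = hi - lo                  # large values indicate high uncertainty
```

For a Gaussian marginal with standard deviation $\sigma$ the gap is approximately $2 \times 1.96\,\sigma$, which is a convenient sanity check for the implementation.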
5. Conclusions
In this paper, we considered Bayesian inference methods for infinite-dimensional inverse problems, and in particular we proposed a prior distribution that combines the nonlocal method, which extracts information from a reference image, with a standard Gaussian distribution. We showed that the proposed prior distribution leads to a well-behaved posterior measure in the infinite-dimensional setting. We then applied the proposed method to a limited tomography problem. The numerical experiments demonstrate that the performance of the proposed NLTG prior is competitive against existing and adapted methods, and we also provided a comparison of the MAP and CM estimates. We believe that the proposed NLTG prior can be useful in a large class of image reconstruction problems where reference images are available, and we plan to investigate these applications in the future.
References

[1] R. N. Bracewell and A. C. Riddle. Inversion of fan-beam scans in radio astronomy. The Astrophysical Journal, 150:427, 1967.
[2] A. Buades, B. Coll, and J. M. Morel. A review of image denoising algorithms, with a new one. Multiscale Model. Simul., 4(2):490-530, 2005.
[3] C. Chang and C. Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3):27, 2011.
[4] K. Choi, J. Wang, L. Zhu, T. S. Suh, S. Boyd, and L. Xing. Compressed sensing based cone-beam computed tomography reconstruction with a first-order method. Med. Phys., 37(9):5113-5125, 2010.
[5] F. R. K. Chung. Spectral Graph Theory, volume 92 of CBMS Regional Conference Series in Mathematics. American Mathematical Society, Providence, RI, 1997.
[6] S. L. Cotter, G. O. Roberts, A. M. Stuart, and D. White. MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science, 28(3):424-446, 2013.
[7] M. Dashti, K. J. H. Law, A. M. Stuart, and J. Voss. MAP estimators and their consistency in Bayesian nonparametric inverse problems. Inverse Problems, 29(9):095017, 2013.
[8] M. Dashti, K. J. H. Law, A. M. Stuart, and J. Voss. MAP estimators and their consistency in Bayesian nonparametric inverse problems. Inverse Problems, 29(9):095017, 2013.
[9] A. A. Efros and T. K. Leung. Texture synthesis by non-parametric sampling. In IEEE International Conference on Computer Vision, pages 1033-1038, Corfu, Greece, September 1999.
[10] A. Elmoataz, O. Lezoray, and S. Bougleux. Nonlocal discrete regularization on weighted graphs: a framework for image and manifold processing. IEEE Trans. Image Process., 17(7):1047-1060, 2008.
[11] G. B. Folland. Real Analysis: Modern Techniques and Their Applications. Pure and Applied Mathematics. John Wiley & Sons, New York, 2nd edition, 1999.
[12] A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian Data Analysis. CRC Press, Boca Raton, FL, 2014.
[13] G. Gilboa and S. Osher. Nonlocal linear image regularization and supervised segmentation. Multiscale Model. Simul., 6(2):595-630, 2007.
[14] G. Gilboa and S. Osher. Nonlocal operators with applications to image processing. Multiscale Model. Simul., 7(3):1005-1028, 2008.
[15] T. Goldstein and S. Osher. The split Bregman method for $\ell_1$-regularized problems. SIAM Journal on Imaging Sciences, 2(2):323-343, 2009.
[16] T. Heußer, M. Brehm, S. Marcus, S. Sawall, and M. Kachelrieß. CT data completion based on prior scans. In Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2012 IEEE, pages 2969-2976. IEEE, 2012.
[17] J. Kaipio and E. Somersalo.
Statistical and computational inverse problems , volume 160. SpringerScience & Business Media, 2006.[18] S. Kindermann, S. Osher, and P. W. Jones. Deblurring and Denoising of Images by Nonlocal Functionals.
Multiscale Model. & Simul. , 4(4):1091–1115, 2005.[19] E Klann. A mumford–shah-like method for limited data tomography with an application to electrontomography.
SIAM Journal on Imaging Sciences , 4(4):1029–1048, 2011.[20] M. Lassas and S. Siltanen. Can one use total variation prior for edge-preserving bayesian inversion?
Inverse Problems , 20(5):1537, 2004.[21] Jinglai Li. A note on the karhunen–lo`eve expansions for infinite-dimensional bayesian inverse problems.
Statistics & Probability Letters , 106:1–4, 2015.[22] J. Liu, H. Ding, S. Molloi, X. Zhang, and H. Gao. Nonlocal total variation based spectral CT imagereconstruction.
Medical physics , 42(6):3570–3570, 2015.[23] Y. Lou, X. Zhang, S. Osher, and A. Bertozzi. Image Recovery via Nonlocal Operators.
J. Sci. Comput. ,42(2):185–197, 2010.[24] F. Lucka, S. Pursiainen, M. Burger, and C.H. Wolters. Hierarchical bayesian inference for the eeg inverse
LTG Priors in Medical Image problem using realistic fe head models: depth localization and source separation for focal primarycurrents. Neuroimage , 61(4):1364–1382, 2012.[25] F. Natterer.
The Mathematics of Computerized Tomography . Springer, 1986.[26] X. Pan, EY. Sidky, and M. Vannier. Why do commercial ct scanners still employ traditional, filtered back-projection for image reconstruction?
Inverse problems , 25(12):123009, 2009.[27] G. Peyr´e. Image Processing with Nonlocal Spectral Bases.
Multiscale Model. & Simul. , 7(2):703–730,2008.[28] Gabriel Peyr´e, S´ebastien Bougleux, and Laurent Cohen. Non-local regularization of inverse problems. InDavid Forsyth, Philip Torr, and Andrew Zisserman, editors,
Computer Vision – ECCV 2008 . SpringerBerlin Heidelberg, 2008.[29] ET. Quinto. Singularities of the x-ray transform and limited data tomography in R and R . SIAM Journalon Mathematical Analysis , 24(5):1215–1225, 1993.[30] J. Radon. Uber die bestimmug von funktionen durch ihre integralwerte laengs geweisser mannig-faltigkeiten.
Berichte Saechsishe Acad. Wissenschaft. Math. Phys., Klass , 69:262, 1917.[31] L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms.
Physica D:nonlinear phenomena , 60(1-4):259–268, 1992.[32] T. Schuster, B. Kaltenbacher, B. Hofmann, and K. S. Kazimierski.
Regularization Methods in BanachSpaces , volume 10 of
Radon Series on Computational and Applied Mathematics . Walter de GruyterGmbH & Co. KG, Berlin, 2012.[33] W.P. Segars, G. Sturgeon, S. Mendonca, Jason Grimes, and Benjamin MW Tsui. 4d xcat phantom formultimodality imaging research.
Medical physics , 37(9):4902–4915, 2010.[34] E. Y. Sidky and X. Pan. Image Reconstruction in Circular Cone-Beam Computed Tomography byConstrained, Total-Variation Minimization.
Phys. Med. Biol. , 53(17):4777, 2008.[35] AM. Stuart. Inverse problems: a Bayesian perspective.
Acta Numer , 19:451–559, 2010.[36] S. J. Vollmer. Dimension-Independent MCMC Sampling for Inverse Problems with Non-Gaussian Priors.
SIAM / ASA J. Uncertain. Quantif. , 3(1):535–561, 2015.[37] Sebastian J Vollmer. Dimension-independent mcmc sampling for inverse problems with non-gaussianpriors.
SIAM / ASA Journal on Uncertainty Quantification , 3(1):535–561, 2015.[38] G. Wang and H. Yu. The meaning of interior tomography.
Physics in medicine and biology , 58(16):R161,2013.[39] JP. Ward, M. Lee, JC. Ye, and M. Unser. Interior tomography using 1d generalized total variation. part i:Mathematical foundation.
SIAM Journal on Imaging Sciences , 8(1):226–247, 2015.[40] J. Yang, H. Yu, M. Jiang, and G. Wang. High-order total variation minimization for interior tomography.
Inverse problems , 26(3):035013, 2010.[41] Z. Yao, Z. Hu, and J. Li. A TV-Gaussian Prior for Infinite-Dimensional Bayesian Inverse Problems and ItsNumerical Implementations.
Inverse Problems , 32(7):075006, 19, 2016.[42] Zhewei Yao, Zixi Hu, and Jinglai Li. A tv-gaussian prior for infinite-dimensional bayesian inverseproblems and its numerical implementations.
Inverse Problems , 32(7):075006, 2016.[43] X. Zhang, M. Burger, X. Bresson, and S. Osher. Bregmanized Nonlocal Regularization for Deconvolutionand Sparse Reconstruction.
SIAM journal on imaging sciences , 3(3):253–276, 2010.[44] X. Zhang and T. F. Chan. Wavelet Inpainting by Nonlocal Toral Variation.
Inverse problems and Imaging ,4(1):191–210, 2010.[45] D. Zhou and B. Sch¨olkopf.