Sparse-View CT Reconstruction via Convolutional Sparse Coding
Peng Bao, Wenjun Xia, Kang Yang, Jiliu Zhou, Senior Member, IEEE, and Yi Zhang*, Senior Member, IEEE
College of Computer Science, Sichuan University, Chengdu 610065, China
ABSTRACT
Traditional dictionary learning based CT reconstruction methods are patch-based, and the features learned with these methods often contain shifted versions of the same features. To deal with these problems, convolutional sparse coding (CSC) has been proposed and introduced into various applications. In this paper, inspired by the successful applications of CSC in the field of signal processing, we propose a novel sparse-view CT reconstruction method based on CSC with gradient regularization on the feature maps. By working directly on the whole image, without dividing it into overlapped patches as dictionary learning based methods do, the proposed method can maintain more details and avoid the artifacts caused by patch aggregation. Experimental results demonstrate that the proposed method outperforms several existing algorithms in both qualitative and quantitative aspects.
Index Terms — Sparse-view CT, convolutional sparse coding

1. INTRODUCTION
In recent years, reconstructing high quality images from undersampled projection data has been an attractive research topic in the field of CT imaging. Traditional analytic algorithms, such as FBP, cannot reconstruct high quality images with insufficient projection data. On the other hand, iterative reconstruction algorithms are effective in this situation and can guarantee the image quality when the projection views are incomplete. However, when the projection views are highly sparse, it is difficult to achieve a satisfactory result without extra prior information. Introducing reasonable priors can significantly improve the imaging quality of traditional iterative methods, such as the algebraic reconstruction technique (ART) and expectation maximization (EM). With the development of compressed sensing (CS) theory, Sidky et al. [1] coupled total variation (TV) and projection onto convex sets (POCS) to solve the incomplete projection data reconstruction problem and achieved good performance. Following this path, many variants of TV have been proposed. In particular, Niu et al. [2] proposed to combine penalized weighted least-squares (PWLS) [3] with total generalized variation (TGV) [4] to reduce the undesired patchy effects caused by the mathematical assumption of TV. However, these methods, including PWLS-TGV, still suffer from patchy effects to different degrees. As another representative work, dictionary learning (DL) methods [5] have been proved effective in CT reconstruction in [6]. Two kinds of dictionaries are learned as bases: (1) a global dictionary learned from an external training set, which is fixed during the reconstruction process, and (2) an adaptive dictionary learned from the intermediate result, which keeps updating during the iterations. However, most dictionary learning methods are patch-based, and the learned features often contain shifted versions of the same features.
To address these problems, convolutional sparse coding (CSC) has been proposed, in which shift invariance is directly modelled in the objective function. CSC has been demonstrated to be very useful in a wide range of computer vision problems. Aided by an ℓ2 penalty on the gradient of the feature maps, CSC can be further improved for the impulse noise denoising problem [7]. Inspired by this successful research, in this work we propose a penalized weighted least-squares method based on CSC aided by gradient regularization on the feature maps for sparse-view CT reconstruction (PWLS-CSCGR). The remainder of this paper is organized as follows. In Section 2, the proposed reconstruction scheme is elaborated. In Section 3, the experimental design and representative results are given. Finally, we conclude this paper in Section 4.

2. METHOD

2.1. Penalized Weighted Least-Squares
The calibrated and log-transformed projection data approximately follow a Gaussian distribution, whose variance can be described by the following formula:

$$\sigma_i^2 = f_i \exp(\bar{y}_i / \eta), \qquad (1)$$

where $\bar{y}_i$ and $\sigma_i^2$ are the mean and variance of the measured projection data at the $i$-th bin, respectively, $f_i$ is a parameter adaptive to different bins, and $\eta$ is a scaling parameter. Based on these properties of the projection data, PWLS CT image reconstruction can be modelled as follows:

$$\min_{u} \; (y - Au)^T \Sigma^{-1} (y - Au) + \beta R(u), \qquad (2)$$

where $y$ denotes the projection data, $u$ is the vectorized attenuation coefficient image to be reconstructed, and $(\cdot)^T$ denotes the transpose operation. $A$ is the system matrix of size $I \times J$ ($I$ is the total number of projection measurements and $J$ is the total number of image pixels). $\Sigma$ is a diagonal matrix whose $i$-th element is $\sigma_i^2$ as calculated by (1). $R(u)$ represents the regularization term and $\beta$ is a regularization parameter. Current TV and DL based regularization methods suffer from blocky effects or patch aggregation artifacts. To circumvent these problems, in this paper we propose to use CSC with gradient regularization on the feature maps as the regularization term in (2). Then the following minimization model is obtained:
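The noise model (1) and the weighted data-fidelity term of (2) can be sketched as follows. This is a minimal illustration, not the authors' code; the function and parameter names are my own, and `A` is a toy dense stand-in for a real forward projector.

```python
import numpy as np

def projection_variance(y_mean, f_i, eta):
    """Per-bin noise variance of Eq. (1): sigma_i^2 = f_i * exp(y_i / eta).

    y_mean : mean log-transformed projection value(s)
    f_i    : bin-adaptive parameter (scalar or array)
    eta    : scaling parameter
    """
    return f_i * np.exp(y_mean / eta)

def pwls_objective(y, A, u, sigma2, beta, R):
    """PWLS cost of Eq. (2): (y - A u)^T Sigma^{-1} (y - A u) + beta * R(u).

    Sigma is diagonal, so the weighting reduces to dividing the residual
    by the per-ray variances sigma2.
    """
    r = y - A @ u
    return r @ (r / sigma2) + beta * R(u)
```

Rays with large variance (heavily attenuated, noisy measurements) are automatically down-weighted by the `1/sigma2` factor, which is the point of the PWLS formulation.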
$$\min_{u,\{X_m\},\{d_m\}} \; \frac{1}{2}(y - Au)^T \Sigma^{-1} (y - Au) + \beta\left(\frac{1}{2}\Big\|\sum_{m=1}^{M} d_m * X_m - u\Big\|_2^2 + \lambda \sum_{m} \|X_m\|_1 + \frac{\mu}{2}\sum_{m}\Big\|\sqrt{(g_r * X_m)^2 + (g_c * X_m)^2}\Big\|_2^2\right), \qquad (3)$$

where $*$ is the convolution operator, $\{d_m\}$ is a set of filters, $X_m$ is the feature map corresponding to filter $d_m$, $g_r$ and $g_c$ are the filters that compute the gradient along image rows and columns respectively, and $\lambda$ and $\mu$ are hyper-parameters. In this model, we use predetermined filters $\{d_m\}$, which can be trained by the method proposed in [8]. The model then becomes the following optimization problem:
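The bracketed regularizer in (3)/(4) can be evaluated numerically as below. This is a hedged sketch under the usual CSC convention of periodic boundary conditions (so convolutions are circular and match the Fourier-domain solver used later); all names are illustrative.

```python
import numpy as np

def conv2(kernel, x, shape):
    """Circular 2-D convolution via FFT, zero-padding the kernel to `shape`."""
    return np.real(np.fft.ifft2(np.fft.fft2(kernel, s=shape) * np.fft.fft2(x, s=shape)))

def cscgr_regularizer(u, filters, maps, lam, mu, g_r, g_c):
    """Value of the CSC-with-gradient-regularization term in Eq. (3)/(4):

    1/2 || sum_m d_m * X_m - u ||_2^2  +  lam * sum_m ||X_m||_1
      +  mu/2 * sum_m || sqrt((g_r * X_m)^2 + (g_c * X_m)^2) ||_2^2
    """
    shape = u.shape
    recon = sum(conv2(d, X, shape) for d, X in zip(filters, maps))
    fit = 0.5 * np.sum((recon - u) ** 2)
    l1 = lam * sum(np.abs(X).sum() for X in maps)
    grad = 0.5 * mu * sum(
        np.sum(conv2(g_r, X, shape) ** 2 + conv2(g_c, X, shape) ** 2)
        for X in maps)
    return fit + l1 + grad
```

Note that the squared norm of the gradient magnitude expands into separate squared ℓ2 norms of the row and column gradients, which is exactly the rewriting used in (8) below.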
$$\min_{u,\{X_m\}} \; \frac{1}{2}(y - Au)^T \Sigma^{-1} (y - Au) + \beta\left(\frac{1}{2}\Big\|\sum_{m=1}^{M} d_m * X_m - u\Big\|_2^2 + \lambda \sum_{m} \|X_m\|_1 + \frac{\mu}{2}\sum_{m}\Big\|\sqrt{(g_r * X_m)^2 + (g_c * X_m)^2}\Big\|_2^2\right). \qquad (4)$$

An alternating minimization scheme can be applied to solve (4). First, an intermediate reconstructed image $u$ is obtained with a set of fixed feature maps $\{\hat{X}_m\}$. Thus, (4) is transformed to:

$$\min_{u} \; (y - Au)^T \Sigma^{-1} (y - Au) + \beta\Big\|\sum_{m=1}^{M} d_m * \hat{X}_m - u\Big\|_2^2. \qquad (5)$$

With the separable paraboloid surrogate method [6], the solution of (5) can be obtained, which is expressed as follows:

$$u_j^{t+1} = u_j^t - \frac{\sum_{i=1}^{I} \frac{a_{ij}}{\sigma_i^2}\big([Au^t]_i - y_i\big)}{\sum_{i=1}^{I} \frac{a_{ij}}{\sigma_i^2}\sum_{k=1}^{J} a_{ik} + \beta} + \frac{\beta\Big(\big[\sum_{m=1}^{M} d_m * \hat{X}_m\big]_j - u_j^t\Big)}{\sum_{i=1}^{I} \frac{a_{ij}}{\sigma_i^2}\sum_{k=1}^{J} a_{ik} + \beta}, \quad j = 1,2,\ldots,J, \qquad (6)$$

where $t = 0,1,2,\ldots,T$ is the iteration index and $a_{ij}$ is the $(i,j)$-th element of $A$. The second step is to re-express the intermediate result $u$ with the fixed filters $\{d_m\}$. Since CSC does not provide a good representation for the low-frequency component, only the high-frequency component $u_h$ of $u$ is expressed by CSC.

Fig. 1. The images in the training set. The display window is [-150 250] HU.

Then we have the following optimization problem:
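The image update (6) is a diagonally-preconditioned gradient step. A minimal sketch of one such update is given below; the names are my own, and a dense matrix `A` stands in for a real projector, which in practice would be applied matrix-free.

```python
import numpy as np

def sps_image_update(u, y, A, sigma2, beta, csc_recon, n_iter=1):
    """Separable paraboloid surrogate update of Eq. (6) for subproblem (5).

    u         : current image estimate, flattened to length J
    A         : system matrix (I x J), toy dense stand-in
    sigma2    : per-ray variances from Eq. (1)
    csc_recon : current CSC synthesis sum_m d_m * X_m, flattened to length J
    """
    w = 1.0 / sigma2                                    # diagonal of Sigma^{-1}
    # denominator of (6): sum_i (a_ij / sigma_i^2) * sum_k a_ik + beta
    denom = (A * w[:, None]).T @ A.sum(axis=1) + beta
    for _ in range(n_iter):
        # numerator terms of (6): weighted data gradient minus CSC pull-back
        grad = A.T @ (w * (A @ u - y)) - beta * (csc_recon - u)
        u = u - grad / denom
    return u
```

Because the denominator is computed per pixel, each coordinate takes its own safe step size, which is what makes the surrogate update monotone without a line search.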
$$\min_{\{X_m\}} \; \frac{1}{2}\Big\|\sum_{m=1}^{M} d_m * X_m - u_h\Big\|_2^2 + \lambda \sum_{m} \|X_m\|_1 + \frac{\mu}{2}\sum_{m}\Big\|\sqrt{(g_r * X_m)^2 + (g_c * X_m)^2}\Big\|_2^2. \qquad (7)$$

Define linear operators $G_r$ and $G_c$ such that $G_r X_m = g_r * X_m$ and $G_c X_m = g_c * X_m$; then the gradient term in (7) can be rewritten as

$$\frac{\mu}{2}\sum_{m}\|G_r X_m\|_2^2 + \frac{\mu}{2}\sum_{m}\|G_c X_m\|_2^2. \qquad (8)$$

If we define $D_m X_m = d_m * X_m$, $D = (D_1 \; D_2 \; \cdots \; D_M)$, $X = (X_1^T \; X_2^T \; \cdots \; X_M^T)^T$, and

$$\Gamma_r = \begin{pmatrix} G_r & & \\ & G_r & \\ & & \ddots \end{pmatrix}, \quad \Gamma_c = \begin{pmatrix} G_c & & \\ & G_c & \\ & & \ddots \end{pmatrix}, \qquad (9)$$

then (7) can be reformulated as follows:

$$\min_{X} \; \frac{1}{2}\|DX - u_h\|_2^2 + \lambda\|X\|_1 + \frac{\mu}{2}\|\Gamma_r X\|_2^2 + \frac{\mu}{2}\|\Gamma_c X\|_2^2. \qquad (10)$$

The alternating direction method of multipliers (ADMM) is applied to solve (10) by introducing an auxiliary variable $Y$ constrained to be equal to $X$, leading to the following problem:

$$\min_{X,Y} \; \frac{1}{2}\|DX - u_h\|_2^2 + \frac{\mu}{2}\|\Gamma_r X\|_2^2 + \frac{\mu}{2}\|\Gamma_c X\|_2^2 + \lambda\|Y\|_1, \quad \text{s.t. } X = Y. \qquad (11)$$

With ADMM, we have the following iterations:

$$X^{k+1} = \arg\min_{X} \; \frac{1}{2}\|DX - u_h\|_2^2 + \frac{\mu}{2}\|\Gamma_r X\|_2^2 + \frac{\mu}{2}\|\Gamma_c X\|_2^2 + \frac{\rho}{2}\|X - Y^k + U^k\|_2^2, \qquad (12)$$

$$Y^{k+1} = \arg\min_{Y} \; \lambda\|Y\|_1 + \frac{\rho}{2}\|X^{k+1} - Y + U^k\|_2^2, \qquad (13)$$

$$U^{k+1} = U^k + X^{k+1} - Y^{k+1}. \qquad (14)$$

(12) can be transformed into the Fourier domain, and the solution can be expressed as:

$$\big(\hat{D}^H\hat{D} + \mu\hat{\Gamma}_r^H\hat{\Gamma}_r + \mu\hat{\Gamma}_c^H\hat{\Gamma}_c + \rho I\big)\hat{X} = \hat{D}^H\hat{u}_h + \rho(\hat{Y} - \hat{U}), \qquad (15)$$

where $\hat{D}$, $\hat{\Gamma}_r$, $\hat{\Gamma}_c$, $\hat{X}$, $\hat{u}_h$, $\hat{Y}$ and $\hat{U}$ denote the corresponding quantities after transformation into the Fourier domain, and $I$ is the identity matrix. (15) can be solved efficiently by exploiting the Sherman-Morrison formula [8]. The closed-form solution for (13) is:

$$Y^{k+1} = S_{\lambda/\rho}\big(X^{k+1} + U^k\big), \qquad (16)$$

where $S(\cdot)$ denotes the soft-thresholding function.
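The ADMM loop (12)-(16) can be sketched end to end. In the Fourier domain, the system matrix of (15) decouples per frequency into a diagonal term plus a rank-one term, so the Sherman-Morrison formula gives the X-update in closed form. This is a minimal illustration assuming periodic boundaries; it is not the authors' implementation, and all names are my own.

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding S_t(v), the closed form of Eq. (16)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def csc_admm(u_h, filters, g_r, g_c, lam, mu, rho, n_iter=50):
    """ADMM iterations (12)-(14) for the CSC subproblem (10)-(11)."""
    shape, M = u_h.shape, len(filters)
    Dh = np.stack([np.fft.fft2(d, s=shape) for d in filters])   # filter spectra
    Gr2 = np.abs(np.fft.fft2(g_r, s=shape)) ** 2
    Gc2 = np.abs(np.fft.fft2(g_c, s=shape)) ** 2
    c = mu * Gr2 + mu * Gc2 + rho          # diagonal part of (15), same for all m
    uh_hat = np.fft.fft2(u_h)
    X = np.zeros((M,) + shape)
    Y = np.zeros_like(X)
    U = np.zeros_like(X)
    for _ in range(n_iter):
        # X-update, Eq. (15), solved per frequency by Sherman-Morrison:
        # (diag(c) + conj(Dh) Dh^T) Xh = R
        R = np.conj(Dh) * uh_hat + rho * np.fft.fft2(Y - U, axes=(1, 2))
        num = np.sum(Dh * R / c, axis=0)
        Xh = (R - np.conj(Dh) * num / (1.0 + np.sum(np.abs(Dh) ** 2 / c, axis=0))) / c
        X = np.real(np.fft.ifft2(Xh, axes=(1, 2)))
        Y = soft(X + U, lam / rho)          # Y-update, Eq. (16)
        U = U + X - Y                       # dual update, Eq. (14)
    return Y
```

The rank-one solve costs O(M) per frequency instead of the O(M^3) of a direct inversion, which is the main point of using the Sherman-Morrison identity here [8].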
Fig. 2. Abdominal images reconstructed by various methods. (a) The reference image versus the images reconstructed by (b) FBP, (c) TV-POCS, (d) PWLS-TGV, (e) PWLS-DL and (f) PWLS-CSCGR. The display window is [-150 250] HU.
Fig. 3. Difference images relative to the original image. (a) FBP, (b) TV-POCS, (c) PWLS-TGV, (d) PWLS-DL and (e) PWLS-CSCGR. The display window is [-1000 -700] HU.

Table 1: Quantitative results obtained by different algorithms for the abdominal and thoracic images
              ABDOMINAL                      THORACIC
              PSNR    RMSE      SSIM         PSNR    RMSE      SSIM
FBP           25.35   0.05417   0.45858      22.99   0.07087   0.42335
TV-POCS       43.04   0.00704   0.97187      42.92   0.00714   0.97751
PWLS-TGV      44.05   0.00627   0.97697      44.43   0.00600   0.98310
PWLS-DL       44.84   0.00573   0.97975      43.68   0.00655   0.98145
PWLS-CSCGR

3. EXPERIMENTAL RESULTS
In this section, several experiments were conducted to validate the performance of the proposed method. All the images were authorized by Mayo Clinic and downloaded from "the NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge". 10 images were randomly selected as the training set to train the predetermined filters {d_m}. Fig. 1 shows the images in the training set. The geometry configuration of the simulations was set identical to [9]. 64 projection views were evenly distributed over 360°. Several methods were compared with our proposed algorithm, including FBP, TV-POCS [1], PWLS-TGV [2] and PWLS-DL [6]. Specifically, for PWLS-DL, the number of atoms in the dictionary was set to 256 and the number of nonzero elements in each coefficient vector was set to 16. The parameters of the proposed method were experimentally set as follows: the filter size was 10 × 10, the number of filters was 32, β = λ = 0.005, μ = 100 × λ + 1, and ρ = 0.06. All the parameters had the same initializations in all the experiments to demonstrate the robustness of our method. The original abdominal image and the results reconstructed by different methods are shown in Fig. 2. The result of FBP contains severe streak artifacts and is almost clinically useless. All the other algorithms remove the artifacts efficiently. However, in Fig. 2(c) and (d), blocky effects are noticed to different degrees. PWLS-DL effectively avoids the blocky effects in Fig. 2(e), but the result looks over-smoothed in some organs. The reason is that the DL based method is patch-based and averages all the overlapped patches to produce the final result. This procedure may suppress the noise efficiently, but some small details may be smoothed out as well. In Fig. 2(f), the proposed method maintains more details than all the other methods. The absolute difference images are illustrated in Fig. 3, and it can be observed that the result reconstructed by our method is closest to the reference image.
To further examine the details, Fig. 4 shows the results for the magnified region of interest (ROI) indicated by the red box in Fig. 2(a). The red arrows indicate several contrast-enhanced blood vessels in the liver, which can only be well identified by our method. In particular, in Fig. 4(e), PWLS-DL over-smooths the liver region, and the contrast-enhanced blood vessels are lost. In Fig. 4(f), the proposed method preserves most details and is visually most consistent with the reference image, even for the mottle-like structures in the liver. The quantitative results are given in Table 1. The proposed algorithm outperforms the other methods on all the metrics.
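For reference, the image-quality metrics in Table 1 can be computed as below. This is a generic sketch (PSNR conventions vary; the paper does not state its exact data range, so the default here is an assumption), and SSIM would typically come from a library such as scikit-image's `structural_similarity`.

```python
import numpy as np

def rmse(x, ref):
    """Root-mean-square error between a reconstruction and the reference."""
    return np.sqrt(np.mean((x - ref) ** 2))

def psnr(x, ref, data_range=None):
    """Peak signal-to-noise ratio in dB.

    data_range defaults to the reference peak-to-peak value; other
    conventions (e.g. a fixed maximum) are also common.
    """
    if data_range is None:
        data_range = ref.max() - ref.min()
    return 20.0 * np.log10(data_range / rmse(x, ref))
```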
Fig. 5 presents the thoracic images reconstructed by different methods. The FBP result is overwhelmed by artifacts, and it is hard to distinguish the artifacts from the blood vessels in the lungs. In Fig. 5(c)-(e), most artifacts are suppressed, but the blood vessels in some areas, indicated by the red arrows, are blurred. In Fig. 5(f), more details in the lungs are preserved while the artifacts are well eliminated. In particular, the region indicated by the red rectangle in Fig. 5(a) is enlarged in Fig. 6. It can be observed that the curve-like structures indicated by the red arrow are clearly visible in the result of the proposed method, while none of the other methods maintain these details. The quantitative evaluation of the thoracic image results is given in the right part of Table 1. Consistent with the abdominal results, our method again achieves the best scores.
Fig. 4. Zoomed ROI indicated by the red rectangle in 2(a). (a) The reference image versus the images reconstructed by (b) FBP, (c) TV-POCS, (d) PWLS-TGV, (e) PWLS-DL and (f) PWLS-CSCGR. The display window is [-150 250] HU.
Fig. 5. Thoracic images reconstructed by various methods. (a) The reference image versus the images reconstructed by (b) FBP, (c) TV-POCS, (d) PWLS-TGV, (e) PWLS-DL and (f) PWLS-CSCGR. The display window is [-1000 250] HU.

4. CONCLUSION
In this paper, we proposed a novel sparse-view CT reconstruction algorithm combining PWLS and CSCGR. The proposed method operates directly on the whole image rather than on overlapped patches, which avoids the artifacts caused by patch aggregation and preserves more details. Several experiments demonstrated that the proposed method performs better than the competing methods. In the future, we will try to extend this method to other topics, such as metal artifact reduction. We may also optimize our model with deep learning based methods [10, 11].
Fig. 6. Zoomed ROI indicated by the red rectangle in Fig. 5(a). (a) The reference image versus the images reconstructed by (b) FBP, (c) TV-POCS, (d) PWLS-TGV, (e) PWLS-DL and (f) PWLS-CSCGR. The display window is [-150 250] HU.

ACKNOWLEDGMENT
This work was supported in part by the National Natural Science Foundation of China under Grant 61671312, the Sichuan Science and Technology Program under Grant 2018HH0070, and the Miaozi Project in Science and Technology Innovation Program of Sichuan Province under Grant 18-YCG041.

REFERENCES

[1] E. Y. Sidky, C. Kao, and X. Pan, "Accurate image reconstruction from few-views and limited-angle data in divergent-beam CT," J. X-ray Sci. Technol., vol. 14, no. 2, pp. 119-139, 2006.
[2] S. Niu et al., "Sparse-view X-ray CT reconstruction via total generalized variation regularization," Phys. Med. Biol., vol. 59, no. 12, p. 2997, 2014.
[3] T. Li et al., "Nonlinear sinogram smoothing for low-dose X-ray CT," IEEE Trans. Nucl. Sci., vol. 51, no. 5, pp. 2505-2513, 2004.
[4] K. Bredies, K. Kunisch, and T. Pock, "Total generalized variation," SIAM J. Imaging Sci., vol. 3, no. 3, pp. 492-526, 2010.
[5] M. Elad and M. Aharon, "Image denoising via sparse and redundant representations over learned dictionaries," IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736-3745, 2006.
[6] Q. Xu et al., "Low-dose X-ray CT reconstruction via dictionary learning," IEEE Trans. Med. Imaging, vol. 31, no. 9, pp. 1682-1697, 2012.
[7] B. Wohlberg, "Convolutional sparse representations as an image model for impulse noise restoration," in Image, Video, and Multidimensional Signal Process. Workshop.
[8] B. Wohlberg, "Efficient algorithms for convolutional sparse representations," IEEE Trans. Image Process., vol. 25, no. 1, pp. 301-315, 2016.
[9] P. Bao, J. Zhou, and Y. Zhang, "Few-view CT reconstruction with group-sparsity regularization," Int. J. Numer. Meth. Biomed. Engng., vol. 34, no. 9, p. e3101, 2018.
[10] H. Chen et al., "Low-dose CT with a residual encoder-decoder convolutional neural network (RED-CNN)," IEEE Trans. Med. Imaging, vol. 36, no. 12, pp. 2524-2535, 2017.
[11] H. Chen et al., "LEARN: Learned experts' assessment-based reconstruction network for sparse-data CT."