[PDF] GPU-based Low-dose 4DCT Reconstruction via Temporal Non-local Means

Abstract

Four-dimensional computed tomography (4DCT) has been widely used in cancer radiotherapy for accurate target delineation and motion measurement for tumors in thorax and upper abdomen areas. However, 4DCT simulation is associated with much higher imaging dose than conventional CT simulation, which is a major concern in its clinical application. Conventionally, each phase of 4DCT is reconstructed independently using the filtered backprojection (FBP) algorithm. The basic idea of our new algorithm is that, by utilizing the common information among different phases, the input information required to reconstruct image of high quality, and thus the imaging dose, can be reduced. We proposed a temporal non-local means (TNLM) method to explore the inter-phase similarity. All phases of the 4DCT images are reconstructed simultaneously by minimizing a cost function consisting of a data fidelity term and a TNLM regularization term. We utilized a forward-backward splitting algorithm and a Gauss-Jacobi iteration method to efficiently solve the minimization problem. The algorithm was also implemented on graphics processing unit (GPU) to achieve a high computational speed. Our reconstruction algorithm has been tested on a digital NCAT thorax phantom in three low dose scenarios. Our new algorithm generates visually much better CT images containing less image noise and streaking artifacts compared with the standard FBP algorithm. Quantitative analysis shows that much higher contrast-to-noise ratio and signal-to-noise ratio can be obtained using our algorithm. The total reconstruction time for all 10 phases of a slice ranges from 90 to 140 seconds on an NVIDIA Tesla C1060 GPU card.

Full PDF

LLow-dose 4DCT Reconstruction via Temporal Non-local Means † Zhen Tian

Department of Biomedical Engineering, Graduate School at Shenzhen, Tsinghua University, Shenzhen, 518055, China, and Center for Advanced Radiotherapy Technologies, University of California San Diego, La Jolla, CA 92093-0843, USA

Xun Jia

Center for Advanced Radiotherapy Technologies and Department of Radiation Oncology, University of California San Diego, La Jolla, CA 92093-0843, USA

Bin Dong

Department of Mathematics, University of California San Diego, La Jolla, CA 93093-0112, USA Yifei Lou

Department of Mathematics, University of California Los Angeles, Los Angeles, CA 90095 Steve B. Jiang a) Center for Advanced Radiotherapy Technologies and Department of Radiation Oncology, University of California San Diego, La Jolla, CA 92093-0843, USA

Purpose : Four-dimensional computed tomography (4DCT) has been widely used in cancer radiotherapy for accurate target delineation and motion measurement for tumors in thorax and upper abdomen areas. However, its prolonged scanning duration causes a considerably increase of radiation dose compared with the conventional CT, which is a major concern in its clinical application. This work is to develop a new algorithm to reconstruct 4DCT images from undersampled projections acquired at low mAs levels in order to reduce the imaging dose. Methods:

Conventionally, each phase of 4DCT is reconstructed independently using the filtered backprojection (FBP) algorithm. The basic idea of our new algorithm is that, by utilizing the common information among different phases, the input information required to reconstruct image of high quality, and thus the imaging dose, can be reduced. We proposed a temporal non-local means (TNLM) method to explore the inter-phase similarity. All phases of the 4DCT images are reconstructed simultaneously by minimizing a cost function consisting of a data fidelity term and a TNLM regularization term. We utilized a modified forward- † Zhen Tian and Xun Jia have contributed equally to this work and should be considered co-first authors. a) Author to whom correspondence should be addressed. Electronic mail: [email protected]

Z. Tian et al. backward splitting algorithm and a Gauss-Jacobi iteration method to efficiently solve the minimization problem. The algorithm was also implemented on graphics processing unit (GPU) to improve the computational speed. Our reconstruction algorithm has been tested on a digital NCAT thorax phantom in three low dose scenarios: all projections with low mAs level, undersampled projections with high mAs level and undersampled projections with low mAs level. Results:

In all three low dose scenarios, our new algorithm generates visually much better CT images containing less image noise and streaking artifacts compared with the standard FBP algorithm. Quantitative analysis shows that, by comparing our TNLM algorithm with the standard FBP algorithm, the contrast-to-noise ratio has been improved by a factor of 3.9-10.2 and the signal-to-noise ratio has been improved by a factor of 2.1-5.9, depending on the cases. In the situation of undersampled projection data, the majority of the streaks in the images reconstructed by FBP can be suppressed using our algorithm. The total reconstruction time for all 10 phases of a slice ranges from 40 to 90 seconds on an NVIDIA Tesla C1060 GPU card. Conclusions:

The experimental results indicate that our new algorithm outperforms the conventional FBP algorithm in effectively reducing the image artifacts due to undersampling and suppressing the image noise due to the low mAs level. Key words: 4DCT reconstruction, dose reduction, temporal non-local means, GPU

65 Z. Tian et al.

1. Introduction

Four-dimensional computed tomography (4DCT) has been widely used for treatment simulation in radiotherapy of tumors with respiratory motion , in that it can provide time-resolved volumetric images. Currently, there are two different methods for 4DCT acquisition and sorting, namely retrospective slice sorting and prospective sinogram selection . For the retrospective slice sorting method, the projection data are continuously acquired at each couch position for a time interval slightly longer than a full respiratory cycle, either in cine mode or in helical mode with a very low pitch . Multiple slices corresponding to different acquisition time points are reconstructed at each couch position and then sorted into respiratory phase bins using various respiratory signals . For prospective sinogram selection method, the CT scanner is triggered by the respiratory signal for projection data acquisition . The projection data within the same phase bin are used to reconstruct CT slices corresponding to that breathing phase. No matter which method is used, the reconstructed CT slices at different breathing phases represent a set of time-resolved volumetric images, which called 4DCT images and can resolve the organ motions and reduce motion artifacts to a satisfactory extent . However, the prolonged acquisition time results in a considerably increased radiation dose. For example, given the standard 4DCT parameters (140kVp, 95mA, 0.5s per rotation), the radiation dose of a 4DCT scan is about 6 times of that of a typical helical CT scan . This fact has become a major concern in the clinical application of 4DCT and thus it is highly desirable to reduce its imaging dose. Intuitively, the radiation dose can be reduced by lowering the mAs level and/or decreasing the number of acquired projections † . However, these approaches in general will lead to amplified image noise and obvious streaking artifacts in the reconstructed 4DCT images, if conventional filtered backprojection (FBP) algorithm is used. In the current FBP-based 4DCT reconstruction algorithm, each phase of the 4DCT images is reconstructed independently based on the acquired projection data associated with it. This method, though simple, completely neglected the highly temporal correlation between 4DCT images at successive phases, as apparently same anatomical features exist in successive phases of 4DCT images with slight motion and deformation. It is expected that taking into account this information during reconstruction, one can reconstruct 4DCT image with high quality, even at the low dose contexts of low mAs level and/or undersampled projection data. Inspired by this idea, in this paper we propose a new 4DCT reconstruction algorithm by utilizing the aforementioned temporal correlations between images at successive phases. In our algorithm, CT slices corresponding to different phases are reconstructed simultaneously as opposed to independently in conventional FBP-type algorithms. In particular, the temporal regularization is imposed between successive phases via a Temporal Non-local Means (TNLM) term to take the inter-phase correlation into account. Our idea of the TNLM function is inspired by the so called Non-local Means (NLM) method originated in image processing † Decreasing the number of projections is not straightforward on currently available commercial CT scanners due to the use of continuous x-ray generation mode. However, technically it is possible to modify the scanners to operate in high-frequency pulsed mode , if there is a clinical need (such as the one suggested in this paper).

Z. Tian et al. field. The NLM method assumes that there are lots of repetitive structures contained in an image and thus utilizes the similar image features at different spatial locations in the same image to constructively enhance each other . This assumption, though valid in many cases in image processing problems, it may not hold in medical images. However, it is reasonable to believe that along temporal direction there exist repetitive features in the 4DCT images. Thus we propose the novel TNLM method which extends the original NLM method into the temporal domain by exploring the similarity between neighboring phases in the context of 4DCT reconstruction. The rest of this paper is organized as following. In Section 2 we will present the new 4DCT reconstruction method. The experimental results with this new method will be given in Section 3. We then give discussion and conclusions in Section 4.

2. Methods

Let us divide a respiratory cycle into phases labeled by

The 4DCT image of phase is denoted by a vector . is the projection matrix of phase that maps the image into a set of projections corresponding to various projection angles. The measured set of projections is denoted by a vector . We attempt to reconstruct the 4DCT images by solving the following optimization problem: , (1) where the first term in the summation is a data fidelity term, ensuring that the projections of the reconstructed 4DCT image at each phase matches the corresponding observed projections. The symbol denotes the matrix transpose. The covariance matrix is a diagonal matrix with its non-zero elements corresponding to the variance of the pixel values of the measured projection images . The second term in Eq. (1), , is the regularization term, and the parameter adjusts the relative weight between the data fidelity term and the regularization term. In this paper, we propose a new TNLM function as the temporal regularization imposed on neighboring phases to explore the inter-phase similarity. A periodic boundary condition along the temporal direction is assumed, e.g. , . For two images at different phases, and , is defined as . (2) The weighting factors are independent of and but defined according to the ground truth images and as , (3) where denotes a square patch on the image centering at a pixel and is the L -norm of the difference between and . is a normalization Z. Tian et al. parameter such that is a parameter that adjusts to what extent we would like to enforce the similarity between patches. To solve the optimization problem in Eq.(1), we implement a forward-backward splitting

19, 20 , where the solution to Eq.(1) can be obtained by alternatively performing the following two steps till convergence: , (4) , (5) where the superscript is the index for iteration steps. is auxiliary vector and is a constant introduced by the splitting algorithm. Note that Eq.(4) is actually one step of gradient descent algorithm towards a problem minimizing an energy function. . The introduction of actually controls the step size of the gradient descent algorithm for numerical stability purpose. In practice, it is found that by substituting this one step gradient descent with a conjugate gradient least square (CGLS) method for the minimization of , the overall convergence can be enhanced, although the convergence after this modification is not mathematically proven. The CGLS algorithm by itself is an iterative algorithm. In each iteration , we use the images obtained from the last iteration, i.e. as an initial guess. Let us denote and . The detailed implementation of this CGLS algorithm is performed as follows: CGLS Algorithm : , . Do the Steps 1-5 for times. 1. Here the superscript is the iteration step for the CGLS algorithm. In practice, it is not necessary to carry out this CGLS algorithm very precisely in each outer loop , since the purpose of this CGLS step is only to generate a better solution based on the input . Therefore, the iteration steps of the CGLS algorithm, is chosen to be a small integer, such as

To solve the subproblem in Eq. (5), let us first take functional variation of with respect to . Note that the weighting factors are constants defined according to the ground truth images and . We arrive at

Z. Tian et al. (6) where . By setting this variation to be zero and rearrange different terms, we obtain the optimality condition as : (7) This equation leads to a Gauss-Jacobi type iteration scheme for solving the problem in Eq. (5): Gauss-Jacobi iteration Algorithm : Initialize and . Do the Step 1 for times. 1. Again the integer here denotes iteration step. It follows from Theorem 10.1.1 in the literature that such an iteration scheme converges for any . The total number of iteration steps is chosen to be a small number, such as . Moreover, since the weighting factors defined according to the ground truth images and are not know beforehand, we update these weights during the iteration according to the latest available images and . In other words, we choose square patches from the reconstructed images obtained in the last iteration, and , to calculate the weight instead of . (8) Additionally, a simple truncation of negative pixel values is necessary after each iteration to ensure the positivity of the reconstructed images. The TNLM algorithm can be summarized as follows: TNLM Algorithm: 1.

Initialize for to be the image reconstructed by FBP with all projections from all phases. 2.

Using CGLS with initial value to get . 3.

Update weighting parameter , using . 4. Get using Gaussian-Jacobi algorithm. 5.

Ensuring image positivity: Go back to step 2 until convergence. The advantage of this TNLM algorithm is straightforward. In the k th iteration, the algorithm first obtains a better solution using CGLS algorithm based on the solution Z. Tian et al. from previous step, . Since this step does not contain any regularization on the solution, the obtained will be contaminated by noise and various artifacts. The following TNLM step, i.e. step 4, updates the solution according to the Gauss-Jacobi algorithm, yielding a new solution . In particular, is a weighted average of the input image and the images at neighboring phases and . Moreover, the updated image pixel value at depend on in a local fashion, i.e. also at voxel , but in a non-local fashion on the neighboring phases, i.e. at other voxels . For those non-local terms, the weight is automatically adjusted according to the similarity between the patches around in phase i and the patches around in neighboring phases. As such, any features that repetitively appear in successive phases, such as true anatomical structures, are preserved during the iteration. In contrast, those features do not repeat, such as streaking artifacts, are suppressed. One disadvantage of our TNLM reconstruction algorithm is its large computational burden of searching for similarity. During the implementation, for each pixel on image , a square patch centering at this pixel on the image is compared with patches centered at all pixels on the neighboring phase images for computing the weighting factor . Suppose the patches used for computing this weighting factor are two dimensional squares with pixels in each dimension. Then the total complexity of the searching scheme between two images is in the order of , where is the size of the 4D-CT slices in each dimension. However, this approach is neither computational efficient nor necessary. In fact, for a patch at location on phase i , the patches similar to it on neighboring phases must locate within a neighborhood of x due to the finite motion range of respiratory motion. Therefore, it is adequate to restrict the search for the similar patches only within a search window. In practice, we set this search window to be a square region with pixels in each dimension. Since it is usually true that , this searching window can reduce the computation load down to . Another technique we used to speed up the calculation is removing the redundant calculations in computing the weighting factor. Note that the weights, and , are actually same before normalization. Reusing this factor in the Gauss-Jacobi algorithm, instead of recomputing it, can reduce the computational load by a factor of about two. Besides, we also implement the 4DCT reconstruction algorithm on an NVIDIA Tesla C1060 card to speed up the computation. This GPU card has a total number of 240 processor cores (grouped into 30 multiprocessors with 8 cores each), each with a clock speed of 1.3 GHz. It is also equipped with 4 GB DDR3 memory, shared by all processor cores. We simply have each GPU thread responsible for one pixel of the CT slices. Because of the large number of GPU threads, the computation efficiency can be considerably elevated.

3. Experimental results

230 Z. Tian et al. We tested our reconstruction algorithm on a digital NCAT phantom at a thorax region . The x-ray projections were simulated using Siddon’s algorithm in fan-beam geometry with an arc detector of 888 units and a spacing of 1.0239mm. The source to detector distance is 949.075mm and the source to rotation center distance is 541.0mm. All of these parameters mimic the realistic configuration of a GE Lightspeed QX/I CT scanner. The gantry’s rotation speed is set to be 0.5s/rotation. The period of the respiratory cycle is 4 seconds. Our 4DCT reconstruction algorithm is tested in three low dose cases: Case 1 - all projections acquired with low mAs protocol; Case 2 - undersampled projections acquired with high mAs protocol; Case 3 - undersampled projections acquired with low mAs protocol. To simulate the noise-contaminated sinogram at low mAs situations, we add Gaussian noise signal to the noise free projection data, where the variance at a given entry i of the projection is taken as

25, 26 (9) where is the projection data value before adding noise. represents the average photon number just before entering the patient body, which is derived from the measurements at a certain mAs level. The variance computed as such also goes into the matrix Σ in Eq. (1) . In Case 1, we first generated 4000 noise-contaminated projections at 20mAs for ten phases and all of them are used for reconstruction. The reconstructed images at three phases, namely the end of inhale, the middle of the exhale, and the end of exhale are shown in Fig.1. The reconstruction results of the conventional FBP algorithm are also shown for comparison purpose. By visually inspecting, it is clear that our TNLM algorithm outperforms the FBP algorithm by greatly reducing the image noise. The blurring effect shown in top and bottom row images is due to the residual heart motion within the respiratory phase bin. More discussion about this effect will be given in Section 4. (a1) (b1) (c1) (a2) (b2) (c2)

Z. Tian et al. (a3) (b3) (c3) Figure 1.

The reconstruction 4DCT images for the 40% phase (end point of inhale) (top) and the 70% phase (mid point of exhale) (middle) and the 100% phase (end point of exhale) (bottom) with all projections at 20 mAs level (Case 1). (a) Ground truth; (b) FBP; (c) TNLM.

As for Case 2, we generated 500 projections at 100 mAs to simulate the undersampling situation with high mAs level. Case 3 is same as Case 2 except the mAs is lowered down to

20 mAs to simulate the undersampling situation with low mAs level. The results of these two cases are shown in Fig. 2 and Fig. 3, respectively. Obvious streaking artifacts can be observed in the CT images reconstructed by the FBP algorithm due to undersampling, see Fig. 2(b1~b3) and Fig. 3(b1~b3). Moreover, the FBP results of the Case 3 are noisier than that of the Case 2 due to the low mAs level. On the other hand, our TNLM algorithm can still maintain the image quality to a satisfactory level in both cases. (a1) (b1) (c1) (a2) (b2) (c2) (a3) (b3) (c3)

Figure 2.

The reconstruction 4DCT images for the 40% phase (end point of inhale) (top) and the 70% phase (mid point of exhale) (middle) and the 100% phase (end point of exhale) (bottom) with 500 projections at 100 mAs level (Case 2). (a) Ground truth; (b) FBP; (c) TNLM.

0 Z. Tian et al. (a1) (b1) (c1) (a2) (b2) (c2) (a3) (b3) (c3) Figure 3.

The reconstruction 4DCT images for the 40% phase (end point of inhale) (top) and the 70% phase (mid point of exhale) (middle) and the 100% phase (end point of exhale) (bottom) with 500 projections at 20 mAs level (Case 3). (a) Ground truth; (b) FBP; (c) TNLM.

To quantitatively evaluate the performance of our reconstruction algorithm in terms of maintaining image contrast and suppressing image noise, we calculate the contrast to noise ratio (CNR) and signal to noise ratio (SNR) which are defined as follows: , (10) . (11) The CNR is defined on a given region of interest (ROI). is the mean value of the signal for the ROI, while and are the mean value and stand deviation of the nearby background. SNR is a quantity to measure the overall deviation of the reconstructed 4DCT image from the ground truth image. is the mean value of an image and is the ground truth image. Table 1 lists the mean values of CNRs over ten phases of the two ROIs shown in Fig. 4.

ROI1 is a small high contrast structure with the background chosen as the nearby dark lung region. ROI2 is the vertebral body with the neighboring tissue as its background. In all the three cases, our TNLM algorithm has improved the CNR by a factor of 3.9-10.2 depending on the ROIs and cases compared with the conventional FBP algorithm. In particular, for the FBP algorithm, all the CNRs of ROI2 are very small and thus the low contrast structures is hardly resolved. While for our TNLM algorithm, the corresponding CNRs are much larger. The results of CNRs illustrate that our algorithm is superior to the FBP algorithm in maintaining

1 Z. Tian et al. good contrast under low dose contexts. Table 2 lists the mean SNRs over ten phases, in which our TNLM algorithm has improved the SNRs by a factor of 2.1-5.9, indicating that our algorithm outperforms the FBP algorithm in effectively suppressing the image noise at low dose situation. Case1 Case2 Case3 ROI1 ROI2 ROI1 ROI2 ROI1 ROI2 FBP 6.83 1.80 4.09 1.13 2.56 0.69 TNLM 28.64 12.15 16.04 7.53 19.53 7.07

Table 1.

Mean CNR values over ten phases of the two ROIs in Fig. 4.

Figure 4.

Two ROIs used for CNR calculation.

Case1 Case2 Case3 FBP 10.49 6.03 3.39 TNLM 22.13 21.12 20.28

Table 2.

Mean SNR values over ten phases .

4. Discussion and Conclusions

In this paper, we have presented a novel iterative 4DCT reconstruction algorithm via temporal regulariation. The 4DCT images of different phases are reconstructed simultaneously by minimizing an energy function consisting of a data fidelity term and a temporal regularization term between every two neighboring phases. A temporal non-local means method is employed to take the temporal correlation of the 4DCT images into account. We utilized a modified forward-backward algorithm to perform the optimization. The iterative reconstruction algorithm is implemented on a GPU platform to improve its efficiency. Our algorithm is tested on a digital NCAT phantom under low dose context by lowering the mAs level or/and decreasing the number of projections. The experimental results indicate that our algorithm performs much better than the conventional FBP algorithm in effectively reducing the image artifacts due to undersampling and suppressing the image noise due to the low mAs level. Specifically, the contrast-to-noise ratio has been improved by a factor of 3.9-10.2 and the signal-to-noise ratio has been improved by a factor of 2.1-5.9, depending on the cases. The total reconstruction time ranges from 40 to 90 seconds on a NVIDIA Tesla C1060 card (NVIDIA, Santa Clara, CA) for ten phases at a transverse slice. This reconstruction time may not meet the requirement for some clinical applications. Yet, the efficiency of our algorithm could be further increased by using some advanced technologies such as multi-grid algorithm and multiple GPUs. The blurred heart edges in some panels of Figs. 1-3 are caused by the residual heart motion within the respiratory phase bin. The breathing period used in this work is 4 sec. Then each of 10 breathing phase bin covers 0.4 sec, within which the respiratory motion is negligible but the heart motion is not. For example, we observe apparent blurring effect in Fig. 1(c1) and Fig. 1(c3), but not Fig. 1(c2), this is because the heart in Fig. 1(c1) and Fig. 1(c3) is in the middle of systole/diastole, while in Fig. 1(c2) it happens to be at the end of systole. The the blurring effect also presents in the ground truth images, as shown in Fig. 1(a1) and Fig. 1(a3), because we average over 400 images within a given breathing phase bin to produce the

ROI1

ROI2

2 Z. Tian et al. ground truth CT image at that phase. The blurring effect due to the residual heart motion also exists in the FBP images, which, however, is less visible due to the image noise and the streak artifact. In this feasibility study we have only quantitatively evaluated CNR and SNR. Other quantitative measures, such as spatial resolution of the images, are not considered in the current work. It is found that the spatial resolution is degraded to a certain extent with the TNLM method, especially at the undersampling situation in Case 2 and Case 3. We plan to conduct much more systematical and quantitative evaluations using a large set of representative patient images and a set of clinically relevant metrics in our future work. Acknowledgements

This work is supported in part by the University of California Lab Fees Research Program. We would like to thank NVIDIA for providing GPU cards for this project.

330 3 Z. Tian et al. References

1. E. Ford, G. Mageras, E. Yorke and C. Ling, "Respiration-correlated spiral CT: A method of measuring respiratory-induced anatomic motion for radiation treatment planning,"

Medical physics , 88-97 (2003). 2. P. Keall, G. Starkschall, H. Shukla, K. Forster, V. Ortiz, C. Stevens, S. Vedam, R. George, T. Guerrero and R. Mohan, "Acquiring 4D thoracic CT scans using a multislice helical method," Physics in Medicine and Biology , 2053-2067 (2004). 3. D. Low, M. Nystrom, E. Kalinin, P. Parikh, J. Dempsey, J. Bradley, S. Mutic, S. Wahab, T. Islam and G. Christensen, "A method for the reconstruction of four-dimensional synchronized CT scans acquired during free breathing," Medical physics , 1254-1263 (2003). 4. T. Pan, T. Lee, E. Rietzel and G. Chen, "4D-CT imaging of a volume influenced by respiratory motion on multi-slice CT," Medical physics , 333-340 (2004).

5. E. Rietzel, T. Pan and G. Chen, "Four-dimensional computed tomography: image formation and clinical protocol," Medical physics , 874 (2005). 6. S. Vedam, P. Keall, V. Kini, H. Mostafavi, H. Shukla and R. Mohan, "Acquiring a four-dimensional computed tomography dataset using an external respiratory signal," Physics in Medicine and Biology , 45-62 (2003).

7. U. Langner and P. Keall, "Prospective displacement and velocity-based cine 4D CT," Medical physics , 4501-4512 (2008). 8. R. Li, J. Lewis, L. Cervino and S. Jiang, "4D CT sorting based on patient internal anatomy," Physics in Medicine and Biology , 4821-4833 (2009). 9. W. Lu, P. Parikh, J. Hubenschmidt, J. Bradley and D. Low, "A comparison between amplitude sorting and phase-angle sorting using external respiratory measurement for 4D CT," Medical physics , 2964-2974 (2006). 10. N. Wink, C. Panknin and T. Solberg, "Phase versus amplitude sorting of 4D-CT data," Journal of Applied Clinical Medical Physics , 77-85 (2006). 11. P. Keall, "4-dimensional computed tomography imaging and treatment planning," Seminars in Radiation Oncology , 81-90 (2004). 12. J. de Koste, S. Senan, C. Kleynen, B. Slotman and F. Lagerwaard, "Renal mobility during uncoached quiet respiration: An analysis of 4DCT scans," Int J Radiat Oncol Biol Phys , 799-803 (2006). 13. B. Myagkov, E. Shikanov and A. Shikanov, "Development and investigation of a pulsed x-ray generator," Atomic energy , 143-148 (2009). 14. G. Yue, Q. Qiu, B. Gao, Y. Cheng, J. Zhang, H. Shimoda, S. Chang, J. Lu and O. Zhou, "Generation of continuous and pulsed diagnostic imaging x-ray radiation using a carbon-nanotube-based field-emission cathode," Applied Physics Letters , 355-357 (2002). 15. M. Kang, C. Park, C. Lee, J. Goo and H. Lee, "Dual-Energy CT: Clinical Applications in Various Pulmonary Diseases1," Radiographics , 685-699 (2010). 16. A. Buades, B. Coll and J. Morel, "A review of image denoising algorithms, with a new one," Multiscale Modeling and Simulation , 490-530 (2006). 17. Y. Lou, X. Zhang, S. Osher and A. Bertozzi, "Image recovery via nonlocal operators," Journal of Scientific Computing , 185-197 (2010).

18. J. Wang, T. Li, Z. Liang and L. Xing, "Dose reduction for kilovotage cone-beam computed tomography in radiation therapy," Physics in Medicine and Biology , 2897-2909 (2008). 19. P. Combettes and V. Wajs, "Signal recovery by proximal forward-backward splitting," Multiscale Modeling and Simulation , 1168-1200 (2005).

20. E. Hale, W. Yin and Y. Zhang, "Fixed-Point Continuation for l1-Minimization: Methodology and Convergence," SIAM Journal on Optimization , 1107-1130 (2008). 21. M. R. Hestenes and E. Stiefel, "Methods of conjugate gradients for solving linear systems," Journal of Research of the National Bureau of Standards , 409-436 (1952). 22. G. Golub and C. Van Loan, Matrix computations . (Johns Hopkins Univ Pr, 1996).

385 4 Z. Tian et al.

23. W. P. Segars,"Development and application of the new dynamic NURBS-based cardiac-torso (NCAT) phantom," Ph.D. Dissertation, University of North Carolina, Chapel Hill, NC 2001. 24. R. Siddon, "Fast calculation of the exact radiological path of a three-dimensional CT array," Medical Physics , 252-255 (1985).

25. T. Li, X. Li, J. Wang, J. Wen, H. Lu, J. Hsieh and Z. Liang, "Nonlinear sinogram smoothing for low-dose X-ray CT," IEEE Transactions on Nuclear Science , 2505 (2004). 26. J. Wang, H. Lu, Z. Liang, D. Eremina, G. Zhang, S. Wang, J. Chen and J. Manzione, "An experimental study on the noise properties of x-ray CT sinogram data in Radon space," Physics in Medicine and Biology , 3327 (2008)., 3327 (2008).