UVTomo-GAN: An adversarial learning based approach for unknown view X-ray tomographic reconstruction
Mona Zehni, Zhizhen Zhao
Department of ECE and CSL, University of Illinois at Urbana-Champaign
ABSTRACT
Tomographic reconstruction recovers an unknown image given its projections from different angles. State-of-the-art methods addressing this problem assume the angles associated with the projections are known a-priori. Given this knowledge, the reconstruction process is straightforward as it can be formulated as a convex problem. Here, we tackle a more challenging setting: 1) the projection angles are unknown, 2) they are drawn from an unknown probability distribution. In this set-up our goal is to recover the image and the projection angle distribution using an unsupervised adversarial learning approach. For this purpose, we formulate the problem as a distribution matching between the real projection lines and the ones generated from the estimated image and projection distribution. This is then solved by reaching the equilibrium in a min-max game between a generator and a discriminator. Our novel contribution is to recover the unknown projection distribution and the image simultaneously using adversarial learning. To accommodate this, we use the Gumbel-softmax approximation of samples from a categorical distribution to approximate the generator's loss as a function of the unknown image and the projection distribution. Our approach can be generalized to different inverse problems. Our simulation results reveal the ability of our method to successfully recover the image and the projection distribution in various settings.
Index Terms — Tomographic reconstruction, adversarial learning, unsupervised learning, Gumbel-softmax, categorical distribution, computed tomography
1. INTRODUCTION
X-ray computed tomography (CT) is a popular imaging technique that allows for non-invasive examination of patients in medical/clinical settings. In a CT setup, the measurements, i.e. projections, are modeled as the line integrals of the underlying 2D object along different angles. The ultimate goal in CT reconstruction is to recover the 2D object given a large set of noisy projections.

If the projection angles are known, the tomographic reconstruction problem is often solved via filtered back-projection (FBP), direct Fourier methods [1], or formulated as a regularized optimization problem [2]. However, the knowledge of the projection angles is not always available or it
Fig. 1. An illustration of our pipeline.

might be erroneous, which adversely affects the quality of the reconstruction. To account for the uncertainty in the projection angles, iterative methods that solve for the 2D image and the projection angles in alternating steps are proposed in [3]. While proven effective, these methods are computationally expensive and sensitive to initialization.

Recently, the use of deep learning (DL) approaches for tomographic reconstruction has surged. DL-based CT reconstruction methods in sparse-view regimes learn either a mapping from the sinograms to the image domain [4, 5] or a denoiser that reduces the artifacts in the initial FBP reconstruction from the sinogram [6, 7, 8, 9, 10, 11]. Furthermore, DL-based sinogram denoising or completion is proposed in [12, 13]. Solving the optimization formulation of tomographic reconstruction along the gradient descent updates with machine learning components is suggested in [14, 15]. While these methods rely on the knowledge of the projection angles, they also require large paired training sets to learn from. However, here we address a more challenging problem where the projection angles are unknown in advance.

To overcome the challenges of unknown view CT reconstruction, we propose UVTomo-GAN, an unsupervised adversarial learning based approach for tomographic reconstruction with unknown projection angles. Our method is unsupervised, thus there is no need for large paired training sets. Our approach benefits from the proven potential of generative adversarial networks (GAN) [16] to recover the image and projection angle distribution that match the given projection measurements in a distribution sense. Our approach is mainly inspired by CryoGAN [17]. Unlike CryoGAN, we have a more challenging setting, as we assume that the distribution of the projection angles is unknown. Therefore, we seek to recover this distribution alongside the image.
We show that the original generator's loss involves sampling from the projection angle distribution, which is non-differentiable. To allow for back-propagation through this non-differentiable operator, we alter the training loss at the generator side using the Gumbel-softmax approximation of samples from a categorical distribution [18]. Our proposed idea is general and can be applied to a wide range of inverse problems with similar setups. Our results confirm the potential of our method in the unknown view tomographic reconstruction task under different noise regimes. Our implementation is available at https://github.com/MonaZI/UVTomogan.
2. PROJECTION FORMATION MODEL
We assume the projection formation model for X-ray CT as,

$$\xi_\ell = P_{\theta_\ell} I + \varepsilon_\ell, \quad \ell \in \{1, 2, \ldots, L\} \quad (1)$$

where $I : \mathbb{R}^2 \rightarrow \mathbb{R}$ is an unknown 2D compactly supported image we wish to estimate. $P_{\theta_\ell}$ denotes the tomographic projection operator that takes the line integral along the direction specified by $\theta_\ell \in [0, \pi]$, i.e.

$$(P_{\theta_\ell} I)(x) = \int_{-\infty}^{\infty} I(R_{\theta_\ell}^T \mathbf{x})\, dy \quad (2)$$

where $\mathbf{x} = [x, y]^T$ represents the 2D Cartesian coordinates and $R_{\theta_\ell}$ is the 2D rotation matrix specified by angle $\theta_\ell$. Here, we assume that $\{\theta_\ell\}_{\ell=1}^{L}$ are unknown and are randomly drawn from an unknown distribution $p$. Finally, the discretized projections are contaminated by additive white Gaussian noise $\varepsilon_\ell$ with zero mean and variance $\sigma^2$. An unbiased estimate of $\sigma^2$ can be obtained from the variance of the projection lines, but here we assume that $\sigma^2$ is known.

In this paper, our goal is to recover the underlying image $I$ and the unknown distribution of the projection angles $p$, given a large set of noisy projection lines, i.e. $\{\xi_\ell\}_{\ell=1}^{L}$.
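As a concrete illustration, the discrete forward model in (1)-(2) can be approximated by rotating the image and summing along one axis. The following is a minimal NumPy/SciPy sketch; the image, bin count, and noise level are toy values for illustration, not the paper's settings, and angles are handled in degrees for convenience:

```python
import numpy as np
from scipy.ndimage import rotate

def project(image, theta_deg, sigma=0.0, rng=None):
    """Sketch of eq. (1): rotate the image by theta (degrees), take the
    discrete line integral over y, then add white Gaussian noise."""
    rng = rng if rng is not None else np.random.default_rng()
    rotated = rotate(image, theta_deg, reshape=False, order=1)
    line = rotated.sum(axis=0)  # discrete line integral along one axis
    return line + sigma * rng.standard_normal(line.shape)

# Toy compactly supported image and a uniform PMF over N_theta = 180 bins
img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0
p = np.full(180, 1 / 180)
rng = np.random.default_rng(0)
angles = rng.choice(180, size=5, p=p)   # theta_l drawn from the (unknown) PMF
projs = np.stack([project(img, a, sigma=0.1, rng=rng) for a in angles])
```

Each row of `projs` is one noisy projection line $\xi_\ell$; in the paper's setting only these lines are observed, while both `img` and `p` are unknown.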
3. METHOD
Our approach involves recovering $I$ and $p$ such that the distribution of the projection lines generated from $I$ and $p$ matches the distribution of the real projection lines. To this end, we adopt an adversarial learning framework, illustrated in Fig. 1.

Our adversarial learning approach consists of a discriminator $D_\phi$ and a generator $G$. Unlike classic GAN models, we replace the generator $G$ by the a-priori known forward model defined in (1). The generator's goal is to output projection lines that match the distribution of the real projection dataset $\{\xi^\ell_{\mathrm{real}}\}_{\ell=1}^{L}$ and fool the discriminator. For our model, the unknowns we seek to estimate at the generator side are the image $I$ and the projection angle distribution $p$. On the other hand, the discriminator $D_\phi$, parameterized by $\phi$, tries to distinguish between real and fake projections.

Algorithm 1 UVTomo-GAN
Require: $\alpha_\phi$, $\alpha_I$, $\alpha_p$: learning rates for $\phi$, $I$ and $p$; $n_{\mathrm{disc}}$: the number of iterations of the discriminator (critic) per generator iteration; $\gamma^I_{TV}$, $\gamma^I_{\ell}$, $\gamma^p_{TV}$, $\gamma^p_{\ell}$: the weights of the total variation and $\ell$-regularizations for $I$ and $p$.
Require: Initialize $I$ randomly and $p$ with $\mathrm{Unif}(0, \pi)$.
Output: Estimates of $I$ and $p$ given $\{\xi^\ell_{\mathrm{real}}\}_{\ell=1}^{L}$.
1: while $\phi$ has not converged do
2:   for $t = 0, \ldots, n_{\mathrm{disc}}$ do
3:     Sample a batch from the real data, $\{\xi^b_{\mathrm{real}}\}_{b=1}^{B}$
4:     Sample a batch of simulated projections using the estimated $I$ and $p$, i.e. $\{\xi^b_{\mathrm{syn}}\}_{b=1}^{B}$ where $\xi^b_{\mathrm{syn}} = P_{\theta} I + \varepsilon_b$, $\varepsilon_b \sim \mathcal{N}(0, \sigma^2)$
5:     Generate interpolated samples $\{\xi^b_{\mathrm{int}}\}_{b=1}^{B}$, $\xi^b_{\mathrm{int}} = \alpha\, \xi^b_{\mathrm{real}} + (1 - \alpha)\, \xi^b_{\mathrm{syn}}$ with $\alpha \sim \mathrm{Unif}(0, 1)$
6:     Update the discriminator using gradient ascent steps, using the gradient of (3) with respect to $\phi$.
7:   end for
8:   Sample a batch of $\{r_{i,b}\}_{b=1}^{B}$ using (7)
9:   Update $I$ and $p$ using gradient descent steps, taking the gradients of the following with respect to $I$ and $p$: $\mathcal{L}(I, p) = \mathcal{L}_G(I, p) + \gamma^I_{TV}\, TV(I) + \gamma^I_{\ell}\, \|I\| + \gamma^p_{TV}\, TV(p) + \gamma^p_{\ell}\, \|p\|$
10: end while

Similar to [17], we choose Wasserstein GAN [19] with gradient penalty (WGAN-GP) [20]. Our loss function and the mini-max objective for $I$, $p$ and $\phi$ are defined as,

$$\mathcal{L}(I, p, \phi) = \sum_{b=1}^{B} D_\phi(\xi^b_{\mathrm{real}}) - D_\phi(\xi^b_{\mathrm{syn}}) + \lambda \left( \|\nabla_{\xi} D_\phi(\xi^b_{\mathrm{int}})\|_2 - 1 \right)^2 \quad (3)$$

$$\widehat{I}, \widehat{p} = \arg\min_{I, p} \max_{\phi} \mathcal{L}(I, p, \phi), \quad (4)$$

where $\mathcal{L}$ denotes the loss as a function of $I$, $p$ and $\phi$, and $B$ and $b$ denote the batch size and the index of a sample in the batch, respectively. Also, $\xi_{\mathrm{real}}$ marks the real projections, while $\xi_{\mathrm{syn}}$ are the synthetic projections from the estimated image $\widehat{I}$ and projection distribution $\widehat{p}$, with $\xi_{\mathrm{syn}} = P_\theta \widehat{I} + \varepsilon$, $\theta \sim \widehat{p}$ and $\varepsilon \sim \mathcal{N}(0, \sigma^2)$. Note that the last term in (3) is the gradient penalty with weight $\lambda$; it roots from the Lipschitz continuity constraint in a WGAN setup. We use $\xi_{\mathrm{int}}$ to denote a linearly interpolated sample between a real and a synthetic projection line, i.e. $\xi_{\mathrm{int}} = \alpha\, \xi_{\mathrm{real}} + (1 - \alpha)\, \xi_{\mathrm{syn}}$, $\alpha \sim \mathrm{Unif}(0, 1)$. Note that (4) is a min-max problem. We optimize (4) by alternating updates between $\phi$ and the generator's variables, i.e. $I$ and $p$, based on the associated gradients.

Given $D_\phi$, the loss that is optimized at the generator is,

$$\mathcal{L}_G(I, p) = -\sum_{b=1}^{B} D_\phi(P_{\theta_b} I + \varepsilon_b), \quad \theta_b \sim p. \quad (5)$$
Notice that (5) is a differentiable function with respect to $I$. However, it involves sampling $\theta_b$ based on the distribution $p$, which is non-differentiable with respect to $p$. Thus, the main question that we ask here is: what is an alternative approximation for (5) that is a differentiable function of $p$? To answer this question, we first discretize the support of the projection angles, i.e. $[0, \pi]$, uniformly into $N_\theta$ bins.
Fig. 2. Examples of clean (red) and noisy (blue) projection lines (panels: Phantom, Lung) for the experiments with SNR = 1 in Fig. 4.

Fig. 3. Comparison between the ground truth sample distribution of the projection angles (red) and the one estimated by our method (blue), for (a) Phantom-SNR ∞, (b) Lung-SNR ∞, (c) Phantom-SNR 1, (d) Lung-SNR 1. The settings of these experiments are the same as the ones in Fig. 4.

Therefore, $p$ becomes a probability mass function (PMF), represented by a vector of length $N_\theta$ where $\sum_{i=1}^{N_\theta} p_i = 1$ and $p_i \geq 0, \forall i$. This discretization has made the distribution over the projection angles discrete, or categorical. In other words, the sampled projection angles from $p$ can only belong to $N_\theta$ discrete categories. This allows us to approximate (5) using the notion of the Gumbel-softmax distribution [18] as follows,

$$\mathcal{L}_G(I, p) \approx -\sum_{b=1}^{B} \sum_{i=1}^{N_\theta} r_{i,b}\, D_\phi(P_{\theta_i} I + \varepsilon_b), \quad (6)$$

with

$$r_{i,b} = \frac{\exp\left((g_{b,i} + \log(p_i))/\tau\right)}{\sum_{j=1}^{N_\theta} \exp\left((g_{b,j} + \log(p_j))/\tau\right)}, \quad g_{b,i} \sim \mathrm{Gumbel}(0, 1), \quad (7)$$

where $\tau$ is the softmax temperature factor. As $\tau \rightarrow 0$, $r_{i,b} \rightarrow \mathrm{one\text{-}hot}(\arg\max_i [g_{b,i} + \log(p_i)])$. Furthermore, samples from the $\mathrm{Gumbel}(0, 1)$ distribution are obtained by drawing $u \sim \mathrm{Unif}(0, 1)$, $g = -\log(-\log(u))$ [18]. Note that due to the reparametrization trick applied in (6), the approximated generator's loss has a tangible gradient with respect to $p$.

We present the pseudo-code for UVTomo-GAN in Alg. 1. In all our experiments, we use a batch size of $B = 50$. We have three different learning rates for the discriminator, the image and the PMF, denoted by $\alpha_\phi$, $\alpha_I$ and $\alpha_p$. We reduce the learning rates by a fixed factor, with different schedules for the different learning rates. We use SGD as the optimizer for the discriminator and the image, with momentum, and update the PMF using gradient descent steps.
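The relaxation in (6)-(7) can be sketched in a few lines of PyTorch. In this sketch the vector `critic_vals` stands in for the critic outputs $D_\phi(P_{\theta_i} I + \varepsilon_b)$, and the bin count and temperature are illustrative toy choices, not the paper's settings:

```python
import torch

def gumbel_softmax_weights(log_p, batch_size, tau=0.5):
    """Relaxed categorical samples r_{i,b} of eq. (7), differentiable in p."""
    u = torch.rand(batch_size, log_p.shape[0]).clamp(1e-9, 1 - 1e-9)
    g = -torch.log(-torch.log(u))          # Gumbel(0, 1) via u ~ Unif(0, 1)
    return torch.softmax((g + log_p) / tau, dim=1)

# Keep the PMF non-negative and normalized by parameterizing it as a softmax
logits = torch.zeros(8, requires_grad=True)    # N_theta = 8 bins (toy)
log_p = torch.log_softmax(logits, dim=0)
r = gumbel_softmax_weights(log_p, batch_size=4, tau=0.5)

critic_vals = torch.linspace(0.0, 1.0, 8)      # stand-in for D_phi(P_theta_i I + eps_b)
loss_G = -(r * critic_vals).sum()              # approximation (6)
loss_G.backward()                              # gradient now reaches the PMF parameters
```

Each row of `r` sums to one and concentrates on a single bin as `tau` shrinks; crucially, `logits.grad` is populated after the backward pass, which is exactly what the non-differentiable sampling in (5) could not provide.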
We clip the gradients of the discriminator and the image and normalize the gradients of the PMF. Following common practice, we train the discriminator $n_{\mathrm{disc}} = 4$ times per update of $I$ and $p$. We discretize the domain of the projection angle, i.e. $[0, \pi]$, by roughly $d$ equal-sized bins, where $d$ is the image size.

Due to the structure of the underlying images, we add $\ell$ and TV regularization terms for the image, with weights $\gamma^I_{\ell}$ and $\gamma^I_{TV}$. Furthermore, we assume that the unknown PMF is a piece-wise smooth function of the projection angles (a valid assumption especially in single particle analysis in cryo-electron microscopy [21]), therefore we add $\ell$ and TV regularization terms for the PMF as well, with weights $\gamma^p_{\ell}$ and $\gamma^p_{TV}$.

Our default architecture for the discriminator consists of five fully connected (FC) layers. We choose ReLU [22] as the activation function. To impose the non-negativity constraint on the image, we set $I$ to be the output of a ReLU layer. In addition, to enforce the PMF to have non-negative values while summing up to one, we set it to be the output of a
Softmax layer. Our implementation is in PyTorch and we use the ASTRA toolbox [23] to define the tomographic projection operator.
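The critic update in step 6 of Alg. 1 follows the standard WGAN-GP recipe. The following is a minimal PyTorch sketch of that objective, written in the usual minimization form; the tiny FC critic and the batch/line sizes are illustrative assumptions, not the paper's exact architecture:

```python
import torch

def critic_loss(D, real, fake, lam=10.0):
    """Standard WGAN-GP critic objective (minimization form): push D(real) up,
    D(fake) down, and keep ||grad D|| close to 1 on interpolated samples."""
    alpha = torch.rand(real.shape[0], 1)                       # alpha ~ Unif(0, 1)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    grads, = torch.autograd.grad(D(interp).sum(), interp, create_graph=True)
    gp = ((grads.norm(2, dim=1) - 1.0) ** 2).mean()            # gradient penalty
    return (D(fake) - D(real)).mean() + lam * gp

# Toy FC critic over 1D projection lines of length 16 (illustrative sizes)
D = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
real, fake = torch.randn(8, 16), torch.randn(8, 16)
loss = critic_loss(D, real, fake)
loss.backward()   # gradients on phi for the discriminator step
```

Here `create_graph=True` is what lets the penalty on the critic's input gradient itself be differentiated with respect to $\phi$ during the backward pass.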
4. EXPERIMENTAL RESULTS
We use two different images, a Shepp-Logan phantom and a biomedical image of lungs, in our experiments. We refer to these as the phantom and lung images throughout this section. We discretize the projection angle domain $[0, \pi]$ with equal-sized bins and generate a random piece-wise smooth $p$. We use this PMF to generate the projection dataset following (1). We test our approach in a no-noise regime (i.e. $\sigma = 0$) and a noisy case where the signal-to-noise ratio (SNR) of the projection lines is 1. For the experiments with the noisy phantom image, we use a smaller discriminator network, as it leads to improved reconstruction compared to the default architecture. For all experiments, the number of projection lines is $L = 20{,}000$. To assess the quality of the reconstruction, we use peak signal-to-noise ratio (PSNR) and normalized cross correlation (CC); the higher these metrics, the better the quality of the reconstruction. We use the total variation distance (TV) to evaluate the quality of the recovered PMF compared to the ground truth.

We compare the results of UVTomo-GAN with unknown PMF against four baselines: 1) UVTomo-GAN with known PMF, 2) UVTomo-GAN with unknown PMF but fixed to a uniform distribution during training, 3) TV-regularized convex optimization, 4) expectation-maximization (EM). In the first baseline, similar to [17], we assume that the ground truth PMF of the projection angles is given; thus, in Alg. 1, we no longer update $p$ (step 9). In the second baseline, we also do not update the PMF and during training assume that it is a uniform distribution. In the third baseline, we assume that the angles associated with the projection lines are known, so we formulate the reconstruction problem as a TV-regularized optimization solved using the alternating direction method of multipliers (ADMM) [24] and implemented using GlobalBioIm [25]. In the fourth baseline, unlike the third one, we do not know the projection angles.
Thus, we formulate the problem as a maximum-likelihood estimation and solve it via EM.

Fig. 4. Visual comparison of UVTomo-GAN with different baselines. The description of the columns: 1) ground truth image (GT), 2) TV-regularized reconstruction with known projection angles, 3) UVTomo-GAN with known $p$, 4) UVTomo-GAN with unknown $p$, 5) UVTomo-GAN with unknown $p$ but assumed to be a uniform distribution, 6) EM initialized with the low-pass filtered GT image, 7) EM with random initialization. The PSNR and CC comparing the reconstructed images to the GT are provided underneath each image. The first two rows correspond to no-noise experiments, while for the last two rows SNR = 1. Examples of projection lines for the noisy experiments are provided in Fig. 2.

Quality of reconstructed image: Figure 4 compares the results of UVTomo-GAN with unknown PMF against the ground truth image and the four baselines. Note that the results of UVTomo-GAN with unknown $p$ closely resemble those of UVTomo-GAN with known $p$, both qualitatively and quantitatively. However, with unknown $p$, the reconstruction problem is more challenging. Furthermore, we observe that with known $p$, UVTomo-GAN converges faster compared to the unknown $p$ case. Also, comparing the fourth and fifth columns in Fig. 4 shows the importance of updating $p$.
While in the second baseline the outlines of the reconstructed images are reasonable, they lack accuracy in high-level details.

Note that while the first and third baselines perform well on the reconstruction task, they have the advantage of knowing the projection angles or their distribution. Also, in our experiments we observed that EM is sensitive to initialization. The EM results provided in the sixth column of Fig. 4 are initialized with low-pass filtered versions of the ground truth images. We observed that EM fails at successful detailed reconstruction if initialized poorly (Fig. 4, last column).

Quality of reconstructed PMF: A comparison between the ground truth distribution of the projection angles and the one recovered by UVTomo-GAN with unknown PMF is provided in Fig. 3. Note that the recovered PMF matches the ground truth distribution, demonstrating the ability of our approach to recover $p$ under different distributions and noise regimes.
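The evaluation metrics used above follow their textbook definitions, which can be sketched as follows for NumPy arrays (a minimal sketch; the exact normalization conventions used in the paper may differ slightly):

```python
import numpy as np

def psnr(x, ref):
    """Peak signal-to-noise ratio in dB between a reconstruction and a reference."""
    mse = np.mean((x - ref) ** 2)
    return 10.0 * np.log10(ref.max() ** 2 / mse)

def normalized_cc(x, ref):
    """Normalized cross correlation between two images (1 means perfect match)."""
    a, b = x - x.mean(), ref - ref.mean()
    return (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())

def tv_distance(p, q):
    """Total variation distance between two PMFs (0 means identical)."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()
```

Higher PSNR and CC indicate a better image reconstruction, while a lower TV distance indicates a better PMF estimate.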
5. CONCLUSION
In this paper, we proposed an adversarial learning approach for the tomographic reconstruction problem. We assumed neither the projection angles nor the probability distribution they are drawn from is known a-priori, and we addressed the recovery of this unknown PMF alongside the image from the projection data. We formulated the reconstruction problem as a distribution matching problem, which is solved via a min-max game between a discriminator and a generator. While updating the generator (i.e. the signal and the PMF), to enable gradient backpropagation through the sampling operator, we use the Gumbel-softmax approximation of samples from a categorical distribution. Numerical results demonstrate the ability of our approach to accurately recover the image and the projection angle PMF.

6. COMPLIANCE WITH ETHICAL STANDARDS
This is a numerical simulation study for which no ethical approvalwas required.
7. ACKNOWLEDGEMENT
Mona Zehni and Zhizhen Zhao are partially supported by NSF DMS-1854791, NSF OAC-1934757, and Alfred P. Sloan Foundation.
8. REFERENCES

[1] H. Stark, J. Woods, I. Paul, and R. Hingorani, "Direct Fourier reconstruction in computer tomography," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 29, no. 2, pp. 237–245, 1981.
[2] C. Gong and L. Zeng, "Adaptive iterative reconstruction based on relative total variation for low-intensity computed tomography," Signal Processing, vol. 165, pp. 149–162, 2019.
[3] B. B. Cheikh, E. Baudrier, and G. Frey, "A tomographical reconstruction method from unknown direction projections for 2D gray-level images," Pattern Recognition Letters, vol. 86, pp. 49–55, 2017.
[4] B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen, "Image reconstruction by domain-transform manifold learning," Nature, vol. 555, no. 7697, pp. 487–492, 2018.
[5] Y. Ge, T. Su, J. Zhu, X. Deng, Q. Zhang, J. Chen, Z. Hu, H. Zheng, and D. Liang, "ADAPTIVE-NET: deep computed tomography reconstruction network with analytical domain transformation knowledge," Quantitative Imaging in Medicine and Surgery, vol. 10, no. 2, 2020.
[6] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, "Deep convolutional neural network for inverse problems in imaging," IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
[7] T. M. Quan, T. Nguyen-Duc, and W. Jeong, "Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1488–1497, 2018.
[8] H. Chen, Y. Zhang, M. K. Kalra, F. Lin, Y. Chen, P. Liao, J. Zhou, and G. Wang, "Low-dose CT with a residual encoder-decoder convolutional neural network," IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2524–2535, 2017.
[9] Y. Han and J. C. Ye, "Framing U-Net via deep convolutional framelets: Application to sparse-view CT," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1418–1429, 2018.
[10] E. Kang, W. Chang, J. Yoo, and J. C. Ye, "Deep convolutional framelet denoising for low-dose CT via wavelet residual network," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1358–1369, 2018.
[11] Q. Yang, P. Yan, Y. Zhang, H. Yu, Y. Shi, X. Mou, M. K. Kalra, Y. Zhang, L. Sun, and G. Wang, "Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1348–1357, 2018.
[12] J. Dong, J. Fu, and Z. He, "A deep learning reconstruction framework for X-ray computed tomography with incomplete data," PLOS ONE, vol. 14, p. e0224426, 2019.
[13] Z. Li, A. Cai, L. Wang, W. Zhang, C. Tang, L. Li, N. Liang, and B. Yan, "Promising generative adversarial network based sinogram inpainting method for ultra-limited-angle computed tomography imaging," Sensors, vol. 19, no. 18, p. 3941, 2019.
[14] J. Adler and O. Öktem, "Solving ill-posed inverse problems using iterative deep neural networks," Inverse Problems, vol. 33, 2017.
[15] J. Adler and O. Öktem, "Learned primal-dual reconstruction," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1322–1332, 2018.
[16] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, Eds., pp. 2672–2680, Curran Associates, Inc., 2014.
[17] H. Gupta, M. T. McCann, L. Donati, and M. Unser, "CryoGAN: A new reconstruction paradigm for single-particle cryo-EM via deep adversarial learning," bioRxiv, 2020.
[18] E. Jang, S. Gu, and B. Poole, "Categorical reparameterization with Gumbel-Softmax," in ICLR, 2017.
[19] M. Arjovsky, S. Chintala, and L. Bottou, "Wasserstein generative adversarial networks," in Proceedings of the 34th International Conference on Machine Learning, 2017, pp. 214–223.
[20] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, "Improved training of Wasserstein GANs," in Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17), Red Hook, NY, USA, 2017, pp. 5769–5779, Curran Associates Inc.
[21] A. Punjani, M. A. Brubaker, and D. J. Fleet, "Building proteins in a day: Efficient 3D molecular structure estimation with electron cryomicroscopy," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 706–718, 2017.
[22] B. Xu, N. Wang, T. Chen, and M. Li, "Empirical evaluation of rectified activations in convolutional network," arXiv preprint arXiv:1505.00853, 2015.
[23] W. V. Aarle, W. J. Palenstijn, J. Cant, E. Janssens, F. Bleichrodt, A. Dabravolski, J. Beenhouwer, K. Joost Batenburg, and J. Sijbers, "Fast and flexible X-ray tomography using the ASTRA toolbox," Opt. Express, vol. 24, no. 22, pp. 25129–25147, 2016.
[24] S. Boyd, N. Parikh, and E. Chu, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, Now Publishers Inc, 2011.
[25] E. Soubies, F. Soulez, M. T. McCann, T. Pham, L. Donati, T. Debarre, D. Sage, and M. Unser, "Pocket guide to solve inverse problems with GlobalBioIm,"