Projection Inpainting Using Partial Convolution for Metal Artifact Reduction
Lin Yuan, Yixing Huang, and Andreas Maier, Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany; Erlangen Graduate School in Advanced Optical Technologies (SAOT), 91058 Erlangen, Germany
Abstract.
In computed tomography, the presence of metal implants in the patient body causes metal artifacts in reconstructed images. To reduce metal artifacts, metals are typically removed in projection images, and the metal corrupted projection areas then need to be inpainted. For deep learning inpainting methods, convolutional neural networks (CNNs) such as the U-Net are widely used. However, such CNNs compute convolutional filter responses on both valid and corrupted pixel values, resulting in unsatisfactory image quality. In this work, partial convolution, which relies only on valid pixel values, is applied for projection inpainting. The U-Net with partial convolution and the U-Net with conventional convolution are compared for metal artifact reduction. Our experiments demonstrate that the U-Net with partial convolution is able to inpaint the metal corrupted areas better than that with conventional convolution.
Keywords:
Deep learning, partial convolution, metal artifact reduction
In computed tomography (CT), patients may contain metal implants such as dental fillings, spine screws, surgical clips or artificial hips. In this situation, reconstructed images suffer from metal artifacts, typically in the form of strong bright and dark streaks. These artifacts are caused by various effects, most prominently beam hardening, noise, scattering and the nonlinear partial volume effect. Metals have much higher attenuation values than body tissues, leading to a severe beam hardening effect [1]. X-ray photon flux follows a Poisson distribution; due to the high absorption of photons by metals, a low photon-count X-ray beam causes relatively high Poisson noise and detector electronic noise. In addition, metal implants usually have well defined boundaries, causing the nonlinear partial volume effect [2].

Various metal artifact reduction (MAR) algorithms have been proposed [3]. Among them, projection inpainting methods are the most common, such as linear interpolation [4], polynomial interpolation [5], wavelet domain interpolation [6] and sinusoidal curve fitting [7]. Reprojection methods [8-12] are also widely used for the inpainting of metal corrupted projections, where the metal region in an initial reconstruction is replaced by a tissue-class model, most frequently soft tissue. In addition, data normalization is beneficial for projection interpolation; with this idea, the normalized MAR (NMAR) algorithms were proposed [13, 14].

Recently, powerful deep learning methods using convolutional neural networks (CNNs) have also been applied for projection inpainting in MAR applications [15-19]. However, such CNNs compute convolutional filter responses on both valid and corrupted pixel values, resulting in unsatisfactory image quality.
In [20], a partial convolution method has been proposed for image inpainting in the field of computer vision, where the convolution operations rely only on valid pixels, given valid pixel masks. With sufficiently many successive updates, the valid pixel region grows while the invalid blank regions gradually get inpainted. In this work, we introduce this partial convolution method for projection inpainting in the application of MAR. Note that, working independently, a multi-domain MAR method using partial convolution has already been proposed in [21].

The concept of partial convolution is essentially a general CNN with masks. A partial convolution layer consists of a partial convolution operation and a mask update function. The partial convolution operation can be expressed as [20]

x' = \begin{cases} W^T (X \odot M)\,\frac{\mathrm{sum}(\mathbf{1})}{\mathrm{sum}(M)} + b, & \text{if } \mathrm{sum}(M) > 0,\\ 0, & \text{otherwise}, \end{cases} \qquad (1)

where W represents the weights of a convolutional layer filter, b is the bias, X contains the pixels or features of the current convolution (sliding) window, and M is the corresponding binary mask composed of 0s and 1s. \odot indicates element-wise multiplication. sum(\mathbf{1})/sum(M) is a scaling factor, where \mathbf{1} has the same shape as M; it applies an appropriate scaling ratio to compensate for the varying amount of valid (unmasked) input. From this definition, we can see that the output of partial convolution depends only on the unmasked input values.

After each partial convolution, the mask is updated [20]:

m' = \begin{cases} 1, & \text{if } \mathrm{sum}(M) > 0,\\ 0, & \text{otherwise}, \end{cases} \qquad (2)

where m' is the value of the mask at the convolution output pixel. That is to say, if the convolution was able to condition its output on at least one valid input value, then we mark that location as valid [20].
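Eqns. (1) and (2) can be sketched for a single sliding window as follows. This is an illustrative NumPy sketch; the function and variable names are our own, not from the implementation of [20]:

```python
import numpy as np

def partial_conv_window(X, M, W, b):
    """Apply Eqn. (1) to one sliding window and Eqn. (2) to its mask.

    X: window pixels/features, M: binary validity mask (1 = valid),
    W: filter weights, b: bias. All arrays share the window shape.
    """
    if M.sum() > 0:
        scale = M.size / M.sum()            # sum(1) / sum(M)
        x_out = np.sum(W * (X * M)) * scale + b
        m_out = 1                           # Eqn. (2): at least one valid input
    else:
        x_out = 0.0
        m_out = 0
    return x_out, m_out
```

With a fully valid mask the scaling factor is 1 and the result equals a conventional convolution; corrupted pixel values never influence the output because they are zeroed by X ⊙ M before the weighting.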
As the number of network layers increases, fewer and fewer mask pixels remain 0, the valid area in the output grows larger and larger, and the influence of the mask on the overall loss becomes smaller. As a result, when the network is deep enough, all the pixels in the mask become 1 [20].

Fig. 1. Mask example.
To illustrate partial convolution, an example is displayed in Fig. 1.
Red frame:
At this time, the mask values in the kernel are all 1 (all pixels are valid and need not be filled), so the branch "if sum(M) > 0" of Eqn. (1) is executed. Since all pixels are valid here, the convolution proceeds as a conventional convolution.
Green frame:
Although the mask values of the kernel are 0 at the bottom right corner (representing a hole), we can still learn something from the nearby valid pixels with mask value 1.
Blue frame:
At this time, the mask values in the kernel are all 0 (all pixels need to be filled). This window is not processed at the beginning, until more information has been passed to the later layers.

In a partial convolutional neural network, for regions like the green frame above, the mask M gradually fills from 0 to 1 because of the rule "if sum(M) > 0" in Eqn. (2). In this way, towards the end a mask of all 1s is obtained, i.e., the entire image has been inpainted, although the intensity values need to be further improved.
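The progressive hole filling described above can be demonstrated with a toy sketch: repeated partial-convolution passes (here with uniform mean-filter weights, so each valid output pixel is simply the mean of the valid pixels in its window) shrink a hole layer by layer until the mask is all 1s. This is our own illustration, not the trained network of [20]:

```python
import numpy as np

def partial_conv_pass(img, mask, k=3):
    """One full-image partial-convolution pass with mean-filter weights.
    Zero padding carries a zero mask, so the border never leaks into sums."""
    pad = k // 2
    img_p = np.pad(img, pad)
    mask_p = np.pad(mask, pad)
    out = np.zeros_like(img, dtype=float)
    new_mask = np.zeros_like(mask)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            Mw = mask_p[i:i + k, j:j + k]
            s = Mw.sum()
            if s > 0:                         # Eqn. (1) with mean weights
                out[i, j] = (img_p[i:i + k, j:j + k] * Mw).sum() / s
                new_mask[i, j] = 1            # Eqn. (2)
    return out, new_mask

img = np.ones((9, 9))
mask = np.ones((9, 9), dtype=int)
mask[2:7, 2:7] = 0                            # a 5x5 hole
img *= mask                                   # corrupt the hole region
passes = 0
while mask.min() == 0:
    img, mask = partial_conv_pass(img, mask)
    passes += 1
# each 3x3 pass grows the valid region inward by one pixel,
# so the 5x5 hole closes after three passes
```

Because every valid pixel in this toy image has value 1, the mean of valid neighbours is again 1, so the inpainted values match the surroundings exactly once the mask is full.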
In this work, the U-Net [22] with conventional convolutions and with partial convolutions are compared for projection inpainting in the application of MAR. The general U-Net architecture consists of two symmetrical parts: the first part of the network is the same as an ordinary convolutional network, using 3×3 convolutions and pooled down-sampling to capture the relationships between pixels in the image; the latter part of the network is basically symmetrical to the first, but with skip connections for multi-scale feature extraction.
Fig. 2.
The architecture of the U-Net with partial convolution.
The architecture of the U-Net with partial convolution is displayed in Fig. 2, where masks are introduced into the original U-Net architecture. The left part includes five partial convolutional layers, PCONV1-PCONV5, responsible for image encoding, and the right part includes five partial convolutional layers, PCONV6-PCONV10, responsible for decoding.
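How the mask travels through such an encoder-decoder can be sketched at the shape level: a strided partial convolution downsamples the image and its mask together, and nearest-neighbour upsampling restores the resolution before the skip-connection concatenation. This is a toy illustration with uniform weights and our own function names, not the actual PCONV1-PCONV10 layers:

```python
import numpy as np

def partial_conv_down(img, mask, k=3, stride=2):
    """Strided partial convolution with mean weights: one encoder step.
    Returns the downsampled feature map and the updated, downsampled mask."""
    pad = k // 2
    img_p = np.pad(img, pad)
    mask_p = np.pad(mask, pad)
    h = (img.shape[0] + stride - 1) // stride
    w = (img.shape[1] + stride - 1) // stride
    out = np.zeros((h, w))
    m_out = np.zeros((h, w), dtype=int)
    for oi in range(h):
        for oj in range(w):
            i, j = oi * stride, oj * stride
            Mw = mask_p[i:i + k, j:j + k]
            if Mw.sum() > 0:
                out[oi, oj] = (img_p[i:i + k, j:j + k] * Mw).sum() / Mw.sum()
                m_out[oi, oj] = 1
    return out, m_out

def upsample(x, factor=2):
    """Nearest-neighbour upsampling used on the decoder path before the
    skip connection concatenates encoder features (and masks)."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)
```

A 16×16 input with a hole yields an 8×8 feature map and mask after one encoder step, and upsampling brings them back to 16×16 for concatenation with the corresponding encoder output.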
Fig. 3.
Training process.
The flowchart of our experimental setup is shown in Fig. 3, which consists ofthe following steps.
Data generation:
We generate original metal-free projections by forward projection of 18 patients' CT volumes in CONRAD [23] using an angular step of 1°. For metal corrupted projections, we randomly generate metals of different sizes and positions in the patient CT volumes and forward project the volumes with metals. The masks are obtained by forward projection of the metals only. With the above procedures, we have the original projection images, the masks and the projections with metals, as shown in Fig. 4.

Fig. 4.
The generation of metal corrupted projections using an original metal-free projection and a mask.
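The generation step of Fig. 4 can be sketched as follows: since projections are line integrals of attenuation, the metal-only projection simply adds onto the metal-free projection, and the binary mask marks wherever the metal trace is non-zero. This is a schematic sketch with our own names; the actual data were produced with CONRAD:

```python
import numpy as np

def make_training_triple(clean_proj, metal_proj):
    """Build (input, mask, label) from a metal-free projection and the
    forward projection of the metal-only volume."""
    hole = (metal_proj > 0).astype(np.uint8)   # metal trace -> hole region
    corrupted = clean_proj + metal_proj        # line integrals are additive
    valid_mask = 1 - hole                      # 1 = valid, 0 = corrupted
    return corrupted, valid_mask, clean_proj
```

Outside the metal trace the corrupted projection equals the clean one, so the network only ever needs to synthesize content inside the hole.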
Training data:
We used the pytorch-inpainting-with-partial-conv code on GitHub [24]. On this basis, we made several modifications for our application. The images are scaled to 512 × 512. Because the original code runs slowly, we changed the batch size to 2. The training process is shown in Fig. 5. The input is a metal corrupted projection with its corresponding mask. After partial convolution, the output is obtained and compared with the original image to compute the loss. We use the same loss function as that in [20]. The U-Net with conventional convolution is also applied to inpaint metal corrupted projections; its process is shown in Fig. 6.
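The per-pixel terms of the loss from [20] can be sketched as below: an L1 term on the known (valid) pixels and an L1 term on the hole pixels. The perceptual, style and total-variation terms of [20] are omitted here, and the normalization is simplified to per-region averages:

```python
import numpy as np

def pixel_losses(output, target, valid_mask):
    """L_valid and L_hole: L1 losses restricted to valid and hole pixels."""
    hole = 1 - valid_mask
    l_valid = np.abs(valid_mask * (output - target)).sum() / max(valid_mask.sum(), 1)
    l_hole = np.abs(hole * (output - target)).sum() / max(hole.sum(), 1)
    return l_valid, l_hole
```

Weighting L_hole more strongly than L_valid, as done in [20], pushes the network to focus on the corrupted region while keeping the known pixels close to the input.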
Projection inpainting:
Use the trained model to generate metal-free projections for all 360° projections.

Image reconstruction: CT images are reconstructed from the inpainted projections using the FDK algorithm.
Fig. 5. Training process of partial convolution.

Fig. 6. Training process of conventional convolution.

The inpainting results of one projection are displayed in Fig. 7. Fig. 7(a) is the original image. The red arrows in Figs. 7(b) and (c) indicate the inpainted regions. Fig. 7(b) shows that the U-Net with conventional convolution is able to inpaint something in the metal corrupted region. However, we can observe obvious boundaries. Fig. 7(c) demonstrates that the region inpainted by the U-Net with partial convolution has superior quality, since no obvious boundaries are observed. This demonstrates the benefit of partial convolution for projection inpainting.

The reconstruction results from inpainted projections are displayed in Fig. 8. Fig. 8(a) is the ground truth image. Fig. 8(b) shows the direct reconstruction from metal corrupted projections; the metal is present in the image with severe radial streak artifacts. Fig. 8(c) is the reconstruction from inpainted projections by the U-Net with conventional convolution, where most streak artifacts are reduced very well. However, at the position of the metal, the boundary of the metal is presented with artifacts. This is because the boundary areas in the projections are not well inpainted, as displayed in Fig. 7(b). Fig. 8(d) shows the reconstruction from inpainted projections by the U-Net with partial convolution. The radial streak artifacts are very well reduced. Moreover, at the position of the metal, the remaining artifacts are much fewer than those in (c). This demonstrates the advantage of partial convolution over conventional convolution in projection inpainting in the application of MAR.
Fig. 7.
The results of the inpainted projection (90th projection): (a) original image; (b) inpainted projection by the U-Net with conventional convolution; (c) inpainted projection by the U-Net with partial convolution.
Fig. 8.
The results of one example slice reconstructed from inpainted projections: (a) ground truth slice; (b) reconstruction directly from metal corrupted projections (with metal in the projections); (c) reconstruction from inpainted projections by the U-Net with conventional convolution; (d) reconstruction from inpainted projections by the U-Net with partial convolution.
In this paper, we investigate the application of partial convolution to projection inpainting for MAR. Compared with the U-Net with conventional convolution, the U-Net with partial convolution is able to inpaint the metal corrupted areas better, especially at the boundary areas. As a result, in the reconstruction from inpainted projections by the U-Net with partial convolution, radial streak artifacts are well reduced and the structures near the metal position are well preserved.
References
1. R. A. Brooks and G. Di Chiro, "Beam hardening in x-ray reconstructive tomography," Phys. Med. Biol., vol. 21, no. 3, p. 390, 1976.
2. G. Glover and N. Pelc, "Nonlinear partial volume artifacts in x-ray computed tomography," Med. Phys., vol. 7, no. 3, pp. 238-248, 1980.
3. L. Gjesteby, B. De Man, Y. Jin, H. Paganetti, J. Verburg, D. Giantsoudi, and G. Wang, "Metal artifact reduction in CT: where are we after four decades?" IEEE Access, vol. 4, pp. 5826-5849, 2016.
4. W. A. Kalender, R. Hebel, and J. Ebersberger, "Reduction of CT artifacts caused by metallic implants," Radiol., vol. 164, no. 2, pp. 576-577, 1987.
5. J. Wei, L. Chen, G. A. Sandison, Y. Liang, and L. X. Xu, "X-ray CT high-density artefact suppression in the presence of bones," Phys. Med. Biol., vol. 49, no. 24, p. 5407, 2004.
6. S. Zhao, D. Robertson, G. Wang, B. Whiting, and K. T. Bae, "X-ray CT metal artifact reduction using wavelets: an application for imaging total hip prostheses," IEEE Trans. Med. Imaging, vol. 19, no. 12, pp. 1238-1247, 2000.
7. J. J. Liu, S. R. Watt-Smith, and S. M. Smith, "Metal artifact reduction for CT based on sinusoidal description," J. Xray Sci. Technol., vol. 13, no. 2, pp. 85-96, 2005.
8. C. R. Crawford, J. G. Colsher, N. J. Pelc, and A. H. Lonn, "High speed reprojection and its applications," in Medical Imaging II, vol. 914, 1988, pp. 311-318.
9. R. Naidu, I. Bechwati, S. S. Karimi, S. Simanovsky, and C. R. Crawford, "Method of and system for reducing metal artifacts in images generated by X-ray scanning devices," US Patent 6,721,387, 2004.
10. M. Bal and L. Spies, "Metal artifact reduction in CT using tissue-class modeling and adaptive prefiltering," Med. Phys., vol. 33, no. 8, pp. 2852-2859, 2006.
11. D. Prell, Y. Kyriakou, M. Beister, and W. A. Kalender, "A novel forward projection-based metal artifact reduction method for flat-detector computed tomography," Phys. Med. Biol., vol. 54, no. 21, p. 6575, 2009.
12. S. Karimi, P. Cosman, C. Wald, and H. Martz, "Segmentation of artifacts and anatomy in CT metal artifact reduction," Med. Phys., vol. 39, no. 10, pp. 5857-5868, 2012.
13. E. Meyer, R. Raupach, M. Lell, B. Schmidt, and M. Kachelrieß, "Normalized metal artifact reduction (NMAR) in computed tomography," Med. Phys., vol. 37, no. 10, pp. 5482-5493, 2010.
14. E. Meyer, R. Raupach, B. Schmidt, A. H. Mahnken, and M. Kachelrieß, "Adaptive normalized metal artifact reduction (ANMAR) in computed tomography," in IEEE Nucl. Sci. Symp. Conf. Rec., 2011, pp. 2560-2565.
15. L. Gjesteby, Q. Yang, Y. Xi, H. Shan, B. Claus, Y. Jin, B. De Man, and G. Wang, "Deep learning methods for CT image-domain metal artifact reduction," in Developments in X-Ray Tomography XI, vol. 10391, 2017, p. 103910W.
16. Y. Zhang and H. Yu, "Convolutional neural network based metal artifact reduction in x-ray computed tomography," IEEE Trans. Med. Imaging, vol. 37, no. 6, pp. 1370-1381, 2018.
17. M. U. Ghani and W. C. Karl, "Deep learning based sinogram correction for metal artifact reduction," Electron. Imaging, vol. 2018, no. 15, pp. 472-1, 2018.
18. ——, "Fast enhanced CT metal artifact reduction using data domain deep learning," IEEE Trans. Comput. Imaging, 2019.
19. T. M. Gottschalk, B. W. Kreher, H. Kunze, and A. Maier, "Deep learning based metal inpainting in the projection domain: Initial results," in Workshop MLMIR, 2019, pp. 125-136.
20. G. Liu, F. A. Reda, K. J. Shih, T.-C. Wang, A. Tao, and B. Catanzaro, "Image inpainting for irregular holes using partial convolutions," in Proc. ECCV, 2018, pp. 85-100.
21. A. Pimkin, A. Samoylenko, N. Antipina, A. Ovechkina, A. Golanov, A. Dalechina, and M. Belyaev, "Multi-domain CT metal artifacts reduction using partial convolution based inpainting," arXiv preprint, 2019.
22. O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," Proc. MICCAI, pp. 234-241, 2015.
23. A. Maier, H. Hofmann, M. Berger, P. Fischer, C. Schwemmer, H. Wu, K. Müller, J. Hornegger, J. Choi, C. Riess, A. Keil, and R. Fahrig, "CONRAD - a software framework for cone-beam imaging in radiology,"