Deep-Learning Driven Noise Reduction for Reduced Flux Computed Tomography
Khalid L. Alsamadony, Ertugrul U. Yildirim, Guenther Glatz*, Umair bin Waheed, Sherif M. Hanafy
* Corresponding author. E-mail: [email protected]
Abstract
Deep neural networks have received considerable attention in clinical imaging, particularly with respect to the reduction of radiation risk. Lowering the radiation dose by reducing the photon flux inevitably results in the degradation of the scanned image quality. Thus, researchers have sought to exploit deep convolutional neural networks (DCNNs) to map low-quality, low-dose images to higher-dose, higher-quality images, thereby minimizing the associated radiation hazard. Conversely, computed tomography (CT) measurements of geomaterials are not limited by the radiation dose. In contrast to the human body, however, geomaterials may be comprised of high-density constituents causing increased attenuation of the X-Rays. Consequently, higher dosage images are required to obtain an acceptable scan quality. The problem of prolonged acquisition times is particularly severe for micro-CT based scanning technologies. Depending on the sample size and exposure time settings, a single scan may require several hours to complete. This is of particular concern if phenomena with an exponential temperature dependency are to be elucidated. A process may happen too fast to be adequately captured by CT scanning. To address the aforementioned issues, we apply DCNNs to improve the quality of rock CT images and reduce exposure times by more than 60%, simultaneously. We highlight current results based on micro-CT derived datasets and apply transfer learning to improve DCNN results without increasing training time. The approach is applicable to any computed tomography technology. Furthermore, we contrast the performance of the DCNN trained by minimizing different loss functions such as mean squared error and structural similarity index.
1. Introduction
Computed tomography has been recognized as an indispensable technology not only in the health care domain but also with respect to industrial applications like reverse engineering (Bartscher et al., 2006; Bauer et al., 2019), flaw detection (He et al., 2014), and metrology, to name a few (De Chiffre et al., 2014; du Plessis et al., 2016). The non-destructive nature of CT scanning has also proven to be tremendously valuable in the case of geomaterials, making it possible to elucidate transport phenomena in porous media, visualize deformation and strain localization in soils, rocks or sediments, or perform fracture and damage assessment in asphalt, cement and concrete (Alshibli and Reed, 2010). The three-dimensional data obtained helps to better inform numerical models, improving their predictive power, and enables delineation of physical properties of the specimen under investigation. Both qualities are particularly valued in the area of digital rock physics (Berg et al., 2017; Alqahtani et al., 2020).
Preprint submitted to Engineering Applications of Artificial Intelligence, January 20, 2021 (arXiv, eess.IV).
The technology has, however, shortcomings, in particular with respect to monitoring dynamic processes. The limitation of prolonged acquisition times is distinctively more severe in case of micro-CT (µ-CT) technologies where it may take several hours for a scan to complete. During acquisition, the object should not physically change, or change as little as possible, to allow for meaningful reconstruction of the sinograms. Medical CT scanners, per design, offer significantly shorter acquisition times, mere minutes depending on the sample size. Certain experiments, though, stand to benefit greatly from increased exposure times as noise is decreased. The noise reduction gives rise to better statistics if, for example, porosity is to be estimated in combination with a non-wetting radio contrast agent (Glatz et al., 2016). Similarly, core flood experiments necessitate the presence of a vessel to maintain temperatures and pressures, resulting in attenuation of the X-Rays. Again, prolonged acquisition times, in combination with high tube voltages and currents, yield a better image quality.
From experience, scanning of a 1-inch long rock specimen using a medical CT scanner operating at maximum tube voltage (from 140 kV to 170 kV), current (200 mA), and exposure time (four seconds per slice) requires up to 30 minutes between scans to allow for the X-Ray tube to cool down. Conversely, certain reactive processes happen rather rapidly, mandating low exposure times if the dynamics are to be captured. This is particularly true for high-temperature experiments given the exponential dependency of the reaction rate on temperature (Glatz et al., 2018; Boigné et al., 2020).
The medical domain of low dose computed tomography (LDCT) seeks to reduce the exposure time in an effort to minimize the radiation risk (McCollough et al., 2009). Lowering the flux by reducing the exposure time, tube peak voltages, and currents will inevitably decrease the image quality and, thereby, the diagnostic value (Goldman, 2007). Photon emission from the X-Ray source is modeled as a Poisson process (Macovski, 1983) and photon starvation at the detector gives rise to Poisson noise (Barrett and Keat, 2004; Gravel et al., 2004). Additional noise is introduced during the quantization of the signal and in the form of electronic noise (Diwakar and Kumar, 2018). Naturally, researchers sought to reduce artifacts employing improved algorithms during the reconstruction of the 3D data from the projections/sinograms (Willemink and Noël, 2019) or post-reconstruction. Generally, the latter approach is more common given that the raw CT data is often not accessible, especially in the case of a medical CT system (Nishio et al., 2017; Chen et al., 2017b).
Conventional signal processing techniques require a good understanding of the underlying nature of noise to optimize the filter design.
Noise statistics, to guide the filter model, may be collected experimentally but, from experience, this constitutes a rather laborious process.
Recently, deep convolutional neural networks (DCNNs) have been successfully applied to map low-quality, low-dose images to higher-dose, higher-quality images (Chen et al., 2017a; Kang et al., 2017). In this paper, we seek to build on this general approach and apply it to computed tomography scanning of geomaterials for the following reasons. First, the reduction of acquisition time allows for an increase in the temporal resolution. Consequently, experiments previously deemed out of reach due to the associated dynamics can now be entertained. Second, with respect to digital rock physics, the accuracy of the estimated rock properties strongly depends on the image quality (Bazaikin et al., 2017; Liu et al., 2018; Guan et al., 2019). Third, high quality images are a prerequisite for resolution enhancement techniques (Papari et al., 2016; Wang et al., 2019; Da Wang et al., 2019). In addition, a reduction in scan time will contribute to an increase in the lifetime of the X-Ray tube (medical CT) and the filament (µ-CT), respectively.
Using artificial rock CT images obtained from simulated parallel-beam projections, Pelt et al. (2018) obtained promising results with respect to improved image quality by means of DCNNs.
The work presented in this paper seeks to extend the DCNN filtering concept, including results not only for synthetic data created using the ASTRA toolbox (van Aarle et al., 2015) but also for actual µ-CT data generated by a FEI Heliscan microCT operating with a cone-beam. Importantly, we do not only aim to reduce scanning time but seek to improve the quality of the reconstructed images compared to the high dose training images, simultaneously.
In short, in this paper, we study two deep learning architectures for improving degraded rock images resulting from reduced exposure time µ-CT scans. Furthermore, we investigate the applicability of transfer learning to minimize the number of training images needed. In addition, we explore the impact of mean-squared error (MSE) and structural similarity index measure (SSIM) loss functions on the reconstructed image quality. While both loss functions are capable of considerably improving the respective quality metrics, PSNR and SSIM, they tend to emphasize different structural features. These findings are crucial for improving digital rock physics applications where rock properties, like porosity and permeability, are to be estimated from computed tomography data only.
2. Methodology
Convolutional neural networks (CNNs) constitute a subset of artificial neural networks (ANNs), heavily relying on digital filter operations (kernel/convolution matrix), where the weights of the filters are informed during the training process to minimize a particular loss metric between the predictions and the training (true) samples. CNNs are especially suitable for computer vision applications given that the filters can capture the spatial relation between individual pixels or image elements.
Figure 1: VDSR architecture showing cascaded pairs of layers. The input is a low-resolution image, or a noisy image in our case, which goes through layers and gets transformed to a high-resolution or denoised image. The convolutional layers use 64 filters each.
The main components of CNNs are convolution layers and activation functions. Convolution layers consist of filters that slide across the input feature map (e.g., image). Each element of a filter is multiplied by the overlapping element of the input feature map, and the products are summed to yield one element of the output feature map. This operation is followed by activation functions to add non-linearity, empowering a CNN to learn the complex relationship between inputs and their corresponding labels. A common activation function is the rectified linear unit (ReLU), designed to set negative values to zero and linearly map inputs to outputs in case of positive values.
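The multiply-and-sum operation and the ReLU described above can be illustrated with a minimal sketch. It is not taken from the networks used in this work: the function names `conv2d_valid` and `relu`, the 4 × 4 test image, and the edge kernels are assumptions chosen purely for illustration, and neither padding nor stride is applied. As is common in CNN frameworks, the kernel is not flipped, so the operation is, strictly speaking, a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of rows) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Multiply each filter element by the overlapping pixel and sum.
            for a in range(kh):
                for b in range(kw):
                    acc += image[i + a][j + b] * kernel[a][b]
            row.append(acc)
        output.append(row)
    return output


def relu(feature_map):
    """Set negative values to zero and pass positive values unchanged."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical 4 x 4 image with a vertical edge (dark left, bright right).
image = [[0, 0, 1, 1] for _ in range(4)]
# Hypothetical 3 x 3 filters responding to dark-to-bright and bright-to-dark edges.
rising_edge = [[-1, 0, 1] for _ in range(3)]
falling_edge = [[1, 0, -1] for _ in range(3)]

response_rising = relu(conv2d_valid(image, rising_edge))
response_falling = relu(conv2d_valid(image, falling_edge))
print(response_rising)
print(response_falling)
```

The first filter responds to the edge, whereas the response of the second is negative everywhere and is therefore suppressed by the ReLU. In a trained CNN the filter weights are not hand-picked as here but learned from the data.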
For the transfer learning aspect of this work, we take advantage of the very deep super resolution (VDSR) architecture and the associated pretrained VDSR by Kim et al. (2016). Both were obtained from the MathWorks website (MathWorks, 2018). The VDSR architecture consists of 20 weighted convolution layers followed by ReLUs. Each convolution layer, except the final and the input layer, accommodates 64 filters of size 3 × 3. Figure 1 shows the network architecture.
The second architecture investigated constitutes a deep convolutional neural network (DCNN) based on a residual encoder/decoder architecture, known as U-Net. The encoder, or feature extractor, consists of three blocks, each comprising three consecutive convolution layers, where all layers are followed by a ReLU activation function. At the end of each block, a sample-based discretization process in the form of a max pooling operation is executed to reduce the size of the feature map. Similarly, the decoder incorporates a transposed convolution layer followed by three consecutive convolution layers, which are terminated by ReLU activation functions. In addition, skip connections between each encoder block and its corresponding decoder block are included to concatenate the output of the transposed convolution layers with the feature map from each encoder block. All convolution layers, except the last layer, are of size 3 × 3. In the encoder part, the number of filters increases for each block (32, 64, and 128), sequentially. Equivalently, in the decoder part, the number of filters decreases for each block (128, 64, and 32), sequentially. The network configuration is outlined in Figure 2.
Figure 2: Architecture of the proposed DCNN (U-Net) which is based on a residual encoder/decoder structure. Each black box represents a feature map. The number of channels is denoted at the top of each box. The input and output images have the same size (height and width) which is indicated at the sides of the first and last box. Dark green boxes represent copied feature maps from the encoder block. The arrows denote different operations.
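To make the encoder/decoder bookkeeping more tangible, the following sketch traces the number of channels and the feature map size through a three-level network with 32, 64, and 128 filters. It is a hedged illustration rather than the implementation used in this work. Several details are assumptions not stated in the text: a single-channel 512 × 512 input and single-channel output, padding that preserves the spatial size, 2 × 2 max pooling, transposed convolutions that double the size and output as many channels as the block has filters, and skip connections taken before pooling. The function name `unet_shapes` is hypothetical.

```python
def unet_shapes(size=512, filters=(32, 64, 128), in_channels=1, out_channels=1):
    """Return (stage, channels, side_length) tuples for a square input image."""
    trace = [("input", in_channels, size)]
    skips = []
    for f in filters:  # encoder blocks
        # Three 3x3 convolutions + ReLU; assumed padding keeps the size.
        trace.append((f"enc{f}_convs", f, size))
        skips.append((f, size))  # feature map copied to the decoder
        size //= 2  # assumed 2x2 max pooling halves the size
        trace.append((f"enc{f}_pool", f, size))
    for f in reversed(filters):  # decoder blocks
        size *= 2  # assumed transposed convolution doubles the size
        skip_channels, skip_size = skips.pop()
        assert skip_size == size, "skip and upsampled maps must align"
        # Concatenation stacks the upsampled map and the encoder map.
        trace.append((f"dec{f}_concat", f + skip_channels, size))
        trace.append((f"dec{f}_convs", f, size))
    trace.append(("output", out_channels, size))
    return trace


for stage, channels, side in unet_shapes():
    print(f"{stage:>14}: {channels:3d} channels, {side} x {side}")
```

Under these assumptions the smallest feature map measures 64 × 64 and the output regains the 512 × 512 size of the input, in line with the statement in the caption of Figure 2 that input and output images share the same height and width.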
The mean squared error (MSE) is a commonly used loss function for image restoration tasks and is defined as follows (Zhao et al., 2016):
$$L_{\mathrm{MSE}}(P) = \frac{1}{N} \sum_{p \in P} \left[ t(p) - r(p) \right]^2, \tag{1}$$
where p constitutes the index of the pixel in patch P, t(p) is the pixel value in the trained (predicted) patch, r(p) corresponds to the pixel value of the reference image, and N is the number of pixels in a given patch.
The structural similarity index measure (SSIM), however, is often regarded as a more pragmatic metric for evaluating image quality, particularly with respect to human visual perception (Wang et al., 2004). For a pixel p, the SSIM is given as follows:
$$\mathrm{SSIM}(p) = \frac{(2 \mu_r \mu_t + c_1)(2 \sigma_{tr} + c_2)}{(\mu_r^2 + \mu_t^2 + c_1)(\sigma_r^2 + \sigma_t^2 + c_2)}, \tag{2}$$
where µ and σ² reflect the average and variance of the training and reference patch, respectively, and σ_tr is the associated covariance. c_1 and c_2 are constants required to avoid division by a weak denominator and partially depend on the dynamic range of the pixels. Given that the SSIM ranges from -1 to 1, with 1 being indicative that the training image is identical to the reference image, the loss function needs to be written as follows:
$$L_{\mathrm{SSIM}}(P) = \frac{1}{N} \sum_{p \in P} \left[ 1 - \mathrm{SSIM}(p) \right]. \tag{3}$$
For training of the VDSR, and the pretrained VDSR, an image patch size of 41 × 41 with 128 patches per image was utilized. The Adam optimizer was configured for a learning rate of 0.0001, 5 epochs, and a mini-batch size of 32. For this case we only applied the MSE loss function as defined in Eq. 1.
Both loss functions were exploited to train the DCNN (U-Net) employing an image patch size of 512 × [...]
Using a FEI Heliscan microCT, configured to perform 1800 projections per revolution at a tube voltage of 85 kV and a current of 72 mA, two datasets were acquired at an exposure time of 0.5 seconds and 1.4 seconds, respectively. As mentioned above, an increased exposure time translates to a greater image quality given that more photons are collected at the detector, thereby decreasing the noise. Henceforth, we refer to the scans collected at 1.4 seconds as high quality images and to the 0.5 seconds data as low quality images. The scanned specimen was of carbonate origin, measuring 1.5 inches in diameter and about 2 inches in length. The sample geometry dictated a minimum voxel size of about 14 µm.
During all scans, an approximately 100 µm thick aluminum sheet was mounted at the tungsten target window to soften the X-Rays in an effort to minimize beam hardening artefacts. The amorphous-silicon, large-area, digital flat-panel detector with 3072 × [...] × 512 pixels and the gray-scale values were normalized to a range between zero and one. To a first-order approximation, the gray values may be interpreted as density values, where brighter areas are indicative of greater density and darker areas of lower density. Hence, pore space is represented by shades of black (see, e.g., Figure 3).
2.4. ASTRA Toolbox
As will be shown later, the images predicted by the network are of significantly greater quality than the training images collected at an exposure time of 1.4 seconds. Consequently, it became necessary to create an artificial case based on the images predicted by the network to verify that the architecture is indeed predicting the ground truth.
The ASTRA toolbox is an open-source software for tomographic projections and reconstruction, available for MATLAB © and Python (van Aarle et al., 2015, 2016). Throughout this work, MATLAB © [...]
1. [...] µ-CT, constituting the low quality data (see, e.g., right-hand side in Fig. 3).
2. A 1.4 seconds exposure time series obtained from the µ-CT, representing the high quality data (see, e.g., left-hand side in Fig. 3).
3. The images predicted by the network which are of greater quality compared to the training images (labels). These images serve as the ground truth.
4. An artificial low quality image series, derived from predicted images using the ASTRA toolbox, mimicking the results obtained from the µ-CT at an exposure time of 0.5 seconds.
5. An artificial higher quality image series, derived from predicted images using the ASTRA toolbox, resembling the results obtained from the µ-CT at an exposure time of 1.4 seconds.
The artificially created series was used to validate the predictive power of the trained network as detailed in Section 3.4.
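The link between exposure time and noise level that underlies the artificial low and high exposure series can be sketched with a simple photon-counting model. The snippet below does not use the ASTRA toolbox and is not the procedure applied in this work. It merely assumes that the number of incident photons scales linearly with exposure time, that attenuation follows the Beer-Lambert law, and that Poisson counting noise can be approximated by a Gaussian with equal mean and variance. The flux of 2000 photons per second per detector pixel, the line integral of 1.0, and the function names are hypothetical values chosen for illustration.

```python
import math
import random


def noisy_line_integral(line_integral, photons_in, rng):
    """Return one noisy measurement of an attenuation line integral."""
    expected = photons_in * math.exp(-line_integral)  # Beer-Lambert law
    # Gaussian approximation of Poisson noise (mean = variance = expected);
    # clipped at one photon to keep the logarithm defined.
    counts = max(1.0, rng.gauss(expected, math.sqrt(expected)))
    return -math.log(counts / photons_in)


def noise_std(exposure_s, flux_per_s=2000.0, line_integral=1.0, n=5000, seed=0):
    """Standard deviation of repeated measurements at a given exposure time."""
    rng = random.Random(seed)
    photons_in = flux_per_s * exposure_s  # assumed linear in exposure time
    samples = [noisy_line_integral(line_integral, photons_in, rng) for _ in range(n)]
    mean = sum(samples) / n
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / n)


std_low = noise_std(0.5)   # low exposure, 0.5 seconds
std_high = noise_std(1.4)  # high exposure, 1.4 seconds
print(f"noise ratio (0.5 s vs 1.4 s): {std_low / std_high:.2f}")
```

In this idealized model the noise in the measured line integral scales with the inverse square root of the photon count, so reducing the exposure time from 1.4 seconds to 0.5 seconds should raise the noise level by a factor of roughly sqrt(1.4/0.5), or about 1.7. Detector and quantization noise are ignored.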
3. Results
In this section, we benchmark the ability of the proposed DCNNs to restore low quality µ-CT images resulting from reduced exposure times. We begin by highlighting the problem and its adverse consequences on the scanned image quality. Next, we explore the applicability of transfer learning to help expedite the training of the VDSR network. Exploiting a pre-trained VDSR network, we substantiate that optimal performance can be obtained faster than relying on a randomly initialized VDSR network. In addition, we also compare the reconstruction performance of different loss functions including MSE and SSIM. Finally, we verify the efficacy of the DCNNs by testing them against simulated low and high exposure images from the ASTRA toolbox (van Aarle et al., 2015).
In the context of rock imaging, or imaging of materials in general, the reduction of exposure time offers three main advantages.
Firstly, µ-CT scanning, if offered as a commercial service, is, from experience, charged on an hourly basis ranging from hundreds to thousands of dollars per hour. Evidently, high quality scans necessitate a longer exposure time and are consequently more costly. Thus, a decrease in scan time while maintaining image quality is beneficial to both parties: it allows the provider to offer the service to the client at a reduced cost and, at the same time, increase the throughput.
Secondly, any reduction in exposure time will result in a more economic use of the filament lifetime. Generally, a single filament costs about 700 to 1,[...] µ-CT images without the need for expert knowledge with respect to filter design.
Figure 3: High exposure time, 1.4 seconds, CT image (left) and low exposure time, 0.5 seconds, CT image (right) of a carbonate rock sample where dark colors are indicative of pore space. Evidently, a reduced exposure time results in an increased noise level owing to the photon starvation at the detector.
While DCNNs have shown remarkable performance for a myriad of scientific problems, they are well-known for being data- and resource-intensive due to the large number of trainable parameters. A lack of either training samples or computational resources may hurt the performance of these networks. A pragmatic approach to address this issue is to take advantage of transfer learning, a machine learning technique seeking to apply previously gained knowledge to speed up finding the solution to a different yet related problem.
In this particular study, we explore the applicability of transfer learning using the VDSR network, as illustrated in Figure 1, and compare the reconstruction performance of the VDSR network initialized as per He et al. (2015) with a pre-trained VDSR network by minimizing the MSE as defined in Equation 1. We train both the pre-trained VDSR network and the VDSR network for a range of training set sizes, starting with 50 training images up to a maximum of 300 training images. For each particular number of training images, we measure the reconstruction performance of the two networks by comparing the average SSIM and peak signal-to-noise ratio (PSNR) values derived from 400 test images.
From Figures 4 and 5, we observe that the pre-trained VDSR network always yields a better overall reconstruction performance for a given number of training images. This holds true for both considered metrics to quantify reconstruction quality (SSIM and PSNR), demonstrating the inherent advantage of transfer learning.
Figure 4: Summary plot of the average SSIM values employing 400 test images as predicted by the pre-trained VDSR network and a VDSR initialized following the approach of He et al. (2015). From 50 to 200 training images, both networks show similar performance gains with respect to the corresponding SSIM values. After 200 training images, however, the VDSR continues to further increase the SSIM value compared to the pre-trained network. In general, however, the pre-trained VDSR yields greater SSIM values for all cases.
Figure 6 exemplifies the reconstruction quality achieved by the pre-trained VDSR network, based on 300 training images, improving the SSIM and PSNR values of the low exposure image from 0.54 and 23 dB to 0.78 and 34 dB, respectively.
Examining the predicted image in Figure 6, it should be noted that it is indeed of greater quality (less noisy) than the high quality reference image or training label. The noise is greatly reduced and the grain boundaries show a sharper delineation. This is somewhat surprising given that, from a conventional signal processing point of view, both edges and noise constitute high frequency content. Filters designed to remove high frequency content are often found to smear out edges and subtle details (Lee, 1981). Granted, median filters, or filters utilizing local statistics in general, perform well in preserving them, yet it is remarkable that the network learned to differentiate between discontinuities in the form of edges and noise. This particular aspect is addressed in more detail in Section 3.4.
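The two quality metrics quoted throughout this section can be computed as sketched below. The snippet is an illustration under stated assumptions and not the evaluation code used in this work. Images are represented as flat lists of gray values normalized to the range between zero and one. The PSNR follows the common definition 10 log10(MAX² / MSE). The hypothetical helper `ssim_global` evaluates Eq. 2 once over the entire patch, whereas SSIM is usually averaged over local windows. The constants c_1 = (0.01 L)² and c_2 = (0.03 L)², with L denoting the dynamic range, are the defaults suggested by Wang et al. (2004) and are assumed here.

```python
import math


def mse(pred, ref):
    """Mean squared error between two equally sized pixel lists (Eq. 1)."""
    return sum((t - r) ** 2 for t, r in zip(pred, ref)) / len(ref)


def psnr(pred, ref, data_range=1.0):
    """Peak signal-to-noise ratio in dB; it grows as the MSE shrinks."""
    err = mse(pred, ref)
    return math.inf if err == 0 else 10.0 * math.log10(data_range ** 2 / err)


def ssim_global(pred, ref, data_range=1.0, k1=0.01, k2=0.03):
    """Eq. 2 evaluated over a single window spanning the whole patch."""
    n = len(ref)
    mu_t, mu_r = sum(pred) / n, sum(ref) / n
    var_t = sum((t - mu_t) ** 2 for t in pred) / n
    var_r = sum((r - mu_r) ** 2 for r in ref) / n
    cov = sum((t - mu_t) * (r - mu_r) for t, r in zip(pred, ref)) / n
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    numerator = (2 * mu_r * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_r ** 2 + mu_t ** 2 + c1) * (var_r + var_t + c2)
    return numerator / denominator


# Hypothetical four-pixel patches: a reference and a perturbed version.
reference = [0.2, 0.4, 0.6, 0.8]
perturbed = [0.3, 0.3, 0.7, 0.7]
print(f"PSNR = {psnr(perturbed, reference):.1f} dB")
print(f"SSIM = {ssim_global(perturbed, reference):.3f}")
```

Because the PSNR is a monotonically decreasing function of the MSE, a network trained to minimize Eq. 1 implicitly maximizes the PSNR, while the loss in Eq. 3 corresponds to one minus the SSIM averaged over the patch.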
Figure 5: Summary plot of the average PSNR values employing 400 test images as predicted by the pre-trained VDSR network and the VDSR. From 50 to 100 training images the VDSR shows a slightly better improvement compared to the pre-trained VDSR network. This advantage diminishes as the number of training images increases. Similar to the SSIM plot shown in Figure 4, the pre-trained VDSR yields greater PSNR values for all cases.
Figure 6: From left to right: High exposure (reference) image, low exposure image (SSIM=0.54, PSNR=23 dB), and denoised image (SSIM=0.78, PSNR=34 dB) using the pre-trained VDSR network.
Proper selection of the loss, objective, or fitness function is crucial in guiding the learning process of the network. The MSE loss function, as defined in Equation 1, is a preferred metric to optimize the weights owing to its simplicity and well-behavedness with respect to gradient calculations. Notably, minimization of the MSE indirectly maximizes the PSNR.
Given the particular problem of image prediction, we seek to compare the impact of the MSE loss function against the SSIM loss function, as defined in Equation 3, on the image quality metrics PSNR and SSIM, respectively. For this purpose, we train the U-Net-derived DCNN on 1600 training images for each metric. The trained networks were benchmarked using 400 test images. In general, we obtain considerable improvements for both the SSIM and PSNR values of the reconstructed images as shown in Figures 8-11.
For the MSE optimized network, the PSNR increased, on average, from about 22.6 dB to 34.5 dB (see Figure 8), and the SSIM from 0.56 to 0.79 (see Figure 10). In case of the SSIM optimized network, the PSNR increased, on average, to 34.6 dB (see Figure 9), and the SSIM to 0.79 (see Figure 11).
Figure 8: Histogram of the PSNR values obtained for the 400 test images. “Before filtering” refers to the low exposure scans. “After filtering” refers to the DCNN (U-Net) denoised scans where the network was optimized with respect to the MSE loss function.
Figure 9: Histogram of the PSNR values obtained for the 400 test images. “Before filtering” refers to the low exposure scans. “After filtering” refers to the DCNN (U-Net) denoised scans where the network was optimized with respect to the SSIM loss function.
Figure 10: Histogram of the SSIM values obtained for the 400 test scans. “Before filtering” refers to the low exposure images. “After filtering” refers to the DCNN (U-Net) denoised scans where the network was optimized with respect to the MSE loss function.
Figure 11: Histogram of the SSIM values obtained for the 400 test scans. “Before filtering” refers to the low exposure images. “After filtering” refers to the DCNN (U-Net) denoised scans where the network was optimized with respect to the SSIM loss function.
Both loss functions perform remarkably well in restoring fine scale features, as exemplified in Figure 7, and yield similar image quality improvements. With respect to Figure 12, however, it seems they tend to emphasize different features of the data. The MSE optimized network predicts coarser grain textures and boundaries and seems to be more sensitive to fine scale pore space. Conversely, the SSIM optimized network suggests smoother textures, sharper grain boundaries and appears to be less sensitive to fine scale pore space.
Figure 7: The image on the left is an example of a low exposure time slice with the image in the center being the high exposure time equivalent. The image on the far right is the reconstruction based on the SSIM optimized DCNN (U-Net). The DCNN performs remarkably well in reconstructing fine scale features, barely visible even in case of a longer exposure time.
Figure 12: DCNN (U-Net) denoising example from the test set.
(a) A pair of images from the test data set. The image on the left is an example of a high exposure time (1.4 s) scan, the image on the right is the equivalent low exposure time (0.5 s) scan (SSIM=0.52, PSNR=22 dB).
(b) Denoising results exemplifying the performance of the DCNN (U-Net). The left image shows the prediction of the DCNN optimized with respect to the MSE (SSIM=0.77, PSNR=34 dB), the right image exhibits the prediction of the SSIM optimized network (SSIM=0.77, PSNR=34 dB). The MSE optimized network predicts coarser grain textures (greater variation in grayscale values indicative of larger variations in grain density) and boundaries and seems to be more sensitive to fine scale pore space (compare upper left quadrant of both images for the presence of fine scale pore space). Conversely, the SSIM optimized network suggests smoother textures, sharper grain boundaries and appears to be less sensitive to fine scale pore space.
Surprisingly, the quality of the predicted image is clearly superior to the quality of the long exposure time image. As mentioned before, the network is seemingly able to distinguish between high frequency noise and discontinuities in the form of edges. At this point it became necessary to verify the predictive power of the networks, and it was decided to create artificial cases where the ground truth is known. The approach is detailed in the next section.
As elaborated in the previous section, the images predicted by the DCNNs (images on the right of Figure 6, Figure 12b) are not only of superior quality compared to the low exposure images, but also exhibit less noise than their corresponding high exposure images or the training labels. As discussed in the introduction, and substantiated by Figure 12b, noise can be reduced by increasing the exposure time or flux in general. In addition, the choice of the reconstruction algorithm is also critical. Iterative reconstruction algorithms like Simultaneous Iterative Reconstruction Technique (SIRT) or Conjugate Gradient Least Squares (CGLS) are well known to suppress noise compared to classic filtered backprojection (FBP) via Feldkamp-type (FDK) reconstruction algorithms (Fleischmann and Boas, 2011; Biguri et al., 2016). The particular algorithm employed by the FEI Heliscan microCT is proprietary.
Given the surprising results, we seek to verify them by creating an artificial dataset for which the ground truth is known. For this purpose, the VDSR network’s denoised images were fed into the ASTRA toolbox to create noisy projections mimicking low and high exposure time images. Next, the projections were reconstructed using FDK, SIRT, and CGLS. As summarized in Figure 13, SIRT and CGLS performed well in removing the noise whereas FDK failed to do so. Hence, we decided to solely focus on FDK for creation of the artificial data set. Subsequently, the artificial data set was tested utilizing the trained networks.
Figure 13: From left to right: Reference image (as predicted from the network representing the ground truth), FDK reconstruction, SIRT reconstruction, and CGLS reconstruction.
Figure 14: From left to right: Reference image (as predicted from the network representing the ground truth), artificial low exposure image created via FDK, i.e., VDSR input (SSIM=0.17, PSNR=14 dB), artificial high exposure image created via FDK, i.e., VDSR label (SSIM=0.30, PSNR=21 dB), and VDSR output (SSIM=0.89, PSNR=26 dB).
Figure 14 shows results for the VDSR network trained on the artificial datasets, i.e., it was trained to map the artificial low exposure to its corresponding artificial high exposure (training example/label).
The average SSIM and PSNR values of the predicted images from the network (SSIM=0.86, PSNR=25 dB) are better than those of the artificial high (SSIM=0.28, PSNR=18 dB) and low exposure images (SSIM=0.17, PSNR=14 dB), evaluated on 200 test images, where the reference images (ground truth) have been used to calculate these values. It is, again, surprising that the output images of the network yield greater quality results compared to their training examples (artificial high exposure images). This substantiates, however, the results reported in the previous section where the ground truth was unknown.
4. Conclusions
In this work, we have successfully demonstrated the value of DCNNs to improve the quality of µ-CT scans of a carbonate rock sample. The proposed method has the potential to reduce the exposure time by more than 60% (from 1.4 seconds to 0.5 seconds) without compromising the scan quality. On the contrary, we found that the networks are able to predict images of superior quality compared to the long exposure time training images (labels). In particular, the networks are seemingly able to distinguish between unwanted high frequency content like noise, and actual high frequency features of the data like discontinuities in the form of grain boundaries. Importantly, we verified the predictive power of the networks by creating a synthetic dataset to compare against the known ground truth.
Given the substantial time requirements for training the networks, we also investigated the applicability of transfer learning. Using a pre-trained VDSR network, we found that high quality images can be obtained for a smaller number of training epochs compared to training from scratch.
Additionally, we highlighted the impact of MSE and SSIM based loss functions on the DCNN predictions. Both yield similar improvements with respect to PSNR and SSIM. They tend to, however, emphasize different structural aspects of the specimen. The MSE optimized network predicts coarser grain textures and boundaries and seems to be more sensitive to fine scale pore space. Conversely, the SSIM optimized network suggests smoother textures, sharper grain boundaries and appears to be less sensitive to fine scale pore space.
To conclude, the proposed method enables substantial savings in acquisition time while simultaneously improving the scan quality. The reduction in scan time is an important aspect if dynamic processes are to be elucidated or higher sample throughput is required. Importantly, the approach is applicable to any computed tomography technology (medical CT, µ-CT, industrial CT).
The vast improvement in image quality, without the need for expert intervention, is crucial for digital rock physics applications where rock properties like porosity and permeability are estimated solely from computed tomography data. Inevitably, the accuracy of the estimation is dictated, in part, by the scan quality.
5. Acknowledgements
We thank Dr. Jack Dvorkin for approval to access the FEI Heliscan microCT, and Mr. Nadeem Ahmed Syed and Mr. Syed Rizwanullah Hussaini for explanations of how to operate the scanner, all at the Center for Integrative Petroleum Research (CIPR)–CPG, KFUPM.
This work was supported by the Research Startup Grant no. SF20003 awarded to G.G. by the College of Petroleum Engineering and Geosciences, King Fahd University of Petroleum and Minerals.
6. Conflict of Interest
The authors declare no competing financial interest.