CoIL: Coordinate-based Internal Learning for Imaging Inverse Problems
Yu Sun, Jiaming Liu, Mingyang Xie, Brendt Wohlberg, Ulugbek S. Kamilov
Yu Sun, Jiaming Liu, Mingyang Xie, Brendt Wohlberg, and Ulugbek S. Kamilov∗

Department of Computer Science and Engineering, Washington University in St. Louis, MO 63130, USA
Department of Electrical and Systems Engineering, Washington University in St. Louis, MO 63130, USA
Los Alamos National Laboratory, Theoretical Division, Los Alamos, NM 87545, USA
∗Email: [email protected]

Abstract
We propose Coordinate-based Internal Learning (CoIL) as a new deep-learning (DL) methodology for continuous representation of measurements. Unlike traditional DL methods that learn a mapping from the measurements to the desired image, CoIL trains a multilayer perceptron (MLP) to encode the complete measurement field by mapping the coordinates of the measurements to their responses. CoIL is a self-supervised method that requires no training examples besides the measurements of the test object itself. Once the MLP is trained, CoIL generates new measurements that can be used within a majority of image reconstruction methods. We validate CoIL on sparse-view computed tomography using several widely-used reconstruction methods, including purely model-based methods and those based on DL. Our results demonstrate the ability of CoIL to consistently improve the performance of all the considered methods by providing high-fidelity measurement fields.
The problem of reconstructing an unknown image from a set of noisy measurements is fundamental to computational imaging. The task is traditionally formulated as an inverse problem and solved using model-based optimization by leveraging a forward model characterizing the imaging system and a regularizer imposing prior knowledge on the unknown image. There has been significant progress in developing sophisticated image priors, including those based on transform-domain sparsity, self-similarity, and dictionary learning [1–4].

Figure 1: Conceptual illustration of CoIL in the context of sparse-view CT. A multilayer perceptron (MLP) is used to represent the full measurement field by learning to map the measurement coordinate (θ, l) to its response r. Visual examples compare the recovered images with and without CoIL for total variation (TV). Here, CoIL is used to generate a dense set of views from a small number of noisy input views. The quantitative and visual results in this paper highlight the ability of CoIL to significantly improve the imaging quality for several widely-used image reconstruction methods.

There has been considerable recent interest in deep learning (DL) based solutions to imaging inverse problems [5–8]. The traditional DL approach involves training a convolutional neural network (CNN) to directly perform a regularized inversion of the forward model by exploiting redundancies in a training dataset [9–12]. Model-based DL is an alternative to traditional DL that explicitly uses knowledge of the forward model by integrating a CNN into model-based optimization. Two widely-used approaches in this context are plug-and-play priors (PnP) [13] and regularization by denoising (RED) [14], which have been used with pre-trained deep denoisers to achieve excellent performance in a number of imaging tasks [15–25].
An alternative model-based DL approach is deep unfolding, which interprets the iterations of a model-based optimization algorithm as the layers of a CNN and trains it end-to-end in a supervised fashion [26–36].

There has been a considerable amount of work on DL for imaging inverse problems, the unifying theme being that one can train a CNN over a dataset to represent a prior for an unknown image. In this paper, we take a fundamentally different approach by proposing a methodology for leveraging redundancy within the measurements of a single unknown image, thus requiring no training examples besides the test input itself. Our proposed Coordinate-based Internal Learning (CoIL) seeks to represent the full continuous measurement field by exploiting the internal information within the subsampled and noisy measurements. The core of CoIL is a multilayer perceptron (MLP) that maps the measurement coordinates to the corresponding sensor responses. The measurement coordinates are the parameters, corresponding to the geometry of the imaging system, that determine the response measured by the sensors. For example, in computed tomography (CT), two parameters characterizing the response are the view angle θ of the incoming ray beam and the spatial location l of the relevant detector on the sensor plane. By training the MLP on the coordinate-response pairs extracted from the measurements of a desired object, CoIL builds a continuous mapping from the coordinates to the sensor responses. Thus, the learned MLP corresponds to a neural representation of the full measurement field. By querying the MLP with the relevant coordinates, CoIL can generate the full field, which can then be used for image reconstruction. Figure 1 provides a conceptual illustration of the CoIL methodology.
Note that CoIL is not restricted to a specific image reconstruction method, but is compatible with a majority of methods, including those based on model-based optimization or DL.

The main contributions of this paper are as follows:

• We propose CoIL as the first computational imaging methodology that leverages a coordinate-based neural representation [37–39] for learning high-quality measurement fields. Our work complements recent work in DL by exploiting the internal information in the measurements, which can subsequently be combined with other information sources during reconstruction.

• We propose a novel MLP architecture for representing the measurement field. Unlike CNN architectures that rely on a sequence of multi-filter convolutions, the MLP is built on fully-connected layers with a modest number of hidden neurons. The relatively small scale of our model makes it straightforward to train and deploy.

• We extensively validate our method in the context of sparse-view CT. We show that CoIL synergistically combines with a majority of widely-used methods by being able to generate high-fidelity full-view sinograms. In all our experiments, the methods with CoIL consistently outperform the corresponding ones without it.

In this section, we review background information related to CoIL. We introduce imaging inverse problems and review several popular reconstruction methods. We also discuss sensor-domain DL models and recent progress on internal learning.
Consider the linear measurement model

y = Ax + e,  (1)

where the measurement operator A ∈ ℝ^(m×n) characterizes the response of the imaging system and the vector e ∈ ℝ^m represents the noise, which is often assumed to be additive white Gaussian (AWGN). The associated imaging inverse problem involves the reconstruction of the image x ∈ ℝ^n from the measurements y ∈ ℝ^m.

Figure 2: Illustration of the CoIL workflow for an arbitrary imaging system with free parameters v. First, a set of N > 0 measurements are acquired by the system under different realizations of v. Then, the coordinate-response pairs {(v_i, r_i)}_{i=1}^N are used to train a coordinate-based MLP M_φ : v → r encoding the full measurement field. Once the training is finished, the encoded field is extracted from M_φ at an arbitrary resolution by querying the relevant coordinates. In the final stage, the CoIL field and the actual measurements are jointly used for image reconstruction with a user-defined method.

Due to ill-posedness, practical inverse problems are often formulated as regularized optimization

x̂ = argmin_{x ∈ ℝ^n} f(x), with f(x) = g(x) + h(x),  (2)

where g is the data-fidelity term that quantifies the consistency of x with y, and h is the regularizer that imposes prior knowledge on x. For instance, two widely used functions in the context of imaging are the least-squares and total variation (TV) terms

g(x) = ½‖Ax − y‖² and h(x) = τ‖Dx‖₁,  (3)

where τ > 0 controls the regularization strength and D is the discrete gradient operator [1]. Nonsmooth regularizers are common in imaging, which precludes the use of standard gradient descent algorithms.

The family of proximal methods are effective solvers for nonsmooth optimization problems of form (2). Two common algorithms are the fast iterative shrinkage/thresholding algorithm (FISTA) [40] and the alternating direction method of multipliers (ADMM) [41].
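As a concrete illustration of (2)–(3), the sketch below evaluates the composite objective f(x) = ½‖Ax − y‖² + τ‖Dx‖₁ for a toy 1D problem, with D taken as a first-order finite-difference operator. All names and the problem sizes are hypothetical stand-ins, not part of the paper.

```python
import numpy as np

def data_fidelity(A, x, y):
    """Least-squares data-fidelity g(x) = (1/2)||Ax - y||^2 from Eq. (3)."""
    r = A @ x - y
    return 0.5 * float(r @ r)

def tv_penalty(x, tau):
    """Anisotropic TV regularizer h(x) = tau * ||Dx||_1, with D a
    first-order finite-difference operator (1D stand-in for the
    discrete gradient)."""
    return tau * float(np.abs(np.diff(x)).sum())

def objective(A, x, y, tau):
    """Composite objective f(x) = g(x) + h(x) of Eq. (2)."""
    return data_fidelity(A, x, y) + tv_penalty(x, tau)

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))     # toy measurement operator
x = rng.standard_normal(5)
y = A @ x                           # noiseless measurements for illustration
print(objective(A, x, y, tau=0.1))  # g vanishes here; only the TV term remains
```

Because h is nonsmooth at points where Dx has zero entries, minimizing f calls for the proximal methods discussed next rather than plain gradient descent.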
These algorithms rely on a mathematical concept known as the proximal operator [42], defined as

prox_{μh}(z) := argmin_{x ∈ ℝ^n} { ½‖x − z‖² + μh(x) },  (4)

to handle nonsmooth terms without differentiation. The parameter μ > 0 in (4) balances the importance of the term h. Note that the proximal operator can be interpreted as a maximum a posteriori (MAP) denoiser for AWGN of variance μ.

DL has become very popular for imaging inverse problems [5–8] due to its excellent performance. Traditional DL methods first bring the measurements {y_i}_{i=1}^N to the image domain and then use a deep CNN architecture, such as UNet [43], to map the resulting low-quality images {x̃_i}_{i=1}^N to high-quality images {x_i}_{i=1}^N. Here, N > 0 denotes the total number of training examples. Typically, these CNNs are trained by minimizing a loss function

ℓ(ψ) = (1/N) Σ_{i=1}^N L(F_ψ(x̃_i), x_i),  (5)

where F_ψ denotes the network parameterized by ψ, and the function L quantifies the discrepancy between F_ψ(x̃_i) and x_i. Popular choices for L include the ℓ₁ and ℓ₂ norms. Some other models consider different schemes that directly map {y_i} to reconstructed images {x_i}. These methods often adopt hybrid CNN architectures that contain fully connected layers for learning either an approximation of the inverse (AAᵀ)⁻¹ [44] or an inversion onto some implicit image manifold [45]. Nevertheless, traditional DL methods do not explicitly impose consistency with respect to the forward model during image reconstruction.

The family of denoising-driven approaches represents an alternative to traditional DL that combines iterative model-based algorithms with deep denoisers as priors. These methods draw inspiration from the equivalence between the proximal operator and an image denoiser.
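To make the prox–denoiser correspondence in (4) concrete: for h(x) = τ‖x‖₁ the proximal operator has the well-known closed form of element-wise soft-thresholding. The sketch below (helper names are our own) checks the closed form against a brute-force minimization of the 1D prox objective.

```python
import numpy as np

def soft_threshold(z, t):
    """Closed-form prox of t*||.||_1: element-wise soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_numeric(z, t, grid):
    """Brute-force the scalar prox definition (4) on a grid, for checking."""
    vals = 0.5 * (grid - z) ** 2 + t * np.abs(grid)
    return grid[np.argmin(vals)]

z, t = 1.3, 0.5
grid = np.linspace(-3.0, 3.0, 60001)
print(soft_threshold(np.array([z]), t)[0])  # 0.8 = 1.3 - 0.5
print(prox_numeric(z, t, grid))             # grid minimizer agrees
```

Viewed as a map from a noisy input z to a cleaner output, this prox is exactly the "denoiser" whose role PnP and RED generalize with learned CNNs.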
One popular framework is PnP, which generalizes proximal methods such as FISTA and ADMM by replacing the proximal operator with an arbitrary AWGN denoiser D_σ : ℝ^n → ℝ^n, with σ > 0 controlling the denoising strength. This simple replacement enables PnP to use advanced denoisers, including those based on CNNs [46–48], for regularizing the inverse problem. PnP algorithms have been shown to be effective in various imaging applications [49–53]. However, D_σ may not correspond to any explicit h in (2), in which case PnP loses its interpretation as optimization. PnP has also been theoretically analyzed [20–24, 54–56].

RED [14] is a related framework that uses the operator

H(x) = τ(x − D_σ(x)),  (6)

within many kinds of iterative algorithms [14, 18, 25, 57]. RED with deep denoisers has been reported to be effective in image super-resolution [58], phase retrieval [59], and tomographic imaging [18]. It has been shown that when the denoiser D_σ is locally homogeneous and has a symmetric Jacobian [14, 60], H(x) corresponds to the gradient of the regularizer

h(x) = (τ/2) xᵀ(x − D_σ(x)).  (7)

RED has recently been theoretically analyzed for general denoisers that may not be associated with any explicit regularizer [18, 25, 57, 61].

Deep unfolding is another widely used model-based DL methodology, originally proposed in [62] for sparse coding. The central idea of deep unfolding is that one can unfold an iterative algorithm and train it end-to-end as a deep neural network [26–29]. This enables integration of the physical information into the architecture in the form of data-consistency blocks combined with trainable CNN regularizers [30–34]. By training the corresponding model-based network end-to-end, one obtains a regularizer optimized for a specific problem.
Excellent performance of deep unfolding has been reported in a number of imaging applications [31, 63], and recent work has addressed the computational and memory complexity of training such networks [35, 36].

Another family of DL methods has used generative adversarial networks (GANs) for regularizing inverse problems [64–66]

ẑ = argmin_z ‖AG(z) − y‖² and x̂ = G(ẑ),  (8)

where G is a pre-trained GAN and z is the encoding in the latent space. The optimization in (8) implicitly imposes regularization by restricting the solution x̂ to the range of the GAN G. By searching for the optimal encoding ẑ, one obtains the estimate x̂ = G(ẑ) in the domain defined by the GAN that has the smallest distance to the true x. The recovery properties under GANs have been analyzed in the context of compressive sensing [64–66]. Interested readers can find more information in the recent review [67].

It is worth pointing out that CoIL is complementary to all these prior works, since it seeks to learn the measurement field given measurements of a single unknown image. As shown in Section 4, CoIL can be naturally combined with the majority of reconstruction algorithms used in computational imaging.

Figure 3: Visualization of the coordinate-based MLP used in the CoIL methodology. The network M_φ = N_φ ∘ γ(v) is a concatenation of a single Fourier feature mapping (FFM) layer γ(v) and a conventional MLP N_φ. By training on example pairs {(v_i, r_i)}_{i=1}^N, M_φ learns a continuous mapping from a coordinate to its response. Hence, M_φ becomes an implicit neural representation of the full measurement field.

There has recently been emerging interest in developing DL-based approaches for measurement synthesis. A commonly adopted end-to-end scheme, similar to image super-resolution, first linearly interpolates the measurements to the same scale as that of the target measurements and then uses a CNN to map the intermediate output to the final refined
Hence, M φ becomesan implicit neural representation of the full-measurement field.results [68, 69]. GAN has been employed to synthesize missing measurements that arecorrupted by the metal artifacts [70–73]. The effectiveness of these approaches has beenshown in different imaging modalities [74–76]. Nevertheless, most deep synthesis methodsrequire a dataset of fully-sampled measurements for supervised training. A scan-specificCNN model that avoids training on a large dataset has been recently proposed [77], but stillrequires fully-sampled measurements of the object as groundtruth. CoIL is fundamentallydifferent from the existing methods for measurement synthesis since it learns a representa-tion of the full measurement field from the measurements of an unknown object withoutany groundtruth. Deep internal learning explores the internal information of the test signal for learninga neural network prior without using any external data. One successful approach is toexploit the patch-wise similarity within images, leading to significant results for spatialand temporal super-resolution [78, 79]. Another widely adopted approach is deep imageprior (DIP) , which optimizes a CNN to parameterize the reconstructed image [80–82].
Coordinate-based neural representation is a recent alternative that encodes a spatial field into the weights of an MLP, which is trained to map coordinates (e.g., (x, y, z)) to pixel values (e.g., in [0, 1)). It has been quite successful in computer vision and graphics, but has not been widely explored in computational imaging. Coordinate-based MLPs have been used to represent images [83], scenes [37, 38], and three-dimensional (3D) shapes [37, 38, 84, 85]. The neural radiance field (NeRF) [37] is a recent model that has significantly improved the representation power of MLPs by first expanding the input coordinates into a Fourier spectrum (see Section 3.1 for a detailed discussion). Its formulation has been adapted for improving scene resolution [39], dealing with multiple lighting conditions [86], removing occluders [38], and handling small deformations [87]. Although there have been some early attempts at representing medical images [83], coordinate-based representation has never been explored for representing measurement fields in computational imaging. This work addresses this gap by proposing CoIL as a novel image reconstruction methodology that leverages an MLP to represent the measurement fields.
In this section, we present the details of the CoIL methodology, which leverages the power of coordinate-based learning for addressing imaging inverse problems. Figure 2 illustrates the general workflow of CoIL for a given imaging system. We first explain the proposed MLP network and then discuss its integration into several common image reconstruction methods.
The coordinate-based MLP is the central component of CoIL. The network can be expressed as

M_φ : v → r,

where v represents a coordinate in the given imaging system and r is the corresponding sensor response. The network can be conceptually separated into two parts: the first part is a single Fourier feature mapping (FFM) layer γ(v) that is pre-defined before training, while the second part is a standard MLP N_φ : γ(v) → r whose parameters φ need to be optimized. A visual illustration of the complete network architecture is provided in Figure 3. While the numerical studies presented in this paper focus on CT, CoIL is also applicable to other imaging modalities by simply changing the coordinate-response pairs in the MLP representation. For example, one can potentially integrate CoIL into optical diffraction tomography (ODT) [88–90] by letting v denote the sensor location and the angle of the incident light, and letting r have two elements corresponding to the real and imaginary components of the light field.

Although neural networks are known to be universal function approximators [91], it has been found that standard MLPs perform poorly in representing high-frequency variations [37, 92]. In our experiments, we also experienced such issues when we directly applied N_φ to learning the mapping v → r (see No FFM in Figure 5). In order to overcome this limitation of standard MLPs, we include an FFM layer that expands the input coordinate v into a combination of different frequency components

γ(v) = (sin(k₁πv), cos(k₁πv), ..., sin(k_Lπv), cos(k_Lπv)),  (9)

where sin(·) and cos(·) compute element-wise sinusoidal and cosinusoidal values, respectively, and {k_i}_{i=1}^L determine the frequencies in the mapping. The FFM layer pre-defines the frequency components so that the network N_φ can actively select the ones that are most useful for encoding sensor responses by learning the weights in its first layer.
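The FFM layer of Eq. (9) can be sketched in a few lines of numpy; the specific frequency values below are placeholders chosen for illustration, not the paper's settings.

```python
import numpy as np

def fourier_feature_mapping(v, ks):
    """FFM layer gamma(v) of Eq. (9): expand each coordinate into
    interleaved sin/cos components at the pre-defined frequencies {k_i}."""
    v = np.atleast_1d(np.asarray(v, dtype=float))
    feats = []
    for k in ks:
        feats.append(np.sin(k * np.pi * v))
        feats.append(np.cos(k * np.pi * v))
    return np.concatenate(feats)

ks = [1.0, 2.0, 4.0]   # hypothetical frequencies; Sec. 3.1 discusses the actual choice
g = fourier_feature_mapping([0.5], ks)
print(g.shape)         # 2L features per input coordinate
```

The subsequent MLP N_φ consumes this 2L-dimensional feature vector instead of the raw coordinate, which is what lets it represent high-frequency variations in the field.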
By manipulating the coefficients k_i and the total number of components L, we can explicitly control the expanded spectrum and thus impose an implicit regularization. The FFM layer was first introduced in NeRF as a positional encoding of spatial coordinates [37], and follow-up work [83] has further explored its functionality using the concept of neural tangent kernels [93]. The original formulation of γ(v) in [37] sets k_i as an exponential function k_i = 2^(i−1) with L = 10. We discovered that the presence of very high-frequency components leads to overfitting of the MLP to the noise in the measurements. We thus adopted a linear sampling of the k_i in the Fourier space, which places a higher number of frequency components in the low-frequency regions. Our empirical results show that this strategy effectively improves the ability of M_φ to represent high-frequency variations while preventing overfitting to noise (see Figure 5 for examples).

The network implementing N_φ is composed of fully-connected (FC) layers. All but the last FC layer are activated by the rectified linear unit (ReLU), while the last layer has no activation. We implement skip connections after every even-numbered FC layer to concatenate the original input of N_φ with the intermediate outputs. The use of skip connections in an MLP has been shown to be beneficial for fast training [84] and better accuracy [85]. Note that although M_φ is a fully connected network, its input corresponds to a single coordinate, which enables element-wise processing of all the measurements.

CoIL trains a separate MLP to represent the full measurement field of each test object. This means that the training pairs {(v_i, r_i)}_{i=1}^N are extracted from the measurements of the test object only, without any training dataset.
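As a drastically simplified toy illustration of how a network on top of FFM features can select the useful frequencies: the sketch below fits only a linear readout on fixed Fourier features by ordinary least squares (our simplification; the paper trains a full MLP), using a linearly spaced set of k_i and a band-limited 1D "measurement field". All names and signal choices are hypothetical.

```python
import numpy as np

def ffm(v, ks):
    """Fourier feature mapping of Eq. (9) for a 1D array of scalar coordinates."""
    cols = [f(k * np.pi * v) for k in ks for f in (np.sin, np.cos)]
    return np.stack(cols, axis=1)          # shape (len(v), 2L)

# Toy "measurement field": a band-limited response sampled at N coordinates.
v = np.linspace(0.0, 1.0, 64)
r = np.sin(3 * np.pi * v) + 0.5 * np.cos(5 * np.pi * v)

# Linearly spaced frequencies, in the spirit of the proposed FFM
# (the exact constants used in the paper may differ).
ks = np.arange(1, 11, dtype=float)
Phi = ffm(v, ks)

# With a linear readout, minimizing the l2 loss of Eq. (10) reduces to
# ordinary least squares: the fit "selects" the k=3 and k=5 components.
w, *_ = np.linalg.lstsq(Phi, r, rcond=None)

# Query the learned representation at unseen coordinates (field generation).
v_new = np.linspace(0.0, 1.0, 257)
r_new = ffm(v_new, ks) @ w
print(float(np.abs(Phi @ w - r).max()))    # near-zero training residual
```

Because the target lies in the span of the pre-defined frequencies, the fit interpolates exactly and generalizes to the denser query grid, which is the mechanism CoIL relies on when generating unmeasured views.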
The network M_φ is trained using Adam [94] to minimize the standard ℓ₂-norm loss

ℓ(φ) = (1/N) Σ_{i=1}^N ‖M_φ(v_i) − r_i‖².  (10)

We implement a learning rate that decays exponentially as the training epoch increases, to smooth the optimization. Although M_φ is an MLP, the network is significantly smaller on disk than the standard UNet architecture used in many DL-based models.

3.2 Image reconstruction in CoIL
After training, one can generate an arbitrary number of measurements by querying M_φ at the relevant coordinates. We refer to the corresponding measurement field as the CoIL field. The CoIL field can be readily integrated into the majority of image reconstruction methods. Here, we discuss the integration of CoIL into four widely-used methods.
Filtered backprojection (FBP) is a classic method for bringing the measurements into the image domain [95]. Since the CoIL field is essentially a set of measurements, we can directly feed the field as input to FBP for image reconstruction. A slightly different way to apply FBP is to form a combined input that includes both the original measurements and those generated by CoIL. The key benefit of the latter approach is that it directly uses the real data while also complementing it with CoIL measurements.
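The "combined input" idea above can be sketched as follows: generate a dense sinogram from the trained MLP, then overwrite the rows at the acquired angles with the real data before handing the result to FBP. The function and the dummy `query_mlp` below are hypothetical stand-ins for evaluating M_φ, not the paper's implementation.

```python
import numpy as np

def combine_views(theta_meas, sino_meas, theta_all, query_mlp):
    """Form a dense sinogram: fill all angles with responses queried from
    the trained MLP, then let the real measurements override the generated
    rows at the acquired angles."""
    sino = np.array([query_mlp(t) for t in theta_all])
    for t, row in zip(theta_meas, sino_meas):
        i = int(np.argmin(np.abs(theta_all - t)))  # snap to nearest grid angle
        sino[i] = row                               # real data takes precedence
    return sino

theta_meas = np.array([0.0, np.pi / 2])             # two acquired views
sino_meas = np.array([[1.0, 1.0], [2.0, 2.0]])      # two detectors per view
theta_all = np.linspace(0.0, np.pi, 5)              # dense angular grid
dense = combine_views(theta_meas, sino_meas, theta_all,
                      query_mlp=lambda t: np.zeros(2))  # dummy stand-in "MLP"
print(dense)
```

The resulting dense sinogram keeps every real view untouched while the CoIL field supplies only the missing angles.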
Model-based methods reconstruct images by solving optimization problems of form (2). The CoIL field can be incorporated into the formulation by including an additional "data-fidelity" term g̃ in the objective

f(x) = (1 − α)g(x) + αg̃(x) + h(x),  (11)

where (1 − α)g(x) + αg̃(x) forms the new data-fidelity term and the parameter 0 ≤ α ≤ 1 controls the tradeoff between the real data and the generated field. In practice, we can fine-tune the value of α to obtain a good balance between the two terms. For example, consider the least-squares function

g̃(x) = ½‖Ãx − M_φ(ṽ)‖²,  (12)

where Ã corresponds to the sampling geometry of the CoIL field and ṽ collects all the query coordinates for the trained MLP. Since the network is pre-trained, one can directly use any existing image regularizer and solve the optimization problem with a standard iterative algorithm, such as FISTA or ADMM.

As reviewed in Section 2.2, most end-to-end DL models are trained to directly map the low-quality images {x̃_i} to the high-quality images {x_i}, making them vulnerable to unseen outliers. For example, this adversely influences the performance of DL when there is a mismatch between training and testing angles. CoIL can be used to address this issue by generating the measurement field corresponding to the same subsampling rate as the measurements used for training the DL model

x̂ = F_ψ(FBP(M_φ(ṽ))),  (13)

Figure 4: Eight test images from the scans of two patients in the AAPM human phantom dataset [96].

Table 1: The average SNR of the sinograms generated by No FFM, Pos Enc, and CoIL in the scenarios corresponding to P × I = {60, 90, 120} × {30, 40, 50}.
where F_ψ denotes the pre-trained CNN. Alternatively, one can include the original test image in the input by averaging x̃ and FBP(M_φ(ṽ)) with a weight α

x̂ = F_ψ((1 − α)x̃ + αFBP(M_φ(ṽ))),  (14)

where the joint input combines the measurements learned by the MLP with the true measurements from the imaging system. Our results in Section 4 show that this CoIL-based strategy achieves better results than training a DL model directly on the measurements.

Figure 5: Illustration of the benefit of including the Fourier feature mapping (FFM) layer in CoIL. We plot sinograms and their FBP reconstructions in the first and second rows, respectively. The proposed FFM in CoIL is compared against the
No FFM strategy, which does not have any FFM layer, and the positional encoding (Pos Enc) adopted in [37]. The three MLPs are used to generate dense views from the P = 120 projections with I = 40 dB noise. Both the sinograms and the images are labeled with their SNR values with respect to the groundtruth shown in the right-most column. The bounding boxes highlight areas of significant visual difference. This comparison shows the benefit of using the FFM layer with linear spacing in the Fourier space.

PnP/RED algorithms can be interpreted as extensions of model-based algorithms that balance consistency with the measurements against deep denoising priors [22, 25]. Consider gradient-based RED (GM-RED)

x⁺ ← x − γ[∇g(x) + τ(x − D_σ(x))],  (15)

where γ > 0 is the stepsize and ∇g is the gradient of the data-fidelity term. Similar to the modification of model-based optimization, one straightforward way to integrate CoIL into GM-RED is to include the gradient of g̃ as an extra term in the data enforcement

x⁺ ← x − γ[(1 − α)∇g(x) + α∇g̃(x) + τ(x − D_σ(x))],

where the new update ensures consistency with the real measurements as well as with the field generated by CoIL, with α controlling the relative weighting. This idea is also applicable to PnP, for example, by integrating CoIL within PnP-FISTA

x⁺ ← D_σ(s − γ[(1 − α)∇g(s) + α∇g̃(s)])  (16a)
s⁺ ← x⁺ + ((q − 1)/q⁺)(x⁺ − x),  (16b)

Figure 6: Quantitative evaluation of the CoIL field for different projection numbers (P) and noise levels (I). The plot is divided into three regions, corresponding to P equal to 60, 90, and 120, respectively. Within each region, the average SNR values of the generated sinograms are plotted against different input SNR values, which are also drawn as dotted horizontal lines for better visualization. First, note how CoIL generally produces measurement fields of better SNR than the noise level in the measurements.
Second, the figure highlights that the quality of the generated CoIL field improves as the number of views increases or the noise level decreases.

In (16b), the acceleration parameter q > 0 is updated as q⁺ ← ½(1 + √(1 + 4q²)). In the next section, we provide results highlighting the performance of CoIL in the context of all these algorithms.
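The CoIL-augmented GM-RED update above can be sketched on a toy linear problem. Everything below is a hypothetical stand-in: random matrices play the roles of A and Ã, a noiseless Ãx_true plays the role of the CoIL field M_φ(ṽ), and a simple shrink-toward-the-mean map replaces the learned denoiser D_σ.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, m_tilde = 16, 8, 32
A = rng.standard_normal((m, n))          # real (undersampled) operator
A_t = rng.standard_normal((m_tilde, n))  # operator for the denser CoIL field
x_true = rng.standard_normal(n)
y = A @ x_true                           # real measurements (noiseless toy)
y_t = A_t @ x_true                       # stand-in for the CoIL field M_phi(v~)

def denoiser(x):
    """Hypothetical weak denoiser D_sigma: shrink toward the global mean."""
    return 0.9 * x + 0.1 * x.mean()

def gm_red_coil(x, alpha=0.5, tau=0.1, gamma=4e-3, iters=1000):
    """GM-RED iterations with the CoIL-augmented data term of Eq. (11)."""
    for _ in range(iters):
        grad_g = A.T @ (A @ x - y)           # gradient of real data fidelity
        grad_gt = A_t.T @ (A_t @ x - y_t)    # gradient of CoIL-field fidelity
        x = x - gamma * ((1 - alpha) * grad_g + alpha * grad_gt
                         + tau * (x - denoiser(x)))
    return x

x_hat = gm_red_coil(np.zeros(n))
print(float(np.linalg.norm(x_hat - x_true)))  # far below ||x_true||
```

Note that the real data alone (m = 8 < n = 16) are not enough to pin down x; it is the added CoIL-field term that makes the combined data fidelity well-posed here, mirroring the role of the generated field in the paper.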
We numerically validate CoIL in the context of sparse-view CT. We first substantiate the effectiveness of the proposed form of FFM, and then demonstrate the benefits of using CoIL for image reconstruction. We consider four reconstruction methods: FBP, FISTA-TV, GM-RED, and FBP-UNet. FBP-UNet refers to the end-to-end model proposed in [9], and FISTA-TV refers to the TV-regularized inversion implemented using FISTA. We integrate CoIL into these algorithms by including the parameter α as discussed in Section 3.2.

Sparse-view X-ray CT is an imaging modality that aims to reconstruct a tomographic image from a few projections. In medical applications, it can significantly reduce the radiation dose and hence the risk of radiation exposure. The reconstruction task in CT can be formulated as a linear inverse problem of form (1).

Table 2: The average SNR values obtained with and without CoIL by using FBP, FISTA-TV, GM-RED, and FBP-UNet in the scenarios corresponding to P × I = {60, 90, 120} × {30, 40, 50}.

In our simulations, we adopt
the parallel beam geometry with a measurement operator A corresponding to the Radon transform. We consider the experimental setting where the X-ray beam is emitted from a view angle θ ∈ [0, π] and its radiation attenuation is recorded by detectors at different (normalized) sensor-plane locations l ∈ [0, 1]. Thus, the MLP is trained to map the location and angle (l, θ) to the corresponding response r. Figure 4 visualizes the eight test images used in all experiments. These images are selected from the scans of two patients in the AAPM human phantom dataset [96], while the scans of the other patients are used for training FBP-UNet and the deep denoiser in GM-RED. We implemented A and its adjoint Aᵀ using RayTransform from the Operator Discretization Library (ODL) [97], which allows fast computation with a GPU backend. We synthesized the test sinograms corresponding to P ∈ {60, 90, 120} projection views, each further contaminated by noise at three levels corresponding to input signal-to-noise ratios (SNR) of I ∈ {30, 40, 50} dB. SNR is also used as the metric to quantify the reconstruction quality

SNR(x̂, x) :=
20 log₁₀(‖x‖ / ‖x − x̂‖).  (17)

We denote the SNR values averaged over all test images as the average SNR. For each test image, CoIL trains separate MLPs to represent its full measurement field in the different scenarios P × I = {60, 90, 120} × {30, 40, 50}. We conducted all the experiments, as well as the training of all neural networks, on a machine equipped with an Intel Xeon Gold 6130 processor and four Nvidia GeForce GTX 1080 Ti GPUs. Training a single MLP on our machine takes only minutes. The test data are drawn from the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge.

Figure 7: Average SNR improvement obtained with CoIL, with the plot divided into three regions corresponding to 60, 90, and 120 projections, respectively. Within each region, the average SNR improvement is plotted against the reconstruction method. The vertical axis is in log scale for better visualization. Note that CoIL consistently improves the average SNR values for all the considered algorithms in every scenario.

We first evaluate the effectiveness of the proposed FFM layer used in the coordinate-based MLP. We trained and compared three networks where: (a) the FFM layer is not implemented (
No FFM); (b) the FFM layer implements the positional encoding, whose frequencies grow exponentially (Pos Enc); and (c) the FFM layer implements the proposed expansion, whose frequencies grow linearly (CoIL). In the simulations, we use these networks to generate the sinograms corresponding to a denser set of views, and the total number of frequency components is set to L = 10 for both Pos Enc and CoIL. Table 1 summarizes the average SNR values of the sinograms generated by the three networks in all scenarios. Here, we use SNR as the quality metric in the sinogram space because it enables a straightforward comparison with the original measurements, whose noise level is characterized by the input SNR. As shown in the table, CoIL consistently achieves significantly higher SNR values than both No FFM and Pos Enc. Our interpretation is that No FFM is unable to represent the high-frequency variations in the measurement field, while Pos Enc overfits to the noise by containing too many high-frequency components. We observe that the sampling pattern in CoIL better captures the nature of the measurements without overfitting to the noise. This is further illustrated by the visual examples in Figure 5, which plots the sinograms and their FBP reconstructions obtained by each network for P = 120 and I = 40 dB. Specifically, No FFM is able to represent the general structure of the sinogram but fails to generate the details; Pos Enc produces strong artifacts in its sinogram due to its FFM layer.
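For concreteness, the SNR metric of Eq. (17) and a generic Fourier feature mapping can be sketched as follows; the helper names (`snr_db`, `ffm`) and the two frequency schedules are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def snr_db(x, x_hat):
    """SNR in dB between a groundtruth x and an estimate x_hat, as in Eq. (17)."""
    return 20.0 * np.log10(np.linalg.norm(x) / np.linalg.norm(x - x_hat))

def ffm(c, ks):
    """Map scalar coordinates c to Fourier features with frequencies ks."""
    c = np.asarray(c, dtype=float)[:, None]    # (N, 1)
    ks = np.asarray(ks, dtype=float)[None, :]  # (1, L)
    return np.concatenate([np.cos(ks * c), np.sin(ks * c)], axis=1)  # (N, 2L)

L = 10
pos_enc_ks = 2.0 ** np.arange(L)         # exponentially growing frequencies (Pos Enc)
linear_ks = np.pi * np.arange(1, L + 1)  # linearly growing frequencies (CoIL-style)

coords = np.linspace(0.0, 1.0, 5)
print(ffm(coords, linear_ks).shape)          # (5, 20)
print(snr_db(np.ones(4), 0.9 * np.ones(4)))  # ≈ 20.0 dB
```

Feeding the mapped features, rather than raw coordinates, to the MLP is what lets the network represent the high-frequency content of the sinogram.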
CoIL succeeds in both representing the high-frequency details and avoiding strong artifacts in the generated measurements. The improvement in the sinogram quality is also reflected in the SNR values obtained after FBP reconstruction. Note how CoIL significantly differs from the other approaches in the regions highlighted by the bounding boxes.

Figure 8: Visual illustration of reconstruction with and without CoIL using the considered methods. CoIL generates measurement fields with a larger number of views (one setting for FBP, TV, and RED, and another for FBP-UNet) from P = 60 measurements with I = 40 dB noise. Each image is labeled with its SNR value with respect to the groundtruth displayed in the left-most column. The visual differences are highlighted in the bounding boxes using green arrows. Note how CoIL enables the recovery of certain details missing in the reconstructions without it.

We have also investigated the evolution of the sinogram quality for different numbers of views and noise levels. Figure 5 plots the SNR of the sinograms obtained by CoIL against the input SNR (I ∈ {30, 40, 50} dB) for different numbers of views (P ∈ {60, 90, 120}). The three dotted horizontal lines in the figure highlight each I value. We first note that CoIL generates sinograms that generally have higher SNR than the noise level in the measurements. In particular, when I = 30 dB, the average SNR values are several dB higher for every P. This highlights the ability of CoIL to generate high-quality sinograms. The figure also demonstrates that the SNR values improve as the number of views increases or the noise level decreases. This indicates that the quality of the CoIL fields can be improved by having more measurements or by acquiring ones that are less noisy.

We next highlight the benefit of CoIL for image reconstruction. We trained all our MLPs using the FFM layer based on our linear expansion. We implemented FBP using fbp_op from the ODL package under the default parameter settings. We used DnCNN [46] to build the deep denoiser within GM-RED.

Figure 9: Visual illustration of reconstruction with and without CoIL using the considered methods. CoIL generates measurement fields with a larger number of views (one setting for FBP, TV, and RED, and another for FBP-UNet) from P = 90 measurements with I = 40 dB noise. Each image is labeled with its SNR value with respect to the groundtruth displayed in the left-most column. The visual differences are highlighted in the bounding boxes using green arrows.

In every experiment, we selected the network achieving the highest SNR value among the ones trained for five noise levels σ. For FBP, FISTA-TV, and GM-RED, CoIL generates the measurement field with additional projection views from the test measurements. For FBP-UNet using CoIL, we trained the CNN on the dataset consisting of measurements with a denser set of projection views and used CoIL to generate additional measurements to achieve that number. As a baseline, we trained a separate FBP-UNet that directly predicts the groundtruth from the test measurements. Note that the baseline networks correspond to the optimal performance that FBP-UNet can achieve for the test measurements without integrating CoIL. In order to stabilize FBP-UNet, we trained these networks using data with random fluctuations in both the number of projection views and the amount of noise.

Figure 7 quantitatively evaluates the improvements in imaging quality due to CoIL for all the considered reconstruction algorithms. For each algorithm, we plot the difference between the SNR obtained with and without CoIL. We can clearly observe that CoIL leads to significant quality improvements for all the algorithms. Remarkably, for the highest amount of noise (I = 30 dB), the average improvement due to CoIL is largest for FBP. On the other hand, when the noise is relatively mild, CoIL still leads to significant SNR improvements for all algorithms, including both model-based and DL-based methods.

Figure 10: Visual illustration of reconstruction with and without CoIL using the considered methods. CoIL generates measurement fields with a larger number of views (one setting for FBP, TV, and RED, and another for FBP-UNet) from P = 120 measurements with I = 40 dB noise. Each image is labeled with its SNR value with respect to the groundtruth displayed in the left-most column. The visual differences are highlighted in the bounding boxes using green arrows.

In particular, for P = 60 and I = 50 dB, FBP-UNet with CoIL achieves a clear SNR improvement over FBP-UNet without it. The exact numbers obtained by each algorithm are summarized in Table 2. These results highlight that CoIL is able to accurately represent the measurement field and to generate high-fidelity measurements that can be used to improve image reconstruction.

Figure 8 presents visual comparisons of images reconstructed with and without CoIL for P = 60 and I = 40 dB. Each image is labeled with its SNR with respect to the groundtruth, and the visual differences are highlighted by arrows in the bounding boxes. This comparison highlights the visual improvements due to CoIL. For example, consider the visual differences for FBP-UNet, where one can clearly see additional visual details after integration of the CoIL field. The better reconstruction quality with CoIL is also reflected in the higher SNR values. Additional visual comparisons in Figures 9 and 10 also highlight the benefit of image reconstruction with CoIL.

Conclusion
The CoIL methodology developed in this paper is a new approach for computational imaging using coordinate-based neural representations. CoIL can represent the full measurement field as a single MLP network by training it to map the measurement coordinates to their sensor responses. This makes CoIL a self-supervised model that can be trained without any external dataset. We provided extensive empirical results demonstrating the improvements due to CoIL in the context of sparse-view CT, highlighting its great potential to work synergistically with existing image reconstruction methods. Future work will explore new applications of CoIL to other imaging modalities, such as optical diffraction tomography and intensity diffraction tomography.
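As a toy, self-contained sketch of this self-supervised recipe (the synthetic 1-D "measurement field", network size, frequency schedule, and training hyperparameters below are all illustrative assumptions, not the paper's settings), one can fit a small MLP to map coordinates to responses using only the measurements themselves:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a measurement field: responses r sampled at coordinates
# c in [0, 1] (in CT, the coordinates would be the pair (theta, l)).
c = rng.uniform(0.0, 1.0, size=(256, 1))
r = np.sin(2 * np.pi * c) + 0.05 * rng.standard_normal(c.shape)

# Fourier feature mapping with linearly spaced frequencies (illustrative schedule).
ks = np.pi * np.arange(1, 11)[None, :]
feats = np.concatenate([np.cos(c * ks), np.sin(c * ks)], axis=1)  # (256, 20)

# One-hidden-layer MLP trained by plain gradient descent on the MSE loss.
d_in, d_h = feats.shape[1], 64
W1 = rng.standard_normal((d_in, d_h)) * 0.1; b1 = np.zeros(d_h)
W2 = rng.standard_normal((d_h, 1)) * 0.1;   b2 = np.zeros(1)

lr = 1e-2
for _ in range(2000):
    h = np.maximum(feats @ W1 + b1, 0.0)   # ReLU hidden layer
    pred = h @ W2 + b2
    err = pred - r                         # (256, 1)
    # Backpropagation of the MSE gradient through both layers.
    gW2 = h.T @ err / len(c); gb2 = err.mean(0)
    dh = (err @ W2.T) * (h > 0)
    gW1 = feats.T @ dh / len(c); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

mse = float(np.mean((np.maximum(feats @ W1 + b1, 0.0) @ W2 + b2 - r) ** 2))
print(mse)  # training reduces the fit error well below the signal power
```

Once such a network is trained, it can be queried at unmeasured coordinates to generate the denser measurement field used by the downstream reconstruction method.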
Acknowledgement
The research in this work was supported by NSF award CCF-1813910 and by the Laboratory Directed Research and Development program of Los Alamos National Laboratory under project number 20200061DR.
References

[1] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D, vol. 60, no. 1–4, pp. 259–268, Nov. 1992.
[2] M. A. T. Figueiredo and R. D. Nowak, “Wavelet-based image estimation: An empirical Bayes approach using Jeffreys’ noninformative prior,” IEEE Trans. Image Process., vol. 10, no. 9, pp. 1322–1331, Sep. 2001.
[3] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, Dec. 2006.
[4] A. Danielyan, V. Katkovnik, and K. Egiazarian, “BM3D frames and variational image deblurring,” IEEE Trans. Image Process., vol. 21, no. 4, pp. 1715–1728, Apr. 2012.
[5] M. T. McCann, K. H. Jin, and M. Unser, “Convolutional neural networks for inverse problems in imaging: A review,” IEEE Signal Process. Mag., vol. 34, no. 6, pp. 85–95, 2017.
[6] A. Lucas, M. Iliadis, R. Molina, and A. K. Katsaggelos, “Using deep neural networks for inverse problems in imaging: Beyond analytical methods,” IEEE Signal Process. Mag., vol. 35, no. 1, pp. 20–36, Jan. 2018.
[7] G. Ongie, A. Jalal, C. A. Metzler, R. G. Baraniuk, A. G. Dimakis, and R. Willett, “Deep learning techniques for inverse problems in imaging,” IEEE J. Sel. Areas Inf. Theory, vol. 1, no. 1, pp. 39–56, May 2020.
[8] G. Wang, J. C. Ye, and B. De Man, “Deep learning for tomographic image reconstruction,” Nature Machine Intelligence, vol. 2, no. 12, pp. 737–748, 2020.
[9] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep convolutional neural network for inverse problems in imaging,” IEEE Trans. Image Process., vol. 26, no. 9, pp. 4509–4522, Sep. 2017.
[10] E. Kang, J. Min, and J. C. Ye, “A deep convolutional neural network using directional wavelets for low-dose x-ray CT reconstruction,” Medical Physics, vol. 44, no. 10, pp. e360–e375, 2017.
[11] Y. Sun, Z. Xia, and U. S. Kamilov, “Efficient and accurate inversion of multiple scattering with deep learning,” Opt. Express, vol. 26, no. 11, pp. 14678–14688, May 2018.
[12] Y. Han and J. C. Ye, “Framing U-Net via deep convolutional framelets: Application to sparse-view CT,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1418–1429, 2018.
[13] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” in Proc. IEEE Global Conf. Signal Process. and Inf. Process. (GlobalSIP), Austin, TX, USA, Dec. 3-5, 2013, pp. 945–948.
[14] Y. Romano, M. Elad, and P. Milanfar, “The little engine that could: Regularization by denoising (RED),” SIAM J. Imaging Sci., vol. 10, no. 4, pp. 1804–1844, 2017.
[15] S. Ono, “Primal-dual plug-and-play image restoration,” IEEE Signal Process. Lett., vol. 24, no. 8, pp. 1108–1112, Aug. 2017.
[16] U. S. Kamilov, H. Mansour, and B. Wohlberg, “A plug-and-play priors approach for solving nonlinear imaging inverse problems,” IEEE Signal Process. Lett., vol. 24, no. 12, pp. 1872–1876, Dec. 2017.
[17] S. A. Bigdeli, M. Zwicker, P. Favaro, and M. Jin, “Deep mean-shift priors for image restoration,” in Advances in Neural Information Processing Systems 30, 2017, pp. 763–772.
[18] Z. Wu, Y. Sun, A. Matlock, J. Liu, L. Tian, and U. S. Kamilov, “SIMBA: Scalable inversion in optical tomography using deep denoising priors,” IEEE J. Sel. Topics Signal Process., pp. 1–1, 2020.
[19] J. Liu, Y. Sun, C. Eldeniz, W. Gan, H. An, and U. S. Kamilov, “RARE: Image reconstruction using deep priors learned without ground truth,” IEEE J. Sel. Topics Signal Process., pp. 1–1, 2020.
[20] S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,”
IEEE Trans. Comput. Imag., vol. 3, no. 1, pp. 84–98, Mar. 2017.
[21] G. T. Buzzard, S. H. Chan, S. Sreehari, and C. A. Bouman, “Plug-and-play unplugged: Optimization free reconstruction using consensus equilibrium,” SIAM J. Imaging Sci., vol. 11, no. 3, pp. 2001–2020, Sep. 2018.
[22] E. K. Ryu, J. Liu, S. Wang, X. Chen, Z. Wang, and W. Yin, “Plug-and-play methods provably converge with properly trained denoisers,” in Proc. 36th Int. Conf. Machine Learning (ICML), Long Beach, CA, USA, June 9–15, 2019, vol. 97, pp. 5546–5557.
[23] Y. Sun, B. Wohlberg, and U. S. Kamilov, “An online plug-and-play algorithm for regularized image reconstruction,” IEEE Trans. Comput. Imaging, vol. 5, no. 3, pp. 395–408, Sep. 2019.
[24] X. Xu, Y. Sun, J. Liu, B. Wohlberg, and U. S. Kamilov, “Provable convergence of plug-and-play priors with MMSE denoisers,” IEEE Signal Process. Lett., vol. 27, pp. 1280–1284, 2020.
[25] Y. Sun, J. Liu, and U. S. Kamilov, “Block coordinate regularization by denoising,” in Advances in Neural Information Processing Systems 32, 2019, pp. 380–390.
[26] J. Zhang and B. Ghanem, “ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing,” in Proc. IEEE Conf. Comput. Vision Pattern Recognit. (CVPR), 2018, pp. 1828–1837.
[27] Y. Yang, J. Sun, H. Li, and Z. Xu, “Deep ADMM-Net for compressive sensing MRI,” in Advances in Neural Information Processing Systems 29, 2016, pp. 10–18.
[28] A. Hauptmann, F. Lucka, M. Betcke, N. Huynh, J. Adler, B. Cox, P. Beard, S. Ourselin, and S. Arridge, “Model-based learning for accelerated, limited-view 3-D photoacoustic tomography,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1382–1393, 2018.
[29] J. Adler and O. Öktem, “Learned primal-dual reconstruction,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1322–1332, June 2018.
[30] H. K. Aggarwal, M. P. Mani, and M. Jacob, “MoDL: Model-based deep learning architecture for inverse problems,” IEEE Trans. Med. Imag., vol. 38, no. 2, pp. 394–405, Feb. 2019.
[31] S. A. Hosseini, B. Yaman, S. Moeller, M. Hong, and M. Akcakaya, “Dense recurrent neural networks for accelerated MRI: History-cognizant unrolling of optimization algorithms,” IEEE J. Sel. Topics Signal Process., vol. 14, no. 6, pp. 1280–1291, Oct. 2020.
[32] I. Y. Chun, Z. Huang, H. Lim, and J. Fessler, “Momentum-Net: Fast and convergent iterative neural network for inverse problems,” IEEE Trans. Patt. Anal. and Machine Intell., pp. 1–1, 2020.
[33] B. Yaman, S. A. H. Hosseini, S. Moeller, J. Ellermann, K. Uğurbil, and M. Akçakaya, “Self-supervised learning of physics-guided reconstruction neural networks without fully sampled reference data,” Magn. Reson. Med., Jul. 2020.
[34] H. K. Aggarwal and M. Jacob, “J-MoDL: Joint model-based deep learning for optimized sampling and reconstruction,” IEEE J. Sel. Topics Signal Process., vol. 14, no. 6, pp. 1151–1162, 2020.
[35] M. Kellman, K. Zhang, E. Markley, J. Tamir, E. Bostan, M. Lustig, and L. Waller, “Memory-efficient learning for large-scale computational imaging,” IEEE Trans. Comput. Imag., vol. 6, pp. 1403–1414, 2020.
[36] J. Liu, Y. Sun, W. Gan, X. Xu, B. Wohlberg, and U. S. Kamilov, “SGD-Net: Efficient model-based deep learning with theoretical guarantees,” arXiv:2101.09379 [eess.IV], 2021.
[37] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “NeRF: Representing scenes as neural radiance fields for view synthesis,” in European Conference on Computer Vision (ECCV), 2020.
[38] R. Martin-Brualla, N. Radwan, M. S. M. Sajjadi, J. T. Barron, A. Dosovitskiy, and D. Duckworth, “NeRF in the wild: Neural radiance fields for unconstrained photo collections,” arXiv:2008.02268 [cs.CV], 2020.
[39] K. Zhang, G. Riegler, N. Snavely, and V. Koltun, “NeRF++: Analyzing and improving neural radiance fields,” arXiv:2010.07492 [cs.CV], 2020.
[40] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sci., vol. 2, no. 1, pp. 183–202, 2009.
[41] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, July 2011.
[42] J. J. Moreau, “Proximité et dualité dans un espace Hilbertien,” Bull. Soc. Math. France, vol. 93, pp. 273–299, 1965.
[43] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in
Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, Oct. 5-9, 2015, pp. 234–241.
[44] T. Würfl, M. Hoffmann, V. Christlein, K. Breininger, Y. Huang, M. Unberath, and A. K. Maier, “Deep learning computed tomography: Learning projection-domain weights from image domain in limited angle problems,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1454–1463, 2018.
[45] B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen, “Image reconstruction by domain-transform manifold learning,” Nature, vol. 555, no. 7697, pp. 487–492, 2018.
[46] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Trans. Image Process., vol. 26, no. 7, pp. 3142–3155, July 2017.
[47] K. Zhang, W. Zuo, and L. Zhang, “FFDNet: Toward a fast and flexible solution for CNN-based image denoising,” IEEE Trans. Image Process., vol. 27, no. 9, pp. 4608–4622, Sep. 2018.
[48] G. Song, Y. Sun, J. Liu, Z. Wang, and U. S. Kamilov, “A new recurrent plug-and-play prior based on the multiple self-similarity network,” IEEE Signal Process. Lett., vol. 27, no. 1, pp. 451–455, 2020.
[49] S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman, “Plug-and-play priors for bright field electron tomography and sparse interpolation,” IEEE Trans. Comput. Imaging, vol. 2, no. 4, pp. 408–423, Dec. 2016.
[50] K. Zhang, W. Zuo, S. Gu, and L. Zhang, “Learning deep CNN denoiser prior for image restoration,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2017, pp. 3929–3938.
[51] Y. Sun, S. Xu, Y. Li, L. Tian, B. Wohlberg, and U. S. Kamilov, “Regularized Fourier ptychography using an online plug-and-play algorithm,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process. (ICASSP), Brighton, UK, May 12-17, 2019, pp. 7665–7669.
[52] K. Zhang, W. Zuo, and L. Zhang, “Deep plug-and-play super-resolution for arbitrary blur kernels,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 1671–1681.
[53] R. Ahmad, C. A. Bouman, G. T. Buzzard, S. Chan, S. Liu, E. T. Reehorst, and P. Schniter, “Plug-and-play methods for magnetic resonance imaging: Using denoisers for image recovery,” IEEE Signal Processing Magazine, vol. 37, no. 1, pp. 105–116, 2020.
[54] T. Meinhardt, M. Moeller, C. Hazirbas, and D. Cremers, “Learning proximal operators: Using denoising networks for regularizing inverse imaging problems,” in Proc. IEEE Int. Conf. Comp. Vis. (ICCV), 2017, pp. 1799–1808.
[55] T. Tirer and R. Giryes, “Image restoration by iterative denoising and backward projections,” IEEE Trans. Image Process., vol. 28, no. 3, pp. 1220–1234, Mar. 2019.
[56] A. M. Teodoro, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “A convergent image fusion algorithm using scene-adapted Gaussian-mixture-based denoising,” IEEE Trans. Image Process., vol. 28, no. 1, pp. 451–463, Jan. 2019.
[57] Y. Sun, J. Liu, Y. Sun, B. Wohlberg, and U. S. Kamilov, “Async-RED: A provably convergent asynchronous block parallel stochastic method using deep denoising priors,” in International Conference on Learning Representations (ICLR), 2021.
[58] G. Mataev, P. Milanfar, and M. Elad, “DeepRED: Deep image prior powered by RED,” in The IEEE International Conference on Computer Vision (ICCV) Workshops, Oct. 2019.
[59] C. Metzler, P. Schniter, A. Veeraraghavan, and R. Baraniuk, “prDeep: Robust phase retrieval with a flexible deep network,” in Proc. 35th Int. Conf. Machine Learning (ICML), Stockholmsmässan, Stockholm, Sweden, 10–15 July 2018, pp. 3501–3510.
[60] E. T. Reehorst and P. Schniter, “Regularization by denoising: Clarifications and new interpretations,” IEEE Trans. Comput. Imag., vol. 5, no. 1, pp. 52–67, Mar. 2019.
[61] R. Cohen, M. Elad, and P. Milanfar, “Regularization by denoising via fixed-point projection (RED-PRO),” arXiv:2008.00226 [eess.IV], 2020.
[62] K. Gregor and Y. LeCun, “Learning fast approximation of sparse coding,” in Proc. 27th Int. Conf. Machine Learning (ICML), Haifa, Israel, June 21-24, 2010, pp. 399–406.
[63] S. Biswas, H. K. Aggarwal, and M. Jacob, “Dynamic MRI using model-based deep learning and SToRM priors: MoDL-SToRM,” Magn. Reson. Med., vol. 82, no. 1, pp. 485–494, July 2019.
[64] A. Bora, A. Jalal, E. Price, and A. G. Dimakis, “Compressed sensing using generative models,” in
Proc. 34th Int. Conf. Machine Learning (ICML), International Convention Centre, Sydney, Australia, 6–11 Aug. 2017, vol. 70, pp. 537–546.
[65] V. Shah and C. Hegde, “Solving linear inverse problems using GAN priors: An algorithm with provable guarantees,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process. (ICASSP), 2018, pp. 4609–4613.
[66] A. Jalal, L. Liu, A. G. Dimakis, and C. Caramanis, “Robust compressed sensing using generative models,” Advances in Neural Information Processing Systems, vol. 33, 2020.
[67] N. Shlezinger, J. Whang, Y. C. Eldar, and A. G. Dimakis, “Model-based deep learning,” arXiv:2012.08405 [eess.SP], 2020.
[68] H. Lee, J. Lee, and S. Cho, “View-interpolation of sparsely sampled sinogram using convolutional neural network,” in Medical Imaging 2017: Image Processing, M. A. Styner and E. D. Angelini, Eds., 2017, vol. 10133, pp. 617–624.
[69] H. Lee, J. Lee, H. Kim, B. Cho, and S. Cho, “Deep-neural-network-based sinogram synthesis for sparse-view CT image reconstruction,” IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 109–119, 2019.
[70] R. Anirudh, H. Kim, J. J. Thiagarajan, K. A. Mohan, K. Champley, and T. Bremer, “Lose the views: Limited angle CT reconstruction via implicit sinogram completion,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 6343–6352.
[71] M. U. Ghani and W. C. Karl, “Fast enhanced CT metal artifact reduction using data domain deep learning,” IEEE Trans. Comput. Imag., vol. 6, pp. 181–193, 2019.
[72] M. U. Ghani and W. C. Karl, “Data and image prior integration for image reconstruction using consensus equilibrium,” arXiv:2009.00092 [eess.IV], 2020.
[73] M. U. Ghani, Data and image domain deep learning for computational imaging, Ph.D. thesis, 2021.
[74] B. E. H. Claus, Y. Jin, L. A. Gjesteby, G. Wang, and B. De Man, “Metal-artifact reduction using deep-learning based sinogram completion: Initial results,” in Proc. 14th Int. Meeting Fully Three-Dimensional Image Reconstruction Radiology Nuclear Medicine, 2017, pp. 631–634.
[75] Q. De Man, E. Haneda, B. Claus, P. Fitzgerald, B. De Man, G. Qian, H. Shan, J. Min, M. Sabuncu, and G. Wang, “A two-dimensional feasibility study of deep learning-based feature detection and characterization directly from CT sinograms,” Medical Physics, vol. 46, no. 12, pp. 790–800, 2019.
[76] Y. Han, L. Sunwoo, and J. C. Ye, “k-space deep learning for accelerated MRI,” IEEE Trans. Med. Imag., vol. 39, no. 2, pp. 377–386, 2020.
[77] M. Akçakaya, S. Moeller, S. Weingärtner, and K. Uğurbil, “Scan-specific robust artificial-neural-networks for k-space interpolation (RAKI) reconstruction: Database-free deep learning for fast imaging,” Magn. Reson. Med., vol. 81, no. 1, pp. 439–453, 2019.
[78] A. Shocher, N. Cohen, and M. Irani, “‘Zero-shot’ super-resolution using deep internal learning,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 3118–3126.
[79] L. P. Zuckerman, E. Naor, G. Pisha, S. Bagon, and M. Irani, “Across scales and across dimensions: Temporal super-resolution using deep internal learning,” in European Conference on Computer Vision (ECCV), 2020, pp. 52–68.
[80] D. Ulyanov, A. Vedaldi, and V. Lempitsky, “Deep image prior,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, June 18-22, 2018, pp. 9446–9454.
[81] J. Liu, Y. Sun, X. Xu, and U. S. Kamilov, “Image restoration using total variation regularized deep image prior,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process. (ICASSP), May 2019, pp. 7715–7719.
[82] Y. Gandelsman, A. Shocher, and M. Irani, “‘Double-DIP’: Unsupervised image decomposition via coupled deep-image-priors,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019, pp. 11026–11035.
[83] M. Tancik, P. P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. T. Barron, and R. Ng, “Fourier features let networks learn high frequency functions in low dimensional domains,” Advances in Neural Information Processing Systems 33, 2020.
[84] Z. Chen and H. Zhang, “Learning implicit fields for generative shape modeling,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 5939–5948.
[85] J. J. Park, P. Florence, J. Straub, R. Newcombe, and S. Lovegrove, “DeepSDF: Learning continuous signed distance functions for shape representation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 165–174.
[86] P. P. Srinivasan, B. Deng, X. Zhang, M. Tancik, B. Mildenhall, and J. T. Barron, “NeRV: Neural reflectance and visibility fields for relighting and view synthesis,” arXiv:2012.03927 [cs.CV], 2020.
[87] K. Park, U. Sinha, J. T. Barron, S. Bouaziz, D. B. Goldman, S. M. Seitz, and R. Brualla, “Deformable neural radiance fields,” arXiv:2011.12948 [cs.CV], 2020.
[88] Y. Sung, W. Choi, C. Fang-Yen, K. Badizadegan, R. R. Dasari, and M. S. Feld, “Optical diffraction tomography for high resolution live cell imaging,” Opt. Express, vol. 17, no. 1, pp. 266–277, December 2009.
[89] U. S. Kamilov, I. N. Papadopoulos, M. H. Shoreh, A. Goy, C. Vonesch, M. Unser, and D. Psaltis, “Learning approach to optical tomography,” Optica, vol. 2, no. 6, pp. 517–522, June 2015.
[90] T.-A. Pham, E. Soubies, A. Goy, J. Lim, F. Soulez, D. Psaltis, and M. Unser, “Versatile reconstruction framework for diffraction tomography with intensity measurements and multiple scattering,” Opt. Express, vol. 26, no. 3, pp. 2749–2763, February 2018.
[91] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.
[92] N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, and A. Courville, “On the spectral bias of neural networks,” in International Conference on Machine Learning, 2019, pp. 5301–5310.
[93] A. Jacot, F. Gabriel, and C. Hongler, “Neural tangent kernel: Convergence and generalization in neural networks,” in Advances in Neural Information Processing Systems 31, 2018, pp. 8571–8580.
[94] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in International Conference on Learning Representations (ICLR), 2015.
[95] A. C. Kak and M. Slaney, Principles of Computerized Tomographic Imaging, IEEE, 1988.
[96] C. McCollough, “TU-FG-207A-04: Overview of the low dose CT grand challenge,” Med. Phys., vol. 43, no. 6Part35, pp. 3759–3760, 2016.
[97] J. Adler and O. Öktem, “Solving ill-posed inverse problems using iterative deep neural networks,”