Towards multi-sequence MR image recovery from undersampled k-space data
Cheng Peng, Wei-An Lin, Rama Chellappa, S. Kevin Zhou
The University of Maryland, College Park; Chinese Academy of Sciences; PengCheng Laboratory, Shenzhen
Abstract.
Undersampled MR image recovery has been widely studied for accelerated MR acquisition. However, it has been mostly studied under a single-sequence scenario, despite the fact that multi-sequence MR scans are common in practice. In this paper, we aim to optimize multi-sequence MR image recovery from undersampled k-space data under an overall time constraint, while considering the difference in acquisition time for various sequences. We first formulate it as a constrained optimization problem and then show that finding the optimal sampling strategy for all sequences and the best recovery model at the same time is combinatorial and hence computationally prohibitive. To solve this problem, we propose a blind recovery model that simultaneously recovers multiple sequences, and an efficient approach to find a proper combination of sampling strategy and recovery model. Our experiments demonstrate that the proposed method outperforms sequence-wise recovery and sheds light on how to decide the undersampling strategy for sequences within an overall time budget.
Introduction

Magnetic Resonance Imaging (MRI) is a widely used medical imaging technique. It holds several distinct advantages over other imaging modalities such as computed tomography (CT) and ultrasound. Not only can MRI resolve tissues at high quality, it can also be customized with different pulse sequences to produce a variety of desired contrasts that reveal specific kinds of tissues, such as blood vessels and tumor regions. Furthermore, compared to CT, MRI does not expose patients to ionizing radiation. MRI is, however, limited by its long acquisition time. This is because MRI data is acquired by traversing through k-space sequentially, where the speed of traversal is limited by the underlying MR physics and machine quality. In addition, many patients have to take multiple MR sequences, each of which uses different parameters to target specific tissues, resulting in a longer overall acquisition time. This leads to various practical problems, ranging from image blurriness due to patient movement to limited accessibility of the machines.

There is a large body of research on how to undersample MR k-space data while maintaining image quality. Lustig et al. [1] first proposed to use Compressed Sensing in MRI (CSMRI), assuming that the undersampled MR images have a sparse representation in some transform domain, where noise can be discarded through minimizing the L1 norm of the representation. This method was shown to yield much better results than zero-filling the missing k-space samples (ZF). Extending CSMRI, Ravishankar et al. [2] applied more adaptive sparse modelling through Dictionary Learning, where the transformation is optimized over specific sets of data, resulting in better sparsity encoding.
To further explore redundancy within the MR data, new methods have been proposed in recent years [3,4,5,6,7], focusing on extrapolating information in adjacent slices, in multi-acquisition scenarios, and in scenarios where an additional sequence is available. In the domain of Deep Learning, Schlemper et al. [8] proposed a cascade of CNNs that incorporates data consistency layers to de-noise MRI in the image domain while maintaining consistency in k-space, and showed that the results significantly outperformed [2]. Yang et al. [9] proposed DAGAN, which recovers undersampled MR images through a U-Net structure with perceptual and adversarial losses in addition to pixel-wise losses in image space and frequency space. Quan et al. [10] proposed RefineGAN, which performs reconstruction and refinement through two different networks, and enforces a cyclic loss in the image and frequency spaces.

Although the mentioned CNN-based methods have obtained impressive results, they focus on single-sequence reconstruction. Few studies have explored the effectiveness of CNN-based methods under multi-sequence scenarios, which are common in practice and have been shown to help in non-learning-based methods [7,11]. Xiang et al. [12] showed that a highly undersampled T2 sequence, given a fully sampled T1 sequence, can be well recovered through a Dense U-Net. Despite this, there has not been a quantitative study with regard to the best strategy for undersampling k-space over multiple sequences for image recovery. In this paper, we consider recovering multiple MR sequences through a single CNN.
The contributions of our paper can be summarized as follows: (i) we formulate a combinatorial constrained optimization problem, where given a limited acquisition time, we seek the best strategy to undersample the k-spaces of multiple sequences to achieve the best overall recovery; (ii) we propose a novel CNN-based blind recovery model that extrapolates the shared information across different sequences and simultaneously recovers them, as well as an efficient approach to finding a proper combination of sampling strategy and recovery model; (iii) we perform extensive evaluation on a large amount of real and simulated k-space data, which shows that the proposed model outperforms the method of independently recovering each sequence, and that our method finds an undersampling strategy adaptive to the given sequences.

We first note that the most popular MR k-space sampling method is the Cartesian trajectory, where a series of acquisitions is performed along equally-spaced parallel lines, conventionally called phase encoding lines. This leads to a natural implementation for MR undersampling, where the technicians can drop certain phase encoding lines from the sampling grid [1]. In this paper, we focus on undersampling with 1D masks along the phase encoding direction.

Consider multiple MR sequences with full k-space spectrums $\{F_s\}_{s=1}^S$, with each spectrum sampled by $N$ phase encoding lines. For each $F_s$, the unit time for sampling a phase encoding line is denoted by $t_s$. We define 1D sampling masks $M_s \in \{0,1\}^N$ which select a subset of encoding lines, $M_s \odot F_s$, for faster acquisition. By applying the inverse Fourier transform $\mathcal{F}^{-1}$, an undersampled MR image for sequence $s$ is reconstructed as

$$I_{M_s} = \mathcal{F}^{-1}(M_s \odot F_s). \quad (1)$$

When fully sampled, the MR image is reconstructed by $I_s = \mathcal{F}^{-1}(F_s)$.
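Equation (1) can be simulated in a few lines of numpy. The sketch below is illustrative: the image size, the central-band mask, and the use of a real-valued magnitude image (rather than complex multi-coil data) are all our simplifying assumptions.

```python
import numpy as np

def undersample(image, mask):
    """Simulate Eq. (1): keep only the phase-encoding lines selected by
    a binary 1D mask, then reconstruct by zero-filled inverse FFT."""
    # Full k-space spectrum F_s of the (magnitude) image.
    kspace = np.fft.fftshift(np.fft.fft2(image))
    # Broadcast the 1D mask M_s over the phase-encoding axis (rows here).
    kspace_under = kspace * mask[:, None]
    # Zero-filled reconstruction I_{M_s} = F^{-1}(M_s ⊙ F_s).
    return np.abs(np.fft.ifft2(np.fft.ifftshift(kspace_under)))

rng = np.random.default_rng(0)
image = rng.random((64, 64))
mask = np.zeros(64)
mask[16:48] = 1.0                         # keep the central 32 of 64 lines
recon = undersample(image, mask)          # aliased, zero-filled image
full = undersample(image, np.ones(64))    # an all-ones mask recovers the image
```

With the all-ones mask the forward and inverse transforms cancel, recovering the original image exactly, which is a convenient sanity check for the masking convention.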
If we denote the number of selected encoding lines by $|M_s|$, the total time needed to acquire all the sequences is $T = \sum_{s=1}^S t_s \times |M_s|$. Although undersampled MR is faster to acquire, it exhibits degraded quality compared to fully sampled MR. To allow fast acquisition and at the same time retain image quality, we first consider the problem of searching for an optimal sampling strategy $\{M_s\}_{s=1}^S$ and a deep neural network $f_\theta$ that best recovers the fully sampled $\{I_s\}_{s=1}^S$ from $\{I_{M_s}\}$ under a time constraint $T \le T_{max}$. This constrained optimization problem can be formulated as follows:

$$\min_{\theta,\, \{M_s\}} \sum_{s=1}^S \mathbb{E}_{I_s \sim p(I_s)} \left[ \left\| f_\theta(I_{M_s}) - I_s \right\| \right] \quad \text{s.t.} \quad \sum_{s=1}^S t_s |M_s| \le T_{max}. \quad (2)$$

In (2), we use a norm-based loss; however, other loss functions can be used too.

The problem defined in (2) is combinatorial in nature. First, the set $\{M_s\}_{s=1}^S$ has a total of $2^{NS}$ possible combinations. Secondly, the best recovery model depends on the choice of sampling strategy. As a result, the optimal solution to (2) is in general difficult to find. As a preliminary attempt, we assume a fixed candidate set $C = \{m_1, \ldots, m_F\}$ for each $M_s$. The number of possible sampling strategies then becomes $F^S$. However, even with this simplification, a straightforward approach to (2), which is

$$\min_{\{M_s\} \in C^S} \left( \min_\theta \sum_{s=1}^S \mathbb{E}_{I_s \sim p(I_s)} \left[ \left\| f_\theta(I_{M_s}) - I_s \right\| \right] \right) \quad \text{s.t.} \quad \sum_{s=1}^S t_s |M_s| \le T_{max}, \quad (3)$$

still requires training $F^S$ models and then choosing the one with minimum loss. This is necessary since each model is trained to best eliminate the noise introduced by its specific $M_s$, and inevitably becomes sub-optimal when the noise level/pattern is changed.

In this work, we propose an efficient approach that finds a $(\theta, \{M_s\}_{s=1}^S)$ while circumventing the computational cost of training an excessive number of models.
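To make the combinatorics of (3) concrete, the feasible strategies can be enumerated directly; naively, each one would require training its own model. The candidate masks below are abbreviated by how many lines they keep, and all the numbers are made up for illustration.

```python
from itertools import product

def feasible_strategies(candidates, unit_times, t_max):
    """Enumerate the F^S sampling strategies (one candidate mask per
    sequence) and keep those meeting the time budget of Eq. (2)/(3).
    `candidates` maps a mask label to its number of kept lines |M_s|."""
    labels = sorted(candidates)
    out = []
    for combo in product(labels, repeat=len(unit_times)):
        cost = sum(t * candidates[m] for t, m in zip(unit_times, combo))
        if cost <= t_max:
            out.append((combo, cost))
    return out

# Toy example: F = 3 candidate masks, S = 3 sequences, N = 256 lines,
# equal unit times, and a budget of 192 line-acquisitions in total.
candidates = {"1/8": 32, "1/4": 64, "1/2": 128}
strategies = feasible_strategies(candidates, unit_times=[1, 1, 1], t_max=192)
```

Even in this toy setting, 11 of the 27 strategies are feasible, and each would need a dedicated model under the naive scheme; this is the cost the blind recovery model below avoids.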
Conceptually, we propose to first train a blind recovery model (BRM), which takes randomly undersampled MR sequences as inputs and recovers them to fully sampled MR sequences. The trained BRM can then be used as an MR sequence quality estimator to search for the optimal sampling strategy $\{M^*_s\}_{s=1}^S$. Finally, with $\{M^*_s\}_{s=1}^S$, we can proceed to solve (3) by fine-tuning the existing BRM. In total, the proposed method only requires training one CNN, which significantly reduces the computational cost.
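The search stage, formalized as (5) below, can be sketched as follows. Here `score` is a placeholder for running the trained BRM over a validation set and averaging its reconstruction loss, and candidate masks are again represented only by their kept line counts; both are our simplifications of the actual pipeline.

```python
from itertools import product

def search_strategy(candidates, unit_times, t_max, score):
    """Exhaustive search over all F^S feasible mask combinations: return
    the strategy with the lowest estimated loss under the time budget.
    `score(combo)` stands in for the expected reconstruction loss of the
    trained BRM f_{theta*} evaluated by forward passes on held-out data."""
    best, best_loss = None, float("inf")
    for combo in product(sorted(candidates), repeat=len(unit_times)):
        cost = sum(t * candidates[m] for t, m in zip(unit_times, combo))
        if cost > t_max:
            continue  # violates the acquisition-time constraint
        loss = score(combo)
        if loss < best_loss:
            best, best_loss = combo, loss
    return best, best_loss

# Toy run: 3 candidate masks (lines kept), 3 sequences, unit times 1:1:1.
candidates = {"1/8": 32, "1/4": 64, "1/2": 128}
# Pretend loss for illustration: more total kept lines -> lower loss.
score = lambda c: 1.0 / sum(candidates[m] for m in c)
best, best_loss = search_strategy(candidates, [1, 1, 1], t_max=192, score=score)
```

Under this toy loss, the search saturates the budget (192 kept lines in total), mirroring how the real search favors strategies that maximally utilize the allotted time.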
A blind recovery model (BRM) is a CNN $f_\theta$ which recovers $I_s$ by fusing information from different undersampled MR sequences $\{I_{M_s}\}_{s=1}^S$, $M_s \in C$. We adopt a data augmentation approach, which randomly selects sampling masks from $C$, and consider the following unconstrained optimization problem:

$$\theta^* = \arg\min_\theta \sum_{s=1}^S \mathbb{E}_{I_s \sim p(I_s),\, M_s \sim p(C)} \left[ \left\| f_\theta(I_{M_s}) - I_s \right\| \right]. \quad (4)$$

As we will show, the model trained under this scheme sacrifices its ability to overfit on a specific sampling profile, and in exchange performs generally well across all sampling profiles; it can therefore serve as a good estimator for discovering the best sampling strategy.

Given a trained BRM $f_{\theta^*}$, we propose to search for the optimal sampling strategy by finding the one with minimum loss:

$$\{M^*_s\} = \arg\min_{\{M_s\}} \sum_{s=1}^S \mathbb{E}_{I_s \sim p(I_s)} \left[ \left\| f_{\theta^*}(I_{M_s}) - I_s \right\| \right] \quad \text{s.t.} \quad \sum_{s=1}^S t_s |M_s| \le T_{max}. \quad (5)$$

The above exhaustive search requires $F^S$ forward passes, which is significantly less computationally heavy than training $F^S$ CNNs. The solution $\theta^*$ can be further improved by learning a refined model specific to $\{M^*_s\}$:

$$\hat{\theta} = \arg\min_\theta \sum_{s=1}^S \mathbb{E}_{I_s \sim p(I_s)} \left[ \left\| f_\theta(I_{M^*_s}) - I_s \right\| \right]. \quad (6)$$

Since the BRM takes multiple images from different sequences as inputs, one has the option of training (a) multiple SISO (single input single output) CNNs, one per sequence, or (b) one monolithic MIMO (multiple input multiple output) CNN for all sequences. The latter option holds several advantages over the former. First, option (a) does not consider the complementary information across different sequences. As shown in [12,3], there exists a strong correlation between sequences of the same patient, as they share the underlying anatomical structures.
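A MIMO model in the spirit of option (b) can be sketched in PyTorch using Residual Dense Blocks, the building block of our pipeline in Fig. 1. The block count, per-block layer count, and channel widths below are illustrative guesses, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RDB(nn.Module):
    """Residual Dense Block [13]: densely connected convolutions whose
    concatenated features are fused by a 1x1 conv, plus a local
    residual connection."""
    def __init__(self, channels=32, growth=16, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        c = channels
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(c, growth, 3, padding=1), nn.ReLU(inplace=True)))
            c += growth  # every layer sees all previous features
        self.fuse = nn.Conv2d(c, channels, 1)  # local feature fusion

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return x + self.fuse(torch.cat(feats, dim=1))  # local residual

class MIMORecovery(nn.Module):
    """MIMO BRM sketch: the S undersampled sequences enter as input
    channels and all S recovered sequences leave as output channels."""
    def __init__(self, n_seq=3, channels=32, n_blocks=2):
        super().__init__()
        self.head = nn.Conv2d(n_seq, channels, 3, padding=1)
        self.body = nn.Sequential(*[RDB(channels) for _ in range(n_blocks)])
        self.tail = nn.Conv2d(channels, n_seq, 3, padding=1)

    def forward(self, x):
        return x + self.tail(self.body(self.head(x)))  # global residual

net = MIMORecovery()
out = net(torch.randn(1, 3, 64, 64))  # batch of (T1, T2, FLAIR) slices
```

The global residual connection means the network only has to predict the aliasing artifacts to remove, which is a common design choice for restoration models.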
If a particular sequence is severely undersampled, leading to the loss of some anatomical detail, such information may be present in other, less severely undersampled sequences. Secondly, option (b) only requires training one model, while option (a) requires $S$ models. As all the models attempt to eliminate distortions due to undersampling, they should learn similar features.

Our multi-sequence simultaneous recovery approach is shown in Fig. 1. The approach is based on the Residual Dense Block (RDB) [13], which incorporates the ideas of residual learning and dense blocks [14], allowing all layers of features to be seen directly by other layers. During learning, each raw k-space datum $F_s$ first gets undersampled through a randomly generated mask $M_s$. The results are then transformed from k-space to image space and concatenated before being sent to the recovery network, which outputs $I^R_S$. The loss function is defined as $L = \| I^R_S - I_S \|$.

Fig. 1: Multi-sequence recovery pipeline with the masks $M_s$ randomly selected.

Datasets.
We employ two datasets. The first consists of privately collected k-space raw data of three sequences (T1, T2, FLAIR) from 20 patients, with each sequence containing 18 slices. The sequences are co-registered and taken with an MRI machine with 8 channels; in order to augment training, we treat each channel as an individual image, resulting in a total of 2,880 three-sequence images, which are divided into a ratio of 17:1:2 for training, validation, and testing. We refer to this dataset as "real data". In order to further validate our research, we also employ the Brain Tumor Image Segmentation (BraTS) dataset [15,16], which contains T1, T2, and FLAIR. The sequences are co-registered to the same anatomical template, skull-stripped, and interpolated to the same resolution. We divide the selected 167 cases into a ratio of 140:10:17 for training, validation, and testing. From every case, we select the middle 60 slices that contain most of the anatomical details. Because BraTS does not provide raw k-space data, we follow common practice [12,9] to simulate k-space data. We refer to this dataset as "simulated data". Below, our insights are first demonstrated with experiments on real data and are further validated on simulated data.
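A sketch of this common simulation practice follows: each magnitude slice is treated as a real-valued image whose 2D FFT serves as the "acquired" k-space. This is a simplification (real acquisitions are complex-valued and multi-coil), and the array sizes are illustrative.

```python
import numpy as np

def simulate_kspace(volume):
    """Simulate single-coil k-space from magnitude images (e.g. BraTS)
    by taking the centered 2D FFT of each slice."""
    return np.fft.fftshift(np.fft.fft2(volume, axes=(-2, -1)), axes=(-2, -1))

slices = np.random.rand(60, 128, 128)   # middle 60 slices of one case
kspace = simulate_kspace(slices)

# Sanity check: the inverse transform recovers the original slices.
recon = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace, axes=(-2, -1)),
                            axes=(-2, -1)))
```

Undersampling masks can then be applied to `kspace` exactly as with the real raw data, so both datasets flow through the same training pipeline.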
Acquisition time and undersampling settings.
In general, T2 and FLAIR have a longer repetition time (TR) than T1; however, the acquisition time of each sequence also depends on the number of excitations. A larger number of excitations helps better resolve sequences but takes a longer time. Therefore, the acquisition time of each sequence is rather machine-dependent. Here we consider three experimental settings: $t_{T1} : t_{T2} : t_{FLAIR}$ = (1) 1:1:1, (2) 1:4:6, and (3) 2:3:6. We experiment with both low-pass sampling [12] and random sampling [9]. We found that random sampling works better on real data but worse on simulated data. As our approach is agnostic to the sampling strategy, we choose the better-performing sampling strategy for each dataset. During BRM training, the masks $M_s$ are generated based on a random $\lambda_s \in [1, k]$, where $k$ is the maximum undersampling factor (we set $k = 8$). This means that the BRM, after training, can handle a continuous set of undersampling factors on every sequence. Evaluation metrics.
We utilize two metrics to gauge image quality: PSNR(peak signal-to-noise ratio) and SSIM (structural similarity). Since we mainlyfocus on three sequences, calculation of these metrics on three-sequence outputs
is the same as on RGB images. This is easily extensible to a larger number of sequences. Since MRI images do not have a fixed dynamic range, PSNR values should be regarded as relative improvements. For example, a T2 image tends to have a lower PSNR, as it has the highest peak out of the three sequences.

Fig. 2: Quantitative recovery performance comparison. The Pearson correlation coefficient between Dedicated and MIMO vs. between Dedicated and ZF is 0.85 vs. -0.33 in the selected range.

Network training.
We implement the proposed approach in PyTorch and train all models with the Adam optimizer, using a momentum of 0.5 and a learning rate of 0.0001, until they reach convergence.
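As noted under Evaluation metrics, PSNR is computed jointly over the three sequence channels, the same way it would be for an RGB image. A minimal numpy sketch; using the reference volume's own peak as the dynamic range is our assumption, motivated by MR having no fixed range.

```python
import numpy as np

def psnr(ref, rec):
    """PSNR over all sequence channels jointly, as for an RGB image.
    The reference's own peak serves as the dynamic range, so values
    are only meaningful as relative improvements."""
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    return 10.0 * np.log10(float(ref.max()) ** 2 / mse)

ref = np.random.rand(3, 64, 64)                  # stacked (T1, T2, FLAIR)
rec = ref + 0.01 * np.random.randn(3, 64, 64)    # a noisy "reconstruction"
score = psnr(ref, rec)
```

SSIM can be computed analogously, e.g. with scikit-image's `structural_similarity`, treating the sequence axis as the channel axis.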
Main results.
We evaluate the effectiveness of BRM in order to empirically show that a properly trained network $f_\theta$ performs well regardless of the choice of $\{M_s\}$, and serves as a good estimator of the best sampling strategy. Furthermore, we want to show that the MIMO BRM performs better than the SISO BRM.

The study is done by training (i) one MIMO BRM, (ii) three SISO BRMs, one per sequence, and (iii) many models that are dedicated to specific sampling ratios. All the models follow the same structure as shown in Fig. 1. The proposed training scheme for continuous $\lambda_s \in [1, k]$ allows us to efficiently investigate the performance of different undersampling strategies. For each acquisition time setting $\{t_s\}_{s=1}^S$, we search through possible $\{\lambda_s\}_{s=1}^S$ on the simplex $\sum_{s=1}^S t_s N / \lambda_s = T_{max}$, which maximally utilizes the budgeted time $T_{max}$. We select hundreds of $\{\lambda_s\}_{s=1}^S$ under the 1:1:1 time setting, and set $T_{max} = T/4$, i.e., a 75% reduction in time. We run the trained models on the test set, and plot the reconstruction performance in Fig. 2. The top-three performing sampling strategies for the different acquisition time settings are shown in Table 1.

Fig. 2 shows a clear performance gap between MIMO and SISO. Overall, the reconstruction performance of ZF images is a good indicator of the performance of the BRMs; however, the correlation fluctuates often, and two sets of ZF that are similar in PSNR can differ by more than 1 dB after going through the BRM. To limit the number of dedicated models we need to train, we select a range of sampling factors for which ZF performance does not correlate well with MIMO/SISO performance, and train 30 dedicated models to see how well the BRM predicts the performance of dedicated models.

As we observe from the right image in Fig.
2, our BRM, in both the MIMO and SISO settings, predicts the performance of the dedicated models with a high correlation.

Table 1: Quantitative evaluation of the top-performing $\lambda_s$ under different acquisition time assumptions. The numbers are PSNR (dB) / SSIM; missing entries are marked "/".

t_T1 : t_T2 : t_FLAIR | Data      | λ_T1, λ_T2, λ_FLAIR | ZF          | SISO        | MIMO        | MIMO (tuned)
1 : 1 : 1             | Real      | 6.6, 2.1, 8.0       | 33.48/0.918 | 38.57/0.980 | 39.24/0.984 | 40.00/0.987
1 : 1 : 1             | Real      | 8.00, 2.11, 6.63    | 33.43/0.920 | 38.36/0.979 | 39.16/0.984 | /
1 : 1 : 1             | Simulated | 5.27, 3.41, 3.74    | 32.31/0.889 | 37.88/0.975 | 38.31/0.979 | 38.98/0.980
1 : 1 : 1             | Simulated | 6.10, 3.14, 3.74    | 32.21/0.887 | 37.51/0.973 | 38.31/0.978 | 38.99/0.980
2 : 3 : 6             | Simulated | 2.61, 3.74, 5.16    | 32.87/0.899 | 38.01/0.976 | 38.67/0.980 | /
2 : 3 : 6             | Simulated | 2.44, 3.74, 5.40    | 32.84/0.899 | 37.87/0.975 | 38.66/0.980 | 39.35/0.982
2 : 3 : 6             | Simulated | 2.61, 3.41, 5.66    | 32.82/0.899 | 37.80/0.975 | 38.65/0.980 | 39.33/0.982

We further choose the best three $\{\lambda_s\}_{s=1}^S$ and perform the last stage of fine-tuning according to (6). A visual evaluation on real data is shown in Fig. 2; for simulated data, please refer to the Supplementary Material. Based on the best-performing $\{\lambda_s\}_{s=1}^S$, we observe that among T1, T2, and FLAIR, the results are best when T2 is sampled the most. We suggest that this makes intuitive sense, as T2 images provide the best contrast of the three sequences, which can compensate for the details lost in the other images. The same observation can be made on the simulated data, where both T2 and FLAIR show good contrast. When the time setting is changed to a non-uniform one, our search for the best sampling strategy reflects the change: T1 is sampled more as a result of its faster acquisition time, while T2 is still sufficiently sampled.

Conclusion

In this work, we formulated multi-sequence MR recovery as a constrained optimization problem and explored methods to solve it. We proposed a CNN-based approach and an optimization scheme that finds a proper combination of sampling strategy and recovery model without combinatorial complexity. We evaluated our approach on both private raw data and public simulated data, demonstrating that our method quickly finds a sampling strategy that yields superior reconstruction performance. We showed that our model outperforms single-sequence recovery methods in terms of recovery quality and time and space complexity. We believe that this work builds a foundation for further research in multi-sequence MR recovery.
References
1. Lustig, M., Donoho, D., Pauly, J.M.: Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine (6) (2007) 1182–95
2. Ravishankar, S., Bresler, Y.: MR image reconstruction from highly undersampled k-space data by dictionary learning. IEEE TMI (5) (2011) 1028–41
Fig. 2: Visual comparison of different methods, with PSNR (dB) and SSIM values listed under the images. After recovery, the images are sharper, with more visible details.
3. Huang, J., Chen, C., Axel, L.: Fast multi-contrast MRI reconstruction. In: MICCAI. (2012) 281–288
4. Hirabayashi, A., Inamuro, N., et al.: Compressed sensing MRI using sparsity induced from adjacent slice similarity. In: SampTA, IEEE (2015) 287–291
5. Senel, L.K., Kilic, T., et al.: Statistically segregated k-space sampling for accelerating multiple-acquisition MRI. IEEE Trans. Medical Imaging (2019)
6. Gozcu, B., Mahabadi, R.K., et al.: Learning-based compressive MRI. IEEE Trans. Med. Imaging (6) (2018) 1394–1406
7. Gong, E., et al.: PROMISE: Parallel-imaging and compressed-sensing reconstruction of multicontrast imaging using sharable information. Magnetic Resonance in Medicine (2015)
8. Schlemper, J., Caballero, J., et al.: A deep cascade of convolutional neural networks for dynamic MR image reconstruction. IEEE TMI (2) (2018) 491–503
9. Yang, G., Yu, S., et al.: DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction. IEEE TMI (2018) 1310–21
10. Quan, T.M., et al.: Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss. IEEE TMI (2018) 1488–97
11. Bilgic, B., Kim, T.H., et al.: Improving parallel imaging by jointly reconstructing multi-contrast data. Magnetic Resonance in Medicine (2) (2018) 619–632
12. Xiang, L., Chen, Y., et al.: Ultra-fast T2-weighted MR reconstruction using complementary T1-weighted information. In: MICCAI. (2018) 215–223
13. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y.: Residual dense network for image super-resolution. CoRR abs/1802.08797 (2018)
14. Huang, G., Liu, Z., Weinberger, K.Q.: Densely connected convolutional networks. CoRR abs/1608.06993 (2016)
15. Menze, B.H., Jakab, A., et al.: The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imaging (10) (2015) 1993–2024
16. Bakas, S., et al.: Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data (2017)
Fig. 2: Visual comparison of different recovery methods on real data
Fig. 2: Visual comparison of different recovery methods on real data
Fig. 2: Visual comparison of different recovery methods on simulated data. Note that BraTS sequences are interpolated for registration; therefore the image quality is not as good as the real data.