Fine-tuning deep learning model parameters for improved super-resolution of dynamic MRI with prior-knowledge
Chompunuch Sarasaen a,b,c,∗, Soumick Chatterjee a,c,d,e,∗, Mario Breitkopf a,c, Georg Rose b,c, Andreas Nürnberger d,e,g, Oliver Speck a,c,f,g,h

a Biomedical Magnetic Resonance, Otto von Guericke University Magdeburg, Germany
b Institute for Medical Engineering, Otto von Guericke University Magdeburg, Germany
c Research Campus STIMULATE, Otto von Guericke University Magdeburg, Germany
d Faculty of Computer Science, Otto von Guericke University Magdeburg, Germany
e Data and Knowledge Engineering Group, Otto von Guericke University Magdeburg, Germany
f German Center for Neurodegenerative Disease, Magdeburg, Germany
g Center for Behavioral Brain Sciences, Magdeburg, Germany
h Leibniz Institute for Neurobiology, Magdeburg, Germany

∗ C. Sarasaen and S. Chatterjee contributed equally to this work.
Abstract
Dynamic imaging is a beneficial tool for interventions to assess physiological changes. Nonetheless, during dynamic MRI, while achieving a high temporal resolution, the spatial resolution is compromised. To overcome this spatio-temporal trade-off, this research presents a super-resolution (SR) MRI reconstruction with prior-knowledge-based fine-tuning to maximise spatial information while preserving the high temporal resolution of dynamic MRI. A U-Net based network with perceptual loss is trained on a benchmark dataset and fine-tuned using one subject-specific static high resolution MRI as prior knowledge to obtain high resolution dynamic images during the inference stage. 3D dynamic data for three subjects were acquired with different parameters to test the generalisation capabilities of the network. The method was tested for different levels of in-plane undersampling for dynamic MRI. The reconstructed dynamic SR results showed higher similarity with the high resolution ground-truth after fine-tuning. The average SSIM of the lowest resolution experimented during this research (6.25% of the k-space) before and after fine-tuning were 0.939 and 0.957 respectively, showing the potential of the proposed approach for dynamic MRI, even for high acceleration factors.

Keywords: super-resolution, dynamic MRI, prior knowledge, fine-tuning, patch-based super-resolution, deep learning
1. Introduction
Magnetic Resonance Imaging (MRI) has been in clinical use for a few decades, with the advantages of non-ionising radiation, non-invasiveness and an excellent soft tissue contrast. Considering the clear visibility of tumours because of its high soft tissue contrast, together with real-time supervision (e.g. thermometry), MRI is a promising tool for interventions. The visualisation of lesions, as well as the needle paths, has to be acquired prior to any interventional procedure, in a so-called planning scan or pre-interventional MR imaging [1]. Furthermore, in MR-guided interventions, such as liver biopsy, it is necessary to continuously acquire data and reconstruct a series of images during the intervention, in order to examine dynamic movements of internal organs [2]. A clearly interpretable visualisation of the target lesion and the surrounding tissues, including risk structures, is crucial during interventions. In order to achieve a high temporal resolution during dynamic MRI, because of the inherently slow speed of image acquisition, the amount of data to be acquired has to be reduced, which may result in loss of spatial resolution. Although there are a number of techniques dealing with this spatio-temporal trade-off [3, 4, 5, 6], their speed of reconstruction creates a hindrance for real-time or near real-time imaging. Therefore, a compromise between spatial and temporal resolution is inevitable during real-time MRI and needs to be mitigated.

The so-called super-resolution (SR) algorithms aim to restore images with high spatial resolution from the corresponding low resolution images. SR approaches have been widely used for various applications [7, 8], including for super-resolution of MRIs (SR-MRI) [9, 10, 11]. Furthermore, deep learning based super-resolution reconstruction has been substantiated in recent times to be a successful tool for SR-MRI [12, 13], including for dynamic MRIs [14, 15].
Preprint submitted to Elsevier, February 5, 2021.

However, most deep learning based methods need large training datasets, and finding such training data (matching, in terms of contrast and sequence, the data of the real-time acquisition that needs to be reconstructed) can be a challenging task. Using a training set significantly different from the test set can produce results of poor quality [16, 17]. Several techniques have been used to deal with the problem of small datasets in deep learning, such as data augmentation [18] and synthetic data generation [19, 20]. However, these methods rely on artificially modifying the data to increase the size of the dataset. Patch-based training can also help cope with the small dataset problem by splitting each data volume into smaller patches. This can effectively increase the number of samples in the dataset without artificially modifying the data [21]. The patch-based super-resolution (PBSR) techniques learn the mapping function from given corresponding pairs of high resolution and low resolution image patches [22].

This study proposes a PBSR reconstruction, aiming at addressing the problem of the lack of large abdominal datasets for training. This research intends to improve deep learning based super-resolution of dynamic MR images by incorporating prior images (planning scan). The network was trained on a publicly available abdominal dataset of 40 subjects, acquired using different sequences than the dynamic MR that is to be reconstructed. After that, the network was fine-tuned using a high resolution prior planning scan of the same subject as the dynamic acquisition.

Super-resolution approaches have been employed for a wide variety of tasks, such as computer vision [23, 24, 8], remote sensing [7, 25], face-related tasks [26, 27] and medical applications [11, 28]. Deep learning based methods have been widely used in recent times for performing super-resolution [29, 30, 24, 25].
Moreover, deep learning based techniques have been proven to be a successful tool for numerous applications in the field of MRI, including for performing MR reconstruction [31, 32, 33, 34] and for SR-MRI [12, 35, 36, 13]. Different deep learning based SR-MRI ideas have been proposed for static brain MRI [28, 37, 38, 12, 35, 39, 40, 41]. Furthermore, deep learning based methods have additionally been shown to tackle the spatio-temporal trade-off [42], also for dynamic cardiac MR reconstruction [14, 15].

Single-image super-resolution techniques are classified into the groups of prediction-based, edge-based, image statistical and patch-based methods [22]. PBSR can overcome the need for large training datasets, as the actual training is done using patches rather than whole images. The PBSR methods have been applied to different tasks, including applications in medical imaging [43, 44, 45, 46, 47]. By employing PBSR, the reconstruction procedure can be driven to cope with the limited availability of training abdominal MR data [48, 49].

The U-Net [50] model, which was originally proposed for image segmentation, has over the past few years been proven to solve various inverse problems as well [32, 51, 52]. Iqbal et al. [51] developed a U-Net based architecture for SR reconstruction of MR spectroscopic images. Hyun et al. [32] reconstructed MRI utilising a 2D U-Net from zero-filled data, undersampled using uniform Cartesian sampling (GRAPPA-like) with a dense k-space centre. Ghodrati et al. [52] employed a U-Net model to test the performance of this network structure for cardiac MRI reconstruction. Due to the promising results shown in the papers mentioned earlier, the current paper proposes a 3D U-Net [53] based architecture for performing SR-MRI on abdominal dynamic images.

Transfer learning is a technique for re-purposing or adapting a pre-trained model with fine-tuning [54].
With transfer learning, the network weights learned from one task can be used as pre-trained weights for another task, and the network is then trained (fine-tuned) for the new task. It has been widely used in data mining and machine learning [55, 56, 57, 58]. Transfer learning can address the issue of having insufficient training data [59, 60]. The fine-tuning process is known to improve the network's performance and can help to converge in fewer training epochs with smaller datasets [61]. One of the main research inquiries in applying transfer learning is "what to transfer". This current study thereby utilises the specific knowledge of priors from a static planning image, which is usually acquired prior to an interventional procedure. The incorporation of priors is meant to constrain anatomical structures in the fine-tuning process and to improve the data fidelity term in the regularisation process.

To determine the reconstruction error during training of a deep learning model, between the model's prediction and the corresponding ground-truth images, the selection of a loss function is crucial. Pixel-based loss functions such as the mean squared error (L2 loss) are commonly used for SR; however, in terms of perceptual quality, they often generate overly smooth results, caused by the loss of high-frequency details [62, 63, 64, 65]. Perceptual loss has shown potential to achieve high-quality images for image transformation tasks such as style transfer [64, 66]. For MRI reconstruction, Ghodrati et al. [52] have shown a comparative study of loss functions, such as perceptual loss (using VGG-16 as the perceptual loss network), pixel-based loss (L1 and L2) and patch-wise structural dissimilarity (DSSIM), for deep learning based cardiac MR reconstruction. They found that the results of the perceptual loss outperformed the other loss functions.
Hence, in this work a combination of a perceptual loss network with the mean absolute error (MAE) was used as the loss function, which is explained in Section 2.3.1.

Given a low resolution image I_LR and a corresponding high resolution image I_HR, the reconstructed high resolution image Î_HR can be recovered from a super-resolution reconstruction using the following equation [67]:

Î_HR = f(I_LR; θ)    (1)

where f denotes the super-resolution model that maps the image counterparts and θ denotes the parameters of f. SR image reconstruction is an ill-posed problem; a network model can be trained to solve the objective function:

Î_HR = arg min L(Î_HR, I_HR) + λ R(I_LR)    (2)

where L(Î_HR, I_HR) denotes the loss function between the approximated HR image Î_HR and the ground-truth image I_HR, R(I_LR) is a regularisation term and λ denotes the trade-off parameter.

1.3. Contributions

This paper presents a method to incorporate prior knowledge in deep-learning based super-resolution and its application in dynamic MRI. The main contributions are as follows:

• This paper addresses the trade-off between the spatial and temporal resolution of dynamic MRI by incorporating a static high resolution scan as prior knowledge.

• A 3D U-Net model was first trained for the task of SR-MRI on a benchmark dataset and was fine-tuned using a subject-specific prior planning scan.

• This paper further tackles the problem of the lack of high-resolution dynamic MRI for training in two ways:
  - By using a static benchmark dataset for training, having different contrasts and resolutions than the target dynamic MRI, followed by fine-tuning using one static planning scan
  - By patch-based super-resolution training and fine-tuning

• To achieve realistic super-resolved images, perceptual loss was used as the loss function for training and fine-tuning the model, which was calculated using a 3D perceptual loss network pre-trained on MR images.
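The regularised objective in Eq. (2) can be made concrete with a toy numerical example. The sketch below (illustrative only, not the network used in this work) fits a linear model by gradient descent on an L2 data-fidelity term plus a small Tikhonov regulariser, with lambda playing the role of the trade-off parameter; all names are hypothetical.

```python
import numpy as np

# Toy instance of Eq. (2): find theta minimising
#     L(f(x; theta), y) + lambda * R(theta)
# with f a linear map, L the mean squared error and R the squared L2 norm.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 4))                 # stand-in "low resolution" inputs
true_theta = np.array([1.0, -2.0, 0.5, 3.0])
y = x @ true_theta                            # stand-in "high resolution" targets

lam, lr = 1e-3, 1e-2
theta = np.zeros(4)
for _ in range(2000):
    residual = x @ theta - y                              # data-fidelity error
    grad = 2 * x.T @ residual / len(y) + 2 * lam * theta  # + regulariser gradient
    theta -= lr * grad                                    # gradient descent step

# theta ends up close to true_theta, pulled slightly toward zero by lambda
```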
2. Methodology
This paper proposes a framework for patch-based MR super-resolution based on U-Net, incorporating prior knowledge. The framework can be divided into three stages: main training, fine-tuning and inference. The U-Net model was initially trained with a benchmark dataset for the main training and was then fine-tuned using a subject-specific prior static scan. Finally, in the inference stage, high resolution dynamic MRIs were reconstructed from low resolution scans. This section starts with a description of the various datasets used in this research, then explains the network architecture, followed by the implementation and training, and finally explains the metrics used for evaluation.
In this work, 3D abdominal MR volumes were artificially downsampled in-plane using the MRUnder pipeline [68] (available on GitHub: https://github.com/soumickmj/MRUnder) to simulate low resolution datasets. The low resolution data were generated by performing undersampling in-plane (phase-encoding and read-out direction) by taking the centre of k-space without zero-padding. The CHAOS challenge dataset [69] (T1-Dual; in- and opposed-phase images) was used for the main training. High resolution 3D static (breath-hold) and 3D "pseudo"-dynamic (free-breathing) scans for 10 time-points (TP) using a T1w FLASH sequence were acquired for fine-tuning and inference, respectively. Each time-point of the dynamic acquisition was treated as a separate 3D volume. Three healthy subjects were scanned with the same sequence but with different parameters on a 3T MRI (Siemens Magnetom Skyra). This aims to test the generalisation of the network. For each subject, the 3D static and the 3D dynamic scans were acquired in different sessions using the same sequences and parameters. The sequence parameters of the various datasets are listed in Table 1.

The CHAOS dataset (for main training), the 3D static scans (for fine-tuning) and the 3D dynamic scans (for inference) were artificially downsampled to three different levels of image resolution, by taking 25%, 10% and 6.25% of the k-space centre, resulting in MR acceleration factors of 2, 3 and 4 respectively (considering undersampling only in the phase-encoding direction). This can theoretically be accelerated to factors of 4, 9 and 16 respectively, considering the amount of data used for the SR reconstruction. The effective resolutions and the roughly estimated acquisition times of the low resolution images, calculated from the corresponding high-resolution images, are reported in Table 2.
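The undersampling described above (keeping only the centre of k-space, without zero-padding) can be sketched in a few lines of NumPy. This is a simplified 2D stand-in, not the actual MRUnder implementation:

```python
import numpy as np

def lowres_from_kspace_centre(img2d, keep_fraction):
    """Simulate a low resolution acquisition by keeping only the central
    portion of 2D k-space, without zero-padding, so the output matrix is
    genuinely smaller (as for the 25%, 10% and 6.25% experiments)."""
    ny, nx = img2d.shape
    frac = np.sqrt(keep_fraction)               # e.g. 25% of area -> 50% per axis
    ky = max(1, int(round(ny * frac)))
    kx = max(1, int(round(nx * frac)))
    k = np.fft.fftshift(np.fft.fft2(img2d))     # centred k-space
    cy, cx = ny // 2, nx // 2
    centre = k[cy - ky // 2: cy - ky // 2 + ky,
               cx - kx // 2: cx - kx // 2 + kx]    # crop the k-space centre
    return np.fft.ifft2(np.fft.ifftshift(centre))  # smaller, low resolution image

# Keeping 25% of k-space halves the matrix size in each in-plane direction
lowres = lowres_from_kspace_centre(np.random.rand(256, 192), 0.25)
```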
The acquisition times were calculated as AcqTime = PE_n × TR × S_m, where PE_n is the number of phase-encoding steps, TR is the repetition time, and S_m is the number of slices. Phase/slice oversampling, phase/slice resolutions and the GRAPPA factor were taken into account while calculating PE_n. The low resolution images served as the input to the network and were compared against the high resolution ground-truth images.

Fig 1 portrays the proposed network architecture. In this work, a 3D U-Net based model [50, 53, 70] with a perceptual loss network [71] was employed for super-resolution reconstruction. The U-Net architecture consists of two main paths: contracting (encoding) and expanding (decoding). The contracting path consists of three blocks, each comprising two convolutional layers and a ReLU activation function. The expanding path also consists of three blocks, but convolutional transpose layers were used instead of the convolutional layers. The training was performed using 3D patches of the volumes, with a patch size of 24³. The U-Net model requires the same size for input and output images; therefore, the input images were interpolated using trilinear interpolation before supplying them to the network. Patch-based training may result in patching artefacts during inference. To remove these artefacts, the inference was performed with a stride of one and the overlapped portions were averaged after reconstruction.

Fig 2 shows the method overview. The main training was performed using 3D patches of the 80 volumes (40 subjects, in-phase and opposed-phase for each subject), with a patch size of 24³ and a stride of 6 for the slice dimension and 12 for the other dimensions. After that, the network was fine-tuned using a single 3D static scan of the same subject from an earlier session, labelled as x,y,z,t in Fig 2. This static scan has the same resolution, contrast and volume coverage as the high resolution dynamic scan.
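The stride-one inference with averaging of overlapped patches can be sketched as follows; this is a simplified 2D NumPy stand-in for the 3D PyTorch pipeline, with `model` and the patch size as placeholders:

```python
import numpy as np

def infer_overlapping(img, model, patch=24, stride=1):
    """Apply `model` to every (patch x patch) window of `img` and average
    the overlapping predictions, mirroring the stride-one inference used
    to suppress patching artefacts. `model` maps a patch to a patch of
    the same size."""
    h, w = img.shape
    out = np.zeros((h, w))
    hits = np.zeros((h, w))                    # how often each pixel is covered
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            out[y:y + patch, x:x + patch] += model(img[y:y + patch, x:x + patch])
            hits[y:y + patch, x:x + patch] += 1
    return out / np.maximum(hits, 1)           # average the overlapped portions

# With an identity "model", overlap-averaging reproduces the input exactly.
```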
Table 1: MRI acquisition parameters of the CHAOS dataset and the subject-wise 3D dynamic scans. Static scans were performed using the same subject-wise sequence parameters as the dynamic scans for one time-point (TP), acquired at a different session.

                              CHAOS (40 Subjects)                Subject 1        Subject 2        Subject 3
Sequence                      T1 Dual In-Phase & Opposed-Phase   T1w FLASH 3D     T1w FLASH 3D     T1w FLASH 3D
Resolution (mm)               1.44 x 1.44 x 5 - 2.03 x 2.03 x 8  1.09 x 1.09 x 4  1.09 x 1.09 x 4  1.36 x 1.36 x 4
FOV x, y, z (mm)              315 x 315 x 240 - 520 x 520 x 280  280 x 210 x 160  280 x 210 x 160  350 x 262 x 176
Encoding matrix               256 x 256 x 26 - 400 x 400 x 50    256 x 192 x 40   256 x 192 x 40   256 x 192 x 44
Phase/Slice oversampling      -                                  10 / …           …                …
TE                            110.17 - 255.54 ms                 …                …                …
Bandwidth                     …                                  975 Hz/Px        975 Hz/Px        …
GRAPPA factor                 None                               2                None             None
Phase/Slice partial Fourier   -                                  Off / Off        Off / Off        Off / Off
Phase/Slice resolution        -                                  75 / 65 %        75 / 65 %        50 / 64 %
Fat suppression               -                                  None             On               On
Time per TP                   -                                  5.53 sec         11.76 sec        8.01 sec

Table 2: Effective resolutions and estimated acquisition times (per TP) of the dynamic and static datasets after performing different levels of artificial undersampling.

                              Subject 1                        Subject 2                        Subject 3
                              Resolution (mm)  Acq. Time (s)   Resolution (mm)  Acq. Time (s)   Resolution (mm)  Acq. Time (s)
High Resolution Ground-truth  1.09 x 1.09 x 4  4.81            1.09 x 1.09 x 4  9.61            1.36 x 1.36 x 4  6.62
25% of k-space                2.19 x 2.19 x 4  1.22            2.19 x 2.19 x 4  2.43            2.73 x 2.73 x 4  1.65
10% of k-space                3.50 x 3.50 x 4  0.47            3.50 x 3.50 x 4  0.94            4.38 x 4.38 x 4  0.66
6.25% of k-space              4.38 x 4.38 x 4  0.28            4.38 x 4.38 x 4  0.56            5.47 x 5.47 x 4  0.42

Figure 1: The proposed network architecture.

Figure 2: Method overview.

The static and the dynamic scans were not co-registered, to keep the setup similar to the real-life scenario and to keep the speed of inference fast. Fine-tuning and evaluations were performed with a patch size of 24³ and a stride of one. The implementation was done using PyTorch [72] and the model was trained using Nvidia Tesla V100 GPUs. The loss was minimised using the Adam optimiser with a learning rate of 1e-4. The main training was performed for 200 epochs. The network was fine-tuned for only one epoch, using the planning scan with a lower learning rate (1e-6).

Loss during the training and fine-tuning of the network was calculated with the help of perceptual loss [64]. The first three levels of the contraction path of a pre-trained (on 7T MRA scans, for vessel segmentation) frozen U-Net MSS model [71] were used as the perceptual loss network (PLN) to extract features from the final super-resolved output of the model and from the ground-truth images (refer to Fig. 1). Typically, VGG-16 trained on three-channel RGB non-medical images (ImageNet dataset) is used as the PLN, even when working with medical images [52], as the PLN does not have to be trained on a similar dataset.
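Schematically, a perceptual loss compares feature maps rather than raw pixels. In the sketch below, the hypothetical `extract_features` (simple average pooling, purely illustrative) stands in for the frozen pre-trained encoder; the loss itself is the sum of level-wise L1 distances:

```python
import numpy as np

def extract_features(img, levels=3):
    """Hypothetical stand-in for the frozen perceptual loss network: each
    'level' here is simply a 2x2 average-pooled version of the image. In
    this work, the levels would instead be the first three encoder feature
    maps of the pre-trained U-Net MSS model."""
    feats, f = [], img
    for _ in range(levels):
        h, w = f.shape[0] // 2 * 2, f.shape[1] // 2 * 2
        f = f[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        feats.append(f)
    return feats

def perceptual_loss(pred, target):
    """Sum over feature levels of the mean absolute error (L1) between the
    prediction's and the ground truth's features."""
    return sum(np.mean(np.abs(a - b))
               for a, b in zip(extract_features(pred), extract_features(target)))
```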
In this research, the pre-trained network was chosen because it was originally trained on single-channel medical images, although of a different contrast and organ; it was hypothesised that a network trained in this manner would be more suitable than a network trained on three-channel images. The extracted features from the model's output and from the ground-truth images were compared using the mean absolute error (L1 loss). The losses obtained at each level for each feature were then added together and backpropagated.

To evaluate the quality of the reconstructed images against the ground-truth HR images, two metrics were selected, namely the structural similarity index (SSIM) [63] and the peak signal-to-noise ratio (PSNR). For perceptual quality assessment, the accuracy of the reconstructed images was compared to the ground truth using SSIM, which is based on the computation of luminance, contrast and structure terms between images x and y:
SSIM(x, y) = ((2 μ_x μ_y + c_1)(2 σ_xy + c_2)) / ((μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2))    (3)

where μ_x, μ_y, σ_x, σ_y and σ_xy are the local means, standard deviations, and cross-covariance for images x and y, respectively; c_1 = (k_1 L)² and c_2 = (k_2 L)², where L is the dynamic range of the pixel values, k_1 = 0.01 and k_2 = 0.03.

PSNR = 10 log_10 (R² / MSE)    (4)

where R is the maximum fluctuation in the input image and MSE is the mean squared error between the reconstructed image and the ground truth.
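Both metrics follow directly from their definitions; below is a plain-Python sketch, computing a single global SSIM value rather than the windowed SSIM maps shown in the figures:

```python
import math

def psnr(mse, max_val):
    """Peak signal-to-noise ratio, Eq. (4): 10 * log10(R^2 / MSE)."""
    return 10.0 * math.log10(max_val ** 2 / mse)

def ssim_global(x, y, dynamic_range, k1=0.01, k2=0.03):
    """Global SSIM, Eq. (3), over two equal-length pixel sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n          # variance of x
    vy = sum((b - my) ** 2 for b in y) / n          # variance of y
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (k1 * dynamic_range) ** 2, (k2 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```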
3. Results and Discussion
Performance of the model was evaluated for three different levels of undersampling: by taking 25%, 10% and 6.25% of the k-space centre. The network was tested before and after fine-tuning using 3D dynamic MRI. The proposed approach was compared against the low resolution input, the traditional trilinear interpolation and the Fourier interpolation of the input (zero-filled k-space). There was a noticeable improvement, qualitatively and quantitatively, while reconstructing low resolution data using the proposed method, even for only 6.25% of the k-space. Fig 3 shows the qualitative comparison for the low resolution images obtained by taking 25%, 10% and 6.25% of the k-space. Fig 4 portrays the comparison of the low resolution input for 6.25% of the k-space, the lowest resolution investigated during this study, with the SR result after fine-tuning over different time points. The SSIM maps were calculated against the high resolution ground-truth, with the respective SSIM value shown on top of each image. Fig 5 illustrates the deviations of an example result from its corresponding ground-truth for two different regions of interest.

Additionally, for quantitative analysis, Table 3 displays the average and standard deviation (SD) of SSIM, PSNR and the SD of the subtracted images for all time points of the dynamic datasets. Here, each time-point has been considered independently of the others, as separate 3D volumes. To clearly show the distribution of the resultant metrics over all conditions and subjects, Fig 6 illustrates the SSIM and PSNR for the different resolutions.

It can be observed that the SR results after fine-tuning could alleviate the undersampling artefacts, which are still present in the SR results of the main training, even for relatively low resolution images like 10% and 6.25%. Consequently, the visibility of small details is improved.
Fine-tuning with the planning scan helped in obtaining sharper images and achieving a better edge delineation.

The acquisition time of the high resolution 3D "pseudo"-dynamic reference data in this study was ten seconds without parallel acquisition and five seconds with a GRAPPA factor of two (Table 2). These are not sufficient for real-time or near real-time applications and might lead to blurring in free-breathing subjects. This research shows the potential to acquire such a volume with only minimal loss of spatial information in less than half a second.

The fine-tuning process took approximately eight hours to finish for each subject using the earlier mentioned setup (Section 2.3). Super-resolving each time-point took only a fraction of a second. The time required for fine-tuning and inference can be reduced further by reducing the patch overlap (stride), though that might reduce the quality of the resultant super-resolved images. It can further be perceived that the network was able to produce results highly similar to the ground-truth (SSIM of 0.957) even while super-resolving from 6.25% of the k-space, which can make the acquisition 16 times faster. Combining this fast acquisition speed with the inference speed of the method, this study can be extended for use in real-time or near real-time MRI during interventions.

In the current study, only the centre of the k-space was used during undersampling, which results in a loss of resolution without creating explicit image artefacts. Other undersampling patterns, such as variable density or GRAPPA-like uniform undersampling of higher spatial frequencies, may be investigated in the future.

It should be noted that the static planning scans and the actual dynamic scans during interventions are typically acquired with different sequences, with planning scans having higher contrast and resolution than the dynamic scans.
This study was conducted using the same sequence for the static and dynamic scans, but with different resolutions and positions (different scan sessions). An additional experiment was performed by fine-tuning using a volumetric interpolated breath-hold examination (VIBE) sequence as the planning scan for one subject. Super-resolving the dynamic low-resolution images from 6.25% of the k-space resulted in a 0.032 lower SSIM than using the identical sequence with higher resolution for fine-tuning. This may be a limitation of the current approach but requires further investigation.
4. Conclusion and Future Work
This research shows that fine-tuning with a subject-specific prior static scan can improve the results of deep learning based super-resolution (SR) reconstruction. A 3D U-Net based model was trained with the help of perceptual loss to estimate the reconstruction error. The network model was initially trained using the CHAOS abdominal benchmark dataset and was then fine-tuned using a static high resolution prior scan. The model was used to obtain super-resolved high resolution 3D abdominal dynamic MRI from the corresponding low resolution images. Even though the network was trained using MRI sequences different from the reconstructed dynamic MRI, the SR results after fine-tuning showed higher similarity with the ground-truth images. The proposed method could overcome the spatio-temporal trade-off by improving the image resolution without compromising the speed of acquisition. This approach could be applied to real-time dynamic acquisitions, such as interventional MRI, because of the high inference speed of deep learning models.

In the presented approach, a 3D U-Net was used as the network model, which needs interpolation as a pre-processing step. Therefore, the reconstructed images could suffer from interpolation errors. As future work, network models such as SRCNN, which do not need interpolation, will be studied. In addition, image resolutions lower than the ones already investigated will be studied to check the network's limitations. Moreover, clinical interventions are performed with devices, such as needles, which are not present in the planning scan. The authors plan to extend this research in the future by evaluating on images with such devices.

Acknowledgements
This work was conducted within the context of the International Graduate School MEMoRIAL at Otto von Guericke University (OVGU) Magdeburg, Germany, kindly supported by the European Structural and Investment Funds (ESF) under the programme "Sachsen-Anhalt WISSENSCHAFT Internationalisierung" (project no. ZS / / / ).

Figure 3: Comparative results of low resolution (25%, 10% and 6.25% of k-space) 3D dynamic data of the same slice. From left to right: low resolution images (scaled-up), interpolated input (trilinear), super-resolution results of the main training (SR Results Main Training), super-resolution results after fine-tuning (SR After Fine-Tuning) and ground-truth images.

Figure 4: An example comparison of the low resolution input of the 6.25% of k-space with the super-resolution (SR) result after fine-tuning over three different time points, compared against the high resolution ground-truth using SSIM maps.

Figure 5: An example from the reconstructed results, compared against its ground-truth (GT) for low resolution images from 6.25% of k-space. From left to right, upper to lower: ground-truth, trilinear interpolation, SR result of the main training and SR result after fine-tuning. For the yellow ROI, (a-b): trilinear interpolation and the difference image from GT, (e-f): SR result of the main training and the difference image from GT and (i-j): SR result after fine-tuning and the difference image from GT. The images on the right are identical examples for the red ROI.

Table 3: The average and the standard deviation of SSIM, PSNR, and SD of difference images with ground-truth. The table shows the results for the different resolutions.
Data         25% of k-space              10% of k-space              6.25% of k-space
             SSIM   PSNR   diff SD      SSIM   PSNR   diff SD      SSIM   PSNR   diff SD
Trilinear    0.964 ± …     …      …     …      …      …            …      …      …

Figure 6: Line plot showing the mean and 95% confidence interval of the resultant SSIM and PSNR over the different time-points for each subject.

References

[1] A. H. Mahnken, J. Ricke, K. E. Wilhelm, CT- and MR-guided Interventions in Radiology, Vol. 22, Springer, 2009.
[2] M. A. Bernstein, K. F. King, X. J. Zhou, Handbook of MRI pulse sequences, Elsevier, 2004.
[3] J. Tsao, P. Boesiger, K. P. Pruessmann, k-t BLAST and k-t SENSE: dynamic MRI with high frame rate exploiting spatiotemporal correlations, Magnetic Resonance in Medicine 50 (5) (2003) 1031–1042.
[4] M. Lustig, J. M. Santos, D. L. Donoho, J. M. Pauly, k-t SPARSE: High frame rate dynamic MRI exploiting spatio-temporal sparsity, in: Proceedings of the 13th annual meeting of ISMRM, Seattle, Vol. 2420, 2006.
[5] M. Lustig, D. Donoho, J. M. Pauly, Sparse MRI: The application of compressed sensing for rapid MR imaging, Magnetic Resonance in Medicine 58 (6) (2007) 1182–1195.
[6] H. Jung, K. Sung, K. S. Nayak, E. Y. Kim, J. C. Ye, k-t FOCUSS: a general compressed sensing framework for high resolution dynamic MRI, Magnetic Resonance in Medicine 61 (1) (2009) 103–116.
[7] H. Zhang, Z. Yang, L. Zhang, H. Shen, Super-resolution reconstruction for multi-angle remote sensing images considering resolution differences, Remote Sensing 6 (1) (2014) 637–657.
[8] M. S. Sajjadi, B. Scholkopf, M. Hirsch, EnhanceNet: Single image super-resolution through automated texture synthesis, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 4491–4500.
[9] E. Van Reeth, I. W. Tham, C. H. Tan, C. L. Poh, Super-resolution in magnetic resonance imaging: a review, Concepts in Magnetic Resonance Part A 40 (6) (2012) 306–325.
[10] E. Plenge, D. H. Poot, M. Bernsen, G. Kotek, G. Houston, P. Wielopolski, L. van der Weerd, W. J. Niessen, E. Meijering, Super-resolution methods in MRI: can they improve the trade-off between resolution, signal-to-noise ratio, and acquisition time?, Magnetic Resonance in Medicine 68 (6) (2012) 1983–1993.
[11] J. S. Isaac, R. Kulkarni, Super resolution techniques for medical image processing, in: 2015 International Conference on Technologies for Sustainable Development (ICTSD), IEEE, 2015, pp. 1–6.
[12] K. Zeng, H. Zheng, C. Cai, Y. Yang, K. Zhang, Z. Chen, Simultaneous single- and multi-contrast super-resolution for brain MRI images based on a convolutional neural network, Computers in Biology and Medicine 99 (2018) 133–141.
[13] X. He, Y. Lei, Y. Fu, H. Mao, W. J. Curran, T. Liu, X. Yang, Super-resolution magnetic resonance imaging reconstruction using deep attention networks, in: Medical Imaging 2020: Image Processing, Vol. 11313, International Society for Optics and Photonics, 2020, p. 113132J.
[14] C. Qin, J. Schlemper, J. Caballero, A. N. Price, J. V. Hajnal, D. Rueckert, Convolutional recurrent neural networks for dynamic MR image reconstruction, IEEE Transactions on Medical Imaging 38 (1) (2018) 280–290.
[15] Q. Lyu, H. Shan, Y. Xie, D. Li, G. Wang, Cine cardiac MRI motion artifact reduction using a recurrent neural network, arXiv preprint arXiv:2006.12700.
[16] M. Wang, W. Deng, Deep visual domain adaptation: A survey, Neurocomputing 312 (2018) 135–153.
[17] G. Wilson, D. J. Cook, A survey of unsupervised deep domain adaptation, ACM Transactions on Intelligent Systems and Technology (TIST) 11 (5) (2020) 1–46.
[18] L. Perez, J. Wang, The effectiveness of data augmentation in image classification using deep learning, arXiv preprint arXiv:1712.04621.
[19] M. A. Lateh, A. K. Muda, Z. I. M. Yusof, N. A. Muda, M. S. Azmi, Handling a small dataset problem in prediction model by employ artificial data generation approach: A review, in: Journal of Physics: Conference Series, Vol. 892, 2017, p. 012016.
[20] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, H. Greenspan, GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification, Neurocomputing 321 (2018) 321–331.
[21] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, H. Greenspan, Modeling the intra-class variability for liver lesion detection using a multi-class patch-based CNN, in: International Workshop on Patch-based Techniques in Medical Imaging, Springer, 2017, pp. 129–137.
[22] C.-Y. Yang, C. Ma, M.-H. Yang, Single-image super-resolution: A benchmark, in: European Conference on Computer Vision, Springer, 2014, pp. 372–386.
[23] W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, Z. Wang, Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1874–1883.
[24] C. Dong, C. C. Loy, X. Tang, Accelerating the super-resolution convolutional neural network, in: European Conference on Computer Vision, Springer, 2016, pp. 391–407.
[25] Q. Ran, X. Xu, S. Zhao, W. Li, Q. Du, Remote sensing images super-resolution with deep convolution networks, Multimedia Tools and Applications 79 (13) (2020) 8985–9001.
[26] M. F. Tappen, C. Liu, A Bayesian approach to alignment-based image hallucination, in: European Conference on Computer Vision, Springer, 2012, pp. 236–249.
[27] X. Yu, B. Fernando, B. Ghanem, F. Porikli, R. Hartley, Face super-resolution guided by facial component heatmaps, in: Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 217–233.
[28] Y. Huang, L. Shao, A. F. Frangi, Simultaneous super-resolution and cross-modality synthesis of 3D medical images using weakly-supervised joint convolutional sparse coding, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 6070–6079.
[29] C. Dong, C. C. Loy, K. He, X. Tang, Learning a deep convolutional network for image super-resolution, in: European Conference on Computer Vision, Springer, 2014, pp. 184–199.
[30] Y. Zhu, Y. Zhang, A. L. Yuille, Single image super-resolution using deformable patches, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 2917–2924.
[31] S. Wang, Z. Su, L. Ying, X. Peng, S. Zhu, F. Liang, D. Feng, D. Liang, Accelerating magnetic resonance imaging via deep learning, in: 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), IEEE, 2016, pp. 514–517.
[32] C. M. Hyun, H. P. Kim, S. M. Lee, S. Lee, J. K. Seo, Deep learning for undersampled MRI reconstruction, Physics in Medicine & Biology 63 (13) (2018) 135007.
[33] K. Hammernik, T. Klatzer, E. Kobler, M. P. Recht, D. K. Sodickson, T. Pock, F. Knoll, Learning a variational network for reconstruction of accelerated MRI data, Magnetic Resonance in Medicine 79 (6) (2018) 3055–3071.
[34] S. Chatterjee, M. Breitkopf, C. Sarasaen, G. Rose, A. Nürnberger, O. Speck, A deep learning approach for reconstruction of undersampled Cartesian and radial data, in: ESMRMB 2019, 2019.
[35] C. Liu, X. Wu, X. Yu, Y. Tang, J. Zhang, J. Zhou, Fusing multi-scale information in convolution network for MR image super-resolution reconstruction, BioMedical Engineering OnLine 17 (1) (2018) 1–23.
[36] A. S. Chaudhari, Z. Fang, F. Kogan, J. Wood, K. J. Stevens, E. K. Gibbons, J. H. Lee, G. E. Gold, B. A. Hargreaves, Super-resolution musculoskeletal MRI using deep learning, Magnetic Resonance in Medicine 80 (5) (2018) 2139–2154.
[37] R. Tanno, D. E. Worrall, A. Ghosh, E. Kaden, S. N. Sotiropoulos, A. Criminisi, D. C.
Alexander, Bayesian image quality transfer with cnns: ex-ploring uncertainty in dmri super-resolution, in: International Confer-ence on Medical Image Computing and Computer-Assisted Intervention,Springer, 2017, pp. 611–619.[38] C.-H. Pham, A. Ducournau, R. Fablet, F. Rousseau, Brain mri super-resolution using deep 3d convolutional networks, in: 2017 IEEE 14th In-ternational Symposium on Biomedical Imaging (ISBI 2017), IEEE, 2017,pp. 197–200.[39] Y. Chen, F. Shi, A. G. Christodoulou, Y. Xie, Z. Zhou, D. Li, E ffi cientand accurate mri super-resolution using a generative adversarial networkand 3d multi-level densely connected network, in: International Confer-ence on Medical Image Computing and Computer-Assisted Intervention,Springer, 2018, pp. 91–99.[40] B. Deka, H. U. Mullah, S. Datta, V. Lakshmi, R. Ganesan, Sparse repre-sentation based super-resolution of mri images with non-local total varia-tion regularization, SN Computer Science 1 (5) (2020) 1–13.[41] Y. Gu, Z. Zeng, H. Chen, J. Wei, Y. Zhang, B. Chen, Y. Li, Y. Qin, Q. Xie, . Jiang, et al., Medsrgan: medical images super-resolution using gener-ative adversarial networks, Multimedia Tools and Applications 79 (2020)21815–21840.[42] M. Liang, J. Du, L. Li, Z. Xue, X. Wang, F. Kou, X. Wang, Video super-resolution reconstruction based on deep learning and spatio-temporal fea-ture self-similarity, IEEE Transactions on Knowledge and Data Engineer-ing.[43] J. V. Manj´on, P. Coup´e, A. Buades, V. Fonov, D. L. Collins, M. Robles,Non-local mri upsampling, Medical image analysis 14 (6) (2010) 784–792.[44] F. Rousseau, A. D. N. Initiative, et al., A non-local approach for im-age super-resolution using intermodality priors, Medical image analysis14 (4) (2010) 594–605.[45] Y. Zhang, G. Wu, P.-T. Yap, Q. Feng, J. Lian, W. Chen, D. Shen, Re-construction of super-resolution lung 4d-ct using patch-based sparse rep-resentation, in: 2012 IEEE Conference on Computer Vision and PatternRecognition, IEEE, 2012, pp. 925–931.[46] P. 
Coup´e, J. V. Manj´on, M. Chamberland, M. Descoteaux, B. Hiba, Col-laborative patch-based super-resolution for di ff usion-weighted images,NeuroImage 83 (2013) 245–261.[47] S. Jain, D. M. Sima, F. Sanaei Nezhad, G. Hangel, W. Bogner,S. Williams, S. Van Hu ff el, F. Maes, D. Smeets, Patch-based super-resolution of mr spectroscopic images: application to multiple sclerosis,Frontiers in neuroscience 11 (2017) 13.[48] Y. Tang, L. Shao, Pairwise operator learning for patch-based single-imagesuper-resolution, IEEE Transactions on Image Processing 26 (2) (2016)994–1003.[49] D. Misra, C. Crispim-Junior, L. Tougne, Patch-based cnn evaluationfor bark classification, in: European Conference on Computer Vision,Springer, 2020, pp. 197–212.[50] O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks forbiomedical image segmentation, in: International Conference on Medicalimage computing and computer-assisted intervention, Springer, 2015, pp.234–241.[51] Z. Iqbal, D. Nguyen, G. Hangel, S. Motyka, W. Bogner, S. Jiang, Super-resolution 1h magnetic resonance spectroscopic imaging utilizing deeplearning, Frontiers in oncology 9.[52] V. Ghodrati, J. Shao, M. Bydder, Z. Zhou, W. Yin, K.-L. Nguyen, Y. Yang,P. Hu, Mr image reconstruction using deep learning: evaluation of net-work structure and loss functions, Quantitative imaging in medicine andsurgery 9 (9) (2019) 1516–1527.[53] ¨O. C¸ ic¸ek, A. Abdulkadir, S. S. Lienkamp, T. Brox, O. Ronneberger, 3du-net: learning dense volumetric segmentation from sparse annotation,in: International conference on medical image computing and computer-assisted intervention, Springer, 2016, pp. 424–432.[54] Y. Bengio, I. Goodfellow, A. Courville, Deep learning, Vol. 1, MIT pressMassachusetts, USA:, 2017.[55] W. Dai, G.-R. Xue, Q. Yang, Y. Yu, Transferring naive bayes classifiersfor text classification, in: AAAI, Vol. 7, 2007, pp. 540–545.[56] B. Li, Q. Yang, X. 
Xue, Transfer learning for collaborative filtering via arating-matrix generative model, in: Proceedings of the 26th annual inter-national conference on machine learning, 2009, pp. 617–624.[57] K. Choi, G. Fazekas, M. Sandler, K. Cho, Transfer learning for musicclassification and regression tasks, arXiv preprint arXiv:1703.09179.[58] K.-H. Lee, X. He, L. Zhang, L. Yang, Cleannet: Transfer learning forscalable image classifier training with label noise, in: Proceedings of theIEEE Conference on Computer Vision and Pattern Recognition, 2018, pp.5447–5456.[59] W. Zhao, Research on the deep learning of the small sample data basedon transfer learning, in: AIP Conference Proceedings, Vol. 1864, AIPPublishing LLC, 2017, p. 020018.[60] Y.-G. Kim, S. Kim, C. E. Cho, I. H. Song, H. J. Lee, S. Ahn, S. Y. Park,G. Gong, N. Kim, E ff ectiveness of transfer learning for enhancing tumorclassification with a convolutional neural network on frozen sections, Sci-entific Reports 10 (1) (2020) 1–9.[61] S. J. Pan, Q. Yang, A survey on transfer learning, IEEE Transactions onknowledge and data engineering 22 (10) (2009) 1345–1359.[62] Z. Wang, E. P. Simoncelli, A. C. Bovik, Multiscale structural similarityfor image quality assessment, in: The Thrity-Seventh Asilomar Confer-ence on Signals, Systems & Computers, 2003, Vol. 2, Ieee, 2003, pp.1398–1402. [63] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, Image quality as-sessment: from error visibility to structural similarity, IEEE transactionson image processing 13 (4) (2004) 600–612.[64] J. Johnson, A. Alahi, L. Fei-Fei, Perceptual losses for real-time styletransfer and super-resolution, in: European conference on computer vi-sion, Springer, 2016, pp. 694–711.[65] C. Ledig, L. Theis, F. Husz´ar, J. Caballero, A. Cunningham, A. Acosta,A. Aitken, A. Tejani, J. Totz, Z. 
Wang, et al., Photo-realistic single imagesuper-resolution using a generative adversarial network, in: Proceedingsof the IEEE conference on computer vision and pattern recognition, 2017,pp. 4681–4690.[66] L. A. Gatys, A. S. Ecker, M. Bethge, Image style transfer using con-volutional neural networks, in: Proceedings of the IEEE conference oncomputer vision and pattern recognition, 2016, pp. 2414–2423.[67] Z. Wang, J. Chen, S. C. Hoi, Deep learning for image super-resolution: Asurvey, IEEE transactions on pattern analysis and machine intelligence.[68] S. Chatterjee, soumickmj / mrunder: Initial release (Jun. 2020). doi:10.5281/zenodo.3901455 .[69] A. E. Kavur, N. S. Gezer, M. Barıs¸, S. Aslan, P.-H. Conze, V. Groza, D. D.Pham, S. Chatterjee, P. Ernst, S. ¨Ozkan, et al., Chaos challenge-combined(ct-mr) healthy abdominal organ segmentation, Medical Image Analysis(2020) 101950.[70] C. Sarasaen, S. Chatterjee, A. N¨urnberger, O. Speck, Super resolution ofdynamic mri using deep learning, enhanced by prior-knowledge, in: 37thAnnual Scientific Meeting Congress of the European Society for Mag-netic Resonance in Medicine and Biology, 33(Supplement 1): S03.04,S28-S29, Springer, 2020. doi:10.1007/s10334-020-00874-0 .[71] S. Chatterjee, K. Prabhu, M. Pattadkal, G. Bortsova, F. Dubost, H. Mat-tern, M. de Bruijne, O. Speck, A. N¨urnberger, Ds6: Deformation-awarelearning for small vessel segmentation with small, imperfectly labeleddataset, arXiv preprint arXiv:2006.10802.[72] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan,T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf,E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner,L. Fang, J. Bai, S. Chintala, Pytorch: An imperative style, high-performance deep learning library, in: H. Wallach, H. Larochelle,A. Beygelzimer, F. d'Alch´e-Buc, E. Fox, R. Garnett (Eds.), Advancesin Neural Information Processing Systems 32, Curran Associates, Inc.,2019, pp. 8024–8035..[71] S. Chatterjee, K. 
Prabhu, M. Pattadkal, G. Bortsova, F. Dubost, H. Mat-tern, M. de Bruijne, O. Speck, A. N¨urnberger, Ds6: Deformation-awarelearning for small vessel segmentation with small, imperfectly labeleddataset, arXiv preprint arXiv:2006.10802.[72] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan,T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf,E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner,L. Fang, J. Bai, S. Chintala, Pytorch: An imperative style, high-performance deep learning library, in: H. Wallach, H. Larochelle,A. Beygelzimer, F. d'Alch´e-Buc, E. Fox, R. Garnett (Eds.), Advancesin Neural Information Processing Systems 32, Curran Associates, Inc.,2019, pp. 8024–8035.