Complementary Time-Frequency Domain Networks for Dynamic Parallel MR Image Reconstruction

Abstract

Purpose: To introduce a novel deep learning based approach for fast and high-quality dynamic multi-coil MR reconstruction by learning a complementary time-frequency domain network that exploits spatio-temporal correlations simultaneously from complementary domains. Theory and Methods: Dynamic parallel MR image reconstruction is formulated as a multi-variable minimisation problem, where the data is regularised in the combined temporal Fourier and spatial (x-f) domain as well as in the spatio-temporal image (x-t) domain. An iterative algorithm based on a variable splitting technique is derived, which alternates among signal de-aliasing steps in x-f and x-t spaces, a closed-form point-wise data consistency step and a weighted coupling step. The iterative model is embedded into a deep recurrent neural network which learns to recover the image by exploiting spatio-temporal redundancies in complementary domains. Results: Experiments were performed on two datasets of highly undersampled multi-coil short-axis cardiac cine MRI scans. Results demonstrate that our proposed method outperforms the current state-of-the-art approaches both quantitatively and qualitatively. The proposed model can also generalise well to data acquired from a different scanner and data with pathologies that were not seen in the training set. Conclusion: The work shows the benefit of reconstructing dynamic parallel MRI in complementary time-frequency domains with deep neural networks. The method can effectively and robustly reconstruct high-quality images from highly undersampled dynamic multi-coil data (16× and 24×, yielding 15 s and 10 s scan times respectively) with fast reconstruction speed (2.8 s). This could potentially facilitate achieving fast single-breath-hold clinical 2D cardiac cine imaging.

FULL PAPER

Submitted to Magnetic Resonance in Medicine

Chen Qin | Jinming Duan | Kerstin Hammernik | Jo Schlemper | Thomas Küstner | René Botnar | Claudia Prieto | Anthony N. Price | Joseph V. Hajnal | Daniel Rueckert

Department of Computing, Imperial College London, London, United Kingdom
Institute for Digital Communications, School of Engineering, University of Edinburgh, Edinburgh, United Kingdom
School of Computer Science, University of Birmingham, Birmingham, United Kingdom
AI in Medicine and Healthcare, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
Hyperfine Research Inc., Guilford, CT, USA
School of Biomedical Engineering and Imaging Sciences, King's College London, St. Thomas' Hospital, London, United Kingdom
Department of Diagnostic and Interventional Radiology, Medical Image and Data Analysis, University Hospital of Tuebingen, Tuebingen, Germany

Correspondence

Chen Qin, PhD, Institute for Digital Communications, School of Engineering, University of Edinburgh, Edinburgh, EH9 3JL, United Kingdom
Email: [email protected]

Funding information

EPSRC Programme Grant: EP/P001009/1

Manuscript information:

Word count (abstract): 240
Word count (body): approx. 5000
Number of figures/tables: 6/4
Number of supporting figures/videos: 2/2

KEYWORDS

Dynamic Parallel Magnetic Resonance Imaging, Deep Learning, Cardiac Image Reconstruction, Temporal Fourier Transform, Complementary Domain, Recurrent Neural Networks.

1 | INTRODUCTION

Magnetic Resonance Imaging (MRI) is a widely used diagnostic modality which generates images with high spatial and temporal resolution as well as excellent soft tissue contrast. Dynamic MRI is often used to monitor dynamic processes of anatomy such as cardiac motion by acquiring a series of images at a high frame rate. However, the physics of the image acquisition process as well as physiological constraints limit the speed of MRI acquisition, and long scan times also make it difficult to acquire images of moving structures. Thus, acceleration of the MRI acquisition is crucial to enable these clinical applications.

Parallel imaging (PI) techniques [1–3] have been widely used to accelerate MR imaging. They shorten the scan time by sampling only a limited number of phase-encoding steps, and then exploiting the correlations to restore the missing information in the reconstruction phase. Compressed sensing (CS) techniques combined with PI have shown great potential in improving the image reconstruction quality and acquisition speed [4–8]. CS-based methods exploit signal sparsity in some specific transform domain, and recover the original image from undersampled k-space data using nonlinear reconstructions. One effective means to exploit spatio-temporal redundancies for signal recovery in dynamic MRI is to enforce sparsity in the combined spatial and temporal Fourier (x-f) domain while maintaining consistency with the acquired undersampled k-space data, as in methods such as k-t FOCUSS [4, 5] and k-t SPARSE-SENSE [6]. Combinations of CS with low-rank matrix completion schemes and spatio-temporal partial separability [7, 9, 10] have also been proposed to exploit correlations between the temporal profiles of the voxels, e.g. k-t SLR [7]. Some more recent approaches [11, 12] also utilised patch-based regularisation frameworks to exploit geometric similarities in the spatio-temporal domain.
However, these CS-based approaches often require careful selection of problem-specific regularisation schemes, and the tuning of hyper-parameters is often non-trivial. Furthermore, the reconstruction speed of these methods is often slow due to the iterative nature of the optimisation used, and in the context of dynamic imaging, the additional time domain further increases the computational demand.

In contrast, deep learning (DL) based reconstruction approaches have become extremely popular in recent years and have enabled progress beyond the limitations of traditional CS techniques [13–18]. In DL methods, prior information and regularisation can be implicitly learnt from the acquired data without having to be manually specified beforehand. Additionally, image quality and reconstruction speed are improved substantially. These advances include applications in both PI [19–26] and dynamic MRI [27–33]. Most current approaches in DL for accelerated PI are based on exploiting information in a single image either in the image domain [19, 20, 24] or in the k-space domain [34–36], where each image (or frame) is reconstructed independently. Examples of these include the variational network (VN) [19] and robust artificial-neural-networks for k-space interpolation (RAKI). In accelerated dynamic MRI, one of the key ingredients is to exploit the temporal redundancies. To this end, 3D convolutional networks (Cascade CNN) [27] and bidirectional convolutional recurrent neural networks (CRNN) [29] have been proposed to exploit the temporal dependencies of dynamic sequences in the spatio-temporal image domain. Most of these DL-based approaches so far have focused on either 2D static PI or single-coil dynamic MRI, whereas only a few methods exist for dynamic parallel MRI reconstruction [32, 33, 37].
Thus more efficient and effective DL models for dynamic parallel MRI are highly desirable.

In this work, inspired by CS-based k-t methods, we formulate dynamic parallel MR image reconstruction as a multi-variable minimisation problem considering regularisation in both spatio-temporal and temporal frequency domains. We propose a novel end-to-end trainable deep recurrent neural network to model the iterative process resulting from the multi-variable minimisation. Specifically, the proposed DL approach alternates among four steps: (1) a signal de-aliasing step in the combined spatial and temporal frequency domain (x-f) via an xf-CRNN; (2) a complementary de-aliasing step in the spatio-temporal image domain (x-t) with an xt-CRNN; (3) a closed-form point-wise data consistency (DC) step; and (4) a closed-form weighted coupling step. Steps (3) and (4) are embedded as layers in the deep neural network (DNN). Each of these steps corresponds to a step of the iterative algorithm derived from a variable splitting technique (Section 2). As the proposed model exploits spatio-temporal redundancies from Complementary Time-Frequency domains for effective image reconstruction, we term our model CTFNet.

The main contributions of our work can be summarised as follows. Firstly, we propose a new regularisation method built on recurrent neural networks for data regularisation in complementary spatio-temporal and temporal frequency domains to fully exploit data redundancies. Though previous studies [15, 38, 39] have shown that MR reconstruction can be performed in both k-space and image domains, it is unclear how cross-domain knowledge can be effectively utilised by DNNs in the dynamic setting, with an extra temporal dimension. To the best of our knowledge, this is the first work that investigates how complementary domain knowledge can be exploited in learning-based dynamic reconstruction.
Secondly, we propose a closed-form DC layer that does not require a complex matrix inversion, and operates together with a weighted coupling layer for multi-coil images. Compared to other works [19, 20, 40], our approach offers an exact update for DC, avoiding the expense of solving a linear system via gradient updates. This makes our approach computationally more efficient and simpler to implement. Finally, we demonstrate that our approach is able to further push the undersampling rates with improved image quality against state-of-the-art CS and DL methods, as well as with a good generalisation ability to unseen data. This indicates a great potential in achieving fast single-breath-hold 2D cardiac cine imaging.

This work extends our preliminary conference work on single-coil dynamic MRI reconstruction [41] and 2D static parallel MRI reconstruction [42], where we explored dynamic MRI and static PI separately. In comparison to our previous work, this work presents a novel and unified end-to-end DL solution with a new formulation for dynamic parallel MRI reconstruction, which addresses a more common scenario in clinical practice. It proposes a new way of exploiting complementary time-frequency domain information in DL. Significantly more thorough quantitative and qualitative evaluations of the proposed method, including comparison, generalisation and ablation studies, have been performed on multi-coil cardiac MR data with retrospective undersampling.

2 | THEORY

Dynamic parallel MRI model:

Assume that $m \in \mathbb{C}^N$ is a complex-valued MR image sequence in x-y-t space represented as a vector, and let $v_i \in \mathbb{C}^M$ ($M \ll N$) denote the undersampled k-space data (in $k_x$-$k_y$-t space) measured from the $i$-th MR receiver coil. The data acquired from each coil can thus be represented as

$$v_i = D F_s S_i m, \tag{1}$$

where $F_s$ is the spatial Fourier transform matrix, $D$ is the sampling matrix on a Cartesian grid that zeros out entries that are not acquired, and $S_i$ is the $i$-th coil sensitivity map. The reconstruction of $m$ from $v_i$ is an ill-posed inverse problem, where $i \in \{1, 2, \ldots, n_c\}$ and $n_c$ denotes the number of receiver coils. Similar to CS formulations [6, 43, 44] based on the SENSE model, we formulate dynamic parallel MRI reconstruction as the following optimisation problem:

$$\min_{m} \; R_{xf}(F_t m) + \mu R_{xt}(m) + \frac{\lambda}{2} \sum_{i=1}^{n_c} \| D F_s S_i m - v_i \|_2^2. \tag{2}$$

Here, $R_{xt}$ is defined as a regularisation term on the spatio-temporal domain (x-y-t space, also denoted as x-t) of the image sequence $m$, similar to the spatio-temporal total variation in most CS-based approaches. To fully exploit the spatio-temporal correlations, we additionally add a regularisation term $R_{xf}$ to regularise the data in the combined spatial and temporal frequency domain (x-f space), in which $F_t$ denotes the temporal Fourier transform. This leverages the characteristic that the signal can be sparsely represented in the temporal Fourier domain, because of the periodic cardiac motion exhibited in dynamic imaging. Previous works [15, 41, 43, 45] have shown that data regularisation in different domains is beneficial due to the complementary information they represent, and thus here we propose to combine the regularisation terms from the complementary time and frequency domains, with $\mu$ balancing between $R_{xf}$ and $R_{xt}$. The last term in Eq. 2 enforces the data fidelity in PI, and here we formulate it as a coil-wise DC term, which avoids the need to solve a linear system inside the subsequent sub-problem and also makes it simple to embed in an end-to-end DL framework (see the following Optimisation paragraph). The model weight $\lambda$ balances between regularisation and data fidelity.

Optimisation:

To optimise Eq. 2, we propose to employ the variable splitting technique [42, 46] to decouple the data fidelity term and regularisation terms. Specifically, auxiliary splitting variables $u \in \mathbb{C}^N$, $\rho \in \mathbb{C}^N$ and $\{\sigma_i \in \mathbb{C}^N\}_{i=1}^{n_c}$ are introduced here, converting Eq. 2 into the following equivalent form:

$$\min_{m, u, \rho, \sigma_i} \; R_{xf}(\rho) + \mu R_{xt}(u) + \frac{\lambda}{2} \sum_{i=1}^{n_c} \| D F_s \sigma_i - v_i \|_2^2 \quad \text{s.t.} \quad m = u, \; F_t m = \rho, \; S_i m = \sigma_i, \; \forall i \in \{1, 2, \ldots, n_c\}. \tag{3}$$

In detail, the introduction of the first constraint $m = u$ decouples $m$ in the regularisation term $R_{xt}$ from that in the data fidelity term, and the second constraint $F_t m = \rho$ enables the decoupling of $R_{xf}$ from the other terms. The introduction of the third constraint $S_i m = \sigma_i$ is also crucial as it allows decomposition of $S_i m$ from $D F_s S_i m$ in the data fidelity term, which avoids the difficult dense matrix inversion in subsequent calculations (see Eq. 6). Using the penalty function method, Eq. 3 can be reformulated to minimise the following single cost function:

$$\min_{m, u, \rho, \sigma_i} \; R_{xf}(\rho) + \mu R_{xt}(u) + \frac{\lambda}{2} \sum_{i=1}^{n_c} \| D F_s \sigma_i - v_i \|_2^2 + \frac{\alpha}{2} \| u - m \|_2^2 + \frac{\beta}{2} \| \rho - F_t m \|_2^2 + \frac{\gamma}{2} \sum_{i=1}^{n_c} \| \sigma_i - S_i m \|_2^2, \tag{4}$$

where $\alpha$, $\beta$ and $\gamma$ are penalty weights. To minimise Eq. 4, which is a multi-variable optimisation problem, alternating minimisation over $m$, $u$, $\rho$ and $\sigma_i$ is performed, resulting in iteratively solving the following sub-problems:

$$\rho^{k+1} = \arg\min_{\rho} \; \frac{\beta}{2} \| \rho - F_t m^k \|_2^2 + R_{xf}(\rho), \tag{5a}$$

$$u^{k+1} = \arg\min_{u} \; \frac{\alpha}{2} \| u - m^k \|_2^2 + \mu R_{xt}(u), \tag{5b}$$

$$\sigma_i^{k+1} = \arg\min_{\sigma_i} \; \frac{\lambda}{2} \sum_{i=1}^{n_c} \| D F_s \sigma_i - v_i \|_2^2 + \frac{\gamma}{2} \sum_{i=1}^{n_c} \| \sigma_i - S_i m^k \|_2^2, \tag{5c}$$

$$m^{k+1} = \arg\min_{m} \; \frac{\alpha}{2} \| u^{k+1} - m \|_2^2 + \frac{\beta}{2} \| \rho^{k+1} - F_t m \|_2^2 + \frac{\gamma}{2} \sum_{i=1}^{n_c} \| \sigma_i^{k+1} - S_i m \|_2^2. \tag{5d}$$

Here, $k \in \{0, 1, 2, \ldots, n_{it} - 1\}$ denotes the $k$-th iteration and $m^0$ is the zero-filled reconstruction used as an initialisation. An optimal solution ($m^*$) can be found by iterating over $\rho^{k+1}$, $u^{k+1}$, $\sigma_i^{k+1}$ and $m^{k+1}$ until convergence or until the maximum number of iterations $n_{it}$ is reached.

Specifically, Eq. 5a and Eq. 5b are the proximal operators of the combined temporal Fourier and spatial domain prior $R_{xf}$ and the spatio-temporal image domain prior $R_{xt}$ respectively. Eq. 5c is a coil-wise data consistency step in PI (pDC), which imposes consistency between the acquired k-space measurements and the reconstructed data. A closed-form solution for Eq. 5c can be derived as:

$$\sigma_i^{k+1} = F_s^H \left( (\lambda D^T D + \gamma I)^{-1} (\gamma F_s S_i m^k + \lambda D^T v_i) \right), \tag{6}$$

in which $F_s^H$ is the conjugate transpose of $F_s$ and $I$ is the identity matrix. Similarly, by optimising Eq. 5d, we obtain the following solution:

$$m^{k+1} = \left( \alpha I + \beta I + \gamma \sum_{i=1}^{n_c} S_i^H S_i \right)^{-1} \left( \alpha u^{k+1} + \beta F_t^H \rho^{k+1} + \gamma \sum_{i=1}^{n_c} S_i^H \sigma_i^{k+1} \right), \tag{7}$$

where $S_i^H$ is the conjugate transpose of $S_i$. This can be regarded as a weighted coupling (wCP) of the results obtained from Eq. 5a, Eq. 5b and Eq. 5c. In particular, both Eq. 6 and Eq. 7 are closed-form solutions and can be computed in a point-wise manner because the matrices to be inverted are diagonal. This avoids iterative gradient updates and thus enables fast reconstruction in comparison to conjugate gradient-based approaches [20, 32, 46].

3 | METHODS

3.1 | CTFNet for dynamic parallel MRI reconstruction

Based on the model formulation in Eq. 5, we propose to embed the iterative reconstruction process into a DL framework to further improve the reconstruction quality with faster reconstruction speed and higher acceleration factors (AF). Specifically, we propose a complementary time-frequency domain network (CTFNet) for dynamic parallel MRI reconstruction to exploit the spatio-temporal correlations in complementary spatio-temporal and temporal frequency domains. Our model consists of four core components: (1) an xf-CRNN to implicitly learn the regularisation from the training data itself and perform the iterative de-aliasing in the x-f domain, corresponding to Eq. 5a; (2) an xt-CRNN that similarly acts as the learning-based proximal operator in the spatio-temporal image domain, corresponding to Eq. 5b; (3) a pDC layer that performs coil-wise DC in PI (Eq. 5c); and (4) a wCP layer that is naturally derived from Eq. 5d and performs the weighted coupling. An illustrative diagram of the proposed model is shown in Fig. 1. Note that the iterative reconstruction process as stated in Eq. 5 is modelled via convolutional recurrent neural networks (CRNN) with recurrence over iterations. Details of each component of our network are explained hereafter.

FIGURE 1 An illustrative diagram of the proposed CTFNet at a single iteration. Each component corresponds to one sub-equation of Eq. 11. (a) Network architecture for xf-CRNN, which is composed of 4 layers of CRNN-i and 1 layer of 2D CNN with a residual connection from the baseline estimate; (b) network architecture for xt-CRNN, where a variation of the architecture in [29] is employed which consists of 4 layers of BCRNN evolving over both temporal and iteration dimensions, 1 layer of 2D CNN and a residual connection; (c) PI data consistency (pDC) layer; (d) weighted coupling (wCP) layer. Numbers inside CNN, CRNN-i and BCRNN layers indicate {kernel size, dilation factor, number of filters}. We used dilated 2D convolutions with kernel size × and dilation factor × . The number of input and output channels of the network was 2, representing the real and imaginary parts of the complex-valued data. Note that features learned at each iteration are propagated along iteration steps via the hidden-to-hidden connections in CRNN and BCRNN units. For mathematical notations, please refer to Eq. 11.

| xf-CRNN

Corresponding to Eq. 5a, we first propose to exploit the spatio-temporal correlations in the combined temporal Fourier and spatial domain. Instead of explicitly imposing the regularisation term on the data as in conventional CS-based methods, here we propose to implicitly learn the regularisation from the training data itself by leveraging DNNs in the x-f domain.
Specifically, we are motivated by CS-based k-t methods such as k-t FOCUSS [4, 5], whose solution to the underdetermined inverse problem can be expressed as a baseline signal $\bar{\rho}$ together with its residual encoding ($\rho^k - \bar{\rho}$) for the $(k+1)$-th estimate of the x-f signal $\rho^{k+1}$. We therefore propose to formulate our x-f domain reconstruction as

$$\rho^{k+1} = \bar{\rho} + \text{xf-CRNN}(\rho^k - \bar{\rho}). \tag{8}$$

In our formulation of Eq. 8, unlike model-based [47] or compressed sensing [4, 6] algorithms, we employ a stack of convolutional layers to estimate the missing data based on other available points, typically within its vicinity in x-f space. To fully exploit the spatio-temporal redundancies, we use the temporal average of a sequence as the x-f baseline signal $\bar{\rho}$, and thus xf-CRNN learns to reconstruct residuals of the temporal frequencies with respect to the temporal average (direct current) values. This makes the residual energy much sparser and enables the network to focus more on the dynamic patterns of the signals with less effort spent on reconstructing static background regions. In contrast to the k-t FOCUSS implementation, where sparsity was exploited for each coil separately, the proposed approach exploits the joint information in the multi-signal ensemble that represents the combination from all coils. This has been shown to be effective in reducing the number of required samples per coil and providing increased acceleration capability [6]. Furthermore, different from our previous work in [41], we propose to model the iterative reconstruction process in the x-f domain with a recurrent neural network (CRNN-i [29]), where recurrence evolves over iterations via hidden-to-hidden connections and the trainable network parameters are shared across sequential iteration steps.

The illustrative diagram of x-f reconstruction is shown in Fig. 2.
Specifically, we formulate the k-t to x-f transformation process in PI as an x-f transform layer in the network. The x-f transform layer receives input from multi-coil k-t space data, and then transforms it to x-f space as input to xf-CRNN. Details of the process are illustrated and explained in Fig. 2. After the signal de-aliasing in the x-f domain, an inverse Fourier transform along f is applied to transform the estimated x-f signal $\rho^{k+1}$ back to dynamic image space for the subsequent weighted coupling with other predictions, as shown in Fig. 1.

| xt-CRNN

Corresponding to the formulation in Eq. 5b, we additionally propose to learn a regulariser in the spatio-temporal image domain complementary to the combined spatial and temporal frequency domain. Specifically, to effectively exploit the spatio-temporal redundancies in x-y-t space, we adopt a variation of our previous CRNN-MRI [29] network for image space de-aliasing, which has been shown to be an effective technique in dynamic MRI reconstruction, termed xt-CRNN. In detail, bidirectional CRNN layers [29] with recurrence evolving over both temporal and iteration dimensions via hidden-to-hidden connections are employed. This allows us to embed the iterative reconstruction process in a learning setting as well as to propagate information along the temporal axis bidirectionally. Similar to the x-f space reconstruction, the proposed xt-CRNN also learns to reconstruct the combined data from all coils, and learns the residuals of the temporal average baseline $\bar{m}$ (Eq. 12) in the spatio-temporal domain with $m^k - \bar{m}$ as input to the network. This can require fewer k-t samples for residual encoding and similarly enables the xt-CRNN to focus more on the dynamics of the reconstruction. The x-t domain and x-f domain reconstructions are complementary, which further enables the network to maximally explore cross-domain knowledge for the signal recovery.
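
The role of the temporal-average baseline in both the x-f and x-t branches can be illustrated with a small NumPy sketch. It is not the CTFNet implementation: the helper names (`to_xf`, `from_xf`, `baseline_and_residual`), the toy sequence and the use of a plain temporal mean of the image (rather than the sampling-normalised average of Eq. 12) are assumptions made only for illustration. The sketch shows that the baseline occupies only the zero temporal frequency, while the residual of a periodic signal is sparse in x-f space.

```python
import numpy as np

def to_xf(m):
    """Unitary temporal Fourier transform F_t of a sequence m[t, y, x]."""
    return np.fft.fft(m, axis=0, norm="ortho")

def from_xf(rho):
    """Adjoint temporal Fourier transform F_t^H, back to the x-t domain."""
    return np.fft.ifft(rho, axis=0, norm="ortho")

def baseline_and_residual(m):
    """Return the x-f baseline (temporal mean repeated over t) and the x-f residual."""
    m_bar = np.repeat(m.mean(axis=0, keepdims=True), m.shape[0], axis=0)
    return to_xf(m_bar), to_xf(m - m_bar)

# Hypothetical toy sequence: a static background plus one periodic component.
rng = np.random.default_rng(0)
T, Ny, Nx = 8, 4, 4
static = rng.standard_normal((Ny, Nx)) + 1j * rng.standard_normal((Ny, Nx))
t = np.arange(T).reshape(T, 1, 1)
dynamic = 0.1 * np.exp(2j * np.pi * t / T) * np.ones((1, Ny, Nx))
m = static[None] + dynamic

rho_bar, rho_res = baseline_and_residual(m)

# Temporal-frequency bins in which the residual carries energy.
active_bins = np.flatnonzero(np.abs(rho_res).max(axis=(1, 2)) > 1e-9)
print("active residual bins:", active_bins)
```

In this toy case the residual is confined to a single temporal-frequency bin, which is the kind of sparsity that a residual formulation such as Eq. 8 is designed to exploit.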

FIGURE 2 The x-f transform and reconstruction diagram for a single iteration in the combined spatial and temporal frequency space. In detail, the x-f transform layer receives input from multi-coil k-t space data. The acquired multi-coil k-space data is first averaged along t to yield a temporal average for each coil separately. At iteration k, the temporally averaged data is subtracted from the corresponding coil data at each time frame, and the subtracted data and temporally averaged data from multiple coils are then inverse Fourier transformed and sensitivity-combined back to image space. This yields a sequence of aliased images and a temporally averaged sequence (Eq. 12). Each frequency-encoding position of the coil-combined images is then processed separately hereafter. The image rows from aliased images or baseline images are gathered and temporal Fourier transformed along t to yield an x-f image, corresponding to $\rho^k - \bar{\rho}$ and $\bar{\rho}$ respectively. These signals are then fed as inputs to xf-CRNN for x-f space reconstruction (Eq. 8 and Eq. 11a).

| Data consistency layer
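
The closed-form coil-wise update of Eq. 6 can be checked numerically with a short NumPy sketch. This is a minimal illustration under stated assumptions, not the authors' layer: the function names (`fft2c`, `ifft2c`, `pdc_update`), array shapes and toy data are hypothetical, the sampling operator $D$ is represented as a binary mask on the full grid, and $v_i$ is stored zero-filled. The sketch evaluates Eq. 6 point-wise and verifies that the gradient of the objective in Eq. 5c vanishes at the result.

```python
import numpy as np

def fft2c(x):
    """Unitary 2D spatial Fourier transform F_s over the last two axes."""
    return np.fft.fft2(x, axes=(-2, -1), norm="ortho")

def ifft2c(k):
    """Adjoint spatial Fourier transform F_s^H."""
    return np.fft.ifft2(k, axes=(-2, -1), norm="ortho")

def pdc_update(m, sens, mask, v, lam, gamma):
    """Eq. 6 for all coils at once.

    m: (T, Ny, Nx) image sequence, sens: (C, Ny, Nx) coil maps,
    mask: (T, Ny, Nx) with entries in {0, 1}, v: (C, T, Ny, Nx) zero-filled k-space.
    """
    k_est = fft2c(sens[:, None] * m[None])      # F_s S_i m^k
    num = gamma * k_est + lam * mask[None] * v  # gamma F_s S_i m^k + lam D^T v_i
    den = lam * mask[None] + gamma              # diagonal of lam D^T D + gamma I
    return ifft2c(num / den)

# Hypothetical toy problem (sizes and values chosen only for illustration).
rng = np.random.default_rng(0)
C, T, Ny, Nx = 4, 6, 8, 8

def cplx(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

m_true, m_k, sens = cplx(T, Ny, Nx), cplx(T, Ny, Nx), cplx(C, Ny, Nx)
mask = (rng.random((T, Ny, Nx)) < 0.25).astype(float)
v = mask[None] * fft2c(sens[:, None] * m_true[None])  # Eq. 1 with zero filling

lam, gamma = 2.0, 0.5
sigma = pdc_update(m_k, sens, mask, v, lam, gamma)

# Gradient of the Eq. 5c objective at sigma; it should vanish at the minimiser.
grad = (lam * ifft2c(mask[None] * (mask[None] * fft2c(sigma) - v))
        + gamma * (sigma - sens[:, None] * m_k[None]))
print("max |gradient|:", np.abs(grad).max())
```

Because the matrix being inverted is diagonal, the update costs one forward and one inverse FFT per coil, with no inner iterative solver.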

As discussed in Section 2, Eq. 5c and Eq. 6 give a closed-form solution with no dense matrix inversion, so that we can embed it exactly as a PI data consistency (pDC) layer in the DNN. For conciseness, we reformulate Eq. 6 as:

$$\sigma_i^{k+1} = F_s^H \left[ \Lambda F_s S_i m^k + (1 - \lambda_0) v_i \right], \qquad \Lambda_{jj} = \begin{cases} \lambda_0 & \text{if } D_{jj} = 1 \\ 1 & \text{if } D_{jj} = 0 \end{cases} \tag{9}$$

where $i \in \{1, 2, \ldots, n_c\}$ and $\lambda_0 = \gamma / (\lambda + \gamma)$. The DC in PI is performed coil-wise and point-wise, which makes it simple and appealing for implementation in DNNs. Here $\lambda_0$ is a hyperparameter that allows the adjustment of data fidelity based on the noise level of the acquired measurements.

| Weighted coupling layer
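
The weighted coupling update of Eq. 7 admits a similarly compact NumPy sketch. As before, this is an illustration under assumptions rather than the authors' code: `wcp_update`, the array shapes and the random test data are hypothetical. The sketch applies Eq. 7 point-wise and then shows that, once the coil sensitivity maps are normalised so that $\sum_i S_i^H S_i = I$, the update reduces to a convex combination of the x-t estimate, the x-f estimate and the coil-combined data-consistent estimate.

```python
import numpy as np

def wcp_update(u, rho, sigma, sens, alpha, beta, gamma):
    """Eq. 7: u and rho are (T, Ny, Nx), sigma is (C, T, Ny, Nx), sens is (C, Ny, Nx)."""
    x_from_xf = np.fft.ifft(rho, axis=0, norm="ortho")          # F_t^H rho
    coil_comb = np.sum(np.conj(sens)[:, None] * sigma, axis=0)  # sum_i S_i^H sigma_i
    sos = np.sum(np.abs(sens) ** 2, axis=0)                     # diagonal of sum_i S_i^H S_i
    return (alpha * u + beta * x_from_xf + gamma * coil_comb) / (alpha + beta + gamma * sos)

# Hypothetical toy inputs (sizes and weights chosen only for illustration).
rng = np.random.default_rng(1)
C, T, Ny, Nx = 4, 6, 8, 8

def cplx(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

sens = cplx(C, Ny, Nx)
sens /= np.sqrt(np.sum(np.abs(sens) ** 2, axis=0))  # normalise: sum_i S_i^H S_i = I
u, rho, sigma = cplx(T, Ny, Nx), cplx(T, Ny, Nx), cplx(C, T, Ny, Nx)
alpha, beta, gamma = 0.3, 0.2, 1.5

m_next = wcp_update(u, rho, sigma, sens, alpha, beta, gamma)

# With normalised maps the same result is a convex combination of the three estimates.
w_xt = alpha / (alpha + beta + gamma)
w_xf = beta / (alpha + beta + gamma)
m_convex = (w_xt * u
            + w_xf * np.fft.ifft(rho, axis=0, norm="ortho")
            + (1 - w_xt - w_xf) * np.sum(np.conj(sens)[:, None] * sigma, axis=0))
print("max difference:", np.abs(m_next - m_convex).max())
```

A useful sanity check is that feeding three mutually consistent inputs ($u = m$, $\rho = F_t m$, $\sigma_i = S_i m$) returns $m$ unchanged for any positive weights.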

Similarly, Eq. 5d can be formulated as a weighted coupling (wCP) layer in DNNs given estimations from Eq. 5a, Eq. 5b and Eq. 5c, as represented in the closed-form solution Eq. 7. The coil sensitivity maps can be normalised to one along the coil dimension, and thus we can simplify Eq. 7 as

$$m^{k+1} = \alpha_0 u^{k+1} + \beta_0 F_t^H \rho^{k+1} + (1 - \alpha_0 - \beta_0) \sum_{i=1}^{n_c} S_i^H \sigma_i^{k+1}, \tag{10}$$

in which $\alpha_0 = \frac{\alpha}{\alpha + \beta + \gamma}$ and $\beta_0 = \frac{\beta}{\alpha + \beta + \gamma}$ control the weighted coupling of predictions from the x-t domain and x-f domain respectively.

| CTFNet

Based on the proposed four modules, our CTFNet can thus be compactly represented as follows:

$$\rho^{k+1} = F_t \bar{m} + \text{xf-CRNN}(F_t m^k; F_t \bar{m}), \tag{11a}$$

$$u^{k+1} = \bar{m} + \text{xt-CRNN}(m^k; \bar{m}), \tag{11b}$$

$$\sigma_i^{k+1} = \text{pDC}(m^k; S_i, v_i, \lambda_0, D), \quad i \in \{1, 2, \ldots, n_c\}, \tag{11c}$$

$$m^{k+1} = \text{WA}(F_t^H \rho^{k+1}, u^{k+1}, S_i^H \sigma_i^{k+1}; \alpha_0, \beta_0). \tag{11d}$$

Here $\bar{m}$ denotes the temporally averaged sensitivity-combined image of a sequence that is used as the baseline signal, and it can be mathematically expressed as

$$\bar{m} = \sum_{i=1}^{n_c} S_i^H F_s^H \left[ \max\left(I, \sum_t D\right)^{-1} \sum_t v_i \right]_T, \tag{12}$$

in which the max operation is performed element-wise, $\sum_t$ indicates summation along the temporal dimension, and $[\cdot]_T$ represents the repetition operation along the temporal dimension for $T$ times (the number of frames in a sequence). Given the proposed framework, our CTFNet can iteratively learn to reconstruct the true images from both spatio-temporal and temporal frequency spaces, so that the spatio-temporal redundancies can be jointly exploited from complementary domains for better reconstructions.

| Network Learning

Given the training set $\Omega$ with undersampled data $m^0$ as input and fully sampled data as target, the network is trained end-to-end by minimising the pixel-wise L1 norm between the reconstructed data and the sensitivity-weighted ground truth data $m_{gt}$:

$$L(\theta) = \frac{1}{n_\Omega} \sum_{(m^0, m_{gt}) \in \Omega} \left\| m_{gt} - m^{n_{it}} \right\|_1, \tag{13}$$

where $m^{n_{it}}$ denotes the predicted image at iteration $n_{it}$, i.e., the final output of the proposed network, $\theta$ is the set of network parameters, and $n_\Omega$ is the number of training samples.

| Data

We used two datasets for the experimental evaluations. The first dataset (Dataset A) includes 38 sets of complex-valued multi-slice short-axis cardiac MRI scans acquired on a 1.5T Siemens scanner. 2D bSSFP cine acquisition with retrospective gating and × GRAPPA acceleration was performed for 14 healthy subjects and 24 patients for left ventricular coverage. The data was acquired with Cartesian sampling and with acquisition parameters including in-plane resolution of . × . mm, slice thickness of 8 mm and temporal resolution of around 40 ms. Images were reconstructed from the × acceleration to a fully sampled k-space by GRAPPA. In experiments, six slices from each subject that cover the dynamic anatomy were extracted, resulting in a total of 228 slices for the experiments. Each acquisition in this cohort consists of 25 frames with 30/34/38-channel multi-coil data. The second dataset (Dataset B) used in our experiments consists of 10 fully sampled complex-valued short-axis cardiac cine MRI scans acquired on a 1.5T Philips scanner. Each scan contains a single-slice SSFP acquisition with 30 temporal frames and 32-channel multi-coil raw data, which has an in-plane resolution of . × . mm and 10 mm thickness.

A variable density incoherent spatiotemporal acquisition (VISTA) sampling scheme [48] was employed to undersample the k-space data in our experiments, which has been shown to be an effective Cartesian sampling strategy for dynamic data. The scheme is based on a constrained minimisation of Riesz energy on a spatiotemporal grid. It allows uniform coverage of the acquisition domain with regular gaps between samples and guarantees a fully-sampled, time-averaged k-space to facilitate GRAPPA or ESPIRiT kernel estimation. In experiments, we undersampled the data at AFs of 8, 16 and 24, and examples of the resulting sampling patterns are shown in Fig. S1. Coil sensitivity maps were pre-computed from the fully-sampled, time-averaged k-space centre with the ESPIRiT algorithm [49] using the BART toolbox [50].
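
The property that a k-t sampling pattern can be heavily undersampled in every frame yet fully sampled once averaged over time can be illustrated with a toy mask. The sketch below is explicitly not VISTA [48]: it uses a simple interleaved lattice rather than Riesz-energy minimisation, and the function name `interleaved_mask` as well as the frame and line counts are hypothetical choices for illustration.

```python
import numpy as np

def interleaved_mask(n_frames, n_pe, accel):
    """Toy k-t mask: frame t samples phase-encoding lines (t % accel), (t % accel) + accel, ..."""
    mask = np.zeros((n_frames, n_pe), dtype=bool)
    for t in range(n_frames):
        mask[t, (t % accel)::accel] = True
    return mask

mask = interleaved_mask(n_frames=24, n_pe=96, accel=8)

achieved_af = mask.size / mask.sum()  # ratio of grid points to acquired points
covered = mask.any(axis=0)            # lines acquired in at least one frame

print("acceleration factor:", achieved_af)
print("time-averaged k-space fully covered:", bool(covered.all()))
```

A time-averaged k-space with full coverage is what allows the coil sensitivity maps and the temporal-average baseline to be estimated from the undersampled acquisition itself.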
| Experiments

We first performed a comparison study in which we compared our CTFNet against other competing approaches on Dataset A, with mixed healthy subjects and patients, for reconstructions from undersampling rates of 8, 16 and 24. In the second step, we explored the generalisation potential of the proposed method with respect to different scanners and acquisition settings (Dataset A to Dataset B) as well as to pathology not represented in the training set (healthy to patients in Dataset A). Lastly, an ablation study was conducted on both datasets to investigate the effects of regularisation in different domains.

| Evaluation Method

We compared our proposed approach (CTFNet) with representative MR reconstruction methods, including the state-of-the-art CS and low-rank based method k-t SLR [7], and two variants of DL methods, dynamic VN [33] and Cascade CNN [24, 27], which have been substantially enhanced to adapt to dynamic parallel image reconstruction. Dynamic VN [33] learns complex spatio-temporal convolutions in contrast to the original VN [19], and for a strong comparison with our method, we improved it by incorporating the temporal average baseline as an initialisation. Similarly, for Cascade CNN with the D-POCSENSE framework [24], originally designed for static PI, we refined it to learn the residual of the temporal average, and adjusted it to use the same convolutional recurrent architecture as CTFNet so that it can exploit spatio-temporal correlations. We therefore term it CascadeCRNN. The k-t SLR formulation has also been extended to multi-coil data based on the SENSE model, in contrast to its original implementation [7].

Quantitative results were evaluated in terms of normalised mean-squared-error (NMSE) and peak signal-to-noise ratio (PSNR) on complex-valued images, as well as structural similarity index (SSIM) and high frequency error norm (HFEN) on magnitude images. These metrics were chosen to evaluate the reconstruction results with complementary emphasis. All quantitative results were computed only around cropped dynamic regions for better evaluation. Lower NMSE/HFEN and higher PSNR/SSIM indicate better results. Evaluations on comparison and ablation studies were done via a 2-fold cross-validation on the two datasets separately.

| Implementation details
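
For reference, the two complex-valued metrics named in the evaluation can be written in a few lines of NumPy. The exact normalisation used in the paper is not stated in the text, so the definitions below (NMSE as squared error over reference energy, PSNR with the peak taken as the maximum reference magnitude) are common conventions adopted here as assumptions, and the function names `nmse` and `psnr` are hypothetical.

```python
import numpy as np

def nmse(rec, ref):
    """Normalised mean-squared error between complex arrays (assumed convention)."""
    return np.sum(np.abs(rec - ref) ** 2) / np.sum(np.abs(ref) ** 2)

def psnr(rec, ref):
    """Peak signal-to-noise ratio in dB, peak = max |ref| (assumed convention)."""
    mse = np.mean(np.abs(rec - ref) ** 2)
    return 10.0 * np.log10(np.max(np.abs(ref)) ** 2 / mse)

# Toy check: a reconstruction whose magnitude is off by 10% everywhere.
ref = np.full((4, 4), 1j)
rec = 1.1 * ref
print("NMSE:", nmse(rec, ref))
print("PSNR (dB):", psnr(rec, ref))
```

Both functions operate directly on complex arrays, so phase errors are penalised as well as magnitude errors, which is the reason for evaluating NMSE and PSNR on complex-valued images.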

The detailed network architecture of the proposed CTFNet is shown and explained in Fig. 1. Values of λ, α and β were all set to 0.1 based on our preliminary works [25, 42]. The network architecture for CascadeCRNN was the same as xt-CRNN, and the number of iteration steps n_it for all methods was set to 5. All DL networks were implemented in PyTorch, and the ADAM optimiser was employed with a learning rate of − . During training, we extracted training patches along the frequency-encoding direction and used the entire sequence of the data. Networks for different undersampling factors were first trained jointly and then finetuned separately, with a total number of backpropagations. Patch extraction and data augmentation were performed on-the-fly on the individual coil images, with random rotation and scaling. For k-t SLR, we used the Matlab implementation provided by [7] with an extension to multi-coil data. Source code will be available online *.

| RESULTS

4.1 | Comparison study

Quantitative comparison results of different methods on dynamic multi-coil cardiac data with various high AFs (8×, 16× and 24×) are presented in Table 1. Here the models were trained on Dataset A with a 2-fold cross-validation, where each fold contained 7 healthy subjects and 12 patients with six slices for each subject. The results reported in Table 1 were computed on the entire set of 228 2D+t slices. Our proposed CTFNet outperforms k-t SLR by a large margin on all of these measures at every undersampling rate. It is also much faster, reconstructing the entire sequence of one slice in 2.8s (12G TITAN Xp GPU), compared with 2444.8s for k-t SLR (16GB RAM, 3.60GHz CPU) for the same reconstruction. In comparison to the other DL-based methods, which have been carefully enhanced to incorporate temporal information, our proposed approach still achieves better performance at all acceleration rates, with an improvement of around 1dB in PSNR and 1.5% in SSIM over the most competitive method (CascadeCRNN). The performance gap also widens as AF increases.

Additionally, we compared the qualitative results on 16× and 24× undersampled data (equivalent scan time: 15s and 10s respectively, within a single breath-hold) in Fig. 3, which shows the reconstructed images along both spatial and temporal dimensions as well as their corresponding error maps on a patient and a healthy subject. Compared to the other competing methods, our proposed model faithfully recovers the images with smaller errors, especially around dynamic regions, and also produces sharper reconstructions along temporal profiles.

| Generalisation study

In this study, we explored the generalisation potential of the proposed method. We first investigated the robustness of the models when applied to data that were acquired with different scanners and acquisition settings from the training data. Specifically, we employed models trained on Dataset A and directly tested them on Dataset B. Dataset B differs from Dataset A in scanner, acquisition parameters, temporal resolution, number of acquisition coils and sampling matrix size. The generalisation test results of different DL models are shown in Table 2. The proposed method achieves high performance on the unseen test dataset and consistently outperforms the other competing methods (+1dB PSNR and +1.7% SSIM compared to the second best at AF 24×), indicating its capability in effectively learning the inverse dynamic reconstruction problem. Besides, we also visualised the generalisation results on Dataset B under different AFs, as presented in Fig. 4 and Fig. S2. It can be observed that our approach recovers the fine details and the temporal traces of the image very well on data from an unseen domain, even at an extreme undersampling rate (24×), though reconstruction is expected to become more challenging as AF increases.

In addition, we further investigated the generalisation performance of the proposed method from healthy subjects to patients that were not represented in the training set. In detail, we trained another model with only healthy subjects (14 subjects, 84 slices), and directly tested it on patients in Dataset A. The generalisation results were compared with models trained with mixed healthy subjects and patients (19 subjects, 114 slices), as shown in Table 3. Though the pathological conditions were not included in the training data, the generalisation results from healthy data to patients were very competitive with the mixed training models, with an average drop in performance of only 0.2dB PSNR and 0.2% SSIM. This can also be observed from the qualitative comparison shown in Fig. 5, where only subtle differences can be detected between the two training settings.

* https://github.com/cq615/kt-Dynamic-MRI-Reconstruction

TABLE 1

Comparison results of different methods on Dataset A of dynamic multi-coil cardiac cine MRI with high acceleration factors (AF). Results (mean (standard deviation)) were computed and compared only around dynamic regions. NMSE is scaled to − . Best results are shown in bold.

AF    Metrics   k-t SLR          Dynamic VN       CascadeCRNN      Proposed
8×    NMSE      0.664 (0.380)    0.529 (0.518)    0.545 (0.516)    (0.314)
      PSNR      40.892 (2.875)   43.048 (3.736)   42.798 (3.549)   (3.341)
      SSIM      0.957 (0.023)    0.970 (0.026)    0.968 (0.029)    (0.020)
      HFEN      0.138 (0.047)    0.103 (0.076)    0.110 (0.074)    (0.052)
16×   NMSE      1.932 (3.517)    1.351 (1.012)    1.253 (1.308)    (0.794)
      PSNR      37.612 (3.136)   38.923 (3.706)   39.225 (3.530)   (3.403)
      SSIM      0.920 (0.052)    0.936 (0.045)    0.937 (0.049)    (0.039)
      HFEN      0.257 (0.154)    0.212 (0.111)    0.194 (0.106)    (0.088)
24×   NMSE      2.702 (1.763)    1.964 (1.734)    1.844 (1.797)    (1.201)
      PSNR      35.222 (3.123)   37.257 (3.705)   37.562 (3.603)   (3.447)
      SSIM      0.895 (0.052)    0.914 (0.055)    0.914 (0.060)    (0.049)
      HFEN      0.309 (0.107)    0.270 (0.124)    0.251 (0.123)    (0.104)

| Ablation study

To better understand the proposed method and its performance, we performed an ablation study. In particular, we investigated the effects of different regularisations (R_xt and R_xf) on the dynamic parallel reconstruction problem. Specifically, we compared results from the spatio-temporal image space reconstruction (Proposed (R_xt)), the combined temporal Fourier and spatial space reconstruction (Proposed (R_xf)) as well as the complementary time-frequency domain reconstruction (Proposed (R_xf + R_xt)). All these ablated approaches with varying domain regularisations were conducted under the same variable splitting framework as in Section 2, where for the single domain reconstruction, only the corresponding domain network was used. The quantitative comparison results of the ablation study are shown in Table 4, where reconstruction models were trained on data with AF 8× from datasets A and B respectively. A qualitative result is also given in Fig. 6 on data with AF 16×.

FIGURE 3

Qualitative comparison results of different methods on spatial and temporal dimensions with their error maps. Results are shown for undersampling rates 16× of a patient (top) and 24× of a healthy subject (bottom) on Dataset A. The scan times for these two acquisitions are 15s and 10s respectively, each within a single breath-hold. The proposed method recovers the fine details well and preserves the temporal traces, though this gets more challenging on aggressively undersampled data. A dynamic video is shown in supporting information Video S1 for better visualisation.

TABLE 2

Generalisation results of different DL methods trained on Dataset A and deployed to Dataset B for different AFs. Results (mean (standard deviation)) were computed and compared only around dynamic regions. NMSE is scaled to − . Best results are shown in bold.

AF    Metrics   Dynamic VN       CascadeCRNN      Proposed
8×    NMSE      0.966 (0.353)    0.929 (0.340)    (0.245)
      PSNR      38.923 (2.744)   39.101 (2.553)   (2.389)
      SSIM      0.955 (0.012)    0.955 (0.012)    (0.010)
      HFEN      0.120 (0.026)    0.124 (0.029)    (0.017)
16×   NMSE      2.019 (0.754)    1.763 (0.551)    (0.417)
      PSNR      35.760 (2.710)   36.277 (2.301)   (2.286)
      SSIM      0.919 (0.024)    0.923 (0.019)    (0.014)
      HFEN      0.235 (0.043)    0.206 (0.035)    (0.027)
24×   NMSE      2.921 (0.762)    2.656 (0.727)    (0.593)
      PSNR      34.027 (2.463)   34.451 (2.082)   (2.449)
      SSIM      0.892 (0.022)    0.895 (0.024)    (0.018)
      HFEN      0.311 (0.039)    0.280 (0.048)    (0.037)
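The NMSE and PSNR rows in the tables are computed on complex-valued images within a cropped dynamic region. The following is a minimal NumPy sketch of how such metrics could be evaluated, not the authors' evaluation code. The function names, the crop bounds, and the choice of the largest reference magnitude as the PSNR peak are assumptions made for illustration; the source does not specify them.

```python
import numpy as np


def crop_region(x, rows, cols):
    """Crop the last two (spatial) axes to the given (start, stop) ranges."""
    return x[..., rows[0]:rows[1], cols[0]:cols[1]]


def nmse(recon, ref):
    """Normalised mean-squared error on complex-valued arrays."""
    return np.sum(np.abs(recon - ref) ** 2) / np.sum(np.abs(ref) ** 2)


def psnr(recon, ref):
    """PSNR in dB; the peak is taken as the largest reference magnitude (assumption)."""
    mse = np.mean(np.abs(recon - ref) ** 2)
    peak = np.max(np.abs(ref))
    return 10.0 * np.log10(peak ** 2 / mse)


# Toy 2D+t example: 4 frames of 32x32 complex-valued images (synthetic data).
rng = np.random.default_rng(0)
ref = rng.standard_normal((4, 32, 32)) + 1j * rng.standard_normal((4, 32, 32))
noise = rng.standard_normal(ref.shape) + 1j * rng.standard_normal(ref.shape)
recon = ref + 0.01 * noise

# Hypothetical crop bounds standing in for the "dynamic region".
roi_ref = crop_region(ref, (8, 24), (8, 24))
roi_recon = crop_region(recon, (8, 24), (8, 24))

nmse_value = nmse(roi_recon, roi_ref)
psnr_value = psnr(roi_recon, roi_ref)
print(f"NMSE: {nmse_value:.2e}, PSNR: {psnr_value:.2f} dB")
```

Restricting both arrays to the same crop before computing the metric keeps the large static background from dominating the score, which is the stated motivation for evaluating only around dynamic regions.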

FIGURE 4

Generalisation reconstructions of the proposed method on the unseen domain Dataset B along spatial and temporal dimensions with various AFs as well as their error maps. (a) Fully sampled image (b) Example of undersampled image with AF × (c) (d) Reconstruction from AF × (e) (f) Reconstruction from AF × (g) (h) Reconstruction from AF ×. The proposed method reconstructs the images well, with good preservation of temporal traces at various undersampling rates. Though reconstruction is more challenging as AF increases, the reconstructed results can still be useful. A dynamic video is shown in supporting information Video S2 for better visualisation.

TABLE 3

Generalisation results of the proposed method trained on healthy subjects only (84 slices) and tested on patients in Dataset A for different AFs. Results (mean (standard deviation)) were computed only around dynamic regions and compared with models trained with mixed healthy subjects and patients (114 slices) also in Dataset A. NMSE is scaled to − . Better results are shown in bold.

AF    Metrics   Mixed (114) → patients   Healthy (84) → patients
8×    NMSE      (0.317)                  0.421 (0.366)
      PSNR      (3.626)                  44.066 (3.698)
      SSIM      (0.023)                  0.969 (0.026)
      HFEN      (0.057)                  0.096 (0.066)
16×   NMSE      (0.795)                  0.981 (0.849)
      PSNR      (3.645)                  40.379 (3.700)
      SSIM      (0.044)                  0.938 (0.046)
      HFEN      (0.095)                  0.183 (0.099)
24×   NMSE      (1.184)                  1.353 (1.091)
      PSNR      (3.649)                  38.855 (3.615)
      SSIM      (0.055)                  0.919 (0.055)
      HFEN      (0.112)                  0.232 (0.113)
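HFEN is reported on magnitude images in each of the tables and emphasises edges and fine structure. The sketch below assumes the commonly used definition, the l2 difference between Laplacian-of-Gaussian (LoG) filtered images normalised by the filtered reference. The 15×15 kernel, σ = 1.5 and the circular boundary handling are assumptions, since the source does not state the filter settings, and `log_kernel`, `filter_2d` and `hfen` are illustrative names.

```python
import numpy as np


def log_kernel(size=15, sigma=1.5):
    """Laplacian-of-Gaussian kernel; size and sigma are assumed values."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x ** 2 + y ** 2
    gauss = np.exp(-r2 / (2.0 * sigma ** 2))
    kernel = (r2 - 2.0 * sigma ** 2) / sigma ** 4 * gauss
    return kernel - kernel.mean()  # zero-sum, so flat regions map to zero


def filter_2d(image, kernel):
    """Circular 2D convolution via the FFT (a simplification at the borders)."""
    k = kernel.shape[0]
    padded = np.zeros_like(image, dtype=float)
    padded[:k, :k] = kernel
    # Shift the kernel so that its centre sits at the array origin.
    padded = np.roll(padded, (-(k // 2), -(k // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))


def hfen(recon, ref, size=15, sigma=1.5):
    """High frequency error norm between two 2D images, using magnitudes."""
    kernel = log_kernel(size, sigma)
    ref_edges = filter_2d(np.abs(ref), kernel)
    recon_edges = filter_2d(np.abs(recon), kernel)
    return np.linalg.norm(recon_edges - ref_edges) / np.linalg.norm(ref_edges)


# Toy example: a bright square as reference and a noisy copy as reconstruction.
ref = np.zeros((64, 64))
ref[20:44, 20:44] = 1.0
rng = np.random.default_rng(0)
recon = ref + 0.05 * rng.standard_normal(ref.shape)

hfen_value = hfen(recon, ref)
print(f"HFEN: {hfen_value:.3f}")
```

For 2D+t data, one option is to apply `hfen` frame by frame and average; whether the paper aggregates this way is not stated.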

FIGURE 5

Comparison of the proposed method between mixed training results (from mixed healthy subjects/patients to patients) and generalisation results (from healthy subjects to patients). Results shown are on one patient with hypertensive cardiomyopathy in Dataset A on AF ×. The generalisation result is almost as good as the one from standard mixed training.

TABLE 4

Ablation study of effects of different regularisations on dynamic cardiac cine MRI reconstruction. Experiments were performed on two different datasets (A and B) with undersampling rate 8×. NMSE is scaled to − . Results are presented in mean (standard deviation). Best results are indicated in bold.

Dataset   Metrics   Proposed (R_xt)   Proposed (R_xf)   Proposed (R_xf + R_xt)
A         NMSE                                          (0.314)
          PSNR      42.785 (3.355)    43.433 (3.456)    (3.341)
          SSIM      0.969 (0.026)     0.970 (0.026)     (0.020)
          HFEN      0.107 (0.068)     0.096 (0.064)     (0.052)
B         NMSE      0.906 (0.288)     0.852 (0.274)     (0.197)
          PSNR      39.160 (2.481)    39.433 (2.444)    (2.487)
          SSIM      0.956 (0.010)     0.958 (0.011)     (0.009)
          HFEN      0.126 (0.027)     0.115 (0.020)     (0.019)

FIGURE 6

Qualitative comparisons of the ablated different domain reconstructions on spatial and temporal dimensions with their error maps. Results are shown for AF 16× (scan time 15s) on Dataset A. Highlighted regions indicate improvement of the complementary time-frequency domain reconstruction.

| DISCUSSION

In this work, we have demonstrated that the proposed method is capable of recovering high quality images from highly undersampled dynamic multi-coil data. Different from existing DL-based approaches, we incorporated the combined spatial and temporal frequency domain regularisation into the formulation of the dynamic parallel MRI reconstruction problem and exploited spatio-temporal redundancies from both x-t and x-f spaces with DNNs. Compared with spatio-temporal image (x-t) domain reconstruction (Proposed (R_xt), Table 4), the proposed x-f space reconstruction (Proposed (R_xf)) has been shown to be more effective in exploiting the spatio-temporal correlations, with higher reconstruction accuracy and a smaller number of network parameters (Table 4). This is mainly due to the inherent nature of the periodic dynamic cardiac MRI data itself, where strong correlations exist in k-space and time and the signal in temporal Fourier space is sparse. This property has been exploited in many traditional CS-based methods, and our results demonstrate that the learned implicit DNN-prior in the temporal Fourier domain can further increase the acceleration capability and achieve even better performance. In addition, the combination of time-frequency cross-domain knowledge (Proposed (R_xf + R_xt), Table 4 and Fig. 6) further enhances the reconstruction capability of the proposed method, with better reconstruction quality. This indicates that learning jointly from both spatio-temporal and temporal frequency domains can capture complementary useful information that can be effectively utilised by the proposed framework, which also explains the superior performance of CTFNet over other competing methods.

Furthermore, the proposed CTFNet builds on a multi-variable minimisation problem and embeds it into an efficient DL framework.
The employed variable splitting technique effectively decouples data regularisation terms on various domains from the data fidelity term, which enables the natural derivation of the pDC layer and wCP layer in PI with closed-form point-wise solutions. Though the derived pDC layer shares a similar form with the one proposed in D-POCSENSE [24], which is a simple extension from the single-coil application [27], our solution (pDC with wCP layers) for the multi-coil setting has mathematical support based on variable splitting and alternating minimisation, which justifies the particular formulation and structure of our network. In contrast to [20, 32], where the data fidelity step is solved via the conjugate gradient algorithm due to the difficult matrix inversion in their DC terms, our CTFNet offers a much simpler and more efficient solution with exact steps and avoids iterative gradient updates, allowing for faster reconstruction speed and easier embedding into DNNs. Besides, our approach also offers the flexibility of incorporating additional regularisation terms in the framework, whereas this is not very straightforward for the other approaches.

Moreover, the proposed method can generalise well to unseen cardiac MR data with different acquisition parameters and with pathologies that were not seen in the training set. The method can achieve satisfactory performance in these scenarios even with highly aggressive undersampling strategies, which indicates that the proposed method is robust to unseen and unusual image features or temporal behaviours present in our currently used dataset.
This shows promising results for deploying DL models in clinical practice. Nevertheless, more validation on this aspect, including radiologists’ discretion, is still needed for practical use.

In particular, by exploiting spatio-temporal redundancies in the proposed DL framework, our approach can outperform the state-of-the-art CS and DL-based methods and can further push the acceleration capability, with fast reconstruction speed, for dynamic parallel MR imaging. In our work, Dataset A was a multi-breath-hold acquisition of 8 consecutive breath-holds of 15s each (2× GRAPPA accelerated). Hence an AF of 16 or higher makes it possible to achieve the same acquisition in a single breath-hold. Despite this being a retrospective undersampling study, our results indicate a great potential in facilitating fast single-breath-hold clinical 2D cardiac cine imaging.

For future work, we will explore dynamic parallel image reconstruction with other types of undersampling strategies, such as radial sampling, which is also commonly used to accelerate 2D cardiac MR imaging in practice. In addition, we could also consider incorporating other regularisation terms into the framework, such as regularisation on other transform domains, to exploit the data redundancy for effective reconstruction. Besides, the generalisation capability of the model can be further validated on more data from different domains and with various acquisition parameters and pathologies to investigate its potential application for clinical use.

| CONCLUSION

In this paper, we have proposed a novel DL-based approach, termed CTFNet, for highly undersampled dynamic parallel MR image reconstruction. The proposed method exploits spatio-temporal correlations in both the combined spatial and temporal frequency domain and the spatio-temporal image domain based on a variable splitting and alternating minimisation formulation. The network is able to learn to iteratively reconstruct the images by jointly and effectively exploiting information from the complementary time-frequency domains. Our proposed CTFNet outperforms state-of-the-art dynamic MR reconstruction methods in terms of both quantitative and qualitative performance, with excellent recovery of fine details and preservation of temporal traces. It also enables increased accelerations of data acquisition with favorable generalisation ability, which is promising for realising single-breath-hold clinical 2D cardiac cine MR imaging.
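To make the two complementary domains concrete, the following minimal NumPy sketch converts a spatio-temporal (x-t) image series into the combined spatial and temporal frequency (x-f) domain by a Fourier transform along time, and back again. It is an illustration under assumptions, not the network's implementation: the array layout (time, y, x), the orthonormal scaling, the function names and the synthetic periodic signal are all chosen here for the example. It also shows why a periodic series is sparse along the temporal frequency axis.

```python
import numpy as np


def xt_to_xf(x_t, time_axis=0):
    """Temporal Fourier transform: (t, y, x) image series -> (f, y, x) spectrum."""
    spectrum = np.fft.fft(x_t, axis=time_axis, norm="ortho")
    return np.fft.fftshift(spectrum, axes=time_axis)


def xf_to_xt(x_f, time_axis=0):
    """Inverse of xt_to_xf: (f, y, x) spectrum -> (t, y, x) image series."""
    spectrum = np.fft.ifftshift(x_f, axes=time_axis)
    return np.fft.ifft(spectrum, axis=time_axis, norm="ortho")


# Synthetic periodic series: static background plus one temporal harmonic
# inside a small "dynamic" region.
n_t, n_y, n_x = 24, 16, 16
t = np.arange(n_t)
x_t = np.ones((n_t, n_y, n_x), dtype=complex)
x_t[:, 6:10, 6:10] += 0.5 * np.cos(2 * np.pi * t / n_t)[:, None, None]

x_f = xt_to_xf(x_t)

# Count the temporal frequency bins that carry non-negligible energy.
energy = np.abs(x_f) ** 2
energy_per_bin = energy.sum(axis=(1, 2))
nonzero_bins = np.sum(energy_per_bin > 1e-12 * energy.sum())
print(f"{nonzero_bins} of {n_t} temporal frequency bins carry energy")

# The transform is invertible, so no information is lost between domains.
print("round trip ok:", np.allclose(xf_to_xt(x_f), x_t))
```

Because the two representations are related by an invertible transform, a regulariser acting in x-f and one acting in x-t see the same data but with different structure, which is the sense in which the domains are complementary.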

Acknowledgements

This work was supported by EPSRC programme grant SmartHeart (EP/P001009/1).

References

[1] Pruessmann KP, Weiger M, Scheidegger MB, Boesiger P. SENSE: sensitivity encoding for fast MRI. Magn Reson Med 1999;42(5):952–962.
[2] Griswold MA, Jakob PM, Heidemann RM, Nittka M, Jellus V, Wang J, et al. Generalized autocalibrating partially parallel acquisitions (GRAPPA). Magn Reson Med 2002;47(6):1202–1210.
[3] Sodickson DK, Manning WJ. Simultaneous acquisition of spatial harmonics (SMASH): fast imaging with radiofrequency coil arrays. Magn Reson Med 1997;38(4):591–603.
[4] Jung H, Ye JC, Kim EY. Improved k–t BLAST and k–t SENSE using FOCUSS. Physics in Medicine & Biology 2007;52(11):3201.
[5] Jung H, Sung K, Nayak KS, Kim EY, Ye JC. k-t FOCUSS: a general compressed sensing framework for high resolution dynamic MRI. Magn Reson Med 2009;61(1):103–116.
[6] Otazo R, Kim D, Axel L, Sodickson DK. Combination of compressed sensing and parallel imaging for highly accelerated first-pass cardiac perfusion MRI. Magn Reson Med 2010;64(3):767–776.
[7] Lingala SG, Hu Y, DiBella E, Jacob M. Accelerated dynamic MRI exploiting sparsity and low-rank structure: k-t SLR. IEEE Trans Med Imag 2011;30(5):1042–1054.
[8] Lustig M, Santos JM, Donoho DL, Pauly JM. kt SPARSE: High frame rate dynamic MRI exploiting spatio-temporal sparsity. In: Proc. ISMRM 13th Annu. Meeting Exhibit; 2006. p. 2420.
[9] Otazo R, Candès E, Sodickson DK. Low-rank plus sparse matrix decomposition for accelerated dynamic MRI with separation of background and dynamic components. Magn Reson Med 2015;73(3):1125–1136.
[10] Zhao B, Haldar JP, Christodoulou AG, Liang ZP. Image reconstruction from highly undersampled (k, t)-space data with joint partial separability and sparsity constraints. IEEE Trans Med Imag 2012;31(9):1809–1820.
[11] Yoon H, Kim KS, Kim D, Bresler Y, Ye JC. Motion adaptive patch-based low-rank approach for compressed sensing cardiac cine MRI. IEEE Trans Med Imag 2014;33(11):2069–2085.
[12] Mohsin YQ, Lingala SG, DiBella E, Jacob M. Accelerated dynamic MRI using patch regularization for implicit motion compensation. Magn Reson Med 2017;77(3):1238–1248.
[13] Knoll F, Hammernik K, Zhang C, Moeller S, Pock T, Sodickson DK, et al. Deep-learning methods for parallel magnetic resonance imaging reconstruction: A survey of the current approaches, trends, and issues. IEEE Signal Processing Magazine 2020;37(1):128–140.
[14] Hammernik K, Knoll F. Machine learning for image reconstruction. In: Handbook of Medical Image Computing and Computer Assisted Intervention. Elsevier; 2020. p. 25–64.
[15] Eo T, Jun Y, Kim T, Jang J, Lee HJ, Hwang D. KIKI-net: cross-domain convolutional neural networks for reconstructing undersampled magnetic resonance images. Magn Reson Med 2018;80(5):2188–2201.
[16] Ye JC, Han Y, Cha E. Deep convolutional framelets: A general deep learning framework for inverse problems. SIAM J Imaging Sci 2018;11(2):991–1048.
[17] Tezcan KC, Baumgartner CF, Luechinger R, Pruessmann KP, Konukoglu E. MR image reconstruction using deep density priors. IEEE Trans Med Imag 2018;38(7):1633–1642.
[18] Yang G, Yu S, Dong H, Slabaugh G, Dragotti PL, Ye X, et al. DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction. IEEE Trans Med Imag 2017;37(6):1310–1321.
[19] Hammernik K, Klatzer T, Kobler E, Recht MP, Sodickson DK, Pock T, et al. Learning a variational network for reconstruction of accelerated MRI data. Magn Reson Med 2018;79(6):3055–3071.
[20] Aggarwal HK, Mani MP, Jacob M. MoDL: Model-based deep learning architecture for inverse problems. IEEE Trans Med Imag 2018;38(2):394–405.
[21] Kwon K, Kim D, Park H. A parallel MR imaging method using multilayer perceptron. Medical physics 2017;44(12):6209–6224.
[22] Cheng JY, Mardani M, Alley MT, Pauly JM, Vasanawala S. DeepSPIRiT: generalized parallel imaging using deep convolutional neural networks. In: Proc. ISMRM 26th Annu. Meeting Exhibit; 2018.
[23] Lønning K, Putzky P, Caan MW, Welling M. Recurrent inference machines for accelerated MRI reconstruction. In: International Conference on Medical Imaging With Deep Learning; 2018. p. 1–11.
[24] Schlemper J, Duan J, Ouyang C, Qin C, Caballero J, Hajnal JV, et al. Data consistency networks for (calibration-less) accelerated parallel MR image reconstruction. In: Proc. ISMRM 27th Annu. Meeting Exhibit; 2019. p. 4664.
[25] Hammernik K, Schlemper J, Qin C, Duan J, Summers RM, Rueckert D. Σ-net: Systematic Evaluation of Iterative Deep Neural Networks for Fast Parallel MR Image Reconstruction. arXiv preprint arXiv:1912.09278 2019.
[26] Fuin N, Bustin A, Küstner T, Oksuz I, Clough J, King AP, et al. A multi-scale variational neural network for accelerating motion-compensated whole-heart 3D coronary MR angiography. Magnetic Resonance Imaging 2020.
[27] Schlemper J, Caballero J, Hajnal JV, Price AN, Rueckert D. A deep cascade of convolutional neural networks for dynamic MR image reconstruction. IEEE Trans Med Imag 2018;37(2):491–503.
[28] Schlemper J, Castro DC, Bai W, Qin C, Oktay O, Duan J, et al. Bayesian Deep Learning for Accelerated MR Image Reconstruction. In: International Workshop on Machine Learning for Medical Image Reconstruction. Springer; 2018. p. 64–71.
[29] Qin C, Schlemper J, Caballero J, Price AN, Hajnal JV, Rueckert D. Convolutional recurrent neural networks for dynamic MR image reconstruction. IEEE Trans Med Imag 2019;38(1):280–290.
[30] Hauptmann A, Arridge S, Lucka F, Muthurangu V, Steeden JA. Real-time cardiovascular MR with spatio-temporal artifact suppression using deep learning–proof of concept in congenital heart disease. Magn Reson Med 2019;81(2):1143–1156.
[31] Seegoolam G, Schlemper J, Qin C, Price A, Hajnal J, Rueckert D. Exploiting Motion for Deep Learning Reconstruction of Extremely-Undersampled Dynamic MRI. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2019. p. 704–712.
[32] Biswas S, Aggarwal HK, Jacob M. Dynamic MRI using model-based deep learning and SToRM priors: MoDL-SToRM. Magn Reson Med 2019;82(1):485–494.
[33] Hammernik K, Schloegl M, Kobler E, Stollberger R, Pock T. Dynamic Multicoil Reconstruction using Variational Networks. In: Proc. ISMRM 27th Annu. Meeting Exhibit; 2019. p. 4656.
[34] Akçakaya M, Moeller S, Weingärtner S, Uğurbil K. Scan-specific robust artificial-neural-networks for k-space interpolation (RAKI) reconstruction: Database-free deep learning for fast imaging. Magn Reson Med 2019;81(1):439–453.
[35] Han Y, Sunwoo L, Ye JC. k-Space Deep Learning for Accelerated MRI. IEEE Trans Med Imag 2019;39(2):377–386.
[36] Zhang P, Wang F, Xu W, Li Y. Multi-channel Generative Adversarial Network for Parallel Magnetic Resonance Image Reconstruction in K-space. In: International Conference on Medical Image Computing and Computer-Assisted Intervention; 2018. p. 180–188.
[37] Küstner T, Fuin N, Hammernik K, Bustin A, Qi H, Hajhosseiny R, et al. CINENet: deep learning-based 3D cardiac CINE MRI reconstruction with multi-coil complex-valued 4D spatio-temporal convolutions. Scientific reports 2020;10(1):1–13.
[38] Sriram A, Zbontar J, Murrell T, Zitnick CL, Defazio A, Sodickson DK. GrappaNet: Combining parallel imaging with deep learning for multi-coil MRI reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2020. p. 14315–14322.
[39] Wang S, Ke Z, Cheng H, Jia S, Ying L, Zheng H, et al. DIMENSION: Dynamic MR imaging with both k-space and spatial prior knowledge obtained via multi-supervised network training. NMR in Biomedicine; p. e4131.
[40] Mardani M, Gong E, Cheng JY, Vasanawala SS, Zaharchuk G, Xing L, et al. Deep generative adversarial neural networks for compressive sensing MRI. IEEE Trans Med Imag 2018;38(1):167–179.
[41] Qin C, Schlemper J, Duan J, Seegoolam G, Price A, Hajnal J, et al. k-t NEXT: Dynamic MR Image Reconstruction Exploiting Spatio-Temporal Correlations. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2019. p. 505–513.
[42] Duan J, Schlemper J, Qin C, Ouyang C, Bai W, Biffi C, et al. VS-Net: Variable splitting network for accelerated parallel MRI reconstruction. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2019. p. 713–722.
[43] Lustig M, Donoho D, Pauly JM. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn Reson Med 2007;58(6):1182–1195.
[44] Block KT, Uecker M, Frahm J. Undersampled radial MRI with multiple coils. Iterative image reconstruction using a total variation constraint. Magn Reson Med 2007;57(6):1086–1098.
[45] Tsaig Y, Donoho DL. Extensions of compressed sensing. Signal processing 2006;86(3):549–571.
[46] Ramani S, Fessler JA. Parallel MR image reconstruction using augmented Lagrangian methods. IEEE Trans Med Imag 2010;30(3):694–706.
[47] Tsao J, Boesiger P, Pruessmann KP. k-t BLAST and k-t SENSE: dynamic MRI with high frame rate exploiting spatiotemporal correlations. Magn Reson Med 2003;50(5):1031–1042.
[48] Ahmad R, Xue H, Giri S, Ding Y, Craft J, Simonetti OP. Variable density incoherent spatiotemporal acquisition (VISTA) for highly accelerated cardiac MRI. Magn Reson Med 2015;74(5):1266–1278.
[49] Uecker M, Lai P, Murphy MJ, Virtue P, Elad M, Pauly JM, et al. ESPIRiT—an eigenvalue approach to autocalibrating parallel MRI: where SENSE meets GRAPPA. Magn Reson Med 2014;71(3):990–1001.
[50] Uecker M, Ong F, Tamir JI, Bahri D, Virtue P, Cheng JY, et al. Berkeley Advanced Reconstruction Toolbox. In: Proc. ISMRM 23rd Annu. Meeting Exhibit; 2015. p. 2486.

SUPPORTING INFORMATION

SUPPORTING INFORMATION FIGURE S1

Examples of the VISTA undersampling patterns for acceleration factors 8, 16, and 24. Top figures show the undersampling patterns in k-space, and the bottom figures show the undersampling patterns in k-t space.

SUPPORTING INFORMATION FIGURE S2

x-f reconstructions of CTFNet under different AFs with their error maps on Dataset B. (a) Fully sampled signal (b) Undersampled example by AF × (c) (d) x-f reconstruction from AF × (e) (f) x-f reconstruction from AF × (g) (h) x-f reconstruction from AF ×