Deep Learning for Low-Field to High-Field MR: Image Quality Transfer with Probabilistic Decimation Simulator
Hongxiang Lin, Matteo Figini, Ryutaro Tanno, Stefano B. Blumberg, Enrico Kaden, Godwin Ogbole, Biobele J. Brown, Felice D'Arco, David W. Carmichael, Ikeoluwa Lagunju, Helen J. Cross, Delmiro Fernandez-Reyes, Daniel C. Alexander
Centre for Medical Image Computing and Department of Computer Science, University College London, UK; Machine Intelligence and Perception Group, Microsoft Research Cambridge, UK; Department of Radiology, College of Medicine, University of Ibadan, Nigeria; Department of Paediatrics, College of Medicine, University of Ibadan, Nigeria; Great Ormond Street Hospital for Children, London, UK; UCL Great Ormond Street Institute of Child Health, UK; Department of Biomedical Engineering, King's College London, UK
[email protected]
Abstract.
MR images scanned at low magnetic field (< 1T) have lower resolution in the slice direction and lower contrast than those acquired at high field, owing to a reduced signal-to-noise ratio.

Fig. 1. High-field vs low-field MR scans: (a-b) resolution change on the coronal plane; (c-d) contrast change on the axial plane. Data sources: (a, c) 3T MRI from the Human Connectome Project [2]; (b, d) 0.36T MRI acquired at University College Hospital, Ibadan.

1 Introduction

Magnetic Resonance Imaging (MRI) is now ubiquitous in neurology, with a strong trend towards the use of high-field scanners, 1.5T and 3T being the current clinical standard. However, low-field MRI scanners, less than 1T, are still common in low- and middle-income countries (LMICs), due to limited funds and frequent power outages. Low-field scanners suffer from a lower signal-to-noise ratio (SNR) than high field at equivalent spatial resolution. To counteract the SNR reduction, practitioners commonly acquire images with non-adjacent thick slices, which reduces the acquisition time and cross-talk artifacts in brain MRI [1]. This leads to reduced resolution in the slice direction compared with the in-plane resolution, and to a loss of information due to the gaps between slices; see Fig. 1(a-b). Moreover, the contrast between grey matter (GM) and white matter (WM) may be worse than at high field even at equivalent SNR and spatial resolution, as illustrated in Fig. 1(c-d).

In this study, we aim to learn an image-translation mapping from low field to high field that performs super-resolution and contrast enhancement. In the literature, mathematical models have been proposed to describe the variation of the MRI signal with the magnetic field [3,4], but such models are simplistic and do not capture all effects on the final images, such as variability in the acquisition process. Furthermore, the reconstruction of missing information between the acquired slices is severely ill-posed, which hinders the practical capability of producing high-field-like images. Several approaches in the literature aim to solve related problems. Bahrami et al. [5] proposed a multi-level Canonical Correlation Analysis for estimating 7T from 3T images using paired training data. Wolterink et al. [6] used the idea of cycle consistency to leverage the abundance of unpaired training sets and learn to synthesise CT from MRI.
This approach is, however, known to be susceptible to hallucinations and may introduce spurious features in the output images [7].

Image Quality Transfer (IQT) is a machine learning framework for propagating the abundant neurological information of high-quality images into low-quality clinical data. Most implementations of IQT simulate low-quality data from high quality, providing matched pairs for training. In [8,9,10,11], for instance, the corresponding low-field data are synthesised by downsampling and by matching voxel-wise intensities based on prior or empirical knowledge about actual low-field data. However, the trained model then depends strongly on the accuracy of the low-field synthesis. To improve model generalisability, the prediction of a trained model on unseen test data should depend less on the simulation.

In this paper, we build on the IQT framework to construct a mapping that estimates high-field images from matched low-field inputs. Paired data, particularly in large numbers, are hard to acquire in one place, owing to the rare availability of high-field scanners in LMICs and of low-field scanners in high-income countries (HICs). Our key technical contribution is a probabilistic decimation (downsampling) model that improves the robustness of IQT training and the enhancement of images from low-field scanners. More specifically, low-field data generation follows a probabilistic model that comprises random tissue-specific intensity statistics (e.g. SNR) and probabilistic semantic segmentation. We assume that an a priori distribution over the tissue-specific SNR is available. The segmentation mask estimated by Statistical Parametric Mapping [12] is also probabilistic in terms of the tissue type. Therefore, for one high-field subject we can generate multiple corresponding low-field images and form paired training data, a novel way of performing data augmentation. We then learn the low-field-to-high-field transformation by adapting the U-Net architecture [13] with a super-resolution module, a "bottleneck block", extending its depth to enable it to capture more global features of image contrast.
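As a preview of the anisotropic design detailed later (Fig. 2), the shape bookkeeping behind this adaptation can be sketched in plain Python. This is our reading of the diagram for k = 4; the helper function and pooling list are illustrative, not taken from the paper's code:

```python
def encoder_shapes(in_shape, pools):
    """Apply max-pooling factors level by level and record the feature shapes."""
    shapes = [in_shape]
    for p in pools:
        in_shape = tuple(s // f for s, f in zip(in_shape, p))
        shapes.append(in_shape)
    return shapes

# In-plane pooling (2x2x1) is applied until the feature map becomes isotropic,
# after which pooling is isotropic (2x2x2).
pools = [(2, 2, 1), (2, 2, 1), (2, 2, 2), (2, 2, 2)]
shapes = encoder_shapes((32, 32, 8), pools)
print(shapes)  # [(32, 32, 8), (16, 16, 8), (8, 8, 8), (4, 4, 4), (2, 2, 2)]

# Skip connections from the two anisotropic levels are upsampled in z by a
# factor u (the bottleneck blocks) so they match the isotropic decoder shapes:
for (w, h, d), u in [((32, 32, 8), 4), ((16, 16, 8), 2)]:
    assert (w, h, d * u) == (w, h, w)  # z matches the in-plane size after upsampling
```

The point of the walkthrough is that only the first two levels need anisotropic pooling before the 32 × 32 × 8 input becomes cubic, which is where the slice-direction super-resolution must happen.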
Let a 3D low-field input patch x of size w × h × d be corrupted by smoothing, low contrast, and random noise. It is randomly cropped from the original low-field MR volume, denoted by X. Our aim is to reconstruct the sub-voxel information in the slice-thickness direction and to attain the high SNR and contrast of the corresponding high-field output patch y of size w × h × kd, where k is an up-sampling rate. We then assemble all output patches into a high-field MR volume denoted by Y. The relationship between x and y is modelled by a degradation process of image quality, described by a function S such that

x = S(y, α) + ε,   (1)

where α denotes a vector of SNR components corresponding to prior knowledge of WM and GM in the low-field input volume, i.e. α = (SNR_X^WM, SNR_X^GM). It is randomly sampled from the Gaussian distribution N(μ, Σ), where μ is a mean vector and Σ is a covariance matrix. The background noise ε has a Gaussian distribution N(0, σ_BG²). Section 2.2 specifies the formulation of and the algorithm for modelling S. We then employ deep learning, specifically a convolutional neural network, to estimate the inverse mapping S†.

We use a given M-paired training set T_M = {(x_i, y_i)}_{i=1}^M with a fixed α to train our convolutional neural networks over all patches sampled from all MR volumes. We optimise the network parameters θ by minimising the average of the pixel-wise mean squared error (MSE), denoted by ‖·‖², over the training set:

θ* = argmin_θ (1/M) Σ_{i=1}^M ‖S†_θ(x_i) − y_i‖².   (2)

Equation (1) enables us to produce additional training data by randomly sampling the coefficient α from an a priori distribution, forming the so-called probabilistic decimation simulator. It translates the voxel-wise low-field SNRs, related to the sampled α and the tissue category, to the high-field image and downsamples with a factor of k. We use this simulator to generate N low-field patches for each high-field patch y_i and form a new training set T_{M,N} = {(x_ij, y_i) | i = 1, ..., M, j = 1, ..., N}. Henceforth, the new model is trained on the augmented set T_{M,N} with the objective

θ* = argmin_θ (1/(MN)) Σ_{i=1}^M Σ_{j=1}^N ‖S†_θ(x_ij) − y_i‖².   (3)

We develop Alg. 1 to implement the probabilistic decimation simulator for neuroimages. We transform high-field images Y(v) into synthetic low-field images, denoted by X̂(v) for any voxel coordinate v, by adapting the SNR in WM and GM to the values obtained in our reference low-field dataset.
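A minimal numpy sketch of one draw from this simulator and of the augmented set in Eq. (3) follows. The blur/decimation/contrast-transfer steps are detailed below in Alg. 1; here the mean SNRs (61, 53) reuse the fixed-SNR values from our experiments, while the covariance, the high-field SNRs, and the toy masks are illustrative placeholders, not the actual implementation:

```python
import numpy as np

def decimate(y, masks, snr_x, snr_y, k, sigma_x, rng):
    """One draw x = S(y, alpha) + eps: Gaussian blur along z (FWHM = k voxels),
    keep every k-th slice, rescale per-tissue intensity by SNR ratios, add noise."""
    sigma = k / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # FWHM = k * e_z, in voxel units
    radius = int(np.ceil(3 * sigma))
    z = np.arange(-radius, radius + 1)
    h = np.exp(-z ** 2 / (2 * sigma ** 2))
    h /= h.sum()                                      # normalised 1D Gaussian kernel
    blurred = np.apply_along_axis(lambda c: np.convolve(c, h, mode="same"), 2, y)
    x = np.zeros_like(blurred[:, :, ::k])
    for j, m in masks.items():                        # per-tissue contrast transfer
        l = snr_x[j] / snr_y[j] if j in snr_x else 1.0  # l_j = 1 for "others"
        x += l * (m[:, :, ::k] * blurred[:, :, ::k])
    return x + rng.normal(0.0, sigma_x, size=x.shape)   # background noise

# Augmented training pairs (Eq. (3)): N low-field draws per high-field patch,
# each with its own SNR sample from N(mu, Sigma).
rng = np.random.default_rng(0)
y = rng.random((32, 32, 32))
wm = np.zeros_like(y); wm[:, :, :16] = 1.0
masks = {"WM": wm, "GM": 1.0 - wm}                    # toy binary masks
mu, cov = [61.0, 53.0], np.diag([4.0, 4.0])           # illustrative values
pairs = []
for _ in range(4):                                    # N = 4
    snr_wm, snr_gm = rng.multivariate_normal(mu, cov)
    x = decimate(y, masks, {"WM": snr_wm, "GM": snr_gm},
                 {"WM": 60.0, "GM": 60.0}, k=4, sigma_x=0.01, rng=rng)
    pairs.append((x, y))
print(pairs[0][0].shape)  # (32, 32, 8)
```

Each pair (x_ij, y_i) shares the same high-field target, so the network sees the same anatomy under several plausible low-field contrasts.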
We assume that the SNRs of WM and GM follow a 2D Gaussian distribution, and that the background noise in the low-field or high-field images follows a 1D Gaussian distribution with zero mean and standard deviation σ_X or σ_Y, respectively. We also assume σ_X ≫ σ_Y, since the random noise at high field is negligible. The simulation procedure starts with the skull-stripped Y(v) with isotropic voxels of edge length e_z. We then down-sample along the slice-thickness direction (vertical, or z-direction). A 1D Gaussian filter h_σ(z) = (1/(σ√(2π))) e^{−z²/(2σ²)} is applied to the high-field images along the z-direction, where σ is linked to the full width at half maximum by FWHM = 2√(2 ln 2) σ. The FWHM of the Gaussian filter is set to the slice thickness, or in terms of σ: σ = k e_z / (2√(2 ln 2)). The images Y(v) are first segmented into tissue categories j = WM, GM, others (denoted by M_j(v)) using the unified segmentation algorithm in Statistical Parametric Mapping [12]. In this algorithm, the mask M_j(v) gives the probability that each voxel v belongs to tissue category j. The SNR of the high-field image with respect to tissue category j is defined as

SNR_Y^j = Σ_v M_j(v) Y(v) / (σ_Y Σ_v M_j(v)).   (4)

Algorithm 1: Probabilistic Decimation Simulator for low-field images
Input: high-field images Y(v); masks M_j(v) for j = WM, GM, others; downsampling scale k ∈ N; background noise levels σ_X and σ_Y; low-field SNR distribution N(μ, Σ).
1: Y↓k(v) = Y↓k(ṽ, v′) = Σ_{v″} Y(ṽ, kv′ − v″) h_σ(v″)  ▷ Downsample on the v″ component.
2: Y_j↓k(v) = M_j(v) Y↓k(v)  ▷ Apply masks.
3: SNR_Y^j = Σ_v Y_j↓k(v) / (σ_Y Σ_v M_j(v))  ▷ Compute SNRs for high field.
4: (SNR_X^WM, SNR_X^GM) ~ N(μ, Σ)  ▷ Sample SNRs for low field.
5: l_j = SNR_X^j / SNR_Y^j for j = WM, GM; l_j = 1 for j = others  ▷ Evaluate the ratio of image intensity.
6: X̂(v) = Σ_{j ∈ {WM, GM, others}} l_j Y_j↓k(v)  ▷ Transfer contrast.
7: X̂_ε(v) = X̂(v) + ε(v), where ε(v) ~ N(0, σ_X²)  ▷ Add noise.
Output: noisy synthetic low-field image X̂_ε(v).

This allows us to evaluate the ratios of low-field-to-high-field image intensity for both WM and GM; see Step 5. We then re-scale the high-field images with the intensity ratios according to tissue category, which yields the synthetic low-field images X̂(v). We finally add Gaussian white noise to X̂(v) with standard deviation σ_X.

The classical 3D isotropic U-Net [14] maps two identically sized cubes, serving as input and output, through an encoder-decoder framework. Each level, defined as the collection of operations between two shape deformations, consists in a typical U-Net of several convolutional layers together with a pooling layer. The activation from each level in the encoder is concatenated to the input features of the same level in the decoder, enabling the network to integrate both local and global image features. The U-Net uses "same" zero-padding so that feature sizes remain invariant during convolution.

In this work, we extend the U-Net architecture to map input and output patches that differ by the up-scaling factor k in the slice direction. Considering the case of k = 4 illustrated in Fig. 2, this anisotropic U-Net first partially down-samples the first two dimensions until the features become isotropic, and thereafter conducts isotropic down- and up-sampling. To achieve this, we define the following two operations:

Bottleneck Block.
To incorporate a super-resolution transformation into the U-Net, we propose a bottleneck block used to connect corresponding levels of the contracting and expanding paths, as shown in Fig. 2(b). The design is inspired by the bottleneck block in ResNet [15] and by FSRCNN [16]. The bottleneck block BB(b, u) has three hyperparameters: the number of input filters f, the number of shrinking layers b, and the up-sampling scale factor u. It shrinks the number of filters by half over b consecutive convolutional layers with kernel size 1 × 1 × 1, before a final 1 × 1 × 1 convolution restores f filters. All convolution layers are activated by a Rectified Linear Unit (ReLU) with Batch Normalization (BN). The skip connection enables the training of deeper networks [15]. The resolution change is efficiently carried out by a transpose convolution, or deconvolution, with kernel and stride of (1, 1, u).

Fig. 2. (a) Diagram of the anisotropic U-Net (example for the up-scaling factor k = 4). The operations, (b) Bottleneck Block BB(b, u) with f filters and (c) Residual Core RC(b) with f filters, are detailed. The round boxes correspond to the different operations illustrated in the bottom right of (a). The number of output channels, abbreviated as "Ch", and the kernel size are denoted at the top and bottom of the boxes. The arrows represent the transfer of data, with the corresponding shapes highlighted.

Residual Core.
To have more convolutional layers on each level, we introduce the residual core, a revision of the residual element in [17], shown in Fig. 2(c). It is a combination of b sequential convolutional layers with kernel size 3 × 3 × 3, each followed by ReLU and BN, summed with a parallel 1 × 1 × 1 convolution on the skip connection.

High-resolution axial T1-weighted images were obtained from the publicly available Human Connectome Project (HCP) dataset [2], acquired on a 3T Siemens Connectome scanner with an isotropic voxel size of 0.7 × 0.7 × 0.7 mm³. To investigate the sensitivity of the proposed U-Net, we trained it on two training sets with up-scaling factors of k = 4 and k = 8. Specifically, the slice thickness/gap is 2.1/0.7 mm for k = 4 and 4.9/0.7 mm for k = 8, so that thickness plus gap equals k times the 0.7 mm high-field resolution. As a reference for low field, T1-weighted images were acquired on a 0.36T MagSense 360 MRI System scanner with a non-isotropic voxel size, comprising a fine in-plane resolution and a much larger slice thickness and gap. IQT Pipeline.
In the training stage, we randomly selected 30 skull-stripped subjects from the HCP dataset and used them to synthesise low-field images with Alg. 1 based on a priori variable SNRs. For patch extraction, we cropped the low-field patches with step sizes of 8, 16, and 16/k along the x-, y-, and z-directions, respectively. We also cropped the high-field patches covering the same volume and position as the corresponding low-field patches. The low-field and high-field patch sizes were 32 × 32 × (32/k) and 32 × 32 × 32, respectively. Patches containing more than 80% background voxels were then excluded from the patch library.

We checked for overfitting with a validation set and judged the performance of the trained neural network with an evaluation set, splitting the 30 subjects into 12, 3, and 15 subjects for the training, validation, and evaluation sets, respectively. We assessed image quality by calculating the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [18], and employed a two-tailed Wilcoxon signed-rank test to determine the statistical significance of the performance difference between two compared methods.
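The patch-pairing step above can be sketched in numpy. Strides, patch sizes, and the 80% threshold follow the text; the exact background test (here, zero intensity after skull-stripping) is our simplification:

```python
import numpy as np

def extract_patches(lowfield, highfield, k, bg_thresh=0.8):
    """Paired patch extraction: low-field patches of size 32x32x(32//k) on a
    stride of (8, 16, 16//k), matched high-field patches of size 32x32x32,
    excluding patches that are more than 80% background (zero-valued)."""
    pw, ph, pd = 32, 32, 32 // k
    sx, sy, sz = 8, 16, 16 // k
    pairs = []
    W, H, D = lowfield.shape
    for i in range(0, W - pw + 1, sx):
        for j in range(0, H - ph + 1, sy):
            for z in range(0, D - pd + 1, sz):
                lo = lowfield[i:i + pw, j:j + ph, z:z + pd]
                hi = highfield[i:i + pw, j:j + ph, k * z:k * z + 32]
                if np.mean(lo == 0) <= bg_thresh:   # skip mostly-background patches
                    pairs.append((lo, hi))
    return pairs

pairs = extract_patches(np.ones((64, 64, 16)), np.ones((64, 64, 64)), k=4)
print(len(pairs), pairs[0][0].shape, pairs[0][1].shape)  # 45 (32, 32, 8) (32, 32, 32)
```

Note how the z-index of a low-field patch is multiplied by k to locate the matching high-field block, which is what makes the pairs spatially aligned.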
Neural Networks.
We conducted an ablation study on the proposed U-Net, denoted ANISO U-Net(b), for b = 2 or 3 shrinking layers in the bottleneck block. We evaluated our networks against 3D cubic B-spline interpolation and several existing U-Net baselines obtained by switching off the corresponding blocks, i.e. the bottleneck block and the residual core, in ANISO U-Net. One is an isotropic 3D U-Net (ISO U-Net) [14] implemented with 5 levels and 3 convolutional layers per level; its input is isotropically interpolated using cubic B-splines. The other is 3D-SRU-Net [13], which up-samples each level's output on the contraction path before concatenation; it contained 3 levels for the down-sampling scale k = 4 and 4 levels for k = 8. We unified the hyperparameters of the three U-Nets as follows. The number of filters on the first level was 16, with the number of filters doubling at each subsequent level. All U-Nets were implemented in Python using the Keras library [19] with the TensorFlow backend and run on an Nvidia GTX 1080 Ti GPU. Training used ADAM [20] as the optimizer, with parameters initialized by the Glorot normal initializer [21]. The batch size was 32 and the loss function was the pixel-wise mean squared error (MSE). All experiments began to converge after about 30 epochs, and we employed early stopping after 5 epochs without improvement on the validation set.

Table 1. Performance of the proposed model for up-scaling factors k = 4 and 8, comparing cubic B-spline interpolation, ISO U-Net, 3D-SRU-Net, ANISO U-Net(2), and ANISO U-Net(3). The mean and standard deviation of PSNR and the mean SSIM (MSSIM) are calculated over the 15 evaluation subjects. For each case, we show the best performance over an ensemble of 5 trained models. Bold font denotes the best mean or standard deviation. An asterisk (*) denotes p-value < 0.01 compared with the remaining methods.

Fig. 3.
Visualization of the U-Net reconstructions with the up-scaling factor k = 8.

We evaluated the ability of the proposed U-Net in an ideal case where the SNR-related coefficient α = (SNR_X^WM, SNR_X^GM) in Eq. (1) is deterministic. We fixed SNR_X^WM and SNR_X^GM at 61 and 53, respectively, in the IQT pipeline by reconstructing images in the evaluation set at Step 4 of Algorithm 1. Table 1 shows that our model ANISO U-Net(2) achieved the best performance in terms of average PSNR and SSIM and, in particular, significantly outperformed the others in terms of PSNR at k = 4 and of mean SSIM (MSSIM) at k = 8. The reconstruction degraded as the up-scaling factor increased. Figure 3 shows the U-Net reconstructions on the coronal and sagittal planes. Qualitatively, we observed clear recovery of high-resolution information and enhancement of contrast. The reconstructed images from all networks nicely highlighted features visible in the ground-truth images that were obscured in the low-quality input. The quantitative results in Table 1 show little difference among the U-Net outputs, but they may not reflect subtle qualitative differences. The zoomed patches in Fig. 3 highlight the differences more clearly; we believe ANISO U-Net(2) approximates the ground truth most closely and with the fewest artefacts, as seen by comparison with the ANISO U-Net(3) result in Fig. 3. Careful selection of hyperparameters can avoid overfitting and hence mitigate such artifacts.

3.3 Evaluation on Variable SNR Data Sets

We evaluated the performance of several deep learning architectures, including the proposed anisotropic U-Net, with variable-SNR low-field data.
SNR_X^WM and SNR_X^GM are now sampled from a two-dimensional Gaussian distribution N(μ, Σ), with mean vector μ and covariance matrix Σ estimated from the reference low-field data. The simulator in Alg. 1 randomly generated N low-field input images with different SNRs for the chosen 15 training subjects in the HCP data set. We trained the deep learning models on datasets with augmenting factors up to N = 8.
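The per-image SNR draw can be sketched as below. The mean values here reuse the WM/GM figures (61, 53) from the fixed-SNR experiment as stand-ins, and the covariance entries are illustrative, since the exact μ and Σ are not stated legibly in this text:

```python
import numpy as np

# Draw (SNR_WM, SNR_GM) pairs for the probabilistic decimation simulator.
rng = np.random.default_rng(42)
mu = np.array([61.0, 53.0])       # stand-in means (fixed-SNR experiment values)
cov = np.array([[4.0, 1.0],       # illustrative covariance: correlated tissue SNRs
                [1.0, 4.0]])
samples = rng.multivariate_normal(mu, cov, size=10000)
print(samples.mean(axis=0))       # close to (61, 53)
```

Each synthetic low-field image receives its own draw, so the N augmented copies of a subject differ in tissue contrast rather than only in noise.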
A subset of patches was randomly selected for training from each overlapping patch library, and for each neural network an ensemble of 5 models was trained on different augmented datasets.

Table 2 reports the mean and standard deviation of PSNR and MSSIM over the 15 test subjects for the different augmented datasets and deep learning architectures. The probabilistic decimation model generally produced more stable reconstructions than the deterministic model when the unseen test data were also generated with variable SNR. Both accuracy and robustness, corresponding to the mean and standard deviation of MSSIM, improved to varying degrees as the number of generated low-field image samples increased; in addition, the difference between the two methods was statistically significant for N = 8 at k = 8. Regarding PSNR, the performance improved after augmentation, but the robustness reflected by the standard deviation did not improve correspondingly. We also observed that PSNR and MSSIM at k = 4 improved only slightly as the augmenting factor N grew, which suggests that the gain from augmentation gradually reaches an upper bound.

We tested our IQT approach on data from a 10-year-old epilepsy patient who has two cortical-subcortical cystic lesions with surrounding edema, located at the GM-WM junction of the parietal lobes, on low-field T1-weighted images. In this case, we used IQT with ANISO U-Net(2) trained on the HCP dataset with the augmenting factor N = 1 and the up-scaling factor k = 4. Figure 4 shows the axial and coronal results enhanced from the patient's low-field T1-weighted image. The IQT approach improved the GM-WM contrast globally and significantly enhanced the resolution in the coronal and sagittal planes. The enhanced image strongly highlights the two lesions in this patient, which are very subtle on the input T1-weighted image.
In this particular patient, the lesions were clearly visible on the original T2-weighted image, which validates that IQT highlights the lesions in the correct locations, as Fig. 4(c) shows. However, in general not all lesions are clearly visible on every MRI sequence, especially at low field, and Fig. 4 highlights the potential of our algorithm to reveal subtle lesions, enhancing diagnosis and potentially enabling effective treatment via clear localisation.

Table 2. Performance of the probabilistic decimation simulation for augmentation with a factor of N, for 3D-SRU-Net and ANISO U-Net(2). The mean and standard deviation of PSNR and MSSIM are calculated over the 15 evaluation subjects. We show the best performance over an ensemble of 5 trained models. The "const" entry in the N samples/subject column means that the model was trained on the fixed-SNR datasets described in Section 3.2. Bold font denotes the best mean or standard deviation. An asterisk (*) denotes p-value < 0.01 compared with the other augmentation factors.

In this work, we present an IQT approach to enhance low-field MRI, aiming to match the resolution as well as the contrast of high-field images. We introduce an anisotropic U-Net characterised by a deeper hierarchy and super-resolving connections between input and output layers. We propose a probabilistic decimation simulator that synthesises multiple low-field images with distinct grey-white-matter SNRs sampled from an a priori distribution, and we demonstrate that the proposed method improves robustness on unseen test data of variable SNR at the evaluation stage. We validate the proposed U-Net on the evaluation dataset, and the results suggest generalisability to actual clinical low-field images.

This work offers several avenues for future improvement and application. The metrics used here for quantitative assessment (MSSIM and PSNR) reflect performance only on synthetic images. This demonstrates efficacy, but evaluation on a sizeable set of clinical images, with assessment of clinical significance by radiologists, is essential for further translation. Additional qualitative evaluation by radiologist ratings and, ultimately, demonstration of improved decision making are therefore needed to confirm the impact of the approach. Nevertheless, we believe our methods have great potential to identify subtle lesions in epilepsy and other neurological conditions, and thus to improve patient outcomes in LMICs in the future.
Fig. 4.
The IQT prediction on the low-field epilepsy patient data for the (a-c) axial plane and (d-f) coronal plane. (a), (d): low-field T1-weighted input with cubic B-spline interpolation; (b), (e): IQT-enhanced T1-weighted output using ANISO U-Net(2); (c), (f): low-field T2-weighted image as a reference for ground truth. Two sub-centimetre parenchymal cystic lesions at the GM-WM junction of the parietal lobes are indicated by the red and yellow arrows. They are barely visible in (a) and (d) but greatly enhanced in (b) and (e). (c) and (f), not involved in the IQT experiment, verify their location in an independent acquisition.
Acknowledgements
This work was supported by EPSRC grants (EP/R014019/1, EP/R006032/1 and EP/M020533/1) and the NIHR UCLH Biomedical Research Centre. Data were provided in part by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657), funded by NIH and Washington University. The 0.36T MRI data were acquired at the University College Hospital, Ibadan, Nigeria.
References
1. Wadghiri, Y.Z., Johnson, G., Turnbull, D.H.: Sensitivity and performance time in MRI dephasing artifact reduction methods. Magnetic Resonance in Medicine (3), 470-476 (2001)
2. Sotiropoulos, S.N., et al.: Advances in diffusion MRI acquisition and processing in the Human Connectome Project. NeuroImage, 125-143 (2013)
3. Marques, J.P., Simonis, F.F.J., Webb, A.G.: Low-field MRI: An MR physics perspective. Journal of Magnetic Resonance Imaging (6), 1528-1542 (2019)
4. Brown, R.W., Cheng, Y.-C.N., Haacke, E.M., Thompson, M.R., Venkatesan, R.: Magnetic Resonance Imaging: Physical Principles and Sequence Design. 2nd edn. John Wiley & Sons, Inc., Hoboken, New Jersey (2014)
5. Bahrami, K., Shi, F., Zong, X., Shin, H.W., An, H., Shen, D.: Reconstruction of 7T-like images from 3T MRI. IEEE Transactions on Medical Imaging (9), 2085-2097 (2016)
6. Wolterink, J.M., Dinkla, A.M., Savenije, M.H.F., Seevinck, P.R., van den Berg, C.A.T., Išgum, I.: Deep MR to CT synthesis using unpaired data. In: SASHIMI 2017, LNCS 10557, pp. 14-23. Springer, Cham (2017)
7. Cohen, J.P., Luck, M., Honari, S.: Distribution matching losses can hallucinate features in medical image translation. In: MICCAI 2018, LNCS 11070, pp. 529-536. Springer, Cham (2018)
8. Alexander, D.C., Zikic, D., Ghosh, A., Tanno, R., Wottschel, V., Zhang, J., Kaden, E., Dyrby, T.B., Sotiropoulos, S.N., Zhang, H., Criminisi, A.: Image quality transfer and applications in diffusion MRI. NeuroImage, 283-298 (2017)
9. Tanno, R., Ghosh, A., Grussu, F., Kaden, E., Criminisi, A., Alexander, D.C.: Bayesian image quality transfer. In: MICCAI 2016, LNCS 9901, pp. 265-273. Springer, Cham (2016)
10. Tanno, R., Worrall, D.E., Ghosh, A., Kaden, E., Sotiropoulos, S.N., Criminisi, A., Alexander, D.C.: Bayesian image quality transfer with CNNs: Exploring uncertainty in dMRI super-resolution. In: MICCAI 2017, LNCS 10433, pp. 611-619. Springer, Cham (2017)
11. Blumberg, S.B., Tanno, R., Kokkinos, I., Alexander, D.C.: Deeper image quality transfer: Training low-memory neural networks for 3D images. In: MICCAI 2018, LNCS 11070, pp. 118-125. Springer, Cham (2018)
12. Ashburner, J., Friston, K.J.: Unified segmentation. NeuroImage, 839-851 (2005)
13. Heinrich, L., Bogovic, J.A., Saalfeld, S.: Deep learning for isotropic super-resolution from non-isotropic 3D electron microscopy. In: MICCAI 2017, LNCS 10434, pp. 135-143. Springer, Cham (2017)
14. Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In: MICCAI 2016, LNCS 9901, pp. 424-432 (2016)
15. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR 2016, pp. 770-778 (2016)
16. Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network. In: ECCV 2016, LNCS 9906, pp. 391-407. Springer, Cham (2016)
17. Guerrero, R., Qin, C., Oktay, O., Bowles, C., Chen, L., Joules, R., Wolz, R., Valdes-Hernandez, M.C., Dickie, D.A., Wardlaw, J., Rueckert, D.: White matter hyperintensity and stroke lesion segmentation and differentiation using convolutional neural networks. NeuroImage: Clinical, 918-934 (2018)
18. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing (4), 600-612 (2004)
19. Chollet, F., et al.: Keras. https://keras.io (2015)
20. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
21. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: AISTATS 2010, PMLR, pp. 249-256 (2010)