Combining Multi-Sequence and Synthetic Images for Improved Segmentation of Late Gadolinium Enhancement Cardiac MRI
Víctor M. Campello, Carlos Martín-Isla, Cristian Izquierdo, Steffen E. Petersen, Miguel A. González Ballester, Karim Lekadir
Dept. Matemàtiques i Informàtica, Universitat de Barcelona, Spain ([email protected]); BCN-MedTech, DTIC, Universitat Pompeu Fabra, Barcelona, Spain; Barts Heart Centre, Barts Health NHS Trust, London, United Kingdom; William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, United Kingdom; ICREA, Barcelona, Spain
Abstract.
Accurate segmentation of the cardiac boundaries in late gadolinium enhancement magnetic resonance images (LGE-MRI) is a fundamental step for accurate quantification of scar tissue. However, while there are many solutions for automatic cardiac segmentation of cine images, the presence of scar tissue can make the correct delineation of the myocardium in LGE-MRI challenging even for human experts. As part of the Multi-Sequence Cardiac MR Segmentation Challenge, we propose a solution for LGE-MRI segmentation based on two components. First, a generative adversarial network is trained for the task of modality-to-modality translation between cine and LGE-MRI sequences to obtain extra synthetic images for both modalities. Second, a deep learning model is trained for segmentation with different combinations of original, augmented and synthetic sequences. Our results based on three magnetic resonance sequences (LGE, bSSFP and T2) from 45 different patients show that multi-sequence model training integrating synthetic images and data augmentation improves the segmentation over conventional training with real datasets. In conclusion, the accuracy of the segmentation of LGE-MRI images can be improved by using complementary information provided by non-contrast MRI sequences.
Keywords:
Multi-sequence cardiac MRI · Late gadolinium enhancement MRI · Image segmentation · Image synthesis · Deep learning.
Introduction

Late gadolinium enhancement magnetic resonance imaging (LGE-MRI) is widely used to assess the presence, location and extent of regional scar or fibrotic tissue in the myocardium. Whilst LGE-MRI is a well-established technique and key to many cardiovascular magnetic resonance (CMR) examinations, there are challenges in quantification and interpretation due to a number of factors. Image analysis depends on image quality, which can be affected by suboptimal CMR acquisition. Correct inversion times (TI) need to be identified and then appropriately adjusted to allow good 'nulling' of remote, unaffected myocardium. This ensures optimal contrast between scar/fibrosis (bright) and normal, remote myocardium (dark). Timing after contrast administration is important to allow sufficient wash-out of the contrast agent (gadolinium chelate) not only from the remote myocardium but also from the blood pool. Images acquired too early will leave the blood pool bright, which makes differentiating subendocardial infarct from blood pool challenging.

In the existing literature, two main families of techniques have been proposed to automatically segment LGE-MRI data. The first one segments the LGE-MRI images directly by using different techniques such as graph-cuts [1], atlas-based registration [2], or more recently Convolutional Neural Networks (CNNs) [3]. However, these techniques generally lack robustness due to the limited availability of LGE-MRI datasets for training. As a result, the second family of techniques has considered exploiting other cardiac MRI sequences to provide additional signals for guiding the segmentation process more robustly. For instance, some researchers [4,5] proposed to segment cine-MRI images first and to propagate the obtained contours into the LGE-MRI images through image registration.
Similarly, but using additional sequences, the authors in [6] implemented an atlas-based segmentation approach combining information from balanced Steady-State Free Precession (bSSFP), LGE and T2 sequences. However, these techniques are highly dependent on the image registration step, which is challenging due to the inherent differences between the cardiac MRI sequences.

In addition, in order to improve segmentation and increase model robustness on unseen data, image synthesis has been proposed recently. The most common model combines generative adversarial networks (GANs) with a cycle-consistency constraint for image-to-image translation and two segmentation networks, one for each image domain, trained end-to-end in order to benefit from a combined loss function. This model has been applied for cross-modality segmentation improvement [7,8], domain adaptation across scanners [8] or across modalities [9], and segmentation of an unlabeled target modality using only the source ground truth [10,11]. Alternatively, a GAN can be trained to generate synthetic images from masks according to some conditional value, like the dataset style, as in the case of retinal fundus images for vessel segmentation [12].

In this paper, we propose an approach that circumvents the need for image registration, while addressing the lack of LGE-MRI images for training. Concretely, we implement a CNN-based approach that is capable of learning key properties of the cardiac structures simultaneously from multiple cardiac MRI sequences. Furthermore, image synthesis and data augmentation are used to generate new examples that take into account both the global appearance of LGE-MRI data and the local appearance of scar tissues. With this approach, direct deep-learning-based segmentation of LGE-MRI is enabled without the need for inter-sequence image registration, while exploiting the richness of multi-sequence cardiac MRI.
Dataset

The LGE-MRI dataset used in this paper was provided as part of the Multi-Sequence Cardiac Magnetic Resonance Segmentation Challenge (MS-CMRSeg). It consists of 45 patients from Shanghai Renji Hospital that were scanned using three MRI sequences: bSSFP, LGE and T2. Ground truth segmentations of the left ventricle (LV), right ventricle (RV) and myocardium (MYO) were provided for some of the cases according to the distribution in Table 1 (second row). Even though all sequences were acquired and selected for the end-diastolic cardiac phase, there were consistent differences in the shape of the cardiac boundaries between the three sequences for the same patient. Moreover, the slices were not aligned between the sequences in the direction of the ventricular axis, which further complicates the application of image registration. Note that all patients in the sample suffer from cardiomyopathies and that every LGE-MRI image presents a scar of variable size within the myocardial wall.
Table 1. MS-CMRSeg sequence details.

                          bSSFP        LGE          T2
Number of patients        45           45           45
Segmented patients        35           5            35
Number of slices          8–12         10–18        3–7
Slice thickness (mm)      8–13         5            12–20
TR/TE (ms)                2.7/1.4      3.6/1.8      2000/90
In-plane resolution (mm)  1.25 × 1.25  0.75 × 0.75  1.35 × 1.35

Data pre-processing
As a first step, intensity bias correction was applied to all sequences to correct for potential artifacts, and the intensity histograms of all images were matched to a common reference to obtain coherent appearances across images. Furthermore, before the training process, all images were interpolated and cropped to a common pixel size and resolution. They were also normalised to a fixed mean intensity and standard deviation, ensuring most of the input values lie between 0 and 1 for convenience in later representation of the images.

Before describing the CNN model implemented in this paper for LGE-MRI segmentation, this section presents two methods used to increase the amount of training data and obtain higher LGE-MRI variability.
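As an illustration of the pre-processing described above, the following sketch implements histogram matching by quantile mapping and the mean/standard-deviation normalisation in NumPy. It is a minimal sketch, not the paper's implementation: the target mean and standard deviation (0.5 and 0.125) are assumed values chosen so that most intensities land in [0, 1], and bias correction is omitted.

```python
import numpy as np

def match_histogram(image, reference):
    """Map the intensities of `image` onto the distribution of `reference`
    by aligning the two cumulative histograms (quantile mapping)."""
    src_vals, src_idx, src_counts = np.unique(
        image.ravel(), return_inverse=True, return_counts=True)
    ref_vals, ref_counts = np.unique(reference.ravel(), return_counts=True)
    src_quantiles = np.cumsum(src_counts) / image.size
    ref_quantiles = np.cumsum(ref_counts) / reference.size
    # For each source quantile, look up the reference intensity at that quantile.
    mapped = np.interp(src_quantiles, ref_quantiles, ref_vals)
    return mapped[src_idx].reshape(image.shape)

def normalise(image, target_mean=0.5, target_std=0.125):
    """Standardise, then rescale so that most values fall inside [0, 1]."""
    z = (image - image.mean()) / (image.std() + 1e-8)
    return z * target_std + target_mean
```

Matching all sequences to one common reference histogram, as done here, is what gives the network coherent intensity statistics across bSSFP, LGE and T2 inputs.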
Data augmentation
By using the provided segmentations, a set of 50 landmarks were evenly placed around the epicardium and endocardium. With these, the myocardium and left ventricle were rotated relative to the rest of the image, as shown in the examples in Figure 1, in order to obtain an augmented dataset with varying locations of the scar tissues. Since the contour of the epicardium is not perfectly round in general, a Gaussian filter was applied around the outer boundary to smooth the transition between the rotated and fixed regions, thus preventing image intensity discontinuities. A total of twenty 7.2-degree rotations were applied. Thus, the LGE-MRI dataset was multiplied by a factor of 20 and the location of the scar in the myocardium ranged between the initial position and 144 degrees clockwise. This augmentation technique increases the variability in the scar locations within the myocardial wall, which was otherwise very low due to the small number of patients available for training. Furthermore, additional data augmentation was obtained by applying small rotations of up to 15 degrees to the input images before training.

Fig. 1.
Example of three rotations of the myocardial wall with respect to the whole image by using the landmarks provided in the leftmost image. This shows the changes in the location of the scar tissues.
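A simplified version of this scar-location augmentation can be sketched as follows: a rotated copy of the image is blended back into the original through a Gaussian-smoothed mask, so only the masked (LV plus myocardium) region appears rotated. This is a sketch under simplifying assumptions, not the paper's landmark-based method — it rotates about the image centre rather than the epicardial landmarks, so it assumes the heart was centred by the cropping step.

```python
import numpy as np
from scipy import ndimage

def rotate_scar_region(image, lv_myo_mask, angle_deg, sigma=2.0):
    """Blend a rotated copy of `image` into the original inside the
    (rotated) LV+myocardium mask, feathering the seam with a Gaussian."""
    rot_img = ndimage.rotate(image, angle_deg, reshape=False, order=1)
    rot_mask = ndimage.rotate(lv_myo_mask.astype(float), angle_deg,
                              reshape=False, order=1)
    # Gaussian smoothing of the mask avoids intensity discontinuities
    # at the boundary between the rotated and fixed regions.
    weight = ndimage.gaussian_filter(np.clip(rot_mask, 0.0, 1.0), sigma)
    return weight * rot_img + (1.0 - weight) * image

# Twenty rotations in 7.2-degree steps would then be:
# augmented = [rotate_scar_region(img, mask, 7.2 * k) for k in range(1, 21)]
```

The Gaussian feathering plays the role of the smoothing filter described above: without it, the seam between the rotated and fixed regions would introduce sharp intensity edges that the segmentation network could latch onto.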
Image synthesis
The rationale behind the proposed image synthesis is that there are many more segmented cine-MRI datasets available open-access or in clinical registries for training CNN models. Thus, to increase the number of annotated LGE-MRI cases for training, image synthesis from cine-MRI image sequences is proposed. To achieve this, the CycleGAN method [13] was implemented using the PyTorch library provided at https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix.

This method translates images from one domain to another without the need for image registration or for the sequences to be from the same patients. It consists of a pair of generators G_LGE, G_bSSFP and a pair of discriminators D_LGE, D_bSSFP that have opposing goals. The generator G_LGE (resp. G_bSSFP) transforms the bSSFP (resp. LGE) sequence into a realistic LGE (bSSFP) image, while the discriminator D_LGE (D_bSSFP) attempts to distinguish between real and fake LGE (bSSFP) sequences. To achieve a good image translation between the two sequences, the loss function contains two terms: (1) an adversarial loss for each target domain that accounts for the similarity between the generated and real images, and (2) a cycle-consistency loss that ensures that the transformed image G_LGE(X) (resp. G_bSSFP(Y)) is transformed back to X (Y) through G_bSSFP (G_LGE).

Fig. 2.
Examples of synthetic LGE-MRI images. The leftmost column shows the original cine images, the central column shows the images transformed to the LGE domain, and the rightmost column shows the most similar slice from the real LGE sequences, since they were not registered/aligned.
For the training of the CycleGAN model, all slices from the 45 patients for the LGE and bSSFP sequences were used during 200 epochs. The training took 12 hours on an NVIDIA 1080 GPU. The Adam optimizer was used, with learning rate and first and second moment decay rates following the original CycleGAN implementation. Some examples of the generated images are shown in Figure 2.

In order to evaluate the quality of the generated images, two segmentation models (like the one described in the next subsection) were trained using the bSSFP images and the synthetic LGE images separately. The obtained results are presented in Table 2. In particular, the synthetic LGE images, which are anatomically similar to the original bSSFP, provide more information for the task of LGE segmentation.

Table 2.
Average and standard deviation of the Dice score for segmentation results over the five labeled LGE volumes.

                                 LV            MYO           RV
                                 avg.   std.   avg.   std.   avg.   std.
Model trained w. bSSFP           0.503  0.406  0.370  0.301  0.515  0.434
Model trained w. synthetic LGE   0.809  0.116  0.688  0.145  0.820  0.065
Segmentation model

Once a large set of training samples was obtained from the original, augmented and synthetic images, a modified U-Net architecture [14] was used for the image segmentation, integrating two techniques: (1) a deep supervision term in the upsampling path, as proposed in [15], which acts as lower-resolution masks that are convolved to condition the final predictions; and (2) a reduction of the number of filters after each upsampling operation to match the number of labels, as proposed by [16]. Each image in the dataset was provided as a single-channel input, thus forcing the model to differentiate between sequences with a unique set of weights. Additionally, in order to avoid overfitting given the sample size, dropout was used after every max pooling and upsampling operation, except for the high-level features in the architecture, as shown in Figure 3.
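The deep supervision idea — predicting a coarse mask at each decoder resolution and letting it condition the next, finer prediction — can be sketched as follows. This is a hypothetical simplification: the class name, channel counts and bilinear upsampling are assumptions for illustration, not the exact architecture of Figure 3.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepSupervision(nn.Module):
    """Attach a 1x1 convolution to each decoder level; each coarse
    prediction is upsampled and added to the next, finer one."""
    def __init__(self, decoder_channels, n_labels):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Conv2d(c, n_labels, kernel_size=1) for c in decoder_channels)

    def forward(self, feats):
        # `feats` is ordered coarse -> fine, with spatial size doubling
        # at each level, as in a U-Net upsampling path.
        out = None
        for head, feat in zip(self.heads, feats):
            pred = head(feat)
            out = pred if out is None else F.interpolate(
                out, scale_factor=2, mode="bilinear",
                align_corners=False) + pred
        return out
```

Because the coarse predictions are summed into the final output, the low-resolution masks receive gradient directly from the segmentation loss, which regularises the decoder when few labeled LGE volumes are available.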
Fig. 3.
Detailed architecture of the CNN model used for LGE segmentation. The numbers in the boxes correspond to the number of channels. Convolution and transpose convolution kernel sizes and strides are fixed throughout the network.

During training, 20% of the patients for each dataset was reserved for validation and early stopping. The model took less than 36 hours to achieve the best accuracy on the validation set, after almost 90 epochs on an NVIDIA TITAN X GPU, using the Adam optimizer.

In order to define the best trained CNN model for LGE-MRI segmentation, various training sets were used by varying the input sequences and combinations of image synthesis and scar augmentation, as follows:
1. LGE sequences only;
2. LGE and bSSFP sequences;
3. All sequences (LGE, bSSFP and T2);
4. All sequences plus MYO and LV rotations in LGE sequences;
5. Number 1 plus synthetic LGE sequences;
6. Number 2 plus synthetic LGE sequences;
7. Number 3 plus synthetic LGE sequences;
8. Number 4 plus synthetic LGE sequences.

When evaluated on the validation set, training set number 8 resulted in the best segmentations, showing the added value of image synthesis and data augmentation for LGE-MRI segmentation. Thus, we applied the corresponding CNN model to the test dataset composed of 40 LGE-MRI cases. The obtained segmentations were sent to the organizers of the MS-CMRSeg Challenge for evaluation. The results are summarized in Table 3, showing average Dice scores of 90% (LV), 81% (MYO) and 87% (RV).
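For reference, the overlap metrics reported in Table 3 can be computed from binary masks as in this short sketch (standard definitions, not code from the paper):

```python
import numpy as np

def dice_score(pred, gt):
    """Dice = 2|A ∩ B| / (|A| + |B|), for boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def jaccard_index(pred, gt):
    """Jaccard = |A ∩ B| / |A ∪ B|; related to Dice by J = D / (2 - D)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union
```

The relation J = D / (2 − D) explains why the Jaccard values in Table 3 are consistently lower than the corresponding Dice values.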
Table 3.
Average and standard deviation of the results over the test set.

                         LV            MYO           RV
                         avg.   std.   avg.   std.   avg.   std.
Dice score               0.898  0.045  0.810  0.061  0.866  0.051
Jaccard index            0.817  0.072  0.685  0.084  0.768  0.078
Surface distance (mm)    2.0    0.8    1.8    0.5    2.3    0.9
Hausdorff distance (mm)  11     4      12     4      16     7

Two remarks are important regarding the results reported in Table 3: (1) Despite the high variability in the LGE-MRI datasets, especially in the presence, extent and location of the scar tissues, relatively consistent results are obtained, with standard deviations for the Dice scores around 5%. (2) Despite the availability of only 5 LGE-MRI volumes for training, the proposed approach was able to achieve results comparable to very recent deep learning techniques [3], which reported similar Dice scores for the LV, MYO and RV based on 5 times more training cases (25 LGE-MRI images). This indicates the value of the proposed inter-sequence synthesis and scar augmentation for generating richer training samples.

Finally, for visual illustration, Figure 4 shows three segmentation examples as obtained in this study. Model number 3 (second column) introduces errors that are corrected when adding synthetic images (model number 7, third column). The last column shows that the segmentation further improves when integrating the scar tissue augmentation proposed in this paper (model 8).

Fig. 4.

[Columns: Original | bSSFP + LGE + T2 | bSSFP + LGE + T2 + syn. LGE | bSSFP + LGE + T2 + syn. & rot. LGE]

Three segmentation examples as obtained by using different training combinations, showing the improvement achieved by integrating inter-sequence image synthesis (column 3) and scar tissue augmentation (column 4) during training.

Conclusion

This paper proposed to address the limited availability of training samples for LGE-MRI segmentation by enriching the CNN models using two complementary methods. Firstly, since samples of annotated cine-MRI sequences are more commonly available, image synthesis of LGE-MRI images was implemented using a CycleGAN approach, thus obtaining a larger number of LGE-MRI cases during training. Secondly, we performed LGE-specific data augmentation through shape-guided rotations of the myocardium, which increases the variability related to the location of the scar tissues in the myocardium. The validation shows consistent results across the datasets, indicating the potential of this approach for enhancing the richness and generalization of LGE-specific CNNs.

Future work includes the extension of the image synthesis to take into account local cardiac motion abnormality for synthesizing scar tissue, as well as the use of elastic deformations of the myocardium and scar to augment the LGE-MRI examples non-rigidly. Furthermore, extensive validation will be performed to assess in detail the relative importance of the different steps and sequences (bSSFP, T2) in enriching the CNN models for LGE segmentation.
Acknowledgements

This work was partly funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 825903 (euCanSHare project). SEP acts as a paid consultant to Circle Cardiovascular Imaging Inc., Calgary, Canada, and Servier. SEP acknowledges support from the National Institute for Health Research (NIHR) Cardiovascular Biomedical Research Centre at Barts, from the SmartHeart EPSRC programme grant (EP/P001009/1) and the London Medical Imaging and AI Centre for Value-Based Healthcare. SEP and KL acknowledge support from the CAP-AI programme, London's first AI enabling programme focused on stimulating growth in the capital's AI Sector.