Automatic Cerebral Vessel Extraction in TOF-MRA Using Deep Learning
V. de Vos, K.M. Timmins, I.C. van der Schaaf, Y. Ruigrok, B.K. Velthuis, H.J. Kuijf
Eindhoven University of Technology, The Netherlands; University Medical Center Utrecht, The Netherlands
ABSTRACT
Deep learning approaches may help radiologists in the early diagnosis and timely treatment of cerebrovascular diseases. Accurate cerebral vessel segmentation of Time-of-Flight Magnetic Resonance Angiographs (TOF-MRAs) is an essential step in this process. This study investigates deep learning approaches for automatic, fast and accurate cerebrovascular segmentation for TOF-MRAs.

The performance of several data augmentation and selection methods for training a 2D and 3D U-Net for vessel segmentation was investigated in five experiments: a) without augmentation, b) Gaussian blur, c) rotation and flipping, d) Gaussian blur, rotation and flipping and e) different input patch sizes. All experiments were performed by patch-training both a 2D and 3D U-Net and evaluated on a test set of MRAs. Ground truth was manually defined using an interactive threshold and region growing method. The performance was evaluated using the Dice Similarity Coefficient (DSC), Modified Hausdorff Distance and Volumetric Similarity between the predicted images and the interactively defined ground truth.

The segmentation performance of all trained networks on the test set was found to be good, with DSC scores ranging from 0.72 to 0.83. Both the 2D and 3D U-Net had the best segmentation performance with Gaussian blur, rotation and flipping compared to other experiments without augmentation or with only one of those augmentation techniques. Additionally, training on larger patches or slices gave optimal segmentation results.

In conclusion, vessel segmentation can be optimally performed on TOF-MRAs using a 3D U-Net trained on larger patches, where data augmentation including Gaussian blur, rotation and flipping was performed on the training data.
Keywords: Cerebrovascular diseases, Magnetic Resonance Angiography (MRA), segmentation, deep learning, U-Net
1. INTRODUCTION
Stroke, including ischemic and hemorrhagic stroke and aneurysmal subarachnoid hemorrhage, is a major cause of death and disability worldwide, with more than six million deaths in 2015. In some cases it is caused by abnormalities of the intracranial arteries, including stenosis, intracranial aneurysms and other vascular malformations. The incidence is increasing as the population ages [1, 2].

For an early diagnosis and timely treatment of various cerebrovascular diseases, detailed information about the vasculature might aid a radiologist in decision making. This information could be obtained from cerebrovascular segmentations, where the blood vessels are extracted from the images. This allows for quantitative analysis of the vasculature, as well as better (3D) visualization [1, 3, 4]. Currently, the use of such segmentations is not common practice because it often requires manual segmentation, a difficult and time-consuming procedure that is prone to inter- and intra-rater variability [1, 5].

Automatic vessel extraction methods could overcome this issue, including methods such as Markov random fields, multi-scale filtering, deformable models, hybrid methods and deep learning [1, 5]. Such methods create a 3D vascular model for every patient, which can be useful to find vessel abnormalities. In a study by Gan et al. (2005), an automatic vessel segmentation method based on maximum intensity projections (MIP) was presented. This method compiled the vessel segmentation iteratively by using the segmentation of the MIP images along a fixed direction. The MIP images were segmented with a finite mixture model (FMM) and the expectation maximization (EM) algorithm. Once the images were segmented along the individual axes, the results were combined. In addition, Phellan et al. (2017) proposed a deep Convolutional Neural Network (CNN) to automatically segment the vessels in TOF-MRA images of healthy subjects. Experiments were performed with a varying number of images for training the CNN, and cross-validation was used to test the generalization of the model. The ground truth was obtained from manually annotated image patches extracted in the axial, coronal and sagittal directions.

This study provides an automatic vessel segmentation method by training and evaluating a CNN with U-Net architecture, which is one of the most promising deep learning networks for segmentation tasks. To evaluate the performance of this network, different experiments were performed to compare a 2D and 3D U-Net architecture with several training data augmentation and selection methods.
2. MATERIALS AND METHODS
2.1 Dataset
The data used in this study included 69 patients with unruptured aneurysms scanned in the University Medical Center Utrecht, the Netherlands. All patients underwent a 3D TOF-MRA scan in the period between 2004 and 2012 and were scanned twice: a baseline scan and a follow-up scan. An example of one slice of a TOF-MRA is shown in Figure 1. Overall, the slice thickness ranged from 0.4 to 0.7 mm and the in-plane voxel size ranged from 0.195x0.195 to 0.586x0.586 mm.

Figure 1: Example slice in the transverse plane of a TOF-MRA.
Before segmenting the vascular structure, the images in the dataset described in section 2.1 were preprocessed using N4 bias field inhomogeneity correction [10, 11] and Z-score normalization.

The dataset did not contain delineations of the brain vasculature. To acquire the labelled ground truth data for vessel segmentation, interactive vessel segmentation was performed. First, the image was interactively thresholded using histogram-based thresholding, in which the user chooses the image-specific intensity percentage at which the threshold is set. The threshold of all images was chosen between 95% and 99% of the maximum image intensity. The resulting thresholded image was used to define seed points for region growing. The resulting labels were manually checked for accuracy and corrected as required. The interactive vessel segmentation was performed in MevisLab (version 3.2).

2.3 Network

Both a 2D and a 3D fully convolutional neural network with U-Net architecture were trained on randomly selected and augmented patches from TOF-MRA images. For the 2D network, the input patches had a size of 64x64 voxels and for the 3D network a size of 16x16x16 voxels, so that both networks were trained on the same number of voxels per patch (4096). The same patches were used for all the experiments.

A balanced number of patches from vessel (80%) and non-vessel (20%) regions were used for training. The selection of patches was based on the center voxel of each patch. When this voxel was labelled as vessel in the ground truth image, the patch was categorized as a patch containing vessels; otherwise it was categorized as a non-vessel patch.

Finally, both the 2D and 3D network were optimized using a Dice loss function, the Adam optimizer and a learning rate of 1 × − .

For both the 2D and 3D architectures (190.396 trainable parameters), five experiments were compared. In all experiments, the same MRAs were used for training (n = 84, 64%), validation (n = 21, 16%) and testing (n = 26, 20%). The first experiment, (a), was performed without applying any augmentation technique to the training data.
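Before turning to the remaining experiments, the center-voxel patch selection rule described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function names `sample_patch_centers` and `extract_patch`, sampling with replacement, and the restriction of centers to positions where the whole patch fits inside the volume are all assumptions.

```python
import numpy as np


def sample_patch_centers(labels, n_patches, vessel_fraction=0.8,
                         patch_size=(16, 16, 16), rng=None):
    """Draw patch centers so that a fixed fraction lie on vessel voxels.

    Hypothetical helper: a patch counts as a vessel patch when its center
    voxel is labelled as vessel in the ground truth.
    """
    rng = np.random.default_rng() if rng is None else rng
    half = [p // 2 for p in patch_size]

    # Only allow centers for which the full patch lies inside the volume.
    interior = np.zeros(labels.shape, dtype=bool)
    interior[tuple(slice(h, s - p + h + 1)
                   for h, s, p in zip(half, labels.shape, patch_size))] = True

    vessel = np.argwhere(interior & (labels > 0))
    background = np.argwhere(interior & (labels == 0))

    n_vessel = int(round(vessel_fraction * n_patches))
    n_background = n_patches - n_vessel

    # Sampling with replacement (an assumption); fails if a class is empty.
    picked_vessel = vessel[rng.integers(0, len(vessel), n_vessel)]
    picked_background = background[rng.integers(0, len(background), n_background)]
    return np.concatenate([picked_vessel, picked_background])


def extract_patch(volume, center, patch_size=(16, 16, 16)):
    """Cut the patch whose center voxel sits at index patch_size // 2."""
    start = [c - p // 2 for c, p in zip(center, patch_size)]
    return volume[tuple(slice(s, s + p) for s, p in zip(start, patch_size))]
```

With `vessel_fraction=0.8` the sketch reproduces the 80%/20% split between vessel and non-vessel patches; a 2D variant would apply the same rule within a single slice.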
Next, three experiments were performed by training the networks on the patches with different augmentation techniques: b) Gaussian blurring, c) rotation and flipping and d) both Gaussian blurring and rotation and flipping. The fifth experiment, (e), was performed by training the 2D network with full slices instead of patches and the 3D network with larger patches (64x64x64 voxels), with all augmentation techniques mentioned before.

The resulting trained networks were used to segment the blood vessels in the pre-processed test set of MRAs. Voxels with a probability larger than 0.7 were assumed to be inside a vessel.

Post-processing was performed using connected component analysis, in which regions with fewer than 200 voxels were eliminated from the segmentation.
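The thresholding and connected-component post-processing described above can be sketched as follows. This is an illustrative sketch, not the code used in the study: the function name `postprocess` is hypothetical, and the face-connected neighbourhood (the default of `scipy.ndimage.label`) is an assumption, since the connectivity is not stated in the text.

```python
import numpy as np
from scipy import ndimage


def postprocess(probability, threshold=0.7, min_voxels=200):
    """Binarize a probability map and drop small connected components."""
    mask = probability > threshold

    # Label connected components (face connectivity is an assumption).
    labeled, n_components = ndimage.label(mask)
    if n_components == 0:
        return mask

    # Component sizes; index 0 is the background label.
    sizes = np.bincount(labeled.ravel())
    keep = sizes >= min_voxels
    keep[0] = False

    # Map the per-component decision back onto the voxel grid.
    return keep[labeled]
```

Components with fewer than `min_voxels` voxels are removed, which matches the 200-voxel rule above; choosing a 26-connected structuring element instead would merge diagonally touching vessel fragments before the size filter is applied.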
To evaluate and compare the performance of the different experiments, the Dice Similarity Coefficient (DSC) [14, 15], Modified Hausdorff Distance (MHD) [14, 15] and Volumetric Similarity (VS) between the predicted segmentation and the generated ground truth segmentation were determined for each MRA.

The DSC was used to evaluate the overlap between the ground truth and the predicted segmentation. However, the DSC is limited for the evaluation of vessel segmentations, as vessels are narrow and elongated. For this reason, small segmentation errors can quickly lead to a loss of overlap. Therefore, a distance metric was also used for evaluation. A commonly used distance metric is the Hausdorff Distance (HD). However, this measure is very sensitive to outliers, which are common in medical segmentations. For this reason, the Modified Hausdorff Distance (MHD) was used, which is not based on the maximum distance between points but on a defined percentile (95%) of the distance between boundary points [14, 15].
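As an illustration of the two metrics discussed so far, the sketch below computes the DSC and a 95th-percentile Hausdorff distance on binary masks. It is a minimal sketch under stated assumptions: the function names are hypothetical, the boundary is taken as the mask minus its erosion, both masks are assumed non-empty for the distance, and the distance is taken as the larger of the two directed 95th percentiles, which is one common convention and may differ from the evaluation tool used in the study.

```python
import numpy as np
from scipy import ndimage


def dice(pred, truth):
    """Dice Similarity Coefficient: 2 |A and B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def surface(mask):
    """Boundary voxels: the mask minus its binary erosion."""
    return mask & ~ndimage.binary_erosion(mask)


def hausdorff_95(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Larger of the two directed 95th-percentile boundary distances."""
    s_pred = surface(pred.astype(bool))
    s_truth = surface(truth.astype(bool))

    # Distance from every voxel to the nearest boundary voxel of each mask.
    to_truth = ndimage.distance_transform_edt(~s_truth, sampling=spacing)
    to_pred = ndimage.distance_transform_edt(~s_pred, sampling=spacing)

    d_pred_to_truth = to_truth[s_pred]
    d_truth_to_pred = to_pred[s_truth]
    return max(np.percentile(d_pred_to_truth, 95),
               np.percentile(d_truth_to_pred, 95))
```

Passing the voxel spacing in millimetres through `spacing` yields a distance in mm, which matters here because the in-plane voxel size and slice thickness differ across scans.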
Finally, the VS was used to compare the segmented volumes without taking into account the location or overlap of the segmentations.

A Wilcoxon signed-rank test was performed to compare the results achieved by the different experiments. This test was performed with the goal of determining whether there is a difference between the evaluation metrics of the experiments. Python version 3.7.6 with the SciPy library was used to perform this test.
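The volumetric similarity and the paired comparison between experiments can be sketched as follows. The VS formula used here, 1 − |V_pred − V_truth| / (V_pred + V_truth), is the standard definition and is assumed to match the one used in the study. The per-image DSC values are synthetic and purely hypothetical; they only show how `scipy.stats.wilcoxon` is called on paired samples and do not reproduce any result of the paper.

```python
import numpy as np
from scipy.stats import wilcoxon


def volumetric_similarity(pred, truth):
    """VS = 1 - |V_pred - V_truth| / (V_pred + V_truth), ignoring overlap."""
    v_pred = np.count_nonzero(pred)
    v_truth = np.count_nonzero(truth)
    total = v_pred + v_truth
    if total == 0:
        return 1.0
    return 1.0 - abs(v_pred - v_truth) / total


# Hypothetical per-image DSC scores for two experiments on the same 26 test
# images (synthetic numbers, for illustration only).
rng = np.random.default_rng(0)
dsc_experiment_a = rng.uniform(0.65, 0.80, size=26)
dsc_experiment_d = dsc_experiment_a + rng.uniform(0.01, 0.06, size=26)

# Paired, two-sided Wilcoxon signed-rank test on the per-image differences.
statistic, p_value = wilcoxon(dsc_experiment_a, dsc_experiment_d)
print(f"W = {statistic:.1f}, p = {p_value:.2e}")
```

The test is paired because each experiment is evaluated on the same test images, so the per-image differences, rather than the two groups of scores, are what is ranked.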
3. RESULTS
Tables 1a and 1b report the mean performance metrics of the experiments for the 2D and 3D U-Net, respectively. The segmentation performance of all trained networks in both 2D and 3D was good, with all mean DSC scores larger than 0.70.

Table 1: Segmentation metrics for the test set for the proposed augmentation techniques and the use of patches or slices for the training of the U-Net. Values are provided as the mean ± the standard deviation. The size in voxels of the patches used for the different experiments is indicated between brackets. (a) 2D U-Net, (b) 3D U-Net.

(a) 2D U-Net. Columns: experiment, input, augmentation, DSC, MHD [mm], VS. Experiment (a), patches, no augmentation: DSC 0.74 ± …; the remaining table entries are not recoverable.

(b) 3D U-Net. Columns: experiment, input, augmentation, DSC, MHD [mm], VS. Experiment (a), patches, no augmentation: DSC 0.72 ± …; the remaining table entries are not recoverable.

Figure 4: Example segmentation for one slice in the transverse plane of a TOF-MRA. (a) Ground truth segmentation. (b) Automatic segmentation resulting from the 3D U-Net trained on patches of size 64x64x64 voxels with Gaussian blur, rotation and flipping. Arrow 1 indicates a small oversegmentation in the automatic segmentation.
Figure 5: Example segmentation for one slice in the transverse plane of a TOF-MRA. (a) Ground truth segmentation. (b) Automatic segmentation resulting from the 3D U-Net trained on patches of size 64x64x64 voxels with Gaussian blur, rotation and flipping. Arrow 2 indicates an undersegmentation in the automatic segmentation.

The optimum method for cerebrovascular segmentation was found to be the 3D U-Net trained on patches of size 64x64x64 voxels with all augmentation procedures, which resulted in a DSC of 0.83, an MHD of 29.9 mm and a VS of 0.86.
4. DISCUSSION
Comparing the performance of the proposed deep learning experiments for vessel segmentation yielded some interesting results. This study showed that automatic cerebrovascular segmentation can be accurately performed using a CNN with U-Net architecture. The performance of the U-Net can be improved by augmenting the training data. The optimum network for vessel segmentation was determined to be the 3D U-Net trained on patches of size 64x64x64 voxels and augmented by Gaussian blur, rotation and flipping.

As described in section 3, all experiments performed with the proposed CNN with U-Net architecture resulted in good DSC scores ranging from 0.72 to 0.83. In general, this overlap measure was higher than the DSC of 0.74 reported by Chen et al. (2017), who used a 3D convolutional autoencoder for vessel segmentation. Another CNN for vessel segmentation in TOF-MRA was proposed by Phellan et al. (2017) and resulted in DSCs ranging from 0.764 to 0.786, depending on the number of images used for training. In contrast, the U-Net framework proposed by Livne et al. (2019) showed a higher overlap, with a mean DSC of 0.88. This could be caused by the larger patches used in that study: Livne et al. (2019) found an optimal patch size of 96x96 voxels. In our study, it was also found that the experiment with training on slices or larger patches gave the best results, as described in section 3.

In general, the 2D U-Net performed better than the 3D U-Net for cerebrovascular segmentation, except in the experiment with training on slices or larger patches (e). This may be caused by the complicated shape of the vessels in 3D, which makes it more difficult for the network to learn.

As described in section 2, we performed experiments with Gaussian blur, rotation and flipping. The Gaussian blur could help the network to learn more robust features by varying the contrast between the vessels and surrounding tissues.
Rotation and flipping overcome positional biases. The combination of those augmentation techniques results in more diverse training data and therefore in a better segmentation accuracy. This was also observed in section 3, where experiments (d) and (e) gave the best results for both the 2D and 3D U-Net. That section also described that both the 2D and 3D U-Net performances were improved by augmenting the training data compared to no augmentation, which underlines the importance of data augmentation to increase the diversity of the data without actually collecting new data.

Finally, for both the 2D and 3D U-Net, the best results were obtained by training on slices or larger patches (64x64x64 voxels) (experiment (e)). This was also reported by Livne et al. (2019) and may be due to the larger patches providing a better representation of the small vessels in the full brain MRA, thereby improving the learning process of the vessel locations in the brain.

The proposed vessel segmentation experiments have both advantages and limitations.

First, the computation time of the algorithm is important in clinical use. As described in section 1, this is one of the main reasons to provide an automatic vessel segmentation method. The trained U-Net can provide the vessel segmentations in the order of seconds per image.

Second, as described in section 2, the same MRAs were used for both the 2D and 3D experiments for vessel segmentation. In addition, the patches used for training the 2D network were of size 64x64 voxels and those for the 3D network of size 16x16x16 voxels, in order to train on the same number of voxels per patch in 2D and 3D. Those factors make it easier to compare the experiments performed in this study.

One main limitation of deep learning is the dependency on training data. The training data should represent the unlabelled test data well enough to provide good results. In this study, the dataset consisted of patients with unruptured aneurysms.
To obtain a more representative dataset for vessel segmentation, healthy patients and patients with other pathologies, such as vessels containing stenoses, occlusions or infarcts, could be included.

Another limitation of the vessel segmentation is the lack of a manually labeled vessel imaging dataset. One main advantage of the proposed vessel segmentation method was that an interactive vessel segmentation method (described in section 2.2) was used for generating the ground truth labels. Manual annotations are labour and time intensive, and this study showed that it is possible to produce a robust vessel segmentation without them. However, some small vessels were missed by the interactive ground truth generation technique. This was the main cause of the relatively high MHD results described in section 3. Further investigation into optimising the ground truth segmentation is warranted.

Finally, as described in section 3, the best vessel segmentation results were obtained by training a U-Net on slices in 2D or larger patches in 3D. However, the largest patches in 3D were of size 64x64x64 voxels, as the patch size was limited by memory constraints. This potentially reduced the performance of the 3D U-Net, where more context might be needed.
As described in section 2, the data used for the vessel segmentation was randomly split into a training, validation and testing set. Cross-validation could be performed to ensure the robustness and generalization of the trained network.

As this was a preliminary study, the test set used for the evaluation of the proposed vessel segmentation algorithm only contained 26 images, which is relatively small. Consequently, outlier images could have a large influence on the results. Future work could focus on using a larger dataset to evaluate the performance of the proposed segmentation method. For example, the full dataset provided by the Aneurysm Detection And segMentation (ADAM) challenge, containing 113 sets of brain MR images for training and 142 sets for testing, could be used.

Furthermore, future work could improve the ground truth used for the deep learning. With multiple medical experts, a systematic quantitative rating could be performed which includes the intra- and inter-rater variability and improves the ground truth segmentations.

In this study, only vessel segmentations generated by the U-Net architecture were evaluated. The U-Net architecture was chosen because of its prevalent and successful use in previous medical image segmentation problems. In future work, other network architectures for vessel segmentation could be investigated. However, due to the nature of this segmentation problem, no large improvements with respect to the U-Net performance are expected. In addition, Livne et al. (2019) compared the performance of the U-Net to the performance of a U-Net with half of the convolutional layers. This resulted in comparable segmentation results and reduced the training time. Further research could focus on evaluating the half U-Net, or a U-Net with fewer parameters, for vessel segmentation and comparing its performance to the original U-Net performance as described in our study.
5. CONCLUSION
In conclusion, our study found that a 3D U-Net trained on patches of size 64x64x64 voxels augmented using Gaussian blur, rotation and flipping performs optimally for vessel segmentation from TOF-MRAs.
REFERENCES
1. R. Phellan, A. Peixinho, A. Falcão, and N. Forkert, “Vascular segmentation in tof mra images of the brain using a deep convolutional neural network,” Lecture Notes in Computer Science, pp. 39–46, 2017.
2. M. Katan and A. Luft, “Global burden of stroke,” Seminars in Neurology (2), pp. 208–211, 2018.
3. A. Frangi, W. Niessen, K. Vincken, and M. Viergever, “Multiscale vessel enhancement filtering,” Lecture Notes in Computer Science, pp. 130–137, 1998.
4. R. Gan, W. Wong, and A. Chung, “Statistical cerebrovascular segmentation in three-dimensional rotational angiography based on maximum intensity projections,” Medical Physics (9), pp. 3017–3028, 2005.
5. M. Livne, J. Rieger, O. Aydin, A. Taha, E. Akay, T. Kossen, J. Sobesky, J. Kelleher, K. Hildebrand, D. Frey, and V. Madai, “A u-net deep learning framework for high performance vessel segmentation in patients with cerebrovascular disease,” Frontiers in Neuroscience (97), 2019.
6. K. Fang, D. Wang, L. Lui, S. Zhou, W. Chu, A. Ahuja, and P. Heng, “3d model-based method for vessel segmentation in tof-mra,” Proceedings of the 2011 International Conference on Machine Learning and Cybernetics, Guilin, pp. 1607–1611, 2011.
7. T. McInerney and D. Terzopoulos, “Medical image segmentation using topologically adaptable surface,” Lecture Notes in Computer Science, pp. 23–32, 1997.
8. T. Chen and D. Metaxas, “Gibbs prior models, marching cubes, and deformable models: A hybrid framework for 3d medical image segmentation,” Lecture Notes in Computer Science, pp. 703–710, 2003.
9. O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234–241, 2015.
10. N. Tustison, B. Avants, P. Cook, Y. Zheng, A. Egan, P. Yushkevich, and J. Gee, “N4itk: Improved n3 bias correction,” IEEE Transactions on Medical Imaging (6), pp. 1310–1320, 2010.
11. “Advanced normalization tools (ants).” http://stnava.github.io/ANTs/. Accessed: 2020-10-23.
12. B. Ellingson, T. Zaw, T. Cloughesy, K. Naeini, S. Lalezari, S. Mong, A. Lai, P. Nghiemphu, and W. Pope, “Comparison between intensity normalization techniques for dynamic susceptibility contrast (dsc)-mri estimates of cerebral blood volume (cbv) in human gliomas,” Journal of Magnetic Resonance Imaging (JMRI) (6), pp. 1472–1477, 2012.
13. F. Ritter et al., “Medical image analysis,” IEEE Pulse (6), pp. 60–70, 2011.
14. K. Toennies, Guide to Medical Image Analysis - Advances in Computer Vision and Pattern Recognition, Springer, 2012. pp. 418.
15. A. Taha and A. Hanbury, “Metrics for evaluating 3d medical image segmentation: analysis, selection, and tool,” BMC Medical Imaging (29), 2015.
16. E. Whitley and J. Ball, “Statistics review 6: Nonparametric methods,” Critical Care (6), pp. 509–513, 2002.
17. L. Chen, Y. Xie, J. Sun, N. Balu, M. Mossa-Basha, K. Pimentel, T. Hatsukami, J. Hwang, and C. Yuan, “3d intracranial artery segmentation using a convolutional autoencoder,” IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 704–707, 2017.
18. “Aneurysm detection and segmentation (adam) challenge.” http://adam.isi.uu.nl