Automatic Tissue Segmentation with Deep Learning in Patients with Congenital or Acquired Distortion of Brain Anatomy
Gabriele Amorosino, Denis Peruzzo, Pietro Astolfi, Daniela Redaelli, Paolo Avesani, Filippo Arrigoni, Emanuele Olivetti
NeuroInformatics Laboratory (NILab), Bruno Kessler Foundation, Trento, Italy
Center for Mind and Brain Sciences (CIMeC), University of Trento, Italy
Neuroimaging Lab, Scientific Institute IRCCS Eugenio Medea, Bosisio Parini (Lecco), Italy
PAVIS, Italian Institute of Technology (IIT), Genova, Italy
[email protected] [email protected]
Abstract.
Brains with complex distortion of cerebral anatomy present several challenges to automatic tissue segmentation methods of T1-weighted MR images. First, the very high variability in the morphology of the tissues can be incompatible with the prior knowledge embedded within the algorithms. Second, the availability of MR images of distorted brains is very scarce, so the methods in the literature have not addressed such cases so far. In this work, we present the first evaluation of state-of-the-art automatic tissue segmentation pipelines on T1-weighted images of brains with different severity of congenital or acquired brain distortion. We compare traditional pipelines and a deep learning model, i.e. a 3D U-Net trained on normal-appearing brains. Unsurprisingly, traditional pipelines completely fail to segment the tissues with strong anatomical distortion. Surprisingly, the 3D U-Net provides useful segmentations that can be a valuable starting point for manual refinement by experts/neuroradiologists.
1 Introduction

Accurate segmentation of brain structural MR images into different tissues, like white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF), is of primary interest for clinical and neuroscientific applications, such as volume quantification, cortical thickness analysis and bundle analysis.

Since manual segmentation of the brain tissues is extremely time consuming, it is usually performed by means of well-established automated tools, such as FSL [7], SPM [1], FreeSurfer [5] and ANTs [12]. Typically, these tools obtain excellent quality of segmentation in normal-appearing brains. More recently, brain tissue segmentation has been addressed by deep learning algorithms like convolutional neural networks (CNNs) [9,10,4,13], applied directly to T1- or T2-weighted MR images. The quality of segmentation obtained by such methods is again excellent, and the computational cost, once trained, is usually greatly reduced with respect to traditional pipelines.

Especially in children, many congenital (e.g. malformations, huge arachnoid cysts) or acquired (e.g. severe hydrocephalus, encephalomalacia due to perinatal injuries) conditions can cause complex modifications of cerebral anatomy that alter the structural and spatial relationship among different brain structures. Automatically segmenting such brains presents multiple challenges, mainly due to the high variability of the morphology together with the scarce availability of data. Moreover, the prior knowledge encoded in automated pipelines, or the set of images used to train segmentation algorithms, does not cover such cases.

In this work, for the first time, we present results of different well-established brain tissue segmentation pipelines on T1 images of malformed and highly distorted brains in the pediatric age.
Unsurprisingly, we observe that the quality of segmentation is highly variable with traditional pipelines, and it fails when the complexity of the brain distortion and the severity of the malformation are high. Moreover, as a major contribution, we show the results of a CNN for segmentation of medical images, namely the 3D U-Net [3,11], trained on over 800 pediatric subjects with normal brain. Surprisingly, the 3D U-Net segments brains with moderate and severe distortion of brain anatomy either accurately or at least to a sufficient level to consider manual refinement by expert radiologists.

The evaluation study of the U-Net presented here is first conducted on a large sample of normal brain images, as a sanity check, to quantitatively assess that the specific implementation and training procedure reaches state-of-the-art quality of segmentation. The evaluation on distorted brains is instead qualitative, because of the small sample available due to the rarity of the condition, as well as the current lack of gold standard segmentation. If confirmed by future and more extensive studies, this result opens the way to CNN-based tissue segmentation methods in applications, even in the case of malformed and highly distorted brains. From the methodological point of view, CNNs show a much higher degree of flexibility in tissue segmentation than well-established pipelines, despite being trained on segmentations obtained from those same pipelines.

2 Datasets

We assembled a dataset of over 900 MR images from subjects and patients in the pediatric age, divided into two parts: normal brains and distorted brains. The first part consists of 570 T1-w images from public databases (C-MIND: https://research.cchmc.org/c-mind, NIH contract; NIMH: http://pediatricmri.nih.gov) and 334 T1-w images acquired in-house by the authors during clinical activity at IRCCS Eugenio Medea (Italy). The second part comprises 21 patients, again acquired at IRCCS E.Medea.

2.1 Normal Brains

– Public Databases.
• T1-w images from 207 healthy subjects of the C-MIND database (165 from the CCHMC site, with average age 8.9 (SD=5.0), and 42 from the UCLA site, with average age 7.6 (SD=3.8)), both with 3D MPRAGE 3T MRI scan sequence (TR=8000 ms, TE=3.7 ms, Flip angle=8°, FOV=256x224x160, voxel spacing=1x1x1 mm³).
• T1-w images from 363 healthy subjects (average age 10.7 (SD=6.0)) from the Pediatric MRI database of the NIMH Data Archive [6], 1.5T MRI scanner with two different sequences:
  ∗ 284 subjects acquired with a 3D T1-w RF-spoiled gradient echo sequence (TR=22−25 ms, TE=10−11 ms, Excitation pulse=30°, Refocusing pulse=180°, FOV=160−).
  ∗ 79 subjects acquired with a T1-w spin echo sequence (TR=500 ms, TE=12 ms, Flip angle=90°, Refocusing pulse=180°, FOV=192x256x66, voxel spacing=1x1x3 mm³).
– In-House Database: IRCCS E.Medea. T1-w images from 334 subjects with normal brain, acquired in-house, with average age 10.6 (SD=5.2) and 3D T1-w MPRAGE 3T MRI scan sequence (TR=8000 ms, TE=4 ms, Flip angle=8°, FOV=256x256x170, voxel spacing=1x1x1 mm³).

2.2 Distorted Brains

Images from two groups of patients acquired in-house at IRCCS E.Medea with the same MR scanner and scan sequence described above. In detail:

– Agenesis of Corpus Callosum.
12 patients with agenesis of corpus callosum (ACC) (average age 5.8 (SD=5.2)). Callosal agenesis is characterized by colpocephaly, parallel ventricles, presence of Probst bundles and upward extension of the third ventricle. See some cases in Figure 1.
– Complex distortions. The remaining 9 patients, with complex distortions of cerebral anatomy. See some cases in Figure 2.
2.3 Preprocessing and Reference Segmentation

All T1-w images received bias field correction (N4BiasFieldCorrection) and AC-PC alignment. The reference segmentation of normal brains was performed with the AntsCorticalThickness.sh script of ANTs [12] with the PTBP (pediatric) prior [2], resulting in a 3D mask with 7 labels (6 tissues plus background): background, cerebrospinal fluid (CSF), gray matter (GM), white matter (WM), deep gray matter (DGM), trunk and cerebellum. The results of the segmentations were visually assessed by two experts in pediatric neuroimaging.
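As an illustration, 7-label reference masks of this kind can be converted into one-hot training targets for a voxel-wise classifier. The sketch below is a minimal numpy example; the integer label ordering (0=background, …, 6=cerebellum) is an assumption consistent with the list above, not a documented convention of ANTs.

```python
import numpy as np

# Assumed label convention (hypothetical ordering, for illustration only):
# 0=background, 1=CSF, 2=GM, 3=WM, 4=DGM, 5=trunk, 6=cerebellum
N_LABELS = 7

def mask_to_onehot(mask):
    """Convert an integer 3D label mask into a one-hot (x, y, z, 7) target."""
    return np.eye(N_LABELS, dtype=np.float32)[mask]

# Tiny toy volume standing in for a real 256x256x256 image
toy = np.zeros((4, 4, 4), dtype=np.int64)
toy[1:3, 1:3, 1:3] = 3  # a small cube labeled as "WM"
onehot = mask_to_onehot(toy)
print(onehot.shape)               # (4, 4, 4, 7)
print(int(onehot[..., 3].sum()))  # 8 voxels labeled WM
```

The `np.eye(...)[mask]` advanced-indexing idiom maps each integer label to its one-hot row in a single vectorized step.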
3 Methods
A standard whole-brain 3D U-Net model [3,11] was implemented to predict the brain masks of 6 tissues plus background from T1-w images. The architecture of the network consists of 3 main parts: the contraction path, the bottleneck and the expansion path. The input and the output layers are for 256x256x256 isotropic 3D images, the resolution to which all input images are initially resampled. The adopted architecture, described below, has minor changes with respect to the literature to reduce the memory footprint and to fit the GPU used during the experiments reported in Section 4.

The contraction path consists of 4 blocks, each with two convolutional layers (3x3x3 kernel followed by ReLU) followed by MaxPooling (2x2x2 filter, stride 2x2x2) for downsampling. The size of the downsampling over the blocks is 256 → 128 → 64 → 32 → 16, while the application of convolutional layers produces an increasing number of feature maps over the blocks, starting from 12 in the first block.

We qualitatively compared the segmentation pipelines of FSL, FreeSurfer, SPM, ANTs and the 3D U-Net on T1-weighted images from patients with agenesis of corpus callosum (ACC) and with complex distortions of brain anatomy, as described in Section 2.2. Different pipelines produced different sets of segmented tissues. To harmonize the results, we considered only 3 main tissues: gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF). These tissues are the ones of greater interest in most of the applications of brain tissue segmentation. Moreover, for the pipelines segmenting the deep gray matter (DGM), we labeled DGM as GM. The technical details of each pipeline are the following:

– FSL v6 [7]: we used fsl_anat (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/fsl_anat) with default parameter values.
– FreeSurfer v6 [5]: we used recon-all (https://surfer.nmr.mgh.harvard.edu/fswiki/recon-all) with default parameter values.
– SPM12 [1]: we used default parameter values and the default prior.
– ANTs v2.2.0 [12]: we used AntsCorticalThickness.sh with default parameter values and the PTBP (pediatric) prior [2]. We considered only the steps up to tissue segmentation.
– 3D U-Net.
The model was implemented with TensorFlow 1.8.0 [8] and trained on T1-weighted images from the normal brains datasets of ≈900 images described in Section 2. We kept apart 90 randomly selected images for a sanity check, i.e., to assess that the trained model could reach state-of-the-art quality of segmentation on healthy subjects. The training (loss: cross-entropy, Adam optimization, learning rate 10−) was performed iteratively, one image at a time taken at random from the training set, looping over the whole set for 60 epochs.

All computations were performed on a dedicated workstation: 6-core Intel(R) Xeon(R) CPU E5-1650 v4 3.60GHz processor, 128 GB RAM, GPU: NVIDIA GeForce GTX 1080 Ti with 11 GB RAM.

4 Results

Except for FreeSurfer (recon-all), all pipelines carried out all the required segmentations. FreeSurfer failed in all cases of severe anatomical distortion. Specifically, recon-all did not converge during either Talairach registration or skull stripping. SPM completed the segmentation task in all subjects. However, its reliance on the prior for the segmentation initialization and optimization leads to many major macroscopic errors. Given the strong limits on the length of this article, we do not illustrate and discuss the uninteresting results of FreeSurfer and SPM.

In Figure 1, we show a paradigmatic set of axial slices segmented by FSL, ANTs and the 3D U-Net, from patients with ACC, i.e. from 4 of the 12 subjects described in Section 2.2. Similarly, in Figure 2, we report paradigmatic axial slices segmented by those methods, from 4 of the subjects with complex cerebral distortions. For the segmentations of all methods, we re-labeled as CSF all the voxels inside the brain mask of each patient that were incorrectly segmented as background. In both figures, in the last row, for each subject, we highlight a detail of the slice for one of the segmentation methods (indicated with a dashed square, above in the same column). Such details are discussed in Section 5. Finally, in Table 1, we report the results of the sanity check, i.e. that the training process of the 3D U-Net on normal brains was successful.
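The training scheme described in Section 3 (cross-entropy loss, one randomly drawn image per step, 60 passes over the training set) can be sketched framework-agnostically. In the sketch below, the tiny linear classifier, the toy data and all dimensions are illustrative stand-ins for the actual 3D U-Net and its volumes, and plain SGD replaces the paper's Adam optimizer for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy stand-in for the dataset: 20 "images" of 10 voxels each, 3 features
# per voxel, 7 tissue classes; labels follow a hidden linear rule so that
# training has something to learn.
n_images, n_vox, n_feat, n_cls = 20, 10, 3, 7
X = rng.normal(size=(n_images, n_vox, n_feat))
y = (X @ rng.normal(size=(n_feat, n_cls))).argmax(axis=-1)

W = np.zeros((n_feat, n_cls))  # stand-in "model" (the paper uses a 3D U-Net)
lr, n_epochs = 0.1, 60

for epoch in range(n_epochs):
    for i in rng.permutation(n_images):        # one image at a time, random order
        p = softmax(X[i] @ W)                  # per-voxel class probabilities
        grad = X[i].T @ (p - np.eye(n_cls)[y[i]]) / n_vox  # mean cross-entropy grad
        W -= lr * grad                         # plain SGD step (Adam in the paper)

probs = softmax(X.reshape(-1, n_feat) @ W)
acc = (probs.argmax(axis=-1) == y.ravel()).mean()
print(f"training accuracy after {n_epochs} epochs: {acc:.2f}")
```

The gradient `X.T @ (p - onehot)` is the exact gradient of the mean cross-entropy for a linear softmax model, mirroring what automatic differentiation computes for the real network.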
The numbers represent the average quality of segmentation (DSC, Dice similarity coefficient) obtained by the 3D U-Net for the reference segmentation of the 6 tissues of normal brains described in Section 2.3. These results are comparable to those in the state of the art [4,13].
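For reference, the DSC between a predicted binary mask A and a reference mask B is DSC = 2|A∩B| / (|A| + |B|); a minimal numpy sketch:

```python
import numpy as np

def dice(pred, ref):
    """Dice similarity coefficient between two binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    inter = np.logical_and(pred, ref).sum()
    denom = pred.sum() + ref.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

a = np.zeros((8, 8, 8), dtype=bool); a[2:6, 2:6, 2:6] = True  # 64-voxel cube
b = np.zeros((8, 8, 8), dtype=bool); b[3:7, 2:6, 2:6] = True  # same cube, shifted
print(dice(a, b))  # overlap of 48 voxels: 2*48/(64+64) = 0.75
```

In the multi-tissue setting, the coefficient is computed once per label (comparing the binary mask of each tissue against its reference) and then averaged, as in Table 1.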
Table 1. 3D U-Net: average Dice similarity coefficient (DSC) for segmentations of 6 tissues, plus grand average, on 90 normal brains, after 60 epochs of training.

Metric  CSF     GM   WM   DGM   Trunk   Cereb.   Grand Avg.
DSC     0.87 ±  ±    ±    ±     ±       ±        ±
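The tissue harmonization used for the qualitative comparison in Figures 1 and 2 (DGM re-labeled as GM; in-mask background re-labeled as CSF) can be sketched as below. The integer label codes are hypothetical, since each pipeline uses its own output convention.

```python
import numpy as np

# Hypothetical label codes for one pipeline's output (illustration only):
BG, CSF, GM, WM, DGM = 0, 1, 2, 3, 4

def harmonize(mask, brain=None):
    """Reduce a pipeline-specific label mask to the 3 compared tissues.

    DGM is counted as GM; if a binary brain mask is given, voxels inside it
    that were left as background are re-labeled as CSF.
    """
    out = np.zeros_like(mask)
    out[mask == CSF] = CSF
    out[(mask == GM) | (mask == DGM)] = GM  # DGM counted as GM
    out[mask == WM] = WM
    if brain is not None:
        out[(out == BG) & brain.astype(bool)] = CSF
    return out

toy = np.array([0, 1, 2, 3, 4, 0])
brain = np.array([0, 1, 1, 1, 1, 1])  # last voxel: background inside the brain
print(harmonize(toy, brain).tolist())  # [0, 1, 2, 3, 2, 1]
```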
Fig. 1. First row: T1-weighted MR images of 4 subjects (A, B, C and D) with agenesis of corpus callosum. Below, the related tissue segmentations (GM in green, WM in blue and CSF in red) of the following pipelines: FSL (2nd row), ANTs (3rd row) and 3D U-Net (4th row). In the 5th row, for each subject, we show the enlarged view of one of the segmentations, indicated above with a dashed yellow square. White arrows point to the highlights discussed in Section 5.
Fig. 2. First row: T1-weighted MR images of 4 subjects (E, F, G and H) with complex cerebral distortions. Below, the related tissue segmentations (GM in green, WM in blue and CSF in red) of the following pipelines: FSL (2nd row), ANTs (3rd row) and 3D U-Net (4th row). In the 5th row, for each subject, we show the enlarged view of one of the segmentations, indicated above with a dashed yellow square. White arrows point to the highlights discussed in Section 5.

5 Discussion
Figure 1 shows that FSL fails to segment GM and WM in cases of moderate and severe ventricular dilatation with thinning of the WM (cases A, C and, less evidently, D). In one case (B), FSL also misses identifying the thalami as (deep) GM. ANTs performs better in identifying (deep) GM and the cortex at the convexity. However, cases A and C show that it may fail to differentiate between GM and subcortical WM on the mesial surface of the hemispheres. This may be related to the prior used by that pipeline, which is based on the anatomy of normal subjects and not designed to recognize spatial reorganization of the cortex, especially in the midline, as in ACC cases. A similar error occurs in case C, where a cortical component close to the head of the caudate is misclassified as WM. Finally, in D, Probst bundles, which are abnormal WM tracts running parallel to the medial ventricular wall, are labelled as GM. In contrast, the 3D U-Net performs well in segmenting ACC. The most relevant error in these cases is at the interface between ventricles and WM: the 3D U-Net wrongly identifies a very thin layer of GM along the inner ventricular surface. This is probably related to partial volume effects.

Figure 2 shows that, in case of complex malformations and severe parenchymal distortion, FSL and ANTs are unreliable and incur major macroscopic errors, as opposed to the 3D U-Net, which performs vastly better. In cases of severe ventricular dilatation and distortion (E and H), FSL fails to segment the cortex, which is wrongly labelled as CSF. In case F, the CSF collection that replaces the left hemisphere is misclassified as cortex. In G, FSL fails to properly segment the pachygyric (i.e. with a reduced number of gyri) cortex. Finally, some intensity inhomogeneities in the deep ventricular CSF are misclassified as GM (E and G). With ANTs, which is based on priors, the pipeline is forced to segment WM, GM and CSF in the missing hemisphere (F) or when the anatomy is highly irregular.
In all of these cases, the pipeline misplaces structures (E, G, H) or segments structures that are actually missing (F). The 3D U-Net outperforms FSL and ANTs in all cases (e.g. G), with few mistakes. The main issues are: i) the mislabeling of signal inhomogeneities in the deep CSF (E, same as FSL), and ii) the segmentation of a subtle layer of GM at the border between lateral ventricles and WM (H, as in the ACC cases). Care must be taken in evaluating the CSF/WM interface in cases of brain malformations because, at this level, heterotopic GM nodules may occur.
6 Conclusions

In this work, we observe a much higher accuracy of the 3D U-Net when segmenting brains with different degrees of anatomical distortion, compared to well-established pipelines. This is surprising, given that such cases were not used in the training phase. At the same time, the 3D U-Net can reproduce the high quality of segmentation of ANTs on normal brains. Clearly, the results on distorted brains are not perfect, but they are still a valuable starting point for manual refinement by experts/neuroradiologists. In future work we plan to manually segment T1-w images of the distorted brains, to create a gold standard, to be able to quantify the quality of segmentation of the 3D U-Net, and to use some of them during the training process of the network.
Acknowledgments

Data used in the preparation of this article were obtained from the C-MIND Data Repository created by the C-MIND study of Normal Brain Development. This is a multisite, longitudinal study of typically developing children, from newborn through young adulthood, conducted by Cincinnati Children's Hospital Medical Center and UCLA and supported by the National Institute of Child Health and Human Development (https://research.cchmc.org/c-mind). This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH.

References
1. Ashburner, J., Friston, K.J.: Voxel-based morphometry – the methods. NeuroImage 11(6 Pt 1), 805–821 (Jun 2000), http://dx.doi.org/10.1006/nimg.2000.0582
2. Avants, B., Tustison, N., Wang, D.J.: The Pediatric Template of Brain Perfusion (PTBP) (Feb 2015), https://figshare.com/articles/The_Pediatric_Template_of_Brain_Perfusion_PTBP_/923555
3. Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, vol. 9901, pp. 424–432. Springer International Publishing, Cham (2016), http://link.springer.com/10.1007/978-3-319-46723-8_49
4. Cullen, N.C., Avants, B.B.: Convolutional Neural Networks for Rapid and Simultaneous Brain Extraction and Tissue Segmentation. In: Spalletta, G., Piras, F., Gili, T. (eds.) Brain Morphometry, vol. 136, pp. 13–34. Springer New York, New York, NY (2018), http://link.springer.com/10.1007/978-1-4939-7647-8_2
5. Dale, A.M., Fischl, B., Sereno, M.I.: Cortical surface-based analysis. I. Segmentation and surface reconstruction. NeuroImage 9(2), 179–194 (Feb 1999)
6. Evans, A.C.: The NIH MRI study of normal brain development. NeuroImage 30(1), 184–202 (Mar 2006), https://linkinghub.elsevier.com/retrieve/pii/S105381190500710X
7. Jenkinson, M., Beckmann, C.F., Behrens, T.E.J., Woolrich, M.W., Smith, S.M.: FSL. NeuroImage 62(2), 782–790 (2012)
8. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., et al.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems (2015)
9. Moeskops, P., Viergever, M.A., Mendrik, A.M., de Vries, L.S., Benders, M.J.N.L., Išgum, I.: Automatic Segmentation of MR Brain Images With a Convolutional Neural Network. IEEE Transactions on Medical Imaging 35(5), 1252–1261 (May 2016)
10. Rajchl, M., Pawlowski, N., Rueckert, D., Matthews, P.M., Glocker, B.: NeuroNet: Fast and Robust Reproduction of Multiple Brain Image Segmentation Pipelines (Apr 2018), https://openreview.net/forum?id=Hks1TRisM
11. Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pp. 234–241. Lecture Notes in Computer Science, Springer International Publishing (2015)
12. Tustison, N.J., Cook, P.A., Klein, A., Song, G., Das, S.R., Duda, J.T., Kandel, B.M., van Strien, N., Stone, J.R., Gee, J.C., Avants, B.B.: Large-scale evaluation of ANTs and FreeSurfer cortical thickness measurements. NeuroImage 99, 166–179 (Oct 2014)
13. Yogananda, C.G.B., Wagner, B.C., Murugesan, G.K., Madhuranthakam, A., Maldjian, J.A.: A Deep Learning Pipeline for Automatic Skull Stripping and Brain Segmentation. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pp. 727–731 (Apr 2019), ISSN: 1945-7928
Supplementary Materials
This is an extension of Figure 1, i.e. paradigmatic segmentations of patients with agenesis of corpus callosum, where we show SPM and FreeSurfer, which were excluded from the manuscript, together with the 3D U-Net for comparison. The segmentations of SPM and FreeSurfer have major errors in the ventricles and in the area of the thalami and caudate. SPM probability maps were thresholded at p = 0.
This is an extension of Figure 2, i.e. paradigmatic segmentations of patients with complex cerebral distortions, where we show SPM and FreeSurfer, which were excluded from the manuscript, together with the 3D U-Net for comparison. SPM has major macroscopic errors almost everywhere. FreeSurfer failed to converge on the images of these subjects, so the related entries are empty, as reported in Section 4. SPM probability maps were thresholded at p = 0.