Brain Tumor Segmentation and Survival Prediction using Automatic Hard mining in 3D CNN Architecture
Vikas Kumar Anand, Sanjeev Grampurohit, Pranav Aurangabadkar, Avinash Kori, Mahendra Khened, Raghavendra S Bhat, Ganapathy Krishnamurthi
Indian Institute of Technology Madras, Chennai 600036, India
Intel Technology India Pvt. Ltd., India
[email protected]
Abstract.
We utilize 3-D fully convolutional neural networks (CNN) to segment gliomas and their constituents from multimodal Magnetic Resonance Images (MRI). The architecture uses dense connectivity patterns to reduce the number of weights, employs residual connections, and is initialized with weights obtained by training this model on the BraTS 2018 dataset. Hard mining is performed during training to focus on the difficult cases of the segmentation task, by raising the dice similarity coefficient (DSC) threshold used to choose hard cases as the number of epochs increases. On the BraTS 2020 validation data (n = 125), this architecture achieved a tumor core, whole tumor, and active tumor DSC of 0.744, 0.876, and 0.714, respectively. On the test dataset, the DSC of the tumor core and active tumor improved by approximately 7%. In terms of DSC, our network's performance on the BraTS 2020 test data is 0.775, 0.815, and 0.85 for enhancing tumor, tumor core, and whole tumor, respectively. Overall survival of a subject is predicted using conventional machine learning on radiomics features computed from the generated segmentation mask. Our approach achieved an accuracy of 0.448 on the validation dataset and 0.452 on the test dataset.
Keywords:
Gliomas · MRI · 3D CNN · Segmentation · Hard mining · Overall survival
Introduction

A brain tumor is an abnormal mass of tissue that can be malignant or benign. Based on risk, a malignant tumor can further be classified into two categories: High-Grade Glioma (HGG) and Low-Grade Glioma (LGG). MR imaging is the most commonly used imaging solution to determine tumor location, size, and morphology. Different MR imaging modalities enhance different components of a brain tumor. The Enhancing Tumor (ET) appears as a hyperintense region in the T1Gd image with respect to the T1-weighted image and to the T1Gd signal of healthy white matter. Resection is typically performed on the Tumor Core (TC) region. The necrotic region (NCR), the non-enhancing region (NET), and the ET constitute the TC. The NCR and NET appear as hypointense areas in T1Gd with respect to T1. The TC and the peritumoral edema (ED) constitute the Whole Tumor (WT), which describes the full extent of the disease. The WT appears as a hyperintense area in FLAIR. Delineation of the tumor and its components, also called segmentation of the tumor region, on several modalities is the first step towards diagnosis. Radiologists carry out this process in a clinical setup; it is time-consuming, and manual segmentation becomes cumbersome as the number of patients increases. Therefore, automated techniques are required to perform the segmentation task and reduce the radiologists' effort. The diffuse boundary of the tumor and the partial volume effect in MRI further increase the difficulty of segmenting the different tumor regions across MR imaging modalities.
In recent years, Deep Learning methods, especially Convolutional Neural Networks (CNN), have achieved state-of-the-art results in the segmentation of tumor components from different sequences of MR images [1,2]. Because medical images are volumetric in nature, with organs imaged as 3-D entities, we adopt a 3D CNN based architecture for the segmentation task. In this manuscript, we use a patch-based 3D encoder-decoder architecture to segment brain tumors from MR volumes. We also use a conditional random field and 3D connected component analysis for post-processing of the segmentation maps.
Related Work

The BraTS 2018 winner, Myronenko [3], proposed a 3D encoder-decoder architecture with a variational autoencoder acting as a regularizer for the large encoder. He used a non-cuboid patch of fairly large size to train the network with a batch size of 1. Instead of applying softmax over several classes or using separate networks per class, all three nested tumor sub-regions are produced as outputs after a sigmoid. An ensemble of the different networks gave the best result. Isensee et al. [4] used a basic U-Net [5] with minor modifications, securing second place in the BraTS 2018 challenge through careful training with data augmentation at training and test time. For training, a patch size of 128 was used with a batch size of 2; due to the small batch size, instance normalization was used. They found a performance increase from using Leaky ReLU instead of the ReLU activation function. The BraTS 2019 winner, Jiang et al. [6], used a two-stage cascaded U-Net architecture to segment brain tumors. Segmentation maps obtained in the first stage are fed to the second stage along with the first stage's inputs. They also used two decoders in the second stage to obtain two different segmentation maps, with a loss function incorporating all losses arising from these segmentations. Data augmentation during training and testing further improved performance. Zhao et al. [7] applied a bag of tricks for 3D brain tumor segmentation: data sampling and random patch-size training as data processing methods, semi-supervised learning, architecture development, and fusion of results as model devising methods, and warm-up learning and multi-task learning as optimization processes; this approach secured second place in the BraTS 2019 challenge.

This work utilizes a single 3D encoder-decoder architecture to segment the different components of a brain tumor.
We use a smaller patch size and hard mining to train our model. The smaller patch size gives us the leverage to deploy our model on a smaller GPU, and the hard mining step finds hard examples during training for weighting the loss function. We have not used additional training data; only the data provided by the challenge organizers are used.

Methods

A 3D fully convolutional neural network (3DFCNN) [8] is devised to segment brain tumors and their constituents (ET, NCR, NET, and ED) from multi-parametric MR volumes. The network performs a semantic segmentation task: each voxel fed to the network is assigned a class label by the model. The network has dense connectivity patterns that enhance the flow of information and gradients through the model, making a deep network tractable. The predictions produced by the model are smoothed using a conditional random field followed by class-wise 3D connected component analysis. These post-processing techniques help decrease the number of false positives in the final segmentation maps.
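The class-wise connected component step can be sketched as follows (a minimal sketch with NumPy and SciPy; the minimum component size is an illustrative assumption, not a value reported in this work):

```python
import numpy as np
from scipy import ndimage

def remove_small_components(mask, min_size=100):
    """Drop 3-D connected components smaller than `min_size` voxels.

    Suppresses isolated false-positive blobs in a binary class mask.
    """
    labeled, n = ndimage.label(mask)
    # voxel count of each component (labels run from 1 to n)
    sizes = ndimage.sum(mask, labeled, range(1, n + 1))
    keep_ids = [i + 1 for i, s in enumerate(sizes) if s >= min_size]
    return mask & np.isin(labeled, keep_ids)
```

Applied per class, this keeps only plausible tumor-sized components in the final segmentation map.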
Dataset

The BraTS 2020 challenge dataset [9,10,11,12,13] has been utilized to train the network architecture discussed in this manuscript. The training dataset comprises 396 subjects (320 HGG cases and 76 LGG cases). Each subject has 4 MR sequences, namely FLAIR, T2, T1, and T1Gd, and segmentation maps annotated by an expert on each sequence. Each volume is skull-stripped, rescaled to the same resolution (1 mm × 1 mm × 1 mm), and co-registered to a common anatomical template. The BraTS 2020 challenge organizers issued 125 cases and 166 cases to validate and test the algorithm, respectively. Features such as age, survival days, and resection status are provided separately for the training, validation, and testing phases for 237, 29, and 166 HGG scans, respectively.

Data Pre-processing
Each volume is normalized to have zero mean and unit standard deviation as part of pre-processing:

img = (img − mean(img)) / std(img)

where img denotes only the brain region of the volume, mean(img) is the mean of the volume, and std(img) is its standard deviation.
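A minimal NumPy sketch of this normalization, assuming a binary brain mask is available to restrict the statistics to brain voxels:

```python
import numpy as np

def normalize_volume(vol, brain_mask):
    """Zero-mean, unit-std normalization computed over the brain region only."""
    brain = vol[brain_mask > 0]
    mu, sigma = brain.mean(), brain.std()
    out = np.zeros_like(vol, dtype=np.float32)
    out[brain_mask > 0] = (vol[brain_mask > 0] - mu) / sigma
    return out
```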
Network Architecture: The fully convolutional network is used for the semantic segmentation task. The input to the network is a cube of size 64 × 64 × 64, and the network predicts the class of each voxel in the input cube. Each input to the network passes through two paths: an encoding path and a decoding path. The encoding part of the network comprises Dense blocks along with Transition Down blocks. A Dense block is a series of convolutional layers, each followed by ReLU [14], in which each convolutional layer receives as input the feature maps of all preceding convolutional layers. This connectivity pattern leads to an explosion in the number of feature maps with the network's depth. To overcome the explosion in parameters, the number of output feature maps per convolutional layer is set to a small value (k = 4). The spatial dimension of the feature maps is reduced by the Transition Down blocks. The decoding, or up-sampling, pathway of the network consists of Dense blocks and Transition Up blocks; transposed convolution layers are used to up-sample feature maps in the Transition Up blocks. In the decoding section, the Dense blocks take features from the encoding part concatenated with the up-sampled features as input. The network architecture for semantic segmentation is depicted in Figure 1.

Patch Extraction: Each patch is of size 64 × 64 × 64 and is extracted from the brain region. More patches are extracted from less frequent classes, such as necrosis, than from more frequent classes; this patch extraction scheme helps reduce the class imbalance between classes. The 3DFCNN accepts an input of size 64 × 64 × 64 and predicts the class of each input voxel. There are 77 layers in the network architecture. Effective reuse of the model's features is ensured by the dense connections among the various convolutional layers. These dense connections increase the number of computations, which is subdued by fixing the number of output feature maps per convolutional layer to 4.

Training:
The dataset is split into training, validation, and testing in the ratio 70:20:10 using stratified sampling based on tumor grade. The network is trained on 205 HGG volumes and 53 LGG volumes; the same ratio of HGG and LGG volumes is maintained during validation and testing of the network on held-out data. To further address class imbalance, the network parameters are trained by minimizing a weighted cross-entropy loss. The weight associated with each class is the ratio of the median of the class frequencies to the frequency of the class of interest [15]. The number of samples per batch is set to 4, while the learning rate is initialized to 0.0001 and decayed by a factor of 10% every time the validation loss plateaued.
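The median-frequency weighting from [15] can be sketched as follows (a NumPy sketch, assuming class labels 0..n_classes-1 in the annotation volume):

```python
import numpy as np

def median_frequency_weights(labels, n_classes):
    """Class weight = median(class frequencies) / frequency(class) [15].

    Rare classes receive weights > 1 and frequent classes weights < 1,
    so the weighted cross-entropy does not ignore small structures
    such as necrosis.
    """
    counts = np.array([(labels == c).sum() for c in range(n_classes)],
                      dtype=np.float64)
    freq = counts / counts.sum()
    return np.median(freq) / freq
```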
Hard Mining:
Our network performed poorly on hard examples. We resolved this issue by hard mining such cases and fine-tuning the trained network with these hard-mined cases [16]. We implement a threshold-based selection
Fig. 1: 3DFCNN used for segmentation of brain tumor and its constituents. TD: Transition Down block; C: Concatenation block; TU: Transition Up block
of hard examples. The threshold is defined on the DSC: if a subject's DSC is below the threshold, the subject is considered a hard example. We choose all such hard examples for a given threshold and fine-tune our model with these cases. We chose two threshold values to fine-tune our model.
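The threshold-based selection can be sketched as follows (a sketch; the container format for the cases is an assumption, and the two threshold values actually used are not listed in the text):

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def select_hard_examples(cases, threshold):
    """Subjects whose DSC falls below `threshold` are kept for fine-tuning.

    `cases` maps a subject id to a (predicted mask, ground-truth mask) pair.
    """
    return [sid for sid, (pred, gt) in cases.items()
            if dice(pred, gt) < threshold]
```

Raising the threshold over successive fine-tuning rounds admits progressively harder cases.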
Overall Survival Prediction: For training, we extract radiomic features using the ground truth. Five types of radiomic features are extracted for this purpose: first-order radiomic features, which comprise 19 features (mean, median, entropy, etc.); second-order texture features from the Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Dependence Matrix (GLDM), Gray Level Size Zone Matrix (GLSZM), and Neighboring Gray Tone Difference Matrix (NGTDM), altogether 75 different features; and 2D and 3D shape features, consisting of 26 features. We used pyradiomics [17] to extract all radiomic features. Using different combinations of segmentation maps, we extracted 1022 different features. We assigned an importance value to each feature using a forest of trees [18,19]. The 32 most important features out of 1022 are used to train a Random Forest Regressor (RFR). The pipeline for overall survival prediction is illustrated in Figure 2.

Fig. 2: Pipeline used for prediction of overall survival of a patient.
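The feature ranking and selection step can be sketched with scikit-learn [18,19] (a sketch; the forest type, tree count, and seed are assumptions — the text only states that a forest of trees assigns the importances and that the top 32 of 1022 features are kept):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.preprocessing import StandardScaler

def top_k_features(X, y, k=32, seed=0):
    """Standardize X, rank features with a forest of trees, keep the k best."""
    Xs = StandardScaler().fit_transform(X)
    forest = ExtraTreesRegressor(n_estimators=100, random_state=seed)
    forest.fit(Xs, y)
    order = np.argsort(forest.feature_importances_)[::-1]  # most important first
    return order[:k], Xs[:, order[:k]]
```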
Implementation Details: The algorithm is implemented in PyTorch [20]. The network is trained on an NVIDIA GeForce RTX 2080 Ti GPU with an Intel Xeon(R) CPU E5-2650 v3 @ 2.30GHz × 20 and 32 GB RAM. We have not used any additional dataset for training our network. We use 64 × 64 × 64 non-overlapping patches for training. Patches from the four parametric images, FLAIR, T1Gd, T2, and T1, are concatenated as the input to the network. During inference, we use overlapping patches of the same size, which reduces the edge effect in the segmentation maps.
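The overlapping tiling at inference can be sketched as follows (a sketch; the stride of 32 voxels is an assumption — the text only states that overlapping 64-cubes are used to reduce edge effects):

```python
def patch_starts(dim, patch=64, stride=32):
    """Start indices of overlapping patches that fully cover one axis."""
    starts = list(range(0, max(dim - patch, 0) + 1, stride))
    if starts[-1] != dim - patch:
        starts.append(dim - patch)  # shift the last patch to touch the edge
    return starts
```

Taking the Cartesian product of `patch_starts` over the three axes yields the tile origins; predictions in overlapping regions can then be averaged.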
Results

We report our network's performance on the validation and test datasets. The proportion of HGG and LGG in these datasets is not known to us. We uploaded our segmentation and OS prediction results for the validation and test datasets to the BraTS 2020 server. DSC, Sensitivity, Specificity, and Hausdorff Distance are used as metrics for evaluating network performance.
The model is trained using dice loss to generate segmentation maps. Figure 3 shows the different parametric images (FLAIR, T1Gd, T1, and T2) and the segmentation map obtained on a validation case.

Fig. 3: Segmentation results on validation data: green, yellow, and red regions represent edema, enhancing tumor, and necrosis, respectively.

On the BraTS validation data (n = 125), the performance of the network is listed in Table 1.

Table 1: Different metrics for all tumor components on the validation dataset
         DSC               Sensitivity       Specificity              Hausdorff Distance
         ET    WT    TC    ET    WT    TC    ET      WT     TC        ET      WT     TC
Mean     0.71  0.88  0.74  0.74  0.92  0.74  0.99    0.99   0.99      38.31   6.88   32.00
StdDev   0.31  0.13  0.29  0.33  0.14  0.31  0.0005  0.001  0.0005    105.32  12.67  90.55
Median   0.85  0.90  0.89  0.89  0.96  0.89  0.99    0.99   0.99      2.23    3.61   4.24
Performance of our network on the BraTS 2020 Test data:
Figure 4 depicts the different MR images and the segmentation maps generated by our network on the test dataset. Table 2 summarizes all performance metrics obtained on the test dataset (n = 166).
Fig. 4: Segmentation results on test data: green, yellow, and red regions represent edema, enhancing tumor, and necrosis, respectively.

Table 2: Different metrics for all tumor components on the test dataset
         DSC                   Sensitivity           Specificity                Hausdorff Distance
         ET     WT      TC     ET     WT     TC      ET      WT      TC         ET      WT     TC
Mean     0.776  0.8507  0.815  0.833  0.923  0.838   0.999   0.999   0.999      19.11   8.07   21.276
StdDev   0.223  0.164   0.255  0.249  0.153  0.257   0.0004  0.0016  0.0007     74.87   12.21  74.66
Median   0.836  0.902   0.907  0.926  0.968  0.940   0.999   0.999   0.999      2       4.12   3
Figure 2 shows the overall flowchart of the experiment used to predict survival days. The feature extractor module derives all radiomics features from all four MR sequences and the corresponding ground truth in several combinations. Before computing feature importances for this task, we standardized the training and validation feature matrices. Feature importance is obtained using a forest of trees, and an RFR is used as the regressor for this task. Table 3 lists the different metrics obtained during training and validation.
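The regression stage can be sketched with scikit-learn's RandomForestRegressor (a sketch; the forest size is an assumption, and the day thresholds for the short/mid/long survival classes follow the commonly used BraTS buckets rather than values stated in this text):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def fit_survival_regressor(X, y_days, seed=0):
    """Fit an RFR mapping the selected radiomic features to survival days."""
    rfr = RandomForestRegressor(n_estimators=200, random_state=seed)
    rfr.fit(X, y_days)
    return rfr

def survival_bucket(days):
    """Short (<300 days), mid (300-450 days), or long (>450 days) survivor."""
    if days < 300:
        return "short"
    return "mid" if days <= 450 else "long"
```

Accuracy is then computed on the bucketed predictions, while MSE, medianSE, stdSE, and SpearmanR are computed on the predicted day values directly.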
Prediction of OS using training and validation data: During the validation phase, clinical information for 29 cases is provided to predict the OS of each case. Table 3 contains all performance metrics for OS prediction.
Table 3: Different metrics obtained on training and validation data to evaluate the survival of a patient.
            Accuracy  MSE         medianSE   stdSE       SpearmanR
Training    0.59      44611.096   13704.672  109337.932  0.775
Validation  0.448     110677.443  22874.178  142423.687  0.169
Prediction of overall survival using test data: For the testing phase, clinical information is available for 166 cases. Table 4 summarizes the performance metrics of our algorithm for survival prediction.

Table 4: Different metrics obtained on test data to evaluate the survival of a patient.
       Accuracy  MSE         medianSE   stdSE        SpearmanR
Test   0.452     4122630758  67136.258  1142775.390  -0.014
Conclusion

This manuscript addresses 2 of the 3 problems posed by the challenge organizers. We have illustrated the use of a fully convolutional neural network for brain tumor segmentation. Overall survival is predicted from the generated segmentation maps using conventional machine learning algorithms. We use the single network from our previous participation in the BraTS 2018 challenge, described in [8]. This year, we introduced a hard mining step during training of the network. Our model achieved DSC of 0.71 and 0.77 on enhancing tumor, 0.88 and 0.85 on whole tumor, and 0.74 and 0.81 on tumor core for the validation and test data, respectively. We observed that the hard mining step improves the DSC for tumor core by 9%, for ET by 8%, and for whole tumor by 2%. The hard mining step makes it easier for the network to learn hard examples during training.
References
1. Konstantinos Kamnitsas, Christian Ledig, Virginia F. J. Newcombe, Joanna P. Simpson, Andrew D. Kane, David K. Menon, Daniel Rueckert, and Ben Glocker. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis, 36:61–78, 2017.
2. Sérgio Pereira, Adriano Pinto, Victor Alves, and Carlos A. Silva. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Transactions on Medical Imaging, 35(5):1240–1251, 2016.
3. Andriy Myronenko. 3D MRI brain tumor segmentation using autoencoder regularization. In International MICCAI Brainlesion Workshop, pages 311–320. Springer, 2018.
4. Fabian Isensee, Philipp Kickingereder, Wolfgang Wick, Martin Bendszus, and Klaus H. Maier-Hein. No New-Net. In International MICCAI Brainlesion Workshop, pages 234–244. Springer, 2018.
5. Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
6. Zeyu Jiang, Changxing Ding, Minfeng Liu, and Dacheng Tao. Two-stage cascaded U-Net: 1st place solution to BraTS challenge 2019 segmentation task. In International MICCAI Brainlesion Workshop, pages 231–241. Springer, 2019.
7. Yuan-Xing Zhao, Yan-Ming Zhang, and Cheng-Lin Liu. Bag of tricks for 3D MRI brain tumor segmentation. In International MICCAI Brainlesion Workshop, pages 210–220. Springer, 2019.
8. Avinash Kori, Mehul Soni, B. Pranjal, Mahendra Khened, Varghese Alex, and Ganapathy Krishnamurthi. Ensemble of fully convolutional neural network for brain tumor segmentation from magnetic resonance images. In International MICCAI Brainlesion Workshop, pages 485–496. Springer, 2018.
9. Spyridon Bakas, Hamed Akbari, Aristeidis Sotiras, Michel Bilello, Martin Rozycki, Justin Kirby, John Freymann, Keyvan Farahani, and Christos Davatzikos. Segmentation labels and radiomic features for the pre-operative scans of the TCGA-GBM collection. The Cancer Imaging Archive, 2017.
10. Spyridon Bakas, Hamed Akbari, Aristeidis Sotiras, Michel Bilello, Martin Rozycki, Justin Kirby, John Freymann, Keyvan Farahani, and Christos Davatzikos. Segmentation labels and radiomic features for the pre-operative scans of the TCGA-LGG collection. The Cancer Imaging Archive, 2017.
11. Spyridon Bakas, Hamed Akbari, Aristeidis Sotiras, Michel Bilello, Martin Rozycki, Justin S. Kirby, John B. Freymann, Keyvan Farahani, and Christos Davatzikos. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data, 4:170117, 2017.
12. Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. arXiv preprint arXiv:1811.02629, 2018.
13. Bjoern H. Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, et al. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Transactions on Medical Imaging, 34(10):1993–2024, 2014.
14. Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, 2010.
15. David Eigen and Rob Fergus. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In Proceedings of the IEEE International Conference on Computer Vision, pages 2650–2658, 2015.
16. Mahendra Khened, Avinash Kori, Haran Rajkumar, Balaji Srinivasan, and Ganapathy Krishnamurthi. A generalized deep learning framework for whole-slide image segmentation and analysis. arXiv preprint arXiv:2001.00258, 2020.
17. Joost J. M. van Griethuysen, Andriy Fedorov, Chintan Parmar, Ahmed Hosny, Nicole Aucoin, Vivek Narayan, Regina G. H. Beets-Tan, Jean-Christophe Fillion-Robin, Steve Pieper, and Hugo J. W. L. Aerts. Computational radiomics system to decode the radiographic phenotype. Cancer Research, 77(21):e104–e107, 2017.
18. Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake VanderPlas, Arnaud Joly, Brian Holt, and Gaël Varoquaux. API design for machine learning software: experiences from the scikit-learn project. In ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pages 108–122, 2013.
19. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
20. Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32, 2019.