Brain Tumor Survival Prediction using Radiomics Features

Abstract

Surgery planning in patients diagnosed with a brain tumor is dependent on their survival prognosis. A poor prognosis might demand a more aggressive treatment and therapy plan, while a favorable prognosis might enable a less risky surgery plan. Thus, accurate survival prognosis is an important step in treatment planning. Recently, deep learning approaches have been used extensively for brain tumor segmentation followed by the use of deep features for prognosis. However, radiomics-based studies have shown more promise using engineered/hand-crafted features. In this paper, we propose a three-step approach for multi-class survival prognosis. In the first stage, we extract image slices corresponding to tumor regions from multiple magnetic resonance image modalities. We then extract radiomic features from these 2D slices. Finally, we train machine learning classifiers to perform the classification. We evaluate our proposed approach on the publicly available BraTS 2019 data and achieve an accuracy of 76.5% and precision of 74.3% using the random forest classifier, which to the best of our knowledge are the highest reported results yet. Further, we identify the most important features that contribute to improving the prediction.

Sobia Yousaf, Syed Muhammad Anwar, Harish RaviPrakash, and Ulas Bagci

Department of Software Engineering, UET Taxila, Taxila, Pakistan
CRCV, University of Central Florida, Orlando FL, USA

Gliomas are the most common type of brain tumor, with over 78% of malignant tumors being gliomas [1]. However, not all gliomas are malignant, and they can be broadly classified into two groups: high-grade glioma (HGG) and low-grade glioma (LGG). According to the World Health Organization guidelines, four grades are defined for tumors [9]. Grade I and Grade II tumors are LGG, which are primarily benign and slow growing. Grades III and IV are HGG, which are malignant in nature with a high probability of recurrence. With Grade I tumors being mostly benign, patients tend to have a long-term survival rate. Patients with HGG, on the other hand, owing to the more aggressive nature of these tumors, have a much lower survival time, sometimes not exceeding a year. An early diagnosis of glioma would help the radiologist in assessing the patient's condition and planning a treatment accordingly. Magnetic resonance images (MRI) provide high contrast for soft tissue and hence represent the heterogeneity of the tumor core, providing detailed information about the tumor. Different modalities commonly used in radiology include T1-weighted, contrast enhanced T1-weighted, fluid-attenuated inversion recovery (FLAIR), and T2-weighted MRIs. A detailed profile of the enhancing tumor region can be described using contrast enhanced T1-weighted MRI as compared to T2-weighted MRI [16]. The hyper-intense regions in FLAIR images tend to correspond to regions of edema, thus suggesting the need for the use of multi-modal images [8]. A quantitative assessment of the brain tumor provides important information about the tumor structure and hence is considered a vital part of diagnosis [14]. Automatic tumor segmentation of pre-operative multi-modal MR images is attractive in this respect because it provides quantitative measurements of tumor parameters such as shape and volume.
This process is also considered a prerequisite for survival prediction, because significant features can only be computed from the tumor region. So, this quantitative assessment has significant importance in the diagnosis process and research. Due to imaging artifacts, ambiguous boundaries, and the irregular shape and appearance of the tumor and its sub-regions, the development of automatic tumor segmentation algorithms becomes challenging.

Over the past few years, several deep learning (DL) based approaches have been introduced, especially for medical image analysis [4,2,12,10]. DL models outperform traditional machine learning approaches on numerous applications in computer vision as well as medical image analysis, especially when a sufficiently large number of training samples is available [3]. Automatic segmentation alone is not sufficient for diagnosis and treatment, hence survival analysis is also necessary to help determine the treatment and therapy plans. Towards this, traditional machine learning based approaches have shown promising results using hand-crafted features [14]. These features, having been extracted from radiology images, are commonly referred to as radiomic features and help in the characterization of the tumor. Since 2017, the challenge of survival prognosis of glioma patients has been included in the BraTS benchmark. In the task of survival prediction, patients diagnosed with HGG are categorized into short-, mid- and long-term survival groups. The interval of these classes can be decided on the basis of number of days, months, or resection status. While DL algorithms have shown excellent performance on tumor segmentation tasks, they have shown unstable performance in the survival prediction task [15]. Radiomics is likely to be dominant for precision medicine because of its capability to exploit detailed information about the glioma phenotype [18].
Inspired by these achievements of radiomics features in the challenging task of survival prediction on several modalities, herein we propose to utilize radiomic features for prediction.

Our Contributions

In this paper we present a machine learning based approach, utilizing radiomic features, for survival prediction on BraTS 2019 data. We extract radiomic features from the tumor regions utilizing the provided ground-truth segmentation masks and train machine learning classifiers to predict the survival class. Our main contributions are:

– We identify the discriminating features that contributed the most to improving the accuracy, and find that Haralick features are more significant for the survival prediction task.
– We explore multiple classifiers commonly used in this domain, and find random forest to be the best performing model, with state-of-the-art performance when used with the selected radiomics features.

Our proposed approach towards survival prediction is shown in Fig. 1. It consists of three main steps: 1) region of interest (ROI) extraction, 2) radiomic features computation, and 3) survival prediction. The details of these steps are presented in the following sections.

[Figure 1: pipeline diagram. Input images (T1, T1c, FLAIR, T2); ROI extraction using ground truth; radiomics features (Haralick texture, LBP, shape, statistical intensity); survival prediction with k-NN, SVM, DT, RF, and DA classifiers into short survivor (0–600 days), mid survivor (600–1300 days), and long survivor (1300 days–alive).]

Fig. 1. The proposed radiomics features based survival prediction pipeline using BraTS 2019 data.

The data were acquired from multiple institutions using different scanners. Different scanners could exhibit different levels of noise, leading to intensity variations that can strongly influence the extracted radiomic features [5]. Hence, bias field correction and normalization steps were applied on the input data to standardize the intensity values. More precisely, the mean intensity of each image slice is subtracted from its intensity values, and the result is divided by the slice's intensity standard deviation. In order to extract the ROI, we applied the ground truth mask of the respective patient to all input modalities. As a result, we get the complete tumor region from all scans of the patient.
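The normalization and ROI masking steps described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `zscore_slice` and `apply_mask` are hypothetical, the population standard deviation is assumed (the text does not say which estimator was used), every non-zero mask label is treated as tumor, and bias field correction is not shown.

```python
from statistics import mean, pstdev


def zscore_slice(slice_2d):
    """Z-score normalize one 2D slice given as a list of rows of intensities."""
    values = [v for row in slice_2d for v in row]
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A constant slice carries no contrast; avoid dividing by zero.
        return [[0.0 for _ in row] for row in slice_2d]
    return [[(v - mu) / sigma for v in row] for row in slice_2d]


def apply_mask(slice_2d, mask_2d):
    """Keep intensities where the ground-truth mask is non-zero, zero elsewhere."""
    return [
        [v if m else 0.0 for v, m in zip(row, mask_row)]
        for row, mask_row in zip(slice_2d, mask_2d)
    ]


slice_2d = [[10.0, 20.0], [30.0, 40.0]]
mask_2d = [[0, 1], [1, 0]]  # toy ground-truth mask
normalized = zscore_slice(slice_2d)
roi = apply_mask(normalized, mask_2d)
```

In practice the same mask would be applied to the corresponding slice of each of the four modalities, so that every modality contributes the same tumor region.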

Radiomic features are a specific kind of feature, primarily computed from radiology images, that describe phenotypes of the tumor region. These features can be further used to predict the tumor and can improve the survival prediction. Herein, radiomic features extracted from input modalities are classified into three groups: first order statistics, shape features, and texture features. The details of these features are presented in the following text.

First-order statistics

These features represent statistical properties such as the average intensity value, median, variance, standard deviation, kurtosis, skewness, entropy, and energy. These features were computed using the intensity values in MR images, such that the gray-level intensity of the tumor region is described accurately. In particular, a total of 10 first order features were extracted from each slice of the four modalities used.
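As a rough illustration of the first-order statistics named above, the sketch below computes the eight listed quantities from a flat list of ROI intensities. It is an assumption-laden example, not the paper's code: the function name `first_order_features` is hypothetical, the histogram bin count used for entropy is arbitrary, and the text does not specify which two further features complete the set of 10.

```python
import math
from collections import Counter


def first_order_features(values, bins=8):
    """Compute basic first-order statistics of a list of ROI intensities."""
    n = len(values)
    ordered = sorted(values)
    mid = n // 2
    mean = sum(values) / n
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std > 0:
        # Standardized third and fourth central moments.
        skewness = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
        kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    else:
        skewness = kurtosis = 0.0
    energy = sum(v * v for v in values)
    # Shannon entropy of an equal-width intensity histogram (bin count assumed).
    lo, hi = ordered[0], ordered[-1]
    width = (hi - lo) / bins or 1.0
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "mean": mean, "median": median, "variance": variance, "std": std,
        "skewness": skewness, "kurtosis": kurtosis,
        "entropy": entropy, "energy": energy,
    }


features = first_order_features([1.0, 2.0, 3.0, 4.0])
```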

Shape Features

Shape features include perimeter, area, convex area, convex perimeter, concavity, diameter, major and minor axis length, circularity, elongation, and sphericity. Further, we described the tumor shape by using the Fourier descriptor, where the entire shape is represented using minimal numeric values [6].
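One common way to build a Fourier shape descriptor is to treat the ordered boundary points as complex numbers and keep a few normalized low-frequency Fourier coefficients. The sketch below follows that general idea under stated assumptions: the boundary is sampled counterclockwise, the names `fourier_descriptors`, `ellipse`, and `n_keep` are hypothetical, and the exact variant and number of coefficients used in the study are not given in the text.

```python
import cmath
import math


def fourier_descriptors(boundary, n_keep=2):
    """Return translation-, scale- and rotation-invariant contour descriptors.

    boundary: ordered (x, y) points of a closed contour, assumed counterclockwise.
    """
    z = [complex(x, y) for x, y in boundary]
    n = len(z)
    # Plain O(n^2) discrete Fourier transform of the complex boundary signal.
    coeffs = [
        sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]
    # coeffs[0] encodes position and is dropped; dividing by |coeffs[1]| removes
    # scale; taking magnitudes removes rotation and starting-point dependence.
    scale = abs(coeffs[1])
    low_positive = range(2, 2 + n_keep)
    low_negative = range(n - n_keep, n)
    return [abs(coeffs[k]) / scale for k in (*low_positive, *low_negative)]


def ellipse(a, b, cx, cy, n=32):
    """Sample an axis-aligned ellipse counterclockwise (toy contour for the demo)."""
    return [
        (cx + a * math.cos(2 * math.pi * t / n), cy + b * math.sin(2 * math.pi * t / n))
        for t in range(n)
    ]


small = fourier_descriptors(ellipse(3, 1, 5, 5))
large = fourier_descriptors(ellipse(6, 2, -4, 9))  # same shape, moved and scaled
```

The two ellipses have the same aspect ratio, so their descriptors agree even though position and size differ, which is the property that makes a short descriptor vector usable as a shape feature.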

Texture Features

Texture features are considered to be strong in the radiomics field [19]. We computed the Haralick texture features [7] and local binary patterns (LBP). In particular, Haralick features were computed from the gray level co-occurrence matrix (GLCM), which describes the spatial relationship among pixels. In LBP, by contrast, a binary encoded representation is used to describe the relationship between a pixel of interest and its neighbors [11]. A total of 14 Haralick features (shown in Table 1) and 55 LBP features were extracted from each slice. In Table 1, G represents the number of gray levels, i and j are the gray-level indices, and P(i, j) denotes the corresponding entry of the GLCM, while µ, σ, and σ² represent the mean, standard deviation, and variance, respectively.

The survival prediction task is an important but challenging task for BraTS data. One of the reasons could be that only age and MR images are provided, hence this prediction mainly relies on tumor identification within the MRI. To this end, we have used radiomics features for describing tumors within the region of interest. In particular, we computed 90 features (statistical, shape- and texture-based) per subject per slice. These features are computed for the complete tumor and fed to five different classifiers: discriminant analysis (DA), decision tree (DT), k-nearest neighbor (k-NN), support vector machine (SVM), and random forest (RF).

k-NN is a simple machine learning algorithm that stores all available data against the defined classes and categorizes incoming test samples on the basis of a distance function or similarity measure. In our experiment, we have used different values of k to evaluate the performance. DA is a statistical approach that finds patterns or feature combinations that separate two or more classes of data samples. The resultant combination of patterns can be used as a classifier to allocate samples to classes.
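Returning to the GLCM-based texture features described above, the following sketch shows how a co-occurrence matrix and a few Haralick-style features could be computed on a small quantized image. It is illustrative only: the function names `glcm` and `haralick_subset` are hypothetical, a single horizontal offset with symmetric counting is assumed, and the formulas follow the standard definitions from Haralick et al. [7], which may differ in detail from the settings used in the study.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized symmetric GLCM of a 2D list of ints in [0, levels)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def haralick_subset(p):
    """Contrast, energy, inverse difference moment and entropy of a GLCM."""
    g = len(p)
    cells = [(i, j, p[i][j]) for i in range(g) for j in range(g)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


image = [[0, 0, 1],
         [0, 1, 1]]  # toy image already quantized to two gray levels
p = glcm(image, levels=2)
texture = haralick_subset(p)
```

Real MR slices would first be quantized to a fixed number of gray levels G, and features are often averaged over several offsets; both choices are left open here because the text does not state them.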
SVM is a supervised machine learning model which maximizes the hyper-plane margin between different classes. The classifier maps the input space into a high-dimensional, linearly separable feature space. Because of the nonlinear problem space, we used the radial basis kernel. A DT starts by dividing the data into smaller segments while the tree is developed incrementally. The final tree contains two types of nodes, i.e., decision and leaf nodes. Here every decision node has more than one branch (i.e., low, mid and long survivor), while the leaf node represents the final decision. In particular, we used 10 splits for the DT model. RF is a well-known machine learning classifier that is considered to be among the best at resisting over-fitting in high-dimensional data. An RF model comprises several trees that take random decisions on given training samples. For survival prediction, we used an RF with 30 bags and observed that accuracy increased with the number of trees until it plateaued.

Table 1. Description of the Haralick texture features used in this study for survival prediction.

| Feature Name | Equation |
|---|---|
| Probabilities P(x), P(y) | $P_x(i) = \sum_{j=0}^{G-1} P(i,j), \quad P_y(j) = \sum_{i=0}^{G-1} P(i,j)$ |
| Variance | $Var = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i-\mu)^2 P(i,j)$ |
| Standard Deviation | $\sigma_x^2(i) = \sum_{i=0}^{G-1} (P_x(i) - \mu_x(i))^2, \quad \sigma_y^2(j) = \sum_{j=0}^{G-1} (P_y(j) - \mu_y(j))^2$ |
| Homogeneity | $H = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} [P(i,j)]^2$ |
| Contrast | $C = \sum_{n=0}^{G-1} n^2 \left\{ \sum_{i=1}^{G} \sum_{j=1}^{G} P(i,j) \right\}, \quad \lvert i-j \rvert = n$ |
| Correlation | $Corr = \dfrac{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i \cdot j) P(i,j) - \mu_x \mu_y}{\sigma_x \sigma_y}$ |
| Inverse Difference Moment | $IDM = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \dfrac{P(i,j)}{1 + (i-j)^2}$ |
| Entropy | $Ent = -\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} P(i,j) \log(P(i,j))$ |
| Average Sum | $S_A = \sum_{i=0}^{G-1} i \, P_{x+y}(i)$ |
| Entropy Difference | $D_{Ent} = -\sum_{i=0}^{G-1} P_{x-y}(i) \log(P_{x-y}(i))$ |
| Entropy Sum | $S_{Ent} = -\sum_{i=0}^{G-1} P_{x+y}(i) \log(P_{x+y}(i))$ |
| Inertia | $Inr = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i-j)^2 P(i,j)$ |

To evaluate the performance of our proposed method we used BraTS 2019 benchmark data provided by the Cancer Imaging Archive. The dataset comprises
independent training and validation sets. The training data contains 259 subjects diagnosed with HGG and 76 subjects diagnosed with LGG, along with ground truth annotations by experts. Moreover, the data comprises MRI images from 19 different institutions in four MRI modalities (T1-weighted, T2-weighted, T1-contrast enhanced and FLAIR). We selected the CBICA, BraTS 2013 and a single dataset from the TCIA archive, resulting in 166 subjects with HGG. The images are pre-processed via skull-stripping, co-registration to a common anatomical template and re-sampling to an isotropic resolution of 1 × 1 × 1 mm³. The data also includes the survival information in terms of number of days for each patient along with their age. In the BraTS 2019 data, the age range of the HGG cases is from 19 to 86 years and survival information ranges from 0 to 1767 days. It should be noted that for some patients the survival information was missing, and we treated those as having low survival.

Table 2. Performance evaluation of different classifiers for survival prediction using radiomics features. The bold values show the best results.

| Classifier | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.388 | 0.379 | 0.365 |
| DA | 0.471 | 0.409 | 0.399 |
| DT | 0.678 | 0.640 | 0.659 |
| SVM | 0.526 | 0.509 | 0.519 |
| RF | **0.765** | **0.743** | **0.736** |
| FCNN [17] | 0.515 | - | - |

We used the extracted radiomics features combined with clinical features to predict the survival class. All radiomics features were combined with patient age and hence a total of 30632 feature values were obtained from 166 HGG subjects to train five different conventional machine learning classifiers. These included a total of 90 radiomics features extracted per slice per subject, while no feature reduction technique was used. We chose five ML classifiers to evaluate the performance of the extracted radiomics features. We created the class labels by normalizing and dividing the number of survival days into three different regions, i.e., short survivor (0–600), medium survivor (600–1300), and long survivor (1300–Alive). The measurement criterion followed in the literature is to predict the correct number of cases that have survival of less than 10 months, between 10 and 15 months, and greater than 15 months. We further used precision, recall and accuracy as performance measures for each classifier.
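The labeling rule and the three performance measures mentioned above can be illustrated with a short sketch. Several details are assumptions: the function names `survival_class` and `macro_metrics` are hypothetical, the boundary days 600 and 1300 are assigned to the higher class because the text leaves this open, and precision and recall are macro-averaged over the three classes, which the paper does not state.

```python
def survival_class(days):
    """Map survival in days to a class label; missing values count as short."""
    if days is None:  # the paper treats missing survival as low survival
        return "short"
    if days < 600:
        return "short"
    if days < 1300:
        return "mid"
    return "long"


def macro_metrics(y_true, y_pred, classes=("short", "mid", "long")):
    """Return accuracy, macro-averaged precision and macro-averaged recall."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        predicted = sum(p == c for p in y_pred)
        actual = sum(t == c for t in y_true)
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    return accuracy, sum(precisions) / len(classes), sum(recalls) / len(classes)


y_true = [survival_class(d) for d in (100, 700, 1500, None)]
y_pred = ["short", "mid", "mid", "short"]  # toy classifier output
accuracy, precision, recall = macro_metrics(y_true, y_pred)
```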
Initially, we used first order statistical features and shape-based features, but found that these features could not provide significant performance in the prediction task. Hence, we incorporated Haralick texture features and the Fourier shape descriptor, and observed a significant increase in performance when using conventional classifiers. We used a 10-fold cross-validation approach for classification.

Table 2 shows the performance of the machine learning models in terms of accuracy, precision, and recall. A fully connected neural network, with two
hidden layers, was used for survival prediction on BraTS 2019 training data [17]. For 101 patients, using radiomics features, an accuracy of 0.515 was reported (Table 2). The value of k is an important parameter to choose for the k-NN classifier and impacts the overall classification results. Since k-NN performance was not on par when using radiomics features at k = 3, we experimented with increasing the value of k, but did not observe a significant improvement in performance. Our results indicate that random forest was able to learn the data representation from the radiomics features for overall survival prediction. In RF, each tree (30 trees in total) was diverse because it was grown fully and left unpruned, so that the feature space was divided into smaller regions. Hence RF learned using random samples, where a random feature set was selected at every node, giving diversity to the model.

Since in the BraTS 2019 benchmark data the input modalities have intensity variations and tumor appearance is also heterogeneous, features computed from these modalities are also diverse in nature. We further quantify the importance of all features (statistical, shape-based, and Haralick) as shown in Figure 2. It was observed that Haralick features (represented at feature indices 1 to 14) had an out-of-bag feature importance value ranging between 1 and 2.5. This was on average higher than for all other sets of features used and shows the significance of these features in the classification task. This analysis was performed in MATLAB using the Statistics and Machine Learning Toolbox. A confusion matrix for the best performing classifier (RF) is shown in Table 3, where the values represent percentages.

Table 3. Confusion matrix for random forest classifier (the values represent percentages).

| Predicted Class / Actual Class | Low Survival | Mid Survival | Long Survival |
|---|---|---|---|
| Low Survival | 76.52 | 9.04 | 14.44 |
| Mid Survival | 24.77 | 75.23 | 0 |
| Long Survival | 19.00 | 6.38 | 74.62 |

In this paper, we presented an automatic framework for the prediction of survival in patients diagnosed with glioma using multi-modal MRI scans and clinical features.
First, ROI radiomics features were extracted, which were then combined with clinical features to predict overall survival. For survival prediction, we extracted shape features, first order statistics, and texture features from the segmented tumor region and then used classification models with 10-fold cross validation for prognosis. In particular, the experimental data were acquired in a multi-center setting and hence a cross-validation approach was utilized to test the robustness of our proposed approach in the absence of an independent test cohort. In the literature, survival prediction models have been applied on diverse data along with different class labels and resection-based clinical features. For brain tumors, the performance in survival prediction has been lower; for instance, an accuracy of 70% was achieved in [13]. In particular, 3D features were extracted from the original images and filtered images. Further, feature selection was performed to reduce the 4000+ features down to 14. In our proposed approach, by contrast, we utilize slice-based features (90) and majority voting across slices to obtain a final classification. Among the five classifiers mentioned above, RF showed the best results using the computed radiomics features. The performance varied significantly among these classifiers, which shows the challenging nature of this prediction. With RF, an accuracy of 0.76, along with precision and recall of 0.74 and 0.73, respectively, was achieved. We also performed a subject-wise evaluation of the RF model, where majority voting among slices from each patient was used to assign one of the three classes to the patient. We achieved an accuracy of 0.75 using the subject-wise approach. In future, we intend to extend this work by incorporating more data from the TCIA archive as well as using 3D features extracted from atlas based models for survival prediction. We also intend to use Cox proportional hazards models to better handle data with no survival information provided (missing data).

Fig. 2. A representation of out-of-bag feature importance for all radiomics features used in this study with different colors: Haralick features (yellow), first order statistics features (green), shape features (blue) and LBP (red).
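The subject-wise evaluation described in the conclusion aggregates slice-level predictions by majority vote. A minimal sketch of that aggregation is given below; the function name `subject_level_predictions`, the subject identifiers, and the tie-breaking rule (the class seen first among tied classes wins) are assumptions, since the text does not say how ties were resolved.

```python
from collections import Counter, defaultdict


def subject_level_predictions(slice_predictions):
    """Aggregate (subject_id, predicted_class) pairs, one per slice, by majority vote."""
    votes = defaultdict(Counter)
    for subject_id, label in slice_predictions:
        votes[subject_id][label] += 1
    # most_common(1) returns the class with the highest count; among tied
    # classes, the one encountered first for that subject is returned.
    return {sid: counts.most_common(1)[0][0] for sid, counts in votes.items()}


slice_predictions = [
    ("subject_A", "short"), ("subject_A", "mid"), ("subject_A", "short"),
    ("subject_B", "long"), ("subject_B", "long"), ("subject_B", "mid"),
]
subject_predictions = subject_level_predictions(slice_predictions)
```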

References

1. American Association of Neurological Surgeons, accessed: 07/12/2020
2. Anwar, S.M., Altaf, T., Rafique, K., RaviPrakash, H., Mohy-ud Din, H., Bagci, U.: A survey on recent advancements for AI enabled radiomics in neuro-oncology. In: International Workshop on Radiomics and Radiogenomics in Neuro-oncology. pp. 24–35. Springer (2019)
3. Anwar, S.M., Majid, M., Qayyum, A., Awais, M., Alnowami, M., Khan, M.K.: Medical image analysis using convolutional neural networks: a review. Journal of Medical Systems (11), 226 (2018)
4. Anwar, S.M., Yousaf, S., Majid, M.: Brain tumor segmentation on multimodal MRI scans using EMAP algorithm. In: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). pp. 550–553. IEEE (2018)
5. Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J.S., Freymann, J.B., Farahani, K., Davatzikos, C.: Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data, 170117 (2017)
6. Burger, W., Burge, M.J.: Fourier shape descriptors. In: Principles of Digital Image Processing, pp. 169–227. Springer (2013)
7. Haralick, R.M., Shanmugam, K., Dinstein, I.H.: Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics (6), 610–621 (1973)
8. Ho, M.L., Rojas, R., Eisenberg, R.L.: Cerebral edema. American Journal of Roentgenology (3), W258–W273 (2012)
9. Louis, D.N., Perry, A., Reifenberger, G., Von Deimling, A., Figarella-Branger, D., Cavenee, W.K., Ohgaki, H., Wiestler, O.D., Kleihues, P., Ellison, D.W.: The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathologica (6), 803–820 (2016)
10. Mehreen, A., Anwar, S.M., Haseeb, M., Majid, M., Ullah, M.O.: A hybrid scheme for drowsiness detection using wearable sensors. IEEE Sensors Journal (13), 5119–5126 (2019)
11. Polepaka, S., Rao, C.S., Mohan, M.C.: IDSS-based two stage classification of brain tumor using SVM. Health and Technology pp. 1–10 (2019)
12. RaviPrakash, H., Korostenskaja, M., Castillo, E.M., Lee, K.H., Salinas, C.M., Baumgartner, J., Anwar, S.M., Spampinato, C., Bagci, U.: Deep learning provides exceptional accuracy to ECoG-based functional language mapping for epilepsy surgery. Frontiers in Neuroscience, 409 (2020)
13. Sun, L., Zhang, S., Chen, H., Luo, L.: Brain tumor segmentation and survival prediction using multimodal MRI scans with deep learning. Frontiers in Neuroscience, 810 (2019)
14. Sun, L., Zhang, S., Luo, L.: Tumor segmentation and survival prediction in glioma with deep learning. In: International MICCAI Brainlesion Workshop. pp. 83–93. Springer (2018)
15. Suter, Y., Jungo, A., Rebsamen, M., Knecht, U., Herrmann, E., Wiest, R., Reyes, M.: Deep learning versus classical regression for brain tumor patient survival prediction. In: International MICCAI Brainlesion Workshop. pp. 429–440. Springer (2018)
16. Villanueva-Meyer, J.E., Mabray, M.C., Cha, S.: Current clinical brain tumor imaging. Neurosurgery (3), 397–415 (2017)
17. Wang, F., Jiang, R., Zheng, L., Meng, C., Biswal, B.: 3D U-Net based brain tumor segmentation and survival days prediction. In: International MICCAI Brainlesion Workshop. pp. 131–141. Springer (2019)
18. Weninger, L., Rippel, O., Koppers, S., Merhof, D.: Segmentation of brain tumors and patient survival prediction: methods for the BraTS 2018 challenge. In: International MICCAI Brainlesion Workshop. pp. 3–12. Springer (2018)
19. Yang, D., Rao, G., Martinez, J., Veeraraghavan, A., Rao, A.: Evaluation of tumor-derived MRI-texture features for discrimination of molecular subtypes and prediction of 12-month survival status in glioblastoma. Medical Physics 42