Context-Aware Learning using Transferable Features for Classification of Breast Cancer Histology Images
Ruqayya Awan⋆, Navid Alemi Koohbanani⋆, Muhammad Shaban, Anna Lisowska, and Nasir Rajpoot

Department of Computer Science, University of Warwick, Coventry, UK
The Alan Turing Institute, London, UK
Department of Pathology, University Hospitals Coventry & Warwickshire, UK

⋆ Joint co-authors
Abstract.
Convolutional neural networks (CNNs) have recently been used for a variety of histology image analysis tasks. However, the availability of a large dataset is a major prerequisite for training a CNN, which limits its use by the computational pathology community. In previous studies, CNNs have demonstrated their potential in terms of feature generalizability and transferability, accompanied by better performance. Considering these traits of CNNs, we propose a simple yet effective method which leverages the strengths of CNNs combined with the advantages of including contextual information, particularly designed for a small dataset. Our method consists of two main steps: first, it uses the activation features of a CNN trained for patch-based classification; it then trains a separate classifier on the features of overlapping patches to perform image-based classification using the contextual information. The proposed framework outperformed the state-of-the-art method for breast cancer classification.
Keywords:
Digital pathology, Convolutional neural network, Context-aware learning, Transferable features, Breast cancer
Breast cancer is the most commonly diagnosed cancer in women and, after lung cancer, the second most common cause of cancer mortality [1]. Due to the increased incidence of breast cancer and the subjectivity in its diagnosis, there is an increasing demand for automated systems. To this end, deep neural networks (DNNs) have been widely used to produce state-of-the-art results for a variety of histology image analysis tasks such as nuclei detection and classification [2], tissue classification [3,4] and segmentation [5,6]. The CAMELYON16 challenge [6] is the best demonstration of using deep learning for automatic tissue analysis, outperforming pathologists in terms of detection of tumors within whole slide images (WSIs). The objective of this challenge was to automatically detect metastases in haematoxylin and eosin
(H&E) stained WSIs of lymph node sections. Cruz-Roa et al. [3] presented a deep learning architecture for automated basal carcinoma detection. This method first learns an image representation via an autoencoder and then applies a CNN on this representation to capture both translation-invariant features and a compact image representation. Spanhol et al. [7] applied a simple CNN for classifying the BreaKHis database [8], consisting of microscopic images of benign and malignant breast tumor biopsies. Small patches were extracted at different magnification levels to train the network and, during inference, the final output was produced by combining the predictions of the small patches.

The generalizability property of DNNs makes their features transferable to other applications, which has encouraged researchers to employ transfer learning for histology images as in [5,9,10]. These features have also been used to train separate classifiers for prediction [11,12,13,14], which is particularly useful when there is not enough data for training a CNN from scratch. In some recent studies [15,16], a context-aware learning architecture has been introduced, in which a first CNN is trained using high pixel resolution patches to extract features at a cellular level; these features are then fed to a second CNN, stacked on top of the first, to expand the context from a single patch to a large tissue region. The experimental results of these studies suggest that contextual information plays a crucial role in identifying abnormalities in heterogeneous tissue structures.

Our contribution in this work is twofold. First, we propose to use CNN features as a generic descriptor for a small dataset, provided as part of a challenge dataset. We extract transferable features from a number of networks, each trained on a different dataset, for classification by a separate classifier trained on these features.
As our second contribution, we combine these features to learn the context of a large patch to improve our classification performance. To this end, we use transferable features for a block of consecutive patches to train an SVM model which classifies the H&E stained breast images into normal, benign, carcinoma in situ (CIS) and breast invasive carcinoma (BIC).
We used the dataset provided as part of the ICIAR 2018 challenge for the classification of breast cancer histology images. This dataset comprises 400 high resolution images of size 2048 × 1536 pixels, acquired at 200× magnification and stained with H&E. The pixel resolution for these images is 0.42 µm. Each image belongs to one of four classes: normal, benign, in situ carcinoma or invasive carcinoma. The ground truth was provided by two pathologists. To study the feature transferability of CNNs, we also experimented with the part of the challenge dataset provided for the segmentation task. Ten WSIs with coarse annotations were provided for this task. We extracted patches from these WSIs after manually refining the original annotations.

The challenge dataset for the classification task consists of the training images used in [14] along with 151 additional images. To evaluate the effectiveness of our proposed approach, we split the challenge dataset under two settings. In the first setting, we use the same images for training and testing as were used in [14], for a fair comparison. We included the additional images in our validation set while training the network. The test dataset contains two sets of images, with an equal number of images in each class. The testing data is not provided with the challenge data but has been made publicly available by the authors in two sets. The first test set contains 20 images, while the second set contains 16 images and is referred to as the test extended dataset in this paper. In the second setting, which is used for submission to the challenge, we combined the whole challenge dataset from task 1 with the test dataset and randomly split it into a 75% training and 25% validation set.

Regarding the implementation, we used a residual neural network with 50 layers for patch-based classification in TensorFlow. For context-aware image-based classification, a support vector machine (SVM) classifier with a radial basis function (RBF) kernel was used, implemented in MATLAB.
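The 75%/25% random split used in the second setting can be sketched as follows (a minimal Python sketch; the filenames and the `split_dataset` helper are illustrative, not the authors' code):

```python
import random

def split_dataset(image_paths, train_frac=0.75, seed=0):
    """Randomly split a list of image paths into training and validation sets."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    n_train = int(len(paths) * train_frac)
    return paths[:n_train], paths[n_train:]

# Example: 400 challenge images split 75% / 25%, as in the second setting.
images = [f"img_{i:03d}.tiff" for i in range(400)]
train, val = split_dataset(images)
print(len(train), len(val))  # 300 100
```

Fixing the shuffle seed keeps the split reproducible across runs, which matters when comparing configurations on such a small dataset.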
Further details on both these steps are given in the Methods section.
In this paper, we introduce an effective model for image-based classification using more contextual information, particularly for a small dataset. To this end, we design our model in two main steps: patch-based classification and context-aware image-based classification. The overall system architecture is shown in Figure 1.
Fig. 1.
Flow diagram of our classification framework. Twelve non-overlapping patches are extracted from the input image. An 8192-dimensional feature vector is then obtained for each patch using a trained ResNet. The class label for the overlapping blocks (2 × 2 patches) is then predicted using an SVM classifier.

Stain inconsistency of digitized WSIs is a significant issue affecting the performance of machine learning (ML) systems. The dataset provided for this challenge contains images with large stain variation. To this end, we performed stain normalization using the Reinhard method [17], available in our group's Stain Normalization Toolbox [18]. This method transforms the color distribution of an image to the color distribution of a target image by matching the mean and standard deviation of the source image to those of the target image. This transformation is carried out for each channel separately, in the Lab colorspace.
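The Reinhard matching step can be sketched as below (an illustrative Python/NumPy sketch that assumes the images have already been converted to the Lab colorspace; the RGB↔Lab conversion and the actual toolbox implementation [18] are omitted):

```python
import numpy as np

def reinhard_normalize(source_lab, target_lab):
    """Match the per-channel mean and std of a source image to a target image.

    Both inputs are float arrays of shape (H, W, 3) already in the Lab
    colorspace (a library such as scikit-image's rgb2lab/lab2rgb would
    typically handle the conversion).
    """
    src = source_lab.astype(np.float64)
    out = np.empty_like(src)
    for c in range(3):  # normalize each Lab channel independently
        s_mean, s_std = src[..., c].mean(), src[..., c].std()
        t_mean, t_std = target_lab[..., c].mean(), target_lab[..., c].std()
        out[..., c] = (src[..., c] - s_mean) / (s_std + 1e-8) * t_std + t_mean
    return out
```

After this transform, each channel of the source image has (up to floating-point error) the same mean and standard deviation as the corresponding channel of the target image.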
Fig. 2.
Output of stain normalization: A, B and C show the target image, the original image and the stain-normalized version of B, respectively.
ResNet [19], introduced in 2015 by Microsoft, has been shown to outperform several architectures including VGG [20], GoogLeNet [21], PReLU-net [22] and BN-Inception [23]. This network also outperformed the best performing networks by a significant margin for the classification of histopathology colorectal images [24]. The state-of-the-art results of ResNet on different datasets motivated us to use it for our patch-based classification. For our experiments, we used a ResNet with 50 layers. For network training, overlapping patches of size 512 × 512 pixels were extracted from the images. The network was trained for 16 epochs with a batch size of 12 and the best trained network was selected for further processing. The training was done using stochastic gradient descent with momentum set to 0.95. The learning rate was initially set to 0.001 and was decremented after each update. Due to the very small dataset, and also to make our network robust to feature transformation, we performed data augmentation involving random rotation (90 to 360 degrees in steps of 90 degrees) and flipping during the training stage.
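The rotation-and-flip augmentation described above can be sketched as follows (an illustrative NumPy sketch; `augment` is a hypothetical helper, not part of the authors' TensorFlow pipeline):

```python
import numpy as np

def augment(patch, rng):
    """Randomly rotate a patch by a multiple of 90 degrees and randomly
    flip it, mirroring the augmentation described above.

    `patch` is an (H, W, C) array; `rng` is a numpy Generator.
    """
    k = rng.integers(1, 5)        # 1..4 quarter-turns: 90, 180, 270 or 360 degrees
    patch = np.rot90(patch, k=k)
    if rng.random() < 0.5:        # random horizontal flip
        patch = patch[:, ::-1, :]
    return patch
```

Because rotations are restricted to multiples of 90 degrees, square patches keep their shape and no interpolation (and hence no pixel-value distortion) is introduced.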
The above patch-based classification network learns a limited contextual representation for each class by using small patches of size 512 × 512 pixels. To train a classifier with a larger context, we divided each image into twelve non-overlapping patches and, for each patch, we then extracted an 8192-dimensional feature vector from the last layer of our patch-based network. We then trained an SVM classifier on the flattened features of 2 × 2 blocks of patches.

For the evaluation of our proposed method, we experimented with different configurations to show the significance of contextual information and the effect of feature transferability using networks trained on different datasets, and also to compare our method with the results of [14]. Firstly, we experimented with the contextual information captured from blocks of patches of varying size. We trained the SVM with the context of 1 × 1 (512 × 512 pixels), 2 × 2 (1024 × 1024 pixels) and 3 × 3 (1536 × 1536 pixels) blocks. Our method achieved higher accuracy compared to [14], which demonstrates the capability of the contextual information for discriminating different classes.
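The patch-grid extraction and the flattening of overlapping 2 × 2 feature blocks can be sketched as follows (an illustrative NumPy sketch under the paper's settings of twelve 512 × 512 patches per 2048 × 1536 image and 8192-dimensional per-patch features; the function names are ours):

```python
import numpy as np

def extract_patch_grid(image, patch=512):
    """Split a 1536 x 2048 image into a 3 x 4 grid of non-overlapping
    512 x 512 patches (twelve patches, as described above)."""
    rows, cols = image.shape[0] // patch, image.shape[1] // patch
    return [image[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
            for r in range(rows) for c in range(cols)]

def block_features(feature_grid):
    """Flatten every overlapping 2 x 2 block of per-patch feature vectors.

    `feature_grid` has shape (rows, cols, d); the result has one
    (4 * d)-dimensional row per block, ready for SVM training.
    """
    rows, cols, d = feature_grid.shape
    blocks = [feature_grid[r:r + 2, c:c + 2].reshape(-1)
              for r in range(rows - 1) for c in range(cols - 1)]
    return np.stack(blocks)
```

For a 3 × 4 grid of 8192-dimensional features this yields six blocks of 32768 features per image; training the RBF-kernel SVM on them is omitted here (the paper uses MATLAB; `sklearn.svm.SVC(kernel="rbf")` would be the Python analogue).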
Fig. 3.
Summary of our experimental results. (a) Accuracy obtained using the context of blocks of various sizes, where Context(1 × 1), Context(2 × 2) and Context(3 × 3) represent contextual blocks of size 512 × 512, 1024 × 1024 and 1536 × 1536 pixels, respectively.

In this paper, we proposed a context-aware network for the automated classification of breast cancer histology images. The proposed method leverages the power of CNNs to encode the representation of a patch into a high dimensional space and uses a traditional machine learning method (an SVM) to aggregate the contextual information from the high dimensional features while having a limited dataset. Our proposed approach outperformed the existing methods proposed for the same task. The proposed method is not limited to the breast cancer classification task; it could be applied to other problems where both high resolution and contextual information are required to make an optimal prediction.
References
1. R. L. Siegel, K. D. Miller, and A. Jemal, "Cancer statistics, 2016," CA: A Cancer Journal for Clinicians, vol. 66, no. 1, pp. 7–30, 2016.
2. K. Sirinukunwattana, S. E. A. Raza, Y.-W. Tsang, D. R. Snead, I. A. Cree, and N. M. Rajpoot, "Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1196–1206, 2016.
3. A. Cruz-Roa, A. Basavanhally, F. González, H. Gilmore, M. Feldman, S. Ganesan, N. Shih, J. Tomaszewski, and A. Madabhushi, "Automatic detection of invasive ductal carcinoma in whole slide images with convolutional neural networks," in SPIE Medical Imaging, vol. 9041, p. 904103, International Society for Optics and Photonics, 2014.
4. D. Wang, A. Khosla, R. Gargeya, H. Irshad, and A. H. Beck, "Deep learning for identifying metastatic breast cancer," arXiv preprint arXiv:1606.05718, 2016.
5. H. Chen, X. Qi, L. Yu, and P.-A. Heng, "DCAN: Deep contour-aware networks for accurate gland segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2487–2496, 2016.
6. B. E. Bejnordi, M. Veta, P. J. van Diest, B. van Ginneken, N. Karssemeijer, G. Litjens, J. A. van der Laak, M. Hermsen, Q. F. Manson, M. Balkenhol, et al., "Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer," JAMA, vol. 318, no. 22, pp. 2199–2210, 2017.
7. F. A. Spanhol, L. S. Oliveira, C. Petitjean, and L. Heutte, "Breast cancer histopathological image classification using convolutional neural networks," in Neural Networks (IJCNN), 2016 International Joint Conference on, pp. 2560–2567, IEEE, 2016.
8. F. A. Spanhol, L. S. Oliveira, C. Petitjean, and L. Heutte, "A dataset for breast cancer histopathological image classification," IEEE Transactions on Biomedical Engineering, vol. 63, no. 7, pp. 1455–1462, 2016.
9. N. Bayramoglu and J. Heikkilä, "Transfer learning for cell nuclei classification in histopathology images," in Computer Vision–ECCV 2016 Workshops, pp. 532–539, Springer, 2016.
10. Z. Han, B. Wei, Y. Zheng, Y. Yin, K. Li, and S. Li, "Breast cancer multi-classification from histopathological images with structured deep learning model," Scientific Reports, vol. 7, no. 1, p. 4172, 2017.
11. Y. Xu, Z. Jia, L.-B. Wang, Y. Ai, F. Zhang, M. Lai, I. Eric, and C. Chang, "Large scale tissue histopathology image classification, segmentation, and visualization via deep convolutional activation features," BMC Bioinformatics, vol. 18, no. 1, p. 281, 2017.
12. M. Valkonen, K. Kartasalo, K. Liimatainen, M. Nykter, L. Latonen, and P. Ruusuvuori, "Dual structured convolutional neural network with feature augmentation for quantitative characterization of tissue histology," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 27–35, 2017.
13. Y. Xu, Z. Jia, Y. Ai, F. Zhang, M. Lai, I. Eric, and C. Chang, "Deep convolutional activation features for large scale brain tumor histopathology image classification and segmentation," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp. 947–951, IEEE, 2015.
14. T. Araújo, G. Aresta, E. Castro, J. Rouco, P. Aguiar, C. Eloy, A. Polónia, and A. Campilho, "Classification of breast cancer histology images using convolutional neural networks," PLoS ONE, vol. 12, no. 6, p. e0177544, 2017.
15. A. Agarwalla, M. Shaban, and N. M. Rajpoot, "Representation-aggregation networks for segmentation of multi-gigapixel histology images," arXiv preprint arXiv:1707.08814, 2017.
16. B. E. Bejnordi, G. Zuidhof, M. Balkenhol, M. Hermsen, P. Bult, B. van Ginneken, N. Karssemeijer, G. Litjens, and J. van der Laak, "Context-aware stacked convolutional neural networks for classification of breast carcinomas in whole-slide histopathology images," arXiv preprint arXiv:1705.03678, 2017.
17. E. Reinhard, M. Adhikhmin, B. Gooch, and P. Shirley, "Color transfer between images," IEEE Computer Graphics and Applications, vol. 21, no. 5, pp. 34–41, 2001.
18. A. M. Khan, N. Rajpoot, D. Treanor, and D. Magee, "A nonlinear mapping approach to stain normalization in digital histopathology images using image-specific color deconvolution," IEEE Transactions on Biomedical Engineering, vol. 61, no. 6, pp. 1729–1738, 2014.
19. K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
20. K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
21. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, 2015.
22. K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034, 2015.
23. S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in International Conference on Machine Learning, pp. 448–456, 2015.
24. B. Korbar, A. M. Olofson, A. P. Miraflor, K. M. Nicka, M. A. Suriawinata, L. Torresani, A. A. Suriawinata, and S. Hassanpour, "Deep-learning for classification of colorectal polyps on whole-slide images," arXiv preprint arXiv:1703.01550, 2017.