A Comparative Study of U-Net Topologies for Background Removal in Histopathology Images
Abtin Riasatian, Maral Rasoolijaberi, Morteza Babaie, H.R. Tizhoosh
Kimia Lab, University of Waterloo, Canada
{abtin.riasatian, mrasooli, mbabaie, tizhoosh}@uwaterloo.ca

Abstract—During the last decade, the digitization of pathology has gained considerable momentum. Digital pathology offers many advantages, including more efficient workflows, easier collaboration, as well as a powerful venue for telepathology. At the same time, applying Computer-Aided Diagnosis (CAD) to Whole Slide Images (WSIs) has received substantial attention as a direct result of this digitization. The first step in any image analysis is to extract the tissue. Hence, background removal is an essential prerequisite for efficient and accurate results in many algorithms. Although the distinction is obvious to a human operator, the identification of tissue regions in WSIs can be challenging for computers, mainly due to color variations and artifacts. Moreover, some cases such as alveolar tissue, fatty tissue, and tissue with poor staining are difficult to detect. In this paper, we perform experiments on the U-Net architecture with different network backbones (different topologies) to remove the background as well as artifacts from WSIs in order to extract the tissue regions. We compare a wide range of backbone networks including MobileNet, VGG16, EfficientNet-B3, ResNet50, ResNext101, and DenseNet121. We trained and evaluated the networks on a manually labeled subset of The Cancer Genome Atlas (TCGA) dataset. EfficientNet-B3 and MobileNet, with almost 99% sensitivity and specificity, reached the best results.
Index Terms—Histopathology, Convolutional Networks, Tissue Segmentation, U-Net, Artifact Removal.
I. INTRODUCTION
In the recent decade, image digitization has become increasingly popular in pathology practice. Improvements in this technology have led to the manufacturing of high-resolution whole-slide scanners which can produce WSIs in a short time. The digital scan of a biopsy glass slide can be explored in an image viewer rather than under a conventional microscope. Also, despite the large size of the scans (a typical WSI file is usually at least several hundred megabytes), advances in storage and network sharing make it possible to share these files much faster than mailing glass samples for consultations and second opinions [1]. An important benefit of digital pathology is that AI and computer vision methods can be applied to tissue scans to help pathologists create more accurate reports [2]. Due to the large size of WSIs, most pathology image processing methods divide the slides into small tiles (patches) before feeding them to the CAD systems. Unquestionably, foreground segmentation is a necessary prerequisite for almost every tile-based method, both to decrease time complexity and to reduce the possibility of algorithmic mistakes caused by analyzing irrelevant parts [3]. Thus, one has to remove as many irrelevant pixels from WSIs as possible without removing any tissue pixels [4], [5]. Since histopathology image analysis is generally the last step in cancer diagnosis [6], it is crucial to avoid losing tissue pixels. Therefore, the expected segmentation sensitivity has to be very high.

Another application of tissue foreground segmentation is in whole-slide scanners, which digitize glass slides containing tissue specimens to generate WSI files. The focus depth of whole-slide scanners must be adjusted for different tissue regions due to variable tissue thickness. Hence, scanners need to identify all areas which contain tissue. If an error occurs while digitizing a glass slide, there is no way to fix it in the following steps of the digital pathology workflow. Currently, a technician manually checks every slide after scanning, which is a tedious and expensive procedure [2], [7].

Some of the challenges in tissue segmentation in histopathology images are related to the tissue type. For instance, air sacs in the lung, and fat, which can appear in many tissue types, may confuse algorithms due to their resemblance to the background color, while they can easily be segmented as tissue by an expert. Another important challenge is the presence of artifacts including bubbles, tissue folds, extra stain, broken glass, debris, and marker traces [4], [8]. Moreover, mistakes in tissue preparation, such as weak staining, raise difficulties for tissue identification algorithms [2], [9]. Examples of some of these challenging cases are shown in Fig. 1.

In this paper, we propose a novel method to identify tissue areas in WSI thumbnail images. The main contributions of this paper are: (1) releasing a publicly available dataset consisting of 244 thumbnails of TCGA WSIs along with their segmentation masks, (2) proposing a deep learning topology using U-Net for reliable, accurate, and automatic tissue segmentation, and (3) comparing the performance of different encoders as the backbone of U-Net in the tissue segmentation task.

* This work was funded by an NSERC-CRD grant on "Design and Development of Devices and Procedures for Recognizing Artefacts and Foreign Tissue Origin for Diagnostic Pathology".
The manifest to download the data from the GDC website, the manually refined labels, as well as the code to run the proposed U-Net, are available for download at https://kimialab.uwaterloo.ca/kimia/index.php/data-and-code/.

Fig. 1. Some examples of challenges in tissue extraction (images selected from the TCGA dataset): (a) lung tissue with air sacs, (b) fatty tissue, (c) dirty glass slide, (d) extra stain in background, (e) poor staining, (f) broken glass.
II. RELATED LITERATURE
The identification of regions containing tissue is usually the first step in histopathology image analysis. However, this problem is often treated as a trivial part of research, mostly solved via threshold-based methods. Most research papers have used empirical rules to set the threshold for different image characteristics such as gradient, intensity, and color [9]–[13].
A. Machine Vision-Based Methods
Estimation of texture complexity in small neighborhoods has been used by Oswal et al. to detect the foreground [10]. Babaie et al. [11] used homogeneity and gradient values to estimate patch complexity. Bentaieb et al. [12] used a threshold on pixel intensity values to detect the tissue. As another example, Kothari et al. [9] removed blank regions by setting a threshold on the saturation and intensity of pixels. Other works used homogeneity criteria to select only patches containing a considerable portion of tissue [14].

The Otsu algorithm [15], a robust iterative thresholding method, has been widely used to compute the optimal threshold. Mohit employed the Otsu method on the HSV-transformed image for background removal [16]. Nguyen et al. applied Otsu's method on the b channel of the LAB color space to obtain tissue regions in WSIs [17].

One of the most well-known and widely used open-source libraries in digital histopathology, the Histomics Toolkit (HistomicsTK, https://github.com/DigitalSlideArchive/HistomicsTK), can also perform tissue detection on the thumbnail of a WSI. The process consists of a series of Gaussian smoothing and Otsu thresholding steps. Additionally, another threshold is used to filter out regions smaller than a preset size.

In contrast to the mentioned works, which treat background detection as a small part of the entire WSI processing, there are few studies which have addressed foreground/background detection in histopathology slides as a major problem in itself [4], [7], [13], [18]. Similar to the previous vision-based methods, FESI [13] used a combination of basic methods, such as median filtering, thresholding, erosion, and dilation, to address this problem: the absolute value of the Laplacian of the gray-scale image is calculated, and then a Gaussian filter is applied. Recently, Chen et al. [19] introduced a tissue localization method applying inverse binarization on gray-scale images followed by erosion and dilation.
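As an illustration of this family of threshold-based methods, the following is a minimal OpenCV sketch (not the original authors' code) of Otsu thresholding on the b channel of the LAB color space, as described for Nguyen et al. [17]; the function name and the threshold polarity are our assumptions:

    import cv2

    def tissue_mask_lab_otsu(rgb_thumbnail):
        """Rough tissue mask via Otsu's threshold on the LAB b channel."""
        # Expects an 8-bit RGB thumbnail (H x W x 3, uint8).
        lab = cv2.cvtColor(rgb_thumbnail, cv2.COLOR_RGB2LAB)
        b_channel = lab[:, :, 2]  # blue/yellow axis; stains separate well here
        _, mask = cv2.threshold(b_channel, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return mask  # polarity may need inverting depending on the stain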
B. Network-Based Methods

Neural network-based methods are a rather recent trend in the literature addressing tissue segmentation. Raja et al. [4] extracted four different feature types, namely color, appearance, texture, and spatial features. They fed the selected features to a two-layer neural network to classify pixels into background and foreground. Bándi et al. [7] trained FCN and U-Net networks for tissue segmentation using patches with a single label. Their fixed-size patches were randomly extracted from 54 WSIs. They assigned only one label to each patch based on its central pixel, which means that in the U-Net case the same label was allocated to roughly 800,000 pixels. It seems that all network-based methods work on the highest available magnification. As a result, for whole-slide processing, a large number of small patches must be fed to the network, which is time-consuming. A more efficient way of segmentation is to assign a label to each pixel of a thumbnail, to save time and also to avoid losing tissue parts (especially borders) as much as possible. Therefore, in this paper we provide manually labeled WSI thumbnails (low magnification) to train U-Net models (Fig. 2). We have compared the most commonly used network architectures to find the best backbone for the proposed U-Net.

C. U-Net

U-Net is a convolutional neural network which was first proposed in 2015 for the segmentation of neural structures in electron microscopy images [20]. Since then, this network has shown impressive performance in various segmentation tasks in medical imaging. Dong et al. [21] proposed an automatic method to detect and segment brain tumors in MRI using U-Net. Bulten et al. [22] utilized U-Net for epithelial tissue segmentation to assist pathologists in prostate cancer diagnosis. Naylor et al. [23] proposed a method for cell nuclei segmentation by formulating the task as regression of the distance map. They compared the results of three different architectures: (1) the pre-trained VGG16 with fine-tuning as the FCN approach, (2) U-Net, and (3) Mask R-CNN with the pre-trained ResNet101 as its backbone. U-Net can be trained end-to-end using a small number of images [24]. This is the most significant advantage of U-Net, especially in applications such as the biomedical domain, where usually only a few annotated images are available.
Fig. 2. Network architecture: each block shows a feature map. The input image passes through the encoder; concatenating skip connections carry encoder feature maps into the decoder, which produces the output mask.
III. METHODOLOGY
A. Data Annotation
For tissue segmentation, a label must be assigned to each pixel to indicate whether it belongs to a tissue region or not. A mask is a binary image where every pixel is either zero (black) or one (white), with the white pixels generally marking the region of interest.

A typical WSI contains many thousands of pixels along each image axis. Thus, assigning a label to each pixel of the WSI is not a feasible task. To overcome this challenge, we first work with thumbnails instead of WSIs, i.e., the image at a very low magnification. Working with thumbnails has the advantage of fast computation. Also, tissue regions at higher magnifications can be constructed from their corresponding segmented thumbnails by the simple calculations commonly known for the pyramidal structure of whole slide images. We developed a handcrafted image processing approach, detailed in Alg. 1, to produce initial masks from thumbnails [25], [26]. Masking was performed at a slightly higher magnification than the thumbnail itself to preserve details.
Algorithm 1: Handcrafted Masking Method
Input: the RGB thumbnail of the WSI (rgbThmb)
Output: a thumbnail binary mask of the same size (finalMask)

    chosenContours ← []
    binThmb ← binaryThresholding(rgbThmb)
    contours, hierarchy ← findContours(binThmb)
    fatherContours ← getContours(contours, hierarchy, 0)
    append(chosenContours, fatherContours)
    firstLevelChildren ← getContours(contours, hierarchy, 1)
    firstLevelChildren ← sort(firstLevelChildren, 'area')
    append(chosenContours, firstLevelChildren[0])
    i ← 1
    while firstLevelChildren[i].area >
          min(firstLevelChildren[i-1].area * ratioThreshold, areaThreshold) do
        append(chosenContours, firstLevelChildren[i])
        i ← i + 1
    end
    foreach x in firstLevelChildren do
        distCond ← distanceToClosest(x, chosenContours) < distThreshold
        if distCond and x not in chosenContours then
            append(chosenContours, x)
    end
    foreach x in firstLevelChildren do
        areaCond ← getArea(x) > areaThreshold
        if areaCond and x not in chosenContours then
            append(chosenContours, x)
    end
    drawContours(chosenContours, finalMask, 'white')
    foreach hole in invert(binThmb) do
        if hLowerThresh < hole.area < hUpperThresh then
            drawContours(hole, finalMask, 'black')
    end

Note that, based on our practical experiments, this masking magnification was the smallest at which tissue parts could still be distinguished from artifacts such as extra staining in challenging cases. Thereafter, the initial masks were refined manually to make sure that all tissue regions are selected, and that noise and artifacts are removed as much as possible. An example of these steps can be found in Fig. 3.

With regard to difficult cases, we used image dilation with small square kernels to make sure every tissue pixel, especially at the borders, is preserved. Finally, each pair of mask and thumbnail is resized (preserving the aspect ratio) such that neither image dimension exceeds a fixed maximum, making the images small enough to be processed by the network. It is noteworthy that the thumbnails have various dimensions, necessitating background padding to obtain a unified square size for all images.
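For illustration, a much-simplified OpenCV sketch of the contour-selection idea behind Alg. 1 follows; the use of Otsu binarization, the single fixed area threshold, and the omission of the ratio/distance rules and hole handling are our simplifications, not the exact procedure above:

    import cv2
    import numpy as np

    def initial_mask(rgb_thumbnail, area_threshold=200):
        """Simplified Alg. 1: keep outer contours plus large children."""
        gray = cv2.cvtColor(rgb_thumbnail, cv2.COLOR_RGB2GRAY)
        _, bin_thmb = cv2.threshold(gray, 0, 255,
                                    cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        contours, hierarchy = cv2.findContours(bin_thmb, cv2.RETR_TREE,
                                               cv2.CHAIN_APPROX_SIMPLE)
        # hierarchy[0][i][3] is the parent index; -1 marks a top-level contour
        chosen = [c for c, h in zip(contours, hierarchy[0])
                  if h[3] == -1
                  or cv2.contourArea(c) > area_threshold]
        mask = np.zeros(gray.shape, dtype=np.uint8)
        cv2.drawContours(mask, chosen, -1, 255, thickness=cv2.FILLED)
        return mask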
B. Model Architecture

U-Net, a fully convolutional network with a U-shaped architecture, has two parts, called the encoder and the decoder. The first sub-network, known as the encoder, extracts high-level features to capture the image content. The decoder sub-network, also known as the expansion part, creates the desired segmentation map [20]. Fig. 2 shows the proposed network architecture. U-Net-based deep networks, like U-Net itself, include encoder and decoder sub-networks. As the input image passes through the first sub-network, higher-level features are extracted. In the second sub-network, deep feature maps are combined with low-level feature maps from the encoder. The spatial resolution of the feature maps is increased in the second sub-network to achieve an output mask of the same size as the input image. The connections between the encoder and decoder in the U-Net architecture facilitate information propagation: feature maps from the encoder part are cropped and concatenated to feature maps in the decoder sub-network to retrieve local information. These connections enable the network to learn from few samples [20].

To improve the performance of U-Net, we applied custom backbones to its architecture using the Segmentation Models library (https://github.com/qubvel/segmentation_models). The encoder part of these customized networks is the feature extractor of a chosen network (i.e., the complete network architecture except the last dense layer), e.g., MobileNet. The decoder part consists of 5 decoder blocks with 256, 128, 64, 32, and 16 filters as it gets deeper. Each decoder block is made up of one 2D upsampling layer followed by two repetitions of 2D convolution, batch normalization, and ReLU activation. Four skip connections connect layers from the encoder part, usually the output of a ReLU activation at a certain layer of each encoder block, to the last four decoder blocks, after the upsampling layer. The last layer of the network is a 2D convolution layer with sigmoid activation. We experimented with six different backbones (topologies) for U-Net-based tissue segmentation, introduced in Section IV-B.
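A minimal sketch (assuming the qubvel Segmentation Models API cited above) of how such a model can be instantiated; the parameter values mirror the description in this section:

    import segmentation_models as sm

    # U-Net with a MobileNet encoder pre-trained on ImageNet; the library
    # wires the skip connections between encoder and decoder automatically.
    model = sm.Unet(
        backbone_name="mobilenet",               # encoder backbone
        encoder_weights="imagenet",              # ImageNet initialization
        decoder_filters=(256, 128, 64, 32, 16),  # five decoder blocks
        classes=1,                               # single tissue channel
        activation="sigmoid",                    # final 2D convolution
    )
    model.summary()

Swapping backbone_name (e.g., to "vgg16" or "resnet50") is enough to reproduce the other topologies compared in Section IV.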
IV. EXPERIMENTS

A. Data
We used 244 WSIs selected from different organs such as brain, breast, kidney, and lung. All WSIs were randomly selected from The Cancer Genome Atlas (TCGA) dataset [27], [28]. TCGA is one of the largest publicly available datasets of histopathology whole slide images.

B. Topologies and Training Process
We have experimented with various network topologies, including MobileNet, VGG16, EfficientNet-B3, ResNet50, ResNext101, and DenseNet121, as the backbone of the U-Net model to find the most suitable ones for tissue segmentation [29]–[33]. All networks were trained for 50 epochs, with no early stopping, using the Adam optimizer [34] with a fixed learning rate on one NVIDIA Tesla V100 GPU with 32 GB of memory. After running experiments with two loss functions, namely the Jaccard index and sensitivity plus specificity, we chose the latter so that the network finds an approximation which avoids misclassifying tissue parts as background while still performing well at recognizing the background. The drawback of using the Jaccard index as the loss function was the relatively low sensitivity of the results. The networks were initialized with ImageNet weights and trained and evaluated with five-fold cross-validation. For each fold, 195 RGB images were used as the input, with binary masks of the same size as labels, in which pixel value 1 (positive) means tissue and pixel value 0 (negative) means background. Input images and their corresponding masks were augmented with three transformations: (1) random rotation within the range of -180 to 180 degrees, (2) random horizontal flipping, and (3) random vertical flipping. The validation set contained 49 images for each fold.
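The exact formulation of the "sensitivity plus specificity" loss is not given above; one plausible Keras-compatible reading (our assumption) drives both measures toward their ideal value of 1:

    import tensorflow.keras.backend as K

    def sens_spec_loss(y_true, y_pred, eps=1e-6):
        # Soft confusion-matrix counts over the predicted probabilities.
        tp = K.sum(y_true * y_pred)
        fn = K.sum(y_true * (1.0 - y_pred))
        tn = K.sum((1.0 - y_true) * (1.0 - y_pred))
        fp = K.sum((1.0 - y_true) * y_pred)
        sensitivity = tp / (tp + fn + eps)
        specificity = tn / (tn + fp + eps)
        return 2.0 - sensitivity - specificity  # 0 for a perfect mask

    # model.compile(optimizer="adam", loss=sens_spec_loss)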
C. Comparison of Methods
To compare our results against other methods, we fed the same input images to each method and calculated its performance against the ground-truth masks. All methods were verified to work with the given inputs. We compared our results against four traditional computer vision methods:

(1) The FESI algorithm [13], improved as follows (a minimal code sketch follows Table I): the color space of the input image is changed from BGR to LAB, and the values of the first two channels, lightness and red/green, are set to the maximum intensity value. The resulting image is converted to gray-scale and binarized using the mean value of the image as the threshold. This binary image is passed to the Gaussian filter, instead of using the absolute value of the Laplacian of the gray-scale image as done in the original paper. An open-source implementation is available at https://github.com/alexander-rakhlin/he_stained_fg_extraction.

(2) We used the locate_tissue_cnts function of the open-source Python package TissueLoc (https://github.com/PingjunChen/tissueloc) [19] as a recently developed method for comparative purposes. We modified the function so that it uses the thumbnail image as input. All input parameters of the function were kept at their default values except min_tissue_size, which was set to 50 to make sure the algorithm detects all tissue parts.

(3) The Histomics Toolkit (HistomicsTK) Python library is one of the most popular libraries in the histopathology domain. Its saliency.tissue_detection.get_tissue_mask function was used as the tissue segmentation method. We set the input parameters deconvolve_first to False, n_thresholding_steps to 1, and the minimum size threshold to 50.

(4) The Otsu binarization method, one of the most well-known algorithms for classifying pixels into foreground and background. The RGB thumbnail images are first converted to gray-scale and then the Otsu method is applied.

Fig. 3. Two samples of the sequential steps for generating masks (original thumbnails, initial masks, refined/final masks).

TABLE I
SUMMARY OF RESULTS: COMPARING OUR NETWORKS WITH IMAGE-PROCESSING METHODS (THE LAST FOUR ROWS).

Method              | Time (s) | Jaccard Index | Dice Coeff. | Sensitivity | Specificity
MobileNet           | 0.11     | 0.95          | 0.97        | 0.99        | 0.99
EfficientNet-B3     | 0.18     | 0.95          | 0.97        | 0.99        | 0.98
ResNet50            | 0.16     | 0.94          | 0.97        | 0.99        | 0.98
DenseNet121         | 0.16     | 0.93          | 0.96        | 0.99        | 0.98
ResNext101          | 0.50     | 0.93          | 0.96        | 0.99        | 0.98
VGG16               | 0.11     | 0.75          | 0.82        | 0.99        | 0.81
Improved FESI [13]  | 0.11     | 0.86          | 0.92        | 0.91        | 0.97
TissueLoc [19]      | 0.26     | 0.81          | 0.88        | 0.88        | 0.97
Otsu algorithm      | 0.02     | 0.81          | 0.89        | 0.82        | 0.99
Histomics-TK        | 0.13     | 0.78          | 0.87        | 0.79        | 0.99
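As referenced in method (1) above, the following is a minimal OpenCV sketch of the modified FESI preprocessing; the Gaussian kernel size and the binarization polarity are our assumptions, not values from the paper:

    import cv2

    def improved_fesi_preprocess(thumbnail_bgr):
        """Modified FESI preprocessing from method (1)."""
        lab = cv2.cvtColor(thumbnail_bgr, cv2.COLOR_BGR2LAB)
        lab[:, :, 0] = 255  # lightness channel to maximum intensity
        lab[:, :, 1] = 255  # red/green channel to maximum intensity
        # Back to gray-scale and binarize with the image mean as threshold.
        gray = cv2.cvtColor(cv2.cvtColor(lab, cv2.COLOR_LAB2BGR),
                            cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, float(gray.mean()), 255,
                                  cv2.THRESH_BINARY)
        # Smooth the binary image with a Gaussian filter.
        return cv2.GaussianBlur(binary, (5, 5), 0)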
D. Performance Evaluation
In the test phase, we compared the results of all methods with the ground truths via 5-fold cross-validation. In addition to the processing time, four performance measures were computed, namely the Jaccard index, Dice coefficient, sensitivity, and specificity, defined as:

Jaccard := TP / (TP + FP + FN),        (1)
Dice := 2*TP / (2*TP + FP + FN),       (2)
Sensitivity := TP / (TP + FN),         (3)
Specificity := TN / (TN + FP),         (4)

where TP, TN, FP, and FN denote the number of true positives, true negatives, false positives, and false negatives, respectively. Segmented pixels are counted as positive where they are labeled as tissue and as negative where they are labeled as non-tissue. Note that sensitivity is more important than specificity in the tissue segmentation task, because sensitivity penalizes the wrong labeling of tissue regions as background, while specificity penalizes the wrong labeling of background regions as tissue. In histopathology, it is paramount not to miss any part of the tissue.
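A direct NumPy implementation of Eqs. (1)-(4) for binary masks (the function name is ours):

    import numpy as np

    def segmentation_metrics(y_true, y_pred):
        """Eqs. (1)-(4) for binary masks (1 = tissue, 0 = background)."""
        y_true, y_pred = y_true.astype(bool), y_pred.astype(bool)
        tp = np.sum(y_true & y_pred)
        tn = np.sum(~y_true & ~y_pred)
        fp = np.sum(~y_true & y_pred)
        fn = np.sum(y_true & ~y_pred)
        return {
            "jaccard": tp / (tp + fp + fn),
            "dice": 2 * tp / (2 * tp + fp + fn),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }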
E. Analysis of Results

In addition to the Improved FESI and TissueLoc methods, we chose the HistomicsTK tissue segmentation and the Otsu algorithm to compare our networks' results against well-known methods for histopathology image analysis. Table I shows that all networks, except VGG16, outperform all four handcrafted methods with respect to all performance measures. The most important advantage of the networks over the handcrafted methods is their high sensitivity (≈0.99). In addition, the networks are as fast as handcrafted methods such as Improved FESI and TissueLoc while achieving considerably higher Jaccard index and Dice coefficients. The Jaccard index for the networks with the best performance, namely MobileNet and EfficientNet-B3, is 9% higher than for the best handcrafted method, namely Improved FESI. Considering the changes in the validation loss over 50 epochs for two networks, ResNext101 with around 51 million parameters and EfficientNet-B3 with less than 18 million parameters (Fig. 5), both show the same pattern; 20 epochs appeared to be enough for proper network training. This would take around 20 minutes for a medium-size network and 40 minutes for a large network, a negligible cost considering the benefits of using networks.

Comparing the network backbones, MobileNet showed the best performance, and EfficientNet-B3 also performed very well. The poor performance of VGG16 could have several reasons. First of all, this network has a large number of parameters (more than 23 million) while it has only 66 layers, compared to other networks such as MobileNet with more than 8 million parameters and 128 layers, and EfficientNet-B3 with around 18 million parameters and 418 layers. Also, the use of batch normalization and ReLU activation layers in the convolution blocks of the other architectures has two benefits: avoiding internal covariate shift, which results in faster convergence, and keeping the network sparse, which decreases the generalization error. Since VGG16 lacks these layers in its architecture, it converges with difficulty. Fig. 6 depicts a visual overview of the proposed network results versus the image-processing methods. As can be seen, the proposed networks perform considerably better on fatty tissue (second column) and tissue with an air bubble (third column).

V. CONCLUSION
In this paper, we have compared the performance of U-Net with various custom topologies (backbones) for the identification of tissue regions in whole slide images. Using different backbone networks combines the strength of current state-of-the-art CNNs with the custom architecture of the U-Net model for image segmentation. Whereas U-Net topologies can generate
segments with 99% sensitivity, the handcrafted methods struggled to approach the high-80% range. Both MobileNet and EfficientNet-B3 appeared to be the best backbone topologies for the U-Net. The next step for this research would be changing the current binary masking network to a multi-class one, which could label each pixel with classes such as marker trace, dirt, tissue fold, fat, and informative tissue. The authors have made the segmented images publicly available for the sake of reproducibility.

Fig. 4. Jaccard index.

Fig. 5. Training and validation losses for ResNext101 and EfficientNet-B3.

REFERENCES
[1] S. Al-Janabi, A. Huisman, and P. J. Van Diest, "Digital pathology: current status and future perspectives," Histopathology, vol. 61, no. 1, pp. 1–9, 2012.
[2] L. Pantanowitz, P. N. Valenstein, A. J. Evans, K. J. Kaplan, J. D. Pfeifer, D. C. Wilbur, L. C. Collins, and T. J. Colgan, "Review of the current state of whole slide imaging in pathology," Journal of Pathology Informatics, vol. 2, 2011.
[3] M. D. Kumar, M. Babaie, and H. R. Tizhoosh, "Deep barcodes for fast retrieval of histopathology scans," IEEE, 2018, pp. 1–8.
[4] R. S. Alomari, R. Allen, B. Sabata, and V. Chaudhary, "Localization of tissues in high-resolution digital anatomic pathology images," in Medical Imaging 2009: Computer-Aided Diagnosis, vol. 7260. International Society for Optics and Photonics, 2009, p. 726016.
[5] A. M. Khan, H. El-Daly, and N. Rajpoot, "RanPEC: Random projections with ensemble clustering for segmentation of tumor areas in breast histology images," in Medical Image Understanding and Analysis, 2012, pp. 17–23.
[6] L. He, L. R. Long, S. Antani, and G. R. Thoma, "Histology image analysis for carcinoma detection and grading," Computer Methods and Programs in Biomedicine, vol. 107, no. 3, pp. 538–556, 2012.
[7] P. Bándi, R. van de Loo, M. Intezar, D. Geijs, F. Ciompi, B. van Ginneken, J. van der Laak, and G. Litjens, "Comparison of different methods for tissue segmentation in histopathological whole-slide images," IEEE, 2017, pp. 591–595.
[8] M. Babaie and H. R. Tizhoosh, "Deep features for tissue-fold detection in histopathology images," arXiv preprint arXiv:1903.07011, 2019.
[9] S. Kothari, J. H. Phan, and M. D. Wang, "Eliminating tissue-fold artifacts in histopathological whole-slide images for improved image-based prediction of cancer grade," Journal of Pathology Informatics, vol. 4, 2013.
[10] V. Oswal, A. Belle, R. Diegelmann, and K. Najarian, "An entropy-based automated cell nuclei segmentation and quantification: application in analysis of wound healing process," Computational and Mathematical Methods in Medicine, vol. 2013, 2013.
[11] M. Babaie, S. Kalra, A. Sriram, C. Mitcheltree, S. Zhu, A. Khatami, S. Rahnamayan, and H. R. Tizhoosh, "Classification and retrieval of digital pathology scans: A new dataset," in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, July 2017.
[12] A. BenTaieb and G. Hamarneh, "Predicting cancer with a recurrent visual attention model for histopathology images," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2018, pp. 129–137.
[13] D. Bug, F. Feuerhake, and D. Merhof, "Foreground extraction for histopathological whole slide imaging," in Bildverarbeitung für die Medizin 2015. Springer, 2015, pp. 419–424.
[14] H. Erfankhah, M. Yazdi, M. Babaie, and H. R. Tizhoosh, "Heterogeneity-aware local binary patterns for retrieval of histopathology images," IEEE Access, vol. 7, pp. 18354–18367, 2019.
[15] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
[16] M. Mohit, "Automated histopathological analyses at scale," Ph.D. dissertation, Massachusetts Institute of Technology, 2017.
[17] K. Nguyen, A. K. Jain, and B. Sabata, "Prostate cancer detection: Fusion of cytological and textural features," Journal of Pathology Informatics, vol. 2, 2011.
[18] S. Fouad, D. Randell, A. Galton, H. Mehanna, and G. Landini, "Unsupervised morphological segmentation of tissue compartments in histopathological images," PLoS ONE, vol. 12, no. 11, p. e0188717, 2017.
[19] P. Chen and L. Yang, "tissueloc: Whole slide digital pathology image tissue localization," Journal of Open Source Software, vol. 4, no. 33, p. 1148, 2019.
[20] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
[21] H. Dong, G. Yang, F. Liu, Y. Mo, and Y. Guo, "Automatic brain tumor detection and segmentation using U-Net based fully convolutional networks," in Annual Conference on Medical Image Understanding and Analysis. Springer, 2017, pp. 506–517.
[22] W. Bulten, C. A. Hulsbergen-van de Kaa, J. van der Laak, G. J. Litjens et al., "Automated segmentation of epithelial tissue in prostatectomy slides using deep learning," in Medical Imaging 2018: Digital Pathology, vol. 10581. International Society for Optics and Photonics, 2018, p. 105810S.
[23] P. Naylor, M. Laé, F. Reyal, and T. Walter, "Segmentation of nuclei in histopathology images by deep regression of the distance map," IEEE Transactions on Medical Imaging, vol. 38, no. 2, pp. 448–459, 2018.
[24] P. Yakubovskiy, "Segmentation models," https://github.com/qubvel/segmentation_models, 2019.
[25] S. Jeyalaksshmi and S. Prasanna, "Measuring distinct regions of grayscale image using pixel values," International Journal of Engineering & Technology, vol. 7, no. 1.1, pp. 121–124, 2018.
[26] A. Belsare and M. Mushrif, "Histopathological image analysis using image processing techniques: An overview," Signal & Image Processing, vol. 3, no. 4, p. 23, 2012.
[27] "The Cancer Genome Atlas (TCGA) dataset," https://portal.gdc.cancer.gov/, accessed: November 2019.
[28] K. Tomczak, P. Czerwińska, and M. Wiznerowicz, "The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge," Contemporary Oncology, vol. 19, no. 1A, p. A68, 2015.
[29] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
[30] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[31] M. Tan and Q. V. Le, "EfficientNet: Rethinking model scaling for convolutional neural networks," arXiv preprint arXiv:1905.11946, 2019.
[32] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[33] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700–4708.
[34] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
Fig. 6. Visual comparison on challenging cases (columns: glass slide margin, fatty tissue, slide preparation artifact, air sacs; rows: original, EfficientNet, MobileNet, VGG16, Otsu, HistomicsTK).