A Multi-Scale Conditional Deep Model for Tumor Cell Ratio Counting
Eric Cosatto, Kyle Gerard, Hans-Peter Graf, Maki Ogura, Tomoharu Kiyuna, Kanako C. Hatanaka, Yoshihiro Matsuno, Yutaka Hatanaka
a Dept. of Machine Learning, NEC Laboratories America, Princeton, NJ, USA
b Digital Healthcare Business Development Office, NEC Corporation, Tokyo, Japan
c Clinical Research and Medical Innovation Center, Hokkaido University Hospital, Hokkaido, Japan
d Research Division of Genome Companion Diagnostics, Hokkaido University Hospital, Hokkaido, Japan
e Center for Development of Advanced Diagnostics, Hokkaido University Hospital, Hokkaido, Japan
f Department of Surgical Pathology, Hokkaido University Hospital, Hokkaido, Japan
* corresponding author
ABSTRACT
We propose a method to accurately obtain the ratio of tumor cells over an entire histological slide. We use deep fully convolutional neural network models trained to detect and classify cells on images of H&E-stained tissue sections. Pathologists' labels consisting of exhaustive nuclei locations and tumor regions were used to train the models in a supervised fashion. We show that combining two models, each working at a different magnification, allows the system to capture both cell-level details and surrounding context, enabling successful detection and classification of cells as either tumor-cell or normal-cell. Indeed, by conditioning the classification of a single cell on multi-scale context information, our models mimic the process used by pathologists, who assess cell neoplasticity and tumor extent at different microscope magnifications. The ratio of tumor cells can then be readily obtained by counting the number of cells in each class. To analyze an entire slide, we split it into multiple tiles that can be processed in parallel; the overall tumor cell ratio is then aggregated. We perform experiments on a dataset of 100 slides with lung tumor specimens from both resection and tissue micro-array (TMA). We train fully-convolutional models using heavy data augmentation and batch normalization. On an unseen test set, we obtain an average mean absolute error on predicting the tumor cell ratio of less than 6%, which is significantly better than the human average of 20% and is key in properly selecting tissue samples for recent genetic panel tests geared at prescribing targeted cancer drugs. We perform ablation studies to show the importance of training two models at different magnifications and to justify the choice of some parameters, such as the size of the receptive field.
1. INTRODUCTION AND PURPOSE
Targeted treatment therapies for various types of cancer rely on DNA analysis of patients' cancer cells to identify the drugs that would benefit them the most. The mutations and fusions of genes that cause cancer are detected with genomics panel tests run on next-generation sequencing devices. Input material for these tests comes from several types of biopsies, surgical resections and cell-blocks. The tissue samples are fixed in formalin and embedded in paraffin blocks (FFPE). Genomics panel tests based on gene sequencing operations require a minimum ratio of tumor cells to be present in the analyzed tissue sections to provide accurate results. Typically, a set of slide-mounted FFPE, unstained, about 5-micron thick sections is required for the test. For example, in the FDA-approved FoundationOne and Oncomine Dx Target Test, the overall tumor content (ratio) should be more than 20%. For tissues where the overall tumor content is between 10% and 20%, micro-dissection and enrichment should be performed to bring up the overall ratio.

Currently, pathologists manually estimate the overall tumor cell ratio from hematoxylin-eosin (H&E) stained tissue sections, identify areas of high tumor cell ratios and, if needed, micro-dissect tissue fragments to enrich the tumor content to over 20%. This process is highly manual and subjective and could benefit from automation. Furthermore, pathologists' estimation of the tumor cell ratio has been shown to be inaccurate.[4] In particular, that study showed that 38% of the estimations would have led to insufficient ratios (less than 20%), possibly causing false negative results on the genomics panel tests. In addition, these genomics tests are very costly and time-consuming, generally taking a few weeks to complete, and are destructive. Therefore, a method for accurate ratio counting over the entire tumor area is needed. Manual counting of individual cells by pathologists would involve counting tens of thousands of cells, making it highly impractical.

Our goal is to provide an easy-to-use whole-slide interactive system that helps a pathologist select portions of tissue where the ratio of tumor cells is above a safe threshold for use in genomics panel tests. We developed a client-server approach where a browser-based client displays the slide and allows free pan/zoom navigation within the slide. The user submits an analysis request for the entire slide or for a marked portion thereof. The AI analysis server processes the requested area and returns the location of all cells and their classification as tumor or non-tumor. The client browser then displays the tumor cell ratio information to the user such that she can decide which tissue area(s) to select for use in genomics panel tests. An example screenshot of the browser-based client tool is shown in Fig. 1.
Figure 1: Screenshot of our browser-based viewer showing the result of analyzing an entire slide of the test set. The overall ratio of tumor cells is shown on top and individual regions are color-coded to visualize the ratios over areas of the slide. Tumor cells are also overlaid in cyan. The black dots are sometimes added by pathologists directly on the glass slide to indicate the tumor area. Our system is able to automatically detect them and analyze the corresponding area.

2. PREVIOUS WORK
Automated cell counting techniques have been developed in cytology, immuno-histochemistry (IHC) and immuno-fluorescence (IF). In cytology, the most common test is the pap-smear, which is automated and was approved by the United States Food and Drug Administration (FDA) in 1998.[5, 6]
Cytology images exhibit cells floating in liquid, making the detection and analysis by computer vision systems easier than in tissue sections, where the structure of the tissue interferes with the detection and analysis of the cells. In histology, advanced staining techniques such as IHC and IF can make certain mutated cells easy to segment from the background using simple color-based analysis techniques. However, such staining is not appropriate for the task of obtaining the ratio of tumor cells because only certain types of mutated cells are highlighted by the stain, while the ratio calculation requires all cells to be counted.

While no FDA-approved cell counting systems exist for H&E stains, several published research studies have addressed the issues of cell detection, classification and segmentation. Two general approaches have been used to detect cells. The first approach starts by detecting candidate objects on the image using image analysis techniques and then uses machine-learning techniques, such as support vector machines (SVM) or K-nearest-neighbors (KNN), to classify these objects as cells or non-cells using hand-crafted features. More recently, "deep-learning" approaches have taken over and have been shown to be superior to image analysis by learning features of cancer from the data instead of relying on "hand-crafted" heuristic features. Such models have been demonstrated to estimate where cells are located on an image directly, without the need for explicit object detection. Pushing further, some methods directly regress the number of cells present in an image patch. Although direct regression has the advantage of bypassing the explicit detection of cells, it is completely black-box and does not provide any way to explain the prediction to the user. We feel that, at a minimum, the user should be able to see which cells are detected and which are classified as tumor cells, so as to gain confidence in the system's predictions. We follow the general object detection approach proposed in Lempitsky et al.,[11] teaching a deep convolutional neural network model to learn a mapping between an input image and a density map. The density map is generated from the ground-truth-labeled centers of cells' nuclei.

Most recent cell classification methods employ a deep convolutional neural network (CNN) trained in a supervised fashion with backpropagation to classify a small input image patch into distinct types of cells such as epithelial, fibroblast, mitotic figures,[17] etc.
Variants of CNN architectures[13, 18] have been proposed for this kind of approach. Most methods use image data at a resolution of 0.55 microns per pixel (equivalent to a 20X optical magnification) and perform the analysis on a relatively small receptive field. For example, Basha et al.[18] and Sirinukunwattana et al.[13] use a 32x32 and a 27x27 receptive field respectively, which corresponds to about 15 microns, three times the size of a nucleus. Such a small receptive field makes the learning model focus solely on one cell, ignoring the surrounding context.

Segmentation of tumor areas is another related active research topic. Deep learning methods have been shown to produce excellent results on general image segmentation tasks and have recently been applied to histology images for nuclei segmentation and tumor area segmentation.[23, 24] The general principle is the same as for nuclei detection: a model is trained to map an input image to a density map representing the tumor area. For tumor segmentation, the working magnification should be lower than for nuclei detection. While separating individual nuclei requires a high magnification (20X or 40X) and a smaller field of view (50 to 100 microns), segmenting a tumor area necessitates a wider field of view (500 to 1000 microns) with a lower resolution (5X or 10X). Similarly, pathologists observe specimens at low magnification to understand the extent of the tumor area and then zoom in to certain areas to observe finer characteristics such as the morphology of nuclei. We follow this principle in our approach for tumor cell ratio counting. We use a high magnification for the detection of nuclei to ensure that all nuclei can be counted, even in dense areas. This is followed by an analysis at low magnification for tumor area segmentation to ensure that large features of the tumor, such as deformed glands, can be seen by the model. In addition, we combine high and low resolution features to classify individual cells as tumor or normal. This allows us to properly count normal cells appearing in a tumor area, as well as isolated tumor cells in a normal area.

Methods using a larger receptive field and a fully-convolutional approach based on the U-net CNN architecture[25] have been demonstrated for cell segmentation in DAPI and FISH staining. Such methods are ideal for image segmentation as they produce a binary segmentation map directly from the image input. We adapt this segmentation method to cell counting by generating target cell location maps where cells are represented by Gaussian peaks rather than by their actual shape. From the predicted map, the (x, y) location of cells is obtained using only local peak detection. This approach avoids the issue of adjacent cells forming clumps that cannot be easily separated into individual cells for counting. We further adapt the method by multi-tasking detection and classification on the same model, learning to predict two target maps: one for detection, where all cells are drawn, and one for classification, where only the tumor cells are drawn.
3. MATERIAL AND METHODS

3.1 Data
One hundred WSI slides were obtained from a cohort of 100 lung cancer cases at Hokkaido University Hospital (Hokkaido, Japan). Fifty-five cases were categorized as adenocarcinoma (AC) and 45 as squamous-cell carcinoma (SC). The specimens were acquired between 2005 and 2010. Thirty slides contained a tissue micro-array (TMA) specimen (15 AC and 15 SC), while seventy slides contained tissue from surgical resections (40 AC and 30 SC). The tissue specimens were prepared with the Formalin-Fixed Paraffin-Embedded (FFPE) method and stained with Hematoxylin-Eosin (H&E). The slides were scanned using a Hamamatsu scanner into whole slide image (WSI) files. 131 regions of interest (ROI) were selected by a pathologist (KCH) for annotations. These ROIs were 1980x1980 pixels at level-0 (40X) magnification, corresponding to a standard microscope high-power field. Two types of annotations were obtained from a pathologist: the point location of the center of each cell's nucleus and freehand traces of the tumor areas.

We combine two deep neural network (DNN) models and train them in a supervised fashion, one to detect and classify cells as normal or tumor, the other to segment tumor areas. Following the work of Lempitsky et al.,[11] we train the DNN models to learn a mapping between an input RGB image patch and a target map. For the first model, two target maps are concatenated to form the target output for training. One target map represents the position of the center of all cells' nuclei present in the input image; this map is used for nuclei detection. The second target map represents the positions of the tumor cells' nuclei and is used for nuclei classification. The maps are built by drawing disks at the (x, y) positions of the nuclei centers and then transforming the disks into Gaussian peaks by convolving a Gaussian kernel with the map (see the left side of Fig. 2 for an illustration of the two nuclei maps and their corresponding input image and annotations). The reason for this step is to train the model to produce a smooth peak at a nucleus position, making it easier for post-inference processing to detect the nuclei individually, especially where groups of nuclei are bunched up together. This way, a simple and fast local peak detector checking a 3x3 neighborhood can detect the peaks directly from the DNN output map. For the second model, the target map is generated in a similar way, at lower magnification, by drawing a filled freehand tumor area against the background. See the right side of Fig. 2 for an illustration of the tumor area map and its corresponding input image and annotations (dashed contour lines).
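As an illustration, the target-map construction can be sketched as follows; the disk radius and Gaussian width shown here are placeholder values, not the exact ones used in our experiments.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_nuclei_target_map(centers, shape, disk_radius=5, sigma=3):
    """Build a target map with a smooth Gaussian peak at each nucleus center.

    centers: iterable of (x, y) pixel coordinates of nucleus centers
    shape:   (height, width) of the output map
    disk_radius, sigma: illustrative values, not taken from the paper
    """
    target = np.zeros(shape, dtype=np.float32)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    for x, y in centers:
        # draw a filled disk at the nucleus center
        target[(yy - y) ** 2 + (xx - x) ** 2 <= disk_radius ** 2] = 1.0
    # convolve with a Gaussian kernel to turn the disks into smooth peaks
    target = gaussian_filter(target, sigma=sigma)
    if target.max() > 0:
        target /= target.max()  # normalize peak heights to 1
    return target

def make_detection_classification_target(all_centers, tumor_centers, shape):
    """Two concatenated maps for the first model: one from all nuclei (detection),
    one from tumor nuclei only (classification)."""
    det = make_nuclei_target_map(all_centers, shape)
    cla = make_nuclei_target_map(tumor_centers, shape)
    return np.stack([det, cla], axis=0)  # shape (2, H, W)
```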
Figure 2: System overview. The left part shows the high-magnification process of detecting cells and classifying them as tumor, while the right part shows the low-magnification process of segmenting the tumor area. Scores from individual cell classification and tumor area segmentation are combined to obtain a tumor score for each cell. The ratio of tumor cells can then be readily computed. Ground-truth annotations used to generate the targets are shown as overlays on the input images (green dots mark normal cells, red dots mark tumor cells, dotted teal lines mark the tumor areas).

Our choice for model architecture is fully-convolutional. These models are the natural choice for the detection of multiple objects as they conserve the image's 2D relationships throughout the layers. We experimented with both U-net[25] and Resnet with a fully-convolutional head. These models have been applied successfully in a wide variety of image problems. We found that, for our application, U-net has a small performance edge, a simpler architecture and a smaller footprint. U-net models have an encoder-decoder architecture with a bottleneck in the middle. The number of feature planes increases as their size decreases in the encoder, with the reverse happening in the decoder. The convolutional units in the decoder are implemented with transposed convolutions, which can be seen as the gradient of the convolution with respect to its input. A graphical representation of the model is shown in figure 3, left. We choose the number of convolutional blocks such that the model's receptive field is 188x188 pixels, which, at 40X magnification (4.4 pixels per micron), corresponds to a patch of 43x43 microns (172x172 microns at 10X). Being a fully convolutional model, we can size it such as to optimize the number of models that can occupy a GPU memory. For example, for a 2-model configuration, an input image of 800x800 pixels allows both models to be loaded in a 1080 GPU (8GB). Individual layers of the U-net model are listed in figure 3.

Figure 3: Architecture of the U-net fully-convolutional model. The left image shows the graphical representation of the model (for a 572x572 input image). The middle and right tables show the actual size of the layers for the encoder and decoder blocks respectively, for an 800x800 input image and 2 output maps. The total number of parameters is 28,942,850 and the total size of the model for inference (not counting training gradients) is about 3GB.
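For illustration, a compact PyTorch sketch of such a U-net is given below. The depth and channel counts are representative rather than the exact configuration of figure 3; the model outputs raw logits, with the sigmoid applied inside the loss.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Two 3x3 convolutions, each followed by batch normalization and ReLU."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.block(x)

class UNet(nn.Module):
    """Encoder-decoder with skip connections and transposed-convolution upsampling."""
    def __init__(self, in_ch=3, out_maps=2, base=64, depth=4):
        super().__init__()
        self.downs, self.ups, self.tconvs = nn.ModuleList(), nn.ModuleList(), nn.ModuleList()
        feats = [base * 2 ** i for i in range(depth)]  # e.g. 64, 128, 256, 512
        ch = in_ch
        for f in feats:                                # encoder
            self.downs.append(ConvBlock(ch, f)); ch = f
        self.bottleneck = ConvBlock(ch, ch * 2); ch *= 2
        for f in reversed(feats):                      # decoder
            self.tconvs.append(nn.ConvTranspose2d(ch, f, 2, stride=2))
            self.ups.append(ConvBlock(f * 2, f)); ch = f
        self.head = nn.Conv2d(ch, out_maps, 1)         # 2 maps: detection and classification
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        skips = []
        for down in self.downs:
            x = down(x); skips.append(x); x = self.pool(x)
        x = self.bottleneck(x)
        for tconv, up, skip in zip(self.tconvs, self.ups, reversed(skips)):
            x = tconv(x)
            x = up(torch.cat([skip, x], dim=1))        # skip connection from the encoder
        return self.head(x)                            # raw logits; sigmoid applied in the loss

model = UNet()                                         # detection/classification model, 2 output maps
out = model(torch.randn(1, 3, 800, 800))               # -> (1, 2, 800, 800)
```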
Both the detection/classification model and the segmentation model are trained independently using the same procedure. Their outputs are combined to detect and classify the cells. Instead of jointly optimizing the models, we train them separately and only subsequently optimize the detection and classification thresholds to achieve the lowest possible error on the validation set. This approach is preferable as the inputs to the two models are different and the loss function may be dominated by one model.

We partition the annotated data into three sets. Seventy percent of the data is used to train the model, ten percent is used for validation of the model and the remaining twenty percent is used only for the final evaluation. These subsets are built such that all annotation ROIs from a given slide go into the same subset.

H&E stained specimens exhibit shades of two staining agents (blue-purple for hematoxylin, which colors the nuclei, and reddish-pink for eosin, which colors the cytoplasm and extracellular matrix). The amount and proportion of staining agents, the age of the sample and the type of scanner significantly affect the final color of the pixels. Hence, in order to create a robust model, it is necessary to make sure there is as much staining and scanning variation as possible in the training set. Unfortunately, it is difficult to procure samples with such variations, as cohorts tend to come from the same institution and therefore have been stained and scanned using the same protocol. To compensate for this relative uniformity in our training data and to make sure our models will perform adequately on specimens from other institutions, we apply data augmentation to simulate variations in staining, color shifts and image sharpness. To simulate staining variations, we use an optical-density based stain projection method[29] and shift the pixel intensities in the projection spaces. Figure 4 shows an example of stain augmentation using our method. To simulate color shifts due to the scanner light, we apply intensity shifts in the Hue-Saturation-Luminance color space. To simulate variations in scanner optics and focusing mechanisms, we apply small amounts of pixel blur/sharpen. Finally, since image orientation does not affect the labels, we also increase the number of examples by random rotation and mirroring.

Figure 4: Left side: receptive field of the model at 10X and 40X magnification. Right side: example of data augmentation by H&E stain shifts.
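A simplified sketch of optical-density based stain augmentation is shown below, using standard H&E stain vectors in the spirit of color deconvolution;[29] the stain matrix and perturbation ranges are illustrative, not the exact values used in our pipeline.

```python
import numpy as np

# Commonly used H&E optical-density stain vectors (hematoxylin, eosin, residual);
# illustrative values, not the exact basis used in our experiments.
HE_STAINS = np.array([[0.650, 0.704, 0.286],
                      [0.072, 0.990, 0.105],
                      [0.268, 0.570, 0.776]])

def augment_stain(rgb, alpha_range=0.05, beta_range=0.01, rng=None):
    """Randomly scale and shift each stain channel in optical-density space.

    rgb: uint8 image of shape (H, W, 3); perturbation ranges are placeholders.
    """
    rng = np.random.default_rng() if rng is None else rng
    od = -np.log((rgb.astype(np.float32) + 1.0) / 256.0)   # RGB -> optical density
    conc = od.reshape(-1, 3) @ np.linalg.inv(HE_STAINS)    # project onto the stain basis
    alpha = 1.0 + rng.uniform(-alpha_range, alpha_range, size=3)
    beta = rng.uniform(-beta_range, beta_range, size=3)
    conc = conc * alpha + beta                              # shift intensities per stain
    od_aug = conc @ HE_STAINS                               # back to optical density
    rgb_aug = 256.0 * np.exp(-od_aug) - 1.0                 # back to RGB
    return np.clip(rgb_aug, 0, 255).reshape(rgb.shape).astype(np.uint8)
```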
The model is trained with the binary cross-entropy loss combined with a sigmoid layer σ(x):

L_BCE = -\frac{1}{N} \sum_{n}^{N} \left[ y_n \ln \sigma(x_n) + (1 - y_n) \ln(1 - \sigma(x_n)) \right]    (1)

where N is the number of pixels in the output map. We experimented with batch normalization and found it to be useful in providing a smooth learning curve. We use the PyTorch toolchain[30] to train our models using the Adam optimizer and a learning rate of 1e-3. Training a model takes about 3 hours on a GPU. To avoid overfitting the model on the training set, training is stopped when the loss on the validation set stops decreasing for 4 epochs. At each epoch the model is trained with 4000 examples and achieves convergence in about 50 epochs. Figure 5 shows the training curve of our model.

Figure 5: Training curve showing the loss at each epoch. In red is the loss over the training set, while in blue is the loss over the validation set. The training is stopped when the loss on the validation set ceases to decrease.

So far the models have been trained individually to minimize the cross-entropy loss, which is a reconstruction loss on the detection, classification and segmentation maps. The real goal, however, is to obtain a list of cells, each with an (x, y) location and a tumor score. The detection/classification model outputs two floating-point maps, map_d and map_c, while the segmentation model outputs map_s. From map_d, the detection of the cells is performed using local peak detection over a 3x3 pixel neighborhood on the detection map (see figure 6, center). Since the model is trained to predict smooth peaks, non-maxima suppression is not necessary and every detected peak is considered a candidate cell. The intensity values at the (x, y) location of cell i are recorded for all three maps, resulting in a feature vector f_i = [I_d, I_c, I_s]. A classifier is then constructed with two thresholds, the detection threshold t_d and the classification threshold t_c. Considering the feature vector f_i = [I_d, I_c, I_s] of all candidate cells, cells where I_d < t_d are discarded. Then, cells where αI_c + (1 − α)I_s > t_c are classified as tumor cells. α is a hyper-parameter that weighs the classification and the segmentation model outputs to generate a score. We use a value of 0.
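A minimal sketch of this post-inference step is given below, assuming the three output maps have been resampled to a common coordinate system; the helper names are illustrative.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def detect_peaks(map_d):
    """Return (x, y) locations where map_d is a local maximum in its 3x3 neighborhood.
    Weak background maxima are later removed by the detection threshold t_d."""
    local_max = (map_d == maximum_filter(map_d, size=3))
    ys, xs = np.nonzero(local_max)
    return list(zip(xs, ys))

def classify_cells(map_d, map_c, map_s, t_d, t_c, alpha):
    """Score each candidate cell and classify it as tumor / normal.

    map_d, map_c, map_s: detection, classification and segmentation output maps
    (assumed resampled to the same size); t_d, t_c and alpha as defined in the text.
    """
    cells = []
    for x, y in detect_peaks(map_d):
        i_d, i_c, i_s = map_d[y, x], map_c[y, x], map_s[y, x]
        if i_d < t_d:                        # discard weak detections
            continue
        score = alpha * i_c + (1 - alpha) * i_s
        cells.append((x, y, score > t_c))    # True = classified as tumor cell
    return cells

def tumor_cell_ratio(cells):
    """Predicted tumor cell ratio: tumor cells over all detected cells."""
    n_tumor = sum(is_tumor for _, _, is_tumor in cells)
    return n_tumor / max(len(cells), 1)
```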
The first step is to evaluate the detection accuracy of the detection model. We declare a detected cell a match when the detected peak (x, y) location is within a distance of 3.2 microns (14 pixels at 40X magnification, 4.4 pixels per micron resolution) of a labeled cell. Each detected cell is matched to its closest unmatched label, in a greedy way. Matched cells are true positives (TP), leftover unmatched cells are false positives (FP) and unmatched labels are false negatives (FN). The detection accuracy is thus computed as ACC(det) = TP / (TP + FP + FN), the precision as PRE(det) = TP / (TP + FP) and the recall as REC(det) = TP / (TP + FN). The detection F1 score, also known as the Dice coefficient, is defined as F1(det) = 2 · PRE(det) · REC(det) / (PRE(det) + REC(det)).

Figure 6: Detection output map (center) and its 3D view (right) for the input image (left). Locations of peaks are overlaid on the images.
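A minimal sketch of the greedy matching and of the detection metrics, assuming detections and labels are given as pixel coordinates at 40X:

```python
import numpy as np

def match_detections(detections, labels, max_dist=14.0):
    """Greedily match each detection to its closest unmatched label within max_dist pixels
    (14 pixels corresponds to 3.2 microns at 40X). Simplified: detections are visited in order."""
    unmatched = list(labels)
    tp = 0
    for dx, dy in detections:
        if not unmatched:
            break
        d2 = [(dx - lx) ** 2 + (dy - ly) ** 2 for lx, ly in unmatched]
        j = int(np.argmin(d2))
        if d2[j] <= max_dist ** 2:
            unmatched.pop(j)   # a matched label can only be used once
            tp += 1
    fp = len(detections) - tp  # detections without a label
    fn = len(unmatched)        # labels without a detection
    return tp, fp, fn

def detection_metrics(tp, fp, fn):
    """Accuracy, precision, recall and F1 score as defined in the text."""
    acc = tp / (tp + fp + fn)
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * pre * rec / (pre + rec)
    return acc, pre, rec, f1
```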
For classification, we calculate the accuracy using the matched detected cells only. A matched cell that is classified as tumor and whose label is also tumor is declared a true positive (TP). A matched cell that is classified as tumor and whose label is normal is declared a false positive (FP). A matched cell that is classified as normal and whose label is also normal is declared a true negative (TN). A matched cell that is classified as normal and whose label is tumor is declared a false negative (FN). The classification accuracy is thus computed as ACC(cla) = (TP + TN) / (TP + TN + FP + FN). For tumor cells (positives), the precision is calculated as PRE_pos(cla) = TP / (TP + FP) and the recall as REC_pos(cla) = TP / (TP + FN). For normal cells (negatives), the precision is PRE_neg(cla) = TN / (TN + FN) and the recall REC_neg(cla) = TN / (TN + FP). The classification F1 score is defined as F1(cla) = 2 · P · R / (P + R), where P = (PRE_pos(cla) + PRE_neg(cla)) / 2 and R = (REC_pos(cla) + REC_neg(cla)) / 2 are the average precision and recall over the detected tumor cells and non-tumor cells.

The predicted ratio of tumor cells is defined as \widehat{TCR} = N_T / N, where N_T is the number of cells classified as tumor and N is the total number of detected cells.

We create an evaluation set by combining the training and the validation sets. Using this evaluation set, we perform a grid search for the best threshold pair (t_d, t_c). For the detection threshold t_d, the criterion we use is to maximize the detection F1 score. For classification, the criterion is to minimize the mean absolute error on the predicted ratio:

E_TCR = \frac{1}{N} \sum_{n}^{N} \left| \widehat{TCR} - TCR \right|    (2)

Since we use two distinct criteria, each threshold is searched independently: first the optimal detection threshold, followed by the classification threshold. We found that jointly searching for both thresholds using a single classification criterion yielded worse results and took longer to perform. Once the optimal thresholds are obtained on the evaluation set, the final performance on the test set can be obtained.
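The two-stage threshold search can be sketched as follows, reusing the helper functions from the previous sketches; the structure of the evaluation ROIs and the grid values are illustrative placeholders.

```python
import numpy as np

def search_thresholds(eval_rois, alpha=0.5, grid=np.linspace(0.05, 0.95, 19)):
    """Two-stage grid search: t_d maximizes the detection F1 score, then t_c minimizes
    the mean absolute error of the predicted tumor cell ratio (E_TCR).

    eval_rois: list of dicts holding the model output maps and ground-truth labels for each
    annotated ROI of the evaluation set (hypothetical structure). alpha and grid are placeholders.
    """
    # Stage 1: detection threshold (detection does not depend on t_c, so 0.5 is arbitrary here).
    def mean_f1(t_d):
        scores = []
        for roi in eval_rois:
            cells = classify_cells(roi["map_d"], roi["map_c"], roi["map_s"], t_d, 0.5, alpha)
            tp, fp, fn = match_detections([(x, y) for x, y, _ in cells], roi["all_nuclei"])
            scores.append(detection_metrics(tp, fp, fn)[3])
        return np.mean(scores)
    t_d = max(grid, key=mean_f1)

    # Stage 2: classification threshold, minimizing the ratio MAE with t_d fixed.
    def ratio_mae(t_c):
        errors = []
        for roi in eval_rois:
            cells = classify_cells(roi["map_d"], roi["map_c"], roi["map_s"], t_d, t_c, alpha)
            errors.append(abs(tumor_cell_ratio(cells) - roi["true_ratio"]))
        return np.mean(errors)
    t_c = min(grid, key=ratio_mae)
    return t_d, t_c
```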
4. RESULTS
In a first experiment, we investigate the importance of the magnification level. For the detection of cells, plot 7.a) shows that higher magnifications are more accurate, with a dramatic drop in accuracy between magnifications 16X and 12X. This is easy to understand intuitively, as the cells become closer and increasingly harder to tell apart at lower magnifications. To show the impact of magnification on the classification of cells, we use ground-truth cell positions and use the tumor cell density output map to classify them as normal or tumor. The trend on plot 7.b) shows a clear decrease for higher magnifications. The intuition here is that a larger field of view is useful to understand the exact extent of the tumor. We then apply detection and classification, both at the same magnification. The trend on plot 7.c) shows a maximum at around 24X. This can be understood as a combination of the two previous experiments, exhibiting a sweet spot where each cell's nucleus can still be accurately separated from the others, while providing enough context to classify them as tumor or normal.
Figure 7: Importance of the magnification level (x-axis: resize factor; a factor of 1.0 corresponds to 40X magnification).

In light of the above result, and to take full advantage of both the higher detection accuracy at higher magnifications and the higher classification accuracy at lower magnifications, we train two separate models at different magnifications. We then combine these models using additive ensembling and report the results in table 1. The best result of 5.3% mean absolute error (MAE) on the predicted ratio is obtained by detection at 20X, followed by adding the scores from the 20X and 10X tumor cell density maps. This approach can also be seen as conditioning the classification of detected cells on a larger context of tissue. We also show that the U-net architecture returns slightly better results.
Table 1: Results for 2 model architectures and 2 configurations of the detection/classification (DT+CL) model and the segmentation model (SEG). We report E_TCR, the MAE of the predicted tumor cell ratio, and the processing speed in mm²/sec. We trained these models on 3 different random partitions of the data and report the mean values and standard deviations. (Values not shown were lost in extraction.)

  Architecture          Configuration           number of model(s)   E_TCR        mm²/sec
  Resnet50 + FC head    DT+CL@20X               1                    6.6% ± -     -
  Resnet50 + FC head    DT+CL@20X + SEG@10X     2                    -            -
  U-net                 DT+CL@20X               1                    -            -
  U-net                 DT+CL@20X + SEG@10X     2                    -            -

Table 2 reports the cell-level accuracy and F1 score, and table 3 summarizes the processing time on various hardware configurations.

Table 2: Cell-level accuracy and F1 score for the U-net DT+CL@10X configuration. We report the detection accuracy and F1 score as well as the classification accuracy and F1 score. The classification accuracy and F1 score are obtained on the correctly detected cells. We trained this model on 3 different random partitions of the data and report the mean values and standard deviations.

  Detection accuracy (%)         92.9 ± -
  Detection F1 score (%)         -
  Classification accuracy (%)    -
  Classification F1 score (%)    -

Table 3: Left: speed and acceleration factors for various CPU/GPU hardware setups, for a 1-model DT+CL@10X configuration (see figure 7). The speed is given in square millimeters of tissue processed per second. A large resection tissue sample may be as large as 600 mm², while core needle samples are usually about 10 mm². Right: timing profile for the fastest configuration.

  Hardware configuration        mm²/sec   speedup
  3 GPUs and 9 CPU cores        3.7       46X
  3 GPUs and 6 CPU cores        3.3       40X
  2 GPUs and 4 CPU cores        2.5       30X
  1 GPU and 2 CPU cores         1.7       20X
  0 GPU and 4 CPU cores         0.18      2X
  0 GPU and 1 CPU core          0.08      1X

  Operation          % of total time
  Read pixels        42
  Model inference    36
  Save result        11
  Peak detect        7
  Model setup        4
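As an illustration of the whole-slide processing, the sketch below splits the analysis into tiles processed in parallel and aggregates the per-tile counts into the overall ratio; the analyze_tile interface is a hypothetical placeholder for the per-tile model inference.

```python
from concurrent.futures import ProcessPoolExecutor

def analyze_slide(tile_coords, analyze_tile, workers=4):
    """Process whole-slide tiles in parallel and aggregate the tumor cell ratio.

    tile_coords:  list of (x, y, w, h) tile rectangles covering the tissue
    analyze_tile: callable returning (n_tumor, n_total) for one tile
                  (hypothetical interface; in practice one worker per GPU)
    """
    n_tumor, n_total = 0, 0
    per_tile = []
    with ProcessPoolExecutor(max_workers=workers) as pool:
        for (x, y, w, h), (t, n) in zip(tile_coords, pool.map(analyze_tile, tile_coords)):
            per_tile.append((x, y, w, h, t / max(n, 1)))  # per-tile ratio for the heatmap
            n_tumor += t
            n_total += n
    overall_ratio = n_tumor / max(n_total, 1)             # aggregated tumor cell ratio
    return overall_ratio, per_tile
```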
5. CONCLUSIONS
We show that conditioning the classification of detected cells on tissue context information provides increased accuracy. Hence, we propose a model that combines high-magnification cell detection with low-magnification tumor area segmentation, taking advantage of the high resolution to accurately separate cells, while using a larger field of view to better classify them as normal or tumor cells. We use fully convolutional DNN architectures and predict density maps from which cell counts can be readily extracted. Finally, we propose a whole-slide approach to calculate the tumor cell ratio by splitting the tissue into square tiles, processing them in parallel and displaying a heatmap of the tumor cell ratio in a browser-based client. The achieved MAE of 5.3% on the predicted tumor cell ratio is significantly better than the reported human estimation error. We are currently collecting an extended dataset from different hospitals to test our system for variations in staining protocols. We expect it to be robust against such variations thanks to the data augmentation applied during training of the models, which simulates them.

Our main contributions are: 1) a novel multi-scale DNN tumor cell detection and classification model which takes advantage of both high magnification, to accurately distinguish individual cells, and low magnification, to classify them as tumor or normal based on a large tissue context; 2) a novel whole-slide tumor cell ratio counter that is highly accurate, at 6% mean absolute error.
REFERENCES
[3] ... Journal of Clinical Pathology (11), 923–931 (2014).
[4] Smits, A. J., Kummer, J. A., De Bruin, P. C., Bol, M., Van Den Tweel, J. G., Seldenrijk, K. A., Willems, S. M., Offerhaus, G. J. A., De Weger, R. A., Van Diest, P. J., et al., "The estimation of tumor cell percentage for molecular testing by pathologists is not accurate," Modern Pathology (2), 168–174 (2014).
[5] Bergeron, C., Masseroli, M., Ghezi, A., Lemarie, A., Mango, L., and Koss, L. G., "Quality control of cervical cytology in high-risk women. PAPNET system compared with manual rescreening," Acta Cytologica (2), 151 (2000).
[6] Tench, W. D., "Validation of AutoPap primary screening system sensitivity and high-risk performance," Acta Cytologica (2), 296–302 (2002).
[7] Cortes, C. and Vapnik, V., "Support-vector networks," Machine Learning (3), 273–297 (1995).
[8] Cosatto, E., Miller, M., Graf, H. P., and Meyer, J. S., "Grading nuclear pleomorphism on histological micrographs," in [ ], 1–4, IEEE (2008).
[9] Arteta, C., Lempitsky, V., Noble, J. A., and Zisserman, A., "Learning to detect cells using non-overlapping extremal regions," in [International Conference on Medical Image Computing and Computer-Assisted Intervention], 348–356, Springer (2012).
[10] Sertel, O., Kong, J., Shimada, H., Catalyurek, U. V., Saltz, J. H., and Gurcan, M. N., "Computer-aided prognosis of neuroblastoma on whole-slide images: Classification of stromal development," Pattern Recognition (6), 1093–1103 (2009).
[11] Lempitsky, V. and Zisserman, A., "Learning to count objects in images," in [Advances in Neural Information Processing Systems], 1324–1332 (2010).
[12] Xie, W., Noble, J. A., and Zisserman, A., "Microscopy cell counting and detection with fully convolutional regression networks," Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization (3), 283–292 (2018).
[13] Sirinukunwattana, K., Raza, S. E. A., Tsang, Y.-W., Snead, D. R., Cree, I. A., and Rajpoot, N. M., "Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images," IEEE Transactions on Medical Imaging (5), 1196–1206 (2016).
[14] Xue, Y., Ray, N., Hugh, J., and Bigras, G., "Cell counting by regression using convolutional neural network," in [European Conference on Computer Vision], 274–290, Springer (2016).
[15] LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P., "Gradient-based learning applied to document recognition," Proceedings of the IEEE (11), 2278–2324 (1998).
[16] Rumelhart, D. E., Hinton, G. E., and Williams, R. J., "Learning representations by back-propagating errors," Nature (6088), 533–536 (1986).
[17] Malon, C. D. and Cosatto, E., "Classification of mitotic figures with convolutional neural networks and seeded blob features," Journal of Pathology Informatics (2013).
[18] Basha, S. S., Ghosh, S., Babu, K. K., Dubey, S. R., Pulabaigari, V., and Mukherjee, S., "RCCNet: An efficient convolutional neural network for histological routine colon cancer nuclei classification," in [ ], 1222–1227, IEEE (2018).
[19] He, K., Gkioxari, G., Dollár, P., and Girshick, R., "Mask R-CNN," in [Proceedings of the IEEE International Conference on Computer Vision], 2961–2969 (2017).
[20] Naylor, P., Laé, M., Reyal, F., and Walter, T., "Segmentation of nuclei in histopathology images by deep regression of the distance map," IEEE Transactions on Medical Imaging (2), 448–459 (2018).
[21] Graham, S., Vu, Q. D., Raza, S. E. A., Azam, A., Tsang, Y. W., Kwak, J. T., and Rajpoot, N., "Hover-Net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images," Medical Image Analysis, 101563 (2019).
[22] Raza, S. E. A., Cheung, L., Shaban, M., Graham, S., Epstein, D., Pelengaris, S., Khan, M., and Rajpoot, N. M., "Micro-Net: A unified model for segmentation of various objects in microscopy images," Medical Image Analysis, 160–173 (2019).
[23] Lahiani, A., Gildenblat, J., Klaman, I., Navab, N., and Klaiman, E., "Generalising multistain immunohistochemistry tissue segmentation using end-to-end colour deconvolution deep neural networks," IET Image Processing (7), 1066–1073 (2019).
[24] Xu, J., Luo, X., Wang, G., Gilmore, H., and Madabhushi, A., "A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images," Neurocomputing, 214–223 (2016).
[25] Ronneberger, O., Fischer, P., and Brox, T., "U-Net: Convolutional networks for biomedical image segmentation," in [International Conference on Medical Image Computing and Computer-Assisted Intervention], 234–241, Springer (2015).
[26] Rajkumar, U., Turner, K., Luebeck, J., Deshpande, V., Chandraker, M., Mischel, P., and Bafna, V., "ecSeg: Semantic segmentation of metaphase images containing extrachromosomal DNA," iScience.
[28] ... in [Thirty-first AAAI Conference on Artificial Intelligence] (2017).
[29] Ruifrok, A. C., Johnston, D. A., et al., "Quantification of histochemical staining by color deconvolution," Analytical and Quantitative Cytology and Histology (4), 291–299 (2001).
[30] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al., "PyTorch: An imperative style, high-performance deep learning library," in [Advances in Neural Information Processing Systems], 8026–8037 (2019).
[31] Zhang, Z., "Improved Adam optimizer for deep neural networks," in [2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS)] (2018).