Deep learning-based transformation of the H&E stain into special stains improves kidney disease diagnosis

Abstract

Pathology is practiced by visual inspection of histochemically stained slides. Most commonly, the hematoxylin and eosin (H&E) stain is used in the diagnostic workflow and it is the gold standard for cancer diagnosis. However, in many cases, especially for non-neoplastic diseases, additional "special stains" are used to provide different levels of contrast and color to tissue components and allow pathologists to get a clearer diagnostic picture. In this study, we demonstrate the utility of supervised learning-based computational stain transformation from H&E to different special stains (Masson's Trichrome, periodic acid-Schiff and Jones silver stain) using tissue sections from kidney needle core biopsies. Based on evaluation by three renal pathologists, followed by adjudication by a fourth renal pathologist, we show that the generation of virtual special stains from existing H&E images improves the diagnosis in several non-neoplastic kidney diseases, sampled from 16 unique subjects. Adjudication of N=48 diagnoses from the three pathologists revealed that the virtually generated special stains yielded 22 improvements (45.8%), 23 concordances (47.9%) and 3 discordances (6.3%), when compared against the use of H&E stained tissue only. As the virtual transformation of H&E images into special stains can be achieved in less than 1 min per patient core specimen slide, this stain-to-stain transformation framework can improve the quality of the preliminary diagnosis when additional special stains are needed, along with significant savings in time and cost, reducing the burden on the healthcare system and patients.

Kevin de Haan, Yijie Zhang, Tairan Liu, Anthony E. Sisk, Miguel F. P. Diaz, Jonathan E. Zuckerman, Yair Rivenson, W. Dean Wallace, Aydogan Ozcan

Electrical and Computer Engineering Department, University of California, Los Angeles, CA, USA
Bioengineering Department, University of California, Los Angeles, CA, USA
California NanoSystems Institute (CNSI), University of California, Los Angeles, CA, USA
Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
Kaiser Permanente Los Angeles Medical Center, Department of Pathology, Los Angeles, CA, USA
Department of Pathology and Laboratory Medicine, Keck School of Medicine of USC, Los Angeles, CA, USA
Department of Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA, USA

Corresponding authors: Yair Rivenson; [email protected] W. Dean Wallace; [email protected] Aydogan Ozcan; [email protected]

Introduction

Histological analysis of stained human tissue samples is the gold standard for evaluation of many diseases, as the fundamental basis of any pathologic evaluation is the examination of histologically stained tissue affixed on a glass slide using either a microscope or a digitized version of the histologic image following the image capture by a whole slide image (WSI) scanner. The histological staining step is a critical part of the pathology workflow and is required to provide contrast and color to tissue by facilitating a chromatic distinction among different tissue constituents. The most common stain (otherwise referred to as the routine stain) is hematoxylin and eosin (H&E), which is applied to nearly all clinical cases, covering ~80% of all the human tissue staining performed globally. The standard H&E stain is relatively easy to perform and is standardized across the industry, allowing pathologists and researchers to easily interpret histologic images from anywhere around the world. In addition to H&E, there are a variety of other histological stains with different properties which are used by pathologists to better highlight different tissue constituents. For example, Masson’s trichrome (MT) stain is used to view connective tissue and periodic acid-Schiff (PAS) can be used to better scrutinize basement membranes; both features have importance in some disease types such as non-neoplastic kidney disease. These non-H&E stains are also called “special stains” and their use is the standard of care in the pathologic evaluation of certain disease entities including non-neoplastic kidney, liver and lung diseases, among others.

The traditional histopathology workflow can be time consuming and expensive, and it requires laboratory infrastructure. Tissue must first be sampled from the patient, fixed either through freezing in Optimal Cutting Temperature (OCT) compound or through paraffin embedding, sliced into thin (2-10 μm) sections, and mounted onto a glass slide.
Only then can these sections be stained using the desired chemical staining procedure. Furthermore, if multiple stains are needed, multiple tissue sections are cut, and a separate procedure must be used for each stain. While H&E staining is performed using a streamlined staining procedure, the special stains often require more preparation time, effort and monitoring by a histotechnologist, which increases the cost of the procedure and takes additional time to produce. This can in turn increase the time for diagnosis, especially when a pathologist determines that these additional special stains are needed after the H&E stained tissue has been examined. The tissue sectioning and staining procedure may therefore need to be repeated for each special stain, which is wasteful in terms of resources and materials, and might place a burden on both the healthcare system and patients if there is an urgent need for a diagnosis.

Recognizing some of these limitations, different approaches have been developed to improve the histopathology workflow. Histological staining has been reproduced by imaging rapidly labeled tissue sections (usually by a nuclear staining dye) using an alternative contrast mechanism acquired by, e.g., non-linear microscopy or ultraviolet tissue surface excitation, and digitally transforming the captured images into user-calibrated H&E-like images. These approaches mainly focus on eliminating tissue fixation from the workflow, targeting rapid intraoperative contrast to unfixed specimens.

More recently, computational staining techniques known as “virtual staining” have been developed. Using deep learning, virtual staining has been applied on label-free (i.e., unstained) fixed and glass slide affixed tissue sections using various modalities such as autofluorescence, hyperspectral imaging, quantitative phase imaging, and others.
Virtual staining of label-free tissue not only has the ability to reduce costs and allow for faster staining, but also allows the user to perform further advanced analysis on the tissue, since it avoids the destructive additional sectioning and staining that can deplete the specimen and lead to, e.g., additional or unnecessary biopsies from the patients. Furthermore, virtual staining of label-free tissue enables new capabilities such as the use of multiple virtual stains upon a single tissue section, stain normalization (i.e., standardization), and region-of-interest specific digital blending of multiple stains, all of which are challenging or highly impractical with standard histochemical staining workflows.

An alternative approach that can be used to bypass histochemical tissue staining is to computationally transform the WSI of an already stained tissue into another stain (this will be referred to as “stain transformation”). This allows users to reduce the number of physical stains required without making any changes to their traditional histopathology workflow, and also carries many of the benefits of the virtual staining techniques such as improving stain consistency and reduction in stain preparation time. Different stain transformations have been demonstrated in the literature, e.g., transformation of H&E into MT, or generation of fibroblast activation protein-cytokeratin (FAP-CK) images, a duplex immunohistochemistry (IHC) protocol, from images of Ki67-CD8 stained slides. Stain transformations have also been used as a tool to improve the effectiveness of image segmentation algorithms. However, many of these stain transformation techniques rely upon unsupervised approaches which use distribution matching losses, such as those used by cycle consistent generative adversarial networks (GANs), also known as CycleGANs.
It has been shown that, when applied to medical imaging, neural networks trained using only these types of distribution matching losses are prone to hallucinations. Some researchers have been able to avoid the use of these distribution matching losses and unpaired image data by training networks to perform other stain-to-stain transformations. For example, a stain transformation network was trained using image pairs acquired from adjacent tissue sections, while another work used image pairs captured by chemically de-staining and then re-staining the same tissue sections.

In this paper, we present a supervised deep learning-based stain transformation framework, outlined in Figure 1. Supervised training of this stain transformation workflow is achieved with the help of another deep learning-based inference framework: virtual staining of label-free tissue samples based on their autofluorescence images (see Figure 1). This label-free virtual staining method helped us generate precisely registered training pairs of (1) H&E images and (2) the corresponding special stain images of the same tissue sections, all virtually generated. This created a spatially registered (i.e., perfectly paired) training image dataset and allowed the stain transformation network to be trained without relying on unpaired image data and corresponding distribution matching losses. Furthermore, no stain-to-stain image aberrations or misalignments exist in this training data because the source of information (autofluorescence of the label-free tissue) is common for all the virtually stained images. This feature significantly improves the reliability and accuracy of the stain-to-stain transformation that is learned using our method.
While one of the enablers for the training of our stain transformation workflow is the virtual staining of label-free tissue, the resulting networks that are trained with our methodology can digitally transform any existing chemically-stained tissue image into new types of stains.

We demonstrate the efficacy of this technique by evaluating kidney tissues with various non-neoplastic diseases. The evaluation of non-neoplastic kidney disease relies on special stains to provide the standard of care pathologic evaluation. In many clinical practices, H&E stains are available well before the special stains are prepared, and pathologists may provide a preliminary diagnosis to enable the patient’s nephrologist to begin any necessary treatment. In a setting where only H&E slides are initially available, the preliminary diagnosis is followed by the final diagnosis made by examining the special stain images, which are usually provided the next working day. Using the presented stain transformation technique would allow pathologists to view the special stains much more quickly. This is especially useful for medical conditions such as crescentic glomerulonephritis or transplant rejection, where quick and accurate diagnosis followed by rapid initiation of treatment may lead to significant improvements in clinical outcomes.

In this manuscript, we investigated and blindly tested whether significant improvements can be made to the preliminary diagnosis by generating, from an existing H&E whole slide image of a given patient, three additional virtual special stains, i.e., PAS, MT and Jones methenamine silver (JMS), that can be reviewed by the pathologist simultaneously with the histochemically stained existing H&E image (i.e., entirely bypassing the need to stain and wait for new slides).
Based on tissue samples from 16 unique patients that were blindly evaluated by 3 independent renal pathologists (i.e., N=48), our results revealed that the generation of virtual special stains (PAS, MT and JMS) improved the diagnoses in various non-neoplastic kidney diseases. These computationally generated panels of special stains, transformed from existing H&E images using deep learning, give the pathologists the additional information channels needed for the standard of patient care. We believe this unique stain-to-stain transformation workflow can be applied to a variety of diseases, and could significantly improve the quality of the preliminary diagnosis when additional special stains are needed, also providing time savings and helping to reduce healthcare costs and burden for histopathology labs and patients.

Figure 1:

Overview of various virtual staining paths that are presented in this paper. Path (1): Histochemical staining of H&E, which is then digitally transformed using a deep neural network into the special stains. Path (2): Autofluorescence images of label-free tissue are virtually stained. (2a) Generation of virtual H&E, which can then be transformed into special stains using secondary deep neural networks. (2b) The special stains can also be directly generated from autofluorescence images using a virtual staining network. Scale bar indicates 50 µm. (i) Generation of JMS. (ii) Generation of MT. (iii) Generation of PAS.

Results

Design and training of stain transformation networks

Deep neural networks were used to perform the transformation between the H&E stained tissue and the special stains. To train these networks, a set of additional deep neural networks were used in conjunction with one another. This training workflow relies upon the ability of virtual staining to generate images of different stains using a single unlabeled tissue section (Figure 2a). By using a single neural network to generate both the H&E images and the special stains (PAS, MT, JMS), a perfectly matched training image dataset can be created. However, due to the standardization of the output images generated using the staining network, the virtually stained images (to be used as inputs when training the stain transformation network) must be augmented with additional staining styles to ensure generalization. In other words, we designed our network to be able to handle the inevitable variability in histochemical H&E staining that is a natural result of (i) differing staining procedures and reagents among histotechnologists and pathology labs, and (ii) differences among the digital WSI scanners that are being used. This augmentation is performed by K = 8 unique style transfer (staining normalization) networks (Figure 2b), which ensured that a broad sample space is covered for the presented method to be effective when applied to H&E stained tissue samples regardless of the inter-technician, inter-lab or inter-equipment (e.g., WSI scanner) variations observed at different institutions. Note here that these style transfer networks and the underlying training methods (e.g., CycleGANs) were solely used for H&E stain data augmentation. The use of CycleGANs only expands the sample space of the network inputs during the training, and their outputs were therefore not part of our stain transformation network loss function. This was possible since we utilized perfectly registered training images created by virtual staining of label-free autofluorescence images of tissue.
This process simultaneously generated both the H&E and special stain images with nanoscopic match in the local coordinates of each virtually stained image pair of our training dataset, which eliminated the need for the use of CycleGANs for stain-to-stain transformation. Using this image dataset, the stain transformation network is trained, following the scheme shown in Figure 2c. The network is randomly fed with image patches either coming from the virtually stained tissue, or the virtually stained images passing through one of the 8 style transfer networks. The corresponding special stain (virtually stained from the same unlabeled field of view) is used as the ground truth regardless of the H&E style transfer. After its training, the network is then blindly tested on a variety of digitized H&E slides taken from a UCLA repository, which represent a cohort of diseases and staining variations (all taken from patients that the network was not trained with). The network performs the stain transformation at a rate of ~1.5 mm²/s, which takes in total ~0.5-1 min for a typical needle core kidney biopsy slide that was used in this study.

Figure 2:

Deep neural networks used to generate the training data for the stain transformation network. a) Virtual staining network which can generate both the H&E and special stain images. b) Style transfer network that is used just to augment the training data. c) Scheme used to train the stain transformation network. During its training, the stain transformation network is randomly given, as the input, either the virtually stained H&E tissue, or an image of the same field of view after passing through one of the 8 style transfer networks. A perfectly matched virtually stained tissue image with the desired special stain (in this example shown: PAS) is used as the ground truth to train this neural network.
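
The input selection in panel c can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sample_training_pair`, the callables standing in for the style transfer networks, and the equal-probability choice among the nine input variants are all hypothetical. The text only states that the selection is random.

```python
import random

K = 8  # number of style transfer (stain augmentation) networks, as stated in the text


def sample_training_pair(virtual_he, virtual_special, style_networks, rng=random):
    """Return one (input, ground truth) pair for the stain transformation network.

    Assumption: the unmodified virtual H&E patch and each styled variant are
    chosen with equal probability; the paper only says the choice is random.
    """
    choice = rng.randrange(len(style_networks) + 1)
    if choice == 0:
        network_input = virtual_he  # standardized virtual H&E, no style change
    else:
        network_input = style_networks[choice - 1](virtual_he)
    # The ground truth is the virtual special stain of the same field of view,
    # regardless of which H&E style was selected.
    return network_input, virtual_special


# Toy stand-ins: "patches" are strings and each style network tags its input.
toy_styles = [lambda patch, k=k: f"{patch}|style{k}" for k in range(1, K + 1)]
pair = sample_training_pair("he_patch", "pas_patch", toy_styles, random.Random(0))
```

Because the ground truth never passes through a style transfer network, the supervised target stays fixed while only the input appearance varies.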

Blind testing of stain transformation networks and evaluation of kidney disease diagnoses

To validate the presented stain transformation technique, a study was performed using WSI data from 16 different H&E stained tissue sections (each corresponding to a unique patient) obtained from an existing database of non-neoplastic kidney diseases. In this blinded study, three board-certified pathologists filled out a diagnostic worksheet for each H&E WSI (see supplementary Table S1). Following a >3-week washout period, the same pathologists were asked to fill out the same diagnostic worksheet, but along with the H&E, they were also provided the virtually stained WSIs corresponding to special stains PAS, MT, and JMS, all generated from the existing H&E images. A diagram visualizing this study process can be seen in Figure 3. Following the second round of diagnoses, a fourth board-certified pathologist adjudicated all the results/diagnoses and determined whether the viewing of the neural network generated special stains resulted in an Improvement (I), Concordance (C) or Discordance (D) with respect to the original H&E-only diagnoses.

Figure 3:

Overview of the study design. Phase 1 shows the initial portion of the study, where three pathologists review H&E WSIs of 16 different tissue sections (each from a unique patient). After a >3-week washout period, the second phase of diagnosis is performed, where the same three pathologists view the same H&E WSIs together with the virtually stained special stains (PAS, Masson’s Trichrome, Jones). (i) Generation of JMS. (ii) Generation of MT. (iii) Generation of PAS.

Adjudication of the N=48 preliminary diagnoses without and with the virtual special stains from the three pathologists revealed 23 Concordances (47.9%), 22 Improvements (45.8%) and 3 Discordances (6.3%). Two cases had an improved preliminary diagnosis by all 3 pathologists, 3 cases had an improved preliminary diagnosis by 2 of 3 pathologists, 7 cases had improved preliminary diagnoses by only 1 pathologist, 1 case was concordant by all pathologists and 3 cases had one discordance by 1 pathologist each (see Table 1). For each of the diagnoses marked as improvements, the pathologists were able to provide more accurate characterization or a more complete diagnosis. As an example, Figure 4 demonstrates the improvement using the presented stain transformation technique for case

Table 1.

Summary of the results obtained by the study described in Figure 3.
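
The headline percentages reported for this study can be checked with a short calculation. The sketch below uses only the tallies stated in the text (22 improvements, 23 concordances and 3 discordances out of N=48); the helper name `percent` and the choice of half-up rounding are assumptions made for illustration. Half-up rounding is used because 3/48 is exactly 6.25%, which Python's built-in `round` would turn into 6.2 rather than the reported 6.3.

```python
from decimal import Decimal, ROUND_HALF_UP

# Adjudicated outcomes reported in the text (3 pathologists x 16 cases).
tallies = {"Improvement": 22, "Concordance": 23, "Discordance": 3}
total = sum(tallies.values())


def percent(count, n):
    """Percentage of n, rounded half-up to one decimal place."""
    value = Decimal(count) * 100 / Decimal(n)
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


percentages = {name: percent(count, total) for name, count in tallies.items()}
for name, value in percentages.items():
    print(f"{name}: {tallies[name]}/{total} = {value}%")
```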

Worksheet

Figure 4:

Examples of improved diagnoses fostered by the virtual special stains. We report here WSIs that are virtually generated using the stain transformation technique for the case

In the three episodes of Discordance, one was determined to be due to pathologist interpretation error (case

We should note that previous research on statistical evaluation of intra-observer decisions revealed a small intra-observer disagreement rate of ~4% when the same cases are viewed by the same pathologist at two different time points. This could potentially account for the discordance in case

Figure 5:

a) Example of improved diagnosis fostered by the virtual special stains. For case

Discussion

While different approaches have been explored over the past few years to perform a transformation between two stains, the approach presented here has several unique advantages: (1) it involves less chemical processing applied to tissue, without the need for de-staining and re-staining; and (2) our approach is based on supervised training of the stain transformation network using pairs of perfectly registered training images that are created by label-free virtual staining, which constitutes a precise structural fidelity constraint for the distribution loss that is learned by the discriminator, significantly helping its generalization. Both of these important advantages are enabled by using autofluorescence-based virtual staining of label-free tissue sections with multiple stains to create perfectly paired training image datasets. While in this paper we used autofluorescence to generate contrast from label-free tissue, other contrast mechanisms such as quantitative phase imaging, multi-photon microscopy, fluorescence lifetime imaging and photo-acoustic microscopy, among others, can also support this supervised training of the presented stain transformation method.

The ability of this stain transformation network to generalize across stain variations is also highly beneficial, as there are significant differences among stains produced by different labs and even across stains performed by the same histotechnician (e.g., Supplementary Figure S2a demonstrates three examples of such variations for stains produced by the same lab). For a stain transformation technique to be effective in any practical application, the network must generalize across this wide sample space. As one of the key features of virtual staining is stain normalization, the network requires data augmentation to better facilitate the learning across a wide input staining distribution.
For this purpose, we used a set of 8 CycleGAN networks to perform this stain data augmentation of the H&E dataset used to train our stain transformation network. The use of CycleGAN networks to perform a stain normalizing style transfer has been shown to be more effective than traditional stain normalization algorithms. Furthermore, they have proven to be highly effective at performing data augmentation for medical imaging. By applying these CycleGAN augmentation networks to our training image dataset, we were able to successfully generalize to various slides used for blind testing. Three examples of these CycleGAN-based stain augmentation results are reported in Supplementary Figure S2b, which demonstrates that the three different networks are capable of converting the virtually stained tissue to have H&E distributions which match the distributions seen in Figure S2a. Furthermore, the results show that the same stain transformation network is consistent across these various distributions, as there is little variation among the virtual PAS outputs (Supplementary Figure S2b). These style normalization/transfer networks used in data augmentation can be easily expanded upon, if needed, using existing databases of H&E images. As we have emphasized earlier, these style transfer networks were only used for H&E stain data augmentation and were not included in our stain transformation loss function. We utilized perfectly registered training images generated by virtual staining of label-free tissue; as a result, potential hallucinations or artifacts related to unsupervised training with CycleGANs and unpaired training data are eliminated.

In addition to histological stains, immunofluorescence and electron microscopy based evaluation play significant roles in the standard of care for non-neoplastic kidney biopsy evaluation.
In this study we have attempted to isolate the role of standard light microscopy in the non-neoplastic kidney disease evaluation and therefore these other modalities were not included. However, their application in clinical cases would only serve to support the pathologic final diagnosis and add a layer of further confirmation and safety to this resource-saving stain transformation technique.

While the use of stain-to-stain transformation is one way to generate special stains, we can in many cases skip histological staining of H&E altogether by using autofluorescence images of the unlabeled tissue to virtually create the panel of H&E as well as the additional special stains as needed. However, virtually generating the panel of special stains directly from an existing histochemically stained H&E image has the important advantage that an abundance of whole slide H&E images already exist in numerous data repositories. These existing images can be used to train additional networks, and these stain transformation techniques may help users transition toward chemistry-free, all-digital staining. However, the reagents, human factors, the digital slide scanners and other variables will ultimately affect the quality of any scanned histochemically stained tissue sample. As virtual staining provides a path for standardized staining (i.e., eliminating the staining variability), it could alleviate some of these challenges, including the stain normalization step.

In this work, we focused on image transformations from H&E to special stains, since H&E accounts for the bulk of the staining procedures, covering approximately 80% of all the human tissue staining procedures. However, other stain-to-stain transformations can also be considered. For example, transformations from special stains to H&E or from immunofluorescence to H&E or special stains could be performed using the presented method.
Our approach allows pathologists to visualize different tissue constituents without waiting for additional slides to be stained with special stains, and we demonstrated it to be effective for clinical diagnosis of multiple renal diseases. Another advantage of the presented technique is that it can rapidly perform the stain transformation (at a rate of 1.5 mm²/s on a consumer-grade desktop computer with two GPUs) while saving labor, time and chemicals, which can significantly benefit the patient as well as the healthcare system.

Methods

Training of stain transformation network

All of the stain transformation networks and virtual staining networks used in this paper were trained using GANs. Each of these GANs consists of a generator (G) and a discriminator (D). The generator is used to perform the transformation of the input images ($x_{input}$), while the discriminator network is used to help train the network to generate images which match the distribution of the ground truth stained images. It does this by trying to discriminate between the generated images ($G(x_{input})$) and the ground truth images ($z_{label}$). The generator is in turn taught to generate images which cannot be classified correctly by the discriminator. This GAN loss is used in conjunction with two additional losses: a mean absolute error ($L_1$) loss, and a total variation (TV) loss. The $L_1$ loss is used to ensure that the transformations are performed accurately in space and color, while the TV loss is used as a regularizer, and reduces noise created by the GAN loss. Together, the overall loss function is described as:

$$ l_{generator} = L_1\{z_{label}, G(x_{input})\} + \alpha \times TV\{G(x_{input})\} + \beta \times \left(1 - D(G(x_{input}))\right) \tag{1} $$

where α and β are constants used to balance the various terms of the loss function. The stain transformation networks are tuned such that the $L_1$ loss makes up ~1% of the overall loss, the TV loss makes up only ~0.03% of the overall loss, and the discriminator loss makes up the remaining ~99% of the loss (relative ratios change over the course of the training). The $L_1$ portion of the loss can be written as:

$$ L_1(z, G) = \frac{1}{P \times Q} \sum_{p} \sum_{q} \left| z_{p,q} - G(x_{input})_{p,q} \right| \tag{2} $$

where p and q are the pixel indices and P and Q are the numbers of pixels along the two dimensions of each image.

The total variation loss is defined as:

$$ TV(G(x_{input})) = \sum_{p} \sum_{q} \left( \left| G(x_{input})_{p+1,q} - G(x_{input})_{p,q} \right| + \left| G(x_{input})_{p,q+1} - G(x_{input})_{p,q} \right| \right) \tag{3} $$

The discriminator network has a separate loss function which is defined as:

$$ l_{discriminator} = D(G(x_{input})) + \left(1 - D(z_{label})\right) \tag{4} $$

A modified U-net neural network architecture was used for the generator, while the discriminator used a VGG-style network. The U-net architecture uses a set of 4 up-blocks and 4 down-blocks, each containing three convolutional layers with a 3×3 kernel size, each followed by the LeakyReLU activation function, which is described as:

$$ LeakyReLU(x) = \begin{cases} x & \text{for } x > 0 \\ 0.1x & \text{otherwise} \end{cases} \tag{5} $$
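
To make the loss terms concrete, the following is a minimal pure-Python sketch of Eqs. (1)-(5) as they appear in the text. It is illustrative only and makes several assumptions: images are single-channel nested lists, border pixels without a neighbour are skipped in the TV sum, the discriminator outputs are passed in as plain numbers, and the function names and the α, β values in the example are hypothetical.

```python
def l1_loss(z, g):
    """Mean absolute error over a P x Q image, as in Eq. (2)."""
    P, Q = len(z), len(z[0])
    return sum(abs(z[p][q] - g[p][q]) for p in range(P) for q in range(Q)) / (P * Q)


def tv_loss(g):
    """Total variation, as in Eq. (3); terms lacking a neighbour are skipped (assumption)."""
    P, Q = len(g), len(g[0])
    vertical = sum(abs(g[p + 1][q] - g[p][q]) for p in range(P - 1) for q in range(Q))
    horizontal = sum(abs(g[p][q + 1] - g[p][q]) for p in range(P) for q in range(Q - 1))
    return vertical + horizontal


def generator_loss(z, g, d_of_g, alpha, beta):
    """Eq. (1): L1 term + alpha * TV term + beta * adversarial term."""
    return l1_loss(z, g) + alpha * tv_loss(g) + beta * (1 - d_of_g)


def discriminator_loss(d_of_g, d_of_z):
    """Eq. (4): penalizes high scores on generated images and low scores on real ones."""
    return d_of_g + (1 - d_of_z)


def leaky_relu(x, slope=0.1):
    """Eq. (5): identity for positive inputs, slope 0.1 otherwise."""
    return x if x > 0 else slope * x


# Tiny 2 x 2 example with hypothetical weights alpha = 0.1 and beta = 2.
z_label = [[0.0, 1.0], [1.0, 0.0]]
g_out = [[0.0, 0.5], [1.0, 0.5]]
example_loss = generator_loss(z_label, g_out, d_of_g=0.5, alpha=0.1, beta=2.0)
```

In practice these terms would be computed on batched multi-channel tensors in a deep learning framework; the sketch only mirrors the arithmetic of the equations.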

The first down-block increases the number of channels to 32, while the rest each increase the number of channels by a factor of two. Each of these down-blocks ends with an average pooling layer which has both a stride and a kernel size of two. The up-blocks begin with a bicubic up-sampling prior to the application of the convolutional layers. Between the blocks at the same level, a skip connection is used to pass data through the network without needing to go through all the blocks. After the final up-block, a convolutional layer maps back to three channels.

The discriminator is made up of five blocks. Each block contains two pairs of a convolutional layer and a LeakyReLU, which together increase the number of channels by a factor of two. These are followed by an average pooling layer with a stride of two. After the five blocks, two fully connected layers reduce the output dimensionality to a single value, which in turn is input into a sigmoid activation function to calculate the probability that the input to the discriminator network is real, i.e., not generated.

Both the generator and discriminator were trained using the adaptive moment estimation (Adam) optimizer to update the learnable parameters. A learning rate of 1×10⁻⁵ was used for the discriminator network while a rate of 1×10⁻⁴ was used for the generator network. For each iteration of the discriminator training, the generator network is trained for seven iterations. This ratio reduces by one every 4000 iterations of the discriminator, to a minimum of three generator iterations for every discriminator iteration. The network was trained for 50,000 iterations of the discriminator, with the model being saved every 1000 iterations. The best generator model was chosen manually from these saved models by visually comparing different models. For all three of the generator networks (MT, PAS and JMS), the 15,000th iteration of the discriminator was chosen as the optimal model.
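
The generator-to-discriminator update schedule and the U-net channel widths described above can be written out as a short sketch. The function name, the zero-based iteration index and the step-wise drop at exact multiples of 4000 are assumptions; the text does not specify how the ratio is implemented.

```python
def generator_steps_per_discriminator_step(disc_iter, start=7, step=4000, floor=3):
    """Generator iterations run for one discriminator iteration.

    Starts at 7 and drops by one every 4000 discriminator iterations,
    never going below 3 (zero-based disc_iter is an assumption).
    """
    return max(floor, start - disc_iter // step)


# Ratio at the start of each 4000-iteration stage of training.
schedule = {i: generator_steps_per_discriminator_step(i) for i in range(0, 20001, 4000)}

# Channel widths of the four U-net down-blocks: 32, then doubling each block.
down_block_channels = [32 * 2 ** level for level in range(4)]

# Discriminator iterations at which a model checkpoint is saved.
checkpoints = list(range(1000, 50001, 1000))
```

Under these assumptions the ratio reaches its floor of three at discriminator iteration 16,000, so the selected checkpoint (iteration 15,000) falls in the final stage before the ratio stops changing.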
The stain transformation networks were trained using pairs of 256×256-pixel image patches generated by the class conditional (label-free) virtual staining network, downsampled by a factor of 2 (to match 20× magnification). These patches were randomly cropped from 1013 images of 712×712 pixels, coming from 10 unique tissue sections. A further 76 images coming from three unique tissue sections were used to validate the network. These images were augmented using the eight stain augmentation networks, and further augmented through random rotation and flipping. Each of the three stain transformation networks (MT, PAS and JMS) was trained using images generated by the label-free virtual staining networks from the same input autofluorescence images. Furthermore, the images were converted to the YCbCr color space before being used as either the input or the ground truth for the neural networks.

Image data acquisition

All of the neural networks were trained using data obtained by microscopic imaging of thin tissue sections coming from needle core kidney biopsies. Unlabeled tissue sections were obtained from the UCLA Translational Pathology Core Laboratory (TPCL) under UCLA IRB 18-001029, from existing specimens. The autofluorescence images were captured using an Olympus IX-83 microscope, using a DAPI filter cube (Semrock OSFI3-DAPI5060C, EX 377/50 nm EM 447/60 nm) as well as a Texas Red filter cube (Semrock OSFI3-TXRED-4040C, EX 562/40 nm EM 624/40 nm) to generate the second autofluorescence image channel. In order to create the training dataset for the virtual staining network, pairs of matched unlabeled autofluorescence images and brightfield images of the histochemically stained tissue were obtained. H&E, MT and PAS histochemical staining were performed by the Tissue Technology Shared Resource at UC San Diego Moores Cancer Center. The JMS staining was performed by the Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA. These stained slides were digitally scanned using a brightfield scanning microscope (Leica Biosystems Aperio AT slide scanner, using a 40×/0.75NA objective). All the slides and digitized slide images were prepared from existing specimens. Therefore, this work did not interfere with standard practices of care or sample collection procedures. The H&E image dataset used in the study came from the existing UCLA pathology online database containing WSIs of stained kidney needle-core biopsies, under UCLA IRB 18-001029. These slides were similarly imaged using Aperio AT slide scanning microscopes.

Image co-registration

To train label-free virtual staining networks, the autofluorescence images of unlabeled tissue were co-registered to the ground truth histochemically stained tissue. This image co-registration was done through a multi-step process, beginning with a coarse matching which was progressively improved until subpixel level accuracy was achieved. The registration process first used a cross-correlation based method to extract the most similar portions of the two images. Next, the matching was improved using multimodal image registration. This registration step applied an affine transformation to the images of the histochemically stained tissue to correct for any changes in size or rotation. To achieve pixel-level co-registration accuracy, an elastic registration algorithm was then applied. However, this algorithm relies upon local correlation-based matching. Therefore, to ensure that this matching could be accurately performed, an initial rough virtual staining network was applied to the autofluorescence images. These roughly stained images were then co-registered to the brightfield images of the stained tissue using a correlation-based elastic pyramidal co-registration algorithm. Once the image co-registration was complete, the autofluorescence images were normalized by subtracting the average pixel value of the tissue area for the WSI and subsequently dividing by the standard deviation of the pixel values in the tissue area.

Class conditional virtual staining of label-free tissue

A class conditional GAN was used to generate both the input and the ground truth images to be used during the training of the presented stain transformation networks (Figure 2a). This class conditional GAN allows multiple stains to be created simultaneously using a single deep neural network. To ensure that the features of the virtually stained images are highly consistent between stains, a single network must be used to generate both the stain transformation network input (virtual H&E) and the corresponding ground truth images (virtual special stains); these are automatically registered to each other because the information source is the same image. This is only required for the training of the stain transformation neural networks, and it is beneficial as it allows the H&E and special stains to be perfectly matched. In contrast, an alternative image dataset made up of co-registered virtually stained and histochemically stained fields of view would present limitations due to imperfect co-registration and deformities caused by the staining process. These are eliminated by using a single class conditional GAN to generate both the input and the ground truth images. This network uses the same general architecture as the network described in the previous section, with the addition of a “Digital Staining Matrix” concatenated to the network input for both the generator and the discriminator. This staining matrix defines the stain coordinates within a given image field of view. Therefore, the loss functions for the generator and discriminator are:

$$l_{generator} = L\{z_{label}, G(x_{input}, \tilde{c})\} + \alpha \times TV\{G(x_{input}, \tilde{c})\} + \beta \times (1 - D(G(x_{input}, \tilde{c}), \tilde{c})) \tag{6}$$

$$l_{discriminator} = D(G(x_{input}, \tilde{c}), \tilde{c}) + (1 - D(z_{label}, \tilde{c})) \tag{7}$$

where $\tilde{c}$ is a one-hot encoded digital staining matrix with the same pixel dimensions as the input image. When used in the testing phase, the one-hot encoding allows the network to generate two separate stains (H&E and the corresponding special stain) for each field of view. The number of channels in each layer used by this deep neural network was increased by a factor of two compared to the stain transformation architecture described above to account for the larger dataset size and the need for the network to perform two distinct stain transformations.

Style transfer for H&E image data augmentation

In order to ensure that the stain transformation neural network can be applied to a wide variety of histochemically stained H&E images, we used the CycleGAN model to augment the training dataset by performing style transfer (Figure 2b). As discussed, these CycleGAN networks only augment the image data used as inputs in the training phase. This CycleGAN model learns to map between two domains $X$ and $Y$ given the training samples $x$ and $y$, where $X$ is the domain of the original virtually stained H&E images and $Y$ is the domain of the H&E images generated by a different lab or hospital. This model performs two mappings, $G: X \to Y$ and $F: Y \to X$. In addition, two adversarial discriminators $D_X$ and $D_Y$ are introduced. A diagram showing the relationship between these various networks is shown in Supplementary Figure S3. The loss function of the generator, $l_{generator}$, contains two types of terms: adversarial losses $l_{adv}$, which match the stain style of the generated images to the style of histochemically stained images in the target domain, and cycle consistency losses $l_{cycle}$, which prevent the learned mappings $G$ and $F$ from contradicting each other. The overall loss is therefore described by:

$$l_{generator} = \lambda \times l_{cycle} + \varphi \times l_{adv} \tag{8}$$

where $\lambda$ and $\varphi$ are relative weights (constants). For each of the networks, we set $\lambda = 10$ and $\varphi = 1$. Each generator is associated with a discriminator, which ensures that the generated image matches the distribution of the ground truth. The adversarial losses for each of the generator networks can be written as:

$$l_{adv,\,X \to Y} = (1 - D_Y(G(x))) \tag{9}$$

$$l_{adv,\,Y \to X} = (1 - D_X(F(y))) \tag{10}$$

The cycle consistency loss can be described as:

$$l_{cycle} = L\{y, G(F(y))\} + L\{x, F(G(x))\} \tag{11}$$

The adversarial loss terms used to train $D_X$ and $D_Y$ are defined as:

$$l_{D_X} = (1 - D_X(x)) + D_X(F(y)) \tag{13}$$

$$l_{D_Y} = (1 - D_Y(y)) + D_Y(G(x)) \tag{14}$$
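To make the structure of these loss terms concrete, the following NumPy sketch evaluates Eqs. (8) to (14) for arbitrary callables standing in for the networks. It is an illustration under stated assumptions rather than the implementation used here: all function names are hypothetical, the distance $L\{\cdot,\cdot\}$ is taken to be the mean absolute error because its exact form is not specified in the text above, and discriminator outputs are averaged over their elements.

```python
import numpy as np


def distance(a, b):
    # Assumption: L{a, b} is taken as the mean absolute error.
    return float(np.mean(np.abs(np.asarray(a, float) - np.asarray(b, float))))


def cycle_loss(x, y, G, F):
    """Cycle consistency loss of Eq. (11)."""
    return distance(y, G(F(y))) + distance(x, F(G(x)))


def adversarial_losses(x, y, G, F, D_X, D_Y):
    """Generator adversarial losses of Eqs. (9) and (10)."""
    adv_xy = float(np.mean(1.0 - np.asarray(D_Y(G(x)))))
    adv_yx = float(np.mean(1.0 - np.asarray(D_X(F(y)))))
    return adv_xy, adv_yx


def generator_losses(x, y, G, F, D_X, D_Y, lam=10.0, phi=1.0):
    """Eq. (8) evaluated for G (X -> Y) and for F (Y -> X)."""
    cyc = cycle_loss(x, y, G, F)
    adv_xy, adv_yx = adversarial_losses(x, y, G, F, D_X, D_Y)
    return lam * cyc + phi * adv_xy, lam * cyc + phi * adv_yx


def discriminator_losses(x, y, G, F, D_X, D_Y):
    """Discriminator losses of Eqs. (13) and (14)."""
    l_dx = float(np.mean(1.0 - np.asarray(D_X(x))) + np.mean(D_X(F(y))))
    l_dy = float(np.mean(1.0 - np.asarray(D_Y(y))) + np.mean(D_Y(G(x))))
    return l_dx, l_dy
```

With toy mappings such as `G = lambda x: x + 1` and `F = lambda y: y - 1`, the two mappings are exact inverses, so the cycle term vanishes and only the adversarial terms remain.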

For these CycleGAN models, $G$ and $F$ use U-net architectures similar to the stain transformation network. Each consists of three down-blocks followed by three up-blocks, and each of these down-blocks and up-blocks is identical to the corresponding block in the stain transformation network. $D_X$ and $D_Y$ also have architectures similar to the discriminator network of the stain transformation network; however, they have four blocks rather than the five blocks used in the previous model. During training, the Adam optimizer was used to update the learnable parameters with learning rates of 2×10⁻⁵ for both the generator and discriminator networks. For each step of discriminator training, one iteration of training was performed for the generator network, and the batch size for training was set to 6.

Training of single-stain virtual staining networks

In addition to performing multiple virtual stains using a single neural network, separate networks which only generate one individual virtual stain each were also trained. These networks were used for two purposes: (1) to perform the rough virtual staining that enables the elastic co-registration; (2) to generate the virtual staining comparisons shown in Figure 1 (Path 2b). These networks were trained using the procedures outlined in Rivenson et al., and the architecture used by these networks is identical to that of the stain transformation network.

Implementation details

The neural networks were trained and implemented using Python version 3.6.2 with TensorFlow version 1.8.0. Timing was measured on a Windows 10 computer with two Nvidia GeForce GTX 1080 Ti GPUs, 64 GB of RAM, and an Intel Core i9-7900X CPU.

Pathologic evaluation of kidney biopsies

Sixteen non-neoplastic kidney cases were selected by a board-certified kidney pathologist (J.E.Z.) to represent a variety of kidney diseases (Table 1). For each case, the WSI of the histochemically stained H&E slide, along with a worksheet that included a brief clinical history, was presented to 3 board-certified renal pathologists (W.D.W., M.F.P.D. and A.S.). The diagnostic worksheet can be seen in Supplementary Table S1. The WSIs were exported to the Zoomify format and uploaded to the GIGAmacro website to allow the pathologists to confidentially view the images using a standard web browser. The WSIs were viewed using standard displays (e.g., LCD monitor, Full HD, 1920×1080 pixels). In the diagnostic worksheet, the reviewers were given the H&E WSI and a brief patient history, and were asked to make a preliminary diagnosis, quantify certain features of the biopsy (i.e., the number of glomeruli and arteries) and provide additional comments if necessary. After a >3-week washout period to reduce the pathologists’ familiarity with the cases, the 3 reviewing pathologists received, in addition to the same histochemically stained H&E WSIs and the same patient medical history, 3 virtually generated special stain WSIs for each case: MT, PAS and JMS. Given these slides, they were asked to provide a preliminary diagnosis for a second time. To test the hypothesis that additional virtual stains can improve the preliminary diagnosis, the adjudicator pathologist (J.E.Z.), who was not among the 3 diagnosticians, provided a judgement to determine Concordance (C), Discordance (D) or Improvement (I) between the diagnosis quality of the first and second rounds of preliminary diagnoses provided by the group of diagnosticians (see Supplementary Table S1).

References

1. Global Tissue Diagnostics Market, Forecast to 2022. (2018).
2. Alturkistani, H. A., Tashkandi, F. M. & Mohammedsaleh, Z. M. Histological Stains: A Literature Review and Case Study. Glob J Health Sci, 72–79 (2016).
3. Walker, P. D., Cavallo, T., Bonsib, S. M. & Ad Hoc Committee on Renal Biopsy Guidelines of the Renal Pathology Society. Practice guidelines for the renal biopsy. Mod. Pathol., 1555–1563 (2004).
4. Tao, Y. K. et al. Assessment of breast pathologies using nonlinear microscopy. PNAS, 15304–15309 (2014).
5. Fereidouni, F. et al. Microscopy with ultraviolet surface excitation for rapid slide-free histology. Nature Biomedical Engineering, 957–966 (2017).
6. Glaser, A. K. et al. Light-sheet microscopy for slide-free non-destructive pathology of large clinical specimens. Nature Biomedical Engineering, 1–10 (2017).
7. Rivenson, Y. et al. Virtual histological staining of unlabelled tissue-autofluorescence images via deep learning. Nature Biomedical Engineering
8. et al. Digital synthesis of histological stains using micro-structured and multiplexed virtual staining of label-free tissue. Light: Science & Applications, 78 (2020).
9. Bayramoglu, N., Kaakinen, M., Eklund, L. & Heikkilä, J. Towards Virtual H and E Staining of Hyperspectral Lung Histology Images Using Conditional Generative Adversarial Networks. in IEEE International Conference on Computer Vision Workshops (ICCVW)
10. et al. PhaseStain: the digital staining of label-free quantitative phase microscopy images using deep learning. Light: Science & Applications, 23 (2019).
11. Rana, A. et al. Use of Deep Learning to Develop and Analyze Computational Hematoxylin and Eosin Staining of Prostate Core Biopsy Images for Tumor Diagnosis. JAMA Netw Open, e205111–e205111 (2020).
12. Borhani, N., Bower, A. J., Boppart, S. A. & Psaltis, D. Digital staining through the application of deep neural networks to multi-modal multi-photon microscopy. Biomed. Opt. Express, BOE, 1339–1350 (2019).
13. Roy-Chowdhuri, S. et al. Collection and Handling of Thoracic Small Biopsy and Cytology Specimens for Ancillary Studies: Guideline From the College of American Pathologists in Collaboration With the American College of Chest Physicians, Association for Molecular Pathology, American Society of Cytopathology, American Thoracic Society, Pulmonary Pathology Society, Papanicolaou Society of Cytopathology, Society of Interventional Radiology, and Society of Thoracic Radiology. Arch. Pathol. Lab. Med.
15. IEEE J Biomed Health Inform (2020) doi:10.1109/JBHI.2020.2975151.
16. Gadermayr, M., Appel, V., Klinkhammer, B. M., Boor, P. & Merhof, D. Which Way Round? A Study on the Performance of Stain-Translation for Segmenting Arbitrarily Dyed Histological Images. in Medical Image Computing and Computer Assisted Intervention – MICCAI 2018 (eds. Frangi, A. F., Schnabel, J. A., Davatzikos, C., Alberola-López, C. & Fichtinger, G.) 165–173 (Springer International Publishing, 2018). doi:10.1007/978-3-030-00934-2_19.
17. Kapil, A. et al. DASGAN -- Joint Domain Adaptation and Segmentation for the Analysis of Epithelial Regions in Histopathology PD-L1 Images. arXiv:1906.11118 [cs, eess] (2019).
18. Zhu, J.-Y., Park, T., Isola, P. & Efros, A. A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. in
19. arXiv:1805.08841 [cs] (2018).
20. Fujitani, M. et al. Re-staining Pathology Images by FCNN. in
21. et al. Virtual staining for mitosis detection in Breast Histopathology. arXiv:2003.07801 [cs, eess] (2020).
22. Bauer, T. W. et al. Validation of whole slide imaging for primary diagnosis in surgical pathology. Arch. Pathol. Lab. Med., 518–524 (2013).
23. Shaban, M. T., Baur, C., Navab, N. & Albarqouni, S. Staingan: Stain Style Transfer for Digital Histological Images. in
24. Scientific Reports, 16884 (2019).
25. Erlandson, R. A. Role of Electron Microscopy in Modern Diagnostic Surgical Pathology. Modern Surgical Pathology
26. et al. Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nature Methods
27. et al. Quantitative mapping and minimization of super-resolution optical imaging artifacts. Nature Methods 15