Classification of COVID-19 in CT Scans using Multi-Source Transfer Learning
Alejandro R. Martinez
Dartmouth College
[email protected]
Abstract—Since December of 2019, the novel coronavirus disease COVID-19 has spread around the world, infecting millions of people and upending the global economy. One of the driving reasons behind its high rate of infection is the unreliability and lack of RT-PCR testing: turnaround times can span as long as a couple of days, only to yield a roughly 70% sensitivity rate. As an alternative, recent research has investigated the use of Computer Vision with Convolutional Neural Networks (CNNs) for the classification of COVID-19 from CT scans. Due to an inherent lack of available COVID-19 CT data, these research efforts have been forced to leverage Transfer Learning. This commonly employed Deep Learning technique has been shown to improve model performance on tasks with relatively small amounts of data, as long as the Source feature space somewhat resembles the Target feature space. Unfortunately, a lack of similarity is often encountered in the classification of medical images, as publicly available Source datasets usually lack the visual features found in medical images. In this study, we propose the use of Multi-Source Transfer Learning (MSTL) to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans. With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet. We additionally propose an unsupervised label creation process, which enhances the performance of our Deep Residual Networks. Our best performing model achieved an accuracy of 0.893 and a Recall score of 0.897, outperforming its baseline Recall score by 9.3%.
Keywords: COVID-19, Transfer Learning, Convolutional Neural Networks, CT, Computer Vision
I. INTRODUCTION & RELATED WORK
On March 11th, 2020, the World Health Organization (WHO) declared the novel coronavirus disease COVID-19 a global pandemic. Originating in the Hubei Province of China in late 2019, COVID-19 has spread across 185 countries, infecting over 30 million people and causing nearly 1 million deaths [1], [2]. One of the main reasons for its unprecedented growth is the unreliability and lack of testing [3]. The most widely employed test kits are reverse transcription polymerase chain reaction (RT-PCR) assays, which check for the detection of nucleic acid from SARS-CoV-2 in respiratory specimens [4]. While RT-PCR assays are commonly used, they are reported to yield poor sensitivity in the early stages of infection and require a lengthy processing time [3], [5]. In addition, RT-PCR assays face severely limited supply, causing many symptomatic people to be left untested [6]. In light of these constraints, Computed Tomography (CT) imaging has been explored as a possible alternative diagnostic tool for COVID-19 [3], [5], [7]. Prominent features of the virus, such as bilateral ground-glass opacities, have been identified in the chest CT scans of patients with COVID-19. These visual features have the potential to act as regions of interest in the detection of the virus [3], [5]. CT imaging also produces much faster results than RT-PCR assays and is widely available, with roughly 6,000-7,000 scanners present in the United States [8].

The use of CT imaging as a diagnostic tool would require Deep Learning and Computer Vision technologies. These computational tools have been successfully applied to CT and other medical imaging classification tasks with low rates of error [9], [10], [11]. The most commonly applied algorithm for image classification is the Convolutional Neural Network (CNN). Researchers have used CNNs for the classification of lung diseases in chest CT with radiologist-level accuracy [12]. Recently, CNNs have been applied to COVID-19 chest CT classification and have shown promising results. He et al. (2020) and Xu et al. (2020) have both explored the use of CNNs for distinguishing COVID-19 from other types of pneumonia or normal chest CTs, achieving overall accuracies of 86.0% and 86.7%, respectively [13], [14]. While recent research has seemed hopeful, limitations still remain.

As with all Deep Learning problems, the amount of collected data largely determines the success of the system. Because COVID-19 is a novel virus, there is an inherent lack of available datasets from which to construct a robust Deep Learning classifier capable of producing reliable results. To compensate for this issue, both He et al. (2020) and Xu et al. (2020) exploit transfer learning: a method by which a network is pretrained on a large Source task and then retrained on a smaller Target task. Transfer learning provides a network with a deep understanding of generic features from a Source dataset, so it does not require much data to learn the idiosyncrasies of the Target dataset [15]. The problem with applying this method to medical imaging, however, is that the Source dataset the network is trained on (e.g., ImageNet) usually contains feature spaces very dissimilar to those in the Target dataset [10]. This leaves the network with subpar performance compared to what it could achieve if pretrained on an additional dataset of medical images.
In this study we aim to provide a highly sensitive classification model for the detection of COVID-19 by exploiting a Multi-Source Transfer Learning (MSTL) process to distinguish COVID-19 from normal CT scans. We start with the collection of multiple Deep CNNs pretrained on ImageNet, provided by TensorFlow, Google's open-source Machine Learning library [16]. We then collect two datasets: the first comprises 22,238 lung CT slices from the SPIE-AAPM Lung CT Challenge provided by the Cancer Imaging Archive; the second comprises 349 COVID-19 and 397 normal CT scans provided by the UCSD Department of Engineering [17], [18]. The first dataset is used to teach our pretrained ImageNet models to extract relevant features from chest CT scans, and the second dataset is used to further fine-tune the models on distinguishing COVID-19 from normal chest CT scans.

II. METHODS
In this section we describe our Multi-Source Transfer Learning (MSTL) approach for the classification of COVID-19 from normal chest CT scans. Our methodology begins with the collection of multiple chest CT datasets, followed by data preprocessing, model selection, and ultimately MSTL.
A. Data Description
Our study utilizes three separate datasets as part of our multi-source fine-tuning paradigm: a Source, a Transition, and a Target.
Source dataset:
Since our selected models are pretrained on an ImageNet subset known as the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset, it serves as our Source dataset. The ILSVRC dataset comprises 1.2 million images spanning 1,000 unique classes. Each class in the dataset corresponds to a distinct synonym set, or synset, defined by WordNet, a large lexical database that retains a semantic hierarchy between concepts through the construction of synsets [19]. This structure ensures that ILSVRC classes hold unique feature representations, making the dataset conducive to generalization. All images are labeled by human annotators via Amazon's Mechanical Turk, a crowdsourcing marketplace [19], [20].
Transition dataset:
Our Transition dataset comes from the 2015 SPIE-AAPM-NCI Lung Nodule Classification Challenge, made available through The Cancer Imaging Archive (TCIA) and sponsored by the SPIE, NCI/NIH, AAPM, and The University of Chicago. The dataset contains 22,489 CT scan slices from 70 patients (28 male, 42 female; median age 61 years), containing 42 benign and 41 malignant lung nodules in total. All scans were acquired on Philips Brilliance 16, 16P, and 64 scanners and stored as 3-dimensional DICOM files with a resolution of 512x512 per slice. All protected health information was removed from the DICOM headers [17].
Target dataset:
Our Target dataset is collected from an open-access COVID-19 CT image repository provided by the UCSD Department of Engineering [18]. The dataset contains 349 COVID-19 and 397 non-COVID-19 (normal) CT scans. The normal CT scans were collected from MedPix, an open-access online database of medical images. The COVID-19 scans were manually selected from 760 preprints on COVID-19 from medRxiv and bioRxiv, published from January 19th to March 25th. Of the scans collected, 137 contain gender information and 169 contain age information. From the available metadata, the mean age of patients is roughly 45 years and the gender distribution is 86 males to 51 females. Most cases were reported to be from East Asia, with an overwhelming majority from Wuhan, China. It should be noted that the creators of the dataset claim the quality of the CT images is well preserved.
Fig. 1: Regions of interest exhibiting key identifiers of COVID-19
B. Data Preprocessing
The purpose of the Transition step is to improve the transfer of learning from the Source task to the Target task. It does this by teaching our network to extract low- to mid-level features that are more prevalent in our Target feature space than in our Source feature space. Therefore, it is essential that the Transition dataset is thoroughly filtered of images that may be too dissimilar from images in our Target dataset, so the model only learns relevant features during the Transition step.

Scans from the Target dataset contain only 2-dimensional slices, focusing on areas of the lungs displaying distinguishable COVID-19 symptoms. This region of interest captures a clear view of lung lobes that would exhibit key identifiers of COVID-19, such as bilateral ground-glass opacities, as depicted in Fig. 1. Because our Transition dataset comprises 3-dimensional CT scans, containing many axial slices per scan, we excluded any slices that did not display the same regions of interest as shown in the Target slices. We then exported the resulting 10,176 Transition slices as JPG files to process them as image arrays in our pretrained networks. We conducted all preprocessing of Transition data with Horos.

To diminish variability in image processing, we ensured that each input image was reshaped to standard dimensions of (224, 224, 3), as sketched below.
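A minimal sketch of this resizing step, assuming the exported slices sit in a local directory; the path, file names, and helper name are hypothetical, and the paper's actual export was done with Horos:

```python
import numpy as np
import tensorflow as tf

def load_and_resize(jpg_path: str) -> np.ndarray:
    """Load one exported CT slice and reshape it to (224, 224, 3)."""
    raw = tf.io.read_file(jpg_path)
    image = tf.image.decode_jpeg(raw, channels=3)  # replicate grayscale into 3 channels
    image = tf.image.resize(image, (224, 224))     # standardize spatial dimensions
    return image.numpy()

# Hypothetical usage on one slice exported from Horos:
# slice_array = load_and_resize("transition_jpg/slice_0001.jpg")
# assert slice_array.shape == (224, 224, 3)
```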
C. Model Selection
We selected four pretrained models from TensorFlow's Keras API: ResNet50V2, ResNet101V2, DenseNet121, and DenseNet169 [21], [22]. See Figure 2 for a comparison of architectures; a loading sketch follows below.
Fig. 2: ResNet (top) & DenseNet (bottom) block architectures
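All four convolutional bases are available in tf.keras.applications. The sketch below is an illustration of how they can be loaded with ImageNet weights; the paper does not show its loading code, so the arguments here are assumptions:

```python
import tensorflow as tf

# Each architecture is loaded as a convolutional base pretrained on ILSVRC,
# dropping the original 1000-class ImageNet classification head.
constructors = {
    "ResNet50V2":  tf.keras.applications.ResNet50V2,
    "ResNet101V2": tf.keras.applications.ResNet101V2,
    "DenseNet121": tf.keras.applications.DenseNet121,
    "DenseNet169": tf.keras.applications.DenseNet169,
}

bases = {
    name: ctor(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
    for name, ctor in constructors.items()
}

for name, base in bases.items():
    print(f"{name}: {base.count_params():,} parameters")  # cf. Table I
```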
ResNet:
The ResNet is a Deep Residual Network, a type of CNN often employed in the field of computer vision since gaining recognition for winning the 2015 ILSVRC [21]. Deep Residual Networks act similarly to deep CNNs, except they implement a residual connection between a given layer's input and its output, so each block receives both the transformed feature maps and an identity-mapped copy of its input. Residual connections improve upon regular neural networks in two ways: they mitigate the vanishing gradient problem by providing an alternative path for gradient flow, and they allow the model to learn residual functions with reference to the layer inputs, which ensures that deeper layers perform at least as well as shallower layers [21]. A minimal sketch of such a block follows.
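To make the residual connection concrete, here is a simplified pre-activation residual block in Keras; it is a teaching sketch, not the exact bottleneck block used in ResNet50V2/ResNet101V2:

```python
from tensorflow.keras import layers

def residual_block(x, filters: int):
    """Pre-activation residual block: output = x + F(x).
    `filters` must equal the channel depth of `x` so the addition is valid."""
    shortcut = x                                    # identity path for gradient flow
    y = layers.BatchNormalization()(x)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.Add()([shortcut, y])              # residual connection
```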
DenseNet:
The DenseNet is very similar to the ResNet; however, instead of retaining a single residual connection between a layer's input and output, any given layer in a DenseNet retains a connection to each of its preceding layers. This entails that any n-th layer will take feature maps as input from all n-1 preceding layers. The dense connectivity of this network is then compacted by grouping dense layers into dense blocks and introducing transition layers, which apply convolutions and pooling operations to the dense blocks' output feature maps, reducing the depth of these feature maps by a compression factor of θ [22]. This compression process allows DenseNets to hold fewer parameters than ResNets, as depicted in Table 1. A simplified sketch of dense connectivity and compression follows.
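By contrast, a simplified sketch of dense connectivity with a θ-compressing transition layer; again an illustration, not the exact DenseNet121/169 blocks:

```python
from tensorflow.keras import layers

def dense_block(x, num_layers: int, growth_rate: int = 32):
    """Each layer receives the concatenated feature maps of all preceding layers."""
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.ReLU()(y)
        y = layers.Conv2D(growth_rate, 3, padding="same")(y)
        x = layers.Concatenate()([x, y])            # dense connection
    return x

def transition_layer(x, theta: float = 0.5):
    """Compress feature-map depth by the factor theta, then downsample."""
    compressed = int(int(x.shape[-1]) * theta)
    x = layers.Conv2D(compressed, 1)(x)             # 1x1 conv reduces depth
    return layers.AveragePooling2D(pool_size=2)(x)
```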
TABLE I: Parameters of each model
D. Multi-Source Transfer Learning
As illustrated in Figure 3, our Multi-Source Transfer Learning process involves three steps:
1) Source step: a randomly initialized model learns a Source task T_s on a Source domain D_s.
2) Transition step: the model is then fine-tuned on a Transition task T_t on a Transition domain D_t.
3) Target step: the model is further fine-tuned on a Target task T_targ on a Target domain D_targ.
Fig. 3: Multi-Source Transfer Learning paradigm

In theory, the Transition step can itself be a sequence of steps; however, in this study we limit the total number of Transition steps to one for computational simplicity. The Multi-Source process allows for increasingly positive transfer of knowledge at each step, assuming that at each step i the domain D_i is a better approximation of the Target domain D_targ than that of its preceding step i-1. Therefore:

$$|D_{i-1} - D_{targ}| > |D_i - D_{targ}| \quad (1)$$

where the dissimilarity between two domains $a$ and $b$ is denoted $|a - b|$ (2). This sequential view is sketched below.
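Conceptually, the paradigm reduces to sequential fine-tuning over an ordered list of domains. The sketch below is a high-level illustration; the function and variable names are ours, and the fit arguments are simplified:

```python
def multi_source_transfer(model, steps):
    """Fine-tune through an ordered sequence of (train_ds, val_ds, attach_head)
    steps, where each step's domain better approximates the Target domain."""
    for train_ds, val_ds, attach_head in steps:
        model = attach_head(model)                  # re-initialize the task head
        model.fit(train_ds, validation_data=val_ds)
    return model

# In this study the Source step is already baked into the ImageNet-pretrained
# weights, so steps = [transition_step, target_step].
```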
Source step: As indicated in Section II-C, our loaded models are pretrained on the ILSVRC dataset. This means that our models have already learned our Source task T_s on our Source domain D_s.
Transition step: For our Transition step, we explored two alternative Transition tasks via different labeling strategies: Soft Labeling and Hard Labeling.

• Hard Labeling: hard labels can be thought of as ground-truth labels. In other words, for the Hard Labeling strategy we utilize the original labels provided with the Transition dataset: malignant (1) or benign (0). This binary labeling strategy makes our Hard Labeling Transition task T_t-hard a binary classification task.

• Soft Labeling: soft labels can be thought of as an unsupervised labeling strategy that can be applied to unlabeled data. Previous attempts at soft labeling employ an auxiliary task, which teaches a model how to learn unlabeled data when knowledge of the domain is available [15]. Alternatively, we explore a soft labeling strategy in which knowledge of the domain is either not known or ignored.

Fig. 4: Unsupervised creation of Soft Labels

As shown in Figure 4, our soft labeling strategy begins by feeding the Transition data through a pretrained InceptionV3 convolutional base, provided by TensorFlow [23]. We utilize the InceptionV3 base as a trained feature extractor, which converts our original input images of dimensions (224, 224, 3) into feature maps of dimensions (5, 5, 2048). The resulting feature maps are then flattened and fed into a KMeans clustering algorithm, provided by scikit-learn, which clusters the data into 16 distinct feature groups [24]. We then apply a corresponding label to each cluster, resulting in 16 labels and making the Transition task T_t-soft a 16-class classification task (see the code sketch at the end of this Transition step). To select the number of clusters k, we performed a grid search with 10-fold cross validation over a range of candidate k values.

After acquiring our hard and soft labels, we configured our models to fit the Transition tasks. Because our pretrained models are loaded as convolutional bases, we added randomly initialized pooling and fully connected layers appropriate to each model. For the ResNet models we appended a 2-dimensional average pooling layer, a flatten layer, and a dense layer of 1,000 nodes. For the DenseNet models we appended a 2-dimensional global average pooling layer and a dense layer of 1,000 nodes. For each model, the output layer was either a 16-node layer with a SoftMax activation for Transition task T_t-soft or a single-node layer with a Sigmoid activation for Transition task T_t-hard.

We further configured our models by freezing the shallower half of all convolutional layers. Freezing layers is a common strategy when fine-tuning models, for two main reasons: first, it mitigates overfitting by reducing the total number of trainable parameters in the network; second, it avoids redundant learning of low-level generic features already captured by our Source models [15].

After our models were configured for the Transition step, we split the Transition data into 80:20 Train and Validation sets, respectively. We then fine-tuned each model with Stochastic Gradient Descent and a batch size of 32, saving the weights that yielded the highest validation accuracy. Sparse Categorical Cross Entropy loss was utilized for the Transition task T_t-soft, and Binary Cross Entropy loss was utilized for the Transition task T_t-hard.
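A minimal sketch of the soft-label pipeline described above, assuming the filtered Transition slices have already been stacked into one NumPy array; variable names are illustrative:

```python
import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

# InceptionV3 base acts as a fixed, pretrained feature extractor.
extractor = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))

def make_soft_labels(images: np.ndarray, k: int = 16) -> np.ndarray:
    """Map (N, 224, 224, 3) slices to k cluster labels via InceptionV3 + KMeans."""
    features = extractor.predict(images)            # -> (N, 5, 5, 2048) feature maps
    flat = features.reshape(len(images), -1)        # flatten to (N, 51200)
    return KMeans(n_clusters=k).fit_predict(flat)   # unsupervised cluster assignments

# transition_images: preprocessed Transition slices stacked into one array
# soft_labels = make_soft_labels(transition_images, k=16)
```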
Target step: To begin the Target step, we once again randomly initialized the pooling and fully connected layers of each model. This re-initialization ensures that we fine-tune the fully connected layers on the Target task T_targ alone. The binary nature of T_targ moreover required replacing the SoftMax activation function with a Sigmoid activation function in the soft-label models. We also added a dropout layer with a rate of 0.5 prior to the output layer of each model to further mitigate overfitting.
TABLE II: Division of Target dataset

As shown in Table 2, the Target dataset was split into 60:15:25 Train, Validation, and Test sets, respectively. We then fine-tuned each model with Stochastic Gradient Descent with a momentum of 0.9 for 60 epochs and a batch size of 32, saving the weights that yielded the highest validation accuracy; a configuration sketch follows.
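The Target-step head and training loop can be sketched as follows. The learning rate and the hidden activation are assumptions (the paper does not report them), the pooling head is shown in its DenseNet form (the ResNet heads use average pooling plus flatten), and train_ds/val_ds are placeholders for batched (size 32) Target splits:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_target_model(base):
    """Attach a freshly initialized head for the binary COVID-19 vs. normal task."""
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(1000, activation="relu"),      # hidden activation assumed
        layers.Dropout(0.5),                        # mitigates overfitting
        layers.Dense(1, activation="sigmoid"),      # single-node binary output
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),  # lr assumed
        loss="binary_crossentropy",
        metrics=["accuracy"])
    return model

# Keep only the weights that yield the highest validation accuracy.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_target.weights.h5", monitor="val_accuracy",
    save_best_only=True, save_weights_only=True)

# model = build_target_model(base)
# model.fit(train_ds, validation_data=val_ds, epochs=60, callbacks=[checkpoint])
```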
III. RESULTS
In this section we present the results of the experiment outlined in the Methods section. As this paper aims to enhance traditional Transfer Learning with a multi-source process, we compare the performances of our MSTL models against those of their baseline models pretrained on the ILSVRC dataset and fine-tuned on the Target dataset, referred to as the 'ImageNet' models. The performances of our finalized models are compared against their baselines to assess the magnitude of positive or negative transfer.

After considering our two alternate labeling strategies used on our four models, we are left with 8 finalized MSTL models: ResNet50V2: Soft Labels, ResNet101V2: Soft Labels, DenseNet121: Soft Labels, DenseNet169: Soft Labels, ResNet50V2: Hard Labels, ResNet101V2: Hard Labels, DenseNet121: Hard Labels, and DenseNet169: Hard Labels.

We first plotted the Receiver Operating Characteristic (ROC) of our finalized models against their baseline 'ImageNet' models, isolated by model architecture. The ROC curve plots the true positive rate against the false positive rate of samples predicted from the Test set. As shown in Figure 5, when using a Hard Labeling strategy, both the ResNet50V2 and ResNet101V2 models outperformed their baseline AUCs, by 1.8% and 5.4%, respectively. However, their performances were not as exceptional when utilizing a Soft Labeling strategy, as only the ResNet101V2 outperformed its baseline AUC, by 3.4%, while the ResNet50V2 underperformed by 2.9%. For the DenseNets, when using a Hard Labeling strategy, the deeper DenseNet169 outperformed its baseline AUC by 0.8%; however, the shallower DenseNet121 underperformed by 0.2%. When using a Soft Labeling strategy, both DenseNet121 and DenseNet169 underperformed, by 1.2% and 3.0%, respectively.

Fig. 5: ROC Curve per model architecture. True positive rate is plotted on the y-axis; false positive rate on the x-axis.

We then plotted the ROC of our finalized models isolated by labeling strategy. As shown in Figure 6, our DenseNet models outperformed our ResNet models by a margin of at least 0.7% when employing a Hard Labeling strategy, and our ResNet models outperformed our DenseNet models by a margin of at least 0.9% when employing a Soft Labeling strategy. When using Hard Labels, the DenseNet169 model achieved a superior AUC of 0.965, followed by the DenseNet121 with an AUC of 0.947. When using Soft Labels, the ResNet101V2 model achieved a superior AUC of 0.960, followed by the ResNet50V2 with an AUC of 0.946.

F1, Accuracy, Precision, and Recall scores were then calculated to further evaluate the models' performances; a sketch of this computation follows. As depicted in Table 3, the DenseNet169: Hard Labels model achieves superior F1, Accuracy, and Precision scores of 0.903, 0.904, and 0.944, while the ResNet101V2: Soft Labels model achieves a superior Recall score of 0.897.
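These scores can be reproduced from Test-set predictions with scikit-learn; in this sketch, y_true and y_prob are placeholders for the Test labels and the models' sigmoid outputs:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

def evaluate(y_true: np.ndarray, y_prob: np.ndarray, threshold: float = 0.5) -> dict:
    """Compute the Table III scores plus ROC AUC from sigmoid outputs."""
    y_pred = (y_prob >= threshold).astype(int)      # probabilities -> hard predictions
    return {
        "F1":        f1_score(y_true, y_pred),
        "Accuracy":  accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred),
        "Recall":    recall_score(y_true, y_pred),  # sensitivity: the key metric here
        "AUC":       roc_auc_score(y_true, y_prob),
    }
```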
IV. DISCUSSION
When evaluating the performances of our models, it is essential to recall our research objective. As stated in the Introduction, our purpose in this study is to develop a highly sensitive classification model for the detection of COVID-19. We emphasize 'sensitive' because the sensitivity of the model takes precedence over all other metrics in the case of a medical imaging diagnosis. In other words, the cost of a False Negative greatly surpasses the cost of a False Positive, especially in the diagnosis of an infectious disease. In the case of a False Positive, the worst outcome is that an individual without COVID-19 is told to self-isolate for two weeks. In the case of a False Negative, the worst outcome is that an individual with COVID-19 continues to spread the infection after receiving a negative diagnosis.

Fig. 6: ROC Curve per labeling method. True positive rate is plotted on the y-axis; false positive rate on the x-axis.

Therefore, the model with the highest Recall score is our most desirable model, as it has the smallest chance of producing a False Negative. The classification results of our two highest performing models are visualized in Figure 7. As shown in the confusion matrices, although the DenseNet169: Hard Labels model has fewer total misclassifications than the ResNet101V2: Soft Labels model, the ResNet101V2: Soft Labels model has 3 fewer False Negative predictions and is thus more desirable.

It should be noted that a significant finding of this study was the variance in performance between our DenseNets and ResNets with respect to the labeling strategy. As stated in the Results section, the DenseNets outperformed the ResNets when utilizing a Hard Labeling strategy, while the ResNets outperformed the DenseNets when utilizing a Soft Labeling strategy. We attribute these discrepancies in performance largely to the number of parameters in our models. As shown in Table 1, the ResNets are much more complex than the DenseNets, as they retain a larger number of total parameters. This greater complexity most likely caused the ResNets to overfit more to the binary classification Transition task T_t-hard than to the multi-class classification task T_t-soft. However, to confirm these hypotheses, further analysis is required to assess the relationship between the number of classes in T_t-soft and the performances of our models.
A. Limitations & Future Work
Although our results appropriately reflect our research objective, certain limitations apply to this study.

The first limitation is that CT scans are difficult to implement for mass testing. While scan results would be returned much more quickly than RT-PCR Assays, CT scanning would need to be conducted indoors and under the same machine for thousands of patients. Given the airborne nature of this infectious disease, an indoor testing environment is anything but ideal.

The second limitation is that CT scans are usually stored as 3-dimensional DICOM files, while our study requires 2-dimensional axial slices as input. This issue was exhibited in Section II-B, when our Transition dataset of 3-dimensional scans needed to be manually decomposed into relevant 2-dimensional slices. This preprocessing step was very time-consuming, as it required manual selection of 10,176 CT slices displaying the regions of interest. To address this limitation, our study can be improved by automating the slice selection process. In this scenario, we could retain our original 2-dimensional architecture and Target dataset; the only expense would be training an independent classifier to assess whether a given slice contains the region of interest we seek.

TABLE III: F1, Accuracy, Precision and Recall scores

Fig. 7: Confusion matrices for the DenseNet169: Hard Labels (left) and ResNet101V2: Soft Labels (right) models

A third limitation is that confounding diseases can disrupt the performance of our models. As our models were trained to distinguish COVID-19 from normal CT scans, they run the risk of classifying other lung diseases as False Positives (e.g., Influenza A). Although this presents an issue, we can diminish the risk of this type of misclassification by including other diseases in our Target dataset.
V. CONCLUSION
In this study we presented a Multi-Source Transfer Learning approach for the classification of COVID-19 from CT scans. By learning to classify an additional dataset of images more closely related to the Target domain, our models were able to outperform baseline models fine-tuned with traditional Transfer Learning methods. We additionally proposed an unsupervised label creation process, which further improved the performances of our Deep Residual Networks. The results of this study show the following: Transfer Learning can be improved by bridging the gap between the Source domain and the Target domain with a target-related Transition domain; unsupervised label creation has the potential to improve the performance of Deep Residual Networks; and, with limited data, the application of Computer Vision for the detection of COVID-19 from CT scans exhibits high sensitivity and should be further investigated with the discussed limitations in mind.

VI. ACKNOWLEDGEMENTS
This work was initiated and facilitated by CS 89.20/189 - Data Science for Health (Spring 2020) at Dartmouth College, taught by Professor Temiloluwa Prioleau.
REFERENCES
[1] "WHO Timeline - COVID-19," World Health Organization.
[2] "Maps & Trends," Johns Hopkins Coronavirus Resource Center.
[3] T. Ai, Z. Yang, H. Hou, C. Zhan, C. Chen, W. Lv, Q. Tao, Z. Sun, and L. Xia, "Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases," Radiology, p. 200642, 2020.
[4] Center for Devices and Radiological Health, "Emergency use authorizations."
[5] Y. Fang, H. Zhang, J. Xie, M. Lin, L. Ying, P. Pang, and W. Ji, "Sensitivity of chest CT for COVID-19: comparison to RT-PCR," Radiology, p. 200432, 2020.
[6] E. J. Emanuel, G. Persad, R. Upshur, B. Thome, M. Parker, A. Glickman, C. Zhang, C. Boyle, M. Smith, J. P. Phillips, et al., "Fair allocation of scarce medical resources in the time of COVID-19," New England Journal of Medicine, vol. 382, no. 21, pp. 2049–2055, 2020.
[7] M. D. Hope, C. A. Raptis, and T. S. Henry, "Chest computed tomography for detection of coronavirus disease 2019 (COVID-19): don't rush the science," 2020.
[8] M. Castillo, "The industry of CT scanning," American Journal of Neuroradiology, vol. 33, no. 4, pp. 583–585, 2011.
[9] K. Suzuki, "Overview of deep learning in medical imaging," Radiological Physics and Technology, vol. 10, no. 3, pp. 257–273, 2017.
[10] R. Yamashita, M. Nishio, R. K. G. Do, and K. Togashi, "Convolutional neural networks: an overview and application in radiology," Insights into Imaging, vol. 9, no. 4, pp. 611–629, 2018.
[11] Q. Song, L. Zhao, X. Luo, and X. Dou, "Using deep learning for classification of lung nodules on computed tomography images," Journal of Healthcare Engineering, vol. 2017, 2017.
[12] P. Rajpurkar, J. Irvin, K. Zhu, B. Yang, H. Mehta, T. Duan, D. Ding, A. Bagul, C. Langlotz, K. Shpanskaya, et al., "CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning," arXiv preprint arXiv:1711.05225, 2017.
[13] X. He, X. Yang, S. Zhang, J. Zhao, Y. Zhang, E. Xing, and P. Xie, "Sample-efficient deep learning for COVID-19 diagnosis based on CT scans," medRxiv, 2020.
[14] C. Butt, J. Gill, D. Chun, and B. A. Babu, "Deep learning system to screen coronavirus disease 2019 pneumonia," Applied Intelligence, p. 1, 2020.
[15] S. J. Pan and Q. Yang, "A survey on transfer learning," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, 2009.
[16] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, "TensorFlow: Large-scale machine learning on heterogeneous systems," 2015. Software available from tensorflow.org.
[17] "SPIE-AAPM Lung CT Challenge," The Cancer Imaging Archive (TCIA) Public Access, Cancer Imaging Archive Wiki.
[18] J. Zhao, Y. Zhang, X. He, and P. Xie, "COVID-CT-Dataset: a CT scan dataset about COVID-19," arXiv preprint arXiv:2003.13865, 2020.
[19] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, IEEE, 2009.
[20] A. Sorokin and D. Forsyth, "Utility data annotation with Amazon Mechanical Turk," in 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 1–8, IEEE, 2008.
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
[22] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708, 2017.
[23] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826, 2016.
[24] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, "Scikit-learn: Machine learning in Python," Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.