COVID-Net CT-2: Enhanced Deep Neural Networks for Detection of COVID-19 from Chest CT Images Through Bigger, More Diverse Learning
Hayden Gunraj*, Ali Sabri, David Koff, and Alexander Wong*

Department of Mechanical and Mechatronics Engineering, University of Waterloo, Canada; Department of Systems Design Engineering, University of Waterloo, Canada; Waterloo Artificial Intelligence Institute, Canada; DarwinAI Corp., Canada; Department of Radiology, Niagara Health, McMaster University, Canada; Department of Radiology, Hamilton Health Sciences, McMaster University, Canada

*Corresponding authors: {hayden.gunraj,a28wong}@uwaterloo.ca
Abstract
Background:
The COVID-19 pandemic continues to rage on around the world, with multiple waves causing substantial harm to health and economies. Motivated by the use of computed tomography (CT) imaging at clinical institutes around the world as an effective complementary screening method to RT-PCR testing, we introduced COVID-Net CT, a deep neural network tailored for detection of COVID-19 cases from chest CT images, along with a large curated benchmark dataset comprising 1,489 patient cases as part of the open-source COVID-Net initiative. However, one potential limiting factor is restricted quantity and diversity, given the single-nation patient cohort used in the study.
Methods: Motivated by the success of COVID-Net CT, we introduce COVID-Net CT-2, enhanced deep neural networks for COVID-19 detection from chest CT images trained on the largest quantity and diversity of multinational patient cases in the research literature. We accomplish this through the introduction of two new CT benchmark datasets, the largest of which comprises a multinational cohort of 4,501 patients from at least 15 countries. To the best of our knowledge, this represents the largest, most diverse multinational cohort for COVID-19 CT images in open access form. We leverage explainability to investigate the decision-making behaviour of COVID-Net CT-2 to ensure that decisions are based on relevant indicators, with the results for select cases reviewed and reported on by two board-certified radiologists with over 10 and 30 years of experience, respectively.
Results:
The COVID-Net CT-2 neural networks achieved accuracy, COVID-19 sensitivity, positive predictive value, specificity, and negative predictive value of 98.1%/96.2%/96.7%/99.0%/98.8% and 97.9%/95.7%/96.4%/98.9%/98.7%, respectively. Explainability-driven performance validation shows that COVID-Net CT-2's decision-making behaviour is consistent with radiologist interpretation by leveraging correct, clinically relevant critical factors.
Conclusions:
The results are promising and suggest the strong potential of deep neural networks as an effective tool for computer-aided COVID-19 assessment. While not a production-ready solution, we hope the open-source, open-access release of COVID-Net CT-2 and the benchmark datasets will continue to enable researchers, clinicians, and citizen data scientists alike to build upon them. https://github.com/haydengunraj/COVIDNet-CT

Preprint. Under review.

Introduction
The coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), continues to rage on around the world, with multiple waves causing substantial harm to health and economies. Real-time reverse transcription polymerase chain reaction (RT-PCR) testing remains the primary screening tool for COVID-19, where SARS-CoV-2 ribonucleic acid (RNA) is detected within an upper respiratory tract sputum sample [1]. However, despite being highly specific, the sensitivity of RT-PCR can be relatively low [2, 3] and can vary greatly depending on the time since symptom onset as well as the sampling method [4, 3, 5].

Clinical institutes around the world have explored the use of computed tomography (CT) imaging as an effective, complementary screening tool alongside RT-PCR [2, 5, 6]. In particular, studies have shown CT to have great utility in detecting COVID-19 infections during routine CT examinations for non-COVID-19-related reasons such as elective surgical procedure monitoring and neurological examinations [7, 8]. Other scenarios where CT imaging has been leveraged include cases where patients have worsening respiratory complications, as well as cases where patients with negative RT-PCR test results are suspected to be COVID-19 positive due to other factors. Early studies have shown that a number of potential indicators of COVID-19 infection may be present in chest CT images [9, 10, 11, 12, 2, 5, 6], but these may also be present in non-COVID-19 infections. This can lead to challenges for radiologists in distinguishing COVID-19 infections from non-COVID-19 infections using chest CT [13, 14].

Inspired by the potential of CT imaging as a complementary screening method and the challenges of CT interpretation for COVID-19 screening, we previously introduced COVID-Net CT [15], a deep convolutional neural network tailored for detection of COVID-19 cases from chest CT images.
We further introduced COVIDx CT, a large curated benchmark dataset comprising chest CT scans from a cohort of 1,489 patients derived from a collection by the China National Center for Bioinformation (CNCB) [16]. Both COVID-Net CT and COVIDx CT were made publicly available as part of the COVID-Net [17, 18] initiative, an open-source initiative aimed at accelerating the advancement and adoption of deep learning in the fight against the COVID-19 pandemic. While COVID-Net CT was able to achieve state-of-the-art COVID-19 detection performance, one potential limiting factor is the restricted quantity and diversity of the CT imaging data used to learn the deep neural network, given the entirely Chinese patient cohort used in the study. As such, greater quantity and diversity in the patient cohort has the potential to improve generalization, particularly when COVID-Net CT is leveraged under different clinical settings around the world.

Motivated by the success and widespread adoption of COVID-Net CT and COVIDx CT, as well as their potential data quantity and diversity limitations, in this study we introduce COVID-Net CT-2, enhanced deep convolutional neural networks for COVID-19 detection from chest CT images that are trained on a large, diverse, multinational patient cohort. More specifically, we accomplish this through the introduction of two new CT benchmark datasets (COVIDx CT-2A and COVIDx CT-2B), the largest of which comprises a multinational cohort of 4,501 patients from at least 15 countries. To the best of the authors' knowledge, these benchmark datasets represent the largest, most diverse multinational cohorts for COVID-19 CT images available in open access form. Finally, we leverage explainability to investigate the decision-making behaviour of COVID-Net CT-2 to ensure decisions are based on relevant visual indicators in CT images, with the results for select patient cases being reviewed and reported on by two board-certified radiologists with 10 and 30 years of experience, respectively.
The COVID-Net CT-2 networks and corresponding COVIDx CT-2 datasets are publicly available as part of the COVID-Net initiative [17, 18] (https://alexswong.github.io/COVID-Net). While not a production-ready solution, we hope the open-source, open-access release of the COVID-Net CT-2 networks and the corresponding COVIDx CT-2 benchmark datasets will enable researchers, clinicians, and citizen data scientists alike to build upon them.

In this study, we introduce COVID-Net CT-2 L and COVID-Net CT-2 S, a pair of enhanced deep convolutional neural networks for the detection of COVID-19 from chest CT. To train and test these networks, we further introduce two COVIDx CT-2 benchmark datasets which represent the largest, most diverse multinational patient cohorts for COVID-19 CT images available in open access form, spanning cases from at least 15 countries. A visual overview of COVID-Net CT-2 and COVIDx CT-2 is shown in Figure 1. The methodology behind the preparation of the two COVIDx CT-2 benchmark datasets, the construction and learning of the COVID-Net CT-2 networks, and the explainability-driven performance validation process are described in detail below.

Figure 1: Overview of COVID-Net CT-2 and COVIDx CT-2. The architectures exhibit lightweight designs (∼× and ∼× lower architectural complexity than ResNet-50 [19] for the COVID-Net CT-2 L and S networks, respectively) and selective long-range connectivity to draw a balance between complexity and representational power. The COVID-Net CT-2 design was trained on CT scans from a large, diverse, multinational cohort of patient cases across at least 15 countries (i.e., COVIDx CT-2).

The original COVIDx CT benchmark dataset consists of chest CT scans collected by the China National Center for Bioinformation (CNCB) [16] which were carefully processed and selected to form a cohort of 1,489 patient cases. While COVIDx CT is significantly larger than many CT datasets for COVID-19 detection in the literature, a potential limitation with leveraging COVIDx CT for learning deep neural networks is its limited diversity in terms of patient demographics.
More specifically, the cohort of patients used in COVIDx CT was collected in different provinces of China, and as such the characteristics of COVID-19 infection as observed in the chest CT images may not generalize to patients around the world outside of China. Therefore, increasing the quantity and diversity of the patient cohort in constructing new benchmark datasets could result in more diverse, well-rounded learning of deep neural networks. In doing so, improved generalization and applicability for use under different clinical environments around the world can be achieved.

In this study, we carefully processed and curated CT images from several patient cohorts from around the world which were collected using a variety of CT equipment types, protocols, and levels of validation. By unifying CT imaging data from several cohorts from around the world, we created two diverse, large-scale benchmark datasets:

•
COVIDx CT-2A: This benchmark dataset comprises 194,922 CT images from a multinational cohort of 3,745 patients between 0 and 93 years old (median age of 51) with strongly clinically-verified findings. The multinational cohort consists of patient cases collected by the following organizations and initiatives from around the world: (1) China National Center for Bioinformation (CNCB) [16] (China), (2) National Institutes of Health Intramural Targeted Anti-COVID-19 (ITAC) Program (hosted by TCIA [20], countries unknown), (3) Negin Radiology Medical Center [21] (Iran), (4) Union Hospital and Liyuan Hospital of Huazhong University of Science and Technology [22] (China), (5) COVID-19 CT Lung and Infection Segmentation initiative, annotated and verified by Nanjing Drum Tower Hospital [23] (Iran, Italy, Turkey, Ukraine, Belgium, some countries unknown), (6) Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) [24] (countries unknown), and (7) Radiopaedia collection [25] (Iran, Italy, Australia, Afghanistan, Scotland, Lebanon, England, Algeria, Peru, Azerbaijan, some countries unknown).

Figure 2: Patient distribution across training, validation, and test for COVIDx CT-2A (left) and COVIDx CT-2B (right).

•
COVIDx CT-2B: This benchmark dataset comprises 201,103 CT images from a multinational cohort of 4,501 patients between 0 and 93 years old (median age of 51) with a mix of strongly verified and weakly verified findings. The patient cohort in COVIDx CT-2B consists of the multinational patient cohort we leveraged to construct COVIDx CT-2A, which has strongly clinically-verified findings, with additional patient cases with weakly verified findings collected by the Research and Practical Clinical Center of Diagnostics and Telemedicine Technologies, Department of Health Care of Moscow (MosMed) [26] (Russia). Notably, these additional cases are only included in the training dataset, and as such the validation and test datasets are identical to those of COVIDx CT-2A.

In both COVIDx CT-2 benchmark datasets, the findings for the chest CT volumes correspond to three different infection types: (1) novel coronavirus pneumonia due to SARS-CoV-2 viral infection (NCP), (2) common pneumonia (CP), and (3) normal controls, with the patient distribution for the three infection types across training, validation, and test shown in Figure 2. For CT volumes labelled as NCP or CP, slices containing abnormalities were identified and assigned the same labels as the CT volumes. Notably, patient age was not available for all cases, and as such the age ranges and median ages reported above are based on the patient cases for which age was available.
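Because each patient contributes many CT slices, splits such as those shown in Figure 2 must be made at the patient level so that no patient's slices appear in more than one subset. A minimal sketch of such a split follows; the function name, fractions, and seed are illustrative assumptions (the actual COVIDx CT-2 splits are distributed with the datasets):

```python
import random

def patient_level_split(patient_ids, train_frac=0.8, val_frac=0.1, seed=42):
    """Split *patients* (not individual slices) into train/val/test subsets
    so that all slices from one patient land in a single subset, avoiding
    slice-level leakage between subsets.
    """
    ids = sorted(set(patient_ids))          # deduplicate repeated slice IDs
    random.Random(seed).shuffle(ids)        # deterministic shuffle
    n_train = int(len(ids) * train_frac)
    n_val = int(len(ids) * val_frac)
    return (set(ids[:n_train]),
            set(ids[n_train:n_train + n_val]),
            set(ids[n_train + n_val:]))
```

Slices are then assigned to a subset by looking up their patient ID in the returned sets.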
For images which were originally in Hounsfield units (HU), a standard lung window centered at -600 HU with a width of 1500 HU was used to map the images to the unsigned 8-bit integer range (i.e., [0, 255]).

The rationale for creating two different COVIDx CT-2 benchmark datasets stems from the availability of weakly verified findings (i.e., findings not based on RT-PCR test results or final radiology reports), which can be useful for further increasing the quantity and diversity of patient cases that a deep neural network can be exposed to, and can be of great interest for researchers, clinicians, and citizen scientists to explore and build upon while being made aware of the fact that some of the CT scans do not have strongly verified findings available. Both the COVIDx CT-2A and COVIDx CT-2B benchmark datasets are publicly available as part of the COVID-Net initiative, with example CT images from each type of infection shown in Figure 3.

2.2 COVID-Net CT-2 construction and learning

By leveraging the COVIDx CT-2 benchmark datasets introduced in the previous section, we build the COVID-Net CT-2 deep convolutional neural networks in a way that is more generalizable and more readily adoptable to a wider range of clinical scenarios around the world through bigger, more diverse learning on the largest quantity and diversity of multinational patient cases in the research literature. More specifically, two COVID-Net CT-2 networks are built (COVID-Net CT-2 L and COVID-Net CT-2 S), with both sharing the same macroarchitecture design but a different number of parameters. The COVID-Net CT-2 architecture is shown in Figure 1, and the networks are made publicly available. In particular, we leverage the COVID-Net CT network architecture design proposed in [15] as the core of the architecture designs of the COVID-Net CT-2 networks.
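Returning to the preprocessing step described in the dataset section above, the Hounsfield-unit lung-window mapping (center -600 HU, width 1500 HU, output in [0, 255]) can be sketched in a few lines; the function name and NumPy conventions are illustrative assumptions, not the actual COVIDx CT-2 preparation scripts:

```python
import numpy as np

def apply_lung_window(hu_image, center=-600.0, width=1500.0):
    """Map a CT slice in Hounsfield units to [0, 255] (uint8) using a lung
    window. Window center/width match the values stated in the text.
    """
    lower = center - width / 2.0   # -1350 HU
    upper = center + width / 2.0   #   150 HU
    clipped = np.clip(hu_image, lower, upper)            # saturate outside window
    scaled = (clipped - lower) / (upper - lower) * 255.0  # rescale to [0, 255]
    return scaled.astype(np.uint8)
```

Values below -1350 HU map to 0 and values above 150 HU map to 255, so air and dense tissue saturate while lung parenchyma retains contrast.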
The architecture designs were discovered automatically via a machine-driven design exploration process using generative synthesis [27], where the macroarchitecture and microarchitecture designs of a tailored deep neural network architecture for the task and data at hand are determined via iterative constrained optimization based on a universal performance function (e.g., [28]) and a set of quantitative constraints. The result is highly customized architecture designs that strike a strong balance between complexity and representational power beyond what a human designer can achieve alone.

The COVID-Net CT-2 designs possess several interesting architectural characteristics. First, the COVID-Net CT-2 designs exhibit very diverse yet lightweight designs composed largely of a heterogeneous combination of strided and unstrided depthwise convolutions as well as pointwise convolutions, with unique microarchitecture design characteristics tailored during the machine-driven design exploration process. Second, COVID-Net CT-2 leverages selective long-range connectivity through several point convolution hubs to draw a balance between architectural complexity and representational power. As a result of these macroarchitecture and microarchitecture design traits tailored around COVID-19 detection from CT images, the COVID-Net CT-2 L and S architecture designs have ∼× and ∼× lower architectural complexity than ResNet-50 [19], respectively.

The constructed COVID-Net CT-2 deep convolutional neural networks were trained on the COVIDx CT-2A benchmark dataset via stochastic gradient descent with momentum [29], where the following hyperparameters were leveraged: learning rate = 5e-4, momentum = 0.9, number of epochs = 25, batch size = 64.
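For concreteness, the stochastic-gradient-descent-with-momentum update used for training, with the hyperparameters stated above, can be sketched as follows. This is the classic formulation of the update rule, not code from the COVID-Net repository (TensorFlow's optimizer applies a closely related but differently scaled update):

```python
import numpy as np

# Hyperparameters stated in the text.
LEARNING_RATE = 5e-4
MOMENTUM = 0.9

def sgd_momentum_step(params, grads, velocity,
                      lr=LEARNING_RATE, momentum=MOMENTUM):
    """One SGD-with-momentum update over dicts of NumPy arrays keyed by
    parameter name. Returns updated (params, velocity); a sketch of the
    update rule only, not a full training loop.
    """
    new_params, new_velocity = {}, {}
    for name, p in params.items():
        v = momentum * velocity[name] + grads[name]  # accumulate velocity
        new_velocity[name] = v
        new_params[name] = p - lr * v                # descend along velocity
    return new_params, new_velocity
```

In practice this step runs once per mini-batch of 64 images for 25 epochs over COVIDx CT-2A.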
To further increase data diversity beyond what is provided by the large multinational cohort in order to improve the generalization of COVID-Net CT-2, we leveraged data augmentation in the form of cropping box jitter, rotation, horizontal and vertical shear, horizontal flip, and intensity shift and scaling. The construction, training, and evaluation of the COVID-Net CT-2 networks were conducted using the TensorFlow [30] machine learning library.

As with COVID-Net CT [15], we utilize GSInquire [31] as the explainability method of choice to conduct explainability-driven performance validation. Using GSInquire, we audit the trained COVID-Net CT-2 to better understand and verify its decision-making behaviour when analyzing CT images to predict the condition of a patient. This form of performance validation via model auditing is particularly important in a clinical context, as the decisions made about a patient's condition can affect the health of patients via treatment and care decisions made using a model's predictions. Therefore, examining the decision-making behaviour through model auditing is key to ensuring that the right visual indicators in the CT scans (in the case of COVID-19 infections, visual anomalies such as ground-glass opacities and bilateral abnormalities) are leveraged for making a prediction as opposed to irrelevant visual cues (e.g., synthetic padding, circular scan artifacts, patient table, etc.). Furthermore, incorporating interpretability in the validation process also increases the level of trust that a clinician has in leveraging such models for clinical decision support by adding an extra degree of algorithmic transparency.

To facilitate explainability-driven performance validation via model auditing, GSInquire provides an explanation of how a model makes a decision based on input data by identifying a set of critical factors within the input data that impact the decision-making process of the deep neural network in a quantitatively significant way.
This is accomplished by probing the model with an input signal (in this case, a CT image) as the targeted stimulus signal and observing the reactionary response of the network, from which the critical factors driving its decision are identified.

Table 1: Test accuracy at the image level on the COVIDx CT-2 benchmark test dataset. Best results highlighted in bold.

Network                Accuracy (%)
COVID-Net CT-1 [15]    94.5
COVID-Net CT-2 L       98.1
COVID-Net CT-2 S       97.9
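GSInquire itself is based on generative synthesis and its internals are not detailed here; as an illustrative stand-in for this probe-and-observe process, the sketch below uses simple occlusion sensitivity to locate the image regions that most affect a model's prediction. The `predict_fn` interface and patch parameters are assumptions, and this is not the GSInquire algorithm:

```python
import numpy as np

def occlusion_map(predict_fn, image, target_class, patch=32, stride=32):
    """Crude occlusion-sensitivity map: slide a mean-valued patch over the
    image and record the drop in the target-class probability. A generic
    stand-in for probing a model with modified stimulus signals and
    observing the response; NOT the GSInquire method.

    predict_fn: callable mapping an HxW image to a class-probability vector.
    """
    h, w = image.shape
    base = predict_fn(image)[target_class]          # unperturbed score
    heat = np.zeros((h, w), dtype=np.float32)
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            probe = image.copy()
            probe[y:y + patch, x:x + patch] = image.mean()  # occlude region
            drop = base - predict_fn(probe)[target_class]
            heat[y:y + patch, x:x + patch] = np.maximum(
                heat[y:y + patch, x:x + patch], drop)
    return heat
```

Regions with a large score drop are the analogue of "critical factors": occluding them changes the model's decision the most.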
To explore the efficacy of the COVID-Net CT-2 networks for COVID-19 detection from CT images, we conducted a quantitative evaluation of the trained deep neural networks using the COVIDx CT-2 test dataset. For comparison purposes, we also conduct a quantitative comparison with COVID-Net CT [15] (referred to from here on as COVID-Net CT-1 for clarity), which was previously shown to achieve state-of-the-art performance when compared with state-of-the-art deep neural network architectures such as ResNet-50 [37], NASNet-A-Mobile [38], and EfficientNet-B0 [39] for the task of COVID-19 detection from CT images. The test accuracies of the COVID-Net CT-2 networks and COVID-Net CT-1 are shown in Table 1. It can be observed that COVID-Net CT-2 L and COVID-Net CT-2 S achieved strong test accuracies of 98.1% and 97.9%, respectively, on the COVIDx CT-2 test dataset, while at the same time possessing low architectural complexity.

To audit the decision-making behaviour of COVID-Net CT-2 and ensure that it is leveraging relevant visual indicators when predicting the condition of a patient, we conducted explainability-driven performance validation using GSInquire [31].

Table 2: Sensitivity for each infection type at the image level on the COVIDx CT-2 benchmark test dataset. Best results highlighted in bold.

Network                Sensitivity (%)
                       Normal   CP     NCP
COVID-Net CT-1 [15]    98.8     –      –
COVID-Net CT-2 L       –        –      96.2
COVID-Net CT-2 S       98.9     98.1   95.7

Table 3: Positive predictive value (PPV) for each infection type at the image level on the COVIDx CT-2 benchmark test dataset. Best results highlighted in bold.

Network                PPV (%)
                       Normal   CP     NCP
COVID-Net CT-1 [15]    96.1     90.2   –
COVID-Net CT-2 L       –        –      96.7
COVID-Net CT-2 S       –        –      96.4
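The per-class metrics reported in Tables 2 through 5 can all be derived from a single confusion matrix over the three infection types. A generic sketch of this computation (not code from the COVID-Net repository):

```python
import numpy as np

def per_class_metrics(confusion):
    """Per-class sensitivity, PPV, specificity, and NPV from a KxK confusion
    matrix (rows = true class, columns = predicted class), treating each
    class one-vs-rest.
    """
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    tp = np.diag(confusion)
    fn = confusion.sum(axis=1) - tp   # cases of this class predicted as others
    fp = confusion.sum(axis=0) - tp   # other classes predicted as this one
    tn = total - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
    }
```

For the COVIDx CT-2 test set, the confusion matrix would be 3x3 over the Normal, CP, and NCP classes.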
The expert radiologist findings and observations with regards to the critical factors identified by GSInquire for each of the four patient cases shown in Figure 4 are as follows. In all four cases, COVID-Net CT-2 L detected them to be novel coronavirus pneumonia due to SARS-CoV-2 viral infection, which was clinically confirmed.
Case 1 (top-left of Figure 4). It was observed by one of the radiologists that there are bilateral peripheral mixed ground-glass and patchy opacities with subpleural sparing, which is consistent with the identified critical factors leveraged by COVID-Net CT-2 L. The absence of large lymph nodes and effusion further helped the radiologist point to novel coronavirus pneumonia due to SARS-CoV-2 viral infection. The degree of severity is observed to be moderate to high. It was confirmed by the second radiologist that the identified critical factors leveraged by COVID-Net CT-2 L are correct areas of concern and represent areas of consolidation with a geographic distribution that is in favour of novel coronavirus pneumonia due to SARS-CoV-2 viral infection.
Case 2 (top-right of Figure 4). It was observed by one of the radiologists that there are bilateral peripherally-located ground-glass opacities with subpleural sparing, which is consistent with the identified critical factors leveraged by COVID-Net CT-2 L. As in Case 1, the absence of large lymph nodes and large effusion further helped the radiologist point to novel coronavirus pneumonia due to SARS-CoV-2 viral infection. The degree of severity is observed to be moderate to high. It was confirmed by the second radiologist that the identified critical factors leveraged by COVID-Net CT-2 L are correct areas of concern and represent areas of consolidation with a geographic distribution that is in favour of novel coronavirus pneumonia due to SARS-CoV-2 viral infection.

Table 4: Specificity for each infection type at the image level on the COVIDx CT-2 benchmark test dataset. Best results highlighted in bold.

Network                Specificity (%)
                       Normal   CP     NCP
COVID-Net CT-1 [15]    96.3     95.7   –
COVID-Net CT-2 L       –        –      99.0
COVID-Net CT-2 S       –        –      98.9

Table 5: Negative predictive value (NPV) for each infection type at the image level on the COVIDx CT-2 benchmark test dataset. Best results highlighted in bold.

Network                NPV (%)
                       Normal   CP     NCP
COVID-Net CT-1 [15]    98.9     –      –
COVID-Net CT-2 L       –        –      98.8
COVID-Net CT-2 S       99.0     99.2   98.7
Case 3 (bottom-left of Figure 4). It was observed by one of the radiologists that there are peripheral bilateral patchy opacities, which is consistent with the identified critical factors leveraged by COVID-Net CT-2 L. Unlike the first two cases, there is a small right effusion. However, as in Cases 1 and 2, the absence of large effusion further helped the radiologist point to novel coronavirus pneumonia due to SARS-CoV-2 viral infection. Considering that the opacities are at the base, a differential of atelectasis change was also provided. The degree of severity is observed to be moderate. It was confirmed by the second radiologist that the identified critical factors leveraged by COVID-Net CT-2 L are correct areas of concern and represent areas of consolidation.
Case 4 (bottom-right of Figure 4). It was observed by one of the radiologists that there are peripherally located asymmetrical bilateral patchy opacities, which is consistent with the identified critical factors leveraged by COVID-Net CT-2 L. As in Cases 1 and 2, the absence of lymph nodes and large effusion further helped the radiologist point to novel coronavirus pneumonia due to SARS-CoV-2 viral infection, but a differential of bacterial pneumonia was also provided considering the bronchovascular distribution of patchy opacities. In addition, there is no subpleural sparing. This highlights the potential difficulties in differentiating between novel coronavirus pneumonia and common pneumonia. It was confirmed by the second radiologist that the identified critical factors leveraged by COVID-Net CT-2 L are correct areas of concern and represent areas of consolidation with a geographic distribution that is in favour of novel coronavirus pneumonia due to SARS-CoV-2 viral infection.

Therefore, it can be observed that the explainability-driven validation process shows consistency between the decision-making process of COVID-Net CT-2 and radiologist interpretation, which suggests strong potential for computer-aided COVID-19 assessment within a clinical environment. Based on both quantitative and qualitative results, it can be seen that not only does COVID-Net CT-2 achieve high performance, but it also leverages relevant abnormalities in the lungs in its decision-making process rather than erroneous visual cues.
In this work, we introduced COVID-Net CT-2, enhanced deep convolutional neural networks tailored for the purpose of COVID-19 detection from chest CT images via more diverse learning on the largest quantity and diversity of multinational patient cases in the research literature. Two new CT benchmark datasets were introduced and used to facilitate the learning of COVID-Net CT-2, and these datasets represent the largest, most diverse, multinational cohorts of their kind available in open access form, spanning cases from at least 15 countries. Experimental results show that the COVID-Net CT-2 networks are capable of not only achieving strong test accuracy, sensitivity, and positive predictive value, but also doing so in a manner that is consistent with radiologist interpretation via explainability-driven performance validation. The results are promising and suggest the strong potential of deep neural networks as an effective tool for computer-aided COVID-19 assessment.

Figure 4: Example chest CT images from four COVID-19 cases reviewed and reported on by two board-certified radiologists, and the associated critical factors (highlighted in red) as identified by GSInquire [31] for COVID-Net CT-2 L. Based on the observations made by the two expert radiologists, it was found that the critical factors leveraged by COVID-Net CT-2 L are consistent with radiologist interpretation.

Given the severity of the COVID-19 pandemic and the potential for deep learning as a tool to facilitate computer-assisted COVID-19 clinical decision support, a number of deep learning systems have been proposed in the research literature for detecting SARS-CoV-2 infections using CT images [14, 16, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 15, 22]. While some proposed deep learning systems focus on binary detection (SARS-CoV-2 positive vs.
negative) [51], several proposed systems operate at a finer level of granularity by further identifying whether SARS-CoV-2 negative cases are normal controls [16, 40, 48, 49], SARS-CoV-2 negative pneumonia (e.g., bacterial pneumonia, viral pneumonia, community-acquired pneumonia (CAP), etc.) [16, 40, 41, 42, 43, 49, 50], or non-pneumonia [42].

Figure 5: Example chest CT images from four COVID-19 cases, and the associated critical factors (highlighted in red) as identified by GSInquire [31] for COVID-Net CT-2 S.

The majority of the proposed deep learning systems for COVID-19 detection from CT images rely on pre-existing network architectures that were designed for other image classification tasks. A large number of proposed systems additionally rely on segmentation of the lung region and/or lung lesions [14, 16, 40, 41, 42, 45, 46, 48, 49]. Some proposed systems also augment pre-existing network architectures, with Xu et al. [40] augmenting a pre-existing ResNet-18 [19] backbone architecture with location-attention classification, and Li et al. [42] and Bai et al. [41] augmenting pre-existing network architectures with pooling operations for volume-driven detection. Of the deep learning systems that proposed new deep neural network architectures, Shah et al. [44] proposed a 10-layer convolutional neural network architecture named CTnet-10, which ultimately showed lower detection performance than pre-existing architectures in the literature. Zheng et al. [46] proposed a 3D convolutional neural network architecture named DeCoVNet which is capable of volume-driven detection. Finally, in the system introduced by Gunraj et al.
[15], machine-driven design exploration was leveraged to construct a deep neural network architecture that is tailored specifically for COVID-19 detection using CT images.

While the concept of leveraging deep learning for COVID-19 detection from CT images has been previously explored, even the largest studies in the research literature in this area have been limited in terms of quantity and/or diversity of patients, with many limited to single-nation cohorts. For example, the studies by Mei et al. [14], Gunraj et al. [15], Ning et al. [22], and Zhang et al. [16] were all limited to Chinese patient cohorts consisting of 905 patients, 1,489 patients, 1,521 patients, and 3,777 patients, respectively. The largest multinational study in the research literature was conducted by Harmon et al. [51], which leveraged a cohort of 2,617 patients across 4 countries. To the best of the authors' knowledge, the largest of the unified multinational patient cohorts introduced in this study represents the largest, most diverse multinational patient cohort at 4,501 patients across at least 15 countries. By building the proposed COVID-Net CT-2 deep neural networks using a large multinational patient cohort, we can better study the generalization capabilities and applicability of deep learning for computer-assisted assessment under a wider diversity of clinical scenarios and demographics.

With the tremendous burden the ongoing COVID-19 pandemic has put on healthcare systems and healthcare workers around the world, the hope is that research such as COVID-Net CT-2 and open-source initiatives such as the COVID-Net initiative can accelerate the advancement and adoption of deep learning solutions within a clinical setting to aid front-line health workers and healthcare systems in improving clinical workflow efficiency and effectiveness in the fight against the COVID-19 pandemic.
While to the best of the authors' knowledge this research does not put anyone at a disadvantage, it is important to note that COVID-Net CT-2 is not a production-ready solution and is meant for research purposes. As such, predictions made by COVID-Net CT-2 should not be utilized blindly and should instead be built upon and leveraged in a human-in-the-loop fashion by researchers, clinicians, and citizen data scientists alike. Future work involves leveraging the core COVID-Net CT-2 backbone for downstream tasks such as lung function prediction, severity assessment, and actionable predictions for guiding personalized treatment and care for SARS-CoV-2 positive patients.
Acknowledgments
We thank the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canada Research Chairs program, the Canadian Institute for Advanced Research (CIFAR), DarwinAI Corp., Justin Kirby of the Frederick National Laboratory for Cancer Research, and the various organizations and initiatives from around the world collecting valuable COVID-19 data to advance science and knowledge. The study has received ethics clearance from the University of Waterloo (42235).
Author contributions statement
H.G. and A.W. conceived the experiments, H.G. conducted the experiments, all authors analysed the results, D.K. and A.S. reviewed and reported on select patient cases and corresponding explainability results illustrating the model's decision-making behaviour, and all authors reviewed the manuscript.
Declaration of interests
A.W. is affiliated with DarwinAI Corp.
References

[1] W. Wang, Y. Xu, R. Gao, R. Lu, K. Han, G. Wu, and W. Tan, "Detection of SARS-CoV-2 in different types of clinical specimens," JAMA, vol. 323, no. 18, pp. 1843–1844, 2020.

[2] Y. Fang, H. Zhang, J. Xie, M. Lin, L. Ying, P. Pang, and W. Ji, "Sensitivity of chest CT for COVID-19: Comparison to RT-PCR," Radiology, vol. 296, no. 2, pp. E115–E117, 2020. PMID: 32073353.

[3] Y. Li, L. Yao, J. Li, L. Chen, Y. Song, Z. Cai, and C. Yang, "Stability issues of RT-PCR testing of SARS-CoV-2 for hospitalized patients clinically diagnosed with COVID-19," Journal of Medical Virology, vol. 92, no. 7, pp. 903–908, 2020.

[4] Y. Yang, M. Yang, C. Shen, F. Wang, J. Yuan, J. Li, M. Zhang, Z. Wang, L. Xing, J. Wei, L. Peng, G. Wong, H. Zheng, M. Liao, K. Feng, J. Li, Q. Yang, J. Zhao, Z. Zhang, L. Liu, and Y. Liu, "Evaluating the accuracy of different respiratory specimens in the laboratory diagnosis and monitoring the viral shedding of 2019-nCoV infections," medRxiv, 2020.

[5] T. Ai, Z. Yang, H. Hou, C. Zhan, C. Chen, W. Lv, Q. Tao, Z. Sun, and L. Xia, "Correlation of chest CT and RT-PCR testing for coronavirus disease 2019 (COVID-19) in China: A report of 1014 cases," Radiology, vol. 296, no. 2, pp. E32–E40, 2020. PMID: 32101510.

[6] X. Xie, Z. Zhong, W. Zhao, C. Zheng, F. Wang, and J. Liu, "Chest CT for typical coronavirus disease 2019 (COVID-19) pneumonia: Relationship to negative RT-PCR testing," Radiology, vol. 296, no. 2, pp. E41–E45, 2020. PMID: 32049601.

[7] S. Tian, W. Hu, L. Niu, H. Liu, H. Xu, and S.-Y. Xiao, "Pulmonary pathology of early-phase 2019 novel coronavirus (COVID-19) pneumonia in two patients with lung cancer," Journal of Thoracic Oncology, 2020.

[8] J. Shatri, L. Tafilaj, A. Turkaj, K. Dedushi, M. Shatri, S. Bexheti, and S. K. Mucaj, "The role of chest computed tomography in asymptomatic patients of positive coronavirus disease 2019: A case and literature review," Journal of Clinical Imaging Science, 2020.

[9] W.-j. Guan, Z.-y. Ni, Y. Hu, W.-h. Liang, C.-q. Ou, J.-x. He, L. Liu, H. Shan, C.-l. Lei, D. S. Hui, B. Du, L.-j. Li, G. Zeng, K.-Y. Yuen, R.-c. Chen, C.-l. Tang, T. Wang, P.-y. Chen, J. Xiang, S.-y. Li, J.-l. Wang, Z.-j. Liang, Y.-x. Peng, L. Wei, Y. Liu, Y.-h. Hu, P. Peng, J.-m. Wang, J.-y. Liu, Z. Chen, G. Li, Z.-j. Zheng, S.-q. Qiu, J. Luo, C.-j. Ye, S.-y. Zhu, and N.-s. Zhong, "Clinical characteristics of coronavirus disease 2019 in China," New England Journal of Medicine, vol. 382, no. 18, pp. 1708–1720, 2020.

[10] D. Wang, B. Hu, C. Hu, F. Zhu, X. Liu, J. Zhang, B. Wang, H. Xiang, Z. Cheng, Y. Xiong, Y. Zhao, Y. Li, X. Wang, and Z. Peng, "Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus–infected pneumonia in Wuhan, China,"
JAMA , vol. 323, no. 11, pp.1061–1069, 03 2020.[11] M. Chung, A. Bernheim, X. Mei, N. Zhang, M. Huang, X. Zeng, J. Cui, W. Xu, Y. Yang, Z. A.Fayad, A. Jacobi, K. Li, S. Li, and H. Shan, “Ct imaging features of 2019 novel coronavirus(2019-ncov),”
Radiology , vol. 295, no. 1, pp. 202–207, 2020, pMID: 32017661.[12] F. Pan, T. Ye, P. Sun, S. Gui, B. Liang, L. Li, D. Zheng, J. Wang, R. L. Hesketh, L. Yang, andC. Zheng, “Time course of lung changes at chest ct during recovery from coronavirus disease2019 (covid-19),”
Radiology , vol. 295, no. 3, pp. 715–721, 2020, pMID: 32053470.[13] H. X. Bai, B. Hsieh, Z. Xiong, K. Halsey, J. W. Choi, T. M. L. Tran, I. Pan, L.-B. Shi, D.-C.Wang, J. Mei, X.-L. Jiang, Q.-H. Zeng, T. K. Egglin, P.-F. Hu, S. Agarwal, F.-F. Xie, S. Li,T. Healey, M. K. Atalay, and W.-H. Liao, “Performance of radiologists in differentiating covid-19 from non-covid-19 viral pneumonia at chest ct,”
Radiology , vol. 296, no. 2, pp. E46–E54,2020, pMID: 32155105.[14] X. Mei, H.-C. Lee, K.-y. Diao, M. Huang, B. Lin, C. Liu, Z. Xie, Y. Ma, P. Robson, M. Chung,A. Bernheim, V. Mani, C. Calcagno, K. Li, S. Li, H. Shan, J. Lv, T. Zhao, J. Xia, and Y. Yang,“Artificial intelligence–enabled rapid diagnosis of patients with covid-19,”
Nature Medicine , pp.1–5, 05 2020.[15] H. Gunraj, L. Wang, and A. Wong, “COVIDNet-CT: A tailored deep convolutional neuralnetwork design for detection of COVID-19 cases from chest CT images,”
Frontiers in Medicine ,2020. [Online]. Available: https://doi.org/10.3389/fmed.2020.608525[16] K. Zhang, X. Liu, J. Shen, Z. Li, Y. Sang, X. Wu, Y. Zha, W. Liang, C. Wang, K. Wang,L. Ye, M. Gao, Z. Zhou, L. Li, J. Wang, Z. Yang, H. Cai, J. Xu, L. Yang, W. Cai, W. Xu,S. Wu, W. Zhang, S. Jiang, L. Zheng, X. Zhang, L. Wang, L. Lu, J. Li, H. Yin, W. Wang,O. Li, C. Zhang, L. Liang, T. Wu, R. Deng, K. Wei, Y. Zhou, T. Chen, J. Y.-N. Lau, M. Fok,J. He, T. Lin, W. Li, and G. Wang, “Clinically applicable ai system for accurate diagnosis,quantitative measurements, and prognosis of covid-19 pneumonia using computed tomography,”
Cell , vol. 18, no. 6, pp. 1423–1433, 2020.[17] L. Wang, Z. Q. Lin, and A. Wong, “Covid-net: A tailored deep convolutional neural networkdesign for detection of covid-19 cases from chest x-ray images,”
Scientific Reports , 2020.[18] A. Wong, Z. Q. Lin, L. Wang, A. G. Chung, B. Shen, A. Abbasi, M. Hoshmand-Kochi, and T. Q.Duong, “Covidnet-s: Towards computer-aided severity assessment via training and validationof deep neural networks for geographic extent and opacity extent scoring of chest x-rays forsars-cov-2 lung disease severity,” 2020.[19] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in , 2016, pp. 770–778.[20] A. P, X. S, H. SA, T. EB, S. TH, A. A, K. M, V. N, B. M, A. V, P. F, C. G, T. BT, and W. BJ, “Ctimages in covid-19 [data set],” in
The Cancer Imaging Archive , 2020. [Online]. Available:https://doi.org/10.7937/tcia.2020.gqry-nc811321] M. Rahimzadeh, A. Attar, and S. M. Sakhaei, “A fully automated deep learning-based networkfor detecting covid-19 from a new and large lung ct scan dataset,” medRxiv et al. , “Open resource of clinical data from patients with pneumoniafor the prediction of covid-19 outcomes via deep learning,”
Nature Biomedical Engineering ,vol. 4, pp. 1197–1207, 2020.[23] M. Jun, W. Yixin, A. Xingle, G. Cheng, Y. Ziqi, C. Jianan, Z. Qiongjie, D. Guoqiang, H. Jian,H. Zhiqiang, N. Ziwei, and Y. Xiaoping, “Towards efficient covid-19 ct annotation: A benchmarkfor lung and infection segmentation,” arXiv preprint arXiv:2004.12537 , 2020.[24] S. Armato III, G. McLennan, L. Bidaut, M. McNitt-Gray, C. Meyer, A. Reeves, B. Zhao,D. Aberle, C. Henschke, E. A. Hoffman, E. Kazerooni, H. MacMahon, E. van Beek,D. Yankelevitz, A. Biancardi, P. Bland, M. Brown, R. Engelmann, G. Laderach, D. Max,R. Pais, D. Qing, R. Roberts, A. Smith, A. Starkey, P. Batra, P. Caligiuri, A. Farooqi, G. Gladish,C. Jude, R. Munden, I. Petkovska, L. Quint, L. Schwartz, B. Sundaram, L. Dodd, C. Fenimore,D. Gur, N. Petrick, J. Freymann, J. Kirby, B. Hughes, A. Casteele, S. Gupte, M. Sallam,M. Heath, M. Kuhn, E. Dharaiya, R. Burns, D. Fryd, M. Salganicoff, V. Anand, U. Shreter,S. Vastagh, B. Croft, and L. Clarke, “Data from lidc-idri,” in
The Cancer Imaging Archive ,2015. [Online]. Available: http://doi.org/10.7937/K9/TCIA.2015.LO9QL9SX[25] “Covid-19,”
Radiopaedia . [Online]. Available: https://radiopaedia.org/articles/covid-19-4[26] S. Morozov, A. Andreychenko, N. Pavlov, A. Vladzymyrskyy, N. Ledikhova, V. Gombolevskiy,I. Blokhin, P. Gelezhe, A. Gonchar, and V. Chernina, “Mosmeddata: Chest ctscans with covid-19 related findings dataset,” medRxiv
CoRR , vol. abs/1806.05512, 2018. [Online]. Available:http://arxiv.org/abs/1806.05512[29] N. Qian, “On the momentum term in gradient descent learning algorithms,”
Neural Networks ,vol. 12, no. 1, pp. 145–151, 1999.[30] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis,J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Joze-fowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah,M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Va-sudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng,“TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015, software availablefrom tensorflow.org.[31] Z. Q. Lin, M. J. Shafiee, S. Bochkarev, M. S. Jules, X. Y. Wang, and A. Wong, “Do explanationsreflect decisions? a machine-centric strategy to quantify the performance of explainabilityalgorithms,” 2019.[32] D. Kumar, A. Wong, and G. W. Taylor, “Explaining the unexplained: A class-enhancedattentive response (clear) approach to understanding deep neural networks,”
IEEE Conferenceon Computer Vision and Pattern Recognition Workshops , 2017.[33] S. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” 2017.[34] M. T. Ribeiro, S. Singh, and C. Guestrin, “"why should i trust you?": Explaining the predictionsof any classifier,” 2016.[35] G. Erion, J. D. Janizek, P. Sturmfels, S. Lundberg, and S.-I. Lee, “Improving performance ofdeep learning models with axiomatic attribution priors and expected gradients,” 2020.[36] D. Kumar, G. W. Taylor, and A. Wong, “Discovery radiomics with clear-dr: Interpretablecomputer aided diagnosis of diabetic retinopathy,” 2017.[37] K. He, X. Zhang, S. Ren, and J. Sun, “Identity mappings in deep residual networks,” in
ComputerVision - ECCV 2016 , B. Leibe, J. Matas, N. Sebe, and M. Welling, Eds. Cham: SpringerInternational Publishing, 2016, pp. 630–645.1438] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, “Learning Transferable Architectures forScalable Image Recognition,” in , 2018, pp. 8697–8710.[39] M. Tan and Q. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,”in , 2019.[40] X. Xu, X. Jiang, C. Ma, P. Du, X. Li, S. Lv, L. Yu, Q. Ni, Y. Chen, J. Su, G. Lang, Y. Li,H. Zhao, J. Liu, K. Xu, L. Ruan, J. Sheng, Y. Qiu, W. Wu, T. Liang, and L. Li, “A deep learningsystem to screen novel coronavirus disease 2019 pneumonia,”
Engineering , 2020.[41] H. X. Bai, R. Wang, Z. Xiong, B. Hsieh, K. Chang, K. Halsey, T. M. L. Tran, J. W. Choi,D.-C. Wang, L.-B. Shi, J. Mei, X.-L. Jiang, I. Pan, Q.-H. Zeng, P.-F. Hu, Y.-H. Li, F.-X.Fu, R. Y. Huang, R. Sebro, Q.-Z. Yu, M. K. Atalay, and W.-H. Liao, “Ai augmentation ofradiologist performance in distinguishing covid-19 from pneumonia of other etiology on chestct,”
Radiology , vol. 0, no. 0, p. 201491, 0, pMID: 32339081.[42] L. Li, L. Qin, Z. Xu, Y. Yin, X. Wang, B. Kong, J. Bai, Y. Lu, Z. Fang, Q. Song, K. Cao,D. Liu, G. Wang, Q. Xu, X. Fang, S. Zhang, J. Xia, and J. Xia, “Using artificial intelligence todetect covid-19 and community-acquired pneumonia based on pulmonary ct: Evaluation of thediagnostic accuracy,”
Radiology , vol. 296, no. 2, pp. E65–E71, 2020, pMID: 32191588.[43] A. A. Ardakani, A. R. Kanafi, U. R. Acharya, N. Khadem, and A. Mohammadi, “Application ofdeep learning technique to manage covid-19 in routine clinical practice using ct images: Resultsof 10 convolutional neural networks,”
Computers in Biology and Medicine , vol. 121, p. 103795,2020.[44] V. Shah, R. Keniya, A. Shridharani, M. Punjabi, J. Shah, and N. Mehendale, “Diagnosis ofcovid-19 using ct scan images and deep learning techniques,” medRxiv , 2020.[45] J. Chen, L. Wu, J. Zhang, L. Zhang, D. Gong, Y. Zhao, S. Hu, Y. Wang, X. Hu, B. Zheng,K. Zhang, H. Wu, Z. Dong, Y. Xu, Y. Zhu, X. Chen, L. Yu, and H. Yu, “Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computedtomography: a prospective study,” medRxiv , 2020.[46] C. Zheng, X. Deng, Q. Fu, Q. Zhou, J. Feng, H. Ma, W. Liu, and X. Wang, “Deep learning-baseddetection for covid-19 from chest ct using weak label,” medRxiv , 2020.[47] S. Jin, B. Wang, H. Xu, C. Luo, L. Wei, W. Zhao, X. Hou, W. Ma, Z. Xu, Z. Zheng, W. Sun,L. Lan, W. Zhang, X. Mu, C. Shi, Z. Wang, J. Lee, Z. Jin, M. Lin, H. Jin, L. Zhang, J. Guo,B. Zhao, Z. Ren, S. Wang, Z. You, J. Dong, X. Wang, J. Wang, and W. Xu, “Ai-assisted ctimaging analysis for covid-19 screening: Building and deploying a medical ai system in fourweeks,” medRxiv , 2020.[48] C. Jin, W. Chen, Y. Cao, Z. Xu, Z. Tan, X. Zhang, L. Deng, C. Zheng, J. Zhou, H. Shi, andJ. Feng, “Development and evaluation of an ai system for covid-19 diagnosis,” medRxiv , 2020.[49] Y. Song, S. Zheng, L. Li, X. Zhang, X. Zhang, Z. Huang, J. Chen, H. Zhao, Y. Jie, R. Wang,Y. Chong, J. Shen, Y. Zha, and Y. Yang, “Deep learning enables accurate diagnosis of novelcoronavirus (covid-19) with ct images,” medRxiv , 2020.[50] S. Wang, B. Kang, J. Ma, X. Zeng, M. Xiao, J. Guo, M. Cai, J. Yang, Y. Li, X. Meng, andB. Xu, “A deep learning algorithm using ct images to screen for corona virus disease (covid-19),” medRxiv , 2020.[51] S. A. Harmon, T. H. Sanford et al. , “Artificial intelligence for the detection of covid-19pneumonia on chest ct using multinational datasets,”