COVID-19 detection from scarce chest x-ray image data using few-shot deep learning approach
CCOVID-19 detection from scarce chest x-ray image datausing few-shot deep learning approach
Shruti Jadon M.S. *1 SPIE member; IEEE member; Sunnyvale, CA 95134
ABSTRACT
In the current COVID-19 pandemic situation, there is an urgent need to screen infected patients quickly andaccurately. Using deep learning models trained on chest X-ray images can become an efficient method forscreening COVID-19 patients in these situations. Deep learning approaches are already widely used in themedical community. However, they require a large amount of data to be accurate. The open-source community collectively has made efforts to collect and annotate the data, but it is not enough to train an accurate deeplearning model. Few-shot learning is a sub-field of machine learning that aims to learn the objective withless amount of data. In this work, we have experimented with well-known solutions for data scarcity in deeplearning to detect COVID-19. These include data augmentation, transfer learning, and few-shot learning, andunsupervised learning. We have also proposed a custom few-shot learning approach to detect COVID-19 usingsiamese networks. Our experimental results showcased that we can implement an efficient and accurate deeplearning model for COVID-19 detection by adopting the few-shot learning approaches even with less amount ofdata. Using our proposed approach we were able to achieve 96.4% accuracy an improvement from 83% usingbaseline models. Our code is available on github: https://github.com/shruti-jadon/Covid-19-Detection
Keywords:
Deep Learning, COVID-19, Image Classification, X-ray, Medical imaging, Few-shot learning.
1. INTRODUCTION
A new coronavirus designated Covid-19 was first identified in Wuhan, the capital of China’s Hubei province.It has been reported that people started developing pneumonia without a clear cause and for which existingvaccines or treatments were not effective. The virus has shown evidence of human-to-human transmission. As of24 January 2021, approximately 99 million people have contracted the virus and 2 million have lost their lives.As ripple affect, a lot of people have lost their livelihood and about 40% of small businesses have closed down.Majority of the countries weren’t prepared for such pandemic situation in their hospitality domain which ledto situation of a lot of doctors’ risking their lives and working on multiple cases. With the help of technology,Covid-19 detection through CT scans can be automated to reduce up to 2 minutes per scan basis which generallytake close to 10-15 minutes. A lot of recent research papers have suggested to tackle this issue with the helpof deep learning, but with less amount of data and biased data scenarios, its tough to make a good inference ontheir results.In this work, we have experimented with some deep learning based techniques for training a model in low-dataregime. We have also proposed a custom metrics based few-shot learning approach using siamese networks. Our proposed architecture has proven to perform well on scarce data. To validate the effectiveness and comparethe performance of models, we have performed extensive set of experiments and showcased results. The paperis organized as follows: Section 2 explains the classification modeling approaches we have experimented with.In Section 3, we discuss about the evaluation metrics used to assess the performance of models on a givendataset. Our experimental results are listed in Section 4 on several real-world data-sets. We then finallyconclude the outcomes in section 5. The project code has been made available for validation and replication ofthese experiments and can be found at: https://github.com/shruti-jadon/Covid-19-Detection
Shruti Jadon, [email protected] a r X i v : . [ ee ss . I V ] M a r igure 1. Sample Covid-19 CT scan dataset images For our research purposes, we have decided to experiment on two popular labeled datasets:1. dataset-1:
Covid-19 Radiography database is a collaborative efforts by various universities in Asia. Itconsists of 1200 COVID-19 positive images, 1341 normal images, and 1345 viral pneumonia images. And,2. dataset-2: Covid-19 data collected with the help of University of Montreal.
1, 8
This data consists of 317labeled images into three categories: Viral Pneumonia, Normal, and Covid.After analyzing the open-source data-set for Covid-19, we realized that it’s a case of scarce data and thereforeto train a high capacity model from scratch wouldn’t be a good idea. To increase our dataset, we have takenhelp of data augmentation, but medical image augmentation have certain constraints, unlike generic vision baseddataset we can’t manipulate medical images. Therefore, we have used augmentation techniques such as shear,zoom, and rotation of smaller respective values.
2. MODELING APPROACHES
The data scarcity situation is not new in the medical field. The medical community, in general, suffers fromdata-scarcity problems leading to the slow development of automation towards disease detection. The problemwith less data is that if we train a good capacity model, we get underfitting, whereas if we train a low capacitymodel, our performance fails. To mitigate the effects of scarce data: the data augmentation approach generallyworks well, but if the data distribution is not a representation of real world data, it could lead to a biased model.For this research, we have selected certain widely used deep learning approaches to tackle scarce data situation.1. Transfer Learning2. Unsupervised Learning3. Semi-Supervised Learning4. Few-Shot Learning igure 2. Sample Convolutional Neural Network architecture followed by non-linear layers and sigmoid function toconvert embeddings into probabilistic output of possible categories.
Creating an efficient performance model requires two main properties: Good Embeddings and better objectivefunction. For image-based features, a model needs to have a high capacity or learned weights to extract high-levelfeatures. Similarly, the objective function should create clearer defined segregation among classes even in caseof scarce or biased data.
For this research, we have taken logistic regression as a baseline model for Covid classification. Logistic regressionconsists of one layer of non-linearity using softmax followed by cross-entropy objective function. The logisticRegression model generally works best when the objective focuses on some low-level feature such as binaryclassification of a house-sale or color-basis classification. Deep Learning has transformed many industries ranging from the manufacturing industry, food industry, andnow medical industry. Among all architectures, Convolutional Neural networks played an important role inthe computer vision domain. Convolutional Neural Networks are inspired by mammals’ visual cortex and howtheir vision system uses a layered architecture of neurons in the brain. Just like humans have a group of neuronsto recognize shapes and sizes, Convolutional neural network layers extract specific forms of features from imageinput to analyze the object in an image. Convolutional layers are also called feature extractor layer because arange of features of the image is extracted within each layer. In this work, we have implemented a ten layeredconvolutional neural network for Covid classification. For our experiments, we used a 5 layer convolutional neuralnetwork followed by 5 linear layers with cross entropy loss function as objective. Transfer learning refers to a scenario where an architecture that has been optimized on a similar domain data-set, can be used to learn a low-data regime objective. It uses one trained neural network for generalizationand then uses the current data-set to improve those parameters. Transfer learning became popular and hashelped resolve scarce data situations in many cases, but we generally do not get similar domain data withmedical images. In this research, we have taken one of the leading VGG-16 architecture trained on Image-Netfor transfer learning. We have chosen different domain for pre-training as training a VGG-16 level of architecturefrom scratch requires more than 10k images properly annotated and access to good computation power. Forour experiments, we extracted the features of the final convolutional layer of VGG-16 Net and added a logisticregression layer followed by cross-entropy loss for Covid classification. igure 3. An Example of how transfer learning can be utlized in various sub-fields of medical industry with help ofsimilar domain supervised data. and PCA t-Distributed Stochastic Neighbor Embedding (t-SNE) is a widely used technique for dimensionality reduction.It is particularly used for the visualization and analysis of high-dimensional data-sets. t-Distributed stochasticneighbor embedding (t-SNE) minimizes the divergence between two distributions: a distribution that measurespairwise similarities of the input objects and a distribution that measures pairwise similarities correspondingto low-dimensional points in the embedding. In simpler terms, t-SNE analyzes the original data entered intothe algorithm and hypothesizes the best representation of data in lower dimensions by matching both distri-butions. The hypothesizing process is computationally expensive; therefore, there are certain limitations to itsusage with real data. For example, in very high dimensional data, to avoid the heavy computation, we can addanother dimensionality reduction technique before using t-SNE. In this research, as our data comes under highdimensional data (251X224X224X3), we have used another popular dimensionality reduction approach known asPCA(Principal Component Analysis). PCA attempts to reduce the size of feature dimensions with eigen vectors’help while ensuring we are not losing essential information. For our experiments, we have first used PCA tobring the dimensions down to 180 and later used t-SNE for visualization, as shown in fig 5. K-Means clustering is a 3-step process.1. initialize K centroids in the embedding space.2. Compute the distance of each point from each centroid and assign a point to that cluster whose centroidis closest to it. Do this for every point until preliminary clusters have been formed.3. Within each cluster, recompute the centroid and repeat step 2 until clusters stop changing. Here, K is thenumber of clusters that the user wants.The objective of K-Means is to partition N data points into K clusters in such a manner that the within-clustersum of squares or variance is minimized. We used scikit-learn’s available k-means clustering algorithm for ourimplementation, with randomly initialized centroids. .4.3 Gaussian Mixture Models Clustering
In Gaussian Mixture Model clustering, clusters are modeled with Gaussian distributions, which means that weuse variance and the mean to define each cluster.GMM allows for overlapping clusters, with the mixture modelbeing parameterized by three values - each cluster’s weight, the mean of each cluster, and the variance of eachcluster. The probability of belonging to a particular cluster is assigned to each data point using the Expectation-Maximization algorithm. Given the number of component gaussians or clusters (K), this algorithm consists ofthe following two steps:1. Expectation:1. Expectation: In the first step, the probability of each point belonging to a cluster calculated for the currentvalues of weight, mean, and variance of that cluster.2. Maximization: In this step, the expectation calculated in the previous step is maximized by modifying thevalues of weight, mean, and variance of clusters.This iterative model runs until convergence, at which point the maximum likelihood estimate is provided.Once the model parameters have been estimated, the fitted model can be used for clustering. A point is assignedto that cluster for which the probability of it belonging to the cluster is maximum. Few-shot learning is a sub-field of machine learning which aims to develop models that can be trained with lessamount of data-set and provide the required performance. For a model to be efficient, it requires good embeddingsor better optimization approach which can reach the desired objective within less steps. There are three types offew-shot learning apporoaches: Metrics based, Models based, and Optimization based; Metric based approachesfocuses of learning better embeddings whereas Models based and Optimization based approaches focuses onimproving the architectural components and optimization algorithms respectively. In this work, we have takenadvantage of one of these Metrics based approach known as Siamese Networks. Siamese stands for ‘twins‘, andas the name suggests Siamese Networks consists of two architectures similar in all features and shares weightsamong it. For image type data, these similar architectures are generally chosen to be of Convolutional NeuralNetwork type followed by contrastive loss function. For training a siamese network, we pass the input in set ofpairs, e.g; we take 2 input images and label them if they are similar or not, then these two input images (x1and x2) are passed through the ConvNet to generate a fixed length feature vector for each (h(x1) and h(x2)).Assuming the neural network model is trained properly, we can make the following hypothesis: If the two inputimages belong to the same character, then their feature vectors must also be similar, while if the two inputimages belong to the different characters, then their feature vectors will also be different. This idea of extractingembeddings on basis of similarity and dissimilarity helps model to train with less number of examples and eventake advantage of transfer learning approach.For our research, we further modified the Siamese network to getfeatures followed by VGG16 layers (trained on Image Net).
3. EVALUATION METRICS
To assess the performance and understand machine learning models, we needed a good set of evaluation metrics.For this work, we have used five metrics: Accuracy, Precision, Recall, F1-score, and the Silhouette score.
Accuracy is the most intuitive performance measure, and it is merely a ratio of correctly predicted observationsto the total observations. One may think that if we have high accuracy, then our model is best. Yes, accuracy isan excellent measure only when we have symmetric data-sets where false positives and false negatives are almostthe same. Therefore, we have to look at other parameters to evaluate the performance of our model.
Accuracy = T P + T NT P + T N + F P + F N (1) igure 4. An architectural representation of Siamese Networks with case of two different class input.
Precision is the ratio of correctly predicted positive observations to the total predicted positive observations.For each category/class, there is one precision value. We focus on precision when we need the predictions to becorrect, i.e., ideally, we want to make sure the model is right when we predict a label.
P recision = T PT P + F P (2)
Recall is the ratio of what the model predicted correctly to what the actual labels are. Similarly, for eachcategory/class, there is one recall value. We care about recall when we want to maximize the prediction of aparticular class, i.e., ideally, we want the model to capture all the class examples.
Recall = T PT P + F N (3)
F1 score is the weighted average of Precision and Recall. Therefore, this score takes both false positives andfalse negatives into account. Intuitively it is not as easy to understand as accuracy, but F1 is usually more usefulthan accuracy, especially if we have an uneven class distribution. Accuracy works best if false positives and falsenegatives have similar costs. If the cost of false positives and false negatives is very different, it’s better to lookat Precision and Recall. F − Score = 2 ∗ P recision ∗ RecallP recision + Recall (4)
Silhouette score is used to analyze the clustering-based approaches. It analyzes the mean of intra-cluster vs.mean of inter-cluster distance using the formula:
SilhouetteScore = ( b − a ) max ( a, b ) (5)There is one condition that the number of clusters’ should be less than the number of samples - 1. It rangesfrom -1 to 1, where 1 represents the best value. . EXPERIMENTS AND RESULTS In this paper, we have performed experiments using listed models and evaluated them based on well-knownmetrics listed in the above section. We used the TensorFlow library for the implementation of our models on
CPU:
Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
RAM:
4x 8GB, 2133 MT/s
GPU:
GeForce GTX 1060, 6GBTo validate the effectiveness of the proposed Siamese based approach with transfer learning features, which isinspired by VGG16 architecture. We first explored some image augmentation and increase the data-size by10% making our data-set of 3800 and 4200, respectively. After further pre-processing such as normalizingand reshaping the CT-scan images to (224X224), we divided the data-set for training 60%, validation 20% andtesting 20% which belonged to 3 classes: Covid, Viral Pneumonia, and Normal.
Table 1. Accuracy, Precision, Recall and F1-Score using mentioned classification based modeling approaches trained andtested on combined datset-1 and dataset-2.
Model Name Accuracy Precision Recall F1-scoreLogistic Regression 82.4% 0.822 0.828 0.828Convolutional Neural Network 90.2% 0.912 0.901 0.904Transfer Learning(VGG16) 93.3% 0.931 0.932 0.928Siamese Networks 94.6% 0.945 0.941 0.947Siamese Networks(Transfer Learning)
Table 2. Analysis of Clusters formed by K-Means and GMM algorithms with Silhouette Score using input as extractedembeddings of mentioned classification approaches.
Model Name / Clustering Approach K-Means Gaussian Mixture ModelsLogistic Regression 0.156 0.158Convolutional Neural Network 0.165 0.171Transfer Learning 0.189 0.185PCA+TSNE
Siamese Networks 0.490 0.487Siamese Networks (Transfer Learning)
After performing experiments and analyzing outcomes we have come to two major conclusions on training modelsin scarce data situations.1.
Change is data distribution may result in low accuracy : When data is scarce, it’s tough to determinethe real distribution of data, even if we segregate data into train, test, and val. We might not be able tocapture the performance of model on real data distribution.2.
General Classification models might not work : In case of less data, generally unsupervised basedapproaches perform well. In our experiments, we have observed that the decision boundaries were moresegregated in the PCA+TSNE and Siamese Network approach, whereas for Logistic Regression, CNN, andTransfer Learning score observed is below 0.2 for both K Means and GMM clustering approaches. igure 5. 2-D Visualization of formed clusters using K-Means approach by extracting the feature embeddings usingdimensionality reduction approach of PCA followed by t-SNE
5. CONCLUSION
In this research article, we have experimented with deep learning architectures for the low-data regime. We alsoproposed a custom metrics based few-shot learning model to predict the multi-class classification of Covid-19.We approve our model with detailed experiments on the combined Covid-19 radiography collected dataset, whereCT scan image belongs to 3 categories as Normal, Viral Pneumonia, and Covid is used to obtain the highestaccuracy of our proposed approach. We also investigate the reduction of overfitting and regularization of themodel effect on our application performance. For this purpose, we used embedding analysis using silhouettescore and clustering approaches. Finally, we compare our proposed technique to the existing three widely usedclassification approaches, where our proposed model significantly performed better than the others. We can seeour proposed approach providing a 3% increment in accuracy for classification and 0.42 increment in clusteringscore. In the future, we plan to examine whether the same model can be employed on the other computer-aideddiagnostic problems.
REFERENCES [1] Zhao, J., Zhang, Y., He, X., and Xie, P., “Covid-ct-dataset: a ct scan dataset about covid-19,” arXivpreprint arXiv:2003.13865 (2020).[2] Jadon, S., “An overview of deep learning architectures in few-shot learning domain,” arXiv preprintarXiv:2008.06365 (2020).[3] Yuan, J., Guo, H., Jin, Z., Jin, H., Zhang, X., and Luo, J., “One-shot learning for fine-grained relationextraction via convolutional siamese neural network,” in [ ], 2194–2199, IEEE (2017).[4] Mo, P., Xing, Y., Xiao, Y., Deng, L., Zhao, Q., Wang, H., Xiong, Y., Cheng, Z., Gao, S., Liang, K., et al.,“Clinical characteristics of refractory covid-19 pneumonia in wuhan, china,”
Clinical Infectious Diseases (2020).[5] Zhang, L., Zhu, F., Xie, L., Wang, C., Wang, J., Chen, R., Jia, P., Guan, H., Peng, L., Chen, Y., et al.,“Clinical characteristics of covid-19-infected cancer patients: a retrospective case study in three hospitalswithin wuhan, china,”
Annals of oncology (7), 894–901 (2020).6] Crayne, M. P., “The traumatic impact of job loss and job search in the aftermath of covid-19.,” PsychologicalTrauma: Theory, Research, Practice, and Policy (S1), S180 (2020).[7] Afshar, P., Heidarian, S., Enshaei, N., Naderkhani, F., Rafiee, M. J., Oikonomou, A., Fard, F. B., Samimi,K., Plataniotis, K. N., and Mohammadi, A., “Covid-ct-md: Covid-19 computed tomography (ct) scandataset applicable in machine learning and deep learning,” arXiv preprint arXiv:2009.14623 (2020).[8] Cohen, J. P., Morrison, P., and Dao, L., “Covid-19 image data collection,” arXiv 2003.11597 (2020).[9] Horry, M. J., Chakraborty, S., Paul, M., Ulhaq, A., Pradhan, B., Saha, M., and Shukla, N., “Covid-19 detection through transfer learning using multimodal imaging data,” IEEE Access , 149808–149824(2020).[10] Jadon, S. and Srinivasan, A. A., “Improving siamese networks for one-shot learning using kernel-basedactivation functions,” in [ Data Management, Analytics and Innovation ], 353–367, Springer (2021).[11] Chowdhury, M. E. H., Rahman, T., Khandakar, A., Mazhar, R., Kadir, M. A., Mahbub, Z. B., Islam, K. R.,Khan, M. S., Iqbal, A., Emadi, N. A., Reaz, M. B. I., and Islam, M. T., “Can ai help in screening viral andcovid-19 pneumonia?,”
IEEE Access , 132665–132676 (2020).[12] LeCun, Y., Bengio, Y., et al., “Convolutional networks for images, speech, and time series,” The handbookof brain theory and neural networks (10), 1995 (1995).[13] Alzubaidi, L., Fadhel, M. A., Al-Shamma, O., Zhang, J., Santamar´ıa, J., Duan, Y., and Oleiwi, S. R.,“Towards a better understanding of transfer learning for medical imaging: a case study,”
Applied Sci-ences (13), 4523 (2020).[14] Chen, H., Li, J., Wang, R., Huang, Y., Meng, F., Meng, D., Peng, Q., and Wang, L., “Unsupervised learningof local discriminative representation for medical images,” arXiv preprint arXiv:2012.09333 (2020).[15] Perez, H. and Tah, J. H., “Improving the accuracy of convolutional neural networks by identifying andremoving outlier images in datasets using t-sne,” Mathematics (5), 662 (2020).[16] Narin, A., Kaya, C., and Pamuk, Z., “Automatic detection of coronavirus disease (covid-19) using x-rayimages and deep convolutional neural networks,” arXiv preprint arXiv:2003.10849 (2020).[17] West, C. P., Montori, V. M., and Sampathkumar, P., “Covid-19 testing: the threat of false-negative results,”in [ Mayo Clinic Proceedings ], (6), 1127–1129, Elsevier (2020).[18] Shi, F., Wang, J., Shi, J., Wu, Z., Wang, Q., Tang, Z., He, K., Shi, Y., and Shen, D., “Review of artificialintelligence techniques in imaging data acquisition, segmentation and diagnosis for covid-19,” IEEE reviewsin biomedical engineering (2020).[19] Ozturk, T., Talo, M., Yildirim, E. A., Baloglu, U. B., Yildirim, O., and Acharya, U. R., “Automateddetection of covid-19 cases using deep neural networks with x-ray images,”
Computers in Biology andMedicine , 103792 (2020).[20] Luz, E., Silva, P. L., Silva, R., and Moreira, G., “Towards an efficient deep learning model for covid-19patterns detection in x-ray images,” arXiv preprint arXiv:2004.05717 (2020).[21] Ghoshal, B. and Tucker, A., “Estimating uncertainty and interpretability in deep learning for coronavirus(covid-19) detection,” arXiv preprint arXiv:2003.10769 (2020).[22] Jadon, S., Leary, O. P., Pan, I., Harder, T. J., Wright, D. W., Merck, L. H., and Merck, D. L., “Acomparative study of 2d image segmentation algorithms for traumatic brain lesions using ct data fromthe protectiii multicenter clinical trial,” in [
Medical Imaging 2020: Imaging Informatics for Healthcare,Research, and Applications ], , 113180Q, International Society for Optics and Photonics (2020).[23] Jadon, S., “A survey of loss functions for semantic segmentation,” in [ ], 1–7, IEEE (2020).[24] Jun, M., Cheng, G., Yixin, W., Xingle, A., Jiantao, G., Ziqi, Y., Minqing, Z., Xin, L., Xueyuan, D.,Shucheng, C., Hao, W., Sen, M., Xiaoyu, Y., Ziwei, N., Chen, L., Lu, T., Yuntao, Z., Qiongjie, Z., Guoqiang,D., and Jian, H., “COVID-19 CT Lung and Infection Segmentation Dataset,” (Apr. 2020).[25] Vinyals, O., Blundell, C., Lillicrap, T., Wierstra, D., et al., “Matching networks for one shot learning,” in[ Advances in neural information processing systems ], 3630–3638 (2016).[26] Alshazly, H., Linse, C., Barth, E., and Martinetz, T., “Explainable covid-19 detection using chest ct scansand deep learning,”
Sensors (2), 455 (2021).27] Ahuja, S., Panigrahi, B. K., Dey, N., Rajinikanth, V., and Gandhi, T. K., “Deep transfer learning-basedautomated detection of covid-19 from lung ct scan slices,” Applied Intelligence (1), 571–585 (2021).[28] Maghdid, H. S., Asaad, A. T., Ghafoor, K. Z., Sadiq, A. S., and Khan, M. K., “Diagnosing covid-19pneumonia from x-ray and ct images using deep learning and transfer learning algorithms,” arXiv preprintarXiv:2004.00038 (2020).[29] Panwar, H., Gupta, P., Siddiqui, M. K., Morales-Menendez, R., Bhardwaj, P., and Singh, V., “A deeplearning and grad-cam based color visualization approach for fast detection of covid-19 cases using chestx-ray and ct-scan images,” Chaos, Solitons & Fractals , 110190 (2020).[30] Fu, M., Yi, S.-L., Zeng, Y., Ye, F., Li, Y., Dong, X., Ren, Y.-D., Luo, L., Pan, J.-S., and Zhang, Q.,“Deep learning-based recognizing covid-19 and other common infectious diseases of the lung by chest ctscan images,” medRxiv (2020).[31] Rahimzadeh, M., Attar, A., and Sakhaei, S. M., “A fully automated deep learning-based network fordetecting covid-19 from a new and large lung ct scan dataset,” medRxiv (2020).[32] Jain, R., Gupta, M., Jain, K., and Kang, S., “Deep learning based prediction of covid-19 virus using chestx-ray,”
Journal of Interdisciplinary Mathematics , 1–19 (2021).[33] Omoniyi, T., Alabere, H., and Sule, E., “Diagnosis of covid-19 using artificial intelligence based model,” in[
Journal of Physics: Conference Series ], (1), 012007, IOP Publishing (2021).[34] Liu, Y. and Ji, S., “A multi-stage attentive transfer learning framework for improving covid-19 diagnosis,” arXiv preprint arXiv:2101.05410 .[35] Purohit, K., Kesarwani, A., Kisku, D. R., and Dalui, M., “Covid-19 detection on chest x-ray and ct scanimages using multi-image augmented deep learning model,” BioRxiv (2020).[36] Jain, G., Mittal, D., Thakur, D., and Mittal, M. K., “A deep learning approach to detect covid-19 coron-avirus with x-ray images,”