AI Progress in Skin Lesion Analysis

Philippe M. Burlina, PhD; William Paul; Phil A. Mathew; Neil J. Joshi, BS; Alison W. Rebman, MPH; John N. Aucott, MD

Applied Physics Laboratory, Johns Hopkins University; Malone Center for Engineering in Healthcare, Johns Hopkins University; Johns Hopkins Lyme Disease Research Center, Division of Rheumatology, Department of Medicine, Johns Hopkins University School of Medicine
ABSTRACT
We examine progress in the use of AI for detecting skin lesions, with particular emphasis on the erythema migrans rash of acute Lyme disease, as well as lesions from other conditions such as herpes zoster (shingles), tinea corporis, erythema multiforme, cellulitis, insect bites, and tick bites. We discuss important challenges for these applications, in particular the problems of AI bias stemming from the lack of skin images of dark skinned individuals; of accurately detecting, delineating, and segmenting lesions or regions of interest against normal skin in images; and of low shot learning (classification with a paucity of training images). Solving these problems ranges from being highly desirable (e.g. delineation, which may help disambiguate between similar types of lesions and improve diagnostics) to being required (as is the case for AI de-biasing, which is needed to allow the deployment of fair AI techniques in the clinic for skin lesion analysis). For the problem of low shot learning in particular, we report skin analysis algorithms that degrade gracefully and still perform well at low shots compared to baseline algorithms: when using as little as 10 training exemplars per class, the baseline DL algorithm's performance degrades significantly, with an accuracy of 56.41%, close to chance, whereas the best performing low shot algorithm yields an accuracy of 85.26%.
I. Introduction
Lyme disease is the most common tick-borne disease in the United States, with an estimated 300,000 new cases per year.
The bacterium Borrelia burgdorferi is the causative agent of Lyme disease in North America. This bacterial agent is inoculated into humans through the bite of an infected tick. After 3 to 30 days, a round or oval, red, centrifugally expanding skin lesion (erythema migrans, or EM) manifests itself in approximately 70-80% of cases.
EM may also be accompanied by flu-like symptoms including fever, fatigue, myalgia, and arthralgia. Without appropriate antibiotic treatment, EM can persist and subsequently resolve under pressure from the immune response. Artificial intelligence (AI) can play a key role in the detection of EM, which is important for initiating early treatment for Lyme disease, without which the bacteria may disseminate into the nervous, cardiac, and rheumatologic systems.
Deep learning (DL) has led to critical successes in tackling various AI tasks, including automated image classification for medical diagnostics, such as retinal analytics and, related to this work, cancerous skin lesions. We have been developing AI tools for other types of skin analytics, specifically the diagnosis of non-cancerous skin lesions including erythema migrans (for Lyme disease), herpes zoster (shingles), tinea corporis, erythema multiforme, cellulitis, insect bites, and tick bites, using clinically obtained images as well as public domain data obtained from the internet.
In the current manuscript, we examine advances in the use of AI for detecting skin lesions and specifically the erythema migrans rash of acute Lyme disease. In particular, we report on recent developments that address important challenges for these applications, including the problem of AI bias, which may affect classification in individuals with darker skin; the problem of addressing a paucity of training skin image exemplars of lesions, so called ‘low shot’ learning; and the problem of automated segmentation and detection of skin lesions, which can play an important role
in characterizing the progression of skin lesions, particularly EM, over time and can aid in the detection of Lyme disease. We have recently demonstrated that AI methods can be successfully used for the detection of EM and other skin lesions.
However, one key weakness of AI via DL is the need for very large ground truth-annotated training datasets, often requiring hundreds if not thousands of images. Training on only a few (or a dozen) exemplars is called 'low shot' or 'few shot' learning, a task that humans perform innately well compared to machines: consider, for example, a child who can differentiate zebras from horses after being shown only one image of a zebra, thereby performing one shot learning. By contrast, 'traditional' AI algorithms generally lack the ability to train adequately with low numbers of gold standard training images, a challenge that remains an open and active problem of investigation. With only a modest number of training images available, traditional DL approaches often perform very poorly. Datasets with only a few training examples may also, under certain circumstances, yield low testing power from which reliable conclusions cannot be inferred. There are several important, practical settings in which this low shot problem is relevant for skin lesion diagnostics and automated skin analysis with DL methods. One is the case of rare diseases, for example erythema marginatum, or of unusual presentations of EM such as those with vesiculation. We have worked towards utilizing novel methods to address low shot learning as it applies to skin diseases and, in particular, Lyme disease. We have relied on techniques including self-supervision, which does not require gold standard labels and thus can exploit data that lack annotations.
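The low shot setting above amounts to training on a fixed number N of exemplars per class. As a generic illustration (not the authors' data pipeline; the label list and seed below are hypothetical), an N-shot training subset can be drawn like this:

```python
import random
from collections import defaultdict

def sample_n_shot(labels, n_shots, seed=0):
    """Return sorted indices of an N-shot training subset:
    n_shots exemplars drawn per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    subset = []
    for idxs in by_class.values():
        subset.extend(rng.sample(idxs, min(n_shots, len(idxs))))
    return sorted(subset)

# hypothetical binary Lyme-vs-healthy labels, 100 examples per class
labels = [0] * 100 + [1] * 100
train_idx = sample_n_shot(labels, n_shots=10)
print(len(train_idx))  # 20: ten exemplars for each of the two classes
```

Fixing the seed keeps the subset reproducible, so different algorithms can be compared on identical N-shot splits.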
In self-supervised methods, part of the image is used to predict another part of the same image, which results in a network that can be used for representation learning. An example of such a network is Deep InfoMax, which is trained to maximize the similarity of local and global representations of images at different layers of the network when they derive from the same source image, and to minimize the similarity of pairs not created from the same input image, using an information-theoretic framework as the loss function. We performed a comparison of different approaches to low shot learning for skin analysis, which included the following:

Classical Baseline Deep Learning System.
We used a baseline method consisting of a classical DL system, i.e. ResNet50, with fine-tuning; we denote it henceforth as "RES". This method has been widely used in medical AI diagnostic problems. It was compared against novel systems for Low Shot Deep Learning (LSDL), including the following.
Low Shot Deep Learning Approaches via discriminative encoding.
We used methods relying on representation learning via encoding of images with a discriminative neural network. These methods apply global pooling to ResNet50's last convolutional layer, then pass the resulting features to one of three classifiers: a radial basis function support vector machine, a random forest, or K-nearest neighbors, abbreviated respectively as RES_RBF SVM, RES_Random Forest, and RES_KNN.
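The encode-then-classify recipe can be sketched with scikit-learn. Here the 64-dimensional synthetic features merely stand in for globally pooled ResNet50 activations (in practice 2048-dimensional), and all hyperparameters are illustrative, not the paper's settings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_split(n_per_class, dim=64, shift=1.0):
    """Synthetic stand-in for pooled CNN features, with class-dependent means."""
    X = np.vstack([rng.normal(0.0, 1.0, (n_per_class, dim)),
                   rng.normal(shift, 1.0, (n_per_class, dim))])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X_train, y_train = make_split(10)    # a 10-shot training set per class
X_test, y_test = make_split(100)

classifiers = {
    'RES_RBF SVM': SVC(kernel='rbf'),
    'RES_Random Forest': RandomForestClassifier(n_estimators=100, random_state=0),
    'RES_KNN': KNeighborsClassifier(n_neighbors=3),
}
accuracy = {name: clf.fit(X_train, y_train).score(X_test, y_test)
            for name, clf in classifiers.items()}
print(accuracy)
```

Because only the small classifier head is fit on the N-shot subset, while the encoder is trained once on abundant data, this design tends to hold up better at low N than fine-tuning the whole network.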
Low Shot Deep Learning (LSDL) Approaches via self-supervision.
A second family of LSDL methods used here worked via self-supervision, also proceeding by first encoding images via a self-supervised network and then applying additional classification logic. We used a variant of the Deep InfoMax approach that applies different image augmentations (color jitter, random resized crops of the same image) to compute the similarity between global and local representations of the image. For one of the LSDL algorithms, henceforth abbreviated as DIM, we used the local representation as input to train a ResNet network. As in the previous group of methods, we also considered additional LSDL methods that instead used the global representation, which was then fed to one of three classifiers (support vector machines, random forest, and K-nearest neighbors), yielding methods henceforth referred to as DIM_RBF SVM, DIM_Random Forest, and DIM_KNN.

We tested and compared the performance of the above algorithms on a binary problem: classifying Lyme vs. healthy. We used a dataset of images provisioned online and curated by our team. The promising results in Table 1 demonstrate that low shot learning techniques may be beneficial for EM detection and degrade gracefully compared to traditional DL methods. As seen in the table, while all methods perform about the same at high shots (N=5120), when using as little as N=10 training exemplars per class the baseline algorithm (RES) degrades significantly, with an accuracy of 56.41%, close to chance, whereas the best performing low shot algorithms yield accuracies of 85.26% (RES_Random Forest) and 83.97% (DIM_RBF SVM).
Number of Shots   DIM     DIM_KNN   DIM_Random Forest   DIM_RBF SVM   RES     RES_KNN   RES_Random Forest   RES_RBF SVM
5120              91.67   84.62     88.46               91.03         94.23   87.82     89.10               89.74
639               91.67   84.62     87.82               91.03         95.51   87.82     89.74               89.74
79                88.46   82.05     82.69               87.18         81.41   83.97     87.18               88.46
40                81.41   82.69     83.97               88.46         74.36   83.33     85.90               87.18
20                76.92   75.64     83.97               82.69         48.72   83.33     83.97               87.82
10                79.49   78.85     78.21               83.97         56.41   77.56     85.26               83.33
Table 1: Performance (accuracy, in percent) of each low shot method compared to the baseline method (RES) as the number of training exemplars per class (the number of shots, N) decreases from 5120 down to only 10. RES, a traditional fine-tuned ResNet50, shows significant degradation for N=20 and N=10, whereas the other methods degrade more gracefully and perform better even at those low values of N (e.g. DIM_RBF SVM).
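The local-global contrastive objective underlying Deep InfoMax-style training can be sketched as an InfoNCE-type bound on mutual information: each image's global vector is scored against every local feature location, with same-image locals as positives and all other images' locals as negatives. This NumPy toy is a sketch with illustrative shapes, not the paper's implementation:

```python
import numpy as np

def local_global_infonce(global_feats, local_feats):
    """InfoNCE-style loss: global_feats (B, D), local_feats (B, L, D).
    Locals from the same image are positives; all other locals are negatives."""
    B, L, D = local_feats.shape
    # scores[i, j, l]: image i's global vector vs location l of image j
    scores = np.einsum('id,jld->ijl', global_feats, local_feats)
    flat = scores.reshape(B, B * L)
    # stable log-sum-exp over all B*L candidate locals for each anchor image
    m = flat.max(axis=1, keepdims=True)
    log_z = (m + np.log(np.exp(flat - m).sum(axis=1, keepdims=True))).ravel()
    pos = scores[np.arange(B), np.arange(B), :]   # (B, L) positive scores
    return float(-(pos - log_z[:, None]).mean())

rng = np.random.default_rng(0)
base = rng.normal(size=(4, 8))                              # one code per "image"
locals_ = base[:, None, :] + 0.1 * rng.normal(size=(4, 6, 8))
aligned = local_global_infonce(base, locals_)               # matched pairs
shuffled = local_global_infonce(np.roll(base, 1, axis=0), locals_)
print(aligned, shuffled)  # the aligned (matched) pairing yields the lower loss
```

Minimizing such a loss pushes the encoder toward global representations that are predictive of every local patch of the same image, which is what makes the resulting features reusable by the downstream low shot classifiers.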
Skin lesion segmentation is an important task, as it would allow time-based analysis of the evolution of a lesion. It may help in diagnosing lesions, particularly those where there is often a degree of ambiguity, as in the case of EM. Unlike traditional computer vision approaches that use graph based methods for segmentation, we are pursuing automated delineation through the use of fully convolutional networks. Examples of results obtained using this technique for skin delineation are shown in Figure 1. Our work is currently directed at addressing challenges in EM skin segmentation when dealing with poorly defined lesion boundaries or lesions that do not appear as the classic "bullseye" shape. As an alternative to delineating specific tissue, another task useful for skin analysis and skin lesion diagnosis is the detection of areas of interest on the skin using object detectors. Examples of this technique are shown in Figure 2, where we aimed to find areas of redness around osseointegration sites and external pins in patients with prosthetics or external fixators used to stabilize broken bones; such redness could suggest the presence of an infection that may need to be treated.
Figure 1: Examples of segmentation of skin (top) and lesion (bottom), applied here to tinea corporis.

Figure 2: Example of applying object detectors to find lesions and areas of interest in skin images, such as fixator pins or prosthetics' osseointegration sites.
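Delineation results such as those in Figure 1 are commonly scored against ground-truth masks with overlap metrics like the Dice coefficient and intersection-over-union (IoU); the paper does not specify its metrics, so this is a generic sketch with toy masks:

```python
import numpy as np

def dice_and_iou(pred, truth):
    """Overlap metrics between predicted and ground-truth binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()
    dice = 2.0 * inter / total if total else 1.0   # empty masks: perfect match
    iou = inter / union if union else 1.0
    return float(dice), float(iou)

# toy 4x4 masks: the predicted lesion is shifted one column off the truth
truth = np.zeros((4, 4), dtype=bool); truth[1:3, 1:3] = True  # 4 pixels
pred = np.zeros((4, 4), dtype=bool);  pred[1:3, 2:4] = True   # 4 pixels, 2 overlap
dice, iou = dice_and_iou(pred, truth)
print(dice, round(iou, 3))  # 0.5 0.333
```

Dice weights the overlap against the average mask size, while IoU weights it against the union, so Dice is always at least as large as IoU on imperfect predictions.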
The problem of AI bias with regard to factors such as age, gender, or skin color has gained significant attention recently as one of the key AI assurance issues that must be addressed before fielding AI systems. Among the several potential causes of AI bias, one occurs when the data collected and used to train AI algorithms are unbalanced and specific demographic groups are overrepresented. In our prior studies on automated tools that seek to find online images of skin lesions, we found that the overwhelming majority of data available online and in the public domain consists almost exclusively of images from lighter skinned individuals, with many fewer images of darker skinned individuals. Studies have also shown that medical textbooks on dermatological lesions have similar imbalances in their teaching images, with most representing lighter skinned individuals. Consequently, 47% of dermatologists report a lack of training or exposure to patients with darker skin. Most current AI algorithms are data-driven: their classification performance depends on the availability of data from large cohorts of individuals in which a gold standard diagnosis was determined for each individual and good balance exists between subcategories and subpopulations. Often, the root cause of the AI bias problem is a lack of balance in the AI training datasets; that is, the dataset might come predominantly from light-skinned individuals. The problem may also have causes other than data: the data may be balanced but the human annotators biased; the quality and diversity of the data may vary across demographic groups; or the algorithm may use indicators for making predictions (e.g. for predicting the need for healthcare) that perpetuate bias. This problem can sometimes also be understood as a domain generalization problem for ML algorithms.
Often, finding the root cause of biased AI can be a challenging problem in and of itself. There are many different definitions of bias, as well as of objectives for de-biasing. The diagnostic performance problem that occurs when AI performance varies across demographic subpopulations is referred to as "(in)equality of odds"; the desired behavior is that protected factors like gender, race, or age should not affect diagnostic performance. Other issues have been widely reported in other domains, for example the COMPAS system (Correctional Offender Management Profiling for Alternative Sanctions), which assesses recidivism risk and has been shown to predict risk differentially by race even when such differences did not exist; this may be referred to as an "(un)equal opportunity" type of bias. We have developed AI tools that help alleviate AI bias in retinal diagnosis and, comparing patients with lighter vs. darker skin, found different outcomes, with lower performance among darker skinned individuals. We also found that algorithms developed using data from mostly white individuals had lower performance when tested on patients from Asian ethnic groups. In other recent experiments with mental health data, we likewise found that unbalanced data can lead to bias. One of our goals is to study the problem of AI bias for skin diseases. In a preliminary study, we found that for certain types of skin lesions, and for simpler problems, a lack of balance in the data does not necessarily lead to bias (Figure 3), though it may lead to bias in more complex ones. In the future, we intend to collect data from darker skinned individuals with EM, which may be challenging, and to investigate the sensitivity of classification results to the lack of balance in the data.
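The equality-of-odds criterion can be audited by comparing true and false positive rates across demographic groups; zero gaps mean the protected factor does not affect diagnostic performance. This is a minimal sketch with hypothetical labels, predictions, and group assignments:

```python
import numpy as np

def equality_of_odds_gap(y_true, y_pred, group):
    """Gap in true/false positive rates across demographic groups;
    zero gaps correspond to 'equality of odds'."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs, fprs = [], []
    for g in np.unique(group):
        m = group == g
        pos, neg = m & (y_true == 1), m & (y_true == 0)
        tprs.append(y_pred[pos].mean())   # per-group true positive rate
        fprs.append(y_pred[neg].mean())   # per-group false positive rate
    return float(max(tprs) - min(tprs)), float(max(fprs) - min(fprs))

# hypothetical audit: the classifier misses more positives in group 1
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
group  = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
tpr_gap, fpr_gap = equality_of_odds_gap(y_true, y_pred, group)
print(tpr_gap, fpr_gap)  # 0.5 0.0 -> group 1 TPR is 0.5 vs 1.0 for group 0
```

In practice such a gap would be estimated on a held-out test set with enough samples per group for the per-group rates to be statistically meaningful.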
Potential strategies for the future collection of skin lesion images from darker skinned individuals include collaborating with dermatologists who routinely care for individuals with darker skin, and continuing to mine publicly available and other research datasets for relevant images.

Figure 3: A study of bias on the ISIC2018 dataset of dermoscopic imagery and skin cancer lesions: while imbalance may exist among populations of varying skin tones, ranging from very light (marked 'very_lt') to darker (e.g. 'tan2') (see also the work that first reported this finding), it does not always lead to AI bias in this relatively benign use case. We will probe such issues of bias further in images taken 'in the wild' (arbitrary backgrounds, view angles, and illumination) of more complex lesions and confusers, including EM and tinea corporis.

We have reported on several important challenges in the use of AI for the detection of erythema migrans and other skin diseases. We have shown that it is possible to address low shot learning and to train with far fewer exemplars. We have also reported preliminary results for skin lesion delineation and detection. Skin lesion delineation can enhance the performance of diagnostic algorithms by allowing assessment of the time evolution of a lesion; this time evolution can be further characterized using a combination of automatic segmentation techniques and tracking algorithms, a possibility we will explore in the future. We have also begun investigating the problem of AI bias for skin lesions. Testing for the presence or absence of bias is the first important step to take before fielding any AI application in the clinic, as often the harm lies in not recognizing that an issue of bias even exists in the deployment of such AI algorithms. Finding the cause of the bias is the next step, and it is not always an easy task. Data (information bias) are not the only source of bias.
Other sources include imbalances in the individuals selected for the data (selection bias) and in the techniques used to obtain those data (ascertainment bias). Assumptions made by the algorithm can also be a source of bias, and there can be many other causes. Sometimes the challenge is not a lack of data from specific demographic (age, gender, or racial/ethnic) subgroups but rather the recognition that the data used to train a neural network may yield a model that does not always generalize to the subsequent data evaluated by that network. In some cases it is possible to acquire more data, and for prospective data collection, experiments should be designed so that demographically defined subgroups are appropriately represented. When using retrospective data, however, one may not have the option of acquiring more data to reduce bias. This is where specific AI techniques may come in to allow some form of intelligent data augmentation to address the lack of data. These techniques, called generative models, are able to generate new data that may be missing, or to take a biased dataset and generate a new, unbiased dataset from which machines can then learn to predict equitable outcomes. In other areas, such as retinal analysis, we have shown that this strategy can in some cases help address bias issues in AI algorithms. In future analyses, we plan to apply and investigate such techniques for skin analytics. There are also other approaches we envision for future studies that are algorithmic rather than data-based. We will additionally look at developing techniques that may help find the root causes and sources of bias. We hope that by using such techniques, we can also help train humans with balanced data and address some bias that may exist among clinicians.

References
1. Kuehn BM. CDC estimates 300,000 US cases of Lyme disease annually.
JAMA . 2013;310(11):1110. doi:10.1001/jama.2013.278331 2. Hinckley AF, Connally NP, Meek JI, et al. Lyme disease testing by large commercial laboratories in the United States.
Clin Infect Dis . 2014;59(5):676-681. doi:10.1093/cid/ciu397 3. Stanek G, Wormser GP, Gray J, Strle F. Lyme borreliosis.
Lancet . 2012;379(9814):461-473. doi:10.1016/S0140-6736(11)60103-7 4. Nadelman RB. Erythema migrans.
Infect Dis Clin North Am . 2015;29(2):211-239. doi:10.1016/j.idc.2015.02.001 5. Steere AC, Sikand VK. The presenting manifestations of Lyme disease and the outcomes of treatment.
N Engl J Med . 2003;348(24):2472-2474. doi:10.1056/NEJM200306123482423 6. Steere AC, Strle F, Wormser GP, et al. Lyme borreliosis.
Nat Rev Dis Prim . 2016;2:16090. doi:10.1038/nrdp.2016.90 7. Wormser GP, Dattwyler RJ, Shapiro ED, et al. The clinical assessment, treatment, and prevention of Lyme disease, human granulocytic anaplasmosis, and babesiosis: clinical practice guidelines by the Infectious Diseases Society of America.
Clin Infect Dis . 2006;43(9):1089-1134. doi:10.1086/508667 8. Mullegger RR, Glatz M. Skin manifestations of lyme borreliosis: diagnosis and management.
Am J Clin Dermatol . 2008;9(6):355-368. doi:10.2165/0128071-200809060-00002 9. Aucott JN, Crowder LA, Yedlin V, Kortte KB. Bull’s-Eye and nontarget skin lesions of Lyme disease: an internet survey of identification of erythema migrans.
Dermatol Res Pr . 2012;2012:451727. doi:10.1155/2012/451727 10. Lipsker D, Lieber-Mbomeyo A, Hedelin G. How accurate is a clinical diagnosis of erythema chronicum migrans? Prospective study comparing the diagnostic accuracy of general practitioners and dermatologists in an area where lyme borreliosis is endemic.
Arch Dermatol . 2004;140(5):620-621. doi:10.1001/archderm.140.5.620 11. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumor diagnosis.
Br J Dermatol . 2018. doi:10.1111/bjd.16924 12. Centers for Disease Control and Prevention. Lyme Disease (Borrelia burgdorferi) 2017 Case Definition.
13. Shapiro ED. Clinical practice. Lyme disease. N Engl J Med. 2014;370(18):1724-1731. doi:10.1056/NEJMcp1314325 14. Bhate C, Schwartz RA. Lyme disease: Part I. Advances and perspectives.
J Am Acad Dermatol . 2011;64(4):618-619. doi:10.1016/j.jaad.2010.03.046 15. Tibbles CD, Edlow JA. Does this patient have erythema migrans?
JAMA . 2007;297(23):2617-2627. doi:10.1001/jama.297.23.2617 16. Soloman S, Tanael M. Rash, Radiculopathy, and Cognitive Biases. In:
Aerospace Medicine and Human Performance . Aerospace Medical Association; 2019:652-654. doi:10.3357/AMHP.5339.2019 17. Li TH, Shih CM, Lin WJ, Lu CW, Chao LL, Wang CC. Erythema migrans mimicking cervical cellulitis with deep neck infection in a child with lyme disease.
J Formos Med Assoc . 2007;106(7):577-581. doi:10.1016/S0929-6646(07)60009-6 18. Nowakowski J, McKenna D, Nadelman RB, et al. Failure of treatment with cephalexin for Lyme disease.
Arch Fam Med. 2000;9(6):563-567. doi:10.1001/archfami.9.6.563 19. Mazori DR, Orme CM, Mir A, Meehan SA, Neimann AL. Vesicular erythema migrans: an atypical and easily misdiagnosed form of Lyme disease.
Dermatol Online J . 2015;21(8). http://dx.doi.org/. 20. Burlina P, Billings S, Joshi N, Albayda J. Automated diagnosis of myositis from muscle ultrasound: Exploring the use of machine learning and deep learning methods.
PLoS One . 2017;12(8). doi:10.1371/journal.pone.0184059 21. Feeny A, Tadarati M, Freund D, Bressler N, Burlina P. Automated segmentation of geographic atrophy of the retinal epithelium via random forests in AREDS color fundus images.
Comput Biol Med . 2015;65:124-136. doi:10.1016/j.compbiomed.2015.06.018 22. Burlina P, Pacheco KD, Joshi N, Freund DE, Bressler NM. Comparing humans and deep learning performance for grading AMD: A study in using universal deep features and transfer learning for automated AMD analysis.
Comput Biol Med . 2017;82:80-86. doi:10.1016/j.compbiomed.2017.01.018 23. Burlina P, Joshi N, Pacheco K, Freund D, Kong J, Bressler N. Utility of Deep Learning Methods for Referability Classification of Age-Related Macular Degeneration.
JAMA Ophthalmology. 2018;136(11):1305-1307. doi:10.1001/jamaophthalmol.2018.3799 24. Burlina P, Joshi N, Pacheco K, Freund D, Kong J, Bressler N. Use of Deep Learning for Detailed Severity Characterization and Estimation of 5-Year Risk Among Patients With Age-Related Macular Degeneration.
JAMA Ophthalmology. 2018;136(12):1359-1366. doi:10.1001/jamaophthalmol.2018.4118 25. Kankanahalli S, Burlina PM, Wolfson Y, Freund DE, Bressler NM. Automated classification of severity of age-related macular degeneration from fundus photographs.
Investig Ophthalmol Vis Sci . 2013;54(3):1789-1796. doi:10.1167/iovs.12-10928 26. Burlina P, Freund D, Dupas B, Bressler N. Automatic Screening of Age-Related Macular Degeneration and Retinal Abnormalities. In:
IEEE Engineering in Medicine and Biology Society . ; 2011:3962-3966. doi:10.1109/IEMBS.2011.6090984 27. Zhang L, Yang G, Ye X. Automatic skin lesion segmentation by coupling deep fully convolutional networks and shallow network with textons.
J Med Imaging . 2019;6(02):1. doi:10.1117/1.jmi.6.2.024001 28. Ali A-R, Li J, O’shea SJ, Yang G, Trappenberg T, Ye X.
A Deep Learning Based Approach to Skin Lesion Border Extraction With a Novel Edge Detector in Dermoscopy Images .; 2019. https://github.com/abderhasan/fuzzedge. 29. Kawahara J, Bentaieb AA, Hamarneh G.
Deep Features to Classify Skin Lesions . https://licensing.eri.ed.ac.uk/i/software/.
30. Yu L, Chen H, Dou Q, Qin J, Heng P-A. Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks.
IEEE Trans Med Imaging . 2017;36(4):994-1004. doi:10.1109/TMI.2016.2642839 31. Čuk E, Gams M, Možek M, Strle F, Čarman VM, Tasič JT. Supervised visual system for recognition of Erythema Migrans, an early skin manifestation of Lyme Borreliosis.
Strojniški Vestn - J Mech Eng . 2014;60(2):115-123. doi:10.5545/sv-jme.2013.1046 32. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition.
Int Conf Learn Represent . 2015. http://arxiv.org/abs/1409.1556. 33. Burlina PM, Joshi NJ, Ng E, Billings SD, Rebman AW, Aucott JN. Automated detection of erythema migrans and other confounding skin lesions via deep learning.
Comput Biol Med . 2019;105. doi:10.1016/j.compbiomed.2018.12.007 34. Horn EJ, Dempsey G, Schotthoefer AM, et al. The Lyme Disease Biobank: Characterization of 550 Patient and Control Samples from the East Coast and Upper Midwest of the United States.
J Clin Microbiol . 2020;58(6):1-12. doi:10.1128/JCM.00032-20 35. Szegedy C, Ioffe S, Vanhoucke V, Alemi AA. Inception-v4, inception-ResNet and the impact of residual connections on learning. . 2017:4278-4284. 36. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. In:
Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 . Vol 2017-Janua. ; 2017:2261-2269. doi:10.1109/CVPR.2017.243 37. Kinyanjui NM, Odonga T, Cintas C, et al. Estimating Skin Tone and Effects on Classification Performance in Dermatology Datasets. October 2019. http://arxiv.org/abs/1910.13268. 38. Liu Y, Jain A, Eng C, et al. A deep learning system for differential diagnosis of skin diseases.
Nat Med . 2020;26:900-908. doi:10.1038/s41591-020-0842-3 39. Khan A, Sohail A, Zahoora U, Qureshi AS. A survey of the recent architectures of deep convolutional neural networks.
Artif Intell Rev . 2020. doi:10.1007/s10462-020-09825-6 40. Zoph B, Vasudevan V, Shlens J, Le Q V. Learning Transferable Architectures for Scalable Image Recognition. July 2017. http://arxiv.org/abs/1707.07012. 41. Veit A, Wilber M, Belongie S. Residual networks behave like ensembles of relatively shallow networks.
Adv Neural Inf Process Syst . 2016:550-558. 42. Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions.
IEEE Conf Comput Vis Pattern Recognit . 2015:1-9. doi:10.1109/CVPR.2015.7298594 43. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks.
Proc 25th Int Conf Neural Inf Process Syst - Vol 1 . 2012:1097-1105. 44. LeCun Y, Bengio Y, Hinton G. Deep learning.
Nature . 2015;521(7553):436-444. doi:10.1038/nature14539 45. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In: . ; 2016:770-778. doi:10.1109/CVPR.2016.90 46. Goodfellow I, Bengio Y, Courville A.
Deep Learning . Vol 1. Cambridge, MA: MIT Press; 2016. 47. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks.
Nature . 2017;542(7639):115-118. doi:10.1038/nature21056 48. Burlina PM, Joshi N, Pekala M, Pacheco KD, Freund DE, Bressler NM. Automated Grading of Age-Related Macular Degeneration From Color Fundus Images Using Deep Convolutional Neural Networks.
JAMA Ophthalmol . 2017;135(11):1170-1176. doi:10.1001/jamaophthalmol.2017.3782 49. Burlina PM, Joshi NJ, Mathew PA, Paul W, Rebman AW, Aucott JN. AI-based detection of erythema
migrans and disambiguation against other skin lesions. Comput Biol Med. 2020;125:103977. doi:10.1016/j.compbiomed.2020.103977 50. Wang Y, Yao Q, Kwok JT, Ni LM. Generalizing from a Few Examples: A Survey on Few-shot Learning.
ACM Comput Surv . 2020;53(3):1-34. doi:10.1145/3386252 51. Saito M, Hatakeyama S. Acute Rheumatic Fever with Erythema Marginatum.
N Engl J Med . 2016;375(25):2480-2480. doi:10.1056/NEJMicm1601782 52. Burlina P, Paul W, Mathew P, Joshi N, Pacheco KD, Bressler NM. Low-Shot Deep Learning of Diabetic Retinopathy With Potential Applications to Address Artificial Intelligence Bias in Retinal Diagnostics and Rare Ophthalmic Diseases.
JAMA Ophthalmology. 2020:1-7. doi:10.1001/jamaophthalmol.2020.3269 53. Hjelm RD, Grewal K, Bachman P, et al. Learning deep representations by mutual information estimation and maximization. 2019:1-24. 54. Bachman P, Hjelm RD, Buchwalter W. Learning Representations by Maximizing Mutual Information Across Views. 2019. http://arxiv.org/abs/1906.00910. 55. Juang R, McVeigh ER, Hoffmann B, Yuh D, Burlina P. Automatic segmentation of the left-ventricular cavity and atrium in 3D ultrasound using graph cuts and the radial symmetry transform.
Proc - Int Symp Biomed Imaging . 2011:606-609. doi:10.1109/ISBI.2011.5872480 56. Pekala M, Joshi N, Liu TYA, Bressler NM, DeBuc Cabrera D, Burlina P. OCT Segmentation via Deep Learning: A Review of Recent Work. In:
Asian Conference on Computer Vision . ; 2019. doi:https://doi.org/10.1007/978-3-030-21074-8_27 57. Pekala M, Joshi N, Liu TYA, Bressler NM, DeBuc DC, Burlina P. Deep learning based retinal OCT segmentation.
Comput Biol Med . 2019;114(November 2018):103445. doi:10.1016/j.compbiomed.2019.103445 58. Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A. A Survey on Bias and Fairness in Machine Learning. 2019. http://arxiv.org/abs/1908.09635. 59. Lester JC, Jia JL, Zhang L, Okoye GA, Linos E. Absence of images of skin of colour in publications of COVID‐19 skin manifestations.
Br J Dermatol . 2020;183(3):593-595. doi:10.1111/bjd.19258 60. Burlina P, Joshi N, Paul W, Pacheco KD, Bressler NM. Addressing Artificial Intelligence Bias in Retinal Disease Diagnostics.
Transl Vis Sci Technol . April 2020. http://arxiv.org/abs/2004.13515. Accessed September 18, 2020. 61. Banerjee A, Burlina P. Efficient particle filtering via sparse kernel density estimation.
IEEE Trans Image Process. 2010;19(9):2480-2490. doi:10.1109/TIP.2010.2047667