Medical Datasets Collections for Artificial Intelligence-based Medical Image Analysis
– License: For Non-Commercial Use Only –
Yang WEN
School of Computer Science, UESTC
[email protected]
1. Introduction
Since finding a proper dataset with sufficient well-annotated samples is vital for current AI-based research in the field of medical image analysis, I offer this dataset collection to the community to help beginning researchers obtain an ideal dataset in a simple and straightforward way. In total, 32 public datasets were collected, of which 28 are medical imaging datasets and 4 are natural image ones. The images of these datasets were captured by different devices and thus vary from each other in modality, frame size and capacity. Detailed information is given in Table 1. For data accessibility, we also provide the websites of most datasets and hope this will help readers reach them. For more information, please kindly visit my personal homepage: https://wenyanger.github.io/
ImageNet [23]
The Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) aims to estimate the content of photographs for the purpose of retrieval and automatic annotation, using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) for training. The general goal is to identify the main objects present in images. The training data is a subset of ImageNet containing the 1,000 categories and 1.2 million images. A random subset of 50,000 images with labels is released as validation data, included in the development kit along with a list of the 1,000 categories. The dataset is available at http://image-net.org/challenges/LSVRC/2012/
CIFAR100 [22]
The CIFAR-100 dataset consists of 60,000 32 × 32 colour images in 100 classes, with 600 images per class. For each class there are 500 randomly selected training images and 100 testing images. The 100 classes in CIFAR-100 are grouped into 20 superclasses. In our experiments, we use the "fine" label, i.e., the 100-class label of each image.
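The 500/100 per-class split described above can be sketched with a small helper. This is an illustration of the idea only, not the official CIFAR-100 split; the function name and seed are my own:

```python
import random
from collections import defaultdict

def per_class_split(labels, n_train, seed=0):
    """Randomly pick n_train indices per class for training;
    the remaining indices of each class go to testing."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    rng = random.Random(seed)
    train, test = [], []
    for y, idxs in by_class.items():
        rng.shuffle(idxs)
        train.extend(idxs[:n_train])
        test.extend(idxs[n_train:])
    return train, test

# toy example: 2 classes with 6 samples each, 5 train / 1 test per class
labels = [0] * 6 + [1] * 6
tr, te = per_class_split(labels, n_train=5)
print(len(tr), len(te))  # 10 2
```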
ADE20K [39]
We use both classification and segmentation labels. The dataset is available at http://groups.csail.mit.edu/vision/datasets/ADE20K/
PascalVOC [13]
The dataset is available at http://host.robots.ox.ac.uk/pascal/VOC/
CheXpert [16]
CheXpert is a large public dataset for chest radiograph interpretation, consisting of 224,316 chest radiographs of 65,240 patients. The data were collected from chest radiographic examinations performed at Stanford Hospital between October 2002 and July 2017, in both inpatient and outpatient centers, along with their associated radiology reports. The dataset is available at https://stanfordmlgroup.github.io/competitions/chexpert/
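CheXpert labels each finding as positive, negative, uncertain, or unmentioned, and baselines commonly map the uncertain label to a fixed value (the "U-Ones" / "U-Zeros" policies). A minimal sketch of that mapping; the helper name and the blank-as-negative choice are my own assumptions, not the official baseline:

```python
def map_uncertain(label, policy="U-Ones"):
    """Map a CheXpert-style finding label (1.0 positive, 0.0 negative,
    -1.0 uncertain, None unmentioned) to a binary training target."""
    if label is None:       # unmentioned -> treated as negative here (an assumption)
        return 0.0
    if label == -1.0:       # uncertain -> fixed value chosen by the policy
        return 1.0 if policy == "U-Ones" else 0.0
    return label

row = [1.0, 0.0, -1.0, None]
print([map_uncertain(v, "U-Ones") for v in row])  # [1.0, 0.0, 1.0, 0.0]
```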
ChestXRay 2017 [21]
A total of 5,232 chest X-ray images from children were collected and labeled, including 3,883 characterized as depicting pneumonia (2,538 bacterial and 1,345 viral) and 1,349 normal, from a total of 5,856 patients, to train the AI system. The model was then tested with 234 normal images and 390 pneumonia images (242 bacterial and 148 viral) from 624 patients. The dataset is available at https://data.mendeley.com/datasets/rscbjbr9sj/3
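The counts quoted above are internally consistent, which a quick arithmetic check confirms:

```python
# Sanity-check the ChestXRay 2017 class counts quoted in the text.
train_pneumonia = 2538 + 1345          # bacterial + viral
train_total = train_pneumonia + 1349   # + normal
test_total = 234 + 390                 # normal + pneumonia images

assert train_pneumonia == 3883
assert train_total == 5232
assert test_total == 624
print(train_total, test_total)  # 5232 624
```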
LUNA [32]
LUng Nodule Analysis (LUNA) is a competition held to aid the development of nodule detection algorithms. We use the lung segmentations of the challenge for our experiments. The dataset is available at https://luna16.grand-challenge.org/Data/
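LUNA16 annotations are given in world coordinates (millimetres), so a typical first preprocessing step maps them to voxel indices using the scan's origin and spacing. A minimal numpy sketch, assuming an axis-aligned volume; the origin/spacing values below are hypothetical:

```python
import numpy as np

def world_to_voxel(world_xyz, origin_xyz, spacing_xyz):
    """Convert world coordinates (mm) to voxel indices for a volume
    with known origin and per-axis voxel spacing."""
    return np.rint((np.asarray(world_xyz) - origin_xyz) / spacing_xyz).astype(int)

origin = np.array([-195.0, -195.0, -378.0])   # hypothetical scan origin (mm)
spacing = np.array([0.76, 0.76, 2.5])         # hypothetical voxel spacing (mm)
print(world_to_voxel([-100.0, -120.0, -300.0], origin, spacing))
```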
MURA [30]
MURA (musculoskeletal radiographs) is a large dataset of bone X-rays. Algorithms are tasked with determining whether an X-ray study is normal or abnormal. The dataset is available at https://stanfordmlgroup.github.io/competitions/mura/
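MURA is labeled at the study level while models usually score individual images, so per-image probabilities must be aggregated into one study-level decision. One simple baseline rule averages the per-image abnormality probabilities; the helper below is an illustrative sketch of that idea:

```python
def study_prediction(image_probs, threshold=0.5):
    """Aggregate per-image abnormality probabilities into a single
    study-level label (1 = abnormal) by thresholding their mean."""
    mean_prob = sum(image_probs) / len(image_probs)
    return int(mean_prob > threshold)

print(study_prediction([0.2, 0.9, 0.8]))  # 1  (mean is about 0.63)
```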
BUS [38]
The goal of the Breast Ultrasound Lesions Dataset (Dataset B) is to provide images and ground truth annotations of breast lesions.

Table 1. Statistics of the datasets. Here CLS denotes classification, SEG denotes segmentation, Obj-D denotes object detection, SP denotes survival prediction. For datasets with an official division of training and testing sets, the number of images is shown as (number of training images)/(number of testing images); otherwise the total number of images is given.

Name | Modality | Target | Frame Size | #Images | #Classes | Task
CheXpert [16] | X-Ray | Lung | ∼320 | 224,316/624 | 2 | CLS
ChestXRay 2017 [21] | X-Ray | Lung | ∼700 | 5,232/624 | 2 | CLS
LUNA [32] | CT | Lung | 512 × 512 | 267 | 2 | SEG
NLST [36] | CT | Lung | - | - | - | SP
CHAOS-CT [20] | CT | Liver | 512 × 512 | 2,874 | 2 | SEG
NIH-CT-82 [31] | CT | Pancreas | 512 × 512 | 7,141 | 2 | SEG
CHAOS-MRI [20] | MRI | Liver/Kidney/Spleen | 320 × 320 | 992 | 5 | SEG
CardiacMRI [2] | MRI | Heart | 256 × 256 | 399 | 3 | SEG
ACDC [4] | MRI | Heart | 320 × 320 | - | 4 | SEG
TCIA [27, 5] | MRI | Prostate | 320 × 320 | - | 3 | SEG
PROMISE12 [25] | MRI | Prostate | - | - | - | SEG
ISIC2019 [10] | Dermoscopy | Skin | 1024 × 768 | 25,331 | 9 | CLS
TCGA-GBM [19, 26] | H/E Stained | Cell | - | - | - | SP
TCGA-LGG [19, 26] | H/E Stained | Cell | - | - | - | SP
TNBC [28] | H/E Stained | Cell | 512 × 512 | 50 | 2 | SEG
GlaS [33] | H/E Stained | Cell | 775 × 522 | 85/80 | 2 | SEG
MoNu [24] | H/E Stained | Cell | 1000 × 1000 | - | 2 | SEG
RIM-r3 [14] | Fundoscopic | Fundus | ∼1072 | - | - | SEG
DRIVE [35] | Fundoscopic | Fundus | 565 × 584 | 20/20 | 2 | SEG
MESSIDOR [12] | Fundoscopic | Diabetic Retinopathy | ∼2240 | - | - | CLS
STARE [15] | Fundoscopic | Fundus | 700 × 605 | 397 | 15 | SEG/CLS
EyePACS [17] | Fundoscopic | Diabetic Retinopathy | ∼4000 | - | - | CLS
HRF [6] | Fundoscopic | Fundus | - | - | 2 | SEG
DRISHTI-GS [34] | Fundoscopic | Fundus | - | - | - | SEG
REFUGE [29] | Fundoscopic | Fundus | - | - | - | SEG/CLS
BUSI [1] | Ultrasound | Breast lesion | 500 × 500 | 780 | 3 | SEG/CLS
OCT2017 [21] | OCT | Fundus | ∼512 | 108,309/1,000 | 4 | CLS
KIMCCS [18] | Cervoscope | Cervical | - | - | - | CLS
CIFAR100 [22] | Natural Image | - | 32 × 32 | 50,000/10,000 | 100 | CLS
ImageNet [23] | Natural Image | - | ∼375 | 1.3M/60,000 | 1000 | CLS
ADE20K [39] | Natural Image | - | - | - | - | SEG
PascalVOC [13] | Natural Image | - | - | - | - | CLS/SEG/Obj-D
The data, collected at baseline, include breast ultrasound images of women between 25 and 75 years old. The data were collected in 2018 from 600 female patients. The dataset consists of 780 images with an average image size of 500 × 500 pixels.

CHAOS [20]
The CHAOS challenge aims at the segmentation of abdominal organs (liver, kidneys and spleen) from CT and MRI data. It covers two tasks: segmentation of the liver from computed tomography (CT) data sets, acquired at the portal phase after contrast agent injection for the pre-evaluation of living liver-transplantation donors, and segmentation of four abdominal organs (i.e. liver, spleen, right and left kidneys) from magnetic resonance imaging (MRI) data sets acquired with two different sequences (T1-DUAL and T2-SPIR). The dataset is available at https://chaos.grand-challenge.org/Data/
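Organ segmentation challenges such as CHAOS are commonly scored with the Dice similarity coefficient between the predicted and reference masks. A minimal numpy implementation (the epsilon guard against empty masks is my own choice):

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

a = np.zeros((4, 4)); a[1:3, 1:3] = 1   # 4-pixel square
b = np.zeros((4, 4)); b[1:3, 1:4] = 1   # 6-pixel rectangle overlapping it
print(round(dice(a, b), 3))  # 0.8
```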
NIH-CT-82 [31]
The National Institutes of Health Clinical Center performed 82 abdominal contrast-enhanced 3D CT scans from 53 male and 27 female subjects. Seventeen of the subjects are healthy kidney donors scanned prior to nephrectomy. The remaining 65 patients were selected by a radiologist from patients who neither had major abdominal pathologies nor pancreatic cancer lesions. A medical student manually performed slice-by-slice segmentations of the pancreas as ground truth, and these were verified/modified by an experienced radiologist. The dataset is available at https://wiki.cancerimagingarchive.net/display/Public/Pancreas-CT
CardiacMRI [2]
The dataset comprises short-axis cardiac MR image sequences acquired from 33 subjects, for a total of 7,980 2D images. Each patient's image sequence consists of exactly 20 frames, and the number of slices acquired along the long axis ranges between 8 and 15, with spacing between slices of 6 to 13 mm. Each image in which both the endocardial and epicardial contours of the left ventricle were visible was manually segmented to provide the ground truth. The manual segmentation was performed by the first author and took approximately 3 weeks of full-time work. This resulted in 5,011 manually segmented MR images, with a total of 10,022 endocardial and epicardial contours.
ACDC [4]

The Automated Cardiac Diagnosis Challenge (ACDC) provides cardiac MRI data for automatic multi-structure segmentation and diagnosis [4].
Prostate segmentation
The Prostate dataset contains 40 patients from the PROSTATE-DIAGNOSIS collection [27], scanned with a 1.5T Philips Achieva MRI scanner. It is split into 30 patients for training, 5 for testing, and 5 for the competition (not used in our experiments). The labels are provided by The Cancer Imaging Archive (TCIA) [9]. The image size is 400 × 400 or 320 × 320.

PROMISE12 Challenge [25]
The MICCAI Prostate MR Image Segmentation challenge 2012 aims to segment the prostate in transversal T2-weighted MR images. The data include both patients with benign disease (e.g. benign prostatic hyperplasia) and prostate cancer. Additionally, to test the robustness and generalizability of the algorithms, the data come from multiple centers and multiple MRI device vendors. Differences in scanning protocols are also present in the data, e.g. patients with and without an endorectal coil. The dataset is available at https://promise12.grand-challenge.org/Details/ There are 50 training cases available for download. These cases include a transversal T2-weighted MR image of the prostate. The training set is a representative set of the types of MR images acquired in a clinical setting. The data are multi-center and multi-vendor and have different acquisition protocols (e.g. differences in slice thickness, with/without endorectal coil). The set is selected such that there is a spread in prostate sizes and appearance. For each of the cases in the training set, a reference segmentation is also included.
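Because PROMISE12 scans differ in slice thickness, a common preprocessing step resamples each volume to a fixed voxel spacing before training. A nearest-neighbour sketch in plain numpy (real pipelines usually use SimpleITK or scipy interpolation; the function name and spacing values are my own):

```python
import numpy as np

def resample_nn(volume, spacing, new_spacing=(1.0, 1.0, 1.0)):
    """Nearest-neighbour resampling of a (z, y, x) volume to a new
    voxel spacing, so scans with different slice thicknesses become
    geometrically comparable."""
    spacing = np.asarray(spacing, float)
    new_spacing = np.asarray(new_spacing, float)
    new_shape = np.maximum(
        1, np.round(np.array(volume.shape) * spacing / new_spacing)).astype(int)
    # For each output index, pick the nearest source index along that axis.
    idx = [np.minimum((np.arange(n) * new_spacing[d] / spacing[d]).astype(int),
                      volume.shape[d] - 1) for d, n in enumerate(new_shape)]
    return volume[np.ix_(*idx)]

vol = np.arange(2 * 4 * 4).reshape(2, 4, 4)      # 2 thick (3 mm) slices
iso = resample_nn(vol, spacing=(3.0, 1.0, 1.0))  # -> 6 slices of 1 mm
print(iso.shape)  # (6, 4, 4)
```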
Why do we want to segment the prostate on MR images?
Determination of prostate volume (PV) facilitates an assessment of prostate disorders and, for prostate cancer, in conjunction with other parameters, can help predict the pathologic stage of disease, offers insights into the prognosis, and helps predict treatment response. Prostate-specific antigen (PSA) levels have been modified to derive the PSA density by incorporating PV calculations to help guide clinical decisions. The clinical value of prostate-specific antigen density, however, is dependent on the quality of the PV estimate. The accuracy and variability of PV determinations pose limitations to its usefulness in clinical practice. Information on the size/PV, shape, and location of the prostate relative to adjacent organs is also an essential part of surgical planning for prostatectomy, radiation therapy, and emerging minimally invasive therapies, such as cryotherapy and high-intensity focused ultrasound (HIFU). Recently, the high spatial resolution and soft-tissue contrast offered by MRI has made it the most accurate method available for obtaining this kind of information. This, combined with the potential of MRI to localize and grade prostate cancer, has led to a rapid increase in its adoption and increasing research interest in its use for this application. Furthermore, and of particular relevance to the MICCAI community, accurate prostate MRI segmentation is an essential pre-processing task for computer-aided detection and diagnostic algorithms, as well as a number of multi-modality image registration algorithms, which aim to enable MRI-derived information on anatomy and tumor location and extent to aid therapy planning and guidance.
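The PSA density mentioned above is simply serum PSA divided by the prostate volume estimate, which is why an accurate PV from segmentation matters. A trivial sketch with hypothetical example values:

```python
def psa_density(psa_ng_ml, prostate_volume_cc):
    """PSA density = serum PSA (ng/mL) / prostate volume (cc)."""
    return psa_ng_ml / prostate_volume_cc

# hypothetical patient: PSA 4.0 ng/mL, segmented prostate volume 40 cc
print(round(psa_density(4.0, 40.0), 3))  # 0.1
```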
Pathology Datasets Collection from Andrew Janowczyk
A great review of pathology datasets: Machine learning methods for histopathological image analysis, https://arxiv.org/pdf/1709.00786.pdf
GlaS [33]
Glands are important histological structures; it has been shown that malignant tumours arising from glandular epithelium, also known as adenocarcinomas, are the most prevalent form of cancer. Accurate segmentation of glands is often a crucial step to obtain reliable morphological statistics. Nonetheless, the task is by nature very challenging due to the great variation of glandular morphology across different histologic grades. Up until now, the majority of studies have focused on gland segmentation in healthy or benign samples, but rarely on intermediate or high grade cancer, and quite often they are optimised for specific datasets. In the GlaS challenge, participants are encouraged to run their gland segmentation algorithms on images of Hematoxylin and Eosin (H&E) stained slides. The dataset is provided together with ground truth annotations by expert pathologists. The dataset is available at [here].
TCGA-GBM/LGG [19, 26]
We focus on brain cancer in our study and used two public cancer survival datasets with high-resolution whole slide pathological images (WSIs) from The Cancer Genome Atlas (TCGA) [19]. Specifically, we conducted experiments on two subtypes of brain cancer in the TCGA projects: Lower-Grade Glioma (LGG) and Glioblastoma (GBM). We adopted the same annotations of vital status and overall survival time as the previous study [26]. The TCGA-GBM dataset is available at [here]. The TCGA-LGG dataset is available at [here]. The pre-processing instructions for WSIs (in Chinese) can be found in my personal blog at [blog page].
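Survival prediction (SP) on datasets like TCGA-GBM/LGG is typically evaluated with the concordance index, which measures how well predicted risk scores rank patients by survival time under right-censoring. A straightforward O(n²) implementation of Harrell's C-index (an illustrative sketch, not the evaluation code of the cited studies):

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: among comparable patient pairs, the fraction
    where the patient predicted to be at higher risk dies earlier.
    times: survival/censoring times; events: 1 = death observed,
    0 = censored; risks: predicted risk scores (higher = worse)."""
    num, den = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is comparable if i had an observed event before j's time
            if events[i] == 1 and times[i] < times[j]:
                den += 1
                if risks[i] > risks[j]:
                    num += 1
                elif risks[i] == risks[j]:
                    num += 0.5
    return num / den

times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.4, 0.1]
print(concordance_index(times, events, risks))  # 1.0 (perfect ranking)
```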
NLST [36]
The National Lung Screening Trial (NLST) was a randomized controlled trial to determine whether screening for lung cancer with low-dose helical computed tomography (CT) reduces mortality from lung cancer in high-risk individuals relative to screening with chest radiography. Approximately 54,000 participants were enrolled between August 2002 and April 2004. The dataset is available at https://cdas.cancer.gov/nlst/
EyePACS [17]

The Kaggle EyePACS competition asks participants to identify signs of diabetic retinopathy in eye images [17].
DRISHTI-GS [34]
Drishti-GS is a dataset meant forvalidation of segmenting OD, cup and detecting notch-ing. The images in the Drishti-GS dataset have beencollected and annotated by Aravind Eye hospital, Madu-rai, India. The dataset is divided into two: a train-ing set and a testing set of images. Training images (50) are provided with groundtruths for OD and Cupsegmentation and notching information. The datasetis available at https://cvit.iiit.ac.in/projects/mip/drishti-gs/mip-dataset2/Home.php
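Given optic disc and cup masks like those annotated in Drishti-GS, a standard derived measure in glaucoma assessment is the vertical cup-to-disc ratio (CDR). A minimal numpy sketch computing it from binary masks (the helper name is my own):

```python
import numpy as np

def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio: vertical extent of the cup mask
    divided by the vertical extent of the disc mask."""
    def vertical_extent(mask):
        rows = np.where(mask.any(axis=1))[0]
        return rows.max() - rows.min() + 1
    return vertical_extent(cup_mask) / vertical_extent(disc_mask)

disc = np.zeros((10, 10)); disc[2:8, 2:8] = 1   # disc 6 pixels tall
cup = np.zeros((10, 10)); cup[3:6, 3:6] = 1     # cup 3 pixels tall
print(vertical_cdr(cup, disc))  # 0.5
```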
REFUGE [29]
The REFUGE challenge is partnering with OMIA to widen the opportunities to present your work at MICCAI. In addition to the traditional oral and poster presentations of OMIA, REFUGE offers the chance to try your software on a real challenge this year with fundus images. The goal of the challenge is to evaluate and compare automated algorithms for glaucoma detection and optic disc/cup segmentation on a standard dataset of retinal fundus images. The dataset is available at https://refuge.grand-challenge.org/Home2020/
DRIVE [35]
The Digital Retinal Images for Vessel Extraction (DRIVE) database has been established to enable comparative studies on segmentation of blood vessels in retinal images. The photographs for the DRIVE database were obtained from a diabetic retinopathy screening program in The Netherlands. The set of 40 images has been divided into a training and a test set, both containing 20 images. All human observers who manually segmented the vasculature were instructed and trained by an experienced ophthalmologist. They were asked to mark all pixels for which they were at least 70% certain that they were vessel. The dataset is available at https://drive.grand-challenge.org/
HRF [6]

The High-Resolution Fundus (HRF) image database provides fundus photographs with gold-standard vessel segmentation masks for benchmarking robust vessel segmentation [6].
MESSIDOR [12]

MESSIDOR is a publicly distributed database of fundus images established to facilitate studies on computer-assisted diagnosis of diabetic retinopathy [12].
STARE [15]
The STARE (STructured Analysis of the Retina) Project was conceived and initiated in 1975 by Michael Goldbaum, M.D., at the University of California, San Diego. It contains 400 images with corresponding diagnoses, including (0) Normal; (1) Hollenhorst Emboli; (2) Branch Retinal Artery Occlusion; (3) Cilio-Retinal Artery Occlusion; (4) Branch Retinal Vein Occlusion; (5) Central Retinal Vein Occlusion; (6) Hemi-Central Retinal Vein Occlusion; (7) Background Diabetic Retinopathy; (8) Proliferative Diabetic Retinopathy; (9) Arteriosclerotic Retinopathy; (10) Hypertensive Retinopathy; (11) Coat's; (12) Macroaneurism; (13) Choroidal Neovascularization; (14) Others. STARE also provides some segmentation annotations of vessels. The dataset is available at http://cecas.clemson.edu/~ahoover/stare/
OCT 2017 [21]
The dataset contains 108,312 images (among these, 37,206 with choroidal neovascularization, 11,349 with diabetic macular edema, 8,617 with drusen, and 51,140 normal) from 4,686 patients. The dataset is available at https://data.mendeley.com/datasets/rscbjbr9sj/3
ISIC 2019 [10, 37, 11]
The International Skin Imaging Collaboration (ISIC) has developed the ISIC Archive, an international repository of dermoscopic images, both for clinical training and to support technical research toward automated algorithmic analysis of skin cancer. The goal of ISIC 2019 is to classify dermoscopic images among nine categories: 1. Melanoma; 2. Melanocytic nevus; 3. Basal cell carcinoma; 4. Actinic keratosis; 5. Benign keratosis (solar lentigo / seborrheic keratosis / lichen planus-like keratosis); 6. Dermatofibroma; 7. Vascular lesion; 8. Squamous cell carcinoma; 9. None of the others. There are 25,331 images available for training across 8 different categories. The dataset is available at https://challenge2019.isic-archive.com/
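Since the training images cover only the 8 known categories, the 9th "none of the others" class must be produced at inference time. One simple heuristic (my own illustration, not the official ISIC baseline) falls back to the unknown class when the classifier's top probability over the known categories is low:

```python
import numpy as np

def predict_with_unknown(probs, threshold=0.5, unknown_class=8):
    """Return the argmax over the 8 known-class probabilities, unless
    the top probability is below a confidence threshold, in which case
    predict the 'none of the others' class (index 8)."""
    top = int(np.argmax(probs))
    return top if probs[top] >= threshold else unknown_class

confident = np.array([0.1, 0.6, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03])
unsure = np.array([0.2, 0.15, 0.15, 0.1, 0.1, 0.1, 0.1, 0.1])
print(predict_with_unknown(confident))  # 1
print(predict_with_unknown(unsure))     # 8
```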
References

[1] Walid Al-Dhabyani, Mohammed Gomaa, Hussien Khaled, and Aly Fahmy. Dataset of breast ultrasound images. Data in Brief, 28:104863, 2020.
[2] Alexander Andreopoulos and John K Tsotsos. Efficient and generalizable statistical models of shape and appearance for analysis of cardiac MRI. Medical Image Analysis, 12(3):335–357, 2008.
[3] Teresa Araujo, Guilherme Aresta, Eduardo Castro, Jose Rouco, Paulo Aguiar, Catarina Eloy, Antonio Polonia, and Aurelio Campilho. Classification of breast cancer histology images using convolutional neural networks. PLOS ONE, 12(6), 2017.
[4] Olivier Bernard, Alain Lalande, Clement Zotti, Frederick Cervenansky, Xin Yang, Pheng-Ann Heng, Irem Cetin, Karim Lekadir, Oscar Camara, Miguel Angel Gonzalez Ballester, et al. Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: is the problem solved? IEEE Transactions on Medical Imaging, 37(11):2514–2525, 2018.
[5] Nicolas Bloch, Anant Madabhushi, Henkjan Huisman, John Freymann, Justin Kirby, Andinet Enquobahrie, Carl Jaffe, Larry Clarke, and Keyvan Farahani. NCI-ISBI 2013 challenge: automated segmentation of prostate structures. The Cancer Imaging Archive, 370, 2015.
[6] Attila Budai, Rüdiger Bock, Andreas Maier, Joachim Hornegger, and Georg Michelson. Robust vessel segmentation in fundus images. International Journal of Biomedical Imaging, 2013, 2013.
[7] Enrique J. Carmona, Mariano Rincón, Julián García-Feijoó, and José M. Martínez-De-La-Casa. Identification of the optic nerve head with genetic algorithms. Artificial Intelligence in Medicine, 43(3):243–259, 2008.
[8] Jiajia Chu, Yajie Chen, Wei Zhou, Heshui Shi, Yukun Cao, Dandan Tu, Richu Jin, and Yongchao Xu. Pay more attention to discontinuity for medical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 166–175. Springer, 2020.
[9] Kenneth Clark, Bruce Vendt, Kirk Smith, John Freymann, Justin Kirby, Paul Koppel, Stephen Moore, Stanley Phillips, David Maffitt, Michael Pringle, et al. The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. Journal of Digital Imaging, 26(6):1045–1057, 2013.
[10] Noel C F Codella, David A Gutman, M Emre Celebi, Brian Helba, et al. Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging, hosted by the International Skin Imaging Collaboration (ISIC). arXiv: Computer Vision and Pattern Recognition, 2016.
[11] Marc Combalia, Noel C F Codella, Veronica Rotemberg, Brian Helba, Veronica Vilaplana, Ofer Reiter, Allan C Halpern, Susana Puig, and Josep Malvehy. BCN20000: Dermoscopic lesions in the wild. arXiv: Image and Video Processing, 2019.
[12] Etienne Decencière, Xiwei Zhang, Guy Cazuguel, Bruno Lay, Béatrice Cochener, Caroline Trone, Philippe Gain, Richard Ordonez, Pascale Massin, Ali Erginay, et al. Feedback on a publicly distributed image database: the Messidor database. Image Analysis & Stereology, 33(3):231–234, 2014.
[13] Mark Everingham, SM Ali Eslami, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The PASCAL visual object classes challenge: A retrospective. International Journal of Computer Vision, 111(1):98–136, 2015.
[14] Francisco Fumero, Silvia Alayón, José L Sanchez, Jose Sigut, and M Gonzalez-Hernandez. RIM-ONE: An open retinal image database for optic nerve evaluation. In International Symposium on Computer-Based Medical Systems, pages 1–6. IEEE, 2011.
[15] A. D. Hoover, Valentina Kouznetsova, and Michael Goldbaum. Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE Transactions on Medical Imaging, 19(3):203–210, 2000.
[16] Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, et al. CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In Proceedings of the AAAI Conference on Artificial Intelligence, 2019.
[17] Kaggle EyePACS Competition. Identify signs of diabetic retinopathy in eye images. 2015.
[18] Kaggle IMCCS Competition. Intel & MobileODT cervical cancer screening - which cancer treatment will be most effective? 2017.
[19] Cyriac Kandoth, Michael D McLellan, Fabio Vandin, Kai Ye, Beifang Niu, Charles Lu, Mingchao Xie, Qunyuan Zhang, Joshua F McMichael, Matthew A Wyczalkowski, et al. Mutational landscape and significance across 12 major cancer types. Nature, 502(7471):333–339, 2013.
[20] Ali Emre Kavur, M. Alper Selver, Oğuz Dicle, Mustafa Barış, and N. Sinem Gezer. CHAOS - Combined (CT-MR) Healthy Abdominal Organ Segmentation Challenge Data, Apr. 2019.
[21] Daniel S Kermany, Michael Goldbaum, Wenjia Cai, Carolina CS Valentim, Huiying Liang, Sally L Baxter, Alex McKeown, Ge Yang, Xiaokang Wu, Fangbing Yan, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell, 172(5):1122–1131, 2018.
[22] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
[23] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
[24] Neeraj Kumar, Ruchika Verma, Sanuj Sharma, Surabhi Bhargava, Abhishek Vahadane, and Amit Sethi. A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE Transactions on Medical Imaging, 36(7):1550–1560, 2017.
[25] Geert Litjens, Robert Toth, Wendy van de Ven, Caroline Hoeks, Sjoerd Kerkstra, Bram van Ginneken, Graham Vincent, Gwenael Guillard, Neil Birbeck, Jindang Zhang, et al. Evaluation of prostate segmentation algorithms for MRI: the PROMISE12 challenge. Medical Image Analysis, 18(2):359–373, 2014.
[26] Pooya Mobadersany, Safoora Yousefi, Mohamed Amgad, David A Gutman, Jill S Barnholtz-Sloan, José E Velázquez Vega, Daniel J Brat, and Lee AD Cooper. Predicting cancer outcomes from histology and genomics using convolutional networks. Proceedings of the National Academy of Sciences, 115(13):E2970–E2979, 2018.
[27] Bloch N, Madabhushi A, Huisman H, Freymann J, Kirby J, Grauer M, Enquobahrie A, Jaffe C, Clarke L, and Farahani K. NCI-ISBI 2013 Challenge: Automated Segmentation of Prostate Structures. The Cancer Imaging Archive, available online: http://doi.org/10.7937/K9/TCIA.2015.zF0vlOPv, 2015.
[28] Peter Naylor, Marick Laé, Fabien Reyal, and Thomas Walter. Segmentation of nuclei in histopathology images by deep regression of the distance map. IEEE Transactions on Medical Imaging, 38(2):448–459, 2018.
[29] José Ignacio Orlando, Huazhu Fu, João Barbossa Breda, Karel van Keer, et al. REFUGE challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs. Medical Image Analysis, 59:101570, 2020.
[30] Pranav Rajpurkar, Jeremy Irvin, Aarti Bagul, Daisy Ding, Tony Duan, Hershel Mehta, Brandon Yang, Kaylie Zhu, Dillon Laird, and Robyn L Ball. MURA: Large dataset for abnormality detection in musculoskeletal radiographs. 2017.
[31] Holger R Roth, Le Lu, Amal Farag, Hoo-Chang Shin, et al. DeepOrgan: Multi-level deep convolutional networks for automated pancreas segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 556–564. Springer, 2015.
[32] Arnaud Arindra Adiyoso Setio, Alberto Traverso, Thomas De Bel, Moira SN Berens, Cas van den Bogaard, Piergiorgio Cerello, Hao Chen, Qi Dou, Maria Evelina Fantacci, Bram Geurts, et al. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge. Medical Image Analysis, 42:1–13, 2017.
[33] Korsuk Sirinukunwattana, David RJ Snead, and Nasir M Rajpoot. A stochastic polygons model for glandular structures in colon histology images. IEEE Transactions on Medical Imaging, 34(11):2366–2378, 2015.
[34] Jayanthi Sivaswamy, S. R. Krishnadas, Gopal Datt Joshi, Madhulika Jain, and A. Ujjwaft Syed Tabish. Drishti-GS: Retinal image dataset for optic nerve head (ONH) segmentation. In IEEE International Symposium on Biomedical Imaging, 2014.
[35] Joes Staal, Michael D Abràmoff, Meindert Niemeijer, Max A Viergever, and Bram van Ginneken. Ridge-based vessel segmentation in color images of the retina. IEEE Transactions on Medical Imaging, 23(4):501–509, 2004.
[36] National Lung Screening Trial Research Team. The National Lung Screening Trial: overview and study design. Radiology, 258(1):243–253, 2011.
[37] Philipp Tschandl, Cliff Rosendahl, and Harald Kittler. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Scientific Data, 5(1):180161, 2018.
[38] Moi Hoon Yap, Gerard Pons, Joan Martí, Sergi Ganau, Melcior Sentís, Reyer Zwiggelaar, Adrian K Davison, and Robert Martí. Automated breast ultrasound lesions detection using convolutional neural networks. IEEE Journal of Biomedical and Health Informatics, 22(4):1218–1226, 2017.
[39] Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, and Antonio Torralba. Semantic understanding of scenes through the ADE20K dataset. arXiv preprint arXiv:1608.05442.