Exploiting the Transferability of Deep Learning Systems Across Multi-modal Retinal Scans for Extracting Retinopathy Lesions
Taimur Hassan†⋄, Muhammad Usman Akram⋄, Naoufel Werghi†
† Department of Electrical Engineering and Computer Sciences, Khalifa University, Abu Dhabi, United Arab Emirates.
⋄ Department of Computer and Software Engineering, National University of Sciences and Technology, Islamabad, Pakistan.
Abstract—Retinal lesions play a vital role in the accurate classification of retinal abnormalities. Many researchers have proposed deep lesion-aware screening systems that analyze and grade the progression of retinopathy. However, to the best of our knowledge, no literature exploits the tendency of these systems to generalize across multiple scanner specifications and multi-modal imagery. Towards this end, this paper presents a detailed evaluation of semantic segmentation, scene parsing and hybrid deep learning systems for extracting retinal lesions such as intra-retinal fluid, sub-retinal fluid, hard exudates, drusen, and other chorioretinal anomalies from fused fundus and optical coherence tomography (OCT) imagery. Furthermore, we present a novel strategy exploiting the transferability of these models across multiple retinal scanner specifications. A total of 363 fundus and 173,915 OCT scans from seven publicly available datasets were used in this research (from which 297 fundus and 59,593 OCT scans were used for testing purposes). Overall, a hybrid retinal analysis and grading network (RAGNet), backboned through ResNet, stood first for extracting the retinal lesions, achieving a mean dice coefficient score of 0.822. Moreover, the complete source code and its documentation are released at http://biomisa.org/index.php/downloads/.

Index Terms—Retinal Lesions, Ophthalmology, Convolutional Neural Networks, Fundus Photography, Optical Coherence Tomography.
I. INTRODUCTION
Retinopathy, or retinal disease, tends to damage the retina, which may result in a non-recoverable loss of vision or even blindness if not treated in time. Most of these diseases are associated with diabetes; however, they may also occur due to aging, uveitis, and cataract surgeries. The two common retinal diseases are macular edema (ME) and age-related macular degeneration (AMD). ME occurs due to fluid accumulation within the macula, mostly because of the associated hyperglycemia, uveitis, and cataract surgeries. ME caused by diabetes is often termed diabetic macular edema (DME), which is identified by examining the patient's diabetic history and also by analyzing the retinal thickening (caused by retinal fluid) or hard exudates (HE) within one disc diameter of the center of the macula (containing a small pit known as the fovea) [1]. The Early Treatment Diabetic Retinopathy Study (ETDRS) classified clinically significant ME as having: 1) either thickening within 500 µm of the macular center; 2) HE with adjacent thickening within 500 µm of the macular center; or 3) retinal thickening regions of one (or more) disc diameter, some part of which lies within one disc diameter of the macular center [2]. However, with the advent of new imaging techniques such as optical coherence tomography (OCT), the classification of DME has been redefined as centrally involved clinically significant macular edema (Ci-CSME) if the presence of retinal thickening, due to retinal fluid or hard exudates, is discovered within the central sub-field zone of the macula (having a diameter of 1 mm or greater). Otherwise, DME is classified as non-centrally involved [3]. AMD is another retinal syndrome mostly found in elderly people. It is typically classified into two stages, i.e., non-neovascular AMD and neovascular AMD. Non-neovascular AMD is the "dry" form of AMD in which small, medium or large-sized drusen can be observed. With increasing disease severity, abnormal blood vessels invade the retina, leading to chorioretinal anomalies such as fibrotic scars and choroidal neovascular membranes (CNVM). In such a case, AMD is classified as wet or neovascular AMD. Fig. 1 shows some fundus and OCT scans showing retinal lesions at different stages of AMD and DME.

This work is supported by a research fund from Khalifa University: Ref: CIRA-2019-047.

Figure 1: Retinal lesions in fundus and OCT scans of Ci-CSME (A, B) and dry AMD pathology (C, D).

II. RELATED WORK
In the literature, a large body of solutions assessing retinal regions employed feature extraction techniques coupled with classical machine learning (ML) tools. The majority of these methods were validated on a limited number of scans and thus exhibited limited reproducibility. More recently, with the advent of deep learning, a wide variety of end-to-end approaches, operating on much larger datasets, have been proposed.
Traditional Approaches: Fundus imagery has been the modality of choice for examining retinal pathology for a while [5] and is still used as a secondary examination technique in analyzing complex retinal pathologies. However, with the advent of OCT, most solutions for retinal image analysis have migrated towards this new modality due to its ability to present an objective visualization of retinal abnormalities in the early stages. Chiu et al. [6] developed a kernel regression with graph theory and dynamic programming (KR+GTDP) scheme to extract retinal layers and retinal fluid from DME-affected scans. In [7], a Random Forest-based framework was proposed for the automated extraction of retinal layers and fluid from scans affected by central serous retinopathy (CSR). Wilkins et al. [8] presented an automated method for the extraction of intra-retinal cystoid fluid using OCT images. Vidal et al. [9] used a linear discriminant classifier, support vector machines, and a Parzen window for the identification of intra-retinal fluid (IRF). Apart from this, we have also proposed several methods for extracting retinal layers and retinal fluid, and for classifying retinopathy using traditional ML techniques [10]–[14].

Deep Learning Methods:
Many researchers have applied deep learning for the extraction of retinal layers [15] and retinal lesions such as IRF [16], sub-retinal fluid (SRF) [17] and HE [21]. Seebock et al. [19] proposed a Bayesian UNet-based framework for recognizing different anomalies within retinal pathologies. Fang et al. [20] developed a lesion-aware convolutional neural network (LACNN) model for the accurate classification of DME, choroidal neovascularization, drusen (AMD) and normal pathologies. LACNN is composed of a lesion detection network (LDN) and a lesion-attention module, where the LDN first generates a soft attention map to weight the lesion-aware features extracted from the lesion-attention module, and these features are then used for the accurate classification of retinal pathologies. Apart from this, we have recently proposed a hybrid retinal analysis and grading architecture (RAGNet) [21] that utilizes a single feature extraction model for retinal lesion segmentation, lesion-aware classification and severity grading of retinopathy based on OCT images.

III. CONTRIBUTIONS
In this paper, we present a thorough evaluation of deep learning models for the extraction of IRF, SRF, HE, drusen, and other chorioretinal anomalies such as fibrotic scars and CNVM from multi-modal retinal images. Furthermore, we exploited the transferability of these models for retinal lesion extraction across multi-modal imagery. To the best of our knowledge, there is no literature available to date providing a thorough transferability analysis of encoder-decoder, fully convolutional, scene parsing, and hybrid deep learning systems for extracting this multitude of lesions in one go from multi-modal retinal imagery. Subsequently, the main contributions of this paper are:
• A first comprehensive evaluation of semantic segmentation, scene parsing and hybrid deep learning systems such as RAGNet [21], PSPNet [22], SegNet [23], UNet [24], and FCN (8s and 32s) [25] for extracting multiple lesions from multi-modal retinal imagery.
• A comprehensive study encompassing seven publicly available datasets and five different retinal pathologies, represented in a total of 363 fundus and 173,915 OCT scans, from which 297 fundus and 59,593 OCT scans were used for testing purposes.
• A detailed exploration of the transferability of these models across multiple scanner specifications.

IV. PROPOSED APPROACH
We propose a novel study to analyze the transferability of state-of-the-art deep learning frameworks across fused fundus and OCT imagery for extracting multiple retinal lesions in one go. The models we considered are as follows:
RAGNet is a hybrid convolutional network that can perform pixel-level segmentation and scan-level classification at the same time [21]. The uniqueness of the RAGNet architecture is that it uses the same feature extractor for both classification and segmentation purposes. So, if the problem demands segmentation and classification from the same image based upon similar features, then RAGNet is an ideal choice rather than using two separate models [21]. Here, we have only used the RAGNet segmentation unit, since we are focusing on retinal lesion segmentation.
PSPNet is a state-of-the-art scene parsing network that contains a pyramid pooling module to generate four pyramids of feature maps, representing coarser to finer details, to minimize the loss of global scene context while generating the latent representations [22]. The pooled outputs are then concatenated with the original feature maps to generate the final segmentation results.
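As a rough single-channel illustration of this idea (our simplification, not the paper's or PSPNet's actual implementation), each pyramid level block-averages the feature map to a coarser grid (1×1, 2×2, 3×3 and 6×6 bins, as in [22]), up-samples it back, and stacks it with the original map. The function name and the nearest-neighbour up-sampling are our assumptions; PSPNet operates on multi-channel tensors with 1×1 convolutions and bilinear interpolation.

```python
import numpy as np

def pyramid_pool(feat, bins=(1, 2, 3, 6)):
    """Block-average a square feature map to each bin size, up-sample the
    result back to full resolution (nearest neighbour), and stack all
    levels with the input map. Assumes every bin divides the map size."""
    h, w = feat.shape
    levels = [feat]
    for b in bins:
        # block mean over (h//b, w//b) windows
        pooled = feat.reshape(b, h // b, b, w // b).mean(axis=(1, 3))
        # nearest-neighbour up-sampling back to (h, w)
        up = np.repeat(np.repeat(pooled, h // b, axis=0), w // b, axis=1)
        levels.append(up)
    return np.stack(levels)  # shape: (1 + len(bins), h, w)
```

The coarsest (1×1) level carries the global scene context while the finer levels preserve sub-region detail, which is what keeps the global context from being lost in the latent representation.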
SegNet is an encoder-decoder network for semantic segmentation [23]. The uniqueness of the SegNet model is that it uses pooling indices from the corresponding encoder block to up-sample the feature maps at the decoder end in a non-linear fashion. Afterward, the feature maps are convolved with trainable filters to remove their sparsity. Moreover, SegNet has a smaller number of trainable parameters, which makes it computationally more efficient.
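A minimal single-channel NumPy sketch of this index-based up-sampling (our illustration with hypothetical function names, not SegNet's actual multi-channel implementation): the encoder records where each 2×2 maximum came from, and the decoder places values back at exactly those locations, producing the sparse maps that are then densified by trainable filters.

```python
import numpy as np

def max_pool_with_indices(x):
    """2x2 max pooling that also returns the flat index of each maximum,
    as a SegNet encoder block would."""
    h, w = x.shape
    pooled = np.zeros((h // 2, w // 2))
    idx = np.zeros((h // 2, w // 2), dtype=int)
    for i in range(h // 2):
        for j in range(w // 2):
            win = x[2 * i:2 * i + 2, 2 * j:2 * j + 2]
            k = win.argmax()
            pooled[i, j] = win.flat[k]
            idx[i, j] = (2 * i + k // 2) * w + (2 * j + k % 2)
    return pooled, idx

def unpool_with_indices(pooled, idx, shape):
    """Sparse up-sampling: place each value back at its recorded argmax
    location and leave the remaining positions zero."""
    out = np.zeros(shape)
    out.flat[idx.ravel()] = pooled.ravel()
    return out
```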
UNet is an auto-encoder inspired by FCN for semantic segmentation. The key feature of UNet is that it is fast and can generate good segmentation results with a small number of training samples because of its built-in data augmentation strategy [24]. UNet uses up-sampling instead of pooling operations and generates a large number of feature maps to propagate contextual information to the higher-resolution layers [24].
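This hand-off of context to higher-resolution layers can be pictured with a toy single-channel sketch (our illustration; UNet actually uses learned 2×2 up-convolutions on multi-channel tensors and crops the encoder map before concatenation):

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x up-sampling, standing in for UNet's learned
    up-convolution in this single-channel sketch."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def unet_skip_merge(decoder_feat, encoder_feat):
    """Up-sample the decoder map and stack it with the same-resolution
    encoder map, as UNet's skip connections do to carry contextual
    information to the higher-resolution layers."""
    up = upsample2x(decoder_feat)
    assert up.shape == encoder_feat.shape
    return np.stack([up, encoder_feat])  # channel-wise concatenation
```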
FCN is an end-to-end model proposed for semantic segmentation. FCN uses learned representations from pre-trained models, fine-tunes them, and generates finer pixel-level predictions in one go by up-sampling lower network layers with finer strides. In this study, we have utilized FCN-8 and FCN-32 (i.e., the finest and the coarsest versions of FCN) for retinal lesion extraction.

We have applied these models for extracting retinal lesions from both fundus and OCT imagery. Here, we note that our study covers some of the most complex and commonly occurring retinal pathologies, including non-neovascular AMD, neovascular AMD, Ci-CSME, and non-Ci-CSME. We also note that the related scans were collected using machines from different manufacturers and exhibit varying scan quality. To make the comparison objective and highly reproducible, we have used publicly available datasets in our investigations. Furthermore, we have tested the transferability of these models through an extensive cross-dataset validation. The series of experiments we conducted in this work provides a reliable benchmark for assessing the robustness and generalization capacity of each model.

V. EXPERIMENTAL SETUP
This section reports a detailed description of the datasets which have been used in this research. Furthermore, it contains implementation details and the performance metrics on which the models are evaluated.
A. Datasets
We have evaluated all the models on seven publicly available retinal image datasets, where the ground truths for retinal lesions were acquired through the Armed Forces Institute of Ophthalmology, Rawalpindi, Pakistan. A detailed summary of each dataset is presented below:
Rabbani-I [26] is one of the few datasets which contain both OCT and fundus images of each subject, reflecting AMD, DME, and normal pathologies. The dataset was acquired at Noor Eye Hospital, Tehran, Iran and contains a total of 4,241 OCT and 148 fundus scans from 50 normal, 48 dry AMD, and 50 DME-affected subjects. In this paper, we considered 37 fundus scans and 1,061 OCT scans for training and the rest for testing purposes.
Rabbani-II [27] contains 12,800 OCT and 100 color fundus scans from both eyes of 50 healthy subjects. Since Rabbani-II only contains scans from healthy subjects, it serves as an excellent benchmark to test the false positive rate of all the models (indicating how many false positives each model generates).
Duke-I [28] is one of the oldest retinal OCT datasets, containing a total of 38,400 scans, of which 26,900 scans reflect dry AMD and 11,500 scans show controlled (healthy) pathology. In the proposed study, a total of 300 scans have been used for training and 38,100 scans for testing purposes.
Duke-II [6] has a total of 610 OCT images from 10 subjects severely affected by DME. Moreover, the dataset contains highly detailed markings for retinal layers and fluid from two clinicians. In this paper, a total of 305 scans from the first five subjects were used for training, and the rest for testing.
Duke-III [29] is another dataset from Duke University which we used in our research. The dataset contains 723 scans reflecting dry AMD pathology, 1,101 scans reflecting DME pathology, and 1,407 scans showing normal retinal pathology. For the experimentation, we considered 3,048 scans for training and the remaining 183 for testing.
BIOMISA dataset [30] contains a total of 5,324 OCT scans (657 dry AMD, 2,195 ME, 904 normal, 407 wet AMD, and 1,161 CSR) and 115 fundus scans from 99 subjects (17 healthy, 31 ME, 8 dry AMD, 19 wet AMD, 24 CSR). In this study, a total of 1,299 OCT and 29 fundus images from the BIOMISA dataset were used for training and the rest for evaluation purposes.
Zhang dataset [31] contains 109,309 scans representing wet AMD (CNV), dry AMD (drusen), DME, and healthy pathologies. The dataset also presents a clear separation of 108,309 scans for training and 1,000 scans for testing, which we followed in our experimentation as well.
B. Implementation Details
All the models have been implemented using Keras with Python 3.7.4 on a machine with an Intel 8th-generation Core i5, an NVIDIA RTX 2080 GPU and 16 GB of RAM, where ResNet was used as a backbone. Moreover, the optimizer used during training was an adaptive learning rate method (ADADELTA) with default learning and decay rates. The source code has been released at http://biomisa.org/index.php/downloads/ for reproducibility.

C. Evaluation Metrics
In the proposed study, all the segmentation models have beenevaluated using the following metrics:
Mean Dice Coefficient: The dice coefficient (DC) computes the degree of similarity between the ground truth and the extracted results using the following relation: DC = 2TP / (2TP + FP + FN), where TP indicates the true positives, FP the false positives and FN the false negatives. After computing DC for each lesion class, the mean dice coefficient is computed for each network by taking an average of its per-class DC scores.

Mean Intersection-over-Union: The mean intersection-over-union (IoU) is computed by taking an average of the IoU scores for each lesion class, where the IoU scores are computed through IoU = TP / (TP + FP + FN).

Recall, Precision and F-score: To further evaluate the models, we computed the pixel-level recall TPR = TP / (TP + FN), precision PPV = TP / (TP + FP) and F-score F = 2 (TPR × PPV) / (TPR + PPV).

Qualitative Evaluations: The performance of all the models for lesion extraction has also been qualitatively evaluated through some visual examples.

VI. RESULTS AND DISCUSSION
The evaluation of segmentation models has been conducted on the combination of all seven datasets containing mixed OCT and fundus scans. In terms of TPR and F, as shown in Table I, RAGNet achieves 9.48% and 3.36% improvements as compared to UNet and PSPNet, respectively. However, in terms of precision, SegNet has a lead of 1.52% as compared to PSPNet. This indicates that SegNet produces fewer false positives as compared to the rest of the models. For the pixel-level comparison, we have excluded accuracy because it gives biased results towards a dominant negative class, i.e., the background.

Table I: Performance evaluations in terms of pixel-level recall, precision and F-scores on the combined dataset. Bold indicates the best performance.

Network | TPR    | PPV    | F
RAGNet  | —      | —      | —
PSPNet  | 0.7540 | 0.9200 | 0.8287
SegNet  | 0.6388 | —      | —
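For concreteness, the pixel-level scores reported here can be computed from binary lesion masks as in the following NumPy sketch (our illustration with hypothetical function names, not the paper's evaluation code):

```python
import numpy as np

def confusion_counts(pred, truth):
    """Pixel-level TP, FP, FN for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return tp, fp, fn

def dice(pred, truth):
    tp, fp, fn = confusion_counts(pred, truth)
    return 2 * tp / (2 * tp + fp + fn)

def iou(pred, truth):
    tp, fp, fn = confusion_counts(pred, truth)
    return tp / (tp + fp + fn)

def f_score(pred, truth):
    tp, fp, fn = confusion_counts(pred, truth)
    tpr = tp / (tp + fn)  # recall
    ppv = tp / (tp + fp)  # precision
    return 2 * tpr * ppv / (tpr + ppv)
```

Averaging these per-class scores over the lesion classes gives the mean DC and mean IoU used in the tables.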
Figure 2: Comparison of retinal lesions extraction on the combined dataset. (From left to right: original image, ground truth, RAGNet, PSPNet, UNet, FCN-8, FCN-32, SegNet.) Blue, red, yellow, green, and pink indicate HE, IRF, SRF, CA, and drusen, respectively.
Tables II and III report the performance of all the models for extracting retinal lesions in terms of mean DC and mean IoU, respectively. From Table II, it can be observed that RAGNet achieves the best mean DC score of 0.822, leading PSPNet by 4.5% and FCN-32 by 51.33%. Moreover, in terms of mean IoU, RAGNet also achieves the overall best performance (mean IoU: 0.710), showing a clear gap over its competitors for extracting IRF, SRF, and HE regions. In Table III, the second-best performance is achieved by PSPNet, which lags behind RAGNet by 6.9%. We also noticed that on fundus images UNet achieves optimal lesion extraction results, with an overall performance comparable to that of PSPNet. Fig. 2 shows the qualitative results of all the models when trained on multi-modal images from all seven datasets at once, where we can notice the best overall performance of RAGNet. It should be noted that extracting lesions accurately from both modalities at once is quite challenging, as their image features vary considerably.

Table II: Performance evaluations of deep segmentation models for retinal lesions extraction in terms of DC. Bold indicates the overall best performance.

Network | IRF   | SRF   | CA    | HE    | Drusen | Mean
RAGNet  | —     | —     | —     | —     | —      | —
SegNet  | 0.810 | 0.610 | 0.886 | 0.373 | 0.695  | 0.675
PSPNet  | 0.843 | 0.809 | —     | —     | —      | —

Table III: Performance evaluations of deep segmentation models for retinal lesions extraction in terms of IoU. Bold indicates the overall best performance.

Network | IRF   | SRF   | CA    | HE    | Drusen | Mean
RAGNet  | —     | —     | —     | —     | —      | —
SegNet  | 0.681 | 0.439 | 0.796 | 0.229 | 0.533  | 0.535
PSPNet  | 0.728 | 0.680 | —     | —     | —      | —

In the second series of experiments, we conducted a transferability analysis to assess the generalization capabilities of all the models. Here, we combined Duke-I, II, and III as one dataset (i.e., Duke) and Rabbani-I and Rabbani-II as Rabbani to avoid redundant combinations, as they have similar image features. We report the results in Table IV, where it can be observed that all the methods show good performance on the Duke and Zhang dataset pairs; this is natural because both datasets were acquired through Spectralis, Heidelberg Inc. Moreover, RAGNet achieved the overall best performance, as evident from Table IV, whereas PSPNet stood second best, with a performance comparable to that of UNet. In another experiment, we used the Rabbani-II dataset to test how many false positives each model produces. Since Rabbani-II contains only healthy scans, there are no actual lesions in this dataset. The best performance in this experiment is achieved by RAGNet with a true negative (TN) rate of 0.9999, indicating that it produces a minimal number of false lesions. Apart from this, the worst performance is achieved by FCN-32 (TN rate: 0.9379). Even the worst TN rate is above 90% because the ratio of TN pixels to FP pixels is extremely high. The results for this experiment are available in the codebase package for the readers at http://biomisa.org/index.php/downloads/.

Table IV: Transferability analysis (Training → Testing) for all models in terms of mean IoU. Bold and blue indicate the first and second-best performance, respectively. (Dataset names are coded as follows: R: Rabbani, D: Duke, Z: Zhang and B: BIOMISA. The rest of the abbreviations are RN: RAGNet, PN: PSPNet, SN: SegNet, UN: UNet, F-8: FCN-8, F-32: FCN-32.)

VII. CONCLUSION AND FUTURE RESEARCH
In this paper, we presented a thorough evaluation of semantic segmentation, scene parsing, and hybrid deep learning systems for extracting retinal lesions from fused fundus and OCT imagery. We also assessed the generalization capacity of each model through comprehensive cross-dataset validations, where RAGNet, due to its robustness in retaining lesion contextual information during scan decomposition, produced superior results as compared to the other models. Furthermore, the benchmarking performed in this work will be of great utility for both researchers and practitioners who want to employ deep learning models for lesion-aware grading of the retina. In the future, we plan to extend and exploit this study for the extraction of the optic disc and retinal layers in the optic nerve head region for glaucoma analysis.

REFERENCES

[1] G. M. Comers, "Cystoid macular edema," Kellogg Eye Center. Accessed: June 2019.
[2] "Diabetic macular edema," EyeWiki. Accessed: November 4, 2019.
[3] N. Relhan et al., "The early treatment diabetic retinopathy study historical review and relevance to today's management of diabetic macular edema," Current Opinion in Ophthalmology, Wolters Kluwer, May 2017.
[4] M. U. Akram et al., "An automated system for the grading of diabetic maculopathy in fundus images," November 12-15, 2012.
[5] T. Hassan et al., "Review of OCT and fundus images for detection of macular edema," IEEE International Conference on Imaging Systems and Techniques (IST), September 2015.
[6] S. J. Chiu et al., "Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema," Biomedical Optics Express, vol. 6, no. 4, April 2015.
[7] D. Xiang et al., "Automatic retinal layer segmentation of OCT images with central serous retinopathy," IEEE Journal of Biomedical and Health Informatics, vol. 23, no. 1, January 2019.
[8] G. R. Wilkins et al., "Automated segmentation of intraretinal cystoid fluid in optical coherence tomography," IEEE Transactions on Biomedical Engineering, pp. 1109-1114, 2012.
[9] P. L. Vidal et al., "Intraretinal fluid identification via enhanced maps using optical coherence tomography images," Biomedical Optics Express, October 2018.
[10] S. Khalid et al., "Automated segmentation and quantification of drusen in fundus and optical coherence tomography images for detection of ARMD," Journal of Digital Imaging, December 2017.
[11] S. Khalid et al., "Fully automated robust system to detect retinal edema, central serous chorioretinopathy, and age related macular degeneration from optical coherence tomography images," BioMed Research International, March 2017.
[12] T. Hassan et al., "Automated segmentation of subretinal layers for the detection of macular edema," Applied Optics, 55, 454-461, 2016.
[13] B. Hassan et al., "Structure tensor based automated detection of macular edema and central serous retinopathy using optical coherence tomography images," Journal of the Optical Society of America A, 33, 455-463, 2016.
[14] A. M. Syed et al., "Automated diagnosis of macular edema and central serous retinopathy through robust reconstruction of 3D retinal surfaces," Computer Methods and Programs in Biomedicine, 137, 1-10, 2016.
[15] L. Fang et al., "Automatic segmentation of nine retinal layer boundaries in OCT images of non-exudative AMD patients using deep learning and graph search," Biomedical Optics Express, vol. 8, no. 5, May 2017.
[16] A. G. Roy et al., "ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks," Biomedical Optics Express, vol. 8, no. 8, August 2017.
[17] T. Schlegl et al., "Fully automated detection and quantification of macular fluid in OCT using deep learning," Ophthalmology, vol. 125, no. 4, April 2018.
[18] B. Hassan et al., "Deep ensemble learning based objective grading of macular edema by extracting clinically significant findings from fused retinal imaging modalities," MDPI Sensors, July 2019.
[19] P. Seebock et al., "Exploiting epistemic uncertainty of anatomy segmentation for anomaly detection in retinal OCT," IEEE Transactions on Medical Imaging, May 2019.
[20] L. Fang et al., "Attention to lesion: Lesion-aware convolutional neural network for retinal optical coherence tomography image classification," IEEE Transactions on Medical Imaging, August 2019.
[21] T. Hassan et al., "RAG-FW: A hybrid convolutional framework for the automated extraction of retinal lesions and lesion-influenced grading of human retinal pathology," IEEE Journal of Biomedical and Health Informatics, March 2020.
[22] H. Zhao et al., "Pyramid scene parsing network," IEEE CVPR, 2017.
[23] V. Badrinarayanan et al., "SegNet: A deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, December 2017.
[24] O. Ronneberger et al., "U-Net: Convolutional networks for biomedical image segmentation," MICCAI, 2015.
[25] J. Long et al., "Fully convolutional networks for semantic segmentation," IEEE CVPR, 2015.
[26] R. Rasti et al., "Macular OCT classification using a multi-scale convolutional neural network ensemble," IEEE Transactions on Medical Imaging, vol. 37, no. 4, pp. 1024-1034, April 2018.
[27] T. Mahmudi et al., "Comparison of macular OCTs in right and left eyes of normal people," Proc. SPIE, Medical Imaging, San Diego, California, United States, Feb. 15-20, 2014.
[28] S. Farsiu et al., "Quantitative classification of eyes with and without intermediate age-related macular degeneration using optical coherence tomography," Ophthalmology, 121(1), 162-172, January 2014.
[29] P. P. Srinivasan et al., "Fully automated detection of diabetic macular edema and dry age-related macular degeneration from optical coherence tomography images," Biomedical Optics Express, vol. 5, no. 10, DOI: 10.1364/BOE.5.003568, 12 Sep 2014.
[30] T. Hassan et al., "BIOMISA retinal image database for macular and ocular syndromes," ICIAR 2018, Portugal, June 2018.
[31] D. Kermany et al., "Identifying medical diagnoses and treatable diseases by image-based deep learning,"