Segmentation of blood vessels in retinal fundus images
Michiel Straat and Jorrit Oosterhof
Abstract — In recent years, several automatic segmentation methods have been proposed for blood vessels in retinal fundus images, ranging from cheap and fast trainable filters [1] to complicated neural networks and even deep learning [2] [3] [4]. One example of a filter-based segmentation method is B-COSFIRE [1]. In this approach the image filter is trained with example prototype patterns, to which the filter becomes selective by finding points with large intensity variation in a Difference of Gaussian response on circles around the center. In this paper we discuss and evaluate several of these vessel segmentation methods. We take a closer look at B-COSFIRE and study its performance on the recently published IOSTAR dataset [5] by experiments, and we examine how the parameter values affect the performance. In the experiments we reach a segmentation accuracy of 0.9419. Based on our findings we discuss when B-COSFIRE is the preferred method to use and in which circumstances it could be beneficial to use a more (computationally) complex segmentation method. We also briefly discuss areas beyond blood vessel segmentation where these methods can be used to segment elongated structures, such as rivers in satellite images or nerves of a leaf.
Index Terms — Blood vessel segmentation, Image processing, B-COSFIRE, Retinal image analysis, Fundus imaging, Medical image analysis.
1 INTRODUCTION
The inspection of the blood vessel tree in the fundus, which is the interior surface of the eye opposite to the lens, is important in the determination of various cardiovascular diseases. This can be done manually by ophthalmoscopy, which is an effective method of analysing the retina. However, it has been suggested that using fundus photographs is more reliable than ophthalmoscopy [1]. Additionally, these images can be used for automatic identification of the blood vessels, which can be a difficult task due to obstacles such as low contrast with the background, narrow blood vessels and various blood vessel anomalies. A segmentation method with high accuracy can serve as a significant aid in diagnosing cardiovascular diseases, as it highlights the blood vessel tree in the fundus.

In recent years, several segmentation methods have been proposed for the automatic segmentation of blood vessels, ranging from cheap and fast trainable filters [1] to complicated neural networks and even deep learning [2] [3] [4]. One example of a filter-based segmentation method is Bar-Combination Of Shifted Filter Responses, in short B-COSFIRE [1]. In this approach we train the image filter with example prototype patterns, to which the filter becomes selective by finding points with large intensity variation in a Difference of Gaussian response on circles around the center. Azzopardi et al. used two prototype patterns for the segmentation: one for bars and one for bar-endings. The filter achieves rotation invariance by rotating the points of interest of the prototype patterns, yielding a filter that is selective for these vessel orientations as well.

In section 2 we discuss the theory behind the filter-based method B-COSFIRE in more detail. We then discuss supervised vessel segmentation methods that are based on machine learning in section 3. In section 4 we describe our set-up for experiments with B-COSFIRE on a recent dataset that contains retinal images acquired with a camera based on Scanning Laser Ophthalmoscopy (SLO) technology, and we discuss the results in section 5. Based on the study and the results of the experiments we then discuss the advantages and disadvantages of the discussed methods.

• Michiel Straat is a first year Computing Science master student at the University of Groningen.
• Jorrit Oosterhof is a first year Computing Science master student at the University of Groningen.
2 B-COSFIRE

In this section we discuss B-COSFIRE in more detail. The method is based on the Combination Of Shifted Filter Responses (COSFIRE), a trainable filter used for interest point detection, as described in [6]. The filter is trained in a training stage in which it is configured to be selective to specific prototype patterns. The B-COSFIRE method is a specific case of COSFIRE, in which the B stands for "bar". The filter is trained using vessel-like prototype patterns, which allows it to be selective for such structures. Azzopardi et al. proposed two B-COSFIRE filters: the symmetric filter, which is suitable for bars, and the asymmetric filter, suitable for bar endings. In the following, we briefly discuss the method as outlined in [1].

The B-COSFIRE filtering method proceeds in a number of stages, which are roughly divided into training each filter with a prototype pattern in order for the filter to become selective for its prototype pattern, pre-processing the retinal input image in order to enhance the contrast of the vessels, and the actual filtering, which yields the final output response. The response is thresholded to obtain a binary image, in which each pixel of the response with a gray value above the threshold is shown as a white pixel (1) indicating "vessel", and a gray value below the threshold is shown as black indicating "non-vessel". This yields the typical vessel tree image shown in Figure 1. In the next sections we discuss the stages mentioned above in more detail.

(a) Retinal fundus image (b) Corresponding vessel segmentation

Fig. 1: A typical vessel tree extracted from a retinal image, in which each pixel is labeled either "vessel" (white) or "non-vessel" (black).

2.1 Training a filter using a prototype pattern

Difference of Gaussian (DoG) filters are used to detect changes in intensity.
\mathrm{DoG}_\sigma(x, y) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right) - \frac{1}{2\pi(0.5\sigma)^2}\exp\left(-\frac{x^2 + y^2}{2(0.5\sigma)^2}\right).   (1)

The response of applying a DoG filter as in Equation 1 to an image f is defined as the convolution of f with the filter kernel in Equation 1:

c_\sigma(x, y) = |f \star \mathrm{DoG}_\sigma|^{+},   (2)

where |\cdot|^{+} denotes half-wave rectification, i.e., negative values are set to 0. The selectivity of a B-COSFIRE filter is automatically configured by presenting the filter with a prototype pattern. Then, a number of concentric circles are put around the filter's center of support and the points on these circles with a high intensity variation are determined, called the points of interest. One such point is described by a triple (σ, ρ, φ), where σ is the standard deviation of the DoG filter that contributed this point, ρ is the radius of the circle that this point lies on and φ is the angle with respect to the center of support. The training of one prototype pattern yields a set of triples S = {(σ_i, ρ_i, φ_i) | i = 1, ..., n} describing n points of interest.

In Figure 2a we see an example of how selectivity is obtained for a vertical bar pattern. The center point is surrounded by two circles. On each circle two local maxima in the DoG response are found, where two are above the center point and the other two are below the center point. For this reason we call this the symmetric filter. In Figure 2b an example is shown of the configuration for a bar-ending. In this case two points above the center point are marked as local maxima in the DoG response. This filter we call asymmetric.
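The DoG response of Equation 2 is straightforward to compute with standard image-processing tools. The following is a minimal sketch in Python, assuming NumPy and SciPy are available; the convolution with the DoG kernel of Equation 1 is realised as the difference of two Gaussian-blurred copies of the image, and negative values are suppressed as in the half-wave rectification.

import numpy as np
from scipy.ndimage import gaussian_filter

def dog_response(image, sigma):
    # Convolution with the DoG kernel of Eq. (1), computed as the difference
    # of two Gaussian filters with standard deviations sigma and 0.5*sigma.
    image = image.astype(float)
    response = gaussian_filter(image, sigma) - gaussian_filter(image, 0.5 * sigma)
    # Half-wave rectification |.|+ : suppress negative values (Eq. (2)).
    return np.maximum(response, 0.0)

Depending on whether vessels appear darker or brighter than the background in the pre-processed image, the sign convention of the difference may need to be flipped before the rectification.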
(a) Symmetric filter (b) Asymmetric filter
Fig. 2: (a): A DoG filter applied to a straight vertical bar pattern. After applying the DoG filter, four local maxima are detected on the two concentric circles surrounding the center point. (b): A DoG filter applied to a bar-ending pattern, for which two local maxima are detected on the circles above the center point.

Rotation invariance is achieved by converting the set S of points of interest that was found for the vertical bar pattern into a new set

R_\psi(S) = \{(\sigma_i, \rho_i, \phi_i + \psi) \mid \forall (\sigma_i, \rho_i, \phi_i) \in S\},   (3)

where ψ is the angle of the orientation of the bar for which the set R_ψ(S) is selective. In the symmetric case, by taking ψ = π/12, 2π/12, ..., 11π/12, we obtain a filter that, including the original vertical bar orientation, becomes selective for 12 bar orientations. In the asymmetric case we must naturally consider 24 orientations, as in this case a rotation by π gives rise to a different orientation due to the asymmetry.

2.2 Pre-processing

We follow the pre-processing steps as discussed by Azzopardi et al. Previous works have shown that the green channel of RGB images defines the contrast between vessels and the background better than the red channel, which has low contrast, or the blue channel, which shows a small dynamic range [7][8][9][10][11]. Therefore, we only use the green channel for the segmentation. The Field Of View (FOV) masks are provided with the dataset, and we smooth the borders around the FOV to ensure there will be no false positives in these areas because of the high contrast. The IOSTAR dataset also comes with masks that indicate the Optic Disc (OD) in the retina. We use these masks instead of the normal FOV masks to ensure that segmentation will not take place inside the OD, since the ground truth images do not have a specified segmentation inside the OD area either, which gives rise to a fairer comparison. In the last step, contrast-limited adaptive histogram equalization is applied. The output of the pre-processing stage for the retinal image of Figure 1a is given in Figure 3b.

(a) Mask (b) Green channel
Fig. 3: (a): The mask that is used to segment the retina, including an indication of the optic disc. (b): The resulting green channel of a retinal image after pre-processing is applied as described in subsection 2.2.
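As an illustration, the pre-processing described above can be sketched in a few lines of Python. This is only a rough sketch assuming scikit-image is available; it covers the green-channel extraction, masking and contrast-limited adaptive histogram equalization, and omits the smoothing of the FOV border that is also part of the full procedure.

import numpy as np
from skimage import io, exposure

def preprocess(image_path, mask):
    # Load the RGB fundus image and keep only the green channel.
    rgb = io.imread(image_path)
    green = rgb[:, :, 1].astype(float) / 255.0
    # Ignore everything outside the provided FOV/OD mask.
    green = np.where(mask, green, 0.0)
    # Contrast-limited adaptive histogram equalization (CLAHE).
    return exposure.equalize_adapthist(green)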
2.3 Computing the filter response

We have shown how a B-COSFIRE filter can achieve selectivity for specific bar patterns, by selecting points around the center of support of the filter with a high intensity variation. We have also shown how, from there, rotation invariance can easily be achieved by rotating the points of interest by angles ψ, and how this works for two prototype patterns that give rise to a symmetric and an asymmetric bar filter. In order to achieve a more tolerant filter, and also a ranking of the points of interest according to their importance such that points closer to the center get a higher weight, the DoG responses are blurred with a Gaussian G_{σ'}(x', y'). Because closer points need to get more weight, the standard deviation of the Gaussian is defined in terms of ρ_i, the radius of the circle on which the point of interest lies:

\sigma' = \sigma_0 + \alpha \rho_i,   (4)

where σ_0 is a constant for the base standard deviation and α is the rate at which the standard deviation increases. Therefore, the higher α, the more tolerant the filter becomes. The DoG responses are moved to the center of the filter support by shifting with (\Delta x_i, \Delta y_i) = (-\rho_i \cos\phi_i, -\rho_i \sin\phi_i). The i-th blurred and shifted DoG response then becomes

s_{\sigma_i, \rho_i, \phi_i}(x, y) = \max_{x', y'} \left\{ c_{\sigma_i}(x - \Delta x_i - x',\, y - \Delta y_i - y')\, G_{\sigma'}(x', y') \right\},   (5)

where we consider a neighborhood around the points of interest of -3\sigma' \le x', y' \le 3\sigma'. As we can see, the maximum weighted response is the value we assign to s_{σ_i, ρ_i, φ_i}(x, y). Now, at each pixel of the input image, each trained B-COSFIRE filter is applied by multiplying the obtained blurred and shifted DoG responses in the set S in the following way:

r_S(x, y) = \left( \prod_{i=1}^{|S|} \left( s_{\sigma_i, \rho_i, \phi_i}(x, y) \right)^{\omega_i} \right)^{1 / \sum_{i=1}^{|S|} \omega_i}, \quad \omega_i = \exp\left(-\frac{\rho_i^2}{2\hat{\sigma}^2}\right), \quad \hat{\sigma} = \frac{1}{3} \max_{i \in \{1, \ldots, |S|\}} \{\rho_i\},   (6)

where we obtain r_s(x, y) for the symmetric filter response and r_a(x, y) for the asymmetric filter response. As a small side note, Equation 6 shows exactly why the filter is called COSFIRE: the response is a combination, which refers to the product, of shifted filter responses, which refers to the shifts of the filter responses by (Δx_i, Δy_i) to the center of the B-COSFIRE filter support.

The sum of the symmetric and the asymmetric filter responses, r_{as}(x, y) = r_s(x, y) + r_a(x, y), is then the total response of the filtering process. To get a binary image in which each pixel is classified as either vessel or non-vessel, we threshold the filter response r_{as}(x, y) with threshold T, where pixel values of r_{as}(x, y) above T are considered as vessel pixels, i.e.:

g(x, y) = \begin{cases} 1, & \text{if } r_{as}(x, y) > T \\ 0, & \text{if } r_{as}(x, y) \le T \end{cases}   (7)
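To make the above concrete, the following Python sketch strings Equations 3-7 together: rotating the tuple set, blurring and shifting the DoG responses, combining them as a weighted geometric mean, and thresholding. It is a simplified illustration, not the reference implementation of [1]; it reuses the dog_response function from the earlier sketch, ignores image-boundary effects (np.roll wraps around) and assumes the usual image convention of x for columns and y for rows.

import numpy as np
from scipy.ndimage import shift as nd_shift

def rotate_tuples(S, psi):
    # Eq. (3): rotate every point of interest by the angle psi.
    return [(sigma, rho, phi + psi) for (sigma, rho, phi) in S]

def blur_and_shift(c, rho, phi, sigma0, alpha):
    # Eq. (4): blurring width grows with the circle radius rho.
    sigma_p = sigma0 + alpha * rho
    radius = int(np.ceil(3 * sigma_p))
    out = np.zeros_like(c)
    # Eq. (5): Gaussian-weighted maximum of the DoG response in a neighborhood.
    for xp in range(-radius, radius + 1):
        for yp in range(-radius, radius + 1):
            w = np.exp(-(xp ** 2 + yp ** 2) / (2.0 * sigma_p ** 2))
            moved = np.roll(np.roll(c, yp, axis=0), xp, axis=1)
            out = np.maximum(out, w * moved)
    # Shift the blurred response towards the center of support.
    dx, dy = -rho * np.cos(phi), -rho * np.sin(phi)
    return nd_shift(out, (dy, dx), order=1)

def bcosfire_response(image, S, sigma0, alpha):
    # Eq. (6): weighted geometric mean of the blurred, shifted DoG responses.
    sigma_hat = max(rho for (_, rho, _) in S) / 3.0
    weights, responses = [], []
    for (sigma, rho, phi) in S:
        c = dog_response(image, sigma)
        responses.append(blur_and_shift(c, rho, phi, sigma0, alpha))
        weights.append(np.exp(-rho ** 2 / (2.0 * sigma_hat ** 2)) if sigma_hat > 0 else 1.0)
    log_mean = sum(w * np.log(r + 1e-12) for w, r in zip(weights, responses)) / sum(weights)
    return np.exp(log_mean)

def segment(r_s, r_a, T):
    # Eq. (7): threshold the summed symmetric and asymmetric responses.
    return (r_s + r_a > T).astype(np.uint8)

For the rotation-invariant filter, one would take the maximum of bcosfire_response over all rotated tuple sets R_ψ(S).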
3 ALTERNATIVE METHODS

The algorithms for segmenting blood vessels in retinal fundus images can be divided into supervised and unsupervised methods. B-COSFIRE is an example of an unsupervised approach to vessel segmentation. Supervised methods first train on a set of example images with their ground truth segmentation provided. [12] discusses several blood vessel detection methods that were published by 2012. In this section, we discuss a few of these methods and compare them with each other. Table 1, Table 2 and Table 3 show some performance metrics of the methods we discuss.
One can use deep neural networks to extract blood vessels from a fundus image. Deep neural networks are neural networks with many (hidden) layers. Liskowski et al. [3] use deep learning to identify blood vessels in fundus images. The deep learning algorithm is the error back-propagation algorithm extended with dropout [13]. The error back-propagation algorithm is a common algorithm in the field of neural networks. With dropout, during training a percentage of units is temporarily disabled, introducing an extra challenge for the training process.

Liskowski et al. use the DRIVE [7][14], STARE [15][16] and CHASE [17] datasets to verify their deep learning method. To classify pixels as being either vessel or non-vessel, a patch of m × m pixels is used, centred at the pixel. The neural network is fed with a triple of such patches. Each of those patches is at the same location, but each patch is from one of the three RGB channels. As a result, this method uses all three RGB channels to determine whether a pixel belongs to a blood vessel or not. This is in contrast to the filter-based method B-COSFIRE, which only uses the green channel as if it were a gray scale image [1], for reasons discussed in subsection 2.2.

For their experiments, Liskowski et al. use multiple configurations. Table 1 and Table 2 show metrics for the two configurations they found best. Furthermore, Liskowski et al. used structured prediction (SP), an approach which also uses the neighbourhood of a pixel for the classification process. Separate results for the two best configurations using structured prediction are also shown in Table 1 and Table 2. Please refer to [3] for a detailed description of the configurations with and without structured prediction.

Method                   AUC     Acc     Se      Sp
B-COSFIRE                0.9614  0.9442  0.7655  0.9704
Liskowski et al. [3]
  Balanced               0.9738  0.9230  0.9160  0.9241
  No-Pool                0.9720  0.9495  0.7763  0.9768
  Balanced SP            0.9788  0.9530  0.8149  0.9749
  No-Pool SP             0.9790  0.9535  0.7811  0.9807
Fraz et al. [2]          0.9747  0.9480  0.7406  0.9807
Orlando et al. [4]       -       -       0.7897  0.9684

Table 1: Results on the DRIVE dataset.

Method                   AUC     Acc     Se      Sp
B-COSFIRE                0.9563  0.9497  0.7716  0.9701
Liskowski et al. [3]
  Balanced               0.9820  0.9309  0.9307  0.9304
  No-Pool                0.9785  0.9566  0.7867  0.9754
  Balanced SP            0.9928  0.9700  0.9075  0.9771
  No-Pool SP             0.9928  0.9729  0.8554  0.9862
Fraz et al. [2]          0.9768  0.9534  0.7548  0.9763
Orlando et al. [4]       -       -       0.7680  0.9738

Table 2: Results on the STARE dataset.

Method                   AUC     Acc     Se      Sp
B-COSFIRE                0.9487  0.9387  0.7585  0.9587
Liskowski et al. [3]
  Using DRIVE for training
    No-Pool              0.9646  0.9473  0.7158  0.9810
    No-Pool SP           0.9710  0.9515  0.7520  0.9806
  Using STARE for training
    No-Pool              0.9543  0.9525  0.7091  0.9791
    No-Pool SP           0.9880  0.9696  0.8145  0.9866
Fraz et al. [2]          0.9712  0.9469  0.7224  0.9711
Orlando et al. [4]       -       -       0.7277  0.9712

Table 3: Results on the CHASE dataset.

Fraz et al. [2] use an ensemble classifier of boosted and bagged decision trees. In short, they use multiple classifiers to classify the pixels and take a majority vote to determine whether a pixel is a vessel or non-vessel. Similar to the B-COSFIRE method, this method uses the green channel of the RGB image for the blood vessel segmentation. For their experiments, Fraz et al. use the DRIVE, STARE and CHASE datasets. The classifier is trained using a randomly selected subset of pixels for each image.
Orlando et al. [4] use conditional random fields (CRFs). In contrast to normal classifiers, CRFs do not classify "objects" purely based on the object itself. Note that the word "object" here can mean anything, such as an individual pixel or the activity depicted in an image. For example, when classifying which activity an image from a sequence of snapshots of someone's life displays, it can be useful to know which activity was classified before. CRFs work with graphs. Orlando et al. use a fully connected CRF (FC-CRF), where each pixel, being a node in the graph of the CRF, is linked to every other pixel. This has the advantage that the method can take long-range interactions between pixels into account, instead of only neighbouring information, which results in an improvement of the segmentation accuracy of the method [4]. Similar to the B-COSFIRE method, this method only uses the green channel of the RGB image for the blood vessel segmentation.
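To give an impression of what such a model optimises, the energy of a generic fully connected CRF over binary pixel labels y can be written as below. This is a standard formulation and only a sketch of the idea; the exact unary and pairwise terms used by Orlando et al. differ and are described in [4].

E(\mathbf{y} \mid I) = \sum_{i} \psi_u(y_i \mid I) + \sum_{i < j} \psi_p(y_i, y_j \mid I),

where ψ_u is a unary potential expressing how well the label y_i (vessel or non-vessel) fits the local appearance of pixel i, and ψ_p is a pairwise potential defined for every pair of pixels, which is what makes the model fully connected. Segmentation then amounts to finding the labeling y that minimises E.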
4 EXPERIMENTS
To study the performance of the filter-based B-COSFIRE method discussed in section 2, we experiment with the method on a recent retinal image dataset. In this section we report on the parameters that we use and how we evaluate the resulting segmentation.

4.1 Training parameters
A major advantage of the B-COSFIRE method is that the parameters are quite intuitive, so one can estimate a good set of parameter values for (σ, ρ, σ_0, α). In any case, the parameters must be carefully chosen, as the images of different retinal image datasets have varying properties. The parameters that are to be determined are:

• σ: The standard deviation of the outer Gaussian function in the DoG filter. Intuitively, the higher the detail in the image, i.e., the higher the resolution, the greater σ needs to be to still detect blood vessels with high accuracy, since they are made up of more pixels in high resolution images.
• ρ: The set of radii of the concentric circles surrounding the filter's center of support.
• σ_0: The base standard deviation of the Gaussian weighting function used for blurring the DoG responses, to allow for some tolerance in the position of the found points (see Equation 4).
• α: The parameter that determines how the standard deviation of the Gaussian used for blurring the DoG responses grows. The higher α, the more the standard deviation grows for larger concentric circles (see Equation 4).

To determine optimal (σ, ρ, σ_0, α), we split the dataset into a training set and a validation set. We use the training set for finding a good set of (σ, ρ, σ_0, α) and validate the decision on the entire dataset. We limit the search space of (σ, ρ, σ_0, α) by results found on specific datasets in previous research, such as the results of Azzopardi et al., in which for instance optimal values of σ were found for certain resolutions; this hints us to the region in which the optimal value of σ is likely to reside. The procedure for determining the parameter values can then be summarized as follows (a small code sketch of the selection criterion in steps 3 and 4 is given after Table 5):

1. Split the dataset into an equal sized training set and a validation set, each consisting of n images.
2. For each combination (σ_s, ρ_s, σ_{0,s}, α_s) in a search space which depends on the properties of the dataset, take the filter corresponding to this combination of parameters and filter all n training images with it, obtaining n response images.
3. Iterate over a range of thresholds t ∈ [0, 255] with a fixed step size. For each threshold t, segment all n response images according to Eq. (7). This yields n binary images with the vessel segmentation.
4. Having obtained these n binary images, we can compare all n images with the ground truth vessel segmentation images. Each vessel segmentation for one image f obtained from the filter and threshold combination, paired with the ground truth segmentation of image f, yields a confusion matrix, as shown in Table 4. We then compute for each of these n confusion matrices the Matthews Correlation Coefficient (MCC), which is directly computable from the confusion matrix as only quantities from the confusion matrix are used:

MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}   (8)

The MCC is particularly suitable for the assessment of binary classifiers, and can even be used if there is an imbalance between the number of data points, in this case pixels, per class. In our use case the MCC is therefore suitable, as vessel segmentation is a binary classification problem with an imbalance between classes: there are naturally more non-vessel pixels than vessel pixels. Having computed the MCC for the responses of all training images thresholded with the current threshold, we average this set of MCC values. After that we consider the next threshold and repeat step 4 to compute the new average MCC.

In the end, the combination of parameters (σ, ρ, σ_0, α), along with a threshold t, that yields the best average MCC on the training set gives the parameters that we will use for segmentation of blood vessels in the entire dataset.

                  Classifier: Vessel    Classifier: Non-Vessel
GT: Vessel        TP                    FN
GT: Non-Vessel    FP                    TN

Table 4: Confusion matrix: the classifier output (the thresholded filter response) is on the horizontal axis. On the vertical axis is the Ground Truth (GT), which is the assessment of an expert who manually annotates all the pixels in the image as either vessel or non-vessel.

For the best set of filter parameters obtained from the above procedure, we store for each threshold t the corresponding average TPR and FPR over the segmented training images. By plotting the obtained values of TPR against FPR, we get a so-called ROC curve. The threshold t that gave the best average MCC value over the segmented training images is the threshold we use for the segmentation of the images in the validation set, and of possible future images that are obtained with the same properties as the dataset under consideration.

              σ     ρ_max   σ_0   α
Symmetric     4.8   20      3     0.3
Asymmetric    4.4   36      1     0.1

Table 5: The parameters used for the symmetric and the asymmetric filter for segmentation of blood vessels in the images of the IOSTAR dataset.
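The selection criterion of steps 3 and 4 is easy to express in code. Below is a minimal Python sketch, assuming the segmented images and ground truths are boolean NumPy arrays restricted to the FOV; names such as response_images and ground_truths are placeholders, not part of the actual pipeline.

import numpy as np

def mcc(segmented, ground_truth):
    # Matthews Correlation Coefficient, Eq. (8), from one confusion matrix.
    tp = float(np.sum(segmented & ground_truth))
    tn = float(np.sum(~segmented & ~ground_truth))
    fp = float(np.sum(segmented & ~ground_truth))
    fn = float(np.sum(~segmented & ground_truth))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0

def best_threshold(response_images, ground_truths, thresholds=range(0, 256)):
    # Steps 3-4: pick the threshold with the highest average MCC
    # over the n training images.
    best_t, best_score = None, -1.0
    for t in thresholds:
        scores = [mcc(r > t, g) for r, g in zip(response_images, ground_truths)]
        avg = float(np.mean(scores))
        if avg > best_score:
            best_t, best_score = t, avg
    return best_t, best_score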
4.2 The IOSTAR dataset

The retinal images in the IOSTAR dataset were obtained using Scanning Laser Ophthalmoscopy (SLO) [5], [18]. In 2015, the dataset was one of the first publicly available retinal datasets of which the data was obtained using this technique. The resolution of the images is 1024 × 1024 pixels. Based on the findings of Azzopardi et al., which show an expected positive correlation between resolution and σ, we limit the search space of σ accordingly. We choose a maximum radius ρ_max and, following Azzopardi et al., increment the radii of the circles by 2 pixels, so that the considered radii range from the smallest circle up to ρ_max in steps of 2 pixels. The parameters σ_0 and α are likewise restricted to small sets of candidate values.

Once we choose the combination of parameters for the symmetric filter that yields the best average MCC over the images in the training set, we fix these parameters and search for a good combination of parameters for the asymmetric filter. Note that we know in advance that vessel endings are usually thinner, and therefore we search for σ_a in a value range that is strictly smaller than the obtained σ_s.
5 RESULTS

In Figure 5 the ROC curve is traced out for the segmentation of the training images in the IOSTAR dataset with the set of parameters in Table 5, using the manually segmented ground truth images as the correct classification against which the filter segmentation is compared. The threshold with the highest average MCC value is also indicated, which turns out to be a threshold of 35. In Table 6 we see the resulting AUC value and also the segmentation accuracy corresponding to the best threshold. In Figure 4 we see an original IOSTAR training image alongside the segmented image using the filter parameters in Table 5 and threshold t = 35. In line with the good performance metrics, we can also see that the resulting segmentation is quite accurate, in the sense that the main vessel tree has been segmented correctly and even the tinier vessels are for the most part present in the segmentation.

(a) An IOSTAR retinal fundus image (b) B-COSFIRE segmented image

Fig. 4: Left: An original training image from the IOSTAR dataset. Right: The corresponding segmentation using the trained B-COSFIRE filters with parameters specified in Table 5 and threshold t = 35.

Fig. 5: The ROC curve obtained by varying the segmentation threshold from 0 to 255, which yields for each threshold a true positive rate (TPR) and a false positive rate (FPR). Indicated with a square is the TPR and FPR of the threshold t = 35, which yielded the best average MCC value.

         AUC     MCC     Accuracy  Se      Sp
IOSTAR   0.9519  0.6979  0.9419    0.7705  0.9613

Table 6: IOSTAR segmentation accuracy with the parameters specified in Table 5, using the segmentation threshold t = 35.

From an experiment in [1] on the DRIVE retinal image dataset, it has been established that the parameter α is the most sensitive parameter, i.e., a correct configuration of this parameter is most determinant for the performance of the segmentation. Here we perform a similar experiment for the IOSTAR dataset: from the parameter values as listed in Table 5 for the symmetric filter, we inspect the sensitivity of each parameter, one at a time, by changing the parameter value in steps of 0.1 from the optimal value and keeping the rest of the parameters at their optimal values. We then compare the resulting 30 MCC values with the 30 MCC values obtained for the optimal parameter values in Table 5 by a two-tailed paired t-test with significance level p = 0.05. The null hypothesis of equal mean MCC is rejected when p < 0.05; the t-values for which this is true are bold-faced in Table 7.

As can be seen in Table 7, the parameters have a different level of sensitivity. From our experiments, the parameter α turns out to be the most sensitive parameter. By a small change the t-statistic changes considerably, and we found that even a change of 0.1 yields a significantly different segmentation performance. We observe that the parameters σ and σ_0 are less sensitive, as the change in t-value is smaller compared to the case of α. However, we notice that the σ parameter is more sensitive when it is increased from σ = 4.8 than when it is decreased, and the same holds for σ_0: it is less sensitive when decreased than when increased. Our observation that α is the most sensitive parameter is in line with the result in [1].

Offset   σ        σ_0      α
-0.5     -0.7858  -        -
-0.4     -0.5248  -        -
-0.3     -0.3723  -        -0.9215
-0.2     1.3496   -0.1580  1.7180
-0.1     0.5614   -0.0862  -2.4933
+0.2     -1.1272  -0.9986  -3.6321
+0.3     -1.5860  -1.9217  -4.3719
+0.4     -1.9512  -2.5296  -4.9806
+0.5     -2.2660  -3.1036  -5.4493

Table 7: An experiment to study the sensitivity of the parameters σ, σ_0 and α for a symmetric B-COSFIRE filter on the 30 images of the IOSTAR dataset. Each row compares the 30 MCC values for the optimal parameters with the 30 MCC values obtained after changing one parameter value by the offset in the first column, keeping the remaining two parameters at their optimal values; each entry is the resulting t-statistic. Bold-faced are the t-values for which the null hypothesis is rejected with significance level p < 0.05.
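The paired t-test used for Table 7 can be reproduced with SciPy. In the sketch below, mcc_optimal and mcc_offset are placeholder arrays holding the 30 per-image MCC values for the optimal parameters and for one perturbed parameter setting, respectively.

from scipy.stats import ttest_rel

def sensitivity_test(mcc_optimal, mcc_offset, alpha=0.05):
    # Two-tailed paired t-test between the 30 MCC values of the optimal
    # setting and the 30 MCC values of one perturbed parameter setting.
    t_stat, p_value = ttest_rel(mcc_offset, mcc_optimal)
    return t_stat, p_value, p_value < alpha

A negative t-statistic then indicates that the perturbed setting performs worse on average, as in most rows of Table 7.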
6 SUMMARY AND DISCUSSION

In this contribution we discussed various methods for the automatic segmentation of vessels in retinal fundus images. In this section we discuss how they perform compared to the B-COSFIRE method and summarize our insights about the B-COSFIRE method.
Given the results of Liskowski et al., their neural network approach performs better than B-COSFIRE in general. In all cases, the neural network achieves a higher AUC value than B-COSFIRE. However, in terms of accuracy and specificity, B-COSFIRE does outperform the balanced configuration of the network on the DRIVE and STARE datasets. For the CHASE dataset, the neural network outperforms B-COSFIRE.

In terms of training, the B-COSFIRE method is cheaper, as Liskowski et al. report that training the network can take up to 8 hours on a single GPU. However, training times are not that interesting for the end user, as one usually does not need to train the classifier for each new image. To classify an image, the neural network needs 92 seconds using a high end GPU, whereas the B-COSFIRE method only needs 10 seconds on a 2 GHz CPU. So, for end users with low end hardware and a limited budget, the B-COSFIRE method could be a better choice in terms of time. If the hardware or budget is available, however, the neural network is suitable, because we consider 92 seconds still an acceptable time, given the experimental results.
Compared to the B-COSFIRE method, the method of Fraz et al. is more accurate when tested on the three datasets (DRIVE, STARE and CHASE); however, the B-COSFIRE method has a higher sensitivity or true positive rate (TPR), i.e., B-COSFIRE is better able to detect vessel pixels. B-COSFIRE is also faster: it only needs 10 seconds to process an image from the DRIVE or STARE datasets, whereas the ensemble classification method needs 2 minutes.

The method of Fraz et al. thus performs slightly better than B-COSFIRE, but requires expensive training with retinal fundus images. The parameters of the B-COSFIRE method must either be tweaked manually by hand or be determined using a training set of images. Based on experiments with the intuitive parameters of B-COSFIRE, manually tweaking the parameters quickly yields a good performance, which is a major advantage of the B-COSFIRE method.
Orlando et al. did not provide AUC and accuracy metrics; therefore, we only use the sensitivity and specificity to compare the fully connected conditional random field method with B-COSFIRE. As it turns out, there is no clear winner. For the DRIVE dataset, Orlando et al. achieve higher sensitivity, but lower specificity. For the STARE dataset, it is the other way around, and for the CHASE dataset, Orlando et al. likewise achieve lower sensitivity and higher specificity.
Furthermore, we have studied the inner workings of the B-COSFIRE method and subsequently applied the method, for a set of parameters, to the recent IOSTAR dataset. We have seen that B-COSFIRE can achieve an accurate and fast segmentation on this dataset. B-COSFIRE has turned out to be a fast trainable filter-based method that can definitely compete with much more complex methods based on machine learning techniques like neural networks, which take a considerable amount of time to train. Given the complexity of those methods, B-COSFIRE's potential to compete well and its intuitive parameters, B-COSFIRE could even be the preferred method to use, especially when less computational power is at hand.
Segmentation methods like the ones we have discussed in this paper are not solely useful for blood vessel segmentation. One could use these methods to segment rivers from satellite images, to segment the nerves of a leaf, or to find cracks in concrete structures; in fact, these methods could prove useful in all applications in which elongated structures need to be segmented. For future work, any of these methods could be extended to not only extract the blood vessel tree but to also detect obstacles that could impede the blood flow. An additional benefit of B-COSFIRE is that its selectivity is not pre-programmed into the method itself but is achieved by prototype patterns on which the filter is automatically configured. This gives the possibility to extend the filter's selectivity to all kinds of more complex structures, which gives rise to applications other than vessel segmentation alone.

REFERENCES

[1] G. Azzopardi, N. Strisciuglio, M. Vento, and N. Petkov, "Trainable COSFIRE filters for vessel delineation with application to retinal images," Medical Image Analysis, vol. 19, pp. 46-57, 2015.
[2] M. M. Fraz, P. Remagnino, A. Hoppe, B. Uyyanonvara, A. R. Rudnicka, C. G. Owen, and S. A. Barman, "An ensemble classification-based approach applied to retinal blood vessel segmentation," IEEE Transactions on Biomedical Engineering, vol. 59, pp. 2538-2548, Sept 2012.
[3] P. Liskowski and K. Krawiec, "Segmenting retinal blood vessels with deep neural networks," IEEE Transactions on Medical Imaging, vol. 35, pp. 2369-2380, Nov 2016.
[4] J. I. Orlando, E. Prokofyeva, and M. B. Blaschko, "A discriminatively trained fully connected conditional random field model for blood vessel segmentation in fundus images," IEEE Transactions on Biomedical Engineering, vol. 64, pp. 16-27, Jan 2017.
[5] J. Zhang, B. Dashtbozorg, E. Bekkers, J. P. W. Pluim, R. Duits, and B. M. ter Haar Romeny, "Robust retinal vessel segmentation via locally adaptive derivative frames in orientation scores," IEEE Transactions on Medical Imaging, vol. 35, pp. 2631-2644, Dec 2016.
[6] G. Azzopardi and N. Petkov, "Trainable COSFIRE filters for keypoint detection and pattern recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence.
[7] J. Staal, M. D. Abramoff, M. Niemeijer, M. A. Viergever, and B. van Ginneken, "Ridge-based vessel segmentation in color images of the retina," IEEE Transactions on Medical Imaging, vol. 23, no. 4, pp. 501-509, 2004.
[8] A. M. Mendonca and A. Campilho, "Segmentation of retinal blood vessels by combining the detection of centerlines and morphological reconstruction," IEEE Transactions on Medical Imaging, vol. 25, pp. 1200-1213, Sept 2006.
[9] J. V. B. Soares, J. J. G. Leandro, R. M. Cesar, H. F. Jelinek, and M. J. Cree, "Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification," IEEE Transactions on Medical Imaging, vol. 25, pp. 1214-1222, Sept 2006.
[10] E. Ricci and R. Perfetti, "Retinal blood vessel segmentation using line operators and support vector classification," IEEE Transactions on Medical Imaging, vol. 26, pp. 1357-1365, Oct 2007.
[11] M. Niemeijer, J. Staal, B. van Ginneken, M. Loog, and M. Abramoff, "Comparative study of retinal vessel segmentation methods on a new publicly available database," in SPIE Medical Imaging (J. M. Fitzpatrick and M. Sonka, eds.), vol. 5370, pp. 648-656, SPIE, 2004.
[12] M. Fraz, P. Remagnino, A. Hoppe, B. Uyyanonvara, A. Rudnicka, C. Owen, and S. Barman, "Blood vessel segmentation methodologies in retinal images - a survey," Computer Methods and Programs in Biomedicine, vol. 108, no. 1, pp. 407-433, 2012.
[13] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learn. Res., vol. 15, pp. 1929-1958, Jan. 2014.
[14] "DRIVE Dataset," 2004. [Online; accessed March 5th, 2017].
[15] A. Hoover, V. Kouznetsova, and M. Goldbaum, "Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response," IEEE Transactions on Medical Imaging, vol. 19, pp. 203-210, 2000.
[16] "STARE Dataset." http://cecas.clemson.edu/~ahoover/stare/, 2004. [Online; accessed March 5th, 2017].
[17] "CHASE Dataset." https://blogs.kingston.ac.uk/retinal/chasedb1/. [Online; accessed March 5th, 2017].
[18] "IOSTAR Dataset."