A fuzzy approach for segmentation of touching characters
Giuseppe Airò Farulla, Nadir Murru, Rosaria Rossini
Department of Control and Computer Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
Department of Mathematics, University of Turin, Via Carlo Alberto 10, 10121 Torino, Italy
Istituto Superiore Mario Boella, Center for Applied Research on ICT, Via Pier Carlo Boggio 61, 10138 Torino, Italy
Abstract
The problem of correctly segmenting touching characters is a hard task to solve and it is of major relevance in pattern recognition. In recent years, many methods and algorithms have been proposed; still, a definitive solution is far from being found. In this paper, we propose a novel method based on fuzzy logic. The proposed method combines in a novel way three features for segmenting touching characters that have already been proposed in other studies but have so far been exploited only individually. The proposed strategy is based on a 3–input/1–output fuzzy inference system with fuzzy rules specifically optimized for segmenting touching characters in the case of Latin printed and handwritten characters. The system performance is illustrated and supported by numerical examples showing that our approach can achieve a reasonably good overall accuracy in segmenting characters, even under difficult touching conditions. Moreover, numerical results suggest that the method can be applied to many different datasets of characters by means of a convenient tuning of the fuzzy sets and rules.
Automatic recognition of both printed and handwritten characters remains a challenging problem in pattern recognition. Most existing Optical Character Recognition (OCR) software deals with it by simultaneously exploiting two highly correlated techniques: character segmentation and pattern recognition. As part of the OCR process, character segmentation techniques are applied to obtain the patterns representing the individual characters to be recognized. The simplest way to perform character segmentation would be to exploit the space between characters. Unfortunately, this strategy fails on mathematical formulae and on handwritten or printed words with touching characters, which are well-known issues that often occur in degraded (e.g., photocopied) or compressed text images [39]. In these situations, two or more adjacent characters touch each other and share common pixels. Identifying the touching regions and providing a correct segmentation is crucial to recognition, since incorrectly segmented characters are unlikely to be correctly recognized even by high-performance pattern recognition algorithms [52]. In fact, many researchers state that errors in character segmentation affect overall pattern recognition performance more than a degradation of the starting image [14]. Segmentation of touching components is crucial to get higher recognition rates from OCR systems [6].

Common techniques for character segmentation exploit several aspects characterizing letters and their shapes, such as vertical projection, pitch estimation or character size, contour analysis, or segmentation–recognition coupled techniques [24], [27]. One of the most difficult problems an image segmentation algorithm has to address is the segmentation of touching characters [38], [41]. Very often, adjacent characters touch and may overlap, making it hard to segment a given expression or word correctly into its character components [4, 22].
Given the relevance of this challenging task, several methods have been developed in recent years for performing optimal segmentation of touching characters. Kurniawan et al. [18] identify touching positions in Latin handwritten characters by means of self-organizing feature maps and a region–based approach. In [20] and [21], the authors also deal with segmentation of Latin handwritten texts. Different approaches involving thinning algorithms can be found in [26] and [36]. Among the earlier pieces of work on touching character segmentation, some approaches rely on contour analysis of the connected components [17, 7]. In [45], Sharma et al. study the problem of detecting arbitrarily–oriented text in video frames; cut positions in touching characters are evaluated using the top distance profile. Roy et al. [40] addressed the problem of segmenting touching characters with different orientations. In [50] and [42], the authors developed segmentation approaches leveraging genetic algorithms. Rehman et al. [37] identify character boundaries by using a set of heuristic rules. Louloudis et al. [23] performed text line and word segmentation of handwritten documents by applying the Hough transform. In [29], the authors addressed the problem of automatically segmenting words from historical handwritten documents. The water reservoir algorithm has been exploited in some studies, e.g., [35] and [19]. In [2] and [47], the authors presented methods derived by combining different segmentation techniques. Further methods can be found in [1], [5], [8], [11] and [51].
It is worth mentioning that works [33] and [9] also deal with segmenting characters and symbols within mathematical expressions.

Generally, these strategies need a preprocessing step where the input RGB image is converted to grayscale by eliminating the hue and saturation values while retaining the luminance, and the grayscale image is then converted to binary, obtaining a matrix whose entries are 0 for foreground pixels (black) and 1 for background pixels (white). Since different thresholds can be used for this conversion, it is possible that some features of the characters are lost in the process. In our experiments we resort to the Otsu thresholding method [34], which has proven to be very robust to noise and to changes in scenes and input images, providing thresholds for image binarization that keep information loss to a minimum.

A function is then used to evaluate each column of the matrix, leveraging features that characterize typical character positions within the text and giving a value to each column. Finally, cut positions (i.e., columns) are chosen depending on these values. Commonly employed functions are the ratio of the second difference of the vertical projection (e.g., [15]) and the peak–to–valley function (e.g., [25]). Other functions are based, e.g., on the number of black pixels, the number of white pixels counted from the top of the column to the first black pixel, the crossing count (i.e., the number of black-to-white transitions), the number of identical black (white) pixels with the left (right) column, and the width-to-height ratio of the remaining left (right) pattern after cutting (e.g., [3]). Another approach can be found in [10], where the authors used the inverse crossing count, a measure of blob thickness and the degree of "middleness" to define a function that identifies when a column, a row, or a diagonal is a cutting position.

However, we argue that none of the methods and approaches mentioned above provides a comprehensive answer to the problem of segmenting touching characters.
Indeed, their performance is not always optimal and depends radically on the specific set of characters involved in the segmentation. At the moment, there is no standard approach for the segmentation of touching characters. Thus, this is currently an active research field.

There has always been a dilemma whether it is more convenient to segment first and then recognize the patterns, or instead to classify while segmenting. The authors of [4] review the work of Casey and Lecolinet [6], stating that the strategies for segmentation can be classified into three main classes as follows:
1. the classical approach already described;
2. recognition-based segmentation, in which a search is made for image components that match the character classes in a valid alphabet;
3. the holistic approach, which attempts to recognize the word as a whole.

In this classification, our work lies close to the first class, while presenting some important advances: in fact, we aim at combining some of the previously cited features, usually exploited one at a time, by means of an original fuzzy logic approach in order to improve performance in separating touching characters. Indeed, the selection of the features that characterize touching positions is an art rather than a technique. In other words, the selection of the features mainly depends on the experience of the authors. Thus, in this context, fuzzy logic can be very useful, since it is well suited to capturing and encoding expert–based knowledge in view of performing targeted simulations. Taking this into strong consideration, we also leverage optimization techniques to increase the overall performance of our approach.

Fuzzy logic has already been exploited to perform image segmentation. For instance, Garain and Chaudhuri [10] used fuzzy multifactorial analysis to combine some of the features previously described. In [32], a survey on image segmentation techniques using fuzzy clustering is presented.
Fuzzy logic has also been exploited for developing segmentation–recognition coupled techniques [12]. A non–linear fuzzy approach can be found in [44]. In [31], the authors used edge corners and fuzzy logic to develop segmentation techniques exploited to break CAPTCHAs. Further approaches can be found in [13], [46], [48].

In this paper, we propose a novel fuzzy approach that differs from the state of the art in several aspects. Firstly, we combine by means of a fuzzy strategy some features that have never been exploited together in previous works. Secondly, we develop an original strategy based on a 3–input/1–output inference system with fuzzy rules specifically optimized for the purpose of separating touching characters in the case of Latin printed and handwritten characters. The strength of our fuzzy strategy lies in the possibility of adjusting its parameters so that they fit the characteristics of the dataset. In other words, the parameters of the method are extracted a priori considering the different characters in the dataset.

The inference engine is based on the Mamdani model with if–then rules, minimax set–operations, sum for composition of activated rules and defuzzification based on the centroid method. We have chosen the Mamdani model since it is well suited to capturing and encoding expert–based knowledge [28].

The paper is organized as follows. In Section 2, we present the fuzzy strategy developed for performing segmentation of touching characters. In Section 3, we present the numerical results that show the effectiveness of the proposed method. Specifically, in Section 3.1, we describe the datasets used for simulations. Sections 3.2 and 3.3 are devoted to testing the method on datasets of Latin printed and handwritten characters, respectively. Finally, in Section 4 we draw some conclusions and present future work.

In the following, we only focus on binarized images, for homogeneity with the solutions already presented.
In a binarized image, a pattern can be represented by a matrix whose entries are 0 (black pixels) and 1 (white pixels). Generally, methods for segmenting touching characters define a function based on some features that characterize cut positions. Then, such a function is evaluated for each column of the matrix and the cut position is chosen depending on these values. Classical functions of this kind are the peak–to–valley function $g$ and the function $h$, defined as

$$g(i) = \frac{V(l_i) - 2V(i) + V(r_i)}{V(i) + 1}, \qquad h(i) = \frac{V(i-1) - 2V(i) + V(i+1)}{V(i)},$$

where $V(i)$ denotes the vertical projection function for the $i$-th column, and $l_i$ and $r_i$ are the peak positions on the left side and right side of $i$, respectively. The column with the highest value of $g$ (or $h$) is identified as the cutting column. A further feature that can suggest whether the $i$-th column is a cut position is the distance $f(i)$ between $i$ and the center of the pattern. Indeed, cutting columns are generally located near the center of the pattern. Clearly, this feature should only be considered as an indication of the neighborhood where the cutting column is probably located. We will therefore exploit it in combination with the previous functions, with the aim of correcting the results they provide. In this section, we combine functions $f$, $g$, and $h$ by means of a fuzzy strategy that conveniently balances them.

Let us introduce the notion of a "fuzzy degree" qualifying a column $i$ to be a cut position: in short, $\rho = \rho(i) \in [0, 1]$. The lower the value of $\rho$, the more probable it is that we have located a good cutting position. The strategy can be detailed by means of the fuzzification of the functions $f$, $g$, $h$.

Given a pattern in a binarized image, let $A$, $m$, $n$, and $c$ be the matrix of pixels of the binarized image, the number of rows of $A$, the number of columns of $A$, and the central column of $A$, respectively.
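As a rough illustration, the two classical functions $g$ and $h$ can be written in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the function names are hypothetical, the image is assumed to be a list of rows with 0 for black pixels, $V(i)$ is assumed to count the black pixels of column $i$, the peaks $l_i$ and $r_i$ are taken to be the highest projection values on either side of $i$, the central term is assumed to carry the factor 2 of a second difference, and columns are 0-based.

```python
def vertical_projection(a):
    """V(i): number of black (0) pixels in column i of a binary matrix (list of rows)."""
    return [sum(1 for row in a if row[i] == 0) for i in range(len(a[0]))]


def peak_to_valley(v, i):
    """g(i) = (V(l_i) - 2*V(i) + V(r_i)) / (V(i) + 1) for an interior column i.

    Assumption: V(l_i) and V(r_i) are the largest projection values strictly
    to the left and strictly to the right of column i.
    """
    if not 0 < i < len(v) - 1:
        raise ValueError("g is only evaluated on interior columns in this sketch")
    return (max(v[:i]) - 2 * v[i] + max(v[i + 1:])) / (v[i] + 1)


def second_difference_ratio(v, i):
    """h(i) = (V(i-1) - 2*V(i) + V(i+1)) / V(i) for an interior column i."""
    if not 0 < i < len(v) - 1:
        raise ValueError("h is only evaluated on interior columns in this sketch")
    if v[i] == 0:
        # Sketch choice: a column without black pixels is rejected, not guessed.
        raise ValueError("h is undefined where the column has no black pixels")
    return (v[i - 1] - 2 * v[i] + v[i + 1]) / v[i]


# Toy 4x5 pattern (0 = black, 1 = white): two tall strokes joined by a thin bridge.
pattern = [
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
v = vertical_projection(pattern)
interior = range(1, len(v) - 1)
cut_by_g = max(interior, key=lambda i: peak_to_valley(v, i))
cut_by_h = max(interior, key=lambda i: second_difference_ratio(v, i))
print(v, cut_by_g, cut_by_h)
```

On this toy pattern both functions peak at the thin bridge in the middle; on real patterns they can disagree, which is the situation the fuzzy combination is meant to handle.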
In the following, when we refer to a column $i$ of $A$, we mean the $i$-th column of $A$, i.e., either the vector of length $m$ whose elements are the entries of the $i$-th column or simply its position. This will be clear from the context.

The central column $c$ is evaluated by means of $c = \frac{n+1}{2}$. When $n$ is odd, $c$ is clearly the central column of $A$; when $n$ is even, we take as the central column the mean between the $\frac{n}{2}$-th and the $\left(\frac{n}{2}+1\right)$-th columns, even if in this case $c$ is not an integer. In this way, for each column $i$ of $A$, we define its distance from the center of the pattern as $f(i) = |c - i|$. In our fuzzy strategy, we take into account the normalized distance between each column $i$ and the central column $c$, i.e., we consider $\bar{f}(i) = \frac{f(i)}{c}$.

Similarly, for each column $i$ of $A$, instead of directly using the functions $g$ and $h$, we consider the normalized functions

$$\tilde{g}(i) = \frac{g(i) - \min_{j \in \mathcal{C}} g(j)}{\max_{j \in \mathcal{C}} g(j) - \min_{j \in \mathcal{C}} g(j)}, \qquad \tilde{h}(i) = \frac{h(i) - \min_{j \in \mathcal{C}} h(j)}{\max_{j \in \mathcal{C}} h(j) - \min_{j \in \mathcal{C}} h(j)},$$

where $\mathcal{C} = \{1, 2, \ldots, n\}$ is the set of the columns of $A$. Note that the functions $\tilde{g}$ and $\tilde{h}$ are well defined since we consider matrices $A$ where at least two columns are different. Finally, in the following we will use the functions

$$\bar{g} = 1 - \tilde{g}, \qquad \bar{h} = 1 - \tilde{h},$$

so that low values of $\bar{g}$ and $\bar{h}$ identify cutting columns.

Functions $\bar{f}$, $\bar{g}$, $\bar{h}$ are fuzzified by defining convenient fuzzy sets and related membership functions. The fuzzy degree $\rho$ is evaluated by combining these functions by means of fuzzy rules. Fuzzy sets, membership functions and fuzzy rules are specified in Sections 3.2 and 3.3.

The inference engine is the basic Mamdani model [28], with if–then rules, minimax set–operations, sum for composition of activated rules, and defuzzification based on the centroid method.
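The inference scheme just named (minimum for "and", complement for "not", clipped consequents summed together, centroid defuzzification) can be illustrated with a small, self-contained Python sketch. Everything specific in it is an assumption made for illustration: the trapezoidal shape of the membership functions, their breakpoints, the helper names, the discretization of the output interval, and the choice of only three rules of the kind used later for the printed characters. The sets and rules actually used are those specified in Sections 3.2 and 3.3.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear in between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical fuzzy sets on [0, 1], shared by inputs and output for brevity.
SETS = {
    "low": (0.0, 0.0, 0.3, 0.5),
    "medium": (0.3, 0.45, 0.55, 0.7),
    "high": (0.5, 0.7, 1.0, 1.0),
}


def fuzzify(x):
    """Degrees of membership of a crisp value x in Low, Medium and High."""
    return {name: trapezoid(x, *params) for name, params in SETS.items()}


def cutting_degree(f_bar, g_bar, h_bar, steps=101):
    """Mamdani-style inference returning rho in [0, 1], or None if no rule fires."""
    f, g, h = fuzzify(f_bar), fuzzify(g_bar), fuzzify(h_bar)
    # Each rule: (firing strength via min / complement, label of the consequent).
    rules = [
        (min(f["low"], h["low"]), "low"),
        (min(f["medium"], 1.0 - h["high"]), "medium"),
        (min(f["high"], g["high"]), "high"),
    ]
    grid = [k / (steps - 1) for k in range(steps)]
    # Clip every consequent at its rule strength, then sum the clipped sets.
    aggregated = [
        sum(min(strength, trapezoid(x, *SETS[label])) for strength, label in rules)
        for x in grid
    ]
    total = sum(aggregated)
    if total == 0:
        return None
    # Centroid of the aggregated output set on the discretized universe.
    return sum(x * mu for x, mu in zip(grid, aggregated)) / total


# A central column with low inputs should score lower than a peripheral one.
print(cutting_degree(0.1, 0.1, 0.1), cutting_degree(0.9, 0.9, 0.9))
```

With only three rules some input combinations activate nothing, which is why the sketch returns `None` in that case; a complete rule base would be designed to avoid such gaps.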
The Mamdani model is well suited to capturing and encoding expert–based knowledge in view of performing targeted simulations; accordingly, the system's performance is tuned by means of expert–based choices, heuristic criteria and non–linear optimization methods.

Figure 1: Samples of, respectively, two (a), three (b), and four (c) handwritten cursive touching character patterns.
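The normalizations introduced in Section 2 (central column, normalized distance, min–max rescaling followed by the complement) translate directly into code. The following is a minimal sketch with hypothetical function names; it assumes 1-based column indices as in the text and plain Python lists for the values of $g$ or $h$, and it rejects constant inputs, for which the min–max rescaling is not defined.

```python
def normalized_distance(n):
    """f_bar(i) = |c - i| / c with c = (n + 1) / 2, for columns i = 1, ..., n."""
    c = (n + 1) / 2
    return [abs(c - i) / c for i in range(1, n + 1)]


def complemented_min_max(values):
    """Map g (or h) to g_bar = 1 - (g - min g) / (max g - min g)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalization needs at least two distinct values")
    return [1.0 - (x - lo) / (hi - lo) for x in values]


# Five columns: the central one (i = 3) has distance 0 from the center.
f_bar = normalized_distance(5)
# Hypothetical values of g on three columns: the largest g becomes the smallest g_bar.
g_bar = complemented_min_max([1.0, 2.5, 1.0])
print(f_bar, g_bar)
```

After the complement, all three inputs point in the same direction: small values of $\bar{f}$, $\bar{g}$ and $\bar{h}$ each suggest a cutting column.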
Although creating a specific dataset of touching characters is not our main contribution, in our research we faced the lack of good, standardized datasets on which to test our algorithms. This forced us to build a dataset of touching characters. This need is shared with many other researchers and works, including recent ones such as [43] (although this last paper refers to Persian handwriting recognition algorithms rather than to Latin character-based ones). In particular, we created two different datasets for Latin characters, one containing handwritten cursive characters, the other containing sided machine printed characters. In the following, we discuss our choices and methodology.

Due to the different characteristics of the two datasets, in the remaining parts of the section we discuss the process of selecting the best parameters for running our method on each of them. Our fuzzy strategy has specific parameters meant to capture the differences between one dataset and another. Such parameters are tailored to the specific dataset characteristics by means of heuristics and non–linear optimization methods. Moreover, we would like to point out that the three features combined in our fuzzy routine seem to be sufficient for obtaining good results in the segmentation. In fact, adding further features appears to bring no benefit. For instance, we have verified that the use of the crossing count as a fourth fuzzy input did not lead to improvements. The experimental results are presented in the following.
The ultimate purpose of getting good segmentation results, especially when touching characters are considered, is to boost overall recognition accuracy. Although, to be fair, different approaches should be evaluated on the basis of their recognition performance, not many authors publish their results on a benchmark database [49]. Also, as stated in many state-of-the-art works, e.g., [18], a comprehensive dataset specific to touching characters is unfortunately still missing. As a consequence, it is difficult for researchers to conduct experiments and to analyze any proposed method; in addition, it is hard to fairly compare performance and results obtained by different authors. We try to overcome this obstacle by proposing two datasets.

The first one, called dataset A, contains images of handwritten cursive characters that we have built relying on samples from a standard dataset, in the fashion of [18]. In particular, we started from the CCC database [5]. The CCC database contains 57,293 samples of cursive characters that were manually extracted from images coming from different input sources, mainly related to American Post Services. They include both upper and lower case letters. Each sample is stored as a binary matrix within the database, and is accompanied by information about the size of the matrix itself and the character that is represented. Starting from the whole database, we developed a MATLAB script to randomly extract 1,000 of its samples, taking care to maintain a uniform distribution over all the characters chosen, both in their upper and lower case versions. These samples were later combined and merged together to form two, three, and four touching character patterns, each of which is accompanied by a textual descriptor indicating the index of the proper cut column (or columns, in the case of three and four touching character patterns). One sample of each category of patterns is shown in Figure 1. For instance, the descriptor of the sample represented in Figure 1a states that the proper column to cut in order to separate the "e" and the "h" characters is the 52nd.

Figure 2: Samples of, respectively, two (a), three (b), and four (c) machine printed touching character patterns.

This merging is however unsupervised, and so improper combinations happen during the process. We therefore had to filter out the most unrealistic ones. Firstly, we discarded all the samples with a significant difference in their heights. Secondly, we manually removed combinations without touching patterns (i.e., the characters were well separated) or with touching patterns that seemed impossible to occur in the real world. At the end, we kept 153 combinations, of which 139 represent two touching character patterns, while the others are equally divided into three and four touching character patterns. The disproportion arises because touching patterns commonly consist of two characters, while three or more touching characters are rare [50].
Moreover, note that the number of combinations of touching characters contained in our dataset is in line with other similar datasets constructed using the CCC database (e.g., in [18] a dataset of 123 touching characters is used).

The second one, called dataset B, contains images of sided machine printed characters that we have built by means of a second MATLAB script. We have identified a list of font types (namely Cambria, Candara, Georgia, Lucida Sans Regular, Times New Roman and Verdana Bold) and sizes (namely 10, 20 and 25); for each type and size, the MATLAB script combines into images the lower case characters of the alphabet to form two, three, and four touching character patterns. Each image is accompanied by a textual descriptor, which in this case directly indicates the characters represented. One sample of each category of patterns is shown in Figure 2. For instance, the descriptor of the sample represented in Figure 2a states that the image represents the string "ax". Also in this case, we preferred to manually revise the dataset to remove missing or unrealistic touching patterns. At the end, we kept the most promising 189 combinations (of which 168 are composed of two touching characters), in order to define a challenging dataset on which to test our approach.

In the following, we discuss the results of the segmentation of touching characters from dataset B described in the previous section, according to the fuzzy strategy described in Section 2. Fuzzy sets and membership functions related to $\bar{f}$, $\bar{g}$ and $\bar{h}$ are defined according to expert-based choices. Moreover, their construction has been optimized by using the Particle Swarm Optimization (PSO) algorithm [16] in order to improve the overall performance of our fuzzy strategy.
Similarly, the fuzzy rules (described in the following) have been tuned using both heuristic criteria and the PSO algorithm.

Given a matrix $A$ as defined in Section 2, for each column $i$ of $A$, $\bar{f}(i)$, $\bar{g}(i)$ and $\bar{h}(i)$ are evaluated and the degree $\rho(i)$ is provided by the following inference scheme, which includes three inputs (fuzzification of $\bar{f}$, $\bar{g}$, $\bar{h}$) and one output (cutting degree $\rho$). The column $i$ with the lowest value of $\rho$ is considered as the cut column.

The function $\bar{f}$ is fuzzified by defining the following fuzzy sets:
• if $\bar{f}(i) \le 0.35$, then distance from the center of the pattern is Low;
• if $0.[\ldots] \le \bar{f}(i) \le 0.75$, then distance from the center of the pattern is Medium;
• if $\bar{f}(i) \ge 0.5$, then distance from the center of the pattern is High.

For the function $\bar{g}$, we define the following fuzzy sets:
• if $\bar{g}(i) \le 0.4$, then $\bar{g}(i)$ is Low;
• if $0.[\ldots] \le \bar{g}(i) \le 0.5$, then $\bar{g}(i)$ is Medium;
• if $\bar{g}(i) \ge 0.45$, then $\bar{g}(i)$ is High.

The function $\bar{h}$ is fuzzified by means of the following fuzzy sets:
• if $\bar{h}(i) \le 0.4$, then $\bar{h}(i)$ is Low;
• if $0.[\ldots] \le \bar{h}(i) \le 0.75$, then $\bar{h}(i)$ is Medium;
• if $\bar{h}(i) \ge 0.5$, then $\bar{h}(i)$ is High.

Figures 3, 4, and 5 show the membership functions of the previous fuzzy sets. Finally, for the fuzzy output $\rho$ we define the following fuzzy sets, whose membership functions are depicted in Figure 6:
• if $\rho(i) \le 0.5$, then $\rho(i)$ is Low;
• if $0.[\ldots] \le \rho(i) \le 0.6$, then $\rho(i)$ is Medium;
• if $\rho(i) \ge 0.5$, then $\rho(i)$ is High.

The inference system is based on the following rules, which combine the three inputs $\bar{f}(i)$, $\bar{g}(i)$, $\bar{h}(i)$ in order to produce the fuzzy output $\rho(i)$, for each column $i$ of $A$:
1. if $\bar{f}(i)$ is Low and $\bar{h}(i)$ is Low, then $\rho(i)$ is Low;
2. if $\bar{f}(i)$ is Low and $\bar{g}(i)$ is not High and $\bar{h}(i)$ is not Low, then $\rho(i)$ is Low;
3. if $\bar{f}(i)$ is Low and $\bar{g}(i)$ is High and $\bar{h}(i)$ is Medium, then $\rho(i)$ is Medium;
4. if $\bar{f}(i)$ is Medium and $\bar{h}(i)$ is not High, then $\rho(i)$ is Medium;
5. if $\bar{f}(i)$ is Medium and $\bar{g}(i)$ is Low and $\bar{h}(i)$ is High, then $\rho(i)$ is Medium;
6. if $\bar{f}(i)$ is High and $\bar{g}(i)$ is not High and $\bar{h}(i)$ is Low, then $\rho(i)$ is Medium;
7. if $\bar{f}(i)$ is High and $\bar{g}(i)$ is Low and $\bar{h}(i)$ is Medium, then $\rho(i)$ is Medium;
8. if $\bar{f}(i)$ is Low and $\bar{g}(i)$ is High and $\bar{h}(i)$ is High, then $\rho(i)$ is High;
9. if $\bar{f}(i)$ and $\bar{g}(i)$ and $\bar{h}(i)$ are not Low, then $\rho(i)$ is High;
10. if $\bar{f}(i)$ and $\bar{g}(i)$ are High, then $\rho(i)$ is High.

Figure 3: Membership functions of the fuzzy sets related to $\bar{f}$ (for dataset B)
Figure 4: Membership functions of the fuzzy sets related to $\bar{g}$ (for dataset B)
Figure 5: Membership functions of the fuzzy sets related to $\bar{h}$ (for dataset B)
Figure 6: Membership functions of the fuzzy sets related to $\rho$ (for dataset B)
Figure 7: Touching characters "vu" for font Times New Roman and font size of 20
Figure 8: Application of the fuzzy inference system to the column 12 of the pattern "vu"

The touching characters in dataset B are correctly segmented in 96.1% of the cases. To evaluate the correctness of the segmentation, we used a pattern recognition algorithm built on a neural network trained on the characters that compose dataset B, in the fashion of other works reviewed in [53], and obtaining comparable results. We consider touching characters to be correctly segmented when the pattern recognition algorithm correctly recognizes the characters after the segmentation.

Simulations show that the fuzzy combination of the functions $f$, $g$, $h$ improves the correct identification of the cutting column with respect to their separate use. To assess the performance of the fuzzy strategy compared to the use of only the functions $g$ and $h$, a numerical example is reported below. Let us consider the touching characters "vu" (font Times New Roman) depicted in Figure 7. Our fuzzy routine correctly identifies the cutting column as column 12, which is assigned the minimum value of $\rho$ among all the columns of the pattern. Specifically, we obtain $\rho(12) = 0.[\ldots]$, whereas $g$ and $h$ used separately locate column 16 as the cutting column. Indeed, we can observe that, e.g., $h(16) = 8$, which is greater than $h(i)$ for every other column $i \neq 16$; for example, $h(12) = 0$. In Table 1, we report the values of $\bar{f}$, $\bar{g}$, $\bar{h}$, $\rho$ for each column of the previous pattern (except for the first and the last columns, which are certainly not cutting columns).

Table 1: Values of $\bar{f}$, $\bar{g}$, $\bar{h}$, $\rho$ for the columns of the touching characters "vu" (first and last columns excluded)
Column | $\bar{f}$ | $\bar{g}$ | $\bar{h}$ | $\rho$

In the following, we perform segmentation of touching characters from dataset A. Similarly to what was described in the previous section, fuzzy sets, membership functions, and fuzzy rules have been defined according to expert-based choices and further optimized by means of the PSO algorithm. Patterns in dataset A are greatly different from the ones in dataset B. For instance, touching positions in dataset B are often near the center of the pattern.
In the case of dataset A, cutting columns may occur more frequently at a large distance from the center. On the other hand, the peak–to–valley function seems to perform better in the case of dataset A. Taking this into account, the optimization conducted by using the PSO algorithm has been strategic, as it allowed us to highlight properties and connections among the functions $f$, $g$, $h$ which are not noticeable at a glance. All these features are reflected in the following definition of fuzzy sets, membership functions, and fuzzy rules.

The fuzzy sets related to $\bar{f}$ are defined by
• if $\bar{f}(i) \le 0.45$, then distance from the center of the pattern is Low;
• if $0.[\ldots] \le \bar{f}(i) \le 0.55$, then distance from the center of the pattern is Medium;
• if $\bar{f}(i) \ge 0.5$, then distance from the center of the pattern is High.

For the function $\bar{g}$, we define the following fuzzy sets:
• if $\bar{g}(i) \le 0.2$, then $\bar{g}(i)$ is Low;
• if $0.[\ldots] \le \bar{g}(i) \le 0.55$, then $\bar{g}(i)$ is Medium;
• if $\bar{g}(i) \ge 0.25$, then $\bar{g}(i)$ is High.

The fuzzy sets related to $\bar{h}$ are defined by
• if $\bar{h}(i) \le 0.3$, then $\bar{h}(i)$ is Low;
• if $0.[\ldots] \le \bar{h}(i) \le 0.65$, then $\bar{h}(i)$ is Medium;
• if $\bar{h}(i) \ge 0.5$, then $\bar{h}(i)$ is High.

Figures 9, 10, and 11 show the membership functions of the previous fuzzy sets. Finally, for the fuzzy output $\rho$ we define the following fuzzy sets, whose membership functions are depicted in Figure 12:
• if $\rho(i) \le 0.4$, then $\rho(i)$ is Low;
• if $0.[\ldots] \le \rho(i) \le 0.65$, then $\rho(i)$ is Medium;
• if $\rho(i) \ge 0.4$, then $\rho(i)$ is High.

Figure 9: Membership functions of the fuzzy sets related to $\bar{f}$ (for dataset A)
Figure 10: Membership functions of the fuzzy sets related to $\bar{g}$ (for dataset A)
Figure 11: Membership functions of the fuzzy sets related to $\bar{h}$ (for dataset A)
Figure 12: Membership functions of the fuzzy sets related to $\rho$ (for dataset A)

The inference system is based on the following rules, which combine the three inputs $\bar{f}(i)$, $\bar{g}(i)$, $\bar{h}(i)$ in order to produce the fuzzy output $\rho(i)$, for each column $i$ of the matrix of pixels:
1. if $\bar{f}(i)$ is not High and $\bar{g}(i)$ is not High and $\bar{h}(i)$ is Low, then $\rho(i)$ is Low;
2. if $\bar{f}(i)$ is Low and $\bar{g}(i)$ is Low and $\bar{h}(i)$ is Medium, then $\rho(i)$ is Low;
3. if $\bar{f}(i)$ is Low and $\bar{g}(i)$ is High, then $\rho(i)$ is Medium;
4. if $\bar{g}(i)$ is Medium and $\bar{h}(i)$ is Medium, then $\rho(i)$ is Medium;
5. if $\bar{f}(i)$ is High and $\bar{g}(i)$ is Low, then $\rho(i)$ is Medium;
6. if $\bar{f}(i)$ is Medium and $\bar{g}(i)$ is Low and $\bar{h}(i)$ is Medium, then $\rho(i)$ is Medium;
7. if $\bar{f}(i)$ is High and $\bar{g}(i)$ is Medium and $\bar{h}(i)$ is Low, then $\rho(i)$ is Medium;
8. if $\bar{f}(i)$ is Medium and $\bar{g}(i)$ is High, then $\rho(i)$ is High;
9. if $\bar{f}(i)$ is High and $\bar{g}(i)$ is High, then $\rho(i)$ is High;
10. if $\bar{f}(i)$ is High and $\bar{g}(i)$ is Medium and $\bar{h}(i)$ is High, then $\rho(i)$ is High.

The touching characters in dataset A are correctly segmented in 81.1% of the cases. Recall that in dataset A we have stored the textual descriptor indicating the index of the proper cut column. We consider a segmentation correct when the routine locates such a column. Note that the CCC database provides a challenging set of characters for segmentation purposes. The only reference where segmentation of touching characters obtained from the CCC database is performed similarly to this section is [18], whose authors obtained correct segmentation in 76.2% of the cases. These results are not directly comparable and are reported just to give a reference for the quality of our approach, since our dataset A and the dataset used in [18] are different, even if they are obtained starting from the same CCC database. Also, the authors of [18] took into account the percentage of inaccurate segmentation, evaluating when a cutting column is found around the correct position; in this case they report a success percentage of 91.[…]. […] function $\bar{f}$ assumes the lowest value at column 56, function $\bar{g}$ at columns 34, 35, 36, and 37, and function $\bar{h}$ at column 15.

Figure 13: Touching characters "rt", "xm", and "ao" extracted from dataset A.
Figure 14: Cutting positions located by $\rho$ (a), $\bar{g}$ (b), and $\bar{h}$ (c).
Figure 15: Cutting positions located by $\rho$ (a), $\bar{f}$ (b), and $\bar{h}$ (c).

For the touching characters "rt", our fuzzy routine identifies the correct cutting column as column 61, whereas functions $\bar{f}$, $\bar{g}$, and $\bar{h}$ identify the cutting position at columns 51, 61, and 70, respectively.

For the touching characters "xm", our fuzzy routine identifies the correct cutting column as column 75, whereas functions $\bar{f}$, $\bar{g}$, and $\bar{h}$ identify the cutting position at columns 79, 72, and 137, respectively.

Finally, for the touching characters "ao", our fuzzy routine identifies the correct cutting column as column 43, whereas function $\bar{f}$ assumes the lowest value at columns 48 and 49, function $\bar{g}$ at column 38, and function $\bar{h}$ at column 43.

In Figures 14 and 15, we show the cutting columns found by $\rho$, $\bar{g}$, and $\bar{h}$ for the touching characters "eh" and the cutting columns found by $\rho$, $\bar{f}$, $\bar{h}$ for the touching characters "rt", respectively.

For the sake of readability, we do not report the values of $\rho$, $\bar{f}$, $\bar{g}$, $\bar{h}$ for each column, since these patterns usually have more than 100 columns.

A fuzzy approach for segmentation of touching characters has been presented. The proposed method combines three classical features of touching characters usually exploited one at a time. Experiments have been conducted on two very different datasets composed of Latin printed and handwritten characters, respectively. Fuzzy sets, membership functions, and fuzzy rules, which characterize the fuzzy inference scheme, have been properly constructed by means of expert–based choices, heuristic criteria, and the PSO algorithm for each dataset.
Numerical results are encouraging and show that the proposed method has an optimal capability of correctly separating touching characters and may be adjusted for very different varieties of characters (not only for the types considered in our experiments).

Our research activity has shown that even by using the functions f, g, h alone it is sometimes possible to correctly segment touching characters. In fact, some experiments (not presented here) have shown that adding other inputs to the fuzzy routine does not provide significant improvements. This is surely a perspective that should be further investigated and motivated. Indeed, looking at prospective advancements, the following issues could be addressed in future works:
• study and characterization of performance improvements when adding further features as inputs in the fuzzy inference system;
• experiments on further datasets not involving Latin characters;
• experiments on segmentation of formulae, taking also into account the possibility of segmenting touching characters vertically, horizontally and diagonally;
• use of fuzzy models different from the Mamdani one (as, e.g., the Sugeno model).

Acknowledgments
This work has been developed in the framework of an agreement between IRIFOR/UICI (Institute for Research, Education and Rehabilitation/Italian Union for the Blind and Partially Sighted) and Turin University.

This research has been partly supported by the National Thematic Laboratory “AsTech” of CINI.

Special thanks go to Dr. Tiziana Armano and Prof. Anna Capietto for their support to this work.
References

[1] V. Alexandrov, Using critical points in contours for segmentation of touching characters, Proc. of the 5th Int. Conference on Computer Systems and Technologies, 1–5, New York, 2014.
[2] V. Bansal, R. M. K. Sinha, Segmentation of touching and fused Devanagari characters, Pattern Recognition, Vol. , 875–893, 2002.
[3] T. A. Bayer, U. H. G. Krebel, Cut classification for segmentation, IEEE Proc. of 2nd International Conference on Document Analysis and Recognition (ICDAR), 565–568, 1993.
[4] V. Bansal, R. M. K. Sinha, Segmentation of touching and fused Devanagari characters, Pattern Recognition, Vol. , No. , 875–893, 2002.
[5] F. Camastra, M. Spinetti, A. Vinciarelli, Offline cursive character challenge: a new benchmark for machine learning and pattern recognition algorithms, Proc. of the 18th Int. Conference on Pattern Recognition, Vol. , 913–916, 2006.
[6] R. G. Casey, E. Lecolinet, A survey of methods and strategies in character segmentation, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. , No. , 690–706, 1996.
[7] L. A. Fletcher, R. Kasturi, A robust algorithm for text string separation from mixed text-graphics images, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. , No. , 910–918, 2002.
[8] V. Frinken, A. Fischer, R. Manmatha, H. Bunke, A novel word spotting method based on recurrent neural networks, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. , No. , 211–224, 2011.
[9] U. Garain, B. B. Chaudhuri, Segmentation of touching symbols for OCR of printed mathematical expressions: an approach based on multifactorial analysis, IEEE Proc. of 8th International Conference on Document Analysis and Recognition (ICDAR), Vol. , 177–181, 2005.
[10] U. Garain, B. B. Chaudhuri, Segmentation of touching characters in printed Devnagari and Bangla scripts using fuzzy multifactorial analysis, IEEE Trans. on Systems, Man and Cybernetics, Vol. , No. , 449–459, 2002.
[11] S. He, M. Wiering, L. Schomaker, Junction detection in handwritten documents and its application to writer identification, Pattern Recognition, Vol. , 4036–4048, 2015.
[12] J. F. Hebert, M. Parizeau, N. Ghazzali, Learning to segment cursive words using isolated characters, Proc. of Conference on Vision Interface, 33–40, 1999.
[13] M. K. Jasim, A. H. Al–Saleh, A- Aijanaby, A fuzzy based feature extraction approach for handwritten characters, International Journal of Computer Science, Vol. , No. , 208–215, 2013.
[14] M. C. Jung, Y. C. Shin, S. N. Srihari, Machine printed character segmentation method using side profiles, Proceedings of IEEE International Conference on Systems, Man and Cybernetics, New York, 863–867, 1999.
[15] S. Kahan, On the recognition of printed characters of any font and size, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. , No. , 274–288, 1987.
[16] J. Kennedy, R. Eberhart, Particle swarm optimization, IEEE Int. Conference on Neural Networks, Perth, Australia, Vol. IV, 1942–1948, 2012.
[17] K. K. Kim, J. H. Kim, C. Y. Suen, Recognition of unconstrained handwritten numeral strings by composite segmentation method, In Pattern Recognition, 2000. Proceedings. 15th International Conference on, Vol. , 594–597. IEEE, 2000.
[18] F. Kurniawan, M. S. M. Rahim, D. Daman, A. Rehman, D. Mohamad, S. M. Shamsuddin, Region–based touched character segmentation in handwritten words, International Journal of Innovative Computing, Information and Control, Vol. , No. , 3107–3120, 2011.
[19] M. Kumar, M. K. Jindal, R. K. Sharma, Segmentation of isolated and touching characters in offline handwritten Gurmukhi script recognition, Int. J. of Information Technology and Computer Science, Vol. , 58–63, 2014.
[20] H. Lee, B. Verma, Binary segmentation algorithm for English cursive handwriting recognition, Pattern Recognition, Vol. , 1306–1317, 2012.
[21] J. Liang, I. T. Phillips, R. M. Haralick, An optimization methodology for document structure extraction on Latin character documents, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. , No. , 719–734, 2002.
[22] S. Liang, M. Shridhar, M. Ahmadi, Segmentation of touching characters in printed document recognition, Pattern Recognition, Vol. , No. , 825–840, 1994.
[23] G. Louloudis, B. Gatos, I. Pratikakis, C. Halatsis, Text line and word segmentation of handwritten documents, Pattern Recognition, Vol. , 3169–3183, 2009.
[24] Y. Lu, Machine printed character segmentation – an overview, Pattern Recognition, Vol. , No. , 67–80, 1995.
[25] Y. Lu, On the segmentation of touching characters, IEEE Proc. of 2nd International Conference on Document Analysis and Recognition (ICDAR), 440–443, 1993.
[26] Z. Lu, Z. Chi, W. C. Siu, P. Shi, A background–thinning–based approach for separating and recognizing connected handwritten digit strings, Pattern Recognition, Vol. , 921–933, 1999.
[27] Y. Lu, M. Shridhar, Character segmentation in handwritten words – an overview, Pattern Recognition, Vol. , No. , 77–96, 1996.
[28] E. H. Mamdani, S. Assilian, An experiment in linguistic synthesis with a fuzzy logic controller, International Journal of Man–Machine Studies, Vol. , No. , 1–13, 1975.
[29] R. Manmatha, J. L. Rothfeder, A scale space approach for automatically segmenting words from historical handwritten documents, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. , No. , 1212–1225, 2005.
[30] T. Mondal, N. Ragot, J. Y. Ramel, U. Pal, Flexible sequence matching technique: an effective learning–free approach for word spotting, Pattern Recognition, Vol. , 596–612, 2016.
[31] R. Nachar, E. Inaty, P. J. Bonnin, Y. Alayli, Breaking down Captcha using edge corners and fuzzy logic segmentation/recognition technique, Security and Communication Networks, Vol. , 3995–4012, 2015.
[32] S. Naz, H. Majeed, H. Irshad, Image segmentation using fuzzy clustering: a survey, Proc. of 6th International Conference on Emerging Technologies (ICET), 181–186, 2010.
[33] A. Nomura, K. Michishita, S. Uchida, M. Suzuki, Detection and segmentation of touching characters in mathematical expressions, IEEE Proc. of 7th International Conference on Document Analysis and Recognition (ICDAR), Vol. , 126–130, 2003.
[34] N. Otsu, A threshold selection method from gray–level histograms, IEEE Trans. Sys., Man., Cyber., Vol. , No. , 62–66, 1979.
[35] U. Pal, A. Belaïd, C. Choisy, Touching numeral segmentation using water reservoir concept, Pattern Recognition Letters, Vol. , 261–272, 2003.
[36] S. Pravesjit, A. Thammano, Touching character segmentation method of archaic Lanna script, Chapter E–business and Telecommunication, Vol. , Series Communications in Computer and Information Science, 400–408, 2012.
[37] A. Rehman, F. Kurniawan, D. Mohamad, Off–line cursive handwriting segmentation: a heuristic rule–based approach, Journal of Institute of Mathematics and Computer Science, Vol. , No. , 135–140, 2008.
[38] P. P. Roy, U. Pal, J. Lladós, Recognition of multi–oriented touching characters in graphical documents, Proceedings of the IEEE Sixth Indian Conference on Computer Vision, Graphics and Image Processing, 297–304, 2008.
[39] P. P. Roy, U. Pal, J. Lladós, M. Delalandre, Multi-oriented touching text character segmentation in graphical documents using dynamic programming, Pattern Recognition, Vol. , No. , 1972–1983, 2012.
[40] P. P. Roy, U. Pal, J. Lladós, M. Delalandre, Multi–oriented touching character segmentation in graphical documents using dynamic programming, Pattern Recognition, Vol. , 1972–1983, 2012.
[41] T. Saba, G. Sulong, A. Rehman, A survey on methods and strategies on touched characters segmentation, International Journal of Research and Reviews in Computer Science, Vol. , No. , 103–114, 2010.
[42] T. Saba, G. Sulong, A. Rehman, Non–linear segmentation of touched Roman characters based on genetic algorithm, Int. J. on Computer Science and Engineering, Vol. , No. , 2167–2172, 2010.
[43] J. Sadri, M. R. Yeganehzad, J. Saghi, A novel comprehensive database for offline Persian handwriting recognition, Pattern Recognition, Vol. , 378–393, 2016.
[44] R. Sarkar, B. Sen, N. Das, S. Basu, Handwritten Devanagari script segmentation: a non–linear fuzzy approach, Proc. of IEEE Conference on AI Tools and Engineering (ICAITE), 2008.
[45] N. Sharma, P. Shivakumara, U. Pal, M. Blumenstein, C. L. Tan, A new method for character segmentation from multi–oriented video words, IEEE Proc. of 12th International Conference on Document Analysis and Recognition (ICDAR), Vol. , 413–417, 2013.
[46] Z. Shi, V. Govindaraju, Line separation for complex document images using fuzzy runlength, IEEE Proc. of the 1st International Workshop on Document Image Analysis for Libraries, 306–312, 2004.
[47] N. Stamatopoulos, B. Gatos, S. J. Perantonis, A method for combining complementary techniques for document image segmentation, Pattern Recognition, Vol. , 3158–3168, 2009.
[48] O. J. Tobias, R. Seara, Image segmentation by histogram thresholding using fuzzy sets, IEEE Trans. on Image Processing, Vol. , No. , 2002.
[49] H. Lee, B. Verma, Binary segmentation algorithm for English cursive handwriting recognition, Pattern Recognition, Vol. , No. , 1306–1317, 2012.
[50] X. Wei, S. Ma, Y. Jin, Segmentation of connected Chinese characters based on genetic algorithm, IEEE Proc. of 8th International Conference on Document Analysis and Recognition (ICDAR), Vol. , 645–649, 2005.
[51] J. J. Weinman, E. L. Miller, A. R. Hanson, Text recognition using similarity and lexicon with sparse belief propagation, IEEE Pattern Analysis and Machine Intelligence, Vol. , 1733–1746, 2009.
[52] S. Zhao, Z. Chi, P. Shi, H. Yan, Two-stage segmentation of unconstrained handwritten Chinese characters, Pattern Recognition, Vol. , No. , 145–156, 2003.
[53] J. Zhou, A. Krzyzak, C. Y. Suen, Verification–a method of enhancing the recognizers of isolated and touching handwritten numerals, Pattern Recognition, Vol. , No.5