Spectral Graph-based Features for Recognition of Handwritten Characters: A Case Study on Handwritten Devanagari Numerals
Bhat Mohammad Idrees* and B. Sharada
Abstract:
Interpreting different writing styles, unconstrained cursiveness and the relationships between primitive parts is an essential and challenging task in the recognition of handwritten characters. Because conventional feature representation is inadequate for this purpose, an appropriate interpretation/description of handwritten characters remains difficult. Although existing research on handwritten characters is extensive, obtaining an effective representation of characters in feature space is still a challenge. In this paper, we attempt to circumvent these problems by proposing an approach that exploits robust graph representation and spectral graph embedding to characterize and effectively represent handwritten characters, taking into account writing styles, cursiveness and relationships. To corroborate the efficacy of the proposed method, extensive experiments were carried out on the standard handwritten numeral dataset of the CVPR Unit, ISI Kolkata. The experimental results demonstrate promising findings, which can be used in future studies.
Keywords:
Writing styles, unconstrained cursiveness, primitive relationships, feature representation, graph representation, spectral graph embedding.
1 Introduction
Optical Character Recognition (OCR) is concerned with the automatic recognition of scanned and digitized images of text by a computer. These scanned images undergo various manipulations and are then encoded with character codes such as the American Standard Code for Information Interchange (ASCII) or Unicode. An OCR system tries to bridge the communication gap between man and machine and aids office automation, saving a considerable amount of time and human effort. Despite decades of research on different issues related to OCR [1, 2], results on handwritten characters have been less than satisfactory, and their recognition remains an essential and challenging task for the pattern recognition community. This is primarily because of the absence of a fixed structure, the presence of numerous character shapes, cursiveness, and differences in inter- and intra-writer styles. Potential practical applications include the automatic reading of postal codes, bank cheques, employee IDs, data entry forms, zip codes, etc. Thus, recognition of handwritten characters is still an open area of research. In general, these problems are associated with all handwritten documents. In this paper, we consider a case study of handwritten Devanagari numerals because of their importance in the Indian context.
One important question is how to give an adequate representation/description of the underlying object (handwritten character) such that any recognition algorithm can be applied. An object can be represented in two ways, viz. statistical representation and structural representation. In a statistical representation, the character is represented as a feature vector comprising $n$ measurements or values and can be thought of as a point in $n$-dimensional vector space, i.e. $F = (f_1, f_2, \ldots, f_n) \in \mathbb{R}^n$. However, it has two representational limitations. First, the dimension is fixed a priori, i.e. all vectors in a recognition system must have the same length irrespective of the varying size of the underlying objects. Second, feature vectors are inadequate for representing the binary relationships that exist among the primitive parts of the underlying object. Despite these limitations, feature vectors are extensively used because of their flexible and computationally efficient mathematical base. For example, the sum, product, mean, etc., which are basic building blocks of many pattern recognition algorithms, can easily be computed. Structural representation, on the other hand, is based on symbolic data structures, viz. graphs. The aforementioned limitations of feature vectors can be circumvented by graph representation [3, 4]. However, little algebraic support (less mathematical flexibility) and the computationally expensive nature of many algorithms are its major drawbacks. Compared to feature representation, graphs provide a robust representation formalism for describing the two-dimensional nature of handwritten characters, viz. style variance, shape transformations, cursiveness and size variance [4].
In this work, in order to exploit the advantages of both, we give a graph representation to handwritten numerals to capture different writing styles, cursiveness and size variability. Thereafter, the graphs are transformed into vector space using the concept of Spectral Graph Theory (SGT) to characterize the numeral graphs. The rest of the paper is organized into five sections. Section 2 gives a brief review of the literature on handwritten Devanagari numeral recognition. An overview of the definitions and terminology used with respect to graphs and spectral graph theory is given in section 3. Section 4 gives details of the proposed system. The recognition experiment is described in section 5, starting with a description of the dataset and experimental setup, followed by the experimental results, and concluding with a comparison with related work. Finally, future work and conclusions are given in section 6.
2 Related Works
Over the years, an enormous amount of research has been carried out in an attempt to make OCR a reality. Different studies have explored various techniques such as template matching [5], a multi-pass hybrid method [6], syntactic features [7], shadow-based features [8, 9], gradient features [10, 11] and CNN-based features [12], to name just a few. Robust and stable features that are discriminating in feature space are an indispensable component of any recognition system. An essential characteristic of such features is that they should withstand different types of variation (style, size, etc.) and shape transformations, viz. rotation, scale, translation and reflection. The selection and extraction of such features from handwritten characters in the Indian context have been attempted by a number of researchers. In Ref. [13], moment features (left, right, upper and lower profile curves), descriptive component features and density features are combined in a neural-network-based architecture for recognition; the main aim of extracting these types of features is to capture different stylistic variations. In Ref. [14], after a wavelet-based multi-resolution representation is obtained, a numeral is subjected to a multi-stage recognition process. In each stage, a distinct Multi-Layer Perceptron (MLP) classifier either performs recognition or rejects the numeral; recognition of a rejected numeral is then attempted at the next higher level. A fuzzy model-based system is proposed in Ref. [15], in which numerals are represented in the form of an exponential membership function that serves as a fuzzy model. Recognition is later performed by modifying the exponential membership functions fitted to the fuzzy sets, which are extracted from features comprising normalized distances obtained using the Box approach. In Ref. [16], an attempt is made to extract moment-invariant features based on the correlation coefficient, perturbed moments, image partitions and principal component analysis (PCA).
These features are then used with a Gaussian distribution function (GDF) for recognition. In Ref. [17], translation and scale invariance of numerals are achieved by exploiting geometric moments such as Zernike moments; extensive experiments on a large dataset revealed the robustness of the proposed model. Graph representation followed by different graph matching techniques, namely sub-graph isomorphism, maximum common sub-graph and graph edit distance, has been used for holistic recognition of Devanagari words [18], Oriya digits [19] and Devanagari numerals [20], respectively. However, the robustness of the graph representation is overshadowed by the time complexity of these approaches. A novel scheme based on edge histogram features is proposed in Ref. [21], in which scanned numeral images are pre-processed with splines together with PCA in order to improve recognition performance. A local approach is proposed in Ref. [22], which exploits the 16-segment display concept, with features extracted from half-toned binary images of numerals. A novel approach for recognizing handwritten numerals of five Indian sub-continent scripts is proposed in Ref. [23], in which handwritten numerals are characterized by a combination of features such as principal component analysis (PCA)/modular PCA (MPCA) and quadtree-based hierarchically derived longest run (QTLR). The efficacy of that approach is validated by extensive experiments on various datasets, and the results demonstrate a significant improvement in recognition performance. A global approach is proposed in Ref. [24], in which features are extracted from the end-points of numeral images and recognition is then carried out with the neuromagnetic model. A feature-level fusion approach is attempted in Ref. [25], in which global and local features are combined for artificial-neural-network-based recognition.
Several techniques have gained importance owing to their performance, such as chain code features [26], feature sub-selection [24], Zernike moments [27] and structural features [28]. For a comprehensive survey, we refer readers to [29-31].
From the literature survey, we observe that many researchers have addressed the problem of handwritten Devanagari numeral recognition by pursuing separate objectives (shape transformations, style variations, etc.). However, no attempts have been made to address the problem as a whole. Because people write numerals in different styles, and style varies even within a single writer, handwritten numeral recognition is difficult and challenging. Thus, there is scope for further attempts in this direction. The reported works also clearly indicate that attempts have been made only with feature representation. However, as stated earlier, feature representation has two limitations, namely the size constraint and the inability to represent binary relationships. These two limitations are severe when representing the inherently two-dimensional nature of handwriting. Given this observation, if these two limitations can be removed from recognition systems, higher and more reliable recognition accuracies can be expected. Hence, there is scope to devise a model that circumvents the stated limitations by providing a robust alternative representation. From such a representation we expect that, besides representing object properties, the inherent two-dimensional information is adequately modelled and binary relationships are preserved. Graph representation models the dependencies and relations among different primitive parts (by edges) and also describes object properties. Furthermore, it is flexible in representing the different sizes of individual objects in an application and is invariant to shape transformations (scale, rotation, translation, reflection and mirror image) as well [32].
These characteristics of graphs are extremely beneficial for coping with different writing styles and cursiveness. From the survey of different applications such as image classification [33], image segmentation [34], synthetic graph classification [35] and many more, we also observe that Spectral Graph Theory (SGT) is effective for characterizing the graphs under consideration. SGT is a branch of mathematics that is primarily concerned with describing the structural properties of graphs by extracting the eigenvalues of different graph-associated matrices. The eigenvalues form the spectrum of the graph and exhibit interesting properties that can be exploited for recognition purposes. To enhance recognition performance, classifier fusion at decision level is also utilized. The CVPR Unit, ISI Kolkata dataset is employed because of its popularity, availability and complexity. The recognition results are lower than the best result reported in [36].
However, the main aim was not to outperform it but to circumvent the stated limitations by giving a graph representation and to observe the results.
Fig. 1: Illustration of numeral images with several intra-class variations with respect to size and style.
3 Required Graph Terminologies
Brief and concise illustrations are given for various terminologies used in this study vis-a-vis graph theory and spectral graph theory (SGT). However, for comprehensive reading, we refer readers to [37-40].
Definition 1 (Graph). A graph is a four-tuple $G = (V, E, \mu, \nu)$, where $V$ is the set of vertices (or nodes), whose cardinality is the order of the graph; $E \subseteq V \times V$ is the set of edges, whose cardinality is the size of the graph; $\mu : V \to l_v$ is a function associating a label $l_v$ with each vertex in $V$; and $\nu : E \to l_e$ is a function associating a label $l_e$ with each edge in $E$.
A directed graph or digraph $G$ is a graph in which all edges $e$ in $E$ are directed from one vertex to another, i.e., edges are ordered pairs of vertices in $V$. An undirected graph $G$ is a graph in which all edges $e$ in $E$ are bidirectional, i.e., edges are unordered pairs of vertices in $V$. A weighted graph $G$ is a graph in which each edge $e$ in $E$ is assigned a numerical weight by some weighting function $w(e_i)$. Mainly non-negative numeric values are used (called the cost of the edges). One such weighting function $w(e_i)$ is the length of the edge $e$ in $E$. The degree of a vertex $v$ in $G$, denoted by $d(v)$, is the total number of vertices that are adjacent to it.
Several matrices associated with graphs are important, such as the adjacency matrix and the Laplacian matrix. For a graph $G$ with $|V|$ vertices, the adjacency matrix $A(G)$ is a $|V| \times |V|$ matrix in which each entry $a_{ij}$ is 1 if the vertices $\{v_i, v_j\}$ in $V$ are adjacent and 0 otherwise. The Laplacian matrix $L(G)$ of graph $G$ is defined as $L(G) = D(G) - A(G)$, where $D(G)$ and $A(G)$ are the degree matrix and the adjacency matrix of $G$. Each entry $l_{ij}$ of $L(G)$ is $\deg(v_i)$ if $i = j$, $-1$ if $v_i$ and $v_j$ are adjacent ($i \neq j$), and 0 otherwise. The weighted adjacency matrix $WA(G)$ is constructed by replacing all entries of $A(G)$ for which $a_{ij} = 1$ with the respective weights assigned by a weighting function $w(\{v_i, v_j\})$. The weighted Laplacian matrix is $WL(G) = D(G) - WA(G)$, where $D(G)$ is the degree matrix. Each entry $l_{ij}$ of $WL(G)$ is $\deg(v_i)$ if $i = j$, the negative of the weight assigned by the weighting function if $v_i$ and $v_j$ are adjacent ($i \neq j$), and 0 otherwise. The distance matrix $Dist(G)$ of a graph $G$ is a $|V| \times |V|$ matrix that contains the pairwise distances (provided by a weighting function $w(e_i)$) between all vertices $v$ in $V$, i.e., distances are included even for non-adjacent nodes.
Despite the robust structural representation formalism they offer, graph-based methods in pattern recognition (such as graph matching) have, as stated earlier, major limitations. These include the computationally expensive nature of the algorithms and the scarcity of algebraic properties (basic operations required in many pattern recognition algorithms, such as the sum, mean and product, are not defined in a standard way). To overcome these limitations, graphs are transformed into a low-dimensional vector space; such a technique is called graph embedding, $\varphi : G \to \mathbb{R}^n$. One such technique is Spectral Graph Embedding (SGE), in which graphs are transformed into vector space by means of the spectrum of the graph. The spectrum of a graph $G$ (where $G$ can be represented by any graph-associated matrix $M$; in this study $WA(G)$, $WL(G)$ and $Dist(G)$) is the set of eigenvalues together with their algebraic multiplicities (the number of times they occur). The representation of any graph-associated matrix in terms of its eigenvalues and eigenvectors is called its eigendecomposition or spectral decomposition. For illustration, let $G(5,7)$ be a graph in which each edge $e$ is weighted (labelled) arbitrarily; the desired matrices can then be extracted as shown in Fig. 2. There is a subtle difference between the label and the weight of a graph in general; in this study, however, label and weight refer to the same thing and are used interchangeably.
Fig. 2: Weighted graph $G(5,7)$ (order $|V| = 5$ and size $|E| = 7$, labelled arbitrarily) and its associated weighted adjacency matrix $WA(G)$, degree matrix $D(G)$ and weighted Laplacian matrix $WL(G)$, respectively, where $WL(G) = D(G) - WA(G)$.
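The matrix definitions of this section can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `graph_matrices`, the 0-based vertex indices and the example edge weights are assumptions made here, and the example graph is not the graph of Fig. 2. Following the text, the degree matrix counts adjacent vertices.

```python
import numpy as np

def graph_matrices(n_vertices, weighted_edges):
    """Build A(G), D(G), WA(G), L(G) and WL(G) for an undirected weighted graph.

    weighted_edges: iterable of (i, j, w) with 0-based vertex indices.
    """
    A = np.zeros((n_vertices, n_vertices))
    WA = np.zeros((n_vertices, n_vertices))
    for i, j, w in weighted_edges:
        A[i, j] = A[j, i] = 1.0    # 1 where two vertices are adjacent
        WA[i, j] = WA[j, i] = w    # the same entry, replaced by the edge weight
    D = np.diag(A.sum(axis=1))     # degree = number of adjacent vertices
    L = D - A                      # Laplacian matrix
    WL = D - WA                    # weighted Laplacian as defined in the text
    return A, D, WA, L, WL

# Hypothetical graph of order 5 and size 7 with arbitrary weights.
edges = [(0, 1, 3.0), (0, 2, 5.0), (1, 2, 2.0), (1, 3, 4.0),
         (2, 3, 6.0), (2, 4, 1.0), (3, 4, 7.0)]
A, D, WA, L, WL = graph_matrices(5, edges)
print(np.diag(D))   # vertex degrees
```

All five matrices are symmetric here because the graph is undirected, which is what later guarantees real eigenvalues.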
4 Proposed Model
The various steps involved in the proposed handwritten Devanagari numeral recognition model are shown in Fig. 3. These steps are explained in the following subsections.
Fig. 3: Process of extraction of sorted spectra.
Image Pre-processing
Image pre-processing deals with reducing the variations in scanned images of handwritten numerals that are caused by noise. In this study, the scanned numeral images are first filtered by Difference of Gaussian (DoG) filtering; second, normalization is applied to handle variability in size; third, the numeral images are binarized. Finally, the numeral images are skeletonised by a thinning operator [41].
Graph Representation
Various graph representations exist [32]; we selected the interest point graph representation because it preserves the inherent structural characteristics of numeral images. It identifies the points in an image where the signal information is rich, such as junction points, start and end points, and corner points of the circular primitive parts of numerals. Fig. 4 shows some extracted sample numeral graphs and the interest points in each numeral graph.
Fig. 4: Snapshot of underlying graphs obtained from handwritten Devanagari numerals (0-9) with interest points.
Feature Extraction
Weighted graphs carry more discriminating information than unweighted ones, such as the stretching of the graph [32].
To give weights to the numeral graphs, edges are labelled with the most well-known and intuitive weighting function, $w : E(G) \to \mathbb{R}$, which assigns the Euclidean distance to each edge in $G$. The Euclidean distance is computed from the respective 2D coordinates of the nodes incident with each edge $e$ in $E$ (shown in Fig. 5a). The motivation for using such a weighting function is twofold: first, it is computationally simple; second, the distance between any two objects (in this study, nodes) remains unaffected by the inclusion of more objects (nodes) in the analysis [42]. There is, however, an arsenal of weighting functions described in the literature [43], and any one of them can be used.
As stated earlier, spectral graph embedding is described in terms of matrices associated with graphs. Selecting and extracting matrices that preserve the underlying structure or topology of the numeral graphs is indispensable. With this in mind, we selected the weighted adjacency matrix $WA(G)$, the weighted Laplacian matrix $WL(G)$ and the distance matrix $Dist(G)$. These matrices exhibit different topological information (global or local) about graphs, which can be crucial for the characterisation of numeral graphs. The weighted adjacency matrix consists of the lengths of the edges and is unique for each graph (up to permutation of rows and columns), which leads to isomorphism invariance of graphs. The number of connected components and the number of spanning trees of a given graph are given by the Laplacian matrix. The number of spanning trees $t(G)$ of a connected graph is a well-known invariant and leads to many more discriminating properties of the graph. The distance matrix gives the mutual pair-wise distance between all nodes; the matrix thus formed differs for graphs of equal order [44-47].
Matrix decomposition then represents these matrices in terms of their eigenvalues (with their multiplicities), called the spectral decomposition or eigendecomposition of graphs. Let $M$ be some matrix representation of graph $G$ ($WA(G)$, $WL(G)$ or $Dist(G)$); then the spectral decomposition (or eigendecomposition) is $M = \Phi \Lambda \Phi^T$, where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{|V|})$ is the diagonal matrix of ordered eigenvalues and $\Phi = (\phi_1, \phi_2, \ldots, \phi_{|V|})$ is the matrix whose columns are the correspondingly ordered eigenvectors of $M$. The spectrum of $M$ is then the set of eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_{|V|}\}$. For each eigenvalue in $\{\lambda_1, \lambda_2, \ldots, \lambda_{|V|}\}$ and the corresponding eigenvector in $(\phi_1, \phi_2, \ldots, \phi_{|V|})$, equation (1) holds:
$(M - \lambda I)\phi = 0$ (1)
The advantage of using the spectrum to characterise a graph is that the eigendecomposition of the various matrices associated with graphs can be computed quickly (computing a spectrum from a matrix requires $O(n^3)$ operations, where $n$ is the order of the graph). Furthermore, the spectral parameters of a graph specify various discriminating properties that are otherwise costly, in some cases exponentially so, to compute (chromatic number, sub-graph isomorphism, perturbation of the graph, number of paths of length $K$ between two nodes, number of connected components in a graph, etc.). Thus, exploiting the spectrum for graph characterisation is clearly beneficial.
Fig. 5: Illustration of assigning weights to numeral graphs: a) each node labelled with its 2D coordinates; b) each edge in the numeral graph labelled (weighted) with the Euclidean distance between the two adjacent nodes.
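The weighting and decomposition steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `spectral_features`, the dictionary keys and the four-node example are hypothetical, and NumPy's `eigvalsh` (for symmetric matrices) is a tool chosen here, not one named in the paper. The degree matrix again counts adjacent vertices, as in section 3.

```python
import numpy as np

def spectral_features(coords, edges):
    """Sorted spectra of WA(G), WL(G) and Dist(G) for one numeral graph.

    coords: list of (x, y) node coordinates; edges: list of (i, j) index pairs.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Distance matrix: Euclidean distance between every pair of nodes,
    # adjacent or not.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    WA = A * dist                  # edge weight = Euclidean length of the edge
    D = np.diag(A.sum(axis=1))     # degree matrix
    WL = D - WA                    # weighted Laplacian
    # eigvalsh returns real eigenvalues in ascending order; reverse them
    # to obtain spectra sorted in descending order.
    return {name: np.linalg.eigvalsh(M)[::-1]
            for name, M in (("WA", WA), ("WL", WL), ("Dist", dist))}

# Hypothetical four-node graph: a 3-by-4 rectangle traced as a cycle.
spectra = spectral_features([(0, 0), (3, 0), (3, 4), (0, 4)],
                            [(0, 1), (1, 2), (2, 3), (3, 0)])
for name, values in spectra.items():
    print(name, np.round(values, 4))
```

Each spectrum has as many entries as the graph has nodes, so graphs of different order yield vectors of different length.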
For an illustration of eigendecomposition, let $M = WA(G)$ be the weighted adjacency matrix of the graph $G$ described in section 3. Equation (1) can also be written as
$M\phi = \lambda I\phi \;\Rightarrow\; M\phi - \lambda I\phi = 0 \;\Rightarrow\; \det(M - \lambda I) = 0$ (II)
where $I$ is the identity matrix and $\phi$ is a special vector (an eigenvector) that lies in the same direction as $M\phi$. After multiplication by $M$, the vector $M\phi$ is a number $\lambda$ times the original $\phi$; this number is called an eigenvalue of $M$. In other words, under the linear transformation $M$, the eigenvalue $\lambda$ describes how much the vector $\phi$ is elongated or shrunk, reversed or left unchanged. The eigendecomposition of the weighted adjacency matrix $WA(G)$ can be carried out as follows: applying (II) to $WA(G)$ and solving the resulting equation,
$\lambda^5 - 140\lambda^3 - 378\lambda^2 + 1445\lambda - 344 = 0$,
we arrive at the ordered (dominant) eigenvalues $(12.6880, 1.9669, 0.2570, -6.0595, -8.8523)$.
Similarly, eigendecomposition is carried out for the weighted Laplacian matrix $WL(G)$ and the distance matrix $Dist(G)$. We thereby arrive at feature matrices consisting of the ordered (dominant) eigenvalues (spectra) of $WA(G)$, $WL(G)$ and $Dist(G)$, respectively. These features (spectra) are first inspected individually for their characterisation potential and are later fused at decision level (classifier-level fusion) to characterise the numeral graphs.
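The worked example for $WA(G)$ can be checked numerically. The sketch below assumes that the characteristic equation reads $\lambda^5 - 140\lambda^3 - 378\lambda^2 + 1445\lambda - 344 = 0$ and that the last two eigenvalues are negative; these signs are an inference made here from the fact that a weighted adjacency matrix has zero trace, so its eigenvalues must sum to zero.

```python
import numpy as np

# Coefficients of lambda^5 + 0*lambda^4 - 140*lambda^3 - 378*lambda^2
# + 1445*lambda - 344 (signs are an assumption, see the lead-in).
coeffs = [1, 0, -140, -378, 1445, -344]

# Roots of the characteristic polynomial, sorted in descending order.
roots = np.sort(np.real(np.roots(coeffs)))[::-1]
print(np.round(roots, 4))
print("sum of eigenvalues:", round(float(roots.sum()), 6))
```

The roots agree with the eigenvalues quoted in the text to within rounding, which supports the assumed signs.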
Adequacy of the Features
A spectrum inherits different properties (global and local) from its graph-associated matrix, which makes spectra ideal candidates for recognition purposes; a thorough study can be found in [48-51]. The few important properties that concern this study are as follows:
– The spectrum is real if the associated graph matrix is real and symmetric. Hence, spectral decomposition maps graphs into a coordinate system, after which any clustering or classification procedure can be used.
– The spectrum is invariant with respect to the labelling of the graph (isomorphic graphs) if it is sorted in either ascending or descending order, because swapping two rows together with the corresponding columns has no effect on the eigenvalues. Therefore, different orderings of the vertices have no influence.
– Since each eigenvalue contains information about all nodes in a graph, it is possible to use only a certain subset of them; it is not mandatory to use all eigenvalues. Imbalanced (short) spectra can be balanced by padding with zeros.
– For a disconnected graph $G$, the spectrum is the union of the spectra of the different components of $G$.
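The properties listed above are easy to observe numerically. The following sketch uses a random symmetric matrix as a hypothetical stand-in for a graph-associated matrix; the helper `sorted_spectrum` and its parameters `n_largest` and `pad_to` are names introduced here for illustration only.

```python
import numpy as np

def sorted_spectrum(M, n_largest=None, pad_to=None):
    """Descending spectrum of a real symmetric matrix, optionally
    truncated to the n largest eigenvalues and zero-padded."""
    vals = np.linalg.eigvalsh(M)[::-1]          # real, descending
    if n_largest is not None:
        vals = vals[:n_largest]
    if pad_to is not None and len(vals) < pad_to:
        vals = np.pad(vals, (0, pad_to - len(vals)))   # pad with zeros
    return vals

rng = np.random.default_rng(0)
W = np.triu(rng.random((5, 5)), 1)
W = W + W.T                                     # symmetric, zero diagonal

# Relabelling the vertices permutes rows and columns together.
perm = rng.permutation(5)
W_perm = W[np.ix_(perm, perm)]
same = np.allclose(sorted_spectrum(W), sorted_spectrum(W_perm))

# A disconnected graph: W plus a separate two-node component.
B = np.array([[0.0, 2.0], [2.0, 0.0]])
G = np.zeros((7, 7))
G[:5, :5] = W
G[5:, 5:] = B
union = np.sort(np.concatenate([np.linalg.eigvalsh(W),
                                np.linalg.eigvalsh(B)]))[::-1]

padded = sorted_spectrum(W, n_largest=3, pad_to=6)
print(same, np.allclose(sorted_spectrum(G), union), padded.shape)
```

Truncation and padding together give every graph a feature vector of the same length, which is what a vector-space classifier requires.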
5 Experimentation
For experimentation, we used the isolated handwritten Devanagari numeral dataset from the Computer Vision and Pattern Recognition Unit of the Indian Statistical Institute, Kolkata (CVPR Unit, ISI Kolkata). It consists of 22,556 samples written by 1049 persons, collected from 368 mail pieces, 274 job application forms and specially designed forms. The numerals in the dataset vary in writing style, size and stroke width. The dataset also contains certain samples that cannot be recognized even by humans. We divided the entire dataset of labelled numeral images into three disjoint sets, viz. a training, a validation and a test set. The validation set is used to tune/optimise the meta-parameters of the classifier and of the proposed method. Although the original dataset is divided only into training and test sets, its authors state in [52] that, depending upon the requirement, the dataset can be partitioned into training, validation and test sets. Hence, we divided the dataset according to two standard ratios of training, validation and test sets, 60:20:20 and 50:25:25 [53]. Fig. 1 shows some numeral samples from the dataset. The complete description of the dataset can be found in [54].
Owing to their robustness, which has been validated in numerous fields of pattern recognition, we employed multi-class Support Vector Machines (SVMs) with a Gaussian kernel (also called the radial basis function, RBF, kernel) [56,57]. There are two possible ways of performing classification with a multi-class SVM: the One-vs.-One classifier (1V1) and the One-vs.-All classifier (1VA). We used the One-vs.-One method, as it is insensitive to imbalanced datasets. In this method, training is done with all pairs of two-class SVMs for an $n$-class problem, which is also called pair-wise decomposition. All $n(n-1)/2$ possible pair-wise classifiers are evaluated, and the decision for an unseen observation is made by majority vote. During training, an RBF-based SVM has two meta-parameters that must be optimised empirically on the dataset, namely $C$ and $\gamma$, which represent the classification cost and the non-linear function, respectively. To arrive at optimised parameters, the values of $C$ and $\gamma$ are varied on a logarithmic scale (i.e., 0.001, 0.01, ...). Each SVM is trained for every possible pair ($C$, $\gamma$) on the training set, and the recognition accuracy is tested on the validation set. The values leading to the best recognition accuracy are then used with the independent test set (Table 1). Each spectrum (the spectra of $WA(G)$, $WL(G)$ and $Dist(G)$) is investigated individually for its recognition potential. From now on, we refer to the spectra of $WA(G)$, $WL(G)$ and $Dist(G)$ as feature types $FT_1$, $FT_2$ and $FT_3$, respectively. The individual recognition results from each feature type are then compared. To improve on the accuracy of the individual classifiers, a multi-classifier system (MCS) [57], or classifier fusion, is employed. Classifier fusion combines the results of the classifiers by using one of various combining strategies; we used Bayesian fusion (described in subsection 5.2). It is worth underlining that in an MCS the individual classifiers should be accurate and diverse [57].
As stated earlier, the accuracy of SVMs has been experimentally validated in a number of practical recognition problems. Diversity means that each classifier should make different errors, or that their decision boundaries should be different. In this study, diversity is achieved by using the different feature types of the numeral graphs (as discussed in sub-section 4.3).
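The pair-wise decomposition and the parameter search described above can be outlined with the standard library alone. In this sketch the pairwise "classifiers" are hypothetical stand-ins for trained two-class SVMs (each simply picks the class index nearer to a scalar input), and the search grid from $10^{-3}$ to $10^{2}$ is an assumed range, since the text specifies only that the scale is logarithmic.

```python
from collections import Counter
from itertools import combinations, product

def one_vs_one_predict(x, classes, pairwise_classifiers):
    """Majority vote over all n(n-1)/2 two-class decisions."""
    votes = Counter(pairwise_classifiers[(a, b)](x)
                    for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]

def nearest_of(a, b):
    # Hypothetical stand-in for a trained two-class SVM on classes a and b.
    return lambda x: a if abs(x - a) <= abs(x - b) else b

classes = list(range(10))                       # ten numeral classes
pairwise = {(a, b): nearest_of(a, b) for a, b in combinations(classes, 2)}
print(len(pairwise))                            # number of pairwise classifiers
print(one_vs_one_predict(6.8, classes, pairwise))

# Candidate (C, gamma) pairs on a logarithmic scale (assumed range).
scale = [10.0 ** k for k in range(-3, 3)]
grid = list(product(scale, repeat=2))
print(len(grid))
```

In a real run, each grid pair would be used to train the SVMs on the training set and would be scored on the validation set, with the best pair kept for the test set.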
We used Bayesian combination rule (also known as Bayesian Belief Integration) as a combined technique. It is based on the concept of conditional probability. To compute the conditional probabilities of each classifier for all classes, confusion matrix has to be calculated first. Let l C be the confusion matrix for each classifier l e with
1, , , l L where L is the total number of classifiers used (in this study L = 3).
11 12 121 22 231 31 31 2 .........
NNl NN N NN
C C CC C CC C C CC C C (2) Where, , 1, , i j N , N is the number of classes, , i j C in l C is the total number of samples in which classifier l e predicted class label j whereas actual label was i . By using information present in confusion matrix the probability, that the test sample ‘ x ’ corresponds to class ‘ i ’ if the classifier l e predicts class j can be calculated as follows:- , ,1 ( | ( ) ) li ji j l N li ji CP P x i e x j C (3) Probability matrix l P for each classifier l e is:-
11 12 121 22 231 32 31 2 ............
NNl NN N NN
P P PP P PP P P PP P P (4) Based on l P for each classifier a combined estimate value, ( ) b i for each class ‘ i ’, is calculated for each sample ‘ x ’ in test set. ,1 ,1 1 ( ) L i jll LN i jli l
Pb i P (5) For a test sample, ‘x’ classifier l e predicts class label l j . To make a decision for one of the class maximum values in b(i) is used. Several experiments were carried out for all three feature types ( , , )
FT FT and FT and subsequently repeated for random trials of training, validation and testing in the ratios of
60: 20: 20 and
50: 25: 25 respectively. In each trial, the performance of the proposed method is assessed by the recognition rate in terms of F measure and the average F measure is computed from all trials. Table 1 gives the class wise performance in-terms of F measure (for both the ratios belonging to all the feature types) and also presents validated meta-values for RBF kernel. Fig.6 shows confusion matrices obtained for optimised parameters of the classifier (for each feature type ( , , ) FT FT and FT re-spectively). The performance of any recognition method is assessed in terms of precision, recall, and F measure described as follows: Precision = ' CPCP FP (6) Recall = CPCP FN (7) F-measure = (2* Pr * Re )(Pr Re ) ecision callecision call (8) Measures Precision, Recall and F measure are based on correct positive, false negative, false positive, and correct negative for overall samples of the test set. Table 2 presents the average F measure computed from all trails. Individually these feature types ( , , ) FT FT and FT generate
75 85 % average recognition rate. Since FT comprises of all the pair-wise distances, the shape of the numeral graph is not preserved. Numeral graphs with an equal number of vertices | | V , are only dis-tinct in pair-wise distances of the vertices but equal in a number of non-zero entries. Perhaps, this could be the reason for its ( ) FT lowest recognition result (
75 76 % ). FT & FT preserve the exact shape of the numeral graphs such as the presence of edges and also their weights hence they generate over average recognition rates. Since each graph associated matrices contain non-overlapping information therefore by combining the classifiers at decision level greater recognition rates can be achieved. With classifier fusion at decision level, we achieved maximum average recognition rate (fusion is carried out individually for each trial and then average recognition accuracy is recorded) shown in Table 3. Therefore, by decision fusion at classifier level recognition rate is increased ( , , ) FT FT and FT by . The numerals which have same underlying graph structure (more or less) build the misclassified pairs such as Devanagari zero and Devanagari one (as can be observed from Table 1 and confusion ma-trices). Furthermore, invariance property of spectrum also adds to the confusion. It can be understood by observing the shape of the Devanagari numeral three and Devanagari numeral six (as shown in Fig.2, just mirror images of each other). Since we sorted spectrum, therefore, their spectra are more or less equal. In consideration of these facts, recog-nition performance is encouraging. It should be noted that each spectrum was sorted in descending order. In order to choose ' ' n largest eigenvalues for each feature type FT , FT and FT , we have conducted experiments for various values ' ' n on the validation set. We observe that only small value of has significant development ( 3) n . But when we increase the value of ' ' n we have not observed much significant development in recognition performance. Thus, in experimentation, we have con-sidered the value of ' ' n equal to for every feature type ( , , ) FT FT and FT respectively. The results obtained after fusion with varying ' ' n are shown in Table 4. Fig. 6:
Confusion matrices for each feature type (FT1, FT2 and FT3) for both divisions, respectively.

Table 1: Class-wise performance of all feature types (FT1: C = 0.125, γ = 0.001; FT2: C = 0.031, γ = 0.0004; FT3: C = 0.001, γ = 0.004).

              Training : Validation : Testing
              60:20:20               50:25:25
Class index   FT1    FT2    FT3      FT1    FT2    FT3
1             0.90   0.93   0.79     0.89   0.92   0.96
2             0.92   0.94   0.77     0.87   0.93   0.93
3             0.78   0.72   0.73     0.72   0.71   0.72
4             0.69   0.85   0.93     0.68   0.84   0.92
5             0.81   0.67   0.96     0.80   0.66   0.95
6             0.75   0.74   0.75     0.74   0.72   0.74
7             0.68   0.81   0.80     0.67   0.80   0.79
8             0.88   0.85   0.94     0.87   0.84   0.93
9             0.65   0.77   0.69     0.64   0.76   0.68
10            0.61   0.62   0.88     0.60   0.61   0.87

Note: FT1 = feature type one, the sorted spectrum of the weighted adjacency matrix; FT2 = feature type two, the sorted spectrum of the weighted Laplacian matrix; FT3 = feature type three, the sorted spectrum of the distance matrix. The values of C and γ are the validated meta-parameters of the RBF-kernel SVM for FT1, FT2 and FT3, respectively.
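The three feature types described in the note to Table 1 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the input format (2-D vertex coordinates plus an edge list), the use of Euclidean edge weights, and the Laplacian convention L = D - A are all assumptions made for the example.

```python
import numpy as np

def top_n_spectrum(matrix, n):
    """Return the n largest eigenvalues of a symmetric matrix, in descending order."""
    eigenvalues = np.linalg.eigvalsh(matrix)  # ascending order for symmetric input
    return eigenvalues[::-1][:n]

def spectral_features(points, edges, n=3):
    """Compute (FT1, FT2, FT3) for one numeral graph.

    points: (V, 2) vertex coordinates (hypothetical input format)
    edges:  iterable of (i, j) vertex index pairs
    """
    points = np.asarray(points, dtype=float)
    v = len(points)
    # Distance matrix: Euclidean distance between every pair of vertices.
    diff = points[:, None, :] - points[None, :, :]
    distance = np.sqrt((diff ** 2).sum(axis=-1))
    # Weighted adjacency matrix: edge length on existing edges, zero elsewhere.
    adjacency = np.zeros((v, v))
    for i, j in edges:
        adjacency[i, j] = adjacency[j, i] = distance[i, j]
    # Weighted Laplacian, assuming the convention L = D - A.
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    ft1 = top_n_spectrum(adjacency, n)
    ft2 = top_n_spectrum(laplacian, n)
    ft3 = top_n_spectrum(distance, n)
    return ft1, ft2, ft3

# Toy graph: three collinear vertices joined as a path.
ft1, ft2, ft3 = spectral_features([[0, 0], [1, 0], [2, 0]], [(0, 1), (1, 2)], n=3)
print(ft1, ft2, ft3)
```

Each returned vector has a fixed length n regardless of the number of vertices (provided the graph has at least n vertices), which is what allows a standard classifier such as an SVM to be trained on graphs of different orders.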
Table 2: Overall average recognition performance (in terms of F-measure) for both ratios.

Dataset            Feature type   Training : Validation : Testing   Overall recognition rate
CVPR-ISI Kolkata   FT1            60:20:20
                                  50:25:25
                   FT2            60:20:20
                                  50:25:25
                   FT3            60:20:20
                                  50:25:25

Table 3: Average recognition rate.

Dataset            Training : Validation : Testing   Average recognition rate (F-measure)
CVPR-ISI Kolkata   60:20:20
                   50:25:25

Table 4: Empirical evaluation of the n largest eigenvalues.

Training : Validation : Testing   Largest eigenvalues   Recognition accuracy (F-measure)
60:20:20
50:25:25

We compared our model with the work in which graph representation is utilized on the same dataset. The authors of [58] report recognition accuracy (in terms of F-measure) obtained with graph representation and Lipschitz embedding. Lipschitz embedding transforms a graph into n distances to n previously set-aside reference sets, each containing m graphs, as shown in Fig. 7. Each d_i in the feature vector F = (d_1, d_2, ..., d_n) is obtained by taking the minimum distance between the input graph g and the graphs in the i-th reference set, i.e., d_i = min{GED(g, R_1), GED(g, R_2), ..., GED(g, R_m)}, where R_1, R_2, ..., R_m are the individual graphs belonging to that reference set and GED denotes the graph edit distance.
Fig. 7: Illustration of the compared model.
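The embedding used by the compared model can be illustrated with a short sketch. It is written under explicit assumptions: the distance function is passed in as a callable, and the toy distance below (absolute difference between numbers) merely stands in for graph edit distance, which is not implemented here. The names `lipschitz_embedding` and `toy_distance` are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

G = TypeVar("G")

def lipschitz_embedding(
    g: G,
    reference_sets: Sequence[Sequence[G]],
    distance: Callable[[G, G], float],
) -> List[float]:
    """Map g to a vector with one coordinate per reference set.

    Coordinate i is the minimum distance from g to any member of
    reference set i.
    """
    return [min(distance(g, r) for r in ref_set) for ref_set in reference_sets]

def toy_distance(a: float, b: float) -> float:
    """Stand-in for graph edit distance, used only to make the sketch runnable."""
    return abs(a - b)

# Three reference sets of two "graphs" each give a 3-dimensional embedding.
vector = lipschitz_embedding(5, [[1, 4], [10, 8], [5, 20]], toy_distance)
print(vector)  # [1, 3, 0]
```

The sketch makes the cost structure visible: embedding a single input requires n × m distance evaluations, each of which is a full graph matching when the distance is graph edit distance.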
Consequently, a graph g is mapped to the n-dimensional vector space R^n by computing the Graph Edit Distance (GED) from g to each of the n reference sets (each containing m graphs). However, transforming numeral graphs into vector spaces by computing dissimilarities to n carefully selected reference sets of m graphs each is time-consuming. The input graph g is matched with every single graph in each reference set, which requires time cubic in the order of the graph (and is thus inappropriate for graphs of large order). Furthermore, GED depends on the optimization of various factors, viz. the insertion, deletion and substitution costs of nodes and edges. Recognition performance is greatly influenced by the number and dimension of the reference sets. Moreover, the type of graphs selected from the dataset for each reference set also has a great impact on recognition performance. Although the discussed model achieves higher recognition performance, our model has the advantage that it transforms the numeral graph into a vector space by eigendecomposition (using the spectrum as the feature vector) and thereby avoids computationally expensive graph matching. Furthermore, most misclassifications in our model are due to the invariance property of the spectrum. Thus, the efficacy of the proposed method can be justified. Since our model is based on graph representation, it is not directly comparable with conventional feature representation models.
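The invariance property mentioned above can be checked numerically. The sketch below is an illustration under assumptions rather than the authors' code: the helper names are hypothetical, edge weights are taken to be Euclidean lengths, and the toy graph is arbitrary. It reflects a graph about the vertical axis, relabels its vertices, and compares the sorted spectra of the two weighted adjacency matrices.

```python
import numpy as np

def weighted_adjacency(points, edges):
    """Weighted adjacency matrix with Euclidean edge lengths as weights."""
    points = np.asarray(points, dtype=float)
    a = np.zeros((len(points), len(points)))
    for i, j in edges:
        a[i, j] = a[j, i] = np.linalg.norm(points[i] - points[j])
    return a

def sorted_spectrum(matrix):
    """Eigenvalues of a symmetric matrix in descending order."""
    return np.linalg.eigvalsh(matrix)[::-1]

def mirror_and_relabel(points, edges, order):
    """Reflect the graph about the vertical axis and renumber its vertices.

    order[k] is the old index of the vertex that receives new index k.
    """
    points = np.asarray(points, dtype=float)
    mirrored = points * np.array([-1.0, 1.0])  # flip the x coordinate
    new_points = mirrored[list(order)]
    new_index = {old: new for new, old in enumerate(order)}
    new_edges = [(new_index[i], new_index[j]) for i, j in edges]
    return new_points, new_edges

points = [[0, 0], [2, 0], [2, 1], [0, 3]]
edges = [(0, 1), (1, 2), (2, 3)]
m_points, m_edges = mirror_and_relabel(points, edges, order=[2, 0, 3, 1])

original = sorted_spectrum(weighted_adjacency(points, edges))
mirrored = sorted_spectrum(weighted_adjacency(m_points, m_edges))
print(np.allclose(original, mirrored))
```

Reflection preserves all edge lengths and relabelling only permutes rows and columns, so the two spectra should agree up to floating-point error even though the matrices themselves differ. This is why two numerals that are mirror images of each other can receive nearly identical feature vectors.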
6 Conclusion and Future Work
In this study, we presented a method that exploits robust graph representation and spectral graph embedding for the recognition of style-variant, cursive handwritten characters, taking Devanagari numerals as a case study. The n largest eigenvalues (spectrum) are extracted from selected (application-dependent) matrices associated with the weighted numeral graph. We empirically validated the best-performing n for each spectrum. Recognition performance of the individual spectra ranges from 75% to 85% (in terms of average F-measure). To augment recognition accuracy, classifier fusion at the decision level was also studied; it increases recognition accuracy significantly, as shown in Table 4. The performance of the method is corroborated by extensive experiments on the standard CVPR-ISI Kolkata dataset. From the results of the different experiments, we conclude that the proposed method is effective in representing complex relationships between different primitives, different intra-class sizes and styles, image transformations (translation, scale, rotation, reflection and mirror image) and cursiveness for the recognition of handwritten Devanagari numerals. However, the method may not cope with handwritten characters/numerals that have (more or less) the same underlying graph representation. Furthermore, the invariance property of the spectrum also adds to the confusion. Most misclassifications occur for these reasons.

There are various issues that need further investigation. For example, there seems to be room for employing the spectra of further graph-associated matrices in decision-level fusion. Furthermore, the experiments and observations in this study are based on Support Vector Machines; it would be interesting to repeat them with different classifiers. Moreover, utilizing probabilistic (fuzzy) outputs in one-vs.-one and one-vs.-all multi-class classification seems to be an interesting topic for further research. Finally, in this study we utilised the Euclidean distance for labelling graphs. It would be interesting to observe the influence of other distance measures on the eigendecomposition of numeral graphs.

Acknowledgement. We would like to thank Prof. Ujjwal Bhattacharya and Prof. B.B. Chaudhuri of the Computer Vision and Pattern Recognition Unit (CVPR Unit) of the Indian Statistical Institute (ISI), Kolkata, for providing the Handwritten Devanagari Numeral dataset.
Bibliography
1. Fujisawa, H.: Forty years of research in character and document recognition-an industrial perspective. Pattern Recognit. 41, 2435–2446 (2008).
2. Cheriet, M., El Yacoubi, M., Fujisawa, H., Lopresti, D., Lorette, G.: Handwriting recognition research: Twenty years of achievement... and beyond. Pattern Recognit. 42, 3131–3135 (2009).
3. Conte, D., Foggia, P., Sansone, C., Vento, M.: Thirty years of graph matching in pattern recognition. Int. J. Pattern Recognit. Artif. Intell. 18, 3, 265–298 (2004).
4. Wang, P.: Historical handwriting representation model dedicated to word spotting application. Computer Vision and Pattern Recognition [cs.CV]. Université Jean Monnet - Saint-Etienne (2014). English. NNT: 2014STET4019.
5. Bhowmik, S. et al.: A holistic word recognition technique for handwritten Bangla words. Int. J. Appl. Pattern Recognit. 2, 2, 142–159 (2015).
6. Trier, O. et al.: Feature extraction methods for character recognition - A survey. (1996).
7. Pavlidis, T.: Decomposition of Polygons Into Simpler Components: Feature Generation for Syntactic Pattern Recognition. IEEE Trans. Comput. C-24, 6, 636–650 (1975).
8. Sarkhel, R. et al.: A multi-objective approach towards cost effective isolated handwritten Bangla character and digit recognition. Pattern Recognit. 58, 172–189 (2016).
9. Basu, S. et al.: A hierarchical approach to recognition of handwritten Bangla characters. Pattern Recognit. 42, 7, 1467–1484 (2009).
10. Sarkhel, R. et al.: An enhanced harmony search method for Bangla handwritten character recognition using region sampling. 2015 IEEE 2nd Int. Conf. Recent Trends Inf. Syst. 325–330 (2015).
11. Kekre, H.B. et al.: Devnagari Handwritten Character Recognition using LBG vector quantization with gradient masks. 2013 Int. Conf. Adv. Technol. Eng. ICATE 2013. 1–4 (2013).
12. Le Cun, Y., Bengio, Y.: Word-level training of a handwritten word recognizer based on convolutional neural networks. Proc. 12th IAPR Int. Conf. Pattern Recognit. (Cat. No.94CH3440-5). 2, 88–92 (1994).
13. Bajaj, R., Dey, L., Chaudhury, S.: Devnagari numeral recognition by combining decision of multiple connectionist classifiers. Sadhana. 27, 59–72 (2002).
14. Bhattacharya, U., Chaudhuri, B.B.: Handwritten numeral databases of Indian scripts and multistage recognition of mixed numerals. IEEE Trans. Pattern Anal. Mach. Intell. 31, 3, 444–457 (2009).
15. Hanmandlu, M., Murthy, O.V.R.: Fuzzy model based recognition of handwritten numerals. Pattern Recognit. 40, 1840–1854 (2007).
16. Ramteke, R.J., Mehrotra, S.C.: Feature Extraction Based on Moment Invariants for Handwriting Recognition. Proc. IEEE Conference on Cybernetics and Intelligent Systems, Bangkok, 1–6 (2006). doi: 10.1109/ICCIS.2006.252262.
17. Patil, P.M., Sontakke, T.R.: Rotation, scale and translation invariant handwritten Devanagari numeral character recognition using general fuzzy neural network. Pattern Recognit. 40, 2110–2117 (2007).
18. Malik, L.: A graph based approach for handwritten devanagri word recognition. Int. Conf. Emerg. Trends Eng. Technol. ICETET. 309–313 (2012).
19. Ghosh, S., Das, N., Kundu, M., Nasipuri, M.: Handwritten Oriya Digit Recognition Using Maximum Common Subgraph Based Similarity Measures. In: Satapathy, S., Mandal, J., Udgata, S., Bhateja, V. (eds.) Information Systems Design and Intelligent Applications. Advances in Intelligent Systems and Computing, vol. 435. Springer, New Delhi (2016).
20. Bhat, M.I., Sharada, B.: Recognition of handwritten devanagiri numerals by graph representation and SVM. 2016 Int. Conf. Adv. Comput. Commun. Informatics, ICACCI 2016. 1930–1935 (2016).
21. Vasantha Lakshmi, C., Jain, R., Patvardhan, C.: Handwritten devnagari numerals recognition with higher accuracy. Proc. Int. Conf. Comput. Intell. Multimed. Appl. ICCIMA 2007. 3, 255–259 (2008).
22. Banashree, N., Andhre, D., Vasanta, R.: OCR for script identification of Hindi (Devnagari) numerals using error diffusion Halftoning Algorithm with neural classifier. Proc. World. 46–50 (2007).
23. Das, N., Reddy, J.M., Sarkar, R., Basu, S., Kundu, M., Nasipuri, M., Basu, D.K.: A statistical-topological feature combination for recognition of handwritten numerals. Appl. Soft Comput. J. 12, 2486–2495 (2012).
24. Banashree, N.P., Vasanta, R.: OCR for Script Identification of Hindi (Devnagari) Numerals using Feature Sub Selection by Means of End-Point with Neuro-Memetic Model. 1, 206–210 (2007).
25. Singh, P., Verma, A.: Handwritten Devnagari Digit Recognition using Fusion of Global and Local Features. 89, 6–12 (2014).
26. Sharma, N., Pal, U., Kimura, F., Pal, S.: Recognition of Off-Line Handwritten Devnagari Characters Using Quadratic Classifier. 805–816 (2006).
27. More, V.N., Rege, P.P.: Devanagari handwritten numeral identification based on Zernike moments. IEEE Reg. 10 Annu. Int. Conf. Proceedings/TENCON. (2008).
28. Bhattacharya, U., Parui, S.K., Shaw, B., Bhattacharya, K.: Neural combination of ANN and HMM for handwritten devanagari numeral recognition. Tenth Intl. Work. Front. Handwrit. Recognition. (2006).
29. Pal, U., Chaudhuri, B.B.: Indian script character recognition: A survey. Pattern Recognit. 37, 1887–1899 (2004).
30. Bag, S., Harit, G.: A survey on optical character recognition for Bangla. Sadhana. 38, 133–168 (2013).
31. Jayadevan, R., Kolhe, S.R., Patil, P.M., Pal, U.: Offline recognition of Devanagari script: A survey. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 41, 782–796 (2011).
32. Conte, D., Foggia, P., Sansone, C., Vento, M. In: Kandel, A., Bunke, H., Last, M. (eds.) Applied Graph Theory in Computer Vision and Pattern Recognition. Stud. Comput. Intell. 52, 85–135 (2007).
33. Sarkar, S., Boyer, K.L.: Quantitative measures of change based on feature organization: Eigenvalues and eigenvectors. Comput. Vision Image Understanding. 71, 110–136 (1998).
34. Shi, J., Malik, J.: Normalized cuts and image segmentation. Proc. IEEE Conf. Computer Vision and Pattern Recognition, 731–737 (1997); IEEE Trans. Pattern Anal. Mach. Intell. 22, 888–905 (2000).
35. Schmidt, M., Palm, G., Schwenker, F.: Spectral graph features for the classification of graphs and graph sequences. Comput. Stat. 29, 65–80 (2014).
36. Pal, U., Wakabayashi, T., Sharma, N., Kimura, F.: Handwritten numeral recognition of six popular Indian scripts. Proc. Int. Conf. Doc. Anal. Recognition, ICDAR. 2, 749–753 (2007).
37. Chung, F.R.K.: Spectral Graph Theory. ACM SIGACT News. 30, 14 (1999).
38. Brouwer, A.E., Haemers, W.H.: Spectra of Graphs. Universitext. Springer (2012).
39. Deo, N.: Graph Theory with Applications to Engineering & Computer Science. Print.
40. Ghosh, S. et al.: The journey of graph kernels through two decades. Comput. Sci. Rev. 27, 88–111 (2018).
41. Guo, Z., Hall, R.W.: Parallel thinning with two-subiteration algorithms. Commun. ACM. 32, 359–373 (1989).
42. Deza, M.M., Deza, E.: Encyclopedia of distances. (2009).
43. Gallian, J.A.: A dynamic survey of graph labeling. Electron. J. Comb. 1–219 (2009).
44. Wilson, R.C.: Graph Theory and Spectral Methods for Pattern Recognition.
45. Umeyama, S.: An Eigendecomposition Approach to Weighted Graph Matching Problems. IEEE Trans. Pattern Anal. Mach. Intell. 10, 5, 695–703 (1988).
46. Stewman, J., Bowyer, K.: Learning Graph Matching. Comput. Vision, 1988. ICCV 1988. IEEE 2nd Int. Conf. 31, 6, 494–500 (1988).
47. Conte, D. et al.: Thirty Years of Graph Matching in Pattern Recognition. Int. J. Pattern Recognit. Artif. Intell. 18, 3, 265–298 (2004).
48. Cvetković, D.M. et al.: Recent Results in the Theory of Graph Spectra. (1991).
49. Brualdi, R.A.: A Combinatorial Approach to Matrix Theory and Its Applications. Print.
50. Cvetković, D.M. et al.: Eigenspaces of Graphs. Print.
51. Cvetković, D.M. et al.: Spectral Generalisations of Line Graphs. Print.
52. Bhattacharya, U., Chaudhuri, B.B.: Handwritten numeral databases of Indian scripts and multistage recognition of mixed numerals. IEEE Trans. Pattern Anal. Mach. Intell. 31, 3, 444–457 (2009).
53. Nelles, O.: Nonlinear system identification: from classical approaches to neural networks and fuzzy models. (2001).
54. Bhattacharya, U., Chaudhuri, B.B.: Databases for research on recognition of handwritten characters of Indian scripts. Proc. Int. Conf. Doc. Anal. Recognition, ICDAR. 2005, 789–793 (2005).
55. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularisation, Optimisation, and Beyond. Adaptive Computation and Machine Learning. MIT Press (2002).
56. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification. New York John Wiley, Sect. 654 (2000).
57. Kuncheva, L.I.: Combining Pattern Classifiers: Methods and Algorithms. (2005).
58. Bhat, M.I., Sharada, B.: Recognition of Handwritten Devanagari Numerals by Graph Representation and Lipschitz Embedding. In: Santosh, K., Hangarge, M., Bevilacqua, V., Negi, A. (eds.) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2016. Communications in Computer and Information Science, vol. 709. Springer, Singapore (2017).