Bilan Zhu
Tokyo University of Agriculture and Technology
Publication
Featured research published by Bilan Zhu.
document recognition and retrieval | 2010
Bilan Zhu; Xiang-Dong Zhou; Cheng-Lin Liu; Masaki Nakagawa
This paper describes a robust context integration model for on-line handwritten Japanese text recognition. Based on string class probability approximation, the proposed method evaluates the likelihood of candidate segmentation–recognition paths by combining the scores of character recognition, unary and binary geometric features, and linguistic context. The path evaluation criterion flexibly combines the scores of various contexts and is insensitive to variability in path length, so the optimal segmentation path and its string class can be found effectively by Viterbi search. Moreover, the model parameters are estimated by a genetic algorithm so as to optimize the holistic string recognition performance. In experiments on horizontal text lines extracted from the TUAT Kondate database, the proposed method achieves a segmentation rate of 0.9934, measured as f-measure, and a character recognition rate of 92.80%.
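The path search described in this abstract can be illustrated with a small dynamic program. The following is a minimal sketch, not the authors' implementation: it assumes the text line has already been cut into primitive segments, and that a hypothetical `score(i, j)` function returns the combined score (character recognition, geometry, linguistic context) for treating primitives `i..j-1` as one character. A full Viterbi search would also track the character class of each candidate so that binary geometric and linguistic scores can depend on the previous character; that state is omitted here for brevity.

```python
from typing import Callable, List, Tuple


def best_segmentation(
    n_primitives: int,
    max_span: int,
    score: Callable[[int, int], float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Find the highest-scoring grouping of primitives 0..n-1 into characters.

    score(i, j) is a hypothetical combined score for the candidate
    character made of primitives i..j-1 (higher is better).
    max_span is the largest number of primitives one character may cover.
    """
    if max_span < 1:
        raise ValueError("max_span must be at least 1")
    neg_inf = float("-inf")
    best = [neg_inf] * (n_primitives + 1)  # best[j]: best score ending at boundary j
    back = [-1] * (n_primitives + 1)       # back[j]: start boundary of the last character
    best[0] = 0.0
    for j in range(1, n_primitives + 1):
        for i in range(max(0, j - max_span), j):
            if best[i] == neg_inf:
                continue
            candidate = best[i] + score(i, j)
            if candidate > best[j]:
                best[j] = candidate
                back[j] = i
    # Trace the chosen boundaries back from the end of the line.
    path: List[Tuple[int, int]] = []
    j = n_primitives
    while j > 0:
        i = back[j]
        path.append((i, j))
        j = i
    path.reverse()
    return best[n_primitives], path
```

Because the scores are simply summed here, longer paths accumulate more terms; the abstract's criterion is designed to be insensitive to path length, which this sketch does not attempt to reproduce.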
IEICE Transactions on Information and Systems | 2005
Masaki Nakagawa; Bilan Zhu; Motoki Onuma
This paper presents a model, and its effect, for on-line handwritten Japanese text recognition free from line-direction constraints and writing-format constraints such as character writing boxes or ruled lines. The model evaluates the likelihood composed of character segmentation, character recognition, character pattern structure and context. The likelihood of character pattern structure considers the plausible height, width and inner gaps within a character pattern that appear in Chinese characters composed of multiple radicals (subpatterns). The recognition system incorporating this model separates freely written text into text line elements, estimates the average character size of each element, hypothetically segments it into characters using geometric features, applies character recognition to the segmented patterns and employs the model to search for the text interpretation that maximizes the likelihood as Japanese text. We show the effectiveness of the model through recognition experiments and clarify how the newly modeled factors in the likelihood affect the overall recognition rate.
international conference on document analysis and recognition | 2011
Bilan Zhu; Jinfeng Gao; Masaki Nakagawa
This paper describes effective objective function design for combining on-line and off-line character recognizers for on-line handwritten Japanese text recognition. We combine on-line and off-line recognizers using a linear or nonlinear function with weighting parameters optimized by the MCE criterion. We apply a k-means method to cluster the parameters of all character categories into groups so that the categories belonging to the same group share the same weight parameters. Moreover, we apply a genetic algorithm to estimate hyperparameters such as the number of clusters, the initial learning rate and the maximum number of learning iterations, as well as the sigmoid function parameter for MCE optimization. Experimental results on horizontal text lines extracted from the TUAT Kondate database demonstrate the superiority of our method.
international conference on document analysis and recognition | 2011
Bilan Zhu; Masaki Nakagawa
This paper describes a Markov random field (MRF) model with weighting parameters optimized by conditional random field (CRF) for on-line recognition of handwritten Japanese characters. It also presents an updated evaluation using a large testing set. The model extracts feature points along the pen-tip trace from pen-down to pen-up and sets each feature point from an input pattern as a site and each state from a character class as a label. It employs the coordinates of feature points as unary features and the differences in coordinates between neighboring feature points as binary features. The weighting parameters are estimated by CRF or the minimum classification error (MCE) method. In experiments using the TUAT Kuchibue database, the method achieved a character recognition rate of 92.77%, which is higher than the previous model's rate, and estimating the weighting parameters using CRF was more accurate than using MCE.
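The unary and binary features named in this abstract are simple to compute once feature points have been extracted. The sketch below is illustrative only: it assumes the feature points are already available as a list of `(x, y)` tuples in writing order, and the function names are hypothetical rather than taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def unary_features(points: List[Point]) -> List[Point]:
    """Unary feature of each site: the (x, y) coordinates of the feature point."""
    return list(points)


def binary_features(points: List[Point]) -> List[Point]:
    """Binary feature of each neighboring pair: the coordinate difference (dx, dy)."""
    return [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
```

A pattern with n feature points yields n unary features and n - 1 binary features, one per pair of neighboring sites.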
IEICE Transactions on Information and Systems | 2008
Bilan Zhu; Masaki Nakagawa
This paper describes a method of producing segmentation point candidates for on-line handwritten Japanese text by a support vector machine (SVM) to improve text recognition. This method extracts multi-dimensional features from on-line strokes of handwritten text and applies the SVM to the extracted features to produce segmentation point candidates. We incorporate the method into the segmentation-by-recognition scheme based on a stochastic model which evaluates the likelihood composed of character pattern structure, character segmentation, character recognition and context to finally determine segmentation points and recognize handwritten Japanese text. This paper also shows the details of generating segmentation point candidates in order to achieve a high discrimination rate by finding the optimal combination of the segmentation threshold and the concatenation threshold. We compare segmentation by the SVM with segmentation by a neural network (NN) using the database HANDS-Kondate_t_bf-2001–11 and show that the SVM-based method yields a better segmentation rate and character recognition rate.
document recognition and retrieval | 2011
Bilan Zhu; Masaki Nakagawa
This paper describes a Markov random field (MRF) model with weighting parameters optimized by conditional random field (CRF) for on-line recognition of handwritten Japanese characters. The model extracts feature points along the pen-tip trace from pen-down to pen-up and sets each feature point from an input pattern as a site and each state from a character class as a label. It employs the coordinates of feature points as unary features and the differences in coordinates between neighboring feature points as binary features. The weighting parameters are estimated by CRF or the minimum classification error (MCE) method. In experiments using the TUAT Kuchibue database, the method achieved a character recognition rate of 92.77%, which is higher than the previous model's rate, and estimating the weighting parameters using CRF was more accurate than using MCE.
Proceedings of the 2011 Workshop on Historical Document Imaging and Processing | 2011
Truyen Van Phan; Bilan Zhu; Masaki Nakagawa
In this paper, we present the first effort in preprocessing and character segmentation on digitized Nom document pages toward their digital archiving. Nom is an ideographic script used to represent Vietnamese from the 10th century to the 20th century. Because of various complex layouts, we propose an efficient method based on connected component analysis for extraction of characters from images. The area Voronoi diagram is then employed to represent the neighborhood and boundary of connected components. Based on this representation, each character can be considered as a group of extracted adjacent Voronoi regions. To improve the performance of segmentation, we use the recursive x-y cut method to segment separated regions. We evaluate the performance of this method on several pages in different layouts. The results confirm that the method is effective for character segmentation in Nom documents.
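The recursive x-y cut step mentioned in this abstract can be sketched as follows. This is a generic textbook version under stated assumptions, not the paper's implementation: the page is assumed to be a binarized image stored as a list of rows with 1 for ink and 0 for background, any fully blank row or column counts as a cut (no minimum gap width), and the names `xy_cut` and `_runs` are hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom and right are exclusive


def _runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return [start, end) index ranges where a projection profile is non-zero."""
    runs: List[Tuple[int, int]] = []
    start = None
    for k, value in enumerate(profile):
        if value and start is None:
            start = k
        elif not value and start is not None:
            runs.append((start, k))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def xy_cut(img: List[List[int]], box: Optional[Box] = None) -> List[Box]:
    """Recursively split a binary image at blank rows, then at blank columns."""
    if box is None:
        box = (0, 0, len(img), len(img[0]) if img else 0)
    top, left, bottom, right = box

    # Horizontal projection: amount of ink in each row of the box.
    rows = [sum(img[r][left:right]) for r in range(top, bottom)]
    row_runs = _runs(rows)
    if not row_runs:
        return []  # the box contains no ink
    if len(row_runs) > 1 or row_runs[0] != (0, bottom - top):
        out: List[Box] = []
        for s, e in row_runs:
            out += xy_cut(img, (top + s, left, top + e, right))
        return out

    # Vertical projection: amount of ink in each column of the box.
    cols = [sum(img[r][c] for r in range(top, bottom)) for c in range(left, right)]
    col_runs = _runs(cols)
    if len(col_runs) > 1 or col_runs[0] != (0, right - left):
        out = []
        for s, e in col_runs:
            out += xy_cut(img, (top, left + s, bottom, left + e))
        return out

    return [box]  # no blank row or column left: this box is a leaf region
```

In practice a minimum gap width is usually required before cutting, so that small gaps between the radicals of a single character do not split it.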
international conference on document analysis and recognition | 2009
Cheng Cheng; Bilan Zhu; Xiaorong Chen; Masaki Nakagawa
This paper presents a revised method for keyword search from handwritten digital ink in comparison with the previous system. We adopt a search method using noise reduction. Experiments on digital ink databases show that the revised method typically improves the system’s overall accuracy (f-measure) from 0.653 to 0.891.
international conference on pattern recognition | 2004
Masaki Nakagawa; Bilan Zhu; Motoki Onuma
This work presents a formalization of on-line handwritten Japanese text recognition that is free of writing boxes and line-direction constraints, and shows its effect. By normalizing character orientation, even text of arbitrary character orientation can be recognized. The method evaluates the likelihood composed of character segmentation, character recognition, character pattern structure and context. The likelihood of character pattern structure considers the plausible height, width and gaps within a character pattern that appear in Chinese characters composed of multiple radicals (subpatterns). We show how the newly modeled factors in the likelihood affect the overall recognition rate.
Pattern Recognition Letters | 2014
Bin Chen; Bilan Zhu; Masaki Nakagawa
This paper presents the effects of a large number of artificially generated training patterns on an on-line handwritten Japanese character recognizer based on the Markov random field model. In general, the more training patterns, the higher the recognition accuracy. In reality, however, the existing pattern samples are not enough, especially for languages with large character sets, for which a larger number of parameters needs to be adjusted. We use six types of linear distortion models and combine them with one another and with a non-linear distortion model to generate a large number of artificial patterns. These models are based on several geometric transformation models, which are considered to simulate distortions in real handwriting. We apply these models to the TUAT Nakayosi database and expand its volume by up to 300 times, while evaluating the notable effect on recognition accuracy using the TUAT Kuchibue database. The effect is analyzed for subgroups in the character set, and a significant effect is observed for Kanji, ideographic characters of Chinese origin. This paper also considers the order of linear and non-linear distortion models and the strategy for selecting patterns in the original database: from patterns close to character class models to those far from them, or vice versa. For this consideration, we merge the Nakayosi and Kuchibue databases. We take 100 patterns existing in the merged database to form the testing set, and the remaining samples form the training set. For the order, applying linear and then non-linear distortions produces higher recognition accuracy. For the strategy, selecting patterns from those far from character class models to those close to them produces higher accuracy.
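Linear distortion of an on-line pattern amounts to applying a 2x2 linear map to every pen coordinate. The abstract does not list its six linear models, so the rotation, shear and scaling below are assumed, illustrative choices rather than the paper's actual set. The sketch also assumes a pattern is a list of strokes, each a list of `(x, y)` points centered on the origin.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def linear_map(strokes: List[Stroke], a: float, b: float, c: float, d: float) -> List[Stroke]:
    """Apply x' = a*x + b*y, y' = c*x + d*y to every point of every stroke."""
    return [[(a * x + b * y, c * x + d * y) for x, y in stroke] for stroke in strokes]


def rotate(strokes: List[Stroke], degrees: float) -> List[Stroke]:
    """Rotate the pattern counter-clockwise about the origin."""
    t = math.radians(degrees)
    return linear_map(strokes, math.cos(t), -math.sin(t), math.sin(t), math.cos(t))


def shear_x(strokes: List[Stroke], k: float) -> List[Stroke]:
    """Slant the pattern horizontally: x shifts in proportion to y."""
    return linear_map(strokes, 1.0, k, 0.0, 1.0)


def scale(strokes: List[Stroke], sx: float, sy: float) -> List[Stroke]:
    """Stretch or shrink the pattern independently along each axis."""
    return linear_map(strokes, sx, 0.0, 0.0, sy)
```

Distortions can be chained, for example `shear_x(rotate(pattern, 5.0), 0.1)`, and each original pattern can be expanded into many artificial ones by sampling the parameters from small ranges.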