Katsumi Marukawa | Researchain

Publication

Featured research published by Katsumi Marukawa.

Pattern Recognition | 2005

Pseudo two-dimensional shape normalization methods for handwritten Chinese character recognition

Cheng-Lin Liu; Katsumi Marukawa

The nonlinear normalization (NLN) method based on line density equalization is popularly used in handwritten Chinese character recognition. To overcome the insufficient shape restoration capability of one-dimensional NLN, a pseudo two-dimensional NLN (P2DNLN) method has been proposed and has yielded higher recognition accuracy. The P2DNLN method, however, is very computationally expensive because of the line density blurring of each row/column. In this paper, we propose a new pseudo 2D normalization method using line density projection interpolation (LDPI), which partitions the line density map into soft strips and generates a 2D coordinate mapping function by interpolating the 1D coordinate functions that are obtained by equalizing the line density projections of these strips. The LDPI method adds little computational overhead to one-dimensional NLN yet performs comparably well with P2DNLN. We also apply this strategy to extending other normalization methods, including line density projection fitting, centroid-boundary alignment, moment, and bi-moment methods. The latter three methods are directly based on the character image instead of the line density map. Their 2D extensions provide real-time computation and high recognition accuracy, and are potentially applicable to gray-scale images and online trajectories.
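The core idea of equalizing a density projection and then interpolating between 1D coordinate functions can be sketched as follows. This is a minimal illustration only, not the paper's LDPI implementation: the function names `equalize_projection` and `interpolate_mappings`, the cumulative-sum mapping, and the single blend weight are all assumptions made for the example, and the soft-strip weighting of the actual method is not reproduced.

```python
def equalize_projection(projection, out_size):
    """Map each input coordinate to an output coordinate so that the
    accumulated density is spread uniformly over [0, out_size].

    `projection` is a list of non-negative density sums, one per row or
    column (hypothetical input format for this sketch).
    """
    total = float(sum(projection))
    if total <= 0:
        raise ValueError("projection must contain positive density")
    mapping = []
    running = 0.0
    for density in projection:
        running += density
        mapping.append(out_size * running / total)
    return mapping


def interpolate_mappings(mapping_a, mapping_b, weight_a):
    """Blend two 1D coordinate functions; weight_a lies in [0, 1].

    In a strip-based scheme, a pixel lying between two strips would take
    a weighted mix of the two strips' coordinate functions.
    """
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(mapping_a, mapping_b)]


# A dense middle column gets stretched; sparse columns are compressed.
dense_middle = equalize_projection([1, 1, 4, 1, 1], out_size=8)
uniform = equalize_projection([1, 1, 1, 1, 1], out_size=8)
blended = interpolate_mappings(dense_middle, uniform, weight_a=0.5)
```

Here the column with density 4 occupies half of the output range after equalization, which is the behavior that line density equalization is meant to produce.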

International Conference on Document Analysis and Recognition | 2007

Online Handwritten Japanese Character String Recognition Incorporating Geometric Context

Xiang-Dong Zhou; Jinlun Yu; Cheng-Lin Liu; Takeshi Nagasaki; Katsumi Marukawa

This paper describes an online handwritten Japanese character string recognition system integrating scores of geometric context, character recognition, and linguistic context. We give a string evaluation criterion for better integrating the multiple scores while overcoming the effect of string length variability. For measuring geometric context, we propose a statistical method for modeling both single-character and between-character plausibility. Our experimental results on TUAT HANDS databases show that the geometric context improves the character segmentation accuracy remarkably.

International Conference on Frontiers in Handwriting Recognition | 2004

Global shape normalization for handwritten Chinese character recognition: a new method

Cheng-Lin Liu; Katsumi Marukawa

Nonlinear normalization (NLN) based on line density equalization has been widely used in handwritten Chinese character recognition (HCCR). Our previous results showed that global transformation methods, including moment normalization and a newly proposed bi-moment method, generate smooth normalized shapes with less computational effort while yielding comparable recognition accuracies. This paper proposes a new global transformation method for HCCR, named the modified centroid-boundary alignment (MCBA) method. The previous CBA method can efficiently correct the skewness of the centroid by quadratic curve fitting but fails to adjust the inner density. The MCBA method adds a simple trigonometric (sine) function to the quadratic function to adjust the inner density. The amplitude of the sine wave is estimated from the centroids of half images. Experiments on the ETL9B and JEITA-HP databases show that the MCBA method yields accuracies comparable to those of the NLN and bi-moment methods and is complementary to them.

Pattern Recognition | 1997

Document retrieval tolerating character recognition errors : Evaluation and application

Katsumi Marukawa; Tao Hu; Hiromichi Fujisawa; Yoshihiro Shima

This paper presents two methods of combining character recognition with techniques for retrieving Japanese documents and also shows how these methods can be applied to textual image retrieval. Both retrieval methods are tolerant of errors that occur during the character recognition process. The basic idea is to utilize the characteristics of recognition errors. One uses a confusion matrix to generate “equivalent” query strings that should match erroneously recognized text. The other one searches a “non-deterministic text” that contains multiple candidates for ambiguous recognition results. Simulation experiments have shown that both methods can effectively combine character recognition with retrieval techniques.
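The two retrieval ideas described above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the dictionary standing in for a confusion matrix, and the Latin-character examples (used instead of Japanese text for readability) are all assumptions.

```python
from itertools import product


def expand_query(query, confusions):
    """Generate "equivalent" query strings from a confusion table.

    `confusions` maps a character to the set of characters an OCR engine
    is assumed to confuse it with (a simplified, unweighted stand-in for
    a confusion matrix).
    """
    options = [[ch] + sorted(confusions.get(ch, ())) for ch in query]
    return ["".join(chars) for chars in product(*options)]


def search_nondeterministic(text_candidates, query):
    """Find the query in a "non-deterministic text".

    `text_candidates` holds one set of candidate characters per text
    position; a position matches if the query character is among them.
    Returns the list of matching start positions.
    """
    hits = []
    for start in range(len(text_candidates) - len(query) + 1):
        if all(ch in text_candidates[start + i]
               for i, ch in enumerate(query)):
            hits.append(start)
    return hits


# Method 1: expand the query so it can match misrecognized text.
variants = expand_query("lO", {"l": {"1", "I"}, "O": {"0"}})

# Method 2: keep the query fixed and search ambiguous recognition output.
positions = search_nondeterministic(
    [{"c"}, {"a", "o"}, {"t", "f"}], "cot")
```

The first approach pays the cost at query time by searching for several strings; the second pays it at indexing time by storing several candidates per character.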

International Conference on Document Analysis and Recognition | 2001

A recursive analysis for form cell recognition

Hiroshi Shinjo; Eiichi Hadano; Katsumi Marukawa; Yoshihiro Shima; Hiroshi Sako

It is very difficult to analyze form structures because of breaks in lines and additional noise in the form image. This paper focuses on cell recognition in low-quality form images. The recognition method has two features to achieve robustness in cell recognition. One is grid representation using several types of intersection and the terminal points of the frame lines. The other is the recursive modification of the representation. A new representation is created from the previous one by determining the breaks in the lines and hypothesizing the locations of missed intersections. The modification is processed recursively until the representation has perfect consistency and all form cells are detected. In an experiment using 1565 form samples, all cells in 1538 samples (98.3% of 1565 samples) were recognized correctly by this method.

International Conference on Pattern Recognition | 2004

Handwritten numeral string recognition: character-level vs string-level classifier training

Cheng-Lin Liu; Katsumi Marukawa

The performance of handwritten numeral string recognition integrating segmentation and classification relies on the classification accuracy and the resistance to non-characters of the underlying classifier. The classifier can be trained at either character level (with character and non-character samples) or string level (with string samples). We show that both character-level and string-level training yield superior string recognition performance. String-level training improves segmentation but deteriorates classification. By combining the character-level trained classifier and the string-level trained classifier, we have achieved higher string recognition performance. We show the experimental results of three classifier structures on the numeral strings of NIST Special Database 19.

International Conference on Frontiers in Handwriting Recognition | 2004

Normalization ensemble for handwritten character recognition

Cheng-Lin Liu; Katsumi Marukawa

This paper proposes a multiple classifier approach, called normalization ensemble, for handwritten character recognition by combining multiple normalization methods. By varying the coordinate mapping mode, we have devised 14 normalization functions, and switching on/off slant correction results in 28 instantiated classifiers. We show that the classifiers with different normalization methods are complementary and that combining them can significantly improve the recognition accuracy. In experiments on handwritten digit recognition using the NIST Special Database 19, the normalization ensemble was shown to reduce the error rate by factors from 10.6% to 26.9% and achieved the best error rate of 0.43%. We also show that the complexity of the normalization ensemble can be reduced by selecting seven classifiers from 28 with little loss of accuracy.
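The count of 28 classifiers follows from pairing each of the 14 normalization functions with slant correction on or off. The sketch below enumerates such configurations and combines per-classifier scores. The placeholder names `norm_0` to `norm_13` and the sum rule for combination are assumptions chosen for illustration; the abstract does not state which combination rule the paper uses.

```python
from itertools import product


def build_configs(normalizers, slant_options=(False, True)):
    """Pair every normalization function with each slant-correction setting."""
    return list(product(normalizers, slant_options))


def combine_scores(score_lists):
    """Sum-rule combination (an assumed rule, not taken from the paper).

    `score_lists` holds one list of class scores per classifier; the
    class with the largest summed score is returned.
    """
    n_classes = len(score_lists[0])
    totals = [sum(scores[c] for scores in score_lists)
              for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__)


# 14 hypothetical normalization functions, each with slant correction off/on.
configs = build_configs([f"norm_{i}" for i in range(14)])

# Three classifiers voting over two classes; class 1 has the larger total.
winner = combine_scores([[0.6, 0.4], [0.2, 0.8], [0.45, 0.55]])
```

Because the classifiers differ only in their normalization front end, an ensemble of this kind can share the feature extractor and classifier structure across all members.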

International Conference on Frontiers in Handwriting Recognition | 2004

Document retrieval system tolerant of segmentation errors of document images

Takeshi Nagasaki; Toshikazu Takahashi; Katsumi Marukawa

This paper describes a new document retrieval method that is tolerant of OCR segmentation errors in document images. To overcome the segmentation and recognition errors that most OCR-based retrieval systems suffer from, the proposed method consists of two processing phases. First, the OCR engine generates multiple character-segmentation and recognition hypotheses. Then the retrieval engine extracts keywords from the recognition hypotheses by using lexicon-driven dynamic programming (DP) matching. We have applied this method to both handwritten and printed document images and have demonstrated its effectiveness in reducing false drops and false alarms.
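One plausible reading of keyword extraction by DP matching is approximate matching of a lexicon keyword against a sequence of recognition candidates. The sketch below simplifies heavily and is not the paper's algorithm: it assumes a single linear sequence of candidate sets rather than multiple segmentation hypotheses, uses unit edit costs, and the function name `keyword_distance` is hypothetical.

```python
def keyword_distance(keyword, candidates):
    """Smallest edit distance between `keyword` and any stretch of the
    candidate sequence.

    `candidates` holds one set of candidate characters per position. A
    keyword character matches a position at zero cost if it appears among
    that position's candidates. The match may start and end anywhere in
    the sequence.
    """
    # Row 0 is all zeros: skipping leading text before the keyword is free.
    prev = [0] * (len(candidates) + 1)
    for i, ch in enumerate(keyword, start=1):
        curr = [i]  # keyword prefix matched against nothing costs i
        for j, cell in enumerate(candidates, start=1):
            substitute = prev[j - 1] + (0 if ch in cell else 1)
            skip_keyword_char = prev[j] + 1
            skip_text_position = curr[j - 1] + 1
            curr.append(min(substitute, skip_keyword_char,
                            skip_text_position))
        prev = curr
    # Taking the minimum over end positions makes trailing text free too.
    return min(prev)


exact = keyword_distance(
    "cat", [{"x"}, {"c", "e"}, {"a", "o"}, {"t"}, {"y"}])
one_error = keyword_distance("cat", [{"c"}, {"o"}, {"t"}])
```

A retrieval engine built on this idea would report a keyword when its distance falls below a threshold, trading false drops against false alarms.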

International Conference on Document Analysis and Recognition | 1995

Optimal techniques in OCR error correction for Japanese texts

Toru Hisamitsu; Katsumi Marukawa; Yoshihiro Shima; Hiromichi Fujisawa; Yoshihiko Nitta

This paper investigates three fundamental techniques in OCR error correction for Japanese texts using morphological analysis: (1) an optimal method for candidate word extraction from a candidate character lattice, (2) optimal word entries for Japanese verb inflection analysis, and (3) a new method of word matching cost calculation that is better suited to use with linguistic criteria. Comparative evaluation shows that the combination of these techniques requires 84% less computation, captures 2.6% more candidate words, reduces the chart parsing computation by 20%, and attains a 25% higher error correction rate than a commonly used method.

International Conference on Document Analysis and Recognition | 1999

A method for street number matching in Japanese address recognition

Hisao Ogata; Yoo Ueda; Katsumi Marukawa; Hiroshi Sako; Hiromichi Fujisawa

A method for street number matching in Japanese address recognition is presented. This method uses knowledge of number expression formats, which is represented by the expression patterns of street numbers. The expression patterns are composed of character class symbols that represent numerals and delimiter characters. By matching the expression patterns against the character candidates in the lattice produced by character recognition, the class symbol lattice is generated. In the class symbol lattice, the numeral and delimiter character candidates are replaced by the corresponding class symbols. An experiment using 6,136 expression patterns showed that 93% of street numbers on the test mail-pieces were correctly recognized.
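The conversion from a character lattice to a class symbol lattice, followed by pattern matching, can be sketched as below. The symbols `N` (numeral) and `D` (delimiter), the delimiter set, and the fixed-length matching are assumptions for illustration; the paper's actual symbol inventory and matching procedure are not given in the abstract.

```python
NUMERALS = set("0123456789")
DELIMITERS = set("-")  # assumed delimiter set for this sketch


def to_class_lattice(char_lattice):
    """Replace numeral and delimiter candidates with class symbols.

    `char_lattice` holds one list of candidate characters per position.
    Candidates that are neither numerals nor delimiters are dropped.
    """
    class_lattice = []
    for candidates in char_lattice:
        symbols = set()
        for ch in candidates:
            if ch in NUMERALS:
                symbols.add("N")
            elif ch in DELIMITERS:
                symbols.add("D")
        class_lattice.append(symbols)
    return class_lattice


def matching_patterns(char_lattice, patterns):
    """Return the expression patterns consistent with the lattice."""
    class_lattice = to_class_lattice(char_lattice)
    return [p for p in patterns
            if len(p) == len(class_lattice)
            and all(sym in cell for sym, cell in zip(p, class_lattice))]


# Recognition candidates for a four-character street number such as "1-23".
lattice = [["1", "l"], ["-", "_"], ["2", "Z"], ["3"]]
matched = matching_patterns(lattice, ["NDNN", "NNDN", "NDN"])
```

Working at the class-symbol level lets a few thousand expression patterns cover many concrete street numbers without enumerating every digit combination.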
