Publication


Featured research published by Yutaka Katsuyama.


international conference on document analysis and recognition | 2005

Camera based degraded text recognition using grayscale feature

Jun Sun; Yoshinobu Hotta; Yutaka Katsuyama; Satoshi Naoi

With the rapid progress of digital imaging technology, camera-based character recognition has received increasing attention. One challenge in camera-based OCR is the recognition of degraded text. Conventional OCR engines usually operate on binary images, but their performance drops dramatically as the degradation level increases. In this paper, a new method is proposed to recognize degraded characters based on dual eigenspace decomposition and synthetic degraded data. The degraded character string is then segmented by a combination of binary and grayscale analysis. Experiments on single-character and text-string recognition prove the effectiveness of our method.
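The eigenspace idea behind this abstract can be sketched as follows: each class keeps the top principal components of its training features, and a query is assigned to the class whose eigenspace reconstructs it with the smallest error. This is a minimal single-eigenspace sketch, not the paper's full coarse-to-fine dual scheme; the function names and toy data are ours.

```python
import numpy as np

def class_eigenspace(samples, k):
    """Return (mean, top-k principal axes) of an (n, d) sample matrix."""
    mean = samples.mean(axis=0)
    # SVD of the centered data gives principal axes as rows of Vt
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(x, mean, axes):
    """Distance between x and its projection onto the class eigenspace."""
    c = x - mean
    proj = axes.T @ (axes @ c)
    return np.linalg.norm(c - proj)

def classify(x, spaces):
    """spaces: {label: (mean, axes)} -> label with minimal error."""
    return min(spaces, key=lambda lbl: reconstruction_error(x, *spaces[lbl]))

rng = np.random.default_rng(0)
# Two synthetic "character classes" living near different 1-D subspaces
a = rng.normal(0, 0.1, (50, 2)) + np.outer(rng.normal(0, 1, 50), [1.0, 0.0])
b = rng.normal(0, 0.1, (50, 2)) + np.outer(rng.normal(0, 1, 50), [0.0, 1.0])
spaces = {"A": class_eigenspace(a, 1), "B": class_eigenspace(b, 1)}
print(classify(np.array([2.0, 0.1]), spaces))  # a point along class A's axis
```

Synthetic degraded training data, as in the paper, would simply enlarge each class's sample matrix before the eigenspace is computed.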


conference on information and knowledge management | 2004

Low resolution character recognition by dual eigenspace and synthetic degraded patterns

Jun Sun; Yoshinobu Hotta; Yutaka Katsuyama; Satoshi Naoi

With the rapid progress of digital imaging technology, the demand for recognition of characters embedded in images has increased dramatically. Many image text characters are low resolution with heavy degradation. Traditional OCR methods do not perform well on these degraded images due to poor binarization. In this paper, a novel feature extraction method based on dual eigenspaces and synthetic pattern generation is proposed to recognize character images at low resolution. A subpixel grayscale normalization method is first used to normalize the low-resolution character images. The dual eigenspace performs classification from coarse to fine. The multiple templates generated from the synthetic patterns provide good robustness against real degradation. Experimental results indicate that our method is very effective on low-resolution Japanese character images.
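A key ingredient here is normalizing a low-resolution patch to a fixed template size with subpixel accuracy, so fractional source coordinates contribute weighted grayscale values instead of being rounded away. A minimal pure-Python sketch using bilinear interpolation, assuming row-major lists of grayscale values (the function name and sizes are our choices, not the paper's):

```python
import math

def bilinear_resize(img, out_h, out_w):
    """Rescale a grayscale patch to (out_h, out_w) with bilinear
    interpolation at fractional (subpixel) source coordinates."""
    in_h, in_w = len(img), len(img[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            # Map the output pixel centre back to fractional source coords
            sy = (y + 0.5) * in_h / out_h - 0.5
            sx = (x + 0.5) * in_w / out_w - 0.5
            y0 = min(max(math.floor(sy), 0), in_h - 1)
            x0 = min(max(math.floor(sx), 0), in_w - 1)
            y1, x1 = min(y0 + 1, in_h - 1), min(x0 + 1, in_w - 1)
            fy = min(max(sy - y0, 0.0), 1.0)
            fx = min(max(sx - x0, 0.0), 1.0)
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bot * fy
    return out

patch = [[0, 255], [255, 0]]          # a tiny 2x2 "character" patch
norm = bilinear_resize(patch, 4, 4)   # upscale to a fixed 4x4 template
print(round(norm[0][0]), round(norm[1][1]))  # → 0 96
```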


document recognition and retrieval | 2003

Slide identification for lecture movies by matching characters and images

Noriaki Ozawa; Hiroaki Takebe; Yutaka Katsuyama; Satoshi Naoi; Haruo Yokota

Slide identification is very important when creating e-Learning materials, as it detects slide changes during lecture movies. Simply detecting a change is not enough for e-Learning purposes: knowing which slide is currently displayed in the frame is also important. A matching technique combined with a presentation file containing the answer information is very useful for identifying slides in a movie frame. We propose two methods for slide identification in this paper. The first is character-based, using the relationship between character codes and their coordinates. The other is image-based, using normalized correlation and dynamic programming. We used actual movies to evaluate the performance of these methods, both independently and in combination, and the experimental results revealed that they are very effective in identifying slides in lecture movies.
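The image-based method scores each candidate slide against a video frame by normalized correlation, which is invariant to the brightness and contrast shifts a projector capture introduces. A toy sketch of that scoring step over flattened grayscale vectors (the dynamic-programming alignment is omitted, and all data are illustrative):

```python
import math

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length grayscale
    vectors; 1.0 means identical up to brightness/contrast changes."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db) if da and db else 0.0

def identify_slide(frame, slides):
    """Return the index of the slide most correlated with the frame."""
    scores = [normalized_correlation(frame, s) for s in slides]
    return max(range(len(slides)), key=scores.__getitem__)

slides = [[10, 10, 200, 200], [200, 10, 200, 10], [50, 60, 70, 80]]
frame = [30, 25, 220, 210]   # a darkened, lower-contrast capture of slide 0
print(identify_slide(frame, slides))  # → 0
```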


chinese conference on pattern recognition | 2009

Text Detection in Images Based on Grayscale Decomposition and Stroke Extraction

Wei Fan; Jun Sun; Yutaka Katsuyama; Yoshinobu Hotta; Satoshi Naoi

A method for detecting text regions in images that combines grayscale decomposition and stroke extraction is proposed. By checking the consistency of the two text features, text-like connected components are grouped together to generate text-line regions in the processed image. The method efficiently detects image text rendered against relatively complex backgrounds.
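The grouping step can be sketched as a consistency check on connected components: bounding boxes that overlap vertically and sit close horizontally are merged into one text line. The box format, thresholds, and function name below are our assumptions, not the paper's:

```python
def group_into_lines(components, max_gap=10):
    """Group text-like connected components, given as (x, y, w, h)
    boxes, into text lines: a box joins a line when it overlaps the
    line's last box vertically and is close to it horizontally."""
    lines = []
    for (x, y, w, h) in sorted(components):  # left-to-right by x
        for line in lines:
            lx, ly, lw, lh = line[-1]
            v_overlap = min(y + h, ly + lh) - max(y, ly)
            if v_overlap > 0 and x - (lx + lw) <= max_gap:
                line.append((x, y, w, h))
                break
        else:
            lines.append([(x, y, w, h)])
    return lines

# Three characters on one line, one isolated component far below
comps = [(0, 0, 8, 10), (12, 1, 8, 9), (24, 0, 8, 10), (0, 40, 8, 10)]
print(len(group_into_lines(comps)))  # → 2
```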


international conference on pattern recognition | 2008

Video caption duration extraction

Hongliang Bai; Jun Sun; Satoshi Naoi; Yutaka Katsuyama; Yoshinobu Hotta; Katsuhito Fujimoto

Caption detection in video has been an active research topic in recent years. In conventional methods, one of the most difficult problems is to effectively and quickly extract the durations of different-size captions against complex backgrounds. To solve this problem, a novel and effective method is presented to locate and track captions in video. The main contributions are: (1) a multi-scale Harris-corner-based method to detect the initial position of a caption, and (2) the SGF (Steady Global Feature) to determine the caption duration. Extensive experiments demonstrate the effectiveness of the proposed method.
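The Harris-corner step amounts to computing the response R = det(M) - k·trace(M)² of the gradient structure matrix M in a small window: corners of caption strokes give strongly positive R, while plain edges give R ≤ 0. A minimal single-scale sketch (the multi-scale search and the SGF tracking are omitted; names and data are ours):

```python
def harris_response(img, k=0.04):
    """Per-pixel Harris response R = det(M) - k*trace(M)^2, using an
    unweighted 3x3 window over central-difference gradients."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx; syy += gy * gy; sxy += gx * gy
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp

# A bright rectangle corner at (3, 3) in a dark 7x7 frame
img = [[255 if y >= 3 and x >= 3 else 0 for x in range(7)] for y in range(7)]
resp = harris_response(img)
print(resp[3][3] > 0, resp[5][3] < 0)  # corner positive, plain edge negative
```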


document analysis systems | 2012

A Fast Caption Detection Method for Low Quality Video Images

Tianyi Gui; Jun Sun; Satoshi Naoi; Yutaka Katsuyama; Akihiro Minagawa; Yoshinobu Hotta

Captions in videos are important and accurate clues for video retrieval. In this paper, we propose a fast and robust video caption detection and localization algorithm to handle low-quality video images. First, stroke response maps are extracted from the complex background by a stroke filter. Then, two localization algorithms are used to locate thin-stroke and thick-stroke caption regions respectively. Finally, an HOG-based SVM classifier is applied to the detected results to further remove noise. Experimental results show the superior performance of our proposed method compared with existing work in terms of accuracy and speed.
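The HOG feature feeding the SVM stage boils down to per-cell histograms of gradient orientation weighted by gradient magnitude. A minimal sketch of one unsigned-orientation cell histogram (cell size, bin count, and names are our assumptions, not the paper's parameters):

```python
import math

def hog_cell(img, x0, y0, size=8, bins=9):
    """Unsigned (0-180 degree) gradient-orientation histogram for one
    HOG cell, weighted by gradient magnitude; interior pixels only."""
    hist = [0.0] * bins
    for y in range(y0 + 1, y0 + size - 1):
        for x in range(x0 + 1, x0 + size - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(ang / (180.0 / bins)) % bins] += mag
    return hist

# A vertical step edge: all gradient energy lands in the 0-degree bin
img = [[255 if x >= 5 else 0 for x in range(10)] for y in range(10)]
h = hog_cell(img, 0, 0)
print(h[0] > 0, sum(h[1:]) == 0)  # → True True
```

A full detector would concatenate block-normalized cell histograms and score them with a linear SVM.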


international symposium on multimedia | 2006

Treatment of Laser Pointer and Speech Information in Lecture Scene Retrieval

Wataru Nakano; Takashi Kobayashi; Yutaka Katsuyama; Haruo Yokota

We have previously proposed a unified presentation contents search mechanism named UPRISE (Unified Presentation Slide Retrieval by Impression Search Engine), and have also proposed a method to use laser pointer information in lecture scene retrieval. In this paper, we discuss the treatment of laser pointer and speech information, and propose two methods to filter the laser pointer information using keyword occurrence in slides and speech. We also propose weighting schemata with filtered laser pointer information using slide text and speech information. We evaluate our approach by using actual lecture videos and presentation slides.
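One way to picture the keyword-occurrence filter: a laser-pointer event is kept only when a word spoken near that moment also occurs in the pointed slide's text. The event format, window size, and all data below are our own illustration, not the paper's definitions:

```python
def filter_pointer_events(events, slide_text, transcript, window=5.0):
    """Keep a pointer event (time, x, y) only if some word spoken
    within `window` seconds of it also occurs in the slide text.
    transcript: list of (timestamp, word) pairs."""
    slide_words = set(slide_text.lower().split())
    kept = []
    for (t, x, y) in events:
        spoken = {w.lower() for (ts, w) in transcript if abs(ts - t) <= window}
        if spoken & slide_words:
            kept.append((t, x, y))
    return kept

events = [(10.0, 100, 200), (40.0, 300, 120)]
transcript = [(9.0, "gradient"), (12.0, "descent"), (41.0, "lunch")]
slide = "Stochastic gradient descent update rule"
print(filter_pointer_events(events, slide, transcript))  # keeps only the first
```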


Systems and Computers in Japan | 1988

Rough classification of handwritten characters by divide‐and‐unify method

Toshiaki Ejima; Yutaka Katsuyama; Masayuki Kimura

A new method is proposed in which each feature vector is divided into subvectors. A rough classification of handwritten characters is then performed by unifying the classification information obtained from each subvector. Rough classification requires both "soundness" of classification and "quickness" of processing. To perform rough classification effectively, considerable classification information must be stored in a dictionary; on the other hand, to process quickly, the amount of computation must be decreased by condensing the classification information.

The method herein not only realizes efficient rough classification through decreased redundancy, by condensing classification information independently for each subvector of the feature vector, but also realizes a sounder rough classification by unifying the classification information obtained from each subvector. In classification experiments, the amount of processing was reduced to one-fourth or one-fifth of that of the previous method, which examines all character classes, without reducing the successful classification rate; this demonstrates the effectiveness of the method.
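The divide-and-unify idea can be sketched as follows (this is our own simplification, not the paper's algorithm): the feature vector is split into subvectors, each subvector votes for the classes whose stored sub-templates lie within a tolerance, and the votes are unified into a small rough-classification candidate set.

```python
def split(vec, parts):
    """Divide a feature vector into equal-length subvectors."""
    n = len(vec) // parts
    return [vec[i * n:(i + 1) * n] for i in range(parts)]

def subvector_candidates(sub, templates, tol):
    """Classes whose sub-template is within squared distance tol."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return {lbl for lbl, t in templates.items() if dist(sub, t) <= tol}

def rough_classify(vec, dictionary, parts=2, tol=2.0, min_votes=2):
    """Unify per-subvector candidate sets into a rough candidate set."""
    votes = {}
    sub_dicts = [{lbl: split(t, parts)[i] for lbl, t in dictionary.items()}
                 for i in range(parts)]
    for sub, sub_dict in zip(split(vec, parts), sub_dicts):
        for lbl in subvector_candidates(sub, sub_dict, tol):
            votes[lbl] = votes.get(lbl, 0) + 1
    return {lbl for lbl, v in votes.items() if v >= min_votes}

dictionary = {"a": [0, 0, 0, 0], "b": [5, 5, 5, 5], "c": [0, 0, 5, 5]}
print(sorted(rough_classify([0.5, 0.0, 0.2, 0.1], dictionary)))  # → ['a']
```

Only the surviving candidates ("a" here) would go on to the expensive fine-classification stage, which is where the processing saving comes from.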


international conference on pattern recognition | 2014

Fast and Accurate Text Detection in Natural Scene Images with User-Intention

Liuan Wang; Wei Fan; Yuan He; Jun Sun; Yutaka Katsuyama; Yoshinobu Hotta

We propose an accurate and robust coarse-to-fine text detection scheme with user intention that captures the intrinsic characteristics of natural scene text. In the coarse detection stage, a double edge detector is designed to estimate the symmetry of strokes and the stroke width, which helps segment the foreground. Then the initial user-intention region is extended to generate a coarse bounding box based on the estimated foreground. In the refinement stage, candidate connected components (CCs) from Niblack decomposition are grouped together by location to form text lines after noise removal and layer selection. Experimental results demonstrate the effectiveness of the proposed method, which yields higher performance compared with state-of-the-art methods.
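Niblack decomposition builds on Niblack's classic local threshold T(x, y) = m(x, y) + k·s(x, y): the window mean plus k times the window standard deviation. A minimal sketch of that thresholding step (window size, k, and the toy image are our choices; the paper's layer selection and grouping are omitted):

```python
import math

def niblack_threshold(img, window=3, k=-0.2):
    """Binarize with Niblack's local threshold T = mean + k*stddev over
    a square window; pixels above T are marked 1, the rest 0."""
    h, w = len(img), len(img[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - r), min(h, y + r + 1))
                    for xx in range(max(0, x - r), min(w, x + r + 1))]
            m = sum(vals) / len(vals)
            s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
            out[y][x] = 1 if img[y][x] > m + k * s else 0
    return out

# A dark vertical stroke (column 2) on a bright background
img = [[50 if x == 2 else 200 for x in range(5)] for y in range(5)]
binary = niblack_threshold(img)
print(binary[2])  # → [0, 1, 0, 1, 0]: the stroke column stays 0
```

Note that perfectly flat regions (s = 0) also come out 0 here; that instability in flat areas is a well-known Niblack artifact and is one reason real pipelines add noise removal afterwards.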


international conference on data engineering | 2005

Unified Presentation Contents Retrieval Using Laser Pointer Information

Wataru Nakano; Yuta Ochi; Takashi Kobayashi; Yutaka Katsuyama; Satoshi Naoi; Haruo Yokota

We have proposed unifying presentation contents, such as lecture video and the presentation slides used in lectures, using metadata. For the unified contents, we have also proposed a search mechanism named UPRISE (Unified Presentation Slide Retrieval by Impression Search Engine). In this paper, we focus on the position, count, and duration information of a laser pointer, and propose a retrieval method using this information. In the proposed method, we extract the position of the laser pointer from the video and choose candidate sentences near the laser pointer position. We evaluate our approach by applying it to actual presentation contents.
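The candidate-selection step can be pictured as ranking slide sentences by their distance from the detected pointer position. The data layout (sentence text mapped to a bounding-box centre) and the function name are our own illustration:

```python
import math

def nearest_sentences(pointer, sentences, k=2):
    """Rank slide sentences by distance from the laser-pointer position.
    sentences: {text: (x, y) centre of the sentence's bounding box}."""
    px, py = pointer
    ranked = sorted(sentences,
                    key=lambda s: math.hypot(sentences[s][0] - px,
                                             sentences[s][1] - py))
    return ranked[:k]

sentences = {"Title": (400, 60),
             "Bullet one": (300, 200),
             "Bullet two": (300, 300)}
print(nearest_sentences((310, 290), sentences, k=1))  # → ['Bullet two']
```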

Collaboration

Top Co-Authors
Masayuki Kimura

Japan Advanced Institute of Science and Technology
