
Publication


Featured research published by Chucai Yi.


IEEE Transactions on Image Processing | 2011

Text String Detection From Natural Scenes by Structure-Based Partition and Grouping

Chucai Yi; Yingli Tian

Text information in natural scene images serves as important clues for many image-based applications such as scene understanding, content-based image retrieval, assistive navigation, and automatic geocoding. However, locating text in a complex background with multiple colors is a challenging task. In this paper, we explore a new framework to detect text strings with arbitrary orientations in complex natural scene images. Our proposed framework of text string detection consists of two steps: 1) image partition to find text character candidates based on local gradient features and color uniformity of character components and 2) character candidate grouping to detect text strings based on joint structural features of text characters in each text string, such as character size differences, distances between neighboring characters, and character alignment. By assuming that a text string has at least three characters, we propose two algorithms of text string detection: 1) an adjacent character grouping method and 2) a text line grouping method. The adjacent character grouping method calculates the sibling groups of each character candidate as string segments and then merges the intersecting sibling groups into text strings. The text line grouping method applies the Hough transform to fit text lines to the centroids of text candidates; each fitted text line describes the orientation of a potential text string, and the detected text string is represented by a rectangular region covering all characters whose centroids lie along that line. To improve efficiency and accuracy, our algorithms are carried out at multiple scales. The proposed methods outperform the state-of-the-art results on the public Robust Reading Dataset, which contains text only in horizontal orientation. Furthermore, the effectiveness of our methods in detecting text strings with arbitrary orientations is evaluated on the Oriented Scene Text Dataset we collected, which contains text strings in nonhorizontal orientations.
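
The text line grouping step can be pictured with a small sketch (our own simplification, not the authors' code): each character-candidate centroid votes in a Hough (theta, rho) accumulator, and cells supported by at least three centroids are kept as candidate text lines. Bin resolutions here are illustrative choices.

```python
import numpy as np

def hough_line_grouping(centroids, n_theta=180, rho_res=5.0, min_chars=3):
    """centroids: (N, 2) array of (x, y) character-candidate centroids."""
    thetas = np.deg2rad(np.arange(n_theta))            # candidate line orientations
    max_rho = np.hypot(*centroids.max(axis=0)) + 1.0   # bound on |rho|
    votes = {}                                         # (theta_bin, rho_bin) -> member indices
    for i, (x, y) in enumerate(centroids):
        rho = x * np.cos(thetas) + y * np.sin(thetas)  # rho of this point at every theta
        rho_bin = ((rho + max_rho) / rho_res).astype(int)
        for t in range(n_theta):
            votes.setdefault((t, rho_bin[t]), []).append(i)
    # Cells with >= min_chars supporters are candidate text lines; dedupe
    # identical member sets that land in several neighboring cells.
    groups = {frozenset(m) for m in votes.values() if len(m) >= min_chars}
    return [sorted(g) for g in groups]

if __name__ == "__main__":
    pts = np.array([[10, 10], [30, 12], [50, 11], [70, 13], [20, 80]], float)
    print(hough_line_grouping(pts))   # the four near-collinear centroids group together
```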


IEEE Transactions on Image Processing | 2012

Localizing Text in Scene Images by Boundary Clustering, Stroke Segmentation, and String Fragment Classification

Chucai Yi; Yingli Tian

In this paper, we propose a novel framework to extract text regions from scene images with complex backgrounds and multiple text appearances. This framework consists of three main steps: boundary clustering (BC), stroke segmentation, and string fragment classification. In BC, we propose a new bigram-color-uniformity-based method to model both text and its attachment surface, and cluster edge pixels into boundary layers based on color pairs and spatial positions. Then, stroke segmentation is performed at each boundary layer by color assignment to extract character candidates. We propose two algorithms that combine the structural analysis of text strokes with color assignment and filter out background interference. Further, we design a robust string fragment classifier based on Gabor text features, obtained from feature maps of gradient, stroke distribution, and stroke width. The proposed framework of text localization is evaluated on scene images, born-digital images, broadcast video images, and images of handheld objects captured by blind persons. Experimental results on the respective datasets demonstrate that the framework outperforms state-of-the-art localization algorithms.
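
A rough sketch of the BC step, under our own assumptions (the feature design and parameters are illustrative, not the paper's): each edge pixel is described by the color pair sampled on the two sides of the edge plus its weighted position, and k-means splits edge pixels into boundary layers.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def boundary_layers(bgr, n_layers=6, offset=2, pos_weight=0.05):
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    ys, xs = np.nonzero(edges)
    mag = np.hypot(gx[ys, xs], gy[ys, xs]) + 1e-6
    nx, ny = gx[ys, xs] / mag, gy[ys, xs] / mag        # unit gradient direction
    h, w = gray.shape
    # Sample the color a few pixels along and against the gradient (the "color pair").
    xi = np.clip((xs + offset * nx).astype(int), 0, w - 1)
    yi = np.clip((ys + offset * ny).astype(int), 0, h - 1)
    xo = np.clip((xs - offset * nx).astype(int), 0, w - 1)
    yo = np.clip((ys - offset * ny).astype(int), 0, h - 1)
    color_pair = np.hstack([bgr[yi, xi], bgr[yo, xo]]).astype(np.float32)
    # Normalized positions, down-weighted so color dominates the clustering.
    pos = np.stack([xs, ys], axis=1).astype(np.float32)
    pos = pos / np.array([w, h], np.float32) * 255.0 * pos_weight
    feats = np.hstack([color_pair, pos])
    labels = KMeans(n_clusters=n_layers, n_init=10).fit_predict(feats)
    return xs, ys, labels   # each edge pixel assigned to a boundary layer
```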


Computer Vision and Image Understanding | 2013

Text extraction from scene images by character appearance and structure modeling

Chucai Yi; Yingli Tian

In this paper, we propose a novel algorithm to detect text information in natural scene images. Scene text classification and detection are still open research topics. Our proposed algorithm is able to model both character appearance and structure to generate representative and discriminative text descriptors. The contributions of this paper include three aspects: 1) a new character appearance model built by a structure correlation algorithm that extracts discriminative appearance features from detected interest points of character samples; 2) a new text descriptor based on structons and correlatons, which model character structure through structure differences among character samples and structure component co-occurrence; and 3) a new text region localization method that combines color decomposition, character contour refinement, and string line alignment to localize character candidates and refine detected text regions. We perform three groups of experiments to evaluate the effectiveness of our proposed algorithm, covering text classification, text detection, and character identification. The evaluation results on benchmark datasets demonstrate that our algorithm achieves state-of-the-art performance on scene text classification and detection, and significantly outperforms existing algorithms for character identification.
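
The structon/correlaton construction is specific to the paper; as a rough stand-in for the appearance-modeling half only, the sketch below builds an ordinary bag-of-visual-words descriptor: interest-point descriptors are clustered into a vocabulary, and a character sample is represented by the histogram of its cluster assignments. ORB and the vocabulary size are our own choices.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def build_vocab(gray_imgs, vocab_size=64):
    orb = cv2.ORB_create()
    descs = []
    for g in gray_imgs:                          # training character images
        _, d = orb.detectAndCompute(g, None)
        if d is not None:
            descs.append(d.astype(np.float32))
    return KMeans(n_clusters=vocab_size, n_init=10).fit(np.vstack(descs))

def bow_descriptor(gray, vocab):
    orb = cv2.ORB_create()
    _, d = orb.detectAndCompute(gray, None)
    hist = np.zeros(vocab.n_clusters, np.float32)
    if d is not None:
        for c in vocab.predict(d.astype(np.float32)):
            hist[c] += 1.0
    return hist / max(hist.sum(), 1.0)           # L1-normalized word histogram
```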


Machine Vision and Applications | 2013

Toward a computer vision-based wayfinding aid for blind persons to access unfamiliar indoor environments

Yingli Tian; Xiaodong Yang; Chucai Yi; Aries Arditi

Independent travel is a well-known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g., an office, a laboratory, or a bathroom) and other building amenities (e.g., an exit or an elevator), we combine object detection with text recognition. First, we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, to distinguish intra-class objects (e.g., an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition software products. The object type, orientation, location, and text information are presented to the blind traveler as speech.
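
A minimal sketch of the final stage of this pipeline: once a text region has been extracted from a detected door or sign, off-the-shelf OCR reads it and a text-to-speech engine announces it. pytesseract and pyttsx3 are stand-ins for whichever OCR and TTS products a deployment would actually use.

```python
import cv2
import pytesseract
import pyttsx3

def announce_text_region(bgr_region, object_type="door"):
    gray = cv2.cvtColor(bgr_region, cv2.COLOR_BGR2GRAY)
    # Binarize to suppress background interference before OCR.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(binary).strip()
    engine = pyttsx3.init()
    engine.say(f"{object_type}: {text}" if text else f"{object_type}, no label")
    engine.runAndWait()
```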


CBDAR'11: Proceedings of the 4th International Conference on Camera-Based Document Analysis and Recognition | 2011

Assistive text reading from complex background for blind persons

Chucai Yi; Yingli Tian

In this paper, we propose a camera-based assistive system for visually impaired or blind persons to read text from signage and objects held in the hand. The system is able to read text against complex backgrounds and then communicate this information aurally. To localize text regions in images with complex backgrounds, we design a novel text localization algorithm that learns gradient features of stroke orientations and distributions of edge pixels in an AdaBoost model. Text characters in the localized regions are recognized by off-the-shelf optical character recognition (OCR) software and transformed into speech output. The performance of the proposed system is evaluated on the ICDAR 2003 Robust Reading Dataset. Experimental results demonstrate that our algorithm outperforms previous algorithms on some measures. Our prototype system was further evaluated on a dataset collected by 10 blind persons, with the system effectively reading text from complex backgrounds.
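
A simplified sketch of the localization idea: each candidate window is described by a histogram of gradient (stroke) orientations at edge pixels plus an edge-density cue, and an AdaBoost classifier separates text from non-text windows. The feature details here are our own simplification, not the paper's exact features.

```python
import cv2
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def window_features(gray, n_bins=12):
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                      # gradient (stroke) orientation
    edge = mag > mag.mean() + mag.std()           # crude edge-pixel mask
    hist, _ = np.histogram(ang[edge], bins=n_bins, range=(-np.pi, np.pi))
    density = edge.mean()                         # edge-pixel distribution cue
    return np.append(hist / max(hist.sum(), 1), density)

def train_text_classifier(windows, labels):
    # windows: list of grayscale image windows; labels: 1 = text, 0 = background
    X = np.array([window_features(w) for w in windows])
    return AdaBoostClassifier(n_estimators=100).fit(X, labels)
```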


International Conference on Document Analysis and Recognition | 2011

Text Detection in Natural Scene Images by Stroke Gabor Words

Chucai Yi; Yingli Tian

In this paper, we propose a novel algorithm, based on stroke components and descriptive Gabor filters, to detect text regions in natural scene images. Text characters and strings are constructed from stroke components as basic units. Gabor filters are used to describe and analyze the stroke components in text characters or strings. We define a suitability measurement to analyze the confidence of Gabor filters in describing stroke components and the suitability of Gabor filters on an image window. From the training set, we compute a set of Gabor filters whose parameters describe the principal stroke components of text. Then a K-means algorithm is applied to cluster these descriptive Gabor filters. The cluster centers are defined as Stroke Gabor Words (SGWs), which provide a universal description of stroke components. By suitability evaluation on positive and negative training samples, respectively, each SGW generates a pair of characteristic distributions of suitability measurements. On a test natural scene image, heuristic layout analysis is applied first to extract candidate image windows. Then we compute the principal SGWs for each image window to describe its principal stroke components. The characteristic distributions generated by the principal SGWs are used to classify windows as text or non-text. Experimental results on benchmark datasets demonstrate that our algorithm can handle complex backgrounds and variant text patterns (font, color, scale, etc.).
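
A sketch of the Stroke Gabor Words idea under our own parameter choices (the paper learns descriptive filters from training data; here we simply cluster a fixed filter bank by its parameters): k-means over the bank yields a compact stroke vocabulary, and a filter's response energy on a window serves as a crude suitability score.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def gabor_bank():
    kernels, params = [], []
    for theta in np.linspace(0, np.pi, 8, endpoint=False):   # orientations
        for lambd in (4.0, 8.0, 16.0):                       # wavelengths
            k = cv2.getGaborKernel((21, 21), sigma=4.0, theta=theta,
                                   lambd=lambd, gamma=0.5)
            kernels.append(k)
            params.append([theta, lambd])
    return kernels, np.array(params, np.float32)

def stroke_gabor_words(n_words=6):
    kernels, params = gabor_bank()
    km = KMeans(n_clusters=n_words, n_init=10).fit(params)
    # Each "word" is the bank filter closest to a cluster center.
    return [kernels[int(np.argmin(((params - c) ** 2).sum(axis=1)))]
            for c in km.cluster_centers_]

def suitability(gray, word):
    # Mean response energy of one SGW on an image window.
    resp = cv2.filter2D(gray.astype(np.float32), -1, word)
    return float(np.abs(resp).mean())
```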


IEEE/ASME Transactions on Mechatronics | 2014

Portable Camera-Based Assistive Text and Product Label Reading From Hand-Held Objects for Blind Persons

Chucai Yi; Yingli Tian; Aries Arditi

We propose a camera-based assistive text reading framework to help blind persons read text labels and product packaging on hand-held objects in their daily lives. To isolate the object from cluttered backgrounds or other surrounding objects in the camera view, we first propose an efficient and effective motion-based method to define a region of interest (ROI) in the video by asking the user to shake the object. This method extracts the moving object region with mixture-of-Gaussians-based background subtraction. In the extracted ROI, text localization and recognition are conducted to acquire text information. To automatically localize the text regions within the object ROI, we propose a novel text localization algorithm that learns gradient features of stroke orientations and distributions of edge pixels in an AdaBoost model. Text characters in the localized text regions are then binarized and recognized by off-the-shelf optical character recognition software. The recognized text codes are output to blind users as speech. Performance of the proposed text localization algorithm is quantitatively evaluated on the ICDAR-2003 and ICDAR-2011 Robust Reading Datasets. Experimental results demonstrate that our algorithm achieves state-of-the-art performance. The proof-of-concept prototype is also evaluated on a dataset collected with ten blind persons to assess the effectiveness of the system's hardware. We explore user interface issues and assess the robustness of the algorithm in extracting and reading text from different objects with complex backgrounds.
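
A minimal sketch, with hypothetical parameters, of the motion-based ROI step: while the user shakes the object, mixture-of-Gaussians background subtraction accumulates a foreground mask whose bounding box becomes the ROI for text localization.

```python
import cv2
import numpy as np

def roi_from_shaking(frames):
    # frames: iterable of BGR video frames captured while the object is shaken
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    acc = None
    for frame in frames:
        fg = subtractor.apply(frame)           # per-frame foreground mask
        acc = fg if acc is None else cv2.bitwise_or(acc, fg)
    # Remove speckle noise before taking the bounding box.
    acc = cv2.morphologyEx(acc, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    ys, xs = np.nonzero(acc)
    if xs.size == 0:
        return None                            # no motion detected
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```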


International Conference on Document Analysis and Recognition | 2013

Feature Representations for Scene Text Character Recognition: A Comparative Study

Chucai Yi; Xiaodong Yang; Yingli Tian

Recognizing text characters in natural scene images is a challenging problem due to background interference and multiple character patterns. Scene Text Character (STC) recognition, which generally includes feature representation to model character structure and multi-class classification to predict the label and score of a character class, plays a significant role in word-level text recognition. The contribution of this paper is a complete performance evaluation of image-based STC recognition, comparing different sampling methods, feature descriptors, dictionary sizes, coding and pooling schemes, and SVM kernels. We systematically analyze the impact of each option in the feature representation and classification. The evaluation results on two datasets, CHARS74K and ICDAR2003, demonstrate that the Histogram of Oriented Gradients (HOG) descriptor, soft-assignment coding, max pooling, and chi-square Support Vector Machines (SVMs) obtain the best performance among local-sampling-based feature representations. To improve STC recognition, we apply a global sampling feature representation: we generate Global HOG (GHOG) by computing the HOG descriptor over the whole character image. GHOG enables better character structure modeling and obtains better performance than local-sampling-based feature representations. GHOG also outperforms existing methods on the two benchmark datasets.
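
A sketch of the best-performing combination reported here: a global HOG (GHOG) descriptor computed over the whole character image, classified with a chi-square-kernel SVM. The parameter values and image size are our own illustrative choices.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel

def ghog(gray32):
    # One HOG descriptor over the whole (size-normalized) character image.
    return hog(gray32, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def train_stc_classifier(char_imgs, labels):
    # char_imgs: list of 32x32 grayscale character images; labels: class ids
    X = np.array([ghog(im) for im in char_imgs])
    clf = SVC(kernel=chi2_kernel)   # chi-square kernel via a callable kernel
    clf.fit(X, labels)
    return clf
```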


IEEE Transactions on Image Processing | 2014

Scene Text Recognition in Mobile Applications by Character Descriptor and Structure Configuration

Chucai Yi; Yingli Tian

Text characters and strings in natural scenes can provide valuable information for many applications. Extracting text directly from natural scene images or videos is a challenging task because of diverse text patterns and variant background interference. This paper proposes a method of scene text recognition from detected text regions. In text detection, our previously proposed algorithms are applied to obtain text regions from scene images. First, we design a discriminative character descriptor by combining several state-of-the-art feature detectors and descriptors. Second, we model the character structure of each character class by designing stroke configuration maps. Our algorithm design is compatible with the application of scene text extraction on smart mobile devices. An Android-based demo system is developed to show the effectiveness of our proposed method on scene text information extraction from nearby objects. The demo system also provides some insight into algorithm design and performance improvement for scene text extraction. The evaluation results on benchmark datasets demonstrate that our proposed scheme of text recognition is comparable with the best existing methods.
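
The paper's descriptor combines several detectors and descriptors; as a toy illustration of only the structure-modeling half, this sketch builds a per-class "stroke map" by averaging aligned binary character masks and scores a query character by normalized correlation with each class map. The averaging scheme is our own simplification, not the paper's stroke configuration maps.

```python
import numpy as np

def stroke_maps(masks_by_class):
    # masks_by_class: {label: list of HxW binary masks, already size-normalized}
    return {c: np.mean(np.stack(ms), axis=0) for c, ms in masks_by_class.items()}

def classify(mask, maps):
    q = mask.astype(np.float32).ravel()
    q = (q - q.mean()) / (q.std() + 1e-6)
    best, best_score = None, -np.inf
    for c, m in maps.items():
        t = m.astype(np.float32).ravel()
        t = (t - t.mean()) / (t.std() + 1e-6)
        score = float(q @ t) / q.size          # normalized cross-correlation
        if score > best_score:
            best, best_score = c, score
    return best, best_score
```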


IEEE International Conference on Systems, Man, and Cybernetics | 2013

Semantic Indoor Navigation with a Blind-User Oriented Augmented Reality

Samleo L. Joseph; Xiaochen Zhang; Ivan Dryanovski; Jizhong Xiao; Chucai Yi; Yingli Tian

The aim of this paper is to design an inexpensive wearable navigation system that can aid the navigation of a visually impaired user. A novel approach utilizing the floor plan maps posted in buildings is used to acquire a semantic plan. Extracted landmarks, such as room numbers and doors, act as parameters to infer the waypoints to each room, providing a mental map of the environment for the navigation framework. A human motion model is used to predict a path based on how real humans ambulate toward a goal while avoiding obstacles. We demonstrate the possibilities of augmented reality (AR) as a blind-user interface that conveys the physical constraints of the real world through haptic and voice augmentation. A haptic belt vibrates to direct the user toward the travel destination based on metric localization at each step. Moreover, the travel route is presented by voice guidance, achieved by accurate estimation of the user's location and confirmed by extracting the landmarks via landmark localization. The results show that it is feasible to assist a blind user to travel independently by providing the constraints required for safe navigation with user-oriented augmented reality.
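
The paper's human motion model is not specified here; as a generic stand-in for goal-directed walking with obstacle avoidance, the sketch below takes one step of a simple potential-field planner: the walker is pulled toward the goal waypoint and pushed away from nearby obstacles. All gains are hypothetical.

```python
import numpy as np

def next_step(pos, goal, obstacles, step=0.3, repulse=1.0, influence=1.5):
    pos, goal = np.asarray(pos, float), np.asarray(goal, float)
    force = goal - pos
    force /= np.linalg.norm(force) + 1e-9          # unit attraction toward goal
    for ob in obstacles:
        d = pos - np.asarray(ob, float)
        dist = np.linalg.norm(d)
        if dist < influence:                       # only nearby obstacles repel
            force += repulse * d / (dist ** 2 + 1e-9)
    return pos + step * force / (np.linalg.norm(force) + 1e-9)
```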

Collaboration


Dive into Chucai Yi's collaborations.

Top Co-Authors

Yingli Tian
City University of New York

Xiaodong Yang
City University of New York

Aries Arditi
Lighthouse International

Jizhong Xiao
City University of New York

Samleo L. Joseph
City University of New York

Xuejian Rong
City University of New York

Xiaochen Zhang
City College of New York

Bing Li
City University of New York