Bolan Su
National University of Singapore
Publication
Featured research published by Bolan Su.
International Journal on Document Analysis and Recognition | 2010
Shijian Lu; Bolan Su; Chew Lim Tan
Document images often suffer from different types of degradation that render document image binarization a challenging task. This paper presents a document image binarization technique that accurately segments the text from badly degraded document images. The proposed technique is based on the observation that text documents usually have a background of uniform color and texture, and that the document text has a different intensity level from the surrounding background. Given a document image, the proposed technique first estimates a document background surface through an iterative polynomial smoothing procedure. Different types of document degradation are then compensated by using the estimated background surface. Text stroke edges are further detected from the compensated document image by using the L1-norm image gradient. Finally, the document text is segmented by a local threshold that is estimated from the detected text stroke edges. The proposed technique was submitted to the recent Document Image Binarization Contest (DIBCO) held under the framework of ICDAR 2009 and achieved the top performance among 43 algorithms submitted by 35 international research groups.
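The background-estimation and compensation steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, row-wise fitting, polynomial degree and iteration count are all assumptions for the sake of the example.

```python
import numpy as np

def estimate_background(row, degree=3, iters=5):
    """Iteratively fit a polynomial to one scanline; replacing pixels that
    fall below the current fit pushes dark text strokes out of the fit, so
    the polynomial converges toward the bright background surface."""
    x = np.arange(row.size)
    fit = np.polyval(np.polyfit(x, row, degree), x)
    for _ in range(iters):
        smoothed = np.maximum(row, fit)  # suppress dark (text) pixels
        fit = np.polyval(np.polyfit(x, smoothed, degree), x)
    return fit

def compensate(image, degree=3):
    """Divide each row by its estimated background surface to normalize
    uneven illumination: the result is near 1.0 on background and clearly
    lower on text strokes."""
    bg = np.vstack([estimate_background(r, degree) for r in image.astype(float)])
    return image / np.maximum(bg, 1e-6)
```

On a synthetic page with an illumination ramp and a dark stroke, the compensated background sits near 1.0 while the stroke stays well below it, which is what makes a subsequent threshold easy to pick.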
Document Analysis Systems | 2010
Bolan Su; Shijian Lu; Chew Lim Tan
This paper presents a new document image binarization technique that segments the text from badly degraded historical document images. The proposed technique makes use of the image contrast defined by the local image maximum and minimum. Compared with the image gradient, this contrast measure has the nice property of being more tolerant to uneven illumination and other types of document degradation such as smear. Given a historical document image, the proposed technique first constructs a contrast image and then detects high-contrast image pixels, which usually lie around the text stroke boundary. The document text is then segmented by using local thresholds that are estimated from the detected high-contrast pixels within a local neighborhood window. The proposed technique has been tested on the dataset used in the recent Document Image Binarization Contest (DIBCO) 2009. Experiments show its superior performance.
IEEE Transactions on Image Processing | 2013
Bolan Su; Shijian Lu; Chew Lim Tan
Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra-variation between the document background and the foreground text across different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient that is tolerant to text and background variation caused by different types of document degradation. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny's edge map to identify the text stroke edge pixels. The document text is further segmented by a local threshold that is estimated based on the intensities of detected text stroke edge pixels within a local window. The proposed method is simple, robust, and involves minimal parameter tuning. It has been tested on three public datasets used in the recent Document Image Binarization Contest (DIBCO) 2009 & 2011 and Handwritten-DIBCO 2010, and achieves accuracies of 93.5%, 87.8%, and 92.03%, respectively, which are significantly higher than or close to those of the best-performing methods reported in the three contests. Experiments on the Bickley diary dataset, which consists of several challenging poor-quality document images, also show the superior performance of our proposed method compared with other techniques.
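The "combination of local contrast and local gradient" can be sketched as a weighted sum; this is a simplified stand-in, with the mixing weight `alpha` fixed by hand here (the paper derives it from image statistics) and the gradient term approximated by the normalized local max-min difference.

```python
import numpy as np

def adaptive_contrast(image, alpha=0.5, win=3):
    """Adaptive image contrast sketch: a weighted combination of the local
    image contrast (ratio form, illumination-tolerant) and a normalized
    local gradient term (difference form, contrast-sensitive)."""
    pad = win // 2
    p = np.pad(image.astype(float), pad, mode="edge")
    w = np.lib.stride_tricks.sliding_window_view(p, (win, win))
    mx, mn = w.max(axis=(2, 3)), w.min(axis=(2, 3))
    contrast = (mx - mn) / (mx + mn + 1e-6)   # local image contrast
    gradient = (mx - mn) / (mx.max() + 1e-6)  # normalized local gradient
    return alpha * contrast + (1 - alpha) * gradient
```

Blending the two terms keeps the map responsive on both faint, unevenly lit text (where the ratio term dominates) and strong edges (where the gradient term dominates).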
ACM Multimedia | 2011
Bolan Su; Shijian Lu; Chew Lim Tan
Many digital images contain blurred regions caused by motion or defocus. Automatic detection and classification of blurred image regions are very important for many multimedia analysis tasks. This paper presents a simple and effective technique for automatic detection and classification of blurred image regions. In the proposed technique, blurred image regions are first detected by examining singular value information for each image pixel. The blur type (i.e., motion blur or defocus blur) is then determined based on an alpha channel constraint that requires neither image deblurring nor blur kernel estimation. Extensive experiments have been conducted over a dataset that consists of 200 blurred image regions and 200 image regions with no blur, extracted from 100 digital images. Experimental results show that the proposed technique detects and classifies the two types of image blur accurately. The proposed technique can be used in many different multimedia analysis applications such as image segmentation, depth estimation and information retrieval.
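The intuition behind the singular-value cue can be sketched as follows; the measure and threshold are illustrative, not the paper's exact formulation, which works per pixel over local patches.

```python
import numpy as np

def blur_measure(patch, k=1):
    """Share of energy in the top-k singular values of a local patch.
    Blurring removes fine detail, so a blurred patch's energy concentrates
    in its largest singular values; a high ratio therefore suggests blur."""
    s = np.linalg.svd(patch.astype(float), compute_uv=False)
    return s[:k].sum() / (s.sum() + 1e-12)
```

A textured (sharp) patch spreads its energy across many singular values and scores low, while a smooth low-rank patch scores close to 1.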
International Conference on Document Analysis and Recognition | 2011
Bolan Su; Shijian Lu; Chew Lim Tan
Document image binarization has been studied for decades, and many practical binarization techniques have been proposed for different kinds of document images. However, many state-of-the-art methods are only suitable for document images that suffer from a specific type of degradation or have specific image characteristics. In this paper, we propose a classification framework that combines different thresholding methods to produce better document image binarization performance. Given the binarization results of several reported methods, the proposed framework divides the document image pixels into three sets, namely foreground pixels, background pixels and uncertain pixels. A classifier is then applied to iteratively classify the uncertain pixels into foreground and background, based on the pre-selected foreground and background sets. Extensive experiments over different datasets, including the Document Image Binarization Contest (DIBCO) 2009 and the Handwritten Document Image Binarization Competition (H-DIBCO) 2010, show that our proposed framework significantly outperforms most state-of-the-art methods.
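The three-set idea can be sketched as below. The nearest-class-mean rule here is a simple stand-in for the paper's classifier, and the function name and iteration count are illustrative.

```python
import numpy as np

def combine_binarizations(gray, masks, iters=3):
    """Combine several binarization results (True = foreground text).
    Pixels on which all methods agree seed the foreground and background
    sets; the remaining 'uncertain' pixels are iteratively assigned to the
    class whose mean grey level is closer."""
    votes = masks.sum(axis=0)
    fg = votes == len(masks)           # every method votes foreground
    bg = votes == 0                    # every method votes background
    uncertain = ~(fg | bg)
    label = fg.copy()                  # current foreground labelling
    for _ in range(iters):
        mf = gray[fg | (uncertain & label)].mean()
        mb = gray[bg | (uncertain & ~label)].mean()
        label = (uncertain & (np.abs(gray - mf) < np.abs(gray - mb))) | fg
    return label
```

Only the disputed pixels are ever reclassified, so the combination can never do worse than the consensus of the input methods on the pixels where they agree.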
Asian Conference on Computer Vision | 2014
Bolan Su; Shijian Lu
Scene text recognition is a useful but very challenging task due to the uncontrolled conditions of text in natural scenes. This paper presents a novel approach to recognizing text in scene images. In the proposed technique, a word image is first converted into a sequence of column feature vectors based on the Histogram of Oriented Gradients (HOG). A Recurrent Neural Network (RNN) is then adapted to classify the sequence of feature vectors into the corresponding word. Compared with most existing methods, which follow a bottom-up approach that forms words by grouping recognized characters, our proposed method recognizes whole word images without character-level segmentation and recognition. Experiments on a number of publicly available datasets show that the proposed method outperforms state-of-the-art techniques significantly. In addition, the recognition results on publicly available datasets provide a good benchmark for future research in this area.
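The first step, turning a word image into a sequence of per-column HOG vectors for the RNN, can be sketched as follows; the bin count, per-column pooling and normalization are illustrative assumptions rather than the paper's exact descriptor.

```python
import numpy as np

def column_hog_sequence(word_img, bins=8):
    """Convert a word image into a left-to-right sequence of per-column
    HOG descriptors: quantize unsigned gradient orientation into `bins`
    bins, accumulate gradient magnitude per column, L2-normalize."""
    img = word_img.astype(float)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    seq = []
    for c in range(img.shape[1]):             # one descriptor per column
        h = np.bincount(idx[:, c], weights=mag[:, c], minlength=bins)
        seq.append(h / (np.linalg.norm(h) + 1e-6))
    return np.array(seq)                      # shape: (width, bins)
```

The output is a (width, bins) array, i.e. one feature vector per image column, which is exactly the kind of sequence a recurrent network consumes without any character segmentation.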
International Conference on Document Analysis and Recognition | 2013
Shangxuan Tian; Shijian Lu; Bolan Su; Chew Lim Tan
Scene text recognition is a fundamental step in end-to-end applications where traditional optical character recognition (OCR) systems often fail to produce satisfactory results. This paper proposes a technique that uses the co-occurrence histogram of oriented gradients (Co-HOG) to recognize text in scenes. Compared with the histogram of oriented gradients (HOG), Co-HOG is a more powerful tool that captures the spatial distribution of neighboring orientation pairs instead of just a single gradient orientation. At the same time, it is more efficient than HOG and therefore more suitable for real-time applications. The proposed scene text recognition technique is evaluated on the ICDAR2003 character dataset and the Street View Text (SVT) dataset. Experiments show that the Co-HOG based technique clearly outperforms state-of-the-art techniques that use HOG, the Scale Invariant Feature Transform (SIFT), and Maximally Stable Extremal Regions (MSER).
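The core of Co-HOG, a joint histogram over pairs of neighboring orientations, can be sketched as below; a single non-negative offset and a small bin count are shown for brevity, where the actual descriptor accumulates several offsets.

```python
import numpy as np

def cohog(img, bins=4, offset=(0, 1)):
    """Co-HOG sketch: a joint histogram of the quantized gradient
    orientation at each pixel and at the pixel displaced by `offset`,
    giving a bins x bins co-occurrence matrix per offset."""
    gy, gx = np.gradient(img.astype(float))
    ang = np.mod(np.arctan2(gy, gx), np.pi)             # unsigned orientation
    q = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    dy, dx = offset                                     # non-negative offsets
    a = q[:q.shape[0] - dy, :q.shape[1] - dx]           # orientation at p
    b = q[dy:, dx:]                                     # orientation at p+offset
    hist = np.zeros((bins, bins))
    np.add.at(hist, (a.ravel(), b.ravel()), 1)
    return hist / hist.sum()
```

Where plain HOG would record only the marginal distribution of orientations, the bins x bins matrix records which orientations occur next to which, which is the extra spatial information the abstract refers to.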
Pattern Recognition | 2016
Shangxuan Tian; Ujjwal Bhattacharya; Shijian Lu; Bolan Su; Qingqing Wang; Xiaohua Wei; Yue Lu; Chew Lim Tan
Automatic machine reading of text in scenes is largely restricted by poor character recognition accuracy. In this paper, we extend the Histogram of Oriented Gradients (HOG) and propose two new feature descriptors: Co-occurrence HOG (Co-HOG) and Convolutional Co-HOG (ConvCo-HOG) for accurate recognition of scene text in different languages. Compared with HOG, which counts the orientation frequency of each single pixel, Co-HOG encodes more spatial contextual information by capturing the co-occurrence of orientation pairs of neighboring pixels. Additionally, ConvCo-HOG exhaustively extracts Co-HOG features from every possible image patch within a character image for more spatial information. The two features have been evaluated extensively on five scene character datasets in three different languages: three sets in English, one set in Chinese and one set in Bengali. Experiments show that the proposed techniques provide superior scene character recognition accuracy and are capable of recognizing scene text of different scripts and languages.
Highlights:
- Introduced powerful features Co-HOG and ConvCo-HOG for scene character recognition.
- Designed a new offset-based strategy for dimension reduction of the above features.
- Developed two new scene character datasets for Chinese and Bengali scripts.
- Extensive experiments on 5 datasets of 3 scripts show the efficiency of the approach.
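The "Co-HOG from every patch" idea behind ConvCo-HOG can be sketched as below; the patch size, stride, bin count and single offset are illustrative assumptions (the paper extracts from every possible patch, i.e. stride 1, and pools several offsets).

```python
import numpy as np

def conv_cohog(img, patch=8, stride=4, bins=4, offset=(0, 1)):
    """ConvCo-HOG sketch: compute a Co-HOG descriptor (joint histogram of
    co-occurring quantized orientations) inside every patch of the image
    and concatenate the per-patch descriptors into one feature vector."""
    gy, gx = np.gradient(img.astype(float))
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    q = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    dy, dx = offset
    feats = []
    for y in range(0, img.shape[0] - patch + 1, stride):
        for x in range(0, img.shape[1] - patch + 1, stride):
            p = q[y:y + patch, x:x + patch]
            a, b = p[:patch - dy, :patch - dx], p[dy:, dx:]
            h = np.zeros((bins, bins))
            np.add.at(h, (a.ravel(), b.ravel()), 1)
            feats.append(h.ravel() / max(h.sum(), 1))
    return np.concatenate(feats)
```

Because each patch contributes its own normalized co-occurrence matrix, the concatenated vector preserves where in the character each orientation pattern occurs, at the cost of a much higher dimension (hence the highlighted dimension-reduction strategy).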
Document Analysis Systems | 2012
Bolan Su; Shijian Lu; Umapada Pal; Chew Lim Tan
Musical staff line detection and removal techniques detect the staff positions in musical documents and segment the musical score by removing those staff lines. This is an important preprocessing step for ensuing Optical Music Recognition tasks. This paper proposes an effective staff line detection and removal method that makes use of the global information of the musical document and models the staff line shape. It first estimates the staff height and staff space, and then models the shape of the staff lines by examining the orientation of the staff pixels. Finally, the estimated model is used to locate the staff lines and remove them. The proposed technique is simple, robust, and involves few parameters. It has been tested on the dataset of the recent staff removal competition held under the International Conference on Document Analysis and Recognition (ICDAR) 2011. Experimental results show the effectiveness and robustness of our proposed technique on musical documents with various types of deformations.
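The first step, estimating staff height and staff space, is commonly done from vertical run-lengths of the binarized page; the sketch below shows that standard approach, which may not be the paper's exact procedure, with illustrative function names.

```python
import numpy as np
from collections import Counter

def runs(col):
    """Values and lengths of consecutive runs in a 0/1 column vector."""
    edges = np.flatnonzero(np.diff(col)) + 1
    starts = np.r_[0, edges]
    lengths = np.diff(np.r_[starts, col.size])
    return col[starts], lengths

def staff_height_and_space(binary):
    """Estimate staff line height and staff space: over all columns, the
    most common black run length is the line thickness, and the most
    common white run length is the gap between adjacent staff lines."""
    black, white = Counter(), Counter()
    for col in binary.T:
        vals, lens = runs(col)
        for v, l in zip(vals, lens):
            (black if v else white)[int(l)] += 1
    return black.most_common(1)[0][0], white.most_common(1)[0][0]
```

These two numbers drive everything downstream: candidate staff pixels are black runs of roughly the estimated height, spaced roughly the estimated space apart.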
International Conference on Document Analysis and Recognition | 2011
Trung Quy Phan; Palaiahnakote Shivakumara; Bolan Su; Chew Lim Tan
In this paper, we propose a method based on gradient vector flow for video character segmentation. By formulating character segmentation as a minimum-cost path finding problem, the proposed method allows curved segmentation paths and is thus able to segment overlapping and touching characters caused by low contrast and complex backgrounds. Gradient vector flow is used in a new way to identify candidate cut pixels. A two-pass path finding algorithm is then applied, where the forward direction helps to locate potential cuts and the backward direction serves to remove the false cuts, i.e. those that go through the characters, while retaining the true cuts. Experimental results show that the proposed method outperforms an existing method on multi-oriented English and Chinese video text lines. The proposed method also helps to improve binarization results, which leads to a better character recognition rate.
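The minimum-cost path-finding core can be sketched as a simple dynamic program; here it runs over a generic cost map rather than the paper's gradient-vector-flow-derived cost, and the single-pass formulation omits the paper's two-pass false-cut removal.

```python
import numpy as np

def min_cost_vertical_path(cost):
    """Top-to-bottom minimum-cost path by dynamic programming. Each row
    the path may stay in the same column or shift one column left/right,
    which is what allows curved cuts between touching characters."""
    h, w = cost.shape
    acc = cost.astype(float).copy()          # accumulated cost table
    for y in range(1, h):
        left = np.r_[np.inf, acc[y - 1, :-1]]
        right = np.r_[acc[y - 1, 1:], np.inf]
        acc[y] += np.minimum(np.minimum(left, acc[y - 1]), right)
    x = int(np.argmin(acc[-1]))              # cheapest bottom cell
    path = [x]
    for y in range(h - 1, 0, -1):            # backtrack to the top row
        lo, hi = max(0, x - 1), min(w, x + 2)
        x = lo + int(np.argmin(acc[y - 1, lo:hi]))
        path.append(x)
    return path[::-1]                        # column index per row
```

Low-cost cells mark likely gaps between characters, so the cheapest path naturally threads through the gap even when it bends around overlapping strokes.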