Palaiahnakote Shivakumara
Information Technology University
Publications
Featured research published by Palaiahnakote Shivakumara.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011
Palaiahnakote Shivakumara; Trung Quy Phan; Chew Lim Tan
In this paper, we propose a method based on the Laplacian in the frequency domain for video text detection. Unlike many other approaches, which assume that text is horizontally oriented, our method is able to handle text of arbitrary orientation. The input image is first filtered with a Fourier-Laplacian filter. K-means clustering is then used to identify candidate text regions based on the maximum difference feature. The skeleton of each connected component helps to separate the different text strings from one another. Finally, text string straightness and edge density are used for false positive elimination. Experimental results show that the proposed method is able to handle graphics text and scene text of both horizontal and non-horizontal orientation.
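A minimal sketch (not the authors' code) of the first two stages, assuming OpenCV and NumPy: Laplacian filtering in the frequency domain followed by K-means on a maximum-difference map. The horizontal window, its size, and the K-means settings are illustrative assumptions.

```python
import numpy as np
import cv2

def fourier_laplacian(gray):
    # Laplacian applied in the frequency domain: multiply the spectrum
    # by -4*pi^2*(u^2 + v^2) and transform back
    f = np.fft.fft2(gray.astype(np.float64))
    u = np.fft.fftfreq(gray.shape[0])[:, None]
    v = np.fft.fftfreq(gray.shape[1])[None, :]
    return np.fft.ifft2(f * (-4.0 * np.pi ** 2) * (u ** 2 + v ** 2)).real

def candidate_text_mask(gray, win=11):  # win is an assumed window size
    lap = fourier_laplacian(gray)
    # Maximum difference: local max minus local min of the Laplacian
    # response over a 1 x win horizontal window
    kernel = np.ones((1, win), np.uint8)
    md = cv2.dilate(lap, kernel) - cv2.erode(lap, kernel)
    # K-means with k = 2; the cluster with the larger mean MD is taken as text
    samples = md.reshape(-1, 1).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(samples, 2, None, criteria, 3,
                                    cv2.KMEANS_PP_CENTERS)
    text = int(np.argmax(centers))
    return (labels.reshape(gray.shape) == text).astype(np.uint8) * 255
```

The skeleton-based string separation and the straightness/edge-density false-positive filters described above would operate on the connected components of this mask.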
Pattern Recognition | 2006
S. Noushath; G. Hemantha Kumar; Palaiahnakote Shivakumara
Although the 2DLDA algorithm obtains high recognition accuracy, a vital unresolved problem is that it needs a huge feature matrix for the task of face recognition. To overcome this problem, this paper presents an efficient approach for face image feature extraction, namely the (2D)^2LDA method. Experimental results on the ORL and Yale databases show that the proposed method obtains good recognition accuracy despite using far fewer coefficients.
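A minimal NumPy sketch of the (2D)^2LDA idea, under the usual image-as-matrix scatter definitions: 2DLDA along the rows gives a right projection R, the same procedure on transposed images gives a left projection L, and each image is reduced to the small matrix L^T A R. The pseudo-inverse solver and the dimensions p and q are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def twodlda(images, labels, d):
    """Top-d directions of image-matrix LDA: eigenvectors of pinv(Sw) @ Sb,
    with scatter built from image matrices rather than flattened vectors."""
    labels = np.asarray(labels)
    mean_all = images.mean(axis=0)
    n = images.shape[2]
    Sb = np.zeros((n, n))
    Sw = np.zeros((n, n))
    for c in np.unique(labels):
        Xc = images[labels == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * (mc - mean_all).T @ (mc - mean_all)
        for A in Xc:
            Sw += (A - mc).T @ (A - mc)
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    idx = np.argsort(-evals.real)[:d]
    return evecs[:, idx].real

def two_d_squared_lda(images, labels, p, q):
    # images: (N, m, n). R projects rows (n x q); L, learned on transposed
    # images, projects columns (m x p); features are p x q per image.
    R = twodlda(images, labels, q)
    L = twodlda(np.transpose(images, (0, 2, 1)), labels, p)
    return np.array([L.T @ A @ R for A in images])
```

Since the feature matrix is p x q rather than m x q, the coefficient count drops roughly by a factor of m/p, which is the saving the abstract refers to.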
International Conference on Document Analysis and Recognition | 2009
Trung Quy Phan; Palaiahnakote Shivakumara; Chew Lim Tan
In this paper, we propose an efficient text detection method based on the Laplacian operator. The maximum gradient difference value is computed for each pixel in the Laplacian-filtered image. K-means is then used to classify all the pixels into two clusters: text and non-text. For each candidate text region, the corresponding region in the Sobel edge map of the input image undergoes projection profile analysis to determine the boundaries of the text blocks. Finally, we employ empirical rules based on geometrical properties to eliminate false positives. Experimental results show that the proposed method is able to detect text of different fonts, contrasts, and backgrounds. Moreover, it outperforms three existing methods in terms of detection and false positive rates.
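The projection-profile step can be sketched as follows, assuming OpenCV and NumPy; the Sobel settings and the density thresholds row_frac and col_frac are hypothetical parameters for illustration, not the paper's empirical rules.

```python
import numpy as np
import cv2

def _runs(active):
    # Start/end indices of contiguous True runs in a 1-D boolean array
    d = np.diff(np.concatenate(([0], active.astype(np.int8), [0])))
    return list(zip(np.flatnonzero(d == 1), np.flatnonzero(d == -1)))

def text_boxes(gray, region_mask, row_frac=0.1, col_frac=0.05):
    """Projection profiles over the Sobel edge map of a candidate region:
    the horizontal profile splits text lines, and the vertical profile
    trims each line to its column extent."""
    sobel = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    edges = (np.abs(sobel) > 0) & (region_mask > 0)
    boxes = []
    rows = edges.sum(axis=1)
    if rows.max() == 0:
        return boxes
    for r0, r1 in _runs(rows > row_frac * rows.max()):
        cols = edges[r0:r1].sum(axis=0)
        for c0, c1 in _runs(cols > col_frac * cols.max()):
            boxes.append((c0, r0, c1, r1))  # (x0, y0, x1, y1)
    return boxes
```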
Pattern Recognition | 2010
Palaiahnakote Shivakumara; Weihua Huang; Trung Quy Phan; Chew Lim Tan
Detection of both scene text and graphic text in video images is gaining popularity in the area of information retrieval, for efficient indexing and understanding of video. In this paper, we explore a new idea of classifying low-contrast and high-contrast video images in order to detect accurate boundaries of the text lines in video images. In this work, high contrast refers to sharpness, while low contrast refers to dim intensity values in the video images. The method introduces heuristic rules based on a combination of filters and edge analysis for the classification purpose. The heuristic rules are derived from the fact that the number of Sobel edge components exceeds the number of Canny edge components in high-contrast video images, and vice versa for low-contrast video images. To demonstrate the use of this classification for video text detection, we implement a method based on Sobel edges and texture features for detecting text in video images. Experiments are conducted using video images containing both graphic text and scene text with different fonts, sizes, languages, and backgrounds. The results show that the proposed method outperforms existing methods in terms of detection rate, false alarm rate, misdetection rate, and inaccurate boundary rate.
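A compact sketch of the Sobel-vs-Canny component-count heuristic, assuming OpenCV; Otsu binarization of the Sobel magnitude and the fixed Canny thresholds are stand-in choices, not the paper's exact rules.

```python
import numpy as np
import cv2

def edge_components(binary):
    # Number of connected components in an 8-bit edge map, minus background
    n, _ = cv2.connectedComponents(binary)
    return n - 1

def classify_contrast(gray):
    """'high' if the Sobel edge map has more components than the Canny
    edge map, per the heuristic above; otherwise 'low'."""
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    mag = cv2.convertScaleAbs(np.sqrt(gx ** 2 + gy ** 2))
    _, sobel_bin = cv2.threshold(mag, 0, 255,
                                 cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    canny = cv2.Canny(gray, 100, 200)  # assumed thresholds
    return "high" if edge_components(sobel_bin) > edge_components(canny) else "low"
```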
International Conference on Document Analysis and Recognition | 2009
Palaiahnakote Shivakumara; Trung Quy Phan; Chew Lim Tan
In this paper, we propose a new method based on the wavelet transform, statistical features, and central moments for detecting both graphics text and scene text in video images. The method computes features from the LH, HL, and HH subbands of a single-level wavelet decomposition, and the computed features are fed to K-means clustering to separate text pixels from the background of the image. The average of the wavelet subbands and the output of K-means clustering help in identifying true text pixels in the image. Text blocks are detected based on an analysis of projection profiles. Finally, we introduce a few heuristics to eliminate false positives from the image. The robustness of the proposed method is tested by conducting experiments on a variety of images with low contrast, complex backgrounds, and different fonts and sizes of text. The experimental results show that the proposed method outperforms existing methods in terms of detection rate, false positive rate, and misdetection rate.
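A minimal sketch of the wavelet-plus-K-means stage using PyWavelets and OpenCV; the Haar basis and the simple mean of the detail bands are stand-ins for the paper's full statistical and central-moment features.

```python
import numpy as np
import pywt
import cv2

def wavelet_text_mask(gray):
    """Single-level DWT; the per-pixel mean of |LH|, |HL| and |HH| serves
    as a crude text-energy feature, clustered into text/non-text with
    K-means (k = 2)."""
    _, (LH, HL, HH) = pywt.dwt2(gray.astype(np.float32), 'haar')
    energy = (np.abs(LH) + np.abs(HL) + np.abs(HH)) / 3.0
    samples = energy.reshape(-1, 1).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(samples, 2, None, criteria, 3,
                                    cv2.KMEANS_PP_CENTERS)
    text = int(np.argmax(centers))
    mask = (labels.reshape(energy.shape) == text).astype(np.uint8) * 255
    # dwt2 halves each dimension, so upsample back to the input size
    return cv2.resize(mask, (gray.shape[1], gray.shape[0]),
                      interpolation=cv2.INTER_NEAREST)
```

The projection-profile block detection and false-positive heuristics described above would then run on this mask.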
Document Analysis Systems | 2008
Palaiahnakote Shivakumara; Weihua Huang; Chew Lim Tan
Detecting both graphic text and scene text in video images with complex backgrounds and low resolution is still a challenging and interesting problem for researchers in image processing and computer vision. In this paper, we present a novel technique for detecting both graphic text and scene text in video images by finding segments containing text in an input image and then using statistical features, such as vertical and horizontal edge bars within the segments, to detect true text blocks efficiently. To identify a segment containing text, heuristic rules are formed based on a combination of filters and edge analysis. Furthermore, the same rules are extended to grow the boundaries of a candidate segment in order to include the complete text in the input image. The experimental results show that the proposed technique performs better than existing methods on a number of metrics.
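The following sketch gauges the density of vertical and horizontal edge bars in a candidate segment; the Otsu binarization and the density cut-off are illustrative stand-ins for the paper's heuristic rules, not its actual feature definitions.

```python
import cv2

def bar_densities(gray_segment):
    """Fraction of pixels responding strongly to horizontal-gradient
    (vertical bars) and vertical-gradient (horizontal bars) filters."""
    gx = cv2.convertScaleAbs(cv2.Sobel(gray_segment, cv2.CV_64F, 1, 0))
    gy = cv2.convertScaleAbs(cv2.Sobel(gray_segment, cv2.CV_64F, 0, 1))
    _, vbars = cv2.threshold(gx, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, hbars = cv2.threshold(gy, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return vbars.mean() / 255.0, hbars.mean() / 255.0

def is_text_block(gray_segment, min_density=0.05):  # assumed cut-off
    # Text regions tend to show dense runs of both bar orientations
    v, h = bar_densities(gray_segment)
    return v > min_density and h > min_density
```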
Document Analysis Systems | 2012
Nabin Sharma; Palaiahnakote Shivakumara; Umapada Pal; Michael Myer Blumenstein; Chew Lim Tan
Text detection in video frames plays a vital role in enhancing the performance of information extraction systems, because the text in video frames helps in indexing and retrieving video efficiently and accurately. This paper presents a new method for arbitrarily-oriented text detection in video, based on dominant text pixel selection, text representatives, and region growing. The method uses the gradient direction and magnitude of the Sobel edge pixels of the input frame to obtain dominant text pixels. Edge components in the Sobel edge map corresponding to dominant text pixels are then extracted; we call these text representatives. We eliminate broken segments of each text representative to get candidate text representatives. The perimeter of each candidate text representative then grows along the text direction in the Sobel edge map to group neighboring text components into what we call word patches. The word patches are used to find the direction of the text lines, and the word patches are then expanded in the same direction in the Sobel edge map to group neighboring word patches and to restore missing text information. This results in the extraction of arbitrarily-oriented text from the video frame. To evaluate the method, we considered arbitrarily-oriented data, non-horizontal data, horizontal data, Hua's data, and the ICDAR 2003 competition data (camera images). The experimental results show that the proposed method outperforms an existing method in terms of recall and f-measure.
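A simplified sketch of the dominant-text-pixel selection, assuming OpenCV: keep Sobel edge pixels whose gradient magnitude exceeds the mean magnitude over all edge pixels. The Otsu-binarized Sobel magnitude standing in for the edge map and the mean cut-off are illustrative, and the paper's gradient-direction pairing across strokes is omitted.

```python
import numpy as np
import cv2

def dominant_text_pixels(gray):
    """Edge pixels with above-average gradient magnitude, as a crude
    proxy for the dominant text pixels described above."""
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    mag = np.hypot(gx, gy)
    edge_map = cv2.threshold(cv2.convertScaleAbs(mag), 0, 255,
                             cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1] > 0
    if not edge_map.any():  # flat frame: no edges at all
        return np.zeros(gray.shape, np.uint8)
    dominant = edge_map & (mag > mag[edge_map].mean())
    return dominant.astype(np.uint8) * 255
```

The text representatives would then be the Sobel edge components overlapping this mask, grown along the local text direction as the abstract describes.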
IEEE Transactions on Circuits and Systems for Video Technology | 2010
Palaiahnakote Shivakumara; Trung Quy Phan; Chew Lim Tan
In this paper, we propose new Fourier-statistical features (FSF) in RGB space for detecting text in video frames with unconstrained backgrounds and different fonts, scripts, and font sizes. The paper consists of two parts: automatic classification of text frames from a large database of text and non-text frames, and FSF in RGB space for text detection in the classified text frames. For text frame classification, we present novel features based on three visual cues, namely sharpness in filter-edge maps, straightness of the edges, and proximity of the edges, to identify a true text frame. For text detection in video frames, we present new Fourier-transform-based features in RGB space combined with statistical features; the computed FSF from the RGB bands are subjected to K-means clustering to separate text pixels from the background of the frame. Text blocks of the classified text pixels are determined by analyzing projection profiles. Finally, we introduce a few heuristics to eliminate false positives from the frame. The robustness of the proposed approach is tested by conducting experiments on a variety of frames with low contrast, complex backgrounds, and different fonts and sizes of text. Both our own test dataset and a publicly available dataset are used for the experiments. The experimental results show that the proposed approach is superior to existing approaches in terms of detection rate, false positive rate, and misdetection rate.
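A rough block-based sketch of Fourier-statistical features in RGB clustered with K-means; the block size and the mean/std statistics are a small stand-in for the paper's full FSF set, so treat this as an assumption-laden illustration only.

```python
import numpy as np
import cv2

def fsf_block_mask(img_bgr, block=16):  # block size is assumed
    """Per-band FFT-magnitude mean and std over non-overlapping blocks
    (6 features per block across B, G, R), clustered with K-means."""
    h, w = img_bgr.shape[:2]
    H, W = h // block, w // block
    feats = np.zeros((H, W, 6), np.float32)
    for b in range(3):
        band = img_bgr[:, :, b].astype(np.float32)
        for i in range(H):
            for j in range(W):
                blk = band[i * block:(i + 1) * block,
                           j * block:(j + 1) * block]
                m = np.abs(np.fft.fft2(blk))
                feats[i, j, 2 * b] = m.mean()
                feats[i, j, 2 * b + 1] = m.std()
    samples = feats.reshape(-1, 6)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(samples, 2, None, criteria, 3,
                                    cv2.KMEANS_PP_CENTERS)
    # Blocks in the cluster with higher overall Fourier energy are
    # treated as candidate text blocks
    text = int(np.argmax(centers.sum(axis=1)))
    return labels.reshape(H, W) == text
```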
International Conference on Pattern Recognition | 2008
Palaiahnakote Shivakumara; Weihua Huang; Chew Lim Tan
In this paper, we explore new edge features, such as straightness, for eliminating non-significant edges from the segmented text portion of a video frame, in order to detect accurate boundaries of the text lines in video images. To segment the complete text portions, the method introduces candidate text block selection from a given image. Heuristic rules are formed based on a combination of filters and edge analysis for identifying a candidate text block in the image. Furthermore, the same rules are extended to grow the boundary of a candidate text block in order to segment the complete text portions in the image. The experimental results show that the method outperforms an existing method on a number of metrics.
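One simple way to score edge straightness, assuming OpenCV contours as the edge components: the ratio of the chord between a contour's endpoints to its arc length, which approaches 1 for straight edges. Both the measure and the 0.8 cut-off are illustrative, not the paper's definition.

```python
import numpy as np
import cv2

def straightness(contour):
    """Chord-to-arc-length ratio of an open contour; ~1 means straight."""
    arc = cv2.arcLength(contour, closed=False)
    if arc == 0:
        return 0.0
    pts = contour.reshape(-1, 2).astype(np.float32)
    return float(np.linalg.norm(pts[0] - pts[-1]) / arc)

def keep_straight_edges(edge_map, min_straightness=0.8):  # assumed cut-off
    # Redraw only the edge components that pass the straightness test
    contours, _ = cv2.findContours(edge_map, cv2.RETR_LIST,
                                   cv2.CHAIN_APPROX_NONE)
    out = np.zeros_like(edge_map)
    for c in contours:
        if straightness(c) >= min_straightness:
            cv2.drawContours(out, [c], -1, 255, 1)
    return out
```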
International Conference on Document Analysis and Recognition | 2009
Palaiahnakote Shivakumara; Trung Quy Phan; Chew Lim Tan
Text detection in video images, particularly scene text detection, has received increasing attention, as it plays a vital role in video indexing and information retrieval. This paper proposes a new and robust gradient difference technique for detecting both graphics text and scene text in video images. The technique introduces the concept of zero crossings to determine the bounding boxes of the detected text lines, rather than using the conventional projection-profile-based method, which fails to fix bounding boxes when there is no proper spacing between the detected text lines. We demonstrate the capability of the proposed technique by conducting experiments on video images containing both graphics text and scene text with different font shapes and sizes, languages, text directions, backgrounds, and contrasts. Our experimental results show that the proposed technique outperforms existing methods in terms of detection rate on a large video image database.
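A sketch of the gradient-difference map and its zero-crossing row boundaries, assuming OpenCV and NumPy; the window size and the mean-centred row profile are illustrative choices, not the paper's exact construction.

```python
import numpy as np
import cv2

def gradient_difference(gray, win=11):  # win is an assumed window size
    # Local max minus local min of the horizontal gradient over a
    # 1 x win window around each pixel
    g = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    kernel = np.ones((1, win), np.uint8)
    return cv2.dilate(g, kernel) - cv2.erode(g, kernel)

def line_boundaries(gray):
    """Zero crossings of the mean-centred row profile of the gradient
    difference map mark transitions between text lines and background,
    replacing a hard projection-profile threshold."""
    gd = gradient_difference(gray)
    profile = gd.mean(axis=1)
    centred = profile - profile.mean()
    return np.flatnonzero(np.diff(np.sign(centred)) != 0)  # candidate rows
```

Unlike a fixed profile threshold, the zero-crossing rows adapt to the local energy level, which is what lets the method separate text lines even when they are tightly spaced.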