
Publication


Featured research published by Ayatullah Faruk Mollah.


IEEE India Conference | 2009

A Fast Skew Correction Technique for Camera Captured Business Card Images

Ayatullah Faruk Mollah; Srinka Basu; Nibaran Das; Ram Sarkar; Mita Nasipuri; Mahantapas Kundu

Performance of an Optical Character Recognition engine may be affected if images are skewed or distorted due to perspective projection. In this paper, a computationally efficient skew correction technique is presented for text regions extracted from business card images. The skew angle is estimated by analyzing the bottom and/or top profiles (height/depth from a horizontal base line) of a text region. After rejecting some of the profile elements based on mean and mean deviation, three reference profile elements are chosen, from which three skew angles are obtained; their average is taken as the computed skew angle. Besides being faster, the technique has acceptable accuracy, and the effect of perspective distortion is normalized. Experiments show that the average deviation of the skew-corrected text lines from the ground truth is within ±3 degrees, while the average processing time is between 17 and 110 milliseconds for 0.45-3.0 megapixel images.
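The profile-based estimation described above can be sketched in a few lines of NumPy. This is a toy reconstruction, not the authors' code; the exact rejection rule and the choice of the three reference elements (first, middle, and last surviving profile elements here) are assumptions.

```python
import numpy as np

def estimate_skew_angle(region):
    """Estimate the skew angle (degrees) of a binary text region
    (True = text) from its bottom profile: reject profile elements
    deviating from the mean by more than the mean deviation, pick
    three reference elements, and average the three pairwise angles."""
    h, _ = region.shape
    cols = np.where(region.any(axis=0))[0]
    # bottom profile: height of the lowest text pixel above the base line
    profile = np.array([h - 1 - np.where(region[:, c])[0].max() for c in cols])
    mean = profile.mean()
    md = np.abs(profile - mean).mean()
    keep = np.abs(profile - mean) <= md          # outlier rejection
    xs, ys = cols[keep], profile[keep]
    i0, i1, i2 = 0, len(xs) // 2, len(xs) - 1    # three reference elements
    pairs = [(i0, i1), (i1, i2), (i0, i2)]
    angles = [np.degrees(np.arctan2(float(ys[b] - ys[a]), float(xs[b] - xs[a])))
              for a, b in pairs if xs[b] != xs[a]]
    return float(np.mean(angles))
```

Averaging three angles taken between well-separated reference elements, rather than fitting every profile element, is what keeps the per-region cost low.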


International Conference on Emerging Applications of Information Technology | 2012

Text detection from camera captured images using a novel fuzzy-based technique

Ayatullah Faruk Mollah; Subhadip Basu; Mita Nasipuri

Text information extraction from camera captured text-embedded images has a wide variety of applications. In this paper, a fuzzy membership based robust text detection technique is presented. The given image is partitioned into blocks, each of which is assigned two types of fuzzy memberships. The membership values are post-processed for finer classification of each block as foreground or background. Adjacent foreground blocks form foreground components. Then, a feature-based Multi-Layer Perceptron is used to classify the foreground components as text or non-text. Experiments show that the number of false negatives is very small compared to that of false positives. The technique yields average recall and precision rates of 99.75% and 93.75%, respectively.
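The block-level stage can be illustrated with a rough sketch. The two membership functions below (intensity contrast and horizontal edge density) and every threshold are invented stand-ins, since the paper's actual fuzzy memberships are not given in the abstract:

```python
import numpy as np

def classify_blocks(gray, bsize=16, t=0.5):
    """Toy sketch of block-level foreground detection: partition a
    grayscale image into bsize x bsize blocks, assign each block two
    hypothetical membership scores, and mark it as foreground when
    both scores pass their (assumed) thresholds."""
    h, w = gray.shape
    fg = np.zeros((h // bsize, w // bsize), dtype=bool)
    gx = np.abs(np.diff(gray.astype(float), axis=1))  # horizontal gradients
    for i in range(fg.shape[0]):
        for j in range(fg.shape[1]):
            blk = gray[i*bsize:(i+1)*bsize, j*bsize:(j+1)*bsize].astype(float)
            contrast = (blk.max() - blk.min()) / 255.0        # membership 1
            edges = gx[i*bsize:(i+1)*bsize, j*bsize:(j+1)*bsize - 1]
            edge_density = np.mean(edges > 32)                # membership 2
            fg[i, j] = contrast >= t and edge_density >= 0.05
    return fg
```

In the paper, adjacent foreground blocks would then be merged into components and passed to the MLP for text/non-text classification.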


Journal of Intelligent Systems | 2013

Handheld Mobile Device Based Text Region Extraction and Binarization of Image Embedded Text Documents

Ayatullah Faruk Mollah; Subhadip Basu; Mita Nasipuri; Dipak Kumar Basu

Effective text region extraction and binarization of image-embedded text documents on mobile devices with limited computational resources is an open research problem. In this paper, we present one such technique for preprocessing images captured with the built-in cameras of handheld devices, with the aim of developing an efficient Business Card Reader. At first, the card image is processed to isolate foreground components. These foreground components are classified as either text or non-text using different feature descriptors of texts and images. The non-text components are removed and the textual ones are binarized with a fast adaptive algorithm. Specifically, we propose new techniques (targeted to mobile devices) for (i) foreground component isolation, (ii) text extraction, and (iii) binarization of text regions from camera captured business card images. Experiments with business card images of various resolutions show that the present technique yields better accuracy and involves lower computational overhead than the state-of-the-art. We achieve optimum text/non-text separation performance with images of resolution 800×600 pixels, with an average recall rate of 93.90% and a precision rate of 96.84%. It involves a peak memory consumption of 0.68 MB and a processing time of 0.102 seconds on a moderately powerful notebook, and 4 seconds of processing time on a PDA.


International Journal of Image and Graphics | 2013

Handheld Device-Based Character Recognition System for Camera Captured Images

Ayatullah Faruk Mollah; Subhadip Basu; Mita Nasipuri

A novel character recognition system aimed at embedding into handheld devices is presented. At first, text regions (TRs) are isolated from the camera captured image by segmentation. Then, these TRs are rotated by the skew angles estimated for each of them by a self-developed fast skew correction technique. After that, the skew-corrected regions are segmented and recognized. Using Tesseract as the plugged-in recognition engine, we obtain a recognition accuracy of 93.51% for business card images. Recognized results are then reorganized to restore the original layout. Experiments reveal that the proposed system is computationally efficient and its memory consumption is significantly low, making it applicable on handheld devices.


International Conference on Information Processing | 2012

Computationally Efficient Implementation of Convolution-Based Locally Adaptive Binarization Techniques

Ayatullah Faruk Mollah; Subhadip Basu; Mita Nasipuri

One of the most important steps of document image processing is binarization. The computational requirements of locally adaptive binarization techniques make them unsuitable for devices with limited computing facilities. In this paper, we present a computationally efficient implementation of convolution-based locally adaptive binarization techniques that keeps the performance comparable to the original implementation. The computational complexity has been reduced from O(W²N²) to O(WN²), where W×W is the window size and N×N is the image size. Experiments over benchmark datasets show that the computation time is reduced by 5 to 15 times, depending on the window size, while memory consumption remains the same with respect to the state-of-the-art algorithmic implementation.
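The complexity reduction can be illustrated with running column sums: maintaining per-column sums of a W-row band makes each pixel's W×W mean an O(W) sum of W column totals instead of an O(W²) window scan, giving O(WN²) overall. The sketch below uses simple mean thresholding as a stand-in for the convolution-based rules (Niblack, Sauvola, etc.) the paper targets, and edge padding is an assumption:

```python
import numpy as np

def local_mean_binarize(img, W=15, k=0.9):
    """Sketch of the efficiency idea: keep per-column running sums of a
    W-row band so each pixel's W x W mean costs O(W) instead of O(W^2).
    A pixel is set white when it is at least k times its local mean."""
    r = W // 2
    p = np.pad(img.astype(float), r, mode='edge')
    out = np.zeros_like(img, dtype=np.uint8)
    colsum = p[:W, :].sum(axis=0)                # band for output row 0
    for i in range(img.shape[0]):
        if i > 0:                                # slide band down one row: O(N)
            colsum += p[i + W - 1, :] - p[i - 1, :]
        for j in range(img.shape[1]):
            mean = colsum[j:j + W].sum() / (W * W)   # O(W) per pixel
            out[i, j] = 255 if img[i, j] >= k * mean else 0
    return out
```

A full prefix-sum (integral image) approach would go further, to O(N²) total, at the cost of storing the summed-area table.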


Archive | 2019

Distance Transform-Based Stroke Feature Descriptor for Text Non-text Classification

Tauseef Khan; Ayatullah Faruk Mollah

Text embedded in natural scene or document images captured with camera devices is often the most informative region for communication. Extraction of text regions from such images is the primary and fundamental task in obtaining the textual content present in them. Classifying foreground objects as text or non-text elements is one of the significant modules in scene text localization, and stroke width is an important discriminating feature of text blocks. In this paper, a distance transform-based stroke feature descriptor is reported for component-level classification of foreground components obtained from input images. Potential stroke pixels are identified from the distance map of a component using a strict staircase method, and the distribution of distance values of such pixels is used to design the feature descriptor. Finally, we classify the components using a neural network-based classifier. Experimental results show that component classification accuracy is more than 88%, which is quite impressive in practical scenarios.
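The descriptor's pipeline can be sketched roughly as follows. The two-pass city-block distance transform and the ridge (local-maximum) test are stand-ins; in particular, the "strict staircase" selection is not detailed in the abstract, so the ridge rule here is an assumption:

```python
import numpy as np

def stroke_width_histogram(comp, bins=8):
    """Toy reconstruction: a two-pass city-block distance transform
    over a binary component (True = stroke pixel), then a normalized
    histogram of distances at ridge pixels (local maxima of the
    distance map), approximating half the local stroke width."""
    h, w = comp.shape
    d = np.where(comp, h + w, 0).astype(int)     # background = 0
    for i in range(h):                           # forward pass
        for j in range(w):
            if comp[i, j]:
                if i > 0:
                    d[i, j] = min(d[i, j], d[i - 1, j] + 1)
                if j > 0:
                    d[i, j] = min(d[i, j], d[i, j - 1] + 1)
    for i in range(h - 1, -1, -1):               # backward pass
        for j in range(w - 1, -1, -1):
            if i < h - 1:
                d[i, j] = min(d[i, j], d[i + 1, j] + 1)
            if j < w - 1:
                d[i, j] = min(d[i, j], d[i, j + 1] + 1)
    dp = np.pad(d, 1)
    nb_max = np.maximum.reduce(
        [dp[:-2, 1:-1], dp[2:, 1:-1], dp[1:-1, :-2], dp[1:-1, 2:]])
    ridge = d[comp & (d >= nb_max)]              # assumed stroke-pixel rule
    hist, _ = np.histogram(ridge, bins=bins, range=(1, bins + 1))
    return hist / max(hist.sum(), 1)
```

Text components tend to concentrate ridge distances in a narrow band (near-constant stroke width), which is what makes the distribution discriminative.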


Archive | 2019

Script Identification from Camera-Captured Multi-script Scene Text Components

Madhuram Jajoo; Neelotpal Chakraborty; Ayatullah Faruk Mollah; Subhadip Basu; Ram Sarkar

Identification of script from multi-script text components of camera-captured images is an emerging research field. Here, the challenges are mainly twofold: (1) typical challenges of camera-captured images such as blur, uneven illumination, and complex background, and (2) challenges related to the shape, size, and orientation of texts written in different scripts. In this work, an effective feature set consisting of both shape-based and texture-based features is designed for script classification. An in-house scene text data set comprising 300 text boxes written in three scripts, namely Bangla, Devanagari, and Roman, is prepared. The feature set is evaluated with five popular classifiers, and the highest accuracy of 90% is achieved with the Multi-Layer Perceptron (MLP) classifier, which is reasonably satisfactory considering the domain complexity.


Archive | 2019

Multi-lingual Text Localization from Camera Captured Images Based on Foreground Homogeneity Analysis

Indra Narayan Dutta; Neelotpal Chakraborty; Ayatullah Faruk Mollah; Subhadip Basu; Ram Sarkar

Detecting and localizing multi-lingual text regions in natural scene images is a challenging task due to variation in the texture properties of the image and the geometric properties of multi-lingual text. In this work, we explore the possibility of identifying and localizing text regions based on their degree of homogeneity compared to the non-text regions of the image. The red, green, and blue channels and the gray levels are quantized into bins, each represented by a binary image; the connected components of these binary images undergo several elimination steps, after which the remaining candidate text regions are distinguished from non-text regions and localized. We evaluated the proposed method on our camera captured image collection containing multi-lingual texts in English, Bangla, Hindi, and Oriya, and observed an F-measure of 0.69 in the best case, where the image contains a good number of candidate text regions.
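The binning step can be sketched in a few lines; the bin count and the equal-width quantization are assumptions, and the subsequent connected-component elimination rules are omitted since the abstract does not detail them:

```python
import numpy as np

def bin_layers(gray, n_bins=4):
    """Quantize gray levels into n_bins equal-width bins and return
    one binary image per bin. Connected components of each layer
    would then be filtered as candidate text regions. The same idea
    applies to each of the R, G, and B channels."""
    edges = np.linspace(0, 256, n_bins + 1)
    idx = np.digitize(gray, edges[1:-1])   # bin index per pixel, 0..n_bins-1
    return [(idx == b) for b in range(n_bins)]
```

Pixels of homogeneous text strokes fall into the same bin, so text tends to survive as compact connected components within a single layer.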


Archive | 2018

An Automatic Annotation Scheme for Scene Text Archival Applications

Ayatullah Faruk Mollah; Subhadip Basu; Mita Nasipuri

Smart automated management of, and access to, the ever-increasing number of scene text images is a pressing need to help individuals and organizations save time and energy. Text embedded in such an image is an important descriptor of the image itself. In this paper, a novel scheme for automatic generation of annotations by OCRing scene text images is presented, and its performance is demonstrated on a smart infobase, a knowledge base designed with a trie data structure. A neuro-fuzzy approach is used for text detection, and a multi-layer perceptron is incorporated for character recognition. Appropriate post-processing has increased the classification performance from 90.73% to 96.86% (higher than Tesseract 3.01, which yields 93.51%). Q-gram based index keys are generated from the OCR'd text and indexed in the infobase, enabling appropriate relevance scoring. Besides 'query text', the system also supports 'query image'. The retrieval engine returns scene images in decreasing order of relevance. The performance is successfully demonstrated on a set of 100 camera captured scene text images, and the system works satisfactorily within the present scope of applications.
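Q-gram key generation can be sketched simply; the normalization (lowercasing, stripping non-alphanumerics) and the default q = 3 are assumptions, as the paper's exact keying scheme is not given in the abstract:

```python
def qgrams(text, q=3):
    """Generate q-gram index keys from OCR'd text: normalize, then
    emit every contiguous q-character substring (duplicates removed,
    first-occurrence order preserved). Keys like these tolerate OCR
    errors, since a single misrecognized character only corrupts q
    of the keys rather than the whole word."""
    s = ''.join(ch for ch in text.lower() if ch.isalnum())
    seen, keys = set(), []
    for i in range(len(s) - q + 1):
        g = s[i:i + q]
        if g not in seen:
            seen.add(g)
            keys.append(g)
    return keys
```

Each key would be inserted into the trie-backed infobase pointing at the source image, and a query's keys matched against it for relevance scoring.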


CVIP (1) | 2018

A Novel Text Localization Scheme for Camera Captured Document Images

Tauseef Khan; Ayatullah Faruk Mollah

In this paper, a hybrid model for detecting text regions from scene images as well as document images is presented. At first, the background is suppressed to isolate foreground regions. Then, morphological operations are applied on the isolated foreground regions to ensure appropriate region boundaries for such objects. Statistical features are extracted from these objects to classify them as text or non-text using a multi-layer perceptron. Classified text components are localized, and non-text ones are ignored. Experimenting on a data set of 227 camera captured images, we find an object isolation accuracy of 0.8638 and a text/non-text classification accuracy of 0.9648. For images with a near-homogeneous background, the present method yields reasonably satisfactory accuracy for practical applications.

Collaboration


Dive into Ayatullah Faruk Mollah's collaborations.
