Sheikh Faisal Rashid
Kaiserslautern University of Technology
Publication
Featured research published by Sheikh Faisal Rashid.
Proceedings of the 4th International Workshop on Multilingual OCR | 2013
Sheikh Faisal Rashid; Marc-Peter Schambach; Jörg Rottland; Stephan von der Nüll
OCR of multi-font Arabic text is difficult due to large variations in character shapes from one font to another. It becomes even more challenging if the text is rendered at very low resolution. This paper describes a multi-font, low-resolution, open-vocabulary OCR system based on a multidimensional recurrent neural network architecture. For this work, we have developed various systems, trained on single-font/single-size, single-font/multi-size, and multi-font/multi-size data from the well-known Arabic Printed Text Image database (APTI). The evaluation tasks from the second Arabic text recognition competition, organized in conjunction with ICDAR 2013, have been adopted. Ten Arabic fonts in six font size categories are used for evaluation. Results show that the proposed method performs very well on the task of printed Arabic text recognition, even for very low-resolution and small-font-size images. Overall, the system yields above 99% recognition accuracy at character and word level for most of the printed Arabic fonts.
Document Analysis Systems | 2012
Sheikh Faisal Rashid; Faisal Shafait; Thomas M. Breuel
Optical character recognition (OCR) of machine-printed Latin script documents is widely claimed to be a solved problem. However, error-free OCR of degraded or noisy text is still challenging for modern OCR systems. Most recent approaches perform segmentation-based character recognition. This is tricky because segmentation of degraded text is itself problematic. This paper describes a segmentation-free text line recognition approach using a multilayer perceptron (MLP) and hidden Markov models (HMMs). A line-scanning neural network, trained with character-level contextual information and a special garbage class, is used to extract class probabilities at every successive pixel position. The output of this scanning neural network is decoded by HMMs to provide character-level recognition. In evaluations on a subset of the UNLV-ISRI document collection, we achieve 98.4% character recognition accuracy, which is statistically significantly better than the character recognition accuracies obtained from state-of-the-art open source OCR systems.
International Conference on Document Analysis and Recognition | 2011
Sheikh Faisal Rashid; Faisal Shafait; Thomas M. Breuel
Segmentation and recognition of screen-rendered text is a challenging task due to its low resolution (72 or 96 ppi) and the use of antialiased rendering. This paper evaluates Hidden Markov Model (HMM) techniques for OCR of low-resolution text, both on screen-rendered isolated characters and on screen-rendered text lines, and compares them with the performance of commercial and open source OCR systems. Results show that HMM-based methods reach the performance of other methods on screen-rendered text and yield above 98% character-level accuracy on both screen-rendered text lines and characters.
IEEE International Multitopic Conference | 2009
Sheikh Faisal Rashid; Syed Saqib Bukhari; Faisal Shafait; Thomas M. Breuel
Orientation detection is an important preprocessing step for accurate recognition of text from document images. Many existing orientation detection techniques are based on the fact that in Roman script text ascenders are more likely to occur than descenders, but this approach is not applicable to documents in other scripts such as Urdu and Arabic. In this paper, we propose a discriminative learning approach for orientation detection of Urdu documents with varying layouts and fonts. The main advantage of our approach is that it can be applied to documents in other scripts easily and accurately. Our approach is based on classifying the orientation of individual connected components in the document image, and then determining the orientation of the page image via majority count. A convolutional neural network is trained as the discriminative learning model on the labeled Urdu books dataset with four target orientations: 0, 90, 180 and 270 degrees. We demonstrate the effectiveness of our method on a dataset of Urdu documents categorized into the layouts of book, novel and poetry. We achieved 100% orientation detection accuracy on a test set of 328 document images.
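The majority-count step described above can be illustrated with a minimal sketch. It assumes the per-component orientation predictions are already available as a list of angles; the function name `page_orientation` and the tie-breaking rule are hypothetical choices made for this example and are not taken from the paper.

```python
from collections import Counter

# The four target orientations named in the abstract, in degrees.
ORIENTATIONS = (0, 90, 180, 270)


def page_orientation(component_predictions):
    """Return the most frequent orientation among per-component predictions.

    Ties are broken in favour of the smaller angle (an assumption made
    here only so that the result is deterministic).
    """
    if not component_predictions:
        raise ValueError("no connected components to vote on")
    counts = Counter(component_predictions)
    return max(ORIENTATIONS, key=lambda angle: (counts[angle], -angle))


# Hypothetical classifier outputs for eight connected components.
votes = [0, 0, 90, 0, 180, 0, 270, 90]
print(page_orientation(votes))
```

Because the page label is an aggregate over many components, a modest number of misclassified components does not change the page-level decision.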
International Conference on Document Analysis and Recognition | 2015
Riaz Ahmad; Muhammad Zeshan Afzal; Sheikh Faisal Rashid; Marcus Liwicki; Thomas M. Breuel
Optical Character Recognition (OCR) of cursive scripts like Pashto and Urdu is difficult due to the presence of complex ligatures and connected writing styles. In this paper, we evaluate and compare different approaches for the recognition of such complex ligatures. The approaches include Hidden Markov Model (HMM), Long Short-Term Memory (LSTM) network and Scale-Invariant Feature Transform (SIFT). The current state of the art in cursive script recognition assumes constant scale without any rotation, while real-world data contain rotation and scale variations. This research aims to evaluate the performance of sequence classifiers like HMM and LSTM and to compare it with that of a descriptor-based classifier like SIFT. In addition, we assess the performance of these methods against scale and rotation variations in cursive script ligatures. Moreover, we introduce a database of 480,000 images containing 1000 unique ligatures or sub-words of Pashto. In this database, each ligature has 40 scale and 12 rotation variations. The evaluation results show significantly improved performance of LSTM over HMM and over a traditional feature extraction technique such as SIFT.
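The database size quoted above is consistent with every scale being combined with every rotation for each ligature. The short check below makes that arithmetic explicit; the full cross product is an assumption inferred from the numbers, and the variable names are illustrative only.

```python
# Figures taken from the abstract.
num_ligatures = 1000
num_scales = 40
num_rotations = 12

# Assumption: each ligature is rendered at every (scale, rotation) pair.
variants_per_ligature = num_scales * num_rotations
total_images = num_ligatures * variants_per_ligature

print(variants_per_ligature)  # 480
print(total_images)           # 480000
```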
International Conference on Document Analysis and Recognition | 2013
Marc-Peter Schambach; Sheikh Faisal Rashid
Cursive handwriting recognition is still a hot topic of research, especially for non-Latin scripts. One of the techniques that yields the best recognition results is based on recurrent neural networks, with neurons modeled by long short-term memory (LSTM) cells and alignment of the label sequence to the output sequence performed by a connectionist temporal classification (CTC) layer. However, network training is time consuming, unstable, and prone to over-adaptation. One of the reasons is the bootstrap process, which aligns the label data more or less randomly in early training iterations. As a result, the emission peak positions within a character are located unpredictably. But positions near the center of a character are more desirable: in theory, they better model the properties of a character. The solution presented here is to guide the back-propagation training in early iterations: character alignment is enforced by replacing the forward-backward alignment with fixed character positions, either pre-segmented or equally distributed. After a number of guided iterations, training may be continued with standard dynamic alignment. A series of experiments is performed to answer the following questions: Can peak positions be controlled in the long run? Can training iterations be reduced, getting results faster? Is training more stable? And finally: Do defined character positions lead to better recognition performance?
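As a rough illustration of the "equally distributed" variant of fixed character positions, the sketch below places one emission peak per label at the center of an equal-width segment of the output sequence and fills all other frames with a blank symbol. This is a simplified reading of the abstract and not the authors' implementation; the function name, the blank symbol, and the rounding rule are assumptions made for the example.

```python
def equally_distributed_targets(labels, num_frames, blank="-"):
    """Assign each label to the center frame of its equal-width segment.

    Returns a list of length num_frames holding one target per frame:
    the label at its peak position and the blank symbol elsewhere.
    """
    if num_frames < len(labels):
        raise ValueError("need at least one frame per label")
    targets = [blank] * num_frames
    if not labels:
        return targets
    segment = num_frames / len(labels)  # segment width in frames (float)
    for i, label in enumerate(labels):
        center = int((i + 0.5) * segment)  # center frame of segment i
        targets[center] = label
    return targets


# Three labels spread over twelve output frames.
print("".join(equally_distributed_targets("cat", 12)))  # --c---a---t-
```

Per-frame targets of this kind could stand in for the forward-backward alignment during the guided iterations, after which training would switch back to the standard dynamic alignment.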
International Conference on Image Processing | 2010
Sheikh Faisal Rashid; Faisal Shafait; Thomas M. Breuel
Document script recognition is one of the important preprocessing steps in a multilingual optical character recognition (MOCR) system. An MOCR system requires prior knowledge of the script to accurately recognize multilingual text in a single document. In multilingual documents, two scripts can be mixed together within a single text line. Many existing script recognition methods lack the ability to recognize multiple scripts mixed within a single text line. Moreover, these methods usually use script-dependent features for script recognition, thereby limiting their scope to that particular script. In this paper, we propose a discriminative learning approach for multi-script recognition at the connected component level using a convolutional neural network. The convolutional neural network combines the feature extraction and script recognition processes in one step, and discriminative features for script recognition are extracted and learned as convolutional kernels from raw input. This eliminates the need for manually defining discriminative features for particular scripts. Results show above 95% script recognition accuracy at the connected component level on datasets of Greek-Latin and Arabic-Latin multi-script documents and Antiqua-Fraktur documents. The proposed method can be easily adapted to different scripts.
Proceedings of SPIE | 2010
Sheikh Faisal Rashid; Faisal Shafait; Thomas M. Breuel
In the current study, we examine how letter permutation affects visual recognition of words in two orthographically dissimilar languages, Urdu and German. We present the hypothesis that recognition or reading of permuted and non-permuted words are two distinct mental-level processes, and that people use different strategies in handling permuted words as compared to normal words. A comparison between the reading behavior of people in these languages is also presented. We present our study in the context of dual-route theories of reading, and we observe that the dual-route theory is consistent with our hypothesis of a distinction in the underlying cognitive behavior for reading permuted and non-permuted words. We conducted three experiments in lexical decision tasks to analyze how reading is degraded or affected by letter permutation. We performed analysis of variance (ANOVA), a distribution-free rank test, and a t-test to determine the significance of differences in response time latencies for the two classes of data. Results showed that the recognition accuracy for permuted words decreased by 31% for Urdu and by 11% for German. We also found a considerable difference in reading behavior between cursive and alphabetic languages, and we observed that reading of Urdu is comparatively slower than reading of German due to the characteristics of the cursive script.
International Conference on Pattern Recognition | 2012
Adnan Ul-Hasan; Syed Saqib Bukhari; Sheikh Faisal Rashid; Faisal Shafait; Thomas M. Breuel
International Conference on Document Analysis and Recognition | 2017
Sheikh Faisal Rashid; Abdullah Akmal; Muhammad Adnan; Ali Adnan Aslam; Andreas Dengel