Sabri A. Mahmoud
King Fahd University of Petroleum and Minerals
Publication
Featured research published by Sabri A. Mahmoud.
Signal Processing | 1995
Badr Al-Badr; Sabri A. Mahmoud
Research work on Arabic optical text recognition (AOTR), although lagging that of other languages, is becoming more intensive than before, and commercial systems for AOTR are becoming available. This paper presents a comprehensive survey and bibliography of research on AOTR, covering all the research publications on AOTR to which the authors had access. This paper introduces the general topic of optical character recognition (OCR) and highlights the characteristics of Arabic text. It also presents a historical review of Arabic text recognition systems. Further, this paper reports on the state of the art in AOTR research and lists the specifications of commercially available systems for AOTR. In this paper, we first underline the capabilities of different AOTR systems, and then introduce a five-stage model for AOTR systems and classify research work according to this model. We devote a section to each of the stages of this model: preprocessing, segmentation, feature extraction, classification, and post-processing. In the preprocessing section, we emphasize handling degraded documents and thinning of Arabic text. In the segmentation section, we discuss methods of segmenting Arabic text and categorize the methods into five general approaches. In the feature extraction and classification sections, we highlight the main techniques and analyze AOTR research works based on those techniques. We then discuss approaches for post-processing and show their relation to the Arabic language. We conclude by pointing out problems and directions for future research on AOTR.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1994
I. S. I. Abuhaiba; Sabri A. Mahmoud; Roger J. Green
An automatic off-line character recognition system for handwritten cursive Arabic characters is presented. A robust noise-independent algorithm is developed that yields skeletons that reflect the structural relationships of the character components. The character skeleton is converted to a tree structure suitable for recognition. A set of fuzzy constrained character graph models (FCCGMs), which tolerate large variability in writing, is designed. These models are graphs, with fuzzily labeled arcs used as prototypes for the characters. A set of rules is applied in sequence to match a character tree to an FCCGM. Arabic handwritings of four writers were used in the learning and testing stages. The system proved to be powerful in tolerance to variable writing, speed, and recognition rate.
Signal Processing | 2008
Husni Al-Muhtaseb; Sabri A. Mahmoud; Rami Qahwaji
This paper describes a technique for automatic recognition of off-line printed Arabic text using Hidden Markov Models. In this work different sizes of overlapping and non-overlapping hierarchical windows are used to generate 16 features from each vertical sliding strip. Eight different Arabic fonts were used for testing (viz. Arial, Tahoma, Akhbar, Thuluth, Naskh, Simplified Arabic, Andalus, and Traditional Arabic). It was experimentally proven that different fonts have their highest recognition rates at different numbers of states (5 or 7) and codebook sizes (128 or 256). Arabic text is cursive, and each character may have up to four different shapes based on its location in a word. This research work considered each shape as a different class, resulting in a total of 126 classes (compared to 28 Arabic letters). The achieved average recognition rates were between 98.08% and 99.89% for the eight experimental fonts. The main contributions of this work are the novel hierarchical sliding window technique using only 16 features for each sliding window, considering each shape of Arabic characters as a separate class, bypassing the need for segmenting Arabic text, and its applicability to other languages.
ACM Computing Surveys | 2013
Mohammad Tanvir Parvez; Sabri A. Mahmoud
Research in offline Arabic handwriting recognition has increased considerably in the past few years. This is evident from the numerous research results published recently in major journals and conferences in the area of handwriting recognition. Feature and classification techniques utilized in recent research work have diversified noticeably compared to the past. Moreover, more effort has been devoted in the last few years to constructing different databases for Arabic handwriting recognition. This article provides a comprehensive survey of recent developments in Arabic handwriting recognition. The article starts with a summary of the characteristics of Arabic text, followed by a general model for an Arabic text recognition system. The databases used for Arabic text recognition are then discussed. Research on the preprocessing phase, such as text representation, baseline detection, and line, word, character, and subcharacter segmentation algorithms, is presented. Different feature extraction techniques used in Arabic handwriting recognition are identified and discussed. Different classification approaches, like HMM, ANN, SVM, k-NN, syntactical methods, etc., are discussed in the context of Arabic handwriting recognition. Works on Arabic lexicon construction and spell checking are presented in the postprocessing phase. Several summary tables of published research work are provided for the Arabic text databases used and for reported results on Arabic character, word, numeral, and text recognition. These tables summarize the features, classifiers, data, and reported recognition accuracy for each technique. Finally, we discuss some future research directions in Arabic handwriting recognition.
Pattern Recognition | 1991
Sabri A. Mahmoud; Ibrahim AbuHaiba; Roger J. Green
Character skeletonization is an essential step in many character recognition techniques. In this paper, skeletonization of Arabic characters is addressed. While other techniques employ thinning algorithms, in this paper clustering of Arabic characters is used. The use of a clustering technique (an expensive step) is justified by the properties of the generated skeleton, which has the advantages of other thinning techniques and is robust. The presented technique may be used in the modeling and training stages to reduce the processing time of the recognition system.
Pattern Recognition | 2014
Sabri A. Mahmoud; Irfan Ahmad; Wasfi G. Al-Khatib; Mohammad Alshayeb; Mohammad Tanvir Parvez; Volker Märgner; Gernot A. Fink
A comprehensive Arabic handwritten text database is an essential resource for Arabic handwritten text recognition research. This is especially true due to the lack of such a database for Arabic handwritten text. In this paper, we report our comprehensive Arabic offline Handwritten Text database (KHATT) consisting of 1000 handwritten forms written by 1000 distinct writers from different countries. The forms were scanned at 200, 300, and 600 dpi resolutions. The database contains 2000 randomly selected paragraphs from 46 sources, 2000 minimal text paragraphs covering all the shapes of Arabic characters, and optionally written paragraphs on open subjects. The 2000 random text paragraphs consist of 9327 lines. The database forms were randomly divided into 70%, 15%, and 15% sets for training, testing, and verification, respectively. This enables researchers to use the database and compare their results. A formal verification procedure is implemented to align the handwritten text with its ground truth at the form, paragraph, and line levels. The verified ground truth database contains meta-data describing the written text at the page, paragraph, and line levels in text and XML formats. Tools to extract paragraphs from pages and segment paragraphs into lines are developed. In addition, we present our experimental results on the database using two classifiers, viz. Hidden Markov Models (HMM) and our novel syntactic classifier. The database is made freely available to researchers world-wide for research in various handwriting-related problems such as text recognition, writer identification and verification, forms analysis, pre-processing, and segmentation. Several international research groups and researchers have acquired the database for use in their research so far.
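The 70% / 15% / 15% random division of forms described above can be illustrated with a minimal sketch. The function name `split_forms`, the integer form IDs, and the fixed seed are assumptions made for illustration; the abstract does not say how the KHATT split was actually drawn.

```python
import random

def split_forms(form_ids, seed=0):
    """Shuffle form IDs and divide them 70% / 15% / 15% (hypothetical helper)."""
    ids = list(form_ids)
    random.Random(seed).shuffle(ids)      # seeded so the split is reproducible
    n_train = len(ids) * 70 // 100        # integer arithmetic avoids float rounding
    n_test = len(ids) * 15 // 100
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    verification = ids[n_train + n_test:] # remainder goes to the verification set
    return train, test, verification

# 1000 forms, as reported in the abstract; the IDs 1..1000 are assumed.
train, test, verification = split_forms(range(1, 1001))
print(len(train), len(test), len(verification))  # 700 150 150
```

Splitting at the form level keeps every paragraph and line of one form in the same subset, which is one plausible reading of "the database forms were randomly divided".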
Signal Processing | 2009
Sameh M. Awaidah; Sabri A. Mahmoud
This paper describes a technique for the recognition of optical off-line handwritten Arabic (Indian) numerals using hidden Markov models (HMM). Features that measure the image characteristics at local, intermediate, and large scales were applied. Gradient, structural, and concavity features at the sub-regions level are extracted and used as the features for the Arabic (Indian) numeral. Several experiments were conducted for estimating the suitable number of image divisions, and the best combination of features using the HMM classifier. A number of experiments were conducted to estimate the best number of states and codebook sizes in terms of the highest recognition rate possible. In this work, we did not follow the general trend of using the sliding window technique with HMM. Instead, a multi-resolution feature extraction approach was implemented on the whole digit. A database of 44 writers, with 48 samples per digit resulting in a database of 21120 samples was used. The achieved average recognition rate is 99%. The classification errors were analysed and attributed to bad data, different writing styles of some digits, errors between digit pairs, and genuine errors. The presented technique, which is writer independent, proved to be effective in the automatic recognition of Arabic (Indian) numerals.
IEEE Transactions on Acoustics, Speech, and Signal Processing | 1988
Sabri A. Mahmoud; Mostafa S. Afifi; Roger J. Green
Two one-dimensional time sequences are generated from the projections of the two-dimensional sequence on the x and y axes. Then the two-dimensional fast Fourier transform for the generated time sequences is computed. A peak in the spectrum for the selected spatial frequency is detected. The temporal frequency at which the peak is detected gives an estimate of the velocity of the moving object. Analytical formulations for large moving objects in a time sequence with zero background are presented. An algorithm is given for velocity formulation.
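The idea in this abstract can be sketched on synthetic data. The following is an illustrative reconstruction, not the authors' algorithm: it uses a direct DFT in place of an FFT, evaluates only one selected spatial frequency `k`, and assumes a rectangular object moving at constant velocity over a zero background. All function names and parameters are hypothetical.

```python
import cmath

def project(frames, axis):
    """Project every frame onto the x axis (column sums) or the y axis (row sums)."""
    if axis == "x":
        return [[sum(col) for col in zip(*frame)] for frame in frames]
    return [[sum(row) for row in frame] for frame in frames]

def estimate_velocity(proj, k):
    """Estimate velocity in pixels/frame from a projection sequence proj[t][n]."""
    T, N = len(proj), len(proj[0])
    # Spatial DFT coefficient of each frame at the selected spatial frequency k.
    s = [sum(v * cmath.exp(-2j * cmath.pi * k * n / N) for n, v in enumerate(row))
         for row in proj]
    # Temporal DFT of that coefficient; the peak bin is the temporal frequency.
    mags = [abs(sum(s[t] * cmath.exp(-2j * cmath.pi * w * t / T) for t in range(T)))
            for w in range(T)]
    w_peak = max(range(T), key=mags.__getitem__)
    if w_peak > T // 2:          # map the bin index to a signed frequency
        w_peak -= T
    # A pattern g(n - v*t) puts its energy at w = -k*v*T/N, so invert that relation.
    return -w_peak * N / (k * T)

def make_frames(T=32, H=64, W=64, x0=2, y0=40, vx=1, vy=-1):
    """Synthetic sequence: a 3x4 block of ones moving over a zero background."""
    frames = []
    for t in range(T):
        frame = [[0] * W for _ in range(H)]
        for y in range(y0 + vy * t, y0 + vy * t + 4):
            for x in range(x0 + vx * t, x0 + vx * t + 3):
                frame[y][x] = 1
        frames.append(frame)
    return frames

frames = make_frames()
vx = estimate_velocity(project(frames, "x"), k=2)
vy = estimate_velocity(project(frames, "y"), k=2)
print(vx, vy)  # 1.0 -1.0
```

In this sketch the velocity resolution is N / (k * T) pixels per frame, so the frame count and the selected spatial frequency were chosen to make the true velocities fall exactly on a temporal frequency bin.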
International Conference on Frontiers in Handwriting Recognition | 2012
Sabri A. Mahmoud; Irfan Ahmad; Mohammad Alshayeb; Wasfi G. Al-Khatib; Mohammad Tanvir Parvez; Gernot A. Fink; Volker Märgner; Haikal El Abed
In this paper, we report our comprehensive Arabic offline Handwritten Text database (KHATT) after completion of the collection of 1000 handwritten forms written by 1000 writers from different countries. It is composed of an image database containing images of the written text at 200, 300, and 600 dpi resolutions, and a manually verified ground truth database that contains meta-data describing the written text at the page, paragraph, and line levels. A formal verification procedure is implemented to align the handwritten text with its ground truth at the form, paragraph, and line levels. Tools to extract paragraphs from pages and segment paragraphs into lines are developed. Preliminary experiments on Arabic handwritten text recognition are conducted using sample data from the database and the results are reported. The database will be made freely available to researchers world-wide for research in various handwriting-related problems such as text recognition, writer identification and verification, etc.
IET Software | 2011
Abdulaziz Alkhalid; Mohammad Alshayeb; Sabri A. Mahmoud
Enhancing, modifying, or adapting the software to new requirements increases the internal software complexity. Software with a high level of internal complexity is difficult to maintain. Software refactoring reduces software complexity and hence decreases the maintenance effort. However, software refactoring becomes quite a challenging task as the software evolves. The authors use clustering as a pattern recognition technique to assist in software refactoring activities at the package level. The approach presents computer-aided support for identifying ill-structured packages and provides suggestions for the software designer to balance between intra-package cohesion and inter-package coupling. A comparative study is conducted applying three different clustering techniques on different software systems. In addition, the application of refactoring at the package level using an adaptive k-nearest neighbour (A-KNN) algorithm is introduced. The authors compared the A-KNN technique with the other clustering techniques (viz. single linkage algorithm, complete linkage algorithm and weighted pair-group method using arithmetic averages). The new technique shows competitive performance with lower computational complexity.
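The three baseline clustering techniques named above (single linkage, complete linkage, and weighted pair-group averaging) differ only in how the distance to a merged cluster is updated. The sketch below shows that difference on a toy example. It does not implement A-KNN, and the class names, their dependency sets, and the Jaccard distance are assumptions chosen for illustration; the abstract does not state which features or distance measure the authors used.

```python
def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets (assumed dissimilarity measure)."""
    return 1 - len(a & b) / len(a | b)

def agglomerate(names, dist, n_clusters, linkage="single"):
    """Merge the closest pair of clusters until n_clusters remain."""
    update = {
        "single": min,                       # nearest member decides
        "complete": max,                     # farthest member decides
        "wpgma": lambda a, b: (a + b) / 2,   # plain average of the two branches
    }[linkage]
    clusters = {i: [name] for i, name in enumerate(names)}
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(names)
    while len(clusters) > n_clusters:
        i, j = min(d, key=d.get)             # closest pair of current clusters
        merged = clusters.pop(i) + clusters.pop(j)
        new_d = {}
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            new_d[(k, next_id)] = update(dik, djk)
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        d.update(new_d)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical classes and the modules each one depends on.
deps = {
    "OrderService": {"db", "billing", "orders"},
    "InvoiceService": {"db", "billing"},
    "HtmlView": {"ui", "templates"},
    "FormWidget": {"ui", "templates", "events"},
}
names = list(deps)
dist = [[jaccard_distance(deps[a], deps[b]) for b in names] for a in names]

for linkage in ("single", "complete", "wpgma"):
    print(linkage, agglomerate(names, dist, n_clusters=2, linkage=linkage))
```

On this toy input all three linkages group the two service classes together and the two UI classes together, which is the kind of package suggestion the approach is meant to produce; on real systems the linkages can disagree.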