Husni Al-Muhtaseb
King Fahd University of Petroleum and Minerals
Publications
Featured research published by Husni Al-Muhtaseb.
Signal Processing | 2008
Husni Al-Muhtaseb; Sabri A. Mahmoud; Rami Qahwaji
This paper describes a technique for automatic recognition of off-line printed Arabic text using Hidden Markov Models. In this work different sizes of overlapping and non-overlapping hierarchical windows are used to generate 16 features from each vertical sliding strip. Eight different Arabic fonts were used for testing (viz. Arial, Tahoma, Akhbar, Thuluth, Naskh, Simplified Arabic, Andalus, and Traditional Arabic). It was experimentally proven that different fonts have their highest recognition rates at different numbers of states (5 or 7) and codebook sizes (128 or 256). Arabic text is cursive, and each character may have up to four different shapes based on its location in a word. This research work considered each shape as a different class, resulting in a total of 126 classes (compared to 28 Arabic letters). The achieved average recognition rates were between 98.08% and 99.89% for the eight experimental fonts. The main contributions of this work are the novel hierarchical sliding window technique using only 16 features for each sliding window, considering each shape of Arabic characters as a separate class, bypassing the need for segmenting Arabic text, and its applicability to other languages.
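The abstract does not spell out how the 16 features are laid out, so the following is only a minimal sketch of one possible hierarchical window scheme. The 8 + 4 + 3 + 1 split, the ink-count feature, the right-to-left scan, and the names `strip_features` and `sliding_strips` are all assumptions made for illustration, not the authors' published design.

```python
def strip_features(strip):
    """Return 16 ink-count features for one vertical strip.

    `strip` is a list of rows, each a list of 0/1 pixels.
    The 8 + 4 + 3 + 1 layout below is an illustrative assumption.
    """
    height = len(strip)
    if height == 0 or height % 8 != 0:
        raise ValueError("strip height must be a positive multiple of 8")
    cell = height // 8
    row_ink = [sum(row) for row in strip]

    # Level 1: eight non-overlapping cells stacked top to bottom.
    level1 = [sum(row_ink[i * cell:(i + 1) * cell]) for i in range(8)]
    # Level 2: four non-overlapping windows, each spanning two cells.
    level2 = [level1[2 * i] + level1[2 * i + 1] for i in range(4)]
    # Level 3: three overlapping windows, four cells tall, stepping two cells.
    level3 = [sum(level1[2 * i:2 * i + 4]) for i in range(3)]
    # Level 4: one window covering the whole strip.
    level4 = [sum(level1)]
    return level1 + level2 + level3 + level4


def sliding_strips(image, width=1):
    """Yield one feature vector per vertical strip, scanning right to left.

    The right-to-left order (matching Arabic reading direction) is an assumption.
    """
    n_cols = len(image[0])
    for right in range(n_cols, 0, -width):
        left = max(0, right - width)
        strip = [row[left:right] for row in image]
        yield strip_features(strip)


if __name__ == "__main__":
    # Toy 8x3 binary image: two inked columns on the left, one blank on the right.
    toy = [[1, 1, 0] for _ in range(8)]
    for vector in sliding_strips(toy):
        print(vector)
```

Each yielded vector would then be quantized against a codebook and fed to the HMM as one observation; that step is omitted here.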
Pattern Recognition Letters | 2011
Jawad Hasan Yasin AlKhateeb; Jinchang Ren; Jianmin Jiang; Husni Al-Muhtaseb
Recognition of handwritten Arabic cursive text is a complex task due to the similarities between letters under different writing styles. In this paper, a word-based off-line recognition system is proposed, using Hidden Markov Models (HMMs). The method involves three stages, namely preprocessing, feature extraction and classification. First, words from input scripts are segmented and normalized. Then, a set of intensity features is extracted from each segmented word using a sliding window moving across each mirrored word image. Meanwhile, structure-like features are also extracted, including the number of subwords and diacritical marks. Finally, these features are applied in a combined scheme for classification. Intensity features are used to train an HMM classifier, whose results are re-ranked using structure-like features for an improved recognition rate. In order to validate the proposed techniques, extensive experiments were carried out using the IFN/ENIT database, which contains 32,492 handwritten Arabic words. The proposed algorithm yields improved accuracy in comparison with several typical methods.
Information Sciences | 2002
Moustafa Elshafei; Husni Al-Muhtaseb; Mansour M. Alghamdi
The paper proposes a diphone/sub-syllable method for Arabic Text-to-Speech (ATTS) systems. The proposed approach exploits the particular syllabic structure of Arabic words. For good quality, the boundaries of the speech segments are chosen to occur only at the sustained portion of vowels. The speech segments consist of consonant-half vowels, half vowel-consonants, half vowels, middle portions of vowels, and suffix consonants. The minimum set consists of about 310 segments for classical Arabic.
International Journal of Speech Technology | 2007
Mansour M. Alghamdi; Moustafa Elshafei; Husni Al-Muhtaseb
This paper describes the development of an Arabic broadcast news transcription system. The presented system is a speaker-independent large vocabulary natural Arabic speech recognition system, and it is intended to be a test bed for further research into the open-ended problem of achieving natural language man-machine conversation. The system addresses a number of challenging issues pertaining to the Arabic language, e.g. generation of fully vocalized transcription, and a rule-based spelling dictionary. The developed Arabic speech recognition system is based on the Carnegie Mellon University Sphinx tools. The Cambridge HTK tools were also utilized at various testing stages. The system was trained on 7.0 hours of a 7.5-hour Arabic broadcast news corpus and tested on the remaining half an hour. The corpus was made to focus on economics and sports news. At this experimental stage, the Arabic news transcription system uses five-state HMMs for triphone acoustic models, with 8 and 16 Gaussian mixture distributions. The state distributions were tied to about 1680 senones. The language model uses both bi-grams and tri-grams. The test set consisted of 400 utterances containing 3585 words. The Word Error Rate (WER) came initially to 10.14%. After extensive testing and tuning of the recognition parameters, the WER was reduced to about 8.61% for non-vocalized text transcription.
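For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, insertions) between the reference and the recognized text, divided by the number of reference words. The sketch below shows that standard definition; the function name `word_error_rate` and the sample strings are illustrative, and the abstract does not state which scoring tool the authors actually used.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, and assuming errors are pooled over the whole test set, a WER of 8.61% on 3585 reference words would correspond to roughly 309 word errors.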
Journal of Information Technology Research | 2009
Mohamed Ali; Moustafa Elshafei; Mansour M. Alghamdi; Husni Al-Muhtaseb
Phonetic dictionaries are essential components of large-vocabulary speaker-independent speech recognition systems. This paper presents a rule-based technique to generate phonetic dictionaries for a large vocabulary Arabic speech recognition system. The system used conventional Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, as well as some common dialectal cases. The paper explains these rules in detail and gives their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The rules and the phone set were tested and evaluated on an Arabic speech recognition system. The system was trained on 4.3 hours of the 5.4-hour Arabic broadcast news corpus and tested on the remaining 1.1 hours. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The language model contains both bi-grams and tri-grams. The Word Error Rate (WER) came to 9.0%.
International Conference on Innovations in Information Technology | 2008
Mohamed Ali; Moustafa Elshafei; Mansour M. Alghamdi; Husni Al-Muhtaseb; Atef J. Al-Najjar
Phonetic dictionaries are essential components of large-vocabulary natural language speaker-independent speech recognition systems. This paper presents a rule-based technique to generate Arabic phonetic dictionaries for a large vocabulary speech recognition system. The system used classic Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, as well as morphologically driven rules. The paper explains these rules in detail and gives their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The generated dictionary was evaluated on an actual Arabic speech recognition system. The pronunciation rules and the phone set were validated by test cases. The Arabic speech recognition system achieves a word error rate of 11.71% for fully diacritized transcription of about 1.1 hours of Arabic broadcast news.
International Journal of Speech Technology | 2011
Dia AbuZeina; Wasfi G. Al-Khatib; Moustafa Elshafei; Husni Al-Muhtaseb
One of the problems in the speech recognition of Modern Standard Arabic (MSA) is cross-word pronunciation variation. Cross-word pronunciation variations alter the phonetic spelling of words beyond their listed forms in the phonetic dictionary, leading to a number of Out-Of-Vocabulary (OOV) word forms. This paper presents a knowledge-based approach to model cross-word pronunciation variation at both the phonetic dictionary and language model levels. The proposed approach is based on modeling cross-word pronunciation variation by expanding the phonetic dictionary and corpus transcription. The Baseline system contains a phonetic dictionary of 14,234 words from a 5.4-hour corpus of Arabic broadcast news. The expanded dictionary contains 15,873 words. Also, the corpus transcription is expanded according to the applied Arabic phonological rules. Using the Carnegie Mellon University (CMU) Sphinx speech recognition engine, the Enhanced system achieved a Word Error Rate (WER) of 9.91% on a test set of fully diacritized transcription of about 1.1 hours of Arabic broadcast news. The WER is enhanced by 2.3% compared to the Baseline system.
Archive | 2011
Husni Al-Muhtaseb; Yousef Elarian; Lahouari Ghouti
Training and testing data for optical character recognition are cumbersome to obtain. If large amounts of data can be produced from small amounts, much time and effort can be saved. This paper presents an approach to synthesize Arabic handwriting. We segment word images into labeled characters and then use these in synthesizing arbitrary words. The synthesized text should look natural; hence, we define some criteria to decide on what is acceptable as natural-looking. For evaluation, text synthesized with the natural-looking constraint is compared to text synthesized without it.
International Journal of Speech Technology | 2012
Dia AbuZeina; Wasfi G. Al-Khatib; Moustafa Elshafei; Husni Al-Muhtaseb
Pronunciation variation is a major obstacle in improving the performance of Arabic automatic continuous speech recognition systems. This phenomenon alters the pronunciation spelling of words beyond their listed forms in the pronunciation dictionary, leading to a number of out-of-vocabulary word forms. This paper presents a direct data-driven approach to model within-word pronunciation variations, in which the pronunciation variants are distilled from the training speech corpus. The proposed method consists of performing phoneme recognition, followed by a sequence alignment between the observation phonemes generated by the phoneme recognizer and the reference phonemes obtained from the pronunciation dictionary. The unique collected variants are then added to the dictionary as well as to the language model. We started with a Baseline Arabic speech recognition system based on the Sphinx3 engine. The Baseline system is based on a 5.4-hour speech corpus of Modern Standard Arabic broadcast news, with a pronunciation dictionary of 14,234 canonical pronunciations. The Baseline system achieves a word error rate of 13.39%. Our results show that while the expanded dictionary alone did not add appreciable improvements, the word error rate is significantly reduced by 2.22% when the variants are represented within the language model.
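The alignment step described in this abstract can be illustrated with a plain edit-distance alignment between a reference phoneme sequence and a recognized one. This is a minimal sketch under assumptions: the abstract does not name the alignment algorithm or its scoring, so unit costs are used here, and the function names and phoneme symbols are hypothetical.

```python
def align(reference, observed, gap="-"):
    """Align two phoneme lists with unit-cost edit distance; return (ref, obs) pairs."""
    n, m = len(reference), len(observed)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == observed[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match or substitution
                             dist[i - 1][j] + 1,         # phoneme deleted
                             dist[i][j - 1] + 1)         # phoneme inserted

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if reference[i - 1] == observed[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                pairs.append((reference[i - 1], observed[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((reference[i - 1], gap))
            i -= 1
        else:
            pairs.append((gap, observed[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def differences(pairs):
    """Keep only the aligned positions where observed and reference disagree."""
    return [(ref, obs) for ref, obs in pairs if ref != obs]


if __name__ == "__main__":
    # Hypothetical phoneme symbols, for illustration only.
    canonical = ["K", "AE", "T", "AE", "B"]
    recognized = ["K", "AE", "T", "IH", "B"]
    print(differences(align(canonical, recognized)))
```

A recognized sequence that differs from the canonical one at some aligned position would be a candidate variant; how such candidates are filtered before being added to the dictionary and language model is not covered by this sketch.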
International Test Conference | 1994
Alaaeldin Amin; Mohamed Y. Osman; Radwan E. Abdel-Aal; Husni Al-Muhtaseb
The testability problem of dual port memories is investigated. Architectural modifications which enhance testability with minimal overhead on both silicon area and device performance are described. New fault models for both the memory array and the address decoders are proposed and efficient O(√n) test algorithms are presented. The new fault models account for the simultaneous dual access property of the device. In addition to the classical static neighborhood pattern sensitive faults, the array test algorithm covers a new class of pattern sensitive faults, Duplex Dynamic Neighborhood Pattern Sensitive Faults (DDNPSF).