Moustafa Elshafei
King Fahd University of Petroleum and Minerals
Publications
Featured research published by Moustafa Elshafei.
Information Sciences | 2002
Moustafa Elshafei; Husni Al-Muhtaseb; Mansour M. Alghamdi
The paper proposes a diphone/sub-syllable method for Arabic Text-to-Speech (ATTS) systems. The proposed approach exploits the particular syllabic structure of Arabic words. For good quality, the boundaries of the speech segments are chosen to occur only at the sustained portion of vowels. The speech segments consist of consonant-half vowel units, half vowel-consonant units, half vowels, middle portions of vowels, and suffix consonants. The minimum set consists of about 310 segments for classical Arabic.
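Below is a loose sketch, in Python, of the segment-cutting idea: a word's phone sequence is split only at vowel midpoints, so each concatenation unit runs from one sustained-vowel middle to the next. The phone symbols and unit labels are illustrative placeholders, not the paper's 310-segment inventory.

```python
# A loose illustration (not the paper's actual inventory or phone set) of cutting a
# phone sequence into concatenation units whose joins fall at the sustained middle
# of vowels, as the sub-syllable approach above prescribes.

VOWELS = {"a", "aa", "i", "ii", "u", "uu"}   # assumed toy vowel set

def cut_at_vowel_middles(phones):
    """Return concatenation units: the word is cut only at vowel midpoints, so each
    unit runs from one vowel middle to the next (or to a word edge)."""
    units, current = [], []
    for p in phones:
        if p in VOWELS:
            current.append(p + "_first_half")
            units.append(current)                 # cut inside the vowel
            current = [p + "_second_half"]
        else:
            current.append(p)
    units.append(current)                         # trailing unit (may end in suffix consonants)
    return ["+".join(u) for u in units]

# /k i t aa b/ -> ['k+i_first_half', 'i_second_half+t+aa_first_half', 'aa_second_half+b']
print(cut_at_vowel_middles(["k", "i", "t", "aa", "b"]))
```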
International Conference on Computer and Communication Engineering | 2010
Mohammad A. M. Abushariah; Raja Noor Ainon; Roziati Zainuddin; Moustafa Elshafei; Othman Omran Khalifa
This paper reports the design, implementation, and evaluation of a research effort to develop a high-performance, natural, speaker-independent Arabic continuous speech recognition system. It aims to explore the usefulness and success of a newly developed, phonetically rich and balanced speech corpus, presenting a competitive approach to the development of an Arabic ASR system compared with state-of-the-art Arabic ASR research. The developed Arabic ASR mainly used the Carnegie Mellon University (CMU) Sphinx tools together with the Cambridge HTK tools. To extract features from the speech signals, the Mel-Frequency Cepstral Coefficients (MFCC) technique was applied, producing a set of feature vectors. Subsequently, the system uses five-state Hidden Markov Models (HMM) with three emitting states for tri-phone acoustic modeling. The emission probability distribution of each state was best modeled with continuous-density 16-component Gaussian mixtures. The state distributions were tied to 500 senones. The language model contains uni-grams, bi-grams, and tri-grams. The system was trained on 7.0 hours of the phonetically rich and balanced Arabic speech corpus and tested on another hour. For similar speakers but different sentences, the system obtained word recognition accuracies of 92.67% and 93.88% and Word Error Rates (WER) of 11.27% and 10.07% with and without diacritical marks, respectively. For different speakers but similar sentences, the system obtained word recognition accuracies of 95.92% and 96.29% and WERs of 5.78% and 5.45% with and without diacritical marks, respectively. For different speakers and different sentences, the system obtained word recognition accuracies of 89.08% and 90.23% and WERs of 15.59% and 14.44% with and without diacritical marks, respectively.
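As an aside on the reported metric, the following is a minimal sketch of how Word Error Rate can be computed from a reference/hypothesis pair (edit distance divided by the reference length); it is independent of the Sphinx/HTK tooling described in the paper.

```python
# A minimal sketch of the Word Error Rate (WER) metric reported above, computed as
# the Levenshtein distance between reference and hypothesis word sequences divided
# by the reference length. Purely illustrative; not tied to the Sphinx/HTK tooling.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words ~ 0.167
```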
International Journal of Speech Technology | 2007
Mansour M. Alghamdi; Moustafa Elshafei; Husni Al-Muhtaseb
This paper describes the development of an Arabic broadcast news transcription system. The presented system is a speaker-independent, large-vocabulary, natural Arabic speech recognition system, and it is intended to be a test bed for further research into the open-ended problem of achieving natural-language man-machine conversation. The system addresses a number of challenging issues pertaining to the Arabic language, e.g., generation of fully vocalized transcription and a rule-based spelling dictionary. The developed Arabic speech recognition system is based on the Carnegie Mellon University Sphinx tools. The Cambridge HTK tools were also utilized at various testing stages. The system was trained on 7.0 hours of a 7.5-hour Arabic broadcast news corpus and tested on the remaining half hour. The corpus focuses on economics and sports news. At this experimental stage, the Arabic news transcription system uses five-state HMMs for triphone acoustic models, with 8 and 16 Gaussian mixture distributions. The state distributions were tied to about 1680 senones. The language model uses both bi-grams and tri-grams. The test set consisted of 400 utterances containing 3585 words. The Word Error Rate (WER) came initially to 10.14%. After extensive testing and tuning of the recognition parameters, the WER was reduced to about 8.61% for non-vocalized text transcription.
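For readers unfamiliar with the n-gram language models mentioned here, the toy sketch below shows bigram/trigram counting and a maximum-likelihood bigram probability on a made-up two-sentence corpus; a real broadcast-news language model would add smoothing and back-off.

```python
# A minimal sketch of the bigram/trigram counting behind the language model described
# above, using a tiny toy corpus and maximum-likelihood estimates (no smoothing or
# back-off, unlike a production LM).

from collections import Counter

corpus = [["<s>", "oil", "prices", "rose", "</s>"],
          ["<s>", "oil", "prices", "fell", "</s>"]]

unigrams, bigrams, trigrams = Counter(), Counter(), Counter()
for sent in corpus:
    unigrams.update(sent)
    bigrams.update(zip(sent, sent[1:]))
    trigrams.update(zip(sent, sent[1:], sent[2:]))

def p_bigram(w2, w1):
    """P(w2 | w1) by maximum likelihood."""
    return bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

print(p_bigram("prices", "oil"))   # 1.0 in this toy corpus
print(p_bigram("rose", "prices"))  # 0.5
```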
ISA Transactions | 2016
Magdi S. Mahmoud; Muhammad Sabih; Moustafa Elshafei
This paper addresses the problem of output-feedback communication and control within an event-triggered framework in the context of distributed networked control systems. The design of the event-triggered output-feedback controller is formulated as a linear matrix inequality (LMI) feasibility problem. The scheme is developed for distributed systems where only partial states are available. In this scheme, each subsystem uses local observers and shares its information with its neighbors only when the subsystem's local error exceeds a specified threshold. The developed method is illustrated using a coupled-cart example from the literature.
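A conceptual sketch of the triggering rule, under assumed placeholder dynamics and an arbitrary threshold: a subsystem transmits its state estimate to neighbors only when the local error relative to the last transmitted value exceeds the threshold.

```python
# A conceptual sketch of the event-triggering rule described above: a subsystem
# broadcasts its (observer-estimated) state to neighbors only when the local error
# between the current estimate and the last transmitted value exceeds a threshold.
# The threshold and trajectory below are placeholders, not the paper's LMI-based design.

import numpy as np

def run_event_triggered(x_hat_traj, threshold=0.1):
    """Given a trajectory of local state estimates, return the time steps at which
    a transmission is triggered."""
    last_sent = x_hat_traj[0]
    events = [0]                                    # initial broadcast
    for k, x_hat in enumerate(x_hat_traj[1:], start=1):
        if np.linalg.norm(x_hat - last_sent) > threshold:
            last_sent = x_hat                       # neighbors receive the update
            events.append(k)
    return events

traj = [np.array([np.sin(0.2 * k), np.cos(0.2 * k)]) for k in range(50)]
print(run_event_triggered(traj))  # sparse set of broadcast instants
```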
Journal of Information Technology Research | 2009
Mohamed Ali; Moustafa Elshafei; Mansour M. Alghamdi; Husni Al-Muhtaseb
Phonetic dictionaries are essential components of large-vocabulary speaker-independent speech recognition systems. This paper presents a rule-based technique to generate phonetic dictionaries for a large-vocabulary Arabic speech recognition system. The system used conventional Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, as well as some common dialectal cases. The paper gives a detailed explanation of these rules as well as their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The rules and the phone set were tested and evaluated on an Arabic speech recognition system. The system was trained on 4.3 hours of the 5.4-hour Arabic broadcast news corpus and tested on the remaining 1.1 hours. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The language model contains both bi-grams and tri-grams. The Word Error Rate (WER) came to 9.0%.
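The following toy sketch illustrates the flavor of rule-based letter-to-phone conversion with a single gemination rule; the transliteration, phone symbols, and rule are hypothetical stand-ins for the paper's actual rule set.

```python
# A toy sketch of the rule-based letter-to-phone idea: each rule maps a (transliterated)
# grapheme in context to a phone. The symbols and the single gemination rule shown are
# illustrative placeholders, not the paper's actual rule set or phone inventory.

LETTER_TO_PHONE = {"b": "B", "t": "T", "k": "K", "a": "AE", "u": "UH", "i": "IH"}

def word_to_phones(word):
    """Apply simple context rules: a doubled letter (written twice here for brevity)
    becomes a long phone; otherwise use the one-to-one table."""
    phones, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and word[i + 1] == word[i]:       # gemination rule
            phones.append(LETTER_TO_PHONE[word[i]] + ":")
            i += 2
        else:
            phones.append(LETTER_TO_PHONE[word[i]])
            i += 1
    return phones

print(word_to_phones("kattab"))   # ['K', 'AE', 'T:', 'AE', 'B']
```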
IEEE Transactions on Aerospace and Electronic Systems | 2000
Moustafa Elshafei; S. S. Akhtar; Mohammed Shahgir Ahmed
An artificial neural network (ANN) based helicopter identification system is proposed. The feature vectors are based on both the tonal and the broadband spectrum of the helicopter signal. ANN pattern classifiers are trained using various parametric spectral representation techniques. Specifically, linear prediction, reflection coefficients, cepstrum, and line spectral frequencies (LSF) are compared in terms of recognition accuracy and robustness against additive noise. Finally, an eight-helicopter ANN classifier is evaluated. It is also shown that the classifier performance is dramatically improved if it is trained using both clean data and data corrupted with additive noise.
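A hedged sketch of the noise-augmentation point, training on both clean and noise-corrupted feature vectors: synthetic data and scikit-learn's MLP stand in for the helicopter spectra and the network used in the paper.

```python
# A hedged sketch of the training idea highlighted above: the classifier sees both
# clean feature vectors and copies corrupted with additive noise, which improves
# robustness. Synthetic data and scikit-learn's MLP stand in for the actual
# helicopter spectra and network architecture.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_classes, dim = 8, 12                         # e.g. 8 helicopter types, 12-dim LSF-like features
centers = rng.normal(size=(n_classes, dim))

X_clean = np.vstack([c + 0.1 * rng.normal(size=(50, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), 50)

X_noisy = X_clean + 0.5 * rng.normal(size=X_clean.shape)    # additive-noise augmentation

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
clf.fit(np.vstack([X_clean, X_noisy]), np.concatenate([y, y]))

X_test = centers[y] + 0.5 * rng.normal(size=X_clean.shape)  # noisy test conditions
print("accuracy on noisy test set:", clf.score(X_test, y))
```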
International Conference on Innovations in Information Technology | 2008
Mohamed Ali; Moustafa Elshafei; Mansour M. Alghamdi; Husni Al-Muhtaseb; Atef J. Al-Najjar
Phonetic dictionaries are essential components of large-vocabulary natural-language speaker-independent speech recognition systems. This paper presents a rule-based technique to generate Arabic phonetic dictionaries for a large-vocabulary speech recognition system. The system used classical Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, as well as morphologically driven rules. The paper gives a detailed explanation of these rules as well as their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The generated dictionary was evaluated on an actual Arabic speech recognition system. The pronunciation rules and the phone set were validated by test cases. The Arabic speech recognition system achieves a word error rate of 11.71% for fully diacritized transcription of about 1.1 hours of Arabic broadcast news.
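The test-case validation step could look roughly like the sketch below, where each case pairs a transliterated word with the phone string the rules should yield; the toy one-letter-one-phone rules and the cases themselves are hypothetical.

```python
# A small sketch of the test-case validation step mentioned above: each case pairs a
# transliterated word with the phone string the rules should produce, and mismatches
# are reported. The one-letter-one-phone "rules" here are a stand-in for the real engine.

TOY_RULES = {"k": "K", "u": "UH", "t": "T", "i": "IH", "b": "B"}

def toy_phonetize(word):
    return " ".join(TOY_RULES[ch] for ch in word)

def validate(cases, phonetize=toy_phonetize):
    """Return the failing (word, expected, got) triples; an empty list means all pass."""
    return [(w, exp, phonetize(w)) for w, exp in cases if phonetize(w) != exp]

cases = [("kutib", "K UH T IH B"), ("kutub", "K UH T UH B")]
print(validate(cases) or "all pronunciation test cases pass")
```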
Fuzzy Sets and Systems | 2001
Mohammed Shahgir Ahmed; U. L. Bhatti; Fouad M. AL-Sunni; Moustafa Elshafei
A design method for a fuzzy servo-controller for nonlinear plants is presented. The proposed method is an error-feedback scheme, where the controller also receives signals representing the plant operating points. An integrator is used in the control loop to ensure setpoint following, low-frequency disturbance rejection, and enhanced robustness of the closed-loop system. A training scheme for the fuzzy controller is derived that minimizes the output error between a reference model and the plant. The training is conducted off-line for a class of setpoints conforming to the normal operating conditions of the plant. Results of simulation studies are also presented.
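To make the role of the integrator concrete, here is a conceptual loop sketch in which a static gain stands in for the trained fuzzy controller and an arbitrary first-order plant is assumed; with integral action in the loop the output settles at the setpoint.

```python
# A conceptual sketch of the control-loop structure described above: an error-feedback
# controller with an integrator in the loop so the output tracks the setpoint with zero
# steady-state error. A static gain stands in for the trained fuzzy controller, and the
# first-order plant is an arbitrary example, not the paper's nonlinear plant.

def simulate(setpoint=1.0, steps=200, dt=0.05, kp=2.0, ki=1.5):
    y, integral = 0.0, 0.0
    for _ in range(steps):
        e = setpoint - y
        integral += e * dt                    # integrator ensures setpoint following
        u = kp * e + ki * integral            # placeholder for the fuzzy controller output
        y += dt * (-y + u)                    # toy first-order plant: dy/dt = -y + u
    return y

print(simulate())   # converges close to the setpoint 1.0
```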
Intelligent Automation and Soft Computing | 2001
Moustafa Elshafei; Mohammed Shahgir Ahmed
This paper introduces the Hilbert space-filling curve (SFC) and outlines algorithms that demonstrate the SFC's efficient self-organizing features. We then propose an SFC fuzzy model based on clustering the object space. The SFC fuzzy model is utilized to construct a fuzzy feedback controller. It is shown that the self-replicating feature of the SFC can be utilized to construct successively improving control strategies based on symmetric decision trees. The proposed SFC technique achieves a dramatic reduction in the complexity of fuzzy controllers by reducing the multi-dimensional fuzzification problem to a one-dimensional space.
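A hedged sketch of the dimension-reduction idea: the textbook Hilbert-curve index maps a 2-D grid cell to a 1-D position while largely preserving locality, which is what allows a multi-dimensional fuzzification problem to be treated along one axis. This is not the paper's clustering-based SFC fuzzy model itself.

```python
# The standard Hilbert space-filling-curve index for a 2-D grid: nearby cells tend to
# receive nearby 1-D positions, so multi-dimensional inputs can be ordered along a
# single axis. Textbook conversion only; not the paper's SFC fuzzy model.

def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its position d
    along the Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate/reflect the quadrant so recursion is uniform
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

# Neighboring grid cells get nearby 1-D indices most of the time (locality preservation).
print([hilbert_index(4, x, y) for y in range(4) for x in range(4)])
```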
International Journal of Speech Technology | 2011
Dia AbuZeina; Wasfi G. Al-Khatib; Moustafa Elshafei; Husni Al-Muhtaseb
One of the problems in the speech recognition of Modern Standard Arabic (MSA) is cross-word pronunciation variation. Cross-word pronunciation variations alter the phonetic spelling of words beyond their listed forms in the phonetic dictionary, leading to a number of Out-Of-Vocabulary (OOV) wordforms. This paper presents a knowledge-based approach to model cross-word pronunciation variation at both the phonetic dictionary and language model levels. The proposed approach is based on modeling cross-word pronunciation variation by expanding the phonetic dictionary and the corpus transcription. The baseline system contains a phonetic dictionary of 14,234 words from a 5.4-hour corpus of Arabic broadcast news. The expanded dictionary contains 15,873 words. The corpus transcription is also expanded according to the applied Arabic phonological rules. Using the Carnegie Mellon University (CMU) Sphinx speech recognition engine, the enhanced system achieved a Word Error Rate (WER) of 9.91% on a test set of fully diacritized transcription of about 1.1 hours of Arabic broadcast news. The WER is improved by 2.3% compared to the baseline system.
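An illustrative sketch of the expansion idea: when a cross-word phonological rule fires between adjacent words in the transcription, a merged wordform with its altered pronunciation is added to the dictionary. The sample rule (a final /n/ assimilating to /m/ before /b/) and the entries are placeholders, not the paper's rule set.

```python
# An illustrative sketch of the dictionary-expansion idea above: when a cross-word
# phonological rule applies between two adjacent words in the transcription, a merged
# wordform with its altered pronunciation is added to the dictionary. The sample rule
# and the toy entries are placeholders only.

base_dict = {"min": ["M", "IH", "N"], "baab": ["B", "AA", "B"]}
transcript = [["min", "baab"]]

def expand(dictionary, sentences):
    expanded = dict(dictionary)
    new_sentences = []
    for sent in sentences:
        out, i = [], 0
        while i < len(sent):
            w1 = sent[i]
            w2 = sent[i + 1] if i + 1 < len(sent) else None
            # toy rule: word-final N assimilates to M before a word-initial B
            if w2 and dictionary[w1][-1] == "N" and dictionary[w2][0] == "B":
                merged = w1 + "_" + w2                      # new cross-word entry
                expanded[merged] = dictionary[w1][:-1] + ["M"] + dictionary[w2]
                out.append(merged)
                i += 2
            else:
                out.append(w1)
                i += 1
        new_sentences.append(out)
    return expanded, new_sentences

print(expand(base_dict, transcript))
```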