Publication


Featured research published by Yousef Ajami Alotaibi.


annual conference on computers | 2009

Automatic speech recognition for Bangla digits

Ghulam Muhammad; Yousef Ajami Alotaibi; Mohammad Nurul Huda

In this paper, we introduce a system for Bangla digit automatic speech recognition (ASR). Though Bangla is one of the most widely spoken languages in the world, only a few works on Bangla ASR can be found in the literature, especially on Bangladeshi-accented Bangla. In this work, the corpus is collected from native speakers in Bangladesh. Mel-frequency cepstral coefficient (MFCC) based features and hidden Markov model (HMM) based classifiers are used for recognition. Experimental results show comparatively high recognition performance (more than 95%) for the first six digits (0–5) and lower performance (less than 90%) for the next four digits (6–9). We notice two confused pairs of digits in the experiments: (6) and (9), and (7) and (8). We also find that the different dialects in Bangladesh play a major role in this confusion.
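As a rough illustration of how confused label pairs like those mentioned in the abstract can be read off a recognizer's confusion matrix, here is a minimal Python sketch. The function name `most_confused_pairs` and the toy matrix are hypothetical; the counts are invented for illustration and are not results from the paper.

```python
import numpy as np

def most_confused_pairs(confusion, top_n=2):
    """Return the top_n unordered label pairs with the most mutual errors."""
    confusion = np.asarray(confusion)
    mutual = confusion + confusion.T          # add errors in both directions
    iu = np.triu_indices_from(mutual, k=1)    # visit each unordered pair once
    order = np.argsort(mutual[iu])[::-1][:top_n]
    return [(int(iu[0][k]), int(iu[1][k])) for k in order]

# Toy 4-class confusion matrix (rows: spoken label, columns: recognized label).
# These counts are made up for the example.
toy = np.array([
    [18, 0, 2, 0],
    [0, 15, 0, 5],
    [3, 0, 17, 0],
    [1, 6, 0, 13],
])
pairs = most_confused_pairs(toy)
print(pairs)
```

Summing the matrix with its transpose treats "A heard as B" and "B heard as A" as the same confusion, which matches the way the abstract reports pairs rather than directed errors.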


international conference on database theory | 2010

Environment Recognition Using Selected MPEG-7 Audio Features and Mel-Frequency Cepstral Coefficients

Ghulam Muhammad; Yousef Ajami Alotaibi; Mansour Alsulaiman; Mohammad Nurul Huda

In this paper, we propose a system for environment recognition using selected MPEG-7 audio low-level descriptors together with conventional mel-frequency cepstral coefficients (MFCCs). The MPEG-7 descriptors are first ranked based on Fisher's discriminant ratio. Then principal component analysis is applied to the 30 top-ranked MPEG-7 descriptors to obtain 13 features. These 13 features are appended to the MFCC features to complete the feature set of the proposed system. Gaussian mixture models (GMMs) are used as the classifier. The system is evaluated using ten different environment sounds. The experimental results show a significant improvement in recognition performance of the proposed system over MFCC or full MPEG-7 descriptor based systems. For example, the best performance is achieved in the Restaurant environment, where MFCC, full MPEG-7, and the proposed method give 90%, 94%, and 96% accuracy, respectively.
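The feature-construction steps in this abstract (rank descriptors, reduce the top 30 to 13 by PCA, append to MFCCs) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the pairwise-sum form of Fisher's discriminant ratio, the function names, the count of 45 input descriptors, and the random data are all assumptions made for illustration, and the GMM classification stage is omitted.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature Fisher discriminant ratio, summed over all class pairs.

    Assumption: one common multi-class form of the ratio; the paper's exact
    definition is not given in the abstract.
    """
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes])
    score = np.zeros(X.shape[1])
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            score += (means[i] - means[j]) ** 2 / (variances[i] + variances[j] + 1e-12)
    return score

def pca_project(X, n_components):
    """Project mean-centered data onto its leading principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def build_features(descriptors, mfcc, y, top_k=30, n_components=13):
    """Rank descriptors, keep the top_k, reduce by PCA, append to MFCCs."""
    ranked = np.argsort(fisher_ratio(descriptors, y))[::-1]
    reduced = pca_project(descriptors[:, ranked[:top_k]], n_components)
    return np.hstack([mfcc, reduced])

# Synthetic stand-in data: 200 frames, 45 hypothetical descriptors, 13 MFCCs,
# ten environment classes. None of this is real audio.
rng = np.random.default_rng(0)
n_frames, n_desc, n_mfcc = 200, 45, 13
y = rng.integers(0, 10, size=n_frames)
descriptors = rng.normal(size=(n_frames, n_desc))
descriptors[:, 0] += y  # make descriptor 0 clearly class-dependent
mfcc = rng.normal(size=(n_frames, n_mfcc))

features = build_features(descriptors, mfcc, y)
print(features.shape)
```

Each row of `features` is a 26-dimensional vector (13 MFCCs plus 13 PCA components), which a per-class GMM could then model.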


Eurasip Journal on Audio, Speech, and Music Processing | 2008

Experiments on automatic recognition of nonnative Arabic speech

Yousef Ajami Alotaibi; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

The automatic recognition of foreign-accented Arabic speech is a challenging task since it involves a large number of nonnative accents. In addition, the nonnative speech data available for training are generally insufficient. Moreover, as compared to other languages, the Arabic language has sparked a relatively small number of research efforts. In this paper, we are concerned with the problem of nonnative speech in a speaker-independent, large-vocabulary speech recognition system for Modern Standard Arabic (MSA). We analyze some major differences at the phonetic level in order to determine which phonemes have a significant part in the recognition performance for both native and nonnative speakers. Special attention is given to specific Arabic phonemes. The performance of an HMM-based Arabic speech recognition system is analyzed with respect to speaker gender and native origin. The WestPoint Modern Standard Arabic database from the Linguistic Data Consortium (LDC) and the Hidden Markov Model Toolkit (HTK) are used throughout all experiments. Our study shows that the best performance in the overall phoneme recognition is obtained when nonnative speakers are involved in both training and testing phases. This is not the case when a language model and phonetic lattice networks are incorporated in the system. At the phonetic level, the results show that female nonnative speakers perform better than male nonnative speakers, and that emphatic phonemes yield a significant decrease in performance when they are uttered by both male and female nonnative speakers.


International Journal of Computer Processing of Languages | 2009

Evaluating the MSA West Point Speech Corpus

Yousef Ajami Alotaibi; Sid-Ahmed Selouani

Compared to other major languages of the world, the Arabic language suffers from a dearth of research initiatives and research resources. As a result, Modern Standard Arabic (MSA) lacks reliable speech corpora for research in phonetics and related areas of linguistics. In recent years the Linguistic Data Consortium (LDC) published the first public MSA speech corpus designed for speech recognition experiments. That corpus was called West Point. Currently, we are using this corpus in our research experiments for speech recognition and other speech processing investigations. The aim of this paper is to evaluate the West Point Corpus from the MSA phonetic and linguistic point of view. The phonemes used and their numbers, the phoneme definitions, the labeling, and the scripts established by the West Point Corpus are included in the evaluation. Weaknesses, strengths, and discrepancies of the West Point Corpus regarding the linguistic rules and phonetic characteristics of MSA are also discussed in this paper.


international symposium on signal processing and information technology | 2008

A Word-Dependent Automatic Arabic Speaker Identification System

Suliman S. Al-Dahri; Youssaf H. Al-Jassar; Yousef Ajami Alotaibi; Mansour Alsulaiman; Khondaker Abdullah-Al-Mamun

Automatic speaker recognition is one of the difficult tasks in the field of computer speech and speaker recognition. Speaker recognition is a biometric process of automatically recognizing who is speaking on the basis of speaker-dependent features of the speech signal. Currently, speaker recognition is an important means of authenticating a person, like other biometrics such as fingerprints and retinal scans. Speech-based recognition permits both on-site and remote access for the user. In this research, a speaker identification system is investigated from the speaker recognition point of view. It is an important component of a speech-based user interface. The aim of this research is to develop a system that is capable of identifying an individual from a sample of his or her speech. Arabic is a Semitic language that differs from European languages such as English. Our system is based on Arabic speech. We have chosen to work on a word-dependent system using the Arabic isolated word /n a c a m/ as a single keyword for the test utterance. This choice has been made because the word /n a c a m/ is widely used by Arabic speakers. Speech features are extracted using MFCC. The HTK is used to implement the speaker identification module with phoneme-based HMMs. The designed automatic Arabic speaker identification system covers 100 speakers and achieved 96.25% accuracy in recognizing the correct speaker.


international conference natural language processing | 2010

Comparative evaluation of two Arabic speech corpora

Yousef Ajami Alotaibi; Ali H. Meftah

The aim of this paper is to conduct a constructive and comparative evaluation of two important Arabic corpora for two different Arabic dialects, namely a Saudi dialect corpus that was collected by King Abdulaziz City for Science and Technology (KACST) and a Levantine Arabic dialect corpus that was produced by the Linguistic Data Consortium (LDC). The Levantine dialect is spoken by ordinary Lebanese, Jordanian, Syrian, and Palestinian people. The advantages and disadvantages of these two corpora are presented and discussed. This discussion aims to help digital speech processing researchers identify the weaknesses and strengths of these important corpora before considering them in their experiments. Moreover, this paper can motivate work on designing, maintaining, distributing, and upgrading Arabic corpora to help Arabic speech research communities.


Computer Speech & Language | 2010

Study on pharyngeal and uvular consonants in foreign accented Arabic for ASR

Yousef Ajami Alotaibi; Ghulam Muhammad

This paper investigates the unique pharyngeal and uvular consonants of Arabic from the point of view of automatic speech recognition (ASR). Comparisons of the recognition error rates for these phonemes are analyzed in five experiments that involve different combinations of native and non-native Arabic speakers. The three most confusing consonants for every investigated consonant are discussed. All experiments use the Hidden Markov Model Toolkit (HTK) and the Linguistic Data Consortium (LDC) WestPoint Modern Standard Arabic (MSA) database. Results confirm that these distinct Arabic consonants are a major source of difficulty for Arabic ASR. While the recognition rate for certain of these unique consonants such as /@?/ can drop below 35% when uttered by non-native speakers, there is an advantage to including non-native speakers in ASR. In addition, regional differences in the pronunciation of MSA by native Arabic speakers require the attention of Arabic ASR research.


international conference on future generation information technology | 2009

Speech Recognition System and Formant Based Analysis of Spoken Arabic Vowels

Yousef Ajami Alotaibi; Amir Hussain

Arabic is one of the world's oldest languages and is currently the second most spoken language in terms of number of speakers. However, it has not received much attention from the traditional speech processing research community. This study is specifically concerned with the analysis of vowels in Modern Standard Arabic. The first and second formant values of these vowels are investigated, and the differences and similarities between the vowels are explored using consonant-vowel-consonant (CVC) utterances. For this purpose, an HMM-based recognizer was built to classify the vowels, and the performance of the recognizer was analyzed to help understand the similarities and dissimilarities between the phonetic features of the vowels. The vowels are also analyzed in both the time and frequency domains, and the consistent findings of the analysis are expected to facilitate future Arabic speech processing tasks such as vowel and speech recognition and classification.


International Journal of Speech Technology | 2012

Comparing ANN to HMM in implementing limited Arabic vocabulary ASR systems

Yousef Ajami Alotaibi

In this paper we investigated Artificial Neural Network (ANN) based Automatic Speech Recognition (ASR) using limited Arabic vocabulary corpora. These limited Arabic vocabulary subsets are digits and vowels carried by specific carrier words. In addition, Hidden Markov Model (HMM) based ASR systems are designed and compared to two ANN based systems, namely Multilayer Perceptron (MLP) and recurrent architectures, using the same corpora. All systems are isolated-word speech recognizers. The ANN based recognition system achieved 99.5% correct digit recognition, whereas the HMM based recognition system achieved 98.1% correct digit recognition. With vowel carrier words, the MLP and recurrent ANN based recognition systems achieved 92.13% and 98.06% correct vowel recognition, respectively, while the HMM based recognition system achieved 91.6% correct vowel recognition.


international conference on image and signal processing | 2010

Speech recognition system of Arabic alphabet based on a telephony Arabic corpus

Yousef Ajami Alotaibi; Mansour M. Alghamdi; Fahad Alotaiby

Automatic recognition of spoken alphabets is one of the difficult tasks in the field of computer speech recognition. In this research, spoken Arabic alphabets are investigated from the speech recognition point of view. The system is designed to recognize the spelling of an isolated word. The Hidden Markov Model Toolkit (HTK) is used to implement the isolated-word recognizer with phoneme-based HMMs. In the training and testing phases of this system, isolated alphabet data sets are taken from the telephony Arabic speech corpus, SAAVB. This standard corpus was developed by KACST and is classified as a noisy speech database. A hidden Markov model based speech recognition system was designed and tested on automatic Arabic alphabet recognition. Four different experiments were conducted on these subsets: the first three were trained and tested using each individual subset, and the fourth was conducted on the three subsets collectively. The recognition system achieved 64.06% overall correct alphabet recognition using the mixed training and testing subsets collectively.

Collaboration


Dive into Yousef Ajami Alotaibi's collaboration.

Top Co-Authors

Mansour M. Alghamdi

King Abdulaziz City for Science and Technology

Mohammad Nurul Huda

United International University
