Mohamed Elmahdy
German University in Cairo
Publications
Featured research published by Mohamed Elmahdy.
Archive | 2012
Mohamed Elmahdy; Rainer Gruhn; Wolfgang Minker
Novel Techniques for Dialectal Arabic Speech Recognition describes approaches to improving automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, under the assumption that MSA is a second language for all Arabic speakers. Egyptian Colloquial Arabic (ECA) was chosen as a typical Arabic dialect: ECA ranks first among Arabic dialects in number of speakers, and a high-quality ECA speech corpus with accurate phonetic transcription was collected. MSA acoustic models were trained on news broadcast speech. To use MSA cross-lingually in dialectal Arabic speech recognition, the authors normalized the phoneme sets of MSA and ECA. After this normalization, they applied state-of-the-art acoustic model adaptation techniques, such as Maximum Likelihood Linear Regression (MLLR) and Maximum A-Posteriori (MAP) adaptation, to adapt existing phonemic MSA acoustic models with a small amount of dialectal ECA speech data. Speech recognition results show a significant increase in recognition accuracy compared to a baseline model trained only on ECA data.
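The MAP adaptation step described above can be sketched for a single Gaussian mean (a minimal illustration only; the actual technique adapts full mixture models using frame-level posterior counts, and the relevance factor below is an assumed value):

```python
# Sketch of Maximum A-Posteriori (MAP) mean adaptation for one Gaussian,
# as used to adapt MSA acoustic models with a small amount of dialectal
# (ECA) data. tau is an assumed relevance factor; frames stand in for
# the acoustation data assigned to this Gaussian.

def map_adapt_mean(prior_mean, frames, tau=16.0):
    """Interpolate the prior (MSA) mean with the dialectal sample mean."""
    n = float(len(frames))
    if n == 0:
        return prior_mean  # no adaptation data: keep the prior model
    sample_mean = sum(frames) / n
    # Posterior mean: count-weighted average of prior and new statistics.
    return (tau * prior_mean + n * sample_mean) / (tau + n)

print(map_adapt_mean(0.0, [1.0] * 4))    # little data: stays near prior
print(map_adapt_mean(0.0, [1.0] * 400))  # much data: nears sample mean
```

With few dialectal frames the adapted mean stays close to the MSA prior; with more data it converges to the dialect statistics, which is why MAP suits the sparse-resource setting described above.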
international conference on acoustics, speech, and signal processing | 2011
Mohamed Elmahdy; Rainer Gruhn; Slim Abdennadher; Wolfgang Minker
We propose the Arabic Chat Alphabet (ACA), as naturally written in everyday life, for dialectal Arabic speech transcription. Our assumption is that ACA is a natural writing system that includes the short vowels missing from traditional Arabic orthography; furthermore, ACA transcriptions can be prepared rapidly. Egyptian Colloquial Arabic was chosen as a typical dialect. Two speech recognition baselines were built: phonemic and graphemic. The original transcriptions were rewritten in ACA by different transcribers. Ambiguous ACA sequences were handled by automatically generating all possible variants, and ACA variation across transcribers was modeled by phoneme normalization and merging. Results show that the ACA-based approach outperforms the graphemic baseline and performs nearly as accurately as the phoneme-based baseline, with only a slight increase in WER.
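The variant-generation idea can be sketched as follows (the ACA-to-phoneme table here is illustrative, not the authors' actual mapping):

```python
from itertools import product

# Toy sketch of expanding an ambiguous Arabic Chat Alphabet (ACA)
# spelling into all candidate phoneme sequences. The mapping below is
# purely illustrative.
ACA_MAP = {
    "7": ["H"],          # unambiguous: pharyngeal fricative
    "3": ["E"],          # unambiguous: voiced pharyngeal
    "o": ["o", "u"],     # ambiguous vowel quality
    "e": ["e", "i"],
}

def expand_variants(aca_word):
    """Return every phoneme-sequence variant of an ACA-written word."""
    choices = [ACA_MAP.get(ch, [ch]) for ch in aca_word]
    return ["".join(seq) for seq in product(*choices)]

print(expand_variants("7elo"))  # 2 x 2 = 4 pronunciation variants
```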
natural language processing and knowledge engineering | 2009
Mohamed Elmahdy; Rainer Gruhn; Wolfgang Minker; Slim Abdennadher
In this paper we propose a new multilingual approach for dialectal Arabic speech recognition. Dialectal Arabic is spoken but rarely written in almost all domains, and there is no standard for dialectal Arabic transcription. Preparing large training corpora for dialectal Arabic acoustic modeling is therefore much more difficult than for Modern Standard Arabic. We built several acoustic models from a news broadcast speech corpus of Modern Standard Arabic. Egyptian Colloquial Arabic was chosen as a typical Arabic dialect, and we collected an Egyptian Colloquial Arabic connected-digits corpus to evaluate our approach. We were able to use Modern Standard Arabic acoustic models as multilingual models to decode Egyptian Arabic, reaching a recognition rate of 99.34%, which is very satisfactory compared to the monolingual approach and to previous work in spoken Arabic digit recognition.
2007 ITI 5th International Conference on Information and Communications Technology | 2007
Ahmed Hamdy; Mohamed Elmahdy; Maha Elsabrouty
Face detection plays a major role in many applications such as security, surveillance, and human-computer interfaces. This paper presents a new face detection algorithm for a drowsy-driver assistance system. The algorithm is based on a combination of two different detection principles, namely skin-color detection and a modified PCA analysis. It shows improved performance compared to using either principle alone and performs well under different lighting conditions.
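Skin-color detection of this kind is often implemented as a simple per-pixel RGB rule; the thresholds below follow one widely cited heuristic and are not necessarily those used in the paper:

```python
def is_skin_rgb(r, g, b):
    """A common RGB skin-color heuristic (uniform-daylight variant);
    illustrative only -- not necessarily the paper's thresholds."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)

print(is_skin_rgb(220, 170, 140))  # typical skin tone
print(is_skin_rgb(60, 90, 200))    # blue background pixel
```

A full detector would apply such a rule per pixel to form candidate regions, then verify them (here, with the modified PCA stage).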
midwest symposium on circuits and systems | 2003
Ahmed M. Badawi; Mohamed Elmahdy
We propose a path planning simulation system for 3D ultrasound-guided needle biopsy. The system is designed to accurately take a 3D biopsy specimen from an abdominal focal lesion without puncturing a blood vessel or passing through ribs. It traces the 3D position of the biopsy needle and visualizes the needle in 3D and 2D in real time. The system gives the physician the optimal pathway to the biopsy target before puncturing the patient and visualizes this pathway in real time during the fly-through movement. Real-time visualization of the needle in MPR views and in the 3D volume of interest is performed. Experimental results on in-vitro phantoms showed promising results for this system.
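The vessel-avoidance check at the core of such path planning can be sketched as a segment-sphere clearance test (pure geometry; modeling a vessel as a sphere is an illustrative simplification, not the paper's method):

```python
def segment_hits_sphere(p0, p1, center, radius):
    """Check whether a straight needle path from p0 to p1 passes within
    `radius` of `center` (e.g. a blood vessel modeled as a sphere).
    Points are (x, y, z) tuples."""
    d = [b - a for a, b in zip(p0, p1)]      # path direction vector
    f = [a - c for a, c in zip(p0, center)]  # center-to-start vector
    dd = sum(x * x for x in d)
    if dd == 0:  # degenerate path: just test the single point
        return sum(x * x for x in f) <= radius * radius
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, -sum(x * y for x, y in zip(f, d)) / dd))
    closest = [a + t * x for a, x in zip(p0, d)]
    dist2 = sum((c - s) ** 2 for c, s in zip(closest, center))
    return dist2 <= radius * radius

print(segment_hits_sphere((0, 0, 0), (10, 0, 0), (5, 1, 0), 2.0))  # hit
print(segment_hits_sphere((0, 0, 0), (10, 0, 0), (5, 5, 0), 2.0))  # clear
```

A planner would run such a test against every segmented obstacle for each candidate path and keep only the collision-free ones.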
Procedia Computer Science | 2017
Injy Hamed; Mohamed Elmahdy; Slim Abdennadher
The use of mixed languages in daily conversation, referred to as "code-switching", has become a common linguistic phenomenon in bilingual and multilingual communities. Code-switching is the alternating use of distinct languages, or "codes", at sentence boundaries or within the same sentence. With the rise of globalization, code-switching has become prevalent in daily conversation, especially among urban youth, and this has led to increasing demand for automatic speech recognition systems that can handle such mixed speech. In this paper, we present the first steps towards building a multilingual language model (LM) for code-switched Arabic-English. One of the main challenges in building a multilingual LM is the need for an explicitly mixed text corpus; since code-switching occurs more commonly in spoken than in written form, text corpora with code-switching are usually scarce. The first aim of this paper is therefore to introduce a code-switched Arabic-English text corpus collected by automatically downloading relevant documents from the web; the text is then extracted from the documents and processed to be usable for NLP tasks. For language modeling, a baseline LM was built from existing monolingual corpora, giving a perplexity of 11841.9 and an Out-of-Vocabulary (OOV) rate of 4.07%. The gathered code-switched Arabic-English corpus, along with the existing monolingual corpora, was then used to construct several LMs. The best LM achieved a substantial improvement over the baseline, with a perplexity of 275.41 and an OOV rate of 0.71%.
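The two reported metrics can be illustrated with a toy unigram LM (real systems use higher-order smoothed n-grams; this only shows how perplexity and OOV rate are computed):

```python
import math
from collections import Counter

def unigram_ppl_and_oov(train_tokens, test_tokens):
    """Toy unigram LM: perplexity and OOV rate on test data.
    Illustrative only -- not the smoothing used in the paper."""
    counts = Counter(train_tokens)
    vocab = set(counts)
    total = sum(counts.values())
    # Add-one smoothing over the training vocabulary plus one OOV slot.
    denom = total + len(vocab) + 1
    log_prob = 0.0
    oov = 0
    for tok in test_tokens:
        if tok not in vocab:
            oov += 1
        log_prob += math.log((counts[tok] + 1) / denom)
    ppl = math.exp(-log_prob / len(test_tokens))
    return ppl, oov / len(test_tokens)

print(unigram_ppl_and_oov(["a", "b", "a"], ["a", "c"]))
```

A lower perplexity means the test text is less surprising to the model, which is why adding in-domain code-switched data cuts it so sharply.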
2016 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE) | 2016
Ayman A. Zayyan; Mohamed Elmahdy; Husniza Husni; Jihad Mohamad Al Ja'am
In this paper, the problem of missing diacritic marks in most written Arabic resources is investigated. Our aim is to implement a scalable and extensible platform that automatically restores missing diacritic marks in Modern Standard Arabic text. Different rule-based and statistical techniques are proposed, including a morphological-analyzer-based technique, maximum likelihood estimation, and statistical n-gram models. The diacritization accuracy of each technique was evaluated in terms of Diacritic Error Rate (DER) and Word Error Rate (WER). The proposed platform includes helper tools for text preprocessing and encoding conversion. It yielded a WER of 7.1% and a DER of 3.9%; when the case ending was ignored, it yielded a WER and DER of 5.1% and 2.7%, respectively.
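WER, one of the two reported metrics, is the edit distance between reference and hypothesis word sequences divided by the reference length; a minimal sketch:

```python
def wer(reference, hypothesis):
    """Word Error Rate via Levenshtein distance over word lists."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance table.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("a b c d", "a x c"))  # 1 substitution + 1 deletion over 4 words
```

Running the same routine over per-character diacritic sequences instead of words gives a DER-style score.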
international conference natural language processing | 2009
Mohamed Elmahdy; Rainer Gruhn; Wolfgang Minker; Slim Abdennadher
Grapheme-based acoustic modeling for Arabic is an active research area, since obtaining highly accurate phonetic transcriptions remains an open problem. In this paper, we study a pure grapheme-based approach that uses Gaussian mixture models to implicitly model missing diacritics, and we investigate the effect of the number of Gaussian densities and the amount of training data on speech recognition accuracy. Two transcription systems were built: a phoneme-based system and a grapheme-based system. Several acoustic models were created with each system by varying the number of Gaussian densities and the amount of training data. Results show that, as the number of Gaussian densities or the amount of training data increases, the grapheme-based approach improves faster than the phoneme-based approach. Hence, the accuracy gap between the two approaches can be compensated for by increasing either the number of Gaussian densities or the amount of training data.
International Conference on Statistical Language and Speech Processing | 2018
David Awad; Caroline Sabty; Mohamed Elmahdy; Slim Abdennadher
Many applications that we use daily incorporate Natural Language Processing (NLP), from simple tasks such as automatic text correction to speech recognition. Much research has been done on NLP for English, but far less attention has been given to Arabic. The purpose of this work is to implement a tagging model for Arabic Named Entity Recognition, an important information extraction task in NLP that serves as a building block for more advanced tasks. We developed a deep learning model that combines a Bidirectional Long Short-Term Memory network and a Conditional Random Field, with additional network layers such as word embeddings, a Convolutional Neural Network, and character embeddings. Hyperparameters were tuned to maximize the F1-score.
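The F1-score being maximized is typically computed at the entity level over exact span matches; a sketch under that common convention (the spans below are illustrative):

```python
def entity_f1(gold, predicted):
    """Entity-level F1 over (type, start, end) spans, using the common
    exact-match convention; illustrative of the metric, not the model."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)  # spans with correct type and boundaries
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = [("PER", 0, 2), ("LOC", 5, 6)]
pred = [("PER", 0, 2), ("LOC", 4, 6)]  # second span has a boundary error
print(entity_f1(gold, pred))
```

Exact-match scoring is strict: a single boundary error counts as both a false positive and a false negative, as in the example above.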
International Conference on Advanced Intelligent Systems and Informatics | 2018
Injy Hamed; Mohamed Elmahdy; Slim Abdennadher
It has become common, especially among urban youth, to use more than one language in everyday conversation - a phenomenon linguists refer to as "code-switching". With the rise of globalization and the widespread use of code-switching in multilingual societies, great demand has been placed on Natural Language Processing (NLP) applications to handle such mixed data. In this paper, we present our efforts in language modeling for code-switched Arabic-English. Training a language model (LM) requires huge amounts of text in the respective language, and the main challenge in language modeling for code-switched languages is the lack of available data. We therefore propose an approach that artificially generates code-switched Arabic-English n-grams, expanding the relatively small available corpus and its n-grams using translation-based approaches, and thus improves the language model. The final LM achieved relative improvements in perplexity and OOV rate of 1.97% and 16.36%, respectively.
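The translation-based expansion can be sketched at the bigram level (the lexicon and n-grams below are illustrative placeholders, not the paper's data or exact procedure):

```python
# Sketch: synthesize code-switched bigrams by substituting words in
# monolingual n-grams with dictionary translations. Romanized Arabic
# stands in for Arabic script; the lexicon is invented for illustration.

translations = {"kitab": "book", "gamil": "nice"}  # Arabic -> English

def synthesize_codeswitch(bigrams):
    """For each monolingual bigram, emit variants with one word
    replaced by its translation."""
    generated = []
    for w1, w2 in bigrams:
        if w1 in translations:
            generated.append((translations[w1], w2))
        if w2 in translations:
            generated.append((w1, translations[w2]))
    return generated

print(synthesize_codeswitch([("kitab", "gamil")]))
```

The synthetic n-grams are then added to the LM training counts, giving the model evidence for mixed-language word sequences it never saw in the small real corpus.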