Yacine Benahmed
Institut national de la recherche scientifique
Publications
Featured research published by Yacine Benahmed.
International Journal of Distance Education Technologies | 2008
Sid-Ahmed Selouani; Tang-Ho Lê; Yacine Benahmed; Douglas D. O'Shaughnessy
This article presents systems that use speech technology to emulate the one-on-one interaction a student can get from a virtual instructor. A web-based learning tool, the Learn IN Context (LINC+) system, designed and used in a real mixed-mode learning context for a computer programming (C++ language) course taught at the Université de Moncton (Canada), is described here. It integrates an Internet Voice Searching and Navigating (IVSN) system that helps learners search and navigate both the web and their desktop environment through voice commands and dictation. LINC+ also incorporates an Automatic User Profile Building and Training (AUPB&T) module that allows users to increase speech recognition performance without having to go through the long and tedious manual training process. The findings show that the majority of learners seem satisfied with this new medium and confirm that it does not negatively affect their cognitive load.
Canadian Conference on Electrical and Computer Engineering | 2006
Yacine Benahmed; Sid-Ahmed Selouani
This paper presents an automatic user profile building and training (AUPB&T) system using voice pitch variation for speech recognition engines. The problem with current ASR engines is that their vocabularies are usually suited only for general usage. Another problem is that there is no easy means for visually challenged users to train the engine to improve its performance. Our proposed solution consists of a system that can accept a user's documents and favorite Web pages. These documents can then be parsed and their words added to the ASR engine's lexicon. Next, the system uses those documents to start an ASR training session. The training can be completed automatically by using a high-quality text-to-speech (TTS) natural voice. In order to overcome the limited number of high-quality natural TTS voices available, we propose to integrate voice pitch variation during the training phase of AUPB&T, which can cover a broader range of user variability. The results of our experiments using standard ASR and TTS engines show that the AUPB&T system using pitch variation improved the recognition rate for an unknown beta speaker.
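A minimal sketch of the pitch-variation idea described above: a single TTS-generated utterance is pitch-shifted to simulate several synthetic training voices. The file names, semitone offsets, and the use of librosa/soundfile are illustrative assumptions, not the tools reported in the paper.

```python
# Sketch: simulate speaker variability by pitch-shifting a TTS-generated utterance.
# Assumes a WAV file produced beforehand by any TTS engine; librosa and soundfile
# are illustrative choices, not the components used in the actual AUPB&T system.
import librosa
import soundfile as sf

y, sr = librosa.load("tts_utterance.wav", sr=None)   # original synthetic voice

# Generate pitch-shifted variants covering a broader range of voice variability.
for n_steps in (-4, -2, 2, 4):                        # semitone offsets (illustrative)
    shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
    sf.write(f"tts_utterance_shift{n_steps:+d}.wav", shifted, sr)
    # Each variant would then be fed to the ASR engine's training session
    # together with the known transcript.
```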
International Conference on Acoustics, Speech, and Signal Processing | 2011
Yacine Benahmed; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy; Amin Haji Abolhassani
This paper presents a system that allows the user to interact by speech with a Radio-Frequency IDentification (RFID) network operating in a highly noisy environment. A new dialog framework is proposed in order to give human operators the ability to communicate with the system in a more natural fashion. This is achieved by the implementation of the Artificial Intelligence Markup Language (AIML), which allows the system to respond to close-to-natural-language queries by means of pattern matching. In addition, in order to deal with highly noisy and changing environments, an online signal subspace enhancement technique based on the Variance of the Reconstruction Error of the Karhunen-Loève Transform (VRE-KLT) is proposed. Both dialog naturalness and speech enhancement efficiency are evaluated in real-life situations using the Microsoft Speech API framework.
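To make the AIML-style pattern matching concrete, here is a minimal sketch of wildcard matching in Python; the patterns and responses are invented placeholders, not the dialog grammar of the RFID system.

```python
# Sketch: AIML-style pattern matching with a single "*" wildcard.
# Patterns and responses are invented examples, not the actual RFID dialog grammar.
import re

patterns = {
    "WHERE IS *":          "Tag {0} was last read at the dock-door antenna.",
    "HOW MANY * IN STOCK": "Querying inventory for {0}.",
}

def respond(utterance: str) -> str:
    text = utterance.upper().strip().rstrip("?.")
    for pattern, template in patterns.items():
        # Convert the AIML-like pattern into a regular expression.
        regex = "^" + re.escape(pattern).replace(r"\*", "(.+)") + "$"
        match = re.match(regex, text)
        if match:
            return template.format(*match.groups())
    return "I did not understand the request."

print(respond("Where is pallet 42?"))        # matches the first pattern
print(respond("How many scanners in stock"))  # matches the second pattern
```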
International Conference on Acoustics, Speech, and Signal Processing | 2014
Yacine Benahmed; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy
In this paper, we introduce a novel method of smoothing language models (LM) based on the semantic information found in ontologies, especially suited to limited-resource language modeling. We exploit the latent knowledge of language that is deeply encoded within ontologies. As such, this work examines the potential of using the semantic and syntactic relations between words from the WordNet ontology to generate new plausible contexts for unseen events, thereby simulating a larger corpus. These unseen events are then mixed with a baseline Witten-Bell (WB) LM in order to improve its performance, both in terms of language model perplexity and automatic speech recognition word error rate. Results indicate a significant reduction in the perplexity of the language model (up to 9.85% relative) while reducing the word error rate in a statistically significant manner compared to both the original WB LM and a baseline Kneser-Ney smoothed language model on the Wall Street Journal-based Continuous Speech Recognition Phase II corpus.
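A minimal sketch, assuming NLTK's WordNet interface, of how plausible unseen n-grams could be generated by substituting semantically related words into an observed trigram. The substitution rule and the example trigram are simplifications for illustration, not the exact expansion procedure of the paper.

```python
# Sketch: generate candidate unseen n-grams by substituting WordNet relatives
# (synonyms and hypernym lemmas) for one word of an observed trigram.
from nltk.corpus import wordnet as wn   # requires: nltk.download("wordnet")

def related_words(word, pos=wn.NOUN):
    """Collect synonyms and hypernym lemmas of a word from WordNet."""
    relatives = set()
    for synset in wn.synsets(word, pos=pos):
        relatives.update(l.name() for l in synset.lemmas())
        for hyper in synset.hypernyms():
            relatives.update(l.name() for l in hyper.lemmas())
    relatives.discard(word)
    return {w.replace("_", " ") for w in relatives}

observed = ("sell", "the", "company")          # trigram seen in the corpus
for substitute in sorted(related_words("company")):
    print(("sell", "the", substitute))          # candidate unseen trigram
```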
International Conference on Innovations in Information Technology | 2007
Yacine Benahmed; Sid-Ahmed Selouani; Habib Hamam; Douglas D. O'Shaughnessy
This paper presents an automatic user profile building and training (AUPB&T) system for speech recognition. This system uses text-to-speech (TTS) voices to improve the language models and the performance of current commercial automatic speech recognition (ASR) engines. The vocabularies of these systems are usually suited for general usage, and users have no easy means of training the engines; they generally shun the proposed training methods, which require long and tedious training sessions. Our proposed solution is a system that accepts the user's documents and favorite Web pages and feeds them to a TTS module in order to improve the accuracy of spoken information retrieval queries. The results show that AUPB&T considerably improves the recognition performance of the Microsoft speech recognition engine without having to resort to manual training.
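A minimal sketch of the first step described above, harvesting candidate out-of-vocabulary words from a user's documents so they can be added to an ASR lexicon. The file names and the baseline lexicon are placeholders, and the Microsoft Speech API calls used by the actual system are not reproduced.

```python
# Sketch: harvest candidate out-of-vocabulary words from a user's documents.
# File names and the baseline lexicon are placeholders for illustration only.
import re
from collections import Counter
from pathlib import Path

def harvest_vocabulary(doc_paths, known_lexicon, min_count=2):
    counts = Counter()
    for path in doc_paths:
        text = Path(path).read_text(encoding="utf-8", errors="ignore").lower()
        counts.update(re.findall(r"[a-zA-Z']+", text))
    # Keep words frequent in the user's documents but absent from the lexicon.
    return sorted(w for w, c in counts.items()
                  if c >= min_count and w not in known_lexicon)

known = {"the", "of", "speech", "recognition"}          # placeholder lexicon
print(harvest_vocabulary(["my_notes.txt"], known))      # candidate lexicon additions
```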
European Signal Processing Conference | 2016
Yacine Benahmed; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy
This paper investigates the use of graph metrics to further enhance the performance of a language model smoothing algorithm. Bin-Based Ontological Smoothing has been successfully used to improve language model performance in automatic speech recognition tasks; it uses ontologies to estimate novel utterances for a given language model. Since ontologies can be represented as graphs, we investigate the use of graph metrics as an additional smoothing factor in order to capture additional semantic or relational information found in ontologies. More specifically, we investigate the effect of HITS, PageRank, modularity, and weighted degree on performance. The entire power set of bins is evaluated. Our results show that interpolating the original bins at distances 1, 3, and 5 improves WER by 0.71% relative over interpolating bins 1 to 5. Furthermore, modularity, PageRank, and HITS show promise for further study.
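A minimal sketch, assuming networkx, of computing the graph metrics mentioned above (PageRank, HITS, weighted degree) on a toy graph standing in for an ontology; the edges are invented for illustration, and modularity is omitted because it additionally requires a community partition.

```python
# Sketch: graph metrics on a toy directed graph standing in for an ontology.
import networkx as nx

G = nx.DiGraph()
G.add_weighted_edges_from([
    ("dog", "canine", 1.0),      # hypernym-like links (invented examples)
    ("canine", "mammal", 1.0),
    ("cat", "feline", 1.0),
    ("feline", "mammal", 1.0),
    ("mammal", "animal", 1.0),
])

pagerank = nx.pagerank(G, weight="weight")      # PageRank scores
hubs, authorities = nx.hits(G)                  # HITS hub/authority scores
weighted_degree = dict(G.degree(weight="weight"))

for node in G.nodes:
    print(node, round(pagerank[node], 3),
          round(authorities[node], 3), weighted_degree[node])
```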
Information Sciences, Signal Processing and Their Applications | 2012
Yacine Benahmed; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy
This paper presents a recognition engine especially tailored to the French language spoken in the Canadian province of New Brunswick. It studies a global monophone model that handles the linguistic variability found in the province. The study also explores the impact of speaker locality on recognition rate when using the global model. Three models are implemented, one for each linguistic pole: North-East, North-West, and South-East. The results show phone and word recognition rates of 83.58% and 72.66%, respectively, for acoustic models built on Mel-frequency cepstral coefficients, energy, delta, and acceleration parameters and trained discriminatively with the maximum mutual information and minimum phone error criteria. Finally, we observe that the general acoustic models are sufficiently generalized to perform uniformly across the three linguistic poles, with an average phone recognition rate of 82.8% across the three different acoustic models.
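A minimal sketch of extracting the MFCC, delta, and acceleration features mentioned above, assuming librosa; the frame settings are illustrative, the energy term is approximated by the zeroth cepstral coefficient, and HMM training itself (HTK-style) is not shown.

```python
# Sketch: MFCC + delta + acceleration (delta-delta) feature extraction.
# Frame parameters are illustrative; acoustic model training is not reproduced.
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            n_fft=400, hop_length=160)   # 25 ms frames, 10 ms shift
delta = librosa.feature.delta(mfcc)
accel = librosa.feature.delta(mfcc, order=2)

features = np.vstack([mfcc, delta, accel])   # (39, n_frames) feature matrix
print(features.shape)
```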
Information Sciences, Signal Processing and Their Applications | 2012
Alaidine Ben Ayed; Sid-Ahmed Selouani; Mustapha Kardouchi; Yacine Benahmed
This paper presents a new system for radiological image classification. The proposed system is built on Hidden Markov Models (HMMs). In this work, the Hidden Markov Model Toolkit (HTK), which was primarily designed for speech recognition research, is adapted to the image classification task. Features are extracted with the shape context descriptor. They are converted to HTK format by first adding headers and then representing them as successive frames, each frame being multiplied by a windowing function. The features are then used by HTK for training and classification. Classes of the medical IRMA database are used in the experiments. A comparison with a neural network based system shows the efficiency of the proposed approach.
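A minimal sketch of the conversion step described above: writing a feature matrix as a binary HTK parameter file, i.e. a 12-byte header (nSamples, sampPeriod, sampSize, parmKind) followed by big-endian float32 frames. The USER parameter-kind code, frame period, and feature dimensionality are assumptions for illustration.

```python
# Sketch: write a feature matrix in HTK parameter-file format.
# Header fields and the USER parameter kind (code 9) are assumptions for illustration.
import struct
import numpy as np

def write_htk(path, frames, sample_period_100ns=100000, parm_kind=9):
    frames = np.asarray(frames, dtype=">f4")            # big-endian float32
    n_samples, n_dims = frames.shape
    with open(path, "wb") as f:
        f.write(struct.pack(">iihh", n_samples, sample_period_100ns,
                            n_dims * 4, parm_kind))     # 12-byte HTK header
        f.write(frames.tobytes())

# Example: 5 frames of a 60-dimensional shape-context vector (random stand-in data).
write_htk("image_features.htk", np.random.rand(5, 60))
```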
Canadian Conference on Electrical and Computer Engineering | 2012
Yacine Benahmed; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy
This paper presents improvements in a dialogue interpreter sub-system for an application that allows the user to interact by speech with a Radio-Frequency IDentification (RFID) network operating in a highly noisy environment. A new dialog framework is proposed in order to give human operators the ability to communicate with the system in a more natural fashion. This is achieved by the implementation of the Artificial Intelligence Markup Language (AIML) combined with ontology-based pattern generation and root semantic analysis algorithms, which allow the system to respond to close-to-natural-language queries by means of pattern matching.
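A minimal sketch of what ontology-based pattern generation could look like: expanding an AIML-style pattern with WordNet synonyms so that the dialog system matches more phrasings. The seed pattern and keyword are invented placeholders, not the patterns used in the RFID application.

```python
# Sketch: expand an AIML-style pattern with WordNet synonyms of a keyword,
# in the spirit of ontology-based pattern generation. Placeholders only.
from nltk.corpus import wordnet as wn   # requires: nltk.download("wordnet")

def expand_pattern(pattern, keyword):
    """Yield variants of a pattern with the keyword replaced by WordNet synonyms."""
    synonyms = {l.name().replace("_", " ").upper()
                for s in wn.synsets(keyword)
                for l in s.lemmas()}
    for syn in sorted(synonyms):
        yield pattern.replace(keyword.upper(), syn)

seed = "WHERE IS THE SHIPMENT *"
for variant in expand_pattern(seed, "shipment"):
    print(variant)   # e.g. "WHERE IS THE CARGO *", "WHERE IS THE CONSIGNMENT *"
```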
Archive | 2007
Sid-Ahmed Selouani; Habib Hamam; Yacine Benahmed