
Publications


Featured research published by Abdelaziz A. Abdelhamid.


International Conference on Social Robotics | 2012

RoboASR: a dynamic speech recognition system for service robots

Abdelaziz A. Abdelhamid; Waleed H. Abdulla; Bruce A. MacDonald

This paper proposes a new method for building dynamic speech decoding graphs for state-based spoken human-robot interaction (HRI). Current robotic speech recognition systems are based on either a finite state grammar (FSG), a statistical N-gram model, or a dual FSG and N-gram combination using multi-pass decoding. The proposed method merges FSG and N-gram models into a single decoding graph by converting the FSG rules into a weighted finite state acceptor (WFSA) and then composing it with a large N-gram-based weighted finite state transducer (WFST). The result is a tiny decoding graph that can be used in single-pass decoding. The method is applied in our speech recognition system (RoboASR) for controlling service robots with limited resources. The proposed approach has three advantages. First, it takes advantage of both FSG and N-gram decoders by composing them into a single tiny decoding graph. Second, it is robust: the resulting tiny decoding graph is highly accurate due to its fitness to the HRI state. Third, it has a fast response time in comparison with current state-of-the-art speech recognition systems. The proposed system has a large vocabulary of 64K words with more than 69K entries. Experimental results show that the average response time is 0.05% of the utterance length and the average ratio between true and false positives is 89% when tested on 15 interaction scenarios using live speech.
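The core operation the abstract describes, composing a grammar acceptor with an N-gram-weighted acceptor into one decoding graph, can be sketched as a product construction over shared symbols. This is a minimal illustrative sketch, not the authors' implementation: the toy states, words, and weights are assumptions, and weights simply add as in the tropical semiring.

```python
# Toy composition (intersection) of two weighted acceptors, in the spirit
# of merging an FSG with an N-gram model into a single decoding graph.
# Each acceptor: {"start": s, "arcs": {(state, symbol): (next, weight)},
# "finals": {...}}. Deterministic arcs assumed for simplicity.

def compose(a, b):
    """Intersect two weighted acceptors; arc weights add (tropical)."""
    start = (a["start"], b["start"])
    arcs, finals = {}, set()
    stack, seen = [start], {start}
    while stack:
        sa, sb = stack.pop()
        for (qa, sym), (ra, wa) in a["arcs"].items():
            if qa != sa:
                continue
            hit = b["arcs"].get((sb, sym))
            if hit is None:
                continue                      # symbol not allowed by both
            rb, wb = hit
            arcs[((sa, sb), sym)] = ((ra, rb), wa + wb)
            if (ra, rb) not in seen:
                seen.add((ra, rb))
                stack.append((ra, rb))
        if sa in a["finals"] and sb in b["finals"]:
            finals.add((sa, sb))
    return {"start": start, "arcs": arcs, "finals": finals}

# Hypothetical FSG accepting "go home", and an N-gram-style scorer.
fsg = {"start": 0, "arcs": {(0, "go"): (1, 0.0), (1, "home"): (2, 0.0)},
       "finals": {2}}
ngram = {"start": 0, "arcs": {(0, "go"): (1, 0.5), (1, "home"): (1, 0.3),
                              (1, "go"): (1, 0.9)},
         "finals": {1}}
graph = compose(fsg, ngram)
```

The composed graph keeps only paths both models accept, with N-gram weights attached, which is why the result stays small relative to the full N-gram transducer.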


Robotics, Automation and Mechatronics | 2013

On the robustness of tiny decoding graphs for voice-based robotic interaction

Abdelaziz A. Abdelhamid; Waleed H. Abdulla; Bruce A. MacDonald

In this paper we study the robustness of a command decoding approach based on tiny decoding graphs for voice-based robotic interaction. The approach fuses grammar rules with statistical n-gram language models to produce a compact and efficient tiny decoding graph. The resulting graph has several advantages, such as high speed and improved robustness of command decoding even in adverse noisy conditions. To validate the robustness of the proposed approach, we employed a set of spoken commands from the Resource Management (RM1) command-and-control corpus. These commands were artificially corrupted by 10 types of noise at different signal-to-noise ratios (SNRs). Experimental results show that the proposed approach achieved word error rates of 1.9% and 29% at 20 dB and 5 dB SNR, respectively, whereas the word error rates for the same task using traditional grammar rules were 43% and 75% at the same SNRs.


International Conference on Communications | 2013

Optimizing the parameters of WFST-based decoding graphs using soft margin estimation

Abdelaziz A. Abdelhamid; Waleed H. Abdulla

The document, which includes four blank pages, was not presented at the conference and therefore was not made available for publication as part of the conference proceedings.


International Conference on Communications | 2013

Discriminative training of context-dependent phones on WFST-based decoding graphs

Abdelaziz A. Abdelhamid; Waleed H. Abdulla

The document, which includes blank pages, was not presented at the conference and therefore was not made available for publication as part of the conference proceedings.


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013

Optimization on decoding graphs using soft margin estimation

Abdelaziz A. Abdelhamid; Waleed H. Abdulla

This paper proposes a discriminative learning algorithm for improving the accuracy of continuous speech recognition systems by optimizing the language model parameters on decoding graphs. The proposed algorithm employs soft margin estimation (SME) to build an objective function that maximizes the margin between correct transcriptions and the corresponding competing hypotheses. To this end, we adapted a discriminative training procedure based on SME, originally devised for optimizing acoustic models, to the different task of optimizing language model parameters on a decoding graph constructed using weighted finite-state transducers. Experimental results show that the proposed algorithm outperforms a maximum likelihood estimation baseline, achieving a 15.11% relative reduction in word error rate on the Resource Management (RM1) database.
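The margin-maximizing objective described above can be illustrated with a hinge-style loss on the score gap between the correct transcription and each competitor. This is a hedged sketch of the general soft-margin idea, not the paper's exact SME formulation; the scores and margin value are illustrative assumptions.

```python
# Hinge-style soft-margin loss: penalize any competing hypothesis whose
# score comes within `margin` of the correct transcription's score
# (higher score = better hypothesis).

def sme_loss(correct_score, competitor_scores, margin=1.0):
    return sum(max(0.0, margin - (correct_score - s))
               for s in competitor_scores)

# Correct path scores 10.0; one competitor is safely below (8.0),
# one violates the margin (9.5), contributing 0.5 to the loss.
loss = sme_loss(10.0, [8.0, 9.5])  # 0.5
```

Minimizing this loss over the language model weights pushes competing paths at least a margin below the correct path, which is the intuition behind margin-based discriminative training.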


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013

UML-based robotic speech recognition development: A case study

Abdelaziz A. Abdelhamid; Waleed H. Abdulla

The development of automatic speech recognition (ASR) systems plays a crucial role in their performance as well as their integration with spoken dialogue systems for controlling service robots. However, to the best of our knowledge, no research in the literature addresses the development of ASR systems and their integration with service robots from a software engineering perspective. We therefore propose a set of software engineering diagrams supporting rapid development of ASR systems for controlling service robots. The diagrams are illustrated through a case study based on our speech recognition system, RoboASR. The internal structure of this system comprises five threads running concurrently to carry out the various speech recognition processes and the interaction with the dialogue manager of service robots. The diagrams follow the COMET method, which is designed for describing practical, concurrent systems.


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013

Joint discriminative learning of acoustic and language models on decoding graphs

Abdelaziz A. Abdelhamid; Waleed H. Abdulla

In traditional speech recognition, acoustic and language models are treated as independent and usually estimated separately, which may yield suboptimal recognition performance. In this paper, we propose a joint optimization framework for learning the parameters of acoustic and language models using the minimum classification error criterion. The joint optimization is performed on a decoding graph constructed using weighted finite-state transducers from context-dependent hidden Markov models and trigram language models. To demonstrate the effectiveness of the proposed framework, two speech corpora, TIMIT and Resource Management (RM1), are used in the experiments. Preliminary results show that the proposed approach achieves significant reductions in phone, word, and sentence error rates on both TIMIT and RM1 compared with conventional parameter estimation approaches.
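The minimum classification error (MCE) criterion mentioned above is commonly smoothed with a sigmoid over a misclassification measure, here taken as the gap between the best competitor and the correct path. This is a rough sketch under that common formulation, not the paper's exact objective; the scores and the smoothing parameter are illustrative.

```python
import math

# Sigmoid-smoothed 0/1 loss: approaches 1 when a competitor outscores
# the correct transcription, and 0 when the correct path wins clearly.

def mce_loss(correct_score, competitor_scores, gamma=1.0):
    d = max(competitor_scores) - correct_score  # misclassification measure
    return 1.0 / (1.0 + math.exp(-gamma * d))

mce_loss(10.0, [8.0, 9.5])  # correct path wins by 0.5, so loss < 0.5
```

Because the sigmoid is differentiable, this loss can be minimized by gradient methods jointly over acoustic and language model parameters, which is what makes MCE attractive for joint training.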


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2013

SPIDER: A continuous speech light decoder

Abdelaziz A. Abdelhamid; Waleed H. Abdulla; Bruce A. MacDonald

In this paper, we propose a speech decoder, called SPeech lIght decoDER (SPIDER), for extracting the best decoding hypothesis from a search space constructed using weighted finite-state transducers. Although many speech decoders exist, they are quite complicated because they pursue many design goals, such as extraction of N-best decoding hypotheses and generation of lattices. This makes it difficult to understand these decoders and to test new ideas in speech recognition, which often require decoder modification. We therefore propose a simple decoder supporting the primitive functions required for real-time speech recognition with state-of-the-art recognition performance. The decoder can be viewed as a seed for further improvements and new functionality. Experimental results show that the proposed decoder performs promisingly compared with two other speech decoders, HDecode and Sphinx3.
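The primitive a light decoder must support, extracting the single best hypothesis from a weighted search graph, reduces to a cheapest-path search. The sketch below is an assumption-laden toy (a real decoder adds acoustic scoring, beam pruning, and streaming input); the graph and words are hypothetical.

```python
import heapq

# Best-first search for the lowest-cost path from a start state to any
# final state. arcs: {state: [(next_state, word, cost), ...]}.

def best_path(arcs, start, finals):
    heap = [(0.0, start, [])]   # (accumulated cost, state, words so far)
    seen = set()
    while heap:
        cost, state, words = heapq.heappop(heap)
        if state in finals:
            return cost, words  # first final popped is the cheapest
        if state in seen:
            continue
        seen.add(state)
        for nxt, word, c in arcs.get(state, []):
            heapq.heappush(heap, (cost + c, nxt, words + [word]))
    return None

arcs = {0: [(1, "go", 0.5), (1, "stop", 1.2)], 1: [(2, "home", 0.3)]}
best_path(arcs, 0, {2})  # cost 0.8 over ['go', 'home']
```

With non-negative arc costs this is Dijkstra's algorithm, so the first final state popped is guaranteed optimal; lattice and N-best output are exactly the extra machinery such a light decoder omits.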


Imaging and Signal Processing in Health Care and Technology | 2012

WFST-based Large Vocabulary Continuous Speech Decoder for Service Robots

Abdelaziz A. Abdelhamid; Waleed H. Abdulla; Bruce A. MacDonald


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2012

Optimizing the parameters of decoding graphs using new log-based MCE

Abdelaziz A. Abdelhamid; Waleed H. Abdulla
