Rafid A. Sukkar
Bell Labs
Publication
Featured research published by Rafid A. Sukkar.
IEEE Transactions on Speech and Audio Processing | 1996
Rafid A. Sukkar; Chin-Hui Lee
An integral part of any deployable speech recognition system is the capability to detect if the input speech does not contain any of the words in the recognizer vocabulary set. This capability, which is called utterance verification (or keyword recognition and nonkeyword rejection), is therefore becoming increasingly important as speech recognition systems continue to migrate from the laboratory to actual applications. We present a framework and a method for vocabulary independent utterance verification in subword-based speech recognition. The verification process is cast as a statistical hypothesis test, where vocabulary independence is accomplished through a two-stage verification process: subword-level verification followed by string-level verification. A verification function is defined and discriminatively trained to perform subword-level verification. String-level verification is accomplished by defining and evaluating an overall string-level log likelihood ratio that is a function of the subword-level verification scores. Experimental results show that this vocabulary-independent discriminative utterance verification method significantly outperforms a baseline method commonly used in wordspotting tasks.
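The two-stage decision described in this abstract can be illustrated with a minimal sketch. It assumes each subword already has a verification score that behaves like a log likelihood ratio, and it combines them with a plain arithmetic mean. The combining rule, the function names, and the threshold are illustrative assumptions, not the function defined in the paper.

```python
def subword_llr(log_p_target, log_p_alternative):
    """Log likelihood ratio for one subword segment (hypothetical helper)."""
    return log_p_target - log_p_alternative


def string_llr(subword_scores):
    """Combine subword-level scores into one string-level score.

    Assumption: a simple arithmetic mean. The paper defines its own
    string-level function of the subword scores, which may differ.
    """
    if not subword_scores:
        raise ValueError("need at least one subword score")
    return sum(subword_scores) / len(subword_scores)


def verify(subword_scores, threshold=0.0):
    """Accept the recognized string when its score meets the threshold."""
    return string_llr(subword_scores) >= threshold


# Three subword segments, each scored against an alternative hypothesis.
scores = [subword_llr(-10.0, -11.0), subword_llr(-8.0, -8.5), subword_llr(-9.0, -8.8)]
accepted = verify(scores)
```

Raising the threshold trades more false rejections of valid keywords for fewer false alarms on non-keyword speech.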
International Conference on Acoustics, Speech, and Signal Processing | 1993
Rafid A. Sukkar; Jay G. Wilpon
A classifier for utterance rejection in a hidden Markov model (HMM) based speech recognizer is presented. This classifier, termed the two-pass classifier, is a postprocessor to the HMM recognizer, and consists of a two-stage discriminant analysis. The first stage employs the generalized probabilistic descent (GPD) discriminative training framework, while the second stage performs linear discrimination combining the output of the first stage with HMM likelihood scores. In this fashion the classification power of the HMM is combined with that of the GPD stage which is specifically designed for keyword/nonkeyword classification. Experimental results show that, on two separate databases, the two-pass classifier significantly outperforms a single-pass classifier based solely on the HMM likelihood scores.
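As a rough illustration of the second stage only, the sketch below forms a linear combination of a first-stage discriminant score and an HMM log likelihood and compares it with a threshold. The weights, bias, threshold, and function names are hypothetical placeholders. In the paper these quantities come from training, and the first-stage GPD classifier is not reproduced here.

```python
def second_stage_score(gpd_score, hmm_log_likelihood, weights=(1.0, 0.1), bias=0.0):
    """Linear discriminant over the two inputs (weights are made-up values)."""
    w_gpd, w_hmm = weights
    return w_gpd * gpd_score + w_hmm * hmm_log_likelihood + bias


def is_keyword(gpd_score, hmm_log_likelihood, threshold=0.0):
    """Accept the utterance as a keyword when the combined score is high enough."""
    return second_stage_score(gpd_score, hmm_log_likelihood) >= threshold
```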
International Conference on Acoustics, Speech, and Signal Processing | 1996
Rafid A. Sukkar; Anand Rangaswamy Setlur; Mazin G. Rahim; Chin-Hui Lee
An utterance verification method based on minimum verification error training is presented. In a two-stage process, the recognition hypothesis produced by an HMM-based speech recognizer is verified using a set of verification-specific models that are independent of the models used in the recognition process. The verification models are trained using a discriminative training procedure that seeks to minimize the verification error by simultaneously maximizing the rejection of non-keywords and misrecognized keywords while minimizing the rejection of correctly recognized keywords. This method is evaluated on a connected digit recognition task with a null grammar. The baseline string error rate for this task was 4.85%. At 5% rejection of valid strings, the string error rate decreased to 2.70% using the proposed verification method. The corresponding performance on non-keyword speech was a rejection rate of over 99.0%.
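To put the reported figures in relative terms, the small helper below computes the relative reduction between two error rates. It assumes the baseline and post-verification string error rates are directly comparable. The function name is illustrative.

```python
def relative_reduction(baseline, improved):
    """Fractional reduction of an error rate relative to its baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# String error rates quoted in the abstract, in percent.
reduction = relative_reduction(4.85, 2.70)
print(f"{reduction:.1%}")  # about 44.3%
```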
International Conference on Spoken Language Processing | 1996
Anand Rangaswamy Setlur; Rafid A. Sukkar; John Jacob
Utterance verification (UV) is a process by which the output of a speech recognizer is verified to determine if the input speech actually includes the recognized keyword(s). The output of the speech verifier is a binary decision to accept or reject the recognized utterance based on a UV confidence score. In this paper, we extend the notion of utterance verification to not only detect errors but also to selectively correct them. We perform error correction by flipping the hypotheses produced by an N-best recognizer in cases when the top candidate has a UV confidence score that is lower than that of the next candidate. We propose two measures for computing confidence scores and investigate the use of a hybrid confidence measure that combines the two measures into a single score. Using this hybrid confidence measure and an N-best algorithm, we obtained an 11% improvement in word-error rate on a connected digit recognition task. This improvement was achieved while still maintaining reliable detection of non-keyword speech and misrecognitions.
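The flipping rule described above can be sketched as follows, assuming the N-best list arrives in recognizer rank order and each hypothesis already carries a UV confidence score. The data layout and function name are assumptions. The two confidence measures and their hybrid combination from the paper are not modeled.

```python
def rerank_top_two(nbest):
    """Swap the top two hypotheses when the runner-up has higher UV confidence.

    nbest: list of (hypothesis, uv_confidence) pairs in recognizer rank order.
    Returns a new list and leaves the input unchanged.
    """
    if len(nbest) < 2:
        return list(nbest)
    top, second = nbest[0], nbest[1]
    if top[1] < second[1]:
        return [second, top] + list(nbest[2:])
    return list(nbest)


# The recognizer prefers "1 2 3", but the verifier trusts "1 2 8" more.
result = rerank_top_two([("1 2 3", 0.4), ("1 2 8", 0.7), ("1 2 2", 0.1)])
```

The accept or reject decision would then be applied to the confidence score of whichever hypothesis ends up first.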
International Conference on Acoustics, Speech, and Signal Processing | 1998
Rafid A. Sukkar
We formulate a training framework and present a method for task independent utterance verification. Verification-specific HMMs are defined and discriminatively trained using minimum verification error training. Task independence is accomplished by performing the verification on the subword level and training the verification models using a general phonetically balanced database that is independent of the application tasks. Experimental results show that the proposed method significantly outperforms two other commonly used task independent utterance verification techniques. It is shown that the equal error rate of false alarms and false keyword rejection is reduced by more than 22% compared to the other two methods on a large vocabulary recognition task.
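The equal error rate quoted here is the operating point where the false alarm rate on non-keyword speech equals the false rejection rate on keywords. A minimal sketch of how it might be estimated from two sets of verification scores follows. It sweeps thresholds over the observed scores and reports the point where the two rates are closest. The function name and the sample scores are made up for illustration.

```python
def equal_error_rate(keyword_scores, nonkeyword_scores):
    """Estimate the EER and its threshold from verification scores.

    An utterance is accepted when its score is >= the threshold.
    Returns (eer, threshold), where eer is the mean of the two error
    rates at the threshold where they are closest.
    """
    thresholds = sorted(set(keyword_scores) | set(nonkeyword_scores))
    best = None
    for t in thresholds:
        false_rejection = sum(s < t for s in keyword_scores) / len(keyword_scores)
        false_alarm = sum(s >= t for s in nonkeyword_scores) / len(nonkeyword_scores)
        gap = abs(false_rejection - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, (false_rejection + false_alarm) / 2, t)
    return best[1], best[2]


eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])
```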
International Conference on Acoustics, Speech, and Signal Processing | 1994
Rafid A. Sukkar
Connected digit recognition is an area that has attracted significant attention because of its importance in automating a wide range of applications. In an actual application, the performance of the connected digit recognizer must be measured along two dimensions: its recognition accuracy over connected digit strings, and its rejection rate over utterances with no digits. This paper presents a rejection method for connected digit recognition. This rejection method is a post-processor to an HMM-based recognizer and consists of two stages: a digit/non-digit classification stage, and a string verification stage. The digit/non-digit classifier is discriminatively trained to determine if a given digit segment in the recognized string actually contains a digit. The string verification stage combines the results of the digit/non-digit classifier to make the final rejection decision. Experimental results on two independent databases show that this rejection method succeeds not only in rejecting speech with no connected digits, but also in rejecting putative errors that would have resulted in misrecognition.
Archive | 1992
Raymond W. Bennett; Joseph G. Klinger; Brian C. Prorok; Rafid A. Sukkar
Conference of the International Speech Communication Association | 1998
Anand Rangaswamy Setlur; Rafid A. Sukkar
IEEE Transactions on Speech and Audio Processing | 2000
Rafid A. Sukkar; Malan Bhatki Gandhi; Anand Rangaswamy Setlur
Conference of the International Speech Communication Association | 1991
David L. Thomson; Jay G. Wilpon; Rafid A. Sukkar; Dimitrios Panos Prezas