Anand Rangaswamy Setlur
Alcatel-Lucent
Publication
Featured research published by Anand Rangaswamy Setlur.
International Conference on Spoken Language Processing | 1996
Anand Rangaswamy Setlur; Rafid A. Sukkar; John Jacob
Utterance verification (UV) is a process by which the output of a speech recognizer is verified to determine if the input speech actually includes the recognized keyword(s). The output of the speech verifier is a binary decision to accept or reject the recognized utterance based on a UV confidence score. In this paper, we extend the notion of utterance verification to not only detect errors but also to selectively correct them. We perform error correction by flipping the hypotheses produced by an N-best recognizer in cases when the top candidate has a UV confidence score that is lower than that of the next candidate. We propose two measures for computing confidence scores and investigate the use of a hybrid confidence measure that combines the two measures into a single score. Using this hybrid confidence measure and an N-best algorithm, we obtained an 11% improvement in word-error rate on a connected digit recognition task. This improvement was achieved while still maintaining reliable detection of non-keyword speech and misrecognitions.
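The flipping rule described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `Hypothesis` class, the `uv_score` field, the `correct_and_verify` function, and the acceptance `threshold` are all hypothetical names, and the way the confidence scores are computed (including the hybrid measure) is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hypothesis:
    text: str        # recognized string, e.g. a digit sequence
    uv_score: float  # assumed UV confidence score; higher means more confident


def correct_and_verify(
    nbest: List[Hypothesis], threshold: float
) -> Tuple[Optional[Hypothesis], bool]:
    """Flip the top two hypotheses when the runner-up has the higher UV
    score, then make the binary accept/reject decision on the winner."""
    if not nbest:
        return None, False
    best = nbest[0]
    # Error correction: prefer the next candidate if its confidence is higher.
    if len(nbest) > 1 and nbest[1].uv_score > best.uv_score:
        best = nbest[1]
    # Verification: accept only if the chosen hypothesis clears the threshold.
    return best, best.uv_score >= threshold


nbest = [Hypothesis("1 2 3", 0.4), Hypothesis("1 2 8", 0.7)]
chosen, accepted = correct_and_verify(nbest, threshold=0.5)
```

Here the recognizer's top candidate has the lower score, so the sketch returns the second hypothesis and accepts it. The threshold value is arbitrary and would need tuning on held-out data.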
Speech Communication | 1997
Rafid Antoon Sukkar; Anand Rangaswamy Setlur; Chin-Hui Lee; John Jacob
Utterance verification (UV) is a process by which the output of a speech recognizer is verified to determine if the input speech actually includes the recognized keyword(s). The output of the speech verifier is a binary decision to accept or reject the recognized utterance based on a UV confidence score. In this paper, we extend the notion of utterance verification by presenting an utterance verification method that will be utilized to perform three tasks: (1) detect non-keyword strings (false alarms), (2) detect keyword substitution errors, and (3) selectively correct substitution errors when N-best string hypotheses are available. The utterance verification method presented here employs a set of verification-specific models that are independent of the models used in the recognition process. The verification models are trained using a discriminative training procedure that seeks to minimize the verification error by simultaneously maximizing the rejection of non-keywords and misrecognized keywords while minimizing the rejection of correctly recognized keywords. The error correction is performed by reordering the hypotheses produced by an N-best recognizer based on a UV confidence score.
International Conference on Acoustics, Speech, and Signal Processing | 1999
Carl D. Mitchell; Anand Rangaswamy Setlur
This paper addresses the problem of selecting a name from a very large list using spelling recognition. In order to greatly reduce the computational resources required, we propose a tree-based lexical fast match scheme to select a short list of candidate names. Our system consists of a free letter recognizer, a fast matcher, and a rescoring stage. The letter recognizer uses n-grams to generate an n-best list of letter hypotheses. The fast matcher is a tree that is based on confusion classes, where a confusion class is a group of acoustically similar letters such as the e-set. The fast matcher reduces over 100,000 unique last names to tens or hundreds of candidates. Then the rescoring stage picks the best name using either letter alignment or a constrained grammar. The fast matcher retained the correct name 99.6% of the time and the system retrieved the correct name 97.6% of the time.
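The idea of a tree built on confusion classes can be illustrated with a small sketch. Everything below is an assumption made for illustration: only the e-set is named in the abstract, so the other class groupings, the function names, and the `$names` leaf key are hypothetical. The sketch also handles only letter substitutions within a class and takes a single letter string, whereas the system described works from an n-best list of letter hypotheses.

```python
# Illustrative confusion classes. Only the e-set is mentioned in the abstract;
# the other groupings are assumptions for this example.
CONFUSION_CLASSES = {
    "E": "BCDEGPTVZ",  # e-set
    "A": "AJK",        # assumed
    "M": "MN",         # assumed
    "S": "FSX",        # assumed
}
LETTER_TO_CLASS = {
    letter: cls for cls, letters in CONFUSION_CLASSES.items() for letter in letters
}


def class_key(word):
    """Map each letter to its confusion class (or to itself if it has none)."""
    return tuple(LETTER_TO_CLASS.get(ch, ch) for ch in word.upper())


def build_tree(names):
    """Build a tree whose edges are confusion classes rather than letters."""
    root = {}
    for name in names:
        node = root
        for symbol in class_key(name):
            node = node.setdefault(symbol, {})
        # "$names" cannot collide with an edge label, which is one character.
        node.setdefault("$names", []).append(name)
    return root


def fast_match(tree, letters):
    """Return every name whose class sequence matches the recognized letters."""
    node = tree
    for symbol in class_key(letters):
        node = node.get(symbol)
        if node is None:
            return []
    return node.get("$names", [])


tree = build_tree(["DEAN", "BEAM", "PEAK", "SMITH"])
candidates = fast_match(tree, "TEAN")  # misrecognized letters, same classes
```

With these assumed classes, the misrecognized string "TEAN" still reaches the node holding "DEAN" and "BEAM", which a rescoring stage would then disambiguate.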
International Conference on Acoustics, Speech, and Signal Processing | 2000
Rafid Antoon Sukkar; Shawn M. Herman; Anand Rangaswamy Setlur; Carl Dennis Mitchell
In this paper, we present a method that manipulates the decoding network to reduce both computational complexity and response latency while maintaining high ASR accuracy. The method employs a TSVQ (tree structured vector quantization) classifier that reliably discriminates between silence and non-silence frames. Reductions in computational complexity and response latency are achieved through three techniques: 1) silence skipping, 2) silence-based pruning of the dynamic programming network, and 3) early decision. Experimental results on a connected digit task and a large vocabulary company name task show that the proposed method can reduce ASR response latency by more than 82%. Furthermore, the computational complexity, measured in CPU seconds, was reduced by 13.6% on the connected digit task and 6.7% on the company name task while maintaining the recognition accuracy of the baseline system.
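As a rough illustration of the first technique, silence skipping, the sketch below drops frames from long silence runs before they reach the decoder. It is not the method from the paper: the abstract does not give the skipping rule, so the `max_run` parameter and the `skip_silence` function are hypothetical, and a simple energy threshold stands in for the TSVQ classifier.

```python
def skip_silence(frames, is_silence, max_run=2):
    """Return the indices of frames to pass to the decoder, keeping at most
    `max_run` consecutive silence frames from each silence run."""
    kept = []
    run = 0  # length of the current run of silence frames
    for i, frame in enumerate(frames):
        if is_silence(frame):
            run += 1
            if run > max_run:
                continue  # skip this frame: the run is already long enough
        else:
            run = 0
        kept.append(i)
    return kept


# Stand-in classifier: treat low-energy frames as silence (assumption).
def low_energy(energy):
    return energy < 0.1


energies = [0.0, 0.0, 0.0, 0.0, 0.9, 0.8, 0.0, 0.0, 0.0]
kept = skip_silence(energies, low_energy, max_run=2)
```

In this toy input, three of the nine frames are skipped. Any real saving depends on how much silence the input contains and on how reliable the silence classifier is.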
Archive | 1997
Anand Rangaswamy Setlur; Rafid Antoon Sukkar
Journal of the Acoustical Society of America | 1998
Malan Bhatki Gandhi; Anand Rangaswamy Setlur; Rafid Antoon Sukkar
Archive | 2000
Carl Dennis Mitchell; Anand Rangaswamy Setlur; Rafid Antoon Sukkar
Journal of the Acoustical Society of America | 1998
Anand Rangaswamy Setlur; Rafid Antoon Sukkar; Joseph Lawrence LoCicero; Grzegorz Szeszko
Archive | 2000
Rathinavelu Chengalvarayan; Richard Harry Ketchum; Anand Rangaswamy Setlur; David Lynn Thomson
Archive | 2000
Richard Harry Ketchum; Anand Rangaswamy Setlur