Ajit V. Rao
University of California, Santa Barbara
Publications
Featured research published by Ajit V. Rao.
IEEE Transactions on Signal Processing | 1996
David J. Miller; Ajit V. Rao; Kenneth Rose; Allen Gersho
A global optimization method is introduced that minimizes the rate of misclassification. We first derive the theoretical basis for the method, on which we base the development of a novel design algorithm, and demonstrate its effectiveness and superior performance in the design of practical classifiers for some of the most popular structures currently in use. The method, grounded in ideas from statistical physics and information theory, extends the deterministic annealing approach for optimization, both to incorporate structural constraints on data assignments to classes and to minimize the probability of error as the cost objective. During the design, data are assigned to classes in probability so as to minimize the expected classification error given a specified level of randomness, as measured by Shannon's entropy. The constrained optimization is equivalent to a free-energy minimization, motivating a deterministic annealing approach in which the entropy and expected misclassification cost are reduced with the temperature while enforcing the classifier's structure. In the limit, a hard classifier is obtained. This approach is applicable to a variety of classifier structures, including the widely used prototype-based, radial basis function, and multilayer perceptron classifiers. The method is compared with learning vector quantization, back propagation (BP), several radial basis function design techniques, as well as with paradigms for more directly optimizing all these structures to minimize probability of error. The annealing method achieves significant performance gains over other design methods on a number of benchmark examples from the literature, while often retaining design complexity comparable with or only moderately greater than that of strict descent methods. Substantial gains, both inside and outside the training set, are achieved for complicated examples involving high-dimensional data and large class overlap.
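To make the annealing recipe concrete, the following is a minimal Python sketch of one DA stage for a prototype-based classifier: samples are associated with prototypes through Gibbs (softmax) probabilities at temperature T, the expected misclassification cost is reduced by gradient steps, and T is lowered on a geometric schedule. The function name, cooling schedule, and plain gradient update are illustrative simplifications, not the paper's exact derivation.

```python
# Minimal sketch of deterministic annealing for a prototype-based classifier.
# Illustrative only: the schedule, step size, and update follow the general
# DA recipe described above, not the paper's exact formulation.
import numpy as np

def da_prototype_classifier(X, y, protos, proto_labels,
                            T0=1.0, Tmin=1e-3, cool=0.9, steps=50, lr=0.1):
    """X: (n,d) data, y: (n,) labels, protos: (m,d) initial prototypes,
    proto_labels: (m,) fixed class label of each prototype."""
    mu = protos.copy()
    T = T0
    miscost = (proto_labels[None, :] != y[:, None]).astype(float)  # (n,m)
    while T > Tmin:
        for _ in range(steps):
            d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # (n,m)
            # Gibbs (softmax) association probabilities at temperature T
            logits = -d2 / T
            logits -= logits.max(1, keepdims=True)
            p = np.exp(logits); p /= p.sum(1, keepdims=True)
            # Gradient of the expected misclassification cost w.r.t. the
            # prototypes: moving a prototype changes p, hence the cost.
            avg = (p * miscost).sum(1, keepdims=True)              # (n,1)
            w = p * (miscost - avg) * (2.0 / T)                    # chain rule
            grad = (w[:, :, None] * (X[:, None, :] - mu[None, :, :])).sum(0)
            mu -= lr * grad
        T *= cool  # lower the temperature: associations harden gradually
    return mu
```

In the T → 0 limit the softmax associations become hard nearest-prototype assignments, recovering an ordinary prototype classifier.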
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1999
Ajit V. Rao; David J. Miller; Kenneth Rose; Allen Gersho
A new learning algorithm is proposed for piecewise regression modeling. It employs the technique of deterministic annealing to design space-partition regression functions. While the performance of traditional space-partition regression functions such as CART and MARS is limited by a simple tree-structured partition and by a hierarchical design approach, the deterministic annealing algorithm enables the joint optimization of a more powerful piecewise structure based on a Voronoi partition. The new method is demonstrated to achieve consistent performance improvements over regular CART as well as over its extension that allows arbitrary hyperplane boundaries. Comparison tests on several benchmark data sets from the regression literature are provided.
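A minimal sketch of the idea, with simplifications: region prototypes induce a softened Voronoi partition, each region carries a local linear model fit by weighted least squares, and the softness is annealed away. The prototype update used here (a probability-weighted centroid) is a common heuristic stand-in for the paper's joint optimization, not its exact update.

```python
# Minimal sketch of deterministic annealing for piecewise-linear regression
# on a Voronoi partition (illustrative heuristic, not the paper's algorithm).
import numpy as np

def da_piecewise_regression(X, y, m=4, T0=1.0, Tmin=1e-3, cool=0.8, iters=20):
    n, d = X.shape
    rng = np.random.default_rng(0)
    centers = X[rng.choice(n, m, replace=False)]      # Voronoi prototypes
    W = np.zeros((m, d + 1))                          # local linear models
    Xb = np.hstack([X, np.ones((n, 1))])              # bias column
    T = T0
    while T > Tmin:
        for _ in range(iters):
            d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
            logits = -d2 / T
            logits -= logits.max(1, keepdims=True)
            p = np.exp(logits); p /= p.sum(1, keepdims=True)   # soft regions
            for j in range(m):
                # weighted least squares for region j's linear model
                sw = np.sqrt(p[:, j])[:, None]
                W[j], *_ = np.linalg.lstsq(sw * Xb, sw[:, 0] * y, rcond=None)
                # heuristic prototype update: probability-weighted centroid
                centers[j] = (p[:, j:j+1] * X).sum(0) / p[:, j].sum()
        T *= cool
    return centers, W

def predict(X, centers, W):
    # hard Voronoi assignment once annealing has finished (T -> 0 limit)
    j = ((X[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.einsum('nd,nd->n', Xb, W[j])
```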
IEEE Signal Processing Letters | 1996
Amitava Das; Ajit V. Rao; Allen Gersho
In many signal compression applications, the evolution of the signal over time can be represented by a sequence of random vectors with varying dimensionality. Frequently, the generation of such variable-dimension vectors can be modeled as a random sampling of another signal vector with a large but fixed dimension. Efficient quantization of these variable-dimension vectors is a challenging task and a critical issue in speech coding algorithms based on harmonic spectral modeling. We introduce a simple and effective formulation of the problem and present a novel technique, called variable-dimension vector quantization (VDVQ), where the input variable-dimension vector is directly quantized with a single universal codebook. The application of VDVQ to low bit-rate speech coding demonstrates significant gain in subjective quality as well as in rate-distortion performance over prior indirect methods.
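The core encoding idea admits a very short sketch: each fixed-dimension universal codevector is subsampled at the bins the input's components map to, and distortion is computed only over those components. The uniform index mapping below is an illustrative stand-in for the pitch-dependent harmonic mapping used in harmonic coders.

```python
# Minimal sketch of VDVQ: a single universal codebook of fixed dimension Kmax
# quantizes vectors of any dimension d <= Kmax by sampling each codevector at
# the bins the input's components fall into (index mapping is illustrative).
import numpy as np

def vdvq_encode(x, codebook, Kmax):
    """x: input vector of dimension d (e.g., spectral amplitudes at d pitch
    harmonics); codebook: (N, Kmax) universal codevectors."""
    d = len(x)
    # map the i-th component to a universal-codebook bin (uniform mapping)
    bins = np.floor((np.arange(d) + 0.5) * Kmax / d).astype(int)
    sub = codebook[:, bins]                    # (N, d) dimension-matched views
    dist = ((sub - x[None, :]) ** 2).sum(1)    # distortion over d components only
    return int(dist.argmin())

def vdvq_decode(index, d, codebook, Kmax):
    bins = np.floor((np.arange(d) + 0.5) * Kmax / d).astype(int)
    return codebook[index, bins]
```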
international conference on acoustics speech and signal processing | 1996
Ajit V. Rao; David J. Miller; Kenneth Rose; Allen Gersho
In vector quantization, one approximates an input random vector, Y, by choosing from a finite set of values known as the codebook. We consider a more general problem where one may not have direct access to Y but only to some statistically related random vector X. We observe X and would like to generate an approximation to Y from a codebook of candidate vectors. This operation, called generalized vector quantization (GVQ), is essentially that of quantized estimation. An important special case of GVQ is the problem of noisy source coding wherein a quantized approximation of a vector, Y, is obtained from observation of its noise-corrupted version, X. The optimal GVQ encoder has high complexity. We overcome the complexity barrier by optimizing a structurally-constrained encoder. This challenging optimization task is solved via a probabilistic approach, based on deterministic annealing, which overcomes problems of shallow local minima that trap simpler descent methods. We demonstrate the successful application of our method to the coding of noisy sources.
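For intuition, here is a plain alternating (Lloyd-style) baseline for a structurally constrained GVQ trained on pairs (x, y): a nearest-prototype encoder in X-space paired with per-cell conditional-mean codevectors. This simple descent heuristic is exactly the kind of method that can get trapped in shallow local minima; the paper's deterministic annealing design is not reproduced here.

```python
# Minimal sketch of a structurally constrained GVQ trained on pairs (x, y).
# A plain alternating heuristic, shown only to fix ideas; the paper's point
# is that DA outperforms such descent-style designs.
import numpy as np

def gvq_train(X, Y, N=8, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    protos = X[rng.choice(len(X), N, replace=False)]  # encoder prototypes in X-space
    for _ in range(iters):
        cell = ((X[:, None, :] - protos[None]) ** 2).sum(-1).argmin(1)
        # estimation codevector of each cell: conditional mean of Y over the cell
        codevecs = np.stack([Y[cell == j].mean(0) if np.any(cell == j)
                             else Y[rng.integers(len(Y))] for j in range(N)])
        # re-center each encoder prototype on its cell (empty cells kept as-is)
        protos = np.stack([X[cell == j].mean(0) if np.any(cell == j)
                           else protos[j] for j in range(N)])
    return protos, codevecs

def gvq_estimate(x, protos, codevecs):
    # observe x only; output the codevector of its cell as the estimate of y
    j = ((protos - x[None]) ** 2).sum(-1).argmin()
    return codevecs[j]
```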
data compression conference | 1994
Amitava Das; Ajit V. Rao; Allen Gersho
Optimal vector quantization of variable-dimension vectors is in principle feasible by using a set of fixed-dimension VQ codebooks. However, for typical applications, such a multi-codebook approach demands grossly excessive and impractical storage and computational complexity. Efficient quantization of such variable-dimension spectral shape vectors is the most challenging and difficult encoding task required in an important family of low bit-rate vocoders. The authors introduce a simple and effective formulation of variable-dimension vector quantization (VDVQ) which quantizes variable-dimension vectors using a single universal codebook having fixed dimension yet covering the entire range of input vector dimensions under consideration. This VDVQ technique is applied to quantize variable-dimension spectral shape vectors, leading to a high quality speech coder at the low bit-rate of 2.5 kb/s. The combination of a universal spectral codebook and structured VQ reduces storage and computational complexity, yet delivers high quantization efficiency and enhanced perceptual quality of the coded speech.
IEEE Transactions on Speech and Audio Processing | 2001
Ajit V. Rao; Kenneth Rose
Many conventional speech recognition systems are based on the use of hidden Markov models (HMM) within the context of discriminant-based pattern classification. While the speech recognition objective is a low rate of misclassification, HMM design has been traditionally approached via maximum likelihood (ML) modeling which is, in general, mismatched with the minimum error objective and hence suboptimal. Direct minimization of the error rate is difficult because of the complex nature of the cost surface, and has only been addressed previously by discriminative design methods such as generalized probabilistic descent (GPD). While existing discriminative methods offer significant benefits, they commonly rely on local optimization via gradient descent, whose performance suffers from the prevalence of shallow local minima. As an alternative, we propose the deterministic annealing (DA) design method that directly minimizes the error rate while avoiding many poor local minima of the cost. DA is derived from fundamental principles of statistical physics and information theory. In DA, the HMM classifier's decision is randomized and its expected error rate is minimized subject to a constraint on the level of randomness, which is measured by the Shannon entropy. The entropy constraint is gradually relaxed, leading in the limit of zero entropy to the design of regular nonrandom HMM classifiers. An efficient forward-backward algorithm is proposed for the DA method. Experiments on synthetic data and on a simplified recognizer for isolated English letters demonstrate that the DA design method can improve recognition error rates over both ML and GPD methods.
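The randomized decision rule at the heart of the method is easy to state once per-class HMM log-likelihoods are available; the sketch below (illustrative, with gamma acting as an inverse temperature) computes the expected error that DA minimizes. Re-estimating the HMM parameters with the paper's forward-backward algorithm is not shown.

```python
# Minimal sketch of the DA-randomized decision rule and its expected error,
# given per-class HMM log-likelihoods L (n samples x C classes). As
# gamma -> infinity the rule hardens into the usual MAP decision.
import numpy as np

def expected_error(L, labels, gamma):
    """L: (n, C) log-likelihoods log p(x_i | class j); labels: (n,) truths."""
    logits = gamma * L
    logits -= logits.max(1, keepdims=True)
    P = np.exp(logits); P /= P.sum(1, keepdims=True)   # randomized decision
    # probability of error = 1 - probability assigned to the true class
    return 1.0 - P[np.arange(len(labels)), labels].mean()
```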
global communications conference | 1994
Amitabha Das; Ajit V. Rao; Allen Gersho
This paper presents a high quality, low bit rate speech coder which applies an effective spectral modeling technique called discrete all-pole (DAP) modeling to efficiently represent speech spectra. The technique provides a fixed-dimension representation of the (pitch-dependent) variable dimension spectral shape vectors which arise in harmonic coders. Consequently, the spectral shapes are quantized more efficiently than with the usual linear prediction modeling, leading to better speech quality. We present a 2.4 kb/s speech coder, based on the multiband excitation (MBE) model and DAP modeling of the speech spectra, which delivers speech quality comparable to two standard higher rate coders: the 4.8 kb/s U.S. Federal Standard 1016 CELP coder and the 4.15 kb/s IMBE coder, adopted as the INMARSAT-M standard for satellite voice communications.
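The fixed-dimension property is the key point: a p-th order all-pole envelope is described by p+1 parameters yet can be sampled at any pitch-dependent set of harmonic frequencies. A minimal illustration follows (the DAP fitting procedure itself, which matches the envelope to the discrete harmonic amplitudes, is not reproduced here).

```python
# Minimal sketch: evaluating a p-th order all-pole spectral envelope
# H(w) = g / |A(e^{jw})| at pitch harmonics. The p+1 model parameters are a
# fixed-dimension representation of a variable-dimension harmonic spectrum.
import numpy as np

def allpole_envelope_at_harmonics(a, g, f0, fs, n_harm):
    """a: (p,) predictor coefficients of A(z) = 1 - sum_k a_k z^-k; g: gain;
    f0: pitch in Hz; fs: sampling rate; returns samples at harmonics k*f0."""
    k = np.arange(1, n_harm + 1)
    w = 2 * np.pi * k * f0 / fs                       # harmonic frequencies
    p = len(a)
    # A(e^{jw}) = 1 - sum_m a_m e^{-j m w}, evaluated at all harmonics at once
    E = 1 - (a[None, :] * np.exp(-1j * np.outer(w, np.arange(1, p + 1)))).sum(1)
    return g / np.abs(E)                              # variable-dimension output
```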
international symposium on information theory | 1995
Ajit V. Rao; David J. Miller; Kenneth Rose; Allen Gersho
Given a pair of random vectors X, Y, we study the problem of finding an efficient or optimal estimator of Y given X when the range of the estimator is constrained to be a finite set of values. A generalized vector quantizer (GVQ) with input dimension k, output dimension m, and size N maps an input X ∈ R^k to an output V(X) ∈ R^m. The output V(X) is constrained to be one of the estimation codevectors in the codebook {y_1, y_2, ..., y_N}. The performance of the GVQ is measured by the average distortion D = E[d(Y, V(X))] for a suitable output-space distortion measure d(·,·). A GVQ reduces to a conventional vector quantizer in the special case where X = Y. The GVQ problem has been approached in the information theory literature from many different standpoints. In particular, it appears in the context of noisy source coding, which is the special case where we quantize X, the observable, noisy version of a source, Y.
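The optimality conditions take a familiar Lloyd-like form; the following restatement (a standard result from the noisy source coding literature, stated here for concreteness rather than taken from this paper) makes the structure explicit. The optimal encoder and codevectors satisfy

j*(x) = argmin_{1<=j<=N} E[d(Y, y_j) | X = x],    y_j = E[Y | j*(X) = j].

In particular, for squared-error distortion the encoder condition reduces to nearest-neighbor encoding of the conditional mean E[Y | X = x], so the optimal unconstrained GVQ is a conventional VQ applied to the minimum mean-square estimator of Y.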
international conference on acoustics speech and signal processing | 1998
Ajit V. Rao; Kenneth Rose; Allen Gersho
We attack the general problem of HMM-based speech recognizer design, and in particular the problem of isolated letter recognition in the presence of background noise. The standard design method based on maximum likelihood (ML) is known to perform poorly when applied to isolated letter recognition. The minimum classification error (MCE) approach directly targets the ultimate design criterion and offers substantial improvements over the ML method. However, the standard MCE method relies on gradient descent optimization, which is susceptible to shallow local minima traps. We propose to overcome this difficulty with a powerful optimization method based on deterministic annealing (DA). The DA method minimizes a randomized MCE cost subject to a constraint on the level of entropy, which is gradually relaxed. It may be derived from information-theoretic or statistical physics principles. DA has low implementation complexity and reduces error rates by a factor of 1.5 to 2.0 relative to both standard ML and the gradient-descent-based MCE algorithm on the benchmark CSLU spoken letter database. Further, the gains are maintained under a variety of background noise conditions.
international conference on acoustics, speech, and signal processing | 2006
Chanaveeragouda Virupaxagouda Goudar; Pankaj Rabha; Murali M. Deshpande; Ajit V. Rao
This paper describes the recently developed SMVLite speech codec. SMVLite is a reduced-complexity variant of SMV (selectable mode vocoder), the new 3GPP2 (3rd Generation Partnership Project 2) CDMA standard. SMV provides superior speech quality at low bit-rates compared to other CDMA codecs. However, its computational complexity is significantly higher than that of other CDMA standards, rendering it inefficient for real-time implementation. We have developed a lower complexity version of SMV called SMVLite. SMVLite is bit-stream interoperable with SMV, and its voice quality is perceptually equivalent to SMV in all modes and conditions of interest. The computational complexity of SMVLite is 25% lower than that of SMV. The voice quality equivalence of SMV and SMVLite has been demonstrated in a formal subjective listening test conducted at Dynastat.