Publication


Featured research published by Mukund Padmanabhan.


IEEE Transactions on Speech and Audio Processing | 2001

Data-driven approach to designing compound words for continuous speech recognition

George Saon; Mukund Padmanabhan

We present a new approach to deriving compound words from a training corpus. The motivation for creating compound words is that, under some assumptions, speech recognition errors occur less frequently in longer words. Compound words also enable more accurate modeling of pronunciation variability at the boundary between adjacent words in a continuously spoken utterance. We introduce a measure based on the product of the direct and reverse bigram probabilities of a pair of words to find candidate pairs for compound words. Our experimental results show that augmenting both the acoustic vocabulary and the language model with these new tokens improves word recognition accuracy by 2.8% absolute (7% relative) on a voicemail continuous speech recognition task. We also compare the proposed measure for selecting compound words with other measures described in the literature.
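To make the selection measure concrete, the sketch below scores adjacent word pairs by the product of the direct bigram probability P(w2|w1) and the reverse bigram probability P(w1|w2), estimated from raw counts. The thresholds, the toy corpus, and the absence of smoothing are illustrative assumptions, not the paper's exact recipe.

```python
from collections import Counter

def compound_candidates(corpus_tokens, min_score=0.05, min_count=5):
    """Score adjacent word pairs by P(w2|w1) * P(w1|w2) from raw counts.

    A hedged sketch of the bigram-product measure: the threshold values and
    the lack of smoothing are illustrative choices, not the paper's settings.
    """
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    scored = []
    for (w1, w2), c12 in bigrams.items():
        if c12 < min_count:
            continue
        direct = c12 / unigrams[w1]    # P(w2 | w1)
        reverse = c12 / unigrams[w2]   # P(w1 | w2)
        score = direct * reverse
        if score >= min_score:
            scored.append((score, w1 + "_" + w2))
    return sorted(scored, reverse=True)

# Example: frequent, mutually predictive pairs such as "thank_you" score highly.
tokens = "please give me a call thank you bye thank you very much bye".split()
print(compound_candidates(tokens, min_score=0.1, min_count=1))
```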


IEEE Transactions on Circuits and Systems | 1991

Resonator-based filter-banks for frequency-domain applications

Mukund Padmanabhan; Ken Martin

The filter-bank structure proposed is based on a digital simulation of a singly terminated ladder filter. This filter bank can also be derived from a filter described by G. Peceli (1989) and represents an extremely hardware-efficient structure, with a complexity of O(N). The main application examined is adaptive line enhancement. The filter-bank-based line enhancer is shown to satisfy the necessary conditions for global convergence and to yield uncorrelated sinusoidal enhanced outputs that are undistorted versions of the corresponding frequency components of the input. A number of additional possible applications of the filter bank are described, including the tracking of periodic signals, subband coding, frequency-domain adaptive noise cancellation, and frequency-domain processing of signals from phased arrays.
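For intuition only, the sketch below runs a bank of N independent second-order digital resonators over a signal; it illustrates the O(N)-per-sample cost of a resonator-per-channel filter bank, but it is a generic parallel structure, not the singly terminated ladder simulation with common feedback analyzed in the paper. The pole radius and channel frequencies are arbitrary illustrative values.

```python
import numpy as np

def resonator_bank(x, freqs, r=0.99):
    """Run a bank of second-order digital resonators over signal x.

    Each channel k implements y[n] = 2*r*cos(w_k)*y[n-1] - r^2*y[n-2] + x[n],
    a pole pair at radius r and angle w_k. A generic parallel bank used only
    to illustrate O(N)-per-sample cost, not the paper's ladder-based structure.
    """
    y = np.zeros((len(freqs), len(x)))
    for k, w in enumerate(freqs):
        a1, a2 = 2 * r * np.cos(w), -r * r
        y1 = y2 = 0.0
        for n, xn in enumerate(x):
            yn = a1 * y1 + a2 * y2 + xn
            y[k, n] = yn
            y2, y1 = y1, yn
    return y

# Example: the channel tuned nearest the input frequency responds most strongly.
n = np.arange(2000)
x = np.cos(0.3 * n)
freqs = np.linspace(0.1, 0.5, 5)
out = resonator_bank(x, freqs)
print(np.round(np.std(out[:, 1000:], axis=1), 1))  # largest for the 0.3 rad/sample channel
```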


International Conference on Acoustics, Speech, and Signal Processing | 1998

Acoustics-only based automatic phonetic baseform generation

Bhuvana Ramabhadran; Lalit R. Bahl; Peter Vincent Desouza; Mukund Padmanabhan

Phonetic baseforms are the basic recognition units in most speech recognition systems. These baseforms are usually determined by linguists once a vocabulary is chosen and are not modified thereafter. However, several applications, such as name dialing, require that the user be able to add new words to the vocabulary. These new words are often names or task-specific jargon with user-specific pronunciations. This paper describes a novel method for generating phonetic transcriptions (baseforms) of words based on acoustic evidence alone. It requires neither the spelling nor any prior acoustic representation of the new word, is vocabulary independent, and imposes no linguistic constraints (pronunciation rules). Our experiments demonstrate the high decoding accuracies obtained when baseforms deduced with this approach are incorporated into our speech recognizer. The error rates on the added words were also found to be comparable to or better than those obtained when the baseforms were derived by hand.


IEEE Transactions on Speech and Audio Processing | 2002

Automatic speech recognition performance on a voicemail transcription task

Mukund Padmanabhan; George Saon; Jing Huang; Brian Kingsbury; Lidia Mangu

We report on the performance of automatic speech recognition (ASR) systems on voicemail transcription. Voicemail is spontaneous telephone speech recorded over a variety of channels; consequently, it is representative of many challenging problems in speech recognition. In the course of working on this task, several algorithms were developed that focus on different components of an ASR system, including lexicon design, feature extraction, hypothesis search, and adaptation. We report the improvements provided by these techniques, as well as other standard techniques, on a voicemail test set. Although the techniques are benchmarked on voicemail test data, their scope is not restricted to this domain as they address fundamental aspects of the speech recognition process.


IEEE Transactions on Speech and Audio Processing | 2005

Maximizing information content in feature extraction

Mukund Padmanabhan; Satya Dharanipragada

In this paper, we consider the problem of quantifying the amount of information contained in a set of features for discriminating between various classes. We explore these ideas in the context of a speech recognition system, where an important classification sub-problem is to predict the phonetic class given an observed acoustic feature vector. The connection between information content and speech recognition performance is first explored in the context of various feature extraction schemes used in speech recognition applications. The idea of optimizing the information content to improve recognition accuracy is then generalized to a linear projection of the underlying features. We show that several prior methods for computing linear transformations (such as linear/heteroscedastic discriminant analysis) can be interpreted in this general framework of maximizing the information content. We then extend this reasoning and propose a new objective function that maximizes a penalized mutual information (pMI) measure. This objective function is seen to be very well correlated with the word error rate of the final system. Finally, experimental results show that the proposed pMI projection consistently outperforms other methods in a variety of cases, leading to relative improvements in word error rate of 5%-16% over earlier methods.
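To make the notion of information in a linear projection concrete, the sketch below estimates the mutual information I(Y; class) = H(Y) - H(Y | class) for projected features Y = X @ theta under a Gaussian model for each class. This Gaussian plug-in estimate is the kind of quantity that connects to LDA/HDA-style objectives; it does not reproduce the paper's penalized-MI objective, and all data and names in the example are illustrative.

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy (nats) of a Gaussian with covariance `cov`."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    return 0.5 * (d * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])

def projected_mutual_information(X, labels, theta):
    """Estimate I(Y; class) for Y = X @ theta, assuming Gaussian class models.

    I(Y; c) = H(Y) - sum_c P(c) * H(Y | c), with each entropy computed from a
    sample covariance. A plug-in sketch; the paper's penalized-MI objective
    adds a penalty term on top of this kind of quantity.
    """
    Y = X @ theta
    h_y = gaussian_entropy(np.cov(Y.T))
    h_y_given_c = 0.0
    for c in np.unique(labels):
        Yc = Y[labels == c]
        h_y_given_c += (len(Yc) / len(Y)) * gaussian_entropy(np.cov(Yc.T))
    return h_y - h_y_given_c

# Example: two well-separated classes; the projection onto the separating
# direction carries more information than one onto a pure-noise direction.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 1, (500, 2)), rng.normal([4, 0], 1, (500, 2))])
labels = np.repeat([0, 1], 500)
good = np.array([[1.0], [0.0]])   # along the class separation
bad = np.array([[0.0], [1.0]])    # orthogonal to it
print(projected_mutual_information(X, labels, good),
      projected_mutual_information(X, labels, bad))
```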


IEEE Transactions on Circuits and Systems Ii: Analog and Digital Signal Processing | 1993

A second-order hyperstable adaptive filter for frequency estimation

Mukund Padmanabhan; Ken Martin

A second-order adaptive filter structure that can be used for frequency estimation and enhancement of an input sinusoid is proposed. Unlike other gradient-based adaptive line enhancement schemes, this filter has the advantage that it can be made to converge arbitrarily fast by increasing the step size of the update, while still retaining stability. This is possible because the adaptation strategy has its roots in hyperstability theory. A limitation of the scheme is that it gives a biased estimate in the presence of noise; however, some techniques that bring the bias down to a low level are described.
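For orientation, the sketch below exploits the identity x[n] = 2*cos(w)*x[n-1] - x[n-2] satisfied by a noise-free sinusoid and adapts a = cos(w) with a plain LMS step. It is a generic gradient-based estimator of the kind the abstract contrasts with, not the hyperstable update of the paper, and it exhibits the bias in noise that the abstract mentions; the step size and signals are illustrative.

```python
import numpy as np

def lms_frequency_estimate(x, mu=1e-3, a0=0.0):
    """Estimate the frequency of a single sinusoid via LMS on a = cos(w).

    Uses x[n] - 2*a*x[n-1] + x[n-2] = 0, which holds exactly for a clean
    sinusoid when a = cos(w). A generic gradient scheme (not the paper's
    hyperstable update); in noise the estimate of a is biased toward zero.
    """
    a = a0
    for n in range(2, len(x)):
        e = x[n] - 2 * a * x[n - 1] + x[n - 2]   # prediction error
        a += mu * e * x[n - 1]                    # LMS step driving e toward 0
        a = np.clip(a, -1.0, 1.0)
    return np.arccos(a)                           # estimated frequency, rad/sample

# Example: clean vs. noisy sinusoid at 0.7 rad/sample (bias shows up in noise).
rng = np.random.default_rng(0)
n = np.arange(20000)
clean = np.cos(0.7 * n + 0.3)
noisy = clean + 0.3 * rng.standard_normal(len(n))
print(lms_frequency_estimate(clean), lms_frequency_estimate(noisy))
```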


IEEE Transactions on Signal Processing | 1993

Using an IIR adaptive filter bank to analyze short data segments of noisy sinusoids

Kenneth W. Martin; Mukund Padmanabhan

An infinite impulse response (IIR) adaptive resonator-in-a-loop filter bank is described that can be used for high-resolution spectral analysis of both long and short data segments, with accuracies approaching the Cramer-Rao lower bounds for SNRs as low as 10 dB. The basic approach is to reanalyze the data segment many times, running the data forward and backward through the filter as the coefficients converge. Special care is taken at the data endpoints, when reinitializing the filter state variables, to eliminate transients.
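As a point of reference for the accuracy claim, the snippet below evaluates the standard asymptotic Cramer-Rao lower bound for the frequency of a single real sinusoid in white Gaussian noise, var(w_hat) >= 12 / (eta * N * (N^2 - 1)) with eta = A^2 / (2 * sigma^2). This is the textbook bound, quoted here only for context (it is not taken from the paper); its 1/N^3 behaviour is what makes short data segments hard.

```python
import numpy as np

def crb_frequency(snr_db, N):
    """Asymptotic CRB (rad^2 per sample^2) for the frequency of one real
    sinusoid in white Gaussian noise: 12 / (eta * N * (N^2 - 1)),
    with eta = A^2 / (2 * sigma^2). Textbook bound, shown only to
    illustrate the N^3 dependence on segment length."""
    eta = 10.0 ** (snr_db / 10.0)
    return 12.0 / (eta * N * (N * N - 1))

# Standard deviation of the best possible frequency estimate at 10 dB SNR:
for N in (32, 128, 512):
    print(N, np.sqrt(crb_frequency(10.0, N)), "rad/sample")
```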


International Conference on Acoustics, Speech, and Signal Processing | 1992

Some further results on modulated/extended lapped transforms

Mukund Padmanabhan; Ken Martin

Some further results on modulated and extended lapped transforms are reported. Sufficient conditions to ensure perfect reconstruction for extended lapped transforms are derived. These conditions are valid for transforms of length lM and place no restriction on l; in this respect they apply to a more general case than earlier derivations that constrained l to be even. Also, a new filter structure is reported for implementing the modulated lapped transform. It has a hardware complexity of O(M) as compared to O(M log M) for other implementations.
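For concreteness, the sketch below constructs the standard modulated lapped transform basis (the length-2M cosine basis with a sine window, i.e. the familiar MDCT form) and numerically checks perfect reconstruction by overlap-add of adjacent blocks. It uses the direct O(M^2)-per-block matrix construction rather than the O(M) filter structure of the paper, and the window and indexing conventions are the usual textbook ones, assumed here for illustration.

```python
import numpy as np

def mlt_basis(M):
    """Standard MLT basis: P[n, k] = w(n) * sqrt(2/M) * cos(pi/M*(n+(M+1)/2)*(k+1/2)),
    for n = 0..2M-1, k = 0..M-1, with sine window w(n) = sin((n+1/2)*pi/(2M))."""
    n = np.arange(2 * M)[:, None]
    k = np.arange(M)[None, :]
    w = np.sin((n + 0.5) * np.pi / (2 * M))
    return w * np.sqrt(2.0 / M) * np.cos(np.pi / M * (n + (M + 1) / 2) * (k + 0.5))

def mlt_round_trip(x, M):
    """Analysis (c = P.T @ block) and overlap-add synthesis (P @ c), hop M."""
    P = mlt_basis(M)
    y = np.zeros(len(x))
    for start in range(0, len(x) - 2 * M + 1, M):
        block = x[start:start + 2 * M]
        y[start:start + 2 * M] += P @ (P.T @ block)
    return y

# Perfect-reconstruction check on interior samples (the first and last M
# samples lack an overlapping partner block); the error should sit at
# machine precision.
M = 8
rng = np.random.default_rng(0)
x = rng.standard_normal(20 * M)
y = mlt_round_trip(x, M)
print(np.max(np.abs(y[M:-M] - x[M:-M])))
```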


International Symposium on Circuits and Systems | 1992

Lerner-based non-uniform filter-banks and some of their applications

Ken Martin; Mukund Padmanabhan

A design process is described that starts with a prototype resonator-based uniform filter-bank, whose bandpass outputs have poor stopband characteristics, and applies the concept of Lerner grouping of adjacent channels to develop a uniform filter-bank with much better bandpass characteristics. For applications requiring the lower frequencies of the input signal to be resolved more accurately than the higher frequencies, the Lerner-based uniform filter-bank may be converted to a non-uniform one by applying an allpass transformation; the merit of using such a transformation is that it retains the good magnitude characteristics of the bandpass channels. The main difference between the present design procedure and that of G. Doblinger (1991) is the manner in which the allpass transformation is applied. Further, unlike previous work, the filter-bank designed here has the property that the sum of the bandpass channels is an allpass function. The implementation of the filter-bank is also considered. The primary additional application of the non-uniform channel-bank examined here is frequency-domain adaptive filtering.
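To illustrate the allpass-transformation step, the snippet below evaluates a first-order allpass A(z) = (z^-1 - a)/(1 - a*z^-1) on the unit circle, takes its negated, unwrapped phase as the frequency-warping map from physical to prototype frequency, and inverts that map numerically to see where uniformly spaced prototype band edges land in physical frequency. The coefficient a = 0.5 and the 8-channel prototype are arbitrary illustrative choices; this shows only the generic warping mechanism, not the Lerner-grouped design of the paper.

```python
import numpy as np

def allpass_warp(omega, a):
    """Frequency map induced by substituting z^-1 -> A(z) = (z^-1 - a)/(1 - a*z^-1):
    physical frequency omega maps to the prototype frequency
    -angle(A(e^{j*omega})) (unwrapped), monotone from 0 to pi for |a| < 1."""
    z_inv = np.exp(-1j * omega)
    A = (z_inv - a) / (1 - a * z_inv)
    return -np.unwrap(np.angle(A))

# Band edges of an 8-channel uniform prototype bank, and the physical-frequency
# edges of the warped (non-uniform) bank, found by inverting the monotone map
# numerically. For a > 0 the channels pack more densely at low frequencies,
# i.e. finer low-frequency resolution.
a = 0.5
omega = np.linspace(0.0, np.pi, 4096)
prototype_edges = np.linspace(0.0, np.pi, 9)
physical_edges = np.interp(prototype_edges, allpass_warp(omega, a), omega)
print(np.round(physical_edges, 3))
```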


International Conference on Acoustics, Speech, and Signal Processing | 1991

The spectral analysis of short data-segments using an IIR adaptive filter

Ken Martin; Mukund Padmanabhan

An infinite impulse response (IIR) adaptive filter is presented that is efficient and can be used to analyze short data segments with high resolution and very small biases. The approach taken is to run the short data segment continuously forward and backward in time through the adaptive filter. The proposed adaptive filter is a filter bank composed of a number of parallel digital resonators with common feedback around all of them. Because of this architecture, it is easy to reset the internal state variables of the adaptive filter each time the direction of the input data segment is changed. This allows the same data segment to be run through the adaptive filter many times, with the adaptive filter converging closer to its optimum state each time. No windowing is required, which results in higher accuracies when determining the sinusoidal components of the input data. The resonator frequencies are adapted to match the frequencies of the input data segment. When the adaptive filter has converged, the isolated input sinusoids are available at the outputs of the resonators.
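The forward-backward reanalysis loop can be illustrated with the same simple cos(w)-recursion estimator sketched earlier for the 1993 hyperstable-filter paper: the short segment is fed through the adaptive update repeatedly, reversing direction on every pass, with the adapted coefficient carried across passes and the filter memory (here just two delayed samples) restarted at each direction change. This is only the control-flow skeleton of the reanalysis idea, using a generic LMS update rather than the paper's resonator bank with common feedback; the segment length, pass count, and step size are illustrative.

```python
import numpy as np

def forward_backward_frequency(segment, passes=50, mu=5e-3):
    """Re-analyze a short segment by running it alternately forward and backward.

    Reuses the identity x[n] = 2*cos(w)*x[n-1] - x[n-2] with an LMS update of
    a = cos(w). The coefficient `a` is carried across passes, while the filter
    memory (the two delayed samples) is implicitly restarted at each direction
    change, so no windowing of the segment is needed. A control-flow sketch
    only; the paper adapts a resonator bank, not this update.
    """
    a = 0.0
    x = np.asarray(segment, dtype=float)
    for p in range(passes):
        data = x if p % 2 == 0 else x[::-1]      # alternate direction each pass
        for n in range(2, len(data)):             # delayed samples restart here
            e = data[n] - 2 * a * data[n - 1] + data[n - 2]
            a = np.clip(a + mu * e * data[n - 1], -1.0, 1.0)
    return np.arccos(a)

# Example: only 64 samples of a 0.9 rad/sample sinusoid, reanalyzed 50 times.
n = np.arange(64)
print(forward_backward_frequency(np.cos(0.9 * n + 0.2)))
```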
