
Publication


Featured research published by Ramesh R. Sarukkai.


international conference on spoken language processing | 1996

Improved spontaneous dialogue recognition using dialogue and utterance triggers by adaptive probability boosting

Ramesh R. Sarukkai; Dana H. Ballard

Based on the observation that the unpredictable nature of conversational speech makes it almost impossible to reliably model sequential word constraints, the notion of word set error criteria is proposed for improved recognition of spontaneous dialogues. The basic idea in the TAB algorithm is to predict a set of words based on some a priori information, and perform a re-scoring pass, wherein the probabilities of the words in the predicted word set are amplified or boosted in some manner. An adaptive gradient descent procedure for tuning the word boosting factor has been formulated. Two novel models which predict the required word sets have been presented: utterance triggers which capture within-utterance long distance word interdependencies, and dialogue triggers which capture local temporal dialogue oriented word relations. The proposed Trigger and Adaptive Boosting (TAB) algorithm has been experimentally tested on a subset of the TRAINS-93 spontaneous dialogues and the TRAINS-95 semispontaneous corpus, and has resulted in improved performance.
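The re-scoring pass described above can be sketched in a few lines: a trigger model supplies a predicted word set, and the log-probabilities of matching words are boosted by a multiplicative factor before the N-best list is re-ranked. This is an illustrative sketch, not the authors' implementation; the hypothesis format and the fixed boost factor are assumptions (the paper tunes the factor by adaptive gradient descent).

```python
import math

def boost_rescore(hypotheses, trigger_set, boost=1.5):
    """Re-score N-best hypotheses by amplifying (boosting) the
    log-probabilities of words that appear in a predicted word set.

    hypotheses  : list of (word_list, log_prob_list) pairs
    trigger_set : set of words predicted by the trigger models
    boost       : multiplicative boost factor (assumed fixed here)
    """
    rescored = []
    for words, log_probs in hypotheses:
        score = sum(lp + math.log(boost) if w in trigger_set else lp
                    for w, lp in zip(words, log_probs))
        rescored.append((words, score))
    # Return hypotheses sorted best-first by boosted score.
    return sorted(rescored, key=lambda h: h[1], reverse=True)
```

A trigger word in the predicted set can thus pull an otherwise lower-scoring hypothesis to the top of the list.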


IEEE Transactions on Speech and Audio Processing | 1997

Word set probability boosting for improved spontaneous dialog recognition

Ramesh R. Sarukkai; Dana H. Ballard

Based on the observation that the unpredictable nature of conversational speech makes it almost impossible to reliably model sequential word constraints, the notion of word set error criteria is proposed for improved recognition of spontaneous dialogs. The single-pass adaptive boosting (AB) algorithm enables the language model weights to be tuned using the word set error criteria. In the two-pass version of the algorithm, the basic idea is to predict a set of words based on some a priori information, and perform a rescoring pass wherein the probabilities of the words in the predicted word set are amplified or boosted in some manner. An adaptive gradient descent procedure for tuning the word boosting factor is formulated, which enables the boost factors to be incrementally adjusted to maximize the accuracy of the speech recognition system outputs on held-out training data using the word set error criteria. Two novel models which predict the required word sets are presented: (i) utterance triggers, which capture within-utterance long-distance word interdependencies, and (ii) dialog triggers, which capture local temporal dialog-oriented word relations. The proposed trigger and adaptive boosting (TAB) algorithm, and the single-pass adaptive boosting (AB) algorithm are experimentally tested on a subset of the TRAINS-93 spontaneous dialogs and the TRAINS-95 semispontaneous corpus, and the results are summarized.


IEEE Transactions on Neural Networks | 1996

Supervised self-coding in multilayered feedforward networks

Ramesh R. Sarukkai

Supervised neural-network learning algorithms have proven very successful at solving a variety of learning problems. However, they suffer from a common problem of requiring explicit output labels. This requirement makes such algorithms implausible as biological models. In this paper, it is shown that pattern classification can be achieved, in a multilayered feedforward neural network, without requiring explicit output labels, by a process of supervised self-coding. The class projection is achieved by optimizing appropriate within-class uniformity and between-class discernibility criteria. The mapping function and the class labels are developed together, iteratively, using the derived self-coding backpropagation algorithm. The ability of the self-coding network to generalize on unseen data is also experimentally evaluated on real data sets, and compares favorably with traditional labeled supervision with neural networks. However, interesting features emerge out of the proposed self-coding supervision, which are absent in conventional approaches. The further implications of supervised self-coding with neural networks are also discussed.
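One illustrative reading of the two criteria is sketched below: the running class label is taken as the mean network output for that class (so labels and mapping evolve together), within-class uniformity penalizes spread around those labels, and between-class discernibility rewards distance between the labels. The exact functional forms and names here are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def self_coding_criteria(outputs, classes):
    """Illustrative version of the two self-coding criteria.

    outputs : (n_samples, n_outputs) network outputs
    classes : (n_samples,) integer class ids (supervisory information)
    """
    # Each class "label" is the mean output for that class.
    labels = {c: outputs[classes == c].mean(axis=0)
              for c in np.unique(classes)}
    # Within-class uniformity: summed squared spread around the labels.
    within = sum(((outputs[classes == c] - lab) ** 2).sum()
                 for c, lab in labels.items())
    # Between-class discernibility: summed distance between the labels.
    between = sum(np.linalg.norm(labels[a] - labels[b])
                  for a in labels for b in labels if a < b)
    return labels, within, between
```

Training would alternate between recomputing the labels and descending on a combined criterion (small `within`, large `between`).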


international symposium on neural networks | 1995

Solving XOR with a single layered perceptron by supervised self-organization of multiple output labels per class

Ramesh R. Sarukkai

Popular neural network learning algorithms such as Kohonen's LVQ handle nonlinearity by assigning multiple codebook vectors per class. However, the architectural constraint requires the output units to activate in a winner-take-all fashion. In this paper, clustering of output projections developed with traditional discriminant analysis networks is achieved by allowing multiple output labels for every class: the key to such a formulation lies in the supervised self-organization algorithm which enables conventional feedforward networks to self-organize their own output labels given class information. The idea of supervised self-organization of multiple output labels has been demonstrated by implementing the XOR problem with a single layer perceptron network.
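A minimal sketch of why multiple output labels per class make XOR solvable by a single layer: splitting each XOR class into one output label per input pattern leaves every output unit with a linearly separable one-vs-rest problem, which a perceptron can learn. In the paper the labels are self-organized; here they are fixed by hand for brevity, and the perceptron learning details (rule, rates, init) are assumptions.

```python
import numpy as np

# Each XOR class is split into two sub-labels, one per input pattern.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
sub_label = np.array([0, 1, 2, 3])        # one output unit per pattern
unit_to_class = np.array([0, 1, 1, 0])    # XOR class of each sub-label

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 2))    # single layer: 4 units x 2 inputs
b = np.zeros(4)

# One-vs-rest perceptron updates for each output unit; each unit's
# target pattern is linearly separable from the other three corners.
for _ in range(200):
    for x, t in zip(X, sub_label):
        target = np.where(np.arange(4) == t, 1.0, -1.0)
        pred = np.where(W @ x + b >= 0, 1.0, -1.0)
        err = target - pred
        W += 0.1 * np.outer(err, x)
        b += 0.1 * err

def predict_xor(x):
    # The winning sub-label's class is the predicted XOR output.
    return unit_to_class[np.argmax(W @ np.asarray(x, float) + b)]
```

After convergence the winning unit on each corner is that corner's own sub-label, so the recovered class sequence over the four inputs is the XOR truth table.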


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1998

Phonetic set indexing for fast lexical access

Ramesh R. Sarukkai; Dana H. Ballard

A novel nonsequential indexing mechanism (termed phonetic set indexing) has been evaluated for the purpose of fast word pre-selection. Our approach to handling the lexical access problem stems from the primary observation that the set of phones which are present in the transcription of a word is sparsely distributed across the vocabulary, and is thus suitable as an indexing key for retrieving a short-list of word possibilities.


Neural Computation | 1997

Supervised networks that self-organize class outputs

Ramesh R. Sarukkai

Supervised, neural network, learning algorithms have proved very successful at solving a variety of learning problems; however, they suffer from a common problem of requiring explicit output labels. In this article, it is shown that pattern classification can be achieved, in a multilayered, feedforward, neural network, without requiring explicit output labels, by a process of supervised self-organization. The class projection is achieved by optimizing appropriate within-class uniformity and between-class discernibility criteria. The mapping function and the class labels are developed together iteratively using the derived self organizing backpropagation algorithm. The ability of the self-organizing network to generalize on unseen data is also experimentally evaluated on real data sets and compares favorably with the traditional labeled supervision with neural networks. In addition, interesting features emerge out of the proposed self-organizing supervision, which are absent in conventional approaches.


international conference on acoustics speech and signal processing | 1996

A novel word pre-selection method based on phonetic set indexing

Ramesh R. Sarukkai; Dana H. Ballard

The possibility of pre-fetching words using the phoneme sequence output of automatic speech recognition systems has been explored. A novel non-sequential indexing mechanism (termed phonetic set indexing) has been evaluated for the purpose of fast word pre-selection. Our approach to handling the lexical access problem stems from the primary observation that the set of phonemes which are present in a transcription of a word is sparsely distributed across the vocabulary, and is thus suitable as an indexing key for retrieving a short list of word possibilities. The fixed dimensionality of the phonetic set representation also makes it suitable for implementation with compact bit strings, thus enabling fast lexical access using low-level bit operations.
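The bit-string representation described above can be sketched as follows. The mini-lexicon and phone symbols are invented for illustration (a real system would use a full phoneme inventory and large vocabulary); the point is that the key ignores order and repetition, so it is a fixed-width set signature usable with low-level bit operations.

```python
# Hypothetical mini-lexicon with simplified phoneme transcriptions.
LEXICON = {
    "cat":  ["k", "ae", "t"],
    "tack": ["t", "ae", "k"],
    "dog":  ["d", "ao", "g"],
    "good": ["g", "uh", "d"],
}
PHONES = sorted({p for phones in LEXICON.values() for p in phones})
BIT = {p: 1 << i for i, p in enumerate(PHONES)}

def phone_set_key(phones):
    """Fixed-width bit string: bit i is set iff phone i occurs at all.
    Order and repetition are deliberately ignored."""
    key = 0
    for p in phones:
        key |= BIT[p]
    return key

# Inverted index from phonetic-set key to candidate words.
INDEX = {}
for word, phones in LEXICON.items():
    INDEX.setdefault(phone_set_key(phones), []).append(word)

def preselect(recognized_phones):
    """Retrieve the short list of words sharing the same phone set."""
    return INDEX.get(phone_set_key(recognized_phones), [])
```

Anagram-like words such as "cat" and "tack" collapse to the same key, which is exactly why the index returns a short list of candidates rather than a single word.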


international symposium on neural networks | 1994

Normalizing internal representations for speech classification

Ramesh R. Sarukkai; Dana H. Ballard

Speech segments are encoded using an autoassociative network, and the possibility of matching the hidden unit activation sequences for classification is studied. Good discrimination can be achieved by matching the lower dimensional projections of the unknown with template speech patterns. The possibility of normalizing the variations of the hidden unit activation sequences is then explored. In particular, experiments demonstrate the advantage of the presented technique for single- and multi-speaker syllable distinction tasks. Normalization of the encoded representations of sounds within classes and across speakers improves results significantly.


international conference on pattern recognition | 1994

Cross-coding networks for speech classification

Ramesh R. Sarukkai; Dana H. Ballard

What kind of internal representations develop with networks that transform speech of one speaker to that of another? This question is addressed in this paper by a novel supervised coding scheme: cross-coding. Instead of performing auto-association, we train networks to map speech of many speakers to speech of a particular speaker, with intermediate bottlenecks. The internal representations developed are then input to another network trained to label the corresponding sounds. Interestingly, the cross-codings seem to have captured speaker-invariant properties in the different sounds. Experiments with a multispeaker syllable recognition task show that the proposed scheme outperforms the corresponding multilayered net.


international symposium on neural networks | 1995

Prime numbers and output codes

Ramesh R. Sarukkai

In this paper, the prime number encoding scheme is proposed for representing the supervisory class signals at the output layer of multilayered feedforward networks. The notion of prime number encoding stems from the idea that the modulo representations with respect to two appropriately chosen primes can uniquely represent any bounded integer value. In particular, the derived prime number encoding scheme is a viable replacement for the conventional 1-per-class coding scheme used for indexing tasks. Face recognition experiments performed using a 24-face database, and word-spotting experiments performed with the TIMIT speech database, suggest that the prime number encoding scheme is beneficial for indexing purposes, not only from the standpoint of parameter reduction, but also in terms of recognition performance.
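The modulo idea can be illustrated with a small sketch (the primes and sizes below are chosen arbitrarily, not taken from the paper): with primes p and q, the residue pair (c mod p, c mod q) identifies any class index c < p·q uniquely by the Chinese remainder theorem, so p + q output units can stand in for p·q one-per-class units.

```python
# Sketch: prime-residue output code vs. 1-per-class coding.
P, Q = 5, 7          # P * Q = 35 distinct class indices representable

def encode(c):
    """Two-hot target vector of length P + Q for class index c < P*Q:
    one unit on in the mod-P block, one in the mod-Q block."""
    vec = [0.0] * (P + Q)
    vec[c % P] = 1.0
    vec[P + c % Q] = 1.0
    return vec

def decode(vec):
    """Recover the class index from the winning residue in each block,
    via the Chinese remainder theorem (brute-force form)."""
    r_p = max(range(P), key=lambda i: vec[i])
    r_q = max(range(Q), key=lambda i: vec[P + i])
    return next(c for c in range(P * Q)
                if c % P == r_p and c % Q == r_q)
```

For example, 24 classes (as in the face database mentioned above) would need only 5 + 7 = 12 output units under this code, versus 24 under 1-per-class coding.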

Collaboration


Dive into Ramesh R. Sarukkai's collaboration.

Top Co-Authors


Dana H. Ballard

University of Texas at Austin
