Publication


Featured research published by Lalit R. Bahl.


IEEE Transactions on Information Theory | 1974

Optimal decoding of linear codes for minimizing symbol error rate (Corresp.)

Lalit R. Bahl; John Cocke; Frederick Jelinek; Josef Raviv

The general problem of estimating the a posteriori probabilities of the states and transitions of a Markov source observed through a discrete memoryless channel is considered. The decoding of linear block and convolutional codes to minimize symbol error probability is shown to be a special case of this problem. An optimal decoding algorithm is derived.
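
This is the algorithm that later became widely known as the BCJR algorithm. As a rough illustration of the underlying forward-backward idea (not the paper's notation), the posterior state probabilities can be computed as sketched below; the array names trans, emit, init, and obs are assumptions made for this sketch.

import numpy as np

def posterior_states(trans, emit, init, obs):
    """Forward-backward posteriors for a Markov source observed through a
    discrete memoryless channel (a sketch of the forward-backward idea).

    trans[i, j] : probability of moving from state i to state j
    emit[j, y]  : probability of channel output symbol y given state j
    init[i]     : initial state distribution
    obs         : observed output sequence as a list of symbol indices
    """
    S, T = trans.shape[0], len(obs)

    # Forward pass: alpha[t, j] is proportional to Pr(y_1..y_t, state_t = j).
    alpha = np.zeros((T, S))
    alpha[0] = init * emit[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * emit[:, obs[t]]

    # Backward pass: beta[t, i] is proportional to Pr(y_{t+1}..y_T | state_t = i).
    beta = np.ones((T, S))
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (emit[:, obs[t + 1]] * beta[t + 1])

    # Posterior state probabilities Pr(state_t = j | y_1..y_T).
    # Transition posteriors follow from the same alpha and beta arrays.
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)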


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1983

A Maximum Likelihood Approach to Continuous Speech Recognition

Lalit R. Bahl; Frederick Jelinek; Robert L. Mercer

Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the parameters for such models from sparse data. We also describe two decoding methods, one appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks. To illustrate the usefulness of the methods described, we review a number of decoding results that have been obtained with them.
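
As a sketch of the decision rule implied by this formulation (the notation here is chosen for illustration, with $A$ the acoustic evidence and $W$ a candidate word sequence), maximum likelihood decoding selects

\hat{W} = \arg\max_{W} \Pr(W \mid A) = \arg\max_{W} \Pr(W)\,\Pr(A \mid W),

so that a language model $\Pr(W)$ and an acoustic model $\Pr(A \mid W)$ enter as separate statistical components whose parameters can be estimated from training data.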


International Conference on Acoustics, Speech, and Signal Processing | 1986

Maximum mutual information estimation of hidden Markov model parameters for speech recognition

Lalit R. Bahl; Peter F. Brown; P. V. de Souza; Robert L. Mercer

A method for estimating the parameters of hidden Markov models of speech is described. Parameter values are chosen to maximize the mutual information between an acoustic observation sequence and the corresponding word sequence. Recognition results are presented comparing this method with maximum likelihood estimation.
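
As a hedged sketch of the criterion (standard MMI notation, not necessarily the paper's): where maximum likelihood estimation picks parameters $\theta$ to maximize $\Pr_\theta(A \mid W)$ for training acoustics $A$ and transcript $W$, maximum mutual information estimation, for a fixed language model $\Pr(W)$, maximizes

\hat{\theta} = \arg\max_{\theta} \log \frac{\Pr_\theta(A \mid W)\,\Pr(W)}{\sum_{W'} \Pr_\theta(A \mid W')\,\Pr(W')},

which rewards parameters that make the correct transcript more probable than competing word sequences.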


IEEE Transactions on Acoustics, Speech, and Signal Processing | 1989

A tree-based statistical language model for natural language speech recognition

Lalit R. Bahl; Peter F. Brown; P. V. de Souza; Robert L. Mercer

The problem of predicting the next word a speaker will say, given the words already spoken, is discussed. Specifically, the problem is to estimate the probability that a given word will be the next word uttered. Algorithms are presented for automatically constructing a binary decision tree designed to estimate these probabilities. At each node of the tree there is a yes/no question relating to the words already spoken, and at each leaf there is a probability distribution over the allowable vocabulary. Ideally, these nodal questions can take the form of arbitrarily complex Boolean expressions, but computationally cheaper alternatives are also discussed. Some results obtained on a 5000-word vocabulary with a tree designed to predict the next word spoken from the preceding 20 words are included. The tree is compared to an equivalent trigram model and shown to be superior.
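
A minimal sketch in Python of how such a tree might be queried at recognition time; the Node structure, the toy question, and the toy distributions below are illustrative assumptions, not the construction algorithm of the paper.

from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

@dataclass
class Node:
    """A node of a binary decision-tree language model.

    Internal nodes hold a yes/no question about the word history;
    leaves hold a probability distribution over the vocabulary.
    """
    question: Optional[Callable[[Sequence[str]], bool]] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    distribution: Optional[Dict[str, float]] = None

def next_word_probability(root: Node, history: Sequence[str], word: str) -> float:
    """Walk the tree using the spoken history, then read the leaf distribution."""
    node = root
    while node.distribution is None:
        node = node.yes if node.question(history) else node.no
    return node.distribution.get(word, 0.0)

# Illustrative toy tree: a single question about the most recent word.
leaf_after_the = Node(distribution={"cat": 0.4, "dog": 0.3, "market": 0.3})
leaf_other = Node(distribution={"the": 0.5, "a": 0.3, "cat": 0.2})
root = Node(question=lambda h: len(h) > 0 and h[-1] == "the",
            yes=leaf_after_the, no=leaf_other)

print(next_word_probability(root, ["i", "saw", "the"], "cat"))  # 0.4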


IEEE Transactions on Information Theory | 1975

Design of a linguistic statistical decoder for the recognition of continuous speech

Frederick Jelinek; Lalit R. Bahl; Robert L. Mercer

Most current attempts at automatic speech recognition are formulated in an artificial intelligence framework. In this paper we approach the problem from an information-theoretic point of view. We describe the overall structure of a linguistic statistical decoder (LSD) for the recognition of continuous speech. The input to the decoder is a string of phonetic symbols estimated by an acoustic processor (AP). For each phonetic string, the decoder finds the most likely input sentence. The decoder consists of four major subparts: 1) a statistical model of the language being recognized; 2) a phonemic dictionary and statistical phonological rules characterizing the speaker; 3) a phonetic matching algorithm that computes the similarity between phonetic strings, using the performance characteristics of the AP; 4) a word level search control. The details of each of the subparts and their interaction during the decoding process are discussed.


International Conference on Acoustics, Speech, and Signal Processing | 1988

Acoustic Markov models used in the Tangora speech recognition system

Lalit R. Bahl; Peter F. Brown; P. V. de Souza; Michael Picheny

The Speech Recognition Group at IBM Research has developed a real-time, isolated-word speech recognizer called Tangora, which accepts natural English sentences drawn from a vocabulary of 20,000 words. Despite its large vocabulary, the Tangora recognizer requires only about 20 minutes of speech from each new user for training purposes. The accuracy of the system and its ease of training are largely attributable to the use of hidden Markov models in its acoustic match component. An automatic technique for constructing Markov word models is described, and results of experiments with speaker-dependent and speaker-independent models on several isolated-word recognition tasks are included.


IEEE Transactions on Information Theory | 1975

Decoding for channels with insertions, deletions, and substitutions with applications to speech recognition

Lalit R. Bahl; Frederick Jelinek

A model for channels in which an input sequence can produce output sequences of varying length is described. An efficient computational procedure for calculating $\Pr\{Y \mid X\}$ is devised, where $X = x_1, x_2, \cdots, x_M$ and $Y = y_1, y_2, \cdots, y_N$ are the input and output of the channel. A stack decoding algorithm for decoding on such channels is presented. The appropriate likelihood function is derived. Channels with memory are considered. Some applications to speech and character recognition are discussed.
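
A hedged sketch of the kind of prefix recursion such a procedure can use: a dynamic program over prefixes of $X$ and $Y$ that accumulates probability over insertions, deletions, and substitutions. The per-symbol probability functions p_ins, p_del, and p_sub are illustrative stand-ins, not the paper's channel model.

def sequence_likelihood(x, y, p_ins, p_del, p_sub):
    """Compute Pr(Y | X) for a memoryless channel that may insert, delete,
    or substitute symbols, by dynamic programming over prefixes.

    p_ins(b)    : probability of inserting output symbol b
    p_del(a)    : probability of deleting input symbol a
    p_sub(a, b) : probability of transmitting a and receiving b
    (A real channel model would make these probabilities normalize
    appropriately; here they are simply supplied functions.)
    """
    M, N = len(x), len(y)
    # P[i][j] = Pr(first j output symbols | first i input symbols)
    P = [[0.0] * (N + 1) for _ in range(M + 1)]
    P[0][0] = 1.0
    for i in range(M + 1):
        for j in range(N + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if j >= 1:
                total += P[i][j - 1] * p_ins(y[j - 1])                 # insertion
            if i >= 1:
                total += P[i - 1][j] * p_del(x[i - 1])                 # deletion
            if i >= 1 and j >= 1:
                total += P[i - 1][j - 1] * p_sub(x[i - 1], y[j - 1])   # substitution or match
            P[i][j] = total
    return P[M][N]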


International Conference on Acoustics, Speech, and Signal Processing | 1995

Performance of the IBM large vocabulary continuous speech recognition system on the ARPA Wall Street Journal task

Lalit R. Bahl; S. Balakrishnan-Aiyer; J. R. Bellegarda; Martin Franz; Ponani S. Gopalakrishnan; David Nahamoo; Miroslav Novak; Mukund Padmanabhan; Michael Picheny; Salim Roukos

In this paper we discuss various experimental results using our continuous speech recognition system on the Wall Street Journal task. Experiments with different feature extraction methods, varying amounts and type of training data, and different vocabulary sizes are reported.


Information and Control | 1970

Block codes for a class of constrained noiseless channels

Donald T. Tang; Lalit R. Bahl

A class of discrete noiseless channels having upper and lower bounds on the separation between adjacent nonzero input symbols is considered. Recursion relations are derived for determining the number of input sequences which satisfy the constraints for all block lengths, and the asymptotic information rate is calculated. Applications to compaction and synchronization are discussed. An optimal algebraic block coding scheme for such channels is developed.
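
As a rough illustration (assuming the constraint is read as a (d, k) run-length limit: at most k consecutive zeros anywhere, and at least d zeros between consecutive ones), the number of admissible sequences of each block length can be counted by a simple state recursion, and the asymptotic information rate estimated from its growth. The sketch below is an illustration under that assumption, not the paper's recursion.

from math import log2

def count_dk_sequences(n, d, k):
    """Count binary sequences of length n with at most k consecutive zeros
    anywhere and at least d zeros between any two ones (one common reading
    of the constrained-channel inputs in the abstract)."""
    # State: (has a one been emitted yet?, length of the trailing zero run).
    counts = {(False, 0): 1}
    for _ in range(n):
        new_counts = {}
        for (seen_one, run), c in counts.items():
            if run + 1 <= k:                      # append a zero
                key = (seen_one, run + 1)
                new_counts[key] = new_counts.get(key, 0) + c
            if not seen_one or run >= d:          # append a one
                key = (True, 0)
                new_counts[key] = new_counts.get(key, 0) + c
        counts = new_counts
    return sum(counts.values())

# Estimate the asymptotic information rate in bits per symbol.
n = 60
rate = log2(count_dk_sequences(n, d=1, k=3)) / n
print(rate)  # tends toward ~0.55 bits/symbol (the (1, 3) capacity) as n grows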


International Conference on Acoustics, Speech, and Signal Processing | 1991

Decision trees for phonological rules in continuous speech

Lalit R. Bahl; P. V. de Souza; Ponani S. Gopalakrishnan; David Nahamoo; Michael Picheny

The authors present an automatic method for modeling phonological variation using decision trees. For each phone they construct a decision tree that specifies the acoustic realization of the phone as a function of the context in which it appears. Several thousand sentences from a natural language corpus, spoken by several speakers, are used to construct these decision trees. Experimental results on a 5000-word vocabulary natural language speech recognition task are presented.
