Publications
Featured research published by Larry Gillick.
international conference on acoustics, speech, and signal processing | 1997
Larry Gillick; Yoshiko Ito; Jonathan Young
In this paper we propose a novel way of estimating confidences for words that are recognized by a speech recognition system, together with a natural methodology for evaluating the overall quality of those confidence estimates. Our approach is based on an interpretation of a confidence as the probability that the corresponding recognized word is correct, and makes use of generalized linear models as a means for combining various predictor scores so as to arrive at confidence estimates. Experimental results using these models are presented based on four different sources of speech data: Switchboard, Spanish and Mandarin CallHome, and Wall Street Journal.
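The idea of combining predictor scores into a probability of correctness can be illustrated with logistic regression, which is one common generalized linear model. The sketch below is not the paper's implementation: the two predictor scores, the toy data, and the function names are all hypothetical, and the mean log-likelihood shown is only one plausible way to judge the quality of confidence estimates.

```python
import math


def sigmoid(z):
    """Logistic link: maps a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def confidence(weights, bias, scores):
    """Estimated probability that a recognized word is correct."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, scores)))


def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the negative log-likelihood."""
    n_feat = len(features[0])
    weights, bias = [0.0] * n_feat, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for x, y in zip(features, labels):
            err = confidence(weights, bias, x) - y
            for j in range(n_feat):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def mean_log_likelihood(confidences, labels):
    """Average log-probability assigned to the true outcome (higher is better)."""
    total = 0.0
    for p, y in zip(confidences, labels):
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(labels)


# Hypothetical predictor scores per recognized word (both scaled to [0, 1]).
features = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.6), (0.6, 0.7),
            (0.4, 0.3), (0.3, 0.2), (0.2, 0.4), (0.1, 0.1),
            (0.5, 0.5), (0.55, 0.45), (0.8, 0.7), (0.2, 0.3)]
# 1 = word was correct, 0 = word was wrong (toy labels, half correct).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1]

weights, bias = fit_logistic(features, labels)
model_ll = mean_log_likelihood(
    [confidence(weights, bias, x) for x in features], labels)
# Baseline: always report the overall fraction of correct words (0.5 here).
baseline_ll = math.log(0.5)
```

Comparing `model_ll` with `baseline_ll` indicates how much the predictor scores add beyond simply knowing the overall word accuracy.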
international conference on acoustics, speech, and signal processing | 1993
Larry Gillick; Janet M. Baker; John S. Bridle; Melvyn J. Hunt; Yoshiko Ito; S. Lowe; Jeremy Orloff; Barbara Peskin; R. Roth; F. Scattone
The authors describe a novel approach to the problems of topic and speaker identification that makes use of large-vocabulary continuous speech recognition. A theoretical framework for dealing with these problems in a symmetric way is provided. Some empirical results on topic and speaker identification that have been obtained on the extensive Switchboard corpus of telephone conversations are presented.
international conference on spoken language processing | 1996
Michael Newman; Larry Gillick; Yoshiko Ito; Don McAllaster; Barbara Peskin
The authors present a study of a speaker verification system for telephone data based on large-vocabulary speech recognition. After describing the recognition engine, they give details of the verification algorithm and draw comparisons with other systems. The system has been tested on a test set taken from the Switchboard corpus of conversational telephone speech, and they present results showing how performance varies with length of test utterance, and whether or not the training data has been transcribed. The dominant factor in performance appears to be channel or handset mismatch between training and testing data.
international conference on acoustics, speech, and signal processing | 1996
Sergio Mendoza; Larry Gillick; Yoshiko Ito; Stephen A. Lowe; Michael Newman
We have developed a highly accurate automatic language identification system based on large vocabulary continuous speech recognition (LVCSR). Each test utterance is recognized in a number of languages, and the language ID decision is based on the probability of the output word sequence reported by each recognizer. Recognizers were implemented for this test in English, Japanese, and Spanish, using the Ricardo corpus of telephone monologues. When tested on the OGI corpus of digitally recorded telephone speech, we obtained error rates of 3% or lower on 2-way and 3-way closed-set classification of ten-second and one-minute speech segments.
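The decision rule described in this abstract, choosing the language whose recognizer reports the most probable word sequence, can be sketched as follows. The log-probability values and function names are hypothetical, and the abstract does not say whether or how scores are normalized across recognizers, so the posterior computation here simply assumes equal priors and directly comparable scores.

```python
import math


def identify_language(log_probs):
    """Pick the language whose recognizer reported the highest log-probability."""
    return max(log_probs, key=log_probs.get)


def posteriors(log_probs):
    """Turn log-probabilities into posteriors, assuming equal language priors.

    Subtracting the maximum before exponentiating avoids numerical underflow.
    """
    peak = max(log_probs.values())
    shifted = {lang: math.exp(lp - peak) for lang, lp in log_probs.items()}
    total = sum(shifted.values())
    return {lang: value / total for lang, value in shifted.items()}


# Hypothetical log-probabilities reported for one test utterance.
utterance_scores = {"English": -1200.5, "Japanese": -1350.2, "Spanish": -1310.0}

decision = identify_language(utterance_scores)
language_posteriors = posteriors(utterance_scores)
```

A 2-way closed-set test corresponds to passing only two of the languages in `utterance_scores`.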
human language technology | 1993
Barbara Peskin; Larry Gillick; Yoshiko Ito; Stephen Lowe; Robert Roth; Francesco Scattone; James K. Baker; Janet M. Baker; John S. Bridle; Melvyn J. Hunt; Jeremy Orloff
In this paper we exhibit a novel approach to the problems of topic and speaker identification that makes use of a large vocabulary continuous speech recognizer. We present a theoretical framework which formulates the two tasks as complementary problems, and describe the symmetric way in which we have implemented their solution. Results of trials of the message identification systems using the Switchboard corpus of telephone conversations are reported.
international conference on acoustics, speech, and signal processing | 1993
R. Roth; Janet M. Baker; Larry Gillick; Melvyn J. Hunt; Yoshiko Ito; S. Lowe; Jeremy Orloff; Barbara Peskin; F. Scattone
The authors report on the progress that has been made at Dragon Systems in speaker-independent large-vocabulary speech recognition using speech from DARPA's Wall Street Journal corpus. First, they present an overview of the recognition and training algorithms. Then, they describe experiments involving two improvements to these algorithms: moving to higher-dimensional streams and using an IMELDA transformation. They also present some results showing the reduction in error rates.
international conference on acoustics, speech, and signal processing | 1997
Venkatesh Nagesha; Larry Gillick
This paper studies the use of transformation-based speaker adaptation in improving the performance of large vocabulary continuous speech recognition systems. We present a formulation of the adaptation procedure that is simpler than existing methods. Our experiments demonstrate that speaker normalization continues to be important even after significant amounts of speaker adaptation. An automatic clustering algorithm is compared to human expertise in sorting output distributions into collections that share the same transformation. We quantify improvements over standard Bayesian (maximum a posteriori, or MAP) adaptation in terms of (a) speed of adaptation, and (b) robustness to transcription errors. Finally, we discuss the use of speaker transformations in the training process.
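The contrast between MAP adaptation and a shared transformation can be illustrated with a one-dimensional toy example. This is not the paper's formulation: the affine form, the least-squares fit, the prior weight `tau`, and all data values are assumptions chosen only to show why a shared transformation can adapt distributions for which no speaker data has been seen, while MAP leaves them unchanged.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP update of a Gaussian mean; tau is a hypothetical prior weight."""
    return (tau * prior_mean + sum(frames)) / (tau + len(frames))


def fit_shared_affine(prior_means, frames_by_dist):
    """Least-squares fit of frame ~ a * prior_mean + b over all observed frames.

    One (a, b) pair is shared by every distribution in the collection.
    """
    xs, ys = [], []
    for name, frames in frames_by_dist.items():
        for frame in frames:
            xs.append(prior_means[name])
            ys.append(frame)
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical speaker-independent means for three output distributions.
prior_means = {"d0": 0.0, "d1": 2.0, "d2": 4.0}
# Toy adaptation data: the new speaker has produced frames for d0 and d1 only.
speaker_frames = {"d0": [1.0, 1.2, 0.8], "d1": [3.0, 2.9, 3.1]}

a, b = fit_shared_affine(prior_means, speaker_frames)
transformed = {name: a * mean + b for name, mean in prior_means.items()}
map_adapted = {name: map_adapt_mean(mean, speaker_frames.get(name, []))
               for name, mean in prior_means.items()}
```

In this toy setup `transformed["d2"]` moves with the shift learned from `d0` and `d1`, whereas `map_adapted["d2"]` stays at its prior value because no frames were observed for it.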
human language technology | 1994
Steve Lowe; Anne Demedts; Larry Gillick; Mark A. Mandel; Barbara Peskin
The goal of this study is to evaluate the potential for using large vocabulary continuous speech recognition as an engine for automatically classifying utterances according to the language being spoken. The problem of language identification is often thought of as being separate from the problem of speech recognition. But in this paper, as in Dragon's earlier work on topic and speaker identification, we explore a unifying approach to all three message classification problems based on the underlying stochastic process which gives rise to speech. We discuss the theoretical framework upon which our message classification systems are built and report on a series of experiments in which this theory is tested, using large vocabulary continuous speech recognition to distinguish English from Spanish.
human language technology | 1992
Larry Gillick; Barbara Peskin; Robert Roth
This paper describes a new algorithm for building rapid match models for use in Dragon's continuous speech recognizer. Rather than working from a single representative token for each word, the new procedure works directly from a set of trained hidden Markov models. By simulated traversals of the HMMs, we generate a collection of sample tokens for each word which are then averaged together to build new rapid match models. This method enables us to construct models which better reflect the true variation in word occurrences and which no longer require the extensive adaptation needed in our original method. In this preliminary report, we outline this new procedure for building rapid match models and report results from initial testing on the Wall Street Journal recognition task.
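The procedure of simulating HMM traversals and averaging the resulting sample tokens can be sketched in miniature. Everything concrete below is an assumption made for illustration: the left-to-right topology, one-dimensional Gaussian emissions, the self-loop probabilities, and in particular the step that resamples each token to a fixed number of frames before averaging, since the abstract does not say how tokens of different lengths are combined.

```python
import random


def sample_token(state_means, state_stds, self_loop_probs, rng):
    """Simulate one traversal of a left-to-right HMM, emitting one value per frame."""
    token = []
    for mean, std, stay in zip(state_means, state_stds, self_loop_probs):
        while True:
            token.append(rng.gauss(mean, std))
            if rng.random() >= stay:  # leave this state and move to the next
                break
    return token


def resample(token, n_frames):
    """Pick n_frames evenly spaced frames, keeping the first and last frame."""
    last = len(token) - 1
    return [token[round(i * last / (n_frames - 1))] for i in range(n_frames)]


def build_rapid_match_model(state_means, state_stds, self_loop_probs,
                            n_samples=500, n_frames=6, seed=0):
    """Average many simulated tokens into one fixed-length template."""
    rng = random.Random(seed)
    totals = [0.0] * n_frames
    for _ in range(n_samples):
        token = sample_token(state_means, state_stds, self_loop_probs, rng)
        for i, value in enumerate(resample(token, n_frames)):
            totals[i] += value
    return [total / n_samples for total in totals]


# Hypothetical three-state word model with increasing state means.
word_model = build_rapid_match_model(
    state_means=[0.0, 5.0, 10.0],
    state_stds=[1.0, 1.0, 1.0],
    self_loop_probs=[0.6, 0.6, 0.6],
)
```

Because the template is an average over many simulated traversals, it reflects the duration and emission variability encoded in the HMM rather than the quirks of any single recorded token.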
human language technology | 1993
Janet M. Baker; Larry Gillick; Robert Roth
The primary long term goal of speech research at Dragon Systems is to develop algorithms that are capable of achieving very high performance large vocabulary continuous speech recognition. At the same time, in the long run we are also concerned to keep the demands of those algorithms for computational power and memory as modest as possible, so that the results of our research can be incorporated into products that will run on moderately priced personal computers.