Charles T. Hemphill
Texas Instruments
Publications
Featured research published by Charles T. Hemphill.
human language technology | 1990
Charles T. Hemphill; John J. Godfrey; George R. Doddington
Speech research has made tremendous progress in the past using the following paradigm:
• define the research problem,
• collect a corpus to objectively measure progress, and
• solve the research problem.
Natural language research, on the other hand, has typically progressed without the benefit of any corpus of data with which to test research hypotheses. We describe the Air Travel Information System (ATIS) pilot corpus, a corpus designed to measure progress in Spoken Language Systems that include both a speech and natural language component. This pilot marks the first full-scale attempt to collect such a corpus and provides guidelines for future efforts.
Journal of the Acoustical Society of America | 2003
Lajos Molnar; Charles T. Hemphill
A text-to-pronunciation system (11) includes a large training set of word pronunciations (19) and an extractor that extracts language-specific information from the training set in order to produce pronunciations for words not in the training set. A learner (13) forms pronunciation guesses for words in the training set and finds a transformation rule that improves those guesses. A rule applier (15) applies the rule found to the guesses. The learner (13) then searches for another rule and the rule applier (15) applies it, iterating so that at each step the rule that improves the guesses the most is found and applied.
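The learner/rule-applier loop is a form of transformation-based learning. A minimal sketch of that loop follows, assuming (illustratively, not from the patent) that guesses start from a naive letter-to-phone map, that pronunciations align one phone per letter, and that each candidate rule is conditioned on the following letter:

```python
# Minimal transformation-based learning sketch for text-to-pronunciation.
def initial_guess(word, letter_to_phone):
    """Naive baseline: map each letter to its default phone."""
    return [letter_to_phone.get(ch, ch) for ch in word]

def apply_rule(phones, word, rule):
    """Rule = (letter, next_letter, old, new): rewrite the phone guessed
    for `letter` to `new` when that letter is followed by `next_letter`."""
    letter, nxt, old, new = rule
    out = list(phones)
    for k, (ch, ph) in enumerate(zip(word, phones)):
        follows = word[k + 1] if k + 1 < len(word) else ""
        if ch == letter and follows == nxt and ph == old:
            out[k] = new
    return out

def errors(guess, truth):
    return sum(g != t for g, t in zip(guess, truth))

def learn_rules(training_set, letter_to_phone, max_rules=10):
    """Greedy loop: propose rules from current mistakes, keep the rule
    that fixes the most errors over the whole set, apply it, repeat."""
    guesses = {w: initial_guess(w, letter_to_phone) for w in training_set}
    rules = []
    for _ in range(max_rules):
        candidates = set()
        for word, truth in training_set.items():
            for k, (ch, g, t) in enumerate(zip(word, guesses[word], truth)):
                if g != t:
                    follows = word[k + 1] if k + 1 < len(word) else ""
                    candidates.add((ch, follows, g, t))
        best, gain = None, 0
        for rule in candidates:
            delta = sum(errors(guesses[w], t) -
                        errors(apply_rule(guesses[w], w, rule), t)
                        for w, t in training_set.items())
            if delta > gain:
                best, gain = rule, delta
        if best is None:
            break  # no remaining rule improves the guesses
        rules.append(best)
        guesses = {w: apply_rule(guesses[w], w, best) for w in training_set}
    return rules

data = {"cat": ["k", "ae", "t"], "city": ["s", "ih", "t", "iy"]}
base = {"c": "k", "a": "ae", "t": "t", "i": "ih", "y": "iy"}
print(learn_rules(data, base))  # [('c', 'i', 'k', 's')]
```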
IEEE MultiMedia | 1996
Charles T. Hemphill; Philip R. Thrift; John Linn
Computer users have long desired a personal software agent that could execute verbal commands. Today's World Wide Web (WWW or Web), with its point-and-click hypertext interface, makes a tremendous amount of information readily available online. A speech interface would make the Web even more powerful, allowing us to access information by surfing the Web by voice. Texas Instruments (TI) developed Speech Aware Multimedia (SAM) with this in mind, to make information on the Web more accessible and useful. The authors combined an innovative speech recognition engine with the Web to let anyone browse arbitrary Web pages using only speech as the input medium. Speech brings added flexibility and power to the classical Web interface and makes information access more natural. Today's speech recognition capability is well matched to Web browsing. The Web page provides a natural, well-defined context for a speech recognition application. The recognition engine does not need to recognize any and all possible phrases, but only those phrases pertaining to the specific page in view at the moment. This context imposes limits that significantly aid recognition performance. Furthermore, the visual information on a page prompts the user on what to request and how to request it by voice.
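The key idea is that the page in view bounds what the recognizer must accept. A minimal sketch of that idea, assuming the per-page vocabulary is simply the set of phrases harvested from a page's hyperlinks (the HTML handling and grammar format here are illustrative, not SAM's actual implementation, which also covered navigation commands):

```python
# Build a per-page recognition vocabulary from hyperlink anchor text.
from html.parser import HTMLParser

class LinkPhraseExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_anchor = False
        self.current = []
        self.phrases = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True
            self.current = []

    def handle_endtag(self, tag):
        if tag == "a" and self.in_anchor:
            phrase = " ".join("".join(self.current).split())
            if phrase:
                self.phrases.append(phrase.lower())
            self.in_anchor = False

    def handle_data(self, data):
        if self.in_anchor:
            self.current.append(data)

def page_grammar(html):
    """Return the phrases a recognizer should accept on this page."""
    extractor = LinkPhraseExtractor()
    extractor.feed(html)
    return extractor.phrases

html = '<p>See <a href="/atis">the ATIS corpus</a> or <a href="/sam">SAM</a>.</p>'
print(page_grammar(html))  # ['the atis corpus', 'sam']
```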
international conference on acoustics, speech, and signal processing | 1992
Barbara Wheatley; George R. Doddington; Charles T. Hemphill; John J. Godfrey; Edward Holliman; Jane McDaniel; Drew Fisher
A method for automatic time alignment of orthographically transcribed speech using supervised speaker-independent automatic speech recognition based on the orthographic transcription, an online dictionary, and HMM phone models is presented. This method successfully aligns transcriptions with speech in unconstrained 5 to 10 minute conversations collected over long-distance telephone lines. It requires minimal manual processing and generally produces correct alignments despite the challenging nature of the data. The robustness and efficiency of the method make it a practical tool for very large speech corpora.
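Supervised alignment of this kind amounts to a forced Viterbi pass: the transcription (expanded to phones via the dictionary) fixes the phone sequence, and dynamic programming finds the best frame boundaries. A toy sketch below, assuming precomputed per-frame phone log-likelihoods as a stand-in for the paper's HMM phone models:

```python
# Forced alignment sketch: given a fixed phone sequence and per-frame
# acoustic scores, find the best frame-to-phone segmentation.
import math

def forced_align(phones, frame_scores):
    """phones: list of phone labels in transcription order.
    frame_scores: frame_scores[t][p] = log-likelihood of phone p at frame t.
    Returns the best phone label per frame (monotone, non-skipping)."""
    T, N = len(frame_scores), len(phones)
    NEG = -math.inf
    # best[t][i] = best log score of frames 0..t ending in phone i
    best = [[NEG] * N for _ in range(T)]
    back = [[0] * N for _ in range(T)]
    best[0][0] = frame_scores[0][phones[0]]
    for t in range(1, T):
        for i in range(N):
            stay = best[t - 1][i]
            advance = best[t - 1][i - 1] if i > 0 else NEG
            prev = max(stay, advance)
            if prev > NEG:
                best[t][i] = prev + frame_scores[t][phones[i]]
                back[t][i] = i if stay >= advance else i - 1
    # Trace back from the final phone at the final frame.
    path = [N - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    return [phones[i] for i in reversed(path)]

# Toy example: three frames, two phones; scores favor "h" then "ay".
scores = [{"h": -1.0, "ay": -5.0},
          {"h": -4.0, "ay": -1.5},
          {"h": -6.0, "ay": -0.5}]
print(forced_align(["h", "ay"], scores))  # ['h', 'ay', 'ay']
```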
Journal of the Acoustical Society of America | 2000
Charles T. Hemphill; Lorin Netsch; Christopher M. Kribs
A speech recognition method for modeling adjacent word context, comprising: dividing a first word or period of silence into two portions; dividing a second word or period of silence, adjacent to the first, into two portions; and combining the last portion of the first word or period of silence with the first portion of the second word or period of silence to form an acoustic model. The method includes constructing a grammar that restricts the acoustic models to this middle-to-middle context.
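In this scheme each acoustic unit spans from the middle of one word to the middle of the next, so cross-word coarticulation falls inside a model rather than at a model boundary. A small sketch of the unit inventory construction, assuming (purely for illustration) that words split at their phone-sequence midpoint; the patent's splitting is not specified at the phone level, and real systems would divide within HMM states or frames:

```python
# Build middle-to-middle acoustic units from a word sequence.
# "sil" stands in for periods of silence; a one-phone entry puts its
# phone entirely in its second half under this toy midpoint rule.
def split_word(phones):
    mid = len(phones) // 2
    return phones[:mid], phones[mid:]

def middle_to_middle_units(word_phones):
    """word_phones: list of (word, [phones]) including silences.
    Returns one unit per adjacent pair, spanning midpoint to midpoint."""
    units = []
    for (w1, p1), (w2, p2) in zip(word_phones, word_phones[1:]):
        _, last_half = split_word(p1)
        first_half, _ = split_word(p2)
        units.append((f"{w1}+{w2}", last_half + first_half))
    return units

utterance = [("sil", ["sil"]),
             ("call", ["k", "ao", "l"]),
             ("home", ["hh", "ow", "m"]),
             ("sil", ["sil"])]
for name, phones in middle_to_middle_units(utterance):
    print(name, phones)
# sil+call ['sil', 'k']
# call+home ['ao', 'l', 'hh']
# home+sil ['ow', 'm']
```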
international conference on acoustics, speech, and signal processing | 1997
Kazuhiro Kondo; Charles T. Hemphill
Previously, we developed Speech-Aware Multimedia (SAM), which controls a WWW browser using English speech. We have extended its capability to use Japanese speech to browse Japanese pages, and developed a prototype using speaker-independent, continuous speech recognition with Japanese context-dependent phonetic models. Some challenges not seen in English include: segmentation of Japanese text into word units for optional silence insertion, Japanese text-to-phone conversion, and accommodation of English link names embedded in Japanese pages. To accomplish the first two, we modified a public-domain dictionary look-up tool to perform segmentation and to incorporate the heuristics required for improved text-to-phone conversion accuracy. Preliminary tests show that the conversion result contains the correct phone sequence over 97% of the time, and the prototype correctly understands the input speech 91.5% of the time.
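Japanese is written without spaces, so word units must be recovered before optional silences can be inserted between them. A minimal sketch of dictionary-based segmentation using greedy longest match; the tiny lexicon is illustrative, and the actual prototype used a modified public-domain look-up tool with additional heuristics:

```python
# Greedy longest-match segmentation against a pronunciation dictionary.
def segment(text, dictionary):
    """Split text into the longest dictionary entries, left to right.
    Unknown characters fall through as single-character tokens."""
    words, i = [], 0
    max_len = max(map(len, dictionary))
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary or length == 1:
                words.append(candidate)
                i += length
                break
    return words

# Toy lexicon mapping surface forms to pronunciations.
lexicon = {"音声": "o N s e i", "認識": "n i N sh i k i", "で": "d e",
           "ページ": "p e: j i", "を": "o", "開く": "h i r a k u"}
print(segment("音声認識でページを開く", lexicon))
# ['音声', '認識', 'で', 'ページ', 'を', '開く']
```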
international conference on acoustics, speech, and signal processing | 1994
Yu-Hung Kao; Charles T. Hemphill; Barbara Wheatley; Periagaram K. Rajasekaran
Vocabulary independence of speech recognition systems has become an important issue because of the need for flexible vocabularies and the high cost of speech corpus collection. We outline the necessary steps to achieve the goal of vocabulary-independent speech recognition, and relate our experimental experience with telephone speech recognition. Two sets of experiments were conducted: (1) 34-command recognition, in which we compared vocabulary-independent (VI) and vocabulary-dependent (VD) systems as well as phonetic and word-based systems, and (2) 42-city-name recognition, in which our vocabulary-independent recognition performance (8.5% word error) was much better than the VI performance (18%) reported by the Oregon Graduate Institute (OGI) and very close to OGI's VD performance (8%). We conclude that we have made some strides toward vocabulary independence, but much remains to be done; we identify the areas of improvement that are likely to lead to the goal.
international conference on acoustics, speech, and signal processing | 1989
Charles T. Hemphill; Joseph Picone
The authors describe a stochastic unification grammar system that is a generalization of the conventional hidden Markov model (HMM) approach. Unification grammars concisely model context, providing a more powerful characterization of the acoustic data than the first-order Markov process. It is shown that this approach generalizes traditional FSA (finite-state automaton)-based HMM systems and that a stochastic chart parsing algorithm produces exactly the same solutions as an existing FSA-based system. The shift from automata to grammars allows efficient processing of complex language models by hypothesizing symbols once per frame, no matter how many times they are needed. As an added benefit, the chart parsing algorithm allows parallel processing of lower-level hypotheses autonomously with no fundamental algorithm changes.
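The claim that grammars generalize FSA-based HMMs can be made concrete: any probabilistic FSA unrolls into a right-linear stochastic grammar, which a chart parser then handles as a special case. A toy sketch of that conversion; the FSA encoding and rule format are illustrative, not the paper's:

```python
# Convert a probabilistic FSA (HMM topology) into right-linear
# stochastic grammar rules: one nonterminal per state, one rule per
# transition. A parser for general rules then subsumes the FSA.
def fsa_to_grammar(transitions, finals):
    """transitions: list of (state, symbol, next_state, prob).
    finals: dict mapping final states to their exit probability.
    Returns rules as (lhs, rhs, prob) with rhs = [terminal, nonterminal]."""
    rules = []
    for state, symbol, nxt, prob in transitions:
        rules.append((f"S{state}", [symbol, f"S{nxt}"], prob))
    for state, prob in finals.items():
        rules.append((f"S{state}", [], prob))  # epsilon rule: stop here
    return rules

# Toy two-state topology: loop on state 0, advance to state 1.
fsa = [(0, "a", 0, 0.3), (0, "a", 1, 0.7), (1, "b", 1, 0.4)]
for rule in fsa_to_grammar(fsa, {1: 0.6}):
    print(rule)
# ('S0', ['a', 'S0'], 0.3)
# ('S0', ['a', 'S1'], 0.7)
# ('S1', ['b', 'S1'], 0.4)
# ('S1', [], 0.6)
```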
human language technology | 1989
Charles T. Hemphill; Joseph Picone
Performance in speech recognition systems has progressed to the point where it is now realistic to begin integrating speech with natural language systems to produce spoken language systems. Two factors have contributed to the advances in speech: statistical modeling of the input signal and language constraints. To produce spoken language systems, then, the grammar formalisms used in natural language systems must incorporate statistical information and efficient parsers for these stochastic language models must be developed. In this paper we outline how chart parsing techniques provide advantages in both computation and accuracy for spoken language systems. We describe a system that models all levels of the spoken language system using stochastic language models and present experimental results.
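A minimal illustration of stochastic chart parsing: a Viterbi-style CYK pass that keeps, for each span and nonterminal, the single best-scoring hypothesis, so each constituent is built once and shared by every larger parse that reuses it. The grammar and probabilities are toy values, not from the paper:

```python
# Viterbi CYK over a stochastic CNF grammar: chart[i][j][A] holds the
# best log-probability of nonterminal A spanning words i..j. Filling
# each cell once and sharing it is the computational advantage that
# chart parsing brings to stochastic language models.
import math

def viterbi_cyk(words, lexical, binary, start="S"):
    """lexical: {(A, word): prob}; binary: {(A, B, C): prob}."""
    n = len(words)
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                chart[i][i + 1][A] = math.log(p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    if B in chart[i][k] and C in chart[k][j]:
                        score = math.log(p) + chart[i][k][B] + chart[k][j][C]
                        if score > chart[i][j].get(A, -math.inf):
                            chart[i][j][A] = score
    return chart[0][n].get(start)

lexical = {("V", "show"): 0.5, ("NP", "flights"): 0.4}
binary = {("S", "V", "NP"): 0.9}
print(viterbi_cyk(["show", "flights"], lexical, binary))
# log(0.9) + log(0.5) + log(0.4)
```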
Archive | 1992
Barbara Wheatley; Charles T. Hemphill; Thomas D. Fisher; George R. Doddington