Network


Latest external collaborations at the country level.

Hotspot


Research topics in which Wayne H. Ward is active.

Publication


Featured research published by Wayne H. Ward.


human language technology | 1994

Recent improvements in the CMU spoken language understanding system

Wayne H. Ward; Sunil Issar

We have been developing a spoken language system to recognize and understand spontaneous speech. It is difficult for such systems to achieve good coverage of the lexicon and grammar that subjects might use because spontaneous speech often contains disfluencies and ungrammatical constructions. Our goal is to respond appropriately to input, even though coverage is not complete. The natural language component of our system is oriented toward the extraction of information relevant to a task, and seeks to directly optimize the correctness of the extracted information (and therefore the system response). We use a flexible frame-based parser, which parses as much of the input as possible. This approach leads both to high accuracy and robustness. We have implemented a version of this system for the Air Travel Information Service (ATIS) task, which is being used by several ARPA-funded sites to develop and evaluate speech understanding systems. Users are asked to perform a task that requires getting information from an Air Travel database. In this paper, we describe recent improvements in our system resulting from our efforts to improve the coverage given a limited amount of training data. These improvements address a number of problems including generating an adequate lexicon and grammar for the recognizer, generating and generalizing an appropriate grammar for the parser, and dealing with ambiguous parses.
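
The flexible frame-based parsing idea described above can be sketched in a few lines of Python. In this minimal, illustrative version, slot patterns are matched wherever they occur in the input, and anything that matches no pattern (disfluencies, out-of-grammar fragments) is simply skipped. The slot names, patterns, and tiny city lexicon are invented for this sketch and are not taken from the actual CMU grammar.

import re

# Illustrative frame slots for an ATIS-style query. The patterns and the
# small city lexicon are placeholders, not the real CMU system's grammar.
SLOT_PATTERNS = {
    "depart_city": r"from (pittsburgh|boston|denver)",
    "arrive_city": r"to (pittsburgh|boston|denver)",
    "depart_time": r"(morning|afternoon|evening)",
}

def parse_frame(utterance: str) -> dict:
    """Fill as many slots as possible; words matching no pattern are ignored.

    This mirrors the robustness idea above: disfluencies and ungrammatical
    fragments are skipped rather than causing a parse failure.
    """
    slots = {}
    for slot, pattern in SLOT_PATTERNS.items():
        match = re.search(pattern, utterance)
        if match:
            slots[slot] = match.group(1)
    return slots

# A disfluent, ungrammatical query still yields the task-relevant information:
print(parse_frame("uh i want to go um from pittsburgh to boston in the morning"))
# {'depart_city': 'pittsburgh', 'arrive_city': 'boston', 'depart_time': 'morning'}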


human language technology | 1990

The CMU air travel information service: understanding spontaneous speech

Wayne H. Ward

Understanding spontaneous speech presents several problems not found in processing read speech input. Spontaneous speech is often not fluent. It contains stutters, filled pauses, restarts, repeats, interjections, etc. Casual users do not know the lexicon and grammar used by the system. It is therefore very difficult for a speech understanding system to achieve good coverage of the lexicon and grammar that subjects might use.


international conference on acoustics, speech, and signal processing | 1991

Understanding spontaneous speech: the Phoenix system

Wayne H. Ward

The author describes the design of the Phoenix speech understanding system and reports on its current status. Phoenix is a system being developed at Carnegie Mellon University to understand spontaneous speech. The system has been implemented for an air travel information service (ATIS) task. In the ATIS task, novice users are asked to perform a task that requires getting information from the air travel database. Users compose the questions themselves and are allowed to phrase the queries any way they choose. No explicit grammar or lexicon is given to the subject. This task presents several problems not found in read speech input. Not only is the speech not fluent, but the vocabulary and grammar are open. Results are reported for processing both the speech and transcript data of subjects performing the task.


Communications of the ACM | 1989

High level knowledge sources in usable speech recognition systems

Sheryl R. Young; Alexander G. Hauptmann; Wayne H. Ward; Edward T. Smith; Philip Werner

The authors detail an integrated system which combines natural language processing with speech understanding in the context of a problem solving dialogue. The MINDS system uses a variety of pragmatic knowledge sources to dynamically generate expectations of what a user is likely to say.
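
The expectation mechanism can be illustrated with a toy sketch: at each dialogue state, predicted concepts restrict the vocabulary the recognizer searches. The state names and word sets below are hypothetical placeholders, not the actual MINDS knowledge sources.

# Hypothetical dialogue states mapped to expected vocabularies. In MINDS the
# expectations were generated dynamically from pragmatic knowledge sources;
# this static table only stands in for that machinery.
EXPECTATIONS = {
    "ask_ship_name": {"what", "is", "the", "name", "of", "that", "ship"},
    "ask_location": {"where", "is", "it", "located", "port", "harbor"},
}

FULL_VOCABULARY = set.union(*EXPECTATIONS.values()) | {"weather", "speed", "course"}

def active_vocabulary(dialogue_state: str) -> set:
    """Return the reduced word set the recognizer should search in this state.

    Shrinking the search space with expectations is the point: fewer
    competing word hypotheses means fewer recognition errors.
    """
    return EXPECTATIONS.get(dialogue_state, FULL_VOCABULARY)

print(len(FULL_VOCABULARY), "->", len(active_vocabulary("ask_ship_name")))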


international conference on acoustics, speech, and signal processing | 1996

A class based language model for speech recognition

Wayne H. Ward; Sunil Issar

Class-based language models are often used when there is insufficient data to generate a word-based language model directly from the training data. In this approach, similar items are clustered into classes, an n-gram language model is generated for the class tokens, and the probabilities for words in a class are distributed according to the smoothed relative unigram frequencies of the words. Classes expand only to lists of single-word tokens; that is, a class cannot represent a sequence of lexical tokens. We propose a more general mechanism for defining a language model class in which classes expand to word sequences through finite-state networks. This allows expansion to word sequences without requiring compound words in the lexicon. Where finite-state models are too brittle to represent sentence-level strings, they can still represent class-level strings (dates, names, and titles, for example). We compared the perplexity of the two models on the ARPA Dec93 ATIS test set and found that the new model reduced perplexity by approximately 17 percent (relative).
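
The baseline decomposition the abstract starts from can be written as P(w_i | w_{i-1}) ≈ P(c(w_i) | c(w_{i-1})) · P(w_i | c(w_i)). Below is a toy Python sketch of that baseline; the paper's finite-state extension to multi-word class expansions is not shown, and all counts here are invented for illustration.

from collections import Counter

# Toy class-based bigram model: a word's probability is the class-transition
# probability times the word's share of its class mass.
WORD_CLASS = {"boston": "CITY", "denver": "CITY", "to": "to", "from": "from"}

class_bigrams = Counter({("from", "CITY"): 40, ("to", "CITY"): 60})
class_counts = Counter({"from": 50, "to": 70, "CITY": 100})
word_counts = Counter({"boston": 55, "denver": 45})  # members of CITY

def bigram_prob(word: str, prev_word: str) -> float:
    c, c_prev = WORD_CLASS[word], WORD_CLASS[prev_word]
    p_class = class_bigrams[(c_prev, c)] / class_counts[c_prev]
    # Distribute the class mass over member words by relative unigram
    # frequency (the paper smooths these frequencies; this sketch does not).
    p_word = word_counts[word] / class_counts[c] if c != word else 1.0
    return p_class * p_word

print(bigram_prob("boston", "to"))  # (60/70) * (55/100) ≈ 0.471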


international conference on acoustics, speech, and signal processing | 2000

Confidence measures for dialogue management in the CU Communicator system

R. San-Segundo; Bryan L. Pellom; Wayne H. Ward; José Manuel Pardo

This paper provides improved confidence assessment for detecting word-level speech recognition errors and out-of-domain user requests using language model features. We consider a combined measure of confidence that utilizes the language model back-off sequence, language model score, and phonetic length of recognized words as indicators of speech recognition confidence. The paper investigates the ability of each feature to detect speech recognition errors and out-of-domain utterances, as well as two methods for combining the features contextually: a multi-layer perceptron and a statistical decision tree. We illustrate the effectiveness of the algorithm by considering utterances from the ATIS airline information task as either in-domain or out-of-domain for the DARPA Communicator task. Using this hand-labeled data, it is shown that 27.9% of incorrectly recognized words and 36.4% of out-of-domain phrases are detected at a 2.5% false alarm rate.
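
As a rough illustration of the feature-combination step, the sketch below trains a small multi-layer perceptron on word-level features of the kind the paper describes (LM back-off level, LM score, phonetic length). All feature values and labels are synthetic, and scikit-learn merely stands in for whatever classifier implementation the authors used.

import numpy as np
from sklearn.neural_network import MLPClassifier

# Each row: [LM back-off level used for the word, LM log-probability,
# number of phones]. Values and labels are synthetic, not Communicator data.
X = np.array([
    [0, -1.2, 6], [0, -0.8, 5], [1, -3.5, 2], [2, -4.1, 1],
    [0, -1.0, 7], [2, -3.9, 2], [1, -2.8, 3], [0, -0.9, 4],
])
y = np.array([1, 1, 0, 0, 1, 0, 0, 1])  # 1 = word was correctly recognized

# A small MLP combines the features, one of the two combination methods the
# paper investigates (the other is a statistical decision tree).
clf = MLPClassifier(hidden_layer_sizes=(4,), max_iter=2000, random_state=0)
clf.fit(X, y)

# A heavily backed-off, low-scoring, short word should get low confidence.
print(clf.predict_proba(np.array([[2, -4.0, 1]]))[:, 1])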


north american chapter of the association for computational linguistics | 2007

Towards Robust Semantic Role Labeling

Sameer Pradhan; Wayne H. Ward; James H. Martin

Most semantic role labeling (SRL) research has focused on training and evaluating on the same corpus. This strategy, although appropriate for initiating research, can lead to overtraining to the particular corpus. This article describes the operation of ASSERT, a state-of-the-art SRL system, and analyzes the robustness of the system when trained on one genre of data and used to label a different genre. As a starting point, results are first presented for training and testing the system on the PropBank corpus, which is annotated Wall Street Journal (WSJ) data. Experiments are then presented to evaluate the portability of the system to another source of data, based on comparisons of performance using PropBanked WSJ data and PropBanked Brown Corpus data. The results indicate that whereas syntactic parses and argument identification transfer relatively well to a new corpus, argument classification does not. An analysis of the reasons for this is presented; it generally points to more lexical/semantic features dominating the classification task, whereas more general structural features dominate the argument identification task.


Proceedings of the IEEE | 2003

Perceptive animated interfaces: first steps toward a new paradigm for human-computer interaction

Ron Cole; S. Van Vuuren; Bryan L. Pellom; K. Hacioglu; Jiyong Ma; Javier R. Movellan; S. Schwartz; D. Wade-Stein; Wayne H. Ward; Jie Yan

This paper presents a vision of the near future in which computer interaction is characterized by natural face-to-face conversations with lifelike characters that speak, emote, and gesture. These animated agents will converse with people much like people converse effectively with assistants in a variety of focused applications. Despite the research advances required to realize this vision, and the lack of strong experimental evidence that animated agents improve human-computer interaction, we argue that initial prototypes of perceptive animated interfaces can be developed today, and that the resulting systems will provide more effective and engaging communication experiences than existing systems. In support of this hypothesis, we first describe initial experiments using an animated character to teach speech and language skills to children with hearing problems, and classroom subjects and social skills to children with autistic spectrum disorder. We then show how existing dialogue system architectures can be transformed into perceptive animated interfaces by integrating computer vision and animation capabilities. We conclude by describing the Colorado Literacy Tutor, a computer-based literacy program that provides an ideal testbed for research and development of perceptive animated interfaces, and consider next steps required to realize the vision.


international conference on acoustics, speech, and signal processing | 2001

What kind of pronunciation variation is hard for triphones to model?

Daniel Jurafsky; Wayne H. Ward; Zhang Banping; K. Herold; Yu Xiuyang; Zhang Sen

In order to help understand why gains in pronunciation modeling have proven so elusive, we investigate which kinds of pronunciation variation are well captured by triphone models, and which are not. We do this by examining the change in behavior of a recognizer as it receives further triphone training. We show that many of the kinds of variation which previous pronunciation models attempt to capture, including phone substitution and phone reduction, are in fact already well captured by triphones. Our analysis suggests new areas where future pronunciation models should focus, including syllable deletion.


meeting of the association for computational linguistics | 2005

Semantic Role Labeling Using Different Syntactic Views

Sameer S. Pradhan; Wayne H. Ward; Kadri Hacioglu; James H. Martin; Daniel Jurafsky

Semantic role labeling is the process of annotating the predicate-argument structure in text with semantic labels. In this paper we present a state-of-the-art baseline semantic role labeling system based on Support Vector Machine classifiers. We show improvements on this system by: i) adding new features including features extracted from dependency parses, ii) performing feature selection and calibration and iii) combining parses obtained from semantic parsers trained using different syntactic views. Error analysis of the baseline system showed that approximately half of the argument identification errors resulted from parse errors in which there was no syntactic constituent that aligned with the correct argument. In order to address this problem, we combined semantic parses from a Minipar syntactic parse and from a chunked syntactic representation with our original baseline system which was based on Charniak parses. All of the reported techniques resulted in performance improvements.
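
A minimal sketch of the baseline argument-classification step described above: one SVM over sparse symbolic features such as predicate, phrase type, and parse-tree path. The feature values and labels below are invented, and the paper's combination of different syntactic views is not shown.

from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC

# Toy training examples: each candidate constituent is described by a few of
# the feature types commonly used in SRL. Values and labels are invented.
train = [
    ({"pred": "sell", "phrase": "NP", "path": "NP^S^VP^VBD"}, "ARG0"),
    ({"pred": "sell", "phrase": "NP", "path": "NP^VP^VBD"}, "ARG1"),
    ({"pred": "buy", "phrase": "NP", "path": "NP^S^VP^VBD"}, "ARG0"),
    ({"pred": "buy", "phrase": "PP", "path": "PP^VP^VBD"}, "ARGM-LOC"),
]

vectorizer = DictVectorizer()
X = vectorizer.fit_transform([features for features, _ in train])
y = [label for _, label in train]

# A linear SVM stands in for the paper's SVM-based baseline classifier.
classifier = SVC(kernel="linear").fit(X, y)

test = vectorizer.transform([{"pred": "sell", "phrase": "PP", "path": "PP^VP^VBD"}])
print(classifier.predict(test))  # expected: ['ARGM-LOC'] given these toy features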

Collaboration


Dive into Wayne H. Ward's collaborations.

Top Co-Authors

James H. Martin (University of Colorado Boulder)
Kadri Hacioglu (University of Colorado Boulder)
Ronald A. Cole (University of Colorado Boulder)
Bryan L. Pellom (University of Colorado Boulder)
Daniel Bolaños (University of Colorado Boulder)
Martha Palmer (University of Colorado Boulder)
Sameer S. Pradhan (University of Colorado Boulder)
Sheryl R. Young (Carnegie Mellon University)