Publications


Featured research published by Amanda Stent.


AI Magazine | 2001

Toward Conversational Human-Computer Interaction

James F. Allen; Donna K. Byron; Myroslava O. Dzikovska; George Ferguson; Lucian Galescu; Amanda Stent

The belief that humans will be able to interact with computers in conversational speech has long been a favorite subject in science fiction, reflecting the persistent belief that spoken dialogue would be the most natural and powerful user interface to computers. With recent improvements in computer technology and in speech and language processing, such systems are starting to appear feasible. There are significant technical problems that still need to be solved before speech-driven interfaces become truly conversational. This article describes the results of a 10-year effort building robust spoken dialogue systems at the University of Rochester.


Intelligent User Interfaces | 2001

An architecture for more realistic conversational systems

James F. Allen; George Ferguson; Amanda Stent

In this paper, we describe an architecture for conversational systems that enables human-like performance along several important dimensions. First, interpretation is incremental, multi-level, and involves both general and task- and domain-specific knowledge. Second, generation is also incremental, proceeds in parallel with interpretation, and accounts for phenomena such as turn-taking, grounding and interruptions. Finally, the overall behavior of the system in the task at hand is determined by the (incremental) results of interpretation, the persistent goals and obligations of the system, and exogenous events of which it becomes aware. As a practical matter, the architecture supports a separation of responsibilities that enhances portability to new tasks and domains.


Meeting of the Association for Computational Linguistics | 2002

MATCH: An Architecture for Multimodal Dialogue Systems

Michael Johnston; Srinivas Bangalore; Gunaranjan Vasireddy; Amanda Stent; Patrick Ehlen; Marilyn A. Walker; Steve Whittaker; Preetam Maloor

Mobile interfaces need to allow the user and system to adapt their choice of communication modes according to user preferences, the task at hand, and the physical and social environment. We describe a multimodal application architecture which combines finite-state multimodal language processing, a speech-act based multimodal dialogue manager, dynamic multimodal output generation, and user-tailored text planning to enable rapid prototyping of multimodal interfaces with flexible input and adaptive output. Our testbed application MATCH (Multimodal Access To City Help) provides a mobile multimodal speech-pen interface to restaurant and subway information for New York City.


Natural Language Engineering | 2000

An architecture for a generic dialogue shell

James F. Allen; Donna K. Byron; Myroslava O. Dzikovska; George Ferguson; Lucian Galescu; Amanda Stent

This paper describes our work on dialogue systems that can mimic human conversation, with the goal of providing intuitive access to a wide range of applications by expanding the user's options in the interaction. We concentrate on practical dialogue: dialogues in which the participants need to accomplish some objective or perform some task. Two hypotheses regarding practical dialogue motivate our research. First, that the conversational competence required for practical dialogues, while still complex, is significantly simpler to achieve than general human conversational competence. And second, that within the genre of practical dialogue, the bulk of the complexity in the language interpretation and dialogue management is independent of the task being performed. If these hypotheses are true, then it should be possible to build a generic dialogue shell for practical dialogue, by which we mean the full range of components required in a dialogue system, including speech recognition, language processing, dialogue management and response planning, built in such a way as to be readily adapted to new applications by specifying the domain and task models. This paper documents our progress and what we have learned so far based on building and adapting systems in a series of different problem solving domains.


Meeting of the Association for Computational Linguistics | 2004

Trainable Sentence Planning for Complex Information Presentations in Spoken Dialog Systems

Amanda Stent; Rashmi Prasad; Marilyn A. Walker

A challenging problem for spoken dialog systems is the design of utterance generation modules that are fast, flexible and general, yet produce high quality output in particular domains. A promising approach is trainable generation, which uses general-purpose linguistic knowledge automatically adapted to the application domain. This paper presents a trainable sentence planner for the MATCH dialog system. We show that trainable sentence planning can produce output comparable to that of MATCH's template-based generator even for quite complex information presentations.
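The core of trainable sentence planning is to generate several candidate packagings of the same content and let a trained scorer pick one. A minimal sketch of that generate-and-rank loop follows; the feature names and weights are invented for illustration and are not the paper's actual model.

```python
# Hypothetical sketch of "rank candidate sentence plans": generate
# several plans, score each with a trained model, keep the best.
# Features and weights here are illustrative assumptions.

def extract_features(plan):
    """Toy features: number of sentences and total clause count."""
    return {"n_sentences": len(plan), "n_clauses": sum(len(s) for s in plan)}

def score(features, weights):
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def best_plan(candidate_plans, weights):
    return max(candidate_plans, key=lambda p: score(extract_features(p), weights))

# Two candidate ways of packaging the same content into sentences.
candidates = [
    [["Chanpen Thai has good food"], ["Chanpen Thai has good service"]],  # two sentences
    [["Chanpen Thai has good food", "and good service"]],                 # one aggregated sentence
]
weights = {"n_sentences": -1.0, "n_clauses": 0.5}  # prefer fewer, denser sentences
print(best_plan(candidates, weights))
```

In the real system the weights would be learned from human ratings of plan realizations rather than set by hand.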


International World Wide Web Conferences | 2004

Hearsay: enabling audio browsing on hypertext content

I. V. Ramakrishnan; Amanda Stent; Guizhen Yang

In this paper we present HearSay, a system for browsing hypertext Web documents via audio. The HearSay system is based on our novel approach to automatically creating audio browsable content from hypertext Web documents. It combines two key technologies: (1) automatic partitioning of Web documents through tightly coupled structural and semantic analysis, which transforms raw HTML documents into semantic structures so as to facilitate audio browsing; and (2) VoiceXML, an already standardized technology which we adopt to represent voice dialogs automatically created from the XML output of partitioning. This paper describes the software components of HearSay and presents an initial system evaluation.
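The abstract's first key technology is partitioning a raw HTML document into sections that can each be rendered as one audio unit. A toy version of that idea, splitting a page at heading tags with the standard-library parser, is sketched below; it is an illustration only, not HearSay's actual structural and semantic analysis.

```python
# Minimal sketch: split an HTML page into (heading, text) sections at
# <h1>-<h3> tags, so each section could become one audio-browsable unit.
from html.parser import HTMLParser

class SectionSplitter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.sections = []   # list of (heading, body text) pairs
        self._heading = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            if self._heading is not None:
                self.sections.append((self._heading, " ".join(self._buf).strip()))
            self._buf = []

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3"):
            self._heading = " ".join(self._buf).strip()
            self._buf = []

    def handle_data(self, data):
        if data.strip():
            self._buf.append(data.strip())

    def close(self):
        super().close()
        if self._heading is not None:
            self.sections.append((self._heading, " ".join(self._buf).strip()))

html = "<h2>Menu</h2><p>Thai dishes.</p><h2>Hours</h2><p>Open daily.</p>"
splitter = SectionSplitter()
splitter.feed(html)
splitter.close()
print(splitter.sections)  # [('Menu', 'Thai dishes.'), ('Hours', 'Open daily.')]
```

Each recovered section could then be emitted as one dialog unit in the voice representation, which is the role VoiceXML plays in the full system.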


Computer Vision and Pattern Recognition | 2015

TVSum: Summarizing web videos using titles

Yale Song; Jordi Vallmitjana; Amanda Stent; Alejandro Jaimes

Video summarization is a challenging problem in part because knowing which part of a video is important requires prior knowledge about its main topic. We present TVSum, an unsupervised video summarization framework that uses title-based image search results to find visually important shots. We observe that a video title is often carefully chosen to be maximally descriptive of its main topic, and hence images related to the title can serve as a proxy for important visual concepts of the main topic. However, because titles are free-formed, unconstrained, and often written ambiguously, images searched using the title can contain noise (images irrelevant to video content) and variance (images of different topics). To deal with this challenge, we developed a novel co-archetypal analysis technique that learns canonical visual concepts shared between video and images, but not in either alone, by finding a joint-factorial representation of two data sets. We introduce a new benchmark dataset, TVSum50, that contains 50 videos and their shot-level importance scores annotated via crowdsourcing. Experimental results on two datasets, SumMe and TVSum50, suggest our approach produces superior quality summaries compared to several recently proposed approaches.
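The high-level intuition, before the co-archetypal machinery, is that a shot is important if it resembles images retrieved with the video's title. The sketch below uses plain cosine similarity as a stand-in for that intuition; the feature vectors are made up, and this is not the paper's co-archetypal analysis.

```python
# Illustrative stand-in for TVSum's intuition: rate each video shot by
# its average similarity to title-search images. The real method learns
# shared canonical concepts (co-archetypal analysis); cosine similarity
# and these toy feature vectors are assumptions for illustration.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def shot_importance(shot_features, title_image_features):
    """Average similarity of one shot to all title-search images."""
    return sum(cosine(shot_features, img) for img in title_image_features) / len(title_image_features)

title_images = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.0]]   # features of images found via the title
shots = [[1.0, 0.1, 0.1], [0.0, 1.0, 0.9]]          # features of two video shots
scores = [shot_importance(s, title_images) for s in shots]
best = max(range(len(shots)), key=scores.__getitem__)  # shot 0 matches the title images
```

The paper's contribution is precisely that this naive matching breaks down under noisy, ambiguous title-search results, which motivates learning concepts shared between the two data sets instead.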


Journal of Artificial Intelligence Research | 2007

Individual and domain adaptation in sentence planning for dialogue

Marilyn A. Walker; Amanda Stent; François Mairesse; Rashmi Prasad

One of the biggest challenges in the development and deployment of spoken dialogue systems is the design of the spoken language generation module. This challenge arises from the need for the generator to adapt to many features of the dialogue domain, user population, and dialogue context. A promising approach is trainable generation, which uses general-purpose linguistic knowledge that is automatically adapted to the features of interest, such as the application domain, individual user, or user group. In this paper we present and evaluate a trainable sentence planner for providing restaurant information in the MATCH dialogue system. We show that trainable sentence planning can produce complex information presentations whose quality is comparable to the output of a template-based generator tuned to this domain. We also show that our method easily supports adapting the sentence planner to individuals, and that the individualized sentence planners generally perform better than models trained and tested on a population of individuals. Previous work has documented and utilized individual preferences for content selection, but to our knowledge, these results provide the first demonstration of individual preferences for sentence planning operations, affecting the content order, discourse structure and sentence structure of system responses. Finally, we evaluate the contribution of different feature sets, and show that, in our application, n-gram features often do as well as features based on higher-level linguistic representations.


North American Chapter of the Association for Computational Linguistics | 2009

Geo-Centric Language Models for Local Business Voice Search

Amanda Stent; Ilija Zeljkovic; Diamantino Caseiro; Jay G. Wilpon

Voice search is increasingly popular, especially for local business directory assistance. However, speech recognition accuracy on business listing names is still low, leading to user frustration. In this paper, we present a new algorithm for geo-centric language model generation for local business voice search for mobile users. Our algorithm has several advantages: it provides a language model for any user in any location; the geographic area covered by the language model is adapted to the local business density, giving high recognition accuracy; and the language models can be pre-compiled, giving fast recognition time. In an experiment using spoken business listing name queries from a business directory assistance service, we achieve a 16.8% absolute improvement in recognition accuracy and a 3-fold speedup in recognition time with geo-centric language models when compared with a nationwide language model.
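The density-adaptive idea can be sketched simply: grow a radius around the user until it covers enough listings, then build a small language model from only those listing names. The radius-doubling rule and toy unigram counts below are assumptions for illustration, not the paper's exact algorithm.

```python
# Sketch of a geo-centric language model, under assumed simplifications:
# expand the radius in sparse areas, then count words over nearby names.
import math
from collections import Counter

def distance(a, b):
    return math.dist(a, b)  # Euclidean stand-in for geographic distance

def local_listings(user_loc, listings, min_count=2, start_radius=1.0):
    """Expand the radius until at least min_count listings are covered."""
    radius = start_radius
    while True:
        nearby = [name for loc, name in listings if distance(user_loc, loc) <= radius]
        if len(nearby) >= min_count or radius > 1e6:
            return nearby
        radius *= 2  # sparse area: widen the geographic scope

def unigram_counts(names):
    """Toy unigram 'language model' over the selected listing names."""
    counts = Counter()
    for name in names:
        counts.update(name.lower().split())
    return counts

listings = [((0.0, 0.5), "Joe's Pizza"), ((0.2, 0.1), "Joe's Diner"), ((50.0, 50.0), "Far Cafe")]
nearby = local_listings((0.0, 0.0), listings)
print(unigram_counts(nearby))
```

Because the covered area depends on density rather than a fixed radius, an urban user gets a tight, precise model while a rural user's model widens until it has enough listings, which is the property the abstract credits for the accuracy gain.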


International Conference on Computational Linguistics | 2005

Evaluating evaluation methods for generation in the presence of variation

Amanda Stent; Matthew Marge; Mohit Singhai

Recent years have seen increasing interest in automatic metrics for the evaluation of generation systems. When a system can generate syntactic variation, automatic evaluation becomes more difficult. In this paper, we compare the performance of several automatic evaluation metrics using a corpus of automatically generated paraphrases. We show that these evaluation metrics can at least partially measure adequacy (similarity in meaning), but are not good measures of fluency (syntactic correctness). We make several proposals for improving the evaluation of generation systems that produce variation.
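The paper's core finding is easy to demonstrate with a toy metric: surface-overlap scores reward shared words regardless of order, so a fluent paraphrase with different wording scores low while a scrambled, ungrammatical string scores perfectly. The function below is a simplified unigram precision, not any specific published metric.

```python
# Toy demonstration: n-gram-overlap metrics can measure adequacy
# (word overlap) but not fluency (word order / grammaticality).
# unigram_precision is a deliberately simplified stand-in metric.

def unigram_precision(candidate, reference):
    cand = candidate.lower().split()
    ref = set(reference.lower().split())
    return sum(1 for tok in cand if tok in ref) / len(cand)

reference = "the restaurant serves excellent thai food"
paraphrase = "this place offers superb thai cuisine"     # adequate and fluent
scrambled = "restaurant the excellent serves food thai"  # same words, not fluent

print(unigram_precision(paraphrase, reference))  # low score despite being a good paraphrase
print(unigram_precision(scrambled, reference))   # perfect score despite being ungrammatical
```

This is exactly the failure mode the abstract describes when a generator produces legitimate syntactic variation.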

Collaboration

Top co-authors of Amanda Stent:

Martin Molina (Technical University of Madrid)

Lucian Galescu (Florida Institute for Human and Machine Cognition)