Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Antoine Raux is active.

Publication


Featured research published by Antoine Raux.


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2008

Optimizing Endpointing Thresholds using Dialogue Features in a Spoken Dialogue System

Antoine Raux; Maxine Eskenazi

This paper describes a novel algorithm that dynamically sets endpointing thresholds, based on a rich set of dialogue features, to detect the end of user utterances in a dialogue system. By analyzing the relationship between silences in users' speech to a spoken dialogue system and a wide range of automatically extracted features from discourse, semantics, prosody, timing, and speaker characteristics, we found that all features correlate with pause duration and with whether a silence indicates the end of the turn, with semantics and timing being the most informative. Based on these features, the proposed method reduces latency by up to 24% over a fixed-threshold baseline. Offline evaluation results were confirmed by implementing the proposed algorithm in the Let's Go system.
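As a rough illustration of the idea in this abstract, the sketch below maps per-silence dialogue features to an endpointing threshold. The feature names, linear form, and weights are assumptions made for illustration, not the paper's actual model.

```python
# Illustrative sketch (not the paper's actual model): choose a pause-specific
# endpointing threshold from dialogue features instead of using a fixed value.
# Feature names and the linear scoring form are assumptions.

def endpoint_threshold(features, weights, base_ms=400, floor_ms=150, cap_ms=1200):
    """Map a feature vector for the current silence to a threshold in milliseconds."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return max(floor_ms, min(cap_ms, base_ms + score))

# Example: a pause after a semantically complete, well-timed user answer
# gets a shorter threshold than one after a partial utterance.
weights = {"semantic_complete": -200.0, "barge_in": 100.0, "mean_pause_ms": 0.3}
features = {"semantic_complete": 1.0, "barge_in": 0.0, "mean_pause_ms": 250.0}
print(endpoint_threshold(features, weights))  # -> 275.0
```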


IEEE Automatic Speech Recognition and Understanding Workshop | 2003

A unit selection approach to F0 modeling and its application to emphasis

Antoine Raux; Alan W. Black

The paper presents a new unit selection approach to F0 modeling for speech synthesis. We construct the F0 contour of an utterance by selecting portions of contours from a recorded speech database. In this approach, the elementary unit is the segment, which gives the system the flexibility to combine segments from different phrases and to model both macroprosody and microprosody. Using this method, we built a model of emphasis in English. Informal experimental results show that utterances whose prosody was generated with our method are generally preferred over utterances using Festival's hand-written rule-based F0 model.
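The toy sketch below illustrates the general contour-by-unit-selection idea: pick recorded F0 segments that minimize a combined target and join cost, then concatenate them. The greedy search, unit descriptors, and cost weighting are simplified assumptions, not the paper's model.

```python
# Toy sketch of building an F0 contour by unit selection. The paper's actual
# target/join costs, unit inventory, and emphasis features are not reproduced.

def select_contour(target_units, database):
    """Greedily pick one recorded F0 segment per target unit and concatenate.

    target_units: list of dicts describing desired segments (e.g. phone, stress).
    database: list of (descriptor, f0_values) pairs taken from recorded speech.
    """
    contour, prev_end = [], None
    for target in target_units:
        def cost(entry):
            desc, f0 = entry
            target_cost = sum(desc.get(k) != v for k, v in target.items())
            join_cost = 0.0 if prev_end is None else abs(f0[0] - prev_end)
            return 10.0 * target_cost + join_cost
        desc, f0 = min(database, key=cost)  # best-matching recorded segment
        contour.extend(f0)
        prev_end = f0[-1]
    return contour
```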


IEEE Automatic Speech Recognition and Understanding Workshop | 2007

A multi-layer architecture for semi-synchronous event-driven dialogue management

Antoine Raux; Maxine Eskenazi

We present a new architecture for spoken dialogue systems that explicitly separates the discrete, abstract representation used by the high-level dialogue manager from the continuous, real-time nature of real-world events. We propose to use the concept of the conversational floor as a means to synchronize the internal state of the dialogue manager with the real world. To act as the interface between these two layers, we introduce a new component called the Interaction Manager. The proposed architecture was implemented as a new version of the Olympus framework, which can be used across different domains and modalities. We confirmed the practicality of the approach by porting Let's Go, an existing deployed dialogue system, to the new architecture.
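A hypothetical sketch of the layering described above: a small interaction-manager loop that mediates between real-time events and a discrete dialogue manager while tracking the conversational floor. Component names and the event set are assumptions, not the Olympus 2 API.

```python
# Hedged sketch of an interaction-manager loop in the spirit of the described
# architecture; event names and the floor states are illustrative assumptions.
import queue

class InteractionManager:
    def __init__(self, dialogue_manager):
        self.dm = dialogue_manager      # discrete, high-level layer
        self.events = queue.Queue()     # continuous, real-time layer feeds this
        self.floor = "free"             # conversational floor: user / system / free

    def post(self, event):
        self.events.put(event)

    def run_once(self):
        event = self.events.get()
        if event["type"] == "user_started_speaking":
            self.floor = "user"
        elif event["type"] == "user_utterance_end":
            self.floor = "free"
            action = self.dm.next_action(event["hypothesis"])
            if action is not None:
                self.floor = "system"
                return action           # hand a discrete action to output components
        elif event["type"] == "system_prompt_done":
            self.floor = "free"
        return None
```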


North American Chapter of the Association for Computational Linguistics | 2007

ConQuest: An Open-Source Dialog System for Conferences

Dan Bohus; Sergio Grau Puerto; David Huggins-Daines; Venkatesh Keri; Gopala Krishna; Rohit Kumar; Antoine Raux; Stefanie Tomko

We describe ConQuest, an open-source, reusable spoken dialog system that provides technical program information during conferences. The system uses a transparent, modular, and open infrastructure, and aims to enable applied research in spoken language interfaces. The conference domain is a good platform for applied research, since it permits periodic redeployments and evaluations with a real user base. In this paper, we describe the system's functionality and overall architecture, and discuss two initial deployments.


Spoken Language Technology Workshop | 2006

Online Supervised Learning of Non-Understanding Recovery Policies

Dan Bohus; Brian Langner; Antoine Raux; Alan W. Black; Maxine Eskenazi; Alexander I. Rudnicky

Spoken dialog systems typically use a limited number of non-understanding recovery strategies and simple heuristic policies to engage them (e.g., first ask the user to repeat, then give help, then transfer to an operator). We propose a supervised, online method for learning a non-understanding recovery policy over a large set of recovery strategies. The approach consists of two steps: first, we construct runtime estimates of the likelihood of success of each recovery strategy, and then we use these estimates to construct a policy. An experiment with a publicly available spoken dialog system shows that the learned policy produced a 12.5% relative improvement in the non-understanding recovery rate.
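The sketch below illustrates the two-step idea in a heavily simplified form: maintain runtime estimates of each strategy's success likelihood and derive a greedy, occasionally exploring policy from them. The paper's feature-conditioned estimators and strategy set are not reproduced; the strategy names below are illustrative.

```python
# Simplified sketch: per-strategy success estimates plus a greedy policy,
# updated online after each non-understanding recovery attempt.
import random

class RecoveryPolicy:
    def __init__(self, strategies, explore=0.1):
        self.stats = {s: [1.0, 2.0] for s in strategies}  # [successes, trials] priors
        self.explore = explore

    def choose(self):
        if random.random() < self.explore:                 # keep exploring online
            return random.choice(list(self.stats))
        return max(self.stats, key=lambda s: self.stats[s][0] / self.stats[s][1])

    def update(self, strategy, recovered):
        self.stats[strategy][0] += 1.0 if recovered else 0.0
        self.stats[strategy][1] += 1.0

policy = RecoveryPolicy(["ask_repeat", "give_help", "move_on", "you_can_say"])
s = policy.choose()
policy.update(s, recovered=True)
```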


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2014

Situated Language Understanding at 25 Miles per Hour

Teruhisa Misu; Antoine Raux; Rakesh Gupta; Ian R. Lane

In this paper, we address issues in situated language understanding in a rapidly changing environment: a moving car. Specifically, we propose methods for understanding user queries about specific target buildings in their surroundings. Unlike previous studies on physically situated interactions, such as interaction with mobile robots, the task is very sensitive to timing because the spatial relation between the car and the target changes while the user is speaking. We collected situated utterances from drivers using our research system, Townsurfer, which is embedded in a real vehicle. Based on this data, we analyze the timing of user queries, spatial relationships between the car and targets, the head pose of the user, and linguistic cues. Optimized on this data, our algorithms improved the target identification rate by 24.1% absolute.
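As a hedged illustration of target identification in this setting, the sketch below scores candidate buildings using the car's pose shortly before the query, the side mentioned in the utterance, distance, and word overlap with a building description. The geometry, weights, timing offset, and field names are illustrative assumptions, not Townsurfer's actual model.

```python
# Illustrative scoring of candidate buildings for a query such as
# "what is that tall building on the right?". All weights are assumptions.
import math

def score_target(car, building, query, spoke_at, offset_s=-1.0):
    """Return a relevance score (higher is better) for one candidate building."""
    t = spoke_at + offset_s                       # look slightly before speech onset
    x, y, heading = car["pose_at"](t)             # position and heading in degrees
    dx, dy = building["x"] - x, building["y"] - y
    bearing = math.degrees(math.atan2(dy, dx)) - heading
    bearing = (bearing + 180.0) % 360.0 - 180.0   # normalize to [-180, 180)
    side_match = 1.0 if ("right" in query) == (bearing < 0.0) else 0.0
    distance = math.hypot(dx, dy)
    word_match = sum(w in building["description"] for w in query.split())
    return 2.0 * side_match + word_match - 0.01 * distance
```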


AI Magazine | 2014

The Dialog State Tracking Challenge Series

Jason D. Williams; Matthew Henderson; Antoine Raux; Blaise Thomson; Alan W. Black

In spoken dialog systems, dialog state tracking refers to the task of correctly inferring the user's goal at a given turn, given all of the dialog history up to that turn. The Dialog State Tracking Challenge is a research community challenge task that has run for three rounds. The challenge has given rise to a host of new methods for dialog state tracking, as well as a deeper understanding of the problem itself, including methods for evaluation.
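For readers unfamiliar with the task, the minimal sketch below shows what a naive tracker does: accumulate evidence for slot values from per-turn SLU hypotheses and report the best value per slot. It is an illustration of the task definition, not a challenge entry or any participant's method.

```python
# Minimal illustration of dialog state tracking: combine noisy per-turn
# SLU hypotheses into a belief over the user's goal (slot values).
from collections import defaultdict

def track(turns):
    """turns: list of SLU n-best lists, each item ((slot, value), confidence)."""
    belief = defaultdict(lambda: defaultdict(float))
    for nbest in turns:
        for (slot, value), conf in nbest:
            # simple evidence accumulation; real trackers model ASR noise and goal changes
            belief[slot][value] += conf
    return {slot: max(vals, key=vals.get) for slot, vals in belief.items()}

turns = [
    [(("route", "61C"), 0.6), (("route", "61A"), 0.3)],
    [(("route", "61C"), 0.8)],
]
print(track(turns))  # {'route': '61C'}
```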


Meeting of the Association for Computational Linguistics | 2008

Building Practical Spoken Dialog Systems

Antoine Raux; Brian Langner; Alan W. Black; Maxine Eskenazi

This tutorial will give a practical description of the free Carnegie Mellon Olympus 2 Spoken Dialog Architecture. Building real, working dialog systems that are robust enough for the general public to use is difficult. Most frequently, the functionality of the conversations is severely limited, often down to simple question-answer pairs. While off-the-shelf toolkits help the development of such simple systems, they do not support more advanced, natural dialogs, nor do they offer the transparency and flexibility required by computational linguistics researchers. Olympus 2, by contrast, offers a complete dialog system with automatic speech recognition (Sphinx) and synthesis (SAPI, Festival), and has been used, along with previous versions of Olympus, for teaching and research at Carnegie Mellon and elsewhere for some five years. Overall, a dozen dialog systems have been built using various versions of Olympus, handling tasks ranging from providing bus schedule information, to guidance through maintenance procedures for complex machinery, to personal calendar management. In addition to simplifying the development of dialog systems, Olympus provides a transparent platform for teaching and conducting research on all aspects of dialog systems, including speech recognition and synthesis, natural language understanding and generation, and dialog and interaction management. The tutorial will give a brief introduction to spoken dialog systems before going into detail about how to create your own dialog system within Olympus 2, using the Let's Go bus information system as an example. Further, we will provide guidelines on how to use an actual deployed spoken dialog system such as Let's Go to validate research results in the real world. As a possible testbed for such research, we will describe Let's Go Lab, which provides access to both the Let's Go system and its genuine user population for research experiments.


Archive | 2016

Situated Dialog in Speech-Based Human-Computer Interaction

Alexander I. Rudnicky; Antoine Raux; Ian R. Lane; Teruhisa Misu

This book provides a survey of the state of the art in the practical implementation of spoken dialog systems for applications in everyday settings. It includes contributions on key topics in situated dialog interaction from a number of leading researchers and offers a broad spectrum of perspectives on research and development in the area. In particular, it presents applications in robotics, knowledge access, and communication, and covers the following topics: dialog for interacting with robots; language understanding and generation; dialog architectures and modeling; core technologies; and the analysis of human discourse and interaction. The chapters are adapted and expanded versions of contributions to the 2014 International Workshop on Spoken Dialog Systems (IWSDS 2014), where researchers and developers from industry and academia met to discuss and compare their implementation experiences, analyses, and empirical findings.


Journal of the Acoustical Society of America | 2008

Call early in the evening on a spring day

Maxine Eskenazi; Antoine Raux

The CMU Let's Go Spoken Dialogue System has been used daily for about three years to answer calls to the Pittsburgh Port Authority for bus information in the evenings and on weekends. This has resulted in a database of over 50,000 spoken dialogues as of January 2008, one of the largest publicly available sets of this type of data. While retraining the system with part of this data, it became apparent that there are times of the day, of the week, and of the year when the average number of successful calls is significantly higher. We will present evidence, using these three measures of time (hour, day of week, month of year) and criteria such as signal-to-noise ratio, estimated success rate, number of turns per dialogue, number of non-understandings per dialogue, and barge-in rate, to detect the regular, predictable appearance of high and low success rates, and we will suggest methods for palliating this effect in order to increase overall dialogue success rates.
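The sketch below shows the kind of aggregation described in the abstract: grouping logged dialogues by hour, weekday, or month and averaging an estimated success measure. The field names and the loader are assumptions about the log format, not the actual Let's Go corpus schema.

```python
# Sketch: mean estimated success rate by hour, weekday, and month over a
# corpus of logged dialogues. Field names are assumed for illustration.
from collections import defaultdict
from statistics import mean

def success_by(dialogs, key):
    groups = defaultdict(list)
    for d in dialogs:
        groups[key(d)].append(d["estimated_success"])
    return {k: mean(v) for k, v in sorted(groups.items())}

# dialogs = load_lets_go_logs(...)  # hypothetical loader returning dicts with
#                                   # "start_time" (datetime) and "estimated_success"
# print(success_by(dialogs, key=lambda d: d["start_time"].hour))
# print(success_by(dialogs, key=lambda d: d["start_time"].weekday()))
# print(success_by(dialogs, key=lambda d: d["start_time"].month))
```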

Collaboration


Dive into Antoine Raux's collaborations.

Top Co-Authors

Maxine Eskenazi (Carnegie Mellon University)
Alan W. Black (Carnegie Mellon University)
Brian Langner (Carnegie Mellon University)
Ian R. Lane (Carnegie Mellon University)
Teruhisa Misu (National Institute of Information and Communications Technology)