Publication


Featured research published by Maxine Eskenazi.


Speech Communication | 2009

An overview of spoken language technology for education

Maxine Eskenazi

This paper reviews research in spoken language technology for education and more specifically for language learning. It traces the history of the domain and then groups main issues in the interaction with the student. It addresses the modalities of interaction and their implementation issues and algorithms. Then it discusses one user population - children - and an application for them. Finally it has a discussion of overall systems. It can be used as an introduction to the field and a source of reference materials.


Empirical Methods in Natural Language Processing | 2005

Automatic Question Generation for Vocabulary Assessment

Jonathan Brown; Gwen A. Frishkoff; Maxine Eskenazi

In the REAP system, users are automatically provided with texts to read that are targeted to their individual reading levels. To find appropriate texts, the user's vocabulary knowledge must be assessed. We describe an approach to automatically generating questions for vocabulary assessment. Traditionally, these assessments have been hand-written. Using data from WordNet, we generate 6 types of vocabulary questions. They can take several forms, including wordbank and multiple-choice. We present experimental results that suggest that these automatically generated questions give a measure of vocabulary skill that correlates well with subject performance on independently developed human-written questions. In addition, strong correlations with standardized vocabulary tests point to the validity of our approach to automatic assessment of word knowledge.


Workshop on Innovative Use of NLP for Building Educational Applications | 2008

An Analysis of Statistical Models and Features for Reading Difficulty Prediction

Michael Heilman; Kevyn Collins-Thompson; Maxine Eskenazi

A reading difficulty measure can be described as a function or model that maps a text to a numerical value corresponding to a difficulty or grade level. We describe a measure of readability that uses a combination of lexical features and grammatical features that are derived from subtrees of syntactic parses. We also tested statistical models for nominal, ordinal, and interval scales of measurement. The results indicate that a model for ordinal regression, such as the proportional odds model, using a combination of grammatical and lexical features is most effective at predicting reading difficulty.
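As a minimal sketch of the proportional odds model named in this abstract, the Python below maps a feature vector to a probability for each ordered grade level. It is an illustration, not the authors' implementation: the function names, the feature values, the weights, and the thresholds are all assumptions chosen for the example. The model itself is the standard one, P(grade <= k | x) = sigmoid(theta_k - w·x), with one shared weight vector and one threshold per boundary between adjacent grades.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def grade_probabilities(features, weights, thresholds):
    """Return P(grade = k | features) for each ordered grade.

    Proportional odds model: P(grade <= k) = sigmoid(thresholds[k] - w.x).
    `thresholds` must be strictly increasing; K-1 thresholds give K grades.
    """
    score = sum(w * x for w, x in zip(weights, features))
    # Cumulative probabilities for each boundary, plus 1.0 for the top grade.
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(grade = k) = P(<= k) - P(<= k-1)
        previous = c
    return probs


def predict_grade(features, weights, thresholds, first_grade=1):
    """Return the most probable grade level."""
    probs = grade_probabilities(features, weights, thresholds)
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return first_grade + best_index


if __name__ == "__main__":
    # Hypothetical inputs: one lexical and one grammatical feature value,
    # with made-up weights and thresholds (not taken from the paper).
    x = [0.4, 0.7]
    w = [1.5, 2.0]
    theta = [0.5, 1.5, 2.5]  # three boundaries, so four grade levels
    print(grade_probabilities(x, w, theta))
    print(predict_grade(x, w, theta))
```

Because every grade shares the same weights and differs only in its threshold, the model respects the ordering of grade levels, which is what distinguishes it from a nominal classifier.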


North American Chapter of the Association for Computational Linguistics | 2009

A Finite-State Turn-Taking Model for Spoken Dialog Systems

Antoine Raux; Maxine Eskenazi

This paper introduces the Finite-State Turn-Taking Machine (FSTTM), a new model to control the turn-taking behavior of conversational agents. Based on a non-deterministic finite-state machine, the FSTTM uses a cost matrix and decision-theoretic principles to select a turn-taking action at any time. We show how the model can be applied to the problem of end-of-turn detection. Evaluation results on a deployed spoken dialog system show that the FSTTM provides significantly higher responsiveness than previous approaches.
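The decision-theoretic step described in this abstract can be sketched as picking the action with the lowest expected cost under a belief over floor states. The Python below is only an illustration of that general principle: the state names, action names, and cost values are hypothetical, and the paper's actual state set and cost matrix are not reproduced here.

```python
def expected_costs(belief, cost_matrix):
    """Return the expected cost of each action.

    belief: dict mapping state -> probability (should sum to 1).
    cost_matrix: dict mapping state -> {action: cost}.
    """
    actions = next(iter(cost_matrix.values())).keys()
    return {
        action: sum(belief[state] * cost_matrix[state][action] for state in belief)
        for action in actions
    }


def select_action(belief, cost_matrix):
    """Pick the action that minimizes expected cost."""
    costs = expected_costs(belief, cost_matrix)
    return min(costs, key=costs.get)


if __name__ == "__main__":
    # Hypothetical costs: grabbing the floor while the user still holds it
    # is an interruption; waiting while the floor is free adds latency.
    costs = {
        "user_holds_floor": {"wait": 0.0, "grab": 10.0},
        "floor_is_free": {"wait": 2.0, "grab": 0.0},
    }
    # Early in a pause the system is unsure the user has finished.
    print(select_action({"user_holds_floor": 0.9, "floor_is_free": 0.1}, costs))
    # Later in the pause the belief shifts toward a free floor.
    print(select_action({"user_holds_floor": 0.1, "floor_is_free": 0.9}, costs))
```

With these made-up numbers the first call selects `wait` and the second selects `grab`, showing how a shift in belief about the floor state changes the chosen turn-taking action.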


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2008

Optimizing Endpointing Thresholds using Dialogue Features in a Spoken Dialogue System

Antoine Raux; Maxine Eskenazi

This paper describes a novel algorithm to dynamically set endpointing thresholds based on a rich set of dialogue features to detect the end of user utterances in a dialogue system. By analyzing the relationship between silences in users' speech to a spoken dialogue system and a wide range of automatically extracted features from discourse, semantics, prosody, timing and speaker characteristics, we found that all features correlate with pause duration and with whether a silence indicates the end of the turn, with semantics and timing being the most informative. Based on these features, the proposed method reduces latency by up to 24% over a fixed-threshold baseline. Offline evaluation results were confirmed by implementing the proposed algorithm in the Let's Go system.


Spoken Language Technology Workshop | 2010

Toward better crowdsourced transcription: Transcription of a year of the Let's Go Bus Information System data

Gabriel Parent; Maxine Eskenazi

Transcription is typically a long and expensive process. In the last year, crowdsourcing through Amazon Mechanical Turk (MTurk) has emerged as a way to transcribe large amounts of speech. This paper presents a two-stage approach for the use of MTurk to transcribe one year of Let's Go Bus Information System data, corresponding to 156.74 hours (257,658 short utterances). This data was made available for the Spoken Dialog Challenge 2010 [1]. While others have used a one-stage approach, asking workers to label, for example, words and noises in the same pass, the present approach is closer to what expert transcribers do, dividing one complicated task into several less complicated ones with the goal of obtaining a higher quality transcript. The two-stage approach shows better results in terms of agreement with experts and the quality of acoustic modeling. When “gold-standard” quality control is used, the quality of the transcripts comes close to NIST published expert agreement, although the cost doubles.


International Conference on Spoken Language Processing | 1996

Detection of foreign speakers' pronunciation errors for second language training-preliminary results

Maxine Eskenazi

With the present generation of speech recognizers dealing with speaker-independent continuous speech and medium-sized vocabularies, the possibilities for their application are growing. Yet some applications have not been tried, or have been tried only with heavy constraints on the user, because of the expected poor recognition performance. In addition, the lack of results to date in the domain of prosody has severely limited the use of that information. The author thinks that researchers may be overly pessimistic. She explores the possibility of using Carnegie Mellon University's SPHINX II recognizer and of obtaining correct prosody information in order to implement it in a system to aid in foreign language learning.


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2016

Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning

Tiancheng Zhao; Maxine Eskenazi

This paper presents an end-to-end framework for task-oriented dialog systems using a variant of Deep Recurrent Q-Networks (DRQN). The model is able to interface with a relational database and jointly learn policies for both language understanding and dialog strategy. Moreover, we propose a hybrid algorithm that combines the strengths of reinforcement learning and supervised learning to achieve faster learning speed. We evaluated the proposed model on a 20 Question Game conversational game simulator. Results show that the proposed method outperforms the modular-based baseline and learns a distributed representation of the latent dialog state.


IEEE Automatic Speech Recognition and Understanding Workshop | 2007

A multi-layer architecture for semi-synchronous event-driven dialogue management

Antoine Raux; Maxine Eskenazi

We present a new architecture for spoken dialogue systems that explicitly separates the discrete, abstract representation used in the high-level dialogue manager and the continuous, real-time nature of real-world events. We propose to use the concept of conversational floor as a means to synchronize the internal state of the dialogue manager with the real world. To act as the interface between these two layers, we introduce a new component, called the Interaction Manager. The proposed architecture was implemented as a new version of the Olympus framework, which can be used across different domains and modalities. We confirmed the practicality of the approach by porting Let's Go, an existing deployed dialogue system, to the new architecture.


International Conference on Acoustics, Speech, and Signal Processing | 1984

The French language database: Defining, planning, and recording a large database

René Carré; Raymond Descout; Maxine Eskenazi; Joseph-Jean Mariani; M. Rossi

A database of spoken French sounds is described, along with the methodology used to build it (speaker selection, recording conditions, computer control of data acquisition, ...). The database will contain various corpora for the evaluation and training of recognition and synthesis systems and for acoustic, articulatory, and prosodic studies. Preliminary labeling and further exploitation of this database are starting in several laboratories.

Collaboration


Top Co-Authors

Alan W. Black
Carnegie Mellon University

Antoine Raux
Carnegie Mellon University

Tiancheng Zhao
Carnegie Mellon University

Isabel Trancoso
Instituto Superior Técnico

Sungjin Lee
Pohang University of Science and Technology

Brian Langner
Carnegie Mellon University

Gabriel Parent
Carnegie Mellon University

Jorge Baptista
University of the Algarve