Marie Meteer
BBN Technologies
Publication
Featured research published by Marie Meteer.
Computational Linguistics | 2000
Andreas Stolcke; Noah Coccaro; Rebecca Bates; Paul Taylor; Carol Van Ess-Dykema; Klaus Ries; Elizabeth Shriberg; Daniel Jurafsky; Rachel Martin; Marie Meteer
We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence. The dialogue model is based on treating the discourse structure of a conversation as a hidden Markov model and the individual dialogue acts as observations emanating from the model states. Constraints on the likely sequence of dialogue acts are modeled via a dialogue act n-gram. The statistical dialogue grammar is combined with word n-grams, decision trees, and neural networks modeling the idiosyncratic lexical and prosodic manifestations of each dialogue act. We develop a probabilistic integration of speech recognition with dialogue modeling, to improve both speech recognition and dialogue act classification accuracy. Models are trained and evaluated using a large hand-labeled database of 1,155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech. We achieved good dialogue act labeling accuracy (65% based on errorful, automatically recognized words and prosody, and 71% based on word transcripts, compared to a chance baseline accuracy of 35% and human accuracy of 84%) and a small reduction in word recognition error.
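The decoding idea in this abstract can be illustrated with a minimal sketch. It assumes that the hidden states correspond to dialogue act labels, that transitions come from a dialogue act bigram, and that per-utterance evidence likelihoods (from words or prosody) are supplied by some other model. The function name `viterbi_dialogue_acts`, the two-act label set, and all probabilities are invented toy values for illustration, not taken from the paper.

```python
import math


def viterbi_dialogue_acts(utterance_loglik, bigram_logp, initial_logp):
    """Return the most likely dialogue act sequence for a conversation.

    utterance_loglik: list of dicts, one per utterance,
                      mapping act -> log P(evidence | act)
    bigram_logp:      dict mapping (prev_act, act) -> log P(act | prev_act)
    initial_logp:     dict mapping act -> log P(act) for the first utterance
    """
    acts = list(initial_logp)
    # best[t][a] = best log score of any act sequence ending in act a at utterance t
    best = [{a: initial_logp[a] + utterance_loglik[0][a] for a in acts}]
    back = []  # back[t-1][a] = best predecessor of act a at utterance t
    for obs in utterance_loglik[1:]:
        scores, pointers = {}, {}
        for a in acts:
            prev = max(acts, key=lambda p: best[-1][p] + bigram_logp[(p, a)])
            scores[a] = best[-1][prev] + bigram_logp[(prev, a)] + obs[a]
            pointers[a] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final act.
    path = [max(acts, key=lambda a: best[-1][a])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy example (hypothetical numbers): a question is usually followed by a statement.
log = math.log
initial = {"QUESTION": log(0.5), "STATEMENT": log(0.5)}
bigram = {
    ("QUESTION", "QUESTION"): log(0.1),
    ("QUESTION", "STATEMENT"): log(0.9),
    ("STATEMENT", "QUESTION"): log(0.3),
    ("STATEMENT", "STATEMENT"): log(0.7),
}
evidence = [
    {"QUESTION": log(0.8), "STATEMENT": log(0.2)},
    {"QUESTION": log(0.6), "STATEMENT": log(0.4)},  # locally ambiguous
]
decoded = viterbi_dialogue_acts(evidence, bigram, initial)
```

In this toy run the second utterance looks slightly more like a question on local evidence alone, but the bigram over dialogue acts pulls the decoded sequence toward question followed by statement, which is the kind of sequence constraint the abstract attributes to the dialogue act n-gram.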
Language and Speech | 1998
Elizabeth Shriberg; Rebecca A. Bates; Andreas Stolcke; Paul Taylor; Daniel Jurafsky; Klaus Ries; Noah Coccaro; Rachel Martin; Marie Meteer; Carol Van Ess-Dykema
Identifying whether an utterance is a statement, question, greeting, and so forth is integral to effective automatic understanding of natural dialog. Little is known, however, about how such dialog acts (DAs) can be automatically classified in truly natural conversation. This study asks whether current approaches, which use mainly word information, could be improved by adding prosodic information. The study is based on more than 1000 conversations from the Switchboard corpus. DAs were hand-annotated, and prosodic features (duration, pause, F0, energy, and speaking rate) were automatically extracted for each DA. In training, decision trees based on these features were inferred; trees were then applied to unseen test data to evaluate performance. Performance was evaluated for prosody models alone, and after combining the prosody models with word information—either from true words or from the output of an automatic speech recognizer. For an overall classification task, as well as three subtasks, prosody made significant contributions to classification. Feature-specific analyses further revealed that although canonical features (such as F0 for questions) were important, less obvious features could compensate if canonical features were removed. Finally, in each task, integrating the prosodic model with a DA-specific statistical language model improved performance over that of the language model alone, especially for the case of recognized words. Results suggest that DAs are redundantly marked in natural conversation, and that a variety of automatically extractable prosodic features could aid dialog processing in speech applications.
IEEE Automatic Speech Recognition and Understanding Workshop | 1997
Daniel Jurafsky; Rebecca A. Bates; Noah Coccaro; Rachel Martin; Marie Meteer; Klaus Ries; Elizabeth Shriberg; Andreas Stolcke; Paul Taylor; C. Van Ess-Dykema
We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 dialog acts (DAs), such as question, answer, backchannel, agreement, disagreement, and apology. We labeled 1155 conversations from the Switchboard (SWBD) database (Godfrey et al., 1992) of human-to-human telephone conversations with these 42 types and trained a dialog act detector based on three distinct knowledge sources: sequences of words which characterize a dialog act; prosodic features which characterize a dialog act; and a statistical discourse grammar. Our combined detector, although still in preliminary stages, already achieves a 65% dialog act detection rate based on acoustic waveforms, and 72% accuracy based on word transcripts. Using this detector to switch among the 42 dialog-act-specific trigram LMs also gave us an encouraging but not statistically significant reduction in SWBD word error.
International Conference on Acoustics, Speech, and Signal Processing | 1992
Jan Robin Rohlicek; D. Ayuso; M. Bates; Robert J. Bobrow; Albert Boulanger; Herbert Gish; Philippe Jeanrenaud; Marie Meteer; Man-Hung Siu
A novel system for extracting information from stereotyped voice traffic is described. Off-the-air recordings of commercial air traffic control communications are interpreted in order to identify the flights present and determine the scenario (e.g., takeoff, landing) that they are following. The system combines algorithms from signal segmentation, speaker segregation, speech recognition, natural language parsing, and topic classification into a single system. Initial evaluation of the algorithm on data recorded at Dallas-Fort Worth airport yields performance of 68% detection of flights with 98% precision at an operating point where 76% of the flight identifications are correctly recognized. In tower recording containing both takeoff and landing scenarios, flights are correctly classified as takeoff or landing 94% of the time.
International Conference on Acoustics, Speech, and Signal Processing | 1993
L. Denenberg; Herbert Gish; Marie Meteer; T. Miller; Jan Robin Rohlicek; W. Sadkin; Man-Hung Siu
The authors describe additions and modifications to a prototype system for analyzing air traffic control (ATC) communication. The primary goal of the effort was to achieve real-time performance. This involved both system architectural and algorithmic modifications. The task of the system is to extract the gist of activity as it is monitored. In the ATC domain this involves identifying those flights that are present and classifying each flight as a departure, arrival, or other. The system combines a variety of techniques from speaker identification, speech recognition, natural-language processing, and artificial intelligence. Continuous processing versions of the algorithms have been constructed and it has been demonstrated that real-time performance is possible by distributing the processing over a small number of workstations. To accomplish this task, a flexible software task-construction tool that allows simple specification of complex systems, supporting both dataflow and client/server models, has been developed.
International Conference on Acoustics, Speech, and Signal Processing | 1994
Philippe Jeanrenaud; Man-Hung Siu; Jan Robin Rohlicek; Marie Meteer; Herbert Gish
Jeanrenaud et al. (1993) introduced the notion of event spotting and showed that the detection of events could be approached as a word spotting problem. The present authors first concentrate on the problem of spotting complex units such as grammatic events and describe the three issues that have to be addressed. These issues are: 1) hypothesizing: how to detect an event, 2) scoring: how to generate a consistent score for the event, 3) identification: what word sequence constitutes the particular realization of the event. The authors also discuss ways to evaluate the performance of an event spotter and how to characterize the complexity of an event. They then present two approaches to spot events. The first approach is based on posterior probability scoring using sub-grammars, and the second uses a large vocabulary recognizer. They show that both approaches have comparable performance on an event spotting task using the Switchboard corpus. In the posterior probability scoring approach, however, there is the advantage of being able to choose an operating point.
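To make the remark about choosing an operating point concrete, here is a minimal sketch. It assumes each putative event carries a posterior-style score and a correctness flag from a reference annotation. The function name `operating_points` and the example scores are hypothetical and do not come from the paper. Sweeping a threshold over the scores trades detection rate against false alarms.

```python
def operating_points(scored_hyps, num_true_events):
    """List (threshold, detection_rate, false_alarms) for each candidate threshold.

    scored_hyps:     list of (score, is_correct) pairs for putative events
    num_true_events: number of events in the reference annotation
    """
    points = []
    # Use each distinct score as a threshold, from strictest to most lenient.
    for threshold in sorted({score for score, _ in scored_hyps}, reverse=True):
        kept = [ok for score, ok in scored_hyps if score >= threshold]
        hits = sum(kept)
        false_alarms = len(kept) - hits
        points.append((threshold, hits / num_true_events, false_alarms))
    return points


# Hypothetical scores: two true events, four putative detections.
hyps = [(0.9, True), (0.7, False), (0.6, True), (0.2, False)]
curve = operating_points(hyps, num_true_events=2)
```

A recognizer that only emits a single best word sequence yields one point on such a curve, whereas scored hypotheses let the user pick the trade-off after the fact.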
Conference on Applied Natural Language Processing | 1988
David D. McDonald; Marie Meteer
In this paper we present a means of compensating for the semantic deficits of linguistically naive underlying application programs without compromising principled grammatical treatments in natural language generation. We present a method for building an interface from today's underlying application programs to the linguistic realization component Mumble-86. The goal of the paper is not to discuss how Mumble works, but to describe how one exploits its capabilities. We provide examples from current generation projects using Mumble as their linguistic component.
Speech Communication | 2000
Kristine W. Ma; George Zavaliagkos; Marie Meteer
According to discourse theories in linguistics, conversational utterances possess an informational structure. That is, each sentence consists of two components: the given and the new. The given refers to information that has previously been conveyed in the conversation, such as "that" in "That's interesting". The new section of a sentence introduces additional information that is new to the conversation, such as the word "interesting" in the previous example. In this work, we take advantage of this inherent structure for the purpose of automatic conversational speech recognition by building sub-sentence discourse language models (LMs) to represent the bi-modal nature of each conversational sentence. The internal sentence structure is captured with a statistical sentence model regardless of whether the input sentences are linguistically or acoustically segmented. The proposed model is verified on the Switchboard corpus. The resulting model contributes to a reduction in both LM perplexity and word recognition error rate.
AI Magazine | 1995
Koenraad De Smedt; Eduard H. Hovy; David D. McDonald; Marie Meteer
The Seventh International Workshop on Natural Language Generation was held from 21 to 24 June 1994 in Kennebunkport, Maine. Sixty-seven people from 13 countries attended this 4-day meeting on the study of natural language generation in computational linguistics and AI. The goal of the workshop was to introduce new, cutting-edge work to the community and provide an atmosphere in which discussion and exchange would flourish.
Archive | 1998
Andreas Stolcke; Elizabeth Shriberg; Rebecca Bates; Noah Coccaro; Daniel Jurafsky; Rachel Martin; Marie Meteer; Klaus Ries; Paul Taylor; Carol Van Ess-Dykema