Marsal Gavaldà
Carnegie Mellon University
Publication
Featured research published by Marsal Gavaldà.
international conference on acoustics, speech, and signal processing | 1997
Alon Lavie; Alex Waibel; Lori S. Levin; Michael Finke; Donna Gates; Marsal Gavaldà; Torsten Zeppenfeld; Puming Zhan
This paper describes JANUS-III, our most recent version of the JANUS speech-to-speech translation system. We present an overview of the system and focus on how system design facilitates speech translation between multiple languages, and allows for easy adaptation to new source and target languages. We also describe our methodology for evaluation of end-to-end system performance with a variety of source and target languages. For system development and evaluation, we have experimented with both push-to-talk and cross-talk recording conditions. To date, our system has achieved performance levels of over 80% acceptable translations on transcribed input, and over 70% acceptable translations on speech input recognized with a 75-90% word accuracy. Our current major research is concentrated on enhancing the capabilities of the system to deal with input in broad and general domains.
Machine Translation | 2000
Lori S. Levin; Alon Lavie; Monika Woszczyna; Donna Gates; Marsal Gavaldà; Detlef Koll; Alex Waibel
The Janus-III system translates spoken languages in limited domains. The current research focus is on expanding beyond tasks involving a single limited semantic domain to significantly broader and richer domains. To achieve this goal, the MT components of our system have been engineered to build and manipulate multi-domain parse lattices that are based on modular grammars for multiple semantic domains. This approach yields solutions to several problems including multi-domain disambiguation, segmentation of spoken utterances into sentence units, modularity of system design, and re-use of earlier systems with incompatible output.
meeting of the association for computational linguistics | 1998
Marsal Gavaldà; Alex Waibel
A critical path in the development of natural language understanding (NLU) modules lies in the difficulty of defining a mapping from words to semantics: usually it takes on the order of years of highly-skilled labor to develop a semantic mapping, e.g., in the form of a semantic grammar, that is comprehensive enough for a given domain. Yet, due to the very nature of human language, such mappings invariably fail to achieve full coverage on unseen data. Acknowledging the impossibility of stating a priori all the surface forms by which a concept can be expressed, we present GSG: an empathic computer system for the rapid deployment of NLU front-ends and their dynamic customization by non-expert end-users. Given a new domain for which an NLU front-end is to be developed, two stages are involved. In the authoring stage, GSG aids the developer in the construction of a simple domain model and a kernel analysis grammar. Then, in the run-time stage, GSG provides the end-user with an interactive environment in which the kernel grammar is dynamically extended. Three learning methods are employed in the acquisition of semantic mappings from unseen data: (i) parser predictions, (ii) hidden understanding model, and (iii) end-user paraphrases. A baseline version of GSG has been implemented and preliminary experiments show promising results.
international workshop/conference on parsing technologies | 2004
Marsal Gavaldà
This chapter describes the key features of SOUP, a stochastic, chart-based, top-down parser, especially engineered for real-time analysis of spoken language with very large, multi-domain semantic grammars. SOUP achieves flexibility by encoding context-free grammars, specified for example in the Java Speech Grammar Format, as probabilistic recursive transition networks, and robustness by allowing skipping of input words at any position and producing ranked interpretations that may consist of multiple parse trees. Moreover, SOUP is very efficient, which allows for practically instantaneous backend response.
european conference on artificial intelligence | 1996
Donna Gates; Alon Lavie; Lori S. Levin; Alex Waibel; Marsal Gavaldà; Laura Mayfield; Monika Woszczyna; Puming Zhan
JANUS is a multi-lingual speech-to-speech translation system designed to facilitate communication between two parties engaged in a spontaneous conversation in a limited domain. In this paper we describe our methodology for evaluating translation performance. Our current focus is on end-to-end evaluations: the evaluation of the translation capabilities of the system as a whole. The main goal of our end-to-end evaluation procedure is to determine translation accuracy on a test set of previously unseen dialogues. Other goals include evaluating the effectiveness of the system in conveying domain-relevant information and in detecting and dealing appropriately with utterances (or portions of utterances) that are out-of-domain. End-to-end evaluations are performed in order to verify the general coverage of our knowledge sources, guide our development efforts, and track our improvement over time. We discuss our evaluation procedures, the criteria used for assigning scores to translations produced by the system, and the tools developed for performing this task. Recent Spanish-to-English performance evaluation results are presented as an example.
international conference on acoustics, speech, and signal processing | 1995
Laura Mayfield; Marsal Gavaldà; Wayne H. Ward; Alex Waibel
As part of the JANUS speech-to-speech translation project, the authors have developed a robust translation system based on the information structures inherent to the task being performed. The basic premise is that the structure of the information to be transmitted is largely independent of the language used to encode it. The system performs no syntactic analysis; speaker utterances are parsed into semantic chunks, which can be strung together without grammatical rules, and passed through a simple template-based translation module. The authors have achieved encouraging coverage rates on English, German and Spanish input with English, German and Spanish output.
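The idea of parsing an utterance into semantic chunks and passing them through templates can be illustrated with a toy sketch. Everything below is hypothetical and is not taken from the JANUS system: the concept labels, the regular-expression patterns, the templates, and the names `CHUNK_PATTERNS`, `TEMPLATES`, `SLOT_VALUES`, `parse_chunks`, and `translate` are assumptions made only for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical concept patterns (assumption): each maps a surface form
# in the source language to a language-independent concept label.
CHUNK_PATTERNS = [
    ("greeting", re.compile(r"\b(?:hello|hi)\b")),
    ("suggest_time", re.compile(r"\bhow about (?P<day>monday|tuesday)\b")),
]

# Hypothetical per-language templates, one per concept (assumption).
TEMPLATES: Dict[str, Dict[str, str]] = {
    "de": {"greeting": "hallo", "suggest_time": "wie wäre es am {day}"},
    "es": {"greeting": "hola", "suggest_time": "qué tal el {day}"},
}

# Hypothetical slot-value lexicon per target language (assumption).
SLOT_VALUES: Dict[str, Dict[str, str]] = {
    "de": {"monday": "Montag", "tuesday": "Dienstag"},
    "es": {"monday": "lunes", "tuesday": "martes"},
}


def parse_chunks(utterance: str) -> List[Tuple[str, Dict[str, str]]]:
    """Extract (concept, slots) chunks in order of appearance.

    No syntactic analysis is done; words matching no pattern are ignored.
    """
    text = utterance.lower()
    found = []
    for concept, pattern in CHUNK_PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), concept, match.groupdict()))
    found.sort(key=lambda item: item[0])
    return [(concept, slots) for _, concept, slots in found]


def translate(utterance: str, target: str) -> str:
    """Fill one target-language template per chunk and string them together."""
    parts = []
    for concept, slots in parse_chunks(utterance):
        filled = {name: SLOT_VALUES[target][value] for name, value in slots.items()}
        parts.append(TEMPLATES[target][concept].format(**filled))
    return ", ".join(parts)
```

Because the chunks carry only concept labels and slot values, adding a target language in this sketch means adding one template table and one lexicon, without touching the parsing step.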
conference of the association for machine translation in the americas | 1998
Monika Woszczyna; Matthew Broadhead; Donna Gates; Marsal Gavaldà; Alon Lavie; Lori S. Levin; Alex Waibel
The MT engine of the Janus speech-to-speech translation system is designed around four main principles: 1) an interlingua approach that allows the efficient addition of new languages, 2) the use of semantic grammars that yield low-cost, high-quality translations for limited domains, 3) modular grammars that support easy expansion into new domains, and 4) efficient integration of multiple grammars using multi-domain parse lattices and domain re-scoring. Within the framework of the C-STAR-II speech-to-speech translation effort, these principles are tested against the challenge of providing translation for a number of domains and language pairs with the additional restriction of a common interchange format.
international conference on acoustics, speech, and signal processing | 1996
Alex Waibel; Michael Finke; Donna Gates; Marsal Gavaldà; Thomas Kemp; Alon Lavie; Lori S. Levin; Martin Maier; Laura Mayfield; Arthur E. McNair; Ivica Rogina; Kaori Shima; Tilo Sloboda; Monika Woszczyna; Torsten Zeppenfeld; Puming Zhan
JANUS-II is a research system to design and test components of speech-to-speech translation systems as well as a research prototype for such a system. We focus on two aspects of the system: (1) the new features of the speech recognition component JANUS-SR, and (2) the end-to-end performance of JANUS-II, including a comparison of two machine translation strategies used for JANUS-MT (PHOENIX and GLR*).
conference on applied natural language processing | 1997
Marsal Gavaldà; Klaus Zechner; Gregory Aist
We describe and experimentally evaluate an efficient method for automatically determining small clause boundaries in spontaneous speech. Our method applies an artificial neural network to information about part of speech and trigger words. We find that with a limited amount of data (less than 2500 words for the training set), a small sliding context window (+/-3 tokens) and only two hidden units, the neural net performs extremely well on this task: an error rate of less than 5% and an F-score (combined precision and recall) of over 0.85 on unseen data. These results prove to be better than those reported earlier using different approaches.
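As a rough illustration of the sliding context window and the F-score mentioned in the abstract, the sketch below builds a +/-3 token window around each candidate boundary and computes a balanced F-score. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `<pad>` symbol, the use of part-of-speech tags as the only input, and the choice of the balanced (F1) form of the F-score are all assumptions; the neural network itself is omitted.

```python
from typing import List, Sequence, Tuple

PAD = "<pad>"  # hypothetical padding symbol for positions outside the utterance


def boundary_windows(tags: Sequence[str], radius: int = 3) -> List[Tuple[str, ...]]:
    """Return one context window per candidate boundary.

    The candidate boundary i lies between tags[i - 1] and tags[i]; its
    window holds `radius` tags to the left and `radius` tags to the right.
    """
    padded = [PAD] * radius + list(tags) + [PAD] * radius
    windows = []
    for i in range(1, len(tags)):
        left = padded[i : i + radius]                 # tags[i - radius .. i - 1]
        right = padded[i + radius : i + 2 * radius]   # tags[i .. i + radius - 1]
        windows.append(tuple(left + right))
    return windows


def f_score(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Balanced F-score (harmonic mean of precision and recall); assumed form."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Each window would then be encoded numerically (for example one-hot per tag) and fed to a classifier that labels the candidate position as a boundary or not.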
international conference on computational linguistics | 1996
Alon Lavie; Donna Gates; Marsal Gavaldà; Laura Mayfield; Alex Waibel; Lori S. Levin
JANUS is a multi-lingual speech-to-speech translation system designed to facilitate communication between two parties engaged in a spontaneous conversation in a limited domain. In an attempt to achieve both robustness and translation accuracy we use two different translation components: the GLR module, designed to be more accurate, and the Phoenix module, designed to be more robust. We analyze the strengths and weaknesses of each of the approaches and describe our work on combining them. Another recent focus has been on developing a detailed end-to-end evaluation procedure to measure the performance and effectiveness of the system. We present our most recent Spanish-to-English performance evaluation results.