Muntsin Kolss
Karlsruhe Institute of Technology
Publication
Featured research published by Muntsin Kolss.
workshop on statistical machine translation | 2009
Jan Niehues; Muntsin Kolss
In this paper we describe a new approach to model long-range word reorderings in statistical machine translation (SMT). Until now, most SMT approaches have only been able to model local reorderings, yet even the word order of related languages like German and English can be very different. In recent years, approaches that reorder the source sentence in a preprocessing step to better match the target sentence, according to part-of-speech (POS)-based rules, have been applied successfully. We enhance this approach to model long-range reorderings by introducing discontinuous rules. We tested the new approach on a German-English translation task and significantly improved translation quality, by up to 0.8 BLEU points, compared to a system that already uses continuous POS-based rules to model short-range reorderings.
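A minimal sketch of how a discontinuous POS-based reordering rule might be applied to a source sentence before translation. The rule format (a POS pattern in which '*' matches a gap of one or more tags, plus a permutation over the matched segments) and the example rule are illustrative assumptions, not the exact formalism of the paper.

```python
# Hypothetical sketch of source-side reordering with a discontinuous rule.

def match_pattern(tags, start, pattern):
    """Match `pattern` against `tags` beginning at `start`.
    Returns one (begin, end) index range per pattern segment, or None."""
    segments, i = [], start
    for p, nxt in zip(pattern, list(pattern[1:]) + [None]):
        if p == '*':
            j = i
            while j < len(tags) and tags[j] != nxt:   # consume the gap
                j += 1
            if j == i or j == len(tags):
                return None
            segments.append((i, j))
            i = j
        else:
            if i >= len(tags) or tags[i] != p:
                return None
            segments.append((i, i + 1))
            i += 1
    return segments

def apply_rule(tokens, tags, pattern, permutation):
    """Reorder the first match of `pattern` by permuting its segments."""
    for start in range(len(tags)):
        segs = match_pattern(tags, start, pattern)
        if segs is None:
            continue
        begin, end = segs[0][0], segs[-1][1]
        reordered = [t for k in permutation for t in tokens[segs[k][0]:segs[k][1]]]
        return tokens[:begin] + reordered + tokens[end:]
    return tokens

# Example: move the German past participle next to the finite auxiliary,
# mirroring English word order ("I have read the book yesterday").
tokens = "ich habe das Buch gestern gelesen".split()
tags = ["PPER", "VAFIN", "ART", "NN", "ADV", "VVPP"]
print(apply_rule(tokens, tags, ["VAFIN", "*", "VVPP"], (0, 2, 1)))
# -> ['ich', 'habe', 'gelesen', 'das', 'Buch', 'gestern']
```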
IEEE Transactions on Audio, Speech, and Language Processing | 2008
Gregor Leusch; Rafael E. Banchs; Nicola Bertoldi; Daniel Déchelotte; Marcello Federico; Muntsin Kolss; Young-Suk Lee; José B. Mariño; Matthias Paulik; Salim Roukos; Holger Schwenk; Hermann Ney
This paper describes an approach for computing a consensus translation from the outputs of multiple machine translation (MT) systems. The consensus translation is computed by weighted majority voting on a confusion network, similarly to the well-established ROVER approach of Fiscus for combining speech recognition hypotheses. To create the confusion network, pairwise word alignments of the original MT hypotheses are learned using an enhanced statistical alignment algorithm that explicitly models word reordering. The context of a whole corpus of automatic translations rather than a single sentence is taken into account in order to achieve high alignment quality. The confusion network is rescored with a special language model, and the consensus translation is extracted as the best path. The proposed system combination approach was evaluated in the framework of the TC-STAR speech translation project. Up to six state-of-the-art statistical phrase-based translation systems from different project partners were combined in the experiments. Significant improvements in translation quality from Spanish to English and from English to Spanish in comparison with the best of the individual MT systems were achieved under official evaluation conditions.
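A toy sketch of the voting step only: it assumes the MT hypotheses have already been word-aligned into slots of equal length (the enhanced statistical alignment with reordering is not shown), uses hypothetical system weights, and omits the language-model rescoring and best-path extraction described above.

```python
# Weighted majority voting over the slots of a confusion network.
from collections import defaultdict

def consensus(aligned_hyps, weights):
    """aligned_hyps: one word (or '' for an empty arc) per slot per system."""
    output = []
    for slot in range(len(aligned_hyps[0])):
        votes = defaultdict(float)
        for hyp, w in zip(aligned_hyps, weights):
            votes[hyp[slot]] += w
        best = max(votes, key=votes.get)
        if best:                       # skip winning empty arcs ('')
            output.append(best)
    return " ".join(output)

aligned = [
    ["the", "house", "is",  "red", ""],
    ["the", "house", "is",  "",    "red"],
    ["a",   "house", "was", "red", ""],
]
print(consensus(aligned, weights=[0.4, 0.35, 0.25]))
# -> "the house is red"
```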
Machine Translation | 2007
Christian Fügen; Alex Waibel; Muntsin Kolss
With increasing globalization, communication across language and cultural boundaries is becoming an essential requirement of doing business, delivering education, and providing public services. Due to the considerable cost of human translation services, only a small fraction of text documents and an even smaller percentage of spoken encounters, such as international meetings and conferences, are translated, with most resorting to the use of a common language (e.g. English) or not taking place at all. Technology may provide a potentially revolutionary way out if real-time, domain-independent, simultaneous speech translation can be realized. In this paper, we present a simultaneous speech translation system based on statistical recognition and translation technology. We discuss the technology and various system improvements, and propose mechanisms for user-friendly delivery of the results. Based on extensive component and end-to-end system evaluations and comparisons with human translation performance, we conclude that machines can already deliver comprehensible simultaneous translation output. Moreover, while machine performance is affected by recognition errors (and can thus be improved), human performance is limited by the cognitive challenge of performing the task in real time.
international conference on acoustics, speech, and signal processing | 2006
Christian Fügen; Muntsin Kolss; Dietmar Bernreuther; Matthias Paulik; Sebastian Stüker; Stephan Vogel; Alex Waibel
For years, speech translation has focused on the recognition and translation of discourses in limited domains, such as hotel reservations or scheduling tasks. Only recently have research projects been started to tackle the problem of open-domain speech recognition and translation of complex tasks such as lectures and speeches. In this paper we present the ongoing work at our laboratory on open-domain speech translation of lectures and parliamentary speeches. Starting from a translation system for European parliamentary plenary sessions and a lecture speech recognition system, we show how both components perform in unison on speech translation of lectures.
workshop on statistical machine translation | 2009
Jan Niehues; Teresa Herrmann; Muntsin Kolss; Alex Waibel
In this paper we describe the statistical machine translation system of the Universität Karlsruhe developed for the translation task of the Fourth Workshop on Statistical Machine Translation. The state-of-the-art phrase-based SMT system is augmented with alternative word reordering and alignment mechanisms as well as optional phrase table modifications. We participate in the constrained condition of German-English and English-German as well as the constrained condition of French-English and English-French.
international conference on acoustics, speech, and signal processing | 2007
Sebastian Stüker; Matthias Paulik; Muntsin Kolss; Christian Fügen; Alex Waibel
In this paper we describe our work in coupling automatic speech recognition (ASR) and machine translation (MT) in a speech translation enhanced automatic speech recognition (STE-ASR) framework for transcribing and translating European parliament speeches. We demonstrate the influence of the quality of the ASR component on the MT performance, by comparing a series of WERs with the corresponding automatic translation scores. By porting an STE-ASR framework to the task at hand, we show how the word errors for transcribing English and Spanish speeches can be lowered by 3.0% and 4.8% relative, respectively.
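For reference on the metric used for the ASR component above, a minimal word error rate (WER) computation via edit distance; the example sentences are invented for illustration.

```python
# WER = minimum number of substitutions, insertions and deletions needed to
# turn the hypothesis into the reference, divided by the reference length.

def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

print(wer("the session is now open", "the session now open"))   # 0.2
```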
international conference on acoustics, speech, and signal processing | 2004
Alex Waibel; Tanja Schultz; Stephan Vogel; Christian Fügen; Matthias Honal; Muntsin Kolss; Jürgen Reichert; Sebastian Stüker
Speech translation has made significant advances over the last years. We believe that we can overcome today's limits of language- and domain-portable conversational speech translation systems by relying more radically on learning approaches and by using multiple layers of reduction and transformation to extract the desired content in another language. Therefore, we cascade stochastic source-channel models that extract an underlying message from a corrupt observed output. The three models effectively translate: (1) speech to word lattices (automatic speech recognition, ASR); (2) ill-formed fragments of word strings into a compact well-formed sentence (Clean); (3) sentences in one language to sentences in another (machine translation, MT). We present results of our research efforts towards rapid language portability of all these components. The results on translation suggest that MT systems can be successfully constructed for any language pair by cascading multiple MT systems via English. Moreover, end-to-end performance can be improved if the interlingua is enriched with additional linguistic information that can be derived automatically and monolingually in a data-driven fashion.
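A schematic stub of the three-stage cascade described above: ASR yields a word hypothesis, Clean turns ill-formed fragments into a compact sentence, and MT pivots through English to reach the target language. The components here are placeholder lambdas, not the statistical source-channel models of the paper.

```python
def cascade(asr, clean, mt_src_to_en, mt_en_to_tgt):
    """Compose the stages; two MT systems are chained via English as pivot."""
    def translate(audio):
        hypothesis = asr(audio)            # (1) speech -> word string (best lattice path)
        sentence = clean(hypothesis)       # (2) remove disfluencies, restore well-formedness
        english = mt_src_to_en(sentence)   # (3a) source language -> English
        return mt_en_to_tgt(english)       # (3b) English -> target language
    return translate

demo = cascade(
    asr=lambda audio: "eh wir wir treffen uns am am dienstag",
    clean=lambda h: "wir treffen uns am dienstag",
    mt_src_to_en=lambda s: "we meet on tuesday",
    mt_en_to_tgt=lambda e: "nos reunimos el martes",
)
print(demo(b"<audio>"))   # -> "nos reunimos el martes"
```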
spoken language technology workshop | 2008
Matthias Wölfel; Muntsin Kolss; Florian Kraft; Jan Niehues; Matthias Paulik; Alex Waibel
An increasingly globalized world fosters the exchange of students, researchers, and employees. As a result, situations in which people of different native tongues are listening to the same lecture become more and more frequent. In many such situations, human interpreters are prohibitively expensive or simply not available. For this reason, and because first prototypes have already demonstrated the feasibility of such systems, automatic translation of lectures is receiving increasing attention. A large vocabulary and strong variations in speaking style make lecture translation a challenging, though not hopeless, task. The scope of this paper is to investigate a variety of challenges and to highlight possible solutions in building a system for simultaneous translation of lectures from German to English. While some of the investigated challenges are more general, e.g. environment robustness, others are more specific to this particular task, e.g. pronunciation of foreign words or sentence segmentation. We also report our progress in building an end-to-end system and analyze its performance in terms of objective and subjective measures.
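A hypothetical sketch of one task-specific challenge named above, sentence segmentation of the continuous ASR stream: segments are cut at long pauses, with a maximum-length fallback so translation latency stays bounded. The pause heuristic and thresholds are illustrative assumptions, not the method used in the actual system.

```python
def segment(words_with_pauses, pause_threshold=0.4, max_len=20):
    """words_with_pauses: list of (word, pause_after_in_seconds)."""
    segments, current = [], []
    for word, pause in words_with_pauses:
        current.append(word)
        if pause >= pause_threshold or len(current) >= max_len:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    return segments

stream = [("welcome", 0.1), ("to", 0.05), ("this", 0.05), ("lecture", 0.6),
          ("today", 0.1), ("we", 0.05), ("discuss", 0.1), ("translation", 0.8)]
print(segment(stream))
# -> ['welcome to this lecture', 'today we discuss translation']
```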
international conference on machine learning | 2006
Sebastian Stüker; Chengqing Zong; Jürgen Reichert; Wenjie Cao; Muntsin Kolss; Guodong Xie; Kay Peterson; Peng Ding; Victoria Arranz; Jian Yu; Alex Waibel
In 2008 the Olympic Games will be held in Beijing. For this purpose the city government of Beijing has launched the Special Programme for Construction of Digital Olympics. One of the objectives of the programme is the use of artificial intelligence technology to overcome language barriers during the games. In order to demonstrate the contribution that speech-to-speech translation (SST) technology can make to solving this problem, and in order to prove the feasibility of deploying such technology in the environment of the 2008 Olympic Games in Beijing, we have developed the Digital Olympics Speech-to-Speech Translation System, which addresses a general tourist domain with a special focus on pre-arrival hotel reservation. The system allows for rapid development of SST prototypes, the study of different user interfaces, and the on-the-fly comparison of alternative approaches to the individual problems involved in this task.
IWSLT | 2004
Stephan Vogel; Sanjika Hewavitharana; Muntsin Kolss; Alex Waibel