Publication


Featured research published by Matthias Paulik.


IEEE Transactions on Audio, Speech, and Language Processing | 2008

System Combination for Machine Translation of Spoken and Written Language

Gregor Leusch; Rafael E. Banchs; Nicola Bertoldi; Daniel Déchelotte; Marcello Federico; Muntsin Kolss; Young-Suk Lee; José B. Mariño; Matthias Paulik; Salim Roukos; Holger Schwenk; Hermann Ney

This paper describes an approach for computing a consensus translation from the outputs of multiple machine translation (MT) systems. The consensus translation is computed by weighted majority voting on a confusion network, similarly to the well-established ROVER approach of Fiscus for combining speech recognition hypotheses. To create the confusion network, pairwise word alignments of the original MT hypotheses are learned using an enhanced statistical alignment algorithm that explicitly models word reordering. The context of a whole corpus of automatic translations rather than a single sentence is taken into account in order to achieve high alignment quality. The confusion network is rescored with a special language model, and the consensus translation is extracted as the best path. The proposed system combination approach was evaluated in the framework of the TC-STAR speech translation project. Up to six state-of-the-art statistical phrase-based translation systems from different project partners were combined in the experiments. Significant improvements in translation quality from Spanish to English and from English to Spanish in comparison with the best of the individual MT systems were achieved under official evaluation conditions.
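
As a concrete illustration of the voting step, the following is a minimal sketch of weighted majority voting over a slot-aligned confusion network. It assumes the pairwise word alignment has already mapped every hypothesis into a common sequence of slots (with "" marking a gap), omits the language-model rescoring of the network, and uses hypothetical system weights and example sentences.

```python
# Minimal sketch of consensus translation by weighted majority voting on a
# confusion network. Assumes the pairwise word alignment step has already
# mapped every hypothesis into a common sequence of slots; "" marks a gap
# (epsilon arc). System weights and example data are hypothetical.
from collections import defaultdict

def consensus(slot_aligned_hyps, system_weights):
    """slot_aligned_hyps: list of equal-length word lists, one per MT system."""
    n_slots = len(slot_aligned_hyps[0])
    consensus_words = []
    for slot in range(n_slots):
        votes = defaultdict(float)
        for hyp, weight in zip(slot_aligned_hyps, system_weights):
            votes[hyp[slot]] += weight
        best = max(votes, key=votes.get)
        if best:  # skip epsilon (empty) arcs
            consensus_words.append(best)
    return " ".join(consensus_words)

hyps = [
    ["the", "meeting", "was", "", "postponed"],
    ["the", "session", "was", "", "postponed"],
    ["the", "meeting", "has", "been", "delayed"],
]
weights = [0.4, 0.3, 0.3]  # e.g. tuned on a development set
print(consensus(hyps, weights))  # -> "the meeting was postponed"
```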


IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2008

Sentence segmentation and punctuation recovery for spoken language translation

Matthias Paulik; Sharath Rao; Ian R. Lane; Stephan Vogel; Tanja Schultz

Sentence segmentation and punctuation recovery are critical components for effective spoken language translation (SLT). In this paper we describe our recent work on sentence segmentation and punctuation recovery for three different language pairs, namely English-to-Spanish, Arabic-to-English, and Chinese-to-English. We show that the proposed approach works equally well across these very different language pairs. Furthermore, we introduce two features computed from the translation beam-search lattice that indicate if phrasal and target language model context is jeopardized when segmenting at a given word boundary. These features enable us to introduce short intra-sentence segments without degrading translation performance.
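
The following sketch illustrates the general idea of scoring word boundaries for segmentation; it is not the paper's exact model. The pause feature, the language-model boundary score standing in for the lattice-derived features, and all weights and thresholds are hypothetical.

```python
# Illustrative sketch (not the paper's exact model): score each word boundary
# for sentence segmentation by combining a prosodic pause feature with a
# language-model boundary score. All feature values, weights, and thresholds
# below are hypothetical stand-ins for the features the paper describes.

def boundary_score(pause_sec, lm_boundary_score, w_pause=1.0, w_lm=0.5):
    # Longer pauses and a higher LM score for a sentence break at this point
    # both push the score toward inserting a segment boundary.
    return w_pause * pause_sec + w_lm * lm_boundary_score

def segment(words, pauses, lm_scores, threshold=0.6):
    """words[i] is followed by pauses[i] seconds of silence; insert a
    boundary after word i when the combined score clears the threshold."""
    segments, current = [], []
    for word, pause, lm in zip(words, pauses, lm_scores):
        current.append(word)
        if boundary_score(pause, lm) >= threshold:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    return segments

words  = ["we", "agree", "thank", "you"]
pauses = [0.05, 0.80, 0.10, 0.90]   # seconds of silence after each word
lm     = [-0.2, 0.4, -0.1, 0.5]     # hypothetical boundary log-odds
print(segment(words, pauses, lm))   # -> ['we agree', 'thank you']
```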


IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2008

Extracting clues from human interpreter speech for spoken language translation

Matthias Paulik; Alex Waibel

In previous work, we reported dramatic improvements in automatic speech recognition (ASR) and spoken language translation (SLT) gained by applying information extracted from spoken human interpretations. These interpretations were artificially created by collecting read sentences from a clean parallel text corpus. Real human interpretations are significantly different: they suffer from frequent summarizing, omissions, and self-corrections. Expressing these differences in BLEU score by evaluating human interpretations against carefully created human translations, we found that human interpretations perform two to three times worse than state-of-the-art SLT. Facing these stark differences, we address the question of whether and how ASR and SLT can profit from human interpretations. In the following, we describe initial experiments that apply knowledge derived from real human interpretations to improve English and Spanish ASR and SLT. Our experiments are conducted on a small European Parliament Plenary Sessions development set.
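
The BLEU comparison described above can be reproduced in spirit with the sacrebleu library; the sentences below are invented, and a real evaluation would of course use full corpora of interpretations, SLT output, and reference translations.

```python
# Sketch of the kind of BLEU comparison the abstract describes: score human
# interpretations and SLT output against careful reference translations.
# Requires `pip install sacrebleu`; all sentences here are made up.
import sacrebleu

references = [["the committee approved the budget for next year"]]
interpretation = ["committee approved budget next year"]  # terse, with omissions
slt_output = ["the committee has approved the budget for the next year"]

# corpus_bleu takes a hypothesis stream and a list of reference streams.
print("interpreter BLEU:", sacrebleu.corpus_bleu(interpretation, references).score)
print("SLT BLEU:       ", sacrebleu.corpus_bleu(slt_output, references).score)
```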


IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) | 2009

Automatic translation from parallel speech: Simultaneous interpretation as MT training data

Matthias Paulik; Alex Waibel

State-of-the-art statistical machine translation depends heavily on the availability of domain-specific bilingual parallel text. However, acquiring large amounts of bilingual parallel text is costly and, depending on the language pair, sometimes impossible. We propose an alternative to parallel text as machine translation (MT) training data: audio recordings of parallel speech (pSp), as it occurs in any scenario where interpreters are involved. Although interpretation (pSp) differs significantly from translation (parallel text), we achieve surprisingly strong translation results with our pSp-trained MT and speech translation systems. We argue that the presented approach is of special interest for developing speech translation in the context of resource-deficient languages, where even monolingual resources are scarce.
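
A minimal sketch of how pSp can be turned into MT training data under one simple assumption: after monolingual ASR on each channel, source and interpreter segments are paired by time overlap. The segment times and ASR outputs below are hypothetical, and a real system would additionally filter noisy pairs before translation model training.

```python
# Sketch of turning parallel speech (pSp) into MT training data: transcribe
# each language channel with its ASR, then pair source and interpreter
# segments by time overlap. The ASR outputs below are hypothetical.

def overlap(a, b):
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def pair_segments(src_segs, tgt_segs, min_overlap=1.0):
    """Each segment is (start_sec, end_sec, text). Greedily pair each source
    segment with the target segment it overlaps most in time."""
    pairs = []
    for s in src_segs:
        best = max(tgt_segs, key=lambda t: overlap(s[:2], t[:2]))
        if overlap(s[:2], best[:2]) >= min_overlap:
            pairs.append((s[2], best[2]))
    return pairs

src = [(0.0, 4.0, "señoras y señores buenos días"),
       (4.5, 9.0, "comenzamos la sesión")]
tgt = [(1.0, 5.5, "ladies and gentlemen good morning"),
       (6.0, 10.0, "we begin the session")]
for es, en in pair_segments(src, tgt):
    print(es, "|||", en)  # bilingual pairs for TM training
```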


IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2010

Spoken language translation from parallel speech audio: Simultaneous interpretation as SLT training data

Matthias Paulik; Alex Waibel

In recent work, we proposed an alternative to parallel text as translation model (TM) training data: audio recordings of parallel speech (pSp), as it occurs in any communication scenario where interpreters are involved. Although interpretation compares poorly to translation, we reported surprisingly strong translation results for systems based on pSp-trained TMs. This work extends the use of pSp as a data source for unsupervised training of all major models involved in statistical spoken language translation. We consider the scenario of speech translation between a resource-rich and a resource-deficient language. Our seed models are based on 10 hours of transcribed audio and parallel text comprising 100k translated words. With the help of 92 hours of untranscribed pSp audio, and by taking advantage of the redundancy inherent to pSp (the same information is given twice, in two languages), we report significant improvements for the resource-deficient acoustic, language, and translation models.
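
One way to exploit the redundancy the abstract mentions is to keep only those decoded utterance pairs where the two channels agree after translation. The sketch below illustrates that selection idea with stub components; every function and data item is a hypothetical placeholder for real ASR/MT toolkit output, and the Jaccard overlap is a crude stand-in for proper confidence measures.

```python
# High-level sketch (with stub components) of selecting training material
# from untranscribed pSp: decode both channels with the current models and
# keep utterance pairs where the source hypothesis, once translated, agrees
# with the target-side hypothesis. All data below is hypothetical.

def jaccard(a, b):
    a, b = set(a.split()), set(b.split())
    return len(a & b) / max(1, len(a | b))

def select_training_pairs(src_hyps, tgt_hyps, translate, min_agreement=0.5):
    """Exploit pSp redundancy: the same content exists in both languages,
    so a source hypothesis whose translation agrees with the target-side
    hypothesis is probably transcribed (and translated) correctly."""
    kept = []
    for src, tgt in zip(src_hyps, tgt_hyps):
        if jaccard(translate(src), tgt) >= min_agreement:
            kept.append((src, tgt))
    return kept

# Hypothetical decoder outputs for two time-aligned pSp utterances:
src_hyps = ["comenzamos la sesión", "ruido ininteligible"]
tgt_hyps = ["we begin the session", "the budget was approved"]
translate = lambda s: {"comenzamos la sesión": "we begin the session"}.get(s, "")
print(select_training_pairs(src_hyps, tgt_hyps, translate))
```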


IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2007

Speech Translation Enhanced ASR for European Parliament Speeches - On the Influence of ASR Performance on Speech Translation

Sebastian Stüker; Matthias Paulik; Muntsin Kolss; Christian Fügen; Alex Waibel

In this paper we describe our work in coupling automatic speech recognition (ASR) and machine translation (MT) in a speech translation enhanced automatic speech recognition (STE-ASR) framework for transcribing and translating European parliament speeches. We demonstrate the influence of the quality of the ASR component on the MT performance, by comparing a series of WERs with the corresponding automatic translation scores. By porting an STE-ASR framework to the task at hand, we show how the word errors for transcribing English and Spanish speeches can be lowered by 3.0% and 4.8% relative, respectively.
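
The paper's comparison of a WER series against the corresponding translation scores amounts to measuring how strongly the two move together; a small sketch with invented numbers:

```python
# Sketch of the paper's kind of analysis: compare a series of ASR word error
# rates with the translation scores obtained when feeding each ASR output to
# the same MT system. The numbers are invented for illustration.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

wer  = [18.2, 15.9, 14.1, 12.5]   # % word error rate of four ASR setups
bleu = [34.0, 35.8, 37.1, 38.0]   # BLEU of MT run on each ASR output
print(f"WER/BLEU correlation: {pearson(wer, bleu):.3f}")  # strongly negative
```

A strongly negative correlation reflects the expected dependence: lower recognition error yields better translation.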


IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) | 2011

Leveraging large amounts of loosely transcribed corporate videos for acoustic model training

Matthias Paulik; Panchi Panchapagesan

Lightly supervised acoustic model (AM) training has seen a tremendous amount of interest over the past decade. It promises significant cost savings by relying on only small amounts of accurately transcribed speech and large amounts of imperfectly (loosely) transcribed speech. The latter can often be acquired from existing sources, without additional cost. We identify corporate videos as one such source. After reviewing the state of the art in lightly supervised AM training, we describe our efforts on exploiting 977 hours of loosely transcribed corporate videos for AM training. We report strong reductions in word error rate of up to 19.4% over our baseline. We also report initial results for a simple yet effective scheme to identify a subset of lightly supervised training labels that are more important to the training process.
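
A common selection scheme in lightly supervised training (in the spirit of, though not necessarily identical to, the paper's) is to decode each segment with a language model biased toward its loose transcript and keep segments where hypothesis and transcript mostly agree. A sketch with hypothetical examples:

```python
# Sketch of a common lightly supervised selection scheme: decode each video
# segment with an LM biased toward its loose transcript, then keep segments
# whose ASR hypothesis and loose transcript mostly agree. The example
# strings and the WER threshold are hypothetical.

def edit_distance(a, b):
    """Word-level Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (wa != wb)))
        prev = cur
    return prev[-1]

def keep_segment(hypothesis, loose_transcript, max_wer=0.2):
    hyp, ref = hypothesis.split(), loose_transcript.split()
    return edit_distance(hyp, ref) / max(1, len(ref)) <= max_wer

print(keep_segment("welcome to the quarterly review",
                   "welcome to the quarterly review"))   # True: keep
print(keep_segment("uh welcome everyone to our review",
                   "welcome to the quarterly review"))   # False: discard
```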


IEEE Spoken Language Technology Workshop (SLT) | 2008

Simultaneous machine translation of German lectures into English: Investigating research challenges for the future

Matthias Wölfel; Muntsin Kolss; Florian Kraft; Jan Niehues; Matthias Paulik; Alex Waibel

An increasingly globalized world fosters the exchange of students, researchers, and employees. As a result, situations in which people of different native tongues are listening to the same lecture become more and more frequent. In many such situations, human interpreters are prohibitively expensive or simply not available. For this reason, and because first prototypes have already demonstrated the feasibility of such systems, automatic translation of lectures is receiving increasing attention. A large vocabulary and strong variations in speaking style make lecture translation a challenging, though not hopeless, task. The scope of this paper is to investigate a variety of challenges and to highlight possible solutions in building a system for the simultaneous translation of lectures from German to English. While some of the investigated challenges are more general, e.g., environment robustness, other challenges are more specific to this particular task, e.g., the pronunciation of foreign words or sentence segmentation. We also report our progress in building an end-to-end system and analyze its performance in terms of objective and subjective measures.


Computer Speech & Language | 2013

Training speech translation from audio recordings of interpreter-mediated communication

Matthias Paulik; Alex Waibel

Globalization as well as international crises and disasters spur the need for cross-lingual verbal communication in myriad languages. This is reflected in ongoing intense research activity in the field of speech translation. However, the development of deployable speech translation systems still happens only for a handful of languages. Prohibitively high costs attached to the acquisition of sufficient amounts of suitable speech translation training data are one of the main reasons for this situation. A new language pair or domain is typically only considered for speech translation development after a major need for cross-lingual verbal communication has arisen, justifying the high development costs. In such situations, communication has to rely on the help of interpreters, while massive data collections for system development are conducted in parallel. We propose an alternative to this time-consuming and costly parallel effort. By training speech translation directly on audio recordings of interpreter-mediated communication, we omit most of the manual transcription effort and all of the manual translation effort that characterizes traditional speech translation development.


IEEE Potentials | 2007

Translating language with technology's help

Matthias Paulik; Sebastian Stüker; Christian Fügen; Tanja Schultz; Alex Waibel

In this article, we introduced an iterative system for improving speech recognition in the context of human-mediated translation scenarios. In contrast to related work conducted in this field, we included scenarios in which only spoken language representations are available. One key feature of our iterative system is that all involved system components, ASR as well as MT, are improved. Particularly in the context of a spoken source-language representation, not only is the target-language ASR automatically improved but so is the source-language ASR. Using Spanish as the source language and English as the target language, we were able to reduce the WER of the English ASR by 35.8% when given a written source-language representation. Given a spoken source-language representation, we achieved a relative WER reduction of 29.9% for English and 20.9% for Spanish.
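
For readers unfamiliar with relative WER reductions, a short worked example of what the 35.8% figure means; the 20.0% baseline WER is an assumed value for illustration, not a number from the article:

```python
# Worked example: converting a relative WER reduction into an absolute WER.
baseline_wer = 20.0          # % (hypothetical baseline, not from the article)
relative_reduction = 0.358   # 35.8% relative, from the article
improved_wer = baseline_wer * (1 - relative_reduction)
print(f"{baseline_wer:.1f}% -> {improved_wer:.1f}% WER")  # 20.0% -> 12.8% WER
```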

Collaboration


Top co-authors of Matthias Paulik:

Alex Waibel (Karlsruhe Institute of Technology)
Muntsin Kolss (Karlsruhe Institute of Technology)
Stephan Vogel (Carnegie Mellon University)
Christian Fügen (Karlsruhe Institute of Technology)
Florian Kraft (Karlsruhe Institute of Technology)
Jan Niehues (Karlsruhe Institute of Technology)
Sebastian Stüker (Karlsruhe Institute of Technology)