Richard Zens | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Richard Zens is active.

Explore More

Publication

Featured researches published by Richard Zens.

Lecture Notes in Computer Science | 2002

Phrase-Based Statistical Machine Translation

Richard Zens; Franz Josef Och; Hermann Ney

This paper is based on the work carried out in the framework of the VERBMOBIL project, which is a limited-domain speech translation task (German-English). In the final evaluation, the statistical approach was found to perform best among five competing approaches.In this paper, we will further investigate the used statistical translation models. A shortcoming of the single-word based model is that it does not take contextual information into account for the translation decisions. We will present a translation model that is based on bilingual phrases to explicitly model the local context. We will show that this model performs better than the single-word based model. We will compare monotone and non-monotone search for this model and we will investigate the benefit of using the sum criterion instead of the maximum approximation.

meeting of the association for computational linguistics | 2003

A Comparative Study on Reordering Constraints in Statistical Machine Translation

Richard Zens; Hermann Ney

In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary word-reorderings are permitted, the search problem is NP-hard. On the other hand, if we restrict the possible word-reorderings in an appropriate way, we obtain a polynomial-time search algorithm.In this paper, we compare two different reordering constraints, namely the ITG constraints and the IBM constraints. This comparison includes a theoretical discussion on the permitted number of reorderings for each of these constraints. We show a connection between the ITG constraints and the since 1870 known Schroder numbers.We evaluate these constraints on two tasks: the Verbmobil task and the Canadian Hansards task. The evaluation consists of two parts: First, we check how many of the Viterbi alignments of the training corpus satisfy each of these constraints. Second, we restrict the search to each of these constraints and compare the resulting translation hypotheses.The experiments will show that the baseline ITG constraints are not sufficient on the Canadian Hansards task. Therefore, we present an extension to the ITG constraints. These extended ITG constraints increase the alignment coverage from about 87% to 96%.

international conference on computational linguistics | 2004

Reordering constraints for phrase-based statistical machine translation

Richard Zens; Hermann Ney; Taro Watanabe; Eiichiro Sumita

In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary reorderings are permitted, the search problem is NP-hard. On the other hand, if we restrict the possible reorderings in an appropriate way, we obtain a polynomial-time search algorithm. We investigate different reordering constraints for phrase-based statistical machine translation, namely the IBM constraints and the ITG constraints. We present efficient dynamic programming algorithms for both constraints. We evaluate the constraints with respect to translation quality on two Japanese-English tasks. We show that the reordering constraints improve translation quality compared to an unconstrained search that permits arbitrary phrase reorderings. The ITG constraints preform best on both tasks and yield statistically significant improvements compared to the unconstrained search.

meeting of the association for computational linguistics | 2005

Novel Reordering Approaches in Phrase-Based Statistical Machine Translation

Stephan Kanthak; David Vilar; Richard Zens; Hermann Ney

This paper presents novel approaches to reordering in phrase-based statistical machine translation. We perform consistent reordering of source sentences in training and estimate a statistical translation model. Using this model, we follow a phrase-based monotonic machine translation approach, for which we develop an efficient and flexible reordering framework that allows to easily introduce different reordering constraints. In translation, we apply source sentence reordering on word level and use a reordering automaton as input. We show how to compute reordering automata on-demand using IBM or ITG constraints, and also introduce two new types of reordering constraints. We further add weights to the reordering automata. We present detailed experimental results and show that reordering significantly improves translation quality.

workshop on statistical machine translation | 2006

N-Gram Posterior Probabilities for Statistical Machine Translation

Richard Zens; Hermann Ney

Word posterior probabilities are a common approach for confidence estimation in automatic speech recognition and machine translation. We will generalize this idea and introduce n-gram posterior probabilities and show how these can be used to improve translation quality. Additionally, we will introduce a sentence length model based on posterior probabilities. We will show significant improvements on the Chinese-English NIST task. The absolute improvements of the BLEU score is between 1.1% and 1.6%.

north american chapter of the association for computational linguistics | 2007

Chunk-level reordering of source language sentences with automatically learned rules for statistical machine translation

Yuqi Zhang; Richard Zens; Hermann Ney

In this paper, we describe a source-side reordering method based on syntactic chunks for phrase-based statistical machine translation. First, we shallow parse the source language sentences. Then, reordering rules are automatically learned from source-side chunks and word alignments. During translation, the rules are used to generate a reordering lattice for each sentence. Experimental results are reported for a Chinese-to-English task, showing an improvement of 0.5%--1.8% BLEU score absolute on various test sets and better computational efficiency than reordering during decoding. The experiments also show that the reordering at the chunk-level performs better than at the POS-level.

international conference on computational linguistics | 2004

Symmetric word alignments for statistical machine translation

Richard Zens; Hermann Ney

In this paper, we address the word alignment problem for statistical machine translation. We aim at creating a symmetric word alignment allowing for reliable one-to-many and many-to-one word relationships. We perform the iterative alignment training in the source-to-target and the target-to-source direction with the well-known IBM and HMM alignment models. Using these models, we robustly estimate the local costs of aligning a source word and a target word in each sentence pair. Then, we use efficient graph algorithms to determine the symmetric alignment with minimal total costs (i. e. maximal alignment probability). We evaluate the automatic alignments created in this way on the German--English Verbmobil task and the French--English Canadian Hansards task. We show statistically significant improvements of the alignment quality compared to the best results reported so far. On the Verbmobil task, we achieve an improvement of more than 1% absolute over the baseline error rate of 4.7%.

international conference on acoustics, speech, and signal processing | 2007

Speech Translation by Confusion Network Decoding

Nicola Bertoldi; Richard Zens; Marcello Federico

This paper describes advances in the use of confusion networks as interface between automatic speech recognition and machine translation. In particular, it presents an implementation of a confusion network decoder which significantly improves both in efficiency and performance previous work along this direction. The confusion network decoder results as an extension of a state-of-the-art phrase-based text translation system. Experimental results in terms of decoding speed and translation accuracy are reported on a real-data task, namely the translation of plenary speeches at the European Parliament from Spanish to English.

conference of the european chapter of the association for computational linguistics | 2003

Efficient search for interactive statistical machine translation

Franz Josef Och; Richard Zens; Hermann Ney

The goal of interactive machine translation is to improve the productivity of human translators. An interactive machine translation system operates as follows: the automatic system proposes a translation. Now, the human user has two options: to accept the suggestion or to correct it. During the post-editing process, the human user is assisted by the interactive system in the following way: the system suggests an extension of the current translation prefix. Then, the user either accepts this extension (completely or partially) or ignores it. The two most important factors of such an interactive system are the quality of the proposed extensions and the response time. Here, we will use a fully fledged translation system to ensure the quality of the proposed extensions. To achieve fast response times, we will use word hypotheses graphs as an efficient search space representation. We will show results of our approach on the Verbmobil task and on the Canadian Hansards task.

international conference on computational linguistics | 2004

Improved word alignment using a symmetric lexicon model

Richard Zens; Hermann Ney

Word-aligned bilingual corpora are an important knowledge source for many tasks in natural language processing. We improve the well-known IBM alignment models, as well as the Hidden-Markov alignment model using a symmetric lexicon model. This symmetrization takes not only the standard translation direction from source to target into account, but also the inverse translation direction from target to source. We present a theoretically sound derivation of these techniques. In addition to the symmetrization, we introduce a smoothed lexicon model. The standard lexicon model is based on full-form words only. We propose a lexicon smoothing method that takes the word base forms explicitly into account. Therefore, it is especially useful for highly inflected languages such as German. We evaluate these methods on the German-English Verbmobil task and the French-English Canadian Hansards task. We show statistically significant improvements of the alignment quality compared to the best system reported so far. For the Canadian Hansards task, we achieve an improvement of more than 30% relative.

Explore More