Oliver Bender | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Oliver Bender is active.

Explore More

Publication

Featured researches published by Oliver Bender.

Computational Linguistics | 2009

Statistical approaches to computer-assisted translation

Sergio Barrachina; Oliver Bender; Francisco Casacuberta; Jorge Civera; Elsa Cubel; Shahram Khadivi; Antonio L. Lagarda; Hermann Ney; Jesús Tomás; Enrique Vidal; Juan Miguel Vilar

Current machine translation (MT) systems are still not perfect. In practice, the output from these systems needs to be edited to correct errors. A way of increasing the productivity of the whole translation process (MT plus human work) is to incorporate the human correction activities within the translation process itself, thereby shifting the MT paradigm to that of computer-assisted translation. This model entails an iterative process in which the human translator activity is included in the loop: In each iteration, a prefix of the translation is validated (accepted or amended) by the human and the system computes its best (or n-best) translation suffix hypothesis to complete this prefix. A successful framework for MT is the so-called statistical (or pattern recognition) framework. Interestingly, within this framework, the adaptation of MT systems to the interactive scenario affects mainly the search process, allowing a great reuse of successful techniques and models. In this article, alignment templates, phrase-based models, and stochastic finite-state transducers are used to develop computer-assisted translation systems. These systems were assessed in a European project (TransType2) in two real tasks: The translation of printer manuals; manuals and the translation of the Bulletin of the European Union. In each task, the following three pairs of languages were involved (in both translation directions): English-Spanish, English-German, and English-French.

north american chapter of the association for computational linguistics | 2003

Maximum entropy models for named entity recognition

Oliver Bender; Franz Josef Och; Hermann Ney

In this paper, we describe a system that applies maximum entropy (ME) models to the task of named entity recognition (NER). Starting with an annotated corpus and a set of features which are easily obtainable for almost any language, we first build a baseline NE recognizer which is then used to extract the named entities and their context information from additional non-annotated data. In turn, these lists are incorporated into the final recognizer to further improve the recognition accuracy.

workshop on statistical machine translation | 2009

A Deep Learning Approach to Machine Transliteration

Thomas Deselaers; Saša Hasan; Oliver Bender; Hermann Ney

In this paper we present a novel transliteration technique which is based on deep belief networks. Common approaches use finite state machines or other methods similar to conventional machine translation. Instead of using conventional NLP techniques, the approach presented here builds on deep belief networks, a technique which was shown to work well for other machine learning problems. We show that deep belief networks have certain properties which are very interesting for transliteration and possibly also for translation and that a combination with conventional techniques leads to an improvement over both components on an Arabic-English transliteration task.

conference of the european chapter of the association for computational linguistics | 2003

Comparison of alignment templates and maximum entropy models for natural language understanding

Oliver Bender; Klaus Macherey; Franz Josef Och; Hermann Ney

In this paper we compare two approaches to natural language understanding (NLU). The first approach is derived from the field of statistical machine translation (MT), whereas the other uses the maximum entropy (ME) framework. Starting with an annotated corpus, we describe the problem of NLU as a translation from a source sentence to a formal language target sentence. We mainly focus on the quality of the different alignment and ME models and show that the direct ME approach outperforms the alignment templates method.

ieee automatic speech recognition and understanding workshop | 2007

The RWTH Arabic-to-English spoken language translation system

Oliver Bender; Stefan Hahn; Saša Hasan; Shahram Khadivi; Hermann Ney

We present the RWTH phrase-based statistical machine translation system designed for the translation of Arabic speech into English text. This system was used in the Global Autonomous Language Exploitation (GALE) Go/No-Go Translation Evaluation 2007. Using a two-pass approach, we first generate n-best translation candidates and then rerank these candidates using additional models. We give a short review of the decoder as well as of the models used in both passes. We stress the difficulties of spoken language translation, i.e. how to combine the recognition and translation systems and how to compensate for missing punctuation. In addition, we cover our work on domain adaptation for the applied language models. We present translation results for the official GALE 2006 evaluation set and the GALE 2007 development set.

IEEE Transactions on Audio, Speech, and Language Processing | 2009

Applications of Statistical Machine Translation Approaches to Spoken Language Understanding

Klaus Macherey; Oliver Bender; Hermann Ney

In this paper, we investigate two statistical methods for spoken language understanding based on statistical machine translation. The first approach employs the source-channel paradigm, whereas the other uses the maximum entropy framework. Starting with an annotated corpus, we describe the problem of natural language understanding as a translation from a source sentence to a formal language target sentence. We analyze the quality of different alignment models and feature functions and show that the direct maximum entropy approach outperforms the source channel-based method. Furthermore, we investigate how both methods perform if the input sentences contain speech recognition errors. Finally, we investigate a new approach to combine speech recognition and spoken language understanding. For this purpose, we employ minimum error rate training which directly optimizes the final evaluation criterion. By combining all knowledge sources in a log-linear way, we show that we can decrease both the word error rate and the slot error rate. Experiments were carried out on two German inhouse corpora for spoken dialogue systems.

workshop on statistical machine translation | 2006

Morpho-syntactic Arabic Preprocessing for Arabic to English Statistical Machine Translation

Anas El Isbihani; Shahram Khadivi; Oliver Bender; Hermann Ney

The Arabic language has far richer systems of inflection and derivation than English which has very little morphology. This morphology difference causes a large gap between the vocabulary sizes in any given parallel training corpus. Segmentation of inflected Arabic words is a way to smooth its highly morphological nature. In this paper, we describe some statistically and linguistically motivated methods for Arabic word segmentation. Then, we show the efficiency of proposed methods on the Arabic-English BTEC and NIST tasks.

IWSLT | 2005