Publication


Featured research published by Tamer Alkhouli.


Empirical Methods in Natural Language Processing | 2014

Translation Modeling with Bidirectional Recurrent Neural Networks

Martin Sundermeyer; Tamer Alkhouli; Joern Wuebker; Hermann Ney

This work presents two different translation models using recurrent neural networks. The first one is a word-based approach using word alignments. Second, we present phrase-based translation models that are more consistent with phrase-based decoding. Moreover, we introduce bidirectional recurrent neural models to the problem of machine translation, allowing us to use the full source sentence in our models, which is also of theoretical interest. We demonstrate that our translation models are capable of improving strong baselines already including recurrent neural language models on three tasks: IWSLT 2013 German→English, BOLT Arabic→English and Chinese→English. We obtain gains up to 1.6% BLEU and 1.7% TER by rescoring 1000-best lists.
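As a rough illustration of the word-based variant described above (not the authors' implementation), the sketch below scores each target word given the source position it is aligned to, with the full source sentence encoded by a bidirectional LSTM; module names, dimensions, and the alignment input format are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BidirectionalLexiconModel(nn.Module):
    """Sketch of a word-based translation model: p(target word | aligned source
    position, full source sentence). Hyperparameters are illustrative."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hid_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        # Bidirectional encoder gives every source position access to the full sentence.
        self.encoder = nn.LSTM(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        self.output = nn.Linear(2 * hid_dim, tgt_vocab)

    def forward(self, src_ids, alignment):
        # src_ids:   (batch, src_len) source word indices
        # alignment: (batch, tgt_len) index of the source word aligned to each target word
        enc, _ = self.encoder(self.src_emb(src_ids))           # (batch, src_len, 2*hid)
        idx = alignment.unsqueeze(-1).expand(-1, -1, enc.size(-1))
        aligned_states = torch.gather(enc, 1, idx)              # (batch, tgt_len, 2*hid)
        return self.output(aligned_states).log_softmax(-1)      # log p(e_i | f, a_i)
```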


Empirical Methods in Natural Language Processing | 2015

A Comparison between Count and Neural Network Models Based on Joint Translation and Reordering Sequences

Andreas Guta; Tamer Alkhouli; Jan-Thorsten Peter; Joern Wuebker; Hermann Ney

We propose a conversion of bilingual sentence pairs and the corresponding word alignments into novel linear sequences. These joint translation and reordering (JTR) sequences are uniquely defined and combine interdependent lexical and alignment dependencies at the word level into a single framework. They are constructed in a simple manner while capturing multiple alignments and empty words. JTR sequences can be used to train a variety of models. We investigate the performance of n-gram models with modified Kneser-Ney smoothing, feed-forward and recurrent neural network architectures when estimated on JTR sequences, and compare them to the operation sequence model (Durrani et al., 2013b). Evaluations on the IWSLT German→English, WMT German→English and BOLT Chinese→English tasks show that JTR models improve state-of-the-art phrase-based systems by up to 2.2 BLEU.
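To make the conversion idea concrete, here is a minimal sketch (a simplified illustration, not the paper's exact JTR scheme) that flattens a sentence pair and its word alignment into one linear token sequence of relative jumps and joint lexical tokens; the token formats and the handling of unaligned words are assumptions.

```python
def jtr_like_sequence(src, tgt, alignment):
    """Flatten a sentence pair and its word alignment into one linear sequence.
    Simplified illustration: target words are emitted in order, each preceded by a
    jump token encoding the relative move of the source position; unaligned target
    words are paired with an empty-word token."""
    seq, prev_src = [], -1
    a = {j: i for (i, j) in alignment}        # target index -> source index
    for j, e in enumerate(tgt):
        if j in a:
            jump = a[j] - prev_src
            seq.append(f"<jump:{jump:+d}>")
            seq.append(f"{src[a[j]]}|{e}")    # joint source/target lexical token
            prev_src = a[j]
        else:
            seq.append(f"<eps>|{e}")          # target word aligned to the empty word
    return seq

# Example: "das haus" -> "the house" with a monotone alignment
print(jtr_like_sequence(["das", "haus"], ["the", "house"], [(0, 0), (1, 1)]))
```

A sequence model (n-gram or neural) can then be trained on such sequences just like an ordinary language model.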


Conference of the International Speech Communication Association | 2016

LSTM, GRU, Highway and a bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition

Kazuki Irie; Zoltán Tüske; Tamer Alkhouli; Ralf Schlüter; Hermann Ney

Popularized by the long short-term memory (LSTM), multiplicative gates have become a standard means to design artificial neural networks with intentionally organized information flow. Notable examples of such architectures include gated recurrent units (GRU) and highway networks. In this work, we first focus on the evaluation of each of the classical gated architectures for language modeling in large-vocabulary speech recognition. Namely, we evaluate the highway network, lateral network, LSTM and GRU. Furthermore, the motivation underlying the highway network also applies to LSTM and GRU. An extension specific to the LSTM has been recently proposed with an additional highway connection between the memory cells of adjacent LSTM layers. In contrast, we investigate an approach which can be used with both LSTM and GRU: a highway network in which the LSTM or GRU is used as the transformation function. We find that the highway connections enable both standalone feed-forward and recurrent neural language models to benefit more from the deep structure, and provide a slight improvement of recognition accuracy after interpolation with count models. To complete the overview, we include our initial investigations on the use of the attention mechanism for learning word triggers.
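The following sketch shows one plausible reading of a highway connection with an LSTM as the transformation function: the output is a gated mix of the LSTM-transformed input and the untransformed input. It is an assumption-based illustration, not the paper's architecture; dimensions and gate parameterization are placeholders.

```python
import torch
import torch.nn as nn

class HighwayLSTMLayer(nn.Module):
    """Highway-style layer using an LSTM as the transformation function:
    output = g * LSTM(x) + (1 - g) * x, with a learned transform gate g."""
    def __init__(self, dim):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):
        h, _ = self.lstm(x)                  # transformed representation
        g = torch.sigmoid(self.gate(x))      # transform gate per position
        return g * h + (1.0 - g) * x         # highway mix of transform and carry
```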


Workshop on Statistical Machine Translation | 2015

Investigations on Phrase-based Decoding with Recurrent Neural Network Language and Translation Models

Tamer Alkhouli; Felix Rietig; Hermann Ney

This work explores the application of recurrent neural network (RNN) language and translation models during phrase-based decoding. Due to their use of unbounded context, the decoder integration of RNNs is more challenging compared to the integration of feedforward neural models. In this paper, we apply approximations and use caching to enable RNN decoder integration, while requiring reasonable memory and time resources. We analyze the effect of caching on translation quality and speed, and use it to integrate RNN language and translation models into a phrase-based decoder. To the best of our knowledge, no previous work has discussed the integration of RNN translation models into phrase-based decoding. We also show that a special RNN can be integrated efficiently without the need for approximations. We compare decoding using RNNs to rescoring n-best lists on two tasks: IWSLT 2013 German→English, and BOLT Arabic→English. We demonstrate that the performance of decoding with RNNs is at least as good as using them in rescoring.
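The abstract does not spell out the approximations, but the general flavor of caching for RNN scoring inside a decoder can be illustrated as below: hidden states are memoized per target prefix so hypotheses sharing a prefix reuse the same computation. This is a generic, assumption-based stand-in, not the paper's implementation; class and method names are hypothetical.

```python
import torch
import torch.nn as nn

class CachedRNNScorer:
    """Memoizes RNN hidden states per target prefix, so scoring a hypothesis only
    extends the cached state of its longest cached prefix."""
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTMCell(emb_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)
        self.hid_dim = hid_dim
        self.cache = {}                              # prefix tuple -> (h, c)

    def state(self, prefix):
        if prefix in self.cache:
            return self.cache[prefix]
        if not prefix:
            hc = (torch.zeros(1, self.hid_dim), torch.zeros(1, self.hid_dim))
        else:
            h, c = self.state(prefix[:-1])           # reuse the shorter prefix's state
            x = self.emb(torch.tensor([prefix[-1]]))
            hc = self.rnn(x, (h, c))
        self.cache[prefix] = hc
        return hc

    def log_prob(self, prefix, word):
        h, _ = self.state(tuple(prefix))
        return self.out(h).log_softmax(-1)[0, word].item()
```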


Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers | 2016

The QT21/HimL Combined Machine Translation System

Jan-Thorsten Peter; Tamer Alkhouli; Hermann Ney; Matthias Huck; Fabienne Braune; Alexander M. Fraser; Aleš Tamchyna; Ondrej Bojar; Barry Haddow; Rico Sennrich; Frédéric Blain; Lucia Specia; Jan Niehues; Alex Waibel; Alexandre Allauzen; Lauriane Aufrant; Franck Burlot; Elena Knyazeva; Thomas Lavergne; François Yvon; Marcis Pinnis; Stella Frank

This paper describes the joint submission of the QT21 and HimL projects for the English→Romanian translation task of the ACL 2016 First Conference on Machine Translation (WMT 2016). The submission is a system combination which combines twelve different statistical machine translation systems provided by the different groups (RWTH Aachen University, LMU Munich, Charles University in Prague, University of Edinburgh, University of Sheffield, Karlsruhe Institute of Technology, LIMSI, University of Amsterdam, Tilde). The systems are combined using RWTH’s system combination approach. The final submission shows an improvement of 1.0 BLEU compared to the best single system on newstest2016.


Empirical Methods in Natural Language Processing | 2014

Vector Space Models for Phrase-based Machine Translation

Tamer Alkhouli; Andreas Guta; Hermann Ney

This paper investigates the application of vector space models (VSMs) to the standard phrase-based machine translation pipeline. VSMs are models based on continuous word representations embedded in a vector space. We exploit word vectors to augment the phrase table with new inferred phrase pairs. This helps reduce out-of-vocabulary (OOV) words. In addition, we present a simple way to learn bilingually-constrained phrase vectors. The phrase vectors are then used to provide additional scoring of phrase pairs, which fits into the standard log-linear framework of phrase-based statistical machine translation. Both methods result in significant improvements over a competitive in-domain baseline applied to the Arabic-to-English task of IWSLT 2013.
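As a rough sketch of how word vectors could be used to reduce OOV words (an illustration of the idea, not the paper's method), one can look up an OOV source word's nearest in-vocabulary neighbors in vector space and borrow their phrase-table translations; the data structures and similarity measure below are assumptions.

```python
import numpy as np

def infer_oov_translations(oov_word, word_vectors, phrase_table, k=3):
    """Propose phrase pairs for an out-of-vocabulary source word via its nearest
    in-vocabulary neighbors in word-vector space.

    word_vectors: dict word -> np.ndarray
    phrase_table: dict source word -> list of target translations
    """
    if oov_word not in word_vectors:
        return []
    v = word_vectors[oov_word]

    def cosine(u, w):
        return float(np.dot(u, w) / (np.linalg.norm(u) * np.linalg.norm(w) + 1e-9))

    neighbors = sorted(
        (w for w in word_vectors if w != oov_word and w in phrase_table),
        key=lambda w: cosine(v, word_vectors[w]),
        reverse=True,
    )[:k]
    # Borrow the neighbors' translations as candidate phrase pairs for the OOV word.
    return [(oov_word, t) for w in neighbors for t in phrase_table[w]]
```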


Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers | 2016

Alignment-Based Neural Machine Translation

Tamer Alkhouli; Gabriel Bretschner; Jan-Thorsten Peter; Mohammed Hethnawi; Andreas Guta; Hermann Ney

Neural machine translation (NMT) has emerged recently as a promising statistical machine translation approach. In NMT, neural networks (NN) are directly used to produce translations, without relying on a pre-existing translation framework. In this work, we take a step towards bridging the gap between conventional word alignment models and NMT. We follow the hidden Markov model (HMM) approach that separates the alignment and lexical models. We propose a neural alignment model and combine it with a lexical neural model in a log-linear framework. The models are used in a standalone word-based decoder that explicitly hypothesizes alignments during search. We demonstrate that our system outperforms attention-based NMT on two tasks: IWSLT 2013 German→English and BOLT Chinese→English. We also show promising results for re-aligning the training data using neural models.
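A minimal sketch of the log-linear combination used per decoding step, assuming the decoder hypothesizes an alignment jump and a target word and queries the two neural models for log probabilities; the function name and weights are illustrative, and the weights would normally be tuned on held-out data.

```python
import math

def loglinear_step_score(lexicon_logprob, alignment_logprob, weights=(1.0, 1.0)):
    """Log-linear combination of a lexical model and an alignment model for one
    word-based decoding step: a weighted sum of the models' log scores."""
    lam_lex, lam_align = weights
    return lam_lex * lexicon_logprob + lam_align * alignment_logprob

# Toy usage: combine placeholder model scores for one hypothesized (jump, word) pair.
score = loglinear_step_score(lexicon_logprob=math.log(0.4),
                             alignment_logprob=math.log(0.2))
print(score)
```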


Information Theory Workshop | 2013

Novel tight classification error bounds under mismatch conditions based on f-Divergence

Ralf Schlüter; Markus Nussbaum-Thom; Eugen Beck; Tamer Alkhouli; Hermann Ney

By default, statistical classification/multiple hypothesis testing is faced with the model mismatch introduced by replacing the true distributions in Bayes decision rule by model distributions estimated on training samples. Although a large number of statistical measures exist w.r.t. the mismatch introduced, these works rarely relate to the mismatch in accuracy, i.e. the difference between model error and Bayes error. In this work, the accuracy mismatch between the ideal Bayes decision rule/Bayes test and a mismatched decision rule in statistical classification/multiple hypothesis testing is investigated explicitly. A proof of a novel generalized tight statistical bound on the accuracy mismatch is presented. This result is compared to existing statistical bounds related to the total variation distance that can be extended to bounds of the accuracy mismatch. The analytic results are supported by distribution simulations.
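The abstract does not state the new bound itself, but the classical total-variation-type result it is compared against can be recalled for context: for binary classification with true posterior $\eta(x) = P(c{=}1 \mid x)$ and model posterior $\hat\eta(x)$, the plug-in rule $\hat g$ based on $\hat\eta$ satisfies

$$ R(\hat g) - R^{*} \;\le\; 2\,\mathbb{E}_{X}\bigl[\,|\eta(X) - \hat\eta(X)|\,\bigr], $$

i.e. the accuracy mismatch is bounded by twice an expected variational distance between true and model posteriors. (This is the standard background bound, not the paper's new result.)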


The Prague Bulletin of Mathematical Linguistics | 2017

Empirical Investigation of Optimization Algorithms in Neural Machine Translation

Parnia Bahar; Christopher Jan-Steffen Brix; Tamer Alkhouli; Hermann Ney; Jan-Thorsten Peter

Training neural networks is a non-convex, high-dimensional optimization problem. In this paper, we provide a comparative study of the most popular stochastic optimization techniques used to train neural networks. We evaluate the methods in terms of convergence speed, translation quality, and training stability. In addition, we investigate combinations that seek to improve optimization in terms of these aspects. We train state-of-the-art attention-based models and apply them to perform neural machine translation. We demonstrate our results on two tasks: WMT 2016 En→Ro and WMT 2015 De→En.
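The abstract does not list the specific optimizers; the sketch below only illustrates the kind of side-by-side setup such a comparison uses, with a few common stochastic optimizers, a toy model, and toy data as placeholders rather than the paper's actual choices.

```python
import torch
import torch.nn as nn

def make_optimizer(name, params):
    # Placeholder optimizer choices and learning rates for the comparison.
    if name == "sgd":
        return torch.optim.SGD(params, lr=0.1)
    if name == "adam":
        return torch.optim.Adam(params, lr=1e-3)
    if name == "adagrad":
        return torch.optim.Adagrad(params, lr=1e-2)
    raise ValueError(name)

torch.manual_seed(0)
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))   # toy batch
for name in ["sgd", "adam", "adagrad"]:
    model = nn.Linear(10, 2)                              # stand-in for a translation model
    opt = make_optimizer(name, model.parameters())
    for _ in range(20):                                   # a few updates on the same batch
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    print(name, float(loss))                              # compare convergence behavior
```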


Meeting of the Association for Computational Linguistics | 2017

Hybrid Neural Network Alignment and Lexicon Model in Direct HMM for Statistical Machine Translation

Weiyue Wang; Tamer Alkhouli; Derui Zhu; Hermann Ney

Recently, neural machine translation systems have shown promising performance and surpassed phrase-based systems on most translation tasks. Revisiting conventional machine translation concepts while utilizing effective neural models is vital for understanding the leap accomplished by neural machine translation over phrase-based methods. This work proposes a direct HMM with neural network-based lexicon and alignment models, which are trained jointly using the Baum-Welch algorithm. The direct HMM is applied to rerank the n-best lists created by a state-of-the-art phrase-based translation system, and it provides improvements of up to 1.0% BLEU on two different translation tasks.
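Generic n-best reranking with an additional model score can be sketched as below; here the direct HMM would supply the extra score. This is an assumption-based illustration of reranking in general, not the paper's system, and the weight and scoring function are placeholders.

```python
def rerank_nbest(nbest, hmm_score, weight=0.5):
    """Rerank an n-best list by adding a weighted score from an additional model.
    Each entry is (hypothesis, baseline_score); hmm_score(hypothesis) stands in for
    the extra model's score. The weight would normally be tuned on a dev set."""
    rescored = [(hyp, base + weight * hmm_score(hyp)) for hyp, base in nbest]
    return max(rescored, key=lambda item: item[1])[0]

# Toy usage with a dummy scoring function favouring shorter hypotheses.
nbest = [("the house is small", -3.2), ("the house is little", -3.4)]
best = rerank_nbest(nbest, hmm_score=lambda h: -0.1 * len(h.split()))
print(best)
```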

Collaboration


Dive into Tamer Alkhouli's collaboration.

Top Co-Authors

Hermann Ney

RWTH Aachen University


Eugen Beck

RWTH Aachen University


Alex Waibel

Karlsruhe Institute of Technology