Publication


Featured research published by Hieu Hoang.


Conference of the European Chapter of the Association for Computational Linguistics | 2014

Integrating an Unsupervised Transliteration Model into Statistical Machine Translation

Nadir Durrani; Hassan Sajjad; Hieu Hoang; Philipp Koehn

We investigate three methods for integrating an unsupervised transliteration model into an end-to-end SMT system. We induce a transliteration model from parallel data and use it to translate OOV words. Our approach is fully unsupervised and language independent. In the methods to integrate transliterations, we observed improvements from 0.23-0.75 (Δ 0.41) BLEU points across 7 language pairs. We also show that our mined transliteration corpora provide better rule coverage and translation quality compared to the gold standard transliteration corpora.


Workshop on Statistical Machine Translation | 2008

Towards better Machine Translation Quality for the German-English Language Pairs

Philipp Koehn; Abhishek Arun; Hieu Hoang

The Edinburgh submissions to the shared task of the Third Workshop on Statistical Machine Translation (WMT-2008) incorporate recent advances to the open source Moses system. We made a special effort on the German-English and English-German language pairs, leading to substantial improvements.


Software Engineering, Testing, and Quality Assurance for Natural Language Processing | 2008

Design of the Moses Decoder for Statistical Machine Translation

Hieu Hoang; Philipp Koehn

We present a description of the implementation of the open source decoder for statistical machine translation which has become popular with many researchers in SMT research. The goal of the project is to create an open, high quality phrase-based decoder which can reduce the time and barrier to entry for researchers wishing to do SMT research. We discuss the major design objective for the Moses decoder, its performance relative to other SMT decoders, and the steps we are taking to ensure that its success will continue.


Workshop on Statistical Machine Translation | 2009

A Systematic Analysis of Translation Model Search Spaces

Michael Auli; Adam Lopez; Hieu Hoang; Philipp Koehn

Translation systems are complex, and most metrics do little to pinpoint causes of error or isolate system differences. We use a simple technique to discover induction errors, which occur when good translations are absent from model search spaces. Our results show that a common pruning heuristic drastically increases induction error, and also strongly suggest that the search spaces of phrase-based and hierarchical phrase-based models are highly overlapping despite the well known structural differences.


Empirical Methods in Natural Language Processing | 2008

Improving Interactive Machine Translation via Mouse Actions

Germán Sanchis-Trilles; Daniel Ortiz-Martínez; Jorge Civera; Francisco Casacuberta; Enrique Vidal; Hieu Hoang

Although Machine Translation (MT) is a very active research field which is receiving an increasing amount of attention from the research community, the results that current MT systems are capable of producing are still quite far away from perfection. Because of this, and in order to build systems that yield correct translations, human knowledge must be integrated into the translation process, which will be carried out in our case in an Interactive-Predictive (IP) framework. In this paper, we show that considering Mouse Actions as a significant information source for the underlying system improves the productivity of the human translator involved. In addition, we also show that the initial translations that the MT system provides can be quickly improved by an expert by only performing additional Mouse Actions. In this work, we will be using word graphs as an efficient interface between a phrase-based MT system and the IP engine.


Meeting of the Association for Computational Linguistics | 2009

Improving Mid-Range Re-Ordering Using Templates of Factors

Hieu Hoang; Philipp Koehn

We extend the factored translation model (Koehn and Hoang, 2007) to allow translations of longer phrases composed of factors such as POS and morphological tags to act as templates for the selection and reordering of surface phrase translation. We also reintroduce the use of alignment information within the decoder, which forms an integral part of decoding in the Alignment Template System (Och, 2002), into phrase-based decoding. Results show an increase in translation performance of up to 1.0% BLEU for out-of-domain French-English translation. We also show how this method compares and relates to lexicalized reordering.


Empirical Methods in Natural Language Processing | 2014

Preference Grammars and Soft Syntactic Constraints for GHKM Syntax-based Statistical Machine Translation

Matthias Huck; Hieu Hoang; Philipp Koehn

In this work, we investigate the effectiveness of two techniques for a feature-based integration of syntactic information into GHKM string-to-tree statistical machine translation (Galley et al., 2004): (1.) Preference grammars on the target language side promote syntactic well-formedness during decoding while also allowing for derivations that are not linguistically motivated (as in hierarchical translation). (2.) Soft syntactic constraints augment the system with additional source-side syntax features while not modifying the set of string-to-tree translation rules or the baseline feature scores. We conduct experiments with a state-of-the-art setup on an English→German translation task. Our results suggest that preference grammars for GHKM translation are inferior to the plain target-syntactified model, whereas the enhancement with soft source syntactic constraints provides consistent gains. By employing soft source syntactic constraints with sparse features, we are able to achieve improvements of up to 0.7 points BLEU and 1.0 points TER.


Workshop on Statistical Machine Translation | 2014

Augmenting String-to-Tree and Tree-to-String Translation with Non-Syntactic Phrases

Matthias Huck; Hieu Hoang; Philipp Koehn

We present an effective technique to easily augment GHKM-style syntax-based machine translation systems (Galley et al., 2006) with phrase pairs that do not comply with any syntactic well-formedness constraints. Non-syntactic phrase pairs are distinguished from syntactic ones in order to avoid harming effects. We apply our technique in state-of-the-art string-to-tree and tree-to-string setups. For tree-to-string translation, we furthermore investigate novel approaches for translating with source-syntax GHKM rules in association with input tree constraints and input tree features.


Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers | 2016

Fast and highly parallelizable phrase table for statistical machine translation

Nikolay Bogoychev; Hieu Hoang

Speed of access is a very important property for phrase tables in phrase based statistical machine translation as they are queried many times per sentence. In this paper we present a new standalone phrase table, optimized for query speed and memory locality. The phrase table is cache free and can optionally incorporate a reordering table within. We are able to achieve two times faster decoding by using our phrase table in the Moses decoder in place of the current state-of-the-art phrase table solution without sacrificing translation quality. Using a new, experimental version of Moses we are able to achieve 10 times faster decoding using our novel phrase table.


The Association for Computational Linguistics | 2007

Moses: Open Source Toolkit for Statistical Machine Translation

Philipp Koehn; Hieu Hoang; Alexandra Birch; Chris Callison-Burch; Marcello Federico; Nicola Bertoldi; Brooke Cowan; Wade Shen; Christine Moran; Richard Zens; Chris Dyer; Ondrej Bojar; Alexandra Constantin; Evan Herbst

Collaboration


Dive into Hieu Hoang's collaborations.

Top Co-Authors

Kenneth Heafield

Carnegie Mellon University

Brooke Cowan

Massachusetts Institute of Technology

Chris Dyer

Carnegie Mellon University

Christine Moran

Massachusetts Institute of Technology
