Maxim Khalilov
Polytechnic University of Catalonia
Publication
Featured research published by Maxim Khalilov.
Meeting of the Association for Computational Linguistics | 2009
Maxim Khalilov; José A. R. Fonollosa
In this paper we compare and contrast two approaches to Machine Translation (MT): the CMU-UKA Syntax Augmented Machine Translation system (SAMT) and UPC-TALP N-gram-based Statistical Machine Translation (SMT). SAMT is a hierarchical syntax-driven translation system built on a phrase-based model and a target-side parse tree. In N-gram-based SMT, the translation process is based on bilingual units related to word-to-word alignment and statistical modeling of the bilingual context following a maximum-entropy framework. We provide a step-by-step comparison of the systems and report results in terms of automatic evaluation metrics and required computational resources for a small Arabic-to-English translation task (1.5M tokens in the training corpus). Human error analysis clarifies the advantages and disadvantages of the systems under consideration. Finally, we combine the output of both systems to yield significant improvements in translation quality.
Computer Speech & Language | 2011
Maxim Khalilov; José A. R. Fonollosa
In this paper, we develop an approach called syntax-based reordering (SBR) to handle the fundamental problem of word ordering in statistical machine translation (SMT). We propose to alleviate the word order challenge by including morpho-syntactic and statistical information in a pre-translation reordering framework aimed at capturing short- and long-distance word distortion dependencies. We examine the proposed approach from the theoretical and experimental points of view, discussing and analyzing its advantages and limitations in comparison with some of the state-of-the-art reordering methods. In the final part of the paper, we describe the results of applying the syntax-based model to translation tasks with a great need for reordering (Chinese-to-English and Arabic-to-English). The experiments are carried out on standard phrase-based and alternative N-gram-based SMT systems. We first investigate sparse training data scenarios, in which the translation and reordering models are trained on sparse bilingual data, then scale the method to a large training set and demonstrate that the improvement in translation quality is maintained.
Workshop on Statistical Machine Translation | 2007
Marta R. Costa-jussà; Josep Maria Crego; Patrik Lambert; Maxim Khalilov; José A. R. Fonollosa; José B. Mariño; Rafael E. Banchs
This paper describes the 2007 N-gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the previous year's system, which are highlighted and empirically compared. Mainly, these include a novel word ordering strategy based on: (1) statistically monotonizing the training source corpus and (2) a novel reordering approach based on weighted reordering graphs. In addition, this system introduces a target language model based on statistical classes, a feature for out-of-domain units and an improved optimization procedure. The paper provides details of this system's participation in the ACL 2007 Second Workshop on Statistical Machine Translation. Results on three pairs of languages are reported, namely from Spanish, French and German into English (and the other way round) for both the in-domain and out-of-domain tasks.
Workshop on Statistical Machine Translation | 2008
Maxim Khalilov; Adolfo Hernández H.; Marta Ruiz Costa-Jussà; Josep Maria Crego; Carlos A. Henríquez Q.; Patrik Lambert; José A. R. Fonollosa; José B. Mariño; Rafael E. Banchs
This paper reports on the participation of the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in the ACL WMT 2008 evaluation campaign. This year's system is the evolution of the one we employed for the 2007 campaign. Main updates and extensions involve linguistically motivated word reordering based on the reordering patterns technique. In addition, this system introduces a target language model based on linguistic classes (Part-of-Speech), morphology reduction for an inflectional language (Spanish) and an improved optimization procedure. Results obtained over the development and test sets on Spanish-to-English (and the other way round) translations for both the traditional Europarl and the challenging News stories tasks are analyzed and discussed.
Workshop on Statistical Machine Translation | 2009
José A. R. Fonollosa; Maxim Khalilov; Marta R. Costa-Jussà; José B. Mariño; Carlos A. Henríquez Q.; Adolfo Hernández H.; Rafael E. Banchs
This study presents the TALP-UPC submission to the EACL Fourth Workshop on Statistical Machine Translation 2009 evaluation campaign. It outlines the architecture and configuration of the 2009 phrase-based statistical machine translation (SMT) system, putting emphasis on the major novelty of this year: combination of SMT systems implementing different word reordering algorithms. Traditionally, we have concentrated on the Spanish-to-English and English-to-Spanish News Commentary translation tasks.
North American Chapter of the Association for Computational Linguistics | 2009
Maxim Khalilov; José A. R. Fonollosa; Mark Dras
In this paper, we start with the existing idea of taking reordering rules automatically derived from syntactic representations and applying them in a preprocessing step before translation to make the source sentence structurally more like the target. We then propose a new approach to extracting these rules hierarchically. We evaluate this approach, combined with lattice-based decoding, and show improvements over state-of-the-art distortion models.
IWSLT | 2006
Josep Maria Crego; Patrik Lambert; Maxim Khalilov; Marta R. Costa-jussà; Rafael E. Banchs
Workshop on Statistical Machine Translation | 2006
Josep Maria Crego; Adrià de Gispert; Patrik Lambert; Marta Ruiz Costa-Jussà; Maxim Khalilov; Rafael E. Banchs; José B. Mariño; José A. R. Fonollosa
International Workshop on Spoken Language Translation | 2008
Maxim Khalilov; Carlos A. Henríquez Q.; Adolfo Hernández H.; Rafael E. Banchs; Chen Boxing; Min Zhang; Aiti Aw; Haizhou Li
Workshop on Statistical Machine Translation | 2006
Marta Ruiz Costa-Jussà; Josep Maria Crego; Adrià de Gispert; Patrik Lambert; Maxim Khalilov; José B. Mariño; José A. R. Fonollosa; Rafael E. Banchs