Rajen Chatterjee
Indian Institute of Technology Bombay
Publication
Featured research published by Rajen Chatterjee.
workshop on statistical machine translation | 2015
Ondřej Bojar; Rajen Chatterjee; Christian Federmann; Barry Haddow; Matthias Huck; Chris Hokamp; Philipp Koehn; Varvara Logacheva; Christof Monz; Matteo Negri; Matt Post; Carolina Scarton; Lucia Specia; Marco Turchi
This paper presents the results of the WMT15 shared tasks, which included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task. This year, 68 machine translation systems from 24 institutions were submitted to the ten translation directions in the standard translation task. An additional 7 anonymized systems were included, and were then evaluated both automatically and manually. The quality estimation task had three subtasks, with a total of 10 teams, submitting 34 entries. The pilot automatic post-editing task had a total of 4 teams, submitting 7 entries.
meeting of the association for computational linguistics | 2016
Ondřej Bojar; Rajen Chatterjee; Christian Federmann; Yvette Graham; Barry Haddow; Matthias Huck; Antonio Jimeno Yepes; Philipp Koehn; Varvara Logacheva; Christof Monz; Matteo Negri; Aurélie Névéol; Mariana L. Neves; Martin Popel; Matt Post; Raphael Rubino; Carolina Scarton; Lucia Specia; Marco Turchi; Karin Verspoor; Marcos Zampieri
This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the biomedical task received 15 submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments). The quality estimation task had three subtasks, with a total of 14 teams, submitting 39 entries. The automatic post-editing task had a total of 6 teams, submitting 11 entries.
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers | 2016
Rajen Chatterjee; José G. C. de Souza; Matteo Negri; Marco Turchi
In this paper, we present a novel approach to combine the two variants of phrase-based APE (monolingual and context-aware) by a factored machine translation model that is able to leverage benefits from both. Our factored APE models include part-of-speech-tag and class-based neural language models (LM) along with a statistical word-based LM to improve the fluency of the post-edits. These models are built upon a data augmentation technique which helps to mitigate the problem of over-correction in phrase-based APE systems. Our primary APE system further incorporates a quality estimation (QE) model, which aims to select the best translation between the MT output and the automatic post-edit. According to the shared task results, our primary and contrastive (which does not include the QE module) submissions have similar performance and achieved a significant improvement of 3.31% TER and 4.25% BLEU (relative) over the baseline MT system on the English-German evaluation set.
workshop on statistical machine translation | 2015
Rajen Chatterjee; Marco Turchi; Matteo Negri
In this paper, we describe the “FBK English-Spanish Automatic Post-editing (APE)” systems submitted to the APE shared task at WMT 2015. We explore the most widely used statistical APE technique (monolingual) and its most significant variant (context-aware). In this exploration, we introduce some novel task-specific dense features through which we observe improvements over the default setup of these approaches. We show that these features are useful for pruning the phrase table to remove unreliable rules and that they help the decoder select useful translation options during decoding. Our primary APE system submitted to this shared task performs significantly better than the standard APE baseline.
workshop on statistical machine translation | 2014
Piyush Dungarwal; Rajen Chatterjee; Abhijit Mishra; Anoop Kunchukuttan; Ritesh M. Shah; Pushpak Bhattacharyya
In this paper, we describe our English-Hindi and Hindi-English statistical systems submitted to the WMT14 shared task. The core components of our translation systems are phrase-based (Hindi-English) and factored (English-Hindi) SMT systems. We show that the use of number, case and Tree Adjoining Grammar information as factors helps to improve English-Hindi translation, primarily by generating morphological inflections correctly. We show improvements to the translation systems using pre-processing and post-processing components. To overcome the structural divergence between English and Hindi, we preorder the source-side sentence to conform to the target language word order. Since the parallel corpus is limited, many words are not translated. We translate out-of-vocabulary words and transliterate named entities in a post-processing stage. We also investigate ranking of translations from multiple systems to select the best translation.
Proceedings of the Second Conference on Machine Translation | 2017
Ondřej Bojar; Rajen Chatterjee; Christian Federmann; Yvette Graham; Barry Haddow; Shujian Huang; Matthias Huck; Philipp Koehn; Qun Liu; Varvara Logacheva; Christof Monz; Matteo Negri; Matt Post; Raphael Rubino; Lucia Specia; Marco Turchi
language resources and evaluation | 2014
Anoop Kunchukuttan; Abhijit Mishra; Rajen Chatterjee; Ritesh M. Shah; Pushpak Bhattacharyya
conference of the european chapter of the association for computational linguistics | 2017
Rajen Chatterjee; Gebremedhen Gebremelak; Matteo Negri; Marco Turchi
Proceedings of the Second Conference on Machine Translation | 2017
Rajen Chatterjee; Matteo Negri; Marco Turchi; Marcello Federico; Lucia Specia; Frédéric Blain
Proceedings of the Second Conference on Machine Translation | 2017
Rajen Chatterjee; M. Amin Farajian; Matteo Negri; Marco Turchi; Ankit Srivastava; Santanu Pal