Publication


Featured research published by Bogdan Babych.


conference of the european chapter of the association for computational linguistics | 2003

Improving machine translation quality with automatic named entity recognition

Bogdan Babych; Anthony Hartley

Named entities create serious problems for state-of-the-art commercial machine translation (MT) systems and often cause translation failures beyond the local context, affecting both the overall morphosyntactic well-formedness of sentences and word sense disambiguation in the source text. We report on the results of an experiment in which MT input was processed using output from the named entity recognition module of Sheffield's GATE information extraction (IE) system. The gain in MT quality indicates that specific components of IE technology could boost the performance of current MT systems.


meeting of the association for computational linguistics | 2004

Extending the BLEU MT Evaluation Method with Frequency Weightings

Bogdan Babych; Tony Hartley

We present the results of an experiment on extending BLEU, the automatic method of Machine Translation evaluation, with statistical weights for lexical items, such as tf.idf scores. We show that this extension gives additional information about evaluated texts; in particular it allows us to measure translation Adequacy, which, for statistical MT systems, is often overestimated by the baseline BLEU method. The proposed model uses a single human reference translation, which increases the usability of the proposed method for practical purposes. The model suggests a linguistic interpretation which relates frequency weights and human intuition about translation Adequacy and Fluency.
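The idea of weighting lexical items by tf.idf can be illustrated with a minimal Python sketch. It is not the authors' model: the abstract does not give the exact formulation, so the smoothed idf formula, the restriction to unigrams, and the function names `tfidf_weights`, `weighted_precision` and `weighted_recall` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_weights(doc_tokens, corpus):
    """Return a tf.idf weight per word type of doc_tokens.

    corpus is a list of token lists. The smoothed idf used here is an
    assumption, not the formula from the paper.
    """
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    weights = {}
    for word, freq in tf.items():
        df = sum(1 for doc in corpus if word in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weights[word] = freq * idf
    return weights


def _weighted_overlap(tokens, other, weights):
    """Weighted share of tokens that also occur in other (clipped counts)."""
    counts = Counter(tokens)
    other_counts = Counter(other)
    matched = sum(min(c, other_counts[w]) * weights.get(w, 1.0)
                  for w, c in counts.items())
    total = sum(c * weights.get(w, 1.0) for w, c in counts.items())
    return matched / total if total else 0.0


def weighted_precision(candidate, reference, weights):
    """Weighted unigram precision of an MT output against one reference."""
    return _weighted_overlap(candidate, reference, weights)


def weighted_recall(candidate, reference, weights):
    """Weighted unigram recall: how much of the reference weight is covered."""
    return _weighted_overlap(reference, candidate, weights)
```

With uniform weights the two scores reduce to plain clipped unigram precision and recall; giving a salient word such as a place name a higher weight makes its omission or mistranslation cost more, which is the intuition behind relating frequency weights to Adequacy.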


meeting of the association for computational linguistics | 2006

Using Comparable Corpora to Solve Problems Difficult for Human Translators

Serge Sharoff; Bogdan Babych; Anthony Hartley

In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that are considered by translators as difficult. For a phrase in the source language the tool identifies a range of possible expressions used in similar contexts in target language corpora and presents them to the translator as a list of suggestions. In the paper we discuss the method and present results of human evaluation of the performance of the tool, which highlight its usefulness when dictionary solutions are lacking.


language resources and evaluation | 2009

‘Irrefragable answers’ using comparable corpora to retrieve translation equivalents

Serge Sharoff; Bogdan Babych; Anthony Hartley

In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that are considered by translators as difficult. For a phrase in the source language the tool identifies a range of possible expressions used in similar contexts in target language corpora and presents them to the translator as a list of suggestions. In the paper we discuss the method and present results of human evaluation of the performance of the tool, which highlight its usefulness when dictionary solutions are lacking.


international conference on computational linguistics | 2004

Extending MT evaluation tools with translation complexity metrics

Bogdan Babych; Debbie Elliott; Anthony Hartley

In this paper we report on the results of an experiment in designing resource-light metrics that predict the potential translation complexity of a text or a corpus of homogeneous texts for state-of-the-art MT systems. We show that the best prediction of translation complexity is given by the average number of syllables per word (ASW). The translation complexity metrics based on this parameter are used to normalise automated MT evaluation scores such as BLEU, which otherwise are variable across texts of different types. The suggested approach makes a fairer comparison between the MT systems evaluated on different corpora. The translation complexity metric was integrated into two automated MT evaluation packages: BLEU and the Weighted N-gram model. The extended MT evaluation tools are available from the first author's web site: http://www.comp.leeds.ac.uk/bogdan/evalMT.html
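The ASW parameter itself is simple to compute. The Python sketch below is only an approximation under stated assumptions: syllables are estimated by counting vowel groups in English words, with a crude silent-"e" adjustment, and the function names are hypothetical. The abstract does not give the formula used to normalise BLEU scores, so that step is not reproduced here.

```python
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")


def count_syllables(word):
    """Estimate English syllables as vowel groups (heuristic assumption)."""
    word = word.lower()
    count = len(VOWEL_GROUP.findall(word))
    # Treat a final "e" as silent, except in "-le" endings such as "table".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(1, count)


def average_syllables_per_word(text):
    """Return ASW for a text, or 0.0 if it contains no alphabetic words."""
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return 0.0
    return sum(count_syllables(w) for w in words) / len(words)
```

A text with a higher ASW would, on the paper's finding, be expected to be harder for MT systems, so scores obtained on it are not directly comparable with scores from a lower-ASW corpus.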


conference of the european chapter of the association for computational linguistics | 2006

ASSIST: automated semantic assistance for translators

Serge Sharoff; Bogdan Babych; Paul Rayson; Olga Mudraya; Scott Piao

The problem we address in this paper is that of providing contextual examples of translation equivalents for words from the general lexicon using comparable corpora and semantic annotation that is uniform for the source and target languages. For a sentence, phrase or a query expression in the source language the tool detects the semantic type of the situation in question and gives examples of similar contexts from the target language corpus.


iberian conference on information systems and technologies | 2017

Reference-free system for automated human translation quality estimation

Yu Yuan; Bogdan Babych; Serge Sharoff

In this study the plausibility of automated human translation quality estimation is investigated to tackle the slowness, expense and inconsistency of human evaluation. A reference-free approach using machine learning is advanced to address four research questions. The methodology characteristic of this approach is then presented in detail. Finally, the author reports the latest progress of the project and some preliminary findings.


Archive | 2016

Hybrid Approaches to Machine Translation

Marta R. Costa-jussà; Reinhard Rapp; Patrik Lambert; Kurt Eberle; Rafael E. Banchs; Bogdan Babych

This volume provides an overview of the field of Hybrid Machine Translation (MT) and presents some of the latest research conducted by linguists and practitioners from different multidisciplinary areas. Nowadays, most important developments in MT are achieved by combining data-driven and rule-based techniques. These combinations typically involve hybridization of different traditional paradigms, such as the introduction of linguistic knowledge into statistical approaches to MT, the incorporation of data-driven components into rule-based approaches, or statistical and rule-based pre- and post-processing for both types of MT architectures. The book is of interest primarily to MT specialists, but also to researchers in the wider fields of Computational Linguistics, Machine Learning and Data Mining, and to translators and managers of translation companies and departments who are interested in recent developments concerning automated translation tools.


Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra) | 2014

Deriving de/het gender classification for Dutch nouns for rule-based MT generation tasks

Bogdan Babych; Jonathan Geiger; Mireia Ginestí Rosell; Kurt Eberle

Linguistic resources available in the public domain, such as lemmatisers, part-of-speech taggers and parsers, can be used for the development of MT systems: as separate processing modules or as annotation tools for the training corpus. For SMT this annotation is used for training factored models, and for rule-based systems a linguistically annotated corpus is the basis for creating analysis, generation and transfer dictionaries from corpora. However, the annotation in many cases is insufficient for rule-based MT, especially for the generation tasks. In this paper we analyze a specific case when the part-of-speech tagger does not provide information about de/het gender of Dutch nouns that is needed for our rule-based MT systems translating into Dutch. We show that this information can be derived from large annotated monolingual corpora using a set of context-checking rules on the basis of co-occurrence of nouns and determiners in certain morphosyntactic configurations. As not all contexts are sufficient for disambiguation, we evaluate the coverage and the accuracy of our method for different frequency thresholds.
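The co-occurrence idea with a frequency threshold can be sketched in Python as follows. This is a simplified illustration, not the paper's rule set: it assumes the input is a list of (determiner, noun lemma) pairs already restricted to contexts that are informative for gender (for example singular, non-diminutive noun phrases, since plural nouns take "de" regardless of gender). The function name `derive_gender` and the parameters `min_count` and `min_share` are hypothetical.

```python
from collections import Counter, defaultdict


def derive_gender(pairs, min_count=3, min_share=0.9):
    """Assign "de" or "het" to nouns from (determiner, noun) observations.

    A noun is classified only if it was seen at least min_count times and
    one determiner accounts for at least min_share of those observations.
    Both thresholds are illustrative assumptions.
    """
    counts = defaultdict(Counter)
    for det, noun in pairs:
        det = det.lower()
        if det in ("de", "het"):
            counts[noun.lower()][det] += 1

    genders = {}
    for noun, seen in counts.items():
        total = seen["de"] + seen["het"]
        if total < min_count:
            continue  # too rare to classify reliably
        best, freq = seen.most_common(1)[0]
        if freq / total >= min_share:
            genders[noun] = best
    return genders
```

Raising `min_count` trades coverage for accuracy: fewer nouns are classified, but each decision rests on more evidence, which mirrors the coverage and accuracy evaluation over frequency thresholds described in the abstract.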


language resources and evaluation | 2012

Collecting and Using Comparable Corpora for Statistical Machine Translation

Inguna Skadiņa; Ahmet Aker; Nikos Mastropavlos; Fangzhong Su; Dan Tufiș; Mateja Verlic; Andrejs Vasiļjevs; Bogdan Babych; Paul D. Clough; Robert J. Gaizauskas; Nikos Glaros; Monica Lestari Paramita

Collaboration


Dive into Bogdan Babych's collaborations.

Top Co-Authors

Patrik Lambert

Polytechnic University of Catalonia
