Mary Hearne | Researchain

Publication

Featured research published by Mary Hearne.

International Conference on Computational Linguistics | 2009

Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation

John Tinsley; Mary Hearne; Andy Way

Given much recent discussion and the shift in focus of the field, it is becoming apparent that the incorporation of syntax is the way forward for the current state-of-the-art in machine translation (MT). Parallel treebanks are a relatively recent innovation and appear to be ideal candidates for MT training material. However, until recently there has been no other means to build them than by hand. In this paper, we describe how we make use of new tools to automatically build a large parallel treebank and extract a set of linguistically motivated phrase pairs from it. We show that adding these phrase pairs to the translation model of a baseline phrase-based statistical MT (PBSMT) system leads to significant improvements in translation quality. We describe further experiments on incorporating parallel treebank information into PBSMT, such as word alignments. We investigate the conditions under which the incorporation of parallel treebank data performs optimally. Finally, we discuss the potential of parallel treebanks in other paradigms of MT.

International Conference on Computational Linguistics | 2004

Robust sub-sentential alignment of phrase-structure trees

Declan Groves; Mary Hearne; Andy Way

Data-Oriented Translation (DOT), based on Data-Oriented Parsing (DOP), is a language-independent MT engine which exploits parsed, aligned bitexts to produce very high quality translations. However, data acquisition constitutes a serious bottleneck as DOT requires parsed sentences aligned at both sentential and sub-structural levels. Manual sub-structural alignment is time-consuming, error-prone and requires considerable knowledge of both source and target languages and how they are related. Automating this process is essential in order to carry out the large-scale translation experiments necessary to assess the full potential of DOT. We present a novel algorithm which automatically induces sub-structural alignments between context-free phrase structure trees in a fast and consistent fashion requiring little or no knowledge of the language pair. We present results from a number of experiments which indicate that our method provides a serious alternative to manual alignment.

Language and Linguistics Compass | 2011

Statistical Machine Translation: A Guide for Linguists and Translators

Mary Hearne; Andy Way

This paper presents an overview of Statistical Machine Translation (SMT), which is currently the dominant approach in Machine Translation (MT) research. In Way and Hearne (2010), we describe how central linguists and translators are to the MT process, so that SMT developers and researchers may better understand how to include these groups in continuing to advance the state-of-the-art. If these constituencies are to make an impact in the field of MT, they need to know how their input is used by SMT systems. Accordingly, our objective in this paper is to present the basic principles underpinning SMT in a way that linguists and translators will find accessible and useful.

Language and Linguistics Compass | 2011

On the Role of Translations in State-of-the-Art Statistical Machine Translation

Andy Way; Mary Hearne

For the state-of-the-art in Machine Translation (MT) to advance, we contend that Statistical MT (SMT) practitioners will have to collaborate with linguists and translators. This has not happened to date for two main reasons: (i) there remain many translators – and even more surprisingly, many experienced MT protagonists – who find the basic model of SMT very difficult to understand; and (ii) while current approaches to MT depend entirely on the availability of large quantities of translated data, very little thought is given to the process by which the data were produced and the kinds of phenomena that are known to exist in real-translated texts. We argue that much light can be shed on these by looking at the areas of contrastive linguistics and translation studies. In this paper, we describe how central linguists and translators are to the MT process, so that SMT developers and researchers may better understand how to include these previously neglected groups in continuing to advance the state-of-the-art. In a companion paper, we describe the workings of SMT in language that is understandable to linguists and translators. If these constituencies are to make an impact in the field of MT, they need to know how their input is used by the SMT systems, and how they can help in the data annotation phase which is crucial if greater strides are to be made than heretofore.

Archive | 2003

Seeing the wood for the trees: data-oriented translation

Mary Hearne; Andy Way

Machine Translation Summit XI, 10-14 September 2007, Copenhagen, Denmark | 2007

Robust language pair-independent sub-tree alignment

John Tinsley; Ventsislav Zhechev; Mary Hearne; Andy Way

Archive | 2006

Disambiguation strategies for data-oriented translation

Mary Hearne; Andy Way

TMI-07 - Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation, 7-9 September 2007, Skövde, Sweden | 2007

Capturing translational divergences with a statistical tree-to-tree aligner

Mary Hearne; John Tinsley; Ventsislav Zhechev; Andy Way

Spoken Language Technology Workshop | 2006

Syntactic Phrase-Based Statistical Machine Translation

Hany Hassan; Mary Hearne; Andy Way; Khalil Sima'an

Archive | 2005

Final Report of the 2005 Language Engineering Workshop on Statistical Machine Translation by Parsing

Andrea Burbank; Marine Carpuat; Stephen Clark; Markus Dreyer; Pamela Fox; Declan Groves; Mary Hearne; I. Dan Melamed; Yihai Shen; Ben Wellington; Dekai Wu
