Mary Hearne
Dublin City University
Publication
Featured research published by Mary Hearne.
International Conference on Computational Linguistics | 2009
John Tinsley; Mary Hearne; Andy Way
Given much recent discussion and the shift in focus of the field, it is becoming apparent that the incorporation of syntax is the way forward for the current state-of-the-art in machine translation (MT). Parallel treebanks are a relatively recent innovation and appear to be ideal candidates for MT training material. However, until recently there has been no other means to build them than by hand. In this paper, we describe how we make use of new tools to automatically build a large parallel treebank and extract a set of linguistically motivated phrase pairs from it. We show that adding these phrase pairs to the translation model of a baseline phrase-based statistical MT (PBSMT) system leads to significant improvements in translation quality. We describe further experiments on incorporating parallel treebank information into PBSMT, such as word alignments. We investigate the conditions under which the incorporation of parallel treebank data performs optimally. Finally, we discuss the potential of parallel treebanks in other paradigms of MT.
International Conference on Computational Linguistics | 2004
Declan Groves; Mary Hearne; Andy Way
Data-Oriented Translation (DOT), based on Data-Oriented Parsing (DOP), is a language-independent MT engine which exploits parsed, aligned bitexts to produce very high quality translations. However, data acquisition constitutes a serious bottleneck as DOT requires parsed sentences aligned at both sentential and sub-structural levels. Manual sub-structural alignment is time-consuming, error-prone and requires considerable knowledge of both source and target languages and how they are related. Automating this process is essential in order to carry out the large-scale translation experiments necessary to assess the full potential of DOT. We present a novel algorithm which automatically induces sub-structural alignments between context-free phrase structure trees in a fast and consistent fashion requiring little or no knowledge of the language pair. We present results from a number of experiments which indicate that our method provides a serious alternative to manual alignment.
Language and Linguistics Compass | 2011
Mary Hearne; Andy Way
This paper presents an overview of Statistical Machine Translation (SMT), which is currently the dominant approach in Machine Translation (MT) research. In Way and Hearne (2010), we describe how central linguists and translators are to the MT process, so that SMT developers and researchers may better understand how to include these groups in continuing to advance the state-of-the-art. If these constituencies are to make an impact in the field of MT, they need to know how their input is used by SMT systems. Accordingly, our objective in this paper is to present the basic principles underpinning SMT in a way that linguists and translators will find accessible and useful.
Language and Linguistics Compass | 2011
Andy Way; Mary Hearne
For the state-of-the-art in Machine Translation (MT) to advance, we contend that Statistical MT (SMT) practitioners will have to collaborate with linguists and translators. This has not happened to date for two main reasons: (i) there remain many translators – and even more surprisingly, many experienced MT protagonists – who find the basic model of SMT very difficult to understand; and (ii) while current approaches to MT depend entirely on the availability of large quantities of translated data, very little thought is given to the process by which the data were produced and the kinds of phenomena that are known to exist in real-translated texts. We argue that much light can be shed on these by looking at the areas of contrastive linguistics and translation studies. In this paper, we describe how central linguists and translators are to the MT process, so that SMT developers and researchers may better understand how to include these previously neglected groups in continuing to advance the state-of-the-art. In a companion paper, we describe the workings of SMT in language that is understandable to linguists and translators. If these constituencies are to make an impact in the field of MT, they need to know how their input is used by the SMT systems, and how they can help in the data annotation phase which is crucial if greater strides are to be made than heretofore.
Archive | 2003
Mary Hearne; Andy Way
Tinsley, John and Zhechev, Ventsislav and Hearne, Mary and Way, Andy (2007) Robust language pair-independent sub-tree alignment. In: Machine Translation Summit XI, 10-14 September, 2007, Copenhagen, Denmark. | 2007
John Tinsley; Ventsislav Zhechev; Mary Hearne; Andy Way
Archive | 2006
Mary Hearne; Andy Way
Hearne, Mary and Tinsley, John and Zhechev, Ventsislav and Way, Andy (2007) Capturing translational divergences with a statistical tree-to-tree aligner. In: TMI-07 - Proceedings of The 11th Conference on Theoretical and Methodological Issues in Machine Translation, 7-9 September 2007, Skövde, Sweden. | 2007
Mary Hearne; John Tinsley; Ventsislav Zhechev; Andy Way
Spoken Language Technology Workshop | 2006
Hany Hassan; Mary Hearne; Andy Way; Khalil Sima'an
Archive | 2005
Andrea Burbank; Marine Carpuat; Stephen Clark; Markus Dreyer; Pamela Fox; Declan Groves; Mary Hearne; I. Dan Melamed; Yihai Shen; Ben Wellington; Dekai Wu