Francis M. Tyers | Researchain

Publication

Featured research published by Francis M. Tyers.

Machine Translation | 2011

Apertium: a free/open-source platform for rule-based machine translation

Mikel L. Forcada; Mireia Ginestí-Rosell; Jacob Nordfalk; Jim O'Regan; Sergio Ortiz-Rojas; Juan Antonio Pérez-Ortiz; Felipe Sánchez-Martínez; Gema Ramírez-Sánchez; Francis M. Tyers

Apertium is a free/open-source platform for rule-based machine translation. It is being widely used to build machine translation systems for a variety of language pairs, especially in those cases (mainly with related-language pairs) where shallow transfer suffices to produce good quality translations, although it has also proven useful in assimilation scenarios with more distant pairs involved. This article summarises the Apertium platform: the translation engine, the encoding of linguistic data, and the tools developed around the platform. The present limitations of the platform and the challenges posed for the coming years are also discussed. Finally, evaluation results for some of the most active language pairs are presented. An appendix describes Apertium as a free/open-source project.

The Prague Bulletin of Mathematical Linguistics | 2010

Free/open-source resources in the Apertium platform for machine translation research and development

Francis M. Tyers; Felipe Sánchez-Martínez; Sergio Ortiz-Rojas; Mikel L. Forcada

This paper describes the resources available in the Apertium platform, a free/open-source framework for creating rule-based machine translation systems. Resources within the platform take the form of finite-state morphologies for morphological analysis and generation, bilingual transfer lexica, probabilistic part-of-speech taggers and transfer rule files, all in standardised formats. These resources are described and some examples are given of their reuse and recycling in combination with other machine translation systems.

Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies | 2017

CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

Daniel Zeman; Martin Popel; Milan Straka; Jan Hajic; Joakim Nivre; Filip Ginter; Juhani Luotolahti; Sampo Pyysalo; Slav Petrov; Martin Potthast; Francis M. Tyers; Elena Badmaeva; Memduh Gokirmak; Anna Nedoluzhko; Silvie Cinková; Jaroslava Hlaváčová; Václava Kettnerová; Zdenka Uresová; Jenna Kanerva; Stina Ojala; Anna Missilä; Christopher D. Manning; Sebastian Schuster; Siva Reddy; Dima Taji; Nizar Habash; Herman Leung; Marie-Catherine de Marneffe; Manuela Sanguinetti; Maria Simi

The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.

International Conference on Natural Language Processing | 2010

Shooting at flies in the dark: rule-based lexical selection for a minority language pair

Linda Wiechetek; Francis M. Tyers; Thomas Omma

This paper presents a set of rules which form the prototype lexical selection component of a rule-based machine translation system between two closely-related minority languages, North Sami and Lule Sami. While the languages have comprehensive monolingual computational linguistic resources, they lack bilingual resources. One-to-one relations in the lexicon dominate, but there are also more complex relations that require lexical selection using both lexical and syntacticosemantic context. An evaluation was performed over a set of 11 word pairs, which shows that constructing lexical selection rules and doing research on a North Sami-Lule Sami contrastive lexicon is an interrelated process. Other lesser-resourced language pairs will benefit from the use of lexical selection rules as the relevance of lexical selection increases with the divergence of the languages.

The Prague Bulletin of Mathematical Linguistics | 2017

Rule-Based Machine Translation for the Italian–Sardinian Language Pair

Francis M. Tyers; Hèctor Alòs i Font; Gianfranco Fronteddu; Adrià Martín-Mor

This paper describes the process of creation of the first machine translation system from Italian to Sardinian, a Romance language spoken on the island of Sardinia in the Mediterranean. The project was carried out by a team of translators and computational linguists. The article focuses on the technology used (Rule-Based Machine Translation) and on some of the rules created, as well as on the orthographic model used for Sardinian.

Archive | 2018

Chapter 6. A constructicon for Russian: Filling in the gaps

Laura A. Janda; Olga Lyashevskaya; Tore Nesset; Ekaterina V. Rakhilina; Francis M. Tyers

The Russian Constructicon project currently prioritizes multi-word constructions that are not represented in dictionaries and that are especially useful for learners of Russian. The immediate goal is to identify constructions and determine the semantic constraints on their slots. The Russian Constructicon is being built in parallel with the Swedish Constructicon and will ultimately model the entire Russian language in terms of constructions at all levels from morpheme to discourse. The contents of the Russian Constructicon will serve learners of the language, linguists researching both language-internal and typological phenomena, and will also serve language technology applications such as spell checkers and automated readability assessment tools.

Machine Translation | 2018

The ARIEL-CMU situation frame detection pipeline for LoReHLT16: a model translation approach

Patrick Littell; Tian Tian; Ruochen Xu; Zaid A. W. Sheikh; David R. Mortensen; Lori S. Levin; Francis M. Tyers; Hiroaki Hayashi; Graham Horwood; Steve Sloto; Emily Tagtow; Alan W. Black; Yiming Yang; Teruko Mitamura; Eduard H. Hovy

The LoReHLT16 evaluation challenged participants to extract Situation Frames (SFs)—structured descriptions of humanitarian need situations—from monolingual Uyghur text. The ARIEL-CMU SF detector combines two classification paradigms, a manually curated keyword-spotting system and a machine learning classifier. These were applied by translating the models on a per-feature basis, rather than translating the input text. The resulting combined model provides the accuracy of human insight with the generality of machine learning, and is relatively tractable to human analysis and error correction. Other factors contributing to success were automatic dictionary creation, the use of phonetic transcription, detailed, hand-written morphological analysis, and naturalistic glossing for error analysis by humans. The ARIEL-CMU SF pipeline produced the top-scoring LoReHLT16 situation frame detection systems for the metrics SFType, SFType+Place+Need, SFType+Place+Relief, and SFType+Place+Urgency, at each of the three checkpoints.

Procedia Computer Science | 2017

A morphological analyser for Maltese

Vinit Ravishankar; Francis M. Tyers; Albert Gatt

This article describes the development of a free/open-source morphological description of Maltese, originally created as the analysis component in a rule-based machine translation system for Maltese to Arabic and later applied to other tasks. The lexicon formalism we use is lttoolbox, part of the Apertium machine translation platform. An evaluation of the analyser shows that the coverage is adequate, at 84.90%, while precision is 92.5% on a large automatically annotated test set and 96.2% on a smaller hand-validated set.

Proceedings of the Third Workshop on Computational Linguistics for Uralic Languages | 2017

Annotation schemes in North Sámi dependency parsing

Francis M. Tyers; Mariya Sheyanova

In this paper we describe a comparison of two annotation schemes for dependency parsing of North Sámi, a Finno-Ugric language spoken in the north of Scandinavia and Finland. The two annotation schemes are the Giellatekno (GT) scheme, which has been used in research and applications for the Sámi languages, and Universal Dependencies (UD), which is a cross-lingual scheme aiming to unify annotation across languages. We show that we are able to deterministically convert from the Giellatekno scheme to the Universal Dependencies scheme without a loss of parsing performance. While we do not claim that either scheme is a priori a more adequate model of North Sámi syntax, we do argue that the choice of annotation scheme is dependent on the intended application. This work is licensed under a Creative Commons Attribution–NoDerivatives 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by-nd/4.0/

Proceedings of the First Celtic Language Technology Workshop | 2014

Subsegmental language detection in Celtic language text

Akshay Minocha; Francis M. Tyers

This paper describes an experiment to perform language identification on a sub-sentence basis. The typical case of language identification is to detect the language of documents or sentences. However, it may be the case that a single sentence or segment contains more than one language. This is especially the case in texts where code switching occurs.
