Felipe Sánchez-Martínez

Publication

Featured research published by Felipe Sánchez-Martínez.

Machine Translation | 2011

Apertium: a free/open-source platform for rule-based machine translation

Mikel L. Forcada; Mireia Ginestí-Rosell; Jacob Nordfalk; Jim O'Regan; Sergio Ortiz-Rojas; Juan Antonio Pérez-Ortiz; Felipe Sánchez-Martínez; Gema Ramírez-Sánchez; Francis M. Tyers

Apertium is a free/open-source platform for rule-based machine translation. It is being widely used to build machine translation systems for a variety of language pairs, especially in those cases (mainly with related-language pairs) where shallow transfer suffices to produce good quality translations, although it has also proven useful in assimilation scenarios with more distant pairs involved. This article summarises the Apertium platform: the translation engine, the encoding of linguistic data, and the tools developed around the platform. The present limitations of the platform and the challenges posed for the coming years are also discussed. Finally, evaluation results for some of the most active language pairs are presented. An appendix describes Apertium as a free/open-source project.

Processing of the Portuguese Language | 2006

Open-source Portuguese–Spanish machine translation

Carme Armentano-Oller; Rafael C. Carrasco; Antonio M. Corbí-Bellot; Mikel L. Forcada; Mireia Ginestí-Rosell; Sergio Ortiz-Rojas; Juan Antonio Pérez-Ortiz; Gema Ramírez-Sánchez; Felipe Sánchez-Martínez; Miriam A. Scalco

This paper describes the current status of development of an open-source shallow-transfer machine translation (MT) system for the [European] Portuguese ↔ Spanish language pair, developed using the OpenTrad Apertium MT toolbox (www.apertium.org). Apertium uses finite-state transducers for lexical processing, hidden Markov models for part-of-speech tagging, and finite-state-based chunking for structural transfer, and is based on a simple rationale: to produce fast, reasonably intelligible and easily correctable translations between related languages, it suffices to use a MT strategy which uses shallow parsing techniques to refine word-for-word MT. This paper briefly describes the MT engine, the formats it uses for linguistic data, and the compilers that convert these data into an efficient format used by the engine, and then goes on to describe in more detail the pilot Portuguese ↔ Spanish linguistic data.

Journal of Artificial Intelligence Research | 2009

Inferring shallow-transfer machine translation rules from small parallel corpora

Felipe Sánchez-Martínez; Mikel L. Forcada

This paper describes a method for the automatic inference of structural transfer rules to be used in a shallow-transfer machine translation (MT) system from small parallel corpora. The structural transfer rules are based on alignment templates, like those used in statistical MT. Alignment templates are extracted from sentence-aligned parallel corpora and extended with a set of restrictions which are derived from the bilingual dictionary of the MT system and control their application as transfer rules. The experiments conducted using three different language pairs in the free/open-source MT platform Apertium show that translation quality is improved as compared to word-for-word translation (when no transfer rules are used), and that the resulting translation quality is close to that obtained using hand-coded transfer rules. The method we present is entirely unsupervised and benefits from information in the rest of modules of the MT system in which the inferred rules are applied.

Machine Translation | 2008

Using target-language information to train part-of-speech taggers for machine translation

Felipe Sánchez-Martínez; Juan Antonio Pérez-Ortiz; Mikel L. Forcada

Although corpus-based approaches to machine translation (MT) are growing in interest, they are not applicable when the translation involves less-resourced language pairs for which there are no parallel corpora available; in those cases, the rule-based approach is the only applicable solution. Most rule-based MT systems make use of part-of-speech (PoS) taggers to solve the PoS ambiguities in the source-language texts to translate; those MT systems require accurate PoS taggers to produce reliable translations in the target language (TL). The standard statistical approach to PoS ambiguity resolution (or tagging) uses hidden Markov models (HMM) trained in a supervised way from hand-tagged corpora, an expensive resource not always available, or in an unsupervised way through the Baum-Welch expectation-maximization algorithm; both methods use information only from the language being tagged. However, when tagging is considered as an intermediate task for the translation procedure, that is, when the PoS tagger is to be embedded as a module within an MT system, information from the TL can be (unsupervisedly) used in the training phase to increase the translation quality of the whole MT system. This paper presents a method to train HMM-based PoS taggers to be used in MT; the new method uses not only information from the source language (SL), as general-purpose methods do, but also information from the TL and from the remaining modules of the MT system in which the PoS tagger is to be embedded. We find that the translation quality of the MT system embedding a PoS tagger trained in an unsupervised manner through this new method is clearly better than that of the same MT system embedding a PoS tagger trained through the Baum-Welch algorithm, and comparable to that obtained by embedding a PoS tagger trained in a supervised way from hand-tagged corpora.

The Prague Bulletin of Mathematical Linguistics | 2010

Free/open-source resources in the Apertium platform for machine translation research and development

Francis M. Tyers; Felipe Sánchez-Martínez; Sergio Ortiz-Rojas; Mikel L. Forcada

This paper describes the resources available in the Apertium platform, a free/open-source framework for creating rule-based machine translation systems. Resources within the platform take the form of finite-state morphologies for morphological analysis and generation, bilingual transfer lexica, probabilistic part-of-speech taggers and transfer rule files, all in standardised formats. These resources are described and some examples are given of their reuse and recycling in combination with other machine translation systems.

International Conference on Natural Language Processing | 2006

Using alignment templates to infer shallow-transfer machine translation rules

Felipe Sánchez-Martínez; Hermann Ney

When building rule-based machine translation systems, a considerable human effort is needed to code the transfer rules that are able to translate source-language sentences into grammatically correct target-language sentences. In this paper we describe how to adapt the alignment templates used in statistical machine translation to the rule-based machine translation framework. The alignment templates are converted into structural transfer rules that are used by a shallow-transfer machine translation engine to produce grammatically correct translations. As the experimental results show there is a considerable improvement in the translation quality as compared to word-for-word translation (when no transfer rules are used), and the translation quality is close to that achieved when hand-coded transfer rules are used. The method presented is entirely unsupervised, and needs only a parallel corpus, two morphological analysers, and two part-of-speech taggers, such as those used by the machine translation system in which the inferred transfer rules are integrated.

Workshop on Statistical Machine Translation | 2015

UAlacant word-level machine translation quality estimation system at WMT 2015

Miquel Esplà-Gomis; Felipe Sánchez-Martínez; Mikel L. Forcada

Computer Speech & Language | 2015

A generalised alignment template formalism and its application to the inference of shallow-transfer machine translation rules from scarce bilingual corpora

Víctor M. Sánchez-Cartagena; Juan Antonio Pérez-Ortiz; Felipe Sánchez-Martínez

Journal of Artificial Intelligence Research | 2012

Generalized biwords for bitext compression and translation spotting

Felipe Sánchez-Martínez; Rafael C. Carrasco; Miguel A. Martínez-Prieto; Joaquín Adiego

String Processing and Information Retrieval | 2009

A Two-Level Structure for Compressing Aligned Bitexts

Joaquín Adiego; Nieves R. Brisaboa; Miguel A. Martínez-Prieto; Felipe Sánchez-Martínez
