
Publications


Featured research published by Juan Antonio Pérez-Ortiz.


Machine Translation | 2011

Apertium: a free/open-source platform for rule-based machine translation

Mikel L. Forcada; Mireia Ginestí-Rosell; Jacob Nordfalk; Jim O'Regan; Sergio Ortiz-Rojas; Juan Antonio Pérez-Ortiz; Felipe Sánchez-Martínez; Gema Ramírez-Sánchez; Francis M. Tyers

Apertium is a free/open-source platform for rule-based machine translation. It is being widely used to build machine translation systems for a variety of language pairs, especially in those cases (mainly with related-language pairs) where shallow transfer suffices to produce good quality translations, although it has also proven useful in assimilation scenarios with more distant pairs involved. This article summarises the Apertium platform: the translation engine, the encoding of linguistic data, and the tools developed around the platform. The present limitations of the platform and the challenges posed for the coming years are also discussed. Finally, evaluation results for some of the most active language pairs are presented. An appendix describes Apertium as a free/open-source project.
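The shallow-transfer rationale described above (analysis, part-of-speech tagging, word-for-word lexical transfer refined by shallow structural rules, generation) can be sketched as a chain of simple stages. The following is a minimal toy illustration in Python; all lexicons, names, and data are hypothetical stand-ins, not Apertium's actual modules or API:

```python
# Toy shallow-transfer MT pipeline in the spirit of Apertium's design:
# analyse -> disambiguate -> transfer (word-for-word) -> generate.
# Every dictionary below is a tiny made-up example.

ANALYSES = {             # surface form -> possible (lemma, tag) readings
    "la":     [("el", "det.f")],
    "casa":   [("casa", "n.f"), ("casar", "vb.pres.3sg")],
    "blanca": [("blanco", "adj.f")],
}

BIDIX = {                # source (lemma, tag) -> target lemma
    ("el", "det.f"): "a",
    ("casa", "n.f"): "casa",
    ("blanco", "adj.f"): "branco",
}

GENERATE = {             # target (lemma, tag) -> target surface form
    ("a", "det.f"): "a",
    ("casa", "n.f"): "casa",
    ("branco", "adj.f"): "branca",
}

def disambiguate(readings):
    # Stand-in for the hidden-Markov-model tagger: take the first reading.
    return readings[0]

def translate(sentence):
    out = []
    for word in sentence.split():
        lemma, tag = disambiguate(ANALYSES[word])
        tgt_lemma = BIDIX[(lemma, tag)]
        # A real system would apply shallow structural-transfer rules
        # (chunk-level reordering and agreement) here; this toy keeps
        # source order and reuses the source tag for generation.
        out.append(GENERATE[(tgt_lemma, tag)])
    return " ".join(out)

print(translate("la casa blanca"))   # -> "a casa branca"
```

The point of the sketch is the division of labour: each stage consults only local, finite-state-like information, which is what makes the approach fast and easy to correct for closely related language pairs.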


Processing of the Portuguese Language | 2006

Open-Source Portuguese–Spanish Machine Translation

Carme Armentano-Oller; Rafael C. Carrasco; Antonio M. Corbí-Bellot; Mikel L. Forcada; Mireia Ginestí-Rosell; Sergio Ortiz-Rojas; Juan Antonio Pérez-Ortiz; Gema Ramírez-Sánchez; Felipe Sánchez-Martínez; Miriam A. Scalco

This paper describes the current status of development of an open-source shallow-transfer machine translation (MT) system for the [European] Portuguese↔Spanish language pair, developed using the OpenTrad Apertium MT toolbox (www.apertium.org). Apertium uses finite-state transducers for lexical processing, hidden Markov models for part-of-speech tagging, and finite-state-based chunking for structural transfer, and is based on a simple rationale: to produce fast, reasonably intelligible and easily correctable translations between related languages, it suffices to use an MT strategy which uses shallow parsing techniques to refine word-for-word MT. This paper briefly describes the MT engine, the formats it uses for linguistic data, and the compilers that convert these data into an efficient format used by the engine, and then goes on to describe in more detail the pilot Portuguese↔Spanish linguistic data.


Neural Networks | 2003

Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets

Juan Antonio Pérez-Ortiz; Felix A. Gers; Douglas Eck; Jürgen Schmidhuber

The long short-term memory (LSTM) network trained by gradient descent solves difficult problems which traditional recurrent neural networks in general cannot. We have recently observed that the decoupled extended Kalman filter training algorithm allows for even better performance, significantly reducing the number of training steps when compared to the original gradient descent training algorithm. In this paper we present a set of experiments which are unsolvable by classical recurrent networks but which are solved elegantly, robustly and quickly by LSTM combined with Kalman filters.


International Symposium on Neural Networks | 2001

Part-of-speech tagging with recurrent neural networks

Juan Antonio Pérez-Ortiz; Mikel L. Forcada

This paper explores the use of discrete-time recurrent neural networks for part-of-speech disambiguation of textual corpora. Our approach does not need a hand-tagged text for training the tagger, and is probably the first neural approach to do so. Preliminary results show that the performance of this approach is, at least, similar to that of a standard hidden Markov model trained using the Baum-Welch algorithm.


Machine Translation | 2008

Using target-language information to train part-of-speech taggers for machine translation

Felipe Sánchez-Martínez; Juan Antonio Pérez-Ortiz; Mikel L. Forcada

Although corpus-based approaches to machine translation (MT) are growing in interest, they are not applicable when the translation involves less-resourced language pairs for which there are no parallel corpora available; in those cases, the rule-based approach is the only applicable solution. Most rule-based MT systems make use of part-of-speech (PoS) taggers to solve the PoS ambiguities in the source-language texts to translate; those MT systems require accurate PoS taggers to produce reliable translations in the target language (TL). The standard statistical approach to PoS ambiguity resolution (or tagging) uses hidden Markov models (HMM) trained in a supervised way from hand-tagged corpora, an expensive resource not always available, or in an unsupervised way through the Baum-Welch expectation-maximization algorithm; both methods use information only from the language being tagged. However, when tagging is considered as an intermediate task for the translation procedure, that is, when the PoS tagger is to be embedded as a module within an MT system, information from the TL can be (unsupervisedly) used in the training phase to increase the translation quality of the whole MT system. This paper presents a method to train HMM-based PoS taggers to be used in MT; the new method uses not only information from the source language (SL), as general-purpose methods do, but also information from the TL and from the remaining modules of the MT system in which the PoS tagger is to be embedded. We find that the translation quality of the MT system embedding a PoS tagger trained in an unsupervised manner through this new method is clearly better than that of the same MT system embedding a PoS tagger trained through the Baum-Welch algorithm, and comparable to that obtained by embedding a PoS tagger trained in a supervised way from hand-tagged corpora.


International Conference on Artificial Neural Networks | 2001

Online Symbolic-Sequence Prediction with Discrete-Time Recurrent Neural Networks

Juan Antonio Pérez-Ortiz; Jorge Calera-Rubio; Mikel L. Forcada

This paper studies the use of discrete-time recurrent neural networks for predicting the next symbol in a sequence. The focus is on online prediction, a task much harder than the classical offline grammatical inference with neural networks. The results obtained show that the performance of recurrent networks working online is acceptable when sequences come from finite-state machines or even from some chaotic sources. When predicting texts in human language, however, dynamics seem to be too complex to be correctly learned in real time by the net. Two algorithms are considered for network training: real-time recurrent learning and the decoupled extended Kalman filter.


The Prague Bulletin of Mathematical Linguistics | 2010

Tradubi: Open-Source Social Translation for the Apertium Machine Translation Platform

Víctor M. Sánchez-Cartagena; Juan Antonio Pérez-Ortiz


Computer Speech & Language | 2015

A generalised alignment template formalism and its application to the inference of shallow-transfer machine translation rules from scarce bilingual corpora

Víctor M. Sánchez-Cartagena; Juan Antonio Pérez-Ortiz; Felipe Sánchez-Martínez


The Prague Bulletin of Mathematical Linguistics | 2010

ScaleMT: a Free/Open-Source Framework for Building Scalable Machine Translation Web Services

Víctor M. Sánchez-Cartagena; Juan Antonio Pérez-Ortiz


International Conference on Natural Language Processing | 2004

Exploring the Use of Target-Language Information to Train the Part-of-Speech Tagger of Machine Translation Systems

Felipe Sánchez-Martínez; Juan Antonio Pérez-Ortiz; Mikel L. Forcada
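The core idea behind the target-language-information line of work listed above can be caricatured in a few lines: when a source word is ambiguous, prefer the tag whose resulting translation looks most plausible in the target language. The sketch below is a deliberately tiny illustration with made-up dictionaries and scores; the actual method in those papers trains an HMM tagger unsupervisedly, which this toy does not attempt:

```python
# Toy illustration: resolve a source-language PoS ambiguity by checking
# which candidate translation is more likely in the target language.
# All entries and scores below are hypothetical.

# Spanish "vino" is ambiguous: noun ("wine") vs. verb ("came").
TRANSLATIONS = {
    ("vino", "noun"): "wine",
    ("vino", "verb"): "came",
}

# Stand-in for a target-language model: log-probabilities of bigrams.
TL_BIGRAM_SCORE = {
    ("he", "came"): -1.0,
    ("he", "wine"): -8.0,
}

def best_tag(word, preceding_translation):
    """Pick the tag whose translation scores highest in TL context."""
    def score(tag):
        tl_word = TRANSLATIONS[(word, tag)]
        return TL_BIGRAM_SCORE.get((preceding_translation, tl_word), -20.0)
    return max(("noun", "verb"), key=score)

print(best_tag("vino", "he"))   # -> "verb"
```

The design point is that the tagger is judged not on tagging accuracy in isolation but on the quality of the translations it induces, so the target-language signal can replace hand-tagged training data.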

Collaboration


Dive into Juan Antonio Pérez-Ortiz's collaborations.

Top Co-Authors

Felix A. Gers

Dalle Molle Institute for Artificial Intelligence Research

Jürgen Schmidhuber

Dalle Molle Institute for Artificial Intelligence Research

Douglas Eck

Université de Montréal
