David Picó

Polytechnic University of Valencia

Publication

Featured research published by David Picó.

Computer Speech & Language | 2004

Some approaches to statistical and finite-state speech-to-speech translation

Francisco Casacuberta; Hermann Ney; Franz Josef Och; Enrique Vidal; Juan Miguel Vilar; Sergio Barrachina; I. García-Varea; D. Llorens; César Martínez; Sirko Molau; Francisco Nevado; Moisés Pastor; David Picó; Alberto Sanchis; C. Tillmann

Speech-input translation can be properly approached as a pattern recognition problem by means of statistical alignment models and stochastic finite-state transducers. Under this general framework, some specific models are presented. One of the features of such models is their capability of automatically learning from training examples. Moreover, the stochastic finite-state transducers permit an integrated architecture similar to the one used in speech recognition. In this case, the acoustic models (hidden Markov models) are embedded into the finite-state transducers, and the translation of a source utterance is the result of a (Viterbi) search on the integrated network. These approaches have been followed in the framework of the European project EuTrans. Translation experiments have been performed from Spanish to English and from Italian to English in an application involving the interaction of a customer with a receptionist at the front desk of a hotel.
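The Viterbi search mentioned in this abstract can be illustrated with a minimal sketch. It assumes text input rather than speech (no acoustic models are embedded), and the toy transducer, its state numbers, its probabilities, and the function name `viterbi_translate` are all hypothetical, not taken from the paper.

```python
def viterbi_translate(transitions, start, finals, words):
    """Return (output_words, probability) of the best path, or None.

    transitions maps (state, input_word) to a list of
    (next_state, output_words, probability) edges.
    finals maps each final state to its stopping probability.
    """
    # best[state] = (probability, output so far) of the best path reaching state
    best = {start: (1.0, [])}
    for word in words:
        reached = {}
        for state, (prob, out) in best.items():
            for nxt, emitted, edge_prob in transitions.get((state, word), []):
                cand = (prob * edge_prob, out + list(emitted))
                # Keep only the most probable path into each state.
                if nxt not in reached or cand[0] > reached[nxt][0]:
                    reached[nxt] = cand
        best = reached
    complete = [(p * finals[s], out) for s, (p, out) in best.items() if s in finals]
    if not complete:
        return None  # the transducer does not accept this input
    prob, out = max(complete, key=lambda c: c[0])
    return out, prob


# Hypothetical toy transducer for a hotel front-desk phrase.
toy = {
    (0, "una"): [(1, ["a"], 1.0)],
    (1, "habitación"): [(2, [], 0.6), (3, ["room"], 0.4)],
    (2, "doble"): [(4, ["double", "room"], 1.0)],
    (3, "doble"): [(4, ["with", "two", "beds"], 1.0)],
}
result = viterbi_translate(toy, 0, {4: 1.0}, ["una", "habitación", "doble"])
print(result)  # (['a', 'double', 'room'], 0.6)
```

In an integrated speech system the input symbols would be scored by acoustic models instead of being matched exactly; the sketch keeps only the search over the translation network.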

International Conference on Acoustics, Speech, and Signal Processing | 2001

Speech-to-speech translation based on finite-state transducers

Francisco Casacuberta; David Llorens; Carlos Martinez; Sirko Molau; Francisco Nevado; Hermann Ney; Moisés Pastor; David Picó; Alberto Sanchis; Enrique Vidal; Juan Miguel Vilar

Nowadays, the most successful speech recognition systems are based on stochastic finite-state networks (hidden Markov models and n-grams). Speech translation can be accomplished in a similar way to speech recognition. Stochastic finite-state transducers, which are specific stochastic finite-state networks, have proved well suited to translation modeling. In this work a speech-to-speech translation system, the EuTRANS system, is presented. The acoustic, language and translation models are finite-state networks that are automatically learnt from training samples. This system was assessed in a series of translation experiments from Spanish to English and from Italian to English in an application involving the interaction (by telephone) of a customer with a receptionist at the front desk of a hotel.

Pattern Recognition | 2005

Inference of finite-state transducers from regular languages

Francisco Casacuberta; Enrique Vidal; David Picó

Finite-state transducers are models that are being used in different areas of pattern recognition and computational linguistics. One of these areas is machine translation, where approaches based on building models automatically from training examples are becoming more and more attractive. Finite-state transducers are well suited to constrained tasks where training samples of pairs of sentences are available. A technique to infer finite-state transducers is proposed in this work. This technique is based on formal relations between finite-state transducers and finite-state grammars. Given a training corpus of input-output pairs of sentences, the proposed approach uses statistical alignment methods to produce a set of conventional strings from which a stochastic finite-state grammar is inferred. This grammar is finally transformed into a resulting finite-state transducer. The proposed methods are assessed through a series of machine translation experiments within the framework of the EUTRANS project.
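The first step described in this abstract, turning aligned sentence pairs into conventional strings, might be sketched as follows. The abstract does not spell out the labelling rule, so the monotone rule used here (each target word is attached to the furthest source position seen so far in its alignment) is an assumption, as are the function names and the example alignment. The bigram counts stand in for the grammar-inference step.

```python
from collections import Counter


def to_extended_symbols(source, target, alignment):
    """Turn one aligned sentence pair into a string of extended symbols.

    alignment[j] is the index of the source word aligned to target word j.
    Assumed rule: to preserve target word order, target word j is attached
    to source position max(alignment[0..j]). Source must be non-empty.
    """
    attached = [[] for _ in source]
    furthest = 0
    for j, word in enumerate(target):
        furthest = max(furthest, alignment[j])
        attached[furthest].append(word)
    # Each extended symbol pairs one source word with zero or more target words.
    return [(s, tuple(t)) for s, t in zip(source, attached)]


def bigram_counts(strings):
    """Count bigrams of extended symbols, with sentence boundary markers."""
    counts = Counter()
    for symbols in strings:
        padded = ["<s>"] + symbols + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts


pair = to_extended_symbols(
    ["una", "habitación", "doble"],
    ["a", "double", "room"],
    [0, 2, 1],  # hypothetical alignment: a-una, double-doble, room-habitación
)
print(pair)
# [('una', ('a',)), ('habitación', ()), ('doble', ('double', 'room'))]
```

Splitting each extended symbol back into its source word and target words gives the input and output labels of a transducer edge, which is the sense in which a grammar over these strings can be transformed into a transducer.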

Lecture Notes in Computer Science | 2004

A Syntactic Pattern Recognition Approach to Computer Assisted Translation

Jorge Civera; Juan Miguel Vilar; Elsa Cubel; Antonio L. Lagarda; Sergio Barrachina; Francisco Casacuberta; Enrique Vidal; David Picó; Jorge González

It is a fact that current methodologies for automatic translation cannot be expected to produce high quality translations. An alternative approach is to use them as an aid to manual translation. We focus on a possible way to help human translators: to interactively provide completions for the parts of the sentences already translated. We explain how finite state transducers can be used for this task and show experiments in which the keystrokes needed to translate printer manuals were reduced to nearly 25% of the original.
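The interaction described in this abstract, and the way keystroke savings can be measured, can be illustrated with a small simulation. This is only a sketch: the paper uses finite-state transducers to generate completions, whereas the code below assumes a fixed list of candidate translations as a stand-in, and the function names and example sentences are hypothetical.

```python
def complete(prefix, candidates):
    """Return the suffix of the first candidate that extends prefix, or None."""
    for cand in candidates:
        if cand.startswith(prefix):
            return cand[len(prefix):]
    return None


def keystrokes_needed(reference, candidates):
    """Simulate a translator who wants to produce `reference`.

    At each step the system proposes a completion of what has been typed.
    The user accepts it with one keystroke if it yields the reference,
    and otherwise types the next character of the reference.
    """
    typed = ""
    strokes = 0
    while typed != reference:
        suffix = complete(typed, candidates)
        if suffix is not None and typed + suffix == reference:
            strokes += 1  # one keystroke to accept the suggestion
            break
        typed += reference[len(typed)]  # type the next character
        strokes += 1
    return strokes


candidates = ["press the red button", "press the power button"]
reference = "press the power button"
print(keystrokes_needed(reference, candidates), "of", len(reference))  # 12 of 22
```

The ratio of simulated keystrokes to the length of the reference is one simple way to express the kind of saving the abstract reports.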

Machine Learning | 2001

Some Statistical-Estimation Methods for Stochastic Finite-State Transducers

David Picó; Francisco Casacuberta

Formal translations constitute a suitable framework for dealing with many problems in pattern recognition and computational linguistics. The application of formal transducers to these areas requires a stochastic extension for dealing with noisy, distorted patterns with high variability. In this paper, some estimation criteria are proposed and developed for the parameter estimation of regular syntax-directed translation schemata. These criteria are: maximum likelihood estimation, minimum conditional entropy estimation and conditional maximum likelihood estimation. The last two criteria were proposed in order to deal with situations when training data is sparse. These criteria take into account the possibility of ambiguity in the translations: i.e., there can be different output strings for a single input string. In this case, the final goal of the stochastic framework is to find the highest probability translation of a given input string. These criteria were tested on a translation task which has a high degree of ambiguity.

international conference natural language processing | 2004

SisHiTra: A Hybrid Machine Translation System from Spanish to Catalan

José R. Navarro; Jorge González; David Picó; Francisco Casacuberta; Joan M. de Val; Ferran Fabregat; Ferran Pla; Jesús Tomás

In the current European scenario, characterized by the coexistence of communities writing and speaking a great variety of languages, machine translation has become a technology of capital importance. In areas of Spain and of other countries, co-officiality of several languages implies producing several versions of public information. Machine translation between all the languages of the Iberian Peninsula and from them into English will allow for a better integration of Iberian linguistic communities with one another and within Europe. The purpose of this paper is to show a machine translation system from Spanish to Catalan that deals with text input. In our approach, both deductive (linguistic) and inductive (corpus-based) methodologies are combined in a homogeneous and efficient framework: finite-state transducers. Some preliminary results show the interest of the proposed architecture.

Lecture Notes in Computer Science | 2004

GIATI: A General Methodology for Finite-State Translation Using Alignments

David Picó; Jesús Tomás; Francisco Casacuberta

Statistical techniques for machine translation have attracted increasing interest from the natural language research community in recent years. Both statistical language modeling and statistical machine translation are now well-established disciplines with a solid basis and outstanding results. On the other hand, finite-state transducers have proven to be an efficient and flexible formalism for representing a wide range of the kinds of information that arise in natural language processing.

Lecture Notes in Computer Science | 2000

A Statistical-Estimation Method for Stochastic Finite-State Transducers Based on Entropy Measures

David Picó; Francisco Casacuberta

The stochastic extension of formal translations constitutes a suitable framework for dealing with many problems in Syntactic Pattern Recognition. Some estimation criteria have already been proposed and developed for the parameter estimation of Regular Syntax-Directed Translation Schemata. Here, a new criterion is proposed for dealing with situations when training data is sparse. This criterion is based on entropy measurements, loosely inspired by the Maximum Mutual Information criterion, and it takes into account the possibility of ambiguity in translations (i.e., the translation model may yield different output strings for a single input string). The goal in the stochastic framework is to find the most probable translation of a given input string. Experiments were performed on a translation task which has a high degree of ambiguity.

International Colloquium on Grammatical Inference | 1998

Transducer-Learning Experiments on Language Understanding

David Picó; Enrique Vidal

Interest in using Finite-State Models in a large variety of applications has recently been growing as more powerful techniques for learning them from examples have been developed. Language Understanding can be approached this way as a problem of language translation in which the target language is a formal language rather than a natural one. Finite-state transducers are used to model the translation process, and are automatically learned from training data consisting of pairs of natural-language/formal-language sentences. The need for training data is dramatically reduced by performing a two-level learning process based on lexical/phrase categorization. Successful experiments are presented on a task consisting of the “understanding” of Spanish natural-language sentences describing dates and times, where the target formal language is the one used in the popular Unix command “at”.

Empirical Methods in Natural Language Processing | 2004