
Elsa Cubel

Polytechnic University of Valencia

Publication

Featured research published by Elsa Cubel.

Computational Linguistics | 2009

Statistical approaches to computer-assisted translation

Sergio Barrachina; Oliver Bender; Francisco Casacuberta; Jorge Civera; Elsa Cubel; Shahram Khadivi; Antonio L. Lagarda; Hermann Ney; Jesús Tomás; Enrique Vidal; Juan Miguel Vilar

Current machine translation (MT) systems are still not perfect. In practice, the output from these systems needs to be edited to correct errors. A way of increasing the productivity of the whole translation process (MT plus human work) is to incorporate the human correction activities within the translation process itself, thereby shifting the MT paradigm to that of computer-assisted translation. This model entails an iterative process in which the human translator activity is included in the loop: in each iteration, a prefix of the translation is validated (accepted or amended) by the human and the system computes its best (or n-best) translation suffix hypothesis to complete this prefix. A successful framework for MT is the so-called statistical (or pattern recognition) framework. Interestingly, within this framework, the adaptation of MT systems to the interactive scenario affects mainly the search process, allowing a great reuse of successful techniques and models. In this article, alignment templates, phrase-based models, and stochastic finite-state transducers are used to develop computer-assisted translation systems. These systems were assessed in a European project (TransType2) in two real tasks: the translation of printer manuals and the translation of the Bulletin of the European Union. In each task, the following three pairs of languages were involved (in both translation directions): English-Spanish, English-German, and English-French.
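
The prefix-and-suffix iteration described in this abstract can be illustrated with a minimal sketch. It assumes a fixed, hypothetical n-best list of scored hypotheses in place of a real search over alignment-template, phrase-based, or finite-state models; the function name `best_suffix` and the example sentences are illustrative and do not come from the article.

```python
from typing import List, Optional, Tuple

# Hypothetical n-best list: (translation, score) pairs, higher score is better.
NBest = List[Tuple[str, float]]


def best_suffix(prefix: str, nbest: NBest) -> Optional[str]:
    """Return the suffix of the best-scoring hypothesis that starts with `prefix`.

    Returns None when no hypothesis is compatible with the validated prefix.
    A real system would search its translation model instead of a fixed list.
    """
    compatible = [(hyp, score) for hyp, score in nbest if hyp.startswith(prefix)]
    if not compatible:
        return None
    hyp, _ = max(compatible, key=lambda pair: pair[1])
    return hyp[len(prefix):]


# Illustrative data only (not taken from the TransType2 corpora).
nbest = [
    ("click the print button", 0.6),
    ("press the print button", 0.3),
    ("press the printing key", 0.1),
]

# Iteration 1: nothing validated yet, so the top hypothesis is proposed in full.
first = best_suffix("", nbest)
# Iteration 2: the translator amends the start; the system completes the new prefix.
second = best_suffix("press the print", nbest)
# Iteration 3: one more typed character rules out the previous completion.
third = best_suffix("press the printi", nbest)
```

Each call corresponds to one iteration of the loop: the validated prefix only restricts which hypotheses remain eligible, which is consistent with the abstract's remark that the interactive scenario mainly affects the search process.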

parallel computing | 2009

Human interaction for high-quality machine translation

Francisco Casacuberta; Jorge Civera; Elsa Cubel; Antonio L. Lagarda; Guy Lapalme; Elliott Macklovitch; Enrique Vidal

Introduction: Translation from a source language into a target language has become a very important activity in recent years, both in official institutions (such as the United Nations and the EU, or in the parliaments of multilingual countries like Canada and Spain) and in the private sector (for example, to translate user manuals or newspaper articles). Prestigious clients such as these cannot make do with approximate translations; for all kinds of reasons, ranging from legal obligations to good marketing practice, they require target-language texts of the highest quality. The task of producing such high-quality translations is a demanding and time-consuming one that is generally conferred to expert human translators. The problem is that, with growing globalization, the demand for high-quality translation has been steadily increasing, to the point where there are just not enough qualified translators available today to satisfy it. This has dramatically raised the need for improved machine translation (MT) technologies. The field of MT has undergone something of a revolution over the last 15 years, with the adoption of empirical, data-driven techniques originally inspired by the success of automatic speech recognition. Given the requisite corpora, it is now possible to develop new MT systems in a fraction of the time and with much less effort than was previously required under the formerly dominant rule-based paradigm. As for the quality of the translations produced by this new generation of MT systems, there has also been considerable progress; generally speaking, however, it remains well below that of human translation. No one would seriously consider directly using the output of even the best of these systems to translate a CV or a corporate Web site, for example, without submitting the machine translation to a careful human revision.
As a result, those who require publication-quality translation are forced to make a difficult choice between systems that are fully automatic but whose output must be attentively post-edited, and computer-assisted translation systems (or CAT tools for short) that allow for high quality but to the detriment of full automation. Currently, the best known CAT tools are translation memory (TM) systems. These systems recycle sentences that have previously been translated, either within the current document or earlier in other documents. This is very useful for highly repetitive texts, but not of much help for the vast majority of texts composed of original materials. Since TM systems were first introduced, very few other types of CAT tools have been forthcoming. Notable exceptions are the TransType system and its successor TransType2 (TT2). These systems represent a novel reworking of the old idea of interactive machine translation (IMT). Initial efforts on TransType are described in detail in Foster; suffice it to say here that the system's principal novelty lies in the fact that the human-machine interaction focuses on the drafting of the target text, rather than on the disambiguation of the source text, as in all former IMT systems. In the TT2 project, this idea was further developed. A full-fledged MT engine was embedded in an interactive editing environment and used to generate suggested completions of each target sentence being translated. These completions may be accepted or amended by the translator; but once validated, they are exploited by the MT engine to produce further, hopefully improved suggestions. This is in marked contrast with traditional MT, where typically the system is first used to produce a complete draft translation of a source text, which is then post-edited (corrected) offline by a human translator. TT2's interactive approach offers a significant advantage over traditional post-editing.
In the latter paradigm, there is no way for the system, which is offline, to benefit from the user's corrections; in TransType, just the opposite is true. As soon as the user begins to revise an incorrect segment, the system immediately responds to that new information by proposing an alternative completion to the target segment, which is compatible with the prefix that the user has input. Another notable feature of the work described in this article is the importance accorded to a formal treatment of human-machine interaction, something that is seldom considered in the now-prevalent framework of statistical pattern recognition.

Lecture Notes in Computer Science | 2004

A Syntactic Pattern Recognition Approach to Computer Assisted Translation

Jorge Civera; Juan Miguel Vilar; Elsa Cubel; Antonio L. Lagarda; Sergio Barrachina; Francisco Casacuberta; Enrique Vidal; David Picó; Jorge González

It is a fact that current methodologies for automatic translation cannot be expected to produce high quality translations. An alternative approach is to use them as an aid to manual translation. We focus on a possible way to help human translators: to interactively provide completions for the parts of the sentences already translated. We explain how finite state transducers can be used for this task and show experiments in which the keystrokes needed to translate printer manuals were reduced to nearly 25% of the original.
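
The abstract reports results in terms of keystrokes saved but does not define the measure. As a rough sketch under stated assumptions, the following simulates a user who either accepts a fully correct completion with a single keystroke or types the next character of the reference translation. The acceptance cost, the helper names `make_suggester` and `keystroke_ratio`, and the example data are all assumptions made for illustration, not the authors' evaluation protocol.

```python
from typing import Callable, List


def make_suggester(hypotheses: List[str]) -> Callable[[str], str]:
    """Build a toy completion function from a ranked list of hypotheses (assumed)."""
    def suggest(prefix: str) -> str:
        # Propose the remainder of the first hypothesis compatible with the prefix.
        for hyp in hypotheses:
            if hyp.startswith(prefix):
                return hyp[len(prefix):]
        return ""
    return suggest


def keystroke_ratio(reference: str, suggest: Callable[[str], str]) -> float:
    """Keystrokes used by the simulated user divided by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    prefix = ""
    keystrokes = 0
    while prefix != reference:
        if prefix + suggest(prefix) == reference:
            # Assumption: accepting a correct completion costs one keystroke.
            prefix = reference
        else:
            # Otherwise the user types the next character of the reference.
            prefix += reference[len(prefix)]
        keystrokes += 1
    return keystrokes / len(reference)


reference = "press the print button"
with_help = keystroke_ratio(
    reference,
    make_suggester(["click the print button", "press the print button"]),
)
without_help = keystroke_ratio(reference, make_suggester([]))
```

With no usable suggestions the ratio is 1.0 (every character is typed); in the toy example the user types one character and then accepts, so only 2 of 22 keystrokes are needed.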

finite state methods and natural language processing | 2005

A Novel Approach to Computer-Assisted Translation Based on Finite-State Transducers

Jorge Civera; Juan Miguel Vilar; Elsa Cubel; Antonio L. Lagarda; Sergio Barrachina; Francisco Casacuberta; Enrique Vidal

Computer-Assisted Translation (CAT) is an alternative approach to Machine Translation that integrates human expertise into the automatic translation process. In this framework, a human translator interacts with a translation system that dynamically offers a list of translations that best complete the part of the sentence already translated. Stochastic finite-state transducer technology is proposed to support this CAT system. The system was assessed on two real tasks of different complexity in several languages.

iberian conference on pattern recognition and image analysis | 2007

Bilingual Text Classification

Jorge Civera; Elsa Cubel; Enrique Vidal

Bilingual documentation has become a common phenomenon in official institutions and private companies. In this scenario, the categorization of bilingual text is a useful tool. In this paper, different approaches will be proposed to tackle this bilingual classification task. On the one hand, three finite-state transducer algorithms from the grammatical inference framework will be presented. On the other hand, a naive combination of smoothed n-gram models will be introduced. To evaluate the performance of bilingual classifiers, two categorized bilingual corpora of different complexity were considered. Experiments in a limited-domain task show that all the models obtain similar results. However, results on a more open-domain task indicate the superiority of the naive approach.

iberian conference on pattern recognition and image analysis | 2005

Different approaches to bilingual text classification based on grammatical inference techniques

Jorge Civera; Elsa Cubel; Alfons Juan; Enrique Vidal

Bilingual documentation has become a common phenomenon in many official institutions and private companies. In this scenario, the categorization of bilingual text is a useful tool that can also be applied in the machine translation field. To tackle this classification task, different approaches will be proposed. On the one hand, two finite-state transducer algorithms from the grammatical inference domain will be discussed. On the other hand, the well-known naive Bayes approximation will be presented along with a possible modeling based on n-gram language models. Experiments carried out on a bilingual corpus have demonstrated the adequacy of these methods and the relevance of a second information source in text classification, as supported by classification error rates. A relative reduction of 29% with respect to the best previous results on the monolingual version of the same task has been obtained.
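
One way to picture the naive Bayes idea for bilingual text is the sketch below. It assumes the simplest case, add-one smoothed unigram models (n = 1) for each language, combined by adding their log-probabilities to the class prior. The class name `BilingualNaiveBayes`, the smoothing choice, and the toy sentences are assumptions for illustration; the paper's actual models and corpora may differ.

```python
import math
from collections import Counter, defaultdict


class BilingualNaiveBayes:
    """Naive Bayes over (source, target) sentence pairs with unigram models."""

    def __init__(self):
        self.class_counts = Counter()
        # Per class: one word counter for the source side, one for the target side.
        self.word_counts = defaultdict(lambda: (Counter(), Counter()))
        self.vocab = (set(), set())

    def fit(self, samples):
        for src, tgt, label in samples:
            self.class_counts[label] += 1
            for side, text in enumerate((src, tgt)):
                tokens = text.split()
                self.word_counts[label][side].update(tokens)
                self.vocab[side].update(tokens)
        return self

    def _log_likelihood(self, label, side, text):
        counts = self.word_counts[label][side]
        total = sum(counts.values())
        vsize = len(self.vocab[side]) + 1  # one extra slot for unseen words
        # Add-one smoothing (an assumption, chosen for brevity).
        return sum(math.log((counts[w] + 1) / (total + vsize)) for w in text.split())

    def predict(self, src, tgt):
        n = sum(self.class_counts.values())

        def score(label):
            # Naive combination: both languages are treated as independent given the class.
            return (math.log(self.class_counts[label] / n)
                    + self._log_likelihood(label, 0, src)
                    + self._log_likelihood(label, 1, tgt))

        return max(self.class_counts, key=score)


# Toy training pairs, invented for illustration.
train = [
    ("print the page", "imprimir la página", "printer"),
    ("replace the toner", "cambiar el tóner", "printer"),
    ("the council adopted the regulation", "el consejo adoptó el reglamento", "bulletin"),
    ("the commission proposed a directive", "la comisión propuso una directiva", "bulletin"),
]
clf = BilingualNaiveBayes().fit(train)
label_a = clf.predict("replace the page", "cambiar la página")
label_b = clf.predict("the council proposed a regulation",
                      "el consejo propuso un reglamento")
```

The second language enters the score as an additional term, which is one simple reading of the abstract's point that a second information source can help classification.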

empirical methods in natural language processing | 2004

From Machine Translation to Computer Assisted Translation using Finite-State Models.

Jorge Civera; Elsa Cubel; Antonio L. Lagarda; David Picó; Jorge González; Enrique Vidal; Francisco Casacuberta; Juan Miguel Vilar; Sergio Barrachina

european conference on artificial intelligence | 2004

Finite-state models for computer assisted translation

Elsa Cubel; Jorge Civera; Juan Miguel Vilar; Antonio L. Lagarda; Francisco Casacuberta; Enrique Vidal; David Picó; Jorge González; Luis Javier Rodríguez

Archive | 2003