Maite Melero
Pompeu Fabra University
Publication
Featured research published by Maite Melero.
Machine Translation | 2008
Michael Carl; Maite Melero; Toni Badia; Vincent Vandeghinste; Peter Dirix; Ineke Schuurman; Stella Markantonatou; Sokratis Sofianopoulos; Marina Vassiliou; Olga Yannoutsou
METIS-II was an EU-FET MT project running from October 2004 to September 2007, which aimed at translating free text input without resorting to parallel corpora. The idea was to use “basic” linguistic tools and representations and to link them with patterns and statistics from the monolingual target-language corpus. The METIS-II project has four partners, translating from their “home” languages Greek, Dutch, German, and Spanish into English. The paper outlines the basic ideas of the project, their implementation, the resources used, and the results obtained. It also gives examples of how METIS-II has continued beyond its lifetime and the original scope of the project. On the basis of the results and experiences obtained, we believe that the approach is promising and offers the potential for development in various directions.
The Prague Bulletin of Mathematical Linguistics | 2012
Christian Federmann; Maite Melero; Pavel Pecina; Josef van Genabith
Towards Optimal Choice Selection for Improved Hybrid Machine Translation
In recent years, machine translation (MT) research has focused on investigating how hybrid MT as well as MT combination systems can be designed so that the resulting translations give an improvement over the individual translations. As a first step towards achieving this objective we have developed a parallel corpus with source data and the output of a number of MT systems, annotated with metadata information, capturing aspects of the translation process performed by the different MT systems. As a second step, we have organised a shared task in which participants were requested to build Hybrid/System Combination systems using the annotated corpus as input. The main focus of the shared task is trying to answer the following question: Can Hybrid MT algorithms or System Combination techniques benefit from the extra information (linguistically motivated, decoding and runtime) from the different systems involved? In this paper, we describe the annotated corpus we have created. We provide an overview of the participating systems from the shared task as well as a discussion of the results.
international conference on computational linguistics | 2014
Jens Grivolla; Maite Melero; Toni Badia; Cosmin Cabulea; Yannick Estève; Eelco Herder; Jean-Marc Odobez; Susanne Preuss; Raúl Marín
The EUMSSI project (Event Understanding through Multimodal Social Stream Interpretation) aims at developing technologies for aggregating data presented as unstructured information in sources of very different nature. The multimodal analytics will help organize, classify and cluster cross-media streams, by enriching their associated metadata in an interactive manner, so that the data resulting from analysing one medium helps reinforce the aggregation of information from other media, in a cross-modal semantic representation framework. Once all the available descriptive information has been collected, an interpretation component will dynamically reason over the semantic representation in order to derive implicit knowledge. Finally, the enriched information will be fed to a hybrid recommendation system, which will be the basis of two well-motivated use-cases. In this paper we give a brief overview of EUMSSI’s main goals and how we are approaching its implementation using UIMA to integrate and combine various layers of annotations coming from different sources.
Natural Language Engineering | 2016
Maite Melero; Marta Ruiz Costa-Jussà; Patrik Lambert; Martí Quixal
We present research aiming to build tools for the normalization of User-Generated Content (UGC). We argue that processing this type of text requires revisiting the initial steps of Natural Language Processing, since UGC (micro-blog, blog, and, generally, Web 2.0 user-generated texts) presents a number of nonstandard communicative and linguistic characteristics, often closer to oral and colloquial language than to edited text. We present a corpus of UGC text in Spanish from three different sources: Twitter, consumer reviews, and blogs, and describe its main characteristics. We motivate the need for UGC text normalization by analyzing the problems found when processing this type of text through a conventional language processing pipeline, particularly in the tasks of lemmatization and morphosyntactic tagging. Our aim with this paper is to harness the power of already existing spell and grammar correction engines and endow them with automatic normalization capabilities in order to pave the way for the application of standard Natural Language Processing tools to typical UGC text. In particular, we propose a strategy for automatically normalizing UGC by adding a module on top of a pre-existing spell-checker that selects the most plausible correction from an unranked list of candidates provided by the spell-checker. To build this selector module we train four language models, each one containing a different type of linguistic information in a trade-off with its generalization capabilities. Our experiments show that the models trained on truecase and lowercase word forms are more discriminative than the others at selecting the best candidate. We have also experimented with a parametrized combination of the models by both optimizing directly on the selection task and doing a linear interpolation of the models. The resulting parametrized combinations obtain results close to the best performing model but do not improve on those results, as measured on the test set. The precision of the selector module in ranking the expected correction first on the test corpora reaches 82.5% for Twitter text (baseline 57%) and 88% for non-Twitter text (baseline 64%).
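The candidate-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the `UnigramLM` class, the `select_candidate` function, the toy corpus, and the equal weights are all hypothetical, and only two context-free unigram models (truecase and lowercase) stand in for the four language models the abstract mentions. It shows how a linear interpolation of model probabilities could pick one correction from an unranked list of spell-checker candidates.

```python
from collections import Counter


class UnigramLM:
    """Toy add-one smoothed unigram model (hypothetical stand-in for a real LM)."""

    def __init__(self, sentences, transform=lambda w: w):
        # transform lets the same class serve as a truecase or lowercase model
        self.transform = transform
        self.counts = Counter(transform(w) for s in sentences for w in s.split())
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1  # one extra slot for unseen words

    def prob(self, word):
        return (self.counts[self.transform(word)] + 1) / (self.total + self.vocab)


def select_candidate(candidates, models, weights):
    """Return the candidate with the highest linearly interpolated probability."""
    def score(candidate):
        return sum(w * m.prob(candidate) for m, w in zip(models, weights))
    return max(candidates, key=score)


# Tiny illustrative corpus (assumption, not data from the paper)
corpus = ["que tal estas", "que haces hoy", "no se que hacer"]
truecase = UnigramLM(corpus)
lowercase = UnigramLM(corpus, transform=str.lower)

# Unranked candidates as a spell-checker might propose them
best = select_candidate(["Que", "que", "queso"], [truecase, lowercase], [0.5, 0.5])
print(best)
```

In a realistic setting the models would score each candidate in its sentence context, and the interpolation weights would be tuned on held-out data rather than fixed by hand.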
language resources and evaluation | 2008
Vincent Vandeghinste; Peter Dirix; Ineke Schuurman; Stella Markantonatou; Sokratis Sofianopoulos; Marina Vassiliou; Olga Yannoutsou; Toni Badia; Maite Melero; Gemma Boleda; Michael Carl; Paul Schmidt
Archive | 2005
Toni Badia; Gemma Boleda; Maite Melero; Antony W Oliver
language resources and evaluation | 2012
Maite Melero; Marta R. Costa-Jussà; Judith Domingo; Montse Marquina; Martí Quixal
language resources and evaluation | 2016
Marta Villegas; Maite Melero; Núria Bel; Jorge Gracia
language resources and evaluation | 2012
Eleftherios Avramidis; Marta R. Costa-Jussà; Christian Federmann; Josef van Genabith; Maite Melero; Pavel Pecina
language resources and evaluation | 2014
Marta Villegas; Maite Melero; Núria Bel