Publication


Featured research published by Ester Boldrini.


Meeting of the Association for Computational Linguistics | 2009

Opinion and Generic Question Answering Systems: a Performance Analysis

Alexandra Balahur; Ester Boldrini; Andrés Montoyo; Patricio Martínez-Barco

The importance of new textual genres such as blogs or forum entries is growing in parallel with the evolution of the Social Web. This paper presents two corpora of blog posts in English and in Spanish, annotated according to the EmotiBlog annotation scheme. Furthermore, we created 20 factual and opinionated questions for each language, together with the Gold Standard for their answers in the corpus. The purpose of our work is to study the challenges involved in a mixed fact and opinion question answering setting by comparing the performance of two Question Answering (QA) systems in that setting. The first one is open domain, while the second one is opinion-oriented. We evaluate the two systems separately in both languages and propose possible solutions to improve QA systems that have to process mixed questions.


Expert Systems With Applications | 2015

A novel concept-level approach for ultra-concise opinion summarization

Elena Lloret; Ester Boldrini; Tatiana Vodolazova; Patricio Martínez-Barco; Rafael Muñoz; Manuel Palomar

The task of ultra-concise opinion summarization is addressed. Syntactic simplification, sentence regeneration and concept representation are used. Our approach outperforms a number of state-of-the-art systems. The best readability results using simplification are around 2.83 out of 3.

The Web 2.0 has resulted in a shift in how users consume and interact with information, and has introduced a wide range of new textual genres, such as reviews or microblogs, through which users communicate, exchange, and share opinions. The exploitation of all this user-generated content is of great value both for users and companies, in order to assist them in their decision-making processes. Given this context, the analysis and development of automatic methods that can help manage online information more quickly are needed. Therefore, this article proposes and evaluates a novel concept-level approach for ultra-concise abstractive opinion summarization. Our approach is characterized by the integration of syntactic sentence simplification, sentence regeneration and internal concept representation into the summarization process, thus being able to generate abstractive summaries, which is one of the most challenging issues for this task. In order to analyze different settings for our approach, the use of the sentence regeneration module was made optional, leading to two different versions of the system (one with sentence regeneration and one without). For testing them, a corpus of 400 English texts, gathered from reviews and tweets belonging to two different domains, was used. Although both versions were shown to be reliable methods for generating this type of summary, the results obtained indicate that the version without sentence regeneration yielded better results, improving the results of a number of state-of-the-art systems by 9%, whereas the version with sentence regeneration proved to be more robust to noisy data.


Data Mining and Knowledge Discovery | 2012

Using EmotiBlog to annotate and analyse subjectivity in the new textual genres

Ester Boldrini; Alexandra Balahur; Patricio Martínez-Barco; Andrés Montoyo

Thanks to the increasing amount of subjective data on the Web 2.0, tools to manage and exploit such data have become essential. Our research is focused on the creation of EmotiBlog, a fine-grained annotation scheme for labelling subjectivity in non-traditional textual genres. We also present the EmotiBlog corpus, a collection of blog posts comprising 270,000 tokens about 3 topics and in 3 languages: Spanish, English and Italian. Additionally, we carry out a series of experiments focused on checking the robustness of the model and its applicability to Natural Language Processing tasks with regard to the 3 languages. The experiments for the inter-annotator agreement, as well as for feature selection, provided satisfactory results, which have given an impetus to continue working with the model and extend the annotated corpus. In order to check its applicability, we tested different Machine Learning models created using the annotation in EmotiBlog on other corpora in order to see if the obtained annotation is domain and genre independent, obtaining positive results. Finally, we also applied EmotiBlog to Opinion Mining, showing that our resource improves the performance of systems built for this task.


International Conference Natural Language Processing | 2011

Evaluating EmotiBlog robustness for sentiment analysis tasks

Javi Fernández; Ester Boldrini; José M. Gómez; Patricio Martínez-Barco

EmotiBlog is a corpus labelled with the homonymous annotation schema designed for detecting subjectivity in the new textual genres. Preliminary research demonstrated its relevance as a Machine Learning (ML) resource to detect opinionated data. In this paper we compare EmotiBlog with the JRC corpus in order to check the robustness of the EmotiBlog annotation. For this research we concentrate on its coarse-grained labels. We carry out extensive ML experiments, some of which also include lexical resources. The results obtained are similar to those obtained with the JRC corpus, demonstrating the validity of EmotiBlog as a resource for the sentiment analysis (SA) task.


IEEE International Conference on Data Science and Advanced Analytics | 2016

Exploiting a Bootstrapping Approach for Automatic Annotation of Emotions in Texts

Lea Canales; Carlo Strapparava; Ester Boldrini; Patricio Martínez-Barco

The objective of this research is to develop a technique to automatically annotate emotional corpora. The automatic annotation of emotional corpora still presents numerous challenges, and thus there is a need to develop a technique that allows us to tackle the annotation task. The relevance of this research is demonstrated by the fact that people's emotions and the patterns of these emotions provide great value for business, individuals, society or politics. Hence, the creation of a robust emotion detection system becomes crucial. Due to the subjectivity of emotions, the main challenge for the creation of emotional resources is the annotation process. Thus, with this starting point in mind, the objective of our paper is to illustrate an innovative and effective bootstrapping process for the automatic annotation of emotional corpora. The evaluations carried out confirm the soundness of the proposed approach and allow us to consider the bootstrapping process as an appropriate approach to create resources such as an emotional corpus that can be employed in supervised machine learning towards the improvement of emotion detection systems.


International Conference on Computational Linguistics | 2009

A Parallel Corpus Labeled Using Open and Restricted Domain Ontologies

Ester Boldrini; Sergio Ferrández; Rubén Izquierdo; David Tomás; José L. Vicedo

The analysis and creation of annotated corpora is fundamental for implementing natural language processing solutions based on machine learning. In this paper we present a parallel corpus of 4500 questions in Spanish and English on the touristic domain, obtained from real users. With the aim of training a question answering system, the questions were labeled with the expected answer type, according to two different ontologies. The first one is an open domain ontology based on Sekine's Extended Named Entity Hierarchy, while the second one is a restricted domain ontology, specific to the touristic field. Due to the use of two ontologies with different characteristics, we had to solve many problematic cases and adjust our annotation to the characteristics of each one. We present the analysis of the domain coverage of these ontologies and the results of the inter-annotator agreement. Finally, we use a question classification system to evaluate the labeling of the corpus.


Conference on Human System Interactions | 2009

A proposal of Expected Answer Type and Named Entity annotation in a Question Answering context

Ester Boldrini; Sergio Ferrández; Rubén Izquierdo; David Tomás; Óscar Ferrández; José L. Vicedo

This paper presents our research related to automatic Expected Answer Type and Named Entity annotation tasks in a Question Answering context. We present the initial step of our research, in which we created the annotation guidelines. We therefore show and justify the tag set employed in the annotation of a collection of questions, and finally, different evaluations in order to test the consistency of the labelled corpus are also presented.


Conference on Information and Knowledge Management | 2009

Towards the definition of requirements for mixed fact and opinion question answering systems

Alexandra Balahur; Ester Boldrini; Andrés Montoyo; Patricio Martínez-Barco

The growth of the Social Web led to the birth of new textual genres such as blogs, forums or reviews. Such data sources are extremely relevant because texts pertaining to these categories approach a wide range of topics and are written by people with different social backgrounds. As a consequence, they represent a rich resource that can be exploited to carry out different types of analyses by a whole diversity of entities (potential customers, companies, public figures, political parties, etc.). For this research we created a collection of factoid and opinion questions in Spanish, with the purpose of comparing the performance of an open domain and an opinion QA system. Furthermore, we carried out two separate evaluations for each one of them and thoroughly analysed the results in order to understand the reasons for the problematic cases, thus being able to infer the features that an effective QA system for opinions should have. Our conclusion is that this task requires the use of specialised resources, whose creation for languages other than English is highly necessary.


Journal of Biomedical Informatics | 2017

DrugSemantics: A corpus for Named Entity Recognition in Spanish Summaries of Product Characteristics

Isabel Moreno; Ester Boldrini; Paloma Moreda; M. Teresa Romá-Ferri

For the healthcare sector, it is critical to exploit the vast amount of textual health-related information. Nevertheless, healthcare providers have difficulty benefiting from such a quantity of data during pharmacotherapeutic care. The problem is that such information is stored in different sources and their consultation time is limited. In this context, Natural Language Processing techniques can be applied to efficiently transform textual data into structured information so that it could be used in critical healthcare applications, such as decision support systems, cohort identification and patient management, helping physicians with their daily workload. Any development of these techniques requires annotated corpora. However, there is a lack of such resources in this domain and, in most cases, the few available concern English. This paper presents the definition and creation of the DrugSemantics corpus, a collection of Summaries of Product Characteristics in Spanish. It was manually annotated with pharmacotherapeutic named entities, detailed in the DrugSemantics annotation scheme. Annotators were a Registered Nurse (RN) and two students from the Degree in Nursing. The quality of the DrugSemantics corpus has been assessed by measuring its annotation reliability (overall F=79.33% [95%CI: 78.35-80.31]), as well as its annotation precision (overall P=94.65% [95%CI: 94.11-95.19]). In addition, the gold-standard construction process is described in detail. In total, our corpus contains more than 2000 named entities, 780 sentences and 226,729 tokens. Finally, a Named Entity Classification module trained on DrugSemantics is presented, both to show the quality of our corpus and as an example of how to use it.


Recent Advances in Natural Language Processing | 2017

Towards the Improvement of Automatic Emotion Pre-annotation with Polarity and Subjective Information

Lea Canales; Walter Daelemans; Ester Boldrini; Patricio Martínez-Barco

Emotion detection has a high potential positive impact on business, society, politics or education. Given this, the main objective of our research is to contribute to the resolution of one of the most important challenges in textual emotion detection: emotional corpora annotation. This will be tackled by proposing a semi-automatic methodology. It consists of two main phases: (1) an automatic process to pre-annotate the unlabelled sentences with a reduced number of emotional categories; and (2) a manual process of refinement where human annotators will determine which is the dominant emotion among the pre-defined set. Our objective in this paper is to show the pre-annotation process, as well as to evaluate the usability of subjective and polarity information in this process. The evaluation performed clearly confirms the benefits of employing the polarity and subjective information in emotion detection and thus endorses the relevance of our approach.

Collaboration


Top Co-Authors


Lea Canales

University of Alicante
