Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Laura Plaza is active.

Publication


Featured researches published by Laura Plaza.


european conference on information retrieval | 2011

A joint model of feature mining and sentiment analysis for product review rating

Jorge Carrillo de Albornoz; Laura Plaza; Pablo Gervás; Alberto Díaz

The information in customer reviews is of great interest to both companies and consumers. This information is usually presented as non-structured free-text so that automatically extracting and rating user opinions about a product is a challenging task. Moreover, this opinion highly depends on the product features on which the user judgments and impressions are expressed. Following this idea, our goal is to predict the overall rating of a product review based on the user opinion about the different product features that are evaluated in the review. To this end, the system first identifies the features that are relevant to consumers when evaluating a certain type of product, as well as the relative importance or salience of such features. The system then extracts from the review the user opinions about the different product features and quantifies such opinions. The salience of the different product features and the values that quantify the user opinions about them are used to construct a Vector of Feature Intensities which represents the review and will be the input to a machine learning model that classifies the review into different rating categories. Our method is evaluated over 1000 hotel reviews from booking.com. The results compare favorably with those achieved by other systems addressing similar evaluations.


international conference on computational linguistics | 2008

Concept-Graph Based Biomedical Automatic Summarization Using Ontologies

Laura Plaza; Alberto Díaz; Pablo Gervás

One of the main problems in research on automatic summarization is the inaccurate semantic interpretation of the source. Using specific domain knowledge can considerably alleviate the problem. In this paper, we introduce an ontology-based extractive method for summarization. It is based on mapping the text to concepts and representing the document and its sentences as graphs. We have applied our approach to summarize biomedical literature, taking advantages of free resources as UMLS. Preliminary empirical results are presented and pending problems are identified.


BMC Bioinformatics | 2013

MeSH indexing based on automatically generated summaries

Antonio Jimeno-Yepes; Laura Plaza; James G. Mork; Alan R. Aronson; Alberto Espuny Díaz

BackgroundMEDLINE citations are manually indexed at the U.S. National Library of Medicine (NLM) using as reference the Medical Subject Headings (MeSH) controlled vocabulary. For this task, the human indexers read the full text of the article. Due to the growth of MEDLINE, the NLM Indexing Initiative explores indexing methodologies that can support the task of the indexers. Medical Text Indexer (MTI) is a tool developed by the NLM Indexing Initiative to provide MeSH indexing recommendations to indexers. Currently, the input to MTI is MEDLINE citations, title and abstract only. Previous work has shown that using full text as input to MTI increases recall, but decreases precision sharply. We propose using summaries generated automatically from the full text for the input to MTI to use in the task of suggesting MeSH headings to indexers. Summaries distill the most salient information from the full text, which might increase the coverage of automatic indexing approaches based on MEDLINE. We hypothesize that if the results were good enough, manual indexers could possibly use automatic summaries instead of the full texts, along with the recommendations of MTI, to speed up the process while maintaining high quality of indexing results.ResultsWe have generated summaries of different lengths using two different summarizers, and evaluated the MTI indexing on the summaries using different algorithms: MTI, individual MTI components, and machine learning. The results are compared to those of full text articles and MEDLINE citations. Our results show that automatically generated summaries achieve similar recall but higher precision compared to full text articles. Compared to MEDLINE citations, summaries achieve higher recall but lower precision.ConclusionsOur results show that automatic summaries produce better indexing than full text articles. Summaries produce similar recall to full text but much better precision, which seems to indicate that automatic summaries can efficiently capture the most important contents within the original articles. The combination of MEDLINE citations and automatically generated summaries could improve the recommendations suggested by MTI. On the other hand, indexing performance might be dependent on the MeSH heading being indexed. Summarization techniques could thus be considered as a feature selection algorithm that might have to be tuned individually for each MeSH heading.


Journal of the Association for Information Science and Technology | 2013

An emotion-based model of negation, intensifiers, and modality for polarity and intensity classification

Jorge Carrillo-de-Albornoz; Laura Plaza

Negation, intensifiers, and modality are common linguistic constructions that may modify the emotional meaning of the text and therefore need to be taken into consideration in sentiment analysis. Negation is usually considered as a polarity shifter, whereas intensifiers are regarded as amplifiers or diminishers of the strength of such polarity. Modality, in turn, has only been addressed in a very naive fashion, so that modal forms are treated as polarity blockers. However, processing these constructions as mere polarity modifiers may be adequate for polarity classification, but it is not enough for more complex tasks (e.g., intensity classification), for which a more fine‐grained model based on emotions is needed. In this work, we study the effect of modifiers on the emotions affected by them and propose a model of negation, intensifiers, and modality especially conceived for sentiment analysis tasks. We compare our emotion‐based strategy with two traditional approaches based on polar expressions and find that representing the text as a set of emotions increases accuracy in different classification tasks and that this representation allows for a more accurate modeling of modifiers that results in further classification improvements. We also study the most common uses of modifiers in opinionated texts and quantify their impact in polarity and intensity classification. Finally, we analyze the joint effect of emotional modifiers and find that interesting synergies exist between them.


BMC Bioinformatics | 2011

Studying the correlation between different word sense disambiguation methods and summarization effectiveness in biomedical texts

Laura Plaza; Antonio Jimeno-Yepes; Alberto Díaz; Alan R. Aronson

BackgroundWord sense disambiguation (WSD) attempts to solve lexical ambiguities by identifying the correct meaning of a word based on its context. WSD has been demonstrated to be an important step in knowledge-based approaches to automatic summarization. However, the correlation between the accuracy of the WSD methods and the summarization performance has never been studied.ResultsWe present three existing knowledge-based WSD approaches and a graph-based summarizer. Both the WSD approaches and the summarizer employ the Unified Medical Language System (UMLS) Metathesaurus as the knowledge source. We first evaluate WSD directly, by comparing the prediction of the WSD methods to two reference sets: the NLM WSD dataset and the MSH WSD collection. We next apply the different WSD methods as part of the summarizer, to map documents onto concepts in the UMLS Metathesaurus, and evaluate the summaries that are generated. The results obtained by the different methods in both evaluations are studied and compared.ConclusionsIt has been found that the use of WSD techniques has a positive impact on the results of our graph-based summarizer, and that, when both the WSD and summarization tasks are assessed over large and homogeneous evaluation collections, there exists a correlation between the overall results of the WSD and summarization tasks. Furthermore, the best WSD algorithm in the first task tends to be also the best one in the second. However, we also found that the improvement achieved by the summarizer is not directly correlated with the WSD performance. The most likely reason is that the errors in disambiguation are not equally important but depend on the relative salience of the different concepts in the document to be summarized.


language resources and evaluation | 2013

Analyzing the capabilities of crowdsourcing services for text summarization

Elena Lloret; Laura Plaza; Ahmet Aker

This paper presents a detailed analysis of the use of crowdsourcing services for the Text Summarization task in the context of the tourist domain. In particular, our aim is to retrieve relevant information about a place or an object pictured in an image in order to provide a short summary which will be of great help for a tourist. For tackling this task, we proposed a broad set of experiments using crowdsourcing services that could be useful as a reference for others who want to rely also on crowdsourcing. From the analysis carried out through our experimental setup and the results obtained, we can conclude that although crowdsourcing services were not good to simply gather gold-standard summaries (i.e., from the results obtained for experiments 1, 2 and 4), the encouraging results obtained in the third and sixth experiments motivate us to strongly believe that they can be successfully employed for finding some patterns of behaviour humans have when generating summaries, and for validating and checking other tasks. Furthermore, this analysis serves as a guideline for the types of experiments that might or might not work when using crowdsourcing in the context of text summarization.


applications of natural language to data bases | 2010

Retrieval of similar electronic health records using UMLS concept graphs

Laura Plaza; Alberto Díaz

Physicians often use information from previous clinical cases in their decision-making process. However, the large amount of patient records available in hospitals makes an exhaustive search unfeasible. We propose a method for the retrieval of similar clinical cases, based on mapping the text onto UMLS concepts and representing the patient records as semantic graphs. The method also deals with the problems of negation detection and concept identification in clinical free text. To evaluate the approach, an evaluation collection has been developed. The results show that our method correlates well with the expert judgments and outperforms remarkably the traditional term-vector space model.


BMC Bioinformatics | 2013

Evaluating the use of different positional strategies for sentence selection in biomedical literature summarization

Laura Plaza; Jorge Carrillo-de-Albornoz

BackgroundThe position of a sentence in a document has been traditionally considered an indicator of the relevance of the sentence, and therefore it is frequently used by automatic summarization systems as an attribute for sentence selection. Sentences close to the beginning of the document are supposed to deal with the main topic and thus are selected for the summary. This criterion has shown to be very effective when summarizing some types of documents, such as news items. However, this property is not likely to be found in other types of documents, such as scientific articles, where other positional criteria may be preferred. The purpose of the present work is to study the utility of different positional strategies for biomedical literature summarization.ResultsWe have evaluated three different positional strategies: (1) awarding the sentences at the beginning of the document, (2) preferring those at the beginning and end of the document, and (3) weighting the sentences according to the section in which they appear. To this end, we have implemented two summarizers, one based on semantic graphs and the other based on concept frequencies, and evaluated the summaries they produce when combined with each of the positional strategies above using ROUGE metrics. Our results indicate that it is possible to improve the quality of the summaries by weighting the sentences according to the section in which they appear (≈17% improvement in ROUGE-2 for the graph-based summarizer and ≈20% for the frequency-based summarizer), and that the sections containing the more salient information are the Methods and Material and the Discussion and Results ones.ConclusionsIt has been found that the use of traditional positional criteria that award sentences at the beginning and/or the end of the document are not helpful when summarizing scientific literature. In contrast, a more appropriate strategy is that which weights sentences according to the section in which they appear.


BMC Bioinformatics | 2015

Feature engineering for MEDLINE citation categorization with MeSH

Antonio Jimeno Yepes; Laura Plaza; Jorge Carrillo-de-Albornoz; James G. Mork; Alan R. Aronson

BackgroundResearch in biomedical text categorization has mostly used the bag-of-words representation. Other more sophisticated representations of text based on syntactic, semantic and argumentative properties have been less studied. In this paper, we evaluate the impact of different text representations of biomedical texts as features for reproducing the MeSH annotations of some of the most frequent MeSH headings. In addition to unigrams and bigrams, these features include noun phrases, citation meta-data, citation structure, and semantic annotation of the citations.ResultsTraditional features like unigrams and bigrams exhibit strong performance compared to other feature sets. Little or no improvement is obtained when using meta-data or citation structure. Noun phrases are too sparse and thus have lower performance compared to more traditional features. Conceptual annotation of the texts by MetaMap shows similar performance compared to unigrams, but adding concepts from the UMLS taxonomy does not improve the performance of using only mapped concepts. The combination of all the features performs largely better than any individual feature set considered. In addition, this combination improves the performance of a state-of-the-art MeSH indexer. Concerning the machine learning algorithms, we find that those that are more resilient to class imbalance largely obtain better performance.ConclusionsWe conclude that even though traditional features such as unigrams and bigrams have strong performance compared to other features, it is possible to combine them to effectively improve the performance of the bag-of-words representation. We have also found that the combination of the learning algorithm and feature sets has an influence in the overall performance of the system. Moreover, using learning algorithms resilient to class imbalance largely improves performance. However, when using a large set of features, consideration needs to be taken with algorithms due to the risk of over-fitting. Specific combinations of learning algorithms and features for individual MeSH headings could further increase the performance of an indexing system.


Information Processing and Management | 2012

Resolving ambiguity in biomedical text to improve summarization

Laura Plaza; Mark Stevenson; Alberto Díaz

Access to the vast body of research literature that is now available on biomedicine and related fields can be improved with automatic summarization. This paper describes a summarization system for the biomedical domain that represents documents as graphs formed from concepts and relations in the UMLS Metathesaurus. This system has to deal with the ambiguities that occur in biomedical documents. We describe a variety of strategies that make use of MetaMap and Word Sense Disambiguation (WSD) to accurately map biomedical documents onto UMLS Metathesaurus concepts. Evaluation is carried out using a collection of 150 biomedical scientific articles from the BioMed Central corpus. We find that using WSD improves the quality of the summaries generated.

Collaboration


Dive into the Laura Plaza's collaboration.

Top Co-Authors

Avatar

Alberto Díaz

Complutense University of Madrid

View shared research outputs
Top Co-Authors

Avatar

Pablo Gervás

Complutense University of Madrid

View shared research outputs
Top Co-Authors

Avatar

Jorge Carrillo de Albornoz

Complutense University of Madrid

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Jorge Carrillo-de-Albornoz

National University of Distance Education

View shared research outputs
Top Co-Authors

Avatar

Ahmet Aker

University of Sheffield

View shared research outputs
Top Co-Authors

Avatar

Virginia Francisco

Complutense University of Madrid

View shared research outputs
Top Co-Authors

Avatar

Alan R. Aronson

National Institutes of Health

View shared research outputs
Top Co-Authors

Avatar

David Camacho

Autonomous University of Madrid

View shared research outputs
Top Co-Authors

Avatar

Héctor D. Menéndez

Autonomous University of Madrid

View shared research outputs
Researchain Logo
Decentralizing Knowledge