Alberto Díaz
Complutense University of Madrid
Publication
Featured research published by Alberto Díaz.
Information Processing and Management | 2007
Alberto Díaz; Pablo Gervás
The potential of summary personalization is high: a summary that would be useless for judging the relevance of a document when produced generically may be useful if the selected sentences match the user's interests. In this paper we defend the use of a personalized summarization facility to maximize the density of relevance of the selections sent by a personalized information system to a given user. The personalization is applied to the digital newspaper domain and uses a user model that stores long- and short-term interests using four reference systems: sections, categories, keywords and feedback terms. On the other hand, it is crucial to measure how much information is lost during the summarization process, and how this information loss may affect the user's ability to judge the relevance of a given document. The results obtained in two personalization systems show that personalized summaries perform better than generic and generic-personalized summaries in terms of identifying documents that satisfy user preferences. We also considered a user-centred direct evaluation that showed a high level of user satisfaction with the summaries.
European Conference on Information Retrieval | 2011
Jorge Carrillo de Albornoz; Laura Plaza; Pablo Gervás; Alberto Díaz
The information in customer reviews is of great interest to both companies and consumers. This information is usually presented as unstructured free text, so automatically extracting and rating user opinions about a product is a challenging task. Moreover, these opinions depend heavily on the product features about which the user's judgments and impressions are expressed. Following this idea, our goal is to predict the overall rating of a product review based on the user opinion about the different product features that are evaluated in the review. To this end, the system first identifies the features that are relevant to consumers when evaluating a certain type of product, as well as the relative importance or salience of such features. The system then extracts from the review the user opinions about the different product features and quantifies such opinions. The salience of the different product features and the values that quantify the user opinions about them are used to construct a Vector of Feature Intensities, which represents the review and serves as the input to a machine learning model that classifies the review into different rating categories. Our method is evaluated over 1000 hotel reviews from booking.com. The results compare favorably with those achieved by other systems addressing similar evaluations.
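As a rough illustration of the Vector of Feature Intensities idea, the following minimal sketch weights a per-feature opinion score by the feature's salience. The salience values, the tiny polarity lexicon, and the function name `feature_intensity_vector` are all hypothetical; the abstract does not specify how salience or opinion strength is actually computed.

```python
import re
from typing import Dict, List

# Hypothetical inputs: illustrative salience weights and polarity lexicon,
# not values taken from the paper.
SALIENCE: Dict[str, float] = {"room": 0.5, "staff": 0.3, "location": 0.2}
POLARITY: Dict[str, float] = {
    "clean": 1.0, "friendly": 1.0, "great": 1.0,
    "dirty": -1.0, "rude": -1.0, "noisy": -1.0,
}


def feature_intensity_vector(
    sentences: List[str],
    salience: Dict[str, float] = SALIENCE,
    polarity: Dict[str, float] = POLARITY,
) -> List[float]:
    """Return one salience-weighted opinion score per feature.

    Features are ordered alphabetically so the vector layout is stable.
    Assumption: a sentence's polarity is the sum of its lexicon words and
    is credited to every feature mentioned in that sentence.
    """
    features = sorted(salience)
    totals = {f: 0.0 for f in features}
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        score = sum(polarity.get(t, 0.0) for t in tokens)
        for f in features:
            if f in tokens:
                totals[f] += score
    return [salience[f] * totals[f] for f in features]


REVIEW = ["The room was clean.", "The staff was rude and the room was noisy."]
VECTOR = feature_intensity_vector(REVIEW)  # order: location, room, staff
```

The resulting vector could then be passed to any standard classifier to predict a rating category; which classifier the authors used is not stated in the abstract.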
International Conference on Computational Linguistics | 2008
Laura Plaza; Alberto Díaz; Pablo Gervás
One of the main problems in research on automatic summarization is the inaccurate semantic interpretation of the source. Using specific domain knowledge can considerably alleviate the problem. In this paper, we introduce an ontology-based extractive method for summarization. It is based on mapping the text to concepts and representing the document and its sentences as graphs. We have applied our approach to summarize biomedical literature, taking advantage of free resources such as UMLS. Preliminary empirical results are presented and pending problems are identified.
Information Processing and Management | 2008
Alberto Díaz; Antonio García; Pablo Gervás
Some of the most popular measures for evaluating information filtering systems are independent of the users because they are based on relevance judgments obtained from experts. User-centred evaluation, on the other hand, reveals the impressions that users form of the running system. This work discusses the problem of user-centred versus system-centred evaluation of a Web content personalization system in which the personalization is based on a user model that stores long-term interests (sections, categories and keywords) and short-term interests (adapted from user-provided feedback). The user-centred evaluation is based on questionnaires filled in by the users before and after using the system, and the system-centred evaluation is based on the comparison between rankings of documents, obtained from the application of a multi-tier selection process, and binary relevance judgments collected previously from real users. The user-centred and system-centred evaluations performed with 106 users during 14 working days have provided valuable data concerning the behaviour of the users with respect to issues such as document relevance or the relative importance attributed to different ways of personalization. The results obtained show general satisfaction with both the personalization processes (selection, adaptation and presentation) and the system as a whole.
BMC Bioinformatics | 2011
Laura Plaza; Antonio Jimeno-Yepes; Alberto Díaz; Alan R. Aronson
Background: Word sense disambiguation (WSD) attempts to solve lexical ambiguities by identifying the correct meaning of a word based on its context. WSD has been demonstrated to be an important step in knowledge-based approaches to automatic summarization. However, the correlation between the accuracy of the WSD methods and the summarization performance has never been studied. Results: We present three existing knowledge-based WSD approaches and a graph-based summarizer. Both the WSD approaches and the summarizer employ the Unified Medical Language System (UMLS) Metathesaurus as the knowledge source. We first evaluate WSD directly, by comparing the prediction of the WSD methods to two reference sets: the NLM WSD dataset and the MSH WSD collection. We next apply the different WSD methods as part of the summarizer, to map documents onto concepts in the UMLS Metathesaurus, and evaluate the summaries that are generated. The results obtained by the different methods in both evaluations are studied and compared. Conclusions: It has been found that the use of WSD techniques has a positive impact on the results of our graph-based summarizer, and that, when both the WSD and summarization tasks are assessed over large and homogeneous evaluation collections, there exists a correlation between the overall results of the WSD and summarization tasks. Furthermore, the best WSD algorithm in the first task tends to be also the best one in the second. However, we also found that the improvement achieved by the summarizer is not directly correlated with the WSD performance. The most likely reason is that the errors in disambiguation are not equally important but depend on the relative salience of the different concepts in the document to be summarized.
Applications of Natural Language to Data Bases | 2010
Laura Plaza; Alberto Díaz
Physicians often use information from previous clinical cases in their decision-making process. However, the large number of patient records available in hospitals makes an exhaustive search unfeasible. We propose a method for the retrieval of similar clinical cases, based on mapping the text onto UMLS concepts and representing the patient records as semantic graphs. The method also deals with the problems of negation detection and concept identification in clinical free text. To evaluate the approach, an evaluation collection has been developed. The results show that our method correlates well with the expert judgments and markedly outperforms the traditional term-vector space model.
International Conference on Computational Linguistics | 2012
Miguel Ballesteros; Virginia Francisco; Alberto Díaz; Jesús Herrera; Pablo Gervás
In the last few years, negation detection systems for biomedical texts have been developed successfully. In this paper we present a system that finds and annotates the scope of negation in English sentences. It infers which words are affected by negations by traversing syntactic dependency structures. First, a greedy algorithm detects negation cues, such as "no" or "not". Second, the scope of these negation cues is computed. We tested the system on the Bioscope corpus, which is annotated with negation, obtaining competitive results. The system presented in this paper can be accessed via the web.
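The two-step idea (detect cues, then compute scope from the dependency tree) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the cue list is hypothetical, the parse is hand-built rather than produced by a parser, and the scope rule (every token in the subtree of the cue's head, minus the cue itself) is a deliberately naive stand-in.

```python
from typing import Dict, List, Set, Tuple

# (token id, word, head id); head id 0 marks the root. Hand-built parse.
Token = Tuple[int, str, int]

NEGATION_CUES = {"no", "not", "never", "without"}  # assumed cue list


def find_cues(tokens: List[Token]) -> List[int]:
    """Step 1: greedily mark every token that matches a known cue."""
    return [i for i, word, _ in tokens if word.lower() in NEGATION_CUES]


def subtree(tokens: List[Token], root_id: int) -> Set[int]:
    """Collect root_id and all of its descendants in the dependency tree."""
    children: Dict[int, List[int]] = {}
    for i, _, head in tokens:
        children.setdefault(head, []).append(i)
    seen: Set[int] = set()
    stack = [root_id]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen


def negation_scopes(tokens: List[Token]) -> Dict[int, List[str]]:
    """Step 2: map each cue id to the words in its (naive) scope."""
    heads = {i: head for i, _, head in tokens}
    words = {i: word for i, word, _ in tokens}
    scopes: Dict[int, List[str]] = {}
    for cue in find_cues(tokens):
        ids = subtree(tokens, heads[cue]) - {cue, 0}
        scopes[cue] = [words[i] for i in sorted(ids)]
    return scopes


# "He has fever but does not show cough" with an illustrative parse.
SENTENCE: List[Token] = [
    (1, "He", 2), (2, "has", 0), (3, "fever", 2), (4, "but", 2),
    (5, "does", 7), (6, "not", 7), (7, "show", 2), (8, "cough", 7),
]
SCOPES = negation_scopes(SENTENCE)
```

In this example the cue "not" attaches to "show", so the sketch marks "does", "show" and "cough" as negated while leaving "He has fever" outside the scope.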
Information Processing and Management | 2012
Laura Plaza; Mark Stevenson; Alberto Díaz
Access to the vast body of research literature that is now available on biomedicine and related fields can be improved with automatic summarization. This paper describes a summarization system for the biomedical domain that represents documents as graphs formed from concepts and relations in the UMLS Metathesaurus. This system has to deal with the ambiguities that occur in biomedical documents. We describe a variety of strategies that make use of MetaMap and Word Sense Disambiguation (WSD) to accurately map biomedical documents onto UMLS Metathesaurus concepts. Evaluation is carried out using a collection of 150 biomedical scientific articles from the BioMed Central corpus. We find that using WSD improves the quality of the summaries generated.
PLOS ONE | 2017
Lourdes Rodriguez; Beatriz Bustamante; Luz Huaroto; Cecilia Agurto; Ricardo Illescas; Rafael J. Ramirez; Alberto Díaz; José Villacorta Hidalgo
Background: The incidence of candidemia is increasing in developing countries. Very little is known about the epidemiology of candidemia in Peru. The aim of this study is to describe the incidence, microbiology, clinical presentation and outcomes of Candida bloodstream infections in three Lima-Callao hospitals. Methods: Candida spp. isolates were identified prospectively at participant hospitals between November 2013 and January 2015. Susceptibility testing for amphotericin B, fluconazole, posaconazole, voriconazole and anidulafungin was performed using the broth microdilution method. Clinical information was obtained from medical records and evaluated. Results: We collected information on 158 isolates and 157 patients. Median age of patients was 55.0 yrs., and 64.1% were males. Thirty-eight (24.2%) episodes of candidemia occurred in those <18 yrs. The frequency of non-Candida albicans was 72.1%. The most frequently recovered species were C. albicans (n = 44, 27.8%), C. parapsilosis (n = 40, 25.3%), C. tropicalis (n = 39, 24.7%) and C. glabrata (n = 15, 9.5%). Only four isolates were resistant to fluconazole, 86.7% (n = 137) were susceptible and 17 were susceptible-dose dependent. Decreased susceptibility to posaconazole was also observed in three isolates, and one to voriconazole. All isolates were susceptible to anidulafungin and amphotericin B. The most commonly associated co-morbid conditions were recent surgery (n = 61, 38.9%), mechanical ventilation (n = 60, 38.2%) and total parenteral nutrition (n = 57, 36.3%). The incidence of candidemia by center ranged between 1.01 and 2.63 cases per 1,000 admissions, with a global incidence of 2.04. Only 28.1% of cases received treatment within 72 hrs. of diagnosis. Overall, the 30-day survival was 60.4% (treated subjects, 67.4%; not-treated patients, 50.9%). Conclusions: We found a very high proportion of non-albicans Candida species. Despite this, the decreased susceptibility/resistance to fluconazole was only 13.3% and not seen in the other antifungals. Overall, the incidence of candidemia mortality was high when compared to other international studies. It is possible that the delay in initiating antifungal treatment contributed to the elevated mortality rate, in spite of low antifungal resistance.
Adaptive Hypermedia and Adaptive Web-Based Systems | 2004
Alberto Díaz; Pablo Gervás
This paper presents a system for personalization of web contents based on a user model that stores long-term and short-term interests. Long-term interests are modeled through the selection of specific and general categories, and keywords for which the user needs information. However, user needs change over time as a result of the user's interaction with received information. For this reason, the user model must be capable of adapting to those shifts in interest. In our case, this adaptation of the user model is performed by a short-term model obtained from user-provided feedback. The evaluation performed with 100 users during 15 days has determined that the combined use of long- and short-term models performs best when specific and general categories and keywords are used together for the long-term model.
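One way to picture a user model that combines long-term and short-term interests is the minimal sketch below. The class name `UserModel`, the linear mixing weight `alpha`, and the additive feedback update are all assumptions made for illustration; the abstract does not describe the actual scoring or adaptation formulas.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class UserModel:
    # Long-term interests: explicit category and keyword weights.
    categories: Dict[str, float]
    keywords: Dict[str, float]
    # Short-term interests: term weights learned from feedback.
    feedback_terms: Dict[str, float] = field(default_factory=dict)

    def add_feedback(self, doc_terms: Iterable[str], relevant: bool,
                     rate: float = 0.5) -> None:
        """Shift short-term weights up or down (assumed additive update)."""
        delta = rate if relevant else -rate
        for term in set(doc_terms):
            self.feedback_terms[term] = self.feedback_terms.get(term, 0.0) + delta

    def score(self, doc_category: str, doc_terms: Iterable[str],
              alpha: float = 0.5) -> float:
        """Mix long- and short-term evidence with a hypothetical weight alpha."""
        terms = set(doc_terms)
        long_term = self.categories.get(doc_category, 0.0) + sum(
            self.keywords.get(t, 0.0) for t in terms
        )
        short_term = sum(self.feedback_terms.get(t, 0.0) for t in terms)
        return alpha * long_term + (1.0 - alpha) * short_term


model = UserModel(categories={"sports": 1.0}, keywords={"tennis": 1.0})
before = model.score("sports", ["tennis", "final"])
model.add_feedback(["final"], relevant=True)
after = model.score("sports", ["tennis", "final"])
```

After positive feedback on a document containing "final", later documents with that term score higher, which mirrors the adaptation to shifting interests described above.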