L. Alfonso Ureña-López
University of Jaén
Publications
Featured research published by L. Alfonso Ureña-López.
Journal of the Association for Information Science and Technology | 2011
Mohammed Rushdi-Saleh; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López; José M. Perea-Ortega
Sentiment analysis is a challenging new task related to text mining and natural language processing. Although there are, at present, several studies related to this theme, most of these focus mainly on English texts. The resources available for opinion mining (OM) in other languages are still limited. In this article, we present a new Arabic corpus for the OM task that has been made available to the scientific community for research purposes. The corpus contains 500 movie reviews collected from different web pages and blogs in Arabic, 250 of them considered as positive reviews, and the other 250 as negative opinions. Furthermore, different experiments have been carried out on this corpus, using machine learning algorithms such as support vector machines and Naïve Bayes. The results obtained are very promising and we are encouraged to continue this line of research.
Natural Language Engineering | 2014
Eugenio Martínez-Cámara; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López; Arturo Montejo-Ráez
In recent years, the interest among the research community in sentiment analysis (SA) has grown exponentially. It is only necessary to see the number of scientific publications and forums or related conferences to understand that this is a field with great prospects for the future. In addition, the Twitter boom has boosted research in this area, fundamentally because of its potential applications in areas such as business or government intelligence, recommender systems, graphical interfaces and virtual assistance. However, to fully understand this issue, a thorough review of the state of the art is first necessary. For this reason, this paper aims to provide a starting point for research concerned with the latest references to Twitter in SA.
Computer Speech & Language | 2014
Arturo Montejo-Ráez; Eugenio Martínez-Cámara; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López
This paper presents a novel approach to Sentiment Polarity Classification in Twitter posts, by extracting a vector of weighted nodes from the graph of WordNet. These weights are used in SentiWordNet to compute a final estimation of the polarity. Therefore, the method proposes a non-supervised solution that is domain-independent. The evaluation of a generated corpus of tweets shows that this technique is promising.
Expert Systems With Applications | 2013
M. T. Martín-Valdivia; Eugenio Martínez-Cámara; Jose-M. Perea-Ortega; L. Alfonso Ureña-López
Sentiment polarity detection is one of the most popular tasks related to Opinion Mining. Many papers have been presented describing one of the two main approaches used to solve this problem. On the one hand, a supervised methodology uses machine learning algorithms when training data exist. On the other hand, an unsupervised method based on a semantic orientation is applied when linguistic resources are available. However, few studies combine the two approaches. In this paper we propose the use of meta-classifiers that combine supervised and unsupervised learning in order to develop a polarity classification system. We have used a Spanish corpus of film reviews along with its parallel corpus translated into English. Firstly, we generate two individual models using these two corpora and applying machine learning algorithms. Secondly, we integrate SentiWordNet into the English corpus, generating a new unsupervised model. Finally, the three systems are combined using a meta-classifier that allows us to apply several combination algorithms such as voting system or stacking. The results obtained outperform those obtained using the systems individually and show that this approach could be considered a good strategy for polarity classification when we work with parallel corpora.
Computers and The Humanities | 2001
L. Alfonso Ureña-López; Manuel de Buenaga; José M. Gómez
Information access methods must be improved to overcome the information overload that most professionals face nowadays. Text classification tasks, like Text Categorization (TC), help users access the great amount of text they find on the Internet and in their organizations. TC is the classification of documents into a predefined set of categories. Most approaches to automatic TC are based on the utilization of a training collection, which is a set of manually classified documents. Other linguistic resources that are emerging, like lexical databases, can also be used for classification tasks. This article describes an approach to TC based on the integration of a training collection (Reuters-21578) and a lexical database (WordNet 1.6) as knowledge sources. Lexical databases accumulate information on the lexical items of one or several languages. This information must be filtered in order to make effective use of it in our model of TC. This filtering process is a Word Sense Disambiguation (WSD) task. WSD is the identification of the sense of words in context. This task is an intermediate process in many natural language processing tasks like machine translation or multilingual information retrieval. We present the utilization of WSD as an aid for TC. Our approach to WSD is also based on the integration of two linguistic resources: a training collection (SemCor and Reuters-21578) and a lexical database (WordNet 1.6). We have developed a series of experiments that show that TC and WSD based on the integration of linguistic resources are very effective, and that WSD is necessary to effectively integrate linguistic resources in TC.
international conference natural language processing | 2011
Eugenio Martínez-Cámara; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López
Sentiment analysis is a challenging new task related to Text Mining and Natural Language Processing. Although there is some current work, most of it focuses only on English texts. Web pages, information and opinions on the Internet are increasing every day, and English is not the only language used to write them. Other languages like Spanish are increasingly present, so we have carried out some experiments over a corpus of Spanish film reviews. In this paper we present several experiments using five classification algorithms (SVM, Naïve Bayes, BBR, KNN, C4.5). The results obtained are very promising and encourage us to continue this line of research.
Journal of the Association for Information Science and Technology | 2014
Arturo Montejo-Ráez; Eugenio Martínez-Cámara; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López
Until now, most of the methods published for polarity classification in Twitter have used a supervised approach. The differences between them are only the features selected and the method used for weighting them. In this article, we present an unsupervised method for polarity classification in Twitter. The method is based on the expansion of the concepts expressed in the tweets through the application of PageRank to WordNet. In addition, we integrate SentiWordNet to compute the final value of polarity. The synsets values are weighted with the PageRank scores obtained in the previous random walk process over WordNet. The results obtained show that disambiguation and expansion are good strategies for improving overall performance.
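The idea of weighting sentiment scores by random-walk scores can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the synset names, and the (positive, negative) scores below are hypothetical stand-ins for WordNet and SentiWordNet, and the function names `personalized_pagerank` and `polarity` are assumptions made for this example.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank whose restarts go only to `seeds`.

    graph: dict mapping node -> list of neighbour nodes (every node is a key).
    seeds: set of nodes found in the tweet.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: hand its mass back to the seed nodes.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * restart[n]
                continue
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                new_rank[neighbour] += share
        rank = new_rank
    return rank


def polarity(rank, sentiment):
    """PageRank-weighted average of (positive - negative) scores."""
    total = sum(rank.values())
    return sum(rank[n] * (sentiment[n][0] - sentiment[n][1]) for n in rank) / total


# Hypothetical toy graph and scores (not real WordNet/SentiWordNet data).
graph = {
    "good.a": ["nice.a", "great.a"],
    "nice.a": ["good.a"],
    "great.a": ["good.a"],
    "bad.a": ["awful.a"],
    "awful.a": ["bad.a"],
}
sentiment = {
    "good.a": (0.75, 0.0),
    "nice.a": (0.6, 0.0),
    "great.a": (0.8, 0.0),
    "bad.a": (0.0, 0.7),
    "awful.a": (0.0, 0.9),
}

rank = personalized_pagerank(graph, seeds={"good.a"})
tweet_polarity = polarity(rank, sentiment)  # positive for this toy input
```

Restarting the walk only at the synsets found in the tweet keeps the expansion local to related concepts, so unrelated parts of the graph receive no weight in the final score.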
Information Retrieval | 2006
Fernando Martínez-Santiago; L. Alfonso Ureña-López; Maite Martín-Valdivia
A common strategy for implementing CLIR (Cross-Language Information Retrieval) systems is the so-called query translation approach. The user query is translated into each language present in the multilingual collection in order to run an independent monolingual information retrieval process per language. Thus, this approach divides documents according to language, and we obtain as many different collections as languages. After searching these corpora and obtaining a result list per language, we must merge the lists in order to provide a single list of retrieved articles. In this paper, we propose an approach to obtain a single list of relevant documents for CLIR systems driven by query translation. This approach, which we call 2-step RSV (RSV: Retrieval Status Value), is based on the re-indexing of the retrieved documents according to the query vocabulary, and it performs noticeably better than traditional methods. The proposed method requires query vocabulary alignment: given a word in a query, we must know its translation or translations into the other languages. Because this is not always possible, we have also investigated a mixed model, which is applied to deal with queries with partial word-level alignment. The results show that even in this scenario, 2-step RSV performs better than traditional merging methods.
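A rough sketch of the second step may help make the re-indexing idea concrete. It assumes full word-level alignment and uses a simple tf-idf style score; the abstract does not state the weighting scheme, so the scoring function, the name `two_step_merge`, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def two_step_merge(retrieved, alignment):
    """Re-score documents from all languages on one shared scale.

    retrieved: dict doc_id -> list of tokens (documents pre-selected by the
               per-language searches of the first step).
    alignment: dict concept -> set of aligned query terms across languages.
    """
    # Map each query term, in any language, to a language-independent concept.
    term_to_concept = {t: c for c, terms in alignment.items() for t in terms}

    # Re-index: keep only query vocabulary, replaced by concept identifiers.
    index = {
        doc_id: Counter(term_to_concept[t] for t in tokens if t in term_to_concept)
        for doc_id, tokens in retrieved.items()
    }

    # Document frequency of each concept over the merged collection.
    n_docs = len(index)
    df = Counter(c for counts in index.values() for c in counts)

    def score(counts):
        # Assumed tf-idf style weighting, not taken from the paper.
        return sum(tf * math.log(1 + n_docs / df[c]) for c, tf in counts.items())

    return sorted(index, key=lambda d: score(index[d]), reverse=True)


# Hypothetical English/Spanish query alignment and retrieved documents.
alignment = {"house": {"house", "casa"}, "price": {"price", "precio"}}
retrieved = {
    "en1": "the house price rose".split(),
    "es1": "el precio de la casa y otra casa".split(),
    "en2": "a big house".split(),
}
ranking = two_step_merge(retrieved, alignment)
```

Because every document is scored against the same concept-level index, the scores are directly comparable across languages, which is what makes a single merged list possible.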
Information Processing and Management | 2015
M. Dolores Molina-González; Eugenio Martínez-Cámara; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López
A lexicon-based domain adaptation method is proposed. Several domain polar lexicons were compiled following a corpus-based approach. The new resources are assessed over a Spanish corpus. The promising results encourage us to continue improving this domain adaptation method. One of the problems of opinion mining is the domain adaptation of the sentiment classifiers. There are several approaches to tackling this problem. One of these is the integration of a list of opinion-bearing words for the specific domain. This paper presents the generation of several resources for domain adaptation to polarity detection. In addition, the lack of resources in languages other than English has oriented our work towards developing sentiment lexicons for polarity classifiers in Spanish. The results show the validity of the new sentiment lexicons, which can be used as part of a polarity classifier.
Journal of the Association for Information Science and Technology | 2013
José M. Perea-Ortega; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López; Eugenio Martínez-Cámara
Polarity classification is one of the main tasks related to the opinion mining and sentiment analysis fields. The aim of this task is to classify opinions as positive or negative. There are two main approaches to carrying out polarity classification: machine learning and semantic orientation based on the integration of knowledge resources. In this study, we propose to combine both approaches using a voting system based on the majority rule. In this way, we attempt to improve the polarity classification of two parallel corpora such as the opinion corpus for Arabic (OCA) and the English version of the OCA (EVOCA). Several experiments have been performed to check the feasibility of the proposed method. The results show that the experiment that took into account both approaches in the voting system obtained the best performance. Moreover, it is also shown that the proposed method slightly improves the best results obtained using machine learning approaches solely over the OCA and the EVOCA separately. Therefore, we can conclude that the approach proposed here might be considered a good strategy for polarity detection when we work with bilingual parallel corpora.
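A majority-rule voting system of the kind described above can be sketched in a few lines. The three classifier outputs below are hypothetical and are not results from the OCA or EVOCA experiments; `majority_vote` is a name chosen for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers.

    predictions: sequence of labels, one per classifier. An odd number of
    voters avoids ties when there are only two labels.
    """
    label, _ = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical predictions of three systems for four reviews.
ml_arabic = ["pos", "neg", "pos", "neg"]
ml_english = ["pos", "pos", "neg", "neg"]
semantic_orientation = ["neg", "pos", "pos", "neg"]

combined = [
    majority_vote(votes)
    for votes in zip(ml_arabic, ml_english, semantic_orientation)
]
```

Each review receives the label that at least two of the three systems agree on, so an error made by a single system can be outvoted by the other two.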