Publication


Featured research published by José M. Perea-Ortega.


Journal of the Association for Information Science and Technology | 2011

OCA: Opinion corpus for Arabic

Mohammed Rushdi-Saleh; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López; José M. Perea-Ortega

Sentiment analysis is a challenging new task related to text mining and natural language processing. Although there are, at present, several studies related to this theme, most of these focus mainly on English texts. The resources available for opinion mining (OM) in other languages are still limited. In this article, we present a new Arabic corpus for the OM task that has been made available to the scientific community for research purposes. The corpus contains 500 movie reviews collected from different web pages and blogs in Arabic, 250 of them considered as positive reviews, and the other 250 as negative opinions. Furthermore, different experiments have been carried out on this corpus, using machine learning algorithms such as support vector machines and Naïve Bayes. The results obtained are very promising and we are encouraged to continue this line of research.


Journal of the Association for Information Science and Technology | 2013

Improving polarity classification of bilingual parallel corpora combining machine learning and semantic orientation approaches

José M. Perea-Ortega; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López; Eugenio Martínez-Cámara

Polarity classification is one of the main tasks related to the opinion mining and sentiment analysis fields. The aim of this task is to classify opinions as positive or negative. There are two main approaches to carrying out polarity classification: machine learning and semantic orientation based on the integration of knowledge resources. In this study, we propose to combine both approaches using a voting system based on the majority rule. In this way, we attempt to improve the polarity classification of two parallel corpora: the opinion corpus for Arabic (OCA) and the English version of the OCA (EVOCA). Several experiments have been performed to check the feasibility of the proposed method. The results show that the experiment that took into account both approaches in the voting system obtained the best performance. Moreover, it is also shown that the proposed method slightly improves on the best results obtained using machine learning approaches alone over the OCA and the EVOCA separately. Therefore, we can conclude that the approach proposed here might be considered a good strategy for polarity detection when we work with bilingual parallel corpora.
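The majority-rule voting described in this abstract can be illustrated with a minimal sketch. The abstract does not say which classifiers vote or how ties are handled, so the three voters below (two machine learning classifiers and one semantic orientation classifier), their labels, and the tie handling are all hypothetical assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that most voters chose for one review."""
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    # Assumption: ties are rejected; an odd number of voters avoids them
    # when there are only two labels.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        raise ValueError("tie: use an odd number of voters")
    return ranked[0][0]


def combine(per_voter_labels):
    """Combine aligned label lists (one list per voter) review by review."""
    return [majority_vote(list(votes)) for votes in zip(*per_voter_labels)]


# Hypothetical predictions for three reviews (not data from the paper).
ml_svm = ["positive", "negative", "positive"]
ml_nb = ["positive", "positive", "negative"]
so_lexicon = ["negative", "negative", "negative"]

combined = combine([ml_svm, ml_nb, so_lexicon])
print(combined)
```

Each review receives the label that at least two of the three voters agree on, which is how a semantic orientation vote can overturn a single machine learning prediction.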


Applications of Natural Language to Data Bases | 2008

Comparing Several Textual Information Retrieval Systems for the Geographical Information Retrieval Task

José M. Perea-Ortega; Miguel A. García-Cumbreras; Manuel García-Vega; L. A. Ureña-López

This paper presents a comparison between three different Information Retrieval (IR) systems employed in a particular Geographical Information Retrieval (GIR) system, the GeoUJA IR, a GIR architecture developed by the SINAI research group. The comparison could be useful for determining which of the most widely used IR systems works best in the GIR task. In the experiments, we have used the Lemur, Terrier and Lucene search engines using monolingual and bilingual queries. We present baseline cases, without applying any external processes, such as query expansion or filtering. In addition, we have used the default settings of each IR system. Results show that Lemur works better using monolingual queries and Terrier works better using the bilingual ones.


Expert Systems With Applications | 2011

Otium: A web based planner for tourism and leisure

Arturo Montejo-Ráez; José M. Perea-Ortega; Miguel A. García-Cumbreras; Fernando Martínez-Santiago

This paper introduces the Otium planner system for scheduling of leisure tasks in tourism. This novel service allows users to create their own agenda of activities within specified dates. Activities are selected from a list of recommended events according to last selected events, user preferences and other parameters. The proposed restrictions on the recommendation procedure have been found to capture static and dynamic user context. The recommendation function is linear and shows low computational cost. The events are extracted from web sources with almost no human manipulation, so the recommender is always showing new and recent events. The Ajax-based web interface eases the creation of the final plan, offering an interactive experience to the user. We consider that the trade-off between interactivity and recommendation complexity exists, and that the second issue is preferable in this type of service. The details about the design and implementation of the system are described, along with the issues the system resolves and some guidelines for enhancement.


Cross-Language Evaluation Forum | 2009

Using WordNet in multimedia information retrieval

Manuel Carlos Díaz-Galiano; María Teresa Martín-Valdivia; L. Alfonso Ureña-López; José M. Perea-Ortega

This work investigates the use of external knowledge in a corpus with minimal textual information. We have expanded the original collection with WordNet terms in order to enrich the information included in this corpus. In addition, we have carried out experiments with original and expanded topics. However, the obtained results show that it is necessary to continue investigating the expansion methodology. The query expansion does not improve the results, although using only the expansion for the corpus achieves slightly better MAP.


Cross-Language Evaluation Forum | 2008

Filtering for Improving the Geographic Information Search

José M. Perea-Ortega; Miguel A. García-Cumbreras; Manuel García-Vega; L. A. Ureña-López

This paper describes the GEOUJA System, a Geographical Information Retrieval (GIR) system submitted by the SINAI group of the University of Jaén in GeoCLEF 2007. The objective of our system is to filter the documents retrieved from an information retrieval (IR) subsystem, given a multilingual statement describing a spatial user need. The results of the experiments show that the new heuristics and rules applied in the geo-relation validator module improve the general precision of our system. Increasing the number of documents retrieved by the information retrieval subsystem also improves the final results.


Cross-Language Evaluation Forum | 2008

Using an information retrieval system for video classification

José M. Perea-Ortega; Arturo Montejo-Ráez; Manuel Carlos Díaz-Galiano; María Teresa Martín-Valdivia; L. Alfonso Ureña-López

This paper describes a simple approach to the video classification task. This approach consists of applying an Information Retrieval (IR) system as a classifier. We have generated a document collection for each predefined topic class. This collection is composed of documents retrieved using the Google search engine. Following the IR strategy, we have used the speech transcriptions of the videos as textual queries. The results obtained show that an IR system can perform well as a video classifier if the speech transcriptions of the videos to be classified are of good quality.
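As a rough illustration of using retrieval as classification, the sketch below treats a transcript as a query, ranks a small labelled document collection by TF-IDF cosine similarity, and returns the class of the top-ranked document. The abstract does not name the IR system, the weighting scheme, or the decision rule, so the scoring function, the top-1 rule, the function names, and the toy collection are all assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def classify(transcript, class_docs):
    """class_docs maps a class label to a list of document strings."""
    docs = [(label, Counter(tokenize(d)))
            for label, texts in class_docs.items() for d in texts]
    if not docs:
        raise ValueError("the document collection is empty")
    n = len(docs)
    df = Counter(term for _, tf in docs for term in tf)
    # Smoothed inverse document frequency; terms unseen in the
    # collection get weight 0 and so do not affect the ranking.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def weigh(tf):
        return {t: f * idf.get(t, 0.0) for t, f in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    query = weigh(Counter(tokenize(transcript)))
    # Assumed decision rule: take the class of the best-matching document.
    best_label, _ = max(docs, key=lambda d: cosine(query, weigh(d[1])))
    return best_label


# Hypothetical toy collection standing in for web-retrieved documents.
class_docs = {
    "music": ["concert orchestra violin music", "band guitar music album"],
    "sports": ["football match goal team", "tennis player match tournament"],
}
label_a = classify("the orchestra played violin music tonight", class_docs)
label_b = classify("a late goal decided the football match", class_docs)
print(label_a, label_b)
```

Because the transcript is the query, noisy speech recognition output directly lowers the match scores, which is consistent with the abstract's remark that transcription quality matters.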


Cross-Language Evaluation Forum | 2006

GEOUJA system: the first participation of the University of Jaén at GeoCLEF 2006

Manuel García-Vega; Miguel A. García-Cumbreras; L. A. Ureña-López; José M. Perea-Ortega

This paper describes the first participation of the SINAI group of the University of Jaén in GeoCLEF 2006. We have developed a system made up of three main modules: the Translation subsystem, which works with queries in Spanish and German against an English collection; the Query Expansion subsystem, which integrates a Named Entity Recognizer, a thesaurus expansion module and a geographical information-gazetteer module; and the Information Retrieval subsystem. We have participated in the monolingual and the bilingual tasks. The results obtained show that the use of geographical and thesaurus information for query expansion does not improve the retrieval in our experiments.


Expert Systems With Applications | 2013

Application of Text Summarization techniques to the Geographical Information Retrieval task

José M. Perea-Ortega; Elena Lloret; L. Alfonso Ureña-López; Manuel Palomar

Automatic Text Summarization has been shown to be useful for Natural Language Processing tasks such as Question Answering or Text Classification and other related fields of computer science such as Information Retrieval. Since Geographical Information Retrieval can be considered as an extension of the Information Retrieval field, the generation of summaries could be integrated into these systems by acting as an intermediate stage, with the purpose of reducing the document length. In this manner, the access time for information searching will be improved, while at the same time relevant documents will also be retrieved. Therefore, in this paper we propose the generation of two types of summaries (generic and geographical) applying several compression rates in order to evaluate their effectiveness in the Geographical Information Retrieval task. The evaluation has been carried out using GeoCLEF as the evaluation framework and following an Information Retrieval perspective without considering the geo-reranking phase commonly used in these systems. Although single-document summarization has not performed well in general, the slight improvements obtained for some types of the proposed summaries, particularly for those based on geographical information, lead us to believe that the integration of Text Summarization with Geographical Information Retrieval may be beneficial, and consequently, the experimental set-up developed in this research work serves as a basis for further investigations in this field.


Intelligent Information Systems | 2011

Using web sources for improving video categorization

José M. Perea-Ortega; Arturo Montejo-Ráez; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López

In this paper, several experiments on video categorization using a supervised learning approach are presented. To this end, the VideoCLEF 2008 evaluation forum has been chosen as the experimental framework. After an analysis of the VideoCLEF corpus, it was found that video transcriptions are not the best source of information for identifying the topic of video streams. Therefore, two web-based corpora have been generated with the aim of adding more informational sources by integrating documents from Wikipedia articles and Google searches. A number of supervised categorization experiments using the test data of VideoCLEF have been carried out. Several machine learning algorithms have been tested to validate the effect of the corpus on the final results: Naïve Bayes, K-nearest-neighbors (KNN), Support Vector Machine (SVM) and the J48 decision tree. The results obtained show that the web can be a useful source of information for generating classification models for video data.
