Lourdes Borrajo | Researchain

Publication

Featured research published by Lourdes Borrajo.

International Journal of Computational Intelligence and Applications | 2003

INCREASING THE AUTONOMY OF DELIBERATIVE AGENTS WITH A CASE-BASED REASONING SYSTEM

Juan M. Corchado; Rosalía Laza; Lourdes Borrajo; J. C. Yañes; M. Valiño

This paper shows how deliberative agents can be built by means of a case-based reasoning system. The concept of deliberative agent is introduced and the case-based reasoning model is presented. Once the advantages and disadvantages of such agents have been discussed, it will be shown how to solve some of their drawbacks, especially those related to their implementation and adaptation. The World Wide Web has emerged as one of the most popular vehicles for disseminating and sharing information through computer networks; a distributed agent-based solution for e-business, in which such agents have been used, is also presented and evaluated in this paper.

Expert Systems With Applications | 2013

An HMM-based over-sampling technique to improve text classification

Eva Lorenzo Iglesias; A. Seara Vieira; Lourdes Borrajo

This paper presents a novel over-sampling method based on document content to handle the class imbalance problem in text classification. The new technique, COS-HMM (Content-based Over-Sampling HMM), includes an HMM that is trained with a corpus in order to create new samples according to current documents. The HMM is treated as a document generator which can produce synthetic instances based on what it was trained with. To demonstrate its effectiveness, COS-HMM is tested with a Support Vector Machine (SVM) on two medical document corpora (OHSUMED and TREC Genomics), and is then compared with the Random Over-Sampling (ROS) and SMOTE techniques. Results suggest that the application of over-sampling strategies increases the global performance of the SVM in classifying documents. Based on the empirical and statistical studies, the new method clearly outperforms the baseline method (ROS), and offers greater performance than SMOTE in the majority of tested cases.
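For readers unfamiliar with the ROS baseline named in this abstract, the following is a minimal sketch of random over-sampling in general, not the authors' implementation and not COS-HMM itself. The function name `random_over_sample`, the toy documents, and the fixed seed are assumptions made purely for illustration.

```python
import random
from collections import Counter


def random_over_sample(docs, labels, seed=0):
    """Duplicate randomly chosen documents of the smaller classes
    until every class has as many samples as the largest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    counts = Counter(labels)
    target = max(counts.values())
    out_docs, out_labels = list(docs), list(labels)
    for cls, n in counts.items():
        # All original documents belonging to this class.
        pool = [d for d, y in zip(docs, labels) if y == cls]
        for _ in range(target - n):
            out_docs.append(rng.choice(pool))
            out_labels.append(cls)
    return out_docs, out_labels


# Hypothetical toy corpus: four non-relevant documents, one relevant.
docs = ["doc a", "doc b", "doc c", "doc d", "doc e"]
labels = ["neg", "neg", "neg", "neg", "pos"]
balanced_docs, balanced_labels = random_over_sample(docs, labels)
```

Because ROS only repeats existing documents, it adds no new content to the minority class; content-based generators such as the one described in the abstract aim to address that limitation.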

Computer Methods and Programs in Biomedicine | 2016

Improving the text classification using clustering and a novel HMM to reduce the dimensionality

A. Seara Vieira; Lourdes Borrajo; Eva Lorenzo Iglesias

In text classification problems, the representation of a document has a strong impact on the performance of learning systems. The high dimensionality of the classical structured representations can lead to burdensome computations due to the great size of real-world data. Consequently, there is a need to reduce the quantity of handled information to improve the classification process. In this paper, we propose a method to reduce the dimensionality of a classical text representation based on a clustering technique to group documents, and a previously developed Hidden Markov Model to represent them. We ran tests with the k-NN and SVM classifiers on the OHSUMED and TREC benchmark text corpora using the proposed dimensionality reduction technique. The experimental results obtained are very satisfactory compared to commonly used techniques like InfoGain, and the statistical tests performed demonstrate the suitability of the proposed technique for the preprocessing step in a text classification task.
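As a point of reference for the InfoGain technique that this abstract uses as a comparison, here is a small sketch of information gain for a single term over a labelled corpus. It illustrates the standard definition only, not the paper's clustering and HMM method; the helper names `entropy` and `information_gain` and the toy data are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in class entropy obtained by splitting the corpus
    on the presence or absence of `term`. Each document is a set of terms."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    # Weighted entropy of the two partitions; empty partitions are skipped.
    conditional = sum(len(part) / total * entropy(part)
                      for part in (with_term, without_term) if part)
    return entropy(labels) - conditional


# Hypothetical toy corpus represented as sets of terms.
docs = [{"gene", "protein"}, {"gene", "cell"},
        {"market", "cell"}, {"market", "price"}]
labels = ["bio", "bio", "other", "other"]

gain_gene = information_gain(docs, labels, "gene")
gain_cell = information_gain(docs, labels, "cell")
```

A feature-selection step of this kind would typically keep only the terms with the highest gain; in the toy data, "gene" separates the classes perfectly while "cell" carries no class information.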

Applied Soft Computing | 2015

TCBR-HMM

Lourdes Borrajo; A. Seara Vieira; Eva Lorenzo Iglesias

Highlights: The paper presents an innovative solution to model distributed adaptive systems in biomedical environments. A Case Based Reasoning system with an original Hidden Markov Model for biomedical text classification is proposed. The model classifies scientific documents by their content, taking into account the relevance of words. The model is able to adapt to new documents in an iterative learning frame. The model is tested with the SVM and k-NN classifiers using the OHSUMED scientific collection. Empirical and statistical results show the method outperforms other efficient text classifiers.

This paper presents an innovative solution to model distributed adaptive systems in biomedical environments. We present an original TCBR-HMM (Text Case Based Reasoning-Hidden Markov Model) for biomedical text classification based on document content. The main goal is to propose a more effective classifier than current methods in this environment, where the model needs to be adapted to new documents in an iterative learning frame. To demonstrate its effectiveness, we include a set of experiments performed on the OHSUMED corpus. Our classifier is compared with Naive Bayes and SVM techniques, commonly used in text classification tasks. The results suggest that the TCBR-HMM model is indeed more suitable for document classification. The model is empirically and statistically comparable to the SVM classifier and outperforms it in terms of time efficiency.

PACBB | 2014

BioClass: A Tool for Biomedical Text Classification

R. Romero; A. Seara Vieira; Eva Lorenzo Iglesias; Lourdes Borrajo

Traditional search engines are not efficient enough to extract useful information from scientific text databases. Therefore, it is necessary to develop advanced information retrieval software tools that allow for further classification of the scientific texts. The aim of this work is to present BioClass, a freely available graphic tool for biomedical text classification. With BioClass a user can parameterize, train and test different text classifiers to determine which technique performs better according to the document corpus. The framework includes data balancing and attribute reduction techniques to prepare the input data and improve the classification efficiency. Classification methods analyze documents by content and differentiate those that are best suited to the user requirements. BioClass also offers graphical interfaces to draw conclusions simply and easily.

international conference on web engineering | 2003

Agent-based web engineering

Juan M. Corchado; Rosalía Laza; Lourdes Borrajo; J. C. Yañez; A. de Luis; M. Gonzalez-Bedia

Technological evolution of the Internet world is fast and constant. Successful systems should have the capacity to adapt to it and should be provided with mechanisms that allow them to decide what to do according to such changes. This paper shows how an autonomous intelligent agent can be used to develop web-based systems with the requirements of today's users. Internet applications should be reactive, proactive, and autonomous and have to be capable of adapting to changes in their environment and in user behavior. The technological proposal presented in the paper also facilitates the interoperability and scalability of distributed systems.

distributed computing and artificial intelligence | 2011

Building Biomedical Text Classifiers under Sample Selection Bias

R. Romero; Eva Lorenzo Iglesias; Lourdes Borrajo

Scientific papers are a primary source of information for investigators to learn the current status of a topic or to compare their results with those of colleagues. However, mining biomedical texts remains a great challenge because of the huge volume of scientific literature stored in public databases and its imbalanced nature, with only a very small number of papers relevant to each user query. Classifying in the presence of data imbalances presents a great challenge to machine learning. Techniques such as support-vector machines (SVMs) have excellent performance for balanced data, but may fail when applied to imbalanced datasets. In this paper, we study the effects of undersampling, resampling and subsampling balancing strategies on four different biomedical text classifiers (with linear, sigmoid, exponential and polynomial SVM kernels, respectively). The best results were obtained by normalized linear and sigmoid kernels using the subsampling balancing technique. These results have been compared with those obtained by other authors using the TREC Genomics 2005 public corpus.

distributed computing and artificial intelligence | 2015

A HMM text classification model with learning capacity

Eva Lorenzo Iglesias; Lourdes Borrajo; R. Romero

In this paper, a method of classifying biomedical text documents based on a Hidden Markov Model is proposed and evaluated. The method is integrated into a framework named BioClass. BioClass is composed of intelligent text classification tools and facilitates the comparison between them because it has several views of the results. The main goal is to propose a more effective content-based classifier than current methods in this environment. To test the effectiveness of the classifier presented, a set of experiments performed on the OHSUMED corpus is presented. Our model is tested with and without learning capacity, and it is compared with other classification techniques. The results suggest that the adaptive HMM model is indeed more suitable for document classification.

PACBB | 2012

A Comparative Analysis of Balancing Techniques and Attribute Reduction Algorithms

R. Romero; Eva Lorenzo Iglesias; Lourdes Borrajo

In this study we analyze several data balancing techniques and attribute reduction algorithms and their impact on the information retrieval process. Specifically, we study their performance when used in biomedical text classification using Support Vector Machines (SVMs) based on Linear, Radial, Polynomial and Sigmoid kernels. From experiments on the TREC Genomics 2005 biomedical text public corpus we conclude that these techniques are necessary to improve the classification process. Kernels showed some improvement in their results when attribute reduction algorithms were used. Moreover, if balancing techniques and attribute reduction algorithms are applied, results obtained with oversampling are better than those obtained with subsampling.

distributed computing and artificial intelligence | 2009

Classification of MedLine Documents Using MeSH Terms

Daniel Glez-Peña; Sira López; Reyes Pavón; Rosalía Laza; Eva Lorenzo Iglesias; Lourdes Borrajo

Text classification is becoming an interesting research field due to the increased availability of documents in digital form, which need to be organized. The machine learning paradigm is usually applied to text classification, according to which a general inductive process automatically builds a text classifier from a set of pre-classified documents. In this paper we investigate the application of Bayesian networks to classify MedLine documents, where each document is identified by a set of MeSH ontology terms. Bayesian networks have been selected for their ability to describe conditional independencies between variables and provide clear methodologies for learning from observations. Our experimental evaluation of these ideas is based on the relevance judgments of the 2004 TREC workshop Genomics track.
