Publication


Featured research published by Reynaldo Gil-García.


Pattern Recognition Letters | 2010

Dynamic hierarchical algorithms for document clustering

Reynaldo Gil-García; Aurora Pons-Porrata

In this paper, two clustering algorithms called dynamic hierarchical compact and dynamic hierarchical star are presented. Both methods aim to construct a cluster hierarchy while dealing with dynamic data sets. The first creates disjoint hierarchies of clusters, while the second obtains overlapped hierarchies. The experimental results on several benchmark text collections show that these methods are not only suitable for producing hierarchical clustering solutions in dynamic environments effectively and efficiently, but also offer hierarchies that are easier to browse than those of traditional algorithms. Therefore, we advocate their use for tasks that require dynamic clustering, such as information organization, creation of document taxonomies and hierarchical topic detection.


Iberoamerican Congress on Pattern Recognition | 2003

Extended Star Clustering Algorithm

Reynaldo Gil-García; José Manuel Badía-Contelles; Aurora Pons-Porrata

In this paper we propose the extended star clustering algorithm and compare it with the original star clustering algorithm. We introduce a new concept of star and, as a consequence, obtain different star-shaped clusters. The evaluation experiments on TREC data show that the proposed algorithm outperforms the original algorithm. Our algorithm is independent of the data order and obtains a smaller number of clusters.
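
For orientation, the original star clustering algorithm that this paper uses as its baseline can be sketched as a greedy cover of a thresholded similarity graph. This is a minimal sketch of that baseline as it is commonly described, not the extended algorithm: the abstract does not define the new concept of star, so it is not reproduced here. The function names, the threshold name `sigma`, and the index-based tie-breaking are illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple


def build_threshold_graph(sim: List[List[float]], sigma: float) -> Dict[int, Set[int]]:
    """Connect two documents when their similarity is at least sigma."""
    n = len(sim)
    adj: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= sigma:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def star_clustering(adj: Dict[int, Set[int]]) -> List[Tuple[int, Set[int]]]:
    """Greedily cover the graph with stars (a center plus its neighbors).

    Returns (center, members) pairs. A document may appear in several
    stars, so the resulting clusters can overlap.
    """
    unmarked = set(adj)
    clusters: List[Tuple[int, Set[int]]] = []
    while unmarked:
        # Pick the unmarked vertex of highest degree. Ties are broken by the
        # smallest index (an assumption of this sketch); tie-breaking is one
        # place where the result can depend on the order of the data.
        center = max(unmarked, key=lambda v: (len(adj[v]), -v))
        members = {center} | adj[center]
        clusters.append((center, members))
        unmarked -= members
    return clusters
```

Calling `star_clustering(build_threshold_graph(sim, sigma))` on a symmetric similarity matrix yields one star per chosen center; lowering `sigma` makes the graph denser and generally produces fewer, larger stars.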


International Conference on Pattern Recognition | 2006

A General Framework for Agglomerative Hierarchical Clustering Algorithms

Reynaldo Gil-García; José Manuel Badía-Contelles; Aurora Pons-Porrata

This paper presents a general framework for agglomerative hierarchical clustering based on graphs. Different hierarchical agglomerative clustering algorithms can be obtained from this framework by specifying an inter-cluster similarity measure, a subgraph of the β-similarity graph, and a cover routine. We also describe two methods obtained from this framework called hierarchical compact algorithm and hierarchical star algorithm. These algorithms have been evaluated using standard document collections. The experimental results show that our methods are faster and obtain smaller hierarchies than traditional hierarchical algorithms while achieving a similar clustering quality.


Iberoamerican Congress on Pattern Recognition | 2007

Using typical testors for feature selection in text categorization

Aurora Pons-Porrata; Reynaldo Gil-García; Rafael Berlanga-Llavori

A major difficulty of text categorization problems is the high dimensionality of the feature space. Thus, feature selection is often performed in order to increase both the efficiency and effectiveness of the classification. In this paper, we propose a feature selection method based on Testor Theory. This criterion takes into account inter-feature relationships. We experimentally compared our method with the widely used information gain using two well-known classification algorithms: k-nearest neighbour and Support Vector Machine. Two benchmark text collections were chosen as the testbeds: Reuters-21578 and Reuters Corpus Version 1 (RCV1-v2). We found that our method consistently outperformed information gain for both classifiers and both data collections, especially when aggressive feature selection is carried out.
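
The information gain baseline mentioned in this abstract can be illustrated with a short sketch. It scores a term by how much knowing its presence or absence reduces the entropy of the category labels, then keeps the top-ranked terms. This shows only the baseline criterion, not the testor-based method, whose details are not given here. The set-of-terms document representation and the function names are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence, Set


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of category labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs: List[Set[str]], labels: List[Hashable], term: str) -> float:
    """H(C) minus the expected entropy after splitting on term presence."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def select_top_terms(docs: List[Set[str]], labels: List[Hashable], k: int) -> List[str]:
    """Rank the vocabulary by information gain and keep the k best terms."""
    vocabulary = set().union(*docs)
    # Sort by descending gain; equal gains fall back to alphabetical order.
    ranked = sorted(vocabulary, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]
```

Note that this criterion scores each term in isolation, which is the limitation the abstract contrasts with a criterion that takes inter-feature relationships into account.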


Iberoamerican Congress on Pattern Recognition | 2008

Hierarchical Star Clustering Algorithm for Dynamic Document Collections

Reynaldo Gil-García; Aurora Pons-Porrata

In this paper, a new clustering algorithm called Dynamic Hierarchical Star is introduced. Our approach aims to construct a hierarchy of overlapped clusters, dealing with dynamic data sets. The experimental results on several benchmark text collections show that this method obtains smaller hierarchies than traditional algorithms while achieving a similar clustering quality. Therefore, we advocate its use for tasks that require dynamic overlapped clustering, such as information organization, creation of document taxonomies and hierarchical topic detection.


Iberoamerican Congress on Pattern Recognition | 2006

A new nearest neighbor rule for text categorization

Reynaldo Gil-García; Aurora Pons-Porrata

The nearest neighbor (NN) rule is usually chosen in a large number of pattern recognition systems due to its simplicity and good properties. In particular, this rule has been successfully applied to text categorization. A vast number of NN algorithms have been developed in recent years. They differ in how they find the nearest neighbors, how they obtain the votes of categories, and which decision rule they use. A new NN classification rule, which comes from the use of a different definition of neighborhood, is introduced in this paper. The experimental results on the standard Reuters-21578 benchmark collection show that our algorithm achieves better classification rates than the k-NN rule while decreasing classification time.
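
The three design choices listed in this abstract (how neighbors are found, how category votes are obtained, and which decision rule is used) can be seen in a minimal sketch of the standard k-NN baseline. This is not the new rule proposed in the paper, whose neighborhood definition is not given here. The sketch assumes an exhaustive search, cosine similarity over sparse term-weight dictionaries, similarity-weighted votes, and a highest-vote decision; all names are illustrative.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Vector = Dict[str, float]  # sparse document vector: term -> weight


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query: Vector, training: List[Tuple[Vector, str]], k: int) -> str:
    """Classify query by a similarity-weighted vote among its k nearest neighbors."""
    if k < 1 or not training:
        raise ValueError("k must be positive and training must be non-empty")
    # 1. Find neighbors: exhaustive scan, ranked by similarity to the query.
    scored = [(cosine(query, vec), label) for vec, label in training]
    neighbors = sorted(scored, key=lambda item: item[0], reverse=True)[:k]
    # 2. Obtain votes: each neighbor adds its similarity to its category.
    votes: Dict[str, float] = defaultdict(float)
    for similarity, label in neighbors:
        votes[label] += similarity
    # 3. Decision rule: highest total vote; ties resolved alphabetically.
    return max(sorted(votes), key=lambda label: votes[label])
```

The exhaustive scan in step 1 computes one similarity per training document, which is the cost that faster neighbor-search schemes try to reduce.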


Iberoamerican Congress on Pattern Recognition | 2004

Parallel Algorithm for Extended Star Clustering

Reynaldo Gil-García; José Manuel Badía-Contelles; Aurora Pons-Porrata

In this paper we present a new parallel clustering algorithm based on the extended star clustering method. This algorithm can be used, for example, to cluster massive data sets of documents on distributed memory multiprocessors. The algorithm exploits the inherent data-parallelism in the extended star clustering algorithm. We implemented our algorithm on a cluster of personal computers connected through a Myrinet network. The code is portable to different architectures and it uses the MPI message-passing library. The experimental results show that the parallel algorithm clearly improves on its sequential version for large data sets. We show that the speedup of our algorithm approaches the optimum as the number of objects increases.


Iberoamerican Congress on Pattern Recognition | 2005

Dynamic hierarchical compact clustering algorithm

Reynaldo Gil-García; José Manuel Badía-Contelles; Aurora Pons-Porrata

In this paper we introduce a general framework for hierarchical clustering that deals with both static and dynamic data sets. From this framework, different hierarchical agglomerative algorithms can be obtained, by specifying an inter-cluster similarity measure, a subgraph of the β-similarity graph, and a cover algorithm. A new clustering algorithm called Hierarchical Compact Algorithm and its dynamic version are presented, which are specific versions of the proposed framework. Our evaluation experiments on several standard document collections show that this algorithm requires less computational time than standard methods in dynamic data sets while achieving a comparable or even better clustering quality. Therefore, we advocate its use for tasks that require dynamic clustering, such as information organization, creation of document taxonomies and hierarchical topic detection.
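
The three ingredients named in this abstract (an inter-cluster similarity measure, a subgraph of the β-similarity graph, and a cover algorithm) can be pictured with a small static sketch. It is only one possible instantiation under stated assumptions, not the Hierarchical Compact Algorithm itself: group-average similarity is assumed as the measure, the full β-similarity graph is used as the subgraph, and connected components serve as the cover. All function names are hypothetical, and the dynamic (incremental) updating is not shown.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Set

Cluster = FrozenSet[int]


def beta_similarity_graph(clusters: List[Cluster],
                          inter_sim: Callable[[Cluster, Cluster], float],
                          beta: float) -> Dict[int, Set[int]]:
    """Link two clusters when their inter-cluster similarity is at least beta."""
    adj: Dict[int, Set[int]] = {i: set() for i in range(len(clusters))}
    for i, j in combinations(range(len(clusters)), 2):
        if inter_sim(clusters[i], clusters[j]) >= beta:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def connected_components(adj: Dict[int, Set[int]]) -> List[Set[int]]:
    """Cover routine assumed in this sketch: connected components of the graph."""
    seen: Set[int] = set()
    components: List[Set[int]] = []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            component.add(v)
            stack.extend(adj[v] - seen)
        components.append(component)
    return components


def build_hierarchy(sim: List[List[float]], beta: float) -> List[List[Cluster]]:
    """Repeat graph construction and covering until no clusters merge."""
    def group_average(a: Cluster, b: Cluster) -> float:
        return sum(sim[x][y] for x in a for y in b) / (len(a) * len(b))

    level: List[Cluster] = [frozenset([i]) for i in range(len(sim))]
    hierarchy = [level]
    while len(level) > 1:
        cover = connected_components(beta_similarity_graph(level, group_average, beta))
        if len(cover) == len(level):  # no edge was created, so nothing merges
            break
        level = [frozenset().union(*(level[i] for i in part)) for part in cover]
        hierarchy.append(level)
    return hierarchy
```

Each entry of the returned list is one level of the hierarchy, from singletons upward. Swapping `group_average` or `connected_components` for another measure or cover routine gives a different algorithm within the same overall loop, which is the sense in which the abstract speaks of a framework.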


International Conference on Pattern Recognition | 2010

A High-Dimensional Access Method for Approximated Similarity Search in Text Mining

Fernando José Artigas-Fuentes; Reynaldo Gil-García; José Manuel Badía-Contelles

In this paper, a new access method for very high-dimensional data spaces is proposed. The method uses a graph structure and pivots for indexing objects, such as documents in text mining. It also applies a simple search algorithm that uses distance- or similarity-based functions in order to obtain the k-nearest neighbors for novel query objects. This method shows good selectivity over very high-dimensional data spaces, and better performance than other state-of-the-art methods. Although it is a probabilistic method, it shows a low error rate. The method is evaluated on data sets from the well-known collection Reuters Corpus Version 1 (RCV1-v2), which involve thousands of dimensions.


Iberoamerican Congress on Pattern Recognition | 2010

Fast k-NN classifier for documents based on a graph structure

Fernando José Artigas-Fuentes; Reynaldo Gil-García; José Manuel Badía-Contelles; Aurora Pons-Porrata

In this paper, a fast k-nearest neighbors (k-NN) classifier for documents is presented. Documents are usually represented in a high-dimensional feature space, where their terms are treated as features and the weight of each term reflects its importance in the document. There are many approaches to finding the vicinity of an object, but their performance decreases drastically as the number of dimensions grows. This problem prevents their application to documents. The proposed method is based on a graph index structure with a fast search algorithm. Its high selectivity makes it possible to obtain a classification quality similar to that of the exhaustive classifier while computing only a small number of distances. Our experimental results show that our method can be applied to problems of very high dimensionality, such as text mining.
