Aurora Pons-Porrata
Universidad de Oriente
Publications
Featured research published by Aurora Pons-Porrata.
Information Processing and Management | 2007
Aurora Pons-Porrata; Rafael Berlanga-Llavori; José Ruiz-Shulcloper
In this paper, we present a topic discovery system aimed at revealing the implicit knowledge present in news streams. This knowledge is expressed as a hierarchy of topics and subtopics, where each topic contains the set of documents related to it and a summary extracted from these documents. The summaries built in this way are useful for browsing and selecting topics of interest from the generated hierarchies. Our proposal consists of a new incremental hierarchical clustering algorithm, which combines partitional and agglomerative approaches and takes the main benefits of both. Finally, a new summarization method based on Testor Theory is proposed to build the topic summaries. Experimental results on the TDT2 collection demonstrate its usefulness and effectiveness not only as a topic detection system, but also as a classification and summarization tool.
Pattern Recognition Letters | 2010
Reynaldo Gil-García; Aurora Pons-Porrata
In this paper, two clustering algorithms called dynamic hierarchical compact and dynamic hierarchical star are presented. Both methods aim to construct a cluster hierarchy while dealing with dynamic data sets. The first creates disjoint hierarchies of clusters, while the second obtains overlapped hierarchies. The experimental results on several benchmark text collections show that these methods are not only suitable for producing hierarchical clustering solutions in dynamic environments effectively and efficiently, but also offer hierarchies that are easier to browse than those produced by traditional algorithms. Therefore, we advocate their use for tasks that require dynamic clustering, such as information organization, creation of document taxonomies and hierarchical topic detection.
Iberoamerican Congress on Pattern Recognition | 2003
Reynaldo Gil-García; José Manuel Badía-Contelles; Aurora Pons-Porrata
In this paper we propose the extended star clustering algorithm and compare it with the original star clustering algorithm. We introduce a new concept of star and, as a consequence, obtain different star-shaped clusters. The evaluation experiments on TREC data show that the proposed algorithm outperforms the original algorithm. Our algorithm is independent of the data order and obtains a smaller number of clusters.
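For context, the original star algorithm used as the baseline above is commonly described as a greedy cover of a thresholded similarity graph: the unmarked vertex of highest degree becomes a star center, and it and its neighbours are marked. The sketch below follows that general description only; it is not the extended algorithm of the paper, and the function name, the threshold and the toy similarity matrix are illustrative assumptions.

```python
def star_clustering(sim, threshold):
    """Greedy star cover of a thresholded similarity graph (baseline sketch).

    sim is a symmetric matrix of pairwise similarities; an edge joins two
    documents whose similarity is at least `threshold` (an assumed parameter).
    Returns a list of (center, members) pairs; clusters may overlap.
    """
    n = len(sim)
    neighbors = {
        i: {j for j in range(n) if j != i and sim[i][j] >= threshold}
        for i in range(n)
    }
    unmarked = set(range(n))
    clusters = []
    while unmarked:
        # Highest-degree unmarked vertex; ties go to the lowest index,
        # which is one way the result can depend on the data order.
        center = max(sorted(unmarked), key=lambda v: len(neighbors[v]))
        members = {center} | neighbors[center]
        clusters.append((center, members))
        unmarked -= members
    return clusters


# Toy data (hypothetical): edges at threshold 0.5 are 0-1, 0-2, 2-3, 3-4.
sim = [
    [1.0, 0.8, 0.6, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.0, 0.1],
    [0.6, 0.2, 1.0, 0.7, 0.3],
    [0.1, 0.0, 0.7, 1.0, 0.9],
    [0.0, 0.1, 0.3, 0.9, 1.0],
]
clusters = star_clustering(sim, 0.5)
```

On this toy matrix, documents 0, 2 and 3 tie for the highest degree, so which one becomes the first center depends on the tie-breaking rule; document 2 ends up in both resulting clusters.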
Pattern Recognition Letters | 2010
Henry Anaya-Sánchez; Aurora Pons-Porrata; Rafael Berlanga-Llavori
In this paper, we introduce a new clustering algorithm for discovering and describing the topics comprised in a text collection. Our proposal relies on both the most probable term pairs generated from the collection and the estimation of the topic homogeneity associated to these pairs. Topics and their descriptions are generated from those term pairs whose support sets are homogeneous enough for representing collection topics. Experimental results obtained over three benchmark text collections demonstrate the effectiveness and utility of this new approach.
Iberoamerican Congress on Pattern Recognition | 2009
Alexsey Lias-Rodríguez; Aurora Pons-Porrata
Typical testors are very useful in Pattern Recognition, especially for Feature Selection problems. The complexity of computing all typical testors of a training matrix has an exponential growth with respect to the number of features. Several methods that speed up the calculation of the set of all typical testors have been developed, but nowadays, there are still problems where this set is impossible to find. With this aim, a new external scale algorithm BR is proposed. The experimental results demonstrate that this method clearly outperforms the two best algorithms reported in the literature.
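To make the exponential cost concrete, the following brute-force sketch enumerates typical testors directly from their usual definition: a testor is a feature subset on which no two objects from different classes coincide, and a typical testor is a testor with no proper subset that is also a testor. This is not the BR algorithm; the function names and the toy training matrix are illustrative assumptions.

```python
from itertools import combinations


def is_testor(features, objects, labels):
    """True if no two objects of different classes agree on every chosen feature."""
    pairs = combinations(zip(objects, labels), 2)
    for (a, label_a), (b, label_b) in pairs:
        if label_a != label_b and all(a[f] == b[f] for f in features):
            return False
    return True


def typical_testors(objects, labels):
    """Enumerate all typical (minimal) testors by increasing subset size.

    Visits up to 2^n - 1 feature subsets, which is the exponential growth
    that faster algorithms try to avoid.
    """
    n_features = len(objects[0])
    found = []
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            # A superset of an already found testor cannot be typical.
            if any(set(t) <= set(subset) for t in found):
                continue
            if is_testor(subset, objects, labels):
                found.append(subset)
    return found


# Hypothetical training matrix: four objects, three features, two classes.
objects = [(0, 0, 1), (1, 1, 1), (0, 1, 0), (1, 0, 0)]
labels = [0, 0, 1, 1]
result = typical_testors(objects, labels)
```

Here feature 2 alone separates the classes, while features 0 and 1 do so only in combination, so the typical testors are {2} and {0, 1}.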
Iberoamerican Congress on Pattern Recognition | 2003
Aurora Pons-Porrata; José Ruiz-Shulcloper; Rafael Berlanga-Llavori
In this paper we propose an effective method to summarize document clusters. This method is based on Testor Theory, and it is applied to a group of newspaper articles in order to summarize the events that they describe. The method is also applicable to either a very large document collection or a very large document, in order to identify the main themes (topics) of the collection (or document) and to summarize them. The results obtained in the experiments demonstrate the usefulness of the proposed method.
Iberoamerican Congress on Pattern Recognition | 2008
Henry Anaya-Sánchez; Aurora Pons-Porrata; Rafael Berlanga-Llavori
In this paper, we introduce a new clustering algorithm for obtaining labeled document clusters that accurately identify the topics of a text collection. In order to determine the topics, our approach relies on both probable term pairs generated from the collection and the estimation of the topic homogeneity associated to term pair clusters. Experimental results obtained over two benchmark text collections demonstrate the utility of this new approach.
Iberoamerican Congress on Pattern Recognition | 2007
Aurora Pons-Porrata; Reynaldo Gil-García; Rafael Berlanga-Llavori
A major difficulty of text categorization problems is the high dimensionality of the feature space. Thus, feature selection is often performed in order to increase both the efficiency and effectiveness of the classification. In this paper, we propose a feature selection method based on Testor Theory. This criterion takes into account inter-feature relationships. We experimentally compared our method with the widely used information gain criterion using two well-known classification algorithms: k-nearest neighbour and Support Vector Machine. Two benchmark text collections were chosen as the testbeds: Reuters-21578 and Reuters Corpus Volume 1 (RCV1-v2). We found that our method consistently outperformed information gain for both classifiers and both data collections, especially when aggressive feature selection is carried out.
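The information gain baseline mentioned above scores each term on its own: IG(t) = H(C) - P(t) H(C | t) - P(not t) H(C | not t), where H is the entropy of the class labels. A minimal sketch of that standard formula follows, assuming documents are represented as sets of terms; the function names and the four-document toy corpus are illustrative assumptions, and this is the baseline, not the Testor-based method of the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` occurs in a document."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    return (
        entropy(labels)
        - len(present) / n * entropy(present)
        - len(absent) / n * entropy(absent)
    )


# Hypothetical toy corpus: each document is a set of terms.
docs = [{"oil", "price"}, {"oil", "export"}, {"match", "goal"}, {"goal", "price"}]
labels = ["econ", "econ", "sport", "sport"]
ig_oil = information_gain(docs, labels, "oil")
ig_price = information_gain(docs, labels, "price")
```

In this toy corpus "oil" occurs only in one class and so receives the maximum gain of 1 bit, while "price" is split evenly across both classes and receives 0. Because every term is scored independently, the measure cannot see relationships between features, which is the limitation the abstract contrasts with its own criterion.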
European Conference on Information Retrieval | 2003
Aurora Pons-Porrata; Rafael Berlanga-Llavori; José Ruiz-Shulcloper
In this paper we propose an incremental hierarchical clustering algorithm for on-line event detection. This algorithm is applied to a set of newspaper articles in order to discover the structure of topics and events that they describe. In the first level, articles with a high temporal-semantic similarity are clustered together into events. In the next levels of the hierarchy, these events are successively clustered so that composite events and topics can be discovered. The results obtained for the F1-measure and the Detection Cost demonstrate the validity of our algorithm for on-line event detection tasks.
Iberoamerican Congress on Pattern Recognition | 2008
Reynaldo Gil-García; Aurora Pons-Porrata
In this paper, a new clustering algorithm called Dynamic Hierarchical Star is introduced. Our approach aims to construct a hierarchy of overlapped clusters, dealing with dynamic data sets. The experimental results on several benchmark text collections show that this method obtains smaller hierarchies than traditional algorithms while achieving a similar clustering quality. Therefore, we advocate its use for tasks that require dynamic overlapped clustering, such as information organization, creation of document taxonomies and hierarchical topic detection.