José Francisco Martínez-Trinidad
National Institute of Astrophysics, Optics and Electronics
Publication
Featured research published by José Francisco Martínez-Trinidad.
International Conference on Computational Linguistics | 2006
René Arnulfo García-Hernández; José Francisco Martínez-Trinidad; Jesús Ariel Carrasco-Ochoa
Sequential pattern mining is an important tool for solving many data mining tasks, and it has broad applications. However, only a few efforts have been made to extract this kind of pattern from a textual database. Finding these textual patterns is important for text mining because they can be extracted from text independently of the language. They are also human-readable patterns or descriptors of the text, and they preserve the sequential order of the words in the document. However, the problem of discovering sequential patterns in a database of documents has special characteristics that make it intractable for most apriori-like candidate-generation-and-test approaches. Recent studies indicate that the pattern-growth methodology could speed up sequential pattern mining. In this paper we propose a pattern-growth based algorithm (DIMASP) to discover all the maximal sequential patterns in a document database. Furthermore, DIMASP is incremental and independent of the support threshold. Finally, we compare the performance of DIMASP against the GSP, DELISP, GenPrefixSpan and cSPADE algorithms.
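To make the notion of a maximal sequential pattern concrete, the following is a minimal brute-force sketch. It is not DIMASP. It assumes that a pattern is a contiguous word sequence and that support is the number of documents containing it; the function names and the toy documents are hypothetical.

```python
from collections import defaultdict

def frequent_ngrams(docs, min_support):
    """Map each contiguous word sequence to its document support,
    keeping only those that appear in at least min_support documents."""
    occurrences = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for i in range(len(words)):
            for j in range(i + 1, len(words) + 1):
                occurrences[tuple(words[i:j])].add(doc_id)
    return {p: len(ids) for p, ids in occurrences.items()
            if len(ids) >= min_support}

def is_contiguous_subsequence(short, long):
    """True if `short` occurs as a contiguous run inside `long`."""
    n = len(short)
    return any(long[k:k + n] == short for k in range(len(long) - n + 1))

def maximal_patterns(docs, min_support):
    """Keep frequent patterns that are not contained in a longer frequent one."""
    freq = frequent_ngrams(docs, min_support)
    return {p: s for p, s in freq.items()
            if not any(p != q and is_contiguous_subsequence(p, q) for q in freq)}

# Toy corpus (assumed for illustration only).
docs = ["the cat sat on the mat", "the cat sat down", "a dog sat on the mat"]
print(maximal_patterns(docs, min_support=2))
```

This enumeration is quadratic in document length per document, which is the kind of cost that candidate-generation approaches incur and that pattern-growth methods aim to avoid.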
Knowledge and Information Systems | 2011
Ansel Y. Rodríguez-González; José Francisco Martínez-Trinidad; Jesús Ariel Carrasco-Ochoa; José Ruiz-Shulcloper
Most current algorithms for mining frequent patterns assume that two object subdescriptions are similar if they are equal, but many real-world problems use other ways to evaluate similarity. Recently, three algorithms (ObjectMiner, STreeDC-Miner and STreeNDC-Miner) have been proposed for mining frequent patterns with similarity functions other than equality. To search for frequent patterns, ObjectMiner and STreeDC-Miner use a pruning property called the Downward Closure property, which the similarity function must satisfy. The STreeNDC-Miner algorithm was proposed for similarity functions that do not meet this property. However, to search for frequent patterns, this algorithm explores all subsets of features, which can be very expensive. In this work, we propose a frequent similar pattern mining algorithm for similarity functions that do not meet the Downward Closure property, which is faster than STreeNDC-Miner and loses fewer frequent similar patterns than ObjectMiner and STreeDC-Miner. We also compare, in a supervised classification context, the quality of the set of frequent similar patterns computed by our algorithm with the quality of the sets computed by the other algorithms.
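The Downward Closure property can be illustrated with a small sketch. It assumes that the support of a subdescription is the number of objects similar to it on a given feature subset, and it uses a hypothetical mean-difference similarity (`mean_close`) that is not taken from the paper. With equality, support never grows when features are added; with the averaged similarity, it can.

```python
def support(dataset, pattern, features, similar):
    """Count objects whose values on `features` are similar to `pattern`."""
    return sum(
        similar([obj[f] for f in features], [pattern[f] for f in features])
        for obj in dataset
    )

def equal(u, v):
    """Equality-based similarity: satisfies downward closure."""
    return u == v

def mean_close(u, v, tol=2.0):
    """Hypothetical similarity: mean absolute difference within `tol`."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u) <= tol

# Toy data (assumed for illustration only).
dataset = [{"a": 0, "b": 0}, {"a": 3, "b": 0}, {"a": 0, "b": 0}]
pattern = {"a": 0, "b": 0}

# Equality: support on {a, b} cannot exceed support on {a}.
print(support(dataset, pattern, ["a"], equal),
      support(dataset, pattern, ["a", "b"], equal))

# Averaged similarity: the larger feature set has the higher support,
# so pruning supersets of an infrequent subset would lose patterns.
print(support(dataset, pattern, ["a"], mean_close),
      support(dataset, pattern, ["a", "b"], mean_close))
```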
Iberoamerican Congress on Pattern Recognition | 2013
Julio Hernandez; Jesús Ariel Carrasco-Ochoa; José Francisco Martínez-Trinidad
Instance selection methods achieve low accuracy on problems with imbalanced databases. In the literature, the problem of imbalanced databases has been tackled by applying oversampling or undersampling methods. In this paper, we therefore present an empirical study of the use of oversampling and undersampling methods to improve the accuracy of instance selection methods on imbalanced databases. We apply different oversampling and undersampling methods jointly with instance selectors over several public imbalanced databases. Our experimental results show that using oversampling and undersampling methods significantly improves the accuracy for the minority class.
Knowledge Discovery and Data Mining | 2010
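As a point of reference, the simplest form of oversampling can be sketched as follows. The abstract does not say which oversampling methods were evaluated, so random duplication of minority instances is an assumption here, and `random_oversample` is a hypothetical helper.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen instances of each smaller class
    until every class is as large as the majority class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

# Toy imbalanced data (assumed): four majority instances, one minority.
X = [[0], [1], [2], [3], [10]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

An instance selector would then be run on the balanced set rather than on the original one.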
Milton García-Borroto; José Francisco Martínez-Trinidad; Jesús Ariel Carrasco-Ochoa
Obtaining an accurate class prediction of a query object is an important component of supervised classification. However, it can also be important to understand the classification in terms of the application domain, especially if the prediction disagrees with the expected results. Many accurate classifiers are unable to explain their classification results in terms that an application expert can understand. Emerging Pattern classifiers, on the other hand, are accurate and easy to understand. However, they have two characteristics that could degrade their accuracy: global discretization of numerical attributes and high sensitivity to the support threshold value. In this paper, we introduce a novel algorithm to find emerging patterns without global discretization, which uses an accurate estimation of the support threshold. Experimental results show that our classifier attains higher accuracy than other understandable classifiers, while being competitive with Nearest Neighbors and Support Vector Machines classifiers.
Iberoamerican Congress on Pattern Recognition | 2008
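For readers unfamiliar with emerging patterns, the sketch below shows the standard growth-rate definition: a pattern is emerging for a class when its support there is much higher than in the other classes. It covers only that definition on categorical items and does not reproduce the algorithm introduced in the paper; the data and function names are assumed.

```python
def support(pattern, rows):
    """Fraction of rows (sets of items) that contain every item of `pattern`."""
    if not rows:
        return 0.0
    return sum(pattern <= row for row in rows) / len(rows)

def growth_rate(pattern, target_rows, other_rows):
    """Ratio of the pattern's support in the target class to its support elsewhere."""
    s_target = support(pattern, target_rows)
    s_other = support(pattern, other_rows)
    if s_other == 0:
        return float("inf") if s_target > 0 else 0.0
    return s_target / s_other

# Toy data (assumed): each row is the set of attribute values of one object.
target = [{"red", "round"}, {"red", "long"}, {"red", "round"}, {"green", "round"}]
other = [{"green", "long"}, {"red", "long"}, {"green", "round"}, {"green", "long"}]

print(growth_rate({"red"}, target, other))
print(growth_rate({"red", "round"}, target, other))
```

A pattern with infinite growth rate appears only in the target class, which is what makes such patterns easy for a domain expert to read as evidence for that class.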
Ansel Y. Rodríguez-González; José Francisco Martínez-Trinidad; Jesús Ariel Carrasco-Ochoa; José Ruiz-Shulcloper
Frequent Pattern Mining is an important task due to the relevance of repetitions in data; it is also a fundamental step in Association Rule Mining. Most current algorithms for mining frequent patterns assume that two object subdescriptions are similar if and only if they are equal, but in soft sciences other similarity functions are used. In this work, we focus on the search for frequent patterns in Mixed Data, incorporating similarity between objects. We propose a novel and efficient algorithm to mine frequent similar patterns for a family of similarity functions that fulfill the Downward Closure property, and we also propose another algorithm for the remaining families of similarity functions. Experiments over mixed datasets are carried out, and the results are compared against the ObjectMiner algorithm.
Iberoamerican Congress on Pattern Recognition | 2012
Miriam Mónica Duarte-Villaseñor; Jesús Ariel Carrasco-Ochoa; José Francisco Martínez-Trinidad; Marisol Flores-Garrido
Multiclass problems, i.e., classification problems involving more than two classes, are a common scenario in supervised classification. An important approach to solving this type of problem consists of applying binary classifiers repeatedly; nested dichotomies fall within this category. However, most methods for building nested dichotomies use a random strategy, which does not guarantee finding a good one. In this work, we propose new non-random methods for building nested dichotomies, based on the idea of reducing misclassification errors by separating, in the higher levels, the classes that are easier to separate and, in the lower levels, the classes that are more difficult to separate. To evaluate the performance of the proposed methods, we compare them against methods that randomly build nested dichotomies, using datasets (with mixed data) taken from the UCI repository.
Iberoamerican Congress on Pattern Recognition | 2013
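The idea of separating the easiest classes first can be sketched as a recursive split of the class set. The abstract does not state how separability is measured, so the sketch assumes a user-supplied distance between classes (here, a distance between one-dimensional centroids); `build_dichotomy` and the centroid values are hypothetical.

```python
def build_dichotomy(classes, dist):
    """Recursively split `classes` into a nested pair structure.
    At each level the two most distant classes seed the two branches,
    and every other class joins the nearer seed."""
    if len(classes) == 1:
        return classes[0]
    pairs = ((p, q) for i, p in enumerate(classes) for q in classes[i + 1:])
    a, b = max(pairs, key=lambda pq: dist(pq[0], pq[1]))
    left, right = [a], [b]
    for c in classes:
        if c in (a, b):
            continue
        (left if dist(c, a) <= dist(c, b) else right).append(c)
    return (build_dichotomy(left, dist), build_dichotomy(right, dist))

# Toy class centroids (assumed): A and B are close, C and D are close.
centroids = {"A": 0.0, "B": 1.0, "C": 10.0, "D": 11.0}
dist = lambda p, q: abs(centroids[p] - centroids[q])

print(build_dichotomy(["A", "B", "C", "D"], dist))
```

Each internal pair corresponds to one binary classifier; the well-separated groups {A, B} and {C, D} are split at the root, and the harder decisions are deferred to the lower levels.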
Laura Alejandra Pinilla-Buitrago; José Francisco Martínez-Trinidad; Jesús Ariel Carrasco-Ochoa
Optimal Subsequence Bijection (OSB) is a method for comparing two sequences of endnodes of two skeleton graphs that represent articulated shapes in 2D images. The OSB dissimilarity function uses a constant penalty cost for all endnodes that do not match between the two skeleton graphs; this can be a problem, especially when there is a large number of unmatched endnodes. In this paper, a new penalty scheme for OSB is proposed, which assigns variable penalties to endnodes that do not match between two skeleton graphs. The experimental results show that the new penalty scheme improves the results on supervised classification compared with the original OSB.
Iberoamerican Congress on Pattern Recognition | 2013
Niusvel Acosta-Mendoza; Andrés Gago-Alonso; Jesús Ariel Carrasco-Ochoa; José Francisco Martínez-Trinidad; José E. Medina-Pagola
Feature selection is an essential preprocessing step for classifiers with high dimensional training sets. In pattern recognition, feature selection improves classification performance by reducing the feature space while preserving the classification capabilities of the original feature space. Image classification using frequent approximate subgraph mining (FASM) is an example where the benefits of feature selection are needed, because using frequent approximate subgraphs (FASs) leads to high dimensional representations. In this paper, we explore the use of feature selection algorithms to reduce the representation of an image collection represented through FASs. In our results, we report a dimensionality reduction of over 50% of the original features, with classification results similar to those obtained using all the features.
Engineering Applications of Artificial Intelligence | 2016
Niusvel Acosta-Mendoza; Andrés Gago-Alonso; Jesús Ariel Carrasco-Ochoa; José Francisco Martínez-Trinidad; José E. Medina-Pagola
In recent years, frequent approximate subgraph (FAS) mining has been used for image classification. However, using FASs leads to a high dimensional representation. To solve this problem, in this paper we propose using emerging patterns to reduce the dimensionality of the image representation in this approach. With our proposal, a dimensionality reduction of over 50% of the original patterns is achieved; additionally, better classification results are obtained. Highlights: We combine FASs with emerging patterns for image classification. To the best of our knowledge, this is the first work that proposes such a combination. A dimensionality reduction of over 50% of the original patterns is achieved. Improvements in classification results are achieved.
Iberoamerican Congress on Pattern Recognition | 2013
Milton García-Borroto; Octavio Loyola-González; José Francisco Martínez-Trinidad; Jesús Ariel Carrasco-Ochoa
Contrast pattern miners and contrast pattern classifiers typically use a quality measure to evaluate the discriminative power of a pattern. Since many quality measures exist, it is important to perform comparative studies among them. Nevertheless, previous studies mostly compare measures based on how they impact classification accuracy. In this paper, we introduce a comparative study of quality measures over different aspects: accuracy using the whole training set, accuracy using pattern subsets, and accuracy and compression for filtering patterns. Experiments with 10 quality measures on 25 repository databases show that there is a strong correlation among different quality measures and that the most accurate quality measures are not appropriate in contexts like pattern filtering.