Raúl Cruz-Barbosa
Polytechnic University of Catalonia
Publication
Featured research published by Raúl Cruz-Barbosa.
International Journal of Neural Systems | 2011
Raúl Cruz-Barbosa; Alfredo Vellido
Medical diagnosis can often be understood as a classification problem. In oncology, this typically involves differentiating between tumour types and grades, or some type of discrete outcome prediction. From the viewpoint of computer-based medical decision support, this classification requires the availability of accurate diagnoses of past cases as training target examples. Such labeled databases are scarce in most areas of oncology, and especially so in neuro-oncology. In such a context, semi-supervised learning oriented towards classification can be a sensible data modeling choice. In this study, semi-supervised variants of Generative Topographic Mapping, a model of the manifold learning family, are applied to two neuro-oncology problems: the diagnostic discrimination between different brain tumour pathologies, and the prediction of outcomes for a specific type of aggressive brain tumour. Their performance compared favorably with that of the alternative Laplacian Eigenmaps and Semi-Supervised SVM for Manifold Learning models in most of the experiments.
Pattern Recognition Letters | 2010
Raúl Cruz-Barbosa; Alfredo Vellido
We present a novel semi-supervised model, SS-Geo-GTM, which stems from a geodesic distance-based extension of Generative Topographic Mapping that prioritizes neighbourhood relationships along a generated manifold embedded in the observed data space. With this, it improves the trustworthiness and the continuity of the low-dimensional representations it provides, while behaving robustly in the presence of noise. In SS-Geo-GTM, the model prototypes are linked by the nearest neighbour to the data manifold constructed by Geo-GTM. The resulting proximity graph is used as the basis for a class label propagation algorithm. The performance of SS-Geo-GTM is experimentally assessed, comparing positively with that of a Euclidean distance-based counterpart and with those of alternative manifold learning methods.
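The idea of propagating class labels over a proximity graph can be sketched as follows. This is only an illustrative majority-vote scheme, not the SS-Geo-GTM algorithm itself; the function name `propagate_labels`, the toy chain graph, and the labels "A" and "B" are all assumptions made for the example.

```python
from collections import Counter

def propagate_labels(graph, seed_labels, max_iters=100):
    """Spread class labels from labelled nodes to unlabelled neighbours.

    graph: dict mapping node -> list of neighbouring nodes (undirected).
    seed_labels: dict mapping some nodes -> known class label.
    Returns a dict with a label for every node reachable from a seed.
    """
    labels = dict(seed_labels)
    for _ in range(max_iters):
        updates = {}
        for node, neighbours in graph.items():
            if node in labels:
                continue  # nodes that already carry a label stay fixed
            votes = Counter(labels[n] for n in neighbours if n in labels)
            if votes:
                # Majority vote; ties are broken by label sort order.
                updates[node] = max(sorted(votes), key=lambda lab: votes[lab])
        if not updates:
            break  # nothing left to label
        labels.update(updates)  # synchronous update after each sweep
    return labels

# Hypothetical chain of six prototypes, labelled only at the two ends.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
result = propagate_labels(graph, {0: "A", 5: "B"})
print(result)
```

In this toy case each half of the chain inherits the label of its nearest labelled end.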
International Conference on Image Analysis and Processing | 2013
Caroline König; Raúl Cruz-Barbosa; René Alquézar; Alfredo Vellido
G protein-coupled receptors (GPCRs) have a key role in regulating the function of cells due to their ability to transmit extracellular signals. Given that the 3D structure and the functionality of most GPCRs are unknown, there is a need to construct robust classification models based on the analysis of their amino acid sequences for protein homology detection. In this paper, we describe the supervised classification of the different subtypes of class C GPCRs using support vector machines (SVMs). These models are built on different transformations of the amino acid sequences based on their physicochemical properties. Previous research using semi-supervised methods on the same data has shown the usefulness of such transformations. The obtained classification models show a robust performance, as their Matthews correlation coefficient is close to 0.91 and their prediction accuracy is close to 0.93.
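For readers unfamiliar with the two reported measures, the sketch below computes accuracy and the Matthews correlation coefficient in their binary form. The paper addresses a multi-class problem, so this is a simplification; the function name `binary_metrics` and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, MCC) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
accuracy, mcc = binary_metrics(tp=45, fp=5, fn=5, tn=45)
print(accuracy, mcc)
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy for imbalanced classes.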
Medical & Biological Engineering & Computing | 2014
Raúl Cruz-Barbosa; Alfredo Vellido; Jesús Giraldo
G protein-coupled receptors (GPCRs) are integral cell membrane proteins of relevance for pharmacology. The tertiary structure of the transmembrane domain, a gate to the study of protein functionality, is unknown for almost all members of class C GPCRs, which are the target of the current study. As a result, their investigation must often rely on alignments of their amino acid sequences. Sequence alignment entails the risk of missing relevant information. Various approaches have attempted to circumvent this risk through alignment-free transformations of the sequences on the basis of different amino acid physicochemical properties. In this paper, we use several of these alignment-free methods, as well as a basic amino acid composition representation, to transform the available sequences. Novel semi-supervised statistical machine learning methods are then used to discriminate the different class C GPCR types from the transformed data. This approach is relevant due to the existence of orphan proteins to which type labels should be assigned in a process of deorphanization or reverse pharmacology. The reported experiments show that the proposed techniques provide accurate classification even in settings of extreme class-label scarcity and that fair accuracy can be achieved even with very simple transformation strategies that ignore the sequence ordering.
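A basic amino acid composition representation of the kind mentioned above might look like the following sketch. It assumes the 20 standard residues in one-letter code and relative frequencies as features; the function name `aa_composition` and the two short sequences are invented for illustration and are not real GPCR data.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def aa_composition(sequence):
    """Map a protein sequence to a 20-dimensional vector of residue frequencies."""
    seq = sequence.upper()
    # Non-standard symbols (e.g. X or gaps) are ignored in this sketch.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Two hypothetical sequences with the same residues in a different order.
vec_a = aa_composition("MKVLA")
vec_b = aa_composition("ALVKM")
print(vec_a == vec_b)
```

Because only counts are kept, reordering the residues leaves the vector unchanged, which is what ignoring the sequence ordering means in practice.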
Ibero-American Conference on AI | 2008
Raúl Cruz-Barbosa; Alfredo Vellido
Nonlinear dimensionality reduction (NLDR) methods aim to provide a faithful low-dimensional representation of multivariate data. The manifold learning family of NLDR methods, in particular, does this by defining low-dimensional manifolds embedded in the observed data space. Generative Topographic Mapping (GTM) is one such manifold learning method for multivariate data clustering and visualization. The non-linearity of the mapping it generates makes it prone to trustworthiness and continuity errors that would reduce the faithfulness of the data representation, especially for datasets of convoluted geometry. In this study, the GTM is modified to prioritize neighbourhood relationships along the generated manifold. This is accomplished by penalizing divergences between the Euclidean distances from the data points to the model prototypes and the corresponding geodesic distances along the manifold. The resulting Geodesic GTM model is shown to improve not only the continuity and trustworthiness of the representation generated by the model, but also its resilience in the presence of noise.
Hybrid Artificial Intelligence Systems | 2008
Raúl Cruz-Barbosa; Alfredo Vellido
Generative Topographic Mapping (GTM) is a probabilistic latent variable model for multivariate data clustering and visualization. It tries to capture the relevant data structure by defining a low-dimensional manifold embedded in the high-dimensional data space. This requires the assumption that the data can be faithfully represented by a manifold of much lower dimension than that of the observed space. Even when this assumption holds, the approximation of the data may, for some datasets, require plenty of folding, resulting in an entangled manifold and in breaches of topology preservation that would hamper data visualization and cluster definition. This can be partially avoided by modifying the GTM learning procedure so as to penalize divergences between the Euclidean distances from the data to the model prototypes and the corresponding geodesic distances along the manifold. We define and assess this strategy, comparing it to the performance of the standard GTM, using several artificial datasets.
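The gap between Euclidean and geodesic distance on a folded manifold can be illustrated with a small sketch. It approximates geodesic distance by shortest paths over a neighbourhood graph (Dijkstra's algorithm), which is a common approximation and an assumption here rather than the exact procedure of the paper; the U-shaped point set and the function names are hypothetical.

```python
import heapq
import math

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.dist(p, q)

def geodesic_distances(points, edges, source):
    """Shortest-path lengths from `source` over a graph whose edges
    are weighted by the Euclidean length of each edge."""
    adj = {i: [] for i in range(len(points))}
    for i, j in edges:
        w = euclidean(points[i], points[j])
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = {i: math.inf for i in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical U-shaped curve: the two ends are close in space
# but far apart when travelling along the curve.
points = [(0, 3), (0, 2), (0, 1), (0, 0), (1, 0), (1, 1), (1, 2), (1, 3)]
edges = [(i, i + 1) for i in range(len(points) - 1)]
geo = geodesic_distances(points, edges, source=0)
straight = euclidean(points[0], points[-1])
print(straight, geo[len(points) - 1])
```

A large divergence between the two values, as at the ends of this curve, is the kind of situation the penalization is meant to address.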
Mexican International Conference on Artificial Intelligence | 2014
Verónica Rodríguez-López; Raúl Cruz-Barbosa
Nowadays, breast cancer is considered a significant health problem in Mexico. Mammography is an effective study for detecting mass lesions, which could indicate this disease. However, due to the density of breast tissue and the wide range of mass characteristics, mass diagnosis is difficult. In this study, a performance comparison of Bayesian network models on the classification of benign and malignant masses is presented. Here, Naive Bayes (NB), Tree Augmented Naive Bayes (TAN), K-dependence Bayesian classifier (KDB), and Forest Augmented Naive Bayes (FAN) models are analyzed. Two data sets extracted from the public BCDR-F01 database, including 112 benign and 119 malignant masses, were used to train the models. The experimental results have shown that the TAN, KDB, and FAN models with a subset of only eight features achieved a performance of 0.79 in accuracy, 0.80 in sensitivity, and 0.77 in specificity. Therefore, these models, which allow dependencies among variables (features), are considered suitable and promising methods for automated mass classification.
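The three reported measures can be computed from a binary confusion matrix as in the sketch below, treating malignant as the positive class. The function name `diagnostic_metrics` and the counts are hypothetical and do not reproduce the results of the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: malignant cases classified correctly/incorrectly.
    tn/fp: benign cases classified correctly/incorrectly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # fraction of malignant masses detected
        "specificity": tn / (tn + fp),  # fraction of benign masses recognised
    }

# Hypothetical counts, chosen only for illustration.
metrics = diagnostic_metrics(tp=80, fn=20, tn=70, fp=30)
print(metrics)
```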
Intelligent Data Engineering and Automated Learning | 2008
Raúl Cruz-Barbosa; Alfredo Vellido
Manifold learning methods model high-dimensional data through low-dimensional manifolds embedded in the observed data space. This simplification implies that they are prone to trustworthiness and continuity errors. Generative Topographic Mapping (GTM) is one such manifold learning method for multivariate data clustering and visualization, defined within a probabilistic framework. In the original formulation, GTM is optimized by minimization of an error that is a function of Euclidean distances, making it vulnerable to the aforementioned errors, especially for datasets of convoluted geometry. Here, we modify GTM to penalize divergences between the Euclidean distances from the data points to the model prototypes and the corresponding geodesic distances along the manifold. Several experiments with artificial data show that this strategy improves the continuity and trustworthiness of the data representation generated by the model.
International Journal of Fuzzy Systems | 2018
Arturo Téllez-Velázquez; Herón Molina-Lozano; Luis A. Villa-Vargas; Raúl Cruz-Barbosa; Esther Lugo-González; Ildar Z. Batyrshin; Imre J. Rudas
This paper presents an optimization strategy for interval type-2 fuzzy systems by using the conjunction operation called the (p)-monotone sum of t-norms. A direct-current servomotor control system is implemented to test the performance of the type-1, interval type-2 and interval type-2 fuzzy systems with parametric operations, under several noisy conditions. To rate them, a multi-objective fitness function, based on the main transient parameters, is proposed so that the genetic algorithm can find the best squared feedback signal when a white noise signal with different amplitudes is added to the reference. In addition, the optimization strategy includes parametric conjunction suppression to analyze how a rule-associated parametric conjunction directly influences system performance. Such rule suppression can be used to reduce the number of parametric conjunction operations required to obtain an additional performance improvement. Experimental results of the servomotor control system show that parametric conjunctions used in the interval type-2 fuzzy logic system provide additional advantages over its nonparametric counterpart.
Mexican Conference on Pattern Recognition | 2015
Verónica Rodríguez-López; Raúl Cruz-Barbosa
Nowadays, breast cancer is considered a significant health problem in Mexico. Mammography is an effective study for the early detection of signs of this disease. One of the most important findings in this study is a mass, which is the main indicator of malignancy. However, mass detection and diagnosis are difficult. In this study, the impact of the inclusion of seven clinical features on the performance of Bayesian network models for mass diagnosis is presented. Here, Naive Bayes, Tree Augmented Naive Bayes, K-dependence Bayesian classifier, and Forest Augmented Naive Bayes models with eight image feature nodes were augmented with several subsets of clinical features. These models were trained with a data set extracted from the public BCDR-F01 database. The experimental results have shown that the Bayesian network models augmented with a subset of three clinical features improved their performance up to 0.82 in accuracy, 0.80 in sensitivity, and 0.83 in specificity. Therefore, these augmented models are considered suitable and promising methods for mass classification.