Carlos Fernandez-Lozano
University of A Coruña
Publication
Featured research published by Carlos Fernandez-Lozano.
Journal of Cheminformatics | 2015
Georgia Tsiliki; Cristian R. Munteanu; Jose A. Seoane; Carlos Fernandez-Lozano; Haralambos Sarimveis; Egon Willighagen
Abstract
Background: Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the resulting predictive models and therefore raise model reproducibility and comparison issues. Cheminformatics and bioinformatics make extensive use of predictive modelling and need these methodologies to be standardized in order to assist model selection and speed up predictive model development. A tool accessible to all users, irrespective of their statistical knowledge, would be valuable if it tested several simple and complex regression models and validation schemes, produced unified reports, and offered the option of being integrated into more extensive studies. Additionally, such a methodology should be implemented as a free programming package so that others can continuously adapt and redistribute it.
Results: We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated, fully validated procedure that produces standardized reports giving a quick overview of the impact of choices in modelling algorithms and an assessment of the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, which reuses and extends the caret package.
Conclusion: The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance as well as its adaptability in terms of parameter optimization could make RRegrs a popular framework to assist the initial exploration of predictive models and, with that, the design of more comprehensive in silico screening applications.
Graphical abstract: RRegrs is a computer-aided model selection framework for R multiple regression models; it is a fully validated procedure with application to QSAR modelling.
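The model-comparison idea in the abstract above (several regression methods scored under repeated k-fold or leave-one-out cross-validation) can be illustrated with a minimal Python sketch. This is not RRegrs code, which is an R package built on caret; the function names, the two toy models and the synthetic data are all assumptions made for illustration.

```python
import random
from statistics import mean

def fit_mean(xs, ys):
    """Baseline model (illustrative): always predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least squares with a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def repeated_kfold_rmse(fit, xs, ys, k=10, repeats=3, seed=0):
    """Average held-out RMSE over `repeats` random k-fold partitions.

    Setting k = len(xs) and repeats = 1 gives leave-one-out cross-validation.
    """
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k disjoint folds
        for test in folds:
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            squared = [(model(xs[i]) - ys[i]) ** 2 for i in test]
            scores.append(mean(squared) ** 0.5)
    return mean(scores)

# Synthetic data (an assumption, not from the paper): y = 2x + 1 plus noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(60)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0, 0.5) for x in xs]

candidates = [("mean", fit_mean), ("linear", fit_linear)]
results = {name: repeated_kfold_rmse(fit, xs, ys) for name, fit in candidates}
best = min(results, key=results.get)  # model with the lowest cross-validated RMSE
loo = repeated_kfold_rmse(fit_linear, xs, ys, k=len(xs), repeats=1)
print(results, best, loo)
```

Scoring every candidate on the same resampling scheme is what makes the resulting numbers comparable across methods, which is the standardization point the abstract argues for.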
Journal of Chemical Information and Modeling | 2015
Cristian R. Munteanu; António Pimenta; Carlos Fernandez-Lozano; André Melo; Maria Natália Dias Soeiro Cordeiro; Irina S. Moreira
Because hot-spot (HS) detection is important and computational methodologies are efficient, several HS detection approaches have been developed. The current paper presents new models to predict HS for protein-protein and protein-nucleic acid interactions, with better statistics than those currently reported in the literature. These models are based on solvent accessible surface area (SASA) and genetic conservation features, which are fed to simple Bayes networks (protein-protein systems) and to a more complex multi-objective genetic algorithm-support vector machine approach (protein-nucleic acid systems). The best models for these interactions have been implemented in two free Web tools.
Soft Computing | 2015
Carlos Fernandez-Lozano; Jose A. Seoane; Marcos Gestal; Tom R. Gaunt; Julian Dorado; Colin Campbell
Feature selection techniques can make the results of a classification problem easier to interpret, especially in image texture analysis, by showing which features contribute most to classification performance. This paper presents an evaluation of a number of feature selection techniques for classification in a biomedical image texture dataset (2-DE gel images), with the aim of studying their performance and the stability of the selected features. We analyse three different techniques: subgroup-based multiple kernel learning (MKL), which can perform feature selection by down-weighting or eliminating subsets of features that share similar characteristics, and two conventional feature selection techniques, namely recursive feature elimination (RFE) with different classifiers (naive Bayes, support vector machines, bagged trees, random forest and linear discriminant analysis) and a genetic algorithm-based approach with an SVM as decision function. The different classifiers were compared using ten repetitions of tenfold cross-validation, and the best technique found is SVM-RFE, with an AUROC score of (95.88 ± 0.39%). However, this method is not significantly better than RFE-TREE, RFE-RF and grouped MKL, while MKL uses fewer features, which increases the interpretability of the results. MKL always selects the same features, related to wavelet-based textures, while the RFE methods focus especially on co-occurrence matrix-based features, but with high instability in the number of features selected.
Expert Systems With Applications | 2015
Cristian R. Munteanu; Carlos Fernandez-Lozano; Virginia Mato Abad; Salvador Pita Fernández; Juan Álvarez-Linera; Juan Antonio Hernández-Tamames; Alejandro Pazos
The Scientific World Journal | 2013
Carlos Fernandez-Lozano; C. Canto; Marcos Gestal; José Manuel Andrade-Garda; Juan R. Rabuñal; Julian Dorado; Alejandro Pazos
Current Topics in Medicinal Chemistry | 2013
Carlos Fernandez-Lozano; Marcos Gestal; Nieves Pedreira-Souto; Lucian Postelnicu; Julian Dorado; Cristian R. Munteanu
Journal of Theoretical Biology | 2015
Carlos Fernandez-Lozano; Rubén F. Cuiñas; Jose A. Seoane; Enrique Fernández-Blanco; Julian Dorado; Cristian R. Munteanu
Analytical Biochemistry | 2014
Alvaro Rodriguez; Carlos Fernandez-Lozano; Julian Dorado; Juan R. Rabuñal
First application of 1H-MRS data and machine learning to the classification of AD. Classification of individuals affected by different stages of dementia. With two spectroscopic voxel volumes in the left hippocampus we achieved a 0.866 AUROC. Classification results are in agreement with previous studies using MRI data. Composition of white and grey matter and cerebrospinal fluid is essential in 1H-MRS.
Several magnetic resonance techniques have been proposed as non-invasive imaging biomarkers for the evaluation of disease progression and early diagnosis of Alzheimer's Disease (AD). This work is the first application of Proton Magnetic Resonance Spectroscopy (1H-MRS) data and machine-learning techniques to the classification of AD. A gender-matched cohort of 260 subjects aged between 57 and 99 years from the Alzheimer's Disease Research Unit of the Fundacion CIEN-Fundacion Reina Sofia was used. A single-layer perceptron was obtained for AD prediction using only two spectroscopic voxel volumes (Tvol and CSFvol) in the left hippocampus, with an AUROC value of 0.866 (TPR 0.812, FPR 0.204) in a filter feature selection approach. These results suggest that knowing the composition of white and grey matter and cerebrospinal fluid in the spectroscopic voxel is essential in a 1H-MRS study to improve the accuracy of the quantifications and classifications, particularly in studies involving elderly patients and neurodegenerative diseases.
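The abstract above reports an AUROC together with one TPR/FPR pair. As a reminder of how these quantities relate, here is a minimal Python sketch: AUROC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, while TPR and FPR describe a single decision threshold. The scores and labels below are made-up illustration values, not data from the study, and the function names are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative scores.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def tpr_fpr(scores, labels, threshold):
    """True and false positive rates when predicting 1 for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg

# Illustrative classifier outputs (assumed values): 4 positives, 4 negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

area = auroc(scores, labels)
operating_point = tpr_fpr(scores, labels, threshold=0.5)
print(area, operating_point)
```

A single TPR/FPR pair is one point on the ROC curve; the AUROC summarizes all thresholds at once, which is why the two kinds of figures are usually reported side by side.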
Expert Systems With Applications | 2017
Yong Liu; Shaoxun Tang; Carlos Fernandez-Lozano; Cristian R. Munteanu; Alejandro Pazos; Yizun Yu; Zhiliang Tan; Humberto González-Díaz
Given the background of Neural Networks being used in apple juice classification problems, this paper aims to apply a more recently developed method in the field of machine learning: Support Vector Machines (SVM). A hybrid model that combines genetic algorithms and support vector machines is proposed: by using the SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.
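The wrapper idea described above, a genetic algorithm whose fitness function is the performance of a classifier on the selected variables, can be sketched in Python as follows. To keep the example self-contained, a nearest-centroid classifier with leave-one-out accuracy stands in for the SVM used in the paper; that substitution, the GA settings, the function names and the synthetic data are all assumptions.

```python
import random

def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier (SVM stand-in)
    restricted to the columns where mask[j] == 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    hits = 0
    for i in range(len(X)):
        dist = {}
        for label in sorted(set(y)):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            centroid = [sum(row[j] for row in rows) / len(rows) for j in cols]
            dist[label] = sum((X[i][j] - c) ** 2 for j, c in zip(cols, centroid))
        hits += min(dist, key=dist.get) == y[i]
    return hits / len(X)

def ga_select(X, y, pop_size=12, generations=10, p_mut=0.1, seed=0):
    """Evolve binary feature masks; fitness is the classifier's accuracy."""
    rng = random.Random(seed)
    n = len(X[0])
    fitness = lambda mask: loo_accuracy(X, y, mask)
    # Seed the population with the all-features mask plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        elite = ranked[:2]  # elitism: the best masks survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(ranked[:pop_size // 2], 2)  # parents from top half
            cut = rng.randrange(1, n)                     # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best)

# Synthetic data (assumed): 2 informative features followed by 4 noise features.
data_rng = random.Random(1)
X, y = [], []
for label in (0, 1):
    for _ in range(20):
        informative = [data_rng.gauss(2.0 * label, 1.0) for _ in range(2)]
        noise = [data_rng.gauss(0.0, 1.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)

best_mask, best_fit = ga_select(X, y)
baseline = loo_accuracy(X, y, [1] * 6)  # accuracy when using every feature
print(best_mask, best_fit, baseline)
```

Because the all-features mask is in the initial population and the elite are carried over unchanged, the returned subset is never worse than using every variable under this fitness measure.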
Scientific Reports | 2016
Carlos Fernandez-Lozano; Jose A. Seoane; Marcos Gestal; Tom R. Gaunt; Julian Dorado; Alejandro Pazos; Colin Campbell
The transport of molecules inside cells is a very important topic, especially in drug metabolism. Experimentally testing new proteins for transporter molecular function is expensive and inefficient because of the large number of new peptides. Therefore, there is a need for cheap and fast theoretical models to predict transporter proteins. In the current work, the primary structure of a protein is represented as a molecular Star graph, characterized by a series of topological indices. The dataset was made up of 2,503 protein chains, of which 413 have transporter molecular function and 2,090 have no transporter function. These indices were used as input to several classification techniques to find the best Quantitative Structure Activity Relationship (QSAR) model that can evaluate the transporter function of a new protein chain. Among several feature selection techniques, Support Vector Machine Recursive Feature Elimination yields a classification model based on 20 attributes with a true positive rate of 83% and a false positive rate of 16.7%.
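The figures quoted above can be related to one another with a little arithmetic. The Python sketch below derives the class prevalence, the specificity and the balanced accuracy from the reported counts and rates. The last step, an implied precision, additionally assumes that the reported rates apply to the full 2,503-chain dataset; the abstract does not state this, so that value is only indicative.

```python
# Figures taken from the abstract above.
n_transporter = 413
n_non_transporter = 2090
tpr = 0.83    # true positive rate (sensitivity)
fpr = 0.167   # false positive rate

n_total = n_transporter + n_non_transporter    # matches the 2,503 chains reported
prevalence = n_transporter / n_total           # share of transporter proteins
specificity = 1.0 - fpr                        # true negative rate
balanced_accuracy = (tpr + specificity) / 2.0  # insensitive to class imbalance

# ASSUMPTION: the rates hold on the full dataset (not stated in the abstract).
expected_tp = tpr * n_transporter
expected_fp = fpr * n_non_transporter
implied_precision = expected_tp / (expected_tp + expected_fp)

print(f"prevalence        = {prevalence:.3f}")
print(f"specificity       = {specificity:.3f}")
print(f"balanced accuracy = {balanced_accuracy:.4f}")
print(f"implied precision = {implied_precision:.3f}  (under the stated assumption)")
```

With roughly one transporter in six chains, even a low false positive rate produces about as many false positives as true positives, which is why TPR and FPR are more informative here than raw accuracy.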