José Ignacio Abreu | Researchain


Publication

Featured researches published by José Ignacio Abreu.

Journal of Chemical Information and Modeling | 2006

Amino Acid Sequence Autocorrelation vectors and ensembles of Bayesian-Regularized Genetic Neural Networks for prediction of conformational stability of human lysozyme mutants.

Julio Caballero; Leyden Fernández; José Ignacio Abreu; Michael Fernández

Development of novel computational approaches for modeling protein properties from their primary structure is a major goal in applied proteomics. In this work, we report the extension of the autocorrelation vector formalism to amino acid sequences for encoding protein structural information for modeling purposes. Amino Acid Sequence Autocorrelation (AASA) vectors were calculated by measuring the autocorrelations at sequence lags ranging from 1 to 15 on the protein primary structure of 48 amino acid/residue properties selected from the AAindex database. A total of 720 AASA descriptors were tested for building predictive models of the thermal unfolding Gibbs free energy change of human lysozyme mutants. To this end, ensembles of Bayesian-Regularized Genetic Neural Networks (BRGNNs) were used to obtain an optimum nonlinear model for the conformational stability. The ensemble predictor explained about 88% and 68% of the data variance in the training and test sets, respectively. Furthermore, the optimum AASA vector subset was shown not only to successfully model unfolding thermal stability but also to distribute wild-type and mutant lysozymes on a stability Self-organized Map (SOM) when used for unsupervised training of competitive neurons.
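The descriptor count in this abstract follows from 48 properties times 15 lags, which gives 720 values per sequence. The sketch below illustrates that bookkeeping in Python. It assumes a simple sum-of-products autocorrelation normalised by the number of residue pairs; the abstract does not state the exact formula or any property scaling, so the normalisation, the function names `autocorrelation` and `aasa_vector`, and the toy two-letter property scale are all illustrative assumptions rather than the authors' implementation.

```python
def autocorrelation(values, lag):
    """Mean product of property values for residues separated by `lag` positions.

    Assumption: sum of products divided by the number of pairs; the
    abstract does not specify the normalisation actually used.
    """
    n_pairs = len(values) - lag
    if n_pairs <= 0:
        return 0.0  # sequence shorter than the lag: no pairs to correlate
    return sum(values[i] * values[i + lag] for i in range(n_pairs)) / n_pairs


def aasa_vector(sequence, property_scales, max_lag=15):
    """Concatenate autocorrelations over all property scales and lags 1..max_lag."""
    descriptors = []
    for scale in property_scales:
        values = [scale[residue] for residue in sequence]
        for lag in range(1, max_lag + 1):
            descriptors.append(autocorrelation(values, lag))
    return descriptors


# Hypothetical toy scale over a two-letter alphabet (not an AAindex entry).
toy_scale = {"A": 1.0, "B": -1.0}
vector = aasa_vector("ABAB" * 5, [toy_scale] * 48, max_lag=15)
print(len(vector))  # 48 scales x 15 lags
```

With real data, each entry of `property_scales` would map the 20 amino acids to one residue property, and the resulting vector would be passed to a feature-selection and regression step.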

Proteins | 2007

Amino acid sequence autocorrelation vectors and bayesian‐regularized genetic neural networks for modeling protein conformational stability: Gene V protein mutants

Leyden Fernández; Julio Caballero; José Ignacio Abreu; Michael Fernández

Development of novel computational approaches for modeling protein properties from their primary structure is a major goal in applied proteomics. In this work, we report the extension of the autocorrelation vector formalism to amino acid sequences for encoding protein structural information for modeling purposes. Amino acid sequence autocorrelation (AASA) vectors were calculated by measuring the autocorrelations at sequence lags ranging from 1 to 15 on the protein primary structure of 48 amino acid/residue properties selected from the AAindex database. A total of 720 AASA descriptors were tested for building predictive models of the thermal unfolding Gibbs free energy change (ΔΔG) of gene V protein upon mutation. To this end, ensembles of Bayesian‐regularized genetic neural networks (BRGNNs) were used to obtain an optimum nonlinear model for the conformational stability. The ensemble predictor explained about 88% and 66% of the data variance in the training and test sets, respectively. Furthermore, the optimum AASA vector subset not only helped to successfully model unfolding stability but also distributed wild‐type and mutant gene V proteins well on a stability self‐organized map (SOM) when used for unsupervised training of competitive neurons. Proteins 2007.

Proteins | 2007

Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines.

Michael Fernández; Julio Caballero; Leyden Fernández; José Ignacio Abreu; Gianco Acosta

This work reports a novel 3D pseudo‐folding graph representation of protein sequences for modeling purposes. Amino acid Euclidean distance matrices (EDMs) encode primary structural information. Amino Acid Pseudo‐Folding 3D Distances Count (AAp3DC) descriptors, calculated from the EDMs of a large data set of 1363 single protein mutants of 64 proteins, were tested for building a classifier for the sign of the thermal unfolding Gibbs free energy change (ΔΔG) upon single mutations. An optimum support vector machine (SVM) with a radial basis function (RBF) kernel recognized stable and unstable mutants with accuracies over 70% in cross-validation tests. To the best of our knowledge, this result for stable mutant recognition is the highest ever reported for a sequence‐based predictor with more than 1000 mutants. Furthermore, the model adequately classified disease-associated mutations of human prion protein and human transthyretin. Proteins 2008.

Pattern Recognition Letters | 2014

A new iterative algorithm for computing a quality approximate median of strings based on edit operations

José Ignacio Abreu; Juan Ramón Rico-Juan

This paper presents a new algorithm that can be used to compute an approximation to the median of a set of strings. The approximate median is obtained through successive improvements of a partial solution. The edit distance from the partial solution to all the strings in the set is computed in each iteration, thus accounting for the frequency of each of the edit operations in all the positions of the approximate median. A goodness index for edit operations is then computed by multiplying their frequency by their cost. Each operation is tested, starting from that with the highest index, in order to verify whether applying it to the partial solution leads to an improvement. If successful, a new iteration begins from the new approximate median. The algorithm finishes when all the operations have been examined without a better solution being found. Comparative experiments involving Freeman chain codes encoding 2D shapes and the Copenhagen chromosome database show that the quality of the approximate median string is similar to that of benchmark approaches, while convergence is much faster.
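The iterative scheme described in this abstract can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it assumes unit costs for insertion, deletion and substitution (so the goodness index reduces to the operation frequency), starts from the best string in the set unless a start is given, and uses the hypothetical helper names `edit_ops`, `apply_op`, `total_distance` and `approximate_median`.

```python
from collections import Counter


def edit_ops(source, target):
    """Unit-cost Levenshtein distance plus one minimal edit script.

    Operations are ('sub', i, ch), ('del', i, None) or ('ins', i, ch),
    where i is a position in `source`.
    """
    n, m = len(source), len(target)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    ops, i, j = [], n, m
    while i > 0 or j > 0:  # backtrace one optimal alignment
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (source[i - 1] != target[j - 1]):
            if source[i - 1] != target[j - 1]:
                ops.append(("sub", i - 1, target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, None))
            i -= 1
        else:
            ops.append(("ins", i, target[j - 1]))
            j -= 1
    return d[n][m], ops


def apply_op(s, op):
    """Apply a single edit operation to string s."""
    kind, pos, ch = op
    if kind == "sub":
        return s[:pos] + ch + s[pos + 1:]
    if kind == "del":
        return s[:pos] + s[pos + 1:]
    return s[:pos] + ch + s[pos:]  # insertion before position pos


def total_distance(candidate, strings):
    return sum(edit_ops(candidate, s)[0] for s in strings)


def approximate_median(strings, start=None):
    """Greedy refinement: try operations by decreasing frequency, restart on improvement."""
    median = start if start is not None else min(strings, key=lambda s: total_distance(s, strings))
    best = total_distance(median, strings)
    improved = True
    while improved:
        improved = False
        freq = Counter()
        for s in strings:
            freq.update(edit_ops(median, s)[1])
        for op, _count in freq.most_common():  # unit costs: goodness == frequency
            candidate = apply_op(median, op)
            score = total_distance(candidate, strings)
            if score < best:
                median, best, improved = candidate, score, True
                break  # begin a new iteration from the improved median
    return median, best


print(approximate_median(["abc", "abd", "abc", "xbc"], start="xbd"))
```

With non-unit costs, the ranking step would sort operations by frequency multiplied by cost instead of by frequency alone.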

Chemical Biology & Drug Design | 2008

Proteochemometric Modeling of the Inhibition Complexes of Matrix Metalloproteinases with N‐Hydroxy‐2‐[(Phenylsulfonyl)Amino]Acetamide Derivatives Using Topological Autocorrelation Interaction Matrix and Model Ensemble Averaging

Michael Fernández; Leyden Fernández; Julio Caballero; José Ignacio Abreu; Grethel Reyes

A target‐ligand QSAR approach using autocorrelation formalism was developed for modeling the inhibitory potency (pIC50) toward matrix metalloproteinases (MMP‐1, MMP‐2, MMP‐3, MMP‐9, and MMP‐13) of N‐hydroxy‐2‐[(phenylsulfonyl)amino]acetamide derivatives. Target and ligand structural information was encoded in the Topological Autocorrelation Interaction matrix calculated from the 2D topological representation of inhibitors and protein sequences. The relevant Topological Autocorrelation Interaction descriptors were selected by genetic algorithm‐based multilinear regression analysis and Bayesian‐regularized genetic neural network approaches. A model ensemble strategy was employed to obtain robust and reliable linear and non‐linear predictors with nine topological autocorrelation interaction descriptors and squared correlation coefficients of ensemble test‐set fitting (R2test) of about 0.80 and 0.87, respectively. Electrostatic and hydrophobicity/hydrophilicity properties were the most relevant in the optimum models. In addition, the distribution of the inhibition complexes on a self‐organized map depicted target dependence rather than an inhibitor similarity pattern.

Molecular Simulation | 2007

Comparative modeling of the conformational stability of chymotrypsin inhibitor 2 protein mutants using amino acid sequence autocorrelation (AASA) and amino acid 3D autocorrelation (AA3DA) vectors and ensembles of Bayesian-regularized genetic neural networks

Michael Fernández; José Ignacio Abreu; Julio Caballero; Miguel Garriga; Leyden Fernández

Predicting protein stability changes upon point mutation is important for understanding protein structure and designing new proteins. Autocorrelation vector formalism was extended to amino acid sequences and 3D conformations for encoding protein structural information for modeling purposes. Protein autocorrelation vectors were weighted by 48 amino acid/residue properties selected from the AAindex database. Ensembles of Bayesian-regularized genetic neural networks (BRGNNs) trained with amino acid sequence autocorrelation (AASA) vectors and amino acid 3D autocorrelation (AA3DA) vectors yielded predictive models of the unfolding Gibbs free energy change (ΔΔG) of chymotrypsin inhibitor 2 protein mutants. The ensemble predictors explained about 58% and 72% of the data variance in test sets for the AASA and AA3DA models, respectively. Optimum sequence and 3D-based ensembles exhibit high effects on relevant structural (volume, solvent-accessible surface area), physico-chemical (hydrophilicity/hydrophobicity-related) and thermodynamic (hydration parameters) properties.

Molecular Simulation | 2007

Classification of conformational stability of protein mutants from 2D graph representation of protein sequences using support vector machines

Michael Fernández; Julio Caballero; Leyden Fernández; José Ignacio Abreu; G. Acosta

Euclidean distance counts derived from the protein 2D graphs were used for encoding protein structural information. A total of 35 amino acid 2D distance count (AA2DC) descriptors were calculated from the Euclidean distance matrices (EDM) derived from the 2D graphs at distances ranging from 0.05 to 1.8 units with a lag of 0.05 units. AA2DC descriptors were tested for building a predictive classification model of the sign of the thermal unfolding Gibbs free energy change (ΔΔG) for a large data set of 2048 single point mutations on 64 proteins. A support vector machine (SVM) classifier with a Radial Basis Function kernel was implemented for classifying the conformational stability of protein mutants. Temperature and pH of the ΔΔG experimental measurements were also conveniently used for SVM training in addition to the calculated AA2DC descriptors. The optimum SVM model correctly predicted about 72% of ΔΔG signs in cross-validation tests for the whole data set and also for stable and unstable mutants separately. To the best of our knowledge, this level of accuracy for stable mutant recognition is the highest ever reported for a predictor using sequence information. Furthermore, the classifier adequately recognized disease-associated unstable mutants of human prion protein and human transthyretin.

Molecular Simulation | 2008

Proteometric modelling of protein conformational stability using amino acid sequence autocorrelation vectors and genetic algorithm-optimised support vector machines

Michael Fernández; Leyden Fernández; Pedro Sánchez; Julio Caballero; José Ignacio Abreu

The conformational stability of more than 1500 protein mutants was modelled by a proteometric approach using amino acid sequence autocorrelation vector (AASA) formalism. 48 amino acid/residue properties selected from the AAindex database weighted the AASA vectors. Genetic algorithm-optimised support vector machines (GA-SVM), trained with a subset of AASA descriptors, yielded predictive classification and regression models of unfolding Gibbs free energy change (ΔΔG). Function mapping and binary SVM models correctly predicted about 50% of ΔΔG variance and 80% of ΔΔG signs in cross-validation experiments, respectively. Test set prediction showed adequate accuracies of about 70% for stable single and double point mutants. Conformational stability depended on autocorrelations at medium and long ranges in the mutant sequences of general structural, physico-chemical and thermodynamic properties related to the protein hydration process. A preliminary version of the predictor is available online at http://gibk21.bse.kyutech.ac.jp/llamosa/ddG-AASA/ddG_AASA.html.

Pattern Recognition Letters | 2011

Characterization of contour regularities based on the Levenshtein edit distance

José Ignacio Abreu; Juan Ramón Rico-Juan

This paper describes a new method for quantifying the regularity of contours and comparing them (when encoded by Freeman chain codes) in terms of a similarity criterion which relies on information gathered from Levenshtein edit distance computation. The criterion used allows subsequences to be found from the minimal cost edit sequence that specifies an alignment of contour segments which are similar. Two external parameters adjust the similarity criterion. The information about each similar part is encoded by strings that represent an average contour region. An explanation of how to construct a prototype based on the identified regularities is also given. The reliability of the prototypes is evaluated by replacing contour groups (samples) by new prototypes used as the training set in a classification task. In this way, the size of the data set can be reduced without appreciably affecting its representational power for classification purposes. Experimental results show that this scheme achieves a reduction in the size of the training data set of about 80% while the classification error only increases by 0.45% in one of the three data sets studied.

Molecular Simulation | 2016

Large-scale recognition of high-affinity protease–inhibitor complexes using topological autocorrelation and support vector machines

Michael Fernández; Shandar Ahmad; José Ignacio Abreu; Akinori Sarai

Several methods have been developed for the computer-aided discovery of new protease substrates and inhibitors. In this paper, we report a novel machine learning implementation to identify high-affinity protease–inhibitor complexes. The implemented proteochemometrics algorithm consists of creating topological autocorrelation descriptors for proteases and inhibitors, and then developing support vector machine models to relate the feature vectors to the affinity class (high or low) of hypothetical protein–inhibitor complexes based on experimental inhibition constant (Ki) values. The approach based on the autocorrelation features surpassed an atom-centred (AC) approach using AC information of inhibitors. Unique to our approach is that our final classifier could recognise 80% of the inhibition complexes of new ligands as stable or unstable using only the chemical connectivity of the ligands and sequence information of the targets. Moreover, the analysis of substructure classification showed a very homogeneous behaviour of the model over the whole target–ligand space. The predictor is available online at: http://www.materialsinformatics.net/autoproti.html
