Gustavo E. Vazquez
Universidad Nacional del Sur
Publication
Featured research published by Gustavo E. Vazquez.
Molecular Informatics | 2011
Axel J. Soto; Gustavo E. Vazquez; Marc Strickert; Ignacio Ponzoni
This work describes a methodology for assisting virtual screening of drugs during the early stages of the drug development process. The methodology is proposed to improve the reliability of in silico property prediction and is structured in two steps. First, a transformation is sought for mapping a high-dimensional space defined by potentially redundant or irrelevant molecular descriptors into a low-dimensional application-related space. For this task we evaluate three different target-driven subspace mapping methods, of which we highlight the recent Correlative Matrix Mapping (CMM) as the most stable. Second, we apply an applicability domain model on the low-dimensional space to assess the confidence of compound classification. Using a probabilistic framework, the applicability domain approach identifies compounds that are poorly represented in the training set (extrapolation problems) and regions of the space where the uncertainty about the correct class is higher than normal (interpolation problems). This two-step approach represents an important contribution to the development of confident prediction tools in chemoinformatics, a field in need of both interpretable models and methods that estimate the confidence of predictions.
Advances in Engineering Software | 2000
Gustavo E. Vazquez; Ignacio Ponzoni; Mabel Sánchez; Nélida Beatriz Brignole
A software tool for the automatic generation of steady-state process models to be used in instrumentation analysis was developed. We describe the program, called ModGen, discussing its main advantages and potential benefits. ModGen constitutes the front-end of a complete decision support system (DSS) for plant instrumentation design and revamp. This DSS is currently under development. The paper concludes with a description of ModGen's application to the classification of unmeasured variables of an existing medium-size process plant by means of the GS-FLCN structural technique for observability analysis.
PLOS ONE | 2012
Gregorio Iraola; Gustavo E. Vazquez; Lucía Spangenberg; Hugo Naya
Although there have been great advances in understanding bacterial pathogenesis, there is still a lack of integrative information about what makes a bacterium a human pathogen. The advent of high-throughput sequencing technologies has dramatically increased the number of completed bacterial genomes, for both known human pathogenic and non-pathogenic strains; this information is now available to investigate genetic features that determine pathogenic phenotypes in bacteria. In this work we determined presence/absence patterns of different virulence-related genes among more than finished bacterial genomes from both human pathogenic and non-pathogenic strains, belonging to different taxonomic groups (e.g., Actinobacteria, Gammaproteobacteria, Firmicutes). An accuracy of 95% is obtained when classifying human pathogens and non-pathogens using a cross-validation scheme with in-fold feature selection. A reduced subset of highly informative genes () is presented and applied to an external validation set. The statistical model was implemented in the BacFier v1.0 software (freely available at ), which displays not only the prediction (pathogen/non-pathogen) and an associated probability of pathogenicity, but also the presence/absence vector for the analyzed genes, so that it is possible to identify the subset of virulence genes responsible for the classification of the analyzed genome. Furthermore, we discuss the biological relevance for bacterial pathogenesis of the core set of genes, corresponding to eight functional categories, all with evident and documented association with the phenotypes of interest. We also analyze which functional categories of virulence genes were most distinctive for pathogenicity in each taxonomic group, which seems to be a completely new kind of information and could lead to important evolutionary conclusions.
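The idea of cross-validation with in-fold feature selection mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the BacFier model: the gene-scoring rule (difference in presence frequency between classes), the 1-nearest-neighbour classifier, the function names and the toy data are all assumptions made for the example. The point it shows is that gene selection is repeated inside every fold using training genomes only, so the held-out genomes never influence which genes are chosen.

```python
def select_genes(X, y, m):
    """Keep the m genes whose presence frequency differs most between classes.
    (Hypothetical scoring rule, chosen only for illustration.)"""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]

    def freq(rows, j):
        return sum(r[j] for r in rows) / len(rows) if rows else 0.0

    scores = [(abs(freq(pos, j) - freq(neg, j)), j) for j in range(len(X[0]))]
    # Sort by descending score, breaking ties by gene index.
    return [j for _, j in sorted(scores, key=lambda t: (-t[0], t[1]))[:m]]


def predict(X_train, y_train, genes, row):
    """1-nearest neighbour using Hamming distance over the selected genes."""
    def dist(a, b):
        return sum(a[j] != b[j] for j in genes)

    best = min(range(len(X_train)), key=lambda i: dist(X_train[i], row))
    return y_train[best]


def cross_validate(X, y, k, m):
    """k-fold CV; gene selection is redone in each fold on training rows only."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        genes = select_genes(X_tr, y_tr, m)  # test rows are never seen here
        correct += sum(predict(X_tr, y_tr, genes, X[i]) == y[i] for i in test_idx)
    return correct / n


# Toy presence/absence matrix (rows: genomes, columns: genes); made-up data.
# Gene 0 separates the classes; genes 1 and 2 are uninformative.
X = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 1, 1],
     [0, 0, 1], [0, 1, 0], [0, 0, 0], [0, 1, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = pathogen, 0 = non-pathogen

accuracy = cross_validate(X, y, k=4, m=1)
print(accuracy)
```

Selecting genes once on the full data set and then cross-validating would let the test genomes leak into the selection step and tend to inflate the reported accuracy, which is why the selection sits inside the loop.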
Molecules | 2012
Damián Palomba; María Jimena Martínez; Ignacio Ponzoni; Mónica Fátima Díaz; Gustavo E. Vazquez; Axel J. Soto
Volatile organic compounds (VOCs) are contained in a variety of chemicals that can be found in household products and may have undesirable effects on health. Therefore, it is important to model blood-to-liver partition coefficients (log Pliver) for VOCs in a fast and inexpensive way. In this paper, we present two new quantitative structure-property relationship (QSPR) models for the prediction of log Pliver, and we also propose a hybrid approach for the selection of the descriptors. This hybrid methodology combines a machine learning method with a manual selection based on expert knowledge, which yields a set of descriptors that is interpretable in physicochemical terms. Our regression models were trained using decision trees and neural networks and validated using an external test set. Results show high prediction accuracy compared to previous log Pliver models, and the descriptor selection approach provides a means of obtaining a small set of descriptors that agrees with the theoretical understanding of the target property.
Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics | 2008
Axel J. Soto; Rocío L. Cecchini; Gustavo E. Vazquez; Ignacio Ponzoni
Wrapper methods select a subset of features or variables in a data set such that these features are the most relevant for predicting a target value. In the chemoinformatics context, determining the most significant set of descriptors is of great importance because of its contribution to improving ADMET prediction models. In this paper, a comprehensive analysis of descriptor selection aimed at physicochemical property prediction is presented. In addition, we propose an evolutionary approach in which different fitness functions are compared. The comparison consists in establishing which method selects the subset of descriptors that best predicts a given property while keeping the cardinality of the subset to a minimum. The performance of the proposal was assessed for predicting hydrophobicity, using an ensemble of neural networks for the prediction task. The results showed that the evolutionary approach using a nonlinear fitness function constitutes a novel and promising technique for this bioinformatics application.
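The trade-off described in this abstract, prediction quality against subset cardinality, can be illustrated with a small evolutionary wrapper. This is only a sketch under stated assumptions: the fitness formula, the penalty weight, the genetic operators and the `toy_error` stand-in are hypothetical and are not the fitness functions compared in the paper. In a real wrapper, the error function would train and validate a predictor (the paper uses an ensemble of neural networks) on the descriptors selected by the mask.

```python
import random


def fitness(mask, error_fn, weight=0.1):
    """Lower is better: prediction error plus a penalty on the fraction
    of descriptors kept. (Hypothetical formula, for illustration only.)"""
    n_selected = sum(mask)
    if n_selected == 0:
        return float("inf")  # an empty descriptor subset cannot predict anything
    return error_fn(mask) + weight * n_selected / len(mask)


def evolve(n_features, error_fn, generations=50, pop_size=20, seed=0):
    """Tiny genetic algorithm over binary descriptor masks."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, error_fn))
        parents = pop[: pop_size // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_features)
            child[i] = 1 - child[i]  # single bit-flip mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda m: fitness(m, error_fn))


# Stand-in for a trained model's validation error: descriptors 0 and 3 are
# assumed to be the only relevant ones, and each missing one costs 1.0.
RELEVANT = (0, 3)


def toy_error(mask):
    return float(sum(1 for j in RELEVANT if not mask[j]))


best = evolve(6, toy_error)
print(best, fitness(best, toy_error))
```

Because the penalty term grows with the number of selected descriptors, two masks with the same prediction error are ranked by size, which is what pushes the search toward small subsets.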
Journal of Integrative Bioinformatics | 2016
Fiorella Cravero; María Jimena Martínez; Gustavo E. Vazquez; Mónica F. Díaz; Ignacio Ponzoni
Several feature extraction approaches for QSPR modelling in cheminformatics are discussed in this paper. In particular, this work focuses on the use of these strategies for predicting mechanical properties, which are relevant for the design of polymeric materials. The methodology analysed in this study employs a feature learning method that uses a quantification process of 2D structural characterization of materials with the autoencoder method. Alternative QSPR models inferred for tensile strength at break (a well-known mechanical property of polymers) are presented. These alternative models are contrasted with QSPR models obtained by a feature selection technique, using accuracy measures and a visual analytics tool. The results provide evidence of the benefits of combining feature learning approaches with feature selection methods for the design of QSPR models.
Ibero-American Conference on AI | 2006
Fernando Asteasuain; Jessica Andrea Carballido; Gustavo E. Vazquez; Ignacio Ponzoni
In this work we present a critical analysis of three novel parallel-distributed implementations of a multi-objective genetic algorithm (pdGAs) for instrumentation design applications. The pdGAs aim at establishing a sensible configuration of sensors for the initialization of instrumentation design studies of industrial processes. They were built on the basis of an evolutionary island model, the master-worker paradigm, and different migration and parameter control policies. The performance of the resulting implementations was assessed by testing algorithmic behavior on an industrial example that corresponds to an ammonia synthesis plant. The results of the three pdGAs were highly satisfactory in terms of speed-up, efficiency and instrumentation quality, showing them to be competitive tools with strong potential for industrial use. Overall, the pdGA version with adaptive parameter control is the best of the three implementations.
Computers & Chemical Engineering | 2001
Ignacio Ponzoni; Gustavo E. Vazquez; Mabel Sánchez; Nélida Beatriz Brignole
In this work we present the parallelisation of the global strategy with first least-connected node (GS-FLCN), which is a novel structural technique for the classification of unmeasured variables in process plant instrumentation design. The algorithm aims at partitioning the occurrence matrix of the process into a specific block lower-triangular form. A parallel master-worker approach is employed to search for all the paths of a given length existing in the associated graph. The code was conceived for distributed environments and the implementation was carried out using the parallel virtual machine (PVM) library. The performance of the parallel algorithm was tested on industrial case studies and the results were compared with those yielded by the sequential version. The time savings achieved through parallelisation were significant. Moreover, the parallel version can explore more paths per unit time, which in practice implies greater robustness.
Canadian Conference on Artificial Intelligence | 2011
Axel J. Soto; Marc Strickert; Gustavo E. Vazquez; Evangelos E. Milios
Subspace mapping methods aim at projecting high-dimensional data into a subspace where a specific objective function is optimized. Such dimension reduction allows the removal of collinear and irrelevant variables for creating informative visualizations and task-related data spaces. These specific and generally de-noised subspaces enable machine learning methods to work more efficiently. We present a new and general subspace mapping method, Correlative Matrix Mapping (CMM), and evaluate its abilities for category-driven text organization by assessing neighborhood preservation, class coherence, and classification. This approach is evaluated on the challenging task of processing short and noisy documents.
Journal of Cheminformatics | 2010
Axel J. Soto; Marc Strickert; Gustavo E. Vazquez
QSPR methods represent a useful approach in the drug discovery process, since they allow the biological or physicochemical properties of a candidate drug to be predicted in advance. To this end, the QSPR method must be as accurate as possible in order to provide reliable predictions. Moreover, the selection of the molecular descriptors is an important task for creating QSPR prediction models of low complexity which, at the same time, provide accurate predictions. In this work, a matrix-based method [1] is used to transform the original data space of chemical compounds into an alternative space where compounds with different target properties can be better separated. To use this approach, QSPR is treated as a classification problem. The advantage of using adaptive matrix metrics is twofold: they can be used to identify important molecular descriptors and, at the same time, they improve the classification accuracy. A recently proposed method making use of this concept [2] is extended to multi-class data. The new method is related to linear discriminant analysis and shows better results, albeit at higher computational cost. An application relating chemical descriptors to the hydrophobicity property [3] shows promising results.