André Veríssimo | Researchain

Publication

Featured research published by André Veríssimo.

PLOS ONE | 2015

Host Glycan Sugar-Specific Pathways in Streptococcus pneumonia: Galactose as a Key Sugar in Colonisation and Infection

Laura Paixão; Joana Oliveira; André Veríssimo; Susana Vinga; Eva C. Lourenço; M. Rita Ventura; Morten Kjos; Jan-Willem Veening; Vitor E. Fernandes; Peter W. Andrew; Hasan Yesilkaya; Ana Rute Neves

The human pathogen Streptococcus pneumoniae is a strictly fermentative organism that relies on glycolytic metabolism to obtain energy. In the human nasopharynx S. pneumoniae encounters glycoconjugates composed of a variety of monosaccharides, which can potentially be used as nutrients once depolymerized by glycosidases. Therefore, it is reasonable to hypothesise that the pneumococcus would rely on these glycan-derived sugars to grow. Here, we identified the sugar-specific catabolic pathways used by S. pneumoniae during growth on mucin. Transcriptome analysis of cells grown on mucin showed specific upregulation of genes likely to be involved in deglycosylation, transport and catabolism of galactose, mannose and N-acetylglucosamine. In contrast to growth on mannose and N-acetylglucosamine, S. pneumoniae grown on galactose re-route their metabolic pathway from homolactic fermentation to a truly mixed acid fermentation regime. By measuring intracellular metabolites, enzymatic activities and mutant analysis, we provide an accurate map of the biochemical pathways for galactose, mannose and N-acetylglucosamine catabolism in S. pneumoniae. Intranasal mouse infection models of pneumococcal colonisation and disease showed that only mutants in galactose catabolic genes were attenuated. Our data pinpoint galactose as a key nutrient for growth in the respiratory tract and highlight the importance of central carbon metabolism for pneumococcal pathogenesis.

BMC Systems Biology | 2014

KiMoSys: a web-based repository of experimental data for KInetic MOdels of biological SYStems.

Rafael S. Costa; André Veríssimo; Susana Vinga

Background: The kinetic modeling of biological systems is mainly composed of three steps that proceed iteratively: model building, simulation and analysis. In the first step, it is usually required to set initial metabolite concentrations, and to assign kinetic rate laws, along with estimating parameter values using kinetic data through optimization when these are not known. Although the rapid development of high-throughput methods has generated much omics data, experimentalists present only a summary of obtained results for publication; the experimental data files are usually not submitted to any public repository, or are simply not available at all. In order to automate as much as possible the steps of building kinetic models, there is a growing requirement in the systems biology community for easily exchanging data in combination with models, which represents the main motivation of KiMoSys development.

Description: KiMoSys is a user-friendly platform that includes a public data repository of published experimental data, containing concentration data of metabolites and enzymes and flux data. It was designed to ensure data management, storage and sharing for a wider systems biology community. This community repository offers a web-based interface and upload facility to turn available data into publicly accessible, centralized and structured-format data files. Moreover, it compiles and integrates available kinetic models associated with the data. KiMoSys also integrates some tools to facilitate the kinetic model construction process of large-scale metabolic networks, especially when the systems biologists perform computational research.

Conclusions: KiMoSys is a web-based system that integrates a public data and associated model(s) repository with computational tools, providing the systems biology community with a novel application facilitating data storage and sharing, thus supporting construction of ODE-based kinetic models and collaborative research projects. The web application, implemented using the Ruby on Rails framework, is freely available for web access at http://kimosys.org, along with its full documentation.

BMC Bioinformatics | 2013

BGFit: management and automated fitting of biological growth curves

André Veríssimo; Laura Paixão; Ana Rute Neves; Susana Vinga

Background: Existing tools to model cell growth curves do not offer a flexible integrative approach to manage large datasets and automatically estimate parameters. Due to the increase of experimental time-series from microbiology and oncology, there is a crucial need for software that allows researchers to easily organize experimental data and simultaneously extract relevant parameters in an efficient way.

Results: BGFit provides a web-based unified platform, where a rich set of dynamic models can be fitted to experimental time-series data, and where the results can be managed efficiently in a structured and hierarchical way. The data management system allows users to organize projects, experiments and measurement data, and also to define teams with different editing and viewing permissions. Several dynamic and algebraic models are already implemented, such as polynomial regression, Gompertz, Baranyi, Logistic and Live Cell Fraction models, and the user can easily add new models, thus expanding the current ones.

Conclusions: BGFit allows users to easily manage their data and models in an integrated way, even if they are not familiar with databases or existing computational tools for parameter estimation. BGFit is designed with a flexible architecture that focuses on extensibility and leverages free software with existing tools and methods, making it possible to compare and evaluate different data modeling techniques. The application is described in the context of bacterial and tumor cell growth data fitting, but it is also applicable to any type of two-dimensional data, e.g. physical chemistry and macroeconomic time series, and it scales to a high number of projects, data and model complexity.
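To make the idea of fitting a growth model to a time series concrete, here is a minimal sketch in Python. It is not BGFit's code: the abstract names the Gompertz model but not its parametrisation, so the modified Gompertz form (asymptote A, maximum growth rate mu, lag time lam) is an assumption, and the coarse grid search merely stands in for whatever optimiser the platform uses. All function names are hypothetical.

```python
import math
from itertools import product


def gompertz(t, A, mu, lam):
    """Modified Gompertz curve (assumed parametrisation).

    A   : upper asymptote of the curve
    mu  : maximum growth rate
    lam : lag time
    """
    return A * math.exp(-math.exp(mu * math.e / A * (lam - t) + 1.0))


def sum_squared_error(params, times, values):
    """Sum of squared residuals between the model and the measurements."""
    A, mu, lam = params
    return sum((gompertz(t, A, mu, lam) - y) ** 2 for t, y in zip(times, values))


def grid_fit(times, values, A_grid, mu_grid, lam_grid):
    """Return the (A, mu, lam) triple on the grid with the smallest error."""
    return min(product(A_grid, mu_grid, lam_grid),
               key=lambda p: sum_squared_error(p, times, values))


# Synthetic, noise-free measurements generated from known parameters.
times = list(range(13))
true_params = (2.0, 0.5, 2.0)
values = [gompertz(t, *true_params) for t in times]

fit = grid_fit(times, values,
               A_grid=[1.5, 2.0, 2.5],
               mu_grid=[0.25, 0.5, 0.75],
               lam_grid=[1.0, 2.0, 3.0])
print(fit)
```

With real, noisy data one would replace the grid search with a proper nonlinear least-squares routine; the grid keeps the sketch dependency-free.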

Phytochemistry Reviews | 2018

BacHBerry: BACterial Hosts for production of Bioactive phenolics from bERRY fruits

Alexey Dudnik; A. Filipa Almeida; Ricardo Andrade; Barbara Avila; Pilar Bañados; Diane Barbay; Jean-Etienne Bassard; Mounir Benkoulouche; Michael Bott; Adelaide Braga; Dario Breitel; Rex M. Brennan; Laurent Bulteau; Céline Chanforan; Inês Costa; Rafael S. Costa; Mahdi Doostmohammadi; N. Faria; Chengyong Feng; Armando M. Fernandes; Patrícia Ferreira; Roberto Ferro; Alexandre Foito; Sabine Freitag; Gonçalo Garcia; Paula Gaspar; Joana Godinho-Pereira; Björn Hamberger; András Hartmann; Harald Heider

BACterial Hosts for production of Bioactive phenolics from bERRY fruits (BacHBerry) was a 3-year project funded by the Seventh Framework Programme (FP7) of the European Union that ran between November 2013 and October 2016. The overall aim of the project was to establish a sustainable and economically-feasible strategy for the production of novel high-value phenolic compounds isolated from berry fruits using bacterial platforms. The project aimed at covering all stages of the discovery and pre-commercialization process, including berry collection, screening and characterization of their bioactive components, identification and functional characterization of the corresponding biosynthetic pathways, and construction of Gram-positive bacterial cell factories producing phenolic compounds. Further activities included optimization of polyphenol extraction methods from bacterial cultures, scale-up of production by fermentation up to pilot scale, as well as societal and economic analyses of the processes. This review article summarizes some of the key findings obtained throughout the duration of the project.

BMC Bioinformatics | 2016

DegreeCox – a network-based regularization method for survival analysis

André Veríssimo; Arlindo L. Oliveira; Marie-France Sagot; Susana Vinga

Background: Modeling survival oncological data has become a major challenge as the increase in the amount of molecular information nowadays available means that the number of features greatly exceeds the number of observations. One possible solution to cope with this dimensionality problem is the use of additional constraints in the cost function optimization. Lasso and other sparsity methods have already been successfully applied with this idea. Although this leads to more interpretable models, these methods still do not fully profit from the relations between the features, especially when these can be represented through graphs. We propose DegreeCox, a method that applies network-based regularizers to infer Cox proportional hazard models, when the features are genes and the outcome is patient survival. In particular, we propose to use network centrality measures to constrain the model in terms of significant genes.

Results: We applied DegreeCox to three datasets of ovarian cancer carcinoma and tested several centrality measures such as weighted degree, betweenness and closeness centrality. The a priori network information was retrieved from Gene Co-Expression Networks and Gene Functional Maps. When compared with Ridge and Lasso, DegreeCox shows an improvement in the classification of high and low risk patients, on a par with Net-Cox. The use of network information is especially relevant with datasets that are not easily separated. In terms of RMSE and C-index, DegreeCox gives results that are similar to those of the best performing methods, in a few cases slightly better.

Conclusions: Network-based regularization seems a promising framework to deal with the dimensionality problem. The centrality metrics proposed can be easily expanded to accommodate other topological properties of different biological networks.
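The abstract does not give the exact form of the regularizer, so the following Python sketch only illustrates the general idea of a centrality-weighted penalty. It assumes a quadratic (ridge-like) penalty in which each coefficient is divided by the weighted degree of its gene, so that coefficients on well-connected genes are penalised less. The function names, the toy network and the `floor` guard are hypothetical, and the Cox partial likelihood itself is omitted.

```python
def weighted_degree(adjacency):
    """Weighted degree of each node: sum of its edge weights, ignoring self-loops."""
    return [sum(w for j, w in enumerate(row) if j != i)
            for i, row in enumerate(adjacency)]


def degree_penalty(beta, degrees, floor=1e-8):
    """Assumed penalty: sum_j beta_j^2 / degree_j.

    `floor` avoids division by zero for isolated genes (degree 0).
    """
    return sum(b * b / max(d, floor) for b, d in zip(beta, degrees))


# Toy co-expression network with three genes; gene 0 is the hub.
adjacency = [
    [0.0, 0.8, 0.6],
    [0.8, 0.0, 0.0],
    [0.6, 0.0, 0.0],
]
degrees = weighted_degree(adjacency)

# The same coefficient costs less on the hub than on a weakly connected gene.
hub_only = degree_penalty([1.0, 0.0, 0.0], degrees)
leaf_only = degree_penalty([0.0, 0.0, 1.0], degrees)
print(degrees, hub_only, leaf_only)
```

In a full model this term would be added, scaled by a tuning parameter, to the negative Cox partial log-likelihood before optimisation.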

BMC Bioinformatics | 2018

Ensemble outlier detection and gene selection in triple-negative breast cancer data

Marta B. Lopes; André Veríssimo; Eunice Carrasquinha; Sandra Casimiro; Niko Beerenwinkel; Susana Vinga

Background: Learning accurate models from ‘omics data poses many challenges due to their inherent high-dimensionality, e.g. the number of gene expression variables, and comparatively lower sample sizes, which lead to ill-posed inverse problems. Furthermore, the presence of outliers, either experimental errors or interesting abnormal clinical cases, may severely hamper a correct classification of patients and the identification of reliable biomarkers for a particular disease. We propose to address this problem through an ensemble classification setting based on distinct feature selection and modeling strategies, including logistic regression with elastic net regularization, Sparse Partial Least Squares - Discriminant Analysis (SPLS-DA) and Sparse Generalized PLS (SGPLS), coupled with an evaluation of the individuals’ outlierness based on the Cook’s distance. The consensus is achieved with the Rank Product statistics corrected for multiple testing, which gives a final list of observations sorted by their outlierness level.

Results: We applied this strategy for the classification of Triple-Negative Breast Cancer (TNBC) RNA-Seq and clinical data from the Cancer Genome Atlas (TCGA). The detected 24 outliers were identified as putative mislabeled samples, corresponding to individuals with discrepant clinical labels for the HER2 receptor, but also individuals with abnormal expression values of ER, PR and HER2, contradictory with the corresponding clinical labels, which may invalidate the initial TNBC label. Moreover, the model consensus approach leads to the selection of a set of genes that may be linked to the disease. These results are robust to a resampling approach, either by selecting a subset of patients or a subset of genes, with a significant overlap of the outlier patients identified.

Conclusions: The proposed ensemble outlier detection approach constitutes a robust procedure to identify abnormal cases and consensus covariates, which may improve biomarker selection for precision medicine applications. The method can also be easily extended to other regression models and datasets.

bioRxiv | 2017

MassBlast: A workflow to accelerate RNA-seq and DNA database analysis

André Veríssimo; Jean-Etienne Bassard; Alice Julien-Laferrière; Marie-France Sagot; Susana Vinga

Summary: Current workflows for sequence analysis heavily depend on user input and manual curation. New specialized tools and methods are appearing all the time, but the actions required for a full analysis are disconnected and very time-consuming. The software we propose, MassBlast, combines BLAST+ and an automated workflow analysis to filter the results and significantly improve the annotation of multiple sequencing databases for exploring new biosynthetic pathways and new protein families, among other applications. MassBlast is fully configurable and reproducible.

Availability and Implementation: The MassBlast package is written in Ruby. Source code and releases are freely available from Github (https://github.com/averissimo/mass-blast) for all major platforms (Linux, MS Windows and OS X) under the GPLv3 license.

bioRxiv | 2018

Consensus outlier detection in survival analysis using the rank product test

Eunice Carrasquinha; André Veríssimo; Susana Vinga

Survival analysis is a well-known technique in the medical field. The identification of individuals whose survival time is too short or too long given their profile assumes great importance for the detection of new prognostic factors. The study of these outlying observations has gained increasing relevance with the availability of high-throughput molecular and clinical data for large cohorts of patients. Several methods for outlier detection in survival data have been proposed, which include the analysis of the residuals, the measurement of the concordance c-index, and methods based on quantile regression for censored data. However, different results are obtained depending on the type of method used. In order to resolve this disparity of results, we propose to apply the Rank Product test. A simulated dataset and two clinical datasets were used to illustrate our proposed consensus outlier detection method, one from myeloma disease and the other from The Cancer Genome Atlas (TCGA) ovarian cancer. Finally, the Rank Product with multiple testing corrections was performed in order to identify which observations have the highest rank amongst the methods considered. Our results illustrate the potential of this consensus approach for the automated retrieval of outliers and also the identification of biomarkers associated with survival in large datasets.
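The consensus step can be sketched as follows, under stated assumptions. The rank product statistic is commonly defined as the geometric mean of an item's ranks across several rankings; the Python sketch below applies that definition to outlierness scores from three hypothetical detection methods. The scores are invented for illustration, ties are not handled, and the significance test with multiple testing correction mentioned in the abstract is deliberately left out.

```python
import math


def ranks_descending(scores):
    """Rank 1 goes to the highest score (most outlying); assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def rank_product(score_lists):
    """Geometric mean of each observation's ranks across all methods."""
    all_ranks = [ranks_descending(scores) for scores in score_lists]
    k = len(all_ranks)
    n = len(all_ranks[0])
    return [math.exp(sum(math.log(r[i]) for r in all_ranks) / k)
            for i in range(n)]


# Made-up outlierness scores for four observations from three methods.
method_a = [0.1, 2.5, 0.3, 0.2]
method_b = [0.2, 3.0, 0.1, 0.4]
method_c = [0.5, 1.9, 0.7, 0.1]

rp = rank_product([method_a, method_b, method_c])

# Smaller rank product means more consistently flagged as an outlier.
consensus_order = sorted(range(len(rp)), key=lambda i: rp[i])
print(rp, consensus_order)
```

Observation 1 is ranked first by every method, so its rank product is 1 and it heads the consensus list; the remaining observations are ordered by how consistently the methods agree on them.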

bioRxiv | 2018

Sparse network-based regularization for the analysis of patientomics high-dimensional survival data

André Veríssimo; Eunice Carrasquinha; Marta B. Lopes; Arlindo L. Oliveira; Marie-France Sagot; Susana Vinga

The data made available by modern sequencing technologies represent a major challenge in oncological survival analysis, as the increasing amount of molecular data hampers the generation of models that are both accurate and interpretable. To tackle this problem, this work evaluates the introduction of graph centrality measures in classical sparse survival models such as the elastic net. We explore the use of network information as part of the regularization applied to the inverse problem, obtained both from external knowledge on the features evaluated and from the data themselves. A sparse solution is obtained by promoting either features that are isolated from the network or, alternatively, hubs, i.e., features that are highly connected within the network. We show that introducing the degree information of the features when inferring survival models consistently improves the model predictive performance in breast invasive carcinoma (BRCA) transcriptomic TCGA data while enhancing model interpretability. Preliminary clinical validation is performed using the Cancer Hallmarks Analytics Tool API and the String database. These case studies are included in the recently released glmSparseNet R package, a flexible tool to explore the potential of sparse network-based regularizers in generalized linear models for the analysis of omics data.
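One way to picture the two strategies (promoting hubs versus promoting isolated features) is as per-feature weights inside an elastic net penalty. The Python sketch below is illustrative only: the function names and the specific degree-to-weight transforms are hypothetical and are not taken from the glmSparseNet package, whose actual heuristics the abstract does not describe. The only assumption borrowed from common practice is the weighted elastic net form, where each feature's penalty is multiplied by its own factor.

```python
def hub_weights(degrees, floor=0.01):
    """Hypothetical transform: high-degree features get a small penalty weight."""
    top = max(degrees)
    if top == 0:
        return [1.0] * len(degrees)
    return [max(1.0 - d / top, floor) for d in degrees]


def orphan_weights(degrees, floor=0.01):
    """Hypothetical transform: low-degree (isolated) features get a small weight."""
    top = max(degrees)
    if top == 0:
        return [1.0] * len(degrees)
    return [max(d / top, floor) for d in degrees]


def weighted_elastic_net(beta, weights, lam, alpha):
    """lam * sum_j w_j * (alpha * |beta_j| + (1 - alpha) / 2 * beta_j^2)."""
    return lam * sum(w * (alpha * abs(b) + 0.5 * (1.0 - alpha) * b * b)
                     for b, w in zip(beta, weights))


# Degrees of three features: a hub, a weakly connected feature, an isolated one.
degrees = [4, 1, 0]
hub_w = hub_weights(degrees)
orphan_w = orphan_weights(degrees)

beta = [1.0, 1.0, 1.0]
print(hub_w, weighted_elastic_net(beta, hub_w, lam=1.0, alpha=1.0))
print(orphan_w, weighted_elastic_net(beta, orphan_w, lam=1.0, alpha=1.0))
```

The `floor` keeps every feature at least slightly penalised, so that the L1 part of the penalty can still drive coefficients to zero and the solution stays sparse.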

Biodata Mining | 2018

Identification of influential observations in high-dimensional cancer survival data through the rank product test

Eunice Carrasquinha; André Veríssimo; Marta B. Lopes; Susana Vinga

Background: Survival analysis is a statistical technique widely used in many fields of science, in particular in the medical area, which studies the time until an event of interest occurs. Outlier detection in this context has gained great importance because the identification of long or short-term survivors may lead to the detection of new prognostic factors. However, the results obtained using different outlier detection methods and residuals are seldom the same and are strongly dependent on the specific Cox proportional hazards model selected. In particular, when the inherent data have a high number of covariates, dimensionality reduction becomes a key challenge, usually addressed through regularized optimization, e.g. using Lasso, Ridge or Elastic Net regression. In the case of transcriptomics studies, this is a ubiquitous problem, since each observation has a very high number of associated covariates (genes).

Results: In order to solve this issue, we propose to use the Rank Product test, a non-parametric technique, as a method to identify discrepant observations independently of the selection method and deviance considered. An example based on The Cancer Genome Atlas (TCGA) ovarian cancer dataset is presented, where the covariates are patients’ gene expressions. Three sub-models were considered, and, for each one, different outliers were obtained. Additionally, a resampling strategy was conducted to demonstrate the methods’ consistency and robustness. The Rank Product worked as a consensus method to identify observations that can be influential under survival models, thus potential outliers in the high-dimensional space.

Conclusions: The proposed technique allows us to combine the different results obtained by each sub-model and find which observations are systematically ranked as putative outliers to be explored further from a clinical point of view.
