Felipe da Veiga Leprevost
University of Michigan
Publication
Featured research published by Felipe da Veiga Leprevost.
Nature Protocols | 2016
Paulo C. Carvalho; Diogo B. Lima; Felipe da Veiga Leprevost; Marlon Dias Mariano Santos; Juliana S. G. Fischer; Priscila Ferreira Aquino; James J. Moresco; John R. Yates; Valmir Carneiro Barbosa
PatternLab for proteomics is an integrated computational environment that unifies several previously published modules for the analysis of shotgun proteomic data. The contained modules allow for formatting of sequence databases, peptide spectrum matching, statistical filtering and data organization, extracting quantitative information from label-free and chemically labeled data, and analyzing statistics for differential proteomics. PatternLab also has modules to perform similarity-driven studies with de novo sequencing data, to evaluate time-course experiments and to highlight the biological significance of data with regard to the Gene Ontology database. The PatternLab for proteomics 4.0 package brings together all of these modules in a self-contained software environment, which allows for complete proteomic data analysis and the display of results in a variety of graphical formats. All updates to PatternLab, including new features, have been previously tested on millions of mass spectra. PatternLab is easy to install, and it is freely available from http://patternlabforproteomics.org.
Nature Biotechnology | 2017
Yasset Perez-Riverol; Mingze Bai; Felipe da Veiga Leprevost; Silvano Squizzato; Young Mi Park; Kenneth Haug; Adam J. Carroll; Dylan Spalding; Justin Paschall; Mingxun Wang; Noemi del-Toro; Tobias Ternent; Peng Zhang; Nicola Buso; Nuno Bandeira; Eric W. Deutsch; David S. Campbell; Ronald C. Beavis; Reza M. Salek; Ugis Sarkans; Robert Petryszak; Maria Keays; Eoin Fahy; Manish Sud; Shankar Subramaniam; Ariana Barberá; Rafael C. Jimenez; Alexey I. Nesvizhskii; Susanna-Assunta Sansone; Christoph Steinbeck
Yasset Perez-Riverol (a, †, *), Mingze Bai (a, b, c, †), Felipe da Veiga Leprevost (d), Silvano Squizzato (a), Young Mi Park (a), Kenneth Haug (a), Adam J. Carroll (e), Dylan Spalding (a), Justin Paschall (a), Mingxun Wang (f), Noemi del-Toro (a), Tobias Ternent (a), Peng Zhang (d, g), Nicola Buso (a), Nuno Bandeira (f), Eric W. Deutsch (h), David S. Campbell (h), Ronald C. Beavis (i), Reza M. Salek (a), Ugis Sarkans (a), Robert Petryszak (a), Maria Keays (a), Eoin Fahy (j), Manish Sud (j), Shankar Subramaniam (j), Ariana Barberá (k), Rafael C. Jiménez (l), Alexey I. Nesvizhskii (d), Susanna-Assunta Sansone (m), Christoph Steinbeck (a), Rodrigo Lopez (a), Juan Antonio Vizcaíno (a), Peipei Ping (n), and Henning Hermjakob (a, c, *). (a) European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
Bioinformatics | 2017
Felipe da Veiga Leprevost; Björn Grüning; Saulo Alves Aflitos; Hannes L. Röst; Julian Uszkoreit; Harald Barsnes; Marc Vaudel; Pablo Moreno; Laurent Gatto; Jonas Weber; Mingze Bai; Rafael C. Jimenez; Timo Sachsenberg; Julianus Pfeuffer; Roberto Vera Alvarez; Johannes Griss; Alexey I. Nesvizhskii; Yasset Perez-Riverol
Motivation: BioContainers (biocontainers.pro) is an open-source and community-driven framework that provides platform-independent executable environments for bioinformatics software. BioContainers allows labs of all sizes to easily install bioinformatics software, maintain multiple versions of the same software and combine tools into powerful analysis pipelines. BioContainers is based on the popular open-source container frameworks Docker and rkt, which allow software to be installed and executed in an isolated and controlled environment. It also provides infrastructure and basic guidelines to create, manage and distribute bioinformatics containers, with a special focus on omics technologies. These containers can be integrated into more comprehensive bioinformatics pipelines and different architectures (local desktop, cloud environments or HPC clusters). Availability and Implementation: The software is freely available at github.com/BioContainers/.
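As an illustration of running a tool inside an isolated container, the following minimal Python sketch assembles a `docker run` command line. The image name and tag (`biocontainers/blast:2.2.31`), the tool arguments and the host directory are assumptions chosen for the example and are not taken from the abstract; the sketch only builds and prints the command, it does not execute it.

```python
import shlex


def docker_run_command(image, tool_args, host_dir, container_dir="/data"):
    """Build the argument list for running a tool inside a Docker container.

    host_dir is mounted at container_dir, which is also the working directory.
    """
    return [
        "docker", "run",
        "--rm",                                # remove the container when the tool exits
        "-v", f"{host_dir}:{container_dir}",   # share input/output files with the host
        "-w", container_dir,                   # run the tool from the mounted directory
        image,
        *tool_args,
    ]


# Hypothetical image tag and arguments, for illustration only.
cmd = docker_run_command(
    "biocontainers/blast:2.2.31", ["blastp", "-help"], "/tmp/project"
)
print(shlex.join(cmd))
```

Changing only the tag in the image name is how two versions of the same tool could be run side by side without installing either on the host.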
Frontiers in Genetics | 2014
Felipe da Veiga Leprevost; Valmir Carneiro Barbosa; Eduardo L. Francisco; Yasset Perez-Riverol; Paulo C. Carvalho
Felipe da Veiga Leprevost, Valmir C. Barbosa, and Paulo C. Carvalho are supported by Capes and CNPq; Valmir C. Barbosa is supported by the FAPERJ BBP grant; Yasset Perez-Riverol is supported by the BBSRC PROCESS grant [reference BB/K01997X/1].
Molecular & Cellular Proteomics | 2014
Felipe da Veiga Leprevost; Richard H. Valente; Diogo B. Lima; Jonas Perales; Rafael D. Melani; John R. Yates; Valmir Carneiro Barbosa; Magno Junqueira; Paulo C. Carvalho
Peptide spectrum matching is the current gold standard for protein identification via mass-spectrometry-based proteomics. Peptide spectrum matching compares experimental mass spectra against theoretical spectra generated from a protein sequence database to perform identification, but protein sequences not present in a database cannot be identified unless their sequences are in part conserved. The alternative approach, de novo sequencing, can make it possible to infer a peptide sequence directly from a mass spectrum, but interpreting long lists of peptide sequences resulting from large-scale experiments is not trivial. With this as motivation, PepExplorer was developed to use rigorous pattern recognition to assemble a list of homologue proteins using de novo sequencing data coupled to sequence alignment to allow biological interpretation of the data. PepExplorer can read the output of various widely adopted de novo sequencing tools and converge to a list of proteins with a global false-discovery rate. To this end, it employs a radial basis function neural network that considers precursor charge states, de novo sequencing scores, peptide lengths, and alignment scores to select similar protein candidates, from a target-decoy database, usually obtained from phylogenetically related species. Alignments are performed using a modified Smith–Waterman algorithm tailored for the task at hand. We verified the effectiveness of our approach using a reference set of identifications generated by ProLuCID when searching for Pyrococcus furiosus mass spectra on the corresponding NCBI RefSeq database. We then modified the sequence database by swapping amino acids until ProLuCID was no longer capable of identifying any proteins. By searching the mass spectra using PepExplorer on the modified database, we were able to recover most of the identifications at a 1% false-discovery rate. 
Finally, we employed PepExplorer to disclose a comprehensive proteomic assessment of the Bothrops jararaca plasma, a known biological source of natural inhibitors of snake toxins. PepExplorer is integrated into the PatternLab for Proteomics environment, which makes available various tools for downstream data analysis, including resources for quantitative and differential proteomics.
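For readers unfamiliar with the alignment step, the sketch below shows the textbook Smith–Waterman local alignment score in Python. It is not PepExplorer's modified version, whose details the abstract does not give. The match, mismatch and linear gap scores are assumptions chosen for illustration, where a real tool would typically use an amino acid substitution matrix.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Textbook Smith-Waterman with a linear gap penalty; only two rows of the
    dynamic-programming matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: that is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# A de novo sequence tag sharing the stretch "PEPT" with a database sequence.
print(smith_waterman_score("PEPTIDE", "XXPEPTXX"))
```

Because the score is local, a short conserved stretch can still yield a usable alignment score even when the rest of the two sequences differs.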
Journal of Proteomics | 2015
Giselle Villa Flor Brunoro; Marcelle Almeida Caminha; André Teixeira da Silva Ferreira; Felipe da Veiga Leprevost; Paulo C. Carvalho; Jonas Perales; Richard H. Valente; Rubem F. S. Menna-Barreto
Chagas disease is a neglected disease caused by the protozoan Trypanosoma cruzi. This kinetoplastid has a life cycle involving different forms and hosts, with trypomastigotes being the main infective form. Despite various T. cruzi proteomic studies, the proteomic profile of bloodstream trypomastigotes remains unexplored. The aim of this work is to describe the proteome of the T. cruzi bloodstream form. Employing a shotgun approach, 17,394 peptides were identified, corresponding to 7514 proteins, of which 5901 belong to T. cruzi. Cytoskeletal proteins, chaperones, bioenergetics-related enzymes, and trans-sialidases are among the top-scoring identifications. GO analysis revealed that all T. cruzi compartments were assessed, and that the majority of proteins are involved in metabolic processes and/or present catalytic activity. The comparative analysis between the proteomic profiles of bloodstream trypomastigotes and culture-derived or metacyclic trypomastigotes pointed to 2202 proteins exclusively detected in the bloodstream form. These exclusive proteins are related to: (a) surface proteins; (b) non-classical secretion pathway; (c) cytoskeletal dynamics; (d) cell cycle and transcription; (e) proteolysis; (f) redox metabolism; (g) biosynthetic pathways; (h) bioenergetics; (i) protein folding; (j) cell signaling; (k) vesicular traffic; (l) DNA repair; and (m) cell death. This large-scale evaluation of bloodstream trypomastigotes, responsible for the parasite dissemination in the patient, marks a step forward in the comprehension of Chagas disease pathogenesis.
Biological significance: The hemoflagellate protozoan T. cruzi is the etiological agent of Chagas disease and affects millions of people in Latin America and in non-endemic countries. The absence of efficient drugs, especially for treatment during the chronic phase of the disease, stimulates the continuous search for novel molecular targets. The identification of essential molecules, particularly those found in clinically relevant forms of the parasite, could be crucial. Inside the vertebrate host, trypomastigotes circulate in the bloodstream before infecting various tissues. The exposure of bloodstream forms of the parasite to the host immune system likely leads to differential protein expression in the parasite. In this context, an extensive characterization of the proteomic profile of bloodstream trypomastigotes could help to find not only promising drug targets but also antigens for vaccines or diagnostics. This work is a large-scale proteomic assessment of bloodstream trypomastigotes that shows a considerable number of proteins belonging to different metabolic pathways and functions exclusive to this parasitic form, and provides a valuable dataset for the biological understanding of this clinically relevant form of T. cruzi.
Bioinformatics | 2013
Diogo Borges; Yasset Perez-Riverol; Fábio C.S. Nogueira; Gilberto B. Domont; Jesús Noda; Felipe da Veiga Leprevost; Vladimir Besada; Felipe M. G. França; Valmir Carneiro Barbosa; Aniel Sánchez; Paulo C. Carvalho
Summary: Protein identification by mass spectrometry is commonly accomplished using a peptide sequence matching search algorithm, whose sensitivity varies inversely with the size of the sequence database and the number of post-translational modifications considered. We present the Spectrum Identification Machine, a peptide sequence matching tool that capitalizes on the high-intensity b1-fragment ion of tandem mass spectra of peptides coupled in solution with phenylisothiocyanate to confidently sequence the first amino acid and ultimately reduce the search space. We demonstrate that in complex search spaces, a gain of some 120% in sensitivity can be achieved. Availability: All data generated and the software are freely available for academic use at http://proteomics.fiocruz.br/software/sim. Supplementary information: Supplementary data are available at Bioinformatics online.
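The idea of shrinking the search space once the first amino acid is known can be sketched as follows. This is a minimal illustration and not the Spectrum Identification Machine's implementation. The candidate peptides are made up, and the first residue is assumed to have already been read from the b1 ion.

```python
def restrict_by_first_residue(candidates, first_residue):
    """Keep only candidate peptides whose N-terminal residue matches.

    first_residue is assumed to have been inferred from the b1-fragment ion.
    """
    return [peptide for peptide in candidates if peptide.startswith(first_residue)]


# Hypothetical candidates that all fall within the precursor mass tolerance.
candidates = ["LSGDEK", "ASGDLK", "LTNEEK", "GVLDEK", "LKEDSR"]
reduced = restrict_by_first_residue(candidates, "L")

print(reduced)
print(f"search space reduced from {len(candidates)} to {len(reduced)} candidates")
```

With fewer candidates competing for each spectrum, fewer incorrect peptides can achieve a high score by chance, which is one way a smaller search space can translate into higher sensitivity.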
Journal of Proteomics | 2016
Márcia H. Borges; Suely G. Figueiredo; Felipe da Veiga Leprevost; Maria Elena de Lima; Marta N. Cordeiro; Marcelo R.V. Diniz; James J. Moresco; Paulo C. Carvalho; John R. Yates
Tarantula spiders, Theraphosidae family, are spread throughout most tropical regions of the world. Despite their size and reputation, there are few reports of accidents. However, like other spiders, their venom is considered a remarkable source of toxins, which have been selected through millions of years of evolution. The present work provides a proteomic overview of the fascinating complexity of the venomous extract of the Grammostola iheringi tarantula, obtained by electrical stimulation of the chelicerae. For the analysis, a bottom-up proteomic approach, Multidimensional Protein Identification Technology (MudPIT), was used. Using PepExplorer, a similarity-driven search tool that identifies proteins based on phylogenetically close organisms, a total of 395 proteins were identified in this venomous extract. Most of the identifications (~70%) were classified as predicted (21%), hypothetical (6%) and putative (37%), while a small group (6%) had no predicted function. Identified molecules matched neurotoxins that act on ion channels; proteases, such as serine proteases, metalloproteinases, cysteine proteinases, aspartic proteinases, carboxypeptidases and cysteine-rich secretory enzymes (CRISP); and some molecules with unknown targets. Additionally, non-classical venom proteins were also identified. To date, this study represents the first broad characterization of the composition of the G. iheringi venomous extract. Our data provide a tantalizing insight into the diversity of proteins in this venom and their biotechnological potential.
Significance: Animal venoms contain a diversity of molecules able to bind to specific cell targets. Due to their biochemical and physiological properties, these molecules are interesting for medical and biotechnological purposes. In this study, a large number of components of the venomous extract of the spider Grammostola iheringi were identified by the MudPIT technique. It was demonstrated that this approach is a sensitive and adequate method to achieve a broad spectrum of information about animal venoms. Using this bottom-up proteomic method, classical and non-classical venom proteins were identified, which stimulates new interest in the systematic research of venom protein components.
Journal of Proteomics | 2013
Felipe da Veiga Leprevost; Diogo B. Lima; J. Crestani; Yasset Perez-Riverol; Nilson Ivo Tonin Zanchin; Valmir Carneiro Barbosa; Paulo C. Carvalho
Mass-spectrometry-based shotgun proteomics has become a widespread technology for analyzing complex protein mixtures. Here we describe a new module integrated into PatternLab for Proteomics that allows the pinpointing of differentially expressed domains. This is accomplished by inferring functional domains through our cloud service, using HMMER3 and Pfam remotely, and then mapping the quantitation values into domains for downstream analysis. In all, spotting which functional domains are changing when comparing biological states serves as a complementary approach to facilitate the understanding of systems biology. We exemplify the new module's use by reanalyzing a previously published MudPIT dataset of Cryptococcus gattii cultivated under iron-depleted and replete conditions. We show how the differential analysis of functional domains can facilitate the interpretation of proteomic data by providing further valuable insight.
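The step of mapping quantitation values into domains can be pictured with the small Python sketch below. It assumes that a protein's quantitation value is simply added to every domain annotated on that protein; the abstract does not state the actual aggregation rule, and the protein names, domain labels and values are hypothetical.

```python
from collections import defaultdict


def quantify_domains(protein_quant, protein_domains):
    """Sum protein-level quantitation values over the domains each protein carries.

    protein_quant:   protein identifier -> quantitation value
    protein_domains: protein identifier -> iterable of domain labels
    Summation is an assumed aggregation rule, used here for illustration.
    """
    totals = defaultdict(float)
    for protein, value in protein_quant.items():
        # set() ensures a domain repeated within one protein is counted once.
        for domain in set(protein_domains.get(protein, ())):
            totals[domain] += value
    return dict(totals)


# Hypothetical quantitation values and domain annotations.
protein_quant = {"protA": 10.0, "protB": 5.0, "protC": 2.0}
protein_domains = {
    "protA": ["kinase", "SH2"],
    "protB": ["kinase"],
    # protC has no annotated domain and contributes to no total.
}

print(quantify_domains(protein_quant, protein_domains))
```

Running the same aggregation for each biological condition would give per-domain values that can then be compared between conditions.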
bioRxiv | 2016
Yasset Perez-Riverol; M Bai; Felipe da Veiga Leprevost; S Squizzato; Y Park Mi; O Haug; Aj Carroll; D Spalding; J Paschall; Mengchi Wang; Noemi del-Toro; Tobias Ternent; P Zhang; N Buso; Nuno Bandeira; Eric W. Deutsch; David S. Campbell; Ronald C. Beavis; Reza M. Salek; Alexey I. Nesvizhskii; Susanna-Assunta Sansone; Christoph Steinbeck; R Lopez; Juan Antonio Vizcaíno; Peipei Ping; Henning Hermjakob
Biomedical data, in particular omics datasets, are being generated at an unprecedented rate. This is due to the falling costs of generating experimental data, improved accuracy and better accessibility to different omics platforms such as genomics, proteomics and metabolomics [1,2]. As a result, the number of deposited datasets in public repositories originating from various omics approaches has increased dramatically in recent years. With strong support from scientific journals and funders, public data sharing is increasingly considered to be a good scientific practice, facilitating the confirmation of original results, increasing the reproducibility of the analyses, enabling the exploration of new or related hypotheses, fostering the identification of potential errors, and discouraging fraud [3]. This increase in public data deposition of omics results is a good starting point, but it opens up a series of new challenges. For example, the research community must now find more efficient ways of storing, organizing and providing access to biomedical data across platforms. These challenges range from achieving a common representation framework for the datasets and the associated metadata from different omics fields, to the availability of efficient methods, protocols and file formats for data exchange between multiple repositories. Therefore, there is a great need for the development of new platforms and applications that make it possible to search datasets across different omics fields, making such information accessible to the end-user. The FAIR paradigm describes a set of guiding principles to address many of these issues, and aims to make data Findable, Accessible, Interoperable and Re-usable (https://www.force11.org/group/fairgroup/fairprinciples).