Yasset Perez-Riverol
European Bioinformatics Institute
Publication
Featured research published by Yasset Perez-Riverol.
Nucleic Acids Research | 2012
Juan Antonio Vizcaíno; Richard G. Côté; Attila Csordas; Jose Ángel Dianes; Antonio Fabregat; Joseph M. Foster; Johannes Griss; Emanuele Alpi; Melih Birim; Javier Contell; Gavin O’Kelly; Andreas Schoenegger; David Ovelleiro; Yasset Perez-Riverol; Florian Reisinger; Daniel Ríos; Rui Wang; Henning Hermjakob
The PRoteomics IDEntifications (PRIDE, http://www.ebi.ac.uk/pride) database at the European Bioinformatics Institute is one of the most prominent data repositories of mass spectrometry (MS)-based proteomics data. Here, we summarize recent developments in the PRIDE database and related tools. First, we provide up-to-date statistics in data content, splitting the figures by groups of organisms and species, including peptide and protein identifications, and post-translational modifications. We then describe the tools that are part of the PRIDE submission pipeline, especially the recently developed PRIDE Converter 2 (new submission tool) and PRIDE Inspector (visualization and analysis tool). We also give an update about the integration of PRIDE with other MS proteomics resources in the context of the ProteomeXchange consortium. Finally, we briefly review the quality control efforts that are ongoing at present and outline our future plans.
Nucleic Acids Research | 2016
Juan Antonio Vizcaíno; Attila Csordas; Noemi del-Toro; Jose Ángel Dianes; Johannes Griss; Ilias Lavidas; Gerhard Mayer; Yasset Perez-Riverol; Florian Reisinger; Tobias Ternent; Qing-Wei Xu; Rui Wang; Henning Hermjakob
The PRoteomics IDEntifications (PRIDE) database is one of the world-leading data repositories of mass spectrometry (MS)-based proteomics data. Since the beginning of 2014, PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) has been the new PRIDE archival system, replacing the original PRIDE database. Here we summarize the developments in PRIDE resources and related tools since the previous update manuscript in the Database Issue in 2013. PRIDE Archive constitutes a complete redevelopment of the original PRIDE, comprising a new storage backend, data submission system and web interface, among other components. PRIDE Archive supports the most widely used PSI (Proteomics Standards Initiative) data standard formats (mzML and mzIdentML) and implements the data requirements and guidelines of the ProteomeXchange Consortium. The wide adoption of ProteomeXchange within the community has triggered an unprecedented increase in the number of submitted data sets (around 150 data sets per month). We outline some statistics on the current PRIDE Archive data contents. We also report on the status of the PRIDE-related stand-alone tools: PRIDE Inspector, PRIDE Converter 2 and the ProteomeXchange submission tool. Finally, we give a brief update on the resources under development, ‘PRIDE Cluster’ and ‘PRIDE Proteomes’, which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive.
Nature Biotechnology | 2012
Rui Wang; Antonio Fabregat; Daniel Ríos; David Ovelleiro; Joseph M. Foster; Richard G. Côté; Johannes Griss; Attila Csordas; Yasset Perez-Riverol; Florian Reisinger; Henning Hermjakob; Lennart Martens; Juan Antonio Vizcaíno
This work was supported by the Wellcome Trust (grant number WT085949MA) and EMBL core funding. R.G.C. is supported by EU FP7 grant SLING (grant number 226073). J.A.V. is supported by the EU FP7 grants LipidomicNet (grant number 202272) and ProteomeXchange (grant number 260558). A.F. was partially supported by the Spanish network COMBIOMED (RD07/0067/0006, ISCIII-FIS). L.M. would like to acknowledge support from the EU FP7 PRIME-XS grant (grant number 262067).
Nucleic Acids Research | 2017
Eric W. Deutsch; Attila Csordas; Zhi Sun; Andrew F. Jarnuczak; Yasset Perez-Riverol; Tobias Ternent; David S. Campbell; Manuel Bernal-Llinares; Shujiro Okuda; Shin Kawano; Robert L. Moritz; Jeremy J. Carver; Mingxun Wang; Yasushi Ishihama; Nuno Bandeira; Henning Hermjakob; Juan Antonio Vizcaíno
The ProteomeXchange (PX) Consortium of proteomics resources (http://www.proteomexchange.org) was formally started in 2011 to standardize data submission and dissemination of mass spectrometry proteomics data worldwide. We give an overview of the current consortium activities and describe the advances of the past few years. Augmenting the PX founding members (PRIDE and PeptideAtlas, including the PASSEL resource), two new members have joined the consortium: MassIVE and jPOST. ProteomeCentral remains as the common data access portal, providing the ability to search for data sets in all participating PX resources, now with enhanced data visualization components. We describe the updated submission guidelines, now expanded to include four members instead of two. As demonstrated by data submission statistics, PX is supporting a change in culture of the proteomics field: public data sharing is now an accepted standard, supported by requirements for journal submissions resulting in public data release becoming the norm. More than 4500 data sets have been submitted to the various PX resources since 2012. Human is the most represented species with approximately half of the data sets, followed by some of the main model organisms and a growing list of more than 900 diverse species. Data reprocessing activities are becoming more prominent, with both MassIVE and PeptideAtlas releasing the results of reprocessed data sets. Finally, we outline the upcoming advances for ProteomeXchange.
Proteomics | 2015
Yasset Perez-Riverol; Emanuele Alpi; Rui Wang; Henning Hermjakob; Juan Antonio Vizcaíno
Compared to other data-intensive disciplines such as genomics, public deposition and storage of MS-based proteomics data are still less developed due to, among other reasons, the inherent complexity of the data and the variety of data types and experimental workflows. In order to address this need, several public repositories for MS proteomics experiments have been developed, each with different purposes in mind. The most established resources are the Global Proteome Machine Database (GPMDB), PeptideAtlas, and the PRIDE database. Additionally, there are other useful (in many cases recently developed) resources such as ProteomicsDB, Mass Spectrometry Interactive Virtual Environment (MassIVE), Chorus, MaxQB, PeptideAtlas SRM Experiment Library (PASSEL), Model Organism Protein Expression Database (MOPED), and the Human Proteinpedia. In addition, the ProteomeXchange consortium has been recently developed to enable better integration of public repositories and the coordinated sharing of proteomics information, maximizing its benefit to the scientific community. Here, we will review each of the major proteomics resources independently and some tools that enable the integration, mining and reuse of the data. We will also discuss some of the major challenges and current pitfalls in the integration and sharing of the data.
Molecular & Cellular Proteomics | 2014
Johannes Griss; Andrew R. Jones; Timo Sachsenberg; Mathias Walzer; Laurent Gatto; Jürgen Hartler; Gerhard G. Thallinger; Reza M. Salek; Christoph Steinbeck; Nadin Neuhauser; Jürgen Cox; Steffen Neumann; Jun Fan; Florian Reisinger; Qing-Wei Xu; Noemi del Toro; Yasset Perez-Riverol; Fawaz Ghali; Nuno Bandeira; Ioannis Xenarios; Oliver Kohlbacher; Juan Antonio Vizcaíno; Henning Hermjakob
The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R. We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adopted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online.
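To illustrate why a tabular format of this kind is easy to consume downstream, here is a minimal Python sketch of reading mzTab-style text. It assumes the tab-separated layout in which every line begins with a section prefix (MTD for metadata, PRH/PRT for the protein header and rows, PEH/PEP, PSH/PSM and SMH/SML for the other sections, COM for comments). The function name `parse_mztab` and the toy records are hypothetical and are not taken from the paper or from any real data set; a production reader should follow the full specification.

```python
import csv
import io
from collections import defaultdict

# Header-line prefix -> data-line prefix for the tabular sections.
HEADER_TO_ROW = {"PRH": "PRT", "PEH": "PEP", "PSH": "PSM", "SMH": "SML"}


def parse_mztab(text):
    """Split mzTab-style text into a metadata dict and per-section row dicts.

    Hypothetical helper for illustration only; it ignores validation and
    the optional-column rules of the full specification.
    """
    metadata = {}
    columns = {}                   # row prefix -> list of column names
    sections = defaultdict(list)   # row prefix -> list of row dicts
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields or fields[0] in ("", "COM"):
            continue  # skip blank lines and comment lines
        prefix, rest = fields[0], fields[1:]
        if prefix == "MTD" and len(rest) >= 2:
            metadata[rest[0]] = rest[1]
        elif prefix in HEADER_TO_ROW:
            columns[HEADER_TO_ROW[prefix]] = rest
        elif prefix in columns:
            sections[prefix].append(dict(zip(columns[prefix], rest)))
    return metadata, dict(sections)


# Toy input with made-up accessions and scores (assumed, not real data).
EXAMPLE = "\n".join([
    "COM\tHypothetical example, not taken from a real data set",
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tdescription\tToy example",
    "",
    "PRH\taccession\tdescription\tbest_search_engine_score[1]",
    "PRT\tP12345\tExample protein A\t0.01",
    "PRT\tP67890\tExample protein B\t0.05",
])

metadata, sections = parse_mztab(EXAMPLE)
print(metadata["mzTab-version"])
print([row["accession"] for row in sections["PRT"]])
```

Because each section is plain tab-separated text with its own header line, the same rows could equally be loaded into a spreadsheet or an R data frame, which is the accessibility the abstract emphasizes.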
Biochimica et Biophysica Acta | 2014
Yasset Perez-Riverol; Rui Wang; Henning Hermjakob; Markus Müller; Vladimir Vesada; Juan Antonio Vizcaíno
Data processing, management and visualization are central and critical components of a state-of-the-art high-throughput mass spectrometry (MS)-based proteomics experiment, and are often some of the most time-consuming steps, especially for labs without much bioinformatics support. The growing interest in the field of proteomics has triggered an increase in the development of new software libraries, including freely available and open-source software. From database search analysis to post-processing of the identification results, even though the objectives of these libraries and packages can vary significantly, they usually share a number of features. Common use cases include the handling of protein and peptide sequences, the parsing of results from the output files of various proteomics search engines, and the visualization of MS-related information (including mass spectra and chromatograms). In this review, we provide an overview of the existing software libraries and open-source frameworks, and give information on some of the freely available applications that make use of them. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
Nature Biotechnology | 2016
Pedro Navarro; Jörg Kuharev; Ludovic C. Gillet; Oliver M. Bernhardt; Brendan MacLean; Hannes L. Röst; Stephen Tate; Chih Chiang Tsou; Lukas Reiter; Ute Distler; George Rosenberger; Yasset Perez-Riverol; Alexey I. Nesvizhskii; Ruedi Aebersold; Stefan Tenzer
Consistent and accurate quantification of proteins by mass spectrometry (MS)-based proteomics depends on the performance of instruments, acquisition methods and data analysis software. In collaboration with the software developers, we evaluated OpenSWATH, SWATH 2.0, Skyline, Spectronaut and DIA-Umpire, five of the most widely used software methods for processing data from sequential window acquisition of all theoretical fragment-ion spectra (SWATH)-MS, which uses data-independent acquisition (DIA) for label-free protein quantification. We analyzed high-complexity test data sets from hybrid proteome samples of defined quantitative composition acquired on two different MS instruments using different SWATH isolation-window setups. For consistent evaluation, we developed LFQbench, an R package, to calculate metrics of precision and accuracy in label-free quantitative MS and report the identification performance, robustness and specificity of each software tool. Our reference data sets enabled developers to improve their software tools. After optimization, all tools provided highly convergent identification and reliable quantification performance, underscoring their robustness for label-free quantitative proteomics.
Nature Methods | 2016
Johannes Griss; Yasset Perez-Riverol; Steve Lewis; David L. Tabb; Jose Ángel Dianes; Noemi del-Toro; Marc Rurik; Mathias Walzer; Oliver Kohlbacher; Henning Hermjakob; Rui Wang; Juan Antonio Vizcaíno
Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average, 75% of spectra analyzed in an MS experiment remain unidentified. We propose to use spectrum clustering at a large scale to shed light on these unidentified spectra. The Proteomics Identifications (PRIDE) Database Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in the PRIDE Archive, coming from hundreds of data sets, we were able to consistently characterize spectra into three distinct groups: (1) incorrectly identified, (2) correctly identified but below the set scoring threshold, and (3) truly unidentified. Using multiple complementary analysis approaches, we were able to identify ∼20% of the consistently unidentified spectra. The complete spectrum-clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra.
Molecular & Cellular Proteomics | 2016
Yasset Perez-Riverol; Qing Wei Xu; Rui Wang; Julian Uszkoreit; Johannes Griss; Aniel Sánchez; Florian Reisinger; Attila Csordas; Tobias Ternent; Noemi del-Toro; Jose Ángel Dianes; Martin Eisenacher; Henning Hermjakob; Juan Antonio Vizcaíno
The original PRIDE Inspector tool was developed as an open source standalone tool to enable the visualization and validation of mass-spectrometry (MS)-based proteomics data before data submission or already publicly available in the Proteomics Identifications (PRIDE) database. The initial implementation of the tool focused on visualizing PRIDE data by supporting the PRIDE XML format and a direct access to private (password protected) and public experiments in PRIDE. The ProteomeXchange (PX) Consortium has been set up to enable a better integration of existing public proteomics repositories, maximizing its benefit to the scientific community through the implementation of standard submission and dissemination pipelines. Within the Consortium, PRIDE is focused on supporting submissions of tandem MS data. The increasing use and popularity of the new Proteomics Standards Initiative (PSI) data standards such as mzIdentML and mzTab, and the diversity of workflows supported by the PX resources, prompted us to design and implement a new suite of algorithms and libraries that would build upon the success of the original PRIDE Inspector and would enable users to visualize and validate PX “complete” submissions. The PRIDE Inspector Toolsuite supports the handling and visualization of different experimental output files, ranging from spectra (mzML, mzXML, and the most popular peak lists formats) and peptide and protein identification results (mzIdentML, PRIDE XML, mzTab) to quantification data (mzTab, PRIDE XML), using a modular and extensible set of open-source, cross-platform libraries. We believe that the PRIDE Inspector Toolsuite represents a milestone in the visualization and quality assessment of proteomics data. It is freely available at http://github.com/PRIDE-Toolsuite/.