Publication


Featured research published by Joseph M. Foster.


Nucleic Acids Research | 2012

The Proteomics Identifications (PRIDE) database and associated tools: status in 2013

Juan Antonio Vizcaíno; Richard G. Côté; Attila Csordas; Jose Ángel Dianes; Antonio Fabregat; Joseph M. Foster; Johannes Griss; Emanuele Alpi; Melih Birim; Javier Contell; Gavin O’Kelly; Andreas Schoenegger; David Ovelleiro; Yasset Perez-Riverol; Florian Reisinger; Daniel Ríos; Rui Wang; Henning Hermjakob

The PRoteomics IDEntifications (PRIDE, http://www.ebi.ac.uk/pride) database at the European Bioinformatics Institute is one of the most prominent data repositories of mass spectrometry (MS)-based proteomics data. Here, we summarize recent developments in the PRIDE database and related tools. First, we provide up-to-date statistics in data content, splitting the figures by groups of organisms and species, including peptide and protein identifications, and post-translational modifications. We then describe the tools that are part of the PRIDE submission pipeline, especially the recently developed PRIDE Converter 2 (new submission tool) and PRIDE Inspector (visualization and analysis tool). We also give an update about the integration of PRIDE with other MS proteomics resources in the context of the ProteomeXchange consortium. Finally, we briefly review the quality control efforts that are ongoing at present and outline our future plans.


Nucleic Acids Research | 2010

The Proteomics Identifications database: 2010 update

Juan Antonio Vizcaíno; Richard G. Côté; Florian Reisinger; Harald Barsnes; Joseph M. Foster; Jonathan Rameseder; Henning Hermjakob; Lennart Martens

The Proteomics Identifications database (PRIDE, http://www.ebi.ac.uk/pride) at the European Bioinformatics Institute has become one of the main repositories of mass spectrometry-derived proteomics data. For the last 2 years, PRIDE data holdings have grown substantially, comprising 60 different species, more than 2.5 million protein identifications, 11.5 million peptides and over 50 million spectra by September 2009. We here describe several new and improved features in PRIDE, including the revised submission process, which now includes direct submission of fragment ion annotations. Correspondingly, it is now possible to visualize spectrum fragmentation annotations on tandem mass spectra, a key feature for compliance with journal data submission requirements. We also describe recent developments in the PRIDE BioMart interface, which now allows integrative queries that can join PRIDE data to a growing number of biological resources such as Reactome, Ensembl, InterPro and UniProt. This ability to perform extremely powerful across-domain queries will certainly be a cornerstone of future bioinformatics analyses. Finally, we highlight the importance of data sharing in the proteomics field, and the corresponding integration of PRIDE with other databases in the ProteomeXchange consortium.


Proteomics | 2009

A guide to the Proteomics Identifications Database proteomics data repository

Juan Antonio Vizcaíno; Richard G. Côté; Florian Reisinger; Joseph M. Foster; Michael Mueller; Jonathan Rameseder; Henning Hermjakob; Lennart Martens

The Proteomics Identifications Database (PRIDE, www.ebi.ac.uk/pride) is one of the main repositories of MS-derived proteomics data. Here, we point out the main functionalities of PRIDE both as a submission repository and as a source for proteomics data. We describe the main features for data retrieval and visualization available through the PRIDE web and BioMart interfaces. We also highlight the mechanism by which tailored queries in the BioMart can join PRIDE to other resources such as Reactome, Ensembl or UniProt to execute extremely powerful across‐domain queries. We then present the latest improvements in the PRIDE submission process, using the new easy‐to‐use, platform‐independent graphical user interface submission tool PRIDE Converter. Finally, we speak about future plans and the role of PRIDE in the ProteomeXchange consortium.


Nature Biotechnology | 2012

PRIDE Inspector: a tool to visualize and validate MS proteomics data

Rui Wang; Antonio Fabregat; Daniel Ríos; David Ovelleiro; Joseph M. Foster; Richard G. Côté; Johannes Griss; Attila Csordas; Yasset Perez-Riverol; Florian Reisinger; Henning Hermjakob; Lennart Martens; Juan Antonio Vizcaíno

This work was supported by the Wellcome Trust (grant number WT085949MA) and EMBL core funding. R.G.C. is supported by EU FP7 grant SLING (grant number 226073). J.A.V. is supported by the EU FP7 grants LipidomicNet (grant number 202272) and ProteomeXchange (grant number 260558). A.F. was partially supported by the Spanish network COMBIOMED (RD07/0067/0006, ISCIII-FIS). L.M. would like to acknowledge support from the EU FP7 PRIME-XS grant (grant number 262067).


Journal of Proteomics | 2010

Proteomics data repositories: Providing a safe haven for your data and acting as a springboard for further research

Juan Antonio Vizcaíno; Joseph M. Foster; Lennart Martens

Despite the fact that data deposition is not a generalised fact yet in the field of proteomics, several mass spectrometry (MS) based proteomics repositories are publicly available for the scientific community. The main existing resources are: the Global Proteome Machine Database (GPMDB), PeptideAtlas, the PRoteomics IDEntifications database (PRIDE), Tranche, and NCBI Peptidome. In this review the capabilities of each of these will be described, paying special attention to four key properties: data types stored, applicable data submission strategies, supported formats, and available data mining and visualization tools. Additionally, the data contents from model organisms will be enumerated for each resource. There are other valuable smaller and/or more specialized repositories but they will not be covered in this review. Finally, the concept behind the ProteomeXchange consortium, a collaborative effort among the main resources in the field, will be introduced.


PLOS ONE | 2013

LipidHome: A Database of Theoretical Lipids Optimized for High Throughput Mass Spectrometry Lipidomics

Joseph M. Foster; Pablo Moreno; Antonio Fabregat; Henning Hermjakob; Christoph Steinbeck; Rolf Apweiler; Michael J. O. Wakelam; Juan Antonio Vizcaíno

Protein sequence databases are the pillar upon which modern proteomics is supported, representing a stable reference space of predicted and validated proteins. One example of such resources is UniProt, enriched with both expertly curated and automatic annotations. Taken largely for granted, similar mature resources such as UniProt are not available yet in some other “omics” fields, lipidomics being one of them. While having a seasoned community of wet lab scientists, lipidomics lies significantly behind proteomics in the adoption of data standards and other core bioinformatics concepts. This work aims to reduce the gap by developing an equivalent resource to UniProt called ‘LipidHome’, providing theoretically generated lipid molecules and useful metadata. Using the ‘FASTLipid’ Java library, a database was populated with theoretical lipids, generated from a set of community agreed upon chemical bounds. In parallel, a web application was developed to present the information and provide computational access via a web service. Designed specifically to accommodate high throughput mass spectrometry based approaches, lipids are organised into a hierarchy that reflects the variety in the structural resolution of lipid identifications. Additionally, cross-references to other lipid related resources and papers that cite specific lipids were used to annotate lipid records. The web application encompasses a browser for viewing lipid records and a ‘tools’ section where an MS1 search engine is currently implemented. LipidHome can be accessed at http://www.ebi.ac.uk/apweiler-srv/lipidhome.


Database | 2012

PRIDE: Quality control in a proteomics data repository

Attila Csordas; David Ovelleiro; Rui Wang; Joseph M. Foster; Daniel Ríos; Juan Antonio Vizcaíno; Henning Hermjakob

The PRoteomics IDEntifications (PRIDE) database is a large public proteomics data repository, containing over 270 million mass spectra (by November 2011). PRIDE is an archival database, providing the proteomics data supporting specific scientific publications in a computationally accessible manner. While PRIDE faces rapid increases in data deposition size as well as number of depositions, the major challenge is to ensure a high quality of data depositions in the context of highly diverse proteomics work flows and data representations. Here, we describe the PRIDE curation pipeline and its practical application in quality control of complex data depositions. Database URL: http://www.ebi.ac.uk/pride/.


Nature Methods | 2013

PRIDE Cluster: building a consensus of proteomics data

Johannes Griss; Joseph M. Foster; Henning Hermjakob; Juan Antonio Vizcaíno

To the editor: The amount of mass spectrometry (MS) proteomics data in public repositories is growing rapidly [1] but its (re-)use to increase the reliability of newly performed experiments is still limited. Two of the major obstacles are the high heterogeneity of the data present in repositories, and the inflation of false positive identifications when combining datasets. Here we present ‘PRIDE Cluster’: a novel method to identify reliable identifications in heterogeneous MS proteomics experiments. It is used to highlight reliable peptide identifications in the PRIDE database [2] (http://www.ebi.ac.uk/pride) and generate constantly updated, reliable spectral libraries based on these identifications.


Proteomics | 2012

Chromatographic retention time prediction for posttranslationally modified peptides

Luminita Moruz; An Staes; Joseph M. Foster; Maria Hatzou; Evy Timmerman; Lennart Martens; Lukas Käll

Retention time prediction of peptides in liquid chromatography has proven to be a valuable tool for mass spectrometry‐based proteomics, especially in designing more efficient procedures for state‐of‐the‐art targeted workflows. Additionally, accurate retention time predictions can also be used to increase confidence in identifications in shotgun experiments. Despite these obvious benefits, the use of such methods has so far not been extended to (posttranslationally) modified peptides due to the absence of efficient predictors for such peptides. We here therefore describe a new retention time predictor for modified peptides, built on the foundations of our existing Elude algorithm. We evaluated our software by applying it on five types of commonly encountered modifications. Our results show that Elude now yields equally good prediction performances for modified and unmodified peptides, with correlation coefficients between predicted and observed retention times ranging from 0.93 to 0.98 for all the investigated datasets. Furthermore, we show that our predictor handles peptides carrying multiple modifications as well. This latest version of Elude is fully portable to new chromatographic conditions and can readily be applied to other types of posttranslational modifications. Elude is available under the permissive Apache2 open source License at http://per-colator.com or can be run via a web‐interface at http://elude.sbc.su.se.


Proteomics | 2011

A posteriori quality control for the curation and reuse of public proteomics data

Joseph M. Foster; Sven Degroeve; Laurent Gatto; Matthieu Visser; Rui Wang; Johannes Griss; Rolf Apweiler; Lennart Martens

Proteomics is a rapidly expanding field encompassing a multitude of complex techniques and data types. To date much effort has been devoted to achieving the highest possible coverage of proteomes with the aim to inform future developments in basic biology as well as in clinical settings. As a result, growing amounts of data have been deposited in publicly available proteomics databases. These data are in turn increasingly reused for orthogonal downstream purposes such as data mining and machine learning. These downstream uses however, need ways to a posteriori validate whether a particular data set is suitable for the envisioned purpose. Furthermore, the (semi‐)automatic curation of repository data is dependent on analyses that can highlight misannotation and edge conditions for data sets. Such curation is an important prerequisite for efficient proteomics data reuse in the life sciences in general. We therefore present here a selection of quality control metrics and approaches for the a posteriori detection of potential issues encountered in typical proteomics data sets. We illustrate our metrics by relying on publicly available data from the Proteomics Identifications Database (PRIDE), and simultaneously show the usefulness of the large body of PRIDE data as a means to derive empirical background distributions for relevant metrics.

Collaboration


Dive into Joseph M. Foster's collaborations.

Top Co-Authors

Juan Antonio Vizcaíno

European Bioinformatics Institute

Rui Wang

European Bioinformatics Institute

Attila Csordas

European Bioinformatics Institute

Daniel Ríos

European Bioinformatics Institute

Antonio Fabregat

European Bioinformatics Institute

David Ovelleiro

European Bioinformatics Institute

Johannes Griss

Medical University of Vienna
