Andy Christoforou
University of Cambridge
Publications
Featured research published by Andy Christoforou.
Nature Communications | 2016
Andy Christoforou; Claire M Mulvey; Lisa M. Breckels; Aikaterini Geladaki; Tracey Hurrell; Penelope Hayward; Thomas Naake; Laurent Gatto; Rosa Viner; Alfonso Martinez Arias; Kathryn S. Lilley
Knowledge of the subcellular distribution of proteins is vital for understanding cellular mechanisms. Capturing the subcellular proteome in a single experiment has proven challenging, with studies focusing on specific compartments or assigning proteins to subcellular niches with low resolution and/or accuracy. Here we introduce hyperLOPIT, a method that couples extensive fractionation and quantitative high-resolution accurate mass spectrometry with multivariate data analysis. We apply hyperLOPIT to a pluripotent stem cell population whose subcellular proteome has not been extensively studied. We provide localization data on over 5,000 proteins with unprecedented spatial resolution to reveal the organization of organelles, sub-organellar compartments, protein complexes, functional networks and steady-state dynamics of proteins and unexpected subcellular locations. The method paves the way for characterizing the impact of post-transcriptional and post-translational modification on protein location and studies involving proteome-level locational changes on cellular perturbation. An interactive open-source resource is presented that enables exploration of these data.
Analytical and Bioanalytical Chemistry | 2012
Andy Christoforou; Kathryn S. Lilley
Isobaric tagging has proven to be a popular quantitative proteomics tool and has been rapidly adopted to study a wide range of biological questions in the few years since its commercialization. While the flexibility and multiplexing capacity afforded by this technology are clear attractions, it is not without its shortcomings. As the speed and sensitivity of mass spectrometers have improved and the application of isobaric tags to all manner of biological systems has increased, significant issues with quantitative accuracy and precision have come to light. Here we review the issues associated with the use of isobaric tagging methods and discuss the possible solutions which have been proposed to improve their precision and accuracy to approach the levels required within quantitative proteomics.
Journal of Proteomics | 2013
Lisa M. Breckels; Laurent Gatto; Andy Christoforou; Arnoud J. Groen; Kathryn S. Lilley; Matthew Trotter
Prediction of protein sub-cellular localisation by employing quantitative mass spectrometry experiments is an expanding field. Several methods have led to the assignment of proteins to specific subcellular localisations by partial separation of organelles across a fractionation scheme coupled with computational analysis. Methods developed to analyse organelle data have largely employed supervised machine learning algorithms to map unannotated abundance profiles to known protein-organelle associations. Such approaches are likely to make association errors if organelle-related groupings present in experimental output are not included in data used to create a protein-organelle classifier. Currently, there is no automated way to detect organelle-specific clusters within such datasets. In order to address the above issues we adapted a phenotype discovery algorithm, originally created to filter image-based output for RNAi screens, to identify putative subcellular groupings in organelle proteomics experiments. We were able to mine datasets to a deeper level and extract interesting phenotype clusters for more comprehensive evaluation in an unbiased fashion upon application of this approach. Organelle-related protein clusters were identified beyond those sufficiently annotated for use as training data. Furthermore, we propose avenues for the incorporation of observations made into general practice for the classification of protein-organelle membership from quantitative MS experiments.
Biological significance: Protein sub-cellular localisation plays an important role in molecular interactions, signalling and transport mechanisms. The prediction of protein localisation by quantitative mass-spectrometry (MS) proteomics is a growing field and an important endeavour in improving protein annotation. Several such approaches use gradient-based separation of cellular organelle content to measure relative protein abundance across distinct gradient fractions. The distribution profiles are commonly mapped in silico to known protein-organelle associations via supervised machine learning algorithms, to create classifiers that associate unannotated proteins to specific organelles. These strategies are prone to error, however, if organelle-related groupings present in experimental output are not represented, for example owing to the lack of existing annotation, when creating the protein-organelle mapping. Here, the application of a phenotype discovery approach to LOPIT gradient-based MS data identifies candidate organelle phenotypes for further evaluation in an unbiased fashion. Software implementation and usage guidelines are provided for application to wider protein-organelle association experiments. In the wider context, semi-supervised organelle discovery is discussed as a paradigm with which to generate new protein annotations from MS-based organelle proteomics experiments.
Nature Methods | 2011
Andy Christoforou; Kathryn S. Lilley
Isobaric tagging methods allow multiplexed quantitative analysis of a wide variety of proteome samples but have been severely limited by problems of accuracy. Two groups now explore this issue and provide complementary solutions to address the problem.
Biochimica et Biophysica Acta | 2014
Laurent Gatto; Andy Christoforou
This review presents how R, the popular statistical environment and programming language, can be used in the frame of proteomics data analysis. A short introduction to R is given, with special emphasis on some of the features that make R and its add-on packages premium software for sound and reproducible data analysis. The reader is also advised on how to find relevant R software for proteomics. Several use cases are then presented, illustrating data input/output, quality control, quantitative proteomics and data analysis. Detailed code and additional links to extensive documentation are available in the freely available companion package RforProteomics. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
Molecular & Cellular Proteomics | 2014
Laurent Gatto; Lisa M. Breckels; Thomas Burger; Daniel J H Nightingale; Arnoud J. Groen; Callum J Campbell; Nino Nikolovski; Claire M Mulvey; Andy Christoforou; Myriam Ferro; Kathryn S. Lilley
Quantitative mass-spectrometry-based spatial proteomics involves elaborate, expensive, and time-consuming experimental procedures, and considerable effort is invested in the generation of such data. Multiple research groups have described a variety of approaches for establishing high-quality proteome-wide datasets. However, data analysis is as critical as data production for reliable and insightful biological interpretation, and no consistent and robust solutions have been offered to the community so far. Here, we introduce the requirements for rigorous spatial proteomics data analysis, as well as the statistical machine learning methodologies needed to address them, including supervised and semi-supervised machine learning, clustering, and novelty detection. We present freely available software solutions that implement innovative state-of-the-art analysis pipelines and illustrate the use of these tools through several case studies involving multiple organisms, experimental designs, mass spectrometry platforms, and quantitation techniques. We also propose sound analysis strategies for identifying dynamic changes in subcellular localization by comparing and contrasting data describing different biological conditions. We conclude by discussing future needs and developments in spatial proteomics data analysis.
PLOS Computational Biology | 2016
Lisa M. Breckels; Sean B. Holden; David Wojnar; Claire M Mulvey; Andy Christoforou; Arnoud J. Groen; Matthew Trotter; Oliver Kohlbacher; Kathryn S. Lilley; Laurent Gatto
Sub-cellular localisation of proteins is an essential post-translational regulatory mechanism that can be assayed using high-throughput mass spectrometry (MS). These MS-based spatial proteomics experiments enable us to pinpoint the sub-cellular distribution of thousands of proteins in a specific system under controlled conditions. Recent advances in high-throughput MS methods have yielded a plethora of experimental spatial proteomics data for the cell biology community. Yet, there are many third-party data sources, such as immunofluorescence microscopy or protein annotations and sequences, which represent a rich and vast source of complementary information. We present a unique transfer learning classification framework that utilises a nearest-neighbour or support vector machine system, to integrate heterogeneous data sources to considerably improve on the quantity and quality of sub-cellular protein assignment. We demonstrate the utility of our algorithms through evaluation of five experimental datasets, from four different species in conjunction with four different auxiliary data sources to classify proteins to tens of sub-cellular compartments with high generalisation accuracy. We further apply the method to an experiment on pluripotent mouse embryonic stem cells to classify a set of previously unknown proteins, and validate our findings against a recent high resolution map of the mouse stem cell proteome. The methodology is distributed as part of the open-source Bioconductor pRoloc suite for spatial proteomics data analysis.
Methods in Molecular Biology | 2014
Andy Christoforou; Alfonso Martinez Arias; Kathryn S. Lilley
Protein subcellular localization is a fundamental feature of posttranslational functional regulation. Traditional microscopy-based approaches to study protein localization are typically of limited throughput, and dependent on the availability of antibodies with high specificity and sensitivity, or fluorescent fusion proteins. In this chapter we describe how Localization of Organelle Proteins by Isotope Tagging (LOPIT), a mass spectrometry-based workflow coupling biochemical fractionation and iTRAQ™ 8-plex quantification, can be applied for the high-throughput characterization of protein localization in a mammalian cell culture line.
Stem Cells | 2015
Claire M Mulvey; Christian Schröter; Laurent Gatto; Duygu Dikicioglu; Işık Barış Fidaner; Andy Christoforou; Michael J. Deery; Lily Ty Cho; Kathy K. Niakan; Alfonso Martinez-Arias; Kathryn S. Lilley
During mammalian preimplantation development, the cells of the blastocyst's inner cell mass differentiate into the epiblast and primitive endoderm lineages, which give rise to the fetus and extra-embryonic tissues, respectively. Extra-embryonic endoderm (XEN) differentiation can be modeled in vitro by induced expression of GATA transcription factors in mouse embryonic stem cells. Here, we use this GATA-inducible system to quantitatively monitor the dynamics of global proteomic changes during the early stages of this differentiation event and also investigate the fully differentiated phenotype, as represented by embryo-derived XEN cells. Using mass spectrometry-based quantitative proteomic profiling with multivariate data analysis tools, we reproducibly quantified 2,336 proteins across three biological replicates and have identified clusters of proteins characterized by distinct, dynamic temporal abundance profiles. We first used this approach to highlight novel marker candidates of the pluripotent state and XEN differentiation. Through functional annotation enrichment analysis, we have shown that the downregulation of chromatin-modifying enzymes, the reorganization of membrane trafficking machinery, and the breakdown of cell–cell adhesion are successive steps of the extra-embryonic differentiation process. Thus, applying a range of sophisticated clustering approaches to a time-resolved proteomic dataset has allowed the elucidation of complex biological processes which characterize stem cell differentiation and could establish a general paradigm for the investigation of these processes. Stem Cells 2015;33:2712–2725
Journal of Proteome Research | 2014
Pavel V. Shliaha; Rebekah Jukes-Jones; Andy Christoforou; Jonathan Fox; Chris Hughes; James I. Langridge; Kelvin Cain; Kathryn S. Lilley
Despite the increasing popularity of data-independent acquisition workflows, data-dependent acquisition (DDA) is still the prevalent method of LC-MS-based proteomics. DDA is the basis of the isobaric mass tagging technique, a powerful MS2 quantification strategy that allows coanalysis of up to 10 proteomics samples. A well-documented limitation of DDA, however, is precursor coselection, whereby a target peptide is coisolated with other ions for fragmentation. Here, we investigated whether additional peptide purification by traveling wave ion mobility separation (TWIMS) can reduce precursor contamination using a mixture of Saccharomyces cerevisiae and HeLa proteomes. In accordance with previous reports on FAIMS-Orbitrap instruments, we find that TWIMS provides a remarkable improvement (on average 2.85 times) in the signal-to-noise ratio for sequence ions. We also report that TWIMS reduces reporter ion contamination by around one-third (to 14-15% contamination) and even further (to 6-9%) when combined with a narrowed quadrupole isolation window. We discuss challenges associated with applying TWIMS purification to isobaric mass tagging experiments, including correlation between ion m/z and drift time, which means that coselected peptides are expected to have similar mobility. We also demonstrate that labeling results in peptides having more uniform m/z and drift time distributions than observed for unlabeled peptides. Data are available via ProteomeXchange with identifier PXD001047.