Publication


Featured research published by Christine Vogel.


Nature Reviews Genetics | 2012

Insights into the regulation of protein abundance from proteomic and transcriptomic analyses

Christine Vogel; Edward M. Marcotte

Recent advances in next-generation DNA sequencing and proteomics provide an unprecedented ability to survey mRNA and protein abundances. Such proteome-wide surveys are illuminating the extent to which different aspects of gene expression help to regulate cellular protein abundances. Current data demonstrate a substantial role for regulatory processes occurring after mRNA is made — that is, post-transcriptional, translational and protein degradation regulation — in controlling steady-state protein abundances. Intriguing observations are also emerging in relation to cells following perturbation, single-cell studies and the apparent evolutionary conservation of protein and mRNA abundances. Here, we summarize current understanding of the major factors regulating protein expression.


Nature Biotechnology | 2007

Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation

Peng Lu; Christine Vogel; Rong Wang; Xin Yao; Edward M. Marcotte

We report a method for large-scale absolute protein expression measurements (APEX) and apply it to estimate the relative contributions of transcriptional- and translational-level gene regulation in the yeast and Escherichia coli proteomes. APEX relies upon correcting each protein's mass spectrometry sampling depth (observed peptide count) by learned probabilities for identifying the peptides. APEX abundances agree with measurements from controls, western blotting, flow cytometry and two-dimensional gels, as well as known correlations with mRNA abundances and codon bias, providing absolute protein concentrations across approximately three to four orders of magnitude. Using APEX, we demonstrate that 73% of the variance in yeast protein abundance (47% in E. coli) is explained by mRNA abundance, with the number of proteins per mRNA log-normally distributed about ∼5,600 (∼540 in E. coli) protein molecules/mRNA. Therefore, levels of both eukaryotic and prokaryotic proteins are set per mRNA molecule and independently of overall protein concentration, with >70% of yeast gene expression regulation occurring through mRNA-directed mechanisms.
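The "variance explained" figure above can be illustrated with a small sketch. It assumes the statistic is a squared Pearson correlation between log-transformed mRNA and protein abundances, which the abstract does not state explicitly. The function name `variance_explained` and all input values are hypothetical and are not data from the paper.

```python
import math


def variance_explained(mrna, protein):
    """Squared Pearson correlation (R^2) between log10 mRNA and log10 protein.

    Assumption: abundances are strictly positive and compared on a log scale.
    """
    x = [math.log10(v) for v in mrna]
    y = [math.log10(v) for v in protein]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Made-up abundances (molecules per cell), for illustration only.
mrna_counts = [1.0, 10.0, 100.0]
# Exactly proportional protein levels: R^2 is 1 up to rounding.
proportional = [5600.0 * m for m in mrna_counts]
# Protein levels that deviate from proportionality: R^2 drops below 1.
noisy = [5600.0, 9000.0, 900000.0]

r2_proportional = variance_explained(mrna_counts, proportional)
r2_noisy = variance_explained(mrna_counts, noisy)
```

With a fixed protein-per-mRNA ratio the log-log relationship is a straight line, so all variance in protein abundance is attributed to mRNA. Any gene-specific deviation in that ratio lowers the value.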


Molecular BioSystems | 2009

Global signatures of protein and mRNA expression levels

Raquel de Sousa Abreu; Luiz O. F. Penalva; Edward M. Marcotte; Christine Vogel

Cellular states are determined by differential expression of the cell's proteins. The relationship between protein and mRNA expression levels provides information about the combined outcomes of translation and protein degradation, which are, in addition to transcription and mRNA stability, essential contributors to gene expression regulation. This review summarizes the state of knowledge about large-scale measurements of absolute protein and mRNA expression levels, and the degree of correlation between the two parameters. We summarize the information that can be derived from comparison of protein and mRNA expression levels and discuss how corresponding sequence characteristics suggest modes of regulation.


Molecular Systems Biology | 2010

Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line

Christine Vogel; Raquel de Sousa Abreu; Daijin Ko; Shu Yun Le; Bruce A. Shapiro; Suzanne C. Burns; Devraj Sandhu; Daniel R. Boutz; Edward M. Marcotte; Luiz O. F. Penalva

Transcription, mRNA decay, translation and protein degradation are essential processes during eukaryotic gene expression, but their relative global contributions to steady‐state protein concentrations in multi‐cellular eukaryotes are largely unknown. Using measurements of absolute protein and mRNA abundances in cellular lysate from the human Daoy medulloblastoma cell line, we quantitatively evaluate the impact of mRNA concentration and sequence features implicated in translation and protein degradation on protein expression. Sequence features related to translation and protein degradation have an impact similar to that of mRNA abundance, and their combined contribution explains two‐thirds of protein abundance variation. mRNA sequence lengths, amino‐acid properties, upstream open reading frames and secondary structures in the 5′ untranslated region (UTR) were the strongest individual correlates of protein concentrations. In a combined model, characteristics of the coding region and the 3′UTR explained a larger proportion of protein abundance variation than characteristics of the 5′UTR. The absolute protein and mRNA concentration measurements for >1000 human genes described here represent one of the largest datasets currently available, and reveal both general trends and specific examples of post‐transcriptional regulation.


Nucleic Acids Research | 2009

SUPERFAMILY—sophisticated comparative genomics, data mining, visualization and phylogeny

Derek Wilson; Ralph Pethica; Yiduo Zhou; Charles Talbot; Christine Vogel; Cyrus Chothia; Julian Gough

SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes; and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site.


Nucleic Acids Research | 2007

The SUPERFAMILY database in 2007: families and functions

Derek Wilson; Christine Vogel; Cyrus Chothia; Julian Gough

The SUPERFAMILY database provides protein domain assignments, at the SCOP ‘superfamily’ level, for the predicted protein sequences in over 400 completed genomes. A superfamily groups together domains of different families which have a common evolutionary ancestor based on structural, functional and sequence data. SUPERFAMILY domain assignments are generated using an expert curated set of profile hidden Markov models. All models and structural assignments are available for browsing and download from . The web interface includes services such as domain architectures and alignment details for all protein assignments, searchable domain combinations, domain occurrence network visualization, detection of over- or under-represented superfamilies for a given genome by comparison with other genomes, assignment of manually submitted sequences and keyword searches. In this update we describe the SUPERFAMILY database and outline two major developments: (i) incorporation of family level assignments and (ii) a superfamily-level functional annotation. The SUPERFAMILY database can be used for general protein evolution and superfamily-specific studies, genomic annotation, and structural genomics target suggestion and assessment.


BMC Bioinformatics | 2008

The APEX Quantitative Proteomics Tool: Generating Protein Quantitation Estimates from LC-MS/MS Proteomics Results

John C. Braisted; Srilatha Kuntumalla; Christine Vogel; Edward M. Marcotte; Alan R. Rodrigues; Rong Wang; Shih Ting Huang; Erik S. Ferlanti; Alexander I. Saeed; Robert D. Fleischmann; Scott N. Peterson; Rembert Pieper

Background: Mass spectrometry (MS) based label-free protein quantitation has mainly focused on analysis of ion peak heights and peptide spectral counts. Most analyses of tandem mass spectrometry (MS/MS) data begin with an enzymatic digestion of a complex protein mixture to generate smaller peptides that can be separated and identified by an MS/MS instrument. Peptide spectral counting techniques attempt to quantify protein abundance by counting the number of detected tryptic peptides and their corresponding MS spectra. However, spectral counting is confounded by the fact that peptide physicochemical properties severely affect MS detection, resulting in each peptide having a different detection probability. Lu et al. (2007) described a modified spectral counting technique, Absolute Protein Expression (APEX), which improves on basic spectral counting methods by including a correction factor for each protein (called the Oi value) that accounts for variable peptide detection by MS techniques. The technique uses machine learning classification to derive peptide detection probabilities that are used to predict the number of tryptic peptides expected to be detected for one molecule of a particular protein (Oi). This predicted spectral count is compared to the protein's observed MS total spectral count during APEX computation of protein abundances.

Results: The APEX Quantitative Proteomics Tool, introduced here, is a free, open-source Java application that supports the APEX protein quantitation technique. The APEX tool uses data from standard tandem mass spectrometry proteomics experiments and provides computational support for APEX protein abundance quantitation through a set of graphical user interfaces that partition the parameter controls for the various processing tasks. The tool also provides a Z-score analysis for identification of significant differential protein expression, a utility to assess APEX classifier performance via cross validation, and a utility to merge multiple APEX results into a standardized format in preparation for further statistical analysis.

Conclusion: The APEX Quantitative Proteomics Tool provides a simple means to quickly derive hundreds to thousands of protein abundance values from standard liquid chromatography-tandem mass spectrometry proteomics datasets. The APEX tool provides a straightforward, intuitive interface design overlaying a highly customizable computational workflow to produce protein abundance values from LC-MS/MS datasets.
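The Z-score analysis mentioned above can be sketched as a pooled two-proportion test on spectral counts. This is one common form of such a test and is an assumption here; the abstract does not give the formula the tool uses. The function name `spectral_count_z` and the counts are hypothetical.

```python
import math


def spectral_count_z(count_1, total_1, count_2, total_2):
    """Pooled two-proportion z-score for one protein across two samples.

    count_i: spectral counts for the protein in sample i
    total_i: total spectral counts observed in sample i
    Assumption: this pooled form stands in for the tool's Z-score analysis.
    """
    f1 = count_1 / total_1
    f2 = count_2 / total_2
    # Pooled fraction under the null hypothesis of no differential expression.
    f0 = (count_1 + count_2) / (total_1 + total_2)
    standard_error = math.sqrt(f0 * (1 - f0) / total_1 + f0 * (1 - f0) / total_2)
    if standard_error == 0:
        return 0.0
    return (f1 - f2) / standard_error


# Made-up counts: 30 of 1000 spectra in sample 1 versus 10 of 1000 in sample 2.
z_changed = spectral_count_z(30, 1000, 10, 1000)
# Identical fractions give a z-score of zero.
z_unchanged = spectral_count_z(20, 1000, 20, 1000)
```

A large absolute z-score suggests the protein's share of the spectra differs between the two samples by more than sampling noise alone would explain.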


Development | 2003

The immunoglobulin superfamily in Drosophila melanogaster and Caenorhabditis elegans and the evolution of complexity.

Christine Vogel; Sarah A. Teichmann; Cyrus Chothia

Drosophila melanogaster is an arthropod with a much more complex anatomy and physiology than the nematode Caenorhabditis elegans. We investigated one of the protein superfamilies in the two organisms that plays a major role in development and function of cell-cell communication: the immunoglobulin superfamily (IgSF). Using hidden Markov models, we identified 142 IgSF proteins in Drosophila and 80 in C. elegans. Of these, 58 and 22, respectively, have been previously identified by experiments. On the basis of homology and the structural characterisation of the proteins, we can suggest probable types of function for most of the novel proteins. Though overall Drosophila has fewer genes than C. elegans, it has many more IgSF cell-surface and secreted proteins. Half the IgSF proteins in C. elegans and three quarters of those in Drosophila have evolved subsequent to the divergence of the two organisms. These results suggest that the expansion of this protein superfamily is one of the factors that have contributed to the formation of the more complex physiological features that are found in Drosophila.


Proteomics | 2010

Protein abundances are more conserved than mRNA abundances across diverse taxa

Jon M. Laurent; Christine Vogel; Taejoon Kwon; Stephanie A. Craig; Daniel R. Boutz; Holly K. Huse; Kazunari Nozue; Harkamal Walia; Marvin Whiteley; Pamela C. Ronald; Edward M. Marcotte

Proteins play major roles in most biological processes; as a consequence, protein expression levels are highly regulated. While extensive post‐transcriptional, translational and protein degradation control clearly influence protein concentration and functionality, it is often thought that protein abundances are primarily determined by the abundances of the corresponding mRNAs. Hence surprisingly, a recent study showed that abundances of orthologous nematode and fly proteins correlate better than their corresponding mRNA abundances. We tested if this phenomenon is general by collecting and testing matching large‐scale protein and mRNA expression data sets from seven different species: two bacteria, yeast, nematode, fly, human, and rice. We find that steady‐state abundances of proteins show significantly higher correlation across these diverse phylogenetic taxa than the abundances of their corresponding mRNAs (p=0.0008, paired Wilcoxon). These data support the presence of strong selective pressure to maintain protein abundances during evolution, even when mRNA abundances diverge.


Nature Protocols | 2008

Calculating absolute and relative protein abundance from mass spectrometry-based protein expression data

Christine Vogel; Edward M. Marcotte

Mass spectrometry (MS)-based shotgun proteomics allows protein identifications even in complex biological samples. Protein abundances can then be estimated from the counts of tandem MS (MS/MS) spectra attributable to each protein, provided one accounts for differential MS detectability of contributing peptides. We developed a method, APEX, which calculates Absolute Protein EXpression levels based upon learned correction factors, MS/MS spectral counts and each protein's probability of correct identification. This protocol describes APEX-based calculations in three parts. (i) Using training data, peptide sequences and their sequence properties, a model is built to estimate MS detectability (Oi) for any given protein. (ii) Absolute protein abundances are calculated from spectral counts, identification probabilities and the learned Oi-values. (iii) Simple statistics allow calculation of differential expression in two distinct biological samples, i.e., measuring relative protein abundances. APEX-based protein abundances span 3–4 orders of magnitude and are applicable to mixtures of 100s to 1,000s of proteins.
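Step (ii) above can be sketched in a few lines. The sketch assumes the abundance of each protein is its spectral count, weighted by identification probability and divided by its Oi value, then normalized over all proteins and scaled by an estimate of the total number of protein molecules. That reading follows the abstract's description, but the exact normalization is an assumption, and the function name `apex_abundances` and the inputs are hypothetical.

```python
def apex_abundances(proteins, total_molecules):
    """Estimate absolute abundances from spectral counts.

    proteins: dict mapping protein name -> (n, p, o) where
        n = observed MS/MS spectral count,
        p = probability that the protein identification is correct,
        o = learned Oi value (expected count per molecule, must be > 0).
    total_molecules: assumed total protein molecules per cell (scaling constant).
    """
    weighted = {}
    for name, (n, p, o) in proteins.items():
        if o <= 0:
            raise ValueError(f"Oi value for {name} must be positive")
        # Correct the observed count for identification confidence and detectability.
        weighted[name] = n * p / o
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no spectral counts to normalize")
    return {name: total_molecules * w / total for name, w in weighted.items()}


# Made-up inputs for illustration only.
example = {
    "protA": (40, 1.0, 4.0),  # many spectra, but highly detectable
    "protB": (30, 1.0, 1.0),
    "protC": (20, 0.5, 1.0),  # uncertain identification halves its weight
}
abundances = apex_abundances(example, total_molecules=1000.0)
```

In this example protA has the highest raw spectral count, yet its estimated abundance matches protC's, because its high Oi value means each molecule is expected to yield more detectable peptides.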

Collaboration


Dive into Christine Vogel's collaborations.

Top Co-Authors

Edward M. Marcotte

University of Texas at Austin

Luiz O. F. Penalva

University of Texas Health Science Center at San Antonio

Sarah A. Teichmann

Wellcome Trust Sanger Institute

Hyungwon Choi

National University of Singapore

Cyrus Chothia

Laboratory of Molecular Biology

Taejoon Kwon

University of Texas at Austin

Raquel de Sousa Abreu

University of Texas Health Science Center at San Antonio
