Publication


Featured research published by Gloria M. Sheynkman.


Molecular & Cellular Proteomics | 2013

Discovery and Mass Spectrometric Analysis of Novel Splice-junction Peptides Using RNA-Seq

Gloria M. Sheynkman; Michael R. Shortreed; Brian L. Frey; Lloyd M. Smith

Human proteomic databases required for MS peptide identification are frequently updated and carefully curated, yet are still incomplete because it has been challenging to acquire every protein sequence from the diverse assemblage of proteoforms expressed in every tissue and cell type. In particular, alternative splicing has been shown to be a major source of this cell-specific proteomic variation. Many new alternative splice forms have been detected at the transcript level using next-generation sequencing methods, especially RNA-Seq, but it is not known how many of these transcripts are being translated. Leveraging the unprecedented capabilities of next-generation sequencing methods, we collected RNA-Seq and proteomics data from the same cell population (Jurkat cells) and created a bioinformatics pipeline that builds customized databases for the discovery of novel splice-junction peptides. Eighty million paired-end Illumina reads and ∼500,000 tandem mass spectra were used to identify 12,873 transcripts (19,320 including isoforms) and 6810 proteins. We developed a bioinformatics workflow to retrieve high-confidence, novel splice junction sequences from the RNA data, translate these sequences into the analogous polypeptide sequence, and create a customized splice junction database for MS searching. Based on the RefSeq gene models, we detected 136,123 annotated and 144,818 unannotated transcript junctions. Of those, 24,834 unannotated junctions passed various quality filters (e.g. minimum read depth) and these entries were translated into 33,589 polypeptide sequences and used for database searching. We discovered 57 splice junction peptides not present in the UniProt-TrEMBL proteomic database comprising an array of different splicing events, including skipped exons, alternative donors and acceptors, and noncanonical transcriptional start sites. To our knowledge this is the first example of using sample-specific RNA-Seq data to create a splice-junction database and discover new peptides resulting from alternative splicing.
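The translation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `translate_junction`, the choice to translate in all three forward frames, and the rule of discarding frames that contain a stop codon are assumptions made for illustration. Only the standard genetic code is taken as given.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_junction(upstream_exon, downstream_exon):
    """Translate a junction-spanning sequence in the three forward frames.

    Hypothetical helper: returns {frame: peptide} for frames that contain
    no stop codon and no ambiguous base. A real pipeline would apply
    further filters (for example read depth), which are not modeled here.
    """
    seq = (upstream_exon + downstream_exon).upper()
    peptides = {}
    for frame in range(3):
        # Take complete codons only, starting at the frame offset.
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        peptide = "".join(CODON_TABLE.get(codon, "X") for codon in codons)
        if peptide and "*" not in peptide and "X" not in peptide:
            peptides[frame] = peptide
    return peptides


if __name__ == "__main__":
    # Toy sequences, not real exon data.
    print(translate_junction("ATGGCC", "AAGTGA"))
```

In this toy input, frame 0 ends in a stop codon and is dropped, while frames 1 and 2 each yield a candidate polypeptide.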


Journal of Proteome Research | 2014

Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences

Gloria M. Sheynkman; Michael R. Shortreed; Brian L. Frey; Mark Scalf; Lloyd M. Smith

Each individual carries thousands of nonsynonymous single nucleotide variants (nsSNVs) in their genome, each corresponding to a single amino acid polymorphism (SAP) in the encoded proteins. It is important to be able to directly detect and quantify these variations at the protein level to study post-transcriptional regulation, differential allelic expression, and other important biological processes. However, such variant peptides are not generally detected in standard proteomic analyses due to their absence from the generic databases that are employed for mass spectrometry searching. Here we extend previous work that demonstrated the use of customized SAP databases constructed from sample-matched RNA-Seq data. We collected deep-coverage RNA-Seq data from the Jurkat cell line, compiled the set of nsSNVs that are expressed, used this information to construct a customized SAP database, and searched it against deep-coverage shotgun MS data obtained from the same sample. This approach enabled the detection of 421 SAP peptides mapping to 395 nsSNVs. We compared these peptides to peptides identified from a large generic search database containing all known nsSNVs (dbSNP) and found that more than 70% of the SAP peptides from this dbSNP-derived search were not supported by the RNA-Seq data and thus are likely false positives. Next, we increased the SAP coverage from the RNA-Seq derived database by utilizing multiple protease digestions, thereby increasing variant detection to 695 SAP peptides mapping to 504 nsSNV sites. These detected SAP peptides corresponded to moderate to high abundance transcripts (30+ transcripts per million, TPM). The SAP peptides included 192 allelic pairs; the relative expression levels of the two alleles were evaluated for 51 of those pairs and were found to be comparable in all cases.
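As a rough illustration of how a customized SAP database entry could be derived, the sketch below applies a single amino acid substitution to a protein sequence and returns the tryptic peptide that carries the variant. The function names, the toy protein, and the simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages) are assumptions for this example and are not taken from the paper.

```python
def tryptic_peptides(protein):
    """In silico tryptic digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def sap_peptide(protein, position, ref, alt):
    """Return the tryptic peptide containing a substitution.

    position is 1-based. The variant protein is digested, so a substitution
    that creates or removes a cleavage site is handled by the digest itself.
    """
    if protein[position - 1] != ref:
        raise ValueError("reference residue does not match the protein")
    variant = protein[:position - 1] + alt + protein[position:]
    offset = 0
    for peptide in tryptic_peptides(variant):
        if offset < position <= offset + len(peptide):
            return peptide
        offset += len(peptide)
    raise ValueError("position lies outside the protein")


if __name__ == "__main__":
    # Toy protein with a hypothetical S6N substitution.
    print(sap_peptide("MAGKLSDEVRTPK", 6, "S", "N"))
```

A database built this way would contain only variants supported by the sample's own RNA-Seq data, which is the contrast the abstract draws with a generic dbSNP-derived database.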


BMC Genomics | 2014

Using Galaxy-P to leverage RNA-Seq for the discovery of novel protein variations

Gloria M. Sheynkman; James E. Johnson; Pratik Jagtap; Michael R. Shortreed; Getiria Onsongo; Brian L. Frey; Timothy J. Griffin; Lloyd M. Smith

Background: Current practice in mass spectrometry (MS)-based proteomics is to identify peptides by comparison of experimental mass spectra with theoretical mass spectra derived from a reference protein database; however, this strategy necessarily fails to detect peptide and protein sequences that are absent from the database. We and others have recently shown that customized proteomic databases derived from RNA-Seq data can be employed for MS-searching to both improve MS analysis and identify novel peptides. While this general strategy constitutes a significant advance for the discovery of novel protein variations, it has not been readily transferable to other laboratories due to the need for many specialized software tools. To address this problem, we have implemented readily accessible, modifiable, and extensible workflows within Galaxy-P, short for Galaxy for Proteomics, a web-based bioinformatic extension of the Galaxy framework for the analysis of multi-omics (e.g. genomics, transcriptomics, proteomics) data.

Results: We present three bioinformatic workflows that allow the user to upload raw RNA sequencing reads and convert the data into high-quality customized proteomic databases suitable for MS searching. We show the utility of these workflows on human and mouse samples, identifying 544 peptides containing single amino acid polymorphisms (SAPs) and 187 peptides corresponding to unannotated splice junction peptides, correlating protein and transcript expression levels, and providing the option to incorporate transcript abundance measures within the MS database search process (reduced databases, incorporation of transcript abundance for protein identification score calculations, etc.).

Conclusions: Using RNA-Seq data to enhance MS analysis is a promising strategy to discover novel peptides specific to a sample and, more generally, to improve proteomics results. The main bottleneck for widespread adoption of this strategy has been the lack of easily used and modifiable computational tools. We provide a solution to this problem by introducing a set of workflows within the Galaxy-P framework that converts raw RNA-Seq data into customized proteomic databases.


Journal of the American Society for Mass Spectrometry | 2012

Absolute Quantification of Prion Protein (90–231) Using Stable Isotope-Labeled Chymotryptic Peptide Standards in a LC-MRM AQUA Workflow

Robert M. Sturm; Gloria M. Sheynkman; Clarissa J. Booth; Lloyd M. Smith; Joel A. Pedersen; Lingjun Li

Substantial evidence indicates that the disease-associated conformer of the prion protein (PrPTSE) constitutes the etiologic agent in prion diseases. These diseases affect multiple mammalian species. PrPTSE has the ability to convert the conformation of the normal prion protein (PrPC) into a β-sheet rich form resistant to proteinase K digestion. Common immunological techniques lack the sensitivity to detect PrPTSE at subfemtomole levels, whereas animal bioassays, cell culture, and in vitro conversion assays offer higher sensitivity but lack the high-throughput the immunological assays offer. Mass spectrometry is an attractive alternative to the above assays as it offers high-throughput, direct measurement of a protein’s signature peptide, often with subfemtomole sensitivities. Although a liquid chromatography-multiple reaction monitoring (LC-MRM) method has been reported for PrPTSE, the chemical composition and lack of amino acid sequence conservation of the signature peptide may compromise its accuracy and make it difficult to apply to multiple species. Here, we demonstrate that an alternative protease (chymotrypsin) can produce signature peptides suitable for a LC-MRM absolute quantification (AQUA) experiment. The new method offers several advantages, including: (1) a chymotryptic signature peptide lacking chemically active residues (Cys, Met) that can confound assay accuracy; (2) low attomole limits of detection and quantitation (LOD and LOQ); and (3) a signature peptide retaining the same amino acid sequence across most mammals naturally susceptible to prion infection as well as important laboratory models. To the authors’ knowledge, this is the first report on the use of a non-tryptic peptide in a LC-MRM AQUA workflow.
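The arithmetic behind an AQUA-style measurement can be sketched in a few lines: the endogenous (light) peptide amount is estimated from the ratio of light to heavy peak areas multiplied by the known amount of spiked stable isotope-labeled standard. The function name, units, and example numbers below are illustrative assumptions, not values from the paper, and the sketch assumes equal response for the light and heavy forms.

```python
def aqua_amount(light_area, heavy_area, heavy_spiked_amol):
    """Estimate the endogenous peptide amount (amol) from MRM peak areas.

    Assumes the light peptide and the heavy-labeled standard ionize and
    fragment identically, so their peak-area ratio equals their molar ratio.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area * heavy_spiked_amol


if __name__ == "__main__":
    # Hypothetical peak areas with 50 amol of heavy standard spiked in.
    print(aqua_amount(2.0e5, 4.0e5, 50.0))
```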


Reviews in Analytical Chemistry | 2016

Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

Gloria M. Sheynkman; Michael R. Shortreed; Anthony J. Cesnik; Lloyd M. Smith

Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.


Trends in Biochemical Sciences | 2017

Proteome-Scale Human Interactomics

Katja Luck; Gloria M. Sheynkman; Ivy Zhang; Marc Vidal

Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life.


Journal of Proteome Research | 2015

Global Identification of Protein Post-translational Modifications in a Single-Pass Database Search

Michael R. Shortreed; Craig D. Wenger; Brian L. Frey; Gloria M. Sheynkman; Mark Scalf; Mark P. Keller; Alan D. Attie; Lloyd M. Smith

Bottom-up proteomics database search algorithms used for peptide identification cannot comprehensively identify post-translational modifications (PTMs) in a single-pass because of high false discovery rates (FDRs). A new approach to database searching enables global PTM (G-PTM) identification by exclusively looking for curated PTMs, thereby avoiding the FDR penalty experienced during conventional variable modification searches. We identified over 2200 unique, high-confidence modified peptides comprising 26 different PTM types in a single-pass database search.


Bioinformatics | 2014

sapFinder: an R/Bioconductor package for detection of variant peptides in shotgun proteomics experiments

Bo Wen; Shaohang Xu; Gloria M. Sheynkman; Qiang Feng; Liang Lin; Q. Wang; Xun Xu; Jun Wang; Siqi Liu

Single nucleotide variations (SNVs) located within a reading frame can result in single amino acid polymorphisms (SAPs), leading to alteration of the corresponding amino acid sequence as well as function of a protein. Accurate detection of SAPs is an important issue in proteomic analysis at the experimental and bioinformatic level. Herein, we present sapFinder, an R software package, for detection of variant peptides from tandem mass spectrometry (MS/MS)-based proteomics data. This package automates the construction of variation-associated databases from public SNV repositories or sample-specific next-generation sequencing (NGS) data and the identification of SAPs through database searching, post-processing and generation of an HTML-based report with a visualized interface.

Availability and implementation: sapFinder is implemented as a Bioconductor package in R. The package and the vignette can be downloaded at http://bioconductor.org/packages/devel/bioc/html/sapFinder.html and are provided under a GPL-2 license.


Journal of Proteome Research | 2016

Human Proteomic Variation Revealed by Combining RNA-Seq Proteogenomics and Global Post-Translational Modification (G-PTM) Search Strategy

Anthony J. Cesnik; Michael R. Shortreed; Gloria M. Sheynkman; Brian L. Frey; Lloyd M. Smith

Mass-spectrometry-based proteomic analysis underestimates proteomic variation due to the absence of variant peptides and posttranslational modifications (PTMs) from standard protein databases. Each individual carries thousands of missense mutations that lead to single amino acid variants, but these are missed because they are absent from generic proteomic search databases. Myriad types of protein PTMs play essential roles in biological processes but remain undetected because of increased false discovery rates in variable modification searches. We address these two fundamental shortcomings of bottom-up proteomics with two recently developed software tools. The first consists of workflows in Galaxy that mine RNA sequencing data to generate sample-specific databases containing variant peptides and products of alternative splicing events. The second tool applies a new strategy that alters the variable modification approach to consider only curated PTMs at specific positions, thereby avoiding the combinatorial explosion that traditionally leads to high false discovery rates. Using RNA-sequencing-derived databases with this Global Post-Translational Modification (G-PTM) search strategy revealed hundreds of single amino acid variant peptides, tens of novel splice junction peptides, and several hundred posttranslationally modified peptides in each of ten human cell lines.
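The combinatorial explosion mentioned in this abstract can be made concrete with a small counting sketch. It compares how many modified forms of one peptide a search would have to consider when every eligible residue may carry a modification, versus when only curated sites are allowed. This is a counting illustration under simplifying assumptions (each site is either unmodified or carries one of its candidate modifications), not the G-PTM search algorithm itself, and all names and example values are hypothetical.

```python
from math import prod


def count_forms(options_per_site):
    """Count peptide forms when each site is unmodified or takes one of n mods."""
    return prod(1 + n for n in options_per_site)


def variable_search_forms(peptide, mods_by_residue):
    """Forms considered when every eligible residue may be modified."""
    return count_forms(len(mods_by_residue.get(aa, ())) for aa in peptide)


def curated_search_forms(curated_sites):
    """Forms considered when only curated positions may be modified.

    curated_sites maps a 1-based position to its list of curated mods.
    """
    return count_forms(len(mods) for mods in curated_sites.values())


if __name__ == "__main__":
    peptide = "STSKYTK"  # toy peptide
    mods = {"S": ["phospho"], "T": ["phospho"], "Y": ["phospho"], "K": ["acetyl"]}
    curated = {3: ["phospho"]}  # hypothetical curated annotation
    print(variable_search_forms(peptide, mods), curated_search_forms(curated))
```

For this seven-residue toy peptide the unrestricted count is 2 to the power 7, while the curated count is 2, which shows why restricting the search to annotated sites keeps the candidate space small.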


Genome Biology | 2018

Full-length mRNA sequencing uncovers a widespread coupling between transcription initiation and mRNA processing

Seyed Yahya Anvar; Guy Allard; Elizabeth Tseng; Gloria M. Sheynkman; Eleonora de Klerk; Martijn Vermaat; Raymund H. Yin; Hans E. Johansson; Yavuz Ariyurek; Johan T. den Dunnen; Stephen Turner; Peter A. C. 't Hoen

Background: The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at transcriptional and post-transcriptional level. Here, we studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing.

Results: In MCF-7 breast cancer cells, we find 2700 genes with interdependent alternative transcription initiation, splicing and polyadenylation events, both in proximal and distant parts of mRNA molecules, including examples of coupling between transcription start sites and polyadenylation sites. The analysis of three human primary tissues (brain, heart and liver) reveals similar patterns of interdependency between transcription initiation and mRNA processing events. We predict thousands of novel open reading frames from full-length mRNA sequences and obtained evidence for their translation by shotgun proteomics. The mapping database rescues 358 previously unassigned peptides and improves the assignment of others. By recognizing sample-specific amino-acid changes and novel splicing patterns, full-length mRNA sequencing improves proteogenomics analysis of MCF-7 cells.

Conclusions: Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provide a basis to reveal largely unresolved mechanisms that coordinate transcription initiation and mRNA processing.

Collaboration

Top Co-Authors

Lloyd M. Smith, University of Wisconsin-Madison
Michael R. Shortreed, University of Wisconsin-Madison
Brian L. Frey, University of Wisconsin-Madison
Alan D. Attie, University of Wisconsin-Madison
Anthony J. Cesnik, University of Wisconsin-Madison
Mark P. Keller, University of Wisconsin-Madison
Mark Scalf, University of Wisconsin-Madison
Eleonora de Klerk, Leiden University Medical Center
Guy Allard, Leiden University Medical Center
Johan T. den Dunnen, Leiden University Medical Center