
Publication


Featured research published by Zuo-Fei Yuan.


Journal of Proteome Research | 2010

pNovo: De novo Peptide Sequencing and Identification Using HCD Spectra

Hao Chi; Rui-Xiang Sun; Bing Yang; Chun-Qing Song; Le-Heng Wang; Chao Liu; Yan Fu; Zuo-Fei Yuan; Haipeng Wang; Simin He; Meng-Qiu Dong

De novo peptide sequencing has improved remarkably in the past decade as a result of better instruments and computational algorithms. However, de novo sequencing can correctly interpret only approximately 30% of high- and medium-quality spectra generated by collision-induced dissociation (CID), which is much less than database search. This is mainly due to incomplete fragmentation and overlap of different ion series in CID spectra. In this study, we show that higher-energy collisional dissociation (HCD) is of great help to de novo sequencing because it produces high mass accuracy tandem mass spectrometry (MS/MS) spectra without the low-mass cutoff associated with CID in ion trap instruments. Besides, abundant internal and immonium ions in the HCD spectra can help differentiate similar peptide sequences. Taking advantage of these characteristics, we developed an algorithm called pNovo for efficient de novo sequencing of peptides from HCD spectra. pNovo gave correct identifications to 80% or more of the HCD spectra identified by database search. The number of correct full-length peptides sequenced by pNovo is comparable with that obtained by database search. A distinct advantage of de novo sequencing is that deamidated peptides and peptides with amino acid mutations can be identified efficiently without extra cost in computation. In summary, implementation of the HCD characteristics makes pNovo an excellent tool for de novo peptide sequencing from HCD spectra.


Molecular & Cellular Proteomics | 2009

A Strategy for Precise and Large Scale Identification of Core Fucosylated Glycoproteins

Wei Jia; Zhuang Lu; Yan Fu; Haipeng Wang; Le-Heng Wang; Hao Chi; Zuo-Fei Yuan; Zhaobin Zheng; Lina Song; Huanhuan Han; YiMin Liang; Jinglan Wang; Yun Cai; Yukui Zhang; Yulin Deng; Wantao Ying; Simin He; Xiaohong Qian

Core fucosylation (CF) patterns of some glycoproteins are more sensitive and specific than evaluation of their total respective protein levels for diagnosis of many diseases, such as cancers. Global profiling and quantitative characterization of CF glycoproteins may reveal potent biomarkers for clinical applications. However, current techniques are unable to reveal CF glycoproteins precisely on a large scale. Here we developed a robust strategy that integrates molecular weight cutoff, neutral loss-dependent MS3, database-independent candidate spectrum filtering, and optimization to effectively identify CF glycoproteins. The rationale for spectrum treatment was innovatively based on computation of the mass distribution in spectra of CF glycopeptides. The efficacy of this strategy was demonstrated by applying it to plasma from healthy subjects and subjects with hepatocellular carcinoma. Over 100 CF glycoproteins and CF sites were identified, and over 10,000 mass spectra of CF glycopeptides were found. The scale of the identification results marks substantial progress toward finding promising biomarkers, and the candidate spectra will be a useful resource for the improvement of database searching methods for glycopeptides.


Bioinformatics | 2010

Open MS/MS spectral library search to identify unanticipated post-translational modifications and increase spectral identification rate

Ding Ye; Yan Fu; Rui-Xiang Sun; Haipeng Wang; Zuo-Fei Yuan; Hao Chi; Simin He

Motivation: Identification of post-translationally modified proteins has become one of the central issues of current proteomics. Spectral library search is a new and promising computational approach to mass spectrometry-based protein identification. However, its potential in identification of unanticipated post-translational modifications has rarely been explored. The existing spectral library search tools are designed to match the query spectrum to the reference library spectra with the same peptide mass. Thus, spectra of peptides with unanticipated modifications cannot be identified. Results: In this article, we present an open spectral library search tool, named pMatch. It extends the existing library search algorithms in at least three aspects to support the identification of unanticipated modifications. First, the spectra in the library are optimized with the full peptide sequence information to better tolerate the peptide fragmentation pattern variations caused by some modification(s). Second, a new scoring system is devised, which uses charge-dependent mass shifts for peak matching and combines a probability-based model with the general spectral dot-product for scoring. Third, a target-decoy strategy is used for false discovery rate control. To demonstrate the effectiveness of pMatch, a library search experiment was conducted on a public dataset with over 40 000 spectra in comparison with SpectraST, the most popular library search engine. Additional validations were done on four published datasets including over 150 000 spectra. The results showed that pMatch can effectively identify unanticipated modifications and significantly increase spectral identification rate. Availability: http://pfind.ict.ac.cn/pmatch/ Supplementary information: Supplementary data are available at Bioinformatics online.
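
Two of the generic building blocks named in this abstract, the spectral dot-product and target-decoy false discovery rate estimation, can be sketched as follows. This is a minimal illustration and not pMatch's scoring system: it omits the charge-dependent mass shifts and the probability-based model, and the function names, the fixed-width binning, and the 1.0 m/z bin width are all assumptions made for the example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin_width is an assumed value)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def spectral_dot_product(query, reference, bin_width=1.0):
    """Normalized dot product (cosine similarity) of two binned peak lists."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(q[b] * r[b] for b in q if b in r)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in r.values())
    )
    return dot / norm if norm else 0.0


def estimated_fdr(matches, threshold):
    """Target-decoy FDR estimate: decoy hits over target hits at or above threshold.

    `matches` is a list of (score, is_decoy) pairs.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0
```

Fixed-width binning is the simplest way to align peaks, but two peaks that straddle a bin edge will not be matched; a tolerance-based peak pairing would avoid that at the cost of more code.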


Molecular & Cellular Proteomics | 2015

EpiProfile Quantifies Histone Peptides With Modifications by Extracting Retention Time and Intensity in High-resolution Mass Spectra

Zuo-Fei Yuan; Shu Lin; Rosalynn C. Molden; Xing-Jun Cao; Natarajan V. Bhanu; Xiaoshi Wang; Simone Sidoli; Shichong Liu; Benjamin A. Garcia

Histone post-translational modifications contribute to chromatin function through their chemical properties which influence chromatin structure and their ability to recruit chromatin interacting proteins. Nanoflow liquid chromatography coupled with high resolution tandem mass spectrometry (nanoLC-MS/MS) has emerged as the most suitable technology for global histone modification analysis because of the high sensitivity and the high mass accuracy of this approach that provides confident identification. However, analysis of histones with this method is even more challenging because of the large number and variety of isobaric histone peptides and the high dynamic range of histone peptide abundances. Here, we introduce EpiProfile, a software tool that discriminates isobaric histone peptides using the distinguishing fragment ions in their tandem mass spectra and extracts the chromatographic area under the curve using previous knowledge about peptide retention time. The accuracy of EpiProfile was evaluated by analysis of mixtures containing different ratios of synthetic histone peptides. In addition to label-free quantification of histone peptides, EpiProfile is flexible and can quantify different types of isotopically labeled histone peptides. EpiProfile is unique in generating layouts (i.e. relative retention time) of histone peptides when compared with manual quantification of the data and other programs (such as Skyline), filling the need of an automatic and freely available tool to quantify labeled and non-labeled modified histone peptides. In summary, EpiProfile is a valuable nanoflow liquid chromatography coupled with high resolution tandem mass spectrometry-based quantification tool for histone peptides, which can also be adapted to analyze nonhistone protein samples.
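
The idea of extracting a chromatographic area under the curve inside a known retention-time window can be illustrated with a short sketch. It is not EpiProfile's implementation: the function name, the input layout (a list of retention time and intensity pairs), and the use of plain trapezoidal integration are assumptions made for the example.

```python
def xic_area(points, rt_start, rt_end):
    """Trapezoidal area of an extracted ion chromatogram within [rt_start, rt_end].

    `points` is a list of (retention_time, intensity) pairs; the window bounds
    stand in for prior knowledge about where the peptide elutes.
    """
    window = [(rt, inten) for rt, inten in sorted(points) if rt_start <= rt <= rt_end]
    area = 0.0
    # Sum the trapezoid between each pair of consecutive points in the window.
    for (rt0, i0), (rt1, i1) in zip(window, window[1:]):
        area += 0.5 * (i0 + i1) * (rt1 - rt0)
    return area
```

Restricting the integration to a window is what keeps signal from other peptides that share the same precursor m/z but elute elsewhere out of the reported area.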


Molecular & Cellular Proteomics | 2014

Stable-isotope-labeled Histone Peptide Library for Histone Post-translational Modification and Variant Quantification by Mass Spectrometry

Shu Lin; Samuel Wein; Michelle Gonzales-Cope; Gabriel L. Otte; Zuo-Fei Yuan; Leila Afjehi-Sadat; Tobias M. Maile; Shelley L. Berger; John Rush; Jennie R. Lill; David Arnott; Benjamin A. Garcia

To facilitate accurate histone variant and post-translational modification (PTM) quantification via mass spectrometry, we present a library of 93 synthetic peptides using Protein-Aqua™ technology. The library contains 55 peptides representing different modified forms from histone H3 peptides, 23 peptides representing H4 peptides, 5 peptides representing canonical H2A peptides, 8 peptides representing H2A.Z peptides, and peptides for both macroH2A and H2A.X. The PTMs on these peptides include lysine mono- (me1), di- (me2), and tri-methylation (me3); lysine acetylation; arginine me1; serine/threonine phosphorylation; and N-terminal acetylation. The library was subjected to chemical derivatization with propionic anhydride, a widely employed protocol for histone peptide quantification. Subsequently, the detection efficiencies were quantified using mass spectrometry extracted ion chromatograms. The library yields a wide spectrum of detection efficiencies, with more than 1700-fold difference between the peptides with the lowest and highest efficiencies. In this paper, we describe the impact of different modifications on peptide detection efficiencies and provide a resource to correct for detection biases among the 93 histone peptides. In brief, there is no correlation between detection efficiency and molecular weight, hydrophobicity, basicity, or modification type. The same types of modifications may have very different effects on detection efficiencies depending on their positions within a peptide. We also observed antagonistic effects between modifications. In a study of mouse trophoblast stem cells, we utilized the detection efficiencies of the peptide library to correct for histone PTM/variant quantification. For most histone peptides examined, the corrected data did not change the biological conclusions but did alter the relative abundance of these peptides. For a low-abundant histone H2A variant, macroH2A, the corrected data led to a different conclusion than the uncorrected data. The peptide library and detection efficiencies presented here may serve as a resource to facilitate studies in the epigenetics and proteomics fields.
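
One way a detection-efficiency correction could work is sketched below. The abstract does not give the formula, so this assumes the simplest form: divide each observed intensity by that peptide's relative detection efficiency, then renormalize to relative abundances. The function name and the dictionary inputs are hypothetical.

```python
def correct_relative_abundance(observed, efficiency):
    """Rescale observed intensities by per-peptide detection efficiency.

    Assumed model: corrected = observed / efficiency, then normalize so the
    corrected values sum to 1. Both arguments map peptide name to a number.
    """
    corrected = {pep: observed[pep] / efficiency[pep] for pep in observed}
    total = sum(corrected.values())
    if total == 0:
        return {pep: 0.0 for pep in corrected}
    return {pep: value / total for pep, value in corrected.items()}
```

With equal observed intensities, a peptide detected at half the efficiency of another ends up with twice its corrected share, which shows how a correction can shift relative abundances without changing which peptides are present.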


Rapid Communications in Mass Spectrometry | 2010

Speeding up tandem mass spectrometry based database searching by peptide and spectrum indexing

You Li; Hao Chi; Le-Heng Wang; Haipeng Wang; Yan Fu; Zuo-Fei Yuan; Su-Jun Li; Yan-Sheng Liu; Rui-Xiang Sun; Rong Zeng; Simin He

Database searching is the technique of choice for shotgun proteomics, and to date much research effort has been spent on improving its effectiveness. However, database searching faces a serious efficiency challenge, given the large numbers of mass spectra and the ever-faster growth of peptide databases resulting from genome translations, enzymatic digestions, and post-translational modifications. In this study, we conducted systematic research on speeding up database search engines for protein identification and illustrate the key points with the specific design of the pFind 2.1 search engine as a running example. First, by constructing peptide indexes, pFind achieves a two- to threefold speedup over searching without peptide indexes. Second, by constructing indexes for observed precursor and fragment ions, pFind achieves a further twofold speedup. As a result, pFind compares very favorably with predominant search engines such as Mascot, SEQUEST and X!Tandem.
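
The general idea behind a peptide index can be sketched as a mass-sorted list queried by binary search, so that candidates for a precursor mass are found without scanning every peptide. This is an illustration of the technique only and not pFind's index design: the class name, the in-memory layout, and the tolerance handling are assumptions.

```python
import bisect

# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


class PeptideIndex:
    """Peptides sorted by mass, built once and queried many times."""

    def __init__(self, peptides):
        self._entries = sorted((peptide_mass(p), p) for p in set(peptides))
        self._masses = [mass for mass, _ in self._entries]

    def query(self, mass, tolerance):
        """Return peptides whose mass lies within mass +/- tolerance."""
        lo = bisect.bisect_left(self._masses, mass - tolerance)
        hi = bisect.bisect_right(self._masses, mass + tolerance)
        return [pep for _, pep in self._entries[lo:hi]]
```

Each query costs two binary searches plus the size of the result, instead of a pass over the whole peptide list, which is where the speedup of an index of this kind comes from.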


Molecular & Cellular Proteomics | 2016

A Novel Quantitative Mass Spectrometry Platform for Determining Protein O-GlcNAcylation Dynamics

Xiaoshi Wang; Zuo-Fei Yuan; Jing Fan; Kelly R. Karch; Lauren E. Ball; John M. Denu; Benjamin A. Garcia

Over the past decades, protein O-GlcNAcylation has been found to play a fundamental role in cell cycle control, metabolism, transcriptional regulation, and cellular signaling. Nevertheless, quantitative approaches to determine in vivo GlcNAc dynamics at a large scale are still not readily available. Here, we have developed an approach to isotopically label O-GlcNAc modifications on proteins by producing 13C-labeled UDP-GlcNAc from 13C6-glucose via the hexosamine biosynthetic pathway. This metabolic labeling was combined with quantitative mass spectrometry-based proteomics to determine protein O-GlcNAcylation turnover rates. First, an efficient enrichment method for O-GlcNAc peptides was developed with the use of phenylboronic acid solid-phase extraction and anhydrous DMSO. The near-stoichiometric reaction between the diol of GlcNAc and boronic acid dramatically improved the enrichment efficiency. Additionally, our kinetic model for turnover rates integrates both metabolomic and proteomic data, which increases the accuracy of the turnover rate estimation. Other advantages of this metabolic labeling method include in vivo application, direct labeling of the O-GlcNAc sites and higher confidence for site identification. Concentrating only on nuclear-localized GlcNAc-modified proteins, we are able to identify 105 O-GlcNAc peptides on 42 proteins and determine turnover rates of 20 O-GlcNAc peptides from 14 proteins extracted from HeLa nuclei. In general, we found O-GlcNAcylation turnover rates are slower than those published for phosphorylation or acetylation. Nevertheless, the rates varied widely depending on both the protein and the residue modified. We believe this methodology can be broadly applied to reveal turnovers/dynamics of protein O-GlcNAcylation from different biological states and will provide more information on the significance of O-GlcNAcylation, enabling us to study the temporal dynamics of this critical modification for the first time.


Reviews in Analytical Chemistry | 2014

Mass Spectrometric Analysis of Histone Proteoforms

Zuo-Fei Yuan; Anna M. Arnaudo; Benjamin A. Garcia

Histones play important roles in chromatin, in the forms of various posttranslational modifications (PTMs) and sequence variants, which are called histone proteoforms. Investigating modifications and variants is an ongoing challenge. Previous methods are based on antibodies, and because they usually detect only one modification at a time, they are not suitable for studying the various combinations of modifications on histones. Fortunately, mass spectrometry (MS) has emerged as a high-throughput technology for histone analysis and does not require prior knowledge about any modifications. From the data generated by mass spectrometers, both identification and quantification of modifications, as well as variants, can be obtained easily. On the basis of this information, the functions of histones in various cellular contexts can be revealed. Therefore, MS continues to play an important role in the study of histone proteoforms. In this review, we discuss the analysis strategies of MS, their applications on histones, and some key remaining challenges.


Proteomics | 2012

pParse: A method for accurate determination of monoisotopic peaks in high-resolution mass spectra

Zuo-Fei Yuan; Chao Liu; Haipeng Wang; Rui-Xiang Sun; Yan Fu; Jingfen Zhang; Le-Heng Wang; Hao Chi; You Li; Li-Yun Xiu; Wenping Wang; Simin He

Determining the monoisotopic peak of a precursor is a first step in interpreting mass spectra, which is basic but non‐trivial. The reason is that in the isolation window of a precursor, other peaks interfere with the determination of the monoisotopic peak, leading to wrong mass‐to‐charge ratio or charge state. Here we propose a method, named pParse, to export the most probable monoisotopic peaks for precursors, including co‐eluted precursors. We use the relationship between the position of the highest peak and the mass of the first peak to detect candidate clusters. Then, we extract three features to sort the candidate clusters: (i) the sum of the intensity, (ii) the similarity of the experimental and the theoretical isotopic distribution, and (iii) the similarity of elution profiles. We showed that the recall of pParse, MaxQuant, and BioWorks was 98–98.8%, 0.5–17%, and 1.8–36.5% at the same precision, respectively. About 50% of tandem mass spectra are triggered by multiple precursors which are difficult to identify. Then we design a new scoring function to identify the co‐eluted precursors. About 26% of all identified peptides were exclusively from co‐eluted peptides. Therefore, accurately determining monoisotopic peaks, including co‐eluted precursors, can greatly increase peptide identification rate.
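
The arithmetic that links an isotopic cluster to a charge state and a neutral mass can be shown in a few lines. This sketch is not the pParse method, which ranks candidate clusters using intensity, isotopic-distribution similarity, and elution-profile similarity; it only illustrates why picking the wrong monoisotopic peak or charge gives a wrong mass. The function names are hypothetical, and the isotope spacing constant is the approximate 13C to 12C mass difference.

```python
PROTON_MASS = 1.007276      # daltons
ISOTOPE_SPACING = 1.00335   # approximate 13C - 12C mass difference, daltons


def infer_charge(cluster_mz):
    """Estimate charge from the mean m/z spacing of consecutive isotopic peaks."""
    if len(cluster_mz) < 2:
        raise ValueError("need at least two isotopic peaks")
    ordered = sorted(cluster_mz)
    spacings = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    # Isotopic peaks of a z-charged ion are about ISOTOPE_SPACING / z apart.
    return round(ISOTOPE_SPACING / mean_spacing)


def neutral_mass(monoisotopic_mz, charge):
    """Neutral monoisotopic mass from the monoisotopic m/z and the charge state."""
    return charge * (monoisotopic_mz - PROTON_MASS)
```

For a doubly charged precursor, choosing the second isotopic peak instead of the first shifts the computed neutral mass by about 1 Da, which is enough to miss the correct peptide under a narrow precursor tolerance.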


Journal of Proteomics | 2015

Reprint of “pFind–Alioth: A novel unrestricted database search algorithm to improve the interpretation of high-resolution MS/MS data”

Hao Chi; Kun He; Bing Yang; Zhen Chen; Rui-Xiang Sun; Sheng-Bo Fan; Kun Zhang; Chao Liu; Zuo-Fei Yuan; Q. Wang; Siqi Liu; Meng-Qiu Dong; Simin He

Database search is the dominant approach in high-throughput proteomic analysis. However, the interpretation rate of MS/MS spectra is very low in such a restricted mode, which is mainly due to unexpected modifications and irregular digestion types. In this study, we developed a new algorithm called Alioth, to be integrated into the search engine of pFind, for fast and accurate unrestricted database search on high-resolution MS/MS data. An ion index is constructed for both peptide precursors and fragment ions, by which arbitrary digestions and a single site of any modifications and mutations can be searched efficiently. A new re-ranking algorithm is used to distinguish the correct peptide-spectrum matches from random ones. The algorithm is tested on several HCD datasets and the interpretation rate of MS/MS spectra using Alioth is as high as 60%-80%. Peptides from semi- and non-specific digestions, as well as those with unexpected modifications or mutations, can be effectively identified using Alioth and confidently validated using other search engines. The average processing speed of Alioth is 5-10 times faster than some other unrestricted search engines and is comparable to or even faster than the restricted search algorithms tested.

Collaboration

Top Co-Authors

Hao Chi (Chinese Academy of Sciences)
Rui-Xiang Sun (Chinese Academy of Sciences)
Simin He (Chinese Academy of Sciences)
Le-Heng Wang (Chinese Academy of Sciences)
Yan Fu (Chinese Academy of Sciences)
Haipeng Wang (Chinese Academy of Sciences)
Chao Liu (Chinese Academy of Sciences)
You Li (Chinese Academy of Sciences)
Shichong Liu (University of Pennsylvania)