Chuan-Yih Yu
Indiana University Bloomington
Publication
Featured research published by Chuan-Yih Yu.
Analytical Chemistry | 2014
Anoop Mayampurath; Chuan-Yih Yu; Ehwang Song; Jagadheshwar Balan; Yehia Mechref; Haixu Tang
Glycosylation is an important protein modification that involves enzymatic attachment of sugars to amino acid residues. Understanding the structure of these sugars and the effects of glycosylation is vital for developing indicators of disease development and progression. Although computational methods based on mass spectrometric data have proven effective in monitoring changes in the glycome, developing such methods for the glycoproteome is challenging, largely because of the inherent complexity of studying glycan structures together with their corresponding glycosylation sites. This paper introduces a computational framework for identifying intact N-linked glycopeptides, i.e., glycopeptides with N-linked glycans attached to their glycosylation sites, in complex proteome samples. Scoring algorithms are presented for tandem mass spectra of glycopeptides resulting from collision-induced dissociation (CID), higher-energy C-trap dissociation (HCD), and electron transfer dissociation (ETD) fragmentation modes. An empirical false-discovery rate estimation method, based on a target-decoy search approach, is derived for assigning confidence. The power of the method is further enhanced when multiple data sets are pooled to increase identification confidence. Using this framework, 103 highly confident N-linked glycopeptides from 53 sites across 33 glycoproteins were identified in complex human serum proteome samples using conventional proteomic platforms with standard depletion of the seven most abundant proteins. These results indicate that the method is ready to be used for characterizing site-specific protein glycosylation in complex samples.
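The target-decoy idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's scoring or estimation procedure: the function names, the simple decoys-over-targets estimator, and the example scores are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Each match is (score, is_decoy). Scores and labels here are hypothetical.
Match = Tuple[float, bool]


def fdr_at_threshold(matches: List[Match], threshold: float) -> float:
    """Estimate FDR as (#decoy hits) / (#target hits) among matches scoring >= threshold."""
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


def threshold_for_fdr(matches: List[Match], max_fdr: float) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is at most max_fdr, or None."""
    best = None
    for candidate in sorted({score for score, _ in matches}, reverse=True):
        if fdr_at_threshold(matches, candidate) <= max_fdr:
            best = candidate
    return best


# Hypothetical search results: five target hits and two decoy hits.
example_matches = [
    (9.0, False), (8.0, False), (7.5, True), (7.0, False),
    (6.0, False), (5.0, True), (4.0, False),
]
```

Under these assumptions, lowering the score cutoff admits more target identifications at the cost of a higher estimated error rate, which is the trade-off a confidence threshold is meant to control.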
Bioinformatics | 2013
Chuan-Yih Yu; Anoop Mayampurath; Yunli Hu; Shiyue Zhou; Yehia Mechref; Haixu Tang
As a common post-translational modification, protein glycosylation plays an important role in many biological processes, and it is known to be associated with human diseases. Mass spectrometry (MS)-based glycomic profiling techniques have been developed to measure the abundances of glycans in complex biological samples and applied to the discovery of putative glycan biomarkers. To automate the annotation of glycomic profiles in the liquid chromatography-MS (LC-MS) data, we present here a user-friendly software tool, MultiGlycan, implemented in C# on Windows systems. We tested MultiGlycan by using several glycomic profiling datasets acquired using LC-MS under different preparations and show that MultiGlycan executes fast and generates robust and reliable results. Availability: MultiGlycan can be freely downloaded at http://darwin.informatics.indiana.edu/MultiGlycan/. Supplementary information: Supplementary data are available at Bioinformatics online.
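As a rough illustration of what annotating a glycomic profile involves, the sketch below matches an observed m/z value against theoretical glycan composition masses within a ppm tolerance. It is not MultiGlycan's algorithm or interface (the tool itself is written in C#). The function names, the candidate list, the assumption of native (underivatized) glycans with protonated adducts, and the 10 ppm default are all illustrative choices; only the monoisotopic residue masses are standard values.

```python
from typing import Dict, List

# Monoisotopic masses (Da) of monosaccharide residues in native glycans.
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "dHex": 146.057909,
    "NeuAc": 291.095417,
}
WATER = 18.010565   # added once for the free reducing end
PROTON = 1.007276   # charge carrier, assuming [M+nH]n+ ions

# Hypothetical candidate list; a real tool would load a much larger one.
CANDIDATES: Dict[str, Dict[str, int]] = {
    "HexNAc2Hex5": {"HexNAc": 2, "Hex": 5},
    "HexNAc4Hex5NeuAc2": {"HexNAc": 4, "Hex": 5, "NeuAc": 2},
}


def neutral_mass(composition: Dict[str, int]) -> float:
    """Neutral monoisotopic mass of a glycan with the given residue counts."""
    return sum(RESIDUE_MASS[residue] * count for residue, count in composition.items()) + WATER


def theoretical_mz(composition: Dict[str, int], charge: int) -> float:
    """m/z of the protonated ion at the given positive charge state."""
    return (neutral_mass(composition) + charge * PROTON) / charge


def annotate(observed_mz: float, charge: int,
             candidates: Dict[str, Dict[str, int]], tol_ppm: float = 10.0) -> List[str]:
    """Return names of candidate compositions within tol_ppm of the observed m/z."""
    hits = []
    for name, composition in candidates.items():
        expected = theoretical_mz(composition, charge)
        if abs(observed_mz - expected) / expected * 1e6 <= tol_ppm:
            hits.append(name)
    return hits
```

For example, `annotate(618.2240, 2, CANDIDATES)` matches the doubly protonated HexNAc2Hex5 composition under these assumptions. A production tool would also need to handle other adducts, derivatization, isotope envelopes, and elution profiles.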
Journal of Proteome Research | 2014
Ehwang Song; Anoop Mayampurath; Chuan-Yih Yu; Haixu Tang; Yehia Mechref
Prostate specific antigen (PSA) is currently used as a biomarker to diagnose prostate cancer. PSA testing has been widely used to detect and screen for prostate cancer. However, in the diagnostic gray zone, the PSA test does not clearly distinguish between benign prostate hypertrophy and prostate cancer due to their overlap. To develop more specific and sensitive candidate biomarkers for prostate cancer, an in-depth understanding of the biochemical characteristics of PSA (such as glycosylation) is needed. PSA has a single glycosylation site at Asn69, with glycans constituting approximately 8% of the protein by weight. Here, we report the comprehensive identification and quantitation of N-glycans from two PSA isoforms using LC–MS/MS. There were 56 N-glycans associated with PSA, whereas 57 N-glycans were observed in the case of the PSA-high isoelectric point (pI) isoform (PSAH). Three sulfated/phosphorylated glycopeptides were detected, the identification of which was supported by tandem MS data. One of these sulfated/phosphorylated N-glycans, HexNAc5Hex4dHex1s/p1, was identified in both PSA and PSAH at relative intensities of 0.52 and 0.28%, respectively. Quantitative variations between these two isoforms were also monitored. Because we were one of the laboratories participating in the 2012 ABRF Glycoprotein Research Group (gPRG) study, those results were compared to the ones presented in this study. Our qualitative and quantitative results summarized here were comparable to those summarized in the interlaboratory study.
Rapid Communications in Mass Spectrometry | 2015
Yunli Hu; Shiyue Zhou; Chuan-Yih Yu; Haixu Tang; Yehia Mechref
Rationale: Liquid chromatography/mass spectrometry (LC/MS) is currently considered a conventional glycomics analysis strategy because of its high sensitivity and ability to handle complex biological samples. Interpretation of LC/MS data is a major bottleneck in high-throughput LC/MS-based glycomics analysis. The complexity of LC/MS data associated with biological samples prompts the need to develop computational tools capable of facilitating automated data annotation and quantitation. Methods: An LC/MS-based automated data annotation and quantitation software, MultiGlycan-ESI, was developed and utilized for glycan quantitation. Data generated by the software from LC/MS analysis of permethylated N-glycans derived from fetuin were initially validated by manual integration to assess the performance of the software. The performance of MultiGlycan-ESI was then assessed for the quantitation of permethylated fetuin N-glycans analyzed at different concentrations or spiked with permethylated N-glycans derived from human blood serum (HBS). Results: The relative abundance differences between data generated by the software and those generated by manual integration were less than 5%, indicating the reliability of MultiGlycan-ESI in quantitation of permethylated glycans analyzed by LC/MS. Automated quantitation resulted in a linear relationship for all six N-glycans derived from 50 ng to 400 ng fetuin, with correlation coefficients (R²) greater than 0.93. Spiking of permethylated fetuin N-glycans at different concentrations into permethylated N-glycan samples derived from 0.02 μL of HBS also exhibited linear agreement, with R² values greater than 0.9. Conclusions: With a variety of options, including mass accuracy, merged adducts, and filtering criteria, MultiGlycan-ESI allows automated annotation and quantitation of LC/ESI-MS N-glycan data. The software is reliable for automated glycan quantitation, thus facilitating rapid and reliable high-throughput glycomics studies.
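The linearity check reported in this abstract (R² of measured abundance against injected amount) can be reproduced in form with an ordinary least-squares fit. The sketch below is a generic calculation, not the authors' procedure. The 50 ng and 400 ng endpoints come from the abstract, while the two intermediate amounts and all intensity values are made up for illustration.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Injected fetuin amounts in ng (endpoints from the abstract, middle points assumed).
amounts_ng = [50.0, 100.0, 200.0, 400.0]
# Hypothetical integrated peak areas for one glycan (arbitrary units).
peak_areas = [1.1e5, 2.0e5, 4.3e5, 7.9e5]

linearity = r_squared(amounts_ng, peak_areas)
```

A value of `linearity` close to 1 would indicate that the quantified abundance scales linearly with the amount injected, which is the property the abstract reports for all six fetuin N-glycans.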
Analytical Chemistry | 2016
Chuan-Yih Yu; Anoop Mayampurath; Rui Zhu; Lauren G. Zacharias; Ehwang Song; Lei Wang; Yehia Mechref; Haixu Tang
Mass spectrometry has become a routine experimental tool for proteomic biomarker analysis of human blood samples, partly due to the wide availability of informatics tools. As one of the most common protein post-translational modifications (PTMs) in mammals, protein glycosylation has been observed to be altered in multiple human diseases and thus may provide candidate markers of disease progression. While mass spectrometry instrumentation has seen advancements in capabilities, discovering glycosylation-related markers using existing software is currently not straightforward. Complete characterization of protein glycosylation requires the identification of intact glycopeptides in samples, including identification of the modification site as well as the structure of the attached glycans. In this paper, we present GlycoSeq, an open-source software tool that implements a heuristic iterated glycan sequencing algorithm coupled with prior knowledge for automated elucidation of the glycan structure within a glycopeptide from its collision-induced dissociation tandem mass spectrum. GlycoSeq employs rules of glycosidic linkage as defined by glycan synthetic pathways to eliminate improbable glycan structures and build reasonable glycan trees. We tested the tool on two sets of tandem mass spectra of N-linked glycopeptides from cell lines derived from breast cancer patients. After employing enzymatic specificity within the N-linked glycan synthetic pathway, the sequencing results of GlycoSeq were highly consistent with the manually curated glycan structures. Hence, GlycoSeq is ready to be used for the characterization of glycan structures in glycopeptides from MS/MS analysis. GlycoSeq is released as open source software at https://github.com/chpaul/GlycoSeq/.
Current Protocols in Protein Science | 2014
Haixu Tang; Anoop Mayampurath; Chuan-Yih Yu; Yehia Mechref
Glycomics aims to identify the whole set of functional glycans of glycoconjugates (attached to proteins or lipids) in biological samples. Glycoproteomics aims to characterize the complete structure of all glycoproteins in biological samples, including the glycosylation sites of proteins and the various glycan structures attached to each of these sites. Mass spectrometry (MS) and microarray are high‐throughput technologies that are commonly used in glycomics and glycoproteomics, which often result in the generation of large experimental datasets. Bioinformatics approaches play an essential role in automated analysis and interpretation of such data. This unit describes and discusses the computational tools currently available for these analyses, and their glycomics and glycoproteomics applications. Curr. Protoc. Protein Sci. 76:2.15.1‐2.15.7.
Journal of Proteome Research | 2015
Yanlin Zhang; Chuan-Yih Yu; Ehwang Song; Shuaicheng Li; Yehia Mechref; Haixu Tang; Xiaowen Liu
Glycosylation is one of the most common post-translational modifications in proteins, existing in ~50% of mammalian proteins. Several research groups have demonstrated that mass spectrometry is an efficient technique for glycopeptide identification; however, this problem is still challenging because of the enormous diversity of glycan structures and the microheterogeneity of glycans. In addition, a glycopeptide may contain multiple glycosylation sites, making the problem complex. Current software tools often fail to identify glycopeptides with multiple glycosylation sites, and hence we present GlycoMID, a graph-based spectral alignment algorithm that can identify glycopeptides with multiple hydroxylysine O-glycosylation sites by tandem mass spectra. GlycoMID was tested on mass spectrometry data sets of the bovine collagen α-(II) chain protein, and experimental results showed that it identified more glycopeptide-spectrum matches than other existing tools, including many glycopeptides with two glycosylation sites.
Methods in Molecular Biology | 2013
Chuan-Yih Yu; Anoop Mayampurath; Haixu Tang
We introduce three software tools, Cartoonist, GlycoWorkbench, and MultiGlycan, for N-glycan profiling of complex biological samples. Detailed instructions for using these tools are provided, and their performances are demonstrated by using real glycan profiling data.