Publication


Featured research published by Guangyong Zheng.


Bioinformatics | 2008

ITFP: an integrated platform of mammalian transcription factors

Guangyong Zheng; Kang Tu; Qing Yang; Yun Xiong; Chaochun Wei; Lu Xie; Yangyong Zhu; Yixue Li

Investigation of transcription factors (TFs) and their downstream regulated genes (targets) is a significant issue in the post-genome era, as it can provide a new view of vital biological processes. However, information on TFs and their targets in mammals is far from sufficient. Here, we developed an integrated TF platform (ITFP), which includes abundant mammalian TFs and their targets. In the current release, ITFP includes 4105 putative TFs and 69 496 potential TF-target pairs for human, 3134 putative TFs and 37 040 potential TF-target pairs for mouse, and 1114 putative TFs and 18 055 potential TF-target pairs for rat. In short, ITFP will serve as an important resource for the transcription research community and provide strong support for regulatory network studies.


Molecular & Cellular Proteomics | 2009

Concurrent quantification of proteome and phosphoproteome to reveal system-wide association of protein phosphorylation and gene expression.

Yujian Wu; Jianwu Dai; Xing-Lin Yang; Su-Jun Li; Shi-Lin Zhao; Quanhu Sheng; Jia-shu Tang; Guangyong Zheng; Yongming Li; Wu; Rong Zeng

Reversible phosphorylation of proteins is an important process modulating cellular activities from upstream, which mainly involves sequential phosphorylation of signaling molecules, to downstream where phosphorylation of transcription factors regulates gene expression. In this study, we combined quantitative labeling with multidimensional liquid chromatography-mass spectrometry to monitor the proteome and phosphoproteome changes in the initial period of adipocyte differentiation. The phosphorylation level of a specific protein may be regulated by a kinase or phosphatase without involvement of gene expression or as a phenomenon that accompanies the alteration of its gene expression. Concurrent quantification of phosphopeptides and non-phosphorylated peptides makes it possible to differentiate cellular phosphorylation changes at these two levels. Furthermore, on the system level, certain proteins were predicted as the targeted gene products regulated by identified transcription factors. Among them, several proteins showed significant expression changes along with the phosphorylation alteration of their transcription factors. This is to date the first work to concurrently quantify proteome and phosphoproteome changes during the initial period of adipocyte differentiation, providing an approach to reveal the system-wide association of protein phosphorylation and gene expression.


BMC Bioinformatics | 2008

The combination approach of SVM and ECOC for powerful identification and classification of transcription factor

Guangyong Zheng; Ziliang Qian; Qing Yang; Chaochun Wei; Lu Xie; Yangyong Zhu; Yixue Li

Background: Transcription factors (TFs) are core functional proteins which play important roles in gene expression control, and they are key factors for gene regulation network construction. Traditionally, they were identified and classified through experimental approaches. In order to save time and reduce costs, many computational methods have been developed to identify TFs from new proteins and to classify the resulting TFs. Though these methods have facilitated screening of TFs to some extent, low accuracy is still a common problem. With the fast-growing number of new proteins, more precise algorithms for identifying TFs from new proteins and classifying the resulting TFs are in high demand.

Results: The support vector machine (SVM) algorithm was utilized to construct an automatic detector for TF identification, where protein domains and functional sites were employed as feature vectors. The error-correcting output coding (ECOC) algorithm, which originated in the information and communication engineering fields, was combined with the SVM methodology for TF classification. The overall success rates of identification and classification reached 88.22% and 97.83% respectively. Finally, a web site was constructed to let users access our tools (see Availability and requirements section for URL).

Conclusion: The SVM method was a valid and stable means of TF identification with protein domains and functional sites as feature vectors. The ECOC algorithm is a powerful method for multi-class classification problems. When combined with the SVM method, it can remarkably increase the accuracy of TF classification using protein domains and functional sites as feature vectors. In addition, our work implied that the ECOC algorithm may succeed in a broad range of applications in biological data mining.
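As a rough illustration of the ECOC idea described above, the sketch below decodes the outputs of several binary classifiers into one class label by choosing the nearest codeword. It is not the authors' implementation: the four class names, the 7-bit code matrix, and the function names are all hypothetical, and in practice each bit would come from a trained binary SVM rather than being typed in by hand.

```python
# Hypothetical ECOC code matrix: one 7-bit codeword per class.
# Each column corresponds to one binary classifier (e.g. one SVM).
CODE_MATRIX = {
    "class_a": (1, 1, 1, 1, 1, 1, 1),
    "class_b": (0, 0, 0, 0, 1, 1, 1),
    "class_c": (0, 0, 1, 1, 0, 0, 1),
    "class_d": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Count the positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODE_MATRIX, key=lambda label: hamming(CODE_MATRIX[label], bits))


# The sixth binary classifier is wrong here, yet decoding still recovers class_c.
noisy_prediction = (0, 0, 1, 1, 0, 1, 1)
predicted_class = decode(noisy_prediction)
```

Every pair of codewords in this matrix differs in four positions, so a single wrong binary classifier still decodes to the intended class.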


BMC Genomics | 2015

MOST+: A de novo motif finding approach combining genomic sequence and heterogeneous genome-wide signatures

Yizhe Zhang; Yupeng He; Guangyong Zheng; Chaochun Wei

Background: Motifs are regulatory elements that will activate or inhibit the expression of related genes when proteins (such as transcription factors, TFs) bind to them. Therefore, motif finding is important to understand the mechanisms of gene regulation. De novo discovery of regulatory elements, like transcription factor binding sites (TFBSs), has long been a major challenge in gaining insight into mechanisms of gene regulation. Recent advances in experimental profiling of genome-wide signals such as histone modifications and DNase I hypersensitivity sites allow scientists to develop better computational methods to enhance motif discovery. However, existing methods for motif finding suffer from high false positive rates and slow speed, and it is difficult to evaluate the performance of these methods systematically.

Results: Here we present MOST+, a motif finder integrating genomic sequences and genome-wide signals such as intensity and shape features from histone modification marks and DNase I hypersensitivity sites, to improve the prediction accuracy. MOST+ can detect motifs from a large input sequence of about 100 Mbs within a few minutes. A systematic comparison method has been established and MOST+ has been compared with existing methods.

Conclusion: MOST+ is a fast and accurate de novo method for motif finding by integrating genomic sequence and experimental signals as clues.


BMC Bioinformatics | 2011

iGepros: an integrated gene and protein annotation server for biological nature exploration

Guangyong Zheng; Haibo Wang; Chaochun Wei; Yixue Li

Background: In the post-genomic era, transcriptomics and proteomics provide important information to understand the genomes. With the fast development of high-throughput technology, transcriptomics and proteomics data are being generated at an unprecedented rate. Therefore, a need arises for software to annotate those omics data and explore their biological nature. In the past decade, some pioneering works were presented to address this issue, but limitations still exist. For example, some of these tools offer a command line only, which is not suitable for users with little or no experience in programming. Besides, some tools don’t support large-scale gene and protein analysis.

Results: To overcome these limitations, an integrated gene and protein annotation server named iGepros has been developed. The server provides user-friendly interfaces and detailed on-line examples, so most researchers, even those with little or no programming experience, can use it smoothly. Moreover, the server provides many functionalities to compare transcriptomics and proteomics data. In particular, the server is constructed under a model-view-control framework, which makes it easy to incorporate more functions into the server in the future.

Conclusions: In this paper, we present a server with powerful capability not only for gene and protein functional annotation, but also for transcriptomics and proteomics data comparison. Researchers can survey the biological characteristics behind gene and protein datasets and accelerate their investigation of transcriptome and proteome by applying the server. The server is publicly available at http://www.biosino.org/iGepros/.


BMC Genomics | 2012

CTF: a CRF-based transcription factor binding sites finding system

Yupeng He; Yizhe Zhang; Guangyong Zheng; Chaochun Wei

Background: Identifying the location of transcription factor bindings is crucial to understand transcriptional regulation. Currently, Chromatin Immunoprecipitation followed by high-throughput Sequencing (ChIP-seq) is able to locate the transcription factor binding sites (TFBSs) accurately in high throughput and it has become the gold-standard method for finding TFBSs experimentally. However, due to its high cost, it is impractical to apply the method on a very large scale. Considering the large number of transcription factors, numerous cell types and various conditions, computational methods are still very valuable for accurate TFBS identification.

Results: In this paper, we proposed a novel integrated TFBS prediction system, CTF, based on Conditional Random Fields (CRFs). Integrating information from different sources, CTF was able to capture patterns of TFBSs contained in different features (sequence, chromatin, etc.) and predicted the TFBS locations with high accuracy. We compared CTF with several existing tools as well as the PWM baseline method on a dataset generated by ChIP-seq experiments (TFBSs of 13 transcription factors in the mouse genome). Results showed that CTF performed significantly better than the existing methods tested.

Conclusions: CTF is a powerful tool to predict TFBSs by integrating high-throughput data and different features. It can be a useful complement to ChIP-seq and other experimental methods for TFBS identification and thus improve our ability to investigate functional elements in the post-genomic era.

Availability: CTF is freely available to academic users at: http://cbb.sjtu.edu.cn/~ccwei/pub/software/CTF/CTF.php
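For readers unfamiliar with the PWM baseline mentioned above, the sketch below shows one common way a position weight matrix can score candidate binding sites. It does not reproduce CTF or the baseline used in the paper: the count matrix, the uniform background, the pseudocount, and all function names are assumptions made up for illustration.

```python
import math

# Hypothetical base counts for a 4-bp motif (one dict per motif position).
COUNTS = [
    {"A": 1, "C": 1, "G": 1, "T": 9},
    {"A": 1, "C": 1, "G": 9, "T": 1},
    {"A": 9, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 9, "G": 1, "T": 1},
]
BACKGROUND = 0.25   # assumed uniform background frequency per base
PSEUDOCOUNT = 1.0   # assumed pseudocount to avoid log of zero


def log_odds_matrix(counts):
    """Turn raw counts into log2 odds scores against the background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + PSEUDOCOUNT * len(column)
        matrix.append({
            base: math.log2((n + PSEUDOCOUNT) / total / BACKGROUND)
            for base, n in column.items()
        })
    return matrix


PWM = log_odds_matrix(COUNTS)


def score_window(window):
    """Sum the per-position log-odds scores of one candidate site."""
    return sum(PWM[i][base] for i, base in enumerate(window))


def best_site(sequence):
    """Slide the matrix along the sequence; return (position, score) of the top window."""
    width = len(PWM)
    scored = [
        (score_window(sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, position = max(scored)
    return position, score


position, score = best_site("AATGACGG")
```

A scan like this uses sequence alone, which is the kind of single-feature baseline that an integrated method combining sequence with chromatin features is compared against.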


Scientific Reports | 2015

Revealing Missing Human Protein Isoforms Based on Ab Initio Prediction, RNA-seq and Proteomics

Zhiqiang Hu; Hamish S. Scott; Guangrong Qin; Guangyong Zheng; Xixia Chu; Lu Xie; David L. Adelson; Bergithe E. Oftedal; Parvathy Venugopal; Milena Babic; Christopher N. Hahn; Bing Zhang; Xiaojing Wang; Nan Li; Chaochun Wei

Biological and biomedical research relies on comprehensive understanding of protein-coding transcripts. However, the total number of human proteins is still unknown due to the prevalence of alternative splicing. In this paper, we detected 31,566 novel transcripts with coding potential by filtering our ab initio predictions with 50 RNA-seq datasets from diverse tissues/cell lines. PCR followed by MiSeq sequencing showed that at least 84.1% of these predicted novel splice sites could be validated. In contrast to known transcripts, the expression of these novel transcripts was highly tissue-specific. Based on these novel transcripts, at least 36 novel proteins were detected from shotgun proteomics data of 41 breast samples. We also showed that L1 retrotransposons have a more significant impact on the origin of new transcripts/genes than previously thought. Furthermore, we found that alternative splicing is extraordinarily widespread for genes involved in specific biological functions like protein binding, nucleoside binding, neuron projection, membrane organization and cell adhesion. In the end, the total number of human transcripts with protein-coding potential was estimated to be at least 204,950.


BMC Genomics | 2012

Towards biological characters of interactions between transcription factors and their DNA targets in mammals.

Guangyong Zheng; Qi Liu; Guohui Ding; Chaochun Wei; Yixue Li

Background: In the post-genomic era, the study of transcriptional regulation is pivotal to decoding genetic information. Transcription factors (TFs) are central proteins for transcriptional regulation, and interactions between TFs and their DNA targets (TFBSs) are important for the expression of downstream genes. However, the lack of knowledge about interactions between TFs and TFBSs still hinders investigation of the mechanism of transcription.

Results: To expand the knowledge about interactions between TFs and TFBSs, three biological features (sequence feature, structure feature, and evolution feature) were utilized to build TFBS identification models for studying binding preference between TFs and their DNA targets in mammals. Results show that each feature performs fairly well in capturing TFBSs, and the hybrid model combining all three features is more robust for TFBS identification. Subsequently, correspondence between TFs and their TFBSs was investigated to explore interactions among them in mammals. Results indicate that TFs and TFBSs are reciprocal at the sequence, structure, and evolution levels.

Conclusions: Our work demonstrates that, to some extent, TFs and TFBSs have developed a coevolutionary relationship in order to keep their physical binding and maintain their regulatory functions. In summary, our work will help to understand transcriptional regulation and interpret the binding mechanism between proteins and DNA.


IEEE Intelligent Systems | 2009

A Collaborative Multiagent System for Mining Transcriptional Regulatory Elements

Yun Xiong; Guangyong Zheng; Qing Yang; Yangyong Zhu

Identification of transcriptional regulatory elements offers a key means of insight into regulation mechanisms. However, the number of known regulatory elements is inadequate and state-of-the-art identification methods are inaccurate. Moreover, it is difficult for a biologist to select interdependent tools, and existing systems ignore overall performance issues. Agent technology can provide solutions through its information integration and coordination capabilities. TREMAgent is the first multiagent-based system for mining transcriptional regulatory elements. It uses novel algorithms combined with biological domain knowledge (for example, protein functional site information) to achieve superior accuracy and collaborate with existing tools using agent technology. The autonomous problem-solving capability of agents enables the system to provide the appropriate workflow rather than having users select interdependent tools. Experiments on the real data sets show that TREMAgent can provide superior accuracy and flexible services, promising excellent potential for bioinformatics.


bioRxiv | 2014

Revealing missing isoforms encoded in the human genome by integrating genomic, transcriptomic and proteomic data

Zhiqiang Hu; Hamish S. Scott; Guangrong Qin; Guangyong Zheng; Xixia Chu; Lu Xie; David L. Adelson; Bergithe E. Oftedal; Parvathy Venugopal; Milena Babic; Christopher N. Hahn; Bing Zhang; Xiaojing Wang; Nan Li; Chaochun Wei

Biological and biomedical research relies on comprehensive understanding of protein-coding transcripts. However, the total number of human proteins is still unknown due to the prevalence of alternative splicing and is much larger than the number of human genes. In this paper, we detected 31,566 novel transcripts with coding potential by filtering our ab initio predictions with 50 RNA-seq datasets from diverse tissues/cell lines. PCR followed by MiSeq sequencing showed that at least 84.1% of these predicted novel splice sites could be validated. In contrast to known transcripts, the expression of these novel transcripts was highly tissue-specific. Based on these novel transcripts, at least 36 novel proteins were detected from shotgun proteomics data of 41 breast samples. We also showed that L1 retrotransposons have a more significant impact on the origin of new transcripts/genes than previously thought. Furthermore, we found that alternative splicing is extraordinarily widespread for genes involved in specific biological functions like protein binding, nucleoside binding, neuron projection, membrane organization and cell adhesion. In the end, the total number of human transcripts with protein-coding potential was estimated to be at least 204,950.

Collaboration


Dive into Guangyong Zheng's collaborations.

Top Co-Authors

Chaochun Wei

Shanghai Jiao Tong University

Yixue Li

Chinese Academy of Sciences

Lu Xie

Chinese Academy of Sciences

Rong Zeng

Chinese Academy of Sciences

Jia-shu Tang

Chinese Academy of Sciences

Nan Li

Second Military Medical University

Shi-Lin Zhao

Chinese Academy of Sciences
