Publication


Featured research published by Guoguang Zhao.


Nucleic Acids Research | 2011

Large-scale prediction of long non-coding RNA functions in a coding–non-coding gene co-expression network

Qi Liao; Changning Liu; Xiongying Yuan; Shuli Kang; Ruoyu Miao; Hui Xiao; Guoguang Zhao; Haitao Luo; Dechao Bu; Haitao Zhao; Geir Skogerbø; Zhongdao Wu; Yi Zhao

Although accumulating evidence has provided insight into the various functions of long non-coding RNAs (lncRNAs), the exact functions of the majority of such transcripts are still unknown. Here, we report the first computational annotation of lncRNA functions based on public microarray expression profiles. A coding–non-coding gene co-expression (CNC) network was constructed from re-annotated Affymetrix Mouse Genome Array data. Probable functions for a total of 340 lncRNAs were predicted based on topological or other network characteristics, such as module sharing, association with network hubs and combinations of co-expression and genomic adjacency. The functions annotated to the lncRNAs mainly involve organ or tissue development (e.g. neuron, eye and muscle development), cellular transport (e.g. neuronal transport and sodium ion, acid or lipid transport) or metabolic processes (e.g. involving macromolecules, phosphocreatine and tyrosine).
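As a rough illustration of what a gene co-expression network is, the sketch below links two genes when their expression profiles are strongly correlated. The use of Pearson correlation, the 0.9 threshold, and the gene names and values are all assumptions made for this example; the abstract does not state the measure or cutoff used by the authors.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold.

    The threshold is a hypothetical value chosen for illustration.
    """
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy data: one lncRNA and two coding genes measured in four samples.
profiles = {
    "lncRNA_1": [1.0, 2.0, 3.0, 4.0],
    "coding_A": [2.0, 4.0, 6.0, 8.0],
    "coding_B": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(profiles)
```

In this toy data only `lncRNA_1` and `coding_A` are linked, so a guilt-by-association approach would borrow a functional hint for the lncRNA from `coding_A`.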


Nucleic Acids Research | 2014

NONCODEv4: exploring the world of long non-coding RNA genes

Chaoyong Xie; Jiao Yuan; Hui Li; Ming Li; Guoguang Zhao; Dechao Bu; Weimin Zhu; Wei Wu; Runsheng Chen; Yi Zhao

NONCODE (http://www.bioinfo.org/noncode/) is an integrated knowledge database dedicated to non-coding RNAs (excluding tRNAs and rRNAs). Non-coding RNAs (ncRNAs) have been implicated in diseases and found to play important roles in various biological processes. Since NONCODE version 3.0 was released 2 years ago, the discovery of novel ncRNAs has been accelerated by high-throughput RNA sequencing (RNA-Seq). In this update of NONCODE, we expand the ncRNA data set by collecting newly identified ncRNAs from literature published in the last 2 years and integrating the latest versions of RefSeq and Ensembl. In particular, the number of long non-coding RNAs (lncRNAs) has increased sharply from 73 327 to 210 831. Because lncRNAs show alternative splicing patterns similar to those of mRNAs, the concept of lncRNA genes was put forward to support a systematic understanding of lncRNAs. In total, 56 018 and 46 475 lncRNA genes were generated from 95 135 and 67 628 lncRNAs for human and mouse, respectively. Additionally, we present expression profiles of lncRNA genes as graphs based on public RNA-seq data for human and mouse, and predict functions of these lncRNA genes. The improvements to the database also include an ID conversion tool from RefSeq or Ensembl ID to NONCODE ID and a service for lncRNA identification. NONCODE is also accessible through http://www.noncode.org/.


Nucleic Acids Research | 2012

NONCODE v3.0: integrative annotation of long noncoding RNAs

Dechao Bu; Kuntao Yu; Silong Sun; Chaoyong Xie; Geir Skogerbø; Ruoyu Miao; Hui Xiao; Qi Liao; Haitao Luo; Guoguang Zhao; Haitao Zhao; Zhiyong Liu; Changning Liu; Runsheng Chen; Yi-Pei Zhao

Facilitated by the rapid progress of high-throughput sequencing technology, a large number of long noncoding RNAs (lncRNAs) have been identified in mammalian transcriptomes over the past few years. LncRNAs have been shown to play key roles in various biological processes such as imprinting control, circuitry controlling pluripotency and differentiation, immune responses and chromosome dynamics. Notably, a growing number of lncRNAs have been implicated in disease etiology. With the increasing number of published lncRNA studies, the experimental data on lncRNAs (e.g. expression profiles, molecular features and biological functions) have accumulated rapidly. In order to enable a systematic compilation and integration of this information, we have updated the NONCODE database (http://www.noncode.org) to version 3.0 to include the first integrated collection of expression and functional lncRNA data obtained from re-annotated microarray studies in a single database. NONCODE has a user-friendly interface with a variety of search or browse options, a local Genome Browser for visualization and a BLAST server for sequence-alignment search. In addition, NONCODE provides a platform for the ongoing collation of ncRNAs reported in the literature. All data in NONCODE are open to users, and can be downloaded through the website or obtained through the SOAP API and DAS services.


Nucleic Acids Research | 2013

Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts

Liang Sun; Haitao Luo; Dechao Bu; Guoguang Zhao; Kuntao Yu; Changhai Zhang; Yuanning Liu; Runsheng Chen; Yi Zhao

It is a challenge to classify protein-coding or non-coding transcripts, especially those re-constructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, Coding-Non-Coding Index (CNCI), which profiles adjoining nucleotide triplets to effectively distinguish protein-coding and non-coding sequences independent of known annotations. CNCI is effective for classifying incomplete transcripts and sense–antisense pairs. The implementation of CNCI offered highly accurate cross-species classification of transcripts assembled from whole-transcriptome sequencing data, demonstrated gene evolutionary divergence between vertebrates and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci.
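To make "adjoining nucleotide triplets" concrete, the sketch below counts how often each pair of consecutive triplets occurs in one reading frame of a sequence. It only illustrates the kind of feature involved and is not the CNCI scoring method; the function name `adjoining_triplet_counts` and the example sequence are hypothetical.

```python
from collections import Counter


def adjoining_triplet_counts(seq, frame=0):
    """Count pairs of consecutive nucleotide triplets in one reading frame.

    Illustrative only: CNCI's actual scoring is not reproduced here.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    # Split into complete triplets, dropping any trailing partial triplet.
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Pair each triplet with the one immediately after it.
    return Counter(zip(triplets, triplets[1:]))


counts = adjoining_triplet_counts("ATGGCCATGGCC")
```

Here `counts` records the pair `("ATG", "GCC")` twice and `("GCC", "ATG")` once. A classifier could compare such pair frequencies between known coding and non-coding sequences.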


Nucleic Acids Research | 2013

Long non-coding RNAs function annotation: a global prediction method based on bi-colored networks

Xingli Guo; Lin Gao; Qi Liao; Hui Xiao; Xiaoke Ma; Xiaofei Yang; Haitao Luo; Guoguang Zhao; Dechao Bu; Fei Jiao; Qixiang Shao; Runsheng Chen; Yi Zhao

A growing body of evidence demonstrates that long non-coding RNAs (lncRNAs) play many key roles in diverse biological processes. There is a critical need to annotate the functions of the increasing number of available lncRNAs. In this article, we apply a global network-based strategy to tackle this issue for the first time. We develop a bi-colored network-based global function predictor, long non-coding RNA global function predictor (‘lnc-GFP’), to predict probable functions for lncRNAs at large scale by integrating gene expression data and protein interaction data. The performance of lnc-GFP is evaluated on protein-coding and lncRNA genes. Cross-validation tests on protein-coding genes with known function annotations indicate that our method can achieve a precision of up to 95% with a suitable parameter setting. Among the 1713 lncRNAs in the bi-colored network, the 1625 (94.9%) lncRNAs in the maximum connected component are all functionally characterized. For the lncRNAs expressed in mouse embryo stem cells and neuronal cells, the putative functions inferred by our method closely match those in the known literature.


Nucleic Acids Research | 2011

ncFANs: a web server for functional annotation of long non-coding RNAs

Qi Liao; Hui Xiao; Dechao Bu; Chaoyong Xie; Ruoyu Miao; Haitao Luo; Guoguang Zhao; Kuntao Yu; Haitao Zhao; Geir Skogerbø; Runsheng Chen; Zhongdao Wu; Changning Liu; Yi Zhao

Recent interest in the non-coding transcriptome has resulted in the identification of large numbers of long non-coding RNAs (lncRNAs) in mammalian genomes, most of which have not been functionally characterized. Computational exploration of the potential functions of these lncRNAs will therefore facilitate further work in this field of research. We have developed a practical and user-friendly web interface called ncFANs (non-coding RNA Function ANnotation server), which is the first web service for functional annotation of human and mouse lncRNAs. On the basis of the re-annotated Affymetrix microarray data, ncFANs provides two alternative strategies for lncRNA functional annotation: one utilizing three aspects of a coding-non-coding gene co-expression (CNC) network, the other identifying condition-related differentially expressed lncRNAs. ncFANs introduces a highly efficient way of re-using the abundant pre-existing microarray data. The present version of ncFANs includes re-annotated CDF files for 10 human and mouse Affymetrix microarrays, and the server will be continuously updated with more re-annotated microarray platforms and lncRNA data. ncFANs is freely accessible at http://www.ebiomed.org/ncFANs/ or http://www.noncode.org/ncFANs/.


Nucleic Acids Research | 2014

NPInter v2.0: an updated database of ncRNA interactions

Jiao Yuan; Wei Wu; Chaoyong Xie; Guoguang Zhao; Yi Zhao; Runsheng Chen

NPInter (http://www.bioinfo.org/NPInter) is a database that integrates experimentally verified functional interactions between noncoding RNAs (excluding tRNAs and rRNAs) and other biomolecules (proteins, RNAs and genomic DNAs). Extensive studies on ncRNA interactions have shown that ncRNAs could act as part of enzymatic or structural complexes, gene regulators or other functional elements. With the development of high-throughput biotechnology, such as cross-linking immunoprecipitation and high-throughput sequencing (CLIP-seq), the number of known ncRNA interactions, especially those formed by protein binding, has grown rapidly in recent years. In this work, we updated NPInter to version 2.0 by collecting ncRNA interactions from recent literature and related databases, expanding the number of entries to 201 107 covering 18 species. In addition, NPInter v2.0 incorporated a service for the BLAST alignment search as well as visualization of interactions.


Parasitology Research | 2014

Genome-wide identification and functional annotation of Plasmodium falciparum long noncoding RNAs from RNA-seq data

Qi Liao; Jia Shen; Jianfa Liu; Xi Sun; Guoguang Zhao; Yanzi Chang; Leiting Xu; Xuerong Li; Ya Zhao; Huanqin Zheng; Yi Zhao; Zhongdao Wu

The life cycle of Plasmodium falciparum is very complex, with an erythrocytic stage that involves the invasion of red blood cells and the survival and growth of the parasite within the host. Over the past several decades, numerous studies have shown that proteins exported by P. falciparum to the surface of infected red blood cells play a critical role in recognition and interaction with host receptors and are thus essential for the completion of the life cycle of P. falciparum. However, little is known about long noncoding RNAs (lncRNAs). In this study, we designed a computational pipeline to identify new lncRNAs of P. falciparum from published RNA-seq data and analyzed their sequences and expression features. As a result, 164 novel lncRNAs were found. The sequences and expression features of P. falciparum lncRNAs were similar to those of humans and mice: there was a lack of sequence conservation, low expression levels, a high coefficient of variance in expression, and co-expression with nearby coding sequences in the genome. Next, a coding/noncoding gene co-expression network for P. falciparum was constructed to further annotate the functions of novel and known lncRNAs. In total, the functions of 69 lncRNAs, including 44 novel lncRNAs, were annotated. The main functions of the lncRNAs included metabolic processes, biosynthetic processes, regulation of biological processes, establishment of localization, catabolic processes, cellular component organization, and interspecies interactions between organisms. Our results will provide clues to further the investigation of interactions between human hosts and parasites and the mechanisms of P. falciparum infection.


Science China-Life Sciences | 2013

Systematic study of human long intergenic non-coding RNAs and their impact on cancer

Liang Sun; Haitao Luo; Qi Liao; Dechao Bu; Guoguang Zhao; Changning Liu; YuanNing Liu; Yi Zhao

The functional impact of several long intergenic non-coding RNAs (lincRNAs) has been characterized in previous studies. However, it is difficult to identify lincRNAs on a large-scale and to ascertain their functions or predict their structures in laboratory experiments because of the diversity, lack of knowledge and specificity of expression of lincRNAs. Furthermore, although there are a few well-characterized examples of lincRNAs associated with cancers, these are just the tip of the iceberg owing to the complexity of cancer. Here, by combining RNA-Seq data from several kinds of human cell lines with chromatin-state maps and human expressed sequence tags, we successfully identified more than 3000 human lincRNAs, most of which were new ones. Subsequently, we predicted the functions of 105 lincRNAs based on a coding-non-coding gene co-expression network. Finally, we propose a genetic mediator and key regulator model to unveil the subtle relationships between lincRNAs and lung cancer. Twelve lincRNAs may be principal players in lung tumorigenesis. The present study combines large-scale identification and functional prediction of human lincRNAs, and is a pioneering work in characterizing cancer-associated lincRNAs by bioinformatics.


Protein & Cell | 2012

CloudLCA: finding the lowest common ancestor in metagenome analysis using cloud computing

Guoguang Zhao; Dechao Bu; Changning Liu; Jing Li; Jian Yang; Zhiyong Liu; Yi Zhao; Runsheng Chen

Estimating taxonomic content constitutes a key problem in metagenomic sequencing data analysis. However, extracting such content from high-throughput next-generation sequencing data is very time-consuming with the currently available software. Here, we present CloudLCA, a parallel LCA algorithm that significantly improves the efficiency of determining taxonomic composition in metagenomic data analysis. Results show that CloudLCA (1) has a running time nearly linear in dataset size, (2) displays linear speedup as the number of processors grows, especially for large datasets, and (3) reaches a speed of nearly 215 million reads per minute on a cluster with ten thin nodes. In comparison with MEGAN, a well-known metagenome analyzer, CloudLCA is up to 5 times faster, and its peak memory usage is approximately 18.5% that of MEGAN when running on a fat node. CloudLCA can be run on one multiprocessor node or a cluster. It is expected to become part of MEGAN to accelerate read analysis: it generates the same output as MEGAN, which can be imported directly into MEGAN to complete the subsequent analysis. Moreover, CloudLCA is a universal solution for finding the lowest common ancestor, and it can be applied in other fields requiring an LCA algorithm.
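For readers unfamiliar with the lowest common ancestor (LCA) step, the sketch below assigns a read to the deepest taxon shared by all of its database hits. It is a minimal serial version over a toy taxonomy, not the parallel CloudLCA implementation; the `parent` map, the function names and the simplified lineages are assumptions made for this example.

```python
def path_to_root(taxon, parent):
    """Return the lineage from a taxon up to the root, starting at the taxon."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Return the deepest node shared by the lineages of all given taxa."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("at least one taxon is required")
    first_path = path_to_root(taxa[0], parent)
    common = set(first_path)
    for taxon in taxa[1:]:
        common &= set(path_to_root(taxon, parent))
    # first_path is ordered from deepest to root, so the first hit is the LCA.
    for node in first_path:
        if node in common:
            return node
    raise ValueError("taxa do not share a common ancestor")


# Toy taxonomy (child -> parent) with intermediate ranks omitted.
parent = {
    "Escherichia coli": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "Bacteria",
    "Bacillus subtilis": "Bacillus",
    "Bacillus": "Bacteria",
}

# A read whose hits fall in two genera of the same family.
read_hits = ["Escherichia coli", "Salmonella enterica"]
assigned = lowest_common_ancestor(read_hits, parent)
```

With these hits the read is assigned to `Enterobacteriaceae`; adding a `Bacillus subtilis` hit would push the assignment up to `Bacteria`. Because each read is processed independently, this step lends itself to the kind of parallelization the abstract describes.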

Collaboration


Dive into Guoguang Zhao's collaborations.

Top Co-Authors

Yi Zhao

Chinese Academy of Sciences

Dechao Bu

Chinese Academy of Sciences

Runsheng Chen

Chinese Academy of Sciences

Haitao Luo

Chinese Academy of Sciences

Changning Liu

Chinese Academy of Sciences

Chaoyong Xie

Chinese Academy of Sciences

Hui Xiao

Chinese Academy of Sciences

Kuntao Yu

Chinese Academy of Sciences

Zhongdao Wu

Sun Yat-sen University
