Publication


Featured research published by Maozu Guo.


Bioinformatics | 2015

HAlign: Fast Multiple Similar DNA/RNA Sequence Alignment Based on the Centre Star Strategy

Quan Zou; Qinghua Hu; Maozu Guo; Guohua Wang

MOTIVATION: Multiple sequence alignment (MSA) is an important task, but bottlenecks arise in the massive MSA of homologous DNA or genome sequences. Most of the available state-of-the-art software tools cannot handle large-scale datasets, or they run rather slowly. The similarity of homologous DNA sequences is often ignored. Lack of parallelization is still a challenge for MSA research. RESULTS: We developed two software tools to address the DNA MSA problem. The first employed trie trees to accelerate the centre star MSA strategy, reducing the expected time complexity from quadratic to linear. To address large-scale data, parallelism was applied using the Hadoop platform. Experiments demonstrated the performance of our proposed methods, including their running time, sum-of-pairs scores and scalability. Moreover, we supplied two massive DNA/RNA MSA datasets for further testing and research.
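As a rough illustration of the centre star idea mentioned in this abstract, the sketch below picks the centre sequence as the one with the smallest total edit distance to all the others. It is a plain quadratic-time version and does not include the trie-based acceleration or the Hadoop parallelism that the abstract describes. The function names `edit_distance` and `pick_centre` and the toy sequences are assumptions made for this example, not part of HAlign.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def pick_centre(seqs: list[str]) -> int:
    """Return the index of the sequence with the smallest summed distance to the rest."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return totals.index(min(totals))


# Toy input (made up for illustration): four similar DNA fragments.
seqs = ["ACGT", "ACGA", "ACCT", "TCGT"]
centre = pick_centre(seqs)
```

In a full centre star alignment, every other sequence would then be aligned pairwise against `seqs[centre]` and the pairwise alignments merged; that step is omitted here.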


Briefings in Functional Genomics | 2015

An overview of SNP interactions in genome-wide association studies

Pei Li; Maozu Guo; Chunyu Wang; Xiaoyan Liu; Quan Zou

With the recent explosion in high-throughput genotyping technology, the amount and quality of single-nucleotide polymorphism (SNP) data have increased exponentially. Therefore, the identification of SNP interactions that are associated with common diseases is playing an increasingly important role in interpreting the genetic basis of disease susceptibility and in devising new diagnostic tests and treatments. However, because these data sets are large yet typically have small sample sizes and low signal-to-noise ratios, there has been no major breakthrough despite many efforts, making this a major focus in the field of bioinformatics. In this article, we review the two main aspects of SNP interaction studies in recent years, namely the simulation and the identification of SNP interactions, and then discuss the principles, efficiency and differences between these methods.


Genomics | 2013

A novel insight into Gene Ontology semantic similarity

Yungang Xu; Maozu Guo; Wenli Shi; Xiaoyan Liu; Chunyu Wang

Existing methods for computing the semantic similarity between Gene Ontology (GO) terms are often based on external datasets and are therefore not intrinsic to GO. Furthermore, they not only fail to handle identical annotations but also show a strong bias toward well-annotated proteins when used to measure the similarity of proteins. Inspired by the concept of cellular differentiation and dedifferentiation in developmental biology, we propose a shortest semantic differentiation distance (SSDD) based on the concept of semantic totipotency to measure the semantic similarity of GO terms and further compare the functional similarity of proteins. Using human ratings and a benchmark dataset, SSDD was found to improve upon existing methods for computing the semantic similarity of GO terms. An in-depth analysis shows that SSDD is able to distinguish identical annotations and does not depend on annotation richness, thus producing more unbiased and reliable results. Online services can be accessed at the Gene Functional Similarity Analysis Tools website (GFSAT: http://nclab.hit.edu.cn/GFSAT).


Bioinformatics | 2014

Inferring the soybean (Glycine max) microRNA functional network based on target gene network

Yungang Xu; Maozu Guo; Xiaoyan Liu; Chunyu Wang; Yang Liu

MOTIVATION: The rapid accumulation of microRNAs (miRNAs) and experimental evidence for miRNA interactions has ushered in a new area of miRNA research that focuses more on networks than on individual miRNA interactions, which provides a systematic view of the whole microRNome. It is therefore a challenge to infer miRNA functional interactions on a system-wide level and to draw a miRNA functional network (miRFN). A few studies have focused on the well-studied human species; however, these methods can neither be extended to non-model organisms nor take fully into account the information embedded in miRNA-target and target-target interactions. Thus, it is important to develop appropriate methods for inferring the miRNA network of non-model species, such as soybean (Glycine max), without such extensive miRNA-phenotype association data as the miRNA-disease associations in human. RESULTS: Here we propose a new method to measure the functional similarity of miRNAs that considers both the site accessibility and the interactive context of target genes in functional gene networks. We further construct the miRFNs of soybean; this is the first study of soybean miRNAs on the network level, and the core methods can be easily extended to other species. We found that the miRFNs of soybean exhibit a scale-free, small-world and modular architecture, with their degree distributions best fitted by power-law and exponential distributions. We also showed that miRNAs with high degree tend to interact with those of low degree, which reveals the disassortativity and modularity of miRFNs. Our efforts in this study will be useful to further reveal the soybean miRNA-miRNA and miRNA-gene interactive mechanisms on a systematic level. AVAILABILITY AND IMPLEMENTATION: A web tool for information retrieval and analysis of soybean miRFNs and the relevant target functional gene networks can be accessed at SoymiRNet: http://nclab.hit.edu.cn/SoymiRNet.
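The disassortativity mentioned in this abstract (high-degree nodes tending to connect to low-degree ones) can be illustrated with a generic degree correlation over the edges of a network. The sketch below is a textbook-style Pearson correlation of end-point degrees, not the analysis pipeline used in the paper; the function name `degree_assortativity` and the toy star network are assumptions for illustration.

```python
def degree_assortativity(edges):
    """Pearson correlation between the degrees at the two ends of each edge.

    Negative values indicate a disassortative network. The result is
    undefined (division by zero) when every node has the same degree.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count each undirected edge in both directions so the measure is symmetric.
    xs = [degree[u] for u, v in edges] + [degree[v] for u, v in edges]
    ys = [degree[v] for u, v in edges] + [degree[u] for u, v in edges]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Toy star network (made up): one hub linked to three low-degree nodes.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
r = degree_assortativity(star)
```

A star is the extreme disassortative case, so `r` comes out at the lower bound of the correlation range.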


Current Genomics | 2013

Computational Approaches in Detecting Non-Coding RNA

Chunyu Wang; Leyi Wei; Maozu Guo; Quan Zou

The important role of non-coding RNAs (ncRNAs) in the cell has made their identification a critical issue in biological research. However, traditional approaches such as RT-PCR and Northern blot are costly. With recent progress in bioinformatics and computational prediction technology, the discovery of ncRNAs has become realistically possible. This paper introduces the major computational approaches to the identification of ncRNAs, including homologous search, de novo prediction and mining of deep sequencing data. Furthermore, related software tools are compared and reviewed, along with a discussion of future improvements.


Genetics and Molecular Research | 2015

imDC: an ensemble learning method for imbalanced classification with miRNA data.

Chunyu Wang; L.L. Hu; Maozu Guo; Xiangrong Liu; Quan Zou

Imbalances typically exist in bioinformatics and are also common in other areas. A drawback of traditional machine learning methods is the relatively little attention given to small sample classification. Thus, we developed imDC, which uses an ensemble learning concept in combination with weights and sample misclassification information to effectively classify imbalanced data. Our method showed better results when compared to other algorithms with UCI machine learning datasets and microRNA data.


Computers in Biology and Medicine | 2009

Predicting RNA secondary structure based on the class information and Hopfield network

Quan Zou; Tuo Zhao; Yang Liu; Maozu Guo

One model for RNA secondary structure prediction treats it as a maximum independent set problem, which can be approximately solved by a Hopfield network. However, when predicting native molecules, the model is not always accurate and the heuristic method of the Hopfield network is not always stable. This is because the class information is lost and accuracy is not determined by the number of base pairs alone. Secondary structures of non-coding RNAs are believed to be conserved within the same class. However, current software and web servers for RNA secondary structure prediction do not consider class information. In this paper, we incorporate class information into the initialization of the Hopfield network. With the improved initialization, the experimental results show the efficacy and superiority of our proposed methods.
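To make the maximum independent set view concrete, the sketch below treats each candidate base pair as a node and marks two pairs as conflicting when they share a base or cross each other. It then selects a conflict-free subset with a simple greedy pass, which is only a stand-in for the Hopfield network and uses no class information. The function names and the candidate pairs are hypothetical.

```python
def conflicts(p: tuple[int, int], q: tuple[int, int]) -> bool:
    """Two base pairs conflict if they share a position or cross each other."""
    (i, j), (k, l) = sorted([p, q])
    shares_base = len({i, j, k, l}) < 4
    crosses = i < k < j < l
    return shares_base or crosses


def greedy_independent_set(pairs):
    """Greedily keep pairs that conflict with nothing already kept.

    This is a simple heuristic stand-in, not a Hopfield network.
    """
    chosen = []
    for p in pairs:
        if all(not conflicts(p, q) for q in chosen):
            chosen.append(p)
    return chosen


# Hypothetical candidate base pairs (i, j) with i < j, as sequence positions.
candidates = [(0, 9), (1, 8), (2, 5), (4, 7), (1, 5)]
structure = greedy_independent_set(candidates)
```

The greedy result depends on the input order, which loosely mirrors the abstract's point that the outcome of a heuristic solver is sensitive to how it is initialized.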


Bioinformatics | 2013

LNETWORK: An Efficient and Effective Method for Constructing Phylogenetic Networks

Juan Wang; Maozu Guo; Xiaoyan Liu; Yang Liu; Chunyu Wang; Linlin Xing; Kai Che

MOTIVATION: The evolutionary history of species is traditionally represented with a rooted phylogenetic tree. Each tree comprises a set of clusters, i.e. subsets of the species that are descended from a common ancestor. When rooted phylogenetic trees are built from several different datasets (e.g. from different genes), the clusters are often conflicting. These conflicting clusters cannot be expressed as a simple phylogenetic tree; however, they can be expressed in a phylogenetic network. Phylogenetic networks are a generalization of phylogenetic trees that can account for processes such as hybridization, horizontal gene transfer and recombination, which are difficult to represent in standard tree-like models of evolutionary histories. There is currently a large body of research aimed at developing appropriate methods for constructing phylogenetic networks from cluster sets. The Cass algorithm can construct a much simpler network than other available methods, but is extremely slow for large datasets or for datasets that need lots of reticulate nodes. The networks constructed by Cass are also greatly dependent on the order of input data, i.e. it generally derives different phylogenetic networks for the same dataset when different input orders are used. RESULTS: In this study, we introduce an improved Cass algorithm, Lnetwork, which can construct a phylogenetic network for a given set of clusters. We show that Lnetwork is significantly faster than Cass and effectively weakens the influence of input data order. Moreover, we show that Lnetwork can construct a much simpler network than most of the other available methods. AVAILABILITY: Lnetwork has been built as a Java software package and is freely available at http://nclab.hit.edu.cn/~wangjuan/Lnetwork/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
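The notion of conflicting clusters used in this abstract can be checked with a standard test: two clusters fit on the same rooted tree only if they are disjoint or one contains the other. The sketch below applies that test to a made-up cluster set. It illustrates the input problem only and is not the Cass or Lnetwork algorithm; the function names are hypothetical.

```python
from itertools import combinations


def compatible(a: frozenset, b: frozenset) -> bool:
    """Two clusters fit on one rooted tree if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def conflicting_pairs(clusters):
    """List every pair of clusters that cannot coexist on a single tree."""
    return [(a, b) for a, b in combinations(clusters, 2) if not compatible(a, b)]


# Clusters as they might come from two gene trees on taxa a..d (made-up data).
clusters = [frozenset("ab"), frozenset("abc"), frozenset("bc"), frozenset("d")]
bad = conflicting_pairs(clusters)
```

Here `{a, b}` and `{b, c}` overlap without nesting, so no single tree holds both and a network with at least one reticulate node is needed to display them together.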


Genetics and Molecular Research | 2011

Genetic algorithm-based efficient feature selection for classification of pre-miRNAs.

Ping Xuan; Maozu Guo; Wang J; Chunxue Wang; Xiangrong Liu; Liu Y

In order to classify real/pseudo human precursor microRNA (pre-miRNA) hairpins with ab initio methods, numerous features are extracted from the primary sequence and secondary structure of pre-miRNAs. However, they include some redundant and useless features. It is essential to select the most representative feature subset, as this contributes to improving the classification accuracy. We propose a novel feature selection method based on a genetic algorithm, according to the characteristics of human pre-miRNAs. The information gain of a feature, the feature conservation relative to stem parts of pre-miRNA, and the redundancy among features are all considered. Feature conservation was introduced for the first time. Experimental results were validated by cross-validation using datasets composed of human real/pseudo pre-miRNAs. Compared with microPred, our classifier, miPredGA, achieved more reliable sensitivity and specificity. The accuracy was improved by nearly 12%. The feature selection algorithm is useful for constructing more efficient classifiers for identification of real human pre-miRNAs from pseudo hairpins.


European Journal of Human Genetics | 2015

A gene-based information gain method for detecting gene–gene interactions in case–control studies

Jin Li; Dongli Huang; Maozu Guo; Xiaoyan Liu; Chunyu Wang; Zhixia Teng; Ruijie Zhang; Yongshuai Jiang; Hongchao Lv; Limei Wang

Currently, most methods for detecting gene–gene interactions (GGIs) in genome-wide association studies are divided into SNP-based methods and gene-based methods. Generally, the gene-based methods can be more powerful than SNP-based methods. Some gene-based entropy methods can only capture the linear relationship between genes. We therefore proposed a nonparametric gene-based information gain method (GBIGM) that can capture both linear and nonlinear relationships between genes. Through simulation with different odds ratios, sample sizes and prevalence rates, GBIGM was shown to be valid and more powerful than the classic KCCU method and the SNP-based entropy method. In the analysis of data from 17 genes on rheumatoid arthritis, GBIGM was more effective than the other two methods as it obtained fewer significant results, which is important for biological verification. Therefore, GBIGM is a suitable and powerful tool for detecting GGIs in case–control studies.
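The information gain idea behind this abstract can be illustrated with the generic entropy-based definition: the drop in uncertainty about case/control status once genotypes are known. The sketch below is not the GBIGM statistic itself, whose exact form is not given here. It also reduces each gene to a single genotype value, which is an assumption made purely to keep the example small.

```python
from collections import Counter
from math import log2


def entropy(labels) -> float:
    """Shannon entropy (in bits) of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(genotypes, phenotypes) -> float:
    """H(phenotype) minus H(phenotype | genotype)."""
    n = len(phenotypes)
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    conditional = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(phenotypes) - conditional


# Toy XOR-like data (made up): neither gene alone is informative, the pair is.
gene1 = [0, 0, 1, 1]
gene2 = [0, 1, 0, 1]
status = [0, 1, 1, 0]
gain_single = information_gain(gene1, status)
gain_joint = information_gain(list(zip(gene1, gene2)), status)
```

Comparing `gain_joint` with `gain_single` shows how a joint measure can expose a purely nonlinear interaction that a single-gene view misses.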

Collaboration


Dive into Maozu Guo's collaborations.

Top Co-Authors

Chunyu Wang
Harbin Institute of Technology

Yang Liu
Harbin Institute of Technology

Xiaoyan Liu
Harbin Institute of Technology

Zhixia Teng
Northeast Forestry University

Guojun Liu
Harbin Institute of Technology

Jin Li
Harbin Institute of Technology

Kai Che
Harbin Institute of Technology

Qiguo Dai
Harbin Institute of Technology

Yingpeng Han
Northeast Agricultural University