Fenglou Mao | Researchain

Publication

Featured researches published by Fenglou Mao.

Nucleic Acids Research | 2012

dbCAN: a web resource for automated carbohydrate-active enzyme annotation

Yanbin Yin; Xizeng Mao; Jincai Yang; Xin Chen; Fenglou Mao; Ying Xu

Carbohydrate-active enzymes (CAZymes) are very important to the biotech industry, particularly the emerging biofuel industry because CAZymes are responsible for the synthesis, degradation and modification of all the carbohydrates on Earth. We have developed a web resource, dbCAN (http://csbl.bmb.uga.edu/dbCAN/annotate.php), to provide a capability for automated CAZyme signature domain-based annotation for any given protein data set (e.g. proteins from a newly sequenced genome) submitted to our server. To accomplish this, we have explicitly defined a signature domain for every CAZyme family, derived based on the CDD (conserved domain database) search and literature curation. We have also constructed a hidden Markov model to represent the signature domain of each CAZyme family. These CAZyme family-specific HMMs are our key contribution and the foundation for the automated CAZyme annotation.

Nucleic Acids Research | 2009

DOOR: a database for prokaryotic operons

Fenglou Mao; Phuongan Dam; Jacky Chou; Victor Olman; Ying Xu

We present a database DOOR (Database for prOkaryotic OpeRons) containing computationally predicted operons of all the sequenced prokaryotic genomes. All the operons in DOOR are predicted using our own prediction program, which was ranked to be the best among 14 operon prediction programs by a recent independent review. Currently, the DOOR database contains operons for 675 prokaryotic genomes, and supports a number of search capabilities to facilitate easy access and utilization of the information stored in it. Querying the database: the database provides a search capability for a user to find desired operons and associated information through multiple querying methods. Searching for similar operons: the database provides a search capability for a user to find operons that have similar composition and structure to a query operon. Prediction of cis-regulatory motifs: the database provides a capability for motif identification in the promoter regions of a user-specified group of possibly coregulated operons, using motif-finding tools. Operons for RNA genes: the database includes operons for RNA genes. OperonWiki: the database provides a wiki page (OperonWiki) to facilitate interactions between users and the developer of the database. We believe that DOOR provides a useful resource to many biologists working on bacteria and archaea, which can be accessed at http://csbl1.bmb.uga.edu/OperonDB.

Nucleic Acids Research | 2005

Prediction of functional modules based on comparative genome analysis and Gene Ontology application

Hongwei Wu; Zhengchang Su; Fenglou Mao; Victor Olman; Ying Xu

We present a computational method for the prediction of functional modules encoded in microbial genomes. In this work, we have also developed a formal measure to quantify the degree of consistency between the predicted and the known modules, and have carried out statistical significance analysis of consistency measures. We first evaluate the functional relationship between two genes from three different perspectives: phylogenetic profile analysis, gene neighborhood analysis and Gene Ontology assignments. We then combine the three different sources of information in the framework of Bayesian inference, and we use the combined information to measure the strength of gene functional relationship. Finally, we apply a threshold-based method to predict functional modules. By applying this method to Escherichia coli K12, we have predicted 185 functional modules. Our predictions are highly consistent with the previously known functional modules in E. coli. The application results have demonstrated that our approach is highly promising for the prediction of functional modules encoded in a microbial genome.
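
The evidence-combination and thresholding steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the three evidence sources are conditionally independent (so their likelihood ratios multiply), it reads "threshold-based" as "link gene pairs scoring at or above a cutoff and take connected components", and every gene name, likelihood ratio, prior and threshold is hypothetical.

```python
from collections import defaultdict


def posterior(prior, likelihood_ratios):
    """Combine independent evidence with Bayes' rule in odds form."""
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)


def modules(pair_scores, threshold):
    """Link pairs scoring >= threshold; return connected components of size > 1."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), score in pair_scores.items():
        find(a)
        find(b)
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for gene in parent:
        groups[find(gene)].add(gene)
    return sorted(sorted(m) for m in groups.values() if len(m) > 1)


# Hypothetical likelihood ratios per gene pair, in the order:
# (phylogenetic profile, gene neighborhood, Gene Ontology)
evidence = {
    ("geneA", "geneB"): (8.0, 5.0, 3.0),
    ("geneB", "geneC"): (6.0, 4.0, 2.0),
    ("geneC", "geneD"): (0.5, 0.8, 1.0),
}
PRIOR = 0.01  # assumed prior probability that two genes are functionally related

scores = {pair: posterior(PRIOR, lrs) for pair, lrs in evidence.items()}
predicted = modules(scores, threshold=0.3)
print(predicted)
```

With these made-up numbers, geneA, geneB and geneC end up in one module, while geneD stays unassigned because its only link scores far below the cutoff.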

Nucleic Acids Research | 2014

DOOR 2.0: presenting operons and their functions through dynamic and integrated views

Xizeng Mao; Qin Ma; Chuan Zhou; Xin Chen; Hanyuan Zhang; Jincai Yang; Fenglou Mao; Wei Lai; Ying Xu

We have recently developed a new version of the DOOR operon database, DOOR 2.0, which is available online at http://csbl.bmb.uga.edu/DOOR/ and will be updated on a regular basis. DOOR 2.0 contains genome-scale operons for 2072 prokaryotes with complete genomes, three times the number of genomes covered in the previous version published in 2009. DOOR 2.0 has a number of new features, compared with its previous version, including (i) more than 250 000 transcription units, experimentally validated or computationally predicted based on RNA-seq data, providing a dynamic functional view of the underlying operons; (ii) an integrated operon-centric data resource that provides not only operons for each covered genome but also their functional and regulatory information such as their cis-regulatory binding sites for transcription initiation and termination, gene expression levels estimated based on RNA-seq data and conservation information across multiple genomes; (iii) a high-performance web service for online operon prediction on user-provided genomic sequences; (iv) an intuitive genome browser to support visualization of user-selected data; and (v) a keyword-based Google-like search engine for finding the needed information intuitively and rapidly in this database.

Nucleic Acids Research | 2005

Quantitative evaluation of protein–DNA interactions using an optimized knowledge-based potential

Zhijie Liu; Fenglou Mao; Jun-tao Guo; Bo Yan; Peng Wang; Youxing Qu; Ying Xu

Computational evaluation of protein–DNA interaction is important for the identification of DNA-binding sites and genome annotation. It could validate the predicted binding motifs by sequence-based approaches through the calculation of the binding affinity between a protein and DNA. Such an evaluation should take into account structural information to deal with the complicated effects from DNA structural deformation, distance-dependent multi-body interactions and solvation contributions. In this paper, we present a knowledge-based potential built on interactions between protein residues and DNA tri-nucleotides. The potential, which explicitly considers the distance-dependent two-body, three-body and four-body interactions between protein residues and DNA nucleotides, has been optimized in terms of a Z-score. We have applied this knowledge-based potential to evaluate the binding affinities of zinc-finger protein–DNA complexes. The predicted binding affinities are in good agreement with the experimental data (with a correlation coefficient of 0.950). On a larger test set containing 48 protein–DNA complexes with known experimental binding free energies, our potential has achieved a high correlation coefficient of 0.800, when compared with the experimental data. We have also used this potential to identify binding motifs in DNA sequences of transcription factors (TF). The TFs in 79.4% of the known TF–DNA complexes have accurately found their native binding sequences from a large pool of DNA sequences. When tested in a genome-scale search for TF-binding motifs of the cyclic AMP regulatory protein (CRP) of Escherichia coli, this potential ranks all known binding motifs of CRP in the top 15% of all candidate sequences.
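
The two statistics this abstract relies on, a Z-score for optimizing the potential and a correlation coefficient for comparing predicted and experimental affinities, can be sketched as follows. The abstract does not give the exact Z-score definition, so this assumes the common form (native energy minus mean decoy energy, divided by the decoy standard deviation) and the Pearson coefficient; all energies and affinities are invented for illustration.

```python
import math


def z_score(native_energy, decoy_energies):
    """Number of decoy standard deviations between the native energy and the decoy mean."""
    n = len(decoy_energies)
    mean = sum(decoy_energies) / n
    variance = sum((e - mean) ** 2 for e in decoy_energies) / n
    return (native_energy - mean) / math.sqrt(variance)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical energies: a native complex and four decoy (non-native) sequences
native = -10.0
decoys = [-2.0, -4.0, -6.0, -8.0]
z = z_score(native, decoys)

# Hypothetical predicted scores versus experimental binding free energies
predicted = [-9.1, -7.4, -6.0, -4.2]
experimental = [-11.0, -9.5, -8.1, -6.0]
r = pearson(predicted, experimental)
print(round(z, 3), round(r, 3))
```

A more negative Z-score means the native complex is better separated from the decoys, which is the sense in which a potential can be "optimized in terms of a Z-score".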

Nucleic Acids Research | 2006

Computational inference and experimental validation of the nitrogen assimilation regulatory network in cyanobacterium Synechococcus sp. WH 8102

Zhengchang Su; Fenglou Mao; Phuongan Dam; Hongwei Wu; Victor Olman; Ian T. Paulsen; Brian Palenik; Ying Xu

Deciphering the regulatory networks encoded in the genome of an organism represents one of the most interesting and challenging tasks in the post-genome sequencing era. As an example of this problem, we have predicted a detailed model for the nitrogen assimilation network in cyanobacterium Synechococcus sp. WH 8102 (WH8102) using a computational protocol based on comparative genomics analysis and mining experimental data from related organisms that are relatively well studied. This computational model is in excellent agreement with the microarray gene expression data collected under ammonium-rich versus nitrate-rich growth conditions, suggesting that our computational protocol is capable of predicting biological pathways/networks with high accuracy. We then refined the computational model using the microarray data, and proposed a new model for the nitrogen assimilation network in WH8102. An intriguing discovery from this study is that nitrogen assimilation affects the expression of many genes involved in photosynthesis, suggesting a tight coordination between nitrogen assimilation and photosynthesis processes. Moreover, for some of these genes, this coordination is probably mediated by NtcA through the canonical NtcA promoters in their regulatory regions.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2009

Parallel Clustering Algorithm for Large Data Sets with Applications in Bioinformatics

Victor Olman; Fenglou Mao; Hongwei Wu; Ying Xu

Large bioinformatics data sets make cluster identification time-consuming, which motivates a parallel algorithm for identifying dense clusters in a noisy background. Our algorithm works on a graph representation of the data set to be analyzed. It identifies clusters through the identification of densely intraconnected subgraphs. We have employed a minimum spanning tree (MST) representation of the graph and solve the cluster identification problem using this representation. The computational bottleneck of our algorithm is the construction of an MST of a graph, for which a parallel algorithm is employed. Our high-level strategy for the parallel MST construction algorithm is to first partition the graph, then construct MSTs for the partitioned subgraphs and auxiliary bipartite graphs based on the subgraphs, and finally merge these MSTs to derive an MST of the original graph. The computational results indicate that when running on 150 CPUs, our algorithm can solve a cluster identification problem on a data set with 1,000,000 data points almost 100 times faster than on a single CPU, indicating that this program is capable of handling very large data clustering problems in an efficient manner. We have implemented the clustering algorithm as the software CLUMP.
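
The partition-then-merge strategy for MST construction described in this abstract can be checked on a small example. The sketch below is sequential and is not the CLUMP code: it splits hypothetical random 2D points into two parts, builds an MST for each part and for the bipartite graph between them (the three jobs that could run on separate workers), then runs Kruskal's algorithm on the union of those trees. By the cycle property, the result has the same total weight as an MST built directly on the full graph.

```python
import itertools
import math
import random


def kruskal(nodes, edges):
    """Return the edges of a minimum spanning tree (or forest) of the given graph."""
    parent = {v: v for v in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


random.seed(0)
points = [(random.random(), random.random()) for _ in range(40)]  # hypothetical data
ids = list(range(len(points)))
part_a, part_b = ids[:20], ids[20:]


def edges_within(part):
    """All pairwise edges inside one partition, weighted by Euclidean distance."""
    return [(math.dist(points[u], points[v]), u, v)
            for u, v in itertools.combinations(part, 2)]


def edges_between(a, b):
    """All edges of the auxiliary bipartite graph between two partitions."""
    return [(math.dist(points[u], points[v]), u, v) for u in a for v in b]


# These three MSTs are independent of each other and could be built in parallel.
candidate_edges = (
    kruskal(part_a, edges_within(part_a))
    + kruskal(part_b, edges_within(part_b))
    + kruskal(ids, edges_between(part_a, part_b))
)

merged = kruskal(ids, candidate_edges)      # merge step on a much smaller edge set
direct = kruskal(ids, edges_within(ids))    # reference: MST of the full graph

merged_weight = sum(w for w, _, _ in merged)
direct_weight = sum(w for w, _, _ in direct)
print(len(candidate_edges), len(edges_within(ids)))
```

The merge step only has to consider the edges kept by the partial MSTs, which is far fewer than the full pairwise edge set; that reduction is what makes the partitioned approach attractive for large inputs.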

Nucleic Acids Research | 2006

Detecting uber-operons in prokaryotic genomes

Dongsheng Che; Guojun Li; Fenglou Mao; Hongwei Wu; Ying Xu

We present a study on computational identification of uber-operons in a prokaryotic genome, each of which represents a group of operons that are evolutionarily or functionally associated through operons in other (reference) genomes. Uber-operons represent a rich set of footprints of operon evolution, whose full utilization could lead to new and more powerful tools for elucidation of biological pathways and networks than what operons have provided, and a better understanding of prokaryotic genome structures and evolution. Our prediction algorithm predicts uber-operons through identifying groups of functionally or transcriptionally related operons, whose gene sets are conserved across the target and multiple reference genomes. Using this algorithm, we have predicted uber-operons for each of a group of 91 genomes, using the other 90 genomes as references. In particular, we predicted 158 uber-operons in Escherichia coli K12 covering 1830 genes, and found that many of the uber-operons correspond to parts of known regulons or biological pathways or are involved in highly related biological processes based on their Gene Ontology (GO) assignments. For some of the predicted uber-operons that are not parts of known regulons or pathways, our analyses indicate that their genes are highly likely to work together in the same biological processes, suggesting the possibility of new regulons and pathways. We believe that our uber-operon prediction provides a highly useful capability and a rich information source for elucidation of complex biological processes, such as pathways in microbes. All the prediction results are available at our Uber-Operon Database: , the first of its kind.

PLOS ONE | 2013

Elucidation of How Cancer Cells Avoid Acidosis through Comparative Transcriptomic Data Analysis

Kun Xu; Xizeng Mao; Minesh P. Mehta; Juan Cui; Chi Zhang; Fenglou Mao; Ying Xu

The rapid growth of cancer cells fueled by glycolysis produces large amounts of protons in cancer cells, which trigger mechanisms to transport them out, hence leading to increased acidity in their extracellular environments. It has been well established that the increased acidity will induce cell death of normal cells but not cancer cells. The main question we address here is how cancer cells deal with the increased acidity to avoid the activation of apoptosis. We have carried out a comparative analysis of transcriptomic data of six solid cancer types (breast, colon, liver, two lung (adenocarcinoma, squamous cell carcinoma) and prostate cancers), and proposed a model of how cancer cells utilize a few mechanisms to keep the protons outside of the cells. The model consists of a number of previously well or partially studied mechanisms for transporting out the excess protons, such as through the monocarboxylate transporters, V-ATPases, NHEs and the one facilitated by carbonic anhydrases. In addition, we propose a new mechanism that neutralizes protons through the conversion of glutamate to γ-aminobutyrate, which consumes one proton per reaction. We hypothesize that these processes are regulated by cancer-related conditions such as hypoxia and growth factors and by the pH levels, making these encoded processes not available to normal cells under acidic conditions.

Nucleic Acids Research | 2007

Hierarchical classification of functionally equivalent genes in prokaryotes

Hongwei Wu; Fenglou Mao; Victor Olman; Ying Xu

Functional classification of genes represents a fundamental problem to many biological studies. Most of the existing classification schemes are based on the concepts of homology and orthology, which were originally introduced to study gene evolution but might not be the most appropriate for gene function prediction, particularly at high resolution level. We have recently developed a scheme for hierarchical classification of genes (HCGs) in prokaryotes. In the HCG scheme, the functional equivalence relationships among genes are first assessed through a careful application of both sequence similarity and genomic neighborhood information; and genes are then classified into a hierarchical structure of clusters, where genes in each cluster are functionally equivalent at some resolution level, and the level of resolution goes higher as the clusters become increasingly smaller traveling down the hierarchy. The HCG scheme is validated through comparisons with the taxonomy of the prokaryotic genomes, Clusters of Orthologous Groups (COGs) of genes and the Pfam system. We have applied the HCG scheme to 224 complete prokaryotic genomes, and constructed a HCG database consisting of a forest of 5339 multi-level and 15 770 single-level trees of gene clusters covering ∼93% of the genes of these 224 genomes. The validation results indicate that the HCG scheme not only captures the key features of the existing classification schemes but also provides a much richer organization of genes which can be used for functional prediction of genes at higher resolution and to help reveal evolutionary trace of the genes.
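
The idea of clusters that shrink and gain resolution going down the hierarchy can be illustrated with a small sketch. It is a simplification, not the HCG scheme itself: it assumes a single hypothetical pairwise score standing in for the combined sequence-similarity and genomic-neighborhood evidence, and it splits each cluster by taking connected components at successively stricter thresholds. Gene names, scores and thresholds are invented.

```python
def components(genes, sims, threshold):
    """Group genes connected by pairs whose score is >= threshold."""
    parent = {g: g for g in genes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), score in sims.items():
        if score >= threshold and a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


def build_tree(genes, sims, thresholds):
    """Recursively split a cluster using increasingly strict thresholds."""
    node = {"genes": sorted(genes), "children": []}
    if thresholds and len(genes) > 1:
        parts = components(genes, sims, thresholds[0])
        node["children"] = [build_tree(p, sims, thresholds[1:]) for p in parts]
    return node


# Hypothetical functional-equivalence scores between gene pairs
sims = {("g1", "g2"): 0.9, ("g2", "g3"): 0.6, ("g3", "g4"): 0.3}
tree = build_tree(["g1", "g2", "g3", "g4"], sims, thresholds=[0.5, 0.8])

top_level = [child["genes"] for child in tree["children"]]
second_level = [child["genes"] for child in tree["children"][0]["children"]]
print(top_level, second_level)
```

At the looser threshold g1, g2 and g3 are treated as equivalent; at the stricter one only g1 and g2 remain together, mirroring the way resolution increases as clusters get smaller.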