Chunmei Liu | Researchain

Publication

Featured research published by Chunmei Liu.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2006

Efficient Parameterized Algorithms for Biopolymer Structure-Sequence Alignment

Yinglei Song; Chunmei Liu; Xiuzhen Huang; Russell L. Malmberg; Ying Xu; Liming Cai

Computational alignment of a biopolymer sequence (e.g., an RNA or a protein) to a structure is an effective approach to predict and search for the structure of new sequences. To identify the structure of remote homologs, the structure-sequence alignment has to consider not only sequence similarity, but also spatially conserved conformations caused by residue interactions and, consequently, is computationally intractable. It is difficult to cope with the inefficiency without compromising alignment accuracy, especially for structure search in genomes or large databases. This paper introduces a novel method and a parameterized algorithm for structure-sequence alignment. Both the structure and the sequence are represented as graphs, where, in general, the graph for a biopolymer structure has a naturally small tree width. The algorithm constructs an optimal alignment by finding in the sequence graph the maximum valued subgraph isomorphic to the structure graph. It has the computational time complexity O(k^t N^2) for the structure of N residues and its tree decomposition of width t. The parameter k, small in nature, is determined by a statistical cutoff for the correspondence between the structure and the sequence. This paper demonstrates a successful application of the algorithm to RNA structure search used for noncoding RNA identification. An application to protein threading is also discussed.
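
As a rough illustration of the idea in this abstract, the sketch below covers only the simplest case: a structure graph that is a rooted tree (tree width 1), where every structure vertex has at most k candidate positions in the sequence. It is not the authors' algorithm. The names `best_alignment`, `children`, `candidates`, `vertex_score` and `edge_score` are hypothetical, and the sketch ignores the requirement that the mapping be a valid (injective, order-respecting) subgraph isomorphism. The general method would run a similar dynamic program over the bags of a width-t tree decomposition.

```python
from typing import Callable, Dict, Hashable, List, Tuple


def best_alignment(
    children: Dict[Hashable, List[Hashable]],
    root: Hashable,
    candidates: Dict[Hashable, List[int]],
    vertex_score: Callable[[Hashable, int], float],
    edge_score: Callable[[Hashable, int, Hashable, int], float],
) -> Tuple[float, Dict[Hashable, int]]:
    """Choose one candidate position per structure vertex to maximize the score.

    Assumes the structure graph is a rooted tree described by `children`.
    Runs in O(k^2 * N) time for N vertices with at most k candidates each.
    """
    best: Dict[Tuple[Hashable, int], float] = {}
    choice: Dict[Tuple[Hashable, int, Hashable], int] = {}

    def solve(v: Hashable) -> None:
        # Solve all subtrees first, then combine them at v.
        for c in children.get(v, []):
            solve(c)
        for p in candidates[v]:
            total = vertex_score(v, p)
            for c in children.get(v, []):
                # Best position for child c, given that v is placed at p.
                q_best = max(
                    candidates[c],
                    key=lambda q: best[(c, q)] + edge_score(v, p, c, q),
                )
                choice[(v, p, c)] = q_best
                total += best[(c, q_best)] + edge_score(v, p, c, q_best)
            best[(v, p)] = total

    solve(root)

    # Trace the stored choices back from the best root placement.
    p_root = max(candidates[root], key=lambda p: best[(root, p)])
    mapping: Dict[Hashable, int] = {}
    stack = [(root, p_root)]
    while stack:
        v, p = stack.pop()
        mapping[v] = p
        for c in children.get(v, []):
            stack.append((c, choice[(v, p, c)]))
    return best[(root, p_root)], mapping


# Toy instance: vertex "a" interacts with "b" and "c"; k = 2 candidates each.
toy_children = {"a": ["b", "c"]}
toy_candidates = {"a": [0, 1], "b": [2, 3], "c": [5, 6]}
toy_vertex = {("a", 0): 1.0, ("a", 1): 1.5, ("b", 2): 1.0, ("c", 6): 1.0}

score, mapping = best_alignment(
    toy_children,
    "a",
    toy_candidates,
    lambda v, p: toy_vertex.get((v, p), 0.0),
    lambda u, p, v, q: 3.0 if q - p == 2 else 0.0,
)
print(score, mapping)
```

Restricting each vertex to k candidates is what keeps the table small: the work per tree edge is k^2 here, and would grow to roughly k^t per bag for a decomposition of width t.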

Computational Systems Bioinformatics | 2005

Tree decomposition based fast search of RNA structures including pseudoknots in genomes

Yinglei Song; Chunmei Liu; Russell L. Malmberg; Fangfang Pan; Liming Cai

Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of searching genomes for RNA secondary structures quickly and effectively have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformational graphs (e.g., t=2 for stem loops and only a slight increase for pseudoknots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k^t N^2), where k is a small parameter and N is the size of the profiled RNA structure. Experiments show that the application of the alignment algorithm to search in genomes yields the same search accuracy as methods based on a covariance model with a significant reduction in computation time. In particular, very accurate searches of tmRNAs in bacterial genomes and of telomerase RNAs in yeast genomes can be accomplished in days, as opposed to months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at our site http://www.uga.edu/RNA-Informatics/software/index.php.

Intelligent Systems in Molecular Biology | 2006

Peptide sequence tag-based blind identification of post-translational modifications with point process model

Chunmei Liu; Bo Yan; Yinglei Song; Ying Xu; Liming Cai

An important but difficult problem in proteomics is the identification of post-translational modifications (PTMs) in a protein. In general, the process of PTM identification by aligning experimental spectra with theoretical spectra from peptides in a peptide database is very time consuming and may lead to a high false-positive rate. In this paper, we introduce a new approach that is both efficient and effective for blind PTM identification. Our work consists of the following phases. First, we develop a novel tree decomposition based algorithm that can efficiently generate peptide sequence tags (PSTs) from an extended spectrum graph. Sequence tags are selected from all maximum weighted antisymmetric paths in the graph and their reliabilities are evaluated with a score function. An efficient deterministic finite automaton (DFA) based model is then developed to search a peptide database for candidate peptides by using the generated sequence tags. Finally, a point process model, an efficient blind search approach for PTM identification, is applied to report the correct peptide and PTMs if there are any. Our tests on 2657 experimental tandem mass spectra and 2620 experimental spectra with one artificially added PTM show that, in addition to high efficiency, our ab initio sequence tag selection algorithm achieves better or comparable accuracy to other approaches. Database search results show that the sequence tags of lengths 3 and 4 filter out more than 98.3% and 99.8% of peptides respectively when applied to a yeast peptide database. With the dramatically reduced search space, the point process model achieves significant improvement in accuracy as well. Availability: The software is available upon request.
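
The database-filtering phase described in this abstract can be pictured with a standard multi-pattern automaton. The sketch below uses the textbook Aho-Corasick construction as a stand-in; the abstract does not say which DFA construction the authors use, and the function names, tags and peptides here are hypothetical examples. A real tag search would also need to handle mass offsets and modified residues, which this sketch does not.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Automaton = Tuple[List[Dict[str, int]], List[int], List[Set[str]]]


def build_automaton(tags: List[str]) -> Automaton:
    """Build an Aho-Corasick automaton (goto, fail, output) over the tags."""
    goto: List[Dict[str, int]] = [{}]
    output: List[Set[str]] = [set()]
    for tag in tags:
        state = 0
        for ch in tag:
            if ch not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(tag)

    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fall back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit tags that end at the fallback state.
            output[nxt] |= output[fail[nxt]]
    return goto, fail, output


def matched_tags(peptide: str, automaton: Automaton) -> Set[str]:
    """Return every tag that occurs in the peptide, in one left-to-right scan."""
    goto, fail, output = automaton
    state, found = 0, set()
    for ch in peptide:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= output[state]
    return found


def filter_database(peptides: List[str], tags: List[str]) -> List[str]:
    """Keep only peptides that contain at least one sequence tag."""
    automaton = build_automaton(tags)
    return [p for p in peptides if matched_tags(p, automaton)]


example_tags = ["GLK", "LKE"]
example_db = ["AGLKER", "PEPTIDE", "MLKEA"]
print(filter_database(example_db, example_tags))
```

Each peptide is scanned once regardless of how many tags are loaded, which is why an automaton suits filtering a large database before the more expensive PTM search.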

Pacific Symposium on Biocomputing | 2005

Fast de novo peptide sequencing and spectral alignment via tree decomposition

Chunmei Liu; Yinglei Song; Bo Yan; Ying Xu; Liming Cai

De novo sequencing and spectral alignment are computationally important for the prediction of new protein peptides via tandem mass spectrometry (MS/MS). Both approaches are established upon the problem of finding the longest antisymmetric path on formulated graphs. The problem is of high computational complexity and the prediction accuracy is compromised when given spectra involve noisy data, missing mass peaks, or post-translational modifications (PTMs) and mutations. This paper introduces a graphical mechanism to describe relationships among mass peaks that, through graph tree decomposition, yields linear and quadratic time algorithms for optimal de novo sequencing and spectral alignment respectively. Our test results show that, in addition to high efficiency, the new algorithms can achieve desired prediction accuracy on spectra containing noisy peaks and PTMs while allowing the presence of both b-ions and y-ions.
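
To make the graph formulation concrete, here is a minimal sketch of a spectrum graph and a longest-path search over it. It is a simplification and not the authors' method: it treats every peak as a prefix (b-ion) mass, drops the antisymmetry constraint entirely, and uses a plain quadratic dynamic program rather than tree decomposition. The function name `longest_tag`, the tolerance and the peak list are hypothetical; the residue masses are a small subset of the standard monoisotopic values.

```python
from typing import Dict, List

# Monoisotopic residue masses (Da) for a few amino acids only.
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
}


def longest_tag(peaks: List[float], tol: float = 0.01) -> str:
    """Return the longest residue string spelled by a path through the peaks.

    Two peaks are joined by an edge when their mass difference matches a
    residue mass within `tol`. Peaks sorted by mass form a DAG, so the
    longest path is found by dynamic programming in increasing mass order.
    """
    peaks = sorted(peaks)
    best: List[str] = [""] * len(peaks)  # best[j]: longest string ending at peak j
    for j in range(len(peaks)):
        for i in range(j):
            diff = peaks[j] - peaks[i]
            for residue, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol and len(best[i]) + 1 > len(best[j]):
                    best[j] = best[i] + residue
    return max(best, key=len) if best else ""


# Prefix masses for G, GA, GAV plus one noise peak at 150.0.
example_peaks = [0.0, 57.02146, 128.05857, 150.0, 227.12698]
print(longest_tag(example_peaks))
```

The noise peak is skipped automatically because no residue mass links it to its neighbours. Handling mixed b-ions and y-ions is what introduces the antisymmetric-path constraint discussed in the abstract.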

Workshop on Algorithms in Bioinformatics | 2005

Efficient parameterized algorithm for biopolymer structure-sequence alignment

Yinglei Song; Chunmei Liu; Xiuzhen Huang; Russell L. Malmberg; Ying Xu; Liming Cai

Computational alignment of a biopolymer sequence (e.g., an RNA or a protein) to a structure is an effective approach to predict and search for the structure of new sequences. To identify the structure of remote homologs, the structure-sequence alignment has to consider not only sequence similarity but also spatially conserved conformations caused by residue interactions, and consequently is computationally intractable. It is difficult to cope with the inefficiency without compromising alignment accuracy, especially for structure search in genomes or large databases. This paper introduces a novel method and a parameterized algorithm for structure-sequence alignment. Both the structure and the sequence are represented as graphs, where in general the graph for a biopolymer structure has a naturally small tree width. The algorithm constructs an optimal alignment by finding in the sequence graph the maximum valued subgraph isomorphic to the structure graph. It has the computational time complexity O(k^t N^2) for the structure of N residues and its tree decomposition of width t. The parameter k, small in nature, is determined by a statistical cutoff for the correspondence between the structure and the sequence. The paper demonstrates a successful application of the algorithm to developing a fast program for RNA structural homology search.

The Computer Journal | 2008

Parameterized Complexity and Biopolymer Sequence Comparison

Liming Cai; Xiuzhen Huang; Chunmei Liu; Frances A. Rosamond; Yinglei Song

The paper surveys parameterized algorithms and complexities for computational tasks on biopolymer sequences, including the problems of longest common subsequence, shortest common supersequence, pairwise sequence alignment, multiple sequence alignment, structure–sequence alignment and structure–structure alignment. Algorithm techniques, built on the structural-unit level as well as on the residue level, are discussed.

International Conference on Computational Science | 2005

Profiling and searching for RNA pseudoknot structures in genomes

Chunmei Liu; Yinglei Song; Russell L. Malmberg; Liming Cai

A new method is developed that can profile and efficiently search for pseudoknot structures in noncoding RNA genes. It profiles interleaving stems in pseudoknot structures with independent Covariance Model (CM) components. The statistical alignment score for searching is obtained by combining the alignment scores from all CM components. Our experiments show that the model can achieve excellent accuracy on both random and biological data. The efficiency achieved by the method makes it possible to search for the pseudoknot structures in genomes of a variety of organisms.

Journal of Computer Science and Technology | 2005

RNA Structural Homology Search with a Succinct Stochastic Grammar Model

Yinglei Song; Jizhen Zhao; Chunmei Liu; Kan Liu; Russell L. Malmberg; Liming Cai

An increasing number of structural homology search tools, mostly based on profile stochastic context-free grammars (SCFGs), have been recently developed for non-coding RNA gene identification. SCFGs can include statistical biases that often occur in RNA sequences, necessary to profile specific RNA structures for structural homology search. In this paper, a succinct stochastic grammar model is introduced for RNA that has competitive search effectiveness. More importantly, the profiling model can be easily extended to include pseudoknots, structures that are beyond the capability of profile SCFGs. In addition, the model allows heuristics to be exploited, resulting in a significant speed-up for the CYK algorithm-based search.

International Journal of Bioinformatics Research and Applications | 2006

Memory efficient alignment between RNA sequences and stochastic grammar models of pseudoknots

Yinglei Song; Chunmei Liu; Russell L. Malmberg; Congzhou He; Liming Cai

Stochastic Context-Free Grammars (SCFGs) have been shown to be effective in modelling RNA secondary structure for searches. Our previous work (Cai et al., 2003) in Stochastic Parallel Communicating Grammar Systems (SPCGS) has extended SCFGs to model RNA pseudoknots. However, the alignment algorithm requires O(n^4) memory for a sequence of length n. In this paper, we develop a memory efficient algorithm for sequence-structure alignments including pseudoknots. This new algorithm reduces the memory space requirement from O(n^4) to O(n^2) without increasing the computation time. Our experiments have shown that this novel approach can achieve excellent performance on searching for RNA pseudoknots.

Workshop on Algorithms in Bioinformatics | 2006

Phylogenetic network inferences through efficient haplotyping

Yinglei Song; Chunmei Liu; Russell L. Malmberg; Liming Cai

The genotype phasing problem is to determine the haplotypes of diploid individuals from their genotypes where linkage relationships are not known. Based on the model of perfect phylogeny, the genotype phasing problem can be solved in linear time. However, recombinations may occur and the perfect phylogeny model thus cannot interpret genotype data with recombinations. This paper develops a graph theoretical approach that can reduce the problem to finding a subgraph pattern contained in a given graph. Based on ordered graph tree decomposition, this problem can be solved efficiently with a parameterized algorithm. Our tests on biological genotype data showed that this algorithm is extremely efficient and its interpretation accuracy is better than or comparable with that of other approaches.
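
For readers unfamiliar with the phasing problem stated in this abstract, the sketch below enumerates every haplotype pair consistent with a single genotype. It only illustrates the size of the search space and is not the graph-based algorithm of the paper. The encoding is an assumption (a common convention): 0 and 1 mark sites homozygous for allele 0 or 1, and 2 marks a heterozygous site. The function name `phasings` is hypothetical.

```python
from itertools import product
from typing import List, Tuple

Haplotype = Tuple[int, ...]


def phasings(genotype: List[int]) -> List[Tuple[Haplotype, Haplotype]]:
    """List all unordered haplotype pairs that explain the genotype.

    Homozygous sites (0 or 1) are copied to both haplotypes; each
    heterozygous site (2) receives allele 0 on one haplotype and 1 on the other.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1 = list(genotype)
        h2 = list(genotype)
        for site, allele in zip(het_sites, bits):
            h1[site], h2[site] = allele, 1 - allele
        # Sort the pair so that (h1, h2) and (h2, h1) count once.
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


for pair in phasings([0, 2, 2]):
    print(pair)
```

A genotype with h heterozygous sites has 2^(h-1) candidate pairs (for h ≥ 1), so brute-force enumeration across many individuals quickly becomes impractical. That growth is the motivation for structured models such as perfect phylogeny and the parameterized approach described above.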
