Publication


Featured research published by Guoli Wang.


Bioinformatics | 2003

PISCES: a protein sequence culling server.

Guoli Wang; Roland L. Dunbrack

PISCES is a public server for culling sets of protein sequences from the Protein Data Bank (PDB) by sequence identity and structural quality criteria. PISCES can provide lists culled from the entire PDB or from lists of PDB entries or chains provided by the user. The sequence identities are obtained from PSI-BLAST alignments with position-specific substitution matrices derived from the non-redundant protein sequence database. PISCES therefore provides better lists than servers that use BLAST, which is unable to identify many relationships below 40% sequence identity and often overestimates sequence identity by aligning only well-conserved fragments. PDB sequences are updated weekly. PISCES can also cull non-PDB sequences provided by the user as a list of GenBank identifiers, a FASTA format file, or BLAST/PSI-BLAST output.


Nucleic Acids Research | 2005

PISCES: recent improvements to a PDB sequence culling server

Guoli Wang; Roland L. Dunbrack

PISCES is a database server for producing lists of sequences from the Protein Data Bank (PDB) using a number of entry- and chain-specific criteria and mutual sequence identity. Our goal in culling the PDB is to provide the longest list possible of the highest resolution structures that fulfill the sequence identity and structural quality cut-offs. The new PISCES server uses a combination of PSI-BLAST and structure-based alignments to determine sequence identities. Structure alignment produces more complete alignments and therefore more accurate sequence identities than PSI-BLAST. PISCES now allows a user to cull the PDB by entry in addition to the standard culling by individual chains. In this mode, a list contains only entries in which no chain has a sequence identity above the cut-off to any chain in any other entry in the list. PISCES also provides fully annotated sequences including gene name and species. The server allows a user to cull an input list of entries or chains, so that other criteria, such as function, can be used. Results from a search on the re-engineered RCSB's site for the PDB can be entered into the PISCES server by a single click, combining the powerful searching abilities of the PDB with PISCES's utilities for sequence culling. The server's data are updated weekly. The server is available at .
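The culling goal described in this abstract (prefer the highest resolution structures, and keep no pair above the sequence identity cut-off) can be illustrated with a simple greedy pass. This is only a sketch under stated assumptions, not the PISCES implementation: the function name `cull_chains`, the dictionary inputs, and the default cut-offs are all hypothetical, and a greedy pass is not guaranteed to give the longest possible list.

```python
from typing import Dict, FrozenSet, List


def cull_chains(
    resolution: Dict[str, float],
    identity: Dict[FrozenSet[str], float],
    max_identity: float = 25.0,
    max_resolution: float = 2.0,
) -> List[str]:
    """Greedy culling sketch (hypothetical helper, not the PISCES code).

    resolution maps a chain ID to its resolution in angstroms.
    identity maps an unordered pair of chain IDs to percent sequence
    identity; pairs that are absent are treated as unrelated (0%).
    """
    # Apply the structural quality cut-off, then visit the best
    # (lowest) resolution first; ties are broken by chain ID.
    candidates = sorted(
        (c for c, r in resolution.items() if r <= max_resolution),
        key=lambda c: (resolution[c], c),
    )
    kept: List[str] = []
    for chain in candidates:
        # Keep a chain only if it is below the cut-off to every kept chain.
        if all(
            identity.get(frozenset((chain, other)), 0.0) <= max_identity
            for other in kept
        ):
            kept.append(chain)
    return kept
```

With made-up inputs, a chain that is 90% identical to a better-resolved chain is dropped, while an unrelated chain that passes the resolution cut-off is kept.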


Protein Science | 2004

Scoring profile-to-profile sequence alignments

Guoli Wang; Roland L. Dunbrack

Sequence alignment profiles have been shown to be very powerful in creating accurate sequence alignments. Profiles are often used to search a sequence database with a local alignment algorithm. More accurate and longer alignments have been obtained with profile‐to‐profile comparison. There are several steps that must be performed in creating profile–profile alignments, and each involves choices in parameters and algorithms. These steps include (1) what sequences to include in a multiple alignment used to build each profile, (2) how to weight similar sequences in the multiple alignment and how to determine amino acid frequencies from the weighted alignment, (3) how to score a column from one profile aligned to a column of the other profile, (4) how to score gaps in the profile–profile alignment, and (5) how to include structural information. Large‐scale benchmarks consisting of pairs of homologous proteins with structurally determined sequence alignments are necessary for evaluating the efficacy of each scoring scheme. With such a benchmark, we have investigated the properties of profile–profile alignments and found that (1) with optimized gap penalties, most column–column scoring functions behave similarly to one another in alignment accuracy; (2) some functions, however, have much higher search sensitivity and specificity; (3) position‐specific weighting schemes in determining amino acid counts in columns of multiple sequence alignments are better than sequence‐specific schemes; (4) removing positions in the profile with gaps in the query sequence results in better alignments; and (5) adding predicted and known secondary structure information improves alignments.
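Steps (2) and (3) listed in this abstract, turning a weighted alignment column into amino acid frequencies and scoring one column against another, can be sketched as follows. The abstract does not name the scoring functions that were compared, so the dot product used here is an assumed, commonly used choice, and the names `column_frequencies` and `dot_product_score` are hypothetical. The sequence weights are supplied by the caller; no particular weighting scheme is implemented.

```python
from collections import defaultdict
from typing import Dict, Sequence


def column_frequencies(residues: str, weights: Sequence[float]) -> Dict[str, float]:
    """Weighted amino acid frequencies for one alignment column.

    residues holds one character per aligned sequence ('-' marks a gap);
    weights holds the corresponding sequence weights.
    """
    totals: Dict[str, float] = defaultdict(float)
    for aa, weight in zip(residues, weights):
        if aa != "-":  # gaps do not contribute to the counts
            totals[aa] += weight
    norm = sum(totals.values())
    if norm <= 0:
        return {}
    return {aa: total / norm for aa, total in totals.items()}


def dot_product_score(freq_a: Dict[str, float], freq_b: Dict[str, float]) -> float:
    """Score two profile columns as the dot product of their frequencies."""
    return sum(p * freq_b.get(aa, 0.0) for aa, p in freq_a.items())
```

A score of this kind would then fill the match term of a dynamic programming alignment, with gap penalties tuned separately, as the abstract notes.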


PLOS Computational Biology | 2010

Neighbor-Dependent Ramachandran Probability Distributions of Amino Acids Developed from a Hierarchical Dirichlet Process Model

Daniel Ting; Guoli Wang; Maxim V. Shapovalov; Rajib Mitra; Michael I. Jordan; Roland L. Dunbrack

Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp.


Journal of Molecular Biology | 2008

Statistical analysis of interface similarity in crystals of homologous proteins

Qifang Xu; Adrian A. Canutescu; Guoli Wang; Maxim V. Shapovalov; Zoran Obradovic; Roland L. Dunbrack

Many proteins function as homo-oligomers and are regulated via their oligomeric state. For some proteins, the stoichiometry of homo-oligomeric states under various conditions has been studied using gel filtration or analytical ultracentrifugation experiments. The interfaces involved in these assemblies may be identified using cross-linking and mass spectrometry, solution-state NMR, and other experiments. However, for most proteins, the actual interfaces that are involved in oligomerization are inferred from X-ray crystallographic structures using assumptions about interface surface areas and physical properties. Examination of interfaces across different Protein Data Bank (PDB) entries in a protein family reveals several important features. First, similarities in space group, asymmetric unit size, and cell dimensions and angles (within 1%) do not guarantee that two crystals are actually the same crystal form, containing similar relative orientations and interactions within the crystal. Conversely, two crystals in different space groups may be quite similar in terms of all the interfaces within each crystal. Second, NMR structures and an existing benchmark of PDB crystallographic entries consisting of 126 dimers as well as larger structures and 132 monomers were used to determine whether the existence or lack of common interfaces across multiple crystal forms can be used to predict whether a protein is an oligomer or not. Monomeric proteins tend to have common interfaces across only a minority of crystal forms, whereas higher-order structures exhibit common interfaces across a majority of available crystal forms. The data can be used to estimate the probability that an interface is biological if two or more crystal forms are available. Finally, the Protein Interfaces, Surfaces, and Assemblies (PISA) database available from the European Bioinformatics Institute is more consistent in identifying interfaces observed in many crystal forms than the PDB and the European Bioinformatics Institute's Protein Quaternary Server (PQS). The PDB, in particular, is missing highly likely biological interfaces in its biological unit files for about 10% of PDB entries.


Journal of Biological Chemistry | 2008

Oligomerization of BAK by p53 Utilizes Conserved Residues of the p53 DNA Binding Domain

E. Christine Pietsch; Erin Perchiniak; Adrian A. Canutescu; Guoli Wang; Roland L. Dunbrack; Maureen E. Murphy

Genotoxic stress triggers a rapid translocation of p53 to the mitochondria, contributing to apoptosis in a transcription-independent manner. Using immunopurification protocols and mass spectrometry, we previously identified the proapoptotic protein BAK as a mitochondrial p53-binding protein and showed that recombinant p53 directly binds to BAK and can induce its oligomerization, leading to cytochrome c release. In this work we describe a combination of molecular modeling, electrostatic analysis, and site-directed mutagenesis to define contact residues between BAK and p53. Our data indicate that three regions within the core DNA binding domain of p53 make contact with BAK; these are the conserved H2 α-helix and the L1 and L3 loops. Notably, point mutations in these regions markedly impair the ability of p53 to oligomerize BAK and to induce transcription-independent cell death. We present a model whereby positively charged residues within the H2 helix and L1 loop of p53 interact with an electronegative domain on the N-terminal α-helix of BAK; the latter is known to undergo conformational changes upon BAK activation. We show that mutation of acidic residues in the N-terminal helix impairs the ability of BAK to bind to p53. Interestingly, many of the p53 contact residues predicted by our model are also direct DNA contact residues, suggesting that p53 interacts with BAK in a manner analogous to DNA. The combined data point to the H2 helix and L1 and L3 loops of p53 as novel functional domains contributing to transcription-independent apoptosis by this tumor suppressor protein.


BMC Bioinformatics | 2006

LS-NMF: A modified non-negative matrix factorization algorithm utilizing uncertainty estimates

Guoli Wang; Andrew V. Kossenkov; Michael F. Ochs

Background: Non-negative matrix factorization (NMF), a machine learning algorithm, has been applied to the analysis of microarray data. A key feature of NMF is the ability to identify patterns that together explain the data as a linear combination of expression signatures. Microarray data generally include individual estimates of uncertainty for each gene in each condition; however, NMF does not exploit this information. Previous work has shown that such uncertainties can be extremely valuable for pattern recognition. Results: We have created a new algorithm, least squares non-negative matrix factorization (LS-NMF), which integrates uncertainty measurements of gene expression data into NMF updating rules. While the LS-NMF algorithm maintains the advantages of the original NMF algorithm, such as easy implementation and a guaranteed locally optimal solution, the performance in terms of linking functionally related genes has been improved. LS-NMF exceeds NMF significantly in terms of identifying functionally related genes as determined from annotations in the MIPS database. Conclusion: Uncertainty measurements on gene expression data provide valuable information for data analysis, and use of this information in the LS-NMF algorithm significantly improves the power of the NMF technique.
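The idea of folding per-measurement uncertainties into the NMF updating rules can be sketched with the standard weighted multiplicative updates, which minimize the sum of squared residuals divided by the squared uncertainties. The abstract does not give the exact update equations, so treating these updates as LS-NMF is an assumption; the function names, the random initialization, and the small `eps` guard are illustrative choices. Plain Python lists are used only to keep the sketch dependency-free.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def elementwise(a, b):
    """Element-by-element product of two equally shaped matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def weighted_error(v, sigma, w, h):
    """Sum over all entries of ((V - WH) / sigma) squared."""
    wh = matmul(w, h)
    return sum(
        ((v[i][j] - wh[i][j]) / sigma[i][j]) ** 2
        for i in range(len(v))
        for j in range(len(v[0]))
    )


def ls_nmf(v, sigma, rank, iterations=200, seed=0, eps=1e-12):
    """Uncertainty-weighted NMF sketch: V is approximated by W times H."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    u = [[1.0 / s ** 2 for s in row] for row in sigma]  # weights 1/sigma^2
    uv = elementwise(u, v)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T (U*V)) / (W^T (U*(WH)))
        wt = transpose(w)
        num, den = matmul(wt, uv), matmul(wt, elementwise(u, matmul(w, h)))
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * ((U*V) H^T) / ((U*(WH)) H^T)
        ht = transpose(h)
        num, den = matmul(uv, ht), matmul(elementwise(u, matmul(w, h)), ht)
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h
```

Because every update multiplies by a ratio of non-negative quantities, W and H stay non-negative, and entries with large uncertainty contribute less to the fit.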


Proteins | 2001

Fold recognition and accurate query‐template alignment by a combination of PSI‐BLAST and threading

Yibing Shan; Guoli Wang; Huan-Xiang Zhou

A homology‐based structure prediction method ideally gives both a correct fold assignment and an accurate query‐template alignment. In this article we show that the combination of two existing methods, PSI‐BLAST and threading, leads to significant enhancement in the success rate of fold recognition. The combined approach, termed COBLATH, also yields much higher alignment accuracy than found in previous studies. It consists of two‐way searches both by PSI‐BLAST and by threading. In the PSI‐BLAST portion, a query is used to search for hits in a library of potential templates and, conversely, each potential template is used to search for hits in a library of queries. In the threading portion, the scoring function is the sum of a sequence profile and a 6×6 substitution matrix between predicted query and known template secondary structure and solvent exposure. “Two‐way” in threading means that the query's sequence profile is used to match the sequences of all potential templates and the sequence profiles of all potential templates are used to match the query's sequence. When tested on a set of 533 nonhomologous proteins, COBLATH was able to assign folds for 390 (73%). Among these 390 queries, 265 (68%) had root‐mean‐square deviations (RMSDs) of less than 8 Å between predicted and actual structures. Such a high success rate and accuracy make COBLATH an ideal tool for structural genomics. Proteins 2001;42:23–37.


Cell Biochemistry and Biophysics | 2001

Predicted structures of two proteins involved in human diseases

Huan-Xiang Zhou; Guoli Wang

Structures of 79 proteins involved in human diseases were predicted by sequence alignments with structural templates. The predicted structures for ALDP and CSA, proteins responsible for adrenoleukodystrophy and the Cockayne syndrome, respectively, were analyzed to elucidate the molecular basis of disease mutations. In particular we positioned residue P484 of ALDP in the homodimer interface. This positioning is consistent with a recent experimental finding that the mutation P484R significantly decreases the self-interaction of ALDP and suggests that the disease mechanism of this mutation lies in the impaired ALDP dimerization. We identified two new WD repeats in CSA and suggest that one of these forms part of the interaction surface with other proteins.


Bioinformatics | 2005

Quasi-consensus-based comparison of profile hidden Markov models for protein sequences

Robel Y. Kahsay; Guoli Wang; Guang R. Gao; Li Liao; Roland L. Dunbrack

A simple approach for the sensitive detection of distant relationships among protein families and for sequence-structure alignment via comparison of hidden Markov models based on their quasi-consensus sequences is presented. Using a previously published benchmark dataset, the approach is demonstrated to give better homology detection and yield alignments with improved accuracy in comparison to an existing state-of-the-art dynamic programming profile-profile comparison method. This method also runs significantly faster and is therefore suitable for a server covering the rapidly increasing structure database. A server based on this method is available at http://liao.cis.udel.edu/website/servers/modmod

Collaboration



Top Co-Authors


Daniel Ting

University of California
