Publication


Featured research published by Qifang Xu.


Journal of Molecular Biology | 2008

Statistical analysis of interface similarity in crystals of homologous proteins

Qifang Xu; Adrian A. Canutescu; Guoli Wang; Maxim V. Shapovalov; Zoran Obradovic; Roland L. Dunbrack

Many proteins function as homo-oligomers and are regulated via their oligomeric state. For some proteins, the stoichiometry of homo-oligomeric states under various conditions has been studied using gel filtration or analytical ultracentrifugation experiments. The interfaces involved in these assemblies may be identified using cross-linking and mass spectrometry, solution-state NMR, and other experiments. However, for most proteins, the actual interfaces that are involved in oligomerization are inferred from X-ray crystallographic structures using assumptions about interface surface areas and physical properties. Examination of interfaces across different Protein Data Bank (PDB) entries in a protein family reveals several important features. First, similarities in space group, asymmetric unit size, and cell dimensions and angles (within 1%) do not guarantee that two crystals are actually the same crystal form, containing similar relative orientations and interactions within the crystal. Conversely, two crystals in different space groups may be quite similar in terms of all the interfaces within each crystal. Second, NMR structures and an existing benchmark of PDB crystallographic entries consisting of 126 dimers as well as larger structures and 132 monomers were used to determine whether the existence or lack of common interfaces across multiple crystal forms can be used to predict whether a protein is an oligomer or not. Monomeric proteins tend to have common interfaces across only a minority of crystal forms, whereas higher-order structures exhibit common interfaces across a majority of available crystal forms. The data can be used to estimate the probability that an interface is biological if two or more crystal forms are available. 
Finally, the Protein Interfaces, Surfaces, and Assemblies (PISA) database available from the European Bioinformatics Institute is more consistent in identifying interfaces observed in many crystal forms compared with the PDB and the European Bioinformatics Institute's Protein Quaternary Server (PQS). The PDB, in particular, is missing highly likely biological interfaces in its biological unit files for about 10% of PDB entries.


Nucleic Acids Research | 2011

The protein common interface database (ProtCID)—a comprehensive database of interactions of homologous proteins in multiple crystal forms

Qifang Xu; Roland L. Dunbrack

The protein common interface database (ProtCID) is a database that contains clusters of similar homodimeric and heterodimeric interfaces observed in multiple crystal forms (CFs). Such interfaces, especially of homologous but non-identical proteins, have been associated with biologically relevant interactions. In ProtCID, protein chains in the Protein Data Bank (PDB) are grouped based on their PFAM domain architectures. For a single PFAM architecture, all the dimers present in each CF are constructed and compared with those in other CFs that contain the same domain architecture. Interfaces occurring in two or more CFs comprise an interface cluster in the database. The same process is used to compare heterodimers of chains with different domain architectures. By examining interfaces that are shared by many homologous proteins in different CFs, we find that the PDB and the Protein Interfaces, Surfaces, and Assemblies (PISA) are not always consistent in their annotations of biological assemblies in a homologous family. Our data therefore provide an independent check on publicly available annotations of the structures of biological interactions for PDB entries. Common interfaces may also be useful in studies of protein evolution. Coordinates for all interfaces in a cluster are downloadable for further analysis. ProtCID is available at http://dunbrack2.fccc.edu/protcid.
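The rule stated in the abstract, that interfaces occurring in two or more CFs form a cluster, can be sketched as follows. This is only an illustration under stated assumptions: the function name, the identifiers, and the use of single-linkage grouping over a precomputed list of "similar" pairs are hypothetical and are not taken from ProtCID, whose actual similarity measure and clustering procedure are not described here.

```python
from collections import defaultdict


def cluster_common_interfaces(interfaces, similar_pairs, min_crystal_forms=2):
    """Group similar interfaces and keep groups seen in enough crystal forms.

    interfaces: dict mapping a (hypothetical) interface id to its crystal form id.
    similar_pairs: iterable of (id_a, id_b) pairs already judged similar;
        how similarity is decided is outside this sketch.
    """
    parent = {name: name for name in interfaces}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Single-linkage grouping (an assumption, not ProtCID's documented method).
    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for name in interfaces:
        groups[find(name)].add(name)

    clusters = []
    for members in groups.values():
        crystal_forms = {interfaces[m] for m in members}
        # Keep only groups observed in at least `min_crystal_forms` distinct CFs.
        if len(crystal_forms) >= min_crystal_forms:
            clusters.append(sorted(members))
    return sorted(clusters)


# Made-up identifiers for illustration only.
example_interfaces = {
    "entryA_1": "CF1",
    "entryA_2": "CF1",
    "entryB_1": "CF2",
    "entryC_1": "CF3",
}
example_pairs = [("entryA_1", "entryB_1"), ("entryB_1", "entryC_1")]
clusters = cluster_common_interfaces(example_interfaces, example_pairs)
```

In this toy input, the three linked interfaces span three crystal forms and form one cluster, while "entryA_2" is dropped because it is seen in a single CF.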


Nucleic Acids Research | 2015

PyIgClassify: a database of antibody CDR structural classifications

Jared Adolf-Bryfogle; Qifang Xu; Benjamin North; Andreas Lehmann; Roland L. Dunbrack

Classification of the structures of the complementarity determining regions (CDRs) of antibodies is critically important for antibody structure prediction and computational design. We have previously performed a clustering of antibody CDR conformations and defined a systematic nomenclature consisting of the CDR, length and an integer starting from the largest to the smallest cluster in the data set (e.g. L1-11-1). We present PyIgClassify (for Python-based immunoglobulin classification; available at http://dunbrack2.fccc.edu/pyigclassify/), a database and web server that provides access to assignments of all CDR structures in the PDB to our classification system. The database includes assignments to the IMGT germline V regions for heavy and light chains for several species. For humanized antibodies, the assignment of the frameworks is to human germlines and the CDRs to the germlines of mice or other species sources. The database can be searched by PDB entry, cluster identifier and IMGT germline group (e.g. human IGHV1). The entire database is downloadable so that users may filter the data as needed for antibody structure analysis, prediction and design.
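The nomenclature described above (CDR, length, and an integer cluster rank, e.g. L1-11-1) lends itself to a small parser. The sketch below is an assumption-laden illustration: the names `CdrCluster` and `parse_cluster_id` are hypothetical, it is not part of PyIgClassify, and it handles only the simple three-part form given in the abstract, with the CDR restricted to H1-H3 and L1-L3.

```python
import re
from typing import NamedTuple


class CdrCluster(NamedTuple):
    cdr: str      # which CDR, e.g. "L1"
    length: int   # CDR length in residues
    rank: int     # cluster number, 1 being the largest cluster


# Only the plain "CDR-length-integer" form from the abstract is accepted.
_CLUSTER_ID = re.compile(r"^([HL][123])-(\d+)-(\d+)$")


def parse_cluster_id(identifier: str) -> CdrCluster:
    """Split an identifier such as 'L1-11-1' into its three components."""
    match = _CLUSTER_ID.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a simple CDR cluster identifier: {identifier!r}")
    cdr, length, rank = match.groups()
    return CdrCluster(cdr, int(length), int(rank))


parsed = parse_cluster_id("L1-11-1")
```

Identifiers that do not match the three-part pattern raise a `ValueError` rather than being guessed at.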


Proteins | 2013

Prediction of phenotypes of missense mutations in human proteins from biological assemblies.

Qiong Wei; Qifang Xu; Roland L. Dunbrack

Single nucleotide polymorphisms (SNPs) are the most frequent variation in the human genome. Nonsynonymous SNPs that lead to missense mutations can be neutral or deleterious, and several computational methods have been presented that predict the phenotype of human missense mutations. These methods use sequence‐based and structure‐based features in various combinations, relying on different statistical distributions of these features for deleterious and neutral mutations. One structure‐based feature that has not been studied significantly is the accessible surface area within biologically relevant oligomeric assemblies. These assemblies are different from the crystallographic asymmetric unit for more than half of X‐ray crystal structures. We find that mutations in the core of proteins or in the interfaces in biological assemblies are significantly more likely to be disease‐associated than those on the surface of the biological assemblies. For structures with more than one protein in the biological assembly (whether the same sequence or different), we find the accessible surface area from biological assemblies provides a statistically significant improvement in prediction over the accessible surface area of monomers from protein crystal structures (P = 6e‐5). When adding this information to sequence‐based features such as the difference between wildtype and mutant position‐specific profile scores, the improvement from biological assemblies is statistically significant but much smaller (P = 0.018). Combining this information with sequence‐based features in a support vector machine leads to 82% accuracy on a balanced dataset of 50% disease‐associated mutations from SwissVar and 50% neutral mutations from human/primate sequence differences in orthologous proteins. Proteins 2013.


Proteins | 2009

An unusually small dimer interface is observed in all available crystal structures of cytosolic sulfotransferases

Brian D. Weitzner; Thomas Meehan; Qifang Xu; Roland L. Dunbrack

Cytosolic sulfotransferases catalyze the sulfonation of hormones, metabolites, and xenobiotics. Many of these proteins have been shown to form homodimers and heterodimers. An unusually small dimer interface was previously identified by Petrotchenko et al. (FEBS Lett 2001;490:39–43) by cross‐linking, protease digestion, and mass spectrometry and verified by site‐directed mutagenesis. Analysis of the crystal packing interfaces in all 28 available crystal structures consisting of 17 crystal forms shows that this interface occurs in all of them. With a small number of exceptions, the publicly available databases of biological assemblies contain either monomers or incorrect dimers. Even crystal structures of mouse SULT1E1, which is a monomer in solution, contain the common dimeric interface, although distorted and missing two important salt bridges. Proteins 2009.


PLOS ONE | 2014

BioAssemblyModeler (BAM): User-Friendly Homology Modeling of Protein Homo- and Heterooligomers

Maxim V. Shapovalov; Qiang Wang; Qifang Xu; Mark Andrake; Roland L. Dunbrack

Many if not most proteins function in oligomeric assemblies of one or more protein sequences. The Protein Data Bank provides coordinates for biological assemblies for each entry, at least 60% of which are dimers or larger assemblies. BioAssemblyModeler (BAM) is a graphical user interface to the basic steps in homology modeling of protein homooligomers and heterooligomers from the biological assemblies provided in the PDB. BAM takes as input up to six different protein sequences and begins by assigning Pfam domains to the target sequences. The program utilizes a complete assignment of Pfam domains to sequences in the PDB, PDBfam (http://dunbrack2.fccc.edu/protcid/pdbfam), to obtain templates that contain any or all of the domains assigned to the target sequence(s). The contents of the biological assemblies of potential templates are provided, and alignments of the target sequences to the templates are produced with a profile-profile alignment algorithm. BAM provides for visual examination and mouse-editing of the alignments supported by target and template secondary structure information and a 3D viewer of the template biological assembly. Side-chain coordinates for a model of the biological assembly are built with the program SCWRL4. A built-in protocol navigation system guides the user through all stages of homology modeling from input sequences to a three-dimensional model of the target complex. Availability: http://dunbrack.fccc.edu/BAM.


Proteins | 2016

Biological function derived from predicted structures in CASP11

Peter J. Huwe; Qifang Xu; Maxim V. Shapovalov; Vivek Modi; Mark Andrake; Roland L. Dunbrack

In CASP11, the organizers sought to bring the biological inferences from predicted structures to the fore. To accomplish this, we assessed the models for their ability to perform quantifiable tasks related to biological function. First, for 10 targets that were probable homodimers, we measured the accuracy of docking the models into homodimers as a function of GDT‐TS of the monomers, which produced characteristic L‐shaped plots. At low GDT‐TS, none of the models could be docked correctly as homodimers. Above GDT‐TS of ∼60%, some models formed correct homodimers in one of the largest docked clusters, while many other models at the same values of GDT‐TS did not. Docking was more successful when many of the templates shared the same homodimer. Second, we docked a ligand from an experimental structure into each of the models of one of the targets. Docking to the models with two different programs produced poor ligand RMSDs with the experimental structure. Measures that evaluated similarity of contacts were reasonable for some of the models, although there was not a significant correlation with model accuracy. Finally, we assessed whether models would be useful in predicting the phenotypes of missense mutations in three human targets by comparing features calculated from the models with those calculated from the experimental structures. The models were successful in reproducing accessible surface areas but there was little correlation of model accuracy with calculation of FoldX evaluation of the change in free energy between the wild‐type and the mutant. Proteins 2016; 84(Suppl 1):370–391.


International Conference on Information Fusion | 2005

Improving aerosol retrieval accuracy by integrating AERONET, MISR and MODIS data

Qifang Xu; Zoran Obradovic; Bo Han; Yong Li; Amy Braverman; Slobodan Vucetic

Retrieval of aerosol optical thickness (AOT) by ground- and satellite-based remote sensing provides different accuracy, coverage, and resolution. An important challenge is how to best utilize information from multiple instruments to further improve the quality of retrievals. In this study, we explored whether the accuracy of AOT retrievals could be improved by fusion of ground- and satellite-based data using neural network techniques. MISR and MODIS satellite data were obtained for several 16-day periods during 2002 and 2003 covering the continental USA. These data are joined spatially and temporally with AOT measurements from 34 AERONET ground-based stations over the continental USA. The R² accuracies of MODIS and MISR retrievals were estimated at 0.57 and 0.66, when AERONET AOT is used as the ground truth. When radiance and geometric attributes are used together with MISR and MODIS AOT as attributes for prediction of AERONET AOT, the R² accuracy was increased up to 10%.
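The accuracy figures above are reported as a coefficient of determination (R²) against AERONET AOT as ground truth. A minimal sketch of that metric follows, assuming the standard definition 1 - SS_res / SS_tot; the abstract does not state the exact formula used, and the function name `r_squared` and the sample values are illustrative, not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination of `predicted` against `observed`."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    # Total variation of the ground-truth values around their mean.
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    # Residual variation left unexplained by the retrievals.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_total == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_residual / ss_total


# Made-up numbers standing in for ground-truth and retrieved AOT values.
ground_truth = [1.0, 2.0, 3.0]
perfect = r_squared(ground_truth, [1.0, 2.0, 3.0])
mean_only = r_squared(ground_truth, [2.0, 2.0, 2.0])
```

A retrieval that matches the ground truth exactly scores 1, and one that only reproduces the mean scores 0, which gives a sense of scale for values such as 0.57 and 0.66.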


Human Mutation | 2017

Benchmarking predictions of allostery in liver pyruvate kinase in CAGI4

Qifang Xu; Qingling Tang; Panagiotis Katsonis; Olivier Lichtarge; David Jones; Samuele Bovo; Giulia Babbi; Pier Luigi Martelli; Rita Casadio; Gyu Rie Lee; Chaok Seok; Aron W. Fenton; Roland L. Dunbrack

The Critical Assessment of Genome Interpretation (CAGI) is a global community experiment to objectively assess computational methods for predicting phenotypic impacts of genomic variation. One of the 2015–2016 competitions focused on predicting the influence of mutations on the allosteric regulation of human liver pyruvate kinase. More than 30 different researchers accessed the challenge data. However, only four groups accepted the challenge. Features used for predictions ranged from evolutionary constraints and mutant site locations relative to active and effector binding sites to computational docking outputs. Despite the range of expertise and strategies used by predictors, the best predictions were marginally greater than random for modified allostery resulting from mutations. In contrast, several groups successfully predicted which mutations severely reduced enzymatic activity. Nonetheless, the poor predictions of allostery stand in stark contrast to the impression left by more than 700 PubMed entries identified using the identifiers “computational + allosteric.” This contrast highlights a specialized need for new computational tools and utilization of benchmarks that focus on allosteric regulation.


Human Mutation | 2017

Performance of in silico tools for the evaluation of p16INK4a (CDKN2A) variants in CAGI

Marco Carraro; Giovanni Minervini; Manuel Giollo; Yana Bromberg; Emidio Capriotti; Rita Casadio; Roland L. Dunbrack; Lisa Elefanti; P. Fariselli; Carlo Ferrari; Julian Gough; Panagiotis Katsonis; Emanuela Leonardi; Olivier Lichtarge; Chiara Menin; Pier Luigi Martelli; Abhishek Niroula; Lipika R. Pal; Susanna Repo; Maria Chiara Scaini; Mauno Vihinen; Qiong Wei; Qifang Xu; Yuedong Yang; Yizhou Yin; Jan Zaucha; Huiying Zhao; Yaoqi Zhou; Steven E. Brenner; John Moult

Correct phenotypic interpretation of variants of unknown significance for cancer‐associated genes is a diagnostic challenge as genetic screenings gain in popularity in the next‐generation sequencing era. The Critical Assessment of Genome Interpretation (CAGI) experiment aims to test and define the state of the art of genotype–phenotype interpretation. Here, we present the assessment of the CAGI p16INK4a challenge. Participants were asked to predict the effect on cellular proliferation of 10 variants for the p16INK4a tumor suppressor, a cyclin‐dependent kinase inhibitor encoded by the CDKN2A gene. Twenty‐two pathogenicity predictors were assessed with a variety of accuracy measures for reliability in a medical context. Different assessment measures were combined in an overall ranking to provide more robust results. The R scripts used for assessment are publicly available from a GitHub repository for future use in similar assessment exercises. Despite a limited test‐set size, our findings show a variety of results, with some methods performing significantly better. Methods combining different strategies frequently outperform simpler approaches. The best predictor, Yang&Zhou lab, uses a machine learning method combining an empirical energy function measuring protein stability with an evolutionary conservation term. The p16INK4a challenge highlights how subtle structural effects can neutralize otherwise deleterious variants.

Collaboration


Top co-authors of Qifang Xu.

Mark Andrake

Fox Chase Cancer Center

Olivier Lichtarge

Baylor College of Medicine

Qiong Wei

Fox Chase Cancer Center