Qiangfeng Cliff Zhang
Tsinghua University
Publication
Featured research published by Qiangfeng Cliff Zhang.
Nature | 2012
Qiangfeng Cliff Zhang; Donald Petrey; Lei Deng; Li Qiang; Yu Shi; Chan Aye Thu; Brygida Bisikirska; Celine Lefebvre; Domenico Accili; Tony Hunter; Tom Maniatis; Barry Honig
The genome-wide identification of pairs of interacting proteins is an important step in the elucidation of cell regulatory mechanisms. Much of our present knowledge derives from high-throughput techniques such as the yeast two-hybrid assay and affinity purification, as well as from manual curation of experiments on individual systems. A variety of computational approaches based, for example, on sequence homology, gene co-expression and phylogenetic profiles, have also been developed for the genome-wide inference of protein–protein interactions (PPIs). Yet comparative studies suggest that the development of accurate and complete repertoires of PPIs is still in its early stages. Here we show that three-dimensional structural information can be used to predict PPIs with an accuracy and coverage that are superior to predictions based on non-structural evidence. Moreover, an algorithm, termed PrePPI, which combines structural information with other functional clues, is comparable in accuracy to high-throughput experiments, yielding over 30,000 high-confidence interactions for yeast and over 300,000 for human. Experimental tests of a number of predictions demonstrate the ability of the PrePPI algorithm to identify unexpected PPIs of considerable biological interest. The surprising effectiveness of three-dimensional structural information can be attributed to the use of homology models combined with the exploitation of both close and remote geometric relationships between proteins.
Nature | 2014
Yue Wan; Kun Qu; Qiangfeng Cliff Zhang; Ryan A. Flynn; Ohad Manor; Zhengqing Ouyang; Jiajing Zhang; Robert C. Spitale; Michael Snyder; Eran Segal; Howard Y. Chang
In parallel to the genetic code for protein synthesis, a second layer of information is embedded in all RNA transcripts in the form of RNA structure. RNA structure influences practically every step in the gene expression program. However, the nature of most RNA structures or effects of sequence variation on structure are not known. Here we report the initial landscape and variation of RNA secondary structures (RSSs) in a human family trio (mother, father and their child). This provides a comprehensive RSS map of human coding and non-coding RNAs. We identify unique RSS signatures that demarcate open reading frames and splicing junctions, and define authentic microRNA-binding sites. Comparison of native deproteinized RNA isolated from cells versus refolded purified RNA suggests that the majority of the RSS information is encoded within RNA sequence. Over 1,900 transcribed single nucleotide variants (approximately 15% of all transcribed single nucleotide variants) alter local RNA structure. We discover simple sequence and spacing rules that determine the ability of point mutations to impact RSSs. Selective depletion of ‘riboSNitches’ versus structurally synonymous variants at precise locations suggests selection for specific RNA shapes at thousands of sites, including 3′ untranslated regions, binding sites of microRNAs and RNA-binding proteins genome-wide. These results highlight the potentially broad contribution of RNA structure and its variation to gene regulation.
Nature | 2015
Robert C. Spitale; Ryan A. Flynn; Qiangfeng Cliff Zhang; Pete Crisalli; Byron K. Lee; Jong-Wha Jung; Hannes Y. Kuchelmeister; Pedro J. Batista; Eduardo A. Torre; Eric T. Kool; Howard Y. Chang
Visualizing the physical basis for molecular behaviour inside living cells is a great challenge for biology. RNAs are central to biological regulation, and the ability of RNA to adopt specific structures intimately controls every step of the gene expression program. However, our understanding of physiological RNA structures is limited; current in vivo RNA structure profiles include only two of the four nucleotides that make up RNA. Here we present a novel biochemical approach, in vivo click selective 2′-hydroxyl acylation and profiling experiment (icSHAPE), which enables the first global view, to our knowledge, of RNA secondary structures in living cells for all four bases. icSHAPE of the mouse embryonic stem cell transcriptome versus purified RNA folded in vitro shows that the structural dynamics of RNA in the cellular environment distinguish different classes of RNAs and regulatory elements. Structural signatures at translational start sites and ribosome pause sites are conserved from in vitro conditions, suggesting that these RNA elements are programmed by sequence. In contrast, focal structural rearrangements in vivo reveal precise interfaces of RNA with RNA-binding proteins or RNA-modification sites that are consistent with atomic-resolution structural data. Such dynamic structural footprints enable accurate prediction of RNA–protein interactions and N6-methyladenosine (m6A) modification genome wide. These results open the door for structural genomics of RNA in living cells and reveal key physiological structures controlling gene expression.
Science | 2013
Maya Kasowski; Sofia Kyriazopoulou-Panagiotopoulou; Fabian Grubert; Judith B. Zaugg; Anshul Kundaje; Yuling Liu; Alan P. Boyle; Qiangfeng Cliff Zhang; Fouad Zakharia; Damek V. Spacek; Jingjing Li; Dan Xie; Anthony O. Olarerin-George; Lars M. Steinmetz; John B. Hogenesch; Manolis Kellis; Serafim Batzoglou; Michael Snyder
DNA Differences. The extent to which genetic variation affects an individual's phenotype has been difficult to predict because the majority of variation lies outside the coding regions of genes. Now, three studies examine the extent to which genetic variation affects the chromatin of individuals with diverse ancestry and genetic variation (see the Perspective by Furey and Sethupathy). Kasowski et al. (p. 750, published online 17 October) examined how genetic variation affects differences in chromatin states and their correlation to histone modifications, as well as more general DNA binding factors. Kilpinen et al. (p. 744, published online 17 October) document how genetic variation is linked to allelic specificity in transcription factor binding, histone modifications, and transcription. McVicker et al. (p. 747, published online 17 October) identified how quantitative trait loci affect histone modifications in Yoruban individuals and established which specific transcription factors affect such modifications. Variability among humans with different ancestry affects chromatin states and gene expression. The majority of disease-associated variants lie outside protein-coding regions, suggesting a link between variation in regulatory regions and disease predisposition. We studied differences in chromatin states using five histone modifications, cohesin, and CTCF in lymphoblastoid lines from 19 individuals of diverse ancestry. We found extensive signal variation in regulatory regions, which often switch between active and repressed states across individuals. Enhancer activity is particularly diverse among individuals, whereas gene expression remains relatively stable. Chromatin variability shows genetic inheritance in trios, correlates with genetic variation and population divergence, and is associated with disruptions of transcription factor binding motifs. Overall, our results provide insights into chromatin variation among humans.
Proceedings of the National Academy of Sciences of the United States of America | 2010
Qiangfeng Cliff Zhang; Donald Petrey; Raquel Norel; Barry Honig
With the advent of Systems Biology, the prediction of whether two proteins form a complex has become a problem of increased importance. A variety of experimental techniques have been applied to the problem, but three-dimensional structural information has not been widely exploited. Here we explore the range of applicability of such information by analyzing the extent to which the location of binding sites on protein surfaces is conserved among structural neighbors. We find, as expected, that interface conservation is most significant among proteins that have a clear evolutionary relationship, but that there is a significant level of conservation even among remote structural neighbors. This finding is consistent with recent evidence that information available from structural neighbors, independent of classification, should be exploited in the search for functional insights. The value of such structural information is highlighted through the development of a new protein interface prediction method, PredUs, that identifies what residues on protein surfaces are likely to participate in complexes with other proteins. The performance of PredUs, as measured through comparisons with other methods, suggests that relationships across protein structure space can be successfully exploited in the prediction of protein-protein interactions.
Cell | 2016
Zhipeng Lu; Qiangfeng Cliff Zhang; Byron K. Lee; Ryan A. Flynn; Martin A. Smith; James Robinson; Chen Davidovich; Anne R. Gooding; Karen J. Goodrich; John S. Mattick; Jill P. Mesirov; Thomas R. Cech; Howard Y. Chang
RNA has the intrinsic property to base pair, forming complex structures fundamental to its diverse functions. Here, we develop PARIS, a method based on reversible psoralen crosslinking for global mapping of RNA duplexes with near base-pair resolution in living cells. PARIS analysis in three human and mouse cell types reveals frequent long-range structures, higher-order architectures, and RNA-RNA interactions in trans across the transcriptome. PARIS determines base-pairing interactions on an individual-molecule level, revealing pervasive alternative conformations. We used PARIS-determined helices to guide phylogenetic analysis of RNA structures and discovered conserved long-range and alternative structures. XIST, a long noncoding RNA (lncRNA) essential for X chromosome inactivation, folds into evolutionarily conserved RNA structural domains that span many kilobases. XIST A-repeat forms complex inter-repeat duplexes that nucleate higher-order assembly of the key epigenetic silencing protein SPEN. PARIS is a generally applicable and versatile method that provides novel insights into the RNA structurome and interactome.
Nucleic Acids Research | 2012
Qiangfeng Cliff Zhang; Donald Petrey; José Ignacio Garzón; Lei Deng; Barry Honig
PrePPI (http://bhapp.c2b2.columbia.edu/PrePPI) is a database that combines predicted and experimentally determined protein–protein interactions (PPIs) using a Bayesian framework. Predicted interactions are assigned probabilities of being correct, which are derived from calculated likelihood ratios (LRs) by combining structural, functional, evolutionary and expression information, with the most important contribution coming from structure. Experimentally determined interactions are compiled from a set of public databases that manually collect PPIs from the literature and are also assigned LRs. A final probability is then assigned to every interaction by combining the LRs for both predicted and experimentally determined interactions. The current version of PrePPI contains ∼2 million PPIs that have a probability greater than ∼0.1, of which ∼60 000 PPIs for yeast and ∼370 000 PPIs for human are considered high confidence (probability > 0.5). The PrePPI database constitutes an integrated resource that enables users to examine aggregate information on PPIs, including both known and potentially novel interactions, and that provides structural models for many of the PPIs.
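The idea of turning several likelihood ratios into one probability can be illustrated with a generic naive-Bayes combination. This is a minimal sketch, not the PrePPI implementation: the function name `combine_likelihood_ratios`, the independence assumption between evidence sources, and the `prior_odds` value used in the example are all assumptions made here for illustration; the abstract does not state the prior or the exact combination rule.

```python
import math


def combine_likelihood_ratios(lrs, prior_odds):
    """Combine evidence likelihood ratios into a posterior probability.

    Assumes the evidence sources are conditionally independent (naive
    Bayes), so posterior odds = prior odds * product of the LRs.
    Both the function and the prior are hypothetical, for illustration.
    """
    if prior_odds <= 0:
        raise ValueError("prior_odds must be positive")
    if any(lr <= 0 for lr in lrs):
        raise ValueError("likelihood ratios must be positive")

    # Sum logs instead of multiplying to avoid overflow with many LRs.
    log_odds = math.log(prior_odds) + sum(math.log(lr) for lr in lrs)

    # Convert log-odds to a probability in a numerically stable way.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Example with made-up numbers: three evidence sources and a made-up prior.
probability = combine_likelihood_ratios([10.0, 5.0, 2.0], prior_odds=0.01)
is_high_confidence = probability > 0.5  # threshold quoted in the abstract
```

Working in log space keeps the product of many LRs from overflowing; the 0.5 cutoff in the last line is the high-confidence threshold quoted in the abstract, while every numeric input is invented.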
Nucleic Acids Research | 2011
Qiangfeng Cliff Zhang; Lei Deng; Markus Fisher; Jihong Guan; Barry Honig; Donald Petrey
We describe PredUs, an interactive web server for the prediction of protein–protein interfaces. Potential interfacial residues for a query protein are identified by ‘mapping’ contacts from known interfaces of the query protein’s structural neighbors to surface residues of the query. We calculate a score for each residue to be interfacial with a support vector machine. Results can be visualized in a molecular viewer and a number of interactive features allow users to tailor a prediction to a particular hypothesis. The PredUs server is available at: http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:PredUs.
Genes & Development | 2016
Jeffrey J. Quinn; Qiangfeng Cliff Zhang; Plamen Georgiev; Ibrahim Avsar Ilik; Asifa Akhtar; Howard Y. Chang
Many long noncoding RNAs (lncRNAs) can regulate chromatin states, but the evolutionary origin and dynamics driving lncRNA-genome interactions are unclear. We adapted an integrative strategy that identifies lncRNA orthologs in different species despite limited sequence similarity, which is applicable to mammalian and insect lncRNAs. Analysis of the roX lncRNAs, which are essential for dosage compensation of the single X chromosome in Drosophila males, revealed 47 new roX orthologs in diverse Drosophilid species across ∼40 million years of evolution. Genetic rescue by roX orthologs and engineered synthetic lncRNAs showed that altering the number of focal, repetitive RNA structures determines roX ortholog function. Genomic occupancy maps of roX RNAs in four species revealed conserved targeting of X chromosome neighborhoods but rapid turnover of individual binding sites. Many new roX-binding sites evolved from DNA encoding a pre-existing RNA splicing signal, effectively linking dosage compensation to transcribed genes. Thus, dynamic change in lncRNAs and their genomic targets underlies conserved and essential lncRNA-genome interactions.
Nature Protocols | 2016
Ryan A. Flynn; Qiangfeng Cliff Zhang; Robert C. Spitale; Byron K. Lee; Maxwell R. Mumbach; Howard Y. Chang
icSHAPE (in vivo click selective 2′-hydroxyl acylation and profiling experiment) captures RNA secondary structure at a transcriptome-wide level by measuring nucleotide flexibility at base resolution. Living cells are treated with the icSHAPE chemical NAI-N3 followed by selective chemical enrichment of NAI-N3–modified RNA, which provides an improved signal-to-noise ratio compared with similar methods leveraging deep sequencing. Purified RNA is then reverse-transcribed to produce cDNA, with SHAPE-modified bases leading to truncated cDNA. After deep sequencing of cDNA, computational analysis yields flexibility scores for every base across the starting RNA population. The entire experimental procedure can be completed in ∼5 d, and the sequencing and bioinformatics data analysis take an additional 4–5 d with no extensive computational skills required. Comparing in vivo and in vitro icSHAPE measurements can reveal in vivo RNA-binding protein imprints or facilitate the dissection of RNA post-transcriptional modifications. icSHAPE reactivities can additionally be used to constrain and improve RNA secondary structure prediction models.
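The step from truncated cDNA to per-base flexibility scores can be pictured with a simplified calculation. The sketch below is an illustration only and is not the published icSHAPE pipeline: the function name `flexibility_scores`, the use of an untreated control library, the background weight `alpha`, and the max-scaling to the range 0 to 1 are all assumptions introduced here.

```python
def flexibility_scores(treated_stops, control_stops, control_coverage, alpha=1.0):
    """Toy per-base flexibility score from reverse-transcription stop counts.

    treated_stops:    cDNA truncation counts per base in the treated sample
    control_stops:    truncation counts per base in an untreated control
    control_coverage: read coverage per base in the control
    alpha:            hypothetical weight on the background stops

    Returns a list with a score in [0, 1] per base, or None where the
    control has no coverage and no score can be computed.
    """
    if not (len(treated_stops) == len(control_stops) == len(control_coverage)):
        raise ValueError("all inputs must have the same length")

    raw = []
    for treated, control, coverage in zip(
        treated_stops, control_stops, control_coverage
    ):
        if coverage <= 0:
            raw.append(None)  # no data at this base
            continue
        # Background-subtracted stop rate, floored at zero.
        raw.append(max(0.0, (treated - alpha * control) / coverage))

    valid = [value for value in raw if value is not None]
    top = max(valid) if valid else 0.0
    if top == 0.0:
        return [None if value is None else 0.0 for value in raw]

    # Scale so the most reactive base gets a score of 1.
    return [None if value is None else value / top for value in raw]


# Example with made-up counts for a four-base stretch.
scores = flexibility_scores([10, 2, 0, 5], [2, 2, 1, 1], [100, 100, 0, 50])
```

Higher scores mark bases with more modification-induced truncation than background, which is the sense in which the abstract speaks of nucleotide flexibility; a real analysis would also need read mapping, replicate handling and a more robust normalization.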