Publication


Featured research published by Yimeng Yin.


Nature | 2015

DNA-dependent formation of transcription factor pairs alters their binding specificity.

Arttu Jolma; Yimeng Yin; Kazuhiro R. Nitta; Kashyap Dave; Alexander N. Popov; Minna Taipale; Martin Enge; Teemu Kivioja; Ekaterina Morgunova; Jussi Taipale

Gene expression is regulated by transcription factors (TFs), proteins that recognize short DNA sequence motifs. Such sequences are very common in the human genome, and an important determinant of the specificity of gene expression is the cooperative binding of multiple TFs to closely located motifs. However, interactions between DNA-bound TFs have not been systematically characterized. To identify TF pairs that bind cooperatively to DNA, and to characterize their spacing and orientation preferences, we have performed consecutive affinity-purification systematic evolution of ligands by exponential enrichment (CAP-SELEX) analysis of 9,400 TF–TF–DNA interactions. This analysis revealed 315 TF–TF interactions recognizing 618 heterodimeric motifs, most of which have not been previously described. The observed cooperativity occurred promiscuously between TFs from diverse structural families. Structural analysis of the TF pairs, including a novel crystal structure of MEIS1 and DLX3 bound to their identified recognition site, revealed that the interactions between the TFs were predominantly mediated by DNA. Most TF pair sites identified involved a large overlap between individual TF recognition motifs, and resulted in recognition of composite sites that were markedly different from the individual TF’s motifs. Together, our results indicate that the DNA molecule commonly plays an active role in cooperative interactions that define the gene regulatory lexicon.


Science | 2017

Impact of cytosine methylation on DNA binding specificities of human transcription factors.

Yimeng Yin; Ekaterina Morgunova; Arttu Jolma; Eevi Kaasinen; Biswajyoti Sahu; Syed Khund-Sayeed; Pratyush K. Das; Teemu Kivioja; Kashyap Dave; Fan Zhong; Kazuhiro R. Nitta; Minna Taipale; Alexander Popov; Paul Adrian Ginno; Silvia Domcke; Jian Yan; Dirk Schübeler; Charles Vinson; Jussi Taipale

Positives and negatives of methylated CpG

When the DNA bases cytosine and guanine are next to each other, a methyl group is generally added to the pyrimidine, generating a mCpG dinucleotide. This modification alters DNA structure but can also affect function by inhibiting transcription factor (TF) binding. Yin et al. systematically analyzed the effect of CpG methylation on the binding of 542 human TFs (see the Perspective by Hughes and Lambert). In addition to inhibiting binding of some TFs, they found that mCpGs can promote binding of others, particularly TFs involved in development, such as homeodomain proteins.

Science, this issue p. eaaj2239; see also p. 489

Genome-scale analysis reveals positive and negative binding of transcription factors to methylated CpG dinucleotides.

INTRODUCTION Nearly all cells in the human body share the same primary genome sequence consisting of four nucleotide bases. One of the bases, cytosine, is commonly modified by methylation of its 5 position in CpG dinucleotides (mCpG). Most CpG dinucleotides in the human genome are methylated, but the level of CpG methylation varies with genetic location (promoter versus gene body), whether genes are active versus silenced, and cell type. Research has shown that the maintenance of a particular cellular state after cell division is dependent on faithful transmission of methylated CpGs, as well as inheritance of the mother cells’ repertoire of transcription factors by the daughter cells. These two mechanisms of epigenetic inheritance are linked to each other; the binding of transcription factors can be affected by cytosine methylation, and cytosine methylation can, in turn, be added or removed by proteins that associate with transcription factors.

RATIONALE The genetic and epigenetic language, which imparts when and where genes are expressed, is understood at a conceptual level. However, a more detailed understanding is needed of the genomic regulatory mechanism by which methylated cytosines affect transcription factor binding. Because cytosine methylation changes DNA structure, it has the potential to affect binding of all transcription factors. However, a systematic analysis of binding of a large collection of transcription factors to all possible DNA sequences has not previously been conducted.

RESULTS To globally characterize the effect of cytosine methylation on transcription factor binding, we systematically analyzed binding specificities of full-length transcription factors and extended DNA binding domains to unmethylated and CpG-methylated DNA by using methylation-sensitive SELEX (systematic evolution of ligands by exponential enrichment). We evaluated binding of 542 transcription factors and identified a large number of previously uncharacterized transcription factor recognition motifs. Binding of most major classes of transcription factors, including bHLH, bZIP, and ETS, was inhibited by mCpG. In contrast, transcription factors such as homeodomain, POU, and NFAT proteins preferred to bind methylated DNA. This class of binding was enriched in factors with central roles in embryonic and organismal development. The observed binding preferences were validated using several orthogonal methods, including bisulfite-SELEX and protein-binding microarrays. In addition, the preference of the pluripotency factor OCT4 to bind to a mCpG-containing motif was confirmed by chromatin immunoprecipitation analysis in mouse embryonic stem cells with low or high levels of CpG methylation (due to deficiency in all enzymes that methylate cytosines or contribute to their removal, respectively). Crystal structure analysis of the homeodomain proteins HOXB13, CDX1, CDX2, and LHX4 revealed three key residues that contribute to the preference of this developmentally important family of transcription factors for mCpG. The preference for binding to mCpG was due to direct hydrophobic interactions with the 5-methyl group of methylcytosine. In contrast, inhibition of binding of other transcription factors to methylated sequences was found to be caused by steric hindrance.

CONCLUSION Our work constitutes a global analysis of the effect of cytosine methylation on DNA binding specificities of human transcription factors. CpG methylation can influence binding of most transcription factors to DNA, in some cases negatively and in others positively. Our finding that many developmentally important transcription factors prefer to bind to mCpG sites can inform future analyses of the role of DNA methylation on cell differentiation, chromatin reprogramming, and transcriptional regulation.

Figure: Systematic analysis of the impact of CpG methylation on transcription factor binding. The bottom left panel shows the fraction of transcription factors that prefer methylated (orange) or unmethylated (teal) CpG sites, are affected in multiple ways (yellow), are not affected (green), or do not have a CpG in their motifs (gray), as determined by methylation-sensitive SELEX (top left). The structure and logos on the right highlight how HOXB13 recognizes mCpG (blue shading indicates a CpG affected by methylation).

The majority of CpG dinucleotides in the human genome are methylated at cytosine bases. However, active gene regulatory elements are generally hypomethylated relative to their flanking regions, and the binding of some transcription factors (TFs) is diminished by methylation of their target sequences. By analysis of 542 human TFs with methylation-sensitive SELEX (systematic evolution of ligands by exponential enrichment), we found that there are also many TFs that prefer CpG-methylated sequences. Most of these are in the extended homeodomain family. Structural analysis showed that homeodomain specificity for methylcytosine depends on direct hydrophobic interactions with the methylcytosine 5-methyl group. This study provides a systematic examination of the effect of an epigenetic DNA modification on human TF binding specificity and reveals that many developmentally important proteins display preference for mCpG-containing sequences.


eLife | 2015

Conservation of transcription factor binding specificities across 600 million years of bilateria evolution

Kazuhiro R. Nitta; Arttu Jolma; Yimeng Yin; Ekaterina Morgunova; Teemu Kivioja; Junaid Akhtar; Korneel Hens; Jarkko Toivonen; Bart Deplancke; Eileen E. M. Furlong; Jussi Taipale

Divergent morphology of species has largely been ascribed to genetic differences in the tissue-specific expression of proteins, which could be achieved by divergence in cis-regulatory elements or by altering the binding specificity of transcription factors (TFs). The relative importance of the latter has been difficult to assess, as previous systematic analyses of TF binding specificity have been performed using different methods in different species. To address this, we determined the binding specificities of 242 Drosophila TFs, and compared them to human and mouse data. This analysis revealed that TF binding specificities are highly conserved between Drosophila and mammals, and that for orthologous TFs, the similarity extends even to the level of very subtle dinucleotide binding preferences. The few human TFs with divergent specificities function in cell types not found in fruit flies, suggesting that evolution of TF specificities contributes to emergence of novel types of differentiated cells. DOI: http://dx.doi.org/10.7554/eLife.04837.001


Cell | 2018

The Human Transcription Factors

Samuel A. Lambert; Arttu Jolma; Laura F. Campitelli; Pratyush K. Das; Yimeng Yin; Mihai Albu; Xiaoting Chen; Jussi Taipale; Timothy R. Hughes; Matthew T. Weirauch

Transcription factors (TFs) recognize specific DNA sequences to control chromatin and transcription, forming a complex system that guides expression of the genome. Despite keen interest in understanding how TFs control gene expression, it remains challenging to determine how the precise genomic binding sites of TFs are specified and how TF binding ultimately relates to regulation of transcription. This review considers how TFs are identified and functionally characterized, principally through the lens of a catalog of over 1,600 likely human TFs and binding motifs for two-thirds of them. Major classes of human TFs differ markedly in their evolutionary trajectories and expression patterns, underscoring distinct functions. TFs likewise underlie many different aspects of human physiology, disease, and variation, highlighting the importance of continued effort to understand TF-mediated gene regulation.


Molecular Systems Biology | 2017

Transcription factor family‐specific DNA shape readout revealed by quantitative specificity models

Lin Yang; Yaron Orenstein; Arttu Jolma; Yimeng Yin; Jussi Taipale; Ron Shamir; Remo Rohs

Transcription factors (TFs) achieve DNA‐binding specificity through contacts with functional groups of bases (base readout) and readout of structural properties of the double helix (shape readout). Currently, it remains unclear whether DNA shape readout is utilized by only a few selected TF families, or whether this mechanism is used extensively by most TF families. We resequenced data from previously published HT‐SELEX experiments, the most extensive mammalian TF–DNA binding data available to date. Using these data, we demonstrated the contributions of DNA shape readout across diverse TF families and its importance in core motif‐flanking regions. Statistical machine‐learning models combined with feature‐selection techniques helped to reveal the nucleotide position‐dependent DNA shape readout in TF‐binding sites and the TF family‐specific position dependence. Based on these results, we proposed novel DNA shape logos to visualize the DNA shape preferences of TFs. Overall, this work suggests a way of obtaining mechanistic insights into TF–DNA binding without relying on experimentally solved all‐atom structures.


Nature Communications | 2015

Structural insights into the DNA-binding specificity of E2F family transcription factors

Ekaterina Morgunova; Yimeng Yin; Arttu Jolma; Kashyap Dave; Bernhard Schmierer; Alexander Popov; Nadejda Eremina; Lennart Nilsson; Jussi Taipale

The mammalian cell cycle is controlled by the E2F family of transcription factors. Typical E2Fs bind to DNA as heterodimers with the related dimerization partner (DP) proteins, whereas the atypical E2Fs, E2F7 and E2F8 contain two DNA-binding domains (DBDs) and act as repressors. To understand the mechanism of repression, we have resolved the structure of E2F8 in complex with DNA at atomic resolution. We find that the first and second DBDs of E2F8 resemble the DBDs of typical E2F and DP proteins, respectively. Using molecular dynamics simulations, biochemical affinity measurements and chromatin immunoprecipitation, we further show that both atypical and typical E2Fs bind to similar DNA sequences in vitro and in vivo. Our results represent the first crystal structure of an E2F protein with two DBDs, and reveal the mechanism by which atypical E2Fs can repress canonical E2F target genes and exert their negative influence on cell cycle progression.


eLife | 2018

Two distinct DNA sequences recognized by transcription factors represent enthalpy and entropy optima

Ekaterina Morgunova; Yimeng Yin; Pratyush K. Das; Arttu Jolma; Fangjie Zhu; Alexander Popov; You Xu; Lennart Nilsson; Jussi Taipale

Most transcription factors (TFs) can bind to a population of sequences closely related to a single optimal site. However, some TFs can bind to two distinct sequences that represent two local optima in the Gibbs free energy of binding (ΔG). To determine the molecular mechanism behind this effect, we solved the structures of human HOXB13 and CDX2 bound to their two optimal DNA sequences, CAATAAA and TCGTAAA. Thermodynamic analyses by isothermal titration calorimetry revealed that both sites were bound with similar ΔG. However, the interaction with the CAA sequence was driven by change in enthalpy (ΔH), whereas the TCG site was bound with similar affinity due to smaller loss of entropy (ΔS). This thermodynamic mechanism that leads to at least two local optima likely affects many macromolecular interactions, as ΔG depends on two partially independent variables ΔH and ΔS according to the central equation of thermodynamics, ΔG = ΔH - TΔS.
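The relation ΔG = ΔH - TΔS quoted above can be illustrated with a small numerical sketch. The values below are hypothetical and chosen only to show how an enthalpy-driven site and a site with a smaller entropy loss can reach a similar ΔG; they are not the measurements reported in the paper, and the function and variable names are assumptions made for this example.

```python
import math


def gibbs_free_energy(delta_h, delta_s, temperature):
    """Return dG = dH - T*dS.

    delta_h in kJ/mol, delta_s in kJ/(mol*K), temperature in K.
    """
    return delta_h - temperature * delta_s


TEMPERATURE_K = 300.0  # assumed temperature, for illustration only

# Hypothetical site 1: large favourable enthalpy change, large entropy loss.
dg_enthalpy_driven = gibbs_free_energy(-70.0, -0.10, TEMPERATURE_K)

# Hypothetical site 2: smaller enthalpy change, but smaller entropy loss.
dg_entropy_favoured = gibbs_free_energy(-46.0, -0.02, TEMPERATURE_K)

# Both hypothetical sites end up with nearly the same free energy of binding.
similar = math.isclose(dg_enthalpy_driven, dg_entropy_favoured, abs_tol=1e-6)
print(dg_enthalpy_driven, dg_entropy_favoured, similar)
```

Because ΔH and ΔS can vary partly independently, different combinations can land on the same ΔG, which is the sense in which two distinct sequences can each be a local optimum.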


bioRxiv | 2018

Binding specificities of human RNA binding proteins towards structured and linear RNA sequences

Arttu Jolma; Jilin Zhang; Estefanía Mondragón; Teemu Kivioja; Yimeng Yin; Fangjie Zhu; Quaid Morris; Timothy R. Hughes; L. James Maher; Jussi Taipale

Sequence specific RNA-binding proteins (RBPs) control many important processes affecting gene expression. They regulate RNA metabolism at multiple levels, by affecting splicing of nascent transcripts, RNA folding, base modification, transport, localization, translation and stability. Despite their central role in most aspects of RNA metabolism and function, most RBP binding specificities remain unknown or incompletely defined. To address this, we have assembled a genome-scale collection of RBPs and their RNA binding domains (RBDs), and assessed their specificities using high throughput RNA-SELEX (HTR-SELEX). Approximately 70% of RBPs for which we obtained a motif bound to short linear sequences, whereas ~30% preferred structured motifs folding into stem-loops. We also found that many RBPs can bind to multiple distinctly different motifs. Analysis of the matches of the motifs in human genomic sequences suggested novel roles for many RBPs. We found that three cytoplasmic proteins, ZC3H12A, ZC3H12B and ZC3H12C bound to motifs resembling the splice donor sequence, suggesting that these proteins are involved in degradation of cytoplasmic viral and/or unspliced transcripts. Surprisingly, structural analysis revealed that the RNA motif was not bound by the conventional C3H1 RNA-binding domain of ZC3H12B. Instead, the RNA motif was bound by the ZC3H12B’s PilT N-terminus (PIN) RNase domain, revealing a potential mechanism by which unconventional RNA binding domains containing active sites or molecule-binding pockets could interact with short, structured RNA molecules. Our collection containing 145 high resolution binding specificity models for 86 RBPs is the largest systematic resource for the analysis of human RBPs, and will greatly facilitate future analysis of the various biological roles of this important class of proteins.


Nucleic Acids Research | 2018

Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets

Jarkko Toivonen; Teemu Kivioja; Arttu Jolma; Yimeng Yin; Jussi Taipale; Esko Ukkonen

In some dimeric cases of transcription factor (TF) binding, the specificity of dimeric motifs has been observed to differ notably from what would be expected were the two factors to bind to DNA independently of each other. Current motif discovery methods are unable to learn monomeric and dimeric motifs in modular fashion such that deviations from the expected motif would become explicit and the noise from dimeric occurrences would not corrupt monomeric models. We propose a novel modeling technique and an expectation maximization algorithm, implemented as software tool MODER, for discovering monomeric TF binding motifs and their dimeric combinations. Given training data and seeds for monomeric motifs, the algorithm learns in the same probabilistic framework a mixture model which represents monomeric motifs as standard position-specific probability matrices (PPMs), and dimeric motifs as pairs of monomeric PPMs, with associated orientation and spacing preferences. For dimers the model represents deviations from pure modular model of two independent monomers, thus making co-operative binding effects explicit. MODER can analyze in reasonable time tens of Mbps of training data. We validated the tool on HT-SELEX and ChIP-seq data. Our findings include some TFs whose expected model has palindromic symmetry but the observed model is directional.
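The idea of a dimeric motif as a pair of monomeric PPMs with an orientation and a spacing can be sketched as follows. This is a minimal illustration of the "pure modular" baseline only, in which the two monomers are treated as independent. It is not MODER's code or its expectation maximization procedure, and the PPM values, function names, and parameters are all hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement_ppm(ppm):
    """Reverse-complement a PPM given as a list of {base: probability} columns."""
    return [{COMPLEMENT[base]: p for base, p in column.items()}
            for column in reversed(ppm)]


def site_probability(ppm, site):
    """Probability of a site under a PPM, assuming independent positions."""
    probability = 1.0
    for column, base in zip(ppm, site):
        probability *= column[base]
    return probability


def dimer_probability(ppm_a, ppm_b, sequence, start, gap, flip_b=False):
    """Score a dimer site as the product of two independent monomer sites.

    ppm_a is placed at `start`; ppm_b follows after `gap` unscored bases and
    is reverse-complemented when `flip_b` is True (orientation preference).
    """
    if flip_b:
        ppm_b = reverse_complement_ppm(ppm_b)
    a_end = start + len(ppm_a)
    b_start = a_end + gap
    b_end = b_start + len(ppm_b)
    if start < 0 or gap < 0 or b_end > len(sequence):
        raise ValueError("dimer site does not fit inside the sequence")
    return (site_probability(ppm_a, sequence[start:a_end])
            * site_probability(ppm_b, sequence[b_start:b_end]))


# Hypothetical two-column monomer PPM that prefers the site "AC".
monomer = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]

# "AC", one spacer base, then "GT" (the reverse complement of "AC").
same_strand = dimer_probability(monomer, monomer, "ACTGT", start=0, gap=1)
flipped = dimer_probability(monomer, monomer, "ACTGT", start=0, gap=1, flip_b=True)
print(same_strand, flipped)
```

In this toy case the flipped orientation scores much higher because the second half-site matches the reverse complement of the monomer. A cooperative model would add a correction on top of this independent product, which is the deviation the abstract describes as being made explicit.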


Genome Research | 2016

Multiparameter functional diversity of human C2H2 zinc finger proteins

Frank W. Schmitges; Ernest Radovani; Hamed Shateri Najafabadi; Marjan Barazandeh; Laura F. Campitelli; Yimeng Yin; Arttu Jolma; Guoqing Zhong; Hongbo Guo; Tharsan Kanagalingam; Wei F. Dai; Jussi Taipale; Andrew Emili; Jack Greenblatt; Timothy R. Hughes

Collaboration


Dive into Yimeng Yin's collaborations.

Top Co-Authors


Alexander Popov

European Synchrotron Radiation Facility
