Publication


Featured research published by Harmen J. Bussemaker.


Nature Genetics | 2001

Regulatory element detection using correlation with expression

Harmen J. Bussemaker; Hao Li; Eric D. Siggia

We present here a new computational method for discovering cis-regulatory elements that circumvents the need to cluster genes based on their expression profiles. Based on a model in which upstream motifs contribute additively to the log-expression level of a gene, this method requires a single genome-wide set of expression ratios and the upstream sequence for each gene, and outputs statistically significant motifs. Analysis of publicly available expression data for Saccharomyces cerevisiae reveals several new putative regulatory elements, some of which plausibly control the early, transient induction of genes during sporulation. Known motifs generally have high statistical significance.
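The additive model described in this abstract can be illustrated with a deliberately reduced sketch: fit the log-expression ratio of each gene as a linear function of the number of times one candidate motif occurs in its upstream sequence. This is not the authors' implementation. The function names, the toy sequences, and the restriction to a single motif with an ordinary least-squares fit are all assumptions made for illustration; the published method also selects among many motifs and assesses their statistical significance, which is omitted here.

```python
from math import sqrt


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, allowing overlaps."""
    width = len(motif)
    return sum(
        1
        for start in range(len(sequence) - width + 1)
        if sequence[start:start + width] == motif
    )


def fit_single_motif(upstream, log_ratios, motif):
    """Least-squares fit of log_ratio = intercept + slope * motif_count.

    Returns (slope, pearson_r). Hypothetical helper for illustration only.
    """
    counts = [count_motif(seq, motif) for seq in upstream]
    n = len(counts)
    mean_c = sum(counts) / n
    mean_e = sum(log_ratios) / n
    s_cc = sum((c - mean_c) ** 2 for c in counts)
    s_ee = sum((e - mean_e) ** 2 for e in log_ratios)
    s_ce = sum((c - mean_c) * (e - mean_e) for c, e in zip(counts, log_ratios))
    if s_cc == 0 or s_ee == 0:
        # No variation in counts or in expression: the fit is undefined.
        return 0.0, 0.0
    return s_ce / s_cc, s_ce / sqrt(s_cc * s_ee)


# Toy data (invented): four upstream regions and their log-expression ratios.
upstream = ["ACGCGTAA", "TTTTTTTT", "ACGCGTACGCGT", "GGGGACGT"]
log_ratios = [1.0, 0.0, 2.0, 0.0]
slope, r = fit_single_motif(upstream, log_ratios, "ACGCGT")
```

In this toy example the motif counts are 1, 0, 2 and 0, so the fitted slope is the per-copy contribution of the motif to log-expression. No clustering of genes is needed, which is the point the abstract makes.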


Cell | 2010

Systematic protein location mapping reveals five principal chromatin types in Drosophila cells

Guillaume J. Filion; Joke G. van Bemmel; Ulrich Braunschweig; Wendy Talhout; Jop Kind; Lucas D. Ward; Wim Brugman; Inês J. de Castro; Ron M. Kerkhoven; Harmen J. Bussemaker; Bas van Steensel

Chromatin is important for the regulation of transcription and other functions, yet the diversity of chromatin composition and the distribution along chromosomes are still poorly characterized. By integrative analysis of genome-wide binding maps of 53 broadly selected chromatin components in Drosophila cells, we show that the genome is segmented into five principal chromatin types that are defined by unique yet overlapping combinations of proteins and form domains that can extend over > 100 kb. We identify a repressive chromatin type that covers about half of the genome and lacks classic heterochromatin markers. Furthermore, transcriptionally active euchromatin consists of two types that differ in molecular organization and H3K36 methylation and regulate distinct classes of genes. Finally, we provide evidence that the different chromatin types help to target DNA-binding factors to specific genomic regions. These results provide a global view of chromatin diversity and domain organization in a metazoan cell.


Cell | 2011

Cofactor Binding Evokes Latent Differences in DNA Binding Specificity between Hox Proteins

Matthew Slattery; Todd Riley; Peng Liu; Namiko Abe; Pilar Gomez-Alcala; Iris Dror; Tianyin Zhou; Remo Rohs; Barry Honig; Harmen J. Bussemaker; Richard S. Mann

Members of transcription factor families typically have similar DNA binding specificities yet execute unique functions in vivo. Transcription factors often bind DNA as multiprotein complexes, raising the possibility that complex formation might modify their DNA binding specificities. To test this hypothesis, we developed an experimental and computational platform, SELEX-seq, that can be used to determine the relative affinities to any DNA sequence for any transcription factor complex. Applying this method to all eight Drosophila Hox proteins, we show that they obtain novel recognition properties when they bind DNA with the dimeric cofactor Extradenticle-Homothorax (Exd). Exd-Hox specificities group into three main classes that obey Hox gene collinearity rules, and DNA structure predictions suggest that anterior and posterior Hox proteins prefer DNA sequences with distinct minor groove topographies. Together, these data suggest that emergent DNA recognition properties revealed by interactions with cofactors contribute to transcription factor specificities in vivo.


BMC Bioinformatics | 2004

Defining transcriptional networks through integrative modeling of mRNA expression and transcription factor binding data

Feng Gao; Barrett C. Foat; Harmen J. Bussemaker

Background: Functional genomics studies are yielding information about regulatory processes in the cell at an unprecedented scale. In the yeast S. cerevisiae, DNA microarrays have not only been used to measure the mRNA abundance for all genes under a variety of conditions but also to determine the occupancy of all promoter regions by a large number of transcription factors. The challenge is to extract useful information about the global regulatory network from these data.

Results: We present MA-Networker, an algorithm that combines microarray data for mRNA expression and transcription factor occupancy to define the regulatory network of the cell. Multivariate regression analysis is used to infer the activity of each transcription factor, and the correlation across different conditions between this activity and the mRNA expression of a gene is interpreted as regulatory coupling strength. Applying our method to S. cerevisiae, we find that, on average, 58% of the genes whose promoter region is bound by a transcription factor are true regulatory targets. These results are validated by an analysis of enrichment for functional annotation, response for transcription factor deletion, and over-representation of cis-regulatory motifs. We are able to assign directionality to transcription factors that control divergently transcribed genes sharing the same promoter region. Finally, we identify an intrinsic limitation of transcription factor deletion experiments related to the combinatorial nature of transcriptional control, to which our approach provides an alternative.

Conclusion: Our reliable classification of ChIP positives into functional and non-functional TF targets based on their expression pattern across a wide range of conditions provides a starting point for identifying the unknown sequence features in non-coding DNA that directly or indirectly determine the context dependence of transcription factor action.
Complete analysis results are available for browsing or download at http://bussemaker.bio.columbia.edu/papers/MA-Networker/.


Nature | 2002

Identification of genes expressed in C. elegans touch receptor neurons

Yun Xia Zhang; Charles Ma; Thomas M. Delohery; Brian T. Nasipak; Barrett C. Foat; Alexander Bounoutas; Harmen J. Bussemaker; Stuart K. Kim; Martin Chalfie

The extent of gene regulation in cell differentiation is poorly understood. We previously used saturation mutagenesis to identify 18 genes that are needed for the development and function of a single type of sensory neuron—the touch receptor neuron for gentle touch in Caenorhabditis elegans. One of these genes, mec-3, encodes a transcription factor that controls touch receptor differentiation. By culturing and isolating wild-type and mec-3 mutant cells from embryos and applying their amplified RNA to DNA microarrays, here we have identified genes that are known to be expressed in touch receptors, a previously uncloned gene (mec-17) that is needed for maintaining touch receptor differentiation, and more than 50 previously unknown mec-3-dependent genes. These genes are randomly distributed in the genome and under-represented both for genes that are co-expressed in operons and for multiple members of gene families. Using regions 5′ of the start codon of the first 20 genes, we have also identified an over-represented heptanucleotide, AATGCAT, that is needed for the expression of touch receptor genes.


Nucleic Acids Research | 2005

T-profiler: scoring the activity of predefined groups of genes using gene expression data

André Boorsma; Barrett C. Foat; Daniel J. Vis; Frans M. Klis; Harmen J. Bussemaker

One of the key challenges in the analysis of gene expression data is how to relate the expression level of individual genes to the underlying transcriptional programs and cellular state. Here we describe T-profiler, a tool that uses the t-test to score changes in the average activity of predefined groups of genes. The gene groups are defined based on Gene Ontology categorization, ChIP-chip experiments, upstream matches to a consensus transcription factor binding motif or location on the same chromosome. If desired, an iterative procedure can be used to select a single, optimal representative from sets of overlapping gene groups. T-profiler makes it possible to interpret microarray data in a way that is both intuitive and statistically rigorous, without the need to combine experiments or choose parameters. Currently, gene expression data from Saccharomyces cerevisiae and Candida albicans are supported. Users can upload their microarray data for analysis on the web at .
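The scoring idea in this abstract, using a t-test to ask whether a predefined gene group differs in average expression, can be sketched as follows. This is an illustrative reading, not the tool's code: the function name, the toy values, and the choice of a pooled-variance two-sample t-statistic comparing group members against all remaining genes are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def group_t_score(expression: dict, group: set) -> float:
    """Pooled-variance t-statistic: genes in `group` versus all other genes.

    `expression` maps gene name -> log-expression ratio (hypothetical input).
    """
    inside = [value for gene, value in expression.items() if gene in group]
    outside = [value for gene, value in expression.items() if gene not in group]
    n_in, n_out = len(inside), len(outside)
    if n_in < 2 or n_out < 2:
        raise ValueError("need at least two genes inside and outside the group")
    pooled = (
        (n_in - 1) * variance(inside) + (n_out - 1) * variance(outside)
    ) / (n_in + n_out - 2)
    if pooled == 0:
        raise ValueError("zero variance: t-statistic is undefined")
    return (mean(inside) - mean(outside)) / sqrt(pooled * (1 / n_in + 1 / n_out))


# Toy data (invented): three genes in the group are induced, three are not.
expression = {"g1": 2.0, "g2": 2.2, "g3": 1.8, "g4": 0.1, "g5": -0.1, "g6": 0.0}
t_induced = group_t_score(expression, {"g1", "g2", "g3"})
```

A large positive score suggests the group as a whole is up-regulated in the experiment, and a large negative score suggests the opposite. Because the score is computed per experiment, there is no need to combine experiments or tune parameters, in line with the abstract.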


Nature Biotechnology | 2013

Evaluation of methods for modeling transcription factor sequence specificity

Matthew T. Weirauch; Raquel Norel; Matti Annala; Yue Zhao; Todd Riley; Julio Saez-Rodriguez; Thomas Cokelaer; Anastasia Vedenko; Shaheynoor Talukder; Phaedra Agius; Aaron Arvey; Philipp Bucher; Curtis G. Callan; Cheng Wei Chang; Chien-Yu Chen; Yong-Syuan Chen; Yu-Wei Chu; Jan Grau; Ivo Grosse; Vidhya Jagannathan; Jens Keilwagen; Szymon M. Kiełbasa; Justin B. Kinney; Holger Klein; Miron B. Kursa; Harri Lähdesmäki; Kirsti Laurila; Chengwei Lei; Christina S. Leslie; Chaim Linhart

Genomic analyses often involve scanning for potential transcription factor (TF) binding sites using models of the sequence specificity of DNA binding proteins. Many approaches have been developed to model and learn a protein's DNA-binding specificity, but these methods have not been systematically compared. Here we applied 26 such approaches to in vitro protein binding microarray data for 66 mouse TFs belonging to various families. For nine TFs, we also scored the resulting motif models on in vivo data, and found that the best in vitro–derived motifs performed similarly to motifs derived from the in vivo data. Our results indicate that simple models based on mononucleotide position weight matrices trained by the best methods perform similarly to more complex models for most TFs examined, but fall short in specific cases (<10% of the TFs examined here). In addition, the best-performing motifs typically have relatively low information content, consistent with widespread degeneracy in eukaryotic TF sequence preferences.
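For readers unfamiliar with the mononucleotide position weight matrices mentioned in this abstract, the sketch below shows one common way to build such a matrix from aligned sites and scan a sequence with it. It is a generic textbook construction, not one of the 26 approaches evaluated in the paper. The function names, the example sites, the pseudocount of 1, and the uniform background of 0.25 are assumptions.

```python
from math import log2


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds matrix from equal-length aligned sites (A, C, G, T only)."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for position in range(len(sites[0])):
        column = [site[position] for site in sites]
        pwm.append(
            {
                base: log2(((column.count(base) + pseudocount) / total) / background)
                for base in "ACGT"
            }
        )
    return pwm


def best_site(sequence, pwm):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the matrix")
    return max(
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    )


# Toy data (invented): four aligned sites and one sequence to scan.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
score, start = best_site("CCCCTGACTCAGGGG", pwm)
```

Each position contributes independently to the score, which is what "mononucleotide" means here. The more complex models compared in the paper relax that independence assumption.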


Proceedings of the National Academy of Sciences of the United States of America | 2006

Hotspots of transcription factor colocalization in the genome of Drosophila melanogaster

Celine Moorman; Ling V. Sun; Junbai Wang; Elzo de Wit; Wendy Talhout; Lucas D. Ward; Frauke Greil; Xiang-Jun Lu; Kevin P. White; Harmen J. Bussemaker; Bas van Steensel

Regulation of gene expression is a highly complex process that requires the concerted action of many proteins, including sequence-specific transcription factors, cofactors, and chromatin proteins. In higher eukaryotes, the interplay between these proteins and their interactions with the genome still is poorly understood. We systematically mapped the in vivo binding sites of seven transcription factors with diverse physiological functions, five cofactors, and two heterochromatin proteins at ≈1-kb resolution in a 2.9 Mb region of the Drosophila melanogaster genome. Surprisingly, all tested transcription factors and cofactors show strongly overlapping localization patterns, and the genome contains many “hotspots” that are targeted by all of these proteins. Several control experiments show that the strong overlap is not an artifact of the techniques used. Colocalization hotspots are 1–5 kb in size, spaced on average by ≈50 kb, and preferentially located in regions of active transcription. We provide evidence that protein–protein interactions play a role in the hotspot association of some transcription factors. Colocalization hotspots constitute a previously uncharacterized type of feature in the genome of Drosophila, and our results provide insights into the general targeting mechanisms of transcription regulators in a higher eukaryote.


Proceedings of the National Academy of Sciences of the United States of America | 2003

Genomewide analysis of Drosophila GAGA factor target genes reveals context-dependent DNA binding

Bas van Steensel; Jeffrey J. Delrow; Harmen J. Bussemaker

The association of sequence-specific DNA-binding factors with their cognate target sequences in vivo depends on the local molecular context, yet this context is poorly understood. To address this issue, we have performed genomewide mapping of in vivo target genes of Drosophila GAGA factor (GAF). The resulting list of ≈250 target genes indicates that GAF regulates many cellular pathways. We applied unbiased motif-based regression analysis to identify the sequence context that determines GAF binding. Our results confirm that GAF selectively associates with (GA)n repeat elements in vivo. GAF binding occurs in upstream regulatory regions, but less in downstream regions. Surprisingly, GAF binds abundantly to introns but is virtually absent from exons, even though the density of (GA)n is roughly the same. Intron binding occurs equally frequently in last introns compared with first introns, suggesting that GAF may not only regulate transcription initiation, but possibly also elongation. We provide evidence for cooperative binding of GAF to closely spaced (GA)n elements and explain the lack of GAF binding to exons by the absence of such closely spaced GA repeats. Our approach for revealing determinants of context-dependent DNA binding will be applicable to many other transcription factors.


Proceedings of the National Academy of Sciences of the United States of America | 2015

Quantitative modeling of transcription factor binding specificities using DNA shape

Tianyin Zhou; Ning Shen; Lin Yang; Namiko Abe; John Horton; Richard S. Mann; Harmen J. Bussemaker; Raluca Gordân; Remo Rohs

Significance: Genomes provide an abundance of putative binding sites for each transcription factor (TF). However, only small subsets of these potential targets are functional. TFs of the same protein family bind to target sites that are very similar but not identical. This distinction allows closely related TFs to regulate different genes and thus execute distinct functions. Because the nucleotide sequence of the core motif is often not sufficient for identifying a genomic target, we refined the description of TF binding sites by introducing a combination of DNA sequence and shape features, which consistently improved the modeling of in vitro TF−DNA binding specificities. Although additional factors affect TF binding in vivo, shape-augmented models reveal binding specificity mechanisms that are not apparent from sequence alone.

DNA binding specificities of transcription factors (TFs) are a key component of gene regulatory processes. Underlying mechanisms that explain the highly specific binding of TFs to their genomic target sites are poorly understood. A better understanding of TF−DNA binding requires the ability to quantitatively model TF binding to accessible DNA as its basic step, before additional in vivo components can be considered. Traditionally, these models were built based on nucleotide sequence. Here, we integrated 3D DNA shape information derived with a high-throughput approach into the modeling of TF binding specificities. Using support vector regression, we trained quantitative models of TF binding specificity based on protein binding microarray (PBM) data for 68 mammalian TFs. The evaluation of our models included cross-validation on specific PBM array designs, testing across different PBM array designs, and using PBM-trained models to predict relative binding affinities derived from in vitro selection combined with deep sequencing (SELEX-seq). Our results showed that shape-augmented models compared favorably to sequence-based models. Although both k-mer and DNA shape features can encode interdependencies between nucleotide positions of the binding site, using DNA shape features reduced the dimensionality of the feature space. In addition, analyzing the feature weights of DNA shape-augmented models uncovered TF family-specific structural readout mechanisms that were not revealed by the DNA sequence. As such, this work combines knowledge from structural biology and genomics, and suggests a new path toward understanding TF binding and genome function.

Collaboration


Dive into Harmen J. Bussemaker's collaborations.

Top Co-Authors

Remo Rohs

University of Southern California

Tianyin Zhou

University of Southern California

Lucas D. Ward

Massachusetts Institute of Technology
