Alexander J. Hartemink

Publication

Featured research published by Alexander J. Hartemink.

Genome Research | 2012

ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia

Stephen G. Landt; Georgi K. Marinov; Anshul Kundaje; Pouya Kheradpour; Florencia Pauli; Serafim Batzoglou; Bradley E. Bernstein; Peter J. Bickel; James B. Brown; Philip Cayting; Yiwen Chen; Gilberto DeSalvo; Charles B. Epstein; Katherine I. Fisher-Aylor; Ghia Euskirchen; Mark Gerstein; Jason Gertz; Alexander J. Hartemink; Michael M. Hoffman; Vishwanath R. Iyer; Youngsook L. Jung; Subhradip Karmakar; Manolis Kellis; Peter V. Kharchenko; Qunhua Li; Tao Liu; X. Shirley Liu; Lijia Ma; Aleksandar Milosavljevic; Richard M. Myers

Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

Sparse multinomial logistic regression: fast algorithms and generalization bounds

Balaji Krishnapuram; Lawrence Carin; Mário A. T. Figueiredo; Alexander J. Hartemink

Recently developed methods for learning sparse classifiers are among the state-of-the-art in supervised learning. These methods learn classifiers that incorporate weighted sums of basis functions with sparsity-promoting priors encouraging the weight estimates to be either significantly large or exactly zero. From a learning-theoretic perspective, these methods control the capacity of the learned classifier by minimizing the number of basis functions used, resulting in better generalization. This paper presents three contributions related to learning sparse classifiers. First, we introduce a true multiclass formulation based on multinomial logistic regression. Second, by combining a bound optimization approach with a component-wise update procedure, we derive fast exact algorithms for learning sparse multiclass classifiers that scale favorably in both the number of training samples and the feature dimensionality, making them applicable even to large data sets in high-dimensional feature spaces. To the best of our knowledge, these are the first algorithms to perform exact multinomial logistic regression with a sparsity-promoting prior. Third, we show how nontrivial generalization bounds can be derived for our classifier in the binary case. Experimental results on standard benchmark data sets attest to the accuracy, sparsity, and efficiency of the proposed methods.
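
As an illustration of how a sparsity-promoting prior can push weight estimates to exactly zero, here is a minimal sketch of the soft-thresholding operator. It assumes a Laplace (L1) prior, which is one common sparsity-promoting choice and is not named in the abstract. This is a generic illustration, not the bound-optimization algorithm described in the paper. The function name `soft_threshold` and the example values are hypothetical.

```python
def soft_threshold(w, lam):
    """Shrink w toward zero by lam; any w with |w| <= lam becomes exactly 0.0."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


# Hypothetical unregularized weight estimates for four basis functions.
weights = [2.5, -0.3, 0.1, -1.75]

# Small weights are set exactly to zero; large weights survive, shrunk by lam.
sparse = [soft_threshold(w, 0.5) for w in weights]
```

In this sketch, only the two weights whose magnitude exceeds the threshold remain nonzero, which mirrors the "significantly large or exactly zero" behavior the abstract describes.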

Bioinformatics | 2004

Advances to Bayesian network inference for generating causal networks from observational biological data

Jing Yu; V. Anne Smith; Paul P. Wang; Alexander J. Hartemink; Erich D. Jarvis

MOTIVATION Network inference algorithms are powerful computational tools for identifying putative causal interactions among variables from observational data. Bayesian network inference algorithms hold particular promise in that they can capture linear, non-linear, combinatorial, stochastic and other types of relationships among variables across multiple levels of biological organization. However, challenges remain when applying these algorithms to limited quantities of experimental data collected from biological systems. Here, we use a simulation approach to make advances in our dynamic Bayesian network (DBN) inference algorithm, especially in the context of limited quantities of biological data. RESULTS We test a range of scoring metrics and search heuristics to find an effective algorithm configuration for evaluating our methodological advances. We also identify sampling intervals and levels of data discretization that allow the best recovery of the simulated networks. We develop a novel influence score for DBNs that attempts to estimate both the sign (activation or repression) and relative magnitude of interactions among variables. When faced with limited quantities of observational data, combining our influence score with moderate data interpolation reduces a significant portion of false positive interactions in the recovered networks. Together, our advances allow DBN inference algorithms to be more effective in recovering biological networks from experimentally collected data. AVAILABILITY Source code and simulated data are available upon request. SUPPLEMENTARY INFORMATION http://www.jarvislab.net/Bioinformatics/BNAdvances/

Pacific Symposium on Biocomputing | 2000

Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks.

Alexander J. Hartemink; David K. Gifford; Tommi S. Jaakkola; Richard A. Young

We propose a model-driven approach for analyzing genomic expression data that permits genetic regulatory networks to be represented in a biologically interpretable computational form. Our models permit latent variables capturing unobserved factors, describe arbitrarily complex (more than pair-wise) relationships at varying levels of refinement, and can be scored rigorously against observational data. The models that we use are based on Bayesian networks and their extensions. As a demonstration of this approach, we utilize 52 genomes worth of Affymetrix GeneChip expression data to correctly differentiate between alternative hypotheses of the galactose regulatory network in S. cerevisiae. When we extend the graph semantics to permit annotated edges, we are able to score models describing relationships at a finer degree of specification.

Pacific Symposium on Biocomputing | 2001

Combining location and expression data for principled discovery of genetic regulatory network models.

Alexander J. Hartemink; David K. Gifford; Tommi S. Jaakkola; Richard A. Young

We develop principled methods for the automatic induction (discovery) of genetic regulatory network models from multiple data sources and data modalities. Models of regulatory networks are represented as Bayesian networks, allowing the models to compactly and robustly capture probabilistic multivariate statistical dependencies between the various cellular factors in these networks. We build on previous Bayesian network validation results by extending the validation framework to the context of model induction, leveraging heuristic simulated annealing search algorithms and posterior model averaging. Using expression data in isolation yields results inconsistent with location data so we incorporate genomic location data to guide the model induction process. We combine these two data modalities by allowing location data to influence the model prior and expression data to influence the model likelihood. We demonstrate the utility of this approach by discovering genetic regulatory models of thirty-three variables involved in S. cerevisiae pheromone response. The models we automatically generate are consistent with the current understanding regarding this regulatory network, but also suggest new directions for future experimental investigation.

Nature | 2008

Global control of cell-cycle transcription by coupled CDK and network oscillators

David A. Orlando; Charles Y. Lin; Allister Bernard; Jean Y. J. Wang; Joshua E. S. Socolar; Edwin S. Iversen; Alexander J. Hartemink; Steven B. Haase

A significant fraction of the Saccharomyces cerevisiae genome is transcribed periodically during the cell division cycle, indicating that properly timed gene expression is important for regulating cell-cycle events. Genomic analyses of the localization and expression dynamics of transcription factors suggest that a network of sequentially expressed transcription factors could control the temporal programme of transcription during the cell cycle. However, directed studies interrogating small numbers of genes indicate that their periodic transcription is governed by the activity of cyclin-dependent kinases (CDKs). To determine the extent to which the global cell-cycle transcription programme is controlled by cyclin–CDK complexes, we examined genome-wide transcription dynamics in budding yeast mutant cells that do not express S-phase and mitotic cyclins. Here we show that a significant fraction of periodic genes are aberrantly expressed in the cyclin mutant. Although cells lacking cyclins are blocked at the G1/S border, nearly 70% of periodic genes continued to be expressed periodically and on schedule. Our findings reveal that although CDKs have a function in the regulation of cell-cycle transcription, they are not solely responsible for establishing the global periodic transcription programme. We propose that periodic transcription is an emergent property of a transcription factor network that can function as a cell-cycle oscillator independently of, and in tandem with, the CDK oscillator.

Nature Methods | 2013

Synergistic and tunable human gene activation by combinations of synthetic transcription factors

Pablo Perez-Pinera; David G. Ousterout; Jonathan M. Brunger; Alicia M Farin; Katherine A. Glass; Farshid Guilak; Gregory E. Crawford; Alexander J. Hartemink; Charles A. Gersbach

Mammalian genes are regulated by the cooperative and synergistic actions of many transcription factors. In this study we recapitulate this complex regulation in human cells by targeting endogenous gene promoters, including regions of closed chromatin upstream of silenced genes, with combinations of engineered transcription activator–like effectors (TALEs). These combinations of TALE transcription factors induced substantial gene activation and allowed tuning of gene expression levels that will broadly enable synthetic biology, gene therapy and biotechnology.

Pacific Symposium on Biocomputing | 2004

Informative structure priors: joint learning of dynamic regulatory networks from multiple types of data.

Allister Bernard; Alexander J. Hartemink

We present a method for jointly learning dynamic models of transcriptional regulatory networks from gene expression data and transcription factor binding location data. Models are automatically learned using dynamic Bayesian network inference algorithms; joint learning is accomplished by incorporating evidence from gene expression data through the likelihood, and from transcription factor binding location data through the prior. We propose a new informative structure prior with two advantages. First, the prior incorporates evidence from location data probabilistically, allowing it to be weighed against evidence from expression data. Second, the prior takes on a factorable form that is computationally efficient when learning dynamic regulatory networks. Results obtained from both simulated and experimental data from the yeast cell cycle demonstrate that this joint learning algorithm can recover dynamic regulatory networks from multiple types of data that are more accurate than those recovered from each type of data in isolation.
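
The division of labor described above, with expression data entering through the likelihood and location data entering through a factorable prior, can be sketched as a log-posterior score. This is a simplified illustration under stated assumptions: the per-edge probabilities, the independent-edge form of the prior, and all names (`log_structure_prior`, `log_posterior_score`, `edge_prob`) are hypothetical and are not taken from the paper.

```python
import math


def log_structure_prior(edges, edge_prob):
    """Factorable log prior: one independent term per candidate edge.

    edge_prob maps each candidate edge to an assumed probability (derived,
    hypothetically, from binding location data) that the edge is present.
    """
    total = 0.0
    for edge, p in edge_prob.items():
        total += math.log(p) if edge in edges else math.log(1.0 - p)
    return total


def log_posterior_score(log_likelihood, edges, edge_prob):
    """Unnormalized log posterior: expression likelihood plus location prior."""
    return log_likelihood + log_structure_prior(edges, edge_prob)


# Hypothetical location evidence: strong for TF1 -> geneA, weak for TF1 -> geneB.
edge_prob = {("TF1", "geneA"): 0.9, ("TF1", "geneB"): 0.2}

# Two candidate networks with the same (made-up) expression log likelihood.
with_supported_edge = log_posterior_score(-10.0, {("TF1", "geneA")}, edge_prob)
with_unsupported_edge = log_posterior_score(-10.0, {("TF1", "geneB")}, edge_prob)
```

Because the prior is a sum of per-edge terms, adding or removing one edge changes only one term, which is the kind of computational convenience a factorable form provides during structure search.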

Science | 2014

Convergent transcriptional specializations in the brains of humans and song-learning birds.

Andreas R. Pfenning; Erina Hara; Osceola Whitney; Miriam V. Rivas; Rui Wang; Petra L. Roulhac; Jason T. Howard; Morgan Wirthlin; Peter V. Lovell; Ganeshkumar Ganapathy; Jacquelyn Mouncastle; M. Arthur Moseley; J. Will Thompson; Erik J. Soderblom; Atsushi Iriki; Masaki Kato; M. Thomas P. Gilbert; Guojie Zhang; Trygve E. Bakken; Angie Bongaarts; Amy Bernard; Ed Lein; Claudio V. Mello; Alexander J. Hartemink; Erich D. Jarvis

[Figure: Identifying molecular brain similarities across species. Brain region gene expression specializations were hierarchically organized into specialization trees of each species (blue lines), including for circuits that control learned vocalizations (highlighted green, purple, and orange regions). A set of comparative genomic algorithms found the most similarly specialized regions between songbird and human (orange lines), some of which are convergently evolved.]

INTRODUCTION Vocal learning, the ability to imitate sounds, is a trait that has undergone convergent evolution in several lineages of birds and mammals, including song-learning birds and humans. This behavior requires cortical and striatal vocal brain regions, which form unique connections in vocal-learning species. These regions have been found to have specialized gene expression within some species, but the patterns of specialization across vocal-learning bird and mammal species have not been systematically explored. RATIONALE The sequencing of genomes representing all major vocal-learning and vocal-nonlearning avian lineages has allowed us to develop the genomic tools to measure anatomical gene expression across species. Here, we asked whether behavioral and anatomical convergence is associated with gene expression convergence in the brains of vocal-learning birds and humans. RESULTS We developed a computational approach that discovers homologous and convergent specialized anatomical gene expression profiles. This includes generating hierarchically organized gene expression specialization trees for each species and a dynamic programming algorithm that finds the optimal alignment between species brain trees.
We applied this approach to brain region gene expression databases of thousands of samples and genes that we and others generated from multiple species, including humans and song-learning birds (songbird, parrot, and hummingbird) as well as vocal-nonlearning nonhuman primates (macaque) and birds (dove and quail). Our results confirmed the recently revised understanding of the relationships between avian and mammalian brains. We further found that songbird Area X, a striatal region necessary for vocal learning, was most similar to a part of the human striatum activated during speech production. The RA (robust nucleus of the arcopallium) analog of song-learning birds, necessary for song production, was most similar to laryngeal motor cortex regions in humans that control speech production. More than 50 genes contributed to their convergent specialization and were enriched in motor control and neural connectivity functions. These patterns were not found in vocal nonlearners, but songbird RA was similar to layer 5 of primate motor cortex for another set of genes, supporting previous hypotheses about the similarity of these cell types between bird and mammal brains. CONCLUSION Our approach can accurately and quantitatively identify functionally and molecularly analogous brain regions between species separated by as much as 310 million years from a common ancestor. We were able to identify analogous brain regions for song and speech between birds and humans, and broader homologous brain regions in which these specialized song and speech regions are located, for tens to hundreds of genes. These genes now serve as candidates involved in developing and maintaining the unique connectivity and functional properties of vocal-learning brain circuits shared across species. 
The finding that convergent neural circuits for vocal learning are accompanied by convergent molecular changes of multiple genes in species separated by millions of years from a common ancestor indicates that brain circuits for complex traits may have limited ways in which they could have evolved from that ancestor. Song-learning birds and humans share independently evolved similarities in brain pathways for vocal learning that are essential for song and speech and are not found in most other species. Comparisons of brain transcriptomes of song-learning birds and humans relative to vocal nonlearners identified convergent gene expression specializations in specific song and speech brain regions of avian vocal learners and humans. The strongest shared profiles relate bird motor and striatal song-learning nuclei, respectively, with human laryngeal motor cortex and parts of the striatum that control speech production and learning. Most of the associated genes function in motor control and brain connectivity. Thus, convergent behavior and neural connectivity for a complex trait are associated with convergent specialized expression of multiple genes.

Nature Biotechnology | 2005