Publication


Featured research published by Michael M. Hoffman.


Genome Research | 2012

ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia

Stephen G. Landt; Georgi K. Marinov; Anshul Kundaje; Pouya Kheradpour; Florencia Pauli; Serafim Batzoglou; Bradley E. Bernstein; Peter J. Bickel; James B. Brown; Philip Cayting; Yiwen Chen; Gilberto DeSalvo; Charles B. Epstein; Katherine I. Fisher-Aylor; Ghia Euskirchen; Mark Gerstein; Jason Gertz; Alexander J. Hartemink; Michael M. Hoffman; Vishwanath R. Iyer; Youngsook L. Jung; Subhradip Karmakar; Manolis Kellis; Peter V. Kharchenko; Qunhua Li; Tao Liu; X. Shirley Liu; Lijia Ma; Aleksandar Milosavljevic; Richard M. Myers

Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.


Nucleic Acids Research | 2013

Integrative annotation of chromatin elements from ENCODE data

Michael M. Hoffman; Jason Ernst; Steven P. Wilder; Anshul Kundaje; Robert S. Harris; Max Libbrecht; Belinda Giardine; Paul M. Ellenbogen; Jeff A. Bilmes; Ewan Birney; Ross C. Hardison; Ian Dunham; Manolis Kellis; William Stafford Noble

The ENCODE Project has generated a wealth of experimental information mapping diverse chromatin properties in several human cell lines. Although each such data track is independently informative toward the annotation of regulatory elements, their interrelations contain much richer information for the systematic annotation of regulatory elements. To uncover these interrelations and to generate an interpretable summary of the massive datasets of the ENCODE Project, we apply unsupervised learning methodologies, converting dozens of chromatin datasets into discrete annotation maps of regulatory regions and other chromatin elements across the human genome. These methods rediscover and summarize diverse aspects of chromatin architecture, elucidate the interplay between chromatin activity and RNA transcription, and reveal that a large proportion of the genome lies in a quiescent state, even across multiple cell types. The resulting annotation of non-coding regulatory elements correlates strongly with mammalian evolutionary constraint, and provides an unbiased approach for evaluating metrics of evolutionary constraint in human. Lastly, we use the regulatory annotations to revisit previously uncharacterized disease-associated loci, resulting in focused, testable hypotheses through the lens of the chromatin landscape.


Nature | 2014

Comparative analysis of metazoan chromatin organization

Joshua W. K. Ho; Youngsook L. Jung; Tao Liu; Burak H. Alver; Soohyun Lee; Kohta Ikegami; Kyung Ah Sohn; Aki Minoda; Michael Y. Tolstorukov; Alex Appert; Stephen C. J. Parker; Tingting Gu; Anshul Kundaje; Nicole C. Riddle; Eric P. Bishop; Thea A. Egelhofer; Sheng'En Shawn Hu; Artyom A. Alekseyenko; Andreas Rechtsteiner; Dalal Asker; Jason A. Belsky; Sarah K. Bowman; Q. Brent Chen; Ron Chen; Daniel S. Day; Yan Dong; Andréa C. Dosé; Xikun Duan; Charles B. Epstein; Sevinc Ercan

Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal ‘arms’, and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.


Nucleic Acids Research | 2004

AANT: the Amino Acid–Nucleotide Interaction Database

Michael M. Hoffman; Maksim Khrapov; J. Colin Cox; Jianchao Yao; Lingnan Tong; Andrew D. Ellington

We have created an Amino Acid-Nucleotide Interaction Database (AANT; http://aant.icmb.utexas.edu/) that categorizes all amino acid-nucleotide interactions from experimentally determined protein-nucleic acid structures, and provides users with a graphic interface for visualizing these interactions in aggregate. AANT accomplishes this by extracting individual amino acid-nucleotide interactions from structures in the Protein Data Bank, combining and superimposing these interactions into multiple structure files (e.g. 20 amino acids x 5 nucleotides) and grouping structurally similar interactions into more readily identifiable clusters. Using the Chime web browser plug-in, users can view 3D representations of the superimpositions and clusters. The unique collection and representation of data on amino acid-nucleotide interactions facilitates understanding the specificity of protein-nucleic acid interactions at a more fundamental level, and allows comparison of otherwise extremely disparate sets of structures. Moreover, by modularly representing the fundamental interactions that govern binding specificity it may prove possible to better engineer nucleic acid binding proteins.


Genome Biology | 2015

Extending reference assembly models

Deanna M. Church; Valerie Schneider; Karyn Meltz Steinberg; Michael C. Schatz; Aaron R. Quinlan; Chen Shan Chin; Paul Kitts; Bronwen Aken; Gabor T. Marth; Michael M. Hoffman; Javier Herrero; M. Lisandra Zepeda Mendoza; Richard Durbin; Paul Flicek

The human genome reference assembly is crucial for aligning and analyzing sequence data, and for genome annotation, among other roles. However, the models and analysis assumptions that underlie the current assembly need revising to fully represent human sequence diversity. Improved analysis tools and updated data reporting formats are also required.


Journal of the Royal Society Interface | 2018

Opportunities and obstacles for deep learning in biology and medicine

Travers Ching; Daniel Himmelstein; Brett K. Beaulieu-Jones; Alexandr A. Kalinin; Brian T. Do; Gregory P. Way; Enrico Ferrero; Paul-Michael Agapow; Michael Zietz; Michael M. Hoffman; Wei Xie; Gail Rosen; Benjamin J. Lengerich; Johnny Israeli; Jack Lanchantin; Stephen Woloszynek; Anne E. Carpenter; Avanti Shrikumar; Jinbo Xu; Evan M. Cofer; Christopher A. Lavender; Srinivas C. Turaga; Amr Alexandari; Zhiyong Lu; David J. Harris; Dave DeCaprio; Yanjun Qi; Anshul Kundaje; Yifan Peng; Laura Wiley

Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.


Bioinformatics | 2010

A dynamic Bayesian network for identifying protein-binding footprints from single molecule-based sequencing data

Xiaoyu Chen; Michael M. Hoffman; Jeff A. Bilmes; Jay R. Hesselberth; William Stafford Noble

Motivation: A global map of transcription factor binding sites (TFBSs) is critical to understanding gene regulation and genome function. DNaseI digestion of chromatin coupled with massively parallel sequencing (digital genomic footprinting) enables the identification of protein-binding footprints with high resolution on a genome-wide scale. However, accurately inferring the locations of these footprints remains a challenging computational problem. Results: We present a dynamic Bayesian network-based approach for the identification and assignment of statistical confidence estimates to protein-binding footprints from digital genomic footprinting data. The method, DBFP, allows footprints to be identified in a probabilistic framework and outperforms our previously described algorithm in terms of precision at a fixed recall. Applied to a digital footprinting data set from Saccharomyces cerevisiae, DBFP identifies 4679 statistically significant footprints within intergenic regions. These footprints are mainly located near transcription start sites and are strongly enriched for known TFBSs. Footprints containing no known motif are preferentially located proximal to other footprints, consistent with cooperative binding of these footprints. DBFP also identifies a set of statistically significant footprints in the yeast coding regions. Many of these footprints coincide with the boundaries of antisense transcripts, and the most significant footprints are enriched for binding sites of the chromatin-associated factors Abf1 and Rap1. Supplementary information: Supplementary material is available at Bioinformatics online.


Genome Research | 2010

An effective model for natural selection in promoters

Michael M. Hoffman; Ewan Birney

We have produced an evolutionary model for promoters, analogous to the commonly used synonymous/nonsynonymous mutation models for protein-coding sequences. Although our model, called Sunflower, relies on some simple assumptions, it captures enough of the biology of transcription factor action to show clear correlation with other biological features. Sunflower predicts a binding profile of transcription factors to DNA sequences, in which different factors compete for the same potential binding sites. The parametrized model simultaneously estimates a continuous measurement of binding occupancy across the genomic sequence for each factor. We can then introduce a localized mutation, rerun the binding model, and record the difference in binding profiles. A single mutation can alter interactions both upstream and downstream of its position due to potential overlapping binding sites, and our statistic captures this domino effect. Over evolutionary time, we observe a clear excess of low-scoring mutations fixed in promoters, consistent with most changes being neutral. However, this is not consistent across all promoters, and some promoters show more rapid divergence. This divergence often occurs in the presence of relatively constant protein-coding divergence. Interestingly, different classes of promoters show different sensitivity to mutations, with phosphorylation-related genes having promoters inherently more sensitive to mutations than immune genes. Although there have previously been a number of models attempting to handle transcription factor binding, Sunflower provides a richer biological model, incorporating weak binding sites and the possibility of competition. The results show the first clear correlations between such a model and evolutionary processes.


Genome Biology | 2016

ChromNet: Learning the human chromatin network from all ENCODE ChIP-seq data

Scott M. Lundberg; William B. Tu; Brian Raught; Linda Z. Penn; Michael M. Hoffman; Su-In Lee

A cell’s epigenome arises from interactions among regulatory factors—transcription factors and histone modifications—co-localized at particular genomic regions. We developed a novel statistical method, ChromNet, to infer a network of these interactions, the chromatin network, by inferring conditional-dependence relationships among a large number of ChIP-seq data sets. We applied ChromNet to all available 1451 ChIP-seq data sets from the ENCODE Project, and showed that ChromNet revealed previously known physical interactions better than alternative approaches. We experimentally validated one of the previously unreported interactions, MYC–HCFC1. An interactive visualization tool is available at http://chromnet.cs.washington.edu.
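The idea of conditional dependence mentioned in this abstract can be illustrated with a small sketch. It is not ChromNet itself: it assumes a toy chain of three signals (A drives B, B drives C) with a hand-written covariance matrix, and it uses partial correlations read off the inverse covariance matrix as a stand-in for whatever estimator the paper uses. The function names `invert`, `partial_correlations`, and `marginal_correlations` are hypothetical.

```python
import math

# Assumed toy covariance for a chain A -> B -> C, where
# B = A + noise and C = B + noise, all with unit-variance noise.
COV = [
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 2.0],
    [1.0, 2.0, 3.0],
]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def marginal_correlations(cov):
    """Ordinary pairwise correlations, ignoring all other variables."""
    n = len(cov)
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
        for i in range(n)
    ]


def partial_correlations(cov):
    """Correlation of each pair after conditioning on every other variable."""
    prec = invert(cov)  # the precision (inverse covariance) matrix
    n = len(cov)
    return [
        [
            1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    print("marginal A-C:", marginal_correlations(COV)[0][2])
    print("partial  A-C:", partial_correlations(COV)[0][2])
```

In this toy chain, A and C are clearly correlated on their own (about 0.577), but their partial correlation given B is zero, so only the direct A-B and B-C edges would remain in an inferred network.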


Bioinformatics | 2010

The Genomedata format for storing large-scale functional genomics data

Michael M. Hoffman; Orion J. Buske; William Stafford Noble

Summary: We present a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint. We have also developed utilities to load data into this format. We show that retrieving data from this format is more than 2900 times faster than a naive approach using wiggle files. Availability and Implementation: Reference implementation in Python and C components available at http://noble.gs.washington.edu/proj/genomedata/ under the GNU General Public License.
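The contrast this abstract draws between random access and a naive scan of a text file can be sketched as follows. This is not the Genomedata format or its API: the fixed-width float layout, the in-memory buffer, and the names `write_track`, `read_range`, and `naive_read_range` are all assumptions made for illustration, and the text input is a simplified one-value-per-line stand-in for a wiggle file.

```python
import io
import struct

# Assumed layout: one little-endian 32-bit float per genomic position.
RECORD = struct.Struct("<f")


def write_track(values):
    """Pack a list of numbers into a fixed-width binary buffer."""
    buf = io.BytesIO()
    for value in values:
        buf.write(RECORD.pack(value))
    return buf


def read_range(buf, start, end):
    """Random access: seek straight to `start` and read `end - start` records."""
    buf.seek(start * RECORD.size)
    raw = buf.read((end - start) * RECORD.size)
    return [item[0] for item in RECORD.iter_unpack(raw)]


def naive_read_range(text, start, end):
    """Naive approach: parse the text from the first line up to `end`."""
    out = []
    for pos, line in enumerate(text.splitlines()):
        if pos >= end:
            break
        if pos >= start:
            out.append(float(line))
    return out


if __name__ == "__main__":
    values = [0.5 * i for i in range(1000)]
    track = write_track(values)
    text = "\n".join(str(v) for v in values)
    print(read_range(track, 10, 13))
    print(naive_read_range(text, 10, 13))
```

Both functions return the same values, but the fixed-width layout lets the reader jump to a byte offset computed from the position, whereas the text scan must pass over every earlier line first.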

Collaboration


Dive into Michael M. Hoffman's collaborations.

Top Co-Authors


Jeff A. Bilmes

University of Washington


Brian Raught

Princess Margaret Cancer Centre


Eric G. Roberts

Princess Margaret Cancer Centre


Rachel C.W. Chan

University of British Columbia


William B. Tu

Princess Margaret Cancer Centre
