Bonnie Berger

Massachusetts Institute of Technology

Publication

Featured research published by Bonnie Berger.

Science | 2010

Identification of functional elements and regulatory circuits by Drosophila modENCODE

Sushmita Roy; Jason Ernst; Peter V. Kharchenko; Pouya Kheradpour; Nicolas Nègre; Matthew L. Eaton; Jane M. Landolin; Christopher A. Bristow; Lijia Ma; Michael F. Lin; Stefan Washietl; Bradley I. Arshinoff; Ferhat Ay; Patrick E. Meyer; Nicolas Robine; Nicole L. Washington; Luisa Di Stefano; Eugene Berezikov; Christopher D. Brown; Rogerio Candeias; Joseph W. Carlson; Adrian Carr; Irwin Jungreis; Daniel Marbach; Rachel Sealfon; Michael Y. Tolstorukov; Sebastian Will; Artyom A. Alekseyenko; Carlo G. Artieri; Benjamin W. Booth

From Genome to Regulatory Networks

For biologists, having a genome in hand is only the beginning: much more investigation is still needed to characterize how the genome is used to help produce a functional organism (see the Perspective by Blaxter). In this vein, Gerstein et al. (p. 1775) summarize for the Caenorhabditis elegans genome, and The modENCODE Consortium (p. 1787) summarize for the Drosophila melanogaster genome, full transcriptome analyses over developmental stages, genome-wide identification of transcription factor binding sites, and high-resolution maps of chromatin organization. Both studies identified highly occupied target (HOT) regions in the nematode and fly genomes, where DNA was bound by more than 15 of the transcription factors analyzed, and characterized the expression of related genes. Overall, the studies provide insights into the organization, structure, and function of the two genomes and provide basic information needed to guide and correlate both focused and genome-wide studies.

The Drosophila modENCODE project demonstrates the functional regulatory network of flies.

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.

Research in Computational Molecular Biology | 1998

Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete

Bonnie Berger; Tom Leighton

One of the simplest and most popular biophysical models of protein folding is the hydrophobic-hydrophilic (HP) model. The HP model abstracts the hydrophobic interaction in protein folding by labeling the amino acids as hydrophobic (H for nonpolar) or hydrophilic (P for polar). Chains of amino acids are configured as self-avoiding walks on the 3D cubic lattice, where an optimal conformation maximizes the number of adjacencies between Hs. In this paper, the protein folding problem under the HP model on the cubic lattice is shown to be NP-complete. This means that the protein folding problem belongs to a large set of problems that are believed to be computationally intractable.
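To make the objective concrete, here is a minimal Python sketch of scoring a conformation in the HP model as described above. It is an illustration, not the authors' construction: the function names are hypothetical, and it assumes that only H-H pairs that are lattice neighbors but not consecutive in the chain are counted (consecutive pairs are adjacent in every conformation, so they do not change which fold is optimal). The brute-force search enumerates every self-avoiding walk, so its running time grows exponentially with chain length and it is only usable for very short chains.

```python
# Unit steps to the six neighbors of a point on the 3D cubic lattice.
NEIGHBOR_STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def is_self_avoiding_walk(coords):
    """True if no lattice point repeats and consecutive points are lattice neighbors."""
    if len(set(coords)) != len(coords):
        return False
    for a, b in zip(coords, coords[1:]):
        if sum(abs(x - y) for x, y in zip(a, b)) != 1:
            return False
    return True


def hh_contacts(sequence, coords):
    """Count H-H pairs adjacent on the lattice but not consecutive in the chain.

    sequence: string over {'H', 'P'}; coords: list of integer (x, y, z) tuples.
    Assumption: chain-consecutive pairs are excluded from the count.
    """
    if len(sequence) != len(coords) or not is_self_avoiding_walk(coords):
        raise ValueError("coords must be a self-avoiding walk matching the sequence")
    index_at = {point: i for i, point in enumerate(coords)}
    contacts = 0
    for i, (x, y, z) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy, dz in NEIGHBOR_STEPS:
            j = index_at.get((x + dx, y + dy, z + dz))
            # j > i + 1 counts each non-consecutive pair exactly once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return contacts


def max_hh_contacts(sequence):
    """Exhaustively search all self-avoiding walks (exponential time)."""
    if not sequence:
        return 0
    best = 0

    def extend(coords, occupied):
        nonlocal best
        if len(coords) == len(sequence):
            best = max(best, hh_contacts(sequence, coords))
            return
        x, y, z = coords[-1]
        for dx, dy, dz in NEIGHBOR_STEPS:
            nxt = (x + dx, y + dy, z + dz)
            if nxt not in occupied:
                occupied.add(nxt)
                coords.append(nxt)
                extend(coords, occupied)
                coords.pop()
                occupied.remove(nxt)

    extend([(0, 0, 0)], {(0, 0, 0)})
    return best


# "HPPH" folded into a square brings its two H residues together.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(hh_contacts("HPPH", square))
```

The exhaustive search is included only to show what "optimal conformation" means for a toy chain; it is not a practical folding method.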

Proceedings of the National Academy of Sciences of the United States of America | 2008

Global alignment of multiple protein interaction networks with application to functional orthology detection

Rohit Singh; Jinbo Xu; Bonnie Berger

Protein–protein interactions (PPIs) and their networks play a central role in all biological processes. Akin to the complete sequencing of genomes and their comparative analysis, complete descriptions of interactomes and their comparative analysis are fundamental to a deeper understanding of biological processes. A first step in such an analysis is to align two or more PPI networks. Here, we introduce an algorithm, IsoRank, for global alignment of multiple PPI networks. The guiding intuition here is that a protein in one PPI network is a good match for a protein in another network if their respective sequences and neighborhood topologies are a good match. We encode this intuition as an eigenvalue problem in a manner analogous to Google's PageRank method. Using IsoRank, we compute a global alignment of the Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus, and Homo sapiens PPI networks. We demonstrate that incorporating PPI data in ortholog prediction results in improvements over existing sequence-only approaches and over predictions from local alignments of the yeast and fly networks. Previous methods have been effective at identifying conserved, localized network patterns across pairs of networks. This work takes the further step of performing a global alignment of multiple PPI networks. It simultaneously uses sequence similarity and network data and, unlike previous approaches, explicitly models the tradeoff inherent in combining them. We expect IsoRank, with its simultaneous handling of node similarity and network similarity, to be applicable across many scientific domains.
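The PageRank-style intuition can be sketched as a power iteration over pairwise match scores. The Python below is a hedged illustration for two small networks, not the published implementation: the function name `isorank_scores`, the specific update rule (each node pair spreads its score equally over all pairs of its neighbors), the mixing weight `alpha`, and the stopping rule are all assumptions chosen to mirror the description above, where topology and sequence similarity are traded off explicitly.

```python
def isorank_scores(neighbors1, neighbors2, seq_sim, alpha=0.6, max_iter=200, tol=1e-12):
    """Iteratively score node pairs (i, j) across two undirected networks.

    neighbors1, neighbors2: adjacency lists (list of lists of node indices).
    seq_sim: n1 x n2 matrix of nonnegative sequence-similarity scores.
    alpha: assumed weight on topology versus sequence similarity, 0 <= alpha < 1.
    Returns an n1 x n2 matrix of scores that sums to 1.
    """
    n1, n2 = len(neighbors1), len(neighbors2)
    total = sum(sum(row) for row in seq_sim)
    if total <= 0:
        raise ValueError("seq_sim must contain some positive similarity")
    prior = [[s / total for s in row] for row in seq_sim]

    # Start from a uniform score over all node pairs.
    scores = [[1.0 / (n1 * n2)] * n2 for _ in range(n1)]
    for _ in range(max_iter):
        spread = [[0.0] * n2 for _ in range(n1)]
        for u in range(n1):
            for v in range(n2):
                pair_count = len(neighbors1[u]) * len(neighbors2[v])
                if pair_count == 0:
                    continue
                # Pair (u, v) supports every pair of its neighbors equally.
                share = scores[u][v] / pair_count
                for i in neighbors1[u]:
                    for j in neighbors2[v]:
                        spread[i][j] += share
        blended = [
            [alpha * spread[i][j] + (1 - alpha) * prior[i][j] for j in range(n2)]
            for i in range(n1)
        ]
        norm = sum(sum(row) for row in blended)
        blended = [[x / norm for x in row] for row in blended]
        change = max(
            abs(blended[i][j] - scores[i][j]) for i in range(n1) for j in range(n2)
        )
        scores = blended
        if change < tol:
            break
    return scores


# Two identical 3-node paths (0-1-2) with uninformative sequence similarity.
path = [[1], [0, 2], [1]]
uniform = [[1.0] * 3 for _ in range(3)]
result = isorank_scores(path, path, uniform)
best = max((result[i][j], i, j) for i in range(3) for j in range(3))
print(best[1], best[2])  # the two middle nodes score highest: prints "1 1"
```

Turning the score matrix into an actual alignment needs a separate matching step, which is left out here because the text above does not describe it.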

Bioinformatics | 2009

IsoRankN: spectral methods for global alignment of multiple protein networks

Chung-Shou Liao; Kanghao Lu; Michael H. Baym; Rohit Singh; Bonnie Berger

Motivation: With the increasing availability of large protein–protein interaction networks, the question of protein network alignment is becoming central to systems biology. Network alignment is further delineated into two sub-problems: local alignment, to find small conserved motifs across networks, and global alignment, which attempts to find a best mapping between all nodes of the two networks. In this article, our aim is to improve upon existing global alignment results. Better network alignment will enable, among other things, more accurate identification of functional orthologs across species. Results: We introduce IsoRankN (IsoRank-Nibble), a global multiple-network alignment tool based on spectral clustering on the induced graph of pairwise alignment scores. IsoRankN outperforms existing algorithms for global network alignment in coverage and consistency on multiple alignments of the five available eukaryotic networks. Being based on spectral methods, IsoRankN is both error tolerant and computationally efficient. Availability: Our software is available freely for non-commercial purposes on request from: http://isorank.csail.mit.edu/ Contact: [email protected]

Research in Computational Molecular Biology | 2007

Pairwise global alignment of protein interaction networks by matching neighborhood topology

Rohit Singh; Jinbo Xu; Bonnie Berger

We describe an algorithm, IsoRank, for global alignment of two protein-protein interaction (PPI) networks. IsoRank aims to maximize the overall match between the two networks; in contrast, much previous work has focused on the local alignment problem: identifying many possible alignments, each corresponding to a local region of similarity. IsoRank is guided by the intuition that a protein should be matched with a protein in the other network if and only if the neighbors of the two proteins can also be well matched. We encode this intuition as an eigenvalue problem, in a manner analogous to Google's PageRank method. We use IsoRank to compute the first known global alignment between the S. cerevisiae and D. melanogaster PPI networks. The common subgraph has 1420 edges and describes conserved functional components between the two species. Comparisons of our results with those of a well-known algorithm for local network alignment indicate that the globally optimized alignment resolves ambiguity introduced by multiple local alignments. Finally, we interpret the results of global alignment to identify functional orthologs between yeast and fly; our functional ortholog prediction method is much simpler than a recently proposed approach and yet provides results that are more comprehensive.

Journal of Computational Biology | 1998

Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete.

Bonnie Berger; Tom Leighton

Nature Genetics | 2015

Efficient Bayesian mixed model analysis increases association power in large cohorts

Po-Ru Loh; George Tucker; Brendan Bulik-Sullivan; Bjarni J. Vilhjálmsson; Hilary Finucane; Rany M. Salem; Daniel I. Chasman; Paul M. Ridker; Benjamin M. Neale; Bonnie Berger; Nick Patterson; Alkes L. Price

Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts and may not optimize power. All existing methods require time cost O(MN^2) (where N is the number of samples and M is the number of SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here we present a far more efficient mixed-model association method, BOLT-LMM, which requires only a small number of O(MN) time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to 9 quantitative traits in 23,294 samples from the Women's Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for genome-wide association studies in large cohorts.

PLOS Computational Biology | 2008

Matt: Local Flexibility Aids Protein Multiple Structure Alignment

Matthew Menke; Bonnie Berger; Lenore J. Cowen

Even when there is agreement on what measure a protein multiple structure alignment should be optimizing, finding the optimal alignment is computationally prohibitive. One approach used by many previous methods is aligned fragment pair chaining, where short structural fragments from all the proteins are aligned against each other optimally, and the final alignment chains these together in geometrically consistent ways. Ye and Godzik have recently suggested that adding geometric flexibility may help better model protein structures in a variety of contexts. We introduce the program Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments: small translations and rotations are temporarily allowed to bring sets of aligned fragments closer, even if they are physically impossible under rigid body transformations. After a dynamic programming assembly guided by these “bent” alignments, geometric consistency is restored in the final step before the alignment is output. Matt is tested against other recent multiple protein structure alignment programs on the popular Homstrad and SABmark benchmark datasets. Matt's global performance is competitive with the other programs on Homstrad, and Matt outperforms them on SABmark, a benchmark of multiple structure alignments of proteins with more distant homology. On both datasets, Matt demonstrates an ability to better align the ends of α-helices and β-strands, an important characteristic of any structure alignment program intended to help construct a structural template library for threading approaches to the inverse protein-folding problem. The related question of whether Matt alignments can be used to distinguish distantly homologous structure pairs from pairs of proteins that are not homologous is also considered. For this purpose, a p-value score based on the length of the common core and average root mean squared deviation (RMSD) of Matt alignments is shown to largely separate decoys from homologous protein structures in the SABmark benchmark dataset. We postulate that Matt's strong performance comes from its ability to model proteins in different conformational states and, perhaps even more important, its ability to model backbone distortions in more distantly related proteins.

Journal of Molecular Biology | 1999

LearnCoil-VMF: computational evidence for coiled-coil-like motifs in many viral membrane-fusion proteins.

Mona Singh; Bonnie Berger; Peter S. Kim

Crystallographic studies have shown that the coiled-coil motif occurs in several viral membrane-fusion proteins, including HIV-1 gp41 and influenza virus hemagglutinin. Here, the LearnCoil-VMF program was designed as a specialized program for identifying coiled-coil-like regions in viral membrane-fusion proteins. Based upon the use of LearnCoil-VMF, as well as other computational tools, we report detailed sequence analyses of coiled-coil-like regions in retrovirus, paramyxovirus and filovirus membrane-fusion proteins. Additionally, sequence analyses of these proteins outside their putative coiled-coil domains illustrate some structural differences between them. Complementing previous crystallographic studies, the coiled-coil-like regions detected by LearnCoil-VMF provide further evidence that the three-stranded coiled coil is a common motif found in many diverse viral membrane-fusion proteins. The abundance and structural conservation of this motif, even in the absence of sequence homology, suggest that it is critical for viral-cellular membrane fusion. The LearnCoil-VMF program is available at http://web.wi.mit.edu/kim

Genetics | 2013

Inferring Admixture Histories of Human Populations Using Linkage Disequilibrium

Po-Ru Loh; Mark Lipson; Nick Patterson; Priya Moorjani; Joseph K. Pickrell; David Reich; Bonnie Berger

Long-range migrations and the resulting admixtures between populations have been important forces shaping human genetic diversity. Most existing methods for detecting and reconstructing historical admixture events are based on allele frequency divergences or patterns of ancestry segments in chromosomes of admixed individuals. An emerging new approach harnesses the exponential decay of admixture-induced linkage disequilibrium (LD) as a function of genetic distance. Here, we comprehensively develop LD-based inference into a versatile tool for investigating admixture. We present a new weighted LD statistic that can be used to infer mixture proportions as well as dates with fewer constraints on reference populations than previous methods. We define an LD-based three-population test for admixture and identify scenarios in which it can detect admixture events that previous formal tests cannot. We further show that we can uncover phylogenetic relationships among populations by comparing weighted LD curves obtained using a suite of references. Finally, we describe several improvements to the computation and fitting of weighted LD curves that greatly increase the robustness and speed of the calculations. We implement all of these advances in a software package, ALDER, which we validate in simulations and apply to test for admixture among all populations from the Human Genome Diversity Project (HGDP), highlighting insights into the admixture history of Central African Pygmies, Sardinians, and Japanese.
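The idea of reading an admixture date off the exponential decay of LD with genetic distance can be illustrated with a simple curve fit. The Python sketch below is not ALDER and does not compute the weighted LD statistic described above. It assumes, for illustration only, a noise-free curve of the form a(d) = A * exp(-n * d), with d in Morgans and n interpreted as generations since admixture, and recovers A and n by ordinary least squares on log(a). The function name and the synthetic values are hypothetical.

```python
import math


def fit_exponential_decay(distances, ld_values):
    """Fit ld = amplitude * exp(-rate * distance) by least squares on log(ld).

    Assumes all ld_values are positive; a real LD curve is noisy and may
    include an offset term, which this sketch ignores.
    Returns (amplitude, rate).
    """
    if len(distances) != len(ld_values) or len(distances) < 2:
        raise ValueError("need at least two (distance, ld) points")
    if any(v <= 0 for v in ld_values):
        raise ValueError("log-linear fit requires positive ld values")
    logs = [math.log(v) for v in ld_values]
    n = len(distances)
    mean_d = sum(distances) / n
    mean_log = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    if sxx == 0:
        raise ValueError("distances must not all be equal")
    sxy = sum((d - mean_d) * (l - mean_log) for d, l in zip(distances, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_d
    return math.exp(intercept), -slope


# Synthetic curve: hypothetical amplitude 0.02 and decay rate 50.
distances = [0.005 * k for k in range(1, 21)]  # genetic distance in Morgans (assumed unit)
curve = [0.02 * math.exp(-50 * d) for d in distances]
amplitude, rate = fit_exponential_decay(distances, curve)
print(round(amplitude, 4), round(rate, 2))
```

On real data the fit would need to handle noise, an affine term, and a minimum-distance cutoff to avoid background LD; those choices are outside this sketch.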