
Publication


Featured research published by Wouter Boomsma.


Systematic Biology | 2008

Statistical Assignment of DNA Sequences Using Bayesian Phylogenetics

Kasper Munch; Wouter Boomsma; John P. Huelsenbeck; Rasmus Nielsen

We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignments based on the best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are composed of a combination of both Neanderthal and modern human DNA.


Proceedings of the National Academy of Sciences of the United States of America | 2008

A generative, probabilistic model of local protein structure

Wouter Boomsma; Kanti V. Mardia; Charles C. Taylor; Jesper Ferkinghoff-Borg; Anders Krogh; Thomas Hamelryck

Despite significant progress in recent years, protein structure prediction maintains its status as one of the prime unsolved problems in computational biology. One of the key remaining challenges is an efficient probabilistic exploration of the structural space that correctly reflects the relative conformational stabilities. Here, we present a fully probabilistic, continuous model of local protein structure in atomic detail. The generative model makes efficient conformational sampling possible and provides a framework for the rigorous analysis of local sequence–structure correlations in the native state. Our method represents a significant theoretical and practical improvement over the widely used fragment assembly technique by avoiding the drawbacks associated with a discrete and nonprobabilistic approach.


PLOS Computational Biology | 2014

Combining Experiments and Simulations Using the Maximum Entropy Principle

Wouter Boomsma; Jesper Ferkinghoff-Borg; Kresten Lindorff-Larsen

A key component of computational biology is to compare the results of computer modelling with experimental measurements. Despite substantial progress in the models and algorithms used in many areas of computational biology, such comparisons sometimes reveal that the computations are not in quantitative agreement with experimental data. The principle of maximum entropy is a general procedure for constructing probability distributions in the light of new data, making it a natural tool in cases when an initial model provides results that are at odds with experiments. The number of maximum entropy applications in our field has grown steadily in recent years, in areas as diverse as sequence analysis, structural modelling, and neurobiology. In this Perspectives article, we give a broad introduction to the method, in an attempt to encourage its further adoption. The general procedure is explained in the context of a simple example, after which we proceed with a real-world application in the field of molecular simulations, where the maximum entropy procedure has recently provided new insight. Given the limited accuracy of force fields, macromolecular simulations sometimes produce results that are not in complete and quantitative accordance with experiments. A common solution to this problem is to explicitly ensure agreement between the two by perturbing the potential energy function towards the experimental data. So far, a general consensus for how such perturbations should be implemented has been lacking. Three very recent papers have explored this problem using the maximum entropy approach, providing both new theoretical and practical insights to the problem. We highlight each of these contributions in turn and conclude with a discussion on remaining challenges.
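The idea in the abstract above, adjusting an initial model as little as possible so that it agrees with a measured average, can be illustrated with a small sketch. This is not code from the article: the function name `maxent_reweight`, the toy values, and the use of bisection are assumptions made for illustration. It relies on the standard result that the minimally perturbed distribution has weights proportional to the prior weight times exp(-lambda * f), with lambda chosen so that the reweighted mean of f matches the target. The sketch assumes every prior weight is strictly positive.

```python
import math

def maxent_reweight(values, target, prior=None, tol=1e-10, max_iter=200):
    """Hypothetical helper: find weights w_i proportional to
    prior_i * exp(-lam * values_i) whose weighted mean equals `target`.
    Assumes all prior weights are strictly positive."""
    n = len(values)
    if prior is None:
        prior = [1.0 / n] * n  # uniform initial model
    if not (min(values) < target < max(values)):
        raise ValueError("target must lie strictly between min and max of values")

    def weights(lam):
        exps = [-lam * v for v in values]
        m = max(exps)  # shift exponents to avoid overflow
        raw = [p * math.exp(e - m) for p, e in zip(prior, exps)]
        z = sum(raw)
        return [r / z for r in raw]

    def mean(lam):
        return sum(w * v for w, v in zip(weights(lam), values))

    # The reweighted mean decreases monotonically with lam,
    # so expand a bracket around the solution and then bisect.
    lo, hi = -1.0, 1.0
    while mean(lo) < target:
        lo *= 2
    while mean(hi) > target:
        hi *= 2
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mean(mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return lam, weights(lam)

# Toy usage: four sampled values of an observable with uniform prior
# weights (mean 1.5), reweighted to match a "measured" average of 1.0.
values = [0.0, 1.0, 2.0, 3.0]
lam, w = maxent_reweight(values, target=1.0)
print(lam, w, sum(wi * vi for wi, vi in zip(w, values)))
```

In a simulation setting, the same exponential factor corresponds to adding a term linear in the observable to the energy function, which is the kind of perturbation the abstract refers to.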


Philosophical Transactions of the Royal Society B | 2008

Fast phylogenetic DNA barcoding

Kasper Munch; Wouter Boomsma; Rasmus Nielsen

We present a heuristic approach to the DNA assignment problem based on phylogenetic inferences using constrained neighbour joining and non-parametric bootstrapping. We show that this method performs as well as the more computationally intensive full Bayesian approach in an analysis of 500 insect DNA sequences obtained from GenBank. We also analyse a previously published dataset of environmental DNA sequences from soil from New Zealand and Siberia, and use these data to illustrate the fact that statistical approaches to the DNA assignment problem allow for more appropriate criteria for determining the taxonomic level at which a particular DNA sequence can be assigned.


PLOS ONE | 2010

Potentials of Mean Force for Protein Structure Prediction Vindicated, Formalized and Generalized

Thomas Hamelryck; Mikael Borg; Martin Paluszewski; Jonas Paulsen; Jes Frellsen; Christian Andreetta; Wouter Boomsma; Sandro Bottaro; Jesper Ferkinghoff-Borg

Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances – so-called “potentials of mean force” (PMFs) – have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state – a necessary component of these potentials – is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities “reference ratio distributions” deriving from the application of the “reference ratio method.” This new view is not only of theoretical relevance but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.


Journal of the American Chemical Society | 2015

Structure of a functional amyloid protein subunit computed using sequence variation

Pengfei Tian; Wouter Boomsma; Yong Wang; Daniel E. Otzen; Mogens H. Jensen; Kresten Lindorff-Larsen

Functional amyloid fibers, called curli, play a critical role in adhesion and invasion of many bacteria. Unlike pathological amyloids, curli structures are formed by polypeptide sequences whose amyloid structure has been selected for during evolution. This important distinction provides us with an opportunity to obtain structural insights from an unexpected source: the covariation of amino acids in sequences of different curli proteins. We used recently developed methods to extract amino acid contacts from a multiple sequence alignment of homologues of the curli subunit protein, CsgA. Together with an efficient force field, these contacts allow us to determine structural models of CsgA. We find that CsgA forms a β-helical structure, where each turn corresponds to previously identified repeat sequences in CsgA. The proposed structure is validated by previously measured solid-state NMR, electron microscopy, and X-ray diffraction data and agrees with an earlier proposed model derived by complementary means.


BMC Bioinformatics | 2010

Beyond rotamers: a generative, probabilistic model of side chains in proteins

Tim Harder; Wouter Boomsma; Martin Paluszewski; Jes Frellsen; Kristoffer E. Johansson; Thomas Hamelryck

Background: Accurately covering the conformational space of amino acid side chains is essential for important applications such as protein design, docking and high resolution structure prediction. Today, the most common way to capture this conformational space is through rotamer libraries - discrete collections of side chain conformations derived from experimentally determined protein structures. The discretization can be exploited to efficiently search the conformational space. However, discretizing this naturally continuous space comes at the cost of losing detailed information that is crucial for certain applications. For example, rigorously combining rotamers with physical force fields is associated with numerous problems. Results: In this work we present BASILISK: a generative, probabilistic model of the conformational space of side chains that makes it possible to sample in continuous space. In addition, sampling can be conditional upon the protein's detailed backbone conformation, again in continuous space - without involving discretization. Conclusions: A careful analysis of the model and a comparison with various rotamer libraries indicates that the model forms an excellent, fully continuous model of side chain conformational space. We also illustrate how the model can be used for rigorous, unbiased sampling with a physical force field, and how it improves side chain prediction when used as a pseudo-energy term. In conclusion, BASILISK is an important step forward on the way to a rigorous probabilistic description of protein structure in continuous space and in atomic detail.


Journal of Chemical Theory and Computation | 2014

Probabilistic Determination of Native State Ensembles of Proteins

Simon Olsson; Beat Vögeli; Andrea Cavalli; Wouter Boomsma; Jesper Ferkinghoff-Borg; Kresten Lindorff-Larsen; Thomas Hamelryck

The motions of biological macromolecules are tightly coupled to their functions. However, while the study of fast motions has become increasingly feasible in recent years, the study of slower, biologically important motions remains difficult. Here, we present a method to construct native state ensembles of proteins by the combination of physical force fields and experimental data through modern statistical methodology. As an example, we use NMR residual dipolar couplings to determine a native state ensemble of the extensively studied third immunoglobulin binding domain of protein G (GB3). The ensemble accurately describes both local and nonlocal backbone fluctuations as judged by its reproduction of complementary experimental data. While it is difficult to assess precise time-scales of the observed motions, our results suggest that it is possible to construct realistic conformational ensembles of biomolecules very efficiently. The approach may allow for a dramatic reduction in the computational as well as experimental resources needed to obtain accurate conformational ensembles of biological macromolecules in a statistically sound manner.


PLOS ONE | 2013

Inference of structure ensembles of flexible biomolecules from sparse, averaged data

Simon Olsson; Jes Frellsen; Wouter Boomsma; Kanti V. Mardia; Thomas Hamelryck

We present the theoretical foundations of a general principle to infer structure ensembles of flexible biomolecules from spatially and temporally averaged data obtained in biophysical experiments. The central idea is to compute the Kullback-Leibler optimal modification of a given prior distribution with respect to the experimental data and its uncertainty. This principle generalizes the successful inferential structure determination method and recently proposed maximum entropy methods. Tractability of the protocol is demonstrated through the analysis of simulated nuclear magnetic resonance spectroscopy data of a small peptide.


PLOS ONE | 2015

Comparing molecular dynamics force fields in the essential subspace

Fernando Martín-García; Elena Papaleo; Paulino Gómez-Puertas; Wouter Boomsma; Kresten Lindorff-Larsen

The continued development and utility of molecular dynamics simulations requires improvements in both the physical models used (force fields) and in our ability to sample the Boltzmann distribution of these models. Recent developments in both areas have made available multi-microsecond simulations of two proteins, ubiquitin and Protein G, using a number of different force fields. Although these force fields mostly share a common mathematical form, they differ in their parameters and in the philosophy by which these were derived, and previous analyses showed varying levels of agreement with experimental NMR data. To complement the comparison to experiments, we have performed a structural analysis of and comparison between these simulations, thereby providing insight into the relationship between force-field parameterization, the resulting ensemble of conformations and the agreement with experiments. In particular, our results show that, at a coarse level, many of the motional properties are preserved across several, though not all, force fields. At a finer level of detail, however, there are distinct differences in both the structure and dynamics of the two proteins, which can, together with comparison with experimental data, help to select force fields for simulations of proteins. A noteworthy observation is that force fields that have been reparameterized and improved to provide a more accurate energetic description of the balance between helical and coil structures are difficult to distinguish from their “unbalanced” counterparts in these simulations. This observation implies that simulations of stable, folded proteins, even those reaching 10 microseconds in length, may provide relatively little information that can be used to modify torsion parameters to achieve an accurate balance between different secondary structural elements.

Collaboration


Dive into Wouter Boomsma's collaborations.

Top Co-Authors

Jesper Ferkinghoff-Borg

Technical University of Denmark

Jes Frellsen

University of Copenhagen

Pengfei Tian

University of Copenhagen

Yong Wang

University of Copenhagen

Sandro Bottaro

International School for Advanced Studies

Simon Olsson

University of Copenhagen
