Martin Paluszewski
University of Copenhagen
Publication
Featured research published by Martin Paluszewski.
PLOS ONE | 2010
Thomas Hamelryck; Mikael Borg; Martin Paluszewski; Jonas Paulsen; Jes Frellsen; Christian Andreetta; Wouter Boomsma; Sandro Bottaro; Jesper Ferkinghoff-Borg
Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances – so-called “potentials of mean force” (PMFs) – have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state – a necessary component of these potentials – is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities “reference ratio distributions” deriving from the application of the “reference ratio method.” This new view is not only of theoretical relevance but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
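As an illustration of how probability distributions over different features might be combined, here is a minimal Python sketch on a toy discrete state space. It assumes the reference ratio form in which a fine-grained distribution q(x) is multiplied by the ratio of a target feature distribution to the feature distribution that q itself implies; that formula is an assumption, since the abstract does not state it. The function name, state labels and numbers are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def reference_ratio(q, feature, target):
    """Reweight q so that a coarse-grained feature follows `target`.

    q:       dict mapping state -> probability under the original model
    feature: function mapping a state to its coarse-grained feature value
    target:  dict mapping feature value -> desired probability
    (All three are illustrative inputs, assumed for this sketch.)
    """
    # Reference distribution: the feature marginal implied by q itself.
    reference = defaultdict(float)
    for state, prob in q.items():
        reference[feature(state)] += prob
    # Multiply each state's probability by target / reference for its feature.
    return {
        state: prob * target[feature(state)] / reference[feature(state)]
        for state, prob in q.items()
    }


# Toy example: four conformations, each either "compact" or "extended".
q = {"a": 0.1, "b": 0.3, "c": 0.4, "d": 0.2}


def compactness(state):
    return "compact" if state in ("a", "b") else "extended"


target = {"compact": 0.8, "extended": 0.2}
p = reference_ratio(q, compactness, target)
```

In this toy case the reweighted distribution has the requested feature marginal, while the relative probabilities of states sharing the same feature value are left as q assigned them. The reference state is computed from q rather than supplied from outside.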
BMC Bioinformatics | 2010
Tim Harder; Wouter Boomsma; Martin Paluszewski; Jes Frellsen; Kristoffer E. Johansson; Thomas Hamelryck
Background: Accurately covering the conformational space of amino acid side chains is essential for important applications such as protein design, docking and high resolution structure prediction. Today, the most common way to capture this conformational space is through rotamer libraries - discrete collections of side chain conformations derived from experimentally determined protein structures. The discretization can be exploited to efficiently search the conformational space. However, discretizing this naturally continuous space comes at the cost of losing detailed information that is crucial for certain applications. For example, rigorously combining rotamers with physical force fields is associated with numerous problems. Results: In this work we present BASILISK: a generative, probabilistic model of the conformational space of side chains that makes it possible to sample in continuous space. In addition, sampling can be conditional upon the protein's detailed backbone conformation, again in continuous space - without involving discretization. Conclusions: A careful analysis of the model and a comparison with various rotamer libraries indicate that the model forms an excellent, fully continuous model of side chain conformational space. We also illustrate how the model can be used for rigorous, unbiased sampling with a physical force field, and how it improves side chain prediction when used as a pseudo-energy term. In conclusion, BASILISK is an important step forward on the way to a rigorous probabilistic description of protein structure in continuous space and in atomic detail.
Proteins | 2009
Martin Paluszewski; Kevin Karplus
Given a set of alternative models for a specific protein sequence, the model quality assessment (MQA) problem asks for an assignment of scores to each model in the set. A good MQA program assigns these scores such that they correlate well with the real quality of the models, ideally giving the best score to the model that is closest to the true structure. In this article, we present a new approach for addressing the MQA problem. It is based on distance constraints extracted from alignments to templates of known structure, and is implemented in the Undertaker program for protein structure prediction. One novel feature is that we extract noncontact constraints as well as contact constraints. We describe how the distance constraints are extracted and we show how they can be used to address the MQA problem. We have compared our method on CASP7 targets and the results show that our method is at least comparable with the best MQA methods that were assessed at CASP7. We also propose a new evaluation measure, Kendall's τ, that is more interpretable than conventional measures used for evaluating MQA methods (Pearson's r and Spearman's ρ). We show clear examples where Kendall's τ agrees much more with our intuition of a correct MQA, and we therefore propose that Kendall's τ be used for future CASP MQA assessments. Proteins 2009.
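To make the interpretability of Kendall's τ concrete, the sketch below computes it for a handful of models by counting concordant and discordant pairs. It assumes the simple tau-a variant with no tie correction; the abstract does not say which variant the authors use, and the scores shown are invented for illustration.

```python
from itertools import combinations


def kendall_tau(predicted, actual):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant (an assumption
    of this sketch; other variants correct for ties).
    """
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    concordant = discordant = 0
    for (p1, a1), (p2, a2) in combinations(zip(predicted, actual), 2):
        agreement = (p1 - p2) * (a1 - a2)
        if agreement > 0:
            concordant += 1      # both rankings order this pair the same way
        elif agreement < 0:
            discordant += 1      # the rankings disagree on this pair
    n_pairs = len(predicted) * (len(predicted) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical MQA scores and true qualities for four models.
mqa_scores = [0.9, 0.7, 0.4, 0.2]
true_quality = [0.80, 0.85, 0.50, 0.30]
tau = kendall_tau(mqa_scores, true_quality)
```

Here five of the six model pairs are ordered correctly and one is swapped, so τ is (5 - 1) / 6. The value reads directly as the fraction of correctly ordered pairs minus the fraction of incorrectly ordered pairs.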
great lakes symposium on vlsi | 2004
Martin Paluszewski; Pawel Winter; Martin Zachariasen
The interest in alternatives to traditional Manhattan routing has increased tremendously during recent years. The so-called Y- and X-architectures have been proposed as architectures of the future. Manhattan, Y- and X-architectures are special cases of a general architecture in which a fixed set of uniformly oriented directions is allowed. In this paper we present a new paradigm for routing in this general architecture. The routing algorithm is based on a concept of flexibility polygons for Steiner minimum trees, a new way of describing the inherent flexibility of Steiner trees in uniform orientation metrics. Flexibility polygons characterize possible routing regions for the nets while keeping their netlength at a minimum. The proposed routing algorithm first routes nets that intersect highly congested areas of the chip, as given by the flexibility polygons, and then employs dynamic maze (liquid) routing. Experiments with industrial chips show great promise for this new routing paradigm.
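For readers unfamiliar with uniform orientation metrics, the following Python sketch computes the shortest wire length between two points when wires may only follow a fixed number of evenly spaced orientations. It assumes the orientations are multiples of π divided by the number of orientations, starting at the horizontal, so that 2 gives Manhattan routing and 4 gives the X-architecture. The function name is hypothetical and this is not the paper's routing algorithm, only the underlying distance.

```python
import math


def uniform_orientation_distance(dx, dy, num_orientations):
    """Shortest path length for displacement (dx, dy) using only wires along
    `num_orientations` evenly spaced orientations (assumed to start at 0 rad)."""
    if num_orientations < 2:
        raise ValueError("at least two orientations are needed to reach every point")
    length = math.hypot(dx, dy)
    if length == 0.0:
        return 0.0
    omega = math.pi / num_orientations      # angle between adjacent legal directions
    theta = math.atan2(dy, dx) % omega      # angle above the nearest legal direction
    # Decompose the displacement along the two legal directions that enclose it
    # (law of sines in the triangle formed by the two wire segments).
    return length * (math.sin(theta) + math.sin(omega - theta)) / math.sin(omega)


manhattan = uniform_orientation_distance(3, 4, 2)   # |dx| + |dy|
octilinear = uniform_orientation_distance(2, 1, 4)  # one unit straight, one diagonal
```

With two orientations the formula reduces to |dx| + |dy|. With four, the displacement (2, 1) costs one horizontal unit plus one diagonal of length √2.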
BMC Bioinformatics | 2010
Martin Paluszewski; Thomas Hamelryck
Background: Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations). Results: The program package is freely available under the GNU General Public Licence (GPL) from SourceForge (http://sourceforge.net/projects/mocapy). The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions: Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful for formulating probabilistic models of protein and RNA structure in atomic detail.
Proteins | 2009
John G. Archie; Martin Paluszewski; Kevin Karplus
Our group tested three quality assessment functions in CASP8: a function which used only distance constraints derived from alignments (SAM‐T08‐MQAO), a function which added other single‐model terms to the distance constraints (SAM‐T08‐MQAU), and a function which used both single‐model and consensus terms (SAM‐T08‐MQAC). We analyzed the functions both for ranking models for a single target and for producing an accurate estimate of GDT_TS. Our functions were optimized for the ranking problem, so are perhaps more appropriate for metaserver applications than for providing trustworthiness estimates for single models. On the CASP8 test, the functions with more terms performed better. The MQAC consensus method was substantially better than either single‐model function, and the MQAU function was substantially better than the MQAO function that used only constraints from alignments. Proteins 2009.
Algorithms for Molecular Biology | 2006
Martin Paluszewski; Thomas Hamelryck; Pawel Winter
Background: A new, promising solvent exposure measure, called half-sphere-exposure (HSE), has recently been proposed. Here, we study the reconstruction of a protein's Cα trace solely from structure-derived HSE information. This problem is of relevance for de novo structure prediction using predicted HSE measures. For comparison, we also consider the well-established contact number (CN) measure. We define energy functions based on the HSE- or CN-vectors and minimize them using two conformational search heuristics: Monte Carlo simulation (MCS) and tabu search (TS). While MCS has been the dominant conformational search heuristic in the literature, TS has been applied only a few times. To discretize the conformational space, we use lattice models of varying complexity. Results: The proposed TS heuristic with a novel tabu definition generally performs better than MCS for this problem. Our experiments show that, at least for small proteins (up to 35 amino acids), it is possible to reconstruct the protein backbone solely from the HSE or CN information. In general, the HSE measure leads to better models than the CN measure, as judged by the RMSD and the angle correlation with the native structure. The angle correlation, a measure of structural similarity, evaluates whether equivalent residues in two structures have the same general orientation. Our results indicate that the HSE measure is potentially very useful to represent solvent exposure in protein structure prediction, design and simulation.
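The two exposure measures can be sketched in Python as follows. The sketch assumes a Cα-only approximation in which the "up" half-sphere of a residue points along the sum of the vectors from its two chain neighbours to the residue itself. The abstract does not specify this construction, the radius is left as a parameter, and the coordinates are toy values rather than a real protein trace.

```python
import math


def contact_number(coords, i, radius):
    """Count the other C-alpha atoms within `radius` of residue i."""
    return sum(
        1 for j, other in enumerate(coords)
        if j != i and math.dist(coords[i], other) <= radius
    )


def half_sphere_exposure(coords, i, radius):
    """Return (up, down) neighbour counts for an interior residue i.

    Assumption of this sketch: the half-sphere axis is the sum of the vectors
    from the previous and the next C-alpha atom to residue i.
    """
    if not 0 < i < len(coords) - 1:
        raise ValueError("residue needs both chain neighbours")
    prev, here, nxt = coords[i - 1], coords[i], coords[i + 1]
    axis = [2 * h - p - n for p, h, n in zip(prev, here, nxt)]
    up = down = 0
    for j, other in enumerate(coords):
        if j == i or math.dist(here, other) > radius:
            continue
        offset = [o - h for o, h in zip(other, here)]
        # The sign of the dot product decides which half-sphere the atom is in.
        if sum(a * b for a, b in zip(axis, offset)) >= 0:
            up += 1
        else:
            down += 1
    return up, down


# Toy coordinates (not a realistic chain geometry).
trace = [(-1, -1, 0), (0, 0, 0), (1, -1, 0), (0, 2, 0), (0, -3, 0)]
up, down = half_sphere_exposure(trace, 1, 2.5)
cn = contact_number(trace, 1, 2.5)
```

The contact number is the sum of the two half-sphere counts, so the HSE pair carries strictly more information than CN: it also records on which side of the residue the neighbours lie.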
workshop on algorithms in bioinformatics | 2008
Martin Paluszewski; Pawel Winter
We propose a new discrete protein structure model (using a modified face-centered cubic lattice). A novel branch and bound algorithm for finding global minimum structures in this model is suggested. The objective energy function is very simple as it depends on the predicted half-sphere exposure numbers of Cα-atoms. Bounding and branching also exploit predicted secondary structures and expected radius of gyration. The algorithm is fast and is able to generate the decoy set in less than 48 hours on all proteins tested. Despite the simplicity of the model and the energy function, many of the lowest energy structures, using exact measures, are near the native structures (in terms of RMSD). As expected, when using predicted measures, the fraction of good decoys decreases, but in all cases tested, we obtained structures within 6 Å RMSD in a set of low-energy decoys. To the best of our knowledge, this is the first de novo branch and bound algorithm for protein decoy generation that only depends on such one-dimensional predictable measures. Another important advantage of the branch and bound approach is that the algorithm searches through the entire conformational space. Contrary to search heuristics, like Monte Carlo simulation or tabu search, the problem of escaping local minima is indirectly solved by the branch and bound algorithm when good lower bounds can be obtained.
Journal of Mathematical Modelling and Algorithms | 2010
Rasmus Fonseca; Martin Paluszewski; Pawel Winter
Archive | 2010
Thomas Hamelryck; Mikael Borg; Martin Paluszewski; Jonas Paulsen; Jes Frellsen; Christian Andreetta; Wouter Boomsma; Sandro Bottaro; Jesper Ferkinghoff-Borg