Fabian Paul
Max Planck Society
Publication
Featured research published by Fabian Paul.
Journal of Chemical Theory and Computation | 2015
Martin K. Scherer; Benjamin Trendelkamp-Schroer; Fabian Paul; Guillermo Pérez-Hernández; Moritz Hoffmann; Nuria Plattner; Christoph Wehmeyer; Jan-Hendrik Prinz; Frank Noé
Markov (state) models (MSMs) and related models of molecular kinetics have recently received a surge of interest as they can systematically reconcile simulation data from either a few long or many short simulations and allow us to analyze the essential metastable structures, thermodynamics, and kinetics of the molecular system under investigation. However, the estimation, validation, and analysis of such models are far from trivial and involve sophisticated and often numerically sensitive methods. In this work we present the open-source Python package PyEMMA (http://pyemma.org) that provides accurate and efficient algorithms for kinetic model construction. PyEMMA can read all common molecular dynamics data formats, helps in the selection of input features, provides easy access to dimension reduction algorithms such as principal component analysis (PCA) and time-lagged independent component analysis (TICA) and clustering algorithms such as k-means, and contains estimators for MSMs, hidden Markov models, and several other models. Systematic model validation and error calculation methods are provided. PyEMMA offers a wealth of analysis functions such that the user can conveniently compute molecular observables of interest. We have derived a systematic and accurate way to coarse-grain MSMs to few states and to illustrate the structures of the metastable states of the system. Plotting functions to produce a manuscript-ready presentation of the results are available. In this work, we demonstrate the features of the software and show new methodological concepts and results produced by PyEMMA.
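As a rough illustration of the kind of estimate such a package automates, the sketch below builds a small Markov state model from a discretized trajectory in plain Python. It does not use the PyEMMA API. The function names (count_matrix, transition_matrix, stationary_distribution) and the toy trajectory are assumptions made for this example only, and the row normalization shown is the simple non-reversible maximum likelihood estimate.

```python
def count_matrix(dtraj, n_states, lag=1):
    """Count transitions i -> j observed at the given lag (sliding window)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize the counts (non-reversible maximum likelihood estimate)."""
    T = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("state without outgoing transitions")
        T.append([c / total for c in row])
    return T


def stationary_distribution(T, n_iter=1000):
    """Approximate the stationary vector by repeatedly applying pi <- pi T."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical two-state discrete trajectory (e.g. cluster indices per frame).
dtraj = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]
counts = count_matrix(dtraj, n_states=2, lag=1)
T = transition_matrix(counts)
pi = stationary_distribution(T)
```

In practice the lag time would be chosen by checking that the implied timescales are converged, and the model would be validated before any observable is computed from it.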
Proceedings of the National Academy of Sciences of the United States of America | 2016
Hao Wu; Fabian Paul; Christoph Wehmeyer; Frank Noé
Significance: Molecular dynamics simulations can provide mechanistic understanding of biomolecular processes. However, direct simulation of slow transitions such as protein conformational transitions or protein–ligand dissociation is unfeasible with commonly available computational resources. Two typical strategies are (i) conducting large ensembles of short simulations and estimating the long-term kinetics with a Markov state model, and (ii) speeding up rare events by bias potentials or higher temperatures and estimating the unbiased thermodynamics with reweighting estimators. In this work, we introduce the transition-based reweighting analysis method (TRAM), a statistically optimal approach that combines the best of both worlds and estimates a multiensemble Markov model (MEMM) with full thermodynamic and kinetic information at all simulated ensembles.
We introduce the general transition-based reweighting analysis method (TRAM), a statistically optimal approach to integrate both unbiased and biased molecular dynamics simulations, such as umbrella sampling or replica exchange. TRAM estimates a multiensemble Markov model (MEMM) with full thermodynamic and kinetic information at all ensembles. The approach combines the benefits of Markov state models (clustering of high-dimensional spaces and modeling of complex many-state systems) with those of the multistate Bennett acceptance ratio (exploiting biased or high-temperature ensembles to accelerate rare-event sampling). TRAM does not depend on any rate model in addition to the widely used Markov state model approximation, but uses only fundamental relations such as detailed balance and binless reweighting of configurations between ensembles. Previous methods, including the multistate Bennett acceptance ratio, discrete TRAM, and Markov state models, are special cases and can be derived from the TRAM equations. TRAM is demonstrated by efficiently computing MEMMs in cases where other estimators break down, including the full thermodynamics and rare-event kinetics from high-dimensional simulation data of an all-atom protein–ligand binding model.
Protein Science | 2014
Thomas R. Weikl; Fabian Paul
Protein binding and function often involve conformational changes. Advanced nuclear magnetic resonance (NMR) experiments indicate that these conformational changes can occur in the absence of ligand molecules (or with bound ligands), and that the ligands may “select” protein conformations for binding (or unbinding). In this review, we argue that this conformational selection requires transition times for ligand binding and unbinding that are small compared to the dwell times of proteins in different conformations, which is plausible for small ligand molecules. Such a separation of timescales leads to a decoupling and temporal ordering of binding/unbinding events and conformational changes. We propose that conformational-selection and induced-change processes (such as induced fit) are two sides of the same coin, because the temporal ordering is reversed in binding and unbinding direction. Conformational-selection processes can be characterized by a conformational excitation that occurs prior to a binding or unbinding event, while induced-change processes exhibit a characteristic conformational relaxation that occurs after a binding or unbinding event. We discuss how the ordering of events can be determined from relaxation rates and effective on- and off-rates determined in mixing experiments, and from the conformational exchange rates measured in advanced NMR or single-molecule fluorescence resonance energy transfer experiments. For larger ligand molecules such as peptides, conformational changes and binding events can be intricately coupled and exhibit aspects of conformational-selection and induced-change processes in both binding and unbinding direction.
Journal of Chemical Physics | 2015
Benjamin Trendelkamp-Schroer; Hao Wu; Fabian Paul; Frank Noé
Reversibility is a key concept in Markov models and master-equation models of molecular kinetics. The analysis and interpretation of the transition matrix encoding the kinetic properties of the model rely heavily on the reversibility property. The estimation of a reversible transition matrix from simulation data is, therefore, crucial to the successful application of the previously developed theory. In this work, we discuss methods for the maximum likelihood estimation of transition matrices from finite simulation data and present a new algorithm for the case in which reversibility with respect to a given stationary vector is desired. We also develop new methods for the Bayesian posterior inference of reversible transition matrices, with and without a given stationary vector, taking into account the need for a suitable prior distribution that preserves the metastable features of the observed process during posterior inference. All algorithms here are implemented in the PyEMMA software (http://pyemma.org) as of version 2.0.
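To make the idea of a reversible maximum likelihood estimate concrete, the sketch below applies a commonly used self-consistent (fixed-point) iteration to a count matrix. It is a generic textbook-style scheme and is not claimed to be the algorithm introduced in this paper or the PyEMMA implementation. The function name reversible_mle and the example counts are hypothetical, and the iteration assumes a connected count matrix in which every state has outgoing counts.

```python
def reversible_mle(counts, n_iter=1000):
    """Estimate a transition matrix obeying detailed balance from counts.

    Iterates x_ij <- (c_ij + c_ji) / (c_i / x_i + c_j / x_j), where c_i and
    x_i are row sums, then returns T_ij = x_ij / x_i and pi_i = x_i / sum(x).
    """
    n = len(counts)
    c_row = [sum(row) for row in counts]
    # Initial guess: symmetrized counts.
    x = [[counts[i][j] + counts[j][i] for j in range(n)] for i in range(n)]
    for _ in range(n_iter):
        x_row = [sum(row) for row in x]
        x = [
            [
                (counts[i][j] + counts[j][i])
                / (c_row[i] / x_row[i] + c_row[j] / x_row[j])
                for j in range(n)
            ]
            for i in range(n)
        ]
    x_row = [sum(row) for row in x]
    total = sum(x_row)
    T = [[x[i][j] / x_row[i] for j in range(n)] for i in range(n)]
    pi = [xi / total for xi in x_row]
    return T, pi


# Hypothetical three-state count matrix (not data from the paper).
counts_example = [[90, 10, 0], [5, 40, 5], [0, 8, 72]]
T_rev, pi_rev = reversible_mle(counts_example)
```

Because the iterate x stays symmetric, the returned pair satisfies pi_i T_ij = pi_j T_ji by construction, which is the reversibility property the abstract refers to.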
Proceedings of the National Academy of Sciences of the United States of America | 2017
Simon Olsson; Hao Wu; Fabian Paul; Cecilia Clementi; Frank Noé
Significance: Structural biology is moving toward a paradigm characterized by data from a broad range of sources sensitive to multiple timescales and length scales. However, a major open problem remains: to devise an inference method that optimally combines all of this information into models amenable to human analysis. In this work, we make a significant step toward achieving this goal. We introduce a statistically rigorous method to merge information from molecular simulations and experimental data into augmented Markov models (AMMs). AMMs are easy-to-analyze atomistic descriptions of biomolecular structure and exchange kinetics. We show how AMMs may provide accurate descriptions of molecular dynamics probed by NMR spin relaxation and thereby provide a unique way to integrate the analysis of experimental and simulation data.
Accurate mechanistic description of structural changes in biomolecules is an increasingly important topic in structural and chemical biology. Markov models have emerged as a powerful way to approximate the molecular kinetics of large biomolecules while keeping full structural resolution in a divide-and-conquer fashion. However, the accuracy of these models is limited by that of the force fields used to generate the underlying molecular dynamics (MD) simulation data. Whereas the quality of classical MD force fields has improved significantly in recent years, remaining errors in the Boltzmann weights are still on the order of a few kT, which may lead to significant discrepancies when comparing to experimentally measured rates or state populations. Here we take the view that simulations using a sufficiently good force field sample conformations that are valid but have inaccurate weights, yet these weights may be made accurate by incorporating experimental data a posteriori. To do so, we propose augmented Markov models (AMMs), an approach that combines concepts from probability theory and information theory to consistently treat systematic force-field error and statistical errors in simulation and experiment. Our results demonstrate that AMMs can reconcile conflicting results for protein mechanisms obtained by different force fields and correct for a wide range of stationary and dynamical observables even when only equilibrium measurements are incorporated into the estimation process. This approach constitutes a unique avenue to combine experiment and computation into integrative models of biomolecular structure and dynamics.
Journal of Chemical Physics | 2017
Hao Wu; Feliks Nüske; Fabian Paul; Stefan Klus; Péter Koltai; Frank Noé
Markov state models (MSMs) and master equation models are popular approaches to approximate molecular kinetics, equilibria, metastable states, and reaction coordinates in terms of a state space discretization usually obtained by clustering. Recently, a powerful generalization of MSMs has been introduced, the variational approach to conformation dynamics/molecular kinetics (VAC) and its special case, time-lagged independent component analysis (TICA), which allow us to approximate slow collective variables and molecular kinetics by linear combinations of smooth basis functions or order parameters. While it is known how to estimate MSMs from trajectories whose starting points are not sampled from an equilibrium ensemble, this has not yet been the case for TICA and the VAC. Previous estimates from short trajectories have been strongly biased and thus not variationally optimal. Here, we employ Koopman operator theory and ideas from dynamic mode decomposition to extend the VAC and TICA to non-equilibrium data. The main insight is that the VAC and TICA provide a coefficient matrix that we call the Koopman model, as it approximates the underlying dynamical (Koopman) operator in conjunction with the basis set used. This Koopman model can be used to compute a stationary vector to reweight the data to equilibrium. From such a Koopman-reweighted sample, equilibrium expectation values and variationally optimal reversible Koopman models can be constructed even with short simulations. The Koopman model can be used to propagate densities, and its eigenvalue decomposition provides estimates of relaxation time scales and slow collective variables for dimension reduction. Koopman models are generalizations of Markov state models, TICA, and the linear VAC and allow molecular kinetics to be described without a cluster discretization.
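The link between a Koopman or TICA eigenvalue and a relaxation timescale can be sketched in the simplest possible setting: a single mean-free feature, for which the symmetrized (equilibrium) estimate reduces to a time-lagged autocorrelation. The example below is only that standard one-dimensional estimate, not the Koopman reweighting proposed in the paper. The function names and the synthetic autoregressive test signal are assumptions made for illustration.

```python
import math
import random


def vac_eigenvalue_1d(x, lag):
    """Symmetrized time-lagged autocorrelation of one mean-free feature."""
    n = len(x) - lag
    mean = sum(x) / len(x)
    y = [v - mean for v in x]
    c_lag = sum(y[t] * y[t + lag] for t in range(n))
    c_0 = 0.5 * (
        sum(y[t] ** 2 for t in range(n)) + sum(y[t + lag] ** 2 for t in range(n))
    )
    return c_lag / c_0


def implied_timescale(eigenvalue, lag):
    """Relaxation timescale t = -lag / ln(eigenvalue), in units of time steps."""
    return -lag / math.log(eigenvalue)


# Synthetic signal: AR(1) process x[t+1] = phi * x[t] + noise (hypothetical data).
random.seed(0)
phi = 0.9
signal = [0.0]
for _ in range(20000):
    signal.append(phi * signal[-1] + random.gauss(0.0, 1.0))

lam = vac_eigenvalue_1d(signal, lag=1)
ts = implied_timescale(lam, lag=1)
```

For this process the estimated eigenvalue should lie close to phi, so the timescale should be near -1/ln(0.9), roughly 9.5 steps. With several basis functions the same quantities come from a generalized eigenvalue problem of the lagged and instantaneous covariance matrices.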
Nature Communications | 2017
Fabian Paul; Christoph Wehmeyer; Esam T. Abualrous; Hao Wu; Michael D. Crabtree; Johannes Schöneberg; Jane Clarke; Christian Freund; Thomas R. Weikl; Frank Noé
Understanding and control of structures and rates involved in protein-ligand binding are essential for drug design. Unfortunately, atomistic molecular dynamics (MD) simulations cannot directly sample the excessively long residence and rearrangement times of tightly binding complexes. Here we exploit the recently developed multi-ensemble Markov model framework to compute full protein-peptide kinetics of the oncoprotein fragment 25–109Mdm2 and the nanomolar inhibitor peptide PMI. Using this system, we report, for the first time, direct estimates of kinetics beyond the seconds timescale using simulations of an all-atom MD model, with high accuracy and precision. These results only require explicit simulations on the sub-millisecond timescale and are tested against existing mutagenesis data and our own experimental measurements of the dissociation and association rates. The full kinetic model reveals an overall downhill but rugged binding funnel with multiple pathways. The overall strong binding arises from a variety of conformations with different hydrophobic contact surfaces that interconvert on the millisecond timescale.
Binding and unbinding kinetics are important determinants of protein-protein or small-molecule-protein functional interactions that can guide drug development. Here the authors exploit the multi-ensemble Markov model framework to develop a computational approach that allows the estimation of binding kinetics reaching into the seconds timescale.
Journal of Chemical Theory and Computation | 2017
Giovanni Pinamonti; Jianbo Zhao; David E. Condon; Fabian Paul; Frank Noé; Douglas H. Turner; Giovanni Bussi
Nowadays different experimental techniques, such as single molecule or relaxation experiments, can provide dynamic properties of biomolecular systems, but the amount of detail obtainable with these methods is often limited in terms of time or spatial resolution. Here we use state-of-the-art computational techniques, namely, atomistic molecular dynamics and Markov state models, to provide insight into the rapid dynamics of short RNA oligonucleotides, to elucidate the kinetics of stacking interactions. Analysis of multiple microsecond-long simulations indicates that the main relaxation modes of such molecules can consist of transitions between alternative folded states, rather than between random coils and native structures. After properly removing structures that are artificially stabilized by known inaccuracies of the current RNA AMBER force field, the kinetic properties predicted are consistent with the time scales of previously reported relaxation experiments.
PLOS Computational Biology | 2016
Fabian Paul; Thomas R. Weikl
Protein binding often involves conformational changes. Important questions are whether a conformational change occurs prior to a binding event (‘conformational selection’) or after a binding event (‘induced fit’), and how conformational transition rates can be obtained from experiments. In this article, we present general results for the chemical relaxation rates of conformational-selection and induced-fit binding processes that hold for all concentrations of proteins and ligands and, thus, go beyond the standard pseudo-first-order approximation of large ligand concentration. These results make it possible to distinguish conformational-selection from induced-fit processes, including cases in which such a distinction is not possible under pseudo-first-order conditions, and to extract conformational transition rates of proteins from chemical relaxation data.
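For orientation, the standard limit that these results go beyond can be written down in a few lines. The sketch below uses the classical textbook expressions for the observed relaxation rate under pseudo-first-order conditions with fast binding relative to the conformational change (a rapid-equilibrium assumption added here). These are not the general formulas derived in the article, and all function and parameter names (k_open, k_close, k_fit, k_unfit, K_d) are hypothetical labels chosen for this example.

```python
def k_obs_conformational_selection(ligand, k_open, k_close, K_d):
    """P1 <-> P2, then P2 + L <-> P2L (binding fast, ligand in excess).

    k_open: P1 -> P2 (into the binding-competent conformation)
    k_close: P2 -> P1; K_d: dissociation constant of the binding step.
    """
    return k_open + k_close * K_d / (K_d + ligand)


def k_obs_induced_fit(ligand, k_fit, k_unfit, K_d):
    """P1 + L <-> P1L, then P1L <-> P2L (binding fast, ligand in excess).

    k_fit: P1L -> P2L; k_unfit: P2L -> P1L; K_d as above.
    """
    return k_unfit + k_fit * ligand / (K_d + ligand)


# Hypothetical rate constants (1/s) and concentrations in units of K_d.
concentrations = [0.0, 0.5, 1.0, 5.0, 50.0]
cs_rates = [k_obs_conformational_selection(c, 10.0, 100.0, 1.0) for c in concentrations]
if_rates = [k_obs_induced_fit(c, 100.0, 10.0, 1.0) for c in concentrations]
```

In this limit the conformational-selection rate decreases with ligand concentration while the induced-fit rate increases, which is the usual diagnostic. The abstract's point is that outside this limit the distinction needs the more general expressions.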
Journal of Physical Chemistry B | 2018
Fabian Paul; Frank Noé; Thomas R. Weikl
Unstructured proteins and peptides typically fold during binding to ligand proteins. A challenging problem is to identify the mechanism and kinetics of these binding-induced folding processes in experiments and atomistic simulations. In this Article, we present a detailed picture for the folding of the inhibitor peptide PMI into a helix during binding to the oncoprotein fragment 25-109Mdm2 obtained from atomistic, explicit-water simulations and Markov state modeling. We find that binding-induced folding of PMI is highly parallel and can occur along a multitude of pathways. Some pathways are induced-fit-like with binding occurring prior to PMI helix formation, while other pathways are conformational-selection-like with binding after helix formation. On the majority of pathways, however, binding is intricately coupled to folding, without clear temporal ordering. A central feature of these pathways is PMI motion on the Mdm2 surface, along the binding groove of Mdm2 or over the rim of this groove. The native binding groove of Mdm2 thus appears as an asymmetric funnel for PMI binding. Overall, binding-induced folding of PMI does not fit into the classical picture of induced fit or conformational selection that implies a clear temporal ordering of binding and folding events. We argue that this holds in general for binding-induced folding processes because binding and folding events in these processes likely occur on similar time scales and do not exhibit the time-scale separation required for temporal ordering.